How I Understand Diffusion Models

  • Published: 11 Jan 2025

Comments • 90

  • @JoseColmenarezMoreno 10 months ago +15

    BRAVO! No one has ever explained the diffusion model in such an easy way with all the details.

    • @jbhuang0604 10 months ago +1

      Thank you so much for your kind words! This makes my day!

  • @rtluo1546 9 months ago +11

    This is a truly great tutorial video, so well made. I can't believe it covers so many things within only 17 minutes.

    • @jbhuang0604 9 months ago +1

      Thanks a lot! Happy that you enjoyed the video!

  • @wangy01 9 months ago +4

    Thank you for your great work removing the need for the audience to have much prior knowledge before they can enjoy your video. For example, you mentioned maximum likelihood and explained what it is immediately. It is such a challenge to straighten all this out in a 17-minute video, but you did a great job. Thank you!

    • @jbhuang0604 9 months ago

      Glad that you liked it! Appreciate your kind words! This made my day!

  • @ayushsaraf8421 1 year ago +14

    Incredible explanation with so much detail packed into so little time. Looking forward to more of these!

    • @jbhuang0604 1 year ago

      Thanks, Ayush! Glad that you like it!

  • @agnivsharma9163 13 days ago +1

    This is the best video on diffusion models, I can't even imagine how you were able to distill this much info into 17 minutes

    • @jbhuang0604 12 days ago +1

      Glad it was helpful! Thanks a lot!

  • @LeviAckerman99999 3 months ago +4

    I can only dream that you were my PhD advisor. This is so nicely explained!

  • @alexpeng6705 1 year ago +6

    Thanks for your efforts in making such a high-quality video!
    I like the way you break down such complex ideas in a concise manner and visualize them intuitively and elegantly. I wish I could have had this video six months ago, lol.

    • @jbhuang0604 1 year ago

      Thanks for your kind words! It's a fun video to make, and I also learn a lot about diffusion models through the process.

  • @HangLe-ou1rm 22 days ago +1

    Thank you for such a great video with all the steps and equations explained so clearly! I was looking for the referenced papers to dive deeper and found them in the video description! I've learned so much through the video! Your students are so lucky to have such a dedicated instructor!

    • @jbhuang0604 21 days ago

      Thanks so much for your kind words!

  • @JionghaoWang-fs1uq 1 year ago +5

    You are a true educator! Great video!

    • @jbhuang0604 1 year ago

      Thank you so much! Glad that you like the video.

  • @Funnyshoes321 1 year ago +1

    Thanks a lot for the videos! I've been self-studying diffusion models on the side for a few months now and this is the only video I've seen that gives an in-depth yet intuitive explanation of the math.

  • @4thlord51 8 months ago +2

    I'm building my own diffusion model. This is the best breakdown and visualization of the mathematics and implementation. Well done.

    • @jbhuang0604 8 months ago +1

      Thank you! This comment just made my day!

  • @yuktikaura 11 months ago +1

    @Jia-Bin Huang We want to maximize likelihood and also minimize KL divergence so that we can "maximize" similarity between the two distributions. It is stated the other way round from timestamp 1:19 to 1:21.

    • @jbhuang0604 11 months ago

      Yes! You are right! Maximize likelihood -> Minimize KL divergence -> Maximize similarity between the two distributions.
      I got confused with too many negations. :-P
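For reference, the chain described in this reply is a standard identity; writing $q$ for the data distribution and $p_\theta$ for the model:

```latex
\arg\max_{\theta}\, \mathbb{E}_{x \sim q}\!\left[\log p_{\theta}(x)\right]
  \;=\; \arg\min_{\theta}\, \mathbb{E}_{x \sim q}\!\left[\log q(x) - \log p_{\theta}(x)\right]
  \;=\; \arg\min_{\theta}\, \mathrm{KL}\!\left(q \,\|\, p_{\theta}\right),
```

since $\mathbb{E}_{x \sim q}[\log q(x)]$ does not depend on $\theta$. Maximizing the likelihood is therefore the same as minimizing the KL divergence, i.e., maximizing the similarity between the two distributions.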

  • @curiousobserver2006 9 months ago +1

    Seriously one of the best educational videos I've ever watched.

  • @faiz.wahab7 1 year ago +1

    Very comprehensive and precise. Thanks. Also thanks for Tweedie's formula and for simplifying score-based models; that is the most convoluted part in most papers. Looking forward to demystified NeRFs from you!

  • @Charles-my2pb 1 year ago +1

    Thank you so much for your contribution. This tutorial made diffusion clear to me as a beginner.

    • @jbhuang0604 1 year ago

      You are welcome. Glad it was helpful!

  • @khalilsabri7978 8 months ago

    Just one minute into the video, you know it's extremely well done. Thanks for the video!

    • @jbhuang0604 8 months ago

      Glad you liked it! Thanks so much for the comment!

  • @bingzha6099 1 year ago +1

    Really enjoyed watching this video and learned a lot. Hoping for more such videos in the future.

  • @pedroenriquelopezdeteruela6545 9 months ago +1

    Awesome post, Jiang, thank you so much for the great job!
    Anyway, a small comment/question on your video (without too much importance, I assume). At minute 5:56 you comment (a direct derivation of formula (7) in the paper "Denoising Diffusion Probabilistic Models") that mu^hat_t(x_t, x_0) is on the line joining x_0 and x_t. And, while this is approximately true for "normal" beta_t scheduling, I think the estimated mean as a function of x_0 and x_t need not be exactly on such a line since, in general, the respective multipliers of x_0 and x_t in that equation need not add up to one.
    In fact, with "normal" scheduling, as t increases, it seems that this sum keeps progressively moving away from 1, so that although mu_t will obviously continue to be a simple linear combination of both x_t and x_0, it will progressively move away (albeit by a small amount) from this line.
    Would you agree with this observation?
    Greetings, and again, congratulations on the video and thank you very much for clarifying the inner workings of diffusion models for us!

    • @jbhuang0604 9 months ago

      Thank you so much for your comment! You are right! It won’t be on the line when the multipliers are not adding up to one.
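For concreteness, formula (7) of the DDPM paper, which this thread is discussing, gives the posterior mean as

```latex
\tilde{\mu}_t(x_t, x_0)
  = \frac{\sqrt{\bar{\alpha}_{t-1}}\,\beta_t}{1-\bar{\alpha}_t}\, x_0
  + \frac{\sqrt{\alpha_t}\,\bigl(1-\bar{\alpha}_{t-1}\bigr)}{1-\bar{\alpha}_t}\, x_t ,
```

and the two coefficients generally sum to a value close to, but not exactly, 1 for typical schedules. So $\tilde{\mu}_t$ is a linear combination of $x_0$ and $x_t$ that lies near, but not exactly on, the line through them, which is the commenter's observation.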

  • @420_gunna 1 year ago +2

    Awesome video, hope I'm smarter when I try to rewatch it in 3 months ;)

    • @jbhuang0604 1 year ago

      Glad you liked it! Let me know if you have questions.

  • @welann 7 months ago +1

    Thank you for making such a high quality video! It's very helpful for me to understand the diffusion model!

    • @jbhuang0604 7 months ago +1

      You're very welcome! Happy that it was helpful!

  • @emreakbas9289 11 months ago +1

    Great explanation, Jia-Bin! Thanks!

  • @nikitadrobyshev7953 9 months ago +1

    OK, this is the best video explanation of diffusion models I've seen. An ideal ratio between simplification and depth ☺👏

    • @jbhuang0604 9 months ago

      Glad it was helpful! Thank you so much for your kind words!

    • @wangy01 9 months ago +1

      I agree. The author must have carefully chosen the most efficient way to cut into the complex concept hierarchy, and every single word, to achieve that efficiency.

  • @AIwithAndy 11 months ago +1

    I appreciated the explanation of conditional generations. Nice job!

    • @jbhuang0604 11 months ago

      Thanks so much! Glad that you like it.

  • @pinkpig7505 1 year ago +1

    What a timing 🙌 needed this explanation so bad... thanks ✌️

  • @Otroidentificador 1 year ago +1

    I would say top-quality video! Congratulations!🎉 More like this would be awesome!

  • @nutshell1811 9 months ago +1

    Best video on diffusion!!

    • @jbhuang0604 9 months ago

      Great! Glad that it’s helpful!

  • @orisenbazuru 8 months ago

    Great video! At 1:21 it should be maximizing similarity between the two distributions, or minimizing the distance between the two distributions.

    • @jbhuang0604 8 months ago

      Thanks for pointing this out! Yes, you are right! It should be *maximizing* the similarity between the two distributions.

  • @ye8495 6 months ago +1

    Great video, well explained! A lot of things left for me to explore.

  • @youtube_showcase 8 months ago +1

    Amazing work! Thank you for sharing 😀

  • @morrisfan2004 4 months ago +1

    Great explanation

  • @diodin8587 2 months ago

    3:55 Isn't it that we drop the first term because it doesn't depend on θ? q(x_T|x_0) is just an approximation of the true N(0,1).

  • @HuangMichel 6 months ago +1

    Great content!

    • @jbhuang0604 6 months ago

      Thanks a lot! Glad you like it!

  • @visioncai293 2 months ago

    Like this video so much! It is quite helpful to learn the math behind it, with a lot of humor and fun as vital as the Gaussian to the diffusion. Wonder what the distribution of Professor Huang's humor is. Thanks for making this video.

    • @jbhuang0604 2 months ago

      Cool! Glad you enjoyed it!

  • @Raymond-zv5gr 8 months ago +1

    BRO YOU ARE EPIC

  • @RezaMohammadi-c7s 29 days ago +1

    Great video

  • @kathyker3498 4 months ago +1

    Shout out to NCTU alumni! Great video with so many sound effects, good visualizations and metaphors!
    Just wish there were more references for the derivation of the math part, as it's still a bit hard to follow even though I paused the video so many times haha

    • @jbhuang0604 3 months ago

      Noted! Thanks a lot for the comment!

  • @SurajBorate-bx6hv 7 months ago

    Thank you for the great step-by-step explanation. Can you share any good resources and insights for implementing diffusion on one's own custom images?

    • @jbhuang0604 7 months ago

      Hi! No problem. I think Hugging Face's Diffusers probably has the best resources. Check it out: huggingface.co/docs/diffusers/en/index

  • @johnini 6 months ago +1

    I still need to get my head around the math! But like everyone else said, amazing video!!
    One question!
    How do you imagine a distribution of high-resolution images?!
    Would it be like a point in high-dimensional space, where the coordinates are the intensities of its pixels, and from a high-dimensional noise vector we move to a vector in the dataset distribution?
    Thanks, looking forward to future videos

    • @jbhuang0604 6 months ago +1

      Thanks for the question. I agree that it's kind of difficult to imagine the distribution of images, as it's high-dimensional. For a grayscale 100x100 image, we are talking about a 10,000-dim space! And you are right, the "coordinate" of each dimension indicates the intensity of a particular pixel. Diffusion models learn to predict vectors in this space so that, iteratively, we push random noise toward regions of this high-dimensional space that look like real images in the dataset.
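The picture in this reply can be sketched in a few lines of toy NumPy: the image-as-a-point view, plus a caricature of the iterative push from noise toward the data region. This is only an illustration of the idea, not an actual diffusion model; the "predicted vector field" here is simply the direction toward one fixed stand-in point, whereas a real model predicts it with a neural network.

```python
import numpy as np

rng = np.random.default_rng(0)

# A 100x100 grayscale image is one point in a 10,000-dimensional space:
# each coordinate is the intensity of one pixel.
image = rng.random((100, 100))
point = image.flatten()          # shape (10000,)

# Caricature of the reverse process: start from pure Gaussian noise and
# repeatedly take a small step along a "predicted" direction toward a
# stand-in for a point on the data manifold.
target = rng.random(10_000)
x = rng.standard_normal(10_000)
for _ in range(1_000):
    x = x + 0.01 * (target - x)  # iteratively push the noise toward the data

# After many small steps, x has moved essentially onto the target point.
```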

  • @yasserothman4023 6 months ago

    Thanks for the work. If I want to recover x from y = Hx + n, where what I have is a noisy version of x (which is y), using diffusion models, what should be done? What literature do you know of that has tackled similar problems?

    • @jbhuang0604 6 months ago

      Thanks for the question. Diffusion models have been applied to various image restoration tasks.
      The earliest work is probably this one: arxiv.org/pdf/2011.13456 (see section 5), where they perform restoration conditioned on the noisy/masked image using an unconditional model.
      You can also directly train a model for image restoration if you have paired examples. See a recent work here: arxiv.org/abs/2303.11435
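As a side note on how the unconditional trick in the first reference works: conditioning enters through Bayes' rule applied to the score, so for an observation $y = Hx + n$ one can write

```latex
\nabla_{x_t} \log p(x_t \mid y)
  = \nabla_{x_t} \log p(x_t) + \nabla_{x_t} \log p(y \mid x_t),
```

where the first term comes from the unconditional diffusion model and the second from the known degradation model. This is the general recipe; how the likelihood term is approximated varies between papers.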

  • @mcarletti 8 months ago +2

    My like comes with the 5th Symphony (9:39) 😸🎶

    • @jbhuang0604 8 months ago +1

      Oh My! Finally one person noticed that! (Spent a lot of time making that lol)

  • @theglobalconflict6904 3 months ago +1

    Can you tell me which topics I need to master to understand the notation?

    • @jbhuang0604 3 months ago

      I believe that some basics of probability would be sufficient to understand the notations.

  • @sokak01 7 months ago

    I think there should be a ∇ log q(x_t) instead of p(x_t) in the score matching part.
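For context on this comment: the score is the gradient of the log of the (noised) data distribution $q$, not of the model $p$. A common way to write the denoising score matching objective is

```latex
\mathcal{L}(\theta) = \mathbb{E}_{x_0,\, t,\; x_t \sim q(x_t \mid x_0)}
  \Bigl[\, \bigl\| s_{\theta}(x_t, t) - \nabla_{x_t} \log q(x_t \mid x_0) \bigr\|^2 \,\Bigr],
```

where $s_{\theta}$ learns to approximate $\nabla_{x_t} \log q(x_t)$.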

  • @truonggiangnguyen8844 9 months ago

    I have a question: are all the distributions mentioned distributions of continuous variables, since we're using integrals here?

    • @jbhuang0604 9 months ago

      Good question! I think there has been some development of discrete variational autoencoders and diffusion models. Those methods can deal with discrete variables.

  • @herrbonk3635 1 year ago

    Wish I could hear what you say:
    0:36 "this stickholder"?
    0:43 "hyber we do not know"
    1:13 "just the cadirabigdes"
    and so on

    • @jbhuang0604 1 year ago +2

      You can see the full script by turning on the subtitles/CC. Hope this helps.

    • @herrbonk3635 1 year ago +1

      @@jbhuang0604 I will try, thanks!

  • @sanoj8497 1 month ago +1

    Awesome explanation