Great intuitive explanation, thank you.
Currently taking Stanford XCS236 “Deep Generative Models”. Your video was very helpful in clarifying some of the math, particularly the role of the determinant.
This is the greatest explanation of coupling layers I've seen. Thank you
wow... this is absolutely brilliant. Because a normalizing flow must be invertible, you're constrained to using only bijective functions (which is quite limiting indeed). However, by designing the NN structure in this way, you can offload the parameter learning to an entire internalized NN, where the NN outputs the parameters of a bijective function. Mind blown, crazy stuff!
After all this is set up, the learning piece simply being MLE makes a ton of sense.
Thank you!
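To make this concrete, here is a minimal sketch of an affine coupling layer in PyTorch (RealNVP-style; the class name and sizes are illustrative, not from the video). The internal network `net` is an ordinary, unconstrained NN: it only predicts the scale and shift of the affine map, so the layer itself stays exactly invertible and its log-determinant is cheap to compute.

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """Affine coupling layer: x1 passes through unchanged, x2 is
    transformed with parameters predicted from x1."""

    def __init__(self, dim, hidden=64):
        super().__init__()
        self.half = dim // 2
        # Unconstrained internal NN: it only outputs the parameters
        # (log-scale and shift) of the bijective affine transform.
        self.net = nn.Sequential(
            nn.Linear(self.half, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (dim - self.half)),
        )

    def forward(self, x):
        x1, x2 = x[:, :self.half], x[:, self.half:]
        log_s, t = self.net(x1).chunk(2, dim=1)
        y2 = x2 * torch.exp(log_s) + t
        # The Jacobian is triangular, so log|det| = sum of log-scales.
        log_det = log_s.sum(dim=1)
        return torch.cat([x1, y2], dim=1), log_det

    def inverse(self, y):
        y1, y2 = y[:, :self.half], y[:, self.half:]
        log_s, t = self.net(y1).chunk(2, dim=1)
        return torch.cat([y1, (y2 - t) * torch.exp(-log_s)], dim=1)
```

Training is then plain MLE, as the comment says: maximize log p_z(f(x)) + log|det Jf(x)| over the data.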
This is an incredibly nice explanation! Thank you so much
This explanation is amazing... it asks for an understanding of basic concepts of linear algebra and statistics, but it is still clear enough to follow even when your knowledge of these subjects is based more on intuition than on in-depth education.
Thanks a lot for this, it's really great!
This is the most useful lecture for getting started with normalizing flows!!!
Thanks for your generative model series!
Great explanation! Straight to the point and clear!
Thank you so much for such a great, clear, and easy-to-follow explanation. I like the comparison between flow-based models, GANs, and VAEs at the end of the video! Also, the math explanation is very clear :)
Hello, I really enjoyed the explanation. It was easy to follow and the analogy was very useful!
Thank you so much for this video. I'd watched several videos on flow before this one, but this is where it really clicked for me. I echo @MathStuff1234, absolutely brilliant.
The explanations! Very impressive.
Very nice explanation for me as a data science student, thank you!
Extremely good!
It's like diffusion models nowadays! Great!
Hi, that is an amazing lecture. Thank you so much for the video. Could you please post the lecture slides?
Nice video again! However, I could not wrap my head around one thing: if I use random shuffling, how can it still be invertible?
I think at 9:00, because of the chain rule, we must evaluate not everything at x, but like this (example for two functions): Df(x) = Df_1(f_2(x)) · Df_2(x)
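This point can be checked numerically. A tiny sketch, assuming PyTorch's autograd and two illustrative scalar functions f1 and f2 (not from the video):

```python
import torch

# Check that for f = f1 ∘ f2, the outer Jacobian is evaluated at the
# intermediate value f2(x), not at x:  Df(x) = Df1(f2(x)) * Df2(x).
f2 = lambda x: torch.sin(x)        # inner step
f1 = lambda z: z ** 3              # outer step

x = torch.tensor(0.7, requires_grad=True)
(autograd_deriv,) = torch.autograd.grad(f1(f2(x)), x)

z = f2(x).detach()
correct = 3 * z ** 2 * torch.cos(x)   # Df1 evaluated at f2(x)
wrong   = 3 * x ** 2 * torch.cos(x)   # Df1 (incorrectly) evaluated at x

print(autograd_deriv.item(), correct.item(), wrong.item())
# The first two agree; the third does not.
```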
Hello, thanks for your explanation.
I don't understand how shuffling is an invertible function; do you have to remember the places where you shuffled your points?
I think each random shuffling is selected once when the architecture of the network is established, and does not change after that.
Done this way, the shuffling can be undone.
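To make the reply concrete, here is a small sketch (plain NumPy, names illustrative) of a shuffle implemented as a fixed permutation with a precomputed inverse:

```python
import numpy as np

rng = np.random.default_rng(0)
perm = rng.permutation(4)           # chosen once, when the net is built
inv_perm = np.argsort(perm)         # precomputed inverse permutation

x = np.array([10.0, 20.0, 30.0, 40.0])
y = x[perm]                         # forward pass: shuffle
assert np.allclose(y[inv_perm], x)  # inverse pass: unshuffle exactly

# As a linear map, a permutation has |det| = 1, so it contributes
# nothing to the log-determinant term in the flow's likelihood.
```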
Same here. This does not look differentiable.
Are diffusion models a specific implementation of it? Or something else?