This is the best explanation I’ve found on GANs, thank you!! I’m currently training a DCGAN, however in terms of theory, are there any differences between DCGAN vs GAN as from my understanding, the difference is DCGAN utilizes deep convolution networks and GAN utilizes fully connected layers, however is the theory described in this video also applied to DCGAN? Thanks for your help, appreciate it!
I believe the presenter is knowledgeable. However, some details are not well explained and not consistent, such as 11:46, he mentioned that this formula and there is no intuitive explanation, as this type of RUclips presentation is for the general public who has no in-depth of understanding of either maths or deep-learning.
🎯 Key Takeaways for quick navigation: 00:00 *🧠 Overview of Generative Adversarial Networks (GANs)* - GANs consist of two models: a generative model (G) and a discriminative model (D). - Generative models learn the joint probability distribution of input and output variables, while discriminative models learn the conditional probability of the target variable given the input variable. - GANs use an adversarial setup where the generator produces fake data points, and the discriminator distinguishes between real and fake data, leading to both models improving over time. 02:13 *📊 Structure and Components of GANs* - GANs consist of multi-layered neural networks representing the generator (G) and discriminator (D). - Theta G and theta D represent the weights of the respective networks. - GANs utilize a noise distribution as input to the generator to produce data points similar to the original distribution. 05:26 *🔢 Understanding the Value Function of GANs* - The value function represents the objective of G (minimize) and D (maximize) in the GAN setup. - The value function resembles the binary cross-entropy function, crucial for training GANs. - Expectation (E) is used to calculate the average value over the entire dataset, essential for continuous distributions. 08:37 *🔄 Training Process and Optimization of GANs* - GAN training involves an iterative process where the generator and discriminator alternate updates. - Stochastic gradient descent is used to optimize the loss function. - The discriminator is updated to maximize the value function, while the generator is updated to minimize it. 10:42 *🎯 Convergence and Guarantee of GANs* - The goal is to prove that the generator's distribution converges to the original data distribution. - Jensen-Shannon divergence is a method used to measure the difference between two distributions. - At the global minimum of the value function, the generator's distribution becomes indistinguishable from the original data distribution. 14:48 *⚙️ Phases of GAN Training* - GAN training progresses through phases where initially, both generator and discriminator perform poorly. - As training continues, the discriminator becomes adept at distinguishing real and fake data, while the generator's distribution approaches that of the original data. - At convergence, the discriminator cannot differentiate between real and generated data, achieving the desired outcome. Made with HARPA AI
9:29 in gradient update part, why would generator try to make good images.....as D(g(z)) will be close to zero imples nothing to learn for generator...I dint get ascent and descent at all from any video
correct, i still don't get this part, how can he do that? it's true that they can have the same range of values (both are images with same width x height dimension) but that doesn't mean they can swapped each other's places?
Thank you so much for this video! *I think the JS divergence equation needs a second ln sign after ...+1/2 Ex pg ? The equation appears at 12:53. Thank you again!
Brilliant presentation - simplifies the math using intuitive explanations and examples (same for the video about the binary cross entropy ) - thank you for this
Hey ! What does it mean, when people say, data points/ images/ texts (on which we train our model) belong to a distribution ? What is its inuitive meaning about belonging to a distribution and how are they sure about the real life data belonging to a distribution ?
I really liked your question. So here's the thing... I hope you are comfortable with the distributions in 1 or 2 dimensions e.g. distribution of height and weight of a population. Now imagine we are talking about images. Can we represent an image with 1 or 2 dimensions? No. For a 256px*256px RGB image we need 256*256*3 dimensions. Suppose you have 1000 such images of flowers. Now you can plot the pixel values in each dimension right? If you do this for 1000 images you will get the pixel distribution or simply the distribution of your dataset. Then the goal of your ML model will be to capture this distribution. I talked about pixels but the idea can be used in words (text data) also. And something belonging to a distribution means it follows (looks similar) the training dataset. Obviously in Statistics we can mathematically say if something belongs to a distribution or not. But intuitively it means "looks similar".
Great video, thanks! The label of 0 for the reconstructed image. Is that correct? According to another reference I have, it should set the labels to 1 to fool the discriminator into thinking the image is real? Edit, my bad. Yes, you are correct, feeding y = 0 into the discriminator is correct. The label 1 is then used to train the generator :)
What would really help is if you add links to your videos if some concepts in a video have been discussed more in depth. So if I don't understand some concept that is glossed over here I can scroll down and click the video that explains it in more depth.
So I heard someone saying it's easier for the discriminator to predict a fake data than It is for the generator to create a fake data who could pass as original. That would make the system unbalaced. Is It true and If so, Is there a way to fix It?
@Giselle Rodrigues Yes, it is true especially at the initial stages of the training. As a matter of fact, GANs are very unstable. It generally requires a lot of trial and error to find the best architecture (and other hyper-parameters) for a given dataset. But luckily, researchers have found some ways to improve the training. Here, you can find some of them. machinelearningmastery.com/how-to-train-stable-generative-adversarial-networks/
@@NormalizedNerd thank you for the reply!!! I wasn't expecting It to be so fast! Haha I am gonna read It and maybe come back with more questions. Hahaha
I like how he pronounce "z" as "zed" but "G of Z" as "G of zee".
I present you the indian english : )
focus on the message he is explaining not the pronunciation
Hey Sujan, I just started learning GANs and I happen to stumble upon your video & your channel ! Great work :) You have a bright future ahead :)
This is the best explanation I’ve found on GANs, thank you!! I’m currently training a DCGAN, however in terms of theory, are there any differences between DCGAN vs GAN as from my understanding, the difference is DCGAN utilizes deep convolution networks and GAN utilizes fully connected layers, however is the theory described in this video also applied to DCGAN? Thanks for your help, appreciate it!
Yeah, same theory. As you correctly mentioned, just replace fully connected layers with conv and pooling layers.
This is a well made and well explained video, one can be extremely grateful to you for this
Thanks a lot!
I didn't manage to understand starting from Binary Crossentropy Function :(
I believe the presenter is knowledgeable. However, some details are not well explained and not consistent, such as 11:46, he mentioned that this formula and there is no intuitive explanation, as this type of RUclips presentation is for the general public who has no in-depth of understanding of either maths or deep-learning.
This is a great explanation! I'd love to see more in depth videos! If you could cover autoencoders that'd be really cool too!
Thanks! I have one for autoencoders:
ruclips.net/video/m2AyljDHYes/видео.html
Awesome explanation thank you very much. Subscripted
🎯 Key Takeaways for quick navigation:
00:00 *🧠 Overview of Generative Adversarial Networks (GANs)*
- GANs consist of two models: a generative model (G) and a discriminative model (D).
- Generative models learn the joint probability distribution of input and output variables, while discriminative models learn the conditional probability of the target variable given the input variable.
- GANs use an adversarial setup where the generator produces fake data points, and the discriminator distinguishes between real and fake data, leading to both models improving over time.
02:13 *📊 Structure and Components of GANs*
- GANs consist of multi-layered neural networks representing the generator (G) and discriminator (D).
- Theta G and theta D represent the weights of the respective networks.
- GANs utilize a noise distribution as input to the generator to produce data points similar to the original distribution.
05:26 *🔢 Understanding the Value Function of GANs*
- The value function represents the objective of G (minimize) and D (maximize) in the GAN setup.
- The value function resembles the binary cross-entropy function, crucial for training GANs.
- Expectation (E) is used to calculate the average value over the entire dataset, essential for continuous distributions.
08:37 *🔄 Training Process and Optimization of GANs*
- GAN training involves an iterative process where the generator and discriminator alternate updates.
- Stochastic gradient descent is used to optimize the loss function.
- The discriminator is updated to maximize the value function, while the generator is updated to minimize it.
10:42 *🎯 Convergence and Guarantee of GANs*
- The goal is to prove that the generator's distribution converges to the original data distribution.
- Jensen-Shannon divergence is a method used to measure the difference between two distributions.
- At the global minimum of the value function, the generator's distribution becomes indistinguishable from the original data distribution.
14:48 *⚙️ Phases of GAN Training*
- GAN training progresses through phases where initially, both generator and discriminator perform poorly.
- As training continues, the discriminator becomes adept at distinguishing real and fake data, while the generator's distribution approaches that of the original data.
- At convergence, the discriminator cannot differentiate between real and generated data, achieving the desired outcome.
Made with HARPA AI
Good initiative but very similar to Ahlad Kumar's explanation.
Thanks for your feedback. I actually didn't know about that.
Do you provide personal lessons? @normalized nerd
Sorry, but I don't.
Truly an exceptional and informative video! Literally made me understand the concept very well!!
AWW, I'm disappointed..... 22,322 views...... so close to 22,222 =]
9:29 in gradient update part, why would generator try to make good images.....as D(g(z)) will be close to zero imples nothing to learn for generator...I dint get ascent and descent at all from any video
This is the Best resource fo gan i've come across so far. Very detailed explanation with complicating the terms. You are a life saver! . thank you
really good explanation!! Understood clearly
at 0.22, you pronounced content wrong. sound of "con" in content should be like "con" in con man
Woah !!!! I consider myself bad at math, but this video was like a hot knife in my dense but butterish brain! thank you !
Amaizng video bro..Could you please make a playlist for Deep learning(include this video) and/or reinforcement learning.
Ok I'll try to create a playlist.
I almost never leave comments... this is an amazing summary of the mathematical aspect and quirks of the paper. Thank you!
This video is a perfect and most explanatory video on this topic, absolutely love it.
speak slowly my dude. Noone is chasing you.
What about the sign he told us to forget? should it not be considered at the end? why?
was very easy to understand. thank you
Nicely explained.
Do make a playlist of regression classification nlp deep learning so that we can easily follow up.
Great job 👌👌👌
Thanks!
Actually, I have playlists for NLP, ML from scratch. Will try to make one for Deep Learning.
It's not clear why you replace p_z(z) by p_g(x) when showing global optimality. x and z are on a different space.
correct, i still don't get this part, how can he do that? it's true that they can have the same range of values (both are images with same width x height dimension) but that doesn't mean they can swapped each other's places?
Great work thank you very very much❤
Sorry please teach me, I don’t understand how the function V(G,D) corresponds to the loss of generator and discriminator…
V(G, D) is total loss of the model(fake image loss + real image loss), u then do partial derivation with respect to generator and discriminator
Amazing.. Just no words. Standing ovation to you.
means a lot ❤
mathematically convincing
Very nice explanation
Outstanding content
Best video on Gan
"geee izz da virgin ass" lol
Thanks Sujan for the excellent tutorial on the math behind GANs. I just started learning about GANs today, and cleared my eyes on the subject.Kudos!
Great job!!
Dude. Awesome!! Literally you explain better than medium "how to"'s :) expecting awesome content
Nice explanation next lets create a neuro network from scratch
@Green UFO_010 thank you man! Yeah more interesting videos are on the way. Keep supporting :D
Neural Network from scratch is definitely in my bucket list!
Spectacular
Such a good explanation, man! Thank you so much!!
Amazing video. I'm still confuse that what is difference between normal GAN vs GAN CLS? Can you explain a little bit
In GAN CLS, the input is a sentence vector + some noise. In normal GAN it's just noise.
@@NormalizedNerd would love to see your video on GAN vs GAN CLS if you make one.
This is exactly what I wanted. Your explanation was amazing and very clear.
Thanks Man
Congratulation....one of best marh explanations of GAN ever🎉🎉
Thank you
Killer!!
How do you get the equation of binary cross entropy from the cross entropy definition?
SOOO GOOOD
Thank you very much for the video! Can someone help me and explain why 1 - was dropped at 12:30
Thank you so much for this video! *I think the JS divergence equation needs a second ln sign after ...+1/2 Ex pg ? The equation appears at 12:53. Thank you again!
Oh...You are right. I forgot the ln sign. Thanks for pointing this out :D
@@NormalizedNerd Great, just wanted to make sure I understood this correctly :)) Thank you!!
Wow.this was very educational.
Please make more videos on gans.
Thank you! Very well explained. Mentioned all the underlying mathematical concepts on which GAN is based on.
Most welcome!
Nice tutorial. Which software you are using for writing on the board here ?
This was a fantastic video, Im might actually pass my class now!
I'm really thankful. Great explanation!
Brilliant presentation - simplifies the math using intuitive explanations and examples (same for the video about the binary cross entropy ) - thank you for this
❤️❤️
Thanks for explaining this advanced topic!
Glad you liked it. Keep supporting the channel :D
Cool video, thank you very much :)
Why is your channel so underrated ?!!!
Very well explained.
Can you once try gan inversion as well?
Hey ! What does it mean, when people say, data points/ images/ texts (on which we train our model) belong to a distribution ? What is its inuitive meaning about belonging to a distribution and how are they sure about the real life data belonging to a distribution ?
I really liked your question. So here's the thing...
I hope you are comfortable with the distributions in 1 or 2 dimensions e.g. distribution of height and weight of a population. Now imagine we are talking about images. Can we represent an image with 1 or 2 dimensions? No. For a 256px*256px RGB image we need 256*256*3 dimensions. Suppose you have 1000 such images of flowers. Now you can plot the pixel values in each dimension right? If you do this for 1000 images you will get the pixel distribution or simply the distribution of your dataset. Then the goal of your ML model will be to capture this distribution.
I talked about pixels but the idea can be used in words (text data) also.
And something belonging to a distribution means it follows (looks similar) the training dataset. Obviously in Statistics we can mathematically say if something belongs to a distribution or not. But intuitively it means "looks similar".
Great explanation for something as complex as GAN.
Wow wow Thank you! Well explained.
Thanks for such a great explanation
Great video, thanks! The label of 0 for the reconstructed image. Is that correct? According to another reference I have, it should set the labels to 1 to fool the discriminator into thinking the image is real? Edit, my bad. Yes, you are correct, feeding y = 0 into the discriminator is correct. The label 1 is then used to train the generator :)
Thanks for the great effort in making the videos. God bless you
what whiteboard software is that ?
Microsoft OneNote
Thank you for posting this!
My pleasure!
Good explanation. Thank you!
Clear mathematical explanation
Would like to see Use of Regularization functions/terms in loss function through equations....Plz make VDO on this
Btw I talked about l1, l2 regularization a bit here: ruclips.net/video/FiSy6zWDfiA/видео.html
you deserve Best explanation award
sera porali bhai
onek dhonnyobad bhai!
It was great explanation
Best explanation ever!
What would really help is if you add links to your videos if some concepts in a video have been discussed more in depth. So if I don't understand some concept that is glossed over here I can scroll down and click the video that explains it in more depth.
Feedback noted!
What platform you're using for the videos?
For this video I used Microsoft OneNote
@@NormalizedNerd no for the animation
@@subratswain6775 For the animations I use manim (open source python library)
@@NormalizedNerd can you send the installation steps. I tried to create but couldn't
@@subratswain6775 I'll suggest you to follow youtube tutorials for installing manim. It's not very easy to set up.
Well, thanks for the transfer learning🤭... You have explained it in a very crisp manner.. Keep up the good work 👍
My pleasure 😊
Amazing video man!
Thanks! Wonderful explanation. But it seems that 9:58 needs to adding a sum sign in front of the square brackets.
The summation is taken care by the inner loop
great video!
Apni ki Bengali?
হ্যাঁ 😌
Ek bangali e arakbangalir kotha sune bujde pare. Anyway cheers mate
Excelent video!
Incredibly well made !
thank you. Great explanation
You are welcome!
Extremely good explanation!!!!
Thanks! :D
This is so.. GOOD! Thank you so much!!!!
❤❤
Great, dont stop! Keep making such nice videos.
More to come!
Awesome explaination
Glad you think so!
wow, gan er gan beregelo amar thanks for that.
Hee Hee ❤️❤️
Really nice, I really like the part at 11:52, made that part so much easier to understand with that visual example
Thanks mate :D
So I heard someone saying it's easier for the discriminator to predict a fake data than It is for the generator to create a fake data who could pass as original.
That would make the system unbalaced.
Is It true and If so, Is there a way to fix It?
@Giselle Rodrigues Yes, it is true especially at the initial stages of the training. As a matter of fact, GANs are very unstable. It generally requires a lot of trial and error to find the best architecture (and other hyper-parameters) for a given dataset. But luckily, researchers have found some ways to improve the training. Here, you can find some of them.
machinelearningmastery.com/how-to-train-stable-generative-adversarial-networks/
@@NormalizedNerd thank you for the reply!!! I wasn't expecting It to be so fast! Haha
I am gonna read It and maybe come back with more questions. Hahaha
Haha. Sure. Keep supporting.
Very informative. Thanks for this clear explanation!
You are welcome!
The first well explained GAN I've ever seen, thanks
Thanks! :D
Loved that intro 👌
Do you use tablet to make these notes?
Yeah
You are my favorite person on the internet right now
😁😁
thanks bro
You're welcome :)
Thank you!
You're welcome!
Amazing, keep up the good work (y)
Thanks man!
great job ! thank so much
@Dung Pham You're welcome! Support this channel for more videos :D