Your way of explaining is excellent, as always.
Thanks!
Great video ma'am🙌
Keep posting such informative videos
I'm glad you found it helpful!
Thank you so much, ma'am.
For such a kind and helpful video, you deserve millions of views and subscribers.
You're very welcome! I'm glad you found it helpful.
Awesome video ma’am
Thanks!
Thank you so much, ma'am.
You're welcome! 😊
@2:45 - Looks like StackGAN didn't do so well here. That bird has black wings. Maybe it has never seen a red bird with white wings, but that means it isn't really zero-shot, or at least it isn't performing well as zero-shot.
Just adding a few notes on a possible generation flow (generally speaking):
Transposed Convolution 1 - generates basic visual patterns like edges or color gradients from the noise seed.
Transposed Convolution 2 - generates encodings of larger shapes, textures, or object parts.
Transposed Convolution 3 - adds fine-grained details and realistic effects like shading and reflections.
Output Layer - combines all previous encodings into the final image structure and details.
Something like that... I'm still learning, but I like to see it from a high-level perspective.
Haha, you're totally right about the bird! Thanks for pointing that out! I appreciate your breakdown of the transposed convolution layers too, it's great to see how you're thinking about the generation flow. Keep up the great work!
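To make the layer-by-layer flow in this thread a bit more concrete, here is a minimal sketch of how the spatial size of a feature map can grow through three transposed convolutions. The 4x4 starting map and the kernel 4 / stride 2 / padding 1 settings are assumptions borrowed from common DCGAN-style generators, not StackGAN's actual architecture or anything shown in the video; `conv_transpose_out` and `generator_sizes` are hypothetical helper names.

```python
def conv_transpose_out(size, kernel=4, stride=2, padding=1):
    """Spatial size after one transposed convolution.

    Standard formula for dilation 1 and no output padding:
    out = (in - 1) * stride - 2 * padding + kernel
    """
    return (size - 1) * stride - 2 * padding + kernel


def generator_sizes(start=4, num_layers=3):
    """Feature-map side length before and after each transposed convolution."""
    sizes = [start]
    for _ in range(num_layers):
        sizes.append(conv_transpose_out(sizes[-1]))
    return sizes


# Assumed 4x4 seed map, doubled by each of the three layers.
print(generator_sizes())  # [4, 8, 16, 32]
```

With these assumed settings each layer doubles the resolution, which fits the intuition above: early layers work on a coarse grid, and later layers have enough pixels to add finer detail.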
Very helpful, ma'am.
Glad it helped!
Thank You!!!
You're welcome!
Superb video, madam. Your explanation is clear and awesome.
Glad it helped!
Nice video
Thanks!
Super
Thanks
Nice video ma’am
please explain VAEs
Noted! I will make a video on VAEs soon.
Can you please create a separate playlist for GANs
Amazing 😻 😻, I would like to request a video on Diffusion Models for Scene Text, please. Thanks.
Thank you! Noted
Hi Arohi, I have one doubt:
How does the discriminator work for batches of images? If the batch size is 4, will 4 real images be compared with 4 randomly generated images?
Yes, that's correct!
For real images: the discriminator's predictions are compared with the valid labels (a tensor of ones, [1, 1, 1, 1] for a batch of size 4), so the discriminator tries to output probabilities close to 1.
For fake images: the discriminator's predictions are compared with the fake labels (a tensor of zeros, [0, 0, 0, 0] for a batch of size 4), so the discriminator tries to output probabilities close to 0.
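The label comparison described in this reply can be sketched in plain Python. This is only an illustration under assumptions: it uses binary cross-entropy as the loss, adds the real and fake terms together (some implementations average them instead), and the prediction values are made up. `bce` and `discriminator_loss` are hypothetical helper names, not functions from any library. Note that each image is scored on its own against its label; real and fake images are not compared pairwise.

```python
import math


def bce(preds, labels, eps=1e-12):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(preds, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(preds)


def discriminator_loss(real_preds, fake_preds):
    """Loss on one batch: real predictions vs ones, fake predictions vs zeros."""
    valid_labels = [1.0] * len(real_preds)  # [1, 1, 1, 1] for a batch of 4
    fake_labels = [0.0] * len(fake_preds)   # [0, 0, 0, 0] for a batch of 4
    return bce(real_preds, valid_labels) + bce(fake_preds, fake_labels)


# Made-up predictions for a batch of 4 real and 4 generated images.
good = discriminator_loss([0.9, 0.95, 0.9, 0.85], [0.1, 0.05, 0.1, 0.2])
bad = discriminator_loss([0.5] * 4, [0.5] * 4)  # discriminator is just guessing
print(good < bad)  # True
```

A discriminator that pushes real predictions toward 1 and fake predictions toward 0 gets a lower loss than one that outputs 0.5 for everything, which is the behaviour the reply describes.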
Can you please make a video showing the Python code for a GAN? Also, please help me: I need to extract local features of the images to train the model, not the overall global positioning. Is a CNN a good option for that?
Sure, Soon!
Please share the notes. It will be very helpful. Thanks
Ma'am, please provide the PPT for it.