Great improvements over CycleGAN! It's so much faster to train and occupies less GPU memory, so I was able to train on larger images than with CycleGAN. Thanks for the detailed explanation, great work as always 👍
Thank you so much! That's really awesome to hear, thanks for sharing your experience!
Sweet! Special props for explaining patches.
In the PatchNCE formulation, I think including the representations from the other spatial locations as negatives has a big impact on the contrastive learning task here.
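For anyone who wants to see the mechanics, here's a minimal sketch of that idea, assuming PyTorch; the function name, temperature value, and tensor shapes are my own illustrative choices, not the paper's reference code:

```python
import torch
import torch.nn.functional as F

def patch_nce_loss(feat_q, feat_k, temperature=0.07):
    """Cross-entropy over patches: each query patch from the output image
    should match the patch at the SAME spatial location in the input image
    (the positive) and not the patches at OTHER locations (the negatives).

    feat_q, feat_k: (num_patches, dim) projected patch features from the
    output and input images respectively, assumed L2-normalized.
    """
    num_patches = feat_q.shape[0]
    # Similarity of every output patch to every input patch.
    logits = feat_q @ feat_k.t() / temperature  # (num_patches, num_patches)
    # The diagonal holds the same-location (positive) pairs.
    targets = torch.arange(num_patches, device=feat_q.device)
    return F.cross_entropy(logits, targets)
```

The diagonal of the similarity matrix is the same-location positive pair, and every off-diagonal entry is one of those other-location negatives.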
Thank you! Yeah, it reminds me of early style transfer algorithms with the Gram matrix, the way they compare features in intermediate layers.
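For comparison, a quick sketch of that Gram-matrix feature comparison from early neural style transfer (Gatys et al.), again assuming PyTorch and with an illustrative normalization:

```python
import torch

def gram_matrix(features):
    """features: (channels, height, width) activations from one layer.
    Returns the (channels, channels) matrix of channel co-activations,
    which captures style while discarding spatial arrangement."""
    c, h, w = features.shape
    flat = features.view(c, h * w)
    return flat @ flat.t() / (c * h * w)
```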
Damn right!
What about unpaired language translation? That would be cool!
Great video @Henry AI Labs! Can you make a video on the implementation of Contrastive Learning for Unpaired Image-to-Image Translation?
Let's consider an image of a synthetic hand that we want to translate to a real hand. Different patches in the source image are very similar, so how do you eliminate the false-positive patches in this case?
Won't making the embeddings of patches at corresponding locations the same push the feature extractor to learn style-invariant, content-only features? Won't this make the discriminator worse?
There are two separate losses here; the adversarial loss for the discriminator is separate from this one. This loss is derived from a projection head after re-passing x and y' through the generator's encoder layers.
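Roughly, a minimal sketch of what such a projection head could look like, assuming PyTorch; the two-layer MLP and the 256-dim output are assumptions for illustration, not the paper's exact configuration:

```python
import torch.nn as nn
import torch.nn.functional as F

class ProjectionHead(nn.Module):
    """Maps sampled encoder patch features into the embedding space
    used by the contrastive (PatchNCE) loss."""
    def __init__(self, in_dim, out_dim=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, out_dim),
            nn.ReLU(),
            nn.Linear(out_dim, out_dim),
        )

    def forward(self, patch_feats):
        # patch_feats: (num_patches, in_dim) sampled from an encoder layer
        # for both the input x and the translated output y'.
        z = self.mlp(patch_feats)
        return F.normalize(z, dim=1)  # unit-length embeddings for the NCE loss
```

The same head is applied to the encoder features of both x and y', and the resulting embeddings feed the contrastive loss sketched above.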
@connor-shorten ah, got it
great video
Thanks for watching!