Contrastive Learning with SimCLR | Deep Learning Animated

  • Published: 22 Dec 2024

Comments • 40

  • @Deepia-ls2fo
    @Deepia-ls2fo  2 months ago +2

    To try everything Brilliant has to offer, free for a full 30 days, visit brilliant.org/Deepia.
    You'll also get 20% off an annual premium subscription.

  • @Darkev77
    @Darkev77 1 month ago +1

    I SWEAR I was trying to understand BYOL just a few minutes ago and was struggling, then this video came up. THANK YOU! CAN'T WAIT! Also, please do SwAV as well!

  • @zyansheep
    @zyansheep 2 months ago +8

    Day by day, we inch closer and closer to creating The Great Compressor.

    • @Tothefutureand
      @Tothefutureand 2 months ago

      Like the one in the Silicon Valley TV series.

    • @WhiteWeaver-hk2nt
      @WhiteWeaver-hk2nt 1 month ago

      I'd love to be compressed between my robot anime waifu's thighs 🤤

  • @Higgsinophysics
    @Higgsinophysics 2 months ago +2

    Insane technique! Awesome video, thanks for explaining this with tons of examples.

  • @AbideByReason
    @AbideByReason 1 month ago

    Really nice video. Love your presentation style, so clean and well explained!

  • @MutigerBriefkasten
    @MutigerBriefkasten 2 months ago

    Amazing presentation again 🎉 thank you for your efforts and time

  • @KoHaN7
    @KoHaN7 2 months ago

    Amazing content! Looking forward to the next videos 😄

  • @deror007
    @deror007 2 months ago

    How exactly, in the original contrastive loss, is y = 0 in the positive case and y = 1 in the negative case? Also, what is y representing here? 6:27

    • @deror007
      @deror007 2 months ago

      Is it just a positive/negative pair label that forces the contrastive loss to focus on the positive or negative term of the loss function?

    • @Deepia-ls2fo
      @Deepia-ls2fo  1 month ago +1

      Yes, exactly! (See the sketch after this thread.)

    • @linc008
      @linc008 22 days ago

      @@deror007 But how are the labels collected? Didn't the author say no labels are required?
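
For concreteness, here is a minimal PyTorch sketch of the margin-based contrastive loss discussed in this thread (the Hadsell et al., 2006 formulation). The label y is only a pair indicator, not a class label: y = 0 marks a positive pair and y = 1 a negative pair. In SimCLR itself no such annotation is collected, since positive pairs are built by augmenting the same image. Names and the margin value are illustrative.

```python
import torch

def contrastive_loss(z1, z2, y, margin=1.0):
    # z1, z2: (N, D) embeddings of the two samples in each pair
    # y: (N,) pair indicator, 0 = positive pair, 1 = negative pair
    d = torch.norm(z1 - z2, dim=-1)                  # Euclidean distance
    pos = (1 - y) * d.pow(2)                         # pulls positives together
    neg = y * torch.clamp(margin - d, min=0).pow(2)  # pushes negatives out to the margin
    return 0.5 * (pos + neg).mean()
```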

  • @user-ht4rw5wp4x
    @user-ht4rw5wp4x 1 month ago +2

    How does the model/programmer know if two pictures are a positive or negative pair without labels?

    • @Deepia-ls2fo
      @Deepia-ls2fo  1 month ago

      @@user-ht4rw5wp4x Well, you have several ways of defining the pairs; for instance, you create positive pairs with data augmentation as in SimCLR!

    • @henriksundt7148
      @henriksundt7148 23 days ago

      I was wondering the same. Even if positive pairs are created by augmentation (11:34 in the video), there is no way to be sure a cat is picked for the negative pair. How can it be a cat (at 12:15) without the labels?
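
As the replies above note, the pairs are defined without class labels: two random augmentations of the same image form a positive pair, and every other image in the batch serves as a negative, whatever it happens to depict (so a negative can indeed be another cat). A rough sketch with torchvision; the augmentation parameters here are placeholders, not the paper's exact values:

```python
import torch
from torchvision import transforms

# Illustrative SimCLR-style augmentations (crop, flip, color jitter, blur)
augment = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(0.4, 0.4, 0.4, 0.1),
    transforms.GaussianBlur(kernel_size=23),
])

def make_views(images):
    # images: (N, C, H, W) tensor; no labels are used anywhere
    v1 = torch.stack([augment(x) for x in images])  # first view of each image
    v2 = torch.stack([augment(x) for x in images])  # second view of each image
    # (v1[i], v2[i]) is a positive pair; every (v1[i], v*[j]) with j != i
    # is treated as a negative, regardless of its (unknown) class.
    return v1, v2
```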

  • @delistoyer
    @delistoyer 2 months ago

    I like that you're focusing on computer vision

  • @khoakirokun217
    @khoakirokun217 2 months ago

    Outstanding technique :D Thank you, it was not wrong to subscribe to the channel :D

  • @ProgrammingWithJulius
    @ProgrammingWithJulius 1 month ago

    Great video as always

  • @itz_lucky6472
    @itz_lucky6472 2 months ago +1

    At 12:04 you say that SimCLR selects multiple negative pairs, and then you show a picture of a cat and a dog. I am confused: the second dog picture is also considered a negative pair even though it's the same animal? If yes, does this mean the model trains to lower the distance ONLY with the original image, even though the others could be dogs?

    • @Deepia-ls2fo
      @Deepia-ls2fo  2 months ago +2

      Exactly! The negatives can be any other image in the batch, including very similar objects

    • @itz_lucky6472
      @itz_lucky6472 2 months ago +2

      @@Deepia-ls2fo That is very interesting, thank you for your answer. I have another question, if you do not mind.
      At the end, when comparing classification accuracy, you compare supervised, SimCLR+finetune and SimCLR. The last one has me confused: how can the model work for classification without any finetuning? Or do they not count a trained dense layer that learns to use the latent space of SimCLR for classification, while SimCLR+finetune means finetuning the latent space instead? My question is: does fine-tune mean finetuning a dense layer, or the latent space?
      Your videos are high quality and I really love them; sometimes I just wish they were longer and went slightly more into the implementation details. Thank you!
      Edit: Regarding my first question, since a negative pair can be of the same class (if we imagine the ultimate goal is classification), would a low number of classes (let's say only 2) lower the quality of the latent space due to a high amount of class "collision"? And conversely, with hundreds of classes, would it rarely select the same class as a negative pair and thus improve the latent space representation?

    • @Deepia-ls2fo
      @Deepia-ls2fo  2 months ago +1

      @@itz_lucky6472 I strongly advise you to read the SimCLR paper, as it is a very easy read and they detail everything.
      About the classification task: for SimCLR they use what we call "linear eval", meaning they plug a fully connected head on the model and train only this part. The difference between "SimCLR" and "SimCLR fine-tune" is that for "SimCLR fine-tune" the weights of the backbone are also modified in a supervised fashion with a small portion of the data.
      For your second question, I did not read a lot about this, and I'm new to self-supervised learning myself, so I can't answer for sure. I guess you could easily run the experiment with 2 MNIST classes though. Intuitively, I think taking many semantically similar objects and treating them as negatives is bad for the representation space.
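
To make that distinction concrete, here is a sketch of the two evaluation setups, assuming a SimCLR-pretrained `backbone` (projection head removed) that outputs `feat_dim`-dimensional features; all names and sizes are placeholders:

```python
import torch.nn as nn

def linear_eval(backbone, feat_dim=2048, num_classes=10):
    # "SimCLR" column: freeze the backbone, train only the linear head
    for p in backbone.parameters():
        p.requires_grad = False
    return nn.Sequential(backbone, nn.Linear(feat_dim, num_classes))

def fine_tune(backbone, feat_dim=2048, num_classes=10):
    # "SimCLR fine-tune" column: same head, but the backbone stays
    # trainable and is updated with a small labelled subset of the data
    return nn.Sequential(backbone, nn.Linear(feat_dim, num_classes))
```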

  • @Bwaaz
    @Bwaaz 2 months ago

    Great video! You mention that the contrastive loss pushes/pulls points; how does the loss function "push away" a point, exactly?

    • @Deepia-ls2fo
      @Deepia-ls2fo  2 months ago

      Thanks, it pushes negative pairs apart until their distance reaches the margin, by minimizing the difference between the margin and the distance between the points.
      This is the quantity in red at 06:40 :)
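
A quick numeric check of that push, using the margin term (margin - d)^2 for a negative pair whose current distance d is below the margin; the values are arbitrary:

```python
import torch

margin = 1.0
d = torch.tensor(0.3, requires_grad=True)     # distance of a negative pair
loss = torch.clamp(margin - d, min=0).pow(2)  # the quantity in red at 06:40
loss.backward()
print(d.grad)  # tensor(-1.4000): a negative gradient, so gradient descent
               # increases d, pushing the pair apart until d reaches the margin
```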

  • @gx1501996
    @gx1501996 2 months ago

    The InfoNCE loss at 11:14 looks odd, as Dp is the distance notation at 9:00, but you say it's related to probabilities. It would break the flow to introduce new notation though. But as it stands, it was a little confusing to me to see that the loss would be minimized by maximizing Dp. I checked the paper, and it seems the term is an approximator for "mutual information", which we want to be bigger for positive samples. At least that's my rough understanding...
    Thanks for the video, it's a fantastic explanation!

    • @Deepia-ls2fo
      @Deepia-ls2fo  1 month ago

      Indeed, I should have taken the time to introduce it properly and use the correct notations.
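
For reference, the loss SimCLR actually minimizes is NT-Xent, an InfoNCE-style objective: the positive pair's cosine similarity sits in the numerator of a softmax, so the loss decreases as that similarity increases. A minimal PyTorch sketch with illustrative names:

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.5):
    # z1, z2: (N, D) projections of the two views; rows i of z1, z2 are positives
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)  # 2N unit vectors
    sim = z @ z.t() / temperature                        # scaled cosine similarities
    sim.fill_diagonal_(float('-inf'))                    # exclude self-similarity
    n = z1.shape[0]
    targets = torch.cat([torch.arange(n) + n, torch.arange(n)])  # positive of each row
    # cross-entropy = -log(exp(sim_pos) / sum_k exp(sim_k)), so raising the
    # positive similarity lowers the loss
    return F.cross_entropy(sim, targets)
```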

  • @stormaref
    @stormaref 18 days ago

    Great content; keep it up

  • @gamma8675
    @gamma8675 2 months ago

    awesome content!

  • @dhurbatripathi6924
    @dhurbatripathi6924 1 month ago

    When's the next video? Love these visualizations!

    • @Deepia-ls2fo
      @Deepia-ls2fo  1 month ago

      @@dhurbatripathi6924 Thanks! By the end of November!

  • @Omunamantech
    @Omunamantech 2 months ago

    Awesome Video :D

  • @diffpizza
    @diffpizza 2 months ago

    Nice explanation! It still isn't clear to me how to choose the metric that determines how similar or dissimilar two samples are. Is it also learned by the network?

    • @Deepia-ls2fo
      @Deepia-ls2fo  2 months ago

      You can choose any differentiable metric; that's one of the strengths of this framework :)
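
In other words, the metric is a fixed design choice rather than something the network learns; it only has to be differentiable so gradients can flow back into the encoder. Two common choices, as a sketch:

```python
import torch
import torch.nn.functional as F

def cosine_sim(a, b):
    # what SimCLR uses (on the normalized projections)
    return F.cosine_similarity(a, b, dim=-1)

def neg_euclidean(a, b):
    # an equally valid alternative: any differentiable similarity works
    return -torch.norm(a - b, dim=-1)
```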

  • @444haluk
    @444haluk 2 months ago +1

    Augmentations ARE the labels, labels of "ignore".

  • @dewibatista5752
    @dewibatista5752 2 months ago

    Is the voice in the vid the output of a TTS model?

    • @Deepia-ls2fo
      @Deepia-ls2fo  2 months ago +1

      Yes! It's my voice though :)

  • @braineaterzombie3981
    @braineaterzombie3981 2 months ago +3

    Cool

  • @SabbirAhmedsas
    @SabbirAhmedsas 2 months ago

    Thanks💀

  • @Nomah425
    @Nomah425 2 months ago

    Hmmmmmm YES