A New Perspective on Adversarial Perturbations

  • Published: 10 Jul 2024
  • Aleksander Madry (MIT)
    simons.berkeley.edu/talks/tbd-57
    Frontiers of Deep Learning

Comments • 9

  • @Kram1032 5 years ago +4

    Wow, people had a lot of questions during this one. I can see why, though. This was a great talk about all-around great work!

  • @TheMarcusrobbins 5 years ago +1

    Oh god this is fascinating. Is there some domain of perception that is completely inaccessible to us? God, find out what the features look like already!!!

  • @volotat 5 years ago +1

    Fantastic work. I wonder if this approach can potentially beat GANs at image generation one day. Very impressive.

  • @paulcurry8383 3 years ago

    I feel like this parallels the universal adversarial triggers of NLP models. Those are effective because they exploit low-level features of the dataset the model is trained on. I wonder how you could apply "noise" to the input of an NLP model to reduce dependence on low-level features... perhaps by substituting words with close synonyms? (A rough sketch of this kind of input noise follows this thread.)

    • @PaulLai 2 years ago +1

      A token in a sentence is more analogous to a pixel in an image. Adding noise could mean inserting random words that don't mislead a human but do mislead the model.

    • @psd993 1 year ago

      There was a paper from Ilyas et al. out of MIT that proposed that adversarial examples come from well-generalizing features in the datasets. They call these features "brittle" because they are not what humans would pick up on.
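
A minimal sketch of the two kinds of input "noise" discussed in this thread: synonym substitution and insertion of semantically light random words. The synonym table is purely illustrative, and `classify` stands in for whatever hypothetical NLP classifier one wants to probe; none of these names come from the talk itself.

```python
import random

# Toy synonym table; in practice one might derive synonyms from WordNet
# or embedding nearest neighbours. This table is purely illustrative.
SYNONYMS = {
    "great": ["excellent", "fine"],
    "movie": ["film", "picture"],
    "boring": ["dull", "tedious"],
}

def synonym_noise(tokens, p=0.3, rng=random):
    """Swap each token for a close synonym with probability p."""
    return [
        rng.choice(SYNONYMS[t]) if t in SYNONYMS and rng.random() < p else t
        for t in tokens
    ]

def insertion_noise(tokens, fillers=("really", "quite", "honestly"), p=0.2, rng=random):
    """Insert semantically light filler words with probability p per position
    (random words that should not mislead a human reader)."""
    out = []
    for t in tokens:
        if rng.random() < p:
            out.append(rng.choice(fillers))
        out.append(t)
    return out

def flip_rate(classify, sentence, noise_fn, trials=100):
    """Fraction of noisy variants on which the (hypothetical) classifier's
    prediction differs from its prediction on the clean sentence."""
    tokens = sentence.split()
    clean_pred = classify(" ".join(tokens))
    flips = sum(
        classify(" ".join(noise_fn(tokens))) != clean_pred
        for _ in range(trials)
    )
    return flips / trials
```

Under these assumptions, a model that leans heavily on brittle, low-level features of its training data would be expected to show a higher flip rate under either perturbation than one relying on features a human would also use.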

  • @aBigBadWolf 5 years ago +1

    16:55 The comment is valid: the second model just learned to imitate the previous model. The fact that the classifier architecture is slightly different is irrelevant.

  • @aBigBadWolf 5 years ago +2

    The presenter has no idea how humans learn.