Poisoning Web-Scale Training Datasets - Nicholas Carlini | Stanford MLSys #75

  • Published: Aug 1, 2024
  • Episode 75 of the Stanford MLSys Seminar “Foundation Models Limited Series”!
    Speaker: Nicholas Carlini
    Title: Poisoning Web-Scale Training Datasets is Practical
    Abstract: In this talk I introduce the first practical poisoning attack on large machine learning datasets. With our attack I could have poisoned (but didn't!) the training dataset for anyone who has used LAION-400M in the last six months. While we take steps to mitigate these attacks, these mitigations come at a (sometimes significant) cost to utility. Addressing these challenges will require new categories of defenses that simultaneously allow models to train on large datasets while also being robust to adversarial training data. (A rough sketch of the expired-domain aspect of the attack appears after this description.)
    Bio: Nicholas Carlini is a research scientist at Google Brain. He studies the security and privacy of machine learning, for which he has received best paper awards at ICML, USENIX Security and IEEE S&P. He obtained his PhD from the University of California, Berkeley in 2018.
    Check out our website for the schedule: mlsys.stanford.edu
    Join our mailing list to get weekly updates: groups.google.com/forum/#!for...
  • Science
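
The attack discussed in the talk exploits the fact that datasets like LAION-400M are distributed as lists of URLs rather than as the images themselves, so a URL whose domain has since lapsed can later be bought and made to serve attacker-chosen content (this is also what the 10:24 comment below refers to). The following is a minimal, illustrative sketch of estimating that exposure; the input file name urls.txt and the use of DNS resolution as a proxy for "possibly expired" are assumptions made for this example, not the speaker's actual measurement pipeline.

```python
# Illustrative sketch: given a plain-text list of dataset image URLs (one per
# line, e.g. a slice of LAION-400M metadata), count how many of them live on
# domains that no longer resolve. A non-resolving domain is only a crude hint
# that it may have expired and could be re-registered by an attacker.
# "urls.txt" and the DNS heuristic are assumptions made for this example.
import socket
from collections import Counter
from urllib.parse import urlparse


def domain_of(url: str) -> str:
    """Return the host part of a URL, lower-cased and without a port."""
    return urlparse(url).netloc.lower().split(":")[0]


def resolves(domain: str) -> bool:
    """True if the domain currently resolves via DNS."""
    try:
        socket.gethostbyname(domain)
        return True
    except (socket.gaierror, UnicodeError):
        return False


def main(path: str = "urls.txt") -> None:
    with open(path) as f:
        urls = [line.strip() for line in f if line.strip()]

    per_domain = Counter(domain_of(u) for u in urls)
    dead = {d for d in per_domain if d and not resolves(d)}
    affected = sum(per_domain[d] for d in dead)

    print(f"{len(dead)}/{len(per_domain)} distinct domains do not resolve")
    print(f"{affected}/{len(urls)} URLs could be pointed at attacker-chosen content")


if __name__ == "__main__":
    main()
```

Whether a non-resolving domain is actually available for purchase would still have to be confirmed with a registrar; the sketch only illustrates why URL-indexed distribution makes this class of poisoning practical.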

Comments • 4

  • @jayasimhatalur5503
    1 year ago +3

    Thanks for organising this seminar and inviting prolific researchers.

  • @xianbiaoqi7009
    1 year ago

    Always an interesting talk; I'm becoming a fan of the Stanford MLSys seminars.

  • @bilderzucht
    10 months ago

    10:24 When domain names expire, anyone can buy them. So at least the domains of the artists and authors who are pushed out of work because their data was used will afterwards be put to use by Google. Is that the kind of "compensation" that can be expected from the ML community?