Anthropic Solved Interpretability?

  • Published: Oct 10, 2024
  • Paper: transformer-ci...
    Blogpost: www.anthropic....
    Lesswrong: www.lesswrong....

Comments • 15

  • @FreakyStyleytobby 7 months ago +1

    Definitely good news!
    Thank you very much for the quality video, man

  • @mgetommy 1 year ago +2

    Enjoyed this a lot! Pls do more

  • @arjoon 11 months ago +1

    Great video!
    Makes me wonder if there'll always be a fundamental tradeoff between interpretability and goodness of fit for a model

  • @jordan13589 1 year ago +2

    When you’re getting ready to go as sexy Pugsley to a costume party just as Anthropic cracks mech interp:

  • @Alice_Fumo 1 year ago +6

    Recent papers are really starting to steal all my good ideas. (Or confirming that my ideas were ever good and novel to begin with)
    I believe I didn't investigate this particularly far, since I didn't feel like the implications were particularly meaningful. Having these features, and even a map of the neurons associated with a feature, doesn't seem that useful to me in itself. By "not that useful" I mean that actually using this to make stuff safer is still extremely difficult, and there are like 20 easier avenues to better model performance, so I'm pretty unsure whether they'll find ways to use this in practice that don't eventually lead to dead ends.

    • @fredolivier1431 1 year ago

      I am having exactly the same experience, it's crazy. Notepads full of ideas that turn up on arXiv, sometimes months later. Best move: actualise autonomously.

  • @meiotta 1 year ago +1

    always good content

  • @jt-rv5qu 10 months ago

    As we talk about probabilities of the next token at different levels of attention, Large Text Models create only the illusion of intelligence in their users. LTMs are great for transcription, translation, or as NLP interfaces to other systems, but there is no need for mechanistic interpretability. A far better goal would be to give them reasoning capacity; then you could simply ask them to introspect 🎉

  • @kyneticist 1 year ago

    "They" did not say that mechanistic interpretability would never be achieved, "they" said that it should have been pursued as a fundamental step.

  • @danielbrockman7402 1 year ago +1

    🎉

  • @RonponVideos 1 year ago +1

    I dig the hair. It ages you (in a good way).

  • @BooleanDisorder 9 months ago

    You are cute!