Definitely good news!
Thank you very much for the quality video, man
Enjoyed this a lot! Pls do more
Great! That's the plan
Great video!
Makes me wonder if there'll always be a fundamental tradeoff between interpretability and goodness of fit for a model
When you’re getting ready to go as sexy Pugsley to a costume party just as Anthropic cracks mech interp:
Recent papers are really starting to steal all my good ideas. (Or at least confirming that my ideas were good and novel to begin with)
I don't think I investigated this particularly far, since I didn't feel the implications were that meaningful. Having these features, and even a map of the neurons associated with each feature, doesn't seem that useful to me in itself. By "not that useful" I mean that actually using this to make models safer is still extremely difficult, and there are like 20 easier avenues to better model performance, so I'm pretty unsure whether they'll find ways to use this in practice that don't eventually lead to dead ends.
I am having exactly the same experience, it's crazy. Notepads full of ideas that turn up on arXiv sometimes months later. Best move: actualise autonomously.
always good content
always good comments
Since we're talking about probabilities of the next token at different levels of attention, Large Text Models only create the illusion of intelligence for their users. LTMs are great for transcription, translation, or as NLP interfaces to other systems, but there's no real need for mechanistic interpretability. A far better goal would be to give them reasoning capacity; then you could simply ask them about their introspection 🎉
"They" did not say that mechanistic interpretability would never be achieved, "they" said that it should have been pursued as a fundamental step.
🎉
I dig the hair. It ages you (in a good way).
Thanks, glad you like it!
You are cute!