Why can't I shake the feeling that someone just explained o1-preview to me without ever mentioning it? 🤔 Thank you! 🙏
a ton of planning to roll out N COTs :)
the architect of Cicero and "scaling inference time compute."
Well, the talk actually took place in May if you look at the description. So he kind of hinted at o1 three months ago.
@@windmaple ik my point exactly.. probably told UW to not release it until now
This is awesome. I like how he explained the generator-verifier gap. This will be huge for AI safety and reliability in addition to performance.
Would love if some of these papers were in the description for easy reference!
1:26
Never underestimate search. -Waldo
Oh my god brilliant.
And that's how we know you're a 90s kid!
Very interesting lecture. Thank you!
Search means finding a series of actions that lead from the current state to an end state that you would like, or alternatively avoiding states that are potentially bad for you in the future.
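To make that definition concrete, here is a minimal sketch of search as "find a sequence of actions from the current state to a state you like while avoiding bad ones". It is a plain breadth-first search over a toy number puzzle, not anything shown in the talk; the names `find_plan`, `ACTIONS`, `is_goal` and `is_bad` are made up for illustration.

```python
from collections import deque

def find_plan(start, is_goal, actions, is_bad=lambda s: False, max_depth=20):
    """Breadth-first search: return a shortest list of action names
    leading from `start` to a goal state, or None if none is found."""
    frontier = deque([(start, [])])   # (state, actions taken so far)
    seen = {start}
    while frontier:
        state, plan = frontier.popleft()
        if is_goal(state):
            return plan
        if len(plan) >= max_depth:
            continue
        for name, step in actions.items():
            nxt = step(state)
            # Skip states already visited and states we want to avoid.
            if nxt in seen or is_bad(nxt):
                continue
            seen.add(nxt)
            frontier.append((nxt, plan + [name]))
    return None

# Toy problem (an assumption for illustration): get from 1 to 10 using
# "add1" and "double", never going above 10.
ACTIONS = {"add1": lambda n: n + 1, "double": lambda n: n * 2}
plan = find_plan(1, lambda n: n == 10, ACTIONS, is_bad=lambda n: n > 10)
print(plan)
```

The same loop works for any problem where you can enumerate actions and recognize good and bad states; what changes between domains is how large the state space is and how you decide which states to expand first.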
So basic algebra counts as search?
The way AI is progressing is so closely related to evolution, just on a much faster time scale.
"It is not the strongest of the species that survive, nor the most intelligent, but the one most responsive to change." - Charles Darwin
The trillion dollar question - can search with foundation models generalize beyond objectively verifiable domains like math, coding, and games?
The answer is no because the models, including the search-based ones, require correctly scored training data to begin with. Where is this scoring supposed to come from for other domains, which cannot be easily simulated, and in which scoring the solution correctly is a big part of the problem? That is the core question for our AI hypesters (which they will avoid at all costs as it makes the whole house of cards collapse).
So far their only proposal, for image recognition and language modeling tasks specifically, has been to hire thousands of underpaid workers to do all the scoring for them. The difficulty here is that scoring in real-life domains cannot be done by low-paid labor slaves. That is, if it can be done at all: in many cases experts cannot analytically explain their expertise, yet they can intuitively take "correct" actions, based on lifelong experience, using their own "neural nets" locked up in their brains.
@@clray123 I think that you’re underestimating the odds of AI acquiring aesthetic taste at the level of talented people via clever math/algorithms. We’ve already seen art and writing contests won by AI. To me, the actual question is when, not if.
@@JustinHalford Art and writing contests won by AI (any examples?) would really mean nothing - the recipe for success in such a contest would be to just copy someone else's great work and declare yourself the winner. We already know that AI is good at imitation, if the thing to be imitated exists in a million examples that can be interpolated across, but we also know that a great art forger does not make a great artist.
I think you are overestimating the odds of AI acquiring anything, really. What we call "emergent" abilities are really the result of being able to pick relevant signal from humongous amounts of training data. I am talking about situations where no such training data is available.
@@clray123 have you heard of move 37? With sufficient compute and generalized self play, we will see many more examples of move 37 in a variety of domains.
His points on why people didn’t prioritize search are very illuminating.
The broader lesson here is that trained, distilled knowledge is pattern recognition and good for perceptual tasks, whereas adding search and exploration (as in GOFAI) is necessary for cognitive tasks.
I think there might be one more step: to distill the patterns discovered via search back into perceptual precepts, which I think is what happens in grandmaster play in chess and in geniuses such as Newton or Ramanujan.
Whether o1 already does this, similar to AlphaZero, I do not know, as I am typing this halfway through the lecture.
So, it'd be a loop of creating new patterns as it encounters novel situations.
Us cognitive scientists have known about this for a long time as well; "system 1" and "system 2."
@@DistortedV12 yes, I am aware of that and read Kahneman's great book on that topic too, but what is fascinating is how having human players beat the System 1 version of their bot forced them to add search.
@@DistortedV12 cool
Interesting 💡🚀
And this is how o1 was born.
Interesting
I have been listening for a while now. Though I agree that enabling search is a big factor for GenAI intellect, it's still not clear from the context of the poker game why. I can only assume you taught the model to read people's faces and then search their historical game record to know when they are bluffing and when they really do have a strong hand?
@@erikfast9764 Thank you Erik, it keeps the excitement in the game then as that makes AI beatable by confusing it with irrational behaviour. But when AI becomes unbeatable, it must not have any hand in any game as it will kill the game.
@@fil4dworldcomo623 AI has already been beating online poker since like 2013. Playing irrationally does not matter: the AI plays defensively, aka "GTO", and doesn't mind if you never bluff or if you bluff every hand; it will still play exactly the same way (that's why all the pros talk about using "GTO Strategy"). Live poker will always be a thing, but even then you could have a device that tells you how to play like a bot.
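A toy way to see the "plays exactly the same way no matter what you do" idea is rock-paper-scissors rather than poker. The sketch below is only an illustration under that assumption, not how any poker bot is built: it checks that the equilibrium strategy (uniform random) has the same expected payoff against any opponent, while a predictable strategy can be exploited. The names `PAYOFF`, `expected_value` and `EQUILIBRIUM` are made up for this example.

```python
from fractions import Fraction

MOVES = ["rock", "paper", "scissors"]

# PAYOFF[mine][theirs] is my payoff: +1 for a win, -1 for a loss, 0 for a tie.
PAYOFF = {
    "rock":     {"rock": 0,  "paper": -1, "scissors": 1},
    "paper":    {"rock": 1,  "paper": 0,  "scissors": -1},
    "scissors": {"rock": -1, "paper": 1,  "scissors": 0},
}

def expected_value(mine, theirs):
    """Expected payoff for `mine` when both players sample moves
    independently from their own probability distributions."""
    return sum(mine[a] * theirs[b] * PAYOFF[a][b] for a in MOVES for b in MOVES)

def pure(move):
    """Strategy that always plays `move`."""
    return {m: Fraction(1 if m == move else 0) for m in MOVES}

# Equilibrium strategy for this toy game: each move with probability 1/3.
EQUILIBRIUM = {m: Fraction(1, 3) for m in MOVES}

# However the opponent plays, the equilibrium strategy cannot be exploited.
for opponent in (pure("rock"), pure("paper"), pure("scissors"), EQUILIBRIUM):
    print(expected_value(EQUILIBRIUM, opponent))

# A predictable strategy, by contrast, loses to the right counter.
print(expected_value(pure("rock"), pure("paper")))
```

One caveat: in rock-paper-scissors the equilibrium only breaks even against everyone. In poker, a fixed equilibrium strategy can still win money from opponents who make mistakes, which is why not adapting to "irrational" play does not have to hurt it.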
COOL
many of these papers don't exist... did an LLM create these slides wtf
I'm a newbie here and I noticed Noam uses the terms planning and search interchangeably. So in a sense, can RAG be considered planning? After all, it does a search and improves the quality of the answer. Correct me if I am mistaken.
TGI MCTS
Is the poker bot making money on the internet right now?
$150 for a poker bot - crazy
He always hates going into depth on how he made the poker model
And rightly so, because it's not a talk where he is supposed to throw around mathematical formulae mixed with arcane poker rules and assume that everyone in the audience can follow.
@@clray123 “always”
What are you implying? I’m dense
"I started grad school in 2012" but looks like he started grad school in 2025