Parables on the Power of Planning in AI: From Poker to Diplomacy: Noam Brown (OpenAI)

  • Published: Nov 9, 2024

Comments • 55

  • @harriehausenman8623 1 month ago +13

    Why can't I shake the feeling that someone just explained o1-preview to me without ever mentioning it? 🤔 Thank you! 🙏

    • @ericchang9568 1 month ago +2

      a ton of planning to roll out N CoTs :)

  • @DistortedV12 1 month ago +47

    the architect of Cicero and "scaling inference time compute."

    • @windmaple 1 month ago +9

      Well, the talk actually took place in May if you look at the description. So he kind of hinted at o1 3 months ago

    • @DistortedV12 1 month ago +4

      @windmaple ik, my point exactly... probably told UW to not release it until now

    • @tmchen3440 1 month ago

      😢😮t😢 Pignll

  • @rylieweaver1516 1 month ago +2

    This is awesome. I like how he explained the generator-verifier gap. This will be huge for AI safety and reliability in addition to performance.

  • @triplea657aaa 1 month ago +11

    Would love if some of these papers were in the description for easy reference!

  • @patruff 1 month ago +28

    Never underestimate search. -Waldo

    • @smicha15 1 month ago +1

      Oh my god brilliant.

    • @brianpalmer967 1 month ago +1

      And that's how we know you're a 90s kid!

  • @omadDev 1 month ago +1

    Very interesting lecture. Thank you!

  • @RaviAnnaswamy 1 month ago +2

    Search means finding a series of actions that leads from the current state to an end state that you would like, or alternatively avoiding states that could be bad for you in the future.

    • @heykike 1 month ago

      So basic algebra counts as search?

  • @ankitkumarpandey7262 1 month ago +9

    The way AI is progressing is so closely related to evolution, just at a much faster time scale.

    • @brandonbodily2101 1 month ago +1

      "It is not the strongest of the species that survive, nor the most intelligent, but the one most responsive to change." - Charles Darwin

  • @JustinHalford 1 month ago +26

    The trillion dollar question - can search with foundation models generalize beyond objectively verifiable domains like math, coding, and games?

    • @clray123 1 month ago +2

      The answer is no because the models, including the search-based ones, require correctly scored training data to begin with. Where is this scoring supposed to come from for other domains, which cannot be easily simulated, and in which scoring the solution correctly is a big part of the problem? That is the core question for our AI hypesters (which they will avoid at all costs as it makes the whole house of cards collapse).
      So far their only proposition for image recognition and language modeling tasks specifically has been to hire thousands of underpaid workers to do all the scoring for them. The difficulty here is that scoring in real-life domains cannot be done by low-paid labor slaves. That is, if it can be done at all: in many cases experts cannot analytically explain their expertise, yet they can intuitively take "correct" actions, based on lifelong experience, using their own "neural nets" locked up in their brains.

    • @JustinHalford 1 month ago +2

      @clray123 I think that you’re underestimating the odds of AI acquiring aesthetic taste at the level of talented people via clever math/algorithms. We’ve already seen art and writing contests won by AI. To me, the actual question is when, not if.

    • @clray123 1 month ago

      @JustinHalford Art and writing contests won by AI (any examples?) would really mean nothing: the recipe for success in such a contest would be to just copy someone else's great work and declare yourself the winner. We already know that AI is good at imitation, if the thing to be imitated exists in a million examples that can be interpolated across, but we also know that a great art forger does not make a great artist.

    • @clray123 1 month ago +1

      I think you are overestimating the odds of AI acquiring anything, really. What we call "emergent" abilities are really the result of being able to pick relevant signal from humongous amounts of training data. I am talking about situations where no such training data is available.

    • @JustinHalford 1 month ago +4

      @clray123 Have you heard of move 37? With sufficient compute and generalized self-play, we will see many more examples of move 37 in a variety of domains.

  • @RaviAnnaswamy 1 month ago +6

    His points on why people didn’t prioritize search are very illuminating.
    The broader lesson here is that trained, distilled knowledge is pattern recognition and good for perceptual tasks, whereas adding search and exploration (as in GOFAI) is necessary for cognitive tasks.
    I think there might be one more step: distilling the patterns discovered via search back into perceptual precepts, which I think is what happens in grandmaster play in chess and in geniuses such as Newton or Ramanujan.
    Whether o1 already does this, similar to AlphaZero, I do not know, as I am typing this halfway through the lecture.

    • @masterchief7301 1 month ago +1

      So, it'd be a loop of creating new patterns as it encounters novel situations.

    • @DistortedV12 1 month ago +1

      We cognitive scientists have known about this for a long time as well: "System 1" and "System 2."

    • @RaviAnnaswamy 1 month ago

      @DistortedV12 Yes, I am aware of that and have read Kahneman's great book on that topic too, but what is fascinating is how seeing human players beat the System 1 version of their bot forced them to add search.

    • @FamilyYoutubeTV-x6d 1 month ago

      @DistortedV12 cool

  • @elliptictree 1 month ago +1

    Interesting 💡🚀

  • @marbin1069 1 month ago

    And this is how o1 was born.

  • @hypercube717 1 month ago

    Interesting

  • @fil4dworldcomo623 1 month ago

    I have been listening for a while now, and though I agree that enabling search is a big factor in GenAI intellect, it's still not clear to me why it matters in the context of poker. I can only assume you taught the model to read people's faces and then search their historical game records to know when they are bluffing and when they really do have a strong hand?

    • @fil4dworldcomo623 1 month ago

      @erikfast9764 Thank you, Erik. That keeps the excitement in the game then, as it makes AI beatable by confusing it with irrational behaviour. But when AI becomes unbeatable, it must not have any hand in any game, as it will kill the game.

    • @lesmoe524 1 month ago

      @fil4dworldcomo623 AI has been beating online poker since around 2013. Playing irrationally does not matter: the AI plays defensively, aka "GTO", and doesn't mind if you never bluff or if you bluff every hand; it will still play exactly the same way (that's why all the pros talk about using "GTO strategy"). Live poker will always be a thing, but even then you could have a device that tells you how to play like a bot.

  • @Eriiiiiiiick 1 month ago

    COOL

  • @Z-dv3zx 1 month ago +2

    many of these papers don't exist... did an LLM create these slides wtf

  • @ieltshome 10 days ago

    I'm a newbie here and I noticed Noam uses the terms planning and search interchangeably. So in a sense, can RAG be considered planning? After all, it does a search and improves the quality of the answer. Correct me if I am mistaken.

  • @patruff 1 month ago

    TGI MCTS

  • @ericchang9568 1 month ago

    Is the poker bot making money on the internet right now?

  • @twoplustwo5 5 days ago

    $150 for a poker bot - crazy

  • @JimJordan1753 1 month ago +3

    He always hates going into depth on how he made the poker model

    • @clray123 1 month ago +2

      And rightly so, because this is not the kind of talk where he is supposed to throw around mathematical formulae mixed with arcane poker rules and assume that everyone in the audience can follow.

    • @JimJordan1753 1 month ago +1

      @clray123 “always”

    • @samkee3859 26 days ago

      What are you implying? I’m dense

  • @sucim 1 month ago +4

    "I started grad school in 2012" but looks like he started grad school in 2025