OpenAI's Noam Brown, Ilge Akkaya and Hunter Lightman on o1 and Teaching LLMs to Reason Better

  • Published: Nov 21, 2024

Comments

  • @andrewwalker8985
    @andrewwalker8985 1 month ago +19

    You can see the value of open source in this interview. We don’t get smart people sharing their thoughts and excitement openly; we get smart people who are excited and would love to share, but are armed with pre-approved sentences they are allowed to say.

  • @user-pt1kj5uw3b
    @user-pt1kj5uw3b 1 month ago +38

    I hate to thank our corporate VC overlords, but these interviews are pretty cool. I think they will be historically significant in a few years.

    • @sup3a
      @sup3a 1 month ago +1

      100%

    • @rollingrock3480
      @rollingrock3480 1 month ago

      They will be legally significant as an example of how big tech companies have defeated the spirit of the law time and time again, to the overall detriment of society. (Remember Facebook giving 5 points for an angry reaction and 1 point for a like when recommending posts for your FB feed, then telling Congress they want to "Bring us all together"?)

    • @ashh3051
      @ashh3051 11 days ago

      Corporate VC overlords 🤔

  • @sup3a
    @sup3a 1 month ago +3

    Very good podcast, thank you. No extra hype, just very matter-of-fact. Just what I need in the middle of all the hype.

  • @NandoPr1m3
    @NandoPr1m3 1 month ago +5

    I like that we are getting to see the real people behind the curtain at OpenAI. My big takeaway is that they A) have other ideas being researched and B) that they aren't afraid to try new paradigms, which is basically what led to the o1 Models.

  • @rickandelon9374
    @rickandelon9374 1 month ago +26

    Btw the capital of Bhutan is Thimphu. Mostly hilly country and the world's only carbon-negative country.

    • @marwin4348
      @marwin4348 1 month ago

      Sucks for them, they still did not achieve industrialisation?

    • @rickandelon9374
      @rickandelon9374 1 month ago

      @marwin4348 The country is expensive as hell and poor in terms of self-reliance. They mostly import their goods from India, which constantly bullies them with various political pressures.

    • @ominousplatypus380
      @ominousplatypus380 1 month ago

      "Mostly hilly" might be the understatement of the century. The entirety of the country is enveloped by the Himalayas and it's arguably the most mountainous country that exists.

    • @andrewwalker8985
      @andrewwalker8985 1 month ago

      @marwin4348 That seems uncalled for.

    • @-rate6326
      @-rate6326 1 month ago

      @rickandelon9374 India doesn't actually bully Bhutan. India is the security guarantor for Bhutan against China. This year India allotted 267 million USD for Bhutan. India doesn't need to bully Bhutan; Bhutan just accepts whatever India says. The Bhutanese military is trained by India, and it trains in India.
      The real bully is China, which is responsible for salami slicing around Bhutan's borders.
      India and Bhutan have a ten-article, perpetual treaty signed right after independence. Under this treaty India can't interfere in Bhutan's internal matters, while Bhutan's external matters are guided by India. Recently China said that if Bhutan permanently agrees to give a certain part of its territory to China, China will return the part it has taken from Bhutan. Bhutan was agreeing to this, but India said it's a Chinese trap. Why should Bhutan permanently give up territory that belongs to Bhutan?
      What you are saying is probably a Chinese influence operation. China is the big bully in Asia.

  • @senju2024
    @senju2024 1 month ago +7

    They have "Strawberries" on the table while talking about O1. NICE!~

  • @emmanuelgoldstein3682
    @emmanuelgoldstein3682 1 month ago +51

    That's the biggest bowl of strawberries I've ever seen

  • @xiaoxiandong7382
    @xiaoxiandong7382 1 month ago +6

    It's funny the researchers kept looking at the paper in front of them. Does it say what they can say vs not?

  • @thatthotho
    @thatthotho 1 month ago +16

    How many R's are in the bowl?

    • @Crux69
      @Crux69 1 month ago +3

      Technically, none :D

    • @tomenglish9340
      @tomenglish9340 1 month ago

      There are 3 R's in STRAWBERRIES, as in STRAWBERRY.

    • @adityakrishnaakula746
      @adityakrishnaakula746 1 month ago

      Quite cheeky 😂 that they have a bowl of strawberries there

  • @PaddyLamont
    @PaddyLamont 1 month ago +3

    That little beep sound before the intro had me guessing whether my headphones had gone haywire.

    • @user-pt1kj5uw3b
      @user-pt1kj5uw3b 1 month ago +1

      Same. Felt like a telegram operator interpreting morse code for a second. They need to add a visual component.

  • @ashh3051
    @ashh3051 11 days ago

    Has there been any research into letting the model make edits to its reasoning text instead of only being able to append tokens? That way it could think longer and improve the quality of its work.

  • @constantinelinardakis8394
    @constantinelinardakis8394 1 month ago

    21:30 on STEM and hard reasoning; that's why o1 is so good

  • @uw10isplaya
    @uw10isplaya 1 month ago

    24:11 is the most interesting topic in AI for me

  • @JoshuaGottlieb-oz4er
    @JoshuaGottlieb-oz4er 1 month ago

    Great content; thank you

  • @JumpDiffusion
    @JumpDiffusion 1 month ago +7

    9:15 so he basically avoided the question 😏

    • @vnehru1
      @vnehru1 1 month ago +1

      Yes. Also noticed.

    • @MrC0MPUT3R
      @MrC0MPUT3R 1 month ago

      He didn't really avoid it; he just said he doesn't know. They're hoping that as the model's reasoning method is tested in a diverse set of domains, its weaknesses and strengths will become clear, so that at some point in the future they can actually answer that question and further refine training methods.

  • @whemmakatatt5311
    @whemmakatatt5311 1 month ago +1

    Dayum , one to watch for suuure

  • @spinvalve
    @spinvalve 27 days ago

    Is it just me, or is the male host from Sequoia a doppelganger of 3Blue1Brown? Both his voice and appearance are stupendously uncanny.

  • @maxziebell4013
    @maxziebell4013 1 month ago

    Great discussion

  • @user-wr4yl7tx3w
    @user-wr4yl7tx3w 1 month ago +6

    What’s on the paper? Why is everyone staring at theirs?

    • @Crux69
      @Crux69 1 month ago +8

      PR and Legal notes from their internal ASI ;)

    • @tomenglish9340
      @tomenglish9340 1 month ago +2

      @Crux69 That's what it looks like to me -- no joke.

  • @alexiscao8749
    @alexiscao8749 1 month ago +1

    The definition of reasoning: "the ability to consider more options and evaluate the correctness of the choice". Isn't that search for an optimum?

    • @tomenglish9340
      @tomenglish9340 1 month ago +1

      In a recent talk (I've forgotten which), he made it clear that he was conflating search with reasoning. I wouldn't do that, but I don't think it's a sin.

  • @constantinelinardakis8394
    @constantinelinardakis8394 1 month ago

    26:22 def on agi

  • @constantinelinardakis8394
    @constantinelinardakis8394 1 month ago

    12:00 training on tons of data

  • @constantinelinardakis8394
    @constantinelinardakis8394 1 month ago

    36:00 on just data and time

  • @DanielleNewnham
    @DanielleNewnham 1 month ago +1

    Thimphu is the capital of Bhutan. You're welcome :)

  • @prince-din
    @prince-din 1 month ago

    Why can't I open my SeqCap?

  • @Drackomass
    @Drackomass 1 month ago +2

    Fastest click ever

  • @redyican5341
    @redyican5341 1 month ago

    It will look like this: math > programming > simulations > agents > answering hard open questions
    IMO it is easier to source info from the real world than to run simulations. Infinite IQ doesn’t exist. IQ is search in solution space, and it has constraints even with the best heuristics; we can see in humans that these heuristics are maladaptive when applied to problems that are too narrow. So a single model trained on general questions won't develop the insane heuristics of 170 IQ people.
    A MoE architecture can kinda have this high IQ in different domains.
    Also, I think there needs to be an ability to act and experiment to answer some harder open problems.
    I still think we need to master online learning, but it’s likely that better training on long context can achieve it. Even better if it could adjust weights after

    • @redyican5341
      @redyican5341 1 month ago

      Actually, I think one needs to have the model rerun after outputting the stop token and decide which questions it wants answered after its own reasoning chain, adjusting these knowledge weights. I kinda know it works like that in pretraining with synthetic data, but it would be cool to have it live.

  • @redyican5341
    @redyican5341 1 month ago +1

    Limit is in energy. We would need energy to outcompete humanity. If it can be cheaper per watt. I think it might work because it doesn’t have to be that general. Anyway happy that rich noobs finally will invest in more energy

  • @findjoseph
    @findjoseph 1 month ago

    W

  • @mpnikhil
    @mpnikhil 1 month ago +1

    The capital of Bhutan is Thimphu. System 1 human response 😂.

  • @superfliping
    @superfliping 1 month ago

    Now that most of your top leadership is gone, it seems like they don't want to invest in it anymore. Kind of a contradiction to what we are seeing.

  • @BrutalStrike2
    @BrutalStrike2 1 month ago

    18:38

  • @jamdec123
    @jamdec123 1 month ago

    Interesting enough conversation. However, it may be beneficial to have people possessing models first before designing models clearly. There's a lack of life experience somewhere. Anywho, I'll let these guys get back to facilitating AI on how they can best lick their own parts. PeaceOUT

  • @Mayeverycreaturefindhappiness
    @Mayeverycreaturefindhappiness 1 month ago

    They never answered whether they have an ongoing experiment where they let it keep thinking.

    • @ashh3051
      @ashh3051 10 days ago

      In o1 it would fill the context window with reasoning tokens, wouldn’t it?

    • @Mayeverycreaturefindhappiness
      @Mayeverycreaturefindhappiness 10 days ago

      @ When you use o1, the thinking doesn’t go to your context window.

  • @constantinelinardakis8394
    @constantinelinardakis8394 1 month ago

    17:38 left off

  • @attilaszasz-mb2sj
    @attilaszasz-mb2sj 26 days ago

    someone please tell these people that o1 is not good at all :D

  • @OBGynKenobi
    @OBGynKenobi 1 month ago

    It's not thinking, it's calculating.
    No one thinks that Mathematica or Wolfram Alpha is thinking.

    • @MrC0MPUT3R
      @MrC0MPUT3R 1 month ago +2

      You're not thinking. Your brain is just undergoing some electrochemical reactions.

  • @videochampion
    @videochampion 1 month ago +1

    Dude is smart but such a dorky speaker

    • @videochampion
      @videochampion 1 month ago

      Take your time to process your output, like O1... Erhm uhm erhm lol

  • @user-wr4yl7tx3w
    @user-wr4yl7tx3w 1 month ago +4

    Does the seating make sense? I think it would have been better to have girls seated in the center given their height stature.

    • @thenoblerot
      @thenoblerot 1 month ago

      I am so unreasonably annoyed by her awful framing! And make sure the guests' mics aren't blocking their faces!? That said... I was mostly listening, as I'm sure most are.
      Great talk regardless!

    • @tomenglish9340
      @tomenglish9340 1 month ago

      Randomized to ensure that there was no gender bias.

  • @wwkk4964
    @wwkk4964 1 month ago +2

    Noam has been saying to rooms full of people "you can't know the capital of Bhutan", which is a very silly (and offensive) example. Most people from China and the Indian subcontinent (40% of the world population) have known it's Thimphu since elementary school.

    • @wwkk4964
      @wwkk4964 1 month ago +2

      I'm just saying this because it detracts from his other well-thought-out points. His example is so poor that it will make him lose credibility with the wrong audience, who can't judge the rest of his claim.

    • @DavidToddSports
      @DavidToddSports 1 month ago +9

      That is not what he is saying. He is saying if you don't know the answer when the question is asked, there is no amount of time which is going to allow you to "think" the answer.

    • @wwkk4964
      @wwkk4964 1 month ago

      @DavidToddSports It's not a good example because it doesn't demonstrate the salience of the point he is making. It's equivalent to saying no amount of thinking will help you recognise the capital of Australia or Canada or Brazil, but is this strictly true?

    • @wwkk4964
      @wwkk4964 1 month ago

      @DavidToddSports Here's another way to think about why it's a defective example: "No amount of thinking is going to allow you to know if baseball comes from cricket or vice versa." Is this kind of statement a good example of an unreachable or computationally disconnected island of knowledge? I don't think so; it muddles things up because the question is undecidable.

    • @jinhongyu911
      @jinhongyu911 1 month ago +1

      @wwkk4964 I think the example he gave about the capital of Bhutan is perfectly sound given the topic of reasoning. Rather, I think your baseball example is the wrong type of example given the question. What Noam is saying is that, unless you've heard the name of the capital of Bhutan before, there is no amount of time which is going to allow you to reason out the answer (like David said above), given that you don't have access to the internet or books, of course. As for your baseball example, if you have enough facts and historical records on hand, I'm sure you can come to a reasonable conclusion about which one came first. Just like the chicken and egg problem: if you can set the right definitions, you can surely reason out the answer. So again, the capital of Bhutan is not a 'reasoning' problem; you can't figure it out step by step, you either know it or you don't.