OpenAI o1 - the biggest black box of all. Let’s break it open.

  • Published: 11 Jan 2025

Comments •

  • @stephentsang9194
    @stephentsang9194 3 months ago +7

    You deserve so much credit for making your research and experience available to your audience. A super job! Thank you

    • @TheTechTrance
      @TheTechTrance  3 months ago

      I appreciate that! Glad I can help

  • @vanessaaa.paaark
    @vanessaaa.paaark 3 months ago +12

    Thanks for this detailed explanation. I just shared it with a colleague who was also wondering about o1’s architecture

    • @TheTechTrance
      @TheTechTrance  3 months ago +3

      Share the love and share the knowledge 😎

  • @BABEENGINEER
    @BABEENGINEER 3 months ago +9

    Thank you for helping us understand OpenAI models better ❤

  • @TestMyHomeChannel
    @TestMyHomeChannel 3 months ago +2

    Brilliantly condensed and fast-paced explanation of o1, mixing facts with clear logic. Thank you for demystifying such a complex concept!

    • @TheTechTrance
      @TheTechTrance  3 months ago

      Thank you for noticing! Glad it was helpful!

  • @flickwtchr
    @flickwtchr 3 months ago +3

    Love the straightforward presentation style. Well done. I think with several viewings of this video, I'll have at least a bit of a grasp regarding the architecture, function, etc.

    • @TheTechTrance
      @TheTechTrance  3 months ago

      Definitely! I packed a lot into it, to make it as comprehensive as possible :)

  • @tiffany33094
    @tiffany33094 3 months ago +4

    Woah that was such a good breakdown. Great to understand o1 (and LLMs) on a deeper level. Thank you

  • @viniciusdugue3063
    @viniciusdugue3063 3 months ago +2

    This video is incredible! Exactly what I was looking for.

  • @drhxa
    @drhxa 3 months ago +1

    Great explanation of "Let's Verify Step by Step" and how that research was applied. Thank you so much for sharing.
    Really excited to see what others do with this and how far scaling this can take us

    • @TheTechTrance
      @TheTechTrance  3 months ago +1

      Glad you liked it! +1 on the scaling breakthrough

  • @SapienSpace
    @SapienSpace 3 months ago

    If accurate, this is the best explanation of this I have seen so far, thank you for sharing!

  • @Jerrel.A
    @Jerrel.A 3 months ago +2

    10+ for the topic, content and presentation skills.

    • @TheTechTrance
      @TheTechTrance  3 months ago +1

      I appreciate that!

    • @Jerrel.A
      @Jerrel.A 3 months ago

      @@TheTechTrance Great!

  • @matt.stevick
    @matt.stevick 3 months ago +2

    the algorithm got me here. looks to be extremely up my alley, great!
    and you're an ML engineer…
    subbed, done ✅

  • @Let010l01go
    @Let010l01go 2 months ago

    In my opinion, the problem with having to use so many methods and steps, as you've described, is that we first set up a model that is hard to understand (we build it easily, whether we understand it or not), and it becomes a "black box" where we have to adjust a lot of things because we don't know what's inside. In the end it just gets harder. This is a great episode, thanks a lot ❤

    • @TheTechTrance
      @TheTechTrance  2 months ago +1

      Agreed, the models are growing in complexity. As of now each of the steps serves a purpose, and later down the line a simplified version of the design will likely be developed. We will see! Thanks for watching!

    • @Let010l01go
      @Let010l01go 2 months ago

      @@TheTechTrance Yes, agreed.

  • @DataIsBeautifulOfficial
    @DataIsBeautifulOfficial 3 months ago +4

    Is o1 actually reasoning, or are we just getting better at mistaking noise for intelligence?

    • @SunnyNagam
      @SunnyNagam 3 months ago +1

      If it's just as useful, who cares

    • @TheTechTrance
      @TheTechTrance  3 months ago +1

      The mechanisms for reasoning are there (called RL-Tree-Q* unofficially), so it's getting more ~intelligent.
      That said, its hallucinations are also getting more ~intelligent.
      x.com/DrJimFan/status/1837174801435349131

    • @RickeyBowers
      @RickeyBowers 3 months ago +3

      I'd say that calling it "reasoning" is marketing - we need to focus on accuracy. This technique is engineered to increase accuracy.

    • @tollington9414
      @tollington9414 3 months ago +2

      It can only reason well on things that are already in its training set, and the problem is, we the consumers aren't told what exactly is in there, so you roll the dice when you ask it to do something. It'll do it brilliantly if it's seen it before; otherwise you'll get a load of crap back.

    • @RickeyBowers
      @RickeyBowers 3 months ago

      @@tollington9414 The multi-step training also steers the model away from unknown topics - the effect is similar to how a student might reply with problem-adjacent information without solving the problem. Errors are more difficult to find in some cases, or the reply clearly doesn't address the problem in others.

  • @estyalasu
    @estyalasu 3 months ago +3

    Ohhh I get a full education every time I come to your channel 📚🤓

  • @meenuthind
    @meenuthind 2 months ago

    Wow! What a great explanation!! 🤩

  • @joschjosch8859
    @joschjosch8859 3 months ago

    Very cool. Glad I discovered your channel. Keep up the good work.

  • @jesussaeta8383
    @jesussaeta8383 3 months ago

    Thank you so much, that was awesome.

    • @TheTechTrance
      @TheTechTrance  3 months ago

      @@jesussaeta8383 my pleasure, glad you enjoyed!

  • @____2080_____
    @____2080_____ 2 months ago

    Looking forward to Graph of Thought thinking at inference

  • @starpause
    @starpause 3 months ago

    Awesome breakdown 🙏

  • @human_shaped
    @human_shaped 3 months ago

    Good video. Some bits of information I think a lot of people hadn't heard.

    • @TheTechTrance
      @TheTechTrance  3 months ago

      Yea a lot of concepts that this taps into!

  • @Skarredghost
    @Skarredghost 3 months ago

    Very informative video, thanks for making it!

    • @TheTechTrance
      @TheTechTrance  3 months ago +1

      My pleasure. Glad you liked it!

  • @TheBestNameEverMade
    @TheBestNameEverMade 3 months ago

    Awesome explanation!

  • @Let010l01go
    @Let010l01go 2 months ago

    Another thing I wonder about is whether the model looks at the world through statistics, or through the real world (physics), or a hybrid. I think it's all good, depending on whether it's useful to us or not. Great episode! 🎉

    • @TheTechTrance
      @TheTechTrance  2 months ago +1

      I believe it was trained only on text, audio, and image/video data - to develop comprehension and responses. Physical data would be used more in the context of robotics - to develop spatial understanding and take actions

    • @Let010l01go
      @Let010l01go 2 months ago

      @@TheTechTrance Yes, you're right.

  • @darylallen2485
    @darylallen2485 3 months ago

    How does it feel to work in a field which is seeing such explosive growth at this point in history?
    Thanks for the explanation.

    • @TheTechTrance
      @TheTechTrance  3 months ago

      It feels invigorating! Also overwhelming at times, since it's moving at such a fast pace. But I guess there's no slowing down in sight, so we do our best :)

  • @junchen-jm2vg
    @junchen-jm2vg 3 months ago

    I am impressed by how your presentation reasons from first principles; you could go to OpenAI as presales. Looking forward to having a deep discussion on o2.

    • @TheTechTrance
      @TheTechTrance  3 months ago

      I appreciate that! OpenAI can contact me anytime haha

  • @elpablitorodriguezharrera
    @elpablitorodriguezharrera 3 months ago

    Whoa, thanks for your explanation!
    If this is how o1 was trained, do you think it's the most effective & efficient way?
    And what do you think could be improved with memory, caching, and the context window?

    • @TheTechTrance
      @TheTechTrance  3 months ago

      You're welcome!
      In terms of effectiveness, I think RL is a great way to achieve/emulate System 2 thinking.
      In terms of efficiency, I wonder why OpenAI keeps the model purely LLM-based. They could also be incorporating logic-based languages... like programming languages into their chain of thought. Then o1 would falter less on "how many r's are in strawberry" and "when is 9.11 greater than 9.9" type questions (see the sketch below).
      No thoughts on their memory and such, I'm more so familiar with their model architecture/design :)
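
      A minimal sketch of the idea above (purely illustrative; `count_letter` and `compare_decimals` are hypothetical helpers, not anything OpenAI has confirmed): a chain-of-thought step hands verifiable sub-questions to a real interpreter instead of answering them token by token.

      ```python
      def count_letter(word: str, letter: str) -> int:
          # Exact character count: trivial for code, error-prone token-by-token.
          return word.count(letter)

      def compare_decimals(a: str, b: str) -> str:
          # Compare as numbers, not as strings of tokens.
          return f"{a} > {b}" if float(a) > float(b) else f"{a} <= {b}"

      print(count_letter("strawberry", "r"))   # -> 3
      print(compare_decimals("9.11", "9.9"))   # -> 9.11 <= 9.9
      ```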

    • @elpablitorodriguezharrera
      @elpablitorodriguezharrera 3 months ago

      @@TheTechTrance That's such a good idea. The question is, how do we know the LLM is doing nothing when it's in idle mode? The more I learn about this AI stuff by reading books and papers, having no computer science degree, the more I feel I understand nothing.

  • @thenextension9160
    @thenextension9160 3 months ago

    This was great, thank you

  • @yeezythabest
    @yeezythabest 3 months ago +2

    Who tf can dislike this video?

  • @GNARGNARHEAD
    @GNARGNARHEAD 3 months ago

    great analysis, thanks

  • @gregoryw1
    @gregoryw1 3 months ago

    So interesting and helpful

  • @schnibitz
    @schnibitz 3 months ago

    So is there a chance that they’re going to eventually be able to drastically improve on things like hallucinations and inaccuracies by simply increasing the inference time?

    • @TheTechTrance
      @TheTechTrance  3 months ago

      That's what we're seeing with o1 already! Of course more improvements are always needed, but this is in the right direction

  • @andrewlewin6525
    @andrewlewin6525 3 months ago +3

    Sheesh… you put the open back into OpenAI 😅

  • @gileneusz
    @gileneusz 3 months ago

    I wouldn't be surprised if there are 1-30 instances of gpt-4o-mini running behind the scenes simultaneously, and one gpt-4o instance deciding which are correct

    • @TheTechTrance
      @TheTechTrance  3 months ago +1

      That would be the 05:12 majority-vote approach (similar to rolling a die and seeing which number we land on most), but o1 is instead doing Tree of Thoughts (a more elegant approach) - see the sketch below
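
      A minimal sketch of that majority-vote (self-consistency) baseline; `sample_answer` is a hypothetical stand-in for one stochastic LLM call. Tree of Thoughts instead scores and expands partial reasoning steps rather than voting over whole answers.

      ```python
      import random
      from collections import Counter

      def sample_answer(question: str) -> str:
          # Hypothetical stand-in for one stochastic LLM completion.
          return random.choice(["42", "42", "41"])  # toy answer distribution

      def majority_vote(question: str, n: int = 30) -> str:
          # Sample n independent answers and keep the most common one.
          answers = [sample_answer(question) for _ in range(n)]
          return Counter(answers).most_common(1)[0][0]

      print(majority_vote("What is 6 * 7?"))  # usually "42"
      ```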

    • @gileneusz
      @gileneusz 3 months ago

      @@TheTechTrance thanks!

  • @gregtanaka3406
    @gregtanaka3406 3 months ago

    Well done!

  • @squidinjam
    @squidinjam 3 months ago

    great video!

  • @ahmadzaimhilmi
    @ahmadzaimhilmi 3 months ago

    While I acknowledge o1 is super good, I just feel that the reasoning method can be replicated with agent frameworks like CrewAI or AutoGen. It's only a matter of time before someone shares his/her project on GitHub.

    • @TheTechTrance
      @TheTechTrance  3 months ago

      The agent frameworks are great for getting tasks done, but I'm not so sure about solving problems, e.g. crosswords, math problems, coding exercises, etc. o1 is geared towards solving problems via reasoning

  • @potatoetales
    @potatoetales 3 months ago

    Love the video! But please make sure ding.mp3 is not way louder than the rest of the video 🙏

  • @quantumspark343
    @quantumspark343 2 months ago

    I think Q* stands for Quiet-STaR (the quiet self-taught reasoner), which is another paper, not Q-learning with A*

    • @TheTechTrance
      @TheTechTrance  2 months ago +1

      I believe you are right, good catch!

    • @quantumspark343
      @quantumspark343 2 months ago

      @@TheTechTrance Wow thanks, wasn't expecting that 😳

  • @geldverdienenmitgeld2663
    @geldverdienenmitgeld2663 3 months ago

    If correct, a lot of human feedback is still necessary in the loop of AI training.

    • @memegazer
      @memegazer 3 months ago

      Not with strawbrary (joke spelling)
      With strawberry they used synthetic data and an expert agent
      Basically the synthetic data would generate search trees in steps, and the expert would only reward when the correct answer was arrived at in the fewest steps (sketched below)
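
      A toy version of that reward rule, as the comment describes it (an assumption about the setup, not a confirmed recipe): only paths that reach the correct answer are rewarded, and shorter paths score higher.

      ```python
      def reward(steps: list[str], final_answer: str, gold_answer: str) -> float:
          if final_answer != gold_answer:
              return 0.0               # wrong final answer: no reward at all
          return 1.0 / len(steps)      # correct: fewer steps -> larger reward

      paths = [
          (["expand", "simplify", "solve", "check"], "42"),  # correct in 4 steps
          (["factor", "solve"], "42"),                       # correct in 2 -> best
          (["guess"], "41"),                                 # wrong answer
      ]
      best = max(paths, key=lambda p: reward(p[0], p[1], "42"))
      print(best)  # (['factor', 'solve'], '42')
      ```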

    • @TheTechTrance
      @TheTechTrance  3 months ago +1

      There are two separate moments when human feedback is used for reinforcement learning:
      - RLHF, but now that's been transitioned to RLAIF (at 10:40)
      - RL-Tree-Q* (unofficial name): to train its Process Reward Model, a human labels whether the steps of a solution are correct, incorrect, or neither (at 13:58)
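
      A minimal sketch of that per-step labeling (a made-up example problem, following the "Let's Verify Step by Step" setup): each step gets a +1 / -1 / 0 label, and the Process Reward Model is trained to predict them.

      ```python
      from dataclasses import dataclass

      @dataclass
      class LabeledStep:
          text: str
          label: int   # +1 correct, -1 incorrect, 0 neither/ambiguous

      solution = [
          LabeledStep("Let x be the smaller number, so x + (x + 2) = 16.", +1),
          LabeledStep("Then 2x = 16, so x = 8.", -1),   # dropped the +2: incorrect
          LabeledStep("So the two numbers are 8 and 10.", 0),
      ]

      # At inference time the trained PRM scores each step of a candidate
      # solution instead of only its final answer.
      for step in solution:
          print(step.label, step.text)
      ```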

    • @memegazer
      @memegazer 3 months ago

      I do not think that humans are going to be a bottleneck with synthetic data
      If you read the Google paper on universal provers, they demonstrate that a simple implementation of Occam's razor removes the need for dependence on humans for feedback

  • @gileneusz
    @gileneusz 3 months ago

    5:57 I also have this book, and I also read it, kind of 😆

  • @and1play5
    @and1play5 3 months ago

    thank uuuuuuu

  • @VR_Wizard
    @VR_Wizard 3 months ago

    Are you sure about the active learning part, with iterative human labeling of the examples it messed up?
    o1 is good at coding and math, both problems where the final answer can be checked automatically. So yes, active learning would make sense, but the system can check by itself whether the answer was correct and only use the paths that led to the true answer. It could also look for the path with the fewest steps leading to the correct answer; that is likely also the best path. All this needs no human labeling and would explain why math and coding got so much better. (In my testing coding did not get that much better; often Anthropic's Sonnet does a better job. Math seems to see bigger gains, but even there it often failed to solve my problems.)

    • @TheTechTrance
      @TheTechTrance  3 months ago

      The active learning is for solutions with a wrong final answer but highly rated steps. The existence of these solutions can be checked for automatically, but their steps still need human labeling - to see how and at which step the solution arrived at the wrong final answer (see the filter sketched below)
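
      A toy version of that active-learning filter (illustrative; `prm_score` is a hypothetical aggregate score from the process reward model): automatically surface "convincing wrong answers" - wrong final answer but highly rated steps - and send only those to human labelers.

      ```python
      def needs_human_labels(final_answer: str, gold_answer: str,
                             prm_score: float, threshold: float = 0.8) -> bool:
          wrong = final_answer != gold_answer   # checkable automatically
          convincing = prm_score > threshold    # steps rated highly by the PRM
          return wrong and convincing           # only these go to human labelers

      print(needs_human_labels("8", "7", prm_score=0.93))  # True: label its steps
      print(needs_human_labels("7", "7", prm_score=0.95))  # False: answer correct
      ```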

    • @VR_Wizard
      @VR_Wizard 3 months ago

      @@TheTechTrance thanks for the reply.
      I would agree that human labeling makes sense in some cases like:
      1. The model never converges for some problem types.
      2. Improving performance on one type of problem reduces performance on solving others.
      3. We need to validate reasoning patterns that could transfer to non-verifiable domains.
      However, I question the need for human labeling by default in math/coding problems. If highly rated steps lead to wrong answers, those steps were fundamentally wrong for that type of problem and should be rated lower. Since we can automatically explore paths and verify answers, the system can find optimal reasoning patterns on its own. The only situation where rating paths lower doesn't work is when it hurts performance on other tasks (see the sketch below).
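
      A sketch of that fully automatic alternative (the step names and update rule are made up for illustration): verify the final answer programmatically and push the ratings of steps on wrong paths down, with no human in the loop.

      ```python
      def update_step_ratings(ratings: dict[str, float], path: list[str],
                              correct: bool, lr: float = 0.1) -> None:
          # Nudge every step on the path up if the answer verified, down otherwise.
          delta = lr if correct else -lr
          for step in path:
              ratings[step] = ratings.get(step, 0.0) + delta

      ratings: dict[str, float] = {}
      update_step_ratings(ratings, ["factor", "cancel", "evaluate"], correct=True)
      update_step_ratings(ratings, ["factor", "guess"], correct=False)
      print(ratings)  # {'factor': 0.0, 'cancel': 0.1, 'evaluate': 0.1, 'guess': -0.1}
      ```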

  • @Charles-Darwin
    @Charles-Darwin 3 months ago

    Wouldn't omni be the "cortex", and not the 2nd brain? I would think gpt4/t, since they're quite good and have that deep breadth to them, like our own 2nd-brain function. I think they've just shifted 4/t to the 2nd-brain tasks and have omni out front for the input streams. The reason I think so is that our cortex needs to run at nanosecond rates, where omni is clearly magnitudes faster and handles all modalities, just like our own. (I don't see too many people discussing the speed of this new liquidity of inputs; by far the most impressive aspect of omni imo.)
    See a plant 🌵 and you immediately know it's a plant (omni/cortex/1st-brain driven), but what variant/type of plant? Can you eat it? Well, that's where you contemplate and ponder (2nd brain) by tapping into all relevant knowledge and deducing: well, maybe it's prickly, pricklies hurt, it might be quite the ordeal to eat it despite it probably being safe to.
    I think o1 is all things held constant (model-wise); they've just added CoT to the cluster, and maybe, given the scientists' comments, there might be some novel RLHF replacement.
    [Wrote this while listening, I see you mention this toward the end] 😅

  • @ran_domness
    @ran_domness 3 months ago

    How confident are you that this is actually how the model was created?

    • @TheTechTrance
      @TheTechTrance  3 months ago +1

      I'm very confident. I did my research and cited my sources, and it's in line with the consensus among other industry leaders. Of course there are details not included that only an OpenAI researcher would have, but hopefully this video gave you a better understanding of how o1 was designed and trained, and of its impact w.r.t. the neural scaling laws.

  • @IvanMeouch
    @IvanMeouch 3 months ago +1

    Good luck with the channel. I love seeing women engineers.

    • @TheTechTrance
      @TheTechTrance  3 months ago

      thank you, just getting started :)

  • @MeridianMindset
    @MeridianMindset 3 months ago

    Holy Based

  • @Tony_Indiana
    @Tony_Indiana 3 months ago

    This was f*cking awesome! I have my throat, head, hands and other parts tattooed, and in my own way, I understood it. My problem: the ability to extend compute time during inference will cuck accessibility and democratization of AI technology.
    Fancy people like her can still get their hair did and do their fancy AI stuff. But the rest of us - ugh.
    Did anyone catch it? I was getting high, whilst listening to some banging dubstep! But she mentioned the "o1" models and the Q* algorithm. This is speculative stuff ATM, ya? The Q-learning, the Monte Carlo Tree searchin'... holy Sherlock, homie. I mean, this limb needs more branches.
    Come on... if we learned anything from Q: never trust an intelligent woman. The candy is not a reward, it is a trap. Yet over and over and over, the same mistakes are made. rawr people.
    (sorry)

  • @joseph24gt
    @joseph24gt 3 months ago

  • @hanskraut2018
    @hanskraut2018 3 months ago +2

    You are the most gorgeous model; in the end, the scaling laws can't account for that 🌹

  • @jayeifler8812
    @jayeifler8812 3 months ago

    Wow, are you married?

  • @TrungTran-hq2ys
    @TrungTran-hq2ys 3 months ago

    PILLAMEEOWR