Training Script & Data to update LLM to o1 Reasoning (Sky-T1 UC Berkeley)

  • Published: 13 Jan 2025

Comments • 12

  • @code4AI
    @code4AI 1 day ago +1

    With the automatic audio dubbing from YouTube/Google, you hear a synthetic voice in your regional language.
    To hear my original voice in English, switch to "Default" or "English" in the settings. Thank you.

  • @irbsurfer1585
    @irbsurfer1585 2 days ago +8

    Thanks for directly addressing my comment from the last video! This video does a much better job of demonstrating how these advanced reasoning techniques can be made more accessible with a more reasonable budget and using open-source resources. The breakdown of Sky-T1 and the $500 budget example is a significant step towards making this less abstract and more actionable for people with limited resources. Appreciate the follow-up! 😀😀 I really look forward to your next lesson!

  • @jonathansanderson807
    @jonathansanderson807 2 days ago +3

    I really appreciate these up-to-date videos. Keep up the great work!

  • @En1Gm4A
    @En1Gm4A 1 day ago +2

    And it's already on LM Studio. What a time to be alive.

  • @patruff
    @patruff 2 days ago +4

    It's funny that they specifically mentioned that anything under 32B was trash. Good thing I was working with them!!

    • @irbsurfer1585
      @irbsurfer1585 2 days ago +2

      I think trying to squeeze advanced reasoning capabilities out of an SLM might be pushing it. It's good to know that we can at least go as small as 32B. It would be nice if we could go smaller, and maybe someday soon we will. I would not be surprised.

  • @smorty3573
    @smorty3573 2 days ago +2

    I have seen this in many system prompts before, and I cannot get behind it. WHY do so many system prompts have grammar errors?
    I would assume that having good grammar in the system prompt tells the LLM to also use good grammar.
    I am talking about the cases at 4:28 where the text reads:
    "The solution should remain a logical, accurate, concise expression style and detail necessary step needed to reach the conclusion [...]" I think it should be "steps"
    "[...] to develop well-considered thinking process." Shouldn't it be "thinking processes" or "to develop A well-considered thinking process"?
    Is this intentional? I know that some DAN-like prompts use numbers instead of letters to tell the LLM something without it appearing in plain text, tricking some part of it into generating not-okay content.

  • @zxwxz
    @zxwxz 1 day ago

    You still need larger models to broadly encompass all knowledge, but specific applications can still achieve breakthroughs using RL. However, it's unclear whether cross-domain tasks benefit more from larger models or from small models specialized in independent domains; that likely depends on progress in distributed training and on how well small models can be aggregated. I think OpenAI's contribution is to point the way and show that the new O1 paradigm remains viable; they gather capital to break through in new directions. Naturally, once the open-source community confirms that the general direction is feasible, there's nothing it can't achieve independently. Think about it: it's been less than half a year since O1 was announced, and the open-source community already serves as an excellent counterbalance between the tech giants and individuals.

  • @patruff
    @patruff 2 days ago

    I was just renting out the 8 H100s

  • @aymanechourou
    @aymanechourou 2 days ago

    NICE WORK

  • @HoldMyData
    @HoldMyData 23 hours ago

    Thanks!

  • @user-pt1kj5uw3b
    @user-pt1kj5uw3b 1 day ago

    Typo in their prompt lol