How Meta's Thinking LLMs Work

  • Published: 12 Nov 2024

Comments • 11

  • @RodCoelho
    @RodCoelho 5 days ago +1

    I have over 25 years of experience in Lean Six Sigma problem solving. I'm a data scientist now, and if anybody would like to get together to incorporate Lean Six Sigma thinking into the thinking process of an LLM, hit me up. I don't have much experience training LLMs, but I have been able to replicate some of the o1 thought process with prompt engineering.

  • @scottoxen
    @scottoxen 6 days ago

    lol that ending. The live group convo is so entertaining.

  • @aiamfree
    @aiamfree 6 days ago +4

    Been working on this as well... the biggest issue LLMs still face, in my experience, is instruction following. I think reasoning would see a major bump if they only knew how to better follow instructions. GPT-4o does alright, but in so many cases Llama is borderline unusable.

    • @kalilinux8682
      @kalilinux8682 6 days ago +1

      Don't use the instruct LM. Just use multi-shot (few-shot) prompting for your specific task; see the first sketch after this thread. Trust me, you'll come back here and thank me, not even kidding :)

    • @User.Joshua
      @User.Joshua 6 days ago

      4o-mini has become so cheap that I pass the results into another prompt to further refine the outcome. It can follow single instructions, so after two or three passes it mostly delivers what I require for basic tasks (see the second sketch after this thread).

    • @ronilevarez901
      @ronilevarez901 6 days ago

      @@User.Joshua And how do you do that _offline_?

    • @User.Joshua
      @User.Joshua 6 days ago

      @@ronilevarez901 if you’re using a local model, it should still be the same thing. Just pass the output of the initial request into another one with different instructions. It’s just a process of refinement.

    • @ronilevarez901
      @ronilevarez901 6 days ago

      @@User.Joshua There are a lot of issues there.
      First, running even one local model is impossible on my machine.
      Let's pretend it's not. Storing more than one is prohibitive, and loading multiple models at the same time is impossible on most systems.
      So let's load them one by one, only when needed. Loading takes time, sometimes too much, so that's feasible only with a master model plus specialized smaller ones.
      But small instruct models are very bad at many things, like following instructions.
      If OP is complaining about Llama failing at this, smaller models would be entirely useless for their use case.
      But my comment was referring to your remark about using 4o-mini.
      It might be cheap-ish, but that lower cost doesn't help when we have to, or want to, do inference locally.
      Running multiple instruct models doesn't help either. It's easier to keep one decent local model and feed it the different prompts (see the last sketch after this thread).
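
A minimal sketch of the multi-shot approach @kalilinux8682 suggests: rather than instructing an instruct-tuned model, give a base model a few worked input→output examples and let it continue the pattern. Assumes the Hugging Face transformers library; the model name and examples are placeholders.

```python
# Few-shot prompting of a base (non-instruct) model: the prompt is a
# pattern of worked examples, and the model simply continues the pattern.
from transformers import pipeline

# Placeholder base model; substitute whatever base causal LM you run.
generator = pipeline("text-generation", model="gpt2")

few_shot_prompt = """Sentence: The movie was a total waste of time.
Sentiment: negative

Sentence: I can't stop recommending this book to friends.
Sentiment: positive

Sentence: The service was slow but the food was decent.
Sentiment:"""

out = generator(few_shot_prompt, max_new_tokens=2, do_sample=False)
# generated_text includes the prompt, so strip it off before printing.
print(out[0]["generated_text"][len(few_shot_prompt):].strip())
```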
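
Likewise, a minimal sketch of the two-pass refinement @User.Joshua describes: the first call drafts an answer, and a second call with different instructions refines it. Assumes the official openai Python client; the model name and prompts are illustrative.

```python
# Two-pass refinement: feed the first response into a second prompt
# that carries a different single instruction.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask(instructions: str, content: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": instructions},
            {"role": "user", "content": content},
        ],
    )
    return resp.choices[0].message.content

draft = ask("Answer the question concisely.", "Why does the moon have phases?")
final = ask("Rewrite the following answer to be clearer and fix any errors.", draft)
print(final)
```

The same pattern extends to two or three passes; each pass carries a single instruction, which is what makes it workable for models that only follow one instruction reliably.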
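
And a sketch of the setup @ronilevarez901 lands on: one local model loaded once, fed the draft and refinement prompts in sequence, so nothing is reloaded between passes. Assumes llama-cpp-python; the GGUF path and prompts are placeholders.

```python
# One local model, loaded once, fed different prompts sequentially.
from llama_cpp import Llama

# Placeholder path; use whatever quantized model fits your machine.
llm = Llama(model_path="./models/llama-3-8b-instruct.Q4_K_M.gguf", n_ctx=2048)

def run(prompt: str) -> str:
    out = llm(prompt, max_tokens=256, stop=["\n\n"])
    return out["choices"][0]["text"].strip()

draft = run("Question: Why does the moon have phases?\nAnswer:")
final = run(f"Draft answer:\n{draft}\n\nRewrite the draft to be clearer and more accurate:\nAnswer:")
print(final)
```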

  • @cuentadeyoutube5903
    @cuentadeyoutube5903 6 days ago +1

    There are still no moats, it seems.