Do not use Llama-3 70B for these tasks ...

  • Published: 12 May 2024
  • A detailed data analysis of the 1 million votes cast by the AI community on LLM performance opens up new insights into areas where particular LLMs excel, and areas where you should not use a given LLM but opt for a better-performing one.
    all rights w/ authors:
    What’s up with Llama 3? Arena data analysis
    lmsys.org/blog/2024-05-08-lla...
    #airesearch #ai #newtechnology
  • Science

Comments • 13

  • @gileneusz
    @gileneusz 1 month ago +5

    this is a great video! really amazing explanation

    • @code4AI
      @code4AI  1 month ago

      One of the best comments today! 😊

  • @martinsherry
    @martinsherry 1 month ago +5

    “of course, those people were wrong”…..hahahaha.

    • @code4AI
      @code4AI  1 month ago +2

      Finally, someone is laughing! Success! 😂

  • @Sl15555
    @Sl15555 1 month ago +2

    Summarization might be low because of Llama 3's context length, that's my best guess. I'll have to test it more, as I like using LLMs to summarize YouTube videos (though I watched this one). I have found some areas where Llama 3 works well and use it for those. One is creative writing / poems; the result is then used to produce creative lists for other tasks, which works really well.
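
    (To illustrate the workaround hinted at above: material longer than the model's context window can be summarized in a chunk-then-combine pass. The sketch below is a minimal, assumed setup; the OpenAI-compatible local endpoint, placeholder model id, and rough character budget are illustrative assumptions, not anything shown in the video.)

        # Sketch: map-reduce summarization to fit long transcripts into a short context window.
        # base_url, MODEL, and CHUNK_CHARS are illustrative assumptions.
        from openai import OpenAI

        client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")
        MODEL = "llama-3-70b-instruct"   # placeholder model id
        CHUNK_CHARS = 12_000             # rough stand-in for ~3k tokens, leaving room for the reply

        def summarize(text: str, instruction: str) -> str:
            resp = client.chat.completions.create(
                model=MODEL,
                messages=[{"role": "user", "content": f"{instruction}\n\n{text}"}],
            )
            return resp.choices[0].message.content

        def summarize_long(transcript: str) -> str:
            # Map step: summarize each chunk independently so no single call overflows the context.
            chunks = [transcript[i:i + CHUNK_CHARS] for i in range(0, len(transcript), CHUNK_CHARS)]
            partials = [summarize(c, "Summarize this part of a video transcript in a few bullet points.")
                        for c in chunks]
            # Reduce step: combine the partial summaries into one final summary.
            return summarize("\n".join(partials), "Combine these partial summaries into one concise summary.")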

  • @henkhbit5748
    @henkhbit5748 1 month ago +1

    If an open-source LLM performs well for your particular use case then, for me, it will always have my preference over a big monolithic closed-source LLM from ClosedAi!

  • @IdPreferNot1
    @IdPreferNot1 1 month ago

    Love how your critiques shred the populist AI community while providing useful info.

  • @thedoctor5478
    @thedoctor5478 1 month ago +2

    I couldn't care less about friendliness. We can get that from low param models and use them to reform texts. Larger models should just care about reasoning above all else.

  • @TheReferrer72
    @TheReferrer72 1 month ago

    Now I know you are tripping. Unless I can't read that graph properly, you are trying to tell us that a 44-45% win rate is a big loss!
    Especially as this is a 70B open-weights model, while the others are all closed weights.
    And as another commenter noted, Llama 3 has only a 4k context window, so of course it will be poor at summarisation and other tests that rely on a long context.
    We will be getting longer-context versions from Meta, multimodal models, and huge parameter counts.

    • @code4AI
      @code4AI  1 month ago

      Llama 3 was trained on 8192 tokens 😂

    • @TheReferrer72
      @TheReferrer72 1 month ago

      @@code4AI OK, it has an 8k token length; GPT-4 Turbo has 128k, Claude 200k, Gemini 1000k+, so 16 times longer; my point still stands.
      And I notice how you did not address my first point. Like I said, you are tripping.

  • @peterbell663
    @peterbell663 1 month ago

    I found it essentially useless and a waste of my time. I gave it a dataset of 10,000 lines with 22 variables and asked for summary statistics in cumulative blocks of 1,000, 10 blocks in total. I re-posed this question about 8 times over several hours, and each time the answer was DRIBBLE. And that was a very easy task. Imagine giving it a slightly more difficult task like time-series modelling. I will check the alternatives.

    • @dennisestenson7820
      @dennisestenson7820 1 month ago

      Maybe you should choose an appropriate tool for the task.
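
      (To illustrate the point about picking an appropriate tool: the task described above, summary statistics over cumulative blocks of 1,000 rows, is deterministic and trivial with pandas. The sketch below assumes a hypothetical "data.csv" with 10,000 rows and 22 numeric columns; the file name and layout are illustrative, not taken from the comment.)

          # Sketch: cumulative-block summary statistics with pandas instead of an LLM.
          # "data.csv" is an assumed example file (10,000 rows x 22 numeric variables).
          import pandas as pd

          df = pd.read_csv("data.csv")
          BLOCK = 1000
          for end in range(BLOCK, len(df) + 1, BLOCK):
              stats = df.iloc[:end].describe()  # count, mean, std, min, quartiles, max per variable
              print(f"--- cumulative block: rows 0..{end - 1} ---")
              print(stats)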