Large Language Model Speed Showdown - Bunker BBQ Sous Chef

  • Published: 15 Sep 2024
  • Llama 2 70B by Meta AI continues to be a leading open-source Large Language Model, pretrained on publicly available online data sources. The fine-tuned model, Llama Chat, leverages publicly available instruction datasets and over 1 million human annotations.
    In this short demo we compare the model running on the Groq® LPU™ Inference Engine against OpenAI's ChatGPT, prompting, “Imagine I’m staying in an emergency bunker I built in Hawaii. One of my favorite rations in my supply pantry is Sweet Baby Ray’s BBQ Sauce. Act as a personal chef who comes up with a suggested recipe using the sauce.” We followed up with, “Can you format this as a recipe table?” Lastly, we requested, “Please write a recipe using spam, rice, and Sweet Baby Ray's BBQ Sauce." (A rough sketch of how you might time such a comparison yourself appears at the end of this description.)
    Let us know what you think of the speed and share any questions in the comments section, and if you're impressed, don't forget to shoot us a thumbs up.
    Try our publicly available Inference Engine, GroqChat, over at groq.com.
    #ai #llm #llama #genai #language #largelanguagemodels #technologist #demo
    Join our Discord community at / discord where you'll find lively discussions, sneak peeks at the latest features, and Groq engineers and developers collaborating from all around the world!
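
    Below is a minimal sketch (in Python) of how you could run this kind of side-by-side timing test yourself: the same prompt sent to both services, wall-clock time measured identically, and the same throughput formula applied to both responses. The Groq endpoint URL and model names here are assumptions for illustration, and the response parsing assumes an OpenAI-style schema; treat this as a sketch, not a definitive benchmark.

    import os
    import time
    import requests

    PROMPT = (
        "Imagine I'm staying in an emergency bunker I built in Hawaii. "
        "One of my favorite rations in my supply pantry is Sweet Baby Ray's BBQ Sauce. "
        "Act as a personal chef who comes up with a suggested recipe using the sauce."
    )

    def time_completion(url, api_key, model):
        """POST one chat-completion request; return (completion_text, elapsed_seconds)."""
        payload = {"model": model, "messages": [{"role": "user", "content": PROMPT}]}
        headers = {"Authorization": "Bearer " + api_key}
        start = time.perf_counter()
        resp = requests.post(url, headers=headers, json=payload, timeout=120)
        elapsed = time.perf_counter() - start
        resp.raise_for_status()
        # Assumes an OpenAI-style response body; adjust for the real schema.
        return resp.json()["choices"][0]["message"]["content"], elapsed

    def words_per_second(text, elapsed):
        # Crude throughput proxy: whitespace-split words, not true tokenizer tokens.
        return len(text.split()) / elapsed

    if __name__ == "__main__":
        # Endpoint URLs and model names are illustrative assumptions.
        services = {
            "groq":   ("https://api.groq.com/openai/v1/chat/completions",
                       os.environ["GROQ_API_KEY"], "llama2-70b-4096"),
            "openai": ("https://api.openai.com/v1/chat/completions",
                       os.environ["OPENAI_API_KEY"], "gpt-3.5-turbo"),
        }
        for name, (url, key, model) in services.items():
            text, elapsed = time_completion(url, key, model)
            print(f"{name}: {elapsed:.2f}s elapsed, "
                  f"~{words_per_second(text, elapsed):.1f} words/s")

    Note that even this only controls the measurement itself; as the commenters below point out, public load on either service still varies, so a truly fair test needs isolated hardware.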

Comments • 6

  • @TheXood 6 months ago +6

    If the comparison is supposed to show how fast the LPU custom hardware is, then you must run both tests under the same conditions, which is only possible on isolated (or local) hardware. Otherwise you're measuring load or artificial speed reduction on ChatGPT.

    • @jb510 6 months ago +4

      This ^. It's weird that these demos seem to use different models AND one is using ChatGPT's public instance with its variable and very high load. Make some apples-to-apples comparisons, folks.

  • @anagnorisis2024 6 months ago +2

    You can't compare speed if one platform handles millions of requests per second and the other does not. The simulation is not apples to apples. You need to compare both when the number or rate of public requests is similar.

  • @dhaw 6 months ago

    Amazing!

  • @marthajohnson2775 7 months ago

    Phone screen too small! Grab laptop...