Do not use Llama-3 70B for these tasks ...
- Published: May 12, 2024
- A detailed analysis of one million votes cast by the AI community reveals new insights into the areas where individual LLMs excel, and the areas where you are better off not using a particular LLM and opting for a higher-performing one instead.
All rights with the authors:
What’s up with Llama 3? Arena data analysis
lmsys.org/blog/2024-05-08-lla...
#airesearch #ai #newtechnology
This is a great video! Really amazing explanation.
One of the best comments today! 😊
“of course, those people were wrong”…..hahahaha.
Finally, someone is laughing! Success! 😂
Summarization might score low because of Llama 3's context length; that's my best guess. I'll have to test it more, as I like using LLMs to summarize YouTube videos (though I watched this one). I have found some areas where Llama 3 works well and use it for those. One is creative writing / poems; feeding the result into creative lists for other tasks works really well.
If an open-source LLM performs well for your particular use case then, for me, it will always have my preference over a big monolithic closed-source LLM from ClosedAi!
Love how your critiques shred the populist AI community while providing useful info.
I couldn't care less about friendliness. We can get that from low-param models and use them to rework texts. Larger models should just care about reasoning above all else.
Now I know you are tripping. Unless I can't read that graph properly, you are trying to tell us that a 44-45% win rate is a big loss!
Especially as this is a 70B open-weights model, while the others are all closed-weights.
And as another commenter noted, Llama 3 has only a 4k context window, so of course it will be poor at summarisation and other tests that rely on a long context.
We will be getting longer-context versions from Meta, multimodal and with huge parameter counts.
Llama 3 was trained on 8192 tokens 😂
@@code4AI OK, it has an 8k token length; GPT-4 Turbo has 128k, Claude 200k, Gemini 1000k+, so 16 times longer. My point still stands.
And I notice how you did not address my first point. Like I said, you are tripping.
I found it essentially useless and a waste of my time. I gave it a dataset of 10,000 lines with 22 variables and asked for summary statistics in cumulative blocks of 1,000, so 10 blocks in total. I re-posed this question about 8 times over hours, and each time the answer was DRIVEL. And that was a very easy task. Imagine giving it a slightly more difficult task like time-series modelling. I will check the alternatives.
Maybe you should choose an appropriate tool for the task.
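For what it's worth, the cumulative-block statistics the earlier commenter describes are a few lines in pandas rather than an LLM task. A minimal sketch, using a synthetic stand-in for the dataset (10,000 rows, 22 numeric columns, all names hypothetical):

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the commenter's dataset: 10,000 rows, 22 numeric variables.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(10_000, 22)),
                  columns=[f"var{i}" for i in range(22)])

# Cumulative blocks of 1,000: rows 0..999, 0..1999, ..., 0..9999 (10 blocks total).
block = 1000
summaries = {
    end: df.iloc[:end].agg(["mean", "std", "min", "max"])
    for end in range(block, len(df) + 1, block)
}

print(len(summaries))           # 10 cumulative blocks
print(summaries[10_000].shape)  # (4, 22): four statistics per variable
```

Each entry in `summaries` is a small DataFrame of per-variable statistics over the first `end` rows, which is exactly the "cumulative blocks" output the commenter asked the model for.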