Llama 3.2 is INSANE - But Does it Beat GPT as an AI Agent?

Cole Medin

8K views

Published: Feb 1, 2025

Comments • 36

@JaredVBrown 4 months ago ⁺²
Love this Guy. Shares knowledge in a clear and understandable way.
@ColeMedin 3 months ago
Thank you Jared, I appreciate it a lot!!
@PathLink-fk3cp 4 months ago ⁺¹
crushing the content m8. Lets gooooooo.
LLMs are getting so much better. Not where I "need" em yet, but we are already incredibly spoiled to have this kind of power available to us :P
@ColeMedin 4 months ago ⁺²
Thank you!! And I totally agree, it's crazy that I complain about the "weaker" Llama 3.2 models when even the 1b parameter version would have been unbelievable 4-5 years ago.
@Techonsapevole 4 months ago ⁺¹
Great tests, thanks!
@ColeMedin 4 months ago
Of course, thank you!
@helloimedden 4 months ago ⁺¹
Great video. You explain things well and I learned a lot. Subscribed!
@ColeMedin 4 months ago
Thank you, I appreciate it a lot!!
@RobotechII 4 months ago ⁺⁶
Great content, subscribed earlier today from your n8n video.
@ColeMedin 4 months ago
Thank you, I appreciate it a lot! :)
@SOL-5004 3 months ago
Wow, thanks for the amazing content 😊 so easy to understand
Idk why I ended up here. I was just avoiding my mid term prep 😂
@ColeMedin 3 months ago
Thank you very much! My primary goal is to make it easy to understand so I appreciate you calling that out.
I know what it's like to browse YouTube to procrastinate studying for exams haha... I hope that is going well for you!
@arinco3817 4 months ago ⁺²
I've found putting basic instructions for available tools into the system prompt helps. Like 'you've got a bunch of tools available, x for working with Asana etc. If you call x make sure you do y' etc.
@arinco3817 4 months ago ⁺¹
Cool video tho, I've not tried LangGraph before
@ColeMedin 4 months ago ⁺²
@arinco3817 Thank you and yes I appreciate you calling that out! That's actually one of the things I had in mind specifically when I said at the end you could probably make it work for "weaker" models if you really want. Just takes extra work but if you want to run locally it's worth it!
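The tip in this thread (describing the available tools and their usage rules in the system prompt) could be sketched roughly as follows in Python. This is only an illustration, not code from the video: `ToolHint`, `build_system_prompt`, and the `create_asana_task` tool name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ToolHint:
    """Plain-language description of one tool (hypothetical helper type)."""
    name: str       # tool name as the model will see it
    purpose: str    # one sentence on what the tool is for
    rule: str = ""  # optional usage rule, e.g. a required argument


def build_system_prompt(base: str, tools: list[ToolHint]) -> str:
    """Append a short, human-readable tool guide to a base system prompt."""
    lines = [base.strip(), "", "You have the following tools available:"]
    for tool in tools:
        line = f"- {tool.name}: {tool.purpose}"
        if tool.rule:
            # Spell out the "if you call x, make sure you do y" part explicitly.
            line += f" When you call {tool.name}, {tool.rule}"
        lines.append(line)
    return "\n".join(lines)


if __name__ == "__main__":
    prompt = build_system_prompt(
        "You are a helpful assistant.",
        [
            # Assumed example tool; replace with the tools your agent really has.
            ToolHint(
                name="create_asana_task",
                purpose="creates a task in Asana.",
                rule="always include a due date.",
            )
        ],
    )
    print(prompt)
```

The resulting string would be passed as the system message alongside the normal tool definitions; whether it helps a given small model is something to test per use case.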
@jbaenaxd 3 months ago ⁺¹
Nice video! If Llama 3.2 is not working fine with function calling, which one do you think will be better? Could you recommend a model? Preferably under 8b, that can fit in a 3090 or even smaller common cards.
@ColeMedin 3 months ago
Thank you! I would try one of the Qwen models for function calling! Models under 8b generally won't do that great with function calling though. I would suggest trying quantized versions of larger models or using a tool like Air LLM to use larger models without having to have insane hardware:
github.com/lyogavin/airllm
@jbaenaxd 3 months ago ⁺²
@ColeMedin That sounds cool! There are many people here with RTX gaming GPUs trying to make them work in production environments without depending on the cloud because of the cost and privacy. There are not many people making videos with this kind of value on YouTube. Maybe you could make a video explaining in depth how it works and what's the best way to do it. I believe that a video like that would get some views. Best wishes Cole, you are on the right track, congrats! 👍
@ColeMedin 3 months ago ⁺²
I appreciate the suggestion a lot! I definitely will be making videos covering this in the near future!
@nanaboakyeoseitutu6896 4 months ago ⁺¹
Do you have any implementations with n8n and openrouter ?
@ColeMedin 4 months ago
@nanaboakyeoseitutu6896 Great question! I do not since n8n doesn't always have the models I want to test with. But I still really like the idea and will probably implement it in the near future!
@TurkerTUNALI 4 months ago
@ColeMedin You can use all the open source models in n8n with Ollama.
@ColeMedin 4 months ago ⁺¹
@TurkerTUNALI That's true! There are some platforms I like to use sometimes that aren't in n8n though like Together or Fireworks.
@DanielBowne 4 months ago
Found these new models subpar, but the best local LLMs I have seen for function calling.
@ColeMedin 4 months ago
Yeah that's the same experience for me! Huge bummer they aren't as good at function calling as GPT-4o-mini, but still a lot better than Llama 3.1.
@navotdk 4 months ago ⁺²
Fine-tuning will probably make it work.
@navotdk 4 months ago
You can use synthetic data from GPT-4o for the tuning.
@ColeMedin 4 months ago ⁺²
@navotdk Yeah fair point! And this is something I'm actually going to be exploring in the near future! It's too bad it's necessary when it isn't for even GPT-4o-mini, but local LLMs are often a requirement for a use case so fine-tuning is an awesome option to make it work.
@OscarTheStrategist 4 months ago ⁺²
Nice comparison!
Have you been able to use the vision abilities with Llama 3.2?
I'm interested in learning how to do it. Tried it in LM Studio and in Open WebUI but it doesn't really recognize the image input. Llava does work with vision out of the gate using Open WebUI but it's kinda terrible, and they haven't added the new Pixtral yet.
Anyway, thanks for posting these, they are very helpful. I'll give 3.2 with function calling a try in some of my chains and see how it does. I wonder if the 3B models are just not trained on function calling at all?
@ColeMedin 4 months ago ⁺¹
Thank you Oscar!
I have not tried the vision capabilities for Llama 3.2 yet. It's SUPER cool, don't get me wrong, but my use cases really don't benefit from it at this point. But I'd love to explore it more. Sorry to hear it doesn't seem to be working for LM Studio and Open WebUI for you.
Good luck trying Llama 3.2 in your chains! Yes, it really does seem the smaller models aren't trained on function calling at all. 11b seemed to be trying to spit out function calling syntax (its responses started with ""), but it never did it successfully even after trying for a while. 3b and 1b didn't even try.
@Cryosimorgh 3 months ago ⁺¹
Personally, I've been mighty disappointed with Llama 3.2. It's been hallucinating left, right, and center with the simplest tasks.
@ColeMedin 3 months ago ⁺¹
You know after doing a lot more testing with Llama 3.2 over the last weeks I do agree. A lot of it depends on the prompting/use case but yeah I've had better success with some other local models like Qwen.
@Minotaurus007 3 months ago ⁺¹
I tried Llama3.2:3b to implement a literature database in Qdrant as Cole detailed in a former video on N8N. It didn't take me 2 minutes but a week. It is working technically. However, for work with literature (PDF->GoogleDrive->GoogleDoc->N8N File Create->...->Qdrant) it seems to be unusable. Because it hallucinates the hell out of the GPU and the database. Each manual call to ChatGPT 4o without DB is *way* better.
Much better than Llama3.2:3b is Qwen2.5:13b LLM as well. However, this does not talk to the Qdrant database (i.e. nomic-embed-text). So not usable as a RAG as well. I am a little bit frustrated now :-) - Maybe I should find me the bigger Llama3.2 models, but they are at least not in the ollama library.
BTW: accessing ChatGPT via API is expensive and has a minute context window.
Any ideas?
@michabbb 4 months ago
Any issues with your eyes ?? (thumbnail)
@ColeMedin 4 months ago
@michabbb Haha no issues with my eyes! Just a silly thumbnail photo. What seems off to you? 😂
