"Make Agent 10x cheaper, faster & better?" - LLM System Evaluation 101

AI Jason

19K views


Published: Feb 6, 2025

Comments • 31

@Jim-ey3ry 8 months ago ⁺²⁵
This is gold. Most people just show you how to build a toy demo, but not many actually get into the details of how to get into production. Thank you, Jason!
@xXWillyxWonkaXx 8 months ago
Couldn't agree more. This is gold.
@tkp2843 8 months ago ⁺⁵
This is great. Loved the use of firecrawl (as a scrape tool) to get the website's data. Feel like it always helps improve the model output quality. Cheers!
@darrenhinde2971 8 months ago
Been looking for more detail on evals for LLMs and been scratching around for a while. Thanks for this.
@jasonfinance 8 months ago ⁺³
Amazing work as always Jason!
@kenchang3456 8 months ago ⁺⁵
Way excellent video that goes well beyond demo. Thank you very much for this guidance.
@apereiracv 8 months ago ⁺⁸
I recently created a whole testing system for our LLM chatbots and we did exactly this:
LLM as evaluator and code
We created it as a series of unit tests with LLM-generated cases.
Since our results were mostly conversational, we made tests pass/fail according to a scoring system.
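For readers who want to picture the setup described in the comment above, here is a minimal sketch of score-based pass/fail tests. Everything in it is an assumption made for illustration: `EvalCase`, `keyword_judge`, `run_case`, `fake_bot` and the 0.7 threshold are hypothetical names and values, and the judge is a simple keyword check standing in for a real LLM evaluator, not the commenter's actual system.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalCase:
    """One test case (hypothetical shape): a prompt plus facts a good reply should mention."""
    prompt: str
    must_mention: List[str]


def keyword_judge(reply: str, case: EvalCase) -> float:
    """Stand-in for an LLM evaluator: returns the fraction of required facts found in the reply."""
    if not case.must_mention:
        return 1.0
    text = reply.lower()
    hits = sum(1 for fact in case.must_mention if fact.lower() in text)
    return hits / len(case.must_mention)


def run_case(
    bot: Callable[[str], str],
    case: EvalCase,
    judge: Callable[[str, EvalCase], float] = keyword_judge,
    threshold: float = 0.7,  # assumed cutoff; tune for your own data
) -> bool:
    """A case passes when the judge's score reaches the threshold."""
    return judge(bot(case.prompt), case) >= threshold


def fake_bot(prompt: str) -> str:
    """Placeholder chatbot that always gives the same canned answer."""
    return "Our refund window is 30 days. Email support@example.com to start a return."


if __name__ == "__main__":
    cases = [
        EvalCase("What is the refund policy?", ["30 days", "support@example.com"]),
        EvalCase("Do you ship overseas?", ["international", "shipping"]),
    ]
    for case in cases:
        status = "PASS" if run_case(fake_bot, case) else "FAIL"
        print(f"{status}: {case.prompt}")
```

In a real setup the `judge` argument would be swapped for a function that asks an LLM to grade the reply, and the cases could themselves be LLM-generated, as the comment suggests.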
@contractorwolf 8 months ago
goddamn Jason your videos just blow my mind each time. Thanks for such a thorough explanation and example.
@humanish_ai 8 months ago ⁺¹
Finally, you're back 🎉
@kayshidow 8 months ago ⁺¹
I've used promptfoo for some of my tests with a local LLM to test the AI workflow. It allows you to write assertions like you would with software.
@titusblair 8 months ago
Awesome! Keep up the great work!
@agenticmark 8 months ago ⁺²
Fine-tune Llama 3 (8bit) and you will get exactly the behavior you want. It's what I do.
@techfren 8 months ago ⁺¹
lesgooo!! ❤‍🔥❤‍🔥❤‍🔥
@JorritvanGinkel 8 months ago
This is so good, thanks man!
@someshfengade9623 8 months ago ⁺¹
I found Langfuse metric monitoring a little bit better.
@jimmy-ef2ow 8 months ago ⁺¹
Jason, can we get another video about ComfyUI?
@jordanz9580 8 months ago
fireeee content!
@MatrixCodeBreaker88 8 months ago
Great Video
@CorkyBallasdancewithme 7 months ago
Great stuff. As someone new to hearing this, very interesting. Can this be built by a novice . . .
@fullgazz 8 months ago ⁺¹
Who has never spent 4 hours to save 10 minutes? That's our hobby: spending time to save time.
@AGI-Bingo 8 months ago ⁺¹
If 25 people or more use it successfully then you literally gave humanity more time to live and be free
@Joe-bp5mo 8 months ago
Sick, what are the best-practice metrics for evaluating agents?
@Ms.Robot. 8 months ago
I love how my AI girl insults the competition with flame balls, then tells me she loves me. ❤🎉😊
@KalLif-k3i 8 months ago
Why not use Gemini as the LLM? It is free.
@HyperUpscale 8 months ago ⁺¹
Let me share my experience with any Google AI model ... because it doesn't understand humans and it hallucinates way too much.
Practically ... in my cases, 75% of the time what I get back is a totally useless result. You can't use it for anything... To be considered for evaluation ... you must be joking.
@irql2 8 months ago
I don't see the value of "Agents". All of this stuff is easily done with basic function calling. I think I'm going to need to see some more creative use cases before I jump on board. I just don't get it yet.
@ayoubfr8660 8 months ago
Maybe we can discuss this. I am trying to jump in, but not until I find a decent idea to apply.
@symbol9new 8 months ago
When your assistant has a lot of functions, it starts hallucinating. Have you ever encountered this?
@SydneyF-eg5lt 8 months ago
Good content, but so hard to listen to his English. The monotonous pitch and sped-up delivery didn't seem to help either.
