CONTEXT CACHING for Faster and Cheaper Inference

  • Published: 17 Nov 2024

Comments • 12

  • @heski6847
    @heski6847 2 months ago +1

    Thank you, as always very useful content!

  • @alleskepler9526
    @alleskepler9526 20 days ago

    Bro u a gem

  • @Rishab-l1u
    @Rishab-l1u 2 days ago

    How do we deal with hallucination resulting from our background info?

    • @TrelisResearch
      @TrelisResearch  2 days ago

      Take a look at my video on synthetic data generation. I cover it there.
      Unless I’m misreading your Q and it relates to caching?

  • @explorer945
    @explorer945 2 months ago +1

    How does it differ from caching by UI libraries like Chainlit, where they use Redis to store the embeddings of the prompt and, if there's a match, return the previous response without even hitting the LLM API? Which is better?

    • @TrelisResearch
      @TrelisResearch  2 months ago +1

      Howdy! What you're mentioning is embedding caching, which is a complete cache (i.e. the whole answer is stored and retrieved if there's a match).
      This here is KV caching, which is a partial cache for LLM inference. When part of a prompt is reused (and it has to be the first part), there are intermediate values (the keys and values, k and v) that can be reused in the forward pass to generate the response (see the sketch after this thread).

    • @explorer945
      @explorer945 2 months ago

      @TrelisResearch Got it. Why does it have to be the first part? I couldn't quite get that from the video. Also, is it based on the initial layers or the final layers? And how does it help with RAG architectures?
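
    A minimal sketch of the difference discussed in this thread, in Python. The names are illustrative only, and call_llm is a hypothetical stand-in for whatever client is used, not any library's API. A full-response cache skips the model entirely on a repeat prompt, while prefix (KV) caching still runs the model but reuses the keys and values already computed for the shared first part of the prompt. The reuse is prefix-only because, with causal attention, a token's K/V values depend only on the tokens before it, so they stay valid only while the leading text is identical.

      # Illustrative sketch only: call_llm is a hypothetical function, not a real API.
      import hashlib

      # 1) Full-response cache (the Redis/Chainlit-style approach): an exact match
      #    on the prompt returns the stored answer with no model call at all.
      response_cache = {}  # stand-in for Redis

      def cached_answer(prompt: str, call_llm) -> str:
          key = hashlib.sha256(prompt.encode()).hexdigest()
          if key in response_cache:
              return response_cache[key]      # cache hit: no forward pass
          answer = call_llm(prompt)           # cache miss: full inference
          response_cache[key] = answer
          return answer

      # 2) KV (prefix) caching: the model still runs for every request, but the
      #    server reuses the keys/values already computed for the shared prefix,
      #    so only the new suffix (the question) is processed before decoding.
      shared_prefix = open("background.txt").read()  # same long document every time

      def ask(question: str, call_llm) -> str:
          # Keeping the background as the *first* part of the prompt is what
          # makes the cached K/V values reusable across requests.
          return call_llm(shared_prefix + "\n\nQuestion: " + question)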

  • @MrMoonsilver
    @MrMoonsilver 2 months ago +1

    Do you think this will come to open source, self-hosted models?

    • @TrelisResearch
      @TrelisResearch  2 months ago +1

      Yup, I show SGLang (same approach for vLLM) in this video! (There's a minimal sketch at the end of this thread.)

    • @MrMoonsilver
      @MrMoonsilver 2 months ago

      Super cool, thank you so much.
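
    Following up on the self-hosted question above: a minimal sketch of prefix (KV) caching with vLLM, under the assumption that the enable_prefix_caching engine argument is available in the installed version; the model name and prompts are illustrative. The SGLang server shown in the video takes the same kind of shared-prefix prompts and caches them with its radix tree.

      # Minimal sketch, assuming vLLM is installed; model name and prompts are illustrative.
      from vllm import LLM, SamplingParams

      # Ask vLLM to reuse K/V values for repeated leading text across requests.
      llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct", enable_prefix_caching=True)

      background = open("background.txt").read()   # long shared context, identical per request
      questions = ["What is the main finding?", "Summarise the limitations."]

      params = SamplingParams(max_tokens=128)
      for q in questions:
          # Every prompt starts with the same background, so after the first
          # request its K/V values come from the cache; only the question
          # suffix and the generated tokens are new work.
          out = llm.generate([background + "\n\nQuestion: " + q], params)
          print(out[0].outputs[0].text)

      # The SGLang route is similar: launch the server, e.g.
      #   python -m sglang.launch_server --model-path meta-llama/Llama-3.1-8B-Instruct --port 30000
      # then send prompts that share the same leading text; the prefix is
      # reused automatically.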