Does Fine Tuning Embedding Models Improve RAG?

  • Published: 29 Nov 2024

Comments • 10

  • @rishabhindoria1696
    @rishabhindoria1696 9 days ago

    amazing content, thank you so much, looking forward to more

  • @servesh9606
    @servesh9606 19 days ago

    Thank you so much, the video was really helpful

  • @Jay-wx6jt
    @Jay-wx6jt 2 months ago +1

    Great post. Keep up the practical videos!

  • @micams2009
    @micams2009 2 months ago +2

    Thanks for the vid and the experiment; one little thing one might need to consider: since the sentence transformer you are using (all-MiniLM) is limited to 512 tokens (well, technically it's 256 from their training, but Chroma might have overridden that; anything larger would get truncated), this should be matched to the chunk size you are applying when chunking the document.
    Since you have chosen 800 with a 400 overlap, there is not a very big chance that you end up with questions which are "out of bounds" of what the embedder even takes as input; this gap naturally increases if you use larger chunks with a small sentence transformer model. The rather large overlap ratio has a second effect: if the question created by the LLM for the synthetic data generation is based on information beyond what the embedder takes into account, it is most likely that the next chunk will be returned as the best match. This wouldn't degrade the overall RAG performance, but it's certainly not optimal to train the adapter against something that is not represented in the input data.
    Hope this is understandable - it would be interesting to look at the "non-successful" cases (target chunk not in first place) and analyze how many times the following chunk was actually chosen instead.

    • @AdamLucek
      @AdamLucek  2 months ago

      Thanks for the additional information, all great things to consider and ideas for further work! I wasn't aware of the base model's token limitation off the bat - I'll have to look and see how Chroma approaches this and whether it does anything specific for larger-token-count documents by default (a quick way to check the model side is sketched below).
      I definitely agree with your points on larger chunking with overlap and how that can affect ranking/accuracy. My main goal was to test this research in an environment that best simulates an existing RAG tool at a company, rather than the nicely organized benchmark data - which definitely leads to training inefficiencies, but I was hoping it would be a more realistic/applicable test, as most company data is not as nicely structured or chunked as the benchmarks the research gets via MTEB.
      Thanks for the writeup! Great context and considerations for the future.
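
A minimal sketch of checking where truncation kicks in, assuming the embedder discussed above is sentence-transformers' all-MiniLM-L6-v2 (the 256/512 figures come from the comment, and Chroma's bundled default embedding function is, as far as I know, an ONNX export of the same model, so the same question applies there):

```python
# Hedged sketch: inspect how many tokens the embedder keeps before silently
# truncating, so the RAG chunk size can be matched to it.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
print(model.max_seq_length)              # 256: tokens actually kept per input
print(model.tokenizer.model_max_length)  # 512: the architecture's hard limit

# encode() drops everything past max_seq_length, so an ~800-token chunk is
# represented only by its first ~256 tokens unless the window is raised.
model.max_seq_length = 512
vec = model.encode("a long documentation chunk ...")
print(vec.shape)                         # (384,) for all-MiniLM-L6-v2
```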

  • @nobody84980
    @nobody84980 20 days ago

    Great stuff, good job.. new subscriber🙏
    Just curious, how long did the fine-tuning take for the 30 epochs? I honestly thought it'd increase the recall even more, but given its performance on training it seems to overfit.
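
A quick way to check the overfitting hypothesis raised above is to compare recall on the training questions against recall on held-out questions. A minimal sketch, with placeholder data and assuming the fine-tuned model loads via sentence-transformers:

```python
# Hedged sketch: train-vs-validation recall@1 as an overfitting check.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in for the fine-tuned model

def recall_at_1(questions, chunks, gold_idx):
    """gold_idx[i] is the index of the chunk that should answer questions[i]."""
    q = model.encode(questions, normalize_embeddings=True)
    c = model.encode(chunks, normalize_embeddings=True)
    best = np.argmax(q @ c.T, axis=1)            # cosine similarity via dot product
    return float(np.mean(best == np.asarray(gold_idx)))

# Placeholder data; in practice these come from the synthetic question/chunk pairs.
chunks  = ["Refunds are accepted within 30 days.", "Passwords are reset from the account page."]
train_q = ["What is the refund window?"]
val_q   = ["How do I reset my password?"]
print("train recall@1:", recall_at_1(train_q, chunks, [0]))
print("val   recall@1:", recall_at_1(val_q,   chunks, [1]))
```

A large gap between the two numbers (high train recall, flat validation recall) is the usual sign that extra epochs are only memorizing the training questions.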

  • @MikewasG
    @MikewasG 2 months ago

    The video is AWESOME

  • @jonatan01i
    @jonatan01i 15 days ago

    One single layer doesn't really do anything;
    two is the holy grail.
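
A hedged reading of the comment above, in terms of the adapter trained on top of frozen embeddings: a single linear layer is just one matrix, and two layers only add expressive power if a nonlinearity sits between them. Dimensions and names below are illustrative, not the ones used in the video:

```python
# Hedged sketch: single-layer vs two-layer adapter over frozen embeddings.
import torch
import torch.nn as nn

EMB_DIM = 384  # e.g. all-MiniLM-L6-v2 output size

single_layer_adapter = nn.Linear(EMB_DIM, EMB_DIM)

two_layer_adapter = nn.Sequential(
    nn.Linear(EMB_DIM, EMB_DIM),
    nn.ReLU(),                     # without this, two linear layers collapse into one matrix
    nn.Linear(EMB_DIM, EMB_DIM),
)

frozen_embeddings = torch.randn(8, EMB_DIM)      # stand-in for encoder output
adapted = two_layer_adapter(frozen_embeddings)   # same shape, learned non-linear mapping
print(adapted.shape)                             # torch.Size([8, 384])
```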

  • @ravishmahajan9314
    @ravishmahajan9314 2 months ago +1

    Why would anyone fine-tune embedding models?
    Are you saying we convert data into embeddings and store them in a vector DB, which we query during our call to the LLM?
    The retrieved chunks are then used as context for our query to the LLM.
    That's called simple RAG.
    But what do you mean when you say "fine-tuning embeddings"?
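
For context, a minimal sketch of what fine-tuning an embedding model typically looks like in a RAG setting (assuming the sentence-transformers library and a contrastive loss over synthetic question/chunk pairs; the data below is a placeholder):

```python
# Hedged sketch: "fine-tuning embeddings" means continuing to train the
# embedding model itself so that a question and the chunk that answers it
# land closer together in vector space than unrelated chunks.
from sentence_transformers import SentenceTransformer, InputExample, losses
from torch.utils.data import DataLoader

model = SentenceTransformer("all-MiniLM-L6-v2")

# (question, relevant chunk) pairs, usually generated by an LLM from your docs.
pairs = [
    ("What is the refund window?", "Refunds are accepted within 30 days of purchase."),
    ("How do I reset my password?", "Passwords can be reset from the account settings page."),
]
train_examples = [InputExample(texts=[q, chunk]) for q, chunk in pairs]
train_loader = DataLoader(train_examples, shuffle=True, batch_size=2)

# In-batch negatives: every other chunk in the batch counts as a negative.
loss = losses.MultipleNegativesRankingLoss(model)

model.fit(train_objectives=[(train_loader, loss)], epochs=1, warmup_steps=0)
model.save("all-MiniLM-L6-v2-finetuned")
```

The RAG pipeline itself is unchanged; only the text-to-vector model changes, so the documents have to be re-embedded and re-indexed with the fine-tuned model before querying.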
