Improve RAG with This Simple API (code included)

Dave Ebbelaar

8K views

Published: Aug 20, 2024
Want to get started with freelancing? Let me help: www.datalumina...
Need help with a project? Work with me: www.datalumina...
🔗 GitHub Repository
gist.github.co...
📑 Azure Document Intelligence
learn.microsof...
🛠️ My Development Workflow
• My Development Workflo...
👋🏻 About Me
Hi there! I'm Dave, an AI Engineer and the founder of Datalumina. On this channel, I share practical coding tutorials to help you become better at building intelligent systems. If you're interested in that, consider subscribing!

Comments • 28

@farhanafridi8694 1 month ago ⁺¹²
I believe you are the hero every AI engineer needs. Unlike most YouTubers who copy and paste code from documentation, you address the real problems AI engineers face.
@daveebbelaar 1 month ago ⁺²
Appreciate that!
@BrockMesarich 1 month ago ⁺²
So good at simplifying concepts in these tutorials. Loved this Dave!
@daveebbelaar 1 month ago ⁺¹
Thanks Brock 🙏🏻
@ClarkeBishopConsulting 1 month ago ⁺¹
Very helpful, Dave! Many companies try to use naive chunking because there are so many examples on the web, YouTube videos, etc. You gave us a very good way to do smarter chunking and get more useful results. This is the future for RAG use cases.
@daveebbelaar 1 month ago
Thanks Clarke!
@GeertBaeke 1 month ago ⁺¹
Good stuff! We use the exact same technique with markdown-based chunking and extra metadata for the chunks. Works really well!
@daveebbelaar 1 month ago
I think this is currently the best approach for RAG.
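The markdown-based chunking with extra metadata mentioned in this thread could look roughly like the sketch below. It is not the code from the video: the `Chunk` class, the `chunk_markdown` function, and the `source` and `headings` metadata keys are hypothetical names chosen for illustration. It assumes the document has already been converted to Markdown and splits it at every heading, attaching the heading trail to each chunk.

```python
import re
from dataclasses import dataclass, field

# Matches ATX-style markdown headings such as "# Title" or "### Section".
HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


@dataclass
class Chunk:
    """Hypothetical container: chunk text plus metadata for filtering."""
    text: str
    metadata: dict = field(default_factory=dict)


def chunk_markdown(markdown: str, source: str) -> list[Chunk]:
    """Split markdown at headings; each chunk records its heading trail."""
    chunks: list[Chunk] = []
    path: list[tuple[int, str]] = []  # (level, title) of enclosing headings
    buffer: list[str] = []            # body lines under the current heading

    def flush() -> None:
        text = "\n".join(buffer).strip()
        if text:
            chunks.append(
                Chunk(
                    text=text,
                    metadata={
                        "source": source,
                        "headings": [title for _, title in path],
                    },
                )
            )
        buffer.clear()

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            # Close the previous section before the heading trail changes.
            flush()
            level = len(match.group(1))
            # Drop headings at the same or a deeper level than the new one.
            while path and path[-1][0] >= level:
                path.pop()
            path.append((level, match.group(2)))
        else:
            buffer.append(line)
    flush()
    return chunks
```

The heading trail can be prepended to the chunk text before embedding or stored as filterable metadata in the vector store. One known limitation of this sketch: lines starting with `#` inside fenced code blocks are treated as headings.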
@skyrayzor3693 1 month ago ⁺¹
Your video is detailed and very helpful, thank you for these types of techniques.
@micbab-vg2mu 1 month ago ⁺²
Great, thank you for sharing :) Please explore the topic more :)
@krlospatrick 28 days ago
Thanks a lot for sharing this knowledge, it's really useful!
@trendavira5128 1 month ago ⁺²
Hi Dave, thanks for the awesome content. A client came to me for a RAG solution. He has a library of hundreds of thousands of pages (about 60 GB), and the simplest RAG techniques don't seem to work for this case. I came up with a solution using a hybrid retriever and a reranker with llama-index. The results were good but not perfect. If you were me, how would you tackle this problem?
@awakenwithoutcoffee 23 days ago
We are working on a solution for this that can be white-labeled on release! Does your client have an API endpoint or some kind of bucket containing all the files? It really depends on what formats the data comes in. If it is just text, then you can use a hybrid approach with semantic chunking, parent-document retrieval, or other metadata filtering techniques. The main point is to make sure the data is pre-processed and cleaned before being chunked and embedded. Entity extraction is expensive but can be very helpful. A second-best option is to extract metadata. One is used for semantic extraction (entity) and the other for additional filtering.
GraphRAG is the best solution, using entities, but it costs a massive amount of resources and development time, making it only accessible to enterprise clients (10-50k+).
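A hybrid retriever like the one discussed in this thread has to merge a keyword ranking and a vector ranking into one list. The commenters do not say how they merge results, so the sketch below swaps in reciprocal rank fusion (RRF), one common choice. The function name and the default `k=60` are assumptions for illustration, not part of llama-index or of the video's code.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    so documents ranked well by several retrievers rise to the top.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for stable output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

The fused list would then typically be truncated to the top few candidates and passed to a reranker, as described in the comment above.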
@Divyv520 1 month ago ⁺¹
Hey Dave, really nice video. I was wondering if I could help you with higher-quality, more engaging editing while maintaining a brand colour for your YouTube channel, which can help you get more engagement on your videos and build your unique personal brand. Please let me know what you think.
@StephanieNguyen-om1ss 1 month ago
Super helpful. Can you please make a tutorial on how to use AWS Textract too?
@AaronGayah-dr8lu 1 month ago
Enjoyed this. Thank you.
@LaHoraMaker 1 month ago
Have you tried passing the PDF to Jina Reader API? The Markdown output is quite clean too! (but it's only usable for public documents)
@inflationking1271 1 month ago ⁺¹
Could you do a GraphRAG tutorial?
@__m__e__ 18 days ago
Thanks, I'm a newbie and your videos helped get me started. Can you please also share pdf_ingester?
@chwaleedsial 1 month ago
Will try this with Textract. For my use case I am just sending a CSV (exported from Excel) and it's working, but I don't think that is a systematic, luck-proof way. Do you think a RAG approach would be better, and less prone to context- and structure-related hallucinations?
@awakenwithoutcoffee 23 days ago
Awesome video, but where can we find the "from config.settings import get_settings"?
@testadrome 1 month ago ⁺¹
Does it work with scanned PDF docs?
@daveebbelaar 1 month ago ⁺³
Yes!
@brandonvelasquez3530 1 month ago
This seems similar to GraphRAG. What is the difference?
@sahiljain9376 1 month ago
GraphRAG is a more powerful solution than this baseline RAG. In GraphRAG, the data is stored in a graph of entities and relationships, and detailed community summaries are also generated, which helps in the retrieval flow. For example, a question like "Did the company underperform in Q4 vs Q3?" would be difficult to answer using baseline RAG but can be answered easily using GraphRAG.
@awakenwithoutcoffee 23 days ago
@sahiljain9376 You can enhance RAG with agentic frameworks to allow these questions, e.g. an SQL agent with metadata filtering. I love GraphRAG, but it (a) is super expensive, since entity extraction requires a ton of LLM calls, (b) takes a lot of time to set up the graph, and (c) has additional challenges to overcome before it can really be used for non-enterprise.
@__m__e__ 18 days ago
@sahiljain9376 I was unaware of GraphRAG, and it looks really interesting, thanks. It looks like it's beyond my skill level now, but hopefully MS integrates it into Azure soon.
