Multi-modal RAG: Chat with Docs containing Images

Prompt Engineering

24K views

Published: Oct 25, 2024

Comments • 43

@engineerprompt 3 months ago ⁺¹
If you want to learn RAG Beyond Basics, check out this course: prompt-s-site.thinkific.com/courses/rag
@jfbaro2 1 month ago
Does it cover how to minimize (or even eliminate) hallucinations, and how to make sure the result ALWAYS considers the content added to the RAG "database"?
@rubencabrera8519 1 month ago
This is the best AI channel out there, PERIOD. Thanks for sharing your knowledge.
@aerotheory 3 months ago ⁺¹
Keep going with this approach; it is something I have been struggling with.
@waju3234 3 months ago
Me too. In my case, the answer is normally hidden behind the data, context and the images.
@AI-Teamone 3 months ago
Such insightful information. Eagerly waiting for more multimodal approaches.
@ilaydelrey3122 3 months ago ⁺⁶
A nice open-source, self-hosted version would be great.
@b.lem.2499 19 days ago
Thanks, is there a video of the same project, but with LangChain instead of LlamaIndex?
@tasfiulhedayet 3 months ago
We need more videos on this topic
@Techn0man1ac 3 months ago
What about doing the same, but using LLAMA3 or a smaller local LLM?
@AyishaAshraf-s2f 11 days ago
The use case is to extract the relevant text along with the images in the file using generative AI. When a prompt is given, the relevant text and image should be displayed in the response.
@ai-touch9 3 months ago
I appreciate your effort. Please create one on fine-tuning the model for efficient retrieval, if possible with LangChain.
@legendchdou9578 3 months ago ⁺²
Very nice video, but it would be very cool if you could do it with an open-source embedding model. Thank you for the video.
@vinayakaholla 3 months ago ⁺¹
Can you please dive deeper into why Qdrant was used, and into the limitations of other vector DBs for storing both text and image embeddings? Thanks.
@engineerprompt 3 months ago
Will see if I can create a video on it.
@ArdeniusYT 3 months ago ⁺²
Hi, your videos are very helpful, thank you.
@engineerprompt 3 months ago ⁺¹
Glad you like them!
@BACA01 3 months ago
Thanks, your videos are very helpful. I have several gigabytes of PDF ebooks that I would like to process with RAG. Which approach do you think would be best, this one or GraphRAG? I'm looking only at local models, as the costs would otherwise be very high. What if I convert all PDF pages into images first, process them with a local model like Phi-3 Vision, and then feed the result to GraphRAG? Would that work?
@RolandoLopezNieto 3 months ago
Lots of good info, thanks
@avinashnair5064 1 month ago
Can you make it using completely open-source models?
@RedCloudServices 2 months ago
Do you think all of this is now replaced by Gemini?
@mohsenghafari7652 3 months ago
Great job! Thanks
@engineerprompt 3 months ago
thanks :)
@ScottzPlaylists 2 months ago ⁺²
Need to do it all in open source. No API Keys.
@BarryMarkGee 3 months ago
Out of interest, what is the application called that you used to illustrate the flows (2:53 in the video)? Thanks.
@engineerprompt 3 months ago ⁺¹
I am using Mermaid code for this.
@BarryMarkGee 3 months ago
@engineerprompt thanks. Great video btw 👍🏻
@codelucky 3 months ago
Is it better than GraphRAG? How does the output quality compare to it?
@engineerprompt 3 months ago ⁺¹
You could potentially create a GraphRAG on top of it.
@JNET_Reloaded 3 months ago
Where's the code used?
@amanharis1845 3 months ago
Can we do this using LangChain?
@engineerprompt 3 months ago
Yes, will be creating a video on it.
@garfield584 3 months ago
Thanks
@ignaciopincheira23 3 months ago ⁺²
It is essential to preprocess the documents thoroughly before entering them into the RAG. This involves extracting the text, tables, and images, and processing the latter through a vision module. Additionally, it is crucial to maintain content coherence by ensuring that references to tables and images are correctly preserved in the text. Only after this processing should the documents be entered into an LLM.
@engineerprompt 3 months ago ⁺²
agree!
@jtjames79 3 months ago ⁺¹
That's a lot of work. Can an AI do this?
@engineerprompt 3 months ago ⁺¹
@jtjames79 Yup :)
@RickySupriyadi 3 months ago
I expect image generation will become another kind of breed... image gen based on image understanding based on facts
@redbaron3555 3 months ago ⁺⁴
This approach is not good enough to add value. The pictures and text need to be referenced and linked in both vector stores to create better similarities.
@engineerprompt 3 months ago
watch my latest video :)
@arifmp3284 3 months ago
U have any work?
@Know_Ur_World 2 months ago ⁺¹
Which video @engineerprompt
