Intro to RAG for AI (Retrieval Augmented Generation)

Matthew Berman

Просмотров 70 тыс.

3 000

Добавить в
- Мой плейлист
- Посмотреть позже
Поделиться

HTML-код

Размер видео:

Показать панель управления

Автовоспроизведение

Автоповтор

Опубликовано: 6 фев 2025
This is an intro video to retrieval-augmented generation (RAG). RAG is great for giving AI long-term memory and external knowledge, reducing costs, and much more.
Be sure to check out Pinecone for all your Vector DB needs: www.pinecone.io/
Join My Newsletter for Regular AI Updates 👇🏼
www.matthewber...
Need AI Consulting? 📈
forwardfuture.ai/
My Links 🔗
👉🏻 Subscribe: / @matthew_berman
👉🏻 Twitter: / matthewberman
👉🏻 Discord: / discord
👉🏻 Patreon: / matthewberman
👉🏻 Instagram: / matthewberman_ai
👉🏻 Threads: www.threads.ne...
👉🏻 LinkedIn: / forward-future-ai
Media/Sponsorship Inquiries ✅
bit.ly/44TC45V

Комментарии • 424

@matthew_berman 7 месяцев назад ⁺²²
What's your favorite use case for RAG?
@HanzDavid96 7 месяцев назад ⁺⁸
Giving the LLM/Agents a mind for long term planning and remembering stuff associatively. The memory is the half agi within the generative multiagentic system where the LLM is the context processor.
@FunwithBlender 7 месяцев назад ⁺³³
I specialize in Retrieval-Augmented Generation (RAG). Your introduction is good, but it lacks technical depth. You glossed over chunking and how to use it correctly based on the data. Pinecone is good, but it's not necessarily better than vector databases built in Rust or Go, like Qdrant and Weaviate (which are free and open source). It's also important to explain in-memory vector database solutions using tools like FAISS or on-disk solutions like Qdrant and Pinecone, and to discuss the pros and cons of each.
A significant omission is not addressing implicit behavior or implicit data versus explicit data, and their relationship with graph databases. Rerankers might be too advanced a concept; often, you can achieve better results by optimizing chunking, similar to how tokenization is used for semantic understanding. Often, agents are unnecessary, and having a chain-of-thought agent before sending to the LLM can be a waste. Additionally, discussing the similarities between the internals of a transformer and a vector database is intriguing. Overall, the video feels like a Pinecone sponsorship.
Regarding fine-tuning, it's about improving the understanding or behavior of an LLM in a specific domain at the cost of losing understanding in other areas. You should only fine-tune if the model does not seem to understand. Use RAG when the model lacks knowledge or when you want to reduce hallucinations, but relying solely on vector databases is a missed opportunity. One micro aspect you did not touch on is tokenization. The two biggest things people often overlook are chunking and tokenization, and there are massive gains to be made when these are properly understood.
@Spudster3 7 месяцев назад ⁺³
Using my local scanned (searchable) PDF documents in RAG.
@FunwithBlender 7 месяцев назад ⁺²
one good use is ecommerce products for conversational shopping...creating new experiences...built a few prototypes of this as mvps for pitches...its a night and day experience
@De-e-kay 7 месяцев назад ⁺²
@@FunwithBlender Great comment!
What is your go to open source RAG pipeline? I am beginning to learn and discover all these tools. It is pretty amazing.
@dombayo 7 месяцев назад ⁺¹²¹
A vector database tutorial would be great! Excellent content.
@gabrielsandstedt 7 месяцев назад ⁺⁷
You can ask Claude 3.5 create a locally run vector database. It will manage it in a day and you will avoid having to pay for another clouded service. I did it and it worked.
@fabrizio-6172 7 месяцев назад
Great @@gabrielsandstedt
@Dant110 7 месяцев назад ⁺⁵⁴
I would like a deeper dive into RAG and an end to end pinecone tutorial! Thanks for the great video!
@gabrielsandstedt 7 месяцев назад
You could use pinecone but Claude 3.5 can build you a custom vector search algorithm that will work and you can store locally using sqlite
@sigridopps3049 3 дня назад
Thank god there is someone so eloquent to describe things so clearly.
@mcarrusa 7 месяцев назад ⁺¹⁰
PLEASE do the how-to on setting this up. It is a key piece to the puzzle, for sure. Thank you for all the great content!
@JustinsOffGridAdventures 7 месяцев назад ⁺¹⁴
Great video! I've bee following you for awhile and have set up some edge LLM's using your tutorials. RAG is the future for any business wanting to truly utilize their data. to the fullest. I think that a lot of companies aren't even sure how they can implement their data for the greater good of the business while saving money at the same time. Videos like this help clarify the subject. Please do a video on Pinecone. I'm sure there is a lot of us that would like to see it's capabilities. Keep up the great work.
@bitcloud2304 7 месяцев назад
Just discovered this channel and it quickly leapfrogged others as one of my favorite AI channels. I'm a Data Scientist starting to work in the LLM arena and these videos are super helpful. I'd love a full tutorial on RAG!
@AbdulMajeed-lf5sq 7 месяцев назад ⁺²
This is one of the best videos I watched from you as a junior AI engineer 👌🏼 BEAUTIFUL
@ytrew9717 7 месяцев назад ⁺¹¹
Very well explained : short and clear with good examples, thanks!
@paultoensing3126 7 месяцев назад
Yes! Please set up a full tutorial for us. This is powerful. I have a Custom GPT business and I’ve always known I need to incorporate RAG in the most pragmatic way possible to advance my capabilities. So it sounds like Pinecone is the way to go. Thanks so much for your help.
@nareshtaneja7038 7 месяцев назад ⁺⁴
Thanks you for making this Video. I am a Non Techie trying to get easy to understand method of querying my documents using RAG with open source LLMs. Would eagerly await your full tutorial on this topic .
@middleman-theory 7 месяцев назад
Yes, we need a full tutorial please. This is great knowledge and a very simple to understand video! I actually have a pinecone account, and started using it when I first started playing around with Auto-GPT, but I haven't used it since. I'm interested in developing some new projects soon, and RAG sounds like something I need to be thinking about.
@Idea-LabAi 7 месяцев назад ⁺¹
I would also like more tutorials on RAG and techniques to improve chatbots. Thanks Matthew for this content. I like your posts on news but tutorials are also useful and appreciated given your ability to communicate such concepts.
@austrianronin 5 месяцев назад
Great video as always, Matthew. Thanks a lot. I really would love to have a detailed video on how to set up RAG. Trying to establish an AI retrieval system for all teachings of our Buddhist teacher. About 16 min words. The seems to be the perfect system
@garic4 7 месяцев назад
In RUclips, there are hundreds of channels baffling buzzwords and lame tutorials about these concepts without putting real effort on creating meaningful videos. And this channel is not one of those.
I appreciate your videos Matt, thank you for the great content
@garic4 7 месяцев назад
Oh and please publish both tutorials , Picone and more RAG applications - those are the future and using agents with that is golden for the near future for all of us
@BrankoPetrovic-f2z 7 месяцев назад ⁺¹
I've heard about RAG before, but this video helped me understand it much better. Thank you for sharing your knowledge! I would greatly appreciate it if you could make another video demonstrating how to use it with a real-life example
@forifand 7 месяцев назад ⁺¹⁰
A full tutorial would be great - thanks so much 👍
@JulioCesarjcfalcone 7 месяцев назад ⁺¹⁴
I would love to see a tutorial on how to use RAG! I was just thinking on how to solve some of this knowledge problem on a small project I'm working on
@AaronBrown-h2n 6 месяцев назад
Yes! Please go through a full demo! would love to see it!
@bobwarfieldoz 7 месяцев назад
Yes please, more information about Pinecone and RAG! Great content, thanks!
@youdaloser1 7 месяцев назад
100% on board with seeing a full tutorial. Also highly interested in seeing a fully open-sourced setup.
@nikhilgoyal007 3 месяца назад
Finally understood RAG . thank you!
@afonsolfm 7 месяцев назад ⁺¹
Great videos man! Listening them every day now.
@tchadcarby8439 7 месяцев назад
Thank you for your hard work Mathew! Please do videos on all suggestions that you made in this video.
@samtabby3373 7 месяцев назад ⁺¹
I like your style of explaining things. Thank you for your videos as I've learned a lot from you.
@gustavdreadcam80 7 месяцев назад
I'm defintely interested in doing RAG but more so in doing it locally. Especially with all the important information I can't trust a service for storing it, if there is a local way of doing it I'd be very interested in building a RAG pipeline. Great video for explaining the basics of it.
@TheLegomom2 7 месяцев назад
Yes definitely need to expand on RAG, vector database and pinecone. Full end to end process for incorporating specific business data sets to generate highly customized content. Creative/marketing use case if possible.
@lydiayuna9155 7 месяцев назад
This is by far the best AI educational video!!
Please share more RAG solution , this will be very very useful for your audience !!
@luizcamillo9933 7 месяцев назад
This is a great and very easy to understand explanation. Please make a full tutorial!
@bitsie_studio 7 месяцев назад
Would absolutely love to see a tutorial on this. Thanks for doing something more technical like this, Love it!
@Larimuss 6 месяцев назад
Would love a full RAG tutorial. Thanks for the great video.
@dave-cripps 7 месяцев назад
Claude's new Projects feature is like a simple RAG. I've given it all the knowledge about a novel I'm working on and it has been surprisingly good at understanding all the nuances. Way better than a normal conversation.
@ErickJohnson-qx8tb 7 месяцев назад ⁺¹¹
YESSS DO ITT PLEASE 🙏
@brianWreaves 7 месяцев назад
🏆 Very helpful, with just the main points... love it! As with other, looking forward to more details.
@jack.splash2334 7 месяцев назад ⁺²
A tutorial would be amazing! It’s exactly what I need for something I wanted to experiment with
@shonnspencer1162 7 месяцев назад
please continue to educate and show us the RAG vectoring tutuorial. Great video!
@fourlokouva 7 месяцев назад
Great explanation of RAG and how it differs from fine-tuning and prompt engineering
@sahilverma9330 7 месяцев назад
Finally an explanation without using complex terminologies. Thank you Matthew. Lets do one with RAG + Agents
@TrevorMatthews 7 месяцев назад
Ok that was awesome. Of course I’d like to know more! I’ve had a hard time understanding rag til now for some odd reason. Would also love a tutorial on pinecone and embedding.
@RetiredVet 6 месяцев назад
I would like to see a more in depth RAG tutorial. Pinecone is great, but maybe at the end show how to use a local vector db for those of us who want it completely private.
Thanks!
@Maltesse1015 7 месяцев назад
Looking forward for the Tutorial 🎉!!
@JeffParkerTexas 7 месяцев назад
Yes, please do a step-by-step guide!!!
Thank you!
@IamiAGorynT 7 месяцев назад
Great video. A step-by-step video on RAG and Pinecone would be great! 👍
@fasteddiegarcia1 7 месяцев назад
Yes please create a tutorial video showcasing step by step instructions around practical techniques for RAG, local open source vector databases, and automations
@lasithchandrasekara5200 7 месяцев назад
Great video, please do a deeper dive into RAG and later DSPy video as well.
@mushkrot 2 месяца назад
I would be grateful to see the full tutorial on embedding large documents using some of your favorite tools, storing it in Pinecone, and building an AI app using RAG. Have you recorded any tutorials of that kind already? Thank you!
@rickzhong6657 7 месяцев назад
Great top view of RAG concept, please give us a detail walk-through on a concrete coding example, many thanks! 🙏
@FullEvent5678 7 месяцев назад
I'd be very happy to see the whole process presented in a video ♥
@williamross4062 7 месяцев назад
A full tutorial is NEEDED
@Copa20777 7 месяцев назад
This topic is the kind of knowledge everyone thinks they have and brush over.. thanks Matthew
@michaeldolmos 7 месяцев назад
Love to see a full tutorial.!
@jprak123asd 7 месяцев назад
Brilliant!! Yes, a deeper dive will help
@jk-2033 7 месяцев назад
This was very interesting and a full step by step video would be very helpful!
@KiLVaiDeN 7 месяцев назад
A clever way to make an ad, here for Pinecone, by delivering knowledge. It's much more acceptable this way. Well done, and thanks for the intro to RAG :) The people @Pinecone must be proud of this video.
I've just to say that, it's more about giving AI an optimized context than truly giving them a "memory". The title feels a bit misleading. A real memory would be a workable space where the AI stores itself the required data for later retrieval, and which becomes part of its infrastructure. This is not it.
@laurenceturpin1409 7 месяцев назад
An excellent tutorial I would really like you to do a deeper dive into RAG and show how you would set it up.
@kamelirzouni 7 месяцев назад
Thank you for this wonderful explanation on RAG, very informative. Just a note regarding Claude's Context Window: it's 200K and not 100K.
@stuffaboutthings8679 7 месяцев назад
Yes ! To all of the walk through on setting up local rag llms and mixed agents
@dcmumby 7 месяцев назад ⁺²
RAG requires a knowledge graph DB as well in order to find information not directly mentioned which is a limitation of RAG, a tutorial incorporating both would be amazing
@youcandosomethingaboutit 7 месяцев назад ⁺²
00:02 An intro to RAG and its misunderstood nature
01:51 RAG is efficient for continually providing new knowledge to large language models
03:42 RAG enables adding external knowledge to AI models
05:29 RAG allows AI to access and incorporate new information into its responses.
07:25 Utilizing embedding models to enhance AI understanding
09:12 RAG enhances AI by providing external knowledge sources
11:10 Utilizing external knowledge for AI searches
12:57 RAG simplifies retrieval augmented generation process
@BigBadBurrow 7 месяцев назад
Thanks, Matt, interesting concept. A video tutorial would be great!
@rahuljauhari3240 7 месяцев назад
amazing explanation of RAG thank you!!
@svetoslavlyubenov8521 7 месяцев назад
It will be great to do a full tutorial. If you add multimodal RAG and agents functionalities it will be even better.
@thecobrasnakes 7 месяцев назад
Yess we want a tutorial! Amazing content thank you !
@morrazzo4432 7 месяцев назад
I have following doubts bro please clarify, it would be most for me:-
1) Which should I learn first? RAG or AI agent
2) which should I learn first LLM or NLP?
3) which programming language is necessary for learning all of them?
4) While writing code for any of these for example if we are writing a code in tensorflow pytorch in which pattern should we write a code?
5) what is the scope for prompt engineering? Is it just a hype or is it really worth?
@strazzi2 7 месяцев назад
A deeper dive into RAG and embeddings would be a great help for developers like me. I work in C# with GPT4o and I use REST rather than Python, but then OK, you can't always get what you want 🙂
@shuntera 7 месяцев назад ⁺⁹
Be interested to see best practices for keeping the RAG database up to date. For example if a new PDF is dropped into a watched folder the PDF gets submitted to the embedding model automatically. Likewise for PDFs that are out of date and removed which should them be dropped from the vector database.
@antaishizuku 7 месяцев назад
You could add a useage count, entered date, last accessed date, etc and have a background thread check for old info. Like say 2-3 years unless its something your llm wouldn't know
@vishal.dekatearess 7 месяцев назад
Hi Matthew,
This video is very informative about basic RAG,
Please provide a tutorial on Pinecone
@PersianMate 7 месяцев назад
yes please! I’d like to see a full tutorial on how to do the whole process
@gsmorgan 7 месяцев назад
A deeper dive on how to set-up RAG with Pinecone and an embedding model would be great!
@thesvenni 6 месяцев назад
Hi Matt, thanks for another great video. Would be great, with a tutorial on how to use the GPT chat conversation export as RAG memory.
@andredinizwolf7076 7 месяцев назад ⁺⁴
Great knowledge!! Please create a new video about pinecone..
@BizAutomation4U 7 месяцев назад
Awesome video... Would be great to put that together within the context of both fine tuning with Unsloth for example using not just documents but a database data source .. which has relationships across tables... How relationships get embedded..etc
@patrickbowen8408 7 месяцев назад ⁺¹
Yes, full tutorial on rag and pinecone. Provide details on keeping private data private.
@basedbuz 7 месяцев назад
I have said that it's less about compute power and now about organization of data and mimicking the brain.
This is one way to do it
@IdkJustCookingDude 7 месяцев назад
This is exactly what I've been looking for! Thanks so much for this
@chetanreddy6128 7 месяцев назад
Hey it would be very very helpful if you drop a detailed video on rag setting up and usage!
@Elsag_GeliNakh 7 месяцев назад
Good explanation.. It would be great to see a tutorial on how to use RAG!
@EcomSnif 7 месяцев назад
Hi Matthew, Thank you so much for sharing such valuable information! I would really appreciate it if you could share an in-depth tutorial on RAG embedding and vector databases. Additionally, I noticed that Microsoft has released an enhanced model of RAG called GraphRAG. It would be fantastic if you could also do a presentation on this. Thanks again!
@BenoitStPierre 7 месяцев назад
The OpenAI Dev Days from last year had a great session on optimizing LLMs. Their progression was to try few-shot, then RAG, then fine-tuning - and their description of fine-tuning was that it was a good way to provide "intuition" to the model, but not knowledge.
@corytimm142 7 месяцев назад
I would love to see a video on how to do all of this with open source software that I can run locally. A project combining RAG with Ollama models would be awesome
@antaishizuku 7 месяцев назад
Oh what would be interesting is a comparison of different rag databases there are so many and while personally ive settled of faiss, redis, and chromadb it would be nice to see a detailed breakdown of those most popular ones and the best options.
@attilazimler1614 7 месяцев назад
Hi, thanks for the video, a deeper dive would be interesting :) thanks :)
@PureMoss 7 месяцев назад
Would love to see both the tutorial and deeper dive using RAG
@antaishizuku 7 месяцев назад
I have been working on a chromadb vector database sothis is awesome! Thanks!
@cirtey29 4 месяца назад
Great content.
@bradstudio 7 месяцев назад
PLEASE DO A FULL RAG SETUP TUTORIAL!! 🔥
@stonibeauchamp4588 7 месяцев назад
Full tutorial would be fantastic!
@jeffreymoore1431 7 месяцев назад
Yes. Please create a tutorial on Pine Cone setup and usage. Thanks.
@jcktrck 7 месяцев назад
this is incredible, youve inspired me to learn python. I want to work with these frameworks.
@jr21294 7 месяцев назад
For search, there are two ways to do it: lexical or semantic search. RAG can also be used with lexical search
@MohammadTalli1764 7 месяцев назад ⁺¹
* 00:00:00 Introduction to Retrieval Augmented Generation (RAG)
* 01:02:22 Misunderstandings about RAG and Large Language Models
* 02:11:44 RAG as an external source of information for large language models
* 03:13:22 Context window limitations
* 04:12:11 RAG for chatbot conversation history
* 04:51:17 RAG for access to internal company documents
* 05:39:22 RAG to update large language models with new information
* 06:02:22 How Retrieval Augmented Generation Works
* 07:32:22 Workflow with RAG for finding relevant information
* 08:22:22 Embedding model and Vector database
* 10:11:22 RAG with agents for iterative approach
* 12:13:22 Pine Cone for Vector database
* 13:11:22 Conclusion
* 13:47:22 Outro
@davidlavin4774 7 месяцев назад ⁺¹
Slight pet peeve of mine - I think presenting it this way makes it sound like you must use an embedding model/vector db to do RAG. The basic version of RAG is just that idea of passing additional, retrieved info with the prompt to the LLM. Yes, the embedding model w/ vector db is a very efficient way of doing that - especially with large amounts of data. But it is not the only way to accomplish it, and may not even be the best way to do it, depending on the use case.
@michelforet6765 7 месяцев назад
Good evening Matthew,
I've been following you since January 2024, the year of AI real expansion. I'd love a video/explanation about Pinecone and this transposition of different types of documents.
Kind regards
Michel (France)
@Gusztav2001 4 месяца назад
I'm a new subscriber and would be interested in a full tutorial on the practical application of this.
@wtcbd01 5 месяцев назад
Matthew, excellent work. My only critique would be to stop doing the click bait type thumbnails and titles on your other videos when you are so incredible with explaining a concept and you already have a huge following. Again, I can point adults + young people to this video to learn more about rag, whereas I would be hesitant at pointing them to the goofy looking videos, though the content is great on the other clickbait type videos. Videos. Incredible job and looking for it to your further explanation of rag and a deep dive of how to set it up
@2010Sisko 7 месяцев назад ⁺¹
Thanks so much for your video. It really was very informative. Yes, absolutely I would love to have a course on retrieval augmented generation and also on Pinecone great content very much appreciated.
@KonradTamas 7 месяцев назад ⁺³
YeYe, do the Tutorial
@Stanislav-10101 7 месяцев назад
Great video, now it’s much clearer how RAG works!
Will RAG help with programming? For example, is it possible to download an entire application project? And so that LLM can use the project code to expand it, for example, in the style of the entire project and using components from the project loaded into the RAG?
@alanmorgan2536 7 месяцев назад ⁺¹
I've been dreaming about using RAG to compile the summary of key references I use in my profession (Geophysical interpretation). Obviously, professionals may not utilize every key learning from published materials and some information may be conflicting with other published materials in the same field. What would be immensely useful is a method of adding weights to information you utilize on a daily basis and to identify where an AI finds conflicts in logic. If a conflict is found, a model can be taught which path to follow.
@majoorF 7 месяцев назад
open prompt language model. No limit to the prompt input of a language model. You can basically add an additional large language model of data within you prompt. :)

Следующие

Автовоспроизведение

Large Language Models (LLMs) - Everything You NEED To Know