Great video and nice approach. Any specific reason to choose a chunk_size of 5000 (more or less 1250 tokens) while generally the recommended chunk size is up to 400 tokens?
I just got this working on my home laptop. Amazing! I can see how this might chew a lot of API tokens... Here's a hard question that the usual LLMs (GPT-4o, DeepSeek, etc.) get wrong: "Can you explain how Pydantic AI implements its custom validation mechanisms for complex nested models, and what are the performance implications of using these validations in large-scale applications?" Now... how do I rerun this to my docs?
Glad you got it up and running! The title and summary creation for every chunk will certainly take a lot of tokens, luckily though it's a simple task so you can use very cheap LLMs to get the job done. That's a good question! Could you clarify what you mean by rerun to your docs?
I have a collection of PDFs. Presumably, I need to reliably convert them to markdown, and then let the LLM do the chunking and bookkeeping, similar to what you did with the website crawl.
Hi Cole, I’ve been following your videos on Agentic AI and RAG, and they’ve been incredibly insightful! I’ve successfully built an AI assistant with Agentic RAG based on your guidance, and it’s working great. However, I want to ensure that the assistant only replies to queries related to my website, and any other queries are considered outside the scope. Could you share any tips or best practices to achieve this? Your expertise would be a huge help. Thanks for the amazing content you share!
Thanks for the kind words and that's super cool you built an agent for yourself based on this! Nice work! Great question too. The main way to limit your agent to focus on just what you made it for is to tell it that in the system prompt. Something like "You are an expert at the Pydantic AI documentation and only answer questions and talk about that. If the user asks about or talks about something out of scope, direct them back to talking about Pydantic AI and say you can't discuss other topics."
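In Pydantic AI that guardrail can live right in the agent definition. Here's a minimal sketch - the model name and prompt wording are just placeholders:

```python
from pydantic_ai import Agent

# Hypothetical scoped agent: the system prompt is the guardrail.
agent = Agent(
    'openai:gpt-4o',
    system_prompt=(
        'You are an expert on the Pydantic AI documentation and ONLY answer '
        'questions about it. If the user asks about anything else, say that '
        'topic is out of scope and steer them back to Pydantic AI.'
    ),
)
```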
I have a request: would appreciate it if you could show how we can use some of the free alternatives to the OpenAI API (like local Ollama or Hugging Face models), and also if you could make tutorials that involve typing the code step by step in real time, since seeing entire chunks of code at once can be pretty heavy on the eyes.
YES LangGraph and Pydantic AI is an incredible combo! This would fit very well into agentic RAG - we can ingest the LangGraph documentation just like we did with Pydantic AI using Crawl4AI. They have a sitemap.xml as well: langchain-ai.github.io/langgraph/sitemap.xml What we can do is set the metadata field "source" to be "pydantic_ai" for the Pydantic AI docs and "langgraph" for the LangGraph docs. Then we can create separate RAG tools for our agent that will search specifically through each of the docs in the knowledgebase using the metadata to filter. That way the agent won't get confused between the frameworks but can still search through both to combine them together to create agents on our behalf leveraging both.
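As a rough sketch of what those per-source tools could look like (the `match_site_pages` RPC and field names here are illustrative, not the exact code from the repo):

```python
from dataclasses import dataclass

from openai import AsyncOpenAI
from pydantic_ai import Agent, RunContext
from supabase import Client

@dataclass
class Deps:
    supabase: Client
    openai: AsyncOpenAI

agent = Agent('openai:gpt-4o', deps_type=Deps)

async def _search_docs(ctx: RunContext[Deps], query: str, source: str) -> str:
    # Embed the query, then only match chunks whose metadata "source"
    # field equals the requested documentation set.
    emb = await ctx.deps.openai.embeddings.create(
        model='text-embedding-3-small', input=query)
    result = ctx.deps.supabase.rpc('match_site_pages', {  # illustrative RPC
        'query_embedding': emb.data[0].embedding,
        'match_count': 5,
        'filter': {'source': source},
    }).execute()
    return '\n\n---\n\n'.join(row['content'] for row in result.data)

@agent.tool
async def search_pydantic_ai_docs(ctx: RunContext[Deps], query: str) -> str:
    """Retrieve chunks from the Pydantic AI documentation only."""
    return await _search_docs(ctx, query, 'pydantic_ai')

@agent.tool
async def search_langgraph_docs(ctx: RunContext[Deps], query: str) -> str:
    """Retrieve chunks from the LangGraph documentation only."""
    return await _search_docs(ctx, query, 'langgraph')
```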
Great question! Having more options to access the data like this almost always gets better results than just naive RAG. It allows the LLM to reason more about the knowledge it wants to retrieve. Basic RAG is pretty limiting because the agent can't make many decisions about the information it is getting.
Thank you! I don't offer consulting at this time but I am working on a platform to connect developers to business owners. Also feel free to post in our community of developers if you are looking for someone! thinktank.ottomator.ai
Is there a way to take that data (crawled thanks to Crawl4AI) and easily feed it to a ChatGPT agent I created? (I am a no-coder, that's why this use case is interesting for me)
Did you create a GPT Assistant? Is that what you mean by Chat GPT agent? If you follow this video to create a knowledgebase in Supabase using what you scrape with Crawl4AI, you could create a custom tool for your OpenAI assistant to query that knowledgebase!
Didn't work out for me. I just thought to update the code to use a Gemini API key. All went well, but at the end when it came to building a UI, it crashed during the second question. It works for one question per check related to the doc.
Hmmm... sounds like something is off with the way the conversation history is stored/retrieved if it crashes on the second message. What is the error you get?
Ollama is OpenAI API compatible so it's pretty easy to switch to that instead of GPT! Main thing is just changing the base URL in the OpenAI client to point to Ollama. They have docs covering this: ollama.com/blog/openai-compatibility
You can certainly tweak this solution to use both! For example you could host Supabase locally (for Postgres) and run an LLM through Ollama. For the LLM you'd just have to change the "base URL" for the OpenAI client to point to Ollama: ollama.com/blog/openai-compatibility And for the Pydantic AI agent, Pydantic AI supports Ollama: ai.pydantic.dev/api/models/ollama/
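Concretely, the switch is mostly just the client construction. A minimal sketch, assuming a local Ollama server with a model already pulled:

```python
from openai import AsyncOpenAI

# Ollama exposes an OpenAI-compatible API at /v1; the api_key is required
# by the client library but ignored by Ollama.
client = AsyncOpenAI(base_url='http://localhost:11434/v1', api_key='ollama')

async def ask(question: str) -> str:
    response = await client.chat.completions.create(
        model='llama3.1',  # any model you've pulled with `ollama pull`
        messages=[{'role': 'user', 'content': question}],
    )
    return response.choices[0].message.content
```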
Hi Cole, I'd like to suggest that you reconsider enabling automatic translations on your videos. For many of us, like me, they make your content much more accessible and easier to follow, especially when we need to pay attention both to what you explain and to what you show on screen. When I checked with other English-speaking content creators, they confirmed that disabling them is a personal decision, but doing so can give the impression that the Spanish-speaking audience isn't valued as much. I really appreciate your work and hope you take this as constructive criticism and a positive push toward change.
YouTube has changed some things under the hood so I wasn't aware I lost this. I have automatic dubbing in some languages through YouTube but not others. I will have to look into it!
Very interesting video, thanks! I have a little question for you: why do you give the URL of the page to the LLM instead of saving the markdown text obtained from the initial crawling? Is there an advantage to doing so? I guess it's to avoid the need to constantly update your markdown information, since the URL will always have the latest information and can be "re-crawled" if need be, but I was curious to understand if there were other elements in your thinking. Thanks! :)
Thank you and great question! So when I give the URLs to the agent it doesn't actually use the URL to visit the site in realtime. It does just use the markdown I have stored in the database. It simply uses the URL to determine if the content is relevant to the user's question - sort of like a title, but I was thinking URLs give extra context with the path: it speaks to how the page relates to the rest of the documentation, if that makes sense. But your thinking is also spot on that we could have the agent pull the latest information in realtime with the URL if we wanted!
@@ColeMedin Thank you for your quick and complete reply! I understand better now and really appreciate! It does indeed make sense to have the full URL to have the extra context from the path (I didn't even think of it that way)! Have a nice weekend!
Awesome tutorial! 👏 Quick question: is Qdrant's faster speed for semantic search a big enough benefit to maybe introduce a hybrid model where you use both, where Supabase might have a reference column to a Qdrant vector store that handles the vector search?
Thank you and good question! Though I might need a bit of clarification. In my mind I can't really see how a specific column in Supabase would point to a Qdrant vector store. If you have multiple Qdrant vector stores to perform RAG with, I would just set those up as separate tools for the agent right in the code instead of making the agent go to Supabase to first find the Qdrant vector store to use. I suppose though that if you really do have dozens of Qdrant vector stores for some reason, it would be more scalable to maintain that list in Supabase instead of having it hardcoded in your script!
@@ColeMedin I was envisioning something like storing the primary key of the Supabase record in a Qdrant database as part of its metadata. The Qdrant record would then store the vectors. It would function similar to a lookup table, except the vector search portion would run against the Qdrant database instead of the Supabase one, and the final search would be combined into one result. Is this feasible, I am wondering?
Glad you found it helpful! Yes I'm planning on doing that soon and it'll be very similar to this. In fact this already is a RAG agent with SQL queries essentially!
What I was suggesting is an agent having a tool that searches user’s question in a relational database and gives answer. This would mean the agent/tool will need to convert the question into a sql query to fetch the relevant data and feed it to LLM. This is required for most B2B use cases where data is stored in tables.
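For reference, that pattern can be sketched as a tool the agent calls with a model-written query. This is a toy version with SQLite and an invented schema, just to show the shape:

```python
import sqlite3

from pydantic_ai import Agent, RunContext

agent = Agent(
    'openai:gpt-4o',
    deps_type=sqlite3.Connection,
    system_prompt=(
        'You answer questions about the orders database. Schema: '
        'orders(id, customer, total, created_at). Use the run_sql tool '
        'and write SELECT statements only.'
    ),
)

@agent.tool
def run_sql(ctx: RunContext[sqlite3.Connection], query: str) -> str:
    """Execute a model-written, read-only SQL query and return the rows."""
    if not query.lstrip().lower().startswith('select'):
        return 'Only SELECT statements are allowed.'
    return str(ctx.deps.execute(query).fetchall())
```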
Hello @ColeMedin, I used this code of yours but with a Gemini model instead of OpenAI, and made the necessary changes. When I am crawling and adding to the database, everything works correctly except I am getting an error: Error getting title and summary: 429 Resource has been exhausted (e.g. check quota). It might be hitting some rate limit. Could you please tell me how to resolve this? Thanks a lot in advance.
Yeah you're probably hitting the LLM too frequently - that's what a 429 error typically means. I would add a delay between each call to the LLM using the time library in Python. And probably reduce the batch size for the web scraping as well.
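A simple backoff wrapper is usually enough. Shown here with the OpenAI client, but the same pattern applies to Gemini's 429s; the model and prompt are placeholders:

```python
import time

from openai import OpenAI, RateLimitError

client = OpenAI()

def get_title_and_summary(chunk: str, retries: int = 5) -> str:
    """Call the LLM, sleeping progressively longer each time we hit a 429."""
    for attempt in range(retries):
        try:
            response = client.chat.completions.create(
                model='gpt-4o-mini',
                messages=[{'role': 'user',
                           'content': f'Give a title and summary for:\n{chunk}'}],
            )
            return response.choices[0].message.content
        except RateLimitError:
            time.sleep(2 ** attempt)  # 1s, 2s, 4s, 8s, 16s
    raise RuntimeError('Still rate limited after all retries')
```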
Great question! Unfortunately not out of the gate since Crawl4AI is not one of the Python packages in n8n and there isn't a Crawl4AI node. But what you can do is turn a Crawl4AI implementation into an API endpoint that you call with the HTTP node in n8n! I might actually be making a video on this soon ;)
Think you have what it takes to build an amazing AI agent? I'm currently hosting an AI Agent Hackathon competition for you to prove it and win some cash prizes! Register now for the oTTomator AI Agent Hackathon with a $6,000 prize pool! It's absolutely free to participate and it's your chance to showcase your AI mastery to the world:
studio.ottomator.ai/hackathon/register
Your channel is a goldmine for someone with a technical background who wants to get into the nitty-gritty of using AI in practical life!
Thank you - that means a lot!
I 'scraped' all your video transcriptions with the YouTube API and gave them as knowledge to my agent.
Thanks 👍
I’ve thought about trying this. How’d it work out?
@ It's not the only AI channel I got the transcripts from. But it does help find things that 'I know I heard about' but otherwise could never find again. With the timestamps it can even tell you more or less where in the video to go. I've put the script I used on GitHub; I can't post links here, but my screen name here and the first digit might help you further. It's under 'EasyScripts'.
Sorry to be so descriptive, but YouTube is trigger-happy when you're too direct with instructions.
Wow that's super cool! You bet!
@@rakly347- I've been wanting to do something similar myself, but wanted to feed a RAG with my YouTube viewing history and my Chrome / Google web history, so I could always find something that I'd seen previously. So this will be a good start for me!
Thanks a TON for making this! I was dreading having to make this myself, as I'm still pretty new to python. This will help a bunch for the Unreal Engine agent I'm creating.
Strong work, Cole. You really do a great job with your videos. Just the right pace, and interesting projects.
Thank you very much! This is the longest video I've ever put out on my channel so I worked extra hard to keep a good pace, so I appreciate you calling it out!
This is my favourite channel for AI topics. I am hands-on with your content. Thank you for all your efforts. Greatly appreciated.
Thank you very much! You are so welcome!
Thanks!
Of course! Thank you for your support - that means a lot to me!
@@ColeMedin I bring good luck bro. All the channels I contributed to in the early stages became very successful! =)
Haha I love it! Glad to have you here man!
Dude, I literally watched one of your old videos on this topic today so happy you’re covering this. Thank you appreciate you.
Glad it's perfect timing haha! You bet Mitchell!
Your teaching style is the best, thank you for sharing!
That means the world, thank you! My pleasure :)
Thanks
Thank you so much for your support! :D
Funnily enough, this is more or less the architecture for all AI agent systems: a processing agent of some kind looks at the query and directs the actions to the relevant systems that then return a response that will be given to the LLM to give to the user.
You can run your Azure infra's log analytics through an agent and have it monitor & repair your systems, for example: all you need is an agent that looks at the system to determine which part is down and which agent to instruct to attempt a repair, while a timer runs down on a verification agent that checks whether the repair succeeded or escalates to a human, and so on. The structure is identical to this one.
Yes very true! I think a specific example of how to architect agents well with "agentic RAG" resonates with people more and makes the concept clear, but you are certainly right that this kind of solution and agentic reasoning is really the foundation for any agent.
This is what I've been waiting for all day
I made one for the Verse language by Epic Games and it used 11 million gpt-4o-mini tokens and 23 million embedding tokens lol. Thank you for this!!! Been trying to solve this for months.
That's awesome!! You bet man!
Brilliant video. After watching the video it seemed to me that RAG may be one of the killer use cases for Agentic implementations. Thank you for sharing your insights generously.
Hey, thanks for making this video Cole. I was having trouble bridging the gap between the phidata repo and the actual implementations of these things. This helped me make my own from scratch.
That's awesome! You bet!
@@ColeMedin Have you noticed that nobody is talking about the Granite and Snowflake embedding models that have come out over the past few months? They're way better than Nomic and Ada, yet people don't even know they exist, and keep using the OpenAI API!
As usual,
Fruitful and informative video.
Thank you Cole, keep going bro.
Thank you very much! I certainly am not going to stop :)
Bro, this is incredibly valuable! Big shout out to you for all this free content! Create a Skool and I'll be your first disciple 🙏
Thank you very much! I don't have a Skool but I do have a community over at thinktank.ottomator.ai :)
Thank you for what you do man. These videos are more than useful!
You are so welcome! I appreciate it!
Funny … just started building exactly that (a RAG-based app that handles different documentation sets in an agentic way). Very good content you provided here.
I love it! Thank you!
Rare, no bullshit channel, thank u Cole!
Haha I appreciate it, you bet!
Great job! Watching with enthusiasm. Agentic RAG is an absolute trend now; I'm just skeptical of leaving important tasks to the LLM (AI agent).
Thanks man! Yeah I totally get the skepticism - I share it too.
The best AI agents are ones that assist you by saving you time without you having to trust them completely. For example, an agent that will draft replies to your emails without actually sending them, so you can review them first.
@@ColeMedin 100% with you on this. We want to *combine* human and machine intelligence. Big thank you for all your work in this space, Cole. Your passion and dedication shines through with every video, and you've signed yourself up to the really hard stuff like community building too. See you on competition day ;)
Thank you very much, that means the world to me! Can't wait to see your submission man!
So good man! As a Java dev I love how you break down the Python code - helps me learn! 💯💯
Thanks Jeremy! Glad to help!
keep up the freaking great work Cole!
Thank you, I appreciate it!
Unfortunately had to give up due to OpenAI rate limit issues. Tried using the time library, reducing chunk size, a new API key - still getting rate limit errors on Tier 1 usage. I might come back to it after going Tier 2 to see if it helps. First time forking a repo so it could totally be something I'm doing wrong, but I don't see what! Awesome video though man, I do truly believe this to be the way of the future.
Thank you for the kind words! Sorry you are hitting rate limit issues though. My suggestion would be to try and use OpenRouter instead of OpenAI for the LLM. Their rate limits are much more generous right out the gate. You'd still have to use OpenAI for the embedding model though, OpenRouter doesn't have those.
Awesome, I'll give that a go - hadn't thought of trying OpenRouter
Sounds great! Yeah OpenRouter is fantastic
Great video. That is exactly what I am looking for - binding metadata with vectors for multi-round and precise retrievals. Thanks.
Thank you, you are welcome!
thank you Cole, you're the best
You bet, I appreciate it!
Great tutorial.
About the point of using Supabase as a single place for both embeddings and structured data, compared to two different DBs: what do you think about payloads in Qdrant, or metadata in other vector databases? Would you get the same speed and ability to filter based on structured data that way?
Thank you and great question! I really like working with SQL databases for structured data over everything else, it's the most robust + easy to work with, and also LLMs know SQL very well so you can even make them write the queries for more dynamic data access.
Yet another great video! thanks man
Glad you enjoyed it! You bet!
Have you looked into using MCP servers/clients? It seems to be where all of this is going, and it has a ton of industry support. Essentially (from what I've read so far) it's a modular approach to everything shown here, but with way more flexibility and ease of setup (just a few lines of JSON config text for each type of worker).
Yes I have started looking into MCP! And I am thinking about how I can take a lot of what I've been working on and leverage it with the protocol instead of what I've been doing a lot with HTTP endpoints.
Thank you! Awesome stuff. Just went through the entire video. Now, I'll take my time to implement it and understand each piece in detail. You are awesome!!!!
You are so welcome, thank you very much!
Very great explanation mate! Big thanks!
Thank you! You bet!
Yes! I've had successes, and I've had a lot of pitfalls. I think I've had more unsuccessful runs than successful ones, but I'm getting there.
Cole, what's the recommended way to deploy it in a real live environment? Especially AWS Lambdas?
Great question! There are really two parts to this workflow:
1. The script that creates/updates the knowledgebase you'll want to run on a scheduled basis
2. The AI agent that leverages agentic RAG
I'd publish the knowledgebase script to something like a serverless function and set it on a schedule to run every hour/every day, depending on how often the documentation/site updates that you are scraping.
For the AI agent, you could turn this into a serverless function or have it sitting as an API endpoint on a cloud machine through something like DigitalOcean.
Do you recommend any RAG frameworks for those of us who don't want to build and maintain our own RAG? Like RAGFlow etc.?
Great question! RAGFlow is good from what I've heard, but I haven't used it myself. LightRAG is another great one, still requires coding but it takes care of a lot under the hood for you!
github.com/HKUDS/LightRAG
I watched both lessons step by step, and I got stuck at the moment where you start communicating with Streamlit. In other words, I don't understand how you set up the frontend to make everything work. We're messing with scripts and so on, and boom, we have a web interface where we ask questions. How do you do it?
Sorry that part is confusing, I get it! I cover building the Streamlit app quickly at the end of the video and mention that all the code is in the repo to check out yourself if you are curious. I use the Streamlit app prematurely (before I show the code for it), just so it's easier to demo the agent as we are building it.
@ColeMedin Great stuff. I've been thinking for a long time about how to give ChatGPT access to certain knowledge. It turns out it's called RAG. And all the other tools are just great. As a result, I got a lot of files that look like a complete mess. I will be looking for a video on how to structure all this. I made an agent that works great with yandex_market_api; comparing scripts written simply by ChatGPT-4o and ChatGPT-4o mini with Supabase, Supabase wins.
Awesome content, it’s so helpful in AI learning curve! Thanks a lot!
Thank you very much! You bet!
Great video and explanations. I have subscribed to the channel. Have a question though: is there a way to use Ollama instead of OpenAI for this, as I don't have an OpenAI API account? If there is, can you suggest how, or point me to one of your videos which has done it?
Does anybody know how to use this with Ollama?
Thank you very much! For using Ollama, they have OpenAI API compatibility so it's really easy to adjust this code to work with that!
ollama.com/blog/openai-compatibility
superb content, fantastic presentation... if only all of YouTube was like this...
Thank you very much!
Dude, you speak so well. Thank you for all your videos. Hey, I think you missed the most important part here though, when you mentioned some mathematics to do the matching. That's the heart of it all, and it's typically done by clustering or graphing, and even the graph is intuitively clusters. A bit of discussion on where that cluster topology is stored (i.e. is it stored as additional fields in the database?) would be great; if you do cover that later in the video, my apologies. Intuitively this serves the effect of an index but is much different, FAISS being a graph approach and many others being a cluster / k-means centroid approach.
Thanks Ken, I appreciate it a lot! You're totally right that focusing on what actually goes on under the hood with RAG is super important. And I do want to create more content diving into that later on! You're right I don't dive into it in this video though, mostly because there is already so much content here I don't want to overwhelm someone in one video.
great video and channel! Thanks! Another very entertaining way to see classic RAG fail spectacularly is legal RAG. My first RAG use case was feeding a large legal doc to a vector DB and the result was abysmal... I started to question the whole concept of RAG, but of course, the issue was in front of the computer. I am now experimenting with different approaches; obviously web scraping won't help here, but I think I will need to chunk up documents in different ways and have different DBs to pull information from... Even for one doc, just one vector DB won't do. But thanks, your video just reassured me, and Pydantic AI bundled with your tool on studio.ottomator makes me optimistic to get this right one day :-)
Thank you very much! Sounds like a tough use case and I like where your head is at with it!
I’m also working on a project with legal docs. But I’m a newbie so almost everything is over my head.
Awesome. Does n8n support the agentic RAG approach?
Thank you and yes it does! Essentially for agentic RAG in n8n you can include the usual "vector store retrieval" tool for your agent and also define other tools like I did in this video to explore the knowledge in other ways.
@@ColeMedin
Jumping in here - if I'm not very experienced with coding (though not completely foreign to it either) - would it make more sense to learn pydantic or n8n, if I want to start building agents (including complex ones)? (also considering Relevance AI and Crew AI, if you can comment on those)
BTW - I think a video covering which tool to learn could be a huge success, if you feel you have the right knowledge for this.
Great question! If you're looking to build more complex agents I would certainly recommend starting to learn Pydantic AI. n8n is still great, especially for prototyping, but coding your own agents is more robust in the end. I haven't used Relevance AI or Crew AI much, but for a reason: generally they do way too much for you, which can be nice but also takes away a lot of your power to customize.
@@ColeMedin Thanks for the reply. If I may ask a small follow-up: is Pydantic AI beginner-friendly? How much longer would it take to create a fairly basic agent with it compared to something like n8n?
Question: for the second question example, how is it determined that an 'agentic AI' secondary lookup is necessary / that the initial response is not adequate? Is a secondary query always being made (or is this a 'black box')? This important tidbit seems to have been glossed over, and I'm trying to understand how this 'decision' is made. Also, as we know, follow-up questions / context can be difficult with RAG: knowing whether a new question is being asked versus a request for information related to the previous query(ies). Does Streamlit facilitate this process (you mentioned it's historical / session based)? I'm not sure if Streamlit is just for the UI. Thanks for the great videos.
Fantastic questions! And totally fair that I did gloss over this and it would have been beneficial to dive into it more.
Firstly, the secondary query is not always made. Tested this myself. The agent does know when RAG is "enough", and the system prompt is where I tell it how to make that determination. Basically I just say "if the knowledge returned from the RAG lookup is relevant to the user's question and you feel confident using it to answer, then stick with that and don't perform the secondary lookup. Otherwise, continue to the secondary lookup with the other tools available."
This could probably be improved a lot and I could certainly be more specific to help it make that decision better! But it was a good starting point and I wanted to keep things simple.
For your last question - Streamlit has a concept of "state" for the app, which I do use to store the conversation history. So if a document chunk is returned from RAG, that is included in the conversation history so the agent can leverage it a second time to answer another related question without having to perform RAG again!
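For anyone curious, the session-state mechanics look roughly like this - `run_agent` stands in for the actual agent call:

```python
import streamlit as st

# st.session_state survives Streamlit's script reruns for one browser tab,
# so the history (including retrieved chunks) carries across questions.
if 'messages' not in st.session_state:
    st.session_state.messages = []

for message in st.session_state.messages:
    with st.chat_message(message['role']):
        st.markdown(message['content'])

if prompt := st.chat_input('Ask about the docs'):
    st.session_state.messages.append({'role': 'user', 'content': prompt})
    answer = run_agent(prompt, st.session_state.messages)  # placeholder
    st.session_state.messages.append({'role': 'assistant', 'content': answer})
```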
@@ColeMedin Thanks for the reply and clarifications. Appreciate your content.
Thank you! You bet!
This is very beautiful, but the question I would like to ask is: what if the pages are dynamic, i.e. the content of the pages can change as time goes on? Then the database will contain old information which is no longer accurate. How can one solve this problem?
Thanks and good question! I would take my example and turn it into something you can run on a regular basis (like once a day), clear out the old knowledge, and rescrape and insert up to date knowledge for the site.
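A sketch of that refresh job, assuming the table and metadata layout from the video (`site_pages` with a `source` field) and with `crawl_and_insert` standing in for the existing crawl + embed pipeline:

```python
import os

from supabase import create_client

supabase = create_client(os.environ['SUPABASE_URL'],
                         os.environ['SUPABASE_SERVICE_KEY'])

def refresh_site(source: str) -> None:
    """Drop every chunk for one site, then rescrape and reinsert it."""
    supabase.table('site_pages').delete().eq(
        'metadata->>source', source).execute()
    crawl_and_insert(source)  # your crawl + chunk + embed + insert pipeline

# Kick this off from cron or any scheduler, e.g. daily at 3am:
#   0 3 * * *  python refresh.py
refresh_site('pydantic_ai_docs')
```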
Great content. I have a question about the base of that whole architecture's query pipeline: the quality of parsing complex PDFs. How good is Pydantic's parsing? Because no matter what agents and tools you have, if your document is parsed badly, the whole architecture falls down. What do you think?
Thank you! For parsing PDFs, I would create a custom solution that you would bake into this process. You wouldn't use Pydantic to parse the PDFs, you would use some PDF library like:
pypdf.readthedocs.io/en/stable/
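A minimal pypdf sketch of that first conversion step:

```python
from pypdf import PdfReader

def pdf_to_text(path: str) -> str:
    """Extract raw text from every page so it can be chunked and embedded."""
    reader = PdfReader(path)
    return '\n\n'.join(page.extract_text() or '' for page in reader.pages)

text = pdf_to_text('manual.pdf')
# From here the flow matches the website crawl: chunk the text, generate
# titles/summaries, embed, and insert into the database.
```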
Great... is it possible to parse with LlamaParse and create agents with Pydantic AI?
Yes for sure!
Great video. Do you use Supabase for auth and edge functions too?
Thank you! I have used Supabase for auth a lot in the past, mostly using Auth0 right now though just to have something more universal. Haven't used edge functions much but I know they are great!
Just wanted to let you know that this is a solid project. Thanks for sharing and inspiring us to learn agentic AI. At last YouTube is doing me good.
And I was trying to implement this to create a chatbot for our college website, and my agent replies like ....
Here are the departments in MSEC:
{"tool_calls":[{"id":"list_departments_msec","type":"function","function":{"name":"list_departments_msec"},"parameters":{"}}]}
Note: I used the tool "list_departments_msec" which is not provided by you. Please provide the real department list.
I guess it's because of me changing the system prompt..
Thank you very much! Which model are you using for this? This kind of response looks like what I typically see with smaller LLMs that don't handle tool calling correctly all the time. I'd try with a different/bigger model!
Another banger! Would you say agentic RAG or KAG is better?
Thank you! I need to look into KAG more, something on my list to research. But I'm not super sold on the idea at this point so right now I'm sticking with agentic RAG.
Great content, I have been binge watching your channel over the last few days. Very informative. As a developer this has become a go to channel for AI learning.
Glad to hear it - thank you very much!
You're a god. Thank you for this.
You bet!!
Thanks for the amazing work. I have a challenging use case: using RAG in another language, and I need it to be as accurate as possible. Any suggestions?
Thanks
You are so welcome! Seems like you need some larger scale advice, probably would be worth posting in our community! thinktank.ottomator.ai
Why didn't basic RAG retrieve the whole weather example? When you split chunks, the function should've accounted for keeping whole code blocks, and I think the whole example should have been in one chunk - easy to retrieve and spit out using just basic RAG. I know agentic is a lot better, but optimizing basic RAG would help agentic too, I guess, since it uses the same splitting as a building block.
Fantastic question! Honestly I was wondering that myself. I checked the database and confirmed that the weather agent code block is maintained in a single chunk, and that the retrieval isn't grabbing that chunk so it isn't like the LLM is ignoring it.
Hard to say exactly what the problem is, and if I optimized my setup (different chunking, better RAG with query expansion, etc.) I'm sure I could eventually get to the point where it could pull the full example. Agentic RAG is just one solution to make it more robust but certainly the easiest in my mind!
Thank you for replying to my previous question. I have another one if you don't mind. Why are you not preprocessing the scraped data before creating embeddings - normalizing it, removing irrelevant data and noise, and structuring the data in a more suitable format? From what I have seen, although good, the markdown provided by Crawl4AI could use some further sanitation. This is a legitimate question and in no way meant to challenge how you are building; I am genuinely curious to know if preprocessing could be beneficial. Thank you again.
Have you used any other vector databases? Any pref? e.g. Supabase over Pinecone, LanceDB and many others?
Good question! The primary three I've used more than just a bit are Supabase, Qdrant, and Pinecone. Supabase is my general recommendation. It isn't as fast as Qdrant or Pinecone so not the best if you really need speed, but it's as powerful and I love having my SQL DB and RAG on the same platform. Qdrant is open source so I'd generally recommend it over Pinecone since you can host it yourself for free!
Wonder if agentic approach would still be valid with these new reasoning models that surfaced. Thoughts?
Actually I think reasoning models will make agentic RAG even better! Using something like R1 to reason about what to search in the vector DB is something I am looking into.
Way excellent tutorial, thanks a bunch.
Great video, thanks. Did you ever test solutions like fusing (RRF) BM25 results with cosine similarity for precision?
Thank you, you bet! I haven't tested this yet but I would be very curious to do so!
Can I now create a Vue 3 and Nuxt 3 agent with this and combine it somehow with Cursor, so that Cursor uses Claude Sonnet and the agent to code my requirements?
GREAT question! You certainly can by putting this agent behind an OpenAI compatible API endpoint and setting that up in Cursor. Something I am going to explore more soon and probably create a video on!
@@ColeMedin I see, but I am not able to create something like this. But think of the possibilities if this works - amazing!
Really looking forward to a video like this. Let me know if I can help..
You are going as a 🚀
Haha I appreciate it! :D
Forgive my ignorance. Your channel is great and this video is fire. It's got me wondering, is this how Perplexity functions?
Thanks man! I think something like this is a part of the Perplexity platform, but mostly it's a web search engine powered by AI not a RAG solution. Would be difficult to ingest the entire internet into a knowledgebase! haha
Amazing video. Thank you so much.
Thank you, of course!
Please 😢, how do I retrieve from the vector database in Supabase? It always fails: the Supabase node succeeds, but the AI Agent doesn't know how to answer.
Hmmm... have you checked to see what is retrieved from RAG? Maybe the wrong context is being fetched which is why it seems the AI Agent doesn't know how to answer?
Great video! Just a small (maybe inconsequential?) query - any specific reason you used Supabase and not other modules for the vector DB? (Like FAISS)
Edit: got my answer when you started the agentic RAG part lmao - we can't directly store metadata etc. on FAISS, so it makes sense why you selected something like Supabase
Your desk setup looks so good. Would be great if you made a video about it - or if not, did you buy the desk somewhere or build/customise it yourself? And if you built it, how? :D
I hate to break it to you but the background is not real! I generated it with AI and then I use a tool called Nvidia Broadcast to put it as my background without even having to have a green screen.
You are a star ⭐
This is fantastic!
Thank you! :D
Thank you, excellent presentation!
You bet - thank you very much!
Hey, if I have around 20-30 docs of websites like these, how should I store them? Should I store them in a single table, or break it down into multiple tables?
Great question! I'd recommend sticking to one table for simplicity, and then setting a metadata field for the website the record is from. Very similar to what I do for the "source" metadata field in the video.
Then when you query the knowledgebase and you only want to query from one website, you just include that metadata filter in your query!
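A rough sketch of what that filtered query could look like - the match_site_pages RPC name and parameters here are assumptions modeled on this kind of setup, not copied from the video:

```python
# Query one website's docs from a single shared table by filtering
# on a "source" metadata field via a pgvector similarity RPC.
from openai import OpenAI
from supabase import create_client

supabase = create_client("https://YOUR_PROJECT.supabase.co", "YOUR_SERVICE_KEY")
openai_client = OpenAI()

def search_one_site(query: str, site: str, match_count: int = 5):
    embedding = openai_client.embeddings.create(
        model="text-embedding-3-small", input=query
    ).data[0].embedding
    return supabase.rpc("match_site_pages", {
        "query_embedding": embedding,
        "match_count": match_count,
        "filter": {"source": site},  # e.g. {"source": "site_a"}
    }).execute().data
```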
Great video!! Just one question: can you go over how the matching of embeddings works? I guess I didn't understand what embeddings are, and thus didn't understand how the searching for "relevant" docs works. Any video I missed where you discussed this in detail?
Thank you! There are a lot of great resources explaining embeddings on YouTube! I don't have a video dedicated to the topic, but here is one I vetted myself that explains it very well:
ruclips.net/video/dN0lsF2cvm4/видео.html
This is amazing, thank you so much for your work and for teaching us! much appreciated it!
Can you also set this up with DeepSeek R1, or do you need OpenAI embedding capabilities for the DB?
You are so welcome! You can certainly use R1 through DeepSeek or OpenRouter for the LLM! You'd just have to keep OpenAI for the embedding model or use a different one (like a local embedding model through Ollama).
You are the man.
Can we use other LLMs instead of the OpenAI ones??
Like Llama, etc.
Yeah you certainly can! In fact Ollama is OpenAI API compatible so you really wouldn't have to change much here! And Pydantic AI supports Ollama.
ollama.com/blog/openai-compatibility
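For reference, a minimal sketch of pointing the OpenAI client at a local Ollama server per that blog post - the model name is just whatever you've pulled locally:

```python
# Use the OpenAI Python client against Ollama's OpenAI-compatible endpoint.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's local endpoint
    api_key="ollama",                      # required by the client but unused
)

response = client.chat.completions.create(
    model="llama3.1",  # any model you've pulled with `ollama pull`
    messages=[{"role": "user", "content": "Hello from Ollama!"}],
)
print(response.choices[0].message.content)
```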
@ColeMedin ohh thx for such quick response!!
You are welcome! :)
Wouldn't it be easier to use another agent to create the summaries/titles? I saw you went directly to OpenAI... by using an agent we could save some tokens running Ollama.
Yes you certainly could! I was just looking to keep it simple but this would be an even better approach!
How would you setup crawl4ai hosted so you can interact with it through webhooks or http requests in n8n workflows?
Great question! And your head is certainly in the right place for how to leverage this in n8n!
I would use FastAPI to create a Python endpoint around whatever Crawl4AI logic you want. The "payload" for the API could be a specific page you want to crawl or a list of pages. Then the API can either return the contents of the crawled page(s) or just put the knowledge in a database like I do and then return a success code.
For hosting this endpoint, I'd recommend using DigitalOcean. I'll probably be making a video on this soon! Lot of people want to use this with n8n.
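Roughly what that could look like - the endpoint path and payload shape here are assumptions, a sketch rather than a finished implementation:

```python
# Wrap Crawl4AI in a FastAPI endpoint that n8n can hit with an HTTP node.
from crawl4ai import AsyncWebCrawler
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class CrawlRequest(BaseModel):
    urls: list[str]  # one or more pages to crawl

@app.post("/crawl")
async def crawl(req: CrawlRequest):
    pages = {}
    async with AsyncWebCrawler() as crawler:
        for url in req.urls:
            result = await crawler.arun(url=url)
            pages[url] = result.markdown  # return the scraped markdown
    return {"status": "success", "pages": pages}

# Run with: uvicorn main:app --host 0.0.0.0 --port 8000
```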
+1
subbed! great videos!
Q: How can vector search find similarity to questions? Meaning: if my website is built like a Q&A, the question may sit right beside the answer (same chunk?). And while we know LLMs are trained to answer questions, in RAG it is just a vector similarity match. I did not see you handle that, and yet it still looks like it works pretty well in your demo. I heard a podcast that taught a nice trick: first have the LLM try to answer the question (even if the answer isn't good or up to date enough). Then take that answer, and because it is a declarative sentence rather than a question, there is a higher probability of finding a match to it in the vector DB. What do you think?
Good question! Vector search, in simple terms, is all about keyword matching. That's why a question can still be matched to chunks even though it isn't a declarative sentence - keywords like "tool calling" or "weather agent example" are still in the question. But the idea of forming it into a declarative sentence before retrieval is a good one too! For a lot of use cases I bet that'll help with accuracy a lot.
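That trick is often called HyDE (hypothetical document embeddings). A rough sketch, with the model names as assumptions:

```python
# Have the LLM draft a hypothetical answer and embed THAT for retrieval,
# since a declarative answer tends to sit closer in vector space to the
# stored chunks than the raw question does.
from openai import OpenAI

client = OpenAI()

def hyde_embedding(question: str) -> list[float]:
    draft = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": f"Write a short, plausible answer to: {question}",
        }],
    ).choices[0].message.content
    return client.embeddings.create(
        model="text-embedding-3-small", input=draft
    ).data[0].embedding
```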
Great video and nice approach.
Any specific reason to choose a chunk_size of 5000 (roughly 1250 tokens) when the generally recommended chunk size is up to 400 tokens?
I just got this working on my home laptop. Amazing! I can see how this might chew a lot of API tokens...
Here's a hard question that the usual LLMs (GPT-4o, DeepSeek, etc.) get wrong:
"Can you explain how Pydantic AI implements its custom validation mechanisms for complex nested models, and what are the performance implications of using these validations in large-scale applications?"
now... how do I rerun this to my docs?
Glad you got it up and running! The title and summary creation for every chunk will certainly take a lot of tokens, luckily though it's a simple task so you can use very cheap LLMs to get the job done.
That's a good question! Could you clarify what you mean by rerun to your docs?
I have a collection of PDFs. Presumably, I need to reliably convert them to markdown, and then let the LLM do the chunking and bookkeeping, similar to what you did with the website crawl.
Hi Cole, I’ve been following your videos on Agentic AI and RAG, and they’ve been incredibly insightful! I’ve successfully built an AI assistant with Agentic RAG based on your guidance, and it’s working great. However, I want to ensure that the assistant only replies to queries related to my website, and any other queries are considered outside the scope. Could you share any tips or best practices to achieve this? Your expertise would be a huge help. Thanks for the amazing content you share!
Thanks for the kind words and that's super cool you built an agent for yourself based on this! Nice work!
Great question too. The main way to limit your agent to focus on just what you made it for is to tell it that in the system prompt. Something like "You are an expert at the Pydantic AI documentation and only answer questions and talk about that. If the user asks about or talks about something out of scope, direct them back to talking about Pydantic AI and say you can't discuss other topics."
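In Pydantic AI that system prompt goes right on the agent. A minimal sketch - the model string is just an example:

```python
# Scope the agent to one topic via its system prompt.
from pydantic_ai import Agent

agent = Agent(
    "openai:gpt-4o-mini",
    system_prompt=(
        "You are an expert on the Pydantic AI documentation and ONLY answer "
        "questions about it. If the user asks about anything else, say you "
        "can't discuss other topics and steer them back to Pydantic AI."
    ),
)

result = agent.run_sync("What's the weather like today?")
print(result.data)  # should decline and redirect to Pydantic AI topics
```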
I have a request: I would appreciate it if you could show how we can use some of the free alternatives to the OpenAI API (like local Ollama or Hugging Face models), and also if you could make tutorials that involve typing the code step by step in real time, since seeing entire chunks of code at once can be pretty heavy on the eyes.
Great suggestions, I appreciate it! I'll certainly be doing more local AI in the near future for this stuff!
I would enrich it with a twin knowledge base for the LangGraph docs and we would be set up to have the best AI agent assistant! How would you do it?
YES, LangGraph and Pydantic AI are an incredible combo!
This would fit very well into agentic RAG - we can ingest the LangGraph documentation just like we did with Pydantic AI using Crawl4AI. They have a sitemap.xml as well:
langchain-ai.github.io/langgraph/sitemap.xml
What we can do is set the metadata field "source" to be "pydantic_ai" for the Pydantic AI docs and "langgraph" for the LangGraph docs. Then we can create separate RAG tools for our agent that will search specifically through each of the docs in the knowledgebase using the metadata to filter.
That way the agent won't get confused between the frameworks but can still search through both to combine them together to create agents on our behalf leveraging both.
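A sketch of what those per-source tools could look like - the RPC name, the get_embedding helper, and the deps wiring are hypothetical, modeled on this kind of setup:

```python
# One retrieval tool per documentation source, each filtering the shared
# table on the "source" metadata field so the agent can't mix them up.
from pydantic_ai import Agent, RunContext

agent = Agent("openai:gpt-4o-mini")

async def _search(ctx: RunContext, query: str, source: str):
    embedding = await get_embedding(query)  # hypothetical embedding helper
    return ctx.deps.supabase.rpc("match_site_pages", {
        "query_embedding": embedding,
        "match_count": 5,
        "filter": {"source": source},
    }).execute().data

@agent.tool
async def search_pydantic_ai_docs(ctx: RunContext, query: str):
    """Search only the Pydantic AI documentation."""
    return await _search(ctx, query, "pydantic_ai")

@agent.tool
async def search_langgraph_docs(ctx: RunContext, query: str):
    """Search only the LangGraph documentation."""
    return await _search(ctx, query, "langgraph")
```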
@ was not expecting a tutorial already! Thanks for the quick advice! I’ll work on it
Why would you not make embeddings for all your data and just use one vector database? Is there value in having some data structured and some as embeddings?
Great question! Having more options to access the data like this almost always gets better results than just naive RAG. It allows the LLM to reason more about the knowledge it wants to retrieve. Basic RAG is pretty limiting because the agent can't make many decisions about the information it is getting.
Hey Cole, great vid! Do you help businesses setup agentic RAG systems?
Thank you! I don't offer consulting at this time but I am working on a platform to connect developers to business owners. Also feel free to post in our community of developers if you are looking for someone! thinktank.ottomator.ai
Is there a way to take that data (crawled thanks to Crawl4AI) and easily feed it to a ChatGPT agent I created? (I am a no-coder, that's why this use case is interesting for me)
Did you create a GPT Assistant? Is that what you mean by Chat GPT agent? If you follow this video to create a knowledgebase in Supabase using what you scrape with Crawl4AI, you could create a custom tool for your OpenAI assistant to query that knowledgebase!
It didn't work out for me. I just updated the code to use a Gemini API key. All went well, but at the end when it came to building a UI, it crashed during the second question. It works for one question at a time related to the doc.
Hmmm... sounds like something is off with the way the conversation history is stored/retrieved if it crashes on the second message. What is the error you get?
Can you update the code to use Ollama models?
Ollama is OpenAI API compatible so it's pretty easy to switch to that instead of GPT! Main thing is just changing the base URL in the OpenAI client to point to Ollama. They have docs covering this:
ollama.com/blog/openai-compatibility
Man, this is really good. Congrats, and thank you for the knowledge!
Agentic is like a cheat code
What about a local PostgreSQL database and a local LLM?
You can certainly tweak this solution to use both! For example you could host Supabase locally (for Postgres) and run an LLM through Ollama. For the LLM you'd just have to change the "base URL" for the OpenAI client to point to Ollama:
ollama.com/blog/openai-compatibility
And for the Pydantic AI agent, Pydantic AI supports Ollama:
ai.pydantic.dev/api/models/ollama/
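Putting those together, a minimal sketch of the agent side based on the Ollama model docs linked above - the model name and local defaults are assumptions:

```python
# Run a Pydantic AI agent against a local Ollama model.
from pydantic_ai import Agent
from pydantic_ai.models.ollama import OllamaModel

model = OllamaModel(model_name="llama3.1")  # assumes Ollama on localhost:11434
agent = Agent(model, system_prompt="You are a helpful documentation expert.")

result = agent.run_sync("What is agentic RAG?")
print(result.data)
```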
Is there any way to do a tutorial on this for someone with zero coding experience?
I'm putting out a guide on doing something similar with n8n soon!
Hi Cole, I'd like to suggest that you reconsider enabling automatic translations on your videos. For many of us, like me, they make your content much more accessible and easier to follow, especially when we need to pay attention both to what you explain and to what you show on screen. When I checked with other English-speaking content creators, they confirmed that disabling them is a personal decision, but doing so can give the impression that the SPANISH-speaking audience isn't valued as much.
I really appreciate your work and I hope you take this as constructive criticism and a positive push toward change.
YouTube has changed some things under the hood so I wasn't aware I lost this. I have automatic dubbing in some languages through YouTube but not others. I will have to look into it!
@@ColeMedin Thanks Cole, I love your material
Very interesting video, thanks! I have a little question for you: why do you give the URL of the page to the LLM instead of just using the markdown text obtained from the initial crawl? Is there an advantage to doing so? I guess it's to avoid the need to constantly update your markdown information, since the URLs will always have the latest information and can be "re-crawled" if need be, but I was curious to understand if there were other elements in your thinking. Thanks! :)
Thank you and great question! So when I give the URLs to the agent, it doesn't actually use the URL to visit the site in realtime. It just uses the markdown I have stored in the database. It simply uses the URL to determine if the content is relevant to the user's question - sort of like a title, but I was thinking URLs give extra context with the path. It speaks to how the page relates to the rest of the documentation, if that makes sense.
But also your thinking is spot on that we could have the agent pull the latest information in realtime with the URL if we wanted!
@@ColeMedin Thank you for your quick and complete reply! I understand better now and really appreciate! It does indeed make sense to have the full URL to have the extra context from the path (I didn't even think of it that way)! Have a nice weekend!
You bet! You have a great weekend too!
Cool.
Is there a JavaScript equivalent to Pydantic AI ?
Thanks! There is not exactly, for JS I'd recommend using LangChain JS.
@@ColeMedin 👍
Awesome tutorial! 👏 Quick question: is Qdrant's faster speed for semantic search a big enough benefit to maybe introduce a hybrid model where you use both - with Supabase holding a reference column to a Qdrant vector store that handles the vector search?
Thank you and good question! Though I might need a bit of clarification. In my mind I can't really see how a specific column in Supabase would point to a Qdrant vector store. If you have multiple Qdrant vector stores to perform RAG with, I would just set those up as separate tools for the agent right in the code instead of making the agent go to Supabase to first find the Qdrant vector store to use.
I suppose though that if you really do have dozens of Qdrant vector stores for some reason, it would be more scalable to maintain that list in Supabase instead of having it hardcoded in your script!
@@ColeMedin I was envisioning storing the primary key of the Supabase record in a Qdrant database as part of its metadata. The Qdrant record would then store the vectors. It would function similar to a lookup table, except the vector search portion would run against the Qdrant database instead of the Supabase one, and the final search would be combined into one result. Is this feasible, I am wondering?
Ah yes I gotcha - yes this kind of thing is certainly feasible!
Couldn't this also be solved using graph RAG? Great videos! Thanks!
Yeah graph RAG is another good solution though I find it more complex than agentic RAG! I do want to cover it in future videos though!
Question: I wish to use Llama instead of OpenAI. I have Llama 3.1 on my local system through Ollama. Can you guide me?
Good man !
Please make the audio channel selection (automatic YouTube dubbing) available for this and other videos 🙏🙏🙏
I do have it on, but YouTube by default only does some languages. I'll look into it!
This is so helpful, Cole. Waiting for you to cover the RAG agent with SQL queries.
Glad you found it helpful! Yes, I'm planning on doing that soon and it'll be very similar to this. In fact, this already is essentially a RAG agent with SQL queries!
What I was suggesting is an agent having a tool that searches user’s question in a relational database and gives answer. This would mean the agent/tool will need to convert the question into a sql query to fetch the relevant data and feed it to LLM. This is required for most B2B use cases where data is stored in tables.
Gotcha! Yeah I did this exactly for a client once, definitely going to share a version of that implementation on my channel!
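For anyone curious, a rough sketch of that text-to-SQL tool idea - the table schema and the deps.db handle are hypothetical, and you'd want real guardrails (a read-only database role, allow-listed tables) before running generated SQL:

```python
# The agent turns the user's question into a SQL query via this tool,
# runs it, and the rows are fed back to the LLM to compose the answer.
from pydantic_ai import Agent, RunContext

agent = Agent(
    "openai:gpt-4o-mini",
    system_prompt="Answer questions using the orders database when needed.",
)

@agent.tool
async def query_orders(ctx: RunContext, sql: str) -> list[dict]:
    """Run a read-only SQL query against the orders table.
    Schema: orders(id, customer, total, created_at)."""
    assert sql.lstrip().lower().startswith("select"), "SELECT-only queries"
    return ctx.deps.db.execute(sql)  # hypothetical read-only DB handle
```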
Waiting for the same from Argentina, thanks!
Hello @ColeMedin, I used this code of yours but with a Gemini model instead of OpenAI, and made the necessary changes. When I am crawling and adding to the database, everything works correctly except I get an error: Error getting title and summary: 429 Resource has been exhausted (e.g. check quota). It might be hitting some rate limit. Could you please tell me how to resolve this? Thanks a lot in advance.
Yeah you're probably hitting the LLM too frequently - that's what a 429 error typically means. I would add a delay between each call to the LLM using the time library in Python. And probably reduce the batch size for the web scraping as well.
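Something like this minimal sketch - the delay values are arbitrary, so tune them to your provider's actual quota:

```python
# Wait between LLM calls and back off exponentially when a 429 hits.
import time

def call_with_backoff(fn, max_retries: int = 5, base_delay: float = 2.0):
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception as e:
            if "429" not in str(e):
                raise  # only retry on rate-limit errors
            wait = base_delay * (2 ** attempt)  # exponential backoff
            print(f"Rate limited, retrying in {wait:.0f}s...")
            time.sleep(wait)
    raise RuntimeError("Still rate limited after retries")

# Usage: wrap each title/summary LLM call with call_with_backoff, and add
# a small time.sleep(1) between calls in the crawl loop to stay under quota.
```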
Hi Cole, I'm halfway through this video and I'm curious - could this be done in n8n, and if so, how?
Great question! Unfortunately not out of the gate since Crawl4AI is not one of the Python packages in n8n and there isn't a Crawl4AI node. But what you can do is turn a Crawl4AI implementation into an API endpoint that you call with the HTTP node in n8n! I might actually be making a video on this soon ;)
@@ColeMedin Epic mate thanks for getting back to me and I look forward to the future video.
You bet - thank you!
Thank you so much, you gave me answers to questions I wasn't even able to formulate or ask.