Just 3 days ago I had an idea to build an AI project to help me summarize 4 books a week before school starts. 2 days ago I started researching libraries and methods to get concise data without hallucinations. I was progressing, but not a single video was up to date or taught what I needed. It started stressing me out. Then I found your channel. Just when I needed it, you uploaded exactly what I searched for. I don't have the words to describe what a change you were for me, and because of this video alone I want to keep tinkering with LLMs. You are a legend who changed a 16 year old's mind about developing with AI and just gained a new Patreon supporter, and I've never paid for Patreon before. I look forward to watching your other videos.
Have a nice learning path
You should still read the books. Otherwise, the degree you earn will be about as worthless as the paper it's printed on. You're also not learning how to do anything other than mimic what others do. Don't settle for being a little man who stands on the shoulders of the greater men who came before you.
You are a legend. Love your Ollama series. Keep up the great work!
Hi Matt. I always upvote your videos. Two possible topics come to mind for future course deep dives: 1. a comparison benchmark (even if qualitative) among different sizes of the same model, different quantizations, and different context window sizes, explaining the hardware resource trade-offs. 2. using various RAG techniques to memorize chat conversations for a sort of "long-term memory" (that's indeed a very general use case). Just ideas. Thanks for it all.
Spot on. Thanks. I like the speed and depth of the videos. This one in particular is very helpful, as it shows everything from embedding, database setup, and preparation of the chunks through to the final execution on the CLI. I would use it exactly like this. Still, to make it available for colleagues as well, there is a need for a front end, which is Open WebUI in my case.
Anyway, I learned so much and have tons of questions that I would not have dared to raise without this input. Whether you use a frontend or the CLI, it helps to follow the technology side of it.
Thanks for boiling this down to the main components and how they can come together to make a solution. It's a great foundation to start with; many of the videos I have seen have focused on all kinds of different tooling, and it's hard to know which aspects are essential.
Thank you for providing a playlist to learn this. Very helpful how you explain things. Always looking forward to your training series. Keep up the good work, Matt!
You are an amazing teacher. Thank you!
Would love to see further deep dive on this including hybrid keyword/semantic search and reranker for large datasets applied with an LLM via Ollama. Thanks for the great tutorial as always!
Brilliant. I just subscribed. Thank You for your video series.
1- You forgot to drink... It is important to keep hydrated.🧐
2- I prefer ready solutions especially those that give complete choices and options (Hybrid RAG with graph knowledge)💥
Thanks for the good content 🌹
Video Suggestion: Table-Augmented Generation (TAG). TAG is a unified and general-purpose paradigm for answering natural language questions over databases.
Great video. Any chance you can add RAG citation video to your collection? It is valuable to have the RAG output cite where in the document the content was obtained from that was used in the response.
Hmm that’s pretty easy. Sure I can add it to a list
It's such a breeze to find your channel; a lot of influencer noise around these topics makes it challenging to find quality material.
I came in trying to answer a question, which you pointed at the end of the video: how do the pre-made RAG solutions compare to each other, and to the DIY one? Of course, building your own comes with the gift of knowing how things work, and eventually understanding and implementing them better, fixing problems when found, etc.
My perspective, if it helps: the topic has many choice points, and it can easily get overwhelming for someone of humble knowledge. What really helps is to know, from someone such as yourself, why to choose this over that, or what balance to look for. For example, why choose this embedding model?
If EPUB is better than PDF for the task, should we try to convert first to get better text content from PDFs? Would it be helpful to rate the text extraction before deciding to feed it to the model (what I mean is that PDFs have very varying degrees of readability)? I'm just thinking out loud at this point :D
Lastly, thank you and looking forward to the next video.
Great video, as usual. Looking forward to the RAG tools video and maybe some integrations, please
I absolutely love this!
Simplicity is elegance - a prime example :-)
I have yet to jump into coding before building a RAG, so for now I'm using Open WebUI and I'm very satisfied with it 🎉
Thanks, really cool. However, I'm not sure why you included magic when it's really not needed. You are fetching most docs from the web, so you can get the MIME type from the request.
After the launch of Llama 3.2 1B & 3B, this video should skyrocket, and so should Ollama.
What is the process to maintain dynamic data, e.g. a customer list with active balances, where the balances change every day? Would I constantly have to delete and upload new data into the database, or is there a simpler method?
That's pretty simple as is. I can't think of a simpler way.
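One hedged sketch of a middle ground: keep a stable id per record and upsert only what changed, instead of wiping and reloading. The field names here are made up for illustration; with Chroma the output would feed `collection.upsert(ids=ids, documents=docs)`, which overwrites existing entries that share an id.

```python
def build_upsert(customers):
    """Key each record by a stable id so re-running the import replaces
    stale balances in place instead of duplicating them."""
    ids = [f"customer-{c['id']}" for c in customers]
    docs = [
        f"{c['name']} has an active balance of {c['balance']:.2f}"
        for c in customers
    ]
    return ids, docs
```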
Thank you so much for this omg
Can you use a RAG with the new Llama 3.2 in order to have it do facial or person recognition?
A continuation aimed at advanced users: a local vs. global vs. native context comparison, using GraphRAG and Triplex - when to use which one.
Hi, what do you mean by "local vs global vs native context comparison"?
Great intro to RAG dev vid.
Did anyone have problems with the Python scripts? I had to correct some, and requirements.txt didn't have all the necessary packages.
Great video Matt, thanks for the awesome content.
I am a novice with AI, and I sometimes have issues where my RAG uses the LLM's internal knowledge to answer a question, even though I provided the context and told it to use that to answer, just like in your example.
Would you have any suggestions on how to avoid that? Maybe it's something really easy I am missing 😅😅 Thanks
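One common suggestion for this is tightening the prompt so the model has an explicit fallback instead of reaching for its own knowledge. A minimal sketch; the wording is an assumption, not the video's exact prompt:

```python
def grounded_prompt(context, question):
    """Build a prompt that forbids answers from outside the retrieved
    context and gives the model an explicit escape hatch."""
    return (
        "Answer using ONLY the context below. If the answer is not in "
        "the context, reply exactly: I don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```

Smaller models follow this kind of instruction less reliably, so lowering the temperature can also help.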
This was great, but... (There’s always a “but,” isn’t there?)
I’m building a RAG system (using Ollama for embedding and querying) at the moment, and the hard part isn’t the RAG. It’s getting the text in the first place from PDF, MSG (including direct attachments and nested email chains), DOC[X], HTML, etc. Do you have any recommendations for tooling in this arena?
Thank you!
Thank you so much❤
You're welcome 😊
Nice videos, clearly explained in plain English. A couple of questions: why are you using different models to get the RAG and non-RAG responses? And why use Ollama to get the embeddings instead of leveraging the Chroma embedding feature? Thanks for sharing your expertise.
I must have changed one and forgotten the other. No reason.
Please suggest the best open-source model for local embeddings.
There is an error in the Python code: "collectionname" on line 12.
Correction: any(collection.name == collectionname for collection in chromaclient.list_collections())
What should it be? I hit this and just commented it out to get around it.
I am still using Msty and having mixed results. The formatting issue for data is my biggest problem right now.
So I'm testing this out with the older video from April, and I'm running into an issue where the embedded documents match the context, but the answer generated by gemma:2b says "The text provided does not contain any information... ...so I cannot answer this question with the provided context". Does anyone know why this is happening?
Most excellent vid, sir! Can you expand on this by showing how to make RAG perform faster, at 25 tokens per second at least, with several GB or thousands of md files uploaded to it, please?
25 tokens per second seems pretty slow. I get double that on my 3 year old Mac.
next video of the series please
This is excellent. I need to hire a Python coder; can you refer me?
Is there a way for Ollama to do yes or no answers?
Yes. Ask it a yes/no question and tell it to answer that way. Or use structured outputs, as shown in a recent video.
@@technovangelist I'll give it a try. I use LLaVA, and when I ask how many flying saucers it sees in the pictures, I always get random answers.
@@technovangelist it worked. I am getting Yes or No answers.
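For reference, a hedged sketch of the structured-outputs approach mentioned above: recent ollama-python releases accept a JSON schema in the `format` parameter, which constrains the reply to the shape you define. The model name is a placeholder, and calling the function requires a running Ollama server.

```python
import json

# JSON schema that only permits "yes" or "no" as the answer.
yes_no_schema = {
    "type": "object",
    "properties": {"answer": {"type": "string", "enum": ["yes", "no"]}},
    "required": ["answer"],
}

def ask_yes_no(question, model="llama3.2"):
    """Ask a yes/no question with the response constrained by the schema.
    Requires the ollama package and a running Ollama server."""
    import ollama
    resp = ollama.chat(
        model=model,
        messages=[{"role": "user", "content": question}],
        format=yes_no_schema,
    )
    return json.loads(resp["message"]["content"])["answer"]
```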
Timestamp format is off in the description, friend ❤
Dear Matt, could you please make practical videos? I kept watching several videos from you but never got to the point where I get things working. Please mix your videos between lecturing and practical step-by-step guides so we can all benefit.
There is a lot of that on this channel. And there will be more, too.
I agree with @GhassanYousif
Can you please make your videos easier to follow?
What kind of audience are you targeting?
We are new to programming and AI, so a lot of the technical speak doesn't really help us.
If you could show the steps step by step, slower and more clearly, that would really help.
By the way, I read that there's a Llama Stack, would you be doing a simple video on how we can install and use Llama Stack?
The steps you describe are far too complex for me and my experience. 100% of the time, no matter how exactly I try to follow along with these sorts of things, it won't work. It is either slightly out of date, or one step is slightly misunderstood, or whatever, and then comes the inevitable screen full of cryptic gibberish.
So I was wondering if the easy way via Open WebUI, which you mention at the end, is as good. Just add documents, create a custom model, and away we go?
Or is it too easy to be as good?
If you feel better using Open WebUI, great. Ollama is a tool for software developers first, and so understanding how to build a RAG system is one of those core projects everyone should learn.
@@technovangelist I'll give it a try.
When I get bad results, I can't decide if it's because of my chunking or because the data is in French.
But I can't easily find models for embedding French texts.
There are a few French Natural Language Processing (NLP) models on Hugging Face that might work for your needs.
Open WebUi first pls!
I was waiting for the "other shoe to drop" -- "This is a FREE course" and then the scam grift begins. It's always a course with people these days. "I have to make money," he says... You could make a social network and charge a $10 monthly fee to be part of a community or something with more value. Don't go into this as a scammer. I'm older than you and spent many years doing black hat, so I know exactly what I'm talking about...
I don’t want your money for this course. I plan to keep going with this for a while. I may take some sponsorships, but this will always be free. And I’m only mid-50s, so not old at all.
A "thank you" would be a better comment