Follow-up video here for deploying this to the cloud!
ruclips.net/video/259KgP3GbdE/видео.htmlsi=nUt90VMv63iVMQMe
That timing though....Sweeeeet! Thank you!
Awesome video. I'm grateful for the work done.
A few notes for Mac users -
1 - Install Ollama locally and set it up separately. The Docker Compose won't do that for you.
2 - Inside n8n you'll have to change the Ollama connections to point to the instance running bare metal. That means you can't use localhost; you have to use your hostname instead.
3 - Setting up the Google project is a PITA, but follow n8n's directions exactly and it'll work. The last punch in the nards is that you have to make your Google account a tester for the app you set up, and then when you set up the Google Drive connection in n8n you have to connect and grant it permission to access your Google Drive. It's a PITA.
All that said: great work Cole. Keep it coming!
Thank you very much and I appreciate your notes here!
For #2 you can also use "host.docker.internal" to connect to your machine from the container. So Ollama, for example, would be host.docker.internal:11434
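If you want to sanity-check that connection, here's a rough sketch - assuming the default Ollama port, and that your n8n container is literally named "n8n" (check docker ps for the real name):

curl http://localhost:11434/api/tags
docker exec n8n wget -qO- http://host.docker.internal:11434/api/tags

The first confirms Ollama is up on the host; the second confirms the container can reach it. Both should return the same JSON list of pulled models.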
Thank you both very much. Jake, is it running smoothly on a Mac through all these containers?
Love your tutorial, bro! Straight to the point with intuitive, precise instructions.
I'm not a developer, so figuring this out still feels like a big step for me, but you've done an outstanding job here anyway!
Yeah I get it! I'll be continuing to put out content to make it even easier to digest! Thank you though 😃
Cole, you’ve done an outstanding job! Your videos consistently make complex topics clear and easy to follow, which is a rare talent. I appreciate how you break down each concept, making it accessible for viewers at all levels. Your teaching style not only engages but also inspires confidence in learning.
I’m eagerly anticipating your next iteration. It’s always exciting to see how you evolve your content and introduce new ideas. Keep up the fantastic work, and thank you for sharing your knowledge with us!
Thank you very much!! I really appreciate the level of detail in your kind words - it means a lot to me to know that my work is noticed in such a thoughtful way and that I'm hitting my target of breaking down complex topics well!
@@ColeMedin Yes indeed! You explain topics with such a good flow between ideas and concepts that rivals that of other popular tech youtubers such as Networkchuck, Linus Tech Tips and Christian Lempa
Wow that means a lot - thank you!!
I have to agree. Literally just getting started with local AI. Was about to skip past this video and thought, maybe it’s something I can use that I didn’t know existed. BAM! This video is going to be my beginning into what my vision is for my local AI. Really appreciate you made this understandable!
@@Hatch3dLabs The author explains things well and helped me understand more clearly. I am getting started too - what tool are you using? I'm using xark-argo/argo on GitHub; it's simple to install on Mac and Windows. What about you? I'd like to keep in touch for more learning.
man.. just dropping casual double entendres as hole references? that’s an instant sub
@@jordon7999 Haha I appreciate it Jordon! 😂
This is a very good step-by-step tutorial. Following the instructions in this video will get you started with local AI. For people trying this on M1 Macs and above, Ollama must be installed separately; the rest is the same.
Thank you Dinesh, I appreciate it a lot!!
Could you clarify why Ollama needs to be installed separately for M1 and above?
@@ColeMedin If you want to use GPU/NPU-accelerated LLM rather than CPU on Apple Silicon (which doesn't have either AMD or Nvidia GPUs), you'll need the actual Ollama desktop app on your host Mac and pull the models from there rather than using the Ollama container in the docker compose file. That's why in the Quickstart they call for starting the Docker compose stack without a profile - it doesn't even start the Ollama container. Docker will still be able to reach Ollama using the docker internal hostname, but you'll get much faster processing using the Apple Silicon GPU and NPU (what pieces are accelerated depend on what the Ollama team have done with the app over time). It took me a few minutes to figure it out, but once I did it works just fine.
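For anyone following along, the Mac sequence looks roughly like this - a sketch assuming Homebrew (the desktop app download works just as well); the model names are only examples, pull whatever your workflow uses:

brew install ollama
brew services start ollama   # the desktop app starts the server for you instead
ollama pull llama3.1
ollama pull nomic-embed-text   # example embedding model
docker compose up   # no --profile flag, so the Ollama container never starts

Then point the n8n Ollama credentials at http://host.docker.internal:11434 as mentioned above.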
@@scottstillwell3150 Ok, but how does the rest have to be configured? I tried, but the whole n8n workflow seems to be broken.
@@scottstillwell3150 Since I could not open the credentials, I tried to set up new ones. They say they can connect, but I am not able to use the Ollama node in the demo workflow - it can't fetch any models. This is super confusing.
By far the best tutorial and overview on local RAG, and you also drop gems on the little improvements you've made over the original repo. The workflow is amazing too!! One of my ideas: I play some older RPGs from back in the day on the Steam Deck, and with less time than I used to have for other priorities, it's nice to just query the walkthrough docs and ask where to go next, etc.
Thank you very much man, I appreciate it a ton!
And that's a super fun and unique use case - I like it!
15:20 This is truly the most important part of the logic. It's absolutely necessary to have a function to handle contingencies around file duplicates.
Indeed! This part of the workflow definitely took the longest but I wanted to include it because I totally agree it's super important to have.
Thanks
Thank you so much for your support, it means a lot! :D
You can also use Postgres with pgvector instead of Qdrant
Yes definitely!! I love using pgvector so I'm 100% behind you there.
I focused on Qdrant in this video just to show the entire package, but oftentimes using fewer services (so using Postgres both for chat memory and RAG) can be the way to go if you find it works well for your use case.
@@ColeMedin That was my question answered 😅 It simplifies the stack - if you get it to work with Supabase you have all the DBs you need for the different functions in this pipeline.
Is pgvector still a couple orders of magnitude slower?
My point exactly
@@ColeMedin and don't forget about Apache AGE for PostgreSQL!
Genius - this is like a "medior AI engineer" tutorial video if someone builds the same thing and then tweaks it to make a unique LLM app out of it. I think a lot of companies would appreciate their engineers knowing all this.
Thank you and yeah I agree! Definitely would take some tweaks to make this fit a specific use case, but it's a good start for sure!
Removing the vector records when reimporting an updated file fixed a lot of my problems. Thanks for the help. U da man!
Seriously glad I could help, thanks Luis!!
Hi Cole, thanks for your work. Got it running last night locally on my MacBook Pro with 128 GB of RAM - looking forward to playing with this workflow. More videos about this would be appreciated! :)
You bet! Nice!
Yeah I am actually creating more content around this local AI package next week!
Outstanding work, Cole. Love it. I will implement it today. Looking forward to more of your excellent content. You are not verbose, just straight to the point and deliver value to boot. Thank you!
Thank you very much - your kind words mean a lot to me! 😃
Local is a good start. As a practical application, I think a good project would be to distribute the containers and have a login system for businesses.
Yes, I definitely agree! I wish I could cover that here without making the video way too long, but I will be making content on this kind of thing in the future!
This is the best example I have seen for a local AI agent and RAG.
Thank you - that means a lot to me!
Outstanding, bro - I was looking for this solution for long months!!!!
Awesome man, I'm glad I could help!!
Thank you for putting a lot of time into simplifying this for learners. Great work!
You bet, thank you!
I'm excited to see you extend this! Working Supabase into this flow for authentication, etc would be incredible. Awesome video bro!
Thank you Alex, I appreciate it a lot!! I'm stoked to extend this, so that won't be happening too far in the future 😎
Thanks, Cole! I've been building with Neo4j to create an evolving (meaning-structure) GraphRAG knowledge base with a similar Google ingestion -- all in Python. Tying in neo4j for GraphRAG (in N8N??) would streamline AND localize. Thanks again. Awesome!
You bet! That's super cool!
Nice video! As long as it's running locally and the documents are safe.
Nicely done, Cole.
I was running into various errors, like an authentication issue, when trying to pull the ollama-cpu Docker image - Docker was unable to authenticate with Docker Hub using the provided credentials. To fix it, I needed to log in to Docker from the command line using my Docker Hub username and an access token:

docker login -u your_username

When prompted for a password, enter the access token instead of your account password. Then run:

docker compose --profile cpu up

No errors, and all images were pulled down.
Ah that is really helpful to know, thank you for sharing your solution!
Something I'd like to see is building in mixture-of-agents frameworks, tool use, and an actual chat interface. This is a great start, and that is exactly what I'm going to start working on lol
I love it! Mixture of agents is definitely something I'm going to be diving more into in the near future.
You're doing a great job - keep making content, it's much appreciated... I have some amazing ideas but unfortunately I'm dealing with a shoddy computer and phone. After the 5 months I've spent, I now realize why I struggled so hard... so I thank you for the clear explanations. If you (or anyone here) can help, I'd be so grateful - as soon as I can, I'll hit the streets and hustle; I'm buying a new computer. Cheers everybody
Totally binging on your content. Setup has been a real PITA, but I'm attacking it one issue at a time. The latest one was that it would not accept my embedding model. I had to click out of the Ollama embedding step and back to my (your) flow, then load a couple of embedding models into Ollama from the container command line, then return to my flow and open the Ollama embedding step again. Suddenly I had a dropdown letting me select my embedding model, and it worked.
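For anyone hitting the same wall, loading an embedding model from the container command line looks roughly like this (assuming the kit's Ollama container is named "ollama" - check docker ps if yours differs; the model is just an example):

docker exec -it ollama ollama pull nomic-embed-text

After that, reopening the Ollama embeddings node should show the model in the dropdown.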
what a brilliant delete chunks trick
Hello, do you have a Discord community or a subreddit? I have a lot of questions about my workflow :(:(
Thank you sooo much.. I had to change the document import a bit to work with local network shares for my company, but it works.. SO GOOD.
The deleting of documents already in the system before adding more is really important. ** I can't wait for your front end video **
My pleasure!!
I'm curious - what exactly did you have to change for local network shares? I'm glad it's working great for you!
I can't wait to add the frontend into this setup - I appreciate you calling that out :)
@@ColeMedin I am happy to do a short clip on YT showing the local file import change. We are a large mining house and have hundreds of policy PDFs - now you can ask the AI assistant to compare policies from different operations/countries and highlight inconsistencies, or to find the one relevant to you, or just to summarize the content to get you up to speed.. will reply with a link to the clip :)
@@HermanRas This would be great! I have the same need here - I have to watch a local UNC share for new and updated PDFs and Markdown files to feed the RAG.
That sounds awesome, I look forward to it!
@@HermanRas Any news on your video? We have a local file share with thousands of documents and I wonder how they could be added.
This is absolutely awesome! Amazingly useful! Thank you so much bro! Amazing job! 🙌🙌
Thank you so much!! :D
Thank you for reminding me of this! Keep making this type of content for the people who want to benefit from their own offline AI ventures!
Of course! And I will certainly be sticking to this type of content!!
Should be illegal to use GitHub in light mode
I've noticed a trend going on with that. There hasn't been enough of an argument made to convince me to go bright.
Haha touché 😂
I generally prefer dark mode in every app I use, honestly not sure why I'm using light mode for GitHub still!
@@ColeMedin Heathen!
Light mode is objectively better if you're working during the day. The brain needs a bright day to work well; sitting in the dark is quite counterproductive.
@@sCiphre Well, that's a bunch of nonsense. I'm most productive at night when everyone is asleep.
I really appreciate your videos - the right details with the right level of depth. Perfect.
What I personally like to do is follow your videos and others, rebuilding what I've seen.
One suggestion for people like me in this case: I needed to go into the repo and revert your newer Flowise stuff (I'll get into it later, but I'm not sure if I really need it).
Could you refer in your description to the commit or branch that covers the compose file and examples you used in this video?
People get confused if they see different stuff in the files than in the video. :)
Thank you very much and I appreciate the suggestion a lot! I will definitely start doing this.
Thanks Cole! This is pretty amazing!
You bet! Thanks man! 😄
Thanks for the video. A lot of web UI chat tools compatible with Ollama can now do RAG right out of the box, like Open WebUI. The auto-trigger part with n8n is a good one if you need to automatically process a lot of documents.
Thank you and fair point! I am actually looking into Open Web UI and will be doing a video on it in the near future. Awesome platform 🔥
This looked great, right up to the point where you try to access the web interface and find you can't proceed until you have made an account with n8n. I must have missed where that was shown in the video.
Oh you don't have to make an account with n8n! That is just a local account for your n8n instance!
Excellent explanation 🎉🎉 It answers most of the "why" questions
Thank you very much!
really good video. Thanks. Liked and subbed
Yes, I would love to see a setup with Redis caching, Supabase auth, a Next.js project, and a Stripe payment system, to have a template for a SaaS. God bless you
Thank you for the suggestion! It'll be a larger project to get a video for all of that but I am planning exactly that!
There are two more additions that would round out this local dev environment: Open WebUI, and Ceph Nano with S3 enabled.
With those you have your own local dev cloud environment - you can build functions and tools in Open WebUI that call n8n workflows, and store files using the S3 protocol.
I actually did implement Open WebUI here!
ruclips.net/video/E2GIZrsDvuM/видео.html
Ceph nano I haven't heard of, but that would be cool!
Thanks for sharing this - it is a good starting point for my needs.
My pleasure! I'm glad you can take this and run with it!
Thanks for doing all this work! Huge help!
You are so welcome!
Total genius! From Argentina I send you a big greeting, and thanks for your info.
Thank you for this special episode. I subscribed because of this
@@acs2777 My pleasure, thank you very much!! 😊
Me too. Thank you. You help my battle with Boomer tendencies 😊
Open Web UI is still the best and cleanest implementation I've seen.
Yes I am actually planning on potentially including Open Web UI in this stack as I expand it!
@@ColeMedin yes please :)
Thanks for the great videos, man. Got Ollama and AnythingLLM set up last night and I'm checking this out now
@@namegoeshere2805 Of course!! Let me know how it goes when you give it a shot!
Man! What a great job!
Thanks a lot!
Great video mate ! Thank you for your effort
Thank you very much - my pleasure :)
Yo! Well done! I want to ask: is there something similar for a graph database and GraphRAG?
Thank you! Not yet but that's on my list for some content!
@@ColeMedin Thanks! I've been building with Neo4j to create an evolving (meaning-structure) GraphRAG knowledge base with a similar Google ingestion -- all in Python. Tying in neo4j for GraphRAG would streamline AND localize. Awesome!
Thank you for the video - I have learnt a lot. I am stuck on Set File ID: I am not getting any output when I run "Test step" under the Set File ID node. Not sure what I am doing wrong. In the previous step I can see the file contents from Google Drive. Thank you.
You bet, I'm glad you found it helpful! Not totally sure on that one - in the "Set File ID" node can you see the incoming input from the Google Drive node? Or are you just seeing the output when you click into the Google Drive node?
Very interesting, even if I only understood half of it :D
Great content my man
Thank you very much!!
I'm excited. I just bought my first desktop PC and went all out with a 4080 Super and a Ryzen 7 7800X3D. I managed to pick up a mint-condition used Samsung Odyssey CRG9 2x Quad HD monitor, so gaming will be fun, but I'm equally excited about playing around with some AI stuff. I love the idea of fine-tuning models, using RAG, and especially Stable Diffusion. I keep finding out things about my Nvidia GPU that I didn't know, like being able to watch everything in HDR - which is fantastic, because so little content is in HDR but it looks so good.
That's so cool man, I love it! That's an awesome setup you've got :)
Have you looked into how to extend the intelligence by using `o1-preview`, `o1-mini`, `claude-3.5-sonnet`, `4o` and so forth as high-level thinkers/orchestrators that manage many small agents to pull intelligence in and process?
You sir are speaking my language! haha
I have been experimenting with this a bit with some initial success. I'll almost certainly make a video on this kind of idea in the future!
@@ColeMedin very exciting. This kind of synergy is where real power lies.
That's right!!
Thanks for this awesome, detailed content. Once I get my hardware I am going to start \o/
You bet, sounds great!!
Oh man, this could really lower the barrier to entry for Linux's absurd documentation problem.
Haha I hope so!!
How would this work without Google, with local folders instead?
Good question! There is a local file trigger in n8n you can use to work with files on your machine instead of in Google Drive:
docs.n8n.io/integrations/builtin/core-nodes/n8n-nodes-base.localfiletrigger/
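One caveat if you go that route: the n8n container can only watch folders that are mounted into it, so you'd add something like this to the n8n service in docker-compose.yml (the host path is a placeholder - a sketch, not the kit's exact file):

volumes:
  - /path/on/your/machine:/data/shared

Then point the Local File Trigger at /data/shared inside the container.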
My issue at the moment is that I believe I followed all the steps, but I am unable to connect to the Postgres account. I get an error saying the password for user "root" failed. I tried the default 'password' and also the one I set in Visual Studio Code while following along with your steps, but neither works.
Dang sorry you're running into that! What URL (host) are you using for Postgres?
@@ColeMedin Thank you for replying, the 'Host' is 'host.docker.internal'.
For the host, don't use host.docker.internal... use the name of the running Postgres container - in my case it was "postgres-1". Hope it helps ;)
@@sebastianvesper7858 Thanks for the tip. I tried changing the host to "postgres-1" (since this is also the name of the container in Docker Desktop), but the error remains the same.
I had the same problem. Go to the Docker config for your container and read the DB_POSTGRESDB_HOST variable - copy that to the host field and it should work :)
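A quick way to read that variable, sketched under the assumption your n8n container is named "n8n" (substitute whatever docker ps shows):

docker exec n8n env | grep DB_POSTGRESDB

Whatever DB_POSTGRESDB_HOST prints is the hostname to put in the Postgres credentials in n8n.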
Hi Cole, would love to see your tutorial on how to implement this with Digitalocean.
I will be making a guide in the near future on deploying this all to DigitalOcean! Thank you for mentioning that!
@@ColeMedin There are many existing self-hosted n8n users, and we don’t want to start from scratch. Hopefully, this idea can inspire you to create a tutorial on how to onboard the AI starter kit with an existing DigitalOcean self-hosted setup 🙂
@@ruellago22 Yeah great point! I'm on the same page as you 👍
IIRC you don't need to expose ports in docker compose if all services are on the same docker network and use their docker hostnames to communicate.
That's what I thought too, but I wasn't able to connect to Postgres within n8n until I exposed the Postgres port! Maybe I was just setting up the URL incorrectly at that point - you could totally be right. But that's what ended up working for me!
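For reference, "exposing the port" just means adding a ports mapping to the Postgres service in docker-compose.yml - a sketch, not the kit's exact file:

postgres:
  ports:
    - "5432:5432"

Between containers on the same compose network, the service name plus the internal port (postgres:5432) should work without this; the mapping only matters when connecting from the host or via host.docker.internal.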
I don't see the link to this workflow in the comments or the video description, as promised in the video.
Genius!
Tell me, guys - I didn't really understand where I can get the API key for Qdrant? And also, if I need an API key, doesn't that mean it's not a local network?
Great questions! The API key for Qdrant can actually be anything, since it is running locally - that parameter isn't actually used (because it is local). Which also means, to your second question, that it is fully local!
It is just set up this way in n8n to make the node compatible with the hosted version of Qdrant, if you ever wanted to use that.
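If you want to see for yourself that no key is enforced locally, you can hit Qdrant's REST API directly - assuming the kit's default port of 6333:

curl http://localhost:6333/collections

It should list your collections without any api-key header at all.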
@@ColeMedinThank you ❤
You bet!
So unfair that I cannot give you more than one thumbs up
Haha I appreciate that a lot!! I sure wish you could too 😂
Awesome content !!!
Thank you very much!!
Does it work fine in win11 or do you have any extra updated explanation?
I used Windows 11, worked great for me with Docker Desktop!
amazing video!
Thank you very much! :)
Nice - I have gone through a lot setting up local AI, and it all seems very scattered, so thanks for this approach. Just wondering if I can use SQL Server 2022 Express for my project. And congratulations on getting merged with bolt.new to become bolt.diy - I wish that you remember us all and make everybody's life easy in the future as well.
You bet! Thanks for the kind words!
Yes, you can certainly use SQL Server instead! It's a different node in n8n, but I believe there is one besides the Postgres node with the compatibility you need.
Hi Cole, thank you for the tutorial you shared, it really helped me understand the RAG flow. However, I’ve run into a problem during my learning process. The output from the vector store is available, but it doesn’t seem to be passed into the AI Agent or read by it. I’m using the latest version of the Llama 3.2 chat model. Could you help me with this, Cole?
Which version of Llama 3.2 are you using? Sometimes I've seen in n8n with smaller models that the information is given from the vector DB, but for some reason the LLM hallucinates and thinks it doesn't have the answer.
This is looking fantastic, I shall try this later, but I suspect maybe a little too much was skipped for my abilities. We shall see. Thank you either way for the video :)
Thank you and sounds good! Let me know if you have any questions as you implement it!
Cole, I appreciate this video. It is excellent to watch, and I'm learning a lot. I have tried to follow along, but I'm using Linux Mint 21.3 and I'm starting to get a little turned around. Are there any good Linux tutorials you know of and feel comfortable sharing?
I was able to find the issue. On Linux with the standalone Compose V1 binary, the command is "docker-compose --profile cpu up" (with a hyphen) rather than "docker compose --profile cpu up".
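If you're not sure which variant your install has, a quick probe - it tries the newer compose plugin first, then the standalone V1 binary:

docker compose version || docker-compose --version

Whichever prints a version number is the syntax to use.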
You bet! I'm glad you figured out the issue - thank you for circling back and sharing the solution!
Awesome sauce. Thank you!
Thank you! My pleasure! 😃
Great content. any chance you can create a video about how to make actual code changes to the n8n instance and redeploy it via github actions?
Thanks and yes! Maybe not GitHub actions specifically but I do want to integrate a more coding focused part of this ecosystem with CI/CD.
Bro, just thank you - that's all I can say. Thank you very much, awesome video! Liked and subbed for the algo.
My pleasure, thank you very much my man!
Great video!
One question: how can I use this on my website, or access it from the web on a new page? It just keeps loading without sending the message to n8n, whether I use localhost or even ngrok.
17:04 In this part I would use an expression to create the chunks (splitting where it finds a "\n", or at the 100 mark as you did)
Yeah fair point - thanks for the suggestion!
Awesome video, but I think I am a little bit confused. When I follow the instructions in the video, it seems like I am basically installing the original n8n version, as this is what we are pulling. How do we actually get your version with the custom workflow?
Thank you, and I understand the confusion! My version of n8n isn't anything special besides the custom workflow. But I have the custom workflow as a JSON file you can download from the link in the description; then you can import it into your n8n by creating a new workflow, clicking the options in the top right, selecting "Import from file", and choosing my JSON file once you've downloaded it to your machine.
Can you make the same tutorial with cloud installation?
I do have a tutorial for deploying this to the cloud if that is what you are looking for!
ruclips.net/video/259KgP3GbdE/видео.html
@@ColeMedin TY very much!
How much storage does it require? And what are system requirements?
Great questions!
The whole local AI starter kit takes about 6GB (5.3GB of that is the Ollama container).
Then the system requirements all depend on the LLM you want to run locally. Almost any computer would be able to run an LLM like Microsoft's Phi3 Mini! Then for a model like Llama 3.1 8b, you'd want a graphics card with at least 8GB of VRAM. Something like the RTX 3090.
I'm running a local n8n and can't find the template for the Embeddings Ollama node - n8n indicates it's not available.
Hmmm.... did you install n8n pretty recently?
Great video, thank you! I have installed the container, set everything up, and it works. But on my machine the results are pretty poor. I uploaded dummy meeting notes and the response sometimes does not find anything; e.g. when asking for the agenda it only returns the first point.
Any tips on how to fine-tune and optimise the results?
The biggest thing is the LLM you are using - I've found that smaller LLMs tend to miss details like this, so moving to a larger model usually improves the results a lot.
Omfg I been thinking to do this with n8n forever lmao
Yeah I was too until I found this! haha
Could we do something like receive a text message and then use this to reply to the text message based on answer received after looking through the docs?
Great question! Yes, you certainly can! You can use a service like Twilio that integrates with n8n. Your workflow trigger would be a Twilio trigger to handle new SMS messages, and you can have an action at the end of the workflow with Twilio to send an SMS response.
Here is an example of a workflow that includes both! It's a bit more complicated of a workflow, but you can see Twilio triggers and actions in it:
n8n.io/workflows/2342-handling-appointment-leads-and-follow-up-with-twilio-calcom-and-ai/
@@ColeMedin appreciate the detailed response and link will check it out
Sounds great, of course!
Great video!
Thank you - I'm glad you enjoyed it!
Can this be done using TinyLlama instead?
Great question! The answer is yes!
Tiny Llama is available in Ollama so you can pull it just like I did with the other models and use it within the n8n workflow instead of Llama 3.1:
ollama.com/library/tinyllama
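If you're running Ollama from the kit's container, pulling it would look roughly like this (the container name "ollama" is an assumption - check docker ps):

docker exec -it ollama ollama pull tinyllama

Then select tinyllama in the workflow's Ollama chat model node instead of Llama 3.1.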
Great video, thanks for sharing. Just one question, how do we get to docker-desktop? Was that installed as part of this?
Thank you!!
I have a link to install Docker Desktop in the description! I can put it here too:
www.docker.com/products/docker-desktop/
It's a very quick and easy install!
Hi dude, thank you very much for sharing your knowledge. Talking about RAG: right now GraphRAG looks like the best option for getting better answers. I found projects about it, but every single one needs code. Would you do a video building a flow with Neo4j? Please!
The idea is to use Neo4j as a vector database for the embeddings.
You are so welcome! I haven't used neo4j before, what is the advantage of using it?
Great work Cole. I plan to set up RAG for my business as I’ve followed RAG developments for about a year. Things have come a long way. I plan to model your work and would like to connect to Supabase since I plan to use for some of my other App work.
Thank you and good luck! Are you planning on hosting Supabase yourself or using the cloud offering? Either works great!
I have been trying to make Llama 3.1 work using llama-stack, but felt it was too complicated, or still unfinished. Docker and Postgres? Oh yeah, this one sounds more like it for me! Subbed.
Thank you for this! Is it possible to use a local drive instead of Google Drive?
My pleasure! And yes you definitely can! n8n has a "local file" trigger I'd look into!
This sounds so interesting
I'm glad it does!
I really appreciate you. Thanks for doing this for everyone!!
Of course, it's my pleasure!!
How hard is it to add vector image embeddings as well, for LLaVA running on Ollama, with this setup?
Great question! Unfortunately n8n doesn't provide direct support for multimodal RAG, so you would have to do this with custom code. You could use a LangChain code node to interact with Qdrant to insert the image vectors, similar to how I used a LangChain code node in the video to delete old vectors. Or if you want to create something outside of n8n with LangChain you could definitely do that!
Hey, thanks for the tutorial - very easy to follow! I seem to be getting stuck with an UNAUTHORIZED NULL error on the "Clear Old Vectors" step though?
Thank you - I'm glad it was easy to follow!
I haven't encountered this error before... did you make any custom changes to the workflow by chance? Otherwise, it seems like it could be an issue with your Supabase credentials since it's an authorization error. I'd check to make sure the API key you set for the creds there in n8n matches the one in your project
Hello, and thank you for the video.
I don't understand the point of having n8n locally, since if I switch off my computer the workflows no longer work?
And how can I avoid losing my data?
You are welcome, and good question! You can host n8n locally but do it within a machine you rent in the cloud so you can have it running 24/7 and not lose any data!
dude you are awesome
@@wangnuny93 Thank you man!!
Something I can actually finally use - thanks ❤
Thank you Jarad, that means a lot! :)
Another great video Cole!
Thank you very much!!
Great video.. I am getting an error with the Postgres user password. Should I put in only the user or root password, or all three, in that Postgres credential?
Thank you! Could you clarify what you are asking? If I understand your question correctly, though, you'll want to put in the user you set in the .env file - same thing with the password and database name!
Can you get it to give citations/references of the data it used and from what document?
You could definitely do this in n8n, but it would take a good amount of customization because the default RAG setup using the "vector store retrieval" tool doesn't provide the ability to cite sources unfortunately. I really wish it would so I'm with you there.
The best route to go down would probably be to implement a custom tool for RAG using the "LangChain code" node like I used in the video to delete old vectors before reinserting a document. So you would create a separate workflow in n8n that takes a query and returns the retrieved text with the sources, and then add that workflow as a "tool" to the agent node.
@@ColeMedin Thanks for the response. I'm really looking for a RAG/Graph solution that allows me to check the source data and ensure hallucinations have not crept in.
Of course! And yeah that's super important!
Thanks for sharing.
Can you not send a pull request or whatever with your improvements on the help doco if you find it lacking in certain areas?
My pleasure!
And I appreciate the suggestion! I honestly didn't think about sending in a pull request to update their README but I will consider that! As long as they are open to it!