By far the best tutorial and overview on local RAG, and also dropping gems on the little improvements you've made from the original repo. The workflow is amazing too!! One of my ideas is replaying some older RPGs from back in the day on the Steam Deck; with less time than I have now for other priorities, it's nice to just query the walkthrough docs and ask where to go next, etc.
Cole, you’ve done an outstanding job! Your videos consistently make complex topics clear and easy to follow, which is a rare talent. I appreciate how you break down each concept, making it accessible for viewers at all levels. Your teaching style not only engages but also inspires confidence in learning. I’m eagerly anticipating your next iteration. It’s always exciting to see how you evolve your content and introduce new ideas. Keep up the fantastic work, and thank you for sharing your knowledge with us!
Thank you very much!! I really appreciate the level of detail in your kind words - it means a lot to me to know that my work is noticed in such a thoughtful way and that I'm hitting my target of breaking down complex topics well!
@@ColeMedin Yes indeed! You explain topics with a flow between ideas and concepts that rivals that of other popular tech YouTubers such as NetworkChuck, Linus Tech Tips, and Christian Lempa.
I have to agree. Literally just getting started with local AI. Was about to skip past this video and thought, maybe it’s something I can use that I didn’t know existed. BAM! This video is going to be my beginning into what my vision is for my local AI. Really appreciate you made this understandable!
@@Hatch3dLabs The author explains things well and helped me understand more clearly. I'm just getting started too. What tool are you using? I'm using xark-argo/argo on GitHub; it's simple to install on Mac and Windows. What about you? I'd like to keep in touch for more learning.
This is awesome work, Cole. I actually found a later video first, then came back to this one. Still tinkering with models as to what I want to use, especially since there seems to be an issue with many of the Ollama models being able to work with tools. Looks like they're working on that, though. One suggestion, if you are able to do so, would be to take the videos for this and put them into a playlist in the order they should be viewed. Trying to figure out which one comes next, I feel like I'm bouncing all over your channel. Granted, all good stuff and I plan on getting through all of it, but it would be helpful to follow in the order you think would be the most efficient. Rock on, brother.
Thanks for the kind words and the suggestion! That's one of the big things I'll be working on actually going into this new year - organizing my videos together into better playlists and also creating a sort of "mind map" for my channel with all the different content I'm putting out!
This is a very good step-by-step tutorial. Following the instructions in this video will get you started with local AI. For people trying this on M1 Macs and above, Ollama must be installed separately; the rest is the same.
@@ColeMedin If you want to use a GPU/NPU-accelerated LLM rather than CPU on Apple Silicon (which has neither AMD nor Nvidia GPUs), you'll need the actual Ollama desktop app on your host Mac and should pull the models from there rather than using the Ollama container in the Docker Compose file. That's why in the Quickstart they call for starting the Docker Compose stack without a profile - it doesn't even start the Ollama container. Docker will still be able to reach Ollama using the Docker internal hostname, but you'll get much faster processing using the Apple Silicon GPU and NPU (which pieces are accelerated depends on what the Ollama team have done with the app over time). It took me a few minutes to figure it out, but once I did it works just fine.
@@scottstillwell3150 Since I could not open the credentials, I tried to set up new ones. They say they can connect, but I am not able to use the Ollama node in the demo workflow. It can't fetch any models. This is super confusing.
Outstanding work, Cole. Love it. I will implement it today. Looking forward to more of your excellent content. You are not verbose, just straight to the point and deliver value to boot. Thank you!
Genius, this is like a "medior AI engineer" tutorial video if someone builds the same thing and then tweaks it to make a unique LLM app out of it. I think a lot of companies would appreciate their engineers knowing all this.
Yes definitely!! I love using pgvector so I'm 100% behind you there. I focused on Qdrant in this video just to show the entire package, but often times using less services (so using Postgres both for chat memory and RAG) can be the way to go if you find it works well for your use case.
@@ColeMedin That was my question answered 😅 Simplified the stack; if you get it to work with Supabase, you have all the DB you need for the different functions in this pipeline.
Hi Cole, thanks for your work. Got it running last night locally on my Mac Book Pro with 128 Gigs of ram - looking forward to playing with this workflow. More videos about this would be appreciated! :)
15:20 is truly the most important part of the logic. It's absolutely necessary to have a function to handle contingencies regarding file duplicates.
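The clear-then-re-insert pattern that comment points at boils down to a few lines. A minimal sketch using a plain Python list as a stand-in for the Qdrant collection (the "file_id" field name is illustrative; the real workflow does this against Qdrant from a LangChain code node):

```python
# In-memory stand-in for the Qdrant collection; the real workflow performs the
# same delete against Qdrant, matching on file_id metadata, before re-inserting.
vector_store = [
    {"file_id": "drive-abc", "chunk": "old version, chunk 1"},
    {"file_id": "drive-abc", "chunk": "old version, chunk 2"},
    {"file_id": "drive-xyz", "chunk": "unrelated doc"},
]

def reingest(file_id, new_chunks):
    # 1) clear every vector that belongs to this file...
    vector_store[:] = [v for v in vector_store if v["file_id"] != file_id]
    # 2) ...then insert the freshly extracted chunks
    vector_store.extend({"file_id": file_id, "chunk": c} for c in new_chunks)

reingest("drive-abc", ["new version, chunk 1"])
remaining = sorted(v["chunk"] for v in vector_store if v["file_id"] == "drive-abc")
print(remaining)  # only the new chunks survive; other files are untouched
```

Without this step, every edit to a Google Drive file would pile duplicate chunks into the vector store and pollute retrieval.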
Local is a good start. As a practical application, I think a good project would be to distribute the containers and have a login system for businesses.
Yes, I definitely agree! I wish I could cover that here without making the video way too long, but I will be making content on this kind of thing in the future!
Thank you sooo much.. I had to change the document import a bit to work with local network shares for my company, but it works.. SO GOOD. The deleting of documents already in the system before adding more is really important. ** I can't wait for your front end video **
My pleasure!! I'm curious - what exactly did you have to change for local network shares? I'm glad it's working great for you! I can't wait to add the frontend into this setup - I appreciate you calling that out :)
@@ColeMedin I am happy to do a short clip on YT showing the local file import change. We are a large mining house and have hundreds of policy PDFs; now you can ask the AI assistant to compare policies from different operations/countries and highlight inconsistencies, or to find the one relevant to you, or just to summarize the content to get you up to speed.. will reply with a link to the clip :)
My issue at the moment is that I believe I followed all the steps, but I am unable to connect to the Postgres account. I get an error saying the password for user "root" failed. I tried the default 'password' and also the one I set in Visual Studio Code while following along with your steps, but neither works.
@@sebastianvesper7858 Thanks for the tip. I tried changing the host to "postgres-1" (since this is also the name for the container in docker desktop), but the error remains the same.
Nicely done, Cole. I was running into various errors, like an authentication issue when trying to pull the ollama-cpu Docker image. That error suggests Docker is unable to authenticate with Docker Hub using the provided credentials. To fix it, I needed to log in to Docker from the command line using my Docker Hub username and an access token: run "docker login -u your_username", and when prompted for a password, enter the access token instead of your account password. Then run "docker compose --profile cpu up". No errors, and all images were pulled down.
Hi! Could you please be more explicit about the access token? Do you mean the N8N_ENCRYPTION_KEY or N8N_USER_MANAGEMENT_JWT_SECRET in the .env file?
Yes, would love to see setting like Redis caching, supabase auth, next.js project, stripe payment system to have some template for a Saas. God bless you
I really appreciate your videos. Having the right details with the right level of depth. Perfect. What i personal like to do is following your videos and other - rebuilding what I've seen. One suggestions for people like me and this case; i needed to go into the repo and revert your newer flowise stuff (Going into it later, but not sure if i really need it). Can you refere in your description to the commit or branch which covers the compose and examples you used especially in this video? People get confused, if they see other stuff on the files than in the video. :)
Great question! Generally for an at-home LLM setup you'll want a graphics card with at least 8 GB of VRAM (for models like Llama 3.1 8b); more VRAM is even better, and a model like Llama 3.1 70b realistically needs 40+ GB even quantized. So an RTX 3090 with its 24 GB is a good starting point!
@@ColeMedin nice.. the question is (as you pointed out in a video) what kind of response times to expect haha... I want to set something up as a 'tutor' for my kids. They're small and I don't want them to be let loose on the internet yet but they're always full of questions and while happy to oblige sometimes I'm busy working
Something I'd like to see is building in mixture-of-agents frameworks, tool use, and an actual chat interface. This is a great start, and that is exactly what I'm going to start working on lol
Great work Cole. I plan to set up RAG for my business as I’ve followed RAG developments for about a year. Things have come a long way. I plan to model your work and would like to connect to Supabase since I plan to use for some of my other App work.
Hello, when installing from the git repo I always get an error: "Gracefully stopping... (press Ctrl+C again to force) dependency failed to start: container local-ai-packaged-postgres-1 is unhealthy". What could that be?
Are you running on Mac? I've seen this happen to a few people with Mac before. I'll actually be replacing Postgres with Supabase soon though so this issue should go away. I don't have a Mac myself though so I haven't been able to replicate this.
Thanks, Cole! I've been building with Neo4j to create an evolving (meaning-structure) GraphRAG knowledge base with a similar Google ingestion -- all in Python. Tying in neo4j for GraphRAG (in N8N??) would streamline AND localize. Thanks again. Awesome!
First off, amazing work. I'm puzzled why you chose to use Google Drive if the aim is to be local? I'll be using your workflow as inspiration for local files. Not knowing n8n, can it work with local files/folders? Let's hope I can get it working! Thanks for sharing.
Thank you very much! And that is a really fair point! I have worked with businesses in the past where they wanted a local AI solution, but they were fine storing sensitive information in Google Drive. The biggest concern is more sending their data into a cloud LLM that will be trained with it. In the end, cloud storage is just so convenient that almost no one wants to store all their files locally (on something like a NAS box) even if it's sensitive data. So that is why I consider this setup fully local even though it uses Google Drive. If you do want to work with local files, n8n can certainly do that! I would check out the local file trigger for n8n: docs.n8n.io/integrations/builtin/core-nodes/n8n-nodes-base.localfiletrigger/
@@MustRunTonyo Honestly I'm not sure how my comment got removed! YouTube says it is still there, but when I try to view it I don't see anything. All I said was that Google Drive is generally considered "safe" to store company data. Numerous businesses I work with have sensitive data there even though they want to run LLMs/RAG locally for privacy. Of course it's up to each business to determine if this is really the case for them! However, I am looking into using the local file triggers in n8n to make this setup entirely local! I will probably be making content on that in the near future!
This looked great, right up to the point where you try to access the web interface and find you can't proceed until you have made an account with n8n. I must have missed where that was shown in the video.
Light mode is objectively better if you're working during the day. The brain needs a bright day to work well; sitting in the dark is quite counterproductive.
Really good video, but I want to build it on my own based on your instructions, and I have a problem connecting the AI Agent to the Vector Store tool. How is this done? Dragging a line from the + of the AI Agent to the tool does not work for me (on a Mac).
Great stuff. I find your videos and explanations very easy to follow. Can you put up a tutorial on how to get Ngrok or Traefik working on this install to get webhooks working properly? I can't seem to get either working in Docker Compose at the moment in the self-hosted environment.
Thank you Keith! I'm glad you find my videos easy to follow, that is one of my main goals! Yes, this is another one of the extensions I want to make with this package - getting a domain set up so this can be hosted in a cloud instance with a reverse proxy! Not set on the specific tool for that yet but Ngrok is one I was considering! Haven't used Traefik before myself actually. What is the issue you are running into with Docker Compose?
@@ColeMedin I just could not get Ngrok working on the free domain they offer on the free tier. I kept getting error messages and tried troubleshooting for a few hours before I just gave up. I have not had to use webhooks yet, but I know that would greatly improve the functionality. I also saw that someone else was asking n8n directly about how to get webhooks working in the self-hosted-ai-starter-kit, and n8n said it was still something they have not quite worked out. Glad you put this tutorial together. I might uninstall my current self-hosted-ai-starter-kit installation and start off with your build. I did not know about the Postgres login that needed to be created, so I had followed one of your other tutorials and had a RAG AI agent set up using Supabase for the chat history and vector store. I was using Ollama with Llama 3.1 and was having issues with the embeddings, so this newer tutorial is great. Once again, I really appreciate what you are doing here. Your tutorials are giving me some inspiration to build some real-world tools.
Thank you for the kind words Keith, they mean a lot! That's strange n8n would say the webhooks wouldn't work in the self-hosted setup. I'm pretty sure I can get those to work 100%, I will be diving into that and will include that in my video I'll put up in the near future where I show how to host this whole setup remotely!
Totally binging on your content. Setup has been a real PITA, but attacking it one issue at a time. Latest one was it would not accept my embedding model. I had to click out of the ollama embedding step and back to my (your) flow, then load a couple embedding models into ollama from the container command line, then return to my flow and open the ollama embedding. Suddenly I had a drop down letting me select my embedding model and it worked.
Very useful content. I think if the camera is moved a little lower, your face will be closer to the middle of the frame, and this will create a more balanced angle. Thank you very much.
Interesting ! I enjoyed your full breakdown, under the hood - thank you. It's a lot of work to re-invent the RAG Wheel, so personally I'm using Spheria AI for my life's knowledge base - full privacy and data ownership... almost same as running on local :)
Thank you! I'm glad you found it interesting :) I've actually never heard of Spheria but I checked it out now and it looks awesome! What LLM does Spheria use or which have you chosen to use on the platform?
Thanks for the video, amazing content. Not sure how I can use this in production; will I need a powerful VM with a good GPU to run this? I have self-hosted n8n on EC2 but I am not sure about adding Ollama to that instance. Looking forward to the self-host-on-a-domain video; it will clear a lot of things up.
No, this isn't heavy on anything except the LLM you run locally. Phi-3 mini might be best if you can't do Llama 3b; you don't want something that's too slow. There is also a small Qwen2 and a Mistral 7b I'm testing.
Thank you very much! I'm glad you enjoyed it :) Fantastic question! I really appreciated what @jarad4621 said - it's really only the self-hosted LLM that is heavy. His suggestion of using Phi-3 mini is great if your EC2 isn't strong enough for a Llama 3.1 model. If you want to self-host Llama 3.1 8b, I would recommend a dedicated GPU on your cloud machine with at least 8 GB of VRAM; Llama 3.1 70b needs far more, realistically 40+ GB even quantized.
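As a rough sanity check on VRAM sizing, you can estimate weight memory as parameter count times bytes per weight, plus some overhead. A back-of-the-envelope sketch (the 20% overhead figure is an assumption; real usage depends heavily on quantization and context length):

```python
def estimate_vram_gb(params_billions, bits_per_weight=4, overhead=1.2):
    """Rough VRAM needed to hold model weights at a given quantization,
    with ~20% headroom assumed for KV cache and activations."""
    weight_bytes = params_billions * 1e9 * (bits_per_weight / 8)
    return weight_bytes * overhead / 1e9

for name, size in [("Llama 3.1 8b", 8), ("Llama 3.1 70b", 70)]:
    print(f"{name}: ~{estimate_vram_gb(size):.0f} GB at 4-bit")
```

This is why an 8b model fits comfortably on an 8 GB card at 4-bit, while a 70b model is out of reach for consumer GPUs without multi-GPU setups or aggressive offloading.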
Great intro video. Some pointers in case you revisit it, and for other newbies trying to follow along. The first time you connect to n8n, you have to create the owner account. It looks like you cloned the n8n example workflow and then modified it, but this isn't shown. The credentials/connection string for the database could be emphasized more, as this is critical. I tried changing the password and secret keys in the .env file, but had issues with encryption and had to destroy and rebuild the containers and volumes a few times. I would have preferred a local document store rather than Google Docs - especially since the main driver for using AI locally is to maintain control of your data.
Thank you for the feedback here! I'm making more content on this local AI starter kit in the future and I'll be sure to keep this in mind. Especially for the credentials and focusing more on that.
You NEED to put a list of the hardware you're running and the approximate speed you're achieving in the video description, or no one is going to bother trying this.
Thanks for the video. A lot of web UI chat tools compatible with Ollama can now do RAG right out of the box, like Open WebUI. The auto-trigger part with n8n is a good one if you need to automatically process a lot of documents.
I installed both GitHub Desktop and Docker on Win11. When I run the clone of self-hosted-ai-starter-kit.git in the terminal (Windows Terminal), I get an error: "The term 'git' is not recognized." 🤷‍♂️
Dang, that's really weird... Could you try downloading Git directly? git-scm.com/downloads/win It should come with GitHub Desktop, but this will for sure solve it for you!
You're doing a great job, keep making content, much appreciated. I have some amazing ideas but unfortunately I'm dealing with a shoddy computer and phone. After the 5 months I've spent, I now realize why I struggled so hard, so I thank you for the clear explanations. If you can help I'd be so grateful; as soon as I can, even if I have to hit the streets and hustle, I'm buying a new computer. Cheers everybody.
Have you looked into how to extend the intelligence by using `o1-preview`, `o1-mini`, `claude-3.5-sonnet`, `4o` and so forth as high-level thinkers/orchestrators that manage many small agents to pull intelligence in and process?
You sir are speaking my language! haha I have been experimenting with this a bit with some initial success. I'll almost certainly make a video on this kind of idea in the future!
Great question! Unfortunately n8n doesn't provide direct support for multimodal RAG, so you would have to do this with custom code. You could use a LangChain code node to interact with Qdrant to insert the image vectors, similar to how I used a LangChain code node in the video to delete old vectors. Or if you want to create something outside of n8n with LangChain you could definitely do that!
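Whether the vectors come from text or images, retrieval itself is just nearest-neighbor search over embeddings. A toy sketch with hand-made 3-dimensional vectors (a real setup would use Ollama embeddings and Qdrant instead of a Python list; the documents and vectors here are invented):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# (id, embedding, text) triples standing in for Qdrant points
store = [
    ("doc-1", [1.0, 0.0, 0.2], "Vacation policy: 20 days per year"),
    ("doc-2", [0.0, 1.0, 0.1], "Expense reports are due monthly"),
]

def retrieve(query_vec, k=1):
    """Return the k most similar documents to the query embedding."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [(doc_id, text) for doc_id, _vec, text in ranked[:k]]

print(retrieve([1.0, 0.0, 0.0]))  # closest to doc-1's embedding
```

For multimodal RAG, the only change at this layer is that image embeddings (from a vision model) get inserted alongside or instead of text embeddings; the search logic is identical.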
Installed Docker Desktop and GitHub Desktop on W10 (12 cores, 48GB machine, Nvidia GT710) and it seems to keep blue-screening on me. WTF? It never did that before. Pretty sure Docker Desktop and WSL are the problem; I don't know why.
Thanks and good question! Take a look at the Docker Compose file and you'll see a "mount" (data) folder for the n8n service that connects a folder on your machine to one within the container. You can customize this as well!
That's what I thought too but I wasn't able to connect to Postgres within n8n until I exposed the Postgres port! Maybe I was just setting up the URL incorrectly at that point, you could totally be right. But that's what ended up working for me!
There are two more additions that need to be added for this local dev environment: Open WebUI and Ceph Nano with S3 enabled. With this you have your own local dev cloud environment; then you can build functions and tools in Open WebUI that call n8n workflows, and store files using the S3 protocol.
Great questions! So the API key for Qdrant can actually be anything since it is running locally. That parameter isn't actually used (because it is local). Which also means it is fully local to your second question! It is just set up this way for n8n to make the node compatible with the hosted version of Qdrant if you wanted to use that.
Thank you for the video. I have learned a lot. I am stuck on Set File ID: I am not getting any output when I run "Test step" under Set File ID. Not sure what I am doing wrong. In the previous step I can see the file contents from Google Drive. Thank you.
You bet, I'm glad you found it helpful! Not totally sure on that one - in the "Set File ID" node can you see the incoming input from the Google Drive node? Or are you just seeing the output when you click into the Google Drive node?
Amazing tutorial that taught me a lot about RAG systems. Just disappointed that the results pulled from RAG are really poor in my test cases. Maybe I need to set up the Qdrant database differently.
Thanks Tom! The results depend a lot on the LLM you are using, so if you aren't getting the best results the first thing I'd try is using a larger model. Then messing with how you are storing things in Qdrant after that.
I have been trying to make Llama 3.1 work using llama-stack but felt it was too complicated or still unfinished. Docker and Postgres? Oh yeah, this one sounds more like it for me! Subbed.
That happens when you try to use the default credentials from the starter kit. Make sure you create new credentials for yourself for every service in the workflow!
Perfect video, really really good job ;-) Only question from me: it is local except for Google Docs; can we do this also with local files? I have tried it but did not succeed. That would be perfect for private data....
Glad the trigger worked! Since the file is already downloaded to your machine you don't need to download it to extract the text like you do with Google Drive, you can skip straight to the node that extracts the text from the file. Does that make sense?
@@ColeMedin Definitely makes sense, but it's still not running. I skipped the old-records part for the first version, to keep it simple. The Extract Document node gets the file ID and folder ID but awaits data, so I think I need a step in between?
Anyone else getting: There was a problem loading the parameter options from server: "Credentials could not be decrypted. The likely reason is that a different "encryptionKey" was used to encrypt the data." I've tried stripping out all security and running the whole thing open and I still get it.
Thanks for sharing all of this, I started learning about all of this just a few days ago. I followed your tutorial and everything is working but I don't know why my agent never uses the knowledge base so it answers only with code, do you know what's going on?
You are welcome! Sorry the LLM isn't doing great for you - it's probably because it is too small. I've had this happen a lot, especially with LLMs 8b parameters or smaller. It'll respond with "|" or something like that. Could you try with a larger LLM?
Good question! There is a local file trigger in n8n you can use to work with files on your machine instead of in Google Drive: docs.n8n.io/integrations/builtin/core-nodes/n8n-nodes-base.localfiletrigger/
This is a fantastic video that will help me optimize the business. I am wondering if some of the following would be possible: 1. Can I modify this n8n workflow so that, as a response, I get the document URL or download link for multiple documents where I can find the answer to my question? 2. Is it possible to chat with this agent externally (for example, within Slack or by embedded chat box on the internal website)? In your example, the trigger chat message is sent from n8n, but what if I want to enable my colleague to ask a question but don't want to let him inside n8n? 3. Can subversion clients be used as the source of knowledge base documents, such as Tortoise SVN?
Thank you! And great questions! 1. Yes you definitely can. You'd just have to make sure the links to the sources are available in the knowledgebase and then you can prompt the LLM in the system prompt to give URLs when it references knowledge. 2. Yeah, the main thing here is changing the trigger for the n8n workflow to a webhook instead of the chat message trigger. This will turn your agent into a API that you can integrate with any platform. You can also have the trigger be something like a new message in Slack so you can integrate it directly into a Slack channel. 3. You can really connect any data source as a source of knowledge to ingest in n8n as long as it has an API! There won't be a direct integration for Tortoise SVN like there is for something like GitHub, but you could have the n8n workflow use the Tortoise SVN API to pull documents and add it to the vector DB.
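To make point 2 concrete: a webhook-triggered workflow is just an HTTP endpoint you POST a chat message to. A self-contained sketch where a stub server stands in for n8n (the "/webhook/rag-agent" path and the canned reply are invented for the demo; n8n generates its own webhook URL, and "chatInput" mirrors the chat trigger's field name):

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers["Content-Length"])
        payload = json.loads(self.rfile.read(length))
        # A real n8n webhook workflow would run the RAG agent here;
        # this stub just echoes a canned answer.
        reply = json.dumps({"answer": f"You asked: {payload['chatInput']}"}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(reply)

    def log_message(self, *args):
        pass  # keep demo output quiet

server = HTTPServer(("127.0.0.1", 0), WebhookHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Any client (Slack bot, website chat box, etc.) just POSTs JSON to the URL
url = f"http://127.0.0.1:{server.server_port}/webhook/rag-agent"
req = urllib.request.Request(
    url,
    data=json.dumps({"chatInput": "hi"}).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    answer = json.loads(resp.read())["answer"]
print(answer)
server.shutdown()
```

This is the whole idea behind "turning the agent into an API": once the trigger is a webhook, any platform that can make an HTTP request can talk to it.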
Could we do something like receive a text message and then use this to reply to the text message based on answer received after looking through the docs?
Great question! Yes, you certainly can! You can use a service like Twilio that integrates with n8n. Your workflow trigger would be a Twilio trigger to handle new SMS messages, and you can have an action at the end of the workflow with Twilio to send an SMS response. Here is an example of a workflow that includes both! It's a bit more complicated of a workflow, but you can see Twilio triggers and actions in it: n8n.io/workflows/2342-handling-appointment-leads-and-follow-up-with-twilio-calcom-and-ai/
Hi Cole, thanks a lot for your work and for sharing. Got everything running well, from Ollama to Qdrant... but I get no answer related to my document ;( I changed the chunk size... always clearing old vectors... Thanks if you have time to help. Love from Paris
This happens if you are using the base credentials that come with the starter kit. Could you try making your own credentials within the node you are getting that error with?
I must be doing something wrong here, but I followed to the letter. I even tried just importing your JSON. The workflow is fine; I uploaded 3 news articles to my Drive and they are imported into Qdrant. But the response in the chat is always like this: {"name":"documents","parameters":{"input":"When did Israel Launch a New Strike on Beirut"}}. That is the output from the Ollama Chat Model, step 2. If I check the Ollama model under Vector Store, the output it should say is there.
Hmmm... seems like an issue with the LLM specifically if everything else including the actual RAG process is working. Which model are you using? If you are using Llama 3.1 8b it might not be good enough to handle all the context you are giving it. First thing I would try is to use a more powerful model available in Ollama. It's easy to search through the Ollama repo and find another one to try! ollama.com/library
If I check the flow, the answer is there, under the model under the Vector Store. There is the correct answer; it just does not reply with it in the chat. It replies with step 2 from the chat model.
Yeah that does seem like an issue with the LLM for sure then. I would certainly try using another model and seeing if the response you get is any better. I would try one of the Qwen models! 14b or 32b if your GPU is good enough: ollama.com/library/qwen2.5
@@ColeMedin There are many existing self-hosted n8n users, and we don’t want to start from scratch. Hopefully, this idea can inspire you to create a tutorial on how to onboard the AI starter kit with an existing DigitalOcean self-hosted setup 🙂
Great video. One question: how can I use this on my website, or access it on the web in a new page? It just keeps loading without sending the message to n8n, whether I use localhost or even ngrok.
@@ColeMedin Thanks, I found the problem: I was using ngrok and it doesn't work with it. As you mentioned, when I use localhost it works fine, thanks. Is there any way to use ngrok?
Everything mostly seems to work after some tweaking, except with the Qdrant Vector Store, where I get the error: "Only the 'load' and 'insert' operation modes are supported with execute".
@@ColeMedin It actually worked fine and it was just happening when I tested that step individually, which happened a few other times when individual nodes were tested. Thank you for the awesome workflow! Your content is fantastic.
For someone doing it for the first time, there is still a lot of information missing. I was unable to set it up in 3 hours of trying. I'll try again with some help from AI. Thanks for the video anyway!
Hi, I am trying this out for the first time, and I am already lost halfway through... At the step for starting up n8n, I created an account when I shouldn't have. Now it locks me out from accessing your workflow, and I can't delete it because my email isn't present in the n8n user base. What can be done about this?!
The account you create there is just for your local instance of n8n! It's not like you are creating an account for their cloud platform. When you say you're locked out of the workflows, what do you mean by that?
I believe that I have created an owner account for n8n in my local instance, and it blocks me from accessing the workflow made by you (n8n on localhost complains when I try to open it). I am now trying to start from scratch on a different system; hopefully I will be able to avoid this issue altogether.
Nice video, thanks! Is there any way to create a permanent memory based on the conversations with the chatbot? I mean: I just chat with the bot and it keeps all the learnings from the conversation forever. Thanks
Thank you, my pleasure! :) This kind of thing you're describing where an LLM can learn from conversations over a long period of time is definitely possible, but it isn't a quick and easy implementation! Essentially, you have to tell the LLM (usually in the system prompt) how to determine what is important to remember in conversations. Then, you would set up a tool for it to add parts of conversations to its knowledge base (for RAG) when it determines something is important. I will be making content around this in the future so hopefully that will make this much clearer!
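That "remember what matters" loop can be sketched with a tiny class. Here the importance decision is a boolean flag standing in for the LLM's judgment; in a real agent, the model would decide by calling a save-to-memory tool, and recall would be vector search over the knowledge base rather than keyword matching:

```python
class ConversationMemory:
    """Toy long-term memory: store facts flagged as important, recall by keyword."""

    def __init__(self):
        self.facts = []

    def maybe_remember(self, message, important):
        # In a real agent the LLM itself decides `important` (e.g. via a
        # "save_memory" tool call guided by the system prompt); here it's a flag.
        if important:
            self.facts.append(message)

    def recall(self, keyword):
        # Stand-in for a RAG lookup against the stored facts.
        return [f for f in self.facts if keyword.lower() in f.lower()]

memory = ConversationMemory()
memory.maybe_remember("User's dog is named Biscuit", important=True)
memory.maybe_remember("User said hello", important=False)
print(memory.recall("biscuit"))
```

The hard part in practice is the first step: prompting the model to reliably distinguish durable facts from chit-chat before anything gets written to the knowledge base.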
I am failing at the first hurdle! When I try to clone the GitHub repository it says "git is not recognised as an internal or external command". Can anyone help? Something must be missing on my laptop
1. Yes, you can work with AI-generated images with this stack! 2. You could use some of these services from a Java program, like Qdrant for RAG, you could call into n8n workflows, etc.
Follow up video here for deploying this to the cloud!
ruclips.net/video/259KgP3GbdE/видео.htmlsi=nUt90VMv63iVMQMe
That timing though....Sweeeeet! Thank you!
Wait, can't you make Postgres upsert? Or at least add a step to query for that? Google Drive has versioning itself, doesn't it? Surely you could pass a conditional?
Ignore me, you literally go on one second later to show code clearing the ingested file haha
Awesome video. I'm grateful for the work done.
A few notes for Mac users -
1 - Install Ollama locally and set it up separately. The Docker compose won't do that for you.
2 - Inside n8n you'll have to change the Ollama connections to point to the instance running on bare metal. That means you can't use localhost; you have to use your hostname instead.
3 - Setting up the Google project is a PITA, but follow n8n's directions exactly and it'll work. The last punch in the nards is that you have to make your Google account a tester for the app you set up, and then when you set up the Google Drive connection in n8n you have to connect and grant it permission to access your Google Drive. It's a PITA.
All that said: great work Cole. Keep it coming!
Thank you very much and I appreciate your notes here!
For #2 you can also use "host.docker.internal" to connect to your machine from the container. So Ollama for example would be host.docker.internal:11434
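Concretely, that means changing the base URL in the n8n Ollama credential (a sketch; 11434 is Ollama's default port as mentioned above):

```text
# n8n Ollama credential base URL
Before: http://localhost:11434            (points at the n8n container itself)
After:  http://host.docker.internal:11434 (points at Ollama on the host machine)
```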
Thank you both very much. Jake, is it running smoothly on a Mac through all these containers?
"Inside n8n you'll have to change the Ollama connections to point to the instance running on bare metal. That means you can't use localhost; you have to use your hostname instead." How do you do this please? Is it a terminal thing or a file change?
By far the best tutorial and overview of local RAG, and you also drop gems on the little improvements you've made over the original repo. The workflow is amazing too!! One of my ideas: I like playing some of the older RPGs from back in the day on the Steam Deck, but with less time than I used to have for other priorities, it's nice to just query the walkthrough docs and ask where to go next, etc.
Thank you very much man, I appreciate it a ton!
And that's a super fun and unique use case - I like it!
Cole, you’ve done an outstanding job! Your videos consistently make complex topics clear and easy to follow, which is a rare talent. I appreciate how you break down each concept, making it accessible for viewers at all levels. Your teaching style not only engages but also inspires confidence in learning.
I’m eagerly anticipating your next iteration. It’s always exciting to see how you evolve your content and introduce new ideas. Keep up the fantastic work, and thank you for sharing your knowledge with us!
Thank you very much!! I really appreciate the level of detail in your kind words - it means a lot to me to know that my work is noticed in such a thoughtful way and that I'm hitting my target of breaking down complex topics well!
@@ColeMedin Yes indeed! You explain topics with such a good flow between ideas and concepts that rivals that of other popular tech youtubers such as Networkchuck, Linus Tech Tips and Christian Lempa
Wow that means a lot - thank you!!
I have to agree. Literally just getting started with local AI. Was about to skip past this video and thought, maybe it’s something I can use that I didn’t know existed. BAM! This video is going to be my beginning into what my vision is for my local AI. Really appreciate you made this understandable!
@@Hatch3dLabs The author explains things well and helped me understand more clearly. I am just getting started too; what tool are you using? I'm using xark-argo/argo from GitHub, it's simple to install on Mac and Windows. What about you? I'd like to keep in touch for more learning.
Thanks
Thank you so much for your support, it means a lot! :D
I'm not a developer, so figuring this out still feels like a big step for me, but you've done an outstanding job here anyway!
Yeah I get it! I'll be continuing to put out content to make it even easier to digest! Thank you though 😃
man.. just dropping casual double entendres as hole references? that’s an instant sub
@@jordon7999 Haha I appreciate it Jordon! 😂
This is awesome work, Cole. I actually found a later video first then came back to this one. Still tinkering with models as to what I want to use, especially since there seems to be an issue with many of the ollama models being able to work with tools. Looks like they're working on that, though. One suggestion, if you are able to do so, would be to take the videos for this and put them into a playlist in the order they should be viewed. Trying to figure out which one comes next and I feel like I'm bouncing all over your channel. Granted, all good stuff and I plan on getting through all of it, but would be helpful to follow in the order you think would be the most efficient. Rock on, brother.
Thanks for the kind words and the suggestion! That's one of the big things I'll be working on actually going into this new year - organizing my videos together into better playlists and also creating a sort of "mind map" for my channel with all the different content I'm putting out!
This is a very good step-by-step tutorial. Following the instructions in this video will get you started with local AI. For people trying this on M1 and above, Ollama must be installed separately; the rest is the same.
Thank you Dinesh, I appreciate it a lot!!
Could you clarify why Ollama needs to be installed separately for M1 and above?
@@ColeMedin If you want to use GPU/NPU-accelerated LLM rather than CPU on Apple Silicon (which doesn't have either AMD or Nvidia GPUs), you'll need the actual Ollama desktop app on your host Mac and pull the models from there rather than using the Ollama container in the docker compose file. That's why in the Quickstart they call for starting the Docker compose stack without a profile - it doesn't even start the Ollama container. Docker will still be able to reach Ollama using the docker internal hostname, but you'll get much faster processing using the Apple Silicon GPU and NPU (what pieces are accelerated depend on what the Ollama team have done with the app over time). It took me a few minutes to figure it out, but once I did it works just fine.
@@scottstillwell3150 Ok, but how does the rest have to be configured? I tried, but the whole n8n workflow seems to be broken.
@@scottstillwell3150 Since I could not open the credentials, I tried to setup new ones. They say they could connect, but I am not able to use the ollama node in the demo workflow. It can't fetch any models. This is super confusing.
Outstanding work, Cole. Love it. I will implement it today. Looking forward to more of your excellent content. You are not verbose, just straight to the point and deliver value to boot. Thank you!
Thank you very much - your kind words mean a lot to me! 😃
Removing the vector records when reimporting an updated file fixed a lot of my problems. Thanks for the help. U da man!
Seriously glad I could help, thanks Luis!!
Genius, this is like a “medior ai engineer” tutorial video if someone builds the same thing then tweaks it to make a unique llm app out of it. I think a lot of companies would appreciate their engineers to know all this
Thank you and yeah I agree! Definitely would take some tweaks to make this fit a specific use case, but it's a good start for sure!
Love your tutorial, bro! Straight to the point with intuitive, precise instructions.
You can also use Postgres with pgvector instead of Qdrant
Yes definitely!! I love using pgvector so I'm 100% behind you there.
I focused on Qdrant in this video just to show the entire package, but often using fewer services (so Postgres for both chat memory and RAG) can be the way to go if you find it works well for your use case.
@@ColeMedin that was my question answered 😅 Simplified stack: if you get it to work with Supabase you have all the databases you need for the different functions in this pipeline
Is pgvector still a couple orders of magnitude slower?
My point exactly
@@ColeMedin and don't forget about Apache AGE for PostgreSQL!
Hi Cole, thanks for your work. Got it running last night locally on my Mac Book Pro with 128 Gigs of ram - looking forward to playing with this workflow. More videos about this would be appreciated! :)
You bet! Nice!
Yeah I am actually creating more content around this local AI package next week!
I'm excited to see you extend this! Working Supabase into this flow for authentication, etc would be incredible. Awesome video bro!
Thank you Alex, I appreciate it a lot!! I'm stoked to extend this, so that won't be happening too far in the future 😎
Dear Cole, best regards from MX, you share great content and communicate great passion for your work
Thanks a bunch, I'm glad it's hitting the mark!
15:20 this is truly the most important part of the logic. It's absolutely necessary to have a function to handle contingencies around file duplicates
Indeed! This part of the workflow definitely took the longest but I wanted to include it because I totally agree it's super important to have.
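For anyone rebuilding this outside n8n, the duplicate-handling idea is simple to sketch in plain Python. This is an illustration of the pattern only: the `file_id` field and the in-memory list are stand-ins for your vector store (Qdrant, pgvector, etc.) and its metadata.

```python
# Minimal sketch of the "delete old chunks, then re-insert" pattern used to
# avoid duplicate vectors when a file is updated. The in-memory list stands
# in for a real vector store.

store = []  # each record: {"file_id": ..., "chunk": ...}

def upsert_file(file_id, chunks):
    """Remove every chunk previously ingested for this file, then add the new ones."""
    global store
    store = [r for r in store if r["file_id"] != file_id]  # delete stale chunks
    store.extend({"file_id": file_id, "chunk": c} for c in chunks)

upsert_file("doc-1", ["v1 chunk a", "v1 chunk b"])
upsert_file("doc-1", ["v2 chunk a"])  # re-ingest after an edit
print(len(store))  # prints 1: only the latest version's chunks remain
```

The same delete-by-metadata-then-insert step is what the workflow's LangChain code node does against Qdrant before adding the fresh embeddings.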
Local is a good start. As a practical application, I think a good project would be to distribute the containers and have a login system for businesses.
Yes I definitely agree! Wish I could cover that here without making the video way too long, but I will be making content on this kind of thing in the future!
Thank you sooo much.. I had to change the document import a bit to work with local network shares for my company, but it works.. SO GOOD.
Deleting the documents already in the system before adding more is really important. ** I can't wait for your front-end video **
My pleasure!!
I'm curious - what exactly did you have to change for local network shares? I'm glad it's working great for you!
I can't wait to add the frontend into this setup - I appreciate you calling that out :)
@@ColeMedin I am happy to do a short clip on YT showing the local file import change. We are a large mining house and have hundreds of policy PDFs; now you can ask the AI assistant to compare policies from different operations / countries and highlight inconsistencies, or find the one relevant to you, or just summarize the content to get you up to speed.. will reply with a link to the clip :)
@@HermanRas this would be great! Same point here: I have to watch a local UNC share for new and updated PDFs and Markdown files to feed the RAG
That sounds awesome, I look forward to it!
@@HermanRas Any news on your video? We have a local file share with thousands of documents and I wonder how they could be added.
This is the best example I have seen for the Local AI Agent and Rag
Thank you - that means a lot to me!
My issue at the moment is that although I believe I followed all the steps, I am unable to connect with the Postgres account. I get an error saying the password for user "root" failed. I tried the default 'password' and also the one I set in Visual Studio Code while following along with your steps, but neither works.
Dang sorry you're running into that! What URL (host) are you using for Postgres?
@@ColeMedin Thank you for replying, the 'Host' is 'host.docker.internal'.
for the host don't use host.docker.internal ... use the name of the running postgres container, in my case it was "postgres-1" .... hope it helps ;)
@@sebastianvesper7858 Thanks for the tip. I tried changing the host to "postgres-1" (since this is also the name for the container in docker desktop), but the error remains the same.
I had the same problem. Go to the Docker config for your container and read the DB_POSTGRESDB_HOST variable; copy its value into the host field and that should work :)
Outstanding Bro I was looking for this solution !!!! since long months
Awesome man, I'm glad I could help!!
Nicely done, Cole.
I was running into various errors, like an authentication issue when trying to pull the ollama-cpu Docker image. The error suggests that Docker is unable to authenticate with Docker Hub using the provided credentials.
To fix it, I needed to log in to Docker from the command line using my Docker Hub username and an access token:
docker login -u your_username
When prompted for a password, enter the access token instead of your account password. Then run:
docker compose --profile cpu up
No errors, and all images were pulled down.
Ah that is really helpful to know, thank you for sharing your solution!
Hi! Could you please be more explicit about the access token? Does it mean the N8N_ENCRYPTION_KEY or the N8N_USER_MANAGEMENT_JWT_SECRET in the .env file?
Yes, I would love to see settings like Redis caching, Supabase auth, a Next.js project, and a Stripe payment system, to have a template for a SaaS. God bless you
Thank you for the suggestion! It'll be a larger project to get a video for all of that but I am planning exactly that!
I really appreciate your videos. Having the right details with the right level of depth. Perfect.
What I personally like to do is follow your videos and others, rebuilding what I've seen.
One suggestion for people like me in this case: I needed to go into the repo and revert your newer Flowise stuff (I'll get into it later, but I'm not sure I really need it).
Could you link in your description to the commit or branch that contains the compose file and examples used in this video?
People get confused if they see different stuff in the files than in the video. :)
Thank you very much and I appreciate the suggestion a lot! I will definitely start doing this.
I want to build a system for my homelab for playing with AI and LLMs. Who has some suggestions for suitable hardware specs for decent performance?
Great question! Generally for an at-home LLM setup you'll want a graphics card with at least 8 GB of VRAM (for models like Llama 3.1 8b), 16GB of VRAM is even better if you want to run models like Llama 3.1 70b. So an RTX 3090 graphics card is a good starting point!
Any hope with amd? I have a spare 6700xt
@@Airbag888 yeah you will be able to run 8b parameter models with that for sure!
@@ColeMedin Nice.. the question is (as you pointed out in a video) what kind of response times to expect haha... I want to set something up as a 'tutor' for my kids. They're small and I don't want to let them loose on the internet yet, but they're always full of questions, and while I'm happy to oblige, sometimes I'm busy working
Something I'd like to see is building in mixture-of-agents frameworks, tool use, and an actual chat interface. This is a great start, and that is exactly what I'm going to start working on lol
I love it! Mixture of agents is definitely something I'm going to be diving more into in the near future.
Thank you for this special episode. I subscribed because of this
@@acs2777 My pleasure, thank you very much!! 😊
Me too. Thank you. You help my battle with Boomer tendencies 😊
Thank you for putting lot of time to simplify for learners. Great work!
You bet, thank you!
Great work Cole. I plan to set up RAG for my business as I’ve followed RAG developments for about a year. Things have come a long way. I plan to model your work and would like to connect to Supabase since I plan to use for some of my other App work.
Thank you and good luck! Are you planning on hosting Supabase yourself or using the cloud offering? Either works great!
Hello, when installing from the git repo I always get an error:
Gracefully stopping... (press Ctrl+C again to force)
dependency failed to start: container local-ai-packaged-postgres-1 is unhealthy
What could that be?
Are you running on Mac? I've seen this happen to a few people with Mac before. I'll actually be replacing Postgres with Supabase soon though so this issue should go away. I don't have a Mac myself though so I haven't been able to replicate this.
Open Web UI is still the best and cleanest implementation I've seen.
Yes I am actually planning on potentially including Open Web UI in this stack as I expand it!
@@ColeMedin yes please :)
Thanks, Cole! I've been building with Neo4j to create an evolving (meaning-structure) GraphRAG knowledge base with a similar Google ingestion -- all in Python. Tying in neo4j for GraphRAG (in N8N??) would streamline AND localize. Thanks again. Awesome!
You bet! That's super cool!
First off, amazing work! I'm puzzled why you chose to use Google Drive if the aim is to be local? I'll be using your workflow as inspiration for local files. Not knowing n8n, can it work with local files/folders? Let's hope I can get it working! Thanks for sharing.
Thank you very much!
And that is a really fair point! I have worked with businesses in the past where they wanted a local AI solution, but they were fine storing sensitive information in Google Drive. The biggest concern is more sending their data into a cloud LLM that will be trained with it. In the end, cloud storage is just so convenient that almost no one wants to store all their files locally (on something like a NAS box) even if it's sensitive data. So that is why I consider this setup fully local even though it uses Google Drive.
If you do want to work with local files, n8n can certainly do that! I would check out the local file trigger for n8n:
docs.n8n.io/integrations/builtin/core-nodes/n8n-nodes-base.localfiletrigger/
@@ColeMedin thanks for your work! Is it possible to add the local trigger on top of the Google Drive one?
This is exactly what I need ❤ Please, please, please do a short clip on how to do this with local files or files stored on a share!
What was the comment by Cole? Why has it been deleted?
@@MustRunTonyo Honestly I'm not sure how my comment got removed! RUclips says it is there still but when I try to view it I don't see anything.
All I said was Google Drive is generally considered "safe" to store company data. Numerous businesses I work with have sensitive data there even though they want to run LLMs/RAG locally for privacy. Of course it's up to each business to determine if this is really the case for them!
However, I am looking into using the local file triggers in n8n to make this setup entirely locally! I will probably be making content on that in the near future!
This looked great, right up to the point you try to access the web interface and you find you can't proceed until you have made an account with n8n. I must have missed where that was shown in the video.
Oh you don't have to make an account with n8n! That is just a local account for your n8n instance!
Should be illegal to use GitHub in light mode
I've noticed a trend going on with that. There hasn't been enough of an argument made to convince me to go bright.
Haha touché 😂
I generally prefer dark mode in every app I use, honestly not sure why I'm using light mode for GitHub still!
@@ColeMedinheathen!
Light mode is objectively better if you're working during the day. The brain needs a bright day to work well; sitting in the dark is quite counterproductive.
@@sCiphre Well that's a bunch of nonsense. I'm most productive at night when everyone is asleep.
Really good video, but I want to build it on my own based on your instructions, and I have a problem connecting the AI Agent to the Vector Store tool. How is this done? Dragging a line from the + of the AI Agent to the tool does not work for me (on a Mac)
Great stuff. I find your videos and explanations very easy to follow. Can you put up a tutorial on how to get Ngrok or Traefik working on this install to get webhooks working properly? I can't seem to get either working in Docker Compose at the moment in the self-hosted environment.
Thank you Keith! I'm glad you find my videos easy to follow, that is one of my main goals!
Yes, this is another one of the extensions I want to make with this package - getting a domain set up so this can be hosted in a cloud instance with a reverse proxy! Not set on the specific tool for that yet but Ngrok is one I was considering! Haven't used Traefik before myself actually.
What is the issue you are running into with Docker Compose?
@@ColeMedin I just could not get Ngrok working on the free domain they offer on the free tier. I kept getting error messages and was trying to troubleshoot for a few hours before I just gave up. I have not had to use webhooks yet, but I know they would greatly improve the functionality. I also saw someone else asking n8n directly about how to get webhooks working in the self-hosted-ai-starter-kit, and n8n said it was still something they have not quite worked out. Glad you put this tutorial together; I might uninstall my current self-hosted-ai-starter-kit installation and start off with your build. I did not know about the Postgres login that needed to be created, so I had followed one of your other tutorials and had a RAG AI agent set up using Supabase for the chat history and vector store. I was using Ollama with Llama 3.1 and having issues with the embeddings, so this newer tutorial is great. Once again, I really appreciate what you are doing here. Your tutorials are giving me some inspiration to build some real-world tools.
Thank you for the kind words Keith, they mean a lot!
That's strange n8n would say the webhooks wouldn't work in the self-hosted setup. I'm pretty sure I can get those to work 100%, I will be diving into that and will include that in my video I'll put up in the near future where I show how to host this whole setup remotely!
Totally binging on your content. Setup has been a real PITA, but I'm attacking it one issue at a time. The latest was that it would not accept my embedding model. I had to click out of the Ollama embedding step and back to my (your) flow, then load a couple of embedding models into Ollama from the container command line, then return to my flow and open the Ollama embedding node again. Suddenly I had a dropdown letting me select my embedding model, and it worked.
Very useful content. I think if the camera is moved a little lower, your face will be closer to the middle of the frame, and this will create a more balanced angle. Thank you very much.
Thank you and I appreciate the suggestion a lot!
Nice video! As long as it's running locally and documents are safe
Something I can actually finally use, thanks ❤
Thank you Jarad, that means a lot! :)
This is absolutely awesome! Amazingly useful! Thank you so much bro! Amazing job! 🙌🙌
Thank you so much!! :D
Interesting ! I enjoyed your full breakdown, under the hood - thank you. It's a lot of work to re-invent the RAG Wheel, so personally I'm using Spheria AI for my life's knowledge base - full privacy and data ownership... almost same as running on local :)
Thank you! I'm glad you found it interesting :)
I've actually never heard of Spheria but I checked it out now and it looks awesome! What LLM does Spheria use or which have you chosen to use on the platform?
Dope video. We need more
Thank you! A lot more coming soon! :)
excellent video
Thank you very much!
Thank you for reminding me of this! Keep to this type of content for the people who want to benefit with our own offline AI ventures!
Of course! And I will certainly be sticking to this type of content!!
Thanks for the video, amazing content
Not sure how I can use this in production; will I need a powerful VM with a good GPU to run this? I have self-hosted n8n on EC2, but I am not sure about adding Ollama to that instance.
Looking forward to the self-host on domain video, it will clear a lot of things
No, this isn't heavy on anything except whatever LLM you run locally. Phi-3 Mini might be best if you can't do Llama 3; you don't want something that's too slow. There are also a small Qwen2 and Mistral 7b that I'm testing.
Thank you very much! I'm glad you enjoyed it :)
Fantastic question! I really appreciated what @jarad4621 said - it's really only the self-hosted LLM that is heavy. His suggestion of using phi3 mini is great if your EC2 isn't strong enough for a Llama 3.1 model.
If you want to self-host Llama 3.1 8b, I would recommend a dedicated GPU on your cloud machine with at least 8GB of VRAM. Then at least 16GB of VRAM for Llama 3.1 70b.
Great intro video. Some pointers in case you revisit it, and for other newbies trying to follow along: the first time you connect to n8n, you have to create the owner account. It looks like you cloned the n8n example workflow then modified it, but this isn't shown. The credentials/connection string for the database could be emphasized more, as this is critical. I tried changing the password and secret keys in the .env file, but had issues with encryption and had to destroy and rebuild the containers and volumes a few times. I would have preferred a local document store rather than Google Docs, especially since the main driver for using AI locally is maintaining control of your data.
Thank you for the feedback here! I'm making more content on this local AI starter kit in the future and I'll be sure to keep this in mind. Especially for the credentials and focusing more on that.
You NEED to put a list of the hardware you're running and the approximate speed you're achieving in the video description, or no one is going to bother trying this
Fair enough! I will add something to the description now!
Thanks for the video. A lot of web UI chat tools compatible with Ollama can now do RAG right out of the box, like Open Web UI. The auto-trigger part with n8n is a good one if you need to process a lot of documents automatically.
Thank you and fair point! I am actually looking into Open Web UI and will be doing a video on it in the near future. Awesome platform 🔥
Excellent explanation 🎉🎉 It answers most of the "why" questions
Thank you very much!
I installed both GitHub Desktop and Docker on Win11. When I try to clone self-hosted-ai-starter-kit.git in the terminal (Windows Terminal), I get an error: "The term 'git' is not recognized." 🤷‍♂
Dang that's really weird... Could you try downloading Git directly?
git-scm.com/downloads/win
It should come with GitHub desktop but this will for sure solve it for you!
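As a quick check once Git is installed (and after reopening the terminal so the PATH refreshes), this confirms the `git` command resolves:

```shell
# Verify git is on PATH; prints something like "git version 2.x.y"
git --version
```

If this still errors, Git's install directory isn't on your PATH and the installer's "add to PATH" option (or a manual PATH entry) is the thing to revisit.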
same
You're doing a great job, keep making content, much appreciated... I have some amazing ideas but am unfortunately dealing with a shoddy computer and phone. After the 5 months I've spent, now I realize why I struggled so hard... so thank you for the clear explanations. If you or anyone here can help I'd be so grateful; as soon as I can, I'm buying a new computer. Cheers everybody
what a brilliant delete chunks trick
Hello, do you have a Discord community or a subreddit? I have a lot of questions about my workflow :(:(
Thanks for sharing this - it is a good starting point for my needs.
My pleasure! I'm glad you can take this and run with it!
Instead of exposing ports, it's likely easier / better to just use a Docker network and then use the service name instead of localhost
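As a sketch of that idea (the service and network names here are illustrative, not the starter kit's actual compose file): containers on the same user-defined network can reach each other by service name, so no `ports:` mapping is needed for container-to-container traffic.

```yaml
# Illustrative compose fragment: n8n reaches Postgres at postgres:5432
# over the shared network, with nothing published to the host.
services:
  n8n:
    image: n8nio/n8n
    networks: [demo]
  postgres:
    image: postgres:16
    networks: [demo]
networks:
  demo:
```

Published ports are then only needed for things you access from the host, like the n8n web UI.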
Have you looked into how to extend the intelligence by using `o1-preview`, `o1-mini`, `claude-3.5-sonnet`, `4o` and so forth as high-level thinkers/orchestrators that manage many small agents to pull intelligence in and process?
You sir are speaking my language! haha
I have been experimenting with this a bit with some initial success. I'll almost certainly make a video on this kind of idea in the future!
@@ColeMedin very exciting. This kind of synergy is where real power lies.
That's right!!
How hard is it to add vector image embeddings as well, for LLaVA running on Ollama, with this?
Great question! Unfortunately n8n doesn't provide direct support for multimodal RAG, so you would have to do this with custom code. You could use a LangChain code node to interact with Qdrant to insert the image vectors, similar to how I used a LangChain code node in the video to delete old vectors. Or if you want to create something outside of n8n with LangChain you could definitely do that!
I installed Docker Desktop and GitHub Desktop on W10 (12 cores, 48GB machine, Nvidia GT710) and it seems to keep blue-screening on me. WTF? It never did that before. Pretty sure Docker Desktop and WSL are the problem, but I don't know why.
That's super weird, I'm sorry that's happening! Could you try installing it without WSL? I don't currently use WSL myself on Windows.
@@ColeMedin So are you using Hyper-V when you install Docker Desktop?
I think it might install Hyper-v under the hood, but I just installed Docker Desktop and that's it!
Thanks Cole! This is pretty amazing!
You bet! Thanks man! 😄
Any chance to get this running with Nextcloud instead of Google Drive ?
Yeah it looks like n8n does integrate with Nextcloud!
n8n.io/integrations/nextcloud/
Love this, but I want to ingest local file changes (not Google). I don't know where to place files so that n8n can see them inside Docker?
Thanks and good question! Take a look at the Docker compose file and you'll see a "mount" (data) folder for the n8n service, that connects a folder on your machine to one within the container. You can customize this as well!
IIRC you don't need to expose ports in docker compose if all services are on the same docker network and use their docker hostnames to communicate.
That's what I thought too but I wasn't able to connect to Postgres within n8n until I exposed the Postgres port! Maybe I was just setting up the URL incorrectly at that point, you could totally be right. But that's what ended up working for me!
There are two more additions needed for this local dev environment: Open Web UI, and Ceph Nano with S3 enabled.
With those you have your own local dev cloud environment; then you can build functions and tools in Open Web UI that call n8n workflows, and store files using the S3 protocol.
I actually did implement Open WebUI here!
ruclips.net/video/E2GIZrsDvuM/видео.html
Ceph nano I haven't heard of, but that would be cool!
Tell me guys, I didn't really understand: where can I get the API key for Qdrant? And also, if I need an API key, doesn't that mean it's not a local network?
Great questions! So the API key for Qdrant can actually be anything, since it is running locally; that parameter isn't actually used. Which also answers your second question: it is fully local!
It is just set up this way for n8n to make the node compatible with the hosted version of Qdrant if you wanted to use that.
@@ColeMedinThank you ❤
You bet!
Thank you for the video, I have learnt a lot. I am stuck on Set File ID: I don't get any output when I run Test Step under Set File ID. Not sure what I am doing wrong. In the previous step I can see the file contents from Google Drive. Thank you.
You bet, I'm glad you found it helpful! Not totally sure on that one - in the "Set File ID" node can you see the incoming input from the Google Drive node? Or are you just seeing the output when you click into the Google Drive node?
how to includ Open Web UI?
I will be making a video on this in the very near future!
Thank you!
Amazing tutorial that taught me a lot about RAG systems. Just disappointed that the results pulled from RAG are really shitty in my test cases. Maybe I need to setup the qdrant database differently.
Thanks Tom! The results depend a lot on the LLM you are using, so if you aren't getting the best results the first thing I'd try is using a larger model. Then messing with how you are storing things in Qdrant after that.
I had been trying to make Llama 3.1 work using llama-stack but felt it was too complicated or still unfinished. Docker and Postgres? Oh yeah, this one sounds more like it for me! Subbed.
Thanks for the great videos, man. Got Ollama and AnythingLLM set up last night and I'm checking this out now
@@namegoeshere2805 Of course!! Let me know how it goes when you give it a shot!
Dude you're awesome!
Thank you so much! 😀
Problem loading credential
Credentials could not be decrypted. The likely reason is that a different "encryptionKey" was used to encrypt the data.
That happens when you try to use the default credentials from the starter kit. Make sure you create new credentials for yourself for every service in the workflow!
Perfect video, really really good job ;-) Only question from me: it is all local except for Google Docs. Can we do this with local files too? I have tried it, but did not succeed. That would be perfect for private data....
Thank you very much! Yes, you can use the "Local file trigger" in n8n to work with local files just like I do with Google Drive!
@ The trigger worked, but not the file and folder ID... how do I load it? Sorry, big beginner here.
Glad the trigger worked! Since the file is already downloaded to your machine you don't need to download it to extract the text like you do with Google Drive, you can skip straight to the node that extracts the text from the file. Does that make sense?
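In case a concrete picture helps: for local files, the "download" step collapses to just reading the file from disk, roughly like this (the function name and path handling here are my own illustration, not an n8n node):

```python
from pathlib import Path

# With the Local File Trigger the file already lives on disk, so the
# Google Drive "download" node is unnecessary: read the text directly.
def extract_text(path: str) -> str:
    return Path(path).read_text(encoding="utf-8")
```

In the workflow that means wiring the trigger straight into the text-extraction node instead of going through a download step first.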
@@ColeMedin Definitely makes sense, but it's still not running. I skipped the old records for the first version to keep it simple. The Extract Document node gets the file ID and folder ID, but it expects data, so I think I need a step in between?
Anyone else getting:
There was a problem loading the parameter options from server: "Credentials could not be decrypted. The likely reason is that a different "encryptionKey" was used to encrypt the data."
I've tried stripping out all security and running the whole thing open and I still get it.
The Ollama service was referring to localhost; I changed it to refer to the container and it worked.
And then I hit play again and he mentions exactly this. Facepalm
It's all good, I'm glad you figured it out!
Great job, bro!
Thank you! I appreciate it!
Thanks for sharing all of this, I started learning about all of this just a few days ago. I followed your tutorial and everything is working but I don't know why my agent never uses the knowledge base so it answers only with code, do you know what's going on?
You are welcome! Sorry the LLM isn't doing great for you - it's probably because it is too small. I've had this happen a lot, especially with LLMs 8b parameters or smaller. It'll respond with "|" or something like that. Could you try with a larger LLM?
How would this work without google and with local folders instead?
Good question! There is a local file trigger in n8n you can use to work with files on your machine instead of in Google Drive:
docs.n8n.io/integrations/builtin/core-nodes/n8n-nodes-base.localfiletrigger/
This is a fantastic video that will help me optimize the business. I am wondering if some of the following would be possible:
1. Can I modify this n8n workflow so that, as a response, I get the document URL or download link for multiple documents where I can find the answer to my question?
2. Is it possible to chat with this agent externally (for example, within Slack or via an embedded chat box on the internal website)? In your example, the trigger chat message is sent from n8n, but what if I want to let a colleague ask a question without giving him access to n8n?
3. Can subversion clients be used as the source of knowledge base documents, such as Tortoise SVN?
Thank you! And great questions!
1. Yes you definitely can. You'd just have to make sure the links to the sources are available in the knowledgebase and then you can prompt the LLM in the system prompt to give URLs when it references knowledge.
2. Yeah, the main thing here is changing the trigger for the n8n workflow to a webhook instead of the chat message trigger. This will turn your agent into an API that you can integrate with any platform. You can also have the trigger be something like a new message in Slack so you can integrate it directly into a Slack channel.
3. You can really connect any data source as a source of knowledge to ingest in n8n as long as it has an API! There won't be a direct integration for Tortoise SVN like there is for something like GitHub, but you could have the n8n workflow use the Tortoise SVN API to pull documents and add them to the vector DB.
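To make the webhook idea from #2 concrete, here's a rough sketch of what a caller (Slack middleware, your website's backend, etc.) would send. The webhook path and the payload field name are assumptions — use whatever you configure on the n8n Webhook trigger node:

```python
import json
import urllib.request

# Hypothetical webhook path -- set on the n8n Webhook trigger node.
url = "http://localhost:5678/webhook/my-agent"
payload = json.dumps(
    {"chatInput": "What does the onboarding doc say about VPN access?"}
).encode()

req = urllib.request.Request(
    url,
    data=payload,
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req)  # uncomment with the workflow active
```

The workflow's final node would then return the agent's answer as the HTTP response body.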
Error in sub-node ‘Embeddings Ollama1‘
fetch failed
Open node
Can anyone help?
Sorry you're hitting this error! What are you trying to embed exactly?
Could we do something like receive a text message and then use this to reply to the text message based on answer received after looking through the docs?
Great question! Yes, you certainly can! You can use a service like Twilio that integrates with n8n. Your workflow trigger would be a Twilio trigger to handle new SMS messages, and you can have an action at the end of the workflow with Twilio to send an SMS response.
Here is an example of a workflow that includes both! It's a bit more complicated of a workflow, but you can see Twilio triggers and actions in it:
n8n.io/workflows/2342-handling-appointment-leads-and-follow-up-with-twilio-calcom-and-ai/
@@ColeMedin appreciate the detailed response and link will check it out
Sounds great, of course!
Hi Cole, thanks a lot for your work and for sharing. Got everything running well (Ollama, Qdrant...), but I get no answer related to my document ;( I changed the chunk size and always clear old vectors first... Thanks if you have time to help. Love from Paris
You bet! Which model are you using? Sometimes the smaller ones don't do the best with RAG.
@@ColeMedin Thanks for your reply, will give it a try soon. You're the one!
Great video mate ! Thank you for your effort
Thank you very much - my pleasure :)
Great content my man
Thank you very much!!
Credentials could not be decrypted. The likely reason is that a different "encryptionKey" was used to encrypt the data.
This happens if you are using the base credentials that come with the starter kit. Could you try making your own credentials within the node you are getting that error with?
Which files in your GitHub repository do I need to update to use Llama 3.2 instead of 3.1?
That is in the docker-compose.yml file! If you search for 3.1 you'll find Llama 3.1 and you can replace that with Llama 3.2.
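From memory, the relevant bit of the starter kit's docker-compose.yml looks roughly like the fragment below (the exact service name and surrounding options may differ in your copy) — swapping llama3.1 for llama3.2 in the pull command is the change:

```yaml
# docker-compose.yml (approximate fragment)
ollama-pull-llama:
  image: ollama/ollama:latest
  command: ["-c", "sleep 3; ollama pull llama3.2"]  # was: ollama pull llama3.1
```

You'd also want to update any Ollama chat model nodes in the n8n workflow that reference the model by name.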
Thanks for the help. Im developing a game that will use ai. Is flowise better or easier to use?
Of course! And Flowise is really easy to get started with so I'd recommend that if you want to get your feet wet with AI!
I assume you can use redis as your memory storage vs Postgres
Yes you can!
I must be doing something wrong here, but I followed to the letter. Even tried just importing your JSON. The workflow is fine: I uploaded 3 news articles to my Drive and they are imported into Qdrant. But the response in the chat is always like this: {"name":"documents","parameters":{"input":"When did Israel Launch a New Strike on Beirut"}}. That is the output from the Ollama Chat Model step 2. If I check the Ollama model under the Vector Store, the output it should say is there.
Hmmm... seems like an issue with the LLM specifically if everything else including the actual RAG process is working. Which model are you using? If you are using Llama 3.1 8b it might not be good enough to handle all the context you are giving it. First thing I would try is to use a more powerful model available in Ollama. It's easy to search through the Ollama repo and find another one to try!
ollama.com/library
@@ColeMedin Hi, I use the default that is installed with the tutorial, 3.1 latest.
If I check the flow, the correct answer is there, under the model under the Vector Store. It just does not reply with it in the chat; it replies with step 2 from the chat model.
Yeah that does seem like an issue with the LLM for sure then. I would certainly try using another model and seeing if the response you get is any better. I would try one of the Qwen models! 14b or 32b if your GPU is good enough:
ollama.com/library/qwen2.5
I'm running a local n8n and can't find the template for the Embedding Ollama, n8n indicates it's not available.
Hmmm.... did you install n8n pretty recently?
Hi Cole, would love to see your tutorial on how to implement this with Digitalocean.
I will be making a guide in the near future on deploying this all to DigitalOcean! Thank you for mentioning that!
@@ColeMedin There are many existing self-hosted n8n users, and we don’t want to start from scratch. Hopefully, this idea can inspire you to create a tutorial on how to onboard the AI starter kit with an existing DigitalOcean self-hosted setup 🙂
@@ruellago22 Yeah great point! I'm on the same page as you 👍
Thanks for doing all this work! Huge help!
You are so welcome!
Great video!
One question: how can I use this on my website, or access it on the web in a new page? It just keeps loading without sending the message to n8n, whether I use localhost or even ngrok.
Thank you and good question!
You can access n8n by visiting localhost:5678 in your browser!
@@ColeMedin Thanks, I found the problem: I was using ngrok and it doesn't work with it. As you mentioned, when I use localhost it works fine. Thanks!
Is there any way to use ngrok?
Everything mostly seems to work after some tweaking, except with the Qdrant Vector Store, where I get the error: "Only the 'load' and 'insert' operation modes are supported with execute".
Weird I never ran into this myself! Is this for retrieval or adding documents into Qdrant?
@@ColeMedin It actually worked fine and it was just happening when I tested that step individually, which happened a few other times when individual nodes were tested. Thank you for the awesome workflow! Your content is fantastic.
Okay that's fantastic! You bet and thank you for the kind words!!
For someone doing it for the first time there is still a lot of information missing.
I was unable to set it up in 3h of trying. I'll try again with some help of AI.
Thanks for the video anyway!
You are welcome! I'm sorry it's taking you a while though! What information would you say is missing? I would love to improve my walkthrough.
Hi, I am trying this out for the first time, and I am already lost halfway through...
When there was a step starting up n8n, I created an account when I shouldn't have. Now it locks me out from accessing your workflow, and I can't delete it because my email isn't present in the n8n user base. What can be done about this?!
The account you create there is just for your local instance of n8n! It's not like you are creating an account for their cloud platform. When you say you're locked out of the workflows, what do you mean by that?
I believe that I have created an Owner account for n8n in my local instance, and it blocks me from accessing a workflow made by you (n8n on localhost complains when I try to open it).
I am now trying to start it from scratch on a different system now, hope I will be able to avoid this issue altogether
Sorry, I can't... For some reason, my comments get deleted.
Great content. any chance you can create a video about how to make actual code changes to the n8n instance and redeploy it via github actions?
Thanks and yes! Maybe not GitHub actions specifically but I do want to integrate a more coding focused part of this ecosystem with CI/CD.
Nice video, thanks! Is there any way to create permanent memory based on the conversations with the chatbot? I mean: I chat with the bot and it keeps all the learnings from the conversation forever. Thanks
Thank you, my pleasure! :)
This kind of thing you're describing where an LLM can learn from conversations over a long period of time is definitely possible, but it isn't a quick and easy implementation!
Essentially, you have to tell the LLM (usually in the system prompt) how to determine what is important to remember in conversations. Then, you would set up a tool for it to add parts of conversations to its knowledge base (for RAG) when it determines something is important.
I will be making content around this in the future so hopefully that will make this much clearer!
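As a bare-bones illustration of the shape of that tool (all names here are invented; a real version would embed the fact and upsert it into the vector store):

```python
# Minimal sketch of a "save_memory" tool an agent could call when its
# system prompt tells it something in the conversation is worth keeping.
memories: list[str] = []

def save_memory(fact: str) -> str:
    # A real implementation would embed `fact` and upsert it into the
    # vector store (e.g. Qdrant) so later RAG lookups can retrieve it.
    memories.append(fact)
    return f"Remembered: {fact}"

save_memory("User's company uses a 30-day refund window")
```

The LLM decides *when* to call the tool and *what* distilled fact to pass; the tool itself just persists it alongside the rest of the knowledge base.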
@@ColeMedin Great, thanks!! If you make it, please keep it simple and easy to follow. I am new to this whole world :)
Of course! And yes, I will make it simple and easy to follow!
I am failing at the first hurdle!
When I try to clone the GitHub repository it says "git is not recognised as an internal or external command".
Can anyone help? Something must be missing on my laptop
Make sure you install Git! Quick Google search and install will get you sorted! I recommend GitHub Desktop
Two questions: Does it have the ability to ai generate images? Also, is it possible to use in a Java Program?
1. Yes you can work with AI generated images with this stack!
2. You could use some of these services in a Java program, like Qdrant for RAG, and you could call into n8n workflows, etc.
Thanks.
You bet!