I haven't even gotten 30 seconds into this video and I'm so excited... You're like my fav RUclipsr to learn this stuff from, but I'm a big Open WebUI user and I am SO F-ing pumped right now you made this video... This is what I have been waiting for to allow me to connect all of the dots in my own little AI world setup.
@@lofigamervibes I'm so glad to be able to give you what you've been looking for! It's my pleasure and thank you so much for the kind words! 😁
I haven't even finished the video and I'm astounded at the rate you can consume new information, master it, script a video, and demo it in a step-by-step accessible way. Truly a unicorn and my personal RUclips Hero.
Thank you so much Cameron, that means a lot to me! :D
You are the best, Cole. It's like you read my mind and made my dream come true. Thanks a lot. Please keep posting.
I appreciate it a lot! You are so welcome and I certainly am not going to stop posting ;)
A lot of info at a quick pace! I like it and am subscribing. WebUI is awesome, so the n8n aspect will be cool to learn. Love your intensity. Looking forward to more.
Thank you very much!
I love your workflow and how you think about the processes and how to integrate them. I want to thank you for pulling all this together.
You bet man!!
I was probably one of the few who didn't see this as a missing feature but as a chance to integrate my own web UI for chat. It originally started as a ChatGPT clone and got various features added; now I have my own model to run it on, which is dope. But great video for everyone else still watching it!
That's awesome!
@@ColeMedin Not as awesome as your content, thank you for making everything so accessible. I will be looking to contribute to your bolt project. You are awesome bro
Thank you so much! I look forward to seeing your contributions man!
LOVE IT. After a very quick test, including closing the microphone listening window, I tried building a push-to-talk action button. It doesn't quite work yet, but in further testing, my microphone stops listening while "thinking" and responding, then goes back to active listening when the response ends.
Wow, amazing and completely free content! I think the next step is to implement Llama 3.2 vision in this workflow!
Thank you very much! Yeah I agree!
Big thanks for your efforts, that's super valuable!
You bet!!
00:05 Integrating open web UI for chatting with n8n AI agents
02:11 Self-hosted AI starter kit with various services packaged together.
06:01 Setting up Open WebUI with N8N AI Agents locally
07:56 Access N8N through Local Host Port 5678 for account setup.
11:58 Configuring connections and credentials for different nodes in N8N workflow
14:01 Using Google Drive as an example for ingesting documents and updating the Qdrant vector store.
17:46 Customize valves in N8N AI agents for specific integrations
19:34 Open WebUI allows convenient parameter customization for n8n AI agents.
23:07 Open Web UI allows for easy customization and voice chat with N8N AI agents
25:01 Extension of local AI starter kit with Open WebUI for voice chat integration.
Crafted by Merlin AI.
Hands down you are a godsend. Thank you so much for this video. I was spinning my wheels as of late and now I can let off the brake 🚘🚀
You are so welcome! I'm glad you're moving fast now, that's awesome!
What an incredible video! Congratulations. This is the best video I've seen on YouTube about it.
Thank you very much man!
Dude this is so fire, this is exactly what I wanted you are a godsend.
Thanks man - I'm glad! 🔥
Thank you for creating such an outstanding tutorial! I truly appreciate your effort in making this video.
You are so welcome - thank you for the kind words!
Cole! You are the man! Thanks for this awesome integration!
You are so welcome! Thank you!
Dude! You deserve so many more subscribers!
Thank you man, that means a lot to me! :D
Really cool!
This was such a great video - thank you! It would be great to extend this by enabling file uploads via OpenWebUI. I'm guessing that would just mean detecting whether a file has been uploaded or not and putting it in the vector store.
The only thing that's kinda missing from being a starter kit is a fully worked-out agent that doesn't just handle a text-only document well, but that will handle a document with pictures, text, and code. Ideally spreadsheets too. Also, how to train a model on the collected vector data. Thank you for sharing the journey!
You bet! And I totally agree with you here - developing agents to bake into the starter kit for this kind of thing is certainly one of the ways I want to expand this in the future!
Another excellent video "FORWARD SLASH, FORWARD SLASH" // Not Back \ happy days
Excellent video. You read my mind. ❤
@@subasc I'm glad! Thank you very much!
Really thank you for your awesome work! 🎉
@@k3nning_w3st You are so welcome!!
This is brilliant! Just brilliant! ❤
Thank you very much!
Awesome video as always, and keep bringing n8n content please ;-)) Maybe a next n8n video with an agent swarm, where multiple agents work together to accomplish a task and show the overall output in order in Open WebUI? Like searching the web, getting back a response, then summarizing it or extracting values from it and writing them to Google Docs/Sheets, or self-hosting OnlyOffice and maybe trying to create an Excel file with that.
It would be a cool idea to just say anything in chat and have it perform actual actions ;-))
I was just about to add this as a comment. @ColeMedin you are doing amazing work, keep it up. I am just about to replace my current n8n container with yours 🤣
Thank you very much - I sure will! I love your thoughts here and I'll definitely be making content like this in the near future! Maybe not the very next video but coming up for sure.
Hi Cole. Regarding additions to your project: add Node-RED, InfluxDB, and Grafana to the container.
Then input IoT data with an MQTT node in Node-RED, then webhook and/or API call from InfluxDB. That way your RAG is live data. Maybe some observability with Grafana, because it is awesome. Thanks.
This is a fantastic idea - thank you Jeffrey!!
Many thanks!
You're welcome!
Fantastic video! Now only the next step is missing: how to turn such a local environment into a production one, and how to release such an application into production on Azure/Google/AWS or any other hosting. Maybe this is a topic for another video; I would be the first customer of such content.
Thank you and yes that is a very needed next step so I will be making more content around that in the future! I do already have a video on deploying this to the cloud though as a starting point!
ruclips.net/video/259KgP3GbdE/видео.html
But more to come to make it really production ready!
You are a gem bro
Thank you Ramo!!
Thanks for your great videos Cole. I find them inspirational and I am learning lots!!
I like your enthusiastic attitude, it sometimes lifts me up if I am feeling down 🙂
There is just one little thing standing between me and total Victory!
What you have there to "clean the old vectors" does not seem to work with a local file trigger :-(
Probably you don't have to make a whole video just for that one node.
If you could just somehow share that bit of updated code, that would be fantastic!! Even better if it is in Python instead of JavaScript (if at all possible).
Keep going with your excellent work, you are making a positive difference during this inflection point in humanity.
Thank you so much!! Let me look into that!
Cole! Thanks! 🤩
You bet Fredrik!!
Hey man, I'm not an expert in containers, but my intuition is that any environment you set up that is connected to the Internet is susceptible to getting hacked, so don't use your SSN as your username or turn off auth, etc., on these things. Otherwise, super cool vid!
Thanks man, and yeah I agree! I was sort of kidding haha, just to drive home the point that it is indeed running locally. And technically you can run literally everything offline as long as your n8n workflows aren't going out to the internet for things. If you use a local file trigger instead of Google Drive, for example.
Nice video Cole ! I hope that it will be possible to add pictures to the input prompt of OpenWebUI soon. In the stack, I would like to have image generation.
Thank you and yes that is one of the highest priority features to be added right now!
Yo, you're quick to address the ollama issue. props.
Love your content Cole, any chance you'd do something to show how to put this in production on a Railway or AWS or something?
I do cover deploying this to the cloud here!
ruclips.net/video/259KgP3GbdE/видео.html
You are awesome!!!
Thank you Cole! 😊
You bet!! :D
This is awesome!
Thank you very much!
This is awesome! 🎉🎉 can you add langfuse to the stack please
Thank you! Langfuse is one of the next additions I am considering!
Phenomenal!
Thank you!
Cole Medin, I thank you for your efforts in showing us ways to utilize Docker with n8n, Ollama, Postgres, and Qdrant... however, Google IS NOT 100% LOCAL and offline. I thought this was a purely OFFLINE, LOCAL setup. I'm very new to this, and so far yours has been the BEST. I just got lost when you went to GOOGLE. I need a step-by-step for local documents WITHOUT GOOGLE. Can you help with that? Thanks. Better yet, an n8n flow that already has the local input instead of Google Docs.
You are welcome and I hear you on this! I would look into using the "local file trigger" with n8n!
So grateful to you for sharing info about n8n and local LLMs. I would be extremely grateful if you could follow up with how to use cloud-based tools for embeddings.
Super excited to move forward on all this but from what I can tell my desktop machine (Mac mini M2 with 16gb ram) is just not powerful enough to do the embeddings.
I went through your incredibly helpful video from a few weeks back on setting up n8n. I got it all set up, but my machine kept timing out when it tried to do the embeddings. I WAS able to get embeddings to work with INCREDIBLY short documents, i.e. a text file that was two lines long. But anything longer would time out after 5 minutes.
I'm super new at this but I've seen some discussion online of using cloud based options to do the embeddings? Would be super grateful for a video on that.
It's my pleasure! For cloud based embeddings I would recommend using OpenAI embeddings, you just need an OpenAI API key and it's supported as an embedding option in n8n!
Also which embedding model are you using? There are a lot of options and you could always try another smaller one for making it quick locally!
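For anyone who wants to see what that cloud-based embedding call looks like outside of n8n, here is a minimal sketch using the official OpenAI Python client (n8n's OpenAI embeddings node makes the equivalent request under the hood; the model name and the assumption that OPENAI_API_KEY is set in your environment are mine):

```python
# Minimal sketch: cloud-based embeddings via the OpenAI API.
# Assumes OPENAI_API_KEY is set in the environment.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY automatically

response = client.embeddings.create(
    model="text-embedding-3-small",  # a small, inexpensive embedding model
    input="A paragraph from one of your documents.",
)

vector = response.data[0].embedding  # a list of floats for your vector store
print(f"embedding dimension: {len(vector)}")
```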
Thanks much @@ColeMedin RE: the model I'm using, I should have mentioned that I'm using nomic-embed-text. I went through your helpful video on Run ALL Your AI Locally in Minutes. The video was great and I got n8n all set up with Ollama, but, as I said, my Mac mini kept timing out when I tried to set up embeddings for documents more than a few lines long.
Okay yeah, sorry, I forgot you already mentioned that! I would suggest going to the Ollama site and searching through their embedding models; there are lots of fast options to try there!
This is GOLD! Thank you so much, Cole!
Is anyone else getting |python tags| rather than the actual message from the tool when triggering the agent via chat?
You bet!!
I've had this happen before, generally with smaller models, because they hallucinate since there are actually quite a lot of instructions under the hood for RAG in n8n.
Hi, Cole Medin, thank you for the awesome work. Could you do a demo showing how an LLM can answer questions about a sales database for example?
Thank you! And I appreciate the suggestion! Could you elaborate a bit more on what this use case would look like in your mind?
@@ColeMedin It's basically the same thing you did, but instead of the data source being a document, it's an Excel table with columns like product, customer, date, and sale price. The goal would be to ask for specific information about products, such as total sales in a given period.
Oh yeah, I'll actually be making a video on this in the near future with CSV/spreadsheet RAG!
Good stuff, sir! A next video could have something for RAG working with SQL command generation, where it tries to reprocess if the SQL command gets field errors.
Thank you Mike and great suggestion!
Damn dude, thank you so much!
@@orthodox_gentleman Haha you bet man!
This is amazing! Do you plan on adding the ability to add multiple agents?
Thank you so much for this. I'm very new to this setup, can you guide us on how to work with a local folder with say my Obsidian notes?
You are so welcome! You can really just switch out the Google Drive triggers for the local file trigger to work with your local files!
docs.n8n.io/integrations/builtin/core-nodes/n8n-nodes-base.localfiletrigger/
i would like to know this as well
Greatness
First off, great video set. Thank you for doing these. I will say one thing, these are making me work more than expected. I didn't realize how rusty I am on all these items until I started trying to work my way through them. Still haven't gotten it completely off the ground but working through it. Do you have something created for setting the local file triggers up in N8N instead of using Google?
Thank you very much! And that's partially on me - I am working on making this entire package easier to get up and running and work with. I don't have a video using the local file trigger instead of Google drive but I'm looking to do that in the near future!
@@ColeMedin I need to just knuckle down and figure out how to connect to Google drive and be done with it. I was doing some research and doing a local repository will be an interesting problem to tackle with n8n running in a container.
You rock!!
Thank you!! :D
n8n guru, I swear..
Haha I appreciate it! 😀
Thank you for the great tutorial, it inspired me a lot.
I followed your tutorial, but i have some issues;
The AI is hallucinating, even when the result of the Vector Store Tool is OK. So in my case, the Vector Store Tool has the correct information, but the AI Agent couldn't find any information...
Maybe I have missed a few steps.
What is your AI Agent System message?
Do you have a Description in the Vector Store Tool?
Thanks for the great tutorials and your help :-)
Keep on working :-)
I'm glad - thank you! Smaller LLMs will give bad results sometimes even if the right info is given from the RAG part of the workflow - I've noticed that. Which model are you using?
@@ColeMedin ah, ok. I have used llama3.1 8B and llama 3.2 3B. I'll try it with ChatGPT ...
Great work man! I am all set up and ready to start ingesting documents!! Quick question: in your previous video you touched on the issue with the response adding in the headings. I think you mentioned we can clear that with a prompt directive or something. Thanks!
Thanks man! Yeah those kind of weird responses are typically from smaller models that get confused by the n8n tool calling prompting that happens under the hood. So you can prompt the model to say something like "don't output tool call syntax" or something like that. I know that's pretty general but you can mess around with what gets the responses you want!
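As a hedged example (adapt the wording to your own agent), a system-message directive along these lines tends to help:

```
You are a helpful assistant. Use your tools to look up information, but your
final answer must be plain natural language only. Never output raw tool-call
syntax such as tool.call(...) or |python tags| in your response.
```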
@ awesome! I'll try that. Curious if you've adapted the workflow to ingest documents from a locally bound share? I would like to do this, but I think the mods to the workflow may need a bit of work since it won't be on a Google Drive.
Guess I’m going to need to really learn n8n now. lol
I haven't yet but I know there are local file triggers to help you with that!
@@ColeMedin yeah I got local file trigger working, but I’m not sure it can ingest .pdf documents.
Mighty powerful, great video! I'm getting a fetch failure in the Ollama embedding at the 5-minute mark, even though the connectivity check is successful. Can you demonstrate how to batch a document in case I'm running out of memory? Thanks & cheers
Thank you! If your documents are too large I would try splitting them based on paragraphs (maybe like 20 paragraphs at a time) and inserting those one at a time into the vector database.
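Here is a minimal sketch of that batching idea in Python; the splitting is naive (blank-line paragraphs) and the insert step is a placeholder for whatever embedding/vector-store calls you use:

```python
# Minimal sketch: batch a large document ~20 paragraphs at a time so each
# embedding request stays small enough to avoid timeouts or out-of-memory.
def paragraph_batches(text: str, batch_size: int = 20):
    """Yield the document in chunks of roughly `batch_size` paragraphs."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    for i in range(0, len(paragraphs), batch_size):
        yield "\n\n".join(paragraphs[i : i + batch_size])

with open("large_document.txt", encoding="utf-8") as f:
    document = f.read()

for chunk in paragraph_batches(document):
    # embed_and_insert(chunk)  # hypothetical: embed chunk, insert into the DB
    print(f"would insert a chunk of {len(chunk)} characters")
```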
Hi Cole, awesome, awesome, awesome!!! I managed to configure it, thanks a lot! When I run it (a simple hello), the title of the chat is in Spanish. No big deal of course, but do you know how to get it to behave like yours, just English? Thanks again!
Thank you Matt! I'm actually really not sure why the title would be in Spanish... is there a setting on your system that sets the default language to Spanish or something like that?
Great video. Liked and subscribed. Surprised this vid has 18k views and barely 1k likes??? Nobody tips anymore... So Open WebUI has a code editor. Is it any good? Could we use the Bolt fork on your GitHub page instead? Lmk, & keep iteratin'.
Thank you so much! Honestly 1k likes for 18k views is pretty good across RUclips so I'm happy! Thanks though haha
Honestly I haven't played around with the code editor in Open WebUI, but I would certainly like to try it out! The Bolt.new fork is great for iterating on projects initially - I would highly suggest checking it out!
Thank you :-) Please tell me how I get the 'Documents' and 'Prompts' tabs on the left below Workspace.
You are welcome! I'm not sure what tabs you are referring to, is this within Open WebUI? I haven't used those!
It would be awesome if you replaced vanilla Postgres with SupaBase in the docker file!
Thank you so much for the support Marc! I agree that's a smart move, and I am actually planning on doing that in the future!
@ColeMedin awesome! I'm anxiously looking forward to it!
Can you upload or create a video showing how to replace the Google Drive workflow to local files for truly local?
I am definitely considering making a video on that!
Great video. Can you add SearXNG to allow anonymous web search from Open WebUI?
@@joepalovick1915 Thank you and I appreciate the suggestion! I'm going to create a list of improvements to make and add this to it!
Greeaat, you are the best bro! Do you know any source where I can improve my skills in prompt engineering for agents and function calling? I’m really needing to get better at this point
Thank you Arthur! Here is my favorite resource for learning prompt engineering in a really concise way:
www.promptingguide.ai/
OWUI lists integration with LMStudio - would be nice to see this included in the stack (or is there too much of an overlap?)
Yeah I'd say they are super similar. But maybe there is a place for it!
Do you have a Discord or similar setup for users to talk about your various projects, help each other, and offer advice on how to improve certain functions, report bugs, etc? If not this would be an incredibly important asset for you and your work, I think!
Thank you for mentioning this! I don't yet but I am actually building up a community platform behind the scenes right now that will be released soon!
can you add to this docker the stable diffusion connected to open-webui
Great tutorial thank you! How do we get past the daemon error during the initial docker compose if I already have open-webui installed at localhost:3000?
Thanks! You can remove Open WebUI from the docker-compose file - just take out the service for it and then rerun the command to start the stack!
@@ColeMedin thanks
Phenomenal. Are all these instructions the same when self-hosting in the cloud with, say, DigitalOcean? What specs would you recommend buying to run this efficiently?
Thank you and yes the self hosting instructions remain the same! The hardware depends on what models you want to run with Ollama. Aside from that though you can pretty much use any DigitalOcean droplet to run this for yourself! You'll just need a good GPU to run the LLMs locally.
@@ColeMedin Thank you Cole! You really are the absolute best
You bet - thank you!!
Hey Cole, love it! I am not a developer, but I am researching a way to serve the agent as an OpenAI-compatible endpoint; this would make the integration easier. I am not sure, it's just an idea. Do you think it's doable?
You are awesome... thanks mate.
Please would you be able to show how to integrate LlamaParse in the n8n workflow for parsing complex pdfs. i'm struggling. thanks
You bet Tony!!
I haven't used LlamaParse before actually but if I ever give it a shot I'll certainly consider making a video on integrating it with n8n!
thanks mate.... much appreciated
You bet!
This is a freakin' awesome project!!!!
I'm having issues getting my NVIDIA GeForce RTX 2080 Ti to be used. Any guidance?
Thank you! Ollama should use your GPU by default. Are you saying it isn't and the LLMs are running on your CPU?
Is there a way to also include a web interface to Postgres, like adding pgAdmin? Otherwise, awesome kit!
Yeah anything open source you could add in another container to this! I might be doing this soon actually!
Thanks Cole. I cannot make Open WebUI communicate with the webhook. Do I need to activate pipelines in order to have this working?
You bet! What is the error you are getting? This is a function not a pipeline so you shouldn't have to.
Is there a workflow that I can use which doesn't include the ollama part? I have ollama installed, I'd like to just use the rest of the package.
Great tutorial cole! Thanks a lot!
Unfortunately, when I run requests through WebUI I get weird answers.
Example:
Input: Hi! Can you give me some information about dogs?
WebUI Output using N8N Pipe: tool.call("dog")
Any idea about the culprit?
Thank you!
Yeah I've seen this happen before - typically it is because I'm using a smaller LLM that isn't powerful enough to handle the larger prompt for RAG. Which model are you using?
Also for Ollama models, it can be helpful to increase the context window size. That's under "Context Length" in the options for the Ollama node in n8n.
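For reference, Ollama's REST API exposes that same setting as num_ctx, which (as far as I know) is what n8n's "Context Length" option maps to. A minimal sketch calling the API directly, assuming Ollama on its default port with llama3.1 pulled:

```python
# Minimal sketch: raise the context window (num_ctx) on a direct Ollama call.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3.1",
        "prompt": "Answer using the retrieved context below...",
        "options": {"num_ctx": 8192},  # the default is much smaller
        "stream": False,
    },
    timeout=300,
)
print(resp.json()["response"])
```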
Hello Cole,
First of all great video!
I set up everything with your local AI package and imported the Open WebUI function that uses the n8n workflow.
I have a problem with it though: every time I use the function/pipe, it calls the n8n workflow 2-3 times, with the second and third calls being for title generation, summary generation, tags, or something. Is there any way to prevent this? I guess the best way would be if those extra calls went to a normal LLM rather than to the n8n workflow.
Another problem I have is that the chat is not updating/showing status at all. I have to reload the chat when it's finished to see the results, and I cannot open other chats while the functions are running.
Would be really thankful for your expertise here
Thank you! I haven't actually experienced either of these before. Are you sure the n8n workflow function is being called for the title/summary generation? Seems super weird it would do that... might be worth posting on the Open WebUI forum since potentially an update on their end made this happen.
In case it helps anyone trying to make a custom model with this as well: it seems that the way it ties the session ID to the first prompt conflicts with setting a system prompt for the model, since the first part of the prompt never changes then.
Yes that is true, thank you for bringing this up!
Cole, do some videos about agents "talking" to SQL databases <3
I actually have already put out a video doing this with Swarm! And I certainly will with other setups as well.
ruclips.net/video/q7_5eCmu0MY/видео.html
Super video 👍 I have the Apple Silicon Mac version up and running with local Ollama and without Open WebUI.
Will an update install Open WebUI without interfering with my local Ollama? Thanks a lot
Hey Cole. For additional integrations to the AI toolkit: add Node-RED, InfluxDB, and Grafana to the Docker container. Add data to InfluxDB via Node-RED with an MQTT node or similar, then webhook to the InfluxDB API. That would be unbelievable, as your RAG could use live data. Many, many possibilities, with some visualization in Grafana just for more capability. Nobody's got such a system.
@@jeffreymoore1431 Boy that is a really great idea (ok I use all of it :) yes live data would be killer. great, great, great 👍
@@jeffreymoore1431 cool, super idea👍
Thank you and great question! This all runs in containers so it won't interfere with what you have running already!
Fantastic idea - thanks Jeffrey!
Great video, thanks. Looking forward to using this on my Mac, but I am a bit stuck. I ran everything as per the video and README, yet I get an error when n8n tries to start up; it seems to be complaining that the encryption key has a mismatch. Any ideas on what I have done wrong? I have looked everywhere.
You are welcome! For that error I would delete the credentials and recreate them, it's probably because those are the default credentials when downloading the repo.
@@ColeMedin really appreciate your response.
Have not had a chance to try yet; I need to work out where the creds are stored so I can delete them.
Hi Cole,
How do I read multiple files in this scenario? Currently it is reading only a single file; is there a way to read multiple files and store them in the vector database?
Yeah of course! I assume you mean multiple files that are uploaded at once? The only thing you have to edit in this workflow to make it possible is to add a loop after the Google Drive trigger so that it can loop over all the docs uploaded instead of just processing one. The rest of the flow remains the same since it's the same process to handle each file in the loop one at a time.
Cole what if we have the original AI Starter kit already installed and OpenWeb UI installed separately? Is it possible to do an upgrade so I don't lose everything like models and custom settings in OpenWeb UI? Or should I just delete the old installs and go fresh? Thanks for the effort! Where can I support you?
Fantastic question! You could continue to use your Open WebUI instance and just point it to the N8N endpoints you want to use from the local AI starter kit! You would just use localhost for the URL instead of n8n for the webhooks, assuming Open WebUI is running directly on your machine.
Thank you for asking about supporting me! Right now I am building a community behind the scenes which I will be releasing soon. Being a part of that will be the best way to support me! It means a lot! :D
@@ColeMedin Awesome, thank you for the response! Well as soon as you get the community finished I’ll be first in line to sign up!
You bet! I appreciate it!
Thank you, Cole, for the amazing video! I have a question: when I use Open WebUI with an n8n pipe (Webhook), the workflow runs three times. The first time it uses my chatInput from Open WebUI, but the second and third times it seems to send default messages like "Create a concise, 3-5 word title with an emoji..." and "### Task:
Generate 1-3 broad tags...".
Could you explain why this happens?
Thanks again for your great work!
Oops, I see an option within Open WebUI, and everything works perfectly! Thanks a lot
You bet, thank you! Glad you figured it out!
Awesome work here Mr. Medin. My question is about using a self-hosted Nextcloud instance, but I have an issue getting past authentication for a local instance of Nextcloud and Docker. Please assist if possible. Thanks in advance
Thank you! What is the error you are seeing?
Thank you so much for this tutorial, Cole! That being said, I do need some help. I am trying to run the "command-r" LLM instead of llama3.1. I changed the YAML file, but it seems like Docker only likes to pull the llama model, not command-r. How would you go about running the other LLM? Are there any other dependencies or alterations I need to make? Once again, thanks for your help!
You are so welcome! You can run the Ollama pull command for the command-r LLM while the container is running to pull it! You just have to access the terminal for the container (super easy to do in Docker Desktop) and run the Ollama commands like usual.
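A sketch of those commands from the host, assuming the Ollama container is named "ollama" as in the starter kit's compose file (check with docker ps):

```bash
# Pull an additional model into the running Ollama container
docker exec -it ollama ollama pull command-r

# Confirm it is available
docker exec -it ollama ollama list
```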
Hi Cole, would you suggest using Supabase or PostgreSQL directly for the vector database, or is it better to use Qdrant?
I would recommend using Supabase and I'm even planning on adding it to the local AI starter kit in place of Postgres and Qdrant! It's just such an awesome platform
What would be the best way to keep these separate projects up to date within their docker containers?
If there is an update to one of the containers you can stop all at once with Docker compose, pull the updates, and then restart it again without losing anything you've set up in the containers!
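For anyone who wants the concrete commands, a minimal sketch (run from the directory containing the compose file; named volumes preserve your workflows, databases, and chats):

```bash
docker compose down   # stop the stack
docker compose pull   # fetch updated images
docker compose up -d  # restart with the new images, data intact
```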
I like Supabase; is Qdrant better?
Great question! I love Supabase as well and for RAG it will do just as well as Qdrant until you start to get millions of records. That's when a dedicated vector DB like Qdrant or Pinecone starts to perform better than using PGVector with Supabase. But until then, I honestly would prefer Supabase because then you also have the platform for auth and SQL as well!
I have a significant request for you as the author of the channel. Could you test Apple computers with M4 Pro, M4 Max, and M2 Ultra chips, equipped with 48, 64, or 128 GB of RAM, to determine the maximum size of local LLMs that can be practically used with acceptable performance? The question is whether it makes sense to invest in the 128 GB versions, or if 64 or even 48 GB would be sufficient, considering that larger models might be unusable due to insufficient computational power.
As an LLM user, I’ve encountered the issue where I cannot properly deploy models larger than 14B on my 12 GB GPU. I am particularly interested in the practical use of LLMs on such machines. Models of 14B are not satisfactory for me due to their limitations in accuracy and capabilities.
I would test if I had Apple computers! But I don't unfortunately
@@ColeMedin I'd really appreciate it, and I think the rest of the subscribers would too!
Yeah for sure! They are pretty expensive though haha
24:16 is the voice model test. It kinda feels like text-to-speech software that was added on top, not really a voice model lol, not like ChatGPT's short and human-like answers. Needs work for sure, but could be useful to blind people.
Yeah it's meant to be more of an example at this point, I know the Open WebUI team is working on improving it though!
I need to set this up for a personal study group with university material, and I would like to share it with some friends. How would you go about hosting this solution on GCP, for example? What are the minimum VM prerequisites to run this smoothly?
Sorry if this goes too far beyond the local solution, but I would appreciate any guidance on how to do this
Fantastic question! It depends a ton on the LLM you want to use. If you want to use smaller models like Llama 3.1 8B, you probably only need a GPU with 8GB of VRAM. For something like a 32B parameter model, you would want a GPU like the 3090 with 24GB of VRAM. There are a lot of cloud providers out there that will provide you with a machine with these kinds of specs!
what happens to the bolt.new fork?
@@martinhltr The Bolt.new fork is going strong! Actually more updates on that tomorrow!
I'll still be posting other content like this each week but it doesn't mean the Bolt.new fork is going anywhere!
Easy to host all this somewhere? Render?
Great question! Yes you can easily host this somewhere with any platform like Render, DigitalOcean, AWS, etc. I have a tutorial on my channel for self hosting apps like this (using the local AI starter kit as an example actually) that will basically apply to any platform you want to use:
ruclips.net/video/259KgP3GbdE/видео.html
I would like to know how to use a local file instead of Google Drive, like maybe pointing it to an Obsidian folder. It would also be nice if it could point to a folder on another computer or a NAS.
I would look into the "Local file trigger" in n8n to do this! It's certainly doable
Assuming the amount of "RAG data" you can give the model is limited by the context window, I'm sceptical how powerful RAG is, unless context windows increase in size significantly. Beyond Gemini Pro even.
It definitely depends a lot on the use case and what you are trying to pick out of your data! If you want to use RAG to summarize entire documents, then yes, the context window will be an issue. But if you want to pick out a small piece of info from a bunch of documents, that is where RAG rocks.
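To make that concrete, here is a minimal sketch of the retrieval step: no matter how many chunks are stored, only the top-k most similar ones are injected into the prompt, so corpus size is decoupled from the context window (the random vectors stand in for real embeddings):

```python
# Minimal sketch: top-k retrieval keeps the prompt small regardless of corpus size.
import numpy as np

def top_k_chunks(query_vec, chunk_vecs, chunks, k=4):
    """Return the k chunks most similar to the query by cosine similarity."""
    sims = chunk_vecs @ query_vec / (
        np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(query_vec)
    )
    return [chunks[i] for i in np.argsort(sims)[::-1][:k]]

chunks = ["Dogs are mammals.", "Paris is in France.", "Dogs bark at strangers."]
chunk_vecs = np.random.rand(len(chunks), 8)  # stand-in for real embeddings
query_vec = np.random.rand(8)

print(top_k_chunks(query_vec, chunk_vecs, chunks, k=2))
```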
Do you have any setup steps for using clouflare tunnels? I have it running for n8n on a subdomain. How can I do the same for open webui?
UPDATE: Nevermind, I've successfully setup the ssh tunnel for open webui
You mentioned using local files for input, but how about using a webhook to ingest the RAG data?
Could you expand more on what you mean by this? Sounds interesting for sure man!
@ColeMedin sure! In my head I'm thinking of a web form where a user could upload the document and enter any relevant metadata; the submit action would trigger the addition to the vector database, and probably store the file for reference later if necessary.
Oohhhh I gotcha, yeah I love that idea! Something I am actually doing for a client right now.
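A minimal sketch of what that submit action could send to an n8n Webhook node (the URL path and field names here are hypothetical; match them to your own webhook configuration):

```python
# Minimal sketch: POST a document plus metadata to an n8n webhook for ingestion.
import requests

N8N_WEBHOOK_URL = "http://localhost:5678/webhook/ingest-document"  # assumption

with open("report.pdf", "rb") as f:
    resp = requests.post(
        N8N_WEBHOOK_URL,
        files={"file": ("report.pdf", f, "application/pdf")},
        data={"client_id": "acme", "category": "finance"},  # example metadata
        timeout=60,
    )

print(resp.status_code, resp.text)
```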
Please make a new one with Browser Use as an addition.
Almost everything is working, but when I activate the function, it doesn't show up as an AI model, and I'm not sure what you did to make it appear there. It's active and configured, n8n is working and tested, but the function isn't showing up as a model like it does in your setup. I don't know why this is happening.
Same experience with the function. It does not show up in the list of webUI models. I created a new function and pasted the code in to create the function.
Yeah I've had this happen before but all I had to do was refresh the page and it showed up!
@@ColeMedin The name of the class, I changed it to Filter... but it does not start the n8n job.
Sorry could you clarify?
@@ColeMedin When I imported the function, even after updating everything, the model with the name N8N Pipe didn't show up. It was imported correctly, but nothing happened. In another video, I saw that it was possible to create a custom model and insert the desired function into this model, which I did. However, the imported function didn't appear in the model's properties. When editing another piece of function code, I noticed that the main class you used as Pipe was called Filter, so the WebUI understood the function, and it started appearing when creating a custom model. But even then, running the model with the customized function, N8N wasn't triggered. So, I understood that there was something wrong with this flow. That's what I tried because simply updating it wasn't working for me at all.
That's really awesome, thank you. I haven't tested it yet as I'm using Open WebUI, but with live APIs like GPT, Gemini, and OpenAI-compatible providers like DeepInfra (Together AI also), which I'm currently using (DeepInfra) for their large models like the 405B. Before starting this process, what should I change to use the API with an API token within n8n instead of using local models?
You are welcome! Love all the models you are using here! You can change out the model nodes (and embedding nodes as well if you want) in the n8n workflows to use other models like GPT or Claude. They support quite a few options!
@@ColeMedin Thanks 🙏 I will try yo modify it to make it work. ❤️
You bet!
I've got an idea for a full-stack orchestrator with n8n, but not react and supabase, would you like to talk about it? :) I know that we might need about 15 agents in total :D.
Sounds super cool! I'm always interested to hear more about projects like this :)
I really don't understand... doesn't Open WebUI already include RAG by default with a few settings?
Just asking to know if I can skip this video and just connect OWUI to the OpenAI API and add documents to the "Knowledge" section in OWUI
Yes, Open WebUI does have RAG functionality, and it works pretty well! But for any kind of customization you want to get better RAG performance you'll want to implement your own "RAG pipeline", and that's where this tutorial is useful. For example, you might want to segment your knowledgebase for different clients and you can do that with a RAG setup you create yourself but to my knowledge you can't with the basic RAG in Open WebUI.
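To make that per-client segmentation concrete, here is a minimal sketch of a metadata-filtered Qdrant search, which is the kind of control a custom pipeline gives you (the collection name, payload field, and vector size are assumptions for illustration):

```python
# Minimal sketch: restrict RAG retrieval to one client's documents in Qdrant.
from qdrant_client import QdrantClient
from qdrant_client.models import Filter, FieldCondition, MatchValue

client = QdrantClient(url="http://localhost:6333")

query_vec = [0.0] * 768  # placeholder; use your embedding model's output here

hits = client.search(
    collection_name="documents",  # assumed collection name
    query_vector=query_vec,
    query_filter=Filter(
        must=[FieldCondition(key="client_id", match=MatchValue(value="acme"))]
    ),
    limit=4,
)

for hit in hits:
    print(hit.score, hit.payload)
```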