I went to bed last night thinking about what other ways we could train the models, other than feeding the documents into the prompt - and woke up this morning with this video at the top of my suggested videos on RUclips!
My radio in my car tells me what I'm thinking all the time! It's been listening to me for a long time
I guess it knows me
@@B33t_R007 depends on the model you're using and the level of quantization. In truth, when buying a PC for AI, GPU VRAM capacity is more important than CPU RAM.
Go for a PC with 64GB of VRAM and you could run a quantized Llama 90B (at half precision/bfloat16 a 90B model would actually need roughly 180GB, so you'd want a ~4-bit quant).
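As a rough sanity check on those numbers (my own back-of-the-envelope math, not from the video): VRAM needed is roughly parameter count times bytes per weight, plus overhead for the KV cache and context.

    # Rough VRAM estimate: parameters x bytes/weight (+ ~10-20% overhead)
    #   90B @ bf16 (2 bytes/weight)  -> ~180 GB: multi-GPU territory
    #   90B @ Q4 (~0.5 bytes/weight) -> ~45 GB: fits on a 64 GB card
    # Ollama ships pre-quantized tags; the exact tag names below are
    # assumptions, so check the model library for what's actually available:
    ollama run llama3.2-vision:90b            # default tag, roughly Q4
    ollama run llama3.1:70b-instruct-q4_K_M   # explicitly request a Q4 variant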
Wow, Google really knows you.
LOL, white text on blue background - watching the 70B model generate was like a total flashback to the bulletin board days on dialup/early ISPs.
I appreciate the practical hands-on approach to your AI videos. So many AI presentations are 90% hype (or more).
Agreed! And those that aren't hype require three years of careful study in the field before you can understand them.
This video is perfect for just showing me how to accomplish what I want to get done.
Dave: you can "bind mount" the documents directory into the container FS. Bind mount is docker-speak for "reflect a piece of the host FS into the container FS".
Dave, you've been reading my mind lately on topics I want to know more about. Thanks for making this!
I've started playing around with a local LLM based on your other video. This was very helpful. Thanks!
Love this - I'm hoping to set up my own in the next year!
I'll need to watch this back and take notes! Cheers!
I also loved the part when your blinds momentarily lost connection to wifi at 11:12.
For years I’ve stored documentation relevant to my domain - more than I could read in ten lifetimes - and cherry-picked from it to build conferences and workshops. For some time I knew that a local model working on your own data set could be useful. But today, seeing you explain it in your video, it became obvious that I should make that move! Thank you for your help. I subscribed to watch more about the local setup of a Llama model.
This is awesome, what a time to be alive! Much appreciated, thank you Dave!
Great demo. It shows that you had the knowledge to know when it gave a subpar answer. People going in without that knowledge won't have that ability though. It again shows that these are assistants that may need help, not replacements, as some are pitching LLMs to be.
Dave, this is great! I walked through everything this weekend, but my web-ui looked different. I didn't have a scan button under documents, but under workspace I had a "knowledge" tab where I would upload documents to my custom model. I think they must have just changed this functionality to make it easier to upload documents without copying to the backend because I could upload the documents from the web-ui and it automatically put them in an "upload" directory and used the document as a reference the same as your video. Overall, I think the change is a huge improvement. Thanks for putting this together, because I was able to use it to host my own AI at home using ollama and web-ui and build custom models using my own reference document. Keep up the great work!
Software updates and changes occur often due to the rapid development of AI tools, so every step in this tutorial could be obsolete within days, weeks, or months...
@Dave's Garage I'm playing with this myself now. I'm having success adding docs via the Knowledge section for the workspace. Then, when making a new model, it can reference that doc the same as when you scanned the folder. I did that because I am running the UI in a docker container. I know how to find and populate the volume mount, but I wasn't seeing a button to scan the directory. The Knowledge section is another way to add the docs! Great video!
It is insane how fast AI is moving. Since you made the video Open WebUI have released an update for knowledge and document management, making it easier (you can now upload directly from the web interface)!!
Thanks for making these videos, I love how clear and concise they are, as well as just entertaining!
For users who can't find the document tab in Workspace in V0.3.12, it is now called Knowledge and you can just upload your documents here even while in Docker. In Workspace, create your Model as per normal but choose the Knowledge Base that you've created and uploaded files into.
Everything else works as normal.
Absolutely awesome content you are pushing out on this Dave, thank you so much!
I'm a little worker bee at Apple with 0 programming skills, and I'm using what I'm picking up from you to try and make our department a custom AI for some things.
Thank you!
Thanks Dave! RAG now makes a lot more sense to me. This sounds like a way that would actually make AI LLMs useful to me.
RAG is a godsend. And it's necessary, since new knowledge isn't in the LLM's training data - let alone your company's specific knowledge.
Thank you. This is not self-explanatory but your video helped me make this work.
You can also map a local directory into a docker container at a specified path with the --volume flag.
So there are no limits to the size that can be uploaded? A drive is usually tens or hundreds of GB.
Dave... greetings from Uruguay! I must say... you're the man! Thanks for this one!!!
Thanks for the friendly introduction to this topic, Dave. Quite keen to try this stuff out now.
I have a Wix site with chat that I’d love to use my own LLM for. This is a terrific start to figuring all that out. Thanks Dave!
30 seconds into the video = LIKE - what an intro Dave. Love it!
Hey Dave, your vids are value packed - so much good info in under 20 minutes. I'd love to see a similar vid on AI image analysis - perhaps for use with security cameras, or with 3D printers to detect foul-ups etc.
Wow! PDP11??? I did my summer practice on a PDP8 + Donner Analogue in 1970. And met my first PDP11 some years later when I started to work at DEC. Nice video and nice memories 😊 Thanks!
This is where I wanted AI to be two years ago. Great video.
Once Sales directors, Project Managers and Product Owners realise that their company needs reliable and reviewed documentation to leverage LLMs, the nightmare of technical documentation will begin. I'm pro manuals and pristine technical documentation, but the majority of engineering teams NEVER document knowledge: I guess Knowledge Management will have a resurgence.
AI can auto-generate the doc from the code.
They don't want to be replaced 😂
I asked ChatGPT whether creating a custom GPT actually places documents into the context window or if it accesses them via RAG. Here's what it said: When you create a custom GPT and upload documents, those documents aren't loaded directly into my context window or memory. Instead, they're used with a retrieval-augmented generation (RAG) approach. This means that when you ask a question, the system searches through the uploaded documents to retrieve relevant information and incorporates those specific parts into the response dynamically.
So, I only "see" what’s relevant to your query at the time, rather than having the entire document always loaded in memory. This keeps responses focused and ensures data privacy by only referencing what’s necessary for the question.
I have the same question. 16:19 The "It will be slower" part is what I am wondering about. If RAG can search and bring in the most relevant pieces from the whole set of docs, we should always include all the docs, and it should take approximately the same amount of time.
New Subscriber and Former DEC PDP-11 Software Engineer!
Thanx Dave. I think I finally found a way to get rid of all those zillions of user manuals for all kinds of equipment (from kitchen to garden and everything in between), and still find an answer when I really need to use a manual.
Thanks, that video was actually very helpful. I am in the midst of working out different ways to use our local documents and chat with them, preferably with a locally hosted LLM.
Perfect explanation, thanks a lot! I did not know that open-webUI can do RAG, again thanks!
So cool - you created a PDP-11 expert. I am old enough to remember this type of computer. The video was also very informative. Thank you.
Love your clear explanation, and I might not have heard you correctly on why you wanted to run Ollama locally rather than in a container... but you can just mount a volume to the docs rather than run locally. That would enable maximum flexibility.
Dave, I asked for this exact setup you're talking about - don't know if you saw my comment, but either way I REALLY appreciate this video. It will help me create my own custom base of information for computer parts for gaming, office, and server use, etc. SO GRACIOUS!!!
great stuff Dave, lots of things to try. Thanks for the direction.
Thank you for the clear instruction and still bringing back memories of PDP-11s.
Thanks for being so clear and detailed in your presentation. Even someone like me without formal tech training can understand and follow the steps easily. 🙌 Liked. Subscribed. Waiting for more!
Re: RAG - This is the first time anyone has shown me an actual USEFUL reason to use AI (aside from "Wow! It's really neat!"). Thanks Dave!
I am both stunned and very very apprehensive. AI is beyond human.
Another great video, Dave! Have you thought about covering Prompt Engineering? It would be awesome to hear your take on techniques like Shot Prompting, Chain-of-Thought (CoT), and working with AI Agents, especially with where that field is heading. Thanks for all the valuable content - I love all your videos!
The tooling available today to make this so easy to do on your desktop computer is just amazing. The tech here is very dense and can be challenging to do without the tools available today.
Thank you for running the other models. There's a lack of people actually just running simple Ollama comparisons across hardware.
It took a while, but it turns out to be easy to add documents using web-ui in a docker container under Windows - just use the add documents + button. I don't know where they go, if anywhere, but it does work. ChatGPT suggested it. It wasn't obvious on my screen layout, but it is there.
NotebookLM also lets you upload documents into the context and chat with them. The interface is really great.
You speak like a news anchor and it’s awesome
thank you for the brief intro to setting up a custom GPT, online and local. but the openwebui+ollama combo as a docker container still fits - you just need to map your host folder to the openwebui container path; there's no need to COPY anything INTO the container
You could use docker and mount a physical drive/folder on the host as the doc folder, or a subfolder of it (you'd probably need to adjust the Docker setup). That way you won't need to copy anything, as long as you have access to whatever runs the docker container.
Thanks for the custom GPT demo, very useful.
Love being a sub, and love to like videos with this quality content. This is one of the most exciting projects I have got to play with in decades ... This AI stuff is the door to a whole new world of learning . LOVE IT !!!! I love that for now it is free....
I love your AI stuff. I have learned a lot about AI and how it works from your channel. THANK YOU!!!
Any chance you’d consider a deeper dive into RAG? I'd love to see more on the practical side - like how it can improve response accuracy, the nuances of retrieval techniques, and combining RAG with methods like CoT (Chain-of-Thought) and Agents. I’d be thrilled to see where you take this topic next. Thanks for all the great content!
This is very interesting to me. I recently downloaded langchain to start playing with local documents.
Your pre-recorded summary following the RAG demo was about your Custom GPT. Good vid anyway. Thanks. 18:20
Excellent, many thanks, great presentation, well done
SOLD, I'll install Open WebUI. Thanks Dave
oh, just straight and easy information. what a man, thanks
I can tell you from experience, as this is my current field of expertise, that A100s are no longer a viable platform. The H100 is current and will be quickly replaced with the H200. This means that, on the second-hand market, you can pick up an A100 relatively inexpensively for all your home lab needs.
You had my hopes up that I might be able to make running the llama3.1 70b model viable at home, but so far those things are still going for $7,000 - $25,000 used. I guess I’m waiting until the H200s are widely deployed and/or getting long in the tooth, or until I win the lottery or something. Still, good to know that day is coming! I asked ChatGPT how the model’s speed would likely compare on my system with that vs. my current 3080 and the numbers are estimated to come in between 20-150x faster. I can hardly wait! That model takes ages to run natively on my i9, but the quality of the output is approaching functional for what I’d use it for, so I’m thinking something around there is probably near my bottom of the barrel target for the time being.
Where is a good place to buy those A100s second hand? Is there a specific channel where we can find them? And how much do you estimate they will cost? Right now they are definitely outside the price range of a homelab.
@@babybirdhome FWIW the 70B version of Llama 3.1 ran on the system I have at home. I wouldn't say it was exactly fast, but it did work. Ryzen 9, 4080 Super, and 32 gigs of RAM.
@@raduboboc The A100 occasionally comes up on eBay at a lower cost, depending on supply and demand. I consider them affordable used for the home market, considering that new they cost around 12k. However, you will have two other hurdles besides the price. An A100 does not use a standard PCIe power connector, even in the PCIe form factor. It uses what is known as an EPS connector - the same as the CPU connector on a motherboard - and although you could jury-rig something together from two PCIe 8-pin adaptors to EPS, it is not mainstream right now, so something would have to be custom made. Now, let us talk about cooling. These devices are meant to be in data centers with pressurized, very high airflow and hot and cold aisles. This means that to use something like this at home, you would need to 3D-print a shroud to interface a high-CFM fan, and you may not want to hear that thing running. Also, keep in mind that the A100-PCIE-40GB is built on the 7nm (TSMC) process node, which makes it relatively old compared to even consumer video cards like the RTX 4090 on 4nm (TSMC) with more than twice the CUDA cores. I realize video memory is what you need in AI workloads, but there are confirmed reports of aftermarket-modified 4090s with 48 GB of RAM for this exact reason.
@@adamsnook9542 At what quantization level did you run it?
Q4?
Serious double take here. I worked with the PDP-11/34, recognized the reference immediately, and booted from the panel every day. 😊
Great information Dave! The latest version of Open WebUI has changed this a bit with Knowledge Collections. There is no Scan button anymore under Admin Panel --> Settings --> Documents.
So where is it?
Same, container or bare metal, seems to be gone.
Thanks Dave. This is what I was looking for.
Great video. Always great audio.
I downloaded Ollama and llama3.2 and asked it if it was running locally. No, it said, it could only run from a server, not locally. So then I pulled out my internet connection and continued to chat with it!
Hahah, nice!
If you really wanted to use your docker image you could mount the docs directory as a docker "volume" with
docker run -v <host_dir>:<container_dir>
Thanks for this Dave - great inspiration for playing around with LLMs, and maybe even getting something useful out of them :-)) One thought: if, as they claim, LLMs had already absorbed all the knowledge known to man, there should be nothing more to add… so this shows that’s just hype, like much of the rest ;-)
Very informative and entertaining! Nice one!🎉
You are an inspiration and legend!
Hi Dave - Autism & AI - I'm interested in hearing about your overlap because I see it too. We're the same age, I fall into the Asperger's camp and everything I've learned about AI parallels my autistic thought process. I'm interested in developing this relationship for the benefit of autistic others and curious to learn about the connections you've made.
Great info. Will start playing around with this info for sure.
Superb and honest, Dave. Rare.
Very useful and very clearly explained.
Excellent video, thanks Dave!
That's the information I need. RAG is what I want. Thank you.
Great starter, thank you. The info on webui is out of date now - I had to create a Knowledge base instead. It did not do great, but I'll just have to try various models.
That's going to be fascinating when the concepts settle into a viable product for enterprise and SME clients, though there are going to be some potentially messy security implications, particularly if used in a legal field (so if a firm has confidential/sensitive documentation that is fed in, or is operating with a Chinese Wall in place internally, even demonstrating the integrity of that would be awfully difficult).
I'm keen for AI like this to be used as a starting point or an "ideas expander" rather than a be-all and end-all of expertise, to that end it'd be nice if it could respond with the actual documentation link references or index links - though it doesn't seem to be built that way at this stage which is fair I guess.
I do love the notion of being able to chomp on company technical and archival documentation like this though - places that discarded or threw things out (purely for handling reasons) decided based on primitive estimations of retrieval and usage; best to err on the side of science fiction with that, I think (unless disposal is required by legislation, of course).
Fantastic content Dave. Thank you!
I had been using what would be considered more structured AI many years ago - back then it was called Ultra Hal. Today I no longer use that one; I use a combination of WordNet, ConceptNet, XGBoost, and DistilGPT2 for fluency, using much fewer resources for a wearable PC. Still playing around with it; it may be a little in the weeds for many. Like Dave is pointing out, you may need a much more powerful machine to get reasonable results. I am working with 32 and 64 GB machines for wearable applications - think Gigabyte BRIX mini PC. While there is lower-power hardware, you have to strike a balance between the application you are using it for and the results you expect; in my context that's a wearable PC. I will definitely be playing around with RAG and my setup.
Uh, wow! I had no idea this was possible. I'm about to build a gpt out of the manuals folder I keep for my music equipment. Mind blown.
I'd just use LM Studio. Has a UI, has built in RAG stuff, can easily turn on an API for local dev, etc. etc. etc.
Dude, thanks for your content. Just Great.
Excellent²*1000! Thanks a lot, great video. Thank you very much Sir, I was experimenting and you provide a lot of answers to my questions. Legend!
Scripting with Open Interpreter and Ollama in Python has been blowing my mind this week. The myriad of added value and possibilities this new toolbox provides are seemingly endless.
The retraining is called _finetuning._
If you don't need images and the OCR is good enough you can remove the images from documents, so it doesn't need to see the images.
Super interesting! A couple of practical questions:
1) How can you deal with constantly updating files? I’d like to just have my entire Obsidian directory stack accessible by RAG (including PDF attachments), but all the ongoing new documents and changes to existing ones would need to be ingested somehow. Can it do that incrementally, or would I just need to re-run the ingesting process? (Running overnight via a cron job would be fine, I’m on a Mac if that matters - and yeah, I know it’ll be slow on my M1 Pro/64gb)
2) Is there any way to have it provide a link, index or other reference to specific documents it found the answer on? My use case would be as much about access and retrieval as query and summarization.
(3 - For unrelated bonus credit if any readers might have a solution: Any good way of integrating Google Docs with local Obsidian? I don’t want to use Obsidian’s publishing function, but would like to be able to integrate individual Google Docs into my Obsidian system.)
Exactly what I needed, exactly when I needed it.
really love your work Dave. Suggestion for content - build HAL 9000 or Grok (Hitchhiker's Guide). Voice to text to a local AI model with a UI. Fun project, currently possible.
Extremely interesting. Thank you
14:28 You can mount a local drive into docker easily with -v. It's dead simple to get cwd and have docker read it.
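For example (same assumed container path as the bind-mount sketch above):

    # Mount the current working directory's docs folder into the container:
    docker run -d -p 3000:8080 \
      -v "$(pwd)/docs:/app/backend/data/docs" \
      --name open-webui ghcr.io/open-webui/open-webui:main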
Discovered you completely by chance, but I really appreciate the insights. I was wondering if there's a way to automate the parsing of documents in RAG. My hope is to aggregate my multiple cloud storages into a local, synced NAS and then use a local LLM to embed/vectorise them, etc. Any plans to build a video on this?
Watching your channel is always a fascinating and educational journey. Keep leading us into the world of your creativity and inspiring us!🌿🖐🏵
Point this at your Linux logs and have fun. It works surprisingly well when I tested this out. Oh and use a good model that handles this.
The numbers for the "tell me a story" test are quite interesting on the Big Box. FWIW my setup at home is a Ryzen 9 with a 4080 Super. It managed about half the performance of your box on the Llama 3.2 tests with 173.74 tokens/s on the short story and 168.20 on the long story. Your server trounced it on the 70B model though. The short story was a whopping 1.12 tokens/s and I really didn't have the patience to do the longer one.
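For anyone wanting to reproduce those numbers: ollama prints per-response timing stats in verbose mode, which I'm assuming is how these tokens/s figures were captured.

    ollama run llama3.2 --verbose
    # after each response it reports timings, e.g. "eval rate: 173.74 tokens/s"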
Fascinating! Thanks, Dave.. 👍
2001: Hello Dave, I can't do that .....
Thanks Dave, this is good to know!
Great stuff, THANKS! :)
OpenWebUI Docker Folder - you can start the container and map/mount an internal folder to a local one.
“Approximately correct answers”
Yup - that sums up the quality of the answers. Pretty good, but not good enough to trust it with something important.
Great as ever Dave.
9:05 do you really type that fast?
Thanks Dave, I'm buying another mug...