Feed Your OWN Documents to a Local Large Language Model!
- Published: 7 Feb 2025
- Dave explains how retraining, RAG (retrieval augmented generation) and context documents serve to expand the functionality of existing models, both local and online. For my book on the autism spectrum, check out: amzn.to/3zBinWM
Dave's Attic - Friday 4PM Podcast - @davepl
Follow me for updates!
Twitter: @davepl1968
Facebook: davepl
No nonsense, 100% information, 0 fluff videos. Thank you so much for sharing your knowledge.
also he has a very symmetrical face
so true
For years I’ve stored documentation relevant to my domain, more than I can read in ten lifetimes, cherry-picking from it to build conferences and workshops. For some time I knew that a local model working on your own data set could be useful. But today, seeing you explain it in your video, it became obvious that I should make that move! Thank you for your help. I subscribed to watch more about the local setup of a Llama model.
I'm with you! I have over 37 years of reports that I've generated and all the backup documents. I would love nothing more than to create my own local AI that can generate a report just like I would, based on the data I input from a brand-new data set, using language that I've mastered over the years. Could be a big time-saving measure!
@genejoanen Won't you need a lot of money for doing this?
but don't use ChatGPT for that; you are uploading all your knowledge to a proprietary company.
Dave: you can "bind mount" the documents directory into the container filesystem. Bind mount is Docker-speak for "reflect a piece of the host filesystem into the container filesystem."
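For anyone who wants to try that, a minimal sketch of the idea (the host path is a placeholder, and the in-container docs path may differ between Open WebUI versions):

```sh
# Run Open WebUI with the host's docs folder bind-mounted into the
# container, so new files appear without rebuilding the image.
# Host path and container name are illustrative.
docker run -d \
  --name open-webui \
  -p 3000:8080 \
  -v open-webui:/app/backend/data \
  -v /home/me/docs:/app/backend/data/docs:ro \
  ghcr.io/open-webui/open-webui:main
```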
That's what I was thinking too
This should definitely be a pinned comment.
I'm pretty sure that on macOS and Windows this has a performance overhead. On Windows you can mitigate it by being in WSL2, but he's on macOS, and I think the bind mounts are not as performant as on Linux, which can be annoying.
@robinkuster1127 No different than the volume bind mount the container runs in. Just poor performance all around compared to Linux.
Came here to say this
I appreciate the practical hands-on approach to your AI videos. So many AI presentations are 90% hype (or more).
Agreed! And those that aren't hype require three years of careful study in the field before you can understand them.
This video is perfect for just showing me how to accomplish what I want to get done.
I went to bed last night thinking about what other ways we can train the models, other than feeding the documents at the prompt - and woke up in the morning with this video at the top of my suggested videos on YouTube!
My radio in my car tells me what I'm thinking all the time! It's been listening to me for a long time
I guess it knows me
@B33t_R007 Depends on the model you're using and the level of quantization. In truth, when buying a PC for AI, GPU VRAM capacity is more important than CPU RAM.
Go for a PC with 64 GB of VRAM and you could run Llama 90B quantized to around 4 bits; at half precision (bfloat16), a 90B model needs roughly 180 GB for the weights alone.
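For anyone doing this math at home, a back-of-envelope sketch (my own rule of thumb, not from the video - the 20% overhead for KV cache and activations is an assumption):

```python
def vram_estimate_gb(params_billions: float, bits_per_weight: float,
                     overhead: float = 1.2) -> float:
    """Rough VRAM to hold the weights, padded ~20% for KV cache
    and activations. A ballpark, not an exact requirement."""
    weight_bytes = params_billions * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 1e9

print(vram_estimate_gb(90, 16))  # ~216 GB: bfloat16 won't fit in 64 GB
print(vram_estimate_gb(90, 4))   # ~54 GB: 4-bit quantization just fits
```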
Wow, Google really knows you.
Now they're reading our thoughts
hi Krishna, I am already using RAG but I am wondering the same thing, i.e., how can I train models on my own data? The video says that it's very hard to do, but did you find any way?
I've started playing around with a local LLM based on your other video. This was very helpful. Thanks!
For users who can't find the document tab in Workspace in V0.3.12, it is now called Knowledge and you can just upload your documents here even while in Docker. In Workspace, create your Model as per normal but choose the Knowledge Base that you've created and uploaded files into.
Everything else works as normal.
Using a smaller model for RAG is an interesting idea, so that it has less of its own content and relies more on the documents. I always learn something from you, thanks!
Thank you so much for taking the time to create this video. It answered so many questions for me. Completely "demystified" much of the confusion around what path is best for me personally for adding data to an LLM. I'm so glad you came up in my YouTube feed. Thanks again.
Dave, this is great! I walked through everything this weekend, but my web-ui looked different. I didn't have a scan button under documents, but under workspace I had a "knowledge" tab where I would upload documents to my custom model. I think they must have just changed this functionality to make it easier to upload documents without copying to the backend because I could upload the documents from the web-ui and it automatically put them in an "upload" directory and used the document as a reference the same as your video. Overall, I think the change is a huge improvement. Thanks for putting this together, because I was able to use it to host my own AI at home using ollama and web-ui and build custom models using my own reference document. Keep up the great work!
Software updates and changes occur often due to the rapid development of AI tools, so every step in this tutorial could be obsolete within days, weeks, or months...
It is insane how fast AI is moving. Since you made the video, Open WebUI has released an update for knowledge and document management, making it easier (you can now upload directly from the web interface)!!
Thanks for making these videos, I love how clear and concise they are, as well as just entertaining!
Great channel, really enjoy your talks. About 20 years ago I worked for a company that had full source code to Windows NT, as it was called then. I remember coming in on weekends just sifting through the code trying to understand how the operating system worked. Very complicated and very impressive. I remember one time Dave Cutler came to our company, sat down with all the engineers, and gave a talk. What a brilliant man! I have been retired now for about 10 years and am currently trying to understand large language models, so your video was very appropriate. I often wondered how RAG worked.
Dave, you've been reading my mind lately on topics I want to know more about. Thanks for making this!
I asked ChatGPT whether creating a custom GPT actually places documents into the context window or if it access them via RAG. Here's what it said: When you create a custom GPT and upload documents, those documents aren't loaded directly into my context window or memory. Instead, they're used with a retrieval-augmented generation (RAG) approach. This means that when you ask a question, the system searches through the uploaded documents to retrieve relevant information and incorporates those specific parts into the response dynamically.
So, I only "see" what’s relevant to your query at the time, rather than having the entire document always loaded in memory. This keeps responses focused and ensures data privacy by only referencing what’s necessary for the question.
I have the same question. 16:19 The "It will be slower" part is what I am wondering about. If RAG can search and bring in the most relevant pieces from the whole set of docs, we should always include all the docs, and it should take approximately the same amount of time.
I think he meant it will be slower to demo/film
@leoxiao2751 In practice, yeah, you'd dump in the whole folder of relevant docs.
How can I feed my uni notes to an LLM?
@notaras1985 The video literally tells you how. You just need to make sure your notes are typed as text on a keyboard or scanned with OCR and saved as PDFs. I don't think the LLMs can easily "read" images.
Yeah, I'm pretty sure this clarification is needed within the original video. I believe what you said is more accurate regarding how ChatGPT treats things, and it would certainly explain why I've gotten strange results. I think of the context window used in chat as immediately available short-term memory, and the uploaded documents as a long-term memory library which gets fetched as needed. I believe ChatGPT is more like a RAG tool, while other tools like NotebookLM by Google are the context-window type the video originally attributes to ChatGPT. *Correction: As I watch more, I realize they uploaded documents in the video to chat directly, which would make that usage a context document rather than RAG. But I still stand by what I say regarding the custom GPT side of things.* NotebookLM seems to incorporate documents in the context window directly, hence why it errors out when the total tokens of all documents exceed roughly 2 million as of now (due to the 2 million context window in its current model vs ChatGPT's 128k). ChatGPT can go beyond that not because of context window capacity, but because uploaded documents are more of a long-term memory library, not immediately available unless fetched. It's more like unstudied look-up material, whereas NotebookLM is studied look-up material, if you will.
Tokens (as in context window tokens) are only involved in the "short term" memory and information can be fetched in there as needed. I find most of the training process on custom GPTs actually involves optimally indexing these "long term knowledge library" documents with good labels and having good custom instructions so it accurately fetches the right parts without fail. If general models are fed inaccurate information from the internet in initial training, ChatGPT still pulls from the general inaccurate model despite being provided the information and instructions. NotebookLM, which is grounded in provided materials and everything is in the context window directly, answers far more accurately even without custom instructions! Obviously, that depends on whether custom instructions are needed to correctly interpret information provided. I believe that is due to everything being in immediate context vs the ChatGPT fetching mechanism.
I think it's more accurate, as you are implying, that ChatGPT is effectively a RAG rather than the context document due to the short vs long term memory concepts in it. NotebookLM would likely be the context document type because everything is in the context window itself and not some "knowledge library" getting fetched as needed. There is no "long term library" which gets spontaneously retrieved in NotebookLM. Yeah, I got tripped up in what Dave said about ChatGPT being a context window type, so thanks for pointing this out.
This is by far the best video you've ever done... Thanks so much!!!
I was looking for this video - WOW, thanks Dave!! Finally the YouTube suggestions actually work - I watch you all the time - subscribed, yet busy... missed this video!
There are several inaccuracies here. Options two and three are actually the same, because content is always added to the context window in real time. You can upload very large files to ChatGPT which would be inefficient - and expensive - to load into the context window at once. So in your option two, where you upload a doc to ChatGPT, it does a sort of mini-RAG: it converts and chunks the content and searches for relevant chunks that fit your query. Your example of the custom GPT is in fact full RAG with multiple documents. In the backend, OpenAI uses Azure AI Search, a vector database where your content is persisted. The documents you upload are chunked and vectorized; when you ask a question, relevant chunks are added to the context window to answer your query. Open WebUI uses a similar approach.
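A toy sketch of that chunk/vectorize/retrieve flow (bag-of-words cosine similarity stands in for a real embedding model and vector database - this is not how OpenAI or Open WebUI actually embed text, just the shape of the pipeline):

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": word counts. Real RAG uses a learned embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def chunk(doc: str, size: int = 50) -> list[str]:
    # Split a document into fixed-size word chunks.
    words = doc.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def retrieve(query: str, docs: list[str], k: int = 3) -> list[str]:
    # Score every chunk against the query; the top-k chunks are what
    # gets pasted into the LLM's context window to answer the question.
    chunks = [c for d in docs for c in chunk(d)]
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]
```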
Thanks buddy, I was thinking the same
Ah, I just asked something about this a few minutes ago. I was indeed confused because the two felt the same... both seem persistent and "RAG-ish".
Thanks Dave! RAG now makes a lot more sense to me. This sounds like a way that would actually make AI LLMs useful to me.
RAG is a godsend. And it’s necessary as new knowledge itself is not used for LLM training, and forget about your/your company’s specific knowledge
Local Deployment: First, you'd need to deploy a local instance of an LLM. This could involve downloading pre-trained models like those available from Hugging Face, Google's BERT, or others, and setting up the necessary hardware (powerful enough GPUs or TPUs for running these models efficiently).
Customization:
Fine-Tuning/Retraining: If you have specific data you want the model to know or focus on, you can fine-tune the model. This doesn't mean retraining from scratch but rather adjusting the model's parameters based on your dataset, which could be documents, texts, or any other form of data relevant to your needs. This is particularly useful for creating domain-specific applications, like legal advice, medical diagnosis, customer service in your industry, etc.
RAG (Retrieval Augmented Generation): Set up a system where your LLM can pull in additional information from a local database or document set in real-time when generating responses. This method requires you to have a retrieval system in place, like an efficient database search or vector space model for semantic search, which would find and feed relevant documents or data pieces as additional context for the LLM's responses.
Contextual Inputs: Directly inputting documents or data as context for each query (see the sketch after this list). This doesn't change the model's knowledge base permanently but allows it to generate responses based on the specific information provided for each interaction.
Tools and Libraries: Use frameworks like PyTorch, TensorFlow, or libraries specifically designed for LLMs like Transformers from Hugging Face, which provide tools for model fine-tuning, data preparation, and deployment.
Privacy and Control: Running AI models locally gives you complete control over data privacy, as there's no need to send sensitive information over the internet. This is particularly appealing for sectors like healthcare, legal, or any enterprise needing strict data governance.
Custom User Interface: Develop or adapt a user interface where users can interact with your specialized AI tool, whether it's through a web app, desktop application, or even a chatbot for internal business tools.
Integration: Your specialized AI could be integrated into existing systems or workflows, enhancing functionalities like document analysis, customer support, research assistance, etc., with tailored responses.
The challenge lies in setting up the infrastructure (hardware and software), managing the computational load, and ensuring the model's performance aligns with your specific requirements. However, with the right resources and know-how, creating such a specialized AI tool locally is not only feasible but can offer significant advantages in terms of customization, privacy, and application-specific performance.
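To make the "Contextual Inputs" option above concrete, a minimal sketch against a local Ollama server (the document filename and question are hypothetical; assumes Ollama's default port and a pulled llama3.1 model):

```python
import requests

# Stuff the document into the prompt for a single query. Nothing is
# persisted and the model's weights are unchanged - pure context.
doc = open("pdp11_handbook.txt").read()  # hypothetical document

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3.1",
        "prompt": (f"Using only this document:\n{doc}\n\n"
                   "Question: What addressing modes does the PDP-11 support?"),
        "stream": False,
    },
)
print(resp.json()["response"])
```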
What is the point of this? I see people like you commenting random generated overly wordy and unhelpful summaries of either the transcript of the video or literally just a simple question. Do you think that helps anybody? Is it just for interaction so you feel good about yourself?
@joeshmoe4207 If you do not get it... move on... nothing to see here.
Absolutely awesome content you are pushing out on this Dave, thank you so much!
I'm a little worker bee at Apple with zero programming skills, and I'm using what I'm picking up from you to try to make our department a custom AI for some things.
Thank you!
LOL, white text on blue background - watching the 70B model generate was like a total flashback to the bulletin board days on dial-up/early ISPs.
This comment reminds me I am old. BBS and the modem squeal are something that haunts me still... you've got mail.
How old are you? Never mind, I'm that age also.
A fully grown adult coworker looked at me and asked what it was like growing up in the 1900s, I nearly had a heart attack when I realized that that's how university grads refer to elders.
Thanks for the friendly introduction to this topic, Dave. Quite keen to try this stuff out now.
Outstanding! Finally, something useful on this topic... referring your channel to everyone I know! Love the approach you take to sharing information.
Dave... greetings from Uruguay! I must say... you're the man! Thanks for this one!!!
Again, wow! Your videos are incredibly helpful.
Thank you!
Dave, to get rid of the dialog box, right-click on top of it and select Inspect, select the HTML element containing the dialog box (the element should be highlighted), then right-click on it and select Delete Element. This way you can continue your screen-capture recording without the irritating box displayed.
@Dave's Garage I'm playing with this myself now. I'm having success adding docs via the Knowledge section for the workspace; then, when making a new model, it can reference that doc the same as when you scanned the folder. I did that because I am running the UI in a Docker container. I know how to find and populate the volume mount, but I wasn't seeing a button to scan the directory - the Knowledge section is another way to add the docs! Great video!
Love this - I'm hoping to set up my own in the next year!
I'll need to watch this back and take notes! Cheers!
I also loved the part when your blinds momentarily lost connection to wifi at 11:12.
Thank you... I will be visiting your shop once every day and keep learning.
Love your clear explanation, and I might not have heard you correctly on why you wanted to run Ollama locally rather than in a container... but you can just mount a volume for the docs rather than run locally. That would enable maximum flexibility.
An excellent episode Dave. Really well done. The explanation of training, augmentation and session based approaches is very useful not just for this episode but for others as well. Thanks.
Brilliant... and perfect timing showing up in my feed, thank you.
You speak like a news anchor and it’s awesome
Hey Dave, Just came across this video and oh boy! I'm in the middle of researching all of this and you just straight up gave all the answers. Absolutely Love It!
One improvement at 14:15 is that you can mount the docs folder as a volume into the docker container at that path. That way you don't need to use the COPY command in the docker file and bake it into the docker image.
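If you prefer Compose, a sketch of that same idea (image tag and paths are illustrative, and the in-container docs path may vary by Open WebUI version):

```yaml
# docker-compose.yml: bind-mount ./docs instead of COPYing into the image
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"
    volumes:
      - open-webui:/app/backend/data
      - ./docs:/app/backend/data/docs:ro
volumes:
  open-webui:
```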
Thank you for all that you do. 🖖
Hey Dave, your vids are value packed - so much good info in under 20 minutes. I'd love to see a similar vid on AI image analysis - perhaps for use with security cameras, or with 3D printers to detect foul-ups etc.
This is awesome, what a time to be alive! Much appreciated, thank you Dave!
wow, I just found your channel and you're great at explaining!
quick, simple, and straight to the points!
Totally subbed.
You can also map a local directory into a docker container at a specified path with the --volume flag.
So there are no limits to the size that can be uploaded? Drives are usually tens or hundreds of GB.
Hello Terry, could you help me with something? I work with attorneys and judges in a court of military justice in Brazil. We have a folder that contains resolutions and ongoing processes. I would like to create a knowledge collection in Web UI that keeps receiving the newly updated files they send to that same folder. I'm an intern in software development, and I have set up a server with Ollama and Web UI that responds to us at a URL.
Wow! Ever since I found out how we could download these LLMs and run them locally, I've always been wondering how we can 'train' them to know some very specific information about our focus area. Thanks for answering this, Dave!
The tooling available today to make this so easy to do on your desktop computer is just amazing. The tech here is very dense and can be challenging to do without the tools available today.
This is where I wanted AI to be two years ago. Great video.
Once Sales directors, Project Managers and Product Owners realise that their company needs reliable and reviewed documentation to leverage LLMs, the nightmare of technical documentation will begin. I'm pro manuals and pristine technical documentation, but the majority of engineering teams NEVER document knowledge: I guess Knowledge Management will have a resurgence.
AI can auto-generate the doc from the code.
They don't want to be replaced 😂
2001: Hello Dave, I can't do that .....
I'm pretty excited to replace our home assistant's voice with HAL9000's
😂
Cringe
Dave, I asked for this exact setup you're talking about. Don't know if you saw my comment, but either way I REALLY appreciate this video. It will help me create my own custom base of information for computer parts for gaming, office, server, etc. SO GRACIOUS!!!
Dave, I paused the video after 11min - to thank you for producing it. And the PDP11... in the 80s we developed a graph-based logistics system running on it.
30 seconds into the video = LIKE - what an intro Dave. Love it!
this is great! but also a bit hilarious. It sounds like Dave is reading text written by GPT 😂. It is just so good and well structured. He may have the linguistic chops to write that well, most people I know don't. And whatever way this level of quality is attained, I applaud and appreciate it very much. I just learned a ton, thank you.
Thanks for this very informative video. I like this practical hands-on approach.
This is good - the best explanation of the difference between the two that I've had. The best piece of advice is to run RAG on a small model. Here I am trying to do it on an 8B when I should be using a 1B.
Thank you for sharing your experience and knowledge. You should build a Dave model so we can talk to you as we go through our own computing adventures. I'm learning so much, just for fun, on my home hobby network. Love this channel. Thanks again.
great stuff Dave, lots of things to try. Thanks for the direction.
Re: RAG - This is the first time anyone has shown me an actual USEFUL reason to use AI (aside from "Wow! It's really neat!"). Thanks Dave!
I have a Wix site with chat that I’d love to use my own LLM for. This is a terrific start to figuring all that out. Thanks Dave!
Great demo. It shows that you had the knowledge to know when it gave a subpar answer. People going in without that knowledge won't have that ability, though. It again shows that these are assistants that may need help, not the replacements some are pitching LLMs as.
Sir, you are the teacher all kids deserve, to have better chances in their future... thank you for your work.
Oh goodness, a PDP-11. Memories of days gone by. Thanks 🙂
It's a PiDP-11, a PDP-11 replica using a Raspberry Pi. You can build one yourself from a kit.
Thanks! The RAG setup in Open Web UI is exactly what I was looking for!
New subscriber and former DEC PDP-11 software engineer!
Since Dave uploaded an 8MB document around the 8-minute mark, I asked GPT the following:
what happens if an uploaded PDF has more words than your input token limit?
ChatGPT said:
If an uploaded PDF exceeds my 128K token limit, I won’t be able to process the entire document at once. However, I can still help by:
Processing the document in sections - I can analyze and summarize parts of the PDF separately.
Extracting key sections - You can specify which sections (e.g., chapters, pages) are most important.
Guiding you on splitting the file - You can divide the PDF into smaller parts and upload them one by one.
Providing a compressed summary.
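A sketch of that splitting step (the one-token-per-0.75-words ratio is a rough rule of thumb; a real tokenizer such as tiktoken would be more accurate):

```python
def split_for_context(text: str, max_tokens: int = 100_000,
                      words_per_token: float = 0.75) -> list[str]:
    """Split a long document into pieces that each fit within a
    model's context window, using a crude words-per-token estimate."""
    max_words = int(max_tokens * words_per_token)
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]
```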
Am I the only one who is impressed by Dave’s blazing typing speed? 😂 7:57
All the required info, watched without taking a breath. I have seen much worse videos on e-learning platforms. Good job. Thank you.
How useful and helpful! RAG is just the thing I need. Thanks!
Perfect explanation, thanks a lot! I did not know that open-webUI can do RAG, again thanks!
Fantastic video. First decent explanation of the often-referenced RAG I've found. Will go try it on my data now!
Thank you for sharing your massive knowledge! I really appreciate this. Liked this video for the super informative content. Keep going! Love your videos.
You could use Docker and mount a physical drive/folder on the host as the doc folder, or a subfolder of it (you'd probably need to adjust the Dockerfile setup). That way you won't need to copy, as long as you have access to whatever runs the Docker container.
WOW, another awesome video, explaining, defining AND SHOWING AI things SO WELL! THANK YOU! Genius level work!
These options have kind of changed in the latest builds. You can now directly upload the knowledge from the UI frontend, so it's much easier to add your knowledge base. Also, when making the KB or the Model, the button to add is all the way on the right side, a tiny little + icon. You can miss it if you don't look closely.
Wow! PDP11??? I did my summer practice on a PDP8 + Donner Analogue in 1970. And met my first PDP11 some years later when I started to work at DEC. Nice video and nice memories 😊 Thanks!
Thank you for the clear instruction and still bringing back memories of PDP11's.
Fantastic vid Dave. This will help the individual level of employees start to incorporate AI into their lives. Very well done. Thank you.
In a Docker setup you don't need to copy the documents manually - you can simply use bind mounts, so you can easily share docs between host and container. However, in Docker you might have some issues utilizing the GPU (e.g., on a Mac).
Thanks for being so clear and detailed in your presentation. Even someone like me without formal tech training can understand and follow the steps easily. 🙌 Liked. Subscribed. Waiting for more!
The keyboard sounds is just awesome.
Your pre-recorded summary following the RAG demo was about your Custom GPT. Good vid anyway. Thanks. 18:20
How a legacy codebase could be used with RAG would be interesting. A base model that had already learned the codebase's language would be the obvious starting point. Thanks very much for ruling out retraining.
I downloaded Ollama and llama3.2 and asked it if it was running locally. No, it said, it could only run from a server, not locally. So then I pulled out my internet connection and continued to chat with it!
Hahah, nice!
Haha yes, even when you tell it that it runs local, it won’t believe you. Only after telling it how you did it, it will start to believe you 😂
This is really interesting. I want to take advantage of the RAG part for story-writing. English isn't my mother tongue, so I want it to assist me in writing a very long story, and the RAG approach will probably help it remember all of the details. I wish there were an option to choose which messages/instructions get incorporated without manually updating the documents.
Thanx Dave - this was very timely!
thank you for the brief intro to setting up a custom GPT, online and local. But the Open WebUI + Ollama combo as a Docker container is still a fit: you just need to map your host folder to the Open WebUI container path; there's no need to COPY anything INTO the container.
Thanks, I needed this. Open-WebUI is just what I was looking for.
Nice, really nice. I've listened to many AI stories. If you do your homework, Dave is THE best professor.
Thanx Dave. I think I finally found a way to get rid of the zillions of user manuals for all kinds of equipment (from kitchen to garden and everything in between) - and still find an answer when I really need a manual.
It took a while, but it turns out to be easy to add documents using the Web UI in a Docker container under Windows - just use the add documents + button. I don't know where they go, if anywhere, but it does work. ChatGPT suggested it. It wasn't obvious on my screen layout, but it is there.
I don't get it. Explain it better.
@gracegoce5295
I can't replicate the steps since moving to Linux as an experiment, so whether the later versions of Open WebUI are different or the Linux one is, I don't know. I'll swap back to Windows soon and maybe that will be different. For now I can't help.
@gracegoce5295
I still can't replicate it; it looks like the feature has gone. I have a new system and am at a complete loss now too.
Great video. Always great audio.
Thank you. This is not self-explanatory but your video helped me make this work.
NotebookLM also lets you upload documents into the context and chat with them. The interface is really great.
oh, just straight and easy information. what a man, thanks
Thanks Dave. This is what I was looking for.
Clear and precise. Thanks for making this video
Thank you for running the other models. There's a lack of people actually just running simple Ollama comparisons across different hardware.
SOLD - I'll install Open WebUI. Thanks Dave.
Thanks, that video was actually very helpful. I am in the midst of working out different ways to use our local documents and chat with them, preferably with a locally hosted LLM.
Love being a sub, and love to like videos with this quality of content. This is one of the most exciting projects I have gotten to play with in decades... This AI stuff is the door to a whole new world of learning. LOVE IT!!!! I love that, for now, it is free...
Exactly what I needed, exactly when I needed it.
Hi Dave, what a great video. Thank you
So cool - you created a PDP-11 expert. I am old enough to remember this type of computer. The video was also very informative. Thank you.
This was a very informative video! Thank you.
I am both stunned and very very apprehensive. AI is beyond human.
AI is advanced pattern recognition and matching, that's all lol
Dave - Thank you for yet another excellent video.
I am running the Llama 3.1 model on a system with 128 GB of RAM, based on what you taught in your earlier video. Thank you for that video.
However, I was unable to run the Web UI, because my graphics card, while powerful, was an older NVIDIA card, and there were missing NVIDIA DLLs that were needed. The newer installations were incompatible with my graphics card.
Since I just wanted to run it for myself, I really don't need a graphics card at all for running the model. I wish there were a Web UI that didn't require the NVIDIA DLLs.
nice, straight to the point. You should do an updated version now that we have DeepSeek R1, as retraining should now be possible.