Hey Tim, I'm on Debian. LM Studio runs, and wonderfully well too, but I'm having a small issue: in the sidebar the only icon that isn't a weird square is the local server icon... what icon pack or font do I need from the repo?
@@matheus-mondaini Yes, if you have an instance running or on desktop there is an API, and you can see the endpoint "update-embeddings" for a workspace. Docs are usually at localhost:3001/api/docs
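For anyone who wants to script against it, here is a rough sketch of calling that local API from Python. The route, payload fields, workspace slug, and API key below are assumptions for illustration - verify the real ones in the swagger docs the app serves at localhost:3001/api/docs.

```python
# Hypothetical sketch: chatting with a local AnythingLLM workspace over its
# developer API. Confirm the exact route and payload in /api/docs first.
import requests

BASE = "http://localhost:3001/api/v1"
HEADERS = {"Authorization": "Bearer YOUR_ANYTHINGLLM_API_KEY"}  # placeholder key

resp = requests.post(
    f"{BASE}/workspace/my-workspace/chat",  # "my-workspace" is a placeholder slug
    headers=HEADERS,
    json={"message": "What do my PDFs say about onboarding?", "mode": "chat"},
    timeout=120,
)
resp.raise_for_status()
print(resp.json())
```

This is also roughly how the "local website -> POST request -> AnythingLLM -> LM Studio" pipeline asked about in another comment would work: your site posts to AnythingLLM, which handles retrieval and forwards the prompt to LM Studio.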
If you change them to .txt it will be okay. We basically just need to have all "unknown" file types attempt to parse as text to allow this, since there are thousands of programming text formats.
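In the meantime, a minimal sketch of that workaround - the folder names and extension list are just example placeholders:

```python
# Copy a project's source files to .txt copies so the uploader treats them
# as plain text. Paths and extensions below are illustrative placeholders.
from pathlib import Path
import shutil

SRC = Path("my_project")      # your code folder
DST = Path("my_project_txt")  # folder you'll drag into AnythingLLM
EXTS = {".py", ".js", ".ts", ".go", ".rs", ".java"}

DST.mkdir(exist_ok=True)
for f in SRC.rglob("*"):
    if f.is_file() and f.suffix in EXTS:
        # Flatten the tree into unique names like "src__main.py.txt"
        flat = "__".join(f.relative_to(SRC).parts) + ".txt"
        shutil.copyfile(f, DST / flat)
```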
I'm on a Linux machine and want to set up some hardware ... recommended GPU (or can you even point me in the direction of good information)? Or better yet, can an old bitcoin rig do the job somehow, seeing as they're useless for bitcoin these days?! Great tutorial too mate, really appreciate you taking the time!
Hi Tim, I am fairly new to this, but going to ask a silly question: will this method have up-to-date knowledge similar to GPT-4 using Bing etc.? Thanks, this is a great video!
The model cutoff dates vary, so no. However! We are going to be adding live search via various means (from free to connecting to external paid services). "Live web browsing" is the term for this. Some models are even "online" and have this baked in, but they are not private or something you can run yourself. See Perplexity AI for that kind of functionality. We want to unlock this for local LLMs for you though.
Cool! Is there any AI tool that can learn from being fed some websites, then answer my questions about stuff mentioned on those websites? For example, I want to learn Rust (a programming language). I could give that AI tool websites about the language, the libraries in that language... Then the AI tool should be able to write some Rust applications when given enough details. Is that feasible now, or do we need to wait a few more years/decades?
I am curious to know if AnythingLLM has code interpretation capabilities, like running code. If not, I would love to work on that, and if it does, I am going to do some awesome things with it.
We don't have a code interpreter yet, but the idea is that we will require the user to run a Docker container alongside the app to accomplish this, to prevent runaway code or bash commands! This is how tools like OpenDevin work.
With the Docker version I have connection problems with LM Studio as well as Ollama, whereas the desktop version does not have any connection problems. In previous versions everything worked fine with the Docker version too. On my second server the situation is the same, so I can rule out my setup as the root cause. Something is not okay with the newer Docker versions.
I've been playing around with running local LLMs for a while now, and it's really cool to be able to run something like that locally at all, but it does not even come close to replacing ChatGPT. If there actually were models as smart as ChatGPT to run locally, they would require a very expensive bunch of computers...
That's a great one... Just got stuck in one scenario: after some time of use, asking any question gives the response "Could not respond to message. Request failed with status code 400." Please help!
Please do a dedicated video on training minimal base models for specific purposes. You're a legend. Also a video on commercial use and licensing would be immensely valuable and greatly appreciated.
+1
Where to start with in the path of learning AI (llm, rag, generative Ai..,)
+1
Yes!
Very nice question, I am waiting for the same. I wish Tim would make that video soon.
[00:00] 💻 Introduction to Local LLM Setup
- Overview of setting up a local LLM on your device using LM Studio and Anything LLM.
- Single-click installation process for both applications.
- Highlighting the benefits of GPU over CPU for running these applications.
[00:26] 🛠 Tools and Compatibility
- Discussion on compatibility and installation for Windows OS.
- Introduction to Anything LLM as a versatile, private chat application.
- Emphasis on open-source nature and contribution possibilities of Anything LLM.
[01:36] 🔧 Installation Guide
- Step-by-step guide on installing LM Studio and Anything LLM.
- Noting the simplicity of the process and that installing both apps marks the halfway point.
[02:04] 🌐 Exploring LM Studio
- Navigating LM Studio's interface and downloading models.
- Discussion on model compatibility and the importance of GPU offloading for better performance.
[03:14] 📊 Model Selection and Download
- Detailed overview of different model qualities (Q4, Q5, Q8) and their implications on performance.
- Advice on model selection based on size and download considerations.
[04:23] 🤖 Chatting with Models in LM Studio
- How to use the internal chat client of LM Studio.
- Limitations of LM Studio's chat feature and how Anything LLM can enhance the experience.
[05:47] 🔗 Integrating Anything LLM with LM Studio
- Setting up Anything LLM to work with LM Studio.
- Configuration details for connecting the two applications for a more powerful local LLM experience.
[06:12] 🛠 Configuring Anything LLM for Enhanced Functionality
- Detailed instructions on configuring Anything LLM settings to connect with LM Studio.
- Emphasizing the customization options for optimizing performance and user experience.
[06:45] 🎨 Customizing User Experience
- Exploring the customization features within Anything LLM, including themes and plugins.
- How to personalize the application to fit individual needs and preferences.
[07:15] 🔄 Syncing with LM Studio
- Step-by-step guide on ensuring seamless integration between Anything LLM and LM Studio.
- Tips on troubleshooting common issues that may arise during the syncing process.
[07:58] 🚀 Launching Your First Session
- Initiating a chat session within Anything LLM to demonstrate the real-time capabilities.
- Showcasing the smooth and efficient operation of the model after configuration.
[08:34] 💡 Advanced Features and Tips
- Introduction to advanced features available in Anything LLM, like voice recognition and command shortcuts.
- Advice on how to utilize these features to enhance the overall experience and productivity.
[09:10] 🌟 Conclusion and Encouragement for Exploration
- Encouraging users to explore the full potential of Anything LLM and LM Studio.
- Reminder of the open-source community's role in improving and expanding the software's capabilities.
[09:45] 🤝 Invitation to Contribute
- Invitation for viewers to contribute to the development of Anything LLM and LM Studio.
- Highlighting the importance of community feedback and contributions for future enhancements.
[10:20] 📚 Resources and Support
- Providing resources for additional help and support, including community forums and documentation.
- Encouragement to reach out with questions or for assistance in maximizing the use of the tools.
[10:55] 🎉 Final Thoughts and Farewell
- Reflecting on the ease and power of deploying LLMs locally with Anything LLM and LM Studio.
- Wishing viewers success in their projects and explorations with these tools.
With love, From a Brother in Christ
00:01 Easiest way to run locally and connect LMStudio & AnythingLLM
01:29 Learn how to use LMStudio and AnythingLLM for a comprehensive LLM experience for free
02:48 Different quantized models available on LMStudio
04:14 LMStudio includes a chat client for experimenting with models.
05:33 Setting up LM Studio with AnythingLLM for local model usage.
06:57 Setting up LM Studio server and connecting to AnythingLLM
08:21 Upgrading LMStudio with additional context
09:51 LM Studio and AnythingLLM enable private end-to-end chatting with open source models
Crafted by Merlin AI.
This is exactly what I've been looking for. Now, I'm not sure if this is already implemented, but if the chat bot can use EVERYTHING from all previous chats within the workspace for context and reference... My god that will change everything for me.
It does use the history for context and reference! History, system prompt, and context - all at the same time and we manage the context window for you on the backend
@@TimCarambat but isn't history actually constrained by the active model's context size?
@@IrakliKavtaradzepsyche yes, but we manage the overflow automatically so you at least don't crash from token overflow. This is common for LLMs, to truncate or manipulate the history for long running sessions
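For the curious, here is a toy sketch of what that overflow management typically looks like - a generic illustration of the common truncation strategy, not AnythingLLM's actual implementation:

```python
# Toy sketch of context-window management: keep the system prompt, then fit
# as much recent history as the token budget allows, dropping oldest first.
def fit_history(system_prompt, history, max_tokens, count=lambda s: len(s) // 4):
    # count() is a crude chars/4 token estimate; real apps use a tokenizer
    budget = max_tokens - count(system_prompt)
    kept = []
    for msg in reversed(history):  # walk newest-to-oldest
        cost = count(msg["content"])
        if cost > budget:
            break  # older messages get truncated away
        kept.append(msg)
        budget -= cost
    return list(reversed(kept))  # restore chronological order

# Example: ten long messages trimmed to whatever fits a 100-token budget
history = [{"role": "user", "content": "hello " * 40} for _ in range(10)]
print(len(fit_history("You are helpful.", history, max_tokens=100)))  # -> 1
```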
So is this strictly for LLMs? Is it like an AI assistant?
I’m just about to dive into LM Studio and AnythingLLM Desktop, and let me tell you, I’m super pumped! 🚀 The potential when these two join forces is just out of this world!
Are you a crypto bro?
Fantastic! I've been waiting for someone to make RAG smooth and easy :) Thank you for the video!
I’d love to hear more about your product roadmap - specifically how it relates to the RAG system you have implemented. I’ve been experimenting a lot with Flowise, and the new LlamaIndex integration is fantastic - especially the various text summarisation and content refinement methods available with a LlamaIndex-based RAG. Are you planning to enhance the RAG implementation in AnythingLLM?
🎯 Key points for quick navigation:
00:00:12 *💻 Easy local language model execution using LM Studio and Anything LLM on your computer.*
00:01:36 *🧩 Simple, fast installation of LM Studio and Anything LLM on Windows; half the process is done once the programs are installed.*
00:02:59 *📥 Downloading models can be the slowest part, but it is essential to get started.*
00:04:23 *🖥️ Using a GPU speeds up responses and reaches ChatGPT-like speeds.*
00:06:14 *🔗 Connecting LM Studio and Anything LLM unlocks powerful local-context usage.*
00:09:25 *📚 Augmenting models with additional contextual data improves answer accuracy.*
00:09:54 *🔒 Local chat is private and secure, with no monthly costs, using LM Studio and Anything LLM.*
00:10:49 *⭐ Popular models like Llama 2 or Mistral guarantee a better experience.*
Made with HARPA AI
Thank you, I've been struggling for so long with problematic things like privateGPT etc., which gave me headaches. I love how easy it is to download models and add embeddings! Again, thank you.
I'm very eager to learn more about AI, but I'm an absolute beginner. Maybe a video on how you would learn it from the beginning?
The potential of this is near limitless so congratulations on this app.
You deserve a Nobel Peace Prize. Thank you so much for creating Anything LLM.
Great stuff, this way you can run a good smaller conversational model like a 13B or even 7B, like Laser Mistral.
The main problem with these smaller LLMs is massive holes in some topics, or information about events, celebs, or other stuff; this way you can make your own database about the stuff you want to chat about.
Amazing.
Omg I've found u!!!! Been searching all over the net. None of it is legit. Yours is true value.
So in case we need to use this programmatically, does AnythingLLM itself offer a ‘run locally on server’ option to get an API endpoint that we could call from a local website, for example? i.e. local website -> POST request -> AnythingLLM (local server + PDFs) -> LMStudio (local server - foundation model)
Did you get an answer?
thanks for the tutorial, everything works great and surprisingly fast on M2 Mac Studio, cheers!
How well does it perform on large documents? Is it prone to the lost-in-the-middle phenomenon?
That is more of a "model behavior" and not something we can control.
Just got this running and it's fantastic. Just a note that LM Studio uses the API key "lm-studio" when connecting using Local AI Chat Settings.
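If you want to hit that same local server from your own code, here is a minimal sketch, assuming LM Studio's OpenAI-compatible server is running on its default port (1234) and that you have the openai Python package installed:

```python
# Minimal sketch: chat with the model LM Studio is serving locally.
# The base URL assumes LM Studio's default server port; "lm-studio" is
# the API key it accepts, per the note above.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

response = client.chat.completions.create(
    model="local-model",  # LM Studio routes this to whatever model is loaded
    messages=[{"role": "user", "content": "Say hello from my local LLM."}],
    temperature=0.7,
)
print(response.choices[0].message.content)
```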
Does it provide a script for YouTube?
Thanks for this, about to try it to query legislation and case law for a specific area of UK law, to see if it is effective at returning references to relevant sections and key case law. Interested in building a private LLM to assist with specific repetitive tasks. Thanks for the video.
IMO AnythingLLM is much more user-friendly and really has big potential. Thanks Tim!
Thanks a ton... you are giving us the power to work with our local documents... it's blazingly fast to embed the docs, super fast responses, and all in all I am very happy.
That's liberating! I was really concerned about privacy, especially when coding or working on refining internal proposals. Now I know what to do.
What type of processor/GPU/model are you using? I'm using the Q5 version of Mistral and it is super slow to respond. i7 and an Nvidia RTX 3060 Ti GPU.
Also, how is this different from implementing RAG on a base foundation model and chunking our documents and loading it into a vector db like pinecone? Is the main point here that everything is locally run on our laptop? Would it work without internet access?
Very nice tutorial! Thanks Tim,
Awesome man. Hope to see more videos with AnythingLLM!
Really awesome stuff. Thank you for bringing such quality aspects and making it open-source.
Could you please help us understand how efficient the RAG pipeline in AnythingLLM is?
For example:
If I upload a PDF with multimodal content, or if I want my document to be embedded in a semantic way, or to use multi-vector search - can we customize such advanced RAG features?
Mm...doesn't seem to work for me. The model (Mistral 7B) loads, and so does the training data, but the chat can't read the documents (PDF or web links) properly. Is that a function of the model being too small, or is there a tiny bug somewhere? [edit: got it working, but it just hallucinates all the time. Pretty useless]
The biggest challenge I am having is getting the prompt to provide accurate information that is included in the source material. The interpretation is just wrong. I have pinned the source material and I have also played with the LLM temperature, to no avail; the chat response still doesn't align with the source material. I also tried setting the chat mode to Query, but it typically doesn't produce a response. Another thing that is bothering me is that I can't delete the default thread that sits under the workspace as the first thread.
Excellent tutorial. Thanks a bunch😊
I have tried, but could not get it to work with the files that were shared as context. Am I missing something? It gives answers like "the file is in my inbox, I will have to read it," but does not actually read the file.
I'm also struggling. Sometimes it refers to the context, and most of the time it forgets it has access even though it's referencing it.
This is great! So would we always have to run LM Studio before running AnythingLLM?
If you wanted to use LMStudio, yes. There is no specific order, but both need to be running, of course.
A software engineer with AI knowledge? You got my sub.
I am a software developer but am clueless when it comes to machine learning and LLM's. What I was wondering, is it possible to train a local LLM by feeding in all of your code for a project?
Wow, great information. I have a huge amount of documents, and every time I search for something it's such a difficult task to fulfill.
And what have you found with this combination of dumb tools? Search through documents is crazy slow with LM Studio and AnythingLLM.
LM Studio's TOS paragraph:
"Updates. You understand that Company Properties are evolving. As a result, Company may require you to accept updates to Company Properties that you have installed on your computer or mobile device. You acknowledge and agree that Company may update Company Properties with or WITHOUT notifying you. You may need to update third-party software from time to time in order to use Company Properties.
Company MAY, but is not obligated to, monitor or review Company Properties at any time. Although Company does not generally monitor user activity occurring in connection with Company Properties, if Company becomes aware of any possible violations by you of any provision of the Agreement, Company reserves the right to investigate such violations, and Company may, at its sole discretion, immediately terminate your license to use Company Properties, without prior notice to you."
Several posts on LLM Reddit groups with people not happy about it. NOTE: I'm not one of the posters, read-only, I'm just curious what others think.
Wait, so their TOS basically says they may or may not monitor your chats, in case you are up to no good, with no notification?
Okay. I see why people are pissed about that. I don't like that either, unless they can verifiably prove the "danger assessment" is done on device, because otherwise this is no better than just cloud hosting, but paying for it with your own resources.
Thanks for bringing this to my attention btw. I know _why_ they have it in the ToS, but I cannot imagine how they think that will go over.
Ancient idea clash between wanting to be a good "software citizen" and the unfortunate fact that their intent is still to "monitor" your activities. As you said in your second reply to me, "monitoring" does not go over well with some and the consideration of the intent for doing so, even if potentially justified, is a subsequent thought they will refuse to entertain. @@TimCarambat
@@TimCarambat Let's say there is monitoring in the background; what if we set up a VM that is not allowed to connect to the internet, will that make our data safe?
@@alternate_fantasy It would prevent phone-homes, sure, so yes. That being said, I have Wireshark'd LM Studio while it was running and did not see anything sent outbound that would indicate they can view anything like that. I think that's just their lawyers being lawyers.
This video is gold. Push this to the top people.
Absolutely stellar video, Tim! 🌌 Your walkthrough on setting up a locally run LLM for free using LM Studio and Anything LLM Desktop was not just informative but truly inspiring. It's incredible to see how accessible and powerful these tools can make LLM chat experiences, all from our own digital space stations. I'm particularly excited about the privacy aspect and the ability to contribute to the open-source community. You've opened up a whole new universe of possibilities for us explorers. Can't wait to give it a try myself and dive into the world of private, powerful LLM interactions. Thank you for sharing this cosmic knowledge! 🚀👩🚀
To operate a model comparable to GPT-4 on a personal computer, you would currently need around 60GB of VRAM. That roughly means three 24GB graphics cards, each costing between $1,500 and $2,000 - about $4,500 to $6,000 for the cards alone. Therefore, equipping a PC to run a similar model would cost roughly 19 to 25 years' worth of a ChatGPT subscription at $20 per month, or $240 per year.
Although there are smaller LLM (Large Language Models) available, such as 8B or 13B models requiring only 4-16GB of VRAM, they don't compare favorably even with the freely available GPT-3.5.
Furthermore, with OpenAI planning to release GPT-5 later this year, the hardware requirements to match its capabilities on a personal computer are expected to be even more demanding.
Absolutely. Closed-source and cloud-based models will always have a performance edge. The kicker is: are you comfortable with their limitations on what you can do with them, paying for additional plugins, and the exposure of your uploaded documents and chats to a third party?
Or get 80-90% of the same experience with whatever the latest and greatest OSS model is, running on your own CPU/GPU, with none of that concern. It's just two different use cases; both should exist.
@@TimCarambat While using versions 2.6 to 2.9 of Llama (Dolphin), I've noticed significant differences between it and ChatGPT-4. Llama performs well in certain areas, but ChatGPT generally provides more detailed responses. There are exceptions where Llama may have fewer restrictions due to being less bound by major company policies, which can be a factor when dealing with sensitive content like explosives or explicit materials. However, while ChatGPT has usage limits and avoids topics like politics and explicit content, some providers offer unrestricted access through paid services. And realistically, most users (over 95%) might try these services briefly before discontinuing their use.
Get a PCIe NVMe SSD. I have 500GB of "swap" that I labeled as ram3. Ran a 70B like butter with the GPU at 1%, only running the display. Also, you can use a $15 riser and add graphics cards. Ideally you'd have something like 256GB on the GPU, but you can also VRAM-swap, though that isn't necessary because you shouldn't pull anywhere near 100GB at once. Split up your processes: instead of just CPU and RAM, use the CPU to send commands to anything with a chip, and attach a storage device directly to it. The PC has 2 x 8GB of RAM natively. You can even use an HDD; it's just a noticeable drag at under 1 GB/s. There are many more ways to do it; once I finish the seamless container pass I will have an out-of-the-box software solution for you. -- Swap rate and swappiness settings will help if you have solid-state storage.
@@Naw1dawg Yes, you can modify or add to your PC to run an LLM on it, but it's still not worth doing, because most people who play around with LLMs only use them for a short period of time, a month or so max.
What I am saying is that paying over $5,000 for an LLM build is not worth it, compared to paying $20 per month and enjoying the fun.
@@catwolf256 could be worth it if you make it available to all your friends and get them to pay you instead ;-)
Great work Tim, I'm hoping I can introduce this or anything AI into our company
I get this response every time:
"I am unable to access external sources or provide information beyond the context you have provided, so I cannot answer this question".
Mac mini
M2 Pro
Cores:10 (6 performance and 4 efficiency)
Memory:16 GB
Bro, this is exactly what I was looking for. Would love to see a video of the cloud option at $50/month
@@monbeauparfum1452 have you tried the desktop app yet (free)
I loaded a simple txt file, embedded it as presented in the video, and asked a question about a topic within the text. Unfortunately it seems the model doesn't know anything about the text. Any tip? (Mistral 8-bit, RTX 4090 24 GB).
Same here, plus it hallucinates like hell :)
Thanks for building this.
I had a spare 6800 XT sitting around that had been retired due to overheating for no apparent reason, as well as a semi-retired Ryzen 2700X, and I found 32 gigs of RAM sitting around for the box. Just going to say flat out that it is shockingly fast. I actually think running ROCm to enable GPU acceleration for LM Studio runs LLMs better than my 3080 Ti in my main system, or at the very least so similarly that I can't perceive a difference.
🎯 Key Takeaways for quick navigation:
00:00 *🖥️ How to run an efficient LLM application locally*
- Shows how to run an efficient LLM application on a personal computer using two one-click-install tools (LM Studio and Anything LLM Desktop).
- Notes that the experience is better with a GPU, but CPU-only is also feasible.
02:19 *🧰 Installing LM Studio and downloading models*
- Covers the operating systems LM Studio supports and how to install it.
- Emphasizes that downloading models may be the most time-consuming part of the whole process.
04:23 *💡 Chatting and exploring models with LM Studio*
- Shows how to interact with downloaded models through LM Studio's chat client.
- Mentions the GPU offload setting and how adjusting settings optimizes the experience.
06:00 *🔄 Integrating Anything LLM to enhance local LLM capabilities*
- Explains how to integrate LM Studio with Anything LLM Desktop to extend the LLM's functionality.
- Shows how to add documents and context through Anything LLM, and how to chat to get more accurate answers.
09:54 *🚀 Summary and potential applications*
- Summarizes how to build a fully private, feature-rich LLM system locally with LM Studio and Anything LLM Desktop.
- Emphasizes how convenient this approach is: the latest open-source models at no cost.
Made with HARPA AI
That was a really good video. Thank you so much.
That's really amazing 🤩, I will definitely be using this for BIM and Python
Thank you so much for your generosity. I wish the very best for your enterprise . God Bless!
OK, I'm confused. If I were to feed this a bunch of PDF documents/books, would it then be able to draw on the information contained in those files to answer questions, summarise the info, or generate content based on that info in the same literary/writing style as the initial files? And all 'offline' on a local install? (This is the Holy Grail that I am seeking.)
You can already do this with ChatGPT custom GPTs.
@@holykim4352 Got a link or reference? I've not found any way to do what I want so far. Maybe I misunderstand the process, but I can't seem to find the info I need either. Cheers.
Can't wait to try this. I've watched a dozen other tutorials that were too complicated for someone like me without basic coding skills. What are the pros/cons of setting this up with LMStudio vs. Ollama?
If you don't like to code, you will find the UI of LM Studio much more approachable, but it can be an information overload. LM Studio has every model on Hugging Face. Ollama is only accessible via the terminal and has limited model support, but is dead simple.
This video was made before we launched the desktop app. Our desktop app comes with Ollama pre-installed and gives you a UI to pick a model and start chatting with docs privately. That might be a better option, since it's one app: no setup, no CLI, no extra application.
Changing the embedding model would be a good tutorial! For example, how to use a multilingual model!
Absolutely great!! thank you!!!
This is an amazing tutorial. Didn't know there were that many models out there. Thank you for clearing the fog. I have one question though, how do I find out what number to put into "Token context window"? Thanks for your time!
Once it's pulled into LMStudio, it's in the sidebar once the model is selected. It's a tiny little section on the right sidebar that says "n_ctxt" or something similar to that. You'll then see it explain how many tokens your model can handle at max, RAM permitting.
@@TimCarambat You're the best... thanks... 🍻
Thanks for the video!
I did it as you said and got the model working (same one you picked). It ran faster than I expected, and I was impressed with the quality of the text and the general understanding of the model.
However, when I uploaded some documents [in total just 150 KB of downloaded HTML from a wiki], it gave very wrong answers [overwhelmingly incorrect]. What can I do to improve this?
Two things help by far the most!
1. Changing the "Similarity Threshold" in the workspace settings to "No Restriction". This basically allows the vector database to return all remotely similar results, with no filtering applied. The filtering is based purely on the vector-database distance between your query and each snippet's "score"; depending on the documents, query, embedder, and other variables, a relevant text snippet can be marked as "irrelevant". Changing this setting usually fixes the problem with no performance decrease.
2. Document pinning (the thumbtack icon in the UI once a doc is embedded). This does a full-text insertion of the document into the prompt. The context window is managed in case it overflows the model; this can slow your response time by a good factor, but coherence will be extremely high.
Thank you! But I don't understand what you mean by "thumbtack icon in UI once doc is embedded". Could you please clarify? @@TimCarambat
Very useful video!! Thanks for the work. I still have a doubt about the chats that take place: is there any record of the conversations? For commercial purposes it would be nice to generate leads from your own chat!
Absolutely, while you can "clear" a chat window you can always view all chats sent as a system admin and even export them for manual analysis or fine-tuning.
Thank you so much for the concise tutorial. Can we use Ollama and LM Studio together with AnythingLLM? It only takes one of them. I have some models in Ollama and some in LM Studio; I would love to have them both in AnythingLLM. I don't know if this is possible though. Thanks!
Can you make a tutorial on how we can make one or the other do TTS for the AI response in a chat? I don't mean speech recognition, just AI voice output.
Thank you! Very useful info. Subbed.
Looks so good! I have a question: is there some way to add a chat flow diagram like Voiceflow or Botpress?
For example, guiding the discussion for an ecommerce chatbot and giving multiple choices when asking questions?
I think this could be done with just some clever prompt engineering. You can modify the system prompt to behave in this way. However, there is no Voiceflow-like experience built in for that. It is a clever solution though.
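To make that concrete, here is a rough sketch of the kind of system prompt that could guide such a flow - just an illustration you would paste into the workspace's system prompt setting, not a built-in feature:

```
You are a support assistant for an ecommerce store. Guide every
conversation through these steps, one at a time:
1. Greet the customer and ask whether they need help with Orders,
   Returns, or Product questions.
2. After they choose, ask exactly one clarifying question.
3. End every reply with a numbered list of 2-4 options the customer
   can choose from.
If a question falls outside these topics, politely offer to connect
the customer with a human instead of answering.
```

It won't be as deterministic as a real flow builder, since the model can still drift, but it approximates the guided multiple-choice experience.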
Thanks a lot for the video. Can you please tell me if there is a way to install the model via USB pendrive (manual install)? The other system I'm trying to install on doesn't have an internet connection. Please reply.
Hi Tim, fantastic. Is it possible to use AnythingLLM with GPT-4 directly, for local use, like the example you demonstrated above?
Can't imagine that's possible with GPT-4. The VRAM required for that model would be in the hundreds of GB.
@Tim, this episode is brilliant! Let me ask you one thing. Do you have any ways to force this LLM model to return the response in a specific form, e.g. JSON with specific keys?
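One common approach - a sketch, not an AnythingLLM-specific feature - is to pin the format in the system prompt and validate the reply in your own code. The endpoint below assumes LM Studio's local OpenAI-compatible server, and the JSON keys are made up for illustration:

```python
# Prompt-level JSON forcing with validation. Small local models drift,
# so always parse defensively and be ready to retry.
import json
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

SYSTEM = (
    'Reply with ONLY a JSON object with exactly these keys: '
    '"answer" (string) and "confidence" (number between 0 and 1). No prose.'
)

resp = client.chat.completions.create(
    model="local-model",
    messages=[
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": "What is the capital of France?"},
    ],
    temperature=0,  # low temperature helps the model stick to the format
)

raw = resp.choices[0].message.content
try:
    data = json.loads(raw)
except json.JSONDecodeError:
    data = None  # malformed output: retry, or re-prompt with the error
print(data)
```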
thank you for your simple explanation
Thanks a lot! This tutorial is a gem!
Great video, very well explained!
Wow, what a great tool. Congratulations and thank you.
Can you make a video explaining the licence and commercial use, to sell this to clients? Thank you.
License is MIT, not much more to explain :)
Thanks mate. Had them up and running in a few minutes.
Just like Ollama and many others.
Thanks, Tim, for the good video. Unfortunately I do not get good results for uploaded content.
I'm from Germany, so could it be a language problem, since the uploaded content is German text?
I'm using the same Mistral model from your video and added 2 web pages to an AnythingLLM workspace.
But I'm not sure if the tools are using this content to build the answer.
In the LM Studio log I can see a very small chunk of one of the uploaded web pages, but overall the result is wrong.
To get good embedding values I downloaded nomic-embed-text-v1.5.Q8_0.gguf and use it in the Embedding Model Settings in LM Studio, which might not be necessary, since you didn't mention such a step in your video.
I would appreciate any further hints. Thanks a lot in advance.
Looks really clean, thank you! Quick question: I wanted to test with a 50 MB txt log file, but after some time the embedding failed with an error, "cannot create a string longer than 0x1..." (didn't catch the rest). Any thoughts on how I could add big log files? I used the default embedder and vector store with Ollama codellama 7B.
That is a bizarre error I have never seen. What operating system?
Instead of dragging files, can you connect it to a local folder? Also, why does the first query work but the second always fail? (It says "Could not respond to message. an error occured while streaming response".)
This is an amazing video and exactly what I needed. Thank you! I really appreciate it. Now the one thing: how do I find the token context window for the different models? I'm trying out Gemma.
Up to 8,000 (depends on VRAM available; 4096 is safe if you want best performance). I wish they had it on the model card on Hugging Face, but in reality it's just better to google it sometimes :)
I gotcha. So for the most part, just use the recommended one. I got everything working, but I uploaded a PDF and it keeps saying "I am unable to provide a response to your question as I am unable to access external sources or provide a detailed analysis of the conversation." Yet the book was loaded, moved to the workspace, and saved and embedded?
For what it's worth, in LM Studio there is an `n_cntxt` param on the sidebar that shows the maximum you can run. Performance will degrade, though, if your GPU is not capable of running the max token context.
Many thanks for this. I have been looking for this kind of solution for 6+ months now. Is it possible to create an LLM based uniquely on, say, a database of 6,000 PDFs?
A workspace, yes. You could then chat with that workspace over a period of time, use the answers to create a fine-tune, and then you'll have an LLM as well. Either way, it works. There is no limit on documents or embeddings or anything like that.
@@TimCarambat Many thanks! I shall investigate "workspaces". If I understand correctly, I can use a folder instead of a document and AnythingLLM will work with the content it contains. Or was that too simplistic? I see other people are asking the same type of question.
This video changed everything for me. Insane how easy to do all this now!
Which matters more: a higher parameter count or a higher quantization bit depth (Q)?
Models tend to "listen" better at higher quantization.
Awesome, but AnythingLLM won't see PDFs with OCR like ChatGPT would. Is there a multimodal model that can do that?
We need to support vision first so we can enable OCR!
I want to try it in a Linux VM, but from what I see you can only make this work on a laptop with a desktop OS. It would be even better if both LM Studio and AnythingLLM could run in one or two separate containers with a web UI.
Very helpful video. I'd love to be able to scrape an entire website in AnythingLLM. Is there a way to do that?
Is there a website where I can ask help questions about Anything LLM?
Can you do more of these demonstrations or videos? Is AnythingLLM capable of generating visual content like DALL·E 3 or video, assuming a capable open-source model is used? And is there any limitation, other than local memory, on the size of the vector databases created? This is amazing ;)
Thanks for this video, truly appreciated, man. Liked and subscribed to support you.
Thank you for making this video. This helped me a lot.
Thanks dude! Great video
Thanks for the insights. What's the best alternative for a person who doesn't want to run locally, yet wants to use open-source LLMs for interacting with documents and web scraping for research?
OpenRouter has a ton of hosted open-source LLMs you can use. I think the majority of them are free, and you just need an API key.
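For illustration, since OpenRouter mirrors the OpenAI API shape, calling a hosted open-source model from Python might look roughly like this (a minimal sketch; the model slug is just an example, and OPENROUTER_API_KEY is assumed to be set in your environment):

```python
# Minimal sketch: chat with a hosted open-source model via OpenRouter.
# Assumes the `openai` Python package (v1+) is installed.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",  # OpenRouter is OpenAI-API compatible
    api_key=os.environ["OPENROUTER_API_KEY"],
)

resp = client.chat.completions.create(
    model="mistralai/mistral-7b-instruct",  # example slug; check OpenRouter's catalog
    messages=[{"role": "user", "content": "Summarize the key points of RAG."}],
)
print(resp.choices[0].message.content)
```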
Nice one, Tim. It's been on my list to get a private LLM set up, and your guide is just what I needed. I know Mistral is popular. Are those models listed by capability, with the top being the most efficient? I'm wondering how to choose the best model for my needs.
Those models are curated by the LM Studio team. IMO they are ordered by popularity. However, if you aren't sure which model to choose, go for Llama 2 or Mistral; you can't go wrong with those models, as they are all-around capable.
Thanks Tim, much appreciated.
AnythingLLM looks super awesome; can't wait to set it up with Ollama and give it a spin. I tried Chat with RTX, but the YouTube upload option didn't install for me, and that was all I wanted it for.
For me the app does not open (1.7.0), or rather the app's window does not appear. When I quit it and try to restart it, it automatically shuts down. I tried trashing the cache files with AppCleaner; then it behaves like a first-time start, where the app is running but the window does not appear. I tried reinstalling, and also tried the Intel version; same behaviour. I have the Lulu outgoing firewall; sometimes it picks up some attempts and I allow those, but other times it does not. I also tried with the VPN turned off. The weird thing is that one time it started fine, I downloaded a model, and it got stuck unpacking the model for two hours, so I ended up closing the app; it did not open again after that. I wonder what might be blocking it on my computer?
Hey Tim, I'm on Debian. LM Studio runs, and wonderfully well too. However, I'm having a small issue: on the sidebar, the only icon that isn't a weird square is the local server icon. What icon pack or font do I need from the repo?
Phosphor Icons
How can I add data into ChromaDB for AnythingLLM to read?
By uploading via the UI while Chroma is your selected vector DB
@@TimCarambat I need to do this via code; is it possible?
@@matheus-mondaini Yes. If you have an instance running, or on desktop, there is an API, and you can see the endpoint "update-embeddings" for a workspace (rough sketch below).
Docs are usually at localhost:3001/api/docs
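As a starting point, hitting that endpoint from Python might look roughly like this (the path, payload shape, and auth header are assumptions; confirm them against the Swagger docs served on your own instance):

```python
# Hedged sketch: push a document into a workspace's embeddings via the API.
# Endpoint path and JSON body are assumptions - verify at localhost:3001/api/docs.
import requests

BASE = "http://localhost:3001/api/v1"
HEADERS = {"Authorization": "Bearer YOUR_ANYTHINGLLM_API_KEY"}  # hypothetical key

resp = requests.post(
    f"{BASE}/workspace/my-workspace/update-embeddings",  # 'my-workspace' is a placeholder slug
    headers=HEADERS,
    json={"adds": ["custom-documents/my-doc.json"], "deletes": []},
)
resp.raise_for_status()
print(resp.json())
```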
How can I use .py files? It appears they aren't supported.
If you change them to .txt, it will be okay. We just need to have all "unknown" types attempt to parse as text to allow this, since there are thousands of programming text file types.
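Until then, a quick workaround sketch: copy your .py files to .txt before uploading (the paths here are placeholders):

```python
# Copy .py files to .txt so the uploader treats them as plain text.
from pathlib import Path
import shutil

src = Path("my_project")   # folder containing your .py files (placeholder)
dst = Path("upload_ready")
dst.mkdir(exist_ok=True)

for py_file in src.rglob("*.py"):
    # Keep the original name visible, e.g. main.py -> main.py.txt
    shutil.copy(py_file, dst / (py_file.name + ".txt"))
```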
Very cool, I'll check it out. Is there a way to not install this on your OS drive?
I mean, this is pretty useful already. Are there plans to increase the capabilities to include other formats of documents, images, etc.?
I'm on a Linux machine and want to set up some hardware... any recommended GPU (or can you point me in the direction of good information)? Or better yet, can an old Bitcoin rig do the job somehow, seeing as they're useless for Bitcoin these days? Great tutorial too, mate; really appreciate you taking the time!
Hi Tim, I am fairly new to this, but I'm going to ask a silly question: will this method have up-to-date information, similar to GPT-4 using Bing etc.? Thanks, this is a great video!
The model cutoff dates vary, so no. However! We are going to be adding live search via various means (from free options to connecting to external paid services). "Live web browsing" is the term for this. Some models are even "online" and have this baked in, but they are not private or something you can run yourself; see Perplexity AI for that kind of functionality.
We want to unlock this for local LLMs for you, though.
Cool! Is there any AI tool that can learn from websites it is fed and then answer my questions about stuff mentioned on those websites? For example, I want to learn Rust (a programming language). I could give the AI tool websites about the language, its libraries, and so on; then the tool should be able to write some Rust applications when given enough detail. Is this feasible now, or do we need to wait a few more years/decades?
You could do that exact thing now. There is a website scraper in the app currently
Followed the instructions. At my first question in a chat with mistral-7b, I get "Only user and assistant roles are supported!"
Using LMStudio?
How do I go in and change the LM Studio base URL and token context window in AnythingLLM after install?
I am curious to know whether AnythingLLM has code interpretation capabilities, like running code. If not, I would love to work on that; and if it does, I am going to do some awesome things with it.
We don't have a code interpreter yet, but the idea is that we will require the user to run a Docker container alongside the app to accomplish this, to prevent runaway code or bash commands! That is how tools like OpenDevin work.
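To illustrate the sandboxing idea (this is not AnythingLLM's actual implementation, just a sketch of the approach): run the generated code in a throwaway container with no network access and a memory cap:

```python
# Sketch: execute untrusted, model-generated code in an isolated Docker container.
import subprocess

untrusted_code = 'print("hello from the sandbox")'  # placeholder for model output

subprocess.run(
    [
        "docker", "run", "--rm",
        "--network", "none",   # no network: blocks exfiltration and downloads
        "--memory", "256m",    # cap memory so runaway code can't eat the host
        "python:3.11-slim",
        "python", "-c", untrusted_code,
    ],
    timeout=30,  # kill anything that runs away
    check=True,
)
```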
With the Docker version I have connection problems with LM Studio as well as Ollama, whereas the desktop version does not have any connection problems. In previous versions, everything worked fine with the Docker version too. On my second server the situation is the same, so I can rule out my setup as the root cause. Something is not okay with the newer Docker versions.
Well explained! Thanks!
Very nice. Will definitely try it. Is there, or will there be, an option to integrate an AnythingLLM workspace into Python code to automate tasks via an API?
Yes, but the API is only in the Docker version currently, since that can be run locally and in the cloud, so an API makes more sense for that medium.
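As a starting point, automating a workspace chat from Python might look roughly like this (the endpoint path, payload, and response field are assumptions based on the self-hosted Swagger docs; verify on your own instance):

```python
# Hedged sketch: send a chat message to a workspace via the Docker version's API.
import requests

resp = requests.post(
    "http://localhost:3001/api/v1/workspace/my-workspace/chat",  # placeholder slug
    headers={"Authorization": "Bearer YOUR_ANYTHINGLLM_API_KEY"},  # hypothetical key
    json={"message": "Summarize the uploaded logs", "mode": "chat"},
)
resp.raise_for_status()
print(resp.json().get("textResponse"))  # assumed response field name
```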
Excellent work. Please make a video on text-to-SQL and Excel/CSV/SQL support for LLMs and chatbots. Thank you so much ♥️
I've been playing around with running local LLMs for a while now, and it's really cool to be able to run something like that locally at all, but it does not come even close to replacing ChatGPT. If there actually were models as smart as ChatGPT to run locally, they would require a very expensive bunch of computers...
Can someone explain to me what these tools are and how they are a replacement for ChatGPT? I have no idea.
That's a great one. I just got stuck in one scenario: after some time of use, asking any question gives the response "Could not respond to message. Request failed with status code 400." Please help!