great🔥fantastic what the Anythingllm team develops and makes available open-source 👍now I understand what the thumbs up button is for, that already exists in the workspace chats for the language model answers
I looked at the videos you've made, but I saw nothing about doing a local fine tune. Did I mess it up, or have you not done it yet? Thank you for this one. I'm looking at doing a fine tune myself, and this was helpful.
Very nice, though I would love to see more local functionality since that is really the goal of running something local. If I wanted cloud I would stick with Claude or OpenAI and fund their empire ;)
Great stuff, however, i am desperately searching the video about locally made fine tune and seems we are out of luck as you never have seem to made one, right? Has someone else made it instead?
The next version of this fine-tuning will indeed enable "Raw-document" tuning so you don't need to chat. Totally understand making chats seems annoying and direct content tuning would be much faster! One step at a time :)
@@j0hnc0nn0r-sec Soon you should be able to "port" those messages in as well so you can continue chatting in AnythingLLM as well. But yes, we could tune off that file directly as well
hello tim. i am pretty interested in the possibilities anythingLLM and Ollama presents, but the question for you (as a person with lots of experience in my book): if I want to make an agent, who would analyse some twitter threads / post in a theme (let's say, nature, or sports), and then write / reply based on what was read. what would be the tool or a service to make such a beast? would it be done greatly in a no-code pipeline (many of them, possibly), or there would be an easier way around?
I am new to the fine tuning and currently searching for tutorials online. If you have a tutorial in the works @TimCarambat I would definitely like to learn from you! Although the $250.00 price does seem very reasonable and with the assurance nothing is stored on your servers...Trust!
Thanks for sharing, please do help zoom into the screen text as you elaborate. The font size is really too small and the screen estate is not well used. Appreciate the sharing.
Hey Tim, great video! I have a question: I write short 10-minute stories, all based on a five-step storytelling structure. Is it possible to fine-tune a model so that, whenever I ask it to write a story on a given topic, the model consistently follows this five-step process?
@Tim Does the fine-tune pipeline take the documents and websites you vectorized into consideration or just the chat? If its the former and latter then its super powerful, if its just the latter then I have to generate thousands of chats to just get close to an 800 page book that I vectorized in my knowledge base.
@@shubhamkhichi_cyberagi for now it's the chats WITH citations but we are working on a way to generate data direct from documents. Believe me I know it's tedious to generate the chats. Next iteration is raw docs + chats!
@@TimCarambat Yup click on your profile -> Settings -> Data Controls -> Export Data. You get multiple json as well as an HTML file I have chat dating back 2023 March which have gold in it. I need it out and imported to AnythingLLM
This is an interesting approach to get local llms be of higher usefulness. The 250 for the service sound reasonable if the accessibility for lesser technical users or user with limited time is given.
@@drp111 that is great feedback, as that really is my core intention with that. I shopped around myself and was quoted as low as 1k to much much more and sometimes I never even get the model back. The local training stuff is in the works but I already foresee someone getting issues thinking a 10 year old gpu should be able to do this, which is why we offered cloud first since it for sure works for everyone
@@TimCarambat Funny....10 year old GPU. That would be me...trying to use old hardware for as long as possible to see if it truly is obsolete for some tasks.
Great content! Thanks for "birthing" this cool tech. I'm a documentarian with a library of interview transcripts. I'm trying to find the best local method to create a RAG to search and interact with them. Currently, they're PDFs and CSV files. I installed AnythingLLM and have been getting some lackluster results. Probably my fault! What's the best set-up for my use case re. LLM, embedder, vector db, etc? Currently, I'm using all the native options and I find the RAG is hallucinating a lot and/or not showing all results of a search term. Any tips would be appreciated! Thank you!
Sure! So PDFs are pretty cut and dry and the defaults work. CSVs and other tabular data though are HARD. The nature of CSV's often requires full-document comprehension which basically means you need a massive model (Google Gemini, Anthropic) to digest them. The alternative is to load them into a proper database so at least relationships exist. CSV's cannot be effectively used in RAG since there is no "semantic" relationships among datapoints for the most part. This makes "chunking" effectively worthless. Checkout Document pinning here for solving CSV issues, you may have to use a cloud model for those since they are probably thousands of lines long docs.anythingllm.com/llm-not-using-my-docs
@@TimCarambat Thanks so much for your reply, Tim! I first started with PDFs, but because the timecode info wasn't consistent across all PDFs, I converted them into 3 column CSVs. In my system prompt, I explain how the CSVs are structured and where to look for the data. But, you're saying, even with my instructions, CSVs are still difficult to work with? Side note...I'm about to start creating content for my filmmaker's channel. I know transcript wrangling is a popular topic/pain point. I'm sure you're super busy, but if you'd be interested in doing a quick interview, perhaps we can shine the light on a local llm solution for filmmakers who typically wouldn't consider it. Let me know!
The training dataset that is sent, consist of only chat record or including reference dataset too? Why fine tuned LLM size doubled? If the fine tune data is increased causing fine tuned LLM increasing too? Thank you.
It is a LoRA! If you need the .safetensors we can also provider those for export as well since they tend to be more portable than a hefty GGUF. I just thought most people would get more confused with all the files for a LoRA and unsure how to use them
this is a great tool indeed, installed it and works great !. Could you help me understand how you got Ollama models to have function call to work, I find it errors out when I use with langchain. Appreciate your help. It seems to work ok in Anything LLM, at least better than using langchain
One technical quesion, I learned that for training, it's better to use the higher precision version if not full prevision for better quality. Is it because the 8b model is so small, a quantized version does not make any meaningful difference in terms of quality?
@@TimCarambat For AnythingLLM, I actually have no option to use docker and installed application are tracked/listed on server. So I mostly find simple ready to run apps which I don't have to install. :)
thank your awesome post, just clarify: though with just 14 dataset, but during your training process, you use RAG so that it can expand knowledge in 14 data and can respond with longer text, am I right?
8:09 don't take it bad but in a law firm the lawyer has a legal duty of confidentialty. No "cloud" solution is legally adequate. Attys who do cloud services INCLUDING gmail are engaged in malpractice. Most likely don't understand google scans their mails. If you want a monetization pathway for anythinglm it's right there.
@@QuizmasterLaw this is for just generating a fine tune. The llm is running locally on device. 99% of people don't need a fine tune so the app is sufficient for their use since it's on device and using also on device documents for rag
@@TimCarambat my point wasn't "it's bad" but was "here's a monetization path". I'm an educator so my use case is not confidential data. There really is a job for anyone who wants to go to the local law firms and set up their local secure llm.
@@boardsontt1756 custom GPTs are just OpenAI models with a system prompt + rag and sometimes basic tools. A fine tune is basically a custom llm where it already knows your documents inherently and does not need rag. Nor does it need a system prompt to behave in a specific manner. Lastly, it can run fully offline and have additional rag and agent tooling on top of all that.
One point if is possible . I'm not coder but i try to create a python script to store my "personal behavior" in a plain text/texts in a folder to be loaded automatic by model, like that every time i load the model, he will know my old conversations, know my plans , my direction for a specific field . Of course , in time the model should try to sort the data from files in a manner that made him more useful in responses. Obviously didn't work well 😐
@@SiliconSouthShow we are working on custom agents right now so that you are not limited to what we provide out of the box. let me know if there is a tool that would be more useful if there is one top of mind
@@TimCarambat First, thank you so much. I am thrilled to see the progress with AnythingLLM. I've spent a lot of time building agents from scratch in Python using Ollama, and while it's been a tremendous learning experience, having robust tools at my disposal would be a game-changer. I would love to see the addition of a comprehensive Tools Library. A moonshot feature could be a Memorize Tool for unsupervised data collection and learning. A tool for handling webhooks, dialers, and callers, akin to those available with LangChain, would also be fantastic. However, if those are outside the immediate scope, enhancing the current web search and web scraping tools would be invaluable. I advocate for AnythingLLM passionately. I’ve introduced over 100 people to it, often speaking about it in live Zoom sessions. The platform stands out because it’s accessible enough for anyone to start using immediately while still being powerful. It’s well-designed, user-friendly, and out-of-the-box ready. Given the chance, I'd love to run a channel teaching others how to leverage AnythingLLM for various applications, from work to play. I’m particularly excited about potential memory features that would allow for advanced projects like multiplayer RPGs. AnythingLLM is in a class of its own. Unlike other tools that are merely interfaces for other functionalities, AnythingLLM is a powerhouse. It’s a unique tool that truly delivers, and I can’t praise it enough. My wife even jokes that I should be a spokesperson for AnythingLLM because of my enthusiasm. Looking forward to more great features and continuing to support this amazing platform! (I"m such a loser, I spent 2 hrs oen night teaching, talking and complain aLLM when the last update came out and ollama was missing from the agent support system, but I am a huge advocate and fan of aLLM, period, I love it.) PPS: I could see aLLM with a multiAgent system in it, I mean, I see it clearly, doing it all.
Wow that is huge! Thank you! Can I import *.gguf file directly into AnythingLLM like the downloaded system models? Couldn't find any answer to that simple task. Don't want to install LM Studio or Ollama.
We will allow that on the desktop app - since the docker version does not have an LLM inside of it. We _do_ have llama cpp in the docker image but we will be removing it soon due to the complexity of maintaining it - which is why standalone llm runners tools like Ollama or LMStudio exist. It's a project in and of itself to maintain. Can i ask what you have against for installing LMStudio or Ollama?
@@TimCarambat just showed AnythingLLM to some ordinary people and they were amazed what can be done locally and offline but when I told them they need to install also LM Studio and start server etc. than it was already too much work. LOL. And I have to agree with them. It would be really great if everything was in a desktop app. I know this is not for the masses they just use chatgpt and be done with it.
@@DanielSchweinert Well then the desktop app fits that. It has an LLM inside of it. Only for the multi-user docker version do you need to have some other external LLM. By default the desktop app has an LLM built into it which makes requirement to install and external runner extraneous. From how it sounds they should use the desktop app. At the start of the video I mentioned I was using the multi-user browser based version because that is just where this feature is live now, that's all
@@TimCarambat Thank you! I know there are system llm inside of AnythingLLM like Meta Llama3 etc. for download but I really don't know how I can import other LLM's into it like "dolphin Llama3". Where is the location or path where to put those other LLM's without using LM Studio. Btw Im on a mac.
can i do this in future= run llama3 400 billion (i believe frontier model) make it chat with all kind of complicated information (with frontier model it might able answer them) then use those chat into... json? to fine tune llama3 8b?
@@NLPprompter that is exactly what you can do, I just used OpenAI here because I can. Same principle though of using a more powerful model to fine tune a smaller one. Also we should have Llama 3.1 live soon for tuning as well so best of both worlds
@@TimCarambat I'm sorry to ask here, but I'll ask any way maybe you know something, there was a paper by OpenAI about Grokking, it's about... when a model were in fine tune phase and got overfitting continuously then at some point it become able to generalize have you seen such phenomenon in your system? if yes then if got time... i would like to hear more.
Its just a single command! If you have docker in your terminal just run this command: docs.anythingllm.com/installation-docker/local-docker#recommend-way-to-run-dockerized-anythingllm
@@louisduplessis5167 you can run it free for desktop. You can self host at your cost, and if you can't do any of that, yeah you can pay us to host it for you.
This is a common misunderstanding when it comes to fine-tuning specifically. The generation of a fine-tuned model from API output (NOT ChatGPT) is not the generation of a new full-weight competing model with respect to their terms. If we used the output to generate a brand new foundational model - like LLama 3.2 or something, that would be a violation as it is a new-weight full-parameter model that would compete with OpenAI. Creating a fine tune from any foundational model, using responses from their API, is permissible within those terms. References from their TOS ----- Use Output to develop models that compete with OpenAI. ----- Ownership of content. As between you and OpenAI, and to the extent permitted by applicable law, you (a) retain your ownership rights in Input and (b) own the Output. We hereby assign to you all our right, title, and interest, if any, in and to Output. ------ Source:openai.com/policies/terms-of-use/
You should really release this for people to do locally (because obviously you have the code for it lol), and then have the finetuning service for bigger companies... Someone else is going to soon if you don't
I don't know how many times I said this in the video, but this is not for people who know how to use libraries like Unsloth. Unsloth is amazing, but you still have to know how to code AND have a GPU you can even use. Funny enough, for the promised local version I am using Unsloth because it's so simple. Sure you can run their colab example but that isn't even close to what everyday people need. The issue is not gatekeeping, it's making it easy and accessible to those who probably dont even have a GPU they can use to fine-tune. Hell, even if you go to unsloths website they dont offer this for free. The LIBRARY for custom code is free, their hosted version is still paid and is not even open to the public.
This is amazingly simple! Great job! But I have the horse power so PLEASE get the locally run version out soon!! 👍🏻
You're an excellent teacher. Thanks
great🔥fantastic what the Anythingllm team develops and makes available open-source 👍now I understand what the thumbs up button is for, that already exists in the workspace chats for the language model answers
Sounds powerful. Looking forward to checking this out.
I looked at the videos you've made, but I saw nothing about doing a local fine tune. Did I mess it up, or have you not done it yet? Thank you for this one. I'm looking at doing a fine tune myself, and this was helpful.
Very nice, though I would love to see more local functionality since that is really the goal of running something local. If I wanted cloud I would stick with Claude or OpenAI and fund their empire ;)
Great stuff, however, i am desperately searching the video about locally made fine tune and seems we are out of luck as you never have seem to made one, right? Has someone else made it instead?
@@sixtheninth correct I still owe everyone that video and I'm actually working with Unsloth right now to make that happen
@@TimCarambat
You are visionary genius 🚀
9:53 I have a large obsidian notebook (.Md files) of about 10,000 notes. Can I use these notes instead of the “chats” when doing the fine tune?
The next version of this fine-tuning will indeed enable "Raw-document" tuning so you don't need to chat.
Totally understand making chats seems annoying and direct content tuning would be much faster! One step at a time :)
@@TimCarambat hell yeah
@@TimCarambat I suppose it would be a good idea if I exported my chats from OpenAI and Anthropic and used those chats for fine tuning.
@@j0hnc0nn0r-sec Soon you should be able to "port" those messages in as well so you can continue chatting in AnythingLLM as well.
But yes, we could tune off that file directly as well
Cant wait to be able to train my models locally through AnthingLLM
Do yousee his face expression when he explained why you do need to pay for remote processing. 😂
what time
This is awesome keep it up!
So good!! Thank You Man.
hi, is desktop version update is available now ? it's extremely awesome mr genius
hello tim. i am pretty interested in the possibilities anythingLLM and Ollama presents, but the question for you (as a person with lots of experience in my book): if I want to make an agent, who would analyse some twitter threads / post in a theme (let's say, nature, or sports), and then write / reply based on what was read. what would be the tool or a service to make such a beast?
would it be done greatly in a no-code pipeline (many of them, possibly), or there would be an easier way around?
gj bro! wen locally run version
I would like to do this locally
I am new to the fine tuning and currently searching for tutorials online. If you have a tutorial in the works @TimCarambat I would definitely like to learn from you! Although the $250.00 price does seem very reasonable and with the assurance nothing is stored on your servers...Trust!
can i use it to fine tune model with a exported telegram chat to make a clone of myself ?
waiting for local install video
Thanks for sharing, please do help zoom into the screen text as you elaborate. The font size is really too small and the screen estate is not well used. Appreciate the sharing.
Hey Tim, great video! I have a question: I write short 10-minute stories, all based on a five-step storytelling structure. Is it possible to fine-tune a model so that, whenever I ask it to write a story on a given topic, the model consistently follows this five-step process?
Thank you!
@Tim Does the fine-tune pipeline take the documents and websites you vectorized into consideration or just the chat?
If its the former and latter then its super powerful, if its just the latter then I have to generate thousands of chats to just get close to an 800 page book that I vectorized in my knowledge base.
Also second question: Can I import chats from my chatGPT export where I have been chatting about technical knowledge for a while now?
@@shubhamkhichi_cyberagi for now it's the chats WITH citations but we are working on a way to generate data direct from documents. Believe me I know it's tedious to generate the chats. Next iteration is raw docs + chats!
@@shubhamkhichi_cyberagi I didn't even know this was a thing. You can export data chats from chat gpt?
@@TimCarambat Yup click on your profile -> Settings -> Data Controls -> Export Data.
You get multiple json as well as an HTML file
I have chat dating back 2023 March which have gold in it. I need it out and imported to AnythingLLM
@@TimCarambat Yes its under Profile -> Settings -> Data control -> Export Chat
Function not even available in the local host docker version
This is an interesting approach to get local llms be of higher usefulness. The 250 for the service sound reasonable if the accessibility for lesser technical users or user with limited time is given.
@@drp111 that is great feedback, as that really is my core intention with that. I shopped around myself and was quoted as low as 1k to much much more and sometimes I never even get the model back.
The local training stuff is in the works but I already foresee someone getting issues thinking a 10 year old gpu should be able to do this, which is why we offered cloud first since it for sure works for everyone
@@TimCarambat Funny....10 year old GPU. That would be me...trying to use old hardware for as long as possible to see if it truly is obsolete for some tasks.
@@BirdsPawsandClaws The thing is you still probably could run it...but it might take you 10 hours while others take 2 - so its not out of the question
Great content! Thanks for "birthing" this cool tech. I'm a documentarian with a library of interview transcripts. I'm trying to find the best local method to create a RAG to search and interact with them. Currently, they're PDFs and CSV files. I installed AnythingLLM and have been getting some lackluster results. Probably my fault! What's the best set-up for my use case re. LLM, embedder, vector db, etc? Currently, I'm using all the native options and I find the RAG is hallucinating a lot and/or not showing all results of a search term. Any tips would be appreciated! Thank you!
Sure! So PDFs are pretty cut and dry and the defaults work. CSVs and other tabular data though are HARD. The nature of CSV's often requires full-document comprehension which basically means you need a massive model (Google Gemini, Anthropic) to digest them.
The alternative is to load them into a proper database so at least relationships exist. CSV's cannot be effectively used in RAG since there is no "semantic" relationships among datapoints for the most part. This makes "chunking" effectively worthless.
Checkout Document pinning here for solving CSV issues, you may have to use a cloud model for those since they are probably thousands of lines long
docs.anythingllm.com/llm-not-using-my-docs
@@TimCarambat Thanks so much for your reply, Tim! I first started with PDFs, but because the timecode info wasn't consistent across all PDFs, I converted them into 3 column CSVs. In my system prompt, I explain how the CSVs are structured and where to look for the data. But, you're saying, even with my instructions, CSVs are still difficult to work with?
Side note...I'm about to start creating content for my filmmaker's channel. I know transcript wrangling is a popular topic/pain point. I'm sure you're super busy, but if you'd be interested in doing a quick interview, perhaps we can shine the light on a local llm solution for filmmakers who typically wouldn't consider it. Let me know!
Can I train that on pictures and video data?
Its a language model, so the answer is no. Unless you use a vision model like llava, but it uses different dataset format structure..
bravo
Thanks.
Is there a way to include the knowledge from the RAG assets into fine tuning? Can you fine tune with multi modal assets?
Any thoughts on using LORA adapters?
The training dataset that is sent, consist of only chat record or including reference dataset too? Why fine tuned LLM size doubled? If the fine tune data is increased causing fine tuned LLM increasing too? Thank you.
love it!!
This is great will this extend into creating LoRAs?
It is a LoRA! If you need the .safetensors we can also provider those for export as well since they tend to be more portable than a hefty GGUF.
I just thought most people would get more confused with all the files for a LoRA and unsure how to use them
You should add an check for updates feature, it is difficult to uninstall and download everytime
this is a great tool indeed, installed it and works great !. Could you help me understand how you got Ollama models to have function call to work, I find it errors out when I use with langchain. Appreciate your help. It seems to work ok in Anything LLM, at least better than using langchain
One technical quesion, I learned that for training, it's better to use the higher precision version if not full prevision for better quality. Is it because the 8b model is so small, a quantized version does not make any meaningful difference in terms of quality?
Correct, with smaller param models (
is there a portable version available? Like download and run in place of installation?
For the fine-tuning or AnythingLLM? For the fine-tuning we give you a model GGUF file you can take anywhere and AnythingLLM has a desktop version.
@@TimCarambat For AnythingLLM, I actually have no option to use docker and installed application are tracked/listed on server. So I mostly find simple ready to run apps which I don't have to install. :)
How did you get the localhost:3000? Love the app and the content, will test this later !!
Using our docker image! hub.docker.com/r/mintplexlabs/anythingllm
thank your awesome post, just clarify: though with just 14 dataset, but during your training process, you use RAG so that it can expand knowledge in 14 data and can respond with longer text, am I right?
if i understand correctly, yes
When it will be released for windows?
@@hasangh4678 should be able to get the desktop app updated with this by the end of the week
@@TimCarambat❤❤
8:09 don't take it bad but in a law firm the lawyer has a legal duty of confidentialty. No "cloud" solution is legally adequate. Attys who do cloud services INCLUDING gmail are engaged in malpractice. Most likely don't understand google scans their mails. If you want a monetization pathway for anythinglm it's right there.
@@QuizmasterLaw this is for just generating a fine tune. The llm is running locally on device. 99% of people don't need a fine tune so the app is sufficient for their use since it's on device and using also on device documents for rag
@@TimCarambat It works.
You're amazing!
@@TimCarambat my point wasn't "it's bad" but was "here's a monetization path". I'm an educator so my use case is not confidential data. There really is a job for anyone who wants to go to the local law firms and set up their local secure llm.
Can I finetune models from the cloud version? I have absolutely no coding skills.
Yes, but i would recommend still using the desktop app so you can easily load that model in locally once its ready
What’s the difference between this route and creating a custom GPT
@@boardsontt1756 custom GPTs are just OpenAI models with a system prompt + rag and sometimes basic tools. A fine tune is basically a custom llm where it already knows your documents inherently and does not need rag. Nor does it need a system prompt to behave in a specific manner. Lastly, it can run fully offline and have additional rag and agent tooling on top of all that.
also how can I add some custom tools?
Like this: docs.anythingllm.com/agent/custom/introduction
One point if is possible . I'm not coder but i try to create a python script to store my "personal behavior" in a plain text/texts in a folder to be loaded automatic by model, like that every time i load the model, he will know my old conversations, know my plans , my direction for a specific field . Of course , in time the model should try to sort the data from files in a manner that made him more useful in responses. Obviously didn't work well 😐
How can we also fine-tune whisper?
That is an STT model, does not work the same as text text models and is a very different set of data to train on
How do you delete a workspace?
Click on the "Gear" icon on a workspace. On the "General settings" tab its a big red button
Now that the agent is working with ollama i dont think ill log out of anythingllm, lol
@@SiliconSouthShow we are working on custom agents right now so that you are not limited to what we provide out of the box. let me know if there is a tool that would be more useful if there is one top of mind
@@TimCarambat First, thank you so much. I am thrilled to see the progress with AnythingLLM. I've spent a lot of time building agents from scratch in Python using Ollama, and while it's been a tremendous learning experience, having robust tools at my disposal would be a game-changer.
I would love to see the addition of a comprehensive Tools Library. A moonshot feature could be a Memorize Tool for unsupervised data collection and learning. A tool for handling webhooks, dialers, and callers, akin to those available with LangChain, would also be fantastic. However, if those are outside the immediate scope, enhancing the current web search and web scraping tools would be invaluable.
I advocate for AnythingLLM passionately. I’ve introduced over 100 people to it, often speaking about it in live Zoom sessions. The platform stands out because it’s accessible enough for anyone to start using immediately while still being powerful. It’s well-designed, user-friendly, and out-of-the-box ready.
Given the chance, I'd love to run a channel teaching others how to leverage AnythingLLM for various applications, from work to play. I’m particularly excited about potential memory features that would allow for advanced projects like multiplayer RPGs.
AnythingLLM is in a class of its own. Unlike other tools that are merely interfaces for other functionalities, AnythingLLM is a powerhouse. It’s a unique tool that truly delivers, and I can’t praise it enough. My wife even jokes that I should be a spokesperson for AnythingLLM because of my enthusiasm.
Looking forward to more great features and continuing to support this amazing platform!
(I"m such a loser, I spent 2 hrs oen night teaching, talking and complain aLLM when the last update came out and ollama was missing from the agent support system, but I am a huge advocate and fan of aLLM, period, I love it.)
PPS: I could see aLLM with a multiAgent system in it, I mean, I see it clearly, doing it all.
Wow that is huge! Thank you! Can I import *.gguf file directly into AnythingLLM like the downloaded system models? Couldn't find any answer to that simple task. Don't want to install LM Studio or Ollama.
We will allow that on the desktop app - since the docker version does not have an LLM inside of it. We _do_ have llama cpp in the docker image but we will be removing it soon due to the complexity of maintaining it - which is why standalone llm runners tools like Ollama or LMStudio exist. It's a project in and of itself to maintain.
Can i ask what you have against for installing LMStudio or Ollama?
@@TimCarambat just showed AnythingLLM to some ordinary people and they were amazed what can be done locally and offline but when I told them they need to install also LM Studio and start server etc. than it was already too much work. LOL. And I have to agree with them. It would be really great if everything was in a desktop app. I know this is not for the masses they just use chatgpt and be done with it.
@@DanielSchweinert Well then the desktop app fits that. It has an LLM inside of it. Only for the multi-user docker version do you need to have some other external LLM.
By default the desktop app has an LLM built into it which makes requirement to install and external runner extraneous. From how it sounds they should use the desktop app.
At the start of the video I mentioned I was using the multi-user browser based version because that is just where this feature is live now, that's all
@@TimCarambat Thank you! I know there are system llm inside of AnythingLLM like Meta Llama3 etc. for download but I really don't know how I can import other LLM's into it like "dolphin Llama3". Where is the location or path where to put those other LLM's without using LM Studio. Btw Im on a mac.
can i do this in future= run llama3 400 billion (i believe frontier model) make it chat with all kind of complicated information (with frontier model it might able answer them) then use those chat into... json? to fine tune llama3 8b?
@@NLPprompter that is exactly what you can do, I just used OpenAI here because I can. Same principle though of using a more powerful model to fine tune a smaller one.
Also we should have Llama 3.1 live soon for tuning as well so best of both worlds
@@TimCarambat I'm sorry to ask here, but I'll ask any way maybe you know something, there was a paper by OpenAI about Grokking, it's about... when a model were in fine tune phase and got overfitting continuously then at some point it become able to generalize have you seen such phenomenon in your system? if yes then if got time... i would like to hear more.
Can you make a complete video on the installation of Docker Anything LLM please ??
Its just a single command! If you have docker in your terminal just run this command:
docs.anythingllm.com/installation-docker/local-docker#recommend-way-to-run-dockerized-anythingllm
50usd pm min ?!
@@louisduplessis5167 you can run it free for desktop. You can self host at your cost, and if you can't do any of that, yeah you can pay us to host it for you.
❤
🙏
Just FYI it is against OpenAI usage policy to use their models to create content for any model training
This is a common misunderstanding when it comes to fine-tuning specifically. The generation of a fine-tuned model from API output (NOT ChatGPT) is not the generation of a new full-weight competing model with respect to their terms.
If we used the output to generate a brand new foundational model - like LLama 3.2 or something, that would be a violation as it is a new-weight full-parameter model that would compete with OpenAI.
Creating a fine tune from any foundational model, using responses from their API, is permissible within those terms.
References from their TOS
-----
Use Output to develop models that compete with OpenAI.
-----
Ownership of content. As between you and OpenAI, and to the extent permitted by applicable law, you (a) retain your ownership rights in Input and (b) own the Output. We hereby assign to you all our right, title, and interest, if any, in and to Output.
------
Source:openai.com/policies/terms-of-use/
You should really release this for people to do locally (because obviously you have the code for it lol), and then have the finetuning service for bigger companies... Someone else is going to soon if you don't
You can already fine-tune locally via tons of services you have to write the glue code yourself
FOR ANYONE THATS CURIOUS UNSLOTH DOES THIS FREE
I don't know how many times I said this in the video, but this is not for people who know how to use libraries like Unsloth.
Unsloth is amazing, but you still have to know how to code AND have a GPU you can even use. Funny enough, for the promised local version I am using Unsloth because it's so simple. Sure you can run their colab example but that isn't even close to what everyday people need.
The issue is not gatekeeping, it's making it easy and accessible to those who probably dont even have a GPU they can use to fine-tune.
Hell, even if you go to unsloths website they dont offer this for free. The LIBRARY for custom code is free, their hosted version is still paid and is not even open to the public.
Looks awesome but for some reason the way you talk is making me want to check my wallet is still in my pocket.
I took your Blockbuster card