Run ANY Open-Source Model LOCALLY (LM Studio Tutorial)

Matthew Berman

Просмотров 148 тыс.

Добавить в
- Мой плейлист
- Посмотреть позже
Поделиться

HTML-код

Размер видео:

Показать панель управления

Автовоспроизведение

Автоповтор

Опубликовано: 27 сен 2024
Get UPDF Pro with an Exclusive 63% Discount Now: bit.ly/46bDM38
Use the #UPDF to make your study and work more efficient! The best #adobealternative tool for everyone!
In this video, we look at LMStudio and I give you an in-depth tutorial of the easiest-to-use LLM software. You can run any open-source AI model easily, even if you know nothing about how to run them.
Join My Newsletter for Regular AI Updates 👇🏼
www.matthewber...
Need AI Consulting? ✅
forwardfuture.ai/
Rent a GPU (MassedCompute) 🚀
bit.ly/matthew...
USE CODE "MatthewBerman" for 50% discount
My Links 🔗
👉🏻 Subscribe: / @matthew_berman
👉🏻 Twitter: / matthewberman
👉🏻 Discord: / discord
👉🏻 Patreon: / matthewberman
Media/Sponsorship Inquiries 📈
bit.ly/44TC45V
Links:
LMStudio - lmstudio.ai/

Комментарии • 364

@matthew_berman 10 месяцев назад ⁺⁸
The best discount for Black Friday: bit.ly/46bDM38
@MrAndi1281 10 месяцев назад ⁺⁶
Hi Matthew, i love you videos, watching all of them lately, but i have to ask, did you forget the Autogen Expert Tutorial??
@amandamate9117 10 месяцев назад ⁺¹
how to run deepseek-code-7B in ML Studio? its perfect for coding but i dont get a good answer. I dont know which Preset (on the right) to use for this model.
@SDGwynn 10 месяцев назад
Fake?
@Mike-Denver 10 месяцев назад ⁺¹
It would be great to see how it works with autogen and memgpt. And thank you Mat for this great job your are doing! Keep doing!
@matthew_berman 10 месяцев назад
@@MrAndi1281 haha no, but that’ll take a bit longer to put together
@kanishak13 10 месяцев назад ⁺¹⁵
I m blown by the possibilities it brings to the users who are not comfortable with the earlier present methods.
@jsmythib 7 месяцев назад ⁺⁹
I just tried the local server of LM Studio...using it, and its examples I had a c# console app setup and talking to it in about 15 minutes. Easiest API to use, maybe ever. So good I came here to mention it! :)
@TheZanzz27 10 месяцев назад ⁺⁹
I like how with nearly no context, "Mario" just pumped out a romance novel scene.....
@leandrotami 4 месяца назад
OMG I stopped the video to read it and honestly I never imagined Mario in such a context. What did he do to Peach!? Is it even Peach!?
@productivitygod7887 Месяц назад
that was wild as hell, mario and his taboo activities
@godned74 10 месяцев назад ⁺¹⁷
LM studio is awesome . running the server and operating open source models from an IDE I was able to get it to perform pretty much on par with gpt 3 j. just a bit slower . running the server is the way to give your llm the most tokens possible for inference while you formulate your questions around json's and SPR sparse primary representation prompts in the IDE. At one point I had dolphin 2.2 telling a story for over an hour strait with out stopping and not even repeating itself until I shut it off . Massive unexplored potential there.
@retex73 10 месяцев назад ⁺¹
OMG! What was the quality of the story like ? You think it could readable and enjoyable novels on demand ?
@bigglyguy8429 8 месяцев назад ⁺¹
@@retex73 I too am interested in this magic?
@Boneless1213 10 месяцев назад ⁺¹⁶
Do you have a running list of the best models for each category? I can't always remember which one you tested last for either coding or uncensored ect. Thanks for any comments.
@kai_s1985 10 месяцев назад ⁺⁷
Do they have a document upload feature, so that we can chat about our document like the custom GPTs?
@saintsscholars8231 10 месяцев назад ⁺²
Seconded.
I’m wondering about this too
@64jcl 10 месяцев назад ⁺²¹
In your demo you seem to only use 1 GPU layer. For my "old" Nvidia 2060 with 6GB I can easly do 40 layers on the GPU and it is very fast with for example the mistral dolphin 2.2.1 Q5 models. The API feature is brilliant, I use it for developing my own agent using a system message to give it some interesting features in its output (calling functions).
@irakmendez9985 10 месяцев назад ⁺⁴
Any link?
@bigglyguy8429 8 месяцев назад
@@irakmendez9985 You can search from inside the LM Studio software
@pipoviola 10 месяцев назад ⁺⁴
That was amazing. You are helping us so much introducing to all this tools. Thank you very much.
@markelshnops 10 месяцев назад ⁺¹³
Would be a little more useful if the system would allow you to upload documents so you could perform actions like summarization
@norhloudspeaker 10 месяцев назад
You can do that with GPT4ALL with a plugin.
@coinheadz1942 10 месяцев назад
learn how to code lol
@fossil98 10 месяцев назад ⁺⁵
10:04
😂 Indeed. I think we know what its finetuned on hahaha.
@adamstewarton 10 месяцев назад
Mario is definitely a hor.y little llm😂
@kalvinarts 10 месяцев назад ⁺⁷
I know this is very easy to use but there are plenty open source solutions to do the same. It would be good to inform about the data collection these companies are doi g on the users who use their software.
@64jcl 10 месяцев назад ⁺¹
Do you know if LM Studio actually collects anything? Has anyone run a packet sniffer to check if it actually sends packets somewhere?
@RZRRR1337 10 месяцев назад ⁺³
Like which one? Can you tell us some open source examples?
@RZRRR1337 10 месяцев назад ⁺¹
Like which one? Can you tell us some open source examples?
@NoidoDev 2 месяца назад
Any recommendation for doing stuff in CLI?!
@issiewizzie 10 месяцев назад ⁺²
got difusionbee for local picture generation
so its about time we had an easy way to use LLM on our local Maschine
@temp911Luke 10 месяцев назад ⁺⁷¹
The only problem is...its CLOSED source, not open source program.
@etunimenisukunimeni1302 10 месяцев назад ⁺¹⁶
I agree, but what I'm seeing here, that seems to indeed be the _only_ problem. Which is cool, unless closed source is a showstopper for you.
@temp911Luke 10 месяцев назад
@@etunimenisukunimeni1302 It looks great but unfortunately it is a showstopper, at least for now.
@vaisakhkm783 10 месяцев назад
@@etunimenisukunimeni1302 ig, someone will make open source version of this in the near future.. may be not with all features and this much polished, but mostly..
@hrishikeshkumar2264 10 месяцев назад ⁺⁷
Not sure if the title changed in the last 3 hours. But he only said about the model being open source. The closed source part is only the frontend which should be fine.
@olafge 10 месяцев назад ⁺⁶
TBH this is just a UI for open source models. I really can live with a closed source product here. There are actually OSS alternatives. So, no need to worry.
@ManiSaintVictor 10 месяцев назад ⁺⁶
Just in time! Thank you. How is the MemGPT setup process? I’m gonna try this out after work. Thanks.
@jaykrown 22 дня назад ⁺¹
Great video, thank you for explaining this.
@MichaelRamkissoon 10 месяцев назад ⁺⁹
Love this!!! Thanks for always giving a walkthrough.
@spencerfunk6697 10 месяцев назад ⁺⁴
please do a tutorial with this for memgpt. ive been using lm studio for a couple weeks now. ive seen people get memgpt to work with the server but some people have issue, me included
@spencerfunk6697 10 месяцев назад
or with anything that calls an openai api for that matter i just really wanna try memgpt and chat dev with this thing
@travotravo6190 10 месяцев назад ⁺¹
I've been trying this out and it honestly delivers. So easy to run your own AI's!
@ydmoskow 10 месяцев назад ⁺¹
We are about 1 year into gpt pandemonium and the momentum is only getting faster and everything is easier
@tobiaswegener1234 10 месяцев назад ⁺⁴
It's sadly not allowed for Commercial use. But indeed very easy to install and run.
@Appleloucious 7 месяцев назад
One Love!
Always forward, never ever backward!!
☀️☀️☀️
💚💛❤️
🙏🏿🙏🙏🏼
@bobbytables6629 9 месяцев назад ⁺¹
LMStudio lacks local documents, what a bummer I will continue to use GPT4All
@rakly3473 10 месяцев назад ⁺²
Those 'should work' etc is not based on your system, it's about compatibility with the LM Studio app. (GGUF models)
I have 128GB system and 40GB VRAM, and it also shows the 30GB+ required warning.
@Axxis270 10 месяцев назад
I have been yelling about this and Faraday (my favorite) for quite some time now, but for some reason you never see any of the ai channels telling you about. These are the easy to use programs that the majority of ai users want.
@theresalwaysanotherway3996 10 месяцев назад ⁺³
looks nice, but I wouldn't rely on it for testing in your videos until you can specify prompt formats (there's a good chance the model might be handicapped by the wrong format, currently it only lets you edit context, not the full prompt format). Also it only uses llama.cpp, which means anyone with an nvidia GPU could double their speed by switching to ExLlamaV2 and EXL2 quants.
@ezygoat 6 месяцев назад
I accidentally subscribed to you a long time ago, best decision I ever made.
@Parisneo 10 месяцев назад ⁺⁶
Very cool tool. Thanks for this nice tutorial.
I wish some day you give lollms a try. It has a models zoo and can run multiple types of models including GGUF , gptq and now awq. It has a persona system and can be installed with a single file install script. It supports basically all remote and local LLMs. Asits name suggests, it is built to support everything that crawls out there. It can be used to generate text, image and audio. It has an extension system (WIP) and It took me hell lot of effort to make. it is 100% free under apache 2.0 licence and there are documentations on my modest youtube channel. I think you can present it way better than I do :) .
@stickmanland 10 месяцев назад
Look who's here, guys!
@jtabox 10 месяцев назад
lollms is absolutely worth giving a try, I installed it and have been using it for a week now. It has so many features and functions it's almost unbelievable that it's just a couple of devs behind it all. It's a bit rough around the edges in some aspects, but still very much functional and new additions and bugfixes are published constantly. It's been my favorite so far.
@Steve.Jobless 10 месяцев назад ⁺⁴
Running the open-source models, but the software itself is not open source, lol
@imperialGaming.2473 6 месяцев назад
GPT killer other than Sora. This will be what LLM will look like in the near future! So excited to get my hands dirty! 😮
@paveljanetka2864 10 месяцев назад ⁺²
thanks for video, please could you advice how to work with local documents with the model?
@yerneroneroipas8668 10 месяцев назад ⁺²
Mario started writing 50 shades of grey for you 💀
@Buddylee-7 10 месяцев назад ⁺²
Wish they would add the chat with your docs feature
@Pietro-Caroleo-29 10 месяцев назад
Good afternoon Mr Berman... You have a talant doing these videos, you come over as clear as glass. well done.
@OutdoorsHappiness 10 месяцев назад ⁺¹
LMStudio looks pretty awesome, great job on giving us a tour, going to try it, thanks !
@theh1ve 10 месяцев назад ⁺²
Hmm what has Mario been up to 😂😂😂
@mutleyeng 4 месяца назад
im a complete coding/compter numty and got it running fine. Quest i dont know is how to take a basic base model and add learning to it. It told me it can extract information from webpages, but it dosnt seem very effective
@DikHi-fk1ol 10 месяцев назад ⁺¹
Off-topic question- how can i save a fine-tuned model that i fine-tuned using gradientAI to run it locally.
Please reply, love your videos!❤❤
@alibahrami6810 10 месяцев назад ⁺¹
How to run LMStudio with Autogen and MemGPT?
9 месяцев назад ⁺²
Is there a way to add your own text files, datafiles etc.? So when using the chat, it also knows the specific info about a subject from the files I provided?
@davidhendrie6061 7 месяцев назад
I am also very interested in this. i want to add tons of local video and audio content to the chosen LLM. would love to batch it in. anyone else doing that sort of thing?
@parthwagh3607 9 месяцев назад ⁺¹
Can you please provide a specification for PC build of $2400, which will run ai models locally in fastest way possible at this price. What things we should consider when building PC solely for running ai models locally and rarely gaming? What really helps to run this model fastest locally? please provide related information also. I want to build a PC with budget of $2400. Thank you.
@Leto2ndAtreides 10 месяцев назад
TheBloke also gives recommendations for which models to use or not use - not necessarily which one is the biggest that you can run.
@infinitytrading-ai 10 месяцев назад ⁺¹
can you make a tutorial on how to run and test local llm models on a linux server for business us. also using vector embeddings to allow way more data to chat with. ?
@friendofai 9 месяцев назад ⁺¹
Would you be able to cover more in-depth about the developer side? I would like to host on my local PC, but be able to access it from my android phone.
@johnne86sd 8 месяцев назад
I have a GTX 1660Ti with 6GBs VRAM and I got way faster results from my Nvidia card when setting n_gpulayers to around 20-30, instead of leaving at 0. Haven't tried anything higher than that, but the difference was night and day. I tried it on mostly 7B 4QKM/S models around 4-5Gbs.
@xdasdaasdasd4787 10 месяцев назад ⁺¹
Great video! Id love a lm studio with memgpt and autogen video if possible
@Joe_Brig 10 месяцев назад ⁺¹
Looks good an I'll try it. I'd argue that Ollama is much easier.
"ollama run mistral" vs
Open LMStudio click, click, click, click...
@Pyriold 10 месяцев назад
After opening its just one click if you use the same model as before. Maybe you have to reload the model, so ok, 2 clicks.
@Joe_Brig 10 месяцев назад
@@Pyriold How many clicks to find, download, and start a new model?
Compared to "ollama run vicuna"
How do you start a model from the terminal?
@Pyriold 10 месяцев назад
@@Joe_Brig Is this really relevant? I download a model maybe every few days or weeks and then its like 1 minute of work. Using a model is done way more often, and thats practically instant.
@theh1ve 10 месяцев назад ⁺³
What telemetry does it capture/send?
@silentwindstudio 7 месяцев назад
In their website they say that they capture no data from user, we can only pray that this is true lol
@HishamAl-Sanawi 10 месяцев назад
brilliant! thank you Matthew and thank you LM Studio
@ajaypranav1390 10 месяцев назад
Wow in your previous video commented on LMstudio and now I see a video on it. Wow you are the best
@nasimobeid2945 10 месяцев назад ⁺¹
Awesome content as always!
@SYEDNURULHasan1789 9 месяцев назад
crisp and concise content...
@aminalyaquob1387 3 месяца назад
awesome review! I wonder how to make the LLM constrained to read and analyze local files?
@JorgeGiro 5 месяцев назад
One thing I don't really understand is where I should place the php files, if I want to use php and curl, to access the local instance of the model.
@mishlaev 7 месяцев назад
Thank you for your tutorial and the channel. It would be nice if you can teach how to process files with LM Studio. For example, I have an email (HTML) that I want to parse and structure. I would be interesting to learn all the details how to tune temperature, tokens, context window, etc.
Thanks
@mdekleijn 10 месяцев назад
Love this! Thanks for sharing.
@aketo8082 5 месяцев назад
Looks great, Thank you. But LM Studio didn't work with own text, PDF or Docx files, right? Also no dialogue mode possible.
Is there a video that shows how to create own LLM? Thank you.
@sadeghnakhjavani1986 Месяц назад
Good job !
@RichardGetzPhotography 10 месяцев назад ⁺³
Matthew, can these models be DLed to an external drive and used from there? Can you set up Agents? No capability to upload files? Can you report how the M processor does against a GPU? How well does the locally ran dev version scale? Obviously based on the size of the computer this is running on, but will it handle multiple requests from developers?
@just..someone 10 месяцев назад ⁺¹
you can def. have the models on a separate drive, which is super useful. not sure about the rest, but to the last question: via the API mode (emulates style of open AI api) you can have several requests, that then get queued up one after the other.
@RichardGetzPhotography 10 месяцев назад ⁺¹
@@just..someone thanks for the reply
@ennio310 Месяц назад
Thank you for you tutorial! Some questions for you:
Can I upload text or other format document and question the model about the content?
And I need to be sure that the privacy of the text/document uploaded is granted. If the computer is online, the contents uploaded would always and anyway be private?
@Ilan-Aviv 14 дней назад
Love your videos :)))
@TrevorMatthews 10 месяцев назад ⁺²
Thanks @matthew_berman One challenge I haven't solved yet is moving an environment. At the office I have the OK to explore LLM potential BUT within the existing software and hardware constraints. My PC is good enough, but our network is so locked down none of the scripts can pull down requirement files and libraries. I'd need to setup an environment on an 'internet facing' computer and then be able to move it. And run it. Is that possible??
@OpenLLM4All 10 месяцев назад ⁺²
Could try using a VM. I noticed a company called Massed Compute has VMs specifically for Matthew. All of the tools he has used in his videos are pre-loaded
@beeeev 10 месяцев назад ⁺¹
But can you fine tune the models or have it access your private documents locally on your computer?
@joserodolfobeluzo3100 8 месяцев назад
How can I do fine tunning with my context? Is there any video that you explain it? It should be amazing! I tried LM studio! So easy! Thanks a lot!
@davidhendrie6061 7 месяцев назад ⁺¹
keep me updated
@steveyantis 5 месяцев назад
Thanks!
@AlejandroGarro Месяц назад
I Have a suggestion.... If is it possible make a video about how to get and see image ressponses from a image generation model like FLUX.1 or Stable Diffusion 2. I ran it in LM Studio by i got a markdown with an unuseful link to imgur or other similiar.. None of thos links works.. Thanks
@profittaker6662 5 месяцев назад
can you make a video about how to make that server in localhost, python and curl versions
@danaharden6283 8 месяцев назад
If you recommend using the latest best LLM, what do you recommend now?
@BigFarm_ah365 5 месяцев назад
Installed but none of the models are working and the log doesn't really point to anything. My CPU is fine, I have 16GB RAM, 12GB VRAM and can't run any models
@dlbet4110 4 месяца назад
I'm trying this on an older computer. I got the message "This processor does not support AVX2 instructions." is there a way to get this to work? Obviously, it would work on a newer processor, but I don't want to test it on computers I actually use. Or, is there a model that will work on less than AVX2 instructions?
@jack-IR 5 месяцев назад
you got the subscribe for the last part
@dylanalliata4809 10 месяцев назад
Very well done.
@trashboat2821 10 месяцев назад
Awesome! are you going to create a video on OpenAI's upcoming 'create your own gpt'? would love a video covering that, and exploring any alternatives for Mistral or Llama (ie open source).
@BabylonBaller 10 месяцев назад ⁺¹
I would love to install this, but I dont think it has a local web option like Gradio. Which would allow me to access it from any device in my network or from outside my home through a local IP / port
@Pyriold 10 месяцев назад
It has a local server mode, that was shown in the video.
@BabylonBaller 10 месяцев назад
@@Pyriold I did see that but from the video it seems its an api backend type of connection only, not a connection that has a gui and complete usage that you can use by simply browsing to the ip and port and see the entire front end like you can with Oobabooga, and Automatic 1111
@peterwan816 9 месяцев назад
真係好撚正XD
Its JUST AWESOME!!!
@petarstoev4848 6 месяцев назад
can you load pdf files for summarization etc. in those models?
@propolipropoli 4 месяца назад
Best Video Ever
@ZiB-Music 2 месяца назад
I use a laptop with 128 ssd. But VRAM is verry low!? How do i get more VRAM?? Its verey low, like 1.48 and i need way more...
@abdussamed107 9 месяцев назад ⁺¹
First I want to thank for sharing the useful AI content.
The LM Studio software was a key step to bring AI assistants a step closer to the customers and consumer.
I made use of the software as well and was recently experimenting with dolphin mistral llm 2.2.1 and wondered after a while what the token count 4984/2048 at the bottom right below the chat input means. As far as I understood, it's some sort of counter how many tokens the llm already has written and answered, but why does it matter? Is the chat history fed into the language model each time we enter something new, and this happens somehow behind the scenes? When these language models are working like this, I would understand that the natural limit of the input the language model supports also is the maximum size of the chat history.
I am not very familiar with LLM s and just started experimenting with them. Could someone please explain why the token count: yxcd/yxcd number is there and how it affects the Assistants' performance or affect the chat in which way?
Thanks in advance
@m12652 6 месяцев назад
How many millions of tons of carbon are being wasted listening to these models apologise?
@bigchungus3 10 месяцев назад ⁺²
Nooooooooooooooo not the hands on face shocked thumbnail. You're better than that.
@darksoul525 Месяц назад
Does the mac version work on macs with dual-core intel i5 or just the m1/m2?
@wilkerribeiro1997 10 месяцев назад
Could you explain more about how that "Apple Metal" configuration works? Is it only for models trained on apple metal? What changes if it is enabled or not?
@Pyriold 10 месяцев назад
I think training and inference are totally decoupled, so it does not matter how it was trained, you can use whatever hardware for inference.
@cold_static 7 месяцев назад
So can I use this with an AMD card effectively? It says on the site it supports AMD cards, but it only uses my CPU in the end..
@VolleyballFamilyAndFriends 10 месяцев назад
I'm curious how to get this working with whisper
@brianmi40 6 месяцев назад
LOL, went a bit RACY on the New Chat! "so taboo and so wrong"... "began to gather his clothes"
@secfeed6987 4 месяца назад
does the server keep all chat history, I want it to learn from my style and feed it data so I can train to how I like. Is that possible?
@tangobayus 5 месяцев назад
How do I get it to use my private documents?
@Skettalee 8 месяцев назад
There are so many models out there Ive been looking and my question is really can i find a model that would be able to answer any computer error messages i get and how to fix them, what model would i pick if I wanted that? ANd i guess as a side note im also a musician / music producer and would like to find a model that is best for music production or creative writing songs both chord structure/progressions as well as song ideas and lyrics. How do i find that? Ive already searched any and all the keywords I could think up to find that stuff but nearly any of the keywords I try does not bring up anything so im guessing I just dont understand how to look for things. Any help like a link of something to read or anything would be amazing.
@Beauty.and.FashionPhotographer 5 месяцев назад
is this an alternative to Midjourney , so text to image for macs?
@chrisbraeuer9476 10 месяцев назад
This is awesome.
@BrockWilcox-q3y 10 месяцев назад
Alt title: Running open-source models using closed-source software
@RichardGetzPhotography 10 месяцев назад
I guess this is a great time to purchase an M MBP?
@dhiraj223 3 месяца назад
Can we load safetensor models as well ?
@rogerbruce2896 10 месяцев назад
quick question, when you download how do you specify what hard drive to download to?
@NetworkDon 6 месяцев назад
Can you give it your own dataset and files for it to learn from and provide replies about?
@RZRRR1337 10 месяцев назад
Is there any playground studio like that but for commercial llms where you put your API keys and can play with anthropic, openAI, Cohere models in one interface?
@knkn5049 9 месяцев назад
Explain to me, isnt neural network is just a config file of weights and net structure? You train model, save it for the future, and if you downloaded something, dont you have getconfig method? What is it all about? Download our programm to run locally models...? I m very confused
@Skettalee 8 месяцев назад
I asked earlier about the best models for computer error messages and ChatGPT told me "bert-large-uncased-whole-word-masking-finetuned-squad" But I tried putting all OR even ANY of most of those words in the search and cant find it. I even looked for that model on Hugging Face and got the URL to paste into LM Studios search box (like it told me it could do) and yet it still didn't find anything for me to download. Is this program still good to try to get this stuff?
@tcb133 10 месяцев назад
Could you make a video about using this server Mode to power aider? I couldn't figure how to do it
@msa100rus 9 месяцев назад
I'm not sure what I'm doing wrong, but for some reason many models in this program give out nonsense. Or repeated phrases. Or something completely indistinct. It's very strange, especially when I take a model that should be very smart, judging by its ratings. What could be the problem?

Следующие

Автовоспроизведение

Create Custom GPTs 🫡 OpenAI's AGENTS Are Here! (No Code)