Someone posted a comment about a 3090 on Windows. I hate this interface: too easy to delete and no way to get it back. Ask again, whoever you were.
But the reason it was using so much CPU is that you need a GPU with more memory if you want to run a 70B model fast, especially if there is a lot of context.
@@technovangelist Ok, so the 64GB memory requirement for 70B is at the GPU level, not the PC level. The RTX 3090 has only 24GB of GDDR6X memory, which is why it is so sluggish. So I will be shopping for something like an A100 with 80GB of HBM2e memory, and another rig, since I won't be playing Helldivers 2 on that one!! ;-)
I am so glad you came back. So sorry about that. There were some really sketchy comments for ages a while back, so I approve each one, but the approve and delete buttons are close together on the phone. My daughter stepped out for a second and then DEMANDED I start reading Sophie Mouse right away (hopefully the need to read the same story every night goes away soon after she turns 6), and I fat-fingered it.
Hi Matt, are there any drivers for Intel UHD? Ollama seems to use only the CPU.
AMD and Nvidia only.
I literally checked their website 3-4 hours ago to see if they had the Windows version up. It wasn't there; now it is.
I checked this morning lol
Magic 😁
Hi! I come from using textgen webui, how does it compare to Ollama? What parameters like, temp, rep penalty etc does it use? Also, what system prompt does it use?
The system prompt and template come from the model. They are already set (and customizable if you like) in the model itself.
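For anyone who wants to inspect what a model already ships with, Ollama can print the modelfile a model was built from (the model name here is just an example):

```
ollama show llama3 --modelfile
```

That output includes the TEMPLATE and SYSTEM lines, which is what the answer above is referring to.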
Thanks for the great video! Ollama was super easy to get up and running on Windows. I've been testing "ollama run llama3.2" on a few different machines:
My refurbished Lenovo with an Intel i7, 64GB RAM, and 1TB SSD (no GPU) handles it pretty well for only $280!
My old HP ProLiant server with dual quad-core Xeons (no GPU) is a bit faster, but it cost me $10K years ago 😅
My M2 MacBook Pro with 32GB RAM and 1TB SSD absolutely flies through it! Definitely worth the $3.5K price tag for this kind of performance
Thanks for giving me enough time to get hooked on Linux.
I was actually waiting for this, but just like you said, decided to install Linux on all my machines :)
Thanks for updating. I was struggling with WSL and almost gave up, then your vid showed up on my YouTube homepage. XD
Thank you. I’ve been waiting for this!
Thank you so much. I never touched Linux in my life, so this really saved me.
Thanks for bringing the good news! Just installed it yesterday on WSL2, guess I'll reinstall it natively now.
There are directions that work and those that do not; AI is difficult and there is a lot of fake stuff out there. Trusting you is so easy. You are GOLD!
Yay! I have a Windows Gaming Laptop that I have been dying to try Ollama on because of the GPU. This is going to be Soooo much better! Thanks guys!
That's really good news, thanks! :D :D :D
It was about time! XD
Yay! No more WSL2. Thank you.
This is amazing, and THANK YOU! 💌 Do you have to do anything special for Ollama to use the Nvidia drivers?
That weird silence at the end XD
I open RUclips for a nice session of video watching and sir you have not disappointed! The timing was perfection!
Been waiting for this day!!!
Edit: Also for those wondering it is on the main Ollama webpage now too :)
Very nice sharing 👍
congratulations !
This is very cool. I am getting a 10% improvement running native compared to WSL.
Omg awesome thanks for the update
What?! YAY!!! Thank you very much Ollama wanna play 🙂
Hi, I have Llama 3.1 running on an RTX 2060 Super. Computational resource requirements aren't nearly as high as people think with these models; hope this helps someone who thinks they need tens of thousands in equipment :)
It can need a lot if you configure it to use the full context. Straight from Ollama it uses a 2K context, but the model can support up to 128K.
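For reference, the context window can be raised per session from inside the Ollama REPL; a minimal sketch, assuming the model actually supports the larger window (the model name and the 8192 value are just examples):

```
ollama run llama3
>>> /set parameter num_ctx 8192
```

Memory use grows with the context size, which is why the default is kept small.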
This is great, thank you! I do have a question. When I run LM Studio (Windows), I get a URL I can use to call the model via the API from my Python applications. I'm assuming Ollama has the same functionality. How do I set that up, and where do I find the URL to plug into my scripts? Thanks again!
localhost:11434/api/chat or /api/generate. All the docs are in the repo. This is an area where Ollama really shines over the alternatives.
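To sketch what calling that endpoint from a Python script might look like (the model name and prompt are placeholders, and this assumes Ollama is running on its default port):

```python
import json
import urllib.request

# Default Ollama endpoint; /api/chat works similarly with a "messages" list
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(model: str, prompt: str) -> dict:
    # stream=False asks for one complete JSON response instead of chunks
    return {"model": model, "prompt": prompt, "stream": False}

def ask(model: str, prompt: str) -> str:
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(model, prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # The generated text comes back in the "response" field
        return json.loads(resp.read())["response"]

# Example (requires a running Ollama server and a pulled model):
#   print(ask("llama3", "Why is the sky blue?"))
```

No extra setup is needed: the server starts with the app, so the URL is always the same.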
I have a bunch of models I downloaded under WSL2. Can I migrate them to the Windows install of Ollama? It would be nice to save the time and data of not re-downloading them all.
Does this mean we'll have nuclear fusion this week???!!
I haven't had time to check out Ollama yet; what are the advantages of Ollama over oobabooga? Is it mostly ease of use? Does it have significant features that justify jumping over from oobabooga? I already have hundreds of GBs of models downloaded; can I convert them to your blob format without having to re-download? Those are just my initial questions. I'll be checking it out regardless.
I would say ease of use and power. You can create new models from the weights you already downloaded, but you need to know the prompt and template. That should be available in your current app or wherever you got the models from. There are docs in the repo.
@@technovangelist Awesome, thanks!
Finally
And thanks for reminding me to drink water❤
Hello, any idea how to set keep_alive when running the Windows exe?
you are THE GUY thx
Dumb question, but I have Ollama WebUI on my Synology NAS and Ollama on my PC (GPU). How would one make the NAS WebUI work with my PC????
Does Ollama have an SSE2, etc. requirement? I have a 12-core Xeon with 80GB RAM & 12GB VRAM. Weirdly, the common Python tools often have CPU requirements, but it's mostly on GPU, right?
I've never heard that one come up for Ollama; I don't think so. Ollama has no Python code at all. The first version did, but it was removed quickly because it was way too limiting. There used to be a requirement for AVX, but now it only uses it if it's there.
Great work. Now can you create more channels for the Discord? One is a mess.
Matt, I want to move the model blobs off my C: drive to another drive where I have more free disk space. Does the Windows Ollama support the OLLAMA_MODELS environment variable somehow? And is there a way to confirm that Ollama is detecting my NVIDIA GPU?
Yes, that env var is the way to do it.
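On Windows, one way to set that variable persistently is from a command prompt (the path here is just an example); restart Ollama afterwards so it picks up the change:

```
setx OLLAMA_MODELS "D:\ollama\models"
```

After restarting, new pulls land on the other drive; existing blobs can be moved into that folder manually.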
Hi, Ollama doesn't seem to be utilising my GPU; what could be the cause of this?
This is great, thanks. How do I install it in a different location?
I have a Dell Latitude 3410 with Intel Core i5 10310U 8 core 1.7 GHz, 16 GB RAM, 8 shared with Intel UHD. When running ollama with some of the models like llama3 or qwen, I noticed in the Task Manager that the GPU is hardly being utilized. Is there some way to utilize the GPU and improve the performance of a model? Needless to say, the output is agonizingly slow.
I don't see any mention of an Nvidia or AMD GPU. Without one, no GPU can be used.
Thanks for this. I downloaded mistral-7B-v0.1 from somewhere on the internet. How do I load this model that I have locally on my hard drive?
Watch the video Adding Custom Models to Ollama
ruclips.net/video/0ou51l-MLCo/видео.html
@@technovangelist thank you
Hey, can anyone help me with this: "wsarecv: An existing connection was forcibly closed by the remote host."
So if I do what you said in the video, how can I make it listen to my voice and speak back to me like GPT-4o?
When we download a model, e.g. mistral, where is it stored on our local machine? Since it's available locally, we don't need the internet, right?
Yes. All local after you get it.
What if I already have LLM files downloaded? How can I point Ollama to the folder and use them?
Yes, but often it will be faster just to pull the models. That said, you can create a modelfile for each of the model weights you have, include the system prompt and parameters, and create a new model from that. There are some videos here about creating new Ollama models, or you can refer to the docs on modelfiles.
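A minimal modelfile for existing GGUF weights might look like this; the file path, template, and system prompt below are placeholders, and the right template depends on which model the weights come from:

```
# Modelfile (path and template are examples; match them to your model)
FROM ./mistral-7b-v0.1.Q4_K_M.gguf
TEMPLATE "[INST] {{ .Prompt }} [/INST]"
SYSTEM "You are a helpful assistant."
```

Then `ollama create mymodel -f Modelfile` builds the model from the local weights without re-downloading anything.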
If I want to use Ollama as the LLM in a chatbot, would it be useful?
Yes, definitely
Now, how do I get a code assistant in VS Code with Ollama on Windows?
Finally ❤
Thank you
How do I make llama3 uncensored for research purposes?
You can run other unrestricted open-source AI models.
You can't; I just wasted 3 hours trying.
Error: no suitable llama servers found
Sometimes rare edge cases come up. You will probably get a quicker response in the Ollama Discord at discord.gg/ollama.
It was about time lol
Viewer: "You know you can use multiple takes in case you flub a line."
Matt: "What did you just say to me?"
Huh? I don’t understand
Why would I use this over LMStudio?
The reason I have seen mentioned most is that LM Studio is a great place to start, but it's limiting. I haven't spent much time with it because it's too frustrating, but that's the feedback I have seen online. Not sure if that helps.
@@technovangelist plus LMStudio is not open source
Hello, does anyone know how to delete the Ollama models from a Linux distribution on Win 11? I cannot delete the exact Ollama models, and sudo rm (ollama model) only deleted the name, I suppose; the disk space is still the same.
ollama rm modelname will delete the model. If another model uses the same weights file, that will stay, so you would need to delete all the models that use the same weights to see a difference.
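For anyone following along, the commands look like this (the model name is just an example):

```
ollama list        # show installed models and their sizes
ollama rm llama3   # remove the model; shared weight blobs are freed
                   # only when no remaining model references them
```

This is the right way to reclaim space, rather than deleting blob files by hand with rm.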
Hi Matt, can we use Ollama without a GPU? If yes, how?
Yes you can! But it's super slow. A GPU is really needed to make the experience good.
@@technovangelist Yes, you are correct. I checked it on WSL2 with GGUF; it was like 3-5 seconds per token on my machine.
Can I install it on the D: drive?
Thanks for the update on the Windows version. Can Ollama run with CPU only and not GPU? (For example, the GGUF quantized version.) I have 32GB of RAM and 2GB of VRAM on my laptop.
So Ollama uses the quantized GGUF models. That means it will use the GPU when there is a good GPU there, but it will drop down to CPU when there's not.
204MB/s
DAMNNNNNNNNNNNNNNNNNNNN
So it runs in the CLI? Why is that?
You would prefer it not be on Windows?
Finally on windows… because windows sucks ass…
Glad to see this, though already running it in Docker and not sure what the advantage of switching to native is considering I don't have an NVidia GPU.
Without a GPU, native is going to be far faster, because you don't have the multiple levels of abstraction. Docker on Windows is going to be the slowest of the three options.
@@technovangelist cool thanks, will definitely give it a go
Windows Error??:
Is anyone getting an error running a model on Windows? When it finishes pulling, or when I try to run a model, I get:
Error: Post "127.0.0.1:11434/api/chat": read tcp 127.0.0.1:51387->127.0.0.1:11434: wsarecv: An existing connection was forcibly closed by the remote host.
I fully deleted my previous WSL install and can't see any port 11434 conflict.
Any ideas??