FREE Local LLMs on Apple Silicon | FAST!
- Published: Jan 30, 2025
- Step by step setup guide for a totally local LLM with a ChatGPT-like UI, backend and frontend, and a Docker option.
Temperature/fan on your Mac: www.tunabellys... (affiliate link)
Run Windows on a Mac: prf.hn/click/c... (affiliate)
Use COUPON: ZISKIND10
🛒 Gear Links 🛒
🍏💥 New MacBook Air M1 Deal: amzn.to/3S59ID8
💻🔄 Renewed MacBook Air M1 Deal: amzn.to/45K1Gmk
🎧⚡ Great 40Gbps T4 enclosure: amzn.to/3JNwBGW
🛠️🚀 My nvme ssd: amzn.to/3YLEySo
📦🎮 My gear: www.amazon.com...
🎥 Related Videos 🎥
🌗 RAM torture test on Mac - • TRUTH about RAM vs SSD...
🛠️ Host the PERFECT Prompt - • Hosting the PERFECT Pr...
🛠️ Set up Conda on Mac - • python environment set...
🛠️ Set up Node on Mac - • Install Node and NVM o...
🤖 INSANE Machine Learning on Neural Engine - • INSANE Machine Learnin...
💰 This is what spending more on a MacBook Pro gets you - • Spend MORE on a MacBoo...
🛠️ Developer productivity Playlist - • Developer Productivity
🔗 AI for Coding Playlist: 📚 - • AI
Repo
github.com/ope...
Docs
docs.openwebui...
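For reference, the manual (non-Docker) route shown in the video builds Open WebUI from source; a rough sketch based on the project's README (verify the current steps there):
# frontend needs a recent Node.js, backend needs Python 3.11
git clone https://github.com/open-webui/open-webui.git
cd open-webui
npm install && npm run build
cd backend
pip install -r requirements.txt
bash start.sh  # serves the UI on http://localhost:8080 by default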
Docker Single Command
docker run -d --network=host -v open-webui:/app/backend/data -e OLLAMA_BASE_URL=http://127.0.0.1:11434 --name open-webui --restart always ghcr.io/open-webui/open-webui:main
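If your Docker Desktop is older than 4.29 (host networking on macOS arrived there as a beta feature), a common alternative from the Open WebUI docs maps the port and reaches Ollama via host.docker.internal instead; a sketch, verify against the docs:
docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data -e OLLAMA_BASE_URL=http://host.docker.internal:11434 --name open-webui --restart always ghcr.io/open-webui/open-webui:main
# then open http://localhost:3000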
- - - - - - - - -
❤️ SUBSCRIBE TO MY YOUTUBE CHANNEL 📺
Click here to subscribe: / @azisk
- - - - - - - - -
Join this channel to get access to perks:
/ @azisk
- - - - - - - - -
📱 ALEX ON X: / digitalix
#machinelearning #llm #softwaredevelopment
Thanks! I am a data scientist and I’ve been looking for a way to share my prototypes visually for talks, demos, w/ non-tech ppl, etc. This is awesome. Thank you! ❤
thank you 😊
This channel is the gift that keeps on giving.
I like the manual installation process because it uses fewer resources than having Docker running all the time on your Mac.
thanks for the tutorial, I really enjoyed doing it and seeing how it works.
you can literally stop it
For anyone looking: on an M1 Mac with 8GB RAM, go for Llama 3 8B set to 3-bit quantization (K_S) and you can still have Firefox and Chrome open and watch background videos
this is with Ollama, SillyTavern, and a 4K context size; one full prompt generation per minute, which is good for the low-RAM specs
thanks, how many tokens per second?
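If you want to reproduce the low-RAM setup above, Ollama selects a quantization via the model tag; the exact tag below is an assumption, so check the Ollama model library for the tags that actually exist:
# pull and run a 3-bit K_S quant of Llama 3 8B (tag name assumed)
ollama run llama3:8b-instruct-q3_K_S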
Thanks!
Wow 🤩 thanks so much!
Here you have a super like - and a cup of coffee 🙂
Yay, thank you! I haven't been to Denmark in a while - beautiful country.
Thanks! Very nice video
Wow! Thank you!
Tons of appreciation to you on behalf of the whole Mac community, you are amazing!
I have no idea what you are talking about most of the time, but I find your videos interesting. Maybe I'll learn something.
I really like that you showed the non-docker install first. I think too many rely on docker black-boxes. I prefer this. Thanks!
Docker isn't a black box. You can get inside the containers and change stuff!!!
Respectfully, Docker need not be a black box. Don’t be afraid to tinker and dig in. :) But I get how doing it manually forces you to touch different parts.
I tried the local install before your explanation and failed. Docker consumes a lot of RAM and CPU, so I will try again locally with your instructions.
Another great video Alex, I really enjoy your videos. And I really appreciate your perfect diction in English, which makes it easy to follow your explanations even for those who do not have English as their first language.
I really appreciated your approach showing "behind the scenes" instead of just running Docker. Great video, as always!
Great video Alex! yes please make videos on image generation!
These videos are so exciting for me; this channel is the number one on YouTube. That's why I subscribe and gladly pay for YouTube Premium. A hug, Alex!
thanks for saying! means a lot
Now we need 1TB MEMORY DRIVES (Like the Amiga used to have 'fast ram' )
@@AZisk Is there any chance you could include a PC GPU relative-performance equivalent for each new Apple Silicon chip that you review?
Yes Alex, you would help us even more if we could learn from you how to add an image generator as well. Thank you for your time and collaboration. Your channel is a must-have subscription nowadays.
A video on how you could incorporate these LLMs in your applications would be super interesting! Let's say that in your application you have a set of PDFs or HTML files that document your product. If you let these LLMs analyze that documentation, the user could get very useful information just by asking instead of searching through all of the documentation files!
+1
Ollama has API endpoints that you can integrate into your apps. Check their documentation.
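For reference, Ollama's REST API listens on localhost:11434 by default; a minimal generation call looks like this (assumes the model has already been pulled):
curl http://localhost:11434/api/generate -d '{"model": "llama3", "prompt": "Why is the sky blue?", "stream": false}'
# returns a JSON object with the completion in the "response" field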
YO! Finally hearing of a big Svelte project!
Like, really, it's so much quicker and easier to ship with Svelte than with other frameworks; why am I only seeing this now?
Svelte for the win!
Well, Apple, Brave, the New York Times, and IKEA, among other big names, all use Svelte.
@@precisionchoker But they don't acknowledge it much.
This channel is going to be growing so fast; you make great videos that are very helpful!
My M1 Mac 16GB be real frightened on the side rn.
Got a MacBook with the same specs. Tried to run the 15B StarCoder2 quantized to Q5_K_M in LM Studio, max GPU layers; getting around 12-13 tokens per sec. Not great, but manageable.
Don't be, unless you're also running other super heavy things. Llama 3 8B takes up about 4.7GB of RAM, and with Apple Silicon's efficient use of the NVMe and swap you'll be fine. (I now prefer LM Studio over Ollama, as it has a CLI and web UI built in, so there's no need for Docker/OrbStack; but Ollama on its own without a WebUI works too.)
😂
Great video. What format are LLM models downloaded in? I'm looking into how I can use models downloaded with Ollama with other technologies like .NET.
Ay I was about to try too! Let me know if it runs alright, I've got the 8 core CPU/GPU 16GB M1 MacBook Air
This video is so great; it took me forever to stumble on it. I want to watch the other videos you mentioned, but your channel has SO many. Which playlist will get me everything I need for running LLM stuff on my M2 Studio with 128GB? I'm a beginner/novice.
Amazing ! So easy to set up and start working with ! Thank you Alex!
Nice video. What's the benefit of running Open WebUI vs LM Studio? What's the difference, and which is better?
I was gonna spring for a maxed M3 Max MBP, but saw rumors that the M4 Max will have more AI-related chops, so just picked up a maxed M1 Max to tide me over 😁
Really excited about setting all this up, finding this vid was very timely, thanks!
Thank you Alex for the awesome video! Suggestion for a future video: how to set up and work with a local database and pipeline using this webUI
BTW - One of the BEST programmer channels!
I've just started my career as a Data Scientist, and I found this video awesome! 🤩🥳 Could you please consider making a video on image generation (with Llama 3) in a private PC environment? 🥺🥺
Great video!! And yes, please add a video explaining how to add the images generator.
If you're trying Docker, make sure it's version 4.29+, as the host network driver for Mac arrived there as a beta feature.
This was total gibberish to me, but I watched the whole thing and it actually made me want to learn how to do it (maybe one day).
Hi Alex, thank you for the great video. I was following the NVM and Node installation. You recommend not installing globally... How is that done? Do you have to do it inside a conda env? Thank you
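For anyone with the same question: nvm installs Node per-user under ~/.nvm, so nothing goes in system paths and no conda env is involved; a minimal sketch (the installer version is pinned here as an example, check the nvm repo for the latest):
# install nvm (per-user, no sudo), then a Node LTS release
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.7/install.sh | bash
source ~/.nvm/nvm.sh
nvm install --lts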
Amazing tutorial. Great stuff!
Thank you! Cheers!
By tracking down your Conda video, I was able to get this running.
I have some web dev and Linux experience, so it wasn’t a huge chore but certainly not easy going in relatively blind.
Great tutorial though. Much thanks.
Yes I’m interested in an image generation video. I’m running llama3 in Bash, haven’t had time to set up a front end yet. Cool video.
Thanks Alex for videos like this 👍
I would like to see Image generation follow up video 😍
Excellent video! Giving it a try tonight on my M3 Max 14-inch model; I'll see what the results are and probably share them...
Great video. Does anyone know the largest Llama model we can install on a Mac Studio?
I have a significant request for you as the author of the channel. Could you test Apple computers with M4 Pro, Max, and M2 Ultra chips, equipped with 48, 64, or 128 GB of RAM, to determine the maximum size of local LLMs that can be practically used with acceptable performance? The question is whether it makes sense to invest in the 128 GB versions, or if 64 or even 48 GB would be sufficient, considering that larger models might be unusable due to insufficient computational power.
As an LLM user, I've run into the issue of not being able to properly deploy models larger than 14B on my 12 GB GPU. I am particularly interested in the practical use of LLMs on such machines. 14B models are not satisfactory for me due to their limitations in accuracy and capability.
Is it fast on an M1 Pro Mac too?
How much storage is used for the whole installation?
Your video is awesome!
Just some food for thought for future vids: Anaconda's licensing terms changed to require any org > 200 employees to license it. For this reason, many Enterprises are steering their devs away from Anaconda. Would be helpful if the tutorials used "vanilla" Python (e.g.: venv) unless Conda were truly necessary. Thanks for the vids and keep up the great work!
good to know. thanks
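For those steering away from conda: Open WebUI also ships as a pip package, so a plain venv works; a minimal sketch assuming Python 3.11 is installed and the package name matches the project's docs:
python3.11 -m venv .venv
source .venv/bin/activate
pip install open-webui
open-webui serve  # UI on http://localhost:8080 by default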
Can't believe I found this video today; I just started searching for local LLMs yesterday, and today I found the complete guide. Great video Alex :)
You live in the Matrix. Wake up
This is really cool, love the channel and the videos Alex! Just curious, how is this different to an app like LM Studio? Keep up the good work!
My guess is that this web UI has more capabilities such as image generation which LM Studio doesn’t have. If the goal is simply to have text interaction, then I agree that this may not be necessary
Yes yes please make a video generation video!!!
Would love to see an extended video on working with local files for this option.
I believe my laptop has 80 Tensor cores, for starters. This looks like a really good shift for a Friday night! Thanks.
Nice. Image generation and integrating the new ChatGPT into this would be great.
Can you train these local LLMs with your own code files? For example adding all files from a project as context so the AI suggests things based on your current code structure and classes.
Yep, and you can then build a RAG with the LLMs you prefer. I'll be making my own RAG with Llama 3 this weekend.
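If you build your own RAG against a local Ollama, the embedding side is one API call per text chunk; a sketch, assuming you've pulled an embedding model first (the model name here is just an example):
# ollama pull nomic-embed-text
curl http://localhost:11434/api/embeddings -d '{"model": "nomic-embed-text", "prompt": "a paragraph from one of my PDFs to index"}'
# returns {"embedding": [...]} to store in your vector index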
Can you make a video about exactly what a prompt is and how prompts differ from each other?
by the way, I just joined your channel, I really enjoyed these videos, very helpful, thanks!
awesome. welcome!
So cool, and it's free (if we don't count the 4 grand spent on the machine). I'd love to see the image generation.
Awesome, looking forward to more videos on the way.
Great channel! I just built something similar with LM Studio and a Flask-based web UI. I'm going to try this method now. BTW, what was the 'code .' command you ran? Are you using Visual Studio Code? Thanks again!
Thanks! And thanks for joining. I did the Flask thing a few videos ago, but it's just another thing to maintain. I find this WebUI a lot more feature-rich and better looking. And yes, the 'code .' command just opens the current folder in VS Code.
Thanks a lot for the video, yes we want the images part :D
Great video! What Mac is he running here? I can't find the specs.
Great video! So are you saying that we can get ChatGPT-like quality, just faster, more private, and for free, by running local LLMs on our personal machines? Like, do you feel this replaces ChatGPT?
I am running Llama and CodeGemma on my laptop for local file intelligence. It's slow, but damn, it reads all my PDFs and gives a perfect overview.
Do you do it through Ollama and Open WebUI? I'm curious how you can send files to be processed by LLMs.
@@devinou-programmationtechn9979 GPT4All works fairly well with attachments. But I personally use Obsidian as a RAG to process markdown files and PDFs. There are tons of plugins like Text Generator and Smart Connections that can work with Ollama, LM Studio, etc.
Can you describe this "perfect overview"? Just curious what you mean by it.
Yes, I'm running Open WebUI for the Llama and CodeGemma LLMs on a Windows machine. Running Open WebUI on localhost gives you a text area where you can upload a file. The upload takes time, but once it's done, you can ask questions like "give me an overview of this document" or "tell me all the important points of this document".
Gemma doesn’t seem to work well on Apple silicon
Thanks, really nice. I wonder: if you have everything local on your computer, why do you need to be connected to the internet to use your local port in the UI?
Hey Alex, would you say Apple is in a good position when it comes to AI and the required hardware? So far Apple has been really quiet, and lots of people don't think Apple can have an edge here. What are your thoughts in general?
Alex, I love this video very much. Thank you!
Woot woot! great stuff. Nice easy tutorial and I now have a 'smarter' Mac. Thanks :)
Thank you, got it to work without docker
Very interesting, will definitely be trying this when I get a little downtime!
I would love to see the image generation tutorial 😁
Alex, excellent video!
Can my MacBook Air M2 with 16GB RAM host these AI engines smoothly?
Great Video Alex. Thanks.
Glad you liked it!
Please show image generation
Amazing video omg, incredible tutorial man
Glad you liked it!
Hey Alex, why don't you teach us how to program? Start a series in Python, C++, or Swift...
Just read the documentation and you are good to go.
Hi Alex, I really appreciate your work. Can you please suggest a model for producing a detailed summary and breakdown of quantities from construction PDF drawings? It should organize the quantities by type (linear feet, square feet, each) and categorize them by division (e.g., General Requirements, Sitework). Waiting for your response!
Thanks for this video..please create a video about image generators.
Simple and nice! Thanks Alex!
The question is: are open-source LLMs just as good as, say, ChatGPT or Gemini?
Amazing video! I'd just recommend Volta over nvm.
Great video, congrats. But how do you install Node? Would it not work with a Python frontend?
Hi Alex, I would like to see the image generation video
One thing for sure... I'll be implementing this on my menu bar for easy access :D
Thanks! Is a MacBook Air enough for that?
Please do a video about image generation!!
The email setup in Open WebUI is not fake. It actually depends on confirming the link, so it has an external connection. Also, the model connects to the web for processing, so I am not sure how this is classified as local (I am new to this, so any correction is welcome).
What is a co suppository? Is that like pooping back and forth
Tried Llama 3 on an 8GB RAM M1 :D ... I guess I was too optimistic.
Phi-3 is probably the way to go.
Thanks for your cool videos. I have a question: if I don't want to install conda but pyenv instead, I don't see any Python 3.11 available. So can I install conda in parallel with my existing Python environments?
Excellent video! Keep it up!
instant sub, great content thank you!
Welcome aboard!
When will there be a video on running an LLM on an iPhone or iPad, e.g. using LLMFarm?
As a game dev, this is so good to have. BTW, I'm gonna try this on Parallels with my M1 Pro.
You mean in Windows through Parallels? Why would that be useful?
Oh you got distracted! You're a true developer!
What about a video on Cheshire Cat AI open source "production ready AI agent framework"?
I use Ollama with the Continue plugin in VS Code, and the Chatbox GUI when it's not code related. Works well on both Mac and Linux with a Ryzen 7000 CPU. On Linux it runs in a Podman (Docker) container. But the best experience is with a MacBook Pro; Apple Silicon and unified memory make it speedy.
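For anyone wanting to replicate the Ollama + Continue setup, pointing Continue at a local model is a small config entry; a sketch of ~/.continue/config.json (field names per Continue's JSON config, verify against their current docs before overwriting yours):
cat > ~/.continue/config.json <<'EOF'
{
  "models": [
    { "title": "Llama 3 (local)", "provider": "ollama", "model": "llama3" }
  ]
}
EOF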
Alex, you are awesome!
Great video Alex. Is there any way to have an LLM execute local shell scripts to perform tasks?
Mr. Alex Ziskind
Could you clarify whether training deep learning models on the GPU of an Apple Silicon M3 Pro might reduce its lifespan?
Thank you.
What advantage does this have over using LM Studio that you can install directly as an app instead of using the Terminal? (Genuine question)
Thank you Alex, amazing video. I followed all the steps and enjoyed the process and the results on my M3 Max. I wonder if there is a GPT we can use from the laptop that can search online, since the knowledge cutoff of these models seems to be over a year ago or more. For example, when I ask what the Terraform provider version is for AWS or another platform, the answer is old and there's a potential for deprecated code in the responses. What do you recommend in this case? Not sure if you already have a video for that lol.
that's a great question. you'll need to use a framework like Flowise or LangChain to accomplish this, I believe, but I don't know much about them; it's on my list of things to learn
@@AZisk Makes sense. I'll do some research and see what I can find to test, but I'll look forward to when you share a video with this type of model orchestration; it will be fantastic.
Great video. Awesome 👏
Hey, can you please answer this: instead of conda, can I just use a virtual environment for this?
yes absolutely
Thanks @Alex. By the way, is there a reason it can only use the GPU? Any reason it's not taking advantage of the NPU?
Great video. But I think Jan AI is a lot easier to configure and set up for Mac users.
How do I do it with Docker? And can I host this live on Vercel or Netlify?
Thanks for sharing good stuff with us. Nice one!
What about a new M4 iPad Pro video?