How To Install CODE LLaMA LOCALLY (TextGen WebUI)

Matthew Berman

82K views

Published: Nov 15, 2024

Comments • 255

@jackflash6377 1 year ago ⁺⁴⁴
Hands down the best AI-focused YouTube channel.
Like your style, all stuff, no fluff. Easy-to-follow tutorials and always cutting-edge content.
Subbed and joined!
Now, let's push for something Aider-like for local models.
@matthew_berman 1 year ago ⁺²
Thank you! Doesn’t Aider support local models now?
@emil5684 1 year ago ⁺¹
@@matthew_berman Aider? I don't know what that is. I've only just started studying AI.
@michaelmalzacher6018 1 year ago
@@emil5684 respect
@MrLargonaut 1 year ago ⁺⁸
You have been my go-to source for how-to's since GPT4 launched, which is when I joined the game. I started from literal scratch, knowing nothing about coding at all, instead letting a lifetime dream propel me. Thank you especially for your 'from scratch' videos, because there are many things that I don't know the jargon or phrasing for. For my amateur position, these types of vids do me the most good. *edited for them lovely mispellin's*
@matthew_berman 1 year ago
Love to hear it!
@harisjaved1379 1 year ago ⁺³
MAN YOU DELIVERED! We asked you, and you delivered!
Thanks Matt!
@Shinkaze33 1 year ago ⁺¹⁰
Holy crap, this is amazing. I just ran some programming tests and it's REALLY good. Can't believe this is running on my local machine when 8 months ago this sort of tech required a datacenter... just WOW... all the WOW
@matthew_berman 1 year ago
Yeah, it’s super impressive. Especially because you can run this on pretty much any computer with the 1B versions
@Shinkaze33 1 year ago ⁺³
On-screen typo @2:30 in the package name. You typed tortchvision instead of torchvision. The correct command should be:
conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
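After running an install command like the one above, a quick check from inside the activated environment can confirm whether PyTorch imports and sees the GPU. This is a minimal sketch, not part of the video; `cuda_status` is a hypothetical helper name, and it only uses `torch.cuda.is_available()` and `torch.cuda.get_device_name()`.

```python
import importlib.util


def cuda_status() -> str:
    """Summarize whether PyTorch is installed and can see a CUDA GPU."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed in this environment"
    import torch  # imported lazily so the check also runs without torch

    if torch.cuda.is_available():
        name = torch.cuda.get_device_name(0)
        return f"torch {torch.__version__} sees GPU: {name}"
    return f"torch {torch.__version__} is installed, but no CUDA GPU is visible"


if __name__ == "__main__":
    print(cuda_status())
```

If the last message appears on a machine with an NVIDIA card, the CPU-only build of PyTorch was probably installed, which is the situation the corrected command is meant to avoid.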
@jeffersonvega622 1 year ago ⁺³
00:00 📋 Introduction to installing Code LLaMA locally
01:01 🧪 Setting up the Conda environment and cloning the code
02:34 🛠 Troubleshooting installation issues
03:34 📦 Downloading and configuring the Code LLaMA model
05:05 💻 Setting model parameters and using prompt templates
06:07 📚 Conclusion and call to action
@mercadolibreventas 1 year ago ⁺⁸
You are awesome! I am now a Patreon supporter! As you continue to build on that methodology, I'll continue to add more recurring support. Thanks! Guys like you are gold in creation; giving without expecting is the key to abundance!
@matthew_berman 1 year ago
Much appreciated!! Keep the feedback and requests coming :)
@seanbergman8927 1 year ago ⁺²
Great video. Looking forward to trying this soon. Thanks for walking through the entire process, including resolving the errors you encountered.
@matthew_berman 1 year ago
You got it!
@russellmm 1 year ago ⁺⁵
"From scratch" typically means from scratch, not from an existing Anaconda environment. Or "from scratch" could mean starting from their installer... Other than that (which I see happen a lot) I enjoy your channel.
@xmysty 1 year ago ⁺⁸
Have learned so much from you. Local results beyond expectations.
Can't thank you enough Matthew, you've condensed so much into clearly explained new information 👏 oh and thanks to TheBloke
▶ all your vids
@matthew_berman 1 year ago ⁺²
Thanks so much, I love hearing this!
@ZainNaboulsi 1 year ago ⁺²
PSA for those going through the demo: the template needs to be changed at the bottom of the screen on the Default tab from QA to Alpaca-with-input.
@Proprogrammer001 1 year ago
The "QA" one worked fine for me
@IvanRosaT 11 months ago ⁺¹
This is so cool! But in practice, if you follow along, either the repo hasn't been maintained or it has been maintained too much lol. Most of us have issues with a missing exllama, which breaks the chain in the code; there's more info in the GitHub instructions. But as a concept it looks great.
@contractorwolf 1 year ago ⁺¹
Love your teaching style, Matthew!
@johnnybueti 1 year ago ⁺⁵
How long did that task take to generate with your RTX 4090? Great video. :)
@jayprice8246 1 year ago ⁺³
Bro, you are killing the AI tutorial game right now. Thanks for all your awesome content!!!!
@matthew_berman 1 year ago
Haha thank you!
@normanlove222 1 year ago ⁺⁴
Maybe I missed it, but what programming languages can we ask about? Just Python, or other languages as well?
@astroportterraformationfor2776 1 year ago ⁺²
Great how-to. What hardware characteristics are minimum and recommended? System CPU and RAM? Minimum GPU and RAM?
@RamiK-r9y 1 year ago ⁺¹
Great job, clear and straight to the point. Many thanks!!!!
Not sure if you have plans to explain (in your lovely way) how we can build our own purpose-built model and train and optimise it on private data.
@Tyronne_ 1 year ago
Needed this! Thanks bro, much appreciated.
@bubbajones5873 1 year ago
Wow! Talk about timing. I just sat down to do this and needed this video 🎉
@AMindInOverdrive 11 months ago
Any time I'm following these tutorials I always get an error that nobody else in the universe gets LOL
I was missing git - after a quick Google search I found that I needed to download it, and install it.
Thanks for your hard work making these videos for noobs like me ;-) Appreciate you man
Edit: Finally got to the end but mine displays no code in response...not sure why but will try figure it out LOL
@KevinTheCardigan 11 months ago
The video is terrible because the packages and lines are outdated. I follow the instructions step by step, and get an error regarding typing_extensions being the wrong version. I downgrade to an appropriate version, and it only causes another error, which requires an upgrade on my pytorch. Upgrading pytorch then makes my typing_extensions obsolete. I'm going in circles and I hate the uploader because I spent my entire day on their terrible instructions.
@nicosilva4750 1 year ago ⁺⁵
This is great. What are the recommended requirements (GPU cores, and RAM size and type) for the two largest models?
@ryanwebster3267 1 year ago ⁺¹
Yes, please! I agree that this would be nice to know.
@matthew_berman 1 year ago ⁺⁵
For 70B you will need 48 GB of VRAM. 34B you can fit on 24 GB, and possibly less if you use a quantized version and offload some to the CPU. It’s never a simple mapping of model to GPU.
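A back-of-the-envelope way to see where numbers like these come from is to multiply the parameter count by the bytes stored per weight. The sketch below is only that rough estimate: it counts weights alone and ignores the context cache, activations, and loader overhead, so real usage is higher. `weight_memory_gb` is a hypothetical helper, not something from the video.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9


for size in (13, 34, 70):
    fp16 = weight_memory_gb(size, 16)  # unquantized half precision
    int4 = weight_memory_gb(size, 4)   # 4-bit quantization, e.g. GPTQ
    print(f"{size}B: ~{fp16:.0f} GB at 16-bit, ~{int4:.1f} GB at 4-bit")
```

Under these assumptions a 4-bit 34B model needs about 17 GB for weights and a 4-bit 70B model about 35 GB, which is in line with the 24 GB and 48 GB cards mentioned in the reply once overhead is added.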
@ryanwebster3267 1 year ago
@@matthew_berman Thank you!
@vargonian 1 year ago ⁺⁵
One question I have before I try it out: One of the biggest limitations of ChatGPT is that its knowledge only extends to 2021, so there are lots of libraries / updates it's unfamiliar with. Are the models for Code LLaMA more up to date?
@imadreamerboy 1 year ago ⁺³
According to Meta, the cutoff of Llama 2 is somewhere in July 2022, but some info is up to date until early 2023.
@CoffeeblackUk 1 year ago
I have a pretty awesome bug. If I ask it a question, the first answer is spot on. Then if I ask another and click Generate, it forgets what we were talking about, and if I click Continue it gives me a sales speech about using Google Assistant. :-) Good stuff though.
@philcox2355 5 months ago
Thanks Matthew. So far so good.
@SanctuaryLife 1 year ago
Great job, Matt!
@alx8439 1 year ago
Each time I hear that a quantized version doesn't lose a lot of quality, I ask how people back this statement :) In my experience quantisation is like a soft form of lobotomy: all the good stuff you see on leaderboards just fades away when you take a closer look at the quantized version of the same size model.
@shootdaj 1 year ago
Thank you so much Matthew! You're a godsend 🙏
@enthrax1639 1 year ago
Hi Matthew... Appreciate your work... Thanks... One question... How do I choose which version would be best for my PC or laptop?
@coolmn786 1 year ago
I wish I could like this video x10
Brilliant video, thanks man!
@roymikael4888 1 year ago ⁺¹
Great info. Keep up the good work.
@matthew_berman 1 year ago
Thanks!
@yogenghodke 9 months ago ⁺¹
White Bobby Deol
@ZainNaboulsi 1 year ago
Awesome video! Keep up the great work!
@grizzlybeer6356 1 year ago
You are a bad influence on me, Matt. Because of you I am literally obsessed with anything AI. In fact you were such a bad influence on me that I took all the machine-learning-related certificate courses on Coursera and spent many hours listening to Andrew Ng... lol. Love you bro! Thanks for being such a bad influence!
@marcfruchtman9473 1 year ago ⁺¹
That is great. Thanks for the video.
Out of curiosity, what GPU are you running?
@matthew_berman 1 year ago
RTX 4090
@muneerraza8521 1 year ago
Subscribed and enabled notifications for all videos. Thanks from Pakistan!
@personone6881 1 year ago
stupid question: when you first bring up Anaconda Prompt and you change the directory from your user folder on C: to a top level path on D: - C:\Users\mberm>d: - IS that an external drive? If so more importantly for me to understand - Is that the directory in where you installed your initial anaconda3 installation? - Just to be clear, what path did you install your initial installation on? I've just done mine on C:\anaconda3 ...have I fekked up already?
@dirtyPeter2 1 year ago
Love it. Thanks!
@ricardo_cravo 1 year ago
Thank you! I love you! Great video! It worked!
@tomski2671 1 year ago ⁺³
Does anybody know if this model can be further trained on user data? If so, how difficult would that process be?
@stephenthumb2912 1 year ago
Very easy to miss the torch install, thanks for pointing that out.
@Bundit_Buddhahai 8 months ago
Thank you for your great video. Just to ask: where is the downloaded WizardCoder-Python-13B-V1.0-GPTQ model located in the Windows directory?
@SasaBocki 1 year ago ⁺¹
Is it possible to run any of the WizardCoder Python models without an NVIDIA GPU?
@saravanajogan1221 1 year ago ⁺¹
Thank you, keep up the good work 👏
@matthew_berman 1 year ago
Thanks!
@itlackey1920 1 year ago
Thank you, this is very straightforward and helpful as always! Now to figure out how to get it to run on my Arc770 🤔
@mickelodiansurname9578 1 year ago ⁺²
It'd be good if you could plug it into VS Code as a coding assistant extension.
@almahmeed 1 year ago ⁺¹
Hi .. This is really helpful .. Just a question: how can I remove the models that did not work for me, as I was testing with many options?
@matthew_berman 1 year ago ⁺²
Go into the install folder and look for the "model" folder.
@almahmeed 1 year ago
@@matthew_berman Thank you so much, Matthew .. I hope I can share some results soon :)
@UserB_tm 1 year ago ⁺¹
I'm still testing out this model, but so far I'm not impressed. I asked it to write a basic Python module that creates a text document named hello world, writes hello world inside of it, and saves it to the home directory. On the first attempt it opened up a screenshot app and saved the screenshot to the desktop; on the second attempt it did the same thing. And then when I asked it to correct the code, it said it was good. Finally I had ChatGPT write it in like 5 seconds and it worked perfectly. I'm gonna do a few more tests, but I'm not sure at this point.
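For reference, the task described above fits in a few lines of standard-library Python. This is one possible sketch, not the output of either model; the file name `hello world.txt` and the helper name `write_hello_world` are assumptions, since the comment does not give them.

```python
from pathlib import Path
from typing import Optional


def write_hello_world(directory: Optional[Path] = None) -> Path:
    """Create 'hello world.txt' containing 'hello world' and return its path.

    Defaults to the current user's home directory when no directory is given.
    """
    target_dir = directory if directory is not None else Path.home()
    target = target_dir / "hello world.txt"
    target.write_text("hello world\n", encoding="utf-8")
    return target
```

Calling `write_hello_world()` with no arguments writes the file to the home directory, which makes it a handy yardstick when comparing what different models produce for the same prompt.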
@Lorant1984 1 year ago
Have you run further tests? If so, how did Llama fare against ChatGPT, please?
@JuanHernandez-z4u 1 year ago
Great video! Question: what is the difference between GPTQ and AWQ models?
@ML-ud5pf 1 year ago ⁺¹
Very nice instructional video! Which AI tool would be best suited for non-programmers to create their own Python code (e.g. for developing an API client)? I find that GPT-4 still requires me to know coding and that I cannot efficiently write code by prompting AI. Any advice on which tool to use for that case would be much appreciated!
@nicolasottavi9158 1 year ago
Great! If I understood correctly, it only works for Python code generation? No PHP or JS?
@hiroroong693 1 year ago
Great tutorial! I love the clear steps.
The current prompt is for one answer. How do I change the prompt to a conversation style that remembers past content?
@OriginalRaveParty 1 year ago ⁺¹
You're a boss. Thank you very much 👍
@mikegodfrey4482 1 year ago ⁺¹
The error message I am getting is this: "ERROR:Could not find repositories/exllama/. Make sure that exllama is cloned inside repositories/ and is up to date." It's up to date and it's in my Anaconda folder. Any ideas?
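The error text says the loader looks for exllama inside a `repositories/` folder, so a copy sitting elsewhere (such as the Anaconda folder) would not be found. A minimal sketch for checking the expected location, assuming the folder is resolved relative to the web UI's install directory; `exllama_is_cloned` is a hypothetical helper and the path you pass in is your own install path.

```python
from pathlib import Path


def exllama_is_cloned(webui_root: Path) -> bool:
    """Return True if <webui_root>/repositories/exllama exists and is non-empty."""
    repo = webui_root / "repositories" / "exllama"
    return repo.is_dir() and any(repo.iterdir())
```

For example, `exllama_is_cloned(Path("text-generation-webui"))` run from the parent folder should return `True` once the repository has been cloned into the place the error message names.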
@csabakallaicranq7055 1 year ago
Great video!
@DariuszMakowski 1 year ago
Did you do any vids about how to train on top of that model using PDFs or C++/Python source files, so we can do our own fine-tuning?
@theresalwaysanotherway3996 1 year ago ⁺¹
You might want to mention that the max new tokens/context length can both be at 4096 because it's Llama 2, and also that the 34B is the one that competes with GPT-4; the 13B is not as good.
@keylanoslokj1806 1 year ago
What PC can run it, though?
@wrOngplan3t 1 year ago
Linux Mint 21.1. This is the first step-by-step that went without a hitch!
The only thing was that the very last input field was different. From the "Prompt" drop-down list at the bottom I used "Alpaca-with-input" as a template and changed it to yours (I'm lazy, easier editing lol). I'm not sure if I have to do this every time though. Also, I'm not familiar with the somewhat strange-looking format ("Below is an instruction that describes a task", etc.). Any more info on that btw?
I'm not familiar with Python either; my amateur coding language of choice is Processing Java (Java is close enough), and some C++ for Arduino. My simple first coding test was like yours: output integers 1-100 inclusive, with the added sum of them all. Worked great!
Great tutorial, awesome stuff! Thanks!
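On the "strange-looking format" asked about above: instruction-tuned models are usually prompted with the same fixed wrapper text they saw during fine-tuning. The sketch below assumes the wording of the original Stanford Alpaca template; the template shipped with the web UI may differ slightly, and `build_prompt` is a hypothetical helper name.

```python
ALPACA_NO_INPUT = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

ALPACA_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{context}\n\n"
    "### Response:\n"
)


def build_prompt(instruction: str, context: str = "") -> str:
    """Wrap an instruction (and optional extra input) in Alpaca-style markers."""
    if context:
        return ALPACA_WITH_INPUT.format(instruction=instruction, context=context)
    return ALPACA_NO_INPUT.format(instruction=instruction)


print(build_prompt("Write Python code that prints the integers 1 to 100 and their sum."))
```

The model continues the text after `### Response:`, which is why the template ends there and why picking a mismatched template can noticeably change the output.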
@Spacewarpstudio 1 year ago ⁺¹
Unfortunately when using this to write c# code for Unity, it doesn't even seem to come close to what I can do with GPT-4, especially when it comes to developing code bit by bit. I've had amazing results with GPT-4 getting certain things working then adding more functionality as I go, copy pasting errors and asking for them to be corrected etc. I've tried doing similar with this in chat mode, or chat-instruct mode, and it doesn't seem to have a clue what I'm talking about. I ask it to correct its mistake and it just spits out totally random unrelated code. Until I can have a conversation about the code we're working on together like I can with chat GPT, this has extremely limited use.
@jeremywatson 1 year ago
Awesome work. Is it possible to do the same setup on a Mac? I saw your video on how good M1/M2 is. Look forward to your next video nevertheless! I love the no-nonsense, this-is-how-it-is style, i.e. no drivel!
@RichardGetzPhotography 1 year ago
Any support for Mx Macs? Or am I buying a PC for the first time in 30 years?
@gazzalifahim 1 year ago
Just A-W-E-S-O-M-E!!
I have 2 questions.
First, my PC currently has Python 3.11. If I install conda with Python 3.10, will there be any conflict? (I am a noob at this Python stuff)😅
Second, I have an Nvidia 940MX in my laptop. Can I run anything greater than 13B parameters?
@TheSolsboer 1 year ago
Great, thanks. Does the "cuda fix" work only on NVIDIA GPUs?
@REDULE26 1 year ago ⁺¹
Nice tutorial ^^
@matthew_berman 1 year ago
Thank you!
@rforestier 1 year ago ⁺¹
Superfan of the channel. I would like to see a video on LLaVA: Large Language and Vision Assistant.
@matthew_berman 1 year ago ⁺¹
Second time I’m hearing about it, I’ll have to check it out!
@Tr3kkR 11 months ago
I'm getting an error for the install of cchardet (as you did, noted in the build instructions on your Git repo) and am installing the Visual Studio Build Tools 2022. cchardet requires Microsoft Visual C++ 14.0 or greater. Since you probably already had this installed, you may have had a different error.
@diegoigr7 10 months ago
If you are getting the error "ImportError: cannot import name 'Doc' from 'typing_extensions'", simply upgrade this library: "pip install typing_extensions==4.8.0 --upgrade" =)
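Before upgrading or downgrading, it can help to see which version is actually installed in the active environment. A minimal sketch using only the standard library's `importlib.metadata`; the 4.8.0 threshold is simply the version reported in the comment above, and both helper names are hypothetical.

```python
from importlib import metadata
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '4.8.0' into (4, 8, 0); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def installed_at_least(package: str, minimum: str) -> Optional[bool]:
    """True/False if the package is installed, None if it is missing."""
    try:
        current = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    return parse_version(current) >= parse_version(minimum)


print("typing_extensions >= 4.8.0:", installed_at_least("typing_extensions", "4.8.0"))
```

Running the same check for other packages after each install step makes it easier to spot when one install has silently changed another package's version.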
@RedCloudServices 1 year ago
Matthew, I have an old git repo that I would like to revive with a different backend. Would Code Llama allow me to upsert a repo and improve the code based on instructions?
@vasile2321 9 months ago
I've got an i7-6700 + 16GB RAM + GTX1650 and it is running very slowly... What configuration do you run it on?
@tomski2671 1 year ago
Thank you.
Is there a good place where we can learn about this kind of thing in general?
@DavidFlenaugh 1 year ago
Figured I would just start from scratch again and see what happens. Do we have to do this all over again every time?
@echofloripa 1 year ago
Great video, thanks for the work. Do you think it will work on a 6 GB GPU? I tried Llama 7B and it worked, super slow but it worked 😅
@MrDataStorm007 1 year ago
Thank you so much!
@todorp4056 1 year ago
Great content. Is there a plugin for VS Code?
@RichardGetzPhotography 1 year ago
LMAO!! 5:28 are you looking at someone who might be a bit too creative with their code :)
@modolief 1 year ago
Matthew, sorry to bother you, but can you talk about, or give me some feedback about, *conda*? I used *pyenv* on my Mac about 2 years ago to install Python. I don't know how to square that with your comments about conda.
@genebeidl4011 1 year ago
The WizardLM models don't load for me as it says the header is too large for both the 34B and 13B models. I have a 4090 so at a minimum the 13B model should load. TheBloke doesn't seem to have a 34B quantized model.
@scuzzynate11 11 months ago
Hey Matthew - any chance you can drop the instructions you mentioned at 2:19? Not seeing the comment on my end here.
@creatiiveart341 1 year ago ⁺⁶
I am on an M1 Max Mac with 64 GB and I am getting the following error when I try to load the model with ExLlama_HF :(
Traceback (most recent call last):
  File "/Users/kc/text-generation-webui/modules/exllama_hf.py", line 14, in <module>
    from exllama.model import ExLlama, ExLlamaCache, ExLlamaConfig
ModuleNotFoundError: No module named 'exllama'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/Users/kc/text-generation-webui/modules/ui_model_menu.py", line 182, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(shared.model_name, loader)
  File "/Users/kc/text-generation-webui/modules/models.py", line 79, in load_model
    output = load_func_map[loader](model_name)
  File "/Users/kc/text-generation-webui/modules/models.py", line 322, in ExLlama_HF_loader
    from modules.exllama_hf import ExllamaHF
  File "/Users/kc/text-generation-webui/modules/exllama_hf.py", line 21, in <module>
    from model import ExLlama, ExLlamaCache, ExLlamaConfig
ModuleNotFoundError: No module named 'model'
@interesting_vdos 1 year ago ⁺¹
I'm getting the same error. Can anyone help with this, please?
@interesting_vdos 1 year ago
@creatiiveart341 were you able to fix this error? Please let me know if you were.
@Joeespo2009 1 year ago
Same error here too
@creatiiveart341 1 year ago
@@interesting_vdos Nope! I tried a few things but no luck so far :(
@owenrichards661 1 year ago
Same with me.
@pathead 1 year ago
A+ Video!!
@PomStas 1 year ago
I don’t know why, but I’m getting an error, "module can’t be found", when I’m trying to load a model.
@andy12829 2 months ago
What hardware configuration is required for running this?
@caseyclayton01 1 year ago ⁺¹
On my 4090 I can get the 34B "working", but it seems to run out of memory pretty quickly. With only around 52 tokens of input it was running out for me, and with less input, at 12, it was super slow and again errored out before completing the response. I was using the Phind v2 CodeLlama 34B.
@perc-ai 1 year ago
Your RAM is the bottleneck.
@ulfschack 1 year ago
So what if you don’t have an NVIDIA GPU?
(I use Stable Diffusion on my M2 MacBook, and it’s fast)
@mikegodfrey4482 1 year ago
I’m at the last step, loading the model with a model loader. I’m using the llama2 3B to run on a laptop to test. I’ve tried every model loader and can’t seem to get it to work.
@sclim4142 1 year ago
Hi Matthew,
I plan to purchase a new PC to run this locally. Is an NVIDIA T1000 with 8GB of VRAM sufficient to support the program? Thanks
@readmarketings9061 1 year ago
no
@mohamed_salah3165 9 months ago ⁺¹
Where are the links you said you were going to put in the comments/description?
@Person-hb3dv 1 year ago
Is there a way to integrate the model or the web UI into VS Code using some extension or something? It would be really nice if the model could have access to my code so it can use it as context.
@duanelawrence4311 1 year ago
I never got Code Llama working based on this video. This never appeared to install Llama, just the prerequisites.
@MC-or2tb 1 year ago
tyvm, have my upvote and my subscription
@goonie79 1 year ago
Thank you for such a great video. Is there a way to use Colab for this project?
@mercadolibreventas 1 year ago
Hi Matthew, can you do a video about the Docker desktop, that seems to be now integrated into VScode It seems so much cleaner, it loads the container directly into it. I have been using a NAS docker and it seems to always have issues, especially with so many projects and dependencies, The computer begins to lag and eventually needs to format and reinstall everything. Can you help with a video on how to do all these projects you keep putting out, the proper way, so we spend more time understanding, utilizing, and building learning patterns? Organization of all these tests, utilizing the nas only to store the containers/images, Thanks!
@AntmanClashBro 1 year ago
Hi Matt,
I am struggling a lot to follow along with the text-generation-webui install. My error is "Failed building wheel for llama-cpp-python" when following the steps after pip install -r requirements_nocuda.txt. Any ideas? I am on a Mac with a 2.9 GHz dual-core Intel Core i5.
@kawalier1 1 year ago ⁺¹
I'll try it on GCS's VM.
@matthew_berman 1 year ago
Is it easy to install there?
@emil5684 1 year ago ⁺¹
No link to install PyTorch. This worked for me: conda install pytorch torchvision torchaudio pytorch-cuda -c pytorch -c nvidia
@shukanimator 1 year ago
I've been using ChatGPT to write Python scripts for a few months and it's almost always necessary to tell GPT what errors happened or copy in some API or library documentation so it can fix the code. Is there a way to use this WizardCoder model and do a back and forth with the errors so that the AI can fix stuff? I have it running on the WebUI, and it's been able to generate a few working Python scripts using the 'default' method you demoed, but when the code doesn't work, the 'chat' mode isn't able to write code as well. It's almost as if it's not running the same model when in 'chat' mode.
@christopherbrown1187 6 months ago
Does not work for me. EnvironmentNameNotFound: Could not find conda environment: tg
You can list all discoverable environments with `conda info --envs`.
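This error usually means the environment was never created, or was created under a different name, before the activate step. A minimal sketch that checks for it programmatically, assuming `conda` is on the PATH as an executable and that `conda env list --json` returns a JSON object with an `envs` list of paths; `env_names` and `conda_has_env` are hypothetical helpers.

```python
import json
import subprocess
from pathlib import Path
from typing import List


def env_names(conda_json: str) -> List[str]:
    """Extract environment folder names from `conda env list --json` output.

    The base environment shows up under its install folder name, not 'base'.
    """
    data = json.loads(conda_json)
    return [Path(p).name for p in data.get("envs", [])]


def conda_has_env(name: str) -> bool:
    """Ask conda for its environments and check whether `name` is among them."""
    result = subprocess.run(
        ["conda", "env", "list", "--json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return name in env_names(result.stdout)
```

If `conda_has_env("tg")` returns `False`, the environment has to be created first; activating it only works once it appears in that list.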
@antonkozyk 1 year ago
How can I use this LLM not only in text-generation-webui but in my own apps? Is there an API, or how does it work?
@Mst.EshitaKhatun-y7u 9 months ago
I got this error: "ImportError: DLL load failed while importing exllamav2_ext: The specified module could not be found." Does anyone know how to solve it?
@korchi 1 year ago
Hey Matt, is there a way to use the localGPT project with Code LLaMA and ingest your own code? The use case would be to ask Code LLaMA to add a function or change a function in your code. It would be great if you could do a video using CPU only (preferably on a Mac).
