Step-by-step guide on how to set up and run the Llama-2 model locally

  • Published: 8 Sep 2024
  • In this video we look at how to run the Llama-2-7b model through Hugging Face, and other nuances around it:
    1. Getting Access to Llama Model via Meta and Hugging Face:
    Learn how to obtain access to the Llama language model through Meta and Hugging Face platforms.
    2. Downloading and Running Llama-2-7b Locally:
    Follow step-by-step instructions on downloading the llama-2-7b model and running it on your local machine.
    3. Tokenizing and Inputting Sentences:
    Understand the process of tokenizing and inputting sentences for next-word prediction tasks using the Llama model.
    4. Controlling Temperature Parameter:
    Explore techniques for adjusting the temperature parameter to influence the creativity of Llama's output.
    5. Challenges in the Base LLM Model:
    Identify and address potential challenges and limitations of the base Llama language model, and why one would go for a fine-tuned model.
    6. Choosing the Best Performing LLM:
    Stay informed on how to check for the latest and best-performing Llama language models, ensuring optimal results for your tasks.
    References and Links:
    Previous Video on LLM concepts: • A basic introduction t...
    Code: github.com/opp...
    Llama 2 paper: arxiv.org/pdf/...
    Huggingface: huggingface.co
    Open LLM Leaderboard: huggingface.co...
    Linkedin: / yash-agrawal-a22597162
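The steps above can be sketched roughly as follows (a minimal sketch, assuming access to the gated "meta-llama/Llama-2-7b-hf" checkpoint has already been granted via Meta and Hugging Face, and that `transformers` and `torch` are installed; the prompt and default values are illustrative):

```python
MODEL_ID = "meta-llama/Llama-2-7b-hf"  # base model, not the -chat variant

def next_words(prompt: str, temperature: float = 0.7, max_new_tokens: int = 40) -> str:
    """Steps 2-4: download/load the model, tokenize a sentence, sample a continuation."""
    # Heavy imports kept inside the function so the sketch stays cheap to import.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.float16, device_map="auto"
    )
    # Step 3: tokenize the input sentence for next-word prediction.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # Step 4: higher temperature samples more freely ("creative");
    # values near 0 make the output nearly deterministic.
    output = model.generate(
        **inputs,
        do_sample=True,
        temperature=temperature,
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.decode(output[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(next_words("The capital of France is"))
```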

Comments • 38

  • @gaspardtissandier3204
    @gaspardtissandier3204 4 months ago +1

    Great video, and it is indeed the right translation to French :)

  • @abubakeribrahim6473
    @abubakeribrahim6473 7 months ago +1

    Thank you for the very nice presentation and explanation! I would like to see a video, with your wonderful explanation, on how we can fine-tune the base models to fit our specific tasks.

    • @ycopie1126
      @ycopie1126  7 months ago +1

      Noted, will do it. Thanks!

  • @vasanthnagkv5654
    @vasanthnagkv5654 2 months ago

    Thanks! This was my first AI development video.

  • @mj_cta
    @mj_cta 6 months ago

    24:39 - Fun part of the video, good luck Yash! Thanks for the video.

  • @khaitruong9831
    @khaitruong9831 3 months ago

    Great video. Thank you ycopie!

  • @bhaavamritdhaara
    @bhaavamritdhaara 5 months ago

    Very helpful. Thanks a lot for this.

  • @weelianglien687
    @weelianglien687 4 months ago +2

    Thank you for this hands-on! Initially I tried it on my laptop, which has an NVIDIA GeForce GTX, but it couldn't run it very well. Eventually I had to run it on Colab (T4 GPU), though not without adding the following lines to help with GPU usage (just sharing):
    !pip install accelerate
    from accelerate import Accelerator
    accelerator = Accelerator()
    device = accelerator.device

    • @22nd.of.may.
      @22nd.of.may. 2 months ago

      My model needs 16.2 GB of GPU memory, but Colab is limited to 15 GB. Do you have any way to fix that?
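A common workaround (not from the video; a sketch assuming the `bitsandbytes` integration in `transformers`, installed via `pip install accelerate bitsandbytes`) is to quantize the weights on load: 8-bit roughly halves the fp16 footprint and 4-bit quarters it, which brings a 7B model under Colab's ~15 GB.

```python
def approx_weight_gb(n_params_billion: float, bits_per_param: int) -> float:
    """Rough weight-memory estimate, ignoring activations and KV cache."""
    return n_params_billion * 1e9 * bits_per_param / 8 / 1024**3

# A 7B model: ~13 GB in fp16, ~6.5 GB in int8, ~3.3 GB in int4.

def load_8bit(model_id: str):
    """Load a causal LM with 8-bit quantized weights spread across devices."""
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig
    return AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=BitsAndBytesConfig(load_in_8bit=True),
        device_map="auto",
    )
```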

  • @litttlemooncream5049
    @litttlemooncream5049 4 months ago

    really subtle! subscribed

  • @harsh2014
    @harsh2014 5 months ago

    Thanks for this discussion!

  • @LeoSRajan
    @LeoSRajan 7 months ago

    Thank you so much for your time!!!

  • @thamilarasan4048
    @thamilarasan4048 6 months ago +2

    Please share your system specs, especially the GPU you are using.

  • @pradachan
    @pradachan 4 months ago

    I'm new to LLMs and I just wanted to know: you need all this access to use Llama, but with Ollama you just put "ollama run llama2" in the terminal. So what's the difference? Can they access it without any explicit access from Meta?

  • @jatindhiman448
    @jatindhiman448 1 month ago

    Really great explanation...
    But I am stuck on a problem of getting space on the GPU. If I try this on Google Colab, the free version collapses because all the memory is used. Please suggest a solution, or list small models under 12 GB that can be used for prompting.

  • @mayowaogundipe375
    @mayowaogundipe375 1 month ago

    Thanks for your time... May I ask how to download the CUDA Toolkit on my laptop to enable GPU support? The code for CUDA or CPU is not working on my laptop.

  • @CarolinaHernandez-zt6li
    @CarolinaHernandez-zt6li 5 months ago +1

    Do you offer any paid consulting? I’m stuck on an installation error.

  • @abhishekfnu7455
    @abhishekfnu7455 6 months ago

    Thank you so much for this video.
    Could you please let us know how to connect to a SQL database to fetch information and implement semantic analysis?

  • @niklasweiss2557
    @niklasweiss2557 2 months ago

    I currently have the problem that it only says "Loading widget..." when I try to run the code and doesn't display the progress bar. Do you possibly know how to fix this?

  • @samirait-abbou5954
    @samirait-abbou5954 6 months ago

    good job!

  • @rastapopolous8446
    @rastapopolous8446 6 months ago

    Nice tutorial, but how would you make it wait for a prompt, so we can type something like "What is the capital of India?" and press Enter, and then the model replies? How do you do that?

  • @jennilthiyam1261
    @jennilthiyam1261 3 months ago

    What do we do if we need an interactive mode, like having a conversation the way we do with ChatGPT?
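The base model has no built-in chat behavior, but a simple read-generate-print loop can be wrapped around any generation function. A sketch (the `generate_fn` callable is an assumption standing in for a model call, and the `[INST]` template below is a simplification: the real Llama-2-chat template also carries a `<<SYS>>` system block):

```python
def build_llama2_prompt(history):
    """Format (user, assistant) turns in a simplified Llama-2-chat [INST] template."""
    prompt = ""
    for user_msg, assistant_msg in history:
        prompt += f"<s>[INST] {user_msg} [/INST] "
        if assistant_msg is not None:
            prompt += f"{assistant_msg} </s>"
    return prompt

def chat(generate_fn):
    """Interactive loop: read a message, generate a reply, keep the history."""
    history = []
    while True:
        user_msg = input("You: ")
        if user_msg.strip().lower() in {"quit", "exit"}:
            break
        history.append((user_msg, None))
        reply = generate_fn(build_llama2_prompt(history))
        history[-1] = (user_msg, reply)
        print("Model:", reply)
```

Passing the growing history back into every call is what gives the ChatGPT-like multi-turn feel; without it the model sees each question in isolation.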

  • @lesstalkeatmore9441
    @lesstalkeatmore9441 4 months ago

    How do we fine-tune with our own datasets, e.g. to answer questions over a PDF of our own data?

  • @fabiotarocasalino257
    @fabiotarocasalino257 6 months ago

    good vid

  • @rakeshkumarrout2629
    @rakeshkumarrout2629 1 month ago

    Can I use this in VS Code?

  • @litttlemooncream5049
    @litttlemooncream5049 4 months ago

    love your username lol

  • @mohammedmujtabaahmed490
    @mohammedmujtabaahmed490 5 months ago

    ConnectionError: ('Connection aborted.', ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None))
    Bro, I am getting this error when running in a Jupyter notebook.
    Please help.

  • @sarahharte186
    @sarahharte186 6 months ago

    Great vid - thanks a mil! I am getting KeyError: 'llama' when running the script. I have copied the model name/path from Hugging Face directly, but it's still causing an issue - do you know what the problem could be?

    • @ycopie1126
      @ycopie1126  6 months ago

      I will need more details, like which line in your code is causing this.

    • @sarahharte186
      @sarahharte186 6 months ago +1

      @@ycopie1126 Sorry, I think it was actually an issue with the version of transformers I had installed - I reinstalled it and now the model seems to be downloading successfully, so all good! Appreciate your reply!

  • @sachinworld_
    @sachinworld_ 4 months ago

    ValueError: You are trying to offload the whole model to the disk. Please use the `disk_offload` function instead.
    I got this error.
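That error typically appears when `device_map="auto"` finds too little free GPU and CPU memory and tries to push all the weights to disk. A hedged sketch of the usual fix, passing an `offload_folder` so accelerate is allowed to spill layers to disk (the `plan_offload` helper is a toy illustration of the placement decision, not the library's actual planner):

```python
def plan_offload(gpu_free_gb: float, model_gb: float) -> str:
    """Toy decision helper mirroring the placement choice accelerate makes."""
    if gpu_free_gb >= model_gb:
        return "gpu"
    if gpu_free_gb > 0:
        return "gpu+cpu"   # partial offload to CPU RAM
    return "disk"          # needs an offload folder on disk

def load_with_offload(model_id: str, offload_dir: str = "offload"):
    """Load with automatic placement, allowing disk spill via offload_folder."""
    from transformers import AutoModelForCausalLM
    return AutoModelForCausalLM.from_pretrained(
        model_id,
        device_map="auto",
        offload_folder=offload_dir,
    )
```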

  • @sumandas829
    @sumandas829 6 months ago

    hf represents human feedback, not Hugging Face

    • @ycopie1126
      @ycopie1126  6 months ago

      You can follow this discussion: github.com/facebookresearch/llama/issues/612
      The model card has a small note which states that it's the Hugging Face format.

    • @sumandas829
      @sumandas829 6 months ago

      Extremely sorry for doubting; I just thought hf should mean human feedback. Again, I was wrong, sorry for that. Good job!

    • @ycopie1126
      @ycopie1126  6 months ago

      No worries at all. Happy you put it in comments so it would help other people as well 😄

  • @SpartanDemiGod
    @SpartanDemiGod 6 months ago +5

    Can you please tell me your PC specs ?

  • @MarkSikorski-xg7gh
    @MarkSikorski-xg7gh 7 months ago

    Hi, when running this code I am getting an error: File d:\Magister\llama_hugging_face\venv\lib\site-packages\huggingface_hub\utils\_validators.py:110, in validate_hf_hub_args.._inner_fn(*args, **kwargs)
    109 if arg_name in ["repo_id", "from_id", "to_id"]:
    --> 110 validate_repo_id(arg_value)
    112 elif arg_name == "token" and arg_value is not None:
    ...
    )
    (norm): LlamaRMSNorm()
    )
    (lm_head): Linear(in_features=4096, out_features=32000, bias=False)
    )'. Please provide either the path to a local folder or the repo_id of a model on the Hub.
    Did you encounter this problem?
