Very helpful! Thanks so much. Perfect pacing
Thanks so much
Hey there! I actually want to make a chatbot for my college project. I want to fine-tune the LLM with my own dataset and then deploy the project with a UI, but due to low computational power I am facing a lot of problems in the deployment stage. Could you possibly help me here? Big thanks!
Reduce to 4-bit; no other option. You can quantize to 4-bit, make a GGUF, and run it with Ollama. Fastest and best option.
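If it helps, here's a minimal sketch of that export path, assuming the fine-tune was done with Unsloth (the name "my-finetune" and the output paths are placeholders, and the exact GGUF filename Unsloth writes may differ):

```python
# Sketch only: assumes an Unsloth fine-tune; "my-finetune" and paths are placeholders.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="my-finetune",  # path to your fine-tuned checkpoint (placeholder)
    max_seq_length=2048,
    load_in_4bit=True,         # load in 4-bit to fit low-VRAM cards
)

# Export a 4-bit quantized GGUF that Ollama can run directly.
model.save_pretrained_gguf("gguf_out", tokenizer, quantization_method="q4_k_m")

# Then point an Ollama Modelfile at the exported file:
#   FROM ./gguf_out/unsloth.Q4_K_M.gguf   (actual filename may differ)
# and create/run the model:
#   ollama create my-chatbot -f Modelfile
#   ollama run my-chatbot
```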
The answers I get from Ollama running in PrivateGPT are good and serve my purpose. I'm not sure that I understand what more could be achieved from fine-tuning.
Fine-tuning is done to make the model better for specific use cases. The more you fine-tune a model, the more its behaviour becomes ingrained in and biased towards the data you trained it with. That is very helpful for specific company use cases, for example if you want answers grounded in your company's private 1000-table SQL schemas.
E.g., if you want an LLM to use tools, fine-tuning can ensure better handling of the formats, languages, and such.
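To make that concrete: the training data for this kind of fine-tune is usually just instruction/response pairs in a JSONL file. A tiny illustrative sketch (the field names follow the common Alpaca-style format, which is an assumption; match whatever template your training script expects):

```python
# Tiny illustration; field names are the common Alpaca-style convention,
# but match whatever template your trainer expects.
import json

examples = [
    {
        "instruction": "List all customers who placed an order in 2023.",
        "input": "",
        "output": "SELECT DISTINCT c.name FROM customers c "
                  "JOIN orders o ON o.customer_id = c.id "
                  "WHERE o.order_date BETWEEN '2023-01-01' AND '2023-12-31';",
    },
    # ... hundreds more pairs drawn from your own schema/domain
]

with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```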
How can you train vision models?
Sorry I had to keep you waiting. This is the notebook you can follow. I will come up with a video as well.
github.com/meta-llama/llama-recipes/tree/main/recipes/quickstart/finetuning
What would be the command for creating the environment if I have a Radeon GPU?
The process of creating a Conda environment is the same regardless of the GPU manufacturer (NVIDIA or AMD/Radeon). Conda is primarily focused on managing Python environments and packages, not directly tied to specific GPUs.
However, the difference lies in how you configure the environment to take advantage of the GPU hardware. For NVIDIA GPUs, you typically use packages like cudatoolkit and libraries like tensorflow-gpu or torch that are built for CUDA (NVIDIA's GPU computing toolkit). For AMD Radeon GPUs, you would use different tools and libraries, such as ROCm (Radeon Open Compute) and ROCm builds of PyTorch.
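A quick way to verify the environment once it's set up (the pip index URLs below are the usual official PyTorch ones, but exact versions may differ):

```python
# Quick sanity check after installing the right PyTorch build. Typical official
# index URLs (exact versions may differ):
#   NVIDIA: pip install torch --index-url https://download.pytorch.org/whl/cu121
#   AMD:    pip install torch --index-url https://download.pytorch.org/whl/rocm6.0
import torch

# ROCm builds of PyTorch reuse the torch.cuda namespace, so this works on both vendors.
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
else:
    print("No supported GPU found")
```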
A quick check in case this video isn't there already. It's more about creating your own dataset and then training a model. Most of the confusion is around RAG vs. fine-tuning a model for a PDF dataset. So could you make a video about creating your own dataset from about 500 or 1000 PDFs (just an example) and using that data, with the fine-tuning attributes relevant to this dataset, to train a model?
Noted. Cool
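In the meantime, a rough sketch of the first step (pulling the raw text out of the PDFs), assuming the pypdf library and a placeholder "pdfs/" folder; turning that text into instruction/response pairs still needs chunking plus manual or LLM-assisted labeling:

```python
# Rough sketch of step one (text extraction) with pypdf; "pdfs/" is a placeholder.
# Turning raw text into instruction/response pairs still needs chunking plus
# manual or LLM-assisted labeling.
from pathlib import Path
from pypdf import PdfReader

records = []
for pdf_path in sorted(Path("pdfs").glob("*.pdf")):  # e.g. your 500-1000 PDFs
    reader = PdfReader(str(pdf_path))
    text = "\n".join(page.extract_text() or "" for page in reader.pages)
    records.append({"source": pdf_path.name, "text": text})

print(f"extracted text from {len(records)} documents")
```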
Awesome video, great help, everything nice and clear. Next time PLEASE put the commands you use in the description or in a pinned comment, because extracting these commands from images is a nuisance. Otherwise, as I said, great vid.
Cool.
How much does it cost, at the cheapest, to retrain an LLM?
It's free if you decide to do it locally.
@@PromptEngineer48 Not free, it costs power. Plus, Unsloth provides Colab notebooks, so you can run the tuning job on the T4s you get for free, as long as it fits in the 16 GB of VRAM.
I ran it on my local 4060, so that's a relief.
Typo in the title -> "Unsloft" should be "Unsloth"!
You've got sharp eyes 👀. Rectified the same.
I actually want more detail about these steps. I only understand a bit of it, not much.
Okay. I will definitely create a detailed video since you asked.
@@PromptEngineer48 You are cool, man! Good luck!
Thanks
@@opipico9144 Additionally, Unsloth has a rather nice Discord channel :) you'll find help there.