Fast Fine Tuning with Unsloth
- Published: Jan 25, 2025
- 🚀 Discover how to fine-tune LLMs at blazing speeds on Windows and Linux! If you've been jealous of MLX's performance on Mac, Unsloth is the game-changing solution you've been waiting for.
🎯 In this video, you'll learn:
• How to set up Unsloth for lightning-fast model fine-tuning
• Step-by-step tutorial from Colab notebook to production script
• Tips for efficient fine-tuning on NVIDIA GPUs
• How to export your models directly to Ollama
• Common pitfalls and how to avoid them
🔧 Requirements:
• NVIDIA GPU (CUDA Compute Capability 7.0+)
• Python 3.10-3.12
• 8GB+ VRAM
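For anyone who wants to see the shape of the workflow before watching, here is a minimal sketch of an Unsloth fine-tune, following the pattern the Unsloth Colab notebooks use. The model name, dataset, and hyperparameters are illustrative placeholders, not the exact ones from the video.
```python
# A minimal Unsloth fine-tuning sketch. Assumes unsloth, trl, transformers,
# and datasets are installed and a CUDA GPU is available. Names and
# hyperparameters below are illustrative, not the video's exact values.
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

# Load a 4-bit quantized base model; this keeps an 8B model inside ~8GB VRAM.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",  # example model choice
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters so only a small fraction of the weights is trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Example dataset with instruction/input/output columns.
dataset = load_dataset("yahma/alpaca-cleaned", split="train")

def to_text(batch):
    # Flatten each record into a single training string, ending with EOS.
    template = "### Instruction:\n{}\n\n### Input:\n{}\n\n### Response:\n{}"
    return {"text": [template.format(ins, inp, out) + tokenizer.eos_token
                     for ins, inp, out in zip(batch["instruction"],
                                              batch["input"],
                                              batch["output"])]}

dataset = dataset.map(to_text, batched=True)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        max_steps=60,            # a short demo run; raise for real training
        learning_rate=2e-4,
        fp16=True,               # use bf16=True instead on Ampere or newer
        output_dir="outputs",
    ),
)
trainer.train()
```
After training, the LoRA adapters can be merged and exported to GGUF for Ollama; see the export sketch further down the page.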
Links Mentioned:
tvl.st/unsloth...
tvl.st/unslothamd
tvl.st/unslothreq
tvl.st/unsloth...
tvl.st/python3...
#MachineLearning #LLM #AIEngineering
My Links 🔗
👉🏻 Subscribe (free): / technovangelist
👉🏻 Join and Support: / @technovangelist
👉🏻 Newsletter: technovangelis...
👉🏻 Twitter: / technovangelist
👉🏻 Discord: / discord
👉🏻 Patreon: / technovangelist
👉🏻 Instagram: / technovangelist
👉🏻 Threads: www.threads.ne...
👉🏻 LinkedIn: / technovangelist
👉🏻 All Source Code: github.com/tec...
Want to sponsor this channel? Let me know what your plans are here: www.technovang...
Thank you, Matt, for reading comments and responding to me and others.
Of course. I don't respond to everything. Sometimes, when there is a response to a comment from months ago, I never see it.
Thanks Matt! I'm really interested in training and/or fine-tuning models with SMB customers' data on-premises, literally inside the company, on their own network, mainly for data compliance. Swiss here 🇨🇭😁
Would love to see more of these use cases. Great job again.
I built my own JupyterHub session rather than depending on Colab. I love the video because it gives me more ideas on what I can implement in my home lab.
Thank you for sharing the knowledge. I am still struggling to understand what a "model" is at all... Although I do not understand all that stuff, I am always amazed at how you manage to explain the individual steps.
Thanks for the video; it arrived shortly after I tried out Unsloth for the first time. The biggest hurdle was, like you said, the data preparation :).
Btw, something seems to be wrong with your mic sound in this video: it sounds clipped and undersampled.
Yup, noticed that too. Thought I had fixed it.
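Since data preparation keeps coming up as the biggest hurdle: below is a small sketch of one common approach, rendering raw question/answer records through the tokenizer's chat template so the text matches the format the model expects. The model repo and the "question"/"answer" field names are illustrative assumptions, not anything from the video.
```python
# A data-prep sketch: convert raw Q/A records into training strings using the
# tokenizer's built-in chat template. Field names and the model repo are
# illustrative assumptions.
from datasets import Dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("unsloth/llama-3-8b-Instruct-bnb-4bit")

raw = [
    {"question": "What is Unsloth?",
     "answer": "A library for fast LLM fine-tuning on NVIDIA GPUs."},
    {"question": "Can it export to Ollama?",
     "answer": "Yes, via GGUF export."},
]

def to_text(row):
    # Let the tokenizer apply the model's own chat template, so the training
    # data looks exactly like the prompts the model will see at inference.
    messages = [
        {"role": "user", "content": row["question"]},
        {"role": "assistant", "content": row["answer"]},
    ]
    return {"text": tokenizer.apply_chat_template(messages, tokenize=False)}

dataset = Dataset.from_list(raw).map(to_text)
print(dataset[0]["text"])  # inspect the rendered training string
```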
Matt, thank you for all your helpful videos. Where can we fine-tune models without worrying about the company taking our data?
What company? There is no company to take your data when fine-tuning on your own hardware.
Is it just me, or is the sound a bit off?
Hey Matt! Thanks for the content! I would suggest talking about creating and using a Neo4j graph RAG with Ollama.
Hey! Your videos are super helpful!
I started from a simple prompt in Ollama, then built an Ollama server for generating perfect prompts in ComfyUI, and now I'm developing apps and tools for my day-to-day work! I had some prior experience in scripting, but not much (okay, not much), and I'm getting better and better…
I've moved away from Ollama a bit now, as my day-to-day machine is an AMD CPU on Linux with an integrated GPU (I have a workstation that I use remotely, but I like my little day-to-day setup with smaller LLMs).
Do you know if Ollama will support Vulkan out of the box? I've tried and tried to make it work, but alas… I have to rely on GPT4All or LM Studio, and I really prefer the light weight of the Ollama server, or Msty as a UI.
Another thing: I feel there's a lot of compression on your audio; maybe go easy on the compressor! You have a deep voice and a good way of expressing yourself. Is your lavalier mic inside your shirt? You can wear it outside; it's the YouTube age… no need to hide it :)
Noticed the compressor too. That, plus the compressor YouTube applies in the player unless you turn it off, is annoying. Look out for the next one and let me know if I dialed it in better.
Congrats! Very helpful, thanks a lot!
Thank you so much for doing this!!!
I was wondering: if I already have models downloaded by Ollama, can I tell Unsloth where they are located so that it doesn't download each model separately?
Unfortunately not. I was hoping there was a way to tune a GGUF model, but no.
@technovangelist Ouch. I run most of my models with LLAMA-HF.CPP under TGW using the GGUF format. I have given them (MIXTRAL 8x7B and LLAMA 3.1 70B) a memory facility using RAG-style components. As you can imagine, assembling a coherent narrative of a sparse conversation from chunks is quite a challenge. I have provided a lot of guidance in the system prompt, but fine-tuning the models on how to use the chunks might make more sense.
You may ask: why TGW + LLAMA-HF.CPP? TGW has a very effective DRY implementation that only works with this format. I don't know why, but that's just the way it is.
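To make the reply above concrete: GGUF is an inference-only format, so a fine-tune has to start from the Hugging Face-format weights, and you export to GGUF afterwards for Ollama. A sketch, continuing from the training example near the top of the page (all file and model names here are illustrative):
```python
# Continuing from the earlier training sketch: after trainer.train() finishes,
# merge the LoRA adapters and export the result to GGUF for Ollama.
model.save_pretrained_gguf("finetuned", tokenizer, quantization_method="q4_k_m")

# Then, outside Python, register the exported file with Ollama:
#   echo 'FROM ./finetuned/unsloth.Q4_K_M.gguf' > Modelfile
#   ollama create my-finetune -f Modelfile
#   ollama run my-finetune
```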
Thank you!
Shhh... keep this quiet, please! Lol