A Hackers' Guide to Language Models
- Published: 30 Jun 2024
- In this deeply informative video, Jeremy Howard, co-founder of fast.ai and creator of the ULMFiT approach on which all modern language models (LMs) are based, takes you on a comprehensive journey through the fascinating landscape of LMs. Starting with the foundational concepts, Jeremy introduces the architecture and mechanics that make these AI systems tick. He then delves into critical evaluations of GPT-4, illuminates practical uses of language models in code writing and data analysis, and offers hands-on tips for working with the OpenAI API. The video also provides expert guidance on technical topics such as fine-tuning, decoding tokens, and running private instances of GPT models.
As we move further into the intricacies, Jeremy unpacks advanced strategies for model testing and optimization, utilizing tools like GPTQ and Hugging Face Transformers. He also explores the potential of specialized datasets like Orca and Platypus for fine-tuning and discusses cutting-edge trends in Retrieval Augmented Generation and information retrieval. Whether you're new to the field or an established professional, this presentation offers a wealth of insights to help you navigate the ever-evolving world of language models.
(The above summary was, of course, created by an LLM!)
For the notebook used in this talk, see github.com/fastai/lm-hackers.
00:00:00 Introduction & Basic Ideas of Language Models
00:18:05 Limitations & Capabilities of GPT-4
00:31:28 AI Applications in Code Writing, Data Analysis & OCR
00:38:50 Practical Tips on Using OpenAI API
00:46:36 Creating a Code Interpreter with Function Calling
00:51:57 Using Local Language Models & GPU Options
00:59:33 Fine-Tuning Models & Decoding Tokens
01:05:37 Testing & Optimizing Models
01:10:32 Retrieval Augmented Generation
01:20:08 Fine-Tuning Models
01:26:00 Running Models on Macs
01:27:42 Llama.cpp & Its Cross-Platform Abilities
This is an extended version of the keynote given at posit::conf(2023). Thanks to @wolpumba4099 for chapter titles.
Gotta admit I'm feeling kinda teary reading all the lovely comments here. Thank you everybody -- love you all!
You deserve it. 😃
Second in the replies. :3
You are beyond awesome, Jeremy
Thanks for your work, and please help us keep an eye on the apostles of the emerging noosphere, like Ben Goertzel etc.
Jeremy, thank you! This has helped so much. I've been a fastai builder since the early days in 2017. You're my hero. Appreciate all of the work you've done in the field.
Just realised Jeremy's paper led to the LLM revolution. Such a humble, kind man. God bless you and all your students. You are such an example to follow. An example in character, humility and intelligence.
How? Which paper?
@@circleAI ULMFiT
@@circleAI Part of the answer is in the video's description.
Yes, exactly what I was thinking. Why are there so many people bragging about what they are doing? And look at this guy, just helping others out.
This is probably the best-invested YouTube time of this year so far. What a gem. A lot of the things he mentions have taken me months to figure out on my own. My new GPT-4 prompts will begin with "You are the expert Jeremy Howard..."
This!
lol, nice
100%
So great.
Absolutely!
*Transcript Summary:*
- Introduction & Basic Ideas of Language Models (00:00:00 - 00:18:05)
- Limitations & Improvements of GPT-4 (00:18:05 - 00:31:28)
- AI Applications in Code Writing, Data Analysis & OCR (00:31:28 - 00:38:50)
- Practical Tips on Using OpenAI API (00:38:50 - 00:46:36)
- Creating a Code Interpreter with Function Calling (00:46:36 - 00:51:57)
- Using Local Language Models & GPU Options (00:51:57 - 00:59:33)
- Fine-Tuning Models & Decoding Tokens (00:59:33 - 01:05:37)
- Testing & Optimizing Models with GPTQ & Hugging Face (01:05:37 - 01:09:48)
- Fine-Tuning with Llama 2 & Platypus Datasets (01:09:48 - 01:10:32)
- Retrieval Augmented Generation & Information Retrieval (01:10:32 - 01:20:08)
- Running a Private GPT & Fine-Tuning Models (01:20:08 - 01:22:32)
- Running Models on Macs (01:26:00 - 01:27:42)
- Discussing Llama.cpp & Its Cross-Platform Abilities (01:27:42 - 01:30:07)
- Challenges & Opportunities in Language Models (01:30:07 - 01:31:05)
Key points of interest: Function usage in GPT-4 (00:46:36), OCR application with Google Bard (00:33:59), and improving GPT-4 responses with custom instructions (00:24:36).
Dope, did you do this by hand?
😅😅😅😅
I was expecting "By Tammy AI"
Thanks
The moment I got to know that you and Andrej weren't included in the Time's list, I realized that the people making such lists have no idea what they are doing. Loved the tutorial, thank you!
No one can explain a topic like Jeremy👍
I found this video so useful that I felt compelled to pull my keyboard closer toward me, fix my posture, and write this comment - something I rarely do. I'm a professional data scientist hoping to push my company's GenAI agenda and this video makes me feel like I can actually do it! Thank you for so clearly encapsulating the state of LLMs. I'd learned many of these concepts before and this video is the glue that now holds it together.
Looking forward to it. Your fastai stable diffusion course was perfect down to the minute details.
A true legend! So far, I have not seen a better educator than Jeremy. His approach of teaching is what all schools and universities need! I am always interested to learn more, whenever I hear Jeremy. Thank you!
Wow, thank you!
The best "intro" and Guide I have seen on this. Appreciate it so much that you took the time to put this together and share this with us (FOR FREE!).
This video landed up on my feed and out of curiosity I started watching and before I knew it had watched the entire video and taken copious amounts of notes too. One of the best videos I have ever watched!
Very easy to understand and practical! Thanks Jeremy.
Bravo. One of the best YouTube videos I've ever watched. Concise, entertaining, and chock-full of useful insights.
By far the most useful practical guide to LLMs for its length. Thank you Jeremy!
Fabulous tour of key points. Fantastic job! Definitely going to recommend this to people wanting a gateway into llms.
Truly enlightening! As a software engineer with limited math and data science knowledge, this video has been a revelation. The way Prof. Howard simplifies complex concepts is incredible, making each rewatch rewarding with new insights. Really grateful for his content that opens up the world of LLMs to a broader audience. His clear and thorough explanations are incredibly invaluable. Thanks, Prof. Howard, for demystifying this topic and helping us all learn.
Wow, thank you!
Thought provoking one code block at a time. As usual Jeremy the king
We do not deserve you Jeremy! YOU ARE AN AMAZING TEACHER AND HUMAN BEING! Thanks, really, for all these beautiful lectures!!
So many papers are being released, so it is important to have well-grounded information to understand LMs. Great delivery as always, and practical advice. Thank you.
*Positive Learnings:*
1. Language models, such as GPT-4, are tools that can predict the next word in a sentence or fill in missing words in a sentence.
2. Language models have the ability to create a rich hierarchy of abstractions and representations which they can build on.
3. The guide covers all the basic ideas of language models, including how to use open-source and OpenAI-based models.
4. GPT-4 can solve many tasks that it is often claimed it cannot.
5. GPT-4 can be primed to give high-quality information by giving it custom instructions.
6. AI can be used to write code and parse large-scale data quickly and efficiently.
7. AI can be used in optical character recognition (OCR) for extracting text from images.
8. AI can be used in data analysis to create comprehensive tables from scattered information.
9. The OpenAI API allows users to use AI programmatically for data analysis and other repetitive tasks.
10. Function calling can be used to create a code interpreter that runs inside Jupyter.
11. Pre-trained models can be accessed using the Hugging Face library.
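The next-word-prediction idea in point 1 can be illustrated with a toy bigram model. This is a pure-Python sketch only; real language models use neural networks over tokens, not word counts, but the interface (context in, most likely continuation out) is the same.

```python
from collections import Counter, defaultdict

# Toy illustration: a language model is, at its core, a next-word predictor.
corpus = "the goat eats the cabbage and the wolf eats the goat".split()

# Count how often each word follows each other word.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(word):
    """Most frequent word seen after `word` in the corpus, or None."""
    counts = follows[word]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # goat ("goat" follows "the" twice, the others once)
```

Sampling from these counts instead of taking the maximum is the bigram analogue of the temperature-based decoding discussed in the talk.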
*Negative Learnings:*
1. Language models are not always useful on their own and need to be fine-tuned.
2. GPT-4 often repeats mistakes, and it is difficult to get it back on track once it starts making them.
3. GPT-4 has limitations, such as not knowing about itself, not knowing anything about URLs, and not knowing anything after its knowledge cutoff in September 2021.
4. GPT-4 does not always give correct responses.
5. AI has limitations in code interpretation and cannot substitute for human programmers.
6. Use of the OpenAI API can run into rate limits, which need to be handled correctly.
7. Fine-tuning is needed to make pre-trained models more useful.
8. Running local language models on GPUs can be expensive and may require renting or purchasing hardware.
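Point 6 (rate limits) is typically handled with retries and exponential backoff. A minimal sketch follows; `RateLimitError` is a stand-in for whatever exception your client library actually raises, not any specific SDK's class.

```python
import time

# Sketch: retry a rate-limited call with doubling delays between attempts.
class RateLimitError(Exception):
    pass

def with_backoff(fn, retries=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying on RateLimitError with exponential backoff."""
    for attempt in range(retries):
        try:
            return fn()
        except RateLimitError:
            if attempt == retries - 1:
                raise                      # out of retries: surface the error
            sleep(base_delay * (2 ** attempt))

# Demo with a fake call that fails twice before succeeding:
state = {"calls": 0}
def flaky():
    state["calls"] += 1
    if state["calls"] < 3:
        raise RateLimitError("slow down")
    return "ok"

print(with_backoff(flaky, sleep=lambda s: None))  # ok
```

Many official SDKs ship built-in retry options; prefer those when available and roll your own only for clients that lack them.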
So comprehensive. Perhaps the best introduction I have ever seen to the topic. Thanks so much.
Thanks for all you do, Jeremy. I have learned so many things watching YouTube as well as the PDLC tutorials. Your explanations are on point.
I liked the video even before watching. Thanks Jeremy for your work, always learning from your content.
You are literally changing lives, all for free. Thank you.
Jeremy, you're one of the most legit AI people out there. An enormous thank you for providing this and all your content. ❤
I have waited for months for a classification and evaluation from Jeremy. For me, this is by far the most comprehensive technical summary and evaluation available for someone who wants to delve deeper. It took me several weeks, if not months, to gain even a partial personal understanding of the current hype. Thank you, Jeremy, for all your good work!👍
Great video. Watched it in one sitting. It's very interesting and engaging, and does cover a lot of areas on LLM, different model, types, examples, uses cases, etc. I learned a lot and hopefully will go through the notebook in detail and adapt to my use cases. Thanks for making this.
Hi Jeremy, excellent walkthrough! This is truly helpful. Please keep them coming!!
Very much appreciated this consolidation of the main LLM coding concepts to-date. Thank you!!
I'm really grateful for how much people share their knowledge; I can't believe we get to learn this stuff for free. This means a lot to me.
This is a golden summary of the state of LLMs. Thank you!
Brilliant walkthrough. No hype. It is a real skill to explain complex topics in a coherent way.
Thank you very much, Jeremy. Fascinating to see how far we have come. The prose-to-SQL thing blew me away. Can't wait to try this out myself.
The "wolf, goat and cabbage" riddle example is just awesome. Gotta use it to illustrate what LLMs can't do and why. Cheers for that :)
🎯 Key Takeaways for quick navigation:
00:00 🤖 Introduction to Language Models
10:27 🧠 Neural Network Basics
16:38 🚀 The Power of GPT-4
24:53 🌐 Limitations of Language Models
25:23 💡 Language model limitations
31:32 📊 Advanced Data Analysis
36:18 💰 OpenAI API Pricing
39:19 🧩 Using OpenAI Functions
46:40 🐍 Custom Code Interpreter
51:13 🐍 Creating a Python code interpreter
53:39 💻 Running a language model on your own computer
55:01 🏎️ Choosing a GPU for language model work
56:15 🖥️ Options for renting GPU resources
57:57 💾 GPU memory size and optimization
59:20 📚 Using Transformers from Hugging Face
01:00:06 🏆 Evaluating and selecting the right model
01:14:12 📖 Retrieval augmented generation for answering questions
01:17:10 📚 Overview of using language models for document retrieval and question answering
01:20:35 💼 Private GPT models for document retrieval
01:21:03 🎯 Fine-tuning language models for specific tasks
01:25:15 📊 Building a language model for SQL generation
01:26:36 💻 Running language models on Macs
GPT plugin? :D Damn, son
@@plebmarv9668 It's Tammy AI, a YouTube video talking-points extractor.
One of the best and most educational videos I've seen on the subject. Thank you, Jeremy!
There are hundreds of LLM tutorials coming out every day; this is the one I have been waiting for.
Great! It allowed me to understand how LMs think and why.
Thank you Jeremy for this introduction. It just answered many of my questions and affirmed some of my doubts about how many of the applications that use LLMs work today.
Thanks so much Jeremy, been following you since Kaggle's launch. Inspirational to see an Australian continue to kick ass as much as you have in your career.
Came up in my feed. The thumbnail and title looked boring. By mistake I pressed play, but it was so interesting. I feel so enlightened after having been talked through this. Thanks for sharing!
Impressive video; I spent days learning these concepts on my own. Had this been released two months ago, it would've been a game-changer. Excellent summary.
Luckily I'm a few weeks behind you! Happy learning, mate!
I think the major problem is the retrieval. Would love a video just on that (best practices, best models out there etc.).
Great video; came across it on X and subbed immediately.
This is great. I can't say how grateful I am for your video. Thank you, and keep up the great work!
Such a great video! I learned a lot, such as how complicated systems can be put together using a stack of models, as illustrated by the RAG example. Jeremy, you are such a kind person to share this with the world.
Keep on making videos, man. This was highly informative, and my regards for being part of forming this architecture!
This one presentation is worth more than all the AI discourse on the internet.
What a great primer! Very much needed! Thanks as always Jeremy!
Kudos; such a pleasurable 1 hour, 31 minutes and 12 seconds.
This is amazing, thanks so much for recording this and sharing it 👏
This is a real gem. Reminds me of the authentic, high quality training material from Andrej Karpathy. Looking forward to future similar tutorials if you decide to make them! Thank you!
Always looked forward to Jeremy explaining this topic. Finally it is here. 😀
Thank you for creating this amazing talk around all the basics and applications with language models, this is really helpful!
Damn I watched the whole video and didn't even realise that it was 1+ hr long! Thanks a lot for the great content!
This is remarkable. Thanks for sharing this topic with us, Jeremy!
Thank you Jeremy for all of your work and for sharing such quality videos. ❤
Can’t stop watching over and over again! Thank you 🙏
Thank you for the talk.
I am a total beginner, but you made me understand language models way better than anyone else. You are such a great teacher. I pray the Lord blesses you with even more insight and vision; such a humble and good soul. 😊😊
100 % agree! Blessings to you too.
Jeremy, we need more videos on this topic! Thank you so much!
Thank you Jeremy!! One of the most insightful and helpful vlog posts on the inner workings of LLMs... Top marks!!!
Happy birthday Jeremy! Just got to the section where your bday is revealed and it is today! Thank you for all the great work :)
Mr Howard never disappoints. Thanks a ton as usual Sir.
Hands down one of the best videos on LLMs on the internet.
Thanks Jeremy for another wonderful lecture! Much appreciated.
Awesome stuff, always like learning from your videos. Been watching since FastAI v1.
incredible, value-packed, practical video for developers working with LLMs.
Glad you liked it!
oh man great video
Great course! Hello from Almaty Google developers community!
Jeremy, Congrats on the 100k subscribers.
Well deserved and hopefully a catalyst to get your invaluable content more exposure.
I cannot emphasize enough how incredible this video was.
First comment on YouTube here. Among all those videos on YouTube, using custom instructions like you did is literally eye-opening. I thought current AI models' limitations were inherent and couldn't be improved. Of course, you are a professional in AI, but things are so well organized and straightforward that I can understand and see the result right away. 😂 Gonna have to steal your instruction as well.
This is pure gold!!!!❤❤❤❤ I would have never found this on my own
Great content, and I love the way you structured it. Thanks :)
I remember I was rewriting some chunk of TensorFlow code with the GPT-4 code interpreter, and it responded that it did not have access to that framework. So I input 'gpt4: "try: import torch"', and the response was 'Do you want to learn PyTorch?' :) All about GPT's self-awareness mechanism :)
Such an amazing video! Thanks so much for this amazing overview of LLMs! A few suggestions for a follow-up video perhaps :) 1. How to build an LLM from scratch on a consumer grade machine (e.g. laptop with 1 good GPU or Macbook M2 Max), maybe not 1B params but e.g. 500M? 2. LORA and DPO. 3. OpenAI's updated API. 4. Agents based frameworks such as AutoGen. Thanks again - really enjoyed this video! ❤🙏
A wealth of information and knowledge in this video... a masterpiece. Thank you!
Such an excellent video; thanks for taking the time to produce this for those very much at the start of this journey.
I've been a student of yours for years, thank you for all you do for us.
A non-optimal but correct solution to the cabbage, wolf & goat crossing works by just using the incantation "think step by step":
Think step by step to solve the following tricky problem:
Here is a logic puzzle. I need to carry a cabbage, a goat and a wolf across a river. I can only carry one item at a time with me in the boat. I can't leave the goat alone with the cabbage, and I can't leave the cabbage alone with the wolf. How can I get everything to the other side of the river?
This classic river crossing puzzle can be solved through a series of strategic moves that ensure the safety of each item. Here's a step-by-step solution:
1. **Take the goat across first:**
- You take the goat across the river and leave it on the other side.
- This ensures that the goat is safe from the wolf, and the cabbage is safe from the goat.
2. **Return alone:**
- You return to the original side alone to get either the cabbage or the wolf.
3. **Take the cabbage across next:**
- You take the cabbage across the river.
- Now, the goat is on the other side, and the cabbage is with you.
4. **Bring the goat back:**
- You bring the goat back with you to the original side.
- Now, the cabbage is alone on the other side, and the wolf is alone on the original side.
5. **Take the wolf across:**
- You then take the wolf across the river.
- Now, the wolf and the cabbage are on the other side, and you have the goat on the original side.
6. **Return alone:**
- You return to the original side alone.
7. **Take the goat across one final time:**
- Finally, you take the goat across the river again.
   - Now, all three (the goat, the cabbage, and the wolf) are safely on the other side.
This way, you ensure that the goat is never left alone with the cabbage, and the cabbage is never left alone with the wolf, solving the puzzle.
chat.openai.com/share/3cb73481-c45c-4108-9960-30c11615ca41
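The moves above can be checked mechanically against the constraints exactly as stated in the prompt: the goat must never be left alone with the cabbage, and the wolf must never be left alone with the cabbage (a twist on the classic puzzle, where the wolf threatens the goat). A small simulation sketch, with the move list and constraint sets as assumptions drawn from the transcript:

```python
# Each element of `moves` is the item the farmer ferries on that crossing;
# None means the farmer crosses alone. After every crossing, the bank the
# farmer just left must not contain a forbidden pair.
def check_solution(moves):
    banks = {"near": {"goat", "cabbage", "wolf"}, "far": set()}
    farmer = "near"
    for item in moves:
        other = "far" if farmer == "near" else "near"
        if item is not None:
            banks[farmer].remove(item)   # KeyError if item isn't with the farmer
            banks[other].add(item)
        farmer = other
        left_behind = banks["near" if farmer == "far" else "far"]
        # Constraints as stated: goat+cabbage and wolf+cabbage must not be alone.
        if {"goat", "cabbage"} <= left_behind or {"wolf", "cabbage"} <= left_behind:
            return False
    return farmer == "far" and banks["far"] == {"goat", "cabbage", "wolf"}

# The transcript's moves follow the *classic* puzzle (goat first) and
# immediately leave the wolf alone with the cabbage:
print(check_solution(["goat", None, "cabbage", "goat", "wolf", None, "goat"]))
# Ferrying the cabbage first satisfies the constraints as written:
print(check_solution(["cabbage", None, "goat", "cabbage", "wolf", None, "cabbage"]))
```

Running this makes the talk's point concrete: the step-by-step answer pattern-matches the famous puzzle rather than the constraints actually given.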
I was waiting for this for a long time! Thank you very much
This is amazing and serious content. I love it. Thank you for making it available, greetings from Switzerland
Thank you for sharing your experience and knowledge, Sir.
Thanks for making video. Would love to see some follow up videos on use cases for fine tuning. Where does it make sense vs RAG or even just better prompts
Perfect description of Functions at 46:30!
Never miss Jeremy's lectures....
I feel like I've just been pretrained with the best AI video my creator could feed me.
I feel like this video was made personally just for me. Amazing.
Thank you for this. Couldn’t have asked for a better video.
Thanks... great summary... now I know the relationship between neural network parameters and vector DBs.
What a very useful and informative video -- I watched this over the course of a day and took notes -- Thanks!
Glad it was helpful!
This is absolutely fabulous. Thank you!
You, Jeremy, and Andrej are my favourite ML people. Human, kind and helpful. I love your teaching style too. I listened to the video and am looking forward to doing so again with the video and Python, and to doing some coding! Thanks a lot for what you are doing.
wonderful lesson as always Jeremy!
I did have a laugh at the GPT-4 bit "Bad pattern recognition - thanks to Steve Newman", as if he's the sole individual responsible for that limitation
This is an absolutely amazing video, even with no real training on AI, it was easy to follow and made complete sense. Does anyone know if you can run any of these things on the new Intel graphics cards?
This is so well done and presented. Thank you.
Hey thank you for making these available for free. ❤
Jeremy this is a gem of a video. thanks again.
Wonderful overview, gives me confidence to dive in!
Oh yes!!! Can't wait to dig into this, thank you Jeremy!
Hope you enjoy it!
What a fantastic video. Really enjoyed thank you!!