Local Retrieval Augmented Generation (RAG) from Scratch (step by step tutorial)

  • Published: 21 Aug 2024

Comments • 164

  • @mrdbourke
    @mrdbourke  5 months ago +42

    Who’s ready to build a local RAG pipeline from scratch? 🔨🔥
    PS A big shout out to NVIDIA for sponsoring this video!
    Be sure to check out NVIDIA GTC for the latest developments in AI, deep learning and GPU technology.
    It's running from March 18-21 in San Jose, California but is free to attend virtually (what I'm doing): nvda.ws/3GUZygQ

    • @nikosterizakis
      @nikosterizakis 5 months ago

      Once again, an awesome video, Daniel. Two comments only, if I may: a. lose the 'tash :) b. I just got a Mac Pro and you started coding on Windows :(

    • @kameshsingh7867
      @kameshsingh7867 4 months ago

      Great video, I have been waiting for such a video on RAG. Thanks Daniel

    • @Dan-Levi
      @Dan-Levi 2 months ago

      Great stuff, thanks for making this available for us noobs! I managed to follow it to the end with my custom source of text (a phpBB forum's posts); although it wasn't very good at answering, it did produce some answers.
      All the text was in Norwegian, and each topic and its related answers were concatenated onto its own line in a txt file.
      I'm sure I could have done this much better, but for a first try I was rather satisfied!
      Could not have done this without this video though, so yeah, great stuff!
      Any tips on models I can use for Scandinavian/Norwegian? I tried to search around a bit and after a quick talk with ChatGPT (lol) I found NB-BERT (Norwegian BERT), but did not manage to get it going.

    • @robertovillegas2220
      @robertovillegas2220 1 month ago

      It would be great if you did a video on how to attach a file to the query, just like in ChatGPT. My use case: a group of files serves as the context for the RAG, and the query includes a document that is compared against the criteria in the other documents to evaluate whether the queried document is compliant.

    • @hassan_sid
      @hassan_sid 1 month ago

      This video is a gem. I have recently completed watching your TensorFlow and PyTorch tutorials. Those videos are great in terms of resources. Please make a video on fine-tuning LLMs. Looking forward to more amazing videos from you in the future.

  • @TheYephers
    @TheYephers 4 months ago +18

    A lot of times I struggle to make it through a 10 or 20 minute video on this stuff. But somehow I can watch 5 hours or even 24 hours of you. I love how you keep it light and you let us see your typos. You are a special and unique person and I am glad you do what you do. I really loved your 24-hour PyTorch video too.

  • @AprendizSerial1979
    @AprendizSerial1979 3 months ago +7

    I must give you my total gratitude for taking the time (5 hours) to explain such invaluable knowledge in baby steps.

  • @hpac9687
    @hpac9687 5 months ago +8

    I've been a follower for quite some time and am also a big advocate for open sourcing, knowledge exchange and community building. Really happy to see you consistently push yourself and your work but never losing sight of your core values. Great job!

  • @_MrCode
    @_MrCode 5 months ago +14

    Machine Learning Bootcamps will never exist 'cause of Daniel 🤗

    • @newxceo
      @newxceo 5 months ago +5

      Yeah! Tutorials on these topics are rare; he could make a huge amount of money, but he teaches like an ideal teacher.

  • @r0f115L4m
    @r0f115L4m 5 months ago +8

    It’s always a good day when Daniel drops one of these tutorials. Love it!

  • @amitkumarchejara6042
    @amitkumarchejara6042 4 months ago +3

    Finished the tutorial today and will start building something using this tremendous knowledge. Thank you so much!!
    And a note: we are happy and excited about any idea that pops into your head for a new educational video.
    We need tutorials on different frameworks and on building from scratch.

  • @tejasvix
    @tejasvix 7 days ago

    I am baffled by the sheer quality of this content, damn, thanks. Would love to see a similar tutorial on fine-tuning newer models like Llama 3.1 or the Mistral ones.

  • @mortitotti
    @mortitotti 1 month ago +1

    So much useful information in one video. Can't be more grateful for this awesome work.

  • @vasoyarutvik2897
    @vasoyarutvik2897 1 month ago +1

    Hey Daniel, I have seen the TensorFlow and PyTorch code videos and I just loved them, and today I completed this video. I just want to say thank you so much. Love your content and your teaching style as well. Love from India. Good luck!

  • @evharten
    @evharten 2 months ago

    Almost 6 hours of training, but it only took me two days to get through it and build my own RAG model; it was a breeze. Very good training and it really helped me in understanding what RAG is and how to implement it. Thank you very much!

  • @mehmetozkaya284
    @mehmetozkaya284 5 days ago

    Thank you so much for everything Daniel!

  • @studywithjames
    @studywithjames 1 month ago

    I am going to be a graduate researcher at my university and am going to research RAG! I was worried, since I was new to RAG, until this video! Thanks, Daniel!

  • @afai264
    @afai264 3 months ago +1

    Excellent. I've been watching your videos for a while (the Python in 24 hours one took some time to get through!). I like how you explain absolutely everything and don't jump ahead (even if some things may seem obvious). It may take longer, but by the end you get a very good understanding of the topic and all the concepts.

    • @mrdbourke
      @mrdbourke  3 months ago

      Thank you! Glad you enjoy!

  • @AshishWaikar
    @AshishWaikar 1 month ago +1

    This is a great tutorial for RAG! Just amazing!

  • @tobichls
    @tobichls 5 months ago +5

    Funny that I happened to find this video, it's gonna be very helpful for a project that I'm working on.

    • @mrdbourke
      @mrdbourke  5 months ago +3

      Good timing! Enjoy!

  • @parinaztabari2332
    @parinaztabari2332 14 days ago

    This tutorial is awesome!!! Thank you, Daniel. You are a fantastic teacher! I look forward to future videos about creating an app with this and optimizing LLMs locally! Super cool.

  • @akj3344
    @akj3344 5 months ago +2

    Gonna watch this after buying a GPU. Love your TF course and the ZTM community!

    • @mrdbourke
      @mrdbourke  5 months ago +2

      Enjoy legend!

  • @Ony_mods
    @Ony_mods 23 days ago +3

    If I could, I would give this video multiple thumbs up! 🥰

  • @felipeanime1999
    @felipeanime1999 1 month ago

    Holy shit, this was awesome. First time implementing RAG, following all the steps, and I just immediately fell in love. Also, I would love to see a deployment using Gradio, with an "upload file" feature added to it. Thanks for sharing your knowledge Daniel!!!

  • @mustafamaree
    @mustafamaree 5 months ago +3

    Great content. I am watching the TF course now. Thanks for your effort to make learning easy and fun.

  • @umer-un
    @umer-un 3 months ago

    Oh there you are Daniel, always ready to help with complex concepts.

  • @knubbe
    @knubbe 5 months ago

    Finally! Now no more second-guessing what LLM RAGs are. My boy Daniel has got us covered!!

    • @mrdbourke
      @mrdbourke  5 months ago

      You will definitely know what RAGs are after this!

  • @leprodige5388
    @leprodige5388 4 months ago +1

    You are the goat of this planet. Thanks for the video, it's very clear.

  • @user-yu7ie2em5b
    @user-yu7ie2em5b 14 days ago

    One of the best tutorials.

  • @amitkumarchejara6042
    @amitkumarchejara6042 5 months ago +1

    This is what we were waiting for!!
    Great content, All the best!!

  • @Trashpanda_404
    @Trashpanda_404 4 months ago

    Jesus, almost 6 hours! 😳 Gonna have to do a rail of Adderall and dig in. 😂 Seriously though, with my ADHD I'll have to break it down into sections, but I'm excited to watch the content. Thanks for the in-depth video, brother!

  • @DataScienceGarage
    @DataScienceGarage 4 months ago +1

    This is a golden video! Thanks a lot for sharing this! :)

  • @darryloatridge7329
    @darryloatridge7329 3 months ago

    Thanks Daniel, so much to love about this video. Firstly, I love that this is true Python open-source code and not frameworks chained together. The delivery style works perfectly for me, with occasional offshoots into related, though not required, subjects of interest. Your thoughts and advice, along with extensive notes, build the perfect all-round package. Please, please do more on the LLM optimization techniques you suggested and any other in-depth subject you care to share, like embeddings.

  • @CodeSnap01
    @CodeSnap01 5 months ago

    Loved it, Daniel. Always admire your work. Please add deployment methods, it will help a lot.

  • @AnyDeal
    @AnyDeal 4 months ago

    Great work Daniel, educating the world 🌎 Keep going 🚴‍♂️

  • @bigwoodtony4063
    @bigwoodtony4063 4 months ago

    Always love your content, and I am going to follow along throughout and build my own RAG too!

  • @dipayanbhowal7025
    @dipayanbhowal7025 2 months ago

    This tutorial is just too good

  • @basittanveer2358
    @basittanveer2358 12 days ago

    Yes, we want an extension video on how to get this all into an app
    Video Stamp 3:36:50

  • @Daily_language
    @Daily_language 1 month ago

    Great! Instead of following those wrapper functions defined by LangChain, using this method you can learn the details of RAG.

  • @ken-cheenshang6829
    @ken-cheenshang6829 24 days ago

    Amazing tutorial, thanks for sharing!

  • @romaind4853
    @romaind4853 3 months ago

    Very interesting and well made. Thank you!

  • @farazshah4484
    @farazshah4484 4 months ago

    Best video on the internet regarding RAG! 🔥

  • @pranavpai_
    @pranavpai_ 1 month ago

    I came from your "Day in a Life as a Startup Founder" video. Do you have a video on setting up what you have set up in your business? How to create the infrastructure, deploy models, etc., in a business context. Or anything close to that.

  • @alisamalakhova
    @alisamalakhova 3 months ago

    This is such a great and helpful video!!

  • @deepsuchak.09
    @deepsuchak.09 5 months ago

    Big fan!
    Amazing content as usual
    Binging this asap!!

  • @leonardommarques
    @leonardommarques 1 month ago

    This was entertaining to watch.

  • @gokuljs8704
    @gokuljs8704 3 months ago

    It is always good to mention system requirements in the course. This needs a GPU to run.

  • @mpc5584
    @mpc5584 3 months ago

    Thanks soooooo much this is a lifesaver!

  • @PyroGamez2016
    @PyroGamez2016 1 month ago

    Day 1:
    42:01
    Day 2:
    1:31:27
    Day 3:
    2:22:44
    Day 4:
    4:03:57

  • @ComputingAndCoding
    @ComputingAndCoding 3 months ago

    Excellent Video!

  • @abdullahiabdulwasiu
    @abdullahiabdulwasiu 1 month ago

    A lot of insights here. Please could you also show fine-tuning the LLM for a specific domain instead of embedding? 🎉

  • @mohammed333suliman
    @mohammed333suliman 4 months ago

    As always, great stuff. Thank you Dan. Based on this, can you make a video or share resources about evaluation methods, and an additional one for deployment?

  • @Mars18542
    @Mars18542 5 months ago

    Great tutorial 👏

  • @nayeemuddinkhan5852
    @nayeemuddinkhan5852 3 months ago

    Well done!

  • @vasoyarutvik2897
    @vasoyarutvik2897 1 month ago

    Hello Daniel. I love your content. I just want to request that you create a video on LLMs from scratch and Stable Diffusion image generation from scratch. Thank you again for the very good learning resources. Good luck!

  • @RameshP-ds4xt
    @RameshP-ds4xt 18 days ago

    Excellent

  • @andydataguy
    @andydataguy 5 months ago

    This is awesome!! Thank you 🙌🏾💜

  • @FireFly969
    @FireFly969 5 months ago

    A wonderful project, man, thank you so much ❤

    • @mrdbourke
      @mrdbourke  5 months ago

      You’re welcome! Have fun!

  • @Dr_Varghese_Manappallil_Joy
    @Dr_Varghese_Manappallil_Joy 5 months ago

    Hi Danny!!! Great job!!!! The sky is the limit.

  • @AryanMishra_21
    @AryanMishra_21 1 month ago

    You are a god: there is Daniel and then there is Jesus! Thanks for this!!!!

    • @mrdbourke
      @mrdbourke  1 month ago

      Hahaha that’s a very big compliment! Thank you for the kind words

  • @Itsme-havefuntogether
    @Itsme-havefuntogether 3 months ago +1

    I watched the whole video and I didn't expect that we'd be using a pretrained model for this 😢 I was watching because your title said "from scratch"; for that, we can directly use a model like GPT-2 or another one that will work in less than 10 lines of code 😕

    • @costa2150
      @costa2150 3 months ago

      GPT-2 can't query the PDF used in this tutorial

    • @Itsme-havefuntogether
      @Itsme-havefuntogether 3 months ago

      @@costa2150 Simple, we can use any model from the Hugging Face library to do that in even shorter code

  • @user-wi6wf2yn4b
    @user-wi6wf2yn4b 1 month ago

    Thank you Daniel

  • @KaranSharma-cg6uh
    @KaranSharma-cg6uh 4 months ago

    Hi Dan, great content and thank you! Is it possible to have another video on the "optional" chat application for a complete end-to-end RAG pipeline? Thanks again!

  • @continuouslearner
    @continuouslearner 5 months ago +1

    Can I run this on my MacBook M1 or do I need to purchase an NVIDIA GPU + Windows OS? Can you clarify?

    • @mrdbourke
      @mrdbourke  5 months ago +4

      You can run this on a MacBook, however, you will need to change the device to "mps" (Metal Performance Shaders) rather than "cuda". For example, `device = "mps" if torch.backends.mps.is_available() else "cpu"`, see more here: pytorch.org/docs/stable/notes/mps.html
      Also, how much memory you have available will determine which LLM you can use.
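
      For reference, a minimal sketch of that device selection (the fallback order and variable names here are illustrative, not code taken from the video):

      ```python
      import torch

      # Pick the best available device: NVIDIA GPU, then Apple Silicon GPU, then CPU.
      if torch.cuda.is_available():
          device = "cuda"
      elif torch.backends.mps.is_available():
          device = "mps"  # Apple Silicon GPUs via Metal Performance Shaders
      else:
          device = "cpu"

      print(f"Using device: {device}")

      # Tensors and models are then moved to that device, for example:
      x = torch.rand(3, 3).to(device)
      ```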

  •  24 days ago

    Amazing video, thank you. Can you do something for FAISS?

  • @Tony-fm4tf
    @Tony-fm4tf 4 months ago

    Thank You!!

  • @kavyajeetbora2585
    @kavyajeetbora2585 4 months ago

    Thanks @mrdbourke, this was really helpful in breaking down some concepts in LLMs.

  • @rathtakrit
    @rathtakrit 1 month ago

    God bless you

  • @alessiosaladino6664
    @alessiosaladino6664 2 months ago

    Hi, I want to watch this video, but since it is 5 hours long I first need some information:
    - Does it require any paid API subscription?
    - Can the models and the code also run on Google Colab?
    - Does it require any training, or does it rely on pre-trained models?
    Thank you!

  • @alimustafa2682
    @alimustafa2682 5 months ago

    Life Saver

  • @iFastee
    @iFastee 4 months ago

    So, at around the 5-ish hour mark, where we're playing with the base prompt and connecting it to our retrieved context, I am wondering how I can be sure that all of that fits in my LLM's context window.
    Am I missing something here? Is the LLM context window much greater than, for example, our previous embedding model that tokenized a max of 384 tokens per sample? (I know it's not apples to apples, as the embedding model will embed all the sequences and maybe leave some tokens out on some of them, but for the LLM context window we have to make sure we are fitting literally everything in our prompt.)
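
    One way to answer this for yourself is to count the tokens in the assembled prompt before sending it to the LLM. A minimal sketch, assuming the transformers library is installed; the model ID and context-window size below are assumptions, not figures from the video:

    ```python
    from transformers import AutoTokenizer

    MODEL_ID = "google/gemma-2b-it"  # swap for whichever LLM you are using
    CONTEXT_WINDOW = 8192            # assumed context length for that model

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

    def fits_in_context(prompt: str, max_new_tokens: int = 256) -> bool:
        """Check that the tokenized prompt leaves room for the generated answer."""
        n_tokens = len(tokenizer.encode(prompt))
        print(f"Prompt tokens: {n_tokens} / {CONTEXT_WINDOW}")
        return n_tokens + max_new_tokens <= CONTEXT_WINDOW
    ```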

  • @AnshumanKumar007
    @AnshumanKumar007 5 months ago

    Need videos on RAG agents that can handle math also.
    Working with clients, I am really seeing the limitations of the RAG framework when it comes to math stuff. It just sucks.
    RAG + Graph Agent is worth exploring.

  • @sarathgentela9444
    @sarathgentela9444 4 months ago

    Great video, Daniel! "Thank you" would be too small a word. Do you think we can apply the same mechanism to build a chatbot on a CSV file or SQL database? How can we handle different columns with different datatypes, like strings for names, integers for ranks, counts and scores, and floats for rates and other metrics? How do we make an embedding model understand the differences?
    Would love it if you could make a video on it.

  • @PrithivThanga
    @PrithivThanga 4 months ago

    return torch.mm(a, b.transpose(0, 1))
    RuntimeError: Tensor for argument #2 'mat2' is on CPU, but expected it to be on GPU (while checking arguments for mm)
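
    That error means one tensor is on the CPU while the other is on the GPU. A minimal sketch of the usual fix, moving both operands to the same device before the matrix multiply (the tensor names and shapes here are illustrative):

    ```python
    import torch

    device = "cuda" if torch.cuda.is_available() else "cpu"

    query_embedding = torch.rand(1, 768)  # stand-in for the embedded query
    embeddings = torch.rand(100, 768)     # stand-in for the stored chunk embeddings

    a = query_embedding.to(device)
    b = embeddings.to(device)

    dot_scores = torch.mm(a, b.transpose(0, 1))  # both tensors now live on `device`
    print(dot_scores.shape)
    ```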

  • @Alex-bv2eu
    @Alex-bv2eu 4 months ago

    Great video! I will definitely make my own.
    Are there other ways to evaluate the LLM responses? In order to automatically evaluate many answers given by the LLM (as in supervised learning, for example), I guess we would have to pay for GPT-4 API access... Could we use embeddings and cosine similarity again to compare the generated answer to a label? Or maybe use another local LLM on bigger hardware?
    Thank you so much!
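
    The embeddings-plus-cosine idea from the comment can be sketched in a few lines with sentence-transformers. The model name is the embedding model used in the video, but any sentence-transformers model would do, and the example strings are made up:

    ```python
    from sentence_transformers import SentenceTransformer, util

    embedding_model = SentenceTransformer("all-mpnet-base-v2")

    generated = "Vitamin C is found in citrus fruits and supports immune function."
    reference = "Citrus fruits are rich in vitamin C, which helps the immune system."

    # Embed both answers and score them with cosine similarity
    # (closer to 1.0 = closer to the reference label).
    embeddings = embedding_model.encode([generated, reference], convert_to_tensor=True)
    score = util.cos_sim(embeddings[0], embeddings[1]).item()
    print(f"Cosine similarity: {score:.3f}")
    ```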

  • @wimboj568
    @wimboj568 4 months ago

    If the PDF I use contains formulae, will the AI system be able to interpret and apply them in questions or scenarios I give? Similarly, I'm curious whether or not it is able to acknowledge diagrams (though I assume we have to remove them, as they cannot be included).

  • @denisebby1895
    @denisebby1895 1 month ago +1

    Nice tutorial! Anyone else using gemma-2b-it with 4-bit precision and getting answers like "The context does not provide any information about the best source to fulfill nutritional requirements, so I cannot answer this question from the provided context."? Maybe a larger LLM will help.

    • @cavemanpl5166
      @cavemanpl5166 1 month ago

      Hi, the same here on an old 8 GB NVIDIA 3060. I tested it on two GPUs, and the same gemma-2b-it model works smooth and painless on a new 64 GB Ampere card.
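
      For reference, a minimal sketch of loading gemma-2b-it in 4-bit precision with bitsandbytes, as discussed above; the quantization settings are common defaults, not necessarily the exact configuration used in the video (requires the accelerate and bitsandbytes packages):

      ```python
      import torch
      from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

      model_id = "google/gemma-2b-it"

      quant_config = BitsAndBytesConfig(
          load_in_4bit=True,
          bnb_4bit_compute_dtype=torch.float16,
      )

      tokenizer = AutoTokenizer.from_pretrained(model_id)
      model = AutoModelForCausalLM.from_pretrained(
          model_id,
          quantization_config=quant_config,
          device_map="auto",  # place layers on the available GPU(s)
      )
      ```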

  • @user-dn9ub6jr5y
    @user-dn9ub6jr5y 4 months ago

    Thank you so much for a great video!
    I watched all of your deep learning and machine learning content and absolutely loved it!
    Always craving more: can we expect a follow-up video teaching attention models and Transformers, or vision-related generative models such as DDPMs?

  • @user-vd5hj1kh6w
    @user-vd5hj1kh6w 2 months ago

    Is there any practical difference in approach between RAG as described here and RAG for tabular data (CSVs and Excel docs)?

  • @holycake123
    @holycake123 1 month ago

    Which one is better: a MacBook Pro with the M3 Max chip, or a setup with an RTX 4090, for tasks like creating embeddings and building a RAG system?

    • @shivampradhan6101
      @shivampradhan6101 1 month ago

      an RTX 4090

    • @holycake123
      @holycake123 1 month ago

      @@shivampradhan6101 Thanks, what if the LLM requires more than 24 GB of VRAM to run?
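
      A rough way to answer that is to estimate weight memory as parameters × bytes per parameter (weights only; the KV cache and activations add overhead on top). A minimal sketch with approximate parameter counts:

      ```python
      # Back-of-the-envelope estimate of GPU memory needed just for model weights.
      def weight_memory_gb(params_billion: float, bytes_per_param: float) -> float:
          return params_billion * 1e9 * bytes_per_param / 1e9

      for name, params in [("gemma-2b (~2.5B)", 2.5), ("7B model", 7.0), ("13B model", 13.0)]:
          fp16 = weight_memory_gb(params, 2)    # float16: 2 bytes per parameter
          int4 = weight_memory_gb(params, 0.5)  # 4-bit:   0.5 bytes per parameter
          print(f"{name}: ~{fp16:.1f} GB in fp16, ~{int4:.1f} GB in 4-bit")
      ```

      If the weights alone exceed your VRAM, the usual options are a smaller model, heavier quantization, or offloading layers to CPU RAM at the cost of speed.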

  • @hassan_sid
    @hassan_sid 1 month ago

    @mrdbourke Can you make a detailed video on LLM fine tuning?

  • @jahidmdhasan1021
    @jahidmdhasan1021 8 days ago +1

    Should I try to build this on a GTX 1650 Ti? Will it be able to run, or will it crash?

  • @mohammedmouaadhdjawedbaghd2990
    @mohammedmouaadhdjawedbaghd2990 3 months ago

    Thanks for the video. I see that all the optimization techniques for speeding up output generation require a good GPU; are there any recommended techniques that work on the CPU (I don't have a GPU)? Thanks.

  • @rohitmondal4143
    @rohitmondal4143 1 month ago

    Hi, please bring out the optimized LLM generation video, please.

  • @mohammedemadeldin5753
    @mohammedemadeldin5753 3 months ago

    Bravoooooo 🙅

  • @kaanvural2920
    @kaanvural2920 22 days ago

    I did not understand exactly whether the sentence_chunk property is the same thing as the text of that page itself. Why do we first break it into chunks and then join them? What am I missing?

    • @mrdbourke
      @mrdbourke  20 days ago

      Great question! It's because many of the pages in the textbook we used don't actually have that many sentences of text. If a page had many sentences (e.g. 20-50 sentences), we may not want to embed the whole page because our embedding model can only handle sequences of 512 tokens (anything after this would be cut off), so we reduce the pages to chunks of 10 sentences so each chunk fits into our embedding model.
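
      A minimal sketch of that chunking step: split a page into sentences, group them into chunks of 10, then join each chunk back into a single string to embed. The toy sentences and the grouping function here are illustrative, not the exact code from the video:

      ```python
      from typing import List

      def split_into_chunks(sentences: List[str], chunk_size: int = 10) -> List[str]:
          """Group sentences into chunks of `chunk_size` and join each chunk into one string."""
          chunks = []
          for i in range(0, len(sentences), chunk_size):
              chunks.append(" ".join(sentences[i:i + chunk_size]).strip())
          return chunks

      page_sentences = [f"Sentence number {n}." for n in range(1, 24)]  # 23 toy sentences
      for chunk in split_into_chunks(page_sentences):
          print(len(chunk.split()), "words:", chunk[:45], "...")
      ```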

  • @ShreYyy20902
    @ShreYyy20902 2 months ago

    Hi (I am at the beginning of the video, I don't know if you've explained the minimum specs required). I have a laptop with an RTX 3050 (4 GB VRAM), 8 GB RAM and a Ryzen 5600H.
    I had tried to build a local chatbot some time back, but it used to take a lot of time to produce results... I just want to know if it will work on mine (I can use some lower-end models as well, but yeah).

  • @fintech1378
    @fintech1378 4 months ago

    Does preparing text (tokenization, etc.) before feeding it to the LLM really help? Any data? Isn't this more for models that came before LLMs?

  • @dhualshammaa2062
    @dhualshammaa2062 1 month ago

    Do you have a video showing how to create a phone app using an LLM, from A to Z?

  • @takshashilaupsc3592
    @takshashilaupsc3592 6 days ago

    Can you please make a video on evaluating the answers and, most importantly, on deploying to a cost-effective server and running it over the internet?
    That would be the deployment of the locally created model. Pleaseeeeeee

  • @RandomGuyThatLivesSomewhere
    @RandomGuyThatLivesSomewhere 5 days ago

    1:45:40 Can't this be fixed by replacing "".join() with " ".join()? Why did he need to use a regex?
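
    For context on that question, a small illustration of the difference: joining with " " avoids words running together, but a regex is still handy for collapsing inconsistent whitespace already present in the extracted text. The example strings are made up, not taken from the video's PDF:

    ```python
    import re

    sentences = ["Nutrients are vital. ", "They include vitamins.", "And minerals."]

    joined_no_space = "".join(sentences)   # "...vital. They include vitamins.And minerals."
    joined_space = " ".join(sentences)     # fixes the missing space but can double up spaces

    cleaned = re.sub(r"\s+", " ", joined_space).strip()  # collapse repeated whitespace
    print(cleaned)
    ```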

  • @InzideEntertainment
    @InzideEntertainment 5 months ago

    You wanted to know why your VS Code got slow...
    - It is possibly because, when you install Git on your Windows machine, the "Scalar Git add-on to manage large-scale repositories" option is OFF by default. 😵
    Also,
    - the built-in file watcher (experimental) can cache commonly used files so it doesn't have to scan 100% of them for every query... ❤‍🔥
    I ❤‍🔥 optimization.

  • @artinwords8308
    @artinwords8308 4 months ago

    Hi Daniel, thanks a lot for your support! I'm a real beginner (art historian)... where do I put the code in??? THANKS from Vienna

  • @123arskas
    @123arskas 5 months ago +1

    Wow. Yummy content

    • @mrdbourke
      @mrdbourke  5 months ago +3

      Like a machine learning cooking show!

  • @flamboyanta4993
    @flamboyanta4993 5 months ago

    Will this be added to your fantastic Udemy course?

  • @rohithdon2621
    @rohithdon2621 3 months ago

    This doesn't use a vector DB like Chroma or FAISS.

  • @junaidjaved5109
    @junaidjaved5109 1 day ago

    Hi Daniel, I am quite new to AI. I am trying to sign up to Hugging Face but am getting a 418 error on signup. I found a Reddit thread related to that signup error too, but without any solutions posted. Is there any other way around it? I am trying to use gemma-2b-it.

  • @erdemaslan6067
    @erdemaslan6067 5 months ago

    Thanks man, and you are so funny

    • @mrdbourke
      @mrdbourke  5 months ago

      You’re welcome! Enjoy!

    • @erdemaslan6067
      @erdemaslan6067 4 months ago

      ​@@mrdbourke
      Thank you for your time. I've reached the end of this video.

  • @mikhailkalashnik0v
    @mikhailkalashnik0v 1 month ago

    Will a laptop with a 4090 GPU work, or does it strictly need desktop 4090 power?

  • @alimustafa2682
    @alimustafa2682 5 months ago

    After the final query (on my own PDFs), the RAG replies: "The context you provided doesn't contain an answer to your query..."
    Any idea?

  • @shingshing825
    @shingshing825 2 days ago

    Hi Daniel, in PyMuPDF I get a "failed to open file" error (code=7: no objects found) when I do fitz.open(pdf_path). I thought it was originally because of the for-loop; any idea what could be causing it? (Great content either way.)

    • @shingshing825
      @shingshing825 2 days ago

      So if you use requests to download it, the file somehow becomes corrupt
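
      The corruption described above usually happens when the response isn't written as raw bytes, or when an error page gets saved as the "PDF". A minimal sketch of a safer download; the URL and filename are placeholders:

      ```python
      import requests

      url = "https://example.com/some-textbook.pdf"  # placeholder URL
      pdf_path = "textbook.pdf"

      response = requests.get(url, timeout=30)
      response.raise_for_status()  # fail loudly instead of silently saving an HTML error page

      with open(pdf_path, "wb") as f:  # "wb": write raw bytes, not decoded text
          f.write(response.content)

      print(f"Saved {len(response.content)} bytes to {pdf_path}")
      ```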

  • @dipayanbhowal7025
    @dipayanbhowal7025 2 months ago

    Hey Daniel, I need some help setting up CUDA on a GTX card for this project. Any insights?

  • @thangammarriage3509
    @thangammarriage3509 2 months ago

    What are the prerequisites for learning RAG?

  • @deepakkapoor2430
    @deepakkapoor2430 4 months ago

    Can we use an AMD GPU on Windows to run this, or do we have to use CUDA only?