Building a RAG application using open-source models (Asking questions from a PDF using Llama2)

  • Published: Jan 25, 2025

Comments • 150

  • @GaurangDave
    @GaurangDave 9 months ago +13

    Oh please don't stop creating these videos, this is really helpful. Very detailed and well explained! Thank you so much for this!

  • @berkbatuhangurhan708
    @berkbatuhangurhan708 10 months ago +15

    Came from X. This is an amazing and very detailed walkthrough. Thanks for explaining even the tiniest bits of everything. Highly recommend this.

  • @TooyAshy-100
    @TooyAshy-100 10 months ago +8

    Santiago, your videos on LLMs have been incredibly helpful! Thank you so much for sharing your expertise.
    I'm eager to see more of your content in the future.

  • @Meetlimbani27
    @Meetlimbani27 3 months ago +2

    Seriously, this is the best tutorial I have seen; it explains everything by building it from scratch. I had been searching for a good from-scratch tutorial for around 15 days. My search ends here.

  • @anonymoustechnopath1138
    @anonymoustechnopath1138 10 months ago +12

    Thanks a lot Santiago!!
    Really needed these videos for LLMs.
    Keep them coming!

  • @sarash5061
    @sarash5061 9 months ago +1

    This was just amazing, you are a star. Thanks for all the effort.

  • @vadud3
    @vadud3 3 months ago +1

    I went through tons of YouTube videos and no one breaks down the process like this video does. You will actually learn how Python tools are used, step by step. This is the best video from my research. Thank you for making it available!!

  • @surygarcia6823
    @surygarcia6823 5 months ago +3

    This is easily the best RAG video out there

  • @yasirgamieldien
    @yasirgamieldien 9 months ago +1

    This is an amazing video. Literally answered all the questions I had on building a RAG and it was really useful to see the comparison between GPT, Llama, and Mixtral

  • @QuentinFennessy
    @QuentinFennessy 10 months ago +3

    This is an excellent walkthrough: easy to follow and very practical.

  • @swatantrasohni5235
    @swatantrasohni5235 10 months ago +4

    Thanks, Santiago, for the wonderful video. Running an LLM locally is very handy for a variety of tasks. Eventually everyone will have their own LLM running locally on their device. That's the future.

  • @liuyan8066
    @liuyan8066 10 months ago +1

    I like these fundamental courses, especially the last RAG one. I followed other trainings to build AI products, some with over 10 hours of teaching. After I finished, I still didn't fully understand why I coded things that way. Now these courses make the connection step by step. Thank you.

  • @lokeshsharma4177
    @lokeshsharma4177 8 months ago +3

    This is the BEST video ever made comparing all the LLMs performing the same task. God bless you.

  • @MarkoKhomytsya
    @MarkoKhomytsya 10 months ago +4

    Thank you for the video!
    I found it particularly intriguing to consider the possibility of obtaining more accurate responses from the PDF using the Llama2 model. Given that local Language Models (LMs) tend to be highly sensitive to how queries are formatted, I believe it's crucial to refine your example further. Here are a couple of suggestions:
    1) Instead of relying on a basic parser, it would be beneficial to prepare a set of predefined questions and answers. For instance, a question like "How much does the course cost?" could have a straightforward answer like "$400."
    2) It's also important to determine the optimal format for prompts, specifically tailored for models like Mistral.
    By addressing these points, you could develop a truly functional product that delivers accurate responses. As it stands, most examples seem to demonstrate that local models struggle with practical applications and aren't quite ready for real-world deployment.
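
    The first suggestion can be sketched as a tiny regression check. This is only an illustration, not something from the video: `stub_chain` is a hypothetical stand-in for the real chain, the second question/answer pair is made up, and the substring match is a deliberately crude scoring rule.

```python
def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(text.lower().split())


def accuracy(answer_fn, qa_pairs) -> float:
    """Fraction of questions whose expected answer appears in the model's reply."""
    hits = sum(
        1
        for question, expected in qa_pairs
        if normalize(expected) in normalize(answer_fn(question))
    )
    return hits / len(qa_pairs)


# Reference set: the first pair is the example from the comment,
# the second pair is invented purely for illustration.
QA_PAIRS = [
    ("How much does the course cost?", "$400"),
    ("Is there a refund policy?", "yes"),
]


def stub_chain(question: str) -> str:
    """Hypothetical stand-in for the real chain (e.g. a function wrapping chain.invoke)."""
    canned = {"How much does the course cost?": "The course costs $400."}
    return canned.get(question, "I don't know.")


score = accuracy(stub_chain, QA_PAIRS)
print(f"accuracy: {score:.2f}")
```

    Running the same reference set against each model or prompt format gives one comparable number per configuration. A substring match can produce false positives for short answers, so a stricter comparison may be needed in practice.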

    • @underfitted
      @underfitted  10 months ago

      Great suggestions!

    • @mehmetbakideniz
      @mehmetbakideniz 10 months ago +1

      Hi. Prompt engineering would definitely solve the problem of verbose answers, but do you think it would also correct hallucinations as seen in the video?

    • @MarkoKhomytsya
      @MarkoKhomytsya 10 months ago

      Good question, @mehmetbakideniz! I would like to know the answer too!

  • @geethikaisurusampath
    @geethikaisurusampath 9 months ago +1

    This is really helpful, especially the explanations behind why to do it. Keep up the good work. Respect to you, man.

  • @junaidali1853
    @junaidali1853 10 months ago +3

    Lovely. Super useful video. I’ll be building a RAG system with a Vector Database and langchain for my freelance client for around $2,000 or more. Thanks Santiago for helping make my life better.

    • @mune8937
      @mune8937 3 months ago

      Wow! As a student, I work in this job. Where did you get this job?

    • @junaidali1853
      @junaidali1853 3 months ago

      @mune8937 that's on Upwork

  • @ankandas3413
    @ankandas3413 3 months ago

    Teachers like you make the world better.

  • @sumitrana8114
    @sumitrana8114 10 months ago +1

    Thank you for leaving your job and starting your channel.

  • @sushanths.l4865
    @sushanths.l4865 10 months ago +5

    This is a great video, Santiago. I really learned a lot.

  • @asifm3520
    @asifm3520 8 months ago

    That was a really clear explanation. Even novices will have no trouble following along.

  • @vasuchewprecha
    @vasuchewprecha 10 months ago

    You are by far the best teacher on YouTube regarding ML/AI. Please consider launching a course on generative AI.

  • @bhusanchettri8594
    @bhusanchettri8594 10 months ago +2

    Great piece of work. Well explained!

  • @SuhasKM-tl1rg
    @SuhasKM-tl1rg 10 months ago +1

    I love your content. More of this in my feed please!

  • @samcavalera9489
    @samcavalera9489 10 months ago +1

    Thanks Santiago! I am a student of your ML School course and I have taken it in two different cohorts. Your ML School course is definitely the best of its kind on the market. Can you please design a new course on RAG that covers everything about this awesome technology, including evaluation techniques and deployment? That would be wonderful, and I cannot wait to enroll in your RAG (and any other AI) course!

    • @underfitted
      @underfitted  10 months ago +4

      Working on it!

    • @samcavalera9489
      @samcavalera9489 10 months ago

      @underfitted Many thanks 🙏 🙏 🙏

  • @nikkypuvvada2666
    @nikkypuvvada2666 9 months ago

    Thanks

  • @HelloIamLauraa
    @HelloIamLauraa 4 months ago

    OMG, exactly what I need. Can't wait to watch.

  • @ThamBui-ll7qc
    @ThamBui-ll7qc 9 months ago +1

    Great video, I would love to see how to properly structure the prompt and make the bot remember context as conversation goes on...

  • @fintech1378
    @fintech1378 10 months ago +2

    Is searching via embeddings always better than 'traditional' search, aka a very long context window? When should we use one or the other? And what if we want to build a multimodal video recommendation system?

  • @mehmetbakideniz
    @mehmetbakideniz 10 months ago

    Thanks!

    • @underfitted
      @underfitted  10 months ago

      Thank you so much! Really appreciate you!

  • @RaviShamihoke
    @RaviShamihoke 6 months ago +2

    I just created the OpenAI API key, but I am getting a rate limit error.

  • @koko9712
    @koko9712 10 months ago +1

    Nice video Santiago ! Keep up the good work

  • @Orenji902
    @Orenji902 7 months ago

    Incredible video, really like the longer content codealong.

  • @noa2427
    @noa2427 9 months ago +3

    I am running into a vector store problem: an import error for docarray, which I have installed. I tried many approaches and many versions of docarray and DocArrayInMemorySearch. Any help is appreciated, thanks.

  • @toddroloff93
    @toddroloff93 7 months ago

    Nicely done. I always learn something from your videos. Looking forward to more content. Thanks for doing them.

    • @underfitted
      @underfitted  7 months ago

      Thanks for coming back!

  • @mehmetbakideniz
    @mehmetbakideniz 10 months ago +1

    This was super helpful. I noticed that on my M2 Pro laptop some cells took 16 seconds, while they took just 0.5 seconds on your computer. Then you said you are using an M3 GPU. How can I make sure that I am using the GPU instead of the CPU when executing this code? Or does LangChain already utilize the GPU when needed?

  • @square007tube
    @square007tube 8 months ago

    Many thanks for this video. I walked through it and was able to install Ollama3 on my machine, but I have an NVIDIA MX250 GPU, which takes a long time to answer the questions: it takes 7 minutes to answer two questions. I will watch your LLM playlist.

  • @adinathdesai6880
    @adinathdesai6880 8 months ago

    Amazing Video. You added great value to our knowledge. Thank you so much.

  • @seanb9949
    @seanb9949 10 months ago

    Another great video Santiago! I really look forward to seeing more of these. Heck, I'll watch the ads to make sure you get some $$$ 🙏

  • @asnair
    @asnair 5 months ago

    Excellent! What about the last step -- saving the vector embeddings in the pinecone database?

  • @ravindarmadishetty736
    @ravindarmadishetty736 7 months ago

    It's such a fantastic video, Santiago. 🎉

  • @mrskenz1068
    @mrskenz1068 10 months ago +1

    Thanks for the video. How can we handle scientific PDFs that contain a lot of mathematical and chemical formulas?

  • @SarthakVashisth-y6z
    @SarthakVashisth-y6z 1 month ago

    If I would like to use the newer Llama models like 3.1 or even 3.3, would I need to make some changes in the code or libraries to make it work?

  • @dannysuarez6265
    @dannysuarez6265 8 months ago

    What a great presentation! Thank you so much, sir!

  • @PoojaGori
    @PoojaGori 17 days ago

    Santiago, does the mlschool course also include the latest GenAI development ?

  • @GEORGE.M.M
    @GEORGE.M.M 6 months ago

    Hi there! I'm new to AI and RAG systems and have spent the past few days diving into your tutorial to understand each step and debug along the way. I have to say, THIS IS A GREAT TUTORIAL! I do have a question about alternative local LLM approaches. Regarding the accuracy of asking questions and retrieving relevant information from PDF documents like research papers and books using local models, would you recommend this approach over using tools like Ollama UI and LM Studio? Are there specific advantages or disadvantages to consider when choosing between these methods?

  • @surygarcia6823
    @surygarcia6823 5 months ago

    Is there any tutorial for making our own vector database for production?

  • @ravindarmadishetty736
    @ravindarmadishetty736 1 month ago

    I have a question: when we deal with a huge document repository, how can it be scaled? Chroma DB and Pinecone are limited, I guess. Please help me understand how to handle this.

  • @RameshBaburbabu
    @RameshBaburbabu 10 months ago

    Wow, great video. I was able to follow along with you and finish it to the end. "Batch" is great. Thanks, please post more videos 🙏🙏

  • @alextiger548
    @alextiger548 8 months ago

    Ma, thanks for what you are doing! Fantastic stuff.

  • @MD.IKRAMULHOQUE-c3o
    @MD.IKRAMULHOQUE-c3o 17 days ago

    Can anyone clarify how the locally hosted Ollama model is connected to the program? I can see we are using an OpenAI key, so I got confused.

  • @chanukyapekala
    @chanukyapekala 9 months ago

    excellent work! so clear and concise..

  • @avinashnair5064
    @avinashnair5064 5 months ago

    Hey, can you please create a video on how we can use this in a UI? I tried using Streamlit, but I am not able to get the same output that I am getting in the terminal.

  • @fatiga2426
    @fatiga2426 8 months ago

    Santiago, great video!
    One question: why do you use a parser to get the model's output as a string? Why not just get the content directly?
    Regards

  • @RomanovDK
    @RomanovDK 3 months ago

    Basic question: which editor is that? The Terminal app on my Mac does not seem to be it.

  • @tipiapagupo
    @tipiapagupo 8 months ago

    Amazing video! Are you still planning to make the video on how to communicate with websites? I'm really curious about the technologies you consider most relevant.

  • @GrantNaylor-b8l
    @GrantNaylor-b8l 3 months ago

    More great content. Clear, easy to follow. I have a question: if you only wanted a simple RAG to answer questions from small snippets of text (a few pages, not hundreds), would an in-memory vector store really be a bad thing?

  • @TempleTimes
    @TempleTimes 6 months ago

    At 17:25, when I try to run the code to invoke Llama 2, it gives me an error saying the module is not callable. I have installed Llama 2 on my laptop. Kindly help.

  • @farukondertr
    @farukondertr 9 months ago

    Dude, it's awesome! Do not stop, please.

  • @researchpaper7440
    @researchpaper7440 10 months ago +1

    I was looking for videos like these. Next, I am looking for a model to train on SQL data.

  • @gauravpratapsingh8840
    @gauravpratapsingh8840 7 months ago

    Hey, can you make a video on a website page Q&A chatbot using the LangChain framework, with some open LLMs or free public API keys?

  • @antonioskarvelas1325
    @antonioskarvelas1325 7 months ago

    I have a problem with the code. I run the code in VS Code and I get the error: ValueError: Ollama call failed with status code 403.
    Could you help me?

  • @TomasTrenor
    @TomasTrenor 8 months ago

    Amazing video, Santiago! Many thanks. I just tried it with Llama 3 8B and it seems not to be as accurate as Llama 2 (which obviously does not make sense). I need to dig deeper into it.

  • @chiragharish1020
    @chiragharish1020 7 months ago

    Great video 👏👏 Really helpful. Can you tell me which Mac model you have? I have an M2 Mac with 8 GB, and I doubt I will be able to run these powerful models.

  • @energyexecs
    @energyexecs 3 months ago

    I collect vintage engineering and science books from old bookstores that are about to be junked; hopefully they are not copyrighted. 😊 My goal is to convert these old books to digital form and then embed them in a vector database. Then take them through the transformer layers and semantic search to an LLM of some kind, so they can be searched by "natural language". I created an Independent Private Library Association on Facebook, so my thought is to use Meta Llama as my LLM. I probably need to ask the Meta Llama people about the costs of doing this. It's just a hobby of mine. I have lots of old books, and now libraries are trashing beautiful vintage books. I want to cover all modalities because these books have beautiful lithograph sketches and single-line drawings. I think you've seen such books.

  • @_kissimusic
    @_kissimusic 7 months ago

    Can I embed this, for example with Laravel, and then serve it on a host online so I can access it anywhere?

  • @alexstele5315
    @alexstele5315 10 months ago

    Thanks a bunch! 🎉 I've been looking for something like that.

  • @nguyenquocviet4287
    @nguyenquocviet4287 10 months ago

    Dear Santiago,
    I would like to ask you about the evaluation metric?
    Do you know any evaluation metric for evaluating between the generated answers and true answers? (eg. Rouge metric)
    Thank you so much!

  • @Jonathan-ru9zl
    @Jonathan-ru9zl 9 months ago

    Great! Can this model and setup serve as an assistant to, let's say, a board design engineer who has thousands of component specs in PDF files?
    To find and analyze the components faster?

  • @gilbertoparra5255
    @gilbertoparra5255 6 months ago

    Thank you for the content very helpful, I have one question, how do I know it is running locally? I mean for every model we used the langchain library as if it were consulted through the API.

    • @underfitted
      @underfitted  6 months ago

      They are running locally because I’m using Ollama to host the models in my computer. We use Langchain exactly as we would use it to access online models. That’s good. It means we can switch models without changing the code.
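
      A minimal sketch of that switch, assuming the `langchain-openai` and `langchain-community` packages (import paths vary between LangChain versions) and a local Ollama server with the model already pulled. `is_local` and `build_model` are hypothetical helper names, not code from the video. Only the hosted branch needs an OpenAI key.

```python
def is_local(model_name: str) -> bool:
    """Treat every model whose name does not start with 'gpt' as served by Ollama."""
    return not model_name.startswith("gpt")


def build_model(model_name: str):
    """Return a LangChain model object for the given name.

    Imports are done lazily so this file loads even when only one of the
    two integration packages is installed.
    """
    if is_local(model_name):
        # Local path: talks to the Ollama server running on this machine.
        from langchain_community.llms import Ollama

        return Ollama(model=model_name)

    # Hosted path: calls the OpenAI API and requires OPENAI_API_KEY to be set.
    from langchain_openai import ChatOpenAI

    return ChatOpenAI(model=model_name)


MODEL = "llama2"  # change this one string to switch models
print(f"{MODEL} runs locally: {is_local(MODEL)}")
```

      Keeping the model name in a single constant also avoids having to edit every occurrence of `llama2` when moving to a newer model.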

  • @TheMunishk
    @TheMunishk 10 months ago

    Congrats and well done for producing this useful content. Exactly what I was looking to kick start my langchain journey with the models. Let me practice this but I was also looking for how to integrate all this in the front end. Do you have a video of which tools to build a front end for the prompt that will interact with the backend LLMs?

  • @joeldartez829
    @joeldartez829 10 months ago

    Truly, you're the best. I've never met someone who explains things so well.
    I apologize if it is written somewhere and I missed it, but I wanted to ask if I buy your course today, can I have access to the past content today? I don't want to wait until the live sessions in April (or I want to arrive prepared for them). Thank you very much.

    • @underfitted
      @underfitted  10 months ago +1

      Yes, you get immediate access to everything from day 1.

  • @azharsham
    @azharsham 8 months ago

    Brilliant video! One quick question: are you passing the same string as both the question and the context here? If yes, does it always work in the case of a document reader?

  • @fredericv3497
    @fredericv3497 8 months ago

    Really good job and clear tutorial ! Thank you

  • @AntoineToussaint
    @AntoineToussaint 3 months ago

    Great videos by the way, started with the "Build from scratch" one. Quick question though: why do the embeddings for the RAG need to match the model type? My understanding is that we only need to be consistent between the question embedding and the store embedding for similarity search, but then the retriever pulls the actual documents and sends them to the model.
    Later in the video, you say you can use any retriever, like Google search or anything else, so it seems to me the embeddings can be anything as long as the question and document embeddings are the same.

    • @underfitted
      @underfitted  3 months ago

      They don’t. That was my mistake. You only need to make sure the embedding you do on the query matches the embedding you used when processing the data.
      But you can use any embedding model regardless of the embedding used by the LLM.
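
      A toy illustration of that point, using a made-up word-count embedding instead of a real embedding model (the `embed` function, `VOCAB`, and the sample documents are all assumptions for the sketch). The query and the documents go through the same `embed` function, and only the retrieved text, never the vectors, would be placed in the LLM prompt.

```python
import math
from collections import Counter

# Hypothetical vocabulary for the toy embedding below.
VOCAB = ["course", "cost", "price", "refund", "schedule", "live"]


def embed(text: str) -> list[float]:
    """Toy embedding: one dimension per vocabulary word, holding its count."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in VOCAB]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


documents = [
    "the course cost is listed on the site",
    "live sessions follow a weekly schedule",
]

# Index time: embed every document once.
index = [(doc, embed(doc)) for doc in documents]


def retrieve(question: str) -> str:
    """Query time: embed the question with the SAME function, return the closest text."""
    query_vector = embed(question)
    best_doc, _ = max(index, key=lambda pair: cosine(query_vector, pair[1]))
    return best_doc


context = retrieve("what does the course cost")
print(context)  # plain text that would be inserted into the prompt of any LLM
```

      Swapping `embed` for a different embedding model only requires rebuilding the index with that same model; the LLM that receives `context` is unaffected.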

    • @AntoineToussaint
      @AntoineToussaint 3 months ago

      @underfitted Thank you for the prompt -- pun intended -- reply! I watched two videos and now I am running Ollama! You explain so well and bring a lot of passion to these. I am trying to build a simple model so I can take a query and extract some custom filtering to send to OpenSearch. Great stuff. Thank you for making these videos!

  • @UsmanTahir6
    @UsmanTahir6 4 months ago

    Very good tutorial, thanks. For some reason docarray is not working on my end, so I used FAISS instead (in case anyone is interested). Thanks, Santiago! 🙂

  • @jpagano569
    @jpagano569 7 months ago

    Hmm is there a way to run this in CoLab or Github Codespace? I suppose the point is to run locally, but I hate setting up dev environments (because I'm new to coding!)

  • @TexttoInvoice
    @TexttoInvoice 9 months ago

    This video is awesome, so so great!!! Thank you so much for such a quality video.
    Question: what's the best way to make the results more accurate to the document? Can using structured data such as spreadsheets and CSV files give you a more accurate answer, and does the model perhaps prefer interacting with them?
    Also, what if there were more instances of the data, say multiple different documents containing the same information that needs to be referenced?
    If anyone has found the best way to optimize getting correct answers from retrieval, please let me know! Thanks

  • @KumR
    @KumR 7 months ago

    Can you also please add a UI using Streamlit?

  • @HelloIamLauraa
    @HelloIamLauraa 3 months ago

    why don't we use chunks?

  • @Rituraj-s5b
    @Rituraj-s5b 10 months ago

    Very informative video, Santiago!

  • @ManuRaj-i4b
    @ManuRaj-i4b 8 months ago

    Sir, how can I do this project using Java or Spring Boot?

  • @mehershahzad-n5s
    @mehershahzad-n5s 4 months ago

    Your content is superb

  • @vivekatbitm
    @vivekatbitm 7 months ago

    Great video, just curious though how come both gpt and llama model generated the same joke? Isn't that weird??

  • @anmolacharya7872
    @anmolacharya7872 5 months ago

    Why won't this work for Llama 3.1? It says "llama2 not found". Please help.

    • @anmolacharya7872
      @anmolacharya7872 5 months ago +1

      Edit: made it work. I had to go to every instance of llama2 in the code and change it to the Ollama model I was using.

  • @marschrr
    @marschrr 10 months ago

    Came here from X. Great overview on how to implement an LLM+RAG locally. Any multimodal ones incoming?

  • @gonzaloplazag
    @gonzaloplazag 8 months ago

    Great video! incredibly helpful!!!

  • @loko5307
    @loko5307 5 months ago

    Great content! Helped me a lot. Thanks!

  • @Hizar_127
    @Hizar_127 8 months ago

    I want to deploy it on the cloud. Is that possible?

  • @derekottman9622
    @derekottman9622 9 months ago

    This video is supposed to have a link to ANOTHER "from scratch" video as a popup - but WHEN that link pops up, I think it's ACTUALLY a link within this video pointing in a circular loop BACK TO THIS VIDEO, instead of the "pointing elsewhere" link to the other video it's supposed to be. (This video has a link to itself, if I didn't get MY wires crossed.)

    • @underfitted
      @underfitted  9 months ago

      I don’t think that’s possible? Anyway, you’ll find the other video here: ruclips.net/video/BrsocJb-fAo/видео.htmlsi=BVJfS_0Iq9lwRX0B

  • @basantsingh6404
    @basantsingh6404 10 months ago

    If you are using an OpenAI key, it means you are paying to use the OpenAI model. How is it open source?

  • @theDrewDag
    @theDrewDag 10 months ago +5

    Is it actually true that you need models to be aligned with their respective embeddings? I don't think so 🤔 Embeddings are used only for the vector search and lookup functionality. At the end of the day, all the model sees is your textual prompt. You can use OpenAI embeddings with any open-source model and vice versa.

    • @underfitted
      @underfitted  10 months ago +7

      You are right. In this example I only use the embeddings for the search, so what I said is irrelevant here.

  • @ОлександрВасильєв-б5д
    @ОлександрВасильєв-б5д 8 months ago

    Can I use Llama 3 models with your tutorial?

  • @kergee
    @kergee 8 months ago

    The lesson was very good, thanks

  • @Omar-p9r3c
    @Omar-p9r3c 10 months ago

    Has anybody used OllamaEmbeddings and got it working?

  • @TheHikmaQuest
    @TheHikmaQuest 9 months ago

    Kindly create another video in which you use Pinecone and also provide a GUI, making it a complete standalone application.

    • @serhiua
      @serhiua 8 months ago

      Pinecone is already explained here: ruclips.net/video/BrsocJb-fAo/видео.html&ab_channel=Underfitted

  •  10 months ago

    Love your video ! Thanks !

  • @calixtera
    @calixtera 4 months ago

    Good content here, thank you!

  • @nevildev
    @nevildev 10 months ago

    Thx! Very straightforward

  • @peacefullmusic8374
    @peacefullmusic8374 8 months ago

    best tutorial for start

  • @sumittupe3925
    @sumittupe3925 9 months ago

    Thanks for the video.....
    Well Explained....!