4 ways to do question answering in LangChain | chat with long PDF docs | BEST method

  • Published: 2 Dec 2024

Comments • 101

  • @paqueteaguilera6
    @paqueteaguilera6 a year ago

    You saved my life. I had been struggling with large pdf files. New follower!

  • @konstantinkonovalov7292
    @konstantinkonovalov7292 a year ago

    Sophia, you are the BEST! THANK YOU, for so easy explaining!

  • @MichaelNinoEvensen
    @MichaelNinoEvensen a year ago +9

    This is amazing. Really appreciate these super clear walkthroughs!

  • @kwstasg
    @kwstasg a year ago

    The most comprehensive and to-the-point videos on document QA I've found so far. Keep up the good work! Gonna watch more from you. Thank you very much!

  • @GregHacob
    @GregHacob a year ago +3

    Thank you for providing a tutorial that was both informative and concise.

  • @MukaddasKhusniddinova
    @MukaddasKhusniddinova 9 months ago

    Thank you Sophia, you really explained everything in a clear way. How many hours (maybe days, already) lost trying to find what you have explained! (sigh)👏

  • @axelef2344
    @axelef2344 a year ago

    You are the hidden gem of my AI learning journey. It was definitely worth straying to find you 💚 Greetings from Poland.

  • @samsek123
    @samsek123 a year ago +1

    That's really helpful, I have been searching around on the different approaches! Everything is well explained.

  • @happyday.mjohnson
    @happyday.mjohnson a year ago

    Excellent. Thank you! Clear, concise, includes code.

  • @santicodaro
    @santicodaro a year ago +1

    AMAZING video thanks a lot! You have just clearly explained the concepts rather than sharing a monkey copy-pasting notebook as a lot of "content creators" are doing currently.

    • @SophiaYangDS
      @SophiaYangDS  a year ago

      Thanks so much for the kind words 🙏😊

  • @georgegomes5344
    @georgegomes5344 a year ago +2

    Hi Sophia,
    Thanks for the video. I've been looking around on YouTube for LangChain content and yours is the most understandable and watchable. Keep it up, and I look forward to more vids :)

  • @TheRoom2Breathe
    @TheRoom2Breathe 3 months ago

    The job is offering AI courses for all who are interested. I'm so glad; I definitely want to remain relevant in the workplace. Learning is one of my favorite hobbies, so let me challenge myself.

  • @AlonAvramson
    @AlonAvramson 11 months ago

    Thank you! This is a fantastic review of the various options in LangChain.

  • @openai_developer
    @openai_developer a year ago +6

    Thanks for the tutorial. I suggest using gpt-3.5-turbo as the LLM, which is 90% cheaper than the default davinci models.

    • @SophiaYangDS
      @SophiaYangDS  a year ago

      That's good to know! Thanks 🙏😊
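
      A minimal sketch of what this suggestion looks like with the legacy LangChain API
      (docs is assumed to hold the retrieved chunks and query the user question):

        from langchain.chat_models import ChatOpenAI
        from langchain.chains.question_answering import load_qa_chain

        # gpt-3.5-turbo is the cheaper chat model; temperature=0 keeps answers deterministic
        llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
        chain = load_qa_chain(llm, chain_type="stuff")
        answer = chain.run(input_documents=docs, question=query)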

  • @kevon217
    @kevon217 a year ago

    This video really cemented the explosion in my mind. 🤯🙏🧠!

  • @dyllanusher1379
    @dyllanusher1379 a year ago

    I've been thinking about this for days, this is exactly what I was looking for! ❤️

    • @SophiaYangDS
      @SophiaYangDS  a year ago

      Thanks so much for the kind comment!

  • @RBC2_
    @RBC2_ a year ago

    Good Job Sophia. Very helpful indeed.

  • @felipeblin8616
    @felipeblin8616 a year ago

    Thank you Sophia, you really explained everything in a clear way. How many hours lost trying to find what you have explained! (sigh)👏

    • @SophiaYangDS
      @SophiaYangDS  a year ago +1

      glad you find it useful. thanks so much ☺️🙏

  • @nedyalkokarabadzhakov5405
    @nedyalkokarabadzhakov5405 a year ago

    Very detailed and clear explanation.

  • @cagri1894
    @cagri1894 a year ago +1

    Great! Thank you a lot!

  • @andrewblake904
    @andrewblake904 a year ago

    I found this extremely helpful Sophia, thank you! :)

  • @anonymous1943
    @anonymous1943 a year ago

    I said "that's beautiful" when I first saw the red at 2:16. Even though it was expected, I still got the humour. 😊

  • @MukaddasKhusniddinova
    @MukaddasKhusniddinova 9 months ago

    thank you very much, sister!

  • @seanharvey4744
    @seanharvey4744 a year ago

    Awesome great explanation! Thank you Sophia!

  • @bwilliams060
    @bwilliams060 a year ago

    Thank you Sophia. I really appreciate your homework on Langchain & then how you clarify the nuances so simply. Cheers!

  • @gazul05
    @gazul05 a year ago +1

    Wow. Thanks. Greetings from Mexico

  • @aibeginnertutorials
    @aibeginnertutorials a year ago

    Awesome sauce! Thank you.

  • @akashkewar
    @akashkewar a year ago

    Great job Sophia!! You have earned +1 subscriber :D

  • @SachinthaAdikari
    @SachinthaAdikari a year ago

    This is so cool. Thank you so much. This gave me a very good foundation and more to something I am trying to learn right now. Instant sub.

  • @m.d.1470
    @m.d.1470 a year ago

    Sophia, great, thank you!
    Q: Which IDE do you use in this video? Do you recommend it?

  • @bingolio
    @bingolio a year ago

    Good Job!

  • @adilmajeed8439
    @adilmajeed8439 a year ago +1

    Really nice work, looking forward to the next video. Do you recommend any learning path to get the right understanding?

    • @SophiaYangDS
      @SophiaYangDS  a year ago

      Thanks so much! I'm hoping to get the next one out tomorrow 🤞 Learning path for LangChain? I just read the docs and try things out : ) I made another video on LangChain overview if you are interested: ruclips.net/video/kmbS6FDQh7c/видео.html

  • @usama57926
    @usama57926 9 months ago

    Thank you

  • @conanssam
    @conanssam a year ago

    Thanks for your video, very helpful!

  • @VenkatesanVenkat-fd4hg
    @VenkatesanVenkat-fd4hg a year ago

    Thanks for your valuable video. How do you do a boolean QA response? Any suggestions...

  • @Dattakhillare999
    @Dattakhillare999 a year ago

    Thank you, Sophia. We really love your knowledge and videos. Just one request: can you make a video on doing question answering in LangChain with a CSV file, end to end, please?

  • @anubhavghildiyal3559
    @anubhavghildiyal3559 a year ago +1

    Hi Sophia,
    Great video! I had a few doubts. If I want to build a chatbot that answers user queries related to a knowledge base (a group of PDFs), let's say I use the method described above: using embeddings, doing the similarity search, and then outputting the result.
    But I want the output of the LLM to be in a certain conversational tone and format. How do I train/fine-tune the LLM to output the results in the desired way? Also, would the similarity search still be required in this particular workflow?
    Thanks in advance!

  • @blackhat965
    @blackhat965 a year ago +2

    This is really great stuff! Is there any way to save Chroma locally so you don't have to re-embed the vectors every time? It's killing me and costing a ton of tokens!
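
    A hedged sketch of persisting Chroma to disk so the embeddings are only computed once
    (legacy Chroma API; the ./chroma_db path is just an example, and texts is assumed to be
    the list of split document chunks):

      from langchain.embeddings import OpenAIEmbeddings
      from langchain.vectorstores import Chroma

      embeddings = OpenAIEmbeddings()

      # first run: embed the chunks and write the index to disk
      db = Chroma.from_documents(texts, embeddings, persist_directory="./chroma_db")
      db.persist()

      # later runs: reload the stored vectors instead of re-embedding
      db = Chroma(persist_directory="./chroma_db", embedding_function=embeddings)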

  • @ruhtraeel
    @ruhtraeel a year ago

    One thing that I don't think was covered in the video,
    What does defining the chain_type do for the other methods aside from load_qa_chain? For example, in the RetrievalQA method, you can still pass in a chain type. What would be the difference in passing in "stuff" here vs something like "map reduce"?
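
    A short sketch of how the chain_type option changes behavior in RetrievalQA
    (assumes llm and db already exist; legacy LangChain API):

      from langchain.chains import RetrievalQA

      # "stuff": put all retrieved chunks into a single prompt
      qa_stuff = RetrievalQA.from_chain_type(
          llm=llm, chain_type="stuff", retriever=db.as_retriever()
      )

      # "map_reduce": answer against each chunk separately, then combine the partial answers
      qa_map_reduce = RetrievalQA.from_chain_type(
          llm=llm, chain_type="map_reduce", retriever=db.as_retriever()
      )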

  • @code4AI
    @code4AI a year ago

    Given the 13 vector store options supported, which would you prefer, based on your experience?

  • @YEO7K
    @YEO7K a year ago

    Thank you!!!!!

  • @imranaalam
    @imranaalam a year ago

    thanks

  • @WooyoungJoo
    @WooyoungJoo a year ago

    Hi Sophia, thank you for your tutorial!
    I was curious, what is the difference / pros and cons between:
    1. Using a similarity search to first get the documents relevant to the (embedded) query -> then using the load_qa_chain to answer the question
    vs
    2. Directly using RetrievalQA Chain.
    Thank you!

    • @SophiaYangDS
      @SophiaYangDS  a year ago +1

      I think it's the same thing. Different interfaces give you different levels of control
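
      A minimal sketch of the two equivalent workflows (assumes llm, db, and query already exist):

        from langchain.chains.question_answering import load_qa_chain
        from langchain.chains import RetrievalQA

        # 1. manual: run the similarity search yourself, then hand the docs to load_qa_chain
        docs = db.similarity_search(query, k=4)
        chain = load_qa_chain(llm, chain_type="stuff")
        answer_1 = chain.run(input_documents=docs, question=query)

        # 2. RetrievalQA: the chain calls the retriever for you
        qa = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=db.as_retriever())
        answer_2 = qa.run(query)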

  • @jonathanreyesvite8354
    @jonathanreyesvite8354 a year ago

    Hi Sophia, have you ever fed this example PDF docs about coding and then asked it to create simple code based on those books?

  • @anacarmina
    @anacarmina a year ago

    Hi Sophia! Would you use any of these ways for a QA where the answer for the question is exact and you want to predict based on the pattern of those QA? For example: for Q1 = A1, for Q2 = A2, what's A3 for Q3?
    Not sure if I explained myself 😅

  • @jcrsantiago
    @jcrsantiago a year ago +1

    Is there a way to load your documents once and keep referring to them? Once I load the information, I want to make it static and not have to load it again.

  • @mistercakes
    @mistercakes a year ago

    so are the embeddings the mechanism for dividing the total tokens into different vectors?

  • @8eck
    @8eck a year ago

    Please correct me if I'm wrong: that PDF loader extracts and tokenizes all sentences in the PDF file and stores embeddings in the database, so that the actual question to ChatGPT can be properly assembled, containing the most relevant context (by similarity distance metric) along with your question, and so that you avoid going over the allowed message token limit in ChatGPT?
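
    Roughly the pipeline this comment describes, sketched with the legacy LangChain API
    (the PDF path and the question are placeholders):

      from langchain.document_loaders import PyPDFLoader
      from langchain.text_splitter import CharacterTextSplitter
      from langchain.embeddings import OpenAIEmbeddings
      from langchain.vectorstores import Chroma

      documents = PyPDFLoader("example.pdf").load()            # extract the PDF text
      splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
      texts = splitter.split_documents(documents)              # split into chunks
      db = Chroma.from_documents(texts, OpenAIEmbeddings())    # embed and store the chunks

      # at question time only the most similar chunks are sent to the model,
      # which keeps the assembled prompt under the token limit
      docs = db.similarity_search("What does the document conclude?", k=4)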

  • @evyguo7577
    @evyguo7577 a year ago

    But load_qa_chain can also use the similarity search results (based on the question) from the vector store of all text chunks as the 'input_documents'. Isn't that similar to the RetrievalQA chain retrieving the relevant docs to feed to the LLM?

  • @ShivanshuGautam-h3o
    @ShivanshuGautam-h3o a year ago

    Hi, I am working on a project of QA with documents. I have a doubt: how can we make the chunks dynamically, or in a better way, to get proper semantic search results?

  • @chrchab
    @chrchab a year ago

    Hello. Excellent material! I'm having an issue, maybe you can help me. I'm working with the ConversationalRetrievalChain option. The thing is that sometimes it responds in English to questions asked in Spanish. When I looked into the reason, I realized that LangChain by default includes its own prompt, which is in English. I tried changing this default prompt to change this behavior, but I didn't know how to do it. Could you please indicate how I can do it? Changing the default prompt is important not only for language purposes, but also for prompt engineering and improving responses. Thank you very much!
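
    A hedged sketch of overriding the default English QA prompt through combine_docs_chain_kwargs
    (legacy ConversationalRetrievalChain API; the exact keyword may differ between LangChain
    versions, and llm and db are assumed to already exist):

      from langchain.prompts import PromptTemplate
      from langchain.chains import ConversationalRetrievalChain

      # custom prompt that forces Spanish answers; uses the chain's context/question variables
      qa_prompt = PromptTemplate(
          input_variables=["context", "question"],
          template=(
              "Responde siempre en español usando solo el siguiente contexto.\n"
              "{context}\n\nPregunta: {question}\nRespuesta:"
          ),
      )

      qa = ConversationalRetrievalChain.from_llm(
          llm,
          retriever=db.as_retriever(),
          combine_docs_chain_kwargs={"prompt": qa_prompt},
      )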

  • @haroldnelsonjr
    @haroldnelsonjr a year ago

    Hi Sophia,
    I can't get your code to run because of a problem installing langchain. It's about the CALLBACK MANAGER. I'm using the anaconda distribution and conda. Pip failed completely. Can you make this code run today?

  • @garfieldlavi
    @garfieldlavi a year ago

    I want to know the difference between map_reduce and map_rerank. Map_rerank uses matching scores to find the most suitable answer. But how does map_reduce determine which batch has the best answer?

  • @MohitKumar-gp6nr
    @MohitKumar-gp6nr a year ago

    I have some JSON files which I want to use as the chatbot's data source. How do I store the JSON information in Chroma DB using embeddings and then retrieve it based on the user query? I googled a lot but did not find any answers.

  • @stutikumari4161
    @stutikumari4161 a year ago

    Can I use a document size of 3 MB? Does it throw a rate limit error?

  • @8dmarcus
    @8dmarcus a year ago

    Great video!!! I do have a question. I was trying to test the difference between search type "mmr" and "similarity" (retriever = db.as_retriever(search_type="mmr", search_kwargs={"k": 2, "fetch_k": 5, "lambda_mult": 0.5}) vs. retriever = db.as_retriever(search_type="similarity", search_kwargs={"k": 2})). It seems to take 10x the time to run my chain when I use mmr vs. similarity (all chunks are the same size). Any idea why this would be the case?

    • @SophiaYangDS
      @SophiaYangDS  a year ago

      mmr iterates over the text chunks to see whether they are similar to the already selected examples, so it has an extra step on top of the similarity search. That's why it's slower.

  • @felipeblin8616
    @felipeblin8616 a year ago

    Great!! A question: How could one add a Prompt to ConversationalRetrievalChain?

    • @SophiaYangDS
      @SophiaYangDS  a year ago +1

      Add your prompt to the "question" parameter

    • @felipeblin8616
      @felipeblin8616 a year ago

      @@SophiaYangDS The problem is that the chat history contains the prompt for every question

    • @SophiaYangDS
      @SophiaYangDS  a year ago +1

      @@felipeblin8616 you can summarize the previous history instead of using all the chat history

  • @shanx1243
    @shanx1243 a year ago

    Which one do you recommend?

    • @SophiaYangDS
      @SophiaYangDS  a year ago

      it depends on your use cases. Do you want the language model to see all the text? Do you want to pass in the chat history to the language model?

  • @Cawow123456
    @Cawow123456 a year ago

    If the PDF file is in the Chinese language, does this way still work?

  • @josephmdev
    @josephmdev a year ago

    I could not get this to work as shown. I had to add the following import statement:
    from langchain.chat_models import ChatOpenAI
    and then I had to change the #create a chain to answer questions section to:
    qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(),...
    This topic is also covered in the DeepLearning.AI course with Andrew Ng and Harrison Chase.
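
    One possible completed version of the fix described above (hedged; assumes a Chroma
    store named db, and uses the legacy RetrievalQA argument names):

      from langchain.chat_models import ChatOpenAI
      from langchain.chains import RetrievalQA

      # create a chain to answer questions, using the chat model instead of the default LLM
      qa = RetrievalQA.from_chain_type(
          llm=ChatOpenAI(),
          chain_type="stuff",
          retriever=db.as_retriever(),
          return_source_documents=True,
      )
      result = qa({"query": "What is this document about?"})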

  • @nrajesh602
    @nrajesh602 a year ago

    What happens if we work with different documents at different times? Does the index get updated, or does a new index replace the old one?

  • @ahmedkotb3089
    @ahmedkotb3089 a year ago

    How do I prevent the LLM from answering a question if the answer is not in my PDF?

  • @manikant1990
    @manikant1990 a year ago

    I tried all 4 approaches with all the chain types and I am getting the same vector length error; it's not working at all.

  • @fozantalat4509
    @fozantalat4509 a year ago

    How do you load multiple PDF files and ask questions?

  • @jorgeseifert
    @jorgeseifert a year ago

    Hi Sofia, I have this issue: "Could not import chromadb python package. Please install it with `pip install chromadb`." And when trying to install it I get "Failed to build hnswlib". Could you help me? Thanks

    • @theptrk
      @theptrk a year ago

      Add a line to your Jupyter notebook:
      !pip install chromadb

  • @prasanosara1944
    @prasanosara1944 a year ago

    Hi, can I use JSON instead of PDFs?

    • @SophiaYangDS
      @SophiaYangDS  a year ago

      Yes you can read in json files using the json package or use a json document loader from LangChain

    • @prasanosara1944
      @prasanosara1944 a year ago

      @@SophiaYangDS Thanks for the reply; there is no specific JSON loader in the documentation. Are you referring to Airbyte JSON? Kindly help.
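
      A minimal sketch of reading a JSON file into LangChain Documents with the standard
      json module (the file name and the "text" field are just examples):

        import json
        from langchain.docstore.document import Document

        with open("data.json") as f:
            records = json.load(f)

        docs = [
            Document(page_content=rec["text"], metadata={"source": "data.json"})
            for rec in records
        ]
        # docs can then be split, embedded, and stored in Chroma like the PDF chunks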

  • @hannespi2886
    @hannespi2886 a year ago

    I have been trying to interact with a PDF containing mostly tables but keep running into problems. It doesn't seem to read the tables properly or interpret their data like we humans do. Any ideas or tips?

    • @SophiaYangDS
      @SophiaYangDS  a year ago

      Yeah great question! I ran into the same problem. I'm not sure actually. I wonder if there exists a tool to understand tables in a PDF...

  • @bonjourcotonou1691
    @bonjourcotonou1691 a year ago

    How do you do the same thing without any API?

    • @SophiaYangDS
      @SophiaYangDS  a year ago

      You can use llama.cpp with LangChain
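
      A hedged sketch of running a local model through llama.cpp instead of the OpenAI API
      (requires the llama-cpp-python package; the model path is a placeholder, and db is an
      existing vector store):

        from langchain.llms import LlamaCpp
        from langchain.chains import RetrievalQA

        # point model_path at a locally downloaded GGML/GGUF model file
        llm = LlamaCpp(model_path="./models/ggml-model-q4_0.bin", n_ctx=2048)
        qa = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=db.as_retriever())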

  • @TestTest-e5d
    @TestTest-e5d 8 months ago

    I have an API error.

  • @konstantinkonovalov7292
    @konstantinkonovalov7292 a year ago

    But in my tests, TF-IDF worked better. Maybe I don't know how to cook it, since I'm not a data analyst.

    • @SophiaYangDS
      @SophiaYangDS  a year ago

      Interesting! LangChain supports TF-IDF as well. It'd be interesting to do some experiments
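
      A hedged sketch of LangChain's TF-IDF retriever for such comparison experiments
      (needs scikit-learn installed; texts is assumed to be the list of split chunks):

        from langchain.retrievers import TFIDFRetriever

        # build a TF-IDF index over the raw chunk strings instead of embeddings
        retriever = TFIDFRetriever.from_texts([t.page_content for t in texts])
        relevant_docs = retriever.get_relevant_documents("What is the main topic?")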
