You saved my life. I had been struggling with large pdf files. New follower!
Thank you 🙏
Sophia, you are the BEST! THANK YOU, for so easy explaining!
Thanks so much 😊🙏
This is amazing. Really appreciate these super clear walkthroughs!
The most comprehensive and to the point videos on document QA found so far keep up the good work! gonna watch more from you.Thank you very much!
Thank you 🙏
Thank you for providing a tutorial that was both informative and concise.
Thank you 🙏😊
You are the hidden gem of my AI learning journey. It was definitely worth straying off my path to find you 💚 Greetings from Poland.
That's really helpful, I have been searching around on the different approaches! Everything was well explained.
Excellent. Thank you! Clear, concise, includes code.
AMAZING video thanks a lot! You have just clearly explained the concepts rather than sharing a monkey copy-pasting notebook as a lot of "content creators" are doing currently.
Thanks so much for the kind words 🙏😊
Hi Sophia,
Thanks for the video. I've been looking around on RUclips for langchain content and yours is the most understandable and watchable. Keep it up and look forward to more vids :)
My job is offering AI courses for all who are interested. I'm so glad; I definitely want to remain relevant in the workplace. Learning is one of my favorite hobbies, so let me challenge myself.
Thank you! This is a fantastic review of the various options in LangChain.
Thanks for the tutorial. I suggest using gpt-3.5-turbo as the LLM, which is 90% cheaper than the default davinci models.
That's good to know! Thanks 🙏😊
This video really cemented the explosion in my mind. 🤯🙏🧠!
I've been thinking about this for days, this is exactly what I was looking for! ❤️
Thanks so much for the kind comment!
Good Job Sophia. Very helpful indeed.
Thank you Sophia, you really explained everything in a clear way. How many hours I lost trying to find what you have explained! (sigh) 👏
glad you find it useful. thanks so much ☺️🙏
Very detailed and clear explanation.
Thank you!
Great! Thanks a lot!
I found this extremely helpful Sophia, thank you! :)
I said "that's beautiful" when I first saw the red at 2:16. Even though it was expected, I still got the humour. 😊
thank you very much, sister!
Awesome great explanation! Thank you Sophia!
Thank you Sophia. I really appreciate your homework on Langchain & then how you clarify the nuances so simply. Cheers!
Glad it was helpful!
Wow. Thanks. Greetings from Mexico
Thank you!!!😊
Awesome sauce! Thank you.
Great job Sophia!! You have earned +1 subscriber :D
Thank you 🙏
This is so cool. Thank you so much. This gave me a very good foundation and more to something I am trying to learn right now. Instant sub.
Thanks so much 🙏
Sophia, great, thank you!
Q: Which IDE do you use in this video? Do you recommend it?
Good Job!
Really nice work; looking forward to the next video. Do you recommend any learning path to get the right understanding?
Thanks so much! I'm hoping to get the next one out tomorrow 🤞 A learning path for LangChain? I just read the docs and try things out :) I made another video on a LangChain overview if you are interested: ruclips.net/video/kmbS6FDQh7c/видео.html
Thank you
Thanks for your video, very helpful!
Thanks for your valuable video. How would you do a boolean (yes/no) QA response? Any suggestions?
Thank you, Sophia. We really love your knowledge and videos. Just one request: can you make an end-to-end video on question answering over a CSV file in LangChain, please?
Great idea 💡 thanks!
Hi Sophia,
Great video! I had a few doubts. If I want to build a chatbot that answers user queries related to a knowledge base (a group of PDFs), let's say I use the method described above: using embeddings, doing the similarity search, and then outputting the result.
But I want the output of the LLM to be in a certain conversational tone and format. How do I train/fine-tune the LLM to output the results in the desired way? Also, would the similarity search still be required in this particular workflow?
Thanks in advance!
This is really great stuff! Is there any way to save Chroma locally so you don't have to re-embed the vectors every time? It's killing me and costing a ton of tokens!
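For what it's worth, Chroma can persist its index to disk so the embedding cost is paid only once. A minimal sketch, assuming the classic LangChain API; the `PERSIST_DIR` path and the `build_or_load_index` helper are illustrative:

```python
import os

PERSIST_DIR = "./chroma_db"  # illustrative location for the saved index

def build_or_load_index(docs, embeddings):
    """Embed once; on later runs, reload the saved index (no re-embedding)."""
    from langchain.vectorstores import Chroma  # requires langchain + chromadb
    if os.path.isdir(PERSIST_DIR):
        # Reload the existing index from disk -- no new embedding tokens spent.
        return Chroma(persist_directory=PERSIST_DIR, embedding_function=embeddings)
    # First run: embed the documents and write the index to disk.
    db = Chroma.from_documents(docs, embeddings, persist_directory=PERSIST_DIR)
    db.persist()  # flush the vectors to disk
    return db
```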
One thing that I don't think was covered in the video:
What does defining the chain_type do for the other methods aside from load_qa_chain? For example, in the RetrievalQA method, you can still pass in a chain type. What would be the difference in passing in "stuff" here vs. something like "map_reduce"?
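For context, chain_type selects the combine-documents strategy underneath: "stuff" pastes all retrieved chunks into one prompt, while "map_reduce" runs the LLM over each chunk separately and then combines the partial answers. A hedged sketch, assuming the classic LangChain API (`make_qa` is an illustrative helper):

```python
def make_qa(llm, retriever, chain_type="stuff"):
    """Build a RetrievalQA chain with a chosen combine strategy.

    "stuff": one LLM call with every retrieved chunk concatenated.
    "map_reduce": one LLM call per chunk, then a final combining call.
    """
    from langchain.chains import RetrievalQA  # requires langchain
    return RetrievalQA.from_chain_type(
        llm=llm, retriever=retriever, chain_type=chain_type)
```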
Given the 13 vector store options supported, which would you prefer, based on your experience?
Thank you!!!!!
thanks
Hi Sophia, thank you for your tutorial!
I was curious, what is the difference / pros and cons between:
1. Using a similarity search on the embedded query to first get documents relevant to the question -> then using load_qa_chain to answer the question
vs
2. Directly using RetrievalQA Chain.
Thank you!
I think it's the same thing. Different interfaces give you different levels of control
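A sketch of the two routes side by side, assuming the classic LangChain API; `answer_two_ways` is an illustrative helper, and the chain/retriever names follow the docs:

```python
def answer_two_ways(db, llm, query):
    """Manual retrieve-then-answer vs. the RetrievalQA one-liner."""
    from langchain.chains.question_answering import load_qa_chain
    from langchain.chains import RetrievalQA
    # Route 1: similarity search first, then a QA chain over the hits.
    docs = db.similarity_search(query, k=4)
    chain = load_qa_chain(llm, chain_type="stuff")
    manual = chain.run(input_documents=docs, question=query)
    # Route 2: RetrievalQA wires the same retrieve-then-stuff steps internally.
    qa = RetrievalQA.from_chain_type(
        llm=llm, chain_type="stuff",
        retriever=db.as_retriever(search_kwargs={"k": 4}))
    auto = qa.run(query)
    return manual, auto
```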
Hi Sophia, have you ever fed this example PDF docs about coding and then asked it to create simple code based on those books?
Hi Sophia! Would you use any of these ways for a QA where the answer for the question is exact and you want to predict based on the pattern of those QA? For example: for Q1 = A1, for Q2 = A2, what's A3 for Q3?
Not sure if I explained myself 😅
Is there a way to load your documents once and keep referring to them? Once I load the information, I want to make it static and not have to load it again.
so are the embeddings the mechanism for dividing the total tokens into different vectors?
Please correct me if I'm wrong: the PDF loader extracts and tokenizes all sentences in the PDF file and stores their embeddings in the database, so that the actual question to ChatGPT can be properly assembled to contain the most relevant context (by similarity distance metric) along with your question, while avoiding going over ChatGPT's allowed token limit?
But load_qa_chain can also take the similarity search results (based on the question) from the vector store of all text chunks as 'input_documents'. Isn't that similar to the RetrievalQA chain retrieving the relevant docs to feed to the LLM?
Hi, I am working on a document QA project. I have a doubt: how can we create chunks dynamically, or otherwise improve them, to get proper semantic search results?
Hello. Excellent material! I'm having an issue; maybe you can help me. I'm working with the ConversationalRetrievalChain option. The thing is that it sometimes responds in English to questions asked in Spanish. When I looked into the reason, I realized that LangChain by default includes its own prompt, which is in English. I tried changing this default prompt to change this behavior, but I didn't know how to do it. Could you please indicate how I can do it? Changing the default prompt is important, not only for language purposes, but also for prompt engineering and improving responses. Thank you very much!
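One way to swap the default English prompt: ConversationalRetrievalChain.from_llm accepts combine_docs_chain_kwargs, which can carry a custom prompt for the answer step. A hedged sketch, assuming the classic LangChain API; the Spanish template wording and `build_spanish_chain` helper are illustrative:

```python
def build_spanish_chain(llm, retriever):
    """Override the default English QA prompt, e.g. to force Spanish answers."""
    from langchain.prompts import PromptTemplate
    from langchain.chains import ConversationalRetrievalChain
    template = (
        "Usa el siguiente contexto para responder la pregunta en español.\n"
        "Contexto: {context}\n"
        "Pregunta: {question}\n"
        "Respuesta:"
    )
    prompt = PromptTemplate(template=template,
                            input_variables=["context", "question"])
    # combine_docs_chain_kwargs replaces the prompt used over the retrieved docs.
    return ConversationalRetrievalChain.from_llm(
        llm, retriever, combine_docs_chain_kwargs={"prompt": prompt})
```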
Hi Sophia,
I can't get your code to run because of a problem installing LangChain; it's about the callback manager. I'm using the Anaconda distribution and conda; pip failed completely. Can you get this code to run today?
I want to know the difference between map_reduce and map_rerank. map_rerank uses matching scores to find the most suitable answer, but how does map_reduce determine which batch has the best answer?
I have some JSON files which I want to use as a chatbot data source. How do I store the JSON information in Chroma DB using embeddings and then retrieve it based on the user query? I googled a lot but did not find any answers.
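One simple approach is to flatten the JSON into short "path: value" strings and embed those with Chroma's from_texts. A sketch under that assumption; `json_to_texts` and `index_json_file` are illustrative helpers, and the Chroma call follows the classic LangChain API:

```python
import json

def json_to_texts(obj, prefix=""):
    """Flatten arbitrary JSON into "path: value" strings ready for embedding."""
    texts = []
    if isinstance(obj, dict):
        for key, value in obj.items():
            texts.extend(json_to_texts(value, f"{prefix}{key}."))
    elif isinstance(obj, list):
        for i, value in enumerate(obj):
            texts.extend(json_to_texts(value, f"{prefix}{i}."))
    else:
        # Leaf value: emit its dotted path plus the value itself.
        texts.append(f"{prefix.rstrip('.')}: {obj}")
    return texts

def index_json_file(path, embeddings):
    """Embed the flattened records into Chroma (needs langchain + chromadb)."""
    from langchain.vectorstores import Chroma
    with open(path) as f:
        texts = json_to_texts(json.load(f))
    return Chroma.from_texts(texts, embeddings)
```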
Can I use a document of 3 MB in size? Does it throw a rate limit error?
Great video!!! I do have a question. I was trying to test the difference between search type "mmr" and "similarity": retriever = db.as_retriever(search_type="mmr", search_kwargs={"k": 2, "fetch_k": 5, "lambda_mult": 0.5}) vs. retriever = db.as_retriever(search_type="similarity", search_kwargs={"k": 2}). It seems to take 10x the time to run my chain when I use mmr vs. similarity (all chunks are the same size). Any idea why this would be the case?
mmr is iterating over the text chunks to see if they are similar to the already selected examples. So it has an extra step on top of the similarity search. That's why it's slower.
Great!! A question: How could one add a Prompt to ConversationalRetrievalChain?
Add your prompt to the "question" parameter
@@SophiaYangDS The problem is that the chat history contains the prompt for every question
@@felipeblin8616 you can summarize the previous history instead of using all the chat history
Which one do you recommend?
it depends on your use cases. Do you want the language model to see all the text? Do you want to pass in the chat history to the language model?
If the PDF file is in Chinese, does this approach still work?
I could not get this to work as shown. I had to add the following import statement:
from langchain.chat_models import ChatOpenAI
and then I had to change the #create a chain to answer questions section to:
qa = RetrievalQA.from_chain_type(
llm=ChatOpenAI(),...
This topic is also covered in the DeepLearning.AI course with Andrew Ng and Harrison Chase.
What happens if we work with different documents at different times? Does the index get updated, or does a new index replace the old one?
How do I prevent the LLM from answering any question if the answer is not in my PDF?
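A common prompt-side approach is to instruct the model to answer only from the retrieved context and refuse otherwise. A sketch assuming the classic LangChain API; the refusal wording is illustrative and not a hard guarantee:

```python
def strict_qa_prompt():
    """A prompt that asks the model to refuse when the context lacks the answer."""
    from langchain.prompts import PromptTemplate
    template = (
        "Answer ONLY from the context below. If the answer is not in the "
        "context, reply exactly: I don't know.\n"
        "Context: {context}\n"
        "Question: {question}\n"
        "Answer:"
    )
    # RetrievalQA accepts this via chain_type_kwargs={"prompt": ...}.
    return PromptTemplate(template=template,
                          input_variables=["context", "question"])
```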
I tried all 4 approaches with every chain type and I am getting the same vector length error; it's not working at all.
How do I load multiple PDF files and ask questions?
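One way is to loop a loader over a folder and concatenate the documents before splitting and embedding as usual. A sketch assuming the classic LangChain PyPDFLoader (which needs pypdf installed); `load_pdf_folder` is an illustrative helper:

```python
def load_pdf_folder(folder):
    """Load every PDF in a folder into one list of LangChain documents."""
    import glob
    import os
    from langchain.document_loaders import PyPDFLoader  # needs pypdf
    docs = []
    for path in sorted(glob.glob(os.path.join(folder, "*.pdf"))):
        # Each loader call yields one document per page.
        docs.extend(PyPDFLoader(path).load())
    return docs
```

The combined list can then go straight into a text splitter and a vector store exactly as with a single PDF.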
Hi Sophia, I have this issue: "Could not import chromadb python package. Please install it with `pip install chromadb`." And when trying to install it: "Failed to build hnswlib". Could you help me? Thanks.
Add a line to your Jupyter notebook:
!pip install chromadb
Hi, can I use JSON instead of PDFs?
Yes, you can read in JSON files using the json package, or use a JSON document loader from LangChain.
@@SophiaYangDS Thanks for reply, there is no specific JSON loader in the documentation. Are you referring to Airbyte JSON? Kindly help
I have been trying to interact with a PDF containing mostly tables but keep running into problems. It doesn't seem to read the tables properly or interpret the data like we humans do. Any ideas or tips?
Yeah great question! I ran into the same problem. I'm not sure actually. I wonder if there exists a tool to understand tables in a PDF...
How can I do the same thing without any API?
You can use llama.cpp with LangChain
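A sketch of that fully offline setup, assuming the llama-cpp-python bindings (`pip install llama-cpp-python`) and a locally downloaded model file; `local_llm` and its defaults are illustrative:

```python
def local_llm(model_path, n_ctx=2048):
    """A fully offline LLM via llama.cpp -- no API key or network needed."""
    from langchain.llms import LlamaCpp  # classic LangChain wrapper
    # model_path points at a local llama.cpp-compatible model file.
    return LlamaCpp(model_path=model_path, n_ctx=n_ctx, temperature=0)
```

The returned object drops into the same chains (RetrievalQA, load_qa_chain) in place of an OpenAI LLM.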
I have an API error.
But in my tests, TF-IDF worked better. Maybe I don't know how to tune it, since I'm not a data analyst.
Interesting! LangChain supports TF-IDF as well. It'd be interesting to do some experiments