You saved my life. I had been struggling with large pdf files. New follower!
Thank you 🙏
Sophia, you are the BEST! THANK YOU, for so easy explaining!
Thanks so much 😊🙏
This is amazing. Really appreciate these super clear walkthroughs!
The most comprehensive and to the point videos on document QA found so far keep up the good work! gonna watch more from you.Thank you very much!
Thank you 🙏
Thank you for providing a tutorial that was both informative and concise.
Thank you 🙏😊
You are the hidden gem of my AI learning journey. It was definitely worth straying off my path to find you 💚 Greetings from Poland.
That's really helpful, I have been searching around on the different approaches! Everything was well explained.
Excellent. Thank you! Clear, concise, includes code.
AMAZING video thanks a lot! You have just clearly explained the concepts rather than sharing a monkey copy-pasting notebook as a lot of "content creators" are doing currently.
Thanks so much for the kind words 🙏😊
Hi Sophia,
Thanks for the video. I've been looking around on RUclips for langchain content and yours is the most understandable and watchable. Keep it up and look forward to more vids :)
My job is offering AI courses for all who are interested. I'm so glad; I definitely want to remain relevant in the workplace. Learning is one of my favorite hobbies, so let me challenge myself.
Thank you! This is a fantastic review of the various options in LangChain.
Thanks for the tutorial. I suggest using gpt-3.5-turbo as the LLM, which is 90% cheaper than the default davinci models.
That's good to know! Thanks 🙏😊
This video really cemented the explosion in my mind. 🤯🙏🧠!
I've been thinking about this for days, this is exactly what I was looking for! ❤️
Thanks so much for the kind comment!
Good Job Sophia. Very helpful indeed.
Thank you Sophia, you really explained everything in a clear way. How many hours I lost trying to find what you have explained! (sigh) 👏
glad you find it useful. thanks so much ☺️🙏
Very detailed and clear explanation.
Thank you!
Great! Thanks a lot!
I found this extremely helpful Sophia, thank you! :)
I said "that's beautiful" when I first saw the red at 2:16. Even though it was expected, I still got the humour. 😊
thank you very much, sister!
Awesome great explanation! Thank you Sophia!
Thank you Sophia. I really appreciate your homework on Langchain & then how you clarify the nuances so simply. Cheers!
Glad it was helpful!
Wow. Thanks. Greetings from Mexico
Thank you!!!😊
Awesome sauce! Thank you.
Great job Sophia!! You have earned +1 subscriber :D
Thank you 🙏
This is so cool. Thank you so much. This gave me a very good foundation and more to something I am trying to learn right now. Instant sub.
Thanks so much 🙏
Sophia, great, thank you!
Q: Which IDE do you use in this video? Do you recommend it?
Good Job!
Really nice work; looking forward to the next video. Do you recommend any learning path to get the right understanding?
Thanks so much! I'm hoping to get the next one out tomorrow 🤞 A learning path for LangChain? I just read the docs and try things out :) I made another video on a LangChain overview if you are interested: ruclips.net/video/kmbS6FDQh7c/видео.html
Thank you
Thanks for your video, very helpful!
Thanks for your valuable video. How would you do a boolean (yes/no) QA response? Any suggestions?
Thank you, Sophia. We really love your knowledge and videos. Just one request: can you make an end-to-end video on question answering over a CSV file in LangChain, please?
Great idea 💡 thanks!
Hi Sophia,
Great video! I had a few doubts. If I want to build a chatbot that answers user queries related to a knowledge base (a group of PDFs), let's say I use the method described above: using embeddings, doing the similarity search, and then outputting the result.
But I want the output of the LLM to be in a certain conversational tone and format. How do I train/fine-tune the LLM to output the results in the desired way? Also, would the similarity search still be required in this particular workflow?
Thanks in advance!
This is really great stuff! Is there any way to save Chroma locally so you don't have to re-embed the vectors every time? It's killing me and costing a ton of tokens!
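For what it's worth, Chroma can persist its index to disk so the embedding cost is paid only once. A minimal sketch, assuming the classic LangChain API; the `PERSIST_DIR` path and the `build_or_load_index` helper are illustrative:

```python
import os

PERSIST_DIR = "./chroma_db"  # illustrative location for the saved index

def build_or_load_index(docs, embeddings):
    """Embed once; on later runs, reload the saved index (no re-embedding)."""
    from langchain.vectorstores import Chroma  # requires langchain + chromadb
    if os.path.isdir(PERSIST_DIR):
        # Reload the existing index from disk -- no new embedding tokens spent.
        return Chroma(persist_directory=PERSIST_DIR, embedding_function=embeddings)
    # First run: embed the documents and write the index to disk.
    db = Chroma.from_documents(docs, embeddings, persist_directory=PERSIST_DIR)
    db.persist()  # flush the vectors to disk
    return db
```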
One thing that I don't think was covered in the video:
What does defining the chain_type do for the other methods aside from load_qa_chain? For example, in the RetrievalQA method, you can still pass in a chain type. What would be the difference in passing in "stuff" here vs. something like "map_reduce"?
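For context, chain_type selects the combine-documents strategy underneath: "stuff" pastes all retrieved chunks into one prompt, while "map_reduce" runs the LLM over each chunk separately and then combines the partial answers. A hedged sketch, assuming the classic LangChain API (`make_qa` is an illustrative helper):

```python
def make_qa(llm, retriever, chain_type="stuff"):
    """Build a RetrievalQA chain with a chosen combine strategy.

    "stuff": one LLM call with every retrieved chunk concatenated.
    "map_reduce": one LLM call per chunk, then a final combining call.
    """
    from langchain.chains import RetrievalQA  # requires langchain
    return RetrievalQA.from_chain_type(
        llm=llm, retriever=retriever, chain_type=chain_type)
```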
Given the 13 vector store options supported, which would you prefer, based on your experience?
Thank you!!!!!
thanks
Hi Sophia, thank you for your tutorial!
I was curious, what is the difference / pros and cons between:
1. Using a similarity search on the embedded query to first get documents relevant to the question -> then using load_qa_chain to answer the question
vs
2. Directly using RetrievalQA Chain.
Thank you!
I think it's the same thing. Different interfaces give you different levels of control
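A sketch of the two routes side by side, assuming the classic LangChain API; `answer_two_ways` is an illustrative helper, and the chain/retriever names follow the docs:

```python
def answer_two_ways(db, llm, query):
    """Manual retrieve-then-answer vs. the RetrievalQA one-liner."""
    from langchain.chains.question_answering import load_qa_chain
    from langchain.chains import RetrievalQA
    # Route 1: similarity search first, then a QA chain over the hits.
    docs = db.similarity_search(query, k=4)
    chain = load_qa_chain(llm, chain_type="stuff")
    manual = chain.run(input_documents=docs, question=query)
    # Route 2: RetrievalQA wires the same retrieve-then-stuff steps internally.
    qa = RetrievalQA.from_chain_type(
        llm=llm, chain_type="stuff",
        retriever=db.as_retriever(search_kwargs={"k": 4}))
    auto = qa.run(query)
    return manual, auto
```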
Hi Sophia, have you ever fed this example PDF docs about coding and then asked it to create simple code based on those books?
Hi Sophia! Would you use any of these ways for a QA where the answer for the question is exact and you want to predict based on the pattern of those QA? For example: for Q1 = A1, for Q2 = A2, what's A3 for Q3?
Not sure if I explained myself 😅
Is there a way to load your documents once and keep referring to them? Once I load the information, I want to make it static and not have to load it again.
so are the embeddings the mechanism for dividing the total tokens into different vectors?
Please correct me if I'm wrong: the PDF loader extracts and tokenizes all sentences in the PDF file and stores their embeddings in the database, so that the actual question to ChatGPT can be properly assembled to contain the most relevant context (by similarity distance metric) along with your question, while avoiding going over ChatGPT's allowed token limit?
But load_qa_chain can also take the similarity search results (based on the question) from the vector store of all text chunks as 'input_documents'. Isn't that similar to the RetrievalQA chain retrieving the relevant docs to feed to the LLM?
Hi, I am working on a document QA project. I have a doubt: how can we create chunks dynamically, or otherwise improve them, to get proper semantic search results?
Hello. Excellent material! I'm having an issue; maybe you can help me. I'm working with the ConversationalRetrievalChain option. The thing is that it sometimes responds in English to questions asked in Spanish. When I looked into the reason, I realized that LangChain by default includes its own prompt, which is in English. I tried changing this default prompt to change this behavior, but I didn't know how to do it. Could you please indicate how I can do it? Changing the default prompt is important, not only for language purposes, but also for prompt engineering and improving responses. Thank you very much!
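One way to swap the default English prompt: ConversationalRetrievalChain.from_llm accepts combine_docs_chain_kwargs, which can carry a custom prompt for the answer step. A hedged sketch, assuming the classic LangChain API; the Spanish template wording and `build_spanish_chain` helper are illustrative:

```python
def build_spanish_chain(llm, retriever):
    """Override the default English QA prompt, e.g. to force Spanish answers."""
    from langchain.prompts import PromptTemplate
    from langchain.chains import ConversationalRetrievalChain
    template = (
        "Usa el siguiente contexto para responder la pregunta en español.\n"
        "Contexto: {context}\n"
        "Pregunta: {question}\n"
        "Respuesta:"
    )
    prompt = PromptTemplate(template=template,
                            input_variables=["context", "question"])
    # combine_docs_chain_kwargs replaces the prompt used over the retrieved docs.
    return ConversationalRetrievalChain.from_llm(
        llm, retriever, combine_docs_chain_kwargs={"prompt": prompt})
```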
Hi Sophia,
I can't get your code to run because of a problem installing LangChain; it's about the callback manager. I'm using the Anaconda distribution and conda; pip failed completely. Can you get this code to run today?
I want to know the difference between map_reduce and map_rerank. map_rerank uses matching scores to find the most suitable answer, but how does map_reduce determine which batch has the best answer?
I have some JSON files which I want to use as a chatbot data source. How do I store the JSON information in Chroma DB using embeddings and then retrieve it based on the user query? I googled a lot but did not find any answers.
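One simple approach is to flatten the JSON into short "path: value" strings and embed those with Chroma's from_texts. A sketch under that assumption; `json_to_texts` and `index_json_file` are illustrative helpers, and the Chroma call follows the classic LangChain API:

```python
import json

def json_to_texts(obj, prefix=""):
    """Flatten arbitrary JSON into "path: value" strings ready for embedding."""
    texts = []
    if isinstance(obj, dict):
        for key, value in obj.items():
            texts.extend(json_to_texts(value, f"{prefix}{key}."))
    elif isinstance(obj, list):
        for i, value in enumerate(obj):
            texts.extend(json_to_texts(value, f"{prefix}{i}."))
    else:
        # Leaf value: emit its dotted path plus the value itself.
        texts.append(f"{prefix.rstrip('.')}: {obj}")
    return texts

def index_json_file(path, embeddings):
    """Embed the flattened records into Chroma (needs langchain + chromadb)."""
    from langchain.vectorstores import Chroma
    with open(path) as f:
        texts = json_to_texts(json.load(f))
    return Chroma.from_texts(texts, embeddings)
```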
Can I use a document of 3 MB in size? Does it throw a rate limit error?
Great video!!! I do have a question. I was trying to test the difference between search type "mmr" and "similarity": retriever = db.as_retriever(search_type="mmr", search_kwargs={"k": 2, "fetch_k": 5, "lambda_mult": 0.5}) vs. retriever = db.as_retriever(search_type="similarity", search_kwargs={"k": 2}). It seems to take 10x the time to run my chain when I use mmr vs. similarity (all chunks are the same size). Any idea why this would be the case?
mmr is iterating over the text chunks to see if they are similar to the already selected examples. So it has an extra step on top of the similarity search. That's why it's slower.
Great!! A question: How could one add a Prompt to ConversationalRetrievalChain?
Add your prompt to the "question" parameter
@@SophiaYangDS The problem is that the chat history contains the prompt for every question
@@felipeblin8616 you can summarize the previous history instead of using all the chat history
Which one do you recommend?
it depends on your use cases. Do you want the language model to see all the text? Do you want to pass in the chat history to the language model?
If the PDF file is in Chinese, does this approach still work?
I could not get this to work as shown. I had to add the following import statement:
from langchain.chat_models import ChatOpenAI
and then I had to change the #create a chain to answer questions section to:
qa = RetrievalQA.from_chain_type(
llm=ChatOpenAI(),...
This topic is also covered in the DeepLearning.AI course with Andrew Ng and Harrison Chase.
What happens if we work with different documents at different times? Does the index get updated, or does a new index replace the old one?
How do I prevent the LLM from answering any question if the answer is not in my PDF?
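A common prompt-side approach is to instruct the model to answer only from the retrieved context and refuse otherwise. A sketch assuming the classic LangChain API; the refusal wording is illustrative and not a hard guarantee:

```python
def strict_qa_prompt():
    """A prompt that asks the model to refuse when the context lacks the answer."""
    from langchain.prompts import PromptTemplate
    template = (
        "Answer ONLY from the context below. If the answer is not in the "
        "context, reply exactly: I don't know.\n"
        "Context: {context}\n"
        "Question: {question}\n"
        "Answer:"
    )
    # RetrievalQA accepts this via chain_type_kwargs={"prompt": ...}.
    return PromptTemplate(template=template,
                          input_variables=["context", "question"])
```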
I tried all 4 approaches with every chain type and I am getting the same vector length error; it's not working at all.
How do I load multiple PDF files and ask questions?
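One way is to loop a loader over a folder and concatenate the documents before splitting and embedding as usual. A sketch assuming the classic LangChain PyPDFLoader (which needs pypdf installed); `load_pdf_folder` is an illustrative helper:

```python
def load_pdf_folder(folder):
    """Load every PDF in a folder into one list of LangChain documents."""
    import glob
    import os
    from langchain.document_loaders import PyPDFLoader  # needs pypdf
    docs = []
    for path in sorted(glob.glob(os.path.join(folder, "*.pdf"))):
        # Each loader call yields one document per page.
        docs.extend(PyPDFLoader(path).load())
    return docs
```

The combined list can then go straight into a text splitter and a vector store exactly as with a single PDF.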
Hi Sophia, I have this issue: "Could not import chromadb python package. Please install it with `pip install chromadb`." And when trying to install it: "Failed to build hnswlib". Could you help me? Thanks.
Add a line to your Jupyter notebook:
!pip install chromadb
Hi, can I use JSON instead of PDFs?
Yes, you can read in JSON files using the json package, or use a JSON document loader from LangChain.
@@SophiaYangDS Thanks for reply, there is no specific JSON loader in the documentation. Are you referring to Airbyte JSON? Kindly help
I have been trying to interact with a PDF containing mostly tables but keep running into problems. It doesn't seem to read the tables properly or interpret the data like we humans do. Any ideas or tips?
Yeah great question! I ran into the same problem. I'm not sure actually. I wonder if there exists a tool to understand tables in a PDF...
How can I do the same thing without any API?
You can use llama.cpp with LangChain
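A sketch of that fully offline setup, assuming the llama-cpp-python bindings (`pip install llama-cpp-python`) and a locally downloaded model file; `local_llm` and its defaults are illustrative:

```python
def local_llm(model_path, n_ctx=2048):
    """A fully offline LLM via llama.cpp -- no API key or network needed."""
    from langchain.llms import LlamaCpp  # classic LangChain wrapper
    # model_path points at a local llama.cpp-compatible model file.
    return LlamaCpp(model_path=model_path, n_ctx=n_ctx, temperature=0)
```

The returned object drops into the same chains (RetrievalQA, load_qa_chain) in place of an OpenAI LLM.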
I have an API error.
But in my tests, TF-IDF worked better. Maybe I don't know how to tune it, since I'm not a data analyst.
Interesting! LangChain supports TF-IDF as well. It'd be interesting to do some experiments