LangChain + OpenAI tutorial: Building a Q&A system w/ own text data
- Published: 11 Dec 2024
- LangChain is a fantastic tool for developers looking to build AI systems on top of the variety of LLMs out there (large language models, like GPT-4, Alpaca, Llama, etc.), as it helps unify and standardize the developer experience across text embeddings, vector stores / databases (like Chroma), and chains them together for downstream applications through agents. In this tutorial we're using our own custom text / data and training a question-and-answer agent on it.
Want to learn more about LLMs (large language models)? Here's my learning path:
Watch PART 2 of the LangChain / LLM series:
• LangChain + OpenAI to ...
Watch PART 3 of the LangChain / LLM series
LangChain + HuggingFace's Inference API (no OpenAI credits required!)
• LangChain + HuggingFac...
Watch PART 4 of the LangChain / LLM series:
How Embeddings in LLMs work (a practical tutorial + code demo)
All the code for the LLM (large language models) series featuring GPT-3, ChatGPT, LangChain, LlamaIndex and more is on my GitHub repository, so go and ⭐ star or 🍴 fork it. Happy Coding!
github.com/onl...
Other links mentioned in the video:
LangChain documentation: python.langcha...
Visualizing embeddings: github.com/onl...
Learn about DuckDB: • DuckDB: Hi-performance...
Just wanted to stop at this first video in the playlist to thank you very much for sharing your knowledge.
And I want to say thank you for taking the time out to do this. This means a lot!
Everything on topic, clear and concise, with no filler. Good job
Thank you! I really appreciate it
I just found your channel and I've been watching your videos, which honestly, in my opinion, explain everything about LLMs best. I'd just like to ask if you could add a pop-up in the video whenever you mention that you made a video about something, so that we can click it and it takes us there. Thanks!
That’s very good feedback, I’ll incorporate that. Thank you for your kind words!
Thanks for the video! When you do `vector_store = Chroma.from_documents(texts, embeddings)`, you only use `embeddings = OpenAIEmbeddings()`. But in previous code when introducing embeddings, you also use `doc_embeddings = embeddings.embed_documents([text])`. We don't need to do that here?
Hey man, I'm getting an error on your:
vecstore = Chroma.from_documents(texts, embeddings)
qa = RetrievalQA.from_chain_type(
llm=OpenAI(),
chain_type = "stuff",
retriver = vecstore.as_retriever()
)
ValidationError: 2 validation errors for RetrievalQA
retriever
field required (type=value_error.missing)
retriver
extra fields not permitted (type=value_error.extra)
Can u help me out?
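The ValidationError actually points at the fix: `retriver` is a typo for `retriever`, which is why one field is reported missing and the other as extra. A corrected sketch (assuming `texts` and `embeddings` are defined as in the video):
from langchain.llms import OpenAI
from langchain.chains import RetrievalQA
from langchain.vectorstores import Chroma

vecstore = Chroma.from_documents(texts, embeddings)
qa = RetrievalQA.from_chain_type(
    llm=OpenAI(),
    chain_type="stuff",
    retriever=vecstore.as_retriever()  # note: retriever, not retriver
)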
Really like how you explained and demoed the code. Great video; I will be checking out your channel quite a bit. Thank you.
Wow thank you Marc!
What a perfect no B.S. intro. Thanks.
Thank you!
This video is great! I can't wait to check out all your other videos as well. Please keep posting great contents!
Thank you! Means a lot!
This video should have more views
Wouldn’t mind having more views!
Hi, could this type of chatbot be integrated with Messenger and your own website for customer-support-style interactions, based on your own Q&A files and text?
Great video! With the advent of plugins and agents coming up, is there anything LangChain can do which they cannot?
Yeah a few main scenarios where you’d use LangChain:
1. Don’t want to use OpenAI. Plug in other LLMs directly into LangChain as the API is quite unified
2. Chaining different providers. You might use Pinecone for the vector store but something else for the Q&A chain
3. The “preprocessing”: chunking up a large corpus into “chunks” that fit the context window size, or using a directory loader on a directory of PDFs and handling text extraction upstream, etc. (see the sketch after this list)
4. Want to work on your local environment / “offline mode”
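A rough sketch of that preprocessing step, using LangChain's loader and splitter (the directory name, glob pattern and chunk size here are just placeholders):
from langchain.document_loaders import DirectoryLoader
from langchain.text_splitter import CharacterTextSplitter

# load every text file in the folder, then split into chunks that fit the context window
docs = DirectoryLoader("news/", glob="**/*.txt").load()
texts = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0).split_documents(docs)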
If possible, can you please explain how you brought in that chromadb folder?
Great Sam, clean explanation and easy to understand. I want to know more about the data preparation for the embeddings that you put in the news directory
Thank you Gede! There is almost no preparation at the moment; our team builds a simple news monitor targeting news relating to the mining and commodities industry, so I take a sample of that and build the Q&A downstream. Vector stores like Chroma and indexing tools like LlamaIndex handle these unstructured text files very robustly.
I can write up an article on how we / our clients use it later on! Thanks for the feedback!
Would be great to see a LangChain tutorial that reads access-controlled Google Docs and uses an open source LLM
Hey Moresh, thanks for the suggestion!
I do have a video on my channel on using open LLMs through HuggingFace and I’m also in the process of recording another video on using locally-hosted LLM (runs on your machine). I’ll see when I have time to finish it and do the editing! :)
LangChain + HuggingFace's Inference API (no OpenAI credits required!)
ruclips.net/video/dD_xNmePdd0/видео.html
@@SamuelChan are you planning to use GPT4All for the locally hosted version? Great tutorials btw, keep up the good work!
@@moreshk GPT4All is something I'm keeping a close eye on!
Thank you for the kind words Moresh!
Thanks for the video, but I have a question: does your code have limits on the use of tokens? I used my code but got this error:
This model's maximum context length is 4097 tokens, however you requested 4225 tokens (3975 in your prompt; 250 for the completion). Please reduce your prompt or completion length.
hey Helios, are you also using the sample data I provided in the github repo? Are you using your own data?
ChromaDB is not compatible with Python 3.11. Is there going to be a workaround?
great video thanks Samuel, easy to follow.
Thank you, glad it was helpful!
The AI did not answer the "Why?" at the end of your third question (17:20). Perhaps because the AI does not know why the ban was implemented?
Interesting. In practice I would maybe use Guidance for the prompt design. This adds more structure to the output and makes the assistant more deterministic. The "Why" should still be answered, even if it has to formulate a speculative response (also getting the assistant to separate the "knowns" from the "unknowns" in its response)
ruclips.net/video/k4Ejc3bLQiU/видео.html
What is the difference with using the first method of creating the Q and A chain, and using the SQLChain method?
RetrievalQA pseudocode:
from langchain.llms import OpenAI
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA

# embed the documents and index them in Chroma
docsearch = Chroma.from_documents(texts, OpenAIEmbeddings())
qa = RetrievalQA.from_chain_type(
    llm=OpenAI(),
    chain_type="stuff",  # "stuff" packs all retrieved chunks into one prompt
    retriever=docsearch.as_retriever()
)
SQLDatabaseChain pseudocode:
from langchain import OpenAI, SQLDatabase, SQLDatabaseChain

# point the chain at any SQLAlchemy-compatible database
db = SQLDatabase.from_uri("sqlite:///something.db")
db_chain = SQLDatabaseChain(llm=OpenAI(), database=db, verbose=True)
The latter kicks off a database agent process and operates on the database. I have a longer video on using OpenAI with your custom CSV or database here:
ruclips.net/video/Fz0WJWzfNPI/видео.html
only a few minutes in but super helpful, thank you!
Thank you Ben! Means a lot!
Hi, thank you for your tutorials. I have been following them for quite some time now and have watched your whole playlist on this. However, I am unable to figure out the most economical approach for my use case.
I want to create a Q&A chatbot on Streamlit which answers questions from only my custom single document of about 500 pages. The document is final and won't change. From my understanding so far, I should choose either LangChain or LlamaIndex. But I will have to use the OpenAI API to get the best answers, and that API is quite costly for me. So far I have thought of using Chroma for embedding and somehow storing the vectors as pkl or json on Streamlit itself for re-use, so I don't have to spend again on vectors/indexing. I don't have enough credits to test different methods myself.
Kindly guide me. Thank you.
Hey Atul, if the document is final and won't change, the first cost-saving opportunity is on embeddings: you can use Sentence Transformer, which is free and available through the transformers API / HuggingFace inference. You will then save the embeddings into Chroma, which again is free and open source (unlike pinecone, which has a pricing plan). At this point all of this is still free.
Then for Q&A, you call OpenAI's API but if this is in production setting, I may be tempted to implement some sort of caching mechanisms for very common queries. For example if you provide a few "canned questions" to choose from, those can be served through a cache. Eg. "Break down the R&D spending for me from this financial report" is a question that could be fully deterministic in its answer and can be cached. There's plenty of other opportunities for optimizations but these would significantly reduce your bills.
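A minimal sketch of that free-embeddings setup, assuming LangChain's HuggingFaceEmbeddings wrapper and Chroma's persist directory (the model and directory names are illustrative; `texts` is your list of chunked documents):
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma

# free, local embeddings via sentence-transformers; no OpenAI calls here
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

# embed the 500-page document once and persist to disk, so you never pay to re-index
vector_store = Chroma.from_documents(texts, embeddings, persist_directory="db")
vector_store.persist()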
Great video. Learned a lot.
Using embedded DuckDB without persistence: data will be transient
Output exceeds the size limit. Open the full output data in a text editor
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
I get this error whenever I try to convert to vectors, on this command:
vecstore = Chroma.from_documents(texts, embeddings)
I even tried with FAISS and got a RateLimitError.
Please help!
Looking at the exact error message "Output exceeds the size limit " it looks like it's something specific to VSCode and has nothing to do with langchain or chroma. If you run it on a terminal, does the same message appear? Are you printing a bunch of things (using print statements anywhere in your code)?
@@SamuelChan the last error I got was: Using embedded DuckDB without persistence: data will be transient
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
@@SamuelChan Yes, even if I run it on the terminal it shows the exact same thing
Great tutorial! Would love to learn more on this topic
Thanks! Yeah definitely; I have a few ideas in mind, just need to plan them out
Hello, Samuel! Great video, my friend!
Is it possible to specify the OpenAI LLM version? I conducted some tests, and it always uses daVinci. Is it possible to use a more cost-effective model?
Thank you.
Thank you! Yeah, of course; specifying the LLM model is one of the most basic features of LangChain. Use the model_name parameter:
from langchain.llms import OpenAI
from langchain.embeddings.openai import OpenAIEmbeddings

# the embeddings model
model_name = 'text-embedding-ada-002'
embed = OpenAIEmbeddings(
    model=model_name,
    openai_api_key=OPENAI_API_KEY
)

# the completion model; swap in a cheaper one (e.g. text-curie-001) to cut costs
llm = OpenAI(model_name="text-davinci-002")
Change model_name to whichever you need it to be! :)
Great video! Is there a way to create a chatbot that smartly uses our data + gpt-3.5 data and give us a COMBINED answer from both the data set instead of just our data set or gpt data set?
So let's say your document had details about China and Indonesia around energy and export details, and there's no information for a country like India. If my question remains the same, except the country names now change to "India and Indonesia"... it should still be able to answer by looking up our data set for details related to Indonesia, and then looking up the gpt-3.5 data for details related to India.
But the gpt-3.5 data would be stale data from when it was last trained, right? So it would give you information from 2021, and hallucinate with high confidence as if it were actual data.
In my use case I would prefer it to use a proper source of information (the "vector store") and not conflate it with anything; LLMs in my workflows are just text generation mechanisms, not sources of information or factual data
Hey, first of all, nice video!
I was having an error: I am unable to import a proper dotenv version.
Can anyone help?
What does that mean? Check that you have installed dotenv (pip install python-dotenv) and the import statement should work :)
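For reference, the package name differs from the import name, which trips people up; a minimal check:
# pip install python-dotenv   (the package is python-dotenv, the import is dotenv)
from dotenv import load_dotenv

load_dotenv()  # reads variables such as OPENAI_API_KEY from a local .env file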
This is awesome, yes 100% interested to dig deeper into this topic!
A question: the rate limiter still applies, correct? How do we deal with that? Can we set up multiple API keys and make a script to rotate the keys?
Yes, it does apply. One caveat about a rotation pool is that OpenAI enforces rate limiting at the org level, not the account level. If you set up a ton of API keys under the same org, they all count toward the same balance 😊
Each model has a different limit; some models let you send 200x more tokens per minute than a DaVinci model. You can also fill out a rate increase form to get even higher rate limit ceilings!
platform.openai.com/docs/guides/rate-limits/overview
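Beyond key management, a generic way to cope with rate limits is retrying with exponential backoff. A sketch (not from the video; plain Python, no extra libraries):
import time

def with_backoff(fn, max_retries=5):
    # retry the call, doubling the wait each time the API signals a rate limit
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:  # e.g. openai.error.RateLimitError
            if attempt == max_retries - 1:
                raise
            time.sleep(2 ** attempt)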
@@SamuelChan wow i really appreciate the insights! you rock Sir!
Nice video! Your style is expert!
I would REALLY like to learn about the SQLChain methods of translating natural language to SQL to query a database then receive natural language back. If you have the bandwidth, that would be a hit!
Great suggestion! Say no more. Here's using natural language to query and chat with your database! (freshly released 2 mins ago)
ruclips.net/video/Fz0WJWzfNPI/видео.html
@@SamuelChan I noticed you used chain type "stuff" in your example @15:52... what if you have hundreds/thousands of documents? Should I use "map reduce" instead?
But if I use "map reduce", wouldn't it cost a lot of credits (money), since OpenAI charges per API call?
How do I reduce the cost of my query if I have a lot of documents?
thanks!
@@RunForPeace-hk1cu hey, I replied to you on Discord. But I would encourage you to check out the other videos in this playlist, as they systematically cover these concepts:
ruclips.net/p/PLXsFtK46HZxUQERRbOmuGoqbMD-KWLkOS
Where map reduce is mentioned along with caching:
ruclips.net/video/Uk_SJSnQRU8/видео.html
The video where Pinecone is mentioned (for indexing hundreds of pages in the cloud) is also helpful:
ruclips.net/video/k8G1EDZgF1E/видео.html
Good luck!
I am getting the error ImportError: cannot import name 'Document' from 'langchain.schema' (/usr/local/lib/python3.10/dist-packages/langchain/schema.py)
even though I installed from the requirements.txt file. Can you help?
Hey, can you check again? I pushed an update this morning bumping LangChain and LlamaIndex to the latest versions, so make sure you upgrade both libs; check requirements.txt if unsure! :)
Hi Samuel, great content!
One question I have is: is it possible to make OpenAI take the data from a document, let's say a 20-page PDF, and ask it to write an article based on that data in a special style? Something to note: there is no data about the special style in my document, but it should be in the model's general knowledge base. I want the AI to use custom knowledge on top of its existing knowledge, not only custom knowledge.
Yeah sure, that’s quite a common use case too.
You probably can't fit all 20 pages of the PDF in one go, so something like LangChain or LlamaIndex helps you "chunk" them up for indexing, and then you use an LLM to generate an article or a summary based off that index (what you call custom knowledge).
The LangChain series on this channel goes through these use cases as well! You can use existing PDFs, existing CSVs / spreadsheets, existing websites, existing documents or books to build your index and have your LLM synthesize based on these custom documents. Very useful for internal knowledge bases, for example.
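A rough sketch of that flow for a 20-page PDF (the file name, prompt and parameters are placeholders; PyPDFLoader is just one of several loaders that would work):
from langchain.llms import OpenAI
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import CharacterTextSplitter

# load and chunk the PDF so each piece fits in the context window
pages = PyPDFLoader("report.pdf").load()
texts = CharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(pages)

# index the chunks; the LLM supplies the "special style" from its general knowledge
qa = RetrievalQA.from_chain_type(
    llm=OpenAI(temperature=0.7),
    chain_type="stuff",
    retriever=Chroma.from_documents(texts, OpenAIEmbeddings()).as_retriever(),
)
print(qa.run("Write a short article in a lighthearted style based on this document."))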
@@SamuelChan Thank you so much for the response! Is it ideal to set the temperature of the OpenAI model to more than 0.5, so the special style comes from OpenAI's general knowledge base but the main content comes from the document? Also, do you know of a way to get OpenAI to send articles of more than 1000 words?
Yeah, on temperature: closer to 1.0 leads to "wilder", more adventurous behavior; closer to 0 leads to safer, more conservative behavior / writing styles. A temperature of 0 would make it deterministic.
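As a concrete illustration of those temperature settings (values are arbitrary):
from langchain.llms import OpenAI

conservative_llm = OpenAI(temperature=0)   # deterministic: same prompt, same output
adventurous_llm = OpenAI(temperature=0.9)  # wilder, more varied phrasing and style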
I’ve used langchain on some 15 page PDFs, and intend to record a tutorial on that as well. I didn’t count it to know if it has >1000 words but my guess is it might come close. Currently have 5 videos in the LLM series already though so this may have to wait a bit before going live on the channel! 😄
@@SamuelChan Sorry, I couldn't ask my question properly. I meant: since there's a token limit for conversations, do you know of a way to get the ChatGPT API to give you an article of over 1000 words, using any technique?
@@waheedahmed6602 hey no worries at all!
At minute 19:20 of this video on my channel you'll see me use the max_length argument to change the length of text that GPT generates for a given prompt. This might be what you're looking for! The docs describe it as "The maximum length of the sequence to be generated".
ruclips.net/video/dD_xNmePdd0/видео.html
There should be an upper limit on the max length you can use with GPT, but that probably depends on the model you choose, among other things (~500 words, or 4,000 characters)
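For reference, LangChain's OpenAI wrapper exposes this knob as max_tokens (a sketch; the prompt plus completion must still fit within the model's context window):
from langchain.llms import OpenAI

# cap the completion at roughly 1,000+ words' worth of tokens
llm = OpenAI(max_tokens=1500)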
Has anyone tried to deploy this on a cloud service? I tried to deploy my app on GAE, but having chromadb as one of the dependencies in requirements.txt fails the deploy. Can anyone help me with this? (The error code displayed is 13 after I deploy.)
Hey Akash! Chroma is dependent on sentence-transformers, which is dependent on pytorch. If you're using python 3.11 in your prod, this might be the problem (github.com/chroma-core/chroma/issues/249). The easy workaround is to go one version lower, to python 3.10, or use the Dockerfile (github.com/chroma-core/chroma/blob/main/docker-compose.yml).
If that isn't the issue, then I'm afraid we need more details on the error you're getting to be able to help :)
@@SamuelChan Thanks for the reply! I am using Python 3.9. I'll try adding the torch dependency
@@SamuelChan I added all the dependencies. I have also given the required permissions to my service account. Still, the deployment fails; but if I remove chromadb and its dependencies, it gets deployed.
This is amazing!! Thank you for the tutorial~ I am wondering: after embedding our own data, does it change the original OpenAI LLM in some way? (Like, if you ask it a question that isn't related to our data set, is the answer going to be the same as if we didn't do the embedding? And what if the data we provided doesn't match the answer it originally would have given?) I'm sorry if my question is stupid; I just started with AI. Anyway, thank you for the tutorial~~
Hey that is not a stupid question! :)
It doesn't change the LLM in any way. But a query is executed against the index / vector store, not directly on your text itself. This might be a simple Vector Store, a tree structure, a KeywordTable structure, etc. (the full LLM playlist on my channel has 8+ videos, which go into these). So you do the embeddings to get a vector representation of your data for the Q&A to work against.
If the data doesn’t include the answer - the Q&A chain would say something like “I couldn’t answer the question with the provided information”.
In practice, if your data is static (let’s say you’re training a LLM to answer questions on Investor Relations report from 2010-2022) and doesn’t grow or evolve, you can train the embeddings ONCE. Then store it locally (watch the other videos on this channel to see the options: local json, Chroma and Pinecone). Then your Question Answer Retriever will query against this vector store instead of re-building the embeddings and re-indexing everything. This makes your query faster, cheaper and also more deterministic in theory since it’s the same vector store.
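A sketch of that embed-once pattern with Chroma's persist directory (the directory name is illustrative):
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma

# later runs: load the persisted vector store instead of re-embedding everything
vector_store = Chroma(persist_directory="db", embedding_function=OpenAIEmbeddings())
retriever = vector_store.as_retriever()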
Train it with our data as new embeddings?
But where and how do I train Llama?
Is there an easy way to save the embeddings/chroma so that instead of going to OpenAI, we can just refer to the saved file?
Hey Liam, yes! You can save the embeddings as JSON, as a Python dict, or just on disk (as a JSON file, passing a file name)
Here’s my sample implementation: github.com/onlyphantom/llm-python/blob/main/2b_llama_chroma.py
Find the lines with the .save_to_dict() and .save_to_disk() calls!
I also have videos covering these usage patterns in the LLM playlist on this channel! Hope it helps!
Another example implementation:
github.com/onlyphantom/llm-python/blob/main/2_llama.py
This demonstrates saving embeddings to disk and loading from it!
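The shape of that classic LlamaIndex save/load API, as of the versions used in this series (method names have changed in newer releases):
from llama_index import GPTSimpleVectorIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("news").load_data()
index = GPTSimpleVectorIndex.from_documents(documents)

index.save_to_disk("index.json")  # embed and index once, store on disk
index = GPTSimpleVectorIndex.load_from_disk("index.json")  # later: load, no re-embedding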
@@SamuelChan Oh that's awesome! So we have to use LlamaIndex instead of LangChain??? Thank you so much for the responses!!!
I'd say LangChain and LlamaIndex have a pretty large overlap in how both libraries evolved. LangChain has a wider feature set with how it implements agents and sequential "chains", while LlamaIndex originally was a project to help transform unstructured data into structured data that can be thrown into GPT.
Both LangChain and LlamaIndex support lots of embedding DB / index implementations, and that includes Chroma! :)
@@SamuelChan Oh I see! So I can save the indexes created by LlamaIndex to avoid using tokens again and again, load the index from the file, and then use LangChain to query that index in a smarter way with its wider feature set?
This video has a clear explanation. Thanks for that, Sam. I have a question: instead of using OpenAI, could you please do a demo with an open source LLM? Like question answering from a doc using an open source LLM. It would be really helpful.
Hey! Thank you! I do have a video that demonstrates using langchain with an open source LLM:
LangChain + HuggingFace's Inference API (no OpenAI credits required!)
ruclips.net/video/dD_xNmePdd0/видео.html
You won't even need OpenAI credits to go through all the examples there. I also have another video that I'm publishing on Wednesday featuring LangChain with a locally-hosted LLM (running on your own machine)!
Make a video on connecting GPT with a database and getting a response
Right here! Just published a week ago! :)
LangChain + OpenAI to chat w/ (query) own Database / CSV!
ruclips.net/video/Fz0WJWzfNPI/видео.html
I watch this video at 0.75x speed
I gotta learn to speak slower 😬