Is it a fine-tuned model? Because if not, we will be charged a lot for using the OpenAI API. Please make a video on a fine-tuned LangChain OpenAI model like text-ada-001
I have a doubt, please help me with this. I am trying to create a chatbot: I provide the company's information and it refers to that information to provide answers. I was trying to achieve this by fine-tuning the OpenAI GPT model but wasn't getting the desired results. From what I've understood, the technique in this video will work for the above use case. Am I right?
I have a video about building a simple web app in 23 minutes using Streamlit which may help! If not, Vercel seems like another good option. Pynecone will soon be one too, once they add hosting.
Great video! So what if my question is outside the context of the PDF document? Will OpenAI answer it from its generic knowledge? Or will it simply say that it doesn't know the answer? Either way, can we configure it to respond the way we want?
Great tutorial! Though what would happen if you load the same document? Would the vector store store it again? The reason I ask is that I would like to build something similar but want to prevent the vector DB from being populated with duplicate vectors and wasting embeddings on them again. Thanks in advance
My one regret with this video is not making it clear how to query the docs again without having to reload the embeddings. Yep it's possible. github.com/hwchase17/langchain/blob/4654c58f7238e10b35544633bd780b73bbb75c75/langchain/vectorstores/pinecone.py#L250
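For anyone else landing here, a minimal sketch of that reconnect, assuming the langchain and pinecone-client versions from the video (the keys are placeholders; "langchain1" is the index name used in the video):

```python
import pinecone
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Pinecone

pinecone.init(api_key="YOUR_PINECONE_KEY", environment="YOUR_ENV")
embeddings = OpenAIEmbeddings(openai_api_key="YOUR_OPENAI_KEY")

# Reconnect to vectors already stored in the index; the corpus is not
# re-embedded, only each query string gets embedded per call
docsearch = Pinecone.from_existing_index("langchain1", embeddings)
docs = docsearch.similarity_search("What is great about having kids?", k=5)
```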
@Data Independent This makes more sense! Create an index based on the PDF file or files. Then when the PDF is uploaded again, just use the stored index. Though you would have to connect the index with the uploaded PDF(s), e.g. concatenate the PDF names or hash them somehow. I know how to get this done with a vector library, e.g. Chroma
Thank you for this series. I'm confused about one thing: when querying the DB, you passed the text, not its embedding. How does Pinecone know how to embed the text?
Is there any limit on the number/size of the documents that can be uploaded so that the model still performs efficiently? I am guessing that with larger sizes, the cosine similarity search might take more computation time
In LangChain is "similarity search" used as a synonym for "semantic search", or they are referring to different types of search? To my knowledge similarity search focuses on finding items that are similar based on their features or characteristics, while semantic search aims to understand the meaning and intent behind the query to provide contextually relevant results
If I already have some embedding vectors stored in Pinecone, I don't need to embed again. How can I modify the following code: ''docsearch = Pinecone.from_texts([t.page_content for t in texts], embeddings, index_name=index_name)'' and still use docsearch.similarity_search() in the next step?
These are incredible instructions. In my case, I have some documents in Vietnamese; will Pinecone support UTF-8? OpenAI + LangChain + Pinecone is very helpful in many fields, especially in customer service
Awesome video! What I don't understand is: shouldn't the query on line 14 be embedded? Shouldn't it be a vector, rather than a string, to query the vector DB?
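It is embedded, just not by you: the LangChain wrapper handles it. Roughly what happens inside similarity_search, as a sketch rather than the library's exact source (`embeddings` and `index` are the objects from the earlier setup):

```python
# The wrapper embeds the query string before hitting Pinecone
query_vector = embeddings.embed_query(query)  # string -> vector (1536 dims for ada-002)
matches = index.query(vector=query_vector, top_k=5, include_metadata=True)  # vector search
```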
Kind of new to LLMs and all of this, so I'm not sure if I'm asking the right question here. But my concern with doing this is hallucinations. Could asking a question that is outside of the scope of this data science book still result in hallucinated answers? I've been doing a lot of reading about retrieval augmented generation and how that mitigates hallucination, so is that what's going on here? Great video!!
Yes, it definitely could. There are anti-hallucination techniques you could try. Sometimes simply putting "don't make anything up" in the prompt works.
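A minimal sketch of baking that instruction into the chain, assuming the stuff chain used in this video (`llm` is whatever model object you set up earlier):

```python
from langchain.prompts import PromptTemplate
from langchain.chains.question_answering import load_qa_chain

template = """Answer the question using ONLY the context below.
If the answer is not in the context, say "I don't know". Do not make anything up.

Context: {context}
Question: {question}
Answer:"""

prompt = PromptTemplate(template=template, input_variables=["context", "question"])
chain = load_qa_chain(llm, chain_type="stuff", prompt=prompt)
```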
Ok that's good to know! Do you have any thoughts/insights about using RAG tools like Vectara to mitigate hallucinations? Is there any way to really eliminate hallucinations or is that just a pipe dream @@DataIndependent
In 1994 Richard E. Osgood created a conversational reading system called "ASK Michael" for Michael Porter's book "The Competitive Advantage of Nations". Please let me know when you can automate the conceptual indexing and question-based indexing of a book including the creation and categorization of relevant questions that a novice that doesn't know any keywords or relevant vocabulary can ask.
It's really a great video to get started with LangChain. I have a small confusion here: what if I want to send all the similar docs to the LLM model, not just k=5? Is there a way to deal with that?
Great video, one question: for a chatbot, what about the same or similar questions being sent to the bot from different clients? Does the cost double? How do you optimize it?
Great video series. After you have created embeddings and uploaded them to Pinecone, what if your source data changes? How would you update the embeddings without first deleting them and re-embedding the entire source again, which would become expensive? Is there a way to only create embeddings for the changed parts?
Since each embedding is tied to a document, you can check diffs on your docs and only update when a doc changes. Each embedding has metadata you'll need to keep track of. It'll be a hassle, but doable
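A sketch of that diff-checking idea with a hypothetical content-hash scheme (none of this is from the video; `chunks`, `embeddings`, `index`, and `known_ids` are assumed from your own setup):

```python
import hashlib

def chunk_id(chunk_text: str) -> str:
    # Same text -> same ID, so unchanged chunks can be skipped entirely
    return hashlib.sha256(chunk_text.encode("utf-8")).hexdigest()

for chunk in chunks:
    cid = chunk_id(chunk)
    if cid not in known_ids:  # only pay for embedding when content is new or changed
        vector = embeddings.embed_query(chunk)
        index.upsert([(cid, vector, {"text": chunk})])
        known_ids.add(cid)
```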
@@DataIndependent I’ve seen another python library called llama-index claim that they can do updates automatically so I’m figuring out how to do it in langchain. Seems you can integrate the two. Use llama-index to manage your documents and embeddings and pass that as a tool to a langchain agent. The composability here is very nice!!
Thanks for the great content! Do you know how to better control the cost of having such a retrieval-based chatbot? In my experience it is quite costly to run Q&A on just the simple PDF provided in the LangChain repo, using the default embeddings and LLM models from the LangChain example
Ok, so maybe I misunderstand this one. I used the full text of War and Peace, just to test. My query was "How many times does the word 'fire' appear in War and Peace?" and when it finishes running there is no output... is this not the right set up for that kind of question? Then, I set the query to 'What are the main philosophical ideas in War and Peace?' and also returned nothing. Didn't error out. I double checked and all my code is good.
Ah yes, this is a fun question. So LLMs won't be good at counting words like you're describing. That's a task they aren't well suited for yet. I would use regular regex or a .find() for that. The 2nd question is also hard: you need to review multiple pieces of text in the book to form a good opinion of the philosophical ideas. Just doing a similar-embeddings approach won't get you there. If you wanted to answer the philosophical question I would do a map reduce or refine with a large context window. However, War and Peace is huge, so that would cost a lot.
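For the word count, plain Python does it exactly and for free; a sketch:

```python
import re

with open("war_and_peace.txt", encoding="utf-8") as f:
    text = f.read()

# Exact counting is a job for regex, not an LLM
count = len(re.findall(r"\bfire\b", text, flags=re.IGNORECASE))
print(f"'fire' appears {count} times")
```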
Hi there, thanks for the great video! It worked beautifully for my first query, but I want to be able to access the embeddings from Pinecone without having to generate the texts variable every time, as my file is massive! Would you recommend saving all the split documents in a file, or is there a way to continuously query the same docs using just Pinecone? Sorry for the convoluted question
Thanks for this great video, but I have a question. Does this method support the use case of creating a summary of the book? What if I asked: "Give me an overview of all chapters, including a small summary of each"? From what I see, that is not a good use case for this architecture, because the query must cover the entire book rather than one specific place. Or am I mistaken? Thanks, Michael
I would do a bit of preprocessing of your data and make a summary of each chapter individually, then add them all back together. Asking it to do the whole thing at once may overwhelm it
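A sketch of that chapter-by-chapter approach with LangChain's map_reduce summarize chain; `llm` and `text_splitter` are from your existing setup, and splitting the book into `chapters` is assumed to happen on your side:

```python
from langchain.chains.summarize import load_summarize_chain

chain = load_summarize_chain(llm, chain_type="map_reduce")

summaries = []
for chapter in chapters:                              # one summary per chapter
    docs = text_splitter.create_documents([chapter])
    summaries.append(chain.run(docs))

overview = "\n\n".join(summaries)                     # stitch them back together
```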
Hey, great work. Question: if I want to summarize a very large document, could I split it into multiple documents and create embeddings to build the summary? What is the best practice?
So even Ryan Gosling's getting into this now.
It's a fun topic!
@@DataIndependent he was referring to the fact you look like Ryan Gosling.
@@SockerDad I think he understands that.
@@Author_SoftwareDesigner lol I couldn’t tell if he understood that when he said it’s a fun topic.
yesss
OMG, this is exactly the functionality I need as a long-form fiction writer: not just being able to look up continuity details in previous works in a series so that I don't contradict myself or reinvent wheels ^^ but also doing productive brainstorming/editing/feedback with the chatbot. I need to figure out how to make exactly this happen! Thank you for the video!
Nice! Glad it was helpful
Agreed. Do you have any simplified tutorials, like one explaining LangChain? I fed my novel into ChatGPT page by page; it worked... OK, but I kept running into roadblocks: memory cache limits and more.
@@areacode3816 Maybe your Pinecone index is reaching its limit? Or the 4,000-token GPT-3 limit? I would check these first. If it's Pinecone, the fix is easy: just buy more space. If it's due to GPT, try GPT-4, which doubles the tokens to 8k. If that doesn't work, I would figure out an intermediary step to introduce another summarizing algorithm before passing the text to GPT-3.
How would I use this to make a smart chatbot for the chat support at our company, specific to our company's items?
@@gjsxnobody7534 I have the same query!
you know it's something big when The GRAY MAN himself is teaching you AI!!
No idea how long I've been searching the web for this exact tutorial. Thank you.
Wonderful - glad it worked out.
@@DataIndependent Do you offer consulting? I'd like to do something like this for my learners / learning business. 🙂
@@koraegis Happy to chat! Can you send me an email at contact@dataindependent.com with more details?
@@DataIndependent Thanks! Will do it now :D
Your series is just so so good. What a passionate, talented teacher you are!
Nice! Thank you!
Great job on the video. I understood a lot more in 12 mins than from a day of reading documentation. Would be extremely helpful if you can bookend this video with 1. dependencies and set up and 2. turning this into a web app. If you can make this into a playlist of 3 videos, even better.
Can you do a more in-depth Pinecone video? It seems like an interesting concept alongside embeddings, and I think it'll help tie together the understanding of embeddings for more 'web devs' like me. I like how you used relatable terms while introducing it in this video, and I think it deserves its own space. Please consider an Embeddings + Pinecone fundamentals video. Thank you.
Nice! Thank you. What's the question you have about the process?
@@DataIndependent I think a general Pinecone video would be great, and connecting it with LangChain and building apps similar to this one would be awesome
Weaviate is even better
This is absolutely brilliant! I love the way you explain everything and just give away all the notes in such a detailed and easy-to-follow way.. 🤩
This is exactly what I was looking to do, but I couldn't sort it out. This video is legit the best resource on this subject matter. You're a gentleman and a scholar. I tip my hat to you, good sir.
This is the best video i've watched explaining the use of pinecone.
Nice!!
thanks for making these videos! I've been going through the playlist and learning a lot. One thing I wanted to mention that I find really helpful in addition to the concepts explained is the background music! Would love to get that playlist :)
Thank you! A lot of people gave constructive feedback that they didn't like it, especially when they sped up the video and listened to the track at 1.2x or 1.5x.
Here is where I got the music!
lofigenerator.com/
Bro, thank you so much. Honestly, this video means so much to me. I really appreciate this. All the best in all your future endeavors.
Love it - what was your use case?
Hi! Awesome tutorial. This is exactly what I was looking for. I really love this series you've started and hope you'll keep it up. I also wanted to ask:
1. What's the difference between using Pinecone or another vector store like Chroma, FAISS, Weaviate, etc.? And what made you choose Pinecone for this particular tutorial?
2. What was the cost for creating embeddings for this book? (time & money)
3. Is there a way to estimate the cost of embeddings with LangChain beforehand?
Thank you very much and looking forward to more vids like this! 🤟
For your questions
1. The difference between Pinecone/Chroma, etc.? Not much. They store your embeddings and they run a similarity calc for you. However, the space is super new; as things progress, one may become a no-brainer over another. Ex: you could also do this in GCP, but you'd have to deal with their overhead as well.
2. Hm, unsure about the book, but here is the pricing for Ada embeddings: $0.0004 / 1K tokens. So a 120K-word book, which is ~147K tokens, would be about $0.06. Not very steep...
3. Yes, you can calc the number of tokens you're going to use for the task, then look up their pricing table and see how much it'll be.
@@myplaylista1594 This one should help out
help.openai.com/en/articles/4936856-what-are-tokens-and-how-to-count-them
@@DataIndependent It can't be so expensive. text-embedding-ada-002 is about ~3,000 pages per US dollar (assuming ~800 tokens per page).
@@klaudioz_ ya, you’re right my mistake. I didn’t divide by the extra thousand in the previous calc. Fixing now
@@DataIndependent No problem. Thanks for your great videos !!
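A sketch of that back-of-the-envelope estimate, assuming tiktoken's cl100k_base encoding (the one ada-002 uses) and the Ada price quoted above; check OpenAI's pricing page for current numbers:

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # encoding used by text-embedding-ada-002
num_tokens = sum(len(enc.encode(t.page_content)) for t in texts)

price_per_1k = 0.0004  # USD, Ada embeddings pricing at the time of the video
print(f"{num_tokens} tokens -> ~${num_tokens / 1000 * price_per_1k:.2f}")
```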
Great tutorial, bro. You're really doing good out here for us, the ignorant. It took me a while to figure out that I needed to run pip install pinecone-client to install Pinecone, so this is for anyone else who is stuck there.
Glad it worked out
Fantastic video thanks. I obtained excellent results (accuracy) following your guide compared to other tutorials I tried previously.
Ah that's great - thanks for the comment
Was the starter tier of pinecone enough for you?
It's one project only on the starter tier; that one project can contain multiple documents under one vector DB. For me it was certainly enough to get an understanding of the potential.
From my limited experience, to create multiple vector DBs for different project types you will need to go premium/paid, and the cost is quite high.
There may be other competitors offering cheaper entry levels if you wish to develop apps, but for a hobbyist/learner the starter tier on Pinecone is fine IMO.
Duudee!!! This video is exactly what I was looking for! Still a complete noob at all this LLM integration stuff and so visual tutorials are so incredibly helpful!
Thank you for putting this together 🙌🏿🎉🙌🏿
Great to hear! Check out the video on the '7 core concepts', which may help round out the learnings
I actually scanned the whole Mars trilogy to have something substantial, and it works fine. The queries generally return decent answers, although some of them are way off.
Thanks for your excellent work!
Nice! Glad to hear it. How many pages/words is the mars trilogy?
@@DataIndependent About 1500 pages in total.
Did you look at the results returned from Pinecone so you could determine if the answers that were off were due to Pinecone not providing the right context or OpenAi not interpreting the data correctly?
@@keithprice3369 No, I haven't. Good idea to do this. I now have GPT-4 access so I can use much larger prompts
@@bartvandeenen I've been watching a few videos about LangChain and they did bring up that the chunk size (and overlap) can have a huge impact on the quality of the results. They not only said there hasn't been much research on an ideal size but they said it should likely vary depending on the structure of the document. One presenter suggested 3 sentences with overlap might be a good starting point. But I don't know enough about LangChain, yet, to know how you specify a split on the number of sentences vs just a chunk size.
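LangChain's built-in splitters are character-based, but nothing stops you from splitting on sentences yourself and handing the chunks to LangChain afterwards. A sketch using NLTK (assumed installed, with the punkt tokenizer downloaded), implementing the 3-sentences-with-overlap starting point mentioned above:

```python
from nltk.tokenize import sent_tokenize  # pip install nltk; nltk.download("punkt")

def sentence_chunks(text: str, per_chunk: int = 3, overlap: int = 1) -> list[str]:
    # Slide a window of `per_chunk` sentences forward by (per_chunk - overlap)
    sents = sent_tokenize(text)
    step = per_chunk - overlap
    return [" ".join(sents[i:i + per_chunk]) for i in range(0, len(sents), step)]
```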
Great video man. Loved it. I had been looking for this solution for some time. Keep up the good work.
This is super awesome!!! And so easily explained! You made my year. Please keep up the greatest work
This is awesome! My question is: what happens when the model is asked a question outside of the knowledge base that was just uploaded? For example, what would happen if you asked who the best soccer player is?
This is a great video - succinct and easy to follow.
Two questions:
1) How easy is it to add more than one document to the same vector db
2) Is it possible to append an additional ... field(?) to that database table, so that the provenance of the reference can be reported back with the synthesised result?
1) Super easy. Just upload another
2) Yep you can; it's the metadata field, and you can add a whole bunch. People will often do this for document IDs
@@DataIndependent Amazing (and thanks for the reply). One final follow up then, is it easy / possible to delete vectors from the db too (I assume yes wanted to ask). I assume this is done by using a query e.g. if meta data contains "Document ID X" then delete?
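Something like that, yes; a sketch with pinecone-client v2, assuming each vector was upserted with a hypothetical {"document_id": ...} metadata field. Note that metadata-filtered deletes may not be available on every Pinecone tier, so check their docs:

```python
import pinecone

index = pinecone.Index("langchain1")
# Delete every vector whose metadata matches the filter
index.delete(filter={"document_id": "X"})
```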
Awesome tutorial, brief and easy to understand. Do you think this could be an approach for semantic search on clients' private data? My concern is data privacy. I guess that by using Pinecone and OpenAI, OpenAI only processes what we send (to respond in natural language) but doesn't store any of our documents.
I like the video because it was to the point and the presentation with the initial overview diagram is great.
This is a great video and Greg is awesome. Let's hope he puts together a course!
Greg, you are INCREDIBLE! Your channel and GitHub are a goldmine. Thank you 🙏. At 9:09, what install on Mac is necessary to assess methods like that?
Also, I’ve been trying to make some type of “theorems, definitions, and corollaries” assistant which extracts from my textbook all the math theorems, definitions, and corollaries. The goal there was to create textbook summaries to reference when I work through tough problems which require me to flip back and forth through my book all day long.
But more interesting, I am struggling to create a “math_proofs” assistant. Your approach in this video is awesome, but I can’t find any of your resources in which you use markdown, or latex, or any mathematical textbook to be queried. I use MathPix to convert my textbooks to latex, wordDoc, or markdown. But when I use my new converted markdown text, despite working hand-in-hand with the lang chain documentation, I still fail to get a working agent that proves statements.
I feed the model:
“Prove the zero vector is unique” and it replies nonsense, even though this proof is explicitly written in the text. It is not even something it had to “think” to produce (simple for the sake of example, these proofs are matrix theory so they get crazy). Could you please chime in?
Pulling all of that information out could be tough. I have a video on the playlist around "topic modeling", which is really just pulling structured information out of a piece of text. That one may be what you're looking for
This helped me a lot, thanks, and also for the updated code in the description!
This is such a game changer. Can’t wait to hook all of this up to GPT-4 as well as countless other things
Nice! What other ideas do you think it should be hooked up to?
Thumbs up and subscribed.
Thank you soooo much I am using this knowledge soo much for my school projects.
Awesome tutorial, brief, and easy to understand. My concern is data privacy: what happens with the data we turn into embeddings using OpenAI? Is that data used by them? Do they train their models further with that data? Can someone please answer if you have info on this privacy topic?
Nice video. I tweaked the code and split the index part and the query part so that I can index once and keep querying, like how we would do it in the real world. Nicely put together!!
Hello, do you have an example of how you did that? This is the part I'm confused about: how to reuse the same indexes. Thanks
Can you pls provide an example?
Thank you very much for doing this. It's absolutely awesome!!! Also can you do a video on how to improve the quality of answers?
Got to say, you are awesome! Keep up the good work, you got a subscriber here!
Nice! Thank you. I just ordered upgrades for my recording set up so quality will increase soon.
This is gold ! please do another one with data in Excel or Google sheet please :)
This is really cool, but I haven't yet seen a query for a specific information store (in your case, a book) that ChatGPT can't natively answer. For example, I asked ChatGPT the questions you asked and got detailed answers that echoed the answers you received, and then some.
heeeey! Loving this! Greg, I'm running an e-commerce site. We've got a metric shit-ton of products and endless amounts of purchase data. It would be extremely interesting to see how we could work with this to get all our product data loaded into Pinecone and then be able to query it in some meaningful sense. I guess a lot of the comments are in a similar vein. Would be super cool to get a video on that. I could supply some product data from our shop if need be.
Nice! What would be the business use case or problem you'd be trying to solve?
@@DataIndependent So I'm running a shop for car parts and equipment for cars. From a consumer's point of view, it would be amazing if we could solve two major issues.
1. When you're browsing for something to solve your problem rather than for an actual product. Say you have some stains on your car. It would be amazing if you could just ask the friendly chat support how to deal with the issue, and the support AI would have all the information about all our products and all the content we've written at hand, and could go: "Yeah, so you would use this product and go about it in so-and-so manner".
2. It would be super cool if it also had access to user data, past purchases, etc., and could go: "Hey, last time you bought this and this. How did that work out for you? From 1 to 10, how much did you love it?" etc.
It feels like this scenario is predicated on the idea that the AI has very specific knowledge.
Awesome example, thanks for putting this together!
Nice! Glad it worked out. Let me know if you have any questions
This is definitely cool, thank you. There seem to be several dependencies left out. It would be great if all dependencies were shown or listed...
ok, thank you and will do. Are you having a hard time installing them all?
@@DataIndependent hey I'm stuck on the dependency part as well
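For those asking, a sketch of the setup this kind of notebook typically assumes; the exact package list is an assumption rather than a dump from the video, and versions move fast:

```python
# pip install langchain openai pinecone-client unstructured tiktoken
import os
import pinecone
from langchain.document_loaders import UnstructuredPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Pinecone
from langchain.llms import OpenAI

os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_KEY"                     # placeholder
pinecone.init(api_key="YOUR_PINECONE_KEY", environment="YOUR_ENV")   # placeholder
```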
Amazing content, man. Love the diagrams and how you deliver; absolutely professional.
Quick question: is the text returned by the chain exactly the same as in the book, or does the OpenAI engine make some touches and improve it?
Great video!! Loved your explanation. Could you create another video on how to estimate the costs? Is the process of turning the documents into embeddings with OpenAI run every time you ask a new question, or just the first time? Thanks!
Pinecone is basically a search engine for AI. It doesn't need the entire book, just segments of it. This saves a lot of tokens because only segments of information end up in the prompt.
It's like adding some information into GPT's short-term memory.
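A sketch of that flow end to end, matching the chain type used in this style of tutorial (`docsearch` is the Pinecone vector store built earlier):

```python
from langchain.llms import OpenAI
from langchain.chains.question_answering import load_qa_chain

query = "What is great about having kids?"
docs = docsearch.similarity_search(query, k=5)        # Pinecone: only relevant segments
chain = load_qa_chain(OpenAI(temperature=0), chain_type="stuff")
answer = chain.run(input_documents=docs, question=query)  # segments go into the prompt
```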
Great! What are the limits? How many pages can it handle, and what are the costs?
However many pages you want; it's just storage space. Check out Pinecone's pricing for more
Nice!
I was working with pinecone / gpt code recently that gave your chat history basically infinite memory of past chats by storing them in pinecone which was pretty sweet as you can use it to give your chatbot more context for the conversation as it then remembers everything you ever talked about.
Will be combining this with a custom dataset pinecone storage this week (like a book) to create a super powered custom gpt with infinite recall of past convos.
Would be curious on your take, particularly how to keep the book data universally available to all users but at the same time keeping the past chat data of a particular user totally private but still being able to store both types of data on the free tier pinecone which I can see you are using (and I will be using too).
Nice! That's great. Soon if you have too much information (like in the book example above), you'll need to get good at picking which pieces of previous history you want to parse out. I imagine that won't be too hard in the beginning but it will later on.
@@DataIndependent Doesn't the k variable take care of this? It only returns the top k results, in order of relevance, that you end up querying.
Or are you talking about the chat history and not the corpus?
I see no reason why you would not just specify a k variable of 5 or 10 for the chat history too. For example, if a user was seeking relationship advice and the system knew their entire relationship history, and the user said something like "this reminds me of the first relationship that I told you about", it would be easy for the system to do an exact recall of the relationship and the name of the partner, and from there recall everything very quickly using the k variable on the chat history.
I use relationships as an example because I just trained my system on a book that I wrote called sex 3.0 (something that gpt knows nothing about) and I am going to be giving it infinite memory and recall this week.
@@PizzaLord Yes, the K variable will help w/ this. My comment was around the chance for more noise to get introduced the more data you have. Ex: More documents creep in that share a close semantic meaning, but aren't actually what you're looking for. For small projects this shouldn't be an issue.
Nice! That's cool about the project. Let me know how it goes.
The langchain discord #tools would love to see it too
@@DataIndependent Another thing I will look at, and I think it would be cool if you looked at it too, is certain chat questions triggering an event like a graphic or a video link being shown, whereby the video can be played without leaving the chat. This can be done either by embedding the video in the chat response area or by having a separate area of the same HTML page, a multimedia area or pane, that gets updated.
After all the whole point of langchain is to be able to chain things together, no? Once you chain things together you can get workflow.
This gets around one of ChatGPT's main limitations right now, which is that it's text-only in terms of what you can teach it, and the internet loves its visuals and videos.
Once this event flow stuff is in place, you can easily use it to flow through all kinds of workflows with GPT at the centre: collecting data in forms, doing quick surveys so you can store users' preferences and opinions about what they might want to get out of an online course you are teaching, then storing that in a vector DB. It can become its own platform at that point.
@@PizzaLord You could likely do that by defining a custom tool, grabbing an image based off a URL (or generating one) and then displaying in your chat box. Doing custom tools is interesting and I'm going to look into a video for that.
Thanks for sharing. Could you elaborate on why you didn’t use overlap?
This is great, thanks! Have you thought about how to extend it to be able to CHAT about the book (as opposed to one question at a time)? I am running into problems figuring out when to keep a chain of chat going and when to realize it's a new or related question that needs a new pull of similar docs
Really clear, useful demo - thanks for sharing
Great tut, thank you. Any advice on vectorizing a ton of widely varied documents? How many Q&A chatbots? One per index?
Hm, how many chatbots will depend on your product use case.
I would put them in the same index, but make sure your metadata is explicit so you can easily filter with them
@@DataIndependent Thank you
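A sketch of that metadata-filter setup, with a hypothetical "bot" field (the field name and values are made up for illustration):

```python
docsearch = Pinecone.from_texts(
    [t.page_content for t in texts],
    embeddings,
    index_name="langchain1",
    metadatas=[{"bot": "support"} for _ in texts],  # tag every chunk at upsert time
)

# At query time, restrict the search to one bot's documents
docs = docsearch.similarity_search(query, k=5, filter={"bot": "support"})
```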
great video. thanks so much.
How do you query the index without creating the embeddings every time? Is it possible?
thanks
Hi, I found this: docsearch = Pinecone.from_existing_index(index_name, embeddings)
Thank you for the excellent tutorial. I have a few questions to ask.
How can I pre-filter the vector in multiple document situations?
Secondly, I am not familiar with using Pinecone. How should I determine the optimal settings for dimensions and metrics in multiple documents?
By the way thank you so much again.
Thank you!
> How can I pre-filter the vector in multiple document situations?
Check out this code line, it has an argument where you can pass a filter to metadata github.com/hwchase17/langchain/blob/3c2468452284ee37b8a88a20b864255fa4385b65/langchain/vectorstores/pinecone.py#L132
Dimensions will be the number of values in each of your vectors. So the optimal one is what your embedding engine recommends and outputs.
@@DataIndependent Thank you for your previous response. I have an additional question regarding the language setting in LangChain. I am currently working on a Korean I/O based project, and I would like to modify my settings so that ChatGPT's responses come back in Korean. How can I apply this change?
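There isn't a global language setting in LangChain; the usual route is a custom prompt that tells the model which language to answer in. A sketch, assuming the stuff chain from this video:

```python
from langchain.prompts import PromptTemplate
from langchain.chains.question_answering import load_qa_chain

template = """Use the context below to answer the question.
Respond in Korean.

Context: {context}
Question: {question}
Answer:"""

prompt = PromptTemplate(template=template, input_variables=["context", "question"])
chain = load_qa_chain(llm, chain_type="stuff", prompt=prompt)
```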
Great tutorial. I wonder how to generate questions based on the content of the book? I would probably have to pass the entire content of the book to the GPT model.
I would love to see a video on the limitations of RAG. For instance say you have a document containing a summary of each country in Europe. Naturally one of the facts listed for each country would be the year they joined the EU. Unless explicitly stated, RAG wouldn't be able to tell you how many countries there are in the EU. I would love to see a tutorial on working around that limitation.
Nice! That's fun, thanks for the input on that.
You're right, that isn't a standard question, and you'll need a different type of system set up for that
Hey, Greg! I'm trying to connect the dots on GPT + langchain and your videos have been excelent sources! To give it a try, I'm planning to build some kind of personal assistant for a specific industry (i.e. law, healthcare), and down the road the vector database will become pretty big. Any guideline on how to sort the best results and also how to show the source of where the information was pulled from?
Nice! Check out the langchain documentation for "q&a with sources" you're able to get them back pretty easily.
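A sketch of that sources flow; it assumes each Document was created with a "source" metadata field (e.g. a filename or page number):

```python
from langchain.chains.qa_with_sources import load_qa_with_sources_chain

chain = load_qa_with_sources_chain(llm, chain_type="stuff")
result = chain({"input_documents": docs, "question": query})
print(result["output_text"])  # the answer ends with a SOURCES: list
```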
Your videos are amazing. Keep it up and thanks!
Thanks Philip. Anything else you want to see?
@@DataIndependent I'm curious what's a better option for this use case and would love to hear your thoughts. Why LangChain over Haystack? I want to pass through thousands of text documents into a question answering system and am still learning the best way to structure it. Also, an integration into something like Paperless would be cool!
I'm a total noob so excuse my ignorance. Thanks!
@@philipsnowden I haven't used Haystack yet so I can't comment on it.
If you have 1K text documents you'll definitely want to get embeddings and store them, retrieve them, then pass them into your prompt for the answer.
Haven't used paperless yet either :)
@@DataIndependent Good info, thank you.
@@DataIndependent Could you do a more in-depth explainer on this? I'm struggling to take a directory of text files and get it going. I've been reading and trying the LangChain docs but am having a hard time. Also, can you use the new turbo 3.5 model to answer the questions? Thanks for your time. Do you have a tip jar?
Thanks Greg for the great work. I actually ran some Q & A with a financial reporting (PDF) based on your examples. While the model did really great for text, it struggled with structured financial data outlined in tables, as typical for financial reporting. Do you think that can be improved further down the line (I assume that's something Open AI has to address in their LLM and not necessarily LangChain)?
For those examples it's all about the data preprocessing. The information is there, my guess is it's hard to read in table form though.
Yes I'm hoping that there is more support for this in the future.
You might try TabLLM. It may complicate your process, but it can reformat tables to be understood by an LLM
Wolfram Alpha?
BloombergGPT?
This is awesome! Thank you very much for the video. One quick question. How much did this cost with OpenAI and Pinecone API usage?
Pinecone at the time was free; OpenAI was a couple of cents
Love the article, I have few questions
1) after finding the relevant docs with highest cos similarity score, what's happening when you call OpenAI API? Is it summarising all the 5 docs together? Or are you doing few-shot with 5 docs as examples for prompt?
2) I would like to understand the shortcomings of having to divide into segments/documents - for example, if sentences containing same context gets cut into two docs and only one is included in the shortlist of similar docs, wouldn't some information go missing?
Really love your video and you made it so easy to understand, but would love know your thoughts on these! Thanks :)
Here ya go:
1) This depends on the "chain_type" you specify. There are a few ways to do it. Check out my video on "workaround openai token limit" for more information on it
2) I agree with your point. It's not ideal to split on physical characters because you might split context like you're mentioning. What we are *really* trying to do is group together meaning and we're using sentences and paragraphs to split on this.
My prediction is we'll soon see semantic splitting that groups together ideas or is more intelligent than just character splitting.
Thanks!
When you stuff all the docs together into the prompt and make an OpenAI API call, what is GPT doing? Is it just summarizing the docs in the prompt, or is it doing few-shot learning to answer the question?
@@abcd-zi4kc In this example you're telling OpenAI to answer a question given context (which are the similar documents you've retrieved). We don't give any examples.
Got it, thanks!
Hi Greg! Thanks so much for the video! I am wondering what OpenAI embedding model you used, and what OpenAI chat model you used, and where can I find that in the code? Additionally, is there a way to view the cost of querying in terms of tokens consumed? Thanks!
For embeddings I just use OpenAI's ada-002 model.
For the chat model: if one isn't provided, LangChain defaults to gpt-3.5 (as of today). I used the default, so you won't see it unless you check the LangChain source code.
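If you'd rather pin both explicitly, something like this should work (a sketch; these kwargs exist in recent-ish LangChain versions, so check yours):

from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.chat_models import ChatOpenAI

embeddings = OpenAIEmbeddings(model="text-embedding-ada-002")  # embedding model
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)    # chat model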
Another great tutorial Greg!
Curious if you've played around with Faiss. And if so, what you think of Pinecone vs Faiss?
Yep! I've played around with it and love it for local use cases. I had a hard time w/ a supporting library in it the last time I used it
@@DataIndependent Pinecone was getting expensive for us, so we're trying out Faiss now
Thanks as always Greg!
Awesome thank you
Great overview, thanks for the video. I have a question I was unable to find an answer to. Let's assume you ask a question the book does not cover. How do you fall back to the OpenAI knowledge base? Does it mean that in my case I will get 0 documents back and just run a plain API call to OpenAI?
Depends on how you instruct it in your prompt. You could tell it to answer with "I don't know" if it doesn't know, or tell it to make things up.
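One way to wire that instruction in (a minimal sketch using the standard PromptTemplate; the template wording is just an example):

from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains.question_answering import load_qa_chain

template = """Answer the question using only the context below.
If the answer isn't in the context, say "I don't know"; don't make anything up.

Context: {context}
Question: {question}
Answer:"""
prompt = PromptTemplate(template=template, input_variables=["context", "question"])
chain = load_qa_chain(OpenAI(temperature=0), chain_type="stuff", prompt=prompt)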
I am getting "Index 'None' not found in your Pinecone project. Did you mean one of the following indexes: langchain1" for the line below:
docsearch = Pinecone.from_texts([t.page_content for t in texts], embeddings, index_name)
Any idea what the issue could be? I checked that the index_name variable is set correctly as langchain1.
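A quick sanity check with the v2-era pinecone-client (a sketch; the key and environment strings are placeholders for your own):

import pinecone

pinecone.init(api_key="YOUR_API_KEY", environment="YOUR_ENVIRONMENT")
print(pinecone.list_indexes())  # "langchain1" should appear here;
# if it doesn't, you're likely pointed at the wrong environment/project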
Can you talk about the costs of using the OpenAI API for this? Is this like the cost of fine-tuning and usage with your own data, or just the cost of embeddings (ada)?
Check out the pricing page on OpenAI for a detailed breakdown. I might do a video on it since it's a good topic to understand
Thank you! Super helpful to understand how to use external data sources with OpenAI. What are some of the limitations of this approach, e.g. the size of the content being indexed in Pinecone? Are there limits on correlating and summarizing data across multiple documents/sources? Can I combine multiple types of sources about a certain topic (documents, databases, blogs, cases, etc.) into a single large vector store?
This is great, thanks! Do you have a video that shows how to connect what you did to a chatbot interface?
Not currently but this is on the horizon - I'll make a post on this channel in a few weeks
Hey Greg, great video!
Do you know if it's possible to automatically create a Pinecone index from code, so that you don't have to create it manually?
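For what it's worth, the v2-era pinecone-client did let you create indexes in code (a minimal sketch; 1536 is the dimension of ada-002 embeddings, and the key/environment are placeholders):

import pinecone

pinecone.init(api_key="YOUR_API_KEY", environment="YOUR_ENVIRONMENT")
if "langchain1" not in pinecone.list_indexes():
    pinecone.create_index("langchain1", dimension=1536, metric="cosine")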
Would this methodology be a good way to build up a Q&A body of knowledge on top of a business's SOP documents, allowing newcomers to a company to query best-practice protocols through a query system and negating the need to always go to their manager?
Big time, it's a great starting point. If you need more advanced retrieval techniques you could try out one of these: github.com/gkamradt/langchain-tutorials/blob/main/data_generation/Advanced%20Retrieval%20With%20LangChain.ipynb
@@DataIndependent Thank you!
Thanks for this very helpful practical tutorial!
Great video. How do I load the embeddings from Pinecone the next time I run the application (instead of having to generate them again via OpenAI at a cost)?
Great question. Did you ever get a response? I'm looking for the same thing.
thank you Greg! very helpful tutorial!!
Thanks Guiliana!
Hey, great video! What do you mean when you say that it's going to be more expensive with additional documents? What drives the cost?
Thank you!
Excellent! Just one question: once we load data, does it now belong to OpenAI/ChatGPT? In other words, can they use this uploaded book data to answer questions that other users may ask?
Is it a fine-tuned model? Because if not, we will be charged a lot for using the OpenAI API.
Please make a video on fine-tuning an OpenAI model like text-ada-001 with LangChain.
I have a question; please help me with this.
I am trying to create a chatbot that I feed a company's information to, so that it will refer to that information when answering.
I was trying to achieve this by fine-tuning the OpenAI GPT model but wasn't getting the desired results.
From what I understand, the technique in this video will work for that use case.
Am I right?
Yes, it would help with that. You just need to pass your company's documents into the loader
@@DataIndependent Thank you for the reply!
What about a video on hosting this on AWS and adding a Front end to make it accessible to clients?
I have a video about building a simple web app in 23 minutes using Streamlit which may help! If not, Vercel seems like another good option. Soon Pynecone will be too, once they add hosting.
Great video! So what if my question is outside the context of the PDF document? Will OpenAI answer it from its general knowledge, or will it simply say it doesn't know the answer? Either way, can we configure it to respond the way we want?
Awesome! How do we query multiple documents? Perhaps multiple 300-page books, or 30 PDFs of 100 pages each?
You could use metadata filters, or simply load up all those docs if you don't care about sources.
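A sketch of the filter approach (this assumes you stored a "source" field in each chunk's metadata when embedding; the field name is just a convention):

docs = docsearch.similarity_search(
    query,
    k=5,
    filter={"source": "book_one.pdf"},  # only search chunks tagged with this source
)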
Great tutorial! Though what would happen if you load the same document twice? Would the vector store store it again? The reason I ask is that I would like to build something similar, but want to prevent the vector DB from being populated with duplicate vectors and wasting embeddings. Thanks in advance.
My one regret with this video is not making it clear how to query the docs again without having to reload the embeddings. Yep it's possible.
github.com/hwchase17/langchain/blob/4654c58f7238e10b35544633bd780b73bbb75c75/langchain/vectorstores/pinecone.py#L250
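That link boils down to something like this (a minimal sketch; the index name and embeddings object are whatever you used originally):

from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Pinecone

embeddings = OpenAIEmbeddings()
docsearch = Pinecone.from_existing_index("langchain1", embeddings)  # no re-embedding
docs = docsearch.similarity_search(query)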
@Data Independent This makes more sense! Create an index based on the PDF file or files, then when the PDF is uploaded again just use the stored index. Though you would have to connect the index with the uploaded PDF(s), e.g. concatenate the PDF names or hash them somehow. I know how to get this done with a vector library, e.g. Chroma.
Thank you for this series. I'm confused about one thing: when querying the DB, you passed the text, not its embedding. How does Pinecone know how to embed the text?
Is there any limit on the number/size of the documents that can be uploaded so that the model performs efficiently? I am guessing that with larger sizes, the cosine similarity search might take more computational time.
Ya, it likely would take longer. I haven't seen a limit yet. At that point it's an engineering problem rather than an LLM/LangChain situation.
Thanks for the tutorial series! May I ask whether I could work with multiple different PDFs at the same time (other than combining them)?
In LangChain, is "similarity search" used as a synonym for "semantic search", or are they referring to different types of search?
To my knowledge, similarity search focuses on finding items that are similar based on their features or characteristics, while semantic search aims to understand the meaning and intent behind the query to provide contextually relevant results.
Loved it. One question: what OpenAI model does this approach use? For example, davinci, etc.?
Very helpful Video, Thank you!
If I already have embedding vectors stored in Pinecone and don't need to embed again, how can I modify the following code: ''docsearch = Pinecone.from_texts([t.page_content for t in texts], embeddings, index_name=index_name)'' so that I can still use docsearch.similarity_search() in the next step?
Well, this is indeed the unanswered question. Unfortunately, that is the problem with Jupyter Notebook cells.
Great video. I am wondering, is there a way to use PDFs made from photocopies of a document (where the image needs to be converted to text first)?
These are incredible instructions. In my case I have some documents in Vietnamese; will Pinecone support UTF-8? OpenAI + LangChain + Pinecone is very helpful in many fields, especially customer service.
Awesome video! What I don't understand is, shouldn't the query on line 14 be embedded? Shouldn't it be a vector, rather than a string, when querying the vector DB?
Additionally, how can this be done on an existing database, i.e. how do you replace docsearch with an arbitrary Pinecone index?
It’s crazy that you can ask books questions like they’re people now.
Kind of new to LLMs and all of this, so I'm not sure if I'm asking the right question here. But my concern with doing this is hallucinations. Could asking a question that is outside of the scope of this data science book still result in hallucinated answers? I've been doing a lot of reading about retrieval augmented generation and how that mitigates hallucination, so is that what's going on here? Great video!!
Yes, it definitely could. There are hallucination-mitigation techniques you could try; sometimes simply putting "don't make anything up" in the prompt works.
Ok, that's good to know! Do you have any thoughts/insights about using RAG tools like Vectara to mitigate hallucinations? Is there any way to really eliminate hallucinations, or is that just a pipe dream? @@DataIndependent
In 1994 Richard E. Osgood created a conversational reading system called "ASK Michael" for Michael Porter's book "The Competitive Advantage of Nations". Please let me know when you can automate the conceptual indexing and question-based indexing of a book including the creation and categorization of relevant questions that a novice that doesn't know any keywords or relevant vocabulary can ask.
It's really a great video to get started with LangChain. I have one small point of confusion: what if I want to send all the similar docs to the LLM, not just k=5? Is there a way to do that?
Great video. One question for a chatbot: what about the same or similar questions being sent to the bot from different clients? Does the cost double, and how do you optimize for it?
Love this brother!
Great video series. After you have created embeddings and uploaded them to Pinecone, what if your source data changes? How would you update the embeddings without deleting them and re-embedding the entire source again, which would become expensive?
Is there a way to only create embeddings for the changed parts?
Since each embedding is tied to a document, you can check diffs on your docs and only update when a doc changes.
Each embedding has metadata you'll need to keep track of. It'll be a hassle, but doable.
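A sketch of that bookkeeping (content hashes as vector IDs are my own convention here, not a built-in LangChain feature; known_ids is a placeholder for the IDs you've already stored):

import hashlib

def doc_id(text):
    return hashlib.sha256(text.encode()).hexdigest()  # stable ID per chunk

# only embed and upsert chunks whose hash isn't already in the index
new_chunks = [t for t in texts if doc_id(t.page_content) not in known_ids]
docsearch.add_texts(
    [t.page_content for t in new_chunks],
    ids=[doc_id(t.page_content) for t in new_chunks],
)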
@@DataIndependent I've seen another Python library called llama-index claim it can do updates automatically, so I'm figuring out how to do it in LangChain.
Seems you can integrate the two. Use llama-index to manage your documents and embeddings and pass that as a tool to a langchain agent.
The composability here is very nice!!
How does LangChain wrap the history of the chat, or does it not?
Internally, how does it send the prompt to OpenAI?
Thanks for the amazing tutorial
Thanks for the great content! Do you know how to better control the cost of a retrieval-based chatbot like this? In my experience it is quite costly to run Q&A on even the simple PDF provided in the LangChain repo, using the default embeddings and LLM models from the LangChain example.
Great stuff! What GUI wrapper do you recommend?
Ok, so maybe I misunderstand this one. I used the full text of War and Peace, just to test. My query was "How many times does the word 'fire' appear in War and Peace?" and when it finishes running there is no output... is this not the right setup for that kind of question?
Then I set the query to "What are the main philosophical ideas in War and Peace?" and it also returned nothing. It didn't error out; I double-checked and all my code is good.
Ah yes this is a fun question.
So LLMs won't be good at counting words like you're describing. That's a task they aren't well suited for yet. I would use a regular regex or a .find() for that.
The 2nd question is also hard: you need to review multiple pieces of text in the book to form a good opinion of the philosophical ideas.
Just doing a similarity-based embedding approach won't get you there.
If you wanted to answer the philosophical question, I would do a map reduce or refine with a large context window. However, War and Peace is huge, so that would cost a lot.
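A sketch of the map-reduce route (standard load_summarize_chain; texts is a placeholder for your split documents, and cost scales with their total length):

from langchain.llms import OpenAI
from langchain.chains.summarize import load_summarize_chain

# summarize each chunk ("map"), then combine the partial summaries ("reduce")
chain = load_summarize_chain(OpenAI(temperature=0), chain_type="map_reduce")
summary = chain.run(texts)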
Excellent video! Thanks for this!
Is there a way to use conversational memory while doing generative Q&A?
Big time. Check out the latest webinar on this exact topic; it should be on the LangChain Twitter.
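Roughly, the memory version looks like this (a minimal sketch with ConversationalRetrievalChain from the classic LangChain releases; docsearch is your existing vector store):

from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
qa = ConversationalRetrievalChain.from_llm(
    ChatOpenAI(temperature=0),
    retriever=docsearch.as_retriever(),
    memory=memory,  # carries prior turns into each new question
)
result = qa({"question": "And what did the author say about that?"})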
Hi there, thanks for the great video! It worked beautifully for my first query, but I want to be able to access the embeddings from pinecone without having to generate the texts variable every time as my file is massiveee! Would you recommend saving the all the split documents in a file, or is there a way to continuously query the same docs using just pinecone? Sorry for the convoluted question
It's all good, you'll want to load the index from an existing source. Check out persistence in the documentation which has details on it
@@DataIndependent love you so much 🫡
@@Mohammed-lo7xr Simple load: docsearch = Pinecone.from_existing_index(index_name=index_name, embedding=embeddings, namespace='pdf-test')
Thanks for this great video, but I have a question. Does this method support the use case of creating a summary of the book? What if I were to ask: "Give me an overview of all chapters, including a small summary of each"? From what I can see, that is not a good use case for this architecture, because the query must draw on the entire book rather than a specific place. Or am I mistaken? Thanks, Michael
I would do a bit of preprocessing of your data: make a summary of each chapter individually, then add them all back together.
Asking it to do the whole thing in one shot may overwhelm it.
Hey, great work. Question: if I want to summarize a very large document, could I split it into multiple documents and create embeddings to build the summary? What is the best practice?
Nice! This is a great question - I have a whole video on summarization methods depending on text length.
Check out “5 levels of summarization”