00:03 PDF demo (56-page Legal PDF doc)
02:05 Visual overview of pdf chatbot architecture
06:56 Code walkthrough pt.1
11:10 Pinecone dashboard + setup
13:43 Code walkthrough pt.2
You should post this in the video description in order for YouTube to use it to chapterize your video
@@yeezythabest ohhh thanks for the tip
Can you make a tutorial for macOS, like a step-by-step guide I can follow please? I am a beginner in the coding world.
@@chatwithdata - error Error: Cannot find module 'next/dist/server/future/route-modules/route-module.js'
please help
That design chart overview was excellent! That’s what is usually missing from most tech tutorials.
For real! I'm a visual learner, and if I can't visualize something in my head it's hard for me to understand what's going on. I've watched so many similar videos, and I got more out of the opening 5 minutes of this one than from the hours and hours of other videos I've watched on this topic.
What tool is it?
@@benohanlon any updates? I want to know.
@@ESGamingCentral I guess it is Excalidraw
@Oscar Llerena if you wanted to answer the question, without answering the question, then 10/10.
As a software engineer I'm very happy to say: one of the best and most concise videos I've ever seen. Thanks. Really appreciate it 🎉😊
Thank you
Amazing job! Even as a non-coding attorney, I was able to follow your instruction (with a little help from GPT) and install the code from Github on my Mac, and it worked flawlessly! Occasionally, I get some incorrect or incomplete answers from the chatbox, but it's amazing nonetheless. I sent you a message through Twitter regarding consulting with my developers. Looking forward to hearing from you. : ) Thanks again!👏👏👏
Do you feel an app that automatically does this for you would have value? The app would contain all these cases and you'd probably only have to ask questions.
Also, for example, if you search for cases, it would give out similar cases and score how similar each of these cases is?
Also, if another feature could predict a new case's possible result and potential areas that a lawyer could focus on in terms of one's argument, would this help as well?
Yes
As a programmer who worked with attorneys I wish more attorneys were like you lol
I am on the Weaviate Slack forum and I see newbie questions all the time from people struggling to understand the relationship between embeddings (vectorization) and chat completion. I've been referring them to this video for an overview of the process. Good work!
You taught me what I needed most. thank you my messiah!
If you're running into errors when "ingesting" the data due to "Error calling upsert", I posted a list of major culprits to troubleshoot here: github.com/mayooear/gpt4-pdf-chatbot-langchain/issues/4
It's likely you haven't properly configured your Pinecone settings, or your insertions are above Pinecone's size and insertion limits per upsert request.
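If it's the size limit, here's a minimal sketch of batching the upserts (the batch size of 100 and the v0 PineconeClient call shape are assumptions — check your client version):

```typescript
// Sketch: split vectors into batches so each upsert request stays
// under Pinecone's per-request size limit.
import { PineconeClient } from '@pinecone-database/pinecone';

type Vector = { id: string; values: number[]; metadata?: Record<string, any> };

async function batchedUpsert(
  index: ReturnType<PineconeClient['Index']>,
  vectors: Vector[],
  namespace: string,
  batchSize = 100, // assumed safe size; tune for your payloads
) {
  for (let i = 0; i < vectors.length; i += batchSize) {
    const batch = vectors.slice(i, i + batchSize);
    await index.upsert({ upsertRequest: { vectors: batch, namespace } });
  }
}
```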
What I was looking for. Everything explained so well. Way better than some paid courses.
Have you implemented it? If yes, can you share your repo link? I am getting a Pinecone ingest error even after following the video.
yes, same here @@internproj
This is an awesome tutorial ! Hands-on especially for the clear and very well schematised system design :) Subscribed and will wait for upcoming videos !
I got this running and am SO excited to keep working on it. Thank you.
Would love to see a more detailed step-through. Awesome vid!
I'm covering that here in the upcoming workshop: tinyurl.com/zcks7jsk
Thank you so much for the tutorial. It was very well explained. I appreciate you sharing your knowledge with the community!
Can it ingest tabular data from the PDFs accurately? Superb presentation btw
Is it possible to use a similar method here to review long code instead of a PDF doc?
Great explanation! So can this chatbot only find the answer to a question if it is included in a single chunk of the original data? So if some question requires information from page 2 and page 30 it can't answer that question correctly?
Great video! Would love to see this done with a Notion database using langchain. Perhaps outputting long text into additional Notion pages that are then added to Pinecone, so that it creates a veritable limitless memory of previous responses, answers, and structured content for the use case.
love the audio quality
❤ This is the first time I've been able to comprehend the whole chatbot process. Please, please can you share this diagram flow chart? It's honestly the best visual, and I would love to be able to play around with it.
How can we manipulate the response length to generate longer responses?
Awesome job 🎉🎉 I will be translating this material to Spanish and posting it on my channel. Obviously I'll give you all the credits.
Great App! I see the potential of how powerful it can be. Your video is excellent, I got the script up and running and I am a beginner in learning code. Thanks for the tutorial
As a Ukrainian I have to say you picked a good example pdf! ❤
Thanks for sharing the code and demo. Nitpicking a bit but in the video, under the "Visual Overview" section the "Ingestion pipeline" part of the diagram does not show use of OpenAI Embeddings.
Was replicating this project and encountered this error after running npm run ingest: error TypeError: Cannot read properties of undefined (reading 'text'). Does anyone know a fix for this?
I didn't find the line of code where you locate and output the link to the source of the answer. Can you help me with that? Thanks.
Is there a way to upsert embeddings without using fromDocuments? The Pinecone free tier doesn't support namespaces anymore.
Great work! A few questions, if you can reply please. 1) How can I ask it to look for answers in a particular file only, rather than all of them? 2) What about adding files later, do we need to ingest everything again? Thanks.
Hi, I am a beginner and just wanted to experiment and play with the program. I followed most steps and reached the npm run dev step and opened localhost:3000 to the chat bot. When I test prompted "hello" in the chatbox, it returned this error: PineconeClient: Error calling query: PineconeError: The requested feature 'Namespaces' is not supported by the current index type 'Starter'.
Your work is so valuable. Thanks a lot for this info and the code!
Great video! I have two questions: is there any way to prevent the chatbot from coming up with an answer? and is there any structure the input data must follow for better results?
My chatbot sometimes sends the wrong answer or completely makes up an answer.
Hey, have you solved the problem? 😢
I have an error:
- error Error: Cannot find module 'next/dist/server/future/route-modules/route-module.js'
Great work, waiting for more videos and tutorials.
Amazing video, thank you very much! I am trying to get it running with an Azure OpenAI model but get errors there all the time, do you happen to know what has to be changed for that to work?
I must say, you are an amazing person for doing this. So helpful and so explanatory. I'd love to build with you.
can you use this for math books as well?
Depends on how good the OCR is. Essentially, whatever you put in is converted to text.
Yes, it's a similar approach; you'd just need to do more prompt engineering to prime GPT to be an expert "mathematician" and then also provide examples of problems and solutions.
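Roughly what that priming could look like in this codebase — a sketch where the template wording is illustrative, not taken from the repo:

```typescript
// Sketch: a QA prompt that primes the model as a mathematician.
// The wording below is an illustrative assumption.
const MATH_QA_TEMPLATE = `You are an expert mathematician. Use the context
below to answer the question. Show your working step by step. If the context
doesn't contain enough information, say you don't know.

{context}

Question: {question}
Answer with worked steps:`;

// Passed in wherever the chain is built, e.g.:
// ConversationalRetrievalQAChain.fromLLM(model, retriever, { qaTemplate: MATH_QA_TEMPLATE });
```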
The upcoming Mathematica plugin integration could probably help as well
Is it possible to submit more than one book and then chat with all of them at once?
My main question is: does OpenAI read directly from the store, or is it sending all that data up to OpenAI for every question?
Managed to get this running locally, but on Vercel I am getting a 504 SyntaxError: Unexpected token 'A', "An error o"... is not valid JSON. Any ideas?
Amazing presentation skills!
Is it possible to use Chroma instead of Pinecone? What would be the downsides of Chroma? I am asking because Chroma is open source and I can store the embeddings on premises as well as in the cloud. Furthermore, Chroma is free.
I need to use version 4 of ChatGPT and upload tens of thousands of documents
If I upload documents in version 3.5, can I use them in version 4 without paying?
When will enrollment open up? Would love to get used to using these tools.
Thank you, great job! But may I ask how the chatbot brings up the URL links, since you are searching within the PDF? Is GPT-4 browsing the internet?
Great vid. Setup complete. How did you get it on your localhost to test? BTW, anyone ever tell you that you sound like the UFC Champ Adesanya :)
why did you put 1536 into the dimensions field in the pinecone dashboard? Where does this number come from? thanks
What if you want to use continuous data? For instance, I want to create a model that uses embeddings from a CSV. This CSV file needs to be updated daily so that it has the most relevant information. How would I update the embeddings with new information from the CSV without having to create a new embedding every time?
I'll do a video on this soon, but you'd "upsert" into the vector database only the new embeddings from the CSV file, without re-running the ingestion from scratch. Where it gets tricky is when you want to "update" a specific part of your CSV file.
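A sketch of that incremental upsert (import paths and options vary by langchain version, so treat this as an assumption):

```typescript
// Sketch: embed and upsert only the new CSV rows into the existing
// index instead of re-running the full ingest.
import { OpenAIEmbeddings } from 'langchain/embeddings/openai';
import { PineconeStore } from 'langchain/vectorstores/pinecone';
import { Document } from 'langchain/document';

async function upsertNewRows(pineconeIndex: any, newRows: string[]) {
  const store = await PineconeStore.fromExistingIndex(new OpenAIEmbeddings(), {
    pineconeIndex,
    textKey: 'text',
  });
  const docs = newRows.map(
    (row) => new Document({ pageContent: row, metadata: { source: 'daily.csv' } }),
  );
  await store.addDocuments(docs); // embeds and upserts just these rows
}
```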
That's very nice.
I wanted to ask you:
How is it so fast in the video?
I run it locally and it takes like 10-15 seconds.
Really good explanation and diagram to get your head around the process!
Hi, great app. I have 2 questions. Apart from GPT-3.5 Turbo and GPT-4, can we use Davinci? I'd like the AI to reply in Spanish, and I don't have access to GPT-4 yet. And... is there a way not to limit the answers to the PDF and use ChatGPT's general knowledge as well?
I wonder if this is the way the new AskYourPDF plug-in works? It clearly can take PDF documents longer than the token window, I guess because it creates these chunks and embeddings. The issue I have with it is that it messes up when searching for the right embeddings; it does not really retrieve the correct info I want. For example, I give it a PDF of a research paper and I want a summary of just the introduction, which is only 2 pages long, and it gives me a mixture of information from around the entire document, not just the introduction — I'm guessing because it retrieves chunks from all around the document based on similarity to my prompt. It works better when I ask for specific concepts, but when I want it to be more sequential and thorough, like summarizing each subsection of a section of the document, it skips some or grabs stuff from another section, probably because those bits and pieces are in different chunks that still have a close relation to my prompt.
Hi, is there a solution to replace Pinecone with Cosmos DB?
So, the answer needs to be quite localized on a chunk of the document… right?
Why is the LLM needed to create the standalone question from the chat history? Isn't it just a concatenated string?
Is the chat history storing the question and the answers, or just the questions?
Hi, thanks for your sharing. I came across this error message:
creating vector store...
error [Error: PineconeClient: Project name not set. Call init() first.]
d:\chatPDF\gpt4-pdf-chatbot-langchain\scripts\ingest-data.ts:43
throw new Error('Failed to ingest your data');
But I have set my index name at Pinecone as "ChatPDF001" and set this value in the config file. Any suggestion here? Thanks.
Try and upgrade to the latest node version.
What's the difference between the "standalone question" and the original question? Do they differ a lot?
How does full document summary work when you are working with pinecone nearest neighbor document snippets?
Thank you for the awesome tutorial.
Please, I want to use the ingest script as a function in the index file, in order to upload a file and use it instead of passing a static path.
Thanks in advance.
I already tried, but it's a little bit complex with Next.js and TypeScript. Please help if you can.
I am stuck on the same problem 🙂
You want to upload the file from the frontend and run the ingest script automatically?
@@chatwithdata Yes that's right, please if you can help
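While waiting for that, a minimal sketch of the idea: a Next.js pages-router API route that accepts an upload and runs the ingest logic on it (formidable and the ingestFile helper are assumptions, not code from this repo):

```typescript
// pages/api/upload.ts -- sketch only; validation and error handling omitted.
import type { NextApiRequest, NextApiResponse } from 'next';
import { IncomingForm } from 'formidable';
// Hypothetical helper: the repo's ingest script refactored into a
// function that takes a file path instead of the hard-coded docs/ path.
import { ingestFile } from '@/scripts/ingest-data';

export const config = { api: { bodyParser: false } }; // let formidable parse the body

export default async function handler(req: NextApiRequest, res: NextApiResponse) {
  const form = new IncomingForm();
  form.parse(req, async (err, _fields, files) => {
    if (err) return res.status(500).json({ error: 'upload failed' });
    const file = Array.isArray(files.file) ? files.file[0] : files.file;
    await ingestFile(file.filepath); // run the ingest pipeline on the upload
    return res.status(200).json({ ok: true });
  });
}
```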
This is a great flowchart. However, I still don't understand why, in your architecture, you: "chat history + new question --> submit to gpt to generate new standalone question --> vector search with new standalone question" instead of "chat history + new question --> vector search". We now have three OpenAI api calls per query (creation of standalone question, vectorization of question and standalone question + context documents). And, what goes into the chat history? The questions, context documents submitted and responses OR just standalone questions and responses?
If you've used a lot of LLMs like I have, you'd likely intuitively KNOW that you should do it this way. You develop a feel for what's going to produce the best results. But anyways, here are some specific reasons (courtesy of gpt3.5):
1. Avoidance of context collapse: By creating a standalone question that combines the recent chat history with the new question, the app is able to avoid "context collapse", where the model becomes confused by different topics and the resulting response may not be relevant to the specific question being asked. By creating a standalone question, the app can ensure that the model is focused on the specific topic at hand.
2. Reduction in computational costs: By creating embeddings based on the standalone question, the app can reduce the computational cost of vector search, as the search is only being done on a single, well-defined question rather than the potentially complex and convoluted chat history. This can speed up the response time of the app and improve the overall user experience.
3. Improved accuracy: By creating a standalone question, the app can ensure that the model is focusing on the most important aspects of the conversation and the specific question being asked. This can improve the accuracy of the resulting response and make the app more effective at providing useful information to the user.
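If you want to see roughly where this happens in code, here's a sketch of the condense-then-retrieve chain (import paths and option names vary by langchain version, so treat the details as assumptions):

```typescript
// Sketch: chat history + new question are condensed into a standalone
// question first; that standalone question then drives the vector search.
import { OpenAI } from 'langchain/llms/openai';
import { ConversationalRetrievalQAChain } from 'langchain/chains';

const CONDENSE_PROMPT = `Given the following conversation and a follow up
question, rephrase the follow up question to be a standalone question.

Chat History: {chat_history}
Follow Up Input: {question}
Standalone question:`;

// vectorStore is assumed to be the PineconeStore built at ingestion time.
const chain = ConversationalRetrievalQAChain.fromLLM(
  new OpenAI({ temperature: 0 }),
  vectorStore.asRetriever(),
  { questionGeneratorTemplate: CONDENSE_PROMPT, returnSourceDocuments: true },
);
const response = await chain.call({ question, chat_history: history });
```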
@@sissyrecovery I have only worked with the OpenAI LLMs, and not for very long, so this response is extremely helpful and enlightening for me. I am using the Weaviate GraphQL query architecture, so I've got another step after I get the standalone question: I've got to extract from it the core "concept" of the question in order to retrieve the relevant docs to submit as "context" to the LLM. Sheesh! Again, thanks for taking the time to respond.
@@sissyrecovery I saved your comment to review as I was coding my app. It is only a prototype at this point, but it is executing chat completions successfully, standalone questions and all! Just wanted to come back and say "Thank you" for helping me figure this out.
@@SwingingInTheHood great to hear, glad I could help!
Great video! Waiting for more with Langchain 🙌
I don't know exactly; it looks like Pinecone sometimes gives the wrong file if you're using many different large law-case documents. I use Supabase as the vector store; I think it's more accurate.
Hey, I changed the PDF in my docs folder and have been getting "Warning: Invalid stream: "FormatError: Bad FCHECK in flate stream: 120, 239"" after that change. The new PDF turns entirely blank after I run ingest. Please help guys :/
Hey, can you tell me what's your microphone setup? It sounds amazing.
Awesome tutorial 🎉, I wanted to create a similar project. This will give me a head start. ❤
hi, thanks a lot for this video. Can you please tell me how to implement the output of messages not in English? e.g. in French
I've written a prompt for GPT-4 that I use with chatGPT in Macromancy formatting to transform it into a legal assistant. and the results have been stellar. Is it possible to encode this prompt into the system you describe so that the bot operates with it in mind?
Is there a step-by-step tutorial on how to install this on a Mac?
I ran the command "npm run ingest" and got the error The requested module '@esbuild-kit/core-utils' is expected to be of type CommonJS, could anyone who can help?
How can I make it use links like you did instead of pdf?
What's the tool to do this discussion?
How do we think this would do with a PDF of a long fiction novel? TY!
I don't understand how the server works. I'm trying to replicate this using React but can't understand how the chat component "knows" about the API. Is this a Next.js feature that I don't understand? Also, thanks for the video!! Amazing content!
How would you work with uploaded documents, i.e. if the user uploads a document and gets a chatbot trained on it?
Loved your video, certainly learnt a few things from it. I'm not a programmer; GitHub is something completely new for me. The explanation of the code and how things basically work is clear — amazing. Sadly, I didn't manage to make anything happen. Probably the only thing I managed was to clone the repo and open it in Visual Studio, and that's it. The instructions, pnpm install and so on — I don't know where to start. If there's any video or explanation of how to make this work so I can move forward, I would appreciate it. Thanks anyway for the great tips.
So do I.
How do you set the dimensions in Pinecone? Is it just a number sufficient for a large PDF, or did it come from a calculation? Great work btw!
The dimensions are 1536 due to OpenAI's embeddings function; it isn't related to the size of your PDF, it's the dimensionality of the numeric representations of your docs. If you don't set this figure, Pinecone won't let you insert into the database. You set this when you create an "index". docs.pinecone.io/docs/choosing-index-type-and-size
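If you'd rather create the index from code than the dashboard, here's a sketch with the v0 PineconeClient (the createRequest shape is an assumption for that client version):

```typescript
// Sketch: create a Pinecone index whose dimension matches
// text-embedding-ada-002's 1536-dimensional output vectors.
import { PineconeClient } from '@pinecone-database/pinecone';

const pinecone = new PineconeClient();
await pinecone.init({
  apiKey: process.env.PINECONE_API_KEY!,
  environment: process.env.PINECONE_ENVIRONMENT!,
});
await pinecone.createIndex({
  createRequest: { name: 'pdf-chatbot', dimension: 1536, metric: 'cosine' },
});
```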
@@chatwithdata ..... now I'll subscribe to your channel. Important detail. thanks
Can it return images if you ask the chatbot?
Great video. How would you gather the text from a PDF that doesn't encode the text, so it's essentially a picture as a PDF? Would you have to implement some OCR system to extract the text from the PDF to then encode the embeddings? Essentially, I have a PDF document that, when parsed, outputs an empty `pageContent: ''`.
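One possible sketch of the OCR route, assuming each PDF page has already been rendered to an image (with pdf2pic or similar — that step isn't shown) and using tesseract.js:

```typescript
// Sketch: OCR a rendered page image, then feed the text into the
// normal splitting/embedding pipeline.
import Tesseract from 'tesseract.js';

async function ocrPage(imagePath: string): Promise<string> {
  const { data } = await Tesseract.recognize(imagePath, 'eng');
  return data.text; // plain text, ready for the text splitter
}
```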
Is it possible to use GPT-3.5 instead?
With the new plugin announced, will this be relevant in a few months?
Yes, you won't have control over customizing your prompts and there will be upload limits. Using this open source repo, you can upload PDF docs without outages, limits, privacy concerns etc.
Is it available? Do you have a link?
May I ask how you jumped to the interface at 6:56? Please. From a junior high school student who is humble in learning.
I have watched this video no less than ten times but still can't figure it out.
I see the following problem with this: imagine a text like an earnings transcript where the company doesn't say "our revenue was x$", but rather talks about all their different ventures separately and lists the corresponding revenue. Like this:
in the cloud business we made a revenue of 13 billion...
In the ai business we made a revenue of 5 billion...
As I understand it, using the embeddings the AI would probably think that the total revenue was 13 or maybe 5 billion, but wouldn't be able to calculate the whole revenue, because it would need the context of the whole earnings call for this.
Is this right, or am I misunderstanding something?
Is one PDF considered one token? Or does the chat analyse each word in the document as a token? This could make a world of difference when it comes to Api modelling.
The PDF is converted to numbers (embeddings). It is these embeddings that are analyzed and used to retrieve the relevant docs.
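Concretely, a sketch of that per-question step using langchain's JS API (the OPENAI_API_KEY env var is assumed):

```typescript
// Sketch: the same embedding model maps text to a 1536-number vector.
import { OpenAIEmbeddings } from 'langchain/embeddings/openai';

const embeddings = new OpenAIEmbeddings();
const vector = await embeddings.embedQuery('What did the court decide?');
console.log(vector.length); // 1536

// In practice, similaritySearch does this embedding step for you:
// const topChunks = await vectorStore.similaritySearch(question, 4);
```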
Could using embeddings potentially lead to information leakage?
For the text-splitting part, do you recommend tokenising the text first?
The textsplitter function in the repo does that all for you under the hood.
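For reference, a sketch of that splitter setup (the chunk values are the commonly used ones — check ingest-data.ts for the repo's actual numbers):

```typescript
// Sketch: split raw PDF docs into overlapping chunks before embedding;
// no manual tokenisation step is needed beforehand.
import { RecursiveCharacterTextSplitter } from 'langchain/text_splitter';

const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 1000,   // characters per chunk (assumed value)
  chunkOverlap: 200, // overlap preserves context across chunk boundaries
});
const docs = await splitter.splitDocuments(rawDocs); // rawDocs from the PDF loader
```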
Great video. I was curious if we can follow this implementation if the data is kind of semi structured like Project Management data(both text and numerical fields). Does this approach work well only in case of Text data ? Was curious if embeddings work well for mixed data(numerical/text). How would query work if it involves multiple filters on text and numerical fields?
Are you a contributor to Langchain? Your videos are awesome, but the langchain typescript sdk can be a bit buggy as of right now. Which is fair because it's still not in a stable 1.0 release yet, and is in the early stages, but it makes it difficult to use in production.
Is there any way to weight or improve the responses given? i.e. selecting "this was useful/correct" and having that feed back into the model?
Unless it's verifiable data from an external source the LLM can utilize an API to reach, it will need feedback from the user. Then you can optimize your prompts based on this.
Can we do something similar but with open-source models, in case the document has some privacy-related content?
Can I upload 1TB of docs/PDFs to OpenAI and do prompt chat with it?
You can, but you will need to pay for Pinecone and the OpenAI API for that.
What software was used to create that graphic organizer?
Very cool tutorial. Extremely informative!!
How do you convert your standalone question into embeddings?
Great work... Thanks, I will be able to use this for a lot of things!
Awesome open source sharing! I have it working, but what if we want to add more docs, like thousands of PDF files? Does the ingest command still work well, and how do I set the files path (const filePath = 'docs/filename.pdf';) in the code? Thank you!
I have the same question as Buddy. Maybe just add the docs that haven't been indexed?
Hey thanks. Yes as I mentioned in the video, it will take quite a bit of time to explain how that would work + Q&A around, so based on viewers requests I'll cover that in the upcoming step-by-step chatbot program here: tinyurl.com/37v3k2fz
Very concise and clear. Thank you.
@Chat with data: I'm curious to understand how the flow works (specifically in terms of grabbing the appropriate context/embeddings related to the question prompt) when the related context is sparse across the original document (meaning it's spread over a lot of vectors/embeddings)?
You mean the related context belongs to more than one chunk/vector? There are several options, but one is to pass the top, say, 3 returned source documents (by vectorSimilarityScore) as the context, then tweak the prompt accordingly.
@@chatwithdata yep, that's what I meant. Thank you for the idea! I guess that would be something to play with in the parameters, right?
@@resistance_tn Yes. So by default in this codebase 'returnSourceDocuments' is set to true. So you can play around with the 'k' value for the number of returned source docs, and also the final prompt sent to the LLM that contains the context.
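A sketch of those two knobs together (model and vectorStore are assumed to be set up already; option names per the langchain JS API of the time):

```typescript
// Sketch: widen retrieval so context spread across the document
// still makes it into the prompt.
const chain = ConversationalRetrievalQAChain.fromLLM(
  model,
  vectorStore.asRetriever(4), // k = 4 returned source documents
  { returnSourceDocuments: true },
);
```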
@@chatwithdata cool, thank you very much! Looking forward to the next content :)
Does this work with pictures?
Great work! Well done.
I have a question: do we load the data into the model every time, or is it saved? I mean, each time we need to run it, do we have to do the embedding for ChatGPT 3.5 or 4? Cost-wise, for large data this will be costly...
I have the same question…
"Embeddings Ada" usage is priced per input token, at a rate of $0.0004 per 1,000 tokens, or about ~3,000 pages per US dollar (assuming ~800 tokens per page). Also, you need a workaround for massive PDFs, as it's not able to handle so many chunks (you can find an explanation on the GitHub issues page). So in my opinion, for most use cases it's not that costly. Also, the output of GPT is likely not that long, so you just pay about $0.002 per 1,000 tokens, which is also pretty good.
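Sanity-checking that estimate: $1 ÷ ($0.0004 per 1,000 tokens) = 2.5 million tokens per dollar, and 2,500,000 ÷ ~800 tokens per page ≈ 3,125 pages, so "~3,000 pages per US dollar" holds up.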
No, once you've made the embeddings, they're stored in the vector database in a permanent state, and you can start running queries against it without re-running 'ingest' again. You can also back up and download your embeddings locally.
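In code, reconnecting to the stored embeddings looks roughly like this (mirrors the shape of the repo's setup; the textKey and namespace values are assumptions):

```typescript
// Sketch: reuse embeddings already stored in Pinecone -- no re-ingest.
import { OpenAIEmbeddings } from 'langchain/embeddings/openai';
import { PineconeStore } from 'langchain/vectorstores/pinecone';

const vectorStore = await PineconeStore.fromExistingIndex(new OpenAIEmbeddings(), {
  pineconeIndex: index, // an initialised PineconeClient index
  textKey: 'text',
  namespace: 'pdf-test', // whatever namespace you ingested into
});
const results = await vectorStore.similaritySearch('your query', 3);
```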
This is cool! Can it work with the ChatGPT model or GPT-3? I don't have GPT-4 API access :)
What happens with an Excel file, or if there is a table of numbers inside the PDF? Can LangChain with GPT be used to ask computational questions? Can you make a video on turning Excel into vectors?
Yes it can. I have an upcoming video addressing this.