RAG-GPT: Chat with any documents and summarize long PDF files with Langchain | Gradio App

  • Published: 28 Sep 2024
  • RAG stands for Retrieval Augmented Generation and RAG-GPT is a powerful chatbot that supports three methods of usage:
    1. Chat with offline documents: Engage with documents that you've pre-processed and vectorized. These documents will be integrated into your chat sessions.
    2. Chat with real-time uploads: Easily upload documents during your chat sessions; the chatbot processes them and sets up a RAG pipeline so you can chat with the documents on the fly.
    3. Summarization Requests: Request the chatbot to provide a comprehensive summary of an entire PDF or document in a single interaction, streamlining information retrieval.
    00:01:30 Chatbot demo
    00:07:04 GitHub repository explanation
    00:08:15 RAG presentation (explaining different RAG techniques)
    00:17:18 Project schema
    00:26:50 Designing the data ingestion section
    00:38:12 Designing the pipeline for connecting the GPT model to the vectorDB
    00:46:45 Designing the chatbot interface
    00:49:14 Connecting the backend to the chatbot interface
    00:54:09 Testing the RAG side of the project
    01:04:28 Designing and testing the document summarization section
    01:19:26 Optimization strategies and deployment considerations
    🚀 GitHub Repository:
    LLM-Zero-to-Hundred Project: github.com/Far...
    RAG-GPT project: github.com/Far...
    📚 Main Libraries:
    OpenAI: platform.opena...
    Gradio: www.gradio.app...
    Langchain: python.langcha...
    Chroma: docs.trychroma...
    📺 Introduction to Text Embedding:
    Watch the Video: • RAG explained: A Step-...
    #RAG #llm #ChatBot #GPT #Python #AI #OpenAI #Langchain #Gradio #chroma

Comments • 162

  • @mikew2883
    @mikew2883 8 months ago +2

    Hi there. Quick question. I know the documentation states to open and modify "cfg.py", but I noticed it was replaced with app_config.yml and the other configuration settings are located in "load_config.py". I am receiving an error with the values I supplied, so I was wondering what format and values the following configuration settings expect. I am using OpenAI, for instance, and not Azure. Thanks!
    ```
    openai.api_type = os.getenv("OPENAI_API_TYPE")
    openai.api_base = os.getenv("OPENAI_API_BASE")
    openai.api_version = os.getenv("OPENAI_API_VERSION")
    ```

    • @airoundtable
      @airoundtable 8 months ago +2

      Hi Mike. You are right, the correct configuration file is app_config.yml. Whenever you make changes to this YAML file, the updates propagate throughout the project via the load_config.py script, which handles the distribution of configuration values.
      To add a new configuration parameter, you'll need to follow these steps:
      1. Introduce the new parameter in the app_config.yml file.
      2. Update load_config.py to ensure that the new parameter is loaded correctly.
      3. Access the new parameter in your project's modules by creating an instance of the configuration loader, like so:
      ```
      APPCFG = LoadConfig()
      your_new_parameter = APPCFG.new_argument
      ```
      Regarding the OpenAI credentials, it's crucial to handle them securely. I use an environment variable to store these credentials, which is not included in the GitHub repository to maintain security. You should create a .env file within your project directory and populate it with your OpenAI credentials. Here's an example of what that might look like:
      OPENAI_API_KEY=
      Since you're not utilizing Azure, you may not require all four of the credential arguments I use. Just include the necessary ones provided by OpenAI. I suggest checking these links and making a simple API call to OpenAI to ensure you understand the process.
      * www.datacamp.com/tutorial/using-gpt-models-via-the-openai-api-in-python
      * platform.openai.com/docs/api-reference/streaming?lang=python
      Alternatively, if you prefer not to use a .env file, you can directly insert your OpenAI credentials into the load_config.py module, although this is less secure and not recommended for sensitive information.
      I hope this clarifies the configuration process for you.
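      A minimal sketch of that environment-based credential handling, assuming python-dotenv (or the shell) has already populated the environment; the helper name and dict layout are illustrative, not part of the RAG-GPT repo:

```python
import os

def load_openai_credentials() -> dict:
    """Collect the OpenAI-related variables this kind of project expects.

    When using OpenAI directly (not Azure), only the API key is required;
    the type/base/version variables can stay unset.
    """
    creds = {
        "api_key": os.getenv("OPENAI_API_KEY"),
        "api_type": os.getenv("OPENAI_API_TYPE"),        # Azure only
        "api_base": os.getenv("OPENAI_API_BASE"),        # Azure only
        "api_version": os.getenv("OPENAI_API_VERSION"),  # Azure only
    }
    if not creds["api_key"]:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return creds
```

      With only OPENAI_API_KEY set, the three Azure entries simply come back as None, which is fine for direct OpenAI usage.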

    • @mikew2883
      @mikew2883 8 months ago +2

      @airoundtable Thank you for your reply. I ended up creating a .env file but am still getting the following error. I even left the API endpoint out as well, since you mentioned it was only needed for Azure.
      APIConnectionError: Error communicating with OpenAI: Invalid URL 'None/engines/gpt-35-turbo-16k/chat/completions': No scheme supplied. Perhaps you meant None/engines/gpt-35-turbo-16k/chat/completions?

    • @airoundtable
      @airoundtable 8 months ago +2

      @@mikew2883 No problem! Well, that means the API call itself is now OK, but the code that generates the response from the GPT model is not compatible with OpenAI. The issue is that I am using something like:
      ```
      openai.ChatCompletion.create(
          engine=gpt_model,
          messages=[
              {"role": "system", "content": llm_system_role},
              {"role": "user", "content": prompt},
          ],
          temperature=temperature,
      )
      ```
      This is the code that works with Azure OpenAI, while for those who use OpenAI directly, the code is something like this:
      ```
      client.chat.completions.create(
          model="gpt-4",
          messages=[
              {"role": "system", "content": llm_system_role},
              {"role": "user", "content": prompt},
          ],
      )
      ```
      So, in order to fix the problem:
      1. Check the project schema and find where the GPT models generate a response.
      2. Find that code in the project and change it to the OpenAI format.
      I suggest you first generate a response from a GPT model using your API key and make sure that the code you are using works as expected.
      Check this link for more info on how to generate the response with OpenAI:
      platform.openai.com/docs/api-reference/streaming?lang=python

    • @mikew2883
      @mikew2883 8 months ago +1

      Will do. Thanks for the reply! @airoundtable

    • @airoundtable
      @airoundtable 8 months ago

      You're welcome, @mikew2883!

  • @navanshukhare
    @navanshukhare 7 months ago +9

    This is the one video to watch if you want to learn and build an advanced RAG project. Your other videos are equally great, and I love how organized your approach is; your code quality is just WOW.

  • @RZOLTANM
    @RZOLTANM 6 months ago +1

    Really good and perfectly articulated presentation on RAG. Thank you!

    • @airoundtable
      @airoundtable 6 months ago +1

      Great to see that you liked the content!

  • @SaddamBinSyed
    @SaddamBinSyed 1 month ago

    Very well done, @Farzad. Great explanation. This is exactly the concept I was looking to understand and implement. You are simply 100x amazing. I am highly excited to listen to your other videos as well. Thanks for keeping this channel so informative. One suggestion from my side: next time, please use local LLMs like Ollama Llama 3.1 so that those who cannot afford paid APIs can benefit.

    • @airoundtable
      @airoundtable 1 month ago

      Thanks! I appreciate the kind words and I am glad the content was helpful. Thanks for the suggestion; I have an almost identical project using open-source LLMs. Please check out:
      ruclips.net/video/6dyz2M_UWLw/видео.htmlsi=u-QWc-Mz5oOA17LS

    • @SaddamBinSyed
      @SaddamBinSyed 1 month ago

      @@airoundtable Thanks for sharing, watching right away...

  • @JJaitley
    @JJaitley 7 months ago +3

    What are your suggestions on cleaning company docs before chunking? Some of the challenges are how to handle index pages across multiple PDFs, as well as headers and footers. You should definitely make a video on cleaning a PDF before chunking; it's much needed.

    • @airoundtable
      @airoundtable 7 months ago +2

      Well, handling company documents for integration into a RAG system is indeed a complex task. It's often so detailed and requires such a hands-on approach that I would strongly suggest treating the document preparation as a separate project from the RAG chatbot development. Even that project by itself can be divided into two main flows:
      1. Cleaning and preparing existing documents
      2. Establishing a standard format for all the new documents for easier future integration
      Since the RAG system is going to perform a vector search across the entire document set, I suggest removing the unnecessary or duplicate content (for instance I cannot think of any possible way that a separate index would add value to the conventional RAG strategies and vector search techniques, unless you design a complex RAG system that incorporates hierarchical graph methodologies).
      Finally, if your documents contain domain-specific abbreviations that general language models may not recognize, you can think of implementing an advanced RAG system with a fine-tuned LLM on your specific domain data (There is another video in the channel that explains how to fine-tune an LLM on company documents which might give you some good ideas).
      And thanks for the suggestion! I'll consider creating a tutorial video to address this issue.

  • @musumo1908
    @musumo1908 6 months ago +1

    Great work! Would love to see this with LiteLLM as an option and some sort of basic user login system…along the lines of open webui

    • @airoundtable
      @airoundtable 6 months ago

      Thanks! That is indeed a great combo. I haven't looked at it closely yet, but I will definitely check it out down the road.

  • @doctorbill37
    @doctorbill37 7 months ago +1

    I just discovered your video today in my feed. This is an excellent project with great attention to detail. Very well done.
    I cloned it and saw a project in your bullet list called "Open Source LLMs" along with a note that it is coming soon. Do you have any idea when that might be? This is important for those of us wanting to run LLMs with RAG locally on our machines. Very much looking forward to seeing this. Thanks for your work.

    • @airoundtable
      @airoundtable 7 months ago +1

      Thank you very much for the positive feedback, @doctorbill37. I am glad to hear you liked the video. For the open-source RAG project, I have good news: I have already started recording the video. It will be uploaded in the next couple of days.

    • @doctorbill37
      @doctorbill37 7 months ago

      @@airoundtable Wonderful, I am subscribed!

  • @341yes
    @341yes 6 months ago

    Thanks for this project! Very useful! Will watch every one of your videos from now on! ☺

    • @airoundtable
      @airoundtable 6 months ago

      Thanks @341yes! I am glad to see that you liked the video!

  • @revanthreddy6136
    @revanthreddy6136 5 months ago +1

    Hello sir, is the OpenAI API key paid? Do we have to pay for it in order to access and use it?

    • @airoundtable
      @airoundtable 5 months ago

      Hi, yes. You have to pay for it to be able to make the API calls. If you go to the OpenAI website you will see how to get the API key. Also keep in mind that the project currently uses Azure OpenAI; in case you want to use OpenAI directly, a couple of modifications are required. I have pinned a comment here (you can see it at the top) where I explain all the steps in detail.

  • @PrinceBrosnan
    @PrinceBrosnan 1 month ago +2

    Is there a similar ready-made solution on the site "poe"? I am a beginner and want such a model, but not to build it, just to work with it.

    • @airoundtable
      @airoundtable 1 month ago

      Check out my video below:
      ruclips.net/video/8iMIGVWMPPQ/видео.htmlsi=ryvfD6m65Jyro205
      This is a free RAG app that you can use

  • @hammadyounas2688
    @hammadyounas2688 10 days ago +1

    Can we have the option to upload files to the vector store to update the assistant, like uploading files to OpenAI's vector store?

    • @airoundtable
      @airoundtable 10 days ago

      If you mean whether you can add/remove/modify the files of the vector store that you created, the answer is yes, you can. You can easily find info on it; I just did a quick search and saw this tutorial:
      www.datacamp.com/tutorial/chromadb-tutorial-step-by-step-guide
      But I'd search the ChromaDB documentation to find all the details.

  • @LorenzoPozzi-g6h
    @LorenzoPozzi-g6h 1 month ago

    Very good content, thanks for the video!!

  • @KinesitherapieImanesghuri
    @KinesitherapieImanesghuri 6 months ago +1

    How can we evaluate the responses generated by the RAG system?

    • @airoundtable
      @airoundtable 6 months ago

      That is the million-dollar question. There has been a lot of effort around evaluating RAG systems, but the challenging part is that there is no single metric that can tell you how accurate your system's response is. Instead, to evaluate RAG systems, we usually use the help of LLMs themselves. But before that, we need to understand what the challenges in a RAG system are. Here is a brief summary of some of the key areas:
      1. In the data preparation pipeline: data quality + chunking strategy + embedding quality
      2. On the retrieval side: user's query quality + search quality + relevance of the retrieved contexts to the query
      3. On the synthesis side: context overflow + LLM hallucination + answer relevance
      These are the components that need to be adjusted and evaluated in a RAG system. For the evaluation pipeline itself, you can either use the frameworks being developed for this purpose, e.g. TruLens, Langsmith, Galileo (my recommendation: Langsmith),
      or you can design a custom pipeline depending on your goal and use case. I have a video on the channel called "Langchain vs Llama-index" where I design an end-to-end pipeline and evaluate the performance of 5 different RAG techniques. There I go into much more detail about this topic.
      Overall, this task requires a good amount of testing and iteration, especially if the requirements are very specific and complex.
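      As one small building block of such a custom pipeline, retrieval relevance is often scored by comparing the query embedding with each retrieved chunk's embedding; a minimal sketch (illustrative only, not from the RAG-GPT code):

```python
import math

def cosine_similarity(a, b) -> float:
    """Cosine similarity between two embedding vectors (1.0 = identical
    direction, 0.0 = orthogonal). Retrieved chunks can be ranked by this
    score against the query embedding."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm
```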

  • @RetiredVet
    @RetiredVet 7 months ago +1

    You did a great job, but the text in the videos is so small that I have to constantly expand it to read. It would be nice to be able to read the text without going full screen all the time.

    • @airoundtable
      @airoundtable 7 months ago +1

      Thanks @RetiredVet! You are right. I have to find a way to increase the size of the content for an easier read. That is actually why, in the Langchain vs Llama-index video, I omitted the PowerPoint and showed everything on screen, including each command I was executing. I am constantly looking for ways to improve the quality, as I only recently started uploading videos to YouTube.

    • @RetiredVet
      @RetiredVet 7 months ago

      @@airoundtable I enjoyed your video and think your code is great. The code and explanations are the important part; you can learn the video stuff much more easily. I've looked at a lot of Langchain videos and your explanations are very clear.
      Unfortunately, I am an intermediate Python programmer and I had no idea that requirements files were so different between Windows and Linux. I cannot use your requirements files, and when I try installing Langchain with pip these days, it never works. If the YouTube video is a week old, the requirements have changed. I try to downgrade to the recommended versions, but then Langchain installs packages that don't work. I am learning a lot more about package management than I ever wanted to.
      Langchain is a very interesting project, but it is moving so fast that it is difficult for me to keep up.
      Keep up the good work.

  • @serjmitaki
    @serjmitaki 7 months ago +2

    You did a great job! Thank you!

    • @airoundtable
      @airoundtable 7 months ago +1

      Glad to hear you liked the video! Thanks for the feedback!

  • @Thomas-nx5vo
    @Thomas-nx5vo 6 months ago +1

    Great video. I'm testing out the project, but it seems the chatbot also takes information from the web, as it accesses websites when there are no uploaded docs. I would like it to interact only with the PDFs/uploaded docs... Any fixes?

    • @airoundtable
      @airoundtable 6 months ago

      Thanks Thomas! The chatbot does not have access to the internet, but it does have access to the pre-trained knowledge of the GPT model. Overall, it works in 3 different ways:
      1. If you have already preprocessed some documents and start using the chatbot, it will give you answers based on those documents (this is the chat with pre-processed docs feature).
      2. If you select the chat with upload docs feature and upload documents, it will start giving you answers based on the uploaded documents (until you switch the setting back to pre-processed docs).
      3. In case the user's question is not related to any documents, the chatbot will use its own knowledge, but in a limited way, to just act as a friendly chatbot. If you would like to restrict it even more, you can change the LLM system role in the config folder (configs/app_config.yml: llm_system_role argument). That is where I instruct it and explain how it should behave. I cover it in the video at:
      00:40:45 LLM system role
      I hope this helps you solve the problem.

  • @anandkhule375
    @anandkhule375 7 months ago +2

    Great project, good work!

    • @airoundtable
      @airoundtable 7 months ago +1

      Glad to see you liked it! More projects are on the way!

  • @Abdulrahmanmuhammed-cy9zq
    @Abdulrahmanmuhammed-cy9zq 4 months ago +1

    Hi, I have to ask some beginner questions:
    does it support Arabic documents as well?
    and is the key free to use?

    • @airoundtable
      @airoundtable 4 months ago

      Hi,
      I am really not sure how the models would perform on Arabic. You can give it a try, or search Arabic forums and see what models they suggest for Arabic.
      Which key are you referring to?

  • @0ZeroTheHero
    @0ZeroTheHero 6 months ago +1

    How can this be modified to include .epub files?

    • @airoundtable
      @airoundtable 6 months ago +1

      To add .epub files, modify the "prepare_vectordb.py" module and add a condition for .epub, then use the following links to prepare the Langchain loader and pass the loaded files on for chunking.
      js.langchain.com/docs/integrations/document_loaders/file_loaders/epub
      python.langchain.com/docs/integrations/document_loaders/epub
      api.python.langchain.com/en/latest/document_loaders/langchain_community.document_loaders.epub.UnstructuredEPubLoader.html
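      A minimal sketch of that extra condition; the function name and the dispatch-by-suffix shape are illustrative (the real module's structure may differ), and UnstructuredEPubLoader would come from langchain_community.document_loaders:

```python
from pathlib import Path

def loader_class_for(doc_path: str) -> str:
    """Map a file extension to the Langchain loader to instantiate.

    Returns the loader class name as a string so the sketch stays
    self-contained; in the real module you would import and call it,
    e.g. UnstructuredEPubLoader(doc_path).load() for .epub files.
    """
    suffix = Path(doc_path).suffix.lower()
    loaders = {
        ".pdf": "PyPDFLoader",
        ".txt": "TextLoader",
        ".epub": "UnstructuredEPubLoader",  # the new .epub condition
    }
    if suffix not in loaders:
        raise ValueError(f"Unsupported file type: {suffix}")
    return loaders[suffix]
```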

  • @MH-xx6df
    @MH-xx6df 1 month ago +1

    Module error: pwd. Are you running this on a Unix-like system? What modifications must I make for Windows, please?

    • @airoundtable
      @airoundtable 1 month ago +1

      I ran this project on Windows, but it does not matter; the project is designed so it can be executed on any OS without modification. If you are getting the error while installing requirements.txt, just remove the library that is causing the issue and try again.

  • @taylorfans1000
    @taylorfans1000 7 months ago +1

    Hi, I am using OpenAI entirely and not Azure. I changed the chat completion function as per your solution in the comments, but I am getting the error: TypeError: 'ChatCompletion' object is not subscriptable in response["choices"][0]["message"]["content"]. Please suggest a resolution. Also, the application stops fetching answers after the first question. Please help. @AI RoundTable

    • @airoundtable
      @airoundtable 7 months ago +2

      Hi @taylorfans1000, this problem is not hard to debug.
      1. Based on OpenAI's website, even when using their models directly you should be able to extract the response from response["choices"][0]["message"]["content"]. Here is the reference: platform.openai.com/docs/guides/text-generation/chat-completions-api
      2. But to make sure that your GPT model is working as expected, test it separately in a notebook. Use your API key and the chat completions interface from openai, and make a successful API call with the GPT model you are using.
      3. Don't use response["choices"][0]["message"]["content"]; directly print(response) itself and make sure you are getting the whole JSON response from OpenAI.
      4. Once you've got the API call working, then try to get the specific message content using response["choices"][0]["message"]["content"] and make sure you can extract the response content from OpenAI's JSON response.
      5. After you have gone through these steps successfully, apply it to the following files and lines in your project:
      src/util/chatbot.py in lines 68 to 78
      src/utils/summarizer.py in lines 110 to 118
      My guess is the fetching problem will be solved as well after you fix this. I hope this helps; feel free to let me know in case you have more questions.
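      The subscriptable error itself comes from the difference between the pre-1.0 dict-style response and the 1.x object-style response; a small compatibility shim (hypothetical, not part of the repo) can read either:

```python
def extract_content(response) -> str:
    """Pull the assistant message out of either OpenAI response style.

    openai-python < 1.0 returns a dict-like object indexed with
    response["choices"][0]["message"]["content"]; the 1.x client returns
    an object accessed as response.choices[0].message.content.
    """
    try:
        return response["choices"][0]["message"]["content"]  # pre-1.0 style
    except TypeError:
        return response.choices[0].message.content  # 1.x client style
```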

    • @341yes
      @341yes 6 months ago

      Hey! Were you able to debug this issue? I'm facing the same! Thanks in advance!

  • @chintujha7404
    @chintujha7404 6 months ago +1

    Great video on RAG. One quick question: can we add documents from the UI for preprocessing and chat with them, rather than adding documents to the data folder from the backend? I mean, add functionality in the UI that will allow me to add documents to the data folder and preprocess them so that I can chat? Thank you

    • @airoundtable
      @airoundtable 6 months ago

      Thanks! Yes, the chatbot has that capability. Follow these steps:
      1. In the "RAG with" dropdown, choose chat with upload docs.
      2. Use the "Upload doc" button and select your documents.
      3. Wait a few seconds until the chatbot tells you that the documents are processed and you can start asking questions.

  • @RealLexable
    @RealLexable 1 month ago +1

    Could .csv files or images also be added?

    • @airoundtable
      @airoundtable 1 month ago

      Not in this version. For performing Q&A and RAG with CSV files, please have a look at my LLM agent videos. For images you would need more complex approaches; the ones presented by Langchain and Unstructured are not ready for production and take a very long time to process images. In my next video, though, I am aiming to show how to fine-tune multimodal LLMs on custom image datasets, and those models can be used to perform RAG on images. Here is the link to my video describing how to chat with SQL and tabular databases:
      ruclips.net/video/ZtltjSjFPDg/видео.htmlsi=bh9xdkJqufFBMrBI

  • @abhiruwijesinghe2677
    @abhiruwijesinghe2677 5 months ago +1

    Hi,
    I got an error saying there are no files inside data/docs:
    FileNotFoundError: [WinError 3] The system cannot find the path specified: 'data/docs'
    But I didn't change anything in your repo; I just cloned it and ran it. Can you give some guidance on this issue?

    • @airoundtable
      @airoundtable 5 months ago

      Hi,
      That shouldn't be the case. This directory is part of the repository, and I am using the 'pyprojroot' library to manage the directories in the project automatically. Without the full traceback of the error, I cannot tell why it is happening. If you still get the error, feel free to share the full traceback and I will let you know the source of the problem.

  • @usamaahmed8075
    @usamaahmed8075 4 months ago +1

    Facing an issue: when I run my chatbot it does not give an answer and shows an error; when I processed the file it also showed an error.

    • @airoundtable
      @airoundtable 4 months ago

      As I mentioned, please check the repository issues and open a new one if needed

  • @ragavanajith4538
    @ragavanajith4538 8 months ago +1

    How do I stop the streaming response from the LLM in Langchain? We are using X-Accel-Buffering.

    • @airoundtable
      @airoundtable 8 months ago

      It is hard to tell without seeing the code; it depends on how you are calling the model and generating the response. In the code I put in my GitHub repository, the model does not stream the response. But in case you are using different code with Langchain, check these links:
      - python.langchain.com/docs/modules/model_io/chat/streaming
      - python.langchain.com/docs/modules/model_io/llms/streaming_llm

  • @NinVibe
    @NinVibe 5 months ago

    Hi, I love the whole project, but I would be happy if you went more in-depth on the following statement in the repo: "It is strongly recommended to design a more robust and secure document handling process for any production deployment."
    Do you mean improving the security of the documents and restricting access to the app, and implementing such steps, or something else?

    • @airoundtable
      @airoundtable 5 months ago +1

      Hi, thanks.
      No, I would suggest handling the access level on the chatbot side. In general, for RAG projects, I suggest separating the document processing pipeline from the chatbot itself. And for the data pipeline there are many factors that need to be taken into account. Assuming the company is mid-size or bigger:
      1. Document cleaning and transformation (for instance, if you are dealing with .txt, .docx, and .pdf, after preprocessing you can convert everything to PDF documents for a manageable workflow).
      2. Content validity checks. This step can be managed at a division level or on multiple levels, depending on the size of the company and the complexity of the documents. (The verification teams should verify the contents and be responsible for what is in there.)
      3. Avoid duplicates. (Sometimes different divisions have very similar documents but for different use cases and purposes. This can cause confusion in a RAG project.)
      4. Address specific cases (e.g., if a PDF file was created using a scanner, you would not be able to perform RAG on it).
      5. Manage security (who can add/remove a document from the pipeline itself, and also the access level).
      6. Implement a CI/CD pipeline for automating the workflow and managing scale (e.g., in case a document is added to a database, your pipeline should be able to add only that document to the vector DB rather than recreating the vector DB from scratch).
      7. Secure the pipeline.
      8. Test and improve.
      These are some of the key aspects of the data pipeline for RAG projects.
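      Point 6 above can be sketched with a simple content-hash manifest; everything here (function name, manifest layout) is a hypothetical illustration, not part of the RAG-GPT repo:

```python
import hashlib
import json
from pathlib import Path

def files_to_ingest(docs_dir: str, manifest_path: str) -> list:
    """Return only documents whose content hash is not yet recorded.

    The manifest maps file name -> sha256 digest, so an unchanged document
    is skipped and the pipeline adds new or changed files to the vector DB
    instead of rebuilding it from scratch.
    """
    mp = Path(manifest_path)
    manifest = json.loads(mp.read_text()) if mp.exists() else {}
    new_files = []
    for doc in sorted(Path(docs_dir).glob("*")):
        if not doc.is_file():
            continue
        digest = hashlib.sha256(doc.read_bytes()).hexdigest()
        if manifest.get(doc.name) != digest:
            new_files.append(doc)
            manifest[doc.name] = digest
    mp.write_text(json.dumps(manifest, indent=2))
    return new_files
```

      On a second run over the same folder, the function returns an empty list, so only genuinely new or modified files would be embedded and added.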

  • @shahnaz9026
    @shahnaz9026 6 months ago +1

    I have a doubt... Is this project only for text PDFs? Or can it be used for PDFs that contain images and tables as well?

    • @airoundtable
      @airoundtable 6 months ago +1

      This project is only for .txt and PDF files, and it is easily extendable to docx files (you just need to add it to the list of acceptable files in the code). If your documents contain images and tables, that would not hurt the performance on text. But for implementing RAG with images and tables, this project needs to be upgraded.
      RAG on images requires image embedding and vector search on those embeddings. There is still no solid approach that can handle various types of images (e.g., technical drawings), but there are a few preliminary solutions out there that work on generic images. So, the industrial application is still very limited.
      Tables, on the other hand, require specific approaches that can extract the contents of a table properly. The "Unstructured" library has been working on this aspect, and Langchain has adapted it within its framework. So, since I used Langchain in this project, you can easily modify it and add that approach. But the problem is that handling tables with that approach takes a very long time (in my experience), which makes it impractical for industrial purposes. That leaves the door open for custom solutions suited to specific business needs, which can vary widely.

    • @shahnaz9026
      @shahnaz9026 6 months ago +1

      @@airoundtable Thank you so much for replying.
      One more doubt: can I write all this code in Jupyter notebooks and run it?

    • @airoundtable
      @airoundtable 6 months ago +1

      @@shahnaz9026 Technically you can. You need to do a lot of refactoring to the code, but it is doable if you want to run it on Jupyter. Keep in mind that this code currently uses Azure OpenAI; in case you want to use OpenAI directly, the chat completion functions need to be modified as well.

  • @hillmanlai3270
    @hillmanlai3270 6 months ago

    This is a great video that clearly explains the whole RAG development process.
    One quick question: I created the .env file to store the OPENAI_API_KEY, but it still could not be found. Where should I put the OPENAI_API_KEY?

    • @airoundtable
      @airoundtable 6 months ago

      Thanks, I am glad that you liked the video!
      To make that work:
      Create a raw file, name it ".env", and put it in the parent folder of the project (the RAG-GPT folder), then add your arguments like this:
      ```
      OPENAI_API_TYPE=azure
      OPENAI_API_VERSION=
      OPENAI_API_KEY=
      OPENAI_API_BASE=
      ```
      Then, to test whether it is working properly, open a notebook or a raw .py module and run:
      ```
      import os
      from dotenv import load_dotenv, find_dotenv

      # This line automatically finds the .env file in your environment
      _ = load_dotenv(find_dotenv())

      openai_api_type = os.getenv("OPENAI_API_TYPE")
      openai_api_base = os.getenv("OPENAI_API_BASE")
      openai_api_version = os.getenv("OPENAI_API_VERSION")
      openai_api_key = os.getenv("OPENAI_API_KEY")
      ```
      Then print a value and make sure it got it right:
      ```
      print(openai_api_type)
      ```
      If you see the values by printing them, then you are good to go.

  • @hungryforasmr1157
    @hungryforasmr1157 13 days ago +1

    Bro, it was giving an error about modifying pip.

    • @airoundtable
      @airoundtable 11 days ago

      Open a ticket on the repo if you still couldn't solve the problem; I need to see the error.

  • @coffeepod1
    @coffeepod1 1 month ago

    How do I register an Azure deployment?
    We already have some params in load_openai_cfg(self), like API_KEY, API_BASE, API_VERSION, API_TYPE. I tried to add an API_DEPLOYMENT parameter, but it still errors: No deployment found

    • @airoundtable
      @airoundtable 1 month ago

      I hope you have already found the answer to your question. But in case you haven't: first check whether you can call your models from a notebook outside the project. After a successful call, try to add the models to the project. This error was raised because the project could not properly communicate with the model; that is due either to the model name or to the credentials you took from Azure OpenAI.

    • @coffeepod1
      @coffeepod1 1 month ago +1

      @@airoundtable Yes, already solved, but I ended up using OpenAI instead of Azure. Hi, I want to ask again:
      How can we make the chatbot understand the format of documents? For example, I have a document with the format: title, dates, content. Then I upload a new document and check whether my new document has the same format as my preprocessed document. I also altered the system prompt in app_config.yml to add the capability for the chatbot to detect typos, but it doesn't work. How can I edit the project? Thanks in advance

    • @airoundtable
      @airoundtable 1 month ago

      @@coffeepod1 Glad to hear it.
      Regarding your new question, I've never done it, but if I wanted to solve it, I would use a hybrid approach: combine Python libraries that can extract document info based on the document structure (headings, subheadings, etc.), and from there either hard-code the checks to make sure the structure is correct or use an LLM to make the judgment for me. Just handing a document to an LLM would not be an effective way to achieve your goal.
      About the typos: since LLMs work with tokens and not words, it is sometimes hard for them to detect typos (especially when the context length is long). The benefit is that when we interact with them, they don't care if we have typos in our queries. But on the downside, when it comes to fixing typos, there is a chance that they miss the errors. They perform better with smaller context lengths, which is not usually the case for RAG systems.
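
A minimal sketch of the hybrid structure check described above, assuming a made-up "title / date / content" layout; the section names and regex are illustrative, not part of the RAG-GPT code:

```python
import re

# Hypothetical rule-based half of the hybrid approach discussed above:
# validate the expected "title / date / content" layout with plain rules
# before asking an LLM to judge anything.
EXPECTED_SECTIONS = ["title", "date", "content"]  # assumed document format

def has_expected_structure(text: str) -> bool:
    """Return True if every expected section heading appears, in order."""
    pos = 0
    for section in EXPECTED_SECTIONS:
        # look for a heading like "Title:" at a line start, after the last match
        match = re.search(rf"^{section}\s*:", text[pos:], re.IGNORECASE | re.MULTILINE)
        if match is None:
            return False
        pos += match.end()
    return True

doc_ok = "Title: Q3 report\nDate: 2024-01-15\nContent: revenue grew..."
doc_bad = "Content: no title or date here"
print(has_expected_structure(doc_ok))   # True
print(has_expected_structure(doc_bad))  # False
```

A real pipeline would run this cheap check first and only fall back to an LLM judgment for documents that fail it or have an ambiguous layout.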

    • @coffeepod1
      @coffeepod1 1 month ago +1

      @@airoundtable I see, we are not OpenAI (or a big AI company), so we are indeed limited on resources. But what about prompt engineering with a few shots about the doc structure? Or how about an NER approach, since it is similar to information extraction? And about typos, I didn't really think about the obstacles you explained, man; I just assumed the GPT model (I use GPT-4o) has the capability to detect typos. Poor me, working on a hard project

    • @airoundtable
      @airoundtable  29 days ago

      ​@@coffeepod1 I am not sure about those suggestions; I haven't worked with them. But at this point, I would say the best strategy is to test different approaches quickly and see which one is more effective than the others, then spend time improving that approach.

  • @kunalsatpute8379
    @kunalsatpute8379 6 months ago

    Very detailed explanation, and thank you for making it open source.
    Are there any plans to advance this application? For example:
    1. An advanced RAG pipeline that can extract text, table data, or images based on the user's question
    2. Creating a vector DB from text, image, and table data
    3. Providing a login and admin panel to track information like the number of tokens used by different users
    4. Using React/Node for a better app experience
    5. A complete deployment process

    • @airoundtable
      @airoundtable  6 months ago +1

      Thanks! I am glad that you liked the video.
      These are all great points. For some of them, yes, there will be a video soon, and for some I still have no plan. I am looking into solutions for handling tables and unstructured documents along with images. There are already some solutions out there (the unstructured library and image vector databases) but none of them are practical yet, in my opinion. For instance, the approach that langchain and unstructured propose for processing tables in documents is super slow and not practical. So, I will make a video as soon as I see an approach that can be applied in real-world scenarios.
      The next two videos should be interesting for you. The next one is a multimodal chatbot that uses 5 different models in the background and is able to answer questions about the content of an image as well. The one after is an advanced RAG chatbot that uses a knowledge graph and takes into account more detailed relationships between the content of a document and related chunks.
      3, 4, and 5 have crossed my mind, but I have not planned a video for them yet. I will keep them in mind and think about it after the next two videos.
      Thanks for the suggestions @kunalsatpute8379!

    • @kunalsatpute8379
      @kunalsatpute8379 6 months ago +1

      @@airoundtable Thank you for replying; excited for your videos. One question: will these videos extend or enhance this application, or will they be entirely separate videos?

    • @airoundtable
      @airoundtable  6 months ago +1

      ​@@kunalsatpute8379 That would be an expansion. Besides LLM applications, one of the main ideas behind this series was to walk through all the necessary steps required for an advanced multimodal chatbot. I started by explaining function calling and vector search, and using them I designed multiple projects. The next video will be:
      1. A combination of all the chatbots that I have designed and uploaded on the channel so far (RAG-GPT + connecting the GPT to the search engine + chatting with and summarizing websites)
      2. We will use the concept that I showed in the open-source RAG video for creating a web server to serve models
      3. We will add more abilities: the user can interact with the chatbot by sending voice, text, and images, and the chatbot will respond in voice and text + we can ask questions about a specific image that we uploaded and the chatbot will be able to answer them + we can ask the chatbot to generate images for us as well.
      So it will be an any-to-any chatbot:
      input: voice, text, image
      output: voice, text, image
      functionalities: normal AI chatbot + RAG with documents + RAG with websites + search the web using a search engine + summarize documents + summarize websites + understand images, both for answering questions about them and for generating them
      So, the RAG-GPT project will be one arm of that chatbot, and I am thinking of giving the user the ability to work with around 9 or 10 different Gen AI models (all open-source except GPT). In that video I will just briefly touch on RAG-GPT and the other parts that I have already covered in previous videos; the focus will be on explaining the multimodal side and how the whole chatbot was designed. That is a huge project.

  • @arian2168
    @arian2168 6 months ago +1

    Hello, thanks for the video, but when I try to run the app I get this error:
    "openai.error.InvalidRequestError: Invalid URL (POST /v1/engines/gpt-35-turbo/chat/completions)" because I don't have access to the Azure OpenAI API yet; I'm using the OpenAI API. Would you be able to help me with this? Thank you

    • @airoundtable
      @airoundtable  6 months ago

      Hi Arian. Thanks. I just pinned a message at the top of the comment section where I discussed this issue with @mikew2883. Please read that discussion; I provided all the necessary guidance for changing the OpenAI API call from Azure to OpenAI itself. Let me know if you have any other questions.

    • @arian2168
      @arian2168 6 months ago

      @@airoundtable Thank you for your reply. I made a few additional adjustments, and now it works. Really appreciate your awesome work and the effort you put in. Damet Garm 👊

    • @airoundtable
      @airoundtable  6 months ago

      @@arian2168 happy to hear it. Thanks Arian!

  • @malleshtelagarapu9219
    @malleshtelagarapu9219 7 months ago +1

    Very informative video; I understood all the components involved. Could you please let me know where I should define "OPEN_API_KEY"? Is it in moderation.py, or should this, along with the other parameters, be defined in environment variables? Could you please help me with this?

    • @airoundtable
      @airoundtable  7 months ago +1

      Thanks! To keep important information like passwords or API keys safe, we put them in a special file called .env in our project. This is a common way to handle private settings. Here's a really easy guide to help you set it up:
      1. Make a new file in the main folder of your project and call it .env.
      2. Inside this .env file, you can write your private information like this:
      ```
      OPEN_AI_API_KEY=yourapikeyhere
      ANOTHER_SECRET_KEY=yoursecretkeyhere
      ```
      3. To use these settings in your code, you'll need to add a couple of lines to tell your program to read the .env file. Add these lines at the beginning of your Python script:
      ```
      import os
      from dotenv import load_dotenv
      load_dotenv()  # This tells Python to read your .env file
      ```
      4. Then, whenever you need to use the information from the .env file, you can get it like this:
      ```
      openai_api_key = os.getenv("OPEN_AI_API_KEY")
      another_secret_key = os.getenv("ANOTHER_SECRET_KEY")
      ```
      Replace OPEN_AI_API_KEY and ANOTHER_SECRET_KEY with the actual names you used in your .env file. Now your code can use your private settings safely!

    • @malleshtelagarapu9219
      @malleshtelagarapu9219 7 months ago +1

      Thanks a lot, I will try the above method.

  • @RZOLTANM
    @RZOLTANM 6 months ago +1

    QQ: What version of Python is best here, as there are a lot of packages and hence implicit dependencies?

    • @airoundtable
      @airoundtable  6 months ago

      I have been using Python 3.11 and never faced any issue with RAG projects. I also included a requirements.txt file in the project root directory where you can check the versions of all the libraries that I used for this chatbot.

    • @RZOLTANM
      @RZOLTANM 6 months ago

      Thanks. Apologies if you have already clarified that. I will give it a go with 3.11. @@airoundtable

  • @TechPuzzle_Haven
    @TechPuzzle_Haven 7 months ago +1

    Excellent video. Thanks a lot. Could you please help me get the terminal_q_and_a.py file?

    • @airoundtable
      @airoundtable  7 months ago +1

      Hi, Thanks! Sure, I will update the repository in about an hour and push that file in it. I saw you opened an issue on Github as well, so I will inform you there when the file is added.

    • @TechPuzzle_Haven
      @TechPuzzle_Haven 7 months ago +1

      @@airoundtable I got the file. Thanks a lot for your quick response and excellent video

    • @airoundtable
      @airoundtable  7 months ago

      @@TechPuzzle_Haven You're welcome! Thanks!

  • @Preds23
    @Preds23 5 months ago

    This is a fantastic tutorial. Nice work!
    Question: sporadically, the chat history and retrieved content are appended to the chatbot output along with the question/response. Not sure if it's related, but it seems to only happen when I adjust (add detail to) the prompt instructions. Any idea why? Thanks.

    • @airoundtable
      @airoundtable  5 months ago

      Thanks! I am glad that you liked the video.
      I am not sure if I understood your point correctly. The chat history is appended to the model's input on every query (along with the retrieved content and the user's query itself). Therefore, the GPT model should always have access to the chat history. But in case the performance is not consistent, the source can be:
      1. The context length that the GPT model received. GPT 3.5 has a 4096-token limit, and based on my experience, if it receives something over 3000 tokens there is a great chance that you see degradation in performance. In case you are interested, have a look at minute 8 of the Open Source RAG Chatbot video for a more technical explanation of this aspect.
      2. Format of the input: in case the chunk sizes are so small that the input does not create an understandable context, the GPT model can get confused as well.
      3. Model system role: the system role should clearly guide the GPT model to understand each part of the input context. A vague system role can confuse the GPT model and therefore affect the usage of the chat history.
      Please let me know if this answers your question.

    • @Preds23
      @Preds23 5 months ago +1

      @@airoundtable Sorry for the confusion. After more testing, I don't think adjusting the prompt is the issue. Sporadically, the chat history, retrieved content (similar to the references sidebar), source text, and the input query are displayed in the output of the Gradio chatbot. Is this normal functionality? It doesn't happen on all questions, though; some just return the response. Thanks again for the interaction.

    • @airoundtable
      @airoundtable  5 months ago

      ​@@Preds23 No worries. Now I think I got your point, which is a good one indeed. On some occasions the GPT model also provides the source with the response, and in some cases you don't see it unless you click on references. Here is why:
      That behavior is due to the context length that I mentioned in the previous message. If the context that the GPT model receives is not too long (in a way that overwhelms the model), the model is able to pick up that piece of instruction and show the source in the output. But if the context length that the model receives is long (chat history + user prompt + retrieved content), GPT 3.5 is not powerful enough to follow all the details. So, it will only focus on the main body of the instruction and miss the part where it should also provide the source. However, as you noticed, you can always see the reference and the chunks in the reference bar. But in case you want a more consistent behavior from GPT on that matter:
      1. Use GPT 4 instead of GPT 3.5. With the current config, GPT 4 will almost always return the source at the end.
      or
      2. Reduce the size of the chunks and the amount of chat history injected into the model. To do that, you have to make sure the change is not so drastic that it downgrades the model's behavior, but to some extent you can make the model's input a bit shorter so the model can pick up all the instruction's details.
      To put it more simply, think of it this way: GPT 3.5 has a 4096-token limit. If you pass it an input with 3500 tokens, the model will focus more on the beginning and end of the input and start to forget (or ignore) what was said in the middle section. If you pass an input with 2000 tokens, the model can understand and follow all the instructions nicely without any issue. This is an intrinsic characteristic of all LLMs.
      Hope this helps you understand the problem.
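
A rough sketch of the trimming idea above: drop the oldest history turns until the prompt fits the model's window. The "1 token ≈ 4 characters" heuristic and the function names are assumptions for illustration; a real implementation would count tokens with a proper tokenizer such as tiktoken:

```python
# Crude token estimate: assume roughly 4 characters per token.
def approx_tokens(text: str) -> int:
    return max(1, len(text) // 4)

def trim_history(history, retrieved, user_query, limit=4096, reserve=512):
    """Drop the oldest history turns until the prompt fits the model's window,
    keeping `reserve` tokens free for the model's answer."""
    budget = limit - reserve - approx_tokens(retrieved) - approx_tokens(user_query)
    kept, used = [], 0
    for turn in reversed(history):          # keep the most recent turns first
        cost = approx_tokens(turn)
        if used + cost > budget:
            break
        kept.append(turn)
        used += cost
    return list(reversed(kept))             # restore chronological order

history = [f"turn {i}: " + "x" * 4000 for i in range(5)]
short = trim_history(history, retrieved="some chunks", user_query="hi")
print(len(short))  # fewer than the original 5 turns fit the 4096-token budget
```

The same idea applies to the retrieved chunks: shrinking either side of the input leaves the model more room to follow the full instruction, including the "cite the source" part.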

  • @agep13
    @agep13 7 months ago +1

    Hi, could you give me the .env settings you use: openai_api_type, openai_api_base, and openai_api_version? I am struggling to make it work. Thanks. Great video, btw.

    • @airoundtable
      @airoundtable  7 months ago

      Thank you for your positive feedback @agep13.
      Regarding the .env settings, it's important to note that these often contain sensitive information, such as API keys, which should be kept private and not shared openly to ensure security. However, I can certainly guide you on setting up your own .env file.
      For the openai_api_type, openai_api_base, and openai_api_version, you'll need to consult the official documentation provided by OpenAI or Azure to determine the correct values. These details are typically available in the API or developer section of your account dashboard.
      Here's a basic template for what your .env file might include:
      OPENAI_API_KEY=your_unique_api_key
      OPENAI_API_TYPE=the_type_of_api_you_are_using
      OPENAI_API_BASE=azure_open_ai_endpoint_url
      OPENAI_API_VERSION=version_number
      Please ensure you replace the placeholders with your actual API key and the appropriate values for your use case. The OPENAI_API_TYPE will depend on the API service you've subscribed to (e.g., GPT in the video.), while the OPENAI_API_BASE and OPENAI_API_VERSION are generally standard URLs used for accessing OpenAI's API.
      Remember to keep your .env file secure and avoid uploading it to public repositories to prevent any unauthorized use of your API keys.

    • @ShadowScales
      @ShadowScales 7 months ago +1

      @dtable Hello, great video. I know the OPENAI_API_KEY is in your OpenAI account, and openai_api_type would be something like GPT 3.5 or 4, but for the rest I'm still confused about where to get the base and version. How do I check which URLs to use?

    • @airoundtable
      @airoundtable  7 months ago

      Hi @@ShadowScales, thanks! Your confusion is understandable, because to use GPT models from OpenAI directly, you won't need to insert the endpoint and the api_version. You probably missed that I am using the GPT model from Microsoft Azure; that is why I have 4 credentials for it. To understand how to adjust the code to use the OpenAI API directly, please read the comments under @mikew2883 down below (it is the comment with the 7 replies). There we had a full discussion on how to properly modify the project. I hope this helps. In case you have any questions along the way, please let me know.

  • @sujit5013
    @sujit5013 3 months ago

    Is the code no longer available?

    • @airoundtable
      @airoundtable  3 months ago

      It is available:
      github.com/Farzad-R/LLM-Zero-to-Hundred/tree/master/RAG-GPT

  • @pschoro
    @pschoro 7 months ago +1

    not working anymore....AttributeError: 'NoneType' object has no attribute 'lower'

    • @airoundtable
      @airoundtable  7 months ago +2

      Hi @pschoro,
      I just created a new environment, installed the libraries, and tested the project. It works fine. The error that you sent is not complete, and I cannot tell what the problem is exactly. However, my guess is that there is a conflict in one of your libraries. I just added a new "requirements.txt" in the RAG-GPT folder. I recommend creating a new environment and installing the libraries using that file. I tested it with Python 3.11.7 on Windows 11 and RAG-GPT worked without any issue. Hope it helps you solve the problem.

    • @pschoro
      @pschoro 7 months ago +1

      Unfortunately... still the same error, even on 3.11, when I try to submit a document or when I try to start chatting...
      To create a public link, set `share=True` in `launch()`.
      Document length: 59
      Generating the summary..
      Traceback (most recent call last):
      File "c:\zone\REPO\rag LLM\LLM-Zero-to-Hundred\RAG-GPT\venv\Lib\site-packages\gradio\routes.py", line 567, in predict
      output = await route_utils.call_process_api(
      File "c:\zone\REPO\rag LLM\LLM-Zero-to-Hundred\RAG-GPT\venv\Lib\site-packages\gradio\route_utils.py", line 232, in call_process_api
      output = await app.get_blocks().process_api(
      File "c:\zone\REPO\rag LLM\LLM-Zero-to-Hundred\RAG-GPT\venv\Lib\site-packages\gradio\blocks.py", line 1561, in process_api
      result = await self.call_function(
      File "c:\zone\REPO\rag LLM\LLM-Zero-to-Hundred\RAG-GPT\venv\Lib\site-packages\gradio\blocks.py", line 1179, in call_function
      prediction = await anyio.to_thread.run_sync(
      File "c:\zone\REPO\rag LLM\LLM-Zero-to-Hundred\RAG-GPT\venv\Lib\site-packages\anyio\to_thread.py", line 33, in run_sync
      return await get_asynclib().run_sync_in_worker_thread(
      File "c:\zone\REPO\rag LLM\LLM-Zero-to-Hundred\RAG-GPT\venv\Lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
      return await future
      File "c:\zone\REPO\rag LLM\LLM-Zero-to-Hundred\RAG-GPT\venv\Lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
      result = context.run(func, *args)
      File "c:\zone\REPO\rag LLM\LLM-Zero-to-Hundred\RAG-GPT\venv\Lib\site-packages\gradio\utils.py", line 678, in wrapper
      response = f(*args, **kwargs)
      File "c:\zone\REPO\rag LLM\LLM-Zero-to-Hundred\RAG-GPT\src\utils\upload_file.py", line 39, in process_uploaded_files
      final_summary = Summarizer.summarize_the_pdf(file_dir=files_dir[0],
      File "c:\zone\REPO\rag LLM\LLM-Zero-to-Hundred\RAG-GPT\src\utils\summarizer.py", line 74, in summarize_the_pdf
      full_summary += Summarizer.get_llm_response(
      File "c:\zone\REPO\rag LLM\LLM-Zero-to-Hundred\RAG-GPT\src\utils\summarizer.py", line 110, in get_llm_response
      response = openai.ChatCompletion.create(
      File "c:\zone\REPO\rag LLM\LLM-Zero-to-Hundred\RAG-GPT\venv\Lib\site-packages\openai\api_resources\chat_completion.py", line 25, in create
      return super().create(*args, **kwargs)
      File "c:\zone\REPO\rag LLM\LLM-Zero-to-Hundred\RAG-GPT\venv\Lib\site-packages\openai\api_resources\abstract\engine_api_resource.py", line 149, in create
      ) = cls.__prepare_create_request(
      File "c:\zone\REPO\rag LLM\LLM-Zero-to-Hundred\RAG-GPT\venv\Lib\site-packages\openai\api_resources\abstract\engine_api_resource.py", line 80, in __prepare_create_request
      typed_api_type = cls._get_api_type_and_version(api_type=api_type)[0]
      File "c:\zone\REPO\rag LLM\LLM-Zero-to-Hundred\RAG-GPT\venv\Lib\site-packages\openai\api_resources\abstract\api_resource.py", line 169, in _get_api_type_and_version
      else ApiType.from_str(openai.api_type)
      File "c:\zone\REPO\rag LLM\LLM-Zero-to-Hundred\RAG-GPT\venv\Lib\site-packages\openai\util.py", line 35, in from_str
      if label.lower() == "azure":
      AttributeError: 'NoneType' object has no attribute 'lower'
      @@airoundtable

    • @airoundtable
      @airoundtable  7 months ago +2

      Now it is more clear @@pschoro. This error is not from the chatbot's code; it is an AttributeError occurring in openai's backend. The chatbot uses Azure to access OpenAI's models, and my guess is that you are either using OpenAI's GPT model directly from OpenAI, or, if you are using Azure, you haven't set the credentials properly. So, depending on where you are taking the GPT model from, the solution will be different. If you are using Azure, ensure that all required environment variables or configurations related to the OpenAI API (like the API key and the API type) are set. If you are using the GPT model directly from OpenAI, you need to both set the credentials and change the code for generating the response from the model. In the previous comments under mikew2883, I explained these steps in detail. I hope this helps you solve the problem.

  • @kunalsatpute8379
    @kunalsatpute8379 6 months ago +1

    Hi,
    I am using a Windows machine and getting an error while running the upload_data_manually.py file.
    It gives me RuntimeError: your system has an unsupported version of sqlite3. Chroma requires sqlite3 >= 3.35.0.
    Then I checked the sqlite version using
    sqlite3 -version, and it shows me 3.41.2, which is greater than 3.35 🤔

    • @airoundtable
      @airoundtable  6 months ago

      Hi, please try to replicate the project using the exact library versions that I included in requirements.txt and let me know if the problem appears again.

    • @kunalsatpute8379
      @kunalsatpute8379 6 months ago +1

      @@airoundtable I downloaded the sqlite3 DLL files and copied them to the Python installation directory. This resolved the issue.
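
For anyone hitting the same mismatch: Chroma checks the SQLite library that Python itself links against, not the standalone sqlite3 CLI, which is why `sqlite3 -version` can look fine while Chroma complains. A quick sketch to see the version Python actually uses:

```python
import sqlite3

# Chroma checks the SQLite bundled with Python, not the sqlite3 CLI binary,
# so `sqlite3 -version` can report a newer version than the one Python loads.
print(sqlite3.sqlite_version)  # version of the SQLite library Python links against

parts = sqlite3.sqlite_version.split(".")
major, minor = int(parts[0]), int(parts[1])
if (major, minor) < (3, 35):
    # Chroma requires sqlite3 >= 3.35.0; replacing the DLL that Python loads
    # (as described in the comment above) is one way to fix this on Windows.
    print("SQLite too old for Chroma; upgrade the DLL that Python loads.")
```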

  • @alikhalili9961
    @alikhalili9961 7 months ago +3

    This is an incredibly informative and well-structured video! The detailed breakdown of the RAG-GPT chatbot, along with the time-stamped sections, makes it easy to navigate and understand. The inclusion of real-time document uploads and summarization requests showcases the versatility of this chatbot. The GitHub links and references to the main libraries used are very helpful for those who want to delve deeper. Keep up the great work! Looking forward to more content like this. 👏👏👏

    • @airoundtable
      @airoundtable  7 months ago +2

      Thanks ! Glad you liked the video!

    • @usamaahmed8075
      @usamaahmed8075 5 months ago

      @@airoundtable Can you help me? I am facing an issue with the db: when I run the server it says the uploaded db does not exist.

    • @airoundtable
      @airoundtable  5 months ago

      @@usamaahmed8075 Hi. What do you mean by the uploaded db? If it says the vectordb does not exist, it is because you have to run this module first: upload_data_manually.py
      I explained it completely in the video. If you are planning to upload a document and then start chatting with it, you need to first upload it using the upload button and make sure that you see the confirmation message on the screen.

  • @martiancoders1518
    @martiancoders1518 8 months ago +2

    Can I use any other model? Or can you point me to the section where I can use a gguf file?

    • @airoundtable
      @airoundtable  8 months ago +2

      Sure, you can use other LLMs, but you have to modify the code for that. The code currently uses the OpenAI GPT 3.5 model (with API calls) for inference. If you want to change the LLM and run the code:
      1. Use a powerful LLM for good performance (consider the context length, chunk sizes, and the instructions that you want to give to the model, along with the computational power that you have at hand)
      2. You need to change the code wherever it gets the response from the GPT model, which happens in two locations:
      a. src.utils.chatbot.py - response function - lines 68 to 72
      b. src.utils.summarizer - get_llm_response function
      3. Depending on the model that you are using, you may need to process the response in a different way as well (e.g., llama2's output will contain both the query and the response, along with some special characters that need to be processed for a neat user experience).
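
A sketch of point 2: putting the "get a response" call behind one small interface, so both call sites (chatbot.py and summarizer.py) change in one place when the model is swapped. The class and function names here are illustrative, not the project's actual API:

```python
from typing import Protocol

class LLMBackend(Protocol):
    """Anything that can turn a system role + prompt into a response string."""
    def generate(self, system_role: str, prompt: str) -> str: ...

class EchoBackend:
    """Stand-in backend for illustration; a real one would call OpenAI,
    llama.cpp, a local gguf model, etc."""
    def generate(self, system_role: str, prompt: str) -> str:
        return f"[{system_role}] {prompt}"

def get_llm_response(backend: LLMBackend, system_role: str, prompt: str) -> str:
    raw = backend.generate(system_role, prompt)
    # post-processing hook: e.g. llama2 echoes the query and adds special
    # tokens, which would be stripped here before reaching the UI
    return raw.strip()

print(get_llm_response(EchoBackend(), "summarizer", "summarize page 1"))
# → [summarizer] summarize page 1
```

With this shape, swapping the model means writing one new backend class; the rest of the pipeline is untouched.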

    • @martiancoders1518
      @martiancoders1518 8 months ago +1

      @@airoundtable excellent, will give it a try. Thank you 💯💯💯

  • @neelamyadav533
    @neelamyadav533 5 months ago

    My URL for the UI is generated, but nothing is displayed. I checked that the vector db is also created under the chroma folder for the documents already stored under the docs folder. I am using Azure OpenAI credentials. What could be the reason?

    • @airoundtable
      @airoundtable  5 months ago

      It is hard to tell without seeing the traceback of the problem. Whatever is happening, you can see it in the terminal. In case the problem is not solved yet, feel free to open an issue on the GitHub repository and post the traceback there.

  • @tobiasbuchmann6972
    @tobiasbuchmann6972 6 months ago +1

    Thanks for the great tutorial! Just out of interest, would it also be possible to use streamlit as a user interface or are there any technical issues? Thanks again.

    • @airoundtable
      @airoundtable  6 months ago +1

      Thanks for the feedback! I am glad that you liked the video. Sure, you can use Streamlit as well. In my opinion, using Streamlit is a bit easier than Gradio. One of the goals of this series was to show how to use Streamlit, Gradio, and Chainlit; I used each of them in a separate video. If you check the channel, you will see a chatbot that I designed with Streamlit:
      "Connect a GPT agent to duckduckgo search engine".
      Feel free to reach out if you have any other questions.

    • @tobiasbuchmann6972
      @tobiasbuchmann6972 6 months ago +1

      @@airoundtable Thanks! You also mentioned the issue of data flow management. Let's assume that I upload ten documents in advance to the database, and then I have another one that I upload while using the chatbot. Will the chatbot use all eleven documents to answer my question? Thanks again for your help!

    • @airoundtable
      @airoundtable  6 months ago +1

      That is a good question @@tobiasbuchmann6972. No, this chatbot treats the documents that were prepared in advance differently from the ones that you upload while using the chatbot. So, in your example, it creates an index for those 10 documents that you preprocessed earlier, and a separate index for the single document that you passed to it while using it. Let's also say that while using the UI you upload documents in multiple steps: every time you upload a new set of documents, it creates a new index for them and points the chatbot to the most recent index. Finally, whenever you run the UI, it makes sure that all the indexes created for documents uploaded during the previous user session are removed, cleaning up the disk and giving you a fresh start.
      But this is just one way of doing it. At the end of the day, all these functionalities can be adjusted based on your needs.
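
The session behavior described above can be sketched roughly like this; the directory layout and names are made up for illustration, not the project's actual paths:

```python
import shutil
import uuid
from pathlib import Path

# Hypothetical location for session-scoped upload indexes.
UPLOAD_ROOT = Path("data/vectordb/uploaded")

def fresh_start(root: Path = UPLOAD_ROOT) -> None:
    """Remove indexes left over from a previous session (fresh start on launch)."""
    shutil.rmtree(root, ignore_errors=True)
    root.mkdir(parents=True, exist_ok=True)

def new_upload_index(root: Path = UPLOAD_ROOT) -> Path:
    """Create a separate index directory for this batch of uploaded documents;
    the chatbot would then be pointed at the directory returned here."""
    index_dir = root / f"index_{uuid.uuid4().hex[:8]}"
    index_dir.mkdir(parents=True)
    return index_dir

fresh_start()
first = new_upload_index()
second = new_upload_index()   # a later upload batch gets its own fresh index
print(second != first)        # each upload batch is isolated from the others
```

The preprocessed documents would live in a separate, permanent index that `fresh_start` never touches, which is what keeps the two document sources independent.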

  • @tonysingh9426
    @tonysingh9426 5 months ago +1

    Hello there, I am running into the following error: The API deployment for this resource does not exist. If you created the deployment within the last 5 minutes, please wait a moment and try again.

    • @airoundtable
      @airoundtable  5 months ago

      Hello. The project uses Azure OpenAI. Therefore, it needs access to the API key, endpoint, and deployment name of the GPT model and the embedding model in Azure, and for that you need to deploy them in Azure OpenAI Studio first. This error arises in two scenarios:
      1. The model has not been deployed yet
      2. The model is deployed, but the deployment name that was passed to the chatbot is not the same as the deployment name that was used in the OpenAI Studio.
      In case you want to use OpenAI directly instead of Azure OpenAI, please read the pinned comment for a step-by-step description of the required changes.

    • @tonysingh9426
      @tonysingh9426 5 months ago +1

      @@airoundtable Thank you. How do I pass the deployment name to the chatbot?

    • @airoundtable
      @airoundtable  5 months ago

      @@tonysingh9426 Create a file in the project directory and name it .env.
      In there, create these arguments:
      OPENAI_API_TYPE=azure
      OPENAI_API_VERSION=
      OPENAI_API_KEY=
      OPENAI_API_BASE=
      gpt_deployment_name=
      embed_deployment_name=
      Also, in the config folder, in llm_config, the GPT deployment name is set in the engine argument; change that as well.
      Then run the chatbot. The project automatically loads the .env file and extracts all this information from it.
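
A sketch of how those .env arguments end up in the code, using the legacy openai 0.x style configuration shown in the video; the version string and deployment name below are placeholders, not values from the project:

```python
import os

# Placeholders standing in for values loaded from the .env file described above.
os.environ.setdefault("OPENAI_API_TYPE", "azure")
os.environ.setdefault("OPENAI_API_VERSION", "2023-05-15")         # example only
os.environ.setdefault("gpt_deployment_name", "my-gpt35-deployment")  # example only

cfg = {
    "api_type": os.getenv("OPENAI_API_TYPE"),
    "api_version": os.getenv("OPENAI_API_VERSION"),
    "api_key": os.getenv("OPENAI_API_KEY"),       # stays None until you set it
    "api_base": os.getenv("OPENAI_API_BASE"),     # the Azure endpoint URL
    "engine": os.getenv("gpt_deployment_name"),   # must match Azure OpenAI Studio
}
print(cfg["api_type"], cfg["engine"])
```

The key point is the last comment: the `engine` value the chatbot sends must be exactly the deployment name you chose in Azure OpenAI Studio, or you get the "deployment does not exist" error above.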

    • @tonysingh9426
      @tonysingh9426 5 months ago +1

      Thank you, sir

    • @tonysingh9426
      @tonysingh9426 5 months ago +1

      @@airoundtable Do I need a second deployment for the embedding model? And if so, should it be the same engine as the gpt_deployment?

  • @zlatomirradev3030
    @zlatomirradev3030 9 months ago +1

    Nice project, thank you very much for the great content sir👏👏👏

    • @airoundtable
      @airoundtable  9 months ago +1

      Glad to see that you liked the project 😉

  • @AlonAvramson
    @AlonAvramson 7 months ago +1

    Thank you! great video and explanations!

    • @airoundtable
      @airoundtable  7 months ago

      Thanks for the feedback! I am glad to hear that you enjoyed the video!

  • @OBRosewell
    @OBRosewell 5 months ago

    dude!!!! INSANE!! such a good tutorial. you rock.
    My one question is about credits: will the vector function save credits? E.g., I want to build a legal document reader & Q&A tool. Some docs are 100 pages long. Won't each doc cost hundreds in API credits? Or is that what vectorization & DBs are for?

    • @airoundtable
      @airoundtable  5 months ago

      Thanks! I am glad you liked the video. Storing the vectorized documents in a vector DB can definitely save costs, and it would not be efficient to build such a system without vector databases (especially since you can use many of them for free). About the pricing itself, vectorization is not a very costly task unless you are dealing with thousands of documents. According to the OpenAI documentation, text-embedding-3-large is priced at $0.00013 / 1k tokens. Keep in mind that text-embedding-3-large is their newest and most expensive embedding model (also more expensive than the one that I used in the video).
      Source: openai.com/blog/new-embedding-models-and-api-updates
      So, for a document with 100 pages that you split into, say, 500 chunks, you will make 500 API calls to this model, and it will roughly cost you around $0.026 (approximate).
      I considered each chunk to be around 1500 characters, which makes it around 400 tokens, so about 200k tokens in total.
      Also, keep in mind that you can manage the vector DBs by editing them (adding/removing documents) rather than re-creating them on every small change.
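
The back-of-the-envelope estimate in the reply above can be checked with a few lines of Python. This is only a quick sketch: the chunk count, tokens per chunk, and price are the approximations quoted in the reply, not measured values.

```python
# Back-of-the-envelope embedding cost, using the numbers from the reply above:
# 500 chunks of ~1500 characters (~400 tokens each), embedded with
# text-embedding-3-large at $0.00013 per 1k tokens.
def embedding_cost_usd(num_chunks: int, tokens_per_chunk: int,
                       price_per_1k_tokens: float) -> float:
    """Approximate cost of embedding all chunks, in US dollars."""
    total_tokens = num_chunks * tokens_per_chunk
    return total_tokens / 1000 * price_per_1k_tokens

cost = embedding_cost_usd(num_chunks=500, tokens_per_chunk=400,
                          price_per_1k_tokens=0.00013)
print(f"${cost:.4f}")  # → $0.0260
```

With these numbers the whole document is about 200k tokens, so even a 100-page PDF costs only a few cents to embed once, after which the vectors can be reused from the database for free.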

  • @vectorautomationsystems
    @vectorautomationsystems 6 months ago

    Thank you very much for this. Very concise in your explanations. Any chance this can be done without OpenAI, using a local LLM like Ollama instead?

    • @airoundtable
      @airoundtable  6 months ago +1

      Absolutely, please check out my latest video, called:
      Open Source RAG Chatbot with Gemma and Langchain | (Deploy LLM on-prem)
      I took RAG-GPT and replaced the GPT models with Google Gemma 7B. I also replaced OpenAI's embedding model with an open-source model. I would never suggest using a 7B LLM for deployment, but my main goal was to show how you can have the same pipeline (RAG-GPT) with an open-source model on-prem.
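
The swap this reply describes (same RAG pipeline, different model backend) can be sketched as follows. All names here (`LLMBackend`, `OpenAIBackend`, `LocalBackend`, `rag_answer`) are illustrative, not from the RAG-GPT repo, and the backends are stubs standing in for real API calls:

```python
from typing import Protocol

class LLMBackend(Protocol):
    """Anything that can turn a prompt into an answer."""
    def generate(self, prompt: str) -> str: ...

class OpenAIBackend:
    def generate(self, prompt: str) -> str:
        # In the real pipeline this would call the OpenAI chat completions API.
        return f"[gpt] {prompt[:40]}..."

class LocalBackend:
    def generate(self, prompt: str) -> str:
        # In the real pipeline this would call a locally hosted open-source
        # model (e.g. Gemma served on-prem).
        return f"[local] {prompt[:40]}..."

def rag_answer(question: str, retrieved_chunks: list[str], llm: LLMBackend) -> str:
    """Same retrieval and prompt assembly regardless of which backend is used."""
    context = "\n\n".join(retrieved_chunks)
    prompt = f"Answer using this context:\n{context}\n\nQuestion: {question}"
    return llm.generate(prompt)

# Only the last argument changes when moving from GPT to an open-source model:
print(rag_answer("What is RAG?", ["chunk one", "chunk two"], LocalBackend()))
```

Keeping the retrieval and prompt logic behind one narrow interface is what makes this kind of model swap a small change rather than a rewrite.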

  • @thaukhoorz
    @thaukhoorz 5 months ago

    Thank you so much for this! Excellent instructions, excellent documentation.

    • @airoundtable
      @airoundtable  5 months ago

      You're very welcome! I am glad it was helpful

  • @damianoiucci309
    @damianoiucci309 5 months ago

    Does OpenAI need an API key, whether the use is commercial or not? And is it possible to use an LLM other than third-party AI services?

    • @airoundtable
      @airoundtable  5 months ago

      OpenAI GPT and embedding models need an API key. If I got your point correctly, you want to use open-source LLMs. You absolutely can. Please check the Open Source RAG video on my channel. That is almost the identical project with open-source models.
      ruclips.net/video/6dyz2M_UWLw/видео.htmlsi=W9dW4JNbC2KH_tHs

  • @deborahjamesmathew
    @deborahjamesmathew 1 month ago

    Thank you so much for the informative video

    • @airoundtable
      @airoundtable  1 month ago

      Thanks! I am glad it was helpful

  • @Krishna-p6r2h
    @Krishna-p6r2h 6 months ago

    Thank u Sir🙏

  • @rajasengupta1125
    @rajasengupta1125 7 months ago +2

    This is the most comprehensive RAG tutorial video I have seen on RUclips. What a great effort and command over the subject, sir!
    I am from a low-code business analyst background, so I heavily depend on Copilot to guide me on Python script functionality.
    Still, I was able to set up the system as explained by you on my local PC; however, I am getting this error when executing python src\raggpt_app.py:
    "
    import pwd
    ModuleNotFoundError: No module named 'pwd'
    "
    Can you guide me on what I am missing?
    Many thanks

    • @airoundtable
      @airoundtable  7 months ago +1

      Thanks! I am happy to hear that you liked the video! It is a bit difficult for me to debug that code without the whole traceback. What operating system are you using? and can you send me the full error?

    • @duynguyen-kn6tg
      @duynguyen-kn6tg 7 months ago

      @@airoundtable
      I have the same error; I am using Windows 10:
      ...
      File "D:\Install\PyThon\lib\site-packages\langchain\document_loaders\__init__.py", line 18, in <module>
      from langchain_community.document_loaders.acreom import AcreomLoader
      File "D:\Install\PyThon\lib\site-packages\langchain_community\document_loaders\__init__.py", line 163, in <module>
      from langchain_community.document_loaders.pebblo import PebbloSafeLoader
      File "D:\Install\PyThon\lib\site-packages\langchain_community\document_loaders\pebblo.py", line 5, in <module>
      import pwd
      ModuleNotFoundError: No module named 'pwd'
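
Some context on this traceback: pwd is a POSIX-only module in Python's standard library (it reads the Unix user database), so it does not exist on Windows at all; importing it there, as langchain_community's pebblo.py does here, raises ModuleNotFoundError. A small check you can run on any machine:

```python
# `pwd` ships with CPython only on POSIX systems (Linux/macOS). On Windows the
# module is simply absent, which is what the ModuleNotFoundError above means.
import importlib.util
import os

def pwd_available() -> bool:
    """True if the POSIX-only `pwd` module exists on this platform."""
    return importlib.util.find_spec("pwd") is not None

# Prints "posix True" on Linux/macOS and "nt False" on Windows.
print(os.name, pwd_available())
```

If you hit this on Windows, upgrading langchain and langchain-community may help, since later releases guard this import; that is a general observation, though, not a fix verified against this exact setup.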

    • @rajasengupta1125
      @rajasengupta1125 7 months ago

      Thanks for your kind reply. I am using Windows 10.
      In VS Code I have created the virtual env, installed all libraries, and followed all instructions as given in the readme (for the RAG-GPT application).
      When I initialize python serve.py it is OK:
      PS F:\Users\XYZ\Desktop\projects\LLM-Zero-to-Hundred\RAG-GPT> cd src
      PS F:\Users\XYZ\Desktop\projects\LLM-Zero-to-Hundred\RAG-GPT\src> python serve.py
      Serving at port 8000
      But when I initialize python raggpt_app.py (to launch Gradio), I get the following error (truncated):
      PS F:\Users\XYZ\Desktop\projects\LLM-Zero-to-Hundred\RAG-GPT> cd src
      PS F:\Users\XYZ\Desktop\projects\LLM-Zero-to-Hundred\RAG-GPT\src> python raggpt_app.py
      Traceback (most recent call last):
      File "F:\Users\XYZ\Desktop\projects\LLM-Zero-to-Hundred\RAG-GPT\src\raggpt_app.py", line 25, in <module>
      from utils.upload_file import UploadFile
      File "F:\Users\XYZ\miniconda3\lib\site-packages\langchain_community\document_loaders\pebblo.py", line 5, in <module>
      import pwd
      ModuleNotFoundError: No module named 'pwd'
      PS F:\Users\XYZ\Desktop\projects\LLM-Zero-to-Hundred\RAG-GPT\src>
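
A side note on the python serve.py step in the log above (the one that prints "Serving at port 8000"): a static file server like that takes only a few standard-library lines. This is a sketch of the idea, not the actual serve.py from the RAG-GPT repo, which may do more:

```python
# Minimal static file server on port 8000, in the spirit of the
# "Serving at port 8000" message above (not the repo's actual serve.py).
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer

def make_server(port: int = 8000, directory: str = ".") -> ThreadingHTTPServer:
    """Build (but do not start) a server for files under `directory`."""
    handler = partial(SimpleHTTPRequestHandler, directory=directory)
    return ThreadingHTTPServer(("", port), handler)

if __name__ == "__main__":
    server = make_server()
    print(f"Serving at port {server.server_address[1]}")
    server.serve_forever()
```

Passing port=0 instead lets the OS pick any free port, which is handy when 8000 is already taken.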

  • @ojasvisingh786
    @ojasvisingh786 6 months ago

    🎉❤