🐱 GitHub repository: github.com/alejandro-ao/chat-with-websites 🔥 Join the LangChain Master Program (early access): link.alejandro-ao.com/langchain-mastery 💬 Ask your questions in our Discord Server (but please leave a comment here too for engagement): link.alejandro-ao.com/981ypA ❤ Buy me a coffee (thanks): link.alejandro-ao.com/YR8Fkw
when fine tuning a model : it seems that it is adjusting the model weights based on the input and expected output? that would mean the brain is @Open? so its basically live when taking to the brain in training does this effect its weights ? can only the trainer effect the weights of the model ? i expect this is happening in memory? based on Positive and Negative feedback in Chat could we not be talking to the brain and teaching it and adjusting its weights on the fly? Also (sorry) When using the Rag type systems are the documents being tokenized using the tokenizer from the model ? (i know it locks the database down a bit) , as then we could consider the local rag as the working memory system ? and the llm as the long term memory system , should there be a bridge between the database and trainer so that it could essentially update the Longterm memory periodically releasing the local rag data? ie essentially training a lora to be applied (or merged) ... hence the llm should have a lot of loras from each interval of updates or training Or not if the strategy is full merge?
can i just say you make this whole thing bearable because your voice delivery is on point i won't go anywhere else i'm gonna learn everything here what a bless
You are an amazing mentor. The way you explain is far better than university professors. Hope to learn more from you. Make more videos on this kind of AI app Development, Please.
Wow, this video is fantastic! 🤩 I’ve been wanting to create a chatbot for a while, and your step-by-step guide made it so easy to follow. 🙌 Now I can build my own chatbot to chat with any website-how cool is that? 😎 Thanks for sharing your knowledge! 👍🔥 Keep up the great work! 👏😊
hey Alejandro, thanks for your video! This project is my first side project! I reaally appropriate your amazing job! If you are looking for the idea for next tutorial, the text-to-mindmap maybe a good idea!
hey there! i'm glad you enjoyed the project. that sounds like a fun idea, I will probably me doing something like that in the future. Maybe something using knowledge graphs?
Awesome video. What extensions do u use in VSCode? Seems to be very helpful. Also, can you please show how to input many data sources as an input. i.e many pages of a website
Excellent video. But just for suggestion can you make video on how do we deploy the same code using some microservices like fastapi? As most of your videos are using streamlit ( I actually learned a lot about streamlit 😅) but in case of simple app deployment on even localhost with fastapi or flask will be very helpful.
hey, in case you haven't seen it. here's the video on how to put this to production in a simple, free way: ruclips.net/video/74c3KaAXPvk/видео.htmlsi=xfL4RZuDTb3H4rgr
Thanks for sharing! Suggestion: It would be great if the app would also include a way for the user to see the text chunks used to generate the answer, so people can check that the answer is accurate and not either hallucinated or taken from knowledge the AI has from elsewhere. Unfortunately, I've seen too many RAG apps where either answers are inaccurate or the LLM is responding based on other knowledge and not the documents it's supposed to use.
hey there Sharon! thanks for the suggestion! i will mention how to do this in a future video. it has changed a little since the introduction of LCEL, so it would be a very good topic to cover :)
Thank you Alejandro for your clear and insightful tutorials. They are much appreciated. I remade your full solution and it's working like a charm. However, when I ask it a question that's outside of the context of the webpage (like: What's the capital of France?, for example), it still answers. How can I ground it to answer only from the context? i tried to add the following to the system prompt of the get_conversational_rag_chain: ("system", "Answer the user question based on the below context. If the context does not contain the answer, do not make up an answer and respond with \"I am sorry I cannot answer this out-of-context question!\":
Context:{context}") However, it still answers irrelevant questions. What do you think?
Thank you for the great explanation. 🙏 I have two questions: 1. At some point the context window of the llm will be exceeded, how can I deal with that? 2. You used the llm with the context retrieval chain to get the query. Is there a way to get the query without using the llm? In this way the app would be a bit faster and some extra costs would be avoided. Thanks in advance
Hey there, thanks for these questions. And sorry about the late reply, i was sick last week. 1. Yeah, as soon as your conversation becomes too long, you will exceed the context window. To fix this, you can use one of LangChain's memory classes. In short, they allow you to summarize the history of a conversation instead of sending the entire history in your prompt. Something like: "we are talking about .... and this and that was considered. Now answer the following question based on this context...". There are several ways to summarize a conversation (entity memory, buffer memory, etc) Here are the docs: python.langchain.com/docs/modules/memory/types/ 2. Good point. You could make your own chain that will make its own similarity search. But the problem here is that if your conversation is too long, then your similarity search will not be as efficient as if you had summarized it with a query beforehand. I suppose you can test both and see which one works best for your app. Because you're right, we are basically making 2 LLM requests for every user query here. Let me know if this helped!
@@alejandro_ao no worries at all! I hope you're feeling better now. Yeah this answers my questions. Using langchain's memory classes would be better than just asking the user to start a new chat. I'll take a look at the documentation. Thank you and have a great day 🙏 Looking forward for your new videos :)
Question - If retrieval chain is already finding the most relevant documents chunks based on conversation history and user's input and passing it through the {context}, what is the need to integrate retrieval chain using "create_retrieval_chain(retriever_chain, stuff_documents_chain)"
Amazing content as always. I have a request: Could you create a tutorial on creating a vector database with PDF files and using LangChain to query on it?
@@alejandro_aoHi. It would be appreciated if you could add one last part showing how you would use beautifulSoup to extend this app from a web page chat to a whole web site chat. Thanks.
Great tutorial! I'm a new subscriber and found it really helpful. I'm interested in using the Pinecone vector database for my projects. Could you please provide some guidance on how to get started with it? Any tips or resources would be greatly appreciated. Thank you!
You are a rare gem. I really appreciate your knowledge sharing. Please help us release video that uses natural language to SQL and we can connect to WhatsApp. To make things more exciting that we can load image from WhatsApp to our database.
could you make a version where you don't delete stuff all the time as it wasn't always clear what was supposed to be deleted and what needed to be kept so i often missed when you did it and it made it harder to follow what you were doing.
Hi, thank a lot for your sharing. And also I want to ask about error message APIConnectionError: Connection error while running Chroma.from_documents(""chunkc","embeddings"). Here I use AzureOpenAIEmbeddings.. how to solve this error? thanks for helping.
great question! remember that every time something happens in your application (you enter a value into a field, press a button, etc), streamlit runs the whole code again. let's call this a streamlit cycle. st.session_state is used to keep the values across each cycle. we test the condition before assigning the value because otherwise we would be running `st.session_state.chat_history = []`on every cycle, effectively reinitializing it every time and defeating the purpose of having a session state altogether... so we test if the variable is already defined in the session state on every cycle. if it is already there, it means it has some information so we should not reinitialize it. try testing the app without testing for this condition to see how this changes. i hope it is clear!
@@alejandro_ao thank you so much for taking time and explaining it clearly! Can I ask you one more question off topic? How are the GPT’s from GPT builder different compared to these RAG applications?
Awesome video thanks, one question, once it is deployed on a server, does it also run the chunks,embeddings,vector stores part each time a user loads the page, or will only be called at the time it is starts to run on the server? Thanks again
hey there! yes it does! you would have to set up a persistent data store for it to be able to keep your data. you can do this in your own server or use a hosted service such as Qdrant Cloud or Pinecone
Great video! subscribed and liked. Just one part that I could not understand. the stuff_documents_chain tasks input and context. How is the context being passed from retriever_chain to stuff_documents_chain? Dose LangChain just defined that creste_retrieval_chain can pass context from the first argument to the second?
hey there! welcome to the club! yeah that's exactly what is happening. we are using a prebuild chain that does the `.invoke('{"context": [...]"})`by itself without us having to call the variable. if you look at the prebuilt chain's source code, you will see that it is calling the invoke method inside the runnable. i will make a future video creating chains ourselves so that this is easier to understand!
Thank you so much for your amazing videos , this one in particular is outstanding as far as RAG is concerned , you are a real master in gen AI , I do find your videos tremendously helpful , keep on making them cheers from Saudi Arabia 😍😍😍💯
Hola, he seguido casi todos tus tutoriales, son magnificos y tus explicaciones muy claras y precisas, no hablo mucho ingles asi que entre mira la traduccion y mira el video, jaja, entretenido. he aprendido mucho, muchas gracias de verdad, solo una pregunta, solo funciona con chatgpt 4 o se le pude poner chatgtp 3.5?. saludos desde Mexico
Hola! Qué gusto saber que te hayan sido útiles estos videos :) De hecho, en el ejemplo del video, estamos usando el modelo gpt-3.5-turbo porque es el que Langchain usa por defecto. Si quieres usar gpt-4, tienes que indicar al inicializar tu LLM que quieres otro modelo. Así: llm = ChatOpenAI(model='gpt-4-0125-preview'). Sigue así! Puedes unirte al servidor de discord para participar en la comunidad si quieres. Está en inglés, pero no todos allí hablan inglés como lengua materna 👉 link.alejandro-ao.com/discord
Thats what Im very curious about. I made a pdf webscraper for mining documents to train a model, but i could only get it to pull the ones on the actual page the url directed to specifically. Wondering if theres a technique for automatically searching through all the pages associated with the original url. Im sure there is, I'm just a programming noob so im learning best I can.. ✌️
hey there! there is a way indeed, but it is a bit more complex. you would have to either: 1. have access to the website database. this way is simpler because you would just have to apply a RAG algorithm to a database. 2. scrape the website. this is more complex, as it requires using something like python's beautifulsoup to scrape the contents of the entire website. but beware because some websites don't allow bots (sometimes they can even try to get you in trouble). a no-code tool for scraping that is very good is octoparse, but know that this is on the edge of what is allowed and they have had several lawsuits in the past for making scraping so easy.
It looks interesting. 👌 But One question: Can I give any website link into it and ask for the best keys used for this website (like using it for digital marketing concepts)
i suppose you can! although i would think that this requires a different approach than using a LLM. you might need other NLP algorithms to deal with this. maybe some pipeline that strips your corpus of text, removes filler and useless words and gets the main words. it then could generate a bag of words for the keywords of your website. check it out, you can do all of this with python ;)
Hi Alejandro, thanks for your videos, your videos helps me to solve my first steps in use of LLM models. Please is it possible, that you show some solution of your last Langchain videos but as GPU version, how to run ChatBot with own PDF on GPU? Thanks alot.
hey Mathias, thanks mate, it means a lot! sure thing, i'm very glad that you share these video ideas that can be super useful to the community! i'll be working on a video about it!
Hi, your videos are excellent! You have saved me more times than I can count. I have a question, if instead of streamlit, we are using a flask server, how can I deal with the st.session_state? What about production, would it look the same as your script?
hey vanessa! thanks! that is exactly the way to go for a more professional app. in this video, i am using session_state because i am doing everything in the front end, which is good for POC and showcasing the app. but ideally, you would have your API in flask with all the logic there and the front end interacting with it. in your API, you would have your controller that initializes your vector store from a persistent database. this API would expose an endpoint such as `get_response(user_query, chat_history)`. and your front end would make calls to this endpoint. so there would be no need to put your vectorstore in the front end (st.session_state) since it would be dealt with in flask. the only thing that you would take track of in the front end would be things related to the session, like the chat history. you can join the discord server too and maybe we can help you a bit more 👉 link.alejandro-ao.com/discord
Excellent tutorial, awesome. Many Thanks! If you are able to show how to work with multimodal searches on uploaded PDF's with text and images and how to use private LLMs like OLLAMA that would be great 💪.
@@alejandro_ao hi. No, there are so many models and the evolution of each is advancing is fast, it’s hard to keep up as a noob. Happy to see a video on mixtral, a particular focus on image querying would be amazing. 🙏
Salut, super tuto !! Beaucoup de concepts se sont éclaircis grace à ce tuto. Prévois tu d'en faire un sur lang graph? Qui a l'air un peu plus complexe.
hey there, as always it depends! i would suggest you start low, say 2k usd. then provide the best service you can. if your client accepted right away, next time you're selling the product, double the price. rinse and repeat. your technical skill will improve with each project and so will your client-relationship skills 💪
heyy I had a doubt u said that after we do similarity based ranking with a vector db and get few chunks(context) to answer our query and then we'd pass these contexts with user query along with chat_history to an LLM, But if we pass chat_history wouldn't that exceed the max token size of an LLM if conversation went too long ??
hey there. great question. yes, totally. if the conversation is too long, then you can exceed the context window of your LLM. however, keep in mind that modern LLMs, such as GPT-4, Claude and especially Gemini 1.5 have gigantic context windows, so this might not be too much of a concern. also, consider that sending the entire conversation history is only one method for implementing memory in these systems. you can also send a summary of the conversation + the last 10 messages, for example. or produce a NER-based memory. i don't think there is an industry standard yet for implementing memory, though. so feel free to try out several methods.
This is so cool, thank you so much. Can this be applied to database with so many views or tables so you can ask questions and it's intelligent to perform joins to bring the answer? It will be interesting to see if it's possible or create a video. Thank you so much
hey there, the mic is a simple TnB SC 420 USB that I borrowed from a friend. But i'm buying an audio technica at2020 very soon. And the camera is just my iPhone 13 :) I do much more post prod recently than in earlier videos, mostly with the audio, though
Great video! A suggestion for next is to combine retrieval from a site, combined with PDF documents. Also, it might be useful to spider a whole site. Thanks for the great work!
thanks! yeah totally, we could technically combine many of these apps together and make a supercharged local llm app.... that's giving me ideas for an open source repo
Hey pretty cool of your video! Like it! BTW could you suggest which plugin you're using for the auto-code-completing feature? Looks very useful to me. Thx!
I compared it a bit by following your video and turned out most of the time I could get similar code prediction as gh copilot did. Only sometimes it focused more on correcting the word but not giving the correct package name within a module. I can still fix it by debugging. So overall, Continue is good and enough for me, and free!@@alejandro_ao
Thanks for the video, it's great stuff. Wonder if you can do these videos with Gemini LLM, the LLM and embeddings are free as far as I can tell. I somehow made it work by watching your video, thanks man.
hey there, that's a great idea. good that you made that work. i was going to use them for a video a few weeks ago, but then realized that i needed to set up a vpn before recording. these models were not available in the eu when i checked a few weeks ago (great)... so i got lazy and went for openai. i will set up this vpn thing (or move out of europe) :P
What a great tutorial!!! Really enjoyed it!! New subscriber here!! Only one question...just for testing purposes I used the chat to ask if he could answer topics different from the context of the websites I was chatting with, and it did. It even wrote code for me. Is there a way to restrict the app to only answer about the content of the website we are chatting with and not other questions? Thanks for the amazing video
hey there, welcome onboard! yeah, that's on of the main complications of creating rag applications. i have seen several ways of dealing with this. you might want to try some of these: 1. the main thing to do to control this issue is to add a restrictive instruction to your initial prompt. so you would have to add something like this: "Don’t justify your answers. Don’t give information not mentioned in the CONTEXT INFORMATION" 2. to go even further, you can fine-tune your model with some of these edge cases. 3. lastly, a more complex approach would be to add a "policing" ai to read the main ai's answers and decide wether to accept them or not. this is, of course, more expensive and complex.
How would you handle streaming with the response ? I The docs aren't exactly clear on how to achieve this when you aren't using LCEL and the new syntax.
i agree with that, i'm working on a video about precisely this! it is a little trickier to deal with this in streamlit, but it can be done. i would think this feature is more aimed at more sophisticated front-end frameworks such as react, though
Hi, great video! but after I change the website URL,it keeps answering based in the first URL. I closed the browser and opened again. Inserted a different URL, and asked What is the title of the articles and still giving the title of the first URL. Even after change the URL in the text input, the program does not change the URL, because keeps answering based on the first URL.
Hey, good video, thank you for your work. Have you tried visual recognition instead of using RAG over site's http? I saw some attempts to give an LLM an ability to navige though a site using Puppeteer and recognize pages' content from screenshots. To me the quality of the recognition appeared to be too low but this is a viry interesting way of doing the web scrapping and chatting with the collected data. How good do you find the quality of your approach?
that would be sick. this approach is pretty much for a single HTML file. but yeah, technically i guess it is possible to use a vision api like openai's gpt4-vision and plug it into an agent that would use puppeteer to navigate the site. it would be a one-fit-all scrapper. because usually, when you want to scrape a site, you have to check its structure and target what you want to scrape manually. let me know if you make the puppeteer thing, it sounds cool
@alejandro_ao Well, this is not my idea. I met it first in this video: ruclips.net/video/VeQR17k7fiU/видео.html I coded my version of the same idea to grab a list of new horror movies from one torrent tracker. Just to keep track of what new horrors are released. I was not satisfied with the recognition results. There was some room for improvement but I decided to switch to a different project. Later I met a continuation of this idea. It is possible to highlight links and buttons on a screenshot (also with Puppeteer) and let an LLM function to click on them. That sounds really interesting but this I haven't tried myself.
Friend, could you teach me how to receive a data stream from OpenAI with langchain through document analysis? I swear I've put a lot of effort into figuring it out, but I'm struggling a bit with the language and the new documentation 😢
hey my friend, sure thing. but i'm not sure i can help more without more info. why don't you bring that up in our discord server? maybe we can help you out there: link.alejandro-ao.com/discord
Hey Alejandro. I tried to deploy Chatwithpdf app to stream lit but its not working. whenever I upload pdf and start processing its starts to download weights. I am using huggingface APIs instead of OpenAI. Does streamlit has any sort of limitation ? IF yes is there any way I can perform this task without downloading weights just like OpenAI APIs?
Hi! The chat is showing the scraped text and and the message in prompt template How to solve it? (As we only want to show queries and the response only.)
🐱 GitHub repository: github.com/alejandro-ao/chat-with-websites
🔥 Join the LangChain Master Program (early access): link.alejandro-ao.com/langchain-mastery
💬 Ask your questions in our Discord Server (but please leave a comment here too for engagement): link.alejandro-ao.com/981ypA
❤ Buy me a coffee (thanks): link.alejandro-ao.com/YR8Fkw
when fine tuning a model : it seems that it is adjusting the model weights based on the input and expected output?
that would mean the brain is @Open? so its basically live when taking to the brain in training does this effect its weights ? can only the trainer effect the weights of the model ? i expect this is happening in memory?
based on Positive and Negative feedback in Chat could we not be talking to the brain and teaching it and adjusting its weights on the fly?
Also (sorry)
When using the Rag type systems are the documents being tokenized using the tokenizer from the model ? (i know it locks the database down a bit) , as then we could consider the local rag as the working memory system ? and the llm as the long term memory system , should there be a bridge between the database and trainer so that it could essentially update the Longterm memory periodically releasing the local rag data? ie essentially training a lora to be applied (or merged) ...
hence the llm should have a lot of loras from each interval of updates or training Or not if the strategy is full merge?
can i just say you make this whole thing bearable because your voice delivery is on point i won't go anywhere else i'm gonna learn everything here what a bless
underrated content, edit: forgot to say thanks, was too focused with your content, Thanks!
thank you! i really appreciate it :)
Incredible! It's the first time I understand everything just by watching a tutorial once. A true educator
it means a lot! keep it up 💪
such a gem this video is .... i don't know why your tutorials are not on trending page ..... Bro you are just awesome 🔥🔥🔥🔥🔥🔥
thank you!! hopefully i'll get there someday 🥲
Exceptional clarity - great video!!!
Glad you liked it!
You are an amazing mentor. The way you explain is far better than university professors. Hope to learn more from you. Make more videos on this kind of AI app Development, Please.
thank you man, i won't stop!
Yes, it was challenging until you came in. I'm proud of you!
i appreciate it!
I very rarely comment a video.
It was very clear, good flow to understand.
Thanks a lot
i appreciate it :) i hope you learned a lot! keep it up!
I like your style, warm and peaceful
i am very warm and peaceful indeed
Wow, this video is fantastic! 🤩 I’ve been wanting to create a chatbot for a while, and your step-by-step guide made it so easy to follow. 🙌 Now I can build my own chatbot to chat with any website-how cool is that? 😎 Thanks for sharing your knowledge! 👍🔥
Keep up the great work! 👏😊
you make me want to keep making tutorials. very glad to hear this helped!
Great video, u rock!! Waiting for a "Chat with any database using Python and Langchain" 💯💯
Thanks! Coming soon!
I think LlamaIndex has a MongoDB loader and JSON_query_loader.
One of the best tutors. Hatsoff keep the good work buddy. Wishes🎉 from India
Your clarity and grasp of the topic are great!
hey there! very glad you liked this! :)
Yes! We did learn a lot about the latest version of LANGCHAIN!! Thank You!!!
it is my pleasure! stay tuned for more :)
the GUI tutorials are fire - God Bless
Its an amazing tutorial! Really really good. Clear articulation and reasoning behind every line
Simplemente genial y una muy buena explicación, gracias Alejandro por este tipo de contenido.
hola jorge! gracias a ti! un placer poder ayudar :)
Fantastically clear and methodical walkthrough. Keep it up!!!
Thanks, will do!
Did not take a break for an hour and 20 mins so committed :) excellent video
hey Alejandro, thanks for your video! This project is my first side project! I reaally appropriate your amazing job! If you are looking for the idea for next tutorial, the text-to-mindmap maybe a good idea!
hey there! i'm glad you enjoyed the project. that sounds like a fun idea, I will probably me doing something like that in the future. Maybe something using knowledge graphs?
Wow....seriously great tutorial....keep up the good work..👍
Thank you so much 😀 let me know what other topics/libraries you would like to see covered
Hi! Your video is buttery smooth and your explanation, voice are so peaceful.! How did you learn all these topics, methods and that flow?
Awesome video. What extensions do u use in VSCode? Seems to be very helpful. Also, can you please show how to input many data sources as an input. i.e many pages of a website
Excellent video. But just for suggestion can you make video on how do we deploy the same code using some microservices like fastapi? As most of your videos are using streamlit ( I actually learned a lot about streamlit 😅) but in case of simple app deployment on even localhost with fastapi or flask will be very helpful.
I was very impressed by your tutorial and your style of explaining. I am currently starting to dive deeper into AI, so this was very helpful to me :D
i'm so happy to hear this! keep it up!
@@alejandro_ao I will, thank you :)
Nice one! Is there any chance to cover how to put all of this into production in the most cheapest possible way? Thanks!
Working on it!
hey, in case you haven't seen it. here's the video on how to put this to production in a simple, free way: ruclips.net/video/74c3KaAXPvk/видео.htmlsi=xfL4RZuDTb3H4rgr
@@alejandro_ao Thanks! Checked! Amazing work! I'll buy you a coffee for sure.
Thanks for sharing! Suggestion: It would be great if the app would also include a way for the user to see the text chunks used to generate the answer, so people can check that the answer is accurate and not either hallucinated or taken from knowledge the AI has from elsewhere. Unfortunately, I've seen too many RAG apps where either answers are inaccurate or the LLM is responding based on other knowledge and not the documents it's supposed to use.
hey there Sharon! thanks for the suggestion! i will mention how to do this in a future video. it has changed a little since the introduction of LCEL, so it would be a very good topic to cover :)
Exceptional clarity, keep it up👍
thanks, will do!!
Awesome course Alejandro! By the way, I'm having issue with installing "Chroma" See....I'm getting "ERROR: Failed building wheel for chroma-hnswlib"
Thank you Alejandro for your clear and insightful tutorials. They are much appreciated.
I remade your full solution and it's working like a charm.
However, when I ask it a question that's outside of the context of the webpage (like: What's the capital of France?, for example), it still answers. How can I ground it to answer only from the context?
i tried to add the following to the system prompt of the get_conversational_rag_chain:
("system", "Answer the user question based on the below context. If the context does not contain the answer, do not make up an answer and respond with \"I am sorry I cannot answer this out-of-context question!\":
Context:{context}")
However, it still answers irrelevant questions.
What do you think?
Thank you for the great explanation. 🙏
I have two questions:
1. At some point the context window of the llm will be exceeded, how can I deal with that?
2. You used the llm with the context retrieval chain to get the query. Is there a way to get the query without using the llm? In this way the app would be a bit faster and some extra costs would be avoided.
Thanks in advance
Hey there, thanks for these questions. And sorry about the late reply, i was sick last week.
1. Yeah, as soon as your conversation becomes too long, you will exceed the context window. To fix this, you can use one of LangChain's memory classes. In short, they allow you to summarize the history of a conversation instead of sending the entire history in your prompt. Something like: "we are talking about .... and this and that was considered. Now answer the following question based on this context...". There are several ways to summarize a conversation (entity memory, buffer memory, etc) Here are the docs: python.langchain.com/docs/modules/memory/types/
2. Good point. You could make your own chain that will make its own similarity search. But the problem here is that if your conversation is too long, then your similarity search will not be as efficient as if you had summarized it with a query beforehand. I suppose you can test both and see which one works best for your app. Because you're right, we are basically making 2 LLM requests for every user query here.
Let me know if this helped!
@@alejandro_ao no worries at all! I hope you're feeling better now.
Yeah this answers my questions. Using langchain's memory classes would be better than just asking the user to start a new chat. I'll take a look at the documentation.
Thank you and have a great day 🙏
Looking forward for your new videos :)
@@JawadRaad-p7g thanks! have a good one!
Thank you so much, can we connect through X for more discussing, i have some bug w/ this problem.
of course! @_alejandroao
Question - If retrieval chain is already finding the most relevant documents chunks based on conversation history and user's input and passing it through the {context}, what is the need to integrate retrieval chain using "create_retrieval_chain(retriever_chain, stuff_documents_chain)"
Thank you very much, I have a chatbot project and your video helps me a lot. Best wishes
good luck with your project!
@@alejandro_ao Thank you
Amazing content as always. I have a request: Could you create a tutorial on creating a vector database with PDF files and using LangChain to query on it?
definitely! i’ve been meaning to update my previous version of that tutorial for a while now. expect that soon :)
Please continue this and do a part 2
what would you like to see in part 2? btw, you're hereby invited to join our Discord community: link.alejandro-ao.com/discord
I hope to see you there!
@@alejandro_aoHi. It would be appreciated if you could add one last part showing how you would use beautifulSoup to extend this app from a web page chat to a whole web site chat. Thanks.
Great tutorial! I'm a new subscriber and found it really helpful. I'm interested in using the Pinecone vector database for my projects. Could you please provide some guidance on how to get started with it? Any tips or resources would be greatly appreciated. Thank you!
Learnt a lot from this tutorial. Thank you Alejandro.
you're welcome!
i like you learning and implementing style can you guide how i improve my learning speed in this dynamic tech world
Great video and Good explanation.
can we implement this without having an openai API key for free???
Great video, can we pass multiple website URLs like 2 or 3
Man you're awesome!
Some day I'll be too
you're already awesome my friend, keep it up and keep improving every day! 💪
Great video Alejandro, can I ask which theme you are using for your terminal
hey there, thanks! sure. the theme is robbyrussell and i'm running it with ohmyzsh 😎
Good work 👏👏
Can you do one tutorial illustrating how you can implement the same in flask or fastapi
You are a rare gem. I really appreciate your knowledge sharing. Please help us release video that uses natural language to SQL and we can connect to WhatsApp. To make things more exciting that we can load image from WhatsApp to our database.
could you make a version where you don't delete stuff all the time as it wasn't always clear what was supposed to be deleted and what needed to be kept so i often missed when you did it and it made it harder to follow what you were doing.
Hi. Thanks for the very nice tutorial, how do you create those beautiful flow diagrams? Is there an app that you use?
that's miro.com/
but i have since switched to excalidraw.com/ (example here: ruclips.net/video/kBXYFaZ0EN0/видео.html )
both have a free tier
@@alejandro_ao Thanks a lot
Thank you, very helpful!!
Will be interesting a video like this with the implementation of Mixtral.
how are you reading into my mind?
thanks for sharing your knowledge, pdf files can be added?
absolutely, but you would have to use a PDFLoader instead of the WebBasedLoader!
Hi, thank a lot for your sharing.
And also I want to ask about error message APIConnectionError: Connection error while running Chroma.from_documents(""chunkc","embeddings"). Here I use AzureOpenAIEmbeddings..
how to solve this error?
thanks for helping.
Thank you for the great video. Could you do a video showing how to add in the streaming feature?
it's coming soon!
Please Mate can you upload videos about Agents , Function calling lansmith and langserve .....Also how to create user data personalized llm's
Great content buddy ! One question , how can we connect this chatbot to apps like WhatsApp ?
Hey! Great video!
i have one doubt at 21:50 why did you use if condition for chat_history to be in session_state ? why not use directly?
great question! remember that every time something happens in your application (you enter a value into a field, press a button, etc), streamlit runs the whole code again. let's call this a streamlit cycle.
st.session_state is used to keep the values across each cycle. we test the condition before assigning the value because otherwise we would be running `st.session_state.chat_history = []`on every cycle, effectively reinitializing it every time and defeating the purpose of having a session state altogether...
so we test if the variable is already defined in the session state on every cycle. if it is already there, it means it has some information so we should not reinitialize it.
try testing the app without testing for this condition to see how this changes. i hope it is clear!
@@alejandro_ao thank you so much for taking time and explaining it clearly!
Can I ask you one more question off topic?
How are the GPT’s from GPT builder different compared to these RAG applications?
Awesome video thanks, one question, once it is deployed on a server, does it also run the chunks,embeddings,vector stores part each time a user loads the page, or will only be called at the time it is starts to run on the server? Thanks again
hey there! yes it does! you would have to set up a persistent data store for it to be able to keep your data. you can do this in your own server or use a hosted service such as Qdrant Cloud or Pinecone
Great video! subscribed and liked.
Just one part that I could not understand.
the stuff_documents_chain tasks input and context.
How is the context being passed from retriever_chain to stuff_documents_chain?
Dose LangChain just defined that creste_retrieval_chain can pass context from the first argument to the second?
hey there! welcome to the club! yeah that's exactly what is happening. we are using a prebuild chain that does the `.invoke('{"context": [...]"})`by itself without us having to call the variable. if you look at the prebuilt chain's source code, you will see that it is calling the invoke method inside the runnable. i will make a future video creating chains ourselves so that this is easier to understand!
Thank you so much for your amazing videos , this one in particular is outstanding as far as RAG is concerned , you are a real master in gen AI , I do find your videos tremendously helpful , keep on making them cheers from Saudi Arabia 😍😍😍💯
i appreciate it! i'm glad that you're finding these useful :) i have many more coming
Hola, he seguido casi todos tus tutoriales, son magnificos y tus explicaciones muy claras y precisas, no hablo mucho ingles asi que entre mira la traduccion y mira el video, jaja, entretenido. he aprendido mucho, muchas gracias de verdad, solo una pregunta, solo funciona con chatgpt 4 o se le pude poner chatgtp 3.5?. saludos desde Mexico
Hola! Qué gusto saber que te hayan sido útiles estos videos :) De hecho, en el ejemplo del video, estamos usando el modelo gpt-3.5-turbo porque es el que Langchain usa por defecto. Si quieres usar gpt-4, tienes que indicar al inicializar tu LLM que quieres otro modelo. Así: llm = ChatOpenAI(model='gpt-4-0125-preview'). Sigue así!
Puedes unirte al servidor de discord para participar en la comunidad si quieres. Está en inglés, pero no todos allí hablan inglés como lengua materna 👉 link.alejandro-ao.com/discord
@@alejandro_ao Muchas gracias maestro, nos veremos por discord. que todo el bien que haces enseñando, se te multiplique
Awesome job man! Just a question, can this work with forum-like site with multiple internal pages?
Thats what Im very curious about. I made a pdf webscraper for mining documents to train a model, but i could only get it to pull the ones on the actual page the url directed to specifically. Wondering if theres a technique for automatically searching through all the pages associated with the original url. Im sure there is, I'm just a programming noob so im learning best I can.. ✌️
hey there! there is a way indeed, but it is a bit more complex. you would have to either:
1. have access to the website database. this way is simpler because you would just have to apply a RAG algorithm to a database.
2. scrape the website. this is more complex, as it requires using something like python's beautifulsoup to scrape the contents of the entire website. but beware because some websites don't allow bots (sometimes they can even try to get you in trouble). a no-code tool for scraping that is very good is octoparse, but know that this is on the edge of what is allowed and they have had several lawsuits in the past for making scraping so easy.
It looks interesting. 👌
But One question: Can I give any website link into it and ask for the best keys used for this website (like using it for digital marketing concepts)
i suppose you can! although i would think that this requires a different approach than using a LLM. you might need other NLP algorithms to deal with this. maybe some pipeline that strips your corpus of text, removes filler and useless words and gets the main words. it then could generate a bag of words for the keywords of your website. check it out, you can do all of this with python ;)
Hi, many thanks for the amazing tutorial video. May I ask what extensions you were using to make code suggestions?
hey there, thanks! i was using github copilot, it's just amazing how good it is
Many thanks!@@alejandro_ao
Hi Alejandro, thanks for your videos, your videos helps me to solve my first steps in use of LLM models.
Please is it possible, that you show some solution of your last Langchain videos but as GPU version, how to run ChatBot with own PDF on GPU?
Thanks alot.
hey Mathias, thanks mate, it means a lot! sure thing, i'm very glad that you share these video ideas that can be super useful to the community! i'll be working on a video about it!
Hi, your videos are excellent! You have saved me more times than I can count. I have a question, if instead of streamlit, we are using a flask server, how can I deal with the st.session_state? What about production, would it look the same as your script?
hey vanessa! thanks! that is exactly the way to go for a more professional app. in this video, i am using session_state because i am doing everything in the front end, which is good for POC and showcasing the app. but ideally, you would have your API in flask with all the logic there and the front end interacting with it.
in your API, you would have your controller that initializes your vector store from a persistent database. this API would expose an endpoint such as `get_response(user_query, chat_history)`. and your front end would make calls to this endpoint. so there would be no need to put your vectorstore in the front end (st.session_state) since it would be dealt with in flask.
the only thing that you would take track of in the front end would be things related to the session, like the chat history.
you can join the discord server too and maybe we can help you a bit more 👉 link.alejandro-ao.com/discord
Excellent tutorial, awesome. Many Thanks! If you are able to show how to work with multimodal searches on uploaded PDF's with text and images and how to use private LLMs like OLLAMA that would be great 💪.
i have been meaning to do videos on local LLMs indeed. have you tried mixtral?
@@alejandro_ao hi. No, there are so many models and the evolution of each is advancing is fast, it’s hard to keep up as a noob. Happy to see a video on mixtral, a particular focus on image querying would be amazing. 🙏
Great work. Could you please make a vídeo about chainlit usage also?
Salut, super tuto !! Beaucoup de concepts se sont éclaircis grace à ce tuto. Prévois tu d'en faire un sur lang graph? Qui a l'air un peu plus complexe.
salut ! merci ! en effet, je vais bientôt aborder LangGraph dans quelques tutos :) n'oublie pas de t'abonner pour recevoir les notifs :)
This is great, thanks so much for sharing this with us!
my pleasure!
This guy is a legend
nah you are
nice project. Plz dont mind, how much we should charge a client for this custom chatbot ??
hey there, as always it depends! i would suggest you start low, say 2k usd. then provide the best service you can. if your client accepted right away, next time you're selling the product, double the price. rinse and repeat. your technical skill will improve with each project and so will your client-relationship skills 💪
Great tutorial! ❤ Do you mind if you would tell what intellisense extension do you use in your vs code? Please? 😊
thanks! i'm using github copilot. it's great :)
heyy I had a doubt u said that after we do similarity based ranking with a vector db and get few chunks(context) to answer our query and then we'd pass these contexts with user query along with chat_history to an LLM,
But if we pass chat_history wouldn't that exceed the max token size of an LLM if conversation went too long ??
hey there. great question. yes, totally. if the conversation is too long, then you can exceed the context window of your LLM. however, keep in mind that modern LLMs, such as GPT-4, Claude and especially Gemini 1.5 have gigantic context windows, so this might not be too much of a concern.
also, consider that sending the entire conversation history is only one method for implementing memory in these systems. you can also send a summary of the conversation + the last 10 messages, for example. or produce a NER-based memory. i don't think there is an industry standard yet for implementing memory, though. so feel free to try out several methods.
How can i build this without showing the sidebar(url input), only the chat UI?
This is so cool, thank you so much. Can this be applied to database with so many views or tables so you can ask questions and it's intelligent to perform joins to bring the answer? It will be interesting to see if it's possible or create a video. Thank you so much
that's a great idea. i would build an agent do achieve that. i'll plan a video about that :)
What is the Microphone model you use in your videos ? And please share the camera details as well. thnx
hey there, the mic is a simple TnB SC 420 USB that I borrowed from a friend. But i'm buying an audio technica at2020 very soon. And the camera is just my iPhone 13 :) I do much more post prod recently than in earlier videos, mostly with the audio, though
Awesome thanks.
Great video! A suggestion for next is to combine retrieval from a site, combined with PDF documents.
Also, it might be useful to spider a whole site.
Thanks for the great work!
thanks! yeah totally, we could technically combine many of these apps together and make a supercharged local llm app.... that's giving me ideas for an open source repo
@@alejandro_ao About this, could you make a video to explain the diference between the PDF code and this?
yeah that would be very helpful@@joseiltonmota
Hey pretty cool of your video! Like it! BTW could you suggest which plugin you're using for the auto-code-completing feature? Looks very useful to me. Thx!
I figured it out by installing plugin "Continue". Thx anyway.
hey there! sure thing. i don't know about 'continue', but i am using gh copilot. how good is continue?
I compared it a bit by following your video and turned out most of the time I could get similar code prediction as gh copilot did. Only sometimes it focused more on correcting the word but not giving the correct package name within a module. I can still fix it by debugging. So overall, Continue is good and enough for me, and free!@@alejandro_ao
Thanks for the video, it's great stuff. Wonder if you can do these videos with Gemini LLM, the LLM and embeddings are free as far as I can tell. I somehow made it work by watching your video, thanks man.
hey there, that's a great idea. good that you made that work. i was going to use them for a video a few weeks ago, but then realized that i needed to set up a vpn before recording. these models were not available in the eu when i checked a few weeks ago (great)... so i got lazy and went for openai. i will set up this vpn thing (or move out of europe) :P
Thank you for this ♥️♥️♥️..do you know how can we host it?
no worries! check out this other video: ruclips.net/video/74c3KaAXPvk/видео.html
all the best!
What a great tutorial!!! Really enjoyed it!! New subscriber here!! Only one question...just for testing purposes I used the chat to ask if he could answer topics different from the context of the websites I was chatting with, and it did. It even wrote code for me. Is there a way to restrict the app to only answer about the content of the website we are chatting with and not other questions?
Thanks for the amazing video
hey there, welcome onboard! yeah, that's on of the main complications of creating rag applications. i have seen several ways of dealing with this. you might want to try some of these:
1. the main thing to do to control this issue is to add a restrictive instruction to your initial prompt. so you would have to add something like this: "Don’t justify your answers. Don’t give information not mentioned in the CONTEXT INFORMATION"
2. to go even further, you can fine-tune your model with some of these edge cases.
3. lastly, a more complex approach would be to add a "policing" ai to read the main ai's answers and decide wether to accept them or not. this is, of course, more expensive and complex.
@@alejandro_ao thanks for the quick answer
@alenjandro_ao can you suggest some open models that can be used instead of openai models for this particular used case?
hey there! go for mixtral, those guys are on fire. even mixtral-8x7B is just amazing and super lightweight!
How would you handle streaming with the response ? I The docs aren't exactly clear on how to achieve this when you aren't using LCEL and the new syntax.
i agree with that, i'm working on a video about precisely this! it is a little trickier to deal with this in streamlit, but it can be done. i would think this feature is more aimed at more sophisticated front-end frameworks such as react, though
Thank you...Thank you so much..Definitely learnt a lot. Got to create my first AI app!!!!
you are amazing! keep it up!
@@alejandro_ao Thanks!
Hi, great video! but after I change the website URL,it keeps answering based in the first URL. I closed the browser and opened again. Inserted a different URL, and asked What is the title of the articles and still giving the title of the first URL. Even after change the URL in the text input, the program does not change the URL, because keeps answering based on the first URL.
i guess you have to stop and start the app again to be able to give a differnt url!
Please make an video for langchain evaluators that how can we use them using huggingface it would be helpful for us
Hey, good video, thank you for your work. Have you tried visual recognition instead of using RAG over site's http? I saw some attempts to give an LLM an ability to navige though a site using Puppeteer and recognize pages' content from screenshots. To me the quality of the recognition appeared to be too low but this is a viry interesting way of doing the web scrapping and chatting with the collected data. How good do you find the quality of your approach?
that would be sick. this approach is pretty much for a single HTML file. but yeah, technically i guess it is possible to use a vision api like openai's gpt4-vision and plug it into an agent that would use puppeteer to navigate the site. it would be a one-fit-all scrapper. because usually, when you want to scrape a site, you have to check its structure and target what you want to scrape manually. let me know if you make the puppeteer thing, it sounds cool
@alejandro_ao Well, this is not my idea. I met it first in this video: ruclips.net/video/VeQR17k7fiU/видео.html
I coded my version of the same idea to grab a list of new horror movies from one torrent tracker. Just to keep track of what new horrors are released. I was not satisfied with the recognition results. There was some room for improvement but I decided to switch to a different project.
Later I met a continuation of this idea. It is possible to highlight links and buttons on a screenshot (also with Puppeteer) and let an LLM function to click on them. That sounds really interesting but this I haven't tried myself.
One tutorial with open source models like mistral would be great , what about chatting with json file ?
that's coming very soon!
Easy to follow, thank you
my pleasure!
Friend, could you teach me how to receive a data stream from OpenAI with langchain through document analysis? I swear I've put a lot of effort into figuring it out, but I'm struggling a bit with the language and the new documentation 😢
hey my friend, sure thing. but i'm not sure i can help more without more info. why don't you bring that up in our discord server? maybe we can help you out there: link.alejandro-ao.com/discord
How can i use streamlit stream feature in this example ?
How does the technique utilize website links to parse data, particularly in relation to the rules set by robots.txt files?
hey there, this actually only sends a GET request to the URL you requested. in other words, it only gets a single web page, not the full website
Great video, thanks for doing it for us.
i’m glad you appreciate it :)
great tutorial - thanks so much!
Thanks! Glad it was helpful!
love this guy
i'm sure he loves you too
Hey Alejandro. I tried to deploy Chatwithpdf app to stream lit but its not working. whenever I upload pdf and start processing its starts to download weights. I am using huggingface APIs instead of OpenAI. Does streamlit has any sort of limitation ? IF yes is there any way I can perform this task without downloading weights just like OpenAI APIs?
try running ollama model and check
Hi! The chat is showing the scraped text and and the message in prompt template
How to solve it?
(As we only want to show queries and the response only.)
I've even tried your other tutorial of multiple PDFs, same issue there,
Would appreciate your response
Great video!! Thanks!
Great video, thank you for the hard work.
thank you, i appreciate it! i will be putting more of these :)
Thanks for this app, you are awesome!
you are awesome!
love your content !
you are the best