Thanks a lot. I couldn't figure out LangChain, but thanks to the way you explain it, I was able to find where I went wrong.
You are welcome. Glad that you found it useful !!
These were the best instructions thus far ... thank you
You are welcome. Thanks for the feedback !!
Thank you for sharing and explaining this stuff. I have downloaded the 7B model and am searching for how to build a chatbot with it. Anyway, thanks for all your work.
Thank you so much for sharing your knowledge
You are Welcome !!
Sir, please make a video on how to build a web app using LMQL, LangChain, and Chainlit.
Great video! thanks for the knowledge
Glad it was helpful!
Great video, great help. But can we do the same by adding our own documents, i.e. PDF and text files, and answer from those along with the metadata?
Great that you find it helpful. Will look into it.
Thanks for the amazing instructions. What was the reason to use the Falcon 7B model and not a better one? How does it affect the resources? Also, is there a way to fine-tune the model with proprietary data and create a chatbot on it? It would be great if you could share a few resources on that as well.
Great that you found the video helpful. There is no particular reason for using Falcon 7B; it was simply new, and I decided to create a video so some of you viewers might learn something new. In my opinion, instead of fine-tuning on your proprietary data, just loading the data, creating embeddings, and passing the relevant chunks to the LLM works better. Well, it depends upon your use case.
@@datasciencebasics To follow up on that, how do you pass in embeddings to LLMs?
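For readers with the same question, here is a minimal retrieval-augmented sketch using classic LangChain APIs: the documents are embedded into a vector store, and the retriever passes the most relevant chunks to the LLM as context at question time. The file name and model settings below are illustrative, not the video's exact code, and a HUGGINGFACEHUB_API_TOKEN must be set.

```python
# Minimal RAG sketch (illustrative file name and parameters).
# Requires: pip install langchain chromadb sentence-transformers huggingface_hub
# plus the HUGGINGFACEHUB_API_TOKEN environment variable.
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma
from langchain.llms import HuggingFaceHub
from langchain.chains import RetrievalQA

docs = TextLoader("my_data.txt").load()  # hypothetical proprietary document
chunks = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50).split_documents(docs)

# The embeddings live in the vector store; the LLM never sees raw embeddings,
# only the text chunks that the retriever finds most similar to the question.
vectordb = Chroma.from_documents(chunks, HuggingFaceEmbeddings())

llm = HuggingFaceHub(
    repo_id="tiiuae/falcon-7b-instruct",
    model_kwargs={"temperature": 0.5, "max_new_tokens": 200},
)

qa = RetrievalQA.from_chain_type(llm=llm, retriever=vectordb.as_retriever())
print(qa.run("What does the document say about pricing?"))
```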
awesome dude!
Thanks!
Great video, any idea on how we can make it into a functioning website instead of just local?
Well, you can use different hosting strategies. There are many out there.
Can you make a video on that? @@datasciencebasics
Hello, great video. Can we use this code with some other model, like Pygmalion 13B, without any changes?
Hello, I haven't tested it myself. You can give it a try!
Can we connect our own database with the chatbot?
Yep, check my LangChain videos on using the Falcon model via Hugging Face.
As we can see, it doesn't remember chat history, so it fails on follow-up questions. How can we resolve this? Maybe you can add memory from LangChain. Can you please make a video on this or share an implementation that works on follow-up questions?
Hello, I have created other videos where I explain how to add chat history. Please refer to those and plug it into Chainlit. This way you learn something new to implement.
@datasciencebasics Thanks for the reply. I have already tried to use memory from LangChain, but somehow it is not working; there are also various GitHub issues open about RetrievalQAWithSourcesChain not actually using memory. If you can show an example implementation, it would be great.
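For reference, the workaround most commonly suggested in those issue threads (a sketch under assumed settings, not the video's code) is to switch from RetrievalQAWithSourcesChain to ConversationalRetrievalChain with a ConversationBufferMemory, so the chat history is passed along with each follow-up question. The tiny FAISS store below is only a stand-in for whatever vector store the app already uses.

```python
# Sketch: ConversationalRetrievalChain keeps chat history, so follow-up questions work.
from langchain.llms import HuggingFaceHub
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationalRetrievalChain

# Stand-in vector store; replace with the store the app already builds.
vectordb = FAISS.from_texts(["Falcon 7B is an open source LLM from TII."], HuggingFaceEmbeddings())
llm = HuggingFaceHub(repo_id="tiiuae/falcon-7b-instruct",
                     model_kwargs={"temperature": 0.5, "max_new_tokens": 200})

memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
chain = ConversationalRetrievalChain.from_llm(llm=llm, retriever=vectordb.as_retriever(), memory=memory)

print(chain({"question": "What is Falcon 7B?"})["answer"])
print(chain({"question": "Who released it?"})["answer"])  # follow-up resolved via stored history
```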
Please help, I compiled it to the end, but I am getting this error instead:
Error
04:14:32 PM
Error raised by inference API: Authorization header is correct, but the token seems invalid
The error says the token seems invalid, so please check the token.
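In case it helps anyone hitting the same error: LangChain's HuggingFaceHub wrapper reads the token from the HUGGINGFACEHUB_API_TOKEN environment variable, so the usual fix is to regenerate a "read" token on the Hugging Face settings page and make sure it is set without extra spaces or quotes. A minimal sketch, with a placeholder token:

```python
# Regenerate a "read" token at huggingface.co/settings/tokens, then make sure the app sees it.
import os
os.environ["HUGGINGFACEHUB_API_TOKEN"] = "hf_..."  # placeholder: paste your own token here
```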
Thanks for the video. I tried it in the cloud, but the Codespaces notebook does not open. It says the URL violates the Content Security Policy.
Hello, I have updated the Python packages used in the repo; the issue should be fixed now. Let me know whether it is fixed or not.
@@datasciencebasics Error: "Could not initialize webview: Error: Could not register service worker: SecurityError: Failed to register a ServiceWorker: The provided scriptURL"
I installed it locally and it worked fine for me.
Hey @datasciencebasics, how can we make the answer appear as a stream instead of all at once?
You can refer to this documentation from Chainlit.
docs.chainlit.io/concepts/streaming/langchain
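Roughly, the pattern from that docs page looks like the sketch below. The handler name and the stream_final_answer flag may differ between Chainlit versions, and the chain is assumed to have been stored in the user session at chat start, so treat this as an approximation rather than the exact API.

```python
# Sketch: pass Chainlit's LangChain callback handler so tokens stream into the UI.
import chainlit as cl

@cl.on_message
async def main(message: cl.Message):  # older Chainlit versions pass a plain string here
    chain = cl.user_session.get("chain")  # assumes the chain was stored at chat start
    res = await chain.acall(
        message.content,
        callbacks=[cl.AsyncLangchainCallbackHandler(stream_final_answer=True)],
    )
    # Output key depends on the chain type ("text" for LLMChain, "result" for RetrievalQA).
    await cl.Message(content=res["text"]).send()
```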
How would we go about loading falcon locally instead of using the API?
Hey, you can download the model from Hugging Face and load it. But it might not work on commodity hardware unless you use a quantized version of it.
Right, I was just wondering what code would be different, given that one typically loads an HF model with AutoModelForCausalLM from the transformers package, but I was assuming that wouldn't be translatable into the Chainlit framework? @@datasciencebasics
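One way it can translate: wrap the transformers pipeline in LangChain's HuggingFacePipeline, and the rest of the Chainlit/LangChain code stays the same because only the LLM object changes. A rough sketch, assuming a GPU with enough memory for the full Falcon 7B weights and accelerate installed for device_map="auto":

```python
# Sketch: load Falcon locally with transformers and wrap it as a LangChain LLM,
# so it drops into the same chain the video builds with HuggingFaceHub.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
from langchain.llms import HuggingFacePipeline

model_id = "tiiuae/falcon-7b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",        # needs `accelerate`
    trust_remote_code=True,   # Falcon shipped custom modelling code at the time
)

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=200)
llm = HuggingFacePipeline(pipeline=pipe)  # use this wherever HuggingFaceHub(...) was used
```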
How do you use VSCode with conda env?
When working from VSCode, everything I do is run and installed on the host system. But I find Conda extremely easy for working with different CUDA LLM environments.
To clarify, I want to use the Conda environments I already have.
I posted a search keyword for you, but apparently my comment was removed. Search for the how-to on Google; you will find lots of how-tos.
Hi, GitHub Codespaces already has conda installed. If needed, you can also use a GPU, meaning you can use CUDA there, but some configuration might be needed. You can export your existing conda environments into GitHub Codespaces, but again, as it runs in the cloud, some additional configuration might be needed. I have already created a video about using conda in GitHub Codespaces; you can follow it. ruclips.net/video/4kLoVibcFNo/видео.html
In the "langchain_falcon.ipynb" it is not able render the code block, is there any solution to that??
Hi, it is an IPython notebook. You need to open it in Jupyter Notebook or Google Colab. It works normally for me.
Awesome explanation. Can you please explain how we can implement history in this?
Also, if possible, can you make a video on Gradio?
I have explained how to include history using Gradio in another video. Read the comment section; there is an answer for the issue I had while explaining the Gradio chat history. Here is the link -> ruclips.net/video/TeDgIDqQmzs/видео.html
How would you apply custom CSS? Where would you put the stylesheet file?
Hello, no idea, as there was no need for me to do these things. Chainlit is a framework, so you may need to go deep into it to change styles, or you can ask in the Chainlit GitHub repo.
Bro, can you do one for uncensored Wizard Vicuna models? Thanks again.
Change the repo id to the one you want.
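Concretely, that is a one-line change in the LangChain LLM definition. The repo id below is a placeholder, and note that not every Hub model is served by the hosted inference API.

```python
# Sketch: swapping models is just a different repo_id (placeholder shown; look up
# the exact id of the Wizard Vicuna variant you want on the Hugging Face Hub).
from langchain.llms import HuggingFaceHub

llm = HuggingFaceHub(
    repo_id="some-org/wizard-vicuna-13b-uncensored",  # placeholder, replace with the real Hub id
    model_kwargs={"temperature": 0.6, "max_new_tokens": 200},
)
```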
Thanks, but I was hoping you were NOT going to show how to do it on HuggingFace like the thousands of others are doing. WE WANT LOCAL! Why are people not doing this? I just do not understand why you would want to run something as slow as the HuggingFace API. You can eat lunch by the time it answers a simple question.
I can understand your feelings. LOCAL and PRIVATE is the main concern; for now, people are using the Hugging Face API to access open source models. So far, I find Falcon more efficient and faster compared to the others. I hope there will be more fast and efficient models in the future.
@@datasciencebasics Please just help those of us who have the computer to run the model without issue, and who are not programmers by trade, create our own private AI that does not require the internet at all. That is all many of us want. I have a 3090 with 24 GB of VRAM, and using oobabooga I can run the model, but I am not in love with the web UI, and if you click superbooga it crashes and I have to completely reinstall it. But without superbooga you cannot run QLoRA to train your model. So I am searching for an answer to this riddle: how can I run my own LLM locally and train it using QLoRA? Oh, and while I am dreaming, I would also like to communicate with it using speech, because I have RA arthritis and it hurts to type like this all the time. So if you have the skills and could help, that would be wonderful, and I would be willing to bet that it would get you a lot of views quickly right now. Sorry for the rant. Thanks for the reply, have a wonderful day, and good luck.
@@timothymaggenti717 You can use the GPT4All API in Python to load models locally and then use them in Chainlit. If not, you can run any LLM locally using accelerate and transformers and pipe the output to Chainlit.
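For the fully offline route, here is a minimal sketch with LangChain's GPT4All wrapper; the model filename is a placeholder, so download any model file supported by GPT4All and point the path at it.

```python
# Sketch: a fully local LLM via GPT4All, usable in place of HuggingFaceHub in the chain.
# Requires: pip install gpt4all langchain; no internet access is needed at inference time.
from langchain.llms import GPT4All

llm = GPT4All(model="./models/my-local-model.bin")  # placeholder path to a downloaded model file
print(llm("Explain retrieval augmented generation in one sentence."))
```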
This code is deprecated.
@cl.langchain_factory is no longer available.
Can you make a new video or update the existing repository?
Hello, thanks for mentioning it. I have updated the code now.
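For anyone reading later, the pattern that replaced @cl.langchain_factory in newer Chainlit releases looks roughly like the sketch below; the prompt and model settings are placeholders, and details vary between Chainlit versions.

```python
# Sketch: newer Chainlit lifecycle. Build the chain once per session in on_chat_start,
# store it in the user session, and reuse it in on_message.
import chainlit as cl
from langchain.llms import HuggingFaceHub
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

@cl.on_chat_start
async def start():
    llm = HuggingFaceHub(repo_id="tiiuae/falcon-7b-instruct",
                         model_kwargs={"temperature": 0.6, "max_new_tokens": 200})
    prompt = PromptTemplate(template="You are a helpful assistant.\n\nQuestion: {question}\nAnswer:",
                            input_variables=["question"])
    cl.user_session.set("chain", LLMChain(llm=llm, prompt=prompt))

@cl.on_message
async def main(message: cl.Message):
    chain = cl.user_session.get("chain")
    res = await chain.acall(message.content, callbacks=[cl.AsyncLangchainCallbackHandler()])
    await cl.Message(content=res["text"]).send()  # LLMChain returns its output under "text"
```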
@@datasciencebasics Thanks. Can you provide your Twitter profile?
twitter.com/mesudarshan