Mistral-7B with LocalGPT: Chat with YOUR Documents
- Published: Oct 2, 2024
- In this video, I will show you how to use the newly released Mistral-7B by Mistral AI as part of LocalGPT. LocalGPT lets you chat with your own documents. We will also go over some of the new updates to the project.
If you like the repo, don't forget to give it a ⭐
💻 Pre-configured localGPT VM: bit.ly/localGPT (use Code: PromptEngineering for 50% off).
#localGPT #mistral #mistral-7B #langchain
CONNECT:
☕ Buy me a Coffee: ko-fi.com/prom...
🔴 Support my work on Patreon: Patreon.com/PromptEngineering
🦾 Discord: / discord
📧 Business Contact: engineerprompt@gmail.com
💼Consulting: calendly.com/e...
LINKS:
LocalGPT Github: github.com/Pro...
LocalGPT Playlist: tinyurl.com/37...
Embedding Models: • Understanding Embeddin...
Text Splitters: • LangChain: How to Prop...
Can you make a video on how to use open-source LLMs as a chatbot on tabular data?
Would you mind recommending any videos you found related to your question?
Thank you for this valuable training. I want to ask about languages other than English. What do you advise for building a LocalGPT in a non-English language?
Can you run this in LangChain or Flowise?
Please make a comparison of your project with the "h2o gpt" project.
Thanks, it is a good video. Is there a suggestion to make the response faster? I tested with an NVIDIA GeForce RTX 3050.
Cool. Is it possible to use it in the oobabooga text generation UI?
I believe so, yes.
"I'm using this on mac"
Buddy just buy a computer, this is basically irrelevant to the world when you are using CPU.
Thanks for showing RAG with Mistral. Why do you advise using GPTQ instead of GGUF when you have a GPU?
From my understanding, GPTQ models are optimized specifically for NVIDIA GPUs. GGUF supports both CPU and GPU, but I have seen GPTQ perform better on GPUs (speed-wise).
@@engineerprompt I tried the GGUF format, but it only utilizes my CPU, not my GPU. Why is this happening? I guess it is possible to add GPU layers to it, right?
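On the GPU-layers question above, here is a minimal sketch, assuming the GGUF model is loaded through llama-cpp-python. That library's `Llama` constructor accepts an `n_gpu_layers` argument, but it only has an effect when the package was built with GPU support; a CPU-only build would explain seeing no GPU usage. The helper names `build_llama_kwargs` and `load_model`, the model path, and the default values are illustrative assumptions, not taken from the LocalGPT code.

```python
from typing import Any, Dict


def build_llama_kwargs(model_path: str, use_gpu: bool, n_gpu_layers: int = 32) -> Dict[str, Any]:
    """Collect constructor arguments for llama_cpp.Llama (hypothetical helper)."""
    return {
        "model_path": model_path,
        "n_ctx": 4096,  # context window size, an illustrative value
        # 0 keeps every layer on the CPU; a positive number asks the library
        # to offload that many transformer layers to the GPU.
        "n_gpu_layers": n_gpu_layers if use_gpu else 0,
    }


def load_model(model_path: str, use_gpu: bool = True):
    """Load a GGUF model; requires llama-cpp-python to be installed."""
    from llama_cpp import Llama  # imported lazily so the helper above works without it

    return Llama(**build_llama_kwargs(model_path, use_gpu))


if __name__ == "__main__":
    # Inspect the arguments without loading a model; the file name is a placeholder.
    print(build_llama_kwargs("model.gguf", use_gpu=True, n_gpu_layers=20))
```

If the GPU still sits idle with a positive `n_gpu_layers`, the installed llama-cpp-python build is the first thing to check.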
😊
Hi - Thanks for uploading. Why do I get this error while running your model?
super().__init__(**kwargs)
File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for LLMChain
llm
none is not an allowed value (type=type_error.none.not_allowed)
Have you managed to fix this? I am getting the same error when running with Mistral. Any help/advice would be appreciated.
Hi - Yes, I have. On Windows, it worked when I changed 'mps' to 'cpu'. On a MacBook, it needed the required llama-cpp-python library installed. Hope it helps; if not, let me know and I can look into the error.
@@logicalm4th I'm struggling with the same issue as well. Did you find a solution?
So you just implemented Llama along with a RAG approach to the prompts, right?
Nice video. How can we test the model with test data? How can we ensure that it is generating data correctly?
Is a 3070 enough to run the model?
When I tested the code, it always returned "Split into 0 chunks of text". Does anyone know what causes this?
I'm still unclear about what we do with these models once they are fine-tuned on our data. Where do we put the file so it can be used by the public in a chat application, say on WordPress? Customers obviously don't want to log into a terminal; they go to a site, a chatbot prompts them, and they want that chatbot to reply to them personally. Is there software already out there that can accept a fine-tuned LLM? Can you suggest one that doesn't have a subscription, preferably for WP?
Oobanoogas text generation webui runs locally, just git clone, put models in the models folder, in parameters>characters tab, customize characters, etc etc.. I may have misspelled that, idk.
@@mikefreeman6399 But that is still a terminal on your PC; even if you add an API, it's still a terminal-looking thing. I'm specifically asking about an application like a chat app that sits on a WordPress site, where a customer who is looking to buy something can ask a question on that site. Oobabooga is just the terminal to the model. I hope I'm explaining myself. But to clarify further: if you go to any random site to buy something, say toothbrushes, and you need to ask a specific question about their toothbrush on their site, you don't want to go to another page to chat with the Oobabooga interface. You just want a small chatbox on the side with a "live person", or AI in this case, right?
Can you make a video on how to convert this model to an exe?
😊😊 How much RAM is needed to run this model?
The quantized one will need about 4-6GB (4-bit).
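The figure above lines up with simple arithmetic. This is a rough estimate, assuming a round 7 billion parameters and counting only the weights; the context cache and runtime overhead come on top, which is why the practical number is higher than the raw weight size. The function name `weight_memory_gb` is illustrative.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in decimal gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 7e9  # assumption: "7B" taken as a round 7 billion parameters

for bits in (16, 8, 4):
    print(f"{bits}-bit: {weight_memory_gb(N_PARAMS, bits):.1f} GB")
# prints:
# 16-bit: 14.0 GB
# 8-bit: 7.0 GB
# 4-bit: 3.5 GB
```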
Can you make a video on how to train on our personal documents (PDF or text)?
The program is running with the internet. Can we run LocalGPT without the internet instead? Please tell us how to do that.
I tried to run this on a ThinkPad X250 (Core i5, Win11, 8 GB RAM)... 😂 It ran like a dead turtle... So please make a video about this, but instead of running a model locally, let's use the HF API, if possible. Keep the objective unchanged, though: chat with multiple PDFs. It would be great for those who cannot afford a high-spec system.
I agree. You might be able to run the highly quantized version, though. 2024 is going to be awesome for local models.
Wow, 2-bit quantization? Isn't that too few possible values for the weights?
Yes, but with 7B parameters, the network might still be able to preserve some of what it learned.
I get an index out of range error. Why is this?
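To make the "too few possible values" point concrete: with b bits, each weight can take at most 2^b distinct values, so 2 bits leaves 4. Below is a toy sketch of uniform quantization over a handful of made-up weights. It is not the scheme any particular model format uses; practical quantizers typically work on small blocks of weights with their own scale factors, which recovers a good deal of precision. The function name and sample values are illustrative.

```python
from typing import List


def quantize(weights: List[float], bits: int) -> List[float]:
    """Snap each weight to the nearest of 2**bits evenly spaced levels."""
    levels = 2 ** bits
    lo, hi = min(weights), max(weights)
    if hi == lo or levels < 2:
        return list(weights)  # nothing to spread the levels over
    step = (hi - lo) / (levels - 1)
    return [lo + round((w - lo) / step) * step for w in weights]


weights = [-1.0, -0.4, 0.1, 0.3, 0.9, 1.0]  # made-up example values

for bits in (8, 4, 2):
    q = quantize(weights, bits)
    worst = max(abs(a - b) for a, b in zip(weights, q))
    print(f"{bits}-bit: up to {2 ** bits} levels, worst error {worst:.3f}")
```

The worst-case error grows as the bit width shrinks, which is the trade-off the comment above is asking about.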
Have you thought of having a Colab notebook?
Thanks!! Awesome video. Is there a way to do it in Google Colab?
Thank you so much for providing us with the updated code for Mistral! I have tested Mistral vs. Llama-2 chat on long texts about philosophy, and in my case Llama-2 seems to understand them better at the moment. Thank you for developing this project!
I'm still working on my project, which is similar. My long text comprises four hundred thousand Chinese characters.
Why is it called "GPT"? Does it use any API key to interact with GPT models? If yes, then why do you need other LLMs with it? If not, then what does it do that makes the other LLMs work like a charm? Like, just takes a document, and extract answers for unseen questions.
Sorry for my newbie question, exploring this topic for the first day.
GPT stands for Generative Pre-trained Transformer; the term is not owned by OpenAI.
How can I reduce the LLM's response time?
Is an RTX 4070 good enough to run a GPU model?
If I ingest fileA and then want to create another GPT instance with a different knowledge base, separate from the earlier one, should I just rerun the ingest with the replaced files, or do I need to create a separate conda environment?
Currently, you will need to delete the "DB" folder and run ingest again. In the constants.py file, you can set the folder name of the DB you want to create/use.
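The reset step described above can be sketched as follows, assuming the vector store lives in a single folder on disk. The function name `reset_vector_store` and the "DB" path in the usage line are illustrative; check constants.py in your copy of the repo for the folder actually in use.

```python
import shutil
from pathlib import Path


def reset_vector_store(persist_directory: str) -> bool:
    """Delete the vector store folder so the next ingest starts clean.

    Returns True if a folder was removed, False if there was nothing to remove.
    """
    path = Path(persist_directory)
    if path.is_dir():
        shutil.rmtree(path)  # removes the folder and everything inside it
        return True
    return False


if __name__ == "__main__":
    # "DB" is an assumed folder name relative to the project root.
    removed = reset_vector_store("DB")
    print("Removed old store" if removed else "No existing store found")
```

After the reset, run the ingest step again on the new set of files. Pointing the folder setting at a different name for each document set would keep several knowledge bases side by side without deleting anything.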
Does it work with the Persian language? Thanks.
You can use something like Aya for Persian
cohere.com/research/aya
@@engineerprompt thanks
Hi, is an internet connection required to run the model?
Thanks so much!
thanks
How do I get an API key for Mistral?
On their website
God Bless you