ERROR - models.indexer - Error during indexing: Unable to get page count. Any ideas?
same
Wooohoooo!!! This is so cool! I need more time, I definitely have to test it!!!!
Thanks again for the great work! I have tested a similar approach with a vision model. It is especially good for PDFs with lots of unstructured data like graphs, plots, pictures, and text. One limitation: when I built a chatbot and wanted to retrieve the hyperlinks within the documents, I couldn't, because a hyperlink's URL is not visible in the page image. That was not a problem when I used markdown with a standard text-based RAG system.
Questions:
- How many PDFs can I upload? Is there a size limit?
- Does the chatbot have a memory of the current conversation? If so, how are you handling it?
This is amazing! Thanks, will try it out.
Indeed, this is an amazing project. I'll check out the code and give it a try. Thank you very much for sharing, there's a lot to learn from this one.
Running on a laptop with a GPU, I am getting the following error:
- ERROR - models.indexer - Error during indexing: Input type (torch.cuda.FloatTensor) and weight type (CUDABFloat16Type) should be the same
Any idea?
I am facing the same error, did you solve it?
@SurajPrasad-bf9qn nope!
Cool! Is there a context window or any strict limit on the quantity of pages or images that can be uploaded?
Will try it out
I uploaded a PDF for indexing, and when I click the upload and indexing button, the page responds with "can't reach this page". Can anyone suggest where to check for the issue?
Very nice work
Can document metadata be included as well in the retrieval, such as document name or title, author, and publication year?
Yes, that can be added
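One way this could look, as a rough sketch only: keep a small metadata record next to each indexed page, filter on it before ranking, and use it to label the retrieved pages. The names `PageRecord`, `filter_pages`, and `format_source` are hypothetical and not taken from the project's code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PageRecord:
    """Hypothetical metadata stored alongside each indexed page image."""
    doc_name: str
    page_number: int
    title: Optional[str] = None
    author: Optional[str] = None
    year: Optional[int] = None


def filter_pages(records, author=None, year=None):
    """Keep the records that match every metadata filter that was given."""
    return [
        r for r in records
        if (author is None or r.author == author)
        and (year is None or r.year == year)
    ]


def format_source(record):
    """Build a short citation string to show next to a retrieved page."""
    parts = [record.title or record.doc_name]
    if record.author:
        parts.append(record.author)
    if record.year:
        parts.append(str(record.year))
    return ", ".join(parts) + f" (p. {record.page_number})"
```

The filter would run before the visual similarity search so that only pages from matching documents get ranked; the citation string can then be passed to the chat model together with the page images.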
The PDF document format is specific, right? So maybe it's possible to compare results using just that formatted content data?
It's closed, owned, and controlled by Adobe, correct?
So why do this?
Would love a video about the detailed architecture and code explanation. Thanks.
This is awesome. Very grateful. What is your local setup, GPU?
Any chance you can add the new Mistral Pixtral model to your software? It seems to be the best local model for vision, and it's based on Nemo.
Yes, I think it can be added. Will have a look into it.
Thank you for your expertise! Could you recommend a stable and efficient large language model for coding that I can run on my machine without it becoming unresponsive?
Qwen2.5 VL 72b support?
How complex would it be to combine Verbi and Local GPT-Vision? Is this a realistic possibility?
VERY cool!
Great stuff though!! Nice work!
If poppler is missing under Windows, use: choco install poppler
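The "Unable to get page count" error reported above is the message pdf2image typically gives when Poppler is not on the PATH. That assumes the indexer converts PDFs with pdf2image, which this thread does not confirm. A minimal sketch for checking the Poppler command-line tools before indexing (the helper names are hypothetical):

```python
import shutil

# Command-line tools shipped with Poppler; pdf2image calls these under the hood.
POPPLER_TOOLS = ("pdfinfo", "pdftoppm")


def find_poppler_tools(tools=POPPLER_TOOLS):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


def missing_poppler_tools(found):
    """Return the names of the tools that could not be located."""
    return [tool for tool, path in found.items() if path is None]


if __name__ == "__main__":
    missing = missing_poppler_tools(find_poppler_tools())
    if missing:
        print("Poppler tools not found on PATH:", ", ".join(missing))
    else:
        print("Poppler looks installed.")
```

If the tools are reported missing right after installing Poppler, opening a new terminal so the updated PATH is picked up is usually worth trying first.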
I think google-generativeai is misspelled as google-generative-ai in the requirements.txt
Thanks for pointing it out, will fix that
I like the concept of this, but I don't like the original model selection. Can you add other OpenAI API models like 4o?
Yes, will update the list with more models
Can us mere mortals has a 1 click installer plox. Some sort of bat file or something that checks for whatever is required and optional and let us choose. You could tell an AI to write it for you.