Can VISION Language Models Solve RAG? Introducing localGPT-Vision

Prompt Engineering

8K views

Published: Nov 10, 2024

Comments • 33

@bwljustus8077 1 month ago ⁺³
ERROR - models.indexer - Error during indexing: Unable to get page count. Any ideas?
@mosbehbarhoumi9256 1 month ago
same
@MeinDeutschkurs 1 month ago
Wooohoooo!!! This is so cool! I need more time, I definitely have to test it!!!!
@kai_s1985 1 month ago ⁺²
Thanks again for the great work! I have tested a similar approach with a vision model. It is especially good for PDFs with lots of unstructured data like graphs, plots, pictures, text, etc. One limitation of this approach showed up when I created a chatbot and wanted to get the hyperlinks within the documents: I couldn't, because the URL of a hyperlink is not visible in the image. That was not a problem when I used markdown with a standard text-based RAG system.
Questions:
- How many PDFs can I upload? Is there any size limit?
- Does the chatbot have a memory of the current conversation? If so, how are you handling it?
@magmikefpv 1 month ago ⁺¹
This is amazing! Thanks, will try it out.
@kenchang3456 1 month ago
Indeed, this is an amazing project. I'll check out the code and give it a try. Thank you very much for sharing, there's a lot to learn from this one.
@eduardvendrell9136 18 days ago ⁺¹
Running on a laptop with GPU I am getting the following error:
- ERROR - models.indexer - Error during indexing: Input type (torch.cuda.FloatTensor) and weight type (CUDABFloat16Type) should be the same
Any idea?
@SurajPrasad-bf9qn 16 days ago
I am facing the same error. Did you solve it?
@eduardvendrell9136 16 days ago
@SurajPrasad-bf9qn nope!
@Elingsanto 1 month ago
Cool! Is there a context window or any strict limit on the quantity of pages or images that can be uploaded?
Will try it out.
@chaitanyanerpagar6076 24 days ago
I uploaded a PDF for indexing, and once I click the upload and indexing button, the page responds with "can't reach this page". Can anyone suggest where to look for the issue?
@thenextension9160 1 month ago
Very nice work
@unshadowlabs 1 month ago
Can document metadata be included as well in the retrieval, such as document name or title, author, and publication year?
@engineerprompt 1 month ago
Yes, that can be added
@nyliveechay-so3ps 1 month ago
The PDF document format is specific, right? So maybe it's possible to compare results just using that formatted content data?
It's closed, owned, and controlled by Adobe, correct?
So why do this?
@akashnagarkar7560 1 month ago
Would love a video about the detailed architecture and code explanation. Thanks.
@ysy69 1 month ago
This is awesome. Very grateful. What is your local setup, GPU?
@awesomedata8973 1 month ago ⁺¹
Any chance you can input the new Mistral Pixtral model in your software? -- It seems to be the best version of a local model for vision, and it's based on Nemo.
@engineerprompt 1 month ago ⁺¹
Yes, I think it can be added. Will have a look into it.
@forcebrew 1 month ago
Thank you for your expertise! Could you recommend a stable and efficient large language model for coding that I can run on my machine without it becoming unresponsive?
@MagagnaJayzxui 1 month ago
Qwen2.5 VL 72b support?
@trevorbaylis7423 1 month ago
What would be the complexity level of combining Verbi and localGPT-Vision? Is this a realistic possibility?
@TeamDman 1 month ago
VERY cool!
@nyliveechay-so3ps 1 month ago
Great stuff though!! Nice work!
@bwljustus8077 1 month ago
If poppler is missing under Windows, use: choco install poppler
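
A quick way to confirm this before re-running the indexer is to check whether Poppler's command-line tools are on the PATH. The snippet below is a minimal sketch, assuming the "Unable to get page count" error comes from a missing Poppler install, as this comment suggests. The function name `poppler_available` and its `which` parameter are hypothetical and are not part of localGPT-Vision.

```python
import shutil


def poppler_available(which=shutil.which):
    """Return True if Poppler's pdfinfo and pdftoppm are both on PATH.

    `which` is injectable only so the check can be exercised without
    depending on what is installed on the current machine.
    """
    return all(which(tool) is not None for tool in ("pdfinfo", "pdftoppm"))


if __name__ == "__main__":
    if poppler_available():
        print("Poppler tools found on PATH.")
    else:
        # Assumption: on Windows, `choco install poppler` (from the comment above)
        # puts these tools on PATH; a new terminal may be needed afterwards.
        print("Poppler tools not found on PATH; install Poppler and retry.")
```

If the check fails right after installing, opening a new terminal so the updated PATH is picked up is usually the first thing to try.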
@RyanSmith-rb1ch 1 month ago
I think google-generativeai is misspelled as google-generative-ai in the requirements.txt
@engineerprompt 1 month ago
Thanks for pointing it out, will fix that
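
Until the file is fixed upstream, the package name can be corrected locally before installing. The following is a minimal sketch, assuming the only problem is the `google-generative-ai` spelling reported above. The helper names `fix_requirement_line` and `fix_requirements` are hypothetical.

```python
def fix_requirement_line(line: str) -> str:
    """Rewrite the misspelled package name; leave every other line untouched."""
    if line.strip().startswith("google-generative-ai"):
        return line.replace("google-generative-ai", "google-generativeai", 1)
    return line


def fix_requirements(text: str) -> str:
    """Apply the fix to each line of a requirements.txt given as a string."""
    return "\n".join(fix_requirement_line(line) for line in text.splitlines())


if __name__ == "__main__":
    # Assumes requirements.txt sits in the current working directory.
    with open("requirements.txt", encoding="utf-8") as fh:
        fixed = fix_requirements(fh.read())
    with open("requirements.txt", "w", encoding="utf-8") as fh:
        fh.write(fixed + "\n")
```

Any version pin on the line is preserved, since only the package name is replaced.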
@brianhopson2072 1 month ago
I like the concept of this, but I don't like the original model selection. Can you add other OpenAI API models, like 4o?
@engineerprompt 1 month ago
Yes, will update the list with more models
@dadadies 1 month ago
Can us mere mortals has a 1 click installer plox. Some sort of bat file or something that checks for whatever is required and optional and let us choose. You could tell an AI to write it for you.
