Awesome content as always, thanks Alejandro!
thanks chris!
thanks a lot! this is very helpful. if you figure out how to create RAG when you have charts and graphs in a document, please share with us also
thanks for the video, yes please make more videos on llama-index and llama-parse
coming up!!!
Bro, I follow your videos, what you do is really good, it helped me a lot with my work, so I joined your channel!
hey man, thank you so much!! so glad to have you on board and very happy to know that you have found my work useful :) let me know if i can help you with anything :)
@@alejandro_ao Come on bro, I'm just doing a cancer project and although it's not similar to the dataset you used, the example you gave in other videos was one of the things I also used as a guide. When it is well prepared I will show it to you so you can give an opinion.
Awesome. Thank you for your classes. I would love to see something similar, but local.
indeed, i realize there's a lot of demand for local implementations. i'm working on it, you should see it up soon!
Hey Alejandro, you've done many projects on your channel using GenAI concepts, but you haven't yet done a video on multimodal RAG. Kindly requesting a project on multimodal RAG.
For this particular document, if you're using Windows, you'll need encoding='utf-8' as the last parameter in the "with open()" statement. I don't think Macs have this issue.
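A minimal illustration of that pitfall (the file path here is just a placeholder, not the one from the video):

```python
import os
import tempfile

# On Windows, open() without an explicit encoding falls back to the locale
# code page (often cp1252), which raises UnicodeEncodeError on characters
# outside it. Passing encoding="utf-8" makes behavior consistent everywhere.
text_out = "café, naïve, ✓"  # the check mark is not representable in cp1252

path = os.path.join(tempfile.gettempdir(), "parsed_output_demo.md")
with open(path, "w", encoding="utf-8") as f:
    f.write(text_out)

with open(path, encoding="utf-8") as f:
    text_in = f.read()
```

Without the `encoding="utf-8"` arguments, the write would fail on a default-configured Windows locale while silently working on macOS/Linux.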
I want to know how to handle the mathematical formulas contained in PDFs.
Yeah, LlamaParse is good for tables in PDFs, but it can often fail on complex scanned PDFs or complex tables. Do you know any better options, or the SOTA solution for RAG on tables in PDFs? I would be very grateful!
thank you
Great video. What do you think: how would this perform with PDFs containing scans or lots of images? Another pain point in my use cases is recognising scanned text, like a fully scanned book, and transforming it to Markdown to use with LLMs.
Cool and informative vid thanks
no problem! it's mostly useful to create RAG applications
Many thanks bro! Fantastic video as always! I was wondering if you could give me some advice:
My work primarily involves using RAG on scientific papers (hundreds of them!), which often include figures that sometimes convey more information than the text itself. Can LlamaParse analyse the figures and add a description of them to the markdown file? (That would literally create LLM professors via RAG!)
If not, is there a technique to incorporate these figures into the vector database along with the paper's text? Essentially, for multi-modal vector embedding that includes both text and images, what's the best approach to achieve this?
I greatly appreciate your insight 🙏🙏🙏
hey sam, i'm glad this is useful! that's a great question. i am actually working on a comprehensive set of tutorials dealing precisely with multimodal rag. essentially, you have to use a model with vision like GPT-4V to parse the tables and images if you want to do this. expect to see this on the channel soon!
@@alejandro_ao MANY MANY THANKS 🙏 🙏 🙏 that will help the academic research greatly!
Passionately looking forward to watching and learning from them!
awesome, love your content!!
thank you so much!
Would love to see something similar but using hugging face 🤗
Noted!
Great!! As always!! Thanks!!
@@alejandro_ao can we parse locally using LlamaParse if the org doesn't want to send data to the cloud? Can we use open-source LLMs with LlamaParse?
How do you add this to a retrieval pipeline? Which splitter?
Hi, could you convert complex PDF documents (with graphics and tables) into an easily readable text format, such as Markdown? The input file would be a PDF and the output file would be a text file (.txt).
hello there! llamaparse should do the trick for you. although for more complex pdfs (mainly those including images), maybe you will need to do this by hand using an LLM with vision (such as GPT-4o or Gemini 1.5). i will be making a video about that soon!
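As a rough sketch of that PDF-to-text flow, assuming the `llama-parse` package is installed and a LlamaCloud API key is available (the file names here are placeholders):

```python
import os

RESULT_TYPE = "markdown"  # ask LlamaParse to return Markdown rather than plain text

# The API call is guarded so this sketch only runs when a key is configured.
if os.environ.get("LLAMA_CLOUD_API_KEY"):
    from llama_parse import LlamaParse

    parser = LlamaParse(result_type=RESULT_TYPE)
    documents = parser.load_data("report.pdf")  # returns a list of Document objects

    # Join the per-page Markdown and save it as a plain-text file.
    with open("report.txt", "w", encoding="utf-8") as f:
        f.write("\n\n".join(doc.text for doc in documents))
```

Since the output is Markdown, tables come back as pipe-delimited Markdown tables, which keeps their structure readable for a downstream LLM.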
I believe we don't know what is happening inside LlamaParse.
indeed, they use GenAI for parsing the documents, and the exact way they do it is part of their secret sauce. but they make it possible to run LlamaCloud (and LlamaParse) within your servers as part of their enterprise solution, so you can be sure that the data never leaves your premises.
@@alejandro_ao can I implement this using Next.js/React?
i get an error saying the api key is required, even after following all the steps mentioned
hey there, it sounds like you haven't got your api key from llamacloud. you can create an account and generate your api key here: cloud.llamaindex.ai/
If we need to chat with data in a table, can we use this api, send the output to a vector database (rag app), and then chat with that table?
absolutely, that's the main use of this api. since the table is converted to markdown, it can be used in retrieval for rag apps :)
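As a rough sketch of that flow, here is a hand-rolled heading-based splitter for illustration (not any particular library; in practice a Markdown-aware splitter serves the same purpose of keeping each table inside the section that introduces it):

```python
import re

def split_markdown_by_heading(md: str) -> list[str]:
    """Split a Markdown document into one chunk per heading section,
    so tables stay attached to the heading that describes them."""
    chunks, current = [], []
    for line in md.splitlines():
        # Start a new chunk at every heading (#, ##, ... ######).
        if re.match(r"^#{1,6} ", line) and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return chunks

# Example: Markdown as LlamaParse might return it, with a pipe table.
md = "# Report\nIntro text.\n\n## Revenue\n\n| Q | Revenue |\n|---|---|\n| 1 | 10M |\n"
chunks = split_markdown_by_heading(md)
# Each chunk would then be embedded and stored in the vector database.
```

Chunking on headings rather than on a fixed character count avoids slicing a table in half, which would make the rows meaningless at retrieval time.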
Thanks!
In my PDF files, there are several questions organized in tables. I want to extract these questions group by group, taking the headers into account, etc. There will be 4-6 sets of questions. Should I use an LLM for this task? I believe Llama alone cannot handle this part. If so, which ChatGPT model would be the best fit for this use case? I previously implemented a chat-with-multiple-PDFs project using a specific ChatGPT model. Is it still a good idea to use that model, or is there a better option available now?
By the way, I will have some paragraphs in future pdf files where I am supposed to extract structured data as well.
What is your recommendation in general?
hey man, so sorry i missed your comment. thank you so much for the tip!
in my tests, this api works great for extracting tables in pretty much any pdf, no matter how complicated they are. you could probably use a local setup (i am planning a video on multimodal rag), but this api looks like the most useful approach to me, considering your case. what i would do is extract the questions from your documents using llamaparse and then add them to my vector database.
about the model, pretty much any model above gpt-4 should be perfect for this.
let me know how this went!
Thank you for your valuable efforts. Getting an API key is difficult for me. Do you know of any other solutions? Thank you for your reply.
hey there, is that because your company requires that the data not leave the premises?
is there any way to do it without LlamaParse, without the API limitation?
you can try to do this with your own model or an LLM with vision. are you looking to parse more than 1k pages per day?
@@alejandro_ao hi thank you for the reply. yes i was trying to do it locally since i have a lot of articles with tabular structure that i wanted to extract and use
@@anurajms i'm planning more videos about multimodal rag. i'm pretty sure that will help you!
@@alejandro_ao awesome thank you
Hi Alejandro! Can I do this with a local model on my server?
hey there, it is possible to have LlamaParse run within your own system. but that is only possible for bigger companies with their enterprise solution. you can get in touch with them for that here: www.llamaindex.ai/contact
i am not an official representative of LlamaIndex, but when I asked Jerry (the founder of LlamaIndex) this same question, he mentioned that LlamaParse (and LlamaCloud) uses its own GenAI for parsing these documents, and the process and model they built to make this possible are proprietary. so you cannot just install it on your computer.
however, if your company requires that the data never leaves your premises, they can put LlamaCloud within your servers with the enterprise solution.
if your concern is privacy, Jerry stated that they do not store any source data from their clients. sometimes they might store metadata of the documents to improve retrieval, but that's all. you can hear his response to this same question here: ruclips.net/video/imlQ1icxpBU/видео.html at minute 41:34.
@@alejandro_ao Right!! Thank you very much for your feedback. I need to extract tables from laboratory PDF documents, and this is extremely sensitive data. That's why I'm looking for an LLM or AI that does this locally. I have already managed to extract a table when it fits on a single page, but tables sometimes overflow onto several pages, and then it becomes more complex to apply logic to them. I will continue to look for a solution. I always watch your videos, they help me a lot. Congratulations!!
@@vagnerbelfort687 This means a lot! Thanks!! That sounds like a pretty interesting task. Let me know if you find a way to do that. I am preparing a bunch of videos on advanced RAG so I might cover something like this soon!
Will this work with PDFs containing images and fonts?
absolutely. it even works with powerpoint slides
@@alejandro_ao I just tried it and it gave me the markdown, but the images are missing 😞
Hi Alejandro,
How can I reach you for consulting? I tried the consulting link but it is not working. Can you kindly share an email address?
Cheers!
happy to hear that!
@@alejandro_ao Thanks. But I asked for an email to contact you :)
Hey, sorry about that! That must have been a bug with my UI! Sure, you can send me an email to hello@alejandro-ao.com. I will be going back to consulting starting from next week.