You don't need GraphRAG!!!
- Published: 27 Jul 2024
- I'm giving 3 reasons why you won't need GraphRAG at this point!
Timestamps:
00:00 Intro
00:06 No GraphRAG
00:49 GraphRAG Baseline Benchmark
02:52 Chunking Techniques Report
04:52 GraphRAG Cost Issue
06:51 Importance of Latency in RAG
08:01 The End
🔗 Links 🔗
Graph RAG
@engineerprompt Video discussing cost - • Graph RAG: Improving R...
GraphRAG is RAG for Riches - x.com/iamvladyashin/status/18...
GraphRAG by MSFT - www.microsoft.com/en-us/resea...
❤️ If you want to support the channel ❤️
Support here:
Patreon - / 1littlecoder
Ko-Fi - ko-fi.com/1littlecoder
🧭 Follow me on 🧭
Twitter - / 1littlecoder
Linkedin - / amrrs
I love your channel, but I also agree that you are missing the point here. Even the most advanced chunking, retrieval, re-ranking, or other techniques will never retrieve related topics that are rich for the problem being solved. For example, if you ask about machine learning and find a related cluster, but there is another cluster about ethics that has a strong relationship with the first, and you want to delve into it to give a better answer, you will never retrieve those chunks from a similarity search based on a question about machine learning. That is where something like GraphRAG, and probably an evolution of it, will allow agents to traverse the graph through other concepts that are meaningful to give a better answer. For me this is the first step in the right direction for more advanced agents.
@@adandrea interesting point. Thanks for sharing. I'm wondering why Microsoft couldn't pick an example like this and showcase benchmarks
Good question! @@1littlecoder
@@1littlecoder Because it's that new. The writer of the research paper did showcase something, but perhaps not this particular concept mentioned by @adandrea. Right now GraphRAG is v0.1, I believe, so I'd say it'll take some time for use cases to develop and the system to improve.
Interesting perspective. However, you were very vague when it came to details on these "more advanced solutions" we should be using instead. My stack at the moment is simply:
1. "BAAI/bge-small-en-v1.5" for the bi-encoder
2. "BAAI/bge-reranker-v2-minicpm-layerwise" as the cross encoder to re-rank the top 50 from step 1
3. phi "3.1" mini Q5_K_M to generate a response
is there a more advanced technique from here other than larger models or Graph RAG?
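The retrieve-then-rerank stack the commenter describes can be sketched roughly as below. The toy word-overlap scorers are hypothetical stand-ins for the BAAI bi-encoder and cross-encoder models mentioned above; this is a sketch of the pipeline shape, not the commenter's actual code.

```python
# Two-stage pipeline: cheap bi-encoder retrieval over the whole corpus,
# then a more expensive cross-encoder re-ranking of the top candidates.

def bi_encoder_score(query, doc):
    """Stage-1 stand-in: Jaccard word overlap instead of embedding similarity."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / (len(q | d) or 1)

def cross_encoder_score(query, doc):
    """Stage-2 stand-in: overlap normalized by document length."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / (len(d) or 1)

def retrieve_and_rerank(query, corpus, top_k_retrieve=50, top_k_final=5):
    # Stage 1: retrieve a broad candidate set with the cheap scorer.
    candidates = sorted(corpus, key=lambda d: bi_encoder_score(query, d),
                        reverse=True)[:top_k_retrieve]
    # Stage 2: re-rank only those candidates with the expensive scorer.
    return sorted(candidates, key=lambda d: cross_encoder_score(query, d),
                  reverse=True)[:top_k_final]

corpus = [
    "graph databases store entities and relationships",
    "vector search retrieves chunks by embedding similarity",
    "rag combines retrieval with a language model",
]
print(retrieve_and_rerank("how does rag retrieval work", corpus, top_k_final=2))
```

In the real stack, stage 1 would embed query and documents independently (fast, cacheable) and stage 2 would score each query-document pair jointly (slow, so it only sees the top 50).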
@@theresalwaysanotherway3996 thanks for sharing your approach. Curious if you use a vector db?
Sounds neat. You got a GH or HF repo for that or a notebook? Would love to check out, I'm a little out of the loop of the cutting edge atm.
@@1littlecoder to store the vector embeddings I just save a .pt file with torch.save(embeddings), and on start up I load it up. Are there any advanced techniques I'm missing that could get better retrieval scores other than moving up to Graph RAG?
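The persistence pattern described (save the precomputed embeddings to disk, reload at startup) looks roughly like this sketch, using numpy's save/load as a stand-in for the torch.save call in the comment; the shapes and paths are illustrative assumptions.

```python
import os
import tempfile
import numpy as np

# Persist precomputed embeddings once, then reload them at startup
# instead of re-embedding the corpus (numpy stand-in for torch.save).

def save_embeddings(embeddings, path):
    np.save(path, embeddings)

def load_embeddings(path):
    return np.load(path)

# e.g. 100 documents, 384-dim vectors (the size bge-small produces)
embeddings = np.random.rand(100, 384).astype(np.float32)
path = os.path.join(tempfile.gettempdir(), "embeddings.npy")

save_embeddings(embeddings, path)
restored = load_embeddings(path)
assert np.allclose(embeddings, restored)
```

This works fine at small scale; a vector DB mainly adds approximate-nearest-neighbor indexing and incremental updates on top of the same idea.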
@1littlecoder can you please make a video on SOTA chunking, embeddings, and vector stores for production-grade use?
Great insights!
With all due respect, you got this one wrong, and it's a totally opinionated view. Let me know if you want an open discussion and I can come on your channel.
Why do you think so? I listed my reasons. Happy to hear yours!
Thanks for this. It was very useful, as we are actually evaluating GraphRAG for one of our projects. You mention that there are different chunking techniques which can improve the output quality of RAG. Can you please do a video on that too?
I don't understand how graphrag costs more. I thought the way it worked was, one round-trip to LLM is used to write the graph query, then the query runs against the graphQL database, then the results from the query are used in a 2nd round-trip to the LLM. Seems a sure thing that this is slower, but I would expect both prompts to be quite a bit smaller than the typical vector-database solution. 1st prompt is just the user's question and maybe some info about the graph schema; 2nd prompt is very precise data that almost directly answers the question, instead of a large number of chunks that may or may not be relevant. Perhaps the large cost is incurred while INPUT of the source data happens? Thank you for the video, btw. This is def a topic that needs to be understood by people making a living in this field.
Talking about a fair comparison, but forgetting to distinguish the costs of the different parts of RAG, like ingestion and retrieval.
Ingestion is much more costly for a graph solution, but there are use cases where it pays off: when the docs don't change frequently or at all (e.g. arXiv articles), improving RAG accuracy is important, and latency is not that critical...
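The ingestion-vs-retrieval cost split can be made concrete with a back-of-the-envelope sketch. All numbers below (chunk counts, token counts, prices) are made-up assumptions for illustration, not measurements: the point is only that GraphRAG pushes every chunk through an LLM at ingestion to extract entities and relations, while vanilla RAG only computes embeddings.

```python
# Rough, assumption-laden cost model: ingestion cost scales with how much
# of the corpus must pass through the expensive model.

def graph_ingestion_cost(n_chunks, tokens_per_chunk, llm_price_per_1k_tokens):
    # GraphRAG: every chunk goes through at least one LLM extraction call.
    return n_chunks * tokens_per_chunk / 1000 * llm_price_per_1k_tokens

def vector_ingestion_cost(n_chunks, tokens_per_chunk, embed_price_per_1k_tokens):
    # Vanilla RAG: every chunk only needs an embedding call.
    return n_chunks * tokens_per_chunk / 1000 * embed_price_per_1k_tokens

n_chunks, tokens = 1000, 600  # hypothetical corpus
print(graph_ingestion_cost(n_chunks, tokens, 0.01))     # hypothetical LLM price
print(vector_ingestion_cost(n_chunks, tokens, 0.0001))  # hypothetical embedding price
```

Even with generous assumptions, the LLM-per-chunk pass dominates by orders of magnitude, which matches the "pays off only when docs rarely change" point above.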
Haha My bad. Thanks for pointing it out!
Yes, GraphRAG can find extra answers that are missed by current RAG techniques, but in general the end results are acceptable with current RAG techniques for most use cases. Agreed, it's slow, and you need a powerful LLM, so it costs a lot. I think it's a fair warning about the cons of GraphRAG. Thanks for the video.
Thank you
You're welcome!
Very well said! I agree so much with you! I think the cost in both money and time is too much for this. We were charged around $14 for indexing a single document with GraphRAG. But it is not only the cost. It is also the fact that not all documents are connected with each other, nor is the knowledge in every domain distributed across many documents. So GraphRAG is not the solution for every RAG problem, nor one-size-fits-all. I recently read a relevant article on Medium that I can share with you.
Also, I am interested in exactly which techniques you have in mind when you talk about advanced RAG in your video. A video on this topic could be helpful!
Thanks for sharing. YouTube might shadowban comments with links, could you please email it to me or send it on Twitter? My email is 1littlecoder at gmail dot com
@@1littlecoder Certainly! I sent it through an email! Have a nice day!
I'm using a knowledge graph and trying to cut down on queries. Very timely.
Graph RAG must be looked at as a capability enabler, not just a technique.
LLM-based RAGs/apps have a weak information model - Graph RAG is "one step" in fixing that.
Of course, it is not too far fetched to think that LLM-like systems should be able to deal with graphs natively and this will happen, too.
It's basically alpha quality code that barely works and is loaded with problems. Once it matures a little it will be much more useful. The concept itself is on the right path though.
Wrong... You absolutely need it for any serious retrieval of advanced concepts
RAG systems have 2 big disadvantages: the objective function that was used to create the document embeddings, and the curse of dimensionality (false positives).
curious why did you mention false positives alongside curse of dimensionality?
@@1littlecoder Say your query is "orange" and your relevant document is along the "fruit" dimension. When you do a nearest-neighbor match, you will also pull documents from other, irrelevant dimensions like "shape", and you will get "football" in your RAG, which will pollute generation. You see, KNN-based retrieval does not differentiate among dimensions, and with extremely high dimensions (2048, 4096, 8192, 16384) the problem becomes more prone to false positives.
@@1littlecoder It is noisy by design, suffering from the curse of dimensionality. Say the query is "orange": RAG will retrieve "apple" (fruit dimension) and "ball" (shape dimension). Now "ball" will add noise. Noise ~ false positives.
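The "orange retrieves ball" false positive can be shown with a toy example. The 2-dimensional vectors below are hand-crafted assumptions, not real embeddings; the point is that cosine similarity scores all dimensions at once, so a document that matches only on an irrelevant dimension can still make the top-k.

```python
import math

# Toy vocabulary embedded along two interpretable axes: (fruit-ness, round-ness).
docs = {
    "apple": (1.0, 0.1),  # strong on fruit, weak on shape
    "ball":  (0.0, 1.0),  # matches ONLY on the irrelevant shape dimension
    "car":   (0.0, 0.0),  # matches on neither
}
query = (0.9, 0.9)  # "orange": both a fruit and round

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# KNN retrieval: top-2 documents by cosine similarity to the query.
top2 = sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)[:2]
print(top2)  # "ball" makes the cut purely on the shape dimension
```

With thousands of dimensions instead of two, many more documents share *some* axis with the query, which is the false-positive pressure the comment describes.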
That’s ok, you can use the LLM to ask about your results.