You don't need GraphRAG!!!
- Published: 27 Jul 2024
- I'm giving 3 reasons why you won't need GraphRAG at this point!
Timestamps:
00:00 Intro
00:06 No GraphRAG
00:49 GraphRAG Baseline Benchmark
02:52 Chunking Techniques Report
04:52 GraphRAG Cost Issue
06:51 Importance of Latency in RAG
08:01 The End
🔗 Links 🔗
Graph RAG
@engineerprompt Video discussing cost - • Graph RAG: Improving R...
GraphRAG is RAG for Riches - x.com/iamvladyashin/status/18...
GraphRAG by MSFT - www.microsoft.com/en-us/resea...
❤️ If you want to support the channel ❤️
Support here:
Patreon - / 1littlecoder
Ko-Fi - ko-fi.com/1littlecoder
🧭 Follow me on 🧭
Twitter - / 1littlecoder
Linkedin - / amrrs
I love your channel, but I also agree that you are missing the point here. Even the most advanced chunking, retrieval, re-ranking, or other techniques will never retrieve related topics that are rich for the problem being solved. For example, if you ask about machine learning and find a related cluster, but there is another cluster about ethics that has a strong relationship with the first, and you want to delve into it to give a better answer, you will never retrieve those chunks from a similarity search based on a question about machine learning. That is where something like GraphRAG, and probably an evolution of it, will allow agents to traverse the graph through other concepts that are meaningful to give a better answer. For me this is the first step in the right direction for more advanced agents.
@@adandrea interesting point. Thanks for sharing. I'm wondering why Microsoft couldn't pick an example like this and showcase benchmarks
Good question! @@1littlecoder
@@1littlecoder Because it's that new. The writer of the research paper did showcase something, but perhaps not this particular concept mentioned by @adandrea. Right now GraphRAG is v0.1, I believe, so I'd say it'll take some time for use cases to develop and the system to improve.
Interesting perspective. However, you were very vague when it came to details on these "more advanced solutions" we should be using instead. My stack at the moment is simply:
1. "BAAI/bge-small-en-v1.5" for the bi-encoder
2. "BAAI/bge-reranker-v2-minicpm-layerwise" as the cross encoder to re-rank the top 50 from step 1
3. phi "3.1" mini Q5_K_M to generate a response
is there a more advanced technique from here other than larger models or Graph RAG?
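The retrieve-then-rerank stack the commenter describes can be sketched roughly as below. The toy word-overlap scorers are hypothetical stand-ins for the BAAI bi-encoder and cross-encoder models mentioned above; this is a sketch of the pipeline shape, not the commenter's actual code.

```python
# Two-stage pipeline: cheap bi-encoder retrieval over the whole corpus,
# then a more expensive cross-encoder re-ranking of the top candidates.

def bi_encoder_score(query, doc):
    """Stage-1 stand-in: Jaccard word overlap instead of embedding similarity."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / (len(q | d) or 1)

def cross_encoder_score(query, doc):
    """Stage-2 stand-in: overlap normalized by document length."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / (len(d) or 1)

def retrieve_and_rerank(query, corpus, top_k_retrieve=50, top_k_final=5):
    # Stage 1: retrieve a broad candidate set with the cheap scorer.
    candidates = sorted(corpus, key=lambda d: bi_encoder_score(query, d),
                        reverse=True)[:top_k_retrieve]
    # Stage 2: re-rank only those candidates with the expensive scorer.
    return sorted(candidates, key=lambda d: cross_encoder_score(query, d),
                  reverse=True)[:top_k_final]

corpus = [
    "graph databases store entities and relationships",
    "vector search retrieves chunks by embedding similarity",
    "rag combines retrieval with a language model",
]
print(retrieve_and_rerank("how does rag retrieval work", corpus, top_k_final=2))
```

In the real stack, stage 1 would embed query and documents independently (fast, cacheable) and stage 2 would score each query-document pair jointly (slow, so it only sees the top 50).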
@@theresalwaysanotherway3996 thanks for sharing your approach. Curious if you use a vector db?
Sounds neat. You got a GH or HF repo for that or a notebook? Would love to check out, I'm a little out of the loop of the cutting edge atm.
@@1littlecoder to store the vector embeddings I just save a .pt file with torch.save(embeddings), and on start up I load it up. Are there any advanced techniques I'm missing that could get better retrieval scores other than moving up to Graph RAG?
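The persistence pattern described (save the precomputed embeddings to disk, reload at startup) looks roughly like this sketch, using numpy's save/load as a stand-in for the torch.save call in the comment; the shapes and paths are illustrative assumptions.

```python
import os
import tempfile
import numpy as np

# Persist precomputed embeddings once, then reload them at startup
# instead of re-embedding the corpus (numpy stand-in for torch.save).

def save_embeddings(embeddings, path):
    np.save(path, embeddings)

def load_embeddings(path):
    return np.load(path)

# e.g. 100 documents, 384-dim vectors (the size bge-small produces)
embeddings = np.random.rand(100, 384).astype(np.float32)
path = os.path.join(tempfile.gettempdir(), "embeddings.npy")

save_embeddings(embeddings, path)
restored = load_embeddings(path)
assert np.allclose(embeddings, restored)
```

This works fine at small scale; a vector DB mainly adds approximate-nearest-neighbor indexing and incremental updates on top of the same idea.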
@1littlecoder can you please make a video on SOTA chunking, embeddings, and vector stores for production-grade use?
Great insights!
With all due respect, you got this one wrong, and it's a totally opinionated view. Let me know if you want an open discussion and I can come on your channel.
Why do you think so? I listed my reasons. Happy to hear yours!
Thanks for this. It was very useful, as we are actually evaluating GraphRAG for one of our projects. You mention that there are different chunking techniques which can improve the output quality of RAG. Can you please do a video on that too?
I don't understand how graphrag costs more. I thought the way it worked was, one round-trip to LLM is used to write the graph query, then the query runs against the graphQL database, then the results from the query are used in a 2nd round-trip to the LLM. Seems a sure thing that this is slower, but I would expect both prompts to be quite a bit smaller than the typical vector-database solution. 1st prompt is just the user's question and maybe some info about the graph schema; 2nd prompt is very precise data that almost directly answers the question, instead of a large number of chunks that may or may not be relevant. Perhaps the large cost is incurred while INPUT of the source data happens? Thank you for the video, btw. This is def a topic that needs to be understood by people making a living in this field.
Talking about a fair comparison, but forgetting to distinguish the costs of the different parts of RAG, like ingestion and retrieval.
Ingestion is much more costly for a graph solution, but there are use cases where it pays off: when the docs don't change frequently or at all (e.g. arXiv articles), improving RAG accuracy is important, and latency is not that critical...
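The ingestion-vs-retrieval cost split can be made concrete with a back-of-the-envelope sketch. All numbers below (chunk counts, token counts, prices) are made-up assumptions for illustration, not measurements: the point is only that GraphRAG pushes every chunk through an LLM at ingestion to extract entities and relations, while vanilla RAG only computes embeddings.

```python
# Rough, assumption-laden cost model: ingestion cost scales with how much
# of the corpus must pass through the expensive model.

def graph_ingestion_cost(n_chunks, tokens_per_chunk, llm_price_per_1k_tokens):
    # GraphRAG: every chunk goes through at least one LLM extraction call.
    return n_chunks * tokens_per_chunk / 1000 * llm_price_per_1k_tokens

def vector_ingestion_cost(n_chunks, tokens_per_chunk, embed_price_per_1k_tokens):
    # Vanilla RAG: every chunk only needs an embedding call.
    return n_chunks * tokens_per_chunk / 1000 * embed_price_per_1k_tokens

n_chunks, tokens = 1000, 600  # hypothetical corpus
print(graph_ingestion_cost(n_chunks, tokens, 0.01))     # hypothetical LLM price
print(vector_ingestion_cost(n_chunks, tokens, 0.0001))  # hypothetical embedding price
```

Even with generous assumptions, the LLM-per-chunk pass dominates by orders of magnitude, which matches the "pays off only when docs rarely change" point above.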
Haha My bad. Thanks for pointing it out!
Yes, GraphRAG can find extra answers that are missed by current RAG techniques, but in general the end results are acceptable with current RAG techniques for most use cases. Agreed, it's slow, and you need a powerful LLM, so it costs a lot. I think it's a fair warning about the cons of GraphRAG. Thanks for the video.
Thank you
You're welcome!
Very well said! I agree so much with you! I think the cost in both money and time is too much for this. We were charged around $14 for indexing a single document with GraphRAG. But it is not only the cost. It is also the fact that not all documents are connected with each other, nor is the knowledge in every domain distributed across many documents. So GraphRAG is not the solution for every RAG problem, nor one-size-fits-all. I recently read a relevant article on Medium that I can share with you.
Also, I am interested in exactly which techniques you have in mind when you talk about advanced RAG in your video. A video on this topic could be helpful!
Thanks for sharing. YouTube might shadowban comments with links, could you please email it to me or send it on Twitter? My email is 1littlecoder at gmail dot com
@@1littlecoder Certainly! I sent it through an email! Have a nice day!
I'm using a knowledge graph and trying to cut down on queries. Very timely.
Graph RAG must be looked at as a capability enabler, not just a technique.
LLM-based RAGs/apps have a weak information model - Graph RAG is "one step" in fixing that.
Of course, it is not too far fetched to think that LLM-like systems should be able to deal with graphs natively and this will happen, too.
It's basically alpha quality code that barely works and is loaded with problems. Once it matures a little it will be much more useful. The concept itself is on the right path though.
Wrong... You absolutely need it for any serious retrieval of advanced concepts
RAG systems have 2 big disadvantages: the objective function that was used to create the document embeddings, and the curse of dimensionality (false positives).
curious why did you mention false positives alongside curse of dimensionality?
@@1littlecoder Say your query is "orange" and your relevant document is along the "fruit" dimension. When you do a nearest-neighbor match, you will also pull documents from other, irrelevant dimensions like "shape", and you will get "football" in your RAG, which will pollute generation. You see, KNN-based retrieval does not differentiate among dimensions, and with extremely high dimensions (2048, 4096, 8192, 16384) the problem becomes more prone to false positives.
@@1littlecoder It is noisy by design, suffering from the curse of dimensionality. Say the query is "orange": RAG will retrieve "apple" (fruit dimension) and "ball" (shape dimension). Now "ball" will add noise. Noise ~ false positives.
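The "orange retrieves ball" false positive can be shown with a toy example. The 2-dimensional vectors below are hand-crafted assumptions, not real embeddings; the point is that cosine similarity scores all dimensions at once, so a document that matches only on an irrelevant dimension can still make the top-k.

```python
import math

# Toy vocabulary embedded along two interpretable axes: (fruit-ness, round-ness).
docs = {
    "apple": (1.0, 0.1),  # strong on fruit, weak on shape
    "ball":  (0.0, 1.0),  # matches ONLY on the irrelevant shape dimension
    "car":   (0.0, 0.0),  # matches on neither
}
query = (0.9, 0.9)  # "orange": both a fruit and round

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# KNN retrieval: top-2 documents by cosine similarity to the query.
top2 = sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)[:2]
print(top2)  # "ball" makes the cut purely on the shape dimension
```

With thousands of dimensions instead of two, many more documents share *some* axis with the query, which is the false-positive pressure the comment describes.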
That’s ok, you can use the LLM to ask about your results.