Session 7: RAG Evaluation with RAGAS and How to Improve Retrieval
- Published: Dec 3, 2023
- What you'll learn this session:
- How and why to evaluate RAG systems using best-practice open-source tooling
- RAG Assessment with RAGAS, including Context Precision, Context Recall, Answer Relevancy, and Faithfulness
- How to improve RAG system outputs using advanced retrieval
Speakers:
Dr. Greg Loughnane, Founder & CEO AI Makerspace.
/ greglough
Chris Alexiuk, CTO AI Makerspace.
/ csalexiuk
Apply for one of our AI Engineering Courses today!
www.aimakerspace.io/cohorts
Incredibly informative, not clickbait like other channels. A real 37 minutes' worth of knowledge. Thank you 🙌
great video, thanks a lot!
Great presentation guys, full of valuable knowledge 🎉
Great job guys. 👏
Thanks Mansoor!
thank you:)
Colab Notebook: colab.research.google.com/drive/1TZo2sgf1YFzI4_U-tGppg_ylHAR3MXF_?usp=sharing
Slides: canva.com/design/DAF13fk63Ps/oKNCJf_Oez21fkf0KRW9eA/edit?DAF13fk63Ps&
The slides link is not valid?
@someshfengade9623 it looks like the permissions were set to "anyone can edit" and someone went ahead and did that! We've restored the previous version and it should work now!
Can anyone tell me how ragas actually calculates these numbers? I get how to do it manually, but what do the algorithms or functions look like? For example, how does it measure faithfulness?
Hey Ravi great question! We go a bit deeper into this in our more recent event with the creators! ruclips.net/user/liveAnr1br0lLz8?si=UG6vRnSY9oVtAuAT
We'd recommend reading through the docs and digging into the source to go EVEN deeper! e.g., docs.ragas.io/en/stable/concepts/metrics/faithfulness.html
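To make the faithfulness question above a little more concrete, here is a minimal sketch of the scoring step only. It assumes the two-stage approach the RAGAS docs describe: the answer is split into individual claims, each claim is checked against the retrieved context, and the score is the fraction of claims that are supported. In RAGAS both the claim extraction and the per-claim check are done by an LLM; in this sketch the verdicts are supplied by hand, and `ClaimVerdict` and `faithfulness_score` are hypothetical names, not part of the library.

```python
from dataclasses import dataclass


@dataclass
class ClaimVerdict:
    """One claim extracted from the generated answer (hypothetical helper type)."""
    claim: str
    supported: bool  # assumption: in RAGAS an LLM decides this; here it is set by hand


def faithfulness_score(verdicts):
    """Fraction of answer claims that can be inferred from the retrieved context."""
    if not verdicts:
        return 0.0  # assumption: treat an answer with no claims as unfaithful
    supported = sum(1 for v in verdicts if v.supported)
    return supported / len(verdicts)


# Hand-labelled example: one claim is backed by the context, one is not.
verdicts = [
    ClaimVerdict("Einstein was born in Germany.", True),
    ClaimVerdict("Einstein was born on 20 March 1879.", False),
]
score = faithfulness_score(verdicts)
print(score)
```

The interesting part in practice is the prompts that produce the claims and verdicts, which is why the source is worth reading.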
Thanks for the great video. When did context relevance get broken out into context precision and context recall? The RAGAs paper of 26 September 2023 still refers only to relevance and I'd find it useful to have a source to explain why it was broken into two components. Intuitively it makes sense though.
Hey @andybrown8438 we're planning another event soon on RAG eval, and are in contact with the RAGAS creators - we'll ask them!
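While that question is open, a rough sketch of how the two context metrics differ may help. It follows the definitions in the RAGAS docs as we understand them: context precision rewards ranking relevant chunks near the top (a mean of precision@k over the relevant positions), while context recall is the share of ground-truth sentences that can be attributed to the retrieved context. The relevance and attribution flags are judged by an LLM in RAGAS; here they are plain inputs, and both function names are hypothetical.

```python
def context_precision(relevance):
    """Mean precision@k over the ranks that hold a relevant chunk.

    relevance: 0/1 flags for the retrieved chunks, in rank order.
    """
    total_relevant = sum(relevance)
    if total_relevant == 0:
        return 0.0
    hits = 0
    weighted = 0.0
    for k, rel in enumerate(relevance, start=1):
        hits += rel
        if rel:
            weighted += hits / k  # precision@k, counted only at relevant ranks
    return weighted / total_relevant


def context_recall(attributed):
    """Share of ground-truth sentences that the retrieved context supports.

    attributed: one boolean per sentence of the ground-truth answer.
    """
    if not attributed:
        return 0.0
    return sum(attributed) / len(attributed)


# The same two relevant chunks score higher when they are ranked first.
print(context_precision([1, 1, 0]), context_precision([0, 1, 1]))
print(context_recall([True, True, False]))
```

Precision only looks at what was retrieved and in what order; recall needs a ground-truth answer to compare against.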
This is a really great explanation. I have one question: let's say I want to improve performance by focusing on Faithfulness or Answer Relevancy. Which RAG optimization techniques should I use to increase Faithfulness, and which techniques can improve Relevancy, Precision, etc.?
The answer is, unfortunately, it depends! The whole system needs to work together (from data quality, to retrieval quality, to model performance, to prompting), and it needs to work for your use case. What is the best metric to use for your use case? That also depends. It all comes down to metrics-driven development: docs.ragas.io/en/stable/concepts/metrics_driven.html , but you need to decide which direction to drive!
There are some simple things to do after you set up RAG like reranking, but for any given use case the details really matter with regards to what steps you should take.
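As a rough illustration of the reranking idea mentioned above: retrieve a generous set of candidates, rescore each one against the query, and keep only the best few for the prompt. The sketch below uses a toy token-overlap scorer purely so it runs without any dependencies; that scorer is a stand-in, not a recommendation. In a real pipeline `score_fn` would typically be a cross-encoder or a hosted rerank model. `tokenize`, `overlap_score`, and `rerank` are hypothetical names.

```python
import re


def tokenize(text):
    """Lowercase and split into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(query, doc):
    """Toy relevance score: share of query tokens that appear in the document."""
    query_tokens = tokenize(query)
    if not query_tokens:
        return 0.0
    return len(query_tokens & tokenize(doc)) / len(query_tokens)


def rerank(query, docs, top_n=2, score_fn=overlap_score):
    """Reorder retrieved documents by score_fn and keep the top_n."""
    ranked = sorted(docs, key=lambda d: score_fn(query, d), reverse=True)
    return ranked[:top_n]


retrieved = [
    "RAGAS is an open-source framework.",
    "Faithfulness measures whether the answer is grounded in the retrieved context.",
    "Context recall compares the retrieved context with the ground truth answer.",
]
top = rerank("How is faithfulness measured against the retrieved context?", retrieved)
for doc in top:
    print(doc)
```

Whether a step like this helps is something to confirm with the metrics on your own data.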
Hi Chris, very informative video. Can you please tell me how I can generate a test set using Azure in RAGAS?
You'd want to use a LangChain adapter for Azure, and then use that to create the test set.
Good video, but one question: why did you choose to create the test set step by step yourself and not use the provided TestSetGenerator from Ragas? Was it not available back then?
That's right! They had just rolled it out when we had them on for this more recent event: ruclips.net/user/liveAnr1br0lLz8?si=_wIYqsL4vcVM5QDq
Hi Chris,
I have a use case for text-to-SQL with RAG using LangChain. Is there any example or guide for evaluating the SQL result? Are the metrics the same as for regular text RAG? Thanks in advance.
The E2E metrics would likely be the same, and you could create a dataset that lets you compare the intermediate results as well, the same as you saw here.
Chris I love your explanations and notebooks! But you shouldn't be singing while Greg is talking at 16:49
😆
Why did nobody laugh at Greg’s durag joke?
😆🤣
Dude, you're over 30 years old. Take the cap off if you want to be taken seriously.
Thanks for the tip @nirash! The h/t, that is. Cheers!
@AI-Makerspace You're welcome bro. Carry that bald head with pride
@nirash8018 ✊