RAGAS - Evaluate your LangChain RAG Pipelines
- Published: 30 Jul 2024
- Building good RAG systems is hard. RAGAS lets you change parts of your system and run automated evaluations to see whether your RAG performance actually improved.
Code: github.com/Coding-Crashkurse/...
Timestamps:
0:00 Introduction
0:30 RAGAS
9:46 RAGAS with LangFuse
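The workflow the description sketches — change a part of the pipeline, then re-score it — can be outlined roughly like this. This is a minimal sketch, not the video's exact code: the dataset fields and metric names are assumed from recent RAGAS versions and may differ in yours, and the commented-out `evaluate` call needs an LLM API key to run.

```python
# A tiny evaluation dataset in the column layout RAGAS expects
# (field names assumed: question, answer, contexts, ground_truth).
eval_samples = {
    "question": ["What does RAGAS measure?"],
    "answer": ["RAGAS scores RAG pipelines on faithfulness and relevancy."],
    "contexts": [[
        "RAGAS provides metrics such as faithfulness, answer relevancy, "
        "context precision and context recall."
    ]],
    "ground_truth": ["RAGAS measures RAG quality with LLM-based metrics."],
}

# The actual scoring step (requires ragas, datasets and an LLM key):
# from datasets import Dataset
# from ragas import evaluate
# from ragas.metrics import faithfulness, answer_relevancy
# result = evaluate(Dataset.from_dict(eval_samples),
#                   metrics=[faithfulness, answer_relevancy])
# df = result.to_pandas()  # one row per question, one column per metric
```

Re-running this after each pipeline change (different splitter, retriever, prompt, etc.) gives you comparable per-metric scores instead of gut feeling.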
This is awesome! Great and clear video :)
So simple, helpful and clear! Very interesting.
Thanks for the video
Excellent timing ;-) Thanks for video
Another banger! :)
Thank you so much for the video!!
Bro is on fire this month!
You guys give me so many requests on topics 😀
@@codingcrashcourses8533 I was, I am and I will support you till the end. Your videos helped me soooo much.
Great video , thank you
thank you for your comment :)
I switched to using RecursiveCharacterTextSplitter, but my context relevance is still low. Do you know why?
Nice one! Also a big fan of RAGAS; however, there are still many bugs in RAGAS, especially when trying to evaluate with local LLMs
yes, it's still far from perfect, but it's good that frameworks like these are being developed
Thank you for the video!
Yeah, it would be really interesting to know how to run RAGAS in a CI/CD pipeline. Can you record a video on this one please? It would be really helpful
Maybe in a few weeks
Nice, master! Will you cover Code RAG at some point, possibly with knowledge graphs?
Currently no plans to work with knowledge graphs, since I don't have experience with them. But maybe in the future :)
I was waiting for this, thank you so much. Is it possible to add how to evaluate accuracy using F1 scoring or other methods?
Not out of the box, but F1 scores can easily be calculated with pandas (to_pandas), like this: F1 = 2*precision*recall/(precision+recall)
@@codingcrashcourses8533 thanks
you could also calculate the RAGAS score which is the mean across all metrics
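The two replies above can be combined into a tiny sketch: deriving an F1 score from context precision/recall and a single "RAGAS score" as the plain mean across metrics. The numbers here are made up for illustration; in practice they would come from `result.to_pandas()` after running the evaluation.

```python
# Hypothetical per-metric scores, as they might appear in a RAGAS result row.
metrics = {
    "context_precision": 0.84,
    "context_recall": 0.72,
    "faithfulness": 0.90,
    "answer_relevancy": 0.88,
}

precision = metrics["context_precision"]
recall = metrics["context_recall"]

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

# "RAGAS score" as suggested in the comment: the mean across all metrics.
ragas_score = sum(metrics.values()) / len(metrics)

print(round(f1, 3))           # 0.775
print(round(ragas_score, 3))  # 0.835
```

With `to_pandas()` you would compute the same thing column-wise over the whole evaluation DataFrame rather than on a single dict.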
Is there an AI pipeline to auto-optimize RAG quality? Seems like the obvious next step...
Great video 🙏👍
You would probably have to build something like that on your own, since there are so many ways a pipeline could look. You could also work on your prompt, and so on.
@@codingcrashcourses8533 I'd always want to manually make the changes I think are best, but I'd still like to see a full matrix of hyperparameters to remove a lot of the guesswork. Chunk size, for example. Moreover, I'd like to benchmark everything and add scoring functions. For example a score for fact checking - see Lucidate's last video.
And also IndyDevDan's last video, a battle royale of models; I suggested combining it with something like what you do with RAG params, plus what I suggest for a full pipeline benchmark with AI-suggested optimization