Google Colab Notebook: colab.research.google.com/drive/1Yz4hs08pTLx3uX7A3mTY_x-ykulfxqYQ?usp=sharing
Event Slides: www.canva.com/design/DAGCBnEs6-0/xHNkBkdflgRNXL3oV0dMEg/view?DAGCBnEs6-0&
Great video and demo as always! You mentioned a RAGAS notebook that compares these techniques. Would be interested in seeing that too, if it’s convenient to add a link. (A demo/tutorial that deep dives into adding and utilizing metadata within the LlamaIndex environment would be super helpful.)
We'll be adding that link very soon! I'll make sure to ping here when we do!
Amazing content, exactly what I've been looking for to utilise in my pet project :D analysing some serious law documents...
Love to hear it @bparlan - keep building and shipping! When you're ready to share, we'd love to amplify!
On a scale from 1 to John Wick how much do you love your dog? 😆🤣🤣😆
Hey guys, question for you:
I want to (1) take a source PDF document, (2) comprehend its content (meaning), (3) search through a database of documents with similar meanings/context, and (4) find and display excerpts describing similar situations and decisions.
Example:
Take the legal case of the FTX / Sam Bankman-Fried crypto scam, understand the key details of this case, and then search through a database of other legal cases to find and present excerpts that describe similar situations and decisions made in those cases.
Any suggestions on an approach?
(I just started working on this on Sunday at a Law Hackathon that Stanford held; having problems with the approach)
This cannot be done in a single step, but we can use a multi-step agent approach. For example, with a plan-and-execute strategy:
1) Extract the key points from the source PDF
2) Call the legal cases database API (in my case, Indian Kanoon)
3) Analyze the results per document and aggregate them
4) Filter for the relevant ones and show the results
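A minimal sketch of those four steps, just to show the shape of the pipeline. Everything here is an assumption for illustration: `CASE_DB` is a hypothetical in-memory stand-in for a real legal-case API, and plain keyword overlap stands in for the LLM calls you would actually use to extract key points and judge similarity.

```python
import re
from dataclasses import dataclass

# Hypothetical stand-in for a legal-case database API (made-up placeholder text).
CASE_DB = {
    "case_a": "The exchange founder misused customer deposits. "
              "The court found fraud and ordered restitution.",
    "case_b": "A landlord disputed a lease renewal. "
              "The court upheld the tenant's option.",
    "case_c": "Executives misled investors about customer funds. "
              "The jury convicted them of wire fraud.",
}

STOPWORDS = {"the", "a", "an", "of", "and", "to", "for", "was", "about", "them"}


def tokenize(text):
    """Lowercase the text and return its set of non-stopword tokens."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def extract_key_points(source_text):
    """Step 1: keyword set standing in for an LLM-generated summary."""
    return tokenize(source_text)


def search_cases(key_points):
    """Step 2: return cases sharing at least one key term (stand-in for an API call)."""
    return {cid: text for cid, text in CASE_DB.items() if tokenize(text) & key_points}


@dataclass
class Excerpt:
    case_id: str
    sentence: str
    score: int


def analyze_case(case_id, text, key_points):
    """Step 3: score each sentence of one case by its overlap with the key points."""
    excerpts = []
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        score = len(tokenize(sentence) & key_points)
        if score:
            excerpts.append(Excerpt(case_id, sentence, score))
    return excerpts


def find_similar_excerpts(source_text, min_score=2):
    """Run steps 1-4: extract, search, analyze per document, then filter and rank."""
    key_points = extract_key_points(source_text)
    candidates = search_cases(key_points)
    excerpts = [
        e for cid, text in candidates.items() for e in analyze_case(cid, text, key_points)
    ]
    kept = [e for e in excerpts if e.score >= min_score]  # Step 4: filter
    return sorted(kept, key=lambda e: e.score, reverse=True)


SOURCE_TEXT = (
    "The founder of a crypto exchange was accused of fraud "
    "for misusing customer funds."
)

for excerpt in find_similar_excerpts(SOURCE_TEXT):
    print(excerpt.case_id, excerpt.score, excerpt.sentence)
```

In a real build you would probably swap `extract_key_points` for an LLM prompt over the parsed PDF and `search_cases` for the actual database API, while keeping the per-document analysis and the final filter as separate steps so each one can be checked on its own.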
Don’t use RAG; just use a SQL database and structure your data in it. Use a normal academic-referencing, annotation, source-attribution type of database for this. In reality, a RAG setup is doing top-k similarity searches around a pool of text, so if you ask it for a definitive list of a bunch of ideas, say all historical dates in your data, it’s not going to give you the definitive list. It’s going to give you a statistical inference of what it thinks the list is, limited to what it can serve up within its token window. Too bad if the list you want is bigger than what it can serve up: it won’t be the complete list.
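To make the "definitive list" point concrete, here is a rough sketch using Python's built-in `sqlite3`. The `citations` table, its columns, and the rows are all made-up placeholders, not a recommended schema. The point is only that a SQL query returns every matching row, rather than whatever fits in a retrieval window.

```python
import sqlite3

# In-memory database for illustration; table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE citations (
        id INTEGER PRIMARY KEY,
        source TEXT NOT NULL,      -- document the excerpt came from
        page INTEGER,              -- page number for attribution
        event_date TEXT,           -- ISO date mentioned in the excerpt, if any
        excerpt TEXT NOT NULL
    )
    """
)

# Placeholder rows, not real case data.
rows = [
    ("doc_a.pdf", 3, "2021-03-02", "Placeholder excerpt one."),
    ("doc_b.pdf", 10, "2019-11-20", "Placeholder excerpt two."),
    ("doc_a.pdf", 7, "2021-03-02", "Placeholder excerpt three."),
    ("doc_c.pdf", 1, None, "Placeholder excerpt with no date."),
]
conn.executemany(
    "INSERT INTO citations (source, page, event_date, excerpt) VALUES (?, ?, ?, ?)",
    rows,
)


def all_dates(conn):
    """Return every distinct date in the data, sorted; nothing is sampled or truncated."""
    cur = conn.execute(
        "SELECT DISTINCT event_date FROM citations "
        "WHERE event_date IS NOT NULL ORDER BY event_date"
    )
    return [row[0] for row in cur.fetchall()]


def citations_for_date(conn, event_date):
    """Return (source, page) pairs for a date, so each result can be attributed."""
    cur = conn.execute(
        "SELECT source, page FROM citations WHERE event_date = ? ORDER BY source, page",
        (event_date,),
    )
    return cur.fetchall()


print(all_dates(conn))
print(citations_for_date(conn, "2021-03-02"))
```

Nothing stops you from combining the two: an LLM can still be used to fill the table from the documents, with the exhaustive listing left to SQL.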