Hey Jason, great takes! This whole thing almost sounds like trying to develop a timezone system from scratch haha. I'm guessing RAG will be "solved" very soon (SOTA FOSS), but we're still at the bleeding edge. I think most of the meta "non-embeddings" questions you're asking could be answered by a small script over the DB: if the LLM knows the structure of your data and the right keys to search, it could write the extraction code and provide an answer in 2 steps. Also, for the most common questions, the answers themselves could be put into RAG and live-updated.
What do you think about Live-Updating RAG? Let's say some info changed in a client's watched doc: we'd want to immediately prepare it for RAG, remove the old outdated chunks, rechunk the new data, and update the DB/VDB, ideally very quickly. I think this is what really separates fun-to-do projects from real-world production systems. Would love it if you could cover this 💜 Thanks & all the best! PS: your takes on benchmarking metrics are spot on. We need to develop better tooling for ourselves.
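For anyone wondering what the "remove old chunks, rechunk, update the DB" loop above might look like, here is a minimal sketch, not a production design. Everything in it is an assumption for illustration: `LiveIndex`, `chunk_text`, `chunk_id`, and `upsert_document` are hypothetical names, a plain dict stands in for the vector DB, and no real embedding call is made. The idea is to give each chunk a content-derived ID so that only chunks that actually changed get deleted or re-inserted.

```python
import hashlib
from dataclasses import dataclass, field


def chunk_text(text: str, size: int = 40) -> list[str]:
    """Naive fixed-width chunking (an assumption; real systems usually split on structure)."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def chunk_id(doc_id: str, chunk: str) -> str:
    """Content-addressed ID, so an unchanged chunk keeps the same key across updates."""
    return hashlib.sha256(f"{doc_id}\x00{chunk}".encode()).hexdigest()


@dataclass
class LiveIndex:
    # chunk_id -> chunk text (hypothetical stand-in for a vector DB row)
    chunks: dict[str, str] = field(default_factory=dict)
    # doc_id -> chunk_ids currently indexed for that document
    by_doc: dict[str, set[str]] = field(default_factory=dict)

    def upsert_document(self, doc_id: str, text: str, size: int = 40) -> tuple[int, int]:
        """Re-index one document; returns (chunks_added, chunks_removed)."""
        new = {chunk_id(doc_id, c): c for c in chunk_text(text, size)}
        new_ids = set(new)
        old_ids = self.by_doc.get(doc_id, set())
        stale = old_ids - new_ids   # outdated chunks to delete
        fresh = new_ids - old_ids   # only these would need embedding
        for cid in stale:
            del self.chunks[cid]
        for cid in fresh:
            self.chunks[cid] = new[cid]  # embed-and-insert would happen here
        self.by_doc[doc_id] = new_ids
        return len(fresh), len(stale)
```

Calling `upsert_document` again with the same text does no work, and an edit confined to one chunk touches only that chunk. One caveat with this sketch: fixed-width chunking shifts every later boundary when text is inserted, so a real system would likely chunk on headings or paragraphs to keep the diff small.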
I just found Instructor. Cannot thank you enough! This is the most elegant way to program LLM interactions I've seen!
Jason thank you for dropping all this wisdom on us! This short video format on separate topics is great.
Is there any reference implementation for routing/query classification that handles multi-turn, context-aware conversations?
What’s RAG? 🙏
slay