What if the text you want to query is so large that you need to chunk it? Will you then have multiple chunks in the database with the same metadata?
E.g. if you ask your chatbot a question, it constructs the query which filters only the relevant chunks, and you end up with, say, 10 chunks of the same document which are then used as context?
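For illustration, here is a rough sketch of the situation you describe (toy in-memory store, made-up metadata fields, no real vector DB): every chunk of a split document inherits the parent's metadata, so a metadata filter on its own matches all of those chunks.

```python
# Toy sketch only: chunking one document produces several chunks that all
# carry the same metadata, so a metadata filter alone returns every chunk
# of that document.

def chunk(text: str, size: int = 50) -> list:
    return [text[i:i + size] for i in range(0, len(text), size)]

doc_text = "lorem ipsum " * 30          # pretend this is a long report
doc_meta = {"source": "annual_report_2023.pdf", "year": 2023}  # hypothetical fields

# every chunk inherits the parent document's metadata
store = [{"text": c, "metadata": doc_meta} for c in chunk(doc_text)]

# a self-constructed filter like {"year": 2023} matches all of those chunks
hits = [item for item in store if item["metadata"]["year"] == 2023]
print(len(hits), "chunks from the same document match the filter")
```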
Why is it called self-querying and not LLM-based query generation?
in contrast to "query expansion", it takes only the metadata from the original ("self") query, and adds nothing. And this can be implemented not only with LLM, in general :D
This will add an additional LLM call to every query, which is not so good.
Thinking about production: using a cheaper LLM or a local SLM is not so bad.
It's actually good if you design it well. Sometimes you need the additional information, as LLMs can hallucinate regardless of prompt quality.
@@devSero you can use few-shot prompting for the query-construction step (which, I think, LangChain already does under the hood)
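Something like this, just to illustrate the idea (not LangChain's actual internal prompt; the example questions and filter fields are made up):

```python
# A couple of worked examples teach the model to emit the filter in the
# expected shape before it sees the real user question.
FEW_SHOT_EXAMPLES = """\
Turn the user question into a semantic query plus a JSON metadata filter.

Question: papers about attention from 2017
Query: papers about attention
Filter: {"year": 2017}

Question: sales decks for the EMEA region
Query: sales decks
Filter: {"region": "EMEA"}
"""

def build_prompt(user_question: str) -> str:
    # append the real question after the worked examples
    return f"{FEW_SHOT_EXAMPLES}\nQuestion: {user_question}\nQuery:"

print(build_prompt("bug reports from 2022"))
```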
@@ivanhelsing214 I'm aware but I don't use LangChain for production purposes.