This is the 10th video in our RAG From Scratch series, focused on different types of query routing (logical and semantic). Notebook: github.com/lan... Slides: docs.google.co...
Hi Lance,
Thanks for sharing. One thing that would be helpful would be if you could discuss routing when the state needs to be remembered.
What I mean is that you start in a particular state and, based on routing logic, you end up in a new state. The next time the user interacts with the system, you pick up where you left off, and you would then have different routing logic. You are essentially building a state machine where each state has its own routing logic.
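The state machine the commenter describes could be sketched in plain Python like this. The state names, routing predicates, and chain names below are all illustrative assumptions, not anything from the video:

```python
# Hypothetical sketch: per-state routing, where each state carries its own
# routing table and each route transition may move the machine to a new state.

class StatefulRouter:
    """Remembers conversation state; each state has separate routing logic."""

    def __init__(self):
        self.state = "start"
        # Each state maps a predicate over the query to (chain, next_state).
        # Predicates and chain names here are placeholders.
        self.routes = {
            "start": [
                (lambda q: "docs" in q, ("doc_chain", "browsing_docs")),
                (lambda q: True, ("general_chain", "start")),  # default route
            ],
            "browsing_docs": [
                (lambda q: "back" in q, ("general_chain", "start")),
                (lambda q: True, ("doc_followup_chain", "browsing_docs")),
            ],
        }

    def route(self, query: str) -> str:
        for predicate, (chain, next_state) in self.routes[self.state]:
            if predicate(query):
                self.state = next_state  # remember where we ended up
                return chain

router = StatefulRouter()
print(router.route("show me the docs"))    # doc_chain (now in browsing_docs)
print(router.route("more detail please"))  # doc_followup_chain
```

In a LangChain setup, the predicates would be replaced by an LLM or embedding-based classifier, and the selected chain name would pick which runnable to invoke; the point is only that the routing table is keyed by remembered state.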
Hi Lance, I have to say that I am enjoying your series very much. Thank you for breaking these concepts down in such a way that makes it easy to digest.
Lance from Langchain 🙌🏾🙌🏾🙌🏾
Really good series! Thanks Lance!
This is a bit unrelated, but I really like the flow diagrams you always have in your videos, what tool do you use for them?
Excalidraw
thank you!!
Both approaches have some problems.
- LLM-based routing is the most accurate but adds latency, which may or may not be acceptable (depending on the complexity of the existing chain).
- Semantic routing is very fast (still slower than, say, TF-IDF or other simpler NLP methods, but ~5-10 times faster than an LLM call), but it misclassifies the route much more often than the LLM-based approach.
I ended up running classification and the resulting branches as a parallel runnable, then deciding which output to show in a merge step. But this only works if cost/quota is not a problem and there are only a few branches.
I have some ideas on how to address more complex chains, but I'm still experimenting :)
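The parallel approach described above can be sketched with the standard library alone. The classifier and branch functions here are stand-ins (assumptions) for the commenter's LLM classifier and LangChain branches; only the concurrency pattern is the point:

```python
# Hypothetical sketch: run the route classifier and every branch in parallel,
# then keep only the chosen branch's output in a merge step. This trades extra
# cost/quota (all branches always run) for lower end-to-end latency.
from concurrent.futures import ThreadPoolExecutor

def classify(query: str) -> str:
    # Stand-in for an LLM or embedding classifier (assumption).
    return "math" if any(ch.isdigit() for ch in query) else "chat"

def math_branch(query: str) -> str:
    return f"[math] {query}"   # placeholder for a real chain

def chat_branch(query: str) -> str:
    return f"[chat] {query}"   # placeholder for a real chain

def answer(query: str) -> str:
    branches = {"math": math_branch, "chat": chat_branch}
    with ThreadPoolExecutor() as pool:
        # Kick off classification and all branches at the same time.
        route_future = pool.submit(classify, query)
        branch_futures = {name: pool.submit(fn, query)
                          for name, fn in branches.items()}
        # Merge step: discard everything except the chosen branch's result.
        return branch_futures[route_future.result()].result()

print(answer("what is 2 + 2"))  # [math] what is 2 + 2
print(answer("hello there"))    # [chat] hello there
```

As the commenter notes, the cost of this pattern grows linearly with the number of branches, which is why it only suits chains with a handful of routes.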
Can you please share your experiments on GitHub? I am interested in this and could also contribute.
Thank you for the videos, Lance! I've always wondered how to manually enforce the choice of chain. Would you think of routing as a more manual, basic way of doing multi-agent (non-agentic) chains?
How can we deal with the situation where a user throws an unrelated question at this router chain?
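One common answer to this question is to give the router an explicit fallback branch, so that queries matching no known route get a default response instead of being forced into the nearest one. Here is a minimal keyword-overlap sketch; the route names, keyword sets, and threshold are all assumptions for illustration:

```python
# Hypothetical sketch: routing with a fallback branch for unrelated queries.
# Route names and keywords are placeholders.
KNOWN_ROUTES = {
    "python_docs": {"python", "pip", "venv"},
    "js_docs": {"javascript", "npm", "node"},
}

def route_with_fallback(query: str, threshold: int = 1) -> str:
    words = set(query.lower().split())
    scores = {name: len(words & kws) for name, kws in KNOWN_ROUTES.items()}
    best = max(scores, key=scores.get)
    # If nothing overlaps enough, fall back instead of forcing a route.
    return best if scores[best] >= threshold else "fallback"

print(route_with_fallback("how do I install pip packages"))  # python_docs
print(route_with_fallback("what's the weather today"))       # fallback
```

With an LLM-based router, the same idea means adding an "other/none of the above" option to the structured output schema; with semantic routing, it means thresholding the similarity score.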
So the trick is that the LLM has some "reasoning" ability: it can "think" and route in the right direction... right?
Is there a way to include memory? I mean, if we want to maintain a previous conversation or switch between different topics, is there a way to add memory to our LLM routing/semantic routing?
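One simple way to answer this question is to route on the query *plus* a rolling window of recent turns, so that a topic-free follow-up ("can you show an example?") stays on the current topic. This is a hypothetical stdlib sketch; the keyword check stands in for a real classifier, and the chain names are placeholders:

```python
# Hypothetical sketch: memory-aware routing via a rolling conversation window.
from collections import deque

class MemoryRouter:
    def __init__(self, window: int = 3):
        self.history = deque(maxlen=window)  # last few user turns

    def route(self, query: str) -> str:
        # Route on recent turns + the new query, not the query alone.
        context = " ".join(self.history) + " " + query
        self.history.append(query)
        # Stand-in for an LLM/embedding classifier (assumption).
        return "code_chain" if "python" in context.lower() else "general_chain"

r = MemoryRouter()
print(r.route("how do I write a Python decorator?"))  # code_chain
print(r.route("can you show an example?"))            # code_chain (via memory)
```

In a LangChain setup, the same idea would mean feeding the chat history (or a summary of it) into the routing prompt or the embedded routing text.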
Is there a way to use with_structured_output with Gemini Pro?
A non-related question to the video itself - but which software do you use to create the diagrams?
Is it possible to use more than one database simultaneously based on the user query? Kind of like multiple if statements in Python that check whether the query requires the vector DB, the analytics DB, only one of them, or both to answer.
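The "multiple ifs" idea in this question can be sketched by letting the router return a *set* of datastores rather than a single route. The keyword checks and database names below are illustrative assumptions; a real version would use an LLM with a structured output schema that allows multiple selections:

```python
# Hypothetical sketch: a query can select the vector DB, the analytics DB,
# or both. Keywords here are placeholders for a real classifier.
def select_databases(query: str) -> set:
    q = query.lower()
    selected = set()
    if any(w in q for w in ("explain", "docs", "how")):
        selected.add("vector_db")      # unstructured / retrieval questions
    if any(w in q for w in ("count", "average", "trend")):
        selected.add("analytics_db")   # aggregate / numeric questions
    return selected or {"vector_db"}   # assumed default when nothing matches

print(sorted(select_databases("explain the average latency trend")))
# ['analytics_db', 'vector_db']
```

Downstream, you would query every selected store (possibly in parallel) and merge the results before generation.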
Why do we need to route queries?
In LangChain 0.1.16, I see the method with_structured_output marked as NotImplemented in the code, so how are you making it work?
Can we use AzureChatOpenAI instead of ChatOpenAI for function calling with the LLM to create structured_llm?
yes, you can.
@florinfilip6355 it worked!