Building an Adaptive RAG System with LangGraph: Embedding Dynamic Query Routing
This article walks through the concepts behind Adaptive RAG (Retrieval-Augmented Generation) and a LangGraph-based implementation that brings us closer to self-reflective, self-improving systems capable of adaptively choosing the best source of information for a given query.
What is Adaptive RAG?
Adaptive RAG extends traditional RAG (Retrieval-Augmented Generation), which enriches a language model's output by retrieving pertinent information from a knowledge base. Adaptive RAG adds the following enhancements:
- Query routing to choose between local vector store and web search
- Self-evaluation of retrieved documents and generated responses
- Multiple generation attempts when the first output is unsatisfactory
- Ability to seek additional information when needed
Components and Workflow
Our Adaptive RAG implementation consists of the following components:
- Query Router: Determines whether to answer the query from the local vector store or via a web search.
- Retriever: Fetches relevant documents from the selected source, be it the vector store or the web.
- Document Grader: Assesses how relevant the retrieved documents are to the question.
- Generator: Produces a response from the retrieved information.
- Generation Grader: Checks whether the generated output is grounded in the documents (not a hallucination) and actually answers the question.
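The Generation Grader's two checks can be pictured as a small decision function. The sketch below is illustrative: the real graders in this project are LLM chains emitting yes/no scores, and the string values of the three outcome labels are assumptions; only the three-way branching mirrors the edge mapping used later in the graph.

```python
# Illustrative sketch of the Generation Grader's decision. The real
# graders are LLM chains producing yes/no scores; the two booleans here
# stand in for those scores. Label values are assumptions.
NOT_SUPPORTED = "not supported"  # hallucination: regenerate
NOT_USEFUL = "not useful"        # grounded but off-target: try web search
USEFUL = "useful"                # grounded and on-topic: finish

def grade_generation(grounded: bool, answers_question: bool) -> str:
    if not grounded:
        return NOT_SUPPORTED
    if not answers_question:
        return NOT_USEFUL
    return USEFUL
```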
The workflow is defined using a StateGraph from LangGraph, allowing for complex, conditional paths through the system.
Code Breakdown
Let’s examine the key parts of our Adaptive RAG implementation:
Graph Definition (graph.py)
workflow = StateGraph(GraphState)
workflow.add_node(RETRIEVER, retriever_node)
workflow.add_node(GRADE_DOCS, grading_node)
workflow.add_node(GENERATE, generate_node)
workflow.add_node(WEBSEARCH, websearch_node)

# Route the incoming question to the vector store or to web search
workflow.set_conditional_entry_point(
    query_router_conditional_edge, {VECTORSTORE: RETRIEVER, WEBSEARCH: WEBSEARCH}
)
workflow.add_edge(RETRIEVER, GRADE_DOCS)
workflow.add_conditional_edges(
    GRADE_DOCS,
    grade_conditional_node,
    {
        WEBSEARCH: WEBSEARCH,
        GENERATE: GENERATE,
    },
)
workflow.add_edge(WEBSEARCH, GENERATE)
# GENERATE's outgoing edge is conditional: regenerate, fall back to web
# search, or finish, depending on how the generation is graded
workflow.add_conditional_edges(
    GENERATE,
    grade_generation_grounded_in_documents_and_question,
    {NOT_SUPPORTED: GENERATE, NOT_USEFUL: WEBSEARCH, USEFUL: END},
)

graph = workflow.compile()
This graph structure allows for adaptive routing and multiple paths through the system, a key aspect of Adaptive RAG.
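The document-grading branch can be summarized by the shape of its conditional-edge function. This is a hypothetical sketch of `grade_conditional_node`: the grading node is assumed to set a `websearch` flag in the state when any retrieved document is judged irrelevant, and the edge function simply reads that flag to pick the next node.

```python
# Hypothetical sketch of grade_conditional_node. The node-name values
# and the websearch flag convention are assumptions based on the graph
# definition above.
from typing import List, TypedDict

WEBSEARCH = "websearch"
GENERATE = "generate"

class GraphState(TypedDict, total=False):
    question: str
    generation: str
    websearch: bool
    documents: List[str]

def grade_conditional_node(state: GraphState) -> str:
    # The grading node is assumed to have set state["websearch"] = True
    # when at least one retrieved document was judged irrelevant.
    if state.get("websearch"):
        return WEBSEARCH
    return GENERATE
```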
Query Routing (router.py)
def query_router_conditional_edge(state: GraphState) -> str:
    question = state["question"]
    query_router_result = query_router_prompt_chain.invoke(input={"question": question})
    destination = query_router_result.destination
    print(f"Query router destination is {destination}")
    return destination
This function demonstrates the adaptive nature of the system by routing queries to either the local vector store or web search based on the content of the question.
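To make the router's contract concrete, here is a stand-in for `query_router_prompt_chain`: anything exposing an `invoke()` that returns an object carrying a `destination` field will satisfy `query_router_conditional_edge`. The keyword heuristic below is purely illustrative (including the topic list); the real chain is an LLM prompt with structured output.

```python
# Illustrative stand-in for the LLM-based query router. The topic list
# and class name are hypothetical; only the invoke()/destination
# interface matches what the conditional edge expects.
from dataclasses import dataclass

@dataclass
class RouteDecision:
    destination: str  # "VECTORSTORE" or "WEBSEARCH"

class KeywordRouterStandIn:
    # Topics assumed to live in the local vector store (illustrative only).
    LOCAL_TOPICS = ("java", "interview", "polymorphism")

    def invoke(self, input: dict) -> RouteDecision:
        question = input["question"].lower()
        if any(topic in question for topic in self.LOCAL_TOPICS):
            return RouteDecision(destination="VECTORSTORE")
        return RouteDecision(destination="WEBSEARCH")
```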
State Management (state.py)
class GraphState(TypedDict):
    question: str
    generation: str
    websearch: bool
    documents: List[str]


class QueryRouter(BaseModel):
    destination: Literal["VECTORSTORE", "WEBSEARCH"] = Field(
        description="Given a question choose to route it to vectorstore or websearch"
    )
These classes define the state of the graph and the structure for the query router’s decision, enabling the adaptive behavior of the system.
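A useful mental model for the state: each node returns a partial dict, and the graph merges it into the running `GraphState` (by default in LangGraph, a later value overwrites an earlier one for the same key). The loop below mimics that merge without LangGraph; the fake node bodies are placeholders for the real tool and LLM calls.

```python
# Minimal simulation of how node return values update the graph state.
# The node bodies are fakes standing in for real search and LLM calls.
from typing import Callable, List, TypedDict

class GraphState(TypedDict, total=False):
    question: str
    generation: str
    websearch: bool
    documents: List[str]

def websearch_node(state: GraphState) -> GraphState:
    # A real node would call a search tool; this fakes the result.
    return {"documents": ["web result for: " + state["question"]]}

def generate_node(state: GraphState) -> GraphState:
    # A real node would call an LLM over the documents.
    return {"generation": f"answer based on {len(state['documents'])} document(s)"}

def run(state: GraphState, nodes: List[Callable[[GraphState], GraphState]]) -> GraphState:
    for node in nodes:
        state = {**state, **node(state)}  # merge the partial update
    return state

final = run({"question": "how to calculate XIRR?"}, [websearch_node, generate_node])
```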
Key Benefits of Adaptive RAG
- Improved Accuracy: By evaluating both the retrieved documents and the generated output, Adaptive RAG minimizes hallucinations and responses that are irrelevant to the question.
- Flexibility: The system adapts to different kinds of queries, relying on local knowledge when applicable and falling back to web search when necessary.
- Iterative Improvement: The system can refine its response over multiple generation attempts.
- Transparency: Self-evaluation brings more transparency to the system's decision-making process.
Test Scenarios
Let's run the following test scenarios to exercise our Adaptive RAG system's capabilities.
- Local Knowledge Query: Input: “What are the most common Java interview questions?” Expected Behavior: The query should be routed to VECTORSTORE, because this topic is part of the system’s local knowledge.
- Web Search Query: Input: “What are recent developments in quantum computing?” Expected Behavior: The query should be routed to WEBSEARCH, as the system’s local knowledge is unlikely to cover it.
- Ambiguous Query: Input: “How do I make risotto?” Expected Behavior: Routing could go either way: if the vector store contains recipes, the query should go there; otherwise it should route to WEBSEARCH.
- Iterative Improvement: Input: “Define polymorphism in object-oriented programming.” Expected Behavior: The system might fetch data from the vector store first. If the generation is graded as NOT_SUPPORTED or NOT_USEFUL, it should regenerate or fall back to a web search for more information.
To run these tests, you can use the provided main block in graph.py:
if __name__ == "__main__":
    res = graph.invoke(input={"question": "how to calculate XIRR in mutual funds?"})
    print(res)
Invoked::::query_router_conditional_edge
Query router destination is WEBSEARCH
Invoked::::websearch_node
Invoked::::generate_node
Invoked::::grade_generation_grounded_in_documents_and_question
hallucination_result.bool_score is yes
generation_answer_grader_result.bool_score is yes
{'question': 'how to calculate XIRR in mutual funds?', 'generation': 'To calculate XIRR in mutual funds, you need to have a record of all cash flows associated with the investment and enter SIP transactions and corresponding dates in an excel sheet. XIRR is a modification of IRR that factors in irregular periods and is used to calculate the annualized rate of return for investments with irregular cash flows. It is a vital performance measure for mutual funds as it considers both investment inflows and outflows over different time periods.', 'documents': [Document(metadata={'source': 'https://www.equiruswealth.com/blog/understanding-xirr-in-mutual-funds-calculation-and-significance'}, page_content='To calculate XIRR, one must have a record of all cash flows associated with the investment. ... Significance of XIRR in Mutual Fund Evaluation: XIRR serves as a vital performance measure for mutual funds due to its ability to consider both investment inflows and outflows over different time periods. Here are some key reasons why XIRR is ...'), Document(metadata={'source': 'https://cleartax.in/s/xirr-mutual-funds'}, page_content='The XIRR formula is the modification of IRR (Internal Rate of Return) and factors irregular periods. You will have to enter SIP transactions, and the corresponding dates from mutual fund statements in the excel sheet. You then apply the XIRR formula to calculate SIP returns. For example, you invest Rs 3,000 per month in a mutual fund scheme ...'), Document(metadata={'source': 'https://www.forbes.com/advisor/in/investing/xirr-in-mutual-fund/'}, page_content='What is XIRR in Mutual Fund. XIRR, or extended internal rate of return, is a financial metric used to calculate the annualized rate of return for investments with irregular cash flows. Unlike ...')]}
Replace the question with each of the test scenarios to observe the system’s behavior.
Conclusion
Adaptive RAG is a significant step in the search for more robust, accurate, and versatile AI systems. By integrating query routing, self-reflection, and self-improvement, it brings us closer to AI that can choose the most relevant information sources, critically evaluate its own outputs, and adapt its approach when necessary.
This LangGraph implementation shows how a flexible graph design makes sophisticated AI workflows tractable. Developing and maturing this work further opens exciting possibilities for building AI systems that are not only more accurate but also more adaptive and trustworthy.
Adaptive RAG is a promising step on the journey toward more intelligent and flexible AI. Instead of asking only how much information an AI can process, we can now ask how it questions its own process across the many different types of queries and information needs it faces.
Let’s Connect (Or Hire Me!)
The complete code for this Adaptive RAG system is available in the GitHub repository. If you’re utterly enthralled with this blog or thinking “This person should definitely be on my team,” let’s connect! I’d love to chat about Agents, LLM gossip, or even exciting job opportunities.
Find me on LinkedIn and let’s dive deeper into the world of AI and LLMs together.