Tutorial: Building an Agentic RAG with Fallback to Websearch

Last Updated: June 19, 2025

Level: Intermediate
Time to complete: 10 minutes
Components Used: ConditionalRouter, SerperDevWebSearch, ChatPromptBuilder, OpenAIChatGenerator
Prerequisites: You must have an OpenAI API Key and a Serper API Key for this tutorial.
Goal: After completing this tutorial, you’ll have learned how to create an agentic RAG pipeline with conditional routing that can fall back to web search if the answer is not present in your dataset.

Overview

When developing applications using retrieval-augmented generation (RAG), the retrieval step plays a critical role. It serves as the primary information source for large language models (LLMs) to generate responses. However, if your database lacks the necessary information, the retrieval step’s effectiveness is limited. In such scenarios, it may be practical to incorporate agentic behavior and use the web as a fallback data source for your RAG application. By implementing a conditional routing mechanism in your system, you control the data flow and can design a system that uses the web as its data source under conditions you define.

In this tutorial, you will learn how to create an agentic RAG pipeline with conditional routing that directs the query to a web-based RAG route if the answer is not found in the initially given documents.
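
Before building the real pipeline, it can help to see the control flow in isolation. The following is a minimal plain-Python sketch of the fallback idea, not Haystack code: answer_with_fallback, fake_local_rag, and fake_web_rag are hypothetical names introduced only for illustration, and the two "RAG branches" are stubs that make no LLM or network calls.

```python
from typing import Callable

NO_ANSWER = "no_answer"  # sentinel keyword the tutorial's prompt asks the LLM to emit


def answer_with_fallback(
    query: str,
    answer_from_documents: Callable[[str], str],
    answer_from_web: Callable[[str], str],
) -> str:
    """Try the local documents first; use the web route if the sentinel appears."""
    reply = answer_from_documents(query)
    if NO_ANSWER in reply:
        return answer_from_web(query)
    return reply


# Hypothetical stand-ins for the two RAG branches (no LLM, no network).
def fake_local_rag(query: str) -> str:
    known = {"Where is Munich?": "Munich is in Bavaria, southern Germany."}
    return known.get(query, NO_ANSWER)


def fake_web_rag(query: str) -> str:
    return f"[from web search] placeholder answer for: {query}"


print(answer_with_fallback("Where is Munich?", fake_local_rag, fake_web_rag))
print(answer_with_fallback("How many people live in Munich?", fake_local_rag, fake_web_rag))
```

In the rest of this tutorial, the if statement is replaced by a ConditionalRouter and the two stubs by real retrieval and generation components.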

Development Environment

Prepare the Colab Environment

Install Haystack

Install Haystack with pip:

%%bash

pip install haystack-ai

Enter API Keys

Enter API keys required for this tutorial.

from getpass import getpass
import os

# Prompt for each key only if it is not already set in the environment
if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass("Enter OpenAI API key: ")
if "SERPERDEV_API_KEY" not in os.environ:
    os.environ["SERPERDEV_API_KEY"] = getpass("Enter Serper API key: ")

Populate a Document Store

Create a Document about Munich and write it to an InMemoryDocumentStore. The pipeline will search this document first for the answer to your question.

from haystack.dataclasses import Document
from haystack.document_stores.in_memory import InMemoryDocumentStore

document_store = InMemoryDocumentStore()

documents = [
    Document(
        content="""Munich, the vibrant capital of Bavaria in southern Germany, exudes a perfect blend of rich cultural
                                heritage and modern urban sophistication. Nestled along the banks of the Isar River, Munich is renowned
                                for its splendid architecture, including the iconic Neues Rathaus (New Town Hall) at Marienplatz and
                                the grandeur of Nymphenburg Palace. The city is a haven for art enthusiasts, with world-class museums like the
                                Alte Pinakothek housing masterpieces by renowned artists. Munich is also famous for its lively beer gardens, where
                                locals and tourists gather to enjoy the city's famed beers and traditional Bavarian cuisine. The city's annual
                                Oktoberfest celebration, the world's largest beer festival, attracts millions of visitors from around the globe.
                                Beyond its cultural and culinary delights, Munich offers picturesque parks like the English Garden, providing a
                                serene escape within the heart of the bustling metropolis. Visitors are charmed by Munich's warm hospitality,
                                making it a must-visit destination for travelers seeking a taste of both old-world charm and contemporary allure."""
    )
]

document_store.write_documents(documents)
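
Note that the triple-quoted string above carries its source indentation and line breaks into the document content. If you would rather store clean text, one option is to collapse the whitespace before creating the Document. This is a minimal sketch using only the standard library; normalize_whitespace is a hypothetical helper, and whether this step matters for your retriever or LLM is an assumption you should check for your own data.

```python
def normalize_whitespace(text: str) -> str:
    """Collapse every run of spaces, tabs, and newlines into a single space."""
    return " ".join(text.split())


# Shortened excerpt of the Munich text, with the same kind of indentation.
raw_content = """Munich, the vibrant capital of Bavaria in southern Germany, exudes a perfect blend of rich cultural
                                heritage and modern urban sophistication."""

clean_content = normalize_whitespace(raw_content)
print(clean_content)
```

You could then pass clean_content as the content argument when building the Document.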

Creating the Initial RAG Pipeline Components

First, you need to initialize the components for a RAG pipeline. Define a prompt instructing the LLM to respond with the text "no_answer" if the provided documents do not offer enough context to answer the query. Next, initialize a ChatPromptBuilder with that prompt. ChatPromptBuilder accepts prompts as a list of ChatMessage objects. It’s crucial that the LLM replies with "no_answer", because you will use this keyword to indicate that the query should be directed to the fallback web search route.

As the LLM, you will use an OpenAIChatGenerator with the gpt-4o-mini model.

The provided prompt works effectively with the gpt-4o-mini model. If you prefer to use a different ChatGenerator, you may need to update the prompt to provide clear instructions to your model.

from haystack.components.builders import ChatPromptBuilder
from haystack.dataclasses import ChatMessage
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever

retriever = InMemoryBM25Retriever(document_store)

prompt_template = [
    ChatMessage.from_user(
        """
Answer the following query given the documents.
If the answer is not contained within the documents reply with 'no_answer'

Documents:
{% for document in documents %}
  {{document.content}}
{% endfor %}
Query: {{query}}
"""
    )
]

prompt_builder = ChatPromptBuilder(template=prompt_template, required_variables="*")
llm = OpenAIChatGenerator(model="gpt-4o-mini")
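
To get a feel for what the LLM receives, here is a plain-Python stand-in for the template above. It is only a sketch: ChatPromptBuilder renders the template with Jinja, so the exact whitespace of the real prompt will differ, and render_prompt is a hypothetical helper that takes plain strings instead of Document objects.

```python
from typing import List


def render_prompt(documents: List[str], query: str) -> str:
    """Approximate the template: instructions, one line per document, then the query."""
    lines = [
        "Answer the following query given the documents.",
        "If the answer is not contained within the documents reply with 'no_answer'",
        "",
        "Documents:",
    ]
    lines.extend(f"  {content}" for content in documents)
    lines.append(f"Query: {query}")
    return "\n".join(lines)


prompt = render_prompt(["Munich is the capital of Bavaria."], "Where is Munich?")
print(prompt)
```

Printing the prompt like this is a quick way to confirm that the "no_answer" instruction and the query end up where you expect.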

Initializing the Web-RAG Components

Initialize the necessary components for a web-based RAG application. Along with a ChatPromptBuilder and an OpenAIChatGenerator, you will need a SerperDevWebSearch to retrieve relevant documents for the query from the web.

If desired, you can use a different ChatGenerator for the web-based RAG branch of the pipeline.

from haystack.components.websearch.serper_dev import SerperDevWebSearch

prompt_for_websearch = [
    ChatMessage.from_user(
        """
Answer the following query given the documents retrieved from the web.
Your answer should indicate that your answer was generated from websearch.

Documents:
{% for document in documents %}
  {{document.content}}
{% endfor %}

Query: {{query}}
"""
    )
]

websearch = SerperDevWebSearch()
prompt_builder_for_websearch = ChatPromptBuilder(template=prompt_for_websearch, required_variables="*")
llm_for_websearch = OpenAIChatGenerator(model="gpt-4o-mini")

Creating the ConditionalRouter

ConditionalRouter is the key component that enables agentic behavior by routing data based on specific conditions. For each route, you need to define a condition, an output, an output_name, and an output_type. Each route that the ConditionalRouter creates acts as an output of this component and can be connected to other components in the same pipeline.

In this case, you need to define two routes:

If the LLM replies with the "no_answer" keyword, the pipeline should perform a web search. To do that, put the original query in the output value so that it is passed to the next component (here, SerperDevWebSearch), and set the output name to go_to_websearch.
Otherwise, the given documents are enough for an answer and pipeline execution ends here. Return the LLM reply in the output named answer.

from haystack.components.routers import ConditionalRouter

routes = [
    {
        "condition": "{{'no_answer' in replies[0].text}}",
        "output": "{{query}}",
        "output_name": "go_to_websearch",
        "output_type": str,
    },
    {
        "condition": "{{'no_answer' not in replies[0].text}}",
        "output": "{{replies[0].text}}",
        "output_name": "answer",
        "output_type": str,
    },
]

router = ConditionalRouter(routes)
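
The routing logic itself is small. The sketch below mimics it in plain Python, with callables in place of the Jinja strings. It assumes that routes are checked in order and that the first route whose condition holds produces the output; treat that ordering as an assumption about ConditionalRouter rather than a statement of its implementation. Route, evaluate_routes, and sketch_routes are hypothetical names.

```python
from typing import Any, Callable, Dict, List, Tuple

# (condition, output builder, output name); callables stand in for Jinja strings.
Route = Tuple[Callable[[Dict[str, Any]], bool], Callable[[Dict[str, Any]], Any], str]


def evaluate_routes(routes: List[Route], inputs: Dict[str, Any]) -> Dict[str, Any]:
    """Return {output_name: output} for the first route whose condition is true."""
    for condition, build_output, output_name in routes:
        if condition(inputs):
            return {output_name: build_output(inputs)}
    raise ValueError("No route matched the given inputs")


sketch_routes: List[Route] = [
    (lambda i: "no_answer" in i["reply_text"], lambda i: i["query"], "go_to_websearch"),
    (lambda i: "no_answer" not in i["reply_text"], lambda i: i["reply_text"], "answer"),
]

print(evaluate_routes(sketch_routes, {"reply_text": "no_answer", "query": "How many people live in Munich?"}))
print(evaluate_routes(sketch_routes, {"reply_text": "Munich is in Bavaria.", "query": "Where is Munich?"}))
```

Because the condition is a substring check, any reply that merely contains the text no_answer also takes the web search route. That is usually acceptable here, but keep it in mind if you adapt the keyword.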

Building the Agentic RAG Pipeline

Add all components to your pipeline and connect them. The go_to_websearch output of the router should be connected to websearch, to retrieve documents from the web, and to prompt_builder_for_websearch, to use the query in the prompt.

from haystack import Pipeline

agentic_rag_pipe = Pipeline()
agentic_rag_pipe.add_component("retriever", retriever)
agentic_rag_pipe.add_component("prompt_builder", prompt_builder)
agentic_rag_pipe.add_component("llm", llm)
agentic_rag_pipe.add_component("router", router)
agentic_rag_pipe.add_component("websearch", websearch)
agentic_rag_pipe.add_component("prompt_builder_for_websearch", prompt_builder_for_websearch)
agentic_rag_pipe.add_component("llm_for_websearch", llm_for_websearch)

agentic_rag_pipe.connect("retriever", "prompt_builder.documents")
agentic_rag_pipe.connect("prompt_builder.prompt", "llm.messages")
agentic_rag_pipe.connect("llm.replies", "router.replies")
agentic_rag_pipe.connect("router.go_to_websearch", "websearch.query")
agentic_rag_pipe.connect("router.go_to_websearch", "prompt_builder_for_websearch.query")
agentic_rag_pipe.connect("websearch.documents", "prompt_builder_for_websearch.documents")
agentic_rag_pipe.connect("prompt_builder_for_websearch", "llm_for_websearch")
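
The connect() calls above form a directed graph. As a sanity check on the wiring, you can write the same edges down as a plain dictionary and walk them. This sketch does not use Haystack at all: it only mirrors the component names from the tutorial, omits the socket names, and uses a hypothetical reachable_from helper.

```python
from collections import deque
from typing import Dict, List

# Component-level edges mirroring the connect() calls (socket names omitted).
edges: Dict[str, List[str]] = {
    "retriever": ["prompt_builder"],
    "prompt_builder": ["llm"],
    "llm": ["router"],
    "router": ["websearch", "prompt_builder_for_websearch"],
    "websearch": ["prompt_builder_for_websearch"],
    "prompt_builder_for_websearch": ["llm_for_websearch"],
    "llm_for_websearch": [],
}


def reachable_from(start: str, graph: Dict[str, List[str]]) -> List[str]:
    """Breadth-first traversal listing every component downstream of `start`."""
    seen = [start]
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for successor in graph[node]:
            if successor not in seen:
                seen.append(successor)
                queue.append(successor)
    return seen[1:]


print(reachable_from("router", edges))
```

Everything downstream of the router belongs to the web search branch, which runs only when the go_to_websearch output is produced.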

Visualize the Pipeline

To understand how you formed this pipeline with conditional routing, use the show() method of the pipeline.

# agentic_rag_pipe.show()

Running the Pipeline!

In run(), pass the query to the retriever, the prompt_builder, and the router.

query = "Where is Munich?"

result = agentic_rag_pipe.run(
    {"retriever": {"query": query}, "prompt_builder": {"query": query}, "router": {"query": query}}
)

# Print the `answer` coming from the ConditionalRouter
print(result["router"]["answer"])

✅ The answer to this query can be found in the defined document.

Now, try a different query that doesn’t have an answer in the given document and test if the web search works as expected:

query = "How many people live in Munich?"

result = agentic_rag_pipe.run(
    {"retriever": {"query": query}, "prompt_builder": {"query": query}, "router": {"query": query}}
)

# Print the `replies` generated using the web searched Documents
print(result["llm_for_websearch"]["replies"][0].text)

If you check the whole result, you will see that the websearch component also provides links to the Documents retrieved from the web:

result
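
Since the answer lives under a different key depending on which branch ran, a small helper can hide that difference. The sketch below is hypothetical: extract_answer is not part of Haystack, and it assumes the result layout used in this tutorial (result["router"]["answer"] for the document route, result["llm_for_websearch"]["replies"] for the web route) plus a "links" entry under "websearch", which is an assumption based on the links mentioned above. The demo uses fake result dictionaries, so it runs without API keys.

```python
from types import SimpleNamespace
from typing import Any, Dict, List, Tuple


def extract_answer(result: Dict[str, Any]) -> Tuple[str, str, List[str]]:
    """Return (source, answer text, links) from a pipeline result dictionary."""
    router_output = result.get("router", {})
    if "answer" in router_output:
        return "documents", router_output["answer"], []
    reply = result["llm_for_websearch"]["replies"][0]
    links = result.get("websearch", {}).get("links", [])
    return "websearch", reply.text, links


# Fake results for illustration only; real ones come from agentic_rag_pipe.run().
local_result = {"router": {"answer": "Munich is in Bavaria."}}
web_result = {
    "websearch": {"links": ["https://example.com/placeholder"]},
    "llm_for_websearch": {"replies": [SimpleNamespace(text="placeholder web answer")]},
}

print(extract_answer(local_result))
print(extract_answer(web_result))
```

Returning the source alongside the text makes it easy to show users whether an answer came from your own documents or from the web.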

What’s next

🎉 Congratulations! You’ve built an agentic RAG pipeline with conditional routing! You can now customize the condition for your specific use case and create a custom Haystack pipeline to meet your needs.

If you liked this tutorial, there’s more to learn about Haystack:

Evaluating RAG Pipelines
Simplifying Pipeline Inputs with Multiplexer

To stay up to date on the latest Haystack developments, you can sign up for our newsletter or join the Haystack Discord community.

Thanks for reading!