Function Calling and Multimodal QA with Gemini
Last Updated: August 8, 2025
by Tuana Celik, Tilde Thurium, and Silvano Cerza
This is a notebook showing how you can use Gemini + Vertex AI with Haystack.
To use Gemini models on the Gemini Developer API with Haystack, check out our documentation.
Gemini is Google's family of multimodal models. You can read more about its capabilities here.
Install dependencies
As a prerequisite, you need to have a Google Cloud Project set up that has access to Vertex AI and Gemini.
Useful resources:
Following that, you’ll only need to authenticate yourself in this Colab.
First things first, we need to install our dependencies, including the Google Gen AI integration:
! pip install haystack-ai google-genai-haystack trafilatura
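Note: the RAG pipeline at the end of this notebook uses SentenceTransformersSimilarityRanker, which relies on the optional sentence-transformers package. If your environment doesn't already ship it (Colab may), install it the same way:

! pip install sentence-transformers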
Let’s login using Application Default Credentials (ADCs). For more info see the official documentation.
from google.colab import auth
auth.authenticate_user()
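If you're running this outside Colab, you can set up ADC locally with the gcloud CLI instead (assuming you have it installed):

! gcloud auth application-default login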
Remember to set the project_id variable to a valid project ID that you have enough authorization to use for Gemini. We're going to use this one throughout the example!

You can find your project ID in the GCP resource manager or locally by running gcloud projects list in your terminal. For more info on the gcloud CLI, see the official documentation.
project_id = input("Enter your project ID:")
Use gemini-2.5-flash
Answer Questions
Now that we've set everything up, we can create an instance of our GoogleGenAIChatGenerator. This component supports both the Gemini Developer API and Vertex AI. For this demo, we will set api="vertex" and pass our project_id as vertex_ai_project.
from haystack_integrations.components.generators.google_genai import GoogleGenAIChatGenerator
gemini = GoogleGenAIChatGenerator(model="gemini-2.5-flash", api="vertex", vertex_ai_project=project_id, vertex_ai_location="europe-west1")
Let's start by asking something simple. This component expects a list of ChatMessage objects as input to its run() method. You can pass text or function calls through the messages.
from haystack.dataclasses import ChatMessage
messages = [ChatMessage.from_user("What is the most interesting thing you know?")]
result = gemini.run(messages=messages)
for answer in result["replies"]:
print(answer.text)
The most interesting thing I know, and one of the most profound mysteries in all of science, is that **about 95% of the universe is made of something we cannot see or directly detect: dark energy and dark matter.**
Imagine if 95% of the world around you was completely invisible and unknown, yet it fundamentally shaped everything you *could* see. That's our current situation with the cosmos.
* **Dark Matter** makes up about 27% of the universe. We know it exists because of its gravitational effects – it holds galaxies together, prevents clusters from flying apart, and influenced the large-scale structure of the early universe. But it doesn't absorb, reflect, or emit light, making it "dark." We don't know what particles it's made of.
* **Dark Energy** makes up about 68% of the universe. It's an even bigger enigma. We infer its existence because it's responsible for the accelerated expansion of the universe. It's essentially pushing the cosmos apart, overcoming the attractive force of gravity. Its nature is one of the biggest unsolved problems in physics.
This means that all the stars, planets, galaxies, gas, and dust – everything we can observe with telescopes – makes up only about 5% of the universe's total mass-energy content. The vast majority of reality is utterly mysterious, and understanding it is one of the greatest scientific quests of our time. It dictates the fate of the cosmos itself.
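If you'd rather see the answer arrive token by token, you can pass a streaming callback. Here's a minimal sketch, assuming GoogleGenAIChatGenerator accepts a streaming_callback like other Haystack chat generators:

from haystack.components.generators.utils import print_streaming_chunk

# Print each chunk to stdout as it arrives instead of waiting for the full reply
streaming_gemini = GoogleGenAIChatGenerator(
    model="gemini-2.5-flash",
    api="vertex",
    vertex_ai_project=project_id,
    vertex_ai_location="europe-west1",
    streaming_callback=print_streaming_chunk,
)
streaming_gemini.run(messages=[ChatMessage.from_user("Tell me a fun fact about the Moon.")])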
Answer Questions about Images
Let's try something a bit different! gemini-2.5-flash can also work with images. Let's see if we can have it answer questions about some robots 👇
We’re going to download some images for this example. 🤖
from haystack.dataclasses import ImageContent
urls = [
"https://upload.wikimedia.org/wikipedia/en/5/5c/C-3PO_droid.png",
"https://platform.theverge.com/wp-content/uploads/sites/2/chorus/assets/4658579/terminator_endoskeleton_1020.jpg",
"https://upload.wikimedia.org/wikipedia/en/3/39/R2-D2_Droid.png",
]
images = [ImageContent.from_url(url) for url in urls]
messages = [ChatMessage.from_user(content_parts=["What can you tell me about these robots? Be short and graceful.", *images])]
result = gemini.run(messages=messages)
for answer in result["replies"]:
print(answer.text)
These are iconic robots from popular culture:
1. **C-3PO:** A refined protocol droid, fluent in countless languages, known for his golden appearance and nervous demeanor.
2. **T-800 Endoskeleton:** A formidable, relentless combat machine, skeletal and chilling, from a dystopian future.
3. **R2-D2:** A courageous and resourceful astromech, full of personality, who communicates in beeps and whistles.
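Images don't have to come from the web. ImageContent can also load files from disk. A small sketch, assuming a local robot.png exists (a hypothetical file):

# Load a local image instead of fetching a URL
local_image = ImageContent.from_file_path("robot.png")
messages = [ChatMessage.from_user(content_parts=["Which robot is this?", local_image])]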
Function Calling with gemini-2.5-flash
With gemini-2.5-flash, we can also use function calling! So let's see how we can do that 👇
Let's see if we can build a system that can run a get_current_weather function, based on a question asked in natural language.

First we create our function definition and tool (learn more about Tools in the docs). For demonstration purposes, we're simply creating a get_current_weather function that returns an object which will always tell us it's sunny and 21.8 degrees… If it's Celsius, that's a good day! ☀️
from haystack.components.tools import ToolInvoker
from haystack.tools import tool
from typing import Annotated
@tool
def get_current_weather(
location: Annotated[str, "The city for which to get the weather, e.g. 'San Francisco'"] = "Munich",
unit: Annotated[str, "The unit for the temperature, e.g. 'celsius'"] = "celsius",
):
return {"weather": "sunny", "temperature": 21.8, "unit": unit}
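The @tool decorator turns the function into a Haystack Tool, deriving a JSON schema from the type hints. If you're curious what Gemini will see, you can inspect it (the attribute names below assume Haystack's Tool dataclass):

print(get_current_weather.name)        # "get_current_weather"
print(get_current_weather.parameters)  # JSON schema derived from the Annotated hints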
user_message = [ChatMessage.from_user("What is the temperature in celsius in Berlin?")]
replies = gemini.run(messages=user_message, tools=[get_current_weather])["replies"]
print(replies)
[ChatMessage(_role=<ChatRole.ASSISTANT: 'assistant'>, _content=[TextContent(text=''), ToolCall(tool_name='get_current_weather', arguments={'unit': 'celsius', 'location': 'Berlin'}, id=None)], _name=None, _meta={'model': 'gemini-2.5-flash', 'finish_reason': 'stop', 'usage': {'prompt_tokens': 53, 'completion_tokens': 10, 'total_tokens': 126}})]
Look at that! We got a message with some interesting information now. We can use that information to call a real function locally.
Let’s do exactly that and pass the result back to Gemini.
tool_invoker = ToolInvoker(tools=[get_current_weather])
tool_messages = tool_invoker.run(messages=replies)["tool_messages"]
print(tool_messages)
messages = user_message + replies + tool_messages
res = gemini.run(messages=messages)
print(res["replies"][0].text)
[ChatMessage(_role=<ChatRole.TOOL: 'tool'>, _content=[ToolCallResult(result="{'weather': 'sunny', 'temperature': 21.8, 'unit': 'celsius'}", origin=ToolCall(tool_name='get_current_weather', arguments={'unit': 'celsius', 'location': 'Berlin'}, id=None), error=False)], _name=None, _meta={})]
The temperature in Berlin is 21.8°C and it's sunny.
Seems like the weather is nice and sunny. Remember to put on your sunglasses. 😎
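For a single tool call this manual round-trip works fine, but a model can request tool calls over several turns. Here's a minimal sketch of a loop that keeps invoking tools until Gemini answers in plain text (run_with_tools is a hypothetical helper, not part of Haystack):

def run_with_tools(generator, tools, messages, max_turns=5):
    tool_invoker = ToolInvoker(tools=tools)
    for _ in range(max_turns):
        replies = generator.run(messages=messages, tools=tools)["replies"]
        messages = messages + replies
        # If the model didn't request any tool calls, we have our final answer
        if not replies[0].tool_calls:
            return replies[0].text
        # Otherwise, execute the requested tools and feed the results back
        tool_messages = tool_invoker.run(messages=replies)["tool_messages"]
        messages = messages + tool_messages
    return None

print(run_with_tools(gemini, [get_current_weather], [ChatMessage.from_user("How warm is it in Munich?")]))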
Build a full Retrieval-Augmented Generation Pipeline with gemini-2.5-flash
As a final exercise, let's add the GoogleGenAIChatGenerator to a full RAG pipeline. In the example below, we build a RAG pipeline that does question answering on the web, using gemini-2.5-flash.
from haystack.components.fetchers.link_content import LinkContentFetcher
from haystack.components.converters import HTMLToDocument
from haystack.components.preprocessors import DocumentSplitter
from haystack.components.rankers import SentenceTransformersSimilarityRanker
from haystack.components.builders.chat_prompt_builder import ChatPromptBuilder
from haystack import Pipeline
fetcher = LinkContentFetcher()
converter = HTMLToDocument()
document_splitter = DocumentSplitter(split_by="word", split_length=50)
similarity_ranker = SentenceTransformersSimilarityRanker(top_k=3)
gemini = GoogleGenAIChatGenerator(model="gemini-2.5-flash", api="vertex", vertex_ai_project=project_id, vertex_ai_location="europe-west1")
prompt_template = [ChatMessage.from_user("""
According to these documents:
{% for doc in documents %}
{{ doc.content }}
{% endfor %}
Answer the given question: {{question}}
Answer:
""")]
prompt_builder = ChatPromptBuilder(template=prompt_template)
pipeline = Pipeline()
pipeline.add_component("fetcher", fetcher)
pipeline.add_component("converter", converter)
pipeline.add_component("splitter", document_splitter)
pipeline.add_component("ranker", similarity_ranker)
pipeline.add_component("prompt_builder", prompt_builder)
pipeline.add_component("gemini", gemini)
pipeline.connect("fetcher.streams", "converter.sources")
pipeline.connect("converter.documents", "splitter.documents")
pipeline.connect("splitter.documents", "ranker.documents")
pipeline.connect("ranker.documents", "prompt_builder.documents")
pipeline.connect("prompt_builder.prompt", "gemini")
Let’s try asking Gemini to tell us about Haystack and how to use it.
question = "What do graphs have to do with Haystack?"
result = pipeline.run({"prompt_builder": {"question": question},
"ranker": {"query": question},
"fetcher": {"urls": ["https://haystack.deepset.ai/blog/introducing-haystack-2-beta-and-advent"]}})
for message in result["gemini"]["replies"]:
print(message.text)
In Haystack, pipelines are structured as graphs. Specifically, Haystack 1.x pipelines were based on Directed Acyclic Graphs (DAGs). In Haystack 2.0, the "A" (acyclic) is being removed from DAG, meaning pipelines can now branch out, join, and cycle back to other components, allowing for more complex graph structures that can retry or loop.
Now you've seen some of what Gemini can do, as well as how to integrate it with Haystack. If you want to learn more, check out the Haystack docs or tutorials.