Maintained by deepset

Integration: Google Gen AI

Use Google's Gemini models with Haystack via the new Google Gen AI SDK

Authors
deepset
Gary Badwal

Overview

Google Gen AI provides access to Google’s Gemini models through the new Google Gen AI SDK. This integration lets you use Google’s latest generative models in Haystack through the updated API interface.

Haystack supports the latest models, such as gemini-2.0-flash for chat completion, function calling, and streaming responses, and text-embedding-004 for embedding generation.

Installation

Install the Google Gen AI integration:

pip install google-genai-haystack

Usage

Once installed, you will have access to the following Haystack components:

  • GoogleGenAIChatGenerator: Use this component with Gemini models, such as ‘gemini-2.0-flash’, for chat completion and function calling.
  • GoogleGenAIDocumentEmbedder: Use this component with Google Gen AI embedding models, such as ‘text-embedding-004’, for generating embeddings for documents.
  • GoogleGenAITextEmbedder: Use this component with Google Gen AI embedding models, such as ‘text-embedding-004’, for generating embeddings for text.

To use Google Gemini models, you need an API key. You can either pass it as an init argument or set a GOOGLE_API_KEY or GEMINI_API_KEY environment variable. If neither is set, you won’t be able to use these components.

To get an API key, visit Google AI Studio.
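
If you prefer passing the key as an init argument, wrap it in Haystack’s Secret. A minimal sketch (the api_key parameter follows the convention used across Haystack integrations):

from haystack.utils import Secret
from haystack_integrations.components.generators.google_genai import GoogleGenAIChatGenerator

# Resolve the key from a specific environment variable at runtime
chat_generator = GoogleGenAIChatGenerator(api_key=Secret.from_env_var("GOOGLE_API_KEY"))

# Or pass a token directly (avoid hardcoding real keys in source code)
chat_generator = GoogleGenAIChatGenerator(api_key=Secret.from_token("YOUR-GOOGLE-API-KEY"))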

Chat Generation with gemini-2.0-flash

To use a Gemini model for chat generation, set the GOOGLE_API_KEY or GEMINI_API_KEY environment variable and then initialize a GoogleGenAIChatGenerator with "gemini-2.0-flash":

import os
from haystack.dataclasses.chat_message import ChatMessage
from haystack_integrations.components.generators.google_genai import GoogleGenAIChatGenerator

os.environ["GOOGLE_API_KEY"] = "YOUR-GOOGLE-API-KEY"

# Initialize the chat generator
chat_generator = GoogleGenAIChatGenerator(model="gemini-2.0-flash")

# Generate a response
messages = [ChatMessage.from_user("Tell me about the future of AI")]
response = chat_generator.run(messages=messages)
print(response["replies"][0].text)

Output:

The future of AI is incredibly exciting and multifaceted, with developments spanning multiple domains...
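
You can also tune generation behavior at initialization. A minimal sketch, assuming the component forwards a generation_kwargs dict to the SDK, as most Haystack generators do (the keys shown mirror the Gemini API):

# generation_kwargs is assumed here; check the component's init signature
chat_generator = GoogleGenAIChatGenerator(
    model="gemini-2.0-flash",
    generation_kwargs={"temperature": 0.2, "max_output_tokens": 512},
)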

Streaming Chat Generation

For real-time streaming responses, you can use the streaming callback functionality:

import os
from haystack.dataclasses.chat_message import ChatMessage
from haystack.dataclasses import StreamingChunk
from haystack_integrations.components.generators.google_genai import GoogleGenAIChatGenerator

os.environ["GOOGLE_API_KEY"] = "YOUR-GOOGLE-API-KEY"

def streaming_callback(chunk: StreamingChunk):
    print(chunk.content, end='', flush=True)

# Initialize with streaming callback
chat_generator = GoogleGenAIChatGenerator(
    model="gemini-2.0-flash",
    streaming_callback=streaming_callback
)

# Generate a streaming response
messages = [ChatMessage.from_user("Write a short story about robots")]
response = chat_generator.run(messages=messages)
# Text will stream in real-time via the callback
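
Haystack also ships a ready-made print_streaming_chunk callback, so a custom function is not required:

from haystack.components.generators.utils import print_streaming_chunk

# Prints each chunk to stdout as it arrives
chat_generator = GoogleGenAIChatGenerator(
    model="gemini-2.0-flash",
    streaming_callback=print_streaming_chunk,
)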

Function Calling

When chatting with Gemini models, you can also use function calls for tool integration:

import os
from haystack.dataclasses.chat_message import ChatMessage
from haystack.tools import Tool
from haystack.components.tools import ToolInvoker
from haystack_integrations.components.generators.google_genai import GoogleGenAIChatGenerator

os.environ["GOOGLE_API_KEY"] = "YOUR-GOOGLE-API-KEY"

# Define a simple weather function
def get_weather(city: str) -> str:
    return f"The weather in {city} is sunny and 25ยฐC"

# Create a tool from the function
weather_tool = Tool(
    name="get_weather",
    description="Get weather information for a city",
    parameters={
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "The city name"}
        },
        "required": ["city"]
    },
    function=get_weather
)

# Initialize chat generator with tools
chat_generator = GoogleGenAIChatGenerator(
    model="gemini-2.0-flash",
    tools=[weather_tool]
)

# The model responds with a tool call for get_weather
messages = [ChatMessage.from_user("What's the weather like in Paris?")]
replies = chat_generator.run(messages=messages)["replies"]

# Execute the tool call and send the result back to the model
tool_invoker = ToolInvoker(tools=[weather_tool])
tool_messages = tool_invoker.run(messages=replies)["tool_messages"]
final_replies = chat_generator.run(messages=messages + replies + tool_messages)["replies"]

# The model turns the tool result into a natural-language response
print(final_replies[0].text)

Output:

The weather in Paris is sunny and 25°C.

Document Embedding

To use a Google model for document embedding generation, set the GOOGLE_API_KEY or GEMINI_API_KEY environment variable and then initialize a GoogleGenAIDocumentEmbedder:

import os
from haystack import Document
from haystack_integrations.components.embedders.google_genai import GoogleGenAIDocumentEmbedder

os.environ["GOOGLE_API_KEY"] = "YOUR-GOOGLE-API-KEY"

# Initialize the embedder
embedder = GoogleGenAIDocumentEmbedder()

# Generate embeddings for the documents
doc = Document(content="some text")
docs_w_embeddings = embedder.run(documents=[doc])["documents"]
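
Each returned Document now carries its vector in the embedding field, for example:

# e.g. 768 dimensions for text-embedding-004
print(len(docs_w_embeddings[0].embedding))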

Text Embedding

To use a Google model for text embedding generation, set the GOOGLE_API_KEY or GEMINI_API_KEY environment variable and then initialize a GoogleGenAITextEmbedder:

import os
from haystack_integrations.components.embedders.google_genai import GoogleGenAITextEmbedder

os.environ["GOOGLE_API_KEY"] = "YOUR-GOOGLE-API-KEY"

text_to_embed = "I love pizza!"

# Initialize the text embedder
text_embedder = GoogleGenAITextEmbedder()

# Generate and print the embedding
print(text_embedder.run(text_to_embed))

Output:

{'embedding': [-0.052871075, -0.035282962, ...., -0.04802792], 
'meta': {'model': 'text-embedding-004'}}
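
The two embedders are typically used together: the document embedder at indexing time and the text embedder at query time. Here is a minimal retrieval sketch using Haystack’s in-memory document store and retriever (sample texts and component names are illustrative):

import os
from haystack import Document, Pipeline
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
from haystack_integrations.components.embedders.google_genai import (
    GoogleGenAIDocumentEmbedder,
    GoogleGenAITextEmbedder,
)

os.environ["GOOGLE_API_KEY"] = "YOUR-GOOGLE-API-KEY"

# Index a few documents together with their embeddings
document_store = InMemoryDocumentStore(embedding_similarity_function="cosine")
documents = [
    Document(content="Pizza originated in Naples."),
    Document(content="Sushi is a traditional Japanese dish."),
]
docs_with_embeddings = GoogleGenAIDocumentEmbedder().run(documents=documents)["documents"]
document_store.write_documents(docs_with_embeddings)

# Query pipeline: embed the question, then retrieve by vector similarity
query_pipeline = Pipeline()
query_pipeline.add_component("text_embedder", GoogleGenAITextEmbedder())
query_pipeline.add_component("retriever", InMemoryEmbeddingRetriever(document_store=document_store))
query_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")

result = query_pipeline.run({"text_embedder": {"text": "Where does pizza come from?"}})
print(result["retriever"]["documents"][0].content)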