## ⭐️ Highlights
### 🧠 Agent Breakpoints
This release introduces Agent Breakpoints, a powerful new feature that enhances debugging and observability when working with Haystack Agents. You can pause execution mid-run by inserting breakpoints in the Agent or its tools to inspect internal state and resume execution seamlessly. This brings fine-grained control to agent development and significantly improves traceability during complex interactions.
```python
from haystack.dataclasses import ChatMessage
from haystack.dataclasses.breakpoints import AgentBreakpoint, Breakpoint

chat_generator_breakpoint = Breakpoint(
    component_name="chat_generator",
    visit_count=0,
    snapshot_file_path="debug_snapshots",
)

agent_breakpoint = AgentBreakpoint(break_point=chat_generator_breakpoint, agent_name="calculator_agent")

# `agent` is an Agent created earlier
response = agent.run(
    messages=[ChatMessage.from_user("What is 7 * (4 + 2)?")],
    break_point=agent_breakpoint,
)
```
### 🖼️ Multimodal Pipelines and Agents
You can now blend text and image capabilities across generation, indexing, and retrieval in Haystack.
- **New `ImageContent` dataclass**: a dedicated structure to store image data along with `base64_image`, `mime_type`, `detail`, and `metadata`.
- **Image-Aware Chat Generators**: image inputs are now supported in `OpenAIChatGenerator`.

  ```python
  from haystack.dataclasses import ImageContent, ChatMessage
  from haystack.components.generators.chat import OpenAIChatGenerator

  image_url = "https://cdn.britannica.com/79/191679-050-C7114D2B/Adult-capybara.jpg"
  image_content = ImageContent.from_url(image_url)

  message = ChatMessage.from_user(
      content_parts=["Describe the image in short.", image_content]
  )

  llm = OpenAIChatGenerator(model="gpt-4o-mini")
  print(llm.run([message])["replies"][0].text)
  ```
- **Powerful Multimodal Components**:
  - `PDFToImageContent`, `ImageFileToImageContent`, `DocumentToImageContent`: convert PDFs, image files, and Documents into `ImageContent` objects.
  - `LLMDocumentContentExtractor`: extract text from images using a vision-enabled LLM.
  - `SentenceTransformersDocumentImageEmbedder`: generate embeddings from image-based documents using models like CLIP.
  - `DocumentLengthRouter`: route documents based on textual content length, ideal for distinguishing scanned PDFs from text-based ones.
  - `DocumentTypeRouter`: route documents automatically based on MIME type metadata.
- **Prompt Building with Image Support**: `ChatPromptBuilder` now supports templates with embedded images, enabling dynamic multimodal prompt creation.
With these additions, you can now build multimodal agents and RAG pipelines that reason over both text and visual content, unlocking richer interactions and retrieval capabilities.
📚 Learn more about multimodality in our Introduction to Multimodal Text Generation.
## 🚀 New Features
- Add `to_dict` and `from_dict` to `ByteStream` so it is consistent with our other dataclasses in having serialization and deserialization methods.
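
  A quick roundtrip sketch of the new methods:

  ```python
  from haystack.dataclasses import ByteStream

  bs = ByteStream(data=b"hello", mime_type="text/plain")
  # serialize to a plain dict and back
  restored = ByteStream.from_dict(bs.to_dict())
  assert restored.data == bs.data
  ```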
- Add `to_dict` and `from_dict` to the `StreamingChunk`, `ToolCallResult`, `ToolCall`, `ComponentInfo`, and `ToolCallDelta` classes to make them consistent with our other dataclasses in having serialization and deserialization methods.
- Added the `tool_invoker_kwargs` param to `Agent` so additional kwargs can be passed to the `ToolInvoker`, like `max_workers` and `enable_streaming_callback_passthrough`.
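
  A minimal sketch of passing these through (`my_tools` is a hypothetical list of Tool objects):

  ```python
  from haystack.components.agents import Agent
  from haystack.components.generators.chat import OpenAIChatGenerator

  agent = Agent(
      chat_generator=OpenAIChatGenerator(model="gpt-4o-mini"),
      tools=my_tools,  # assumed to be defined elsewhere
      # forwarded to the underlying ToolInvoker
      tool_invoker_kwargs={"max_workers": 4, "enable_streaming_callback_passthrough": True},
  )
  ```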
- `ChatPromptBuilder` now supports special string templates in addition to a list of `ChatMessage` objects. This new format is more flexible and allows structured parts like images to be included in the templatized `ChatMessage`.

  ```python
  from haystack.components.builders import ChatPromptBuilder
  from haystack.dataclasses.chat_message import ImageContent

  template = """
  {% message role="user" %}
  Hello! I am {{user_name}}. What's the difference between the following images?
  {% for image in images %}
  {{ image | templatize_part }}
  {% endfor %}
  {% endmessage %}
  """

  images = [
      ImageContent.from_file_path("apple-fruit.jpg"),
      ImageContent.from_file_path("apple-logo.jpg"),
  ]

  builder = ChatPromptBuilder(template=template)
  builder.run(user_name="John", images=images)
  ```
- Added convenience class methods to the `ImageContent` dataclass to create `ImageContent` objects from file paths and URLs.
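
  For example (the file path is hypothetical; `from_url` is shown in the Highlights above):

  ```python
  from haystack.dataclasses import ImageContent

  image = ImageContent.from_file_path("diagrams/architecture.png")
  ```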
- Added multiple converters to help convert image data between different formats:
  - `DocumentToImageContent`: converts documents sourced from PDF and image files into `ImageContent` objects.
  - `ImageFileToImageContent`: converts image files to `ImageContent` objects.
  - `ImageFileToDocument`: converts image file references into empty `Document` objects with associated metadata.
  - `PDFToImageContent`: converts PDF files to `ImageContent` objects.
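
  `ImageFileToDocument` is handy for indexing pipelines where the image itself is processed later. A sketch (the module path, input file, and `documents` output key are assumptions):

  ```python
  from haystack.components.converters.image import ImageFileToDocument

  converter = ImageFileToDocument()
  # produces content-less Documents whose meta references the image files
  docs = converter.run(sources=["photo.jpg"])["documents"]
  ```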
- Chat Messages with the user role can now include images using the new `ImageContent` dataclass. We've added image support to `OpenAIChatGenerator`, and plan to support more model providers over time.
- Raise a warning when a pipeline can no longer proceed because all remaining components are blocked from running and no expected pipeline outputs have been produced. This scenario can occur legitimately, for example in pipelines with mutually exclusive branches where some components are intentionally blocked. To help avoid false positives, the check ensures that none of the expected outputs (as defined by `Pipeline().outputs()`) have been generated during the current run.
- Added `source_id_meta_field` and `split_id_meta_field` to `SentenceWindowRetriever` for customizable metadata field names. Added `raise_on_missing_meta_fields` to control whether a `ValueError` is raised if any of the documents at runtime are missing the required meta fields (set to `True` by default). If `False`, the documents missing the meta fields will be skipped when retrieving their windows, but the original documents will still be included in the results.
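
  A sketch of the new parameters (the document store, window size, and field names are illustrative):

  ```python
  from haystack.components.retrievers import SentenceWindowRetriever
  from haystack.document_stores.in_memory import InMemoryDocumentStore

  retriever = SentenceWindowRetriever(
      document_store=InMemoryDocumentStore(),
      window_size=2,
      source_id_meta_field="source_id",
      split_id_meta_field="split_id",
      raise_on_missing_meta_fields=False,  # skip windows for docs missing the fields
  )
  ```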
- Add a `ComponentInfo` dataclass to the `haystack.dataclasses` module. This dataclass is used to store information about a component. We pass it to `StreamingChunk` so we can tell which component a stream is coming from.
- Pass the `component_info` to the `StreamingChunk` in `OpenAIChatGenerator`, `AzureOpenAIChatGenerator`, `HuggingFaceAPIChatGenerator`, and `HuggingFaceLocalChatGenerator`.
- Added `enable_streaming_callback_passthrough` to the `ToolInvoker` init, `run`, and `run_async` methods. If set to `True`, the `ToolInvoker` will try to pass the `streaming_callback` function to a tool's invoke method, but only if the tool's invoke method has `streaming_callback` in its signature.
- Added a dedicated `finish_reason` field to the `StreamingChunk` class to improve type safety and enable sophisticated streaming UI logic. The field uses a `FinishReason` type alias with the standard values `"stop"`, `"length"`, `"tool_calls"`, and `"content_filter"`, plus the Haystack-specific value `"tool_call_results"` (used by `ToolInvoker` to indicate tool execution completion).
- Updated the `ToolInvoker` component to use the new `finish_reason` field when streaming tool results. The component now sets `finish_reason="tool_call_results"` in the final streaming chunk to indicate that tool execution has completed, while maintaining backward compatibility by also setting the value in `meta["finish_reason"]`.
- Added a new `HuggingFaceTEIRanker` component to enable reranking with the Text Embeddings Inference (TEI) API. This component supports both self-hosted Text Embeddings Inference services and Hugging Face Inference Endpoints.
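
  A minimal sketch against a self-hosted TEI service (the `url` parameter name and the localhost endpoint are assumptions):

  ```python
  from haystack.components.rankers import HuggingFaceTEIRanker
  from haystack.dataclasses import Document

  ranker = HuggingFaceTEIRanker(url="http://localhost:8080")
  result = ranker.run(
      query="What is Haystack?",
      documents=[Document(content="Haystack is an open-source AI framework.")],
  )
  ```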
- Added a `raise_on_failure` boolean parameter to `OpenAIDocumentEmbedder` and `AzureOpenAIDocumentEmbedder`. If set to `True`, the component will raise an exception when there is an error with the API request. It is set to `False` by default, so the previous behavior of logging an exception and continuing remains the default.
- `ToolInvoker` now executes `tool_calls` in parallel in both sync and async modes.
- Add `AsyncHFTokenStreamingHandler` for async streaming support in `HuggingFaceLocalChatGenerator`.
- We introduced the `LLMMessagesRouter` component, which routes Chat Messages to different connections using a generative Language Model to perform classification. This component can be used with general-purpose LLMs and with specialized LLMs for moderation like Llama Guard. Usage example:

  ```python
  from haystack.components.generators.chat import HuggingFaceAPIChatGenerator
  from haystack.components.routers.llm_messages_router import LLMMessagesRouter
  from haystack.dataclasses import ChatMessage

  # initialize a Chat Generator with a generative model for moderation
  chat_generator = HuggingFaceAPIChatGenerator(
      api_type="serverless_inference_api",
      api_params={"model": "meta-llama/Llama-Guard-4-12B", "provider": "groq"},
  )

  router = LLMMessagesRouter(
      chat_generator=chat_generator,
      output_names=["unsafe", "safe"],
      output_patterns=["unsafe", "safe"],
  )

  print(router.run([ChatMessage.from_user("How to rob a bank?")]))
  ```
- For `HuggingFaceAPIGenerator` and `HuggingFaceAPIChatGenerator`, all additional key-value pairs passed in `api_params` are now passed to the initialization of the underlying Inference Clients. This allows passing additional parameters to the clients, like `timeout`, `headers`, `provider`, etc. As a result, we can now easily specify a different inference provider by passing the `provider` key in `api_params`.
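
  For instance (the model name and provider are illustrative; any extra keys are forwarded to the client):

  ```python
  from haystack.components.generators.chat import HuggingFaceAPIChatGenerator

  generator = HuggingFaceAPIChatGenerator(
      api_type="serverless_inference_api",
      api_params={
          "model": "meta-llama/Llama-3.1-8B-Instruct",
          "provider": "together",  # forwarded to the underlying Inference Client
          "timeout": 30,
      },
  )
  ```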
- Updated `StreamingChunk` to add the fields `tool_calls`, `tool_call_result`, `index`, and `start` to make it easier to format the stream in a streaming callback.
- Added a new dataclass `ToolCallDelta` for the `StreamingChunk.tool_calls` field to reflect that the arguments can be a string delta.
- Updated the `print_streaming_chunk` and `_convert_streaming_chunks_to_chat_message` utility methods to use these new fields. This especially improves the formatting when using `print_streaming_chunk` with `Agent`.
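
  For example (a sketch; the tools list is left empty for brevity):

  ```python
  from haystack.components.agents import Agent
  from haystack.components.generators.chat import OpenAIChatGenerator
  from haystack.components.generators.utils import print_streaming_chunk
  from haystack.dataclasses import ChatMessage

  agent = Agent(
      chat_generator=OpenAIChatGenerator(model="gpt-4o-mini"),
      tools=[],
      streaming_callback=print_streaming_chunk,
  )
  agent.run(messages=[ChatMessage.from_user("Hello!")])
  ```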
- Updated `OpenAIGenerator`, `OpenAIChatGenerator`, `HuggingFaceAPIGenerator`, `HuggingFaceAPIChatGenerator`, `HuggingFaceLocalGenerator`, and `HuggingFaceLocalChatGenerator` to follow the new dataclasses.
- Updated `ToolInvoker` to follow the `StreamingChunk` dataclass.
## ⬆️ Upgrade Notes
- `HuggingFaceAPIGenerator` might no longer work with the Hugging Face Inference API. As of July 2025, the Hugging Face Inference API no longer offers generative models that support the `text_generation` endpoint; generative models are now only available through providers that support the `chat_completion` endpoint. The component still works with Hugging Face Inference Endpoints and self-hosted TGI instances. To use generative models via the Hugging Face Inference API, please use the `HuggingFaceAPIChatGenerator` component, which supports the `chat_completion` endpoint.
- All parameters of the `Pipeline.draw()` and `Pipeline.show()` methods must now be specified as keyword arguments. Example:

  ```python
  pipeline.draw(
      path="output.png",
      server_url="https://custom-server.com",
      params=None,
      timeout=30,
      super_component_expansion=False,
  )
  ```
- The deprecated `async_executor` parameter has been removed from the `ToolInvoker` class. Please use the `max_workers` parameter instead; a `ThreadPoolExecutor` with that many workers will be created automatically for parallel tool invocations.
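
  For example (the tools list is assumed to be defined elsewhere):

  ```python
  from haystack.components.tools import ToolInvoker

  # previously: ToolInvoker(tools=tools, async_executor=executor)
  invoker = ToolInvoker(tools=tools, max_workers=8)  # a ThreadPoolExecutor is created internally
  ```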
- The deprecated `State` class has been removed from the `haystack.dataclasses` module. The `State` class is now part of the `haystack.components.agents` module.
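
  Update imports accordingly:

  ```python
  # before: from haystack.dataclasses import State
  from haystack.components.agents import State
  ```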
- Removed the `deserialize_value_with_schema_legacy` function from the `base_serialization` module. This function was used to deserialize `State` objects created with Haystack 2.14.0 or older. Support for the old serialization format is removed in Haystack 2.16.0.
## ⚡️ Enhancement Notes
- Add a `guess_mime_type` parameter to `ByteStream.from_file_path()`.
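
  For example (the file path is hypothetical):

  ```python
  from haystack.dataclasses import ByteStream

  bs = ByteStream.from_file_path("report.pdf", guess_mime_type=True)
  print(bs.mime_type)  # "application/pdf"
  ```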
- Add the init parameter `skip_empty_documents` to the `DocumentSplitter` component. The default value is `True`. Setting it to `False` can be useful when downstream components in the Pipeline (like `LLMDocumentContentExtractor`) can extract text from non-textual documents.
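
  For instance (the splitting parameters are illustrative):

  ```python
  from haystack.components.preprocessors import DocumentSplitter

  # keep empty documents so a downstream extractor can still process them
  splitter = DocumentSplitter(split_by="word", split_length=200, skip_empty_documents=False)
  ```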
- Test that our type validation and connection validation work with the built-in Python types introduced in 3.9. We found that these types were already supported; we now add explicit tests for them.
- We relaxed the requirement in `ToolCallDelta` (introduced in Haystack 2.15) that either the `arguments` or the `name` parameter must be populated to create a `ToolCallDelta` dataclass. We removed this requirement to be more in line with OpenAI's SDK, since it was causing errors for some hosted versions of open-source models following OpenAI's SDK specification.
- Added a `return_embedding` parameter to the `InMemoryDocumentStore.__init__` method.
- Updated the `bm25_retrieval` and `filter_documents` methods to use `self.return_embedding` to determine whether embeddings are returned.
- Updated tests (test_in_memory and test_in_memory_embedding_retriever) to reflect the changes in the `InMemoryDocumentStore`.
- Added a new `deserialize_component_inplace` function to handle generic component deserialization that works with any component type.
- Made doc-parser a core dependency, since `ComponentTool`, which uses it, is one of the core Tool components.
- Make the `PipelineBase().validate_input` method public so users can use it with the confidence that it won't receive breaking changes without warning. This method is useful for checking that all required connections in a pipeline are connected and is automatically called in the `run` method of `Pipeline`. It is exposed as public for users who would like to validate the pipeline before runtime.
- For component-run Datadog tracing, set the span resource name to the component name instead of the operation name.
- Added a `trust_remote_code` parameter to the `SentenceTransformersSimilarityRanker` component. When set to `True`, this enables execution of custom models and scripts hosted on the Hugging Face Hub.
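
  For example (the model name is hypothetical):

  ```python
  from haystack.components.rankers import SentenceTransformersSimilarityRanker

  ranker = SentenceTransformersSimilarityRanker(
      model="my-org/custom-ranker",  # a model that ships custom code
      trust_remote_code=True,
  )
  ```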
- Add a new parameter `require_tool_call_ids` to `ChatMessage.to_openai_dict_format`. The default is `True`, for compatibility with OpenAI's Chat API: if the `id` field is missing in a Tool Call, an error is raised. Using `False` is useful for shallow OpenAI-compatible APIs, where the `id` field is not required.
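
  A sketch of the lenient mode (the tool call itself is illustrative):

  ```python
  from haystack.dataclasses import ChatMessage, ToolCall

  message = ChatMessage.from_assistant(
      tool_calls=[ToolCall(tool_name="search", arguments={"query": "capybaras"})]  # no id
  )
  # the target API does not require tool call ids, so no error is raised
  openai_dict = message.to_openai_dict_format(require_tool_call_ids=False)
  ```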
- Haystack's core modules are now ["type complete"](https://typing.python.org/en/latest/guides/libraries.html#how-much-of-my-library-needs-types), meaning that all function parameters and return types are explicitly annotated. This increases the usefulness of the newly added `py.typed` marker and sidesteps differences in type inference between the various type checker implementations.
- Refactored the `HuggingFaceAPIChatGenerator` to use the utility method `_convert_streaming_chunks_to_chat_message`. This helps keep the way we convert `StreamingChunks` into a final `ChatMessage` consistent.
- We also add `ComponentInfo` to the `StreamingChunks` made in `HuggingFaceGenerator` and `HuggingFaceLocalGenerator` so we can tell which component a stream is coming from.
- If only system messages are provided as input, a warning will be logged indicating that this is likely not intended and that the user should probably also provide user messages.
## 🐛 Bug Fixes
- Fix `_convert_streaming_chunks_to_chat_message`, which is used to convert Haystack `StreamingChunks` into a Haystack `ChatMessage`. This fixes the scenario where one `StreamingChunk` contains two `ToolCallDeltas` in `StreamingChunk.tool_calls`. With this fix, both `ToolCallDeltas` are saved correctly, whereas before they were overwriting each other. This only occurs with some LLM providers, like Mistral (and not OpenAI), due to how the provider returns tool calls.
- Fixed a bug in the `print_streaming_chunk` utility function that prevented the tool call name from being printed.
- Fixed the `to_dict` and `from_dict` of `ToolInvoker` to properly serialize the `streaming_callback` init parameter.
- Fixed a bug where, if `raise_on_failure=False` and an error occurred mid-batch, the following embeddings would be paired with the wrong documents.
- Fix the `component_invoker` used by `ComponentTool` to work when a dataclass like `ChatMessage` is directly passed to `component_tool.invoke(...)`. Previously this would either cause an error or silently skip your input.
- Fixed a bug in the `LLMMetadataExtractor` that occurred when processing `Document` objects with `None` or empty string content. The component now gracefully handles these cases by marking such documents as failed and providing an appropriate error message in their metadata, without attempting an LLM call.
- `RecursiveDocumentSplitter` now generates a unique `Document.id` for every chunk. The meta fields (`split_id`, `parent_id`, etc.) are populated before `Document` creation, so the hash used for `id` generation is always unique.
- In `ConditionalRouter`, fixed the `to_dict` and `from_dict` methods to properly handle the case when `output_type` is a List of types or a List of strings. This occurs when a user specifies a route in `ConditionalRouter` to have multiple outputs.
- Fix serialization of `GeneratedAnswer` when `ChatMessage` objects are nested in `meta`.
- Fix the serialization of `ComponentTool` and `Tool` when specifying `outputs_to_string`. Previously, an error occurred on deserialization right after serializing if `outputs_to_string` was not `None`.
- When calling `set_output_types`, we now also check that the `@component.output_types` decorator is not present on the `run_async` method of a Component. Previously we only checked that the `Component.run` method did not possess the decorator.
- Fix type comparison in schema validation by replacing `is not` with `!=` when checking the type `List[ChatMessage]`. This prevents false mismatches due to Python's `is` operator comparing object identity instead of equality.
- Re-export symbols in `__init__.py` files. This ensures that short imports like `from haystack.components.builders import ChatPromptBuilder` work equivalently to `from haystack.components.builders.chat_prompt_builder import ChatPromptBuilder`, without causing errors or warnings in mypy/Pylance.
- The `SuperComponent` class can now correctly serialize and deserialize a `SuperComponent` based on an async pipeline. Previously, the `SuperComponent` class always assumed the underlying pipeline was synchronous.
## ⚠️ Deprecation Notes

- The `async_executor` parameter in `ToolInvoker` is deprecated in favor of the `max_workers` parameter and will be removed in Haystack 2.16.0. You can use the `max_workers` parameter to control the number of threads used for parallel tool calling.
🙏 Big thank you to everyone who contributed to this release!
@Amnah199 @RafaelJohn9 @anakin87 @bilgeyucel @davidsbatista @julian-risch @kanenorman @kr1shnasomani @mathislucka @mpangrazzi @sjr @srishti-git1110