βοΈ Highlights
Enhancements for Complex Agentic Systems
We’ve improved agent workflows with better message handling and streaming support.
Agent component now returns a last_message
output for quick access to the final message, and can use a streaming_callback
to emit tool results in real time. You can use the updated print_streaming_chunk
or write your own callback function to enable ToolCall details during streaming.
from haystack.components.websearch import SerperDevWebSearch
from haystack.components.agents import Agent
from haystack.components.generators.utils import print_streaming_chunk
from haystack.tools import tool, ComponentTool
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
web_search = ComponentTool(name="web_search", component=SerperDevWebSearch(top_k=5))
wiki_search = ComponentTool(name="wiki_search", component=SerperDevWebSearch(top_k=5, allowed_domains=["https://www.wikipedia.org/"]))
research_agent = Agent(
chat_generator=OpenAIChatGenerator(model="gpt-4o-mini"),
system_prompt="""
You are a research agent that can find information on web or specifically on wikipedia.
Use wiki_search tool if you need facts and use web_search tool for latest news on topics.
Use one tool at a time, use the other tool if the retrieved information is not enough.
Summarize the retrieved information before returning response to the user.
""",
tools=[web_search, wiki_search],
streaming_callback=print_streaming_chunk
)
result = research_agent.run(messages=[ChatMessage.from_user("Can you tell me about Florence Nightingale's life?")])
Enabling streaming with print_streaming_chunk
function looks like this:
[TOOL CALL]
Tool: wiki_search
Arguments: {"query":"Florence Nightingale"}
[TOOL RESULT]
{'documents': [{'title': 'List of schools in Nottinghamshire', 'link': 'https://www.wikipedia.org/wiki/List_of_schools_in_Nottinghamshire', 'position': 1, 'id': 'a6d0fe00f1e0cd06324f80fb926ba647878fb7bee8182de59a932500aeb54a5b', 'content': 'The Florence Nightingale Academy, Eastwood; The Flying High Academy, Mansfield; Forest Glade Primary School, Sutton-in-Ashfield; Forest Town Primary School ...', 'blob': None, 'score': None, 'embedding': None, 'sparse_embedding': None}], 'links': ['https://www.wikipedia.org/wiki/List_of_schools_in_Nottinghamshire']}
...
Print the last_message
print("Final Answer:", result["last_message"].text)
>>> Final Answer: Florence Nightingale (1820-1910) was a pioneering figure in nursing and is often hailed as the founder of modern nursing. She was born...
Additionally,
AnswerBuilder stores all generated messages in all_messages
meta field of GeneratedAnswer and supports a new last_message_only
mode for lightweight flows where only the final message needs to be processed.
Visualizing Pipelines with SuperComponents
We extended pipeline.draw()
and pipeline.show()
, which save pipeline diagrams to images files or display them in Jupyter notebooks. You can now pass super_component_expansion=True
to expand any SuperComponents and draw more detailed visualizations.
Here is an example with a pipeline containing
MultiFileConverter and
DocumentPreprocssor SuperComponents. After installing the dependencies that the MultiFileConverter
needs for all supported file formats via pip install haystack-ai pypdf markdown-it-py mdit_plain trafilatura python-pptx python-docx jq openpyxl tabulate pandas
, you can run:
from pathlib import Path
from haystack import Pipeline
from haystack.components.converters import MultiFileConverter
from haystack.components.preprocessors import DocumentPreprocessor
from haystack.components.writers import DocumentWriter
from haystack.document_stores.in_memory import InMemoryDocumentStore
document_store = InMemoryDocumentStore()
pipeline = Pipeline()
pipeline.add_component("converter", MultiFileConverter())
pipeline.add_component("preprocessor", DocumentPreprocessor())
pipeline.add_component("writer", DocumentWriter(document_store = document_store))
pipeline.connect("converter", "preprocessor")
pipeline.connect("preprocessor", "writer")
# expanded pipeline that shows all components
path = Path("expanded_pipeline.png")
pipeline.draw(path=path, super_component_expansion=True)
# original pipeline
path = Path("original_pipeline.png")
pipeline.draw(path=path)
SentenceTransformersSimilarityRanker with PyTorch, ONNX, and OpenVINO
We added a new
SentenceTransformersSimilarityRanker component that uses the Sentence Transformers library to rank documents based on their semantic similarity to the query. This component replaces the legacy TransformersSimilarityRanker
component, which may be deprecated in a future release, with removal following a deprecation period. The SentenceTransformersSimilarityRanker
also allows choosing different inference backends: PyTorch, ONNX, and OpenVINO. For example, after installing sentence-transformers>=4.1.0
, you can run:
from haystack.components.rankers import SentenceTransformersSimilarityRanker
from haystack.utils.device import ComponentDevice
onnx_ranker = SentenceTransformersSimilarityRanker(
model="sentence-transformers/all-MiniLM-L6-v2",
token=None,
device=ComponentDevice.from_str("cpu"),
backend="onnx",
)
onnx_ranker.warm_up()
docs = [Document(content="Berlin"), Document(content="Sarajevo")]
output = onnx_ranker.run(query="City in Germany", documents=docs)
ranked_docs = output["documents"]
β¬οΈ Upgrade Notes
- We’ve added a
py.typed
file to Haystack to enable type information to be used by downstream projects, in line with PEP 561. This means Haystack’s type hints will now be visible to type checkers in projects that depend on it. Haystack is primarily type checked using mypy (not pyright) and, despite our efforts, some type information can be incomplete or unreliable. If you use static type checking in your own project, you may notice some changes: previously, Haystack’s types were effectively treated asAny
, but now actual type information will be available and enforced. We’ll continue improving typing with the next release. - The deprecated
deserialize_tools_inplace
utility function has been removed. Usedeserialize_tools_or_toolset_inplace
instead, importing it as follows:from haystack.tools import deserialize_tools_or_toolset_inplace
.
π New Features
-
Added
run_async
method toToolInvoker
class to allow asynchronous tool invocations. -
Agent can now stream tool result with
run_async
method as well. -
Introduced
serialize_value
anddeserialize_value
utility methods for consistent value (de)serialization across modules. -
Moved the
State
class to theagents.state
module and added serialization and deserialization capabilities. -
Add support for multiple outputs in ConditionalRouter
-
Implement JSON-safe serialization for OpenAI usage data by converting token counts and details (like CompletionTokensDetails and PromptTokensDetails) into plain dictionaries.
-
Added a new
SentenceTransformersSimilarityRanker
component that uses the Sentence Transformers library to rank documents based on their semantic similarity to the query. This component is a replacement for the legacyTransformersSimilarityRanker
component, which may be deprecated in a future release, with removal following after a deprecation period. TheSentenceTransformersSimilarityRanker
also allows choosing different inference backends: PyTorch, ONNX, and OpenVINO. To use theSentenceTransformersSimilarityRanker
, you need to installsentence-transformers>=4.1.0
. -
Add a
streaming_callback
parameter toToolInvoker
to enable streaming of tool results. Note that tool_result is emitted only after the tool execution completes and is not streamed incrementally. -
Update
print_streaming_chunk
to print ToolCall information if it is present in the chunk’s metadata. -
Update
Agent
to forward thestreaming_callback
toToolInvoker
to emit tool results during tool invocation. -
Enhance SuperComponent’s type compatibility check to return the detected common type between two input types.
β‘οΈ Enhancement Notes
-
When using HuggingFaceAPIChatGenerator with streaming, the returned ChatMessage now contains the number of prompt tokens and completion tokens in its meta data. Internally, the HuggingFaceAPIChatGenerator requests an additional streaming chunk that contains usage data. It then processes the usage streaming chunk to add usage meta data to the returned ChatMessage.
-
We now have a Protocol for TextEmbedder. The protocol makes it easier to create custom components or SuperComponents that expect any TextEmbedder as init parameter.
-
We added a
Component
signature validation method that details the mismatches between therun
andrun_async
method signatures. This allows a user to debug custom components easily. -
Enhanced the AnswerBuilder component with two agent-friendly features:
- All generated messages are now stored in the
meta
field of the GeneratedAnswer objects under anall_messages
key, improving traceability and debugging capabilities. - Added a new
last_message_only
parameter that, when set toTrue
, processes only the last message in the replies while still preserving the complete conversation history in metadata. This is particularly useful for agent workflows where only the final response needs to be processed.
- All generated messages are now stored in the
-
A variety of improvements have been made so an Agent component can be directly used in ComponentTool enabling straightforward building of Multi-Agent systems. These improvements include:
- Adding a
last_message
field to the Agent’s output which returns the last generated ChatMessage. - Improving the
_default_output_handler
in theToolInvoker
to try and first serialize the outputs in the tool result before converting it into a string. This is especially relevant for getting a better representation when stringifying dataclasses like ChatMessage.
- Adding a
-
Added type hints to the
component
decorator. This improves support for Pyright/Pylance, enabling IDEs like VSCode to show docstrings for components. -
Updated pipeline execution logic to use a new utility method
_deepcopy_with_exceptions
, which attempts to deep copy an object and safely falls back to the original object if copying fails. Additionally_deepcopy_with_exceptions
skips deep-copying ofComponent
,Tool
, andToolset
instances when used as runtime parameters. This prevents errors and unintended behavior caused by trying to deepcopy objects that contain non-copyable attributes (e.g. Jinja2 templates, clients). Previously, standarddeepcopy
was used on inputs and outputs which occasionally lead to errors since certain Python objects cannot be deepcopied. -
Refactored JSON Schema generation for ComponentTool parameters using Pydanticβs model_json_schema, enabling expanded type support (e.g. Union, Enum, Dict, etc.). We also convert dataclasses to Pydantic models before calling model_json_schema to preserve docstring descriptions of the parameters in the schema. This means dataclasses like ChatMessage, Document, etc. now have correctly defined JSON schemas.
-
The
draw()
andshow()
methods fromPipeline
now have an extra boolean parameter,super_component_expansion
, which, when set toTrue
and if the pipeline contains SuperComponents, the visualisation diagram will show the internal structure of super-components as if they were components part of the pipeline instead of a “black-box” with the name of the SuperComponent. -
Improve the type annotations for
@component
and theComponent
protocol. The type checker can now ensure that a@component
class provides a compatiblerun()
method, whose required return type has been changed fromDict[str, Any]
(invariant) to theMapping[str, Any]
to allowTypedDict
to be used for output types. -
- Updates StreamingChunk construction in ToolInvoker to also stream a chunk with a finish reason. This is useful when using the print_streaming_chunk utility method
- Update the print_streaming_chunk to have better formatting of messages especially when using it with Agent.
- Also updated to work with the current version of the AWS Bedrock integration by working with the dict representation of ChoiceDeltaToolCall
-
ComponentTool now preserves and combines docstrings from underlying pipeline components when wrapping a SuperComponent. When a SuperComponent is used with ComponentTool, two key improvements are made:
- Parameter descriptions are now extracted from the original components in the wrapped pipeline. When a single input is mapped to multiple components, the parameter descriptions are combined from all mapped components, providing comprehensive information about how the parameter is used throughout the pipeline.
- The overall component description is now generated from the descriptions of all underlying components instead of using the generic SuperComponent description. This helps LLMs understand what the component actually does rather than just seeing “Runs the wrapped pipeline with the provided inputs.”
These changes make SuperComponents much more useful with LLM function calling as the LLM will get detailed information about both the component’s purpose and its parameters.
-
Adds
local_files_only
parameter toSentenceTransformersDocumentEmbedder
andSentenceTransformersTextEmbedder
to allow loading models in offline mode. -
The
DocumentRecallEvaluator
was updated. Now, when inMULTI_HIT
mode, the division is over the unique ground truth documents instead of the total number of ground truth documents. We also added checks for emptiness. If there are no retrieved documents or all of them have an empty string as content, we return 0.0 and log a warning. Likewise, if there are no ground truth documents or all of them have an empty string as content, we return 0.0 and log a warning.
β οΈ Deprecation Notes
- Deprecated the
State
class in thedataclasses
module. Users are encouraged to transition to the new version ofState
now located in theagents.state
module. A deprecation warning has been added to guide this migration.
π Security Notes
- Made QUOTE_SPANS_RE regex in SentenceSplitter ReDoS-safe. This prevents potential backtracking on malicious inputs.
π Bug Fixes
- Fixed a potential ReDoS issue in QUOTE_SPANS_RE regex used inside the SentenceSplitter component.
- Add the init parameters timeout and max_retries to the to_dict methods of OpenAITextEmbedder and OpenAIDocumentEmbedder. This ensures that these values are properly serialized when using the to_dict method of these components.
- use coerce_tag_value in LoggingTracer to serialize tag values
- Update the
__deepcopy__
of ComponentTool to gracefully handle NotImplementedError when trying to deepcopy attributes. - Fix an issue where OpenAIChatGenerator and OpenAIGenerator were not properly handling wrapped streaming responses from tools like Weave.
- A bug in the
RecursiveDocumentSplitter
was fixed for the case where asplit_text
is longer than thesplit_length
and recursive chunking is triggered. - Make internal tool conversion in the HuggingFaceAPICompatibleChatGenerator compatible with
huggingface_hub>=0.31.0
. In the huggingface_hub library,arguments
attribute ofChatCompletionInputFunctionDefinition
has been renamed toparameters
. Our implementation is compatible with both the legacy version and the new one. - The
HuggingFaceAPIChatGenerator
now checks the type of thearguments
variable in the tool calls returned by the Hugging Face API. Ifarguments
is a JSON string, it is parsed into a dictionary. Previously, thearguments
type was not checked, which sometimes led to failures later in the tool workflow. - Move deserialize_tools_inplace back to original import path of from haystack.tools.tool import deserialize_tools_inplace.
- To properly preserve the context when AsyncPipeline with components that only have sync run methods we copy the context using contextvars.copy_context() and run the component using
ctx.run(...)
so we can preserve context like the active tracing span. This now means if your component 1) only has a sync run method and 2) it logs something to the tracer then this trace will be properly nested within the parent context. - Fixed a bug in the
LLMMetadataExtractor
that occurred when processingDocument
objects withNone
or empty string content. The component now gracefully handles these cases by marking such documents as failed and providing an appropriate error message in their metadata, without attempting an LLM call. - Fix component_invoker used by ComponentTool to work when a dataclass like ChatMessage is directly passed to
component_tool.invoke(...)
. Previously this would either cause an error or silently skip your input.
π Big thank you to everyone who contributed to this release!
- @Amnah199 @anakin87 @davidsbatista @denisw @dfokina @jantrienes @mdrazak2001 @medsriha @mpangrazzi @sjrl @vblagoje @wsargent @YassinNouh21
Special thanks and congratulations to our first time contributors!
- @wsargent made their first contribution in https://github.com/deepset-ai/haystack/pull/9273
- @YassinNouh21 made their first contribution in https://github.com/deepset-ai/haystack/pull/9303
- @jantrienes made their first contribution in https://github.com/deepset-ai/haystack/pull/9400