⭐️ Highlights
Passing State to Tools and Components
Tools and components can now access the live agent `State` directly, with no extra wiring needed. Just add a `state: State` parameter to your tool's function signature or your component's `run` method, and `ToolInvoker` automatically injects the current state at runtime. The `State` parameter is hidden from the LLM-facing schema, so the model is never asked to supply it.
For function-based tools created with @tool:
```python
from haystack.components.agents import State
from haystack.tools import tool


@tool
def my_tool(query: str, state: State) -> str:
    """Search using context from agent state."""
    history = state.get("history")
    ...
```
For component-based tools created with ComponentTool:
```python
from haystack import component
from haystack.components.agents import State
from haystack.tools import ComponentTool


@component
class MyComponent:
    @component.output_types(result=str)
    def run(self, query: str, state: State) -> dict:
        history = state.get("history")
        ...


tool = ComponentTool(component=MyComponent())
```
This is an alternative to the existing `inputs_from_state` and `outputs_to_state` options, which map individual state keys declaratively. Injecting the full `State` object is more flexible when a tool needs to read from or write to multiple keys.
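To illustrate the read-and-write pattern without running an agent, here is a minimal dict-backed stand-in for `State` (the real class lives in `haystack.components.agents`; the `summarize_tool` function below is a hypothetical example, not part of Haystack):

```python
# Minimal stand-in for Haystack's State, for illustration only;
# the real class lives in haystack.components.agents.
class State:
    def __init__(self, data=None):
        self._data = dict(data or {})

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value


def summarize_tool(query: str, state: State) -> str:
    """A tool that reads one state key and writes another in a single call."""
    history = state.get("history", [])
    summary = f"{query} (seen {len(history)} prior turns)"
    # Write back directly, without a declarative outputs_to_state mapping.
    state.set("last_summary", summary)
    return summary


state = State({"history": ["hi", "hello"]})
result = summarize_tool("weather in Rome", state)
# result == "weather in Rome (seen 2 prior turns)"
```

With the declarative options, reading `history` and writing `last_summary` would require two separate mappings; with full-`State` injection, one parameter covers both.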
⚠️ Upgrade Notes

- As part of the migration from `requests` to `httpx`, `request_with_retry` and `async_request_with_retry` (in `haystack.utils.requests_utils`) no longer raise `requests.exceptions.RequestException` on failure. They now raise `httpx.HTTPError` instead. This also affects `HuggingFaceTEIRanker`, which relies on these utilities. Update any code catching `requests.exceptions.RequestException` to catch `httpx.HTTPError`.
- The `LLM` component now requires `user_prompt` to be provided at initialization, and it must contain at least one Jinja2 template variable (e.g. `{{ variable_name }}`). `required_variables` now defaults to `"*"` (all variables in `user_prompt` are required), and passing an empty list raises a `ValueError`.

  Before:

  ```python
  llm = LLM(chat_generator=OpenAIChatGenerator(), system_prompt="You are helpful.")
  ```

  After:

  ```python
  llm = LLM(
      chat_generator=OpenAIChatGenerator(),
      system_prompt="You are helpful.",
      user_prompt='{% message role="user" %}{{ query }}{% endmessage %}',
  )
  ```
- `Agent.run()` and `Agent.run_async()` now require `messages` as an explicit argument (it is no longer optional). If you were relying on the default `None` value, pass an empty list instead: `agent.run(messages=[], ...)`.
⚡️ Enhancement Notes
- Clarified in the Markdown-producing converter documentation that `DocumentCleaner` with its default settings can flatten Markdown output. Updated the example pipelines for `PaddleOCRVLDocumentConverter`, `MistralOCRDocumentConverter`, `AzureDocumentIntelligenceConverter`, and `MarkItDownConverter` to avoid routing Markdown content through the default cleaner configuration.
- Made `_create_agent_snapshot` robust against serialization errors. If serializing agent component inputs fails, a warning is logged and an empty dictionary is used as a fallback, preventing the serialization error from masking the real pipeline runtime error.
- Standardized HTTP request handling in Haystack by adopting `httpx` for both synchronous and asynchronous requests, replacing `requests`. Error reporting for failed requests has also been improved: exceptions now include additional details alongside the reason field.
- Added a `run_async` method to `LLMMetadataExtractor`. `ChatGenerator` requests now run concurrently using the existing `max_workers` init parameter.
- `MarkdownHeaderSplitter` now accepts a `header_split_levels` parameter (a list of integers from 1 to 6; by default all levels are used) to control which header depths create split boundaries. For example, `header_split_levels=[1, 2]` splits only on `#` and `##` headers, merging content under deeper headers into the preceding chunk.
- `MarkdownHeaderSplitter` now ignores `#` lines that appear inside fenced code blocks (triple-backtick or triple-tilde), preventing Python comments and other hash-prefixed lines in code from being misidentified as Markdown headers.
- Expanded the `PaddleOCRVLDocumentConverter` documentation with more detailed guidance on advanced parameters, common usage scenarios, and a more realistic configuration example for layout-heavy documents.
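The fence-aware, level-filtered header detection described for `MarkdownHeaderSplitter` can be sketched in a few lines of plain Python (a simplified illustration of the behavior, not the component's actual implementation; `find_split_headers` is a hypothetical helper):

```python
import re

# ATX header: 1-6 leading '#' followed by whitespace and content.
HEADER_RE = re.compile(r"^(#{1,6})\s+\S")
# Fence opener/closer: triple-backtick or triple-tilde.
FENCE_RE = re.compile(r"^(```|~~~)")


def find_split_headers(text, header_split_levels=(1, 2, 3, 4, 5, 6)):
    """Return (line_number, line) for headers that should create split boundaries."""
    headers = []
    in_fence = False
    for lineno, line in enumerate(text.splitlines()):
        if FENCE_RE.match(line.strip()):
            in_fence = not in_fence  # toggle fenced-code state
            continue
        if in_fence:
            continue  # '#' inside code is a comment, not a header
        m = HEADER_RE.match(line)
        if m and len(m.group(1)) in header_split_levels:
            headers.append((lineno, line.rstrip()))
    return headers


doc = "# A\n```python\n# not a header\n```\n## B\n### C\n"
find_split_headers(doc, header_split_levels=[1, 2])
# -> [(0, "# A"), (4, "## B")]
```

Note how the `# not a header` comment inside the fence is skipped, and `### C` is excluded because level 3 is not in `header_split_levels`.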
🐛 Bug Fixes
- Fixed `ToolInvoker._merge_tool_outputs` silently appending `None` to list-typed state when a tool's `outputs_to_state` source key is absent from the tool result. This is a common scenario with `PipelineTool` wrapping a pipeline that has conditional branches, where not all outputs are always produced even if they are defined in `outputs_to_state`. The mapping is now skipped entirely when the source key is not present in the result dict.
- Fixed a bug in `MarkdownHeaderSplitter` where a child header lost its direct parent header in the metadata when the parent header had its own content chunk before the first child header. Previously, the following document:

  ```markdown
  # header 1
  intro text
  ## header 1.1
  text 1
  ```

  would produce `parent_headers: []` for `header 1.1` instead of the expected `['header 1']`. All parent header metadata is now correctly preserved in split chunks.
- Reverted the change that made `Agent` `messages` optional, as it caused issues with pipeline execution. As a consequence, the `LLM` component now defaults to an empty messages list unless one is provided at runtime.
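The corrected `outputs_to_state` merge behavior can be illustrated with a small stand-alone sketch (a hypothetical `merge_tool_outputs` helper, not `ToolInvoker`'s actual code; it assumes the `{"state_key": {"source": "result_key"}}` mapping shape):

```python
def merge_tool_outputs(state, tool_result, outputs_to_state):
    """Merge tool outputs into state, skipping mappings whose source key is absent."""
    for state_key, spec in outputs_to_state.items():
        source = spec.get("source")
        if source not in tool_result:
            # Previously a missing source appended None to list-typed state;
            # the fix is to skip the mapping entirely.
            continue
        value = tool_result[source]
        if isinstance(state.get(state_key), list):
            state[state_key].append(value)
        else:
            state[state_key] = value
    return state


state = {"documents": []}
# A conditional branch did not run, so "docs" is missing from the result:
merge_tool_outputs(state, {}, {"documents": {"source": "docs"}})
# state["documents"] stays [] instead of becoming [None]
```

When the branch does produce `docs`, the value is appended as before; only the missing-key case changes.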
Big thank you to everyone who contributed to this release!
@Aftabbs, @Amanbig, @anakin87, @bilgeyucel, @bogdankostic, @davidsbatista, @dina-deifallah, @jimmyzhuu, @julian-risch, @kacperlukawski, @maxdswain, @MechaCritter, @ritikraj2425, @sarahkiener, @sjrl, @soheinze, @srini047, @tholor
