Integration: Prior Labs
Tabular data science via MCP - predict missing values, classify, and run regression on tabular datasets using Prior Labs' foundation model.
Table of Contents
Overview
Prior Labs is the team behind TabPFN, a foundation model for tabular data. They expose TabPFN as a cloud service via the Model Context Protocol (MCP), enabling Haystack agents to run tabular machine learning without writing any ML code.
The MCP server exposes five tools (as of March 2026):
| Tool | Description |
|---|---|
upload_dataset |
Upload a CSV file (train or test) and receive a dataset_id for use in subsequent calls |
fit_and_predict_from_dataset |
Train a TabPFN model on an uploaded training file and predict on an uploaded test file |
predict_from_dataset |
Run prediction with a previously trained model on an uploaded test file |
fit_and_predict_inline |
Train and predict on small arrays already present in the conversation |
predict_inline |
Predict with a previously trained model on inline arrays |
Both classification and regression tasks are supported. For small datasets shared directly in the conversation the agent uses the inline tools, while for larger file-based datasets it calls upload_dataset first.
This integration does not require a separate Prior Labs Python package. Instead, it uses the existing mcp-haystack package to connect to the Prior Labs MCP server and discover its tools automatically.
Prerequisites
- Sign up for a Prior Labs account at priorlabs.ai and obtain an API token.
- Set the required environment variable:
export PRIOR_LABS_MCP_TOKEN="your-prior-labs-token"
Installation
pip install mcp-haystack
Usage
Connecting to Prior Labs
Use StreamableHttpServerInfo to point to the Prior Labs MCP server and MCPToolset to connect and fetch all available tools:
from haystack_integrations.tools.mcp import MCPToolset, StreamableHttpServerInfo
from haystack.utils import Secret
server_info = StreamableHttpServerInfo(
url="https://api.priorlabs.ai/mcp/server",
token=Secret.from_env_var("PRIOR_LABS_MCP_TOKEN"),
)
toolset = MCPToolset(server_info=server_info, eager_connect=True)
eager_connect=True causes the toolset to immediately connect and fetch the tool definitions from the server.
Predicting Tabular Data with an Agent
The following example sets up a Haystack Agent that uses Prior Labs tools to predict missing values in a CSV-style dataset:
from haystack_integrations.tools.mcp import MCPToolset, StreamableHttpServerInfo
from haystack_integrations.components.generators.anthropic import AnthropicChatGenerator
from haystack.utils import Secret
from haystack.components.agents import Agent
from haystack.dataclasses import ChatMessage
server_info = StreamableHttpServerInfo(
url="https://api.priorlabs.ai/mcp/server",
token=Secret.from_env_var("PRIOR_LABS_MCP_TOKEN"),
)
toolset = MCPToolset(server_info=server_info, eager_connect=True)
llm = AnthropicChatGenerator(model="claude-opus-4-6", tools=toolset)
agent = Agent(
chat_generator=llm,
tools=toolset,
system_prompt=(
"You are a data scientist working on tabular data. "
"Use provided tools to solve all the given tasks. "
"Always use the tools when you can, and only answer directly if you are sure "
"you have the correct answer. If you are not sure, use the tools to find the answer. "
"Always think step by step."
),
max_agent_steps=10,
)
input_table = """age,income,purchased
22,32000,0
25,45000,0
28,51000,0
31,58000,1
34,62000,1
37,75000,1
40,88000,1
43,95000,1
26,48000,
33,61000,
29,39000,
41,82000,
"""
output = agent.run(messages=[
ChatMessage.from_user(
"I have a dataset with columns 'age', 'income', and 'purchased' "
"(0 = did not purchase, 1 = purchased). Some rows are missing the 'purchased' value. "
"Can you predict the missing values based on the known examples?"
),
ChatMessage.from_user(input_table),
])
print(output["last_message"].text)
Because the dataset is small and provided inline, the agent calls fit_and_predict_inline with the known rows as training data and the incomplete rows as test data, using task_type="classification". The result is the predicted purchased values (0 or 1) for each incomplete row.
License
The mcp-haystack package is distributed under the
Apache-2.0 license. Use of the Prior Labs API is subject to
Prior Labs’ terms of service.
