Invoking APIs with OpenAPITool

Many APIs available on the Web provide an OpenAPI specification that describes their structure and syntax.

OpenAPITool is an experimental Haystack component that allows you to call an API using payloads generated from human instructions.

Here’s a brief overview of how it works:

  • At initialization, it loads the OpenAPI specification from a URL or a file.
  • At runtime:
    • Converts human instructions into a suitable API payload using a Chat Language Model (LLM).
    • Invokes the API.
    • Returns the API response, wrapped in a Chat Message.
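The runtime steps above can be sketched in plain Python. Everything here is an illustrative stub (the function and collaborator names are invented for this sketch, not the actual Haystack implementation):

```python
import json

def run_openapi_tool(spec, llm, http_call, instruction):
    """Illustrative stub of OpenAPITool's runtime loop (not the real implementation)."""
    # 1. The LLM turns the human instruction into a payload that fits the spec
    payload = llm(instruction, spec)
    # 2. The API is invoked with the generated payload
    response = http_call(spec["server"], payload)
    # 3. The raw response is wrapped in a chat-message-like structure
    return {"role": "assistant", "content": response}

# Stub collaborators, so the flow can be followed end to end
spec = {"server": "https://api.example.com/forecast"}
fake_llm = lambda instruction, spec: {"city": "San Francisco"}
fake_http = lambda url, payload: json.dumps({"temperature_c": 18})

message = run_openapi_tool(spec, fake_llm, fake_http, "Weather in San Francisco, US")
print(message["content"])
```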

Let’s see this component in action…

Setup

! pip install haystack-ai jsonref

In this notebook, we will be using some APIs that require an API key. Let’s set them as environment variables.

import os

os.environ["OPENAI_API_KEY"]="..."

# free API key: https://www.firecrawl.dev/
os.environ["FIRECRAWL_API_KEY"]="..."

# free API key: https://serper.dev/
os.environ["SERPERDEV_API_KEY"]="..."

Call an API without credentials

In the first example, we use Open-Meteo, a Free Weather API that does not require authentication.

We use OPENAI as the LLM provider. Other supported providers are ANTHROPIC and COHERE.

from haystack.dataclasses import ChatMessage
from haystack_experimental.components.tools.openapi import OpenAPITool, LLMProvider
from haystack.utils import Secret

tool = OpenAPITool(generator_api=LLMProvider.OPENAI,
                   spec="https://raw.githubusercontent.com/open-meteo/open-meteo/main/openapi.yml")

tool.run(messages=[ChatMessage.from_user("Weather in San Francisco, US")])
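Under the hood, the LLM turns the instruction into a request against the spec, roughly like the following (a hand-written sketch; the parameter names follow Open-Meteo's public forecast endpoint, and the coordinates are illustrative):

```python
import urllib.parse

# Parameters the LLM might derive from "Weather in San Francisco, US"
params = {"latitude": 37.77, "longitude": -122.42, "current_weather": "true"}
url = "https://api.open-meteo.com/v1/forecast?" + urllib.parse.urlencode(params)
print(url)
```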

Incorporate OpenAPITool in a Pipeline

Next, let’s create a simple Pipeline where the service response is translated into a human-understandable format using the Language Model.

We use a ChatPromptBuilder to create a list of Chat Messages for the LM.

from haystack import Pipeline
from haystack.components.builders import ChatPromptBuilder
from haystack.components.generators.chat import OpenAIChatGenerator

messages = [ChatMessage.from_user("{{user_message}}"), ChatMessage.from_user("{{service_response}}")]
builder = ChatPromptBuilder(template=messages)

pipe = Pipeline()
pipe.add_component("meteo", tool)
pipe.add_component("builder", builder)
pipe.add_component("llm", OpenAIChatGenerator(generation_kwargs={"max_tokens": 1024}))

pipe.connect("meteo", "builder.service_response")
pipe.connect("builder", "llm.messages")
result = pipe.run(data={"meteo": {"messages": [ChatMessage.from_user("weather in San Francisco, US")]},
                        "builder": {"user_message": [ChatMessage.from_user("Explain the weather in San Francisco in a human understandable way")]}})
print(result["llm"]["replies"][0].content)

Use an API with credentials in a Pipeline

In this example, we use Firecrawl: a project that scrapes web pages (and websites) and converts them into clean text. Firecrawl has an API that requires an API key.

In the following Pipeline, we use Firecrawl to scrape a news article, which is then summarized using a Language Model.
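For orientation, the request that OpenAPITool assembles for Firecrawl looks roughly like this (a sketch only; the field names and header are assumptions based on the spec used below, and the key placeholder is illustrative):

```python
import json

# Illustrative request body and headers for a Firecrawl scrape call
body = json.dumps({"url": "https://lite.cnn.com/2024/07/18/style/rome-ancient-papal-palace/index.html"})
headers = {
    "Authorization": "Bearer <FIRECRAWL_API_KEY>",  # value comes from the env var set earlier
    "Content-Type": "application/json",
}
print(body)
```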

messages = [ChatMessage.from_user("{{user_message}}"), ChatMessage.from_user("{{service_response}}")]
builder = ChatPromptBuilder(template=messages)


pipe = Pipeline()
pipe.add_component("firecrawl", OpenAPITool(generator_api=LLMProvider.OPENAI,
                                            spec="https://raw.githubusercontent.com/mendableai/firecrawl/main/apps/api/openapi.json",
                                            credentials=Secret.from_env_var("FIRECRAWL_API_KEY")))
pipe.add_component("builder", builder)
pipe.add_component("llm", OpenAIChatGenerator(generation_kwargs={"max_tokens": 1024}))

pipe.connect("firecrawl", "builder.service_response")
pipe.connect("builder", "llm.messages")
user_prompt = "Given the article below, list the most important facts in a bulleted list. Do not include repetitions. Max 5 points."

result = pipe.run(data={"firecrawl": {"messages": [ChatMessage.from_user("Scrape https://lite.cnn.com/2024/07/18/style/rome-ancient-papal-palace/index.html")]},
                        "builder": {"user_message": [ChatMessage.from_user(user_prompt)]}})
print(result["llm"]["replies"][0].content)

Create a Pipeline with multiple OpenAPITool components

In this example, we show a Pipeline where multiple alternative APIs can be invoked depending on the user query. In particular, the query can trigger either a Google search (via Serper.dev) or the scraping of a single page with Firecrawl.

⚠️ The approach shown is just one way to achieve this using conditional routing. We are currently experimenting with tool support in Haystack, and there may be simpler ways to achieve the same result in the future.

import json

decision_prompt_template = """
You are a virtual assistant, equipped with the following tools:

- `{"tool_name": "search_web", "tool_description": "Access to Google search, use this tool whenever information on recent events is needed"}`
- `{"tool_name": "scrape_page", "tool_description": "Use this tool to scrape and crawl web pages"}`

Select the most appropriate tool to resolve the user's query. Respond in JSON format, specifying the user request and the chosen tool for the response.
If you can't match the user query to any of the listed tools, respond with `none`.


######
Here are some examples:

```json
{
  "query": "Why did Elon Musk recently sue OpenAI?",
  "response": "search_web"
}
{
  "query": "What is on the front-page of hackernews today?",
  "response": "scrape_page"
}
{
  "query": "Tell me about Berlin",
  "response": "none"
}
```

Choose the best tool (or none) for each user request, considering the current context of the conversation specified above.

{"query": {{query}}, "response": } """

def get_tool_name(replies):
    try:
        tool_name = json.loads(replies)["response"]
        return tool_name
    except Exception:
        return "error"

routes = [
    {
        "condition": "{{replies[0] | get_tool_name == 'search_web'}}",
        "output": "{{query}}",
        "output_name": "search_web",
        "output_type": str,
    },
    {
        "condition": "{{replies[0] | get_tool_name == 'scrape_page'}}",
        "output": "{{query}}",
        "output_name": "scrape_page",
        "output_type": str,
    },
    {
        "condition": "{{replies[0] | get_tool_name == 'none'}}",
        "output": "{{replies[0]}}",
        "output_name": "no_tools",
        "output_type": str,
    },
    {
        "condition": "{{replies[0] | get_tool_name == 'error'}}",
        "output": "{{replies[0]}}",
        "output_name": "error",
        "output_type": str,
    },
]
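As a quick sanity check, the combination of `get_tool_name` and the routes above behaves like this (a standalone re-implementation for illustration; it mimics ConditionalRouter picking the first matching condition and is not the router's actual code):

```python
import json

def get_tool_name(replies):
    # Same logic as the custom filter defined above
    try:
        return json.loads(replies)["response"]
    except Exception:
        return "error"

def pick_route(reply, query):
    """Mimic the four routes: return (output_name, output_value)."""
    tool = get_tool_name(reply)
    if tool in ("search_web", "scrape_page"):
        return tool, query       # forward the original query to the chosen tool
    if tool == "none":
        return "no_tools", reply  # no tool matched; pass the LLM reply through
    return "error", reply         # reply was not valid JSON

print(pick_route('{"query": "q", "response": "search_web"}', "Who won?"))
print(pick_route("not valid json", "Who won?"))
```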



from haystack.components.builders import PromptBuilder, ChatPromptBuilder
from haystack.components.routers import ConditionalRouter
from haystack.components.generators import OpenAIGenerator


messages = [ChatMessage.from_user("{{query}}")]

search_web_chat_builder = ChatPromptBuilder(template=messages)
scrape_page_chat_builder = ChatPromptBuilder(template=messages)

search_web_tool = OpenAPITool(generator_api=LLMProvider.OPENAI,
                   spec="https://bit.ly/serper_dev_spec_yaml",
                   credentials=Secret.from_env_var("SERPERDEV_API_KEY"))

scrape_page_tool = OpenAPITool(generator_api=LLMProvider.OPENAI,
                   spec="https://raw.githubusercontent.com/mendableai/firecrawl/main/apps/api/openapi.json",
                   credentials=Secret.from_env_var("FIRECRAWL_API_KEY"))

pipe = Pipeline()
pipe.add_component("prompt_builder", PromptBuilder(template=decision_prompt_template))
pipe.add_component("llm", OpenAIGenerator())
pipe.add_component("router", ConditionalRouter(routes, custom_filters={"get_tool_name": get_tool_name}))
pipe.add_component("search_web_chat_builder", search_web_chat_builder)
pipe.add_component("scrape_page_chat_builder", scrape_page_chat_builder)
pipe.add_component("search_web_tool", search_web_tool)
pipe.add_component("scrape_page_tool", scrape_page_tool)

pipe.connect("prompt_builder", "llm")
pipe.connect("llm.replies", "router.replies")
pipe.connect("router.search_web", "search_web_chat_builder")
pipe.connect("router.scrape_page", "scrape_page_chat_builder")
pipe.connect("search_web_chat_builder", "search_web_tool")
pipe.connect("scrape_page_chat_builder", "scrape_page_tool")
query = "Who won the UEFA European Football Championship?"

pipe.run({"prompt_builder": {"query": query}, "router": {"query": query}})
query = "What is on the front-page of BBC today?"

pipe.run({"prompt_builder": {"query": query}, "router": {"query": query}})