Tutorial: Serializing LLM Pipelines
Last Updated: December 19, 2024
- Level: Beginner
- Time to complete: 10 minutes
- Components Used: HuggingFaceLocalChatGenerator, ChatPromptBuilder
- Prerequisites: None
- Goal: After completing this tutorial, you’ll understand how to serialize a pipeline from Python code to YAML and deserialize it back.
This tutorial uses Haystack 2.0. To learn more, read the Haystack 2.0 announcement or visit the Haystack 2.0 Documentation.
Overview
📚 Useful Documentation: Serialization
Serialization means converting a pipeline to a format that you can save to disk and load later. A serialized pipeline can also be stored in a database or sent over a network.
Although it’s possible to serialize into other formats too, Haystack supports YAML out of the box, which makes it easy for humans to edit a pipeline without going back and forth with Python code. In this tutorial, we will create a very simple pipeline in Python code, serialize it into YAML, make changes to it, and deserialize it back into a Haystack Pipeline.
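To make the idea of a round trip concrete, here is a minimal sketch that uses only Python's standard library. It assumes a plain dictionary as a stand-in for a pipeline definition (the keys and the `example.PromptBuilder` type string are hypothetical, not Haystack's real schema), and it uses JSON rather than YAML so that it runs without extra dependencies.

```python
import json

# Hypothetical stand-in for a pipeline definition; not Haystack's actual schema.
pipeline_description = {
    "components": {
        "builder": {
            "type": "example.PromptBuilder",
            "init_parameters": {"template": "Summarize: {{ topic }}"},
        },
    },
    "connections": [],
}

# Serialize: turn the in-memory object into text that can be stored or sent.
serialized = json.dumps(pipeline_description, indent=2)

# Deserialize: rebuild an equivalent object from that text.
restored = json.loads(serialized)

# The round trip should give back an object equal to the one we started with.
print(restored == pipeline_description)
```

The rest of this tutorial follows the same serialize, edit, deserialize pattern, with a real Haystack pipeline and YAML as the text format.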
Preparing the Colab Environment
Installing Haystack
Install Haystack 2.0 with pip:
%%bash
pip install haystack-ai
Enabling Telemetry
Knowing you’re using this tutorial helps us decide where to invest our efforts to build a better product, but you can always opt out by commenting out the following line. See Telemetry for more details.
from haystack.telemetry import tutorial_running
tutorial_running(29)
Creating a Simple Pipeline
First, let’s create a very simple pipeline that expects a topic from the user and generates a summary about the topic with Qwen/Qwen2.5-1.5B-Instruct. Feel free to modify the pipeline as you wish. Note that this pipeline uses a relatively small, open-source LLM that is downloaded from Hugging Face and runs locally.
from haystack import Pipeline
from haystack.components.builders import ChatPromptBuilder
from haystack.dataclasses import ChatMessage
from haystack.components.generators.chat import HuggingFaceLocalChatGenerator
template = [
    ChatMessage.from_user(
        """
Please create a summary about the following topic:
{{ topic }}
"""
    )
]
builder = ChatPromptBuilder(template=template)
llm = HuggingFaceLocalChatGenerator(model="Qwen/Qwen2.5-1.5B-Instruct", generation_kwargs={"max_new_tokens": 150})
pipeline = Pipeline()
pipeline.add_component(name="builder", instance=builder)
pipeline.add_component(name="llm", instance=llm)
pipeline.connect("builder.prompt", "llm.messages")
topic = "Climate change"
result = pipeline.run(data={"builder": {"topic": topic}})
print(result["llm"]["replies"][0].text)
Serialize the Pipeline to YAML
Out of the box, Haystack supports YAML. Use dumps() to convert the pipeline to YAML:
yaml_pipeline = pipeline.dumps()
print(yaml_pipeline)
You should get a pipeline YAML that looks like the following:
components:
  builder:
    init_parameters:
      required_variables: null
      template:
      - content: '

          Please create a summary about the following topic:

          {{ topic }}

          '
        meta: {}
        name: null
        role: user
      variables: null
    type: haystack.components.builders.chat_prompt_builder.ChatPromptBuilder
  llm:
    init_parameters:
      generation_kwargs:
        max_new_tokens: 150
        stop_sequences: []
      huggingface_pipeline_kwargs:
        device: cpu
        model: Qwen/Qwen2.5-1.5B-Instruct
        task: text-generation
      streaming_callback: null
      token:
        env_vars:
        - HF_API_TOKEN
        - HF_TOKEN
        strict: false
        type: env_var
    type: haystack.components.generators.chat.hugging_face_local.HuggingFaceLocalChatGenerator
connections:
- receiver: llm.messages
  sender: builder.prompt
max_runs_per_component: 100
metadata: {}
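Since dumps() returns an ordinary string, saving the serialized pipeline is a matter of writing text to a file. The sketch below shows one way to do that with the standard library. The `yaml_text` value is a shortened placeholder for the real output, and the helper names `save_yaml` and `read_yaml` are made up for this example.

```python
import tempfile
from pathlib import Path

# Placeholder standing in for the string returned by pipeline.dumps().
yaml_text = "components: {}\nconnections: []\nmax_runs_per_component: 100\nmetadata: {}\n"


def save_yaml(text: str, path: Path) -> None:
    """Write the serialized pipeline to disk as UTF-8 text."""
    path.write_text(text, encoding="utf-8")


def read_yaml(path: Path) -> str:
    """Read the serialized pipeline back from disk."""
    return path.read_text(encoding="utf-8")


# A temporary directory keeps the example from leaving files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    target = Path(tmp_dir) / "pipeline.yaml"
    save_yaml(yaml_text, target)
    loaded_text = read_yaml(target)

# The text read back should match what was written.
print(loaded_text == yaml_text)
```

In a real project you would pass the actual dumps() output and a permanent path. Haystack's Pipeline also provides dump() and load() methods that work with open file objects, which may be more convenient than handling the string yourself.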
Editing a Pipeline in YAML
Let’s see how we can make changes to a serialized pipeline. For example, below, let’s modify the ChatPromptBuilder’s template to translate a provided sentence to French:
yaml_pipeline = """
components:
  builder:
    init_parameters:
      template:
      - content: 'Please translate the following to French: \n{{ sentence }}\n'
        meta: {}
        name: null
        role: user
      variables: null
    type: haystack.components.builders.chat_prompt_builder.ChatPromptBuilder
  llm:
    init_parameters:
      generation_kwargs:
        max_new_tokens: 150
        stop_sequences: []
      huggingface_pipeline_kwargs:
        device: cpu
        model: Qwen/Qwen2.5-1.5B-Instruct
        task: text-generation
      streaming_callback: null
      chat_template: "{% for message in messages %}{% if message['role'] == 'user' %}{{ ' ' }}{% endif %}{{ message['content'] }}{% if not loop.last %}{{ ' ' }}{% endif %}{% endfor %}{{ eos_token }}"
      token:
        env_vars:
        - HF_API_TOKEN
        - HF_TOKEN
        strict: false
        type: env_var
    type: haystack.components.generators.chat.hugging_face_local.HuggingFaceLocalChatGenerator
connections:
- receiver: llm.messages
  sender: builder.prompt
max_runs_per_component: 100
metadata: {}
"""
Deserializing a YAML Pipeline back to Python
You can deserialize a pipeline by calling loads(). Below, we’re deserializing our edited yaml_pipeline:
from haystack import Pipeline
new_pipeline = Pipeline.loads(yaml_pipeline)
Now we can run the new pipeline we defined in YAML. We had changed it so that the ChatPromptBuilder expects a sentence and translates the sentence to French:
new_pipeline.run(data={"builder": {"sentence": "I love capybaras"}})
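When you edit a template in YAML, the input names passed to run() have to match the placeholders in the new template (here, sentence instead of topic). As a rough aid, the sketch below pulls placeholder names out of a template string with a regular expression. It is a simplification that only recognizes plain `{{ name }}` placeholders, not full Jinja syntax, and the helper name `template_variables` is made up for this example.

```python
import re

# Simplified pattern: matches {{ name }} placeholders only, not full Jinja syntax.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def template_variables(template: str) -> list[str]:
    """Return placeholder names in order of first appearance, without duplicates."""
    seen: list[str] = []
    for name in PLACEHOLDER.findall(template):
        if name not in seen:
            seen.append(name)
    return seen


old_template = "Please create a summary about the following topic:\n{{ topic }}\n"
new_template = "Please translate the following to French: \n{{ sentence }}\n"

print(template_variables(old_template))  # ['topic']
print(template_variables(new_template))  # ['sentence']
```

A check like this can be run on an edited template before deserializing, to catch a mismatch between the template and the data you plan to pass to the builder.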
What’s next
🎉 Congratulations! You’ve serialized a pipeline into YAML, edited it, and run it again!
To stay up to date on the latest Haystack developments, you can sign up for our newsletter. Thanks for reading!