Maintained by deepset

Integration: Cerebras

Use LLMs served by Cerebras API

Authors
deepset

Table of Contents

Overview

Cerebras is the go-to platform for fast and effortless AI training and inference.

Usage

Cerebras API is OpenAI compatible, making it easy to use in Haystack via OpenAI Generators.

Using Generator

Here’s an example of using llama3.1-8b served via Cerebras to perform question answering on a web page. You need to set the environment variable CEREBRAS_API_KEY and choose a compatible model.

from haystack import Pipeline
from haystack.utils import Secret
from haystack.components.fetchers import LinkContentFetcher
from haystack.components.converters import HTMLToDocument
from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator

fetcher = LinkContentFetcher()
converter = HTMLToDocument()
prompt_template = """
According to the contents of this website:
{% for document in documents %}
  {{document.content}}
{% endfor %}
Answer the given question: {{query}}
Answer:
"""
prompt_builder = PromptBuilder(template=prompt_template)
llm = OpenAIGenerator(
    api_key=Secret.from_env_var("CEREBRAS_API_KEY"),
    api_base_url="https://api.cerebras.ai/v1",
    model="llama3.1-8b"
)
pipeline = Pipeline()
pipeline.add_component("fetcher", fetcher)
pipeline.add_component("converter", converter)
pipeline.add_component("prompt", prompt_builder)
pipeline.add_component("llm", llm)

pipeline.connect("fetcher.streams", "converter.sources")
pipeline.connect("converter.documents", "prompt.documents")
pipeline.connect("prompt.prompt", "llm.prompt")

result = pipeline.run({"fetcher": {"urls": ["https://cerebras.ai/inference"]},
              "prompt": {"query": "Why should I use Cerebras for serving LLMs?"}})

print(result["llm"]["replies"][0])

Using ChatGenerator

See an example of engaging in a multi-turn conversation with llama3.1-8b. You need to set the environment variable CEREBRAS_API_KEY and choose a compatible model.

from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.utils import Secret

generator = OpenAIChatGenerator(
    api_key=Secret.from_env_var("CEREBRAS_API_KEY"),
    api_base_url="https://api.cerebras.ai/v1",
    model="llama3.1-8b",
    generation_kwargs = {"max_tokens": 512}
)

messages = []

while True:
    msg = input("Enter your message or Q to exit\n🧑 ")
    if msg=="Q":
        break
    messages.append(ChatMessage.from_user(msg))
    response = generator.run(messages=messages)
    assistant_resp = response['replies'][0]
    print("🤖 "+assistant_resp.content)
    messages.append(assistant_resp)