RemoteWhisperTranscriber

Use RemoteWhisperTranscriber to transcribe audio files using OpenAI's Whisper model.

Name: RemoteWhisperTranscriber
Folder path: /audio/
Most common position in a pipeline: As the first component in an indexing pipeline
Mandatory input variables: "sources": List of paths or binary streams that you want to transcribe
Output variables: "documents": List of Documents

Overview

RemoteWhisperTranscriber needs an OpenAI API key to work. By default, it reads the key from the OPENAI_API_KEY environment variable. Otherwise, you can pass an API key at initialization with api_key:

audio = RemoteWhisperTranscriber(api_key=Secret.from_token("<your-api-key>"))

Additionally, the component requires the following parameters to work:

  • model specifies the Whisper model.
  • api_base_url specifies the OpenAI base URL and defaults to "https://api.openai.com/v1".

See other optional parameters in our API documentation.

See the Whisper API documentation and the official Whisper GitHub repo for the supported audio formats and languages.
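
As a minimal sketch of the initialization options described above, the function below spells out api_key, model, and api_base_url explicitly. The function name make_transcriber is hypothetical, and the sketch assumes that Secret.from_env_var from haystack.utils is available in your Haystack version and that "whisper-1" is the model you want to call.

```python
def make_transcriber():
    """Build a RemoteWhisperTranscriber with its main parameters spelled out."""
    # Imported inside the function so this sketch can be loaded
    # even where Haystack is not installed.
    from haystack.components.audio import RemoteWhisperTranscriber
    from haystack.utils import Secret

    return RemoteWhisperTranscriber(
        # Read the key from the environment instead of hard-coding it.
        api_key=Secret.from_env_var("OPENAI_API_KEY"),
        # Assumption: "whisper-1" is the Whisper model name on the OpenAI API.
        model="whisper-1",
        # Same value as the documented default; change it only for a
        # different OpenAI-compatible endpoint.
        api_base_url="https://api.openai.com/v1",
    )
```

Calling make_transcriber() requires Haystack to be installed and OPENAI_API_KEY to be set.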

Usage

On its own

Here's an example of how to use RemoteWhisperTranscriber on its own:

from haystack.components.audio import RemoteWhisperTranscriber
from haystack.utils import Secret

# "whisper-1" is the Whisper model name used by the OpenAI API.
whisper = RemoteWhisperTranscriber(api_key=Secret.from_token("<your-api-key>"), model="whisper-1")
transcription = whisper.run(sources=["path/to/audio/file"])
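
The "documents" output is a list of Documents. As a rough sketch of how the result might be consumed, the helper below collects their text. The name transcript_texts is hypothetical and not part of Haystack, and the sketch assumes each Document exposes the transcript in its content attribute.

```python
def transcript_texts(result):
    """Return the non-empty transcript strings from a transcriber result.

    `result` is the dictionary returned by `run()`, e.g. {"documents": [...]}.
    Assumption: each Document stores its transcript text in `.content`.
    """
    return [doc.content for doc in result["documents"] if doc.content]
```

With the example above, transcript_texts(transcription) should return the transcribed text of each source as a list of strings.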

In a Pipeline

This example shows an indexing Pipeline that takes audio files, transcribes them, and then stores the text as documents in a document store. The path "." must be a directory that contains only audio files.

from pathlib import Path
from haystack import Pipeline
from haystack.components.audio import RemoteWhisperTranscriber
from haystack.components.preprocessors import DocumentSplitter, DocumentCleaner
from haystack.components.writers import DocumentWriter
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.utils import Secret

document_store = InMemoryDocumentStore()
p = Pipeline()
p.add_component(instance=RemoteWhisperTranscriber(api_key=Secret.from_token("<your-api-key>")), name="transcriber")
p.add_component(instance=DocumentCleaner(), name="cleaner")
p.add_component(
    instance=DocumentSplitter(split_by="sentence", split_length=10), name="splitter"
)
p.add_component(instance=DocumentWriter(document_store=document_store), name="writer")

p.connect("transcriber.documents", "cleaner.documents")
p.connect("cleaner.documents", "splitter.documents")
p.connect("splitter.documents", "writer.documents")

p.run({"transcriber": {"sources": list(Path(".").iterdir())}})
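
The pipeline above passes every entry of the directory to the transcriber, which is why the directory must contain only audio files. One way to relax that requirement is to filter by file extension first. This is a sketch: audio_files and AUDIO_SUFFIXES are hypothetical names, and the extension list is an assumption that should be checked against the Whisper API documentation.

```python
from pathlib import Path

# Assumed set of audio extensions; verify it against the formats
# listed in the Whisper API documentation.
AUDIO_SUFFIXES = {".mp3", ".mp4", ".m4a", ".mpeg", ".mpga", ".wav", ".webm"}


def audio_files(directory="."):
    """Return the regular files in `directory` whose extension looks like audio."""
    return sorted(
        path
        for path in Path(directory).iterdir()
        # Skip subdirectories and compare extensions case-insensitively.
        if path.is_file() and path.suffix.lower() in AUDIO_SUFFIXES
    )
```

The result can then replace the plain directory listing, for example p.run({"transcriber": {"sources": audio_files(".")}}).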

Related Links

See the parameter details in our API reference.