Maintained by deepset

Integration: Whisper

Transcribe audio files with OpenAI's Whisper, locally or via the OpenAI API

Authors
deepset

Overview

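The whisper-haystack integration provides two components that transcribe audio files into Haystack documents using OpenAI’s Whisper model: RemoteWhisperTranscriber, which calls the OpenAI Whisper API, and LocalWhisperTranscriber, which runs the Whisper model on your machine.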

Both components are typically used as the first step of an indexing pipeline. They were previously part of Haystack core and now live in the whisper-haystack integration package, maintained in haystack-core-integrations.

Installation

pip install whisper-haystack

This is all you need for RemoteWhisperTranscriber, which uses the OpenAI Whisper API (set the OPENAI_API_KEY environment variable).

To use LocalWhisperTranscriber, also install the optional openai-whisper dependency and make sure ffmpeg is available on your system:

pip install -U openai-whisper
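
Before creating a LocalWhisperTranscriber, it can help to confirm that both prerequisites are present. The following is a minimal sketch using only the Python standard library. The helper name check_local_whisper_prerequisites is hypothetical and not part of whisper-haystack, and the check assumes that openai-whisper is importable under the module name whisper.

```python
import importlib.util
import shutil


def check_local_whisper_prerequisites() -> dict:
    """Report whether ffmpeg and the openai-whisper package can be found.

    Hypothetical helper for illustration; not part of whisper-haystack.
    """
    return {
        # shutil.which returns the executable's path, or None if it is not on PATH
        "ffmpeg": shutil.which("ffmpeg") is not None,
        # find_spec returns None when the top-level module is not installed
        "openai-whisper": importlib.util.find_spec("whisper") is not None,
    }


if __name__ == "__main__":
    for name, found in check_local_whisper_prerequisites().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

A "missing" entry only means the lookup failed in the current environment; ffmpeg installed outside PATH, for instance, would still be reported as missing.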

Usage

RemoteWhisperTranscriber

RemoteWhisperTranscriber transcribes audio with the OpenAI Whisper API. Set your OPENAI_API_KEY and pass the audio sources to transcribe:

import os
from haystack_integrations.components.audio.whisper import RemoteWhisperTranscriber

os.environ["OPENAI_API_KEY"] = "your-api-key"

transcriber = RemoteWhisperTranscriber()
result = transcriber.run(sources=["path/to/audio/file.mp3"])

print(result["documents"][0].content)
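
The example above writes the key into the process environment from code. As a hedged alternative, the sketch below reads a key that was already exported in the shell and fails early with a clear message when it is missing. require_env is a hypothetical helper for illustration, not part of Haystack or whisper-haystack.

```python
import os


def require_env(name: str = "OPENAI_API_KEY") -> str:
    """Return the value of an environment variable, or raise if it is unset or blank.

    Hypothetical helper for illustration; not part of whisper-haystack.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"Set the {name} environment variable before creating the transcriber."
        )
    return value
```

Calling require_env() just before RemoteWhisperTranscriber() keeps the key out of source files and surfaces a missing key before any audio is sent.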

LocalWhisperTranscriber

LocalWhisperTranscriber runs the Whisper model on your machine. Choose a model size (for example tiny, base, or small) and transcribe your audio files:

from haystack_integrations.components.audio.whisper import LocalWhisperTranscriber

transcriber = LocalWhisperTranscriber(model="small")
transcriber.warm_up()
result = transcriber.run(sources=["path/to/audio/file.mp3"])

print(result["documents"][0].content)

In a pipeline

The pipeline below fetches an audio file from a URL with LinkContentFetcher and transcribes it with LocalWhisperTranscriber:

from haystack import Pipeline
from haystack.components.fetchers import LinkContentFetcher
from haystack_integrations.components.audio.whisper import LocalWhisperTranscriber

pipe = Pipeline()
pipe.add_component("fetcher", LinkContentFetcher())
pipe.add_component("transcriber", LocalWhisperTranscriber(model="tiny"))
# Connect the fetched byte streams to the transcriber's sources input
pipe.connect("fetcher.streams", "transcriber.sources")

result = pipe.run(
    data={
        "fetcher": {
            "urls": [
                "https://github.com/deepset-ai/haystack/raw/refs/heads/main/test/test_files/audio/MLK_Something_happening.mp3"
            ]
        }
    }
)
print(result["transcriber"]["documents"][0].content)

Alternatively, the pipeline below indexes audio files from a local folder using LocalWhisperTranscriber, DocumentCleaner, DocumentSplitter, and DocumentWriter:

from pathlib import Path
from haystack import Pipeline
from haystack_integrations.components.audio.whisper import LocalWhisperTranscriber
from haystack.components.preprocessors import DocumentSplitter, DocumentCleaner
from haystack.components.writers import DocumentWriter
from haystack.document_stores.in_memory import InMemoryDocumentStore

document_store = InMemoryDocumentStore()
pipeline = Pipeline()
pipeline.add_component(instance=LocalWhisperTranscriber(model="small"), name="transcriber")
pipeline.add_component(instance=DocumentCleaner(), name="cleaner")
pipeline.add_component(instance=DocumentSplitter(), name="splitter")
pipeline.add_component(instance=DocumentWriter(document_store=document_store), name="writer")

pipeline.connect("transcriber.documents", "cleaner.documents")
pipeline.connect("cleaner.documents", "splitter.documents")
pipeline.connect("splitter.documents", "writer.documents")

# The transcriber's run input is named "sources", as in the standalone examples above
pipeline.run({"transcriber": {"sources": list(Path("path/to/audio/folder").iterdir())}})
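
Path.iterdir() returns every entry in the folder, including subdirectories and files that are not audio. A minimal sketch of filtering by file extension first is shown below. The helper list_audio_files and its default extension set are assumptions made for illustration, not part of whisper-haystack, and should be adjusted to the formats you actually store.

```python
from pathlib import Path

# Assumed extension set for illustration; adjust it to your data.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac"}


def list_audio_files(folder, extensions=AUDIO_EXTENSIONS):
    """Return the sorted regular files in `folder` whose suffix is in `extensions`.

    Hypothetical helper for illustration; not part of whisper-haystack.
    """
    return sorted(
        path
        for path in Path(folder).iterdir()
        # Skip directories and compare suffixes case-insensitively
        if path.is_file() and path.suffix.lower() in extensions
    )
```

The returned list can be passed to the transcriber in place of list(Path("path/to/audio/folder").iterdir()).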

License

whisper-haystack is distributed under the terms of the Apache-2.0 license.