Maintained by deepset

Integration: FunASR

Transcribe audio files to Documents using FunASR, an open-source, self-hosted speech recognition toolkit supporting 50+ languages.

Authors
deepset


Overview

FunASR is an open-source speech recognition toolkit from Alibaba DAMO Academy that runs entirely locally, with no API key required. It supports 50+ languages, speaker diarization, and timestamp extraction.

This integration provides:

  • FunASRTranscriber: Transcribes audio files to Haystack Document objects. Accepts file paths, Path objects, and ByteStream inputs.

Installation

pip install funasr-haystack

Usage

FunASRTranscriber

FunASRTranscriber transcribes audio files to Haystack Document objects using FunASR models. Models are downloaded from ModelScope on first use and cached in ~/.cache/modelscope.
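
Because models are downloaded on first use, it can help to check what is already cached before working offline or on a machine with limited disk space. The sketch below uses only the standard library and assumes the default cache location mentioned above; `cache_size_bytes` is a hypothetical helper name, not part of the integration.

```python
from pathlib import Path


def cache_size_bytes(cache_dir: Path) -> int:
    """Return the total size of all regular files under cache_dir (0 if it is missing)."""
    if not cache_dir.is_dir():
        return 0
    return sum(p.stat().st_size for p in cache_dir.rglob("*") if p.is_file())


# Default ModelScope cache location, as stated above.
default_cache = Path.home() / ".cache" / "modelscope"
print(f"{default_cache}: {cache_size_bytes(default_cache) / 1e6:.1f} MB")
```

A size of 0 MB suggests that nothing has been downloaded yet, so the first call to the transcriber will probably need network access.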

Basic Example

from haystack_integrations.components.audio.funasr import FunASRTranscriber

transcriber = FunASRTranscriber()

result = transcriber.run(sources=["speech.wav", "interview.mp3"])
for doc in result["documents"]:
    print(doc.content)
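
Since `run` takes a list of sources, transcribing a whole folder comes down to building that list. Here is a minimal sketch using only the standard library. `collect_audio_files` and the extension set are illustrative assumptions, not part of the integration, and the formats FunASR can actually decode depend on your installation.

```python
from pathlib import Path
from typing import List, Union

# Assumed set of extensions for illustration; adjust to the formats you use.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".flac", ".m4a"}


def collect_audio_files(folder: Union[str, Path]) -> List[Path]:
    """Return the audio files directly inside folder, sorted by path."""
    folder = Path(folder)
    return sorted(
        p
        for p in folder.iterdir()
        # Compare suffixes case-insensitively so "TALK.WAV" is also picked up.
        if p.is_file() and p.suffix.lower() in AUDIO_EXTENSIONS
    )
```

The returned list of Path objects can be passed as `sources`, for example `transcriber.run(sources=collect_audio_files("recordings"))`, where `recordings` is a placeholder directory name.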

In a Pipeline

from haystack import Pipeline
from haystack.components.fetchers import LinkContentFetcher
from haystack_integrations.components.audio.funasr import FunASRTranscriber

pipe = Pipeline()
pipe.add_component("fetcher", LinkContentFetcher())
pipe.add_component("transcriber", FunASRTranscriber())

# Feed the fetched ByteStreams into the transcriber's sources input.
pipe.connect("fetcher.streams", "transcriber.sources")

result = pipe.run(
    data={
        "fetcher": {
            "urls": ["https://example.com/speech.wav"],
        },
    }
)
print(result["transcriber"]["documents"][0].content)

Speaker Diarization

from haystack.utils import ComponentDevice
from haystack_integrations.components.audio.funasr import FunASRTranscriber

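transcriber = FunASRTranscriber(
    model="paraformer-zh",   # ASR model (Mandarin Paraformer)
    vad_model="fsmn-vad",    # voice activity detection
    punc_model="ct-punc",    # punctuation restoration
    spk_model="cam++",       # speaker model used for diarization
    device=ComponentDevice.from_str("cuda"),
)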

result = transcriber.run(sources=["meeting.wav"])
doc = result["documents"][0]
print(doc.content)
print("Speakers:", doc.meta.get("speakers"))
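
The example above hard-codes `cuda`, which is not available on every machine. Below is a minimal sketch of a CPU fallback. It assumes PyTorch is the backend in use, and `pick_device_string` is a hypothetical helper name, not part of the integration.

```python
def pick_device_string() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so fall back to CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device_str = pick_device_string()
print(device_str)
```

The result can then be passed to the transcriber as `device=ComponentDevice.from_str(pick_device_string())` in place of the fixed `"cuda"` string.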

License

funasr-haystack is distributed under the terms of the Apache-2.0 license.