Maintained by deepset

Integration: Weaviate

Use a Weaviate database with Haystack

Authors
deepset

Overview


Installation

Use pip to install the Weaviate integration for Haystack:

pip install weaviate-haystack

Usage

Once installed, initialize your Weaviate database to use it with Haystack.

In this example, we use the temporary embedded version for simplicity. To use a self-hosted Docker container or Weaviate Cloud Service, take a look at the docs.

from haystack_integrations.document_stores.weaviate import WeaviateDocumentStore
from weaviate.embedded import EmbeddedOptions

document_store = WeaviateDocumentStore(embedded_options=EmbeddedOptions())
# document_store = WeaviateDocumentStore(url="http://localhost:8080")

Writing Documents to WeaviateDocumentStore

To write documents to WeaviateDocumentStore, create an indexing pipeline.

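from haystack import Pipeline
from haystack.components.converters import TextFileToDocument
from haystack.components.writers import DocumentWriter

# Placeholder path: replace with the text files you want to index
file_paths = ["path/to/file.txt"]

indexing = Pipeline()
# Convert each text file into a Haystack Document
indexing.add_component("converter", TextFileToDocument())
# Write the converted Documents into the Weaviate document store
indexing.add_component("writer", DocumentWriter(document_store))
indexing.connect("converter", "writer")
indexing.run({"converter": {"sources": file_paths}})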

Retrieval

The integration supports different retrieval types through different retriever components:

  • WeaviateBM25Retriever: A keyword-based retriever that fetches documents matching a query from the Document Store.
  • WeaviateEmbeddingRetriever: Compares the query and document embeddings and fetches the documents most relevant to the query.
  • WeaviateHybridRetriever: Combines keyword (BM25) search with vector search, using both the query text and the query embedding to fetch the documents that score best across the two.
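
To make the difference between the three retrieval types concrete, here is a minimal pure-Python sketch. It does not use Weaviate or the retriever components. The toy documents, the two-dimensional embeddings, the simplified keyword score (a stand-in for BM25), and the `alpha` weighting are all assumptions chosen for illustration, not the integration's actual scoring.

```python
import math


def keyword_score(query: str, text: str) -> float:
    """Toy keyword score: fraction of query terms found in the text.

    A simplified stand-in for BM25, which additionally weights term
    rarity and document length.
    """
    query_terms = query.lower().split()
    text_terms = set(text.lower().split())
    if not query_terms:
        return 0.0
    return sum(term in text_terms for term in query_terms) / len(query_terms)


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def hybrid_score(keyword: float, vector: float, alpha: float = 0.5) -> float:
    """Weighted blend: alpha=0 is keyword only, alpha=1 is vector only."""
    return (1 - alpha) * keyword + alpha * vector


def rank(scores: dict[str, float]) -> list[str]:
    """Document ids ordered from best to worst score."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical documents: id -> (text, embedding)
docs = {
    "doc1": ("weaviate is a vector database", [1.0, 0.0]),
    "doc2": ("haystack builds search pipelines", [0.0, 1.0]),
    "doc3": ("a database for haystack pipelines", [0.6, 0.8]),
}
query = "vector database"
query_embedding = [0.8, 0.6]  # hypothetical embedding of the query

keyword_scores = {i: keyword_score(query, text) for i, (text, _) in docs.items()}
vector_scores = {
    i: cosine_similarity(query_embedding, emb) for i, (_, emb) in docs.items()
}
hybrid_scores = {
    i: hybrid_score(keyword_scores[i], vector_scores[i]) for i in docs
}

print("keyword:", rank(keyword_scores))
print("vector: ", rank(vector_scores))
print("hybrid: ", rank(hybrid_scores))
```

In this toy data, keyword and vector scoring disagree on the top document, and the blended score settles between them. Shifting `alpha` toward 1 favors the embedding-based ranking, and shifting it toward 0 favors the keyword-based one.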

License

weaviate-haystack is distributed under the terms of the Apache-2.0 license.