๐ŸŽ„ Let's code and celebrate this holiday season with Advent of Haystack
Maintained by deepset

Integration: Qdrant

Use the Qdrant vector database with Haystack

Authors
deepset
Qdrant

An integration of Qdrant vector database with Haystack by deepset.

The library finally allows using Qdrant as a document store, and provides an in-place replacement for any other vector embeddings store. Thus, you should expect any kind of application to be working smoothly just by changing the provider to QdrantDocumentStore.

Installation

qdrant-haystack might be installed as any other Python library, using pip or poetry:

pip install qdrant-haystack
poetry add qdrant-haystack

Installation (1.x)

Latest versions of qdrant-haystack are compatible only with Haystack 2.x. If you’re using Haystack 1.x you need to specify the version explicitly.

pip install "qdrant-haystack<2.0.0"
poetry add "qdrant-haystack<2.0.0"

Usage

Once installed, you can already start using QdrantDocumentStore as any other store that supports embeddings.

from haystack_integrations.document_stores.qdrant import QdrantDocumentStore
# Note: for Haystack 1.x use the following import instead:
# from qdrant_haystack import QdrantDocumentStore


document_store = QdrantDocumentStore(
    url="localhost",
    index="Document",
    embedding_dim=512,
    recreate_index=True,
    hnsw_config={"m": 16, "ef_construct": 64}  # Optional
)

The list of parameters accepted by QdrantDocumentStore is complementary to those used in the official Python Qdrant client.

Using local in-memory / disk-persisted mode

Qdrant Python client, from version 1.1.1, supports local in-memory/disk-persisted mode. That’s a good choice for any test scenarios and quick experiments in which you do not plan to store lots of vectors. In such a case spinning a Docker container might be even not required.

The local mode was also implemented in qdrant-haystack integration.

In-memory storage

In case you want to have a transient storage, for example in case of automated tests launched during your CI/CD pipeline, using Qdrant Local mode with in-memory storage might be a preferred option. It might be simply enabled by passing :memory: as first parameter, while creating an instance of QdrantDocumentStore.

from haystack_integrations.document_stores.qdrant import QdrantDocumentStore
# Note: for Haystack 1.x use the following import instead:
# from qdrant_haystack import QdrantDocumentStore


document_store = QdrantDocumentStore(
    ":memory:",
    index="Document",
    embedding_dim=512,
    recreate_index=True,
    hnsw_config={"m": 16, "ef_construct": 64}  # Optional
)

On disk storage

However, if you prefer to keep the vectors between different runs of your application, it might be better to use on disk storage and pass the path that should be used to persist the data.

from haystack_integrations.document_stores.qdrant import QdrantDocumentStore
# Note: for Haystack 1.x use the following import instead:
# from qdrant_haystack import QdrantDocumentStore


document_store = QdrantDocumentStore(
    path="/home/qdrant/storage_local",
    index="Document",
    embedding_dim=512,
    recreate_index=True,
    hnsw_config={"m": 16, "ef_construct": 64}  # Optional
)

Connecting to Qdrant Cloud cluster

If you prefer not to manage your own Qdrant instance, Qdrant Cloud might be a better option.

from haystack_integrations.document_stores.qdrant import QdrantDocumentStore
from haystack.utils import Secret
# Note: for Haystack 1.x use the following import instead:
# from qdrant_haystack import QdrantDocumentStore


document_store = QdrantDocumentStore(
    url="https://YOUR-CLUSTER-URL.aws.cloud.qdrant.io",
    index="Document",
    api_key=Secret.from_env_var("QDRANT_API_KEY"),
    embedding_dim=512,
    recreate_index=True,
)

There is no difference in terms of functionality between local instances and cloud clusters.