Docker

Learn how to deploy your Haystack Pipelines with Docker, from a basic Docker container to a complex multi-container application.

Running Haystack in Docker

The most basic form of Haystack deployment happens through Docker containers. Becoming familiar with running and customizing Haystack Docker images is useful as they form the basis for more advanced deployment.

Haystack releases are officially distributed through the deepset/haystack Docker image. Haystack images come in different flavors depending on the specific components they ship and the Haystack version.

Note: At the moment, the only flavor available for Haystack 2.0 is base, which ships exactly what you would get by installing Haystack locally with pip install haystack-ai.

You can pull a specific Haystack flavor using Docker tags. For example, to pull the image containing Haystack 2.0.0-beta.7, run:

docker pull deepset/haystack:base-v2.0.0-beta.7

Although the base flavor is meant to be customized, it can also be used to quickly run Haystack scripts locally without the need to set up a Python environment and its dependencies. For example, this is how you would print Haystack's version by running a Docker container:

docker run -it --rm deepset/haystack:base-v2.0.0-beta.7 python -c "from haystack.version import __version__; print(__version__)"
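
The one-liner above passes code with -c. To run a script file from the host instead, the file has to be made visible inside the container, for example through a read-only bind mount. The following is a minimal sketch rather than an official recipe: the helper names docker_run_args and run_script and the /scripts mount point are hypothetical, and only the standard docker run flags --rm and -v are used.

```python
import subprocess
from pathlib import Path

IMAGE = "deepset/haystack:base-v2.0.0-beta.7"  # tag taken from the example above


def docker_run_args(script: str, image: str = IMAGE) -> list[str]:
    """Build the `docker run` argument list for running a host script in a container."""
    path = Path(script).resolve()
    mount_dir = "/scripts"  # hypothetical mount point inside the container
    return [
        "docker", "run", "--rm",
        # Mount the script's folder read-only so the container can see the file.
        "-v", f"{path.parent}:{mount_dir}:ro",
        image,
        "python", f"{mount_dir}/{path.name}",
    ]


def run_script(script: str, image: str = IMAGE) -> int:
    """Run the script in a throwaway container and return the container's exit code."""
    return subprocess.run(docker_run_args(script, image), check=False).returncode
```

Calling run_script("my_script.py") would start a container, execute the script with the Haystack version shipped in the image, and remove the container afterwards.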

Customizing the Haystack Docker Image

Chances are your application will be more complex than a simple script, and you're going to need to install additional dependencies inside the Docker image along with Haystack.

For example, you might want to run a simple indexing Pipeline in a Docker container, using Chroma as your Document Store. The base image only contains a basic install of Haystack, so you also need to install the Chroma integration package (chroma-haystack). The best approach is to create a custom Docker image that ships the extra dependency.

Assuming you have a main.py script in your current folder, the Dockerfile would look like this:

FROM deepset/haystack:base-v2.0.0-beta.7

RUN pip install chroma-haystack

COPY ./main.py /usr/src/myapp/main.py

ENTRYPOINT ["python", "/usr/src/myapp/main.py"]

Then you can create your custom Haystack image with:

docker build . -t my-haystack-image
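
For illustration, a main.py for the indexing scenario above might contain something like the following sketch. It is not the official example: the import path haystack_integrations.document_stores.chroma is an assumption that matches recent chroma-haystack releases and may differ in the version you install, and the function name run_indexing is made up.

```python
def run_indexing(texts: list[str]) -> int:
    """Write the given texts to a Chroma Document Store and return how many were written."""
    # Imported inside the function so the module loads even where Haystack is missing.
    from haystack import Document, Pipeline
    from haystack.components.writers import DocumentWriter
    # Assumption: import path used by recent chroma-haystack releases.
    from haystack_integrations.document_stores.chroma import ChromaDocumentStore

    document_store = ChromaDocumentStore()  # default settings

    # A one-component Pipeline: the writer stores whatever Documents it receives.
    pipeline = Pipeline()
    pipeline.add_component("writer", DocumentWriter(document_store=document_store))

    documents = [Document(content=text) for text in texts]
    result = pipeline.run({"writer": {"documents": documents}})
    return result["writer"]["documents_written"]
```

Adding a call such as print(run_indexing(["Hello, Haystack!"])) at the bottom of the file would make the container print the number of Documents written when it starts.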

Complex Application with Docker Compose

One can go pretty far with a Haystack application running in Docker: with an internet connection available, the container can reach external services providing vector databases, inference endpoints, and observability features.

Still, you might want to orchestrate additional services for your Haystack container locally, for example, to reduce costs or increase performance. When your application runtime depends on more than one Docker container, Docker Compose is a great tool to keep everything together.

As an example, let's say your application wraps two Pipelines: one to index Documents into a Qdrant instance and the other to query those Documents at a later time. This setup would require two Docker containers: one to run the Pipelines (for example, using Hayhooks) and a second to run a Qdrant instance.

The Haystack bit of this application would run on a custom Docker image in order to fulfill the dependency on the QdrantDocumentStore, and the Dockerfile would look like this:

FROM deepset/haystack:base-v2.0.0-beta.7

EXPOSE 1416

RUN pip install qdrant-haystack hayhooks sentence-transformers

CMD ["hayhooks", "run", "--pipelines-dir", "/pipelines", "--host", "0.0.0.0"]

We wouldn't need to customize Qdrant, so its official Docker image would work as is. The docker-compose.yml file would then look like this:

services:
  qdrant:
    image: qdrant/qdrant:latest
    ports:
      - 6333:6333
      - 6334:6334
    expose:
      - 6333
      - 6334
    configs:
      - source: qdrant_config
        target: /qdrant/config/production.yaml
    volumes:
      - ./qdrant_data:/qdrant_data

  hayhooks:
    build: .  # custom image built from the Dockerfile above (assumed to be in the same folder)
    ports:
      - "1416:1416"
    volumes:
      - ./pipelines:/pipelines

configs:
  qdrant_config:
    content: |
      log_level: INFO
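
Once both containers have been started, it can be handy to check from the host that the published ports accept connections before sending any requests. The sketch below only tests TCP reachability on the ports from the Compose file (6333 for Qdrant, 1416 for Hayhooks); it does not verify that the services themselves are healthy. The function name wait_for_port and the default timeout values are arbitrary choices.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False if `timeout` seconds pass first."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: give up after the deadline, otherwise retry.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For the setup above, wait_for_port("localhost", 6333) and wait_for_port("localhost", 1416) should both return True once the services are up.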

For a functional example of a Docker Compose deployment, check out the "Qdrant Indexing" demo from GitHub.