Integration: Azure AI Search
Use Azure AI Search with Haystack
Table of Contents
Overview
AzureAIDocumentStore
supports an integration of
Azure AI Search which is an enterprise-ready search and retrieval system with
Haystack by
deepset.
This integration allows using search indexes in Azure AI Search as a document store to build RAG-based applications on Azure, with native LLM integrations. To retrieve data from the document store, the integration supports three types of retrieval techniques:
- Embedding Retrieval: For vector-based searches.
- BM25 Retrieval: Keyword retrieval utilizing the BM25 algorithm.
- Hybrid Retrieval: A combination of vector and BM25 retrieval methods.
Installation
Install the Azure AI Search integration:
pip install "azure-ai-search-haystack"
Usage
To use the AzureAISearchDocumentStore
, you need to have an active
Azure subscription with a deployed Azure AI Search service. You need to provide a search service endpoint as an AZURE_AI_SEARCH_ENDPOINT
and an API key as AZURE_AI_SEARCH_API_KEY
for authentication. If the API key is not provided, the DefaultAzureCredential
will attempt to authenticate you through the browser.
from haystack_integrations.document_stores.azure_ai_search import AzureAISearchDocumentStore
from haystack import Document
document_store = AzureAISearchDocumentStore(
metadata_fields={"version": float, "label": str},
index_name="document-store-example",
)
documents = [
Document(
content="This is an introduction to using Python for data analysis.",
meta={"version": 1.0, "label": "chapter_one"},
),
Document(
content="Learn how to use Python libraries for machine learning.",
meta={"version": 1.5, "label": "chapter_two"},
),
Document(
content="Advanced Python techniques for data visualization.",
meta={"version": 2.0, "label": "chapter_three"},
),
]
document_store.write_documents(documents)
filters = {
"operator": "AND",
"conditions": [
{"field": "meta.version", "operator": ">", "value": 1.2},
{"field": "meta.label", "operator": "in", "value": ["chapter_one", "chapter_three"]},
],
}
results = document_store.filter_documents(filters)
print(results)
You can supply all supported parameters as index_creation_kwargs
for SearchIndex
during the initialization of the AzureAISearchDocumentStore
to customize index creation. Additionally, the AzureAISearchDocumentStore
supports semantic ranking, which can be enabled by including the SemanticSearch
configuration in index_creation_kwargs during initialization and utilizing it through one of the retrievers. For further details, refer to the
Azure AI tutorial on this feature.