GradientTextEmbedder

This component computes embeddings for text (such as a query) using models deployed through the Gradient AI platform.

Name: GradientTextEmbedder
Type: Text Embedder
Path: https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/gradient
Position in a Pipeline: Before an embedding Retriever in a Query/RAG Pipeline
Inputs: "text": a string
Outputs: "embedding": a list of float numbers representing the embedding of the text

GradientTextEmbedder allows you to compute the embedding of a string using embedding models deployed on the Gradient AI platform. This component should be used to embed a simple string (such as a query) into a vector.

For embedding lists of Documents, use one of the Document Embedders, which enrich each Document with its computed embedding, also known as a vector.

Check out the Gradient documentation for the full list of available embedding models on Gradient. Currently, the component allows you to use the bge-large model.

For an example showcasing this component, check out this article and the related Colab notebook.

Parameters Overview

GradientTextEmbedder needs an access_token and a workspace_id. You can provide them in one of the following ways:

  • Provide the access_token and workspace_id init parameters.
  • Set GRADIENT_ACCESS_TOKEN and GRADIENT_WORKSPACE_ID environment variables.
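
The two options above can be sketched with a small helper. This is only an illustration: resolve_gradient_credentials is a hypothetical function, not part of gradient-haystack, and the rule that explicit values take precedence over environment variables is an assumption made for the example.

```python
import os
from typing import Optional, Tuple


def resolve_gradient_credentials(
    access_token: Optional[str] = None,
    workspace_id: Optional[str] = None,
) -> Tuple[str, str]:
    """Hypothetical helper: return (access_token, workspace_id).

    Assumption for this sketch: explicitly passed values win, and the
    environment variables named in the docs are used as a fallback.
    """
    token = access_token or os.environ.get("GRADIENT_ACCESS_TOKEN")
    workspace = workspace_id or os.environ.get("GRADIENT_WORKSPACE_ID")

    # Collect the names of any credentials that are still unset.
    missing = [
        name
        for name, value in (
            ("GRADIENT_ACCESS_TOKEN", token),
            ("GRADIENT_WORKSPACE_ID", workspace),
        )
        if not value
    ]
    if missing:
        raise ValueError("Missing Gradient credentials: " + ", ".join(missing))
    return token, workspace
```

The resolved values could then be handed to the component through its init parameters, for example GradientTextEmbedder(access_token=token, workspace_id=workspace).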

As more models become available, you can change the model in the component by setting the model parameter at initialization.

Usage

You need to install the gradient-haystack package to use the GradientTextEmbedder:

pip install gradient-haystack

On its own

Here is how you can use the component on its own:

import os
from gradient_haystack.embedders.gradient_text_embedder import GradientTextEmbedder

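# The component reads its credentials from these environment variables.
os.environ["GRADIENT_ACCESS_TOKEN"] = "YOUR_GRADIENT_ACCESS_TOKEN"
os.environ["GRADIENT_WORKSPACE_ID"] = "YOUR_GRADIENT_WORKSPACE_ID"

text_embedder = GradientTextEmbedder()
text_embedder.warm_up()
# The output is a dictionary whose "embedding" key holds a list of floats.
result = text_embedder.run(text="Pizza is made with dough and cheese")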
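
To give a sense of how the returned "embedding" is typically consumed, here is a minimal sketch that compares a query vector with a document vector using cosine similarity. The vectors are made-up stand-ins for real model output, cosine_similarity is a helper written for this example, and the choice of cosine similarity is an assumption; the similarity function an actual retriever uses depends on how the document store is configured.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # Treat a zero vector as having no similarity.
    return dot / (norm_a * norm_b)


# Stand-in for the dictionary returned by text_embedder.run(...).
# Real embeddings are much longer; these values are invented.
query_result = {"embedding": [0.1, 0.3, 0.5]}

# Stand-in for an embedding stored on a Document.
document_embedding = [0.2, 0.6, 1.0]

score = cosine_similarity(query_result["embedding"], document_embedding)
```

A score close to 1 means the two vectors point in nearly the same direction, which is the signal an embedding retriever uses to rank Documents against a query.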

In a pipeline

Text embedders are commonly used to embed queries before an embedding retriever in Query/RAG Pipelines. Here is an example of this component being used in a RAG Pipeline, which is doing question answering based on Documents in an InMemoryDocumentStore:

import os
from haystack import Pipeline
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
from haystack.components.builders import PromptBuilder
from haystack.components.generators import HuggingFaceTGIGenerator
from gradient_haystack.embedders.gradient_text_embedder import GradientTextEmbedder

document_store = InMemoryDocumentStore()

prompt = """ Answer the query, based on the
content in the documents.

Documents:
{% for doc in documents %}
  {{doc.content}}
{% endfor %}

Query: {{query}}
"""

os.environ["GRADIENT_ACCESS_TOKEN"] = "YOUR_GRADIENT_ACCESS_TOKEN"
os.environ["GRADIENT_WORKSPACE_ID"] = "YOUR_GRADIENT_WORKSPACE_ID"

text_embedder = GradientTextEmbedder()
retriever = InMemoryEmbeddingRetriever(document_store=document_store)
prompt_builder = PromptBuilder(template=prompt)
generator = HuggingFaceTGIGenerator(
    model="mistralai/Mistral-7B-v0.1",
    token="YOUR_HUGGINGFACE_TOKEN",
)
generator.warm_up()

rag_pipeline = Pipeline()

rag_pipeline.add_component(instance=text_embedder, name="text_embedder")
rag_pipeline.add_component(instance=retriever, name="retriever")
rag_pipeline.add_component(instance=prompt_builder, name="prompt_builder")
rag_pipeline.add_component(instance=generator, name="generator")

# Feed the query embedding into the retriever.
rag_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")
rag_pipeline.connect("retriever.documents", "prompt_builder.documents")
rag_pipeline.connect("prompt_builder", "generator")

question = "What are the steps for creating a custom component?"
result = rag_pipeline.run(data={"text_embedder":{"text": question},
                                "prompt_builder":{"query": question}})

Related Links

Check out the API reference in the GitHub repo or in our docs: