Pipelines
Module: pipeline
Class: Pipeline
class Pipeline()
Pipeline combines building blocks into a complex search pipeline made of Haystack and user-defined components.
Under the hood, a pipeline is represented as a directed acyclic graph of component nodes. This enables custom query flows that can branch queries (e.g., extractive QA vs. keyword match query), merge candidate documents for a Reader from multiple Retrievers, or re-rank candidate documents.
add_node
| add_node(component, name: str, inputs: List[str])
Add a new node to the pipeline.
Arguments:
component
: The object to be called when the data is passed to the node. It can be a Haystack component (like Retriever, Reader, or Generator) or a user-defined object that implements a run() method to process incoming data from the predecessor node.
name
: The name for the node. It must not contain any dots.
inputs
: A list of inputs to the node. If the predecessor node has a single outgoing edge, the name of that node is sufficient. For instance, an 'ElasticsearchRetriever' node always outputs a single edge with a list of documents, so it can be represented as ["ElasticsearchRetriever"].
When the predecessor node has multiple outputs, e.g., a "QueryClassifier", the output must be specified explicitly, as in "QueryClassifier.output_2".
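The naming convention for inputs can be illustrated with a small, self-contained sketch. This is not Haystack's implementation: ToyPipeline, ToyClassifier, Tagger, the "Query" root name, and the default edge name "output_1" are all hypothetical, chosen only to show how a bare node name and an explicit "Node.output_N" reference might be resolved.

```python
from typing import Any, Dict, List, Tuple


class ToyPipeline:
    """Hypothetical stand-in for a pipeline graph; not Haystack's Pipeline class."""

    def __init__(self) -> None:
        # node name -> (component, list of (predecessor name, output edge))
        self.nodes: Dict[str, Tuple[Any, List[Tuple[str, str]]]] = {}

    def add_node(self, component: Any, name: str, inputs: List[str]) -> None:
        if "." in name:
            raise ValueError("Node names must not contain dots.")
        edges = []
        for item in inputs:
            # "QueryClassifier.output_2" -> ("QueryClassifier", "output_2");
            # a bare "Retriever" falls back to the assumed single edge "output_1".
            pred, _, edge = item.partition(".")
            edges.append((pred, edge or "output_1"))
        self.nodes[name] = (component, edges)

    def run(self, query: str) -> Dict[str, Any]:
        # Nodes are visited in insertion order, which is a valid topological
        # order as long as every node is added after its predecessors.
        produced: Dict[Tuple[str, str], Any] = {("Query", "output_1"): query}
        results: Dict[str, Any] = {}
        for name, (component, edges) in self.nodes.items():
            incoming = [produced[e] for e in edges if e in produced]
            if not incoming:
                continue  # no predecessor routed data to this node
            data, edge = component.run(incoming)
            produced[(name, edge)] = data
            results[name] = data
        return results


class ToyClassifier:
    """Routes questions to output_1 and everything else to output_2."""

    def run(self, incoming: List[str]) -> Tuple[str, str]:
        query = incoming[0]
        return query, ("output_1" if query.endswith("?") else "output_2")


class Tagger:
    """Placeholder for a retriever; it only labels the query it received."""

    def __init__(self, tag: str) -> None:
        self.tag = tag

    def run(self, incoming: List[str]) -> Tuple[str, str]:
        return f"{self.tag}:{incoming[0]}", "output_1"


pipe = ToyPipeline()
pipe.add_node(ToyClassifier(), name="QueryClassifier", inputs=["Query"])
pipe.add_node(Tagger("dense"), name="DenseRetriever", inputs=["QueryClassifier.output_1"])
pipe.add_node(Tagger("keyword"), name="KeywordRetriever", inputs=["QueryClassifier.output_2"])
```

In this sketch, a keyword-style query only reaches KeywordRetriever, because DenseRetriever listens on the classifier's other output edge.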
get_node
| get_node(name: str)
Get a node from the Pipeline.
Arguments:
name
: The name of the node.
set_node
| set_node(name: str, component)
Set the component for a node in the Pipeline.
Arguments:
name
: The name of the node.
component
: The component object to be set at the node.
draw
| draw(path: Path = Path("pipeline.png"))
Create a Graphviz visualization of the pipeline.
Arguments:
path
: The path where the image is saved.
Class: BaseStandardPipeline
class BaseStandardPipeline()
add_node
| add_node(component, name: str, inputs: List[str])
Add a new node to the pipeline.
Arguments:
component
: The object to be called when the data is passed to the node. It can be a Haystack component (like Retriever, Reader, or Generator) or a user-defined object that implements a run() method to process incoming data from the predecessor node.
name
: The name for the node. It must not contain any dots.
inputs
: A list of inputs to the node. If the predecessor node has a single outgoing edge, the name of that node is sufficient. For instance, an 'ElasticsearchRetriever' node always outputs a single edge with a list of documents, so it can be represented as ["ElasticsearchRetriever"].
When the predecessor node has multiple outputs, e.g., a "QueryClassifier", the output must be specified explicitly, as in "QueryClassifier.output_2".
get_node
| get_node(name: str)
Get a node from the Pipeline.
Arguments:
name
: The name of the node.
set_node
| set_node(name: str, component)
Set the component for a node in the Pipeline.
Arguments:
name
: The name of the node.
component
: The component object to be set at the node.
draw
| draw(path: Path = Path("pipeline.png"))
Create a Graphviz visualization of the pipeline.
Arguments:
path
: The path where the image is saved.
Class: ExtractiveQAPipeline
class ExtractiveQAPipeline(BaseStandardPipeline)
__init__
| __init__(reader: BaseReader, retriever: BaseRetriever)
Initialize a Pipeline for Extractive Question Answering.
Arguments:
reader
: Reader instance
retriever
: Retriever instance
Class: DocumentSearchPipeline
class DocumentSearchPipeline(BaseStandardPipeline)
__init__
| __init__(retriever: BaseRetriever)
Initialize a Pipeline for semantic document search.
Arguments:
retriever
: Retriever instance
Class: GenerativeQAPipeline
class GenerativeQAPipeline(BaseStandardPipeline)
__init__
| __init__(generator: BaseGenerator, retriever: BaseRetriever)
Initialize a Pipeline for Generative Question Answering.
Arguments:
generator
: Generator instance
retriever
: Retriever instance
Class: SearchSummarizationPipeline
class SearchSummarizationPipeline(BaseStandardPipeline)
__init__
| __init__(summarizer: BaseSummarizer, retriever: BaseRetriever)
Initialize a Pipeline that retrieves documents for a query and then summarizes those documents.
Arguments:
summarizer
: Summarizer instance
retriever
: Retriever instance
run
| run(query: str, filters: Optional[Dict] = None, top_k_retriever: int = 10, generate_single_summary: bool = False, return_in_answer_format=False)
Arguments:
query
: Your search query.
filters
:
top_k_retriever
: Number of top docs the retriever should pass to the summarizer. The higher this value, the slower your pipeline.
generate_single_summary
: Whether to generate a single summary from all retrieved docs (True) or one per doc (False).
return_in_answer_format
: Whether the results should be returned as documents (False) or in the answer format used in other QA pipelines (True). With the latter, you can use this pipeline as a "drop-in replacement" for other QA pipelines.
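The interplay of top_k_retriever and generate_single_summary can be sketched without any Haystack dependency. Everything below is hypothetical: search_and_summarize, toy_retrieve, toy_summarize, and CORPUS are illustrative stand-ins, not the library's retriever or summarizer, and the answer format selected by return_in_answer_format is deliberately left out because its structure is not described here.

```python
from typing import Callable, List

# Hypothetical three-document corpus used only for this illustration.
CORPUS = [
    "Berlin is the capital of Germany. It has 3.7 million inhabitants.",
    "Paris is the capital of France. It lies on the Seine.",
    "Rome is the capital of Italy. It is famous for the Colosseum.",
]


def toy_retrieve(query: str) -> List[str]:
    """Rank documents by the number of words they share with the query."""
    words = set(query.lower().split())

    def overlap(doc: str) -> int:
        return len(words & set(doc.lower().replace(".", "").split()))

    return sorted(CORPUS, key=overlap, reverse=True)


def toy_summarize(text: str) -> str:
    """Pretend summarizer: keep only the first sentence."""
    return text.split(". ")[0].rstrip(".") + "."


def search_and_summarize(
    query: str,
    retrieve: Callable[[str], List[str]],
    summarize: Callable[[str], str],
    top_k_retriever: int = 10,
    generate_single_summary: bool = False,
) -> List[str]:
    # Only the top_k_retriever best documents reach the summarizer.
    docs = retrieve(query)[:top_k_retriever]
    if generate_single_summary:
        # One summary over the concatenation of all retrieved documents.
        return [summarize(" ".join(docs))]
    # One summary per retrieved document.
    return [summarize(doc) for doc in docs]


per_doc = search_and_summarize(
    "capital of Germany", toy_retrieve, toy_summarize, top_k_retriever=2
)
single = search_and_summarize(
    "capital of Germany", toy_retrieve, toy_summarize,
    top_k_retriever=2, generate_single_summary=True,
)
```

Here per_doc holds one summary for each of the two retrieved documents, while single holds exactly one summary for both.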
Class: FAQPipeline
class FAQPipeline(BaseStandardPipeline)
__init__
| __init__(retriever: BaseRetriever)
Initialize a Pipeline for finding similar FAQs using semantic document search.
Arguments:
retriever
: Retriever instance
Class: JoinDocuments
class JoinDocuments()
A node that joins the documents returned by multiple retriever nodes.
The node allows multiple join modes:
- concatenate: combine the documents from multiple nodes. Any duplicate documents are discarded.
- merge: merge scores of documents from multiple nodes. Optionally, each input score can be given a different weight, and a top_k limit can be set. This mode can also be used for "reranking" retrieved documents.
__init__
| __init__(join_mode: str = "concatenate", weights: Optional[List[float]] = None, top_k_join: Optional[int] = None)
Arguments:
join_mode
: "concatenate" to combine documents from multiple retrievers or "merge" to aggregate scores of individual documents.
weights
: A node-wise list (its length must equal the number of input nodes) of weights for adjusting document scores when using the "merge" join mode. By default, equal weight is given to each retriever score. This parameter is not compatible with the "concatenate" join mode.
top_k_join
: Limit documents to top_k based on the resulting scores of the join.
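The two join modes can be compared in a minimal sketch. It is not the JoinDocuments implementation: join_documents and the (id, score) tuple representation are hypothetical, and two details are assumptions made for illustration, namely that "equal weight" means 1 / number of inputs and that "concatenate" keeps the first occurrence of a duplicate.

```python
from typing import Dict, List, Optional, Tuple

# (document id, score): a simplified, hypothetical stand-in for a document object.
Doc = Tuple[str, float]


def join_documents(
    inputs: List[List[Doc]],
    join_mode: str = "concatenate",
    weights: Optional[List[float]] = None,
    top_k_join: Optional[int] = None,
) -> List[Doc]:
    merged: Dict[str, float] = {}
    if join_mode == "concatenate":
        if weights is not None:
            raise ValueError("weights are not compatible with 'concatenate'")
        for docs in inputs:
            for doc_id, score in docs:
                # Assumption: the first occurrence wins, later duplicates are dropped.
                merged.setdefault(doc_id, score)
    elif join_mode == "merge":
        # Assumption: equal weighting means 1 / number of inputs per node.
        if weights is None:
            weights = [1 / len(inputs)] * len(inputs)
        if len(weights) != len(inputs):
            raise ValueError("weights must have one entry per input node")
        for docs, weight in zip(inputs, weights):
            for doc_id, score in docs:
                # Sum the weighted scores of the same document across inputs.
                merged[doc_id] = merged.get(doc_id, 0.0) + weight * score
    else:
        raise ValueError(f"unknown join_mode: {join_mode}")

    # Rank by the resulting score and optionally keep only the top documents.
    ranked = sorted(merged.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k_join] if top_k_join is not None else ranked


dense = [("a", 0.9), ("b", 0.5)]
sparse = [("b", 0.7), ("c", 0.2)]

concatenated = join_documents([dense, sparse])
reranked = join_documents([dense, sparse], join_mode="merge", top_k_join=2)
```

With these inputs, document "b" appears in both lists, so "merge" sums its weighted scores and ranks it above "a", which illustrates why this mode can act as a re-ranker.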