Module pipeline
Pipeline Objects
class Pipeline()
Pipeline brings together building blocks to build a complex search pipeline with Haystack and user-defined components.
Under the hood, a pipeline is represented as a directed acyclic graph of component nodes. It enables custom query flows with options to branch queries (e.g., extractive QA vs. keyword match), merge candidate documents for a Reader from multiple Retrievers, or re-rank candidate documents.
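As a minimal sketch of how these pieces fit together, the snippet below wires a Retriever and a Reader into a linear pipeline. The import paths, the Elasticsearch host, and the model name are illustrative and may differ between Haystack versions:

# Minimal sketch of a custom pipeline; import paths may vary between Haystack versions.
from haystack.pipeline import Pipeline
from haystack.document_store.elasticsearch import ElasticsearchDocumentStore
from haystack.retriever.sparse import ElasticsearchRetriever
from haystack.reader.farm import FARMReader

document_store = ElasticsearchDocumentStore(host="localhost", index="document")
retriever = ElasticsearchRetriever(document_store=document_store)
reader = FARMReader(model_name_or_path="deepset/roberta-base-squad2")

# "Query" is the entry point of every query pipeline; each node lists its predecessors in `inputs`.
pipeline = Pipeline()
pipeline.add_node(component=retriever, name="Retriever", inputs=["Query"])
pipeline.add_node(component=reader, name="Reader", inputs=["Retriever"])

# Keyword arguments are forwarded to the nodes that accept them.
prediction = pipeline.run(query="Who is the father of Arya Stark?", top_k_retriever=10, top_k_reader=5)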
add_node
| add_node(component, name: str, inputs: List[str])
Add a new node to the pipeline.
Arguments:
component
: The object to be called when the data is passed to the node. It can be a Haystack component (like a Retriever, Reader, or Generator) or a user-defined object that implements a run() method to process incoming data from its predecessor node.
name
: The name for the node. It must not contain any dots.
inputs
: A list of inputs to the node. If the predecessor node has a single outgoing edge, the name of that node is sufficient. For instance, an 'ElasticsearchRetriever' node always outputs a single edge with a list of documents; it can be represented as ["ElasticsearchRetriever"].
In cases when the predecessor node has multiple outputs, e.g., a "QueryClassifier", the output must be specified explicitly as "QueryClassifier.output_2" (see the branching sketch below).
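To make the multi-output case concrete, here is a hedged branching sketch. The query classifier component and the retriever variables (query_classifier, es_retriever, dpr_retriever) are assumed to be instantiated already; the classifier stands in for any node with two outgoing edges:

# Illustrative branching: route each query down one of two edges of a classifier node.
pipeline = Pipeline()
pipeline.add_node(component=query_classifier, name="QueryClassifier", inputs=["Query"])
# Keyword-style queries leave the classifier on output_1 ...
pipeline.add_node(component=es_retriever, name="ESRetriever", inputs=["QueryClassifier.output_1"])
# ... while natural-language questions leave on output_2.
pipeline.add_node(component=dpr_retriever, name="DPRRetriever", inputs=["QueryClassifier.output_2"])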
get_node
| get_node(name: str)
Get a node from the Pipeline.
Arguments:
name
: The name of the node.
set_node
| set_node(name: str, component)
Set the component for a node in the Pipeline.
Arguments:
name
: The name of the node.
component
: The component object to be set at the node.
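A short usage sketch for the two accessors, reusing the pipeline and FARMReader import from the earlier example; the replacement model name is purely illustrative:

# Inspect the component currently attached to a node ...
current_reader = pipeline.get_node(name="Reader")

# ... or swap it for another component without rebuilding the graph.
pipeline.set_node(name="Reader", component=FARMReader(model_name_or_path="deepset/minilm-uncased-squad2"))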
draw
| draw(path: Path = Path("pipeline.png"))
Create a Graphviz visualization of the pipeline.
Arguments:
path
: The path to save the image.
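Usage is a one-liner on the pipeline from the examples above; note that Graphviz needs to be installed on the system for the rendering to succeed:

from pathlib import Path

# Writes a PNG of the pipeline graph to the given path.
pipeline.draw(path=Path("custom_pipeline.png"))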
BaseStandardPipeline Objects
class BaseStandardPipeline()
add_node
| add_node(component, name: str, inputs: List[str])
Add a new node to the pipeline.
Arguments:
component
: The object to be called when the data is passed to the node. It can be a Haystack component (like a Retriever, Reader, or Generator) or a user-defined object that implements a run() method to process incoming data from its predecessor node.
name
: The name for the node. It must not contain any dots.
inputs
: A list of inputs to the node. If the predecessor node has a single outgoing edge, the name of that node is sufficient. For instance, an 'ElasticsearchRetriever' node always outputs a single edge with a list of documents; it can be represented as ["ElasticsearchRetriever"].
In cases when the predecessor node has multiple outputs, e.g., a "QueryClassifier", the output must be specified explicitly as "QueryClassifier.output_2".
get_node
| get_node(name: str)
Get a node from the Pipeline.
Arguments:
name
: The name of the node.
set_node
| set_node(name: str, component)
Set the component for a node in the Pipeline.
Arguments:
name
: The name of the node.
component
: The component object to be set at the node.
draw
| draw(path: Path = Path("pipeline.png"))
Create a Graphviz visualization of the pipeline.
Arguments:
path
: The path to save the image.
ExtractiveQAPipeline Objects
class ExtractiveQAPipeline(BaseStandardPipeline)
__init__
| __init__(reader: BaseReader, retriever: BaseRetriever)
Initialize a Pipeline for Extractive Question Answering.
Arguments:
reader
: Reader instance
retriever
: Retriever instance
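A minimal usage sketch, assuming `reader` and `retriever` are initialized as in the earlier examples; the run() keyword arguments and the shape of the returned prediction may differ between Haystack versions:

# Assumes `reader` and `retriever` are already initialized (see the Pipeline example above).
pipe = ExtractiveQAPipeline(reader=reader, retriever=retriever)
prediction = pipe.run(query="Who is the father of Arya Stark?", top_k_retriever=10, top_k_reader=5)

# Each answer entry typically carries the answer string, a score, and the supporting context.
for answer in prediction["answers"]:
    print(answer["answer"], answer["score"])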
DocumentSearchPipeline Objects
class DocumentSearchPipeline(BaseStandardPipeline)
__init__
| __init__(retriever: BaseRetriever)
Initialize a Pipeline for semantic document search.
Arguments:
retriever
: Retriever instance
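A usage sketch, assuming `retriever` is already initialized; the "documents" result key and the Document.text attribute are assumptions that may differ between Haystack versions:

pipe = DocumentSearchPipeline(retriever=retriever)
result = pipe.run(query="effects of climate change on agriculture", top_k_retriever=10)

# Print a preview of each retrieved document.
for doc in result["documents"]:
    print(doc.text[:200])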
GenerativeQAPipeline Objects
class GenerativeQAPipeline(BaseStandardPipeline)
__init__
| __init__(generator: BaseGenerator, retriever: BaseRetriever)
Initialize a Pipeline for Generative Question Answering.
Arguments:
generator
: Generator instance
retriever
: Retriever instance
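A usage sketch; the RAGenerator import path and model name are assumptions, and a dense retriever (e.g., DPR) is typically expected here:

# The generator import path and model are illustrative; a dense retriever is assumed.
from haystack.generator.transformers import RAGenerator

generator = RAGenerator(model_name_or_path="facebook/rag-token-nq")
pipe = GenerativeQAPipeline(generator=generator, retriever=retriever)
result = pipe.run(query="Who is the father of Arya Stark?", top_k_retriever=5)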
SearchSummarizationPipeline Objects
class SearchSummarizationPipeline(BaseStandardPipeline)
__init__
| __init__(summarizer: BaseSummarizer, retriever: BaseRetriever)
Initialize a Pipeline that retrieves documents for a query and then summarizes those documents.
Arguments:
summarizer
: Summarizer instance
retriever
: Retriever instance
run
| run(query: str, filters: Optional[Dict] = None, top_k_retriever: int = 10, generate_single_summary: bool = False, return_in_answer_format=False)
Arguments:
query
: Your search query
filters
:
top_k_retriever
: Number of top docs the retriever should pass to the summarizer. The higher this value, the slower your pipeline.
generate_single_summary
: Whether to generate a single summary from all retrieved docs (True) or one summary per doc (False).
return_in_answer_format
: Whether the results should be returned as documents (False) or in the answer format used by other QA pipelines (True). With the latter, you can use this pipeline as a "drop-in replacement" for other QA pipelines (see the usage sketch below).
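A usage sketch tying the constructor and run() together; the TransformersSummarizer import path and model name are assumptions that may differ between Haystack versions, and `retriever` is assumed to be initialized:

# Summarizer import path and model are illustrative.
from haystack.summarizer.transformers import TransformersSummarizer

summarizer = TransformersSummarizer(model_name_or_path="facebook/bart-large-cnn")
pipe = SearchSummarizationPipeline(summarizer=summarizer, retriever=retriever)

# One summary per retrieved document; pass generate_single_summary=True for a single combined summary.
result = pipe.run(query="effects of climate change", top_k_retriever=10, generate_single_summary=False)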
FAQPipeline Objects
class FAQPipeline(BaseStandardPipeline)
__init__
| __init__(retriever: BaseRetriever)
Initialize a Pipeline for finding similar FAQs using semantic document search.
Arguments:
retriever
: Retriever instance
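A usage sketch, assuming an embedding retriever over an FAQ-style index in which each document stores a question together with its answer:

# Assumes `retriever` is an embedding retriever over an FAQ index.
pipe = FAQPipeline(retriever=retriever)
result = pipe.run(query="How do I reset my password?", top_k_retriever=3)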
JoinDocuments Objects
class JoinDocuments()
A node to join documents returned by multiple retriever nodes.
The node allows multiple join modes:
- concatenate: combine the documents from multiple nodes. Any duplicate documents are discarded.
- merge: merge the scores of documents from multiple nodes. Optionally, each input score can be given a different weight and a top_k limit can be set. This mode can also be used for "reranking" retrieved documents.
__init__
| __init__(join_mode: str = "concatenate", weights: Optional[List[float]] = None, top_k_join: Optional[int] = None)
Arguments:
join_mode
: "concatenate" to combine documents from multiple retrievers, or "merge" to aggregate the scores of individual documents.
weights
: A node-wise list of weights for adjusting document scores when using the "merge" join_mode; its length must equal the number of input nodes. By default, equal weight is given to each retriever score. This parameter is not compatible with the "concatenate" join_mode.
top_k_join
: Limit documents to top_k based on the resulting scores of the join (see the usage sketch below).
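A hybrid-retrieval sketch that merges the scores of a sparse and a dense retriever; `es_retriever` and `dpr_retriever` are assumed to be initialized, and the weights are illustrative:

# Both retrievers receive the query in parallel; JoinDocuments merges their scored results.
pipeline = Pipeline()
pipeline.add_node(component=es_retriever, name="ESRetriever", inputs=["Query"])
pipeline.add_node(component=dpr_retriever, name="DPRRetriever", inputs=["Query"])
pipeline.add_node(
    component=JoinDocuments(join_mode="merge", weights=[0.3, 0.7], top_k_join=10),
    name="JoinResults",
    inputs=["ESRetriever", "DPRRetriever"],
)
result = pipeline.run(query="Who is the father of Arya Stark?")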