

While extractive QA highlights the span of text that answers a query, generative QA can return a novel text answer that it has composed.

The best current approaches, such as Retrieval-Augmented Generation (RAG) and LFQA, can draw upon both the knowledge gained during language model pretraining (parametric memory) and the passages supplied by a retriever (non-parametric memory).

With the advent of Transformer-based retrieval methods such as Dense Passage Retrieval (DPR), the retriever and generator can be trained concurrently from a single loss signal.
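Conceptually, RAG marginalizes the generator's output distribution over the retrieved passages, weighting each passage by its retrieval score; because both factors are differentiable, one loss on the final answer can update retriever and generator together. A toy sketch of that marginalization (all numbers are invented for illustration):

```python
import math

# Toy retrieval scores for three passages (invented numbers)
retrieval_scores = [2.0, 0.5, -1.0]

# Softmax over scores -> p(z | query), the retriever's distribution over passages
exps = [math.exp(s) for s in retrieval_scores]
total = sum(exps)
p_passage = [e / total for e in exps]

# p(answer | query, passage) from the generator, one value per passage (invented)
p_answer_given_passage = [0.8, 0.4, 0.1]

# RAG marginalizes: p(answer | query) = sum_z p(z | query) * p(answer | query, z)
p_answer = sum(pz * pa for pz, pa in zip(p_passage, p_answer_given_passage))
# A single cross-entropy loss on p_answer backpropagates through both factors,
# which is why one loss signal can train retriever and generator concurrently.
```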

Tutorial: Check out our tutorial notebooks for a guide on how to build your own generative QA system with RAG (here) or with LFQA (here).


Pros:

  • More appropriately phrased answers
  • Able to synthesize information from different texts
  • Can draw on latent knowledge stored in the language model


Cons:

  • It is not easy to track which pieces of information the generator based its answer on


Initialize a Generator as follows:

from haystack.generator.transformers import RAGenerator

generator = RAGenerator(model_name_or_path="facebook/rag-token-nq")

Running a Generator in a pipeline:

from haystack.pipeline import GenerativeQAPipeline
pipeline = GenerativeQAPipeline(generator=generator, retriever=dpr_retriever)
result = pipeline.run(query='What are the best party games for adults?', top_k_retriever=20)
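The pipeline returns a dictionary with the generated answers under the `answers` key. A minimal sketch of inspecting such a result, using a mocked dict whose exact key names are an assumption here rather than taken from this page:

```python
# Mocked output in the shape returned by GenerativeQAPipeline.run() in older
# Haystack releases; key names ("answers", "answer", "meta") are assumptions.
result = {
    "query": "What are the best party games for adults?",
    "answers": [
        {
            "answer": "Popular picks include charades, trivia, and card games.",
            "meta": {"doc_ids": ["doc_1", "doc_2"]},
        },
    ],
}

# Pull out the top generated answer
top_answer = result["answers"][0]["answer"]
print(top_answer)
```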

Running a stand-alone Generator:

result = generator.predict(
    query='What are the best party games for adults?',
    documents=[doc1, doc2, doc3],
    top_k=1
)