Haystack docs home page

Answer Generator

The Answer Generator reads a set of documents and generates an answer to a question, word by word. While extractive QA highlights the span of text that answers a query, generative QA can return a novel text answer that it has composed. The best current approaches can draw upon both the knowledge the Answer Generator gained during language model pretraining (parametric memory) and the passages provided to it with a Retriever (non-parametric memory).

Haystack offers generative systems such as RAG and LFQA, that you can run on your own hardware. Since Answer Generators are often trained concurrently with a Retriever, you should use models that were trained together for best performance. For example, the RAG model expects a Dense Passage Retrieval.

With Haystack, you can use an API to perform answer generation. GPT-3 from OpenAI is a large scale generative language model that can be used in this question answering setting. It is infeasible to run a model of this size on local hardware, but you can query an instance of GPT-3 using an API. The OpenAIAnswerGenerator class facilitates this, and like any other Node in Haystack, it can be used in isolation or as part of a Pipeline. Note that to use GPT-3, you need to sign up for an OpenAI account and obtain a valid API key.

Position in a PipelineAfter the Retriever, you can use it as a substitute to the Reader.
ClassesRAGenerator, Seq2SeqGenerator, OpenAIAnswerGenerator


  • More appropriately phrased answers.
  • Able to synthesize information from different texts.
  • Can draw on latent knowledge stored in language model.


  • Not easy to track what piece of information the Answer Generator is basing its response off of.

Tutorial: Build your own generative QA system with RAG and LFQA.

Answer Generator Classes

  • RAGenerator: Retrieval-Augmented Generator based on Hugging Face's transformers model. Its main advantages are a manageable model size and the fact that the answer generation depends on retrieved documents. This means that the model can easily adjust to domain documents even after the training is finished.
  • Seq2SeqGenerator: A generic sequence-to-sequence generator based on Hugging Face's transformers. You can use it with any Hugging Face language model that extends GenerationMixin. See also How to Generate Text.
  • OpenAIAnswerGenerator: A class that calls the GPT-3 model hosted by OpenAI. It performs queries by making API calls but otherwise functions like any other Haystack Node.


To initialize a locally hosted Answer Generator, run:

from haystack.nodes import RAGenerator
generator = RAGenerator(

To initialize the OpenAIAnswerGenerator, run:

from haystack.nodes import OpenAIAnswerGenerator
generator = OpenAIAnswerGenerator(api_key=MY_API_KEY)

To use an Answer Generator in a pipeline, run:

from haystack.pipelines import GenerativeQAPipeline
pipeline = GenerativeQAPipeline(generator=generator, retriever=retriever)
result = pipelines.run(query='What are the best party games for adults?', params={"Retriever": {"top_k": 5}})

To run a stand-alone Answer Generator, run:

result = generator.predict(
query='What are the best party games for adults?',
documents=[doc1, doc2, doc3...],