
Summarize Hacker News Posts with Haystack & OPEA
Build a RAG pipeline to fetch live Hacker News posts and summarize them with a local LLM endpoint
June 10, 2025Welcome to this step-by-step tutorial where we’ll build a simple Retrieval-Augmented Generation (RAG) pipeline using Haystack and
OPEA. We’ll fetch the newest Hacker News posts, feed them to a lightweight LLM endpoint (OPEAGenerator
), and generate concise one-sentence summaries (based on this
notebook). Let’s dive in! ๐
1. Introduction & Motivation
In modern GenAI applications, having a flexible, performant, and scalable platform is essential. OPEA (Open Platform for Enterprise AI) is an open, model-agnostic framework for building and operating composable GenAI solutions. It provides:
- A library of microservices (LLMs, data stores, prompt engines) and higher-order megaservices for end-to-end workflows
- HTTP-based inference with multi-model support (open- and closed-source)
- Advanced features like batching, streaming, auto-scaling, routing via gateways, and unified observability
In this demo, we’ll use an OPEA LLM endpoint in a Haystack pipeline, giving you:
- Instant HTTP access to any hosted model.
- Seamless switching from small prototypes to production-grade RAG solutions.
In this tutorial, we’ll build a simple RAG pipeline that fetches the newest Hacker News posts, sends them to a local OPEA endpoint running a Qwen/Qwen2.5-7B-Instruct
demo model, and produces concise one-sentence summaries. Of course, you can replace our example model with any other OPEA-served model, making this pattern both lightweight for prototyping and powerful for real-world deployments. Let’s get started! ๐
2. Prerequisites
Make sure you have:
- Python 3.9+
- Install dependencies:
pip install haystack-ai haystack-opea newspaper3k lxml[html_clean]
- A running OPEA endpoint at http://localhost:9000/v1 (or your own)
NOTE: As a reference, here is a Docker Compose recipe to get you started. OPEA LLM service can be configured to use a variety of model serving backends like TGI, vLLM, ollama, OVMS… and offers validated runtime settings for good performance on various hardware’s including Intel Gaudi. In this example, it creates an OPEA LLM service with a TGI backend. See the documentation for LLM Generation. The code is based on OPEA LLM example and OPEA TGI example.
To run, call
LLM_MODEL_ID=Qwen/Qwen2.5-7B-Instruct docker compose up
.
3. Building the News Fetcher
We’ll create a custom Haystack component, HackernewsNewestFetcher
, that:
- Calls the Hacker News API to get the latest story IDs
- Filters for posts containing URLs
- Downloads & parses each article with
newspaper3k
. - Wraps results in Haystack
Document
objects
from typing import List
from haystack import component, Document
from newspaper import Article
import requests
@component
class HackernewsNewestFetcher():
@component.output_types(documents=List[Document])
def run(self, last_k: int):
# Fetch the IDs of the newest stories
newest_list = requests.get(
url='https://hacker-news.firebaseio.com/v0/newstories.json?print=pretty'
)
url_list = []
# Keep only the first `last_k` IDs with URLs
for id in newest_list.json()[0:last_k]:
article = requests.get(
url=f"https://hacker-news.firebaseio.com/v0/item/{id}.json?print=pretty"
)
if 'url' in article.json():
url_list.append(article.json()['url'])
docs = []
# Download and parse each article
for url in url_list:
try:
article = Article(url)
article.download()
article.parse()
docs.append(
Document(
content=article.text,
meta={'title': article.title, 'url': url}
)
)
except Exception:
print(f"Couldn't download {url}, skipped")
return {"documents": docs}
4. Integrating the LLM (OPEAGenerator)
We use the OPEAGenerator
to call our LLM over HTTP. Here, we point to a local endpoint serving the Qwen/Qwen2.5-7B-Instruct
model:
from haystack_opea import OPEAGenerator
llm = OPEAGenerator(
"http://localhost:9000/v1", # Your OPEA endpoint
"Qwen/Qwen2.5-7B-Instruct", # Model name
model_arguments={"max_tokens": 2000} # Generation settings
)
5. Crafting the Prompt
Using PromptBuilder
, we define a Jinja-style template that:
- Lists each article’s title, content, and URL.
- Asks the model for a one-sentence summary plus the URL.
from haystack.components.builders import PromptBuilder
prompt_template = """
You will be provided a few of the latest posts in HackerNews, followed by their URL.
For each post, provide a one sentence summary, followed by the original post URL.
Posts:
{% for doc in documents %}
{{doc.meta['title']}}:
{{doc.content}}
URL: {{doc.meta['url']}}
{% endfor %}
"""
prompt_builder = PromptBuilder(template=prompt_template)
6. Building the Pipeline
We wire up the components in a Pipeline
:
from haystack import Pipeline
# Instantiate components
fetcher = HackernewsNewestFetcher()
# Build and connect
pipe = Pipeline()
pipe.add_component("hackernews_fetcher", fetcher)
pipe.add_component("prompt_builder", prompt_builder)
pipe.add_component("llm", llm)
# Define data flow
pipe.connect("hackernews_fetcher.documents", "prompt_builder.documents")
pipe.connect("prompt_builder.prompt", "llm.prompt")
# Visualize the pipeline
pipe.show() # Interactive usage
pipe.draw(pipeline.png) # Plotting to file
7. Running the Pipeline
Fetch and summarize the top 2 newest Hacker News posts:
result = pipe.run(data={"hackernews_fetcher": {"last_k": 2}})
print(result['llm']['replies'][0])
8. Results
A course on using Large Language Models (LLMs) to understand and structure search queries without relying on external services is being offered, demonstrating how LLMs can significantly improve and automate search capabilities.
[URL: https://softwaredoug.com/blog/2025/04/08/llm-query-understand]
The European Commission's new ProtectEU security strategy proposes enhanced tools for law enforcement, including methods to access encrypted data, raising significant concerns about potential violations of privacy and civil liberties.
[URL: https://www.cloudwards.net/news/protecteu-security-strategy-raises-encryption-concerns/]
Beautiful, concise summaries in seconds! โจ
9. Conclusion
In this tutorial, we built a full RAG pipeline:
- Custom news fetcher for Hacker News.
- Lightweight LLM integration via
OPEAGenerator
. - Jinja-templated prompt for structured input.
- Haystack pipeline orchestration.
Feel free to extend this setup with more advanced retrieval, caching, or different LLM backends. Happy coding! ๐ ๏ธ๐ฅ