Day 7: Judging Toys, Tracing Joy 🧑‍⚖️
Santa collapsed into his chair with a huff, settling heavily next to Mrs. Claus.
🤶: “What’s wrong?”
🎅: “There’s just too many toys to check and not enough time! Christmas is almost here!”
🤶: “Well, can’t you just check some of them?”
🎅: “I wish it were that easy! But my elves make so many different toys, and we have to ensure every kid gets the right one!”
Elf Jane overheard the conversation from the next room. As a regular attendee at the North Pole Hackathon, she had learned a lot about evaluation recently and thought she might have a solution. “What if I build an LLM Judge to help?” she thought. “I can use Arize Phoenix to log everything—like why this toy was the perfect match or why it wasn’t!”
For this challenge, you will help Elf Jane by:
- Using a Haystack pipeline to find the best toy for each child in the Big Elf Database of Christmas Wishlists (BEDCW)
- Evaluating all toy matches using an LLM-as-a-Judge
- Monitoring the system with the open-source tracing and evaluation tool, Arize Phoenix (a minimal sketch of all three steps follows this list)
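Here is a minimal, untested sketch of how those three pieces could fit together. It assumes the `haystack-ai`, `arize-phoenix`, and `openinference-instrumentation-haystack` packages are installed and that `OPENAI_API_KEY` is set in the environment; the prompts, project name, model choice, and wishlist data are illustrative placeholders, not the official solution.

```python
# Sketch only: prompts, project name, and data below are hypothetical.
import phoenix as px
from phoenix.otel import register
from openinference.instrumentation.haystack import HaystackInstrumentor

from haystack import Pipeline
from haystack.components.builders import ChatPromptBuilder
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage

# Launch a local Phoenix instance and route all Haystack spans to it.
px.launch_app()
tracer_provider = register(project_name="toy-judge")  # hypothetical project name
HaystackInstrumentor().instrument(tracer_provider=tracer_provider)

# Pipeline 1: match a toy to a child's wishlist.
match_prompt = [ChatMessage.from_user(
    "Child's wishlist: {{ wishlist }}\n"
    "Available toys: {{ toys }}\n"
    "Pick the single best toy for this child and explain why."
)]
matcher = Pipeline()
matcher.add_component("prompt_builder", ChatPromptBuilder(template=match_prompt))
matcher.add_component("llm", OpenAIChatGenerator(model="gpt-4o-mini"))
matcher.connect("prompt_builder.prompt", "llm.messages")

# Pipeline 2: an LLM-as-a-Judge that scores the match.
judge_prompt = [ChatMessage.from_user(
    "Wishlist: {{ wishlist }}\n"
    "Chosen toy and reasoning: {{ match }}\n"
    "On a scale of 1-5, how well does this toy fit the wishlist? "
    "Answer with the score followed by a one-sentence justification."
)]
judge = Pipeline()
judge.add_component("prompt_builder", ChatPromptBuilder(template=judge_prompt))
judge.add_component("llm", OpenAIChatGenerator(model="gpt-4o-mini"))
judge.connect("prompt_builder.prompt", "llm.messages")

# A hypothetical BEDCW entry.
wishlist = "A red fire truck with a working ladder"
toys = "teddy bear, fire truck with ladder, puzzle, toy kitchen"

match = matcher.run({"prompt_builder": {"wishlist": wishlist, "toys": toys}})
match_text = match["llm"]["replies"][0].text  # recent haystack-ai; older versions use .content

verdict = judge.run({"prompt_builder": {"wishlist": wishlist, "match": match_text}})
print(verdict["llm"]["replies"][0].text)
```

Once the instrumentor is registered, every pipeline run appears as a trace in the Phoenix UI, so Elf Jane can inspect both the matcher's pick and the judge's verdict for each child.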
🎯 Requirements:
- An OpenAI API key if you'd like to use OpenAIChatGenerator, but you can choose any other LLM that is supported by Haystack Generators (see the sketch below)
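For example, if you'd rather not use OpenAI, you could swap in another chat generator. The sketch below assumes the `ollama-haystack` integration package is installed and a local Ollama server is running; the model name is illustrative.

```python
# Hypothetical swap: Haystack chat generators share the same "messages" input,
# so this can stand in for OpenAIChatGenerator in the pipelines above.
# Assumes `pip install ollama-haystack` and a running Ollama server.
from haystack_integrations.components.generators.ollama import OllamaChatGenerator

llm = OllamaChatGenerator(model="llama3.2")  # model name is illustrative
```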
💡 Some Hints:
- Take a look at this example notebook: Tracing and Evaluating a Haystack Application with Phoenix
- Find more examples in Arize Phoenix Docs
🩵 Here’s the Starter Colab