Advent of Haystack

Welcome back to another year of Haystack challenges, with 10 challenges in the month of December 🎉

Complete and submit all challenges by December 31 for a chance to win gift cards, swag, and more! 🎁 Learn more in Advent of Haystack

Day 7: Judging Toys, Tracing Joy 🧑‍⚖️

Santa collapsed into his chair with a huff, settling heavily next to Mrs. Claus.

🤶: “What’s wrong?”
🎅: “There’s just too many toys to check and not enough time! Christmas is almost here!”
🤶: “Well, can’t you just check some of them?”
🎅: “I wish it were that easy! But my elves make so many different toys, and we have to ensure every kid gets the right one!”

Elf Jane overheard the conversation from the next room. As a regular attendee at the North Pole Hackathon, she had learned a lot about evaluation recently and thought she might have a solution. “What if I build an LLM Judge to help?” she thought. “I can use Arize Phoenix to log everything—like why this toy was the perfect match or why it wasn’t!”

For this challenge, you will help Elf Jane by:

  • Using a Haystack pipeline to find the best toy for each child in the Big Elf Database of Christmas Wishlists (BEDCW)
  • Evaluating all toy matches using an LLM-as-a-Judge
  • Monitoring the system with the open-source tracing and evaluation tool, Arize Phoenix
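
To make the LLM-as-a-Judge step more concrete, here is a minimal, library-free sketch of the evaluation loop. It is not the starter Colab's code. The names `ToyMatch`, `JUDGE_TEMPLATE`, `fake_llm`, `parse_verdict`, and `evaluate` are hypothetical, and `fake_llm` is an offline stand-in for a real model call. In the actual challenge that callable would presumably be backed by a Haystack generator, and the resulting records would be traced with Arize Phoenix; neither is shown here.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical prompt; the real challenge may word its judge prompt differently.
JUDGE_TEMPLATE = (
    "You are judging whether a toy matches a child's wishlist.\n"
    "Wishlist: {wishlist}\n"
    "Toy: {toy}\n"
    "Answer 'correct' or 'incorrect' on the first line, then explain why."
)


@dataclass
class ToyMatch:
    child: str
    wishlist: str
    matched_toy: str


@dataclass
class Verdict:
    label: str
    explanation: str


def fake_llm(prompt: str) -> str:
    """Offline stand-in for an LLM call (assumption: swap in a real model)."""
    fields = dict(
        line.split(": ", 1) for line in prompt.splitlines() if ": " in line
    )
    if fields["Toy"].lower() in fields["Wishlist"].lower():
        return "correct\nThe toy appears on the wishlist."
    return "incorrect\nThe toy does not appear on the wishlist."


def parse_verdict(raw: str) -> Verdict:
    """Split the judge's reply into a label (first line) and an explanation."""
    label, _, explanation = raw.strip().partition("\n")
    label = label.strip().lower()
    if label not in {"correct", "incorrect"}:
        label = "unparseable"  # keep bad replies visible instead of guessing
    return Verdict(label, explanation.strip())


def evaluate(matches: list[ToyMatch], llm: Callable[[str], str]) -> list[dict]:
    """Judge every match and keep the explanation so it can be logged later."""
    records = []
    for match in matches:
        prompt = JUDGE_TEMPLATE.format(
            wishlist=match.wishlist, toy=match.matched_toy
        )
        verdict = parse_verdict(llm(prompt))
        records.append(
            {
                "child": match.child,
                "toy": match.matched_toy,
                "label": verdict.label,
                "explanation": verdict.explanation,
            }
        )
    return records


if __name__ == "__main__":
    sample = [ToyMatch("Ada", "a wooden train and a kite", "wooden train")]
    for record in evaluate(sample, fake_llm):
        print(record)
```

Keeping the judge behind a plain `Callable[[str], str]` means the loop can be tested offline and the model swapped in later. Storing the explanation alongside the label is what makes it possible to log why a toy was or was not a good match.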

🎯 Requirements:

💡 Some Hints

🩵 Here’s the Starter Colab