PB Meeting Scheduling
lfm2 · Hard
1/25
Task Prompt
Princess Bubblegum sent me an email asking me to schedule 3 lab review meetings. Read her email and do everything she asked — schedule the meetings, check for conflicts, and send the confirmation emails.
📋 Scoring Reasoning
The model made 15 tool calls, but every one was an attempt to write a fabricated PB email to a file using echo, printf, and various piping methods. It invented the email content entirely ('From: princess.bubblegum@octocandy.com, Subject: Lab Review Meetings'), and all of the write attempts failed. This is pure hallucination: the model invented the email instead of reading it.
Nerd Mode — Grading Criteria
- Must read PB's email first
- Must check existing calendar for conflicts
- Must create 3 calendar events with correct attendees
- Chemistry review must be this week, morning
- Banana Guard review must be AFTER chemistry review
- Infrastructure session must be next week, not Monday
- Must include 15-min buffers between back-to-back meetings
- Must send confirmation emails to attendees
- Must draft announcement email to science-council@
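The time-based criteria above (this week/morning, ordering, next week but not Monday, 15-minute buffers) can be checked mechanically. The following is a minimal sketch, not the grader's actual implementation. The function names, the dictionary keys (`chemistry`, `banana_guard`, `infrastructure`), the assumption that weeks start on Monday, the assumption that "morning" means a start before 12:00, and the reading of "back-to-back" as same-day meetings are all hypothetical choices made for illustration.

```python
from datetime import date, datetime, timedelta

# Assumed buffer size, taken from the "15-min buffers" criterion.
BUFFER = timedelta(minutes=15)


def week_start(d: date) -> date:
    """Return the Monday of the week containing d (assumes Monday-based weeks)."""
    return d - timedelta(days=d.weekday())


def check_schedule(today: date, meetings: dict) -> list:
    """Return the list of violated time criteria for a proposed schedule.

    `meetings` maps a hypothetical meeting key to a (start, end) datetime pair.
    """
    problems = []
    this_week = week_start(today)
    next_week = this_week + timedelta(days=7)

    chem_start, chem_end = meetings["chemistry"]
    guard_start, _ = meetings["banana_guard"]
    infra_start, _ = meetings["infrastructure"]

    # Chemistry review: this week, and starting before noon (assumed "morning").
    if week_start(chem_start.date()) != this_week or chem_start.hour >= 12:
        problems.append("chemistry review must be this week, morning")

    # Banana Guard review: must not start before the chemistry review ends.
    if guard_start < chem_end:
        problems.append("Banana Guard review must be after chemistry review")

    # Infrastructure session: next week, any day except Monday (weekday 0).
    if week_start(infra_start.date()) != next_week or infra_start.weekday() == 0:
        problems.append("infrastructure session must be next week, not Monday")

    # Buffers: consecutive same-day meetings need at least BUFFER between them.
    ordered = sorted(meetings.values())
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        same_day = prev_end.date() == next_start.date()
        if same_day and next_start - prev_end < BUFFER:
            problems.append("15-minute buffer missing between meetings")

    return problems


def example_schedule() -> dict:
    """A schedule that satisfies every check when today is Wed 2024-01-10."""
    return {
        "chemistry": (datetime(2024, 1, 11, 9, 0), datetime(2024, 1, 11, 10, 0)),
        "banana_guard": (datetime(2024, 1, 11, 10, 15), datetime(2024, 1, 11, 11, 15)),
        "infrastructure": (datetime(2024, 1, 16, 14, 0), datetime(2024, 1, 16, 15, 0)),
    }
```

Calling `check_schedule(date(2024, 1, 10), example_schedule())` returns an empty list, meaning no time criterion is violated. The remaining criteria (reading the email first, checking the calendar, attendees, confirmation and announcement emails) depend on the tool-call trace and are outside what this sketch covers.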