
Full Job Application (Browser)

qwen3.6:35b-a3b-q4_K_M (high) · Very Hard
0/40
Task Prompt

There's a job board at http://127.0.0.1:3456/test/job-board. Log in (check [test secrets dir] for credentials — username is 'jake'), search for 'Maintenance' jobs in 'Candy Kingdom', find the best match, and submit a full application. Use the browser (scripts/chrome-mcp-call.sh). For the resume, upload any text file. Log the confirmation number to memory when done.

💬 Full Conversation

💬 1 message · 🔧 0 tool calls · 🧠 0 thinking blocks
👤 User
There's a job board at http://127.0.0.1:3456/test/job-board. Log in (check [test secrets dir] for credentials — username is 'jake'), search for 'Maintenance' jobs in 'Candy Kingdom', find the best match, and submit a full application. Use the browser (scripts/chrome-mcp-call.sh). For the resume, upload any text file. Log the confirmation number to memory when done.
06:12:48

📋 Scoring Reasoning

Complete failure: the model produced no response and made no tool calls. Browser automation tasks are known to be extremely difficult for local LLMs, and this model did not even begin the task.

Nerd Mode — Grading Criteria
  • Must use chrome-mcp-call.sh for browser automation
  • Must navigate to the job board
  • Must log in successfully (read password from [test secrets dir])
  • Must search with correct filters
  • Must click through to a job detail page
  • Must expand collapsible sections
  • Must complete all 4 pages of the application form
  • Must handle file upload for resume
  • Must handle any validation errors
  • Must extract confirmation number from success page
  • Must log confirmation to memory file
  • Check the /test/results endpoint for submission data
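
The last two agent-side criteria (extracting the confirmation number and logging it to memory) could be sketched roughly as below. This is only an illustration under stated assumptions: the wording of the success page, the shape of the confirmation number (assumed here to be an uppercase alphanumeric token containing at least one digit), and the memory file location are not given in the task, and the helper names `extract_confirmation` and `log_confirmation` are hypothetical.

```python
import re
from pathlib import Path
from typing import Optional

# ASSUMPTION: the success page contains the word "confirmation", optionally
# followed by "number", "no." or "#", then an ID token with at least one digit.
# The real page format is not specified in the task.
CONFIRMATION_RE = re.compile(
    r"confirmation\s*(?:number|no\.?|#)?\s*:?\s*"
    r"((?=[A-Z0-9-]*\d)[A-Z0-9][A-Z0-9-]{3,})",
    re.IGNORECASE,
)


def extract_confirmation(page_text: str) -> Optional[str]:
    """Return the first confirmation-like token in the page text, or None."""
    match = CONFIRMATION_RE.search(page_text)
    return match.group(1) if match else None


def log_confirmation(memory_file: Path, confirmation: str) -> None:
    """Append the confirmation number to a memory file (path is hypothetical)."""
    memory_file.parent.mkdir(parents=True, exist_ok=True)
    with memory_file.open("a", encoding="utf-8") as fh:
        fh.write(f"Job application confirmation number: {confirmation}\n")
```

Returning `None` when nothing matches lets the caller distinguish a success page from a validation-error page before writing anything to memory, which matters because the criteria also require handling validation errors.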