6/9/2026, 3:26:45 AM · 4 tests · run 6f9c98b6-f141-497f-93ea-fb113d118235
Plain streaming generation with a soft length constraint (~100 words).
{ "wordCount": 100, "outputTokens": 140 }
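A minimal sketch of how a soft length check like this one could be scored. The whitespace-based word count and the 20% tolerance are assumptions for illustration; the report does not say how the harness counts words or how far from the target it allows.

```python
def count_words(text: str) -> int:
    # Assumption: a "word" is any whitespace-delimited token.
    return len(text.split())


def within_soft_limit(text: str, target: int = 100, tolerance: float = 0.2) -> bool:
    # Hypothetical tolerance: accept outputs within +/-20% of the target length.
    words = count_words(text)
    return abs(words - target) <= target * tolerance
```

Under these assumptions a 100-word response passes for a target of 100, and so would one of 115 words, which is why the constraint is described as soft.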
Offers a calculator tool and checks that the model calls it for an arithmetic question.
{ "toolCalled": true, "toolName": "multiply", "args": { "a": 23, "b": 17 }, "correct": true }
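One way the tool-call check above could be expressed, as a sketch only. The shape of the `call` record, the helper name `check_tool_call`, and the reading of `correct` as "the right tool was called with the expected operands" are all assumptions; the report does not define them.

```python
from typing import Optional


def multiply(a: float, b: float) -> float:
    # Stand-in for the calculator tool offered to the model.
    return a * b


def check_tool_call(call: Optional[dict], expected_args: dict) -> dict:
    # `call` is a hypothetical record: {"name": ..., "args": {...}}, or None
    # when the model answered without calling any tool.
    if call is None:
        return {"toolCalled": False, "toolName": None, "args": None, "correct": False}
    # Assumption: "correct" means the expected tool was called with the expected operands.
    correct = call.get("name") == "multiply" and call.get("args") == expected_args
    return {
        "toolCalled": True,
        "toolName": call.get("name"),
        "args": call.get("args"),
        "correct": correct,
    }
```

With the operands logged above, `multiply(23, 17)` evaluates to 391, which is the value a harness would hand back to the model as the tool result.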
Requests a strict JSON object and checks that it parses with the required keys.
{ "parsed": { "name": "Anthropic", "founded": 2021, "headquarters": "San Francisco, California" }, "validJson": true, "missingKeys": [], "bareJson": false }
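A sketch of a validator that would produce fields like those above. It assumes that `bareJson` records whether the raw response was nothing but the JSON object (no surrounding prose or code fence), and that a wrapped object is recovered by taking the outermost brace pair; both are guesses at the harness's behavior, and `check_json_output` is a hypothetical name.

```python
import json

# Keys taken from the logged result; the extraction strategy is an assumption.
REQUIRED_KEYS = ("name", "founded", "headquarters")


def check_json_output(raw: str) -> dict:
    text = raw.strip()
    bare = True
    try:
        # First attempt: the whole response is JSON and nothing else.
        parsed = json.loads(text)
    except json.JSONDecodeError:
        bare = False
        # Fallback: parse the span from the first "{" to the last "}".
        start, end = text.find("{"), text.rfind("}")
        try:
            parsed = json.loads(text[start:end + 1]) if 0 <= start < end else None
        except json.JSONDecodeError:
            parsed = None
    valid = isinstance(parsed, dict)
    missing = [k for k in REQUIRED_KEYS if not valid or k not in parsed]
    return {
        "parsed": parsed if valid else None,
        "validJson": valid,
        "missingKeys": missing,
        "bareJson": bare and valid,
    }
```

Under this reading, a result with `validJson` true and `bareJson` false means the object parsed and had every required key, but arrived wrapped in extra text.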
Summarizes an ~11k-token essay to ~500 words (long input, reasoning minimized).
{ "wordCount": 545, "inputTokens": 10602, "outputTokens": 772, "latencyMs": 11839 }
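The logged fields support a few derived figures, sketched below. The metric names and the helper `summary_metrics` are hypothetical and not part of the report; the 500-word target comes from the test description.

```python
def summary_metrics(word_count: int, input_tokens: int, output_tokens: int,
                    latency_ms: float, target_words: int = 500) -> dict:
    return {
        # End-to-end rate: latency includes input processing, not only decoding.
        "tokensPerSecond": output_tokens / (latency_ms / 1000),
        # Number of input tokens per output token.
        "compressionRatio": input_tokens / output_tokens,
        # Signed fractional overshoot (positive) or undershoot relative to the target.
        "targetDeviation": (word_count - target_words) / target_words,
    }


metrics = summary_metrics(word_count=545, input_tokens=10602,
                          output_tokens=772, latency_ms=11839)
```

For the values above this gives roughly 65 output tokens per second end to end, and a summary 9% over the 500-word target.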