If you’ve recently been building chat applications or automation workflows, several Claude API updates will directly affect “how long it can write, how to tune prompts, and how to control costs.” This article clarifies three areas—extended output, Workbench tools, and usage statistics—so you can apply them to your existing projects right away.
Sonnet Extended Output: Generate Longer Content in One Go
Claude API has increased the maximum output of Claude Sonnet 3.5 from 4096 to 8192 tokens, making long reports, long emails, or multi-part code generation much smoother. To enable extended output, you need to add a specific beta header to your request: anthropic-beta: max-tokens-3-5-sonnet-2024-07-15.
A practical approach is: first set max_tokens to match your target length, then constrain the output with a segmented structure (subheadings + bullet points) to avoid “writing a lot but drifting off-topic.” When you use the Claude API for chained tasks like summarization + rewriting + polishing, extended output can reduce the number of multi-round requests.
Workbench Enhancements: A More Useful Prompt Generator and Evaluation Mode
In the Claude Console Workbench, the new “Prompt Generator” is great for quickly drafting task templates: you describe the goal (e.g., “categorize incoming customer support requests”), and it produces a prompt structure you can paste directly into the Claude API. For new projects, this saves time compared with starting from a blank prompt.
“Evaluation Mode” is suited for A/B testing: run two prompts side by side on the same batch of inputs, then compare output quality using a 5-point rating scale. You can turn evaluation results into team standards, so all subsequent Claude API calls use the same baseline prompt.


