If you’ve recently been using the Claude API for summarization, coding, or generating long-form text, the most noticeable change is that it “can output more,” and the developer console is also more usable. This article breaks down several new Claude API features: how to enable long outputs, how to use the Workbench for prompt evaluation, and how to understand costs in the dashboard.
Claude API long output: Sonnet 3.5 increased from 4096 to 8192
The Claude API has increased the maximum output token limit for Claude Sonnet 3.5 to 8192, but it must be explicitly enabled. When calling the Claude API, add anthropic-beta to the request headers to enable the longer output window—useful for generating more complete reports, long code files, or multi-part summaries in one go.
The exact format is straightforward: add anthropic-beta: max-tokens-3-5-sonnet-2024-07-15 to the request headers. If you run into “output truncated” in the Claude API, first check whether you forgot this toggle and whether your max_tokens is set high enough.
A smoother Workbench: prompt generator and evaluation mode
In the Claude Console Workbench, the Claude API debugging experience has been strengthened with two key tools. The first is the “Prompt Generator”: you simply describe the task goal (for example, “classify incoming customer support requests”), and it produces a well-structured prompt draft that you can copy directly into the Claude API.
The second is “Evaluation Mode”: run two or more prompts side-by-side on the same batch of inputs, compare the outputs together, and even rate performance on a 5-point scale. For Claude API use cases that require stable output (support routing, information extraction, compliance rewrites), this step can significantly reduce guesswork in prompt tuning.


