If you often use Claude for development, scripting, or generating long-form text, this Workbench update will save you more back-and-forth. The key changes focus on long-output capability, prompt assistance, side-by-side evaluation, and clearer usage and cost tracking. Below, I’ll break down Claude’s new features by real usage scenarios and explain them clearly.
Claude Sonnet 3.5 Long Output: Increased from 4096 to 8192
In the API, Claude Sonnet 3.5 doubles the maximum output token limit from 4096 to 8192, so long code and long reports are no longer frequently cut off. To enable extended output, you need to include the specified beta request header in your request. For generation tasks that need a “single-pass final draft,” this change is the most immediately impactful.
Add the following when calling: anthropic-beta: max-tokens-3-5-sonnet-2024-07-15, then set max_tokens as needed. It’s recommended to also state structural requirements clearly (such as sections, lists, and return format); otherwise, even with longer output, Claude’s response may become more loosely organized.
Prompt Generator: Turn Requirement Descriptions into Usable Prompts
The Workbench now includes a prompt generator. You only need to describe the task in natural language (for example, “classify and handle inbound customer support requests”), and Claude will produce a more complete prompt draft. Its value isn’t in “fancier writing,” but in filling in easy-to-miss pieces like roles, input/output constraints, and boundary conditions.
For day-to-day internal tools or PoCs, you can first have Claude produce a runnable prompt, then fine-tune fields and examples based on business rules. This is faster than writing a prompt from scratch and makes it easier to turn into a team template over time.
Evaluation Mode: Side-by-Side Comparison of Multiple Prompt Outputs
If you wanted to compare two prompt variants for the same task before, you had to copy and paste back and forth. Now, Evaluation Mode in the Workbench can display the outputs from two or more prompts side by side, and record ratings of Claude’s results on a 5-point scale.


