This update to Claude’s new features is more geared toward developers and team collaboration: on the one hand, it enables Claude Sonnet 3.5 to produce longer outputs; on the other hand, it makes the Claude Console workbench more “evaluable and reusable.” If you’re using the Claude API for customer support, content generation, or automated workflows, these new Claude features will directly affect performance and cost.
Claude Sonnet 3.5 Extended Output: Longer answers are less likely to be cut off
In the Claude API, the maximum output token limit for Claude Sonnet 3.5 has been increased from 4096 to 8192. This new Claude feature is especially noticeable for long-form summarization, long code generation, and explanations of large tables. In the past, you might have needed to split content across multiple turns to finish it; now you can output a more complete result in one go.
Note that extended output is not “enabled by default.” When calling the Claude API, you need to add the request header: anthropic-beta: max-tokens-3-5-sonnet-2024-07-15, and set max_tokens to the range you need in order to benefit from this improvement in Claude’s new features.
Claude Console Workbench Upgrade: The prompt generator saves more time
The Claude Console workbench has added a “prompt generator.” You only need to describe the task in one sentence (for example, “categorize and handle inbound customer support requests”), and Claude will produce a more structured prompt draft. This new Claude feature is suitable for quickly turning individual experience into reusable prompt templates for the team.
In practice, it’s recommended that you fill in the inputs: goals, constraints, output format, and error-handling approach. This makes the generated prompts more stable and more aligned with Claude’s response style.
Evaluation Mode Launch: Compare prompts side by side, and score the results
The workbench’s “evaluation mode” lets you display the outputs of two or more prompts side by side and rate Claude’s outputs on a 5-point scale. This new Claude feature is ideal for A/B testing: for the same task, try different prompts to see which is more accurate, more consistent, and less verbose.


