If you typically use Claude for code generation, long-form summarization, or batch document analysis, several recent Claude API updates will noticeably change your workflow: the core model has been upgraded to Claude Sonnet 3.5, the API output limit now supports expansion, and the console now includes more intuitive usage and cost tracking. Below, I’ll break it down item by item in terms of “changes you can use immediately.”
Claude Sonnet 3.5: A stronger default choice in the same tier
Claude Sonnet 3.5 is positioned by the official team as the flagship model in the mid-range speed and cost bracket, yet it outperforms the previous generation’s higher-end models across multiple evaluations. For developers, this means that in many scenarios you don’t need to force a more expensive model just to get better results—simply switching the default model in the Claude API to Sonnet 3.5 can deliver more reliable code quality and text reasoning capabilities.
In practical terms, it’s better suited for requests that need to be fast, accurate, and able to handle relatively complex instructions—such as fixing bugs, writing unit tests, explaining stack traces, or consolidating multiple materials into an actionable checklist.
Claude API Expanded Output: Max output increases from 4096 to 8192
One of the most useful changes this time is that Claude Sonnet 3.5 in the Claude API supports “expanded output,” doubling the maximum output token limit from 4096 to 8192. Long code completions, long report generation, and summaries with tables are much less likely to get “cut off halfway through.”
Enabling it is straightforward: include the specified beta request header in your call (the anthropic-beta field provided in the docs). It’s also recommended that you ask in the prompt to “output by sections, list the table of contents first, then write the main body,” and then pair that with a higher max tokens setting for more stable long-form output.


