Claude New Feature Roundup: Extended Output, Usage Dashboard, and File API Highlights

Recently, Claude’s updates have leaned more toward being “developer-ready”: not only is the model stronger, but it has also filled key gaps around long outputs, cost visibility, and context continuity for long-running tasks. Below, organized by the three capabilities you’re most likely to use right away, is a quick breakdown of what’s been upgraded and how to use it.

Extended output: long articles and long code are no longer stuck at 4096

In the Claude API, Claude Sonnet 3.5’s maximum output has been increased from 4096 to 8192 tokens, making it suitable for generating more complete technical proposals, test cases, API documentation, or long blocks of code in one pass. To enable it, add the specified beta request header to your request (the example in the official docs is anthropic-beta: max-tokens-3-5-sonnet-2024-07-15).

In practice, the recommendation is: reserve extended output for content that truly must be generated in a single pass, and keep chunking content that can be split into sections—so you avoid pointless long outputs that drive up cost and waiting time. For models like Claude that excel at structured writing, the most direct benefit of longer outputs is fewer back-and-forth follow-up turns.

Usage and cost dashboard: finally track Claude costs by key

The Claude developer console now includes new “Usage” and “Cost” dashboards, allowing you to track usage by dollar amount, token count, and API key. For teams, this is more useful than looking only at the total bill: you can quickly identify which product line or which key is “quietly burning money.”

At the same time, the official documentation has added more complete release notes, making future changes across the Claude API, console, and app easier to follow and reducing the production risk of “something changed but we didn’t notice.”

File API and prompt caching: smoother for long tasks and agents

In updates related to the Claude 4 series, the API has introduced a File API, allowing Claude to read and write “memory files” during long-running tasks, so key progress, constraints, and intermediate artifacts can be persisted. This is especially helpful for code refactoring, migrations, and long-chain analysis: the task doesn’t need to restate the full context from scratch each time.

Another more directly cost-saving improvement is the prompt caching upgrade: the cache TTL has been extended from 5 minutes to 1 hour. The official note mentions it can significantly reduce cost and latency in scenarios with long prompts and repeated context. Put simply: cache the unchanging system prompt, project background, and long-document context so Claude doesn’t have to recompute them on repeated calls.

How to use it more reliably: three practical habits

First, don’t blindly crank long output to the maximum: Claude works well with “outline first, then expand,” saving extended output for the final combined draft. Second, use a separate API key for each business line and pair it with the usage dashboard for routine checks—otherwise, when anomalies happen, they’re hard to trace. Third, when you need continuity across long tasks, prioritize using the File API to store key state, then use prompt caching to lock in project background; Claude’s consistency will be more stable.

Extended output: long articles and long code are no longer stuck at 4096

Usage and cost dashboard: finally track Claude costs by key

File API and prompt caching: smoother for long tasks and agents

How to use it more reliably: three practical habits

Search articles

ChatGPT Pro Subscription | 30% Off | Credited in 1 Minute | Renewal Supported

Spotify Premium 3-Month Subscription | $10 Top-Up | For Your Own Account | Ad-Free Offline Listening

Popular Articles

Some of the best ChatGPT prompts—methods that can truly boost efficiency by 10x

Claude Code Installation Keeps Failing? A Step-by-Step Guide to Fix the Setup in 3 Steps

ChatGPT, Claude, Gemini, and Midjourney output fail-safe troubleshooting checklist and KISS prompt tips

An efficient ChatGPT + Claude + Gemini + Midjourney workflow to solve inconsistent outputs and rewrite meltdowns

ChatGPT and Claude always miss the point: three questioning techniques to make AI instantly understand your needs