Introduction to Claude’s new features: extended thinking, the Files API, and prompt caching upgrades

This round of Claude updates doesn’t focus on “chatting better,” but instead pushes Claude a major step toward executable, sustainable workflows: the model adds an extended thinking mode; the API fills in the Files API and prompt caching; and the developer console makes cost tracking easier. Below, I’ll break down Claude’s new features by use case and explain them clearly.

Model-level update: extended thinking makes reasoning more reliable

Claude’s next-generation models offer two modes: instant replies, and extended thinking for deeper reasoning. For complex problems, long-chain decisions, or tasks that require repeated validation, Claude is more inclined to carry the reasoning through fully rather than just give a conclusion that “seems right.”

If you often use Claude for code design, debugging, or solution reviews, the value of extended thinking will be more evident: it’s better at clearly laying out assumptions, boundary conditions, and verifiable steps, reducing the back-and-forth needed to fill in missing information.

Agent capability upgrade: better endurance for long-running tasks

Claude’s agent capabilities have been enhanced, making it more friendly for “keep-running” task scenarios: it can maintain a to-do list within long-running workflows and, when granted local file access permissions, preserve continuity by writing to memory files. The officially disclosed capability limit supports independent operation on the order of several hours.

At the same time, Claude supports parallel handling of multiple tool calls, making tasks that require “look up information + edit files + run steps” more fluid. For teams, upgrades like these are more practical than incremental improvements in single-answer quality.

New Claude API features: Files API, prompt caching, and longer outputs

On the Claude API side, the Files API is now available. Developers can have Claude read and write files for context handoff in long-running tasks and for “persisting memory to disk.” This is even more critical for agent applications, batch reviews, and continuous report generation.

Prompt caching has also seen significant upgrades: the cache TTL has been extended, making it suitable for long prompts and business scenarios that repeatedly call the same context. The official direction is to reduce cost and latency. Another practical change is that Claude Sonnet 3.5 supports extended output, increasing the maximum output tokens from 4096 to 8192 (enabled by using the specified beta request header per the documentation).

A more usable console: usage & cost dashboards and release notes

The Claude developer console has added usage and cost dashboards, allowing tracking by USD amount, token count, and API key. For teams with multiple projects and multiple keys, this quickly helps pinpoint “who is burning tokens.”

In addition, Claude’s release notes are more complete, making updates across the API, console, and app easier to track. It’s recommended that before upgrading a Claude model or enabling a new beta capability, you first cross-check the release notes and run through a regression checklist—this can help you avoid a lot of pitfalls.

Model-level update: extended thinking makes reasoning more reliable

Agent capability upgrade: better endurance for long-running tasks

New Claude API features: Files API, prompt caching, and longer outputs

A more usable console: usage & cost dashboards and release notes

Search articles

ChatGPT Pro Subscription | 30% Off | Credited in 1 Minute | Renewal Supported

Spotify Premium 3-Month Subscription | $10 Top-Up | For Your Own Account | Ad-Free Offline Listening

Popular Articles

Some of the best ChatGPT prompts—methods that can truly boost efficiency by 10x

Claude Code Installation Keeps Failing? A Step-by-Step Guide to Fix the Setup in 3 Steps

ChatGPT, Claude, Gemini, and Midjourney output fail-safe troubleshooting checklist and KISS prompt tips

An efficient ChatGPT + Claude + Gemini + Midjourney workflow to solve inconsistent outputs and rewrite meltdowns

ChatGPT and Claude always miss the point: three questioning techniques to make AI instantly understand your needs