This time we’ll look at several practical additions to the Claude API: prompt caching, citations and search-result content blocks, and finer-grained control over tool calling. None of them is flashy, but each can noticeably affect cost, latency, and controllability. Below, we break them down from the angle of how you can actually use them.
Prompt caching: store repeated system prompts in advance
If your Claude API workload includes a large amount of repeated prompt content (for example, unified customer-service scripting rules, fixed extraction formats, or long business context), prompt caching is a great fit. According to the official documentation, reusing cached prompt prefixes can cut latency by up to about 85% and costs by up to about 90%, which is especially friendly for batch tasks.
In practice, split the long-lived, unchanging parts into a cacheable prefix and put the user input that changes on each request in the subsequent messages. That way the Claude API reuses the cached prefix instead of billing you the full input-token price for the same long prompt on every call; cache reads are charged at a fraction of the normal input rate.
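A minimal sketch of that split, following the Messages API's `cache_control` convention. The rules text and model id are placeholders (in real use the cached prefix must exceed the model's minimum cacheable length, 1024 tokens for most models), and the request parameters are built as a plain dict so the actual SDK call stays separate:

```python
# Placeholder for the long, stable system prompt you want cached.
# In production this would be your real scripting rules / business context.
LONG_RULES = "You are a customer-service assistant. " + "Follow the style guide. " * 200


def build_request(user_input: str) -> dict:
    """Build Messages API parameters with the stable prefix marked cacheable."""
    return {
        "model": "claude-sonnet-4-20250514",  # assumed model id; swap in your own
        "max_tokens": 1024,
        "system": [
            {
                "type": "text",
                "text": LONG_RULES,
                # Everything up to and including this block is cached;
                # the changing user message below is not.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [{"role": "user", "content": user_input}],
    }


params = build_request("How do I reset my password?")
# With the official SDK you would then call:
#   client = anthropic.Anthropic()
#   response = client.messages.create(**params)
# and inspect response.usage.cache_creation_input_tokens /
# response.usage.cache_read_input_tokens to confirm the cache is being hit.
```

Keeping the cacheable prefix byte-for-byte identical across requests is what makes the cache hit; even a small edit to the rules text forces a fresh cache write.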
Citations and search-result content blocks: making RAG easier to do right
The Claude API provides a citations capability that attributes the key statements in an answer to their sources. For knowledge-base Q&A or retrieval-augmented generation, citations reduce the awkwardness of answers that sound right but have no evidence, and they make it easier to display sources in the frontend for users to verify.
In addition, search-result content blocks are now an official capability, giving you a citable structure for handing external retrieval results to the model. You can have the Claude API attach citation markers when it summarizes, and then decide on the application side whether to enforce a rule like “no citation, no conclusion.”
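A sketch of packaging your own retrieval results as `search_result` content blocks with citations enabled. The field names follow the documented block shape; the URL, title, and snippet text below are invented for illustration:

```python
def make_search_result(source: str, title: str, snippet: str) -> dict:
    """Wrap one retrieved document in a citable search_result block."""
    return {
        "type": "search_result",
        "source": source,  # where the result came from, e.g. a URL or doc id
        "title": title,
        "content": [{"type": "text", "text": snippet}],
        "citations": {"enabled": True},  # let the model cite this block
    }


# One user turn mixing retrieval results with the actual question.
user_turn = {
    "role": "user",
    "content": [
        make_search_result(
            "https://example.com/kb/refunds",  # hypothetical knowledge-base page
            "Refund policy",
            "Refunds are available within 30 days of purchase.",
        ),
        {"type": "text", "text": "What is the refund window?"},
    ],
}
# Passed via client.messages.create(messages=[user_turn], ...), the reply's
# text blocks can carry a `citations` list pointing back at these results,
# which your application can require before surfacing any conclusion.
```

Because the blocks come from your own retriever rather than a built-in search tool, this pattern works with any RAG pipeline: the model only ever cites material you explicitly handed it.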


