Claude Opus 4.6 tends to “quietly get expensive” not because of how many times you ask, but because of how much content you dump in at once and how long-winded you let its output be. The cost-saving techniques below involve no mysticism: the core idea is to cut irrelevant context and repeated computation, so that Claude Opus 4.6 spends its usage on the steps that matter.
Set the goal before you start: split one long conversation into three parts
In Claude Opus 4.6, longer input and longer output generally mean higher usage. Break your request into three parts, “clarification → plan → execution,” and have each part do only one thing, rather than asking it to analyze, write a finished deliverable, and handle formatting all in the same turn. Because each turn has a single narrow goal, you tend to reach the same result with fewer back-and-forth rounds.
In practice, start by setting boundaries in a single sentence, such as: “Give me only 3 options, each no more than 5 lines.” Claude Opus 4.6 is very sensitive to how clear the instructions are: the clearer you are, the less likely it is to go off-topic and need a rewrite, and every avoided rewrite is usage saved.
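The three-part split and the one-sentence boundary can be sketched as a small prompt builder. This is only an illustration: the phase names follow the text above, but `PHASES`, `build_prompt`, and the exact wording of each template are hypothetical, and the default limits simply reuse the “3 options, 5 lines” example.

```python
# Hypothetical helper: the template wording and limits are illustrative assumptions,
# not an official prompt format.
PHASES = {
    "clarification": (
        "Ask me up to {n} clarifying questions about the task below. "
        "Do not propose solutions yet."
    ),
    "plan": (
        "Give me only {n} options for the task below, "
        "each no more than {max_lines} lines. Do not write the deliverable yet."
    ),
    "execution": (
        "Carry out the chosen option for the task below. "
        "Output only the deliverable, at most {max_lines} lines."
    ),
}


def build_prompt(phase: str, task: str, n: int = 3, max_lines: int = 5) -> str:
    """Return a single-purpose prompt for one phase, with explicit output bounds."""
    if phase not in PHASES:
        raise ValueError(f"unknown phase: {phase!r}")
    instruction = PHASES[phase].format(n=n, max_lines=max_lines)
    return f"{instruction}\n\nTask: {task}"
```

Calling `build_prompt("plan", "Write a launch email")` produces a prompt that asks for options only, so the turn cannot drift into writing the deliverable.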
Compress context: replace “chat history” with a “reusable summary”
Many people are reluctant to start a new chat, so the conversation grows longer and longer, and Claude Opus 4.6 has to process a large chunk of history on every turn. A more cost-efficient approach: once you reach an interim conclusion, ask Claude Opus 4.6 to output a “project summary + constraints + confirmed conclusions,” then start a new chat and paste in only that summary.
Keep the summary as a bullet list rather than a restatement of the original text; you can also ask it to “keep only information that affects what comes next.” This way you keep the context you need and remove the irrelevant chatter from the bill.
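One way to keep the handoff consistent is to store the summary in a fixed shape and render it as bullets before pasting it into the new chat. The sketch below assumes a hypothetical `Handoff` class and `SUMMARY_REQUEST` string; the three fields mirror the “project summary + constraints + confirmed conclusions” structure described above, and nothing here calls any API.

```python
from dataclasses import dataclass, field

# Hypothetical request text to send at the end of the old chat (wording is an assumption).
SUMMARY_REQUEST = (
    "Output a project summary, the constraints, and the confirmed conclusions "
    "as a bullet list. Keep only information that affects what comes next."
)


@dataclass
class Handoff:
    """Reusable summary carried from a long chat into a fresh one."""
    project: str
    constraints: list[str] = field(default_factory=list)
    conclusions: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Render the summary as a compact bullet list to paste into a new chat."""
        lines = [f"Project summary: {self.project}", "Constraints:"]
        lines += [f"- {c}" for c in self.constraints]
        lines.append("Confirmed conclusions:")
        lines += [f"- {c}" for c in self.conclusions]
        return "\n".join(lines)
```

For example, `Handoff("Landing page copy", ["Max 200 words"], ["Use option B"]).render()` yields a five-line block that replaces the whole previous conversation.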
Streamline attachments and citations: feed only the “necessary pages,” don’t dump in the whole book
In Claude Opus 4.6, uploading long documents or long screenshots often forces the model to “read everything,” and usage jumps immediately. A more reliable approach is to narrow the scope yourself first: upload only the relevant pages or chapters, or copy the key paragraphs out as plain text and send that. Plain text is often easier to control than an entire PDF, and it makes it easier to specify which parts you want processed.
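If the document is already available as plain text, narrowing the scope can be partly automated. The following is a minimal sketch under stated assumptions: `select_paragraphs` is a hypothetical helper, paragraphs are assumed to be separated by blank lines, matching is plain keyword search, and the character budget is only a rough stand-in for usage, not an exact measure.

```python
def select_paragraphs(text: str, keywords: list[str], max_chars: int = 4000) -> str:
    """Keep paragraphs that mention any keyword, in original order, within a budget.

    Assumptions: paragraphs are separated by blank lines, and max_chars is a
    rough proxy for how much text you are willing to send.
    """
    wanted = [k.lower() for k in keywords]
    kept: list[str] = []
    used = 0
    for para in text.split("\n\n"):
        para = para.strip()
        # Skip empty paragraphs and paragraphs that mention none of the keywords.
        if not para or not any(k in para.lower() for k in wanted):
            continue
        # Stop as soon as the next matching paragraph would exceed the budget.
        if used + len(para) > max_chars:
            break
        kept.append(para)
        used += len(para)
    return "\n\n".join(kept)
```

The result is what you paste into the chat, together with an instruction naming exactly which part to process.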


