In response to widespread user complaints that complex prompts were rapidly exhausting quotas, Google has announced emergency adjustments to Gemini’s compute-based usage limit system. At I/O 2026 last week, the company switched Gemini to a new mechanism that meters usage by prompt complexity, model used, tool calls, and chat length. However, some users found that complex tasks such as processing large files or videos could instantly drain their daily quota. Josh Woodward, who heads Google Gemini, confirmed in a recent statement that the company will cap the consumption of any individual Gemini 3.1 Pro request, so that users get more usable output from the Pro model.
The specific adjustments are as follows. Gemini 3.1 Flash-Lite prompts are now completely free and do not count toward user quotas. Failed requests also consume no usage: "system errors are on us." For heavy tasks such as Deep Research, Google will provide more detailed usage breakdowns and real-time notifications to help users plan their usage. The gemini.google.com/usage dashboard currently shows only an overview; in the future it will display the compute cost of each prompt, so users can see why complex queries burn through their allowance faster.
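To make the announced rules concrete, here is a minimal Python sketch of how such metering could work. It is purely illustrative: Google has not published its formula, so the weights, the cap value, the model identifier strings, and the function names are all assumptions made for this example.

```python
from dataclasses import dataclass

# Every number and identifier below is a hypothetical placeholder,
# not a value published by Google.
PRO_REQUEST_CAP = 100.0                      # assumed per-request ceiling
FREE_MODELS = {"gemini-3.1-flash-lite"}      # prompts that cost nothing
CAPPED_MODELS = {"gemini-3.1-pro"}           # models with a per-request cap
MODEL_WEIGHT = {"gemini-3.1-pro": 10.0, "gemini-3.1-flash-lite": 1.0}


@dataclass
class Request:
    model: str
    complexity: float      # assumed unitless score for prompt complexity
    tool_calls: int        # number of tool invocations
    chat_length: int       # number of prior turns in the conversation
    succeeded: bool = True


def raw_cost(req: Request) -> float:
    """Uncapped cost from the four factors named in the article."""
    workload = req.complexity + 2.0 * req.tool_calls + 0.1 * req.chat_length
    return MODEL_WEIGHT[req.model] * workload


def charged_cost(req: Request) -> float:
    """Cost actually deducted from the quota after the announced adjustments."""
    if not req.succeeded:
        return 0.0         # failed requests consume no usage
    if req.model in FREE_MODELS:
        return 0.0         # Flash-Lite prompts do not count toward quotas
    cost = raw_cost(req)
    if req.model in CAPPED_MODELS:
        cost = min(cost, PRO_REQUEST_CAP)   # a single Pro request is capped
    return cost
```

Under these assumed weights, a heavy Pro request with an uncapped cost of 700 units would be charged only the 100-unit cap, while the same request sent to Flash-Lite, or one that failed, would be charged nothing.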

