At last week's Google I/O 2026, Google announced that the Gemini app would move from fixed message limits to a new quota system based on compute resource consumption. The goal was simple: lightweight text prompts should cost less, while complex video or coding tasks consume more. Soon after launch, however, many users complained that their quotas drained too quickly, especially when uploading large files or running heavy tasks such as "Deep Research." Google responded quickly, with Gemini lead Josh Woodward confirming that the company is adjusting the system to improve the user experience.
Woodward said Google has now capped the quota that a single prompt can consume, so that users can get more out of the Pro model. In addition, prompts that use the Flash-Lite model remain free and do not count toward quotas, and failed requests do not deduct quota either. Google has also fixed a bug behind the abnormal quota usage that some users reported with the video generation feature "Omni." To increase transparency, Google plans to roll out more detailed usage breakdowns and real-time notifications that help users understand where their quota is going. For now, the dashboard at gemini.google.com/usage shows only a high-level overview. In the future, Google will also let users purchase pay-as-you-go top-up AI credits.
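To make the three accounting rules concrete (a per-prompt cap, free Flash-Lite prompts, and no charge for failed requests), here is a minimal Python sketch. It is purely illustrative: Google has not published the actual formula, so the cap value, the cost units, and every name in it (`PER_PROMPT_CAP`, `Prompt`, `quota_charge`, `remaining_quota`) are assumptions made up for this example.

```python
from dataclasses import dataclass

# Hypothetical placeholders: the real cap and cost units are not public.
PER_PROMPT_CAP = 50            # assumed maximum units one prompt may deduct
FREE_MODELS = {"flash-lite"}   # Flash-Lite prompts do not count toward quota


@dataclass
class Prompt:
    model: str              # e.g. "pro" or "flash-lite"
    raw_cost: int           # assumed compute-based cost before any cap
    succeeded: bool = True  # False when the request failed


def quota_charge(prompt: Prompt) -> int:
    """Return the units deducted for one prompt under the three rules."""
    if not prompt.succeeded:
        return 0  # failed requests do not deduct quota
    if prompt.model in FREE_MODELS:
        return 0  # free model, never counted
    return min(prompt.raw_cost, PER_PROMPT_CAP)  # per-prompt cap


def remaining_quota(balance: int, prompts: list[Prompt]) -> int:
    """Apply each prompt's charge in order, never going below zero."""
    for p in prompts:
        balance = max(0, balance - quota_charge(p))
    return balance


history = [
    Prompt("pro", 10),                    # light text prompt: charged 10
    Prompt("pro", 400),                   # heavy task: capped at 50
    Prompt("flash-lite", 30),             # free model: charged 0
    Prompt("pro", 80, succeeded=False),   # failed request: charged 0
]
print(remaining_quota(200, history))
```

With these made-up numbers, a starting balance of 200 units ends at 140: only the light prompt and the capped heavy prompt are charged. Without the cap, the single 400-unit task would have emptied the balance on its own, which is the kind of rapid drain users complained about.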

