Anthropic has reduced the time-to-live (TTL) for the Claude Code prompt cache from one hour to five minutes for many requests, prompting complaints from users who say their allocated quotas are draining more quickly as a result.
Prompt caching allows AI models to store and reuse context from earlier in a session, reducing the number of tokens processed and the associated cost or quota consumption.
Anthropic said in an announcement that the change "should not increase costs," but a forum bug report filed by user Sean Swanson indicated the one-hour cache had previously been a feature of Claude Code, and multiple commenters said the shorter TTL appears to be affecting how quickly their quotas deplete.
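To illustrate why the TTL matters, the toy sketch below counts cache hits and misses for one hypothetical session under each setting. It is not Anthropic's implementation or billing logic. The function name `count_cache_hits`, the request timings, and the assumption that each request writes or refreshes the cache entry are all illustrative.

```python
def count_cache_hits(request_times_s, ttl_s):
    """Return (hits, misses) for requests made at the given times in seconds.

    Toy model: a single cache entry whose expiry is reset on every request
    (an assumed "sliding" TTL, not a confirmed detail of Claude Code).
    """
    hits = 0
    misses = 0
    expires_at = None  # no cache entry exists before the first request
    for t in sorted(request_times_s):
        if expires_at is not None and t < expires_at:
            hits += 1    # context is still cached and can be reused
        else:
            misses += 1  # context has expired and must be processed again
        expires_at = t + ttl_s  # entry is written or refreshed
    return hits, misses


# Hypothetical session: requests at 0, 2, 4, 14, 16 and 50 minutes.
session = [minute * 60 for minute in (0, 2, 4, 14, 16, 50)]

print(count_cache_hits(session, ttl_s=5 * 60))   # five-minute TTL: (3, 3)
print(count_cache_hits(session, ttl_s=60 * 60))  # one-hour TTL: (5, 1)
```

In this made-up session, any pause longer than five minutes turns a reuse into a miss under the shorter TTL. That is the pattern commenters say they are seeing, though the actual effect on quotas depends on details the sketch does not model.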
Participants in the thread raised broader concerns about transparency, arguing that opaque backend changes can alter usage patterns and effective value without advance notice to users.
Some commenters suggested Anthropic may need to revisit its pricing or usage rationing if capacity constraints persist following the adjustment.
The recap
- Anthropic changed Claude Code prompt cache TTL for many requests.
- TTL was shortened from one hour to five minutes.
- Users report that quotas have depleted faster since the change last month.