Anthropic cuts Claude Code cache window from one hour to five minutes, users report faster quota drain

The AI developer says the change should not increase costs, but forum complaints suggest some users are seeing heavier consumption

by Defused News Writer
Photo by Mohammad Rahmani / Unsplash

Anthropic has reduced the time-to-live (TTL) for the Claude Code prompt cache from one hour to five minutes for many requests, prompting complaints from users who say their allocated quotas are draining more quickly as a result.

Prompt caching allows AI models to store and reuse context from earlier in a session, reducing the number of tokens processed and the associated cost or quota consumption.
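To illustrate why a shorter TTL can matter, here is a toy sketch in Python. It is not Anthropic's billing logic: the function name, the session timings, and the assumption that every request resets the cache timer are all hypothetical, chosen only to show how idle gaps interact with a cache window.

```python
def count_cache_misses(gaps_seconds, ttl_seconds):
    """Count requests that must rebuild the cached context.

    gaps_seconds: idle time (in seconds) before each follow-up request.
    ttl_seconds:  how long the cached context survives without use.
    Assumption (hypothetical model): every request resets the TTL timer.
    """
    misses = 1  # the first request always has to populate the cache
    for gap in gaps_seconds:
        if gap > ttl_seconds:
            misses += 1  # cache expired during the pause, context is reprocessed
    return misses


# Hypothetical coding session: pauses between consecutive requests, in seconds.
gaps = [30, 120, 600, 45, 1800, 90, 400]

five_minute_misses = count_cache_misses(gaps, 5 * 60)
one_hour_misses = count_cache_misses(gaps, 60 * 60)

print(f"5-minute TTL: {five_minute_misses} full context rebuilds")
print(f"1-hour TTL:   {one_hour_misses} full context rebuilds")
```

In this made-up session, three pauses exceed five minutes but none exceeds an hour, so the shorter window forces four full rebuilds of the context against one. Whether real usage looks like this depends on how a given user works and on how the provider meters cached versus uncached tokens.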

Anthropic said in an announcement that the change "should not increase costs," but a forum bug report filed by user Sean Swanson indicated the one-hour cache had previously been a feature of Claude Code, and multiple commenters said the shorter TTL appears to be affecting how quickly their quotas deplete.

Participants in the thread raised broader concerns about transparency, arguing that opaque backend changes can alter usage patterns and effective value without advance notice to users.

Some commenters suggested Anthropic may need to revisit its pricing or usage rationing if capacity constraints persist following the adjustment.

The recap

  • Anthropic changed Claude Code prompt cache TTL for many requests.
  • TTL was shortened from one hour to five minutes.
  • Users report faster quota depletion after the change last month.