Subscribe to Our Newsletter

Success! Now Check Your Email

To complete Subscribe, click the confirmation link in your inbox. If it doesn’t arrive within 3 minutes, check your spam folder.

Ok, Thanks

OpenAI builds hybrid billing engine to keep Codex and Sora running at peak demand

The system blends rate limits with purchasable credits so users are not cut off when traffic spikes

Defused News Writer profile image
by Defused News Writer
OpenAI builds hybrid billing engine to keep Codex and Sora running at peak demand
Photo by SumUp / Unsplash

OpenAI has built a real-time access engine that combines rate limits with purchasable credits, allowing users of its Codex coding agent and Sora video generator to keep working when demand exceeds baseline capacity.

The company said existing approaches forced a choice between hard rate limits, which cut users off at a fixed threshold, and pure usage-based billing, which charges from the first request.

Its hybrid system instead enforces rate limits first and transitions seamlessly to credit-based access within the same request when a user's allocation runs out, with the decision made in real time.

Under the hood, the engine tracks per-user, per-feature usage, maintains rate-limit windows and live credit balances, and settles charges through a streaming asynchronous processor.

Every request follows a single evaluation path that consumes rate limits synchronously, checks credit balances when needed and returns one definitive outcome before debiting credits asynchronously.

OpenAI said the billing architecture ties together three datasets: product usage events, monetisation events and balance updates, enabling independent audit and reconciliation.

Transactions carry stable idempotency keys, unique identifiers that prevent the same charge from being applied twice, to guard against double debits.

Balance updates are applied asynchronously but in near real time, with automatic refunds issued if an update overshoots a user's balance.

The company said balance decrements and their corresponding records are written in a single atomic database transaction and serialised per account to prevent race conditions, a class of bug in which simultaneous operations produce incorrect results.

Codex and Sora are the first products to use the system, but OpenAI said the infrastructure is designed to extend to additional services over time.

The Recap

  • OpenAI built real-time access engine for Codex and Sora.
  • System tracks usage, rate-limit windows, and credit balances.
  • Billing uses events, monetisation, and balance updates for audit.
Defused News Writer profile image
by Defused News Writer

Read More