
Nvidia says cost per token is the only metric that matters for AI infrastructure

The chipmaker argues buyers who focus on raw compute costs are measuring the wrong thing entirely

by Defused News Writer

Nvidia is pushing enterprises to judge artificial intelligence infrastructure on cost per token, the all-in expense of producing each unit of AI output, rather than on raw compute metrics such as FLOPS per dollar or GPU hourly rates.

The argument positions modern data centres as what Nvidia calls "AI token factories," where inference, the process of running models to generate responses, has overtaken storage as the dominant workload.

Nvidia's case rests on a distinction between the numerator and denominator of the cost equation: buyers typically focus on GPU hourly rates, but the real leverage lies in how many tokens those GPUs actually deliver.

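As a rough illustration of that numerator-and-denominator split, the sketch below computes cost per million tokens from a GPU's hourly rate and its delivered throughput. The helper name and all figures are hypothetical, not vendor pricing.

```python
# Cost-per-token arithmetic with hypothetical figures (not vendor pricing).

def cost_per_million_tokens(gpu_hourly_rate_usd: float,
                            tokens_per_second: float) -> float:
    """All-in cost of producing one million tokens on one GPU."""
    tokens_per_hour = tokens_per_second * 3600
    return gpu_hourly_rate_usd / tokens_per_hour * 1_000_000

# A pricier GPU that delivers far more tokens is cheaper where it counts.
slow = cost_per_million_tokens(gpu_hourly_rate_usd=2.0, tokens_per_second=500)
fast = cost_per_million_tokens(gpu_hourly_rate_usd=4.0, tokens_per_second=10_000)
print(f"${slow:.2f} vs ${fast:.2f} per million tokens")  # $1.11 vs $0.11
```

Doubling the hourly rate while delivering twenty times the tokens leaves the more expensive machine ten times cheaper per token, which is the substance of Nvidia's argument.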
The company identifies a range of technical factors that determine delivered token output: interconnect architecture for mixture-of-experts models, FP4 precision support, speculative decoding, KV-aware routing and disaggregated serving, as well as the capacity to handle agentic AI workloads that demand ultra-low latency and long sequence lengths.

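To make one of those factors concrete, here is a toy sketch of greedy speculative decoding: a cheap draft model proposes several tokens and an expensive target model verifies them. The "models" are stand-in functions written for this illustration, not any real inference stack.

```python
# Toy greedy speculative decoding (stand-in models, not a real stack).
import random

random.seed(0)
VOCAB = ["the", "cat", "sat", "on", "the", "mat"]

def draft_model(context):
    """Cheap drafter: fast but sometimes wrong."""
    return random.choice(VOCAB)

def target_model(context):
    """Expensive target: deterministic 'correct' next token."""
    return VOCAB[len(context) % len(VOCAB)]

def speculative_decode(prompt, n_tokens, k=4):
    """Accept drafted tokens while they match the target's choice;
    on the first mismatch, emit the target's token and redraft.
    (A real system verifies all k drafts in one batched target pass,
    so accepted tokens cost a fraction of a full decode step.)"""
    out = list(prompt)
    while len(out) - len(prompt) < n_tokens:
        ctx = list(out)
        drafts = []
        for _ in range(k):
            token = draft_model(ctx)
            drafts.append(token)
            ctx.append(token)
        for token in drafts:
            expected = target_model(out)
            out.append(expected)  # the output always matches the target
            if token != expected:
                break             # mismatch: discard the remaining drafts
    return out[len(prompt):len(prompt) + n_tokens]

print(speculative_decode(["the"], 8))
```

The output is identical to what the target model alone would produce; the technique only changes how many expensive passes are needed to get there, which is exactly the kind of lever that raises tokens delivered per GPU-hour.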
Nvidia uses its own Blackwell generation of chips to illustrate the argument, contrasting it with the prior Hopper generation.

On surface metrics, Blackwell can look like the weaker buy: it appears roughly twice as expensive on a compute-cost basis, and FLOPS per dollar suggests only a 2x improvement. But Nvidia and the SemiAnalysis InferenceX v2 benchmark indicate Blackwell delivers more than 50 times greater token output per watt, which translates to nearly 35 times lower cost per million tokens.

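The ratios fit together straightforwardly: if the hardware costs about 2x more per hour, a roughly 70x increase in delivered tokens per hour would yield the claimed 35x lower cost per million tokens. The 70x figure is inferred here purely to reconcile the article's numbers, not a figure Nvidia states (its stated 50x is per watt, a different denominator).

```python
# Reconciling the article's ratios (normalised units, illustration only).
price_ratio = 2.0        # Blackwell vs Hopper, cost per GPU-hour (article)
throughput_ratio = 70.0  # tokens per GPU-hour (inferred, hypothetical)

cost_per_token_ratio = price_ratio / throughput_ratio
print(f"cost per token falls by {1 / cost_per_token_ratio:.0f}x")  # -> 35x
```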
Nvidia attributes the gain to hardware and software codesign, combined with continued optimisation across inference stacks including vLLM, SGLang, Nvidia TensorRT-LLM and Nvidia Dynamo.

Cloud and infrastructure partners including CoreWeave, Nebius, Nscale and Together AI have already deployed Blackwell infrastructure at scale, according to the company.

The framing is an implicit challenge to rival chip vendors and cloud providers that compete on headline compute pricing, redirecting the commercial conversation toward a metric where Nvidia's latest hardware shows the most favourable numbers.

"Cost per token determines whether enterprises can profitably scale AI," Nvidia said.

The recap

  • Industry evaluation is shifting to a cost-per-token metric.
  • Blackwell delivers nearly 35x lower cost per million tokens.
  • Cloud partners deploy Nvidia Blackwell infrastructure at scale.