
Microsoft brings high-speed AI model engine Fireworks AI to its Azure cloud platform

The partnership gives developers faster, cheaper access to open AI models with enterprise security controls built in

by Defused News Writer

Microsoft has added Fireworks AI, a high-performance artificial intelligence inference engine, to its Azure cloud platform in public preview, giving developers a faster and more flexible way to run open AI models at scale.

Inference, in this context, refers to the process of running a trained AI model to generate responses, the step that happens every time a user sends a message or an application requests an output.

The partnership pairs Fireworks AI's speed-focused engine with Microsoft Foundry, Azure's platform for building, deploying and managing AI models and agents, combining raw processing performance with the security, compliance and governance controls that large organisations require.

The move responds to a growing preference among businesses for open models, AI systems whose underlying weights and architecture are publicly available. These offer more control over cost, customisation and vendor independence than proprietary alternatives tied to a single provider.

Fireworks AI says its engine already processes more than 13 trillion tokens daily, handling around 180,000 requests per second and generating more than 1,000 tokens per second on large models. Tokens are the small chunks of text that AI models read and produce.
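As a rough sanity check on those figures, the quoted daily total can be converted into per-second and per-request averages. The inputs come from the article; the derived numbers are illustrative back-of-envelope arithmetic, not Fireworks' own accounting:

```python
# Back-of-envelope check on the throughput figures quoted above.
# Input numbers are from the article; derived values are illustrative.

TOKENS_PER_DAY = 13e12          # "more than 13 trillion tokens daily"
REQUESTS_PER_SECOND = 180_000   # "around 180,000 requests per second"
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

tokens_per_second = TOKENS_PER_DAY / SECONDS_PER_DAY
tokens_per_request = tokens_per_second / REQUESTS_PER_SECOND

print(f"{tokens_per_second:,.0f} tokens/s fleet-wide")      # ~150 million
print(f"{tokens_per_request:,.0f} tokens per request avg")  # ~836
```

That works out to roughly 150 million tokens per second across the fleet, or about 840 tokens per average request, consistent with typical chat-length responses.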

Models available through the integration include DeepSeek V3.2, OpenAI's gpt-oss-120b, Kimi K2.5 and MiniMax M2.5.

Developers can upload their own customised or compressed model weights through a bring-your-own-weights option, and choose between paying per token used or buying provisioned throughput units (PTUs), a fixed-capacity arrangement that delivers more predictable performance for high-volume applications.
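The choice between the two billing modes is essentially a volume question: pay-per-token scales linearly with usage, while a PTU is a fixed monthly cost. A minimal sketch of the breakeven logic, with entirely hypothetical prices (the article quotes no pricing; `PRICE_PER_1M_TOKENS` and `PTU_MONTHLY_COST` are made-up placeholders):

```python
# Hypothetical breakeven between pay-per-token and provisioned throughput
# units (PTUs). Prices below are invented placeholders for illustration;
# real Azure/Fireworks pricing is not given in the article.

PRICE_PER_1M_TOKENS = 0.50      # hypothetical $ per million tokens
PTU_MONTHLY_COST = 10_000.00    # hypothetical $ per month for one PTU bundle

def cheaper_option(tokens_per_month: float) -> str:
    """Return whichever billing mode is cheaper at the given volume."""
    pay_as_you_go = tokens_per_month / 1_000_000 * PRICE_PER_1M_TOKENS
    return "PTU" if PTU_MONTHLY_COST < pay_as_you_go else "pay-per-token"

# At these placeholder prices, breakeven is 20 billion tokens/month.
print(cheaper_option(1e9))    # low volume  -> pay-per-token
print(cheaper_option(50e9))   # high volume -> PTU
```

Beyond raw cost, the article notes that PTUs also buy more predictable performance, which can tip the decision for latency-sensitive, high-volume applications even below the pure cost breakeven.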

Yina Arenas, Microsoft's corporate vice president for AI platform, said the integration offers unified governance, observability and agent-ready tooling for production use.

Fireworks AI models are available now in the Foundry model catalogue, with serverless and PTU deployment options.

The recap

  • Microsoft Foundry begins public preview of Fireworks AI inference.
  • Fireworks' engine processes more than 13 trillion tokens daily.
  • Developers can deploy serverlessly or buy provisioned throughput units (PTUs).
