NVIDIA unveils AI inference memory platform powered by BlueField-4

The world's largest company by market capitalization has introduced a new storage platform designed to improve performance and efficiency for large-scale AI inference workloads.

by Defused News Writer

NVIDIA said it has launched an Inference Context Memory Storage Platform powered by its BlueField-4 data processing unit, unveiling the technology at CES.

The company said artificial intelligence models generate large volumes of context data in the form of key-value caches, which are critical for accuracy, continuity and user experience. It added that storing this data long-term on graphics processors can create bottlenecks for real-time inference, particularly in multi-agent systems.

According to NVIDIA, the new platform extends effective GPU memory capacity and enables high-speed sharing of context data across rack-scale clusters. The company said this can increase tokens processed per second by up to five times while delivering up to five times greater power efficiency compared with conventional storage approaches.

The platform uses NVIDIA’s DOCA framework and integrates with the NVIDIA NIXL library and NVIDIA Dynamo software to accelerate key-value cache sharing, reduce time to first token and improve responsiveness across multiple interactions, the company said. It added that hardware-accelerated cache placement managed by BlueField-4 reduces data movement, removes metadata overhead and provides secure, isolated access for GPU nodes.

NVIDIA said storage suppliers including AIC, Cloudian, DDN, Dell Technologies, Hewlett Packard Enterprise, Hitachi Vantara, IBM, Nutanix, Pure Storage, Supermicro, VAST Data and WEKA are among the first to build next-generation AI storage platforms using BlueField-4.

The company said BlueField-4 is expected to be available in the second half of 2026.

The Recap

  • BlueField-4 powers a new Inference Context Memory Storage Platform.
  • Platform can boost tokens per second by up to 5x.
  • BlueField-4 will be available in the second half of 2026.