Gemini audio models gain native live‑voice updates

Google rolled out an updated Gemini 2.5 Flash Native Audio model across Google AI Studio and Vertex AI, and has begun deploying it in Gemini Live and Search Live, the company said. The update brings native audio to Search Live for the first time and introduces a beta live speech translation experience in the Google Translate app, the company added.

The company said the model was improved in three areas: more reliable function calling, stronger instruction following, and better multi‑turn conversation quality. On ComplexFuncBench Audio, an evaluation capturing multi‑step function calling, the model scored 71.5%. The company reported a 90% adherence rate to developer instructions, up from 84%, and said the model retrieves context from previous turns more effectively.

Customers testing the model include Shopify, United Wholesale Mortgage and Newo.ai. “Users often forget they’re talking to AI within a minute of using Sidekick, and in some cases have thanked the bot after a long chat…New Live API AI capabilities offered through Gemini [2.5 Flash Native Audio] empower our merchants to win,” David Wurtz, VP of Product at Shopify, said.

Gemini’s live speech translation supports continuous listening and two‑way conversation while preserving the speaker’s intonation, pacing and pitch, the company said. It can translate across more than 70 languages and 2,000 language pairs, handle multilingual input, auto‑detect the spoken language, and filter ambient noise. The company said the Translate app beta is rolling out to Android devices in the US, Mexico and India, with iOS and more regions coming soon, and that it plans to expand the feature to more Google products, including the Gemini API.

The company said Gemini 2.5 Flash Native Audio is generally available on Vertex AI and available as a preview in the Gemini API, and that Gemini 2.5 Flash and 2.5 Pro text‑to‑speech models are available via the Gemini API in Google AI Studio.

The Recap

Google updated Gemini 2.5 Flash Native Audio for live voice agents.
Model scored 71.5% on the ComplexFuncBench Audio evaluation.
Live speech translation beta rolling out in Google Translate app.
