Default interruption handling is not enough for production call centres, ElevenLabs warns

ElevenLabs has published a technical assessment warning that its native barge-in handling, the mechanism that allows a caller to interrupt an AI agent mid-sentence, is insufficient for production call centre deployments without additional engineering work.

The report also cites an estimate that poor customer service costs US businesses $62 billion per year.

Barge-in, or interruption detection, is the capability that determines whether a voice agent stops talking and begins processing a new response when a human caller speaks over it.

The problem is harder than it appears because reliable interruption handling is not a single component but a chain of dependent systems, each of which must perform within tight latency constraints for the overall behaviour to feel natural.

The chain runs from voice activity detection, which identifies whether a human is speaking, through streaming speech-to-text transcription, which must produce stable interim results quickly enough to act on, to text-to-speech cancellation, which must stop audio playback at the right moment without creating an audible glitch.
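The last link in that chain, stopping playback part-way through an utterance, can be sketched with Python's standard asyncio library. This is an illustrative sketch only, not ElevenLabs' implementation: `speak` and `run_turn` are hypothetical names, and the sleeps stand in for real audio chunks and for the moment at which voice activity detection and an interim transcript signal an interruption.

```python
import asyncio

async def speak(chunks, played):
    """Stand-in for TTS playback: 'plays' one audio chunk at a time."""
    for chunk in chunks:
        played.append(chunk)
        await asyncio.sleep(0.02)  # assumed chunk duration of 20 ms

async def run_turn(chunks, interrupt_after_s):
    """Start playback, then cancel it when a (simulated) interruption arrives."""
    played = []
    playback = asyncio.create_task(speak(chunks, played))
    # Stand-in for the VAD and interim-transcript stages deciding to barge in.
    await asyncio.sleep(interrupt_after_s)
    playback.cancel()
    try:
        await playback
    except asyncio.CancelledError:
        pass  # expected: playback was stopped mid-utterance
    return played

if __name__ == "__main__":
    heard = asyncio.run(run_turn([f"chunk-{i}" for i in range(50)], 0.05))
    print(f"caller heard {len(heard)} of 50 chunks before the agent stopped")
```

Cancelling between chunks rather than mid-chunk is one simple way to avoid cutting audio at an arbitrary sample, although a production system would likely also fade out the final chunk.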

ElevenLabs acknowledges that its platform handles straightforward interruptions automatically, but does not expose the low-level controls that engineering teams need to tune behaviour for difficult conditions: variable voice activity detection thresholds, overlapping-speech handling and conditional logic that can distinguish a genuine interruption from background noise or affirmative sounds such as "mm-hmm" that should not trigger a stop.
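To make the distinction concrete, here is a minimal sketch of the kind of conditional logic described, assuming the team has access to an interim transcript and the duration of detected speech. The word list, the 150 ms threshold and the function names are all hypothetical and are not part of any ElevenLabs or Deepgram API.

```python
import re

# Assumed set of affirmations that should not stop the agent.
BACKCHANNEL_WORDS = {"mm-hmm", "mhm", "uh-huh", "yeah", "yes",
                     "right", "ok", "okay", "sure"}

def tokenize(transcript):
    """Lowercase, drop punctuation except hyphens and apostrophes, split into words."""
    cleaned = re.sub(r"[^a-z0-9'\- ]+", " ", transcript.lower())
    return cleaned.split()

def should_interrupt(transcript, speech_ms, min_speech_ms=150):
    """Return True only when the caller's speech looks like a genuine interruption."""
    words = tokenize(transcript)
    if not words:
        return False  # speech detected but nothing recognised: treat as noise
    if speech_ms < min_speech_ms:
        return False  # too short to act on (assumed threshold)
    # Stop the agent unless every word is an affirmation.
    return not all(word in BACKCHANNEL_WORDS for word in words)
```

In this sketch, "Mm-hmm." leaves the agent talking, while "Actually, I need to change my address" stops it. A real deployment would need to tune both the word list and the threshold against recorded calls.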

The latency budget is unforgiving.

Natural human conversation typically has gaps between turns of under 300 milliseconds, meaning a voice AI system has roughly a third of a second to detect an interruption, stop generating audio and begin processing the new input before the interaction starts to feel broken.

Under telephony conditions, with added network latency, compressed audio and the acoustic variability of real call centre environments, including background noise and diverse accents, that budget becomes even tighter.
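The arithmetic behind that budget can be illustrated with a short sketch. The 300 millisecond figure comes from the assessment as reported above; the per-stage latencies are invented placeholders, and teams would substitute their own measurements.

```python
from typing import Mapping

TURN_GAP_BUDGET_MS = 300  # turn-gap figure cited in the article

# Hypothetical per-stage latencies in milliseconds; replace with measured values.
EXAMPLE_STAGES = {
    "network_round_trip": 80,
    "voice_activity_detection": 60,
    "interim_transcript": 120,
    "tts_cancellation": 30,
}

def remaining_budget(stages_ms: Mapping[str, int],
                     budget_ms: int = TURN_GAP_BUDGET_MS) -> int:
    """Return the milliseconds left after all stages; negative means over budget."""
    return budget_ms - sum(stages_ms.values())

if __name__ == "__main__":
    left = remaining_budget(EXAMPLE_STAGES)
    status = "within" if left >= 0 else "over"
    print(f"{left} ms remaining ({status} budget)")
```

With these placeholder numbers the chain fits with only 10 ms to spare, so any additional telephony delay would push it over the limit.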

ElevenLabs recommends that engineering teams test interruption handling against production-realistic conditions, including noisy audio, injected affirmations and concurrent-load simulation, before deploying at scale, and points teams towards Deepgram's Voice Agent API as an integrated runtime with lower-level controls for turn-taking behaviour.
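One way to approximate the noisy-audio part of such testing is to mix recorded background noise into clean test utterances at a controlled signal-to-noise ratio. The sketch below is an assumption about how a team might do this in plain Python on lists of float samples; it is not taken from the assessment, and `mix_at_snr` is a hypothetical helper.

```python
import math
import random

def rms(samples):
    """Root-mean-square level of a list of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def mix_at_snr(speech, noise, snr_db):
    """Scale noise so the speech-to-noise ratio equals snr_db, then add it to speech."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    noise_rms = rms(noise)
    if noise_rms == 0:
        return list(speech)  # silent noise: nothing to mix in
    target_noise_rms = rms(speech) / (10 ** (snr_db / 20))
    gain = target_noise_rms / noise_rms
    return [s + gain * n for s, n in zip(speech, noise)]

if __name__ == "__main__":
    random.seed(0)
    # Synthetic stand-ins: a 440 Hz tone for speech, uniform noise for the call floor.
    speech = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(8000)]
    noise = [random.uniform(-1, 1) for _ in range(8000)]
    noisy = mix_at_snr(speech, noise, snr_db=10)
    print(f"mixed {len(noisy)} samples at 10 dB SNR")
```

Running the same interruption scenarios at several SNR levels would show where detection starts to degrade.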

The recap

ElevenLabs' native interruption handling suits scripted, clean-audio flows.
Poor customer service costs US businesses an estimated $62 billion per year.
The announcement recommends realistic tests and a $200 trial.
