The sequence began in mid-December 2025. An engineer at Amazon Web Services deployed Kiro, the company's internally built AI coding agent, to fix a minor software bug in Cost Explorer, the dashboard AWS customers use to track cloud spending. Kiro's solution, according to four people familiar with the matter who spoke to the Financial Times, was to delete the entire production environment and rebuild it from scratch. The outage lasted 13 hours.
Amazon's response was immediate and consistent: user error, not AI error. The engineer involved had "broader permissions than expected," a spokesperson said. It was, the company told the Financial Times, "a coincidence that AI tools were involved."
That position did not survive the following three months.
The mandate that set the stage
Kiro launched in public preview in July 2025. By autumn, Amazon leadership had set an internal requirement that 80% of developers use the tool at least once a week, with adoption tracked through management dashboards and exceptions requiring VP-level sign-off. Engineers who preferred third-party alternatives, including Anthropic's Claude Code and Cursor, were steered toward the company's own product instead.
Multiple AWS employees confirmed to the Financial Times that the December incident was at least the second occasion on which Amazon's AI tools had caused a service disruption. "The engineers let the AI agent resolve an issue without intervention," one senior AWS employee told the paper. "The outages were small but entirely foreseeable."
Orders disappear
What followed made the December incident look contained. On March 2, 2026, Amazon's AI coding tools contributed to an incident that caused 120,000 lost orders and 1.6 million website errors, according to internal documents obtained by Business Insider. Three days later, a separate outage took Amazon.com offline for six hours, producing a 99% collapse in order volume across North American marketplaces. The tally came to 6.3 million lost orders in a single day.
Dave Treadwell, the SVP overseeing Amazon's e-commerce operation, convened a company-wide deep-dive review and sent internal communications acknowledging that the site's availability "has not been good recently." The understatement was noted.
A paradox of supervision
Amazon's proposed fix was to require junior and mid-level engineers to have their AI-generated code signed off by a senior engineer before deployment. The problem, as tech blogger Stephen Lee argued on TikTok, is that Amazon had simultaneously laid off large numbers of engineers while mandating that the survivors rely on an AI tool to pick up the slack. Adding a human oversight layer to that arrangement requires humans who are available to provide it.
"The solution to the problems caused by AI is not to create more complex AI systems," Lee said. "It's to bring back human engineers and human judgment."
He also took aim at what he sees as a fundamental category error running through the industry's thinking. Large language models are sophisticated mathematical functions that predict statistical patterns. They have no concept of the difference between a production environment and a staging one. Kiro did not decide to delete a production environment the way a reckless engineer might. It generated the output that its underlying math indicated was the shortest route to a resolved bug ticket.
That distinction matters because the proposed industry response to failures like this has largely been to layer AI oversight on top of AI systems. Amazon's own internal proposals include using one AI tool to review the outputs of another. Lee's characterisation of that arrangement was pointed: using a next-word predictor to supervise another next-word predictor is not a safety net. It is a more expensive version of the same problem.
The petition
Inside Amazon, frustration has taken organisational form. Around 1,500 engineers have signed an internal petition calling for access to Claude Code over the mandated Kiro, arguing it outperforms the in-house tool on multi-language refactoring, according to reporting by Digital Trends. Their position is straightforward: they want to use the tool that does the job best. Under the mandate, doing so requires VP-level approval.
What the pattern shows
Amazon has committed $200 billion in AI infrastructure spending this year. Against that number, even 6.3 million lost orders in a day might look manageable on a spreadsheet. But the pattern of the outages tells a different story. Capability was deployed, adoption was mandated, safeguards were added reactively after each failure, and blame was directed at the nearest human. The peer review requirement came after December. The senior sign-off rule came after March.
The question, as Lee framed it, is not whether AI tools will make mistakes. They will. The question is whether the organisations deploying them can build a safety culture at the same pace they ship capabilities.
At Amazon, so far, the answer has been no.