A team of researchers has released AgentRx, an open-source framework designed to identify the precise moment an AI agent makes an unrecoverable mistake, tackling one of the more frustrating problems in applied AI development.
AI agents are software systems that plan and carry out multi-step tasks autonomously, often using external tools and operating alongside other agents.
When they fail, working out why is genuinely hard.
Agent runs produce long, unpredictable sequences of actions, and in systems where multiple agents interact, the true cause of a failure is frequently buried several steps before anything obviously goes wrong. That makes failures difficult to reproduce, let alone fix.
AgentRx addresses this by treating an agent's execution history the way an engineer would treat a system log, applying structured validation rather than reading through raw output and guessing.
The framework first converts records from different agent systems into a common format, then generates a set of executable rules drawn from the tools the agent used and any relevant operating policies.
Those rules are then checked against the agent's actions to produce an auditable log of violations.
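The validation step can be pictured as running executable predicates over a normalized trace. The sketch below is purely illustrative (the class and function names are assumptions, not AgentRx's actual API): each rule is a check over one step of the trace, and every failed check is recorded in a structured, auditable log.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Step:
    """One normalized action from an agent trace."""
    index: int
    tool: str
    args: dict
    result: str

@dataclass
class Rule:
    """An executable constraint; check() returns True when the step is valid."""
    name: str
    check: Callable[[Step], bool]

def validate_trace(trace: list[Step], rules: list[Rule]) -> list[dict]:
    """Run every rule against every step and collect structured violations."""
    violations = []
    for step in trace:
        for rule in rules:
            if not rule.check(step):
                violations.append(
                    {"step": step.index, "rule": rule.name, "tool": step.tool}
                )
    return violations

# Example rules: a policy forbidding file deletion, and a tool-schema
# constraint requiring a 'path' argument for any file operation.
rules = [
    Rule("no_delete", lambda s: s.tool != "delete_file"),
    Rule("path_required",
         lambda s: "path" in s.args if s.tool.endswith("_file") else True),
]

trace = [
    Step(0, "read_file", {"path": "notes.txt"}, "ok"),
    Step(1, "delete_file", {"path": "notes.txt"}, "ok"),
]

log = validate_trace(trace, rules)
# log holds one violation: the delete_file call at step 1
```

The point of the structured log is that downstream analysis reasons over discrete, named violations rather than raw transcript text.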
A large language model (LLM), an AI model capable of reasoning over text, then analyses that log using a nine-category failure taxonomy, identifying the specific step at which the agent passed the point of recovery and classifying the underlying cause.
Failure categories range from plan adherence failures, where an agent deviates from its intended approach, to outright system failures.
The authors described the core idea plainly: "AgentRx treats agent execution like a system trace that needs validation."
To test the framework, the team created the AgentRx Benchmark, a dataset of 115 manually annotated failed agent trajectories drawn from three established testing environments.
In experiments, AgentRx improved failure localisation by 23.6% and root-cause attribution by 22.9% compared with approaches that rely on prompting a language model directly without structured constraint checking.
Both the framework and the benchmark dataset are being released as open source, allowing other researchers and developers to use AgentRx on their own agent systems and contribute additional constraint definitions over time.
The release comes as agent-based AI systems move from research settings into production deployments, where the ability to diagnose and fix failures quickly has real operational consequences.
The recap
- AgentRx open-source framework for diagnosing AI agent failures released
- Benchmark contains 115 manually annotated failed trajectories across domains
- The project is open-sourcing code, dataset and documentation