OpenAI is launching a public Safety Bug Bounty program designed to surface AI abuse risks that its existing security processes were not built to catch.
The program sits alongside OpenAI's Security Bug Bounty but targets a different category of problem: issues that cause tangible harm without meeting the technical definition of a security vulnerability. In practice, that means the grey zone where AI systems can be manipulated, redirected or exploited at scale.
What researchers can report
The scope is broad. Agentic risks are explicitly in scope, including third-party prompt injection and data exfiltration, where attacker-controlled text can reliably hijack an agent's behaviour. To qualify, a reported issue must be reproducible at least 50% of the time.
Also in scope: agentic products performing disallowed actions at scale, model outputs that return OpenAI proprietary information, and vulnerabilities that expose other proprietary data. Account and platform integrity issues are covered too, including bypassing anti-automation controls, manipulating account trust signals and evading bans or restrictions.
Jailbreaks are out of scope. Issues that allow access beyond authorised permissions belong in the Security Bug Bounty rather than this one. OpenAI says submissions will be triaged by both teams and rerouted depending on scope.
Why this program exists
The conventional security model was built for software with defined inputs and outputs. AI systems do not behave that way. An agent that can browse the web, write code and take actions on a user's behalf creates attack surfaces that have no equivalent in traditional vulnerability research.
Prompt injection is the clearest example. An attacker embeds instructions in content the agent reads, and the agent follows them. No system is compromised in the conventional sense. The model simply does what it was told, by someone other than the user.
OpenAI's decision to pay for these reports acknowledges that finding and fixing them requires external eyes. The company also runs private campaigns targeting specific harm categories, with biorisk content in ChatGPT Agent and GPT-5 cited as current examples.
The broader signal
A public bug bounty is also a statement of intent. It tells researchers that OpenAI regards safety abuse as a legitimate and compensable area of work, not a policy problem to be managed separately from engineering.
Related reading
- OpenAI publishes open-source teen safety tools for developers building AI apps
- OpenAI strengthens Sora video safety and consent controls
- OpenAI builds real-time monitor to catch misaligned behaviour in its own AI coding agents
"Our goal is to ensure our systems remain safe and secure against misuse or abuse that could lead to tangible harm," the company said in its announcement.
Researchers, ethical hackers and safety teams can apply through the Safety Bug Bounty program to submit findings and work with OpenAI on remediation.
The recap
- OpenAI launches public Safety Bug Bounty program for AI abuse.
- Reports must be reproducible at least 50% of the time.
- Researchers can apply through the Safety Bug Bounty program.