OpenAI's Patch the Planet Squeezes Weeks of Fuzzing Into a Day

Trail of Bits engineers built a full fuzzing lab covering dozens of entry points, variant builds, and novel test seeds in less than a day using GPT-5.5-Cyber. They estimate building the same setup manually would take at least several weeks.

That's the story behind Patch the Planet, a Daybreak initiative from OpenAI announced today. It's not another vulnerability scanner. It's a structured program that pairs OpenAI's most cyber-capable models with expert human reviewers from Trail of Bits to not only find flaws but write and test patches. The goal: reduce the burden on overworked open-source maintainers, not add to it.

How the Program Works

Each engagement starts with a conversation between the maintainer and security engineers. They decide where effort is most useful: vulnerability validation, patch development, CI/CD improvements, or long-term security engineering. Researchers then investigate, validate, develop patches, and coordinate disclosure through the project's established channels.

Initial participants: cURL, NATS Server, pyca/cryptography, Sigstore, aiohttp, the Go project, freenginx, Python, and python.org. These are the networking, cryptography, software supply chain, and language infrastructure projects that downstream services depend on. More projects will join in future rounds.

Researchers use Codex Security and GPT-5.5-Cyber for analysis, patch development, testing, and documentation. Participating projects get ChatGPT Pro, conditional Codex Security access, and API credits for release workflows. Trail of Bits built AI-assisted workflows for deduplication, triage, and patching.

Early Results: Hundreds of Issues, Dozens of Patches

Trail of Bits dedicated full-time engineers to work with AI across 19 open-source projects. They have already identified hundreds of security issues and merged dozens of patches. Many more are under coordinated disclosure.

The initial sprint produced reusable security infrastructure: fuzzing harnesses, historical-CVE analysis pipelines, differential-testing systems, threat models, expanded test suites, and workflows for false-positive filtering and severity correction.

One standout: a fuzzing lab built in under a day. Engineers used repeated Codex /goal runs with GPT-5.5-Cyber, setting objectives and refining prompts. The system used coverage feedback to expand into new surfaces, target edge cases, and filter weak candidates. The team says the model made useful choices about where to expand coverage with limited guidance.
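
To make "coverage feedback" concrete, here is a minimal sketch of the general idea, not the Trail of Bits setup. Everything in it is an assumption made for illustration: toy_target, its branch labels, and the mutate and fuzz helpers are hypothetical. A mutated input is kept only if it reaches a branch no earlier input reached, which is also how weak candidates get filtered out.

```python
import random
from typing import Callable, List, Set, Tuple


def toy_target(data: bytes) -> Set[str]:
    """Hypothetical target: returns labels for the branches this input reached."""
    hit = {"entry"}
    if data[:1] == b"F":
        hit.add("magic1")
        if data[1:2] == b"U":
            hit.add("magic2")
            if data[2:3] == b"Z":
                hit.add("magic3")
    if len(data) > 8:
        hit.add("long")
    return hit


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Apply one random edit: overwrite, insert, or delete a byte."""
    buf = bytearray(data)
    choice = rng.randrange(3)
    if choice == 0 and buf:
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    elif choice == 1:
        buf.insert(rng.randrange(len(buf) + 1), rng.randrange(256))
    elif buf:
        del buf[rng.randrange(len(buf))]
    return bytes(buf)


def fuzz(
    target: Callable[[bytes], Set[str]],
    seeds: List[bytes],
    iterations: int = 2000,
    seed: int = 0,
) -> Tuple[List[bytes], Set[str]]:
    """Coverage-guided loop: keep a mutant only if it reaches new branches."""
    rng = random.Random(seed)
    corpus = list(seeds)  # must contain at least one seed input
    coverage: Set[str] = set()
    for s in corpus:
        coverage |= target(s)
    for _ in range(iterations):
        candidate = mutate(rng.choice(corpus), rng)
        hit = target(candidate)
        if not hit <= coverage:  # candidate reached something new
            coverage |= hit
            corpus.append(candidate)
        # otherwise the candidate is discarded as uninteresting
    return corpus, coverage


if __name__ == "__main__":
    corpus, coverage = fuzz(toy_target, [b"A"])
    print(len(corpus), sorted(coverage))
```

Real fuzzers collect coverage through compiler instrumentation rather than hand-written labels, but the keep-or-discard decision follows the same shape.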

Another reusable pipeline ingests historical CVEs, extracts vulnerability patterns, searches target codebases for related flaws, and sends candidates through specialized judging agents. It deduplicates, filters false positives, and routes the strongest evidence for manual confirmation. Trail of Bits found the models especially effective at variant analysis, uncovering many additional issues.
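
The deduplicate, filter, and route steps can be sketched in a few lines. This is an illustration under stated assumptions, not the actual pipeline: the Candidate fields, the score value (standing in for whatever a judging agent would emit), and the min_score and top_n thresholds are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Candidate:
    file: str
    function: str
    pattern: str  # hypothetical label for a pattern drawn from a historical CVE
    score: float  # hypothetical confidence from a judging step, 0.0 to 1.0


def triage(
    candidates: Iterable[Candidate],
    min_score: float = 0.5,
    top_n: int = 10,
) -> List[Candidate]:
    """Deduplicate, drop low-confidence findings, and rank the rest."""
    # Deduplicate: keep the highest-scoring report per (file, function, pattern).
    best: Dict[Tuple[str, str, str], Candidate] = {}
    for c in candidates:
        key = (c.file, c.function, c.pattern)
        if key not in best or c.score > best[key].score:
            best[key] = c
    # Filter: treat anything below the threshold as a likely false positive.
    kept = [c for c in best.values() if c.score >= min_score]
    # Route: the strongest candidates go to human reviewers first.
    return sorted(kept, key=lambda c: c.score, reverse=True)[:top_n]


if __name__ == "__main__":
    findings = [
        Candidate("url.c", "parse_host", "off-by-one", 0.9),
        Candidate("url.c", "parse_host", "off-by-one", 0.6),  # duplicate report
        Candidate("auth.c", "check_token", "missing-check", 0.2),  # weak evidence
        Candidate("tls.c", "read_record", "length-confusion", 0.7),
    ]
    for c in triage(findings):
        print(c.file, c.function, c.pattern, c.score)
```

The file and function names in the usage example are invented. The point is only the ordering of the steps: collapse duplicates before filtering, so a strong report is not lost behind a weak copy of itself.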

Differential testing - comparing multiple implementations of the same protocol under identical inputs - normally requires custom shim code that takes weeks or months to write. Codex generated and iterated on that glue code in days, allowing engineers to fuzz implementations against each other and investigate behavioral differences. The workflow produced a high-signal set of candidates for expert review.
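
A minimal sketch of differential testing, assuming two hypothetical parsers for a toy "Name: value" header line. Neither parser comes from any project named here. The harness feeds both the same inputs and records every case where their outputs disagree.

```python
import random
from typing import Callable, List, Optional, Tuple

Parsed = Optional[Tuple[str, str]]


def parse_strict(line: str) -> Parsed:
    """Hypothetical implementation A: rejects whitespace around the name."""
    name, sep, value = line.partition(":")
    if not sep or not name or name != name.strip():
        return None
    return name.lower(), value.strip()


def parse_lenient(line: str) -> Parsed:
    """Hypothetical implementation B: trims whitespace around the name."""
    name, sep, value = line.partition(":")
    if not sep or not name.strip():
        return None
    return name.strip().lower(), value.strip()


def differential_test(
    impl_a: Callable[[str], Parsed],
    impl_b: Callable[[str], Parsed],
    inputs: List[str],
) -> List[Tuple[str, Parsed, Parsed]]:
    """Run both implementations on identical inputs and collect disagreements."""
    divergences = []
    for data in inputs:
        a, b = impl_a(data), impl_b(data)
        if a != b:
            divergences.append((data, a, b))
    return divergences


def random_lines(count: int, seed: int = 0) -> List[str]:
    """Generate short random lines from a small alphabet to provoke edge cases."""
    rng = random.Random(seed)
    alphabet = "ab :\t"
    return [
        "".join(rng.choice(alphabet) for _ in range(rng.randint(0, 6)))
        for _ in range(count)
    ]


if __name__ == "__main__":
    cases = ["Host: x", "Host : x", "nocolon"] + random_lines(200)
    for data, a, b in differential_test(parse_strict, parse_lenient, cases):
        print(repr(data), a, b)
```

Each divergence is only a candidate. Someone still has to decide which implementation, if either, matches the specification, which is where expert review comes in.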

Patch the Planet isn't about replacing security engineers. It's about compressing the tedious parts - building harnesses, searching for variants, writing shims - so human experts can focus on the bugs that actually matter. If the model can do in a day what takes a human weeks, that's a shift in how we think about open-source security maintenance.

Source: Patch the Planet: a Daybreak initiative to support open source maintainers
Domain: openai.com
