Why 45-62% of AI patches introduce new vulnerabilities

The promise of AI-generated code fixes is enormous: point a model at a CVE, get a patch, merge it, move on. The reality is far more dangerous. Two independent studies in 2025 — Veracode’s “State of Software Security” report and the BaxBench benchmark from UC Berkeley — converge on a sobering finding: between 45% and 62% of AI-generated patches introduce at least one new vulnerability.

What the research shows

Veracode analyzed 328 AI-generated remediation suggestions across Java, JavaScript, and Python. 45% of patches that compiled cleanly introduced a new CWE violation — most commonly CWE-20 (improper input validation), CWE-502 (deserialization), and CWE-79 (XSS). The model “fixed” the original CVE while opening an adjacent attack surface.

BaxBench pushed the number higher. In their controlled benchmark of 287 C and Python tasks, 62% of Claude-generated fixes introduced regressions — failing existing test suites or producing code that compiled but changed observable behavior in unsafe ways. GPT-4o scored similarly at 58%.

Why this happens

  • Context window limits. The model sees the vulnerable function but not the 14 callers that depend on its exact return type. A “fix” that changes a return value from null to an empty string breaks every null-check downstream.
  • Training data bias. Models learn from StackOverflow snippets and README examples — code that prioritizes brevity over security. The most common “fix” for a SQL injection is string concatenation with escaping, not parameterized queries.
  • No execution feedback. Without running the patched code, the model has zero signal about runtime behavior. It cannot know that a type change causes a segfault in production.
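The context-window failure mode in the first bullet is easy to make concrete. Suppose a model "hardens" a lookup by returning an empty string instead of None; any caller that uses an `is None` check as its miss signal now silently takes the wrong branch. The names below are hypothetical, a minimal sketch rather than a real patch:

```python
CACHE = {"token_a": "user1"}

def lookup_original(key: str):
    # Original contract: None signals a cache miss.
    return CACHE.get(key)

def lookup_patched(key: str):
    # AI "fix": never return None. Type-safe in isolation,
    # but it changes the function's observable contract.
    return CACHE.get(key) or ""

def is_authenticated(key: str, lookup) -> bool:
    # One of the unseen downstream callers: relies on None as the miss signal.
    user = lookup(key)
    if user is None:
        return False  # correct rejection path
    return True       # "" now falls through to here

print(is_authenticated("bad_token", lookup_original))  # False
print(is_authenticated("bad_token", lookup_patched))   # True -- an auth bypass
```

The patched function never sees the caller, so nothing in the model's context flags the contract change; only execution does.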
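The training-data bias is easiest to see with SQL injection. Here is a minimal sketch (using sqlite3, with a hypothetical `users` table) contrasting the escaping-style "fix" models tend to emit with the actual fix, a parameterized query:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

def find_user_escaped(name: str):
    # The pattern models often produce: escape quotes, then concatenate.
    # Fragile -- any gap in the escaping logic reopens the injection.
    safe = name.replace("'", "''")
    return conn.execute(
        "SELECT role FROM users WHERE name = '" + safe + "'"
    ).fetchall()

def find_user_parameterized(name: str):
    # Parameterized query: the driver keeps data out of the SQL text entirely.
    return conn.execute(
        "SELECT role FROM users WHERE name = ?", (name,)
    ).fetchall()

payload = "' OR '1'='1"
print(find_user_parameterized(payload))  # [] -- treated as a literal string
```

Escaping happens to survive this particular payload, but it is a denylist approach; parameterization removes the attack class instead of one attack string.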

How PatchOps Guard solves this

Our 5-stage pipeline treats AI generation as just one step — the second of five:

  • Stage 1: Context. We extract the vulnerable code slice, trace external inputs to the vulnerable function via tree-sitter reachability, and attach up to 3 exemplar patches from our 11,500+ GHSA corpus.
  • Stage 2: Generate. Claude generates the patch with full context, including previous failure messages on retry.
  • Stage 3: Sandbox. The patch runs in a Docker container with --network none, --read-only, and --cap-drop ALL. We execute the project’s existing test suite. If any test fails, the patch is rejected.
  • Stage 4: Re-scan. Semgrep + Claude re-analyze the patched code for new CWEs. If a new vulnerability is detected, the patch is rejected and Stage 2 retries with the failure context.
  • Stage 5: PR. Only patches that pass both sandbox and re-scan become pull requests with a confidence score.
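End to end, the verify-and-retry loop across Stages 2–4 can be sketched as follows. `generate_patch`, `run_sandbox_tests`, and `rescan_for_new_cwes` are hypothetical stand-ins for the real stage implementations, and the retry budget is an assumption, not a documented setting:

```python
MAX_RETRIES = 3  # assumed retry budget

def verify_patch(vuln_context, generate_patch, run_sandbox_tests, rescan_for_new_cwes):
    """Stages 2-4: generate, sandbox, re-scan, retrying with failure context."""
    failure_context = None
    for _ in range(MAX_RETRIES):
        patch = generate_patch(vuln_context, failure_context)  # Stage 2
        ok, test_log = run_sandbox_tests(patch)                # Stage 3
        if not ok:
            failure_context = test_log                         # retry with the failure
            continue
        new_cwes = rescan_for_new_cwes(patch)                  # Stage 4
        if new_cwes:
            failure_context = f"introduced new CWEs: {new_cwes}"
            continue
        return patch  # eligible for Stage 5: the PR
    return None       # never becomes a pull request
```

The key property is that a patch reaches Stage 5 only by passing both gates in the same iteration; a failure at either gate feeds its evidence back into the next generation attempt.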
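The Stage 3 container flags named above translate directly into a docker invocation. This sketch only assembles the argument list; the image name, mount path, and test command are hypothetical:

```python
def sandbox_command(image: str, repo_dir: str, test_cmd: str) -> list[str]:
    # Locked-down container per the Stage 3 description:
    # no network, read-only rootfs, all Linux capabilities dropped.
    return [
        "docker", "run", "--rm",
        "--network", "none",
        "--read-only",
        "--cap-drop", "ALL",
        "-v", f"{repo_dir}:/work:ro",  # assumption: patched repo mounted read-only
        "-w", "/work",
        image,
        "sh", "-c", test_cmd,
    ]

cmd = sandbox_command("patchops/sandbox:py3.12", "/tmp/patched-repo", "pytest -q")
```

With no network, the patch cannot exfiltrate anything or fetch dependencies at test time, and the read-only root plus dropped capabilities limit what a malicious or buggy patch can touch even if a test executes it.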

AI patches without verification are a liability. PatchOps Guard is the verification layer that makes AI-generated security fixes safe enough to merge. See the pipeline in action at patchguard.ai.