Multiple Apps (Security Research)
The Plausible Vulnerability
The code compiled. The tests passed. The security audit found 40% of AI-generated functions had vulnerabilities the human reviewers missed.
The Incident
Multiple independent security research studies in 2023 converged on a consistent finding: AI-generated code contains security vulnerabilities at a rate of 30–40% for security-sensitive functions, and human reviewers catch these vulnerabilities at a significantly lower rate than they catch equivalent vulnerabilities in human-written code. The reason is consistent across studies: AI-generated code looks authoritative, idiomatic, and correct. Reviewers unconsciously extend more trust to well-structured code, and AI-generated code is almost always well-structured. The failure modes cluster into five categories:
- missing input validation
- insecure defaults (cleartext transport, weak cryptography)
- improper error handling that leaks information
- missing authentication checks in internal APIs
- injection vulnerabilities in dynamically constructed queries
For Android specifically, the most common patterns are raw SharedPreferences for sensitive data, missing certificate pinning, cleartext HTTP as a fallback, and Room queries built with string concatenation.
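The last Android pattern is worth making concrete. Below is a minimal Kotlin sketch (the User entity, UserDao, and searchUsersUnsafe are illustrative names, not taken from the studies): Room's @RawQuery executes whatever SQL it is handed, so concatenating user input there is injectable, while a bound :name parameter in @Query is never parsed as SQL.

```kotlin
import androidx.room.Dao
import androidx.room.Entity
import androidx.room.PrimaryKey
import androidx.room.Query
import androidx.room.RawQuery
import androidx.sqlite.db.SimpleSQLiteQuery
import androidx.sqlite.db.SupportSQLiteQuery

@Entity
data class User(@PrimaryKey val id: Long, val name: String)

@Dao
interface UserDao {
    // VULNERABLE when the caller concatenates untrusted input into the SQL.
    @RawQuery
    fun findByNameRaw(query: SupportSQLiteQuery): List<User>

    // SAFE: Room binds :name as a query parameter, never as SQL text.
    @Query("SELECT * FROM User WHERE name = :name")
    fun findByName(name: String): List<User>
}

// The call-site shape that commonly appears in generated code:
fun searchUsersUnsafe(dao: UserDao, userInput: String): List<User> =
    dao.findByNameRaw(
        SimpleSQLiteQuery("SELECT * FROM User WHERE name = '$userInput'")
    )
// userInput = "' OR '1'='1" returns every row in the table.
```

Both versions compile and both pass a happy-path test; only the first is exploitable, which is exactly the plausible-vulnerability shape this case is about.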
Evidence from the Scene
- Security reviewers approved AI-generated code at higher rates than equivalent human code with the same vulnerability
- The vulnerabilities followed consistent patterns across different AI tools and different reviewers
- AI-generated code passed automated SAST scans at rates similar to human-written code containing the same bugs
- The vulnerability discovery rate rose when reviewers were told the code was AI-generated and primed to review it adversarially (adversarial framing)
- Most vulnerabilities were in error-handling paths, not main flows: the paths that are hardest to test (see the sketch after this list)
- Mobile-specific security patterns were underrepresented in AI training data, making them more likely to be generated incorrectly
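The error-handling finding has a characteristic shape. A hedged Kotlin sketch (handleErrorUnsafe, handleErrorSafe, and the JSON shape are illustrative, not drawn from the studies):

```kotlin
// VULNERABLE: exception messages routinely carry file paths, hostnames,
// table names, or driver versions; echoing them to the client leaks
// reconnaissance data, and only on the failure path that tests rarely hit.
fun handleErrorUnsafe(e: Exception): String =
    """{"error": "${e.message}"}"""

// SAFER: keep the detail in a server-side log keyed by an opaque id,
// and return only that id to the caller.
fun handleErrorSafe(e: Exception, requestId: String): String {
    System.err.println("request=$requestId failed: $e") // stand-in for a real logger
    return """{"error": "internal error", "ref": "$requestId"}"""
}
```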
The Suspects
Two of these are the real root causes. The others are plausible-sounding distractors.
Reviewers extend unconscious trust to well-structured code regardless of how it was generated
AI models are trained predominantly on public code with high rates of security vulnerabilities
SAST tools are not calibrated for AI-generated code vulnerability patterns
AI coding tools have a design flaw that introduces security bugs intentionally
The Verdict
Real Root Causes
Reviewers extend unconscious trust to well-structured code regardless of how it was generated
AI-generated code is consistently well-formatted, well-named, and idiomatic — exactly the signals human reviewers use as proxies for correctness. This creates an unconscious trust bias that reduces adversarial scrutiny.
AI models are trained predominantly on public code with high rates of security vulnerabilities
Public code repositories contain significant rates of security vulnerabilities. AI models trained on this data reproduce these patterns at rates similar to the training data distribution. Security-conscious private codebases are underrepresented in training data.
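To make "training data distribution" concrete, compare the ubiquitous public-repository pattern for persisting a token with the much rarer hardened variant. A Kotlin sketch (function names and preference keys are hypothetical; the hardened version uses the androidx.security:security-crypto artifact): a model that has seen thousands of the first for every instance of the second will default to the first.

```kotlin
import android.content.Context
import androidx.security.crypto.EncryptedSharedPreferences
import androidx.security.crypto.MasterKey

// The pattern public repositories are full of: the token sits in
// cleartext on disk, readable on rooted devices and in backups.
fun storeTokenCommon(context: Context, token: String) {
    context.getSharedPreferences("app_prefs", Context.MODE_PRIVATE)
        .edit().putString("auth_token", token).apply()
}

// The underrepresented secure variant: values are encrypted at rest
// with a key held in the Android Keystore.
fun storeTokenEncrypted(context: Context, token: String) {
    val masterKey = MasterKey.Builder(context)
        .setKeyScheme(MasterKey.KeyScheme.AES256_GCM)
        .build()
    EncryptedSharedPreferences.create(
        context, "secure_prefs", masterKey,
        EncryptedSharedPreferences.PrefKeyEncryptionScheme.AES256_SIV,
        EncryptedSharedPreferences.PrefValueEncryptionScheme.AES256_GCM
    ).edit().putString("auth_token", token).apply()
}
```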
Plausible But Wrong
SAST tools are not calibrated for AI-generated code vulnerability patterns
SAST tools scan for known patterns regardless of code origin. The issue isn't SAST calibration; the specific vulnerability patterns AI generates often fall outside standard SAST rule coverage, exactly as the human-written equivalents do.
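A hedged Kotlin sketch of why rule coverage, not code origin, is the limit (directInjection, buildFilter, and indirectInjection are illustrative names): a shallow pattern rule that flags concatenation at the query sink catches the first function, while catching the identical injection split across a helper depends on how deep a given tool's taint analysis goes, whoever wrote the code.

```kotlin
import androidx.sqlite.db.SimpleSQLiteQuery

// Caught by simple pattern rules: the concatenation sits at the sink.
fun directInjection(input: String) =
    SimpleSQLiteQuery("SELECT * FROM User WHERE name = '" + input + "'")

// Same bug, one hop away. Rules that only inspect the sink's arguments
// miss it; interprocedural taint analysis is needed to connect the dots.
fun buildFilter(input: String) = "name = '$input'"

fun indirectInjection(input: String) =
    SimpleSQLiteQuery("SELECT * FROM User WHERE " + buildFilter(input))
```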
AI coding tools have a design flaw that introduces security bugs intentionally
There is no evidence of intentional vulnerability injection. The vulnerabilities are a consequence of training data distribution and the fundamental challenge of generating code that's correct under adversarial input — a problem AI tools are not optimized for.
Summary
AI-generated code is optimized for plausibility, not for security. The combination of well-structured output (which reduces reviewer scrutiny) and training data that reflects the security quality of public repositories (which is imperfect) produces a consistent pattern: code that looks correct and has security vulnerabilities in the paths that are hardest to test.
The Real Decision That Caused This
“The real decision failure — across every organization that shipped vulnerable AI-generated code — was treating AI code review the same as human code review. AI code requires adversarial review: assume it's plausible-but-wrong in security-sensitive paths, prove it correct rather than looking for obvious bugs.”
Lesson Hint
Chapter 8 (Security & Privacy) covers the specific vulnerability categories to review for. Chapter 13 (Engineering with AI Agents) covers the adversarial review heuristics for AI-generated code.