Anonymous Fintech (Reconstructed)
The Ghost Codebase
The feature shipped. The engineers moved on. Six months later, nobody could explain why it was built the way it was — until it failed.
The Incident
A fintech company used AI coding agents aggressively during a product sprint to build a payment reconciliation feature. The feature shipped on time, and the engineers who built it were reassigned to other teams. Six months later, a data consistency bug appeared in production: under certain high-load conditions, payment records were reconciled against stale ledger snapshots, producing incorrect account balances for a small percentage of users.
The investigating team opened the codebase and found:
- no Architecture Decision Records explaining the snapshot strategy
- no comments explaining the non-obvious concurrency model
- AI-generated database queries using a subtle read isolation level the original engineers had never fully understood (they accepted the AI's suggestion without verifying its semantics)
- no characterization tests capturing the behavior the system actually depended on
The investigation took three weeks. The fix took two days. The three weeks were spent understanding code that nobody had ever fully understood, including the team that shipped it.
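A minimal, hypothetical model of the failure mode can make it concrete (the real schema, queries, and isolation level are not public; every name here is illustrative): a reconciler rebuilds a balance from a ledger snapshot plus the payments recorded after it. If the snapshot and the payment cursor are not captured at one consistent point in time, a payment falls into the gap.

```python
# Hypothetical sketch of reconciliation against a stale ledger snapshot.
# Names (ledger_balance, payment_log, reconcile) are illustrative only.

ledger_balance = 100
payment_log = []  # (seq, amount): payments already applied to the ledger


def record_payment(seq, amount):
    global ledger_balance
    payment_log.append((seq, amount))
    ledger_balance += amount


record_payment(1, 25)

# Consistent read: balance and payment cursor captured together.
snap_balance, snap_seq = ledger_balance, 1

record_payment(2, 50)  # concurrent write lands mid-reconciliation


def reconcile(snapshot_balance, after_seq):
    # Rebuild the balance: snapshot plus every payment after the cursor.
    return snapshot_balance + sum(a for s, a in payment_log if s > after_seq)


print(reconcile(snap_balance, snap_seq))      # 175 == ledger_balance (correct)

# Under a weaker isolation level the two reads are not atomic: here the
# payment cursor is fresher than the balance snapshot, so payment 2 is
# silently dropped from the rebuilt balance.
print(reconcile(snap_balance, snap_seq + 1))  # 125 != 175 (incorrect)
```

The sequential version is the "happy path" the original tests covered; the incorrect balance only appears when a write interleaves between the two reads, which is exactly the high-load condition described above.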
Evidence from the Scene
- The feature was built entirely during a single sprint using AI coding agents
- No Architecture Decision Records were written for a system handling financial data
- The investigating engineers could not find any record of why a specific read isolation level was chosen
- Unit tests covered happy paths only — the race condition only appeared under concurrent load
- The original engineers had been reassigned before the bug appeared
- The AI agent's database query suggestion was accepted without verifying the isolation semantics
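The testing gap in the evidence (happy-path unit tests, no concurrent load) is easy to reproduce in miniature. In this hedged sketch, not the company's code, a sequential test passes while a two-thread test loses an update; the sleep is there only to widen the race window deterministically:

```python
import threading
import time

balance = 100


def withdraw(amount):
    # Unsynchronized read-modify-write: the classic lost-update race.
    global balance
    snapshot = balance           # read
    time.sleep(0.05)             # widen the race window so both threads
                                 # read before either writes
    balance = snapshot - amount  # write based on a possibly stale read


# Happy-path "unit test": sequential calls behave correctly.
withdraw(30)
assert balance == 70
balance = 100  # reset

# Concurrent load test: both threads read 100, so one update is lost.
t1 = threading.Thread(target=withdraw, args=(30,))
t2 = threading.Thread(target=withdraw, args=(40,))
t1.start(); t2.start()
t1.join(); t2.join()

print(balance)  # 70 or 60, never the correct 30: one update was lost
```

A sequential assertion like the first one is exactly the coverage the evidence describes; only the concurrent version surfaces the lost update, which is why the bug survived six months of green test runs.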
The Suspects
Two of these are the real root causes; the others are plausible-sounding distractors.
No Architecture Decision Records for a financial data system — rationale for key choices was lost
An AI-generated database isolation level was accepted without verifying its semantics
The engineers leaving the team caused the knowledge loss
The AI coding agent generated incorrect code
The Verdict
Real Root Causes
No Architecture Decision Records for a financial data system — rationale for key choices was lost
When AI agents generate key architectural choices (isolation levels, retry semantics, transaction boundaries), those choices need to be explicitly documented and reviewed. Without ADRs, the 'why' is lost when the engineers who accepted the AI's suggestion move on.
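No such record existed in this case, but a lightweight ADR in a widely used format (after Michael Nygard's Context/Decision/Consequences template; the contents below are placeholders, not the company's actual decision) is all it would have taken to preserve the "why":

```markdown
# ADR-NNN: Read isolation level for reconciliation queries

Status: Accepted

## Context
Reconciliation reads a ledger snapshot and the payment log concurrently
with live writes. The isolation level determines whether those two reads
see a single consistent point in time.

## Decision
<the isolation level chosen, why, and who verified its semantics>

## Consequences
<which anomalies are ruled out, which remain possible, and which tests
exercise them under concurrent load>
```

A record like this turns "the AI suggested it and it worked" into a reviewable, attributable decision that survives reassignment.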
An AI-generated database isolation level was accepted without verifying its semantics
The AI suggested a specific read isolation level that was plausible but had subtle semantics the engineers hadn't fully understood. In financial systems, isolation levels are load-bearing correctness decisions — they must be understood, not accepted.
Plausible But Wrong
The engineers leaving the team caused the knowledge loss
Engineer attrition is inevitable. The failure is that the knowledge was never externalized: it lived in the engineers' heads (and the AI agent's context window) rather than in documentation. The attrition revealed the failure; it didn't cause it.
The AI coding agent generated incorrect code
The code was not incorrect per se — it was correct for a specific set of assumptions about concurrency and isolation that were never made explicit. The bug was a gap between the implicit assumptions in the AI's suggestion and the actual production environment.
Summary
AI-generated code creates a new form of the legacy code problem: code that works but whose rationale is unknown. When the rationale is a load-bearing correctness decision (isolation levels, retry semantics, idempotency guarantees), unknown rationale is a ticking incident.
The Real Decision That Caused This
“The real failure was treating AI-generated code the same as code whose author you can ask questions. When AI agents make non-obvious choices, those choices must be explicitly verified, understood, and documented before shipping — especially in high-stakes domains. The definition of 'done' must include 'a human understands and can defend every non-obvious decision in the code.'”
Lesson Hint
Chapter 5 (Data Persistence) covers database isolation and transaction semantics. Chapter 9 (Quality & Reliability) covers testing strategies for concurrent systems. Chapter 13 (Engineering with AI Agents) covers decision records and the ghost codebase pattern.