
Anonymous Fintech (Reconstructed)

The Ghost Codebase

The feature shipped. The engineers moved on. Six months later, nobody could explain why it was built the way it was — until it failed.

The Incident

A fintech company used AI coding agents aggressively during a product sprint to build a payment reconciliation feature. The feature shipped on time, and the engineers who built it were reassigned to other teams. Six months later, a data consistency bug surfaced in production: under certain high-load conditions, payment records were reconciled against stale ledger snapshots, producing incorrect account balances for a small percentage of users. The investigating team opened the codebase and found:

  • no Architecture Decision Records explaining the snapshot strategy
  • no comments explaining the non-obvious concurrency model
  • AI-generated database queries using a subtle read isolation level the original engineers had never fully understood (they accepted the AI's suggestion without verifying its semantics)
  • no characterization tests of the concurrent behavior

The investigation took three weeks. The fix took two days. The three weeks were spent understanding code that nobody, including the team that shipped it, had ever fully understood.
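The failure mode can be sketched in miniature. This is a hypothetical reconstruction, not the incident's actual code; `Ledger`, `reconcile`, and the account names are all illustrative.

```python
# Hypothetical sketch of the stale-snapshot failure mode.
from dataclasses import dataclass, field

@dataclass
class Ledger:
    balances: dict = field(default_factory=dict)

    def snapshot(self):
        # A point-in-time copy: under snapshot isolation, a long-running
        # transaction keeps reading this view even as new writes commit.
        return dict(self.balances)

def reconcile(payments, ledger_view):
    """Flag payments whose amount disagrees with the given ledger view."""
    mismatches = []
    for account, amount in payments:
        if ledger_view.get(account, 0) != amount:
            mismatches.append(account)
    return mismatches

ledger = Ledger({"acct-1": 100})
view = ledger.snapshot()          # snapshot taken before the concurrent write
ledger.balances["acct-1"] = 150   # a payment commits mid-reconciliation
# Reconciling against the stale view flags a balance that is actually correct.
print(reconcile([("acct-1", 150)], view))  # → ['acct-1']
```

The reconciler is not wrong about the view it was handed; the view itself is stale, which is exactly why the bug only appeared when writes landed between the snapshot and the read.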

Evidence from the Scene

  • The feature was built entirely during a single sprint using AI coding agents
  • No Architecture Decision Records were written for a system handling financial data
  • The investigating engineers could not find any record of why a specific read isolation level was chosen
  • Unit tests covered happy paths only — the race condition only appeared under concurrent load
  • The original engineers had been reassigned before the bug appeared
  • The AI agent's database query suggestion was accepted without verifying the isolation semantics
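The last two findings point at a testing gap worth making concrete. A characterization test for this class of bug does not need to hope that concurrent load reproduces the race; it can force the interleaving deterministically. A minimal sketch, with all names hypothetical:

```python
# Hypothetical test that forces the snapshot/write interleaving
# deterministically instead of relying on concurrent load.
import threading

ledger = {"acct-1": 100}
snapshot_taken = threading.Event()
write_done = threading.Event()
result = {}

def reconciler():
    view = dict(ledger)          # take the snapshot
    snapshot_taken.set()         # let the writer proceed
    write_done.wait()            # pause until the write has committed
    result["stale"] = view["acct-1"] != ledger["acct-1"]

def writer():
    snapshot_taken.wait()
    ledger["acct-1"] = 150       # commit between snapshot and read
    write_done.set()

t1 = threading.Thread(target=reconciler)
t2 = threading.Thread(target=writer)
t1.start(); t2.start(); t1.join(); t2.join()
print(result["stale"])  # → True: the reconciler read a stale balance
```

A happy-path unit test never schedules the write inside that window, which is why the suite stayed green for six months.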

The Suspects

Two of these are the real root causes. The others are plausible-sounding distractors.

No Architecture Decision Records for a financial data system — rationale for key choices was lost

An AI-generated database isolation level was accepted without verifying its semantics

The engineers leaving the team caused the knowledge loss

The AI coding agent generated incorrect code

The Verdict

Real Root Causes

  • No Architecture Decision Records for a financial data system — rationale for key choices was lost

    When AI agents generate key architectural choices (isolation levels, retry semantics, transaction boundaries), those choices need to be explicitly documented and reviewed. Without ADRs, the 'why' is lost when the engineers who accepted the AI's suggestion move on.

  • An AI-generated database isolation level was accepted without verifying its semantics

The AI suggested a read isolation level that was plausible but carried subtle semantics the engineers never fully understood. In financial systems, isolation levels are load-bearing correctness decisions: they must be understood, not merely accepted.
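The semantic gap can be shown with a toy model. This is not a real database API, only an illustration of why a read-committed-style read and a snapshot-pinned read return different answers under a concurrent write:

```python
# Toy model of two isolation behaviors; not a real database API.
class ReadCommitted:
    """Every read sees the latest committed value."""
    def __init__(self, store):
        self.store = store
    def read(self, key):
        return self.store[key]

class SnapshotRead:
    """All reads in the transaction see the state at transaction start."""
    def __init__(self, store):
        self.snapshot = dict(store)
    def read(self, key):
        return self.snapshot[key]

store = {"balance": 100}
rc, snap = ReadCommitted(store), SnapshotRead(store)
store["balance"] = 150  # a concurrent writer commits

print(rc.read("balance"))    # → 150 (sees the new commit)
print(snap.read("balance"))  # → 100 (still sees the pinned snapshot)
```

Neither behavior is wrong; each is correct under different assumptions. The incident happened because nobody wrote down which assumption the reconciliation code depended on.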

Plausible But Wrong

  • The engineers leaving the team caused the knowledge loss

    Engineer attrition is inevitable. The failure is that the knowledge was never externalized: it lived in the engineers' heads (and the AI agent's context window) rather than in documentation. The attrition revealed the failure; it didn't cause it.

  • The AI coding agent generated incorrect code

    The code was not incorrect per se — it was correct for a specific set of assumptions about concurrency and isolation that were never made explicit. The bug was a gap between the implicit assumptions in the AI's suggestion and the actual production environment.

Summary

AI-generated code creates a new form of the legacy code problem: code that works but whose rationale is unknown. When the rationale is a load-bearing correctness decision (isolation levels, retry semantics, idempotency guarantees), unknown rationale is an incident waiting to happen.

The Real Decision That Caused This

The real failure was treating AI-generated code the same as code whose author you can still ask. When AI agents make non-obvious choices, those choices must be explicitly verified, understood, and documented before shipping, especially in high-stakes domains. The definition of 'done' must include 'a human understands and can defend every non-obvious decision in the code.'
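One lightweight way to meet that bar is an Architecture Decision Record committed alongside the change. A minimal sketch of what an ADR for the isolation-level decision might have looked like (the number, isolation level, and mitigation are all hypothetical):

```markdown
# ADR-007: Read isolation level for reconciliation queries

## Status
Accepted

## Context
Reconciliation reads ledger rows while payment writers commit concurrently.
The AI agent suggested REPEATABLE READ; we evaluated its snapshot semantics
before accepting it.

## Decision
Use REPEATABLE READ, with reconciliation re-validating any balance that
changed after the transaction's snapshot was taken.

## Consequences
Long-running reconciliation transactions read a point-in-time snapshot and
may observe stale balances; the re-validation step bounds that staleness.
```

A record like this takes minutes to write and would have turned a three-week investigation into a document lookup.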

Lesson Hint

Chapter 5 (Data Persistence) covers database isolation and transaction semantics. Chapter 9 (Quality & Reliability) covers testing strategies for concurrent systems. Chapter 13 (Engineering with AI Agents) covers decision records and the ghost codebase pattern.
