Chapter 5
Data & Persistence
Caching L1/L2/L3 · Room · Paging 3 · Offline-First · Sync · Message Reliability
Caching — L1/L2/L3 Architecture
Caching is multi-layered; in interviews, describe all three layers. Each deeper layer trades speed for greater capacity and durability.
1. Check L1 (memory); on a hit, return immediately
2. On a miss, check L2 (disk/Room); on a hit, populate L1 and return
3. On a miss, fetch from the network; populate L2, then L1, then return
4. On a write, update L1 and L2 immediately; sync to the server asynchronously
| Layer | What It Is | Speed | Survives Process Death? | Android Example |
|---|---|---|---|---|
| L1 — In-Memory | Data in RAM inside the running process | ~ns | No | LruCache, HashMap, StateFlow value |
| L2 — Disk Cache | Data written to local storage | ~ms | Yes | Room database, OkHttp disk cache, DataStore |
| L3 — Network/CDN (Content Delivery Network) | Data from a remote server or CDN edge | 100 ms+ | Yes (remote) | Retrofit API, CDN-served images |
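The read-through and write flow above can be sketched in plain Kotlin. This is a minimal sketch, not production code: two maps stand in for the memory and disk layers, and an injected function plays the network; all names are illustrative.

```kotlin
// Minimal sketch of the L1/L2/L3 read-through flow.
class LayeredCache(
    private val fetchFromNetwork: (String) -> String  // L3: source of truth
) {
    private val l1 = mutableMapOf<String, String>()   // in-memory; lost on process death
    private val l2 = mutableMapOf<String, String>()   // stands in for Room/disk

    fun read(key: String): String {
        l1[key]?.let { return it }                    // 1. L1 hit: return immediately
        l2[key]?.let { l1[key] = it; return it }      // 2. L2 hit: promote to L1
        val fresh = fetchFromNetwork(key)             // 3. miss: go to the network
        l2[key] = fresh                               //    populate L2 first...
        l1[key] = fresh                               //    ...then L1
        return fresh
    }

    fun write(key: String, value: String) {
        l1[key] = value                               // 4. update both local layers now;
        l2[key] = value                               //    server sync happens async
    }
}
```

Note that repeated reads never touch the network once a layer is warm, which is exactly the property the table's latency column describes.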
Interview tip: Say: 'I use a three-layer cache — in-memory LruCache for hot data, Room for warm data that survives process death, and the network as source of truth. I invalidate with ETags for REST and TTL for feeds.'
Cache Invalidation Strategies
Knowing when to invalidate cache is as important as knowing how to cache.
| Strategy | How It Works | Use Case |
|---|---|---|
| TTL (Time-To-Live) | Entry expires after fixed duration | News feeds, product listings |
| ETag (Entity Tag) / Last-Modified | Server returns a hash; client sends it back; server replies 304 if unchanged | REST (Representational State Transfer) APIs — avoid re-downloading unchanged data |
| Stale-while-revalidate | Serve stale immediately, refresh in background | Profile screens — fast UX, eventual freshness |
| Write-through | Every write updates cache and server simultaneously | User preferences, critical settings |
| LRU (Least Recently Used) eviction | Drop least recently used when capacity is reached | Image caches, in-memory stores |
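TTL expiry from the table above fits in a few lines. A minimal sketch, with an injected clock so expiry is testable; the class and parameter names are illustrative:

```kotlin
// A cache entry carries its write time; reads past the TTL return null,
// forcing the caller to refetch (e.g. a news feed after 5 minutes).
data class TtlEntry<T>(val value: T, val writtenAtMillis: Long)

class TtlCache<T>(
    private val ttlMillis: Long,
    private val now: () -> Long = System::currentTimeMillis
) {
    private val entries = mutableMapOf<String, TtlEntry<T>>()

    fun put(key: String, value: T) { entries[key] = TtlEntry(value, now()) }

    fun get(key: String): T? {
        val e = entries[key] ?: return null
        return if (now() - e.writtenAtMillis <= ttlMillis) e.value else null
    }
}
```

Stale-while-revalidate is the same idea with one twist: on expiry, return the stale value anyway and kick off a background refresh instead of returning null.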
Room & Local Persistence
Key concepts for local data storage in Android.
- Room + SQLite — primary structured store; use transactions for atomicity
- DAO (Data Access Object) with Flow — Room DAOs can return Flow<List<T>>, which automatically re-emits on any change to the underlying table
- Database transactions — wrap multi-table writes in withTransaction to prevent partial state
- Migrations — always write forward migrations; test with MigrationTestHelper on every version bump
- DataStore — replaces SharedPreferences; coroutine-safe, Flow-based, type-safe with Proto DataStore
- Indices — add @Index on columns used in WHERE clauses; critical for query performance on large tables
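The DAO-with-Flow and @Index points above can be sketched together. This assumes the androidx.room and kotlinx-coroutines artifacts are on the classpath; the entity, columns, and query are illustrative:

```kotlin
import androidx.room.*
import kotlinx.coroutines.flow.Flow

@Entity(indices = [Index("userId")])      // index the column used in WHERE
data class Message(
    @PrimaryKey val id: String,
    val userId: String,
    val body: String
)

@Dao
interface MessageDao {
    // Returning Flow makes Room re-emit automatically whenever the table changes.
    @Query("SELECT * FROM Message WHERE userId = :userId")
    fun messagesFor(userId: String): Flow<List<Message>>

    @Insert(onConflict = OnConflictStrategy.REPLACE)
    suspend fun upsert(messages: List<Message>)
}
```

Collecting messagesFor() in a ViewModel keeps the UI reactive to local writes, which is the backbone of the offline-first pattern later in this chapter.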
Recommended Libraries
- Room — Jetpack's SQLite abstraction. Type-safe queries, compile-time checks, Flow support. Standard for Android persistence.
- SQLDelight — Square's SQL-first database library. Write SQL, generate Kotlin. Multiplatform support (KMP).
- SQLCipher — AES-256 encryption for Room/SQLite. Protects data at rest. Essential for sensitive data.
- DataStore — Jetpack's replacement for SharedPreferences. Coroutine-safe, Flow-based. Use Proto DataStore for typed data.
Paging 3 — Deep Dive
Paging 3 is the correct way to load large datasets. Know the internals, not just the API.
- Cursor pagination preferred over offset — consistent results under concurrent inserts; offset-based pagination can skip or duplicate items when the list changes during scroll
- Error handling — LoadState.Error in header/footer items; retry via lazyPagingItems.retry()
- Prepend vs append — append = load next page; prepend = load newer items at top (e.g. new messages in chat)
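The cursor-vs-offset bullet can be demonstrated in a few lines of plain Kotlin. A toy sketch, not real pagination code: an item is inserted at the head of the list mid-scroll, and offset paging duplicates an already-seen item while cursor paging is unaffected.

```kotlin
// Offset paging: "skip N, take M" — breaks when the list shifts underneath.
fun offsetPage(list: List<String>, offset: Int, size: Int): List<String> =
    list.drop(offset).take(size)

// Cursor paging: "everything after item X" — stable under head inserts.
fun cursorPage(list: List<String>, afterId: String?, size: Int): List<String> {
    val start = if (afterId == null) 0 else list.indexOf(afterId) + 1
    return list.subList(start, minOf(start + size, list.size))
}
```

With a feed of ["c", "b", "a"], page 1 via offset is ["c", "b"]; if "d" is then inserted at the head, offset page 2 returns ["b", "a"], showing "b" twice, while the cursor after "b" still correctly yields ["a"].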
| Component | Role |
|---|---|
| PagingSource | Defines how to load a page. Implement load() to fetch from network or DB. |
| RemoteMediator | Bridges network and DB. Loads from the network into Room; the PagingSource reads from Room. Required for offline support. |
| Pager | Creates the PagingData flow from a PagingSource + optional RemoteMediator. |
| PagingData | Stream of paginated data. Collected in the ViewModel, passed to the UI. |
| LazyPagingItems | Compose adapter. Use collectAsLazyPagingItems() in composable. |
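A cursor-keyed PagingSource from the table above looks roughly like this. It assumes the androidx.paging artifact; Api, FeedItem, and FeedPage are hypothetical types invented for the sketch:

```kotlin
import androidx.paging.PagingSource
import androidx.paging.PagingState

// Hypothetical network types for the sketch.
data class FeedItem(val id: String, val text: String)
data class FeedPage(val items: List<FeedItem>, val nextCursor: String?)
interface Api { suspend fun feed(cursor: String?, limit: Int): FeedPage }

class FeedPagingSource(private val api: Api) : PagingSource<String, FeedItem>() {
    override suspend fun load(params: LoadParams<String>): LoadResult<String, FeedItem> =
        try {
            val page = api.feed(cursor = params.key, limit = params.loadSize)
            LoadResult.Page(
                data = page.items,
                prevKey = null,            // forward-only feed: no prepend
                nextKey = page.nextCursor  // null signals end of pagination
            )
        } catch (e: Exception) {
            LoadResult.Error(e)            // surfaces as LoadState.Error in the UI
        }

    override fun getRefreshKey(state: PagingState<String, FeedItem>): String? = null
}
```

A chat screen would additionally return a prevKey so Paging can prepend newer messages at the top.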
Offline-First Architecture
The gold standard for mobile apps that need to work without connectivity.
- Write local DB first, sync to server asynchronously — never block UI on network
- Optimistic UI — show result immediately; rollback on server error
- Outbox pattern — persist pending operations in a queue (Room table); drain with WorkManager
- Conflict resolution — last-write-wins (simple), server-wins (safe), custom merge (chat, docs)
- Eventual consistency — client and server may temporarily diverge; this is acceptable for most mobile apps
| Outbox Step | Detail |
|---|---|
| 1. User action | Write to local DB and outbox table atomically in one Room transaction |
| 2. Optimistic UI | Render from local DB immediately — user sees no loading state |
| 3. Background drain | WorkManager picks up outbox entries on connectivity, retries with backoff |
| 4. Server confirm | On success: mark sent, update local record with server-assigned ID |
| 5. Conflict | On conflict: apply resolution strategy; notify user if data was overwritten |
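Steps 1 and 3 of the outbox table can be sketched in plain Kotlin. A toy model: the entries list stands in for the Room outbox table, and an injected send function stands in for the network call a WorkManager worker would make.

```kotlin
enum class OutboxState { PENDING, SENT }

data class OutboxEntry(
    val clientMessageId: String,
    val payload: String,
    var state: OutboxState = OutboxState.PENDING,
    var attempts: Int = 0
)

class Outbox(private val send: (OutboxEntry) -> Boolean) {
    val entries = mutableListOf<OutboxEntry>()

    fun enqueue(id: String, payload: String) { entries += OutboxEntry(id, payload) }

    // One drain pass, as a worker would run on connectivity. Real code
    // reschedules failures with exponential backoff instead of looping.
    fun drain() {
        entries.filter { it.state == OutboxState.PENDING }.forEach { e ->
            e.attempts++
            if (send(e)) e.state = OutboxState.SENT
        }
    }
}
```

In the real pattern, enqueue() and the user-facing write happen in one Room transaction (step 1), and drain() is triggered by WorkManager with a CONNECTED constraint (step 3).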
Synchronization Strategies
Different sync strategies for different use cases.
| Strategy | How It Works | Best For |
|---|---|---|
| Short Polling | Client requests every N seconds | Simple, low-frequency data (dashboards) |
| Long Polling | Server holds request until data available | Near real-time without WebSocket infra |
| Push (FCM/WebSocket) | Server pushes changes to client | Chat, notifications — battery efficient |
| Delta Sync | Client sends lastSyncedTimestamp; server returns only changed records | Large datasets, frequent incremental syncs |
| Background Sync | WorkManager triggers on CONNECTED constraint | Offline-first apps, outbox draining |
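The delta-sync row above reduces to a filter on the client's cursor. A server-side sketch with illustrative types: the client sends its lastSyncedTimestamp, the server returns only newer records plus the next cursor.

```kotlin
data class Record(val id: String, val updatedAtMillis: Long)

// "Server side" of delta sync: only records changed after the client's cursor,
// plus the new cursor the client should store for its next request.
fun deltaSince(all: List<Record>, lastSynced: Long): Pair<List<Record>, Long> {
    val changed = all.filter { it.updatedAtMillis > lastSynced }
    val newCursor = changed.maxOfOrNull { it.updatedAtMillis } ?: lastSynced
    return changed to newCursor
}
```

Note the cursor comes from server-assigned timestamps, never the client clock — the same rule the multi-device section below insists on for ordering.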
Multi-Device Sync
User sends a message on their phone — their tablet must reflect it. This requires a device-aware sync model, not just client-server sync.
- deviceId — each device has a unique ID; server uses it to exclude the sender from push fanout
- Server timestamp — server assigns authoritative timestamp; never trust client clocks for ordering
- lastSyncedTimestamp per device — each device tracks its own sync cursor; delta sync fetches only events after that cursor
- Deduplication on receive — if device receives its own message via push, check clientMessageId against local outbox; discard if already SENT
- Conflict window — two devices editing the same record simultaneously; resolve with last-write-wins (simple), CRDT (collaborative docs), or server-merge
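The dedup-on-receive bullet can be sketched as a set membership check. A simplified model: a real app would check the clientMessageId against its Room outbox table rather than an in-memory set.

```kotlin
// A device drops a pushed message whose clientMessageId it already knows —
// either its own echo or a duplicate delivery.
class IncomingDedup(private val knownClientIds: MutableSet<String> = mutableSetOf()) {
    fun recordLocalSend(clientMessageId: String) { knownClientIds += clientMessageId }

    // True when the pushed message is new and should be inserted locally.
    // Set.add returns false if the id was already present.
    fun shouldApply(clientMessageId: String): Boolean =
        knownClientIds.add(clientMessageId)
}
```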
Message Reliability
Critical for chat, payments — any system where duplicate or lost messages are unacceptable.
- clientMessageId = UUID — client generates before send; server deduplicates on it
- Idempotency keys — same key = same result; always safe to retry
- At-least-once + server dedup — safer than at-most-once for message delivery
- Exponential backoff — never retry immediately on failure
Interview tip: Message state machine: PENDING → SENT → DELIVERED → READ (and FAILED from any state on terminal error)
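The state machine in the tip above can be written as an explicit transition table, which makes illegal transitions impossible rather than merely discouraged. A minimal sketch:

```kotlin
enum class MsgState { PENDING, SENT, DELIVERED, READ, FAILED }

// Forward path PENDING → SENT → DELIVERED → READ; FAILED reachable from
// any state on a terminal error; FAILED itself is terminal.
val allowed: Map<MsgState, Set<MsgState>> = mapOf(
    MsgState.PENDING to setOf(MsgState.SENT, MsgState.FAILED),
    MsgState.SENT to setOf(MsgState.DELIVERED, MsgState.FAILED),
    MsgState.DELIVERED to setOf(MsgState.READ, MsgState.FAILED),
    MsgState.READ to setOf(MsgState.FAILED),
    MsgState.FAILED to emptySet()
)

fun transition(from: MsgState, to: MsgState): MsgState {
    require(to in allowed.getValue(from)) { "illegal transition $from -> $to" }
    return to
}
```

Persisting this state column per message in Room is what lets the outbox drain and the UI (sent/delivered/read ticks) share one source of truth.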
CRDT & Conflict Resolution — G-Set, LWW-Register, OT, RGA
When two devices edit the same data simultaneously, you need a conflict resolution strategy. Last-Write-Wins (LWW) is simple but lossy. CRDTs (Conflict-free Replicated Data Types) are mathematically merge-safe — each type suits different use cases.
- CRDTs guarantee that any two replicas converge to the same state regardless of operation order or network partitions
- Vector clocks vs wall clocks — never use wall clock for causality; use a Lamport timestamp or vector clock
- Yjs and Automerge (2.x) are production-ready CRDT libraries; many collaborative editors build on them, while some products (e.g. Figma) run custom CRDT-inspired engines
- For mobile offline-first: LWW-Register per field covers 90% of use cases; only reach for full CRDT for collaborative editing
- Three-way merge — git-style: find the common ancestor, then merge both branches' diffs; used by some sync engines (e.g. ElectricSQL)
| Strategy | Mechanism | Data Loss? | Use Case |
|---|---|---|---|
| Last-Write-Wins (LWW) | Highest timestamp wins; other write discarded | Yes — concurrent edits lost | User profile fields, settings |
| Server-Wins | Server version always authoritative | Yes — client edit discarded | Read-heavy data, inventory |
| G-Set (CRDT) | Grow-only set — union of all adds, no deletes | No — monotonic | Reaction sets, tag collections |
| LWW-Register (CRDT) | Per-field LWW with vector clock, not wall clock | Partial — field-level, not record-level | User profile with concurrent field edits |
| Operational Transformation (OT) | Transform ops against each other before applying | No — but complex to implement | Google Docs-style text editing |
| RGA / CRDT Text (e.g. Yjs, Automerge) | Each character has a unique ID; inserts never conflict | No — fully concurrent | Collaborative text editing (Notion, Figma) |
Interview tip: Say: 'For a collaborative document feature I'd use a CRDT — specifically Yjs or Automerge — because they provide conflict-free merges without a central server arbitrating. For simpler cases like profile edits I'd use LWW-Register with vector clocks rather than wall-clock timestamps.'
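Two rows of the table above fit in a short sketch: a G-Set, whose merge is set union, and a per-field LWW-Register ordered by a Lamport counter (not wall clock), with the node id breaking exact ties deterministically. A simplified model — real libraries carry more metadata.

```kotlin
// Grow-only set: adds only, merge = union, so merges commute and converge.
data class GSet<T>(val elements: Set<T> = emptySet()) {
    fun add(e: T) = GSet(elements + e)
    fun merge(other: GSet<T>) = GSet(elements + other.elements)
}

// Per-field last-write-wins register using a Lamport timestamp for ordering.
data class LwwRegister<T>(val value: T, val lamport: Long, val nodeId: String) {
    fun merge(other: LwwRegister<T>): LwwRegister<T> = when {
        other.lamport > lamport -> other
        other.lamport < lamport -> this
        other.nodeId > nodeId -> other   // tie-break: same result on every replica
        else -> this
    }
}
```

The key property is that merge is commutative and idempotent: phone and tablet can exchange state in any order, any number of times, and still converge to identical values.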