Chapter 5
Data & Persistence
Caching L1/L2/L3 · Room · Paging 3 · Offline-First · Sync · Message Reliability
Caching — L1/L2/L3 Architecture
Caching is multi-layered; in interviews, describe all three layers. Each deeper layer trades speed for greater capacity and durability.
1. Check L1 (memory); on a hit, return immediately
2. On a miss, check L2 (disk/Room); on a hit, populate L1 and return
3. On a miss, fetch from the network; populate L2, then L1, then return
4. On a write, update L1 and L2 immediately; sync to the server asynchronously
| Layer | What It Is | Speed | Survives Process Death? | Android Example |
|---|---|---|---|---|
| L1 — In-Memory | Data in RAM inside the running process | ~ns | No | LruCache, HashMap, StateFlow value |
| L2 — Disk Cache | Data written to local storage | ~ms | Yes | Room database, OkHttp disk cache, DataStore |
| L3 — Network/CDN (Content Delivery Network) | Data from a remote server or CDN edge | 100 ms+ | Yes (remote) | Retrofit API, CDN-served images |
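The read-through and write flow above can be sketched in plain Kotlin. This is a minimal sketch, not production code: two maps stand in for the memory and disk layers, and an injected function plays the network; all names are illustrative.

```kotlin
// Minimal sketch of the L1/L2/L3 read-through flow.
class LayeredCache(
    private val fetchFromNetwork: (String) -> String  // L3: source of truth
) {
    private val l1 = mutableMapOf<String, String>()   // in-memory; lost on process death
    private val l2 = mutableMapOf<String, String>()   // stands in for Room/disk

    fun read(key: String): String {
        l1[key]?.let { return it }                    // 1. L1 hit: return immediately
        l2[key]?.let { l1[key] = it; return it }      // 2. L2 hit: promote to L1
        val fresh = fetchFromNetwork(key)             // 3. miss: go to the network
        l2[key] = fresh                               //    populate L2 first...
        l1[key] = fresh                               //    ...then L1
        return fresh
    }

    fun write(key: String, value: String) {
        l1[key] = value                               // 4. update both local layers now;
        l2[key] = value                               //    server sync happens async
    }
}
```

Note that repeated reads never touch the network once a layer is warm, which is exactly the property the table's latency column describes.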
Interview tip: Say: 'I use a three-layer cache — in-memory LruCache for hot data, Room for warm data that survives process death, and the network as source of truth. I invalidate with ETags for REST and TTL for feeds.'
Cache Invalidation Strategies
Knowing when to invalidate cache is as important as knowing how to cache.
| Strategy | How It Works | Use Case |
|---|---|---|
| TTL (Time-To-Live) | Entry expires after fixed duration | News feeds, product listings |
| ETag (Entity Tag) / Last-Modified | Server returns a hash; client sends it back; server replies 304 if unchanged | REST (Representational State Transfer) APIs — avoid re-downloading unchanged data |
| Stale-while-revalidate | Serve stale immediately, refresh in background | Profile screens — fast UX, eventual freshness |
| Write-through | Every write updates cache and server simultaneously | User preferences, critical settings |
| LRU (Least Recently Used) eviction | Drop least recently used when capacity is reached | Image caches, in-memory stores |
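TTL expiry from the table above fits in a few lines. A minimal sketch, with an injected clock so expiry is testable; the class and parameter names are illustrative:

```kotlin
// A cache entry carries its write time; reads past the TTL return null,
// forcing the caller to refetch (e.g. a news feed after 5 minutes).
data class TtlEntry<T>(val value: T, val writtenAtMillis: Long)

class TtlCache<T>(
    private val ttlMillis: Long,
    private val now: () -> Long = System::currentTimeMillis
) {
    private val entries = mutableMapOf<String, TtlEntry<T>>()

    fun put(key: String, value: T) { entries[key] = TtlEntry(value, now()) }

    fun get(key: String): T? {
        val e = entries[key] ?: return null
        return if (now() - e.writtenAtMillis <= ttlMillis) e.value else null
    }
}
```

Stale-while-revalidate is the same idea with one twist: on expiry, return the stale value anyway and kick off a background refresh instead of returning null.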
Room & Local Persistence
Key concepts for local data storage in Android.
- Room + SQLite — primary structured store; use transactions for atomicity
- DAO (Data Access Object) with Flow — Room DAOs can return Flow<List<T>>, which automatically re-emits on any change to the underlying table
- Database transactions — wrap multi-table writes in withTransaction to prevent partial state
- Migrations — always write forward migrations; test with MigrationTestHelper on every version bump
- DataStore — replaces SharedPreferences; coroutine-safe, Flow-based, type-safe with Proto DataStore
- Indices — add @Index on columns used in WHERE clauses; critical for query performance on large tables
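The DAO-with-Flow and @Index points above can be sketched together. This assumes the androidx.room and kotlinx-coroutines artifacts are on the classpath; the entity, columns, and query are illustrative:

```kotlin
import androidx.room.*
import kotlinx.coroutines.flow.Flow

@Entity(indices = [Index("userId")])      // index the column used in WHERE
data class Message(
    @PrimaryKey val id: String,
    val userId: String,
    val body: String
)

@Dao
interface MessageDao {
    // Returning Flow makes Room re-emit automatically whenever the table changes.
    @Query("SELECT * FROM Message WHERE userId = :userId")
    fun messagesFor(userId: String): Flow<List<Message>>

    @Insert(onConflict = OnConflictStrategy.REPLACE)
    suspend fun upsert(messages: List<Message>)
}
```

Collecting messagesFor() in a ViewModel keeps the UI reactive to local writes, which is the backbone of the offline-first pattern later in this chapter.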
Recommended Libraries
- Room — Jetpack's SQLite abstraction. Type-safe queries, compile-time checks, Flow support. Standard for Android persistence.
- SQLDelight — Square's SQL-first database library. Write SQL, generate Kotlin. Multiplatform support (KMP).
- SQLCipher — AES-256 encryption for Room/SQLite. Protects data at rest. Essential for sensitive data.
- DataStore — Jetpack's replacement for SharedPreferences. Coroutine-safe, Flow-based. Use Proto DataStore for typed data.
Paging 3 — Deep Dive
Paging 3 is the correct way to load large datasets. Know the internals, not just the API.
- Cursor pagination preferred over offset — consistent results under concurrent inserts; offset-based pagination can skip or duplicate items when the list changes during scroll
- Error handling — LoadState.Error in header/footer items; retry via lazyPagingItems.retry()
- Prepend vs append — append = load next page; prepend = load newer items at top (e.g. new messages in chat)
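The cursor-vs-offset bullet can be demonstrated in a few lines of plain Kotlin. A toy sketch, not real pagination code: an item is inserted at the head of the list mid-scroll, and offset paging duplicates an already-seen item while cursor paging is unaffected.

```kotlin
// Offset paging: "skip N, take M" — breaks when the list shifts underneath.
fun offsetPage(list: List<String>, offset: Int, size: Int): List<String> =
    list.drop(offset).take(size)

// Cursor paging: "everything after item X" — stable under head inserts.
fun cursorPage(list: List<String>, afterId: String?, size: Int): List<String> {
    val start = if (afterId == null) 0 else list.indexOf(afterId) + 1
    return list.subList(start, minOf(start + size, list.size))
}
```

With a feed of ["c", "b", "a"], page 1 via offset is ["c", "b"]; if "d" is then inserted at the head, offset page 2 returns ["b", "a"], showing "b" twice, while the cursor after "b" still correctly yields ["a"].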
| Component | Role |
|---|---|
| PagingSource | Defines how to load a page. Implement load() to fetch from network or DB. |
| RemoteMediator | Bridges network and DB. Loads from the network into Room; the PagingSource reads from Room. Required for offline support. |
| Pager | Creates the PagingData flow from a PagingSource + optional RemoteMediator. |
| PagingData | Stream of paginated data. Collected in the ViewModel, passed to the UI. |
| LazyPagingItems | Compose adapter. Use collectAsLazyPagingItems() in composable. |
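A cursor-keyed PagingSource from the table above looks roughly like this. It assumes the androidx.paging artifact; Api, FeedItem, and FeedPage are hypothetical types invented for the sketch:

```kotlin
import androidx.paging.PagingSource
import androidx.paging.PagingState

// Hypothetical network types for the sketch.
data class FeedItem(val id: String, val text: String)
data class FeedPage(val items: List<FeedItem>, val nextCursor: String?)
interface Api { suspend fun feed(cursor: String?, limit: Int): FeedPage }

class FeedPagingSource(private val api: Api) : PagingSource<String, FeedItem>() {
    override suspend fun load(params: LoadParams<String>): LoadResult<String, FeedItem> =
        try {
            val page = api.feed(cursor = params.key, limit = params.loadSize)
            LoadResult.Page(
                data = page.items,
                prevKey = null,            // forward-only feed: no prepend
                nextKey = page.nextCursor  // null signals end of pagination
            )
        } catch (e: Exception) {
            LoadResult.Error(e)            // surfaces as LoadState.Error in the UI
        }

    override fun getRefreshKey(state: PagingState<String, FeedItem>): String? = null
}
```

A chat screen would additionally return a prevKey so Paging can prepend newer messages at the top.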
Offline-First Architecture
The gold standard for mobile apps that need to work without connectivity.
- Write local DB first, sync to server asynchronously — never block UI on network
- Optimistic UI — show result immediately; rollback on server error
- Outbox pattern — persist pending operations in a queue (Room table); drain with WorkManager
- Conflict resolution — last-write-wins (simple), server-wins (safe), custom merge (chat, docs)
- Eventual consistency — client and server may temporarily diverge; this is acceptable for most mobile apps
| Outbox Step | Detail |
|---|---|
| 1. User action | Write to local DB and outbox table atomically in one Room transaction |
| 2. Optimistic UI | Render from local DB immediately — user sees no loading state |
| 3. Background drain | WorkManager picks up outbox entries on connectivity, retries with backoff |
| 4. Server confirm | On success: mark sent, update local record with server-assigned ID |
| 5. Conflict | On conflict: apply resolution strategy; notify user if data was overwritten |
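Steps 1 and 3 of the outbox table can be sketched in plain Kotlin. A toy model: the entries list stands in for the Room outbox table, and an injected send function stands in for the network call a WorkManager worker would make.

```kotlin
enum class OutboxState { PENDING, SENT }

data class OutboxEntry(
    val clientMessageId: String,
    val payload: String,
    var state: OutboxState = OutboxState.PENDING,
    var attempts: Int = 0
)

class Outbox(private val send: (OutboxEntry) -> Boolean) {
    val entries = mutableListOf<OutboxEntry>()

    fun enqueue(id: String, payload: String) { entries += OutboxEntry(id, payload) }

    // One drain pass, as a worker would run on connectivity. Real code
    // reschedules failures with exponential backoff instead of looping.
    fun drain() {
        entries.filter { it.state == OutboxState.PENDING }.forEach { e ->
            e.attempts++
            if (send(e)) e.state = OutboxState.SENT
        }
    }
}
```

In the real pattern, enqueue() and the user-facing write happen in one Room transaction (step 1), and drain() is triggered by WorkManager with a CONNECTED constraint (step 3).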
Synchronization Strategies
Different sync strategies for different use cases.
| Strategy | How It Works | Best For |
|---|---|---|
| Short Polling | Client requests every N seconds | Simple, low-frequency data (dashboards) |
| Long Polling | Server holds request until data available | Near real-time without WebSocket infra |
| Push (FCM/WebSocket) | Server pushes changes to client | Chat, notifications — battery efficient |
| Delta Sync | Client sends lastSyncedTimestamp; server returns only changed records | Large datasets, frequent incremental syncs |
| Background Sync | WorkManager triggers on CONNECTED constraint | Offline-first apps, outbox draining |
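The delta-sync row above reduces to a filter on the client's cursor. A server-side sketch with illustrative types: the client sends its lastSyncedTimestamp, the server returns only newer records plus the next cursor.

```kotlin
data class Record(val id: String, val updatedAtMillis: Long)

// "Server side" of delta sync: only records changed after the client's cursor,
// plus the new cursor the client should store for its next request.
fun deltaSince(all: List<Record>, lastSynced: Long): Pair<List<Record>, Long> {
    val changed = all.filter { it.updatedAtMillis > lastSynced }
    val newCursor = changed.maxOfOrNull { it.updatedAtMillis } ?: lastSynced
    return changed to newCursor
}
```

Note the cursor comes from server-assigned timestamps, never the client clock — the same rule the multi-device section below insists on for ordering.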
Multi-Device Sync
User sends a message on their phone — their tablet must reflect it. This requires a device-aware sync model, not just client-server sync.
- deviceId — each device has a unique ID; server uses it to exclude the sender from push fanout
- Server timestamp — server assigns authoritative timestamp; never trust client clocks for ordering
- lastSyncedTimestamp per device — each device tracks its own sync cursor; delta sync fetches only events after that cursor
- Deduplication on receive — if device receives its own message via push, check clientMessageId against local outbox; discard if already SENT
- Conflict window — two devices editing the same record simultaneously; resolve with last-write-wins (simple), CRDT (collaborative docs), or server-merge
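The dedup-on-receive bullet can be sketched as a set membership check. A simplified model: a real app would check the clientMessageId against its Room outbox table rather than an in-memory set.

```kotlin
// A device drops a pushed message whose clientMessageId it already knows —
// either its own echo or a duplicate delivery.
class IncomingDedup(private val knownClientIds: MutableSet<String> = mutableSetOf()) {
    fun recordLocalSend(clientMessageId: String) { knownClientIds += clientMessageId }

    // True when the pushed message is new and should be inserted locally.
    // Set.add returns false if the id was already present.
    fun shouldApply(clientMessageId: String): Boolean =
        knownClientIds.add(clientMessageId)
}
```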
Message Reliability
Critical for chat, payments — any system where duplicate or lost messages are unacceptable.
- clientMessageId = UUID — client generates before send; server deduplicates on it
- Idempotency keys — same key = same result; always safe to retry
- At-least-once + server dedup — safer than at-most-once for message delivery
- Exponential backoff — never retry immediately on failure
Interview tip: Message state machine: PENDING → SENT → DELIVERED → READ (and FAILED from any state on terminal error)
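The state machine in the tip above can be written as an explicit transition table, which makes illegal transitions impossible rather than merely discouraged. A minimal sketch:

```kotlin
enum class MsgState { PENDING, SENT, DELIVERED, READ, FAILED }

// Forward path PENDING → SENT → DELIVERED → READ; FAILED reachable from
// any state on a terminal error; FAILED itself is terminal.
val allowed: Map<MsgState, Set<MsgState>> = mapOf(
    MsgState.PENDING to setOf(MsgState.SENT, MsgState.FAILED),
    MsgState.SENT to setOf(MsgState.DELIVERED, MsgState.FAILED),
    MsgState.DELIVERED to setOf(MsgState.READ, MsgState.FAILED),
    MsgState.READ to setOf(MsgState.FAILED),
    MsgState.FAILED to emptySet()
)

fun transition(from: MsgState, to: MsgState): MsgState {
    require(to in allowed.getValue(from)) { "illegal transition $from -> $to" }
    return to
}
```

Persisting this state column per message in Room is what lets the outbox drain and the UI (sent/delivered/read ticks) share one source of truth.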
CRDT & Conflict Resolution — G-Set, LWW-Register, OT, RGA
When two devices edit the same data simultaneously, you need a conflict resolution strategy. Last-Write-Wins (LWW) is simple but lossy. CRDTs (Conflict-free Replicated Data Types) are mathematically merge-safe — each type suits different use cases.
- CRDTs guarantee that any two replicas converge to the same state regardless of operation order or network partitions
- Vector clocks vs wall clocks — never use wall clock for causality; use a Lamport timestamp or vector clock
- Yjs and Automerge (2.x) are production-ready CRDT libraries; many collaborative editors build on them, while some products (e.g. Figma) run custom CRDT-inspired engines
- For mobile offline-first: LWW-Register per field covers 90% of use cases; only reach for full CRDT for collaborative editing
- Three-way merge — git-style: find the common ancestor, then merge both branches' diffs; used by some sync engines (e.g. ElectricSQL)
| Strategy | Mechanism | Data Loss? | Use Case |
|---|---|---|---|
| Last-Write-Wins (LWW) | Highest timestamp wins; other write discarded | Yes — concurrent edits lost | User profile fields, settings |
| Server-Wins | Server version always authoritative | Yes — client edit discarded | Read-heavy data, inventory |
| G-Set (CRDT) | Grow-only set — union of all adds, no deletes | No — monotonic | Reaction sets, tag collections |
| LWW-Register (CRDT) | Per-field LWW with vector clock, not wall clock | Partial — field-level, not record-level | User profile with concurrent field edits |
| Operational Transformation (OT) | Transform ops against each other before applying | No — but complex to implement | Google Docs-style text editing |
| RGA / CRDT Text (e.g. Yjs, Automerge) | Each character has a unique ID; inserts never conflict | No — fully concurrent | Collaborative text editing (Notion, Figma) |
Interview tip: Say: 'For a collaborative document feature I'd use a CRDT — specifically Yjs or Automerge — because they provide conflict-free merges without a central server arbitrating. For simpler cases like profile edits I'd use LWW-Register with vector clocks rather than wall-clock timestamps.'
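Two rows of the table above fit in a short sketch: a G-Set, whose merge is set union, and a per-field LWW-Register ordered by a Lamport counter (not wall clock), with the node id breaking exact ties deterministically. A simplified model — real libraries carry more metadata.

```kotlin
// Grow-only set: adds only, merge = union, so merges commute and converge.
data class GSet<T>(val elements: Set<T> = emptySet()) {
    fun add(e: T) = GSet(elements + e)
    fun merge(other: GSet<T>) = GSet(elements + other.elements)
}

// Per-field last-write-wins register using a Lamport timestamp for ordering.
data class LwwRegister<T>(val value: T, val lamport: Long, val nodeId: String) {
    fun merge(other: LwwRegister<T>): LwwRegister<T> = when {
        other.lamport > lamport -> other
        other.lamport < lamport -> this
        other.nodeId > nodeId -> other   // tie-break: same result on every replica
        else -> this
    }
}
```

The key property is that merge is commutative and idempotent: phone and tablet can exchange state in any order, any number of times, and still converge to identical values.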