Chapter 1
Foundations
Requirements · Scale · Latency · Consistency · Tradeoffs
Requirements Framing
Always open here. Jumping to architecture without this is the #1 failure signal at Staff level.
- Functional requirements — what the system does (send message, stream video, process payment)
- Non-functional requirements — how well it does it (latency, availability, consistency, security)
- DAU (Daily Active Users) / MAU (Monthly Active Users) estimation — drives architecture decisions (50M DAU = very different from 500K DAU)
- Latency targets — define p50 (50th percentile) and p99 (99th percentile) explicitly before designing
- Consistency model — strong consistency (banking) vs eventual consistency (social feeds)
- Reliability / SLA (Service Level Agreement) — 99.9% uptime ≈ 8.76 hrs downtime/year; 99.99% ≈ 52.6 mins/year
- Security & privacy — PII (Personally Identifiable Information) handling, compliance (GDPR (General Data Protection Regulation), HIPAA (Health Insurance Portability and Accountability Act)), data residency
Interview tip: Before I design anything, let me clarify requirements. I'll separate functional from non-functional, then nail down scale so my architecture choices are proportionate.
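The SLA numbers above are worth being able to derive on the spot. A minimal sketch of the downtime-budget arithmetic (plain Kotlin, no Android dependencies):

```kotlin
// Downtime budget implied by an availability SLA, assuming a 365-day year.
fun downtimeMinutesPerYear(availability: Double): Double =
    (1.0 - availability) * 365 * 24 * 60

fun main() {
    println(downtimeMinutesPerYear(0.999))   // ≈ 525.6 min ≈ 8.76 hrs
    println(downtimeMinutesPerYear(0.9999))  // ≈ 52.6 min
}
```

Each extra "nine" cuts the budget by 10x — which is why 99.99% usually forces multi-region redundancy, not just better code.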
Latency — p50 (50th percentile), p95 (95th percentile), p99 (99th percentile), p999 (99.9th percentile)
Percentile latency measures how fast requests complete for a given % of users. Think of 100 users making the same request, sorted fastest to slowest — p99 is roughly the wait time of the 99th-fastest user; only 1 of the 100 waited longer.
| Metric | Meaning | Who cares |
|---|---|---|
| p50 | Median — half of requests are faster, half are slower. Typical user experience. | Product metrics, dashboards |
| p95 | 95% of requests complete within this time. Catches most slow outliers. | API (Application Programming Interface) SLA agreements |
| p99 | 99% complete within this time. 1 in 100 users is slower. Reveals tail latency. | Infrastructure, Staff interviews |
| p999 | 99.9% complete within this time. 1 in 1000. Used for payments, trading, safety systems. | Financial, real-time systems |
Interview tip: At 50M DAU, the 1% of requests slower than p99 means ~500,000 users/day having a bad experience. Optimizing only p50 can mask severe tail latency caused by GC (Garbage Collection) pauses, DB (Database) lock contention, or cold cache misses.
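To make the table concrete, here is a minimal nearest-rank percentile sketch. Note this is one of several common percentile definitions; real monitoring systems often interpolate or use histogram buckets instead:

```kotlin
// Nearest-rank percentile over observed request latencies (ms):
// p99 of 100 sorted samples is the 99th value.
fun percentile(latenciesMs: List<Long>, p: Double): Long {
    require(latenciesMs.isNotEmpty() && p in 0.0..100.0)
    val sorted = latenciesMs.sorted()
    val rank = Math.ceil(p / 100.0 * sorted.size).toInt().coerceAtLeast(1)
    return sorted[rank - 1]
}

fun main() {
    val samples = (1L..100L).toList()   // pretend latencies: 1 ms .. 100 ms
    println(percentile(samples, 50.0))  // 50 — the median experience
    println(percentile(samples, 99.0))  // 99 — the tail
}
```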
Consistency Models
Understanding consistency models is crucial for making the right architectural decisions.
| Model | Guarantee | Use Case |
|---|---|---|
| Strong consistency | Every read sees the most recent write. Slower. | Banking, payments, inventory |
| Eventual consistency | All nodes converge to same value eventually. Faster. | Social feeds, likes, read receipts |
| Causal consistency | Operations that are causally related are seen in order. | Chat ordering, collaborative editing |
| Read-your-writes | A user always sees their own writes immediately. | Profile updates, settings |
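Read-your-writes is the model you most often implement by hand on mobile. A minimal sketch — the `ProfileStore` name and the pending-writes overlay are illustrative, not a real library API:

```kotlin
// Read-your-writes on the client: overlay the user's own pending writes on
// top of possibly-stale server reads, so the user always sees their writes.
class ProfileStore(private val fetchRemote: (String) -> String?) {
    private val pendingWrites = mutableMapOf<String, String>()

    fun write(key: String, value: String) {
        pendingWrites[key] = value  // applied locally first
        // ...enqueue for upload; remove from pendingWrites once the server acks...
    }

    // The local overlay wins over whatever the server returns.
    fun read(key: String): String? = pendingWrites[key] ?: fetchRemote(key)
}
```

The rest of the system can stay eventually consistent; only the author's own view needs this guarantee.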
Scale Estimation — DAU (Daily Active Users) to RPS (Requests Per Second)
Every Staff interview expects you to derive RPS (Requests Per Second) from DAU (Daily Active Users) before drawing any architecture. Use this formula once and move on — do not spend more than 2 minutes on it.
- Peak multiplier — assume 3x average for peak traffic; 10x for viral events
- Read:write ratio — social feed ~100:1; chat ~1:1; payments ~10:1 read-heavy
- Mobile-specific — factor in retry storms on reconnect; 10% of requests may be retries
Example: Social Feed App
━━━━━━━━━━━━━━━━━━━━━━━━━━━
50M DAU (Daily Active Users)
x 3 sessions/day
x 20 requests/session
= 3,000,000,000 requests/day
/ 86,400 seconds/day
≈ 35,000 RPS (Requests Per Second) (avg) → ~100,000 RPS peak (3x)
━━━━━━━━━━━━━━━━━━━━━━━━━━━
Storage: 50M users x 500 bytes/event x 100 events/day
= 2.5 TB (Terabyte)/day → plan for tiered cold storage
Interview tip: State your assumptions out loud: 50M DAU (Daily Active Users), 3 sessions/day, 20 req/session. Round to 35k RPS (Requests Per Second). Say: this means I need horizontal scaling and a CDN (Content Delivery Network) layer. Then move on — don't get stuck on the maths.
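The same derivation as a sketch, so the constants are easy to swap while practising (plain Kotlin; the numbers mirror the worked example above):

```kotlin
// Back-of-envelope DAU → RPS: requests/day spread over 86,400 seconds.
fun avgRps(dau: Long, sessionsPerDay: Int, requestsPerSession: Int): Double =
    dau.toDouble() * sessionsPerDay * requestsPerSession / 86_400

fun main() {
    val avg = avgRps(50_000_000, 3, 20)
    println(avg)        // ≈ 34,722 — round to 35k RPS
    println(avg * 3)    // ≈ 104,167 — ~100k RPS at 3x peak
}
```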
Canonical Mobile Architecture Diagram
Draw this in every App-design interview. Adjust layers to the question — don't draw what you don't need.
┌─────────────────────────────────────────────┐
│ UI (User Interface) Layer (Compose / View)│
│ Observes StateFlow. Emits user events only.│
└──────────────────────┬──────────────────────┘
│ uiState / events
┌──────────────────────▼──────────────────────┐
│ ViewModel (viewModelScope) │
│ Holds UiState. Delegates to UseCases. │
└──────────────────────┬──────────────────────┘
│
┌──────────────────────▼──────────────────────┐
│ Domain Layer (UseCases) │
│ Pure Kotlin. Zero Android imports. │
└──────────────────────┬──────────────────────┘
│
┌──────────────────────▼──────────────────────┐
│ Repository (decides: local or network?) │
│ Cache-aside: L1 (Level 1) → L2 (Level 2) → Network │
└───────────┬─────────────────────┬───────────┘
│ │
┌───────────▼───────────┐ ┌───────▼───────────┐
│ Room DB (L2 cache) │ │ Retrofit / OkHttp│
│ Outbox table │ │ (REST / SSE / WS)│
└───────────────────────┘ └───────────────────┘
Interview tip: Dependency rule — arrows point inward only: UI (User Interface) → ViewModel → UseCase → Repository → Data. No layer imports the layer above it. Domain has zero Android imports.
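The Repository box can be sketched as a cache-aside lookup. This is illustrative Kotlin: `LocalStore` and `RemoteApi` are hypothetical stand-ins for a Room DAO and a Retrofit service, and real code would be suspend/Flow-based:

```kotlin
// Cache-aside matching the diagram: L1 in-memory map → L2 local store → network.
interface LocalStore {
    fun load(id: String): String?
    fun save(id: String, value: String)
}
interface RemoteApi { fun fetch(id: String): String }

class FeedRepository(private val db: LocalStore, private val api: RemoteApi) {
    private val memCache = mutableMapOf<String, String>()  // L1

    fun get(id: String): String {
        memCache[id]?.let { return it }                    // L1 hit
        db.load(id)?.let { memCache[id] = it; return it }  // L2 hit → warm L1
        val fresh = api.fetch(id)                          // miss → network
        db.save(id, fresh)                                 // write back to L2
        memCache[id] = fresh                               // and to L1
        return fresh
    }
}
```

The key interview point: the caller never knows which layer answered — the Repository owns that decision.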
Trade-off Vocabulary
Staff engineers articulate trade-offs with precision. Interviewers specifically listen for this vocabulary — answering 'it depends' without naming the axes and the deciding constraint is the most common Senior ceiling in design interviews.
- Name both sides explicitly — say 'the trade-off is X vs. Y' before stating your choice
- State the constraint that tips the balance — 'given our 50M DAU and battery sensitivity, I choose Y'
- Acknowledge the cost — 'the downside is Z, which I'd mitigate by...'
- Never say 'it depends' without immediately naming what it depends on and how each case resolves
| Trade-off | Axis A | Axis B | Classic Android Example |
|---|---|---|---|
| Latency vs. Throughput | How fast one request completes | How many requests complete per second | p99 cache hit (latency priority) vs. batch DB write (throughput priority) |
| Consistency vs. Availability | Every read sees the most recent write | System stays up during a network partition | Payment status (strong required) vs. read receipts (eventual OK) |
| Battery vs. Freshness | Fewer wakelocks and background work | Data is more up-to-date | WorkManager periodic sync vs. FCM push-on-change |
| Memory vs. Speed | Smaller in-memory footprint | Faster access without disk I/O | LruCache bounded size vs. unbounded in-memory map |
| Complexity vs. Performance | Simpler code, easier to maintain | More performant under load | Room + Paging vs. raw SQLite cursor with manual windowing |
| Offline-first vs. Simplicity | Works without a network connection | Less sync logic to write and maintain | CRDT merge vs. last-write-wins |
| Mobile: Real-time vs. Battery | Aggressive transport for live UX | Background-friendly transport | WebSocket always-on vs. FCM + HTTP-on-demand |
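The Memory vs. Speed row can be made concrete with a size-bounded LRU. This sketch uses `java.util.LinkedHashMap`'s access-order mode rather than Android's `LruCache`, purely to stay self-contained:

```kotlin
// Bounded LRU: access-order LinkedHashMap evicts the least-recently-used
// entry once size exceeds maxEntries — bounded memory, O(1) access.
class BoundedLru<K, V>(private val maxEntries: Int) :
    LinkedHashMap<K, V>(16, 0.75f, /* accessOrder = */ true) {
    override fun removeEldestEntry(eldest: MutableMap.MutableEntry<K, V>?) =
        size > maxEntries
}

fun main() {
    val cache = BoundedLru<String, Int>(2)
    cache["a"] = 1; cache["b"] = 2
    cache["a"]       // touch "a", so "b" is now least recently used
    cache["c"] = 3   // evicts "b"
    println(cache.keys)  // [a, c]
}
```

The unbounded `mutableMapOf` alternative is faster to write and never evicts — which is exactly the trade-off: on mobile, unbounded caches become OOM (Out Of Memory) crashes.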
Interview tip: The Staff-level template: 'The trade-off here is [A] vs [B]. Given [constraint], I'd choose [X], accepting [cost], which I'd mitigate by [approach].' Use this structure every time you make an architectural decision. Senior engineers make a choice. Staff engineers name the trade-off, the constraint, the cost, and the mitigation.