Chapter 4
Networking & Real-Time
REST (Representational State Transfer) · GraphQL · WebSocket · SSE (Server-Sent Events) · Polling · Streaming · Reconnection
Protocol Selection
Protocol selection is the most common networking question. Always justify your choice.
| Protocol | Best For | Tradeoff |
|---|---|---|
| REST | CRUD (Create, Read, Update, Delete) operations, simple request-response | Simple, cacheable — but no streaming |
| GraphQL | Complex queries, avoid over/under-fetching | Flexible — but hard to cache, complex tooling |
| WebSocket | Bidirectional real-time (chat, live collab) | Low latency — but high connection overhead at scale |
| SSE | Server-to-client streaming (LLM (Large Language Model) tokens, live feeds) | Simple, auto-reconnect — but one-directional only |
| gRPC | Internal microservices, typed binary contracts | Fast, typed — but needs HTTP/2, not browser-native |
| HTTP Chunked | LLM token streaming, progressive responses | Simple — but limited multiplexing |
Short Polling vs Long Polling vs SSE vs WebSocket
Always walk through all four before committing. Picking WebSocket without acknowledging SSE as a simpler alternative for read-only streaming signals shallow thinking.
| Strategy | How It Works | Pros | Cons |
|---|---|---|---|
| Short Polling | Client requests every N seconds. Server responds immediately whether or not data exists. | Simple to implement | Wastes bandwidth. High server load. Stale data between polls. |
| Long Polling | Client requests. Server holds connection open until data arrives or timeout. Client re-requests immediately after. | Near real-time. Fewer wasted requests. | Ties up server connections. Doesn't scale under high concurrency. |
| SSE | Persistent HTTP connection. Server pushes events as they occur. Browser/client auto-reconnects. | Real-time. Efficient. Built-in reconnect. | Server-to-client only. Not suitable for chat. |
| WebSocket | HTTP upgrade to persistent bidirectional socket. Both sides send freely. | Real-time, bidirectional. | Higher infra complexity. Connection overhead at scale. |
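The long-polling row above can be sketched as a plain client loop. `LongPollTransport` is a hypothetical interface standing in for the real HTTP call; the key detail is that a timeout triggers an immediate re-request, with no sleep in between:

```kotlin
// Hypothetical transport: blocks until the server has data or its hold timeout fires.
fun interface LongPollTransport {
    fun fetch(timeoutMs: Long): String?   // null = server timed out with no data
}

// Long polling: request, wait on the held response, then re-request immediately.
fun longPoll(transport: LongPollTransport, rounds: Int, onEvent: (String) -> Unit) {
    repeat(rounds) {
        val event = transport.fetch(timeoutMs = 30_000)
        if (event != null) onEvent(event)
        // On timeout (null) there is no sleep — the next request goes out at once,
        // which is what distinguishes long polling from short polling.
    }
}
```

Short polling is the same loop with an unconditional delay between iterations, which is exactly where the wasted requests come from.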
Recommended Libraries
- okhttp-eventsource — SSE client library built on OkHttp. Production-grade, used by LaunchDarkly. Best for SSE on Android.
- Ktor SSE Plugin — Kotlin-first SSE client/server with coroutines. Part of Ktor framework. Good for KMP (Kotlin Multiplatform) projects.
- Scarlet — Retrofit-inspired WebSocket client by Tinder. Type-safe, reactive, lifecycle-aware. Best for bidirectional real-time.
- OkHttp WebSocket — Built-in WebSocket support in OkHttp. Simple API, no extra dependencies. Use for basic WebSocket needs.
- Ktor WebSocket Plugin — Coroutine-based WebSocket client for Kotlin. Good for Kotlin Multiplatform projects.
Reconnection Strategy
Handling network disconnections gracefully is critical for mobile apps.
| Strategy | When to Use |
|---|---|
| Immediate retry (once) | Transient network blip — retry once immediately |
| Exponential backoff | Server overload or sustained outage — 1s, 2s, 4s, 8s... |
| Backoff + jitter | Many clients reconnecting simultaneously — add random delay to avoid thundering herd |
| Show manual reconnect UI | Extended outage — stop retrying automatically, give user control |
Interview tip: Thundering herd — when a server restarts, thousands of clients retry simultaneously. Jitter (random delay within backoff window) distributes the reconnect load.
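The backoff-plus-jitter row can be made concrete with a small delay calculator. This is a sketch of the "full jitter" variant: the wait is drawn uniformly from zero up to the capped exponential, so a herd of reconnecting clients spreads out across the whole window:

```kotlin
import kotlin.math.pow
import kotlin.random.Random

// Full-jitter backoff: uniform in [0, min(cap, base * 2^attempt)].
// Randomising the whole window spreads a thundering herd of reconnects.
fun backoffWithJitterMs(
    attempt: Int,
    baseMs: Long = 1_000,
    capMs: Long = 30_000,
    random: Random = Random.Default
): Long {
    val ceiling = (baseMs * 2.0.pow(attempt)).toLong().coerceAtMost(capMs)
    return random.nextLong(ceiling + 1)
}
```

The cap matters as much as the jitter: without it, attempt 10 would mean a 17-minute wait, which is when the "show manual reconnect UI" row should have taken over.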
OkHttp Essentials
Key OkHttp concepts every mobile engineer should know.
- Interceptor chain — add auth headers, log requests, compress payloads in order
- Authenticator — intercepts HTTP 401, refreshes token transparently, replays original request
- Connection pooling — OkHttp reuses TCP connections; configure pool size for concurrent requests
- Timeouts — set connect, read, and write timeouts explicitly; never rely on defaults in production
- Certificate pinning — add via CertificatePinner; always prepare a backup pin and rotation plan
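The points above come together in a typical client setup. This is a sketch, not a drop-in configuration: `authInterceptor` and `tokenAuthenticator` are assumed to be defined elsewhere in your app, and the pins are placeholders:

```kotlin
import java.util.concurrent.TimeUnit
import okhttp3.CertificatePinner
import okhttp3.OkHttpClient

// Sketch of a production-leaning OkHttpClient. authInterceptor and
// tokenAuthenticator are hypothetical names for your own implementations.
val client = OkHttpClient.Builder()
    .addInterceptor(authInterceptor)          // interceptor chain runs in order
    .authenticator(tokenAuthenticator)        // handles 401 → refresh → replay
    .connectTimeout(10, TimeUnit.SECONDS)     // set explicitly — never rely on defaults
    .readTimeout(30, TimeUnit.SECONDS)
    .writeTimeout(30, TimeUnit.SECONDS)
    .certificatePinner(
        CertificatePinner.Builder()
            .add("api.example.com", "sha256/AAAA...placeholder...")
            .add("api.example.com", "sha256/BBBB...backup-pin...")   // rotation plan
            .build()
    )
    .build()
```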
Recommended Libraries
- OkHttp — Industry-standard HTTP client for Android/JVM. Connection pooling, GZIP, caching, HTTP/2. Foundation for Retrofit.
- Retrofit — Type-safe REST client by Square. Annotation-based API definition. Uses OkHttp under the hood.
- Ktor Client — Kotlin-first HTTP client with coroutines. Multiplatform support (Android, iOS, JVM). Modern alternative to Retrofit.
Protobuf vs JSON
Serialization format choice matters at scale. Staff engineers know when JSON (JavaScript Object Notation) is the wrong answer.
| Aspect | JSON | Protobuf |
|---|---|---|
| Payload size | Verbose — field names in every response | 3-10x smaller — binary encoding |
| Parse speed | Slower — string parsing | Faster — binary maps directly to struct |
| Schema | Optional, loosely typed | Required .proto schema, strongly typed |
| Debuggability | Human readable | Binary — needs tooling to inspect |
| Versioning | Manual / convention-based | Built-in field number evolution — backward compat |
| Best for | Public APIs, browser clients, debugging | Internal services, high-frequency mobile data |
Recommended Libraries
- kotlinx.serialization — Kotlin's official serialization library. Type-safe, multiplatform, annotation-based. Recommended for modern Kotlin projects.
- Moshi — Modern JSON library by Square. Kotlin-first, smaller than Gson, supports code generation via KSP.
- Gson — Google's JSON library. Mature, reflection-based. Legacy choice — prefer Moshi or kotlinx.serialization for new projects.
- Wire — Square's Protobuf library for Kotlin/Java. Smaller runtime than Google's protobuf-java. Good for mobile.
- protobuf-kotlin — Official Google Protobuf for Kotlin. Full-featured, larger runtime. Use for complex proto schemas.
API Design for Mobile
Staff engineers design and critique APIs, not just consume them. Know what makes a mobile-friendly API.
- Cursor pagination over offset — offset pagination skips/duplicates items on concurrent inserts. Return nextCursor token in response; client passes it as param in next request.
- Structured error envelope — always return {code, message, retryable, details}. retryable=true tells the client it is safe to retry without user action.
- Field masks / partial responses — let client specify which fields to return (?fields=id,name,avatar). Reduces payload on slow connections.
- Batch endpoints — fetch multiple resources by ID list in one request. Critical for app startup and feed rendering performance.
- API versioning — URL versioning (/v1/, /v2/) is simplest and most debuggable. Never break existing clients silently.
- Idempotency keys — POST endpoints that create resources must accept a client-generated key so retries are safe.
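The cursor-pagination bullet can be sketched as a client-side walk. `Page` and `fetchPage` are hypothetical shapes, not a specific API:

```kotlin
// Hypothetical response shape for a cursor-paginated endpoint.
data class Page<T>(val items: List<T>, val nextCursor: String?)

// Follow nextCursor tokens until the server signals the end with null.
// Unlike offset paging, concurrent inserts cannot skip or duplicate items,
// because the cursor anchors to a position, not a count.
fun <T> fetchAll(fetchPage: (cursor: String?) -> Page<T>): List<T> {
    val all = mutableListOf<T>()
    var cursor: String? = null
    do {
        val page = fetchPage(cursor)
        all += page.items
        cursor = page.nextCursor
    } while (cursor != null)
    return all
}
```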
Client-Side Rate Limiting & Request Queuing
Preventing the client from overloading its own server — and handling fast user input gracefully.
- OkHttp dispatcher limits — set dispatcher.maxRequests (default 64) and dispatcher.maxRequestsPerHost (default 5) to prevent connection saturation
- Debounce user input — Flow debounce(300ms) on search; flatMapLatest cancels the in-flight request when new input arrives
- Request deduplication — track in-flight request keys; if the same request is already running, return the existing Deferred
- Cancel on navigation — viewModelScope cancellation handles this automatically for coroutine-based requests; no manual cleanup needed
- Exponential backoff — never retry immediately; back off to avoid worsening server pressure during an outage
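The deduplication bullet can be sketched with plain `CompletableFuture`; a coroutine version would share a `Deferred` the same way. The key names and class are illustrative:

```kotlin
import java.util.concurrent.CompletableFuture
import java.util.concurrent.ConcurrentHashMap

// If an identical request (same key) is already in flight, later callers
// share its future instead of firing a duplicate network call.
class RequestDeduplicator<T> {
    private val inFlight = ConcurrentHashMap<String, CompletableFuture<T>>()

    fun execute(key: String, request: () -> CompletableFuture<T>): CompletableFuture<T> =
        inFlight.computeIfAbsent(key) { k ->
            // Drop the entry once the request settles so the next call re-fetches.
            request().whenComplete { _, _ -> inFlight.remove(k) }
        }
}
```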
SSE with callbackFlow — The Canonical Kotlin Pattern
Server-Sent Events (SSE) is the standard for server-to-client streaming (AI token streams, live scores, notifications). In Kotlin, wrap the callback-based OkHttp EventSource in callbackFlow to get a first-class coroutines Flow.
- callbackFlow bridges callback APIs to Flow — use trySend (non-blocking) inside callbacks, never send
- awaitClose is mandatory — it keeps the flow alive until the collector cancels and runs cleanup
- retryWhen on the collector side handles reconnects with backoff — do not retry inside callbackFlow
- SSE auto-reconnects via Last-Event-ID header — server uses it to resume from last delivered event
- For AI token streaming (OpenAI, Gemini), SSE is the correct transport — WebSocket is overkill for unidirectional push
```kotlin
import java.io.IOException
import java.net.URI
import kotlin.math.pow
import kotlinx.coroutines.channels.awaitClose
import kotlinx.coroutines.delay
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.callbackFlow
import kotlinx.coroutines.flow.retryWhen
import com.launchdarkly.eventsource.EventHandler
import com.launchdarkly.eventsource.EventSource
import com.launchdarkly.eventsource.MessageEvent
import okhttp3.OkHttpClient

// SSE stream as a cold Flow using callbackFlow
// Requires: com.launchdarkly:okhttp-eventsource
fun sseFlow(url: String, client: OkHttpClient): Flow<String> = callbackFlow {
    val handler = object : EventHandler {
        override fun onOpen() {}
        override fun onClosed() { channel.close() }
        override fun onMessage(event: String, messageEvent: MessageEvent) {
            // trySend is non-blocking — safe to call from the EventSource thread
            trySend(messageEvent.data)
        }
        override fun onError(t: Throwable) { close(t) }
        override fun onComment(comment: String) {}
    }
    val eventSource = EventSource.Builder(handler, URI.create(url))
        .client(client)
        .build()
    eventSource.start()
    // awaitClose runs when the collector cancels or the scope ends
    awaitClose { eventSource.close() }
}

// Caller — automatic retry with exponential backoff
viewModelScope.launch {
    sseFlow(url, client)
        .retryWhen { cause, attempt ->
            if (attempt < 5 && cause is IOException) {
                delay(2.0.pow(attempt.toInt()).toLong() * 1000)
                true
            } else false
        }
        .collect { token -> _uiState.update { it.copy(text = it.text + token) } }
}
```

Interview tip: If asked how to stream AI tokens to an Android client, say: SSE transport, callbackFlow on the client, retryWhen with exponential backoff, and Last-Event-ID for resume. That answer covers protocol, Kotlin pattern, reliability, and resumability.
Distributed Tracing & Request Correlation
In a microservices backend, a single mobile request fans out across 5–10 services. Without tracing, debugging a latency spike is guesswork. W3C traceparent (standardised in 2019) is the modern way to propagate trace context.
- traceparent format — 00-{traceId(32hex)}-{parentSpanId(16hex)}-{flags(2hex)}; traceId is stable across all hops
- X-Request-ID — simpler client-generated UUID; useful for log correlation even without a full tracing stack
- tracestate header — vendor-specific baggage alongside traceparent; use for feature flags or A/B experiment IDs
- Sampling — trace every request in dev; 1% in prod with 100% for errors (head-based vs tail-based sampling)
- Android OpenTelemetry SDK (alpha, 2024) — official CNCF SDK for Android distributed tracing
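The traceparent format in the first bullet can be generated with a few lines of plain Kotlin. Random IDs are used here for illustration; a real tracer derives the span ID from its active span:

```kotlin
import kotlin.random.Random

// W3C traceparent: version-traceId-spanId-flags
// e.g. 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
fun newTraceparent(random: Random = Random.Default, sampled: Boolean = true): String {
    fun hex(byteCount: Int) =
        random.nextBytes(byteCount).joinToString("") { "%02x".format(it) }
    val traceId = hex(16)   // 32 hex chars — stable across all hops
    val spanId = hex(8)     // 16 hex chars — new per hop
    return "00-$traceId-$spanId-" + if (sampled) "01" else "00"
}
```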
```kotlin
import java.io.IOException
import java.util.UUID
import okhttp3.Interceptor
import okhttp3.Response

// OkHttp interceptor — inject W3C traceparent header.
// Tracer/Span follow the OpenTracing-style API; toW3CTraceParent() is a small
// helper you write to format the active span context as a traceparent value.
class TracingInterceptor(private val tracer: Tracer) : Interceptor {
    override fun intercept(chain: Interceptor.Chain): Response {
        val span = tracer.buildSpan(chain.request().url.pathSegments.last())
            .withTag("http.method", chain.request().method)
            .start()
        // W3C traceparent: version-traceId-spanId-flags
        // e.g. "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
        val request = chain.request().newBuilder()
            .header("traceparent", span.toW3CTraceParent())
            .header("X-Request-ID", UUID.randomUUID().toString())
            .build()
        return try {
            val response = chain.proceed(request)
            span.setTag("http.status_code", response.code)
            span.finish()
            response
        } catch (e: IOException) {
            span.setTag("error", true)
            span.log(e.message ?: "io_error")
            span.finish()
            throw e
        }
    }
}
```

Interview tip: Mention traceparent propagation when discussing observability. It shows you think end-to-end: mobile emits the trace root, every backend service inherits it, and Jaeger/Tempo stitches the waterfall automatically.
Rate Limiting Algorithms — Token Bucket, Leaky Bucket, Sliding Window
Rate limiting protects both server (from abuse) and client (from ban). Know all three algorithms — interviewers ask you to compare them at Staff level.
- Token bucket — standard for API clients; capacity = max burst, refill rate = sustained limit
- HTTP 429 Too Many Requests — always honour Retry-After header from server; do not ignore it
- Leaky bucket guarantees constant output rate — correct for payment flows where bursting is unsafe
- Sliding window avoids the double-burst problem at fixed window boundaries
- Client-side limiting protects your own app from getting banned; server-side protects backend resources
| Algorithm | Analogy | Burst? | Use Case |
|---|---|---|---|
| Token Bucket | Bucket fills at fixed rate; each request consumes a token | Yes — up to bucket capacity | API calls — allows bursting within limit |
| Leaky Bucket | Requests queue; processed at fixed rate like water dripping | No — smoothed output | Payment processing — strict rate, no bursting |
| Sliding Window Counter | Count requests in rolling time window | Partial — smoother than fixed window | HTTP 429 enforcement on server side |
| Fixed Window Counter | Count resets at boundary (e.g. every minute) | Yes — double at boundary | Simple quota tracking; boundary burst risk |
```kotlin
import java.io.IOException
import okhttp3.Interceptor
import okhttp3.Response

// Client-side Token Bucket rate limiter
class TokenBucketRateLimiter(
    private val capacity: Int,               // max burst size
    private val refillRatePerSec: Double     // tokens added per second
) {
    private var tokens = capacity.toDouble()
    private var lastRefillNanos = System.nanoTime()

    @Synchronized
    fun tryAcquire(): Boolean {
        refill()
        return if (tokens >= 1.0) { tokens -= 1.0; true } else false
    }

    private fun refill() {
        val now = System.nanoTime()
        val elapsed = (now - lastRefillNanos) / 1_000_000_000.0
        tokens = minOf(capacity.toDouble(), tokens + elapsed * refillRatePerSec)
        lastRefillNanos = now
    }
}

class RateLimitException(message: String) : IOException(message)

// Usage in an OkHttp/Retrofit interceptor
class RateLimitInterceptor(private val limiter: TokenBucketRateLimiter) : Interceptor {
    override fun intercept(chain: Interceptor.Chain): Response {
        if (!limiter.tryAcquire()) throw RateLimitException("Client rate limit exceeded")
        return chain.proceed(chain.request())
    }
}
```

Interview tip: When asked about rate limiting, name the algorithm, explain burst behaviour, and say which you'd pick for the specific use case. Token bucket for search/autocomplete, leaky bucket for payment APIs.
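For comparison with the token bucket, here is a sketch of the sliding-window idea in its exact "log" form, keeping a timestamp per request; the counter variant in the table approximates this with less memory. The injectable clock is only there to make the behaviour testable:

```kotlin
import java.util.ArrayDeque

// Sliding window log: allow at most maxRequests in any rolling windowMs.
class SlidingWindowRateLimiter(
    private val maxRequests: Int,
    private val windowMs: Long,
    private val clock: () -> Long = System::currentTimeMillis
) {
    private val timestamps = ArrayDeque<Long>()

    @Synchronized
    fun tryAcquire(): Boolean {
        val now = clock()
        // Evict timestamps that have slid out of the window.
        while (timestamps.isNotEmpty() && now - timestamps.peekFirst() >= windowMs) {
            timestamps.pollFirst()
        }
        return if (timestamps.size < maxRequests) {
            timestamps.addLast(now)
            true
        } else false
    }
}
```

Because the window rolls with real timestamps, there is no boundary at which two full quotas can land back to back, which is exactly the double-burst problem of the fixed window counter.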