Chapter 4

Networking & Real-Time

REST (Representational State Transfer) · GraphQL · WebSocket · SSE (Server-Sent Events) · Polling · Streaming · Reconnection


Protocol Selection

Protocol selection is the most common networking question. Always justify your choice.

Protocol | Best For | Tradeoff
REST | CRUD operations, simple request-response | Simple, cacheable — but no streaming
GraphQL | Complex queries, avoiding over/under-fetching | Flexible — but hard to cache, complex tooling
WebSocket | Bidirectional real-time (chat, live collab) | Low latency — but high connection overhead at scale
SSE | Server-to-client streaming (LLM tokens, live feeds) | Simple, auto-reconnect — but one-directional only
gRPC | Internal microservices, typed binary contracts | Fast, typed — but needs HTTP/2, not browser-native
HTTP Chunked | LLM token streaming, progressive responses | Simple — but limited multiplexing

Short Polling vs Long Polling vs SSE vs WebSocket

Always walk through all four before committing. Picking WebSocket without acknowledging SSE as a simpler alternative for read-only streaming signals shallow thinking.

Strategy | How It Works | Pros | Cons
Short Polling | Client requests every N seconds; server responds immediately whether or not data exists | Simple to implement | Wastes bandwidth, high server load, stale data between polls
Long Polling | Client requests; server holds the connection open until data arrives or timeout, then the client re-requests immediately | Near real-time, fewer wasted requests | Ties up server connections; doesn't scale under high concurrency
SSE | Persistent HTTP connection; server pushes events as they occur; browser/client auto-reconnects | Real-time, efficient, built-in reconnect | Server-to-client only; not suitable for chat
WebSocket | HTTP upgrade to a persistent bidirectional socket; both sides send freely | Real-time, bidirectional | Higher infra complexity; connection overhead at scale
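The long-polling row in the table reduces to a simple client loop: request, let the server hold the connection, emit whatever arrives, re-request immediately. A minimal coroutine sketch, assuming a hypothetical `fetch` lambda that returns `null` when the server's hold timeout expires with no data:

```kotlin
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.flow

// Long polling as a cold Flow. `fetch` is a hypothetical suspend call that
// blocks until the server responds; null means the hold timed out with no data.
fun <T : Any> longPollFlow(fetch: suspend () -> T?): Flow<T> = flow {
    while (true) {
        val event = fetch()            // server holds the request open
        if (event != null) emit(event) // timeout -> just loop and re-request
    }
}
```

Stopping is the collector's job: cancel the collecting scope, or bound the stream with an operator such as `take(n)`.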

Recommended Libraries

  • okhttp-eventsource — SSE client library built on OkHttp. Production-grade, used by LaunchDarkly. Best for SSE on Android.
  • Ktor SSE Plugin — Kotlin-first SSE client/server with coroutines. Part of the Ktor framework. Good for KMP (Kotlin Multiplatform) projects.
  • Scarlet — Retrofit-inspired WebSocket client by Tinder. Type-safe, reactive, lifecycle-aware. Best for bidirectional real-time.
  • OkHttp WebSocket — built-in WebSocket support in OkHttp. Simple API, no extra dependencies. Use for basic WebSocket needs.
  • Ktor WebSocket Plugin — coroutine-based WebSocket client for Kotlin. Good for Kotlin Multiplatform projects.

Reconnection Strategy

Handling network disconnections gracefully is critical for mobile apps.

Strategy | When to Use
Immediate retry (once) | Transient network blip — retry once immediately
Exponential backoff | Server overload or sustained outage — 1s, 2s, 4s, 8s...
Backoff + jitter | Many clients reconnecting simultaneously — add random delay to avoid thundering herd
Show manual reconnect UI | Extended outage — stop retrying automatically, give user control

Interview tip: Thundering herd — when a server restarts, thousands of clients retry simultaneously. Jitter (random delay within backoff window) distributes the reconnect load.
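Backoff plus jitter can be made concrete. A minimal sketch of "full jitter" (each retry waits a uniform random delay between zero and the capped exponential value); the base and cap values here are illustrative, not prescribed by the chapter:

```kotlin
import kotlin.math.pow
import kotlin.random.Random

// Full-jitter backoff: uniform random delay in 0..min(cap, base * 2^attempt).
// Jitter spreads reconnects out so a server restart doesn't trigger a
// synchronized retry wave (the thundering herd described above).
fun backoffWithJitterMs(
    attempt: Int,
    baseMs: Long = 1_000,
    capMs: Long = 30_000,
    random: Random = Random.Default
): Long {
    val ceiling = (baseMs * 2.0.pow(attempt)).toLong().coerceAtMost(capMs)
    return random.nextLong(ceiling + 1) // uniform in 0..ceiling
}
```

On reconnect attempt n, `delay(backoffWithJitterMs(n))` before retrying, and reset n to zero after a successful connection.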

OkHttp Essentials

Key OkHttp concepts every mobile engineer should know.

  • Interceptor chain — add auth headers, log requests, compress payloads in order
  • Authenticator — intercepts HTTP 401, refreshes token transparently, replays original request
  • Connection pooling — OkHttp reuses TCP connections; configure pool size for concurrent requests
  • Timeouts — set connect, read, and write timeouts explicitly; never rely on defaults in production
  • Certificate pinning — add via CertificatePinner; always prepare a backup pin and rotation plan
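The Authenticator bullet is the one interviewers probe most. A sketch of a 401-refresh authenticator; `TokenStore` and its `refreshToken()` are hypothetical app-side pieces, not OkHttp APIs:

```kotlin
import okhttp3.Authenticator
import okhttp3.Request
import okhttp3.Response
import okhttp3.Route

// Hypothetical app-side token storage; not an OkHttp type.
interface TokenStore {
    fun refreshToken(): String? // blocking refresh; null if refresh fails
}

// OkHttp invokes authenticate() on HTTP 401. Returning a new Request replays
// the call transparently; returning null propagates the 401 to the caller.
class TokenAuthenticator(private val store: TokenStore) : Authenticator {
    override fun authenticate(route: Route?, response: Response): Request? {
        // Give up after one refresh attempt to avoid an infinite 401 loop.
        if (responseCount(response) >= 2) return null

        val newToken = store.refreshToken() ?: return null
        return response.request.newBuilder()
            .header("Authorization", "Bearer $newToken")
            .build()
    }

    // Walk the priorResponse chain to count how many attempts we've made.
    private fun responseCount(response: Response): Int {
        var count = 1
        var prior = response.priorResponse
        while (prior != null) {
            count++
            prior = prior.priorResponse
        }
        return count
    }
}
```

Install it with `OkHttpClient.Builder().authenticator(TokenAuthenticator(store)).build()`. A production version would also serialize concurrent refreshes so parallel 401s don't each trigger a refresh; that is omitted here for brevity.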

Recommended Libraries

  • OkHttp — industry-standard HTTP client for Android/JVM. Connection pooling, GZIP, caching, HTTP/2. Foundation for Retrofit.
  • Retrofit — type-safe REST client by Square. Annotation-based API definition. Uses OkHttp under the hood.
  • Ktor Client — Kotlin-first HTTP client with coroutines. Multiplatform support (Android, iOS, JVM). Modern alternative to Retrofit.

Protobuf vs JSON

Serialization format choice matters at scale. Staff engineers know when JSON is the wrong answer.

Aspect | JSON | Protobuf
Payload size | Verbose — field names in every response | 3-10x smaller — binary encoding
Parse speed | Slower — string parsing | Faster — binary maps directly to struct
Schema | Optional, loosely typed | Required .proto schema, strongly typed
Debuggability | Human readable | Binary — needs tooling to inspect
Versioning | Manual / convention-based | Built-in field number evolution — backward compat
Best for | Public APIs, browser clients, debugging | Internal services, high-frequency mobile data

Recommended Libraries

  • kotlinx.serialization — Kotlin's official serialization library. Type-safe, multiplatform, annotation-based. Recommended for modern Kotlin projects.
  • Moshi — modern JSON library by Square. Kotlin-first, smaller than Gson, supports code generation via KSP.
  • Gson — Google's JSON library. Mature, reflection-based. Legacy choice — prefer Moshi or kotlinx.serialization for new projects.
  • Wire — Square's Protobuf library for Kotlin/Java. Smaller runtime than Google's protobuf-java. Good for mobile.
  • protobuf-kotlin — official Google Protobuf for Kotlin. Full-featured, larger runtime. Use for complex proto schemas.
Senior | Uses JSON by default
Staff | Proposes Protobuf for high-frequency endpoints after measuring payload overhead
Principal | Defines serialization strategy across platform, manages schema evolution policy
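One concrete reason behind the payload-size row: Protobuf encodes integers as varints (7 data bits per byte, high bit as a continuation flag), so small values cost one byte regardless of declared field width. A minimal encoder sketch, not the official runtime:

```kotlin
// Protobuf-style varint encoder (sketch): 7 data bits per byte,
// continuation bit set on every byte except the last.
fun encodeVarint(value: Long): ByteArray {
    require(value >= 0) { "sketch handles non-negative values only" }
    val out = mutableListOf<Byte>()
    var v = value
    do {
        var b = (v and 0x7FL).toInt()
        v = v ushr 7
        if (v != 0L) b = b or 0x80 // more bytes follow
        out.add(b.toByte())
    } while (v != 0L)
    return out.toByteArray()
}
```

A field value of 300 costs two bytes on the wire this way, where the JSON text `"count":300` costs eleven characters before compression.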

API Design for Mobile

Staff engineers design and critique APIs, not just consume them. Know what makes a mobile-friendly API.

  • Cursor pagination over offset — offset pagination skips/duplicates items on concurrent inserts. Return nextCursor token in response; client passes it as param in next request.
  • Structured error envelope — return '{code, message, retryable, details}' always. retryable=true tells client it is safe to retry without user action.
  • Field masks / partial responses — let client specify which fields to return (?fields=id,name,avatar). Reduces payload on slow connections.
  • Batch endpoints — fetch multiple resources by ID list in one request. Critical for app startup and feed rendering performance.
  • API versioning — URL versioning (/v1/, /v2/) is simplest and most debuggable. Never break existing clients silently.
  • Idempotency keys — POST endpoints that create resources must accept a client-generated key so retries are safe.
Senior | Consumes APIs correctly
Staff | Defines the API contract with backend — pagination, error schema, versioning
Principal | Owns the mobile API platform standard across the org
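The cursor-pagination bullet as a client loop: pass each response's `nextCursor` back as the next request's cursor parameter until the server returns null. `Page` and `fetchPage` are hypothetical shapes for illustration, not a real API:

```kotlin
// Hypothetical response shape for a cursor-paginated endpoint.
data class Page(val items: List<String>, val nextCursor: String?)

// Drain all pages: a null nextCursor from the server means end of list.
// Unlike offset pagination, concurrent inserts can't skip or duplicate items,
// because the cursor is anchored to a position, not an index.
suspend fun fetchAll(fetchPage: suspend (cursor: String?) -> Page): List<String> {
    val all = mutableListOf<String>()
    var cursor: String? = null
    do {
        val page = fetchPage(cursor)
        all += page.items
        cursor = page.nextCursor
    } while (cursor != null)
    return all
}
```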

Client-Side Rate Limiting & Request Queuing

Preventing the client from overloading its own server — and handling fast user input gracefully.

  • OkHttp dispatcher limits — set dispatcher.maxRequests (default 64) and dispatcher.maxRequestsPerHost (default 5) to prevent connection saturation
  • Debounce user input — Flow debounce(300ms) on search; flatMapLatest cancels the in-flight request when new input arrives
  • Request deduplication — track in-flight request keys; if the same request is already running, return the existing Deferred
  • Cancel on navigation — viewModelScope cancellation handles this automatically for coroutine-based requests; no manual cleanup needed
  • Exponential backoff — never retry immediately; back off to avoid worsening server pressure during an outage
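The request-deduplication bullet can be sketched with a shared `Deferred` per in-flight key. This is a minimal sketch, assuming the app supplies a long-lived `CoroutineScope`; eviction and error-caching policy are left to the caller:

```kotlin
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Deferred
import kotlinx.coroutines.async
import kotlinx.coroutines.sync.Mutex
import kotlinx.coroutines.sync.withLock

// In-flight request deduplication: concurrent callers with the same key await
// one shared Deferred instead of issuing duplicate network calls.
class RequestDeduplicator<K, V>(private val scope: CoroutineScope) {
    private val mutex = Mutex()
    private val inFlight = mutableMapOf<K, Deferred<V>>()

    suspend fun run(key: K, block: suspend () -> V): V {
        val deferred = mutex.withLock {
            inFlight.getOrPut(key) {
                scope.async {
                    try {
                        block()
                    } finally {
                        // Drop the entry so the next call after completion re-fetches.
                        mutex.withLock { inFlight.remove(key) }
                    }
                }
            }
        }
        return deferred.await()
    }
}
```

Running the work in an external scope (rather than the caller's) means one caller navigating away doesn't cancel the fetch for everyone else sharing it.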

SSE with callbackFlow — The Canonical Kotlin Pattern

Server-Sent Events (SSE) is the standard for server-to-client streaming (AI token streams, live scores, notifications). In Kotlin, wrap the callback-based OkHttp EventSource in callbackFlow to get a first-class coroutines Flow.

  • callbackFlow bridges callback APIs to Flow — use trySend (non-blocking) inside callbacks, never send
  • awaitClose is mandatory — it keeps the flow alive until the collector cancels and runs cleanup
  • retryWhen on the collector side handles reconnects with backoff — do not retry inside callbackFlow
  • SSE auto-reconnects via Last-Event-ID header — server uses it to resume from last delivered event
  • For AI token streaming (OpenAI, Gemini), SSE is the correct transport — WebSocket is overkill for unidirectional push
// SSE stream as a cold Flow using callbackFlow
// Requires: com.launchdarkly:okhttp-eventsource
import com.launchdarkly.eventsource.EventHandler
import com.launchdarkly.eventsource.EventSource
import com.launchdarkly.eventsource.MessageEvent
import kotlinx.coroutines.channels.awaitClose
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.callbackFlow
import okhttp3.OkHttpClient
import java.net.URI

fun sseFlow(url: String, client: OkHttpClient): Flow<String> = callbackFlow {
    val handler = object : EventHandler {
        override fun onOpen() {}
        override fun onClosed() { channel.close() }
        override fun onMessage(event: String, messageEvent: MessageEvent) {
            // trySend is non-blocking — safe to call from the EventSource thread
            trySend(messageEvent.data)
        }
        override fun onError(t: Throwable) { close(t) }
        override fun onComment(comment: String) {}
    }
    val eventSource = EventSource.Builder(handler, URI.create(url))
        .client(client)
        .build()
    eventSource.start()

    // awaitClose runs when the collector cancels or the scope ends
    awaitClose { eventSource.close() }
}

// Caller — automatic retry with exponential backoff
viewModelScope.launch {
    sseFlow(url, client)
        .retryWhen { cause, attempt ->
            if (attempt < 5 && cause is IOException) {
                delay(2.0.pow(attempt.toInt()).toLong() * 1000)
                true
            } else false
        }
        .collect { token -> _uiState.update { it.copy(text = it.text + token) } }
}
Senior | Uses callbackFlow correctly with awaitClose
Staff | Adds retryWhen backoff, Last-Event-ID resume, and back-pressure awareness (buffer operator)
Principal | Defines the event streaming platform — fanout topology, per-device cursor, at-least-once delivery guarantees

Interview tip: If asked how to stream AI tokens to an Android client, say: SSE transport, callbackFlow on the client, retryWhen with exponential backoff, and Last-Event-ID for resume. That answer covers protocol, Kotlin pattern, reliability, and resumability.

Distributed Tracing & Request Correlation

In a microservices backend, a single mobile request fans out across 5–10 services. Without tracing, debugging a latency spike is guesswork. The W3C Trace Context traceparent header is the modern standard for propagating trace context.

  • traceparent format — 00-{traceId(32hex)}-{parentSpanId(16hex)}-{flags(2hex)}; traceId is stable across all hops
  • X-Request-ID — simpler client-generated UUID; useful for log correlation even without a full tracing stack
  • tracestate header — vendor-specific baggage alongside traceparent; use for feature flags or A/B experiment IDs
  • Sampling — trace every request in dev; 1% in prod with 100% for errors (head-based vs tail-based sampling)
  • Android OpenTelemetry SDK (alpha, 2024) — official CNCF SDK for Android distributed tracing
// OkHttp interceptor — inject W3C traceparent header
class TracingInterceptor(private val tracer: Tracer) : Interceptor {
    override fun intercept(chain: Interceptor.Chain): Response {
        val span = tracer.buildSpan(chain.request().url.pathSegments.last())
            .withTag("http.method", chain.request().method)
            .start()

        // W3C traceparent: version-traceId-spanId-flags
        // e.g. "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
        val request = chain.request().newBuilder()
            .header("traceparent", span.toW3CTraceParent())
            .header("X-Request-ID", UUID.randomUUID().toString())
            .build()

        return try {
            val response = chain.proceed(request)
            span.setTag("http.status_code", response.code).finish()
            response
        } catch (e: IOException) {
            span.setTag("error", true).log(e.message ?: "io_error").finish()
            throw e
        }
    }
}
Senior | Adds X-Request-ID to requests for log correlation
Staff | Implements W3C traceparent propagation via OkHttp interceptor; understands sampling strategies
Principal | Designs the mobile observability platform — trace ingestion pipeline, sampling policy, cost vs coverage tradeoffs

Interview tip: Mention traceparent propagation when discussing observability. It shows you think end-to-end: mobile emits the trace root, every backend service inherits it, and Jaeger/Tempo stitches the waterfall automatically.

Rate Limiting Algorithms — Token Bucket, Leaky Bucket, Sliding Window

Rate limiting protects both server (from abuse) and client (from ban). Know all three algorithms — interviewers ask you to compare them at Staff level.

  • Token bucket — standard for API clients; capacity = max burst, refill rate = sustained limit
  • HTTP 429 Too Many Requests — always honour Retry-After header from server; do not ignore it
  • Leaky bucket guarantees constant output rate — correct for payment flows where bursting is unsafe
  • Sliding window avoids the double-burst problem at fixed window boundaries
  • Client-side limiting protects your own app from getting banned; server-side protects backend resources
Algorithm | Analogy | Burst? | Use Case
Token Bucket | Bucket fills at fixed rate; each request consumes a token | Yes — up to bucket capacity | API calls — allows bursting within limit
Leaky Bucket | Requests queue; processed at fixed rate like water dripping | No — smoothed output | Payment processing — strict rate, no bursting
Sliding Window Counter | Count requests in rolling time window | Partial — smoother than fixed window | HTTP 429 enforcement on server side
Fixed Window Counter | Count resets at boundary (e.g. every minute) | Yes — double at boundary | Simple quota tracking; boundary burst risk
// Client-side Token Bucket rate limiter
class TokenBucketRateLimiter(
    private val capacity: Int,          // max burst size
    private val refillRatePerSec: Double // tokens added per second
) {
    private var tokens = capacity.toDouble()
    private var lastRefillNanos = System.nanoTime()

    @Synchronized
    fun tryAcquire(): Boolean {
        refill()
        return if (tokens >= 1.0) { tokens -= 1.0; true } else false
    }

    private fun refill() {
        val now = System.nanoTime()
        val elapsed = (now - lastRefillNanos) / 1_000_000_000.0
        tokens = minOf(capacity.toDouble(), tokens + elapsed * refillRatePerSec)
        lastRefillNanos = now
    }
}

// Usage in Retrofit interceptor
class RateLimitInterceptor(private val limiter: TokenBucketRateLimiter) : Interceptor {
    override fun intercept(chain: Interceptor.Chain): Response {
        if (!limiter.tryAcquire()) throw RateLimitException("Client rate limit exceeded")
        return chain.proceed(chain.request())
    }
}
Senior | Handles HTTP 429 by reading Retry-After and backing off
Staff | Implements client-side token bucket; chooses algorithm based on burst requirements
Principal | Designs org-wide rate limiting strategy — per-endpoint budgets, quota sharing across app features, SDK-level enforcement

Interview tip: When asked about rate limiting, name the algorithm, explain burst behaviour, and say which you'd pick for the specific use case. Token bucket for search/autocomplete, leaky bucket for payment APIs.
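The table above names the sliding window counter; the sliding window log variant below is the simplest to sketch and is exact within the window, at the cost of storing one timestamp per admitted request. The injectable clock is a testing convenience, not part of any standard API:

```kotlin
// Sliding window log limiter: admit a request only if fewer than maxRequests
// were admitted within the last windowMs. Exact, but memory grows with rate.
class SlidingWindowRateLimiter(
    private val maxRequests: Int,
    private val windowMs: Long,
    private val clock: () -> Long = System::currentTimeMillis
) {
    private val timestamps = ArrayDeque<Long>()

    @Synchronized
    fun tryAcquire(): Boolean {
        val now = clock()
        // Evict timestamps that have slid out of the window.
        while (timestamps.isNotEmpty() && now - timestamps.first() >= windowMs) {
            timestamps.removeFirst()
        }
        return if (timestamps.size < maxRequests) {
            timestamps.addLast(now)
            true
        } else false
    }
}
```

Because the window slides continuously, there is no boundary at which a client can double its burst, which is exactly the fixed-window flaw the table calls out.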
