Interview Prep Resource

Android System Design Interview Questions & Answers

Detailed answers to the most important Android system design and technical questions asked in Senior, Staff, and Principal interviews, covering Coroutines, Compose, OkHttp, WorkManager, offline-first architecture, and more.

Topics: callbackFlow, StateFlow, SharedFlow, WorkManager, OkHttp, Compose, Offline-first, Rate limiting, Process death, Baseline Profiles

How to use this page

Each answer below reflects what a Staff or Senior engineer is expected to know. For level signal nuances (Senior vs Staff vs Principal), see the dedicated question on that topic. To practice interactively with real Android failure case studies, visit the Autopsy Lab.

Frequently Asked Questions

What is callbackFlow in Android and when should you use it?

callbackFlow is a Kotlin coroutines builder that lets you convert callback-based APIs into cold Flows. It creates a channel-backed Flow, making it safe to emit values from callbacks (including multiple threads) using trySend() or send(). You should use callbackFlow when wrapping APIs that deliver multiple values over time via callbacks — such as LocationManager.requestLocationUpdates(), sensor listeners, Bluetooth scan callbacks, or any event-driven SDK that registers a listener and fires it repeatedly. The key pattern is: register your listener in the callbackFlow block, emit values with trySend(), and use awaitClose { unregisterListener() } to clean up when the collector cancels. This ensures the listener is always unregistered even if the coroutine is cancelled, preventing leaks. Unlike callbackFlow, a simple suspendCancellableCoroutine is appropriate when you only need a single value (one-shot async operations).
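A minimal sketch of that pattern, wrapping a hypothetical listener-based SDK (`TemperatureSensor` and `TemperatureListener` are stand-ins for whatever callback API you are converting):

```kotlin
import kotlinx.coroutines.channels.awaitClose
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.callbackFlow

// Hypothetical callback-based SDK we want to wrap.
interface TemperatureListener { fun onReading(celsius: Double) }
class TemperatureSensor {
    var listener: TemperatureListener? = null
    fun register(l: TemperatureListener) { listener = l }
    fun unregister() { listener = null }
}

// Convert the callback API into a cold Flow.
fun TemperatureSensor.readings(): Flow<Double> = callbackFlow {
    val l = object : TemperatureListener {
        // trySend is safe to call from whatever thread the SDK fires on.
        override fun onReading(celsius: Double) { trySend(celsius) }
    }
    register(l)
    // Runs when the collector cancels: always unregister to avoid leaks.
    awaitClose { unregister() }
}
```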

What is the difference between StateFlow and SharedFlow?

StateFlow and SharedFlow are both hot flows in Kotlin coroutines, but they serve different roles. StateFlow is a state holder: it always has a current value, replays exactly one value to new collectors (the current state), conflates emissions (if you emit faster than collectors consume, intermediate values are dropped, keeping only the latest), and uses structural equality to avoid re-emitting the same value twice. It is ideal for UI state that a Composable or View needs to observe. SharedFlow is a general-purpose broadcast: it has no required initial value, supports configurable replay (replay = 0 for fire-and-forget events, replay = N for a ring buffer), and delivers all emissions without conflation by default. It is ideal for one-time events like navigation commands, Snackbar triggers, or analytics events that must not be dropped or deduplicated. A common pattern: expose UI state as StateFlow and one-time side effects as SharedFlow with replay = 0.
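The state-plus-events pattern above, condensed into a sketch (a plain class with an injected scope stands in for an androidx ViewModel with `viewModelScope`; all names are illustrative):

```kotlin
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.flow.*
import kotlinx.coroutines.launch

data class ScreenState(val query: String = "", val loading: Boolean = false)
sealed interface UiEvent { data class ShowSnackbar(val text: String) : UiEvent }

class SearchPresenter(private val scope: CoroutineScope) {
    // State: always has a current value; a new collector gets it immediately.
    private val _state = MutableStateFlow(ScreenState())
    val state: StateFlow<ScreenState> = _state.asStateFlow()

    // Events: replay = 0 so a late collector never re-receives an old Snackbar.
    private val _events = MutableSharedFlow<UiEvent>(replay = 0)
    val events: SharedFlow<UiEvent> = _events.asSharedFlow()

    fun onQueryChanged(q: String) { _state.update { it.copy(query = q) } }
    fun onError(msg: String) { scope.launch { _events.emit(UiEvent.ShowSnackbar(msg)) } }
}
```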

How do you design an offline-first Android app?

An offline-first app treats the local database as the single source of truth and syncs with the network in the background. The core pattern: (1) All reads go to the local database (Room), never directly to the network. (2) All writes go to the local database first, then a sync engine queues them for upload. (3) The network layer writes its responses back into the database, and the UI reacts to database changes automatically via Flow or LiveData. This means the app is always fast and functional regardless of connectivity. The architecture typically has: a Repository that coordinates between a local RoomDao and a remote RemoteDataSource; a SyncManager (WorkManager) that observes the outbox table and pushes pending changes; a NetworkConnectivityObserver that triggers sync when connectivity is restored; and conflict resolution logic for concurrent edits (last-write-wins, CRDTs, or server-wins depending on requirements). Key libraries: Room for local persistence, WorkManager for reliable background sync, and OkHttp with a caching interceptor for HTTP-level caching of GET requests.

What is the outbox pattern in Android development?

The outbox pattern is a reliable message delivery technique adapted for mobile. When the user performs a write action (send a message, submit a form, post a comment), the app immediately writes the data to a local 'outbox' table in Room with a status of PENDING. A background WorkManager job observes the outbox, picks up PENDING records, sends them to the server, and marks them as SENT on success or FAILED on terminal failure. This decouples the user action from the network call, making the app feel instant and ensuring writes are never lost due to connectivity drops. The outbox table typically contains: id, payload (serialized request), operation type, retry count, created timestamp, and status. WorkManager is the correct scheduler because it survives process death and respects battery constraints — you can attach a NetworkType.CONNECTED constraint so the sync job only fires when the device has connectivity.
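A sketch of the outbox table as a Room entity and DAO (table and column names are illustrative, not a prescribed schema):

```kotlin
import androidx.room.Dao
import androidx.room.Entity
import androidx.room.Insert
import androidx.room.PrimaryKey
import androidx.room.Query

@Entity(tableName = "outbox")
data class OutboxEntry(
    @PrimaryKey(autoGenerate = true) val id: Long = 0,
    val operation: String,   // e.g. "SEND_MESSAGE"
    val payload: String,     // serialized request body
    val status: String,      // PENDING / SENT / FAILED
    val retryCount: Int = 0,
    val createdAt: Long = System.currentTimeMillis(),
)

@Dao
interface OutboxDao {
    @Insert suspend fun insert(entry: OutboxEntry)

    @Query("SELECT * FROM outbox WHERE status = 'PENDING' ORDER BY createdAt")
    suspend fun pending(): List<OutboxEntry>

    @Query("UPDATE outbox SET status = :status WHERE id = :id")
    suspend fun setStatus(id: Long, status: String)
}
```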

When should you use WorkManager vs Foreground Service vs Coroutines?

These three tools address different use cases. Coroutines (lifecycleScope, viewModelScope) are appropriate for async work tied to a component's lifecycle — network calls, database operations, CPU computation that should stop if the user leaves. They are cancelled when the ViewModel is cleared. Foreground Service is appropriate when the user needs to be aware of long-running work that must continue while the app is in the background — playing music, tracking a run with GPS, making a phone call. Foreground Services show a persistent notification and run until explicitly stopped. WorkManager is appropriate for guaranteed, deferrable background work that must survive process death — syncing data, uploading logs, compressing images, sending analytics. WorkManager delegates to JobScheduler on API 23+ (and to AlarmManager plus BroadcastReceiver on older devices), respects Doze, and can be constrained by network type, battery level, and charging state. The decision tree: work tied to UI lifecycle → Coroutines; user-aware continuous background work → Foreground Service; guaranteed deferrable work that survives process death → WorkManager.
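Enqueueing such a guaranteed job might look like this (`SyncWorker` is an assumed CoroutineWorker defined elsewhere; the work name and backoff values are illustrative):

```kotlin
import android.content.Context
import androidx.work.BackoffPolicy
import androidx.work.Constraints
import androidx.work.ExistingWorkPolicy
import androidx.work.NetworkType
import androidx.work.OneTimeWorkRequestBuilder
import androidx.work.WorkManager
import java.util.concurrent.TimeUnit

fun scheduleOutboxSync(context: Context) {
    val request = OneTimeWorkRequestBuilder<SyncWorker>()
        .setConstraints(
            Constraints.Builder()
                .setRequiredNetworkType(NetworkType.CONNECTED) // only run online
                .build()
        )
        .setBackoffCriteria(BackoffPolicy.EXPONENTIAL, 30, TimeUnit.SECONDS)
        .build()

    // KEEP: if a sync is already queued, don't enqueue a duplicate.
    WorkManager.getInstance(context)
        .enqueueUniqueWork("outbox-sync", ExistingWorkPolicy.KEEP, request)
}
```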

What is certificate pinning and how do you implement it in OkHttp?

Certificate pinning is a security technique that makes your app reject TLS connections to servers whose certificate does not match a known, trusted pin — even if that certificate is signed by a valid CA. This prevents man-in-the-middle attacks using rogue CAs or compromised root certificates. In OkHttp, you implement it via CertificatePinner. You build a CertificatePinner with the hostname and the SHA-256 hash of the certificate's public key (prefixed with 'sha256/'). Add it to your OkHttpClient.Builder via .certificatePinner(pinner). You must pin multiple certificates (the current cert plus at least one backup pin) to avoid locking out your users during a certificate rotation. OkHttp also supports Network Security Configuration (network_security_config.xml) for declarative pinning, which is Android-native and slightly easier to rotate. Key considerations: test your pins in staging before production, implement a pin refresh mechanism via your app config system, and have a fallback strategy (disable pin via remote flag) in case of emergency rotation.
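A minimal CertificatePinner setup with a backup pin (the hostname and pin hashes are placeholders — generate real pins from your certificates):

```kotlin
import okhttp3.CertificatePinner
import okhttp3.OkHttpClient

// Placeholder pins: replace with the SHA-256 of your cert's public key.
val pinner = CertificatePinner.Builder()
    .add("api.example.com", "sha256/AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=") // current cert
    .add("api.example.com", "sha256/BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB=") // backup pin
    .build()

val pinnedClient: OkHttpClient = OkHttpClient.Builder()
    .certificatePinner(pinner)
    .build()
```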

How do you handle SSE (Server-Sent Events) in Android?

Server-Sent Events (SSE) is a unidirectional server-to-client streaming protocol over HTTP. On Android, you handle SSE using OkHttp with a streaming response body. The approach: make an OkHttp request with Accept: text/event-stream, then read the response body as a buffered source line by line without closing the connection. Each SSE event is separated by a blank line, with data: prefix on each line. A clean implementation wraps this in a callbackFlow: open the OkHttp call in the flow block, parse lines in a while loop calling source.readUtf8Line(), emit parsed events via trySend(), and cancel the OkHttp call in awaitClose. Alternatively, OkHttp's EventSource API (via the okhttp-sse artifact) provides a higher-level EventSourceListener interface. For reconnection, implement exponential backoff in the Flow's retry operator. SSE is appropriate for real-time feeds (live scores, notifications, chat read receipts) where you need server push but not bidirectional communication — use WebSocket when you need to send data back to the server.
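The framing logic — accumulating `data:` lines until a blank line terminates the event — is pure string handling, independent of the transport. A sketch (in a real client the `lines` sequence would come from `source.readUtf8Line()` on the streaming OkHttp body):

```kotlin
// Minimal SSE frame parser: each event is one or more "data:" lines
// terminated by a blank line. Multi-line data is joined with '\n'.
fun parseSse(lines: Sequence<String>): List<String> {
    val events = mutableListOf<String>()
    val buffer = StringBuilder()
    for (line in lines) {
        when {
            line.startsWith("data:") -> {
                if (buffer.isNotEmpty()) buffer.append('\n')
                buffer.append(line.removePrefix("data:").trim())
            }
            line.isEmpty() && buffer.isNotEmpty() -> {
                events += buffer.toString()
                buffer.clear()
            }
        }
    }
    return events
}
```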

What is the difference between MVVM and MVI architecture?

MVVM (Model-View-ViewModel) and MVI (Model-View-Intent) are both UI architecture patterns that separate UI from business logic, but they handle state differently. In MVVM, the ViewModel exposes multiple observable streams (StateFlow fields for different pieces of state), and the View observes and combines them. State mutations can come from many places, making the data flow harder to trace in complex screens. In MVI, the ViewModel exposes a single immutable UiState object and a single event stream. The View dispatches Intent objects (user actions) to the ViewModel, which processes each intent by computing a new UiState and emitting it. The flow is strictly unidirectional: View → Intent → ViewModel → UiState → View. This makes state mutations fully traceable and testable. MVI excels on complex screens with many user interactions and many state combinations. MVVM is simpler and sufficient for straightforward CRUD screens. In Compose, both patterns work well because Composables can observe StateFlow directly.
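The MVI loop described above can be boiled down to one state type, one intent type, and a pure reducer (names here are illustrative, not a framework API):

```kotlin
// Single immutable UI state.
data class UiState(val count: Int = 0, val loading: Boolean = false)

// User actions dispatched by the View.
sealed interface Intent {
    object Increment : Intent
    object StartLoading : Intent
}

// Pure function: every state mutation goes through here, so the flow
// View -> Intent -> reduce -> UiState -> View stays fully traceable.
fun reduce(state: UiState, intent: Intent): UiState = when (intent) {
    Intent.Increment -> state.copy(count = state.count + 1)
    Intent.StartLoading -> state.copy(loading = true)
}
```

Because the reducer is a pure function, unit-testing a screen's logic reduces to asserting on `reduce` outputs.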

What are Baseline Profiles and how do they improve cold start?

Baseline Profiles are a set of class and method rules (stored in a baseline-prof.txt file) that tell the Android Runtime (ART) to pre-compile critical code paths ahead of time (AOT), rather than interpreting or JIT-compiling them at runtime during first use. Without Baseline Profiles, code is interpreted or JIT-compiled on first execution — this is why the first launch of an app (or a fresh install) feels slower than subsequent launches. With Baseline Profiles, ART pre-compiles the profiled code paths during app installation or in the background, so they execute at native speed from the very first launch. Typically, Baseline Profiles improve cold start time by 20–40% and reduce jank during first interaction with a screen. You generate them with the Macrobenchmark library: write a BaselineProfileRule test that navigates your critical user journeys, run it on a device, and the generated profile is included in your APK. Jetpack Compose ships its own Baseline Profile, which is why adding Compose to a project does not slow cold start as much as it theoretically could.

How do you prevent memory leaks in Android?

Memory leaks in Android occur when a short-lived object is held by a long-lived object, preventing garbage collection. The most common sources and fixes: (1) Activity/Fragment leaks — never store an Activity or View reference in a static field, singleton, or ViewModel; use WeakReference or application context when you need context in a long-lived class. (2) Listeners not unregistered — always unregister listeners (LocationListener, BroadcastReceiver, callback interfaces) in the matching lifecycle method: onStop/onDestroy. Use lifecycle-aware components (LifecycleObserver) so cleanup is automatic. (3) Coroutines launched outside a scope — always use lifecycleScope or viewModelScope so coroutines are cancelled when the lifecycle owner is destroyed. (4) Anonymous inner classes — they hold an implicit reference to the outer class; in Java use static nested classes, and in Kotlin prefer non-inner nested classes or lambdas that capture nothing long-lived. (5) Bitmaps — always recycle or let the image loading library manage lifecycle. Use LeakCanary in debug builds to detect leaks automatically.

What is the OkHttp Authenticator and when do you use it?

OkHttp's Authenticator interface is a callback invoked automatically when a server returns a 401 Unauthorized response. It gives you the opportunity to refresh credentials and retry the original request transparently, without the caller needing to handle auth failures manually. You implement it by overriding authenticate(route, response): inside, check if you've already attempted a refresh (by inspecting the response's prior responses to prevent infinite loops), refresh the token (synchronously, since Authenticator runs on OkHttp's thread pool), update your token storage, and return the original request rebuilt with the new Authorization header. Return null to give up and propagate the 401 to the caller. Use Authenticator for OAuth2 token refresh flows where you need silent re-authentication. For proactive token attachment (adding a Bearer token to every outgoing request), use an Interceptor instead — Interceptor fires before the request is sent, Authenticator fires reactively after a 401 is received.
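A sketch of that refresh flow (`TokenStore` and its synchronous `refreshBlocking()` are assumed abstractions over your token storage and refresh endpoint):

```kotlin
import okhttp3.Authenticator
import okhttp3.Request
import okhttp3.Response
import okhttp3.Route

interface TokenStore { fun refreshBlocking(): String? }

class TokenAuthenticator(private val tokenStore: TokenStore) : Authenticator {
    override fun authenticate(route: Route?, response: Response): Request? {
        // A non-null priorResponse means this request already went through
        // authenticate() once: give up rather than loop forever.
        if (response.priorResponse != null) return null

        // Authenticator runs on OkHttp's thread pool, so a blocking refresh is fine.
        val newToken = tokenStore.refreshBlocking() ?: return null

        // Retry the original request with the fresh credentials.
        return response.request.newBuilder()
            .header("Authorization", "Bearer $newToken")
            .build()
    }
}
```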

How do you implement rate limiting on the Android client?

Client-side rate limiting prevents your app from flooding the server with requests and provides a better user experience by queuing or throttling calls. The primary approaches: (1) Debouncing — delay emitting a value until a quiet period has passed. Ideal for search-as-you-type (debounce 300ms with Flow's debounce operator). (2) Throttling — emit only the first event in a time window, discarding subsequent ones. Ideal for button taps to prevent double-submission; Flow ships sample() for this family, while a throttleFirst operator (familiar from RxJava) has to be written by hand. (3) Token bucket algorithm — maintain a bucket of N tokens that refills at a fixed rate (R tokens/second). Each request consumes one token; if the bucket is empty, the request is queued or rejected. Implement with a semaphore or a scheduled refill coroutine. (4) OkHttp Interceptor — intercept outgoing requests, check an in-memory rate limit counter, and either proceed or delay the request. In Kotlin Flows, debounce() and sample() are the most idiomatic tools. For critical paths, combine client-side throttling with server-side rate limit headers (Retry-After) parsed in your OkHttp Interceptor.
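The debounce approach for search-as-you-type can be sketched as a small pipeline (`search` is an assumed suspend function hitting your API):

```kotlin
import kotlinx.coroutines.ExperimentalCoroutinesApi
import kotlinx.coroutines.FlowPreview
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.debounce
import kotlinx.coroutines.flow.distinctUntilChanged
import kotlinx.coroutines.flow.mapLatest

@OptIn(FlowPreview::class, ExperimentalCoroutinesApi::class)
fun searchResults(
    queries: Flow<String>,
    search: suspend (String) -> List<String>,
): Flow<List<String>> =
    queries
        .debounce(300)            // wait for a 300 ms quiet period in typing
        .distinctUntilChanged()   // skip queries identical to the previous one
        .mapLatest { search(it) } // a new query cancels the in-flight search
```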

What is Compose recomposition and how do you prevent unnecessary recompositions?

Recomposition is Compose's mechanism for updating the UI when state changes: Compose re-executes Composable functions whose inputs have changed. Unnecessary recompositions waste CPU and can cause jank. To prevent them: (1) Make Composables read only the state they need — passing a specific string from a state object causes recomposition only when that string changes, while passing the whole object causes recomposition on any field change. (2) Use stable types — Compose skips recomposing a Composable if all its parameters are stable and unchanged. A type is stable if it's immutable or if Compose can determine it hasn't changed. Annotate your data classes with @Stable or @Immutable to inform the compiler. (3) Avoid creating lambdas inside Composable bodies — inline lambdas are recreated on every recomposition; hoist them or wrap with remember { }. (4) Use derivedStateOf { } for derived state that should only trigger recomposition when the derived value changes, not when the source state changes. (5) Use LazyColumn/LazyRow with keys — passing key = { it.id } to items() tells Compose to reuse Composables for the same items rather than recomposing all of them on list changes.

What is the difference between Senior, Staff, and Principal Android engineer system design?

The level differences in system design interviews reflect scope, proactivity, and judgment. A Senior engineer is expected to design a well-scoped component correctly: they ask functional requirements, identify the main data flows, choose appropriate libraries (Room, WorkManager, Retrofit), and handle the happy path with some error handling. They may need prompting on non-functional requirements. A Staff engineer opens with both functional and non-functional requirements unprompted, derives a scale estimate proactively (DAU, event rate, storage growth), names the consistency model, anticipates failure modes (network partitions, process death, concurrent writes), and considers observability (logging, metrics, tracing). They design for change — their architecture accommodates the next feature without a rewrite. A Principal engineer frames requirements in terms of business constraints and team-wide trade-offs. They reason about organizational concerns (which team owns which boundary, how a decision affects other apps in the fleet), challenge assumptions in the prompt, and discuss multi-year architectural consequences of each choice. They volunteer tradeoffs proactively rather than defending one approach.

How do you design a real-time chat application on Android?

A real-time chat app on Android requires: (1) Transport layer — WebSocket for bidirectional, low-latency messaging (OkHttp WebSocket API or Scarlet library). Keep one persistent WebSocket connection per active session; use a Foreground Service to maintain it while the app is backgrounded. Reconnect with exponential backoff on disconnect. (2) Local persistence — Room with a messages table. Write all incoming messages to Room first; the UI observes a Flow<List<Message>> from Room rather than the WebSocket directly. This gives you offline reading and instant load. (3) Optimistic UI — when the user sends a message, insert it into Room immediately with status = SENDING, then send over WebSocket. Update to SENT on server ACK, FAILED on error. (4) Message ordering — assign each message a logical clock value (Lamport timestamp or server-assigned monotonic ID). On receipt, sort by logical clock, not device time. (5) Delivery & read receipts — send/receive receipt events over the same WebSocket, update message status in Room. (6) Push notifications — use FCM for waking the app when the WebSocket is disconnected. The FCM payload contains enough info to show a notification without needing a full sync.

What is the token bucket algorithm for rate limiting?

The token bucket algorithm is a traffic shaping mechanism that allows burst traffic up to a maximum while enforcing a long-term average rate. The model: a bucket holds up to N tokens (the burst capacity). Tokens are added at a fixed rate R (e.g., 5 tokens/second). Each outgoing request consumes 1 token. If the bucket is full (N tokens), new tokens are discarded. If the bucket is empty (0 tokens), incoming requests must either wait or be rejected. The key property is that the bucket allows short bursts (consuming stored tokens) while the refill rate enforces the sustained throughput ceiling. On Android, you can implement this with a coroutine-based approach: a Mutex-protected token count and a coroutine that adds tokens on a fixed schedule. Alternatively, Guava's RateLimiter implements a token bucket variant you can use directly. For OkHttp, wrap the logic in an Interceptor that calls acquire() before proceeding. Token bucket differs from the leaky bucket: token bucket allows bursts, leaky bucket enforces a constant output rate regardless of input timing.
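A compact token bucket in plain Kotlin — the clock is injectable so refill behaviour is deterministic in tests (in production you would keep the `System::nanoTime` default):

```kotlin
// Token bucket: allows bursts up to `capacity` while enforcing a sustained
// average of `refillPerSecond` requests.
class TokenBucket(
    private val capacity: Int,
    private val refillPerSecond: Double,
    private val nowNanos: () -> Long = System::nanoTime,
) {
    private var tokens = capacity.toDouble()
    private var lastRefill = nowNanos()

    @Synchronized
    fun tryAcquire(): Boolean {
        // Refill lazily based on elapsed time, capped at burst capacity.
        val now = nowNanos()
        val elapsedSec = (now - lastRefill) / 1_000_000_000.0
        tokens = minOf(capacity.toDouble(), tokens + elapsedSec * refillPerSecond)
        lastRefill = now
        // Consume one token if available; otherwise reject.
        return if (tokens >= 1.0) { tokens -= 1.0; true } else false
    }
}
```

Wrapping `tryAcquire()` in an OkHttp Interceptor (delaying or failing when it returns false) gives you the client-side limiter described earlier.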

How do you test Kotlin Flows?

Testing Kotlin Flows requires controlling coroutine execution and time. The primary tools: (1) kotlinx-coroutines-test library with runTest { } — a coroutine test scope that auto-advances virtual time and completes immediately. (2) Turbine library (by Cash App) — provides the awaitItem(), awaitComplete(), and awaitError() DSL for asserting Flow emissions without complex cancellation handling. With Turbine: flow.test { assertEquals(expectedItem, awaitItem()) }. (3) For StateFlow and SharedFlow: use stateIn(backgroundScope) inside runTest to keep the flow active for the duration of the test. (4) For testing time-dependent flows (debounce, delay, retry): use runTest's advanceTimeBy(millis) to skip virtual time without real delays. (5) For repository or ViewModel tests: use FakeRepository or mock datasources that return test Flows via MutableStateFlow or flowOf(). (6) Replace Dispatchers.IO/Main with TestDispatcher via Dispatchers.setMain(testDispatcher) in @Before and resetMain() in @After to control threading. The combination of runTest + Turbine covers the vast majority of Flow testing scenarios idiomatically.
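A minimal runTest + Turbine example of the shape described above (a trivial `flowOf` stands in for your production Flow; requires the kotlinx-coroutines-test and Turbine test dependencies):

```kotlin
import app.cash.turbine.test
import kotlinx.coroutines.flow.flowOf
import kotlinx.coroutines.test.runTest
import kotlin.test.Test
import kotlin.test.assertEquals

class GreetingFlowTest {
    @Test
    fun emitsItemsInOrder() = runTest {
        flowOf("a", "b").test {
            assertEquals("a", awaitItem())
            assertEquals("b", awaitItem())
            awaitComplete()
        }
    }
}
```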

What is stateIn() and why use WhileSubscribed(5000)?

stateIn() is a Flow operator that converts a cold Flow into a hot StateFlow, making it suitable for UI observation. It takes three parameters: scope (the CoroutineScope in which the upstream Flow runs), started (the sharing strategy), and initialValue. The started parameter controls when the upstream collection starts and stops. SharingStarted.WhileSubscribed(stopTimeoutMillis = 5000) is the recommended strategy for ViewModels: the upstream Flow starts collecting when the first subscriber attaches (screen becomes visible) and stops 5 seconds after the last subscriber detaches. The 5-second grace period is specifically sized to survive configuration changes — when the user rotates the device, the Fragment/Activity is destroyed and recreated, detaching and reattaching collectors within ~1–2 seconds. Without the grace period (WhileSubscribed(0)), the upstream would be cancelled and restarted on every rotation, losing in-flight data and potentially re-triggering expensive network calls. The 5-second window also covers brief app backgrounding (switching apps quickly). Using Lazily starts once and never stops; Eagerly starts immediately at ViewModel creation. WhileSubscribed(5000) is the lifecycle-aware middle ground.
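In a ViewModel this typically looks like the following sketch (`User` and `UserRepository` are assumed types; `observeUser()` stands in for a cold Room Flow):

```kotlin
import androidx.lifecycle.ViewModel
import androidx.lifecycle.viewModelScope
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.SharingStarted
import kotlinx.coroutines.flow.StateFlow
import kotlinx.coroutines.flow.stateIn

data class User(val name: String)
interface UserRepository { fun observeUser(): Flow<User?> } // e.g. a Room DAO Flow

class ProfileViewModel(repository: UserRepository) : ViewModel() {
    val user: StateFlow<User?> = repository.observeUser()
        .stateIn(
            scope = viewModelScope,
            // Keeps the upstream alive for 5 s after the last collector
            // detaches, so a rotation does not cancel and restart it.
            started = SharingStarted.WhileSubscribed(stopTimeoutMillis = 5_000),
            initialValue = null,
        )
}
```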

How do you handle process death in Android?

Process death is the OS terminating your app's process while it is in the background to reclaim memory. Your app can then be relaunched directly into whatever screen the user last saw, so you must restore all necessary state. Strategies: (1) SavedStateHandle in ViewModel — persists a small amount of state (IDs, query strings, selected tab) across process death and configuration changes via Bundle. Inject it into your ViewModel via Hilt or by delegation. (2) Room database — all durable application data should already be in Room; on relaunch, reload from Room. (3) DataStore / SharedPreferences — for user preferences and small primitives. (4) onSaveInstanceState — for transient UI state that is not in a ViewModel (scroll position, selection state). (5) Avoid storing large objects in SavedStateHandle; it is backed by Bundle which has a 1MB IPC limit. (6) Test for process death using adb shell am kill <package> while the app is backgrounded, then foreground it via the Recents menu. Common mistake: reloading all data from the network on every relaunch — always check Room first and only fetch if the local data is stale.
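A SavedStateHandle sketch for the pattern in point (1) — the "query" key and the ViewModel itself are illustrative:

```kotlin
import androidx.lifecycle.SavedStateHandle
import androidx.lifecycle.ViewModel
import kotlinx.coroutines.flow.StateFlow

class SearchViewModel(private val savedState: SavedStateHandle) : ViewModel() {
    // Backed by the saved-state Bundle: restored automatically after
    // process death or a configuration change.
    val query: StateFlow<String> = savedState.getStateFlow("query", "")

    fun onQueryChanged(q: String) {
        savedState["query"] = q // keep values small: Bundle has a ~1 MB IPC limit
    }
}
```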

What is distributed tracing with W3C traceparent in Android?

Distributed tracing lets you follow a single user request across the entire stack — from the Android app, through the API gateway, to multiple backend services. The W3C Trace Context specification defines a standard HTTP header, traceparent, with the format: version-traceId-parentId-flags (e.g., 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01). traceId is a 16-byte globally unique ID for the entire transaction. parentId is the 8-byte ID of the current span on the sending side — the receiver records it as its parent. On Android, you generate a traceId at the start of an important user action (e.g., tapping Search), attach it to all outgoing OkHttp requests as the traceparent header, and send it in your analytics events. Your backend propagates the header through its internal calls. You can then query your observability platform (Jaeger, Zipkin, Honeycomb, Datadog) for the traceId and see the full waterfall — which service was slow, where errors occurred, and what the Android client experienced vs what the server saw. In OkHttp, add traceparent in an Interceptor. Libraries like OpenTelemetry for Android provide automatic instrumentation.
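A simplified Interceptor that attaches a well-formed traceparent header (for illustration it generates a fresh traceId per request; in a real tracer you would generate the traceId once per user action and propagate it):

```kotlin
import okhttp3.Interceptor
import okhttp3.Response
import kotlin.random.Random

class TraceparentInterceptor : Interceptor {
    // Lowercase hex, zero-padded, as the W3C spec requires.
    private fun randomHex(byteCount: Int): String =
        Random.nextBytes(byteCount).joinToString("") {
            (it.toInt() and 0xff).toString(16).padStart(2, '0')
        }

    override fun intercept(chain: Interceptor.Chain): Response {
        val traceId = randomHex(16) // 16 bytes -> 32 hex chars
        val spanId = randomHex(8)   // 8 bytes  -> 16 hex chars
        val request = chain.request().newBuilder()
            // version 00, sampled flag 01
            .header("traceparent", "00-$traceId-$spanId-01")
            .build()
        return chain.proceed(request)
    }
}
```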

What is the difference between cold and hot Flows in Kotlin?

A cold Flow starts executing its producer block fresh for each collector. No work happens until collect() is called, and each collector gets its own independent sequence of values. flowOf(), flow { }, channelFlow, and Room DAO Flows are cold — they produce values on demand per collector. A hot Flow produces values regardless of whether anyone is collecting, and multiple collectors share the same upstream. StateFlow and SharedFlow are hot. The practical difference: if you collect a cold Flow twice, the network call or database query runs twice. If you collect a StateFlow twice, both collectors read from the same in-memory state, and the upstream runs only once. Converting a cold Flow to hot (using stateIn() or shareIn()) is important in ViewModels so that multiple Composables observing the same data don't each trigger independent network calls. Hot Flows are also appropriate for event buses, real-time sensor data, or WebSocket message streams where values exist independently of any observer.

How do you design a feed (like Twitter or Instagram) on Android?

Designing a social feed involves: (1) Pagination — use Jetpack Paging 3 with a PagingSource backed by a RemoteMediator. The RemoteMediator fetches from the network, writes to Room, and Room is the source of truth for the PagingSource. This provides offline support and seamless pagination. (2) RecyclerView/LazyColumn — with stable keys (item.id) for efficient diffing. Use DiffUtil or Paging 3's built-in diffing. (3) Image loading — Coil or Glide with automatic lifecycle management, downsampling, and LRU caching. Set a max heap size for the image cache (typically 20–30% of available memory). (4) Video autoplay (for Reels-style content) — use a single shared ExoPlayer instance that tracks the most visible video item. Release the player when the item leaves the screen. Limit concurrent decode pipelines to 1–2. (5) Real-time updates — use SSE or WebSocket to receive new post notifications; insert into Room and let Paging 3 invalidate the PagingSource. (6) Pull-to-refresh — invalidate the PagingSource and RemoteMediator will re-fetch. The critical insight: the database is always the UI's source of truth; the network writes to the database; the UI observes the database.

What is the difference between lateinit and lazy in Kotlin?

Both defer property initialization but serve different purposes. lateinit is for var properties of non-nullable reference types that you guarantee will be initialized before first use but cannot initialize at declaration time (e.g., View binding in onCreateView, or fields injected by a DI framework). It throws UninitializedPropertyAccessException if accessed before initialization; check with ::property.isInitialized. It cannot be used with primitive types, nullable types, or val. lazy is for val properties that are computed on first access and then cached. The lambda you provide runs exactly once, and all subsequent accesses return the cached value. By default, lazy is thread-safe (synchronized); you can pass LazyThreadSafetyMode.NONE for a single-threaded context if performance is critical. Use lateinit for DI-injected mutable fields and view references. Use lazy for expensive computations (parsing, heavy object construction) that you want to defer until actually needed.
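A small illustration of both (the class and its properties are invented for the example; the counter only exists to show that the lazy initializer runs once):

```kotlin
class ReportScreen {
    // lateinit: assigned later, e.g. by a DI framework or in onCreateView.
    // Reading it before assignment throws UninitializedPropertyAccessException.
    lateinit var title: String

    // lazy: the lambda runs exactly once, on first access, then is cached.
    var parseCount = 0
        private set
    val parsedConfig: Map<String, String> by lazy {
        parseCount++
        mapOf("theme" to "dark") // stand-in for an expensive parse
    }
}
```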

How does OkHttp's connection pool and keep-alive work?

OkHttp maintains a ConnectionPool that reuses TCP connections across multiple HTTP requests to the same host, avoiding the latency cost of a full TCP handshake and TLS negotiation on every request. By default, the pool keeps up to 5 idle connections, each alive for 5 minutes. HTTP/1.1 uses Connection: keep-alive headers; HTTP/2 multiplexes multiple requests over a single connection natively. You configure the pool via ConnectionPool(maxIdleConnections, keepAliveDuration, timeUnit) and pass it to OkHttpClient.Builder. In practice: for an app making many concurrent requests to the same API (e.g., a paginated feed loading 10 items in parallel), connection reuse cuts latency by 50–200ms per request on mobile networks. For an app that only makes occasional requests, the default pool settings are appropriate. One OkHttpClient instance should be shared across your entire app — creating multiple instances means each has its own pool and cache, wasting resources and defeating the purpose.
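Configuring the pool explicitly looks like this (the values shown match OkHttp's documented defaults, so this block is illustrative rather than a tuning recommendation):

```kotlin
import okhttp3.ConnectionPool
import okhttp3.OkHttpClient
import java.util.concurrent.TimeUnit

// One shared client for the whole app, so all requests reuse one pool and cache.
val sharedClient: OkHttpClient = OkHttpClient.Builder()
    .connectionPool(ConnectionPool(5, 5, TimeUnit.MINUTES)) // 5 idle conns, 5 min keep-alive
    .build()
```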

What is Dependency Injection and why use Hilt over manual DI?

Dependency Injection (DI) is a design pattern where a class receives its dependencies from an external provider rather than creating them internally. This makes classes testable (you can inject fakes), replaceable (swap implementations without changing the dependent class), and reduces coupling. Hilt is Jetpack's DI framework built on top of Dagger. Compared to manual DI (creating and passing dependencies by hand): (1) Hilt generates the wiring at compile time — no reflection overhead at runtime, unlike some DI frameworks. (2) It is lifecycle-aware — @ActivityScoped, @ViewModelScoped, @Singleton scopes align with Android lifecycle, so Hilt knows when to create and destroy each object. (3) It integrates with ViewModels via @HiltViewModel and SavedStateHandle injection. (4) It reduces boilerplate — you annotate modules with @Module/@Provides or constructors with @Inject; Hilt handles the graph resolution. Compared to Koin (a service-locator-style framework): Hilt's compile-time validation catches mismatched bindings before runtime, whereas Koin's errors only appear at runtime when the object is first requested.

How do you measure and improve app startup time on Android?

Android startup time is measured in three phases: Cold start (no process, Activity fully drawn), Warm start (process exists, Activity recreated), and Hot start (process and Activity both in memory, brought to foreground). Tools: (1) Android Studio's App Startup profiler — trace method execution during startup. (2) reportFullyDrawn() API — call this when your critical content is visible; Android reports it in logcat and Perfetto. (3) Macrobenchmark library — write a StartupBenchmark that measures time-to-first-frame in CI. Improvement strategies: (1) Baseline Profiles — pre-compile critical code paths, typically saves 20–40% cold start. (2) Lazy initialization — use App Startup library (Jetpack) to defer non-critical initializations out of Application.onCreate(). (3) Splash screen — use the SplashScreen API (API 31+) to show a branded splash immediately while the app initializes. (4) Reduce main thread work — any disk I/O, network call, or CPU-intensive parsing in onCreate() blocks launch. Move to background threads. (5) Reduce class loading — fewer classes loaded = less JIT work on first launch. R8 and Baseline Profiles both help here.

Ready to practice with real failure case studies?

The Autopsy Lab contains real Android post-mortems from Snapchat, Twitter, Uber, WhatsApp, and more. Diagnose the root cause before reading the verdict.