System Design Flashcards

(134 cards)

1
Q

What is the RADIO framework for System Design interviews?

A

Requirements (5-10 min: functional=what system does + non-functional=scale-latency-availability-consistency) + Architecture (10 min: high-level diagram - identify major components) + Data model (5-10 min: entities - schema - DB choice) + Interface/API (5 min: key endpoints + request/response) + Optimizations (15-20 min: deep dive on bottlenecks - caching - sharding - trade-offs). Drive the conversation. State assumptions. Clarify before designing. Never jump to solutions without requirements

2
Q

What questions do you ask at the start of a System Design interview?

A

Functional: Who are the users? What are the core features? What does the system do? What does it NOT need to do? Non-functional: How many users (DAU/MAU)? Read-heavy or write-heavy? What is acceptable latency? What availability SLA is required (99.9% vs 99.999%)? Is strong consistency required or eventual consistency OK? What is the data retention period? Any geographic distribution? Mobile or web or both? Starting with requirements shows senior engineering thinking

3
Q

What numbers should every System Design candidate memorize?

A

Latency: L1 cache=1ns + L2 cache=10ns + RAM=100ns + SSD random read=100µs + HDD=10ms + same-DC network=1ms + cross-continent=150ms. Storage: 1 char=1B + 1 tweet=300B + 1 photo=1MB + 1 video-min=10MB + 1 song=5MB. Throughput: 1 server handles 1K-10K req/s. Scale math: 100M DAU * 10 actions/day = 1B req/day = ~11K req/s. Data volume: 1M users * 1KB/user = 1GB. 1B * 1KB = 1TB. Bandwidth: 1Gbps = 125MB/s

4
Q

How do you estimate scale in a System Design interview?

A

DAU (Daily Active Users) -> requests per second: DAU * actions_per_day / 86400. Storage: users * data_per_user * retention_period. Bandwidth: req/s * avg_response_size. Example: Twitter 100M DAU * 10 tweets/day = 1B tweets/day = 11.5K writes/sec. Read:write = 100:1 -> 1.15M reads/sec. Storage: 1B tweets/day * 300 bytes = 300GB/day = 110TB/year. Always round numbers. Show your math. Peak traffic = 2-3x average
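The back-of-envelope steps above can be checked with a few lines (a minimal sketch using the card's Twitter-style example numbers):

```python
def requests_per_second(dau: int, actions_per_day: int) -> float:
    # DAU * actions/day spread over the 86,400 seconds in a day
    return dau * actions_per_day / 86_400

# Card's example: 100M DAU * 10 tweets/day = 1B writes/day
writes_per_sec = requests_per_second(100_000_000, 10)   # ~11.6K/s
reads_per_sec = writes_per_sec * 100                    # 100:1 read:write ratio
storage_gb_per_day = 1_000_000_000 * 300 / 1e9          # 1B tweets * 300B = 300 GB
peak_writes_per_sec = writes_per_sec * 3                # peak = 2-3x average
```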

5
Q

What is the difference between Latency and Throughput?

A

Latency: time to complete ONE request (milliseconds). Lower is better. P50 (median) + P95 + P99 (tail latency) are key metrics. Throughput: number of requests completed per unit time (requests/second). Higher is better. They interact: high throughput often increases latency (queuing). Low latency requires fast processing per request. Design target: minimize latency for user-facing APIs (P99 < 100ms) + maximize throughput for batch processing

6
Q

What is SLA - SLO and SLI?

A

SLI (Service Level Indicator): actual measurement (request latency + error rate + availability percentage). SLO (Service Level Objective): internal target for an SLI (P99 latency < 200ms + error rate < 0.1% + availability > 99.9%). SLA (Service Level Agreement): external contract with customers including penalties for violations. Hierarchy: SLO stricter than SLA (internal targets must be tighter than customer commitments). Error budget: 100% - SLO = allowed unreliability (99.9% SLO = 8.7h error budget/year)
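The error-budget arithmetic can be expressed directly (a minimal sketch; the year is taken as 365 days):

```python
def error_budget_hours(slo_percent: float, period_hours: float = 365 * 24) -> float:
    # Allowed downtime per period = (100% - SLO) of the period
    return (100 - slo_percent) / 100 * period_hours

three_nines = error_budget_hours(99.9)    # ~8.76 hours/year
five_nines = error_budget_hours(99.999)   # ~5.3 minutes/year
```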

7
Q

What are the main non-functional requirements to consider in System Design?

A

Availability (what % uptime: 99.9%=8.7h downtime/yr + 99.99%=52min + 99.999%=5min). Scalability (handle 10x growth without redesign). Latency (P99 response time target). Throughput (requests per second). Consistency (strong vs eventual). Durability (data never lost: replication + backups). Fault Tolerance (continue operating despite failures). Security (auth + encryption + rate limiting). Maintainability (easy to change + monitor). Cost efficiency

8
Q

MEMORY BOOSTER: System design approach

A

RADIO: Requirements -> Architecture -> Data model -> Interface/API -> Optimizations. Ask first: DAU + read/write ratio + latency requirement + consistency need + availability SLA. Scale math: DAU * actions / 86400 = req/s. Latency hierarchy: L1(1ns) < RAM(100ns) < SSD(100µs) < network(1ms) < HDD(10ms) < cross-continent(150ms). SLI=measurement. SLO=internal target. SLA=customer contract. Peak = 2-3x average. Always state trade-offs. Drive the conversation

9
Q

What are the core building blocks of every large-scale system?

A

DNS (domain -> IP resolution) + CDN (static content + edge caching) + Load Balancer (distribute traffic + health checks) + API Gateway (auth + rate limit + routing) + Stateless App Servers (horizontally scalable) + Cache (Redis/Memcached - reduce DB load) + Message Queue (Kafka/SQS - async + decoupling) + Primary Database (writes) + Read Replicas (scale reads) + Object Storage (S3 - files/images/videos) + Search Engine (Elasticsearch) + Monitoring (metrics + logs + traces)

10
Q

What is a Load Balancer and what algorithms does it use?

A

Load Balancer distributes incoming traffic across multiple servers. Algorithms: Round Robin (rotate through servers - simple) + Weighted Round Robin (more traffic to powerful servers) + Least Connections (route to server with fewest active connections - best for long-lived connections) + IP Hash (same client always goes to same server - session stickiness) + Random. Types: Layer 4 (TCP/UDP - fast - no content inspection) + Layer 7 (HTTP - content-based routing - can read headers/cookies/URLs). Health checks remove failed servers automatically
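Two of the algorithms above are small enough to sketch (illustrative only; real balancers like NGINX or HAProxy implement these internally):

```python
import itertools

class RoundRobin:
    """Rotate through servers in fixed order."""
    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)

def least_connections(active_connections):
    """Route to the server with the fewest active connections -
    a good fit for long-lived connections."""
    return min(active_connections, key=active_connections.get)
```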

11
Q

What is a CDN (Content Delivery Network)?

A

CDN is a globally distributed network of edge servers that cache content close to users. Reduces latency (serve from nearest PoP) + reduces origin server load + improves availability. Content types: static (images - CSS - JS - videos - constant) + dynamic (can be cached with short TTL or with edge compute). Cache invalidation: TTL expiry + manual purge + versioned URLs (main.v2.js). Examples: CloudFront + Akamai + Cloudflare. CDN absorbs DDoS attacks. Use CDN for: any content served globally + large file downloads + streaming

12
Q

What is an API Gateway and what does it do?

A

API Gateway is a single entry point for all client requests. Handles: authentication + authorization + rate limiting + SSL termination + request routing to microservices + request/response transformation + load balancing + logging + caching + protocol translation (HTTP->gRPC) + API versioning + circuit breaking. Examples: AWS API Gateway + Kong + NGINX + Apigee. Prevents clients knowing internal service topology. Single place to enforce cross-cutting concerns. Potential bottleneck: must be highly available + scaled independently

13
Q

What is the difference between Forward Proxy and Reverse Proxy?

A

Forward Proxy: sits in front of CLIENTS. Client -> Forward Proxy -> Internet. Use cases: anonymize clients + bypass geo-restrictions + corporate internet filtering + cache responses for internal clients. Reverse Proxy: sits in front of SERVERS. Internet -> Reverse Proxy -> Servers. Use cases: load balancing + SSL termination + caching + DDoS protection + hide internal server topology. Nginx and HAProxy act as reverse proxies. CDN is a distributed reverse proxy

14
Q

What is DNS and how does it work?

A

DNS (Domain Name System) translates domain names to IP addresses. Hierarchy: root servers -> TLD servers (.com .org) -> authoritative name servers. DNS resolution: browser checks local cache -> OS cache -> ISP resolver -> root -> TLD -> authoritative. TTL controls cache duration. DNS record types: A (domain->IPv4) + AAAA (domain->IPv6) + CNAME (alias to another domain) + MX (mail server) + TXT (verification). DNS-based load balancing: multiple A records for same domain. Failover: change A record when server fails

15
Q

What is the difference between TCP and UDP?

A

TCP: connection-oriented + reliable (ACK + retransmit) + ordered delivery + flow control + congestion control + higher latency. Use for: web (HTTP/HTTPS) + email + file transfer + databases. UDP: connectionless + unreliable (no ACK) + unordered + no flow control + lower latency + lower overhead. Use for: video streaming (prefer speed over retransmit) + gaming + DNS + VoIP + live video. HTTP/3 uses QUIC (UDP-based) for lower latency with reliability built in

16
Q

What is HTTP/1.1 vs HTTP/2 vs HTTP/3?

A

HTTP/1.1: one outstanding request at a time per connection (keep-alive reuses connections; pipelining rarely works) + HOL blocking + plain text headers. HTTP/2: multiplexing (multiple concurrent requests per connection) + header compression (HPACK) + server push + binary protocol + still uses TCP (TCP HOL blocking). HTTP/3: built on QUIC (UDP-based) + eliminates TCP HOL blocking + faster connection setup (0-RTT) + connection migration (phone switches WiFi->cellular seamlessly). Modern browsers support HTTP/2 and HTTP/3. gRPC uses HTTP/2

17
Q

What is WebSocket and when do you use it?

A

WebSocket provides full-duplex bidirectional communication over a single TCP connection. Starts as HTTP upgrade request. Use when: real-time updates needed (chat + live feed + gaming + collaborative editing + stock prices + notifications). Unlike HTTP polling (client asks repeatedly): WebSocket server can push data anytime. Challenges: stateful connections (harder to scale horizontally - need sticky sessions or pub-sub for server fan-out) + connection management at scale. Alternatives: Server-Sent Events (SSE - one-way push - simpler)

18
Q

What is Long Polling - SSE and WebSocket differences?

A

Short Polling: client requests every N seconds (simple + wasteful + high latency). Long Polling: client requests -> server holds connection until data available -> client immediately re-requests. Moderate latency + HTTP compatible + higher server connections. SSE (Server-Sent Events): server pushes stream to client over HTTP + one-way (server->client only) + auto-reconnect built-in + simple. WebSocket: full-duplex (both directions) + lowest latency + most complex. Use SSE for: news feeds + notifications. Use WebSocket for: chat + gaming + collaborative tools

19
Q

MEMORY BOOSTER: Core building blocks

A

Every system: DNS + CDN + LB + API Gateway + Stateless Servers + Cache + Queue + DB + Object Storage + Monitoring. LB algorithms: Round Robin + Least Connections + IP Hash + Weighted. CDN: cache at edge (TTL + versioned URLs for invalidation). API Gateway: auth + rate limit + route + transform + log. Reverse Proxy = in front of servers (LB + SSL + cache). Forward Proxy = in front of clients. TCP = reliable ordered. UDP = fast unreliable. HTTP/2 = multiplexing. HTTP/3 = QUIC. WebSocket = full-duplex real-time

20
Q

What are the differences between SQL and NoSQL databases?

A

SQL (Relational): ACID transactions + strong consistency + complex joins + normalized schema + fixed structure + vertical scaling primary + examples: PostgreSQL - MySQL - Oracle. NoSQL: BASE (Basically Available Soft-state Eventually consistent) + horizontal scaling + flexible schema + simple access patterns + types: Document(MongoDB) + Key-Value(Redis/DynamoDB) + Wide-Column(Cassandra/HBase) + Graph(Neo4j). Choose SQL for: financial transactions + complex relationships. Choose NoSQL for: massive scale + simple access + flexible schema + high write throughput

21
Q

When do you choose each type of NoSQL database?

A

Key-Value (Redis/DynamoDB): O(1) get/put by key + session storage + caching + shopping cart + user preferences + leaderboards. Document (MongoDB/Firestore): JSON documents + flexible schema + content management + user profiles + catalogs. Wide-Column (Cassandra/HBase): time-series + high write throughput + IoT data + activity logs + partition key determines data location. Graph (Neo4j/Neptune): relationships are first-class + social networks + recommendation engines + fraud detection + knowledge graphs. Search (Elasticsearch): full-text search + log analytics + faceted search

22
Q

What is database sharding and how does it work?

A

Sharding (horizontal partitioning) splits data across multiple database instances. Each shard holds a subset of data. Shard key determines which shard stores a record. Strategies: Range-based (shard by user_id 1-1M on shard1 - 1M-2M on shard2 - can cause hot spots) + Hash-based (hash(user_id) % num_shards - even distribution but hard to range query) + Directory-based (lookup service maps key to shard - flexible but extra hop). Challenges: cross-shard joins (avoid by denormalization) + rebalancing when adding shards (consistent hashing helps)
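Hash-based shard routing can be sketched in a few lines (NUM_SHARDS and the key format are hypothetical):

```python
import hashlib

NUM_SHARDS = 4  # hypothetical fixed shard count

def shard_for(key: str) -> int:
    # Stable hash -> even distribution across shards;
    # the cost is that range queries must now fan out to every shard
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS
```

Note the rebalancing problem the card mentions: changing NUM_SHARDS remaps almost every key, which is exactly what consistent hashing avoids.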

23
Q

What is database replication and what are the types?

A

Replication copies data to multiple nodes for availability + durability + read scaling. Types: Single-Leader (master-slave): writes go to leader - replicated async to followers - followers serve reads - leader failure requires failover. Multi-Leader: multiple leaders accept writes - conflict resolution needed - good for geo-distributed writes. Leaderless (Dynamo-style): any node accepts writes - quorum reads/writes (W+R>N) - eventual consistency - high availability. Synchronous vs Async replication: sync = stronger consistency but higher write latency
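The quorum rule for leaderless replication (W+R>N) can be sketched as follows; the version-number read resolution is a simplified stand-in for the vector clocks real systems use:

```python
def quorums_overlap(n: int, w: int, r: int) -> bool:
    # If W + R > N, every read quorum intersects every write quorum,
    # so at least one replica read has seen the latest write
    return w + r > n

def quorum_read(replica_records, r):
    # Read r replicas and keep the value with the highest version
    sampled = replica_records[:r]   # in practice: the r fastest replicas
    return max(sampled, key=lambda rec: rec["version"])["value"]
```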

24
Q

What is the difference between Read Replicas and Multi-AZ?

A

Read Replicas: async replication + replicas are readable + scale reads horizontally + can be in different regions + NOT for HA (not automatic failover). Use for: heavy read workloads + analytics + reporting + geographic read distribution. Multi-AZ (AWS): synchronous replication + standby is NOT readable + automatic failover when primary fails + same region. Use for: high availability + disaster recovery. Strategy: use BOTH - Multi-AZ for HA + Read Replicas for read scaling

25

Q

What is database indexing and what are the trade-offs?

A

Index is a data structure (B-tree or Hash) that speeds up read queries. B-tree index: O(log n) lookup + good for range queries + sorted order + most common. Hash index: O(1) lookup + only equality queries + no range support. Trade-offs: indexes speed up reads but slow down writes (index must be updated on insert/update/delete) + indexes consume storage. Composite index: index on multiple columns (order matters: leftmost prefix rule). Covering index: all queried columns in index (no table lookup needed). Index on foreign keys for JOIN performance

26

Q

What is ACID and BASE?

A

ACID (traditional RDBMS): Atomicity (all or nothing - transaction either fully completes or fully rolls back) + Consistency (DB moves from valid state to valid state) + Isolation (concurrent transactions appear sequential) + Durability (committed data survives crashes). BASE (NoSQL): Basically Available (system remains available - may return stale data) + Soft State (state may change over time even without input) + Eventually Consistent (data will converge to consistent state eventually). ACID = financial systems. BASE = social media - analytics - recommendations

27

Q

What is the N+1 query problem?

A

N+1 problem: fetching 1 parent record then N individual queries for each child instead of one JOIN. Example: fetch 100 orders then loop and query each order's items individually = 101 queries instead of 1. Solutions: JOIN (fetch together) + eager loading (ORM: @ManyToOne fetch=EAGER or JPA JOIN FETCH) + batch loading (Hibernate batch size) + DataLoader (GraphQL - batch requests). N+1 is a common performance killer in ORM-based apps. Always check generated SQL in development
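The 101-vs-2 query count can be made concrete (an illustrative sketch with in-memory stand-ins for the tables; the counter plays the role of the SQL log you should be checking):

```python
ORDERS = [{"id": i} for i in range(100)]
ITEMS = {o["id"]: [f"item-{o['id']}"] for o in ORDERS}
query_count = 0

def fetch_orders():
    global query_count
    query_count += 1
    return ORDERS

def fetch_items(order_id):
    global query_count
    query_count += 1            # one query PER order -> the "+N"
    return ITEMS[order_id]

def fetch_items_batch(order_ids):
    global query_count
    query_count += 1            # one IN (...) / JOIN-style query
    return {oid: ITEMS[oid] for oid in order_ids}

# N+1 access pattern: 1 + 100 = 101 queries
for order in fetch_orders():
    fetch_items(order["id"])
n_plus_one_queries = query_count

# Batched access pattern: 2 queries (or 1 with a real JOIN)
query_count = 0
orders = fetch_orders()
fetch_items_batch([o["id"] for o in orders])
batched_queries = query_count
```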
28

Q

What is database connection pooling?

A

Connection pooling maintains a pool of reusable database connections instead of opening a new connection per request. Opening DB connection is expensive (TCP handshake + auth + ~50ms). Pool: min connections (always open) + max connections (limit on DB) + timeout (wait for available connection). Java: HikariCP (fastest + default in Spring Boot) + c3p0 + DBCP. Config: maximumPoolSize (default 10 - set based on DB capacity). Too many connections: DB overwhelmed. Too few: request queuing. PostgreSQL max_connections typically 100-500

29

Q

What is database normalization vs denormalization?

A

Normalization: eliminate redundancy + organize data into related tables + reduces update anomalies + requires JOINs for queries + good for write-heavy OLTP. Normal forms: 1NF (atomic values) + 2NF (no partial dependencies) + 3NF (no transitive dependencies). Denormalization: intentionally introduce redundancy to improve read performance + store pre-joined data + reduces JOINs + may cause update anomalies + requires synchronization. Use in: data warehouses + read-heavy systems + NoSQL documents + caching layer. Trade-off: query speed vs storage + consistency

30

Q

What is an ORM and what are its trade-offs?

A

ORM (Object-Relational Mapping) maps Java/Python objects to database tables. Examples: Hibernate (Java JPA) + SQLAlchemy (Python) + TypeORM (Node). Benefits: no raw SQL (type safety + refactoring) + database portability + rapid development + caching. Drawbacks: N+1 queries (if not careful) + complex queries are awkward + generated SQL can be inefficient + mapping overhead + leaky abstraction (still need to understand SQL). Best practice: use ORM for CRUD + write raw SQL/native queries for complex analytics or performance-critical paths

31

Q

What is a Time-Series database?

A

Time-Series DB is optimized for storing and querying data indexed by time (timestamps). Optimized for: high write throughput + range queries by time + aggregation (avg/min/max over intervals) + automatic data downsampling/retention policies. Examples: InfluxDB + TimescaleDB (PostgreSQL extension) + Prometheus (metrics) + Apache Druid. Use cases: metrics monitoring + IoT sensor data + financial tick data + log analytics + application telemetry. Key features: time-based partitioning + compression + rollup aggregations

32

Q

What is eventual consistency and how do you design around it?

A

Eventual consistency: after all updates stop - all replicas will converge to same value. Reads may temporarily return stale data. Design patterns: read-your-own-writes (route user's reads to the replica they just wrote to) + monotonic reads (user always reads from same replica - no backward time travel) + causally consistent reads (track dependencies). Use version vectors or timestamps to detect conflicts. Accept it: for non-critical data (follower counts + like counts + recommendations). Require strong consistency: for financial balances + inventory

33

Q

What is a Write-Ahead Log (WAL)?

A

WAL is a technique where changes are first written to an append-only log before being applied to the actual database. Benefits: crash recovery (replay WAL on restart to recover committed changes not yet flushed to data files) + durability without fsync on every write + basis for replication (stream WAL to replicas - PostgreSQL logical replication + MySQL binlog + MongoDB oplog). LSM trees (used in Cassandra + RocksDB) use a similar approach: write to in-memory memtable + WAL then flush to disk SSTables

34

Q

MEMORY BOOSTER: Database design

A

SQL=ACID + joins + normalized. NoSQL=BASE + horizontal scale + flexible. Key-Value (Redis) + Document (MongoDB) + Wide-Column (Cassandra-writes) + Graph (Neo4j). Sharding: Range (hot spots) + Hash (even) + Directory (flexible). Replication: Single-Leader (async followers) + Multi-Leader (conflicts) + Leaderless (quorum). Indexing: B-tree (range) + Hash (equality). N+1: use JOIN/eager loading. ACID: Atomicity+Consistency+Isolation+Durability. Connection pool: HikariCP. Denormalize for reads. WAL = crash recovery + replication
35

Q

What are the different caching strategies?

A

Cache-Aside (Lazy Loading): app checks cache -> miss -> load from DB -> store in cache -> return. Cache may be stale until TTL expires. Most common pattern. Write-Through: write to cache AND DB simultaneously -> no stale reads -> higher write latency. Write-Back (Write-Behind): write to cache only -> async write to DB -> risk of data loss. Read-Through: cache sits in front of DB -> cache fetches from DB on miss automatically. Refresh-Ahead: proactively refresh cache before TTL expires for hot keys
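Cache-aside, the most common of these patterns, fits in a few lines (dicts stand in for Redis and the database; TTL_SECONDS is a hypothetical setting):

```python
import time

DATABASE = {"user:1": {"name": "Ada"}}   # stand-in for the real DB
cache = {}                               # stand-in for Redis
TTL_SECONDS = 60

def get_user(key):
    """Cache-aside: check cache -> on miss, load from DB and populate."""
    entry = cache.get(key)
    if entry and entry["expires_at"] > time.time():
        return entry["value"]                          # cache hit
    value = DATABASE[key]                              # miss -> read DB
    cache[key] = {"value": value, "expires_at": time.time() + TTL_SECONDS}
    return value
```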
36

Q

What is a cache eviction policy and which should you use?

A

LRU (Least Recently Used): evict item not accessed longest - good for temporal locality - most common. LFU (Least Frequently Used): evict item accessed least often - better for skewed access patterns. FIFO (First In First Out): evict oldest item regardless of access - simple but often suboptimal. TTL (Time To Live): evict after fixed time - good for data that expires (sessions - tokens - rates). MRU (Most Recently Used): evict most recently used - rare use case (scan patterns). Redis default: noeviction (configure allkeys-lru for general caching). Use LRU for general caching + TTL for time-sensitive data (combine both)
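LRU is simple enough to sketch with an ordered map (a minimal illustration; Redis approximates LRU by sampling rather than keeping exact ordering):

```python
from collections import OrderedDict

class LRUCache:
    """On overflow, evict the least recently used key."""
    def __init__(self, capacity: int):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)         # mark as recently used
        return self._data[key]

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # drop the LRU entry
```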
37

Q

What is Redis and what are its data structures?

A

Redis is an in-memory data structure store used as cache + message broker + session store. Data structures: String (SET/GET + counters + rate limiting) + Hash (user profiles + objects) + List (queues + activity feeds - LPUSH/RPOP) + Set (unique members + social graph + SADD/SMEMBERS) + Sorted Set (leaderboards + priority queues - ZADD/ZRANGE with score) + HyperLogLog (cardinality estimation) + Streams (append-only log - message queue) + Pub/Sub (messaging). Persistence: RDB (snapshots) + AOF (append-only log)

38

Q

What are the common cache problems and solutions?

A

Cache Stampede (Thundering Herd): many requests miss same key simultaneously -> all hit DB. Solution: mutex lock (one request loads - others wait) + probabilistic early expiration + pre-warming cache. Cache Penetration: query for non-existent key -> every request hits DB. Solution: cache null values with short TTL + Bloom filter (check if key exists before querying DB). Cache Avalanche: many keys expire simultaneously -> DB overwhelmed. Solution: randomize TTL values (TTL + random jitter) + circuit breaker + pre-warm cache on restart

39

Q

What is a Bloom Filter and when do you use it?

A

Bloom Filter is a space-efficient probabilistic data structure that answers: is this element in the set? It can have false positives (says yes but element not in set) but NO false negatives (if it says no - element is definitely not in set). Space: uses multiple hash functions + bit array. Use cases: cache penetration prevention (check if key exists before DB query) + URL deduplication (HBase/Cassandra row key existence) + weak password detection + CDN cache routing. Cannot delete elements (use Counting Bloom Filter for deletion)
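A Bloom filter is a bit array plus k hash functions (a toy sketch; production code would use optimized hashes and a packed bit array):

```python
import hashlib

class BloomFilter:
    """False positives possible; false negatives impossible."""
    def __init__(self, size: int = 1024, num_hashes: int = 3):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = [False] * size

    def _positions(self, item: str):
        # Derive k positions by salting one hash function
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item: str):
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item: str) -> bool:
        return all(self.bits[pos] for pos in self._positions(item))
```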
40

Q

What is distributed caching?

A

Distributed cache spans multiple nodes for scale and high availability. Approaches: Client-side sharding (client decides which cache node to use - consistent hashing) + Proxy-based (proxy routes to correct cache node - Twemproxy) + Cluster mode (Redis Cluster - auto-sharding + replication). Redis Cluster: 16384 hash slots distributed across nodes + each node handles a subset + automatic failover. Challenges: cache coherence (keeping all nodes in sync) + network partitions + rebalancing when nodes added/removed. Alternative: Memcached (simpler - multi-threaded - no persistence)

41

Q

What is a write-through vs write-back cache strategy?

A

Write-Through: every write goes to cache AND database synchronously. Pros: cache always consistent with DB + no data loss on cache failure. Cons: write latency = DB latency (defeats purpose for write speed). Write-Back (Write-Behind): write to cache only -> acknowledge to client -> async flush to DB later. Pros: very fast writes (RAM speed). Cons: risk of data loss if cache crashes before flush + complexity. Use Write-Through for: data that must be durable (financial). Use Write-Back for: write-heavy workloads where some data loss is tolerable (game scores - analytics counts)

42

Q

How do you handle cache invalidation?

A

Hardest problem in CS (along with naming). Strategies: TTL expiration (simplest - accept stale data for TTL period) + Event-driven invalidation (when DB changes -> publish event -> cache subscriber deletes/updates key) + Write-through (cache always updated on DB write) + Versioned cache keys (cache key includes version: user:123:v2 - bump version on update - old key naturally expires) + Cache-aside with short TTL (accept brief staleness). Key principle: shorter TTL = fresher data + more DB load. Longer TTL = stale data risk + less DB load

43

Q

MEMORY BOOSTER: Caching

A

Strategies: Cache-Aside (most common - lazy load) + Write-Through (always consistent) + Write-Back (fast writes - risk loss). Eviction: LRU (most common) + LFU (skewed access) + TTL (time-based). Redis structures: String + Hash + List + Set + Sorted Set (leaderboards) + Streams. Cache problems: Stampede (mutex/jitter) + Penetration (null cache + Bloom Filter) + Avalanche (randomize TTL). Bloom Filter: no false negatives + space-efficient. Distributed: Redis Cluster (16384 slots). Cache invalidation: hardest problem - use TTL + event-driven + versioned keys

44

Q

What is a Message Queue and why is it used?

A

Message Queue decouples producers (senders) from consumers (receivers). Producer puts message in queue without knowing who processes it. Consumer reads at its own pace. Benefits: loose coupling + fault tolerance (messages persist if consumer down) + load leveling (absorbs traffic spikes - queue as buffer) + async processing + retry failed messages. Types: Point-to-Point (one message - one consumer - SQS) + Pub-Sub (one message - many consumers - SNS/Kafka topics). Examples: SQS + RabbitMQ + ActiveMQ

45

Q

What is Apache Kafka and its architecture?

A

Kafka is a distributed event streaming platform. Architecture: Topic (logical channel - divided into Partitions) + Partition (ordered immutable append-only log - messages have offset) + Producer (publishes to topic) + Consumer Group (each partition consumed by one member - enables parallel processing) + Broker (Kafka server - stores partitions) + Zookeeper/KRaft (cluster coordination). Messages retained for configurable period (days/weeks) - can be replayed. Throughput: millions of messages/second. Use for: event streaming + log aggregation + CDC + real-time analytics

46

Q

What is the difference between Kafka and RabbitMQ?

A

Kafka: pull-based consumers + append-only persistent log + messages retained after consumption (replay) + ordered within partition + consumer tracks offset + extremely high throughput (millions/sec) + best for: event sourcing + CDC + log aggregation + stream processing + multiple consumers needing same messages. RabbitMQ: push-based + message deleted after ACK + complex routing (exchanges: direct/topic/fanout/headers) + lower per-message latency + best for: task queues + work distribution + RPC patterns + complex routing. Rule: if data needs to be replayed or consumed by multiple groups -> Kafka
47

Q

What is at-most-once - at-least-once and exactly-once delivery?

A

At-most-once: message delivered 0 or 1 times. May be lost. Fire-and-forget. Use for: metrics + telemetry where loss is OK. At-least-once: message delivered 1 or more times. May be duplicated. Consumer must be idempotent. Most common guarantee. Exactly-once: message delivered exactly once. Hardest to implement. Kafka achieves with: idempotent producer + transactional API. SQS FIFO queues offer exactly-once processing. In practice: design consumers to be idempotent (safe to process same message twice) and accept at-least-once delivery
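The practical advice above - an idempotent consumer on top of at-least-once delivery - looks like this (the dedup set stands in for a Redis SET or a DB unique constraint):

```python
processed_ids = set()        # stand-in for durable dedup storage
balance = {"acct-1": 0}

def handle(message: dict):
    """Processing the same message twice is a no-op, so
    at-least-once delivery behaves like exactly-once."""
    if message["id"] in processed_ids:
        return                               # duplicate delivery: skip
    balance[message["account"]] += message["amount"]
    processed_ids.add(message["id"])

msg = {"id": "m-1", "account": "acct-1", "amount": 50}
handle(msg)
handle(msg)   # broker redelivers the same message
```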
48

Q

What is a Dead Letter Queue (DLQ)?

A

Dead Letter Queue stores messages that failed processing after maximum retry attempts. Prevents bad messages from blocking the queue forever. Use cases: malformed messages + business logic errors + downstream service unavailable. Config: max retry count -> on failure -> move to DLQ. Operations team monitors DLQ + investigates + reprocesses or discards. AWS SQS: configure redrive policy (maxReceiveCount + DLQ ARN). Kafka: implement consumer-side DLQ logic + write failed messages to error topic. Essential for production reliability

49

Q

What is event sourcing and how does it relate to message queues?

A

Event Sourcing stores state as a sequence of events. Append-only event log is the source of truth. Current state derived by replaying events. Kafka acts as the event store in many implementations. Benefits: complete audit trail + replay history + multiple read models from same events + temporal queries. Challenges: event schema evolution + eventual consistency + steep learning curve. CQRS + Event Sourcing: write side appends events -> read side builds projections from events. Event Store: specialized DB (EventStoreDB) or Kafka with long retention

50

Q

What is the Outbox pattern for reliable messaging?

A

Problem: save to DB and publish event are two separate operations - one can fail -> inconsistency. Solution: write event to OUTBOX table in SAME DB transaction as business data -> separate relay reads outbox -> publishes to Kafka/SQS -> marks as published. Guarantees: event published if and only if DB write committed. At-least-once delivery (relay may retry). Consumer must be idempotent. Implementation: Debezium CDC reads DB transaction log -> publishes directly to Kafka (no outbox table needed - cleaner)

51

Q

What is the Saga pattern with messaging?

A

Saga manages distributed transactions across services using events/commands. Choreography (event-driven): Order Service publishes OrderCreated event -> Inventory Service consumes + reserves + publishes InventoryReserved -> Payment Service consumes + charges + publishes PaymentProcessed -> Order Service confirms. Compensation: PaymentFailed event -> Inventory Service listens -> releases reservation. Kafka provides: durable event log + consumer groups + replay for recovery. Each step idempotent + has compensating transaction

52

Q

What is backpressure in message queues?

A

Backpressure is when a consumer cannot keep up with producer rate. Queue grows indefinitely -> memory/disk exhaustion. Handling strategies: block producer (reactive streams backpressure - producer waits) + drop messages (circuit breaker - lose data) + scale consumers horizontally (add more consumer instances) + use priority queues (process important messages first) + batch processing (process in larger chunks). Kafka: consumer lag metric shows how far behind consumers are. Alert when lag grows beyond threshold

53

Q

MEMORY BOOSTER: Message queues and streaming

A

Queue: decouples + buffers + async. Point-to-point (SQS - one consumer). Pub-Sub (SNS/Kafka - many consumers). Kafka: persistent log + replay + consumer groups (one partition per member) + ordered within partition + millions/sec throughput. RabbitMQ: complex routing + push-based + deleted after ACK. Delivery guarantees: at-most-once (loss OK) + at-least-once (idempotent consumer needed) + exactly-once (hardest). DLQ = failed messages after max retries. Outbox = reliable event publishing with DB transaction. Backpressure: consumer can't keep up -> scale consumers + alert on lag
54
What is horizontal vs vertical scaling?
Vertical scaling (Scale Up): add more resources (CPU/RAM/SSD) to ONE server. Simple - no code changes - but has hardware limits + single point of failure + cost grows exponentially. Horizontal scaling (Scale Out): add MORE servers to a pool behind a load balancer. No theoretical limit + commodity hardware + fault tolerant + requires: stateless application (no local state) + distributed data + load balancer. Cloud makes horizontal scaling easy. Microservices and stateless design enable horizontal scaling. Almost all large-scale systems scale horizontally
55
What is stateless vs stateful architecture?
Stateless: server holds NO client session data between requests. Each request is self-contained. Benefits: any server can handle any request -> easy horizontal scaling -> no sticky sessions needed. How: store state externally (JWT tokens in request + session in Redis + DB). Stateful: server remembers client state between requests. Problem: request must go to same server (sticky sessions) -> hard to scale -> server failure loses state. Design for statelessness: extract state to shared storage (Redis) + pass state in request (JWT) + store state in client (browser cookies)
56
What is consistent hashing and why is it important?
Consistent hashing distributes keys across nodes so adding/removing a node only remaps ~1/n of keys (naive hash % n remaps nearly all). Algorithm: hash ring (0 to 2^32). Nodes placed at positions on ring. Key mapped to nearest clockwise node. Add node: only keys between new and predecessor reassigned. Remove node: only that node's keys go to successor. Virtual nodes (vnodes): each physical node has many virtual positions -> better load distribution + easier rebalancing. Used in: Redis Cluster + Cassandra + distributed caches + CDN routing
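A minimal Python sketch of the hash ring with virtual nodes (class and vnode count are illustrative, not from any particular library):

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Hash ring: keys map to the nearest clockwise node position."""
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self.ring = []  # sorted list of (hash, node) pairs
        for node in nodes:
            self.add_node(node)

    def _hash(self, key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node: str):
        # Each physical node gets `vnodes` positions for even distribution
        for i in range(self.vnodes):
            bisect.insort(self.ring, (self._hash(f"{node}#{i}"), node))

    def remove_node(self, node: str):
        self.ring = [(h, n) for h, n in self.ring if n != node]

    def get_node(self, key: str) -> str:
        h = self._hash(key)
        # First vnode with hash >= key hash; wrap around the ring
        idx = bisect.bisect(self.ring, (h, "")) % len(self.ring)
        return self.ring[idx][1]
```

Adding a fourth node to a three-node ring should remap only a fraction of keys (roughly 1/4), not nearly all of them as `hash % n` would.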
57
What is database connection pooling and why does it matter at scale?
Opening a new DB connection is expensive (TCP handshake + auth + ~25-50ms). At 10K req/s: opening new connection per request = system overwhelmed. Connection pool maintains N reusable connections. Request: borrow connection from pool -> use -> return. Config: minIdle + maxPoolSize + connectionTimeout + idleTimeout. HikariCP (Spring Boot default): max pool size typically 10-20 per app server instance. At scale: PgBouncer (PostgreSQL connection pooler) sits between app and DB - handles thousands of app connections with fewer actual DB connections
58
What are the patterns for handling high write throughput?
Write buffering: batch writes (collect and insert many rows at once - 10x+ throughput). Async writes: write to queue first -> process async -> acknowledge immediately. LSM trees: Cassandra/RocksDB write to in-memory memtable -> flush to SSTables -> no random I/O. CQRS: separate write-optimized store from read-optimized store. Sharding: distribute writes across multiple DB instances. Event sourcing: append-only log (sequential writes = fastest). Database tuning: disable fsync (risky) + batch commits + bulk inserts
59
What is the CQRS pattern in System Design?
CQRS (Command Query Responsibility Segregation) separates the write model (Commands: create/update/delete) from the read model (Queries: fetch). Write side: normalized - ACID - optimized for integrity. Read side: denormalized - eventually consistent - optimized for query patterns (pre-joined views). Sync: events from write side update read side asynchronously. Benefits: scale reads and writes independently + optimize each for their use case + different storage per side. Use for: complex domains with different read/write patterns + high read:write ratio systems
60
What is data partitioning and the different strategies?
Partitioning splits data across multiple nodes/tables. Horizontal partitioning (Sharding): rows distributed across nodes (user_id 1-1M on node1). Vertical partitioning: different columns on different nodes (user profile on node1 - user activity on node2). Functional partitioning: different services/domains on different nodes (payments DB vs catalog DB). Partition key selection: choose high cardinality key + even distribution + aligns with access patterns. Hotspot problem: celebrity user with millions of followers causes hot partition -> add random suffix to key + route to multiple partitions
61
What is cell-based architecture?
Cell-based architecture divides the system into isolated cells (shards of the entire stack - not just DB). Each cell serves a subset of users and is completely independent: has its own servers + cache + database. One cell failure affects only its users (typically 1-10% of total). Used by: Amazon + Slack + Stripe. Benefits: blast radius containment + independent scaling + easier testing + geographic deployment. Router maps users to cells. Cells are identical deployments. Contrasts with: global architecture where all users share same infrastructure
62
What is rate limiting and the algorithms?
Rate Limiting controls request frequency per user/API key/IP. Algorithms: Fixed Window Counter (simple: count per time window - boundary burst problem) + Sliding Window Log (accurate: log timestamps - memory intensive) + Sliding Window Counter (combines fixed + sliding - good balance) + Token Bucket (bucket fills at constant rate - allows burst up to bucket size - most natural) + Leaky Bucket (smooths burst into constant output rate - queue-based). Implementation: Redis with INCR + EXPIRE (atomic with Lua scripts). Response: 429 Too Many Requests + Retry-After header
63
What is the Saga pattern for distributed transactions?
Problem: microservices need to coordinate without 2PC (two-phase commit - too fragile). Saga = sequence of local transactions with compensating transactions for rollback. Choreography: services communicate via events (no central coordinator - loose coupling - harder to track). Orchestration: central Saga orchestrator directs services (visible flow - tighter coupling). Key requirement: each step must be idempotent + have a compensating transaction. Example: Order -> reserve inventory -> charge payment -> confirm order. Failure: refund -> release inventory -> cancel order
64
MEMORY BOOSTER: Scalability patterns
Vertical=bigger machine (limit+SPOF). Horizontal=more machines (stateless required). Stateless: store state in Redis/JWT - not on server. Consistent hashing: add/remove node remaps only 1/n keys (virtual nodes for balance). Write throughput: buffering + async + LSM trees + sharding. CQRS: separate write (normalized) from read (denormalized). Partitioning: horizontal (rows) + vertical (columns) + functional (domains). Hot partition: add random suffix to key. Cell-based: isolated stacks per user group (blast radius containment). Rate limit: Token Bucket (Redis) -> 429 response
65
What are the availability numbers you need to know?
99% availability = 3.65 days downtime/year (unacceptable for production). 99.9% (three nines) = 8.76 hours/year. 99.95% = 4.38 hours/year. 99.99% (four nines) = 52.6 minutes/year (good target). 99.999% (five nines) = 5.26 minutes/year (telecom grade - very expensive). Availability of N components in series: multiply (0.99 * 0.99 = 0.9801). Availability of redundant components in parallel: 1 - (1-A)^n. Adding a 99% component in parallel: 1-(0.01)^2 = 99.99%. Redundancy dramatically improves availability
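The series and parallel formulas above as two one-liners:

```python
def series(*avails):
    """Components in series: all must be up, so multiply."""
    p = 1.0
    for a in avails:
        p *= a
    return p

def parallel(a: float, n: int) -> float:
    """n redundant copies: system is up unless ALL fail."""
    return 1 - (1 - a) ** n
```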
66
What is High Availability (HA) and how do you achieve it?
HA minimizes downtime by eliminating Single Points of Failure (SPOF). Techniques: Redundancy (multiple instances of each component) + Failover (automatic switch to backup on failure) + Load Balancing (distribute traffic + detect failures) + Health Checks (detect unhealthy instances) + Multi-AZ deployment (different data centers) + Active-Active (all nodes serve traffic - best throughput) + Active-Passive (standby takes over on failure - simpler - some downtime). Chaos Engineering: deliberately inject failures to test resilience
67
What is fault tolerance and how is it different from HA?
High Availability: system remains operational despite failures - may have brief degraded performance - failover takes seconds to minutes. Fault Tolerance: system continues operating PERFECTLY without any degradation even when components fail - all redundancy is active - no failover delay. FT is more expensive (requires full hot standby). HA is more practical for most systems. Use case: airplane autopilot = fault tolerant (triple redundancy). Web service = high availability (failover in seconds). Design for HA - use FT only for safety-critical systems
68
What is a circuit breaker and how does it work in distributed systems?
Circuit Breaker prevents cascading failures. States: Closed (normal - all requests pass through - count failures) -> Open (failure threshold exceeded - all requests fail fast with error - no actual calls made - allows downstream to recover) -> Half-Open (after timeout - allow small number of test requests - if success: back to Closed - if failure: back to Open). Configuration: failure threshold (50%) + wait duration (60s) + half-open test count. Libraries: Resilience4j + Netflix Hystrix (deprecated). Essential for every service calling external dependencies
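The three-state machine can be sketched like this (a toy version of what Resilience4j does; thresholds and the injected clock are illustrative):

```python
import time

class CircuitBreaker:
    """Closed -> Open (fail fast) -> Half-Open (probe) state machine."""
    def __init__(self, failure_threshold=3, reset_timeout=60.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.state = "CLOSED"
        self.failures = 0
        self.opened_at = 0.0

    def call(self, fn):
        if self.state == "OPEN":
            if self.clock() - self.opened_at >= self.reset_timeout:
                self.state = "HALF_OPEN"   # allow one probe request
            else:
                raise RuntimeError("circuit open: failing fast")
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.state == "HALF_OPEN" or self.failures >= self.failure_threshold:
                self.state = "OPEN"        # stop calling the dependency
                self.opened_at = self.clock()
            raise
        self.failures = 0                  # success resets the count
        self.state = "CLOSED"
        return result
```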
69
What is a health check and why is it critical?
Health check is an endpoint (/health) that reports service status. Types: Liveness (is process alive? K8s restarts if failing) + Readiness (is service ready for traffic? K8s stops routing if failing) + Dependency health (check DB + cache + downstream services). When a readiness check fails: remove instance from load balancer + stop sending new requests + allow in-flight requests to complete + investigate. K8s liveness probe kills + restarts container on failure. AWS ELB deregisters unhealthy instances. Without health checks: traffic continues to dead servers
70
What is a Disaster Recovery plan?
DR plan defines how to restore systems after catastrophic failure. Key metrics: RTO (Recovery Time Objective: max acceptable downtime - how long to restore) + RPO (Recovery Point Objective: max acceptable data loss - how far back is oldest acceptable backup). DR strategies by cost/RTO: Backup & Restore (cheapest - RTO hours - restore from S3 backup) + Pilot Light (minimal always-on infra - scale up on disaster - RTO minutes) + Warm Standby (scaled-down but running - promote quickly - RTO minutes) + Multi-Site Active/Active (full capacity in 2+ regions - RTO seconds - most expensive)
71
What is the difference between RTO and RPO?
RTO (Recovery Time Objective): how long can the system be down? Target time from failure detection to full restoration. Short RTO: need hot standby + automated failover. Long RTO: can restore from backup manually. RPO (Recovery Point Objective): how much data loss is acceptable? Time window of data you can afford to lose. RPO=0: synchronous replication (no loss) + higher latency. RPO=1 hour: hourly backups or async replication. Relationship: lower RTO + lower RPO = higher cost. Financial systems: RTO<1min + RPO=0. Analytics: RTO=4hr + RPO=24hr
72
What is graceful degradation and fallback strategies?
Graceful degradation: when a component fails the system continues with reduced functionality instead of complete failure. Examples: recommendation service down -> show popular items instead of personalized. Payment service slow -> queue payment + confirm immediately + process async. Search service down -> show recent items. Image service down -> show placeholder. Implementation: circuit breaker + fallback response + cached data + feature flags (disable failing feature). Design for: what is the minimum viable experience when each dependency fails?
73
What is bulkhead pattern in distributed systems?
Bulkhead isolates system components so failure in one doesn't cascade. Named after ship bulkheads. Implementation: separate thread pools per downstream dependency (if Service B is slow: only B's thread pool fills up - Service C still works). Separate connection pools per service. Resource quotas per tenant. K8s: resource limits (CPU/memory) per container. AWS: separate Auto Scaling Groups per service. Benefit: one slow dependency cannot starve others. Combined with: circuit breaker (stop calling slow service) + timeout (define max wait)
74
What is idempotency and why is it critical for distributed systems?
Idempotent operation produces same result regardless of how many times executed. Critical because: network failures cause retries + message queues deliver at-least-once + clients retry on timeout without knowing if server processed. Implementation: idempotency key (unique request ID in header) -> server stores processed request IDs -> if same ID arrives again -> return cached result without re-processing. Database: use UPSERT instead of INSERT. HTTP: GET/PUT/DELETE are idempotent. POST is NOT. Payment APIs must be idempotent (no double charges)
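An idempotency-key handler in miniature (the dict stands in for a persistent store such as Redis or a DB table; function and field names are illustrative):

```python
processed: dict[str, dict] = {}  # stand-in for a persistent idempotency store

def handle_payment(idempotency_key: str, amount: int) -> dict:
    """Process at most once per key: a retry with the same key
    returns the cached result instead of charging again."""
    if idempotency_key in processed:
        return processed[idempotency_key]          # replay -> no double charge
    result = {"charged": amount, "status": "ok"}   # real side effect here
    processed[idempotency_key] = result
    return result
```

In a real payment API the store lookup and write would need to be atomic (e.g. a unique-constraint INSERT or Redis SETNX) to survive concurrent retries.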
75
What is the two-phase commit (2PC) and why is it problematic?
2PC coordinates a distributed transaction across multiple nodes. Phase 1 (Prepare): coordinator asks all participants if they can commit. Phase 2 (Commit/Abort): if all say yes -> coordinator tells all to commit -> if any say no -> abort. Problems: blocking protocol (participants hold locks while waiting) + coordinator SPOF (if coordinator crashes between phases -> participants stuck) + slow (multiple network round trips) + scalability bottleneck. Modern distributed systems avoid 2PC: use Saga pattern instead + accept eventual consistency + design compensating transactions
76
MEMORY BOOSTER: Availability and reliability
Availability: 99.9%=8.7h/yr + 99.99%=52min/yr + 99.999%=5min/yr. Parallel redundancy: 1-(1-A)^n. Circuit Breaker: Closed->Open(fast fail)->Half-Open(test). Health checks: Liveness(restart if dead) + Readiness(remove from LB if not ready). DR: Backup&Restore < Pilot Light < Warm Standby < Multi-Site Active/Active. RTO=how long down. RPO=how much data loss. Graceful degradation: fallback to cached/default. Bulkhead: separate thread pools per dependency. Idempotency: unique request ID -> safe retries. Avoid 2PC: use Saga instead
77
How do you design a URL Shortener (TinyURL)?
Requirements: create short URL + redirect + analytics. Scale: 100M URLs total + 1000 writes/sec + 100K reads/sec (read-heavy). Short code: auto-increment ID -> Base62 encode (a-z A-Z 0-9 = 7 chars = 62^7 = 3.5T URLs). Or MD5 hash first 7 chars (handle collisions with counter). Storage: DynamoDB or Redis (KV: short_code -> long_url). Redirect: 302 (temporary - trackable) vs 301 (permanent - cached by browser). Architecture: LB + stateless app servers + Redis cache (hot URLs) + DynamoDB. Analytics: async via Kafka. Scale: shard by short_code hash
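Base62 encoding of an auto-increment ID is a few lines in Python (alphabet order is a convention choice):

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def base62_encode(n: int) -> str:
    """Turn a numeric DB ID into a short code."""
    if n == 0:
        return ALPHABET[0]
    out = []
    while n:
        n, r = divmod(n, 62)
        out.append(ALPHABET[r])
    return "".join(reversed(out))

def base62_decode(s: str) -> int:
    """Reverse: short code back to the numeric ID."""
    n = 0
    for ch in s:
        n = n * 62 + ALPHABET.index(ch)
    return n
```

Seven characters cover 62^7 ≈ 3.5 trillion IDs, which is why the card quotes a 7-char code.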
78
How do you design a Twitter/News Feed system?
Core: post tweet + follow users + view feed. Scale: 100M DAU + 5K tweet writes/sec + 500K feed reads/sec (100:1 read:write). Feed generation strategies: Pull (read on request: query followed users' tweets + sort + return - slow at read time) + Push (fan-out on write: when tweet posted -> push to all followers' feed caches - fast reads + expensive for celebrities with millions of followers). Hybrid: push for regular users + pull for celebrities. Storage: tweets in DB (Cassandra - time-series writes) + feed cache in Redis (list per user + trim to 1000). Media: S3 + CDN
79
How do you design a Chat System (WhatsApp)?
Requirements: 1:1 messaging + group chat + online presence + message status (sent/delivered/read). Scale: 1B users + 50B messages/day. Real-time: WebSocket connections to chat servers. Architecture: user connects via WebSocket to chat server. Message flow: sender -> chat server -> recipient WebSocket (if online) + store in DB. Offline: push notification + store in DB until retrieved. Storage: Cassandra (message_id - conversation_id - timestamp - content) - wide column = fast range scans. Groups: fan-out message to all members. Presence service: heartbeat + Redis pub/sub. Media: S3 + CDN
80
How do you design Instagram/Image Sharing?
Requirements: upload photos + follow users + news feed + search. Scale: 1B users + 1M photo uploads/day + 10M feed reads/sec. Photo storage: S3 + CDN (serve images from edge). Metadata: PostgreSQL (user_id - photo_id - caption - timestamp - tags). Feed: pre-computed feed in Redis (list of photo IDs per user - fan-out on write for regular users - pull for celebrities). Upload flow: client -> CDN upload URL (pre-signed S3 URL) -> S3 -> trigger async processing Lambda (resize thumbnails + extract metadata) -> update DB + invalidate feed cache. Search: Elasticsearch for caption/hashtag search
81
How do you design YouTube/Video Streaming?
Requirements: upload video + transcode + stream. Scale: 500M users + 500 hours of video uploaded/min. Upload: pre-signed S3 URL (direct from client to S3 - bypass servers). Processing pipeline: S3 upload event -> SQS -> video processing workers (FFmpeg: transcode to multiple resolutions: 360p - 720p - 1080p - 4K) -> store in S3 + update DB (status + CDN URLs). Streaming: CloudFront CDN + HLS/DASH adaptive bitrate streaming (segments + manifest file). Metadata: PostgreSQL. View count: async Kafka -> Redis counter -> periodic flush to DB. Recommendation: ML + graph DB
82
How do you design a Ride-Sharing Service (Uber)?
Requirements: match riders to drivers + real-time location + pricing + routing. Scale: 5M rides/day + 1M concurrent drivers. Driver location: drivers send GPS every 4 seconds -> store in Redis Geo (geohash) + Cassandra (history). Matching: rider requests ride -> query nearby drivers (Redis GEORADIUS within 5km) -> rank by ETA -> offer to best driver. Trip service: PostgreSQL (rider_id - driver_id - pickup - dropoff - status - price). Real-time communication: WebSocket or long polling. ETA/routing: Google Maps API or internal routing engine. Surge pricing: demand/supply ratio per geohash cell
83
How do you design a Search Engine / Typeahead?
Typeahead: show search suggestions as user types. Requirements: low latency (<100ms) + relevant suggestions + handle 10K req/sec. Storage: Trie (prefix tree) or inverted index. Architecture: offline: aggregate query logs -> build frequency-weighted trie -> serialize to storage. Online: user types 'app' -> query trie -> return top-5 suggestions by frequency. Scale: shard trie by first 2 chars of prefix -> route queries to correct shard. Cache: cache top-N suggestions per prefix in Redis. CDN: cache popular prefixes at edge. Update: aggregate query logs daily -> rebuild trie offline -> swap in
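A frequency-weighted trie sketch (naive DFS collection; production systems precompute top-k per node rather than walking the subtree per query):

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.freq = 0  # > 0 marks the end of a logged query

class Typeahead:
    def __init__(self):
        self.root = TrieNode()

    def add(self, query: str, freq: int):
        node = self.root
        for ch in query:
            node = node.children.setdefault(ch, TrieNode())
        node.freq += freq

    def suggest(self, prefix: str, k: int = 5) -> list[str]:
        # Walk down to the prefix node
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return []
            node = node.children[ch]
        # Collect all completions in the subtree, rank by frequency
        results = []
        def dfs(n, path):
            if n.freq:
                results.append((n.freq, prefix + path))
            for ch, child in n.children.items():
                dfs(child, path + ch)
        dfs(node, "")
        return [q for _, q in sorted(results, reverse=True)[:k]]
```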
84
How do you design a Notification System?
Types: push (mobile) + email + SMS + in-app. Scale: 1B notifications/day. Architecture: notification service receives events from other services via Kafka -> workers consume + fan-out to channels. Rate limiting: max notifications per user per day. User preferences: store opted-out channels per user (DB + Redis cache). Priority queues: critical (security alerts) vs non-critical (promotions). Retry logic: failed notifications -> DLQ -> retry with backoff. Providers: Firebase (push) + SendGrid/SES (email) + Twilio (SMS). Deduplication: prevent duplicate notifications for same event
85
How do you design a Rate Limiter service?
Requirements: limit requests per user per time window. Algorithms: Token Bucket (fixed token refill rate - allows burst - most common) + Sliding Window Counter (accurate - Redis sorted set). Architecture: distributed rate limiter using Redis (all app servers share same Redis). Implementation: Lua script in Redis for atomicity: INCR key + EXPIRE key = atomic check-and-increment. Config: rules stored in Redis/DB (user_id: 100 req/min + IP: 1000 req/min + API key: 10000 req/min). Response: 429 + Retry-After: X header. Edge rate limiting: at CDN/API Gateway (faster - before hitting servers)
86
How do you design a Distributed Cache (like Memcached)?
Requirements: O(1) get/put + high throughput + distributed + fault tolerant. Architecture: cache cluster (N nodes) + clients use consistent hashing to route keys to nodes. Node: hash map + LRU eviction + fixed memory limit. Consistent hashing: each node at multiple positions on hash ring (virtual nodes) -> even distribution. Replication: write to 2 nodes for fault tolerance. Consistency: cache-aside (application manages) + TTL-based expiration. Failure: if node fails -> consistent hashing routes affected keys to next node -> cache miss -> load from DB -> warm up. Add node: only 1/N keys remapped
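A single cache node's hash map + LRU eviction can be sketched with an ordered dict (a simplification of what Memcached does with a doubly linked list):

```python
from collections import OrderedDict

class LRUCache:
    """O(1) get/put with least-recently-used eviction at capacity."""
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None                    # miss -> caller loads from DB
        self.data.move_to_end(key)         # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict least recently used
```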
87
How do you design a Distributed Lock Service?
Distributed locks coordinate access across multiple servers. Use cases: prevent double payment processing + leader election + resource reservation. Redis-based (Redlock algorithm): SET key value NX EX 30 (atomic set if not exists + 30s expiry). Acquire: SET returns OK -> lock acquired. Release: check value matches + DEL (Lua script for atomicity). Acquire with retry: retry with jitter until lock acquired or timeout. Expiry prevents deadlock if holder crashes. Redlock: acquire lock on N/2+1 Redis nodes for stronger guarantee. Alternatives: Zookeeper ephemeral nodes + etcd leases
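The SET NX EX + check-and-DEL flow in miniature — here a dict plus injected clock stands in for Redis, so the atomicity Redis provides is only simulated:

```python
import time
import uuid

class LockStore:
    """In-memory stand-in for Redis: SET key value NX EX <ttl>
    to acquire, owner-checked DEL to release."""
    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.data = {}  # key -> (owner_token, expires_at)

    def acquire(self, key: str, ttl: float):
        entry = self.data.get(key)
        if entry and entry[1] > self.clock():
            return None                    # held and not yet expired (NX fails)
        token = str(uuid.uuid4())          # unique value identifies the owner
        self.data[key] = (token, self.clock() + ttl)
        return token

    def release(self, key: str, token: str) -> bool:
        entry = self.data.get(key)
        if entry and entry[0] == token:    # only the owner may delete
            del self.data[key]
            return True
        return False
```

The expiry is what prevents deadlock when a holder crashes; the owner-token check is what prevents one client from releasing another's lock.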
88
How do you design a Key-Value Store (like DynamoDB)?
Requirements: get(key) + put(key-value) + delete(key) + high availability + horizontal scaling. Architecture: consistent hashing ring -> each key maps to node + replicas. Write path: client -> coordinator node -> replicate to N nodes -> acknowledge after W confirm. Read path: client -> coordinator -> read from R nodes -> return latest version (vector clock). W+R>N = quorum for strong consistency. Typically: N=3 W=2 R=2. Storage: LSM tree (MemTable -> WAL -> SSTables + Bloom filter for key existence). Partitioning: virtual nodes. Anti-entropy: gossip protocol to sync inconsistent replicas
89
How do you design a Web Crawler?
Requirements: crawl the web + extract URLs + store pages. Scale: crawl 1B pages. Architecture: URL Frontier (priority queue of URLs to crawl) + Fetcher workers (download HTML) + DNS resolver cache + Parser (extract URLs + text) + Deduplication (Bloom filter + URL hash set) + Storage (S3 for HTML + Elasticsearch for indexing). Crawl politeness: respect robots.txt + rate limit per domain + delay between requests. BFS order preferred (broad coverage). Priority: rank by PageRank + freshness. Distributed: URL frontier sharded by domain hash -> each shard has dedicated crawler workers
90
How do you design Google Drive / Dropbox (File Storage)?
Requirements: upload/download files + sync across devices + share + version history. Scale: 500M users + 50M DAU + 10M file uploads/day. Upload: client splits file into chunks (4-8MB) -> check if chunk already stored (deduplication by SHA256 hash) -> upload new chunks to S3 via pre-signed URLs -> metadata server records file + chunk list. Download: fetch chunk list + download chunks from CDN -> reassemble. Sync: delta sync (upload only changed chunks). Metadata: PostgreSQL (file_id - user_id - chunk_ids - version - shared_with). Conflict resolution: last-write-wins or version history
91
MEMORY BOOSTER: System design case studies
URL Shortener: Base62(autoincrement ID) + KV store + Redis cache + 302 redirect. Twitter feed: hybrid push (regular users) + pull (celebrities) + Redis feed list + Cassandra tweets. Chat: WebSocket + Cassandra + presence service. Instagram: pre-signed S3 upload + Lambda resizing + CDN + Redis feed. YouTube: S3 upload + FFmpeg workers + HLS/DASH streaming + CloudFront. Uber: Redis Geo for driver locations + GEORADIUS for matching + Cassandra location history. Typeahead: Trie + frequency weight + cache in Redis. Notification: Kafka fan-out + priority queues + DLQ + rate limit
92
What is the OSI model and which layers matter for system design?
OSI 7 layers: 1-Physical + 2-Data Link + 3-Network (IP - routing) + 4-Transport (TCP/UDP - ports - reliability) + 5-Session + 6-Presentation (encoding/encryption) + 7-Application (HTTP - DNS - FTP - SMTP). For system design focus on: Layer 3 (IP - routing - VPC - subnets) + Layer 4 (TCP vs UDP - ports - firewalls - security groups - load balancers) + Layer 7 (HTTP/HTTPS - REST - gRPC - WebSocket - application-level load balancing). L4 load balancer routes by IP/port. L7 load balancer routes by URL/headers/content
93
What is the difference between REST - GraphQL and gRPC?
REST: resource-based URLs + HTTP verbs + JSON + stateless + cacheable + widely supported + human readable + versioning via URL/header. Best for: public APIs + browser clients + simple CRUD. GraphQL: single endpoint + client specifies exact data needed + reduces over/under-fetching + strongly typed schema + real-time via subscriptions. Best for: BFF (flexible client queries) + mobile (bandwidth sensitive) + complex relationships. gRPC: binary Protocol Buffers + HTTP/2 + strongly typed + auto-generated clients + streaming + fastest. Best for: internal service-to-service + polyglot environments
94
What is SSL/TLS and how does HTTPS work?
TLS (Transport Layer Security) encrypts data in transit between client and server. HTTPS = HTTP over TLS. TLS handshake: 1) Client Hello (supported cipher suites + TLS version). 2) Server Hello (chosen cipher + certificate). 3) Client verifies certificate (signed by trusted CA). 4) Key exchange (asymmetric crypto to establish symmetric session key). 5) Encrypted communication begins (symmetric AES). TLS termination: SSL certificate at load balancer or API Gateway (decrypt there + communicate unencrypted internally or re-encrypt). Certificate management: AWS ACM (free + auto-renew)
95
What is service discovery and how does it work?
Service Discovery allows services to find each other without hardcoded IPs (instances are dynamic). Types: Client-side (service queries registry + picks instance + calls directly - Netflix Eureka - service caches registry). Server-side (service calls LB/proxy + proxy queries registry + routes - Kubernetes DNS - simpler for service). Push vs Pull: Consul uses health checks + gossip. Kubernetes: each Service gets DNS name (service-name.namespace.svc.cluster.local) + kube-proxy routes to healthy pods. Self-registration: service registers on startup + deregisters on shutdown (or health check fails)
96
What is mutual TLS (mTLS) and when is it used?
mTLS: both client AND server authenticate each other with certificates (regular TLS only server authenticates). How: both have certificate + private key -> during TLS handshake both present + verify certificates. Use for: service-to-service communication in zero-trust networks + securing internal microservice communication + API client authentication (instead of API keys). Managed by service mesh (Istio auto-injects mTLS between all services). Benefits: strong authentication + encryption + no API key management. Certificate rotation: automated via cert-manager
97
What is an IP address - subnet and CIDR?
IP address: unique identifier for network device (IPv4: 32-bit e.g. 192.168.1.1 + IPv6: 128-bit). Subnet: logical subdivision of network. CIDR (Classless Inter-Domain Routing): notation for IP ranges: 10.0.0.0/24 = 10.0.0.0 to 10.0.0.255 (256 addresses - /24 means 24 bits fixed). /16 = 65536 addresses. /32 = one address. VPC CIDR: 10.0.0.0/16. Public subnet: 10.0.1.0/24 (has internet gateway route). Private subnet: 10.0.2.0/24 (no direct internet - use NAT gateway for outbound). Security groups: stateful firewall at instance level. NACL: stateless at subnet level
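The CIDR arithmetic above can be checked with Python's stdlib `ipaddress` module (the VPC/subnet values mirror the card's examples):

```python
import ipaddress

vpc = ipaddress.ip_network("10.0.0.0/16")      # /16 -> 65536 addresses
public = ipaddress.ip_network("10.0.1.0/24")   # /24 -> 256 addresses

print(vpc.num_addresses)                        # 65536
print(public.num_addresses)                     # 256
print(public.subnet_of(vpc))                    # True
print(ipaddress.ip_address("10.0.1.77") in public)  # True
```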
98
What is OAuth2 and how does it work in system design?
OAuth2 is authorization framework for delegated access. Flows: Authorization Code (web apps: user logs in with Google -> get auth code -> exchange for tokens) + Client Credentials (service-to-service: no user involved - service uses client_id + client_secret to get token) + PKCE (mobile/SPA: prevents auth code interception). Tokens: Access Token (short-lived: 1hr JWT - sent in Authorization: Bearer header) + Refresh Token (long-lived - exchange for new access token). Store tokens: access token in memory + refresh token in httpOnly cookie (XSS protection). OIDC adds identity (ID token)
99
MEMORY BOOSTER: Networking and protocols
OSI: L3=IP(routing) + L4=TCP/UDP(ports+reliability) + L7=HTTP/gRPC/WebSocket. REST=JSON+HTTP(public APIs). GraphQL=flexible queries+single endpoint(BFF). gRPC=binary+HTTP/2+fastest(internal). TLS handshake: cert verify -> key exchange -> symmetric encryption. mTLS: both sides authenticate (service mesh). OAuth2: Authorization Code(users) + Client Credentials(services). CIDR: /24=256IPs + /16=65536IPs. Service discovery: client-side(Eureka) vs server-side(K8s DNS). JWT: stateless token(header.payload.signature)
100
What are the three pillars of observability?
Metrics: quantitative measurements over time (request rate + error rate + latency P50/P95/P99 + CPU + memory + DB connections). Collected by Prometheus. Visualized in Grafana. Alert when threshold exceeded. Logs: discrete events with context (timestamp + level + service + trace-id + message + structured JSON). Collected by Fluentd/Filebeat -> Elasticsearch -> Kibana (ELK). Traces: track single request across multiple services (trace-id + span-id per service hop). Tools: Jaeger + Zipkin + AWS X-Ray. All three needed together for complete observability
101
What are RED and USE metrics?
RED method (for services): Rate (requests per second) + Errors (error rate %) + Duration (latency distribution - P50/P95/P99). Apply to every API endpoint. USE method (for resources): Utilization (% time resource is busy - CPU 80% utilized) + Saturation (amount of work queued - disk I/O queue depth) + Errors (error events - disk read errors). RED = user-facing service health. USE = infrastructure resource health. Together: RED shows user impact + USE shows root cause. Alert on: P99 latency > SLO + error rate > threshold + saturation > 80%
102
What is distributed tracing?
Distributed tracing tracks a request as it flows across multiple services. Every request gets a unique Trace ID at entry point (API Gateway). Each service call creates a Span (child of parent span). Spans include: service name + operation + duration + status + tags. Propagated via HTTP headers (W3C traceparent standard or X-B3-TraceId). Visualization: Jaeger/Zipkin UI shows waterfall diagram - identify which service caused latency. Implementation: auto-instrumented by OpenTelemetry SDK. Without tracing: debugging distributed systems is like finding a needle in a haystack
103
What is log aggregation and the ELK stack?
Log Aggregation collects logs from all services into central searchable store. ELK Stack: Elasticsearch (distributed search + store) + Logstash (collect + parse + transform) + Kibana (visualize + dashboards + alerts). EFK: replace Logstash with Fluentd (lighter). Log shipping: containers write to stdout -> Fluentd DaemonSet collects -> ships to Elasticsearch. Structured logging: JSON format (not plain text) for easy parsing. Each log entry must include: service name + instance ID + trace-id + correlation-id + timestamp + level + message. Retention: 30-90 days
104
What is an SLI - SLO and Error Budget?
SLI (Service Level Indicator): specific metric measuring service behavior (availability % + P99 latency + error rate). SLO (Service Level Objective): target value for SLI agreed internally (availability > 99.9% + P99 < 200ms). Error Budget: 100% - SLO = allowed failure budget (99.9% SLO = 0.1% error budget = 8.76h downtime/year). If error budget exhausted: freeze feature deployments + focus on reliability. If error budget healthy: can accept more risk. Used in SRE (Site Reliability Engineering) to balance reliability vs velocity
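The error-budget arithmetic as a one-liner (99.9% SLO over a year leaves 8.76 hours, i.e. 525.6 minutes):

```python
def error_budget_minutes(slo: float, days: int = 365) -> float:
    """Allowed downtime per period for a given availability SLO."""
    return (1 - slo) * days * 24 * 60
```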
105
What are alerting best practices in System Design?
Alert on symptoms not causes (alert on high error rate - not CPU usage - CPU is a cause). Alert should be: actionable (someone must do something) + urgent (requires immediate response) + rare (alert fatigue kills response quality). Alert types: P1 (page on-call - immediate response) + P2 (team notification - fix in hours) + P3 (ticket - fix in days). Avoid: alerting on every possible metric (noise). Use: multi-window multi-burn rate alerts (fast burn + slow burn for SLO). Tool: PagerDuty + OpsGenie for on-call routing + escalation
106
What is Chaos Engineering?
Chaos Engineering deliberately injects failures to test system resilience. Principles: 1) Define steady state (normal metrics). 2) Hypothesize failure won't affect it. 3) Inject failure (kill server - add latency - disconnect DB). 4) Observe. 5) Fix weaknesses found. Tools: Chaos Monkey (Netflix - random instance termination) + Chaos Mesh (K8s - pod/network/disk chaos) + AWS FIS (Fault Injection Service) + Gremlin. Game Days: team exercises with planned chaos. Start with staging environment. Mature teams run in production (with safeguards). Builds confidence in resilience
107
MEMORY BOOSTER: Monitoring and observability
3 pillars: Metrics (Prometheus + Grafana) + Logs (ELK/EFK - structured JSON) + Traces (Jaeger/Zipkin - trace-id spans services). RED: Rate + Errors + Duration (per service). USE: Utilization + Saturation + Errors (per resource). SLI=metric. SLO=target. Error Budget=100%-SLO (freeze deploys when exhausted). Alerting: symptom-based + actionable + rare. P1=page oncall + P2=team notify + P3=ticket. Distributed tracing: W3C traceparent header. Chaos Engineering: inject failure -> observe -> fix. Correlation ID in every log entry
108
What are the security components to consider in System Design?
Authentication (who are you: OAuth2 + JWT + MFA) + Authorization (what can you do: RBAC + ABAC + IAM policies) + Encryption in transit (TLS/HTTPS everywhere) + Encryption at rest (DB encryption + S3 SSE + KMS) + Rate limiting (prevent abuse + DDoS) + Input validation (prevent injection + XSS + CSRF) + Secrets management (Vault + AWS Secrets Manager - never in code) + Audit logging (who did what when) + Network segmentation (VPC + private subnets + security groups) + Vulnerability scanning + Penetration testing
109
What is JWT and what are its security considerations?
JWT (JSON Web Token): header.payload.signature. Payload: user_id + roles + expiry (exp) + issued at (iat). Signed with: HMAC-SHA256 (symmetric - shared secret) or RS256 (asymmetric - private key signs + public key verifies). Security: short expiry (15min access token) + refresh tokens for renewal + invalidation: maintain token blacklist in Redis or use short expiry + token binding. Never store sensitive data in payload (base64 decoded by anyone). Use RS256 for distributed verification (services verify without calling auth server). Store: access token in memory + refresh in httpOnly cookie
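A minimal HS256 sign/verify sketch using only the stdlib, to show the header.payload.signature structure; production code should use a vetted library (e.g. PyJWT) rather than hand-rolled crypto:

```python
import base64, hashlib, hmac, json, time

def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign_jwt(payload: dict, secret: bytes) -> str:
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = b64url(json.dumps(payload).encode())
    sig = hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest()
    return f"{header}.{body}.{b64url(sig)}"

def verify_jwt(token: str, secret: bytes) -> dict:
    header, body, sig = token.split(".")
    expected = hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(b64url(expected), sig):
        raise ValueError("bad signature")
    payload = json.loads(base64.urlsafe_b64decode(body + "=" * (-len(body) % 4)))
    if payload["exp"] < time.time():
        raise ValueError("expired")
    return payload

secret = b"shared-secret"
token = sign_jwt({"user_id": 1, "exp": time.time() + 900}, secret)  # 15-min expiry
assert verify_jwt(token, secret)["user_id"] == 1
```

Note the payload is only base64-encoded, not encrypted - anyone can decode it, which is why no sensitive data belongs there.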
110
What is RBAC and ABAC?
RBAC (Role-Based Access Control): permissions assigned to roles + users assigned to roles. Example: Admin role -> all permissions. Editor role -> read+write. Viewer -> read only. Simple + easy to manage + good for most systems. ABAC (Attribute-Based Access Control): permissions based on attributes of user + resource + environment. Example: user.department==Finance AND resource.sensitivity==Low AND time.isBusinessHours -> allow. More flexible + fine-grained + complex to manage. Use RBAC for: most web apps. ABAC for: complex enterprise authorization + data access policies
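A toy sketch of both checks (role/user names and attribute rules are illustrative):

```python
# RBAC: permissions hang off roles; users hold roles
ROLE_PERMS = {
    "admin":  {"read", "write", "delete"},
    "editor": {"read", "write"},
    "viewer": {"read"},
}
USER_ROLES = {"alice": {"admin"}, "bob": {"viewer"}}

def rbac_allowed(user: str, permission: str) -> bool:
    # Union of permissions across all of the user's roles
    return any(permission in ROLE_PERMS[r] for r in USER_ROLES.get(user, ()))

# ABAC: the decision depends on attributes of user + resource + environment
def abac_allowed(user_attrs: dict, resource_attrs: dict, env: dict) -> bool:
    return (user_attrs["department"] == "Finance"
            and resource_attrs["sensitivity"] == "Low"
            and env["is_business_hours"])

assert rbac_allowed("alice", "delete") and not rbac_allowed("bob", "write")
```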
111
What is OWASP Top 10 and why does it matter?
OWASP Top 10 most critical web security risks: Broken Access Control + Cryptographic Failures + Injection (SQL injection - NoSQL - LDAP) + Insecure Design + Security Misconfiguration + Vulnerable Components + Authentication Failures + Integrity Failures + Logging/Monitoring Failures + SSRF (Server-Side Request Forgery). SQL Injection prevention: parameterized queries (never string concatenation). XSS prevention: output encoding + CSP headers. CSRF prevention: SameSite cookies + CSRF tokens. Design with security in mind from start not as afterthought
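The SQL-injection point demonstrated with sqlite3 (same placeholder idea applies to any driver):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (name TEXT, role TEXT)")
db.execute("INSERT INTO users VALUES ('alice', 'admin')")

malicious = "alice' OR '1'='1"

# UNSAFE: string concatenation lets the input rewrite the query
unsafe = db.execute(
    f"SELECT * FROM users WHERE name = '{malicious}'").fetchall()

# SAFE: the placeholder binds the input as data, never as SQL
safe = db.execute(
    "SELECT * FROM users WHERE name = ?", (malicious,)).fetchall()

assert len(unsafe) == 1 and len(safe) == 0   # injection matched every row; parameter matched none
```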
112
What is a VPC and network segmentation?
VPC (Virtual Private Cloud): logically isolated network. Segmentation: Public subnets (load balancers - API gateways - have internet gateway route) + Private subnets (app servers - databases - no direct internet). Security Groups: stateful firewall (allow rules only - return traffic auto-allowed). NACL: stateless subnet firewall (explicit allow + deny + numbered rules). NAT Gateway: private subnet outbound internet (for updates) without inbound. VPC Peering: connect two VPCs. Private Link: access AWS services without internet. Bastion Host: jump server for SSH to private instances
113
What is DDoS protection in System Design?
DDoS (Distributed Denial of Service) overwhelms system with traffic. Layers: L3/L4 (volumetric: UDP flood - ICMP flood - mitigated by AWS Shield Standard - free) + L7 (application: HTTP flood - mitigated by WAF + rate limiting). Protection strategies: CDN absorbs volumetric attacks (CloudFront + Cloudflare) + Rate limiting at edge (API Gateway + CDN) + IP reputation blocking + CAPTCHA for suspicious traffic + Auto Scaling (absorb traffic) + AWS Shield Advanced (24/7 DDoS response team) + WAF rules (block patterns). Anycast routing distributes attack traffic globally
114
MEMORY BOOSTER: Security in system design
Security layers: Auth(JWT+OAuth2) + AuthZ(RBAC/ABAC) + TLS everywhere + Encryption at rest(KMS) + Rate limiting + Input validation + Secrets(Vault) + Audit logs + VPC segmentation. JWT: short expiry(15min) + RS256 for distributed + httpOnly cookie for refresh token. RBAC=roles. ABAC=attributes. OWASP: SQL injection(parameterized queries) + XSS(encoding+CSP) + CSRF(SameSite+CSRF token). VPC: public subnet(LB) + private subnet(DB+app) + Security Groups(stateful) + NACL(stateless). DDoS: CDN + Shield + WAF + Rate limiting
115
What is the difference between IaaS - PaaS and SaaS in cloud?
IaaS (Infrastructure as a Service): raw compute + storage + network. You manage: OS + runtime + app + data. Examples: EC2 + S3 + VPC. Maximum control + maximum responsibility. PaaS (Platform as a Service): managed runtime + framework. You manage: app + data only. Examples: Elastic Beanstalk + Heroku + Google App Engine. Faster development + less ops. SaaS (Software as a Service): complete software. You manage: nothing technical. Examples: Gmail + Salesforce + GitHub. Zero infrastructure work. Choose based on: control needed vs ops overhead tolerance
116
What is serverless computing and its trade-offs?
Serverless (FaaS): write functions + cloud manages all infrastructure. Pay per invocation + execution duration (ms). Auto-scales to zero (cost efficient at low traffic) + infinite scale. Examples: AWS Lambda + Azure Functions + Google Cloud Functions. Trade-offs: Cold start latency (100ms-2s first invocation after idle) + max execution time (15min Lambda) + stateless (no local state between invocations) + vendor lock-in + harder debugging + limited runtime customization. Use for: event processing + scheduled tasks + webhooks + APIs with unpredictable traffic
117
What is Kubernetes and why does it matter for System Design?
Kubernetes orchestrates containerized workloads across a cluster. Provides: self-healing (restart failed containers) + horizontal auto-scaling (HPA - scale on CPU/memory/custom metrics) + rolling deployments (zero downtime) + service discovery (DNS) + load balancing + config management (ConfigMaps + Secrets) + resource limits (prevent noisy neighbor). Key objects: Pod (1+ containers) + Deployment (desired state) + Service (stable network endpoint) + Ingress (external HTTP routing) + HPA (auto-scaling) + PVC (persistent storage). De facto standard for production container orchestration
118
What is Infrastructure as Code (IaC)?
IaC defines infrastructure in code files (version-controlled + reviewable + automated). Tools: Terraform (HCL - multi-cloud - declarative - state file) + AWS CloudFormation (YAML/JSON - AWS-native - free) + AWS CDK (TypeScript/Python - synthesizes to CloudFormation) + Pulumi (real programming languages). Benefits: reproducible environments (dev=staging=prod) + peer review of infra changes + automated provisioning + drift detection (actual vs desired) + rollback by reverting commit. GitOps: infra changes via PR -> CI/CD applies automatically
119
What is auto-scaling and the different types?
Auto-scaling adjusts capacity based on demand. Types: Horizontal Pod Autoscaler (HPA - add/remove pods based on CPU/memory/custom metrics - K8s) + Vertical Pod Autoscaler (VPA - adjust resource requests/limits) + Cluster Autoscaler (add/remove EC2 nodes based on pending pods) + AWS Auto Scaling Group (scale EC2 instances by CloudWatch metrics). Policies: Target Tracking (maintain metric at target: CPU=60%) + Step Scaling (add N instances when metric crosses threshold) + Scheduled (known patterns: add capacity at 9am). Scale-out: add instances. Scale-in: remove (with cooldown period to prevent thrashing)
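The target-tracking idea in one formula (this is essentially the Kubernetes HPA rule: desired = ceil(current * currentMetric / targetMetric)):

```python
import math

def desired_replicas(current: int, current_metric: float, target: float) -> int:
    # Scale so that per-replica load returns to the target value
    return max(1, math.ceil(current * current_metric / target))

assert desired_replicas(4, 90, 60) == 6   # 4 pods at 90% CPU, target 60% -> scale out to 6
assert desired_replicas(4, 30, 60) == 2   # underloaded -> scale in (after the cooldown period)
```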
120
What is a multi-region architecture?
Multi-region deploys system in 2+ geographic regions for: lower latency (serve users from nearest region) + disaster recovery (region outage doesn't take down service) + data sovereignty (store EU user data in EU). Challenges: data replication across regions (latency + consistency) + increased cost + operational complexity. Active-Active: all regions serve traffic (Route 53 latency routing) + data sync between regions (conflict resolution needed). Active-Passive: one region primary + others standby (simpler - some downtime on failover). Data: Global tables (DynamoDB) + Aurora Global Database + S3 Cross-Region Replication
121
What is a data center - availability zone - and region?
Data Center: physical facility with servers + cooling + power + networking. Availability Zone (AZ): one or more data centers in a region - physically separate + independent power/cooling/networking - connected by low-latency private links. Region: geographic area with 2+ AZs (us-east-1 has 6 AZs). Deploy across multiple AZs for HA (AZ failure = data center fire/flood/power outage). Deploy across multiple Regions for: geo distribution + extreme HA + disaster recovery + data sovereignty. AZ redundancy: automatic in most AWS managed services (RDS Multi-AZ + ELB + Auto Scaling)
122
MEMORY BOOSTER: Cloud and infrastructure
IaaS=EC2+S3(you manage OS+app). PaaS=Beanstalk(you manage app only). SaaS=Gmail(nothing to manage). Serverless: pay per ms + cold start + stateless + auto-scale to zero. K8s: self-healing + HPA + rolling deploy + service discovery. IaC: Terraform(multi-cloud) + CloudFormation(AWS) + CDK(code->CFN). Auto-scaling: HPA(pods) + Cluster Autoscaler(nodes) + ASG(EC2). Multi-region: Active-Active(latency routing+sync) vs Active-Passive(simpler+failover). AZ=1+ DCs. Region=2+ AZs. Deploy: multi-AZ always + multi-region for global/extreme HA
123
What is the difference between synchronous and asynchronous processing?
Synchronous: client waits for operation to complete before continuing. Simple + immediate feedback + tight coupling between caller and callee. Problem: slow operations block caller + cascading timeouts. Use for: simple CRUD + real-time queries + operations that need immediate result. Asynchronous: client submits work and continues. Decoupled + resilient (queue buffers failures) + better throughput. Patterns: async via message queue (fire and forget + poll for result) + callbacks/webhooks (server calls back when done) + async/await (non-blocking I/O). Use for: email sending + video processing + payment processing + any slow operation
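A fire-and-forget sketch with an in-process queue and worker thread; a real system would use a durable broker (SQS, RabbitMQ) instead:

```python
import queue, threading

jobs = queue.Queue()
done = []

def worker():
    # The slow work (e.g. sending email) happens off the request path
    while True:
        job = jobs.get()
        if job is None:          # shutdown sentinel
            break
        done.append(f"sent email to {job}")
        jobs.task_done()

threading.Thread(target=worker, daemon=True).start()

# The "synchronous" caller just enqueues and returns immediately
jobs.put("alice@example.com")
jobs.put("bob@example.com")
jobs.join()   # demo only: wait for the queue to drain before inspecting results
```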
124
What is a search index and how does Elasticsearch work?
Elasticsearch is a distributed full-text search engine based on Apache Lucene. Core: inverted index (word -> list of documents containing it). Index: collection of documents. Shard: unit of distribution (index split into shards across nodes). Replica: copy of shard for HA + read scaling. Write: document indexed -> tokenized -> inverted index updated. Read: query -> search all shards -> merge results -> rank by relevance (TF-IDF + BM25). Use cases: full-text search + log analytics (ELK) + autocomplete + faceted search + geospatial. Sync from DB: CDC (Debezium) or dual write
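A toy inverted index showing the core data structure (real engines also store positions, frequencies, and ranking stats):

```python
from collections import defaultdict

docs = {1: "redis is a fast cache",
        2: "postgres is a relational database",
        3: "redis can also be a database"}

# Build: token -> set of doc ids (the "posting list")
index = defaultdict(set)
for doc_id, text in docs.items():
    for token in text.lower().split():
        index[token].add(doc_id)

def search_all(*terms):
    # AND query: intersect the posting lists of each term
    return set.intersection(*(index[t] for t in terms))

assert search_all("redis", "database") == {3}
assert index["redis"] == {1, 3}
```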
125
What is a content delivery architecture for global scale?
Layers: Client -> DNS (Route 53 latency routing to nearest region) -> CDN edge (CloudFront - cache static content + dynamic content with short TTL) -> Regional Load Balancer -> App Servers -> Regional Cache (Redis) -> Regional DB (read replica) -> Primary DB (writes only in home region). CDN cache hit: P99 < 10ms. Cache miss: P99 < 50ms (regional). DB read: P99 < 100ms. DB write: P99 < 200ms. Media: uploaded to S3 -> Lambda transcoding -> served via CloudFront. Images: serve via CDN + WebP format + responsive sizes
126
What is the difference between push and pull architectures?
Push (Fan-out on Write): when event occurs -> immediately push to all subscribers/feeds. Pros: reads are fast (pre-computed). Cons: expensive for high fan-out (celebrity with 100M followers -> 100M writes per tweet). Pull (Fan-out on Read): reader computes feed at read time by fetching latest posts from followed users. Pros: simple writes. Cons: slow reads for users following many accounts. Hybrid (used by Twitter): push for regular users (< 1M followers) + pull for celebrities (fan-out would be too expensive). Combine at read time. Most real systems use hybrid
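A toy hybrid fan-out sketch (threshold and data are illustrative):

```python
CELEB_THRESHOLD = 1_000_000
followers = {"celeb": [], "alice": ["bob"]}                  # toy follow graph
follower_count = {"celeb": 100_000_000, "alice": 1}
feeds = {}    # user -> tweets pushed into their pre-computed feed
tweets = {}   # author -> their own tweets (pulled at read time for celebrities)

def post(author, tweet):
    tweets.setdefault(author, []).append(tweet)
    if follower_count[author] < CELEB_THRESHOLD:             # push path
        for f in followers[author]:
            feeds.setdefault(f, []).append(tweet)            # fan-out on write

def read_feed(user, following):
    timeline = list(feeds.get(user, []))                     # pre-computed part
    for author in following:
        if follower_count[author] >= CELEB_THRESHOLD:        # pull path
            timeline.extend(tweets.get(author, []))          # merged at read time
    return timeline

post("alice", "hi")
post("celeb", "hello world")
assert read_feed("bob", following=["alice", "celeb"]) == ["hi", "hello world"]
```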
127
What is leader election in distributed systems?
Leader election selects one node as leader to perform singleton tasks (scheduled jobs + coordination + write ordering). Why needed: multiple instances of a service -> only one should run scheduled jobs. Algorithms: Raft (consensus algorithm used in etcd + CockroachDB) + Zookeeper (ephemeral sequential znodes) + Redis (SET NX EX - first to set key wins) + Kubernetes leader election (ConfigMap/Lease object). Leader failure: followers detect via heartbeat timeout -> trigger new election. Kubernetes: built-in leader election for controllers. Spring Integration: LockRegistryLeaderInitiator
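A sketch of the Redis SET NX EX pattern using an in-memory stand-in (FakeRedis simulates only the NX + TTL semantics needed here; against a real server you would call `SET key value NX EX ttl`):

```python
import time

class FakeRedis:
    # In-memory stand-in simulating SET key value NX EX ttl
    def __init__(self):
        self.store = {}   # key -> (value, expires_at)

    def set_nx_ex(self, key, value, ttl):
        now = time.monotonic()
        current = self.store.get(key)
        if current and current[1] > now:
            return False                     # key held by a live leader -> stay follower
        self.store[key] = (value, now + ttl) # acquired: become leader for ttl seconds
        return True

r = FakeRedis()
assert r.set_nx_ex("leader", "node-A", ttl=10) is True    # A wins the election
assert r.set_nx_ex("leader", "node-B", ttl=10) is False   # B becomes follower
r.store["leader"] = ("node-A", time.monotonic() - 1)      # simulate A dying (key expired)
assert r.set_nx_ex("leader", "node-B", ttl=10) is True    # B takes over on next attempt
```

The leader must keep renewing the key before the TTL lapses; if it crashes, the key expires and a follower's next attempt succeeds.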
128
What is the strangler fig pattern for system migration?
Gradually replace legacy system with new system without big-bang rewrite. Steps: 1) Add proxy/facade in front of legacy system. 2) Implement new service for one feature. 3) Route that feature's requests to new service via proxy. 4) Migrate more features iteratively. 5) Eventually legacy system receives no traffic -> decommission. Benefits: no big-bang rewrite + continuous operation + easy rollback (route back to legacy) + learn incrementally. Named after fig tree that grows around host tree. Used for: monolith to microservices + legacy system replacement + DB migration
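The proxy/facade step as a routing table that grows feature by feature (service names illustrative):

```python
# Features migrate one prefix at a time; everything else falls through to legacy
ROUTES = {"/orders": "new-order-service"}   # already migrated
LEGACY = "legacy-monolith"

def route(path: str) -> str:
    # The strangler facade's only job: decide old system or new
    for prefix, target in ROUTES.items():
        if path.startswith(prefix):
            return target
    return LEGACY

assert route("/orders/42") == "new-order-service"
assert route("/invoices/7") == LEGACY        # not migrated yet
ROUTES["/invoices"] = "new-billing-service"  # next iteration of the migration
assert route("/invoices/7") == "new-billing-service"
```

Rollback is just removing the route entry, which is why the pattern is low-risk.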
129
What is event-driven architecture and what are its trade-offs?
EDA: components communicate by producing and consuming events. Events are immutable facts about what happened. Producer doesn't know consumers. Consumers react independently. Benefits: loose coupling + temporal decoupling + easy extensibility (add consumer without changing producer) + natural audit log + replay capability. Challenges: eventual consistency + complex debugging (hard to trace causality) + event ordering issues + duplicate processing (idempotency required) + event schema evolution. Tools: Kafka (durable streaming) + SNS/SQS (simpler) + EventBridge (SaaS integration)
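The duplicate-processing point in miniature: an idempotent consumer keyed on event ID (in production the processed set would live in Redis or a unique-constraint table):

```python
processed = set()        # survives redelivery checks; in-memory only for this sketch
balance = {"acct": 0}

def handle(event):
    # At-least-once delivery means duplicates WILL arrive; event_id makes handling safe
    if event["event_id"] in processed:
        return                               # duplicate -> no-op
    processed.add(event["event_id"])
    balance[event["account"]] += event["amount"]

deposit = {"event_id": "e1", "account": "acct", "amount": 100}
handle(deposit)
handle(deposit)          # redelivered duplicate changes nothing
assert balance["acct"] == 100
```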
130
What is polyglot persistence?
Polyglot persistence uses different database types for different parts of the system based on data access patterns. Example e-commerce: Product catalog -> MongoDB (flexible JSON documents + search) + User sessions -> Redis (fast K-V + TTL) + Orders -> PostgreSQL (ACID transactions + relational) + Activity feed -> Cassandra (high write throughput + time-series) + Search -> Elasticsearch (full-text + facets) + Recommendations -> Neo4j (graph relationships). Each DB optimized for its use case. Challenge: data consistency across stores + operational complexity of running many DB types
131
What is a service mesh and do you always need one?
Service Mesh: infrastructure layer handling service-to-service communication via injected sidecar proxies (Envoy). Provides: mTLS (encrypted service communication) + traffic management (canary + A/B + circuit breaking) + observability (distributed tracing + metrics per route) + service discovery - all without application code changes. Examples: Istio + Linkerd + AWS App Mesh. Do you need it? No for: small number of services + simple communication. Yes for: many services + strong security requirements + advanced traffic management + centralized observability. Adds operational complexity
132
What is the CAP theorem and the PACELC extension?
CAP Theorem: Consistency (all nodes see same data at same time) + Availability (every request gets a response) + Partition Tolerance (system works despite network partitions). Can only guarantee 2 of 3. P is always needed (partitions happen) -> choose C or A. PACELC extends CAP: during Partition: trade-off A vs C. Else (normal operation): trade-off Latency vs Consistency. DynamoDB: PA/EL (partition-tolerant + available + low latency over consistency). PostgreSQL: PC/EC (consistent even in partitions + consistent over low latency). Helps choose DB based on actual trade-offs
133
What is data denormalization and materialized views?
Denormalization: store pre-computed/pre-joined data to avoid expensive JOIN queries at read time. Trade-off: faster reads + more storage + update complexity (must update denormalized copy when source changes). Materialized View: pre-computed query result stored as table. Updated: periodically (refresh scheduled) or on-trigger (incremental). PostgreSQL: CREATE MATERIALIZED VIEW + REFRESH MATERIALIZED VIEW. Use for: expensive aggregate queries run frequently + reporting dashboards. In microservices: each service maintains its own denormalized read model updated via events (CQRS read side)
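The materialized-view pattern simulated with sqlite3 (SQLite has no native materialized views, so the "view" is a plain table plus a refresh function mirroring PostgreSQL's REFRESH MATERIALIZED VIEW):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (customer TEXT, amount INT)")
db.executemany("INSERT INTO orders VALUES (?, ?)",
               [("alice", 100), ("alice", 50), ("bob", 70)])

# The "materialized view": the expensive aggregate stored as a real table
db.execute("""CREATE TABLE order_totals AS
              SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer""")

def refresh():
    # Equivalent of REFRESH MATERIALIZED VIEW: recompute and replace the contents
    db.execute("DELETE FROM order_totals")
    db.execute("""INSERT INTO order_totals
                  SELECT customer, SUM(amount) FROM orders GROUP BY customer""")

db.execute("INSERT INTO orders VALUES ('bob', 30)")   # source changes...
refresh()                                             # ...view is stale until refreshed
totals = dict(db.execute("SELECT * FROM order_totals ORDER BY customer"))
assert totals == {"alice": 150, "bob": 100}
```

Reads hit the small pre-aggregated table instead of re-running the GROUP BY; the cost is the staleness window between refreshes.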
134
MEMORY BOOSTER: Advanced system design
Sync: immediate+simple+blocking. Async: queue-based+resilient+decoupled. Push(fan-out on write): fast reads+expensive for celebrities. Pull(fan-out on read): simple writes+slow reads. Hybrid=both. Elasticsearch: inverted index + shards + replicas. Leader election: Redis SET NX EX or K8s Lease. Strangler Fig: gradual migration via proxy (route feature-by-feature to new service). EDA: loose coupling + eventual consistency + idempotent consumers required. Polyglot persistence: right DB per use case. CAP: P always needed -> C vs A. PACELC adds latency trade-off. Materialized views: pre-computed for fast reads