CAP Theorem
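Consistent when Partitioned (CP) → Rejects or blocks requests during a network partition rather than return stale data.
Available when Partitioned (AP) → Keeps serving requests during a partition; replicas may diverge and reconcile later (eventual consistency).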
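A toy sketch of the CP/AP trade-off, assuming a single replica that knows whether it can reach its peers. The `Replica` class, its `mode` labels, and `try_write` are hypothetical names for illustration, not any real database's API.

```python
class Replica:
    """Toy replica; `mode` is 'CP' or 'AP' (labels used only in this sketch)."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.partitioned = False  # True when this replica cannot reach its peers
        self.data = {}

    def write(self, key, value):
        # CP: refuse the write rather than risk diverging from unreachable peers.
        if self.partitioned and self.mode == "CP":
            raise RuntimeError("unavailable: cannot reach quorum")
        # AP: accept locally; peers must reconcile once the partition heals.
        self.data[key] = value
        return "ok"


def try_write(replica, key, value):
    """Return 'ok' or the error message, so both modes can be compared."""
    try:
        return replica.write(key, value)
    except RuntimeError as err:
        return str(err)
```

With `partitioned = True`, the CP replica returns an error and stores nothing, while the AP replica accepts the write and leaves conflict resolution for later.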
Data Storage
Relational DB (RDS, Postgres, MySQL) → Strong consistency, joins, transactions.
NoSQL (MongoDB, DynamoDB, Cassandra) → Flexible schema, high write/read throughput, denormalization.
Object storage (S3, GCS) → Cheap, durable, good for files/blobs.
Time-series DB (InfluxDB, TimescaleDB) → Optimized for append-only, temporal queries.
Rule of thumb: Store metadata in the DB and large blobs in object storage (e.g. S3), keeping only the object key in the metadata row.
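A minimal sketch of that split, assuming an in-memory SQLite table for metadata and a plain dict standing in for the object store. The `files` table, `save_file`, `load_file`, and the `uploads/` key prefix are all hypothetical; a real system would call the storage provider's SDK instead of the dict.

```python
import hashlib
import sqlite3

# Stand-in for an object store such as S3 (assumption: key -> bytes).
object_store = {}

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE files ("
    "id INTEGER PRIMARY KEY, name TEXT, size_bytes INTEGER, blob_key TEXT)"
)


def save_file(name, payload):
    """Put the bytes in the object store and only a small metadata row in the DB."""
    blob_key = "uploads/" + hashlib.sha256(payload).hexdigest()
    object_store[blob_key] = payload
    cur = db.execute(
        "INSERT INTO files (name, size_bytes, blob_key) VALUES (?, ?, ?)",
        (name, len(payload), blob_key),
    )
    db.commit()
    return cur.lastrowid


def load_file(file_id):
    """Look up the key in the DB, then fetch the bytes from the object store."""
    row = db.execute(
        "SELECT name, blob_key FROM files WHERE id = ?", (file_id,)
    ).fetchone()
    if row is None:
        raise KeyError(file_id)
    name, blob_key = row
    return name, object_store[blob_key]
```

The DB row stays small and queryable regardless of how large the blob is, which is the point of the rule.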
Caching
Where?
Eviction policies: LRU, LFU, TTL.
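A small sketch combining two of these policies (LRU and TTL), assuming a single-process, in-memory cache. The class name `LRUTTLCache` and its parameters are hypothetical; LFU is not shown and would need per-key hit counts.

```python
import time
from collections import OrderedDict


class LRUTTLCache:
    def __init__(self, capacity, ttl_seconds, clock=time.monotonic):
        self.capacity = capacity
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so tests can control time
        self._items = OrderedDict()  # key -> (value, expires_at), oldest first

    def get(self, key, default=None):
        entry = self._items.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._items[key]  # TTL eviction: the entry is stale
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        self._items[key] = (value, self.clock() + self.ttl)
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # LRU eviction: drop the oldest
```

Reads refresh recency but not the expiry time, so a hot key still expires after `ttl_seconds`; whether that is desirable depends on how stale the data may be.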
Estimation (quick math for interviews)
Storage:
* 1 KB ≈ 1,000 bytes; 1 MB ≈ 1M bytes; 1 GB ≈ 1B bytes.
* Text row (id + timestamp + ~200 chars) ≈ 1 KB
* Photo (JPEG, compressed) ≈ 200 KB
* 1 minute of video (Full HD 1080p) ≈ 10 MB
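A quick worked example using the sizes above. The daily volumes (50M text rows, 2M photos, 100K video minutes) are made-up inputs for illustration only.

```python
KB = 1_000
MB = 1_000 * KB
GB = 1_000 * MB
TB = 1_000 * GB

# Per-item sizes taken from the list above.
TEXT_ROW = 1 * KB
PHOTO = 200 * KB
VIDEO_MINUTE = 10 * MB


def daily_storage_bytes(text_rows, photos, video_minutes):
    """Raw bytes written per day, before replication or compression."""
    return text_rows * TEXT_ROW + photos * PHOTO + video_minutes * VIDEO_MINUTE


# Hypothetical workload: 50M rows, 2M photos, 100K video minutes per day.
per_day = daily_storage_bytes(50_000_000, 2_000_000, 100_000)
per_year = per_day * 365

print(per_day / TB)   # 1.45 TB per day
print(per_year / TB)  # 529.25 TB per year
```

In this made-up workload video accounts for 1 TB of the 1.45 TB per day, even though it is by far the smallest item count.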
Cache:
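* Cache ≈ 20% of the total data set (the hot set); the hit rate this yields depends on access skew, so treat 99% as an optimistic target.
* Assume up to ~1 TB of RAM per cache machine.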
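A sketch of the cache sizing arithmetic, assuming the 20% fraction and 1 TB of RAM per machine from the list above. The function name and the example data sizes are hypothetical, and the result ignores replication and per-entry overhead.

```python
TB = 1_000_000_000_000


def cache_machines(total_data_bytes, hot_percent=20, ram_per_machine_bytes=1 * TB):
    """Number of cache machines needed to hold the hot fraction of the data."""
    cache_bytes = total_data_bytes * hot_percent // 100
    # Ceiling division with integers, to avoid floating-point rounding.
    return -(-cache_bytes // ram_per_machine_bytes)


print(cache_machines(12 * TB))  # 2.4 TB of cache -> 3 machines
```

A data set under 5 TB fits its hot 20% on a single machine under these assumptions, so many interview-sized systems need only one cache node plus replicas.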
RPS estimation:
* Rough capacity per unit: Lambda ≈ 1,000 concurrent executions (default regional limit); EC2 ≈ 500 RPS (t3.medium) to 10,000 RPS (c6i.4xlarge).
* Average RPS ≈ DAU × requests per user per day ÷ 86,400 (seconds in a day).
* Example: 10M DAU × 20 req ÷ 86,400 ≈ 2,300 RPS.
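The same arithmetic as a small helper, plus a server count derived from the per-instance figures above. The `peak_factor` parameter is an assumption added here to allow for traffic spikes; it is not part of the original formula.

```python
import math

SECONDS_PER_DAY = 86_400


def average_rps(dau, requests_per_user):
    """Average requests per second over a full day."""
    return dau * requests_per_user / SECONDS_PER_DAY


def servers_needed(rps, rps_per_server, peak_factor=1.0):
    """Servers required to handle `rps`, scaled by an assumed peak multiplier."""
    return math.ceil(rps * peak_factor / rps_per_server)


rps = average_rps(10_000_000, 20)
print(round(rps))                                  # 2315
print(servers_needed(rps, 500))                    # 5 at average load
print(servers_needed(rps, 500, peak_factor=2.0))   # 10 if peak is 2x average
```

Traffic is rarely uniform across the day, so sizing for the average alone tends to undercount; the multiplier to use is workload-specific.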
AWS Services Use Cases
When stuck
Storage, throughput, consistency
Steps