What are some key metrics that you should know about caching (specifically around latency, operations per second, and memory bounds)?
For most modern caching systems:
- latency (~1ms)
- operations per second (~100k/second)
- memory bounds (up to ~1TB)
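As a back-of-envelope illustration, the figures above can be turned into a rough node count. This is only a sketch: the function name, the per-node defaults (taken from the numbers above), and the 80% memory headroom are assumptions, not properties of any particular cache.

```python
import math


def cache_nodes_needed(
    dataset_bytes: int,
    peak_ops_per_sec: int,
    node_memory_bytes: int = 10**12,   # ~1TB per node (figure from the list above)
    node_ops_per_sec: int = 100_000,   # ~100k ops/second per node (figure from the list above)
    memory_headroom: float = 0.8,      # assumption: plan to fill at most 80% of memory
) -> int:
    """Rough estimate of how many cache nodes a workload needs."""
    # Nodes required so the dataset fits within the usable share of memory.
    by_memory = math.ceil(dataset_bytes / (node_memory_bytes * memory_headroom))
    # Nodes required so the peak request rate stays within per-node throughput.
    by_ops = math.ceil(peak_ops_per_sec / node_ops_per_sec)
    # The tighter of the two constraints wins; always at least one node.
    return max(by_memory, by_ops, 1)
```

For example, a 500GB dataset at 450k operations/second is bound by throughput rather than memory, so the estimate is driven by the operations figure.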
What are some signs/triggers that you may need to consider scaling a cache?
Generally speaking:
- poor hit rate (< 80%)
- higher latency (> 1ms)
- high memory (> 80% usage)
- churn/thrashing
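The triggers above could be checked against periodically sampled cache statistics, roughly as sketched below. The `CacheStats` fields and the churn rule (evictions exceeding half of writes) are illustrative assumptions; the other thresholds come from the list above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class CacheStats:
    # Hypothetical snapshot of cache counters; field names are illustrative.
    hits: int
    misses: int
    p99_latency_ms: float
    memory_used_bytes: int
    memory_limit_bytes: int
    evictions_per_sec: float
    sets_per_sec: float


def cache_scaling_signals(s: CacheStats) -> list[str]:
    """Return the names of the scaling triggers that this snapshot trips."""
    signals = []
    total = s.hits + s.misses
    hit_rate = s.hits / total if total else 1.0
    if hit_rate < 0.80:
        signals.append("poor hit rate")
    if s.p99_latency_ms > 1.0:
        signals.append("high latency")
    if s.memory_used_bytes / s.memory_limit_bytes > 0.80:
        signals.append("high memory")
    # Assumption: treat the cache as thrashing when more than half of all
    # writes are immediately offset by evictions.
    if s.sets_per_sec > 0 and s.evictions_per_sec / s.sets_per_sec > 0.5:
        signals.append("churn")
    return signals
```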
What are some key metrics that you should know about most modern databases (specifically around transactions, latency, and storage capacities)?
Most modern databases support:
- high transaction throughput (> 50k/second)
- low latency (< 5ms/read)
- high storage capacities (64TB+)
When should you consider scaling your database? What are some common signs?
You should consider scaling if you start to see:
- high write throughput (> 10k writes/second)
- higher latencies (> 5ms read latencies)
- geographic distribution (a product requirement to serve users in multiple regions)
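A minimal sketch of how these signs might be evaluated from monitoring data. The write rate is derived from two samples of a cumulative write counter; the function names and the idea of comparing user regions against database regions are assumptions made for illustration.

```python
from __future__ import annotations

WRITE_LIMIT_PER_SEC = 10_000   # threshold from the list above
READ_LATENCY_LIMIT_MS = 5.0    # threshold from the list above


def writes_per_second(count_start: int, count_end: int, interval_s: float) -> float:
    """Average write rate between two samples of a cumulative write counter."""
    return (count_end - count_start) / interval_s


def db_scaling_signals(
    write_rate: float,
    read_latency_ms: float,
    user_regions: set[str],
    db_regions: set[str],
) -> list[str]:
    """Return the scaling signs that the given measurements trip."""
    signals = []
    if write_rate > WRITE_LIMIT_PER_SEC:
        signals.append("high write throughput")
    if read_latency_ms > READ_LATENCY_LIMIT_MS:
        signals.append("high read latency")
    # Assumption: a region with users but no database presence counts as a
    # geographic-distribution signal.
    unserved = user_regions - db_regions
    if unserved:
        signals.append("geographic distribution: " + ", ".join(sorted(unserved)))
    return signals
```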
What are some key metrics about your average modern application server (specifically in terms of concurrent requests, CPU, and RAM)?
The average application server should be able to support:
- large number of concurrent connections (100k+)
- large CPUs (8-64 cores, 2-4GHz)
- high amounts of RAM (64-128GB minimum, up to 2TB)
What are some key identifiers or metrics that may indicate you need to consider scaling an application server?
You should consider scaling when you see:
- high CPU (> 70% utilization)
- high memory (> 80%)
- latency threatening SLAs
- connections near capacity (>= 100k)
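These checks could be combined into a single health probe, sketched below. The parameter names are hypothetical, and "threatening" an SLA is interpreted here as p99 latency reaching 80% of the SLA budget, which is an assumption rather than a standard definition.

```python
from __future__ import annotations


def app_server_scaling_signals(
    cpu_util: float,            # fraction of CPU in use, 0.0-1.0
    mem_util: float,            # fraction of RAM in use, 0.0-1.0
    p99_latency_ms: float,
    sla_latency_ms: float,
    open_connections: int,
    connection_limit: int = 100_000,  # capacity figure from the list above
    sla_margin: float = 0.8,          # assumption: alert at 80% of the SLA budget
) -> list[str]:
    """Return the names of the scaling triggers that are currently tripped."""
    checks = {
        "high CPU": cpu_util > 0.70,
        "high memory": mem_util > 0.80,
        "latency threatening SLA": p99_latency_ms >= sla_margin * sla_latency_ms,
        "connections near capacity": open_connections >= connection_limit,
    }
    # Dicts preserve insertion order, so results follow the order above.
    return [name for name, tripped in checks.items() if tripped]
```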
What are some key metrics to know about popular message queues (specifically regarding throughput, latency, and storage)?
Your average message queue (e.g., RabbitMQ, Kafka, etc.) should be able to support:
- extremely high throughput (1M+/second/broker)
- low latency (< 5ms end-to-end)
- reasonably high storage (~50TB)
What signs may indicate you need to scale your message queues/brokers?
You should consider scaling or adding appropriate resources if you see:
- extremely high throughput (approaching ~800k+/second/broker)
- high partition counts (~200k partitions/cluster)
- growing consumer lag
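Consumer lag is the least self-explanatory of these signs, so here is a small sketch of how it might be measured. Lag is taken as the gap between each partition's latest offset and the consumer group's committed offset; "growing" is judged by the average change across successive samples. The function names and plain-dict inputs are illustrative and do not correspond to any broker client API.

```python
from __future__ import annotations


def total_lag(end_offsets: dict[int, int], committed: dict[int, int]) -> int:
    """Sum of per-partition lag (latest offset minus committed offset)."""
    # A partition with no committed offset is treated as fully unread.
    return sum(max(end - committed.get(p, 0), 0) for p, end in end_offsets.items())


def lag_growth_per_sample(lag_samples: list[int]) -> float:
    """Average change in lag between consecutive samples (positive = growing)."""
    if len(lag_samples) < 2:
        return 0.0
    return (lag_samples[-1] - lag_samples[0]) / (len(lag_samples) - 1)


def lag_is_growing(lag_samples: list[int]) -> bool:
    """True when lag trends upward, i.e. consumers are falling behind producers."""
    return lag_growth_per_sample(lag_samples) > 0
```

A sustained positive growth rate suggests consumers cannot keep up, which is usually addressed by adding consumers or partitions rather than waiting for the backlog to clear on its own.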