What is scalability in system design?
Scalability is the ability of a system to handle increased load by adding resources. It can be vertical (adding more power to existing machines) or horizontal (adding more machines).
What is the difference between vertical and horizontal scaling?
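Vertical scaling (scaling up) means adding more power (CPU, RAM, storage) to an existing server. Horizontal scaling (scaling out) means adding more servers and distributing the load across them, usually behind a load balancer. Vertical scaling is simpler to operate but is capped by the largest machine available and leaves a single point of failure. Horizontal scaling can keep growing and tolerates the loss of individual servers, at the cost of extra coordination between machines.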
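A minimal sketch of how horizontal scaling spreads load, assuming a simple round-robin policy. The server names and the `RoundRobinBalancer` class are hypothetical and exist only for illustration; production load balancers typically also consider server health and capacity.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hypothetical balancer that hands requests to servers in turn."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self._next = cycle(self._servers)

    def route(self, request_id):
        # Pick the next server in rotation; the request itself is not inspected.
        return next(self._next)


# Illustrative server names, not taken from any real deployment.
balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
assignments = [balancer.route(i) for i in range(6)]
```

With three servers each one receives a third of the requests; adding a fourth to the list would lower each share to a quarter, which is the basic idea behind scaling out.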
What is reliability in system design?
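Reliability is the ability of a system to continue functioning correctly even when failures occur. It is commonly quantified with measures such as failure rate or mean time between failures (MTBF). Uptime percentages like 99.9% (three nines) or 99.99% (four nines) are often quoted alongside it, although strictly speaking they describe availability.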
What is availability?
Availability is the proportion of time a system is operational and accessible when needed, usually expressed as a percentage of uptime. High-availability systems minimize downtime through redundancy and failover mechanisms.
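To make availability percentages concrete, the sketch below converts a target into the downtime it allows. It assumes a 365-day year, and the function name `downtime_budget_hours` is made up for this example.

```python
def downtime_budget_hours(availability_percent, period_hours=365 * 24):
    """Return the maximum downtime (in hours) allowed over the period."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return period_hours * (1 - availability_percent / 100)


three_nines = downtime_budget_hours(99.9)   # about 8.76 hours per year
four_nines = downtime_budget_hours(99.99)   # about 0.88 hours (roughly 53 minutes) per year
```

Each additional nine cuts the allowed downtime by a factor of ten, which is why every extra nine tends to be noticeably harder to achieve.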
What is the CAP theorem?
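The CAP theorem states that a distributed data store cannot simultaneously guarantee all three of the following properties: Consistency (every read sees the most recent write), Availability (every request receives a non-error response), and Partition Tolerance (the system keeps working even when the network drops or delays messages between nodes). Because network partitions cannot be ruled out in practice, the real choice is between consistency and availability while a partition is in progress.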
What is maintainability?
Maintainability refers to how easy it is to operate, update, debug, and extend a system over time. It includes code quality, documentation, and operational practices.
What does consistency mean in distributed systems?
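Consistency means all nodes in a distributed system see the same data at the same time. Strong consistency guarantees that once a write completes, every subsequent read returns that value. Eventual consistency allows replicas to disagree temporarily, as long as they converge to the same value once updates stop.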
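The difference is easier to see in a toy model. The sketch below assumes one primary and one secondary replica held in memory, with replication triggered by an explicit `sync()` call; every class and method name is hypothetical, and real systems replicate over a network with far more complex conflict handling.

```python
class Replica:
    """Hypothetical in-memory replica holding key/value pairs."""

    def __init__(self):
        self.data = {}


class EventuallyConsistentStore:
    """Writes go to the primary; the secondary catches up only on sync()."""

    def __init__(self):
        self.primary = Replica()
        self.secondary = Replica()
        self._pending = []  # writes not yet applied to the secondary

    def write(self, key, value):
        self.primary.data[key] = value
        self._pending.append((key, value))

    def read_from_secondary(self, key):
        return self.secondary.data.get(key)

    def sync(self):
        # Apply queued writes in order so the secondary converges on the primary.
        for key, value in self._pending:
            self.secondary.data[key] = value
        self._pending.clear()


store = EventuallyConsistentStore()
store.write("cart", ["book"])
stale = store.read_from_secondary("cart")   # secondary has not seen the write yet
store.sync()
fresh = store.read_from_secondary("cart")   # replicas have converged
```

A strongly consistent design would instead make `write` wait until the secondary had applied the update, trading write latency for reads that are never stale.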
What is fault tolerance?
Fault tolerance is the ability of a system to continue operating properly in the event of failure of some of its components. It’s achieved through redundancy and graceful degradation.
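A minimal sketch of redundancy with failover, assuming each backend is a plain callable that raises `ConnectionError` when it is unavailable. The names `fetch_with_failover`, `failing_primary`, and `healthy_replica` are invented for this example.

```python
def fetch_with_failover(key, backends):
    """Try each backend in order and return the first successful result."""
    errors = []
    for backend in backends:
        try:
            return backend(key)
        except ConnectionError as exc:
            # Record the failure and fall through to the next redundant backend.
            errors.append(exc)
    raise ConnectionError(f"all {len(backends)} backends failed: {errors}")


def failing_primary(key):
    # Stand-in for a primary node that is currently unreachable.
    raise ConnectionError("primary is down")


def healthy_replica(key):
    # Stand-in for a replica that can still serve the request.
    return f"value-for-{key}"


result = fetch_with_failover("user:42", [failing_primary, healthy_replica])
```

The caller still gets an answer even though the first component failed. Only when every redundant backend is down does the error surface.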
What is latency?
Latency is the time delay between a request and its response. Lower latency means faster response times. It’s typically measured in milliseconds.
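Latency is usually summarized over many requests rather than read from a single one. The sketch below times a call with `time.perf_counter` and computes nearest-rank percentiles; the sample values are made up, and nearest-rank is only one of several percentile definitions in use.

```python
import math
import time


def measure_latency_ms(operation):
    """Return how long one call to operation() takes, in milliseconds."""
    start = time.perf_counter()
    operation()
    return (time.perf_counter() - start) * 1000


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of values at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


# Made-up latency samples in milliseconds, with one slow outlier.
samples_ms = [12, 15, 11, 250, 14, 13, 16, 12, 13, 14]
p50 = percentile(samples_ms, 50)   # typical request
p99 = percentile(samples_ms, 99)   # dominated by the slow outlier
elapsed_ms = measure_latency_ms(lambda: sum(range(1000)))
```

Here the median stays low while the 99th percentile is set by the single slow request, which is why tail percentiles are often tracked alongside the average.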
What is throughput?
Throughput is the number of operations or requests a system can handle in a given time period, often measured in requests per second (RPS) or transactions per second (TPS).
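As a rough illustration, throughput can be computed from a request count and a measurement window, and related to latency through Little's Law (average requests in flight = arrival rate x average time in the system). The figures and function names below are assumptions chosen for the example, and Little's Law only holds for a system in steady state.

```python
def throughput_rps(completed_requests, elapsed_seconds):
    """Average requests per second over a measurement window."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return completed_requests / elapsed_seconds


def requests_in_flight(rps, avg_latency_seconds):
    """Little's Law: average concurrency = arrival rate x average time in system."""
    return rps * avg_latency_seconds


# Hypothetical measurement: 12,000 requests completed in one minute.
rps = throughput_rps(completed_requests=12_000, elapsed_seconds=60)
# With an assumed average latency of 50 ms, estimate concurrent requests.
in_flight = requests_in_flight(rps, avg_latency_seconds=0.05)
```

In this example the system sustains 200 requests per second with about 10 requests in progress at any moment, so throughput and latency are linked but not interchangeable.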