CAP
The CAP theorem states that it is impossible for a distributed system to simultaneously provide all three of the following desirable properties:
Consistency (C): Every read receives the most recent write (no stale data). Users can read from or write to any node in the system and will see the same data.
Availability (A): Every request receives a (non-error) response, even during failures, without a guarantee that it reflects the latest write. Availability refers to a system's ability to remain accessible even if one or more nodes go down.
Partition tolerance (P): The system continues to operate despite network failures that split it.
Key Idea:
In a distributed system, you can only fully guarantee two of the three at any given moment, not all three. We cannot build a general data store/db that is continually available, sequentially consistent, and tolerant to any partition failures. We can only build a system that has any two of these three properties.
Real-World Examples:
CP (Consistent + Partition-Tolerant)
Focus: Sacrifices Availability
Examples: HBase, MongoDB (when configured for strong consistency)
AP (Available + Partition-Tolerant)
Focus: Sacrifices Consistency
Examples: CouchDB, Cassandra, DynamoDB (eventual consistency mode)
CA (Consistent + Available)
Focus: Not possible under real network partitions (theoretical only)
Important nuance:
Partition Tolerance is not optional in distributed systems.
So the real tradeoff is Consistency vs Availability (during partitions).
Note: During a network partition, a distributed system is forced to give up either Consistency or Availability. Therefore, the theorem can really be stated as: in the presence of a network partition, a distributed system must choose either Consistency or Availability.
CP System:
Example (Very Clear)
Imagine a CP system with 2 nodes:
Node A <--X--> Node B (network partition)
A write goes to Node A
Node B cannot see it
If Node B served reads → data inconsistency
So the CP choice:
Node B refuses requests, or
Node A refuses writes
→ Consistency preserved
→ Availability sacrificed
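The refusal logic above can be sketched as follows. This is a minimal toy illustration with a hypothetical `CPNode` class and a majority-quorum rule, not any real database's API:

```python
class CPNode:
    """Toy CP node: a write succeeds only if a majority of the
    cluster acknowledges it; otherwise it is rejected."""

    def __init__(self, name, cluster_size):
        self.name = name
        self.cluster_size = cluster_size
        self.reachable_peers = []   # peers we can currently talk to
        self.data = {}

    def write(self, key, value):
        acks = 1                    # our own copy counts as one ack
        for peer in self.reachable_peers:
            peer.data[key] = value  # replicate to each reachable peer
            acks += 1
        if acks <= self.cluster_size // 2:
            # Cut off from the majority: refuse the write to stay consistent.
            raise RuntimeError(f"{self.name}: no quorum, write rejected")
        self.data[key] = value
        return "ok"


a = CPNode("A", cluster_size=2)
b = CPNode("B", cluster_size=2)
a.reachable_peers = [b]
a.write("x", 1)          # both nodes ack -> write accepted
a.reachable_peers = []   # partition: A can no longer reach B
# a.write("x", 2)        # would raise: consistency preserved, availability lost
```

In a 2-node cluster a majority is 2, so any partition blocks writes on both sides, which is exactly the "Node B refuses requests / Node A refuses writes" choice above.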
When a Partition Happens:
You must choose:
CP: Return error or timeout to preserve consistency.
AP: Return possibly stale data to stay available.
Example Scenario:
Imagine a network partition splits your cluster.
A CP system may reject a request if it can’t guarantee the latest data.
An AP system may still serve old data but won't error out.
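These two behaviors can be sketched as a pair of toy read functions (hypothetical names, not any real client API):

```python
def read_cp(local_copy, can_reach_leader):
    """CP read: answer only when the latest value can be confirmed;
    otherwise fail rather than risk returning stale data."""
    if not can_reach_leader:
        raise TimeoutError("partition: cannot confirm latest value")
    return local_copy


def read_ap(local_copy, can_reach_leader):
    """AP read: always answer from the local copy,
    accepting that it may be stale during a partition."""
    return local_copy
```

During a partition (`can_reach_leader=False`), `read_cp` errors out to preserve consistency, while `read_ap` still returns the (possibly stale) local value to preserve availability.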
What is a Partition in CAP Theorem?
In the context of distributed systems, a partition (also called a network partition) is when network communication breaks between two or more parts (nodes) of the system, but each part still runs independently.
Simple Analogy:
Imagine your app is deployed on three servers:
Server A (East Coast)
Server B (West Coast)
Server C (Europe)
Now suppose a network failure happens:
Server A can't talk to Server B and C.
But each server is still up and running.
This split is a network partition.
Impact of a Partition:
When a partition happens, your system must choose:
Should it serve stale data from available servers? (Availability, sacrifice Consistency)
Or should it block access until it reconnects to get the latest data? (Consistency, sacrifice Availability)
CA (Consistency + Availability)
These systems provide:
Consistency: All nodes see the same data at the same time
Availability: Every request receives a response (no errors or timeouts)
No Partition Tolerance: They assume no network partitions happen
Why CA Isn't Practical in Distributed Systems
CA is only achievable in systems where no network partition occurs, meaning all nodes are always connected and never fail to communicate with each other.
NOTE: Partition Tolerance is mandatory, not optional.
CA = Consistency + Availability, but NOT Partition Tolerance
That means:
All nodes always agree on data (Consistency)
Every request always gets a response (Availability)
But the system cannot handle partitions
So when a partition happens:
Nodes cannot communicate
The system breaks correctness or availability
Therefore the system is no longer functional
But in real-world distributed systems, network partitions can and do happen (e.g., server crashes, network lag, dropped packets, latency spikes). So if a CA system faces a partition, it can't maintain both consistency and availability; it will break.
Where CA is possible:
Single-node databases (no distribution involved, not at scale), e.g., a single-node RDBMS such as SQLite
PACELC Theorem
PACELC stands for:
If there is a Partition (P), a system must choose between Availability (A) and Consistency (C); Else (E), when the system is running normally, it must choose between Latency (L) and Consistency (C).
So it’s:
Partition → Availability or Consistency
Else (no partition) → Latency or Consistency
CAP only considers trade-offs during network partitions; it says nothing about how a distributed system behaves when it is not partitioned, so PACELC adds another layer.
Model: PA/EL
Partitioned (P): Prioritize Availability
Else (no P): Prioritize Latency
i.e., giving up consistency
Examples: Amazon DynamoDB, Cassandra
Model: PC/EC
Partitioned (P): Prioritize Consistency
Else (no P): Prioritize Consistency
i.e., giving up availability
Example: HBase
Model: PA/EC
Partitioned (P): Prioritize Availability
Else (no P): Prioritize Consistency
i.e., giving up consistency and latency respectively
Examples: Azure Cosmos DB, MongoDB
Note: Why this tradeoff exists even with no partition
In a distributed system, data is replicated across nodes for fault tolerance and scale.
So even in a healthy network:
A write on Node A must be:
Sent to replicas (Node B, C, …)
Acknowledged
Reads may come from:
Leader
Replicas
Caches
Design choice:
Wait for replicas → higher consistency, higher latency (Sync Replication)
Respond immediately → lower latency, weaker consistency (Async Replication)
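The two design choices can be sketched with a toy leader that replicates either synchronously or asynchronously. The `Leader`/`Follower` classes and the simulated network delay are hypothetical, purely to illustrate the latency/consistency tradeoff:

```python
import time


class Follower:
    def __init__(self, delay):
        self.delay = delay      # simulated network round-trip (seconds)
        self.data = {}

    def replicate(self, key, value):
        time.sleep(self.delay)  # pretend the write crosses the network
        self.data[key] = value


class Leader:
    def __init__(self, followers):
        self.followers = followers
        self.data = {}

    def write_sync(self, key, value):
        # Sync replication: acknowledge only after every follower has the
        # write -> higher latency, but replicas agree (stronger consistency).
        self.data[key] = value
        for f in self.followers:
            f.replicate(key, value)
        return "ok"

    def write_async(self, key, value):
        # Async replication: acknowledge immediately; followers catch up
        # later -> low latency, but a follower read may return stale data.
        self.data[key] = value
        return "ok"


leader = Leader([Follower(delay=0.05)])
leader.write_sync("x", 1)   # slower ack, but the follower now has x=1
leader.write_async("x", 2)  # instant ack; the follower still has x=1 (stale)
```

Reading from the follower right after `write_async` returns the old value, which is exactly the Latency-vs-Consistency choice PACELC describes for the no-partition case.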