Give me an example system design of a book store
WAF
AWS Gateway
Microservices - books, search, cart, billing
VPC
ALB
DB
Horizontal vs Vertical Scaling
Single point of failure
Redundancy
Types of Redundancy
Hardware Redundancy
Extra servers, CPUs, or network links.
Example: Two load balancers in active-passive mode.
Software Redundancy
Multiple implementations of the same logic or service.
Example: A payment service built with a fallback provider.
Data Redundancy
Copying or replicating data across nodes or regions.
Example: Database replication (primary + replicas).
Time Redundancy
Retrying failed operations.
Example: Message queues re-delivering undelivered events.
🎯 Purpose of Redundancy
High availability (system stays up even if parts fail).
Fault tolerance (errors in one component don’t cascade).
Disaster recovery (data and services can be restored).
Load balancing (sharing work across multiple resources).
⚖️ Trade-offs
Pros: Improves reliability, availability, safety.
Cons: Increases cost, complexity, and can introduce consistency issues.
👉 In short: redundancy in software architecture is about having backup paths (extra hardware, software, or data) so the system can survive failures gracefully.
Circuit breaker
ingress vs egress§
cap theorm
🔑 CAP Theorem Basics
It states that in a distributed data system, you can only fully guarantee two out of the following three properties at the same time:
Consistency (C)
Every read gets the most recent write (all nodes see the same data at the same time).
Example: If you update your bank balance, any device you check immediately shows the new balance.
Availability (A)
Every request gets a valid response, even if some nodes are down.
The system always responds, but the data might not be the very latest.
Partition Tolerance (P)
The system continues to operate even if there’s a network split (some nodes can’t talk to others).
This is non-optional in distributed systems because network failures are inevitable.
⚖️ Trade-off
Since P is unavoidable, real-world systems typically choose between:
CP (Consistency + Partition Tolerance): Sacrifice some availability.
Example: HBase, MongoDB (in some configurations).
AP (Availability + Partition Tolerance): Sacrifice strict consistency (eventual consistency instead).
Example: Cassandra, DynamoDB.
📊 Example
Banking system: Chooses Consistency over Availability (better to reject a transaction than show wrong balance).
Social media feed: Chooses Availability over Consistency (better to show slightly stale posts than show an error).
👉 In short: CAP theorem reminds us that in distributed systems you can’t have Consistency, Availability, and Partition Tolerance all perfectly at once. You must design around which trade-off matters most.