Database Replication Flashcards

(12 cards)

1
Q

Why do distributed systems need data replication?

A

A single node is a single point of failure and cannot deliver high availability, scalability, and low latency on its own. Replication spreads copies across nodes so the system stays fast and resilient when nodes fail.

2
Q

What is data replication?

A

Keeping multiple copies of the same data across different nodes (often geographically distributed) to improve availability, performance, and read scalability.

3
Q

What are the main benefits of replication?

A

Lower latency (data closer to users)
Higher availability (survives node failures)
Higher read throughput (more replicas serve reads)

4
Q

What problems does replication introduce?

A

Keeping replicas consistent
Handling failed nodes
Choosing sync vs async replication
Managing replication lag
Handling concurrent writes
Picking a consistency model

5
Q

How does synchronous replication work?

A

The primary waits for acknowledgments from all replicas before confirming the write to the client.
Trade-off: Strong consistency ✅, higher write latency and stalled writes when any replica is slow or down ❌
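
A minimal sketch of the synchronous write path, assuming in-memory replicas. `Replica` and `write_sync` are hypothetical names for illustration, not any database's API, and a real system would also time out or roll back on a missing acknowledgment.

```python
class Replica:
    """In-memory stand-in for a replica node (hypothetical)."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.data = {}

    def apply(self, key, value):
        # Returns True as the acknowledgment; an unhealthy replica never acks.
        if not self.healthy:
            return False
        self.data[key] = value
        return True


def write_sync(primary_data, replicas, key, value):
    """Confirm the write only after every replica has acknowledged it."""
    primary_data[key] = value
    acks = [replica.apply(key, value) for replica in replicas]
    return all(acks)  # a single missing ack blocks confirmation


replicas = [Replica("r1"), Replica("r2")]
primary_data = {}
confirmed = write_sync(primary_data, replicas, "user:1", "alice")

# One replica goes down: the write can no longer be confirmed.
replicas[1].healthy = False
blocked = write_sync(primary_data, replicas, "user:2", "bob")
```

Here `confirmed` is true and `blocked` is false, which is the availability cost of waiting for every replica.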

6
Q

How does asynchronous replication work?

A

The primary does not wait for replicas before responding to the client.
Trade-off: Low latency & high availability ✅, possible data loss ❌
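
A rough sketch of why asynchronous replication can lose data, assuming a single replica and an in-memory queue of unshipped changes. `AsyncPrimary` and `ship` are hypothetical names for illustration.

```python
from collections import deque


class AsyncPrimary:
    """Primary that acknowledges writes before replicating them (hypothetical)."""

    def __init__(self):
        self.data = {}
        self.pending = deque()  # changes not yet shipped to the replica

    def write(self, key, value):
        self.data[key] = value
        self.pending.append((key, value))
        return True  # acknowledged before any replica has the change

    def ship(self, replica_data):
        # Background replication step: drain the queue into the replica.
        while self.pending:
            key, value = self.pending.popleft()
            replica_data[key] = value


primary = AsyncPrimary()
replica_data = {}

primary.write("a", 1)
primary.ship(replica_data)
primary.write("b", 2)  # acknowledged to the client, but not shipped yet

# If the primary crashed now, only the replica's copy would survive.
lost_keys = set(primary.data) - set(replica_data)
```

The key `"b"` was acknowledged to the client but exists only on the primary, so a crash at this point would lose it.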

7
Q

What is primary-secondary replication?

A

One primary handles all writes, secondaries replicate data and serve reads.
Best for: Read-heavy workloads
Weakness: Primary bottleneck + write scalability limits
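
A small routing sketch of this layout, assuming round-robin reads across secondaries. The `Router` class and the node names are made up for illustration.

```python
from itertools import cycle


class Router:
    """Send writes to the primary and spread reads over secondaries."""

    def __init__(self, primary, secondaries):
        self.primary = primary
        self._readers = cycle(secondaries)  # round-robin over secondaries

    def route(self, operation):
        if operation == "write":
            return self.primary  # every write funnels through one node
        return next(self._readers)


router = Router("primary", ["secondary-1", "secondary-2"])
routes = [router.route(op) for op in ["write", "read", "read", "read", "write"]]
```

Adding secondaries widens the read path, but both writes in `routes` still land on the single primary.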

8
Q

What happens if the primary node fails?

A

A secondary is promoted via:
- Manual failover (operator decides)
- Automatic failover (leader election)
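
One possible promotion rule, sketched under the assumption that each secondary reports the last log position it has replicated. Promoting the most up-to-date node is a common heuristic, not the only election strategy; `pick_new_primary` is a hypothetical helper.

```python
def pick_new_primary(secondaries):
    """Pick the secondary with the highest replicated log position.

    secondaries: dict mapping node name -> last replicated log position.
    """
    if not secondaries:
        raise ValueError("no secondary available to promote")
    # Iterate names in sorted order so ties resolve deterministically
    # (max returns the first maximal element it encounters).
    return max(sorted(secondaries), key=lambda name: secondaries[name])


new_primary = pick_new_primary({"s1": 120, "s2": 135, "s3": 135})
```

Choosing the node with the most replicated data limits how many acknowledged writes are lost during failover.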

9
Q

What are common replication techniques?

A

Statement-based: Replicates SQL statements (simple, risky with nondeterminism)
WAL (write-ahead log) shipping: Replicates the primary's transaction log (durable, but tightly coupled to the storage engine and its version). It ships the actual changes rather than SQL, so replicas stay consistent even when statements use nondeterministic functions
Logical (row-based): Replicates row changes (flexible, portable)
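
A toy illustration of the nondeterminism risk, assuming dictionaries as tables and seeded random generators standing in for each node's own evaluation of a function like `RAND()`. All names here are hypothetical.

```python
import random


def run_statement(table, rng):
    # Stand-in for a statement like "UPDATE t SET token = RAND()":
    # each node evaluates the nondeterministic function itself.
    table["token"] = rng.random()


primary, stmt_replica, row_replica = {}, {}, {}

# Statement-based: the replica re-executes the statement with its own RNG.
run_statement(primary, random.Random(1))
run_statement(stmt_replica, random.Random(2))

# Row-based: the replica copies the value the primary actually wrote.
row_replica["token"] = primary["token"]

statement_diverged = stmt_replica["token"] != primary["token"]
row_matches = row_replica["token"] == primary["token"]
```

Re-executing the statement lets the replica drift from the primary, while copying the resulting row keeps the two identical.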

10
Q

What issues arise with async replication?

A

Lost writes if the primary crashes before replicating them
Read-after-write inconsistency (a lagging replica may not yet show a user's own write)
Mitigation: Read user-modified data from the leader
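
A sketch of that mitigation, assuming a fixed time window after each write during which the writing user's reads go to the leader. The window length and the `ReadRouter` class are assumptions for illustration; real systems may instead track replication positions.

```python
import time


class ReadRouter:
    """Route a user's reads to the leader shortly after that user writes."""

    def __init__(self, window_seconds=5.0):
        self.window_seconds = window_seconds
        self.last_write = {}  # user_id -> time of that user's latest write

    def record_write(self, user_id, now=None):
        self.last_write[user_id] = time.monotonic() if now is None else now

    def choose(self, user_id, now=None):
        now = time.monotonic() if now is None else now
        written_at = self.last_write.get(user_id)
        if written_at is not None and now - written_at < self.window_seconds:
            return "leader"  # replicas may not have this user's write yet
        return "replica"


router = ReadRouter(window_seconds=5.0)
router.record_write("u1", now=100.0)
soon_after = router.choose("u1", now=102.0)
much_later = router.choose("u1", now=106.0)
other_user = router.choose("u2", now=102.0)
```

Only the user who just wrote is sent to the leader, so other readers keep the benefit of replica reads.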

11
Q

What is multi-leader replication?

A

Multiple nodes accept writes and replicate to each other.
Pros: Writes are accepted close to the user and survive the loss of one leader; supports offline clients
Cons: Write conflicts are common
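
One common way to resolve such conflicts is last-write-wins, sketched here under the assumption that every version carries a timestamp and the ID of the leader that accepted it. This is only one strategy, and it silently discards the losing write; the names are hypothetical.

```python
def last_write_wins(a, b):
    """Each version is (timestamp, node_id, value).

    The higher timestamp wins; node_id breaks ties so that every
    leader picks the same winner regardless of arrival order.
    """
    return max(a, b, key=lambda version: (version[0], version[1]))


east = (10, "dc-east", "blue")
west = (12, "dc-west", "green")

# Both leaders converge on the same value whichever order they compare in.
winner_on_east = last_write_wins(east, west)
winner_on_west = last_write_wins(west, east)
```

Both leaders end up with `"green"`, and the write of `"blue"` is dropped without any error reaching the client.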

12
Q

How does leaderless replication maintain consistency?

A

All nodes accept reads/writes. Consistency comes from quorums:
Write to w nodes
Read from r nodes
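If w + r > n (n = number of replicas), every read set overlaps every write set, so a read reaches at least one node holding the latest acknowledged write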
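
The overlap argument can be checked by brute force for small clusters. This is a sketch with hypothetical function names: it enumerates every possible write set and read set and tests whether they always share a node.

```python
from itertools import combinations


def quorums_overlap(n, w, r):
    """The quorum condition from the card."""
    return w + r > n


def every_read_sees_latest(n, w, r):
    """Brute force: does every read set intersect every write set?"""
    nodes = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(nodes, w)
        for read_set in combinations(nodes, r)
    )


strict = every_read_sees_latest(3, 2, 2)  # 2 + 2 > 3
weak = every_read_sees_latest(3, 1, 1)    # 1 + 1 <= 3
```

With n = 3, choosing w = r = 2 always overlaps, while w = r = 1 allows a read to hit a node the write never reached.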
