Problem 1) Distributed Cache
Question 1) List down the functional requirements
Problem 1) Distributed Cache
Question 2) List down the non functional requirements
Problem 1) Distributed Cache
Question 3) List down the microservices covering all functional requirements
Problem 1) Distributed Cache
Question 4) Logical Diagram & flow of data
Diagram: https://drive.google.com/file/d/1H_oF4bNQDzG57k0cl2MPgun4gLNbv_U-/view?usp=sharing
Problem 1) Distributed Cache
Question 5) Schema Design
Concrete schema is not required here.
Data persistence :
Main memory is volatile, so we need to store the data on the
secondary memory.
● Two common techniques
- Append only file/logs
■ Data is written to disk on every write
■ Data is written to disk at a fixed time interval (1 sec)
- Snapshot
■ Snapshot of entire data is created at configurable time
Problem 1) Distributed Cache
Question 6) API Design
Problem 1) Distributed Cache
Question 7) Business logic for the problem statement
What is LRU ?
→ A combination of Doubly Linked List and a Hash Table, or a LinkedHashMap
● DLL nodes will have the key
● On access
- Move the key to the head of the DLL
● To evict
- Remove the key from the tail
- Delete from the Hash Table
● Update existing value
- Update the Hash Table
- Move the key to the head of the DLL
● Adding a new key
- Add to Hash Table
- Add the key to the head of the DLL
Problem 1) Distributed Cache
Question 8) Design Consideration for the problem statement
https://drive.google.com/file/d/1fN2-Lui7LEgOg89PfgGhHYbNsS1hpwQs/view?usp=sharing
and
https://drive.google.com/file/d/1BJTqYpEEweBbf4gl_2KuE3LBHalZhlHd/view?usp=sharing
A deterministic set of reasons can be posed across all interviews
○ Need to scale for storage (storage and cache tiers) - Yes
○ Need to scale for throughput (CPU/IO) - Yes
○ Need to scale for API parallelization - No
○ Need to remove hotspots - Maybe
○ Availability and Geo-distribution - Yes
Problem 2) How to Scale
Question 1) Replication
■ Once write is done by primary, when do you send ack back to the user would
change based on this.
■ Additional complexity of 2 or 3 phase commit or paxos for strong
consistency.
Problem 2) How to Scale
Question 2) Sharding
Distribute data across multiple cache servers