Database and System Design Flashcards

Question 1

Q

Request - Response

Answer

A

Client sends a request
Server parses the request
Server processes the request
Server sends a response
Client parses the response and consumes it

Where is it used?
- Web, HTTP, DNS, SSH
- RPC (remote procedure call)
- SQL and Database Protocols
- APIs (REST/SOAP/GraphQL)

Anatomy of a Request / Response
- Request structure is defined by both client and server
- Request has a boundary
- Defined by a protocol and message format
- Same for the response
- E.g. an HTTP request
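
A minimal sketch of what "a request has a boundary, defined by a protocol and message format" means for HTTP/1.1: a blank line ends the header block, and Content-Length says where the body ends. The function name `parse_http_request` and the sample request are assumptions made for illustration, and only the simple Content-Length case is handled.

```python
def parse_http_request(raw: bytes) -> dict:
    """Split a raw HTTP/1.1 request into request line, headers, and body."""
    # The blank line (CRLF CRLF) marks the end of the header block.
    head, _, rest = raw.partition(b"\r\n\r\n")
    lines = head.decode("ascii").split("\r\n")
    method, path, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    # Content-Length marks where the body (and so the request) ends.
    length = int(headers.get("content-length", "0"))
    return {
        "method": method,
        "path": path,
        "version": version,
        "headers": headers,
        "body": rest[:length],
    }


raw = (
    b"POST /images HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
)
request = parse_http_request(raw)
print(request["method"], request["path"], request["body"])
```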

Building an upload image service with request response
- Send large request with the image (simple)
- Chunk image and send a request per chunk (resumable)
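
The chunked (resumable) option can be sketched as follows. `UploadServer` is a hypothetical in-memory stand-in for a real server, and the chunk size is arbitrary; the point is that after a dropped connection the client resends only the missing chunks.

```python
def split_into_chunks(data: bytes, chunk_size: int) -> list[bytes]:
    """Cut the payload into fixed-size pieces (the last one may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


class UploadServer:
    """In-memory stand-in for a server that accepts one request per chunk."""

    def __init__(self) -> None:
        self.received: dict[int, bytes] = {}

    def put_chunk(self, index: int, chunk: bytes) -> None:
        self.received[index] = chunk

    def missing(self, total: int) -> list[int]:
        return [i for i in range(total) if i not in self.received]

    def assemble(self, total: int) -> bytes:
        return b"".join(self.received[i] for i in range(total))


image = bytes(range(256)) * 4            # 1024 bytes standing in for an image
chunks = split_into_chunks(image, 300)   # 4 chunks: 300, 300, 300, 124 bytes
server = UploadServer()

# First attempt: the connection "drops" after two chunks.
for i in (0, 1):
    server.put_chunk(i, chunks[i])

# Resume: ask which chunks are missing and send only those.
for i in server.missing(len(chunks)):
    server.put_chunk(i, chunks[i])

restored = server.assemble(len(chunks))
```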

Doesn’t work everywhere
- Notification Service
- Chatting Application
- The client would have to keep sending requests to ask whether the other side has replied, which adds latency and spams the network
- Empty requests hit the server and congest the network
- Very long running requests
- If a request takes a long time to process, the client spends all that time waiting (the solution is asynchronous processing)
- What if the client disconnects?

Question 2

Q

Synchronous vs Asynchronous

Answer

A

Can I do work while waiting?
Asynchronous means not being in sync

Synchronous I/O (Old way)
- Caller sends a request and blocks
- Caller cannot execute any code meanwhile
- Receiver responds, caller unblocks
- Caller and Receiver are in “sync”

Example of OS synchronous I/O
- Program asks OS to read from disk
- Program main thread is taken off the CPU
- Read completes, program can resume execution

Asynchronous I/O
- Caller sends a request
- Caller can work until it gets a response
- Caller either:
- Checks whether the response is ready (epoll)
- Gets called back by the receiver when it’s done (io_uring)
- Spins up a new thread that blocks
- Caller and receiver are not necessarily in sync

Example of an OS asynchronous call (Node.js)
- Program spins up a secondary thread
- Secondary thread reads from disk, OS blocks it
- Main program keeps running and executing code
- Thread finishes reading and calls back the main thread
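
The secondary-thread pattern above can be sketched in Python (not Node.js) with the standard `concurrent.futures` module. `slow_read` is a hypothetical stand-in for a blocking disk read. One difference from the Node.js description: here the callback runs on the worker thread, not on the main thread.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_read() -> str:
    """Stand-in for a blocking disk read."""
    time.sleep(0.1)
    return "file contents"


# Synchronous: the caller blocks until the read completes.
data = slow_read()

# Asynchronous: a secondary thread blocks instead of the caller.
results = []
with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(slow_read)
    # The callback fires (on the worker thread) once the read is done.
    future.add_done_callback(lambda f: results.append(f.result()))
    # Meanwhile the caller keeps executing code.
    work_done = sum(range(1000))
# Leaving the `with` block waits for the worker to finish.
```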

Synchronous vs Asynchronous in Request Response
- Synchronicity is a client property
- Most modern client libraries are asynchronous
- E.g. a client sends an HTTP request and does other work while waiting

Synchronous vs Asynchronous in real life
- In synchronous communication, the caller waits for a response from receiver
- Asking someone a question in a meeting
- In asynchronous communication, the response can come at any time. Caller and receiver can do anything meanwhile
- Email

Asynchronous workload is everywhere
- Asynchronous programming (promises)
- Asynchronous backend processing
- Asynchronous commits in Postgres
- Asynchronous I/O in Linux (epoll, io_uring)
- Asynchronous replication
- Asynchronous OS sync (fs cache)

Question 3

Q

Push

Answer

A

Want data as soon as possible

Request/response isn’t always ideal
- Client wants real time notification from backend
- A user just logged in
- A message was just received
- Push model is good for certain cases

What is Push?
- Client connects to server
- Server sends data to client
- Client doesn’t have to request anything
- Protocol must be bidirectional
- Used by RabbitMQ
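
A minimal in-memory sketch of the push model, with hypothetical `PushServer` and `Client` classes standing in for a real bidirectional connection. The server decides when to send, and a client that is not connected misses the message.

```python
class Client:
    def __init__(self, name: str) -> None:
        self.name = name
        self.inbox: list[str] = []

    def on_message(self, message: str) -> None:
        self.inbox.append(message)


class PushServer:
    def __init__(self) -> None:
        self.connected: list[Client] = []

    def connect(self, client: Client) -> None:
        self.connected.append(client)

    def disconnect(self, client: Client) -> None:
        self.connected.remove(client)

    def push(self, message: str) -> None:
        # The server sends without being asked; clients never request anything.
        for client in self.connected:
            client.on_message(message)


server = PushServer()
alice, bob = Client("alice"), Client("bob")
server.connect(alice)
server.connect(bob)
server.push("a user just logged in")
server.disconnect(bob)                      # bob goes offline
server.push("a message was just received")  # only alice gets this one
```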

Push Pros and Cons
- Pros
- Real Time (pushing the result, get results real time)
- Cons
- Clients must be online (physically connected to server)
- Clients might not be able to handle the load
- The server may push a lot of data without knowing whether the client can keep up
- Requires a bidirectional protocol
- Polling is preferred for light clients

Question 4

Q

Short Polling

Answer

A

Request is taking a while, I’ll check with you later
Each poll is quick

Where request/response isn’t ideal
- A request takes a long time to process
- Upload a YouTube video
- Backend wants to send notification
- User just logged in
- Polling is a good communication style for these cases

What is short polling?
- Client sends a request
- Server responds immediately with a handle (e.g. an ID)
- Server continues to process the request
- Client uses that handle to check for status
- Multiple “short” request/response exchanges act as polls
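
The short polling steps above can be sketched with a hypothetical in-memory `JobServer`. The `process` method stands in for background work, and uppercasing the payload is a placeholder for the real processing.

```python
import uuid


class JobServer:
    def __init__(self) -> None:
        self.jobs: dict[str, dict] = {}

    def submit(self, payload: str) -> str:
        """Accept the request and respond immediately with a handle."""
        job_id = str(uuid.uuid4())
        self.jobs[job_id] = {"status": "pending", "payload": payload, "result": None}
        return job_id

    def process(self, job_id: str) -> None:
        """Stand-in for the server finishing the work in the background."""
        job = self.jobs[job_id]
        job["result"] = job["payload"].upper()
        job["status"] = "done"

    def status(self, job_id: str) -> dict:
        """One short poll: answers right away, whether or not the job is done."""
        job = self.jobs[job_id]
        return {"status": job["status"], "result": job["result"]}


server = JobServer()
handle = server.submit("my video")
first_poll = server.status(handle)    # still pending
server.process(handle)
second_poll = server.status(handle)   # now done
```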

Short Polling Pros and Cons
- Pros
- Simple
- Good for long running requests
- Client can disconnect
- Cons
- Too Chatty
- As you scale to many users, the volume of requests grows and congests the network
- Network Bandwidth
- On cloud-hosted servers, the extra traffic can mean higher charges
- Wasted backend resources
- Every poll consumes server resources, even when nothing has changed

Question 5

Q

Long Polling

Answer

A

Request is taking long, I’ll check with you later, but talk to me only when it’s ready
Avoids chattiness and network congestion of short polling

Where request/response & polling isn’t ideal
- Request takes a long time to process
- Upload a YouTube video
- Backend wants to send notification
- User just logged in
- Short polling is good but chatty
- Meet long polling (Kafka uses it)

Long Polling
- Client sends a request
- Server responds immediately with a handle
- Server continues to process the request
- Client uses that handle to check for status
- Server does not reply until it has the response
- Since we have a handle, we can disconnect, and the exchange is less chatty
- Some variations have a timeout too
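
A minimal sketch of long polling with a timeout, using `threading.Event` to hold a poll open until the result is ready. `LongPollServer` and its method names are assumptions for illustration. A real server would hold an open network request and would not block a thread per client this way.

```python
import threading
import uuid


class LongPollServer:
    def __init__(self) -> None:
        self._jobs: dict[str, dict] = {}

    def submit(self) -> str:
        """Respond immediately with a handle."""
        job_id = str(uuid.uuid4())
        self._jobs[job_id] = {"ready": threading.Event(), "result": None}
        return job_id

    def finish(self, job_id: str, result: str) -> None:
        job = self._jobs[job_id]
        job["result"] = result
        job["ready"].set()

    def poll(self, job_id: str, timeout: float) -> dict:
        """Do not reply until the result is ready or the timeout expires."""
        job = self._jobs[job_id]
        if job["ready"].wait(timeout):
            return {"status": "done", "result": job["result"]}
        return {"status": "pending", "result": None}


server = LongPollServer()
handle = server.submit()
first = server.poll(handle, timeout=0.05)   # nothing ready: times out
threading.Timer(0.05, server.finish, args=[handle, "video processed"]).start()
second = server.poll(handle, timeout=2.0)   # returns once finish() runs
```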

Long Polling Pros and Cons
- Pros
- Less chatty and backend friendly
- Client can still disconnect
- Cons
- Not real time

Question 6

Q

Server Sent Events

Answer

A

One request, a very long response
A never-ending response that streams chunks of data

Limitations of Request/Response
- Vanilla request/response isn’t ideal for a notification backend
- Client wants real time notification from backend
- User just logged in
- Message just received
- Push works but is restrictive
- Server Sent Events work with request/response
- Designed for HTTP

What is Server Sent Events?
- A response has a start and an end
- Client sends a request
- Server sends logical events as part of response
- Server never writes the end of the response
- It is still a request but an unending response
- Client parses stream data looking for events
- Works with request/response (HTTP)
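
A simplified sketch of the event stream: the server writes `event:` and `data:` lines and ends each event with a blank line, and the client scans the incoming chunks for complete events. The helper names `format_event` and `parse_events` are assumptions. This sketch ignores the other fields of the format and keeps only the last `data:` line of an event.

```python
from typing import Iterator


def format_event(data: str, event: str = "message") -> str:
    """Serialize one event; the blank line marks the end of the event."""
    return f"event: {event}\ndata: {data}\n\n"


def parse_events(stream: Iterator[str]) -> Iterator[dict]:
    """Yield events from an iterator of text chunks as they become complete."""
    buffer = ""
    for chunk in stream:
        buffer += chunk
        while "\n\n" in buffer:
            raw, buffer = buffer.split("\n\n", 1)
            event = {"event": "message", "data": ""}
            for line in raw.split("\n"):
                field, _, value = line.partition(":")
                if field in event:
                    # Drop the single optional space after the colon.
                    event[field] = value[1:] if value.startswith(" ") else value
            yield event


wire = format_event("user logged in", "login") + format_event("hello")
# Chunks can split an event anywhere; the parser buffers until it is complete.
chunks = [wire[:10], wire[10:]]
events = list(parse_events(iter(chunks)))
```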

Server Sent Events Pros and Cons
- Pros
- Real Time
- Compatible with Request/response
- Cons
- Clients must be online
- Clients might not be able to handle the load
- Polling is preferred for light clients
- HTTP/1.1 problem (browsers allow only about 6 connections per domain)

Question 7

Q

Publish Subscribe (PubSub)

Answer

A

One publisher, many readers
Designed to solve the problem where lots of services need to talk to each other
- Avoids a mesh architecture where clients speak directly to other clients
All of them publish content to a server and let the server decide who receives it
E.g. RabbitMQ

Request/Response pros and cons
- Pros
- Elegant and simple
- Scalable
- Cons
- Bad for multiple receivers
- High coupling
- Client/server have to be running
- Chaining, circuit breaking

Pub/sub pros and cons
- Pros
- Scales w/ multiple receivers
- Great for micro services
- Loose coupling
- Works while clients not running
- Cons
- Message delivery issues (Two Generals’ Problem)
- Complexity
- Network Saturation
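
A minimal in-memory sketch of the pub/sub idea: the publisher only knows a topic name, and the broker fans each message out to every subscriber. `Broker` and the topic name are hypothetical. Unlike a real broker such as RabbitMQ, this sketch does not queue messages for subscribers that are not running.

```python
from typing import Callable


class Broker:
    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[str], None]]] = {}

    def subscribe(self, topic: str, handler: Callable[[str], None]) -> None:
        self._subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic: str, message: str) -> int:
        """Deliver to every subscriber of the topic; return how many received it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


broker = Broker()
compressed, notified = [], []
# Two independent services subscribe; the publisher knows neither of them.
broker.subscribe("video.uploaded", compressed.append)
broker.subscribe("video.uploaded", notified.append)

delivered = broker.publish("video.uploaded", "video-42")
ignored = broker.publish("video.deleted", "video-7")   # no subscribers
```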

Question 8

Q

GraphQL

Answer

A

GraphQL: Allows clients to specify exactly what data they want to receive, avoiding over-fetching and under-fetching. Choose this when you have diverse clients with different data needs.
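
The idea of a client naming exactly the fields it wants can be illustrated with a plain Python projection. This is not GraphQL syntax and uses no GraphQL library; `select_fields` and the sample record are assumptions made only to show the effect.

```python
def select_fields(record: dict, selection: dict) -> dict:
    """Return only the fields named in `selection`, recursing into nested dicts."""
    result = {}
    for field, sub_selection in selection.items():
        value = record[field]
        # An empty sub-selection means "take this field as-is".
        result[field] = select_fields(value, sub_selection) if sub_selection else value
    return result


user = {
    "id": 1,
    "name": "Ada",
    "email": "ada@example.com",
    "profile": {"bio": "Engineer", "avatar_url": "https://example.com/a.png"},
}

# A mobile client asks for just the name and avatar; a web client asks for more.
mobile_view = select_fields(user, {"name": {}, "profile": {"avatar_url": {}}})
web_view = select_fields(user, {"name": {}, "email": {}, "profile": {}})
```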

Question 9

Q

REST

Answer

A

Uses HTTP verbs (GET, POST, PUT, DELETE) to perform CRUD operations on resources. This should be your default choice for most interviews.

Question 10

Q

RPC (Remote Procedure Call)

Answer

A

Action-oriented protocol (like gRPC) that is often faster than REST for service-to-service communication, in part because gRPC uses compact binary serialization (Protocol Buffers) over HTTP/2. Use for internal APIs when performance is critical.

Don’t overthink this. Default to REST unless you have a specific reason not to. For real-time features, you’ll also need WebSockets or Server-Sent Events, but design your core API first.

For Twitter, we would choose REST and design our endpoints using our core entities as resources. Resources should be plural nouns that represent things in your system:
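
For instance, a hypothetical set of routes could be sketched as below. The paths and handler names are assumptions for illustration, not Twitter's actual API. Each route pairs an HTTP verb with a plural-noun resource.

```python
import re

# (HTTP verb, path template, handler name); all hypothetical.
ROUTES = [
    ("POST", "/tweets", "create_tweet"),
    ("GET", "/tweets/{tweet_id}", "get_tweet"),
    ("DELETE", "/tweets/{tweet_id}", "delete_tweet"),
    ("GET", "/users/{user_id}/tweets", "list_user_tweets"),
]


def match(method: str, path: str):
    """Return (handler name, path parameters) for a request, or None."""
    for route_method, template, handler in ROUTES:
        # Turn "/tweets/{tweet_id}" into a regex with a named group.
        pattern = "^" + re.sub(r"\{(\w+)\}", r"(?P<\1>[^/]+)", template) + "$"
        found = re.match(pattern, path)
        if route_method == method and found:
            return handler, found.groupdict()
    return None


print(match("GET", "/tweets/42"))        # ('get_tweet', {'tweet_id': '42'})
print(match("GET", "/users/7/tweets"))   # ('list_user_tweets', {'user_id': '7'})
```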

Question 11

Q

Blob Storage

Answer

A

Sometimes you’ll need to store large, unstructured blobs of data. This could be images, videos, or other files. Storing these large blobs in a traditional database is both expensive and inefficient and should be avoided when possible. Instead, you should use a blob storage service like Amazon S3 or Google Cloud Storage. These platforms are specifically designed for handling large blobs of data, and are much more cost-effective than a traditional database.
Blob storage services are simple. You upload a blob of data, the service stores it, and you get back a URL. You can then use this URL to download the blob. Blob storage services often work in conjunction with CDNs, so you can get fast downloads from anywhere in the world: upload a file/blob to blob storage, which acts as your origin, and then use a CDN to cache it in edge locations around the world.

Question 12

Q

Search Optimized Database

Answer

A

Search optimized databases are specifically designed to handle full-text search. They use techniques like indexing, tokenization, and stemming to make search queries fast and efficient. In short, they work by building what are called inverted indexes. An inverted index is a data structure that maps words to the documents that contain them. This allows you to quickly find documents that contain a given word. A simple example of an inverted index might look like this:
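
A minimal Python sketch of such an index, with assumed sample documents and helper names (`tokenize`, `build_inverted_index`, `search`). It lowercases words and strips punctuation but does no stemming.

```python
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Lowercase and strip basic punctuation; a stand-in for real tokenization."""
    words = (word.strip(".,!?").lower() for word in text.split())
    return [word for word in words if word]


def build_inverted_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Map each word to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return dict(index)


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return the documents that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


docs = {
    "doc1": "The quick brown fox",
    "doc2": "The lazy dog",
    "doc3": "Quick dog tricks",
}
index = build_inverted_index(docs)
# index["quick"] is {"doc1", "doc3"} and index["dog"] is {"doc2", "doc3"}
hits = search(index, "quick dog")
```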

Question 13

Q

API Gateway

Answer

A

Especially in a microservice architecture, an API gateway sits in front of your system and is responsible for routing incoming requests to the appropriate backend service. For example, if the system receives a request to GET /users/123, the API gateway would route that request to the users service and return the response to the client. The gateway is also typically responsible for handling cross-cutting concerns like authentication, rate limiting, and logging.
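
A toy sketch of those responsibilities in one place: log, authenticate, rate limit, then route by path prefix. The `Gateway` class, the service names, and the token check are assumptions for illustration. A real gateway would forward the request to the backend and return its response.

```python
class Gateway:
    """Toy gateway: log, authenticate, rate limit, then route by path prefix."""

    def __init__(self, routes: dict[str, str], valid_tokens: set[str], limit: int) -> None:
        self.routes = routes              # path prefix -> backend service name
        self.valid_tokens = valid_tokens
        self.limit = limit                # max requests allowed per token
        self.counts: dict[str, int] = {}
        self.log: list[str] = []

    def handle(self, method: str, path: str, token: str) -> tuple[int, str]:
        self.log.append(f"{method} {path}")                  # logging
        if token not in self.valid_tokens:                    # authentication
            return 401, "unauthorized"
        self.counts[token] = self.counts.get(token, 0) + 1
        if self.counts[token] > self.limit:                   # rate limiting
            return 429, "too many requests"
        for prefix, service in self.routes.items():           # routing
            if path == prefix or path.startswith(prefix + "/"):
                return 200, f"routed to {service}"
        return 404, "no such route"


routes = {"/users": "users-service", "/tweets": "tweets-service"}
gateway = Gateway(routes, valid_tokens={"secret"}, limit=2)
print(gateway.handle("GET", "/users/123", "secret"))   # (200, 'routed to users-service')
print(gateway.handle("GET", "/users/123", "wrong"))    # (401, 'unauthorized')
```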

Question 14

Q

CDN

Answer

A

A content delivery network (CDN) is a type of cache that uses distributed servers to deliver content to users based on their geographic location. CDNs are often used to deliver static content like images, videos, and HTML files, but they can also be used to deliver dynamic content like API responses.
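
The caching behaviour can be sketched with a hypothetical `Origin` (e.g. blob storage) and an `EdgeCache` standing in for one edge location. The first request for a file is a miss that goes to the origin; later requests are served from the edge. Expiry and invalidation are left out.

```python
class Origin:
    """Stand-in for the origin server or blob store."""

    def __init__(self, files: dict[str, bytes]) -> None:
        self.files = files
        self.fetches = 0

    def get(self, path: str) -> bytes:
        self.fetches += 1
        return self.files[path]


class EdgeCache:
    """Stand-in for one CDN edge location."""

    def __init__(self, origin: Origin) -> None:
        self.origin = origin
        self.cache: dict[str, bytes] = {}

    def get(self, path: str) -> tuple[bytes, str]:
        if path in self.cache:
            return self.cache[path], "HIT"
        content = self.origin.get(path)   # slow trip to the origin
        self.cache[path] = content
        return content, "MISS"


origin = Origin({"/logo.png": b"image-bytes"})
edge = EdgeCache(origin)
first = edge.get("/logo.png")    # fetched from the origin
second = edge.get("/logo.png")   # served from the edge cache
```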

(14 cards)