If you are designing Bit.ly or a similar URL shortening service – what are the 2-3 primary requirements you should target?
Users should:
- Provide a URL and receive a shortened code that corresponds to the URL
  - Aliases (optional)
  - Expiration (optional)
- Provide a shortened code and be redirected to the original URL
- Short codes should be unique (obviously)
- Short codes should be as short as possible (the minimum length depends on scale)
For a Bit.ly-like system, what might be some common non-functional requirements?
What core entities would exist in a Bit.ly system?
The first two can be combined from a data perspective:
In fleshing out the APIs and interfaces for a Bit.ly-like service, what might be some common endpoints for your REST API?
// Request to create the new URL/short code
POST /urls/ : string (200)
// Get request to serve redirect of original URL
GET /urls/<code> (302, 301 optionally)
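To make the two endpoints concrete, here is a minimal in-memory sketch of the behaviour behind them. It is not tied to any web framework. The names `UrlService`, `UrlRecord`, `create`, and `redirect` are hypothetical, the 404/409/410 responses are assumptions that the notes above do not specify, and the numeric code generator is only a placeholder.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional, Tuple


@dataclass
class UrlRecord:
    original_url: str
    expires_at: Optional[datetime] = None  # None means the link never expires


class UrlService:
    """In-memory stand-in for the two endpoints; not a real web server."""

    def __init__(self) -> None:
        self._records: Dict[str, UrlRecord] = {}
        self._counter = 0

    def _next_code(self) -> str:
        # Placeholder generator: skip any value already taken by a custom alias.
        while True:
            self._counter += 1
            code = str(self._counter)
            if code not in self._records:
                return code

    def create(
        self,
        original_url: str,
        alias: Optional[str] = None,
        expires_at: Optional[datetime] = None,
    ) -> Tuple[int, Optional[str]]:
        """POST /urls/ : returns (status, short code)."""
        code = alias if alias is not None else self._next_code()
        if code in self._records:
            return 409, None  # assumption: report a conflict for a taken alias
        self._records[code] = UrlRecord(original_url, expires_at)
        return 200, code

    def redirect(
        self, code: str, now: Optional[datetime] = None
    ) -> Tuple[int, Optional[str]]:
        """GET /urls/<code> : returns (status, Location header value)."""
        record = self._records.get(code)
        if record is None:
            return 404, None  # assumption: unknown codes are "not found"
        now = now if now is not None else datetime.now(timezone.utc)
        if record.expires_at is not None and record.expires_at <= now:
            return 410, None  # assumption: expired links are "gone"
        return 302, record.original_url
```

Returning 302 keeps every redirect flowing through the service (useful for analytics or expiring links), whereas a 301 lets browsers cache the mapping and skip the service on repeat visits.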
What’s the crux of the Bit.ly system design problem?
Key/short code generation
What are two strong options for solving Bit.ly’s key generation problem?
How might you implement a counter-based key generation approach for a service like Bit.ly?
There are a few options. You could use something like a distributed cache (e.g., Redis) to maintain the counter, or simply have each server claim a range of available identifiers from a database on startup, hand them out locally, and request a new range when nearing capacity.
If you needed a distributed configuration model, you could also consider something more heavyweight like Apache ZooKeeper.
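The range-allocation idea can be sketched as follows. This is a minimal illustration under assumptions: `RangeAllocator` is a hypothetical in-process stand-in for the database (or Redis/ZooKeeper) that hands out ranges, `ShortCodeGenerator` represents one application server, and Base62 encoding of the counter is one common way to turn the number into a short code, not something the notes above prescribe. For simplicity the generator fetches a new range when its current one is exhausted rather than when it is merely nearing capacity.

```python
import string
import threading

# 0-9, a-z, A-Z: 62 URL-safe characters.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def encode_base62(n: int) -> str:
    """Encode a non-negative counter value as a Base62 string."""
    if n < 0:
        raise ValueError("counter values must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, rem = divmod(n, 62)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))


class RangeAllocator:
    """Hypothetical stand-in for the shared store that hands out ID ranges."""

    def __init__(self, block_size: int = 1000) -> None:
        self._next_start = 0
        self._block_size = block_size
        self._lock = threading.Lock()

    def allocate(self) -> range:
        # In a real system this would be an atomic increment in the shared store.
        with self._lock:
            start = self._next_start
            self._next_start += self._block_size
        return range(start, start + self._block_size)


class ShortCodeGenerator:
    """One application server: consumes its range locally, then asks for more."""

    def __init__(self, allocator: RangeAllocator) -> None:
        self._allocator = allocator
        self._ids = iter(())  # empty until the first range is claimed
        self._lock = threading.Lock()

    def next_code(self) -> str:
        with self._lock:
            try:
                n = next(self._ids)
            except StopIteration:
                # Current range exhausted: claim a fresh, non-overlapping one.
                self._ids = iter(self._allocator.allocate())
                n = next(self._ids)
        return encode_base62(n)
```

Because each server owns a disjoint range, servers never need to coordinate per request, and uniqueness follows from the allocator alone. The trade-off is that a server that crashes wastes the unused remainder of its range.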
If you had to design a system like Dropbox – what might the core requirements of such a system be?
Users can:
- Upload files to our system
- Download/view files from our system
- Permissions (optional)
  - Sharing a file with another user?
- Sync files between local devices and the server
For a Dropbox-like system, what might be some important non-functional requirements?
What core entities would be present in a Dropbox-style system?
The entities would probably be summed up as:
In our Dropbox system design example, what types of REST-based endpoints would we have (initially)?
Obviously these would all contain additional payloads related to authentication/authorization via headers, session tokens, etc.
// Upload a file
POST /files/
{
file: byte[],
metadata: { … }
}
// Download a file
GET /files/<fileId> : byte[] (more likely a redirect)
// Share a file
POST /files/<fileId>/share
{
users: string[]
}
// Detect remote changes of a file
GET /files/<fileId>/changes
In a Dropbox-like system, how would you go about storing the files in terms of technology choices?
File contents and file metadata should be decoupled: store the actual bytes in some form of blob storage (e.g., GCS, S3) and store the metadata in a database (a non-relational one is probably a fine choice).
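A minimal sketch of that split, using plain dictionaries as stand-ins for the blob store and the metadata database. All names here (`FileMetadata`, `blob_store`, `metadata_db`, `save_file`, `load_file`) and the `files/<id>` key layout are hypothetical and chosen only for illustration.

```python
import uuid
from dataclasses import dataclass
from typing import Dict


@dataclass
class FileMetadata:
    file_id: str
    name: str
    size_bytes: int
    blob_key: str  # pointer to where the bytes live in blob storage


blob_store: Dict[str, bytes] = {}          # stand-in for S3/GCS: blob_key -> bytes
metadata_db: Dict[str, FileMetadata] = {}  # stand-in for the database: file_id -> record


def save_file(name: str, content: bytes) -> FileMetadata:
    """Write the bytes to blob storage and record only a pointer in the database."""
    file_id = uuid.uuid4().hex
    blob_key = f"files/{file_id}"
    blob_store[blob_key] = content
    meta = FileMetadata(file_id, name, len(content), blob_key)
    metadata_db[file_id] = meta
    return meta


def load_file(file_id: str) -> bytes:
    """Look up the metadata first, then follow its pointer to the bytes."""
    meta = metadata_db[file_id]
    return blob_store[meta.blob_key]
```

Listing, searching, or sharing files then only touches the small metadata records, while the large payloads stay in storage built for them.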
When uploading a file in our Dropbox-style system – what are a few alternative approaches and options?
In the Dropbox upload process involving a presigned URL, how might this affect the flow during an upload?
Instead of sending the file through our servers, the client would first send the file’s metadata and receive a presigned URL in return. The client could then upload the file directly (and efficiently) to blob storage using that URL.
Once the upload was complete, we could leverage an event-driven process (usually available from blob storage) to get a signal that the upload had finished and update the database. If such signals weren’t available, we could fall back on some type of queue, or on polling from the client, to verify that the file was uploaded.
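The flow can be sketched end to end with a simulated presigned URL. This is an illustration under assumptions, not how any particular cloud provider signs URLs: the HMAC scheme, the `blobs.example.com` host, the `SECRET` key, the `uploading`/`uploaded` status values, and all function names are hypothetical. In practice the blob storage SDK would generate and verify the URL for you.

```python
import hashlib
import hmac
import time
from typing import Dict, Optional
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"demo-secret"  # hypothetical signing key held by the storage service
metadata_db: Dict[str, dict] = {}  # stand-in for the metadata database


def _sign(blob_key: str, expires: int) -> str:
    message = f"PUT:{blob_key}:{expires}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def presign_upload_url(blob_key: str, expires_in: int = 900,
                       now: Optional[float] = None) -> str:
    """Build a time-limited URL that authorizes one upload to blob_key."""
    current = now if now is not None else time.time()
    expires = int(current + expires_in)
    query = urlencode({"expires": expires, "signature": _sign(blob_key, expires)})
    return f"https://blobs.example.com/{blob_key}?{query}"


def verify_upload_url(url: str, now: Optional[float] = None) -> bool:
    """What the blob store would check before accepting the upload."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    blob_key = parsed.path.lstrip("/")
    expires = int(params["expires"][0])
    current = now if now is not None else time.time()
    valid = hmac.compare_digest(_sign(blob_key, expires), params["signature"][0])
    return valid and current < expires


def request_upload(file_id: str, name: str, now: Optional[float] = None) -> str:
    """Step 1: client sends metadata; we record it and return a presigned URL."""
    blob_key = f"files/{file_id}"
    metadata_db[file_id] = {"name": name, "blob_key": blob_key, "status": "uploading"}
    return presign_upload_url(blob_key, now=now)


def on_blob_created(blob_key: str) -> None:
    """Step 3: storage event tells us the upload finished; update the database."""
    for record in metadata_db.values():
        if record["blob_key"] == blob_key:
            record["status"] = "uploaded"
```

Step 2, the upload itself, happens between the client and blob storage and never touches our servers, which is the point of the design. The record stays in the `uploading` state until the completion signal arrives, so abandoned uploads can be detected and cleaned up later.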