TinyURL System Design
Never dive directly into the design phase; it may raise red flags during the interview!
Interviewer: Design a URL shortener application such as TinyURL.
1. Feature Expectations [5 mins]
You: Before we start, could you clarify the expected scale of users and usage patterns? Understanding this will guide us in determining capacity and performance requirements.
Interviewer: We anticipate millions of daily URL shortening requests with scalability for future growth. Security is crucial to prevent misuse and ensure privacy.
You: Thanks for clarifying. I'll now outline the TinyURL system design, starting with the functional requirements.
Primary Functional Requirements
- URL Shortening
- URL Redirecting
Secondary Functional Requirements
- Custom Short URLs
- Expiration Date
TinyURL is for anyone needing to share or manage URLs effectively and developers who require API access for integrating URL shortening into applications.
To keep the discussion focused and manageable, I will not cover features such as URL Content Validation, Advanced Security Features, Comprehensive User Management, or Bulk URL Shortening, as these could be considered secondary and would require additional time to discuss comprehensively. If time permits, we can discuss Custom Short URLs and Expiration Dates later.
Limit the number of features discussed to one or two, as covering more can be time-consuming and may detract from explaining the most critical aspects of the design.
2. Estimations [5 mins]
Estimating the design for TinyURL involves assessing the scale of expected user traffic and storage requirements, while also considering the complexity of implementing features like custom short URLs and expiration dates. Detailed estimation should factor in database capacity, server resources, and potential growth scenarios to ensure robust performance and scalability.
- Assume the service needs to handle 100 million URLs per day.
- Write rate: 100 million URLs per day / 24 hours / 3600 seconds ≈ 1157 writes per second.
- Read rate, assuming a 10:1 read-to-write ratio: 1157 writes per second * 10 = 11570 reads per second.
- Total URLs over 10 years: 100 million URLs/day * 365 days/year * 10 years = 365 billion URLs.
- Storage needed, assuming 100 bytes per URL: 365 billion URLs * 100 bytes/URL = 36.5 TB.
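The arithmetic above can be checked with a few lines of Python. This is only a sketch using the figures assumed in this section; the variable names are chosen here for illustration.

```python
SECONDS_PER_DAY = 24 * 3600

urls_per_day = 100_000_000   # assumed daily write volume
read_write_ratio = 10        # assumed 10:1 read-to-write ratio
bytes_per_url = 100          # assumed average record size
retention_years = 10         # assumed retention period

# Round the write rate first so the read rate matches the figures above.
writes_per_second = round(urls_per_day / SECONDS_PER_DAY)
reads_per_second = writes_per_second * read_write_ratio
total_urls = urls_per_day * 365 * retention_years
storage_tb = total_urls * bytes_per_url / 1e12   # decimal terabytes

print(f"writes/s:   {writes_per_second}")
print(f"reads/s:    {reads_per_second}")
print(f"total URLs: {total_urls:,}")
print(f"storage:    {storage_tb:.1f} TB")
```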
Clear estimations demonstrate planning and analytical skills crucial for system scalability and performance assessment.
3. Design Goals [5 mins]
To manage these estimations effectively, the design goals for TinyURL emphasize robust and reliable URL shortening with a focus on simplicity and user-friendliness. The system aims to generate short URLs efficiently for reliable redirection, ensuring scalability to handle high volumes of requests while maintaining responsiveness.
- High Availability
- Scalability
- Fault Tolerance
Specify latency/throughput targets and decide on consistency/availability levels based on the estimations discussed for robust system design.
4. High-Level Design [5-8 mins]
The high-level design for a URL shortening platform akin to TinyURL includes a scalable microservices architecture utilizing APIs for both read and write operations. The system will employ a NoSQL database schema optimized for fast data retrieval and storage, supported by efficient URL shortening algorithms to ensure rapid generation and redirection of shortened URLs.
I. APIs for Read/Write Scenarios
For TinyURL, APIs for read/write scenarios would include endpoints for creating new shortened URLs, retrieving original URLs from shortened versions, and managing expiration dates or custom short URLs. These APIs ensure seamless integration with client applications while maintaining data integrity and accessibility.
Get TinyURL
Endpoint | Parameters | Response |
GET /urls/{url_key} | url_key | Redirect to the original long URL associated with url_key |
Create TinyURL
Endpoint | Parameters | Response |
POST /urls | long_url, user_id, expiration_date | JSON object with short_url and metadata like creation timestamp |
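To make the two endpoints concrete, here is a framework-free sketch of their handlers. Everything beyond the endpoint names and parameters is an assumption for illustration: the in-memory `_store`, the counter used as a placeholder key source, the `https://tiny.example` base URL, and the choice of a 302 status for the redirect.

```python
import itertools
from datetime import datetime, timezone

_store = {}                     # url_key -> record; stands in for the URL database
_counter = itertools.count(1)   # placeholder key source, not a real key generator

def create_url(long_url, user_id, expiration_date=None):
    """POST /urls: store the mapping and return the JSON-style response body."""
    url_key = f"k{next(_counter)}"
    created_at = datetime.now(timezone.utc).isoformat()
    _store[url_key] = {
        "long_url": long_url,
        "user_id": user_id,
        "expiration_date": expiration_date,
        "created_at": created_at,
    }
    # Hypothetical base URL; a real deployment would use its own domain.
    return {"short_url": f"https://tiny.example/{url_key}", "created_at": created_at}

def get_url(url_key):
    """GET /urls/{url_key}: return (status, headers) for a redirect or a 404."""
    record = _store.get(url_key)
    if record is None:
        return 404, {}
    return 302, {"Location": record["long_url"]}
```

Whether to answer with a temporary (302) or permanent (301) redirect is a design choice worth raising with the interviewer, since permanent redirects may be cached by browsers and bypass the service on later visits.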
II. Database Schema
While SQL databases offer advantages such as robust ACID transactions, complex query capabilities, and structured data models, these features are not critical for a URL shortening service like TinyURL. So, we chose a NoSQL database because it provides high scalability, flexibility, and performance, essential for handling the large volumes of URL shortening requests with low latency. NoSQL databases can efficiently manage vast amounts of unstructured data and scale horizontally by adding more nodes, ensuring the service remains responsive even under high load. Additionally, the schema-less nature of NoSQL databases allows for easy modifications and rapid development, accommodating evolving requirements without costly schema migrations.
Users Table
{
"user_id": "UUID",
"username": "string",
"email": "string",
"password_hash": "string",
"created_at": "timestamp"
}
Keys Table
{
"key_id": "UUID",
"user_id": "UUID",
"hash_key": "string",
"created_at": "timestamp"
}
URLs Table
{
"url_id": "UUID",
"user_id": "UUID",
"key_id": "UUID",
"long_url": "string",
"created_at": "timestamp",
"expiration": "timestamp"
}
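The three record shapes can also be written as typed records, which makes the references between them explicit. This is a sketch using Python dataclasses; the field names come from the schema above, while the class names, the defaults, and the optional `expiration` are assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional
from uuid import UUID, uuid4

def _now():
    return datetime.now(timezone.utc)

@dataclass
class User:
    username: str
    email: str
    password_hash: str
    user_id: UUID = field(default_factory=uuid4)
    created_at: datetime = field(default_factory=_now)

@dataclass
class Key:
    user_id: UUID            # references User.user_id
    hash_key: str            # the short key that appears in the shortened URL
    key_id: UUID = field(default_factory=uuid4)
    created_at: datetime = field(default_factory=_now)

@dataclass
class Url:
    user_id: UUID            # references User.user_id
    key_id: UUID             # references Key.key_id
    long_url: str
    url_id: UUID = field(default_factory=uuid4)
    created_at: datetime = field(default_factory=_now)
    expiration: Optional[datetime] = None   # assumption: None means "never expires"
```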
Ensure clear and concise communication of design choices and their implications to demonstrate deep understanding and critical thinking.
5. Deep Dive [10-12 mins]
Let's begin the design process for the TinyURL system by outlining the core components and their interactions. We will focus on developing a scalable architecture, defining APIs for read/write operations, designing an appropriate database schema, and implementing efficient algorithms for URL shortening and redirection.
1. Users
Users send requests to create or retrieve short URLs via the Load Balancer, which distributes them across multiple App Servers.
2. Load Balancer
The Load Balancer serves as the front door to the service, intelligently distributing incoming requests across multiple app servers. This crucial component ensures no single server becomes overwhelmed, maintaining high availability and responsiveness even under heavy traffic conditions.
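As an illustration of request distribution, the simplest policy is round-robin. The sketch below assumes that policy and uses hypothetical server names; production load balancers typically add health checks and weighting on top.

```python
import itertools

class RoundRobinBalancer:
    """Hands out app servers in a fixed rotation."""

    def __init__(self, servers):
        # cycle() repeats the server list endlessly in order.
        self._cycle = itertools.cycle(list(servers))

    def pick(self):
        """Return the server that should handle the next request."""
        return next(self._cycle)

balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
print([balancer.pick() for _ in range(4)])
```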
3. API Gateway
It acts as a single point of entry for all API calls, consolidating various microservices or backend services behind a unified API interface. It performs tasks such as request routing, protocol translation, authentication, authorization, and traffic management (like rate limiting).
- POST /urls (createURL): Handles requests to shorten a new URL.
- GET /urls/{url_key} (visitURL): Routes requests to retrieve the original long URL from a short URL.
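Rate limiting at the gateway is often implemented with a token bucket. The sketch below assumes that algorithm (the design above does not prescribe one); the class name and the `rate`, `capacity`, and `clock` parameters are illustrative.

```python
import time

class TokenBucket:
    """Allows bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock              # injectable so the limiter can be tested
        self.tokens = float(capacity)   # start with a full bucket
        self.updated = clock()

    def allow(self):
        """Return True if the request may proceed, consuming one token."""
        now = self.clock()
        elapsed = now - self.updated
        # Refill for the elapsed time, never exceeding the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

In practice the gateway would keep one bucket per user or API key, so a single heavy client cannot exhaust the write capacity for everyone else.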
4. User Service
The user service manages user-related functionalities such as registration, authentication, profile management, and user-specific data access within an application or system.
5. User Database
The User Database stores essential information about registered users, including usernames, email addresses, and hashed passwords. This NoSQL database enables user management features and personalized experiences for registered users.
6. Key Generation Service
Key Generation Service is crucial for generating unique short URLs efficiently, ensuring minimal collisions and optimal use of resources. It assigns and manages short keys (URL aliases) that map to original long URLs, facilitating quick redirection without exposing the full URL to users. The service operates as follows:
- Key Generation:
  - The service generates a unique identifier (key) for each new URL. This identifier can be created using methods such as hashing or UUID generation to ensure uniqueness and avoid collisions.
- Key Verification and Assignment:
  - Once a key is generated, the service checks if the key is already present.
  - If the key is not present, it is assigned to the long URL, creating a new shortened URL.
  - If the key is already present, the service appends a preset string to the key and generates a new key. This process is repeated until a unique key is found and assigned to the long URL.
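The generate, verify, and retry loop can be sketched as follows. This is one possible reading of the steps above, not a definitive implementation: it assumes SHA-256 as the hash, base62 encoding, a 7-character key (62**7 is about 3.5 trillion, comfortably above the 365 billion URLs estimated earlier), and it appends the preset string to the hash input rather than to the key itself. The names `SALT`, `candidate_key`, and `assign_key` are hypothetical, and a plain dict stands in for the Key Database.

```python
import hashlib
import string

ALPHABET = string.digits + string.ascii_letters   # 62 characters
KEY_LENGTH = 7                                     # assumed key length
SALT = "#"                                         # hypothetical preset string

def base62(number):
    """Encode a non-negative integer using the 62-character alphabet."""
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number:
        number, remainder = divmod(number, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))

def candidate_key(text):
    """Hash the text and keep a fixed-length base62 prefix as the key."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    number = int.from_bytes(digest[:8], "big")
    return base62(number).rjust(KEY_LENGTH, ALPHABET[0])[:KEY_LENGTH]

def assign_key(long_url, key_db):
    """Generate keys until one is unused, then record and return it."""
    text = long_url
    while True:
        key = candidate_key(text)
        if key not in key_db:
            key_db[key] = long_url
            return key
        text += SALT   # collision: change the input and try again
```

With this reading, shortening the same long URL twice yields two different keys, because the second attempt collides with the first and is retried with the salted input.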
7. Key Database
The Key Database stores the short keys issued by the Key Generation Service, along with the user who owns each key and its creation time. The Key Generation Service queries it to check whether a candidate key is already in use before assigning it.
8. URL Database
The URL Database stores mappings between unique shortened URLs and their corresponding long URLs for efficient redirection.
9. Cleanup Service
The Cleanup Service periodically scans for and removes expired URLs. This reclaims storage space, keeps lookups fast, and ensures that temporary or time-sensitive links stop resolving once they expire.
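A single cleanup pass might look like the sketch below. It assumes each record carries an optional `expiration` timestamp, as in the URLs table, and uses a plain dict in place of the URL Database; the function name is hypothetical, and scheduling the pass (for example, from a periodic job) is left out.

```python
from datetime import datetime, timedelta, timezone

def remove_expired(url_db, now=None):
    """Delete every record whose expiration has passed; return how many were removed."""
    now = now or datetime.now(timezone.utc)
    # Collect keys first so the dict is not modified while iterating over it.
    expired = [
        key for key, record in url_db.items()
        if record.get("expiration") is not None and record["expiration"] <= now
    ]
    for key in expired:
        del url_db[key]
    return len(expired)
```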
High level architecture of TinyURL
6. Further Optimizations [2-5 mins]
Caching
Caching matters for a read-heavy service like TinyURL. Keeping popular short-URL mappings in memory lets most redirects be served without a database query, which lowers latency and takes load off the main servers so the service can handle more traffic. A cache can also keep answering requests for popular URLs during traffic spikes or brief backend slowdowns.
Cache systems like Redis, Memcached, AWS ElastiCache, Microsoft Azure Cache for Redis, and Google Cloud Memorystore store data using key-value pairs efficiently. They vary in features and integrations, catering to diverse caching needs from high-performance in-memory caching to managed cloud services that ensure scalability and reliability.
Data Stored in Cache
{
"key": "string",
"value": "string"
}
1. User Cache
This cache stores user-related data such as user profiles, preferences, and access tokens. It helps in quickly validating user credentials, managing session states, and personalizing user experiences. By caching user data, the system reduces the overhead of frequent database queries, improving authentication speed and overall responsiveness.
Example User Cache
[
{
"key": "user:1234",
"value": {
"user_id": "1234",
"username": "testuser",
"email": "user@example.com"
}
},
{
...
},
]
2. URL Cache
URL Cache stores mappings of short URLs to their corresponding long URLs. This cache optimizes the retrieval of long URLs for redirection purposes, minimizing the need to query the main database. It accelerates the process of resolving short URLs to their original destinations, enhancing the performance and reliability of the URL redirection service.
Example URL Cache
[
{
"key": "url:abcd1234",
"value": "https://www.example.com/some-long-url"
},
{
...
},
]
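The lookup path through the URL Cache is commonly the cache-aside pattern: check the cache, fall back to the database on a miss, then populate the cache. The sketch below assumes that pattern with least-recently-used eviction, neither of which is fixed by the design above; `UrlCache` and `resolve` are hypothetical names, and a dict stands in for the URL Database.

```python
from collections import OrderedDict

class UrlCache:
    """A small in-memory cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)   # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)   # drop the least recently used entry

def resolve(key, cache, url_db):
    """Cache-aside lookup: try the cache, then the database, then fill the cache."""
    long_url = cache.get(key)
    if long_url is None:
        long_url = url_db.get(key)
        if long_url is not None:
            cache.put(key, long_url)
    return long_url
```

A managed store such as Redis would replace `UrlCache` in practice; the control flow in `resolve` stays the same.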
High level architecture of TinyURL after optimizations
7. Data Flow [5-8 mins]
URL Shortening:
- User submits a long URL for shortening.
- Unique key is generated (e.g., by hashing) for the long URL.
- Key and corresponding long URL are stored in the database.
- Shortened URL is created using the key.
- Shortened URL is returned to the user for access.
URL Retrieval:
- User uses the shortened URL.
- Service extracts key from the shortened URL.
- Key is used to retrieve original long URL from the database.
- User is redirected to the original long URL.
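Both flows fit in a few lines when the services are collapsed into plain functions. This is a deliberately simplified sketch: the key is a truncated SHA-256 hex digest, collision handling and caching are omitted, a dict stands in for the database, and `BASE_URL`, `shorten`, and `redirect_target` are hypothetical names.

```python
import hashlib
from urllib.parse import urlparse

BASE_URL = "https://tiny.example"   # hypothetical service domain

def shorten(long_url, url_db):
    """URL Shortening: derive a key, store the mapping, return the short URL."""
    key = hashlib.sha256(long_url.encode("utf-8")).hexdigest()[:8]
    url_db[key] = long_url
    return f"{BASE_URL}/{key}"

def redirect_target(short_url, url_db):
    """URL Retrieval: extract the key from the short URL and look up the long URL."""
    key = urlparse(short_url).path.lstrip("/")
    return url_db.get(key)   # None means the key is unknown

db = {}
short = shorten("https://www.example.com/some-long-url", db)
print(short, "->", redirect_target(short, db))
```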
Don't forget to explain the end to end flow of your design!
This architecture is designed to be scalable, resilient, and efficient, ensuring that the platform can handle a high volume of user interactions and data processing with minimal latency and high availability.