Unpacking the Architecture of a Scalable Distributed URL Shortener for 100,000 RPS

Designing a URL shortener that can process over 100,000 URL-generation requests per second while guaranteeing unique short links and sub-100ms latency presents significant architectural challenges. Initial architectural explorations reveal a critical trade-off between performance and data integrity. A simple synchronous approach, where each short-URL generation performs an immediate database check for uniqueness, proves too slow: the cumulative latency of those per-request round trips makes the RPS target unreachable. Conversely, an asynchronous model, which defers database writes through message queues such as AWS SQS, can achieve high throughput and low user-facing latency. However, this model struggles to guarantee global uniqueness of short URLs: concurrent generations may produce identical identifiers before persistence and validation occur, leading to potential collisions.
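To make the latency side of that trade-off concrete, here is a minimal Go sketch of the synchronous path, assuming the AWS SDK for Go v2 and a hypothetical `urls` table keyed on `short_code` (both names are illustrative, not from the original design). The conditional write makes DynamoDB enforce uniqueness atomically, but every request now blocks on a full database round trip, which is precisely the cost the asynchronous model defers.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/dynamodb"
	"github.com/aws/aws-sdk-go-v2/service/dynamodb/types"
)

// putIfAbsent stores the mapping only when short_code is not already taken.
// ConditionExpression pushes the uniqueness check into DynamoDB itself,
// so there is no read-then-write race, but each call costs a round trip.
func putIfAbsent(ctx context.Context, db *dynamodb.Client, code, longURL string) error {
	_, err := db.PutItem(ctx, &dynamodb.PutItemInput{
		TableName: aws.String("urls"), // hypothetical table name
		Item: map[string]types.AttributeValue{
			"short_code": &types.AttributeValueMemberS{Value: code},
			"long_url":   &types.AttributeValueMemberS{Value: longURL},
		},
		ConditionExpression: aws.String("attribute_not_exists(short_code)"),
	})
	var conflict *types.ConditionalCheckFailedException
	if errors.As(err, &conflict) {
		return fmt.Errorf("collision on %q: %w", code, err)
	}
	return err
}

func main() {
	ctx := context.Background()
	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		log.Fatal(err)
	}
	db := dynamodb.NewFromConfig(cfg)
	if err := putIfAbsent(ctx, db, "aZ3k9Qx", "https://example.com/some/long/path"); err != nil {
		log.Fatal(err)
	}
	fmt.Println("stored")
}
```

At 100,000 RPS, even a few milliseconds per conditional write dominates the latency budget, which is what pushes the design toward batched, asynchronous persistence despite the collision risk.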

The robust solution integrates parallel processing, distributed coordination, and advanced database features to reconcile these competing requirements. It employs a coordinator-worker cluster architecture, often implemented in Go, to distribute incoming requests across multiple servers. DynamoDB, chosen for its horizontal scalability, absorbs the high volume of short-URL generations through parallel batch processing. To mitigate 'hot partitions', where a highly popular custom domain could overload a single database node, Global Secondary Indexes (GSIs) are combined with write sharding: numerical suffixes are appended to partition keys so that load spreads evenly across nodes (sketched below). Redis is incorporated for efficient aggregation of processed batches, further reducing response times. This design emphasizes breaking large problems into smaller ones, composing independent services, leveraging concurrency, and delegating complex transaction management to a horizontally scalable database.

While this implementation is tightly coupled to AWS services, the underlying principles of distributed system design are portable. Reproducing similar capabilities on relational databases such as PostgreSQL (e.g., via YugabyteDB or pg_shard) introduces additional configuration and maturity considerations, underscoring that the core challenge lies in architectural vision rather than in specific tooling.
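Returning to the hot-partition mitigation described above, the following sketch illustrates the write-sharding idea under stated assumptions: the key names, the `domain#n` suffix format, and a shard count of 10 are all hypothetical choices for illustration, not details from the original design. Writes for one popular custom domain are spread deterministically across several physical partitions, and reads fan out over the suffixes (typically via the GSI).

```go
package main

import (
	"fmt"
	"hash/fnv"
)

const shardCount = 10 // assumption: tuned to the expected per-domain write volume

// shardedKey spreads writes for one logical partition key (e.g. a popular
// custom domain) across shardCount physical partitions by appending a
// deterministic numeric suffix derived from the short code.
func shardedKey(domain, shortCode string) string {
	h := fnv.New32a()
	h.Write([]byte(shortCode))
	return fmt.Sprintf("%s#%d", domain, h.Sum32()%shardCount)
}

// allShards enumerates every physical key for a domain, used when a query
// must fan out across all suffixes (e.g. listing all of a domain's links).
func allShards(domain string) []string {
	keys := make([]string, shardCount)
	for i := range keys {
		keys[i] = fmt.Sprintf("%s#%d", domain, i)
	}
	return keys
}

func main() {
	fmt.Println(shardedKey("links.example.com", "aZ3k9Qx")) // e.g. links.example.com#4
	fmt.Println(allShards("links.example.com"))
}
```

The suffix trades a single cheap lookup for a bounded fan-out on reads, which is usually acceptable because the write path, not the read path, is what overwhelms a hot partition at this request rate.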