Design Distributed ID Generator
The Problem#
In a distributed system with multiple servers/nodes, you can’t rely on simple auto-incrementing database IDs because:
- Multiple nodes might generate the same ID simultaneously
- Coordination between nodes would be too slow
- You need globally unique IDs without a single point of failure
This post explores how to design a distributed ID generator that can scale to hundreds of thousands of IDs per second across multiple data centers.
Requirements & Constraints#
Functional Requirements#
Scale:
- Initial: 10,000 IDs per second
- Target: 100,000 IDs per second
- Calculate: 100,000 IDs/sec × 86,400 seconds = 8.64 billion IDs per day
ID Properties:
- Format: 64-bit numeric integers (database-friendly)
- Ordering: Time-sortable for efficient indexing and range queries
- Gaps: Acceptable (uniqueness is more important than sequential ordering)
- Security: Predictability is acceptable (internal use only)
Infrastructure:
- Servers: Start with 10, scale to 1,024 servers
- Data centers: 3 regions
- Clock sync: NTP with <1 second skew
Performance:
- Latency: Sub-millisecond preferred
- No single point of failure
- Losing IDs on crash is acceptable (but no duplicates ever)
Key Design Questions#
Architecture Decision: Centralized vs Distributed?
- Centralized: Easy to maintain, but creates bottleneck and single point of failure
- Distributed (Chosen): Each server generates IDs independently
Bit Allocation: How to structure the 64-bit ID?
- Need to balance timestamp range, machine capacity, and throughput
Machine ID Assignment: How do servers get unique IDs?
- Options: Manual configuration, ZooKeeper coordination, or configuration service
Clock Issues: What if the system clock goes backwards?
- Must detect and handle to prevent duplicate IDs
Why Not UUIDs?#
Before diving into our solution, let’s understand why random UUIDs aren’t ideal:
Problems with Random UUIDs:
- Page splits: Database reorganizes index pages constantly
- Poor locality: Related data scattered across disk
- Cache misses: Can’t keep “hot” data in memory efficiently
- Slow writes: Every insert might touch different disk pages
- Index bloat: B-tree becomes fragmented
- Storage: 128 bits vs our 64-bit solution
Solution: Time-ordered IDs (Snowflake approach) solve these issues by keeping related IDs together.
The Solution: 64-Bit Snowflake ID Structure#
Our ID is composed of 4 parts within 64 bits:
| 1 bit | 41 bits | 10 bits | 12 bits |
|-------|------------|-----------|-----------|
| Sign | Timestamp | Machine | Sequence |
| 0 | Millisec | ID | Number |
Bit Allocation Breakdown#
1. Sign Bit (1 bit): Always 0
- Ensures the ID is positive when treated as signed integer
2. Timestamp (41 bits):
- Capacity: 2^41 = 2,199,023,255,552 milliseconds ≈ 69.73 years
- With custom epoch (e.g., Jan 1, 2020), we get ~70 years of IDs
- Provides natural time-ordering for database efficiency
3. Machine ID (10 bits):
- Capacity: 2^10 = 1,024 unique machines
- Can be split for multi-datacenter: 5 bits datacenter (32 DCs) + 5 bits machine (32 per DC)
- Ensures uniqueness across distributed servers
4. Sequence Number (12 bits):
- Capacity: 2^12 = 4,096 IDs per millisecond per machine
- Equals 4 million IDs per second per machine
- Total capacity: 1,024 machines × 4M IDs = 4.1 billion IDs/second
- Far exceeds our requirement of 100K IDs/second
Capacity Analysis#
With our bit allocation:
- Per machine: 4,096,000 IDs/second
- 10 machines: 40,960,000 IDs/second
- 100 machines: 409,600,000 IDs/second
- Requirement: 100,000 IDs/second ✓
This provides excellent headroom for growth.
Handling Edge Cases#
Clock Drift: When Time Goes Backwards#
The Problem: If the system clock moves backwards, we risk generating duplicate IDs.
Detection: Simple comparison
if current_time < last_timestamp:
# Clock moved backwards!
Handling Strategies:
Wait Strategy (Small drift < 5ms):
- Pause and wait for clock to catch up
- Best for minor NTP adjustments
Continue Strategy (Medium drift 5-100ms):
- Keep using last_timestamp
- Continue incrementing sequence number
- Acceptable since IDs don’t need precise timestamps
Reject Strategy (Large drift > 100ms):
- Return error and log incident
- Prevents service from generating bad IDs
- Requires manual intervention
Hybrid Approach (Recommended):
drift = last_timestamp - current_time if drift < 5ms: wait_until(last_timestamp) elif drift < 100ms: use last_timestamp + increment sequence else: raise ClockDriftError
Prevention & Monitoring:
- Configure NTP properly on all servers
- Monitor clock drift metrics continuously
- Alert on any backward clock movements
- Log all clock anomalies for analysis
Sequence Number Exhaustion#
Problem: What if we generate 4,096 IDs within the same millisecond?
Solution Options:
- Wait for next millisecond (Recommended): Brief pause, guaranteed uniqueness
- Return error: Fail fast, let caller retry
- Spin wait: Busy-wait until clock advances
Machine ID Assignment#
Options:
Manual Configuration:
- Simple: Add machine ID to config file
- Pros: No external dependencies
- Cons: Human error risk, manual tracking
ZooKeeper/Consul:
- Auto-assign IDs from a range
- Pros: Automated, centralized tracking
- Cons: External dependency
Database Registration:
- Servers register on startup
- Pros: Audit trail, easy monitoring
- Cons: Database dependency
Recommendation: Use ZooKeeper for automatic assignment with manual override capability.
Implementation Considerations#
Multi-Datacenter Support#
Split the 10-bit machine ID:
- 5 bits for datacenter: Supports 32 datacenters
- 5 bits for machine: 32 machines per datacenter
- Total: 32 × 32 = 1,024 unique ID generators
Time Precision Trade-offs#
Milliseconds (Chosen):
- ✓ Good balance between precision and lifespan
- ✓ 69+ years of IDs
- ✓ Sufficient granularity for most applications
Seconds:
- ✓ 2,199+ years of IDs
- ✗ Lower granularity (need more sequence bits)
ID Generation Algorithm#
function generate_id():
current_time = get_current_milliseconds()
if current_time < last_timestamp:
handle_clock_backwards(current_time)
if current_time == last_timestamp:
sequence = (sequence + 1) & 4095 # 12-bit mask
if sequence == 0:
# Exhausted sequence, wait for next millisecond
current_time = wait_next_millisecond(last_timestamp)
else:
sequence = 0
last_timestamp = current_time
# Construct the ID
id = ((current_time - EPOCH) << 22) |
(machine_id << 12) |
sequence
return id
Advantages of This Design#
- Scalable: Supports up to 1,024 machines generating 4B+ IDs/second
- Time-ordered: IDs naturally sort by creation time
- Database-friendly: Sequential IDs improve B-tree index performance
- Compact: Only 64 bits (vs 128 for UUIDs)
- Independent: No coordination needed between servers
- Fast: Sub-millisecond latency, no network calls
- Debuggable: Can extract timestamp and machine ID from any ID
Conclusion#
The Snowflake approach provides an excellent balance between simplicity, performance, and scalability for distributed ID generation. By carefully allocating bits across timestamp, machine ID, and sequence number, we achieve:
- 100K+ IDs/second capacity (with room for 40x growth)
- 69+ years of unique IDs
- Zero coordination overhead between servers
- Database-optimized time-ordered IDs
This design has been battle-tested at scale by Twitter, Instagram, Discord, and many others, making it a proven solution for distributed systems.