TL;DR
Consistent hashing maps both keys and nodes to a circular hash space (ring). Each key is assigned to the first node clockwise from its hash position. When a node joins or leaves, only keys near that node need to move—not the entire dataset. This makes it ideal for distributed caches and databases where nodes change frequently.
Visual Overview
THE HASH RING ┌────────────────────────────────────────────────────┐ │ │ │ 0° │ │ │ │ │ N1 ──● │ │ / \ K1● │ │ / \ │ │ 270° ● ● 90° │ │ N3 N2 │ │ \ / │ │ \ ●K2 / │ │ \ / │ │ ● │ │ 180° │ │ │ │ Hash(K1) = 80° → walks clockwise → lands on N2 │ │ Hash(K2) = 200° → walks clockwise → lands on N3 │ │ │ │ Rule: Key belongs to first node clockwise │ └────────────────────────────────────────────────────┘ WHY NOT MODULO HASHING? ┌────────────────────────────────────────────────────┐ │ Modulo: node = hash(key) % N │ │ │ │ With N=3 nodes: │ │ Key "user_1" (hash=10): 10 % 3 = 1 → Node 1 │ │ Key "user_2" (hash=15): 15 % 3 = 0 → Node 0 │ │ Key "user_3" (hash=23): 23 % 3 = 2 → Node 2 │ │ │ │ Add 4th node (N=4): │ │ Key "user_1": 10 % 4 = 2 → MOVED to Node 2! │ │ Key "user_2": 15 % 4 = 3 → MOVED to Node 3! │ │ Key "user_3": 23 % 4 = 3 → MOVED to Node 3! │ │ │ │ → ALMOST ALL keys remapped! Cache miss storm! │ └────────────────────────────────────────────────────┘
Core Explanation
What is Consistent Hashing?
Real-World Analogy: Imagine a circular running track with mile markers (0-100). Runners (nodes) stand at certain positions on the track. When a new item arrives, you calculate which mile marker it corresponds to, then walk clockwise until you hit the first runner—that runner is responsible for the item.
If a runner leaves, only their items need to be picked up by the next runner clockwise. If a new runner joins, they only take items between them and the previous runner.
How It Works
ALGORITHM STEPS ┌────────────────────────────────────────────────────┐ │ 1. CREATE THE RING │ │ └─ Ring positions: 0 to 2^32 (or 360°) │ │ │ │ 2. PLACE NODES ON RING │ │ └─ For each node: position = hash(node_id) │ │ └─ Node A: hash("A") = 100 │ │ └─ Node B: hash("B") = 200 │ │ └─ Node C: hash("C") = 300 │ │ │ │ 3. ASSIGN KEY TO NODE │ │ └─ key_pos = hash(key) │ │ └─ Walk clockwise from key_pos │ │ └─ First node you hit owns this key │ │ │ │ EXAMPLE: │ │ hash("user_123") = 150 │ │ 150 → walk clockwise → first node is B (200) │ │ Result: "user_123" lives on Node B │ └────────────────────────────────────────────────────┘
Adding a Node
BEFORE: 3 Nodes (A, B, C) ┌────────────────────────────────────────────────────┐ │ 0° │ │ │ │ │ A ───●── 100° │ │ / \ │ │ / \ ●●● Keys 101-200 │ │ 300° ● ● 200° │ │ C B │ │ \ / │ │ \ / │ │ ●●● Keys 201-300 │ │ │ │ A owns: 301-100 (wraps around) │ │ B owns: 101-200 │ │ C owns: 201-300 │ └────────────────────────────────────────────────────┘ AFTER: Add Node D at 150° ┌────────────────────────────────────────────────────┐ │ 0° │ │ │ │ │ A ───●── 100° │ │ / \ │ │ / D ──●── 150° │ │ 300° ● ● 200° │ │ C B │ │ │ │ A owns: 301-100 (unchanged) │ │ D owns: 101-150 (NEW - taken from B) │ │ B owns: 151-200 (reduced range) │ │ C owns: 201-300 (unchanged) │ │ │ │ → Only keys 101-150 moved! (~1/4 of B's keys) │ │ → A and C completely unaffected │ └────────────────────────────────────────────────────┘
Removing a Node
BEFORE: 4 Nodes (A, B, C, D) ┌────────────────────────────────────────────────────┐ │ D at 150° owns keys 101-150 │ │ B at 200° owns keys 151-200 │ └────────────────────────────────────────────────────┘ AFTER: Remove Node D ┌────────────────────────────────────────────────────┐ │ D removed at 150° │ │ B (next clockwise) absorbs D's keys │ │ B now owns keys 101-200 │ │ │ │ → Only D's keys moved │ │ → A and C completely unaffected │ └────────────────────────────────────────────────────┘ KEY INSIGHT ┌────────────────────────────────────────────────────┐ │ With N nodes and K total keys: │ │ │ │ Node added: ~K/N keys move (only new node's) │ │ Node removed: ~K/N keys move (to successor) │ │ │ │ Compare to modulo: ~K keys move (almost all!) │ └────────────────────────────────────────────────────┘
The Problem: Uneven Distribution
PROBLEM: NODES DON'T HASH EVENLY ┌────────────────────────────────────────────────────┐ │ 0° │ │ │ │ │ A ────────●── 50° (owns 0-50 = 50°) │ │ \ │ │ \ │ │ ● 250° B (owns 51-250 = 200°)│ │ / │ │ / │ │ C ────────●── 300° (owns 251-360 = 110°) │ │ │ │ Node A: 14% of ring │ │ Node B: 55% of ring ← HOTSPOT! │ │ Node C: 31% of ring │ │ │ │ This is just bad luck with hash positions │ └────────────────────────────────────────────────────┘ SOLUTION: VIRTUAL NODES ┌────────────────────────────────────────────────────┐ │ Give each physical node MULTIPLE ring positions │ │ │ │ Physical Node A → Virtual: A1, A2, A3, A4 │ │ Physical Node B → Virtual: B1, B2, B3, B4 │ │ Physical Node C → Virtual: C1, C2, C3, C4 │ │ │ │ 12 virtual nodes spread more evenly on ring │ │ Each physical node gets ~33% of keys │ │ │ │ More vnodes = better balance (typically 100-256) │ └────────────────────────────────────────────────────┘
Real Systems Using Consistent Hashing
| System | Use Case | Virtual Nodes | Notes |
|---|---|---|---|
| Amazon DynamoDB | Partitioning | Yes (256 default) | Described in famous Dynamo paper |
| Apache Cassandra | Data distribution | Yes (configurable) | Default 256 vnodes per node |
| Memcached (ketama) | Cache routing | Yes | Client-side hashing |
| Redis Cluster | Slot assignment | Fixed 16384 slots | Slots, not pure consistent hashing |
| Riak | Key-value storage | Yes | Pioneered virtual nodes |
| Discord | Guild/server routing | Yes | Routes users to servers |
Case Study: Distributed Cache
DISTRIBUTED CACHE ARCHITECTURE ┌────────────────────────────────────────────────────┐ │ │ │ Application │ │ │ │ │ ↓ │ │ Cache Client (consistent hashing) │ │ │ │ │ ├──► hash("user:123") → Cache Server A │ │ ├──► hash("user:456") → Cache Server B │ │ └──► hash("user:789") → Cache Server C │ │ │ │ SCENARIO: Server B dies │ │ ┌──────────────────────────────────────────────┐ │ │ │ Without consistent hashing: │ │ │ │ All keys rehash → 66% cache miss rate! │ │ │ │ │ │ │ │ With consistent hashing: │ │ │ │ Only B's keys (33%) miss → 33% miss rate │ │ │ │ A and C keys still hit! │ │ │ └──────────────────────────────────────────────┘ │ │ │ │ SCENARIO: Add Server D │ │ ┌──────────────────────────────────────────────┐ │ │ │ Only ~25% of keys move to D (cold) │ │ │ │ Other 75% still hit existing caches │ │ │ └──────────────────────────────────────────────┘ │ └────────────────────────────────────────────────────┘
When to Use Consistent Hashing
✓ Perfect Use Cases
DISTRIBUTED CACHING
Scenario: Memcached cluster with frequent node changes
Requirement: Minimize cache miss storms during scaling
Configuration: 100+ vnodes per physical node
Trade-off: Client needs consistent hashing library
DATABASE SHARDING
Scenario: Horizontal scaling of user data
Requirement: Predictable data location, minimal resharding
Configuration: Vnodes based on disk capacity
Trade-off: Range queries span multiple nodes
LOAD BALANCING (Session Affinity)
Scenario: Route users to same backend server
Requirement: Same user always hits same server
Configuration: Hash on user ID or session token
Trade-off: Uneven load if user activity varies
CDN EDGE ROUTING
Scenario: Route requests to nearest/best edge server
Requirement: Consistent routing, graceful failover
Configuration: Geographic + hash-based assignment
Trade-off: May need to consider latency too
✕ When NOT to Use
SMALL FIXED CLUSTERS Problem: 3-5 nodes that rarely change Alternative: Simple modulo or explicit assignment Why: Complexity not worth it for rare rebalancing RANGE QUERIES REQUIRED Problem: Need to query ranges of keys Alternative: B-tree partitioning, range-based sharding Why: Consistent hashing scatters adjacent keys CENTRALIZED COORDINATOR EXISTS Problem: Already have a master that tracks assignments Alternative: Let coordinator manage key-node mapping Why: Consistent hashing is for decentralized routing
Interview Application
Common Interview Question
Q: “Design a distributed cache like Memcached. How would you route requests to cache servers?”
Strong Answer:
“I’d use consistent hashing for routing. Here’s my design:
Why Consistent Hashing:
- Cache servers will fail and scale frequently
- Modulo hashing causes massive cache invalidation on any change
- Consistent hashing limits key movement to ~K/N keys
Implementation:
- Hash ring: Map 0 to 2^32 onto a circle
- Place servers: Hash each server hostname to ring position
- Route keys: Hash key, walk clockwise to first server
- Virtual nodes: 150 vnodes per physical server for balance
Client-side routing:
server = ring.get_node(hash(key)) response = server.get(key)Handling server failures:
- Client detects failure (timeout/error)
- Mark server dead, route to next clockwise
- Failed server’s keys become cache misses (expected)
- When server recovers, gradually warm up
Replication for availability:
- Store key on N servers (walk clockwise, pick N)
- Read from any replica
- Write to all N (or quorum)
Real-world example: Amazon’s Dynamo paper (2007) established this pattern. Most modern distributed caches and databases use it.”
Follow-up: How do virtual nodes improve the system?
“Virtual nodes solve the uneven distribution problem:
Without vnodes:
- 3 servers might hash to positions that give one server 60% of keys
- That server becomes a hotspot
With vnodes (e.g., 150 per server):
- Each physical server has 150 ring positions
- 450 total positions spread more evenly
- Statistical balancing: each server gets ~33% ± small variance
Additional benefits:
- Heterogeneous hardware: More powerful servers get more vnodes
- Faster rebalancing: When server dies, load spreads across many servers (not just one successor)
- Gradual migration: Can move vnodes one at a time during maintenance”
Code Example
Consistent Hash Ring (Python)
import hashlib
from bisect import bisect_left, insort
from typing import Optional, List
from dataclasses import dataclass
@dataclass
class VirtualNode:
"""A virtual node on the hash ring."""
position: int
physical_node: str
vnode_id: int
class ConsistentHashRing:
"""
Consistent hash ring with virtual nodes.
Usage:
ring = ConsistentHashRing(vnodes_per_node=150)
ring.add_node("cache-server-1")
ring.add_node("cache-server-2")
server = ring.get_node("user:12345") # Returns assigned server
Note: This implementation maintains a sorted positions list for O(log N)
lookups. Production implementations may use more sophisticated data
structures (e.g., skip lists, balanced trees).
"""
def __init__(self, vnodes_per_node: int = 150):
self.vnodes_per_node = vnodes_per_node
self.ring: List[VirtualNode] = [] # Sorted by position
self.positions: List[int] = [] # Parallel list for O(log N) bisect
self.nodes: set = set()
def _hash(self, key: str) -> int:
"""Hash a key to a position on the ring (0 to 2^32)."""
digest = hashlib.md5(key.encode()).hexdigest()
return int(digest[:8], 16)
def add_node(self, node: str) -> List[str]:
"""
Add a physical node with multiple virtual nodes.
Returns list of keys that would need to move to this node.
"""
if node in self.nodes:
return []
self.nodes.add(node)
# Add virtual nodes at different ring positions
for i in range(self.vnodes_per_node):
vnode_key = f"{node}:vnode:{i}"
position = self._hash(vnode_key)
vnode = VirtualNode(position, node, i)
# Insert at correct position to maintain sorted order
idx = bisect_left(self.positions, position)
self.ring.insert(idx, vnode)
self.positions.insert(idx, position)
return [] # In production, return keys that need migration
def remove_node(self, node: str) -> List[str]:
"""
Remove a physical node and all its virtual nodes.
Returns list of keys that need to move to successor nodes.
"""
if node not in self.nodes:
return []
self.nodes.remove(node)
# Rebuild ring and positions without removed node
new_ring = []
new_positions = []
for vn, pos in zip(self.ring, self.positions):
if vn.physical_node != node:
new_ring.append(vn)
new_positions.append(pos)
self.ring = new_ring
self.positions = new_positions
return [] # In production, return keys that need migration
def get_node(self, key: str) -> Optional[str]:
"""
Find the node responsible for a key. O(log N) lookup.
Returns the first node clockwise from the key's position.
"""
if not self.ring:
return None
position = self._hash(key)
# O(log N) bisect on pre-computed positions list
idx = bisect_left(self.positions, position)
# Wrap around if past the end
if idx >= len(self.ring):
idx = 0
return self.ring[idx].physical_node
def get_nodes(self, key: str, n: int = 3) -> List[str]:
"""
Get N unique nodes for a key (for replication).
Walks clockwise and collects unique physical nodes.
"""
if not self.ring:
return []
position = self._hash(key)
idx = bisect_left([vn.position for vn in self.ring], position)
nodes = []
seen = set()
for i in range(len(self.ring)):
vnode = self.ring[(idx + i) % len(self.ring)]
if vnode.physical_node not in seen:
nodes.append(vnode.physical_node)
seen.add(vnode.physical_node)
if len(nodes) >= n:
break
return nodes
def get_distribution(self) -> dict:
"""Get the key distribution across nodes (for monitoring)."""
if not self.ring:
return {}
# Count ring space owned by each node
distribution = {node: 0 for node in self.nodes}
prev_pos = self.ring[-1].position
for vnode in self.ring:
# Space from previous position to this position
if vnode.position > prev_pos:
space = vnode.position - prev_pos
else:
# Wrapped around
space = (2**32 - prev_pos) + vnode.position
# This space belongs to this vnode's physical node
distribution[vnode.physical_node] += space
prev_pos = vnode.position
# Convert to percentages
total = sum(distribution.values())
return {node: (space / total) * 100 for node, space in distribution.items()}
# Usage example
if __name__ == "__main__":
ring = ConsistentHashRing(vnodes_per_node=150)
# Add cache servers
for i in range(3):
ring.add_node(f"cache-{i}.example.com")
print("Distribution with 3 nodes:")
for node, pct in ring.get_distribution().items():
print(f" {node}: {pct:.1f}%")
# Route some keys
print("\nKey routing:")
for key in ["user:123", "user:456", "session:abc", "product:789"]:
node = ring.get_node(key)
replicas = ring.get_nodes(key, n=2)
print(f" {key} → primary: {node}, replicas: {replicas}")
# Add a fourth server
print("\nAdding cache-3.example.com...")
ring.add_node("cache-3.example.com")
print("Distribution with 4 nodes:")
for node, pct in ring.get_distribution().items():
print(f" {node}: {pct:.1f}%")
# Show same keys (most should stay on same servers)
print("\nKey routing after adding node:")
for key in ["user:123", "user:456", "session:abc", "product:789"]:
node = ring.get_node(key)
print(f" {key} → {node}")
Related Content
See It In Action:
- Consistent Hashing Explainer - Visual walkthrough of the ring
Related Concepts:
- Sharding - Partitioning data across nodes
- Virtual Nodes - Improving distribution balance
- Load Balancing - Traffic distribution patterns
Quick Self-Check
- Can explain consistent hashing in 60 seconds?
- Understand why modulo hashing fails for dynamic clusters?
- Know what happens when a node is added or removed?
- Can explain virtual nodes and why they improve balance?
- Understand time complexity of key lookup (O(log N))?
- Know real systems that use consistent hashing?
Production signal