The Architecture Review
Episode 04

Sharding Strategies

Range, hash, and consistent hashing. The trade-offs that determine whether your database survives the next 10× of growth.

Video publishes soon

Anchor paper: Giuseppe DeCandia et al., "Dynamo: Amazon's Highly Available Key-value Store," Amazon, SOSP 2007.

The pattern at a glance

Long-form article coming soon. The narration below is the spoken version of this episode — read it as a quick transcript while the written companion is in draft.

Transcript

Your orders table reaches one billion rows. Query latency starts climbing. Reads that used to take ten milliseconds now take eight hundred. The database is fine. The hardware is fine. The problem is you've outgrown one machine.

Vertical scaling — bigger disks, more memory, faster CPU — buys time. Eventually it stops. Then you have to spread the data across multiple machines.

This is sharding.

Replication is not sharding. Replication makes copies of the same data. It helps with reads. It does not help when writes are the bottleneck, or when the data simply doesn't fit on one disk.

Sharding splits the data itself. Different rows live on different machines. A single query now needs to know which machine holds the answer — or fan out to all of them.

You trade one big database for many small ones. You buy horizontal scalability with operational complexity.

The pattern is older than NoSQL. Partitioned databases were standard practice in mainframe systems decades ago. But the modern formulation comes from Amazon's Dynamo paper, published at SOSP 2007 by Giuseppe DeCandia and team. Dynamo took consistent hashing, a technique first described by Karger et al. in 1997 for distributed web caching, and made it the mechanism for partitioning data across an elastic set of machines.

Three sharding strategies. Each has a different trade-off.

One: range-based sharding. Rows go to shards by some key range. Users A through H on shard one, I through P on shard two, Q through Z on shard three. Queries that filter on the key hit one shard. The trap: hotspots. If new users register in alphabetical clusters, one shard takes all the write load.
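A minimal sketch of range-based routing, assuming alphabetical split points and shard names invented for illustration:

```python
import bisect

# Hypothetical split points: [a-h] -> shard-1, [i-p] -> shard-2, [q-z] -> shard-3.
SPLIT_POINTS = ["i", "q"]
SHARDS = ["shard-1", "shard-2", "shard-3"]

def route_by_range(key: str) -> str:
    # Find which range bucket the key's first letter falls into.
    return SHARDS[bisect.bisect_right(SPLIT_POINTS, key[0].lower())]

print(route_by_range("alice"))  # shard-1
print(route_by_range("quinn"))  # shard-3
```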

Two: hash-based sharding. Hash the key, then take it modulo the number of shards. Even distribution. The trap: range queries now require a scatter-gather, meaning ask every shard and merge the results in application code.
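A minimal sketch, using a stable hash rather than Python's built-in hash(), which is randomized per process; the shard count is an arbitrary example:

```python
import hashlib

NUM_SHARDS = 8

def route_by_hash(key: str) -> int:
    # md5 is used only for stable, well-mixed bits, not for security.
    digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return digest % NUM_SHARDS

print(route_by_hash("user:42"))  # the same key always lands on the same shard
```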

Three: directory-based sharding. A lookup service maps each key to a shard. Maximum flexibility. The trap: the directory itself becomes a coordination bottleneck and a single point of failure.
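A minimal sketch of the directory approach, with an in-memory dict standing in for what would be a replicated lookup service in production; all names are illustrative:

```python
# The directory is the single source of truth for key placement.
directory: dict[str, str] = {}

def assign(key: str, shard: str) -> None:
    directory[key] = shard

def route_by_directory(key: str) -> str:
    # Every read and write pays a directory lookup first.
    return directory[key]

assign("user:42", "shard-3")
print(route_by_directory("user:42"))  # shard-3
```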

A real-world problem with naive hash-based sharding: adding a new shard means rehashing every row. Eight shards become nine. The modulo changes. Almost every row's destination changes. Migration nightmare.
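The damage is easy to measure. Under the illustrative md5 routing above, a key keeps its shard only when its hash happens to agree mod 8 and mod 9, which is roughly one key in nine:

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % num_shards

keys = [f"user:{i}" for i in range(100_000)]
moved = sum(shard_for(k, 8) != shard_for(k, 9) for k in keys)
print(f"{moved / len(keys):.0%} of keys move")  # ~89%, i.e. roughly 8/9
```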

Consistent hashing solves this. Map both the data and the shards onto a ring. Each row lives on the nearest shard going clockwise. Adding a new shard reshuffles only the rows on the immediate neighboring arc — typically one over N of the total data, where N is the number of shards.
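A minimal ring sketch, again with md5 as an arbitrary stable hash; each shard gets a single point on the ring, and a key routes to the first point at or clockwise past its own position:

```python
import bisect
import hashlib

def ring_hash(value: str) -> int:
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

class ConsistentHashRing:
    def __init__(self, shards: list[str]):
        # Sorted (position, shard) points around the ring.
        self._points = sorted((ring_hash(s), s) for s in shards)

    def route(self, key: str) -> str:
        # First point clockwise from the key, wrapping past the top of the ring.
        idx = bisect.bisect(self._points, (ring_hash(key),))
        return self._points[idx % len(self._points)][1]

ring = ConsistentHashRing(["shard-1", "shard-2", "shard-3"])
print(ring.route("user:42"))
# Adding "shard-4" would move only the keys on the arc it lands in.
```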

This is the Dynamo paper's central contribution. Production systems extend it with virtual nodes, meaning each shard owns many small slices of the ring instead of one large slice. Virtual nodes smooth the distribution, and when a shard fails, its load spreads across many survivors instead of falling entirely on one clockwise neighbor.
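Extending the ring sketch above with virtual nodes: each shard is hashed onto the ring many times, and 100 points per shard is an arbitrary choice here.

```python
class VirtualNodeRing(ConsistentHashRing):
    def __init__(self, shards: list[str], points_per_shard: int = 100):
        # "shard-1#0", "shard-1#1", ... scatter each shard around the ring.
        self._points = sorted(
            (ring_hash(f"{shard}#{i}"), shard)
            for shard in shards
            for i in range(points_per_shard)
        )

ring = VirtualNodeRing(["shard-1", "shard-2", "shard-3"])
print(ring.route("user:42"))
```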

Three traps every sharded system hits.

One: the wrong shard key. A poorly chosen key creates hotspots that no amount of scaling fixes. If you sharded comments by author, a single deleted-user account would take an outsized share of writes. Choose keys that distribute load.

Two: cross-shard queries. A join across two tables in different shards is no longer a JOIN. It's a scatter-gather: ask every shard, merge the results in application code. Slower, more code, more failure modes. Schema design now has to anticipate which queries cross shards and keep related data co-located on the same shard.
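A minimal scatter-gather sketch, with in-memory lists standing in for per-shard database connections; every name here is illustrative:

```python
from concurrent.futures import ThreadPoolExecutor

# Fake per-shard result sets standing in for real connections.
SHARDS = {
    "shard-1": [{"user": "alice", "total": 30}],
    "shard-2": [{"user": "ivan", "total": 55}],
    "shard-3": [{"user": "quinn", "total": 12}],
}

def query_shard(shard: str, min_total: int) -> list[dict]:
    # Stands in for running the same query against one shard.
    return [row for row in SHARDS[shard] if row["total"] >= min_total]

def scatter_gather(min_total: int) -> list[dict]:
    with ThreadPoolExecutor() as pool:
        # Scatter: the same query goes to every shard in parallel.
        futures = [pool.submit(query_shard, s, min_total) for s in SHARDS]
        # Gather: merge and re-sort partial results in application code.
        rows = [row for f in futures for row in f.result()]
    return sorted(rows, key=lambda row: row["total"], reverse=True)

print(scatter_gather(min_total=20))  # hits all three shards, returns two rows
```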

Three: rebalancing. Adding capacity to a running system means moving data while queries are live. You need versioned routing tables, dual-write windows, and careful coordination — or you accept a maintenance window. Most teams underestimate this cost.
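One way to picture the moving parts: a hypothetical versioned router with a dual-write window. Real systems add verification and cutover steps this sketch omits.

```python
class VersionedRouter:
    def __init__(self, table: dict[str, str], version: int):
        self.table = table          # key range -> shard
        self.version = version      # bumped on every cutover
        self.migrating: dict[str, tuple[str, str]] = {}  # range -> (old, new)

    def write_targets(self, key_range: str) -> list[str]:
        # During migration, writes go to both the old and the new shard.
        if key_range in self.migrating:
            return list(self.migrating[key_range])
        return [self.table[key_range]]

    def read_target(self, key_range: str) -> str:
        # Reads stay on the old shard until the copy is verified.
        if key_range in self.migrating:
            return self.migrating[key_range][0]
        return self.table[key_range]

router = VersionedRouter({"a-m": "shard-1", "n-z": "shard-2"}, version=1)
router.migrating["n-z"] = ("shard-2", "shard-3")
print(router.write_targets("n-z"))  # ['shard-2', 'shard-3']: dual-write window
print(router.read_target("n-z"))    # shard-2 until cutover
```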

Sharding adds complexity to every read, every write, every backup, every schema migration. Don't shard until you've exhausted vertical scaling, read replicas, caching, and query optimization.

A single relational database instance can comfortably handle terabytes of data and millions of writes per day. Most workloads never need to shard. Sharding too early creates problems that would never have existed and that are hard to undo.

When you must shard, shard on exactly one dimension. Resist sharding by multiple keys until the first sharded version is proven in production.

Sharding is the operational tax you pay to break the single-machine ceiling. The data is the same. The mental model changes — every query now considers location.

The hard part is not the partitioning math. The hard part is everything around it: schema design, rebalancing, cross-shard transactions, and the day you realize the key was wrong.

Next episode: CAP and PACELC. The formal trade-offs every distributed system makes, whether the architects know it or not.