Consistent hashing - What is consistent hashing?
General idea
1) Map servers and keys onto the ring using a uniformly distributed hash function
2) To find out which server a key is mapped to, go clockwise from the key position until the first server on the ring is found
Consistent hashing - Hash servers
Using a hash function (eg SHA-1), we map servers based on server IP or name onto a ring.
sha1(server_0)
sha1(server_1)
…
These get placed on a ring
Consistent hashing - Hash keys
Using the same hash function (eg SHA-1) we hash keys onto the same ring as the servers
To determine which server a key is stored on, we go clock wise from the key position on the run until a server is found
Consistent hashing - virtual nodes (aka replicas)
A virtual node refers to the real node, and each server has multiple virtual nodes on the ring.
As the number of virtual nodes increases, the distribution of keys become more balanced
When a server is added or removed, a fraction of data needs to be redistributed