MGS 655: Distributed Computing: Consistent Hashing

This document discusses peer-to-peer (P2P) systems and consistent hashing. It explains that P2P systems allow for redundant storage, selection of nearby servers, search capabilities, and authentication without centralized control. Consistent hashing is introduced as a technique for mapping keys to nodes in a distributed system that allows machines to join and leave the network with minimal data redistribution. The properties of consistent hashing are described, including smooth load balancing when machines are added or removed and directing references for an object to only a small number of caches even when client views are inconsistent.


MGS 655: Distributed Computing

Consistent Hashing

January 3, 2001


Peer-to-Peer (P2P) Systems

P2P systems are distributed systems without any centralized control, in which each node runs software with equivalent functionality.
Some features:

Redundant storage and permanence

Selection of nearby servers

Search capability

Authentication

Hierarchical naming

Core operation: Efficient location of data items.


The Chord Protocol

Given a key, map the key onto a node.
Depending on the application, the node may be responsible for storing a value associated with the key.
Chord uses consistent hashing to assign keys to Chord nodes.
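
The key-to-node mapping can be sketched as follows. In the published Chord design, nodes and keys are hashed onto one identifier circle and a key is assigned to its successor, the first node whose identifier is equal to or follows the key's identifier. The identifier width `M`, the helper names `ident` and `successor`, and the node names are assumptions made for this illustration; this is a minimal sketch of the assignment rule only, not of Chord's routing.

```python
import hashlib
from bisect import bisect_left
from typing import Sequence

M = 16  # identifier bits; an assumed small value chosen for readability


def ident(name: str) -> int:
    """Hash a node name or key onto the identifier circle [0, 2**M)."""
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** M)


def successor(node_ids: Sequence[int], key_id: int) -> int:
    """Return the first node id >= key_id, wrapping around the circle.

    node_ids must be sorted in ascending order.
    """
    i = bisect_left(node_ids, key_id)
    return node_ids[i % len(node_ids)]


# Hypothetical node names, used only for this example.
nodes = sorted(ident(f"node-{i}") for i in range(8))
owner = successor(nodes, ident("some-key"))
print(f"some-key is assigned to node id {owner}")
```

The modulo on the index handles the wrap-around case, where a key's identifier is larger than every node identifier and the key is assigned to the smallest node on the circle.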


Hashing

Typical hashing-based schemes spread load across a fixed collection of servers.
Assume there are K objects and n cache machines. The easiest way to balance load is to place K/n objects on each machine.
One simple option is to assign an object with key k to machine hash(k) mod n.

Question
When is this hashing technique not a good idea?

When to use Consistent Hashing?

When a fixed collection of machines does not exist.
Machines come and go as they are brought into (or drop out of) the network.
Information about which machines are functional propagates slowly across the network, so clients have incompatible views.
With hash(k) mod n, a change in n means almost every item hashes to a different (new) location.
Suddenly all the cached data is useless!
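
The effect of changing n can be checked with a small experiment. The sketch below assumes SHA-1 as the hash function and uses hypothetical object names; it counts how many objects land on a different machine when one machine is added to a pool of ten. For a well-mixed hash, an object stays put only when its hash gives the same remainder mod 10 and mod 11, so in expectation about 10/11 of the objects move.

```python
import hashlib


def h(key: str) -> int:
    """Hash a key to a large integer (SHA-1 is an assumed choice)."""
    return int.from_bytes(hashlib.sha1(key.encode()).digest(), "big")


def moved_fraction(keys, n_before: int, n_after: int) -> float:
    """Fraction of keys whose machine index changes under hash(k) mod n."""
    moved = sum(1 for k in keys if h(k) % n_before != h(k) % n_after)
    return moved / len(keys)


# Hypothetical object names, used only for this example.
keys = [f"object-{i}" for i in range(10_000)]
frac = moved_fraction(keys, 10, 11)
print(f"{frac:.1%} of objects change machine when n goes from 10 to 11")
```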


More on Consistent Hashing

Views: the set of caches of which a particular client is aware.
Assumption: each machine is aware of a constant fraction of the currently operating caches.
A client uses a consistent hash function to map an object to one of the caches in its view.
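
A view-based lookup can be sketched as below, assuming a ring-style consistent hash in which caches and objects are hashed to points on a circle and an object goes to the first cache point at or after its own. The cache names, the two views, and the helpers `point` and `lookup` are hypothetical. The sketch illustrates one useful consequence: when the cache that would be chosen under the full set is present in a client's view, that client chooses the same cache.

```python
import hashlib
from bisect import bisect_left


def point(label: str) -> int:
    """Map a cache name or object name to a point on the circle."""
    return int.from_bytes(hashlib.sha1(label.encode()).digest()[:8], "big")


def lookup(view, obj: str) -> str:
    """Return the cache in `view` whose point follows the object's point."""
    ring = sorted((point(c), c) for c in view)
    ids = [p for p, _ in ring]
    return ring[bisect_left(ids, point(obj)) % len(ring)][1]


all_caches = [f"cache-{i}" for i in range(8)]
view_a = all_caches[:6]  # client A has not yet heard of cache-6 and cache-7
view_b = all_caches[2:]  # client B has not yet heard of cache-0 and cache-1

objects = [f"object-{i}" for i in range(1_000)]
agree = sum(lookup(view_a, o) == lookup(view_b, o) for o in objects)
print(f"clients A and B agree on {agree} of {len(objects)} objects")
```

Because each view is a subset of the full set of caches, removing caches from a view can only redirect the objects that those missing caches would have received.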


Properties of the Hash Function


Smoothness: When a machine is added to or removed from the set of caches, the expected fraction of objects that must be moved to a new cache is the minimum needed to maintain a balanced load across the caches.
Spread: Over all the client views, the total number of different caches to which an object is assigned is small.
Load: Over all the client views, the number of distinct objects assigned to a particular cache is small.
As a result:
Even in the case of inconsistent views, references for a given object are directed only to a small number of caching machines.
No one cache is assigned an unreasonable number of objects.
Smooth changes in the set of caching machines are matched by a smooth evolution in the location of cached objects.
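
Smoothness can be illustrated with a ring-style consistent hash. The sketch below is one possible construction, not necessarily the one intended by these slides: each cache is hashed to several points on a circle (the `replicas` count of 50 is an assumed setting) and an object is assigned to the cache owning the first point at or after the object's point. The class name `Ring` and the cache and object names are hypothetical. When an eleventh cache is added, the only objects that move are those the new cache takes over, which is roughly 1/11 of them in expectation.

```python
import hashlib
from bisect import bisect_left


def point(label: str) -> int:
    """Map a label to a point on the circle (first 64 bits of SHA-1)."""
    return int.from_bytes(hashlib.sha1(label.encode()).digest()[:8], "big")


class Ring:
    def __init__(self, caches, replicas: int = 50):
        # Each cache owns `replicas` points on the circle (assumed setting).
        self.points = sorted(
            (point(f"{c}#{r}"), c) for c in caches for r in range(replicas)
        )
        self.ids = [p for p, _ in self.points]

    def lookup(self, obj: str) -> str:
        """Return the cache owning the first point at or after the object."""
        i = bisect_left(self.ids, point(obj)) % len(self.ids)
        return self.points[i][1]


objects = [f"object-{i}" for i in range(10_000)]
before = Ring([f"cache-{i}" for i in range(10)])
after = Ring([f"cache-{i}" for i in range(11)])  # cache-10 joins

moved = [o for o in objects if before.lookup(o) != after.lookup(o)]
moved_fraction = len(moved) / len(objects)
all_to_new = all(after.lookup(o) == "cache-10" for o in moved)
print(f"{moved_fraction:.1%} of objects moved; all to the new cache: {all_to_new}")
```

Compared with hash(k) mod n, no object is shuffled between two caches that were both present before and after the change.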