Distributed Hash Tables

Distributed hash tables (DHTs) provide a scalable lookup service by mapping keys to nodes in a distributed system. DHTs use consistent hashing to assign keys to nodes, balancing load as nodes join and leave. Chord is a seminal DHT that implements efficient lookups using a "finger table" at each node to route queries towards the responsible node in O(log N) hops. Nodes in Chord maintain routing information to adapt to changes in the network topology.

Distributed Hash Tables
15-441, Spring 2004 (Jeff Pang)

DHTs
Like it sounds: a distributed hash table
Put(Key, Value)
Get(Key) -> Value
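A minimal sketch of this abstract interface (the class and method names here are illustrative, not taken from any particular system):

```python
# Hypothetical put/get interface; a real DHT spreads the store across many nodes.
class DHT:
    def __init__(self):
        self._store = {}                  # stand-in for data held by remote nodes

    def put(self, key, value):
        """Store value under key somewhere in the system."""
        self._store[key] = value

    def get(self, key):
        """Return the value stored under key, or None if absent."""
        return self._store.get(key)

dht = DHT()
dht.put("movie.mp4", b"...")
print(dht.get("movie.mp4"))               # the caller never learns which node held it
```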


Interface vs. Implementation
Put/Get is an abstract interface
Very convenient to program to
Doesn't require a DHT in today's sense of the word
e.g., Amazon's S3 storage service: /bucketname/objectid -> data
We'll mostly focus on the backend log(n) lookup systems like Chord
But researchers have proposed alternate architectures that may work better, depending on assumptions!
Last Time: Unstructured Lookup
Pure flooding (Gnutella), TTL-limited: send the message to all nodes
Supernodes (KaZaA): flood to supernodes only
Adaptive supernodes and other tricks (GIA)
None of these scales well for searching for needles


Alternate Lookups
Keep in mind contrasts to...
Flooding (unstructured), from last time
Hierarchical lookups, e.g. DNS
Properties? The root is critical. Today's DNS root is widely replicated, run in serious, secure datacenters, etc. Load is asymmetric.
Not always bad: DNS works pretty well
But not fully decentralized, if that's your goal

P2P Goal (general)
Harness storage & computation across (hundreds, thousands, millions of) nodes across the Internet
In particular: can we use them to create a gigantic, hugely scalable DHT?


P2P Requirements
Scale to those sizes...
Be robust to faults and malice
Specific challenges:
Node arrival and departure vs. system stability
Freeloading participants
Malicious participants
Understanding the bounds of what systems can and cannot be built on top of p2p frameworks

DHTs
Two options:
lookup(key) -> node ID
lookup(key) -> data
When you know the node ID, you can ask it directly for the data, but specifying the interface as -> data provides more opportunities for caching and computation at intermediaries
Different systems do either. We'll focus on the problem of locating the node responsible for the data; the solutions are basically the same.

Algorithmic Requirements
Every node can find the answer
Keys are load-balanced among nodes
Note: we're not talking about the popularity of keys, which may be wildly different. Addressing this is a further challenge...
Routing tables must adapt to node failures and arrivals
How many hops must lookups take?
A tradeoff is possible between state/maintenance traffic and the number of lookup hops...
Consistent Hashing
How can we map a key to a node?
Consider ordinary hashing: func(key) % N -> node ID
What happens if you add/remove a node?
Consistent hashing:
Map node IDs to a (large) circular space
Map keys to the same circular space
A key belongs to the nearest node
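A minimal consistent-hashing sketch along these lines (SHA-1 and the node names are just convenient placeholders):

```python
import hashlib
from bisect import bisect_left

def h(name, bits=32):
    """Hash a string onto a circular ID space of size 2**bits."""
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % (2 ** bits)

class Ring:
    def __init__(self, nodes):
        self.ids = sorted(h(n) for n in nodes)
        self.node_of = {h(n): n for n in nodes}

    def lookup(self, key):
        """Node responsible for key: first node ID at or after h(key), wrapping around."""
        i = bisect_left(self.ids, h(key)) % len(self.ids)
        return self.node_of[self.ids[i]]

ring = Ring(["nodeA", "nodeB", "nodeC"])
print(ring.lookup("some-key"))
# Adding or removing one node only moves the keys adjacent to it on the circle.
```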

DHT: Consistent Hashing
[Figure: circular ID space with nodes N32, N90, N105 and keys K5, K20, K80.]
A key is stored at its successor: the node with the next-higher ID.
Consistent Hashing
A very useful algorithmic trick outside of DHTs, too:
any time you want object placement to change only slightly when buckets arrive or depart
Detail:
To get good load balance, you must represent each bucket by log(N) virtual buckets (see the sketch below)
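A sketch of the virtual-bucket idea (self-contained; the per-node virtual-ID count and node names are illustrative):

```python
import hashlib
from bisect import bisect_left

def h(name, bits=32):
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % (2 ** bits)

def build_ring(nodes, vnodes=16):
    """Each physical node appears at several hashed positions (virtual buckets),
    which evens out the share of the keyspace each node owns."""
    return sorted((h(f"{n}#{v}"), n) for n in nodes for v in range(vnodes))

def lookup(ring, key):
    ids = [rid for rid, _ in ring]
    return ring[bisect_left(ids, h(key)) % len(ring)][1]

ring = build_ring(["nodeA", "nodeB", "nodeC"])
print(lookup(ring, "some-key"))
```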

DHT: Chord Basic Lookup
[Figure: ring with nodes N10, N32, N60, N90, N105, N120. A lookup "Where is key 80?" issued at N10 is forwarded around the ring until it reaches N90, which answers "N90 has K80".]
DHT: Chord Finger Table
[Figure: node N80's fingers point 1/2, 1/4, 1/8, 1/16, 1/32, 1/64, 1/128 of the way around the ring.]
Entry i in the finger table of node n is the first node that succeeds or equals n + 2^i.
In other words, with m-bit IDs, finger i points 1/2^(m-i) of the way around the ring, so the fingers reach 1/2, 1/4, 1/8, ... of the way around (see the sketch below).
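A small sketch of building such a table from a known set of node IDs (3-bit IDs to match the join example on the next slides; this is not Chord's actual join/stabilization protocol):

```python
def successor(node_ids, x, m=3):
    """First node ID that succeeds or equals x on a ring of size 2**m."""
    ids = sorted(node_ids)
    x %= 2 ** m
    return next((n for n in ids if n >= x), ids[0])   # wrap around if needed

def finger_table(n, node_ids, m=3):
    """Entry i points to successor(n + 2**i)."""
    return [successor(node_ids, n + 2 ** i, m) for i in range(m)]

print(finger_table(1, [0, 1, 2, 6]))   # -> [2, 6, 6], i.e. n1's table once all four nodes have joined
```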

DHT: Chord Join
Assume an identifier space [0..8) (3-bit IDs, ring positions 0-7).
Node n1 joins.
n1's successor table:
  i | n1 + 2^i | succ
  0 |    2     |   1
  1 |    3     |   1
  2 |    5     |   1
DHT: Chord Join
Node n2 joins.
n1's successor table:
  i | n1 + 2^i | succ
  0 |    2     |   2
  1 |    3     |   1
  2 |    5     |   1
n2's successor table:
  i | n2 + 2^i | succ
  0 |    3     |   1
  1 |    4     |   1
  2 |    6     |   1
DHT: Chord Join
Nodes n0 and n6 join.
n0's successor table:
  i | n0 + 2^i | succ
  0 |    1     |   1
  1 |    2     |   2
  2 |    4     |   0
n1's successor table:
  i | n1 + 2^i | succ
  0 |    2     |   2
  1 |    3     |   6
  2 |    5     |   6
n2's successor table:
  i | n2 + 2^i | succ
  0 |    3     |   6
  1 |    4     |   6
  2 |    6     |   6
n6's successor table:
  i | n6 + 2^i | succ
  0 |    7     |   0
  1 |    0     |   0
  2 |    2     |   2
DHT: Chord Join
Nodes: n1, n2, n0, n6
Items: f7, f2
Each item is stored at its successor: the first node whose ID is at or after the item's ID, wrapping around the ring.
[Figure: the ring with the successor tables from the previous slide, plus the stored items.]
DHT: Chord Routing
Upon receiving a query for item id, a node:
Checks whether it stores the item locally
If not, forwards the query to the largest node in its successor table that does not exceed id
[Figure: using the tables above, a query(7) is forwarded along the ring until it reaches the node that stores item 7.]
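A sketch of this greedy forwarding rule as a small simulation over the 3-bit example above (ring arithmetic handles the wrap-around; this ignores failures and the real RPC machinery):

```python
M = 3                                   # ID bits; ring size 2**M = 8
NODES = sorted([0, 1, 2, 6])            # the example's live nodes

def succ(x):
    """Successor of id x: first live node at or after x, wrapping around."""
    x %= 2 ** M
    return next((n for n in NODES if n >= x), NODES[0])

def table(n):
    """Successor/finger table of node n: entry i is succ(n + 2**i)."""
    return [succ(n + 2 ** i) for i in range(M)]

def between(x, a, b):
    """True if x lies strictly between a and b going clockwise on the ring."""
    return 0 < (x - a) % (2 ** M) < (b - a) % (2 ** M)

def lookup(start, item_id):
    """Forward the query greedily until it reaches the node storing item_id."""
    node, path = start, [start]
    while node != succ(item_id):
        # pick the farthest table entry that does not overshoot the item
        options = [f for f in table(node) if between(f, node, item_id)]
        if options:
            node = max(options, key=lambda f: (f - node) % (2 ** M))
        else:
            node = succ(node + 1)       # no shortcut available: step to the successor
        path.append(node)
    return path

print(lookup(1, 7))    # e.g. [1, 6, 0]: node 0 stores item 7
```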

DHT: Chord Summary
Routing table size? log N fingers
Routing time? Each hop is expected to halve the remaining distance to the desired id => expect O(log N) hops.
For example, with N = 2^20 (about a million) nodes, each node keeps roughly 20 fingers and a lookup takes on the order of (1/2) log2 N ≈ 10 hops in expectation.

Alternate Structures
Chord is like a skip list: each hop takes you a large fraction of the remaining way towards the destination. Other topologies do this too...

Tree-like Structures
Pastry, Tapestry, Kademlia
Pastry:
Nodes maintain a leaf set of size |L|: the |L|/2 nodes above and below the node's ID
(Like Chord's successors, but bidirectional)
Pointers to log_2(N) nodes: one per level i of bit-prefix sharing with the node, with bit i+1 different (see the sketch below)
e.g., node id 01100101 stores pointers to neighbors at 1..., 00..., 010..., 0111..., ...
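A sketch of enumerating those routing-table prefixes for the example node ID (illustrative only; real Pastry uses base-2^b digits and also consults the leaf set):

```python
def routing_prefixes(node_id: str):
    """For each prefix length i, the prefix a routing-table neighbor must carry:
    the node's first i bits followed by the opposite value of the next bit."""
    prefixes = []
    for i in range(len(node_id)):
        flipped = "1" if node_id[i] == "0" else "0"
        prefixes.append(node_id[:i] + flipped)
    return prefixes

print(routing_prefixes("01100101"))
# ['1', '00', '010', '0111', '01101', '011000', '0110011', '01100100']
```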
Hypercubes
The CAN DHT
Each node has an ID
Maintains pointers to neighbors whose IDs differ in one bit position
Only one possible neighbor in each direction
But can route to the receiver by changing any differing bit (see the sketch below)
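A sketch of bit-fixing routing on a hypercube (any differing bit may be corrected at each hop; here we always fix the lowest one):

```python
def hypercube_route(src: int, dst: int):
    """Route from src to dst by fixing one differing ID bit per hop."""
    path, cur = [src], src
    while cur != dst:
        diff = cur ^ dst
        cur ^= diff & -diff            # flip the lowest differing bit
        path.append(cur)
    return path

print([bin(n) for n in hypercube_route(0b0110, 0b1001)])   # 4 hops, one per differing bit
```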

So Many DHTs...
Compare along two axes:
How many neighbors can you choose from when forwarding? (forwarding selection)
How many nodes can you choose from when selecting neighbors? (neighbor selection)
Failure resilience: forwarding choices help
Picking low-latency neighbors: both help

Proximity
Ring:
Forwarding: log(N) choices for the next hop when going around the ring
Neighbor selection: pick from 2^i nodes at level i (great flexibility)
Tree:
Forwarding: 1 choice
Neighbor selection: 2^(i-1) choices for the ith neighbor

Hypercube
Neighbor selection: 1 choice
(neighbors are exactly the nodes that differ in one bit)
Forwarding:
Can fix any differing bit you want
log(N)/2 (expected) ways to forward
So:
Neighbor selection: hypercube 1, others 2^i
Forwarding selection: tree 1, hypercube log(N)/2, ring log(N)
How Much Does It Matter?
Failure resilience without re-running the routing protocol:
Tree is much worse; ring appears best
But all protocols can use multiple neighbors at various levels to improve these numbers
Proximity:
Neighbor selection is more important than route selection for proximity, and it draws from a large space in everything but the hypercube
Other Approaches
Instead of log(N), can do:
Direct routing (everyone knows the full routing table)
Can scale to tens of thousands of nodes
May fail lookups and retry to recover from failures/additions
One-hop routing with sqrt(N) state instead of log(N) state
What's best for real applications? Still up in the air.
DHT: Discussion
Pros:
Guaranteed lookup
O(log N) per-node state and search scope (or otherwise, depending on the design)
Cons:
A hammer in search of a nail? Now becoming popular in p2p (e.g., BitTorrent's distributed tracker), but still waiting for massive uptake. Or not.
Many services (like Google) are scaling to huge numbers without DHT-like log(N) structures
Further Information
We didn't talk about Kademlia's XOR structure (like a generalized hypercube)
See "The Impact of DHT Routing Geometry on Resilience and Proximity" for more detail on DHT comparisons
No silver bullet: DHTs are very nice for exact-match lookups, but not for everything (next few slides)

Writable, Persistent p2p
Do you trust your data to 100,000 monkeys?
Node availability hurts
Ex: store 5 copies of the data on different nodes
When someone goes away, you must re-replicate the data they held
Hard drives are *huge*, but cable-modem upload bandwidth is tiny: perhaps 10 GB/day
It takes many days to upload the contents of a 200 GB hard drive (about 20 at that rate). A very expensive leave/replication situation! (See the back-of-the-envelope sketch below.)
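A back-of-the-envelope version of that cost, using the slide's assumed numbers:

```python
# Rough cost of re-replicating a departed node's data over a cable modem.
drive_gb = 200              # data the departing node held
upload_gb_per_day = 10      # assumed upstream budget
print(drive_gb / upload_gb_per_day, "days to push one replacement copy")   # 20.0
```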

When Are p2p/DHTs Useful?
Caching and soft-state data
Works well! BitTorrent, KaZaA, etc., all use peers as caches for hot data
Finding read-only data
Limited flooding finds hay
DHTs find needles
BUT...

A Peer-to-peer Google?
Complex intersection queries ("the" + "who")
Billions of hits for each term alone
Sophisticated ranking
Must compare many results before returning a subset to the user
Very, very hard for a DHT/p2p system
Need high inter-node bandwidth
(This is exactly what Google does: massive clusters)

