Where To Leave The Data?
– Parallel systems
– Scalable Distributed Data Structures
– Distributed Hash Tables (P2P)
Introduction
• Parallel machines are quite common and affordable
3. Locating all tuples such that the value of a given attribute lies within a
specified range – range queries.
– E.g., 10 ≤ r.A < 25.
Comparison of Partitioning Techniques (Cont.)
Round robin:
• Advantages
– Best suited for sequential scan of entire relation on each query.
– All disks have almost an equal number of tuples; retrieval work is
thus well balanced between disks.
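To make the round-robin rule concrete, here is a minimal Python sketch (the function name is illustrative, not from the slides): the i-th inserted tuple goes to disk i mod n, which is why every disk ends up with nearly the same number of tuples.

```python
def round_robin_disk(tuple_index: int, n_disks: int) -> int:
    """Send the i-th inserted tuple to disk i mod n."""
    return tuple_index % n_disks

# With 4 disks, tuples 0..9 land on disks 0, 1, 2, 3, 0, 1, 2, 3, 0, 1:
print([round_robin_disk(i, 4) for i in range(10)])
```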
Hash partitioning:
• Good for sequential access
– Assuming hash function is good, and partitioning attributes form a
key, tuples will be equally distributed between disks
– Retrieval work is then well balanced between disks.
• Good for point queries on partitioning attribute
– Can lookup single disk, leaving others available for answering
other queries.
– Index on partitioning attribute can be local to disk, making lookup
and update more efficient
• No clustering, so difficult to answer range queries
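A minimal sketch of hash partitioning (names are illustrative; Python's built-in hash stands in for the partitioning hash function): the disk depends only on the partitioning attribute's value, so a point query on that attribute reads one disk, while a range query may have to scan them all.

```python
def hash_disk(attr_value, n_disks: int) -> int:
    """Disk chosen purely from the partitioning attribute's hash value."""
    return hash(attr_value) % n_disks

# Point query A = 42: only one disk holds matching tuples.
print(hash_disk(42, n_disks=8))

# Range query 10 <= A < 25: hashing scatters consecutive values,
# so in general every disk may hold matches and all must be scanned.
print(sorted({hash_disk(v, 8) for v in range(10, 25)}))
```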
Comparison of Partitioning Techniques (Cont.)
Range partitioning:
• Provides data clustering by partitioning attribute value.
• Good for sequential access
• Good for point queries on partitioning attribute: only one disk needs
to be accessed.
• For range queries on partitioning attribute, one to a few disks may
need to be accessed
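A minimal sketch of range partitioning (the partitioning-vector values are illustrative): contiguous attribute ranges map to disks, so a point query reads one disk and a range query reads only the disks whose ranges overlap the query.

```python
from bisect import bisect_right

# Illustrative partitioning vector: disk 0 holds A < 10, disk 1 holds
# 10 <= A < 25, disk 2 holds 25 <= A < 50, disk 3 holds A >= 50.
VECTOR = [10, 25, 50]

def range_disk(attr_value) -> int:
    """Disk for a single attribute value (point query: one disk)."""
    return bisect_right(VECTOR, attr_value)

def disks_for_range(lo, hi):
    """Range query lo <= A <= hi touches only the disks overlapping [lo, hi]."""
    return list(range(range_disk(lo), range_disk(hi) + 1))

print(range_disk(30))           # point query A = 30 -> disk 2 only
print(disks_for_range(12, 30))  # range query -> disks [1, 2], the rest stay idle
```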
• If a relation contains only a few tuples which will fit into a single disk
block, then assign the relation to a single disk.
• Large relations are preferably partitioned across all the available disks.
• Basic idea of virtual processor partitioning (used to handle skew):
– If any normal partition would have been skewed, it is very likely the
skew is spread over a number of virtual partitions
– Skewed virtual partitions get spread across a number of processors, so
work gets distributed evenly!
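A minimal sketch of the virtual-partition idea (all numbers are illustrative): create many more virtual partitions than processors and map them round-robin onto processors; skewed virtual partitions then tend to land on different processors.

```python
import random
from collections import Counter

random.seed(0)

N_PROCESSORS = 4
N_VIRTUAL = 32          # many more virtual partitions than real processors

def processor_for(virtual_partition: int) -> int:
    """Map virtual partitions onto real processors round-robin."""
    return virtual_partition % N_PROCESSORS

# Simulate skew: a few virtual partitions get far more tuples than the rest.
tuples_per_virtual = [100] * N_VIRTUAL
for hot in random.sample(range(N_VIRTUAL), 3):
    tuples_per_virtual[hot] = 2000

load = Counter()
for vp, n_tuples in enumerate(tuples_per_virtual):
    load[processor_for(vp)] += n_tuples
print(load)   # the hot virtual partitions likely land on different processors
```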
• Scalable Distributed Data Structures
– http://ceria.dauphine.fr/witold.html
– Or the LH* ACM-TODS paper (Dec. 96)
An SDDS: growth through splits under inserts
(Figure: a sequence of animation frames showing servers and clients as the file grows onto more servers through splits under inserts.)
Known SDDSs
(Figure: a taxonomy of data structures (DS) leading to the SDDS Classics (1993): LH* (a classic) and the variants LH*m, LH*g, LH*SA, LH*s, LH*RS; one branch is labelled Security.)
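Since LH* is the classic SDDS named above, here is a minimal sketch of its client-side addressing rule as I read it from the cited LH* paper (one initial bucket assumed; the function names are mine): the client keeps a possibly outdated image (i', n') of the file level and split pointer, and a wrongly addressed server forwards the request and returns an image adjustment message.

```python
def h(j: int, key: int) -> int:
    """Linear-hashing function family: h_j maps a key into buckets 0 .. 2**j - 1."""
    return key % (2 ** j)

def client_address(key: int, i_prime: int, n_prime: int) -> int:
    """LH*-style client addressing using the client's image (i', n').

    The image may lag behind the file's real state; if it does, the contacted
    server forwards the request and replies with an image adjustment message.
    """
    a = h(i_prime, key)
    if a < n_prime:              # bucket a has already split at level i'
        a = h(i_prime + 1, key)
    return a

# A client whose image is i' = 2, n' = 1 sends key 4 to bucket h_3(4) = 4,
# but key 5 to bucket h_2(5) = 1.
print(client_address(4, i_prime=2, n_prime=1))   # 4
print(client_address(5, i_prime=2, n_prime=1))   # 1
```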
Chord: circular ID space
(Figure: nodes N5, N10, N20, N32, N40, N60, N80, N99, N100, N110 on a circular ID space, with keys such as K11, K30, K100; each key is stored at its successor node, e.g. K100 at N100. A lookup asks "Where is key 50?" and gets the answer "Key 50 is at N60", N60 being the successor of 50.)
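A minimal sketch of the successor rule from the figure above (integer IDs; node and key IDs taken from the figure): a key is stored on the first node whose ID is at or after the key's ID, wrapping around the ring.

```python
from bisect import bisect_left

def successor(key_id, node_ids):
    """First node at or after the key's ID, wrapping around the ring."""
    nodes = sorted(node_ids)
    i = bisect_left(nodes, key_id)
    return nodes[i % len(nodes)]      # wrap around past the largest node ID

nodes = [5, 10, 20, 32, 40, 60, 80, 99, 100, 110]
print(successor(100, nodes))   # K100 -> N100
print(successor(50, nodes))    # K50  -> N60 ("Key 50 is at N60")
print(successor(112, nodes))   # wraps around the ring -> N5
```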
(Figure: each node keeps a short successor list, e.g. N10: 20, 32, 40; N20: 32, 40, 60; N32: 40, 60, 80; N99: 110, 5, 10; N110: 5, 10, 20.)
(Figure: finger pointers at power-of-two fractions of the ring: 1/8, 1/16, 1/32, 1/64, 1/128, shown near N80.)
Chord lookups take O(log N) hops
(Figure: Lookup(K19) on a ring with nodes N5, N10, N20, N32, N60, N80, N99, N110: the lookup starting at N32 is routed via finger pointers to N20, the successor of K19.)
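A minimal sketch of Chord-style finger routing (my own illustration of why lookups take O(log N) hops; the ring size and helper names are assumptions, the node IDs are taken from the figures): each node keeps fingers at power-of-two distances, and a lookup repeatedly jumps to the closest finger that precedes the key, roughly halving the remaining distance at each hop.

```python
M = 7                  # 2**M = 128 IDs, enough for node IDs like N110
RING = 2 ** M

def in_interval(x, a, b):
    """True if x lies in the half-open circular interval (a, b]."""
    a, b, x = a % RING, b % RING, x % RING
    if a < b:
        return a < x <= b
    return x > a or x <= b        # interval wraps past 0 (or covers the whole ring)

def successor(i, nodes):
    """First node at or after ID i, wrapping around the ring."""
    for n in sorted(nodes):
        if n >= i % RING:
            return n
    return min(nodes)

def fingers(n, nodes):
    """Finger k of node n points at successor(n + 2**k)."""
    return [successor((n + 2 ** k) % RING, nodes) for k in range(M)]

def lookup(start, key, nodes):
    """Route a lookup for `key` from node `start`; return the list of hops."""
    hops, current = [start], start
    while not in_interval(key, current, successor(current + 1, nodes)):
        nxt = current
        for f in fingers(current, nodes):
            if in_interval(f, current, key - 1):   # f precedes the key
                nxt = f                            # keep the closest such finger
        if nxt == current:
            break
        current = nxt
        hops.append(current)
    hops.append(successor(key, nodes))             # final hop: the key's successor
    return hops

nodes = [5, 10, 20, 32, 40, 60, 80, 99, 110]
print(lookup(32, 19, nodes))   # [32, 99, 5, 10, 20]: ends at N20, K19's successor
```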