Transcript of Talk Slides
Canon in G Major: Designing DHTs with Hierarchical Structure
Prasanna Ganesan (Stanford University), Krishna Gummadi (U. of Washington), Hector Garcia-Molina (Stanford University)

Distributed Hash Table (DHT)
• Hash table over a dynamic set of nodes
  – Insert(key), Lookup(key)
  – Join(node), Leave(node)
• Partition the hash space by node ID
• "Flat" overlay network
  – O(log n) operations
  – Homogeneous components
  – No central point of failure
[Diagram: flat ring overlay of nodes]

Why Hierarchies?
• Hierarchies exist!
• Scalability, fault isolation, autonomy, etc.
• Goal: inherit the best of both worlds with a hierarchical DHT

What is a Hierarchical DHT?
[Diagram: hierarchy with USA at the root, Stanford and Washington below it, and CS and EE below Stanford]
• Path locality: efficiency, security, fault isolation
• Path convergence: caching, bandwidth optimization
• Local DHTs: access control

Problem Statement
• Convert flat DHTs to hierarchical (Canonization)
  – Chord → Crescendo
  – CAN → Can-Can
  – Symphony → Cacophony
  – Kademlia → Kandy
• Caveat: preserve homogeneity and the state vs. routing trade-offs of flat DHTs

Roadmap
• Chord and Crescendo
• Routing in Crescendo
• Storage and Caching
• Experimental Evaluation

Chord
• Circular N-bit ID space
• Node x links to succ(x + 2^i)
[Diagram: Chord ring with nodes at binary IDs]

Crescendo
• Key idea: recursive structure
  – Construct bottom-up; merge smaller DHTs
  – Lowest level: Chord
[Diagram: CS and EE rings merging into a Stanford ring]

Merging Two Chord Rings
• Black node x connects to (blue) node y iff:
  – y = succ(x + 2^i) for some i, and
  – y is closer to x than any other black node
• (The Chord finger rule, this merge rule, and the routing are sketched in code after the Conclusions slide.)
[Diagram: a black ring and a blue ring merged on one identifier circle]

Crescendo Details
• Generalize the two-ring merge
  – Merging multiple rings
  – Multi-level hierarchies
• Making it incremental
  – A new node joins bottom-up
• How many links per node? With m black nodes and n blue nodes:
  – log m links in a node's own ring + log(n/m + 1) cross links = log(m + n) total
  – Roughly log n; independent of the hierarchy

Routing in Crescendo
• Greedy clockwise routing!
• Path locality by greedy routing
• Path convergence at the closest same-color node to the destination
• Local DHTs by construction!
[Diagram: greedy routing over the merged rings]

Extensions
• Can apply to other DHTs
• Mix-and-match DHTs
  – Stanford runs Chord, Washington runs Kademlia
• Additional support for proximity

Storage and Caching
• Hierarchical storage
  – Specify a storage domain (subtree)
• Access control provided by the routing itself
• Caching: cache at the convergent nodes at each level
  – The nodes form a distributed cache

Experiments
• c-level hierarchy
  – Uniform/Zipf distribution at each level
• Basic metrics
  – #links / node
  – #hops / query

Number of Links vs. Number of Levels
[Plot: number of links per node vs. number of nodes (1,000 to 100,000), for Chord and for hierarchies with 2 to 5 levels]

Levels vs. Routing Hops
[Plot: number of routing hops vs. number of nodes (1,000 to 100,000), for Chord and for hierarchies with 2 to 4 levels]

Path Locality
• GT-ITM topology as the hierarchy
• Compare Crescendo with Chord
  – With proximity adaptation
• Path locality
  – Latency to destinations at different levels
[Plot: latency (ms) vs. query locality (Top, First, Second, Third, Fourth), for Chord (Prox.), Crescendo, and Crescendo (Prox.)]

Conclusions
• Generic mechanism for hierarchical DHTs
  – Locality: fault isolation, security, efficiency
  – Convergence: caching, bandwidth savings
  – Local DHTs: hierarchical access control
  – Preserves degree vs. routing trade-offs
• Potentially useful in large-scale deployment
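
Code sketch (not from the talk): Chord fingers, the Crescendo merge rule, and greedy routing

The Python below is a minimal sketch of the three mechanisms as they read on the slides above; it is not the authors' implementation. All identifiers (`chord_fingers`, `crescendo_merge_links`, `greedy_route`), the 4-bit ring size, and the example node IDs are invented for illustration.

```python
# Minimal sketch (not from the talk) of the Chord finger rule, the Crescendo
# two-ring merge rule, and greedy clockwise routing over the merged rings.

N_BITS = 4                      # 2^N_BITS positions on the identifier circle
RING = 1 << N_BITS

def clockwise_dist(a, b):
    """Clockwise distance from position a to position b on the ring."""
    return (b - a) % RING

def succ(nodes, point):
    """First node at or after `point`, going clockwise."""
    return min(nodes, key=lambda n: clockwise_dist(point, n))

def chord_fingers(x, ring_nodes):
    """Chord: node x links to succ(x + 2^i) within its own ring."""
    return {succ(ring_nodes, (x + (1 << i)) % RING)
            for i in range(N_BITS)} - {x}

def crescendo_merge_links(x, own_ring, other_ring):
    """Crescendo merge rule as read from the slides: node x of `own_ring`
    links to node y of `other_ring` iff y = succ(x + 2^i) in the merged ring
    for some i, and y is closer to x than any other node of x's own ring."""
    own_succ_dist = min(clockwise_dist(x, n) for n in own_ring if n != x)
    merged = own_ring | other_ring
    links = set()
    for i in range(N_BITS):
        y = succ(merged, (x + (1 << i)) % RING)
        if y in other_ring and clockwise_dist(x, y) < own_succ_dist:
            links.add(y)
    return links

def greedy_route(src, key, links):
    """Greedy clockwise routing: each hop forwards to the neighbor that gets
    closest to `key` without stepping past it; returns the path taken."""
    path, x = [src], src
    while True:
        remaining = clockwise_dist(x, key)
        forward = [y for y in links[x] if clockwise_dist(x, y) <= remaining]
        if remaining == 0 or not forward:
            return path
        x = min(forward, key=lambda y: clockwise_dist(y, key))
        path.append(x)

# Two small organisations' rings (node IDs loosely follow the merge diagram).
black = {0, 3, 8, 13}
blue = {2, 5, 10, 12}

# Every node keeps its Chord fingers plus its Crescendo cross links.
links = {x: chord_fingers(x, black) | crescendo_merge_links(x, black, blue)
         for x in black}
links.update({x: chord_fingers(x, blue) | crescendo_merge_links(x, blue, black)
              for x in blue})

print(crescendo_merge_links(0, black, blue))   # node 0 gains only the link to 2
print(greedy_route(0, 12, links))              # e.g. [0, 8, 12]
```

In this toy example node 0 gains only the cross link to node 2, the blue node lying between 0 and its own black successor 3; that behaviour is what keeps the per-node link count near log of the total number of nodes, as quoted on the "Crescendo Details" slide. The routed path [0, 8, 12] stays clockwise and never overshoots the key, which is the property the slides' path-locality and path-convergence claims build on.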
Number of Links vs. Number of Levels
[Plot: fraction of nodes vs. number of links per node (0 to 25), for hierarchies with 1, 2, 3, and 5 levels]

Path Convergence
• Overlap fraction = Length(overlap) / Length(total)
  – Length measured as latency or as number of hops

Overlap Fraction
[Plot: overlap fraction vs. domain level (Top, Level 1 to Level 4), for Chord and Crescendo, measured by latency and by hops]

Caching
• Cache on convergent nodes at each level
  – Nodes in the level form a distributed cache
• Advantages
  – Exploit locality of reference in the hierarchy
  – Queries guaranteed to hit the cache
  – Interesting cache-replacement possibilities
• (A caching sketch follows below.)
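
Code sketch (not from the talk): caching at convergent nodes

This is a hedged model of the caching idea on the last slide, not the authors' design. It assumes the path-convergence property from the talk, i.e., that within any domain all queries for a key pass through one convergent node per level, and it stands in for that node's cache with a plain dictionary. The names (`Domain`, `lookup_with_cache`, the example hierarchy, and the key "k") are invented for illustration.

```python
# Toy model of hierarchical caching at convergent nodes (not from the talk).

class Domain:
    """One domain in the hierarchy; `cache` stands in for the cache held at
    this domain's convergent node for the queried key."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.cache = {}

def lookup_with_cache(key, domain, home_store):
    """Walk up the hierarchy; the first level whose convergent node has the
    key cached answers the query, otherwise the key's home node does.
    Caches that missed are then populated, so later queries from nearby
    domains stop at a lower level."""
    missed, d = [], domain
    while d is not None:
        if key in d.cache:
            value, answered_by = d.cache[key], d.name
            break
        missed.append(d)
        d = d.parent
    else:
        value, answered_by = home_store[key], "home node"
    for d in missed:                 # populate the caches on the path back
        d.cache[key] = value
    return value, answered_by

usa = Domain("USA")
stanford = Domain("Stanford", parent=usa)
cs = Domain("Stanford/CS", parent=stanford)
ee = Domain("Stanford/EE", parent=stanford)

home = {"k": "value"}
print(lookup_with_cache("k", cs, home))   # miss at every level: home node answers
print(lookup_with_cache("k", ee, home))   # answered at the shared Stanford level
```

Because every query for the same key from EE funnels through the shared Stanford-level convergent node, the second lookup is answered one level up instead of by the key's home node; under this simplified model that is the "queries guaranteed to hit the cache" behaviour listed on the slide.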