Transcript Document
CERIA Laboratory

LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

W. LITWIN, Paris Dauphine University
T. SCHWARZ, Santa Clara University (USA)
H. YAKOUBEN, Paris Dauphine University

Plan
• Objective
• Overview: SDDS & P2P
• LH*RSP2P: Architecture, Addressing, Properties, Churn Management
• Conclusion

Objective
• Design a new SDDS for a structured P2P environment
• Very large scalable files
• At most one forwarding message for key search or insert, or scan (fastest known performance)
• Highly available data structure (LH*RS) for the treatment of churn

SDDS (1993)
• A file of records identified by keys
• SDDS client nodes face the applications and send queries to SDDS server nodes
• No centralized addressing
• Servers contain application or parity data, in buckets
• Overflowing servers split onto new servers
• Servers do not notify clients about splits

SDDS (1993)
• Clients use images of the file state for addressing: key-based queries, range queries, scans, …
• Images get adjusted towards the file state during queries by Image Adjustment Messages (IAMs), triggered by incorrect addressing by the client
• IAMs reflect the file evolution by splits or, rarely, merges
• IAMs also reflect the location changes caused by failures and recovery

SDDS Typology
[Diagram labels] Data Structures: Classics; SDDS (1993)
Hash: 1-dimensional: LH*, DDH, EH*, CHORD, ...; d-dimensional: IH*, …
Tree: 1-d tree: RP*; m-d tree: k-RP*, SD-Rtree, DRT*, BATON, VBI-Tree
Structured P2P Schemes
High Availability: LH*m, LH*g; k-Availability: LH*rs
Security: LH*sa, LH*s; Alg.
Sign…

SDDS Expansion
• Growth through splits under inserts
[Figure labels: Peer, New Peer, Clients]

SDDS Client Image Evolution
[Figure label: Clients]

SDDS 2007 Prototype
• Available at the CERIA site
• Announced at DbWorld
• Manages LH*RS and RP* files
• In distributed RAM
• Under Windows
• Over 1 Gbit/s Ethernet
• Various functions
• Response time reaching 30 microseconds
• Up to 300 times faster than disk files

P2P (1995 ?)
• Autonomous nodes store and search data
• By flooding in early systems: Freenet, Napster, Gnutella…
• Structured P2P schemes reduce the flooding, using decentralized data structures
• Distributed Hash Tables (DHT) especially
• Few people know that the concept is due to R. Devine (FODO 93)
• Chord, P-tree, VBI, Baton…
• Structured P2P schemes are specific SDDS schemes

LH*RSP2P Addressing
Global Addressing Rule
a ← h_i(C) ;
if a < n then a ← h_{i+1}(C) ;
/* a is the address of the peer destination of the key C */
/* (i, n) is the state of the SDDS file; it is known only to the file coordinator node */
h_i(C) = C mod 2^i

Client Address Calculus
a’ ← h_{i’}(C) ;
if a’ < n’ then a’ ← h_{i’+1}(C) ;
/* a’ is the address of the peer destination of the key C */

LH*RSP2P File Expansion
• The file starts with i = 0 and n = 0 and a single data bucket 0
• Every bucket m keeps its bucket level j, the index of the hash function h_j last used to split it; j = 0 initially
• Overflowing bucket m alerts the coordinator
• The coordinator notifies bucket n to split
• Bucket n applies h_{i+1}
• About half of the keys migrate to the new bucket n + 2^i
• Bucket n and the new one set j = j + 1
• The coordinator performs n = n + 1 ; if n = 2^i then i = i + 1 and n = 0

LH*RSP2P Architecture based on LH*RS
[Figure labels] LH*RSP2P Peer: Server Part (LH*RS DB, LH*RS PB; bucket level j; Pupils) and Client Part (LH*RS Client; image i’, n’). Candidate Peer: LH*RS Client & Spare Storage. LH*P2P.

Peer & Pupil Image Adjustment During Peer Split
i’ = j ; /* j is the bucket level before the split, i.e. the new level minus one */
n’ = a + 1 ; /* a is the splitting bucket */
if n’ = 2^i’ then i’ = j + 1 ; n’ = 0 ;

Example
i’ = j = 1 ; n’ = m + 1 = 1 + 1 ;
if n’ = 2^1 then n’ = 0 ; i’ = i’ + 1, and (i’, n’) = (2, 0)
[Figure: peers P0, P1, P2 before the split (Coordinator Peer CP: i = 1, n = 1) and peers P0, P1, P2, P3 after the split (CP: i = 2, n = 0), each shown with its bucket level j and its image (i’, n’)]

Server Address Calculus
a’ ← h_j(C) ;
if a’ = a then exit ; /* bucket a is the correct one */
else send C to bucket a’ ; exit ; /* forwarding to bucket a’ */
• Simpler and faster than for LH*, as only one forwarding is possible

Peer Image Adjustment by IAM
• The IAM comes from the correct bucket
• Bucket a is the forwarding one
• Bucket level j is that of the correct bucket, and of the forwarding one as well
i’ ← j − 1 ; n’ ← a + 1 ;
if n’ ≥ 2^i’ then n’ ← 0 ; i’ ← i’ + 1 ;
• Same algorithm as for the adjustment of the local client and of pupils after a split

Peer Image Adjustment by IAM
• Checking and forwarding the key using A2
[Figure: key 9 is sent and forwarded; the IAM carries a = 1, j = 4. Peers: P0 (j = 4, i’ = 3, n’ = 1), P1 (j = 4, i’ = 3, n’ = 2), P4 (j = 3, i’ = 2, n’ = 1), P9 (j = 4, i’ = 3, n’ = 2). Coordinator PC: i = 3, n = 2]
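The address calculus and the IAM adjustment on these slides can be traced with a short Python sketch. It is only an illustration under stated assumptions, not the authors' implementation: the function names (h, client_address, server_check, adjust_image) are hypothetical, and the wrap-around test is written as n’ ≥ 2^i’, the form usual for LH* images. The numbers are those of the IAM example: key 9, a sender whose image is (i’, n’) = (2, 1), and bucket 1 with level j = 4.

```python
def h(level: int, key: int) -> int:
    """Linear-hashing function h_level(C) = C mod 2^level."""
    return key % (2 ** level)


def client_address(key: int, i_img: int, n_img: int) -> int:
    """Client address calculus with the peer's image (i', n')."""
    a = h(i_img, key)
    if a < n_img:
        a = h(i_img + 1, key)
    return a


def server_check(key: int, a: int, j: int):
    """Server address calculus at bucket a with bucket level j.

    Returns None when bucket a is the correct one, otherwise the
    address of the bucket the key is forwarded to.
    """
    target = h(j, key)
    return None if target == a else target


def adjust_image(j: int, a: int):
    """Image adjustment from an IAM naming forwarding bucket a, level j."""
    i_img, n_img = j - 1, a + 1
    if n_img >= 2 ** i_img:  # assumption: wrap-around test written as >=
        n_img, i_img = 0, i_img + 1
    return i_img, n_img


# Values of the IAM example: key 9, sender image (2, 1), bucket 1 at level 4.
first_hop = client_address(9, 2, 1)          # address computed from the image
forward_to = server_check(9, first_hop, 4)   # bucket 1 checks the key
new_image = adjust_image(4, first_hop)       # sender applies the IAM
print(first_hop, forward_to, new_image)      # prints: 1 9 (3, 2)
```

With the adjusted image (3, 2), client_address(9, 3, 2) returns 9 directly, so the same key would then reach its bucket without forwarding. The same adjust_image call also reproduces the split example: adjust_image(2, 1), with j taken after the split, gives (2, 0).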
Peer Image Adjustment by IAM
[Figure: state after the IAM (a = 1, j = 4) for key 9. Peers: P0 (j = 4, i’ = 3, n’ = 1), P1 (j = 4, i’ = 3, n’ = 2), P4 (j = 3, i’ = 3, n’ = 2), P9 (j = 4, i’ = 3, n’ = 2). Coordinator PC: i = 3, n = 2]

LH*RSP2P TUTOR, Example of the File Expansion
• Update pupil peers
• Assign a tutor for a candidate peer: LH-hash of the client IP address
[Figure: peers P0, P2, P5 and the new peer P6, each with its bucket level j and image (i’, n’); a candidate peer becoming a pupil; coordinator PC state going from (i = 2, n = 2) to (i = 2, n = 3)]

Properties of LH*RSP2P
1. The maximal number of forwarding messages for the key search is one.
2. The maximal number of rounds for the scan search is two.
3. The worst-case addressing performance of LH*RSP2P, as defined by Property 1, is the fastest possible for any SDDS or practical structured P2P addressing scheme.

Proof Property 1
Case 1: i’ = i and n’ < n
• Peer a addresses peer a’, using its image (i’, n’) from its last split
• No IAM came since
• Forwarding is possible for any address a’ between a and n; there is no forwarding otherwise
[Figure: address axis 0, a, n, a’, 2^i’, a + 2^i’, n + 2^i’, with bucket levels j = i’ + 1 and j = i’]

Proof Property 1
Case 2: i = i’ + 1 and n < n’
• Peer a addresses peer a’, using its image (i’, n’) from its last split
• No IAM came since
[Figure: address axis 0, n, a, 2^i’, a’, 2^(i’+1), n + 2^(i’+1), with bucket levels j = i’ + 2, j = i’ + 1 and j = i’]
• Forwarding is possible for any address a’ beyond [n, a]

Proof Property 2
• Peer a sends the scan to all buckets in its image, including its image (i’, n’) in the message
• The receiving peer a’ can have the bucket level j as in the image: j(a) = j’(a); no forwarding of the scan
• Or bucket a’ has split, once and only once: j(a) = j’(a) + 1
• See the figures for the key address calculus

Proof Property 2
• Peer a’ forwards the scan to its (only) child
• No child can have a child: peer a would first need to split again as well
• Every peer thus gets the scan, and only once
• There are at worst two rounds

Proof Property 3
• The only faster worst-case performance is zero forwarding messages
• Every split would then have to be notified to every peer
• That would be against the scalability goal of every SDDS and structured P2P scheme

LH*RSP2P Churn Management
• A bucket reliability group with k parity buckets protects against up to k bucket failures per group
[Figure: data records and tutoring records at ranks 0 to 5 in the data peers, with the corresponding parity records in the parity peers]

LH*RSP2P Churn Management
Peer leaves with notice
[Figure labels: Notification; "Say that’s OK"; Coordinator Peer; peers P0 … Pl, Pm, each with j and (i’, n’); Candidate Peer]

LH*RSP2P Churn Management
Peer leaves without notice or fails
• LH*RS bucket recovery
[Figure labels: Query, Forward, then Answer; Coordinator Peer; peers Pl-1, Pl, Pm, each with j and (i’, n’); Parity Peer]

LH*RSP2P Churn Management
Sure Search: protects against an outdated server read (transient communication or peer failure)
[Figure labels: Query, Answer; Coordinator Peer; peers Pl-1, Pl, Pm, each with j and (i’, n’); Parity Peer]

Conclusion
• LH*RSP2P requires at most one forwarding message when an addressing error occurs
• It is the fastest known SDDS and P2P key-based addressing algorithm
• It protects efficiently against churn
• It allows managing very large scalable files
• It should have numerous applications

Current & Future Work
• Implementation of the peer node architecture and the tutoring functions
• Using the existing LH*RS prototype, created in 2004 by Rim Moussa and shown at VLDB
• Performance analysis
• Variants

END
Thank you for your attention
Work partly funded by the IST eGov-Bus project

References
[1] Crainiceanu A., Linga P., Gehrke J., Shanmugasundaram J. Querying Peer-to-Peer Networks Using P-Trees. Proc. of the Seventh International Workshop on the Web and Databases (WebDB 2004), June 2004.
[2] Bolosky W. J., Douceur J. R., Howell J. The Farsite Project: A Retrospective. Operating Systems Review, April 2007, p. 17-26.
[3] Devine R. Design and Implementation of DDH: A Distributed Dynamic Hashing Algorithm. Proc. of the 4th Intl. Foundation of Data Organisation and Algorithms (FODO), 1993.
[4] Litwin W., Neimat M-A., Schneider D. LH*: Linear Hashing for Distributed Files. ACM SIGMOD Intl. Conf. on Management of Data, 1993.
[5] Litwin W., Neimat M-A., Schneider D. LH*: A Scalable Distributed Data Structure. ACM TODS, Dec. 1996.
[6] Litwin W., Neimat M-A. High Availability LH* Schemes with Mirroring. Intl. Conf. on Cooperating Systems, IEEE Press, 1996.
[7] Litwin W., Moussa R., Schwarz T. LH*rs: A Highly Available Distributed Data Storage. Proc. of the 30th VLDB Conference, 2004.
[8] Litwin W., Moussa R., Schwarz T. LH*rs: A Highly Available Scalable Distributed Data Structure. ACM TODS, Sept. 2005.
[9] Gribble S. D., Brewer E. A., Hellerstein J. M., Culler D. Scalable, Distributed Data Structures for Internet Service Construction. Proc. of the Fourth Symposium on Operating Systems Design and Implementation (OSDI 2000).
[10] Stoica, Morris, Karger, Kaashoek, Balakrishnan. Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications. SIGCOMM’01, August 27-31, 2001.