Data on Air: Organization and Access
T. Imielinski, S. Viswanathan, and B.R. Badrinath
Presented by Qinhai Xia
Motivation
Power conservation:
• Processor: AT&T Hobbit chip
Active mode: 250 mW; doze mode: 50 µW
• Broadcasting: Quotrex system over FM channel
Parameters of Concern
• Tuning time: T1 + T2 + T3 + … + Tn
• Latency: Tn - T1
[Figure: timelines of the channel and the client, marking T1, T2, T3, …, Tn]
Some Definitions
• Bucket
• Bcast
• Index segment
• Data segment
Bucket fields:
• Bucket ID
• Bcast pointer
• Index pointer
• Bucket type
[Figure: one bcast laid out as an index segment (buckets I0, I1, I2) followed by a data segment (buckets D3 to D9)]
Latency OPT
[Figure: the file broadcast without any index, between the previous bcast and the next bcast]
Latency is the best: no overhead for index
Latency = Data/2 + C
Tuning time = Data/2 + C
Tuning OPT
[Figure: the index and the file broadcast once each, between the previous bcast and the next bcast]
Tuning is the best:
Latency = (Data + Index) / 2 + (Data + Index) / 2 + C
= Data + Index + C
Tuning time = k + C
k: number of levels in the index tree
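As a rough check of the two extremes, the sketch below plugs the Quotrex figures from the Real Example slide into both pairs of formulas. The index size (53 buckets), the tree depth (k = 4), C = 0, and the 0.1 s bucket time are assumptions inferred from that slide's table; none of them is stated in the slides, and the function names are illustrative.

```python
# Quotrex figures from the Real Example slide.
DATA = 160_000 // 128   # 1250 data buckets of 128 bytes each

# Assumptions (not stated in the slides; chosen to match the example table).
INDEX = 53       # index size in buckets
K = 4            # number of levels in the index tree
C = 0            # the constant C in the formulas
BUCKET_S = 0.1   # seconds needed to broadcast one bucket


def latency_opt(data, c):
    """No index: latency and tuning time are both Data/2 + C (in buckets)."""
    cost = data / 2 + c
    return cost, cost


def tuning_opt(data, index, k, c):
    """Index broadcast once per bcast: (latency, tuning time) in buckets."""
    return data + index + c, k + c


for name, (latency, tuning) in {
    "Latency_OPT": latency_opt(DATA, C),
    "Tuning_OPT": tuning_opt(DATA, INDEX, K, C),
}.items():
    print(f"{name}: latency {latency * BUCKET_S:.1f} s, "
          f"tuning time {tuning * BUCKET_S:.1f} s")
```

Under these assumptions the printed values agree with the Latency_OPT and Tuning_OPT rows of the example table (62.5 s and 62.5 s; 130.3 s and 0.4 s).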
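(1, m) Indexing
[Figure: between the previous bcast and the next bcast, the file is split into m parts (Data 1, Data 2, …, Data m) and a copy of the index (Index 1, Index 2, …, Index m) is broadcast before each part. Labels on the client timeline: Tune in, Next Index Pointer, Continuous, Retrieving. Client states in order: Active, Doze, Active, Doze, Active.]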
(1, m) Indexing, Continued
Analysis:
Latency = (Index + Data/m)/2 + (m*Index + Data)/2 + C
= ((m+1) * Index + (1/m + 1) * Data) / 2 + C
Tuning Time = 1 + k + C
k: number of levels in the index tree
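The latency formula above can be minimized over m: setting its derivative with respect to m to zero gives Index - Data/m^2 = 0, so m = sqrt(Data/Index). The sketch below evaluates this, assuming Data = 1250 buckets (the Quotrex file from the Real Example slide) and a hypothetical index size of 53 buckets with C = 0; the function names are illustrative only.

```python
import math


def one_m_latency(data, index, m, c=0):
    """Latency of (1, m) indexing in buckets, from the slide's formula."""
    return ((m + 1) * index + (1 / m + 1) * data) / 2 + c


def one_m_tuning(k, c=0):
    """Tuning time of (1, m) indexing in buckets."""
    return 1 + k + c


def best_m(data, index):
    """Real-valued m that minimizes one_m_latency (derivative set to zero)."""
    return math.sqrt(data / index)


DATA, INDEX = 1250, 53   # INDEX is an assumed value, not given in the slides

print(f"best m is about {best_m(DATA, INDEX):.2f}")
for m in (4, 5, 6):
    print(m, one_m_latency(DATA, INDEX, m))
```

With these assumed sizes the real-valued optimum lies between 4 and 5, and m = 5 gives the lowest latency among the neighbouring integers.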
Distributed Indexing
• Nonreplicated Distribution:
Different index segments are disjoint
• Entire Path Replication:
The path from the root to an index bucket B is replicated
• Partial Path Replication (Distributed Indexing):
Between two index buckets B and B', it is enough to replicate just the path from their least common ancestor
Comparison
Nonreplicated Distribution:
R a1 b1 c1 c2 c3 0-8 b2 c4 c5 c6 9-17 b3 c7 c8 c9 18-26
Full Path Replication:
R a1 b1 c1 c2 c3 0-8 R a1 b2 c4 c5 c6 9-17 R a1 b3 c7 c8 c9 18-26
Partial Path Replication (Distributed Indexing):
R a1 b1 c1 c2 c3 0-8 a1 b2 c4 c5 c6 9-17 a1 b3 c7 c8 c9 18-26
R a2 b4 c10 c11 c12 27-35 a2 b5 c13 c14 c15 36-44 a2 b6 c16 c17 c18 45-53
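The three layouts can be generated mechanically, which makes the difference between the schemes easier to see. The sketch below is illustrative only: it assumes the example tree has fanout 3 at every level (R, then a, b, c nodes, then 9 data buckets per b subtree), and for the nonreplicated case it assumes each upper-level node is broadcast once, just before its first use. The function names are hypothetical.

```python
FANOUT = 3  # assumed: every index node in the example tree has 3 children


def index_prefix(i, scheme):
    """Upper-level index buckets sent just before subtree b_i (i >= 1)."""
    a = (i - 1) // FANOUT + 1              # parent of b_i is a_a
    first_child = (i - 1) % FANOUT == 0    # b_i is the first child of a_a
    if scheme == "full":
        return ["R", f"a{a}"]              # whole path from the root, every time
    if scheme == "partial":
        # Path from the least common ancestor of b_(i-1) and b_i:
        # the root when a new a-subtree starts, otherwise just the parent.
        return ["R", f"a{a}"] if first_child else [f"a{a}"]
    if scheme == "none":
        # Assumed: each node appears exactly once, right before its first use.
        if i == 1:
            return ["R", "a1"]
        return [f"a{a}"] if first_child else []
    raise ValueError(f"unknown scheme: {scheme}")


def segment(i, scheme):
    """One index segment plus its data range, as a printable string."""
    leaves = [f"c{j}" for j in range(FANOUT * (i - 1) + 1, FANOUT * i + 1)]
    first = FANOUT * FANOUT * (i - 1)
    data = f"{first}-{first + FANOUT * FANOUT - 1}"
    return " ".join(index_prefix(i, scheme) + [f"b{i}"] + leaves + [data])


for scheme in ("none", "full", "partial"):
    print(scheme, "|", " | ".join(segment(i, scheme) for i in range(1, 5)))
```

Over all nine b subtrees, the partial scheme broadcasts 12 upper-level buckets for 4 distinct nodes (R, a1, a2, a3), that is, 8 extra buckets compared with the nonreplicated layout.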
Distributed Indexing Analysis
• r: number of top levels of the index tree that are replicated
• Level[r]: number of nodes on the rth level of the index tree
• Index[r]: size of the top r levels of the index tree
• ΔIndex_r: additional index overhead due to replication
ΔIndex_r = Level[r+1] - 1
Latency = ((Index - Index[r]) / Level[r+1] + Data / Level[r+1] + Data + Index + ΔIndex_r) / 2 + C
Tuning Time = 2 + k + C
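To see how the choice of r affects latency, the formula can be evaluated for a small tree. The sketch below assumes a complete index tree (root at level 1, so Level[i] = fanout^(i-1)) and uses a hypothetical instance modeled on the example tree: fanout 3, 4 index levels (40 index buckets), 81 data buckets, C = 0. The function names are illustrative.

```python
def distributed_latency(data, fanout, levels, r, c=0):
    """Latency in buckets when the top r levels of the index are replicated.

    Assumes a complete index tree: Level[i] = fanout ** (i - 1), root at level 1.
    """
    def level(i):
        return fanout ** (i - 1)

    index = sum(level(i) for i in range(1, levels + 1))   # Index
    index_top = sum(level(i) for i in range(1, r + 1))    # Index[r]
    segments = level(r + 1)                                # Level[r+1]
    delta = segments - 1                                   # extra index buckets
    return ((index - index_top) / segments + data / segments
            + data + index + delta) / 2 + c


def distributed_tuning(k, c=0):
    """Tuning time of distributed indexing in buckets."""
    return 2 + k + c


# Hypothetical tree: fanout 3, 4 index levels, 81 data buckets.
for r in range(4):
    print(r, distributed_latency(data=81, fanout=3, levels=4, r=r))
```

For r = 0 the expression reduces to Data + Index + C, the Tuning OPT latency. In this hypothetical instance the latency is lowest at r = 2.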
(1, m) vs. Distributed Indexing
Distributed indexing usually has a much lower latency than (1, m)
Distributed indexing costs only one more bucket of tuning time than (1, m): 2 + k + C versus 1 + k + C
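Nonclustered Indexing
[Figure: a file broadcast as meta segments, with the labels IS and Meta Segment. Records appear in groups that share a value of attribute b: (a2 b1 c1, a2 b1 c3, a4 b1 c2), (a3 b2 c1, a1 b2 c2), (a1 b3 c1, a4 b3 c2, a2 b3 c1), (a1 b1 c2, a4 b1 c3), (a3 b3 c3, a1 b3 c3, a4 b3 c2). The values b1 and b3 each appear in more than one group.]
• Pointer to the next segment
• Offset to DS
(K < P < L)
• Pointer to the next IS for value b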
Nonclustered Indexing Protocol
[Figure: after the previous bcast, the broadcast is made of meta segments containing index segments for b=1, b=2, b=3, and b=1 again. Labels on the client timeline: Tune in, Next Index Pointer, Continuous, Retrieving. Client states in order: Active, Doze, Active, Doze, Active.]
Nonclustered Indexing Analysis
Latency = (1/2) * ((Index - Index[r]) / Level[r+1] + Data / (M * Level[r+1]) + M * (Index + ΔIndex_r) + Data)
Tuning Time = 1 [?] (K + 1) [?] C [?] M
([?] marks an operator that did not survive in the slide text)
Real Example
Quotrex System
Information: 160,000 bytes
Bucket length: 128 bytes
Bandwidth: 10 kbps

Algorithm                 Latency   Tuning time,   Tuning time,   Power
                          (s)       active (s)     doze (s)       consumption (J)
Latency_OPT               62.5      62.5           0              15.625
Tuning_OPT                130.3     0.4            129.9          0.106
(1, m), m = 5             90.9      0.5            90.4           0.13
Distributed Indexing      68.9      0.6            68.3           0.153
Non Cluster Latency_OPT   125       125            0              31.25
Non Cluster Tuning_OPT    187.5     2.2            185.3          0.559
Non Cluster Indexing      132.4     2.4            130            0.607
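The power column can be recomputed from the two time columns as active time multiplied by active power, plus doze time multiplied by doze power. The sketch below assumes an active-mode power of 250 mW, the value that reproduces the table (62.5 s of active time corresponds to 15.625 J), together with the 50 µW doze power from the Motivation slide. The dictionary and function names are illustrative.

```python
ACTIVE_W = 0.250   # assumed active-mode power: 250 mW
DOZE_W = 50e-6     # doze-mode power from the Motivation slide: 50 uW

# (active seconds, doze seconds) copied from the table above.
ROWS = {
    "Latency_OPT": (62.5, 0),
    "Tuning_OPT": (0.4, 129.9),
    "(1, m), m = 5": (0.5, 90.4),
    "Distributed Indexing": (0.6, 68.3),
    "Non Cluster Latency_OPT": (125, 0),
    "Non Cluster Tuning_OPT": (2.2, 185.3),
    "Non Cluster Indexing": (2.4, 130),
}


def energy_joules(active_s, doze_s):
    """Energy used by the client: time in each mode times that mode's power."""
    return active_s * ACTIVE_W + doze_s * DOZE_W


for name, (active_s, doze_s) in ROWS.items():
    print(f"{name:26s}{energy_joules(active_s, doze_s):.3f} J")
```

Under that assumption each computed value agrees with the table to within rounding, which shows that the time spent in active mode dominates the energy cost in every row.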