Space-Efficient Online Computation of Quantile Summaries
Michael Greenwald*
Computer & Information Science Department University of Pennsylvania
200 South 33rd Street Philadelphia, PA 19104
greenwald@cis.upenn.edu
Sanjeev Khanna†
Computer & Information Science Department University of Pennsylvania
200 South 33rd Street Philadelphia, PA 19104
sanjeev@cis.upenn.edu
ABSTRACT
An $\epsilon$-approximate quantile summary of a sequence of $N$ elements is a data structure that can answer quantile queries about the sequence to within a precision of $\epsilon N$.
We present a new online algorithm for computing $\epsilon$-approximate quantile summaries of very large data sequences. The algorithm has a worst-case space requirement of $O(\frac{1}{\epsilon}\log(\epsilon N))$. This improves upon the previous best result of $O(\frac{1}{\epsilon}\log^2(\epsilon N))$. Moreover, in contrast to earlier deterministic algorithms, our algorithm does not require a priori knowledge of the length of the input sequence.
Finally, the actual space bounds obtained on experimental data are significantly better than the worst-case guarantees of our algorithm as well as the observed space requirements of earlier algorithms.
1. INTRODUCTION
We study the problem of space-efficient computation of quantile summaries of very large data sets in a single pass. A quantile summary consists of a small number of points from the input data sequence, and uses those quantile estimates to give approximate responses to any arbitrary quantile query.
Summaries of large data sets have long been used by programmers motivated by limited memory resources. Elementary summaries, such as running averages or standard deviation, are typically sufficient only for simple applications. The mean and variance are often either insufficiently descriptive, or are too sensitive to outliers and other anomalous data. For such cases, online algorithms are necessary to generate quantile summaries that use little space and provide reasonably accurate approximations to the distribution function induced by the input data sequence [6, 1, 5, 13, 2].
* Supported in part by DARPA under Contract #F39502-99-1-0512, and by the National Science Foundation under Grant ANI-00-81901.
† Supported in part by an Alfred P. Sloan Research Fellowship.
1.1 Quantile Estimation for Database Applications
Recent work (e.g. [8, 9, 12]) has highlighted the importance of quantile estimators for database users and implementors. Quantile estimates are used to estimate the size of intermediate results, to allow query optimizers to estimate the cost of competing plans to resolve database queries. Parallel databases attempt to partition the data into value ranges such that the sizes of all partitions are roughly equal. Quantile estimates can be used to choose the ranges without inspecting the actual data. Quantile estimates have several other uses in databases as well. User interfaces may estimate result sizes of queries, and provide feedback to users. This feedback may prevent expensive and incorrect queries from being issued, and may flag discrepancies between the user's model of the database and its actual content. Quantile estimates are also used by database users to characterize the distribution of real-world data sets.
The existing body of work has also identified particular properties that quantile estimators require in order to be useful for these database applications, properties that may not be strictly necessary when estimating quantiles in other domains. Some of the desirable properties are as follows. (1) The algorithm should provide tunable and explicit a priori guarantees on the precision of the approximation. We say that a quantile summary is $\epsilon$-approximate if it can be used to answer any quantile query to within a precision of $\epsilon N$. In other words, for any given rank $r$, an $\epsilon$-approximate quantile summary returns a value whose rank $r'$ is guaranteed to be within the interval $[r - \epsilon N, r + \epsilon N]$. (2) The algorithm should be data independent. Its guarantees should not be affected by the arrival order or distribution of values, nor should it require a priori knowledge of the size of the dataset. (3) The algorithm should execute in a single pass over the data. (4) The algorithm should have as small a memory footprint as possible. We note here that the memory footprint applies to temporary storage during the computation. We can always construct an $\epsilon$-approximate summary of size $O(1/\epsilon)$ as follows. We first construct an $\epsilon/2$-approximate summary. For $i$ from 0 to $2/\epsilon$, query this summary for each $i\epsilon/2$ quantile. It is easy to see that the set of responses constitutes an $\epsilon$-approximate summary.
1.2 Previous Work
Several earlier works have made progress towards meeting the above-mentioned requirements. Manku, Rajagopalan, and Lindsay [8] present a single-pass algorithm that constructs an $\epsilon$-approximate quantile summary. The algorithm strictly guarantees a precision of $\epsilon N$, but it requires advance knowledge of $N$, the size of the dataset. It requires $O(\frac{1}{\epsilon}\log^2(\epsilon N))$ space. In [8] the same authors present an algorithm that does not require advance knowledge of $N$. However, they must give up the deterministic guarantee on accuracy. Instead, they provide only a probabilistic guarantee that the quantile estimates are within the desired precision.
Gibbons, Matias, and Poosala [4] estimate quantiles under a different error metric, but their algorithm requires multiple passes over the data. Similarly, Chaudhuri, Motwani, and Narasayya [3] require multiple passes and only provide probabilistic guarantees.
Munro and Paterson [10], building on the earlier work of Pohl [7], showed that any algorithm that exactly computes the $\phi$-quantile of a sequence of $N$ data elements in only $p$ passes requires a space of $\Omega(N^{1/p})$. Thus the notion of approximate quantiles is inherently necessary for obtaining sub-linear space algorithms.
Many researchers have also addressed the problem of determining the smallest number of comparisons that are necessary for computing a $\phi$-quantile. We refer the reader to a nice survey article by Paterson [11] for an overview of results in this area.
1.3 Our Results
We design and analyze a new online algorithm for computing an $\epsilon$-approximate quantile summary of large data sequences. The algorithm has a worst-case space requirement of $O(\frac{1}{\epsilon}\log(\epsilon N))$, thus improving upon the previous best result of $O(\frac{1}{\epsilon}\log^2(\epsilon N))$. Moreover, in contrast to earlier deterministic algorithms, our algorithm does not require a priori knowledge of the length of the input sequence.
Our approach is based on a novel data structure that effectively maintains the range of possible ranks for each quantile that we store. This differs from previous approaches that implicitly assumed that the error in stored quantiles was distributed roughly uniformly throughout the distribution of observed values. By explicitly maintaining the possible range of rank values for each quantile, our algorithm is able to adaptively handle new observations: values observed near tightly constrained quantiles are more likely to be dropped and new values observed near loosely constrained quantiles are more likely to be stored. Intuitively speaking, the improved behavior of our algorithm is based on the fact (which we prove) that no input sequence can be "bad" across the entire distribution at once. In other words, an input sequence cannot persistently present new observations that must be stored without allowing us to safely delete old stored observations.
We also note here that our algorithm can be parallelized in a straightforward manner to deal with the scenario where a system of $P$ independent processors analyzes $P$ disjoint streams derived from a parent sequence. Due to space considerations, we omit the details of this implementation in this version.
Finally, we study the performance of our algorithm from an empirical perspective. The actual space bounds obtained on experimental data are significantly better than both the worst-case guarantees of our algorithm and the observed space requirements of earlier algorithms. For example, when summarizing uniformly random data with $\epsilon = 0.001$ and $N = 10^7$, our algorithm used an order of magnitude less memory than the best previously known algorithm.
2. THE NEW ALGORITHM
We will assume without any loss of generality that a new observation arrives after each unit of time, and thus we will use $n$ to denote both the number of observations (elements of the data sequence) that have been seen so far and the current time. Our algorithm maintains a summary data structure $S = S(n)$ at all times, and we denote by $s = s(n)$ the total space used by it. Finally, we denote the given precision requirement by $\epsilon$.
2.1 The Summary Data Structure
At any point in time $n$, the data structure $S(n)$ consists of an ordered sequence of tuples which correspond to a subset of the observations seen thus far. For each observation $v$ in $S$, we maintain implicit bounds on the minimum and the maximum possible rank of the observation $v$ among the first $n$ observations. Let $r_{min}(v)$ and $r_{max}(v)$ denote respectively the lower and upper bounds on the rank of $v$ among the observations seen so far. Specifically, $S$ consists of tuples $t_0, t_1, \ldots, t_{s-1}$ where each tuple $t_i = (v_i, g_i, \Delta_i)$ consists of three components: (i) a value $v_i$ that corresponds to one of the elements in the data sequence seen thus far, (ii) the value $g_i$ equals $r_{min}(v_i) - r_{min}(v_{i-1})$, and (iii) $\Delta_i$ equals $r_{max}(v_i) - r_{min}(v_i)$. We ensure that, at all times, the maximum and the minimum values are part of the summary. In other words, $v_0$ and $v_{s-1}$ always correspond to the minimum and the maximum elements seen so far. It is easy to see that $r_{min}(v_i) = \sum_{j \le i} g_j$ and $r_{max}(v_i) = \sum_{j \le i} g_j + \Delta_i$. Thus $g_i + \Delta_i - 1$ is an upper bound on the total number of observations that may have fallen between $v_{i-1}$ and $v_i$. Finally, observe that $\sum_i g_i$ equals $n$, the total number of observations seen so far.
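The relationship between the stored pairs $(g_i, \Delta_i)$ and the implied rank bounds can be made concrete with a small sketch. The function name `rank_bounds` and the four-tuple example summary are illustrative assumptions, not part of the paper; the code only applies the two prefix-sum identities stated above.

```python
from itertools import accumulate


def rank_bounds(summary):
    """Recover (v, r_min, r_max) for each tuple of a summary.

    `summary` is a list of (v, g, delta) tuples ordered by v, as in Section 2.1.
    r_min(v_i) is the prefix sum of the g values; r_max(v_i) = r_min(v_i) + delta_i.
    """
    r_mins = accumulate(g for _, g, _ in summary)
    return [(v, r_min, r_min + delta)
            for (v, _, delta), r_min in zip(summary, r_mins)]


# Hypothetical summary of n = 7 observations (the g values sum to 7).
example = [(10, 1, 0), (20, 2, 1), (35, 2, 1), (50, 2, 0)]
bounds = rank_bounds(example)
for v, r_min, r_max in bounds:
    print(f"value {v}: rank in [{r_min}, {r_max}]")
```

Storing differences rather than absolute ranks means that inserting or deleting one tuple only touches its immediate neighbor, which is what the operations in Section 2.2 rely on.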
Answering Quantile Queries: A summary of the above form can be used in a straightforward manner to provide $\epsilon$-approximate answers to quantile queries. The proposition below forms the basis of our approach.
Proposition 1. Given a quantile summary $S$ in the above form, a $\phi$-quantile can always be identified to within an error of $\max_i(g_i + \Delta_i)/2$.
Proof. Let $r = \lceil \phi n \rceil$ and let $e = \max_i(g_i + \Delta_i)/2$. We will search for an index $i$ such that $r - e \le r_{min}(v_i)$ and $r_{max}(v_i) \le r + e$. Clearly, such a value $v_i$ approximates the $\phi$-quantile to within the claimed error bounds. We now argue that such an index $i$ must always exist. First, consider the case $r > n - e$. We have $r_{min}(v_{s-1}) = r_{max}(v_{s-1}) = n$, and therefore $i = s - 1$ has the desired property. Otherwise, when $r \le n - e$, we choose the smallest index $j$ such that $r_{max}(v_j) > r + e$. It follows that $r - e \le r_{min}(v_{j-1})$. If $r - e > r_{min}(v_{j-1})$ then $r_{max}(v_j) = r_{min}(v_{j-1}) + g_j + \Delta_j > r_{min}(v_{j-1}) + 2e$, which gives $g_j + \Delta_j > 2e$; a contradiction to our assumption that $e = \max_i(g_i + \Delta_i)/2$. By the choice of $j$, $r_{max}(v_{j-1}) \le r + e$, therefore $j - 1$ is an example of an index $i$ with the above described property.
The following is an immediate corollary.
Corollary 1. If at any time $n$, the summary $S(n)$ satisfies the property that $\max_i(g_i + \Delta_i) \le 2\epsilon n$, then we can answer any $\phi$-quantile query to within an $\epsilon n$ precision.
At a high level, our algorithm for maintaining the quantile summary proceeds as follows. Whenever the algorithm sees a new observation, it inserts in the summary a tuple corresponding to this observation. Periodically, the algorithm performs a sweep over the summary to "merge" some of the tuples into their neighbors so as to free up space. The heart of the algorithm is in the merge phase, where we maintain several conditions that allow us to bound the space used by $S$ at any time. By Corollary 1, it suffices to ensure that at all times $\max_i(g_i + \Delta_i) \le 2\epsilon n$. Motivated by this consideration, we will say that an individual tuple is full if $g_i + \Delta_i = \lfloor 2\epsilon n \rfloor$. The capacity of an individual tuple is the maximum number of observations that can be counted by $g_i$ before the tuple becomes full.
Bands: In order to minimize the number of tuples in our summary, our general strategy will be to delete tuples with small capacity and preserve tuples with large capacity. The merge phase will free up space by merging tuples with small capacities into tuples with "similar" or larger capacities. We say that two tuples, $t_i$ and $t_j$, have similar capacities if $\log \mathrm{capacity}(t_i) \approx \log \mathrm{capacity}(t_j)$.
This notion of similarity partitions the possible values of $\Delta$ into bands. Roughly speaking, we try to divide the $\Delta$s into bands that lie between elements of $(0, \frac{1}{2} 2\epsilon n, \frac{3}{4} 2\epsilon n, \ldots, \frac{2^i - 1}{2^i} 2\epsilon n, \ldots, 2\epsilon n - 1, 2\epsilon n)$. (These boundaries correspond to capacities of $2\epsilon n, \epsilon n, \frac{1}{2}\epsilon n, \ldots, \frac{1}{2^i}\epsilon n, \ldots, 8, 4, 2, 1$.) As we will see shortly, it is useful to define bands in a way that ensures the property that if two $\Delta$s are ever in the same band, they never appear in different bands as $n$ increases. Therefore, for $\alpha$ from 1 to $\lceil \log 2\epsilon n \rceil$, we let $p = \lfloor 2\epsilon n \rfloor$ and we define band$_\alpha$ to be the set of all $\Delta$ such that $p - 2^\alpha - (p \bmod 2^\alpha) < \Delta \le p - 2^{\alpha-1} - (p \bmod 2^{\alpha-1})$. The $(p \bmod 2^\alpha)$ term holds the borders between bands static as $n$ increases. We define band$_0$ to simply be $p$. As a special case, we consider the first $1/(2\epsilon)$ observations, with $\Delta = 0$, to be in a band of their own. Figure 1 shows the band boundaries as $2\epsilon n$ goes from 24 to 34. We will denote by band$(t_i, n)$ the band of $\Delta_i$ at time $n$, and by band$_\alpha(n)$ all tuples (or equivalently, the $\Delta$ values associated with these tuples) that have a band value of $\alpha$.
[Figure 1: Band boundaries as $2\epsilon n$ progresses from 24 to 34. Each row corresponds to one value of $2\epsilon n$ and the horizontal axis lists the $\Delta$ values 0 to 34. The rightmost band in each row is band 0.]
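The band definition can be checked numerically with a short sketch. The helper names `band` and `band_ranges` are assumptions made for illustration; the loop applies the inequality $p - 2^\alpha - (p \bmod 2^\alpha) < \Delta \le p - 2^{\alpha-1} - (p \bmod 2^{\alpha-1})$ directly, letting $\alpha$ grow until the lower border drops below $\Delta$. Under this reading, $\Delta = 0$ always ends up alone in the highest band, which matches the special case described above.

```python
def band(delta, p):
    """Band of `delta` when floor(2*eps*n) equals `p` (0 <= delta <= p)."""
    if delta == p:
        return 0
    alpha = 1
    # The upper border of band alpha equals the lower border of band alpha - 1,
    # so it is enough to look for the first alpha whose lower border is < delta.
    while delta <= p - 2 ** alpha - (p % 2 ** alpha):
        alpha += 1
    return alpha


def band_ranges(p):
    """Map each band index to the (lowest, highest) delta value it contains."""
    ranges = {}
    for delta in range(p + 1):
        b = band(delta, p)
        lo, hi = ranges.get(b, (delta, delta))
        ranges[b] = (min(lo, delta), max(hi, delta))
    return ranges


# Tabulate the band boundaries over the range of 2*eps*n covered by Figure 1.
for p in range(24, 35):
    print(p, sorted(band_ranges(p).items()))
```

Scanning the printed rows shows borders disappearing as $p$ grows (two neighboring bands merging) but never appearing inside an existing band, which is the property the definition is designed to guarantee.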
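Proposition 2. At any point in time $n$ and for any $\alpha \ge 1$, band$_\alpha(n)$ contains either $2^\alpha$ or $2^{\alpha-1}$ distinct values of $\Delta$.
Proof. The band$_\alpha(n)$ is bounded below by $2\epsilon n - 2^\alpha - (2\epsilon n \bmod 2^\alpha)$ and above by $2\epsilon n - 2^{\alpha-1} - (2\epsilon n \bmod 2^{\alpha-1})$. If $2\epsilon n \bmod 2^\alpha < 2^{\alpha-1}$, then $2\epsilon n \bmod 2^\alpha = 2\epsilon n \bmod 2^{\alpha-1}$, and band$_\alpha(n)$ contains $2^\alpha - 2^{\alpha-1} = 2^{\alpha-1}$ distinct values of $\Delta$. If $2\epsilon n \bmod 2^\alpha \ge 2^{\alpha-1}$, then $2\epsilon n \bmod 2^\alpha = 2^{\alpha-1} + (2\epsilon n \bmod 2^{\alpha-1})$, and band$_\alpha(n)$ contains $2^{\alpha-1} + 2^{\alpha-1} = 2^\alpha$ distinct values of $\Delta$.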
A Tree Representation: We will find it useful to impose a tree structure over the tuples. Given a summary $S = \langle t_0, t_1, \ldots, t_{s-1} \rangle$, the tree $T$ associated with $S$ contains a node $V_i$ for each $t_i$ and a special root node $R$. The parent of a node $V_i$ is the node $V_j$ such that $j$ is the least index greater than $i$ with band$(t_j) >$ band$(t_i)$. If no such index exists, then the node $R$ is set to be the parent. All children (and all descendants) of a given node $V_i$ have $\Delta$ values larger than $\Delta_i$. The following two properties of $T$ can be easily verified.
Proposition 3. The children of any node in $T$ are always arranged in non-increasing order of band in $S$.
Proposition 4. For any node $V$, the set of all its descendants in $T$ forms a contiguous segment in $S$.
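The parent rule above is a "next index with a strictly greater band" relation, so the tree can be derived from the sequence of band values alone. The sketch below is illustrative: the function names and the example band sequence are assumptions, and `None` stands in for the root $R$.

```python
def parents(bands):
    """parent[i] = least j > i with bands[j] > bands[i], or None for the root R."""
    parent = [None] * len(bands)
    stack = []  # indices still waiting for a parent; their bands are non-increasing
    for j, b in enumerate(bands):
        while stack and bands[stack[-1]] < b:
            parent[stack.pop()] = j
        stack.append(j)
    return parent


def descendants(parent, i):
    """All indices whose chain of parents passes through node i."""
    result = []
    for k in range(len(parent)):
        a = parent[k]
        while a is not None and a != i:
            a = parent[a]
        if a == i:
            result.append(k)
    return result


# Hypothetical band values of tuples t_0..t_5, listed in summary order.
example_bands = [0, 1, 0, 2, 1, 3]
example_parents = parents(example_bands)
print(example_parents)
print(descendants(example_parents, 3))
```

In this example the descendants of node 3 are the three indices immediately to its left, a contiguous run as Proposition 4 states, and each node's children appear in non-increasing band order as in Proposition 3.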
2.2 Operations
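We now describe the various operations that we perform on our summary data structure. We start with a description of external operations:
2.2.1 External Operations
QUANTILE($\phi$) To compute an $\epsilon$-approximate $\phi$-quantile from the summary $S(n)$ after $n$ observations, compute the rank, $r = \lceil \phi n \rceil$. Find $i$ such that both $r - r_{min}(v_i) \le \epsilon n$ and $r_{max}(v_i) - r \le \epsilon n$ and return $v_i$.
INSERT($v$) Find the smallest $i$ such that $v_{i-1} \le v < v_i$, and insert the tuple $(v, 1, \lfloor 2\epsilon n \rfloor)$ between $t_{i-1}$ and $t_i$. Increment $s$. As a special case, if $v$ is the new minimum or the maximum observation seen, then insert $(v, 1, 0)$.
INSERT($v$) maintains correct relationships between $g_i$, $\Delta_i$, $r_{min}(v_i)$ and $r_{max}(v_i)$. Consider that if $v$ is inserted before $v_i$, the value of $r_{min}(v)$ may be as small as $r_{min}(v_{i-1}) + 1$, and hence $g_i = 1$. Similarly, $r_{max}(v)$ may be as large as the current $r_{max}(v_i)$, which in turn is bounded by $\lfloor 2\epsilon n \rfloor$. Note that $r_{min}(v_i)$ and $r_{max}(v_i)$ get increased by 1 after insertion.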
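COMPRESS()
    for i from s - 2 to 0 do
        if ((BAND($\Delta_i$, $2\epsilon n$) $\le$ BAND($\Delta_{i+1}$, $2\epsilon n$)) &&
            ($g^*_i + g_{i+1} + \Delta_{i+1} < 2\epsilon n$)) then
            DELETE all descendants of $t_i$ and the tuple $t_i$ itself;
        end if
    end for
end COMPRESS
Figure 2: Pseudo-code for COMPRESS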
2.2.2 Internal Operations
DELETE($v_i$) To delete the tuple $(v_i, g_i, \Delta_i)$ from $S$, replace $(v_i, g_i, \Delta_i)$ and $(v_{i+1}, g_{i+1}, \Delta_{i+1})$ by the new tuple $(v_{i+1}, g_i + g_{i+1}, \Delta_{i+1})$, and decrement $s$.
DELETE() correctly maintains the relationships between $g_i$, $\Delta_i$, $r_{min}(v_i)$ and $r_{max}(v_i)$. Deleting $v_i$ has no effect on $r_{min}(v_{i+1})$ and $r_{max}(v_{i+1})$, so DELETE($v_i$) should simply preserve $r_{min}(v_{i+1})$ and $r_{max}(v_{i+1})$. The relationship between $r_{min}(v_{i+1})$ and $r_{max}(v_{i+1})$ is preserved as long as $\Delta_{i+1}$ is unchanged. Since $r_{min}(v_{i+1}) = \sum_{j \le i+1} g_j$, and we delete $g_i$, we must increase $g_{i+1}$ by $g_i$ to keep $r_{min}(v_{i+1})$. All other entries are unaltered by this operation.
COMPRESS() The operation COMPRESS tries to merge together a node and all its descendants into either its parent node or into its right sibling. The property that we must ensure is that the tuple that results after this merging is not full. By Proposition 4, we know that a node and its children always form a contiguous sequence of tuples in $S(n)$. Let $g^*_i$ denote the sum of $g$-values of the tuple $t_i$ and all its descendants in $T$. It is easy to see that merging $t_i$ and its descendants (by DELETEing them) into $t_{i+1}$ would result in $t_{i+1}$ being updated to $(v_{i+1}, g^*_i + g_{i+1}, \Delta_{i+1})$. We would like to ensure that this resulting tuple is not full. We say that a pair of adjacent tuples $t_i, t_{i+1} \in S(n)$ is mergeable if $g^*_i + g_{i+1} + \Delta_{i+1} < 2\epsilon n$ and band$(t_i, n) \le$ band$(t_{i+1}, n)$. At a high level, the COMPRESS operation iterates over the tuples in $S(n)$ from right to left, and whenever it finds a mergeable pair $t_i, t_{i+1}$, it merges $t_i$ as well as all tuples that are descendants of $t_i$ in $T(n)$ into $t_{i+1}$. Note that pairs of tuples that are not mergeable at some point in time may become so at a later point in time as the term $\lfloor 2\epsilon n \rfloor$ increases over time. Figure 2 gives pseudo-code describing this operation. Since DELETE() and COMPRESS() never alter the $\Delta$ of surviving tuples, it follows that $\Delta_i$ of any quantile entry remains unchanged once it has been inserted.
COMPRESS() inspects tuples from right (highest index) to left. Therefore, it first combines children (and their entire subtree of descendants) into parents. It combines siblings only when no more children can be combined into the parent.
Initial State
    $S \leftarrow \emptyset$; $s = 0$; $n = 0$.
Algorithm
    To add the (n+1)st observation, v, to summary S(n):
    if ($n \equiv 0 \bmod \frac{1}{2\epsilon}$) then
        COMPRESS();
    end if
    INSERT(v);
    n = n + 1;
Figure 3: Pseudo-code for the algorithm
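The operations of Section 2.2 and the driver loop of Figure 3 can be put together in a compact sketch. This is an illustrative reading of the pseudo-code, not the authors' implementation, and it makes several assumptions that are labeled in the comments: the class name `GKSummary`, the requirement that $1/(2\epsilon)$ be an integer (so that $\lfloor 2\epsilon n \rfloor$ can be computed exactly as `n // k`), the choice to stop the COMPRESS sweep at index 1 so that the minimum is never deleted, and a query that uses the search from the proof of Proposition 1.

```python
import math
import random
from bisect import bisect_right


def band(delta, p):
    """Band of delta when floor(2*eps*n) == p, following Section 2.1."""
    if delta == p:
        return 0
    alpha = 1
    while delta <= p - 2 ** alpha - (p % 2 ** alpha):
        alpha += 1
    return alpha


class GKSummary:
    def __init__(self, eps):
        self.k = round(1 / (2 * eps))  # assumption: 1/(2*eps) is an integer
        self.n = 0
        self.tuples = []  # entries [v, g, delta], ordered by v

    def insert(self, v):
        """Figure 3: COMPRESS every 1/(2*eps) observations, then INSERT(v)."""
        if self.n % self.k == 0:
            self.compress()
        i = bisect_right([t[0] for t in self.tuples], v)
        is_extreme = i == 0 or i == len(self.tuples)  # new minimum or maximum
        delta = 0 if is_extreme else self.n // self.k  # floor(2*eps*n)
        self.tuples.insert(i, [v, 1, delta])
        self.n += 1

    def compress(self):
        """Figure 2: right-to-left sweep merging mergeable pairs."""
        t, p = self.tuples, self.n // self.k
        i = len(t) - 2
        while i >= 1:  # assumption: index 0 (the minimum) is never deleted
            b = band(t[i][2], p)
            if b <= band(t[i + 1][2], p):
                # Descendants of t_i: the contiguous run to its left with smaller band.
                j, g_star = i, t[i][1]
                while j > 1 and band(t[j - 1][2], p) < b:
                    j -= 1
                    g_star += t[j][1]
                # g*_i + g_{i+1} + delta_{i+1} < 2*eps*n, in exact integer arithmetic.
                if (g_star + t[i + 1][1] + t[i + 1][2]) * self.k < self.n:
                    t[i + 1][1] += g_star
                    del t[j:i + 1]
                    i = j  # the surviving right neighbor now sits at index j
            i -= 1

    def query(self, phi):
        """Return a stored value whose rank bounds lie within e of ceil(phi*n)."""
        r = max(1, math.ceil(phi * self.n))
        e = max(g + d for _, g, d in self.tuples) / 2
        r_min = 0
        for v, g, d in self.tuples:
            r_min += g
            if r - e <= r_min and r_min + d <= r + e:
                return v
        return self.tuples[-1][0]


# Usage: summarize a shuffled sequence 1..n, where the value v has true rank v.
n, eps = 20000, 0.01
data = list(range(1, n + 1))
random.Random(0).shuffle(data)
summary = GKSummary(eps)
for x in data:
    summary.insert(x)
print(len(summary.tuples), summary.query(0.5))
```

Rebuilding the list of values on every insertion keeps the sketch short but costs time linear in the summary size; a balanced tree or skip list would be the natural replacement in a production setting.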
2.3 Analysis
It is easy to see that the data structure above maintains an $\epsilon$-approximate quantile summary at each point in time. The INSERT as well as COMPRESS operations always ensure that $g_i + \Delta_i \le 2\epsilon n$ at any point in time. We will now establish that the total number of tuples in the summary $S$ after $n$ observations have been seen is bounded by $\frac{11}{2\epsilon}\log(2\epsilon n)$.
We start by defining a notion of coverage. We say that a tuple $t_i$ in the quantile summary $S$ covers an observation $v$ at any time $n$ if either the tuple for $v$ has been directly merged into $t_i$ or a tuple $t$ that covered $v$ has been merged into $t_i$. Moreover, a tuple always covers itself. It is easy to see that the total number of observations covered by $t_i$ is exactly given by $g_i = g_i(n)$. The lemmas below give some simple properties concerning coverage of observations by various tuples.
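Lemma 1. At no point in time, a tuple from band $\alpha$ covers an observation from a band $> \alpha$.
Proof. Suppose at some time $n$, the event described in the lemma occurs. The COMPRESS subroutine never merges a tuple $t_i$ into an adjacent tuple $t_{i+1}$ if the band of $t_i$ is greater than the band of $t_{i+1}$. Thus the only way in which this event can occur is if at some point in time, say $m$, we have band$(t_i, m) \le$ band$(t_{i+1}, m)$ and at the current time $n$, we have band$(t_i, n) >$ band$(t_{i+1}, n)$. We now argue that this cannot occur since if at any point in time $\ell$, band$(t_i, \ell) =$ band$(t_{i+1}, \ell)$, then for all $n \ge \ell$, we must have band$(t_i, n) =$ band$(t_{i+1}, n)$. The borders between bands are static, except when two bands combine (forever). Band 0 is always new. If $2\epsilon n \equiv 2^{\alpha-1} \bmod 2^\alpha$, then $\alpha$ and $\alpha+1$ combine into the $\alpha+1$ band ($\alpha$ is a unique band for given $n$). All bands $\beta > \alpha+1$ remain the same. Because band 0 is always new, all bands $\beta < \alpha$ become $\beta+1$. In other words, borders [...]
Lemma 2. At any time $n$ and for any given $\alpha$, the total number of observations covered cumulatively by all tuples with band values in $[0..\alpha]$ is bounded by $2^\alpha/\epsilon$.
Proof. By Proposition 2, each band$_\beta(n)$ contains at most $2^\beta$ distinct values of $\Delta$. There are no more than $1/(2\epsilon)$ observations with any given $\Delta$, so at most $2^\beta/(2\epsilon)$ observations were inserted with $\Delta \in$ band$_\beta$. By Lemma 1, no observations from bands $> \alpha$ will be covered by a node from $\alpha$. Therefore the nodes in question can cover, at most, the total number of observations from all bands $\le \alpha$. Summing over all $\beta \le \alpha$ yields an upper bound of $2^{\alpha+1}/(2\epsilon) = 2^\alpha/\epsilon$.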
The next lemma shows that for any given band value $\alpha$, only a small number of nodes can have a child with that band value.
Lemma 3. At any time $n$ and for any given $\alpha$, there are at most $3/(2\epsilon)$ nodes in $T(n)$ that have a child with band value of $\alpha$. In other words, there are at most $3/(2\epsilon)$ parents of nodes from band$_\alpha(n)$.
Proof. Let $m_{min}$ and $m_{max}$ respectively denote the earliest and the latest times at which an observation in band$_\alpha(n)$ could be seen. It is easy to verify that $m_{min} = (2\epsilon n - 2^\alpha - (2\epsilon n \bmod 2^\alpha))/(2\epsilon)$ and $m_{max} = (2\epsilon n - 2^{\alpha-1} - (2\epsilon n \bmod 2^{\alpha-1}))/(2\epsilon)$. Thus, any parent of a node in band$_\alpha(n)$ must have $\Delta_i < 2\epsilon m_{min}$.
Fix a parent node $V_i$ with at least one child in band$_\alpha(n)$ and let $V_j$ be the rightmost such child. Denote by $m_j$ the time at which the observation corresponding to $V_j$ was seen.
We will show that at least a $(2\epsilon/3)$-fraction of all observations that arrived after time $m_{min}$ can be uniquely mapped to the pair $(V_i, V_j)$. This in turn implies that no more than $3/(2\epsilon)$ such $V_i$'s can exist, thus establishing the lemma. The main idea underlying our proof is that the fact that COMPRESS() did not merge $V_j$ into $V_i$ implies there must be a large number of observations that can be associated with the parent-child pair $(V_i, V_j)$.
We first argue that $g^*_j(n) + \sum_{k=j+1}^{i-1} g_k(n) \ge g^*_{i-1}(n)$. If $j = i - 1$, it is trivially true. Otherwise, observe that any tuple $t_k$ that lies between $t_j$ and $t_i$ must belong to a band less than or equal to $\alpha$; else $V_k$, and not $V_i$, would be the parent of $V_j$. Therefore, $\sum_{k=j+1}^{i-1} g_k(n) \ge g^*_{i-1}(n)$ and the claim follows.
Now since COMPRESS() did not merge $V_j$ into $V_i$, it must be the case that $g^*_{i-1}(n) + g_i(n) + \Delta_i > 2\epsilon n$. Using the claim above, we can conclude that $g^*_j(n) + \sum_{k=j+1}^{i-1} g_k(n) + g_i(n) + \Delta_i > 2\epsilon n$. Also, at time $m_j$, we had $g_i(m_j) + \Delta_i < 2\epsilon m_j$. Since $m_j$ is at most $m_{max}$, it must be that
$$g^*_j(n) + \sum_{k=j+1}^{i-1} g_k(n) + (g_i(n) - g_i(m_j)) > 2\epsilon(n - m_{max}).$$
Finally observe that for any other such parent-child pair $V_{i'}$ and $V_{j'}$, the observations counted above by $(V_j, V_i)$ and [...] observations that arrived after $m_{min}$, we can bound the total number of such pairs by $(n - m_{min})/(2\epsilon(n - m_{max}))$, which is easily verified to be at most $3/(2\epsilon)$.
Given a full pair of tuples $(t_{i-1}, t_i)$, we say that the tuple $t_{i-1}$ is a left partner and $t_i$ is a right partner in this full pair.
Lemma 4. At any time $n$ and for any given $\alpha$, there are at most $4/\epsilon$ tuples from band$_\alpha(n)$ that are right partners in a full tuple pair.
Proof. Let $X$ be the set of tuples in band$_\alpha(n)$ that participate as a right partner in some full pair. We first consider the case when tuples in $X$ form a single contiguous segment in $S(n)$. Let $t_i, \ldots, t_{i+p-1}$ be a maximal contiguous segment of band$_\alpha(n)$ tuples in $S(n)$. Since these tuples are alive in $S(n)$, it must be the case that
$$g^*_{j-1} + g_j + \Delta_j > 2\epsilon n \qquad i \le j < i + p.$$
Adding over all $j$, we get
$$\sum_{j=i}^{i+p-1} g^*_{j-1} + \sum_{j=i}^{i+p-1} g_j + \sum_{j=i}^{i+p-1} \Delta_j > 2p\epsilon n.$$
In particular, we can conclude that
$$2 \sum_{j=i-1}^{i+p-1} g^*_j + \sum_{j=i}^{i+p-1} \Delta_j > 2p\epsilon n.$$
The first term in the LHS of the above inequality counts twice the number of observations covered by nodes in band$_\alpha(n)$ or by one of its descendants in the tree $T(n)$. Using Lemma 2, this sum can be bounded by $2(2^\alpha/\epsilon)$. The second term can be bounded by $p(2\epsilon n - 2^{\alpha-1})$ since the largest possible value of $\Delta$ for a tuple with a band value of $\alpha$ or less is $(2\epsilon n - 2^{\alpha-1})$. Substituting these bounds, we get
$$\frac{2^{\alpha+1}}{\epsilon} + p(2\epsilon n - 2^{\alpha-1}) > 2p\epsilon n.$$
Simplifying above, we get $p < 4/\epsilon$ as claimed by the lemma. Finally, the same argument applies when nodes in $X$ induce multiple segments in $S(n)$; we simply consider the above summation over all such segments.
Lemma 5. At any time $n$ and for any given $\alpha$, the maximum number of tuples possible from each band$_\alpha(n)$ is $11/(2\epsilon)$.
Proof. By Lemma 4 we know that the number of band$_\alpha(n)$ nodes that are right partners in some full pair can be bounded by $4/\epsilon$. Any other band$_\alpha(n)$ node either does not participate in any full pair or occurs only as a left partner. We first claim that each parent of a band$_\alpha(n)$ node can have at most one such node in band$_\alpha(n)$. To see this, observe that if a pair of non-full adjacent tuples $t_i, t_{i+1}$, where $t_{i+1} \in$ band$_\alpha(n)$, is not merged then it must be because band$(t_i, n)$ is greater than $\alpha$. But Proposition 3 tells us that this event can occur only once for any $\alpha$, and therefore, $V_{i+1}$ must be the unique band$_\alpha(n)$ child of its parent that does not participate in a full pair. It is also easy to verify that for each parent node, at most one band$_\alpha(n)$ child can participate only as a left partner in a full pair. Finally, observe that only one of the above two events can occur for each parent node. By Lemma 3, there are at most $3/(2\epsilon)$ parents of such nodes, and thus the total number of band$_\alpha(n)$ nodes can be bounded by $4/\epsilon + 3/(2\epsilon) = 11/(2\epsilon)$.

              Our algorithm: tuples stored         Our algorithm: space (3 x tuples)       MRL: space
N \ $\epsilon$    .1    .05    .01   .005   .001       .1    .05    .01   .005    .001       .1    .05    .01    .005    .001
$10^5$        61    120    496    902   3290      183    360   1488   2706    9870      275    468   1519    2859    8334
$10^6$        76    156    664   1230   4983      228    468   1992   3690   14949      378    702   2748    4664   15155
$10^7$        94    185    835   1578   6662      282    555   2505   4734   19986      600   1032   3708    7000   27475
$10^8$       110    224   1067   2063   9148      330    672   3201   6189   27444      765   1477   5960   10320   37026
$10^9$       124    266   1249   2407  11074      372    798   3747   7221   33222      924   1880   7650   14742   59540
Table 1: Number of tuples stored and space requirements for "hard input" sequences. For the MRL algorithm, we assume that each quantile stored takes only one unit of space.
Theorem 1. At any time $n$, the total number of tuples stored in $S(n)$ is at most $\frac{11}{2\epsilon}\log(2\epsilon n)$.
Proof. There are at most $1 + \lfloor \log 2\epsilon n \rfloor$ bands at time $n$. There can be at most $3/(2\epsilon)$ total tuples in $S(n)$ from bands 0 and 1. For the remaining bands, Lemma 5 bounds the maximum number of tuples in each band. The result follows.
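To get a feel for the size of the bound in Theorem 1, it can be evaluated numerically and set against one of the measurements reported in Table 1. The function name is an assumption, and so is the use of a base-2 logarithm (suggested by the power-of-two band construction, but not stated explicitly in the theorem).

```python
import math


def gk_worst_case_tuples(eps, n):
    """Theorem 1 bound (11 / (2*eps)) * log(2*eps*n), assuming a base-2 logarithm."""
    return 11 / (2 * eps) * math.log2(2 * eps * n)


# Table 1 entry for the new algorithm: eps = .001, N = 10^9, 11074 tuples stored.
observed = 11074
bound = gk_worst_case_tuples(0.001, 10**9)
ratio = bound / observed
print(f"bound = {bound:.0f} tuples, observed = {observed}, ratio = {ratio:.1f}")
```

Under the base-2 assumption, the bound for this entry is roughly ten times the observed tuple count.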
3. EMPIRICAL MEASUREMENTS
We now describe some empirical results concerning the performance of our algorithm in practice. We experimented with three different classes of input data: (1) a "hard case" for our algorithm, (2) "sorted" input data, and (3) "random" input data. The "sorted" and "random" input sequences were chosen for two reasons. First, "random" should yield some insight into the behavior of this algorithm on "average" inputs, or after some randomization. Second, these two scenarios were used to produce the experimental results in [8]. The MRL algorithm [8] is the best previously known algorithm.
We observed during these runs that, in practice, the algorithm used substantially less space than indicated by our analysis from the previous section. The observed space requirements also turn out to be better than those required by the MRL algorithm. Moreover, when we run our algorithm with the same space as used by the MRL algorithm, the observed error is significantly better than that of the MRL algorithm. We will refer to this latter variant as the pre-allocated variant of our algorithm. In contrast, we will refer to the basic version of the algorithm, where we allocate a new quantile entry only when the observed error is about to exceed the desired precision, as the adaptive variant.
Our implementation of the algorithm differed slightly from that described in Section 2 in two ways. First, new observations were inserted as a tuple $(v, 1, g_i + \Delta_i - 1)$ rather than as $(v, 1, \lfloor 2\epsilon n \rfloor)$. The latter approach is used in the previous section strictly to simplify theoretical analysis of the space complexity. Second, rather than running COMPRESS after every $1/(2\epsilon)$ observations, for each observation inserted into $S$, one tuple was deleted when possible. When no tuple could be deleted without causing its successor to become overfull, the size of $S$ grew by 1. Note that by preallocating a large enough number of stored quantiles, no increase in space need ever take place, assuming $N$ is known in advance.
For each experiment we measured both the maximum space used to produce the summary and the observed precision of the results. We measured space consumption by counting the number of stored tuples. When comparing our space consumption to the MRL algorithm, we pessimistically multiplied the number of stored tuples by 3 to account for our recording the value and both the min and max rank of each stored element.
3.1 Hard Input
We construct here data sequences in an adversarial manner for our algorithm. At each time step, we generate the next observation so that it falls in the largest current "gap" in our quantile summary.
We successively fed observations to our summary, with no advance hint about the total number of observations to be seen. We measured the maximum amount of space required as the size of the input sequence increased to $10^9$. Table 1 reports the results of this experiment for $N$ ranging over powers of 10 from $10^5$ to $10^9$.
Note that the required number of quantiles stored is approximately a factor of 11 lower than the worst-case bound we computed in the previous section of this paper. Also note that the number of quantiles we store is significantly lower than the number used by the MRL algorithm. Even after multiplying our tuple count by a factor of 3, we almost always require less space than MRL. The only exception is for $\epsilon = .001$ and $N = 10^5$, where the space cost of our algorithm exceeds that of the MRL algorithm.
3.2 Sorted Input
The second scenario, "sorted", measures the behavior of the summary when the data arrives in sorted order. We fixed $\epsilon = .001$ and constructed summaries of sorted sequences of lengths $10^5$, $10^6$, and $10^7$. We computed the maximum error over all possible quantile queries, and chose to query 15 quantiles at rank $\frac{q_i}{16} N$, for $q_i = [1..15]$, to study the behavior at specific quantiles.

                MRL                              Our algorithm, Preallocated      Our algorithm, Adaptive
$q_i$ \ N       $10^5$    $10^6$     $10^7$      $10^5$    $10^6$     $10^7$      $10^5$    $10^6$     $10^7$
|S|             8334      15155      27475       2778      5052       9158        756       756        756
Max $\epsilon$  0.00035   0.000194   0.000167    0.00027   0.000128   0.000090    0.00095   0.000899   0.000819
1               0.00015   0.000199   0.000091    0.00021   0.000020   0.000077    0.00074   0.000057   0.000618
2               0.00006   0.000050   0.000120    0.00024   0.000056   0.000009    0.00039   0.000259   0.000203
3               0.00006   0.000210   0.000062    0.00010   0.000052   0.000031    0.00010   0.000744   0.000665
4               0.00024   0.000161   0.000001    0.00001   0.000016   0.000005    0.00040   0.000860   0.000002
5               0.00002   0.000033   0.000070    0.00002   0.000092   0.000050    0.00016   0.000494   0.000230
6               0.00022   0.000166   0.000053    0.00012   0.000048   0.000014    0.00027   0.000716   0.000632
7               0.00000   0.000037   0.000085    0.00024   0.000060   0.000066    0.00007   0.000388   0.000488
8               0.00010   0.000084   0.000043    0.00012   0.000096   0.000035    0.00021   0.000829   0.000090
9               0.00019   0.000207   0.000095    0.00006   0.000124   0.000014    0.00033   0.000000   0.000038
10              0.00013   0.000060   0.000100    0.00012   0.000088   0.000050    0.00055   0.000036   0.000354
11              0.00005   0.000098   0.000013    0.00002   0.000000   0.000014    0.00005   0.000542   0.000185
12              0.00004   0.000096   0.000001    0.00008   0.000004   0.000022    0.00017   0.000093   0.000010
13              0.00006   0.000107   0.000045    0.00014   0.000008   0.000044    0.00039   0.000263   0.000220
14              0.00002   0.000116   0.000038    0.00020   0.000008   0.000056    0.00022   0.000732   0.000665
15              0.00003   0.000098   0.000049    0.00023   0.000028   0.000041    0.00008   0.000316   0.000425
Table 2: Space and precision measurements for "sorted" case.
We compared three algorithms for constructing the summary. First, we used the MRL algorithm to compute a summary where we preallocated the storage required by MRL as a function of $N$ and $\epsilon$. Second, we pre-allocated the same amount of storage required by MRL (1/3 as many stored quantiles as MRL, though), and ran our algorithm without allocating any more quantiles. Finally, we ran our algorithm in the adaptive mode; we started with $\frac{1}{2\epsilon}$ stored quantiles and only allocated extra storage if it was impossible to delete existing quantiles without exceeding a precision of $.001n$.
Table 2 reports the results of this experiment. $|S|$ reports the number of stored quantiles needed to achieve the desired precision. The row labeled "max $\epsilon$" reports the maximum error of all possible quantile queries on the summary. In order to give an indication of the behavior of this algorithm for specific quantiles, the remaining rows list the approximation error of the response to the query for the $q_i/16$th quantile.
To interpret the entries in Table 2, consider the .5 quantile (50th percentile, or 8/16). For a sequence of $10^5$ elements, the adaptive algorithm uses only 756 tuples, but returns a value with an approximation error of .00021. MRL stores over eight times as many quantiles, and returns a value with error .00010, almost twice as accurate. Our preallocated algorithm stores only one third as many tuples as MRL, but returns a value with an approximation error of .00012: comparable accuracy but using only one third the number of tuples.
In fact, however, the error on any individual quantile is not representative of the error as a whole: had we chosen to inspect the 1/4 quantile instead of 1/2, then our algorithm would have been 24 times as accurate as MRL! Had we chosen 3/4, then MRL would have been twice as accurate as ours. Of the 15 quantiles we sampled, we outperformed MRL on 6 out of 15 for a sequence of size $10^5$, 10 out of 15 for size $10^6$, and 11 out of 15 for $10^7$. Individual queries are highly sensitive to how close the quantile query happens to be to some single stored quantile. On average, in comparison to MRL using the same storage, our algorithm reported better worst-case observed error, and comparable observed error (we perform slightly worse for $N = 10^5$, but slightly better for $N = 10^6$ and $10^7$). Both algorithms achieved higher precision than demanded by the a priori specification.
The most interesting result is that our adaptive algorithm seems to require only 756 stored quantiles, regardless of the size of the input sequence. Closer experimentation revealed that the algorithm only needs all 756 stored quantiles at a fairly early stage in the computation; the excess storage reduces the observed error, slightly. One can see this by observing the maximum error in Table 2. For a desired $\epsilon = .001$, one would expect that the maximum observed error would be approximately equal to .001, too. However, for $10^5$ the maximum error is only .000955 and as $N$ gets larger the maximum error gets smaller.
The maximum error offers another interesting insight into the behavior of our algorithm. Note that the optimal value for maximum error in all cases is $1/(2|S|)$ (this occurs only if the stored quantiles are distributed evenly among all values, and we know their rank precisely). For example, for 756 quantiles, the optimal max error is .00066. For 2778 quantiles, the ideal maximum error is .00018. Our algorithm delivers a maximum error within a factor of 2 of optimal. In contrast, the optimal max error of 8334 stored quantiles is $5.99 \times 10^{-5}$, yet the MRL algorithm delivers a max error 6 times as large. In fact, for MRL, the discrepancy between the ideal max error and observed max error seems to grow as $N$ (and $|S|$) gets larger; for $N = 10^7$, the observed max error is more than 9 times the optimal value.
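The comparison against the ideal error $1/(2|S|)$ can be spelled out as a short worked calculation. The summary sizes and observed maximum errors are taken from Table 2; the rounding shown is ours.

```latex
\begin{align*}
\text{Adaptive, } N = 10^5:\quad
  & \frac{1}{2 \cdot 756} \approx 6.61 \times 10^{-4},
  & \frac{0.00095}{6.61 \times 10^{-4}} &\approx 1.44 \\
\text{Preallocated, } N = 10^5:\quad
  & \frac{1}{2 \cdot 2778} \approx 1.80 \times 10^{-4},
  & \frac{0.00027}{1.80 \times 10^{-4}} &\approx 1.50 \\
\text{MRL, } N = 10^5:\quad
  & \frac{1}{2 \cdot 8334} \approx 6.00 \times 10^{-5},
  & \frac{0.00035}{6.00 \times 10^{-5}} &\approx 5.83 \\
\text{MRL, } N = 10^7:\quad
  & \frac{1}{2 \cdot 27475} \approx 1.82 \times 10^{-5},
  & \frac{0.000167}{1.82 \times 10^{-5}} &\approx 9.18
\end{align*}
```

Both variants of the new algorithm stay within a factor of 2 of the ideal, while the MRL ratio grows from about 6 to more than 9 as $N$ increases.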
3.3 Random Input
The third senario, \random", selets eah datum by se-
leting an element (without replaement) from a uniform
skeweddistribution, butthe orderin whihthe values are
observedbythesummaryishosenbytheuniformrandom
proess.
As in the sorted case, we fixed $\epsilon = .001$ and summarized
sequences of lengths $10^5$, $10^6$, and $10^7$. We again computed the
maximum error, the quantiles at rank $\frac{q_i}{16} N$, for $q_i = [1..15]$,
and measured the actual maximum storage requirement to
compute the summary. In contrast to the sorted input case,
where a single experiment was sufficient to determine the
expected behavior, random input requires running several
trials to illuminate expected behavior. We ran each experiment
50 times and report the min, max, mean and standard
deviation for every measurement. Tables 3 through 5 report
these results.
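The measurement protocol described above (repeated trials on randomly ordered input, errors recorded at ranks $\frac{q_i}{16} N$, then min, max, mean and standard deviation per rank) can be sketched as follows. This is only an illustration of the harness: the summary used here is a plain sorted random sample, a stand-in and not the paper's algorithm, and all names and parameter values are assumptions.

```python
import random
import statistics


def sample_summary(stream, m, rng):
    """Stand-in summary (NOT the paper's algorithm): a sorted uniform
    random sample of m elements from the stream."""
    return sorted(rng.sample(stream, m))


def query(summary, phi):
    """Return the stored value whose position best matches quantile phi."""
    idx = min(len(summary) - 1, int(phi * len(summary)))
    return summary[idx]


def run_trials(n, m, trials, seed=0):
    """Run `trials` experiments on random permutations of 1..n and return,
    for each q in 1..15, (min, max, mean, stdev) of the observed error."""
    rng = random.Random(seed)
    errors = {q: [] for q in range(1, 16)}
    for _ in range(trials):
        data = list(range(1, n + 1))  # value v has true rank v
        rng.shuffle(data)             # uniformly random arrival order
        summary = sample_summary(data, m, rng)
        for q in range(1, 16):
            phi = q / 16
            answer = query(summary, phi)
            # Rank error of the answer, as a fraction of n.
            errors[q].append(abs(answer - phi * n) / n)
    return {
        q: (min(e), max(e), statistics.mean(e), statistics.stdev(e))
        for q, e in errors.items()
    }


if __name__ == "__main__":
    for q, (lo, hi, mean, sd) in run_trials(10_000, 500, 50).items():
        print(f"q={q:2d}  range=[{lo:.5f}-{hi:.5f}]  avg={mean:.5f}  stdev={sd:.5f}")
```

Swapping `sample_summary` and `query` for a real quantile summary would reproduce the shape of the reported tables, though not necessarily their numbers.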
The observed $\epsilon$ of our preallocated algorithm is roughly half
that of MRL (that is, it is about twice as accurate), and our
advantage seems to increase steadily as N gets larger. Not
surprisingly, the observed $\epsilon$ of our adaptive algorithm stays
close to 0.001 regardless of how large N gets. The observed
storage requirements, however, may be surprising. These are once
again the most interesting results of our "random" scenario. It
appears that for uniformly random input the required space
is independent of N, the size of the data set, and dependent
only upon $\epsilon$. In all our experiments, a .001-approximate
summary of a random input was achieved with roughly 920
tuples.
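To make the storage comparison concrete, the following Python sketch tabulates the $|S|$ values reported in Tables 3 through 5 and computes how much more storage MRL and the preallocated variant use relative to the mean adaptive storage. The numbers are copied from those tables; the variable and function names are illustrative.

```python
# (N, MRL |S|, preallocated |S|, mean adaptive |S|) as reported in Tables 3-5.
STORAGE = [
    (10**5, 8334, 2778, 919.18),
    (10**6, 15155, 5052, 919.38),
    (10**7, 27475, 9158, 918.42),
]


def storage_ratios(rows):
    """Return (N, MRL/adaptive, preallocated/adaptive) for each row."""
    return [(n, mrl / adaptive, pre / adaptive) for n, mrl, pre, adaptive in rows]


if __name__ == "__main__":
    for n, mrl_ratio, pre_ratio in storage_ratios(STORAGE):
        print(f"N = {n:>10,}: MRL {mrl_ratio:5.1f}x, "
              f"preallocated {pre_ratio:5.1f}x the adaptive storage")
```

The adaptive column barely moves while the other two grow with N, which is the trend the paragraph describes.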
4. CONCLUDING REMARKS
We presented a new online algorithm for computing quantile
summaries of very large sequences of data in a space-efficient
manner. Our algorithm improves upon the earlier results in
two significant ways. First, it improves the space complexity
by a factor of $\Omega(\log(\epsilon N))$. Second, it does not require a
priori knowledge of the parameter N; that is, it allocates
more space dynamically as the data sequence grows in size.
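The size of the first improvement can be read off the two worst-case bounds stated in the abstract. The following is an informal comparison only: it treats both bounds as tight up to hypothetical constants $c_1$ and $c_2$, which the paper does not specify.

```latex
\[
  S_{\text{prev}}(N) = c_1 \cdot \frac{1}{\epsilon}\log^{2}(\epsilon N),
  \qquad
  S_{\text{new}}(N) = c_2 \cdot \frac{1}{\epsilon}\log(\epsilon N),
\]
\[
  \frac{S_{\text{prev}}(N)}{S_{\text{new}}(N)}
  = \frac{c_1}{c_2}\,\log(\epsilon N).
\]
```

For instance, with $\epsilon = .001$ and $N = 10^7$ we have $\epsilon N = 10^4$, so the ratio grows like $\log_2(10^4) \approx 13.3$ times a constant, assuming base-2 logarithms.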
An obvious question is whether or not the space complexity
achieved by our algorithm is asymptotically optimal. We
believe that the answer is indeed in the affirmative.
Our empirical study of the new algorithm provides evidence
that our algorithm compares favorably with the previous
algorithms in practice as well. A curious trend observed in
our experiments is that on random inputs, the space requirements
of the algorithm seem only to depend on the error parameter
$\epsilon$ and become independent of the sequence length
N. It will be interesting to analytically verify this behavior
and to understand the minimal characteristics of the data
sequences that lead to such improved space requirements.
5. REFERENCES
[1] Rakesh Agrawal and Arun Swami. A one-pass
space-efficient algorithm for finding quantiles. Proc.
7th Int. Conf. Management of Data, COMAD,
28-30 December 1995.
[2] Khaled Alsabti, Sanjay Ranka, and Vineet Singh. A
one-pass algorithm for accurately estimating quantiles
for disk-resident data. Proceedings of the 23rd Intl.
[...] CA 94022, USA, 1997. Morgan Kaufmann Publishers.
[3] Surajit Chaudhuri, Rajeev Motwani, and Vivek
Narasayya. Random sampling for histogram
construction: how much is enough? In ACM SIGMOD
'98, volume 28, pages 436-447, Seattle, WA, June 1-4,
1998.
[4] Phillip B. Gibbons, Yossi Matias, and Viswanath
Poosala. Fast incremental maintenance of approximate
histograms. In Proceedings of the 23rd Intl. Conf. Very
Large Data Bases, VLDB, pages 466-475. Morgan
Kaufmann, 25-27 August 1997.
[5] Michael B. Greenwald. Practical algorithms for self
scaling histograms or better than average data
collection. Performance Evaluation, 27&28:19-40,
October 1996.
[6] R. Jain and I. Chlamtac. The P^2 algorithm for
dynamic calculation of quantile and histograms
without storing observations. Communications of the
ACM, 28(10):1076-1085, October 1986.
[7] I. Pohl. A minimum storage algorithm for computing
the median. IBM Research Report RC 2701, November
1969.
[8] Gurmeet Singh Manku, Sridhar Rajagopalan, and
Bruce G. Lindsay. Approximate medians and other
quantiles in one pass and with limited memory. ACM
SIGMOD '98, volume 28, pages 426-435, Seattle, WA,
June 1998.
[9] Gurmeet Singh Manku, Sridhar Rajagopalan, and
Bruce G. Lindsay. Random sampling techniques for
space efficient online computation of order statistics of
large datasets. In ACM SIGMOD '99, volume 29,
pages 251-262. Philadelphia, PA, June 1999.
[10] J. I. Munro and M. S. Paterson. Selection and sorting
with limited storage. Theoretical Computer Science,
vol. 12: 315-323; 1980.
[11] M. S. Paterson. Progress in selection. Technical Report,
University of Warwick, Coventry, UK, 1997.
[12] Viswanath Poosala, Venkatesh Ganti, and Yannis E.
Ioannidis. Approximate query answering using
histograms. Bulletin of the IEEE Technical Committee
on Data Engineering, 22(4):6-15, December 1999.
[13] Viswanath Poosala, Peter J. Haas, Yannis E.
Ioannidis, and Eugene J. Shekita. Improved
histograms for selectivity estimation of range
predicates. In ACM SIGMOD 96, volume 26, pages
294-305, Montreal, Quebec, Canada, June 4-6, 1996.
q_i  MRL                                 Our Algorithm, Preallocated         Our Algorithm, Adaptive
|S|  8334                                2778                                [898-939]      919.18     8.63
     range(×10^-4)  avg        stdev     range(×10^-4)  avg        stdev     range(×10^-4)  avg        stdev
Max  [4.3-5.2]      0.0004698  2.02e-05  [2.9-2.95]     0.0002920  0.24e-05  [8.25-8.70]    0.0008487  0.91e-05
1    [0.0-3.2]      0.0000928  7.38e-05  [0.1-2.5]      0.0001074  7.19e-05  [0.1-7.8]      0.0003222  1.88e-04
2    [0.0-3.0]      0.0001130  7.58e-05  [0.2-2.5]      0.0001216  6.42e-05  [0.1-7.0]      0.0003216  1.88e-04
3    [0.0-3.5]      0.0001104  8.86e-05  [0.0-2.7]      0.0001220  7.36e-05  [0.2-7.7]      0.0003406  2.07e-04
4    [0.0-2.8]      0.0001040  6.93e-05  [0.0-2.7]      0.0001236  7.44e-05  [0.1-7.6]      0.0002952  1.98e-04
5    [0.0-3.7]      0.0001172  8.81e-05  [0.0-2.6]      0.0000844  6.07e-05  [0.1-6.6]      0.0003102  1.88e-04
6    [0.1-3.0]      0.0001046  7.69e-05  [0.0-3.3]      0.0000912  7.41e-05  [0.2-6.7]      0.0002986  1.64e-04
7    [0.2-3.6]      0.0001346  7.97e-05  [0.0-2.5]      0.0001078  6.45e-05  [0.0-6.9]      0.0003090  1.89e-04
8    [0.1-3.8]      0.0000982  8.86e-05  [0.0-3.1]      0.0001134  7.08e-05  [0.0-7.7]      0.0002910  1.94e-04
9    [0.0-2.7]      0.0001222  7.37e-05  [0.0-2.5]      0.0001074  7.62e-05  [0.0-6.6]      0.0002910  1.75e-04
10   [0.0-3.4]      0.0001278  7.68e-05  [0.0-2.3]      0.0000912  6.01e-05  [0.0-7.0]      0.0002740  1.69e-04
11   [0.1-3.1]      0.0001204  7.87e-05  [0.0-2.8]      0.0000954  7.31e-05  [0.1-6.9]      0.0002790  1.84e-04
12   [0.1-2.4]      0.0001040  6.83e-05  [0.0-2.4]      0.0000940  6.71e-05  [0.2-8.2]      0.0003566  2.32e-04
13   [0.0-3.0]      0.0000878  6.83e-05  [0.0-2.3]      0.0001114  6.49e-05  [0.2-7.6]      0.0003446  2.01e-04
14   [0.0-3.1]      0.0000982  8.05e-05  [0.0-2.5]      0.0001196  6.80e-05  [0.4-8.2]      0.0003424  1.99e-04
15   [0.0-2.8]      0.0001000  7.12e-05  [0.0-2.8]      0.0001330  8.24e-05  [0.1-6.2]      0.0002952  1.86e-04
Table 3: N = 100,000; Samples = 50; random order.
q_i  MRL                                 Our Algorithm, Preallocated         Our Algorithm, Adaptive
|S|  15155                               5052                                [900-939]      919.38     8.92
     range(×10^-4)  avg        stdev     range(×10^-4)  avg        stdev     range(×10^-4)  avg        stdev
Max  [3.02-3.63]    0.0003275  1.44e-05  [1.495-1.520]  15.04e-05  0.06e-05  [7.835-8.215]  0.0008004  0.82e-05
1    [0.02-3.00]    0.0001194  7.88e-05  [0.05-1.41]    5.41e-05   3.37e-05  [0.00-7.78]    0.0003173  2.12e-04
2    [0.09-3.19]    0.0001248  7.69e-05  [0.04-1.41]    5.79e-05   3.65e-05  [0.06-6.94]    0.0003259  1.80e-04
3    [0.01-2.90]    0.0001253  7.27e-05  [0.01-1.28]    5.73e-05   3.71e-05  [0.15-7.11]    0.0003172  1.87e-04
4    [0.01-2.71]    0.0001092  7.47e-05  [0.02-1.43]    5.57e-05   3.46e-05  [0.07-7.04]    0.0003546  1.97e-04
5    [0.12-2.84]    0.0001260  7.44e-05  [0.03-1.36]    5.45e-05   3.59e-05  [0.02-7.06]    0.0002907  1.78e-04
6    [0.01-3.20]    0.0000984  7.68e-05  [0.01-1.22]    5.89e-05   3.26e-05  [0.29-6.57]    0.0002972  1.76e-04
7    [0.01-2.79]    0.0001256  7.52e-05  [0.01-1.38]    5.03e-05   3.58e-05  [0.09-6.30]    0.0002951  1.60e-04
8    [0.05-3.27]    0.0001299  6.03e-05  [0.01-1.21]    4.55e-05   3.37e-05  [0.11-7.10]    0.0002892  1.73e-04
9    [0.22-3.27]    0.0001268  7.75e-05  [0.05-1.24]    5.88e-05   3.57e-05  [0.04-7.15]    0.0003015  2.04e-04
10   [0.13-3.74]    0.0001389  8.64e-05  [0.03-1.61]    7.14e-05   3.88e-05  [0.02-7.07]    0.0002924  2.04e-04
11   [0.09-3.01]    0.0001431  7.67e-05  [0.00-1.38]    5.81e-05   3.58e-05  [0.11-6.43]    0.0002989  2.01e-04
12   [0.03-3.32]    0.0001446  8.64e-05  [0.00-1.46]    4.86e-05   3.33e-05  [0.20-6.71]    0.0003378  1.66e-04
13   [0.04-2.84]    0.0001339  7.25e-05  [0.00-1.34]    5.30e-05   3.42e-05  [0.04-6.69]    0.0003128  1.70e-04
14   [0.04-2.74]    0.0001288  8.91e-05  [0.03-1.43]    5.65e-05   3.60e-05  [0.02-7.03]    0.0003146  1.86e-04
15   [0.02-2.92]    0.0001284  8.82e-05  [0.02-1.67]    5.45e-05   3.86e-05  [0.05-6.46]    0.0002797  1.72e-04
Table 4: N = 1,000,000; Samples = 50; random order.
q_i  MRL                                 Our Algorithm, Preallocated         Our Algorithm, Adaptive
|S|  27475                               9158                                [899-939]      918.42     8.71
     range(×10^-4)  avg        stdev     range(×10^-4)  avg        stdev     range(×10^-4)  avg        stdev
Max  [2.032-2.641]  2.35e-04   1.18e-05  [0.799-0.806]  8.01e-05   1.8e-07   [7.628-8.016]  7.82e-04   9.75e-06
1    [0.026-1.466]  4.98e-05   3.29e-05  [0.002-0.712]  2.74e-05   1.96e-05  [0.187-6.123]  2.87e-04   1.65e-04
2    [0.022-1.922]  6.32e-05   4.98e-05  [0.001-0.764]  2.94e-05   2.22e-05  [0.166-6.814]  3.04e-04   1.80e-04
3    [0.019-1.750]  5.90e-05   4.62e-05  [0.002-0.656]  2.93e-05   1.80e-05  [0.008-7.040]  3.68e-04   1.91e-04
4    [0.024-1.953]  6.19e-05   4.37e-05  [0.003-0.615]  2.98e-05   1.65e-05  [0.096-7.149]  2.98e-04   1.81e-04
5    [0.022-1.892]  7.02e-05   5.03e-05  [0.011-0.722]  2.99e-05   1.63e-05  [0.111-7.297]  2.56e-04   1.80e-04
6    [0.026-1.766]  6.61e-05   4.65e-05  [0.008-0.655]  2.60e-05   1.86e-05  [0.021-6.618]  3.27e-04   1.72e-04
7    [0.038-1.987]  5.75e-05   4.33e-05  [0.025-0.688]  3.30e-05   1.63e-05  [0.009-5.620]  2.14e-04   1.47e-04
8    [0.004-1.801]  5.69e-05   4.29e-05  [0.006-0.712]  2.69e-05   2.01e-05  [0.043-7.718]  3.17e-04   1.96e-04
9    [0.012-2.252]  6.47e-05   4.19e-05  [0.003-0.675]  2.90e-05   1.83e-05  [0.116-7.167]  2.83e-04   1.93e-04
10   [0.011-1.840]  6.11e-05   4.28e-05  [0.006-0.649]  2.64e-05   1.67e-05  [0.050-7.225]  3.09e-04   1.83e-04
11   [0.010-1.640]  6.67e-05   4.41e-05  [0.005-0.727]  2.99e-05   1.78e-05  [0.231-6.606]  2.60e-04   1.66e-04
12   [0.013-1.847]  6.09e-05   4.69e-05  [0.013-0.686]  2.68e-05   1.71e-05  [0.018-6.639]  2.95e-04   1.51e-04
13   [0.005-1.747]  5.80e-05   3.87e-05  [0.015-0.680]  2.82e-05   1.93e-05  [0.014-6.518]  3.06e-04   1.90e-04
14   [0.026-1.853]  7.12e-05   5.07e-05  [0.000-0.671]  3.43e-05   1.84e-05  [0.051-7.385]  2.69e-04   1.99e-04
15   [0.022-1.510]  5.57e-05   3.56e-05  [0.019-0.775]  2.91e-05   1.83e-05  [0.029-6.415]  2.74e-04   1.80e-04
Table 5: N = 10,000,000; Samples = 50; random order.