Algorithms

Ralph Neininger
Institute for Mathematics, Goethe University, 60054 Frankfurt a.M., Germany
neininger@math.uni-frankfurt.de

Jasmin Straub
Institute for Mathematics, Goethe University, 60054 Frankfurt a.M., Germany
jstraub@math.uni-frankfurt.de

Abstract
In this extended abstract a general framework is developed to bound rates of convergence for sequences of random variables as they mainly arise in the analysis of random trees and divide-and-conquer algorithms. The rates of convergence are bounded in the Zolotarev distances. Concrete examples from the analysis of algorithms and data structures are discussed, as well as a few examples from other areas. They lead to convergence rates of polynomial and logarithmic order. A crucial role is played by a factor 3 in the exponent of these orders in cases where the normal distribution is the limit distribution.

2012 ACM Subject Classification Theory of computation → Sorting and searching; Theory of computation → Divide and conquer

Keywords and phrases weak convergence, probabilistic analysis of algorithms, random trees, probability metrics

Digital Object Identifier 10.4230/LIPIcs.CVIT.2016.23
1 Introduction and notation

In this extended abstract we consider a general recurrence for (probability) distributions which covers many instances of complexity measures of divide-and-conquer algorithms and parameters of random search trees. We consider a sequence $(Y_n)_{n\ge 0}$ of $d$-dimensional random vectors satisfying the distributional recursion
\[ Y_n \stackrel{d}{=} \sum_{r=1}^{K} A_r(n)\, Y^{(r)}_{I_r^{(n)}} + b_n, \qquad n \ge n_0, \tag{1} \]
where $(A_1(n),\dots,A_K(n),b_n,I^{(n)})$, $(Y_n^{(1)})_{n\ge 0},\dots,(Y_n^{(K)})_{n\ge 0}$ are independent, the coefficients $A_1(n),\dots,A_K(n)$ are random $(d\times d)$-matrices, $b_n$ is a $d$-dimensional random vector, $I^{(n)} = (I_1^{(n)},\dots,I_K^{(n)})$ is a random vector in $\{0,\dots,n\}^K$, $n_0 \ge 1$, and $(Y_n^{(r)})_{n\ge 0} \stackrel{d}{=} (Y_n)_{n\ge 0}$ for $r=1,\dots,K$. Moreover, $K\ge 1$ is a fixed integer, but extensions to $K$ being random and depending on $n$ are possible.
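For intuition, recursion (1) can be simulated directly in dimension $d=1$. The following sketch is our own illustration, not part of the framework: the callback interface (`coeffs`, `base`) and the toy instance ($K=2$, $A_r=1$, $b_n=1$, uniform split) are illustrative assumptions.

```python
import random

def sample_Y(n, coeffs, base, n0=1):
    """Draw one sample of Y_n from recursion (1) in dimension d = 1.

    coeffs(n) returns one joint draw (A_1, ..., A_K, b_n, (I_1, ..., I_K));
    base(n) gives the start value for n < n0.  The recursive calls are
    independent, matching the independence assumptions of (1).
    """
    if n < n0:
        return base(n)
    *A, b, I = coeffs(n)
    return sum(a * sample_Y(i, coeffs, base, n0) for a, i in zip(A, I)) + b

# Toy instance with K = 2, A_1 = A_2 = 1, b_n = 1: n items are split at a
# uniform position and one unit of cost is paid per split.
def toy_coeffs(n):
    i = random.randrange(n)               # I_1 uniform on {0, ..., n-1}
    return 1.0, 1.0, 1.0, (i, n - 1 - i)  # A_1, A_2, b_n, (I_1, I_2)

random.seed(0)
samples = [sample_Y(50, toy_coeffs, base=lambda n: 0.0) for _ in range(100)]
```

Since both subproblem sizes are at most $n-1$, the recursion always terminates.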
This is the framework of [14], where some general convergence results are shown for appropriate normalizations of the $Y_n$. The contribution of the present extended abstract is to also study the rates of convergence in such limit theorems.
We define the normalized sequence $(X_n)_{n\ge 0}$ by
\[ X_n := C_n^{-1/2}(Y_n - M_n), \qquad n\ge 0, \]
where $M_n$ is a $d$-dimensional vector and $C_n$ a positive definite $(d\times d)$-matrix. Essentially, we choose $M_n$ as the mean and $C_n$ as the covariance matrix of $Y_n$ if they exist, or as the leading-order terms in expansions of these moments as $n\to\infty$. The normalized quantities satisfy the following modified recursion:
\[ X_n \stackrel{d}{=} \sum_{r=1}^{K} A_r^{(n)} X^{(r)}_{I_r^{(n)}} + b^{(n)}, \qquad n\ge n_0, \tag{2} \]
with
\[ A_r^{(n)} := C_n^{-1/2} A_r(n)\, C_{I_r^{(n)}}^{1/2}, \qquad b^{(n)} := C_n^{-1/2}\Bigg( b_n - M_n + \sum_{r=1}^{K} A_r(n)\, M_{I_r^{(n)}} \Bigg) \tag{3} \]
and independence relations as in (1).
In the context of the contraction method the aim is to establish transfer theorems of the following form: once appropriate convergence of the coefficients $A_r^{(n)} \to A_r^*$, $b^{(n)} \to b^*$ has been verified, convergence in distribution of the random vectors $(X_n)$ to a limit $X$ is implied. The limit distribution $\mathcal{L}(X)$ is identified by a fixed-point equation obtained from (2) by formally letting $n\to\infty$:
\[ X \stackrel{d}{=} \sum_{r=1}^{K} A_r^* X^{(r)} + b^*. \tag{4} \]
Here $(A_1^*,\dots,A_K^*,b^*), X^{(1)},\dots,X^{(K)}$ are independent and $X^{(r)} \stackrel{d}{=} X$ for $r=1,\dots,K$.
The aim of the present extended abstract is to endow such general transfer theorems with bounds on the rates of convergence. As a distance measure between (probability) distributions we use the Zolotarev metric. For many of the applications we discuss, bounds on the rate of convergence have previously been derived one by one for more popular distance measures such as the Kolmogorov–Smirnov distance. The transfer theorems of the present paper are in terms of the smoother Zolotarev metrics. However, they are easy to apply and cover a broad range of applications at once. A crucial role is played by a factor 3 in the exponent of these orders in cases where the normal distribution is the limit distribution, see Remark 4.
In the rest of this section we fix some notation. Regarding norms of vectors and (random) matrices, we denote for $x\in\mathbb{R}^d$ by $\|x\|$ its Euclidean norm, and for a random vector $X$ and $0<p<\infty$ we set $\|X\|_p := \mathbb{E}[\|X\|^p]^{(1/p)\wedge 1}$. Furthermore, for a $(d\times d)$-matrix $A$, $\|A\|_{\mathrm{op}} := \sup_{\|x\|=1} \|Ax\|$ denotes the spectral norm of $A$, and for a random square matrix $A$ we define $\|A\|_p := \mathbb{E}[\|A\|_{\mathrm{op}}^p]^{(1/p)\wedge 1}$ for $0<p<\infty$. Note that for a symmetric $(d\times d)$-matrix $A$ we have $\|A\|_{\mathrm{op}} = \max\{|\lambda| : \lambda \text{ eigenvalue of } A\}$. By $\mathrm{Id}_d$ the $d$-dimensional identity matrix is denoted. For multilinear forms the norm is defined similarly.
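The eigenvalue characterization of the spectral norm for symmetric matrices is easy to check numerically; a small sketch of our own (NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(0)

# Spectral norm vs. the eigenvalue characterization: for a symmetric S,
# ||S||_op equals the largest absolute eigenvalue of S.
A = rng.standard_normal((4, 4))
S = (A + A.T) / 2                       # a symmetric (4 x 4)-matrix
op_norm = np.linalg.norm(S, ord=2)      # sup_{||x||=1} ||S x||
eig_max = np.max(np.abs(np.linalg.eigvalsh(S)))
assert np.isclose(op_norm, eig_max)
```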
Furthermore, we define by $\mathcal{P}_d$ the space of probability distributions on $\mathbb{R}^d$ (endowed with the Borel $\sigma$-field), by $\mathcal{P}_s^d := \{\mathcal{L}(X)\in\mathcal{P}_d : \|X\|_s < \infty\}$, and, for a vector $m\in\mathbb{R}^d$ and a symmetric positive semidefinite $(d\times d)$-matrix $C$, the spaces
\[ \mathcal{P}_s^d(m) := \{\mathcal{L}(X)\in\mathcal{P}_s^d : \mathbb{E}[X]=m\}, \quad s>1, \tag{5} \]
\[ \mathcal{P}_s^d(m,C) := \{\mathcal{L}(X)\in\mathcal{P}_s^d : \mathbb{E}[X]=m,\ \mathrm{Cov}(X)=C\}, \quad s>2. \]
We use the convention $\mathcal{P}_s^d(m) := \mathcal{P}_s^d$ for $s\le 1$ and $\mathcal{P}_s^d(m,C) := \mathcal{P}_s^d(m)$ for $s\le 2$.
The Zolotarev metrics $\zeta_s$, see [19], are defined for probability distributions $\mathcal{L}(X), \mathcal{L}(Y)\in\mathcal{P}_d$ by
\[ \zeta_s(X,Y) := \zeta_s(\mathcal{L}(X),\mathcal{L}(Y)) = \sup_{f\in\mathcal{F}_s} \big| \mathbb{E}[f(X) - f(Y)] \big|, \tag{6} \]
where, for $s = m+\alpha$ with $0<\alpha\le 1$ and $m\in\mathbb{N}_0$,
\[ \mathcal{F}_s := \{ f\in C^m(\mathbb{R}^d,\mathbb{R}) : \|f^{(m)}(x) - f^{(m)}(y)\| \le \|x-y\|^\alpha \}. \]
Note that these distance measures may be infinite. Finite metrics are given by $\zeta_s$ on $\mathcal{P}_s^d$ for $0<s\le 1$, by $\zeta_s$ on $\mathcal{P}_s^d(m)$ for $1<s\le 2$, and by $\zeta_s$ on $\mathcal{P}_s^d(m,C)$ for $2<s\le 3$, cf. (5).
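For $s=1$ the class $\mathcal{F}_1$ consists of the 1-Lipschitz functions, so $\zeta_1$ coincides with the Kantorovich–Wasserstein $L_1$ distance, which in one dimension can be estimated from equal-size samples by matching order statistics. A minimal sketch of our own (NumPy assumed; the two distributions compared are toy choices):

```python
import numpy as np

rng = np.random.default_rng(1)

def zeta1_empirical(x, y):
    """Estimate zeta_1 between two distributions from equal-size samples.

    zeta_1 is the Kantorovich-Wasserstein L1 distance; in one dimension
    it is the average absolute difference between the sorted samples.
    """
    return np.mean(np.abs(np.sort(x) - np.sort(y)))

# Toy comparison: a uniform sample on [-1, 1] versus a standard normal
# sample; the analytic value of this distance is about 0.30.
u = rng.uniform(-1.0, 1.0, size=10_000)
z = rng.standard_normal(size=10_000)
d = zeta1_empirical(u, z)
```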
2 Results

We return to the situation outlined in the introduction, where we have normalized $(Y_n)_{n\ge 0}$ in the following way:
\[ X_n := C_n^{-1/2}(Y_n - M_n), \qquad n\ge 0, \tag{7} \]
where $M_n$ is a $d$-dimensional vector and $C_n$ a positive definite $(d\times d)$-matrix. As recalled in Section 1, for $s>1$ we may fix the mean and covariance matrix of the scaled quantities to guarantee finiteness of the $\zeta_s$-metric. Therefore, we choose $M_n = \mathbb{E}[Y_n]$ for $n\ge 0$ and $s>1$. For $s>2$, we additionally have to control the covariances of $X_n$. We assume that there exists an $n_1\ge 0$ such that $\mathrm{Cov}(Y_n)$ is positive definite for $n\ge n_1$, and choose $C_n = \mathrm{Cov}(Y_n)$ for $n\ge n_1$ and $C_n = \mathrm{Id}_d$ for $n<n_1$. For $s\le 2$, we just assume that $C_n$ is positive definite and set $n_1 = 0$ in this case.

The normalized quantities satisfy the modified recursion
\[ X_n \stackrel{d}{=} \sum_{r=1}^{K} A_r^{(n)} X^{(r)}_{I_r^{(n)}} + b^{(n)}, \qquad n\ge n_0, \]
with $A_r^{(n)}$ and $b^{(n)}$ given in (3). The following theorem provides a general framework to bound rates of convergence for the sequence $(X_n)_{n\ge 0}$. For the proof, we need some technical conditions which guarantee that the sizes $I_r^{(n)}$ of the subproblems grow with $n$. More precisely, we will assume that there exists some monotonically decreasing sequence $R(n)>0$ with $R(n)\to 0$ such that
\[ \big\| \mathbf{1}_{\{I_r^{(n)}<\ell\}} A_r^{(n)} \big\|_s = \mathrm{O}(R(n)), \qquad n\to\infty, \tag{8} \]
for all $\ell\in\mathbb{N}$ and $r=1,\dots,K$, and that
\[ \big\| \mathbf{1}_{\{I_r^{(n)}=n\}} A_r^{(n)} \big\|_s \to 0, \qquad n\to\infty, \tag{9} \]
for all $r=1,\dots,K$.
2.1 A general transfer theorem for rates of convergence

Our first result is a direct extension of the main Theorem 4.1 in [14], where we essentially only make all the estimates there explicit. The main result of the present extended abstract is contained in the subsequent subsection.

▶ Theorem 1. Let $(X_n)_{n\ge 0}$ be $s$-integrable, $0<s\le 3$, and satisfy recurrence (7) with the choices for $M_n$ and $C_n$ specified there. We assume that there exist $s$-integrable $A_1^*,\dots,A_K^*, b^*$ and some monotonically decreasing sequence $R(n)>0$ with $R(n)\to 0$ such that, as $n\to\infty$,
\[ \big\| b^{(n)} - b^* \big\|_s + \sum_{r=1}^{K} \big\| A_r^{(n)} - A_r^* \big\|_s = \mathrm{O}(R(n)). \tag{10} \]
If conditions (8) and (9) are satisfied and if
\[ \limsup_{n\to\infty}\ \mathbb{E}\Bigg[ \sum_{r=1}^{K} \frac{R(I_r^{(n)})}{R(n)} \big\| A_r^{(n)} \big\|_{\mathrm{op}}^s \Bigg] < 1, \tag{11} \]
then we have, as $n\to\infty$,
\[ \zeta_s(X_n, X) = \mathrm{O}(R(n)), \]
where $\mathcal{L}(X)$ is given as the unique fixed point in $\mathcal{P}_s^d(0,\mathrm{Id}_d)$ of the equation
\[ X \stackrel{d}{=} \sum_{r=1}^{K} A_r^* X^{(r)} + b^*, \tag{12} \]
with $(A_1^*,\dots,A_K^*,b^*), X^{(1)},\dots,X^{(K)}$ independent and $X^{(r)} \stackrel{d}{=} X$ for $r=1,\dots,K$.
▶ Remark 2. In applications, the convergence rate of the coefficients (conditions (8) and (10)) is often faster than the convergence rate of the quantities $X_n$, see, e.g., Section 4.4. In these cases, it is often possible to perform the induction step in the proof of Theorem 1 although condition (11) does not hold. To be more precise, we may assume
\[ \big\| \mathbf{1}_{\{I_r^{(n)}<\ell\}} A_r^{(n)} \big\|_s + \big\| b^{(n)} - b^* \big\|_s + \big\| A_r^{(n)} - A_r^* \big\|_s = \mathrm{O}(\widetilde{R}(n)) \]
for every $\ell\ge 0$, $r=1,\dots,K$ and $n\to\infty$. Then, instead of condition (11), it is sufficient to find some $K>0$ such that
\[ \mathbb{E}\Bigg[ \sum_{r=1}^{K} \mathbf{1}_{\{n_1\le I_r^{(n)}<n\}} \frac{R(I_r^{(n)})}{R(n)} \big\| A_r^{(n)} \big\|_{\mathrm{op}}^s \Bigg] \le 1 - p_n - \frac{\widetilde{R}(n)}{K R(n)} \tag{13} \]
for all large $n$, with $p_n := \mathbb{E}\big[ \sum_{r=1}^{K} \mathbf{1}_{\{I_r^{(n)}=n\}} \|A_r^{(n)}\|_{\mathrm{op}}^s \big]$.
2.2 An improved transfer theorem for normal limit distributions

We now consider the special case where the sequence $(X_n)_{n\ge 0}$ is 3-integrable and satisfies recursion (2) with $(A_1^{(n)},\dots,A_K^{(n)},b^{(n)}) \stackrel{L_3}{\longrightarrow} (A_1^*,\dots,A_K^*,b^*)$ for some 3-integrable coefficients $A_1^*,\dots,A_K^*,b^*$ with
\[ b^* = 0, \qquad \sum_{r=1}^{K} A_r^* (A_r^*)^T = \mathrm{Id}_d \]
almost surely. Corollary 3.4 in [14] implies that, if $\mathbb{E}\big[\sum_{r=1}^{K} \|A_r^*\|_{\mathrm{op}}^3\big] < 1$, equation (12) has a unique solution in the space $\mathcal{P}_3^d(0,\mathrm{Id}_d)$. Furthermore, e.g., using characteristic functions, it is easily checked that this unique solution is the standard normal distribution $\mathcal{N}(0,\mathrm{Id}_d)$.

In this special case of normal limit laws, it is possible to derive a refined version of Theorem 1. Instead of the technical condition (8), we now need only the weaker condition
\[ \big\| \mathbf{1}_{\{I_r^{(n)}<\ell\}} A_r^{(n)} \big\|_3^3 = \mathrm{O}(R(n)), \qquad n\to\infty, \tag{14} \]
for all $\ell\in\mathbb{N}$ and $r=1,\dots,K$. Moreover, condition (10) concerning the convergence rates of the coefficients can be weakened, as formulated in the following theorem.
▶ Theorem 3. Let $(X_n)_{n\ge 0}$ be given as in (7) and be 3-integrable. We assume that for some $R(n)>0$ monotonically decreasing with $R(n)\to 0$ as $n\to\infty$ we have
\[ \Big\| \sum_{r=1}^{K} A_r^{(n)} (A_r^{(n)})^T - \mathrm{Id}_d \Big\|_{3/2}^{3/2} + \big\| b^{(n)} \big\|_3^3 = \mathrm{O}(R(n)), \tag{15} \]
and that the technical conditions (9) and (14) are satisfied for $s=3$. If
\[ \limsup_{n\to\infty}\ \mathbb{E}\Bigg[ \sum_{r=1}^{K} \frac{R(I_r^{(n)})}{R(n)} \big\| A_r^{(n)} \big\|_{\mathrm{op}}^3 \Bigg] < 1, \tag{16} \]
then we have, as $n\to\infty$,
\[ \zeta_3(X_n, \mathcal{N}(0,\mathrm{Id}_d)) = \mathrm{O}(R(n)). \]
Proof. (Sketch) We define an accompanying sequence $(Z_n^*)_{n\ge 0}$ by
\[ Z_n^* := \sum_{r=1}^{K} A_r^{(n)} T_{I_r^{(n)}} N^{(r)} + b^{(n)}, \qquad n\ge 0, \]
where $(A_1^{(n)},\dots,A_K^{(n)},I^{(n)},b^{(n)}), N^{(1)},\dots,N^{(K)}$ are independent, $\mathcal{L}(N^{(r)}) = \mathcal{N}(0,\mathrm{Id}_d)$ for $r=1,\dots,K$, and $T_n T_n^T = \mathrm{Cov}(X_n)$ for $n\ge 0$. Hence, $Z_n^*$ is $L_3$-integrable, $\mathbb{E}[Z_n^*] = 0$ and $\mathrm{Cov}(Z_n^*) = \mathrm{Id}_d$ for all $n\ge n_1$. By the triangle inequality, we have
\[ \zeta_3(X_n, \mathcal{N}(0,\mathrm{Id}_d)) \le \zeta_3(X_n, Z_n^*) + \zeta_3(Z_n^*, \mathcal{N}(0,\mathrm{Id}_d)). \]
Then, the assertion follows inductively once the bound $\zeta_3(Z_n^*, \mathcal{N}(0,\mathrm{Id}_d)) = \mathrm{O}(R(n))$ is shown: Using the convolution property of the multidimensional normal distribution, we obtain the representation
\[ Z_n^* = \sum_{r=1}^{K} A_r^{(n)} T_{I_r^{(n)}} N^{(r)} + b^{(n)} \stackrel{d}{=} G_n N + b^{(n)}, \tag{17} \]
where $G_n G_n^T = \sum_{r=1}^{K} A_r^{(n)} T_{I_r^{(n)}} T_{I_r^{(n)}}^T (A_r^{(n)})^T$, $\mathcal{L}(N) = \mathcal{N}(0,\mathrm{Id}_d)$, and $N$ is independent of $(G_n, b^{(n)})$. As $\mathrm{Cov}(Z_n^*) = \mathrm{Id}_d$ for all $n\ge n_1$, we have $\mathbb{E}[G_n G_n^T + b^{(n)} (b^{(n)})^T] = \mathrm{Id}_d$ for $n\ge n_1$. Furthermore, we have $\|b^{(n)}\|_3^3 = \mathrm{O}(R(n))$ and
\begin{align*}
\big\| G_n G_n^T - \mathrm{Id}_d \big\|_{3/2}^{3/2}
&= \Big\| \sum_{r=1}^{K} A_r^{(n)} T_{I_r^{(n)}} T_{I_r^{(n)}}^T (A_r^{(n)})^T - \mathrm{Id}_d \Big\|_{3/2}^{3/2} \\
&= \mathrm{O}\Bigg( \Big\| \sum_{r=1}^{K} \mathbf{1}_{\{I_r^{(n)}<n_1\}} A_r^{(n)} \big( T_{I_r^{(n)}} T_{I_r^{(n)}}^T - \mathrm{Id}_d \big) (A_r^{(n)})^T \Big\|_{3/2}^{3/2} + \Big\| \sum_{r=1}^{K} A_r^{(n)} (A_r^{(n)})^T - \mathrm{Id}_d \Big\|_{3/2}^{3/2} \Bigg) \\
&= \mathrm{O}\Bigg( \sum_{r=1}^{K} \big\| \mathbf{1}_{\{I_r^{(n)}<n_1\}} A_r^{(n)} \big\|_3^3 + \Big\| \sum_{r=1}^{K} A_r^{(n)} (A_r^{(n)})^T - \mathrm{Id}_d \Big\|_{3/2}^{3/2} \Bigg) \\
&= \mathrm{O}(R(n)).
\end{align*}
Thus, the following Lemma 5 implies $\zeta_3(Z_n^*, \mathcal{N}(0,\mathrm{Id}_d)) = \mathrm{O}(R(n))$. Lemma 5 is the main part of the present proof. ◀
▶ Remark 4. Theorem 3, when applicable, often improves over Theorem 1 by a factor 3 in the exponent, see Remark 9 for an example. This is caused by the additional exponents in (15) in comparison to (10).
▶ Lemma 5. Let $(Z_n^*)_{n\ge 0}$ be a sequence of $d$-dimensional random vectors satisfying $Z_n^* \stackrel{d}{=} G_n N + b^{(n)}$, where $G_n$ is a random $(d\times d)$-matrix, $b^{(n)}$ a centered random vector with $\mathbb{E}[G_n G_n^T + b^{(n)} (b^{(n)})^T] = \mathrm{Id}_d$, and $N \sim \mathcal{N}(0,\mathrm{Id}_d)$ independent of $(G_n, b^{(n)})$. Furthermore, we assume that, as $n\to\infty$,
\[ \big\| G_n G_n^T - \mathrm{Id}_d \big\|_{3/2}^{3/2} + \big\| b^{(n)} \big\|_3^3 = \mathrm{O}(R(n)) \]
for appropriate $R(n)$. Then we have, as $n\to\infty$,
\[ \zeta_3(Z_n^*, \mathcal{N}(0,\mathrm{Id}_d)) = \mathrm{O}(R(n)). \]
The proof of Lemma 5 builds upon ideas of [15].
3 Expansions of moments

In applications to problems arising in theoretical computer science, where the recurrence (1) is explicitly given, one usually has no direct means to identify the orders of the terms $\|b^{(n)} - b^*\|_s$ and $\|A_r^{(n)} - A_r^*\|_s$. This is due to the fact that the mean vector $M_n$ and the covariance matrix $C_n$, for the cases $1<s\le 2$ and $2<s\le 3$ respectively, which are used for the normalization (7), are typically not exactly known or too involved to be amenable to explicit calculations. As a substitute one usually has asymptotic expansions of these sequences as $n\to\infty$.

In the present section we assume the dimension to be $d=1$ and $A_r(n) = 1$ for all $r=1,\dots,K$, and provide tools to apply the general Theorems 1 and 3 on the basis of expansions of the mean and variance. We assume that
\[ \mathbb{E}[Y_n] = \mu(n) = f(n) + \mathrm{O}(e(n)), \qquad \mathrm{Var}(Y_n) = \sigma^2(n) = g(n) + \mathrm{O}(h(n)), \tag{18} \]
with $e(n) = o(f(n))$ and $h(n) = o(g(n))$. To connect Theorems 1 and 3 to recurrences with known expansions we use the following notion.
▶ Definition 6. A sequence $(a(n))_{n\ge 0}$ of non-negative numbers is called essentially non-decreasing if there exists a $c>0$ such that $a(m) \le c\, a(n)$ for all $0\le m<n$.

The scaling introduced in (7) with the special choices $A_r(n) = 1$ for all $r=1,\dots,K$ leads to the scaled recurrence for $(X_n)$ given in (2) with
\[ A_r^{(n)} = \frac{\sigma(I_r^{(n)})}{\sigma(n)}, \qquad b^{(n)} = \frac{1}{\sigma(n)}\Bigg( b_n - \mu(n) + \sum_{r=1}^{K} \mu(I_r^{(n)}) \Bigg). \tag{19} \]
Additionally, we consider the corresponding quantities built from the leading terms $f$ and $g$,
\[ \bar{A}_r^{(n)} = \frac{g^{1/2}(I_r^{(n)})}{g^{1/2}(n)}, \qquad \bar{b}^{(n)} = \frac{1}{g^{1/2}(n)}\Bigg( b_n - f(n) + \sum_{r=1}^{K} f(I_r^{(n)}) \Bigg). \tag{20} \]
Then we have:
▶ Lemma 7. With $A_r^{(n)}, b^{(n)}$ given in (19), $\bar{A}_r^{(n)}, \bar{b}^{(n)}$ given in (20), and the expansions for $\mu(n), \sigma^2(n)$ given in (18), the following holds.
If the sequence $h/g^{1/2}$ is essentially non-decreasing then
\[ \big\| A_r^{(n)} - A_r^* \big\|_s \le \big\| \bar{A}_r^{(n)} - A_r^* \big\|_s + \mathrm{O}\Big( \frac{h(n)}{g(n)} \Big). \tag{21} \]
If the sequence $h$ is essentially non-decreasing then
\[ \Big\| \sum_{r=1}^{K} (A_r^{(n)})^2 - 1 \Big\|_s \le \Big\| \sum_{r=1}^{K} (\bar{A}_r^{(n)})^2 - 1 \Big\|_s + \mathrm{O}\Big( \frac{h(n)}{g(n)} \Big). \tag{22} \]
If the sequence $e$ is essentially non-decreasing then
\[ \big\| b^{(n)} - b^* \big\|_s \le \big\| \bar{b}^{(n)} - b^* \big\|_s + \mathrm{O}\Big( \frac{h(n)}{g(n)} + \frac{e(n)}{g^{1/2}(n)} \Big). \tag{23} \]
If the sequence $g/h$ is essentially non-decreasing then, with
\[ T(n) := \mathbb{E}\Bigg[ \sum_{r=1}^{K} \frac{g^{s/2-1}(I_r^{(n)})\, h(I_r^{(n)})\, R(I_r^{(n)})}{g^{s/2}(n)\, R(n)} \Bigg], \]
we have
\[ \mathbb{E}\Bigg[ \sum_{r=1}^{K} \frac{\sigma^s(I_r^{(n)})\, R(I_r^{(n)})}{\sigma^s(n)\, R(n)} \Bigg] \le \mathbb{E}\Bigg[ \sum_{r=1}^{K} \frac{g^{s/2}(I_r^{(n)})\, R(I_r^{(n)})}{g^{s/2}(n)\, R(n)} \Bigg] + \mathrm{O}(T(n)). \tag{24} \]
Proof. We show (21); the other bounds can be shown similarly. Note that $\sigma^2(n) = g(n) + \mathrm{O}(h(n))$ implies $\sigma(n) = g^{1/2}(n) + \mathrm{O}(h(n)/g^{1/2}(n))$, and that for any essentially non-decreasing sequence $(a(n))_{n\ge 0}$ we have $\|a(I_r^{(n)})\|_\infty = \mathrm{O}(a(n))$. Since $h/g^{1/2}$ is essentially non-decreasing we obtain
\begin{align*}
A_r^{(n)} = \frac{\sigma(I_r^{(n)})}{\sigma(n)}
&= \frac{g^{1/2}(I_r^{(n)}) + \mathrm{O}\big( h(I_r^{(n)}) / g^{1/2}(I_r^{(n)}) \big)}{\sigma(n)} \\
&= \frac{g^{1/2}(I_r^{(n)}) + \mathrm{O}\big( h(n)/g^{1/2}(n) \big)}{g^{1/2}(n)} \cdot \frac{g^{1/2}(n)}{\sigma(n)} \\
&= \Bigg( \frac{g^{1/2}(I_r^{(n)})}{g^{1/2}(n)} + \mathrm{O}\Big( \frac{h(n)}{g(n)} \Big) \Bigg) \Bigg( 1 + \mathrm{O}\Big( \frac{h(n)}{g(n)} \Big) \Bigg) \\
&= \frac{g^{1/2}(I_r^{(n)})}{g^{1/2}(n)} + \mathrm{O}\Bigg( \frac{h(n)}{g(n)} \Bigg( 1 + \frac{g^{1/2}(I_r^{(n)})}{g^{1/2}(n)} \Bigg) \Bigg).
\end{align*}
Hence, we obtain
\[ \big\| A_r^{(n)} - A_r^* \big\|_s \le \big\| \bar{A}_r^{(n)} - A_r^* \big\|_s + \mathrm{O}\Bigg( \frac{h(n)}{g(n)} \Big( 1 + \big\| \bar{A}_r^{(n)} \big\|_s \Big) \Bigg). \]
Since $\bar{A}_r^{(n)} \to A_r^*$ in $L_s$ we have $\|\bar{A}_r^{(n)}\|_s = \mathrm{O}(1)$, hence
\[ \big\| A_r^{(n)} - A_r^* \big\|_s \le \big\| \bar{A}_r^{(n)} - A_r^* \big\|_s + \mathrm{O}\Big( \frac{h(n)}{g(n)} \Big), \]
which is bound (21). ◀
Note that in applications the terms on the right-hand side of the estimates (21)–(24) can easily be bounded when expansions as in (18) with explicit functions $e, f, g, h$ are available.

4 Applications

We start by deriving a known result to illustrate in detail how to apply the framework of the previous sections.
4.1 Quicksort: Key comparisons

The number of key comparisons $Y_n$ needed by the Quicksort algorithm to sort $n$ randomly permuted (distinct) numbers satisfies the distributional recursion
\[ Y_n \stackrel{d}{=} Y_{I_n} + Y'_{n-1-I_n} + n - 1, \qquad n\ge 1, \tag{25} \]
where $Y_0 := 0$ and $(Y_k)_{k=0,\dots,n-1}, (Y'_k)_{k=0,\dots,n-1}, I_n$ are independent, $I_n$ is uniformly distributed on $\{0,\dots,n-1\}$, and $Y_k \stackrel{d}{=} Y'_k$, $k\ge 0$. Hence, equation (25) is covered by our general recurrence (1). For the expectation and variance of $Y_n$ exact expressions are known, which imply the asymptotic expansions
\[ \mathbb{E}[Y_n] = 2n\log(n) + (2\gamma - 4)n + \mathrm{O}(\log n), \tag{26} \]
\[ \mathrm{Var}(Y_n) = \sigma^2 n^2 - 2n\log(n) + \mathrm{O}(n), \tag{27} \]
where $\gamma$ denotes Euler's constant and $\sigma := \sqrt{7 - 2\pi^2/3} > 0$. We introduce the normalized quantities $X_0 := X_1 := X_2 := 0$ and
\[ X_n := \frac{Y_n - \mathbb{E}[Y_n]}{\sqrt{\mathrm{Var}(Y_n)}}, \qquad n\ge 3. \tag{28} \]
To apply Theorem 1 we need to find an $0<s\le 3$ and a sequence $(R(n))$ satisfying (10) and (11). Note that the $Y_n$ are bounded, thus $L_s$-integrable for any $s>0$. To bound the $L_s$-norms appearing in (10) we use Lemma 7 and choose
\[ f(n) = 2n\log(n) + (2\gamma - 4)n, \qquad e(n) = \log n, \]
\[ g(n) = \sigma^2 n^2, \qquad h(n) = n\log n. \]
With these functions we obtain for the quantities defined in (20) that
\[ \bar{A}_1^{(n)} = \frac{I_n}{n}, \qquad \bar{A}_2^{(n)} = \frac{n-1-I_n}{n}, \]
\[ \bar{b}^{(n)} = \frac{1}{\sigma}\Bigg( \frac{2I_n}{n}\log\frac{I_n}{n} + \frac{2(n-1-I_n)}{n}\log\frac{n-1-I_n}{n} + \frac{n-1}{n} \Bigg) + \mathrm{O}\Big( \frac{\log n}{n} \Big). \]
With the embedding $I_n = \lfloor nU \rfloor$, where $U$ is uniformly distributed over the unit interval $[0,1]$, we have
\[ A_1^* = U, \qquad A_2^* = 1-U, \qquad b^* = \frac{1}{\sigma}\big( 2U\log(U) + 2(1-U)\log(1-U) + 1 \big) =: \frac{1}{\sigma}\varphi(U). \]
The limit theorem $X_n \to X$ has been derived by different methods by Régnier [16] and Rösler [17]. Rösler [17] also found that the scaled limit $Y := \sigma X$ satisfies the distributional fixed-point equation
\[ Y \stackrel{d}{=} UY + (1-U)Y' + \varphi(U). \tag{29} \]
Lower and upper bounds for the rate of convergence in $X_n \to X$ have been studied for various metrics in Fill and Janson [6] and Neininger and Rüschendorf [13].
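The fixed point of (29) can be sampled approximately by unrolling the fixed-point map to a finite depth; the iterates converge in distribution since the map is a contraction in suitable metrics. The following Monte Carlo sketch is our own illustration; the truncation depth and the clamping of $U$ away from $\{0,1\}$ are implementation choices.

```python
import math
import random

def quicksort_limit_sample(depth=10, rng=random.Random(4)):
    """Approximate sample of Y from fixed point (29): Y = UY + (1-U)Y' + phi(U).

    The map is unrolled `depth` times, starting from the constant 0 (the
    mean of the fixed point); each level uses fresh independent U, Y, Y'.
    """
    def phi(u):
        return 2 * u * math.log(u) + 2 * (1 - u) * math.log(1 - u) + 1

    def sample(d):
        if d == 0:
            return 0.0
        u = min(max(rng.random(), 1e-12), 1 - 1e-12)  # keep log(u) finite
        return u * sample(d - 1) + (1 - u) * sample(d - 1) + phi(u)

    return sample(depth)

ys = [quicksort_limit_sample() for _ in range(400)]
mean = sum(ys) / len(ys)
var = sum((y - mean) ** 2 for y in ys) / len(ys)
# The mean should be near 0 and the variance near sigma^2 = 7 - 2*pi^2/3.
```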
Now, we apply the framework of the present paper: For $r=1,2$ and any $s\ge 1$ we find that
\[ \big\| \bar{A}_r^{(n)} - A_r^* \big\|_s = \mathrm{O}\Big( \frac{1}{n} \Big). \]
Using Proposition 3.2 of Rösler [17] we obtain
\[ \big\| \bar{b}^{(n)} - b^* \big\|_s = \mathrm{O}\Big( \frac{\log n}{n} \Big). \]
Moreover, we have
\[ \frac{h(n)}{g(n)} = \mathrm{O}(R(n)) \quad\text{and}\quad \frac{e(n)}{g^{1/2}(n)} = \mathrm{O}(R(n)) \quad\text{with}\quad R(n) := \frac{\log n}{n}, \]
thus Lemma 7 implies that condition (10) is satisfied for our choice of the sequence $R$. To verify condition (11) by use of (24), we find for $T(n)$ given in Lemma 7 that $T(n) = \mathrm{O}(\log(n)/n) \to 0$ and that
\[ \mathbb{E}\Bigg[ \sum_{r=1}^{2} \frac{g^{s/2}(I_r^{(n)})\, R(I_r^{(n)})}{g^{s/2}(n)\, R(n)} \Bigg] = \mathbb{E}\Bigg[ \sum_{r=1}^{2} \Bigg( \frac{I_r^{(n)}}{n} \Bigg)^{s-1} \frac{\log I_r^{(n)}}{\log n} \Bigg]. \]
Note that the latter expression has a limit superior less than 1 if and only if $s>2$. Hence, Theorem 1 is applicable for $s>2$ and yields
\[ \zeta_s(X_n, X) = \mathrm{O}\Big( \frac{\log n}{n} \Big), \qquad 2<s\le 3. \tag{30} \]
The bound (30) had previously been shown for $s=3$ in [13], where also the optimality of the order was shown, i.e., that $\zeta_3(X_n, X) = \Theta(\log(n)/n)$.
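As a numerical sanity check (our own, with hypothetical parameter choices $n=200$ and 2000 repetitions), recursion (25) can be simulated and its sample mean compared with the well-known exact mean $\mathbb{E}[Y_n] = 2(n+1)H_n - 4n$, which is consistent with expansion (26):

```python
import random

def quicksort_comparisons(n, rng=random.Random(2)):
    """One sample of the key-comparison count Y_n via recursion (25)."""
    if n <= 1:
        return 0
    i = rng.randrange(n)  # pivot rank I_n, uniform on {0, ..., n-1}
    return (quicksort_comparisons(i, rng)
            + quicksort_comparisons(n - 1 - i, rng) + n - 1)

n = 200
mean_est = sum(quicksort_comparisons(n) for _ in range(2000)) / 2000
harmonic = sum(1.0 / k for k in range(1, n + 1))
exact_mean = 2 * (n + 1) * harmonic - 4 * n  # E[Y_n] = 2(n+1)H_n - 4n
```

With these parameters the standard error of the sample mean is a fraction of a percent of the exact value.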
In the full paper version we also discuss bounds on rates of convergence for various cost measures of the related Quickselect algorithms under various models for the rank to be selected.
4.2 Size of m-ary search trees

The size of $m$-ary search trees satisfies the recurrence (1) with $K = m \ge 3$, $A_1(n) = \dots = A_m(n) = 1$, $n_0 = m$, $b_n = 1$, i.e., we have
\[ Y_n \stackrel{d}{=} \sum_{r=1}^{m} Y^{(r)}_{I_r^{(n)}} + 1, \qquad n\ge m. \]
For a representation of $I^{(n)}$ we define for independent, identically unif$[0,1]$ distributed random variables $U_1,\dots,U_{m-1}$ their spacings in $[0,1]$ by $S_1 = U_{(1)}, S_2 = U_{(2)} - U_{(1)}, \dots, S_m := 1 - U_{(m-1)}$, where $U_{(1)},\dots,U_{(m-1)}$ denote the order statistics of $U_1,\dots,U_{m-1}$. Then $I^{(n)}$ has the mixed multinomial distribution
\[ I^{(n)} \stackrel{d}{=} \mathrm{M}(n-m+1, S_1,\dots,S_m). \]
By this we mean that, given $(S_1,\dots,S_m) = (s_1,\dots,s_m)$, the vector $I^{(n)}$ is multinomially $\mathrm{M}(n-m+1, s_1,\dots,s_m)$ distributed. Expectations, variances and limit laws for $Y_n$ have been studied, see [12, 4]. We have
\[ \mathbb{E}[Y_n] = \mu n + \mathrm{O}(1 + n^{\alpha-1}), \qquad m\ge 3, \tag{31} \]
\[ \mathrm{Var}(Y_n) = \sigma^2 n + \mathrm{O}(1 + n^{2\alpha-2}), \qquad 3\le m\le 26. \tag{32} \]
Here, the constants $\mu, \sigma > 0$ depend on $m$, and $\alpha\in\mathbb{R}$ depends on $m$ such that $\alpha<1$ for $m\le 13$, $1\le\alpha\le 4/3$ for $14\le m\le 19$, and $4/3\le\alpha\le 3/2$ for $20\le m\le 26$; see, e.g., Mahmoud [12, Table 3.1] for the values $\alpha = \alpha_m$ depending on $m$. It is known that $Y_n$ standardized by mean and variance satisfies a central limit law for $m\le 26$, whereas the standardized sequence has no weak limit for $m>26$ due to dominant periodicities, see Chern and Hwang [4]. The rate of convergence in the central limit law for $m\le 26$ for the Kolmogorov metric has been identified in Hwang [9]. Our Theorem 3 implies the central limit theorem for $Y_n$ with $m\le 26$ with the same (up to an $\varepsilon$ for $3\le m\le 19$) rate of convergence for the Zolotarev metric $\zeta_3$:

▶ Theorem 8. The size $Y_n$ of a random $m$-ary search tree with $n$ items inserted satisfies, for $m\le 26$,
\[ \zeta_3\Bigg( \frac{Y_n - \mathbb{E}[Y_n]}{\sqrt{\mathrm{Var}(Y_n)}},\ \mathcal{N}(0,1) \Bigg) = \begin{cases} \mathrm{O}\big( n^{-1/2+\varepsilon} \big), & 3\le m\le 19, \\ \mathrm{O}\big( n^{-3(3/2-\alpha)} \big), & 20\le m\le 26, \end{cases} \tag{33} \]
as $n\to\infty$.
Proof. In order to apply Theorem 3 we have to estimate the orders of $\|\sum_{r=1}^{m} (A_r^{(n)})^2 - 1\|_{3/2}$ and $\|b^{(n)}\|_3$ with $A_r^{(n)}$ and $b^{(n)}$ defined in (3). For this we apply Lemma 7. From (31) and (32) we obtain that for the quantities appearing in Lemma 7 we can choose $f(n) = \mu n$, $e(n) = 1\vee n^{\alpha-1}$, $g(n) = \sigma^2 n$, and $h(n) = 1\vee n^{2(\alpha-1)}$. Hence we obtain
\[ \Big\| \sum_{r=1}^{m} (\bar{A}_r^{(n)})^2 - 1 \Big\|_{3/2} = \Big\| \sum_{r=1}^{m} \frac{I_r^{(n)}}{n} - 1 \Big\|_{3/2} = \frac{m-1}{n} = \mathrm{O}(n^{-1}) \]
and $\mathrm{O}(h(n)/g(n)) = \mathrm{O}\big(n^{-(1\wedge(3-2\alpha))}\big)$. This implies
\[ \Big\| \sum_{r=1}^{m} (A_r^{(n)})^2 - 1 \Big\|_{3/2}^{3/2} = \mathrm{O}\big( n^{-((3/2)\wedge(3(3/2-\alpha)))} \big). \]
Similarly we obtain
\[ \big\| \bar{b}^{(n)} \big\|_3 = \frac{1}{\sigma\sqrt{n}} \Big\| 1 - \mu n + \sum_{r=1}^{m} \mu I_r^{(n)} \Big\|_3 = \frac{1}{\sigma\sqrt{n}} \big\| 1 - \mu(m-1) \big\|_3 = \mathrm{O}(n^{-1/2}) \]
and $\mathrm{O}(e(n)/g^{1/2}(n)) = \mathrm{O}\big(n^{-((1/2)\wedge(3/2-\alpha))}\big)$. This implies
\[ \big\| b^{(n)} \big\|_3^3 = \mathrm{O}\big( n^{-((3/2)\wedge(3(3/2-\alpha)))} \big). \]
Hence, condition (15) is satisfied with $R(n) = n^{-((3/2)\wedge(3(3/2-\alpha)))}$. ◀
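The mixed multinomial representation of $I^{(n)}$ used above is straightforward to simulate; the following sketch (NumPy assumed; the parameters $n=1000$, $m=5$ are chosen only for illustration) draws the spacings of $m-1$ uniforms and then mixes a multinomial:

```python
import numpy as np

rng = np.random.default_rng(3)

def subtree_sizes(n, m):
    """Draw I^(n) distributed as M(n - m + 1, S_1, ..., S_m)."""
    u = np.sort(rng.uniform(size=m - 1))                   # order statistics
    spacings = np.diff(np.concatenate(([0.0], u, [1.0])))  # S_1, ..., S_m
    return rng.multinomial(n - m + 1, spacings)

sizes = subtree_sizes(1000, m=5)
# The m subtree key counts always sum to n - m + 1 = 996.
```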
▶ Remark 9. Using Theorem 1 instead of Theorem 3 in the latter proof is also possible but leads to a bound $\mathrm{O}(n^{-(3/2-\alpha)})$ for $20\le m\le 26$, missing the factor 3 appearing in Theorem 8.

In the full paper version we also discuss rates of convergence for the number of leaves of $d$-dimensional random point quadtrees in the model of [7, 3, 8], where a similar behavior as in Theorem 8 appears. A technically related example is the number of maxima in right triangles in the model of [1, 2], where the order $n^{-1/4}$ appears. Our framework also applies.
4.3 Periodic functions in mean and variance

We now discuss some examples where the asymptotic expansions of the mean and the variance involve periodic functions instead of fixed constants. This is the case for several quantities in binomial splitting processes such as tries, PATRICIA tries and digital search trees. Throughout this section, we assume that we have a 3-integrable sequence $(Y_n)_{n\ge 0}$ satisfying the recursion
\[ Y_n \stackrel{d}{=} Y^{(1)}_{I_1^{(n)}} + Y^{(2)}_{I_2^{(n)}} + b_n, \qquad n\ge n_0, \tag{34} \]
with $(I^{(n)}, b_n), (Y_n^{(1)})_{n\ge 0}$ and $(Y_n^{(2)})_{n\ge 0}$ independent and $(Y_n^{(r)})_{n\ge 0} \stackrel{d}{=} (Y_n)_{n\ge 0}$ for $r=1,2$. Furthermore, either $I_1^{(n)}$ has the binomial distribution $\mathrm{Bin}(n, \frac{1}{2})$ and $I_2^{(n)} = n - I_1^{(n)}$, or $I_1^{(n)}$ is binomially $\mathrm{Bin}(n-1, \frac{1}{2})$ distributed and $I_2^{(n)} = n-1-I_1^{(n)}$. In most cases, such binomial recurrences lead to asymptotically normal limit laws, see [10, 11, 14, 18] for some examples.
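As an illustration, the number of internal nodes of a trie on $n$ random binary strings in the symmetric Bernoulli model satisfies a recursion of type (34) with $b_n = 1$: the strings at a node split by fair coin flips on the next bit. A simulation sketch of our own (the sample sizes are arbitrary choices, and the linear growth noted in the comment is only a rough plausibility check):

```python
import random

def trie_internal_nodes(n, rng=random.Random(5)):
    """Sample Y_n, the number of internal nodes of a trie on n random
    binary strings: for n >= 2 the strings split by fair coin flips,
    I_1 ~ Bin(n, 1/2), and the root contributes b_n = 1 (recursion (34))."""
    if n < 2:
        return 0
    i = sum(rng.random() < 0.5 for _ in range(n))  # I_1 ~ Bin(n, 1/2)
    return trie_internal_nodes(i, rng) + trie_internal_nodes(n - i, rng) + 1

vals = [trie_internal_nodes(500) for _ in range(100)]
avg = sum(vals) / len(vals)
# E[Y_n] grows linearly in n (roughly n / ln 2 up to small oscillations).
```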
Our first theorem covers the case of linear mean and variance, i.e., we assume that, as $n\to\infty$,
\[ \mathbb{E}[Y_n] = nP_1(\log_2 n) + \mathrm{O}(1), \tag{35} \]
\[ \mathrm{Var}(Y_n) = nP_2(\log_2 n) + \mathrm{O}(1), \tag{36} \]
for some smooth and 1-periodic functions $P_1, P_2$ with $P_2 > 0$. Possible applications start with the analysis of the number of internal nodes of a trie for $n$ strings in the symmetric Bernoulli model and the number of leaves in a random digital search tree, see, e.g., [10].

▶ Theorem 10. Let $(Y_n)_{n\ge 0}$ be 3-integrable and satisfy (34) with $\|b_n\|_3 = \mathrm{O}(1)$, (35) and (36). Then, for any $\varepsilon>0$ and $n\to\infty$, we have
\[ \zeta_3\Bigg( \frac{Y_n - \mathbb{E}[Y_n]}{\sqrt{\mathrm{Var}(Y_n)}},\ \mathcal{N}(0,1) \Bigg) = \mathrm{O}\big( n^{-1/2+\varepsilon} \big). \]
We now consider the case where the quantities $Y_n$ satisfy recursion (34) with $b_n$ being essentially $n$. We assume that, as $n\to\infty$, we have
\[ \mathbb{E}[Y_n] = n\log_2(n) + nP_1(\log_2 n) + \mathrm{O}(1), \tag{37} \]
\[ \mathrm{Var}(Y_n) = nP_2(\log_2 n) + \mathrm{O}(1), \tag{38} \]
for some smooth and 1-periodic functions $P_1, P_2$ with $P_2>0$. This covers, for example, the external path length of random tries and related digital tree structures constructed from $n$ random binary strings under appropriate independence assumptions.

▶ Theorem 11. Let $(Y_n)_{n\ge 0}$ be 3-integrable and satisfy (34) with $\|b_n - n\|_3 = \mathrm{O}(1)$, (37) and (38). Then, for any $\varepsilon>0$ and $n\to\infty$, we have
\[ \zeta_3\Bigg( \frac{Y_n - \mathbb{E}[Y_n]}{\sqrt{\mathrm{Var}(Y_n)}},\ \mathcal{N}(0,1) \Bigg) = \mathrm{O}\big( n^{-1/2+\varepsilon} \big). \]
4.4 A multivariate application

We consider a random binary search tree with $n$ nodes built from a random permutation of $\{1,\dots,n\}$. For $n\ge 0$, we denote by $L_n^0$ the number of nodes with no left descendant and