Algorithms

Ralph Neininger
Institute for Mathematics, Goethe University, 60054 Frankfurt a.M., Germany
neininger@math.uni-frankfurt.de

Jasmin Straub
Institute for Mathematics, Goethe University, 60054 Frankfurt a.M., Germany
jstraub@math.uni-frankfurt.de

Abstract
In this extended abstract a general framework is developed to bound rates of convergence for sequences of random variables as they mainly arise in the analysis of random trees and divide-and-conquer algorithms. The rates of convergence are bounded in the Zolotarev distances. Concrete examples from the analysis of algorithms and data structures are discussed, as well as a few examples from other areas. They lead to convergence rates of polynomial and logarithmic order. A crucial role is played by a factor 3 in the exponent of these orders in cases where the normal distribution is the limit distribution.

2012 ACM Subject Classification Theory of computation → Sorting and searching; Theory of computation → Divide and conquer

Keywords and phrases weak convergence, probabilistic analysis of algorithms, random trees, probability metrics

Digital Object Identifier 10.4230/LIPIcs.CVIT.2016.23
1 Introduction and notation

In this extended abstract we consider a general recurrence for (probability) distributions which covers many instances of complexity measures of divide-and-conquer algorithms and parameters of random search trees. We consider a sequence $(Y_n)_{n\ge 0}$ of $d$-dimensional random vectors satisfying the distributional recursion
\[ Y_n \stackrel{d}{=} \sum_{r=1}^{K} A_r(n)\, Y^{(r)}_{I_r^{(n)}} + b_n, \qquad n \ge n_0, \tag{1} \]
where $(A_1(n),\dots,A_K(n),b_n,I^{(n)})$, $(Y_n^{(1)})_{n\ge 0},\dots,(Y_n^{(K)})_{n\ge 0}$ are independent, the coefficients $A_1(n),\dots,A_K(n)$ are random $(d\times d)$-matrices, $b_n$ is a $d$-dimensional random vector, $I^{(n)} = (I_1^{(n)},\dots,I_K^{(n)})$ is a random vector in $\{0,\dots,n\}^K$, $n_0 \ge 1$, and $(Y_n^{(r)})_{n\ge 0} \stackrel{d}{=} (Y_n)_{n\ge 0}$ for $r=1,\dots,K$. Moreover, $K\ge 1$ is a fixed integer, but extensions to $K$ being random and depending on $n$ are possible.
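For intuition, recursion (1) can be simulated directly in dimension $d=1$. The following sketch is our own illustration, not part of the framework: the callback interface (`coeffs`, `base`) and the toy instance ($K=2$, $A_r=1$, $b_n=1$, uniform split) are illustrative assumptions.

```python
import random

def sample_Y(n, coeffs, base, n0=1):
    """Draw one sample of Y_n from recursion (1) in dimension d = 1.

    coeffs(n) returns one joint draw (A_1, ..., A_K, b_n, (I_1, ..., I_K));
    base(n) gives the start value for n < n0.  The recursive calls are
    independent, matching the independence assumptions of (1).
    """
    if n < n0:
        return base(n)
    *A, b, I = coeffs(n)
    return sum(a * sample_Y(i, coeffs, base, n0) for a, i in zip(A, I)) + b

# Toy instance with K = 2, A_1 = A_2 = 1, b_n = 1: n items are split at a
# uniform position and one unit of cost is paid per split.
def toy_coeffs(n):
    i = random.randrange(n)               # I_1 uniform on {0, ..., n-1}
    return 1.0, 1.0, 1.0, (i, n - 1 - i)  # A_1, A_2, b_n, (I_1, I_2)

random.seed(0)
samples = [sample_Y(50, toy_coeffs, base=lambda n: 0.0) for _ in range(100)]
```

Since both subproblem sizes are at most $n-1$, the recursion always terminates.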
This is the framework of [14], where some general convergence results are shown for appropriate normalizations of the $Y_n$. The contribution of the present extended abstract is to also study the rates of convergence in such limit theorems.
We define the normalized sequence $(X_n)_{n\ge 0}$ by
\[ X_n := C_n^{-1/2}(Y_n - M_n), \qquad n\ge 0, \]
where $M_n$ is a $d$-dimensional vector and $C_n$ a positive definite $(d\times d)$-matrix. Essentially, we choose $M_n$ as the mean and $C_n$ as the covariance matrix of $Y_n$ if they exist, or as the leading-order terms in expansions of these moments as $n\to\infty$. The normalized quantities satisfy the following modified recursion:
\[ X_n \stackrel{d}{=} \sum_{r=1}^{K} A_r^{(n)} X^{(r)}_{I_r^{(n)}} + b^{(n)}, \qquad n\ge n_0, \tag{2} \]
with
\[ A_r^{(n)} := C_n^{-1/2} A_r(n)\, C_{I_r^{(n)}}^{1/2}, \qquad b^{(n)} := C_n^{-1/2}\Bigg( b_n - M_n + \sum_{r=1}^{K} A_r(n)\, M_{I_r^{(n)}} \Bigg) \tag{3} \]
and independence relations as in (1).
In the context of the contraction method the aim is to establish transfer theorems of the following form: once appropriate convergence of the coefficients $A_r^{(n)} \to A_r^*$, $b^{(n)} \to b^*$ has been verified, convergence in distribution of the random vectors $(X_n)$ to a limit $X$ is implied. The limit distribution $\mathcal{L}(X)$ is identified by a fixed-point equation obtained from (2) by formally letting $n\to\infty$:
\[ X \stackrel{d}{=} \sum_{r=1}^{K} A_r^* X^{(r)} + b^*. \tag{4} \]
Here $(A_1^*,\dots,A_K^*,b^*), X^{(1)},\dots,X^{(K)}$ are independent and $X^{(r)} \stackrel{d}{=} X$ for $r=1,\dots,K$.
The aim of the present extended abstract is to endow such general transfer theorems with bounds on the rates of convergence. As a distance measure between (probability) distributions we use the Zolotarev metric. For many of the applications we discuss, bounds on the rate of convergence have previously been derived one by one for more popular distance measures such as the Kolmogorov–Smirnov distance. The transfer theorems of the present paper are in terms of the smoother Zolotarev metrics. However, they are easy to apply and cover a broad range of applications at once. A crucial role is played by a factor 3 in the exponent of these orders in cases where the normal distribution is the limit distribution, see Remark 4.
In the rest of this section we fix some notation. Regarding norms of vectors and (random) matrices, we denote for $x\in\mathbb{R}^d$ by $\|x\|$ its Euclidean norm, and for a random vector $X$ and $0<p<\infty$ we set $\|X\|_p := \mathbb{E}[\|X\|^p]^{(1/p)\wedge 1}$. Furthermore, for a $(d\times d)$-matrix $A$, $\|A\|_{\mathrm{op}} := \sup_{\|x\|=1} \|Ax\|$ denotes the spectral norm of $A$, and for a random square matrix $A$ we define $\|A\|_p := \mathbb{E}[\|A\|_{\mathrm{op}}^p]^{(1/p)\wedge 1}$ for $0<p<\infty$. Note that for a symmetric $(d\times d)$-matrix $A$ we have $\|A\|_{\mathrm{op}} = \max\{|\lambda| : \lambda \text{ eigenvalue of } A\}$. By $\mathrm{Id}_d$ the $d$-dimensional identity matrix is denoted. For multilinear forms the norm is defined similarly.
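The eigenvalue characterization of the spectral norm for symmetric matrices is easy to check numerically; a small sketch of our own (NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(0)

# Spectral norm vs. the eigenvalue characterization: for a symmetric S,
# ||S||_op equals the largest absolute eigenvalue of S.
A = rng.standard_normal((4, 4))
S = (A + A.T) / 2                       # a symmetric (4 x 4)-matrix
op_norm = np.linalg.norm(S, ord=2)      # sup_{||x||=1} ||S x||
eig_max = np.max(np.abs(np.linalg.eigvalsh(S)))
assert np.isclose(op_norm, eig_max)
```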
Furthermore, we define by $\mathcal{P}_d$ the space of probability distributions on $\mathbb{R}^d$ (endowed with the Borel $\sigma$-field), by $\mathcal{P}_s^d := \{\mathcal{L}(X)\in\mathcal{P}_d : \|X\|_s < \infty\}$, and, for a vector $m\in\mathbb{R}^d$ and a symmetric positive semidefinite $(d\times d)$-matrix $C$, the spaces
\[ \mathcal{P}_s^d(m) := \{\mathcal{L}(X)\in\mathcal{P}_s^d : \mathbb{E}[X]=m\}, \quad s>1, \tag{5} \]
\[ \mathcal{P}_s^d(m,C) := \{\mathcal{L}(X)\in\mathcal{P}_s^d : \mathbb{E}[X]=m,\ \mathrm{Cov}(X)=C\}, \quad s>2. \]
We use the convention $\mathcal{P}_s^d(m) := \mathcal{P}_s^d$ for $s\le 1$ and $\mathcal{P}_s^d(m,C) := \mathcal{P}_s^d(m)$ for $s\le 2$.
The Zolotarev metrics $\zeta_s$, see [19], are defined for probability distributions $\mathcal{L}(X), \mathcal{L}(Y)\in\mathcal{P}_d$ by
\[ \zeta_s(X,Y) := \zeta_s(\mathcal{L}(X),\mathcal{L}(Y)) = \sup_{f\in\mathcal{F}_s} \big| \mathbb{E}[f(X) - f(Y)] \big|, \tag{6} \]
where, for $s = m+\alpha$ with $0<\alpha\le 1$ and $m\in\mathbb{N}_0$,
\[ \mathcal{F}_s := \{ f\in C^m(\mathbb{R}^d,\mathbb{R}) : \|f^{(m)}(x) - f^{(m)}(y)\| \le \|x-y\|^\alpha \}. \]
Note that these distance measures may be infinite. Finite metrics are given by $\zeta_s$ on $\mathcal{P}_s^d$ for $0<s\le 1$, by $\zeta_s$ on $\mathcal{P}_s^d(m)$ for $1<s\le 2$, and by $\zeta_s$ on $\mathcal{P}_s^d(m,C)$ for $2<s\le 3$, cf. (5).
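For $s=1$ the class $\mathcal{F}_1$ consists of the 1-Lipschitz functions, so $\zeta_1$ coincides with the Kantorovich–Wasserstein $L_1$ distance, which in one dimension can be estimated from equal-size samples by matching order statistics. A minimal sketch of our own (NumPy assumed; the two distributions compared are toy choices):

```python
import numpy as np

rng = np.random.default_rng(1)

def zeta1_empirical(x, y):
    """Estimate zeta_1 between two distributions from equal-size samples.

    zeta_1 is the Kantorovich-Wasserstein L1 distance; in one dimension
    it is the average absolute difference between the sorted samples.
    """
    return np.mean(np.abs(np.sort(x) - np.sort(y)))

# Toy comparison: a uniform sample on [-1, 1] versus a standard normal
# sample; the analytic value of this distance is about 0.30.
u = rng.uniform(-1.0, 1.0, size=10_000)
z = rng.standard_normal(size=10_000)
d = zeta1_empirical(u, z)
```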
2 Results

We return to the situation outlined in the introduction, where we have normalized $(Y_n)_{n\ge 0}$ in the following way:
\[ X_n := C_n^{-1/2}(Y_n - M_n), \qquad n\ge 0, \tag{7} \]
where $M_n$ is a $d$-dimensional vector and $C_n$ a positive definite $(d\times d)$-matrix. As recalled in Section 1, for $s>1$ we may fix the mean and covariance matrix of the scaled quantities to guarantee finiteness of the $\zeta_s$-metric. Therefore, we choose $M_n = \mathbb{E}[Y_n]$ for $n\ge 0$ and $s>1$. For $s>2$, we additionally have to control the covariances of $X_n$. We assume that there exists an $n_1\ge 0$ such that $\mathrm{Cov}(Y_n)$ is positive definite for $n\ge n_1$, and choose $C_n = \mathrm{Cov}(Y_n)$ for $n\ge n_1$ and $C_n = \mathrm{Id}_d$ for $n<n_1$. For $s\le 2$, we just assume that $C_n$ is positive definite and set $n_1 = 0$ in this case.

The normalized quantities satisfy the modified recursion
\[ X_n \stackrel{d}{=} \sum_{r=1}^{K} A_r^{(n)} X^{(r)}_{I_r^{(n)}} + b^{(n)}, \qquad n\ge n_0, \]
with $A_r^{(n)}$ and $b^{(n)}$ given in (3). The following theorem provides a general framework to bound rates of convergence for the sequence $(X_n)_{n\ge 0}$. For the proof, we need some technical conditions which guarantee that the sizes $I_r^{(n)}$ of the subproblems grow with $n$. More precisely, we will assume that there exists some monotonically decreasing sequence $R(n)>0$ with $R(n)\to 0$ such that
\[ \big\| \mathbf{1}_{\{I_r^{(n)}<\ell\}} A_r^{(n)} \big\|_s = \mathrm{O}(R(n)), \qquad n\to\infty, \tag{8} \]
for all $\ell\in\mathbb{N}$ and $r=1,\dots,K$, and that
\[ \big\| \mathbf{1}_{\{I_r^{(n)}=n\}} A_r^{(n)} \big\|_s \to 0, \qquad n\to\infty, \tag{9} \]
for all $r=1,\dots,K$.
2.1 A general transfer theorem for rates of convergence

Our first result is a direct extension of the main Theorem 4.1 in [14], where we essentially only make all the estimates there explicit. The main result of the present extended abstract is contained in the subsequent subsection.

▶ Theorem 1. Let $(X_n)_{n\ge 0}$ be $s$-integrable, $0<s\le 3$, and satisfy recurrence (7) with the choices for $M_n$ and $C_n$ specified there. We assume that there exist $s$-integrable $A_1^*,\dots,A_K^*, b^*$ and some monotonically decreasing sequence $R(n)>0$ with $R(n)\to 0$ such that, as $n\to\infty$,
\[ \big\| b^{(n)} - b^* \big\|_s + \sum_{r=1}^{K} \big\| A_r^{(n)} - A_r^* \big\|_s = \mathrm{O}(R(n)). \tag{10} \]
If conditions (8) and (9) are satisfied and if
\[ \limsup_{n\to\infty}\ \mathbb{E}\Bigg[ \sum_{r=1}^{K} \frac{R(I_r^{(n)})}{R(n)} \big\| A_r^{(n)} \big\|_{\mathrm{op}}^s \Bigg] < 1, \tag{11} \]
then we have, as $n\to\infty$,
\[ \zeta_s(X_n, X) = \mathrm{O}(R(n)), \]
where $\mathcal{L}(X)$ is given as the unique fixed point in $\mathcal{P}_s^d(0,\mathrm{Id}_d)$ of the equation
\[ X \stackrel{d}{=} \sum_{r=1}^{K} A_r^* X^{(r)} + b^*, \tag{12} \]
with $(A_1^*,\dots,A_K^*,b^*), X^{(1)},\dots,X^{(K)}$ independent and $X^{(r)} \stackrel{d}{=} X$ for $r=1,\dots,K$.
▶ Remark 2. In applications, the convergence rate of the coefficients (conditions (8) and (10)) is often faster than the convergence rate of the quantities $X_n$, see, e.g., Section 4.4. In these cases, it is often possible to perform the induction step in the proof of Theorem 1 although condition (11) does not hold. To be more precise, we may assume
\[ \big\| \mathbf{1}_{\{I_r^{(n)}<\ell\}} A_r^{(n)} \big\|_s + \big\| b^{(n)} - b^* \big\|_s + \big\| A_r^{(n)} - A_r^* \big\|_s = \mathrm{O}(\widetilde{R}(n)) \]
for every $\ell\ge 0$, $r=1,\dots,K$ and $n\to\infty$. Then, instead of condition (11), it is sufficient to find some $K>0$ such that
\[ \mathbb{E}\Bigg[ \sum_{r=1}^{K} \mathbf{1}_{\{n_1\le I_r^{(n)}<n\}} \frac{R(I_r^{(n)})}{R(n)} \big\| A_r^{(n)} \big\|_{\mathrm{op}}^s \Bigg] \le 1 - p_n - \frac{\widetilde{R}(n)}{K R(n)} \tag{13} \]
for all large $n$, with $p_n := \mathbb{E}\big[ \sum_{r=1}^{K} \mathbf{1}_{\{I_r^{(n)}=n\}} \|A_r^{(n)}\|_{\mathrm{op}}^s \big]$.
2.2 An improved transfer theorem for normal limit distributions

We now consider the special case where the sequence $(X_n)_{n\ge 0}$ is 3-integrable and satisfies recursion (2) with $(A_1^{(n)},\dots,A_K^{(n)},b^{(n)}) \stackrel{L_3}{\longrightarrow} (A_1^*,\dots,A_K^*,b^*)$ for some 3-integrable coefficients $A_1^*,\dots,A_K^*,b^*$ with
\[ b^* = 0, \qquad \sum_{r=1}^{K} A_r^* (A_r^*)^T = \mathrm{Id}_d \]
almost surely. Corollary 3.4 in [14] implies that, if $\mathbb{E}\big[\sum_{r=1}^{K} \|A_r^*\|_{\mathrm{op}}^3\big] < 1$, equation (12) has a unique solution in the space $\mathcal{P}_3^d(0,\mathrm{Id}_d)$. Furthermore, e.g., using characteristic functions, it is easily checked that this unique solution is the standard normal distribution $\mathcal{N}(0,\mathrm{Id}_d)$.

In this special case of normal limit laws, it is possible to derive a refined version of Theorem 1. Instead of the technical condition (8), we now need only the weaker condition
\[ \big\| \mathbf{1}_{\{I_r^{(n)}<\ell\}} A_r^{(n)} \big\|_3^3 = \mathrm{O}(R(n)), \qquad n\to\infty, \tag{14} \]
for all $\ell\in\mathbb{N}$ and $r=1,\dots,K$. Moreover, condition (10) concerning the convergence rates of the coefficients can be weakened, as formulated in the following theorem.
▶ Theorem 3. Let $(X_n)_{n\ge 0}$ be given as in (7) and be 3-integrable. We assume that for some $R(n)>0$ monotonically decreasing with $R(n)\to 0$ as $n\to\infty$ we have
\[ \Big\| \sum_{r=1}^{K} A_r^{(n)} (A_r^{(n)})^T - \mathrm{Id}_d \Big\|_{3/2}^{3/2} + \big\| b^{(n)} \big\|_3^3 = \mathrm{O}(R(n)), \tag{15} \]
and that the technical conditions (9) and (14) are satisfied for $s=3$. If
\[ \limsup_{n\to\infty}\ \mathbb{E}\Bigg[ \sum_{r=1}^{K} \frac{R(I_r^{(n)})}{R(n)} \big\| A_r^{(n)} \big\|_{\mathrm{op}}^3 \Bigg] < 1, \tag{16} \]
then we have, as $n\to\infty$,
\[ \zeta_3(X_n, \mathcal{N}(0,\mathrm{Id}_d)) = \mathrm{O}(R(n)). \]
Proof. (Sketch) We define an accompanying sequence $(Z_n^*)_{n\ge 0}$ by
\[ Z_n^* := \sum_{r=1}^{K} A_r^{(n)} T_{I_r^{(n)}} N^{(r)} + b^{(n)}, \qquad n\ge 0, \]
where $(A_1^{(n)},\dots,A_K^{(n)},I^{(n)},b^{(n)}), N^{(1)},\dots,N^{(K)}$ are independent, $\mathcal{L}(N^{(r)}) = \mathcal{N}(0,\mathrm{Id}_d)$ for $r=1,\dots,K$, and $T_n T_n^T = \mathrm{Cov}(X_n)$ for $n\ge 0$. Hence, $Z_n^*$ is $L_3$-integrable, $\mathbb{E}[Z_n^*] = 0$ and $\mathrm{Cov}(Z_n^*) = \mathrm{Id}_d$ for all $n\ge n_1$. By the triangle inequality, we have
\[ \zeta_3(X_n, \mathcal{N}(0,\mathrm{Id}_d)) \le \zeta_3(X_n, Z_n^*) + \zeta_3(Z_n^*, \mathcal{N}(0,\mathrm{Id}_d)). \]
Then, the assertion follows inductively once the bound $\zeta_3(Z_n^*, \mathcal{N}(0,\mathrm{Id}_d)) = \mathrm{O}(R(n))$ is shown: Using the convolution property of the multidimensional normal distribution, we obtain the representation
\[ Z_n^* = \sum_{r=1}^{K} A_r^{(n)} T_{I_r^{(n)}} N^{(r)} + b^{(n)} \stackrel{d}{=} G_n N + b^{(n)}, \tag{17} \]
where $G_n G_n^T = \sum_{r=1}^{K} A_r^{(n)} T_{I_r^{(n)}} T_{I_r^{(n)}}^T (A_r^{(n)})^T$, $\mathcal{L}(N) = \mathcal{N}(0,\mathrm{Id}_d)$, and $N$ is independent of $(G_n, b^{(n)})$. As $\mathrm{Cov}(Z_n^*) = \mathrm{Id}_d$ for all $n\ge n_1$, we have $\mathbb{E}[G_n G_n^T + b^{(n)} (b^{(n)})^T] = \mathrm{Id}_d$ for $n\ge n_1$. Furthermore, we have $\|b^{(n)}\|_3^3 = \mathrm{O}(R(n))$ and
\begin{align*}
\big\| G_n G_n^T - \mathrm{Id}_d \big\|_{3/2}^{3/2}
&= \Big\| \sum_{r=1}^{K} A_r^{(n)} T_{I_r^{(n)}} T_{I_r^{(n)}}^T (A_r^{(n)})^T - \mathrm{Id}_d \Big\|_{3/2}^{3/2} \\
&= \mathrm{O}\Bigg( \Big\| \sum_{r=1}^{K} \mathbf{1}_{\{I_r^{(n)}<n_1\}} A_r^{(n)} \big( T_{I_r^{(n)}} T_{I_r^{(n)}}^T - \mathrm{Id}_d \big) (A_r^{(n)})^T \Big\|_{3/2}^{3/2} + \Big\| \sum_{r=1}^{K} A_r^{(n)} (A_r^{(n)})^T - \mathrm{Id}_d \Big\|_{3/2}^{3/2} \Bigg) \\
&= \mathrm{O}\Bigg( \sum_{r=1}^{K} \big\| \mathbf{1}_{\{I_r^{(n)}<n_1\}} A_r^{(n)} \big\|_3^3 + \Big\| \sum_{r=1}^{K} A_r^{(n)} (A_r^{(n)})^T - \mathrm{Id}_d \Big\|_{3/2}^{3/2} \Bigg) \\
&= \mathrm{O}(R(n)).
\end{align*}
Thus, the following Lemma 5 implies $\zeta_3(Z_n^*, \mathcal{N}(0,\mathrm{Id}_d)) = \mathrm{O}(R(n))$. Lemma 5 is the main part of the present proof. ◀
▶ Remark 4. Theorem 3, when applicable, often improves over Theorem 1 by a factor 3 in the exponent, see Remark 9 for an example. This is caused by the additional exponents in (15) in comparison to (10).
▶ Lemma 5. Let $(Z_n^*)_{n\ge 0}$ be a sequence of $d$-dimensional random vectors satisfying $Z_n^* \stackrel{d}{=} G_n N + b^{(n)}$, where $G_n$ is a random $(d\times d)$-matrix, $b^{(n)}$ a centered random vector with $\mathbb{E}[G_n G_n^T + b^{(n)} (b^{(n)})^T] = \mathrm{Id}_d$, and $N \sim \mathcal{N}(0,\mathrm{Id}_d)$ independent of $(G_n, b^{(n)})$. Furthermore, we assume that, as $n\to\infty$,
\[ \big\| G_n G_n^T - \mathrm{Id}_d \big\|_{3/2}^{3/2} + \big\| b^{(n)} \big\|_3^3 = \mathrm{O}(R(n)) \]
for appropriate $R(n)$. Then we have, as $n\to\infty$,
\[ \zeta_3(Z_n^*, \mathcal{N}(0,\mathrm{Id}_d)) = \mathrm{O}(R(n)). \]
The proof of Lemma 5 builds upon ideas of [15].
3 Expansions of moments

In applications to problems arising in theoretical computer science, where the recurrence (1) is explicitly given, one usually has no direct means to identify the orders of the terms $\|b^{(n)} - b^*\|_s$ and $\|A_r^{(n)} - A_r^*\|_s$. This is due to the fact that the mean vector $M_n$ and the covariance matrix $C_n$, for the cases $1<s\le 2$ and $2<s\le 3$ respectively, which are used for the normalization (7), are typically not exactly known or too involved to be amenable to explicit calculations. As a substitute one usually has asymptotic expansions of these sequences as $n\to\infty$.

In the present section we assume the dimension to be $d=1$ and $A_r(n) = 1$ for all $r=1,\dots,K$, and provide tools to apply the general Theorems 1 and 3 on the basis of expansions of the mean and variance. We assume that
\[ \mathbb{E}[Y_n] = \mu(n) = f(n) + \mathrm{O}(e(n)), \qquad \mathrm{Var}(Y_n) = \sigma^2(n) = g(n) + \mathrm{O}(h(n)), \tag{18} \]
with $e(n) = o(f(n))$ and $h(n) = o(g(n))$. To connect Theorems 1 and 3 to recurrences with known expansions we use the following notion.
▶ Definition 6. A sequence $(a(n))_{n\ge 0}$ of non-negative numbers is called essentially non-decreasing if there exists a $c>0$ such that $a(m) \le c\, a(n)$ for all $0\le m<n$.

The scaling introduced in (7) with the special choices $A_r(n) = 1$ for all $r=1,\dots,K$ leads to the scaled recurrence for $(X_n)$ given in (2) with
\[ A_r^{(n)} = \frac{\sigma(I_r^{(n)})}{\sigma(n)}, \qquad b^{(n)} = \frac{1}{\sigma(n)}\Bigg( b_n - \mu(n) + \sum_{r=1}^{K} \mu(I_r^{(n)}) \Bigg). \tag{19} \]
Additionally, we consider the corresponding quantities built from the leading terms $f$ and $g$,
\[ \bar{A}_r^{(n)} = \frac{g^{1/2}(I_r^{(n)})}{g^{1/2}(n)}, \qquad \bar{b}^{(n)} = \frac{1}{g^{1/2}(n)}\Bigg( b_n - f(n) + \sum_{r=1}^{K} f(I_r^{(n)}) \Bigg). \tag{20} \]
Then we have:
▶ Lemma 7. With $A_r^{(n)}, b^{(n)}$ given in (19), $\bar{A}_r^{(n)}, \bar{b}^{(n)}$ given in (20), and the expansions for $\mu(n), \sigma^2(n)$ given in (18), the following holds.
If the sequence $h/g^{1/2}$ is essentially non-decreasing then
\[ \big\| A_r^{(n)} - A_r^* \big\|_s \le \big\| \bar{A}_r^{(n)} - A_r^* \big\|_s + \mathrm{O}\Big( \frac{h(n)}{g(n)} \Big). \tag{21} \]
If the sequence $h$ is essentially non-decreasing then
\[ \Big\| \sum_{r=1}^{K} (A_r^{(n)})^2 - 1 \Big\|_s \le \Big\| \sum_{r=1}^{K} (\bar{A}_r^{(n)})^2 - 1 \Big\|_s + \mathrm{O}\Big( \frac{h(n)}{g(n)} \Big). \tag{22} \]
If the sequence $e$ is essentially non-decreasing then
\[ \big\| b^{(n)} - b^* \big\|_s \le \big\| \bar{b}^{(n)} - b^* \big\|_s + \mathrm{O}\Big( \frac{h(n)}{g(n)} + \frac{e(n)}{g^{1/2}(n)} \Big). \tag{23} \]
If the sequence $g/h$ is essentially non-decreasing then, with
\[ T(n) := \mathbb{E}\Bigg[ \sum_{r=1}^{K} \frac{g^{s/2-1}(I_r^{(n)})\, h(I_r^{(n)})\, R(I_r^{(n)})}{g^{s/2}(n)\, R(n)} \Bigg], \]
we have
\[ \mathbb{E}\Bigg[ \sum_{r=1}^{K} \frac{\sigma^s(I_r^{(n)})\, R(I_r^{(n)})}{\sigma^s(n)\, R(n)} \Bigg] \le \mathbb{E}\Bigg[ \sum_{r=1}^{K} \frac{g^{s/2}(I_r^{(n)})\, R(I_r^{(n)})}{g^{s/2}(n)\, R(n)} \Bigg] + \mathrm{O}(T(n)). \tag{24} \]
Proof. We show (21); the other bounds can be shown similarly. Note that $\sigma^2(n) = g(n) + \mathrm{O}(h(n))$ implies $\sigma(n) = g^{1/2}(n) + \mathrm{O}(h(n)/g^{1/2}(n))$, and that for any essentially non-decreasing sequence $(a(n))_{n\ge 0}$ we have $\|a(I_r^{(n)})\|_\infty = \mathrm{O}(a(n))$. Since $h/g^{1/2}$ is essentially non-decreasing we obtain
\begin{align*}
A_r^{(n)} = \frac{\sigma(I_r^{(n)})}{\sigma(n)}
&= \frac{g^{1/2}(I_r^{(n)}) + \mathrm{O}\big( h(I_r^{(n)}) / g^{1/2}(I_r^{(n)}) \big)}{\sigma(n)} \\
&= \frac{g^{1/2}(I_r^{(n)}) + \mathrm{O}\big( h(n)/g^{1/2}(n) \big)}{g^{1/2}(n)} \cdot \frac{g^{1/2}(n)}{\sigma(n)} \\
&= \Bigg( \frac{g^{1/2}(I_r^{(n)})}{g^{1/2}(n)} + \mathrm{O}\Big( \frac{h(n)}{g(n)} \Big) \Bigg) \Bigg( 1 + \mathrm{O}\Big( \frac{h(n)}{g(n)} \Big) \Bigg) \\
&= \frac{g^{1/2}(I_r^{(n)})}{g^{1/2}(n)} + \mathrm{O}\Bigg( \frac{h(n)}{g(n)} \Bigg( 1 + \frac{g^{1/2}(I_r^{(n)})}{g^{1/2}(n)} \Bigg) \Bigg).
\end{align*}
Hence, we obtain
\[ \big\| A_r^{(n)} - A_r^* \big\|_s \le \big\| \bar{A}_r^{(n)} - A_r^* \big\|_s + \mathrm{O}\Bigg( \frac{h(n)}{g(n)} \Big( 1 + \big\| \bar{A}_r^{(n)} \big\|_s \Big) \Bigg). \]
Since $\bar{A}_r^{(n)} \to A_r^*$ in $L_s$ we have $\|\bar{A}_r^{(n)}\|_s = \mathrm{O}(1)$, hence
\[ \big\| A_r^{(n)} - A_r^* \big\|_s \le \big\| \bar{A}_r^{(n)} - A_r^* \big\|_s + \mathrm{O}\Big( \frac{h(n)}{g(n)} \Big), \]
which is bound (21). ◀
Note that in applications the terms on the right-hand side of the estimates (21)–(24) can easily be bounded when expansions as in (18) with explicit functions $e, f, g, h$ are available.

4 Applications

We start by deriving a known result to illustrate in detail how to apply the framework of the previous sections.
4.1 Quicksort: Key comparisons

The number of key comparisons $Y_n$ needed by the Quicksort algorithm to sort $n$ randomly permuted (distinct) numbers satisfies the distributional recursion
\[ Y_n \stackrel{d}{=} Y_{I_n} + Y'_{n-1-I_n} + n - 1, \qquad n\ge 1, \tag{25} \]
where $Y_0 := 0$ and $(Y_k)_{k=0,\dots,n-1}, (Y'_k)_{k=0,\dots,n-1}, I_n$ are independent, $I_n$ is uniformly distributed on $\{0,\dots,n-1\}$, and $Y_k \stackrel{d}{=} Y'_k$, $k\ge 0$. Hence, equation (25) is covered by our general recurrence (1). For the expectation and variance of $Y_n$ exact expressions are known, which imply the asymptotic expansions
\[ \mathbb{E}[Y_n] = 2n\log(n) + (2\gamma - 4)n + \mathrm{O}(\log n), \tag{26} \]
\[ \mathrm{Var}(Y_n) = \sigma^2 n^2 - 2n\log(n) + \mathrm{O}(n), \tag{27} \]
where $\gamma$ denotes Euler's constant and $\sigma := \sqrt{7 - 2\pi^2/3} > 0$. We introduce the normalized quantities $X_0 := X_1 := X_2 := 0$ and
\[ X_n := \frac{Y_n - \mathbb{E}[Y_n]}{\sqrt{\mathrm{Var}(Y_n)}}, \qquad n\ge 3. \tag{28} \]
To apply Theorem 1 we need to find an $0<s\le 3$ and a sequence $(R(n))$ satisfying (10) and (11). Note that the $Y_n$ are bounded, thus $L_s$-integrable for any $s>0$. To bound the $L_s$-norms appearing in (10) we use Lemma 7 and choose
\[ f(n) = 2n\log(n) + (2\gamma - 4)n, \qquad e(n) = \log n, \]
\[ g(n) = \sigma^2 n^2, \qquad h(n) = n\log n. \]
With these functions we obtain for the quantities defined in (20) that
\[ \bar{A}_1^{(n)} = \frac{I_n}{n}, \qquad \bar{A}_2^{(n)} = \frac{n-1-I_n}{n}, \]
\[ \bar{b}^{(n)} = \frac{1}{\sigma}\Bigg( \frac{2I_n}{n}\log\frac{I_n}{n} + \frac{2(n-1-I_n)}{n}\log\frac{n-1-I_n}{n} + \frac{n-1}{n} \Bigg) + \mathrm{O}\Big( \frac{\log n}{n} \Big). \]
With the embedding $I_n = \lfloor nU \rfloor$, where $U$ is uniformly distributed over the unit interval $[0,1]$, we have
\[ A_1^* = U, \qquad A_2^* = 1-U, \qquad b^* = \frac{1}{\sigma}\big( 2U\log(U) + 2(1-U)\log(1-U) + 1 \big) =: \frac{1}{\sigma}\varphi(U). \]
The limit theorem $X_n \to X$ has been derived by different methods by Régnier [16] and Rösler [17]. Rösler [17] also found that the scaled limit $Y := \sigma X$ satisfies the distributional fixed-point equation
\[ Y \stackrel{d}{=} UY + (1-U)Y' + \varphi(U). \tag{29} \]
Lower and upper bounds for the rate of convergence in $X_n \to X$ have been studied for various metrics in Fill and Janson [6] and Neininger and Rüschendorf [13].
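The fixed point of (29) can be sampled approximately by unrolling the fixed-point map to a finite depth; the iterates converge in distribution since the map is a contraction in suitable metrics. The following Monte Carlo sketch is our own illustration; the truncation depth and the clamping of $U$ away from $\{0,1\}$ are implementation choices.

```python
import math
import random

def quicksort_limit_sample(depth=10, rng=random.Random(4)):
    """Approximate sample of Y from fixed point (29): Y = UY + (1-U)Y' + phi(U).

    The map is unrolled `depth` times, starting from the constant 0 (the
    mean of the fixed point); each level uses fresh independent U, Y, Y'.
    """
    def phi(u):
        return 2 * u * math.log(u) + 2 * (1 - u) * math.log(1 - u) + 1

    def sample(d):
        if d == 0:
            return 0.0
        u = min(max(rng.random(), 1e-12), 1 - 1e-12)  # keep log(u) finite
        return u * sample(d - 1) + (1 - u) * sample(d - 1) + phi(u)

    return sample(depth)

ys = [quicksort_limit_sample() for _ in range(400)]
mean = sum(ys) / len(ys)
var = sum((y - mean) ** 2 for y in ys) / len(ys)
# The mean should be near 0 and the variance near sigma^2 = 7 - 2*pi^2/3.
```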
Now, we apply the framework of the present paper: For $r=1,2$ and any $s\ge 1$ we find that
\[ \big\| \bar{A}_r^{(n)} - A_r^* \big\|_s = \mathrm{O}\Big( \frac{1}{n} \Big). \]
Using Proposition 3.2 of Rösler [17] we obtain
\[ \big\| \bar{b}^{(n)} - b^* \big\|_s = \mathrm{O}\Big( \frac{\log n}{n} \Big). \]
Moreover, we have
\[ \frac{h(n)}{g(n)} = \mathrm{O}(R(n)) \quad\text{and}\quad \frac{e(n)}{g^{1/2}(n)} = \mathrm{O}(R(n)) \quad\text{with}\quad R(n) := \frac{\log n}{n}, \]
thus Lemma 7 implies that condition (10) is satisfied for our choice of the sequence $R$. To verify condition (11) by use of (24), we find for $T(n)$ given in Lemma 7 that $T(n) = \mathrm{O}(\log(n)/n) \to 0$ and that
\[ \mathbb{E}\Bigg[ \sum_{r=1}^{2} \frac{g^{s/2}(I_r^{(n)})\, R(I_r^{(n)})}{g^{s/2}(n)\, R(n)} \Bigg] = \mathbb{E}\Bigg[ \sum_{r=1}^{2} \Bigg( \frac{I_r^{(n)}}{n} \Bigg)^{s-1} \frac{\log I_r^{(n)}}{\log n} \Bigg]. \]
Note that the latter expression has a limit superior less than 1 if and only if $s>2$. Hence, Theorem 1 is applicable for $s>2$ and yields
\[ \zeta_s(X_n, X) = \mathrm{O}\Big( \frac{\log n}{n} \Big), \qquad 2<s\le 3. \tag{30} \]
The bound (30) had previously been shown for $s=3$ in [13], where also the optimality of the order was shown, i.e., that $\zeta_3(X_n, X) = \Theta(\log(n)/n)$.
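As a numerical sanity check (our own, with hypothetical parameter choices $n=200$ and 2000 repetitions), recursion (25) can be simulated and its sample mean compared with the well-known exact mean $\mathbb{E}[Y_n] = 2(n+1)H_n - 4n$, which is consistent with expansion (26):

```python
import random

def quicksort_comparisons(n, rng=random.Random(2)):
    """One sample of the key-comparison count Y_n via recursion (25)."""
    if n <= 1:
        return 0
    i = rng.randrange(n)  # pivot rank I_n, uniform on {0, ..., n-1}
    return (quicksort_comparisons(i, rng)
            + quicksort_comparisons(n - 1 - i, rng) + n - 1)

n = 200
mean_est = sum(quicksort_comparisons(n) for _ in range(2000)) / 2000
harmonic = sum(1.0 / k for k in range(1, n + 1))
exact_mean = 2 * (n + 1) * harmonic - 4 * n  # E[Y_n] = 2(n+1)H_n - 4n
```

With these parameters the standard error of the sample mean is a fraction of a percent of the exact value.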
In the full paper version we also discuss bounds on rates of convergence for various cost measures of the related Quickselect algorithms under various models for the rank to be selected.
4.2 Size of m-ary search trees

The size of $m$-ary search trees satisfies the recurrence (1) with $K = m \ge 3$, $A_1(n) = \dots = A_m(n) = 1$, $n_0 = m$, $b_n = 1$, i.e., we have
\[ Y_n \stackrel{d}{=} \sum_{r=1}^{m} Y^{(r)}_{I_r^{(n)}} + 1, \qquad n\ge m. \]
For a representation of $I^{(n)}$ we define for independent, identically unif$[0,1]$ distributed random variables $U_1,\dots,U_{m-1}$ their spacings in $[0,1]$ by $S_1 = U_{(1)}, S_2 = U_{(2)} - U_{(1)}, \dots, S_m := 1 - U_{(m-1)}$, where $U_{(1)},\dots,U_{(m-1)}$ denote the order statistics of $U_1,\dots,U_{m-1}$. Then $I^{(n)}$ has the mixed multinomial distribution
\[ I^{(n)} \stackrel{d}{=} \mathrm{M}(n-m+1, S_1,\dots,S_m). \]
By this we mean that, given $(S_1,\dots,S_m) = (s_1,\dots,s_m)$, the vector $I^{(n)}$ is multinomially $\mathrm{M}(n-m+1, s_1,\dots,s_m)$ distributed. Expectations, variances and limit laws for $Y_n$ have been studied, see [12, 4]. We have
\[ \mathbb{E}[Y_n] = \mu n + \mathrm{O}(1 + n^{\alpha-1}), \qquad m\ge 3, \tag{31} \]
\[ \mathrm{Var}(Y_n) = \sigma^2 n + \mathrm{O}(1 + n^{2\alpha-2}), \qquad 3\le m\le 26. \tag{32} \]
Here, the constants $\mu, \sigma > 0$ depend on $m$, and $\alpha\in\mathbb{R}$ depends on $m$ such that $\alpha<1$ for $m\le 13$, $1\le\alpha\le 4/3$ for $14\le m\le 19$, and $4/3\le\alpha\le 3/2$ for $20\le m\le 26$; see, e.g., Mahmoud [12, Table 3.1] for the values $\alpha = \alpha_m$ depending on $m$. It is known that $Y_n$ standardized by mean and variance satisfies a central limit law for $m\le 26$, whereas the standardized sequence has no weak limit for $m>26$ due to dominant periodicities, see Chern and Hwang [4]. The rate of convergence in the central limit law for $m\le 26$ for the Kolmogorov metric has been identified in Hwang [9]. Our Theorem 3 implies the central limit theorem for $Y_n$ with $m\le 26$ with the same (up to an $\varepsilon$ for $3\le m\le 19$) rate of convergence for the Zolotarev metric $\zeta_3$:

▶ Theorem 8. The size $Y_n$ of a random $m$-ary search tree with $n$ items inserted satisfies, for $m\le 26$,
\[ \zeta_3\Bigg( \frac{Y_n - \mathbb{E}[Y_n]}{\sqrt{\mathrm{Var}(Y_n)}},\ \mathcal{N}(0,1) \Bigg) = \begin{cases} \mathrm{O}\big( n^{-1/2+\varepsilon} \big), & 3\le m\le 19, \\ \mathrm{O}\big( n^{-3(3/2-\alpha)} \big), & 20\le m\le 26, \end{cases} \tag{33} \]
as $n\to\infty$.
Proof. In order to apply Theorem 3 we have to estimate the orders of $\|\sum_{r=1}^{m} (A_r^{(n)})^2 - 1\|_{3/2}$ and $\|b^{(n)}\|_3$ with $A_r^{(n)}$ and $b^{(n)}$ defined in (3). For this we apply Lemma 7. From (31) and (32) we obtain that for the quantities appearing in Lemma 7 we can choose $f(n) = \mu n$, $e(n) = 1\vee n^{\alpha-1}$, $g(n) = \sigma^2 n$, and $h(n) = 1\vee n^{2(\alpha-1)}$. Hence we obtain
\[ \Big\| \sum_{r=1}^{m} (\bar{A}_r^{(n)})^2 - 1 \Big\|_{3/2} = \Big\| \sum_{r=1}^{m} \frac{I_r^{(n)}}{n} - 1 \Big\|_{3/2} = \frac{m-1}{n} = \mathrm{O}(n^{-1}) \]
and $\mathrm{O}(h(n)/g(n)) = \mathrm{O}\big(n^{-(1\wedge(3-2\alpha))}\big)$. This implies
\[ \Big\| \sum_{r=1}^{m} (A_r^{(n)})^2 - 1 \Big\|_{3/2}^{3/2} = \mathrm{O}\big( n^{-((3/2)\wedge(3(3/2-\alpha)))} \big). \]
Similarly we obtain
\[ \big\| \bar{b}^{(n)} \big\|_3 = \frac{1}{\sigma\sqrt{n}} \Big\| 1 - \mu n + \sum_{r=1}^{m} \mu I_r^{(n)} \Big\|_3 = \frac{1}{\sigma\sqrt{n}} \big\| 1 - \mu(m-1) \big\|_3 = \mathrm{O}(n^{-1/2}) \]
and $\mathrm{O}(e(n)/g^{1/2}(n)) = \mathrm{O}\big(n^{-((1/2)\wedge(3/2-\alpha))}\big)$. This implies
\[ \big\| b^{(n)} \big\|_3^3 = \mathrm{O}\big( n^{-((3/2)\wedge(3(3/2-\alpha)))} \big). \]
Hence, condition (15) is satisfied with $R(n) = n^{-((3/2)\wedge(3(3/2-\alpha)))}$. ◀
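The mixed multinomial representation of $I^{(n)}$ used above is straightforward to simulate; the following sketch (NumPy assumed; the parameters $n=1000$, $m=5$ are chosen only for illustration) draws the spacings of $m-1$ uniforms and then mixes a multinomial:

```python
import numpy as np

rng = np.random.default_rng(3)

def subtree_sizes(n, m):
    """Draw I^(n) distributed as M(n - m + 1, S_1, ..., S_m)."""
    u = np.sort(rng.uniform(size=m - 1))                   # order statistics
    spacings = np.diff(np.concatenate(([0.0], u, [1.0])))  # S_1, ..., S_m
    return rng.multinomial(n - m + 1, spacings)

sizes = subtree_sizes(1000, m=5)
# The m subtree key counts always sum to n - m + 1 = 996.
```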
▶ Remark 9. Using Theorem 1 instead of Theorem 3 in the latter proof is also possible but leads to a bound $\mathrm{O}(n^{-(3/2-\alpha)})$ for $20\le m\le 26$, missing the factor 3 appearing in Theorem 8.

In the full paper version we also discuss rates of convergence for the number of leaves of $d$-dimensional random point quadtrees in the model of [7, 3, 8], where a similar behavior as in Theorem 8 appears. A technically related example is the number of maxima in right triangles in the model of [1, 2], where the order $n^{-1/4}$ appears. Our framework also applies.
4.3 Periodic functions in mean and variance

We now discuss some examples where the asymptotic expansions of the mean and the variance involve periodic functions instead of fixed constants. This is the case for several quantities in binomial splitting processes such as tries, PATRICIA tries and digital search trees. Throughout this section, we assume that we have a 3-integrable sequence $(Y_n)_{n\ge 0}$ satisfying the recursion
\[ Y_n \stackrel{d}{=} Y^{(1)}_{I_1^{(n)}} + Y^{(2)}_{I_2^{(n)}} + b_n, \qquad n\ge n_0, \tag{34} \]
with $(I^{(n)}, b_n), (Y_n^{(1)})_{n\ge 0}$ and $(Y_n^{(2)})_{n\ge 0}$ independent and $(Y_n^{(r)})_{n\ge 0} \stackrel{d}{=} (Y_n)_{n\ge 0}$ for $r=1,2$. Furthermore, either $I_1^{(n)}$ has the binomial distribution $\mathrm{Bin}(n, \frac{1}{2})$ and $I_2^{(n)} = n - I_1^{(n)}$, or $I_1^{(n)}$ is binomially $\mathrm{Bin}(n-1, \frac{1}{2})$ distributed and $I_2^{(n)} = n-1-I_1^{(n)}$. In most cases, such binomial recurrences lead to asymptotically normal limit laws, see [10, 11, 14, 18] for some examples.
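As an illustration, the number of internal nodes of a trie on $n$ random binary strings in the symmetric Bernoulli model satisfies a recursion of type (34) with $b_n = 1$: the strings at a node split by fair coin flips on the next bit. A simulation sketch of our own (the sample sizes are arbitrary choices, and the linear growth noted in the comment is only a rough plausibility check):

```python
import random

def trie_internal_nodes(n, rng=random.Random(5)):
    """Sample Y_n, the number of internal nodes of a trie on n random
    binary strings: for n >= 2 the strings split by fair coin flips,
    I_1 ~ Bin(n, 1/2), and the root contributes b_n = 1 (recursion (34))."""
    if n < 2:
        return 0
    i = sum(rng.random() < 0.5 for _ in range(n))  # I_1 ~ Bin(n, 1/2)
    return trie_internal_nodes(i, rng) + trie_internal_nodes(n - i, rng) + 1

vals = [trie_internal_nodes(500) for _ in range(100)]
avg = sum(vals) / len(vals)
# E[Y_n] grows linearly in n (roughly n / ln 2 up to small oscillations).
```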
Our first theorem covers the case of linear mean and variance, i.e., we assume that, as $n\to\infty$,
\[ \mathbb{E}[Y_n] = nP_1(\log_2 n) + \mathrm{O}(1), \tag{35} \]
\[ \mathrm{Var}(Y_n) = nP_2(\log_2 n) + \mathrm{O}(1), \tag{36} \]
for some smooth and 1-periodic functions $P_1, P_2$ with $P_2 > 0$. Possible applications start with the analysis of the number of internal nodes of a trie for $n$ strings in the symmetric Bernoulli model and the number of leaves in a random digital search tree, see, e.g., [10].

▶ Theorem 10. Let $(Y_n)_{n\ge 0}$ be 3-integrable and satisfy (34) with $\|b_n\|_3 = \mathrm{O}(1)$, (35) and (36). Then, for any $\varepsilon>0$ and $n\to\infty$, we have
\[ \zeta_3\Bigg( \frac{Y_n - \mathbb{E}[Y_n]}{\sqrt{\mathrm{Var}(Y_n)}},\ \mathcal{N}(0,1) \Bigg) = \mathrm{O}\big( n^{-1/2+\varepsilon} \big). \]
We now consider the case where the quantities $Y_n$ satisfy recursion (34) with $b_n$ being essentially $n$. We assume that, as $n\to\infty$, we have
\[ \mathbb{E}[Y_n] = n\log_2(n) + nP_1(\log_2 n) + \mathrm{O}(1), \tag{37} \]
\[ \mathrm{Var}(Y_n) = nP_2(\log_2 n) + \mathrm{O}(1), \tag{38} \]
for some smooth and 1-periodic functions $P_1, P_2$ with $P_2>0$. This covers, for example, the external path length of random tries and related digital tree structures constructed from $n$ random binary strings under appropriate independence assumptions.

▶ Theorem 11. Let $(Y_n)_{n\ge 0}$ be 3-integrable and satisfy (34) with $\|b_n - n\|_3 = \mathrm{O}(1)$, (37) and (38). Then, for any $\varepsilon>0$ and $n\to\infty$, we have
\[ \zeta_3\Bigg( \frac{Y_n - \mathbb{E}[Y_n]}{\sqrt{\mathrm{Var}(Y_n)}},\ \mathcal{N}(0,1) \Bigg) = \mathrm{O}\big( n^{-1/2+\varepsilon} \big). \]
4.4 A multivariate application

We consider a random binary search tree with $n$ nodes built from a random permutation of $\{1,\dots,n\}$. For $n\ge 0$, we denote by $L_n^0$ the number of nodes with no left descendant and