Remarks on Low-Dimensional Projections of High-Dimensional Distributions
Lutz Dümbgen and Perla Zerial, December 6, 1996
Abstract. Let $P = P^{(q)}$ be a probability distribution on $q$-dimensional space. Necessary and sufficient conditions are derived under which a random $d$-dimensional projection of $P$ converges weakly to a fixed distribution $Q$ on $\mathbb{R}^d$ as $q$ tends to infinity, while $d$ is an arbitrary fixed number. This complements a well-known result of Diaconis and Freedman (1984). Further we investigate $d$-dimensional projections of $\hat P$, where $\hat P$ is the empirical distribution of a random sample from $P$ of size $n$. We prove a conditional Central Limit Theorem for random projections of $n^{1/2}(\hat P - P)$ given the data $\hat P$, as $q$ and $n$ tend to infinity.

Correspondence to: Lutz Dümbgen, Institut für Angewandte Mathematik, Universität Heidelberg, Im Neuenheimer Feld 294, D-69120 Heidelberg, Germany
lutz@statlab.uni-heidelberg.de
Research supported in part by European Union Human Capital and Mobility Program ERB CHRX-CT 940693.
1 Introduction

A standard method of exploring high-dimensional datasets is to examine various low-dimensional projections thereof. In fact, many statistical procedures are based explicitly or implicitly on a "projection pursuit", cf. Huber (1985). Diaconis and Freedman (1984) showed that under weak regularity conditions on a distribution $P = P^{(q)}$ on $\mathbb{R}^q$, "most" $d$-dimensional orthonormal projections of $P$ are similar (in the weak topology) to a mixture of centered, spherically symmetric Gaussian distributions on $\mathbb{R}^d$ if $q$ tends to infinity while $d$ is fixed. A graphical demonstration of this disconcerting phenomenon is given by Buja et al. (1996). It should be pointed out that it is not a simple consequence of Poincaré's (1912) Lemma, although the latter is at the heart of the proof.

The present paper provides further insight into this phenomenon. We extend Diaconis and Freedman's (1984) results in two directions. Section 2 gives necessary and sufficient conditions on the sequence $(P^{(q)})_{q \ge d}$ such that "most" $d$-dimensional projections of $P$ are similar to some distribution $Q$ on $\mathbb{R}^d$. It turns out that these conditions are essentially the conditions of Diaconis and Freedman (1984). The novelty here is necessity. The limit distribution $Q$ is automatically a mixture of centered, spherically symmetric Gaussian distributions. The family of such measures arises in Eaton (1981) in another, related context.

More precisely, let $\Gamma = \Gamma^{(q)}$ be uniformly distributed on the set of column-wise orthonormal matrices in $\mathbb{R}^{q \times d}$ (cf. Section 4.2). Defining
$$\gamma^\top P := \mathcal{L}_{X \sim P}(\gamma^\top X)$$
for $\gamma \in \mathbb{R}^{q \times d}$, we investigate under what conditions the random distribution $\Gamma^\top P$ converges weakly in probability to an arbitrary fixed distribution $Q$ as $q \to \infty$, while $d$ is fixed.

Section 3 studies the difference between $P$ and the empirical distribution $\hat P = \hat P^{(q,n)}$ of $n$ independent random vectors with distribution $P$. Suppose that $(P^{(q)})_{q \ge d}$ satisfies the conditions of Section 2 and $\Gamma$ is independent from $\hat P$. Then, as $n$ and $q$ tend to infinity, the standardized empirical measure $n^{1/2}(\Gamma^\top \hat P - \Gamma^\top P)$ satisfies a conditional Central Limit Theorem given the data $\hat P$.

Proofs are deferred to Section 4. The main ingredients are Poincaré's (1912) Lemma and a modification of a method invented by Hoeffding (1952) in order to prove weak convergence of conditional distributions, which is of independent interest. Further we utilize some results from the theory of empirical processes.
2 The Diaconis-Freedman Effect

Let us first settle on some terminology. A random distribution $\hat Q$ on a separable metric space $(M, \rho)$ is a mapping from some probability space into the set of Borel probability measures on $M$ such that $\int f \, d\hat Q$ is measurable for any function $f \in C_b(M)$, the space of bounded, continuous functions on $M$. We say that a sequence $(\hat Q_k)_k$ of random distributions on $M$ converges weakly in probability to some fixed distribution $Q$ if for each $f \in C_b(M)$,
$$\int f \, d\hat Q_k \to_p \int f \, dQ \quad \text{as } k \to \infty.$$
In symbols, $\hat Q_k \to_{wp} Q$ as $k \to \infty$. We say that the sequence $(\hat Q_k)_k$ converges weakly in distribution to a random distribution $\hat Q$ on $M$ if for each $f \in C_b(M)$,
$$\int f \, d\hat Q_k \to_{\mathcal{L}} \int f \, d\hat Q \quad \text{as } k \to \infty.$$
In symbols, $\hat Q_k \to_{w\mathcal{L}} \hat Q$ as $k \to \infty$. Standard arguments show that $(\hat Q_k)_k$ converges weakly in probability to $Q$ if, and only if,
$$\sup_{f \in \mathcal{F}_{BL}} \Big| \int f \, d\hat Q_k - \int f \, dQ \Big| \to_p 0 \quad (k \to \infty),$$
where $\mathcal{F}_{BL}$ stands for the class of functions $f : M \to [-1, 1]$ such that $|f(x) - f(y)| \le \rho(x, y)$ for $x, y \in M$.

Now we can state the first result.
Theorem 2.1. The following two assertions on the sequence $(P^{(q)})_{q \ge d}$ are equivalent:

(A1) There exists a probability measure $Q$ on $\mathbb{R}^d$ such that $\Gamma^\top P \to_{wp} Q$ as $q \to \infty$.

(A2) If $X = X^{(q)}$, $\tilde X = \tilde X^{(q)}$ are independent random vectors with distribution $P$, then $\mathcal{L}(q^{-1}\|X\|^2) \to_w R$ and $q^{-1} X^\top \tilde X \to_p 0$ as $q \to \infty$, for some probability measure $R$ on $[0, \infty[$.

(Throughout, $\|x\|$ denotes the Euclidean norm $(x^\top x)^{1/2}$.) The limit distribution $Q$ is equal to the normal mixture
$$\int N_d(0, \sigma^2 I)\, R(d\sigma^2).$$
Corollary 2.2. The random probability measure $\Gamma^\top P$ converges weakly to the standard Gaussian distribution $N_d(0, I)$ in probability if, and only if, the following condition is satisfied:

(B) For independent random vectors $X = X^{(q)}$, $\tilde X = \tilde X^{(q)}$ with distribution $P$,
$$q^{-1}\|X\|^2 \to_p 1 \quad\text{and}\quad q^{-1} X^\top \tilde X \to_p 0 \quad\text{as } q \to \infty. \qquad \Box$$

The implication "(A2) $\Rightarrow$ (A1)" in Theorem 2.1 as well as the sufficiency of condition (B) in Corollary 2.2 are due to Diaconis and Freedman (1984, Theorem 1.1 and Proposition 4.2).

Example 2.3. Conditions (A1-2) are not very restrictive requirements. For instance, suppose that $P = \mathcal{L}\big( (\mu_k + \sigma_k Z_k)_{1 \le k \le q} \big)$, where $(Z_k)_{k \ge 1}$ is a sequence of independent, identically distributed random variables with mean zero and variance one, and $\mu = \mu^{(q)} \in \mathbb{R}^q$, $\sigma = \sigma^{(q)} \in [0, \infty[^q$. Then conditions (A1-2) are satisfied if, and only if,

(A3) $q^{-1}\|\mu\|^2 \to 0$, $q^{-1}\|\sigma\|^2 \to r \ge 0$ and $q^{-1} \max_{1 \le k \le q} \sigma_k^2 \to 0$ as $q \to \infty$,

where $R = \delta_r$.
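As a quick numerical illustration (a sketch, not part of the paper's formal development): take the $Z_k$ in Example 2.3 to be Rademacher signs with $\mu = 0$ and $\sigma_k = 1$, so that (A3) holds with $r = 1$. A random one-dimensional projection of a sample from $P$ should then look standard Gaussian, even though $P$ lives on the discrete cube $\{-1, +1\}^q$. All sample sizes below are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
q, n = 1000, 2000

# Coordinates are i.i.d. Rademacher signs: the special case of Example 2.3
# with mu = 0, sigma_k = 1.  Condition (A3) holds with r = 1, so random
# projections should look like N(0, 1).
data = rng.choice([-1.0, 1.0], size=(n, q))

# A "uniformly" distributed one-dimensional projector (d = 1):
# Gamma = Z / ||Z|| with Z standard Gaussian in R^q.
z = rng.standard_normal(q)
gamma = z / np.linalg.norm(z)

proj = data @ gamma  # projected sample, approximately N(0, 1)
print(round(float(proj.mean()), 3), round(float(proj.var()), 3))
```

The printed sample mean and variance should be close to $0$ and $1$; a normal quantile plot of `proj` would be nearly straight despite the extreme non-Gaussianity of $P$ itself.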
3 Empirical Distributions

In some sense Theorem 2.1 is a negative, though mathematically elegant result. It warns us against hasty conclusions about high-dimensional data sets after examining a couple of low-dimensional projections. In particular, one should not believe in multivariate normality only because several projections of the data "look normal". On the other hand, even small differences between different low-dimensional projections of $\hat P$ may be intriguing. Therefore in the present section we study the relationship between projections of the empirical distribution $\hat P$ and corresponding projections of $P$.

In particular, we are interested in the halfspace norm
$$\|\Gamma^\top \hat P - \Gamma^\top P\|_{KS} := \sup_{\text{closed halfspaces } H \subset \mathbb{R}^d} \big| \Gamma^\top \hat P(H) - \Gamma^\top P(H) \big|$$
of $\Gamma^\top \hat P - \Gamma^\top P$. In case of $d = 1$ this is the usual Kolmogorov-Smirnov norm of $\Gamma^\top \hat P - \Gamma^\top P$. In what follows we use several well-known results from empirical process theory. Instead of citing original papers in various places we simply refer to the excellent treatises of Pollard (1984) and van der Vaart and Wellner (1996). It is known that
$$\mathrm{IE} \sup_{\gamma \in \mathbb{R}^{q \times d}} \|\gamma^\top \hat P - \gamma^\top P\|_{KS} \le C (q/n)^{1/2} \tag{3.1}$$
for some universal constant $C$. For the latter supremum is just the halfspace norm of $\hat P - P$, and generally the set of closed halfspaces in $\mathbb{R}^k$ is a Vapnik-Cervonenkis class with Vapnik-Cervonenkis index $k + 1$. Inequality (3.1) does not capture the typical deviation between $d$-dimensional projections of $\hat P$ and $P$. In fact,
$$\sup_{\gamma \in \mathbb{R}^{q \times d}} \mathrm{IE}\, \|\gamma^\top \hat P - \gamma^\top P\|_{KS} \le C (d/n)^{1/2}.$$
This implies that
$$\mathrm{IE}\, \|\Gamma^\top \hat P - \Gamma^\top P\|_{KS} \le C (d/n)^{1/2}, \tag{3.2}$$
where the random projector $\Gamma$ and $\hat P$ are always assumed to be stochastically independent. The subsequent results imply precise information about the conditional distribution of $n^{1/2}\|\Gamma^\top \hat P - \Gamma^\top P\|_{KS}$ given the data $\hat P$. This point of view is natural in connection with exploratory projection pursuit. It turns out that under condition (B) of Corollary 2.2, this conditional distribution converges weakly in probability to a fixed distribution. Under the weaker conditions (A1-2) of Theorem 2.1 it converges weakly in distribution to a specific random distribution on the real line.
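The halfspace norm is easy to make concrete for $d = 1$. The following sketch assumes, for convenience only, that $P = N(0, I_q)$, so that $\gamma^\top P = N(0,1)$ exactly for every unit vector $\gamma$ and the population CDF is available in closed form; the computed statistic is then the classical Kolmogorov-Smirnov distance, whose typical size is of the order $(d/n)^{1/2} = n^{-1/2}$ predicted by (3.2).

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(1)
q, n = 500, 2000

# Convenient assumption: P = N(0, I_q), so gamma'P = N(0, 1) exactly.
sample = rng.standard_normal((n, q))
z = rng.standard_normal(q)
gamma = z / np.linalg.norm(z)
proj = np.sort(sample @ gamma)

# Halfspace (= Kolmogorov-Smirnov) norm for d = 1: supremum over the
# halfspaces (-inf, t] and [t, inf) of |gamma'P_hat(H) - gamma'P(H)|.
cdf = np.array([0.5 * (1.0 + erf(t / sqrt(2.0))) for t in proj])
i = np.arange(1, n + 1)
ks = max(float(np.max(i / n - cdf)), float(np.max(cdf - (i - 1) / n)))
print(ks)
```

With $n = 2000$ the printed value is a small multiple of $n^{-1/2} \approx 0.022$, in line with (3.2), while the crude bound (3.1) would only give the order $(q/n)^{1/2} = 0.5$.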
More generally, let $\mathcal{H}$ be a class of measurable functions from $\mathbb{R}^d$ into $[-1, 1]$. Any finite signed measure $\mu$ on $\mathbb{R}^d$ defines an element $h \mapsto \mu(h) := \int h \, d\mu$ of the space $\ell^\infty(\mathcal{H})$ of all bounded functions on $\mathcal{H}$, equipped with the supremum norm $\|z\|_{\mathcal{H}} := \sup_{h \in \mathcal{H}} |z(h)|$. We shall impose the following conditions on the class $\mathcal{H}$ and some distribution $Q$ on $\mathbb{R}^d$.

(C1) There exists a countable subset $\mathcal{H}_o$ of $\mathcal{H}$ such that each $h \in \mathcal{H}$ can be represented as pointwise limit of some sequence in $\mathcal{H}_o$.

(C2) The set $\mathcal{H}$ satisfies the uniform entropy condition
$$\int_0^1 \sqrt{\log N(u, \mathcal{H})}\, du < \infty.$$
Here $N(u, \mathcal{H})$ is the supremum of $N(u, \mathcal{H}, \tilde Q)$ over all probability measures $\tilde Q$ on $\mathbb{R}^d$, and $N(u, \mathcal{H}, \tilde Q)$ is the smallest number $m$ such that $\mathcal{H}$ can be covered with $m$ balls having radius $u$ with respect to the pseudodistance
$$\rho_{\tilde Q}(g, h) := \sqrt{\tilde Q((g - h)^2)}.$$

(C3) For any sequence $(Q_k)_k$ of probability measures converging weakly to $Q$,
$$\|Q_k - Q\|_{\mathcal{H}} \to 0 \quad \text{as } k \to \infty.$$

An example for conditions (C1-3) is the set $\mathcal{H}$ of (indicators of) closed halfspaces in $\mathbb{R}^d$ and any distribution $Q$ on $\mathbb{R}^d$ such that $Q(E) = 0$ for any hyperplane $E$ in $\mathbb{R}^d$. Here condition (C3) is a consequence of Billingsley and Topsøe's (1967) results. Condition (C1) ensures that random elements such as $\|\Gamma^\top \hat P - \Gamma^\top P\|_{\mathcal{H}}$ are measurable. A particular consequence of (C2) is the existence of a centered Gaussian process $B_Q$ having uniformly continuous sample paths with respect to $\rho_Q$ and covariances
$$\mathrm{IE}\, B_Q(g) B_Q(h) = Q(gh) - Q(g) Q(h).$$
This is proved via a chaining argument. In the subsequent theorem we consider a decomposition of $B_Q$ as a sum $B_{Q,1} + B_{Q,2}$ of two independent centered Gaussian processes on $\mathcal{H}$. With the help of Anderson's (1955) Lemma or further application of chaining one can show that $B_{Q,1}$ and $B_{Q,2}$ admit versions with uniformly continuous sample paths.
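For intuition (a sketch outside the paper's development): when $\mathcal{H}$ consists of the indicators of halfspaces $(-\infty, t]$ in $d = 1$ and $Q = N(0,1)$, the covariance $Q(gh) - Q(g)Q(h)$ reduces to the Brownian-bridge kernel evaluated at the standard normal CDF, so the restriction of $B_Q$ to a finite grid is an ordinary Gaussian vector. Grid and seed below are arbitrary.

```python
import numpy as np
from math import erf, sqrt

# Halfspace indicators g = 1{(-inf, s]}, h = 1{(-inf, t]} and Q = N(0, 1):
# Q(gh) - Q(g)Q(h) = min(F(s), F(t)) - F(s)F(t), the Brownian-bridge
# kernel evaluated at the standard normal CDF F.
ts = np.linspace(-2.5, 2.5, 21)
F = np.array([0.5 * (1.0 + erf(t / sqrt(2.0))) for t in ts])
K = np.minimum.outer(F, F) - np.outer(F, F)

# The kernel is positive semi-definite, so B_Q restricted to this grid
# can be simulated as a centered Gaussian vector with covariance K.
eigs = np.linalg.eigvalsh(K)
rng = np.random.default_rng(3)
B = rng.multivariate_normal(np.zeros(len(ts)), K)
print(float(eigs.min()), B.shape)
```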
Theorem 3.1. Suppose that the sequence $(P^{(q)})_{q \ge d}$ satisfies conditions (A1-2) of Theorem 2.1, and suppose that conditions (C1-3) are satisfied with $Q$ being the corresponding limit measure $\int N_d(0, \sigma^2 I)\, R(d\sigma^2)$. Define
$$B^{(q,n)} := \big( n^{1/2} (\Gamma^\top \hat P - \Gamma^\top P)(h) \big)_{h \in \mathcal{H}},$$
and let $F$ be a continuous functional on $\ell^\infty(\mathcal{H})$ such that $F(B^{(q,n)})$ is measurable for all $q \ge d$ and $n \ge 1$. Then, as $n$ and $q$ tend to infinity,
$$\mathcal{L}\big( F(B^{(q,n)}) \mid \hat P \big) \to_{w\mathcal{L}} \mathcal{L}\big( F(B_{Q,1} + B_{Q,2}) \mid B_{Q,2} \big),$$
where $B_{Q,1}$ and $B_{Q,2}$ are two independent centered Gaussian processes having uniformly continuous sample paths with respect to $\rho_Q$ and covariances
$$\begin{aligned}
\mathrm{IE}\, B_{Q,1}(g) B_{Q,1}(h) &= Q(gh) - \int N_d(0, \sigma^2 I)(g)\, N_d(0, \sigma^2 I)(h)\, R(d\sigma^2) \\
&= \int \big( N_d(0, \sigma^2 I)(gh) - N_d(0, \sigma^2 I)(g)\, N_d(0, \sigma^2 I)(h) \big)\, R(d\sigma^2), \\
\mathrm{IE}\, B_{Q,2}(g) B_{Q,2}(h) &= \int N_d(0, \sigma^2 I)(g)\, N_d(0, \sigma^2 I)(h)\, R(d\sigma^2) - Q(g) Q(h).
\end{aligned}$$
(Thus $B_{Q,1} + B_{Q,2}$ defines a version of $B_Q$.)

Corollary 3.2. Suppose that the sequence $(P^{(q)})_{q \ge d}$ satisfies condition (B) of Corollary 2.2, and suppose that conditions (C1-3) are satisfied for $Q = N_d(0, I)$. Let $F$ be as in Theorem 3.1. Then, as $n$ and $q$ tend to infinity,
$$\mathcal{L}\big( F(B^{(q,n)}) \mid \hat P \big) \to_{wp} \mathcal{L}\big( F(B_Q) \big). \qquad \Box$$

The measurability of $F(B^{(q,n)})$ can be dropped, provided that our definition of weak convergence of random distributions is suitably extended; see Remark 4.3 in Section 4.1.
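Corollary 3.2 can be probed by simulation in the simplest case $d = 1$, $\mathcal{H}$ the halfspaces, $Q = N(0, I)$: taking $F$ to be the supremum norm, $n^{1/2}\|\Gamma^\top \hat P - \Gamma^\top P\|_{KS}$ should be approximately Kolmogorov-distributed, with mean $\sqrt{\pi/2}\,\ln 2 \approx 0.87$. A minimal Monte Carlo sketch, again under the convenient assumption $P = N(0, I_q)$ (all sizes chosen for speed, not accuracy):

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(4)
q, n, reps = 200, 500, 200

stats = []
for _ in range(reps):
    # fresh data and a fresh "uniform" projector for every repetition;
    # P = N(0, I_q) satisfies condition (B) of Corollary 2.2
    gamma = rng.standard_normal(q)
    gamma /= np.linalg.norm(gamma)
    proj = np.sort(rng.standard_normal((n, q)) @ gamma)
    cdf = np.array([0.5 * (1.0 + erf(t / sqrt(2.0))) for t in proj])
    i = np.arange(1, n + 1)
    ks = max(float(np.max(i / n - cdf)), float(np.max(cdf - (i - 1) / n)))
    stats.append(sqrt(n) * ks)

# sqrt(n) * KS should be roughly Kolmogorov distributed (mean ~ 0.87)
print(round(float(np.mean(stats)), 2))
```

The Monte Carlo mean lands near $0.87$, consistent with the claimed fixed limit distribution.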
4 Proofs

4.1 Hoeffding's (1952) technique and a modification thereof

In connection with randomization tests, Hoeffding (1952) observed that weak convergence of conditional distributions of test statistics is equivalent to weak convergence of the unconditional distribution of suitable statistics in $\mathbb{R}^2$. His result can be extended straightforwardly as follows.

Lemma 4.1 (Hoeffding). For $k \ge 1$ let $X_k, \tilde X_k \in \mathbb{X}_k$ and $T_k \in \mathbb{T}_k$ be independent random variables, where $X_k$ and $\tilde X_k$ are identically distributed. Further let $\pi_k$ be some measurable mapping from $\mathbb{X}_k \times \mathbb{T}_k$ into the separable metric space $(M, \rho)$, and let $Q$ be a fixed Borel probability measure on $M$. Then, as $k \to \infty$, the following two assertions are equivalent:

(D1) $\mathcal{L}\big( \pi_k(X_k, T_k) \mid T_k \big) \to_{wp} Q$.

(D2) $\mathcal{L}\big( \pi_k(X_k, T_k), \pi_k(\tilde X_k, T_k) \big) \to_w Q \otimes Q$.

An application of this equivalence with non-Euclidean spaces $M$ is given by Romano (1989). We shall utilize Lemma 4.1 in order to prove Theorem 2.1. In connection with empirical measures we use the following modification of Lemma 4.1, which is of independent interest.

Lemma 4.2. For $k \in \{1, 2, 3, \ldots\} \cup \{\infty\}$ let $X_k, X_{k,1}, X_{k,2}, \ldots \in \mathbb{X}_k$ and $T_k \in \mathbb{T}_k$ be independent random variables, where $X_k, X_{k,1}, X_{k,2}, \ldots$ are identically distributed. Further let $\pi_k$ be some measurable mapping from $\mathbb{X}_k \times \mathbb{T}_k$ into $(M, \rho)$. Then, as $k \to \infty$, the following two assertions are equivalent:

(E1) $\mathcal{L}\big( \pi_k(X_k, T_k) \mid T_k \big) \to_{w\mathcal{L}} \mathcal{L}\big( \pi_\infty(X_\infty, T_\infty) \mid T_\infty \big)$.

(E2) For any integer $L \ge 1$,
$$\big( \pi_k(X_{k,\ell}, T_k) \big)_{1 \le \ell \le L} \to_{\mathcal{L}} \big( \pi_\infty(X_{\infty,\ell}, T_\infty) \big)_{1 \le \ell \le L}.$$
Remark 4.3 (Non-separability and non-measurability). Suppose that the metric space $(M, \rho)$ is possibly nonseparable, and that the mappings $\pi_k$, $1 \le k < \infty$, are possibly non-measurable. The implications "(D2) $\Rightarrow$ (D1)" and "(E2) $\Rightarrow$ (E1)" remain valid, provided that the limit distributions $Q$ in Lemma 4.1 and $\mathcal{L}(\pi_\infty(X_\infty, T_\infty))$ in Lemma 4.2 have separable support, if one uses Hoffmann-Jørgensen's notion of weak convergence (cf. van der Vaart and Wellner 1996, Chapter 1). The conditional distribution $\mathcal{L}\big( \pi_k(X_k, T_k) \mid T_k = t_k \big)$ stands for the outer measure $\mathrm{IP}^*\{ \pi_k(X_k, t_k) \in \cdot \}$ on $M$, and $\mathcal{L}\big( \pi_k(X_k, T_k) \mid T_k \big)$ is said to converge weakly to $Q$ in probability if for each fixed $f \in C_b(M)$, the real-valued random element $\mathrm{IE}^*\big( f(\pi_k(X_k, T_k)) \mid T_k \big)$ converges in outer probability to $Q(f)$. Analogously, $\mathcal{L}\big( \pi_k(X_k, T_k) \mid T_k \big)$ converges weakly in distribution to $\mathcal{L}\big( \pi_\infty(X_\infty, T_\infty) \mid T_\infty \big)$ if for any fixed $f \in C_b(M)$, $\mathrm{IE}^*\big( f(\pi_k(X_k, T_k)) \mid T_k \big)$ converges in distribution (in the sense of Hoffmann-Jørgensen) to the random variable $\mathrm{IE}^*\big( f(\pi_\infty(X_\infty, T_\infty)) \mid T_\infty \big)$.

In this framework the reverse implications "(D1) $\Rightarrow$ (D2)" and "(E1) $\Rightarrow$ (E2)" remain valid under some measurability. For instance, these conclusions are correct, provided that for each $k \in \{1, 2, 3, \ldots\}$ the mapping $\pi_k(X_k, T_k)$ is measurable with respect to the $\sigma$-field on $M$ generated by closed balls with respect to $\rho$. Given some familiarity with these concepts, one can easily adapt the subsequent proofs of Lemmas 4.1 and 4.2.
Proof of Lemma 4.1. Define $Y_k := \pi_k(X_k, T_k)$ and $\tilde Y_k := \pi_k(\tilde X_k, T_k)$. Suppose first that $\mathcal{L}(Y_k, \tilde Y_k) \to_w Q \otimes Q$. Then for any $f \in C_b(M)$,
$$\begin{aligned}
\mathrm{IE}\big( \mathrm{IE}(f(Y_k) \mid T_k) - Q(f) \big)^2
&= \mathrm{IE}\big( \mathrm{IE}(f(Y_k) \mid T_k)^2 \big) - 2 Q(f)\, \mathrm{IE}\big( \mathrm{IE}(f(Y_k) \mid T_k) \big) + Q(f)^2 \\
&= \mathrm{IE}\big( \mathrm{IE}(f(Y_k) f(\tilde Y_k) \mid T_k) \big) - 2 Q(f)\, \mathrm{IE}\big( \mathrm{IE}(f(Y_k) \mid T_k) \big) + Q(f)^2 \\
&= \mathrm{IE}\big( f(Y_k) f(\tilde Y_k) \big) - 2 Q(f)\, \mathrm{IE}\, f(Y_k) + Q(f)^2 \\
&\to \int f(y) f(\tilde y)\, Q(dy)\, Q(d\tilde y) - Q(f)^2 = 0.
\end{aligned}$$
Thus $\mathcal{L}(Y_k \mid T_k) \to_{wp} Q$.

On the other hand, suppose that $\mathcal{L}(Y_k \mid T_k) \to_{wp} Q$. Then for arbitrary $f, g \in C_b(M)$,
$$\mathrm{IE}\, f(Y_k) g(\tilde Y_k) = \mathrm{IE}\big( \mathrm{IE}( f(Y_k) g(\tilde Y_k) \mid T_k ) \big) = \mathrm{IE}\big( \mathrm{IE}(f(Y_k) \mid T_k)\, \mathrm{IE}(g(\tilde Y_k) \mid T_k) \big) \to Q(f) Q(g),$$
because $\mathrm{IE}(h(Y_k) \mid T_k) \to_p \int h \, dQ$ and $|\mathrm{IE}(h(Y_k) \mid T_k)| \le \|h\|_\infty < \infty$ for each $h \in C_b(M)$. Thus we know that $\mathrm{IE}\, F(Y_k, \tilde Y_k) \to \int F \, d(Q \otimes Q)$ for arbitrary functions $F(y, \tilde y) = f(y) g(\tilde y)$ with $f, g \in C_b(M)$. But this is known to be equivalent to weak convergence of $\mathcal{L}(Y_k, \tilde Y_k)$ to $Q \otimes Q$; see van der Vaart and Wellner (1996, Chapter 1.4). $\Box$
Proof of Lemma 4.2. Define $Y_k := \pi_k(X_k, T_k)$ and $Y_{k,\ell} := \pi_k(X_{k,\ell}, T_k)$. Suppose first that $(Y_{k,\ell})_{1 \le \ell \le L} \to_{\mathcal{L}} (Y_{\infty,\ell})_{1 \le \ell \le L}$ for any integer $L \ge 1$. For arbitrary fixed $f \in C_b(M)$,
$$\begin{aligned}
\mathrm{IE}\Big( \mathrm{IE}(f(Y_k) \mid T_k) - L^{-1} \sum_{\ell=1}^L f(Y_{k,\ell}) \Big)^2
&= \mathrm{IE}\, \mathrm{IE}\Big( \Big( \mathrm{IE}(f(Y_k) \mid T_k) - L^{-1} \sum_{\ell=1}^L f(Y_{k,\ell}) \Big)^2 \,\Big|\, T_k \Big) \\
&= \mathrm{IE}\, \mathrm{Var}\Big( L^{-1} \sum_{\ell=1}^L f(Y_{k,\ell}) \,\Big|\, T_k \Big) \\
&\le L^{-1} \|f\|_\infty^2.
\end{aligned}$$
Thus the sample mean $L^{-1} \sum_{\ell=1}^L f(Y_{k,\ell})$ approximates the conditional expectation $\mathrm{IE}(f(Y_k) \mid T_k)$ arbitrarily well in quadratic mean, provided that $L$ is sufficiently large. However, the variable $L^{-1} \sum_{\ell=1}^L f(Y_{k,\ell})$ converges in distribution to $L^{-1} \sum_{\ell=1}^L f(Y_{\infty,\ell})$, according to the Continuous Mapping Theorem. Consequently, $\mathrm{IE}(f(Y_k) \mid T_k)$ converges in distribution to $\mathrm{IE}(f(Y_\infty) \mid T_\infty)$, whence $\mathcal{L}(Y_k \mid T_k) \to_{w\mathcal{L}} \mathcal{L}(Y_\infty \mid T_\infty)$.

On the other hand, suppose that the conditional distribution $\mathcal{L}(Y_k \mid T_k)$ converges weakly in distribution to $\mathcal{L}(Y_\infty \mid T_\infty)$. In order to show that $(Y_{k,\ell})_{1 \le \ell \le L}$ converges in distribution to $(Y_{\infty,\ell})_{1 \le \ell \le L}$ one has to show that
$$\mathrm{IE} \prod_{\ell=1}^L f_\ell(Y_{k,\ell}) \to \mathrm{IE} \prod_{\ell=1}^L f_\ell(Y_{\infty,\ell})$$
for arbitrary functions $f_1, f_2, \ldots, f_L \in C_b(M)$ (cf. van der Vaart and Wellner, 1996, Chapter 1.4). But
$$\mathrm{IE} \prod_{\ell=1}^L f_\ell(Y_{k,\ell}) = \mathrm{IE}\, \mathrm{IE}\Big( \prod_{\ell=1}^L f_\ell(Y_{k,\ell}) \,\Big|\, T_k \Big) = \mathrm{IE} \prod_{\ell=1}^L \mathrm{IE}(f_\ell(Y_k) \mid T_k).$$
Thus it suffices to show that $\big( \mathrm{IE}(f_\ell(Y_k) \mid T_k) \big)_{1 \le \ell \le L}$ converges in distribution to $\big( \mathrm{IE}(f_\ell(Y_\infty) \mid T_\infty) \big)_{1 \le \ell \le L}$. This follows easily from our assumption on $\mathcal{L}(Y_k \mid T_k)$ via Fourier transformation, since for arbitrary $\lambda \in \mathbb{R}^L$,
$$\mathrm{IE} \exp\Big( \sqrt{-1}\, \sum_{\ell=1}^L \lambda_\ell\, \mathrm{IE}(f_\ell(Y_k) \mid T_k) \Big) = \mathrm{IE} \exp\big( \sqrt{-1}\, \mathrm{IE}(F(Y_k) \mid T_k) \big)$$
with $F := \sum_{\ell=1}^L \lambda_\ell f_\ell \in C_b(M)$. $\Box$
4.2 Proofs for Section 2

That $\Gamma = \Gamma^{(q)}$ is "uniformly" distributed on the set of column-wise orthonormal matrices in $\mathbb{R}^{q \times d}$ means that $\mathcal{L}(U \Gamma) = \mathcal{L}(\Gamma)$ for any fixed orthonormal matrix $U \in \mathbb{R}^{q \times q}$. For existence and uniqueness of the latter distribution we refer to Eaton (1989, Chapters 1 and 2). For the present purposes the following explicit construction of $\Gamma$, described in Eaton (1989, Chapter 7), is sufficient. Let $Z = Z^{(q)} := (Z_1, Z_2, \ldots, Z_d)$ be a random matrix in $\mathbb{R}^{q \times d}$ with independent, standard Gaussian column vectors $Z_j$ in $\mathbb{R}^q$. Then $\Gamma := Z (Z^\top Z)^{-1/2}$ has the desired distribution, and
$$\Gamma = q^{-1/2} Z \big( I + O_p(q^{-1/2}) \big) \quad \text{as } q \to \infty. \tag{4.1}$$
This equality can be viewed as an extension of Poincaré's (1912) Lemma.
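The construction $\Gamma = Z(Z^\top Z)^{-1/2}$ and the approximation (4.1) are easy to check numerically. The following sketch (with an arbitrary choice of $q$ and $d$) forms the symmetric inverse square root of $Z^\top Z$ via an eigendecomposition:

```python
import numpy as np

rng = np.random.default_rng(2)
q, d = 1000, 3

# Z has d independent standard Gaussian columns in R^q;
# Gamma = Z (Z'Z)^{-1/2}, using the symmetric inverse square root of Z'Z.
Z = rng.standard_normal((q, d))
w, V = np.linalg.eigh(Z.T @ Z)
Gamma = Z @ (V @ np.diag(w ** -0.5) @ V.T)

# Column-wise orthonormal: Gamma'Gamma = I_d ...
err_orth = float(np.abs(Gamma.T @ Gamma - np.eye(d)).max())
# ... and Gamma = q^{-1/2} Z (I + O_p(q^{-1/2})), cf. (4.1)
err_approx = float(np.abs(Gamma - Z / np.sqrt(q)).max())
print(err_orth, err_approx)
```

The first error is at machine precision, while the second is small but of the order $q^{-1/2}$ relative to the entries of $q^{-1/2}Z$, as (4.1) predicts.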
Proof of Theorem 2.1. Let $\Gamma = \Gamma(Z)$ be as above. Suppose that $Z = Z^{(q)}$, $X = X^{(q)}$ and $\tilde X = \tilde X^{(q)}$ are independent with $\mathcal{L}(X) = \mathcal{L}(\tilde X) = P$, and let $Y, \tilde Y$ be two independent random vectors in $\mathbb{R}^d$ with distribution $Q$. According to Lemma 4.1, condition (A1) is equivalent to

(A1$'$)
$$\begin{pmatrix} \Gamma^\top X \\ \Gamma^\top \tilde X \end{pmatrix} \to_{\mathcal{L}} \begin{pmatrix} Y \\ \tilde Y \end{pmatrix}.$$

Because of equation (4.1) this can be rephrased as

(A1$''$)
$$\begin{pmatrix} Y^{(q)} \\ \tilde Y^{(q)} \end{pmatrix} := \begin{pmatrix} q^{-1/2} Z^\top X \\ q^{-1/2} Z^\top \tilde X \end{pmatrix} \to_{\mathcal{L}} \begin{pmatrix} Y \\ \tilde Y \end{pmatrix}.$$

Now we prove the equivalence of (A1$''$) and (A2), starting from the observation that
$$\mathcal{L}\begin{pmatrix} Y^{(q)} \\ \tilde Y^{(q)} \end{pmatrix} = \mathrm{IE}\, \mathcal{L}\left( \begin{pmatrix} Y^{(q)} \\ \tilde Y^{(q)} \end{pmatrix} \,\middle|\, X, \tilde X \right) = \mathrm{IE}\, N_{2d}(0, \Sigma^{(q)}),$$
where
$$\Sigma^{(q)} := \begin{pmatrix} q^{-1}\|X\|^2\, I & q^{-1} X^\top \tilde X\, I \\ q^{-1} X^\top \tilde X\, I & q^{-1}\|\tilde X\|^2\, I \end{pmatrix} \in \mathbb{R}^{2d \times 2d}.$$

Suppose that condition (A2) holds. Then $\Sigma^{(q)}$ converges in distribution to a random diagonal matrix
$$\Sigma := \begin{pmatrix} S^2 I & 0 \\ 0 & \tilde S^2 I \end{pmatrix}$$
with independent random variables $S^2, \tilde S^2$ having distribution $R$. Clearly this implies that
$$\mathrm{IE}\, N_{2d}(0, \Sigma^{(q)}) \to_w \mathrm{IE}\, N_{2d}(0, \Sigma) = \mathcal{L}\begin{pmatrix} Y \\ \tilde Y \end{pmatrix}$$
with $Q = \mathrm{IE}\, N_d(0, S^2 I)$. Hence (A1$''$) holds.

On the other hand, suppose that (A1$''$) holds. For any $t = (t_1^\top, t_2^\top)^\top \in \mathbb{R}^{2d}$, the Fourier transform of $\mathcal{L}\big( (Y^{(q)\top}, \tilde Y^{(q)\top})^\top \big)$ at $t$ equals
$$\mathrm{IE} \exp\big( \sqrt{-1}\, (t_1^\top Y^{(q)} + t_2^\top \tilde Y^{(q)}) \big) = \mathrm{IE} \exp\big( -t^\top \Sigma^{(q)} t / 2 \big) = H^{(q)}(a(t)),$$
where $a(t) := \big( -\|t_1\|^2/2,\, -\|t_2\|^2/2,\, -t_1^\top t_2 \big) \in \mathbb{R}^3$, and
$$H^{(q)}(a) := \mathrm{IE} \exp\big( a_1 q^{-1}\|X\|^2 + a_2 q^{-1}\|\tilde X\|^2 + a_3 q^{-1} X^\top \tilde X \big)$$
denotes the Laplace transform of $\mathcal{L}\big( q^{-1}\|X\|^2, q^{-1}\|\tilde X\|^2, q^{-1} X^\top \tilde X \big)$ at $a \in \mathbb{R}^3$. By assumption, the Fourier transform at $t$ converges to
$$\mathrm{IE} \exp(\sqrt{-1}\, t_1^\top Y)\, \mathrm{IE} \exp(\sqrt{-1}\, t_2^\top \tilde Y).$$
Setting $t_2 = 0$ and varying $t_1$ shows that the Laplace transform of $\mathcal{L}(q^{-1}\|X\|^2)$ converges pointwise on $]-\infty, 0]$ to a continuous function. Hence $q^{-1}\|X\|^2$ converges in distribution to some random variable $S^2 \ge 0$, and $Q = \mathrm{IE}\, N_d(0, S^2 I)$. Therefore, if $\tilde S^2$ denotes an independent copy of $S^2$, we know that $H^{(q)}(a(t))$ converges to
$$\mathrm{IE} \exp(a_1(t) S^2)\, \mathrm{IE} \exp(a_2(t) \tilde S^2) = \mathrm{IE} \exp\big( a_1(t) S^2 + a_2(t) \tilde S^2 + a_3(t) \cdot 0 \big).$$
A problem at this point is that for dimension $d = 1$ the set $\{ a(t) : t \in \mathbb{R}^{2d} \} \subset \mathbb{R}^3$ has empty interior. Thus we cannot apply the standard argument about weak convergence and convergence of Laplace transforms. However, letting $t_2 = \pm t_1$ with $\|t_1\|^2/2 = 1$, one may conclude that for arbitrary $r, \varepsilon > 0$,
$$\begin{aligned}
0 &= \lim_{q \to \infty} \Big( H^{(q)}(-1, -1, -2) + H^{(q)}(-1, -1, 2) - 2 H^{(q)}(-1, 0, 0)^2 \Big) \\
&= \lim_{q \to \infty} \Big( H^{(q)}(-1, -1, -2) + H^{(q)}(-1, -1, 2) - 2\, \mathrm{IE} \exp\big( -q^{-1}\|X\|^2 - q^{-1}\|\tilde X\|^2 \big) \Big) \\
&= 2 \lim_{q \to \infty} \mathrm{IE}\Big[ \exp\big( -q^{-1}\|X\|^2 - q^{-1}\|\tilde X\|^2 \big) \big( \cosh(2 q^{-1} X^\top \tilde X) - 1 \big) \Big] \\
&\ge 2 \exp(-2r) \big( \cosh(2\varepsilon) - 1 \big) \limsup_{q \to \infty} \mathrm{IP}\big\{ q^{-1}\|X\|^2 < r,\ q^{-1}\|\tilde X\|^2 < r,\ |q^{-1} X^\top \tilde X| \ge \varepsilon \big\} \\
&\ge 2 \exp(-2r) \big( \cosh(2\varepsilon) - 1 \big) \limsup_{q \to \infty} \Big( \mathrm{IP}\big\{ |q^{-1} X^\top \tilde X| \ge \varepsilon \big\} - 2\, \mathrm{IP}\big\{ q^{-1}\|X\|^2 \ge r \big\} \Big) \\
&\ge 2 \exp(-2r) \big( \cosh(2\varepsilon) - 1 \big) \Big( \limsup_{q \to \infty} \mathrm{IP}\big\{ |q^{-1} X^\top \tilde X| \ge \varepsilon \big\} - 2\, \mathrm{IP}\{ S^2 \ge r \} \Big),
\end{aligned}$$
whence
$$\limsup_{q \to \infty} \mathrm{IP}\big\{ |q^{-1} X^\top \tilde X| \ge \varepsilon \big\} \le 2\, \mathrm{IP}\{ S^2 \ge r \}.$$
Since $r > 0$ was arbitrary, $q^{-1} X^\top \tilde X \to_p 0$. $\Box$
Proof of the equivalence of (A1-2) and (A3). Proving that (A3) implies (A1-2) is elementary. In order to show that (A1-2) implies (A3), note first that conditions (A1-2) for the distributions $P^{(q)}$ imply the same conditions for the symmetrized distributions
$$P_o = P_o^{(q)} := \mathcal{L}_{(X, \tilde X) \sim P \otimes P}(X - \tilde X) = \mathcal{L}\big( (\sigma_k (Z_k - Z_{q+k}))_{1 \le k \le q} \big).$$
Condition (A2) for these distributions reads as follows:
$$\mathcal{L}\Big( q^{-1} \sum_{k=1}^q \sigma_k^2 (Z_k - Z_{q+k})^2 \Big) \to_w R_o = R \ast R \quad\text{and} \tag{4.2}$$
$$q^{-1} \sum_{k=1}^q \sigma_k^2 (Z_k - Z_{q+k})(Z_{2q+k} - Z_{3q+k}) \to_p 0. \tag{4.3}$$
The summands $q^{-1} \sigma_k^2 (Z_k - Z_{q+k})(Z_{2q+k} - Z_{3q+k})$, $1 \le k \le q$, in (4.3) are independent and symmetrically distributed. Therefore one can easily deduce from (4.3) that $q^{-1} \max_{1 \le k \le q} \sigma_k^2 \to 0$. But then
$$q^{-1} \sum_{k=1}^q \sigma_k^2 (Z_k - Z_{q+k})^2 = 2 q^{-1} \|\sigma\|^2 + o_p\big( 1 + q^{-1}\|\sigma\|^2 \big),$$
and one can deduce from (4.2) that $q^{-1} \|\sigma^{(q)}\|^2$ converges to some fixed number $r$; in particular, $R = \delta_r$. Now we return to the original distributions $P$. Here the second half of (A2) means that
$$q^{-1} \sum_{k=1}^q (\mu_k + \sigma_k Z_k)(\mu_k + \sigma_k Z_{q+k}) = q^{-1}\|\mu\|^2 + q^{-1} \sum_{k=1}^q \mu_k \sigma_k (Z_k + Z_{q+k}) + q^{-1} \sum_{k=1}^q \sigma_k^2 Z_k Z_{q+k} = o_p(1).$$
Since
$$\mathrm{IE}\Big( q^{-1} \sum_{k=1}^q \mu_k \sigma_k (Z_k + Z_{q+k}) \Big)^2 = 2 q^{-2} \sum_{k=1}^q \mu_k^2 \sigma_k^2 = o\big( q^{-1}\|\mu\|^2 \big) \quad\text{and}\quad \mathrm{IE}\Big( q^{-1} \sum_{k=1}^q \sigma_k^2 Z_k Z_{q+k} \Big)^2 = q^{-2} \sum_{k=1}^q \sigma_k^4 \to 0,$$
it follows that $q^{-1}\|\mu\|^2 \to 0$. $\Box$
4.3 Proof of Theorem 3.1

Let $(\Gamma^{(q,\ell)})_{\ell \ge 1}$ be a sequence of independent copies of $\Gamma$ which is stochastically independent from $\hat P$. Define
$$B^{(q,n,\ell)} := \big( n^{1/2} (\Gamma^{(q,\ell)\top} \hat P - \Gamma^{(q,\ell)\top} P)(h) \big)_{h \in \mathcal{H}}.$$
The $B^{(q,n,\ell)}$, $\ell \ge 1$, are dependent copies of $B^{(q,n)}$. Further consider independent processes $B_{Q,1}^{(1)}, B_{Q,1}^{(2)}, B_{Q,1}^{(3)}, \ldots$ and $B_{Q,2}$ with $\mathcal{L}(B_{Q,1}^{(\ell)}) = \mathcal{L}(B_{Q,1})$ and $\mathcal{L}(B_{Q,2})$ as described in Theorem 3.1. According to Lemma 4.2 it suffices to show that for any fixed integer $L \ge 1$ and $\omega := \{1, 2, \ldots, L\}$, the random elements
$$\tilde B^{(q,n)} := \big( B^{(q,n,\ell)}(h) \big)_{(\ell,h) \in \omega \times \mathcal{H}}$$
converge in distribution in $\ell^\infty(\omega \times \mathcal{H})$ to
$$\tilde B := \big( (B_{Q,1}^{(\ell)} + B_{Q,2})(h) \big)_{(\ell,h) \in \omega \times \mathcal{H}}$$
as $q \to \infty$ and $n \to \infty$. For that purpose it suffices to verify the following two claims:

(F1) As $q \to \infty$ and $n \to \infty$, the finite-dimensional marginal distributions of the process $\tilde B^{(q,n)}$ converge to the corresponding finite-dimensional distributions of $\tilde B$.

(F2) As $q \to \infty$, $n \to \infty$ and $\delta \downarrow 0$,
$$\max_{\ell \in \omega} \sup_{g, h \in \mathcal{H} :\, \rho_Q(g,h) < \delta} \big| B^{(q,n,\ell)}(g) - B^{(q,n,\ell)}(h) \big| \to_p 0.$$

In order to verify assertions (F1-2) we consider the conditional distribution of $\tilde B^{(q,n)}$ given the random matrix
$$\tilde \Gamma = \tilde \Gamma^{(q)} := \big( \Gamma^{(q,1)}, \Gamma^{(q,2)}, \ldots, \Gamma^{(q,L)} \big) \in \mathbb{R}^{q \times Ld}.$$
In fact, if we define $\tilde f_{\ell,h}(v) := h(v_\ell)$ for $v = (v_1^\top, \ldots, v_L^\top)^\top \in \mathbb{R}^{Ld}$, then
$$B^{(q,n,\ell)}(h) = n^{1/2} (\tilde \Gamma^\top \hat P - \tilde \Gamma^\top P)(\tilde f_{\ell,h}).$$
Thus $\mathcal{L}(\tilde B^{(q,n)} \mid \tilde \Gamma)$ is essentially the distribution of an empirical process based on $n$ independent random vectors with distribution $\tilde \Gamma^\top P$ on $\mathbb{R}^{Ld}$ and indexed by the family $\tilde{\mathcal{H}} := \{ \tilde f_{\ell,h} : \ell \in \omega,\ h \in \mathcal{H} \}$.

The multivariate version of Lindeberg's Central Limit Theorem entails that for large $q$ and $n$ the finite-dimensional marginal distributions of $\tilde B^{(q,n)}$, conditional on $\tilde \Gamma$, can be approximated by the corresponding finite-dimensional distributions of a centered Gaussian process on $\omega \times \mathcal{H}$ with the same covariance function, namely
$$\Sigma^{(q)}\big( (\ell, g), (m, h) \big) := \mathrm{Cov}\big( B^{(q,n,\ell)}(g), B^{(q,n,m)}(h) \,\big|\, \tilde \Gamma \big) = \tilde \Gamma^\top P(\tilde f_{\ell,g}\, \tilde f_{m,h}) - \tilde \Gamma^\top P(\tilde f_{\ell,g})\, \tilde \Gamma^\top P(\tilde f_{m,h}).$$