
A Appendix: Proofs


Before we come to the proofs of our results let us collect some facts about iterative projections. Let us define the following spaces of additive functions:

  $H = \{ m \in L_2(p) : m(x) = m_1(x_1) + \dots + m_d(x_d) \ (p\text{-a.s.}), \ \int m(x)\,p(x)\,dx = 0 \},$

  $H_j = \{ m \in H : m(x) = m_j(x_j) \ (p\text{-a.s.}) \text{ for a function } m_j \in L_2(p_j) \}.$

The norm in the space $H$ is denoted by $\|m\|^2 = \int m^2(x)\,p(x)\,dx$ for $m \in H$. For $m \in H_j$ we get, with $m_j(x_j) = m(x)$ ($p$-a.s.), that $\|m\|^2 = \int m^2(x)\,p(x)\,dx = \int m_j^2(x_j)\,p_j(x_j)\,dx_j$. The projection of an element of $H$ onto $H_j$ is denoted by $\Pi_j$. The operator $Q_j = I - \Pi_j$ gives the projection onto the linear space

  $H_j^\perp = \{ m \in H : \int m(x)\,\varphi(x_j)\,p(x)\,dx = 0 \text{ for all } \varphi \in H_j \}$
        $= \{ m \in H : \int m(x)\,p(x)\,dx_{-j} = 0 \ (p_j\text{-a.s.}) \},$

where $dx_{-j}$ denotes integration over all components of $x$ except $x_j$. For $m(x) = m_1(x_1) + \dots + m_d(x_d) \in H$ we get

  $Q_j m(x) = m_1(x_1) + \dots + m_{j-1}(x_{j-1}) + m_j^*(x_j) + m_{j+1}(x_{j+1}) + \dots + m_d(x_d)$   (36)

with

  $m_j^*(x_j) = -\sum_{k \ne j} \int m_k(x_k)\,\frac{p_{jk}(x_j, x_k)}{p_j(x_j)}\,dx_k.$   (37)

We define the operator $\hat Q_j$ as $Q_j$ but with $m_j^*(x_j)$ on the right hand side of (36) replaced by

  $\hat m_j^*(x_j) = -\sum_{k \ne j} \int m_k(x_k)\,\frac{\hat p_{jk}(x_j, x_k)}{\hat p_j(x_j)}\,dx_k.$   (38)


Put $T = Q_d \cdots Q_1$ and $\hat T = \hat Q_d \cdots \hat Q_1$. We will see below that in our setup the backfitting algorithm is based on iterative applications of $\hat T$. A central tool for understanding backfitting is the next lemma, which describes iterative applications of $T$.
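The backfitting cycle behind $\hat T$ can be sketched numerically. The following is a minimal illustration only, not the estimator analysed in this paper: it runs the generic update $m_j \leftarrow$ (centred) smooth of the partial residuals with linear smoother matrices, and the Nadaraya-Watson smoother, the toy data, and all function names are hypothetical choices made for the sketch.

```python
import numpy as np

def backfit(S, y, n_iter=50):
    """Backfitting with linear smoother matrices S[j]:
    cycle m_j <- centre( S_j (y - sum_{k != j} m_k) ) for j = 1..d."""
    n, d = len(y), len(S)
    m = [np.zeros(n) for _ in range(d)]
    for _ in range(n_iter):
        for j in range(d):
            partial = y - sum(m[k] for k in range(d) if k != j)
            m[j] = S[j] @ partial
            m[j] -= m[j].mean()   # each component integrates to zero, as in H
    return m

def nw_smoother(x, h=0.1):
    """Nadaraya-Watson smoother matrix with a Gaussian kernel (toy choice)."""
    K = np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2)
    return K / K.sum(axis=1, keepdims=True)

# Hypothetical additive data with two covariates.
rng = np.random.default_rng(0)
n = 200
x1, x2 = rng.uniform(size=n), rng.uniform(size=n)
y = np.sin(2 * np.pi * x1) + x2 ** 2 + 0.1 * rng.normal(size=n)

m = backfit([nw_smoother(x1), nw_smoother(x2)], y - y.mean())
fit = y.mean() + m[0] + m[1]
```

Each pass through the inner loop is one application of the empirical analogue of $Q_d \cdots Q_1$ to the current additive fit.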

Lemma [norm of the operator $T$]. Suppose that condition A1 holds. Then $T : L_2(p) \to L_2(p)$ is a positive self-adjoint operator with operator norm $\gamma = \sup\{ \|Tf\| : \|f\| \le 1 \} < 1$. Hence, for every $m \in H$ we get

  $\|T^r m\| \le \gamma^r \|m\|.$   (39)

Furthermore, for every $m \in H$ there exist $m_j \in H_j$ ($1 \le j \le d$) such that $m(u) = m_1(u_1) + \dots + m_d(u_d)$ ($p$-a.s.) and, with a constant $c > 0$,

  $\|m\| \ge c \max\{ \|m_1\|, \dots, \|m_d\| \}.$   (40)

Proof of Lemma. We start by proving (39). It is known that (39) holds with

  $\gamma^2 \le 1 - \prod_{j=1}^{d-1} \sin^2(\varphi_j),$

where $\cos \varphi_j = \rho(H_j, H_{j+1} + \dots + H_d)$ and where for two subspaces $L_1$ and $L_2$ the quantity $\rho(L_1, L_2)$ is the cosine of the minimal angle between $L_1$ and $L_2$, i.e.,

  $\rho(L_1, L_2) = \sup\{ \int h_1(x)\,h_2(x)\,p(x)\,dx : h_j \in L_j \cap (L_1 \cap L_2)^\perp, \ \|h_j\| \le 1 \ (j = 1, 2) \}.$

This result was shown in Smith, Solomon, and Wagner (1977). For a discussion, see Deutsch (1985) and Bickel, Klaassen, Ritov and Wellner (1993), Appendix A.4. We will show now that for $1 \le j \le d$ the subspaces $M_j = H_1 + \dots + H_j$ are closed subsets of $L_2(p)$. This implies that $\rho(H_{j+1}, M_j) < 1$ for $j = 1, \dots, d-1$; see again Deutsch (1985), Lemma 2.5, and Bickel, Klaassen, Ritov and Wellner (1993), Appendix A.4, Proposition 2.

To prove that $M_j$ is closed we will use the following two facts. For two closed subspaces $L_1$ and $L_2$ of $L_2(p)$, the sum $L_1 + L_2$ is closed if and only if there exists a constant $c > 0$ such that for all $m \in L_1 + L_2$ there exist $m_1 \in L_1$ and $m_2 \in L_2$ with $m(u) = m_1(u_1) + m_2(u_2)$ ($p$-a.s.) and

  $\|m\| \ge c \max[\, \|m_1\|, \|m_2\| \,].$   (41)

Furthermore, $L_1 + L_2$ is closed if the projection of $L_2$ onto $L_1$ is compact. For the proof of these two statements see Bickel, Klaassen, Ritov and Wellner (1993), Appendix A.4, Proposition 2. Suppose


now that it has already been proved for $j \le j_0 - 1$ that $M_j$ is closed and that we want to show that $M_{j_0}$ is closed. As mentioned above, for this claim it suffices to show that $\Pi_{j_0}|_{M_{j_0-1}}$ is compact. We remark first that (41) implies that for every $m \in M_{j_0-1}$ there exist $m_j \in H_j$ ($j \le j_0 - 1$) such that $m(u) = m_1(u_1) + \dots + m_{j_0-1}(u_{j_0-1})$ ($p$-a.s.) and, with a constant $c > 0$,

  $\|m\| \ge c \max[\, \|m_1\|, \dots, \|m_{j_0-1}\| \,].$   (42)

We will prove that

  $\|\Pi_{j_0} m\|^2 \le \text{const.} \left[ \sum_{j=1}^{j_0-1} \int R_{j,j_0}^2(x_j, x_{j_0})\, p_j(x_j)\, p_{j_0}(x_{j_0})\,dx_j\,dx_{j_0} \right] \|m\|^2$   (43)

with

  $R_{j,j_0}(x_j, x_{j_0}) = \frac{p_{j,j_0}(x_j, x_{j_0})}{p_{j_0}(x_{j_0})\, p_j(x_j)}.$

Inequality (43) implies compactness of $\Pi_{j_0}|_{M_{j_0-1}}$. To see this one argues as in the standard proofs of compactness of Hilbert-Schmidt operators; see, e.g., Example 3.2.4 in Balakrishnan (1981).

It remains to show (43). This follows from (42) with applications of the Cauchy-Schwarz inequality. Equation (40) follows as (42).
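The contraction (39) and its dependence on the angle between subspaces can be seen in a finite-dimensional toy analogue, where the spaces $H_j^\perp$ are replaced by ordinary column spans and $T$ by a product of two orthogonal projection matrices. This is an illustration of the mechanism only, not part of the proof:

```python
import numpy as np

rng = np.random.default_rng(1)

def proj(A):
    """Orthogonal projection matrix onto the column span of A."""
    Q, _ = np.linalg.qr(A)
    return Q @ Q.T

# Two generic 2-dimensional subspaces of R^5: their intersection is {0},
# so the cosine of the minimal angle between them is strictly below 1.
P1 = proj(rng.normal(size=(5, 2)))
P2 = proj(rng.normal(size=(5, 2)))

T = P2 @ P1
gamma = np.linalg.norm(T, 2)   # operator norm of the product of projections
x = rng.normal(size=5)
```

By submultiplicativity, $\|T^r x\| \le \gamma^r \|x\|$, the finite-dimensional counterpart of (39).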

Proof of Theorem 1. The following lemma establishes the result.

Lemma [norm of the operator $\hat T$]. Suppose that conditions A1-A2 hold. Choose $\tilde\gamma$ with $\gamma < \tilde\gamma < 1$. Then, with probability tending to one, the operator norm $\sup\{ \|\hat T f\| : \|f\| \le 1 \}$ is bounded by $\tilde\gamma$.


Proof of Lemma. We remark first that the distance between $m_j^*$ and $\hat m_j^*$, see (37)-(38), can be bounded as follows:

  $\|\hat m_j^* - m_j^*\| \le \sum_{k \ne j} \|m_k\|\, S_{jk}$

with

  $S_{jk}^2 = \int \left[ \frac{p_{jk}(x_j, x_k)}{p_k(x_k)\, p_j(x_j)} - \frac{\hat p_{jk}(x_j, x_k)}{p_k(x_k)\, \hat p_j(x_j)} \right]^2 p_k(x_k)\, p_j(x_j)\,dx_j\,dx_k.$

With $S_j = \max_{k \ne j} |S_{jk}|$, this and equation (40) imply

  $\|\hat m_j^* - m_j^*\| \le \frac{d}{c}\, \|m\|\, S_j.$

Now, because of (A2), $S_j = o_P(1)$. This gives $\|\hat Q_j - Q_j\| = o_P(1)$. The statement of the lemma now follows from

  $\|\hat T - T\| = o_P(1).$

Lemma [stochastic expansion of $\tilde m$]. Suppose that conditions A1-A2 hold. Then there exist constants $0 < \gamma < 1$ and $C > 0$ such that, with probability tending to one, for $\tilde m$ the following stochastic expansion holds for $s \ge 1$:

  $\tilde m(x) = \sum_{r=0}^{s} \hat T_1^r\, \hat m_1(x) + \dots + \sum_{r=0}^{s} \hat T_d^r\, \hat m_d(x) + R^{[s]}(x),$

where $\hat T_j = \hat Q_j \hat Q_{j-1} \cdots \hat Q_1 \hat Q_d \hat Q_{d-1} \cdots \hat Q_{j+1}$ and $R^{[s]}(x) = R_1^{[s]}(x_1) + \dots + R_d^{[s]}(x_d)$ is a function in $H$ with

  $\|R_j^{[s]}\| \le C \gamma^s.$   (44)

Under the additional assumption of (A3) it holds that

  $\sup_{x_j} |R_j^{[s]}(x_j)| \le C \gamma^s.$   (45)

Proof of Lemma. We remark first that (11) can be rewritten as

  $\tilde m(x) = \hat Q_j \tilde m(x) + \hat m_j(x_j).$

Iterative application of this equation for $j = 1, \dots, d$ gives $\tilde m(x) = \hat T \tilde m(x) + \hat b(x)$, where

  $\hat b(x) = \hat Q_d \cdots \hat Q_2\, \hat m_1(x) + \dots + \hat Q_d\, \hat m_{d-1}(x) + \hat m_d(x_d).$

With the last equality we get the following expansion:

  $\tilde m(x) = \sum_{r=0}^{\infty} \hat T^r \hat b(x).$

Plugging the definition of $\hat b$ into this equation gives

  $\tilde m(x) = \sum_{r=0}^{\infty} \hat T_1^r\, \hat m_1(x) + \dots + \sum_{r=0}^{\infty} \hat T_d^r\, \hat m_d(x).$

The operator norms of $\hat T_1, \dots, \hat T_d$ are smaller than $\gamma$, with probability tending to one, for $\gamma < 1$ large enough. This follows from the last lemma, and it shows that the infinite series expansion in the last equation is well defined. Furthermore, it can be used to prove that for $C_1 > 0$ large enough, with probability tending to one, $\|R_j^{[s]}\| \le C_1 \gamma^s$. This implies claim (44) because of (40).

Assume now (A4). For the proof of (45) note that for $C_2 > 0$ large enough, with probability tending to one, for all functions $f, g$ in $H_j$ with $\sup_{x_j} |f(x_j)| \le 1$ and $\|g\| \le 1$ it holds for $k \ne j$ that

  $\left| \int \frac{\hat p_{jk}(x_j, x_k)}{\hat p_k(x_k)}\, f(x_j)\,dx_j \right| \le C_2,$   (46)

  $\left| \int \frac{\hat p_{jk}(x_j, x_k)}{\hat p_k(x_k)}\, g(x_j)\,dx_j \right| \le C_2.$   (47)

Equation (47) follows from assumption (A4) by application of the Cauchy-Schwarz inequality. Equations (46) and (47) imply that for $C_3 > 0$ large enough, with probability tending to one, for all functions $h$ in $H$ with $\|h\| \le 1$ it holds that

  $\sup_x |\hat T h(x)| \le C_3.$   (48)

Claim (45) can be shown by using (44) and (48).
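The expansion $\tilde m = \sum_r \hat T^r \hat b$ is a Neumann series for the fixed point of $\tilde m = \hat T \tilde m + \hat b$, and truncating it at $s$ leaves a tail of order $\gamma^s$, which is the mechanism behind the remainder bound (44). A generic finite-dimensional sketch of this (the matrix $T$ below is an arbitrary contraction, not the backfitting operator):

```python
import numpy as np

rng = np.random.default_rng(2)

# A contraction T with operator norm 0.9, standing in for T-hat,
# whose norm is below some gamma < 1 with probability tending to one.
M = rng.normal(size=(6, 6))
T = 0.9 * M / np.linalg.norm(M, 2)
gamma = np.linalg.norm(T, 2)
b = rng.normal(size=6)

# Exact fixed point of m = T m + b.
m_exact = np.linalg.solve(np.eye(6) - T, b)

def neumann(T, b, s):
    """Truncated Neumann series sum_{r=0}^{s} T^r b."""
    m, term = b.copy(), b.copy()
    for _ in range(s):
        term = T @ term
        m += term
    return m
```

The truncation error is bounded by the geometric tail $\gamma^{s+1} \|b\| / (1 - \gamma)$, so a logarithmic number of terms already gives polynomial accuracy, as exploited with $s = C \log n$ below.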

Proof of Theorem 2. The following lemma establishes the result.

Lemma [behaviour of the stochastic component of $\tilde m$]. Suppose (A1)-(A5). Then we have that

  $\sup_{x_j} |\tilde m_j^A(x_j) - \hat m_j^A(x_j)| = O_P(\log n / n^{1/2}).$   (49)

Proof of Lemma. Proceeding as in the last lemma we get, with $s = C \log n$ (where $C$ is chosen large enough),

  $\tilde m^A(x) - \hat m^A(x) = \sum_{r=1}^{s} \hat T_1^r\, \hat m_1^A(x) + \dots + \sum_{r=1}^{s} \hat T_d^r\, \hat m_d^A(x) + R^{[s]}(x),$

where $R^{[s]}(x) = R_1^{[s]}(x_1) + \dots + R_d^{[s]}(x_d)$ is a function in $H$ with

  $\sup_{x_j} |R_j^{[s]}(x_j)| \le C \gamma^s,$

which for $C$ chosen large enough is of smaller order than $n^{-1/2}$. It remains to show that

  $\sup_x |\hat T_1^r\, \hat m_1^A(x)| = O_P(\gamma^r \log n / n^{1/2}).$

This follows from assumption (A5) by arguments as in the proof of the last lemma.

Proofs of Theorems 1' and 2'. The theorems follow as Theorems 1 and 2 by essentially the same arguments. In particular, instead of $L_2(p)$ we consider now

  $L_2(V, p) = \{ f = (f_0, \dots, f_d) : f_j : R^d \to R \text{ with } \int f^T(x)\, V\, f(x)\, p(x)\,dx < \infty \}.$

Furthermore, the spaces $H$ and $H_j$ are now defined as

  $H = \{ m \in L_2(V, p) : m_0(x) = m_1(x_1) + \dots + m_d(x_d) \ (p\text{-a.s.}) \text{ for functions } m_1 \in L_2(p_1), \dots, m_d \in L_2(p_d),$
      $\int m_0(x)\,p(x)\,dx = 0, \text{ and for } j = 1, \dots, d \text{ the function } m_j \text{ depends only on } x_j \},$

  $H_j = \{ m \in H : m_0(x) = m_j(x_j) \ (p\text{-a.s.}) \text{ for a function } m_j \in L_2(p_j), \text{ and for } \ell \ne j \text{ it holds that } m_\ell(x) \equiv 0 \}.$

Note that again every function $f$ in $H$ is a sum of functions in the $H_j$: there exist functions $f_j : R \to R$ such that $x \mapsto (e_0 + e_j)\, f_j(x_j)$ is a function in $H_j$ and

  $f(x) = \sum_{j=1}^{d} (e_0 + e_j)\, f_j(x_j).$

Here, for $j = 0, \dots, d$, the vector $e_j$ denotes the $(j+1)$st unit vector of $R^{d+1}$. The operator $Q_j$ is now defined as in (36) with

  $m_j^*(x_j) = -\sum_{k \ne j} \int M_j^{-1}\, S_{jk}\, \frac{p_{jk}(x_j, x_k)}{p_j(x_j)}\, m_k(x_k)\,dx_k.$

Furthermore, we define the operator $\hat Q_j$ now as $Q_j$ but with $m_j^*(x_j)$ on the right hand side of (36) replaced by

  $\hat m_j^*(x_j) = -\sum_{k \ne j} \int \hat M_j^{-1}(x_j)\, \hat S_{jk}(x_j, x_k)\, m_k(x_k)\,dx_k.$

Proceeding as above, one can show that the norm of the operators $T = Q_d \cdots Q_1$ and $\hat T = \hat Q_d \cdots \hat Q_1$ is smaller than $\gamma < 1$ [with probability tending to one]. Theorems 1' and 2' follow by stochastic expansions of $\tilde m$; compare the last two lemmas.

Proof of Theorem 3. Let $\|g\|_\infty = \sup_x |g(x)|$. Then, under these conditions, $\|\hat p_{jk} - E(\hat p_{jk})\|_\infty = O(\cdots)$, and by (50) and (51), assumptions A2 and A4 are also satisfied by straightforward use of the geometric series expansion and the above result. Specifically, we have

  $\frac{1}{\hat p_j(x_j)} = \frac{1}{p_j(x_j)} - \frac{\hat p_j(x_j) - p_j(x_j)}{\hat p_j(x_j)\, p_j(x_j)}.$
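The displayed expansion is an exact algebraic identity (multiply through by $\hat p_j p_j$ to check); iterating it generates the geometric series in $(\hat p_j - p_j)/p_j$ mentioned above. A one-line numeric check, with purely hypothetical density values:

```python
# Hypothetical values of p_j-hat(x_j) and p_j(x_j) at a fixed point x_j.
p_hat, p = 0.42, 0.40

lhs = 1.0 / p_hat
rhs = 1.0 / p - (p_hat - p) / (p_hat * p)  # the expansion's right-hand side
```

Because the identity is exact, the two sides agree to machine precision whenever both densities are bounded away from zero.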

Likewise, assumption A3 is satisfied by B2, B3, and (52). By the triangle inequality, $\sup_x \cdots$. As for the first term, without loss of generality we can suppose that

  $\hat m_j^A(x_j) = n^{-1} \sum_{i=1}^{n} \cdots$

by a straightforward change of variables. The argument is now quite similar to that given in Masry (1996). We drop the $k$ subscript for convenience. Since the support of $X$ is compact, it can be covered by a finite number $c(n)$ of cubes $I_{n,r}$ with centres $x_r$ and side length $l(n)$. We then have

  $\sup_{x \in X} \cdots.$

To handle the second term we must use an exponential inequality and a blocking argument as in Masry's proof. In conclusion, by appropriate choice of $c(n)$ we obtain $Q_1 + Q_2 = O(\log n / n^{1/2})$ with probability one.


References

[1] Auestad, B. and Tjøstheim, D. (1991). Functional identification in nonlinear time series. In Nonparametric Functional Estimation and Related Topics, ed. G. Roussas, Kluwer Academic: Amsterdam, pp. 493-507.

[2] Balakrishnan, A. V. (1981). Applied Functional Analysis. Springer, New York, Heidelberg, Berlin.

[3] Bickel, P. J., Klaassen, C. A. J., Ritov, Y., and Wellner, J. A. (1993). Efficient and Adaptive Estimation for Semiparametric Models. The Johns Hopkins University Press, Baltimore and London.

[4] Bollerslev, T., Engle, R. F., and Nelson, D. B. (1994). ARCH models. In The Handbook of Econometrics, vol. IV, eds. D. F. McFadden and R. F. Engle III. North Holland.

[5] Deutsch, F. (1985). Rate of convergence of the method of alternating projections. In Parametric Optimization and Approximation, eds. B. Brosowski and F. Deutsch, pp. 96-107. Birkhäuser, Basel, Boston, Stuttgart.

[6] Fan, J., Mammen, E., and Härdle, W. (1996). Direct estimation of low dimensional components in additive models. Preprint.

[7] Härdle, W. (1990). Applied Nonparametric Regression. Cambridge: Cambridge University Press.

[8] Härdle, W. and Yang, L. (1996). Nonparametric autoregression with multiplicative volatility and additive mean. Forthcoming in J. Time Ser. Anal.

[9] Hastie, T. and Tibshirani, R. (1991). Generalized Additive Models. Chapman and Hall, London.

[10] Linton, O. B. (1996). Efficient estimation of additive nonparametric regression models. Biometrika, to appear.

[11] Linton, O. B. and Härdle, W. (1996). Estimating additive regression models with known links. Biometrika 83.

[12] Linton, O. B. and Nielsen, J. P. (1995). Estimating structured nonparametric regression by the kernel method. Biometrika 82, 93-101.

[13] Mammen, E., Marron, J. S., Turlach, B., and Wand, M. P. (1997). A general framework for smoothing. Preprint.

[14] Masry, E. (1996). Multivariate regression estimation: local polynomial fitting for time series. Stochastic Processes and their Applications 65, 81-101.

[15] Masry, E. (1996). Multivariate local polynomial regression for time series: uniform strong consistency and rates. J. Time Ser. Anal. 17, 571-599.

[16] Newey, W. K. (1994). Kernel estimation of partial means. Econometric Theory 10, 233-253.

[17] Nielsen, J. P. and Linton, O. B. (1997). An optimization interpretation of integration and backfitting estimators for separable nonparametric models. J. Roy. Statist. Soc., Ser. B, forthcoming.

[18] Opsomer, J. D. (1997). On the existence and asymptotic properties of backfitting estimators. Preprint.

[19] Opsomer, J. D. and Ruppert, D. (1997). Fitting a bivariate additive model by local polynomial regression. Ann. Statist. 25, 186-211.

[20] Robinson, P. M. (1983). Nonparametric estimators for time series. J. Time Ser. Anal. 4, 185-197.

[21] Rosenblatt, M. (1956). A central limit theorem and a strong mixing condition. Proc. Nat. Acad. Sci. 42, 43-47.

[22] Ruppert, D. and Wand, M. P. (1994). Multivariate locally weighted least squares regression. Ann. Statist. 22, 1346-1370.

[23] Smith, K. T., Solomon, D. C., and Wagner, S. L. (1977). Practical and mathematical aspects of the problem of reconstructing objects from radiographs. Bull. Amer. Math. Soc. 83, 1227-1270.

[24] Tjøstheim, D. and Auestad, B. (1994). Nonparametric identification of nonlinear time series: projections. J. Am. Stat. Assoc. 89, 1398-1409.
