

3.8 Trend estimation via wavelet shrinkage

3.8.2 Hard thresholding wavelet trend estimation

Li and Xiao (2007) considered the nonparametric regression model (1.2) with long-memory errors that are not necessarily Gaussian, and provided an asymptotic expansion for the MISE of hard thresholding wavelet estimators. They derived the MISE expansion for piecewise smooth underlying mean regression functions.

It turns out that the MISE convergence rate coincides with that of the analogous minimax expansion (see, e.g., Wang 1996). Throughout this thesis, $m_\psi \in \mathbb{N}$ will denote the number of vanishing moments of $\psi$, i.e.

\[
\int_0^N t^k \psi(t)\,dt = 0, \qquad k = 0, 1, \ldots, m_\psi - 1, \tag{3.22}
\]
and
\[
\int_0^N t^{m_\psi} \psi(t)\,dt = \nu_{m_\psi} \neq 0. \tag{3.23}
\]
We assume that both $\phi$ and $\psi$ satisfy a uniform Hölder condition of exponent $1/2$ and that the following conditions are satisfied:
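Conditions (3.22) and (3.23) can be checked numerically for a concrete wavelet. A minimal sketch for the Haar mother wavelet (so $N = 1$), for which $m_\psi = 1$ and $\nu_1 = -1/4$; the function names are ours:

```python
import numpy as np

def haar_psi(t):
    """Haar mother wavelet: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    t = np.asarray(t)
    return np.where((t >= 0) & (t < 0.5), 1.0,
                    np.where((t >= 0.5) & (t < 1.0), -1.0, 0.0))

# midpoint rule on [0, 1]
n = 1_000_000
t = (np.arange(n) + 0.5) / n

m0 = np.mean(haar_psi(t))       # 0th moment: vanishes, so m_psi >= 1
m1 = np.mean(t * haar_psi(t))   # 1st moment: nu_1 = -1/4 != 0, so m_psi = 1
```

Here `m0` is numerically zero while `m1` equals $-1/4$, confirming (3.22) and (3.23) with $m_\psi = 1$.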

The errors $\xi_i$ in (1.2) are given by $\xi_i = G(\epsilon_i)$, where $(\epsilon_i)$ is a zero-mean, second-order stationary Gaussian process with long-range dependence and $G$ has Hermite rank $m$. Here, long-range dependence is characterized by (1.1).

The smoothing parameter $q$, the decomposition level $J$ and the thresholds $\delta_j$ are functions of $n$. We assume that $J \to \infty$, $q \to \infty$ as $n \to \infty$, and that for every $j = 1, 2, \ldots, q-1$,
\[
2^{J+j}\delta_j^2 \to 0, \qquad 2^{(J+j)(2r+1)}\delta_j^2 \to \infty, \qquad \delta_j^2 \ge \frac{(4e)^m C_1^2 N^{1+m\alpha}(\ln n)^{m+1}}{m^m\, n^{m\alpha}\, 2^{(J+j)(1-m\alpha)}}, \tag{3.24}
\]
where
\[
C_1^2 = \frac{C_\gamma J^2(m)}{m!} \int_0^N\!\!\int_0^N |x-y|^{-m\alpha}\, \psi(x)\psi(y)\,dx\,dy
\]
with $J(m) = E(G(Z)H_m(Z))$, where $Z$ is Gaussian with mean $0$ and variance $1$, and $H_m$ is the Hermite polynomial of order $m$. We additionally assume that $0 < m\alpha < 1$.
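The quantity $J(m)$, and with it the Hermite rank, can be computed by Gauss–Hermite quadrature. A sketch (the function name and quadrature degree are our choices):

```python
import numpy as np

def J_coeff(G, m, deg=60):
    """J(m) = E[G(Z) H_m(Z)] for standard normal Z, computed with
    Gauss-Hermite quadrature in the probabilists' convention (He_m).
    The Hermite rank of G is the smallest m with J(m) != 0."""
    x, w = np.polynomial.hermite_e.hermegauss(deg)  # weight exp(-x^2/2)
    w = w / np.sqrt(2.0 * np.pi)                    # normalize to N(0,1)
    He_m = np.polynomial.hermite_e.HermiteE([0.0] * m + [1.0])
    return float(np.sum(w * G(x) * He_m(x)))
```

For example, for $G(x) = x^2$ this gives $J(1) = 0$ and $J(2) = E[Z^2(Z^2-1)] = 2$, so the Hermite rank is $2$; the quadrature is exact for polynomial $G$.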

Under these conditions they derived the following.

Theorem 3.4. Assume, in addition to the conditions stated above, that the $r$th derivative $g^{(r)}$ is continuous and bounded on $[0,1]$, and let $m_\psi = r$. Then, as $n \to \infty$,
\[
E\left[\|g-\hat g\|^2_{L^2}\right] = C_2^2\,\big(n^{-1}N2^J\big)^{m\alpha} + (N2^J)^{-2r}\,(r!)^{-2}\,\nu_r^2\,\big(1-2^{-2r}\big)^{-1}\int_0^1 \big(g^{(r)}(t)\big)^2\,dt, \tag{3.25}
\]
where $C_2$ is a positive and finite constant defined by
\[
C_2^2 = C_\gamma\,\frac{J^2(m)}{m!}\int_0^N\!\!\int_0^N |x-y|^{-m\alpha}\,\phi(x)\phi(y)\,dx\,dy.
\]

In this theorem, they assumed that the mean regression function $g$ is $r$ times continuously differentiable. However, this result can be generalized. In addition to the last theorem, Li and Xiao (2007) showed that if $g^{(r)}$ is only piecewise continuous, the result still holds, as given in the following:

Theorem 3.5. Assume, in addition to the conditions stated above, that $g^{(r)}$ exists on $[0,1]$ except for at most a finite number of points and, where it exists, is piecewise continuous and bounded. Furthermore, assume that $\mathrm{supp}(g^{(r)})$ has positive Lebesgue measure and $m_\psi = r$. In particular, $g$ itself may be only piecewise continuous. Also, assume that $2^{-(2r+m\alpha)J}\,n^{2rm\alpha} \to \infty$. Then (3.25) still holds.

Furthermore, this implies the following.

Remark 9. For the special case of Gaussian errors, the Hermite rank of $G$ is $1$ (i.e., $m = 1$). In this case, if the decomposition level $J$ is chosen of size $\frac{\alpha}{2r+\alpha}\log_2 n$, then the convergence rate of the MISE is $n^{-\frac{2r\alpha}{2r+\alpha}}$, which is the same as in Wang (1996). Moreover, Hall and Hart (1990) gave a similar asymptotic expansion of the MISE for kernel estimators in fixed-design nonparametric regression when the errors form a Gaussian long-memory process and the trend is smooth.
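The rate in Remark 9 reflects the variance–bias tradeoff in (3.25). A sketch that recovers the exponent $-2r\alpha/(2r+\alpha)$ numerically for $m = 1$, with all constants set to one (a simplification of (3.25), not the theorem itself):

```python
import numpy as np

def min_mise(n, r, alpha):
    """Numerically minimize (x/n)^alpha + x^(-2r) over x = N*2^J > 0,
    i.e. the variance and bias terms of (3.25) with all constants set to 1."""
    x = np.logspace(0, np.log10(n), 200_000)
    return np.min((x / n) ** alpha + x ** (-2 * r))

r, alpha = 2, 0.4
f1, f2 = min_mise(10**5, r, alpha), min_mise(10**7, r, alpha)
# empirical log-log slope of the minimal MISE as a function of n
slope = np.log(f2 / f1) / np.log(10**7 / 10**5)
```

The computed `slope` is close to $-2r\alpha/(2r+\alpha)$, the rate stated in Remark 9.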

No further justification for the specific choice of $J$ and $q$ was given by Li and Xiao (2007), nor was an optimality result derived. We refer to Li and Xiao (2007) for further discussion.

Beran and Feng (2002a), in contrast to Hall and Hart (1990), considered kernel estimators in semiparametric fractional autoregressive (SEMIFAR) models and derived corresponding results for this case. The class of SEMIFAR models includes Gaussian FARIMA models, which allows us to formulate these results in the form that we use later in the empirical studies.

Before stating the result, we require the following definitions. Let $K(x)$ be a symmetric, nonnegative polynomial kernel,
\[
K(x) = \frac{(2r+1)!}{2^{2r+1}(r!)^2}\,(1-x^2)^r\, I\{x \in [-1,1]\} \tag{3.26}
\]
(see Gasser and Müller 1979). Suppose that we observe time series data of the form (1.2).

For a given bandwidth $b > 0$ and $t \in [0,1]$, the kernel estimate of $g$ is defined by
\[
\hat g(t) = \frac{1}{nb}\sum_{i=1}^{n} K\!\left(\frac{t - t_i}{b}\right) Y_i, \qquad t_i = \frac{i}{n}.
\]
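A sketch of a fixed-design kernel smoother of this generic form (Priestley–Chao-type weighting with $t_i = i/n$ is assumed here; the exact definition in Beran and Feng (2002a) may differ in boundary details):

```python
import numpy as np

def kernel_estimate(y, t, b, kernel):
    """Fixed-design kernel trend estimate at a single point t,
    with design points t_i = i/n and bandwidth b."""
    n = len(y)
    ti = np.arange(1, n + 1) / n
    return np.sum(kernel((t - ti) / b) * y) / (n * b)

def box(x):
    """Box kernel K(x) = 1/2 on [-1, 1] (the r = 0 case)."""
    return np.where(np.abs(x) <= 1, 0.5, 0.0)

# sanity check on a noiseless linear trend: interior estimates match g
n = 2000
ti = np.arange(1, n + 1) / n
y = 2.0 * ti + 1.0
est = kernel_estimate(y, t=0.5, b=0.05, kernel=box)  # close to g(0.5) = 2.0
```

For a symmetric kernel and a linear trend, the interior estimate is exact up to a discretization error of order $1/(nb)$.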

The following results were originally derived by Beran and Feng (2002a).

Theorem 3.6. Let $b_n > 0$ be a sequence of bandwidths with $b_n \to 0$ and $nb_n \to \infty$. Then:

1. The mean integrated squared error in $[\Delta, 1-\Delta]$ satisfies
\[
\int_\Delta^{1-\Delta} E\left\{[\hat g(t) - g(t)]^2\right\} dt = b_n^4\, \frac{\mu_2^2(K)}{4} \left(\int_\Delta^{1-\Delta} \big(g^{(2)}(x)\big)^2 dx\right) + (nb_n)^{2d-1} C_f\, \beta(d, K) + o\!\left(\max\left\{b_n^4, (nb_n)^{2d-1}\right\}\right),
\]
where $\mu_2(K) = \int x^2 K(x)\,dx$ and $\beta(d, K)$ is a constant depending only on $d$ and $K$.

2. The bandwidth that minimizes the asymptotic MISE is given by
\[
b_{opt} = n^{\frac{2d-1}{5-2d}} \left[\frac{(1-2d)\, C_f\, \beta(d, K)}{\mu_2^2(K) \int_\Delta^{1-\Delta}\big(g^{(2)}(x)\big)^2 dx}\right]^{1/(5-2d)}.
\]

Note that for $r = 0$ we obtain the box kernel, and this result takes the following form:

Theorem 3.7. Let $K(x) = \frac{1}{2} I\{x \in [-1,1]\}$ and define
\[
\beta(d) = \frac{2^{2d}\,\Gamma(1-2d)\sin(\pi d)}{d(2d+1)}.
\]
Then, under the assumptions of Theorem 3.6, we have:

1. The mean integrated squared error in $[\Delta, 1-\Delta]$ satisfies
\[
\int_\Delta^{1-\Delta} E\left\{[\hat g(t) - g(t)]^2\right\} dt = b_n^4\, \frac{1}{36}\left(\int_\Delta^{1-\Delta}\big(g^{(2)}(x)\big)^2 dx\right) + (nb_n)^{2d-1} C_f\, \beta(d) + o\!\left(\max\left\{b_n^4, (nb_n)^{2d-1}\right\}\right).
\]

2. The bandwidth that minimizes the asymptotic MISE is given by
\[
b_{opt} = n^{\frac{2d-1}{5-2d}} \left[\frac{9(1-2d)\, C_f\, \beta(d)}{\int_\Delta^{1-\Delta}\big(g^{(2)}(x)\big)^2 dx}\right]^{1/(5-2d)}.
\]
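Theorem 3.7 yields a plug-in bandwidth once $C_f$ and the curvature integral are supplied. A sketch that also verifies numerically that the stated $b_{opt}$ minimizes the two-term MISE expression (function names and the sample parameter values are ours):

```python
import math
import numpy as np

def beta_lrd(d):
    """beta(d) from Theorem 3.7, for 0 < d < 1/2."""
    return (2.0 ** (2 * d) * math.gamma(1 - 2 * d) * math.sin(math.pi * d)
            / (d * (2 * d + 1)))

def b_opt(n, d, cf, curv):
    """MISE-optimal bandwidth for the box kernel; curv is the integral of
    (g'')^2 over [Delta, 1 - Delta]. The factor 9 comes from minimizing
    b^4 * curv / 36 + (n b)^(2d-1) * cf * beta(d) in closed form."""
    c = 9.0 * (1 - 2 * d) * cf * beta_lrd(d) / curv
    return n ** ((2 * d - 1) / (5 - 2 * d)) * c ** (1.0 / (5 - 2 * d))

# check against a brute-force minimization of the asymptotic MISE
n, d, cf, curv = 10**5, 0.2, 1.0, 4.0
b = np.logspace(-4, 0, 400_000)
mise = b ** 4 * curv / 36.0 + (n * b) ** (2 * d - 1) * cf * beta_lrd(d)
b_star = b[np.argmin(mise)]
```

The grid minimizer `b_star` agrees with the closed-form `b_opt` up to grid resolution; note also that $\beta(d) \to \pi$ as $d \to 0$.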

Note that $K$, defined in (3.26), is only a second-order kernel. Similar results can be obtained for kernel estimates with higher-order kernels; this, however, is beyond the scope of the present thesis.

Asymptotically optimal wavelet estimation of trend functions

In this chapter, we consider hard thresholding wavelet trend estimation for data of the form (1.2). In Theorem 4.1 we establish rate optimality; this result alone is of limited practical use, since the estimator is not explicitly defined.

In order to apply the result to observed data, an optimal parameter choice needs to be derived. This question is addressed in Theorems 4.2 and 4.3 below. The presentation in this chapter is fairly detailed; the main results correspond to Beran and Shumeyko (2011a).

Specifically, this chapter is organized as follows. Basic definitions and notation are introduced in section 4.1, and the main results are given in section 4.2. The results are illustrated in section 4.3 by tables and figures obtained from a short simulation study. To examine the behavior of data-adaptive wavelet estimation as outlined below, we carried out a simulation study with different test functions $g$ and a Gaussian FARIMA(0, d, 0) residual process. We also compared hard thresholding wavelet estimators with minimax soft thresholding wavelet estimators and kernel estimators. Proofs are given in section 4.4, which concludes the chapter.

4.1 Notations

Throughout this chapter, we make the following assumptions. Suppose that we observe time series data of the form (1.2), where the errors form a zero-mean, second-order stationary Gaussian process with long-range dependence. The long-range dependence will be characterized by (1.1).

Let $\phi(t)$ and $\psi(t)$ be the father and mother wavelets, respectively, with compact support $[0, N]$ for some $N \in \mathbb{N}$, such that
\[
\psi(0) = \psi(N) = 0 \tag{4.1}
\]
and
\[
\int_0^N \phi(t)\,dt = \int_0^N \phi^2(t)\,dt = \int_0^N \psi^2(t)\,dt = 1. \tag{4.2}
\]
Note that, for the sake of generality, the support of $\phi$ and $\psi$ is chosen to be $[0, N]$ instead of $[0,1]$. In this way, it is possible to choose from a larger variety of wavelet generating functions satisfying (4.2) (see, e.g., Daubechies 1992, Cohen et al. 1993). As before, $m_\psi \in \mathbb{N}$ will denote the number of vanishing moments of $\psi$, and $\nu_{m_\psi}$ its $m_\psi$th moment, as defined by (3.22) and (3.23) respectively.

For every trend function $g \in L^2([0,1])$, every decomposition level $J \ge 0$, smoothing parameter $q$ and thresholds $\delta_j$, we define the hard thresholding wavelet estimator by (3.18).
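To make the mechanics concrete, here is a minimal Haar-based hard thresholding smoother (an illustration of the thresholding idea only, not of the exact estimator (3.18) with its smoothing-parameter structure; all names are ours):

```python
import numpy as np

def haar_decompose(x):
    """Orthonormal Haar pyramid: coarsest approximation + detail levels
    (ordered fine to coarse). len(x) must be a power of two."""
    a, details = np.asarray(x, dtype=float), []
    while len(a) > 1:
        even, odd = a[0::2], a[1::2]
        details.append((even - odd) / np.sqrt(2.0))
        a = (even + odd) / np.sqrt(2.0)
    return a, details

def haar_reconstruct(a, details):
    """Invert haar_decompose."""
    for d in reversed(details):
        out = np.empty(2 * len(a))
        out[0::2] = (a + d) / np.sqrt(2.0)
        out[1::2] = (a - d) / np.sqrt(2.0)
        a = out
    return a

def hard_threshold_estimate(y, delta):
    """Hard thresholding: keep a detail coefficient only if its
    magnitude exceeds delta, then reconstruct."""
    a, details = haar_decompose(y)
    kept = [np.where(np.abs(d) > delta, d, 0.0) for d in details]
    return haar_reconstruct(a, kept)

# a dyadic piecewise-constant trend is reproduced exactly:
g = np.repeat([0.0, 1.0], 32)
ghat = hard_threshold_estimate(g, delta=0.5)
```

With noiseless data, the single large detail coefficient carrying the jump survives the threshold and the piecewise-constant trend is recovered exactly; with noisy data, the small coefficients that are set to zero carry mostly noise.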