

3.8 Trend estimation via wavelet shrinkage

3.8.2 Hard thresholding wavelet trend estimation

Li and Xiao (2007) considered the nonparametric regression model (1.2) with long-memory errors that are not necessarily Gaussian, and provided an asymptotic expansion for the MISE of hard thresholding wavelet estimators. They derived the MISE expansion for piecewise smooth underlying mean regression functions.

It turns out that the MISE convergence rate coincides with that of the analogous minimax expansion (see, e.g., Wang 1996). Throughout this thesis, $m_\psi \in \mathbb{N}$ will denote the number of vanishing moments of $\psi$, i.e.

\[
\int_0^N t^k \psi(t)\,dt = 0, \qquad k = 0, 1, \ldots, m_\psi - 1, \tag{3.22}
\]
and
\[
\int_0^N t^{m_\psi} \psi(t)\,dt = \nu_{m_\psi} \neq 0. \tag{3.23}
\]
We assume that both $\phi$ and $\psi$ satisfy a uniform Hölder condition of exponent $1/2$ and that the following conditions are satisfied:
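Conditions (3.22) and (3.23) can be checked numerically for a concrete wavelet. A minimal sketch for the Haar mother wavelet (so $N = 1$), for which $m_\psi = 1$ and $\nu_1 = -1/4$; the function names are ours:

```python
import numpy as np

def haar_psi(t):
    """Haar mother wavelet: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    t = np.asarray(t)
    return np.where((t >= 0) & (t < 0.5), 1.0,
                    np.where((t >= 0.5) & (t < 1.0), -1.0, 0.0))

# midpoint rule on [0, 1]
n = 1_000_000
t = (np.arange(n) + 0.5) / n

m0 = np.mean(haar_psi(t))       # 0th moment: vanishes, so m_psi >= 1
m1 = np.mean(t * haar_psi(t))   # 1st moment: nu_1 = -1/4 != 0, so m_psi = 1
```

Here `m0` is numerically zero while `m1` equals $-1/4$, confirming (3.22) and (3.23) with $m_\psi = 1$.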

The errors $\xi_i$ in (1.2) are given by $\xi_i = G(\epsilon_i)$, where $(\epsilon_i)$ is a zero-mean, second-order stationary Gaussian process with long-range dependence and $G$ has Hermite rank $m$. Here, long-range dependence is characterized by (1.1).

The smoothing parameter $q$, the decomposition level $J$ and the thresholds $\delta_j$ are functions of $n$. We assume that $J \to \infty$, $q \to \infty$ as $n \to \infty$, and that for every $j = 1, 2, \ldots, q-1$,
\[
2^{J+j}\delta_j^2 \to 0, \qquad 2^{(J+j)(2r+1)}\delta_j^2 \to \infty, \qquad \delta_j^2 \ge \frac{(4e)^m C_1^2 N^{1+m\alpha}(\ln n)^{m+1}}{m^m\, n^{m\alpha}\, 2^{(J+j)(1-m\alpha)}}, \tag{3.24}
\]
where
\[
C_1^2 = \frac{C_\gamma J^2(m)}{m!} \int_0^N\!\!\int_0^N |x-y|^{-m\alpha}\, \psi(x)\psi(y)\,dx\,dy
\]
with $J(m) = E(G(Z)H_m(Z))$, where $Z$ is Gaussian with mean $0$ and variance $1$, and $H_m$ is the Hermite polynomial of order $m$. We additionally assume that $0 < m\alpha < 1$.
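The quantity $J(m)$, and with it the Hermite rank, can be computed by Gauss–Hermite quadrature. A sketch (the function name and quadrature degree are our choices):

```python
import numpy as np

def J_coeff(G, m, deg=60):
    """J(m) = E[G(Z) H_m(Z)] for standard normal Z, computed with
    Gauss-Hermite quadrature in the probabilists' convention (He_m).
    The Hermite rank of G is the smallest m with J(m) != 0."""
    x, w = np.polynomial.hermite_e.hermegauss(deg)  # weight exp(-x^2/2)
    w = w / np.sqrt(2.0 * np.pi)                    # normalize to N(0,1)
    He_m = np.polynomial.hermite_e.HermiteE([0.0] * m + [1.0])
    return float(np.sum(w * G(x) * He_m(x)))
```

For example, for $G(x) = x^2$ this gives $J(1) = 0$ and $J(2) = E[Z^2(Z^2-1)] = 2$, so the Hermite rank is $2$; the quadrature is exact for polynomial $G$.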

Under these conditions they derived the following.

Theorem 3.4. Assume, in addition to the conditions stated above, that the $r$th derivative $g^{(r)}$ is continuous and bounded on $[0,1]$, and let $m_\psi = r$. Then, as $n \to \infty$,
\[
E\left[\|g-\hat g\|^2_{L^2}\right] = C_2^2\,\big(n^{-1}N2^J\big)^{m\alpha} + (N2^J)^{-2r}\,(r!)^{-2}\,\nu_r^2\,\big(1-2^{-2r}\big)^{-1}\int_0^1 \big(g^{(r)}(t)\big)^2\,dt, \tag{3.25}
\]
where $C_2$ is a positive and finite constant defined by
\[
C_2^2 = C_\gamma\,\frac{J^2(m)}{m!}\int_0^N\!\!\int_0^N |x-y|^{-m\alpha}\,\phi(x)\phi(y)\,dx\,dy.
\]

In this theorem, they assumed that the mean regression function $g$ is $r$ times continuously differentiable. However, this result can be generalized. In addition to the last theorem, Li and Xiao (2007) showed that if $g^{(r)}$ is only piecewise continuous, the result still holds, as given in the following:

Theorem 3.5. Assume, in addition to the conditions stated above, that $g^{(r)}$ exists on $[0,1]$ except for at most a finite number of points and, where it exists, is piecewise continuous and bounded. Furthermore, assume that $\mathrm{supp}(g^{(r)})$ has positive Lebesgue measure and $m_\psi = r$. In particular, $g$ itself may be only piecewise continuous. Also, assume that $2^{-(2r+m\alpha)J}\,n^{2rm\alpha} \to \infty$. Then (3.25) still holds.

Furthermore, this implies the following.

Remark 9. For the special case of Gaussian errors, the Hermite rank of $G$ is $1$ (i.e., $m = 1$). In this case, if the decomposition level $J$ is chosen of size $\frac{\alpha}{2r+\alpha}\log_2 n$, then the convergence rate of the MISE is $n^{-\frac{2r\alpha}{2r+\alpha}}$, which is the same as in Wang (1996). Moreover, Hall and Hart (1990) gave a similar asymptotic expansion of the MISE for kernel estimators in fixed-design nonparametric regression when the errors form a Gaussian long-memory process and the trend is smooth.
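The rate in Remark 9 reflects the variance–bias tradeoff in (3.25). A sketch that recovers the exponent $-2r\alpha/(2r+\alpha)$ numerically for $m = 1$, with all constants set to one (a simplification of (3.25), not the theorem itself):

```python
import numpy as np

def min_mise(n, r, alpha):
    """Numerically minimize (x/n)^alpha + x^(-2r) over x = N*2^J > 0,
    i.e. the variance and bias terms of (3.25) with all constants set to 1."""
    x = np.logspace(0, np.log10(n), 200_000)
    return np.min((x / n) ** alpha + x ** (-2 * r))

r, alpha = 2, 0.4
f1, f2 = min_mise(10**5, r, alpha), min_mise(10**7, r, alpha)
# empirical log-log slope of the minimal MISE as a function of n
slope = np.log(f2 / f1) / np.log(10**7 / 10**5)
```

The computed `slope` is close to $-2r\alpha/(2r+\alpha)$, the rate stated in Remark 9.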

No further justification for the specific choice of $J$ and $q$ was given by Li and Xiao (2007), nor was an optimality result derived. We refer to Li and Xiao (2007) for further discussion.

Beran and Feng (2002a), in contrast to Hall and Hart (1990), considered kernel estimators in semiparametric fractional autoregressive (SEMIFAR) models and derived corresponding results for this case. The class of SEMIFAR models includes Gaussian FARIMA models, which allows us to formulate these results in the form that we use later in the empirical studies.

Before stating the result, we require the following definitions. Let $K(x)$ be a symmetric, nonnegative polynomial kernel,
\[
K(x) = \frac{(2r+1)!}{2^{2r+1}(r!)^2}\,(1-x^2)^r\, I\{x \in [-1,1]\} \tag{3.26}
\]
(see Gasser and Müller 1979). Suppose that we observe time series data of the form (1.2).

For a given bandwidth $b > 0$ and $t \in [0,1]$, the kernel estimate of $g$ is defined by
\[
\hat g(t) = \frac{1}{nb}\sum_{i=1}^{n} K\!\left(\frac{t - t_i}{b}\right) Y_i, \qquad t_i = \frac{i}{n}.
\]
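A sketch of a fixed-design kernel smoother of this generic form (Priestley–Chao-type weighting with $t_i = i/n$ is assumed here; the exact definition in Beran and Feng (2002a) may differ in boundary details):

```python
import numpy as np

def kernel_estimate(y, t, b, kernel):
    """Fixed-design kernel trend estimate at a single point t,
    with design points t_i = i/n and bandwidth b."""
    n = len(y)
    ti = np.arange(1, n + 1) / n
    return np.sum(kernel((t - ti) / b) * y) / (n * b)

def box(x):
    """Box kernel K(x) = 1/2 on [-1, 1] (the r = 0 case)."""
    return np.where(np.abs(x) <= 1, 0.5, 0.0)

# sanity check on a noiseless linear trend: interior estimates match g
n = 2000
ti = np.arange(1, n + 1) / n
y = 2.0 * ti + 1.0
est = kernel_estimate(y, t=0.5, b=0.05, kernel=box)  # close to g(0.5) = 2.0
```

For a symmetric kernel and a linear trend, the interior estimate is exact up to a discretization error of order $1/(nb)$.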

The following results were originally derived by Beran and Feng (2002a).

Theorem 3.6. Let $b_n > 0$ be a sequence of bandwidths with $b_n \to 0$ and $nb_n \to \infty$. Then:

1. The mean integrated squared error in $[\Delta, 1-\Delta]$ satisfies
\[
\int_\Delta^{1-\Delta} E\left\{[\hat g(t) - g(t)]^2\right\} dt = b_n^4\, \frac{\mu_2^2(K)}{4} \left(\int_\Delta^{1-\Delta} \big(g^{(2)}(x)\big)^2 dx\right) + (nb_n)^{2d-1} C_f\, \beta(d, K) + o\!\left(\max\left\{b_n^4, (nb_n)^{2d-1}\right\}\right),
\]
where $\mu_2(K) = \int x^2 K(x)\,dx$ and $\beta(d, K)$ is a constant depending only on $d$ and $K$.

2. The bandwidth that minimizes the asymptotic MISE is given by
\[
b_{opt} = n^{\frac{2d-1}{5-2d}} \left[\frac{(1-2d)\, C_f\, \beta(d, K)}{\mu_2^2(K) \int_\Delta^{1-\Delta}\big(g^{(2)}(x)\big)^2 dx}\right]^{1/(5-2d)}.
\]

Note that for $r = 0$ we obtain the box kernel, and this result takes the following form:

Theorem 3.7. Let $K(x) = \frac{1}{2} I\{x \in [-1,1]\}$ and define
\[
\beta(d) = \frac{2^{2d}\,\Gamma(1-2d)\sin(\pi d)}{d(2d+1)}.
\]
Then, under the assumptions of Theorem 3.6, we have:

1. The mean integrated squared error in $[\Delta, 1-\Delta]$ satisfies
\[
\int_\Delta^{1-\Delta} E\left\{[\hat g(t) - g(t)]^2\right\} dt = b_n^4\, \frac{1}{36}\left(\int_\Delta^{1-\Delta}\big(g^{(2)}(x)\big)^2 dx\right) + (nb_n)^{2d-1} C_f\, \beta(d) + o\!\left(\max\left\{b_n^4, (nb_n)^{2d-1}\right\}\right).
\]

2. The bandwidth that minimizes the asymptotic MISE is given by
\[
b_{opt} = n^{\frac{2d-1}{5-2d}} \left[\frac{9(1-2d)\, C_f\, \beta(d)}{\int_\Delta^{1-\Delta}\big(g^{(2)}(x)\big)^2 dx}\right]^{1/(5-2d)}.
\]
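Theorem 3.7 yields a plug-in bandwidth once $C_f$ and the curvature integral are supplied. A sketch that also verifies numerically that the stated $b_{opt}$ minimizes the two-term MISE expression (function names and the sample parameter values are ours):

```python
import math
import numpy as np

def beta_lrd(d):
    """beta(d) from Theorem 3.7, for 0 < d < 1/2."""
    return (2.0 ** (2 * d) * math.gamma(1 - 2 * d) * math.sin(math.pi * d)
            / (d * (2 * d + 1)))

def b_opt(n, d, cf, curv):
    """MISE-optimal bandwidth for the box kernel; curv is the integral of
    (g'')^2 over [Delta, 1 - Delta]. The factor 9 comes from minimizing
    b^4 * curv / 36 + (n b)^(2d-1) * cf * beta(d) in closed form."""
    c = 9.0 * (1 - 2 * d) * cf * beta_lrd(d) / curv
    return n ** ((2 * d - 1) / (5 - 2 * d)) * c ** (1.0 / (5 - 2 * d))

# check against a brute-force minimization of the asymptotic MISE
n, d, cf, curv = 10**5, 0.2, 1.0, 4.0
b = np.logspace(-4, 0, 400_000)
mise = b ** 4 * curv / 36.0 + (n * b) ** (2 * d - 1) * cf * beta_lrd(d)
b_star = b[np.argmin(mise)]
```

The grid minimizer `b_star` agrees with the closed-form `b_opt` up to grid resolution; note also that $\beta(d) \to \pi$ as $d \to 0$.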

Note that $K$, defined in (3.26), is only a second-order kernel. Similar results can be obtained for kernel estimates with higher-order kernels; this, however, is beyond the scope of the present thesis.

Asymptotically optimal wavelet estimation of trend functions

In this chapter, we consider hard thresholding wavelet trend estimation for data of the form (1.2). In Theorem 4.1 we establish rate optimality; this result alone is of limited practical use, since the estimator is not explicitly defined.

In order to apply the result to observed data, an optimal parameter choice needs to be derived. This question is addressed in Theorems 4.2 and 4.3 below. The presentation in this chapter is fairly detailed; the main results correspond to Beran and Shumeyko (2011a).

Specifically, this chapter is organized as follows. Basic definitions and notation are introduced in section 4.1, and the main results are given in section 4.2. The results are illustrated in section 4.3 by tables and figures obtained from a short simulation study. To examine the behavior of data-adaptive wavelet estimation as outlined below, we carried out a simulation study with different test functions $g$ and a Gaussian FARIMA(0, d, 0) residual process. We also compared hard thresholding wavelet estimators with minimax soft thresholding wavelet estimators and kernel estimators. Proofs are given in section 4.4, which concludes the chapter.

4.1 Notations

Throughout this chapter, we make the following assumptions. Suppose that we observe time series data of the form (1.2), where the errors form a zero-mean, second-order stationary Gaussian process with long-range dependence. The long-range dependence will be characterized by (1.1).

Let $\phi(t)$ and $\psi(t)$ be the father and mother wavelets, respectively, with compact support $[0, N]$ for some $N \in \mathbb{N}$, such that
\[
\psi(0) = \psi(N) = 0 \tag{4.1}
\]
and
\[
\int_0^N \phi(t)\,dt = \int_0^N \phi^2(t)\,dt = \int_0^N \psi^2(t)\,dt = 1. \tag{4.2}
\]
Note that, for the sake of generality, the support of $\phi$ and $\psi$ is chosen to be $[0, N]$ instead of $[0,1]$. In this way, it is possible to choose from a larger variety of wavelet generating functions satisfying (4.2) (see, e.g., Daubechies 1992, Cohen et al. 1993). As before, $m_\psi \in \mathbb{N}$ will denote the number of vanishing moments of $\psi$, and $\nu_{m_\psi}$ its $m_\psi$th moment, as defined by (3.22) and (3.23) respectively.

For every trend function $g \in L^2([0,1])$, every decomposition level $J \ge 0$, smoothing parameter $q$ and thresholds $\delta_j$, we define the hard thresholding wavelet estimator by (3.18).
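To make the mechanics concrete, here is a minimal Haar-based hard thresholding smoother (an illustration of the thresholding idea only, not of the exact estimator (3.18) with its smoothing-parameter structure; all names are ours):

```python
import numpy as np

def haar_decompose(x):
    """Orthonormal Haar pyramid: coarsest approximation + detail levels
    (ordered fine to coarse). len(x) must be a power of two."""
    a, details = np.asarray(x, dtype=float), []
    while len(a) > 1:
        even, odd = a[0::2], a[1::2]
        details.append((even - odd) / np.sqrt(2.0))
        a = (even + odd) / np.sqrt(2.0)
    return a, details

def haar_reconstruct(a, details):
    """Invert haar_decompose."""
    for d in reversed(details):
        out = np.empty(2 * len(a))
        out[0::2] = (a + d) / np.sqrt(2.0)
        out[1::2] = (a - d) / np.sqrt(2.0)
        a = out
    return a

def hard_threshold_estimate(y, delta):
    """Hard thresholding: keep a detail coefficient only if its
    magnitude exceeds delta, then reconstruct."""
    a, details = haar_decompose(y)
    kept = [np.where(np.abs(d) > delta, d, 0.0) for d in details]
    return haar_reconstruct(a, kept)

# a dyadic piecewise-constant trend is reproduced exactly:
g = np.repeat([0.0, 1.0], 32)
ghat = hard_threshold_estimate(g, delta=0.5)
```

With noiseless data, the single large detail coefficient carrying the jump survives the threshold and the piecewise-constant trend is recovered exactly; with noisy data, the small coefficients that are set to zero carry mostly noise.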