which gives the desired result.
Remark 5.1.11. Theorem 5.1.9 and theorem 5.1.10 generalise [BW11, theorem 1, p. 318].
Finally, note that consistency also carries over to non-continuous piecewise polynomials, with only minor changes to the proof.
5.2 Asymptotic distribution
5.2.1. We now turn to the asymptotic distribution of the least squares estimator for continuous piecewise polynomials. First, note that the partial derivatives $\partial_t\mu(t,\tau)$ and $\partial_{\tau_i}\mu(t,\tau)$, $\tau\in\mathbb{R}^\nu\times S^l$, $i=\nu+1,\dots,\nu+l$, may not exist in the classical sense but only in the sense of absolutely continuous functions. Hence, the operators $\partial_t$ and $\partial_{\tau_i}$ are always understood in the sense of absolutely continuous functions. In contrast, the corresponding left and right derivatives $\partial_{t\pm}\mu(t,\tau)$ and $\partial_{\tau_i\pm}\mu(t,\tau)$, $\tau\in\mathbb{R}^\nu\times S^l$, $i=\nu+1,\dots,\nu+l$, always exist.
For brevity, left and right partial derivatives with respect to the $\tau_i$ will be denoted by $\partial_{i+}$ and $\partial_{i-}$. Partial derivatives in the sense of an absolutely continuous function will be denoted by $\partial_i$. Note that $\partial_{i+}\mu(t,\theta)=\partial_{i-}\mu(t,\theta)=\partial_i\mu(t,\theta)$ for almost all $t\in[0,1]$.
Recall that the left and right derivatives share some properties with the classical derivative. In particular, they are linear operators and the product rule applies. More precisely, let $J\subset\mathbb{R}$ be an open interval and let $f,g:J\to\mathbb{R}$ be continuous functions such that $\partial_{t\pm}f(t)$ and $\partial_{t\pm}g(t)$ exist for all $t\in J$. Then
$$\partial_{t+}(f+g)(t)=\lim_{h\downarrow 0}\frac{f(t+h)+g(t+h)-f(t)-g(t)}{h}=\partial_{t+}f(t)+\partial_{t+}g(t)$$
and
$$\begin{aligned}
\partial_{t+}(f\cdot g)(t)&=\lim_{h\downarrow 0}\frac{f(t+h)\cdot g(t+h)-f(t)\cdot g(t)}{h}\\
&=\lim_{h\downarrow 0}\frac{f(t+h)\cdot g(t+h)-f(t+h)\cdot g(t)+f(t+h)\cdot g(t)-f(t)\cdot g(t)}{h}\\
&=\lim_{h\downarrow 0}f(t+h)\,\frac{g(t+h)-g(t)}{h}+\lim_{h\downarrow 0}g(t)\,\frac{f(t+h)-f(t)}{h}\\
&=f(t)\cdot\partial_{t+}g(t)+g(t)\cdot\partial_{t+}f(t).
\end{aligned}$$
The analogous results for $\partial_{t-}$ are obtained in a similar manner.
As an analogue of stationary points, we have the following relation: let $t_0\in J$ be such that $f$ has a local minimum in $t_0$. Then
$$\partial_{t-}f\big|_{t=t_0}=\lim_{s\uparrow t_0}\frac{f(s)-f(t_0)}{s-t_0}\le 0
\qquad\text{and}\qquad
\partial_{t+}f\big|_{t=t_0}=\lim_{s\downarrow t_0}\frac{f(s)-f(t_0)}{s-t_0}\ge 0.$$
Hence, there exists $c\in[0,1]$ such that
$$c\cdot\partial_{t-}f\big|_{t=t_0}+(1-c)\cdot\partial_{t+}f\big|_{t=t_0}=0.$$
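This relation can be illustrated numerically; a minimal sketch, assuming the toy function $f(t)=|t|$, which has a local minimum at $t_0=0$ where only one-sided derivatives exist:

```python
def left_derivative(f, t0, h=1e-6):
    # one-sided difference quotient from the left of t0
    return (f(t0 - h) - f(t0)) / (-h)

def right_derivative(f, t0, h=1e-6):
    # one-sided difference quotient from the right of t0
    return (f(t0 + h) - f(t0)) / h

f = abs  # f(t) = |t|: local minimum at 0, no classical derivative there
d_minus = left_derivative(f, 0.0)   # -1.0 <= 0
d_plus = right_derivative(f, 0.0)   # +1.0 >= 0

# with c = 1/2, the convex combination of one-sided derivatives vanishes
c = 0.5
print(c * d_minus + (1 - c) * d_plus)  # -> 0.0
```

For $f(t)=|t|$ the difference quotients are exact for any $h>0$, so no tolerance is needed here.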
One should note that there is a vast amount of literature on generalised directional derivatives in the context of optimisation and implicit functions, see e.g. [Cla76, Pou77].
5.2.2. Defining the n×(ν+l) matrices
Qn±= [∂i±µ(t/n)]t=1,...,n;i=1,...,ν+l∈Rn×(ν+l), (5.3) we find by (4.5), that
n→∞lim n−1(Q0n+WnQn+)ik = Z 1
0
µ(i)(s, θ)w(s)µ(k)(s, θ) ds. (5.4) In analogy to M0nWnMn, the matrix Q0n+WnQn+ has full rank if n is large enough so that
Λw = lim
n n Q0n+WnQn+−1
(5.5) is well defined. The essential step to derive the asymptotic distribution is to show that
n
D(n) θˆ−θ
is asymptotically equivalent to D(n)n (Q0n+WnQn+)−1Q0n+Wnen, see theorem 5.2.3 and 5.2.4. Once this asymptotic equivalence has been established, the asymptotic distribution of the (weighted) least squares estimator can easily be derived by virtue of section 3.2, see theorem 5.2.8 and theorem 5.2.9
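The convergence (5.4) is an ordinary Riemann-sum limit. As a quick numerical sanity check in a toy case (hypothetical choices: $w\equiv1$, $\mu^{(i)}(s)=s$, $\mu^{(k)}(s)=s^2$, so the limiting integral is $\int_0^1 s^3\,ds=1/4$):

```python
def gram_entry(n, di, dk, w):
    # (1/n) * sum_t di(t/n) * w(t/n) * dk(t/n):
    # the (i,k) entry of n^{-1} Q'_{n+} W_n Q_{n+} as a Riemann sum
    return sum(di(t / n) * w(t / n) * dk(t / n) for t in range(1, n + 1)) / n

# toy case: w = 1, derivatives s and s^2; the limit is 1/4
approx = gram_entry(10_000, lambda s: s, lambda s: s * s, lambda s: 1.0)
print(approx)  # close to 0.25 for large n
```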
Theorem 5.2.3. Let $\mu$ be an identified, continuous piecewise polynomial, let $x(t)=\mu\!\left(\tfrac{t}{n}\right)+\xi(t)$ with $\xi(\cdot)$ as in theorem 5.1.9, and let $w$ be a weight function satisfying the conditions of definition 4.2.1. Denote $r(n)=\tfrac{n}{D_\xi(n)}$. Assume that either
$$s\mapsto\mu(s,\theta)\in C^{(1)}([0,1])\quad\forall\,\theta\in\Theta$$
or
$$s\mapsto\mu(s,\theta)\in C([0,1])\quad\forall\,\theta\in\Theta\ \text{and $\xi(\cdot)$ is strictly stationary and ergodic.}$$
Then, for any $\Delta>0$,
$$P\Big(r(n)\,\big|\hat\theta_n^w-\theta-(Q_{n+}'W_nQ_{n+})^{-1}Q_{n+}'W_ne_n\big|>\Delta\Big)=o(1)\qquad(5.6)$$
as $n\to\infty$.
Theorem 5.2.4. Let $\mu$ be an identified, continuous piecewise polynomial and let $x(t)=\mu\!\left(\tfrac{t}{n}\right)+\xi(t)$ with $\xi(\cdot)$ as in theorem 5.1.10. Denote $r(n)=\tfrac{n}{D_\xi(n)}$. Assume that $s\mapsto\mu(s,\theta)\in C([0,1])$ for all $\theta\in\Theta$. Then, for any $\Delta>0$,
$$P\Big(r(n)\,\big|\hat\theta_n-\theta-(Q_{n+}'Q_{n+})^{-1}Q_{n+}'e_n\big|>\Delta\Big)=o(1)\qquad(5.7)$$
as $n\to\infty$.
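As a numerical illustration of the unweighted estimator of theorem 5.2.4, the following sketch fits a continuous broken line $\mu(s,\theta)=a_1+a_2s+a_3(s-\eta)_+$ by profiling the knot over a grid and solving the normal equations for the linear coefficients. All parameter values and the grid are hypothetical choices, not taken from the text.

```python
import random

def solve3(A, c):
    # Gaussian elimination with partial pivoting for a 3x3 system A x = c
    M = [row[:] + [ci] for row, ci in zip(A, c)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, 3):
            f = M[r][col] / M[col][col]
            for j in range(col, 4):
                M[r][j] -= f * M[col][j]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        x[r] = (M[r][3] - sum(M[r][j] * x[j] for j in range(r + 1, 3))) / M[r][r]
    return x

def fit_broken_line(x, knot_grid):
    # profile least squares: for each candidate knot eta, regress x(t) on
    # (1, t/n, (t/n - eta)_+) and keep the knot with the smallest RSS
    n = len(x)
    best = None
    for eta in knot_grid:
        cols = [[1.0] * n,
                [t / n for t in range(1, n + 1)],
                [max(t / n - eta, 0.0) for t in range(1, n + 1)]]
        A = [[sum(a * b for a, b in zip(cols[i], cols[j])) for j in range(3)]
             for i in range(3)]
        c = [sum(a * xi for a, xi in zip(cols[i], x)) for i in range(3)]
        beta = solve3(A, c)
        rss = sum((xi - sum(beta[i] * cols[i][t] for i in range(3))) ** 2
                  for t, xi in enumerate(x))
        if best is None or rss < best[0]:
            best = (rss, eta, beta)
    return best[1], best[2]

# simulated data: a1 = 1, a2 = 2, a3 = -3, knot 0.4, i.i.d. Gaussian noise
random.seed(1)
n, eta_true = 400, 0.4
x = [1.0 + 2.0 * (t / n) - 3.0 * max(t / n - eta_true, 0.0)
     + random.gauss(0.0, 0.05) for t in range(1, n + 1)]
eta_hat, coef = fit_broken_line(x, [k / 100 for k in range(5, 96)])
print(eta_hat)  # should land near the true knot 0.4
```

The grid search stands in for the exact minimisation of $L_n(\tau)$; with i.i.d. noise this is only the short-memory special case of the theorem.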
Proof of theorem 5.2.3 and theorem 5.2.4. We only prove (5.6) explicitly, as (5.7) follows by virtually the same argument. For what follows, set $\hat u=\hat\theta_n^w-\theta$. In analogy to the left and right derivative, we define the right and left gradient by
$$\nabla_+=\big(\partial_{1+},\dots,\partial_{(\nu+l)+}\big)'\qquad\text{and}\qquad\nabla_-=\big(\partial_{1-},\dots,\partial_{(\nu+l)-}\big)'.$$
Since $\theta+\hat u$ minimises $L_n^w(\tau)$, there exist $c_i\in[0,1]$ such that
$$c_i\cdot\partial_{i+}L_n^w(\theta+\hat u)+(1-c_i)\cdot\partial_{i-}L_n^w(\theta+\hat u)=0\qquad(i=1,\dots,\nu+l).\qquad(5.8)$$
To state matters more concisely, we introduce the following notation: for $c=(c_1,\dots,c_{\nu+l})$ and $u\in\mathbb{R}^{\nu+l}$ define $cu$ and $(1-c)u$ by
$$cu=(c_1\cdot u_1,\dots,c_{\nu+l}\cdot u_{\nu+l})'\qquad\text{and}\qquad(1-c)u=\big((1-c_1)\cdot u_1,\dots,(1-c_{\nu+l})\cdot u_{\nu+l}\big)'.$$
Likewise, for $U\in\mathbb{R}^{(\nu+l)\times(\nu+l)}$ define
$$cU=\begin{pmatrix}
c_1u_{11} & c_1u_{12} & \cdots & c_1u_{1(\nu+l)}\\
c_2u_{21} & c_2u_{22} & \cdots & c_2u_{2(\nu+l)}\\
\vdots & \vdots & & \vdots\\
c_{\nu+l}u_{(\nu+l)1} & c_{\nu+l}u_{(\nu+l)2} & \cdots & c_{\nu+l}u_{(\nu+l)(\nu+l)}
\end{pmatrix}$$
and
$$(1-c)U=\begin{pmatrix}
(1-c_1)u_{11} & (1-c_1)u_{12} & \cdots & (1-c_1)u_{1(\nu+l)}\\
(1-c_2)u_{21} & (1-c_2)u_{22} & \cdots & (1-c_2)u_{2(\nu+l)}\\
\vdots & \vdots & & \vdots\\
(1-c_{\nu+l})u_{(\nu+l)1} & (1-c_{\nu+l})u_{(\nu+l)2} & \cdots & (1-c_{\nu+l})u_{(\nu+l)(\nu+l)}
\end{pmatrix}.$$
Note that this notation is compatible with the standard matrix-vector operations, i.e.
$$c[U+V]=cU+cV,\qquad c(Uv)=(cU)v$$
for $U,V\in\mathbb{R}^{(\nu+l)\times(\nu+l)}$ and $v\in\mathbb{R}^{\nu+l}$. Moreover, since $c_i\in[0,1]$,
$$\|cUv\|_p\le\|Uv\|_p$$
for any $p$-norm on $\mathbb{R}^{\nu+l}$. In particular, $\|cU\|\le\|U\|$ for any induced matrix norm $\|\cdot\|$.
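The inequality $\|cUv\|_p\le\|Uv\|_p$ is purely componentwise, since $(cUv)_i=c_i(Uv)_i$ with $c_i\in[0,1]$. A small numerical check with hypothetical values:

```python
def p_norm(v, p):
    # p-norm of a vector for 1 <= p < infinity
    return sum(abs(x) ** p for x in v) ** (1.0 / p)

def scale_rows(c, v):
    # componentwise scaling (cv)_i = c_i * v_i, as in the cu / cU notation
    return [ci * vi for ci, vi in zip(c, v)]

Uv = [3.0, -4.0, 2.5]   # stands for the vector U v
c = [0.2, 1.0, 0.7]     # entries in [0, 1]
for p in (1, 2, 3):
    assert p_norm(scale_rows(c, Uv), p) <= p_norm(Uv, p)
print("norm inequality holds for p = 1, 2, 3")
```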
In terms of this notation, (5.8) may be written as
$$c\,\nabla_+L_n^w(\theta+\hat u)+(1-c)\,\nabla_-L_n^w(\theta+\hat u)=0.\qquad(5.9)$$
Observe that the function $z\mapsto\mu\!\left(\tfrac{t}{n},\theta+z\hat u\right)$ is differentiable w.r.t. $z$ in the classical sense for almost all $z\in[0,1]$. For such $z$, the chain rule implies
$$\partial_{z+}\mu\!\left(\tfrac{t}{n},\theta+z\hat u\right)=\sum_{i=1}^{\nu+l}\partial_{i+}\mu\!\left(\tfrac{t}{n},\theta+z\hat u\right)\hat u_i=\nabla_+\mu\!\left(\tfrac{t}{n},\theta+z\hat u\right)'\hat u.$$
In particular,
$$\mu\!\left(\tfrac{t}{n},\theta+\hat u\right)-\mu\!\left(\tfrac{t}{n},\theta\right)=\int_0^1\partial_s\mu\!\left(\tfrac{t}{n},\theta+s\hat u\right)ds=\left(\int_0^1\nabla_+\mu\!\left(\tfrac{t}{n},\theta+s\hat u\right)ds\right)'\hat u.$$
Thus, by linearity and the product rule,
$$\begin{aligned}
\partial_{i+}L_n^w(\theta+\hat u)&=-2\sum_{t=1}^n w\!\left(\tfrac{t}{n+1}\right)\Big(x(t)-\mu\!\left(\tfrac{t}{n},\theta+\hat u\right)\Big)\,\partial_{i+}\mu\!\left(\tfrac{t}{n},\theta+\hat u\right)\\
&=-2\sum_{t=1}^n w\!\left(\tfrac{t}{n+1}\right)\xi(t)\,\partial_{i+}\mu\!\left(\tfrac{t}{n},\theta+\hat u\right)-2\sum_{t=1}^n w\!\left(\tfrac{t}{n+1}\right)\Big(\mu\!\left(\tfrac{t}{n},\theta\right)-\mu\!\left(\tfrac{t}{n},\theta+\hat u\right)\Big)\,\partial_{i+}\mu\!\left(\tfrac{t}{n},\theta+\hat u\right)\\
&=-2\sum_{t=1}^n w\!\left(\tfrac{t}{n+1}\right)\xi(t)\,\partial_{i+}\mu\!\left(\tfrac{t}{n},\theta\right)-2\sum_{t=1}^n w\!\left(\tfrac{t}{n+1}\right)\xi(t)\Big(\partial_{i+}\mu\!\left(\tfrac{t}{n},\theta+\hat u\right)-\partial_{i+}\mu\!\left(\tfrac{t}{n},\theta\right)\Big)\\
&\quad+2\sum_{t=1}^n w\!\left(\tfrac{t}{n+1}\right)\partial_{i+}\mu\!\left(\tfrac{t}{n},\theta+\hat u\right)\left(\int_0^1\nabla_+\mu\!\left(\tfrac{t}{n},\theta+s\hat u\right)ds\right)'\hat u.
\end{aligned}$$
Denoting
$$r_{n+}=\Bigg(\sum_{t=1}^n w\!\left(\tfrac{t}{n+1}\right)\xi(t)\Big(\partial_{i+}\mu\!\left(\tfrac{t}{n},\theta+\hat u\right)-\partial_{i+}\mu\!\left(\tfrac{t}{n},\theta\right)\Big)\Bigg)_{i=1}^{\nu+l},$$
$$Q_n=\left[\int_0^1\partial_{i+}\mu\!\left(\tfrac{t}{n},\theta+s\hat u\right)ds\right]_{t=1,\dots,n;\,i=1,\dots,\nu+l}\in\mathbb{R}^{n\times(\nu+l)},\quad\text{and}$$
$$\widehat Q_{n+}=\left[\partial_{i+}\mu\!\left(\tfrac{t}{n},\theta+\hat u\right)\right]_{t=1,\dots,n;\,i=1,\dots,\nu+l}\in\mathbb{R}^{n\times(\nu+l)},$$
we obtain
$$-\tfrac12\nabla_+L_n^w(\theta+\hat u)=Q_{n+}'W_ne_n+r_{n+}-\widehat Q_{n+}'W_nQ_n\hat u.\qquad(5.10)$$
Likewise,
$$-\tfrac12\nabla_-L_n^w(\theta+\hat u)=Q_{n-}'W_ne_n+r_{n-}-\widehat Q_{n-}'W_nQ_n\hat u,\qquad(5.11)$$
where $\widehat Q_{n-}$ and $r_{n-}$ are defined in analogy to their counterparts $\widehat Q_{n+}$ and $r_{n+}$.
Combining (5.10) and (5.11) with (5.9), we find
$$c\,Q_{n+}'W_ne_n+c\,r_{n+}-c\,\widehat Q_{n+}'W_nQ_n\hat u+(1-c)\,Q_{n-}'W_ne_n+(1-c)\,r_{n-}-(1-c)\,\widehat Q_{n-}'W_nQ_n\hat u=0,$$
in other words
$$-Q_{n+}'W_ne_n=y_n-Q_{n+}'W_nQ_{n+}\hat u+C_n\hat u+c\,r_{n+}+(1-c)\,r_{n-},\qquad(5.12)$$
where we denote
$$y_n=(1-c)\big(Q_{n-}'-Q_{n+}'\big)W_ne_n$$
and
$$C_n=\big[Q_{n+}'-\widehat Q_{n-}'\big]W_nQ_{n+}+\widehat Q_{n-}'W_n\big(Q_{n+}-Q_n\big)+c\big[\widehat Q_{n-}'-\widehat Q_{n+}'\big]W_nQ_n.$$
The following lemma clarifies the structure of the remainders $r_{n+}$ and $r_{n-}$, respectively.
Lemma 5.2.5. There exist sequences of random matrices $(R_{n+})_{n\in\mathbb{N}}\subset\mathbb{R}^{(\nu+l)\times(\nu+l)}$ and $(R_{n-})_{n\in\mathbb{N}}\subset\mathbb{R}^{(\nu+l)\times(\nu+l)}$ such that
(i) $r_{n\pm}=R_{n\pm}\hat u+v_{n\pm}$.
(ii) $P(n^{-1}\|R_{n\pm}\|>\Delta)=o(1)$ for all $\Delta>0$.
The vectors $v_{n\pm}$ are equal to zero if and only if $\mu$ has a continuous derivative. More precisely, define $J=\{k=1,\dots,l: b_{1,k}=1\}$ and $J^c=\{k=1,\dots,l: b_{1,k}>1\}$ with $b_{1,k}$ as in 4.1.1. Then
(iii) $v_{n+}=(v_{1;n+},\dots,v_{\nu+l;n+})\in\mathbb{R}^{\nu+l}$ with $v_{i;n+}=0$ $(i=1,\dots,\nu)$, $v_{\nu+k;n+}=0$ $(k\in J^c)$, and
$$|v_{\nu+k;n+}|^2=\Bigg(a_{1,k}\sum_{t=\lceil n(\hat\eta_{k;n}^w\wedge\eta_k)\rceil}^{\lceil n(\hat\eta_{k;n}^w\vee\eta_k)-1\rceil}\xi(t)\,w\!\left(\tfrac{t}{n+1}\right)\Bigg)^2\qquad(k\in J).$$
(iv) $v_{n-}=(v_{1;n-},\dots,v_{\nu+l;n-})\in\mathbb{R}^{\nu+l}$ with $v_{i;n-}=0$ $(i=1,\dots,\nu)$, $v_{\nu+k;n-}=0$ $(k\in J^c)$, and
$$|v_{\nu+k;n-}|^2=\Bigg(a_{1,k}\sum_{t=\lfloor n(\hat\eta_{k;n}^w\wedge\eta_k)+1\rfloor}^{\lfloor n(\hat\eta_{k;n}^w\vee\eta_k)\rfloor}\xi(t)\,w\!\left(\tfrac{t}{n+1}\right)\Bigg)^2\qquad(k\in J).$$
We postpone the proof of the lemma and proceed with the proof of the main result. Define $R_n$ and $v_n$ by
$$R_n=cR_{n+}+(1-c)R_{n-}\qquad\text{and}\qquad v_n=cv_{n+}+(1-c)v_{n-}.$$
Then, (5.12) can be written as
$$-Q_{n+}'W_ne_n=y_n+v_n+\big(C_n+R_n-Q_{n+}'W_nQ_{n+}\big)\hat u.\qquad(5.13)$$
By (5.5), there exist $K\in\mathbb{R}_+$ and $N\in\mathbb{N}$ such that
$$\big\|n\,\big(Q_{n+}'W_nQ_{n+}\big)^{-1}\big\|\le K\qquad(n>N).$$
We thus obtain for $n>N$
$$\begin{aligned}
P\Big(r(n)\,\big|\hat u-\big(Q_{n+}'W_nQ_{n+}\big)^{-1}Q_{n+}'W_ne_n\big|>\Delta\Big)
&=P\Big(r(n)\,\Big|\big(Q_{n+}'W_nQ_{n+}\big)^{-1}\big[y_n+v_n+[C_n+R_n]\hat u\big]\Big|>\Delta\Big)\\
&\le P\Big(\tfrac{r(n)}{n}\,K\,|y_n+v_n+C_n\hat u+R_n\hat u|>\Delta\Big).
\end{aligned}$$
By conditioning on the distances $|\hat\eta_{k;n}^w-\eta_k|$ $(k\in J)$, we will show that the last probability indeed vanishes as $n\to\infty$. More specifically, let $I\subset J$ and $\kappa\in(0,1)$ such that $r(n)n^{-\kappa}\to0$. Denote $I^c=J\setminus I$ and by $E_n^I$ the event
$$E_n^I=\big\{|\hat\eta_{k;n}^w-\eta_k|<n^{-\kappa}: k\in I\big\}\cap\big\{|\hat\eta_{k;n}^w-\eta_k|\ge n^{-\kappa}: k\in I^c\big\}.$$
Note that $E_n^{I_1}\cap E_n^{I_2}=\emptyset$ if $I_1\ne I_2$ and $\bigcup_{I\subset J}E_n^I=\Omega$. Hence
$$\begin{aligned}
P\Big(\big\{\tfrac{r(n)}{n}K|y_n+v_n+C_n\hat u+R_n\hat u|>\Delta\big\}\Big)
&=P\Big(\big\{\tfrac{r(n)}{n}K|y_n+v_n+C_n\hat u+R_n\hat u|>\Delta\big\}\cap\bigcup_{I\subset J}E_n^I\Big)\\
&=\sum_{I\subset J}P\Big(\big\{\tfrac{r(n)}{n}K|y_n+v_n+C_n\hat u+R_n\hat u|>\Delta\big\}\cap E_n^I\Big).
\end{aligned}$$
It is thus sufficient to check that for all $\Delta>0$ and all $I\subset J$
$$P\Big(\big\{\tfrac{r(n)}{n}|y_n+v_n+C_n\hat u+R_n\hat u|>\Delta\big\}\cap E_n^I\Big)\to0.\qquad(5.14)$$
So, fix $I\subset J$ and define $v_n^I=(v_{1;n}^I,\dots,v_{\nu+l;n}^I)$ by
$$v_{i;n}^I=\begin{cases}v_{i;n} & \text{if } i=\nu+k \text{ for some } k\in I,\\[2pt] 0 & \text{else.}\end{cases}$$
Observe that $v_n=v_n^{I^c}+v_n^I$. Define further the diagonal matrix $O_n^I=\big(o_{ij}^I\big)_{i=1,\dots,\nu+l;\,j=1,\dots,\nu+l}$ by
$$o_{ij}^I=\begin{cases}\dfrac{v_{i;n}^{I^c}}{\hat u_i} & \text{if } i=j=\nu+k \text{ with } k\in I^c,\\[4pt] 0 & \text{else.}\end{cases}$$
Note that $O_n^I$ is well defined if the event $E_n^I$ occurs, as $|\hat u_{\nu+k}|\ge n^{-\kappa}$ for all $k\in I^c$ in this case. By the very definition of $O_n^I$, we have $v_n^{I^c}=O_n^I\hat u$. Hence, (5.14) follows if we can show that
$$P\Big(\big\{\tfrac{r(n)}{n}\big|y_n+v_n^I+\big(O_n^I+C_n+R_n\big)\hat u\big|>\Delta\big\}\cap E_n^I\Big)\to0.\qquad(5.15)$$
But lemma 5.2.5 implies
$$P\big(n^{-1}\|R_n\|>\Delta\big)\to0\quad\text{for all }\Delta>0.$$
From lemma 5.2.6 and lemma 5.2.7 below, we conclude that
$$P\big(\tfrac{r(n)}{n}|y_n|>\Delta\big)\to0\quad\text{and}\quad P\big(n^{-1}\|C_n\|>\Delta\big)\to0\quad\text{for all }\Delta>0.$$
Consequently, (5.15) follows if we can show that
$$P\big(\big\{n^{-1}\|O_n^I\|>\Delta\big\}\cap E_n^I\big)\to0\quad\text{for all }\Delta>0,\qquad(5.16)$$
$$P\big(\big\{\tfrac{r(n)}{n}|v_n^I|>\Delta\big\}\cap E_n^I\big)\to0\quad\text{for all }\Delta>0,\qquad(5.17)$$
and that $r(n)\hat u$ is tight in the following sense: for all $\varepsilon>0$ there exists $\tilde\Delta>0$ such that
$$P\big(\big\{r(n)|\hat u|>\tilde\Delta\big\}\cap E_n^I\big)\le\varepsilon.\qquad(5.18)$$
Proof of (5.16): We need to ascertain that
$$P\Big(\Big\{\frac{|v_{\nu+k;n}|}{|n\hat u_{\nu+k}|}>\Delta\Big\}\cap E_n^I\Big)\to0\quad\text{for all }\Delta>0,\ k\in I^c.\qquad(5.19)$$
So, fix $k\in I^c$ and $\Delta>0$. By definition of $v_n$,
$$P\Big(\Big\{\frac{|v_{\nu+k;n}|}{|n\hat u_{\nu+k}|}>\Delta\Big\}\cap E_n^I\Big)\le P\Big(\Big\{\frac{|v_{\nu+k;n+}|}{|n\hat u_{\nu+k}|}>\Delta/2\Big\}\cap E_n^I\Big)+P\Big(\Big\{\frac{|v_{\nu+k;n-}|}{|n\hat u_{\nu+k}|}>\Delta/2\Big\}\cap E_n^I\Big).$$
To show that
$$P\Big(\Big\{\frac{|v_{\nu+k;n+}|}{|n\hat u_{\nu+k}|}>\Delta/2\Big\}\cap E_n^I\Big)\to0,\qquad(5.20)$$
choose $\delta>0$ such that $[\eta_k-\delta,\eta_k+\delta]\subset(0,1)$. Since $\hat\theta_n^w$ is consistent, we find
$$P\Big(\Big\{\frac{|v_{\nu+k;n+}|}{|n\hat u_{\nu+k}|}>\Delta/2\Big\}\cap E_n^I\Big)=P\Big(\Big\{\frac{|v_{\nu+k;n+}|}{|n\hat u_{\nu+k}|}>\Delta/2\Big\}\cap E_n^I\cap\{|\hat u|\le\delta\}\Big)+o(1).$$
By definition of $v_{n+}$ in lemma 5.2.5,
$$\begin{aligned}
&P\Big(\Big\{\frac{|v_{\nu+k;n+}|}{|n\hat u_{\nu+k}|}>\Delta/2\Big\}\cap E_n^I\cap\{|\hat u|\le\delta\}\Big)\\
&\qquad=P\Big(\Big\{\frac{1}{n|\hat u_{\nu+k}|}\Big|\sum_{t=\lceil n(\eta_k\wedge\hat\eta_{k;n}^w)\rceil}^{\lceil n(\eta_k\vee\hat\eta_{k;n}^w)-1\rceil}\xi(t)\,w\!\left(\tfrac{t}{n+1}\right)\Big|>\Delta/2\Big\}\cap\{|\hat u|\le\delta\}\cap E_n^I\Big)\\
&\qquad\le P\Big(\sup_{0\le u\le\delta,\,|u|>n^{-\kappa}}\frac{1}{nu}\Big|\sum_{t=\lceil n\eta_k\rceil}^{\lceil n(\eta_k+u)-1\rceil}\xi(t)\,w\!\left(\tfrac{t}{n+1}\right)\Big|>\Delta/4\Big)\\
&\qquad\quad+P\Big(\sup_{-\delta\le u\le0,\,|u|>n^{-\kappa}}\frac{1}{n|u|}\Big|\sum_{t=\lceil n(\eta_k+u)\rceil}^{\lceil n\eta_k\rceil-1}\xi(t)\,w\!\left(\tfrac{t}{n+1}\right)\Big|>\Delta/4\Big)\\
&\qquad=\Pi_1+\Pi_2,
\end{aligned}$$
say. Since $[\eta_k-\delta,\eta_k+\delta]\subset(0,1)$, we may assume that $\tfrac{t+\lceil n\eta_k\rceil-1}{n+1}\in K$ for some compact set $K\subset(0,1)$. Denote $Y_{\lceil n\eta_k\rceil}(t)=\sum_{s=\lceil n\eta_k\rceil}^t\xi(s)$, $Y(t)=\sum_{s=1}^t\xi(s)$, and by $V_K(w)$ the total variation of $w$ on $K$. Note that $V_K(w)$ is finite, since $w$ is continuously differentiable on $K$. Then, by summation by parts,
$$\begin{aligned}
\Big|\sum_{t=\lceil n\eta_k\rceil}^{\lceil n(\eta_k+u)-1\rceil}\xi(t)\,w\!\left(\tfrac{t}{n+1}\right)\Big|
&=\Big|\sum_{t=\lceil n\eta_k\rceil}^{\lceil n(\eta_k+u)-1\rceil}\big(Y_{\lceil n\eta_k\rceil}(t)-Y_{\lceil n\eta_k\rceil}(t-1)\big)\,w\!\left(\tfrac{t}{n+1}\right)\Big|\\
&=\Big|\sum_{t=\lceil n\eta_k\rceil}^{\lceil n(\eta_k+u)-2\rceil}Y_{\lceil n\eta_k\rceil}(t)\Big(w\!\left(\tfrac{t}{n+1}\right)-w\!\left(\tfrac{t+1}{n+1}\right)\Big)+Y_{\lceil n\eta_k\rceil}\big(\lceil n(\eta_k+u)-1\rceil\big)\,w\!\left(\tfrac{\lceil n(\eta_k+u)-1\rceil}{n+1}\right)\Big|\\
&\le 2\sup_{t=\lceil n\eta_k\rceil,\dots,\lceil n(\eta_k+u)-1\rceil}\big|Y_{\lceil n\eta_k\rceil}(t)\big|\,V_K(w).
\end{aligned}$$
Consequently, by strict stationarity of $\xi(\cdot)$,
$$\begin{aligned}
\Pi_1&\le P\Big(\sup_{u>n^{-\kappa}}\frac{1}{nu}\sup_{t=\lceil n\eta_k\rceil,\dots,\lceil n(\eta_k+u)-1\rceil}\big|Y_{\lceil n\eta_k\rceil}(t)\big|\,V_K(w)>\Delta/8\Big)\\
&=P\Big(\sup_{u>n^{-\kappa}}\frac{1}{nu}\sup_{t=1,\dots,\lceil n(\eta_k+u)\rceil-\lceil n\eta_k\rceil}|Y(t)|\,V_K(w)>\Delta/8\Big)\\
&\le P\Big(\sup_{u>n^{-\kappa}}\frac{1}{nu}\sup_{t=1,\dots,\lceil nu\rceil}|Y(t)|\,V_K(w)>\Delta/8\Big).
\end{aligned}$$
Likewise, we find
$$\Pi_2\le P\Big(\sup_{u>n^{-\kappa}}\frac{1}{nu}\sup_{t=1,\dots,\lceil nu\rceil}|Y(t)|\,V_K(w)>\Delta/8\Big).$$
Hence, (5.20) follows if we can show that
$$P\Big(\sup_{u>n^{-\kappa}}\frac{1}{nu}\sup_{t=1,\dots,\lceil nu\rceil}|Y_t|>\Delta^*\Big)=o(1)\qquad(5.21)$$
for all $\Delta^*>0$. But by assumption, $\lim_n\tfrac1n|Y_n|=0$ almost surely and thus
$$\limsup_k\,\mathbf{1}\big\{\tfrac1k|Y_k|>\Delta^*\big\}=0\quad\text{a.s. for all }\Delta^*>0,$$
from which we conclude by contraposition that
$$P\Big(\bigcap_{k=1}^\infty\bigcup_{t\ge k}\big\{\tfrac{|Y_t|}{t}>\Delta^*\big\}\Big)=0\quad\text{for all }\Delta^*>0.$$
Thus, for all $\varepsilon>0$, there exists $N(\varepsilon)\in\mathbb{N}$ such that
$$P\Big(\bigcup_{t\ge N(\varepsilon)}\big\{\tfrac{|Y_t|}{t}>\Delta^*\big\}\Big)\le\varepsilon.$$
Since $un>n^{1-\kappa}$, we may assume that $\lceil nu\rceil>N(\varepsilon)$. In this case
$$\begin{aligned}
P\Big(\sup_{u>n^{-\kappa}}\sup_{t=1,\dots,\lceil nu\rceil}\frac{|Y_t|}{nu}>\Delta^*\Big)
&\le P\Big(\bigcup_{u>n^{-\kappa}}\bigcup_{t=1}^{\lceil nu\rceil}\big\{\tfrac{|Y_t|}{nu}\ge\Delta^*\big\}\Big)\\
&\le P\Big(\bigcup_{t=1}^{N(\varepsilon)}\big\{n^{\kappa-1}|Y_t|>\Delta^*\big\}\Big)+P\Big(\bigcup_{t=N(\varepsilon)+1}^\infty\big\{\tfrac{|Y_t|}{t}>\Delta^*\big\}\Big).
\end{aligned}$$
By construction,
$$P\Big(\bigcup_{t=N(\varepsilon)+1}^\infty\big\{\tfrac{|Y_t|}{t}>\Delta^*\big\}\Big)\le\varepsilon,\qquad P\Big(\bigcup_{t=1}^{N(\varepsilon)}\big\{n^{\kappa-1}|Y_t|>\Delta^*\big\}\Big)=O(n^{\kappa-1}).$$
Thus, (5.21) and in turn (5.20) follow.
By virtually the same argument, we can show that
$$P\Big(\Big\{\frac{|v_{\nu+k;n-}|}{|n\hat u_{\nu+k}|}>\Delta/2\Big\}\cap E_n^I\Big)\to0,$$
and so (5.16) follows.
Proof of equation (5.17): We must show that
$$P\Big(\big\{\tfrac{r(n)}{n}|v_{\nu+k;n}|>\Delta\big\}\cap E_n^I\Big)\to0\quad\text{for all }\Delta>0\text{ and all }k\in I.$$
To that end, fix $k\in I$ and $\Delta>0$. We may w.l.o.g. assume that $[\eta_k-n^{-\kappa},\eta_k+n^{-\kappa}]\subset[a,b]$ for $0<a<b<1$. By definition of $v_n$ and lemma 5.2.5,
$$\begin{aligned}
P\Big(\big\{\tfrac{r(n)}{n}|v_{\nu+k;n}|>\Delta\big\}\cap E_n^I\Big)
&=P\Big(\big\{\tfrac{r(n)}{n}\big|c_{\nu+k}\cdot v_{(\nu+k)+;n}+(1-c_{\nu+k})\cdot v_{(\nu+k)-;n}\big|>\Delta\big\}\cap E_n^I\Big)\\
&=P\Big(\Big\{\tfrac{r(n)}{n}|a_{1,k}|\Big|\sum_{t=\lfloor n(\hat\eta_{k;n}^w\wedge\eta_k)\rfloor}^{\lceil n(\hat\eta_{k;n}^w\vee\eta_k)\rceil}\xi(t)\,w\!\left(\tfrac{t}{n+1}\right)\Big|>\Delta\Big\}\cap E_n^I\Big)\\
&\le P\Big(\tfrac{r(n)}{n}|a_{1,k}|\Big|\sum_{t=\lfloor n(\eta_k-n^{-\kappa})\rfloor}^{\lceil n(\eta_k+n^{-\kappa})\rceil}\xi(t)\,w\!\left(\tfrac{t}{n+1}\right)\Big|>\Delta\Big)\\
&\le P\Big(\tfrac{r(n)}{n}\sum_{t=\lfloor n(\eta_k-n^{-\kappa})\rfloor}^{\lceil n(\eta_k+n^{-\kappa})\rceil}|\xi(t)|>\Delta^*\Big),
\end{aligned}$$
where $\Delta^*=\Delta\big/\big(|a_{1,k}|\sup_{x\in[a,b]}|w(x)|\big)$. Since $E(|\xi(t)|)$ does not depend on $t$, we obtain by Markov's inequality
$$P\Big(\tfrac{r(n)}{n}\sum_{t=\lfloor n(\eta_k-n^{-\kappa})\rfloor}^{\lceil n(\eta_k+n^{-\kappa})\rceil}|\xi(t)|>\Delta^*\Big)\le\frac{r(n)}{n\Delta^*}\sum_{t=\lfloor n(\eta_k-n^{-\kappa})\rfloor}^{\lceil n(\eta_k+n^{-\kappa})\rceil}E|\xi(t)|\to0,$$
which proves the claim.
Proof of equation (5.18): From (5.13), we obtain
$$Q_{n+}'W_ne_n+y_n+v_n^I=\big(Q_{n+}'W_nQ_{n+}-O_n^I-C_n-R_n\big)\hat u.\qquad(5.22)$$
Since all eigenvalues of $\tfrac1nQ_{n+}'W_nQ_{n+}$ are bounded away from zero as $n\to\infty$, there exists $K\in\mathbb{R}_+$ such that
$$|\hat u|\le\frac{K}{n}\,\big|Q_{n+}'W_ne_n+y_n+v_n^I\big|$$
whenever $\tfrac1n\|O_n^I+C_n+R_n\|$ is sufficiently small, say $\tfrac1n\|O_n^I+C_n+R_n\|\le\Delta_0$. Define therefore the event
$$F_n=\big\{\tfrac1n\|O_n^I+C_n+R_n\|\le\Delta_0\big\}.$$
Then
$$P\big(\{|r(n)\hat u|>\Delta^*\}\cap E_n^I\big)\le P\big(\{|r(n)\hat u|>\Delta^*\}\cap E_n^I\cap F_n\big)+P\big(E_n^I\cap F_n^c\big).$$
But by lemma 5.2.5, lemma 5.2.7 and (5.16),
$$P\big(E_n^I\cap F_n^c\big)\to0.$$
On the other hand,
$$P\big(\{|r(n)\hat u|>\Delta^*\}\cap E_n^I\cap F_n\big)\le P\Big(\big\{\tfrac{r(n)K}{n}\big|Q_{n+}'W_ne_n+y_n+v_n^I\big|>\Delta^*\big\}\cap E_n^I\Big).$$
So, for fixed $\varepsilon>0$, we find by 3.2.4, lemma 5.2.6 and (5.17) a $\Delta^*>0$ such that
$$P\Big(\big\{\tfrac{r(n)K}{n}\big|Q_{n+}'W_ne_n+y_n+v_n^I\big|>\Delta^*\big\}\cap E_n^I\Big)\le\varepsilon,$$
which is indeed the desired result.
Proof of lemma 5.2.5. We consider only the case of the right derivative, as the arguments for the left derivative are virtually identical. For what follows, denote by $r_{i;n+}$ and $v_{i;n+}$ the $i$-th components of $r_{n+}$ and $v_{n+}$ $(i=1,\dots,\nu+l)$ and by $r_{im;n+}$ the $im$-th component of $R_{n+}$ $(i,m=1,\dots,\nu+l)$.
(i) If $i=1,\dots,p_0$, then $\partial_{i+}\mu(s,\theta)$ does not depend on $\theta$, i.e.
$$\partial_{i+}\mu\!\left(\tfrac{t}{n},\theta+\hat u\right)=\partial_{i+}\mu\!\left(\tfrac{t}{n},\theta\right).$$
In particular, $r_{i;n+}=0$ $(i=1,\dots,p_0)$. Hence
$$v_{i;n+}=0,\qquad r_{im;n+}=0\qquad(i=1,\dots,p_0,\ m=1,\dots,\nu+l).$$
(ii) Let $i=p_0+1,\dots,\nu$. Then $\theta_i=a_{j,k}$ for some $k=1,\dots,l$ and $j=1,\dots,p_k$. Hence
$$\begin{aligned}
\partial_{i+}\mu(s,\theta+\hat u)-\partial_{i+}\mu(s,\theta)&=(s-\hat u_{\nu+k}-\eta_k)_+^{b_{j,k}}-(s-\eta_k)_+^{b_{j,k}}\\
&=\int_0^1\partial_z(s-\hat u_{\nu+k}z-\eta_k)_+^{b_{j,k}}\,dz\\
&=-\hat u_{\nu+k}\int_0^1 b_{j,k}(s-\hat u_{\nu+k}z-\eta_k)_+^{b_{j,k}-1}\,dz,
\end{aligned}$$
thus
$$r_{i;n+}=\sum_{t=1}^n w\!\left(\tfrac{t}{n+1}\right)\xi(t)\Big(\partial_{i+}\mu\!\left(\tfrac{t}{n},\theta+\hat u\right)-\partial_{i+}\mu\!\left(\tfrac{t}{n},\theta\right)\Big)=-\sum_{t=1}^n\xi(t)\,w\!\left(\tfrac{t}{n+1}\right)\hat u_{\nu+k}\int_0^1 b_{j,k}\big(\tfrac{t}{n}-\hat u_{\nu+k}z-\eta_k\big)_+^{b_{j,k}-1}\,dz.$$
Therefore
$$r_{im;n+}=0,\qquad v_{i;n+}=0\qquad(i=p_0+1,\dots,\nu,\ m\ne\nu+k)$$
and
$$r_{i(\nu+k);n+}=-\sum_{t=1}^n\xi(t)\,w\!\left(\tfrac{t}{n+1}\right)\int_0^1 b_{j,k}\big(\tfrac{t}{n}-\hat u_{\nu+k}z-\eta_k\big)_+^{b_{j,k}-1}\,dz.$$
(iii) Let $i=\nu+k$ $(k=1,\dots,l)$, in other words $\theta_i=\eta_k$. In this case
$$\theta_{j+p}=a_{j,k}\qquad\Big(j=1,\dots,p_k;\ p=\sum_{k'=0}^{k-1}p_{k'}\Big).$$
Assume $b_{1,k}=1$. Then
$$\begin{aligned}
&\partial_{i+}\mu(s,\theta+\hat u)-\partial_{i+}\mu(s,\theta)\\
&\quad=\sum_{j=1}^{p_k}b_{j,k}\,[a_{j,k}+\hat u_{j+p}]\cdot(s-\eta_k-\hat u_{\nu+k})_+^{b_{j,k}-1}-\sum_{j=1}^{p_k}b_{j,k}\,a_{j,k}\cdot(s-\eta_k)_+^{b_{j,k}-1}\\
&\quad=\sum_{j=1}^{p_k}\hat u_{j+p}\,b_{j,k}\cdot(s-\eta_k-\hat u_{\nu+k})_+^{b_{j,k}-1}+\sum_{j=1}^{p_k}b_{j,k}\,a_{j,k}\cdot(s-\eta_k-\hat u_{\nu+k})_+^{b_{j,k}-1}-\sum_{j=1}^{p_k}b_{j,k}\,a_{j,k}\cdot(s-\eta_k)_+^{b_{j,k}-1}\\
&\quad=\sum_{j=1}^{p_k}\hat u_{j+p}\,b_{j,k}\cdot(s-\eta_k-\hat u_{\nu+k})_+^{b_{j,k}-1}+a_{1,k}\cdot\big[(s-\eta_k-\hat u_{\nu+k})_+^0-(s-\eta_k)_+^0\big]\\
&\qquad+\sum_{j=2}^{p_k}a_{j,k}\,b_{j,k}\cdot\big[(s-\eta_k-\hat u_{\nu+k})_+^{b_{j,k}-1}-(s-\eta_k)_+^{b_{j,k}-1}\big]\\
&\quad=\sum_{j=1}^{p_k}\hat u_{j+p}\,b_{j,k}\cdot(s-\eta_k-\hat u_{\nu+k})_+^{b_{j,k}-1}+a_{1,k}\cdot\big[(s-\eta_k-\hat u_{\nu+k})_+^0-(s-\eta_k)_+^0\big]\\
&\qquad-\sum_{j=2}^{p_k}a_{j,k}\,b_{j,k}(b_{j,k}-1)\,\hat u_{\nu+k}\int_0^1(s-\eta_k-\hat u_{\nu+k}z)_+^{b_{j,k}-2}\,dz.
\end{aligned}$$
Hence,
$$\begin{aligned}
r_{i;n+}&=-\sum_{j=1}^{p_k}\hat u_{j+p}\,b_{j,k}\sum_{t=1}^n w\!\left(\tfrac{t}{n+1}\right)\xi(t)\big(\tfrac{t}{n}-\eta_k-\hat u_{\nu+k}\big)_+^{b_{j,k}-1}\\
&\quad-\sum_{t=1}^n w\!\left(\tfrac{t}{n+1}\right)\xi(t)\,a_{1,k}\cdot\big[\big(\tfrac{t}{n}-\eta_k-\hat u_{\nu+k}\big)_+^0-\big(\tfrac{t}{n}-\eta_k\big)_+^0\big]\\
&\quad+\hat u_{\nu+k}\sum_{t=1}^n w\!\left(\tfrac{t}{n+1}\right)\xi(t)\sum_{j=2}^{p_k}a_{j,k}\,b_{j,k}(b_{j,k}-1)\int_0^1\big(\tfrac{t}{n}-\eta_k-\hat u_{\nu+k}z\big)_+^{b_{j,k}-2}\,dz,
\end{aligned}$$
and thus
$$r_{(\nu+k)m;n+}=-\sum_{t=1}^n\xi(t)\,w\!\left(\tfrac{t}{n+1}\right)b_{j,k}\big(\tfrac{t}{n}-\eta_k-\hat u_{\nu+k}\big)_+^{b_{j,k}-1}\qquad\text{if } m=p+j,\ j=1,\dots,p_k,$$
$$r_{(\nu+k)m;n+}=\sum_{t=1}^n\xi(t)\,w\!\left(\tfrac{t}{n+1}\right)\sum_{j=2}^{p_k}a_{j,k}\big(b_{j,k}^2-b_{j,k}\big)\int_0^1\big(\tfrac{t}{n}-\eta_k-z\hat u_{\nu+k}\big)_+^{b_{j,k}-2}\,dz\qquad\text{if } m=\nu+k,$$
and $r_{(\nu+k)m;n+}=0$ else. By 4.2.5,
$$|v_{\nu+k;n+}|=\Big|a_{1,k}\sum_{t=\lceil n(\hat\eta_{k;n}^w\wedge\eta_k)\rceil}^{\lceil n(\hat\eta_{k;n}^w\vee\eta_k)-1\rceil}\xi(t)\,w\!\left(\tfrac{t}{n+1}\right)\Big|.$$
If $b_{1,k}>1$, then all quantities remain the same, except for $v_{\nu+k;n+}$, which is equal to zero.
In order to complete the proof, we need to show that $\tfrac1n\|R_{n+}\|\to0$ in probability, i.e. for all $\Delta>0$
$$P\big(n^{-1}|r_{im;n}|>\Delta\big)=o(1)\qquad(i=1,\dots,\nu+l;\ m=1,\dots,\nu+l).\qquad(5.23)$$
Since each $r_{im;n}$ can be represented as a linear combination of expressions of the following kinds, it suffices to check that for all $k=1,\dots,l$, $m\in\mathbb{N}\cup\{0\}$ and $\Delta>0$
$$P\Big(n^{-1}\Big|\sum_{t=1}^n\xi(t)\,w\!\left(\tfrac{t}{n+1}\right)\big(\tfrac{t}{n}-\eta_k-\hat u_{\nu+k}\big)_+^m\Big|>\Delta\Big)=o(1)\qquad(5.24)$$
and
$$P\Big(n^{-1}\Big|\sum_{t=1}^n\xi(t)\,w\!\left(\tfrac{t}{n+1}\right)\int_0^1\big(\tfrac{t}{n}-\eta_k-z\hat u_{\nu+k}\big)_+^m\,dz\Big|>\Delta\Big)=o(1).\qquad(5.25)$$
So, fix $m\in\mathbb{N}\cup\{0\}$, $k\in\{1,\dots,l\}$ and $\Delta>0$, and define the events $A_m$, $U_0$ and $V_0$. By 5.1.2 and 5.1.4, $P(V_0^c)=o(1)$ and, by 3.2.4, $P(A_m^c)=o(1)$. Theorem 5.1.9 implies $P(U_0^c)=o(1)$.
Proof of (5.25): We apply similar arguments. Assume w.l.o.g. that $\hat u_{\nu+k}\ge0$. At first, we calculate the integral $\int_0^1\big(\tfrac{t}{n}-\eta_k-z\hat u_{\nu+k}\big)_+^m\,dz$: if $\tfrac{t}{n}>\eta_k+\delta$ and $0<|\hat u_{\nu+k}|\le\delta$, then
$$\begin{aligned}
\int_0^1\big(\tfrac{t}{n}-\eta_k-z\hat u_{\nu+k}\big)_+^m\,dz
&=\frac{1}{m+1}\,\frac{1}{\hat u_{\nu+k}}\Big[\big(\tfrac{t}{n}-\eta_k\big)^{m+1}-\big(\tfrac{t}{n}-\eta_k-\hat u_{\nu+k}\big)^{m+1}\Big]\\
&=\frac{1}{m+1}\,\frac{1}{\hat u_{\nu+k}}\Bigg[\big(\tfrac{t}{n}-\eta_k\big)^{m+1}-\sum_{i=0}^{m+1}\binom{m+1}{i}\big(\tfrac{t}{n}-\eta_k\big)^i(-\hat u_{\nu+k})^{m+1-i}\Bigg]\\
&=\frac{1}{m+1}\Bigg[\sum_{i=0}^{m}\binom{m+1}{i}\big(\tfrac{t}{n}-\eta_k\big)^i(-\hat u_{\nu+k})^{m-i}\Bigg].
\end{aligned}$$
If $\tfrac{t}{n}>\eta_k+\delta$ and $\hat u_{\nu+k}=0$, this simplifies to
$$\int_0^1\big(\tfrac{t}{n}-\eta_k-z\hat u_{\nu+k}\big)_+^m\,dz=\big(\tfrac{t}{n}-\eta_k\big)_+^m.$$
If $|\tfrac{t}{n}-\eta_k|\le\delta$, we have the (rather trivial) upper bound
$$\int_0^1\big(\tfrac{t}{n}-\eta_k-z\hat u_{\nu+k}\big)_+^m\,dz\le1.$$
Hence, for $\omega\in A_m\cap U_0\cap V_0$,
$$\begin{aligned}
\frac1n\Big|\sum_{t=1}^n\xi(t)\,w\!\left(\tfrac{t}{n+1}\right)\int_0^1\big(\tfrac{t}{n}-\eta_k-z\hat u_{\nu+k}\big)_+^m\,dz\Big|
&\le\frac1n\Big|\sum_{\frac{t}{n}>\eta_k+\delta}\xi(t)\,w\!\left(\tfrac{t}{n+1}\right)\frac{1}{m+1}\sum_{i=0}^m\binom{m+1}{i}\big(\tfrac{t}{n}-\eta_k\big)^i(-\hat u_{\nu+k})^{m-i}\Big|\\
&\quad+\frac1n\Big|\sum_{|\frac{t}{n}-\eta_k|\le\delta}\xi(t)\,w\!\left(\tfrac{t}{n+1}\right)\Big|\\
&\le\sum_{i=0}^m|\hat u_{\nu+k}|^{m-i}\,\frac{1}{m+1}\binom{m+1}{i}\Big|\frac1n\sum_{\frac{t}{n}>\eta_k+\delta}\xi(t)\,w\!\left(\tfrac{t}{n+1}\right)\big(\tfrac{t}{n}-\eta_k\big)^i\Big|\\
&\quad+\frac1n\Big|\sum_{|\frac{t}{n}-\eta_k|\le\delta}\xi(t)\,w\!\left(\tfrac{t}{n+1}\right)\Big|\\
&\le\frac{\Delta}{2}+\sqrt{\frac1n\sum_{|\frac{t}{n}-\eta_k|\le\delta}\xi_t^2}\;\sqrt{\frac1n\sum_{|\frac{t}{n}-\eta_k|\le\delta}w^2\!\left(\tfrac{t}{n+1}\right)}<\Delta,
\end{aligned}$$
and thus again
$$P\Big(n^{-1}\Big|\sum_{t=1}^n\xi(t)\,w\!\left(\tfrac{t}{n+1}\right)\int_0^1\big(\tfrac{t}{n}-\eta_k-z\hat u_{\nu+k}\big)_+^m\,dz\Big|>\Delta\Big)\le P(A_m^c\cup U_0^c\cup V_0^c).$$
Lemma 5.2.6. For all $\Delta>0$,
$$P\Big(\tfrac{r(n)}{n}|y_n|>\Delta\Big)=O\Big(\tfrac{r(n)}{n}\Big).\qquad(5.26)$$
Proof. Since $\partial_{j+}\mu(t,\theta)=\partial_{j-}\mu(t,\theta)$ for all $t\in[0,1]$ and $j=1,\dots,\nu$, we have
$$|y_{j;n}|=0\qquad(j=1,\dots,\nu).$$
If $j=\nu+k$ $(k=1,\dots,l)$, then $\partial_{j+}\mu(t,\theta)$ and $\partial_{j-}\mu(t,\theta)$ differ at most at $t=\eta_k$. Hence
$$\frac{r(n)}{n}\,|y_{\nu+k;n}|\le\frac{r(n)}{n}\,\Big|\big[\partial_{j+}\mu(\eta_k,\theta)-\partial_{j-}\mu(\eta_k,\theta)\big]\,w\!\left(\tfrac{\eta_kn}{n+1}\right)\xi_{\lceil n\eta_k\rceil}\Big|.$$
But, by Markov's inequality,
$$P\Big(\tfrac{r(n)}{n}|y_{\nu+k;n}|>\Delta\Big)\le\frac{r(n)}{n\Delta}\,E\Big|\big[\partial_{j+}\mu(\eta_k,\theta)-\partial_{j-}\mu(\eta_k,\theta)\big]\,w\!\left(\tfrac{\eta_kn}{n+1}\right)\xi_{\lceil n\eta_k\rceil}\Big|\to0,$$
which proves the assertion.
Lemma 5.2.7. For all $\Delta>0$,
$$P\big(\tfrac1n\|C_n\|>\Delta\big)=o(1).$$
Proof. From the triangle inequality, we obtain the upper bound
$$\|C_n\|\le\Big\|\big[Q_{n+}'-\widehat Q_{n-}'\big]W_nQ_{n+}\Big\|+\Big\|\widehat Q_{n-}'W_n\big(Q_{n+}-Q_n\big)\Big\|+\Big\|\big[\widehat Q_{n-}'-\widehat Q_{n+}'\big]W_nQ_n\Big\|.$$
Applying case-by-case arguments, we will show that each term, normalised by $n$, converges to zero as $n\to\infty$. To that end, we assume (as we may by theorem 5.1.9) that $|\hat u|\le\delta$ for some $\delta>0$. Note that there exists a constant $C\in\mathbb{R}_+$ such that
$$\sup_{i=1,\dots,\nu+l,\ t\in[0,1],\ |u|\le\delta}|\partial_{i\pm}\mu(t,\theta+u)|\le C.$$
As for the first term, we obtain by definition
$$\frac1n\Big[\big[Q_{n+}'-\widehat Q_{n-}'\big]W_nQ_{n+}\Big]_{ij}=\frac1n\sum_{t=1}^n\Big(\partial_{i+}\mu\!\left(\tfrac{t}{n},\theta\right)-\partial_{i-}\mu\!\left(\tfrac{t}{n},\theta+\hat u\right)\Big)w\!\left(\tfrac{t}{n}\right)\partial_{j+}\mu\!\left(\tfrac{t}{n},\theta\right).$$
The functions $(s,u)\mapsto\partial_{i+}\mu(s,\theta+u)$ and $(s,u)\mapsto\partial_{i-}\mu(s,\theta+u)$ coincide and are continuous in all $(s,u)$ whenever $|s-\eta_k|>|u|$ $(k=1,\dots,l)$. In particular, they coincide and are uniformly continuous on the set
$$D(\delta_1,\delta_2)=\{(s,u): s\in[0,1]\text{ such that }|s-\eta_k|\ge\delta_1,\ |u|\le\delta_2\}$$
if $\delta_1\ge2\delta_2$. By choosing $\delta_1$ and $\delta_2$ sufficiently small, we thus obtain for any $\varepsilon>0$
$$\big|\partial_{i+}\mu\!\left(\tfrac{t}{n},\theta\right)-\partial_{i-}\mu\!\left(\tfrac{t}{n},\theta+u\right)\big|\le\varepsilon\qquad\text{whenever }\big(\tfrac{t}{n},u\big)\in D(\delta_1,\delta_2).$$
Thus, fixing $\Delta>0$ and assuming $|\hat u|\le\delta_2$,
$$\frac1n\Big|\Big[\big[Q_{n+}'-\widehat Q_{n-}'\big]W_nQ_{n+}\Big]_{ij}\Big|\le\frac{2C^2}{n}\sum_{|\frac{t}{n}-\eta_k|\le2\delta_1}w\!\left(\tfrac{t}{n}\right)+\varepsilon C<\Delta\qquad(i,j=1,\dots,\nu+l)$$
if $\delta_1,\delta_2$ are sufficiently small. Hence, for any sufficiently small $\delta>0$,
$$P\Big(\frac1n\Big|\Big[\big[Q_{n+}'-\widehat Q_{n-}'\big]W_nQ_{n+}\Big]_{ij}\Big|>\Delta\Big)\le P(|\hat u|>\delta),$$
and thus
$$P\Big(\frac1n\Big\|\big[Q_{n+}'-\widehat Q_{n-}'\big]W_nQ_{n+}\Big\|>\Delta\Big)\to0.$$
Similar arguments apply to the second term: by definition,
$$\frac1n\Big[\widehat Q_{n-}'W_n\big(Q_{n+}-Q_n\big)\Big]_{ij}=\frac1n\sum_{t=1}^n\partial_{i-}\mu\!\left(\tfrac{t}{n},\theta+\hat u\right)w\!\left(\tfrac{t}{n}\right)\Big(\partial_{j+}\mu\!\left(\tfrac{t}{n},\theta\right)-\int_0^1\partial_j\mu\!\left(\tfrac{t}{n},\theta+s\hat u\right)ds\Big).$$
By the above arguments, we can choose $\delta>0$ such that for all $\big(\tfrac{t}{n},u\big)\in D(\delta)$
$$\Big|\partial_{j+}\mu\!\left(\tfrac{t}{n},\theta\right)-\int_0^1\partial_j\mu\!\left(\tfrac{t}{n},\theta+su\right)ds\Big|\le\varepsilon.$$
Hence, for fixed $\Delta>0$ and $|\hat u|\le\delta$,
$$\frac1n\Big|\Big[\widehat Q_{n-}'W_n\big(Q_{n+}-Q_n\big)\Big]_{ij}\Big|\le\frac{2C^2}{n}\sum_{|\frac{t}{n}-\eta_k|\le2\delta}w\!\left(\tfrac{t}{n}\right)+\varepsilon C<\Delta\qquad(i,j=1,\dots,\nu+l)$$
if $\delta$ is sufficiently small, which proves the claim.
Finally, consider the third term: by definition,
$$\frac1n\Big[\big[\widehat Q_{n-}'-\widehat Q_{n+}'\big]W_nQ_n\Big]_{ij}=\frac1n\sum_{t=1}^n\Big(\partial_{i-}\mu\!\left(\tfrac{t}{n},\hat\theta_n^w\right)-\partial_{i+}\mu\!\left(\tfrac{t}{n},\hat\theta_n^w\right)\Big)w\!\left(\tfrac{t}{n}\right)\int_0^1\partial_j\mu\!\left(\tfrac{t}{n},\theta+s\hat u\right)ds.$$
As above, $\partial_{i+}\mu\big(t,\hat\theta_n^w\big)\ne\partial_{i-}\mu\big(t,\hat\theta_n^w\big)$ only if $i=\nu+k$, $k=1,\dots,l$ and $t=\hat\eta_{k;n}^w$. Hence, for fixed $\Delta>0$,
$$\frac1n\Big[\big[\widehat Q_{n-}'-\widehat Q_{n+}'\big]W_nQ_n\Big]_{ij}=0\qquad(i=1,\dots,\nu)$$
and
$$\frac1n\Big|\Big[\big[\widehat Q_{n-}'-\widehat Q_{n+}'\big]W_nQ_n\Big]_{ij}\Big|\le\frac{2C^2}{n}\sum_{|\frac{t}{n}-\eta_k|\le|\hat u_{\nu+k}|}w\!\left(\tfrac{t}{n}\right)<\Delta\qquad(i=\nu+k)$$
if $|\hat u|$ is sufficiently small, which completes the proof.
Theorem 5.2.8. Let $w$, $\xi(\cdot)$ and $\mu$ be as in theorem 5.2.3.
(i) Assume that the finite dimensional distributions of $Z_{\xi,n}(\cdot)$ converge to a fractional Brownian motion $B_H(\cdot)$ with Hurst parameter $H=0.5+d$. Denote by $\Xi$ the random vector
$$\Xi:=\Big(\int w(s)\,\partial_{j+}\mu(s)\,dB_H(s)\Big)_{j=1,\dots,\nu+l}.$$
Then, as $n\to\infty$,
$$r(n)\big(\hat\theta_n^w-\theta\big)\xrightarrow{d}\Lambda\,\Xi\qquad(5.27)$$
with $\Lambda$ as in (5.5).
(ii) Assume that the finite dimensional distributions of $Z_{\xi,n}(\cdot)$ converge to a Hermite process $H_m^H(\cdot)$ with Hurst parameter $H=0.5+d$. Denote by $\Xi_m$ the random vector
$$\Xi_m:=\Big(\int w(s)\,\partial_{j+}\mu(s)\,dH_m^H(s)\Big)_{j=1,\dots,\nu+l}.$$
Then, as $n\to\infty$,
$$r(n)\big(\hat\theta_n^w-\theta\big)\xrightarrow{d}\Lambda\,\Xi_m.\qquad(5.28)$$
Theorem 5.2.9. Let $\xi(\cdot)$ and $\mu$ be as in theorem 5.2.4. Set $w=1$ in the above definitions of $\Lambda$, $\Xi$ and $\Xi_m$.
(i) Assume that the finite dimensional distributions of $Z_{\xi,n}(\cdot)$ converge to a fractional Brownian motion $B_H(\cdot)$ with Hurst parameter $H=0.5+d$. Then, as $n\to\infty$,
$$r(n)\big(\hat\theta_n-\theta\big)\xrightarrow{d}\Lambda\,\Xi.\qquad(5.29)$$
(ii) Assume that the finite dimensional distributions of $Z_{\xi,n}(\cdot)$ converge to a Hermite process $H_m^H(\cdot)$ with Hurst parameter $H=0.5+d$. Then, as $n\to\infty$,
$$r(n)\big(\hat\theta_n-\theta\big)\xrightarrow{d}\Lambda\,\Xi_m.\qquad(5.30)$$
Proof of theorem 5.2.8 and theorem 5.2.9. We only prove (5.27) explicitly, as (5.28), (5.29) and (5.30) follow by similar reasoning. By theorem 5.2.3, it suffices to show that
$$r(n)\,\big(Q_{n+}'W_nQ_{n+}\big)^{-1}Q_{n+}'W_ne_n\xrightarrow{d}\Lambda\,\Xi.$$
By definition, $n\big(Q_{n+}'W_nQ_{n+}\big)^{-1}\to\Lambda$, so the claim follows if we can show that
$$\frac{r(n)}{n}\,Q_{n+}'W_ne_n\xrightarrow{d}\Xi,$$
i.e. if we can show that
$$\frac{r(n)}{n}\,\big\langle\alpha,Q_{n+}'W_ne_n\big\rangle\to\langle\alpha,\Xi\rangle\quad\forall\,\alpha\in\mathbb{R}^{\nu+l}.\qquad(5.31)$$
But this is a consequence of section 3.2. By theorem 3.2.14,
$$\frac{r(n)}{n}\,\big\langle\alpha,Q_{n+}'W_ne_n\big\rangle\to\int\mu_{(\alpha+)}(s)\,dB_H(s)$$
with $\mu_{(\alpha+)}(t):=\sum_{j=1}^{\nu+l}\alpha_j\,\partial_{j+}\mu(t)$, which proves the claim.
5.2.10. The covariance matrix $\Sigma_0$ of $\Xi$ follows immediately from the isometric property of the Wiener integral established in section 3.1. It can be expressed in terms of fractional integrals, see paragraph 3.2.4, as
$$\Sigma_0=\frac{\Gamma(d+1)^2}{c_1^2(d)}\left(\int_{\mathbb{R}}\big(I_-^d(w\,\partial_{j+}\mu)\big)(s)\,\big(I_-^d(w\,\partial_{k+}\mu)\big)(s)\,ds\right)_{j,k=1,\dots,\nu+l}.\qquad(5.32)$$
Hence, under the assumptions of theorem 5.2.8 (i), and for $w=1$ under the assumptions of theorem 5.2.9 (i), we have, as $n\to\infty$,
$$r(n)\big(\hat\theta_n^w-\theta\big)\xrightarrow{d}N(0,\Lambda\Sigma_0\Lambda).\qquad(5.33)$$
Using a computer algebra system, it is possible to obtain closed-form formulas for the entries of the covariance matrix $\Lambda\Sigma_0\Lambda$ in the case $w=1$. Consider, for instance, the variance of $\hat\eta$ for a cubic spline with an unknown knot $\eta$. Denote by $P^\perp_{M_n(\hat\eta)}$ the projection matrix onto the subspace that is orthogonal to the columns of $M_n(\hat\eta)$. The first four columns of $M_n(\hat\eta)$ correspond to the basis functions $g_j(x)=x^{j-1}$ $(j=1,\dots,4)$; the last column corresponds to $g_5(x)=(x-\hat\eta)_+^3$. Since minimisation of
$$\big\|X-P_{M_n(\tilde\eta)}(X)\big\|^2=\Big\|P^\perp_{M_n(\tilde\eta)}\Big(e_n+\sum_{j=1}^4 a_j\,m_n^j(\eta)+a_5\,m_n^5(\eta)\Big)\Big\|^2$$
with respect to $\tilde\eta$ is equivalent to minimising
$$\Big\|P^\perp_{M_n(\tilde\eta)}\Big(\frac{e_n}{a_5}+m_n^5(\eta)\Big)\Big\|^2,$$
$\hat\eta$ depends on the parameters $(a_1,\dots,a_5,\sigma_\xi^2)$ via the ratio $\sigma_\xi/a_5$ only. If $\xi$ is multiplied by a constant $c$, the asymptotic variance of $\hat\eta$ is multiplied by the factor $c^2$. Hence, the asymptotic variance is of the form
$$\operatorname{Var}_{\mathrm{asym}}(\hat\eta)=\frac{\sigma_\xi^2}{a_5^2}\,f(\alpha,\eta),$$
where $f$ is a nonlinear function of $\eta$ and $\alpha$. Here, $\sigma_\xi^2/a_5^2$ can be regarded as a noise-to-signal ratio. After some transformations in Maple (see programs 11.1.1, 11.1.2 and 11.1.3), the closed form of the asymptotic variance of $\hat\eta$, normalised by $D^2(n)$, reads for $\alpha\in(0,1)$ as
$$\frac{100(1-\alpha)(2-\alpha)}{a_5^2\,(1-\alpha)_8}\sum_{j=1}^{12}V_j(\alpha,\eta)$$
with $(1-\alpha)_k=(k-\alpha)(k-1-\alpha)\cdots(1-\alpha)$ and
$$V_1(\alpha,\eta):=\frac{\alpha^6}{\eta^5(1-\eta)^5}$$
$$V_2(\alpha,\eta):=-3\alpha^5\,\frac{7\eta^2-7\eta+3}{\eta^6(1-\eta)^6}$$
$$V_3(\alpha,\eta):=\alpha^4\left(\frac{12}{\eta^{4+\alpha}(1-\eta)^7}+\frac{12}{\eta^7(1-\eta)^{4+\alpha}}\right)$$
$$V_4(\alpha,\eta):=\alpha^4\,\frac{145\eta^4-290\eta^3+292\eta^2-147\eta+30}{\eta^7(1-\eta)^7}$$
$$V_5(\alpha,\eta):=-3\alpha^3\,\frac{97\eta^6-291\eta^5+562\eta^4-639\eta^3+395\eta^2-124\eta+14}{\eta^8(1-\eta)^8}$$
$$V_6(\alpha,\eta):=6\alpha^3\left(\frac{12\eta^2+8\eta+7}{\eta^8(1-\eta)^{4+\alpha}}+\frac{12\eta^2-32\eta+27}{\eta^{4+\alpha}(1-\eta)^8}\right)$$
$$V_7(\alpha,\eta):=-\alpha^2\,\frac{506\eta^6-1518\eta^5-207\eta^4+2944\eta^3-3135\eta^2+1410\eta-252}{\eta^8(1-\eta)^8}$$
$$V_8(\alpha,\eta):=-12\alpha^2\left(\frac{(2\eta+3)(6\eta^2-3\eta+7)}{\eta^8(1-\eta)^{4+\alpha}}-\frac{(2\eta-5)(6\eta^2-9\eta+10)}{\eta^{4+\alpha}(1-\eta)^8}\right)$$
$$V_9(\alpha,\eta):=6\alpha\,\frac{280\eta^6-840\eta^5+665\eta^4+70\eta^3-521\eta^2+346\eta-77}{\eta^8(1-\eta)^8}$$
$$V_{10}(\alpha,\eta):=6\alpha\left(\frac{24\eta^3-72\eta^2+4\eta+77}{\eta^8(1-\eta)^{4+\alpha}}-\frac{24\eta^3-68\eta-33}{\eta^{4+\alpha}(1-\eta)^8}\right)$$
$$V_{11}(\alpha,\eta):=252\left(\frac{(2\eta-1)(2\eta^2+2\eta+1)}{\eta^8(1-\eta)^{4+\alpha}}-\frac{(2\eta-1)(2\eta^2-6\eta+5)}{\eta^{4+\alpha}(1-\eta)^8}\right)$$
$$V_{12}(\alpha,\eta):=252\,\frac{(2\eta-1)^2}{\eta^8(1-\eta)^8}$$
For fixed $\alpha$, each of the terms $V_j(\alpha,\eta)$ is symmetric about $\eta=0.5$. Hence, the asymptotic variance is symmetric about $\eta=0.5$. Moreover, it has a pole at $\eta=0$ of the form
$$f(\alpha,\eta)\sim\frac{1}{\eta^{4+\alpha}}\cdot\frac{1200(6-2\alpha^2+3\alpha)(1-\alpha)(2-\alpha)}{a_5^2\,42\,(8-\alpha)(6-\alpha)(1-\alpha)_4}.$$
The asymptotic variance of $\hat a_5$ is given by
$$g(\alpha,\eta):=\frac{392(1-\alpha)(2-\alpha)}{2\,(1-\alpha)_8}\sum_{j=1}^{11}W_j$$
with
$$W_1:=-\alpha^6\,\frac{25\eta^2-25\eta+6}{\eta^7(1-\eta)^7}-3\alpha^5\,\frac{175\eta^4-350\eta^3+283\eta^2+16}{\eta^8(1-\eta)^8}$$
$$W_2:=-\alpha^4\,\frac{3625\eta^6-10875\eta^5+15129\eta^4-12133\eta^3+5712\eta^2-1458\eta+156}{\eta^9(1-\eta)^9}$$
$$W_3:=-30\alpha^4\left(\frac{(2\eta-1)(5\eta-3)}{\eta^{6+\alpha}(1-\eta)^9}+\frac{(2\eta-1)(5\eta-2)}{\eta^9(1-\eta)^{6+\alpha}}\right)$$
$$W_4:=\alpha^3\,\frac{-7275\eta^8+29100\eta^7-65214\eta^6+93792\eta^5-84975\eta^4+47580\eta^3-15888\eta^2+2880\eta-216}{\eta^{10}(1-\eta)^{10}}$$
$$W_5:=24\alpha^3\left(\frac{75\eta^4-290\eta^3+430\eta^2-260\eta+54}{\eta^{6+\alpha}(1-\eta)^{10}}+\frac{75\eta^4-10\eta^3+10\eta^2-30\eta+9}{\eta^{10}(1-\eta)^{6+\alpha}}\right)$$
$$W_6:=-2\alpha^2\,\frac{6325\eta^8-25300\eta^7+17076\eta^6+37322\eta^5-78176\eta^4+64632\eta^3-28533\eta^2+6654\eta-648}{\eta^{10}(1-\eta)^{10}}$$
$$W_7:=6\alpha^2\left(\frac{600\eta^5-3300\eta^4+6970\eta^3-8115\eta^2+4562\eta-933}{\eta^{6+\alpha}(1-\eta)^{10}}-\frac{600\eta^5-300\eta^4+230\eta^3-1005\eta^2+958\eta-216}{\eta^{10}(1-\eta)^{6+\alpha}}\right)$$
$$W_8:=24\alpha\,\frac{1750\eta^8-7000\eta^7+9765\eta^6-4795\eta^5-2845\eta^4+5515\eta^3-3311\eta^2+921\eta-99}{\eta^{10}(1-\eta)^{10}}$$
$$W_9:=-24\alpha\left(\frac{300\eta^5-900\eta^4+1135\eta^3-1385\eta^2+994\eta-243}{\eta^{6+\alpha}(1-\eta)^{10}}-\frac{300\eta^5+600\eta^4-535\eta^3-380\eta^2+471\eta-99}{\eta^{10}(1-\eta)^{6+\alpha}}\right)$$
$$W_{10}:=144\,\frac{(5\eta^2-5\eta+1)(35\eta^2-35\eta+9)}{\eta^{10}(1-\eta)^{10}}$$
$$W_{11}:=-144\left(\frac{(5\eta^2-5\eta+1)(30\eta^3-105\eta^2+119\eta-35)}{\eta^{6+\alpha}(1-\eta)^{10}}-\frac{(5\eta^2-5\eta+1)(30\eta^3+15\eta^2-\eta-9)}{\eta^{10}(1-\eta)^{6+\alpha}}\right)$$
As in the case of $\hat\eta$, the asymptotic variance is symmetric about $\eta=0.5$ for fixed $\alpha$ and has a pole at $\eta=0$ of the form
$$g(\alpha,\eta)\sim\frac{1}{\eta^{6+\alpha}}\cdot\frac{7056(8-5\alpha^2+12\alpha)(1-\alpha)(2-\alpha)}{2\,(8-\alpha)(6-\alpha)(1-\alpha)_4}.$$
As can be seen in figure 10, not all components of the asymptotic covariance matrix are symmetric. For a complete numerical implementation, see section 6.
Figure 10: Asymptotic variances of $\hat\eta$, $\hat a_5$ and $\hat a_4$, and the asymptotic covariance of $\hat a_4$ and $\hat a_5$, plotted as surfaces over the true knot (0.3 to 0.7) and $d$ ($-0.2$ to $0.2$).