
From the document Nonparametric Transformation Models (pages 117–125)

\[
= \frac{\frac{A+B\tilde{\lambda}}{A+B\lambda_1}\exp\left(-B\int_{\tilde{y}}^{y_1}\frac{1}{\lambda(u)}\,du\right)\bigl(A+Bh(y)\bigr)-A}{B}
\]

for all y ∈ [ỹ, ∞), where the last equation follows from (3.11). To fulfil the previous scale constraint h̃(y1) = λ1, it is required that

\[
\tilde{\lambda} = \frac{(A+B\lambda_1)\exp\left(B\int_{\tilde{y}}^{y_1}\frac{1}{\lambda(u)}\,du\right)-A}{B}.
\]

Since this in turn results in h̃(y) = h(y) for all y ∈ [ỹ, ∞), h is identified for all y ∈ [ỹ, ∞).

Choosing ỹ arbitrarily close to y0 results in

\[
h(y) = \frac{(A+B\lambda_1)\exp\left(-B\int_{y_1}^{y}\frac{1}{\lambda(u)}\,du\right)-A}{B} \quad \text{for all } y > y_0.
\]

When proceeding analogously for y < y0 with the initial condition h(y2) = λ0 for some λ0 < −A/B, one has

\[
h(y) = \frac{(A+B\lambda_0)\exp\left(-B\int_{y_2}^{y}\frac{1}{\lambda(u)}\,du\right)-A}{B} \quad \text{for all } y < y_0.
\]
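The branch formula can be sanity-checked numerically. The sketch below verifies that the y > y0 expression satisfies the differential equation h′(y) = −(A + Bh(y))/λ(y) from Section 3.2 together with the constraint h(y1) = λ1; all concrete choices of λ, A, B, y1 and λ1 are illustrative, not from the text.

```python
# Numerical sanity check (toy example): the branch formula
#   h(y) = ((A + B*lam1) * exp(-B * int_{y1}^{y} 1/lambda(u) du) - A) / B
# should solve the ODE h'(y) = -(A + B*h(y)) / lambda(y) with h(y1) = lam1.
# The choices of lambda, A, B, y1, lam1 below are illustrative only.
import math

A, B = 0.3, 1.2
lam1, y1 = 1.0, 2.0
lam = lambda u: -(1.0 + u * u)          # continuous and non-vanishing

def integral(a, b, n=4000):
    """Trapezoidal approximation of int_a^b 1/lambda(u) du."""
    dt = (b - a) / n
    if dt == 0.0:
        return 0.0
    s = 0.5 * (1.0 / lam(a) + 1.0 / lam(b))
    s += sum(1.0 / lam(a + i * dt) for i in range(1, n))
    return s * dt

def h(y):
    return ((A + B * lam1) * math.exp(-B * integral(y1, y)) - A) / B

assert abs(h(y1) - lam1) < 1e-9         # scale constraint h(y1) = lam1
for y in (1.5, 2.5, 3.0):               # ODE residual at a few points
    eps = 1e-4
    deriv = (h(y + eps) - h(y - eps)) / (2 * eps)
    assert abs(deriv + (A + B * h(y)) / lam(y)) < 1e-5
print("branch formula solves the ODE on the tested grid")
```

With λ < 0 and A + Bλ1 > 0 the derivative is strictly positive, so the toy h is increasing, matching the requirements of the text.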

Recall h′(y) > 0 for all y ∈ ℝ and let t > 0. Due to the continuous differentiability of h in y0 one has

\[
\lim_{t\to 0}\frac{\frac{h(y_0+t)-h(y_0)}{t}}{\frac{h(y_0-t)-h(y_0)}{-t}} = 1.
\]

On the other hand, it holds that

\[
\frac{\frac{h(y_0+t)-h(y_0)}{t}}{\frac{h(y_0-t)-h(y_0)}{-t}}
= -\frac{(A+B\lambda_1)\exp\left(-B\int_{y_1}^{y_0+t}\frac{1}{\lambda(u)}\,du\right)}{(A+B\lambda_0)\exp\left(-B\int_{y_2}^{y_0-t}\frac{1}{\lambda(u)}\,du\right)}
= -\frac{A+B\lambda_1}{A+B\lambda_0}\exp\left(B\left(\int_{y_2}^{y_0-t}\frac{1}{\lambda(u)}\,du-\int_{y_1}^{y_0+t}\frac{1}{\lambda(u)}\,du\right)\right),
\]

so that

\[
\lambda_0 = -\frac{\lim_{t\to 0}(A+B\lambda_1)\exp\left(B\left(\int_{y_2}^{y_0-t}\frac{1}{\lambda(u)}\,du-\int_{y_1}^{y_0+t}\frac{1}{\lambda(u)}\,du\right)\right)+A}{B} = \lambda_2.
\]

This leads to the uniqueness of the solution (3.15). Inserting A = 0 yields the second part of the assertion.

3.6 Miscellaneous

In this section, all results, proofs and remarks are collected that are not directly necessary to follow the main thread of the chapter, but are nevertheless important and interesting additions that complete a comprehensive discussion of the topic.

3.6.1 Bounded Support of fε

In the previous discussion, the density fε of the error term was always assumed to be greater than zero on the whole real line. In this subsection, this assumption is relaxed and it is shown how the derived approaches can be adapted to this case. It is assumed that h′ > 0, although it is conjectured that the ideas presented here can be extended to general monotonic transformations h by arguments similar to those in Section 3.6.3 below.

Identifiability of model (3.1) on compact intervals is considered first. Hence, let Y ⊆ ℝ be a compact interval. The main drawback when allowing one-sided or even bounded support of fε consists in the fact that in general the set

\[
A_{\mathcal{Y}} = \left\{x : \frac{\partial F_{Y|X}(y|x)}{\partial y} > 0 \text{ for all } y \in \mathcal{Y}\right\},
\]

which is defined similarly to equation (3.4), no longer contains every x ∈ ℝ^{d_X}.

Example 3.6.1 Consider the following model

\[
Y = X + 1 + \varepsilon
\]

with one-sided error ε = η − 1 and η ∼ Exp(1). For the choice Y = [0, 1] one has A_Y = (−∞, 0).
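The set A_Y in Example 3.6.1 can be verified numerically: Y | X = x is distributed as x + Exp(1), so its conditional density is e^{−(y−x)} for y > x and zero otherwise, which is positive on all of [0, 1] exactly when x < 0. A minimal check (the grid resolution and test points are illustrative):

```python
# Example 3.6.1: Y = X + 1 + eps with eps = eta - 1 and eta ~ Exp(1),
# so Y | X = x has conditional density f(y|x) = exp(-(y - x)) for y > x.
# A_Y for Y = [0, 1] collects all x with f(y|x) > 0 for every y in [0, 1].
import math

def cond_density(y, x):
    """Conditional density of Y given X = x."""
    return math.exp(-(y - x)) if y > x else 0.0

def in_A_Y(x, grid_n=1001):
    """Check positivity of the conditional density on a grid over [0, 1]."""
    ys = [i / (grid_n - 1) for i in range(grid_n)]
    return all(cond_density(y, x) > 0 for y in ys)

print(in_A_Y(-0.5))   # x < 0 lies in A_Y = (-inf, 0)
print(in_A_Y(0.5))    # density vanishes for y <= 0.5, so x is not in A_Y
```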

Chiappori et al. (2015) introduced an assumption similar to A_Y ≠ ∅, where Y is equal to the support of Y. Although they considered the partial derivative with respect to an appropriate x_i instead of y, the underlying problem remains the same. As in Example 3.6.1, the weighting with respect to x has to be restricted to A_Y. Although not a big problem from an identification perspective, this issue becomes crucial when estimating, since neither g nor A_Y are known a priori. From an identification point of view, this “weighting” can even be implemented using a Dirac measure.

The argumentation becomes even more complicated when considering an error term with support bounded from both sides or a heteroscedastic model, but at least the following corollary can be stated.

Corollary 3.6.2 Assume the support Y of Y is an interval and can be partitioned into countably many bounded subintervals (Y_n)_{n∈ℕ} such that

\[
A_{\mathcal{Y}_n} \neq \emptyset \quad \text{and} \quad \max\{y : y \in \mathcal{Y}_n\} = \min\{y : y \in \mathcal{Y}_{n+1}\} \quad \text{for all } n \in \mathbb{N}.
\]

Then, the transformation function h from model (3.1) is identified on Y via h(y1) = λ1 and h(y2) = λ2 for arbitrary y1, y2 ∈ Y, λ1 < λ2.

Proof: Assume the existence of a root y0 of λ and let y0 ∈ Y_m for some m ∈ ℕ. If such a root does not exist, one can proceed as later in Subsection 3.6.2. The basic reasoning in this case would be the same, so that only the first case with a root is considered in the remaining proof.

Let y_{1,m} and y_{2,m} be the lower and upper bound of the interval Y_m. For any λ_{1,m} < λ_{2,m}, identification of h on Y_m via h(y_{1,m}) = λ_{1,m} and h(y_{2,m}) = λ_{2,m} can be shown as before in Section 3.2.

Now look at the two consecutive intervals Y_m and Y_{m+1}, that is, y_{2,m} = y_{1,m+1}. Since h is continuously differentiable in y_{2,m}, the limits

\[
\lim_{y \to y_{2,m}} h(y) \quad \text{and} \quad \lim_{y \to y_{2,m}} h'(y)
\]

exist and in particular have to be independent of any sequence y_l → y_{2,m} as l → ∞. Therefore, the location and scale constraints for Y_{m+1} are determined by the continuous differentiability of h. Hence, h is identified on Y_{m+1} as well. After proceeding analogously for all previous and subsequent intervals, one obtains a version of h which only depends on the chosen values of λ_{1,m} and λ_{2,m} from the first step. Due to the continuous differentiability of h, this is even the case if there are accumulation points in the sequence (y_{2,n})_{n∈ℕ}. The two constants λ_{1,m} and λ_{2,m} are directly linked to the global location and scale constraints of h and are uniquely determined by h(y1) = λ1 and h(y2) = λ2.

Once a set Y and an appropriate weighting function v have been chosen, the conditions necessary to ensure identifiability of the model are thus the same as in the case with unbounded support.

3.6.2 The Case without a Root y0

Again, the notations from Section 3.2 are used, that is, y0 is defined as the root of the map y ↦ A + Bh(y) with A and B from (3.7). Recall model (3.1). B is assumed to be different from zero, so that the only possibility for a root y0 of λ not to exist is the case where the image of h, and consequently the support of fε, is bounded from at least one side. Therefore, this is a special case of Subsection 3.6.1, but it has not been treated in detail there.

Identifiability of model (3.1) on compact intervals is considered in the following. Hence, let Y be an interval and let v be a weighting function such that supp(v) ⊆ A_Y ≠ ∅ and h(y) ≠ 0 for all y ∈ Y. Introduce the location and scale constraints

\[
h(y_1) = \lambda_1 \quad \text{and} \quad h(y_2) = \lambda_2
\]

for arbitrary values y1 < y2 ∈ Y and λ1 < λ2 ∈ ℝ. Then, the corresponding solution to (3.8) on Y is given by

\[
h(y) = (\lambda_2-\lambda_1)\,\frac{\exp\left(-B\int_{y_1}^{y}\frac{1}{\lambda(u)}\,du\right)-1}{\exp\left(-B\int_{y_1}^{y_2}\frac{1}{\lambda(u)}\,du\right)-1} + \lambda_1, \quad y \in \mathcal{Y}.
\]

Indeed, the transformation function expressed in this way fulfils the differential equation as well as the boundary constraints. Uniqueness of h can be shown as in Section 3.2. Since h(y1) < h(y2) and the map y ↦ exp(−B∫_{y1}^{y} 1/λ(u) du) is monotone, h is increasing as required. Estimating the transformation function would later become easier insofar as there is no longer any need to estimate y0 and λ2 as in Chapter 4.
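The closed form above can be checked numerically; the choices of λ, B, y1, y2, λ1 and λ2 in the sketch below are illustrative, not from the text:

```python
# Numerical check of the closed-form solution on an interval without a
# root y0: boundary constraints h(y1) = lam1, h(y2) = lam2 and
# monotonicity.  All concrete choices below are illustrative.
import math

B = 1.0
y1, y2 = 0.0, 1.0
lam1, lam2 = 1.0, 2.0
lam = lambda u: -(1.0 + u * u)          # continuous and non-vanishing

def Phi(y, n=4000):
    """Trapezoidal approximation of int_{y1}^{y} 1/lambda(u) du."""
    dt = (y - y1) / n
    if dt == 0.0:
        return 0.0
    s = 0.5 * (1.0 / lam(y1) + 1.0 / lam(y))
    s += sum(1.0 / lam(y1 + i * dt) for i in range(1, n))
    return s * dt

def h(y):
    num = math.exp(-B * Phi(y)) - 1.0
    den = math.exp(-B * Phi(y2)) - 1.0
    return (lam2 - lam1) * num / den + lam1

assert abs(h(y1) - lam1) < 1e-9          # location constraint
assert abs(h(y2) - lam2) < 1e-9          # scale constraint
vals = [h(y1 + k * (y2 - y1) / 50) for k in range(51)]
assert all(a < b for a, b in zip(vals, vals[1:]))   # h is increasing
print("boundary constraints and monotonicity verified")
```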

3.6.3 Vanishing Derivatives of h

Up to now, it was assumed that the derivative of h is positive. Keeping the assumption of monotonicity, one potential generalisation consists in allowing h′(y) = 0 at least for some real numbers y ∈ ℝ. In the following, another reasoning for identifying B is presented which does not require h′(y0) > 0. For this purpose, define for any α ∈ ℝ\{0} the function

\[
\varphi_{\alpha} : \mathbb{R} \to \mathbb{R}, \quad y \mapsto \operatorname{sign}(y)|y|^{\alpha}.
\]

Lemma 3.6.3 Let X, ε, g and σ be as in assumptions (A1)–(A3) and (A7) from Section 3.4. Further, let ε̃ be a centred random variable independent of X with Var(ε̃) = 1. Let g̃ and σ̃ be functions such that for some α > 0 with E[φ_α(g(X) + σ(X)ε)²] < ∞ one has

\[
\varphi_{\alpha}(g(X)+\sigma(X)\varepsilon) = \tilde{g}(X) + \tilde{\sigma}(X)\tilde{\varepsilon} \quad \text{almost surely.}
\]

Then, it holds that α = 1.

Proof: According to assumption (A7), assume w.l.o.g. that x ↦ σ(x)/g(x) is not almost surely constant on M_{>c} for an appropriate c > 0 and that P(X ∈ M_{>c}) > 0 holds (the other case can be treated analogously). Let M_X ⊆ M_{>c} be a bounded subset such that

(i) x ↦ σ(x)/g(x) is not almost surely constant on M_X,

(ii) P(X ∈ M_X) > 0,

(iii) sup_{x ∈ M_X} σ(x)/g(x) < ∞.

Consequently, there exists some δ > 0 such that

\[
\inf_{x \in M_X,\, e \in (-\delta,\delta)} g(x) + \sigma(x)e > 0 \quad \text{and} \quad \sup_{x \in M_X,\, e \in (-\delta,\delta)} \frac{\sigma(x)e}{g(x)} < 1.
\]

The generalized Binomial Theorem provides

\[
I_{M_X}(X)I_{(-\delta,\delta)}(\varepsilon)\,(g(X)+\sigma(X)\varepsilon)^{\alpha} = I_{M_X}(X)I_{(-\delta,\delta)}(\varepsilon)\,g(X)^{\alpha}\sum_{k=0}^{\infty}\binom{\alpha}{k}\left(\frac{\sigma(X)}{g(X)}\right)^{k}\varepsilon^{k}.
\]

Since the conditional expectation, the conditional variance and the remaining error term can be written as

\[
\tilde{g}(X) = E\left[\varphi_{\alpha}(g(X)+\sigma(X)\varepsilon)\mid X\right], \quad \tilde{\sigma}^{2}(X) = \operatorname{Var}\left(\varphi_{\alpha}(g(X)+\sigma(X)\varepsilon)\mid X\right)
\]

and

\[
\tilde{\varepsilon} = \frac{\varphi_{\alpha}(g(X)+\sigma(X)\varepsilon)-\tilde{g}(X)}{\tilde{\sigma}(X)},
\]

it holds that

\[
\begin{aligned}
I_{M_X}(X)I_{(-\delta,\delta)}(\varepsilon)\,\tilde{\varepsilon} &= I_{M_X}(X)I_{(-\delta,\delta)}(\varepsilon)\,\frac{\varphi_{\alpha}(g(X)+\sigma(X)\varepsilon)-\tilde{g}(X)}{\tilde{\sigma}(X)}\\
&= I_{M_X}(X)I_{(-\delta,\delta)}(\varepsilon)\,\frac{\sum_{k=0}^{\infty}\binom{\alpha}{k}\sigma(X)^{k}g(X)^{\alpha-k}\varepsilon^{k}-\tilde{g}(X)}{\tilde{\sigma}(X)}\\
&= I_{M_X}(X)I_{(-\delta,\delta)}(\varepsilon)\,\underbrace{\left(\frac{g(X)^{\alpha}-\tilde{g}(X)}{\tilde{\sigma}(X)}+\sum_{k=1}^{\infty}\binom{\alpha}{k}\frac{\sigma(X)^{k}g(X)^{\alpha-k}}{\tilde{\sigma}(X)}\,\varepsilon^{k}\right)}_{\text{independent of } X \in M_X}.
\end{aligned}
\]

This can alternatively be expressed as

\[
I_{M_X}(X)I_{(-\delta,\delta)}(\varepsilon)\,\tilde{\varepsilon} = I_{M_X}(X)I_{(-\delta,\delta)}(\varepsilon)\sum_{k=0}^{\infty}\beta_{k}(X)\varepsilon^{k}
\]

with

\[
\beta_{0}(X) = \frac{g(X)^{\alpha}-\tilde{g}(X)}{\tilde{\sigma}(X)} \quad \text{and} \quad \beta_{k}(X) = \binom{\alpha}{k}\frac{\sigma(X)^{k}g(X)^{\alpha-k}}{\tilde{\sigma}(X)} \quad \text{for all } k \geq 1.
\]

Due to the independence of ε̃ and X, the coefficients β_k, k ≥ 0, are not allowed to depend on X. Since σ(X)/g(X) depends on X by assumption, the expression β_k(X) can be independent of X for at most one k ∈ ℕ₀. Thus, the binomial coefficient of α over k is non-zero for at most one k. Due to α > 0 one has α = 1.
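The series expansion at the heart of the proof is the generalized Binomial Theorem, (g + σe)^α = g^α Σ_{k≥0} C(α, k)(σe/g)^k for |σe/g| < 1. A short numerical illustration (all concrete values are hypothetical):

```python
# Generalized Binomial Theorem used in the proof of Lemma 3.6.3:
#   (g + s*e)^alpha = g^alpha * sum_k C(alpha, k) * (s*e/g)^k,  |s*e/g| < 1.
# The values of alpha, g, s, e below are illustrative, not from the text.
import math

def binom(alpha, k):
    """Generalized binomial coefficient C(alpha, k) for real alpha."""
    out = 1.0
    for j in range(k):
        out *= (alpha - j) / (j + 1)
    return out

alpha, g, s, e = 0.7, 2.0, 0.5, 0.8     # |s*e/g| = 0.2 < 1
series = g ** alpha * sum(binom(alpha, k) * (s * e / g) ** k
                          for k in range(60))
assert abs(series - (g + s * e) ** alpha) < 1e-12

# for alpha = 1 every coefficient with k >= 2 vanishes, which is the
# degenerate case singled out by the conclusion of the lemma
assert binom(1.0, 2) == 0.0 and binom(1.0, 5) == 0.0
print("series expansion matches the closed form")
```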

It is conjectured that similar techniques to the proof above can be used to identify model (3.1) without assuming h′ > 0. Nevertheless, the following conjecture has not been completely proven so far, so that only a sketch of a possible proof is given.

Conjecture 3.6.4 Assume (A1)–(A3), (A5)–(A7) from Section 3.4 as well as conditions (3.9) for λ1 = 1 and (3.12). Let the conditional distribution function F_{Y|X}(y|x) be continuously differentiable with respect to y and x and assume h′(y) > 0 for all y ≠ y0. Then, the unique solution to the model equation (3.1) is given by

\[
h(y) = \begin{cases}
\exp\left(-B\int_{y_1}^{y}\frac{1}{\lambda(u)}\,du\right) & y > y_0\\[2pt]
0 & y = y_0\\[2pt]
\lambda_2\exp\left(-B\int_{y_2}^{y}\frac{1}{\lambda(u)}\,du\right) & y < y_0,
\end{cases}
\]

where it is set 1/0 := ∞ as well as 1/∞ := 0 and λ2 is uniquely determined. Further, one has g(x) = E[h(Y)|X = x] and σ(x) = √(Var(h(Y)|X = x)).

Sketch of the Proof: The case h′(y0) > 0 has been considered in Theorem 3.2.4, so that h′(y0) = 0 is assumed. Moreover, assume w.l.o.g. B > 0. For B = 0 the approach of Chiappori et al. (2015) can be adjusted in the same way as follows.

Let h and h̃ fulfil model (3.1) with (3.9) for λ1 = 1 and (3.12), that is, h and h̃ are solutions to the differential equations

\[
h'(y) = -\frac{A+Bh(y)}{\lambda(y)} \quad \text{and} \quad \tilde{h}'(y) = -\frac{\tilde{A}+\tilde{B}\tilde{h}(y)}{\lambda(y)}
\]

for some B, B̃ > 0, A, Ã ∈ ℝ and all y ≠ y0. By the same reasoning as before, h and h̃ can be written as

\[
h(y) = \begin{cases}
\exp\left(-B\int_{y_1}^{y}\frac{1}{\lambda(u)}\,du\right) & y > y_0\\[2pt]
0 & y = y_0\\[2pt]
\lambda_2\exp\left(-B\int_{y_2}^{y}\frac{1}{\lambda(u)}\,du\right) & y < y_0
\end{cases}
\]

and

\[
\tilde{h}(y) = \begin{cases}
\exp\left(-\tilde{B}\int_{y_1}^{y}\frac{1}{\lambda(u)}\,du\right) & y > y_0\\[2pt]
0 & y = y_0\\[2pt]
\tilde{\lambda}_2\exp\left(-\tilde{B}\int_{\tilde{y}_2}^{y}\frac{1}{\lambda(u)}\,du\right) & y < y_0,
\end{cases}
\]

where A = Ã = 0 is implied by the location constraint (3.12). Due to assumption (A3) one has

\[
\lim_{y\to-\infty} h(y) = -\infty \quad \text{and} \quad \lim_{y\to\infty} h(y) = \infty.
\]

Therefore, the transformation functions h and h̃ from above can be written as

\[
h(y) = \begin{cases}
\exp\left(-B\int_{y_1}^{y}\frac{1}{\lambda(u)}\,du\right) & y > y_0\\[2pt]
0 & y = y_0\\[2pt]
-\exp\left(-B\int_{y_3}^{y}\frac{1}{\lambda(u)}\,du\right) & y < y_0
\end{cases}
\tag{3.20}
\]

and

\[
\tilde{h}(y) = \begin{cases}
\exp\left(-\tilde{B}\int_{y_1}^{y}\frac{1}{\lambda(u)}\,du\right) & y > y_0\\[2pt]
0 & y = y_0\\[2pt]
-\exp\left(-\tilde{B}\int_{\tilde{y}_3}^{y}\frac{1}{\lambda(u)}\,du\right) & y < y_0
\end{cases}
\]

for some appropriate y3, ỹ3 < y0 (the negative sign of the lower branch is possible since λ2, λ̃2 < 0), which are uniquely determined by λ, B, λ2 and λ, B̃, λ̃2, respectively. These expressions in turn yield

\[
\varphi_{\tilde{B}/B}(h(y)) = \begin{cases}
\exp\left(-\tilde{B}\int_{y_1}^{y}\frac{1}{\lambda(u)}\,du\right) & y > y_0\\[2pt]
0 & y = y_0\\[2pt]
-\exp\left(-\tilde{B}\int_{y_3}^{y}\frac{1}{\lambda(u)}\,du\right) & y < y_0
\end{cases}
= \tilde{h}(y) + \tilde{h}(y)I_{\{y<y_0\}}\left(\exp\left(-\tilde{B}\int_{y_3}^{\tilde{y}_3}\frac{1}{\lambda(u)}\,du\right)-1\right),
\]

that is,

\[
\tilde{h}(y) = \varphi_{\tilde{B}/B}(h(y)) - \tilde{h}(y)I_{\{y<y_0\}}\left(\exp\left(-\tilde{B}\int_{y_3}^{\tilde{y}_3}\frac{1}{\lambda(u)}\,du\right)-1\right).
\]

It is conjectured that similar arguments to the proof of Lemma 3.6.3 lead to

\[
\exp\left(-\tilde{B}\int_{y_3}^{\tilde{y}_3}\frac{1}{\lambda(u)}\,du\right) = 1
\]

and thus to ỹ3 = y3. Finally, Lemma 3.6.3 ensures B̃ = B and consequently h̃ = h.
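The normalisation step behind (3.20), absorbing the negative constant λ2 into a shifted lower integration limit y3 with coefficient −1, can be sketched numerically. The toy choices λ(u) = −u on the negative half-line, y2 = −1, λ2 = −0.5 and B = 1 are illustrative assumptions, for which y3 = −2 can be computed analytically:

```python
# Renormalising the y < y0 branch: a negative constant lam2 in
#   lam2 * exp(-B * int_{y2}^{y} 1/lambda(u) du)
# can be absorbed into a shifted lower limit y3, giving
#   -exp(-B * int_{y3}^{y} 1/lambda(u) du).
# Toy choices (illustrative): lambda(u) = -u, y2 = -1, lam2 = -0.5, B = 1.
import math

B, y2, lam2 = 1.0, -1.0, -0.5
lam = lambda u: -u                      # positive on the negative half-line

def integral(a, b, n=8000):
    """Trapezoidal approximation of int_a^b 1/lambda(u) du."""
    dt = (b - a) / n
    if dt == 0.0:
        return 0.0
    s = 0.5 * (1.0 / lam(a) + 1.0 / lam(b))
    s += sum(1.0 / lam(a + i * dt) for i in range(1, n))
    return s * dt

def coeff(y3):
    # y3 is chosen such that lam2 * exp(-B * int_{y2}^{y3} du/lam) = -1
    return lam2 * math.exp(-B * integral(y2, y3))

lo, hi = -3.0, -1.0                     # coeff(-3) = -1.5, coeff(-1) = -0.5
for _ in range(60):                     # bisection for coeff(y3) = -1
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if coeff(mid) < -1.0 else (lo, mid)
y3 = 0.5 * (lo + hi)
assert abs(y3 - (-2.0)) < 1e-5          # matches the analytic value

for y in (-1.5, -0.8, -0.3):            # both representations agree
    a = lam2 * math.exp(-B * integral(y2, y))
    b = -math.exp(-B * integral(y3, y))
    assert abs(a - b) < 1e-5
print("unit-coefficient form reproduces the lam2-form on the test grid")
```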

It is conjectured that this result can be extended to a more general class of monotonic functions.

3.6.4 Uniqueness of Solutions to Ordinary Differential Equations

Finally, two basic results about ordinary differential equations and the uniqueness of possible solutions are given. Theorem 3.6.6 is slightly modified compared to the version of Forster (1999, p. 102), so that its proof is presented as well.

Lemma 3.6.5 (Gronwall’s Inequality, see Grönwall (1919) or Bellman (1953) for details) Let I = [a, b] ⊆ ℝ be a compact interval. Let u, v : I → ℝ and q : I → [0, ∞) be continuous functions. Further, let

\[
u(y) \leq v(y) + \int_{a}^{y} q(z)u(z)\,dz \quad \text{for all } y \in I.
\]

Then, one has

\[
u(y) \leq v(y) + \int_{a}^{y} v(z)q(z)\exp\left(\int_{z}^{y} q(t)\,dt\right)dz \quad \text{for all } y \in I.
\]
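Gronwall’s Inequality can be illustrated numerically with the illustrative choices u(y) = e^y, v ≡ 1 and q ≡ 1 on I = [0, 1]: the hypothesis u(y) = 1 + ∫_0^y u(z) dz holds with equality, and the bound also evaluates to e^y, so the inequality is sharp in this example.

```python
# Gronwall's Inequality on I = [0, 1] with u(y) = exp(y), v = 1, q = 1:
# the hypothesis u(y) <= v(y) + int_0^y q(z) u(z) dz holds with equality,
# and the Gronwall bound evaluates to exp(y) as well (sharp example).
# All concrete choices are illustrative.
import math

a = 0.0
u = math.exp
v = lambda y: 1.0
q = lambda y: 1.0

def trapz(f, lo, hi, n=1000):
    """Trapezoidal approximation of int_lo^hi f."""
    dt = (hi - lo) / n
    if dt == 0.0:
        return 0.0
    s = 0.5 * (f(lo) + f(hi)) + sum(f(lo + i * dt) for i in range(1, n))
    return s * dt

for y in (0.25, 0.5, 1.0):
    hypothesis_rhs = v(y) + trapz(lambda z: q(z) * u(z), a, y)
    bound = v(y) + trapz(
        lambda z: v(z) * q(z) * math.exp(trapz(q, z, y, 50)), a, y)
    assert u(y) <= hypothesis_rhs + 1e-9     # hypothesis of the lemma
    assert abs(bound - u(y)) < 1e-6          # bound equals exp(y) here
print("Gronwall bound is sharp for u(y) = exp(y)")
```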

Theorem 3.6.6 (see Forster (1999, p. 102) for a related version) Let b > a > y0 and G ⊆ (y0, ∞) × ℝ₊ be a set such that [a, b] × ℝ₊ ⊆ G. Moreover, let D : G → ℝ, (y, h) ↦ D(y, h), be continuous with respect to both components and continuously differentiable with respect to the second component. Then, for all θ0 > 0 any solution h ∈ C([a, b], ℝ₊) of the initial value problem

\[
h'(y) = D(y, h(y)), \quad h(a) = \theta_0
\]

is unique.

Proof: Let h1, h2 : [a, b] → ℝ₊ be two solutions of the mentioned initial value problem. Since

\[
K := \{(y, \theta) \in [a, b] \times \mathbb{R}_+ : y \in [a, b],\ \theta \in \{h_1(y), h_2(y)\}\}
\]

is compact, there exists some L > 0 such that |D(y, θ) − D(y, ψ)| ≤ L|θ − ψ| for all (y, θ), (y, ψ) ∈ K. Consider the distance d(y) := |h1(y) − h2(y)|. Then, for all y ∈ [a, b]

\[
\begin{aligned}
d(y) &= |h_1(y)-h_1(a)-(h_2(y)-h_2(a))|\\
&= \left|\int_{a}^{y} \bigl(D(z,h_1(z))-D(z,h_2(z))\bigr)\,dz\right|\\
&\leq \int_{a}^{y} |D(z,h_1(z))-D(z,h_2(z))|\,dz\\
&\leq L\int_{a}^{y} |h_1(z)-h_2(z)|\,dz\\
&= L\int_{a}^{y} d(z)\,dz.
\end{aligned}
\]

Gronwall’s Inequality leads to d ≤ 0 (set u = d, v ≡ 0, q ≡ L). Since d ≥ 0 by definition, it follows that d ≡ 0 and thus h1 = h2.

4 Nonparametric Estimation of the Transformation Function in a Heteroscedastic Model

After the identifiability of model (3.1) under conditions (3.9) and (3.12) was proven in the last chapter, the question arises how its components can be estimated appropriately. To the author’s knowledge, there is no estimation approach for a model as general as (3.1) so far.

To mention only some methods in the literature, Chiappori et al. (2015) provided an estimator for homoscedastic models, while Neumeyer et al. (2016) extended the ideas of Linton et al. (2008) to the case of heteroscedastic errors, but only for parametric transformation functions. In the context of a linear regression function, Horowitz (2009) discussed several approaches for a parametric/nonparametric transformation function and a parametric/nonparametric distribution function of the error term.

In the following, the analytical expressions of the model components in (3.1) are used to construct corresponding estimators in Section 4.1. Afterwards, the asymptotic behaviour of these estimators is examined in Section 4.2. In doing so, equation (3.17) and the ideas of Horowitz (1996) will play key roles in defining the estimators and deriving their asymptotic behaviour. Some simulations are conducted in Section 4.3 and the chapter is concluded by a short discussion in Section 4.4. The proofs can be found in Section 4.6.

Throughout this chapter, assume (A1)–(A7) from Section 3.4 as well as B > 0 (see Remark 3.2.1). Moreover, assume the location and scale constraints (3.9) and (3.12) for some y1 > y0 with λ1 = 1 and let (Yi, Xi), i = 1, ..., n, be independent and identically distributed observations from model (3.1).
