

Chapter 1 provides a description and summary of the asymptotic theory which forms the basis of this thesis. In the robustness context it is associated with the names of Bickel and Rieder; confer Bickel (1981) and Rieder (1994). Our presentation is based on Chapters 4 and 5 of Rieder (1994). It is restricted to the estimation of a finite-dimensional parameter in the one-sample i.i.d. case. More precisely, we assume a smoothly parameterized family

\[
\mathcal{P} = \{\, P_\theta \mid \theta \in \Theta \,\} \subset \mathcal{M}_1(\mathcal{A})
\]

of probability measures on some sample space (Ω, A), whose parameter space Θ is an open subset of some finite-dimensional R^k. At some fixed θ ∈ Θ this family P is assumed to be L2 differentiable,

\[
\Bigl\| \sqrt{dP_{\theta+t}} \;-\; \sqrt{dP_\theta}\,\bigl(1 + \tfrac{1}{2}\, t^{\tau} \Lambda_\theta\bigr) \Bigr\| = o(|t|)
\]

with L2 derivative Λθ ∈ L_2^k(Pθ) and Fisher information of full rank k,

\[
\mathcal{I}_\theta = \mathrm{E}_\theta\, \Lambda_\theta \Lambda_\theta^{\tau}
\]
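As a simple illustration (a standard example, not part of the general development), consider the one-dimensional normal location model Pθ = N(θ, σ²) with σ > 0 known; it is L2 differentiable at every θ ∈ R with

\[
\Lambda_\theta(x) = \frac{x - \theta}{\sigma^{2}}\,, \qquad
\mathcal{I}_\theta = \mathrm{E}_\theta\, \Lambda_\theta^{2} = \frac{1}{\sigma^{2}}
\]

that is, the L2 derivative coincides with the classical score function.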

For more details on L2, respectively Lr (r ≥ 1), differentiability we refer to Section 2.3 of Rieder (1994) and Section 1.8 of Witting (1985), respectively to Rieder and Ruckdeschel (2001).

In Section 1.1 we introduce the notion of (partial) square integrable influence curves (involving a matrix D ∈ R^{p×k} of full rank p ≤ k) and show that a necessary and sufficient condition for their existence is

\[
\exists\, A \in \mathbb{R}^{p \times k}\!: \quad D = A\, \mathcal{I}_\theta
\]
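For orientation, we recall the defining properties: a square integrable (partial) influence curve is, roughly speaking, a function ψθ ∈ L_2^p(Pθ) satisfying

\[
\mathrm{E}_\theta\, \psi_\theta = 0\,, \qquad \mathrm{E}_\theta\, \psi_\theta \Lambda_\theta^{\tau} = D
\]

and if D = A Iθ, then the classical partial scores ψθ = A Λθ (= D Iθ^{-1} Λθ for invertible Iθ) verify both conditions, so at least one such influence curve exists.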

Next, we introduce asymptotically linear estimators and derive the Cramér-Rao bound for this class of estimators.
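In the present setting this bound takes the following form (a sketch, suppressing regularity details): for any influence curve ψθ as above,

\[
\mathrm{Cov}_\theta(\psi_\theta) = \mathrm{E}_\theta\, \psi_\theta \psi_\theta^{\tau} \;\succeq\; D\, \mathcal{I}_\theta^{-1} D^{\tau}
\]

in the positive semidefinite ordering, with equality attained exactly by the classical partial scores ψθ = D Iθ^{-1} Λθ.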

The infinitesimal robust setup, which is based on neighborhoods about the ideal model P that are shrinking at a rate of √n, is presented in Section 1.2. Throughout this thesis we consider neighborhoods of contamination, total variation and, occasionally, of Kolmogorov type. That is, we omit the Hellinger and Cramér-von Mises neighborhoods treated in Rieder (1994).
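For concreteness, the contamination and total variation neighborhoods used throughout may be sketched as follows (radius r ∈ [0, ∞), sample size n large enough that r/√n ≤ 1):

\[
U_c\bigl(\theta, \tfrac{r}{\sqrt{n}}\bigr) = \Bigl\{ \bigl(1 - \tfrac{r}{\sqrt{n}}\bigr) P_\theta + \tfrac{r}{\sqrt{n}}\, Q \;\Bigm|\; Q \in \mathcal{M}_1(\mathcal{A}) \Bigr\}\,, \qquad
U_v\bigl(\theta, \tfrac{r}{\sqrt{n}}\bigr) = \Bigl\{ Q \in \mathcal{M}_1(\mathcal{A}) \;\Bigm|\; d_v(Q, P_\theta) \le \tfrac{r}{\sqrt{n}} \Bigr\}
\]

where d_v denotes the total variation distance; Kolmogorov neighborhoods are defined analogously in terms of the Kolmogorov distance.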

Subsequently, tangent classes for these neighborhoods are defined, and then simple perturbations along such tangents, instead of full neighborhoods, are considered.

As a consequence of Le Cam's third lemma, one gets the asymptotic normality of asymptotically linear estimators under such simple perturbations. Using quadratic loss, this leads to the asymptotic mean square error (MSE) problem stated in Subsection 1.3.1. This convex optimization problem involves certain bias terms (depending on the type of neighborhood) which can be calculated more or less explicitly; confer Subsection 1.3.2.
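In the case of contamination neighborhoods, for instance, this objective function may be sketched as

\[
\mathrm{maxMSE}_r(\psi_\theta) = \mathrm{E}_\theta\, |\psi_\theta|^{2} + r^{2}\, \omega_c^{2}(\psi_\theta)\,, \qquad
\omega_c(\psi_\theta) = \sup_x |\psi_\theta(x)|
\]

that is, it splits into the trace of the asymptotic covariance plus the squared standardized asymptotic bias, weighted by the squared (starting) radius r.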

The solution to this optimization problem has been derived in detail in Section 5.5 of Rieder (1994) using the Lagrange multiplier theorems developed in Appendix B (ibid.). To obtain the corresponding MSE solution, the problem of minimizing the trace of the asymptotic covariance subject to a bound on the various bias types is solved beforehand. Thus, we also give the solution (optimal influence curve) to this supplementary problem; confer Subsection 1.3.3. Additionally, the minimum asymptotic bias and the influence curve which achieves this minimum bias are specified. As for the original minimax MSE problem, the optimal influence curve is of the same form as in the case of the minimum trace problem with a suitable bias bound. This bound is determined by an additional implicit equation; confer Subsection 1.3.4. The MSE solution is always of main case form; confer Theorem 1.3.9 (a).
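To fix ideas, for contamination neighborhoods this main case form of the optimal influence curve reads (a sketch of the solution derived in Section 5.5 of Rieder (1994)):

\[
\tilde{\psi}_\theta = (A \Lambda_\theta - a)\, \min\Bigl\{ 1,\; \frac{b}{|A \Lambda_\theta - a|} \Bigr\}
\]

where the Lagrange multipliers A ∈ R^{p×k} and a ∈ R^p are determined by the side conditions Eθ ψ̃θ = 0 and Eθ ψ̃θ Λθ^τ = D, and the clipping bound b ∈ (0, ∞) either is prescribed (minimum trace problem) or solves the additional implicit equation mentioned above (MSE problem).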

Chapter 2 presents supplements to the asymptotic theory of robustness which have proved necessary for this thesis.

First, we show in Subsection 2.1.1 that the Lagrange multiplier A, which occurs in the optimal influence curves and is determined by an optimization problem using Lagrange multiplier arguments, has a statistical interpretation:

\[
\mathrm{minimaxMSE} = \mathrm{tr}\, A
\]

This identity extends the classical Cramér-Rao bound for quadratic loss and is remarkable since, in addition to variance, bias is involved.
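As a consistency check (a sketch of the boundary case only), for radius r = 0 and D = I_k the MSE optimal influence curve is the classical score ψθ = Iθ^{-1} Λθ with multiplier A = Iθ^{-1}, so the identity reduces to the familiar Cramér-Rao statement

\[
\mathrm{minimaxMSE}\big|_{r=0} = \mathrm{tr}\, \mathcal{I}_\theta^{-1} = \mathrm{tr}\, A
\]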

Next, we treat discrete models, which have rarely been considered in the robustness literature; confer Subsection 2.1.2. These models show peculiar aspects: under an additional “gap” condition, the MSE solution (always of main case form) in fact coincides with the minimum bias solution. This happens for radii r greater than some finite radius r̄ ∈ [0, ∞), the so-called lower case radius. Another phenomenon which has not been studied in the literature so far is the non-uniqueness of the Lagrange multipliers as part of the (unique) optimal influence curves.

In the remaining part of Section 2.1, we derive technical properties of the Lagrange multipliers contained in the MSE solution: boundedness (cf. Subsection 2.1.3), uniqueness (cf. Subsection 2.1.4) and continuity (cf. Subsection 2.1.5).

These properties are important for the following purposes: determination of the unknown neighborhood radius according to a minimax criterion (cf. Section 2.2), estimator construction (cf. Section 2.3) and convergence of robust models (cf. Section 2.4).

In Section 2.2 we consider the notions of least favorable radius and radius–minimax estimator introduced in Rieder et al. (2001). This concept serves as a strategy for how to proceed if the true neighborhood radius is unknown, respectively known only up to some radius interval. We supply the mathematical results on the least favorable radius which support and complement the purely numerical determination in Rieder et al. (2001).

Another important problem is the construction of optimally robust estimators.

So far, the results concern the MSE optimal influence curve, whose derivation is based only on simple perturbations. Given a family of influence curves (ψθ)θ∈Θ, we have to construct an asymptotic estimator S, without knowing the parameter θ ∈ Θ, such that S is asymptotically linear with influence curve ψθ at Pθ. Moreover, the risk of this estimator should not increase if one passes over from simple perturbations to full neighborhoods about Pθ. These goals can (under additional assumptions) be achieved by means of one-step constructions, and sufficient conditions are given in Subsection 2.3.1.
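In its simplest form (full parameter, D = I_k), such a one-step construction updates a suitable initial estimate by a single Newton-type step along the given influence curve,

\[
S_n = \hat{\theta}_n + \frac{1}{n} \sum_{i=1}^{n} \psi_{\hat{\theta}_n}(X_i)
\]

where θ̂_n denotes the initial estimator evaluated at the sample X_1, …, X_n.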


The general sufficient conditions are taken from Section 6.4 of Rieder (1994). We use them to derive sufficient conditions for the MSE optimal influence curve; confer Subsection 2.3.2. We verify these conditions for exponential families of full rank; confer Subsection 2.3.3. Thus, we can use the one-step method in several important models which are widely used in parametric statistics. In particular, these results apply to most of the models considered in this thesis.

The one-step construction requires a suitable initial estimator. By Theorem 6.3.7 of Rieder (1994) the Kolmogorov minimum distance estimator has the necessary properties if we consider 1/√n neighborhoods of Kolmogorov type. Consequently, this is also true if we consider smaller 1/√n neighborhoods like contamination or total variation neighborhoods. However, in the robustness literature most frequently the simpler median and median absolute deviation (MAD) are proposed as appropriate initial estimators. Since there seems to be no reference that these estimators also have the asserted properties, we prove their uniform √n consistency on 1/√n Kolmogorov neighborhoods even if there is no location or scale structure; confer Subsection 2.3.4.
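For reference (real-valued observations assumed), the Kolmogorov minimum distance estimator mentioned above may be sketched as

\[
\hat{\theta}_n \in \operatorname*{argmin}_{\theta \in \Theta}\; \sup_{t \in \mathbb{R}} \bigl| \hat{F}_n(t) - F_\theta(t) \bigr|
\]

where F̂_n is the empirical distribution function of X_1, …, X_n and F_θ the distribution function of P_θ.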

In the remaining part of the current chapter we derive some results which may be interpreted as convergence of robust models; confer Section 2.4. We do not aim at the abstract framework of Le Cam (1986) involving arbitrary decision rules. But, from the beginning, we base our concept solely on the optimally robust estimators.

We prove that, under certain weak assumptions and with appropriate standardizations, the Lagrange multipliers of the MSE optimal influence curve of one model converge towards the Lagrange multipliers of the MSE optimal influence curve of another model. Hence, the minimax asymptotic MSE, the standardized asymptotic bias, and the asymptotic variance converge, too. Thus, if there is some infinitesimal robust model where the optimally robust influence curve is hard to determine, we can try to find another robust model which may serve as an approximation and where the computation of the corresponding optimally robust influence curve is much easier. Using this influence curve, we are able to construct approximations to the optimally robust influence curve for the model of interest, which is also in the spirit of Le Cam (1986). Convincing examples are given in Chapters 3–5. The concept of convergence of robust models may certainly be expanded more abstractly.
