
(3.23), which would indicate a faster rate of convergence. But the reason for the vanishing of the n^(−1/2)-term could just as well lie in the symmetry of F, i.e. f(x) = f(−x), which M. Kohl uses throughout his investigations since there Fθ = N(θ,1). In the case of convex contamination, however, this symmetry condition does not entail a vanishing of the n^(−1/2)-term; it only makes the algebra somewhat simpler (confer [Ruckdeschel (2005b)], Remark 3.2 and Remark 3.4).

3.3 Finite Sample Breakdown Point

Concerning the finite sample breakdown point we work with the definition of [Donoho and Huber (1983)], p. 161. As the definition of the finite sample breakdown point ε0 therein is not restricted to contamination neighborhoods, it is applicable to total variation neighborhood systems as well.

Anticipating the mechanism of modification in section 8.3, we use the concept of "ε-replacement" on p. 160 (ibid.) to describe the actual impact of a modification of a sample X = (x1, . . . , xn) by means of total variation:

ε-replacement: we replace an arbitrary subset of size m of the sample by arbitrary values y1, . . . , ym. The fraction of bad values in the corrupted sample X′ is ε = m/n.

Then we define the finite sample breakdown point ε0 as done in [Donoho and Huber (1983)]:

Definition 3.9. Let T = {Tn}n=1,2,... be an estimator with values in the Euclidean space, and T(X) its value at the sample X. Then we define the breakdown point as

ε0(X, T) = inf{ε : b(ε; X, T) = ∞}    (3.24)

with

b(ε; X, T) = sup |T(X′) − T(X)|

the maximum bias that can be caused by ε-corruption.

Simply speaking, the (replacement) breakdown point of T at X is the smallest fraction of the sample for which the estimator, when applied to the ε-corrupted sample X′, can take values arbitrarily far from T(X).
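As a purely illustrative sketch of the ε-replacement mechanism (the sample, the replacement value, and the choice of mean and median as estimators are our own and not part of the above definition), one may compute the bias caused by replacing m observations by an arbitrarily large value:

```python
import numpy as np

def eps_replace(x, m, value):
    """Replace an (arbitrary) subset of size m of the sample x by y_1 = ... = y_m = value."""
    x_corrupt = x.copy()
    x_corrupt[:m] = value  # which subset is replaced is irrelevant for the supremum in b(eps; X, T)
    return x_corrupt

rng = np.random.default_rng(0)
x = rng.normal(size=20)        # clean sample of length n = 20
bad = 1e12                     # stand-in for an "arbitrarily far out" replacement value

for m in range(0, 11):         # corrupted fraction eps = m/n
    x_c = eps_replace(x, m, bad)
    print(f"eps = {m/20:.2f}: |mean(X') - mean(X)| = {abs(x_c.mean() - x.mean()):.3g}, "
          f"|median(X') - median(X)| = {abs(np.median(x_c) - np.median(x)):.3g}")
```

In this toy computation the sample mean already breaks down at ε = 1/n, whereas the median stays bounded until roughly half the sample is replaced, in line with Definition 3.9.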

Based on this concept we join [Ruckdeschel (2005a)] or [Ruckdeschel (2005b)], respectively, and [Kohl (2005)] in employing an asymptotically negligible modification of the infinitesimal models (3.16) and (3.19): for sample length n and K the number of contaminated or modified observations, we exclude all samples with K > n/2. Let

Kn := {K ≤ n/2}    (3.25)

then

(∗ = c) we look at the conditional neighborhoods

Qn( . |Kn) = { L( [(1−Ui)Xi + UiYi]i=1,...,n | K = Σ Ui ≤ n/2 ) }    (3.26)

with random variables U1, ..., Un i.i.d. ∼ Bin(1, r/√n), X1, ..., Xn i.i.d. ∼ F, Y1, ..., Yn i.i.d. ∼ H ∈ M1(B), and all random variables stochastically independent.

(∗ = v) we look at the conditional neighborhoods Qn( . |Kn) with a random variable K ∼ P such that

EP K = r√n    (3.27)
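For illustration, a sample from the conditional neighborhood Qn( . |Kn) in the (∗ = c) case of (3.26) can be generated as in the following sketch; the ideal distribution F = N(0,1), the contaminating distribution H = N(10,1), and the parameter values are merely illustrative choices, and the conditioning on Kn is implemented by redrawing the Ui:

```python
import numpy as np

def draw_from_Qn(n, r, rng, draw_H):
    """One sample from Qn(.|Kn), (* = c) case of (3.26):
    Ui i.i.d. ~ Bin(1, r/sqrt(n)), Xi i.i.d. ~ F, Yi i.i.d. ~ H, all independent,
    conditioned on K = sum(Ui) <= n/2."""
    p = r / np.sqrt(n)
    while True:
        U = rng.binomial(1, p, size=n)
        if U.sum() <= n / 2:                # the conditioning event Kn of (3.25)
            X = rng.normal(size=n)          # ideal observations, here F = N(0, 1)
            Y = draw_H(rng, n)              # contaminating observations from H
            return (1 - U) * X + U * Y, int(U.sum())

rng = np.random.default_rng(1)
draw_H = lambda rng, n: rng.normal(10.0, 1.0, size=n)   # illustrative H in M1(B)
sample, K = draw_from_Qn(n=50, r=0.5, rng=rng, draw_H=draw_H)
print("number of modified observations K =", K)
```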

Definition 3.10. With the conditional neighborhoods

Qn := Qn( . |Kn)    (3.28)

for (∗ = c) and (∗ = v), respectively, we subsequently employ as standard neighborhood systems the (slightly) thinned out balls

Q̃n^(∗)( . ) = {Qn( . |Kn)}    (3.29)

Remark 3.11. a) The modification was motivated by a closer inspection of simulations done by M. Kohl, who found that larger inaccuracies of (first order) asymptotics only occurred in extraneous sample situations where more than half of the sample stemmed from a contamination. This led him to the conjecture that, excluding such samples, asymptotics might prove useful even for very small samples.

b) In section 2.3 of [Ruckdeschel (2005a)] or [Ruckdeschel (2005b)], respectively, it is shown by means of the Hoeffding bound (A.2) that the above modification is asymptotically negligible, as P(K > n/2) decays exponentially fast. Furthermore, it forces the unmodified MSE, i.e. without clipping, to converge along with weak convergence, confer Proposition 2.2 in section 2.4 of [Ruckdeschel (2005a)]. This is not self-evident, as weak convergence in general is too weak to entail convergence of the risks; the standard way out in asymptotic statistics is to clip the unbounded loss function. Such clipping of unbounded loss functions is commonly used in order to attain the lower bound in asymptotic minimax theorems, confer Proposition 2.23, for instance. In this context we refer to [Le Cam (1986)], [Rieder (1994)], [Bickel et al. (1998)] or [van der Vaart (1998)], respectively.
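As a numerical illustration of this remark (assuming, as in the (∗ = c) case of (3.26), K ∼ Bin(n, r/√n); the bound used below is the standard Hoeffding inequality for binomial tails and is only meant as a stand-in for (A.2)):

```python
import numpy as np
from scipy.stats import binom

r = 0.5
for n in [10, 50, 100, 500, 1000]:
    p = r / np.sqrt(n)                          # per-observation contamination probability
    exact = binom.sf(n / 2, n, p)               # P(K > n/2) for K ~ Bin(n, p)
    hoeffding = np.exp(-2 * n * (0.5 - p) ** 2) # Hoeffding bound, valid as soon as p < 1/2
    print(f"n = {n:4d}: P(K > n/2) = {exact:.3e} <= {hoeffding:.3e}")
```

Both the exact tail and the bound decay exponentially fast in n, which is the asymptotic negligibility referred to above.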

c) In Assumption 8.21 of section 8.9 we arrive at condition (PK) on the distribution of K:

(PK)    P( K ∈ [r√n (1−η), r√n (1+η)] ) = 1 − O(e^(−nδ))    for some δ, η > 0

Obviously, condition (PK) implies that, except on an event of exponentially small probability, no more than 50% of the sample is modified once n is so large that r√n (1+η) ≤ n/2, i.e. n ≥ 4r²(1+η)². Therefore we could dispense with the modification (3.28). But as the main results in chapter 6 are already available under the "weaker" restriction, we stay with it in the meantime.

d) In the sequel we suppress the conditioning w.r.t. Kn and write Qn, meaning Qn = Qn( . |Kn) with Kn as defined in (3.25), (3.26) and (3.27), or according to (PK), respectively.

Chapter 4

First Order Optimality for Robust Estimation of Location in One Dimension

As explicit and manageable bias terms for total variation are only available in one dimension, we briefly summarize the results of chapter 2.4 as far as one-dimensional location, MSE, and neighborhoods of type (∗ = c, v) are concerned. We give the first order optimality result to show that under symmetry of F there is no possibility to see any differences between the convex contamination and the total variation case. We add Huber's monotonicity approach for M-estimators, which turns out to be useful for the location model but not, for example, for the scale model. In the latter case, an alternative approach via k-step estimators is presented, too.

4.1 Optimal Influence Curves for one dimension

For a sequence of estimators Sn we consider the asymptotic (modified) maximal mean squared error on Q̃n

R̃(Sn, r) := lim_{M→∞} lim_{n→∞} sup_{Qn ∈ Q̃n(r)} ∫ min{M, n|Sn − θ|²} dQn    (4.1)

As summed up in chapter 2, it is shown in [Rieder (1994)] that with scores Λ and Fisher information I, a (suitably constructed) asymptotically linear estimator Sn with IC ψ has risk

(∗ = c)    R̃(Sn, r) = r² sup|ψ|² + E|ψ|²    (4.2)

(∗ = v)    R̃(Sn, r) = r² (sup ψ − inf ψ)² + E|ψ|²    (4.3)

with the expectations evaluated under the law F.
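For a concrete check of (4.2) and (4.3), the following sketch evaluates both risks by numerical integration for a bounded ψ under F = N(0,1); the particular clipped ψ below is only an illustration and is not required to satisfy the IC conditions (4.5) given further down:

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

def risk_c(psi, r):
    """(4.2): r^2 sup|psi|^2 + E|psi|^2, expectation under F = N(0,1)."""
    grid = np.linspace(-10.0, 10.0, 20001)
    sup_abs = np.max(np.abs(psi(grid)))
    e2 = quad(lambda x: psi(x) ** 2 * norm.pdf(x), -np.inf, np.inf)[0]
    return r ** 2 * sup_abs ** 2 + e2

def risk_v(psi, r):
    """(4.3): r^2 (sup psi - inf psi)^2 + E|psi|^2, expectation under F = N(0,1)."""
    grid = np.linspace(-10.0, 10.0, 20001)
    osc = np.max(psi(grid)) - np.min(psi(grid))
    e2 = quad(lambda x: psi(x) ** 2 * norm.pdf(x), -np.inf, np.inf)[0]
    return r ** 2 * osc ** 2 + e2

psi = lambda x: np.clip(1.2 * x, -1.5, 1.5)   # illustrative bounded (clipped linear) psi
print(risk_c(psi, r=0.5), risk_v(psi, r=0.5))
```

Note that for an odd ψ the oscillation sup ψ − inf ψ equals 2 sup|ψ|, so only the bias terms of (4.2) and (4.3) differ.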

In one dimension, k = p = 1, for given r ≥ 0, among all such ALEs any (suitably constructed) ALE with IC η minimizes R̃( . , r), where η is


(∗ = c): of Hampel form

η = Y min{1, b/|Y|},    Y = AΛ − a    (4.4)

for some A ∈ R and a ∈ R such that η is an IC, i.e.

Eη = 0,    EηΛ = 1,    (4.5)

and b solving

r²b = E(|Y| − b)+.    (4.6)

(∗ = v): of form

η = c ∨ AΛ ∧ (c + b)    (4.7)

for some A ∈ R\{0} and numbers c ∈ (−b, 0), b ∈ (0, ∞), such that η is an IC, i.e.

Eη = 0,    EηΛ = 1,    (4.8)

and

r²b = E(c − AΛ)+.    (4.9)
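As a minimal numerical sketch of the (∗ = c) solution (4.4)–(4.6), assume the one-dimensional normal location model Fθ = N(θ, 1), so that Λ(x) = x, I = 1, and a = 0 by symmetry; writing η(x) = A clip(x, −c, c) with b = Ac, (4.6) reduces to r²c = E(|x| − c)+ and (4.5) gives A = 1/(2Φ(c) − 1). This reduction and the function names below are our own and serve only as an illustration:

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

def hampel_ic_normal(r):
    """Hampel-form IC (4.4)-(4.6) in the normal location model F = N(0,1), Lambda(x) = x."""
    # E(|x| - c)_+ for x ~ N(0,1):
    excess = lambda c: 2.0 * (norm.pdf(c) - c * norm.sf(c))
    c = brentq(lambda c: excess(c) - r ** 2 * c, 1e-8, 20.0)   # clipping point in x-space, from (4.6)
    A = 1.0 / (2.0 * norm.cdf(c) - 1.0)                        # standardization E[eta * Lambda] = 1, from (4.5)
    b = A * c                                                  # clipping height of eta itself
    eta = lambda x: A * np.clip(x, -c, c)
    return eta, A, b

for r in (0.1, 0.25, 0.5, 1.0):
    eta, A, b = hampel_ic_normal(r)
    print(f"r = {r}: A = {A:.4f}, b = {b:.4f}")
```

The resulting η is a rescaled clipped-linear (Huber-type) influence curve whose clipping point c decreases as the radius r grows.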

Remark 4.1. The risks (4.2) and (4.3) are only first-order asymptotic solutions for ALEs that are optimal w.r.t. the MSE. Up to now it is not clear to which degree the asymptotic optimality carries over to finite sample sizes. In particular, the influence of the radius r, the sample size n, and the clipping height b is not visible. Therefore, in chapter 6 we investigate the higher order asymptotics of the MSE in the one-dimensional location model. Section 6.1 briefly summarizes the result for the convex contamination case as worked out in [Ruckdeschel (2005b)]. Section 6.2 then delivers the result for infinitesimal total variation neighborhood systems.