

7.2.2 Dependence on g and r

We take a look at the A2-term in the second-order MSE:

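\[
A_2 = \frac{1}{n}\Big[ \big(\tfrac{1}{3}\, l_{id,3}\, b^4 \pm l_{c,2}\, b^3\big)\, r^4
 + \Big( \pm (4 v_{c,1} + 3 l_{c,2})\, v_{id,0}^2\, b + \big((2 l_{id,3} + 3 v_{id,2} - 6)\, v_{id,0}^2 + 3\big)\, b^2 \Big)\, r^2
 + (l_{id,3} + 3 v_{id,2})\, v_{id,0}^4 + 2\, \text{[?]}_{1}\, v_{id,0}^3 \Big] \tag{7.2}
\]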

with the coefficients from 6.24 and b = g · A. It holds

Coefficient   Value for g = 1.0   Sign for all g > 0
A             1.46                +
l_{c,2}       0.71                +
l_{id,3}      0.71                +
v_{id,0}      1.11                +
v_{c,1}       -0.47
-v_{id,2}     -0.52
r_{e,1}       1.24                +

Remark 7.2. We confine ourselves to the "+"-case of A2 since, according to the table above, b · l_{c,2}|(6.20)(b) = −b · l_{c,2}|(6.20)(−b) > 0, for instance.

With r small, r < 1 as in section 7.2, for example, the summands of order O(r^4) and O(r^2) in 7.3 are small. Calculating the remaining terms of order O(r^0) for g = 1.0, we see that the main negative part stems from the summand (2 l_{id,3} + 3 v_{id,2}) v_{id,0}^2, because the coefficient v_{id,2} is negative.

But this does not have to hold for radii approaching 1. Hence, we suppose that the situation changes when the sample modification by total variation reaches a specific amount, large enough for the terms of higher order in r to establish their influence on the MSE.

In this sense we evaluate A2(g, r) for r ∈ [0; 1] and g ∈ {1.0; 1.6; 2.0} and take a look at the plots in figure 7.1.

Figure 7.1: Numerical behavior of A2(r) for g = 1.0 (top), g = 1.6 (middle) and g = 2.0 (bottom).

First of all, we can clearly see that for small radii the A2-term indeed always contributes negatively to the MSE. Hence we get an overestimation of the MSE. But the situation changes when both the radius r and the clipping height increase. There is always a specific radius (depending on the clipping height) at which the A2-term becomes positive. We then have to join the conclusion given in the convex-contaminated case by P. Ruckdeschel, i.e. the MSE is suddenly underestimated.
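For fixed g, the bracket in (7.2) is a quadratic polynomial in r^2, so the radius at which the sign changes can be located in closed form. The following is a minimal sketch under that reading; the aggregated coefficients c4, c2 and c0 and the function name are hypothetical placeholders and are not values computed in this thesis.

```python
import math


def sign_change_radius(c4, c2, c0):
    """Smallest r in [0, 1] with c4*r**4 + c2*r**2 + c0 == 0, or None.

    c4, c2, c0 are hypothetical stand-ins for the collected coefficients
    of r^4, r^2 and r^0 in the A2-term for one fixed value of g.
    """
    # Substitute u = r^2 and solve c4*u^2 + c2*u + c0 = 0.
    if c4 == 0.0:
        roots = [-c0 / c2] if c2 != 0.0 else []
    else:
        disc = c2 * c2 - 4.0 * c4 * c0
        if disc < 0.0:
            return None  # no real root: the sign never changes
        s = math.sqrt(disc)
        roots = [(-c2 - s) / (2.0 * c4), (-c2 + s) / (2.0 * c4)]
    # Only u in [0, 1] corresponds to an admissible radius r in [0, 1].
    admissible = sorted(u for u in roots if 0.0 <= u <= 1.0)
    return math.sqrt(admissible[0]) if admissible else None
```

With a negative constant term and positive higher-order coefficients, the returned radius marks the transition from a negative to a positive contribution, which mirrors the qualitative behavior described for figure 7.1.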

We might perhaps explain this circumstance with the fact that we can see a total variation neighborhood as the union of two convex contamination balls, as we have illustrated in (3.20). For r large enough "the convex contamination character of the total variation neighborhood system breaks through". But for small r it is not visible.

In other words, there is always a radius rg0 beyond which we cannot apply, or achieve, respectively, the least favorable deviation as we do in chapter 6 in order to get the asymptotic expansion of the MSE. The result is an MSE that tends more and more towards infinity with each additional "bad member" of the sample.

Remark 7.3. For F = N(0,1), n large enough but finite, we get numerically that

a) for radii rg small enough, like rg < 0.5 with g < 1.6, for example, the maximal MSE on Q̃n is overestimated by first-order and second-order asymptotics!

b) for radii rg large enough, like rg > 0.5 with g > 1.6, for example, the maximal MSE on Q̃n is underestimated by first-order and second-order asymptotics!

Remark 7.4. In a finite setup there is, at least in the symmetric case F = N(0,1), a difference in the behavior of the asymptotic maxMSE on convex contamination versus total variation neighborhoods.

Chapter 8

Generation of least favorable deviations in total variation for finite sample

In Theorem 6.13 we declared a least favorable deviation P_n^{di} = (P_n^{di})^+ + (P_n^{di})^- with (P_n^{di})^+ and (P_n^{di})^- defined as in (6.20) or (6.21), respectively. However, in the finite scenario with an original sample (x_1, ..., x_n) i.i.d. ∼ P_n^{id} to be manipulated by the signed measure ∆n(i) as defined in (3.18) and Lemma 6.21, respectively, the least favorable deviation may not be attainable. This means that we have to find and declare a suitable mechanism explaining the effect of ∆ for every finite sample according to the previously given conditions. For example, the number of observations to be manipulated has to be determined and guaranteed (in probability), as well as the bound on the x_i for having maximum influence on the mean squared error according to the value of ψ(x_i), with ψ an influence curve satisfying certain assumptions.

Furthermore, we settle on the symmetric case, with F = P^{id} symmetric on the Borel set B, and show that for a certain kind of manipulation mechanism we obtain the result of Corollary 6.19 even in the finite setup, up to suitable order.

In this sense we first carry out a reordering of the sample by conditioning with respect to the arrangement. The influence curve ψ is assumed to be monotone and, in the symmetric case, odd. Actually we confine ourselves to influence curves of Hampel-type form, at least attaining their maximum for |x| > c_n, with a general increasing sequence c_n initially.

The number k of manipulable observations is given by a random variable K with first moment E K = r√n, chosen to satisfy the requirement of staying in a total variation ball B_v(F, r/√n). The variance Var K = (1/2) r√n results from a deeper investigation of all terms in the expansion of the MSE given by a k-step approach. Ordering the sample, however, makes the observations (weakly) correlated; cf. Proposition 8.16 and Theorem 8.17. But in Theorem 8.20 we can show that the correlation vanishes under suitably chosen assumptions.
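The link between the first moment of K and the radius of the total variation ball can be checked with a few lines of arithmetic. This is only an illustrative sketch: the function names and the sample values n = 100, r = 0.5 are assumptions made for the example, and nothing is said here about the actual distribution of K.

```python
import math


def expected_manipulated(n, r):
    """E K = r * sqrt(n), the expected number of manipulated observations."""
    return r * math.sqrt(n)


def expected_share(n, r):
    """Expected share E K / n of manipulated observations in the sample."""
    return expected_manipulated(n, r) / n


# Hypothetical example values, not taken from the thesis.
n, r = 100, 0.5
print(expected_manipulated(n, r))  # expected number of manipulated points
print(expected_share(n, r))        # expected share of the sample
print(r / math.sqrt(n))            # radius of the total variation ball
```

The expected share E K / n coincides with the shrinking radius r/√n, which is the sense in which the choice of the first moment keeps the manipulated sample inside the ball on average.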

Without applying further symmetry arguments, we are confronted with the joint law of the k- and (n−k+1)-quantiles X[k:n] and X[n−k+1:n], which leads to order statistics.
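To fix the indexing, the two order statistics can be read off a sorted sample. The sketch below is purely illustrative; the function name and the toy sample are hypothetical.

```python
def extreme_order_statistics(sample, k):
    """Return (X[k:n], X[n-k+1:n]) for a sample of length n.

    X[k:n] is the k-th smallest observation and X[n-k+1:n] is the
    k-th largest one, using 1-based order-statistic indices.
    """
    n = len(sample)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    ordered = sorted(sample)
    # Shift the 1-based indices k and n-k+1 to 0-based list positions.
    return ordered[k - 1], ordered[n - k]


# Hypothetical toy sample with n = 5 and k = 2.
lower, upper = extreme_order_statistics([3.0, 1.0, 2.0, 5.0, 4.0], 2)
```

Both values come from the same ordered sample, which is why they are dependent and their joint law is needed.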

But the integrals to be evaluated in this setup turn out to be very hard to handle, so we just give a short impression of these circumstances in subsection 8.5.1 and make use of a symmetry argument loosely inspired by the reflection principle known from elementary stochastics. By considering several samples {x_1, ..., x_n}_j i.i.d. ∼ F, j ∈ N, at once, we are able to neglect the difference between the lower and upper k-quantile.

It turns out that we only get access to the result of Corollary 6.19 in the finite context if we require the finite sample to already attain the minimum and maximum of the given influence curve ψ with a certain probability. Depending on this probability we derive a lower bound on the sample length n in Theorem 8.20, after having conjectured the existence of such a condition from preceding numerical investigations.

Finally, we give a restrictive condition on the distribution of K in assumption 8.21 (PK) that guarantees¹ the desired realization of X[k:n] beyond a now concrete bound c_n. The bound c_n is explicitly calculated for F = N(0,1) in Proposition 8.24, and at last suitable four-point distributions for K are given in section 8.9.3 that satisfy all the previously stated conditions.

8.1 Division of the support

In order to generate least favorable deviations we assume that there exist intervals, where the influence curve ψ (almost) attains its minimum and maximum. As a preparation we begin with a partition of the real line:

Notation 8.1. Let c_n ≥ 0 be an increasing sequence. We denote

I := ]−∞, −c_n[,   II := [−c_n, c_n],   III := ]c_n, ∞[.

Furthermore we make the assumption that for n large enough ψ does not differ much from the asymptotically optimal influence curve described in (4.17).

Assumption 8.2. In addition to Assumption 6.9 (bmi) we assume:

(o) ψ is an odd function, i.e. ψ(−x) =−ψ(x).

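(F) ψ is of the form

\[
\psi(x_i) =
\begin{cases}
-b/2 + o(n^{-1}) & \text{for } x_i \in I, \\
A\Lambda(x_i) = A x_i & \text{for } x_i \in II, \\
b/2 - o(n^{-1}) & \text{for } x_i \in III.
\end{cases}
\]

(Z) F(ψ < −b/2) = F(ψ > b/2) = 0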
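The partition of Notation 8.1 and the form (F) can be sketched as a small function. This is a simplified illustration only: the o(n^{-1}) terms are dropped, Λ(x) = x is used as in (F), and the function and parameter names are hypothetical.

```python
def region(x, c_n):
    """Classify x into the regions I, II, III of Notation 8.1."""
    if x < -c_n:
        return "I"
    if x > c_n:
        return "III"
    return "II"


def psi(x, c_n, b, A=1.0):
    """Influence curve of form (F) without the o(1/n) corrections.

    Assumption for this sketch: Lambda(x) = x on region II, and the
    default A = 1 only drops a multiplicative constant.
    """
    where = region(x, c_n)
    if where == "I":
        return -b / 2.0   # lower clipping level
    if where == "III":
        return b / 2.0    # upper clipping level
    return A * x          # unclipped linear part on [-c_n, c_n]


# Hypothetical values: c_n = 1 and b = 2, so that A * c_n = b / 2.
values = [psi(x, 1.0, 2.0) for x in (-3.0, -0.5, 0.0, 0.5, 3.0)]
```

With these example values the curve is odd, as required by (o), and never leaves [−b/2, b/2], in line with (Z).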

¹ Theorem 8.22 shows that, under assumption (PK), the probability of exceeding the bound c_n is exponentially negligible.

Figure 8.1: The considered IC with the divided support.

Remark 8.3. a) The sequence c_n in Notation 8.1 will turn out to be of order O(r/√n).

b) For x_1, ..., x_n i.i.d. ∼ F and Q_n as defined in (3.19) we have the decomposition

dQ_n = dF − (dQ_n − dF)^- + (dQ_n − dF)^+.

For the case Qn = Qn with signed measures as in (6.20), for example, we can identify

I = {dQ_n < dF},   II = {dQ_n = dF},   III = {dQ_n > dF}.

c) Unless the Lagrangian multiplier A is calculated explicitly (cf. (8.23), for instance), we set A := 1. This improves readability by merely neglecting a multiplicative constant.
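The decomposition in part b) can be illustrated on a discrete toy example, where the densities are plain probability vectors. The vectors f and q below and the helper name are hypothetical and serve only to show the positive and negative parts and the resulting three regions.

```python
def positive_negative_parts(q, f):
    """Pointwise positive and negative parts of the signed measure q - f."""
    diff = [qi - fi for qi, fi in zip(q, f)]
    pos = [max(d, 0.0) for d in diff]
    neg = [max(-d, 0.0) for d in diff]
    return pos, neg


# Hypothetical discrete stand-ins for dF and dQ_n on three atoms.
f = [0.25, 0.5, 0.25]
q = [0.2, 0.5, 0.3]

pos, neg = positive_negative_parts(q, f)

# dQ_n = dF - (dQ_n - dF)^- + (dQ_n - dF)^+, checked atom by atom.
rebuilt = [fi - ni + pi for fi, ni, pi in zip(f, neg, pos)]

# Regions as in part b): I where q < f, II where q = f, III where q > f.
regions = ["I" if qi < fi else "III" if qi > fi else "II"
           for qi, fi in zip(q, f)]

# Both parts carry the same mass, which is the total variation distance.
tv_distance = sum(pos)
```

Mass is removed on region I and added on region III, while region II is left untouched.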

8.2 Conditioning w.r.t. the arrangement of the