
Binomial Model


$$\tilde\eta_{*,r}(y) \equiv \eta_h(y) = \frac{y}{m} - \theta \tag{3.1.6}$$

for all $r \in [0,\infty]$; confer also Remark 2.1.6 (a). Consequently, we obtain $\mathrm{relMSE}(\tilde\eta_{*,r_0}, r) \equiv 1$ for all $r_0, r \in [0,\infty]$; hence the MLE is also radius-minimax, as defined in Section 2.2, in these cases. Nevertheless, there is also a need for robust estimators for $m > 1$. This is indicated by the example on page 98 of Hampel (1968) (data generated by a binomial distribution with very large size containing outliers), by the example on page 1119 of Ruckstuhl and Welsh (2001) (data generated by an equal mixture of two binomial distributions), and by our small simulation study in Section 3.6.

(c) Robust estimation in generalized linear models (logistic regression) is quite different from the simple binomial model and goes beyond the scope of this chapter. Some references for robust estimation in logistic regression are given, for instance, in Ruckstuhl and Welsh (2001). ////

3.2 Optimally Robust Influence Curves

3.2.1 Contamination Neighborhoods

3.2.1.1 Mean Square Error Solution

The unique MSE optimal IC $\tilde\eta_{c,r}$ for infinitesimal contamination neighborhoods (1.2.4) and radius $r \in (0,\infty)$ may be read off from Theorem 1.3.7 (a) and Theorem 1.3.11 (b). For some given $D \in \mathbb{R} \setminus \{0\}$ we can rewrite the solution as

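$$\tilde\eta_{c,r}(y) = A_r \bigl(\Lambda(y) - z_r\bigr) \min\Bigl\{1, \frac{c_r}{|\Lambda(y) - z_r|}\Bigr\} \tag{3.2.1}$$

where

$$0 = \mathrm{E}\,(\Lambda - z_r) \min\Bigl\{1, \frac{c_r}{|\Lambda - z_r|}\Bigr\} \tag{3.2.2}$$

$$D = A_r\, \mathrm{E}\,|\Lambda - z_r| \min\bigl\{|\Lambda - z_r|, c_r\bigr\} \tag{3.2.3}$$

and

$$r^2 c_r = \mathrm{E}\,\bigl(|\Lambda - z_r| - c_r\bigr)_+ \tag{3.2.4}$$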
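The implicit equations (3.2.2) to (3.2.4) can be solved numerically. The following is a minimal sketch, not the procedure used for the figures in this chapter: it assumes $\Lambda(y) = (y - m\theta)/(\theta(1-\theta))$ (the scores function of the binomial model, consistent with (3.2.15)), uses plain nested bisection, and assumes that $r$ is small enough for $z_r$ to be unique. The function names `binom_pmf`, `bisect` and `mse_optimal_multipliers` are hypothetical.

```python
import math

def binom_pmf(m, theta):
    """P(y = k), k = 0..m, under Binom(m, theta), computed via log-gamma."""
    return [math.exp(math.lgamma(m + 1) - math.lgamma(k + 1)
                     - math.lgamma(m - k + 1)
                     + k * math.log(theta) + (m - k) * math.log(1 - theta))
            for k in range(m + 1)]

def bisect(f, lo, hi, tol=1e-12):
    """Root of a decreasing function f, assuming f(lo) > 0 > f(hi)."""
    while hi - lo > tol * max(1.0, abs(lo), abs(hi)):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def mse_optimal_multipliers(m, theta, r, D=1.0):
    """Return (A_r, z_r, c_r) for Binom(m, theta) and radius r in (0, inf)."""
    p = binom_pmf(m, theta)
    # Assumed scores function of the binomial model.
    lam = [(k - m * theta) / (theta * (1 - theta)) for k in range(m + 1)]

    def expect(f):
        return sum(pk * f(l) for pk, l in zip(p, lam))

    def z_of(c):
        # (3.2.2): (L - z) min{1, c/|L - z|} equals L - z clipped to [-c, c].
        return bisect(lambda z: expect(lambda l: max(-c, min(c, l - z))),
                      lam[0], lam[-1])

    def clip_gap(c):
        # (3.2.4) written as E(|L - z| - c)_+ - r^2 c = 0.
        z = z_of(c)
        return expect(lambda l: max(abs(l - z) - c, 0.0)) - r * r * c

    c = bisect(clip_gap, 0.0, lam[-1] - lam[0])
    z = z_of(c)
    # (3.2.3): D = A E|L - z| min{|L - z|, c}.
    A = D / expect(lambda l: abs(l - z) * min(abs(l - z), c))
    return A, z, c

A, z, c = mse_optimal_multipliers(m=50, theta=0.5, r=0.1)
print(A, z, c)
```

The IC (3.2.1) is then `A * max(-c, min(c, lam_y - z))` at each support point. Bisection on the outer equation relies on the left-hand side of (3.2.4) minus $r^2 c_r$ changing sign exactly once in $c_r$; this is an assumption of the sketch rather than a statement from the text.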

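For $r = \infty$ we obtain by Theorem 1.3.7 (b)

$$\tilde\eta_{c,\infty}(y) = \omega_c^{\min}\, \mathrm{sign}(D) \bigl[\mathrm{I}(y > M) - \mathrm{I}(y < M) + \beta\, \mathrm{I}(y = M)\bigr] \tag{3.2.5}$$

where

$$\beta = \frac{P(y < M) - P(y > M)}{P(y = M)} \tag{3.2.6}$$

with any $M = \mathrm{med}(y)$, and $\tilde\eta_{c,\infty}$ attains the minimum bias

$$\omega_c^{\min} = \frac{|D|\,\theta(1-\theta)}{\mathrm{E}\,|y - M|} \tag{3.2.7}$$

confer also Remark 1.3.8.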
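As an illustration, the lower case solution (3.2.5) to (3.2.7) can be evaluated directly on the support $\{0, 1, \ldots, m\}$. The sketch below assumes $\Lambda(y) = (y - m\theta)/(\theta(1-\theta))$ and picks $M$ as the smallest integer with $P(y \le M) \ge 1/2$, which is one admissible median; the helper names are hypothetical.

```python
import math

def binom_pmf(m, theta):
    """P(y = k), k = 0..m, under Binom(m, theta), computed via log-gamma."""
    return [math.exp(math.lgamma(m + 1) - math.lgamma(k + 1)
                     - math.lgamma(m - k + 1)
                     + k * math.log(theta) + (m - k) * math.log(1 - theta))
            for k in range(m + 1)]

def lower_case_ic(m, theta, D=1.0):
    """Return (eta, omega_min, M, beta) following (3.2.5)-(3.2.7)."""
    p = binom_pmf(m, theta)
    # One admissible median: smallest integer M with P(y <= M) >= 1/2.
    cdf, M = 0.0, m
    for k, pk in enumerate(p):
        cdf += pk
        if cdf >= 0.5:
            M = k
            break
    beta = (sum(p[:M]) - sum(p[M + 1:])) / p[M]              # (3.2.6)
    mean_abs_dev = sum(pk * abs(k - M) for k, pk in enumerate(p))
    omega_min = abs(D) * theta * (1 - theta) / mean_abs_dev  # (3.2.7)
    sign_D = math.copysign(1.0, D)
    eta = [omega_min * sign_D * ((k > M) - (k < M) + beta * (k == M))
           for k in range(m + 1)]                            # (3.2.5)
    return eta, omega_min, M, beta

eta, omega_min, M, beta = lower_case_ic(m=25, theta=0.25)
print(M, beta, omega_min)
```

Because $|\beta| \le 1$ for any median $M$, the resulting function is bounded by $\omega_c^{\min}$ in absolute value; it is also centered and satisfies $\mathrm{E}\,\tilde\eta_{c,\infty}\Lambda = D$, which is easy to confirm numerically.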

For a plot of the optimally robust ICs in case $m = 25$, $\theta = 0.25$ and for different values of $r$ see Figure 3.1.

Remark 3.2.1 (a) Since ICs are defined with respect to the corresponding parametric model (cf. Definition 1.1.1), the ICs of the binomial model are defined on $\{0, 1, \ldots, m\}$. However, if we consider neighborhoods of the ideal binomial model, we may allow distributions whose support is no longer restricted to $\{0, 1, \ldots, m\}$ but can be any subset $T$ of $\mathbb{N}$, $\mathbb{Z}$ or even $\mathbb{R}$, respectively. In view of the construction of the corresponding optimally robust estimator one has to choose an extension such that $|\tilde\eta_r| \le b_r = A_r c_r$ on $T$. Otherwise the bias would increase if we pass over from the neighborhood submodel to full neighborhoods. In view of Theorem 2.3.3 and Lemma 2.3.6 one possible choice is to extend $\tilde\eta_{c,r}$ to $T \setminus \{0, 1, \ldots, m\}$ simply by regarding $\Lambda$ as a function on $T$.

(b) We only need to consider $\theta \in (0, 0.5]$. Since

$$\mathrm{Binom}(m, \theta)(\{y\}) = \mathrm{Binom}(m, 1-\theta)(\{m-y\}) \tag{3.2.8}$$

it is equivalent to use $\bar\theta := 1 - \theta$ and $\bar y_1, \ldots, \bar y_n := m - y_1, \ldots, m - y_n$ for $\theta > 0.5$ and estimate $\bar\theta$. Hence, the results we obtain (e.g., for the Lagrange multipliers and the lower case radius $\bar r$) are symmetric about $\theta = 0.5$. This equivariance of the binomial model is considered in more detail in Example 3.1.1 of Lehmann and Casella (1998).

(c) As for each $m$, $\mathrm{med}(\Lambda)$ is non-unique for some $\theta \in (0,1)$ and the assumptions of Proposition 2.1.3 hold, $z_r$ is non-unique if $r \ge \bar r$ where the lower case radius $\bar r$ is defined in (2.1.12). For a plot of $\bar r$ on $(0, 0.5]$ for different values of $m$ confer Figure 3.2. The upper peaks correspond to those values of $\theta$ for which the median of $\Lambda$ is non-unique.

(d) Since $\mathcal{L}(\Lambda)$ is symmetric around zero for $\theta = 0.5$, we can choose $z_r = 0$ for all $r \in (0,\infty)$, respectively $M = 0$ for $r = \infty$; i.e., in this case, we only have to determine the clipping bound $c_r$ and the standardizing constant $A_r$.
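The reflection argument in part (b) of Remark 3.2.1 can be checked numerically. The sketch below verifies (3.2.8) on the whole support and then applies the transformation $\bar y_i = m - y_i$ to a small made-up sample; the estimator used (the sample mean divided by $m$, i.e., the MLE) and the data values are chosen purely for illustration.

```python
import math

def binom_pmf(m, theta):
    """P(y = k), k = 0..m, under Binom(m, theta), computed via log-gamma."""
    return [math.exp(math.lgamma(m + 1) - math.lgamma(k + 1)
                     - math.lgamma(m - k + 1)
                     + k * math.log(theta) + (m - k) * math.log(1 - theta))
            for k in range(m + 1)]

m, theta = 25, 0.7
p = binom_pmf(m, theta)
q = binom_pmf(m, 1 - theta)

# (3.2.8): Binom(m, theta)({y}) = Binom(m, 1 - theta)({m - y}) for every y.
max_gap = max(abs(p[y] - q[m - y]) for y in range(m + 1))

# Hypothetical sample with theta > 0.5: reflect, estimate theta_bar, map back.
ys = [20, 18, 17, 21, 19]
ys_bar = [m - y for y in ys]
theta_bar_hat = sum(ys_bar) / (len(ys_bar) * m)
theta_hat = 1 - theta_bar_hat
theta_hat_direct = sum(ys) / (len(ys) * m)

print(max_gap, theta_hat, theta_hat_direct)
```

For an equivariant estimator the two routes give the same value of $\hat\theta$, which is why it suffices to tabulate results for $\theta \in (0, 0.5]$.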

Figure 3.1: Optimally robust ICs for $\mathrm{Binom}(25, 0.25)$ in case of contamination neighborhoods ($* = c$) with (starting) radius $r = 0, 0.01, 0.10, \infty$. [Plot of the optimally robust IC against $y$.]

Figure 3.2: Lower case radius $\bar r$ for $\theta \in (0, 0.5]$ (results symmetric to $\theta = 0.5$) and sizes $m = 2, 5, 10, 50$ in case of contamination neighborhoods ($* = c$). [Plot of $\bar r$ against the probability of success.]

3.2.1.2 Continuity and Uniqueness of Lagrange Multipliers

The continuity of the Lagrange multipliers $A_r$ and $b_r$ in $r$, stated in Proposition 2.1.9, is visualized in Figure 3.3. Since $\mathrm{med}(\Lambda)$ is non-unique for $\theta = 1 - \sqrt[25]{0.5}$, there is a whole interval of valid centering constants $a_r$ for $r \ge \bar r \approx 0.606$. The boundaries of this interval can be determined via (2.1.17) and are given in Figure 3.3. In contrast to the standardized bias $b_r$ and the asymptotic variance $A_r - r^2 b_r^2$, which seem to be non-differentiable at some values of $r$, $A_r$ looks very smooth in $r$.
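The values of $\theta$ at which the median is non-unique can be located numerically. Since $\Lambda$ is an increasing affine function of $y$, the sketch below works with $y$ and uses the elementary fact that the median of $y$ is non-unique exactly when $P(y \le k) = 1/2$ for some $k \in \{0, \ldots, m-1\}$; each such equation is solved for $\theta$ by bisection. The function names are hypothetical, and this is an illustration rather than the method used for the figures.

```python
import math

def binom_cdf(k, m, theta):
    """P(y <= k) under Binom(m, theta), for 0 < theta < 1."""
    return sum(math.exp(math.lgamma(m + 1) - math.lgamma(j + 1)
                        - math.lgamma(m - j + 1)
                        + j * math.log(theta) + (m - j) * math.log(1 - theta))
               for j in range(k + 1))

def nonunique_median_thetas(m, iterations=100):
    """Solve P(y <= k) = 1/2 for theta, k = 0..m-1, by bisection."""
    thetas = []
    for k in range(m):
        lo, hi = 0.0, 1.0
        for _ in range(iterations):
            mid = 0.5 * (lo + hi)
            # P(y <= k) is decreasing in theta.
            if binom_cdf(k, m, mid) > 0.5:
                lo = mid
            else:
                hi = mid
        thetas.append(0.5 * (lo + hi))
    return thetas

thetas = nonunique_median_thetas(25)
print(thetas[0], 1 - 0.5 ** (1 / 25))
```

For $k = 0$ the equation reduces to $(1-\theta)^m = 1/2$, which reproduces the value $\theta = 1 - \sqrt[25]{0.5}$ used above for $m = 25$.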

The Lagrange multipliers $A_r$ and $b_r = A_r c_r$, and therefore also the minimax asymptotic MSE and the asymptotic variance, are continuous in $\theta$, as proven in Theorem 2.1.11. However, as part (c) of Remark 3.2.1 already indicates, this is not necessarily true for the corresponding centering constant $a_r = A_r z_r$. This fact is illustrated in Figure 3.4, where we choose $r \ge \bar r$ to demonstrate the extreme case. The discontinuity points coincide with those values of $\theta$ for which the median of $\Lambda$ is non-unique. In addition, this plot indicates that the Lagrange multipliers, and hence the standardized asymptotic bias, the asymptotic variance and the maximum asymptotic MSE, are not necessarily smooth functions of $\theta$. ////

Figure 3.3: Continuity in the radius $r$ of the Lagrange multipliers contained in the MSE optimal ICs for $r \in (0, 1.5]$, $\theta = 1 - \sqrt[25]{0.5}, 0.1, 0.25$ and $m = 25$ in case of contamination neighborhoods ($* = c$). [Four panels against the radius: maximum asymptotic MSE $= A_r$; optimal centering constant $a_r = A_r z_r$; standardized bias $b_r = A_r c_r$; asymptotic variance $= A_r - r^2 b_r^2$.]

Figure 3.4: Continuity in $\theta$ of the Lagrange multipliers contained in the MSE optimal ICs for $\theta \in (0, 0.5]$ (results symmetric to $\theta = 0.5$), $r = 2.0$ and $m = 5, 10, 20$ in case of contamination neighborhoods ($* = c$). [Four panels against the probability of success: maximum asymptotic MSE $= A_2$; optimal centering constant $a_2 = A_2 z_2$; standardized bias $b_2 = A_2 c_2$; asymptotic variance $= A_2 - 2^2 b_2^2$.]


3.2.1.3 Normal Approximation

The following lemma states the normal approximation for the optimally robust ICs, respectively for the Lagrange multipliers contained in these ICs which is a consequence of Theorem 2.4.1. The corresponding optimally robust IC in case of one-dimensional normal location is

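$$\tilde\eta^{\mathrm{1.loc}}_{c,r}(y) = A^{\mathrm{1.loc}}_r\, y \min\bigl\{1, c^{\mathrm{1.loc}}_r |y|^{-1}\bigr\} \tag{3.2.9}$$

with

$$1 = A^{\mathrm{1.loc}}_r\, \mathrm{E}_{\mathcal{N}(0,1)}\, |y| \min\bigl\{|y|, c^{\mathrm{1.loc}}_r\bigr\} \tag{3.2.10}$$

and

$$r^2 c^{\mathrm{1.loc}}_r = \mathrm{E}_{\mathcal{N}(0,1)} \bigl(|y| - c^{\mathrm{1.loc}}_r\bigr)_+ \tag{3.2.11}$$

For $r = \infty$ we obtain

$$\tilde\eta^{\mathrm{1.loc}}_{c,\infty}(y) = \sqrt{\frac{\pi}{2}}\, \mathrm{sign}(y) \tag{3.2.12}$$

which attains the minimum bias $\omega_c^{\min,\mathrm{1.loc}} = \sqrt{\pi/2}$; confer also Subsection 2.2.2 of Rieder et al. (2001).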
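For the normal location case, (3.2.10) and (3.2.11) can be evaluated in closed form, which makes a small numerical check possible. The sketch below uses the standard normal identities $\mathrm{E}(|y| - c)_+ = 2\{\varphi(c) - c\,(1 - \Phi(c))\}$ and $\mathrm{E}\,|y|\min\{|y|, c\} = 2\Phi(c) - 1$ (both follow from integration by parts; they are not stated in the text) and solves (3.2.11) by bisection. The function name is hypothetical.

```python
import math

def normal_location_multipliers(r, iterations=200):
    """Return (A, c) solving (3.2.10) and (3.2.11) for radius r in (0, inf)."""
    def density(c):
        return math.exp(-0.5 * c * c) / math.sqrt(2.0 * math.pi)

    def upper_tail(c):
        return 0.5 * math.erfc(c / math.sqrt(2.0))   # 1 - Phi(c)

    def clip_gap(c):
        # E(|y| - c)_+ - r^2 c under N(0, 1); decreasing in c.
        return 2.0 * (density(c) - c * upper_tail(c)) - r * r * c

    # clip_gap(0) = E|y| > 0, and clip_gap(E|y| / r^2) < 0.
    lo, hi = 0.0, math.sqrt(2.0 / math.pi) / (r * r)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if clip_gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    c = 0.5 * (lo + hi)
    # E|y| min{|y|, c} = 2 Phi(c) - 1 = erf(c / sqrt(2)).
    A = 1.0 / math.erf(c / math.sqrt(2.0))
    return A, c

A_loc, c_loc = normal_location_multipliers(0.1)
print(A_loc, c_loc)
```

As $r$ grows, the product $A^{\mathrm{1.loc}}_r c^{\mathrm{1.loc}}_r$ computed this way approaches $\sqrt{\pi/2}$, in line with (3.2.12).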

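Lemma 3.2.2 Let $D = 1$ and $\gamma_m = \sqrt{m / (\theta(1-\theta))}$. Then,

$$\lim_{m\to\infty} \gamma_m^2 A_r = A^{\mathrm{1.loc}}_r, \qquad \lim_{m\to\infty} \gamma_m^{-1} z_r = z^{\mathrm{1.loc}}_r = 0, \qquad \lim_{m\to\infty} \gamma_m^{-1} c_r = c^{\mathrm{1.loc}}_r \tag{3.2.13}$$

for all $r \in (0,\infty)$ and

$$\lim_{m\to\infty} \gamma_m\, \omega_c^{\min} = \omega_c^{\min,\mathrm{1.loc}} \tag{3.2.14}$$

Proof We have

$$\gamma_m^{-1} \Lambda_m(y) = \frac{y - m\theta}{\sqrt{m\theta(1-\theta)}} \tag{3.2.15}$$

and the central limit theorem of de Moivre-Laplace yields $\mathcal{L}(\gamma_m^{-1}\Lambda_m) \xrightarrow{\,w\,} \mathcal{N}(0,1)$ as $m \to \infty$, where $\mathrm{E}\,\gamma_m^{-1}\Lambda_m = 0$ and $\mathrm{E}\,(\gamma_m^{-1}\Lambda_m)^2 = 1$ for all $m \in \mathbb{N}$; i.e.,

$$\lim_{m\to\infty} \mathrm{E}\,(\gamma_m^{-1}\Lambda_m)^2 = 1 = \mathrm{E}_{\mathcal{N}(0,1)}\, y^2 \tag{3.2.16}$$

Hence, we can apply Theorem 2.4.1, which yields (3.2.13) and (3.2.14). ////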
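The standardization used in the proof and the limit (3.2.14) can be illustrated numerically. The sketch below assumes $\Lambda_m(y) = (y - m\theta)/(\theta(1-\theta))$, takes $D = 1$, and uses the smallest integer $M$ with $P(y \le M) \ge 1/2$ as median in (3.2.7); the function names are hypothetical.

```python
import math

def binom_pmf(m, theta):
    """P(y = k), k = 0..m, under Binom(m, theta), computed via log-gamma."""
    return [math.exp(math.lgamma(m + 1) - math.lgamma(k + 1)
                     - math.lgamma(m - k + 1)
                     + k * math.log(theta) + (m - k) * math.log(1 - theta))
            for k in range(m + 1)]

def standardized_moments(m, theta):
    """First and second moment of gamma_m^{-1} Lambda_m, cf. (3.2.15)."""
    p = binom_pmf(m, theta)
    scale = math.sqrt(m * theta * (1 - theta))
    x = [(k - m * theta) / scale for k in range(m + 1)]
    first = sum(pk * xk for pk, xk in zip(p, x))
    second = sum(pk * xk * xk for pk, xk in zip(p, x))
    return first, second

def standardized_min_bias(m, theta):
    """gamma_m * omega_c^min with omega_c^min from (3.2.7) and D = 1."""
    p = binom_pmf(m, theta)
    cdf, M = 0.0, m
    for k, pk in enumerate(p):
        cdf += pk
        if cdf >= 0.5:
            M = k
            break
    mean_abs_dev = sum(pk * abs(k - M) for k, pk in enumerate(p))
    gamma = math.sqrt(m / (theta * (1 - theta)))
    return gamma * theta * (1 - theta) / mean_abs_dev

limit = math.sqrt(math.pi / 2)   # omega_c^{min,1.loc} from (3.2.12)
for m in (10, 50, 1000):
    print(m, standardized_min_bias(m, 0.3), limit)
```

The first two moments are exactly 0 and 1 for every $m$, so only the shape of the distribution changes with $m$; the printed values of $\gamma_m \omega_c^{\min}$ can be compared with $\sqrt{\pi/2}$ to see the convergence claimed in (3.2.14).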

Remark 3.2.3 (a) The convergence of the standardized centering constants $\gamma_m^{-1} z_r$ and of the standardized bias terms $\gamma_m\, \omega_c(\tilde\eta_{c,r})$ is illustrated in Figure 3.5 and Figure 3.6, respectively. In case $r = \infty$, the discontinuity points of the centering constant $z_r$ coincide with those values of $\theta$ for which the median of $\Lambda$ is non-unique. At these non-uniqueness points the standardized (infinitesimal) bias terms attain local minima in case $r = \infty$.

(b) Some examples for the convergence of the standardized minimax asymptotic MSE $\gamma_m^2 A_r$ and of the MSE-inefficiencies are given in Figure 3.7 and Subsection 3.3.1, respectively.


(c) Figure 3.8 shows the MSE-inefficiency of the normal approximated IC. That is, we use the Lagrange multipliers $A^{\mathrm{1.loc}}_r / \gamma_m^2$, $z^{\mathrm{1.loc}}_r \gamma_m$ and $c^{\mathrm{1.loc}}_r \gamma_m$ instead of the optimal $A_r$, $z_r$ and $c_r$; confer Lemma 3.2.2. To make sure that the resulting function is indeed an IC (with respect to the binomial model), we additionally center and standardize this function. If the radius is not too large ($r \le 0.5$), the MSE-inefficiency of this normal approximation is very small, independent of the size $m$. To get a good approximation for larger radii, we need moderate to large sizes $m$, depending on the parameter $\theta$. Thus, these numerical results indicate that the "distance" between the optimal IC and its normal approximation also depends on the radius $r$ as well as on the parameter $\theta$. In particular, for radii larger than the lower case radius $\bar r$ the approximation seems to work best for those values of $\theta$ where $\mathrm{med}(\Lambda)$ is non-unique, respectively $\theta \le 1 - \sqrt[m]{0.5}$ or $\theta \ge \sqrt[m]{0.5}$. This result, the results in case of total variation neighborhoods (cf. Subsubsection 3.2.2.3) and further numerical investigations indicate the following cause. In all cases except those where $\mathrm{med}(\Lambda)$ is non-unique, respectively $\theta \le 1 - \sqrt[m]{0.5}$ or $\theta \ge \sqrt[m]{0.5}$, the lower case solution for the binomial model and contamination neighborhoods attains three values ($\pm\omega_c^{\min}$, $\beta$), and the third value $\beta$ is in most cases only badly approximated by the normal approximation. ////


Figure 3.5: Normal approximation of the standardized centering constant $\gamma_m^{-1} z_r$ for $\theta \in (0, 0.5]$ (results symmetric to $0.5$) in case of contamination neighborhoods ($* = c$) with radius $r = 0.1, 0.25, 0.5, \infty$.

Figure 3.6: Normal approximation of the standardized infinitesimal bias terms $\gamma_m\, \omega_c(\tilde\eta_{c,r})$ for $\theta \in (0, 0.5]$ (results symmetric to $0.5$) in case of contamination neighborhoods ($* = c$) with radius $r = 0.1, 0.25, 0.5, \infty$. [Four panels, one per radius, showing $\gamma_m\, \omega_c(\tilde\eta_{c,r})$ against the probability of success for $m = 10, 50, 1000$ and the 1.loc limit.]

Figure 3.7: Normal approximation of the standardized maximum asymptotic MSE $\gamma_m^2 A_r$ for $\theta \in (0, 0.5]$ (results symmetric to $0.5$) for contamination neighborhoods ($* = c$) with radius $r = 0.1, 0.25, 0.5, 2.0$. [Four panels, one per radius, showing $\gamma_m^2 A_r$ against the probability of success for $m = 10, 50, 1000$ and the 1.loc limit.]

Figure 3.8: MSE-inefficiency of the normal approximated IC for $\theta \in (0, 0.5]$ (results symmetric to $0.5$) in case of contamination neighborhoods with radius $r = 0.1, 0.25, 0.5, 2.0$. [Four panels, one per radius, showing the MSE-inefficiency against the probability of success for $m = 10, 50, 1000$.]
