The subject of this second part is a collection of well-known parametric models that have rarely been considered in the robust literature. Where standard robust models like location and scale are included, our distributional assumptions on the ideal model are non-standard.

Binomial and Poisson Model

Robust estimation in the binomial and Poisson models has received little attention in the robust literature. It was first mentioned in Section F.3 of Hampel (1968), who calculates the score function $\Lambda_\theta$ and applies his Lemma 5 to the binomial and Poisson models. His optimal $\psi$ function $\tilde\psi_\theta$ in general minimizes the asymptotic variance $E_\theta \psi_\theta^2 / (E_\theta \psi_\theta \Lambda_\theta)^2$ under the bound $b = c / E_\theta \tilde\psi_\theta \Lambda_\theta$ on the gross error sensitivity $\sup |\psi_\theta / E_\theta \psi_\theta \Lambda_\theta|$ for any $c \in (0, \infty)$. Hampel’s solution is of the same form as our optimally robust influence curves in the case of infinitesimal contamination neighborhoods, as specified in Subsection 1.3.3.
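The quantities named above can be evaluated numerically for a centered and clipped score. The following is a minimal sketch for the Poisson model, not Hampel's original computation: the function names, the truncation of the support at `x_max` and the bisection used to find the centering constant are all assumptions made for illustration.

```python
import math


def poisson_pmf(x, theta):
    """Poisson probability P(X = x), computed on the log scale for stability."""
    return math.exp(x * math.log(theta) - theta - math.lgamma(x + 1))


def clipped_score_summary(theta, c, x_max=200):
    """Centre and clip the Poisson score, then standardize it to an influence curve.

    Assumption of this sketch: the support is truncated at x_max, so the
    result is only meaningful when the mass beyond x_max is negligible.
    """
    xs = range(x_max + 1)
    pmf = [poisson_pmf(x, theta) for x in xs]
    score = [x / theta - 1.0 for x in xs]  # Lambda_theta(x) = x / theta - 1

    def clip(v):
        return max(-c, min(c, v))

    def mean_psi(a):
        return sum(p * clip(s - a) for p, s in zip(pmf, score))

    # mean_psi is continuous and non-increasing in a, positive at lo and
    # negative at hi, so bisection finds a centering constant with E psi = 0.
    lo, hi = score[0] - c, score[-1] + c
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mean_psi(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    a = 0.5 * (lo + hi)

    psi = [clip(s - a) for s in score]
    e_psi_lambda = sum(p * v * s for p, v, s in zip(pmf, psi, score))
    e_psi_sq = sum(p * v * v for p, v in zip(pmf, psi))
    return {
        "a": a,
        "e_psi_lambda": e_psi_lambda,
        "asymptotic_variance": e_psi_sq / e_psi_lambda ** 2,
        "gross_error_sensitivity": max(abs(v) for v in psi) / e_psi_lambda,
    }
```

With a clipping bound so large that it is never active, the asymptotic variance reduces to the inverse Fisher information, which is $\theta$ in the Poisson model. A small bound trades a larger variance for a smaller gross error sensitivity.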

In his treatment of the binomial and Poisson models, as for general smooth parametric models, Hampel (1968) has no criterion for the choice of the sensitivity bound $b$. By considering the corresponding MSE problem, we obtain an additional equation that determines $b$ in a unique and optimal way depending on the (starting) radius $r \in (0, \infty)$ of the infinitesimal neighborhoods; confer Subsection 1.3.4.

There are other papers on robust estimation in discrete models which, however, consider only particular aspects. Ruckstuhl and Welsh (2001), for instance, propose a robust estimator for the binomial model which has a high breakdown point and at the same time a bounded influence curve. Simpson et al. (1987) show asymptotic non-normality over neighborhoods of Hampel’s optimal M estimators when the underlying distribution is discrete. The result, a limiting law which is pieced together from two normal distributions with different standard deviations, is analogous to the result given on p. 78 of Huber (1964), respectively on p. 51 of Huber (1981). Moreover, a similar result on the asymptotic non-normality of the trimmed mean has been proved by Stigler (1973). As a way out, Simpson et al. (1987) propose to replace $\tilde\psi_\theta$ by a smooth approximation to retain asymptotic normality.

In contrast, our optimally robust influence curves are solutions to certain optimization problems based on the MSE criterion. Asymptotic normality of our more general AL estimators on full $1/\sqrt{n}$ neighborhoods is obtained from the smoothness of the underlying parametric model and a suitable estimator construction. Most frequently in the literature, robust estimators are constructed via the M principle. We prefer to construct the corresponding optimally robust estimators by means of the one-step method which, depending on a suitable initial estimator, is faster to compute and always yields a unique solution. For more details on one-step constructions we refer to Section 6.4 of Rieder (1994) and Section 2.3.
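The one-step method can be sketched in a few lines: the estimate is the initial estimate plus the sample mean of the influence curve evaluated at that initial estimate. The function names below are hypothetical. The Poisson IC used here is the classical, unbounded one, chosen only because it makes the result easy to check; a robust procedure would plug in a bounded IC and a robust initial estimate instead.

```python
def one_step(sample, theta0, influence_curve):
    """One-step estimate: theta0 plus the sample mean of the IC evaluated at theta0."""
    n = len(sample)
    return theta0 + sum(influence_curve(x, theta0) for x in sample) / n


def classical_poisson_ic(x, theta):
    """Classical IC in the Poisson model: inverse Fisher information times score.

    The score is x / theta - 1 and the Fisher information is 1 / theta,
    so the IC is theta * (x / theta - 1) = x - theta.
    """
    return x - theta


sample = [2, 3, 3, 4, 8]
estimate = one_step(sample, theta0=2.5, influence_curve=classical_poisson_ic)
```

With the classical IC the one-step estimate equals the sample mean regardless of the initial estimate. This illustrates why the step yields a unique solution without iterating an estimating equation.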

In Chapters 3 and 4 we consider the binomial and Poisson models in detail. We first briefly introduce the ideal models; confer Sections 3.1 and 4.1. In Subsubsections 3.2.1.1 and 3.2.2.1, respectively 4.2.1.1 and 4.2.2.1, we specify the MSE optimal ICs in the case of contamination ($* = c$) as well as total variation neighborhoods ($* = v$) and give some numerical results for the lower case radius $\bar r$ introduced in Subsection 2.1.2.

Subsequently, we numerically investigate technical properties (continuity and uniqueness) of the Lagrange multipliers contained in the optimal solutions. These properties are useful for the determination of least favorable radii (cf. Section 2.2), the one-step construction (cf. Section 2.3) and the convergence of robust models (cf. Section 2.4); confer Subsubsections 3.2.1.2 and 3.2.2.2, respectively 4.2.1.2 and 4.2.2.2.

First, we study the dependence on the neighborhood radius $r$. The numerical results indicate that the standardizing constant $A_r$ is smooth, whereas the standardized bias $b_r$, the lower clipping bound $c_r$ ($* = v$) and the asymptotic variance $A_r - r^2 b_r^2$ may be non-differentiable at some values of $r$. In addition, we consider parameter values $\theta$ where $\mathrm{med}(\Lambda_\theta)$ is non-unique. As a consequence of Proposition 2.1.3, the optimal centering constant $a_r$ is non-unique for $r \ge \bar r$ and those values of $\theta$. More precisely, there is a whole interval of valid centering constants for $r \ge \bar r$.

Second, we treat continuity with respect to the parameter $\theta$. The numerical results indicate that the standardizing constant $A_r$, the standardized bias $b_r$, the lower clipping bound $c_r$ ($* = v$) and the asymptotic variance $A_r - r^2 b_r^2$ are continuous, but not necessarily smooth, functions of $\theta$. Moreover, the centering constant $a_r$ ($* = c$) for radii $r \ge \bar r$ is even discontinuous at those values of $\theta$ for which $\mathrm{med}(\Lambda_\theta)$ is non-unique.

These numerical results confirm the continuity and uniqueness results derived in Subsections 2.1.4 and 2.1.5 and indicate that, in general, we cannot expect the Lagrange multipliers to be smooth functions of either the radius $r$ or the parameter $\theta$.

We also use the binomial and Poisson models to demonstrate the convergence of robust models derived in Section 2.4; confer Subsubsections 3.2.1.3 and 3.2.2.3, respectively Subsubsections 4.2.1.3, 4.2.2.3, 4.2.1.4 and 4.2.2.4.

For this purpose, we prove that the suitably standardized Lagrange multipliers in the binomial and Poisson models converge to the corresponding Lagrange multipliers of one-dimensional normal location. Moreover, we show that the Lagrange multipliers in the Poisson model can be approximated by the corresponding Lagrange multipliers arising in the binomial model.

With these results at hand, we numerically compute the “distance”, in terms of the MSE–inefficiency, between the optimal IC and the corresponding approximation.

In the case of contamination neighborhoods these approximations work well for small radii ($r \le 0.5$). In the case of total variation neighborhoods they perform even better, and we seem to get very good approximations independent of the neighborhood radius considered.

In Sections 3.3 and 4.3 we assume the (starting) radius of the infinitesimal neighborhoods is unknown. We give some numerical results for the least favorable radii and the corresponding MSE–inefficiencies in case of the binomial and Poisson models. In both models and all considered situations the efficiency loss stays below 30% and in most cases is even much smaller.

The construction problem in the case of the binomial and Poisson models is solved in Sections 3.4 and 4.4. That is, we verify that we can construct the optimally robust estimator by means of one-step constructions by applying the results of Subsection 2.3.3. In particular, we investigate those parameter values for which the centering constant $a_r$ ($* = c$) is non-unique for $r \ge \bar r$; i.e., we cannot apply Lemma 2.3.6 (b). As initial estimator we propose, and have also implemented, the Kolmogorov(–Smirnov) minimum distance estimator.
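A Kolmogorov minimum distance estimate picks the parameter whose model distribution function is closest, in sup norm, to the empirical distribution function. The sketch below does this for the Poisson model by a plain grid search. The function names and the grid are assumptions made for illustration, and this is not the implementation referred to in the text.

```python
import math


def poisson_cdf_values(theta, x_max):
    """Return [F_theta(0), ..., F_theta(x_max)] for the Poisson distribution."""
    p = math.exp(-theta)
    total, out = 0.0, []
    for x in range(x_max + 1):
        total += p
        out.append(total)
        p *= theta / (x + 1)  # recursion P(X = x + 1) = P(X = x) * theta / (x + 1)
    return out


def kolmogorov_distance(sample, theta):
    """Sup distance between the empirical and the Poisson(theta) distribution function.

    Assumes a sample of non-negative integers. Both functions are step
    functions jumping at the integers, and beyond max(sample) the distance
    only decreases, so checking the points 0, ..., max(sample) suffices.
    """
    x_max, n = max(sample), len(sample)
    counts = [0] * (x_max + 1)
    for x in sample:
        counts[x] += 1
    ecdf, running = [], 0
    for cnt in counts:
        running += cnt
        ecdf.append(running / n)
    model = poisson_cdf_values(theta, x_max)
    return max(abs(e - m) for e, m in zip(ecdf, model))


def kolmogorov_mde(sample, grid):
    """Grid point minimizing the Kolmogorov distance (hypothetical helper)."""
    return min(grid, key=lambda theta: kolmogorov_distance(sample, theta))


sample = [2, 3, 3, 4, 4, 4, 5, 5, 6, 7]
grid = [0.01 * k for k in range(1, 2001)]  # candidate values 0.01, ..., 20.00
theta_md = kolmogorov_mde(sample, grid)
```

A grid search is the simplest choice here because the distance is not a smooth function of the parameter. Any one-dimensional minimizer that tolerates kinks could be substituted.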

The implementation of the binomial model by means of S4 classes and methods (cf. Chambers (1998)) using R (cf. R Development Core Team (2005)) is described in detail in Section 3.5. Since the implementation of the Poisson model is very similar, we give only a very short description in Section 4.5. Both models are included in our R package ROptEst (cf. Appendix D.3) which is part of our R bundle RobASt.

To demonstrate the need for robust estimation in these two simple discrete models, we include some small simulation studies; confer Sections 3.6 and 4.6. The results indicate that the classically optimal estimator (the mean) is too sensitive: even very small deviations from the ideal model may lead to a very high efficiency loss compared to the optimally robust estimator. In addition, the results of these studies indicate that the radius–minimax estimator may be a good choice if the true neighborhood radius is unknown.

Exponential Scale and Gumbel Location

Hampel (1968) (cf. Section F.1) discusses robust estimation in the case of the exponential model, which arises as an important special case of the Gamma model. He proposes to use a trimmed mean and suggests that the trimmed mean has the same breakdown point as the commonly used Winsorized mean (cf. Feller (1971), Problem 17, p. 41) but, in addition, has a smaller sensitivity.

Gather and Schultze (1999) consider the standardized median as robust estimator for the exponential scale model. They show (cf. Theorem 2.1, ibid.) that this estimator is most B-robust in the sense of Hampel et al. (1986); i.e., it has minimal gross error sensitivity. In addition, Gather and Schultze (1999) introduce two other robust estimators (RCS and Q estimators) which have been proposed by Rousseeuw and Croux (1993). All three estimators have the highest possible breakdown point, which in this setup is 0.5. However, their bias curves and their asymptotic relative efficiencies are different.
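As a small illustration of the standardized median: the median of an exponential distribution with scale $\sigma$ is $\sigma \log 2$, so dividing the sample median by $\log 2$ gives a scale estimate. The sketch below assumes this standardization. The function name and the toy data are made up for illustration.

```python
import math
import statistics


def standardized_median(sample):
    """Scale estimate for the exponential model: sample median divided by log(2)."""
    return statistics.median(sample) / math.log(2.0)


clean = [0.1, 0.3, 0.5, 0.7, 1.0, 1.4, 2.0]
contaminated = clean + [1000.0]  # one gross outlier added

scale_clean = standardized_median(clean)
scale_contaminated = standardized_median(contaminated)
mean_contaminated = statistics.mean(contaminated)
```

On the contaminated toy sample the standardized median moves only slightly, whereas the sample mean (the classical scale estimate in this model) is dominated by the single outlier.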

As already mentioned above, our optimally robust influence curves are solutions to certain optimization problems and we obtain asymptotic normality of our more general AL estimators on full $1/\sqrt{n}$ neighborhoods from the smoothness of the underlying parametric model and a suitable estimator construction. Moreover, in the case of the one-step construction, global properties like breakdown can be delegated to the initial estimate. Aside from these (local and global) properties, the focus of Chapter 5 is the connection between location and scale models rather than the models themselves.

In Chapter 5 we show that certain scale and location models are connected via the transformations $\pm\log|\cdot|$, an approach motivated by the treatment of the normal scale model in Section 5.6 of Huber (1981). This is, for instance, true in the case of the exponential scale and the Gumbel location model.

We begin with a brief introduction of the one-dimensional scale and the one-dimensional location model; confer Subsections 5.1.1 and 5.1.2. Subsequently, we derive the mentioned connection (cf. Subsection 5.1.3) and show that this connection entails a strong relationship between the Lagrange multipliers contained in the corresponding MSE optimal ICs. To demonstrate our results, we use the exponential scale model which is related to the Gumbel location model via the transformation $-\log|\cdot|$.
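The relation between the two models can be checked by a short calculation. The following is a sketch under the assumption that the exponential scale model is parametrized by $P(X > x) = \exp(-x/\sigma)$ and that the Gumbel location model has unit scale; the symbols $Y$ and $\mu$ are introduced here for illustration only.

```latex
% Assumption of this sketch: X is exponential with scale \sigma > 0,
% i.e. P(X > x) = \exp(-x/\sigma) for x \ge 0.
\begin{align*}
  Y &= -\log|X|, \\
  P(Y \le y) &= P\bigl(X \ge e^{-y}\bigr)
             = \exp\bigl(-e^{-y}/\sigma\bigr)
             = \exp\bigl(-e^{-(y-\mu)}\bigr),
  \qquad \mu = -\log\sigma .
\end{align*}
```

Hence $Y$ follows a Gumbel distribution with location $\mu = -\log\sigma$, so the scale parameter of the exponential model turns into a location parameter.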

The optimally robust ICs for these two models in the case of contamination ($* = c$) as well as total variation ($* = v$) neighborhoods are specified in Section 5.2. In both cases ($* = c, v$) the optimal ICs can be rewritten in such a way that the contained Lagrange multipliers are identical for both models.

As a consequence of this coincidence of the Lagrange multipliers, the least favorable radii and the corresponding MSE–inefficiencies, which are given in Section 5.3, are identical for both models. If the radius is completely unknown, the maximum efficiency loss is about 38% ($* = c$) and 22% ($* = v$), respectively. That is, the loss is larger than in the case of normal location, respectively lognormal scale, where we obtain about 18% ($* = c, v$); confer Remark 5.1.9 (b). It is, however, smaller than in the case of normal scale, where the subefficiency is about 50% ($* = c$) and 25% ($* = v$), respectively; confer Section 5.2 of Rieder et al. (2001).

The construction problem for one-dimensional location, respectively scale models is treated in Section 5.4. If the considered location, respectively scale model forms an exponential family, we can construct the optimally robust estimators by means of the one-step method; confer Lemma 2.3.6. As initial estimator we propose the Kolmogorov(–Smirnov) minimum distance estimator which has the required properties (strict and $\sqrt{n}$ consistent).

A short description of the implementation of some one-dimensional scale (exponential, normal, lognormal), respectively some one-dimensional location (Gumbel, normal) models is given in Section 5.5. All these models are included in our R package ROptEst (cf. Appendix D.3) which is part of our R bundle RobASt.

Gamma Model

In Section F.1, Hampel (1968) treats robust estimation in the case of the Gamma model. However, he only considers the estimation of the scale parameter $\sigma$ for known shape parameter $\alpha$, respectively the estimation of the shape parameter $\alpha$ for known scale parameter $\sigma$, and not the simultaneous estimation of scale and shape.

Hampel et al. (1986) (Section 4.4, p. 256) consider the robust estimation of the shape parameter $\alpha$ where scale $\sigma$ is regarded as nuisance. Instead of $\sigma$ they use the re-parametrization $\nu = \log(\sigma)$ which has been introduced in Example 1 of Subsection 4.3d (ibid.). This re-parametrization endows the Gamma model with a certain invariance structure; confer Section 6.1.

Marazzi and Ruffieux (1996) discuss the implementation of the M estimators for the Gamma model proposed by Hampel et al. (1986). They also work with the re-parametrization $\nu$. In addition, they consider the parametrization $\kappa = \log(\alpha) + \nu$ since their main interest is the estimation of the mean of the Gamma distribution, which is $\alpha\sigma = e^{\kappa}$.

Such differentiable parameter transformations with Jacobian matrix of full rank are also allowed in the case of the optimal solutions presented in Section 1.3. We use the Gamma model to demonstrate how one can estimate such transformations in our setup. Moreover, the optimality result given in Theorem 1.3.11 is clearly stronger than the optimality provided by Theorem 4.3.1 of Hampel et al. (1986) (cf. also the discussion before Theorem 4.3.1, ibid.).

In Chapter 6 we first briefly introduce the Gamma model as ideal model, where we take into account the parameter transformation cited above; confer Section 6.1.

The MSE optimal IC in the case of contamination neighborhoods ($* = c$) is specified in Section 6.2. We show how the re-parametrization $\nu = \log(\sigma)$, by means of Theorem 2.4.1, leads to a simplification in our setup, too. However, in contrast to Section 4.4 of Marazzi and Ruffieux (1996), where the standardizing matrices for bijective and differentiable parameter transformations can always be obtained via the corresponding Jacobian matrices, this is not possible in general for the Lagrange multipliers included in our MSE solutions. We may derive valid ICs via the corresponding Jacobian matrices, but these ICs lead to suboptimal robust estimators which may have a quite large efficiency loss ($> 100\%$); confer Table 6.1.
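Why multiplication by the Jacobian still yields a valid IC can be seen from the two defining conditions. The display below is a sketch under assumptions not fixed in the text: the parameter is ordered as $\theta = (\nu, \alpha)'$, $\psi_\theta$ denotes any IC at $\theta$, and $D$ denotes the Jacobian of a differentiable transformation $\tau$ at $\theta$.

```latex
% Assumptions of this sketch: \theta = (\nu, \alpha)', \psi_\theta is an IC at \theta,
% and \tau is differentiable with Jacobian D = d\tau(\theta).
\begin{align*}
  E_\theta \psi_\theta = 0, \quad
  E_\theta \psi_\theta \Lambda_\theta' = I
  \quad &\Longrightarrow \quad
  E_\theta (D\psi_\theta) = 0, \quad
  E_\theta (D\psi_\theta) \Lambda_\theta' = D, \\
  \tau(\nu, \alpha) = \kappa = \log(\alpha) + \nu
  \quad &\Longrightarrow \quad
  D = \bigl(1, \; 1/\alpha\bigr).
\end{align*}
```

The transformed function $D\psi_\theta$ therefore satisfies the conditions of an IC for $\tau(\theta)$. Validity does not imply MSE optimality, however, which is the point made above.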

In Section 6.3 we give some numerical results for the least favorable radii and the corresponding MSE–inefficiencies. If the true neighborhood radius is completely unknown, the maximum subefficiencies are about 50% in all examples considered.

Since the Gamma model forms an exponential family of full rank, we can apply the results of Subsection 2.3.3; confer Section 6.4. That is, we can construct the optimally robust estimators by means of the one-step method using the Kolmogorov(–Smirnov) minimum distance estimator as initial estimator.

A short description of the implementation of the Gamma model is given in Section 6.5. Again, the corresponding optimally robust estimators can be computed via our R package ROptEst (cf. Appendix D.3) which is part of our R bundle RobASt.

So far (version 0.3-9), package ROptEst can be used to compute MSE optimal ICs and estimators for any $L_2$ differentiable parametric family which is based on a univariate distribution.

Chapter 3