Genesis of this Thesis

The work on this thesis started with some numerical investigations into the “costs of not knowing the radius”; confer Rieder et al. (2001). We computed the optimally robust influence curves for several well-known robust models such as location, scale, and linear regression. For this purpose we had, for instance, to devise new numerical algorithms for conditional (error-free-variables) regression neighborhoods.

The results contained in Part II started as an extension of these radius investigations to some non-standard robust models. In particular, they led us to some new (theoretical) results such as an extension of the classical Cramér-Rao bound for quadratic loss (cf. Section 2.1) or the convergence of robust models (cf. Section 2.4).

Moreover, we were able to supply the mathematical results (cf. Section 2.2) which support and complement the purely numerical determination in Rieder et al. (2001).

The models treated in Part II are covered by our R package ROptEst; confer Appendix D.3.

At that time, the paper by Bednarski and Müller (2001) on robust linear regression with unknown scale was published, which, in our view, was not very convincing. Hence, we decided to investigate this model.

Very few readers seem to realize that already for scale alone, in the symmetrically contaminated model about a centered normal, the minimax variance approach of Huber (1981) remains incomplete. In addition, he shows no quantitative, let alone optimal, robustness of his joint location and scale estimates.

Moreover, the robust linear regression model with unknown error scale is covered by the local and asymptotic, infinitesimal robustness theories of Hampel et al. (1986) and Rieder (1994), but it is not treated very explicitly there. This is true already for the simpler model of joint location and scale. In particular, Rieder (1994) has not specialized the optimality results contained in his Section 5.5 to concrete models.


Part III arose from this work; it is supplemented by Appendix B. Again, we also focused on the numerical evaluation of the derived optimally robust influence curves. For this purpose, we implemented our R packages ROptRegTS, RobRex and RobLox; confer Appendix D.4, Section 7.6 and Section 8.9.

If the distribution of the unknown error is symmetric, the linear regression model is adaptive with respect to the unknown error scale. That is, the estimation of the regression parameter with unknown error scale is asymptotically no harder than the estimation of the regression parameter with known error scale. A very detailed treatment of adaptivity in semiparametric models is given in Bickel et al. (1998). However, robustness properties of the procedures introduced there are not considered; confer p. 4 (ibid.).

The lectures on semiparametrics and (robust) time series models given by my supervisor Prof. Dr. Rieder in 2001 and 2002 led to our definition and investigation of adaptivity in the context of infinitesimal robust regression and time series models; confer Part IV and Appendix A.

We define robust adaptivity as the equality of the values of two robust optimization problems; confer Section 9.1. As a consequence, adaptivity is no longer only a dichotomous criterion but, in contrast to previous literature, now has a quantitative meaning, too. That is, we were able to evaluate the amount of non-adaptivity numerically. We restricted our considerations to the optimally robust influence curves which can be computed via our R package ROptRegTS (cf. Appendix D.4) and did not treat the construction of the corresponding robust–adaptive estimators as in Stabla (2005).

The aim of Part V was to check the asymptotics against exact finite-sample results. More precisely, we wanted to compare the finite-sample and asymptotic results given by Huber (1968), Rieder (1989) and Rieder (1980), respectively. While we were looking for a way to compute the finite-sample risk, or at least an approximation of it, we developed a convolution algorithm based on the fast Fourier transform (cf. Appendix C and Kohl et al. (2005)) which is implemented in our R package distr (cf. Ruckdeschel et al. (2005)). This procedure enables us to compute a very accurate numerical approximation of the finite-sample distribution and the corresponding finite-sample risk of robust estimators which are constructed by means of the M principle; confer Part V and Ruckdeschel and Kohl (2005). The corresponding finite-sample and asymptotic minimax estimators are implemented in our R packages ROptEst (cf. Appendix D.3) and ROptRegTS (cf. Appendix D.4).
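The general idea behind an FFT-based convolution can be illustrated with a minimal sketch. It is written in Python rather than R and is not the algorithm of Appendix C or of the distr package: it only treats probability mass functions on the lattice 0, 1, 2, ..., and the function names `fft`, `convolve_pmf` and `n_fold_convolution` are chosen here purely for illustration. The distribution of a sum of independent lattice variables is obtained by zero-padding the mass functions, multiplying their discrete Fourier transforms, and transforming back.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    result = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        result[k] = even[k] + twiddle
        result[k + n // 2] = even[k] - twiddle
    return result


def convolve_pmf(p, q):
    """Mass function of X + Y for independent X ~ p, Y ~ q on 0, 1, 2, ..."""
    out_len = len(p) + len(q) - 1
    size = 1
    while size < out_len:  # pad to a power of two to avoid wrap-around
        size *= 2
    fp = fft([complex(x) for x in p] + [0j] * (size - len(p)))
    fq = fft([complex(x) for x in q] + [0j] * (size - len(q)))
    product = [a * b for a, b in zip(fp, fq)]
    back = fft(product, inverse=True)
    # Normalize the inverse transform and clip tiny negative rounding errors.
    return [max(v.real / size, 0.0) for v in back[:out_len]]


def n_fold_convolution(p, n):
    """Mass function of the sum of n independent copies of X ~ p."""
    result = [1.0]  # point mass at 0, the neutral element of convolution
    for _ in range(n):
        result = convolve_pmf(result, p)
    return result
```

For example, `convolve_pmf(die, die)` with `die = [0.0] + [1.0 / 6.0] * 6` yields the mass function of the sum of two fair dice. Continuous distributions would first have to be discretized on an equidistant grid, which introduces an additional approximation error not addressed in this sketch.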

Motivated by the work of P. Ruckdeschel (cf. Ruckdeschel (2004a), Ruckdeschel (2004b), Ruckdeschel (2004c), Ruckdeschel (2005e)) we also integrated some results on higher order asymptotics.

Chapter 1 on the fundamentals of the asymptotic theory of robustness was included to make this thesis more readable. It is based on Chapters 4 and 5 of Rieder (1994). Chapter 2 of Part I was not originally planned. As already mentioned above, the results it contains for the most part emerged from the work on the other parts, and many of the included (theoretical) results were prompted by our numerical evaluations. The results on one-step constructions (cf. Section 2.3) arose during the work on our R package ROptEst (cf. Appendix D.3).

The R bundle RobASt (cf. Appendix D), which consists of the R packages distrEx, RandVar, ROptEst, RobLox, ROptRegTS and RobRex, can in principle be used to recompute all numerical results contained in this thesis. However, the main intention was to provide a way to determine optimally robust estimators for various smooth parametric models. Our implementation is based on S4 classes and methods (cf. Chambers (1998)) and uses our R package distr (cf. Ruckdeschel et al. (2005)) in a crucial way. With this approach we, contrary to Marazzi (1993), were able to uncouple our algorithms from specific distributional assumptions and to implement them at one stroke for whole classes of models. For example, one can use our R package ROptEst to compute optimally robust estimators for any smooth parametric model which is based on a univariate distribution.
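The benefit of uncoupling an algorithm from the distribution it acts on can be illustrated with a small sketch. It is written in Python rather than with S4 classes in R and does not reproduce the design of distr or ROptEst; the classes `Normal` and `Logistic` and the function `location_fisher_information` are hypothetical names introduced only for this illustration. The routine approximates the Fisher information of a location model, the integral of f'(x)^2 / f(x), and works for any object that offers a `pdf` method.

```python
import math


class Normal:
    """Hypothetical distribution object: normal with given mean and sd."""

    def __init__(self, mean=0.0, sd=1.0):
        self.mean, self.sd = mean, sd

    def pdf(self, x):
        z = (x - self.mean) / self.sd
        return math.exp(-0.5 * z * z) / (self.sd * math.sqrt(2.0 * math.pi))


class Logistic:
    """Hypothetical distribution object: logistic with location and scale."""

    def __init__(self, location=0.0, scale=1.0):
        self.location, self.scale = location, scale

    def pdf(self, x):
        z = (x - self.location) / self.scale
        e = math.exp(-abs(z))  # the density is symmetric in z
        return e / (self.scale * (1.0 + e) ** 2)


def location_fisher_information(dist, lower=-30.0, upper=30.0,
                                steps=20000, h=1e-4):
    """Approximate the integral of f'(x)^2 / f(x) over [lower, upper].

    Uses the midpoint rule and central differences; `dist` only needs
    to provide a `pdf` method, so no distributional assumption is coded here.
    """
    width = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * width
        f = dist.pdf(x)
        if f < 1e-300:  # skip points where the density underflows
            continue
        df = (dist.pdf(x + h) - dist.pdf(x - h)) / (2.0 * h)
        total += df * df / f * width
    return total
```

The same function can then be called as `location_fisher_information(Normal())` or `location_fisher_information(Logistic())` without any change to the numerical routine. The integration bounds and step sizes are ad hoc choices for this sketch and would need adjusting for heavy-tailed or strongly rescaled distributions.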

Acknowledgements

Many people helped and supported me during the work on this thesis.

First of all I thank my supervisor Prof. Dr. Rieder, who taught me robust statistics, contributed many ideas to this work and was always willing to spend his valuable time on discussions.

Also many thanks to my permanent dialogue partner Dr. Peter Ruckdeschel. Without him many of the ideas and results contained in this thesis would not have been found. In particular, he helped me to design the R bundle RobASt.

Last but not least I thank my wife Katrin, my daughters Anna and Emma, and my family and family-in-law. I owe them much more than I can express here. Without their love, support and patience this thesis would never have been finished.

Contents

Introduction and Summary (in German)
Introduction
Notation

I Asymptotic Theory of Robustness

1 Asymptotic Theory of Robustness – an Abridge
  1.1 Asymptotically Linear Estimators
  1.2 Infinitesimal Robust Setup
  1.3 Optimally Robust Influence Curves
    1.3.1 Introduction
    1.3.2 Bias Terms
    1.3.3 Minimum Trace Subject to Bias Bound
    1.3.4 Mean Square Error

2 Supplements to the Asymptotic Theory of Robustness
  2.1 Mean Square Error Solution
    2.1.1 Matrix A – an Analogue to the Inverse Fisher Information
    2.1.2 Discrete Models and the Gap Condition
    2.1.3 Boundedness of Lagrange Multipliers
    2.1.4 Uniqueness of Lagrange Multipliers
    2.1.5 Continuity Properties of Lagrange Multipliers
  2.2 Least Favorable Radius
  2.3 One-Step Construction
    2.3.1 Motivation and Setup
    2.3.2 Sufficient Conditions for Mean Square Error Solution
    2.3.3 Sufficient Conditions in case of Exponential Families
    2.3.4 Median and Median Absolute Deviation
  2.4 Convergence of Robust Models

II Non-Standard Robust Models

3 Binomial Model
  3.1 Introduction
  3.2 Optimally Robust Influence Curves
    3.2.1 Contamination Neighborhoods
      3.2.1.1 Mean Square Error Solution
      3.2.1.2 Continuity and Uniqueness of Lagrange Multipliers
      3.2.1.3 Normal Approximation
    3.2.2 Total Variation Neighborhoods
      3.2.2.1 Mean Square Error Solution
      3.2.2.2 Continuity and Uniqueness of Lagrange Multipliers
      3.2.2.3 Normal Approximation
  3.3 Least Favorable Radius
    3.3.1 Contamination Neighborhoods
    3.3.2 Total Variation Neighborhoods
  3.4 One-Step Construction
  3.5 Implementation using R
  3.6 A Small Simulation Study

4 Poisson Model
  4.1 Introduction
  4.2 Optimally Robust Influence Curves
    4.2.1 Contamination Neighborhoods
      4.2.1.1 Mean Square Error Solution
      4.2.1.2 Continuity and Uniqueness of Lagrange Multipliers
      4.2.1.3 Normal Approximation
      4.2.1.4 Poisson Approximation
    4.2.2 Total Variation Neighborhoods
      4.2.2.1 Mean Square Error Solution
      4.2.2.2 Continuity and Uniqueness of Lagrange Multipliers
      4.2.2.3 Normal Approximation
      4.2.2.4 Poisson Approximation
  4.3 Least Favorable Radius
    4.3.1 Contamination Neighborhoods
    4.3.2 Total Variation Neighborhoods
  4.4 One-Step Construction
  4.5 Implementation using R
  4.6 A Small Simulation Study

5 Exponential Scale and Gumbel Location Model
  5.1 Introduction
    5.1.1 One-Dimensional Scale Model
    5.1.2 One-Dimensional Location Model
    5.1.3 Connection between One-Dimensional Scale and One-Dimensional Location
  5.2 Optimally Robust Influence Curves
    5.2.1 Contamination Neighborhoods
    5.2.2 Total Variation Neighborhoods
  5.3 Least Favorable Radius
    5.3.1 Contamination Neighborhoods
    5.3.2 Total Variation Neighborhoods
  5.4 One-Step Construction
  5.5 Implementation using R

6 Gamma Model
  6.1 Introduction
  6.2 Optimally Robust Influence Curves
  6.3 Least Favorable Radius
  6.4 One-Step Construction
  6.5 Implementation using R

III Robust Regression and Scale

7 Regression and Scale
  7.1 Introduction
    7.1.1 Ideal Model
    7.1.2 Infinitesimal Neighborhoods
    7.1.3 Estimators
      7.1.3.1 AL Estimators
      7.1.3.2 M Estimators
    7.1.4 Equivariance
  7.2 Simultaneous Estimation
    7.2.1 AL Estimators
      7.2.1.1 Unconditional Contamination Neighborhoods
      7.2.1.2 Average Conditional Contamination Neighborhoods
    7.2.2 M Estimators
      7.2.2.1 Unconditional Contamination Neighborhoods
      7.2.2.2 Average Conditional Contamination Neighborhoods
  7.3 Separate Estimation
    7.3.1 AL Estimators
    7.3.2 M Estimators
    7.3.3 BM Estimators
  7.4 One-Step Construction
    7.4.1 Normal Regression
    7.4.2 Normal Regression and Scale
  7.5 Numerical Results
    7.5.1 Simultaneous Estimation
    7.5.2 Separate Estimation
  7.6 Implementation using R
    7.6.1 R Package ROptRegTS
    7.6.2 R Package RobRex

8 Normal Location and Scale – a Comparative Study
  8.1 Setup
  8.2 AL Estimators
  8.3 M Estimators
  8.4 BM Estimators
  8.5 Other Proposals
    8.5.1 Huber Estimators
    8.5.2 Hampel Estimators
    8.5.3 Andrews Estimators
    8.5.4 Tukey Estimators
  8.6 MM Estimators
  8.7 Numerical Comparison
  8.8 One-Step Construction
  8.9 Implementation Using R

IV Robust Adaptivity

9 Robust Adaptivity
  9.1 Introduction
  9.2 Regression Models
    9.2.1 Regression and Scale
      9.2.1.1 Unconditional Contamination Neighborhoods
      9.2.1.2 Average Square Conditional Contamination Neighborhoods
      9.2.1.3 Average Conditional Contamination Neighborhoods
      9.2.1.4 Average Conditional Total Variation Neighborhoods
    9.2.2 Regression with Intercept
      9.2.2.1 Unconditional Contamination Neighborhoods
      9.2.2.2 Average Square Conditional Contamination Neighborhoods
      9.2.2.3 Average Conditional Contamination Neighborhoods
      9.2.2.4 Average Conditional Total Variation Neighborhoods
      9.2.2.5 Numerical Results
    9.2.3 Implementation Using R
  9.3 Time Series Models
    9.3.1 ARMA(p, q) with Shift
      9.3.1.1 Average Square Conditional Contamination Neighborhoods
      9.3.1.2 Average Conditional Contamination Neighborhoods
      9.3.1.3 Average Conditional Total Variation Neighborhoods
    9.3.2 ARCH(p) with Scale
      9.3.2.1 Average Square Conditional Contamination Neighborhoods
      9.3.2.2 Average Conditional Contamination Neighborhoods
      9.3.2.3 Average Conditional Total Variation Neighborhoods
    9.3.3 Numerical Results
      9.3.3.1 AR(1) and MA(1) with Shift
      9.3.3.2 ARCH(1) with Scale
    9.3.4 Implementation Using R

From the document: Numerical Contributions to the Asymptotic Theory of Robustness (pages 43-50)