
Some Adaptive Procedures for Regression Models


NOT FOR QUOTATION WITHOUT PERMISSION OF THE AUTHOR

SOME ADAPTIVE PROCEDURES FOR REGRESSION MODELS

M. Huskova June 1985 CP-85-30

Collaborative Papers report work which has not been performed solely at the International Institute for Applied Systems Analysis and which has received only limited review. Views or opinions expressed herein do not necessarily represent those of the Institute, its National Member Organizations, or other organizations supporting the work.

INTERNATIONAL INSTITUTE FOR APPLIED SYSTEMS ANALYSIS A-2361 Laxenburg, Austria


FOREWORD

Within the framework of the Economic Structural Change Program, a cooperative research activity of IIASA and the University of Bonn, FRG, a project is carried out on "Statistical and Econometric Identification of Structural Change"; the project involves studies on the formal aspects of the analysis of structural changes. On the one hand, they include statistical methods to detect non-constancies, such as stability tests, detection criteria, etc., and on the other hand, methods which are suitable for models which incorporate non-constancy of the parameters, such as estimation techniques for time-varying parameters, adaptive methods, etc.

The present paper surveys adaptive methods for regression analysis, i.e., methods which are dependent on the data to be analyzed. A final chapter sketches some problems related to the use of adaptive regression methods in the context of structural changes, such as the investigation of properties of an adaptive version of Quandt's switching regression procedure.

Anatoli Smyshlyaev
Acting Leader
Economic Structural Change Program


SOME ADAPTIVE PROCEDURES FOR REGRESSION MODELS

Marie Huskova

Charles University, Sokolovska 83, 186 00 Prague 8, Czechoslovakia

INTRODUCTION

Regression models are applied to extremely diverse types of data in many fields that study quantitative relationships. Normally distributed errors are usually assumed and least squares estimates are applied. It is known that for normally distributed errors the least squares estimates are optimal in several respects, while for nonnormally distributed errors these estimates are inefficient and, moreover, sensitive to outlying observations.

Classes of estimators were developed which show a reasonable behavior for comparatively large families of error distributions and which are not too sensitive to outliers. Such estimators are usually called robust. Some of these estimators can be adapted with respect to the data in such a way that the resulting estimates are in some sense optimal; these estimators are called adaptive.

The aim of this paper is to present some adaptive estimates for regression models.


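Consider the linear model:

$$Y_{ni} = \alpha + \sum_{j=1}^{p} c_{n,ij}\,\theta_j + X_{ni}, \qquad i = 1,\dots,n,$$

or equivalently,

$$\mathbf{Y}_n = \alpha\,\mathbf{1}_n + \mathbf{C}_n \boldsymbol{\theta} + \mathbf{X}_n,$$

where $\mathbf{Y}_n = (Y_{n1},\dots,Y_{nn})'$ is a vector of observations, $\mathbf{1}_n = (1,\dots,1)'$, $\mathbf{X}_n = (X_{n1},\dots,X_{nn})'$ is a vector of independent identically distributed random errors, $\alpha$ and $\boldsymbol{\theta} = (\theta_1,\dots,\theta_p)'$ are unknown parameters, and

$$\mathbf{C}_n = (c_{n,ij})_{i=1,\dots,n;\; j=1,\dots,p}$$

is an $n \times p$ design matrix of full rank ($= p$).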

Moreover, it is assumed that $X_{ni}$ has a distribution function $F$ and a density $f$ (with respect to the Lebesgue measure) belonging to some class $\mathcal{F}$ of densities.

The problem is that of estimating $\boldsymbol{\theta}$.

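If $F$ is normal with mean zero, then the least squares estimate $\hat{\boldsymbol{\theta}}_n = (\hat\theta_{n1},\dots,\hat\theta_{np})'$ is optimal. More precisely, it is unbiased and has minimal variance:

$$\operatorname{var}\Big(\sum_{j=1}^{p} u_j\,\hat\theta_{nj}\Big) \le \operatorname{var}\Big(\sum_{j=1}^{p} u_j\,\theta^*_{nj}\Big)$$

for all $u_1,\dots,u_p$, where $\boldsymbol{\theta}^*_n = (\theta^*_{n1},\dots,\theta^*_{np})'$ is an arbitrary unbiased estimate of $\boldsymbol{\theta}$. Recall the definition of the least squares estimate:

$$\hat{\boldsymbol{\theta}}_n = \arg\min_{\boldsymbol{\theta}} \sum_{i=1}^{n} \delta_i^2(\boldsymbol{\theta}),$$

where $\delta_i(\boldsymbol{\theta}) = Y_{ni} - \sum_{j=1}^{p} (c_{n,ij} - \bar c_{n,j})\,\theta_j$, sometimes called residuals, and $\bar c_{n,j} = n^{-1}\sum_{i=1}^{n} c_{n,ij}$; then the variance matrix can be rewritten as follows:

$$\operatorname{var}\hat{\boldsymbol{\theta}}_n = \operatorname{var}X_{n1}\,(\mathbf{C}_n^{*\prime}\mathbf{C}_n^{*})^{-1},$$

where $\mathbf{C}_n^{*} = (c_{n,ij} - \bar c_{n,j})$ is the centered design matrix.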
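As a small numerical illustration of the least squares estimate and its variance, the following Python sketch treats the special case $p = 1$ with a centered regressor. The function name `least_squares_slope`, the simulated data, and the error standard deviation `sigma` are assumptions made for this example only.

```python
import random

def least_squares_slope(c, y):
    """Least squares slope for y_i = alpha + c_i * theta + error (p = 1)."""
    n = len(c)
    c_bar = sum(c) / n
    c_star = [ci - c_bar for ci in c]          # centered design column
    s_cc = sum(cs * cs for cs in c_star)       # C*'C* reduces to a scalar for p = 1
    theta_hat = sum(cs * yi for cs, yi in zip(c_star, y)) / s_cc
    return theta_hat, s_cc

# Simulated data (hypothetical): alpha = 1, theta = 2, normal errors.
random.seed(0)
sigma = 0.5
c = [i / 10 for i in range(50)]
y = [1.0 + 2.0 * ci + random.gauss(0.0, sigma) for ci in c]

theta_hat, s_cc = least_squares_slope(c, y)
var_theta = sigma ** 2 / s_cc                  # var(X_n1) * (C*'C*)^{-1}
```

Because the regressor is centered, the intercept drops out of the minimization and the slope can be computed without estimating $\alpha$.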

If the error distribution function $F$ is nonnormal, the least squares estimate $\hat{\boldsymbol{\theta}}_n$ is in most cases not even reasonable (see, e.g., Huber 1972). Since the true underlying distribution is seldom exactly known, it is sensible to use procedures which work well for a variety of possible situations. Such procedures are called robust. More information on this issue can be found, e.g., in Huber (1981) and Jureckova (1985).

The typical robust estimates are M- and R-estimates.

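The M-estimate (estimate of the maximum likelihood type) $\hat{\boldsymbol{\theta}}_M(\psi)$ is defined as follows:

$$\hat{\boldsymbol{\theta}}_M(\psi) = \arg\min_{\boldsymbol{\theta}} \sum_{i=1}^{n} \rho(\delta_i(\boldsymbol{\theta})),$$

or, equivalently, it is the solution of the system of equations

$$\sum_{i=1}^{n} (c_{n,ij} - \bar c_{n,j})\,\psi(\delta_i(\boldsymbol{\theta})) = 0, \qquad j = 1,\dots,p,$$

with respect to $\boldsymbol{\theta}$, where $\rho$ is a convex function and $\rho' = \psi$. The choice of $\psi(x) = x$ and $\psi(x) = \psi_f(x) = -f'(x)/f(x)$ leads to the least squares estimate and to the maximum likelihood estimate, respectively.

The R-estimate (estimate based on ranks) can be defined in either of the following ways:

$$\hat{\boldsymbol{\theta}}_R(\varphi) = \arg\min_{\boldsymbol{\theta}} \sum_{i=1}^{n} \varphi\big(R_{ni}(\boldsymbol{\theta})(n+1)^{-1}\big)\,\delta_i(\boldsymbol{\theta}),$$

$$\hat{\boldsymbol{\theta}}^{*}_R(\varphi) = \arg\min_{\boldsymbol{\theta}} \sum_{j=1}^{p} \Big|\sum_{i=1}^{n} (c_{n,ij} - \bar c_{n,j})\,\varphi\big(R_{ni}(\boldsymbol{\theta})(n+1)^{-1}\big)\Big|,$$

where $\varphi$ is a monotone function on $(0,1)$ and $R_{ni}(\boldsymbol{\theta})$ is the rank of $\delta_i(\boldsymbol{\theta})$ among $\delta_1(\boldsymbol{\theta}),\dots,\delta_n(\boldsymbol{\theta})$. Both estimates are asymptotically equivalent.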
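The first, dispersion-based definition of the R-estimate can be sketched in Python for a single regressor ($p = 1$) with scores $\varphi(u) = u$. The helper names (`ranks`, `dispersion`, `r_estimate`), the search interval, and the ternary-search minimizer are assumptions of this sketch; the search relies on the rank dispersion being convex in $\theta$ for nondecreasing scores.

```python
import random

def ranks(values):
    """Ranks 1..n of the values (ties broken by position)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def dispersion(theta, c_star, y, phi):
    """sum_i phi(R_i(theta) / (n + 1)) * delta_i(theta) for p = 1."""
    resid = [yi - cs * theta for cs, yi in zip(c_star, y)]
    n = len(resid)
    return sum(phi(r / (n + 1)) * d for r, d in zip(ranks(resid), resid))

def r_estimate(c, y, phi=lambda u: u, lo=-100.0, hi=100.0, iters=120):
    """Minimize the rank dispersion over [lo, hi] by ternary search."""
    c_bar = sum(c) / len(c)
    c_star = [ci - c_bar for ci in c]
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if dispersion(m1, c_star, y, phi) <= dispersion(m2, c_star, y, phi):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

# Simulated data (hypothetical): slope 2, normal errors, one gross outlier.
random.seed(1)
c = [float(i) for i in range(30)]
y = [1.0 + 2.0 * ci + random.gauss(0.0, 0.5) for ci in c]
y[-1] += 100.0
theta_r = r_estimate(c, y)
```

The outlier enters the criterion only through its rank score, so its influence on the slope estimate stays bounded, which is the robustness property the text refers to.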

Both the M- and R-estimates allow a one-step version: start with some reasonably good preliminary estimate and then apply one step of the Newton method to the corresponding system of equations.
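A one-step M-estimate can be sketched as follows for $p = 1$. Several choices here are assumptions of the sketch rather than prescriptions of the paper: the Huber score with constant `k = 1.345`, the Theil-Sen slope as preliminary estimate, centering the residuals at their median to absorb the intercept, and scaling them by the MAD.

```python
import random
import statistics

def huber_psi(x, k=1.345):
    return max(-k, min(k, x))

def huber_dpsi(x, k=1.345):
    return 1.0 if abs(x) <= k else 0.0

def theil_sen_slope(c, y):
    """Median of pairwise slopes; used here only as a robust starting value."""
    slopes = [(y[j] - y[i]) / (c[j] - c[i])
              for i in range(len(c)) for j in range(i + 1, len(c))
              if c[j] != c[i]]
    return statistics.median(slopes)

def one_step_m(c, y, theta0, k=1.345):
    """One Newton step for sum_i c*_i psi(r_i / s) = 0, started at theta0."""
    c_bar = sum(c) / len(c)
    c_star = [ci - c_bar for ci in c]
    resid = [yi - cs * theta0 for cs, yi in zip(c_star, y)]
    center = statistics.median(resid)                 # absorbs the intercept
    r = [d - center for d in resid]
    s = statistics.median([abs(ri) for ri in r]) / 0.6745   # MAD scale
    if s == 0.0:
        return theta0                                 # perfect fit, nothing to update
    num = sum(cs * huber_psi(ri / s, k) for cs, ri in zip(c_star, r))
    den = sum(cs * cs * huber_dpsi(ri / s, k) for cs, ri in zip(c_star, r))
    return theta0 if den == 0.0 else theta0 + s * num / den

# Simulated data (hypothetical): slope 2, one gross outlier.
random.seed(2)
c = [float(i) for i in range(20)]
y = [1.0 + 2.0 * ci + random.gauss(0.0, 0.5) for ci in c]
y[0] += 50.0
theta0 = theil_sen_slope(c, y)
theta1 = one_step_m(c, y, theta0)
```

The update follows from linearizing the estimating equation at `theta0`: the derivative of $\sum_i c^*_i \psi(r_i/s)$ with respect to $\theta$ is $-s^{-1}\sum_i c^{*2}_i \psi'(r_i/s)$, which gives the correction `s * num / den`.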

Generally, the M- and R-estimates are under very mild conditions asymptotically unbiased and consistent. If the error distribution $F$ is known and some regularity conditions are fulfilled, the estimates $\hat{\boldsymbol{\theta}}_M(\psi_f)$, $\hat{\boldsymbol{\theta}}_R(\varphi_f)$, and $\hat{\boldsymbol{\theta}}^{*}_R(\varphi_f)$ with

$$\psi_f(x) = -f'(x)/f(x), \qquad x \in R^1,$$

and

$$\varphi_f(u) = -f'(F^{-1}(u))/f(F^{-1}(u)), \qquad u \in (0,1)$$

(where $F^{-1}$ is the quantile function corresponding to $F$) are asymptotically optimal, i.e. they are asymptotically unbiased and have asymptotically the smallest variance matrix. The latter property means that the asymptotic variance matrix should be close to $(\mathbf{C}_n^{*\prime}\mathbf{C}_n^{*})^{-1} I^{-1}(f)$, where $I(f) = \int (f'(x))^2 / f(x)\,dx$ is the Fisher information.
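Since $I(f) = \int \psi_f^2(x) f(x)\,dx$, the Fisher information can be checked numerically for standard densities. The sketch below uses a plain trapezoidal rule; the function name `fisher_information` and the integration limits are assumptions made for the example.

```python
import math

def fisher_information(psi, pdf, lo=-30.0, hi=30.0, steps=6000):
    """Trapezoidal approximation of I(f) = integral of psi_f(x)^2 f(x) dx."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        w = 0.5 if i in (0, steps) else 1.0
        total += w * psi(x) ** 2 * pdf(x)
    return total * h

def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def normal_psi(x):
    return x                                   # -f'/f for N(0, 1)

def logistic_pdf(x):
    e = math.exp(-abs(x))                      # symmetric form, avoids overflow
    return e / (1.0 + e) ** 2

def logistic_psi(x):
    return math.tanh(0.5 * x)                  # -f'/f for the standard logistic

i_normal = fisher_information(normal_psi, normal_pdf)        # close to 1
i_logistic = fisher_information(logistic_psi, logistic_pdf)  # close to 1/3
```

A smaller $I(f)$ means a larger attainable asymptotic variance, so with the same design the logistic model permits less precise estimation than the standard normal one.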

If $F$ is unknown and we are still interested in having an asymptotically optimal estimate, at least in some class of error distributions, we can either construct entirely new estimates (which is a difficult problem, solved only in very special cases) or adapt the already known estimates with respect to the data. Attention was mainly paid to the latter case. To adapt M- and R-estimates means either to replace $\psi_f$ and $\varphi_f$, respectively, by suitable estimates, or, assuming that the true density belongs to a family $\mathcal{F}$ of error distributions, to choose a density $f_0 \in \mathcal{F}$ that fits the data according to a decision rule.

A simple form of such adaptive estimates was already used intuitively by many applied statisticians; e.g., for the problem of estimating $\theta$ in the model $Y_{ni} = \theta + X_{ni}$, $i = 1,\dots,n$, where $X_{ni}$ has a symmetric distribution, they used either the arithmetic mean or the median, depending on the data to be analyzed.
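A minimal Python sketch of such an intuitive rule is given below. The decision rule (sample kurtosis compared with a fixed `threshold`) is a hypothetical choice made for illustration; the paper does not specify how practitioners decided between the two estimates.

```python
import statistics

def sample_kurtosis(data):
    """Fourth standardized sample moment (equals 3 for a normal population)."""
    n = len(data)
    m = sum(data) / n
    m2 = sum((x - m) ** 2 for x in data) / n
    m4 = sum((x - m) ** 4 for x in data) / n
    return m4 / (m2 * m2) if m2 > 0 else 0.0

def adaptive_location(data, threshold=4.0):
    """Return (rule, estimate): mean for light tails, median for heavy tails."""
    if sample_kurtosis(data) > threshold:
        return "median", statistics.median(data)
    return "mean", statistics.mean(data)

light = [1, 2, 3, 4, 5]
heavy = [1.0, 1.1, 0.9, 1.2, 0.8, 1.0, 1.05, 0.95, 1.0, 50.0]
print(adaptive_location(light))
print(adaptive_location(heavy))
```

The estimator actually applied depends on the sample itself, which is the defining feature of an adaptive procedure.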

In the next section some typical adaptive M- and R-estimates are introduced.

For more detailed information on adaptive procedures for various models and other statistical problems see the review papers by Hogg (1974), Hogg and Lenth (1984), and Huskova (1985). General considerations on adaptive procedures can be found in the paper by Bickel (1982).

ADAPTIVE M- AND R-ESTIMATES

The basic steps in the procedure are the following:

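a. Find a reasonable robust preliminary estimate $\tilde{\boldsymbol{\theta}}_n$ of $\boldsymbol{\theta}$.

b. Choose a reasonable family $\mathcal{F}$ of error distributions and a decision rule for selecting a density $f_0 \in \mathcal{F}$ as a possible true density, or the type of estimate for $\psi_f$ ($\varphi_f$).

c. Using the residuals $\delta_1(\tilde{\boldsymbol{\theta}}_n),\dots,\delta_n(\tilde{\boldsymbol{\theta}}_n)$, select $f_0 \in \mathcal{F}$ according to the decision rule or find an estimate $\hat\psi_f$ ($\hat\varphi_f$) of $\psi_f$ ($\varphi_f$).

d. Compute the one-step version of the M- (R-)estimate using the preliminary estimator $\tilde{\boldsymbol{\theta}}_n$ and replacing $\psi_f$ ($\varphi_f$) by either $\psi_{f_0}$ ($\varphi_{f_0}$) or by its estimate $\hat\psi_f$ ($\hat\varphi_f$) from step c.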

As preliminary estimates either M-estimates with $\psi(x) = |x|^{a-1}\operatorname{sign}(x)$, $x \in R^1$, $1 \le a < 2$, or R-estimates with $\varphi(u) = u$, $u \in (0,1)$, are recommended.
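Steps a to d can be sketched end to end for the simplest case, the location model $Y_{ni} = \theta + X_{ni}$. Everything specific in this sketch is an assumption made for illustration: the two-member family (normal and logistic scores with unit scale), the kurtosis-based decision rule with `threshold = 3.6`, and the median as preliminary estimate.

```python
import math
import statistics

def kurtosis(data):
    n = len(data)
    m = sum(data) / n
    m2 = sum((x - m) ** 2 for x in data) / n
    m4 = sum((x - m) ** 4 for x in data) / n
    return m4 / (m2 * m2) if m2 > 0 else 0.0

# Family of candidate densities: score psi_f and its derivative (unit scale assumed).
SCORES = {
    "normal": (lambda x: x, lambda x: 1.0),
    "logistic": (lambda x: math.tanh(0.5 * x),
                 lambda x: 0.5 / math.cosh(0.5 * x) ** 2),
}

def adaptive_one_step(y, threshold=3.6):
    theta0 = statistics.median(y)                    # step a: preliminary estimate
    resid = [yi - theta0 for yi in y]                # residuals used in step c
    # steps b and c: pick f_0 from the family by a (hypothetical) kurtosis rule
    family = "logistic" if kurtosis(resid) > threshold else "normal"
    psi, dpsi = SCORES[family]
    num = sum(psi(r) for r in resid)                 # step d: one Newton step
    den = sum(dpsi(r) for r in resid)
    return family, theta0 + num / den

print(adaptive_one_step([0.0, 1.0, 2.0, 3.0, 10.0]))
print(adaptive_one_step([1.0, 1.1, 0.9, 1.2, 0.8, 1.0, 1.05, 0.95, 1.0, 50.0]))
```

With the normal score the single Newton step started at the median lands exactly on the arithmetic mean, while the bounded logistic score keeps the estimate close to the bulk of the data when an outlier is present.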


Moberg et al. (1980) proposed a decision rule based on the measure of skewness $Q_3$ and the measure of tailweight $Q_4$, both defined in terms of $L(\alpha)$, $M(\alpha)$, $U(\alpha)$, the arithmetic means of the smallest, the medium, and the largest $[n\alpha]$ of the order statistics $Z_{(1)} \le \dots \le Z_{(n)}$ corresponding to the residuals.

Starting from the generalized $\lambda$-family of distributions (the quantile function can be expressed as $F^{-1}(p) = \lambda_1 + (p^{\lambda_2} - (1-p)^{\lambda_3})/\lambda_4$, $p \in (0,1)$, $\lambda_i \in R^1$, $i = 1,\dots,3$, $\lambda_4 > 0$), and using the Monte Carlo method, they proposed partitioning the distributions into five classes according to $Q_3$ and $Q_4$: light-tailed and symmetric (I), medium-tailed and symmetric (II), heavy-tailed and symmetric (III), light-tailed and skewed to the right (IV), and moderate-tailed and skewed to the right (V). For each class they recommend a proper choice of the function $\psi$.
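Because the generalized $\lambda$-family is defined through its quantile function, Monte Carlo samples can be drawn by inverse transform sampling. The Python sketch below assumes the parametrization $F^{-1}(p) = \lambda_1 + (p^{\lambda_2} - (1-p)^{\lambda_3})/\lambda_4$ with positive exponents; the parameter values are hypothetical and chosen only to give one symmetric and one right-skewed example.

```python
import random

def gl_quantile(p, lam1, lam2, lam3, lam4):
    """Quantile function of the generalized lambda family (assumed form)."""
    return lam1 + (p ** lam2 - (1.0 - p) ** lam3) / lam4

def gl_sample(n, lam1, lam2, lam3, lam4, rng):
    """Inverse transform sampling: plug uniform variates into the quantile function."""
    return [gl_quantile(rng.random(), lam1, lam2, lam3, lam4) for _ in range(n)]

rng = random.Random(3)
symmetric_sample = gl_sample(500, 0.0, 0.2, 0.2, 0.5, rng)   # lam2 == lam3: symmetric
skewed_sample = gl_sample(500, 0.0, 0.05, 0.8, 0.5, rng)     # lam2 != lam3: skewed
```

With equal exponents the quantile function satisfies $F^{-1}(p) + F^{-1}(1-p) = 2\lambda_1$, so the distribution is symmetric about $\lambda_1$; unequal exponents produce skewness.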

Jones (1979) developed an adaptive procedure based on ranks and order statistics, originally for testing of symmetry. This can easily be modified to the estimation problem. The author assumes that the family $\mathcal{F}$ consists of densities $f$ whose score function $\varphi_f$ depends on a parameter $\lambda$; the family ranges from light-tailed ($\lambda > 0$) to heavy-tailed ($\lambda < 0$) densities. The estimate of $\lambda$ was defined through the ordered sample $Z_{(1)} \le \dots \le Z_{(n)}$ corresponding to $\delta_1(\tilde{\boldsymbol{\theta}}_n),\dots,\delta_n(\tilde{\boldsymbol{\theta}}_n)$ and involves a constant $M$ chosen in a proper way to reflect the behavior of the tail.

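Koul and Susarla (1983) constructed an estimate $\hat\psi_f$ of $\psi_f$ from $\hat f(x; r_n)$, the kernel estimate of the density $f$ (with kernel $N(0, r_n^2)$) based on $\delta_1(\tilde{\boldsymbol{\theta}}_n),\dots,\delta_n(\tilde{\boldsymbol{\theta}}_n)$, with sequences $a_n \to 0$, $r_n \to 0$; as a resulting estimate they propose a slightly modified one-step version of $\hat{\boldsymbol{\theta}}_M(\hat\psi_f)$.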
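The idea of estimating the score function from a kernel density estimate can be sketched in Python as follows. This is not the estimator of Koul and Susarla, whose exact form is not reproduced here: the plug-in ratio $-\hat f'/\hat f$, the truncation to zero where $\hat f$ falls below a level `a`, and the function name `kernel_score` are assumptions of this sketch.

```python
import math

def kernel_score(x, z, r, a=1e-3):
    """Plug-in estimate of psi_f(x) = -f'(x)/f(x) from a Gaussian kernel
    density estimate with bandwidth r, based on the residuals z."""
    n = len(z)
    u = [(x - zi) / r for zi in z]
    w = [math.exp(-0.5 * ui * ui) for ui in u]        # unnormalized N(0, 1) kernel
    f_hat = sum(w) / (n * r * math.sqrt(2.0 * math.pi))
    if f_hat < a:
        return 0.0        # truncation where the density estimate is too small
    # -f_hat'(x) / f_hat(x) simplifies to sum(u_i w_i) / (r * sum(w_i))
    return sum(ui * wi for ui, wi in zip(u, w)) / (r * sum(w))

print(kernel_score(1.0, [0.0], 0.5))
print(kernel_score(0.0, [-1.0, 1.0], 0.5))
```

For a single observation at zero the density estimate is exactly $N(0, r^2)$, whose score is $x / r^2$; this gives a quick sanity check of the formula.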

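Huskova (1984) made use of the fact that for $\varphi_f \in L_2(0,1)$ one can write

$$\varphi_f(u) = \sum_{k=0}^{\infty} d_k P_k(u),$$

where $\{P_k(u)\}_{k=0}^{\infty}$ is the system of Legendre polynomials on $(0,1)$ and

$$d_k = \int_0^1 \varphi_f(u) P_k(u)\,du,$$

and suggested the following estimator of $\varphi_f$:

$$\hat\varphi_f(u) = \sum_{k=0}^{M_n} \hat d_k P_k(u),$$

with $\hat d_k$ being an estimate of $d_k$ obtained by means of the asymptotic linearity of rank statistics, $M_n \to \infty$ as $n \to \infty$.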
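The expansion itself can be illustrated numerically. The Python sketch below assumes the orthonormal shifted Legendre polynomials on $(0,1)$ up to degree 3 and computes the coefficients $d_k$ for a known score function by the midpoint rule. It does not implement the rank-statistic estimates $\hat d_k$; it only shows how a truncated expansion reproduces $\varphi_f$.

```python
import math

# Orthonormal shifted Legendre polynomials on (0, 1), degrees 0..3.
LEGENDRE = [
    lambda u: 1.0,
    lambda u: math.sqrt(3.0) * (2.0 * u - 1.0),
    lambda u: math.sqrt(5.0) * (6.0 * u * u - 6.0 * u + 1.0),
    lambda u: math.sqrt(7.0) * (20.0 * u ** 3 - 30.0 * u * u + 12.0 * u - 1.0),
]

def coefficients(phi, grid=2000):
    """d_k = integral over (0, 1) of phi(u) P_k(u) du, by the midpoint rule."""
    pts = [(i + 0.5) / grid for i in range(grid)]
    return [sum(phi(u) * P(u) for u in pts) / grid for P in LEGENDRE]

def truncated_expansion(d, u):
    """sum_k d_k P_k(u) over the available coefficients."""
    return sum(dk * P(u) for dk, P in zip(d, LEGENDRE))

phi_logistic = lambda u: 2.0 * u - 1.0      # score function of the logistic density
d = coefficients(phi_logistic)
approx = truncated_expansion(d, 0.8)
```

For the logistic score only $d_1$ is nonzero, so the expansion is already exact at degree 1; heavier-tailed densities need more terms, which is why the truncation point $M_n$ must grow with $n$.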

The procedure proposed by Moberg et al. (1980) can be easily applied in practice; the Monte Carlo study supports this procedure, but from the asymptotic point of view it is not optimal. Several modifications of this procedure were developed.

The procedure of Jones (1979) is asymptotically optimal if the true density belongs to the $\lambda$-family of distributions.

The remaining two procedures lead to asymptotically optimal estimates, but due to computational problems their practical application is, in their present form, not very appealing.

ADAPTIVE PROCEDURES FOR DETECTING CHANGE

Consider the regression model:

$$Y(t_i) = \begin{cases} \alpha + \sum_{j=1}^{p} c_j(t_i)\,\theta_j + X_i, & t_i \le \tau, \\ \alpha + \sum_{j=1}^{p} c_j(t_i)\,\beta_j + X_i, & t_i > \tau, \end{cases}$$

where $Y(t_i)$ is the observation taken at time $t_i$, $t_1 \le t_2 \le \dots \le t_n$ (not all equal), $\alpha, \theta_1,\dots,\theta_p, \beta_1,\dots,\beta_p$ are unknown parameters, $\tau \in (t_1, t_n]$ is an unknown time point, $X_1,\dots,X_n$ are independent random variables with a distribution function $F$, and $(c_j(t_i))_{i=1,\dots,n;\; j=1,\dots,p}$ is a design matrix.

The problem is concerned with testing the constancy of the regression relationship over time, i.e., $H_0: \theta_j = \beta_j$, $1 \le j \le p$, against $H: \theta_j \ne \beta_j$ for at least one $j$.

Sen (1980, 1982) proposed some test procedures based on rank statistics.

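Sen (1983) developed a procedure for a more general testing problem: $Y(t_1),\dots,Y(t_n)$ are independent random variables, $Y(t_i)$ has a distribution function $F_i$, $i = 1,\dots,n$, and

$$H_0: F_1 = \dots = F_n \quad \text{against} \quad H: F_1 = \dots = F_q \ne F_{q+1} = \dots = F_n,$$

where $q$ is unknown, $1 \le q < n$. The test procedure is based on U-statistics, i.e. on statistics of the form

$$U_k = \binom{k}{m}^{-1} \sum_{1 \le i_1 < \dots < i_m \le k} h\big(Y(t_{i_1}),\dots,Y(t_{i_m})\big), \qquad m \le k \le n,$$

where $h$ is a symmetric function on $R^m$, $m$ is fixed, $1 \le m \le n$.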
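A generic U-statistic with a symmetric kernel $h$ of degree $m$, together with its recursively computed sequence over growing subsamples, can be sketched in Python as follows. The function names and the variance kernel used in the example are assumptions made for illustration; the sketch does not reproduce Sen's test statistic.

```python
from itertools import combinations
from math import comb

def u_statistic(sample, h, m):
    """Average of the kernel h over all subsets of size m."""
    total = sum(h(*subset) for subset in combinations(sample, m))
    return total / comb(len(sample), m)

def recursive_u_statistics(sample, h, m):
    """U_k computed from the first k observations, k = m, ..., n."""
    return [u_statistic(sample[:k], h, m) for k in range(m, len(sample) + 1)]

# Example kernel of degree 2: its U-statistic is the unbiased sample variance.
variance_kernel = lambda x, y: 0.5 * (x - y) ** 2

data = [1.0, 2.0, 3.0, 4.0]
u_full = u_statistic(data, variance_kernel, 2)
u_path = recursive_u_statistics(data, variance_kernel, 2)
```

A change in distribution after some index $q$ would show up as a drift in the sequence `u_path`, which is the intuition behind basing a change-point test on recursive U-statistics.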

Both types of procedures mentioned belong to the robust procedures. Adaptive procedures have not yet been developed.

The problems to be solved (first for a simple linear model and then for the general regression model) are:

1. The development of adaptive procedures combining already existing robust procedures (i.e. based on ranks) with the methods of adaptation and the investigation of their asymptotic properties.

2. The development of robust procedures based on M-estimates (modification of Quandt's log-likelihood ratio procedure) and the investigation of their asymptotic properties.

3. The development of adaptive procedures corresponding to the robust procedures of point 2, and again the investigation of their asymptotic properties.

4. The development of robust and adaptive procedures for a more general problem, namely, to admit in the regression model above errors $X_i$ with different distributions for $t_i \le \tau$ and $t_i > \tau$.

5. The development of suitable algorithms for the procedures of points 1-4.
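For orientation, the classical normal-theory version of Quandt's log-likelihood ratio scan, which point 2 proposes to robustify, can be sketched in Python for a single regressor. The sketch fits separate least squares lines before and after each candidate split; the function names, the minimum segment size, and the deterministic test data are assumptions of the example.

```python
import math

def fit_line_variance(t, y):
    """ML residual variance of a least squares line fitted to (t, y)."""
    n = len(t)
    t_bar, y_bar = sum(t) / n, sum(y) / n
    s_tt = sum((ti - t_bar) ** 2 for ti in t)
    slope = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y)) / s_tt
    rss = sum((yi - y_bar - slope * (ti - t_bar)) ** 2 for ti, yi in zip(t, y))
    return max(rss / n, 1e-12)          # floor avoids log(0) for perfect fits

def quandt_scan(t, y, min_size=3):
    """Return (q_hat, lr): the number q of observations in the first regime
    that maximizes the normal log-likelihood ratio, and that maximum."""
    n = len(t)
    log_all = 0.5 * n * math.log(fit_line_variance(t, y))
    best_q, best_lr = None, -math.inf
    for q in range(min_size, n - min_size + 1):
        lr = (log_all
              - 0.5 * q * math.log(fit_line_variance(t[:q], y[:q]))
              - 0.5 * (n - q) * math.log(fit_line_variance(t[q:], y[q:])))
        if lr > best_lr:
            best_q, best_lr = q, lr
    return best_q, best_lr

# Hypothetical data: the slope changes from 0.5 to 2.0 after 20 observations.
t = [float(i) for i in range(40)]
noise = [0.1 if i % 2 == 0 else -0.1 for i in range(40)]
y = [1.0 + (0.5 if i < 20 else 2.0) * t[i] + noise[i] for i in range(40)]
q_hat, lr = quandt_scan(t, y)
```

A robust modification in the sense of point 2 would replace the residual variances by criteria based on a function $\rho$ as in the M-estimates, and an adaptive one would additionally choose that function from the data.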

REFERENCES

Bickel, P. (1982). On adaptive estimation. Annals of Statistics 10:647-671.

Hogg, R.V. (1974). Adaptive robust procedures: partial review and some suggestions for future applications and theory. J. Amer. Statist. Assoc. 69:909-923.

Hogg, R.V., and R.V. Lenth (1984). A review of some adaptive statistical techniques. Commun. in Statist. A 13:1551-1579.

Huber, P.J. (1972). Robust statistics: a review. Ann. Math. Statist. 43:1041-1067.

Huber, P.J. (1981). Robust Statistics. New York: Wiley.

Huskova, M. (1984). Adaptive procedures for the two-sample location model. Commun. in Statist. Sequential Analysis 2:387-401.

Huskova, M. (1985). Adaptive methods. Handbook of Statistics, P.R. Krishnaiah and P.K. Sen, eds., 4:347-358.

Jones, D.H. (1979). An efficient adaptive distribution-free test for location. J. Amer. Statist. Assoc. 74:822-828.

Jureckova, J. (1985). M-, L- and R-estimators. Handbook of Statistics, P.R. Krishnaiah and P.K. Sen, eds., 4:463-485.

Koul, H.L., and V. Susarla (1983). Adaptive estimation in linear regression. Statistics and Decision 1:379-400.

Moberg, T.F., J.S. Ramberg, and R.H. Randles (1980). An adaptive regression procedure based on M-estimators. Technometrics 22:213-224.

Sen, P.K. (1980). Asymptotic theory of some tests for a possible change in the regression slope occurring at an unknown time-point. Z. f. Wahrscheinlichkeitstheorie verw. Gebiete 52:203-218.

Sen, P.K. (1982). Asymptotic theory of some tests for constancy of regression relationships over time. Math. Operationsforsch. Statist., Statistics 13:21-31.

Sen, P.K. (1983). Tests for change-points based on recursive U-statistics. Commun. in Statist. Sequential Analysis 1:263-284.
