Asymptotic Properties of Restricted L1-Estimates of Regression


NOT FOR QUOTATION WITHOUT THE PERMISSION OF THE AUTHOR

ASYMPTOTIC PROPERTIES OF RESTRICTED L1-ESTIMATES OF REGRESSION

February 1987
WP-87-18

Working Papers are interim reports on work of the International Institute for Applied Systems Analysis and have received only limited review. Views or opinions expressed herein do not necessarily represent those of the Institute or of its National Member Organizations.

INTERNATIONAL INSTITUTE FOR APPLIED SYSTEMS ANALYSIS A-2361 Laxenburg, Austria


FOREWORD

The recent results on asymptotic behavior of statistical estimates and of optimal solutions of stochastic optimization problems obtained by Dupačová and Wets are used to prove consistency of restricted L1-estimates under more general assumptions. For the special case of linearly restricted linear L1-regression, the Lagrangian approach is used to achieve asymptotic normality.

Alexander B. Kurzhanski
Chairman
System and Decision Sciences Program


ABSTRACT

Asymptotic properties of L1-estimates in linear regression have been studied by many authors, see e.g. Bassett and Koenker (1978), Bloomfield and Steiger (1983). It is the lack of smoothness that prevents direct use of the known results on the asymptotic behavior of M-estimates (Huber (1967)). The additional lack of convexity in the nonlinear regression case increases the complexity of the problem even under the assumption that the true parameter values belong to the interior of the given parameter set; for a consistency result in this case see e.g. Oberhofer (1982).

We shall use the technique developed in Dupačová and Wets (1986), (1987) to get asymptotic properties of the L1-estimates of regression coefficients which are assumed to belong to an a priori given closed convex set given, e.g., by constraints of general equality and inequality form. The method uses, i.a., tools of nondifferentiable calculus and epi-convergence and it can be applied to other classes of L1-estimates as well.

CONTENTS

1 Introduction
2 Consistency
3 Asymptotic Normality
4 Proof of Theorem 3.4
References

whose nondifferentiability precludes direct application of the related asymptotic results. Nevertheless, asymptotic normality of the L1-estimates of regression coefficients was proved by Bassett and Koenker (1978) for linear regression with nonrandom regressors and by Bloomfield and Steiger (1983) for linear regression with random regressors under ergodicity and stationarity assumptions. A consistency result for L1-estimates of parameters in a nonlinear regression model can be found in Oberhofer (1982).

In model (1.1) and, correspondingly, in the optimization problem (1.3), restrictions on the estimated parameter values can be taken into account to respect technical and modeling considerations and, eventually, to guarantee the uniqueness of the estimate (Barrodale and Roberts (1977)). Inequality constraints on the estimates, however, introduce an essential lack of smoothness. That is why one usually assumes (see e.g. Huber (1967), Oberhofer (1982)) that the true parameter vector is an interior point of the given admissible set.

For the case of linearly restricted linear regression the simplex method can be used to get the restricted L1-estimates, i.e., to get the optimal solution of the mathematical programming problem

where S is a nonempty convex polyhedral set. (For a survey on solution techniques see e.g. Barrodale and Roberts (1977) or Arthanari and Dodge (1982).)

The optimal solution of (1.4) very often lies on the boundary of S, which does not conform with the mentioned assumption that the true parameter value is an interior point of S.
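The boundary effect can be seen already in the simplest restricted L1 problem. The following is a minimal sketch, not taken from the paper: it assumes a one-parameter location model (minimize the sum of |y_i - b| over an interval), for which the restricted minimizer is the sample median projected onto the interval. The function names and the data are hypothetical.

```python
import statistics

def restricted_l1_location(y, lower, upper):
    """Minimize sum(|y_i - b|) over lower <= b <= upper.

    The objective is convex in b, so projecting an unconstrained
    minimizer (a median) onto the interval gives a restricted minimizer.
    """
    m = statistics.median(y)
    return min(max(m, lower), upper)

def l1_objective(y, b):
    """Sum of absolute residuals for the location model."""
    return sum(abs(yi - b) for yi in y)

# Hypothetical data: the unconstrained median (0.2) lies outside S = [1, 3],
# so the restricted estimate sits on the boundary of S.
y = [0.2, -1.0, 0.7, 1.5, -0.3]
b_hat = restricted_l1_location(y, lower=1.0, upper=3.0)
print(b_hat, l1_objective(y, b_hat))
```

In this example the estimate coincides with the lower bound of the admissible interval, which is exactly the situation excluded by the interior-point assumption.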

We shall use the technique developed in Dupačová and Wets (1986), (1987) to get consistency and asymptotic normality of restricted L1-estimates. The special form of the objective function together with the use of the empirical distribution helps to simplify the assumptions used in the mentioned papers. Further simplification is possible in cases where g is linear both in b and x, i.e., for linear regression.


2. CONSISTENCY

Let $\xi$ be an $(m+1)$-dimensional random vector on $(\Xi, \mathcal{A}, P)$ with components $\xi_0, \xi_1, \ldots, \xi_m$ and $f_0 : R^n \times R^{m+1} \to R^1$ the function

$$f_0(b, \xi) = |\xi_0 - g(b, \bar\xi)|, \qquad (2.1)$$

where $\bar\xi$ denotes the $m$-dimensional subvector of components $\xi_1, \ldots, \xi_m$.

Assume that $S \subset R^n$ is a given nonempty closed set of admissible parameter values and define $f : R^n \times R^{m+1} \to \bar{R} = R^1 \cup \{+\infty\}$ the function

$$f(b, \xi) = \begin{cases} f_0(b, \xi) & \text{for } b \in S \text{ and } \xi \in R^{m+1}, \\ +\infty & \text{for } b \notin S. \end{cases} \qquad (2.2)$$

Observe that $f(b, \xi) = f_0(b, \xi) + \psi_S(b)$, where $\psi_S$ is the indicator function of the set $S$: $\psi_S(b) = 0$ for $b \in S$ and $\psi_S(b) = +\infty$ if $b \notin S$.
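The role of the indicator function can be illustrated numerically. The sketch below is only an illustration under stated assumptions: a scalar parameter, the linear choice g(b, x) = b x, an interval S, and a small hypothetical sample. It checks that minimizing the extended-valued function f over all grid points gives the same point as minimizing f_0 over the grid points that lie in S.

```python
import math

def psi_S(b, lower, upper):
    """Indicator of S = [lower, upper]: 0 inside S, +inf outside."""
    return 0.0 if lower <= b <= upper else math.inf

def f0(b, xi0, xbar):
    """f0(b, xi) = |xi0 - g(b, xbar)| with the assumed linear g(b, x) = b * x."""
    return abs(xi0 - b * xbar)

def f(b, xi0, xbar, lower, upper):
    """Extended-valued function f = f0 + psi_S."""
    return f0(b, xi0, xbar) + psi_S(b, lower, upper)

# Hypothetical observations (xi0, xbar) and admissible interval S = [0, 0.5].
sample = [(1.0, 1.0), (2.5, 2.0), (-0.5, 1.0)]
lower, upper = 0.0, 0.5
grid = [i / 10 for i in range(-20, 21)]

# Unconstrained minimization of the sum of f over the whole grid ...
b_extended = min(grid, key=lambda b: sum(f(b, x0, xb, lower, upper) for x0, xb in sample))
# ... versus constrained minimization of the sum of f0 over grid points in S.
b_constrained = min((b for b in grid if lower <= b <= upper),
                    key=lambda b: sum(f0(b, x0, xb) for x0, xb in sample))
print(b_extended, b_constrained)
```

Both formulations select the same grid point, here the upper end of S, because points outside S receive the value +infinity and can never be minimizers.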

Let us first discuss the properties of the functions $f$ and $f_0$.

a) If the function $g : R^n \times R^m \to R^1$ in (2.1) is a continuous function, then evidently $f$ is nonnegative and continuous in $\xi$, and $f_0$ is nonnegative and continuous in $b$ and $\xi$.

b) If for an arbitrary $\xi \in \Xi$ the function $g$ in (2.1) is locally Lipschitz in $b$, then for all $\xi \in \Xi$, $f_0(\cdot, \xi)$ is locally Lipschitz in $b$ and $f(\cdot, \xi)$ is lower semicontinuous.

c) Taking into account the special form of $f_0(b, \xi)$ we can write (using again the notation $\bar\xi$ for the $m$-dimensional subvector of components $\xi_1, \ldots, \xi_m$)

$$f_0(b, \xi) = \max\{\varphi_1(b, \xi), \varphi_2(b, \xi)\}$$

with

$$\varphi_1(b, \xi) = \xi_0 - g(b, \bar\xi), \qquad \varphi_2(b, \xi) = -\xi_0 + g(b, \bar\xi).$$

If for an arbitrary $\xi \in \Xi$ the function $g$ is continuously differentiable in $b$, then for all $\xi \in \Xi$ (see Rockafellar (1981), Prop. 4A and 3H) $f_0(\cdot, \xi)$ is locally Lipschitz and subdifferentially regular and $f(\cdot, \xi)$ is lower semicontinuous on $R^n$.

The subdifferential (with respect to $b$) is

$$\partial f_0(b, \xi) = \mathrm{conv}\{\nabla_b \varphi_i(b, \xi) \text{ for } i \text{ such that } f_0(b, \xi) = \varphi_i(b, \xi)\} = \begin{cases} -\nabla_b g(b, \bar\xi) & \text{if } \xi_0 > g(b, \bar\xi), \\ \nabla_b g(b, \bar\xi) & \text{if } \xi_0 < g(b, \bar\xi), \\ \mathrm{conv}\{-\nabla_b g(b, \bar\xi), \nabla_b g(b, \bar\xi)\} & \text{otherwise.} \end{cases} \qquad (2.3)$$

The estimate of the parameter vector $\beta$ based on the sample of size $\nu$ from the distribution $P$, i.e., the optimal solution $b^\nu$ of the mathematical program

$$\min \frac{1}{\nu} \sum_{i=1}^{\nu} f_0(b, \xi^i) \quad \text{on the set } S$$

or, equivalently,

$$\min \frac{1}{\nu} \sum_{i=1}^{\nu} f(b, \xi^i) \quad \text{on } R^n,$$

corresponds to the use of the (random) empirical probability measure $P^\nu$, which converges to $P$ in distribution almost surely. In our analysis we have to use assumptions concerned jointly with the considered probability measures $P$, $P^\nu$ and the functions $f$ or $f_0$.
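For the linear case the empirical objective and one element of its subdifferential can be written down directly from formula (2.3). The sketch below is an illustration only: it assumes g(b, x) = b^T x, picks the zero vector (the midpoint of the convex hull) at a kink, and uses hypothetical function names and data.

```python
def dot(u, v):
    """Inner product of two equally long lists."""
    return sum(ui * vi for ui, vi in zip(u, v))

def f0(b, xi0, xbar):
    """f0(b, xi) = |xi0 - b^T xbar| (assumed linear regression function)."""
    return abs(xi0 - dot(b, xbar))

def subgradient_f0(b, xi0, xbar):
    """One element of the subdifferential of f0 with respect to b, cf. (2.3)."""
    r = xi0 - dot(b, xbar)
    if r > 0:
        return [-x for x in xbar]
    if r < 0:
        return list(xbar)
    return [0.0 for _ in xbar]  # midpoint of conv{-xbar, xbar}

def empirical_objective(b, sample):
    """Average of f0 over the sample, i.e. the expectation under P^nu."""
    return sum(f0(b, xi0, xbar) for xi0, xbar in sample) / len(sample)

def empirical_subgradient(b, sample):
    """Average of the chosen subgradients; a subgradient of the convex average."""
    total = [0.0] * len(b)
    for xi0, xbar in sample:
        u = subgradient_f0(b, xi0, xbar)
        total = [t + ui for t, ui in zip(total, u)]
    return [t / len(sample) for t in total]

# Hypothetical sample of (xi0, xbar) pairs and a trial point b.
sample = [(1.0, [1.0, 0.0]), (2.0, [1.0, 1.0]), (0.0, [0.5, -1.0])]
b = [0.5, 0.5]
print(empirical_objective(b, sample), empirical_subgradient(b, sample))
```

Because the objective is convex in b in this linear case, the returned vector u satisfies the subgradient inequality: the objective at any other point b' is at least the objective at b plus u^T (b' - b).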

ASSUMPTION 2.1 To any bounded set $V \subset R^n$ there corresponds a summable function $\pi$ such that for any pair $b^0, b^1 \in V$

$$|f_0(b^0, \xi) - f_0(b^1, \xi)| \le \pi(\xi)\, \|b^0 - b^1\|.$$

COMMENT 2.2 Besides the local Lipschitz property of $f_0(\cdot, \xi)$, which is implied by the same property of $g(\cdot, \bar\xi)$, we assume the integrability of the Lipschitz constant $\pi$.

ASSUMPTION 2.3 The probability measures $P$, $P^\nu$, $\nu = 1, 2, \ldots$ are $f$-tight, i.e., $\forall b \in S$ and $\varepsilon > 0$ there is a compact set $K_\varepsilon \subset \Xi$ such that

Assumption 2.3 is fulfilled automatically for $f(b, \cdot)$ bounded or $\Xi$ compact. For $\xi$ one-dimensional, it is equivalent to uniform integrability of $f_0(b, \cdot)$ in $P^\nu$ for $b \in S$ and it is equivalent to the convergence of expectations

$$E^\nu f_0(b, \xi) = \int f_0(b, \xi)\, P^\nu(d\xi)$$

to a finite expectation

$$E f_0(b, \xi) = \int f_0(b, \xi)\, P(d\xi)$$

for all $b \in S$ (see Loève (1955), Section 11.4).

Under Assumption 2.3, similar results hold true in the multidimensional case as well (see Dupačová and Wets (1986)), namely:

The expectations $Ef$ and $E^\nu f$ are a.s. finite and lower semicontinuous on $S$ and

$$Ef = \lim_{\nu \to \infty} E^\nu f = \text{epi-}\lim_{\nu \to \infty} E^\nu f.$$

In addition, the consistency property follows (see Dupačová and Wets (1986), Theorem 3.9):

THEOREM 2.4 Let, in the definitions (2.1) and (2.2), $\xi$ be an $(m+1)$-dimensional random vector on $(\Xi, \mathcal{A}, P)$, $S \subset R^n$ be a closed nonempty set and the function $g : R^n \times R^m \to R^1$ be continuous. Let $P^\nu$, $\nu = 1, 2, \ldots$ be (random) empirical measures based on independent samples of size $\nu$ from the distribution $P$ such that Assumptions 2.1 and 2.3 hold true.

Then:

1) Any cluster point of any sequence $\{b^\nu\}_{\nu=1}^{\infty}$ such that $b^\nu \in \arg\min E^\nu f$, $\nu = 1, 2, \ldots$, almost surely belongs to $\arg\min Ef$.

2) If there is a compact set $D \subset R^n$ such that for $\nu = 1, 2, \ldots$

$$(\arg\min E^\nu f) \cap D \text{ is nonempty a.s.}$$

and

$$\{\beta\} = (\arg\min Ef) \cap D,$$

then there exists a measurable selection $\{b^\nu\}_{\nu=1}^{\infty}$ of $\{\arg\min E^\nu f\}$ such that

$$\beta = \lim_{\nu \to \infty} b^\nu \quad \text{a.s.} \qquad (2.4)$$

and also

$$\inf Ef = \lim_{\nu \to \infty} (\inf E^\nu f) \quad \text{a.s.}$$
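The consistency statement can be observed in a small simulation. The following sketch is not part of the paper and rests on several assumptions: a single regressor without intercept, an interval as the admissible set, standard normal residuals, regressors drawn uniformly from [1, 2], and a fixed random seed. In this one-parameter case the restricted L1-estimate is the weighted median of the ratios y_i / x_i (weights |x_i|) projected onto the interval. The true slope is placed on the boundary of the admissible set.

```python
import random

def weighted_median(values, weights):
    """Smallest value whose cumulative weight reaches half of the total weight."""
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2.0
    acc = 0.0
    for v, w in pairs:
        acc += w
        if acc >= half:
            return v
    return pairs[-1][0]

def restricted_lad_slope(x, y, lower, upper):
    """Minimize sum |y_i - b * x_i| over lower <= b <= upper (all x_i nonzero)."""
    ratios = [yi / xi for xi, yi in zip(x, y)]
    weights = [abs(xi) for xi in x]
    b = weighted_median(ratios, weights)   # unconstrained minimizer
    return min(max(b, lower), upper)       # convexity: project onto the interval

def simulate(nu, beta, lower, upper, seed=0):
    """Draw a sample of size nu and return the restricted L1-estimate."""
    rng = random.Random(seed)
    x = [rng.uniform(1.0, 2.0) for _ in range(nu)]
    y = [beta * xi + rng.gauss(0.0, 1.0) for xi in x]
    return restricted_lad_slope(x, y, lower, upper)

# True slope 2.0 lies on the boundary of the admissible interval [0, 2].
for nu in (10, 100, 1000, 10000):
    print(nu, simulate(nu, beta=2.0, lower=0.0, upper=2.0))
```

With growing sample size the printed estimates approach the true value 2.0, and a sizeable share of samples yields an estimate exactly on the boundary.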

For the linear L1-regression, i.e., for the problem (1.4) with an already given (observed) matrix $X$ of regressors, the existence of the optimal solutions $b^\nu$ follows via properties of the corresponding linear program, see e.g. Bloomfield and Steiger (1983). For nonlinear L1-regression this need not be the case. To guarantee the existence of optimal solutions of the programs

for noncompact $S$ one can use the inf-compactness property of the objective functions $E\{f(b, \xi)\}$ and $E^\nu\{f(b, \xi)\}$. To this purpose, it is sufficient to assume that for a set $A \in \mathcal{A}$ with $P(A) > 0$ (resp. $P^\nu(A) > 0$) the set

is bounded for all $\alpha \in R$ (see Dupačová and Wets (1986), Proposition 3.10). For the empirical measure $P^\nu$, this property is evidently fulfilled if the function $f(\cdot, \xi)$ is inf-compact for a realization of $\xi$.

Evidently, our assumptions are weaker than those of Oberhofer (1982) and it is possible to proceed in a quite similar way to get the consistency of restricted L1-estimates for other models without unnatural smoothness assumptions.


3. ASYMPTOTIC NORMALITY

Provided that all the assumptions of Theorem 2.4 needed to get the consistency result (2.4) are fulfilled, we can study the rate of convergence for (2.4) in a probabilistic sense. To this purpose, appropriate differentiability properties of our problems (2.5) are needed.

ASSUMPTION 3.1 For an arbitrary $\xi \in \Xi$, the function $g(\cdot, \bar\xi)$ is continuously differentiable.

According to results by Clarke (1983) (see also the discussion in Dupačová and Wets (1987)) we have with $\partial f_0(b, \xi)$ given by (2.3)

LEMMA 3.2 Under Assumptions 2.1, 2.3 and 3.1

and for an arbitrary $b \in S$

with equality if $\psi_S$ is subdifferentially regular at $b$.

COMMENT 3.3 a) For convex sets or for smooth manifolds, the indicator function $\psi_S$ is subdifferentially regular, see Rockafellar (1981).

b) Formula (3.1) together with (2.3) imply that for $P$ absolutely continuous $Ef_0$ is differentiable.

The properties (3.1), (3.3) imply that for an arbitrary $b \in S$ and $v(b) \in \partial E\{f(b, \xi)\}$ there exist $v_S(b) \in \partial \psi_S(b)$ and measurable $u_0(b, \cdot)$ such that almost surely

and

Similarly, according to (3.2), (3.4), for an arbitrary $b \in S$ and $v^\nu(b) \in \partial E^\nu\{f(b, \xi)\}$ we have almost surely

where

and

$$\mathrm{var}\, v_0^\nu(b) = \frac{1}{\nu}\, \mathrm{var}[u_0(b, \xi)]$$

due to subdifferential regularity of $f_0$ and to the definition of $P^\nu$.

Application of these results to necessary conditions

for the optimal solutions of the problems (2.5), i.e. for

and

implies existence of $v_S(\beta) \in \partial \psi_S(\beta)$, $v_S(b^\nu) \in \partial \psi_S(b^\nu)$ and random functions $u_0(\beta, \cdot)$, $u_0(b^\nu, \cdot)$ such that

$$u_0(\beta, \xi) \in \partial f_0(\beta, \xi) \quad \text{a.s.},$$

$$u_0(b^\nu, \xi) \in \partial f_0(b^\nu, \xi) \quad \text{a.s. for } \nu = 1, 2, \ldots$$

and

$$0 = E\{u_0(\beta, \xi)\} + v_S(\beta) = v(\beta), \qquad (3.5)$$

$$0 = E^\nu\{u_0(b^\nu, \xi)\} + v_S(b^\nu) = v^\nu(b^\nu) \quad \text{a.s. for } \nu = 1, 2, \ldots \qquad (3.6)$$

For this choice of subgradients $v^\nu(b^\nu)$, the condition

$$\sqrt{\nu}\, v^\nu(b^\nu) \to 0 \quad \text{in probability as } \nu \to \infty$$

is trivially fulfilled.

The basic idea is to apply Huber's approach (see Huber (1967), Section 4) to the subgradients $v$ and $v^\nu$ of the functions $Ef$ and $E^\nu f$ that fulfill (3.5) and (3.6) to get the asymptotic normality of $b^\nu$. The assumptions of Dupačová and Wets (1987) reduce to three basic conditions in our case:

(a) $\sqrt{\nu}\, [v^\nu(\beta) + v(b^\nu)] \to 0$ in probability as $\nu \to \infty$.

(b) $Ef_0$ is twice continuously differentiable at the point $\beta$ with nonsingular Hessian $H$.

(c) $\sqrt{\nu}\, [v_S(b^\nu) - v_S(\beta)] \to 0$ in probability as $\nu \to \infty$.

The first two properties resemble results of Huber (1967) and their validity can be proved under various sets of sufficient conditions. The property (c) is of a different nature. It is trivially satisfied if $\beta$ and $b^\nu$ for $\nu$ large enough are interior points of $S$. To indicate briefly that all mentioned conditions can be fulfilled we shall concentrate on the case of linearly restricted linear L1-regression; the nonlinear case is substantially more complicated due to the fact that the function $f_0$ is neither convex nor differentiable.

We assume that the true parameter vector $\beta$ is the optimal solution of the mathematical program

$$\text{minimize } E\{f_0(b, \xi)\} \text{ subject to } Ab \ge c \qquad (3.7)$$

and it is estimated by optimal solutions $b^\nu$ of the programs

$$\text{minimize } E^\nu\{f_0(b, \xi)\} \text{ subject to } Ab \ge c; \qquad (3.8)$$

$A(m, n)$ and $c(m, 1)$ are given matrices of constant elements and $f_0(b, \xi) = |\xi_0 - b^T \bar\xi|$.

The corresponding Lagrangian functions have the form

$$L(b, y) = \begin{cases} \int f_0(b, \xi)\, P(d\xi) - y^T(Ab - c) & \text{for } y \ge 0, \\ -\infty & \text{otherwise} \end{cases} \qquad (3.9)$$

and

$$L^\nu(b, y) = \begin{cases} \int f_0(b, \xi)\, P^\nu(d\xi) - y^T(Ab - c) & \text{for } y \ge 0, \\ -\infty & \text{otherwise.} \end{cases} \qquad (3.10)$$

Under Assumptions 2.1 and 2.3 (applied to the considered function $f_0$ instead of $f$) an assertion about consistency of saddle points $(b^\nu, y^\nu)$ parallel to that of Theorem 2.4 can be proved (see Dupačová and Wets (1987), Theorem 5.2). The existence of saddle points in the case of linearly restricted linear L1-regression is guaranteed thanks to the special type of constraints and of the function $f_0$.

Also in this case,

and

are necessary and sufficient conditions for $(\beta, \eta)$ and $(b^\nu, y^\nu)$ to be saddle points of the Lagrangian functions $L$ and $L^\nu$ with respect to the set $S = R^n \times R^m_+$.

The special form of the set $S$ together with consistency of $(b^\nu, y^\nu)$ help to eliminate the constraints in (3.9) and (3.10) provided that the strict complementarity conditions hold true for $(\beta, \eta)$, i.e.,

$$\eta_i > 0 \iff \sum_{j=1}^{n} a_{ij} \beta_j = c_i \quad \forall i. \qquad (3.11)$$

Denote by $I \subset \{1, \ldots, m\}$ the set of indices for which $\eta_i > 0$, i.e., for which the $i$-th constraint is active for the true parameter vector $\beta$. Evidently,

$$y_i^\nu = 0 \text{ for } i \notin I \quad \text{and} \quad \sum_{j=1}^{n} a_{ij} b_j^\nu = c_i \text{ for } i \in I \quad \text{a.s.}$$

for $\nu$ large enough. Denote $A_I = (a_{ij})_{i \in I,\ j = 1, \ldots, n}$. In this situation, we are in fact interested in studying the asymptotic behavior of the unconstrained saddle points $(b^\nu, y_I^\nu)$ of the reduced Lagrangian function

for $\nu \to \infty$. All we need for asymptotic normality of the estimates $b^\nu$ are the corresponding versions of conditions (a), (b) with $v^\nu(\beta)$ and $v(b^\nu)$ replaced by $v_0^\nu(\beta) - A_I^T \eta_I$ and $v_0(b^\nu) - A_I^T y_I^\nu$ and with $Ef$ replaced by the reduced Lagrangian function $L_I$.
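The index set I of active constraints and the submatrix A_I can be read off mechanically once a point b is given. The sketch below is only an illustration: the tolerance, the function names and the example data are assumptions, and indices are counted from 0 as is usual in Python, whereas the text counts from 1.

```python
def active_set(A, c, b, tol=1e-9):
    """Indices i whose constraint holds with equality: sum_j a_ij * b_j = c_i."""
    return [i for i, (row, ci) in enumerate(zip(A, c))
            if abs(sum(a * bj for a, bj in zip(row, b)) - ci) <= tol]

def reduced_system(A, c, I):
    """Rows of A and entries of c that belong to the index set I."""
    return [A[i] for i in I], [c[i] for i in I]

# Hypothetical constraint data for n = 2 parameters and m = 3 constraints.
A = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
c = [0.0, 0.0, 1.0]
b = [0.0, 1.0]

I = active_set(A, c, b)
A_I, c_I = reduced_system(A, c, I)
print(I, A_I, c_I)
```

Only the rows in I enter the reduced Lagrangian function; the remaining constraints are inactive and their multipliers vanish.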


THEOREM 3.4 Let the true parameter vector $\beta$ be the point of minimum of the expectation of the function $f_0(b, \xi) = |\xi_0 - b^T \bar\xi|$ on the set $S = \{b : Ab \ge c\}$. Assume further:

(i) For the true parameter vector $\beta$, the random vector $\bar\xi$ and the residual $e$ in

$$\xi_0 = \bar\xi^T \beta + e$$

are independent with densities $h_1$ and $h_2$ such that $h_2(0) > 0$.

(ii) The absolute values $|\xi_i|$, $i = 0, 1, \ldots, n$, of the components of the random vector $\xi$ are uniformly integrable with respect to $P^\nu$, $\nu = 1, 2, \ldots$.

(iii) The absolute moments $E\|\bar\xi\|^k$, $k = 1, 2, 3$ exist and $E\bar\xi\bar\xi^T$ is finite and nonsingular.

(iv) For the true parameter vector $\beta$ and for the corresponding saddle point $(\beta, \eta)$ of (3.9), the strict complementarity conditions (3.11) hold true. The matrix $A_I$ is of full row rank.

Then: $\sqrt{\nu}\,(b^\nu - \beta)$ is asymptotically normal $N(0, C \Sigma C^T)$ with $\Sigma = \mathrm{var}\, \bar\xi$, $C = H^{-1}(I - A_I^T (A_I H^{-1} A_I^T)^{-1} A_I H^{-1})$ and $H = 2 h_2(0)\, E\bar\xi\bar\xi^T$.
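The covariance formula of Theorem 3.4 is easy to evaluate numerically. The sketch below is a hedged illustration, not the author's computation: it uses plain nested lists with a small Gauss-Jordan inverse, standard normal residuals (so that h_2(0) = 1/sqrt(2 pi)), a hypothetical second-moment matrix that is also used as Sigma, and a single active constraint row. All names and numbers are assumptions.

```python
import math

def matmul(X, Y):
    """Product of two matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def inverse(X):
    """Gauss-Jordan inverse with partial pivoting (X assumed nonsingular)."""
    n = len(X)
    M = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(X)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[n:] for row in M]

def asymptotic_covariance(h2_at_0, second_moment, sigma, A_I):
    """Return (C Sigma C^T, C) with H = 2 h2(0) E[xbar xbar^T]."""
    n = len(second_moment)
    H = [[2.0 * h2_at_0 * v for v in row] for row in second_moment]
    H_inv = inverse(H)
    A_T = transpose(A_I)
    middle = inverse(matmul(matmul(A_I, H_inv), A_T))
    P = matmul(matmul(matmul(A_T, middle), A_I), H_inv)
    I_minus_P = [[(1.0 if i == j else 0.0) - P[i][j] for j in range(n)]
                 for i in range(n)]
    C = matmul(H_inv, I_minus_P)
    return matmul(matmul(C, sigma), transpose(C)), C

# Hypothetical inputs: zero-mean regressors, so Sigma equals the second moments.
second_moment = [[1.0, 0.2], [0.2, 2.0]]
A_I = [[1.0, 1.0]]                      # one active constraint
h2_at_0 = 1.0 / math.sqrt(2.0 * math.pi)
cov, C = asymptotic_covariance(h2_at_0, second_moment, second_moment, A_I)
print(cov)
```

A useful sanity check is that A_I C vanishes, so the limiting distribution is degenerate in the directions of the active constraint rows. This agrees with the observation that the active constraints hold with equality for the estimates when the sample is large enough.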

4. PROOF OF THEOREM 3.4

The assumed existence of $E\|\bar\xi\|$ and the uniform integrability of $|\xi_i|$, $i = 0, 1, \ldots, n$, imply that Assumptions 2.1 and 2.3 (needed for consistency) are fulfilled for $f_0(b, \xi) = |\xi_0 - b^T \bar\xi|$.

Denote by

with $u_0(b, \xi) \in \partial f_0(b, \xi)$ a subgradient of the reduced Lagrangian function $L_I(b, y_I)$. Following our discussion from Section 3, we can choose $u_0(b, \xi)$ in such a way that

so that the condition

$$\frac{1}{\sqrt{\nu}} \sum_{j=1}^{\nu} l(b^\nu, y_I^\nu, \xi^j) \to 0 \quad \text{in probability as } \nu \to \infty$$

is evidently fulfilled.

Let us study the properties of the subgradients $u_0(b, \xi)$.

LEMMA 4.1 Denote

Then under assumption (i) of Theorem 3.4 there is a positive constant $k$ such that

and

$$E\{u_d^2(b, \xi)\} \le 2kd\, E\|\bar\xi\|^3.$$

PROOF According to (2.3), we have

$$u_0(b, \xi) \begin{cases} = -\bar\xi & \text{if } \xi_0 > \bar\xi^T b, \\ = \bar\xi & \text{if } \xi_0 < \bar\xi^T b, \\ \in \mathrm{conv}\{\bar\xi, -\bar\xi\} & \text{if } \xi_0 = \bar\xi^T b, \end{cases} \qquad (4.1)$$

so that

$$u_d(b, \xi) \begin{cases} = 0 & \text{if } O_d(b) \cap \{b' : \bar\xi^T b' = \xi_0\} = \emptyset, \\ \le 2\|\bar\xi\| & \text{otherwise.} \end{cases} \qquad (4.2)$$

For a given $d$, $b$ and $\xi$, the condition

$$O_d(b) \cap \{b' : \bar\xi^T b' = \xi_0\} = \emptyset$$

can be equivalently expressed as

$$\rho(b, p(\xi)) \ge d,$$

where $\rho(b, p(\xi))$ denotes the distance of $b$ from the hyperplane $p(\xi) = \{b' : \bar\xi^T b' = \xi_0\}$, i.e.,

$$\rho(b, p(\xi)) = \frac{|\bar\xi^T b - \xi_0|}{\|\bar\xi\|}. \qquad (4.3)$$

Using (4.2), (4.3), we get

where $M_d(b) = \{\xi : \rho(b, p(\xi)) < d\} = \{\xi : |\bar\xi^T b - \xi_0| < d\, \|\bar\xi\|\}$.

Substituting $\bar\xi^T \beta + e$ for $\xi_0$ we have

In a similar way,

LEMMA 4.2 (Existence of Hessian) Under assumption (i) of Theorem 3.4, $Ef_0$ is twice continuously differentiable at the point $\beta$ with Hessian

$$H = 2 h_2(0)\, E\bar\xi\bar\xi^T$$

provided that the expectation $E\bar\xi\bar\xi^T$ exists.

PROOF The function $Ef_0$ can be written as

$$Ef_0(b) = \int |\xi_0 - \bar\xi^T b|\, P(d\xi) = \int\!\!\int |z^T(\beta - b) + e|\, h_1(z)\, h_2(e)\, dz\, de$$

and its gradient (see (4.1), (3.1) and Comment 3.3b)

Accordingly, the matrix of the 2nd order derivatives is

$$H(b) = 2 \int z\, h_1(z)\, h_2(z^T(b - \beta))\, z^T\, dz,$$

so that

$$H(\beta) = 2 h_2(0)\, E\bar\xi\bar\xi^T.$$
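The Hessian formula can be checked numerically in a case where the expectation is available in closed form. The sketch below is only an illustration under assumptions that are not in the paper: a scalar parameter, standard normal residuals, and a regressor taking the values 1 and 2 with equal probability. For a standard normal e one has E|e - t| = t (2 Phi(t) - 1) + 2 phi(t), which gives the expected objective exactly, and a central second difference at the true parameter is compared with 2 h_2(0) E x^2.

```python
import math

def phi(t):
    """Standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def Phi(t):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def G(t):
    """E|e - t| for e ~ N(0, 1)."""
    return t * (2.0 * Phi(t) - 1.0) + 2.0 * phi(t)

X_VALUES = [1.0, 2.0]   # hypothetical regressor values, equally likely
BETA = 0.7              # hypothetical true parameter

def expected_f0(b):
    """E|xi0 - b*x| with xi0 = BETA*x + e, i.e. E|e - (b - BETA)*x|."""
    return sum(G((b - BETA) * x) for x in X_VALUES) / len(X_VALUES)

def second_difference(fn, b, h=1e-3):
    """Central finite-difference approximation of the second derivative."""
    return (fn(b + h) - 2.0 * fn(b) + fn(b - h)) / (h * h)

hessian_numeric = second_difference(expected_f0, BETA)
hessian_formula = 2.0 * phi(0.0) * sum(x * x for x in X_VALUES) / len(X_VALUES)
print(hessian_numeric, hessian_formula)
```

The two numbers agree up to the discretization error of the finite difference, which illustrates that the expected objective is smooth at the true parameter even though each summand |xi_0 - b x| has a kink.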

LEMMA 4.3 Sufficient conditions that $\beta$ be an isolated global minimum of $Ef_0(b) = E|\xi_0 - b^T \bar\xi|$ on $S = \{b : Ab \ge c\}$ and the associated Lagrangian multiplier $\eta$ be unique are:

(i) $A\beta \ge c$, $\eta^T(A\beta - c) = 0$, $\eta \ge 0$

(ii) For $I = \{i : \sum_{j=1}^{n} a_{ij} \beta_j = c_i\}$, the matrix

is of full row rank and

(iii) Assumption (i) of Theorem 3.4 holds true and $E\bar\xi\bar\xi^T$ is nonsingular.

PROOF Condition (i) is the first-order necessary condition, condition (ii) contains the linear independence condition and the strict complementarity conditions, and condition (iii) together with Lemma 4.2 implies that the second-order sufficient condition is fulfilled. The result follows e.g. from Theorem 3.2.2 of Fiacco (1983).

If condition (ii) is fulfilled, we can rewrite the first-order conditions (i) in the form

Conditions (ii), (iii) of Lemma 4.3 together with assumption (i) of Theorem 3.4 imply that the matrix of the second-order derivatives of the reduced Lagrange function $L_I(b, y_I)$ at the point $(\beta, \eta_I)$ is nonsingular. Accordingly, we have

LEMMA 4.4 Under assumptions (i), (iv) of Theorem 3.4 complemented by assumption (iii) of Lemma 4.3, condition (b) is fulfilled for $L_I(b, y_I)$.

Condition (a) can be written as

in probability as $\nu \to \infty$.

To get the desired convergence property of

we shall check under which circumstances the conditions (N-1)-(N-4) of Huber (1967) are fulfilled: Measurability and separability of $l(b, y_I, \xi)$, cf. (N-1), is evidently fulfilled; existence and uniqueness of the true $\beta$, $\eta_I$, cf. (N-2) and (N-3i), follow from the assumptions of Lemma 4.3; and the properties of the subgradients $l(b, y_I, \xi)$, cf. (N-3ii), (N-3iii) and (N-4), can be obtained using Lemma 4.1 and assumption (iii) of Theorem 3.4.

Denote

LEMMA 4.5 Let assumption (i) of Theorem 3.4 be fulfilled and let the absolute moments $E\|\bar\xi\|^2$, $E\|\bar\xi\|^3$ exist. Then there are positive constants $K$, $K'$ such that

The expected value $E\{\|l(\beta, \eta_I, \xi)\|^2\}$ is finite.

PROOF We have

$$\ldots + \sup_{\|y_I' - y_I\| \le d} \|A_I^T(y_I - y_I')\|,$$

so that

$$E\, l_d(b, y_I, \xi) \le kd\, E\|\bar\xi\|^2 + a\, d \le Kd$$

according to Lemma 4.1.

Similarly,

$$\ldots + \sup_{\|y_I' - y_I\| \le d} \|A_I^T(y_I - y_I')\|^2 + \sup_{\|b' - b\| \le d} \|A_I(b - b')\|^2$$

and

$$E\, l_d^2(b, y_I, \xi) \le 2kd\, E\|\bar\xi\|^3 + \ldots + 2a^2 d^2 \le K'd.$$

The last condition is evidently fulfilled as $E\{\|u_0(\beta, \xi)\|^2\} = E\|\bar\xi\|^2$.

REFERENCES

Arthanari, T.S. and Y. Dodge (1982): Mathematical Programming in Statistics. Wiley, New York.

Bassett, G. Jr. and R. Koenker (1978): Asymptotic theory of least absolute error regression. J. ASA 73, 618-622.

Barrodale, I. and F.D.K. Roberts (1977): Algorithms for restricted least absolute value estimation. Commun. Statist.-Simula. Computa. B6(4), 353-363.

Bloomfield, P. and W.L. Steiger (1983): Least Absolute Deviations: Theory, Applications and Algorithms. Birkhäuser, Boston.

Clarke, F.H. (1983): Optimization and Nonsmooth Analysis. Wiley Interscience, New York.

Dupačová, J. (1987): Asymptotic properties of restricted L1-estimates of regression. To appear in Proc. of the 1st International Conference on Statistical Data Analysis Based on the L1 Norm and Related Methods, Neuchâtel, 31 August - 4 September 1987. North-Holland, Amsterdam.

Dupačová, J. and R. Wets (1986): Asymptotic behavior of statistical estimators and optimal solutions for stochastic programming problems. WP-86-41, IIASA, Laxenburg.

Dupačová, J. and R. Wets (1987): Asymptotic behavior of statistical estimators and of optimal solutions of stochastic optimization problems, II. WP-87-9, IIASA, Laxenburg.

Fiacco, A.V. (1983): Introduction to Sensitivity and Stability Analysis in Nonlinear Programming. Academic Press, New York.

Huber, P. (1967): The behavior of maximum likelihood estimates under nonstandard conditions. Proc. 5th Berkeley Symp. Math. Stat. Prob. 1, 221-233.

Huber, P. (1973): Robust regression: Asymptotics, conjectures and Monte Carlo. Ann. of Stat. 1, 799-821.

Loève, M. (1955): Probability Theory. Van Nostrand, New York.

Oberhofer, W. (1982): The consistency of nonlinear regression minimizing the L1-norm. Ann. of Stat. 10, 316-319.

Rockafellar, R.T. (1981): The Theory of Subgradients and its Applications to Problems of Optimization. Convex and Nonconvex Functions. Research and Education in Mathematics 1, Heldermann Verlag, Berlin.
