NOT FOR QUOTATION WITHOUT THE PERMISSION OF THE AUTHOR
ASYMPTOTIC PROPERTIES OF RESTRICTED L1-ESTIMATES OF REGRESSION
February 1987 WP-87-18
Working Papers are interim reports on work of the International Institute for Applied Systems Analysis and have received only limited review. Views or opinions expressed herein do not necessarily represent those of the Institute or of its National Member Organizations.
INTERNATIONAL INSTITUTE FOR APPLIED SYSTEMS ANALYSIS A-2361 Laxenburg, Austria
FOREWORD
The recent results on asymptotic behavior of statistical estimates and of optimal solutions of stochastic optimization problems obtained by Dupačová and Wets are used to prove consistency of restricted L1-estimates under more general assumptions. For the special case of linearly restricted linear L1-regression, a Lagrangian approach is used to achieve asymptotic normality.
Alexander B. Kurzhanski, Chairman, System and Decision Sciences Program
ABSTRACT
Asymptotic properties of L1-estimates in linear regression have been studied by many authors, see e.g. Bassett and Koenker (1978), Bloomfield and Steiger (1983). It is the lack of smoothness which does not allow the known results on asymptotic behavior of M-estimates (Huber (1967)) to be used directly. The additional lack of convexity in the nonlinear regression case increases the complexity of the problem even under the assumption that the true parameter values belong to the interior of the given parameter set; for a consistency result in this case see e.g. Oberhofer (1982).
We shall use the technique developed in Dupačová and Wets (1986), (1987) to get asymptotic properties of the L1-estimates of regression coefficients which are assumed to belong to an a priori given closed convex set given, e.g., by constraints of general equality and inequality form. The method uses, inter alia, tools of nondifferentiable calculus and epi-convergence and it can be applied to other classes of L1-estimates as well.
CONTENTS
1 Introduction
2 Consistency
3 Asymptotic Normality
4 Proof of Theorem 3.4
References
whose nondifferentiability precludes direct application of the related asymptotic results. Nevertheless, asymptotic normality of the L1-estimates of regression coefficients was proved by Bassett and Koenker (1978) for linear regression with nonrandom regressors and by Bloomfield and Steiger (1983) for linear regression with random regressors under ergodicity and stationarity assumptions. A consistency result for L1-estimates of parameters in a nonlinear regression model can be found in Oberhofer (1982).
In model (1.1) and, correspondingly, in the optimization problem (1.3), restrictions on the estimated parameter values can be taken into account to respect technical and modeling considerations and, eventually, to guarantee the uniqueness of the estimate (Barrodale and Roberts (1977)). Inequality constraints on the estimates, however, introduce an essential lack of smoothness. That is why one usually assumes (see e.g. Huber (1967), Oberhofer (1982)) that the true parameter vector is an interior point of the given admissible set.
For the case of linearly restricted linear regression the simplex method can be used to get the restricted L1-estimates, i.e., to get the optimal solution of the mathematical programming problem
where $S$ is a nonempty convex polyhedral set. (For a survey on solution techniques see e.g. Barrodale and Roberts (1977) or Arthanari and Dodge (1982).)
The optimal solution of (1.4) very often lies on the boundary of $S$, which does not conform with the mentioned assumption that the true parameter value is an interior point of $S$.
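The boundary phenomenon can be seen in a very small instance. The sketch below is only an illustration under stated assumptions: it treats a one-regressor model with a box constraint, solves it by enumerating kinks of the piecewise linear objective rather than by the simplex method cited above, and all function names and data are hypothetical.

```python
def l1_objective(b, xs, ys):
    """Sum of absolute residuals for the one-regressor model y ~ b * x."""
    return sum(abs(y - b * x) for x, y in zip(xs, ys))


def restricted_l1_slope(xs, ys, lower, upper):
    """Minimize the L1 objective over the interval [lower, upper].

    The objective is convex and piecewise linear in b, so a minimizer is
    attained at a kink y_i / x_i inside the interval or at an endpoint.
    """
    candidates = [lower, upper]
    candidates += [y / x for x, y in zip(xs, ys)
                   if x != 0 and lower <= y / x <= upper]
    return min(candidates, key=lambda b: l1_objective(b, xs, ys))


# Hypothetical data, roughly following y = 2 x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.3, 7.8]

b_free = restricted_l1_slope(xs, ys, -10.0, 10.0)  # constraint inactive
b_box = restricted_l1_slope(xs, ys, 0.0, 1.5)      # constraint active
```

With the wide interval the minimizer is an interior kink of the objective; with the tight interval it is the upper endpoint, i.e. a boundary point of the admissible set.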
We shall use the technique developed in Dupačová and Wets (1986), (1987) to get consistency and asymptotic normality of restricted L1-estimates. The special form of the objective function together with the use of the empirical distribution help to simplify the assumptions used in the mentioned papers. Further simplification is possible in cases where $g$ is linear both in $b$ and $x$, i.e., for linear regression.
2. CONSISTENCY
Let $\xi$ be an $(m+1)$-dimensional random vector on $(\Xi, \mathcal{A}, P)$ with components $\xi_0, \xi_1, \ldots, \xi_m$ and $f_0 : R^n \times R^{m+1} \to R^1$ the function
$$f_0(b, \xi) = |\xi_0 - g(b, \tilde{\xi})|, \qquad (2.1)$$
where $\tilde{\xi} = (\xi_1, \ldots, \xi_m)$. Assume that $S \subset R^n$ is a given nonempty closed set of admissible parameter values and define $f : R^n \times R^{m+1} \to R^1 \cup \{+\infty\}$ as the function
$$f(b, \xi) = \begin{cases} f_0(b, \xi) & \text{for } b \in S \text{ and } \xi \in R^{m+1}, \\ +\infty & \text{for } b \notin S. \end{cases} \qquad (2.2)$$
Observe that $f(b, \xi) = f_0(b, \xi) + \psi_S(b)$, where $\psi_S$ is the indicator function of the set $S$:
$$\psi_S(b) = 0 \text{ for } b \in S \quad \text{and} \quad \psi_S(b) = +\infty \text{ if } b \notin S.$$
Let us first discuss the properties of the functions $f$ and $f_0$.
a) If the function $g : R^n \times R^m \to R^1$ in (2.1) is a continuous function, then evidently $f$ is nonnegative and continuous in $\xi$, and $f_0$ is nonnegative and continuous in $b$ and $\xi$.
b) If for an arbitrary $\xi \in \Xi$ the function $g$ in (2.1) is locally Lipschitz in $b$, then for all $\xi \in \Xi$, $f_0(\cdot, \xi)$ is locally Lipschitz in $b$ and $f(\cdot, \xi)$ is lower semicontinuous.
c) Taking into account the special form of $f_0(b, \xi)$ we can write (using again the notation $\tilde{\xi}$ for the $m$-dimensional subvector of components $\xi_1, \ldots, \xi_m$)
$$f_0(b, \xi) = \max\{\varphi_1(b, \xi), \varphi_2(b, \xi)\}$$
with
$$\varphi_1(b, \xi) = \xi_0 - g(b, \tilde{\xi}), \qquad \varphi_2(b, \xi) = -\xi_0 + g(b, \tilde{\xi}).$$
If for an arbitrary $\xi \in \Xi$ the function $g$ is continuously differentiable in $b$, then for all $\xi \in \Xi$ (see Rockafellar (1981), Prop. 4A and 3H) $f_0(\cdot, \xi)$ is locally Lipschitz and subdifferentially regular and $f(\cdot, \xi)$ is lower semicontinuous on $R^n$.
The subdifferential (with respect to $b$) is
$$\partial f_0(b, \xi) = \operatorname{conv}\{\nabla_b \varphi_i(b, \xi) \text{ for } i \text{ such that } f_0(b, \xi) = \varphi_i(b, \xi)\} = \begin{cases} -\nabla_b g(b, \tilde{\xi}) & \text{if } \xi_0 > g(b, \tilde{\xi}), \\ \nabla_b g(b, \tilde{\xi}) & \text{if } \xi_0 < g(b, \tilde{\xi}), \\ \operatorname{conv}\{-\nabla_b g(b, \tilde{\xi}), \nabla_b g(b, \tilde{\xi})\} & \text{otherwise.} \end{cases} \qquad (2.3)$$
The estimate of the parameter vector $\beta$ based on the sample of size $\nu$ from the distribution $P$, i.e., the optimal solution $b^\nu$ of the mathematical program
$$\min \sum_{i=1}^{\nu} f_0(b, \xi^i) \quad \text{on the set } S$$
or, equivalently,
$$\min \sum_{i=1}^{\nu} f(b, \xi^i) \quad \text{on } R^n,$$
corresponds to the use of the (random) empirical probability measure $P^\nu$ which converges to $P$ in distribution almost surely. In our analysis we have to use assumptions concerned jointly with the considered probability measures $P$, $P^\nu$ and the functions $f$ or $f_0$.
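The subdifferential formula (2.3) and the sample objective can be sketched in code for the linear special case $g(b, \tilde{\xi}) = b^T \tilde{\xi}$. That special case, the function names and the three-point sample are assumptions made only for this illustration; at a kink the sketch returns the zero vector, which is one admissible element of the convex hull in (2.3).

```python
def g(b, xi_tilde):
    """Linear regression function g(b, xi~) = b^T xi~ (assumed special case)."""
    return sum(bj * xj for bj, xj in zip(b, xi_tilde))


def f0(b, xi0, xi_tilde):
    """Absolute residual f0(b, xi) = |xi_0 - g(b, xi~)|."""
    return abs(xi0 - g(b, xi_tilde))


def subgradient_f0(b, xi0, xi_tilde):
    """Return one element of the subdifferential of f0(., xi) at b.

    For linear g the gradient of g with respect to b is xi~ itself.
    """
    residual = xi0 - g(b, xi_tilde)
    if residual > 0:
        return [-x for x in xi_tilde]
    if residual < 0:
        return list(xi_tilde)
    # At a kink any convex combination of -xi~ and xi~ is valid; pick 0.
    return [0.0 for _ in xi_tilde]


def empirical_objective(b, sample):
    """Sum of f0(b, xi^i) over a sample of (xi_0, xi~) pairs."""
    return sum(f0(b, xi0, xt) for xi0, xt in sample)


# Hypothetical sample of three observations (xi_0, xi~).
sample = [(1.0, [1.0, 0.0]), (2.0, [0.0, 1.0]), (0.0, [1.0, 1.0])]
value_at_origin = empirical_objective([0.0, 0.0], sample)
```

Because $f_0(\cdot, \xi)$ is convex in this linear case, every returned vector $u$ satisfies the subgradient inequality $f_0(b', \xi) \ge f_0(b, \xi) + u^T(b' - b)$.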
ASSUMPTION 2.1 To any bounded set $V \subset R^n$ there corresponds a summable function $\pi$ such that for any pair $b^0, b^1 \in V$
$$|f_0(b^0, \xi) - f_0(b^1, \xi)| \le \pi(\xi)\, \|b^0 - b^1\|.$$
COMMENT 2.2 Besides the local Lipschitz property of $f_0(\cdot, \xi)$, which is implied by the same property of $g(\cdot, \tilde{\xi})$, we assume the integrability of the Lipschitz constant $\pi$.
ASSUMPTION 2.3 The probability measures $P$, $P^\nu$, $\nu = 1, 2, \ldots$ are $f$-tight, i.e., for all $b \in S$ and $\varepsilon > 0$ there is a compact set $K_\varepsilon \subset \Xi$ such that
Assumption 2.3 is fulfilled automatically for $f(b, \cdot)$ bounded or $\Xi$ compact. For $\xi$ one-dimensional, it is equivalent to uniform integrability of $f_0(b, \cdot)$ in $P^\nu$ for $b \in S$ and it is equivalent to the convergence of the expectations $E^\nu f_0(b, \xi)$ to a finite expectation $E f_0(b, \xi)$ for all $b \in S$ (see Loève (1955), Section 11.4).
Under Assumption 2.3, similar results hold true in the multidimensional case as well (see Dupačová and Wets (1986)), namely: The expectations $Ef$ and $E^\nu f$ of $f(b, \xi)$ with respect to $P$ and $P^\nu$ are a.s. finite and lower semicontinuous on $S$ and
$$Ef = \lim_{\nu \to \infty} E^\nu f = \text{epi-}\lim_{\nu \to \infty} E^\nu f.$$
In addition, the consistency property follows (see Dupačová and Wets (1986), Theorem 3.9):
THEOREM 2.4 Let, in the definitions (2.1) and (2.2), $\xi$ be an $(m+1)$-dimensional random vector on $(\Xi, \mathcal{A}, P)$, $S \subset R^n$ be a closed nonempty set and the function $g : R^n \times R^m \to R^1$ be continuous. Let $P^\nu$, $\nu = 1, 2, \ldots$ be (random) empirical measures based on independent samples of size $\nu$ from the distribution $P$ such that Assumptions 2.1 and 2.3 hold true. Then:
1) Any cluster point of any sequence $\{b^\nu\}_{\nu=1}^{\infty}$ such that $b^\nu \in \arg\min E^\nu f$, $\nu = 1, 2, \ldots$, almost surely belongs to $\arg\min Ef$.
2) If there is a compact set $D \subset R^n$ such that for $\nu = 1, 2, \ldots$, $(\arg\min E^\nu f) \cap D$ is nonempty a.s. and
$$\{\beta\} = (\arg\min Ef) \cap D,$$
then there exists a measurable selection $\{b^\nu\}_{\nu=1}^{\infty}$ of $\{\arg\min E^\nu f\}$ such that
$$\beta = \lim_{\nu \to \infty} b^\nu \quad \text{a.s.} \qquad (2.4)$$
and also
$$\inf Ef = \lim_{\nu \to \infty} (\inf E^\nu f) \quad \text{a.s.}$$
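The consistency statement can be illustrated by a small simulation. The sketch below is not part of the proof and rests on assumptions chosen for the example: a single regressor uniformly distributed on $[1, 2]$, standard normal residuals, the true slope $\beta = 2$ placed on the boundary of the admissible interval $[0, 2]$, and a fixed random seed. The helper names are hypothetical.

```python
import random


def restricted_l1_slope(xs, ys, lower, upper):
    """Minimize sum |y_i - b x_i| over b in [lower, upper] (x_i nonzero)."""
    def objective(b):
        return sum(abs(y - b * x) for x, y in zip(xs, ys))

    # A convex piecewise linear objective attains its minimum at a kink
    # inside the interval or at one of the endpoints.
    candidates = [lower, upper]
    candidates += [y / x for x, y in zip(xs, ys) if lower <= y / x <= upper]
    return min(candidates, key=objective)


def simulate(nu, beta=2.0, lower=0.0, upper=2.0, seed=0):
    """Draw a sample of size nu and return the restricted L1-estimate."""
    rng = random.Random(seed)
    xs = [rng.uniform(1.0, 2.0) for _ in range(nu)]
    ys = [beta * x + rng.gauss(0.0, 1.0) for x in xs]
    return restricted_l1_slope(xs, ys, lower, upper)


# Absolute estimation errors for growing sample sizes.
errors = {nu: abs(simulate(nu) - 2.0) for nu in (25, 100, 400)}
```

In runs of this kind the error for the largest sample is typically small, and the estimate frequently coincides with the boundary value, which is the situation excluded by interior-point assumptions.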
For the linear L1-regression, i.e., for the problem (1.4) with an already given (observed) matrix $X$ of regressors, the existence of the optimal solutions $b^\nu$ follows via properties of the corresponding linear program, see e.g. Bloomfield and Steiger (1983). For nonlinear L1-regression this need not be the case. To guarantee the existence of optimal solutions of the programs
for noncompact $S$ one can use the inf-compactness property of the objective functions $E\{f(b, \xi)\}$ and $E^\nu\{f(b, \xi)\}$. To this purpose, it is sufficient to assume that for a set $A \in \mathcal{A}$ with $P(A) > 0$ (resp. $P^\nu(A) > 0$) the set
is bounded for all $\alpha \in R$ (see Dupačová and Wets (1986), Proposition 3.10). For the empirical measure $P^\nu$, this property is evidently fulfilled if the function $f(\cdot, \xi)$ is inf-compact for a realization of $\xi$.
Evidently, our assumptions are weaker than those by Oberhofer (1982) and it is possible to proceed in a quite similar way to get the consistency of restricted L1-estimates for other models without unnatural smoothness assumptions.
3. ASYMPTOTIC NORMALITY
Provided that all the assumptions of Theorem 2.4 needed to get the consistency result (2.4) are fulfilled, we can study the rate of convergence for (2.4) in a probabilistic sense. To this purpose, appropriate differentiability properties of our problems (2.5) are needed.
ASSUMPTION 3.1 For an arbitrary $\xi \in \Xi$, the function $g(\cdot, \tilde{\xi})$ is continuously differentiable.
According to results by Clarke (1983) (see also the discussion in Dupačová and Wets (1987)) we have with $\partial f_0(b, \xi)$ given by (2.3)
LEMMA 3.2 Under Assumptions 2.1, 2.3 and 3.1
and for an arbitrary $b \in S$
with equality if $\psi_S$ is subdifferentially regular at $b$.
COMMENT 3.3 a) For convex sets or for smooth manifolds, the indicator function $\psi_S$ is subdifferentially regular, see Rockafellar (1981).
b) Formula (3.1) together with (2.3) implies that for $P$ absolutely continuous $Ef_0$ is differentiable.
The properties (3.1), (3.3) imply that for an arbitrary $b \in S$ and $v(b) \in \partial E\{f(b, \xi)\}$ there exist $v_S(b) \in \partial \psi_S(b)$ and a measurable $u_0(b, \cdot)$ such that almost surely
$$u_0(b, \xi) \in \partial f_0(b, \xi)$$
and
$$v(b) = E\{u_0(b, \xi)\} + v_S(b).$$
Similarly, according to (3.2), (3.4), for an arbitrary $b \in S$ and $v^\nu(b) \in \partial E^\nu\{f(b, \xi)\}$ we have almost surely
$$v^\nu(b) = v_0^\nu(b) + v_S(b),$$
where
$$v_0^\nu(b) = E^\nu\{u_0(b, \xi)\}$$
and
$$\operatorname{var} v_0^\nu(b) = \frac{1}{\nu} \operatorname{var}[u_0(b, \xi)]$$
due to subdifferential regularity of $f_0$ and to the definition of $P^\nu$.
Application of these results to the necessary conditions $0 \in \partial Ef(\beta)$ and $0 \in \partial E^\nu f(b^\nu)$ for the optimal solutions of the problems (2.5), i.e. for $\beta \in \arg\min Ef$ and $b^\nu \in \arg\min E^\nu f$, implies the existence of $v_S(\beta) \in \partial \psi_S(\beta)$, $v_S(b^\nu) \in \partial \psi_S(b^\nu)$ and random functions $u_0(\beta, \cdot)$, $u_0(b^\nu, \cdot)$ such that
$$u_0(\beta, \xi) \in \partial f_0(\beta, \xi) \quad \text{a.s.}$$
$$u_0(b^\nu, \xi) \in \partial f_0(b^\nu, \xi) \quad \text{a.s. for } \nu = 1, 2, \ldots$$
and
$$0 = E\{u_0(\beta, \xi)\} + v_S(\beta) = v(\beta)$$
$$0 = E^\nu\{u_0(b^\nu, \xi)\} + v_S(b^\nu) = v^\nu(b^\nu) \quad \text{a.s. for } \nu = 1, 2, \ldots$$
For this choice of subgradients $v^\nu(b^\nu)$, the condition $\sqrt{\nu}\, v^\nu(b^\nu) \to 0$ in probability as $\nu \to \infty$ is trivially fulfilled.
The basic idea is to apply Huber's approach (see Huber (1967), Section 4) to the subgradients $v$ and $v^\nu$ of the functions $Ef$ and $E^\nu f$ that fulfill (3.5) and (3.6) to get the asymptotic normality of $b^\nu$. The assumptions of Dupačová and Wets (1987) reduce to three basic conditions in our case:
(a) $\sqrt{\nu}\, [v^\nu(\beta) + v(b^\nu)] \to 0$ in probability as $\nu \to \infty$.
(b) $Ef_0$ is twice continuously differentiable at the point $\beta$ with nonsingular Hessian $H$.
(c) $\sqrt{\nu}\, [v_S(b^\nu) - v_S(\beta)] \to 0$ in probability as $\nu \to \infty$.
The first two properties resemble results of Huber (1967) and their validity can be proved under various sets of sufficient conditions. The property (c) is of a different nature. It is trivially satisfied if $\beta$ and $b^\nu$ for $\nu$ large enough are interior points of $S$. To indicate briefly that all the mentioned conditions can be fulfilled we shall concentrate on the case of linearly restricted linear L1-regression; the nonlinear case is substantially more complicated due to the fact that the function $f_0$ is neither convex nor differentiable.
We assume that the true parameter vector $\beta$ is the optimal solution of the mathematical program
$$\text{minimize } E\{f_0(b, \xi)\} \quad \text{subject to } Ab \ge c \qquad (3.7)$$
and it is estimated by optimal solutions $b^\nu$ of the programs
$$\text{minimize } E^\nu\{f_0(b, \xi)\} \quad \text{subject to } Ab \ge c; \qquad (3.8)$$
$A(m, n)$ and $c(m, 1)$ are given matrices of constant elements and $f_0(b, \xi) = |\xi_0 - b^T \tilde{\xi}|$.
The corresponding Lagrangian functions have the form
$$L(b, y) = \begin{cases} \int f_0(b, \xi)\, P(d\xi) - y^T(Ab - c) & \text{for } y \ge 0, \\ -\infty & \text{otherwise} \end{cases} \qquad (3.9)$$
and
$$L^\nu(b, y) = \begin{cases} \int f_0(b, \xi)\, P^\nu(d\xi) - y^T(Ab - c) & \text{for } y \ge 0, \\ -\infty & \text{otherwise.} \end{cases} \qquad (3.10)$$
Under Assumptions 2.1 and 2.3 (applied to the considered function $f_0$ instead of $f$) an assertion about consistency of saddle points $(b^\nu, y^\nu)$ parallel to that of Theorem 2.4 can be proved (see Dupačová and Wets (1987), Theorem 5.2). The existence of saddle points in the case of linearly restricted linear L1-regression is guaranteed thanks to the special type of constraints and of the function $f_0$. Also in this case,
and
are necessary and sufficient conditions for $(\beta, \eta)$ and $(b^\nu, y^\nu)$ to be saddle points of the Lagrangian functions $L$ and $L^\nu$ with respect to the set $R^n \times R^m_+$.
The special form of the set $S$ together with consistency of $(b^\nu, y^\nu)$ help to eliminate the constraints in (3.9) and (3.10) provided that the strict complementarity conditions hold true for $(\beta, \eta)$, i.e., for all $i$
$$\eta_i > 0 \iff \sum_{j=1}^{n} a_{ij} \beta_j = c_i. \qquad (3.11)$$
Denote by $I \subset \{1, \ldots, m\}$ the set of indices for which $\eta_i > 0$, i.e., for which the $i$-th constraint is active for the true parameter vector $\beta$. Evidently,
$$y_i^\nu = 0 \text{ for } i \notin I \quad \text{and} \quad \sum_{j=1}^{n} a_{ij} b_j^\nu = c_i \text{ for } i \in I \quad \text{a.s.}$$
for $\nu$ large enough. Denote $A_I = (a_{ij})_{i \in I,\ j = 1, \ldots, n}$. In this situation, we are in fact interested in the asymptotic behavior of the unconstrained saddle points $(b^\nu, y_I^\nu)$ of the reduced Lagrangian function
for $\nu \to \infty$. All we need for asymptotic normality of the estimates $b^\nu$ are the corresponding versions of conditions (a), (b) with $v^\nu(\beta)$ and $v(b^\nu)$ replaced by $v_0^\nu(\beta) - A_I^T \eta_I$ and $v_0(b^\nu) - A_I^T y_I^\nu$ and with $Ef$ replaced by the reduced Lagrangian function $L_I$.
THEOREM 3.4 Let the true parameter vector $\beta$ be the point of minima of the function $E f_0(b, \xi) = E|\xi_0 - b^T \tilde{\xi}|$ on the set $S = \{b : Ab \ge c\}$. Assume further:
(i) For the true parameter vector $\beta$, the random vector $\tilde{\xi}$ and the residual $\varepsilon$ in
$$\xi_0 = \tilde{\xi}^T \beta + \varepsilon$$
are independent with densities $h_1$ and $h_2$ such that $h_2(0) > 0$.
(ii) The absolute values $|\xi_i|$, $i = 0, 1, \ldots, n$, of the components of the random vector $\xi$ are uniformly integrable with respect to $P^\nu$, $\nu = 1, 2, \ldots$.
(iii) The absolute moments $E\|\tilde{\xi}\|^k$, $k = 1, 2, 3$ exist and $E\tilde{\xi}\tilde{\xi}^T$ is finite and nonsingular.
(iv) For the true parameter vector $\beta$ and for the corresponding saddle point $(\beta, \eta)$ of (3.9), the strict complementarity conditions (3.11) hold true. The matrix $A_I$ is of full row rank.
Then $\sqrt{\nu}\,(b^\nu - \beta)$ is asymptotically normal $N(0, C \Sigma C^T)$, with $\Sigma = \operatorname{var} \tilde{\xi}$, $C = H^{-1}(I - A_I^T (A_I H^{-1} A_I^T)^{-1} A_I H^{-1})$ and $H = 2 h_2(0) E\tilde{\xi}\tilde{\xi}^T$.
4. PROOF OF THEOREM 3.4
The assumed existence of $E\|\tilde{\xi}\|$ and the uniform integrability of $|\xi_i|$, $i = 0, 1, \ldots, n$, imply that Assumptions 2.1 and 2.3 (needed for consistency) are fulfilled for $f_0(b, \xi) = |\xi_0 - b^T \tilde{\xi}|$.
Denote by
with $u_0(b, \xi) \in \partial f_0(b, \xi)$ a subgradient of the reduced Lagrangian function $L_I(b, y_I)$. Following our discussion from Section 3, we can choose $u_0(b, \xi)$ in such a way that
so that the condition
$$\sqrt{\nu}\, E^\nu\{l(b^\nu, y_I^\nu, \xi)\} \to 0 \text{ in probability as } \nu \to \infty$$
is evidently fulfilled.
Let us study the properties of the subgradients $u_0(b, \xi)$.
LEMMA 4.1 Denote
Then under assumption (i) of Theorem 3.4 there is a positive constant $k$ such that
and
$$E\{u_d^2(b, \xi)\} \le 2kd\, E\|\tilde{\xi}\|^3.$$
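PROOF According to (2.3), we have
$$u_0(b, \xi) \begin{cases} = -\tilde{\xi} & \text{if } \xi_0 > \tilde{\xi}^T b, \\ = \tilde{\xi} & \text{if } \xi_0 < \tilde{\xi}^T b, \\ \in \operatorname{conv}\{\tilde{\xi}, -\tilde{\xi}\} & \text{if } \xi_0 = \tilde{\xi}^T b, \end{cases}$$
so that
$$u_d(b, \xi) \begin{cases} = 0 & \text{if } O_d(b) \cap \{b' : \tilde{\xi}^T b' = \xi_0\} = \emptyset, \\ \le 2\|\tilde{\xi}\| & \text{otherwise.} \end{cases}$$
For a given $d$, $b$ and $\xi$, the condition
$$O_d(b) \cap \{b' : \tilde{\xi}^T b' = \xi_0\} = \emptyset$$
can be equivalently expressed as
$$\rho(b, p(\xi)) \ge d,$$
where $\rho(b, p(\xi))$ denotes the distance of $b$ from the hyperplane $p(\xi) = \{b' : \tilde{\xi}^T b' = \xi_0\}$, i.e.,
$$\rho(b, p(\xi)) = \frac{|\tilde{\xi}^T b - \xi_0|}{\|\tilde{\xi}\|}.$$
Using (4.2), (4.3), we get
where $M_d(b) = \{\xi : \rho(b, p(\xi)) < d\} = \{\xi : |\tilde{\xi}^T b - \xi_0| < d\|\tilde{\xi}\|\}$.
Substituting $\tilde{\xi}^T \beta + \varepsilon$ for $\xi_0$ we have
In a similar way,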
LEMMA 4.2 (Existence of Hessian) Under Assumption (i) of Theorem 3.4, $Ef_0$ is twice continuously differentiable at the point $\beta$ with Hessian
$$H = 2 h_2(0)\, E\tilde{\xi}\tilde{\xi}^T,$$
provided that the expectation $E\tilde{\xi}\tilde{\xi}^T$ exists.
PROOF The function $Ef_0$ can be written as
$$Ef_0(b) = \int |\xi_0 - \tilde{\xi}^T b|\, P(d\xi) = \int\!\!\int |\tilde{\xi}^T(\beta - b) + \varepsilon|\, h_1(\tilde{\xi})\, h_2(\varepsilon)\, d\tilde{\xi}\, d\varepsilon$$
and its gradient (see (4.1), (3.1) and Comment 3.3b)
Accordingly, the matrix of the second order derivatives is
$$H(b) = 2 \int \tilde{\xi}\, h_1(\tilde{\xi})\, h_2(\tilde{\xi}^T(b - \beta))\, \tilde{\xi}^T\, d\tilde{\xi},$$
so that $H(\beta) = 2 h_2(0)\, E\tilde{\xi}\tilde{\xi}^T$.
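The Hessian formula $H = 2 h_2(0) E\tilde{\xi}\tilde{\xi}^T$ can be checked numerically in a scalar case. The sketch below assumes, purely for illustration, a single regressor $\tilde{\xi}$ uniform on $[1, 2]$, a standard normal residual $\varepsilon$ and $\beta = 1$; it uses the closed form $E|\varepsilon - d| = d(2\Phi(d) - 1) + 2\varphi(d)$ for a standard normal $\varepsilon$, a midpoint rule over $\tilde{\xi}$ and a central second difference. All names are hypothetical.

```python
import math


def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def Phi(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def mean_abs_shifted(d):
    """E|eps - d| for eps ~ N(0, 1)."""
    return d * (2.0 * Phi(d) - 1.0) + 2.0 * phi(d)


def expected_f0(b, beta=1.0, grid=400):
    """E|xi~ (beta - b) + eps| with xi~ ~ U(1, 2), by the midpoint rule."""
    h = 1.0 / grid
    total = 0.0
    for k in range(grid):
        xi_tilde = 1.0 + (k + 0.5) * h
        total += mean_abs_shifted(xi_tilde * (b - beta)) * h
    return total


step = 1e-3
beta = 1.0
# Central second difference of E f0 at the true parameter.
numeric_hessian = (expected_f0(beta + step) - 2.0 * expected_f0(beta)
                   + expected_f0(beta - step)) / step ** 2
# 2 h2(0) E[xi~^2], where E[xi~^2] = 7/3 for the uniform law on [1, 2].
closed_form = 2.0 * phi(0.0) * (7.0 / 3.0)
```

The two numbers agree up to discretization error, which is what the lemma predicts for this particular choice of densities.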
LEMMA 4.3 Sufficient conditions that $\beta$ be an isolated global minimum of $Ef_0(b) = E|\xi_0 - b^T \tilde{\xi}|$ on $S = \{b : Ab \ge c\}$ and the associated Lagrangian multiplier $\eta$ be unique are:
(i) $A\beta \ge c$, $\eta^T(A\beta - c) = 0$, $\eta \ge 0$.
(ii) For $I = \{i : \sum_{j=1}^{n} a_{ij} \beta_j = c_i\}$, the matrix $A_I = (a_{ij})_{i \in I,\ j = 1, \ldots, n}$ is of full row rank and $\eta_i > 0$ for $i \in I$.
(iii) Assumption (i) of Theorem 3.4 holds true and $E\tilde{\xi}\tilde{\xi}^T$ is nonsingular.
PROOF Condition (i) is the first-order necessary condition, condition (ii) contains the linear independence condition and the strict complementarity conditions, and condition (iii) together with Lemma 4.2 implies that the second order sufficient condition is fulfilled. The result follows e.g. from Theorem 3.2.2 of Fiacco (1983).
If condition (ii) is fulfilled, we can rewrite the first order conditions (i) in the form
Conditions (ii), (iii) of Lemma 4.3 together with assumption (i) of Theorem 3.4 imply that the matrix $L$ of the second order derivatives of the reduced Lagrange function $L_I(b, y_I)$ at the point $\beta$, $\eta_I$,
is nonsingular. Accordingly, we have
LEMMA 4.4 Under assumptions (i), (iv) of Theorem 3.4 complemented by assumption (iii) of Lemma 4.3, condition (b) is fulfilled for $L_I(b, y_I)$.
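The link between this second-derivative matrix and the factor $C$ of Theorem 3.4 can be checked numerically. The sketch below assumes that the reduced Lagrangian has the form $Ef_0(b) - y_I^T(A_I b - c_I)$, so that its second-derivative matrix in $(b, y_I)$ has the blocks $H$, $-A_I^T$, $-A_I$ and $0$; this form, the NumPy helper names and the numerical values of $H$ and $A_I$ are assumptions made for the illustration.

```python
import numpy as np


def covariance_factor(H, A_I):
    """C = H^{-1} (I - A_I^T (A_I H^{-1} A_I^T)^{-1} A_I H^{-1})."""
    H_inv = np.linalg.inv(H)
    middle = np.linalg.inv(A_I @ H_inv @ A_I.T)
    n = H.shape[0]
    return H_inv @ (np.eye(n) - A_I.T @ middle @ A_I @ H_inv)


def reduced_lagrangian_hessian(H, A_I):
    """Second-derivative matrix of E f0(b) - y_I^T (A_I b - c_I) in (b, y_I)."""
    k = A_I.shape[0]
    return np.block([[H, -A_I.T], [-A_I, np.zeros((k, k))]])


# Hypothetical positive definite Hessian and one active constraint.
H = np.array([[2.0, 0.5], [0.5, 1.0]])
A_I = np.array([[1.0, 1.0]])

C = covariance_factor(H, A_I)
K_inv = np.linalg.inv(reduced_lagrangian_hessian(H, A_I))
top_left = K_inv[:2, :2]   # block acting on the b-coordinates
```

Under these assumptions the upper-left block of the inverse coincides with $C$, and $C A_I^T = 0$, so the limiting distribution carries no variance in the directions normal to the active constraints.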
Condition (a) can be written as
in probability a.s. as $\nu \to \infty$.
To get the desired convergence property of
we shall check under which circumstances the conditions (N-1)-(N-4) of Huber (1967) are fulfilled: Measurability and separability of $l(b, y_I, \xi)$, cf. (N-1), is evidently fulfilled; existence and uniqueness of the true $\beta$, $\eta_I$, cf. (N-2) and (N-3i), follow from the assumptions of Lemma 4.3; and the properties of the subgradients $l(b, y_I, \xi)$, cf. (N-3ii), (N-3iii) and (N-4), can be obtained using Lemma 4.1 and assumption (iii) of Theorem 3.4.
Denote
LEMMA 4.5 Let assumption (i) of Theorem 3.4 be fulfilled and let the absolute moments $E\|\tilde{\xi}\|^2$, $E\|\tilde{\xi}\|^3$ exist. Then there are positive constants $K$, $K'$ such that
The expected value $E\{\|l(\beta, \eta_I, \xi)\|^2\}$ is finite.
PROOF We have
$$\cdots + \sup_{\|y_I' - y_I\| \le d} \|A_I^T(y_I - y_I')\|,$$
so that
$$E\{l_d(b, y_I, \xi)\} \le kd\, E\|\tilde{\xi}\|^2 + a\, d \le Kd$$
according to Lemma 4.1. Similarly,
$$\cdots + \sup_{\|y_I' - y_I\| \le d} \|A_I^T(y_I - y_I')\|^2 + \sup_{\|b' - b\| \le d} \|A_I(b - b')\|^2$$
and
$$E\{l_d^2(b, y_I, \xi)\} \le 2kd\, E\|\tilde{\xi}\|^3 + \cdots + 2a^2 d^2 \le dK'.$$
The last condition is evidently fulfilled as $E\{\|u_0(\beta, \xi)\|^2\} = E\|\tilde{\xi}\|^2$.
REFERENCES
Arthanari, T.S. and Y. Dodge (1982): Mathematical Programming in Statistics. Wiley, New York.
Bassett, G. Jr. and R. Koenker (1978): Asymptotic theory of least absolute error regression. J. ASA 73, 618-622.
Barrodale, I. and F.D.K. Roberts (1977): Algorithms for restricted least absolute value estimation. Commun. Statist.-Simula. Computa. B6(4), 353-363.
Bloomfield, P. and W.L. Steiger (1983): Least Absolute Deviations. Theory, Applications and Algorithms. Birkhäuser, Boston.
Clarke, F.H. (1983): Optimization and Nonsmooth Analysis. Wiley Interscience, New York.
Dupačová, J. (1987): Asymptotic properties of restricted L1-estimates of regression. To appear in Proc. of the 1st International Conference on Statistical Data Analysis Based on the L1-Norm and Related Methods, Neuchâtel, 31 August - 4 September 1987. North-Holland, Amsterdam.
Dupačová, J. and R. Wets (1986): Asymptotic behavior of statistical estimators and optimal solutions for stochastic programming problems. WP-86-41, IIASA, Laxenburg.
Dupačová, J. and R. Wets (1987): Asymptotic behavior of statistical estimators and of optimal solutions of stochastic optimization problems, II. WP-87-9, IIASA, Laxenburg.
Fiacco, A.V. (1983): Introduction to Sensitivity and Stability Analysis in Nonlinear Programming. Academic Press, New York.
Huber, P. (1967): The behavior of maximum likelihood estimates under nonstandard conditions. Proc. 5th Berkeley Symp. Math. Stat. Prob. 1, 221-233.
Huber, P. (1973): Robust regression: Asymptotics, conjectures and Monte Carlo. Ann. of Stat. 1, 799-821.
Loève, M. (1955): Probability Theory. Van Nostrand, New York.
Oberhofer, W. (1982): The consistency of nonlinear regression minimizing the L1-norm. Ann. of Stat. 10, 316-319.
Rockafellar, R.T. (1981): The Theory of Subgradients and its Applications to Problems of Optimization. Convex and Nonconvex Functions. Research and Education in Mathematics 1, Heldermann Verlag, Berlin.