5 Optimality Results

5.1 Huber's Minimax Theorem

a Gross Error Model
Remember:
• "Robustness is the theory of approximate parametric models."
  U_ε(F) = {(1−ε)F + εH | H arbitrary}
• M-estimators of the form ∫ ψ(x, T(F)) dF(x) = 0 are a very flexible class of estimators. Every asymptotically normal estimator is equivalent to an M-estimator.
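Such an estimating equation can be solved numerically by one-dimensional root finding. A minimal sketch in Python (Huber's ψ with an illustrative cutoff b = 1.345; the data are made up):

```python
import numpy as np
from scipy.optimize import brentq

def huber_psi(z, b=1.345):
    """Huber's psi-function: the identity, clipped to [-b, b]."""
    return np.clip(z, -b, b)

def m_estimate_location(x, b=1.345):
    """Solve the M-estimating equation sum(psi(x_i - t)) = 0 for t.
    The left-hand side is decreasing in t, so a bracket is easy to find."""
    g = lambda t: huber_psi(np.asarray(x) - t, b).sum()
    return brentq(g, min(x) - 1.0, max(x) + 1.0)

x = [0.2, -0.4, 0.1, 0.3, 8.0]          # one gross outlier
print(m_estimate_location(x))            # stays near the bulk of the data
print(np.mean(x))                        # the mean is dragged toward the outlier
```

The clipped ψ bounds the influence of the outlier, so the solution stays near the bulk of the data, unlike the mean.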
b The Game
• Nature chooses a distribution F from a set F of distributions – e.g. F = U_ε(F).
• The statistician chooses a function ψ ∈ Ψ.
• The payoff to the statistician is
  K(ψ, F) = (∫ ψ′(x) dF(x))² / ∫ ψ(x)² dF(x)  "usually" = 1/V(T, F).
The statistician chooses the minimax strategy: maximize the minimal payoff = minimize the maximal variance.
Assume that Nature gives you the worst distribution. Choose the best estimator for this distribution. Make sure that it is at least as good for other distributions.
This needs a saddlepoint K(ψ₀, F₀) such that
  K(ψ₀, F₀) ≤ K(ψ₀, F)  ∀ F ∈ F, and
  K(ψ₀, F₀) ≥ K(ψ, F₀)  ∀ ψ ∈ Ψ.
c Huber's Theorem (1964, Ann. Math. Statistics)
Assumptions: F is convex, has densities, finite Fisher information
  I(F) = ∫ (f′(x)/f(x))² f(x) dx.
Then,
(i) If there is F₀ ∈ F such that I(F₀) ≤ I(F) ∀ F ∈ F and ψ₀(·) := −f₀′(·)/f₀(·) ∈ Ψ, then
[ψ₀, F₀] is the saddlepoint of the game:
  K(ψ, F₀) ≤ K(ψ₀, F₀) = I(F₀) ≤ K(ψ₀, F)
  ∀ F ∈ F and ∀ ψ ∈ Ψ.
(ii) Conversely, if [ψ₀, F₀] is a saddlepoint and ∃ c: c · f₀′/f₀ ∈ Ψ, then
  I(F₀) ≤ I(F) ∀ F ∈ F, F₀ is unique, and ψ₀ = c · f₀′/f₀.
(iii) ...
d Location Model
Assume: G has convex support, −log(g) is convex.
T(F) is the M-estimator given by ψ: ∫ ψ(x − T(F)) dF(x) = 0,
  asvar_F(T) = ∫ ψ(x − T(F))² dF(x) / (∫ ψ′(x − T(F)) dF(x))².
"Neighborhood": U_ε(F) = {(1−ε)F + εH}. Loss = asvar.
"Saddlepoint" [ψ₀, F₀]:
asvar_F(T) is minimax only among distributions F for which E_F(ψ₀) = 0. The solution defines the problem!
→ Restriction to symmetric F and H is natural ... but "non-robust"!
ψ₀ is the bounded version of the maximum likelihood scores,
  ψ₀(x) = min(max(s(x), −b), b),  s(x) = −f′(x)/f(x).
The "tuning constant" b is a function of the allowed contamination ε:
  (1−ε)⁻¹ = ∫_{x₀}^{x₁} g(x) dx + (g(x₀) + g(x₁))/b,
where [x₀, x₁] is the region in which |s(x)| ≤ b.
e Result for the normal location model
When F = U_ε(N(µ, 1)), the saddlepoint is given by the Huber function with a tuning constant b that is a function of ε, and F₀ is the corresponding distribution, see (3.2.e).
Tuning constants and asvar_Φ(T):

  ε        b     asvar_Φ(T)
  0.4417   0.5   1.263
  0.1428   1.0   1.107
  0.0498   1.4   1.047
  0.0084   2.0   1.010
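The table entries can be reproduced numerically. A sketch, assuming the standard normal nominal model: the condition for b specializes to (1−ε)⁻¹ = 2Φ(b) − 1 + 2φ(b)/b, and asvar_Φ(T) = E[ψ²]/(E[ψ′])²:

```python
from scipy.stats import norm

def eps_of_b(b):
    """Allowed contamination for Huber's psi with bound b at the normal
    model: (1 - eps)^(-1) = (2*Phi(b) - 1) + 2*phi(b)/b."""
    return 1.0 - 1.0 / (2.0 * norm.cdf(b) - 1.0 + 2.0 * norm.pdf(b) / b)

def asvar_at_normal(b):
    """Asymptotic variance at Phi: E[psi^2] / (E[psi'])^2,
    with E[psi'] = P(|X| <= b) = 2*Phi(b) - 1 and
    E[psi^2] = int_{-b}^{b} x^2 phi(x) dx + 2*b^2*(1 - Phi(b))."""
    d = 2.0 * norm.cdf(b) - 1.0
    epsi2 = d - 2.0 * b * norm.pdf(b) + 2.0 * b ** 2 * norm.sf(b)
    return epsi2 / d ** 2

for b in (0.5, 1.0, 1.4, 2.0):
    print(b, round(eps_of_b(b), 4), round(asvar_at_normal(b), 3))
```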
[Figure: asymptotic variance as a function of the tuning constant b (0 to 2), one curve per contamination level epsilon = 0.01, 0.05, 0.10, 0.20, 0.50, 0.80.]
f Consequences
• Exact treatment of the robustness question: "Behavior under the assumption that the Xᵢ follow approximately the distribution F".
• 1/K measures the asymptotic variance. What about bias? Bias dominates for large n. Asymptotic bias: T(G) − T(F), G = (1−ε)F + εH.
• A minimax result for bias is also treated in Huber (1964)! Result: ...
• Trick for the minimax variance result: restriction to symmetric distributions!
  → misunderstanding: robust methods are for symmetric distributions only.
• Asymmetric contamination allowed → "unavoidable bias"
  → The robust estimator defines the quantity to be estimated for contaminated distributions. We still want to be (Fisher) consistent = asymptotically unbiased at the "nominal model" F_θ.
5.2 Hampel's Optimal Estimators

a Asymptotic bias ... in the gross error "neighborhood":
  G = (1−ε)F + εH = F + ε(H − F)
  T(G) ≈ T(F) + ε ∫ IF(x; T, F) d(H − F)(x)
       = T(F) + ε ∫ IF(x; T, F) dH(x) − 0
  |T(G) − T(F)| ≤ ε · GES(T, F)
  GES(T, F) = γ*(T, F) = sup_x |IF(x; T, F)|
Robust estimator → bounded GES.
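For the Huber location estimator at the normal model, the GES and the resulting bias bound can be computed directly. A sketch (b = 1.345 and ε = 0.05 are illustrative choices):

```python
from scipy.stats import norm

def ges_huber(b):
    """GES of the Huber location estimator at the normal model.
    For an M-estimator, IF(x) = psi(x) / E[psi'], so
    gamma* = sup_x |IF(x)| = b / (2*Phi(b) - 1)."""
    return b / (2.0 * norm.cdf(b) - 1.0)

b, eps = 1.345, 0.05
print("GES    :", round(ges_huber(b), 3))        # bounded (the mean has GES = infinity)
print("bias <= ", round(eps * ges_huber(b), 3))  # approximate maximal bias eps * gamma*
```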
b Optimality problem
Among all (M-)estimators with GES ≤ a constant γ, find the one that minimizes
  asvar(T, F) = ∫ IF(x; T, F)² dF(x).
Search in the class of M-estimators → class of ψ-functions.
= Hampel's optimality problem
→ variational problem
c Hampel's optimality problem as an approximate minimax problem:
  asvar_G(T) ≈ asvar of the linearization T(F) + (1/n) Σᵢ IF(Xᵢ; T, F)
  = ∫ IF(x; T, G)² dG(x) ≈ ∫ IF(x; T, F)² dG(x)
  = (1−ε) · asvar_F(T) + ε ∫ IF(x; T, F)² dH(x)
  ≤ (1−ε) · asvar_F(T) + ε · γ*(T, F)²
d Hampel's Lemma 5
Model and assumptions:
• Parametric model {F_θ | θ ∈ Θ}, Θ open, convex ⊂ ℝ
• Density f_θ > 0 on the set of possible observations x
• Fix θ₀. Denote F₀ := F_{θ₀}.
• s(x, θ₀) = ∂ log(f_θ(x))/∂θ |_{θ₀} exists, and
• ∫ s(x, θ₀) dF₀(x) = 0
• Fisher information J(F₀) = ∫ s(x, θ₀)² dF₀(x) exists
Lemma 5
Choose b > 0.
(a) There exists a constant a such that
  ψ̃(x) := min(max(s(x, θ₀) − a, −b), b)
satisfies ∫ ψ̃(x) dF₀(x) = 0 and
  d := ∫ ψ̃(x) s(x, θ₀) dF₀(x) > 0.
(b) This ψ̃ minimizes
  ∫ ψ(x)² dF₀(x) / (∫ ψ(x) s(x, θ₀) dF₀(x))²
among all mappings ψ that satisfy
  ∫ ψ(x) dF₀(x) = 0,  ∫ ψ(x) s(x, θ₀) dF₀(x) > 0, and
  sup_x |ψ(x)| / ∫ ψ(x) s(x, θ₀) dF₀(x) ≤ γ := b/d.
Any other solution of this extremal problem coincides with a nonzero multiple of ψ̃ almost everywhere (with respect to F₀).
e Interpretation
• Like Huber's Theorem, the "Lemma" is about functions. No probabilities, estimators ... The interpretation for statistical purposes comes now.
• Optimality is determined for a fixed θ₀. This only makes sense if it is true for "all" θ (at least in a neighborhood of θ₀).
• ∫ ψ(x) dF₀(x) = 0 means Fisher consistency. (We need such a condition, since otherwise the non-random "estimator" θ̂(F) = θ would be optimal under all aspects ...)
• The quantity minimized is asvar(T), where T is the M-estimator defined by ψ.
• The expression sup_x ... is the GES of the estimator, and b/d is the GES of the constructed estimator.
• We would prefer to start with a given bound γ. But then the construction is more difficult, because γ needs to be large enough for a solution to exist.
f Proof
• a exists, since ∫ min(max(s(x, θ₀) − a, −b), b) dF₀(x) is continuous in a, and → b > 0 for a → −∞ and → −b < 0 for a → ∞.
• d > 0: technical, see Hampel et al. (1986, p. 118).
• Optimality: Let ψ be a function satisfying the conditions. Then we can replace ψ by d · ψ / ∫ ψ(x) s(x, θ₀) dF₀(x), that is, we can assume ∫ ψ(x) s(x, θ₀) dF₀(x) = d, and need to minimize ∫ ψ(x)² dF₀(x). Denote s̃(x) := s(x, θ₀) − a. Note that
  ∫ (s̃(x) − ψ(x))² dF₀(x) = ∫ s̃(x)² dF₀(x) − 2d + ∫ ψ(x)² dF₀(x)
(using ∫ ψ dF₀ = 0, so ∫ s̃ ψ dF₀ = d). Since the first two terms on the right-hand side do not depend on ψ, minimizing ∫ ψ(x)² dF₀(x) is the same as minimizing the left-hand side. The GES constraint becomes sup_x |ψ(x)| ≤ b, and under it the integrand (s̃(x) − ψ(x))² is minimized pointwise by ψ̃.
• This also proves uniqueness (up to values on sets of probability 0).
g Approximately minimal MSE
Mean squared error:
  MSE = bias² + var ≈ (T(G) − T(F))² + n⁻¹ · asvar(T, G)
      ≤ (ε · GES(T, F))² + n⁻¹ · (asvar(T, F) + ...)
For given ε and n, one can choose the bound b of the Huber estimator such that this approximate MSE is minimized.
→ "Optimal" choice of the tuning constant, depending on n.
A more precise treatment needs the "change of variance function" and the respective "sensitivity", see Hampel et al. (1986, Ch. 2.7).
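A rough numerical illustration (my own sketch for the normal location model, using the crude bound above rather than the change-of-variance refinement): minimize (ε · γ*(b))² + asvar_Φ(b)/n over a grid of b.

```python
import numpy as np
from scipy.stats import norm

def approx_mse(b, eps, n):
    """Approximate maximal MSE of the Huber estimator at the normal model:
    (eps * GES)^2 + asvar/n, with GES = b/d and d = E[psi'] = 2*Phi(b) - 1."""
    d = 2.0 * norm.cdf(b) - 1.0
    epsi2 = d - 2.0 * b * norm.pdf(b) + 2.0 * b ** 2 * norm.sf(b)
    return (eps * b / d) ** 2 + (epsi2 / d ** 2) / n

bs = np.linspace(0.1, 3.0, 291)
b_opt = {n: bs[np.argmin([approx_mse(b, 0.05, n) for b in bs])]
         for n in (10, 100, 1000)}
print(b_opt)   # the MSE-optimal bound shrinks as n grows: bias dominates
```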
h Example: Log-Weibull distribution
Weibull: often used in reliability studies. Failure time or survival time Y ∼ Weibull.
[Figure: densities f(x) of the Gamma (η = 13.15, 2.68, 1.14, 0.63) and Weibull (η = 3.80, 1.71, 1.08, 0.76) distributions, and of the Gumbel distribution (µ = 0, τ = 1).]
Then X = −log(Y) ∼ Gumbel or "extreme value" distribution.
Cumulative d.f.: F_{µ,τ}(x) = 1 − exp(−eᶻ), z = (x − µ)/τ.
Density: f_{µ,τ}(x) = τ⁻¹ eᶻ exp(−eᶻ).
Location-scale family. Consider fixed scale τ = 1.
Optimal estimators for µ? → exercise!
log(f_{µ,1}(x)) = z − eᶻ. Scores: s(x, µ) = eᶻ − 1.
Maximum likelihood: Σᵢ e^{xᵢ − µ̂} − n = 0 → exp(µ̂) = (1/n) Σᵢ exp(xᵢ).
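A quick numerical check of the closed-form ML estimator, on simulated Gumbel data with true µ = 0:

```python
import numpy as np

# The score equation sum(exp(x_i - mu) - 1) = 0 is solved in closed form
# by exp(mu_hat) = mean(exp(x_i)).
rng = np.random.default_rng(0)
u = rng.uniform(size=1000)
x = np.log(-np.log(1.0 - u))          # X ~ Gumbel with cdf 1 - exp(-e^x), mu = 0

mu_hat = np.log(np.mean(np.exp(x)))
score = np.sum(np.exp(x - mu_hat) - 1.0)
print(mu_hat)                          # close to the true mu = 0
print(score)                           # essentially zero at mu_hat
```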
Optimal estimators for µ? Given b, determine a such that (general formula!)
  ∫ min(max(s(x, θ₀) − a, −b), b) dF₀(x)
  = b (1 − F₀(c₁(a,b))) − b F₀(c₀(a,b))
    − (f₀(c₁(a,b)) − f₀(c₀(a,b)))
    − a (F₀(c₁(a,b)) − F₀(c₀(a,b)))
  = 0,
where c₀(a,b) = s*(a − b), c₁(a,b) = s*(a + b), and s* is the inverse function of s(·, θ₀). This defines the function a(b).
For the Gumbel distribution, s*(c) = log(1 + c).
Function a(b):

f.ho.b2a <- function(b, distr = "gumbel")
{
  ## Purpose:   Hampel optimality: calculate a from b
  ## Arguments: b: chosen bound
  if (distr == "gumbel") {
    fd <- function(x) exp(x - exp(x))          # density
    fp <- function(x) 1 - exp(-exp(x))         # cumulative distribution function
    fsi <- function(x) log(max(1e-20, 1 + x))  # inverse score function s*
  } else stop("not programmed for this distr.")
  ## function for uniroot
  ff <- function(a, b) {
    lc0 <- fsi(a - b)
    lc1 <- fsi(a + b)
    b * (1 - fp(lc1) - fp(lc0)) - (fd(lc1) - fd(lc0)) - a * (fp(lc1) - fp(lc0))
  }
  rr <- uniroot(ff, c(-1, 1), b = b)
  rr$root
}
[Figure: the function a(b) for the Gumbel model, b = 1, ..., 8, with a between −0.15 and 0.]
Estimating function:

## function that estimates mu from data x
f.estgumbel0 <- function(x, b)
{
  la <- f.ho.b2a(b)
  lf.psi <- function(x, a, b) pmin(pmax(exp(x) - 1 - a, -b), b)
  ff <- function(mu, a, b, x) sum(lf.psi(x - mu, a, b))
  rr <- uniroot(ff, c(-10, 10), a = la, b = b, x = x)
  rr$root
}
Simulation:

## Gumbel quantiles and random numbers
f.qgumbel <- function(p) log(-log(1 - p))
f.rgumbel <- function(n) f.qgumbel(runif(n))

## simulate distribution of the estimator
f.simgumbelest <- function(n = 20, nrep = 100, b = 2)
{
  lx <- matrix(f.rgumbel(n * nrep), n)
  apply(lx, 2, f.estgumbel0, b = b)
}

r.simest <- f.simgumbelest(n = t.n, 1000, b = 2)
c(n = t.n, expec = mean(r.simest), sd = lsd <- sd(r.simest),
  se = lsd / sqrt(length(r.simest) - 1))

        n    expec      sd      se
  10.0000  -0.0345  0.3508  0.0111
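The simulation can be mirrored in Python (function names are my own; b = 2 as above, with a(b) found by root finding as in f.ho.b2a):

```python
import numpy as np
from scipy.optimize import brentq

def a_of_b(b):
    """Centering constant a(b) for the Gumbel model (cf. f.ho.b2a)."""
    fd = lambda x: np.exp(x - np.exp(x))           # density
    fp = lambda x: 1.0 - np.exp(-np.exp(x))        # cdf
    fsi = lambda c: np.log(max(1e-20, 1.0 + c))    # inverse score s*
    def ff(a):
        lc0, lc1 = fsi(a - b), fsi(a + b)
        return (b * (1 - fp(lc1) - fp(lc0)) - (fd(lc1) - fd(lc0))
                - a * (fp(lc1) - fp(lc0)))
    return brentq(ff, -1.0, 1.0)

def est_gumbel(x, b, a):
    """Root of sum(psi(x_i - mu)) = 0, psi(z) = clip(exp(z) - 1 - a, -b, b)."""
    g = lambda mu: np.clip(np.exp(x - mu) - 1.0 - a, -b, b).sum()
    return brentq(g, -10.0, 10.0)

rng = np.random.default_rng(1)
n, nrep, b = 10, 1000, 2.0
a = a_of_b(b)
est = [est_gumbel(np.log(-np.log(1.0 - rng.uniform(size=n))), b, a)
       for _ in range(nrep)]
print(round(float(np.mean(est)), 4), round(float(np.std(est, ddof=1)), 4))
```

The simulated mean and standard deviation should be of the same order as the R output above (small negative bias, sd around 0.35 for n = 10).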
[Figure: histogram "Simulated distr. of opt. M-est., b = 0.5, n = 10"; x-axis: estimated mu (−1 to 1), y-axis: Frequency (0 to 200).]