Penalizing function based bandwidth choice in nonparametric quantile regression

Klaus Abberger, University of Konstanz, Germany

Abstract: In nonparametric mean regression various methods for bandwidth choice exist. These methods can roughly be divided into plug-in methods and methods based on penalizing functions. This paper uses the approach based on penalizing functions and adapts it to nonparametric quantile regression estimation, where bandwidth choice is still an unsolved problem. Various criteria for bandwidth choice are defined and compared in some simulation examples.

Key Words: nonparametric quantile regression, bandwidth choice, cross-validation, penalizing functions

1 Introduction

Although most regression investigations are concerned with the regression mean function, other aspects of the conditional distribution of Y given X are also often of interest. For fixed α ∈ (0,1), the quantile regression function gives the αth quantile q_α(x) in the conditional distribution of a response variable Y given the value X = x. It can be used to measure the effect of covariates not only in the center of a population, but also in the lower and upper tails. Especially of interest is the case where the data pattern shows heteroscedasticities and asymmetries.

In the literature several methods for the nonparametric estimation of conditional quantile functions have been discussed. These methods include spline smoothing, kernel estimation, nearest-neighbour estimation and locally weighted polynomial regression. Yu and Jones (1998) propose two kinds of local linear quantile regression. They also develop a rule-of-thumb bandwidth choice procedure based on the plug-in idea. Starting point is the asymptotically optimal bandwidth minimizing the MSE. Since this bandwidth depends on unknown quantities, the authors introduce some simplifying assumptions. These assumptions result in the bandwidth selection strategy

    h_α = h_mean {α(1 − α) / φ(Φ^{−1}(α))²}^{1/5},    (1)

where φ and Φ are the standard normal density and distribution function and h_mean is a bandwidth choice for regression mean estimation obtained with one of the several existing methods. As can be seen, this procedure leads to identical bandwidths for the α and (1 − α) quantiles. Although this strategy might work very well in some situations, our special interest lies in asymmetric data patterns, where the above rule is too restrictive.
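As a sketch, the rule-of-thumb (1) is straightforward to compute; the following assumes SciPy's standard normal density and quantile functions (the function name and the value of h_mean are illustrative):

```python
from scipy.stats import norm

def yu_jones_bandwidth(h_mean, alpha):
    """Rule-of-thumb (1) of Yu and Jones (1998): rescale a
    mean-regression bandwidth h_mean to the alpha-quantile."""
    z = norm.ppf(alpha)                                   # standard normal quantile
    factor = (alpha * (1.0 - alpha) / norm.pdf(z) ** 2) ** 0.2
    return h_mean * factor

# The rule is symmetric: alpha and 1 - alpha receive the same bandwidth.
print(yu_jones_bandwidth(0.5, 0.25), yu_jones_bandwidth(0.5, 0.75))
```

This symmetry is exactly the restriction criticized above for asymmetric data patterns.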

Abberger (1998) adapts the cross-validation idea to kernel quantile regression and presents some simulation examples. Also asymmetric data patterns based on the lognormal distribution are included.

This paper tries to use penalizing function based criteria to choose the bandwidth in nonparametric quantile regression. In the next section these criteria are presented, and simulation examples are discussed in Section 3.

2 Quantile estimation and bandwidth choice

A locally weighted linear quantile regression estimator is defined by setting q̂_α(x) = â, where â and b̂ minimize

    ∑_{i=1}^n ρ_α(Y_i − a − b(X_i − x)) K((x − X_i)/h)    (2)

with the check function

    ρ_α(u) = α u 1_{u≥0}(u) + (α − 1) u 1_{u<0}(u)    (3)

introduced by Koenker and Bassett (1978) in connection with parametric quantile regression. For a discussion of this estimator see Heiler (2000) or Yu and Jones (1998), who also derive the MSE of this estimator. To calculate q̂_α we use an iteratively reweighted least squares algorithm. Initial estimates are conditional quantiles calculated with a kernel estimator of the Nadaraya-Watson type (see Heiler (2000)).
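A minimal sketch of such an iteratively reweighted least squares computation of (2) (Gaussian kernel; for simplicity a weighted least-squares start is used instead of a Nadaraya-Watson pilot, so names and iteration control are illustrative, not the author's exact algorithm):

```python
import numpy as np

def local_linear_quantile(x0, X, Y, alpha, h, n_iter=50, eps=1e-6):
    """Local linear alpha-quantile fit at x0 by IRLS: minimize
    sum_i rho_alpha(Y_i - a - b(X_i - x0)) K((x0 - X_i)/h)."""
    K = np.exp(-0.5 * ((X - x0) / h) ** 2)        # Gaussian kernel weights
    Z = np.column_stack([np.ones_like(X), X - x0])
    W = np.sqrt(K)                                 # weighted LS starting value
    beta = np.linalg.lstsq(Z * W[:, None], Y * W, rcond=None)[0]
    for _ in range(n_iter):
        r = Y - Z @ beta
        # rho_alpha(r) = w * r^2 with IRLS weight w = |alpha - 1{r<0}| / |r|
        w = K * np.abs(alpha - (r < 0)) / np.maximum(np.abs(r), eps)
        W = np.sqrt(w)
        beta = np.linalg.lstsq(Z * W[:, None], Y * W, rcond=None)[0]
    return beta[0]                                 # q_hat_alpha(x0) = a_hat
```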

Figure 1: Estimated density of log differences (log(h(ASE)/h(AAWE))) between ASE and AAWE optimal bandwidths for simulated data

Estimation by minimizing equation (2) can be interpreted as M-estimation or, in the notation of Bickel and Doksum (2001), as a minimum contrast estimate with contrast function ρ_α. In general they define a discrepancy function

    D(θ_0, θ) ≡ E_{θ_0} ρ(Y, θ),    (4)

which is minimized by the true value θ_0.

Nonparametric estimation of quantile regression requires the choice of a bandwidth. In nonparametric mean regression, procedures for bandwidth choice are usually grounded on the MSE. Various definitions of the optimal bandwidth are available. One candidate is the bandwidth that minimizes the MISE (mean integrated squared error) for the given sample size and design. This bandwidth is optimal with respect to the average performance over all possible data sets for a given population, rather than for the performance on the observed data set. Another choice is the bandwidth that minimizes the average squared error (ASE) for the observed data set. Between these two concepts we chose the latter one. For further discussion of this issue see e.g. Mammen (1990), Grund et al. (1994), Härdle et al. (1988).

Another natural choice in quantile regression is based on the discrepancy function (4). It is

    E[ρ_α(Y − m(x))] = α(μ_Y(x) − m(x)) + ∫_{−∞}^{m(x)} F(y|x) dy,    (5)

and thus the optimal bandwidth is the one for which the corresponding quantile estimator minimizes

    (1/n) ∑_{i=1}^n { ∫_{−∞}^{q̂_α(X_i)} F(y|X_i) dy − α q̂_α(X_i) }.    (6)

In the sequel this criterion will be called average alpha weighted error (AAWE). The difference between ASE (which in quantile regression is (1/n) ∑ (q_α(X_i) − q̂_α(X_i))²) and AAWE is demonstrated for a data pattern which is further considered in the simulation examples section. The true underlying distribution is exponential with density

    f(y) = a e^{−ay−1} 1_{y > −1/a}(y),  a > 0.    (7)

Figure 2: Estimated 0.75 quantiles with ASE and AAWE optimal bandwidths for a simulated data set

This density is asymmetric and has expectation zero for all a > 0. With x = 1,…,600 we chose a = 1.5 + sin(2πx/100). For 1000 repetitions the bandwidths minimizing the average errors when estimating the 0.75-quantiles with a kernel estimator were calculated. Figure 1 shows the density of log(h_ASE/h_AAWE). There is a high peak at 0, indicating that the chosen bandwidths coincide quite often. But there is also a slight left skewness observable. This indicates a tendency of the AAWE method to smooth more strongly than the ASE procedure. Figure 2 shows a "typical" example where the ASE bandwidth is smaller than the AAWE bandwidth. Since the conditional distribution in the peaks is much flatter than in the valleys, where the conditional distribution is very steep, deviations in the valleys are in the AAWE sense more important than errors in the peaks. This perspective is quite natural for quantile estimation, especially when we think of doing quantile forecasts.
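The data pattern just described can be generated as follows (a sketch; it assumes the oscillation in a(x) has period 100, i.e. a = 1.5 + sin(2πx/100), which is how the garbled design formula is read here):

```python
import numpy as np

def simulate_design(n=600, rng=None):
    """One sample from density (7) with a(x) = 1.5 + sin(2*pi*x/100):
    a mean-zero shifted exponential at every design point x = 1,...,n."""
    rng = rng or np.random.default_rng()
    x = np.arange(1, n + 1)
    a = 1.5 + np.sin(2.0 * np.pi * x / 100.0)
    y = rng.exponential(1.0 / a) - 1.0 / a   # Exp(mean 1/a) shifted to mean 0
    return x, a, y
```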

(6)

theyareoftheformy^=m(x)^ =Hy , wherethematrixH iscommonlycalled

thesmoother matrix anddepends onx but noton y . The traceofH canbe

interpretedas theeective numberof parametersused in thesmoothing(e.g.

HastieandTibshirani(1990),sec. 3.5). Onepossiblestrategytondasuitable

smoothingparameteristochose thebandwidthwhichistheminimizerof

    log(σ̂²) + Ψ(H),  where    (8)

    σ̂² = (1/n) ∑_{i=1}^n {y_i − m̂_h(X_i)}²    (9)

and Ψ(·) is a penalty function designed to decrease with increasing smoothness of m̂_h. Common choices of Ψ lead to GCV (Ψ(H) = −2 log{1 − tr(H)/n}), Rice's T (Ψ(H) = −log{1 − 2 tr(H)/n}) and AIC_c (Hurvich et al. (1998)) (Ψ(H) = {1 + tr(H)/n}/{1 − [tr(H) + 2]/n}).
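The three penalties can be written down directly; a sketch with illustrative function names, each taking tr(H) and the sample size n:

```python
import numpy as np

def psi_gcv(tr_h, n):
    """GCV penalty: -2 log{1 - tr(H)/n}."""
    return -2.0 * np.log(1.0 - tr_h / n)

def psi_rice(tr_h, n):
    """Rice's T penalty: -log{1 - 2 tr(H)/n}."""
    return -np.log(1.0 - 2.0 * tr_h / n)

def psi_aicc(tr_h, n):
    """AIC_c penalty of Hurvich et al. (1998)."""
    return (1.0 + tr_h / n) / (1.0 - (tr_h + 2.0) / n)
```

All three increase with tr(H), i.e. they decrease with increasing smoothness of the fit, as required above.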

These smoothing parameter selectors can be adapted to quantile regression estimation. The first modification concerns log(σ̂²). Since the quantile estimator (2) falls into the class of M-estimators, we can proceed as usual in M-estimation (see e.g. Hampel et al. (1986)) and interpret the ρ_α function as "−log-likelihood". So the AIC criterion and all the other above-mentioned criteria can be adapted by using (1/n) ∑_{i=1}^n ρ_α(y_i − q̂_α(x_i)) instead of σ̂².

The second modification concerns the smoother matrix H. Estimator (2) does not lead to a linear estimator ŷ = Hy. Because the actual estimation is carried out by iteratively reweighted least squares, the smoother matrix H can be approximated by the implied smoother matrix from the last iteration of the iteratively reweighted least squares fit of the model.

With these modifications we arrive at the following strategy to find a suitable smoothing parameter for local linear quantile regression: choose the bandwidth minimizing

    2 log( (1/n) ∑_{i=1}^n ρ_α(y_i − q̂_α(x_i)) ) + Ψ(H),    (10)

where Ψ(·) is one of the above-mentioned penalizing functions and H the approximative smoother matrix.
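Criterion (10) can then be sketched as follows (the penalty is passed in as a callable; names are illustrative, and tr_h would come from the implied smoother matrix of the last IRLS iteration):

```python
import numpy as np

def rho(u, alpha):
    """Check function (3)."""
    return u * (alpha - (u < 0))

def quantile_criterion(y, q_hat, alpha, tr_h, psi):
    """Criterion (10): 2 log of the averaged check-function loss
    plus a penalizing function evaluated at tr(H)."""
    n = len(y)
    return 2.0 * np.log(np.mean(rho(y - q_hat, alpha))) + psi(tr_h, n)
```

The bandwidth search then evaluates this criterion on a grid of h values and keeps the minimizer.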

3 Simulation examples

In this section, some simulation results are presented. The underlying density functions were of the exponential type shown in equation (7). The two models

    Model I:  a = 1.5 + sin(2πx/100)    (11)
    Model II: a = 10 exp(−x/200)    (12)

with x = 1,…,400 are considered. For each setting 100 repetitions were calculated. The 0.25- and 0.75-quantiles were estimated for both models. Bandwidths are chosen with the help of the above discussed methods based on penalizing functions and, in addition, with the cross-validation method, which chooses

    h_CV = argmin_h ∑_{i=1}^n ρ_α(Y_i − q̂_α^{(−i)}(X_i)),    (13)

with q̂_α^{(−i)}(X_i) the so-called leave-one-out estimator. This is the estimator for the conditional quantile at X_i which is calculated without the observation (Y_i, X_i). To avoid boundary effects, only the 200 observations x = 101,…,300 in the middle are used for bandwidth choice.
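The leave-one-out search (13) over a bandwidth grid can be sketched generically; here `fit` is any conditional-quantile estimator with the hypothetical signature fit(x0, X, Y, alpha, h):

```python
import numpy as np

def rho(u, alpha):
    """Check function (3)."""
    return u * (alpha - (u < 0))

def cv_bandwidth(X, Y, alpha, grid, fit):
    """Cross-validation choice (13): for each h on the grid, sum the
    check-function loss of the leave-one-out quantile estimator."""
    scores = []
    for h in grid:
        loo = np.array([fit(X[i], np.delete(X, i), np.delete(Y, i), alpha, h)
                        for i in range(len(X))])
        scores.append(np.sum(rho(Y - loo, alpha)))
    return grid[int(np.argmin(scores))]
```

Restricting the sum to interior design points, as done above, avoids the boundary effects mentioned in the text.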

The estimated densities of log(h_CV/h_AAWE), log(h_GCV/h_AAWE) and log(h_AICc/h_AAWE) for the two quantiles and both models are shown in Figures 3-6. We also calculated Rice's T, but the results are quite similar to the AIC_c criterion, so these results are not shown in the graphs.

The mean of h_AAWE for the 0.25 quantiles is […] and for the 0.75 quantiles the mean is 58.1. This difference in the means confirms the need for methods which can handle asymmetric data patterns.

Figure 3: Estimated densities of log(h/h_AAWE) for 0.25 quantiles of Model I (panels: CV, GCV, AICc)

Figure 3 shows the results for the 0.25 quantiles of Model I. The three estimated densities all have modes around zero. But the peaks for the penalizing methods are higher and sharper than for the cross-validation method, where the density is flatter.

Also in Figure 4, which presents the results for the 0.75 quantiles of Model I, the cross-validation density is relatively flat. But in this case it is the only density with mode around zero. The penalizing methods tend to undersmooth.

A similar behaviour can be observed for Model II, visualized in Figures 5 and 6. The mean of h_AAWE for α = 0.25 is 109.6 and for α = 0.75 the mean is

Figure 4: Estimated densities of log(h/h_AAWE) for 0.75 quantiles of Model I (panels: CV, GCV, AICc)

245.6. This difference is again a result of the asymmetric density. And just as for Model I, the penalizing methods tend to undersmooth the upper quantile.

These results remain unchanged when h_ASE is used as reference bandwidth instead of h_AAWE, because in these examples the difference between h_ASE and h_AAWE is not that large.

Figure 7 shows the estimated densities for the 0.75 quantiles of Model I, but now 100 observations are used for bandwidth choice instead of 200. The penalizing methods still undersmooth, but the smaller sample size leads to stronger differences between the methods. The AIC_c method undersmooths less than the GCV method.

Figure 5: Estimated densities of log(h/h_AAWE) for 0.25 quantiles of Model II (panels: CV, GCV, AICc)

To sum up the simulation results, it can be stated that the penalizing function based methods for bandwidth choice can lead to a reduction in variability compared with the cross-validation method. But for this we have to take into account the tendency of penalizing methods to undersmooth when large bandwidths are appropriate. Maybe this disadvantage can be brought under control with the development of adapted penalizing functions. Simulations based on smaller sample sizes show that the AIC_c penalizing function undersmooths less than some other penalizing functions.

Figure 6: Estimated densities of log(h/h_AAWE) for 0.75 quantiles of Model II (panels: CV, GCV, AICc)

Figure 7: Estimated densities of log(h/h_AAWE) for 0.75 quantiles of Model I (bandwidth choice with 100 observations; panels: CV, GCV, AICc)

References

Abberger K. (1998): Cross-validation in nonparametric quantile regression. Allgemeines Statistisches Archiv, 82, 149-161.

Bickel P.J., Doksum K.A. (2001): Mathematical Statistics. Longman Higher Education, New Jersey.

Grund B., Hall P., Marron J.S. (1994): Loss and risk in smoothing parameter selection. Journal of Nonparametric Statistics, 4, 107-132.

Hampel F.R., Ronchetti E.M., Rousseeuw P.J., Stahel W.A. (1986): Robust Statistics. Wiley, New York.

Härdle W., Hall P., Marron J.S. (1988): How far are automatically chosen regression smoothing parameters from their optimum? Journal of the American Statistical Association, 83, 86-101.

Hastie T.J., Tibshirani R.J. (1990): Generalized Additive Models. Chapman and Hall, New York.

Heiler S. (2000): Nonparametric Time Series Analysis. In: A Course in Time Series Analysis, edited by D. Peña and G.C. Tiao. John Wiley, London.

Hurvich C.M., Simonoff J.S., Tsai C.L. (1998): Smoothing parameter selection in nonparametric regression using an improved Akaike information criterion. Journal of the Royal Statistical Society, Ser. B, 60, 271-293.

Hurvich C.M., Tsai C.L. (1989): Regression and time series model selection in small samples. Biometrika, 76, 297-307.

Koenker R., Bassett G. (1978): Regression quantiles. Econometrica, 46, 33-50.

Koenker R., Portnoy S., Ng P. (1992): Nonparametric estimation of conditional quantile functions. In: L1-Statistical Analysis and Related Methods (ed. Y. Dodge), North-Holland, New York.

Mammen E. (1990): A short note on optimal bandwidth selection for kernel estimators. Statistics and Probability Letters, 9, 23-25.

Yu K., Jones M.C. (1998): Local linear quantile regression. Journal of the American Statistical Association, 93, 228-237.
