An iterative plug-in algorithm for nonparametric modelling of seasonal time series


Nonparametric Modelling of Seasonal Time Series

Yuanhua Feng

University of Konstanz

Abstract

This paper focuses on developing a new data-driven procedure for decomposing seasonal time series based on local regression. A formula for the asymptotically optimal bandwidth $h_A$ in the current context is given. Methods for estimating the unknowns in $h_A$ are investigated. A data-driven algorithm for decomposing seasonal time series is proposed based on the iterative plug-in idea introduced by Gasser et al. (1991). The asymptotic behaviour of this algorithm is investigated. Some computational aspects are discussed in detail. The practical performance of the proposed algorithm is illustrated by simulated and data examples. The results here also provide some insights into the iterative plug-in idea.

Keywords: Time series decomposition; Local regression; Iterative plug-in; Bandwidth selection

1 Introduction

Decomposing seasonal time series into unobserved components is an important issue in statistics. This question arises, e.g., if we want to analyze monthly data or to build models using seasonally adjusted data. Here, the equidistant additive time series model

$$Y_t = g(x_t) + S(x_t) + \epsilon_t, \quad t = 1, 2, \ldots, n, \qquad (1)$$

will be used for this purpose, where $x_t = (t - 0.5)/n$, the $\epsilon_t$ are iid random variables with $E(\epsilon_t) = 0$ and $\mathrm{var}(\epsilon_t) = \sigma^2$, $g$ is a smooth trend-cyclical function, and $S$ is a slowly changing seasonal component with seasonal period $s$ (in terms of $t$). Denote by $m = g + S$ the mean function. In the following, model (1) will be treated as a standard nonparametric regression with an additional (deterministic) seasonal component. A traditional nonparametric approach for estimating $g$, $S$ and $m$ based on (unweighted) local regression with polynomials and trigonometric functions as local regressors was proposed by Heiler (1966,


being used by the German Federal Statistical Office since 1983. The approach used in the following is a generalized version of the Berlin Method proposed by Heiler and Michels (1994) based on locally weighted regression (Cleveland, 1979) by introducing a common kernel weight function into the original methodology.

A crucial problem for time series decomposition based on local regression is the selection of the bandwidth. Some double-smoothing procedures for this purpose were proposed by Heiler and Feng (1996, 2000), Feng (1999) and Feng and Heiler (2000). The aim of the current paper is to propose a new algorithm for selecting the bandwidth under model (1). The iterative plug-in idea introduced by Gasser et al. (1991), with some minor improvements proposed by Beran and Feng (2002a,b), is adapted to the current context. To our knowledge this is the first plug-in bandwidth selector for decomposing seasonal time series. Moreover, we also provide some insights into the iterative plug-in idea. The asymptotic behaviour of the proposed algorithm is investigated. Some computational aspects of this algorithm are discussed in detail. Simulated and data examples show that this algorithm works well in practice.

The paper is organized as follows. The estimators and some of their asymptotic properties that are needed in the subsequent sections are given in Section 2. Methods for estimating the unknown terms in the asymptotically optimal bandwidth are discussed in Section 3. The algorithm is proposed in Section 4 together with a discussion of its asymptotic behaviour and of some computational aspects. Simulated and data examples in Section 5 illustrate the practical usefulness of this proposal. Section 6 contains some final remarks. Proofs of the results are put in the appendix.

2 The local regression approach

2.1 The estimators

Assume that $g$ is at least $(p+1)$ times continuously differentiable, so that it can be expanded in a Taylor series around a point $x_t$. Similarly, $S$ can be locally modelled by a Fourier series. Let $\lambda_1 = 2\pi/s$ be the seasonal frequency and $\lambda_j = j\lambda_1$ for $j = 2, \ldots, q$, where $q = [s/2]$ with $[\cdot]$ denoting the integer part. Let $K(u)$ be a second order kernel


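regression estimators of $g$, $S$ and $m$ at $x_t$ are obtained by solving the least squares problem

$$Q = \sum_{i=1}^{n} \Big\{ Y_i - \sum_{j=0}^{p} \beta_{1j}(x_i - x_t)^j - \sum_{j=1}^{q} \big( \beta_{2j}\cos\lambda_j(i-t) + [\beta_{3j}\sin\lambda_j(i-t)] \big) \Big\}^2 K\Big(\frac{x_i - x_t}{h}\Big) \Rightarrow \min. \qquad (2)$$

The solutions of (2) are $\hat g(x_t) = \hat\beta_{10}$, $\hat S(x_t) = \sum_{j=1}^{q} \hat\beta_{2j}$ and $\hat m(x_t) = \hat g(x_t) + \hat S(x_t)$, where the coefficients and their estimates are defined locally and hence depend on $x_t$.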

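Let

$$X_1 = \begin{pmatrix} 1 & x_1 - x_t & \cdots & (x_1 - x_t)^p \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_n - x_t & \cdots & (x_n - x_t)^p \end{pmatrix}$$

and

$$X_2 = \begin{pmatrix} \cos\lambda_1(1-t) & \sin\lambda_1(1-t) & \cdots & \cos\lambda_q(1-t) & [\sin\lambda_q(1-t)] \\ \vdots & \vdots & & \vdots & [\vdots] \\ \cos\lambda_1(n-t) & \sin\lambda_1(n-t) & \cdots & \cos\lambda_q(n-t) & [\sin\lambda_q(n-t)] \end{pmatrix}.$$

Then $X = (X_1 \,\vdots\, X_2)$ is the $[n \times (p+s)]$ design matrix. The entries in (2) and $X_2$ marked by $[\ ]$ only apply to odd $s$; for even $s$ they have to be omitted due to $\lambda_q = \pi$. Let $y = (y_1, \ldots, y_n)'$ be the vector of observations and let $K$ denote a diagonal matrix with

$$k_i = K\Big(\frac{x_i - x_t}{h}\Big).$$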

Furthermore, denote the $j$-th $(p+1) \times 1$ unit vector by $e_j$ and let $\iota_s$ be an $(s-1) \times 1$ vector having 1 in its odd entries and 0 elsewhere. Then we have

$$\hat m(x_t) = (e_1', \iota_s')(X'KX)^{-1}X'Ky =: w'y, \qquad (3)$$

$$\hat g(x_t) = (e_1', 0')(X'KX)^{-1}X'Ky =: w_1'y, \qquad (4)$$

and

$$\hat S(x_t) = (0', \iota_s')(X'KX)^{-1}X'Ky =: w_2'y, \qquad (5)$$

where $0$ is a vector of zeros of appropriate dimension.

The vectors $w = (w_1, \ldots, w_n)'$, $w_1 = (w_{11}, \ldots, w_{1n})'$ and $w_2 = (w_{21}, \ldots, w_{2n})'$ will be called the weighting systems of $\hat m$, $\hat g$ and $\hat S$, for which we have $w = w_1 + w_2$, $\sum w_i = \sum w_{1i} = 1$ and $\sum w_{2i} = 0$. The local regression approach makes $\hat m$, $\hat g$ and $\hat S$ exactly unbiased if $g$ is a polynomial of order no larger than $p$ and $S$ is exactly periodic with period $s$.
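The weighting systems in (3) to (5) can be computed directly from the design matrix. The following is a minimal sketch for an interior point, not the implementation used in the paper: the function names `bisquare` and `local_weights` are hypothetical, NumPy is assumed, the bisquare kernel is taken from Section 5, and no boundary adjustment is made.

```python
import numpy as np

def bisquare(u):
    """Bisquare kernel K(u) = 15/16 (1 - u^2)^2 on [-1, 1], zero elsewhere."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 15.0 / 16.0 * (1.0 - u**2) ** 2, 0.0)

def local_weights(n, t, h, p=1, s=4, kernel=bisquare):
    """Return the weighting systems (w, w1, w2) of m-hat, g-hat and S-hat at x_t."""
    i = np.arange(1, n + 1)
    x = (i - 0.5) / n
    xt = (t - 0.5) / n
    # Polynomial block X_1: columns 1, (x_i - x_t), ..., (x_i - x_t)^p.
    X1 = np.vander(x - xt, N=p + 1, increasing=True)
    # Trigonometric block X_2: cos/sin pairs at the seasonal frequencies.
    q = s // 2
    cols = []
    for j in range(1, q + 1):
        lam = 2.0 * np.pi * j / s
        cols.append(np.cos(lam * (i - t)))
        # For even s the last sine column vanishes identically and is omitted.
        if not (s % 2 == 0 and j == q):
            cols.append(np.sin(lam * (i - t)))
    X = np.column_stack([X1] + cols)           # n x (p + s) design matrix
    k = kernel((x - xt) / h)                   # diagonal of the kernel matrix
    XtK = X.T * k                              # X'K without forming K explicitly
    A = np.linalg.solve(XtK @ X, XtK)          # (X'KX)^{-1} X'K
    w1 = A[0]                                  # row picked by (e_1', 0')
    # Rows of the cosine coefficients: odd entries of the seasonal block.
    w2 = A[p + 1 + np.arange(0, s - 1, 2)].sum(axis=0)
    return w1 + w2, w1, w2
```

For a noise-free series with a linear trend and an exactly periodic, zero-sum seasonal component, `w1 @ y` and `w2 @ y` reproduce the trend and the seasonal value at $x_t$, which illustrates the exact unbiasedness stated above.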


To develop a plug-in bandwidth selector we have to discuss the asymptotic behaviour of $\hat g$, $\hat S$ and $\hat m$. From here on it is assumed that $p$ is odd so that $\hat g$ has automatic boundary correction. Put $k = p+1$ and assume that

A1. $h \to 0$ and $nh \to \infty$ as $n \to \infty$.

A2. $g$ is at least $k$ times continuously differentiable.

A3. $S$ is exactly periodic with period $s$.

A1 and A2 are the same as in nonparametric regression without seasonality. A3 is a parametric assumption on the seasonal component, which is made for simplicity and can easily be relaxed to a general slowly changing seasonal component. Under the assumption A1 it can be shown that $\hat g$ is asymptotically equivalent to some kernel estimator (see Feng, 1999). This means that the same asymptotic results as in local polynomial fitting hold for $\hat g$ under model (1). In the following denote by $K_p(u)$ the equivalent kernel for estimating $g$, which is of order $k$.

To deal with $\hat S$, we will introduce a kernel estimator of $S$. Let

$$Q_s(i) = \begin{cases} (s-1), & \text{if } (i-t)/s \text{ is an integer}, \\ -1, & \text{otherwise}, \end{cases} \qquad (6)$$

and

$$\tilde w_{2i} = (nh)^{-1} Q_s(i) K\Big(\frac{x_i - x_t}{h}\Big). \qquad (7)$$

A kernel estimator of $S$ is defined by

$$\tilde S(x_t) = \sum_{i=1}^{n} \tilde w_{2i} y_i =: \tilde w_2' y. \qquad (8)$$

Note that the $\{\tilde w_{2i}\}$ are asymptotically periodic with the same period $s$. Suppose that a corresponding boundary correction is done for $\tilde S$; then it can be shown that, under A1, $\hat S$ and $\tilde S$ are asymptotically equivalent, too (see Feng, 1999).

As an error criterion for bandwidth selection we use the mean averaged squared error (MASE). Define $R(K) = \int_{-1}^{1} K^2(u)\,du$. Let $B$ denote the bias of an estimator. We have

Lemma 1 Assume that A1 to A3 hold, then

1. the asymptotic bias of $\hat m$ is

$$B[\hat m(x_t)] \doteq B[\hat g(x_t)] \doteq \frac{1}{k!} \int u^k K_p(u)\,du \; g^{(k)}(x_t)\, h^k; \qquad (9)$$

2. the asymptotic variance of $\hat m$ is

$$\mathrm{var}(\hat m(x_t)) = (nh)^{-1} \sigma^2 \{R(K_p) + (s-1)R(K)\}\{1 + O[(nh)^{-1}]\}; \qquad (10)$$

3. and the MASE of $\hat m$ is

$$\mathrm{MASE}(\hat m) := \frac{1}{n} \sum_{t=1}^{n} E[\hat m(x_t) - m(x_t)]^2 \doteq \frac{\sigma^2}{nh} \{R(K_p) + (s-1)R(K)\} + \frac{1}{(k!)^2} \left\{ \int \{g^{(k)}(x)\}^2 dx \Big( \int u^k K_p(u)\,du \Big)^2 \right\} h^{2k}. \qquad (11)$$

A sketched proof of Lemma 1 is given in the appendix, where it is shown in particular that: 1. $\hat g$ and $\hat S$ are asymptotically uncorrelated and 2. the bias in $\hat S$ is negligible compared to that in $\hat g$. The asymptotically optimal bandwidth, which minimizes the dominant part of the MASE, is given by

$$h_A = \left( \frac{(k!)^2}{2k} \, \frac{\sigma^2 \{R(K_p) + (s-1)R(K)\}}{\int \{g^{(k)}(x)\}^2 dx \, \{\int u^k K_p(u)\,du\}^2} \right)^{1/(2k+1)} n^{-1/(2k+1)}, \qquad (12)$$

where it is assumed that $I = \int \{g^{(k)}(x)\}^2 dx > 0$. The change in $h_A$ due to the additional term $S$ is just a constant. For $s = 1$ the above formulae reduce to the results in nonparametric regression as given e.g. in Müller (1988), Ruppert and Wand (1994) and Fan and Gijbels (1996).
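As a numerical illustration of (11) and (12), the sketch below evaluates the dominant part of the MASE and the bandwidth that minimizes it. The function names and all input values in the usage note are hypothetical. The kernel constants are passed in explicitly because they depend on the kernel and on $p$; for the bisquare kernel one has $R(K) = 5/7$ and $\int u^2 K(u)\,du = 1/7$, and using these also for $K_p$ with $p = 1$ is an assumption made here for illustration only.

```python
import math

def amase(h, n, sigma2, I, s, k, R_Kp, R_K, mu_k):
    """Dominant part of the MASE in (11): variance term plus squared bias term."""
    variance = sigma2 * (R_Kp + (s - 1) * R_K) / (n * h)
    bias2 = I * mu_k**2 * h ** (2 * k) / math.factorial(k) ** 2
    return variance + bias2

def h_asymptotic(n, sigma2, I, s, k, R_Kp, R_K, mu_k):
    """Asymptotically optimal bandwidth h_A as in (12)."""
    num = math.factorial(k) ** 2 * sigma2 * (R_Kp + (s - 1) * R_K)
    den = 2 * k * I * mu_k**2
    return (num / den) ** (1.0 / (2 * k + 1)) * n ** (-1.0 / (2 * k + 1))
```

For example, `h_asymptotic(200, 1.0, 50.0, 4, 2, 5/7, 5/7, 1/7)` gives the bandwidth for illustrative values $n = 200$, $\sigma^2 = 1$, $I = 50$ and $s = 4$. When $R(K_p) = R(K)$, moving from $s = 1$ to $s$ multiplies $h_A$ by the constant $s^{1/(2k+1)}$, in line with the remark that the seasonal term changes $h_A$ only by a constant.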

3 Estimating the unknown parameters

3.1 Estimation of the variance

In order to develop a plug-in bandwidth selector based on (12), the unknown parameters $\sigma^2$ and $I$ have to be estimated. It is well known that the variance in nonparametric regression can be estimated by difference-based methods (see e.g. Rice, 1984, Gasser et al., 1986 and Hall et al., 1990). Heiler and Feng (1996) adapted this idea to model (1) and proposed some seasonal-difference-based variance estimators. Here a sequence $D_{ms} = \{d_j \mid j = 0, 1, \ldots, m\}$ is called a seasonal difference sequence, if

$$\sum_{j=0}^{m} d_j = 0, \quad \sum_{j=0}^{m} d_j^2 = 1, \quad m = 1, 2, \ldots \qquad (13)$$

and

$$S_i = \sum_{j=0}^{m} d_j \delta_{ij} = 0, \quad i = 0, 1, \ldots, s-1, \qquad (14)$$

where

$$\delta_{ij} = \begin{cases} 1, & \text{if } (j-i)/s \text{ is an integer}, \\ 0, & \text{otherwise}. \end{cases}$$

A seasonal-difference-based variance estimator is then defined by

$$\hat\sigma_D^2 = (n-m)^{-1} \sum_{i=1}^{n-m} \Big( \sum_{j=0}^{m} d_j Y_{i+j} \Big)^2. \qquad (15)$$

Following Hall et al. (1990) it can be shown that under A2 and A3 $\hat\sigma_D^2$ is root-$n$ consistent. In this paper the following seasonal difference sequence

$$D_{m,s} = \frac{1}{\sqrt{12}} \{-1, 2, -1, \underbrace{0, \ldots, 0}_{s-3}, 1, -2, 1\}$$

defined for $s \ge 3$ will be used to estimate $\sigma^2$, where $m = s+2$.

3.2 Estimation of I

Similar to local polynomial fitting, the $k$-th derivative of $g$ can be estimated with a local polynomial of order $p_I$ and a bandwidth $h_I$ with $p_I > k$ and $p_I - k$ odd. We set $l = p_I + 1$. A simple choice is $p_I = k+1$ with $l = k+2$. Let now (2) be defined with $p$ being replaced by $p_I$. Let $K$, $y$ and $e_j$ be the same as defined in Section 2. Let $X$ be defined similarly as before. Then $\hat g^{(k)} = k!\,\hat\beta_{1k}$ estimates $g^{(k)}$, which is given by

$$\hat g^{(k)}(x_t) = k!\,(e_{k+1}', 0')(X'KX)^{-1}X'Ky =: (w^k)'y, \qquad (16)$$

where $0$ is the same as in (4) and $w^k = (w_1^k, \ldots, w_n^k)'$ is the weighting system of $\hat g^{(k)}$. Then $I$ may be estimated by

$$\hat I[g^{(k)}(x; h_I)] = n^{-1} \sum_{i=1}^{n} \{\hat g^{(k)}(x_i; h_I)\}^2. \qquad (17)$$

In the following some results on $\hat I$, which are important for the development of a plug-in bandwidth selector, will be given without proof, since here we are only interested in the magnitude orders. These orders are the same in the current context and in models without seasonality. Such results in nonparametric regression may be found in Ruppert et

with short memory, long memory and antipersistence are given in Beran and Feng (2002a).

Assume now that

A′1. $h \to 0$ and $nh^{2k+1} \to \infty$ as $n \to \infty$.

A′2. $g$ is at least $l$ times continuously differentiable.

Under the assumptions A′1, A′2 and A3 we have

$$B(\hat I) \doteq O(h_I^{\,l-k}) + O[(nh_I)^{-1} h_I^{-2k}] \qquad (18)$$

and

$$\mathrm{var}(\hat I) \doteq O(n^{-1}) + O(n^{-2} h_I^{-4k-1}). \qquad (19)$$

A′1 implies that $h_I$ is of a larger order than $h_A$, i.e. $(h_I)^{-1} = o[(h_A)^{-1}]$, which ensures that $\hat g^{(k)}$ and hence $\hat I$ is at least consistent.

The following remarks show how $h_I$ should be chosen.

Remark 1. The largest order $h_I$ should take is $O(n^{-1/(4k+2)}) = O[(h_A)^{1/2}]$. Under this choice the second term on the right hand side of (18) and the standard deviation of $\hat I$ achieve the fastest root-$n$ convergence rate at the same time. An $h_I$ of a larger order will increase the bias without improving the variance (in terms of the magnitude order).

Remark 2. The optimal bandwidth for estimating $g^{(k)}$ itself is of order $O(n^{-1/(2l+1)})$. This order lies between the two given in Remarks 1 and 3. The choice $h_I = O(n^{-1/(2l+1)})$ is hence also reasonable.

Remark 3. Observe that the MSE (mean squared error) of $\hat I$ is dominated by the squared bias part. By balancing the orders of the two terms on the right hand side of (18) we obtain $h_I = O(n^{-1/(k+l+1)})$, which may be considered to be the (asymptotically) optimal choice of $h_I$. Such a choice is of a smaller order than the two given in Remarks 1 and 2.

4 The main proposal

4.1 The basic algorithm

From here on only $p = 1$ and $3$ with $k = 2$ and $4$ will be considered. Following the iterative plug-in idea of Gasser et al. (1991), $\hat I_j$, the estimate of $I$ in the $j$-th iteration, is calculated with a bandwidth $h_{I,j}$ obtained from $h_{j-1}$, the bandwidth selected in the $(j-1)$-th iteration, by means of an inflation method. Here an inflation method is a function $h_{I,j} = f(h_{j-1})$ such that $(h_{I,j})^{-1} = o[(h_{j-1})^{-1}]$. This means that $h_{I,j}$ will be of a larger order than $h_A$, if $h_{j-1}$ is at least of order $O(h_A)$. Then A′1 is satisfied so that $\hat I$ and $\hat h$ will be consistent in the $j$-th iteration. In the following two inflation methods will be described.

The original inflation function, called the multiplied inflation method (MIM), is $h_{I,j} = f(h_{j-1}) = c\,h_{j-1}\,n^{\beta}$, introduced by Gasser et al. (1991) with some $\beta > 0$, called an inflation factor. This idea is discussed in detail by Herrmann and Gasser (1994). There are some unknowns in the function $f$, namely $c$, $\beta$ and a starting bandwidth $h_0$, which have to be chosen beforehand. Theoretically, the rate of convergence of $\hat h$ does not depend on $c$ and $h_0$. Here we will simply set $c = 1$. The choice of $h_0$ will be discussed in Section 4.3. Let $l = k+2$. Following Remarks 1 to 3, we obtain three reasonable choices of $\beta$ for the MIM, respectively:

1. $\beta_1 = 1/(4k+2)$, so that the variance term of $\hat I$ is minimized,

2. $\beta_2 = 4/[(2k+1)(2k+5)]$, so that $\hat g^{(k)}$ is optimized, and

3. $\beta_3 = 2/[(2k+1)(2k+3)]$, so that the MSE of $\hat I$ is minimized,

when convergence is reached, where $\beta_1 > \beta_2 > \beta_3$ and $\beta_3$ is the optimal choice of $\beta$.

It is well known that the required number of iterations by the MIM is very large, especially for $k > 2$. For example, if $k = 4$, it is $5k+1 = 21$ for $\beta_1$ and $(k+1)(2k+1) = 45$ for $\beta_3$ (see Herrmann and Gasser, 1994). The required number of iterations will be even larger, if the errors have long memory (see Ray and Tsay, 1997). Beran (1999) introduced another inflation method $h_{I,j} = f(h_{j-1}) = c\,h_{j-1}^{\alpha}$, called the exponential inflation method (EIM), to reduce the number of iterations. It is easy to show that, in order to inflate $h_A$ to a given order, the required number of iterations by the EIM is much smaller than by the MIM (see Beran and Feng, 2002a for some examples). In the following the EIM with $c = 1$ will be used. The choices of $\alpha$ corresponding to $\beta_1$, $\beta_2$ and $\beta_3$ are:

1. $\alpha_1 = 1/2$,

2. $\alpha_2 = (2k+1)/(2k+5)$ and

3. $\alpha_3 = (2k+1)/(2k+3)$,

where $\alpha_1 < \alpha_2 < \alpha_3$ and $\alpha_3$ is the optimal choice of $\alpha$ (see Beran and Feng, 2002a,b).
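The correspondence between the two families of inflation factors can be checked with exact rational arithmetic. The sketch below is illustrative only and its function names are hypothetical. It assumes $l = k+2$ and takes the third EIM factor to be $(2k+1)/(2k+3)$, which is the value consistent with the factors $5/7$ for $k = 2$ and $9/11$ for $k = 4$ quoted in Section 4.1. Starting from a bandwidth of order $n^{-1/(2k+1)}$, both methods should produce a bandwidth $h_I$ of the same order.

```python
from fractions import Fraction

def mim_factors(k):
    """Inflation factors of the multiplied inflation method for l = k + 2."""
    return (
        Fraction(1, 4 * k + 2),
        Fraction(4, (2 * k + 1) * (2 * k + 5)),
        Fraction(2, (2 * k + 1) * (2 * k + 3)),
    )

def eim_factors(k):
    """Inflation factors of the exponential inflation method for l = k + 2."""
    return (
        Fraction(1, 2),
        Fraction(2 * k + 1, 2 * k + 5),
        Fraction(2 * k + 1, 2 * k + 3),  # inferred from 5/7 (k=2) and 9/11 (k=4)
    )

def inflated_orders(k):
    """Exponents e with h_I = O(n^{-e}) when h is of the order of h_A."""
    base = Fraction(1, 2 * k + 1)               # h_A = O(n^{-1/(2k+1)})
    mim = tuple(base - b for b in mim_factors(k))
    eim = tuple(a * base for a in eim_factors(k))
    return mim, eim
```

For $k = 2$ both tuples returned by `inflated_orders(2)` are $(1/10, 1/9, 1/7)$, i.e. the orders $n^{-1/(4k+2)}$, $n^{-1/(2l+1)}$ and $n^{-1/(k+l+1)}$ of Remarks 1 to 3.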

In the following we will propose a basic iterative plug-in algorithm for selecting the bandwidth in time series decomposition, which is defined for $k = 2$ and $k = 4$ separately.

i) Start with a possible bandwidth $h_0$;

ii) For $j = 1, 2, \ldots$ set $h_{I,j} = h_{j-1}^{\alpha}$ with $\alpha = \alpha_3 = 5/7$ for $k = 2$ and $\alpha = \alpha_2 = 9/13$ for $k = 4$. Calculate

$$h_j = \left( \frac{(k!)^2}{2k} \, \frac{\hat\sigma^2 \{R(K_p) + (s-1)R(K)\}}{\int \{\hat g^{(k)}(x; h_{I,j})\}^2 dx \, \{\int u^k K_p(u)\,du\}^2} \right)^{1/(2k+1)} n^{-1/(2k+1)}; \qquad (20)$$

iii) Increase $j$ by 1 and repeat Step ii) until convergence is reached at some $j_0$ and set $\hat h = h_{j_0}$.
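Steps i) to iii) amount to a fixed point iteration, which can be sketched as follows. This is a schematic version under stated assumptions and not the authors' implementation: the function names are hypothetical, `estimate_I` is a user-supplied callable returning an estimate of $I$ for a given bandwidth $h_I$, the kernel constants are passed in by the caller, and convergence is declared when two successive bandwidths differ by less than $1/n$.

```python
import math

def plug_in_bandwidth(n, I_hat, sigma2, s, k, R_Kp, R_K, mu_k):
    """Right hand side of (20) for a given estimate I_hat of I."""
    num = math.factorial(k) ** 2 * sigma2 * (R_Kp + (s - 1) * R_K)
    den = 2 * k * I_hat * mu_k**2
    return (num / den) ** (1.0 / (2 * k + 1)) * n ** (-1.0 / (2 * k + 1))

def iterative_plug_in(n, h0, estimate_I, sigma2, s, k, R_Kp, R_K, mu_k,
                      alpha=None, max_iter=50):
    """Return (h_hat, j0) from the exponential-inflation plug-in iteration."""
    if alpha is None:
        alpha = {2: 5.0 / 7.0, 4: 9.0 / 13.0}[k]   # factors proposed in the text
    h_prev = h0
    for j in range(1, max_iter + 1):
        h_I = h_prev**alpha                         # inflate h_{j-1} (c = 1)
        h_j = plug_in_bandwidth(n, estimate_I(h_I), sigma2, s, k, R_Kp, R_K, mu_k)
        if abs(h_j - h_prev) < 1.0 / n:             # assumed convergence rule
            return h_j, j
        h_prev = h_j
    raise RuntimeError("no convergence within max_iter iterations")
```

In practice `estimate_I` would wrap the estimator (17) and `sigma2` would come from (15). With a constant `estimate_I` the iteration stops after two steps, since $h_1$ already equals the plug-in value and $h_2$ merely repeats it.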

A closely related proposal for bandwidth selection in nonparametric regression with short memory, long memory and antipersistence is given by Beran and Feng (2002a). Other iterative plug-in procedures may be found in Gasser et al. (1991), Herrmann et al. (1992) and Ray and Tsay (1997).

Theoretically, $\alpha_3$ is the asymptotically optimal choice of $\alpha$. Our experience shows that, for $k = 2$, this choice works well in practice. Hence we choose $\alpha_3 = 5/7$ for $k = 2$. However, $\alpha_3 = 9/11$ for $k = 4$ is too close to one and for small samples the bandwidth could not be inflated correctly. For $k = 4$ it is hence proposed to use the slightly stronger inflation factor $\alpha_2$. Now, the variance of $\hat h$ with $k = 2$ and $k = 4$ is almost of the same order and $\hat h$ is hence stable in both cases (see Theorem 1 in the next subsection). The most stable inflation factor $\alpha_1 = 1/2$ by the EIM is too strong and does not work well for small samples.

4.2 Asymptotic behaviour

The iterative plug-in algorithm is motivated by fixed point search. Here the procedure is started with a bandwidth $h_0$ and stopped when a convergent output (a fixed point) is reached. The inflation process behind an iterative plug-in algorithm is described by the following lemma according to the relationship between $h_0$ and $h_A$.


Lemma 2 Under assumptions A2 and A3, an iterative plug-in algorithm proceeds as follows:

Case 1. Start with an $h_0 = o_p(h_A)$, then

Step 1. $h_j = O_p(h_{I,j})$, if $h_{I,j} = o_p(h_A)$;

Step 2. $h_j = O_p(h_A)$, if $h_{I,j} = O_p(h_A)$;

Step 3. $h_j = h_A[1 + o_p(1)]$, if $h_A = o_p(h_{I,j})$.

Case 2. Start with an $h_0$ such that $(h_0)^{-1} = o_p[(h_A)^{-1}]$, then

Step 1′. $h_j = O_p(h_A)$, if $h_{I,j} = O_p(1)$.

Step 2′. The same as Step 3 in Case 1.

The proof of Lemma 2 is given in the appendix. Related results may be found in Beran and Feng (2002a) (see also the description in Herrmann and Gasser, 1994, p. 8). Note in particular that A′1 does not apply to Lemma 2.

Remark 4. Case 1 in Lemma 2 shows that, by starting with a small bandwidth, $h_{j-1}$ will be inflated in the $j$-th iteration, if $h_{j-1} = o_p(h_A)$. This will be repeated until $h_{j_0} = O_p(h_A)$ is achieved in the $j_0$-th iteration. $h_{j_0+1}$ in the next iteration will then be a consistent bandwidth selector. Some further iterations are required to improve the finite sample properties of $\hat h$.

Remark 5. Case 2 in Lemma 2 shows how such an algorithm works, if a starting bandwidth $h_0$, which is at least of order $O_p(h_A)$, is used. On the one hand, if $h_0 = o_p(1)$, then $h_1$ is already consistent, since A′1 is satisfied. In this case Step 1′ will not appear. On the other hand, if $h_0 = O_p(1)$, then $h_1 = O_p(h_A)$, which is already of the correct order but not yet consistent. Now $h_2$ will be consistent. Again, some further iterations are needed to reduce the influence of $h_0$.

The following theorem holds for the algorithm proposed in Section 4.1.

Theorem 1 Under the assumptions of Lemma 2 we have

i) For $k = 2$ with $\alpha_3 = 5/7$,

$$\hat h = h_A \{1 + O(n^{-2/7}) + O_p(n^{-5/14})\}; \qquad (21)$$

ii) For $k = 4$ with $\alpha_2 = 9/13$,

$$\hat h = h_A \{1 + O(n^{-2/13}) + O_p(n^{-9/26})\}. \qquad (22)$$

A sketched proof of Theorem 1 is given in the appendix.

Remark 6. Let $h_M$ denote the optimal bandwidth, which minimizes the MASE. Theorem 1 also holds if $h_A$ on the right hand sides of (21) and (22) is replaced by $h_M$. This is due to the fact that $|h_M - h_A|/h_M = O(h_M^2)$ (see Beran et al., 2000 and Beran and Feng, 2002b), which is of order $O(n^{-2/5})$ for $k = 2$ and $O(n^{-2/9})$ for $k = 4$ and is hence negligible.

4.3 Computational aspects

This subsection deals with some computational aspects such as the decision on $j_0$, the choice of $h_0$ and so on. A more practical procedure will be proposed at the end of this subsection.

The estimators in Section 2.1 are defined with a fixed bandwidth $h$. In this case the number of observations used at $x_t$ decreases when $x_t$ moves from the interior to the boundary. To solve this problem the k-NN idea as proposed by Gasser et al. (1985) will be used. For a given $h$ we define a left bandwidth $h_l$ and a right one $h_r$ so that $h_l = h_r = h$ in the interior, $h_l = x_t$ at a left boundary point and $h_r = 1 - x_t$ at a right boundary point. $h_r$ (resp. $h_l$) at a boundary point is determined by $h_l + h_r = 2h$. The estimates at a boundary point are calculated similarly but with $h$ in (2) being replaced by $\max(h_l, h_r)$.

In our program only bandwidths $h \in [h_{\min}, h_{\max}]$ with $h_{\min} = s/n$ and $h_{\max} = 0.5 - 1/n$ will be considered, which includes practically all reasonable possibilities for $h$. Furthermore, two bandwidths $h$ and $h'$ will be considered to be the same, if $|h - h'| < 1/n$, because a difference of such an order is negligible for any bandwidth selector. In the program the actually used bandwidth is an integer $b = [nh + 0.5]$, which is the bandwidth w.r.t. the observation time $t$. Let $b_{I,j} = [nh_{I,j} + 0.5]$. Then we obtain a natural criterion for stopping the computing procedure, i.e. the procedure will be stopped, if $b_{I,j_0} = b_{I,j_0-1}$ in the $j_0$-th iteration. This implies $\hat I_{j_0} = \hat I_{j_0-1}$ and $\hat h = h_{j_0} = h_{j_0-1}$. Further iterations are not necessary. Note that even the $j_0$-th iteration is just a repetition of the $(j_0-1)$-th.

In the following the choice of $h_0$ will be considered. In most cases $h_0$ does not play any role. However, in some cases, when the finite sample MASE has more than one local minimum or when the MASE changes very slowly around its minimum, $\hat h$ may

$h_f$ is called a fixed point (of the procedure proposed in Section 4.1), if $\hat h = h_f$ when the procedure is started with $h_0 = h_f$ itself. A fixed point $h_f$ is called left stable, if for all $h_0 \le h_f$ in a neighbourhood of $h_f$ we have $\hat h = h_f$. A fixed point $h_f$ is called right stable, if for all $h_0 \ge h_f$ in a neighbourhood of $h_f$ we have $\hat h = h_f$. A fixed point $h_f$ is called stable, if it is both left and right stable. A fixed point is called unstable, if it is only achievable by starting with itself. An interval of bandwidths $[h_f^l, h_f^r]$ is called an interval of fixed points, if $h_f^l$ is a left stable fixed point, $h_f^r$ is a right stable fixed point and all points between them are unstable fixed points. Denote by $\hat h^l$ the bandwidth selected with $h_0^1 = h_{\min}$ and by $\hat h^r$ the bandwidth selected with $h_0^2 = h_{\max}$. Then $\hat h^l$ is a left stable fixed point, if $\hat h^l > h_{\min}$, and $\hat h^r$ is a right stable fixed point, if $\hat h^r < h_{\max}$.

When the finite sample MASE has only one minimum, then there exists a unique stable fixed point or a unique interval of fixed points. In the first case we will obtain the same selected bandwidth $\hat h$ by starting with any $h_0$. In the second case we have $\hat h = h_f^l$ for all $h_0 \le h_f^l$, $\hat h = h_f^r$ for all $h_0 \ge h_f^r$ and $\hat h = h_0$ for $h_f^l < h_0 < h_f^r$. Now all bandwidths in $[h_f^l, h_f^r]$ are reasonable to be used as the optimal bandwidth, since the change of the MASE over $[h_f^l, h_f^r]$ is negligible. In this case we also say that the result is unique and set $\hat h := (\hat h^l + \hat h^r)/2$. In the following the words "a stable fixed point" sometimes also mean an interval of fixed points. In the case when the finite sample MASE has more than one local minimum, we may obtain different $\hat h$ by starting with different $h_0$. Now, there may also be some unstable fixed points corresponding to a local maximum between two local minima. If this is the case, we should find all possible stable fixed points and then select one of them as the optimal bandwidth by analyzing the smoothing results.

An S-Plus function called DeSeaTS (Decomposing Seasonal Time Series) is developed based on the following quasi-data-driven procedure.

1. Carry out the algorithm in Section 4.1 twice with $h_0^1 = h_{\min}$ and $h_0^2 = h_{\max}$, respectively.

2. Calculate the decomposition results automatically, if $\hat h$ is unique.

3. Show detailed information about all stable fixed points, when $\hat h$ is not unique.

If 3 occurs, further subjective analysis is required.


respectively. If the smoothing results with $p = 1$ and $p = 3$ are both satisfactory, we can choose either $p = 1$ or $p = 3$. However, it is preferable to use $p = 3$, since the selected bandwidth is then in general slightly larger, which does not increase the bias of $\hat g$ but will improve $\hat S$. Sometimes one $p$ is more reasonable than the other; then the more reasonable one should be chosen (see the examples given in the next section). An objective criterion for choosing $p$ is not given here, because we do not have an estimate of the MASE at the end of the procedure.

5 Practical performance

In the following, the practical performance of the proposed procedure will be investigated by simulated and data examples. Here the bisquare kernel is used as weight function.

5.1 Simulated examples

At first some simulated examples will be analyzed. Here the trend function

$$g(x) = 2\sin(2\pi(x - 0.5)) + 2x + 4\exp(-100(x - 0.5)^2) + 6$$

is used, where $x \in [0, 1]$. Figures 1a to b show two simulated time series of length $n = 200$, called Sim1 and Sim2, generated with iid $N(0, 1)$ errors, where the exactly periodic seasonal component $S_1 = \{1.5, -1.2, -0.8, 0.5\}$ with $s = 4$ is used. The selected bandwidths $\hat h^l$ and $\hat h^r$ with $p = 1$ and $p = 3$ respectively are given in Table 1 together with the corresponding $j_0$'s, the value $d := n(\hat h^r - \hat h^l)$ and the answer to the question whether $\hat h$ is unique. We see that $\hat h^l$ and $\hat h^r$ are exactly the same in all cases. This was also the case for many other simulated examples we have done.

Table 1: $\hat h^l$, $\hat h^r$ and other parameters for the simulated examples

| Time series | $\hat h^l$ ($p=1$) | $j_0$ | $\hat h^r$ ($p=1$) | $j_0$ | $d$ | uniq. | $\hat h^l$ ($p=3$) | $j_0$ | $\hat h^r$ ($p=3$) | $j_0$ | $d$ | uniq. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Sim1&3 | 0.139 | 7 | 0.139 | 5 | 0 | Yes | 0.142 | 7 | 0.142 | 6 | 0 | Yes |
| Sim2&4 | 0.141 | 6 | 0.141 | 6 | 0 | Yes | 0.163 | 11 | 0.163 | 6 | 0 | Yes |
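A series of the same type as Sim1 can be generated along the following lines. This is a sketch assuming NumPy; the function names and the seed are hypothetical, so the draws will not reproduce the series analyzed in the paper.

```python
import numpy as np

def trend(x):
    """Trend function g(x) of the simulated examples on [0, 1]."""
    x = np.asarray(x, dtype=float)
    return (2.0 * np.sin(2.0 * np.pi * (x - 0.5)) + 2.0 * x
            + 4.0 * np.exp(-100.0 * (x - 0.5) ** 2) + 6.0)

def simulate(n=200, season=(1.5, -1.2, -0.8, 0.5), sigma=1.0, seed=0):
    """Return (x, g, S, y) for model (1) with iid N(0, sigma^2) errors."""
    rng = np.random.default_rng(seed)
    t = np.arange(1, n + 1)
    x = (t - 0.5) / n
    g = trend(x)
    S = np.asarray(season, dtype=float)[(t - 1) % len(season)]  # periodic component
    y = g + S + rng.normal(0.0, sigma, size=n)
    return x, g, S, y
```

Calling `simulate(season=(0.3, -0.5, 0.9, -0.7))` with the same seed keeps the errors fixed and changes only the seasonal component.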


corresponding errors as by Sim1 and Sim2 respectively, but another seasonal component $S_2 = \{0.3, -0.5, 0.9, -0.7\}$, are shown in Figures 1c to d. These examples show that a change of the seasonal component does not change the selected bandwidth. The selected bandwidths, other parameters and even the detailed information in each iteration for Sim3 and Sim4 are exactly the same as for Sim1 and Sim2 respectively.

Data-driven decomposition results with $p = 3$ for the simulated time series are shown in Figures 1a to d, where the data are given in the upper part of each figure (solid line) together with the underlying trend function (dotted line) and the estimated trend (dashed line). The estimated seasonal component is plotted in the lower part of each figure. From Figure 1 we see that the proposed procedure works well. $\hat g$ is satisfactory in all cases. Although at first glance we are not sure whether these time series are seasonal or not, especially for Sim3 and Sim4, the seasonal component is well discovered from the data. It is clear that $S_1$ is easier to estimate than $S_2$, since $\mathrm{var}(\hat S_1) = \mathrm{var}(\hat S_2)$ but the absolute values of $S_1$ are relatively larger than those of $S_2$. Note that the selected bandwidth under model (1) for a simulated time series with given errors will be the same, even if the seasonal component is set to zero. In that case $\hat S$ is fully due to the noise in the data.

5.2 Data examples

The following data examples are chosen to illustrate the practical performance of the proposal in common cases and to show some details.

1. Time Series "CAPE": time series of the quarterly final consumption expenditure in Australia (total private, millions of dollars, 1989/90 prices) from September 1959 to June 1995 with $n = 144$. Source: Australian Bureau of Statistics.

2. Time Series "Strom": the monthly time series of produced electricity in Germany from 1955 to 1979 with $n = 300$. Source: Schlittgen and Streitberg (1994, p. 82).

3. Time Series "IFOR": the monthly time series of the indices of the foreign orders received in Germany from 1978 to 1994 (1985 = 100) with $n = 204$. Source: IFO-Institute for Economic Research in Munich.

4. Time Series "Hsales": monthly sales of new one-family houses sold in the USA from

[Figure 1, panels (a) to (d): the first, second, third and fourth simulated time series, each plotted against the observation index 0 to 200.]

Figure 1: Optimal decomposition results for the simulated time series. Upper: the data together with the underlying trend (points) and the estimated trend (dashes). Below: the estimated seasonal component.

and Hyndman (1998).

All of these time series are analyzed with $p = 1$ and $p = 3$ respectively. Table 2 shows the same parameters for these data examples as those given in Table 1. From Table 2 we see that the results in most of the cases are unique. For the series Hsales with $p = 3$ we obtained an interval of fixed points, $[0.094, 0.105]$. As mentioned before, we will consider such a result to be unique and now $\hat h = (0.105 + 0.094)/2 = 0.10$ will be used. Two unusual cases are: firstly, the selected bandwidths for the series IFOR with $p = 1$ are not unique; secondly, although the selected bandwidth for the series Strom with $p = 3$ is unique, it is much smaller than that selected for the same series with $p = 1$. These two cases will be discussed in the next subsection in detail.

Table 2: $\hat h^l$, $\hat h^r$ and other parameters for the data examples

| Time series | $\hat h^l$ ($p=1$) | $j_0$ | $\hat h^r$ ($p=1$) | $j_0$ | $d$ | uniq. | $\hat h^l$ ($p=3$) | $j_0$ | $\hat h^r$ ($p=3$) | $j_0$ | $d$ | uniq. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CAPE | 0.084 | 7 | 0.086 | 6 | 0.288 | Yes | 0.089 | 6 | 0.089 | 8 | 0 | Yes |
| Strom | 0.160 | 7 | 0.160 | 7 | 0 | Yes | 0.101 | 7 | 0.102 | 13 | 0.300 | Yes |
| IFOR | 0.113 | 6 | 0.262 | 3 | 30.40 | No | 0.140 | 7 | 0.141 | 6 | 0.204 | Yes |
| Hsales | 0.066 | 4 | 0.067 | 8 | 0.257 | Yes | 0.094 | 7 | 0.105 | 4 | 3.025 | Int |

As explained in the last section, the use of $p = 3$ is preferable. But for the series Strom $p = 1$ should be used (see the next subsection for the reason). Hence $p = 1$ for the series Strom and $p = 3$ for the others is chosen. Data-driven decomposition results for these examples are shown in Figures 2a through d, where corresponding location changes are introduced for the seasonal component so that the figures look clearer. We see that the results given in Figure 2 look quite good. This shows the practical usefulness of the proposed procedure. Note that the selected bandwidths for the examples given in Figures 2a to d are quite different and adapt automatically to the structure of the data. The largest is $\hat h = 0.16$ for the series Strom. This is not surprising, because the trend in this time series can almost be modelled by a parametric model (see Schlittgen and Streitberg, 1994). Although the trend in the time series CAPE is also regular, the selected bandwidth $\hat h = 0.089$ is nevertheless the smallest one, since $s = 4$ for this time series but $s = 12$ for the others. Tables 1 and 2 also show that $j_0$ changes from case to case.

[Figure 2, panels (a) to (d): the time series CAPE, Strom, IFOR and Hsales, each plotted against the observation index.]

Figure 2: Optimal decomposition results for the data examples. Upper: the data together with the estimated trend (dashes). Below: the estimated seasonal component.

Following Lemma 2 we have $\hat h^l \le h_A \le \hat h^r$ in probability. From Tables 1 and 2 we see that this is true for all the examples. Lemma 2 also ensures that, in probability, $h_j$ is nondecreasing in $j$ when starting with $h_0^1$ and nonincreasing in $j$ when starting with $h_0^2$. The detailed search processes with starting bandwidths $h_0^1$ and $h_0^2$ respectively are shown in Figure 3, where the results are for the time series Strom with $p = 1$ (solid line) and CAPE with $p = 3$ (dashed line). These two examples are chosen so that the iterative plug-in algorithm can be well understood. From Figure 3 we can find an interesting phenomenon, i.e. although the selected bandwidth for Strom with $p = 1$ is much larger than that for CAPE with $p = 3$, $h_1$ with $h_0^2$ in the second case was even slightly larger than that in the first case. However, after some iterations both of them arrived at the corresponding fixed points.

[Figure 3: Search processes for the series Strom with p=1 and CAPE with p=3. Horizontal axis: number of iterations (1 to 8); vertical axis ticks from 0.05 to 0.20.]

Figure 3: The search processes for the time series Strom with $p = 1$ (solid line) and CAPE with $p = 3$ (dashed line), where results with $h_0^1 = h_{\min}$ are marked by the letter "l" and those with $h_0^2 = h_{\max}$ are marked by the letter "u".

Härdle et al. (1988) proposed to estimate the MASE on $x \in [\Delta, 1-\Delta]$ to avoid the boundary effect of a kernel estimator, where $\Delta > 0$ is a small positive number. In our proposal above no $\Delta$ (or $\Delta = 0$) is used, since local polynomial fitting has automatic boundary correction. However, if e.g. there are some outliers or there is a structural change in the boundary area, then the estimate $\hat g^{(k)}$ depends strongly on the value of $\Delta$. Hence we will use this idea as a diagnostic tool in order to see whether the selected bandwidth is susceptible to observations in the boundary area. The susceptibility of $\hat h$ to $\Delta$ is a signal,

(19)

not work well for the given data set. As examples, bandwidths

^

h l

selected for the two

time seriesStrom and IFORwith h 1

0

, p=1and 3aswell as=0:00; 0:02; ; 0:10 are

given in Table 3. Note that if a 6=0 isused, corresponding formulae given in Sections

2to 4 shouldbeadapted.

From Table 3 we see that the selected bandwidths in the two unusual cases, i.e. the series Strom with $p = 3$ and the series IFOR with $p = 1$, are very susceptible to boundary observations. The time series Strom seems to have some outliers at the right boundary. For different $\Delta$, the influence of these outliers on $\hat g^{(4)}$ is quite different. This influence is however not so clear if $p = 1$ is used. Hence $p = 1$ should be chosen for smoothing this time series. For the time series IFOR, we see that there is a generally increasing trend until about the 145-th observation. The trend thereafter is more complex than before. This seems to be a structural change, which will be smoothed away if $p = 1$ with a positive $\Delta$ is used. The selected bandwidth $\hat h_l$ with $\Delta = 0.10$ is about the same as $\hat h_r$ given in Table 2. Note that if $\Delta$ changes from 0.06 to 0.08, the selected bandwidth is almost doubled. The structure of this time series can however be well fitted if $p = 3$ is used. For the other two time series, especially for the series CAPE, both $p = 1$ and $p = 3$ perform well.

Table 3: $\hat h_l$ selected for Strom and IFOR with different $\Delta$

Time series   p   $\Delta$ = 0.00   0.02    0.04    0.06    0.08    0.10
Strom         1   0.160             0.161   0.163   0.163   0.163   0.163
              3   0.101             0.106   0.116   0.152   0.157   0.160
IFOR          1   0.113             0.118   0.128   0.141   0.259   0.256
              3   0.140             0.140   0.140   0.139   0.139   0.137
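The boundary diagnostic can be summarised numerically. The sketch below uses only the bandwidths reported in Table 3 and measures, for each series and each $p$, how much the selected bandwidth varies over the boundary parameter. The ratio of largest to smallest bandwidth and the cut-off `THRESHOLD` are hypothetical choices made for illustration; they are not part of the proposal.

```python
# Selected bandwidths copied from Table 3, one value per boundary parameter.
boundary_values = [0.00, 0.02, 0.04, 0.06, 0.08, 0.10]
table3 = {
    ("Strom", 1): [0.160, 0.161, 0.163, 0.163, 0.163, 0.163],
    ("Strom", 3): [0.101, 0.106, 0.116, 0.152, 0.157, 0.160],
    ("IFOR", 1): [0.113, 0.118, 0.128, 0.141, 0.259, 0.256],
    ("IFOR", 3): [0.140, 0.140, 0.140, 0.139, 0.139, 0.137],
}


def susceptibility(h_values):
    """Ratio of the largest to the smallest selected bandwidth."""
    return max(h_values) / min(h_values)


THRESHOLD = 1.25  # hypothetical cut-off, not taken from the paper

flags = {key: susceptibility(h) > THRESHOLD for key, h in table3.items()}
for (series, p), h in table3.items():
    print(f"{series:5s} p={p}: ratio={susceptibility(h):.2f} "
          f"flagged={flags[(series, p)]}")
```

With this cut-off only Strom with $p = 3$ and IFOR with $p = 1$ are flagged, in line with the discussion above. A single summary number cannot replace looking at the decomposition itself, so such a flag should be read as a prompt for closer inspection only.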

In the following we give an example of a subjective choice among more than one fixed point obtained at the end of the procedure. Suppose that we would like to deal with the time series IFOR using $p = 1$. Then two stable fixed points, i.e. $h_f^1 = \hat h_l = 0.113$ and $h_f^2 = \hat h_r = 0.264$ as given in Table 2, will be found. Smoothing results for IFOR with $p = 1$ and these two bandwidths respectively are shown in Figure 4. We see that the results with $\hat h_r$ are clearly oversmoothed due to the reason mentioned above. Hence, for $p = 1$, $\hat h_l$ should be chosen as the optimal bandwidth. The smoothing results with $\hat h_r$ also provide us with some useful information about this data set.

[Figure 4 plots: (a) "Decomposition results for IFOR with p=1 and h0=hmin"; (b) "Decomposition results for IFOR with p=1 and h0=hmax".]

Figure 4: Decomposition results for IFOR with $p = 1$ and different starting bandwidths. (a): results with $h_0^1$. (b): results with $h_0^2$. The curves are drawn similarly as in Figure 2.

6 Final remarks

In this paper an iterative plug-in algorithm for decomposing seasonal time series is developed. Simulated and data examples show that the proposed bandwidth selector performs well in practice. This proposal can also be applied to equidistant nonparametric regression without seasonality. In order to investigate the practical performance of the proposal in detail, a simulation study is required. This is however beyond the aim of this paper and will be carried out elsewhere.

[...] conference: "Should we choose a bandwidth subjectively or objectively?" Here we propose to carry out the procedure at least twice, with $p = 1$ and $p = 3$. In cases when the results with $p = 1$ and $p = 3$ are both satisfactory, we can use the results with either $p$, e.g. those with $p = 3$. In this case the result may be considered to be objective. Sometimes, however, we have to choose $p$ subjectively by means of our experience and a more detailed analysis as shown in the last section. Furthermore, if there exists more than one stable fixed point for the $p$ we would like to use, a further subjective choice is also required. Note that both the diagnosis at the boundary and the analysis of the smoothing results with a stable fixed point corresponding to a local minimum will provide us with more useful information about the structure of our data set. Hence, we recommend that an experienced data analyst use the proposal here in different ways.

The proposed bandwidth selector is motivated by optimizing $\hat m = \hat g + \hat S$. Similarly, we can develop an iterative plug-in algorithm for optimizing $\hat g$. Sometimes selection of an optimal bandwidth for estimating $S$ is also of interest. However, it is senseless to develop such a procedure under the assumption A3. If A3 holds, then a preferable procedure is: 1. to estimate $g$ with a corresponding optimal bandwidth and 2. to estimate $S$ from the residuals $y_i - \hat g(x_i)$ parametrically, i.e. with the seasonal means.
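Step 2 of this two-step procedure can be sketched as follows. The function name `seasonal_means` and the toy data are assumptions made for illustration, and the exact trend is used as a stand-in for a nonparametric estimate $\hat g$, so the sketch shows only the averaging of residuals by season.

```python
import numpy as np


def seasonal_means(residuals: np.ndarray, s: int) -> np.ndarray:
    """Average the residuals y_i - g_hat(x_i) separately for each season.

    Returns an array as long as `residuals` that repeats the s seasonal means.
    """
    residuals = np.asarray(residuals, dtype=float)
    season = np.arange(len(residuals)) % s        # season index of each observation
    means = np.array([residuals[season == k].mean() for k in range(s)])
    return means[season]


# Toy data: linear trend plus an exactly periodic component with period s = 4.
n, s = 40, 4
x = (np.arange(1, n + 1) - 0.5) / n               # equidistant design x_t = (t - 0.5)/n
g = 2.0 + 3.0 * x
pattern = np.array([1.0, -0.5, 0.25, -0.75])      # seasonal figures, summing to zero
y = g + np.tile(pattern, n // s)

g_hat = g                                         # stand-in for an estimated trend
S_hat = seasonal_means(y - g_hat, s)
```

Because the seasonal component in this toy example is exactly periodic and the trend is removed exactly, the seasonal means reproduce `pattern`. With an estimated trend the means would additionally absorb the average estimation error of $\hat g$ within each season.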

In model (1) it is assumed that the errors are iid for simplicity. This is an impractical assumption, in particular for analyzing a time series. Recent works on iterative plug-in bandwidth selection in nonparametric regression with dependent errors (see Herrmann et al., 1992, Ray and Tsay, 1997 and Beran and Feng, 2002a,b) show that it is not difficult to develop a data-driven procedure for model (1) with a general stationary time series error process.

Acknowledgements: This work was finished under the advice of Prof. Jan Beran, University of Konstanz, Germany, and was financially supported by the Center of Finance and Econometrics (CoFE) at the University of Konstanz. Some basic results used here are obtained in the author's PhD thesis, which was finished under the advice of Prof. Siegfried Heiler, University of Konstanz. The data for the time series CAPE and Hsales are downloaded from the Time Series Data Library. We would like to thank Prof. Rob J. Hyndman, Monash University, for making these data publicly available.

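A sketched proof of Lemma 1: The proof of this lemma is based on some desirable standardizing and orthogonal finite sample properties of $\hat g$ and $\hat S$. These properties are quantified by the following properties of $w_1$ and $w_2$.

$$
\begin{aligned}
\text{a:}\quad & \sum_{i=1}^{n} w_{1i}(x_i - x_t)^j =
\begin{cases} 1, & j = 0,\\ 0, & 1 \le j \le p, \end{cases}\\[4pt]
\text{a}':\quad &
\begin{cases}
\sum_{i=1}^{n} w_{1i}\cos(\lambda_j(i-t)) = 0,\\
\sum_{i=1}^{n} w_{1i}\sin(\lambda_j(i-t)) = 0,
\end{cases}
\qquad j = 1, \ldots, q,\\[4pt]
\text{b:}\quad & \sum_{i=1}^{n} w_{2i}(x_i - x_t)^j = 0, \qquad 0 \le j \le p,\\[4pt]
\text{b}':\quad &
\begin{cases}
\sum_{i=1}^{n} w_{2i}\cos(\lambda_j(i-t)) = 1,\\
\sum_{i=1}^{n} w_{2i}\sin(\lambda_j(i-t)) = 0,
\end{cases}
\qquad j = 1, \ldots, q.
\end{aligned}
\tag{A.1}
$$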

Note that $w_1' = (e_1', 0')(X'KX)^{-1}X'K$ and $w_2' = (0', \iota_s')(X'KX)^{-1}X'K$. Hence we have $w_1'X = (e_1', 0')$ and $w_2'X = (0', \iota_s')$. Observing the definition of $e_1$ and $\iota_s$ we obtain the results in (A.1). Note that (A.1) ensures that $\hat g$, $\hat S$ and hence $\hat m$ are exactly unbiased, if $m$ is the sum of a polynomial of order no larger than $p$ and an exactly periodic component $S$ with period $s$.
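The finite sample properties in (A.1) can be checked numerically. The following sketch builds a design matrix with polynomial and trigonometric columns, forms the two weight vectors, and verifies the properties together with the exact reproduction of a linear trend plus a periodic component. Several choices are assumptions made for the illustration and are not taken from the paper: the seasonal frequencies $2\pi j/s$, the bisquare kernel, the column ordering of `X`, and all variable names.

```python
import numpy as np

n, s, p, q = 120, 12, 1, 5            # q < s/2 keeps every sine column non-zero
t, h = 60, 0.3                        # estimation point (1-based index) and bandwidth
i = np.arange(1, n + 1)
x = (i - 0.5) / n
u = (x - x[t - 1]) / h

lam = 2 * np.pi * np.arange(1, q + 1) / s             # assumed seasonal frequencies
poly = np.column_stack([(x - x[t - 1]) ** j for j in range(p + 1)])
cos_cols = np.cos(np.outer(i - t, lam))
sin_cols = np.sin(np.outer(i - t, lam))
X = np.hstack([poly, cos_cols, sin_cols])             # n x (p + 1 + 2q)

k = np.where(np.abs(u) <= 1, (1 - u ** 2) ** 2, 0.0)  # bisquare kernel weights (assumption)
XtK = X.T * k                                         # X'K without forming diag(k)
A = np.linalg.solve(XtK @ X, XtK)                     # (X'KX)^{-1} X'K

c1 = np.zeros(X.shape[1])
c1[0] = 1.0                                           # selects the intercept (trend)
c2 = np.zeros(X.shape[1])
c2[p + 1:p + 1 + q] = 1.0                             # selects the cosine coefficients
w1, w2 = c1 @ A, c2 @ A                               # trend and seasonal weights

# A linear trend plus an exactly periodic component is reproduced exactly.
g = 1.0 + 2.0 * x
S = 0.7 * np.cos(lam[0] * (i - t)) - 0.4 * np.sin(lam[1] * (i - t))
y = g + S
g_hat_t, S_hat_t = w1 @ y, w2 @ y
```

Here `w1 @ X` equals `c1` and `w2 @ X` equals `c2` up to rounding, which are properties a, a', b and b' written in matrix form. Consequently `g_hat_t` equals $g(x_t)$ and `S_hat_t` equals $S(x_t) = 0.7$ for this toy series.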

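1. Under A2, A3 we have, in the neighbourhood of $x_t$,

$$
g(x) = \sum_{j=0}^{p} \frac{g^{(j)}(x_t)}{j!}(x - x_t)^j + \frac{g^{(k)}(x_t + \theta(x - x_t))}{k!}(x - x_t)^k, \tag{A.2}
$$

where $0 < \theta < 1$, and

$$
S(x_i) = \sum_{j=1}^{q} \left( \beta_{2j}\cos\lambda_j(i-t) + \beta_{3j}\sin\lambda_j(i-t) \right). \tag{A.3}
$$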

This leads to $S(x_t) = \sum_{j=1}^{q} \beta_{2j}$. Following a$'$, we have

$$
B[\hat g(x_t)] = \sum_{i=1}^{n} w_{1i}[g(x_i) + S(x_i)] - g(x_t) = \sum_{i=1}^{n} w_{1i}\, g(x_i) - g(x_t), \tag{A.4}
$$

since $\sum_{i=1}^{n} w_{1i} S(x_i) = 0$. For $B(\hat S)$ we have

$$
B[\hat S(x_t)] = \sum_{i=1}^{n} w_{2i}[g(x_i) + S(x_i)] - S(x_t) = \sum_{i=1}^{n} w_{2i}\, g(x_i), \tag{A.5}
$$

since $\sum_{i=1}^{n} w_{2i} S(x_i) = \sum_{j=1}^{q} \beta_{2j} = S(x_t)$ following b$'$ and (A.3). Property b results in

$$
\sum_{i=1}^{n} w_{2i} \left\{ \sum_{j=0}^{p} \frac{g^{(j)}(x_t)}{j!}(x_i - x_t)^j \right\} = 0. \tag{A.6}
$$

Hence

$$
\begin{aligned}
B[\hat S(x_t)] &= \sum_{i=1}^{n} w_{2i} \frac{g^{(k)}(x_t + \theta(x_i - x_t))}{k!}(x_i - x_t)^k \\
&\doteq \frac{g^{(k)}(x_t)}{k!}\, h^k \sum_{i=1}^{n} w_{2i}\left(\frac{x_i - x_t}{h}\right)^k = o(h^k),
\end{aligned}
\tag{A.7}
$$

where the last equation is due to the fact

$$
\sum_{i=1}^{n} w_{2i}\left(\frac{x_i - x_t}{h}\right)^{k'} = o(1), \quad \text{for any } k' \ge 0. \tag{A.8}
$$

Equation (A.8) holds, since the weights $w_{2i}$ are asymptotically periodic (see (7)). This shows that $B(\hat S)$ is only due to the $k$-th order term in the Taylor expansion of $g$, and that the contribution of this term to $B(\hat S)$ is negligible compared with $B(\hat g)$. We obtain

$$
B[\hat S(x_t)] = o(B[\hat g(x_t)])
$$

and

$$
B[\hat m(x_t)] \doteq B[\hat g(x_t)].
$$

Observing that $B(\hat g)$ is the same as for a local polynomial fitting of order $p$, we obtain (9).

2. A detailed proof of (10) may be found in Feng (1999), where it is shown in particular that the two weighting systems $w_1$ and $w_2$ are asymptotically orthogonal in the sense that $\sum_{i=1}^{n} w_{1i} w_{2i} = o(\sum_{i=1}^{n} w_{1i}^2) = o(\sum_{i=1}^{n} w_{2i}^2)$. This follows from (A.8), since $K_p(u)$ is a polynomial kernel.

3. Formula (11) follows from (9) and (10). Lemma 1 is proved. $\Diamond$

In the following, it will be explained why Lemma 2 and Theorem 1 should hold. Detailed proofs are omitted, since these results are similar to those in nonparametric regression without seasonality.

A sketched proof of Lemma 2: Case 1. Note that the two terms on the right hand side of (18) are due to the contribution of $B(\hat g^{(k)})$ and $\mathrm{var}(\hat g^{(k)})$ (see e.g. the proof of Proposition [...] this case $B(\hat g^{(k)})$ is negligible and $\hat I$ is dominated by $\mathrm{var}(\hat g^{(k)})$, which tends to infinity as $n \to \infty$. Observing that $w_i^k = O[(n h_{I,j}^{k+1})^{-1}]$, we have $\mathrm{var}(\hat g^{(k)}) = O(n^{-1} h_{I,j}^{-(2k+1)})$ and hence $\hat I = O_p(n^{-1} h_{I,j}^{-(2k+1)})$. Inserting this in the formula for $h_j$ we obtain $h_j = O_p(h_{I,j})$, i.e. in this case $h_{j-1}$ is inflated to a bandwidth of order $O_p(h_{I,j})$. Step 1 is proved. Results in Steps 2 and 3 are clear.
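The order calculation in Case 1 can be made concrete with a short numerical check. It assumes the generic plug-in form $h = (c/(n\hat I))^{1/(2k+1)}$, where the constant `c` and the function name `plug_in_bandwidth` are hypothetical, and it replaces the order $O_p(n^{-1} h_I^{-(2k+1)})$ of $\hat I$ by exactly that quantity.

```python
def plug_in_bandwidth(n: int, I_hat: float, k: int, c: float = 1.0) -> float:
    """Generic plug-in form h = (c / (n * I_hat))**(1 / (2k + 1)).

    The constant c is a hypothetical placeholder for the kernel- and
    variance-dependent factor of the asymptotically optimal bandwidth.
    """
    return (c / (n * I_hat)) ** (1.0 / (2 * k + 1))


k = 2
ratios = []
for n in (10**3, 10**4, 10**5):
    h_I = n ** (-0.5)                       # a bandwidth that is far too small
    I_hat = h_I ** (-(2 * k + 1)) / n       # variance-dominated order n^{-1} h_I^{-(2k+1)}
    ratios.append(plug_in_bandwidth(n, I_hat, k) / h_I)
```

Each entry of `ratios` equals 1 up to rounding: under the stated assumptions the new bandwidth is of exactly the order of the bandwidth used for estimating $I$, whatever $n$ is. This is the sense in which a too small $h_{j-1}$ is inflated to the order of $h_{I,j}$.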

Case 2. Note that Step 1$'$ will not appear if $h_0$ is of a larger order than $h_A$ such that $h_0 \to 0$, since now A$'$1 is satisfied in the first iteration. In this case $h_1$ is already consistent and only Step 2$'$ will appear. Step 1$'$ occurs only if $0 < h_0 < 0.5$ is taken to be a constant. Now $B(\hat I_1)$ is a constant and hence $\hat I_1 = O_p(1) = O_p(I)$. We then obtain $h_1 = O_p(h_A)$, which is of the correct order but not yet consistent. The process will then be changed into Step 2$'$ in the second iteration. Lemma 2 is proved. $\Diamond$

Remark A1. Theoretically, if the procedure is started with an $h_0$ such that $h_A = o(h_0)$ and $h_0 \to 0$ as $n \to \infty$, then $h_1$ will already be consistent. Hence such a starting bandwidth is asymptotically preferable. Now the asymptotic behaviour of an iterative plug-in bandwidth selector is easy to understand. If the sample size is small and the data have a special structure, a too large starting bandwidth, e.g. $h_0^2 = h_{\max}$, may perhaps lead to $\hat I_1 \doteq 0$. Now $h_j$ could not be deflated to the optimal bandwidth. In the applications we have not yet found such a phenomenon. If this occurs, it is no problem for our proposal, because it will be discovered by starting with the other bandwidth $h_0^1$.

A sketched proof of Theorem 1: The proof of Theorem 1 can be carried out based on a formula given in the appendix in Beran and Feng (2002a). See also Beran and Feng (2002b). They showed that, when convergence is reached, the rate of convergence of an iterative plug-in bandwidth selector is quantified by:

$$
(\hat h - h_A)/h_A \doteq -\frac{1}{2k+1-2\delta}\, I^{-1}(\hat I - I). \tag{A.9}
$$

Equation (A.9) shows that $B(\hat h)$ and $\mathrm{var}(\hat h)$ at the end of the proposed procedure are of the corresponding orders as those of $\hat I$. $\mathrm{var}(\hat h)$ is dominated by the second term in (19) of order $O(n^{-1} h_I^{-4k-1})$, where $h_I$ denotes the bandwidth for estimating $I$ used at the end of the procedure, which is of order $O_p(n^{-1/7})$ for $k = 2$ and $O_p(n^{-1/13})$ for $k = 4$. In both cases, i.e. $k = 2$ with $\alpha_3$ and $k = 4$ with $\alpha_2$, the order of the second term on the right hand side of (18) is no larger than that of the first. Hence we have $B(\hat h) = O[B(\hat I)] = O(h_I^2)$. Straightforward calculation leads to the results of Theorem 1. $\Diamond$

References

Beran, J. (1999). SEMIFAR models – A semiparametric framework for modelling trends, long range dependence and nonstationarity. Discussion paper No. 99/16, Center of Finance and Econometrics (CoFE), University of Konstanz.

Beran, J. and Feng, Y. (2002a). Local polynomial fitting with long-memory, short-memory and antipersistent errors. The Annals of the Institute of Statistical Mathematics (in press).

Beran, J. and Feng, Y. (2002b). Iterative plug-in algorithms for SEMIFAR models - definition, convergence and asymptotic properties. To appear in Journal of Computational and Graphical Statistics.

Beran, J., Feng, Y. and Heiler, S. (2000). Modifying the double smoothing bandwidth selection in nonparametric regression. Discussion Paper, CoFE, No. 00/37, University of Konstanz. Submitted.

Cleveland, W.S. (1979). Robust Locally Weighted Regression and Smoothing Scatterplots. J. Amer. Statist. Assoc. 74, No. 36, 829–836.

Fan, J. and Gijbels, I. (1996). Local Polynomial Modeling and its Applications. Chapman & Hall, London.

Feng, Y. (1999). Kernel- and Locally Weighted Regression – with Application to Time Series Decomposition. Verlag für Wissenschaft und Forschung, Berlin.

Feng, Y. and Heiler, S. (2000). Eine robuste datengesteuerte Version des Berliner Verfahrens (in German). Wirtschaft und Statistik, 10/2000, 786–795.

Gasser, T., Kneip, A. and Köhler, W. (1991). A flexible and fast method for automatic smoothing. J. Amer. Statist. Assoc. 86, 643–652.

Gasser, T., Müller, H.G. and Mammitzsch, V. (1985). Kernels for nonparametric curve estimation. J. Roy. Statist. Soc. Ser. B 47, 238–252.

Gasser, T., Sroka, L. and Jennen-Steinmetz, C. (1986). Residual Variance and Residual Pattern in Nonlinear Regression. Biometrika 73, 625–33.

Härdle, W., Hall, P. and Marron, J.S. (1988). How far are automatically chosen regression smoothing parameters from their optimum? (with discussion). J. Amer. Statist. Assoc. 83, 86–99.

Hall, P., Kay, J.W. and Titterington, D.M. (1990). Asymptotically Optimal Difference-based Estimation of Variance in Nonparametric Regression. Biometrika 77, 521–8.

Heiler, S. (1966). Analyse der Struktur Wirtschaftlicher Prozesse durch Zerlegung von Zeitreihen. Dissertation, Tübingen.

Heiler, S. (1970). Theoretische Grundlagen des "Berliner Verfahrens". In Wetzel, W. (Ed.): Neuere Entwicklungen auf dem Gebiet der Zeitreihenanalyse, Sonderheft 1 zum Allg. Statistischen Archiv, 67–93.

Heiler, S. and Feng, Y. (1996). Datengesteuerte Zerlegung saisonaler Zeitreihen. IFO-Studien 3/1996, 337–369.

Heiler, S. and Feng, Y. (2000). Data-driven decomposition of seasonal time series. Journal of Statistical Planning and Inference 91, 351–363.

Heiler, S. and Michels, P. (1994). Deskriptive und Explorative Datenanalyse. Oldenbourg-Verlag, München.

Herrmann, E. and Gasser, T. (1994). Iterative plug-in algorithm for bandwidth selection in kernel regression estimation. Preprint, Darmstadt Institute of Technology and University of Zurich.

Herrmann, E., Gasser, T. and Kneip, A. (1992). Choice of bandwidth for kernel regression when residuals are correlated. Biometrika 79, 783–795.

Makridakis, Wheelwright and Hyndman (1998). Forecasting: Methods and Applications (3rd edition). John Wiley, New York.

Müller, H.G. (1988). Nonparametric Analysis of Longitudinal Data. Springer-Verlag, Berlin.

Ray, B.K. and Tsay, R.S. (1997). Bandwidth selection for kernel regression with long-range dependence. Biometrika 84, 791–802.

[...] 1215–30.

Ruppert, D., Sheather, S.J. and Wand, M.P. (1995). An effective bandwidth selector for local least squares regression. J. Amer. Statist. Assoc. 90, 1257–1270.

Ruppert, D. and Wand, M.P. (1994). Multivariate locally weighted least squares regression. Ann. Statist. 22, 1346–1370.

Schlittgen, R. and Streitberg, B. (1991). Zeitreihenanalyse. R. Oldenbourg, München.

Yuanhua Feng

Department of Mathematics and Statistics

University of Konstanz

D-78457 Konstanz, Germany

Email: yuanhua.feng@uni-konstanz.de

Tel. +49-7531-88-7363

Fax. +49-7531-88-2407