Nonparametric Modelling of Seasonal Time Series
Yuanhua Feng
University of Konstanz
Abstract
This paper focuses on developing a new data-driven procedure for decomposing seasonal time series based on local regression. A formula for the asymptotically optimal bandwidth $h_A$ in the current context is given. Methods for estimating the unknowns in $h_A$ are investigated. A data-driven algorithm for decomposing seasonal time series is proposed based on the iterative plug-in idea introduced by Gasser et al. (1991). The asymptotic behaviour of this algorithm is investigated. Some computational aspects are discussed in detail. The practical performance of the proposed algorithm is illustrated by simulated and data examples. The results here also provide some insights into the iterative plug-in idea.
Keywords: Time series decomposition; Local regression; Iterative plug-in; Bandwidth selection
1 Introduction
Decomposing seasonal time series into unobserved components is an important issue in statistics. This question arises if, for example, we want to analyze monthly data or to build models using seasonally adjusted data. Here, the equidistant additive time series model
$$Y_t = g(x_t) + S(x_t) + \epsilon_t, \quad t = 1, 2, \ldots, n, \qquad (1)$$
will be used for this purpose, where $x_t = (t - 0.5)/n$, $\epsilon_t$ are iid random variables with $E(\epsilon_t) = 0$ and $\mathrm{var}(\epsilon_t) = \sigma^2$, $g$ is a smooth trend-cyclical function, and $S$ is a slowly changing seasonal component with seasonal period $s$ (in terms of $t$). Denote by $m = g + S$ the mean function. In the following, model (1) will be treated as a standard nonparametric regression with an additional (deterministic) seasonal component. A traditional nonparametric approach for estimating $g$, $S$ and $m$ based on (unweighted) local regression with polynomials and trigonometric functions as local regressors was proposed by Heiler (1966, [...] being used by the German Federal Statistical Office since 1983. The approach used in the following is a generalized version of the Berlin Method proposed by Heiler and Michels (1994) based on locally weighted regression (Cleveland, 1979), obtained by introducing a common kernel weight function into the original methodology.
A crucial problem for time series decomposition based on local regression is the selection of the bandwidth. Some double-smoothing procedures for this purpose are proposed by Heiler and Feng (1996, 2000), Feng (1999) and Feng and Heiler (2000). The aim of the current paper is to propose a new algorithm for selecting the bandwidth under model (1). The iterative plug-in idea introduced by Gasser et al. (1991), with some minor improvements proposed by Beran and Feng (2002a,b), is adapted to the current context. To our knowledge this is the first plug-in bandwidth selector for decomposing seasonal time series. Moreover, we also provide some insights into the iterative plug-in idea. The asymptotic behaviour of the proposed algorithm is investigated. Some computational aspects of this algorithm are discussed in detail. Simulated and data examples show that this algorithm works well in practice.
The paper is organized as follows. The estimators and some of their asymptotic properties that are needed in the subsequent sections are given in Section 2. Methods for estimating the unknown terms in the asymptotically optimal bandwidth are discussed in Section 3. The algorithm is proposed in Section 4 together with a discussion of its asymptotic behaviour and of some computational aspects. Simulated and data examples in Section 5 illustrate the practical usefulness of this proposal. Section 6 contains some final remarks. Proofs of the results are put in the appendix.
2 The local regression approach
2.1 The estimators
Assume that $g$ is at least $(p+1)$ times continuously differentiable, so that it can be expanded in a Taylor series around a point $x_t$. Similarly, $S$ can be locally modelled by a Fourier series. Let $\lambda_1 = 2\pi/s$ be the seasonal frequency and $\lambda_j = j\lambda_1$ for $j = 2, \ldots, q$, where $q = [s/2]$ with $[\cdot]$ denoting the integer part. Let $K(u)$ be a second order kernel [...] regression estimators of $g$, $S$ and $m$ at $x_t$ are obtained by solving the least squares problem
$$Q = \sum_{i=1}^{n} \Big\{ Y_i - \sum_{j=0}^{p} \beta_{1j}(x_i - x_t)^j - \sum_{j=1}^{q} \big( \beta_{2j}\cos\lambda_j(i-t) + [\beta_{3j}\sin\lambda_j(i-t)] \big) \Big\}^2 K\Big(\frac{x_i - x_t}{h}\Big) \Rightarrow \min. \qquad (2)$$
The solutions of (2) are $\hat g(x_t) = \hat\beta_{10}$, $\hat S(x_t) = \sum_{j=1}^{q}\hat\beta_{2j}$ and $\hat m(x_t) = \hat g(x_t) + \hat S(x_t)$, where the coefficients and their estimates are defined locally and hence depend on $x_t$.
Let
$$X_1 = \begin{pmatrix} 1 & x_1 - x_t & \cdots & (x_1 - x_t)^p \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_n - x_t & \cdots & (x_n - x_t)^p \end{pmatrix}$$
and
$$X_2 = \begin{pmatrix} \cos\lambda_1(1-t) & \sin\lambda_1(1-t) & \cdots & \cos\lambda_q(1-t) & [\sin\lambda_q(1-t)] \\ \vdots & \vdots & \ddots & \vdots & [\vdots] \\ \cos\lambda_1(n-t) & \sin\lambda_1(n-t) & \cdots & \cos\lambda_q(n-t) & [\sin\lambda_q(n-t)] \end{pmatrix}.$$
Then $X = (X_1 \,\vdots\, X_2)$ is the $n \times (p+s)$ design matrix. The entries in (2) and $X_2$ marked by $[\ ]$ only apply to odd $s$; for even $s$ they have to be omitted because $\lambda_q = \pi$. Let $y = (y_1, \ldots, y_n)'$ be the vector of observations and let $K$ denote a diagonal matrix with
$$k_i = K\Big(\frac{x_i - x_t}{h}\Big).$$
Furthermore, denote the $j$-th $(p+1) \times 1$ unit vector by $e_j$ and let $\iota_s$ be an $(s-1) \times 1$ vector having 1 in its odd entries and 0 elsewhere. Then we have
$$\hat m(x_t) = (e_1', \iota_s')(X'KX)^{-1}X'Ky =: w'y, \qquad (3)$$
$$\hat g(x_t) = (e_1', 0')(X'KX)^{-1}X'Ky =: w_1'y, \qquad (4)$$
and
$$\hat S(x_t) = (0', \iota_s')(X'KX)^{-1}X'Ky =: w_2'y, \qquad (5)$$
where $0$ is a vector of zeros of appropriate dimension.
The vectors $w = (w_1, \ldots, w_n)'$, $w_1 = (w_{11}, \ldots, w_{1n})'$ and $w_2 = (w_{21}, \ldots, w_{2n})'$ will be called the weighting systems of $\hat m$, $\hat g$ and $\hat S$, for which we have $w = w_1 + w_2$, $\sum w_i = \sum w_{1i} = 1$ and $\sum w_{2i} = 0$. The local regression approach makes $\hat m$, $\hat g$ and $\hat S$ exactly unbiased if $g$ is a polynomial of order no larger than $p$ and $S$ is exactly periodic with period $s$.
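The exact unbiasedness property can be checked numerically. The following is a minimal sketch, not the author's implementation: it solves the weighted least squares problem (2) at a single interior point for $p = 1$ and $s = 4$, assuming the bisquare kernel and a plain Gaussian elimination solver. The function names (`local_fit`, `design_row`, `solve`) are illustrative only.

```python
import math

def bisquare(u):
    """Bisquare kernel K(u) = 15/16 (1 - u^2)^2 on [-1, 1]."""
    return 15.0 / 16.0 * (1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0

def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(m):
        piv = max(range(c, m), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(c + 1, m):
            f = aug[r][c] / aug[c][c]
            for j in range(c, m + 1):
                aug[r][j] -= f * aug[c][j]
    z = [0.0] * m
    for r in range(m - 1, -1, -1):
        tail = sum(aug[r][j] * z[j] for j in range(r + 1, m))
        z[r] = (aug[r][m] - tail) / aug[r][r]
    return z

def design_row(i, t, n, p, s):
    """Row i of X = (X1 : X2) for estimation at x_t (i and t are 1-based)."""
    dx = (i - t) / n                      # x_i - x_t with x_t = (t - 0.5)/n
    row = [dx ** j for j in range(p + 1)]
    q = s // 2
    for j in range(1, q + 1):
        lam = 2.0 * math.pi * j / s
        row.append(math.cos(lam * (i - t)))
        if not (s % 2 == 0 and j == q):   # sin term dropped when lambda_q = pi
            row.append(math.sin(lam * (i - t)))
    return row

def local_fit(y, t, h, p=1, s=4):
    """Return (g_hat, S_hat) at x_t by solving (X'KX) beta = X'Ky."""
    n = len(y)
    dim = p + s
    xtkx = [[0.0] * dim for _ in range(dim)]
    xtky = [0.0] * dim
    for i in range(1, n + 1):
        k = bisquare((i - t) / (n * h))   # K((x_i - x_t)/h)
        if k == 0.0:
            continue
        row = design_row(i, t, n, p, s)
        for a in range(dim):
            xtky[a] += k * row[a] * y[i - 1]
            for b in range(dim):
                xtkx[a][b] += k * row[a] * row[b]
    beta = solve(xtkx, xtky)
    # The cosine coefficients are the odd entries of the seasonal block.
    s_hat = sum(beta[p + 1 + 2 * (j - 1)] for j in range(1, s // 2 + 1))
    return beta[0], s_hat
```

For a noise-free series with a linear trend and an exactly periodic, zero-mean seasonal pattern, the fit should reproduce $g(x_t)$ and $S(x_t)$ up to rounding error.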
To develop a plug-in bandwidth selector we have to discuss the asymptotic behaviour of $\hat g$, $\hat S$ and $\hat m$. From here on it is assumed that $p$ is odd so that $\hat g$ has automatic boundary correction. Put $k = p + 1$ and assume that
A1. $h \to 0$ and $nh \to \infty$ as $n \to \infty$.
A2. $g$ is at least $k$ times continuously differentiable.
A3. $S$ is exactly periodic with period $s$.
A1 and A2 are the same as in nonparametric regression without seasonality. A3 is a parametric assumption on the seasonal component, which is made for simplicity and can easily be relaxed to a general slowly changing seasonal component. Under assumption A1 it can be shown that $\hat g$ is asymptotically equivalent to some kernel estimator (see Feng, 1999). This means that the same asymptotic results as in local polynomial fitting hold for $\hat g$ under model (1). In the following denote by $K_p(u)$ the equivalent kernel for estimating $g$, which is of order $k$.
To deal with $\hat S$, we will introduce a kernel estimator of $S$. Let
$$Q_s(i) = \begin{cases} (s-1), & \text{if } (i-t)/s \text{ is an integer}, \\ -1, & \text{otherwise}, \end{cases} \qquad (6)$$
and
$$\tilde w_{2i} = (nh)^{-1} Q_s(i) K\Big(\frac{x_i - x_t}{h}\Big). \qquad (7)$$
A kernel estimator of $S$ is defined by
$$\tilde S(x_t) = \sum_{i=1}^{n} \tilde w_{2i}\, y_i =: \tilde w_2' y. \qquad (8)$$
Note that $\{\tilde w_{2i}\}$ are asymptotically periodic with the same period $s$. Suppose that a corresponding boundary correction is done for $\tilde S$; then it can be shown that, under A1, $\hat S$ and $\tilde S$ are asymptotically equivalent, too (see Feng, 1999).
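The kernel estimator of $S$ in (6) to (8) is simple enough to write down directly. The sketch below assumes the bisquare kernel and an interior point (no boundary correction); the function names are illustrative and not taken from the paper.

```python
def bisquare(u):
    """Bisquare kernel K(u) = 15/16 (1 - u^2)^2 on [-1, 1]."""
    return 15.0 / 16.0 * (1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0

def seasonal_kernel_weights(n, t, h, s):
    """Weights (7): (nh)^{-1} Q_s(i) K((x_i - x_t)/h) for i = 1, ..., n."""
    weights = []
    for i in range(1, n + 1):
        q = (s - 1) if (i - t) % s == 0 else -1      # Q_s(i) from (6)
        weights.append(q * bisquare((i - t) / (n * h)) / (n * h))
    return weights

def seasonal_kernel_estimate(y, t, h, s):
    """Kernel estimate (8) of the seasonal component at x_t."""
    w = seasonal_kernel_weights(len(y), t, h, s)
    return sum(wi * yi for wi, yi in zip(w, y))
```

Because observations in the same season as $t$ receive weight $(s-1)$ and all others weight $-1$, the weights sum to approximately zero, so a locally constant trend is removed and only the seasonal deviation at $t$ remains.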
As an error criterion for bandwidth selection we use the mean averaged squared error (MASE). Define $R(K) = \int_{-1}^{1} K^2(u)\,du$. Let $B$ denote the bias of an estimator. We have
Lemma 1 Assume that A1 to A3 hold, then
1. the asymptotic bias of $\hat m$ is
$$B[\hat m(x_t)] \doteq B[\hat g(x_t)] \doteq \frac{1}{k!} \Big( \int u^k K_p(u)\,du \Big) g^{(k)}(x_t)\, h^k, \qquad (9)$$
2. the asymptotic variance of $\hat m$ is
$$\mathrm{var}(\hat m(x_t)) = (nh)^{-1}\sigma^2 \{R(K_p) + (s-1)R(K)\}\{1 + O[(nh)^{-1}]\}, \qquad (10)$$
3. and the MASE of $\hat m$ is
$$\mathrm{MASE}(\hat m) := \frac{1}{n}\sum_{t=1}^{n} E[\hat m(x_t) - m(x_t)]^2 \doteq \frac{\sigma^2}{nh}\{R(K_p) + (s-1)R(K)\} + \frac{1}{(k!)^2}\Big\{ \int \{g^{(k)}(x)\}^2 dx \Big( \int u^k K_p(u)\,du \Big)^2 \Big\} h^{2k}. \qquad (11)$$
A sketched proof of Lemma 1 is given in the appendix, where it is shown in particular that: 1. $\hat g$ and $\hat S$ are asymptotically uncorrelated and 2. the bias in $\hat S$ is negligible compared to that in $\hat g$. The asymptotically optimal bandwidth, which minimizes the dominant part of the MASE, is given by
$$h_A = \left( \frac{(k!)^2}{2k}\, \frac{\sigma^2 \{R(K_p) + (s-1)R(K)\}}{\int \{g^{(k)}(x)\}^2 dx\, \{\int u^k K_p(u)\,du\}^2} \right)^{1/(2k+1)} n^{-1/(2k+1)}, \qquad (12)$$
where it is assumed that $I = \int \{g^{(k)}(x)\}^2 dx > 0$. The change in $h_A$ due to the additional term $S$ is just a constant. For $s = 1$ the above formulae reduce to the results in nonparametric regression as given e.g. in Müller (1988), Ruppert and Wand (1994) and Fan and Gijbels (1996).
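To see how the seasonal term changes $h_A$ only by a constant, formula (12) can be evaluated numerically. The sketch below assumes the bisquare kernel and, for $p = 1$ in the interior, takes the equivalent kernel $K_p$ to be $K$ itself (the standard local linear result); the values of $\sigma^2$ and $I$ used in any call are hypothetical inputs, and the helper names are illustrative.

```python
import math

def bisquare(u):
    """Bisquare kernel on [-1, 1] (only evaluated inside its support here)."""
    return 15.0 / 16.0 * (1.0 - u * u) ** 2

def integrate(f, a=-1.0, b=1.0, m=2000):
    """Composite Simpson rule with m (even) subintervals."""
    step = (b - a) / m
    total = f(a) + f(b)
    for j in range(1, m):
        total += (4 if j % 2 else 2) * f(a + j * step)
    return total * step / 3.0

def h_asymptotic(n, s, sigma2, big_i, k=2, kernel=bisquare, equiv_kernel=bisquare):
    """Evaluate (12); equiv_kernel stands in for K_p (assumed equal to K for p = 1)."""
    r_kp = integrate(lambda u: equiv_kernel(u) ** 2)
    r_k = integrate(lambda u: kernel(u) ** 2)
    mu_k = integrate(lambda u: u ** k * equiv_kernel(u))
    num = math.factorial(k) ** 2 * sigma2 * (r_kp + (s - 1) * r_k)
    den = 2 * k * big_i * mu_k ** 2
    return (num / den) ** (1.0 / (2 * k + 1)) * n ** (-1.0 / (2 * k + 1))
```

Under the assumption $K_p = K$, the factor $R(K_p) + (s-1)R(K)$ equals $sR(K)$, so for $k = 2$ the seasonal model enlarges the bandwidth by the constant factor $s^{1/5}$ relative to $s = 1$.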
3 Estimating the unknown parameters
3.1 Estimation of the variance
In order to develop a plug-in bandwidth selector based on (12), unknown parameters such as $\sigma^2$ and $I$ have to be estimated. It is well known that the variance in nonparametric regression can be estimated by difference-based methods (see e.g. Rice, 1984, Gasser et al., 1986 and Hall et al., 1990). Heiler and Feng (1996) adapted this idea to model (1) and proposed some seasonal-difference-based variance estimators. Here a sequence $D_{m,s} = \{d_j \mid j = 0, 1, \ldots, m\}$ is called a seasonal difference sequence if
$$\sum_{j=0}^{m} d_j = 0, \quad \sum_{j=0}^{m} d_j^2 = 1, \quad m = 1, 2, \ldots \qquad (13)$$
and
$$S_i = \sum_{j=0}^{m} d_j \delta_{ij} = 0, \quad i = 0, 1, \ldots, s-1, \qquad (14)$$
where
$$\delta_{ij} = \begin{cases} 1, & \text{if } (j-i)/s \text{ is an integer}, \\ 0, & \text{otherwise}. \end{cases}$$
A seasonal-difference-based variance estimator is then defined by
$$\hat\sigma_D^2 = (n-m)^{-1} \sum_{i=1}^{n-m} \Big( \sum_{j=0}^{m} d_j Y_{i+j} \Big)^2. \qquad (15)$$
Following Hall et al. (1990) it can be shown that under A2 and A3 $\hat\sigma_D^2$ is root-$n$ consistent. In this paper the following seasonal difference sequence
$$D_{m,s} = \frac{1}{\sqrt{12}} \{-1, 2, -1, \underbrace{0, \ldots, 0}_{s-3}, 1, -2, 1\},$$
defined for $s \ge 3$, will be used to estimate $\sigma^2$, where $m = s + 2$.
3.2 Estimation of I
Similar to local polynomial fitting, the $k$-th derivative of $g$ can be estimated with a local polynomial of order $p_I$ and a bandwidth $h_I$ with $p_I > k$ and $p_I - k$ odd. We set $l = p_I + 1$. A simple choice is $p_I = k + 1$ with $l = k + 2$. Let now (2) be defined with $p$ being replaced by $p_I$. Let $K$, $y$ and $e_j$ be the same as defined in Section 2, and let $X$ be defined similarly as before. Then $\hat g^{(k)} = k!\,\hat\beta_{1k}$ estimates $g^{(k)}$ and is given by
$$\hat g^{(k)}(x_t) = k!\,(e_{k+1}', 0')(X'KX)^{-1}X'Ky =: (w^k)'y, \qquad (16)$$
where $0$ is the same as in (4) and $w^k = (w_1^k, \ldots, w_n^k)'$ is the weighting system of $\hat g^{(k)}$. Then $I$ may be estimated by
$$\hat I[g^{(k)}(x; h_I)] = n^{-1} \sum_{i=1}^{n} \{\hat g^{(k)}(x_i; h_I)\}^2. \qquad (17)$$
In the following some results on $\hat I$, which are important for the development of a plug-in bandwidth selector, will be given without proof, since here we are only interested in the magnitude orders. These orders are the same in the current context and in models without seasonality. Such results in nonparametric regression may be found in Ruppert et [...] with short memory, long memory and antipersistence are given in Beran and Feng (2002a).
Assume now that
A'1. $h \to 0$ and $nh^{2k+1} \to \infty$ as $n \to \infty$.
A'2. $g$ is at least $l$ times continuously differentiable.
Under the assumptions A'1, A'2 and A3 we have
$$B(\hat I) \doteq O(h_I^{(l-k)}) + O[(nh_I)^{-1} h_I^{-2k}] \qquad (18)$$
and
$$\mathrm{var}(\hat I) \doteq O(n^{-1}) + O(n^{-2} h_I^{-4k-1}). \qquad (19)$$
A'1 implies that $h_I$ is of a larger order than $h_A$, i.e. $(h_I)^{-1} = o[(h_A)^{-1}]$, which ensures that $\hat g^{(k)}$ and hence $\hat I$ is at least consistent.
The following remarks show how $h_I$ should be chosen.
Remark 1. The largest order $h_I$ should take is $O(n^{-1/(4k+2)}) = O[(h_A)^{1/2}]$. Under this choice the second term on the right hand side of (18) and the standard deviation of $\hat I$ achieve the fastest root-$n$ convergence rate at the same time. An $h_I$ of a larger order will increase the bias without improving the variance (in terms of the magnitude order).
Remark 2. The optimal bandwidth for estimating $g^{(k)}$ itself is of order $O(n^{-1/(2l+1)})$. This order lies between the two given in Remarks 1 and 3. The choice $h_I = O(n^{-1/(2l+1)})$ is hence also reasonable.
Remark 3. Observe that the MSE (mean squared error) of $\hat I$ is dominated by the squared bias part. By balancing the orders of the two terms on the right hand side of (18) we obtain $h_I = O(n^{-1/(k+l+1)})$, which may be considered to be the (asymptotically) optimal choice of $h_I$. Such a choice is of a smaller order than the two given in Remarks 1 and 2.
4 The main proposal
4.1 The basic algorithm
From here on only $p = 1$ and $3$ with $k = 2$ and $4$ will be considered. Following the iterative plug-in idea of Gasser et al. (1991), $\hat I_j$, the estimate of $I$ in the $j$-th iteration, is calculated with a bandwidth $h_{I,j}$ obtained from $h_{j-1}$, the bandwidth selected in the $(j-1)$-th iteration, by means of an inflation method. Here an inflation method is a function $h_{I,j} = f(h_{j-1})$ such that $(h_{I,j})^{-1} = o[(h_{j-1})^{-1}]$. This means that $h_{I,j}$ will be of a larger order than $h_A$ if $h_{j-1}$ is at least of order $O(h_A)$. Now A'1 is satisfied, so that $\hat I$ and $\hat h$ will be consistent in the $j$-th iteration. In the following two inflation methods will be described.
The original inflation function, called a multiplied inflation method (MIM), is $h_{I,j} = f(h_{j-1}) = c\,h_{j-1}\,n^{\gamma}$, introduced by Gasser et al. (1991), with some $\gamma > 0$ called an inflation factor. This idea is discussed in detail by Herrmann and Gasser (1994). There are some unknowns in the function $f$, such as $c$, $\gamma$ and a starting bandwidth $h_0$, which have to be chosen beforehand. Theoretically, the rate of convergence of $\hat h$ does not depend on $c$ and $h_0$. Here we will simply set $c = 1$. The choice of $h_0$ will be discussed in Section 4.3. Let $l = k + 2$. Following Remarks 1 to 3, we obtain three reasonable choices of $\gamma$ for the MIM, respectively:
1. $\gamma_1 = 1/(4k+2)$ so that the variance term of $\hat I$ is minimized,
2. $\gamma_2 = 4/[(2k+1)(2k+5)]$ so that $\hat g^{(k)}$ is optimized and
3. $\gamma_3 = 2/[(2k+1)(2k+3)]$ so that the MSE of $\hat I$ is minimized,
when convergence is reached, where $\gamma_1 > \gamma_2 > \gamma_3$ and $\gamma_3$ is the optimal choice of $\gamma$.
It is well known that the number of iterations required by the MIM is very large, especially for $k > 2$. For example, if $k = 4$, it is $5k + 1 = 21$ for $\gamma_1$ and $(k+1)(2k+1) = 45$ for $\gamma_3$ (see Herrmann and Gasser, 1994). The required number of iterations will be even larger if the errors have long memory (see Ray and Tsay, 1997). Beran (1999) introduced another inflation method $h_{I,j} = f(h_{j-1}) = c\,h_{j-1}^{\alpha}$, called an exponential inflation method (EIM), to reduce the number of iterations. It is easy to show that, in order to inflate $h_A$ to a given order, the number of iterations required by the EIM is much smaller than by the MIM (see Beran and Feng, 2002a for some examples). In the following the EIM with $c = 1$ will be used. The choices of $\alpha$ corresponding to $\gamma_1$, $\gamma_2$ and $\gamma_3$ are:
1. $\alpha_1 = 1/2$,
2. $\alpha_2 = (2k+1)/(2k+5)$ and
3. $\alpha_3 = (2k+1)/(2k+3)$,
where $\alpha_1 < \alpha_2 < \alpha_3$ and $\alpha_3$ is the optimal choice of $\alpha$ (see Beran and Feng, 2002a,b).
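The correspondence between the inflation factors and the target orders of $h_I$ in Remarks 1 to 3 is pure exponent arithmetic and can be verified with exact fractions. In the sketch below the MIM factor is written as `gamma` and the EIM factor as `alpha`; these symbol names are chosen here for illustration. Starting from $h_A = O(n^{-1/(2k+1)})$, the MIM adds its factor to the exponent, whereas the EIM multiplies the exponent by its factor.

```python
from fractions import Fraction

def target_orders(k):
    """Exponents a with h_I = O(n^a) from Remarks 1 to 3, using l = k + 2."""
    l = k + 2
    return [Fraction(-1, 4 * k + 2), Fraction(-1, 2 * l + 1), Fraction(-1, k + l + 1)]

def mim_factors(k):
    """Multiplied inflation: h_I = c * h * n^gamma."""
    return [Fraction(1, 4 * k + 2),
            Fraction(4, (2 * k + 1) * (2 * k + 5)),
            Fraction(2, (2 * k + 1) * (2 * k + 3))]

def eim_factors(k):
    """Exponential inflation: h_I = c * h^alpha."""
    return [Fraction(1, 2),
            Fraction(2 * k + 1, 2 * k + 5),
            Fraction(2 * k + 1, 2 * k + 3)]

def inflated_orders(k):
    """Exponents of h_I obtained by inflating a bandwidth of the order of h_A."""
    a = Fraction(-1, 2 * k + 1)                    # h_A = O(n^a)
    mim = [a + gamma for gamma in mim_factors(k)]
    eim = [a * alpha for alpha in eim_factors(k)]
    return mim, eim
```

For $k = 2$ this gives the EIM factors $1/2$, $5/9$ and $5/7$, and for $k = 4$ the factors $1/2$, $9/13$ and $9/11$.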
In the following we will propose a basic iterative plug-in algorithm for selecting the bandwidth in time series decomposition, which is defined for $k = 2$ and $k = 4$ separately.
i) Start with a possible bandwidth $h_0$;
ii) For $j = 1, 2, \ldots$ set $h_{I,j} = h_{j-1}^{\alpha}$ with $\alpha = \alpha_3 = 5/7$ for $k = 2$ and $\alpha = \alpha_2 = 9/13$ for $k = 4$. Calculate
$$h_j = \left( \frac{(k!)^2}{2k}\, \frac{\hat\sigma^2 \{R(K_p) + (s-1)R(K)\}}{\int \{\hat g^{(k)}(x; h_{I,j})\}^2 dx\, \{\int u^k K_p(u)\,du\}^2} \right)^{1/(2k+1)} n^{-1/(2k+1)}; \qquad (20)$$
iii) Increase $j$ by 1 and repeat Step ii) until convergence is reached at some $j^0$ and set $\hat h = h_{j^0}$.
A closely related proposal for bandwidth selection in nonparametric regression with short memory, long memory and antipersistence is proposed by Beran and Feng (2002a). Other iterative plug-in procedures may be found in Gasser et al. (1991), Herrmann et al. (1992) and Ray and Tsay (1997).
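The structure of steps i) to iii) can be sketched as a short fixed-point loop. The code below is only an outline under several assumptions: it covers $k = 2$ ($p = 1$) with the bisquare kernel, takes the equivalent kernel $K_p$ equal to $K$, and receives the estimate of $I$ through a user-supplied callable `i_hat` that stands in for (17). Convergence is declared when two successive bandwidths differ by less than $1/n$, a simplification of the integer-bandwidth rule described in Section 4.3.

```python
import math

R_K = 5.0 / 7.0      # R(K) for the bisquare kernel
MU_2 = 1.0 / 7.0     # second moment of the bisquare kernel

def plug_in_bandwidth(n, s, sigma2_hat, i_hat, h0, alpha=5.0 / 7.0, max_iter=50):
    """Iterate (20) for k = 2 until h changes by less than 1/n.

    i_hat: callable mapping the inflated bandwidth h_I to an estimate of I
           (a hypothetical stand-in for (17)).
    Returns (h_hat, number_of_iterations).
    """
    k = 2
    # With K_p = K the factor R(K_p) + (s - 1) R(K) reduces to s * R(K).
    const = math.factorial(k) ** 2 * sigma2_hat * s * R_K / (2 * k * MU_2 ** 2)
    h_prev = h0
    for j in range(1, max_iter + 1):
        h_inflated = h_prev ** alpha                     # EIM with c = 1
        h_new = (const / (n * i_hat(h_inflated))) ** (1.0 / (2 * k + 1))
        if abs(h_new - h_prev) < 1.0 / n:
            return h_new, j
        h_prev = h_new
    return h_prev, max_iter
```

With a constant `i_hat` the loop reaches the closed-form value of (12) in the first iteration and confirms it in the second; with a data-based estimate of $I$ the sequence $h_j$ moves gradually towards a fixed point.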
Theoretically, $\alpha_3$ is the asymptotically optimal choice of $\alpha$. Our experience shows that, for $k = 2$, this choice works well in practice. Hence we choose $\alpha_3 = 5/7$ for $k = 2$. However, $\alpha_3 = 9/11$ for $k = 4$ is too close to one, and for small samples the bandwidth could not be inflated correctly. For $k = 4$ it is hence proposed to use the slightly stronger inflation factor $\alpha_2$. Now the variance of $\hat h$ with $k = 2$ and $k = 4$ is almost of the same order, and $\hat h$ is hence stable in both cases (see Theorem 1 in the next subsection). The most stable inflation factor $\alpha_1 = 1/2$ of the EIM is too strong and does not work well for small samples.
4.2 Asymptotic behaviour
The iterative plug-in algorithm is motivated by fixed point search. Here the procedure is started with a bandwidth $h_0$ and stopped if a convergent output (a fixed point) is reached. The inflation process behind an iterative plug-in algorithm is described by the following lemma according to the relationship between $h_0$ and $h_A$.
Lemma 2 Under assumptions A2 and A3, an iterative plug-in algorithm proceeds as follows:
Case 1. Start with an $h_0 = o_p(h_A)$, then
Step 1. $h_j = O_p(h_{I,j})$, if $h_{I,j} = o_p(h_A)$;
Step 2. $h_j = O_p(h_A)$, if $h_{I,j} = O_p(h_A)$;
Step 3. $h_j = h_A[1 + o_p(1)]$, if $h_A = o_p(h_{I,j})$.
Case 2. Start with an $h_0$ such that $(h_0)^{-1} = o_p[(h_A)^{-1}]$, then
Step 1'. $h_j = O_p(h_A)$, if $h_{I,j} = O_p(1)$.
Step 2'. The same as Step 3 in Case 1.
The proof of Lemma 2 is given in the appendix. Related results may be found in Beran and Feng (2002a) (see also the description in Herrmann and Gasser, 1994, p. 8). Note in particular that A'1 does not apply to Lemma 2.
Remark 4. Case 1 in Lemma 2 shows that, by starting with a small bandwidth, $h_{j-1}$ will be inflated in the $j$-th iteration if $h_{j-1} = o_p(h_A)$. This will be repeatedly carried out until $h_{j^0} = O_p(h_A)$ is achieved in the $j^0$-th iteration. Then $h_{j^0+1}$ in the next iteration will be a consistent bandwidth selector. Some further iterations are required to improve the finite sample property of $\hat h$.
Remark 5. Case 2 in Lemma 2 shows how such an algorithm works if a starting bandwidth $h_0$, which is at least of order $O_p(h_A)$, is used. On the one hand, if $h_0 = o_p(1)$, then $h_1$ is already consistent, since A'1 is satisfied. In this case Step 1' will not appear. On the other hand, if $h_0 = O_p(1)$, then $h_1 = O_p(h_A)$, which is already of the correct order but not yet consistent. Now $h_2$ will be consistent. Again, some further iterations are needed to reduce the influence of $h_0$.
The following theorem holds for the algorithm proposed in Section 4.1.
Theorem 1 Under the assumptions of Lemma 2 we have
i) For $k = 2$ with $\alpha_3 = 5/7$
$$\hat h = h_A \{ 1 + O(n^{-2/7}) + O_p(n^{-5/14}) \}; \qquad (21)$$
ii) For $k = 4$ with $\alpha_2 = 9/13$
$$\hat h = h_A \{ 1 + O(n^{-2/13}) + O_p(n^{-9/26}) \}. \qquad (22)$$
A sketched proof of Theorem 1 is given in the appendix.
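The rates in Theorem 1 follow from inserting the order of the inflated bandwidth into (18) and (19). The sketch below reproduces this exponent arithmetic with exact fractions, assuming $l = k + 2$ and that the relative error of $\hat h$ has the same order as the error of $\hat I$; the function name is illustrative.

```python
from fractions import Fraction

def relative_error_rates(k, alpha):
    """Return (bias_exponent, sd_exponent) of the relative error of h_hat.

    alpha: exponential inflation factor, passed as a Fraction.
    """
    l = k + 2
    a = -alpha * Fraction(1, 2 * k + 1)              # h_I = h_A^alpha = O(n^a)
    # Exponents of the two bias terms in (18): h_I^(l-k) and n^-1 h_I^-(2k+1).
    bias = max((l - k) * a, -1 - (2 * k + 1) * a)
    # Square roots of the two variance terms in (19).
    sd = max(Fraction(-1, 2), -1 - Fraction(4 * k + 1, 2) * a)
    return bias, sd
```

For $k = 2$ and $\alpha = 5/7$ both bias terms are of order $n^{-2/7}$ (they are balanced), while for $k = 4$ and $\alpha = 9/13$ the first bias term, of order $n^{-2/13}$, dominates.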
Remark 6. Let $h_M$ denote the optimal bandwidth, which minimizes the MASE. Theorem 1 also holds if $h_A$ on the right hand sides of (21) and (22) is replaced by $h_M$. This is due to the fact that $|h_M - h_A|/h_M = O(h_M^2)$ (see Beran et al., 2000 and Beran and Feng, 2002b), which is of order $O(n^{-2/5})$ for $k = 2$ and $O(n^{-2/9})$ for $k = 4$ and is hence negligible.
4.3 Computational aspects
This subsection deals with some computational aspects such as the determination of $j^0$, the choice of $h_0$ and so on. A more practical procedure will be proposed at the end of this subsection.
The estimators in Section 2.1 are defined with a fixed bandwidth $h$. In this case the number of observations used at $x_t$ decreases when $x_t$ moves from the interior to the boundary. To solve this problem the k-NN idea as proposed by Gasser et al. (1985) will be used. For a given $h$ we define a left bandwidth $h_l$ and a right one $h_r$ so that $h_l = h_r = h$ in the interior, $h_l = x_t$ at a left boundary point and $h_r = 1 - x_t$ at a right boundary point. $h_r$ (resp. $h_l$) at a boundary point is determined by $h_l + h_r = 2h$. The estimates at a boundary point are calculated similarly but with $h$ in (2) being replaced by $\max(h_l, h_r)$.
In our program only bandwidths $h \in [h_{\min}, h_{\max}]$ with $h_{\min} = s/n$ and $h_{\max} = 0.5 - 1/n$ will be considered, which includes practically all reasonable possibilities for $h$. Furthermore, two bandwidths $h$ and $h'$ will be considered to be the same if $|h - h'| < 1/n$, because a difference of such an order is negligible for any bandwidth selector. In the program the bandwidth actually used is an integer $b = [nh + 0.5]$, which is the bandwidth w.r.t. the observation time $t$. Let $b_{I,j} = [nh_{I,j} + 0.5]$. Then we obtain a natural criterion for stopping the computing procedure, i.e. the procedure will be stopped if $b_{I,j^0} = b_{I,j^0-1}$ in the $j^0$-th iteration. This implies $\hat I_{j^0} = \hat I_{j^0-1}$ and $\hat h = h_{j^0} = h_{j^0-1}$. Further iterations are not necessary. Note that even the $j^0$-th iteration is just a repetition of the $(j^0-1)$-th.
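The bandwidth conventions of this subsection translate into a few small helper functions. The sketch below is illustrative only: the function names are hypothetical, and clipping a candidate bandwidth to $[h_{\min}, h_{\max}]$ is one possible reading of "only bandwidths in this range will be considered".

```python
def integer_bandwidth(n, h):
    """b = [n h + 0.5]: the bandwidth in units of the observation time t."""
    return int(n * h + 0.5)

def boundary_bandwidths(x_t, h):
    """Left and right bandwidths (h_l, h_r) satisfying h_l + h_r = 2h."""
    if x_t < h:                      # left boundary point: h_l = x_t
        h_l = x_t
        return h_l, 2.0 * h - h_l
    if x_t > 1.0 - h:                # right boundary point: h_r = 1 - x_t
        h_r = 1.0 - x_t
        return 2.0 * h - h_r, h_r
    return h, h                      # interior point

def admissible(n, s, h):
    """Clip h to [h_min, h_max] = [s/n, 0.5 - 1/n] (clipping is an assumption)."""
    return min(max(h, s / n), 0.5 - 1.0 / n)

def same_bandwidth(n, h1, h2):
    """Two bandwidths are treated as equal if they differ by less than 1/n."""
    return abs(h1 - h2) < 1.0 / n
```

For example, with $n = 200$ a bandwidth $h = 0.139$ corresponds to $b = 28$ observations on each side of $t$ in the interior.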
In the following the choice of $h_0$ will be considered. In most cases $h_0$ does not play any role. However, in some cases, when the finite sample MASE has more than one local minimum or when the MASE changes very slowly around its minimum, $\hat h$ may depend on $h_0$. A bandwidth $h_f$ is called a fixed point (of the procedure proposed in Section 4.1) if $\hat h = h_f$ when the procedure is started with $h_0 = h_f$ itself. A fixed point $h_f$ is called left stable if for all $h_0 \le h_f$ in a neighbourhood of $h_f$ we have $\hat h = h_f$. A fixed point $h_f$ is called right stable if for all $h_0 \ge h_f$ in a neighbourhood of $h_f$ we have $\hat h = h_f$. A fixed point $h_f$ is called stable if it is both left and right stable. A fixed point is called unstable if it is only achievable by starting with itself. An interval of bandwidths $[h_f^l, h_f^r]$ is called an interval of fixed points if $h_f^l$ is a left stable fixed point, $h_f^r$ is a right stable fixed point and all points between them are unstable fixed points. Denote by $\hat h^l$ the bandwidth selected with $h_0^1 = h_{\min}$ and by $\hat h^r$ the bandwidth selected with $h_0^2 = h_{\max}$. Then $\hat h^l$ is a left stable fixed point if $\hat h^l > h_{\min}$, and $\hat h^r$ is a right stable fixed point if $\hat h^r < h_{\max}$.
When the finite sample MASE has only one minimum, there exists a unique stable fixed point or a unique interval of fixed points. In the first case we will obtain the same selected bandwidth $\hat h$ by starting with any $h_0$. In the second case we have $\hat h = h_f^l$ for all $h_0 \le h_f^l$, $\hat h = h_f^r$ for all $h_0 \ge h_f^r$ and $\hat h = h_0$ for $h_f^l < h_0 < h_f^r$. Now all bandwidths in $[h_f^l, h_f^r]$ are reasonable to be used as the optimal bandwidth, since the change of the MASE over $[h_f^l, h_f^r]$ is negligible. In this case we also say that the result is unique and set $\hat h := (\hat h^l + \hat h^r)/2$. In the following the words "a stable fixed point" sometimes also mean an interval of stable points. In the case when the finite sample MASE has more than one local minimum, we may obtain different $\hat h$ by starting with different $h_0$. Now there may also be some unstable fixed points corresponding to a local maximum between two local minima. If this is the case, we should find all possible stable fixed points and then select one of them as the optimal bandwidth by analyzing the smoothing results.
An S-Plus function called DeSeaTS (Decomposing Seasonal Time Series) is developed based on the following quasi-data-driven procedure.
1. Carry out the algorithm in Section 4.1 twice with $h_0^1 = h_{\min}$ and $h_0^2 = h_{\max}$, respectively.
2. Calculate the decomposition results automatically if $\hat h$ is unique.
3. Show detailed information about all stable fixed points when $\hat h$ is not unique.
If 3 occurs, further subjective analysis is required.
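The decision logic of the three steps above can be sketched as follows. This is not the DeSeaTS code: `selector` is a hypothetical callable that maps a starting bandwidth $h_0$ to the selected bandwidth, and the check for an interval of fixed points (probing a few starting values strictly between the two results) is an assumption made for illustration.

```python
def classify_result(selector, n, h_min, h_max, probes=5):
    """Run the selector from h_min and h_max and classify the outcome.

    Returns ("unique", h), ("interval", h) or ("not unique", (h_left, h_right)).
    """
    h_left = selector(h_min)
    h_right = selector(h_max)
    if abs(h_right - h_left) < 1.0 / n:
        return "unique", (h_left + h_right) / 2.0
    # Interval of fixed points: every start strictly inside returns itself.
    inside = [h_left + (h_right - h_left) * j / (probes + 1)
              for j in range(1, probes + 1)]
    if all(abs(selector(h0) - h0) < 1.0 / n for h0 in inside):
        return "interval", (h_left + h_right) / 2.0
    return "not unique", (h_left, h_right)
```

In the "interval" case the midpoint is returned, following the convention of treating such a result as unique; in the "not unique" case both stable fixed points are returned for further subjective analysis.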
[...] respectively. If the smoothing results with $p = 1$ and $p = 3$ are both satisfactory, we can choose either $p = 1$ or $p = 3$. However, it is preferable to use $p = 3$, since the selected bandwidth is then in general slightly larger, which does not increase the bias of $\hat g$ but will improve $\hat S$. Sometimes one $p$ is more reasonable than the other; then the more reasonable one should be chosen (see the examples given in the next section). An objective criterion for choosing $p$ is not given here, because we do not have an estimate of the MASE at the end of the procedure.
5 Practical performance
In the following, the practical performance of the proposed procedure will be investigated by simulated and data examples. Here the bisquare kernel is used as the weight function.
5.1 Simulated examples
At first some simulated examples will be analyzed. Here the trend function
$$g(x) = 2\sin(2\pi(x - 0.5)) + 2x + 4\exp(-100(x - 0.5)^2) + 6$$
is used, where $x \in [0, 1]$. Figures 1a to b show two simulated time series of length $n = 200$, called Sim1 and Sim2, generated with iid $N(0, 1)$ errors, where the exactly periodic seasonal component $S_1 = \{1.5, -1.2, -0.8, 0.5\}$ with $s = 4$ is used. The selected bandwidths $\hat h^l$ and $\hat h^r$ with $p = 1$ and $p = 3$ respectively are given in Table 1 together with the corresponding $j^0$'s, the value $d := n(\hat h^r - \hat h^l)$ and the answer to the question whether $\hat h$ is unique. We see that $\hat h^l$ and $\hat h^r$ are exactly the same in all cases. This is always the case for many other simulated examples we have done.

Table 1: $\hat h^l$, $\hat h^r$ and other parameters for the simulated examples

Time                      p = 1                                 p = 3
Series     ĥ^l    j^0   ĥ^r    j^0   d   uniq.     ĥ^l    j^0   ĥ^r    j^0   d   uniq.
Sim1&3    0.139    7   0.139    5    0   Yes      0.142    7   0.142    6    0   Yes
Sim2&4    0.141    6   0.141    6    0   Yes      0.163   11   0.163    6    0   Yes

[...] corresponding errors as by Sim1 and Sim2 respectively, but another seasonal component $S_2 = \{0.3, -0.5, 0.9, -0.7\}$ are shown in Figures 1c to d. With these examples it is shown that the change of the seasonal component does not change the selected bandwidth. The selected bandwidths, other parameters and even the detailed information in each iteration for Sim3 and Sim4 are exactly the same as for Sim1 and Sim2 respectively.
Data-driven decomposition results with $p = 3$ for the simulated time series are shown in Figures 1a to d, where the data are given in the upper part of each figure (solid line) together with the underlying trend function (dotted line) and the estimated trend (dashed line). The estimated seasonal component is plotted in the lower part of each figure. From Figure 1 we see that the proposed procedure works well. $\hat g$ is satisfactory in all cases. Although at first glance we are not sure whether these time series are seasonal or not, especially for Sim3 and Sim4, the seasonal component is well discovered from the data. It is clear that $S_1$ is easier to estimate than $S_2$, since $\mathrm{var}(\hat S_1) = \mathrm{var}(\hat S_2)$ but the absolute values of $S_1$ are relatively larger than those of $S_2$. Note that the selected bandwidth under model (1) for a simulated time series with given errors will be the same even if the seasonal component is set to zero. In that case $\hat S$ is fully due to the noise in the data.
The following data examples are chosen to illustrate the practical performance of the proposal in common cases and to show some details.
1. Time Series "CAPE": Time series of the quarterly final consumption expenditure in Australia (total private, millions of dollars, 1989/90 prices) from September 1959 to June 1995 with $n = 144$. Source: Australian Bureau of Statistics.
2. Time Series "Strom": The monthly time series of produced electricity in Germany from 1955 to 1979 with $n = 300$. Source: Schlittgen and Streitberg (1994, p. 82).
3. Time Series "IFOR": The monthly time series of the indices of the foreign orders received in Germany from 1978 to 1994 (1985 = 100) with $n = 204$. Source: IFO-Institute for Economic Research in Munich.
4. Time Series "Hsales": Monthly sales of new one-family houses sold in the USA from [...] and Hyndman (1998).

[Figure 1, panels (a) to (d): the first, second, third and fourth simulated time series.]
Figure 1: Optimal decomposition results for the simulated time series. Upper: the data together with the underlying trend (points) and the estimated trend (dashes). Below: the estimated seasonal component.
All of these time series are analyzed with $p = 1$ and $p = 3$ respectively. Table 2 shows the same parameters for these data examples as those given in Table 1. From Table 2 we see that the results in most of the cases are unique. For the series Hsales with $p = 3$ we obtained an interval of fixed points, $[0.094, 0.105]$. As mentioned before, we will consider such a result to be unique, and now $\hat h = (0.105 + 0.094)/2 = 0.10$ will be used. Two unusual cases are: firstly, the selected bandwidths for the series IFOR with $p = 1$ are not unique; secondly, although the selected bandwidth for the series Strom with $p = 3$ is unique, it is much smaller than that selected for the same series with $p = 1$. These two cases will be discussed in the next subsection in detail.
Table 2: $\hat h^l$, $\hat h^r$ and other parameters for the data examples

Time                      p = 1                                     p = 3
Series     ĥ^l    j^0   ĥ^r    j^0     d     uniq.     ĥ^l    j^0   ĥ^r    j^0     d     uniq.
CAPE      0.084    7   0.086    6    0.288   Yes      0.089    6   0.089    8    0       Yes
Strom     0.160    7   0.160    7    0       Yes      0.101    7   0.102   13    0.300   Yes
IFOR      0.113    6   0.262    3   30.40    No       0.140    7   0.141    6    0.204   Yes
Hsales    0.066    4   0.067    8    0.257   Yes      0.094    7   0.105    4    3.025   Int
As explained in the last section, the use of $p = 3$ is preferable. But for the series Strom $p = 1$ should be used (see the next subsection for the reason). Hence $p = 1$ for the series Strom and $p = 3$ for the others is chosen. Data-driven decomposition results for these examples are shown in Figures 2a through d, where corresponding location changes are introduced for the seasonal component so that the figures look clearer. We see that the results given in Figure 2 look quite good. This shows the practical usefulness of the proposed procedure. Note that the selected bandwidths for the examples given in Figures 2a to d are quite different and adapt automatically to the structure of the data. The largest is $\hat h = 0.16$ for the series Strom. This is not surprising, because the trend in this time series can almost be modelled by a parametric model (see Schlittgen and Streitberg, 1994). Although the trend in the time series CAPE is also regular, the selected bandwidth $\hat h = 0.089$ is nevertheless the smallest one, since $s = 4$ for this time series but $s = 12$ for the others. Tables 1 and 2 also show that $j^0$ changes from case to case.
[Figure 2, panels (a) to (d): the time series CAPE, Strom, IFOR and Hsales.]
Figure 2: Optimal decomposition results for the data examples. Upper: the data together with the estimated trend (dashes). Below: the estimated seasonal component.
Following Lemma 2 we have $\hat h^l \le h_A \le \hat h^r$ in probability. From Tables 1 and 2 we see that $\hat h^l \le \hat h^r$ holds for all the examples. Lemma 2 also ensures that, in probability, $h_j$ is nondecreasing in $j$ when starting with $h_0^1$ and nonincreasing in $j$ when starting with $h_0^2$. The detailed search processes with starting bandwidths $h_0^1$ and $h_0^2$ respectively are shown in Figure 3, where the results are for the time series Strom with $p = 1$ (solid line) and CAPE with $p = 3$ (dashed line). These two examples are chosen so that the iterative plug-in algorithm can be well understood. From Figure 3 we can find an interesting phenomenon, i.e. although the selected bandwidth for Strom with $p = 1$ is much larger than that for CAPE with $p = 3$, $h_1$ with $h_0^2$ in the second case was even slightly larger than that in the first case. However, after some iterations both of them arrived at the corresponding fixed points.
[Figure 3 plot: search processes for the series Strom with p=1 and CAPE with p=3; horizontal axis: number of iteration.]
Figure 3: The search processes for the time series Strom with $p = 1$ (solid line) and CAPE with $p = 3$ (dashed line), where results with $h_0^1 = h_{\min}$ are marked by the letter "l" and those with $h_0^2 = h_{\max}$ are marked by the letter "u".
Härdle et al. (1988) proposed to estimate the MASE on $x \in [\Delta, 1 - \Delta]$ to avoid the boundary effect of a kernel estimator, where $\Delta > 0$ is a small positive number. In our proposal above no $\Delta$ (or $\Delta = 0$) is used, since local polynomial fitting has automatic boundary correction. However, if e.g. there are some outliers or there is a structural change in the boundary area, then the estimate $\hat g^{(k)}$ depends strongly on the value of $\Delta$. Hence we will use this idea as a diagnostic tool in order to see whether the selected bandwidth is susceptible to observations in the boundary area. The susceptibility of $\hat h$ to $\Delta$ is a signal, [...] not work well for the given data set. As examples, bandwidths $\hat h^l$ selected for the two time series Strom and IFOR with $h_0^1$, $p = 1$ and $3$ as well as $\Delta = 0.00, 0.02, \ldots, 0.10$ are given in Table 3. Note that if a $\Delta \ne 0$ is used, the corresponding formulae given in Sections 2 to 4 should be adapted.
From Table 3 we see that the selected bandwidths in the two unusual cases, i.e. the
series Strom with $p = 3$ and the series IFOR with $p = 1$, are very susceptible to boundary
observations. The time series Strom seems to have some outliers at the right boundary.
For different $\Delta$, the influence of these outliers on $\hat g^{(4)}$ is quite different. This influence
is however not so clear if $p = 1$ is used. Hence $p = 1$ should be chosen for smoothing
this time series. For the time series IFOR, we see that there is a generally increasing
trend until about the 145th observation. The trend thereafter is more complex than
before. This seems to be a structural change, which will be smoothed away if $p = 1$ with
a positive $\Delta$ is used. The selected bandwidth $\hat h_l$ with $\Delta = 0.10$ is about the same as
$\hat h_r$ given in Table 2. Note that if $\Delta$ changes from 0.06 to 0.08, the selected bandwidth is
almost doubled. The structure of this time series can however be well fitted if $p = 3$ is
used. For the other two time series, especially for the series CAPE, both $p = 1$ and
$p = 3$ perform well.

Table 3: $\hat h_l$ selected for Strom and IFOR with different $\Delta$

                        $\Delta$
Time series   p    0.00    0.02    0.04    0.06    0.08    0.10
Strom         1    0.160   0.161   0.163   0.163   0.163   0.163
              3    0.101   0.106   0.116   0.152   0.157   0.160
IFOR          1    0.113   0.118   0.128   0.141   0.259   0.256
              3    0.140   0.140   0.140   0.139   0.139   0.137
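The susceptibility visible in Table 3 can also be summarised numerically. The following Python sketch uses only the tabulated bandwidths; the two summary measures (`spread` and `largest_increase`) are illustrative choices that do not appear in the paper, and no threshold for "susceptible" is implied.

```python
# Bandwidths from Table 3, one list per (series, p), ordered by the
# boundary parameter values 0.00, 0.02, ..., 0.10.
table3 = {
    ("Strom", 1): [0.160, 0.161, 0.163, 0.163, 0.163, 0.163],
    ("Strom", 3): [0.101, 0.106, 0.116, 0.152, 0.157, 0.160],
    ("IFOR", 1): [0.113, 0.118, 0.128, 0.141, 0.259, 0.256],
    ("IFOR", 3): [0.140, 0.140, 0.140, 0.139, 0.139, 0.137],
}


def spread(h):
    """Ratio of the largest to the smallest selected bandwidth (illustrative)."""
    return max(h) / min(h)


def largest_increase(h):
    """Largest ratio between bandwidths at neighbouring grid values (illustrative)."""
    return max(b / a for a, b in zip(h, h[1:]))


summary = {
    key: (round(spread(h), 2), round(largest_increase(h), 2))
    for key, h in table3.items()
}
for key, (ratio, jump) in summary.items():
    print(key, ratio, jump)
```

For IFOR with p = 1 the largest neighbouring ratio is 0.259/0.141, which is the near doubling between 0.06 and 0.08 noted in the text, while both measures stay close to one for Strom with p = 1 and IFOR with p = 3.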
In the following we will give an example for subjective choice from more than one fixed
point obtained at the end of the procedure. Suppose that we would like to deal with
the time series IFOR using $p = 1$. Then two stable fixed points, i.e. $h_f^1 = \hat h_l = 0.113$
and $h_f^2 = \hat h_r = 0.264$ as given in Table 2, will be found. Smoothing results for IFOR with
$p = 1$ and these two bandwidths respectively are shown in Figure 4. We see that the
results with $\hat h_r$ are clearly oversmoothed due to the reason mentioned above. Hence, for
$p = 1$, $\hat h_l$ should be chosen as the optimal bandwidth. The smoothing results with $\hat h_r$ also
provide us with some useful information about this data set.
[Figure 4, two panels. Panel (a): "Decomposition results for IFOR with p=1 and h0=hmin". Panel (b): "Decomposition results for IFOR with p=1 and h0=hmax".]
Figure 4: Decomposition results for IFOR with $p = 1$ and different starting bandwidths.
(a): results with $h_0^1$. (b): results with $h_0^2$. The curves are drawn similarly as in Figure 2.
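The role of the two starting bandwidths can be illustrated with a toy fixed-point iteration. The update map in the following Python sketch is purely hypothetical: it is not the plug-in formula of this paper, but a simple contraction constructed so that it has two stable fixed points at the values 0.113 and 0.264 reported for IFOR with p = 1. The starting values 0.05 and 0.49 and the switching point 0.19 are likewise assumptions made for illustration.

```python
def iterate_to_fixed_point(update, h0, tol=1e-6, max_iter=200):
    """Repeat h <- update(h) until successive values differ by less than tol."""
    h = h0
    for _ in range(max_iter):
        h_new = update(h)
        if abs(h_new - h) < tol:
            return h_new
        h = h_new
    return h


def toy_update(h):
    """Toy map with two stable fixed points; NOT the paper's plug-in step."""
    target = 0.113 if h < 0.19 else 0.264  # hypothetical basins of attraction
    return target + 0.3 * (h - target)     # contraction towards the target


h_min, h_max = 0.05, 0.49                  # hypothetical starting bandwidths
h_left = iterate_to_fixed_point(toy_update, h_min)
h_right = iterate_to_fixed_point(toy_update, h_max)
print(round(h_left, 3), round(h_right, 3))
```

Starting from the small value the iteration settles at the left fixed point, and starting from the large value it settles at the right one. When the two runs disagree, as here, the choice between the fixed points has to be made by inspecting the corresponding smoothing results.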
6 Final remarks
In this paper an iterative plug-in algorithm for decomposing seasonal time series is developed.
Simulated and data examples show that the proposed bandwidth selector performs
well in practice. This proposal can also be applied to equidistant nonparametric regression
without seasonality. In order to investigate the practical performance of the proposal
in detail, a simulation study is required. This is however beyond the aim of this paper
and will be carried out elsewhere.
conference: "Should we choose a bandwidth subjectively or objectively?" Here we propose
to carry out the procedure at least twice, with $p = 1$ and $p = 3$. In cases when the results
with $p = 1$ and $p = 3$ are both satisfactory, we can use the results with either $p$, e.g. those
with $p = 3$. In this case the result may be considered to be objective. Sometimes, however,
we have to choose $p$ subjectively by means of our experience and a more detailed analysis,
as shown in the last section. Furthermore, if there exists more than one stable fixed point
for the $p$ we would like to use, a further subjective choice is also required. Note that both
the diagnosis at the boundary and the analysis of the smoothing results with a stable
fixed point corresponding to a local minimum will provide us with more useful information
about the structure of our data set. Hence, we recommend that an experienced data analyst
use the proposal here in different ways.
The proposed bandwidth selector is motivated by optimizing $\hat m = \hat g + \hat S$. Similarly,
we can develop an iterative plug-in algorithm for optimizing $\hat g$. Sometimes selection of
an optimal bandwidth for estimating $S$ is also of interest. However, it is senseless to develop
such a procedure under the assumption A3. If A3 holds, then a preferable procedure
is: 1. to estimate $g$ with a corresponding optimal bandwidth and 2. to estimate $S$ from
the residuals $y_i - \hat g(x_i)$ parametrically, i.e. with the seasonal means.
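The second step of this procedure, estimating the seasonal component from the residuals by seasonal means, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `seasonal_means` is hypothetical, the first observation is assumed to belong to season 0, the trend estimate is taken as given, and the centring of the means is an added convention that is not taken from the paper.

```python
import numpy as np


def seasonal_means(y, g_hat, s):
    """Estimate an exactly periodic S from the residuals y - g_hat.

    y     : observed series
    g_hat : estimated trend at the same time points (assumed given)
    s     : seasonal period
    """
    y = np.asarray(y, dtype=float)
    resid = y - np.asarray(g_hat, dtype=float)
    season = np.arange(len(y)) % s          # assumes observation 0 is season 0
    means = np.array([resid[season == k].mean() for k in range(s)])
    means -= means.mean()                   # added convention: level stays in g
    return means[season]


# Noise-free example with a linear trend and a known seasonal pattern.
n, s = 48, 4
t = np.arange(n)
g = 10.0 + 0.1 * t
S = np.tile([1.0, -1.0, 2.0, -2.0], n // s)
y = g + S
S_hat = seasonal_means(y, g, s)
print(np.allclose(S_hat, S))
```

In this noise-free example with the true trend plugged in, the seasonal means recover the periodic pattern, since the pattern sums to zero over one period.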
In model (1) it is assumed for simplicity that the errors are iid. This is an impractical
assumption, in particular for analyzing a time series. Recent works on iterative plug-in
bandwidth selection in nonparametric regression with dependent errors (see Herrmann et
al., 1992, Ray and Tsay, 1997 and Beran and Feng, 2002a,b) show that it is not difficult
to develop a data-driven procedure for model (1) with a general stationary time series
error process.
Acknowledgements: This work was finished under the advice of Prof. Jan
Beran, University of Konstanz, Germany, and was financially supported by the Center
of Finance and Econometrics (CoFE) at the University of Konstanz. Some basic results
used here are obtained in the author's PhD thesis, which was finished under the advice
of Prof. Siegfried Heiler, University of Konstanz. The data for the time series CAPE and
Hsales are downloaded from the Time Series Data Library. We would like to thank Prof.
Rob J. Hyndman, Monash University, for making these data publicly available.
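A sketched proof of Lemma 1: The proof of this lemma is based on some desirable
standardizing and orthogonal finite sample properties of $\hat g$ and $\hat S$. These properties are
quantified by the following properties of $w_1$ and $w_2$.
\[
\begin{aligned}
\text{a:}\quad & \sum_{i=1}^{n} w_{1i}(x_i - x_t)^j =
  \begin{cases} 1, & j = 0, \\ 0, & 1 \le j \le p; \end{cases} \\
\text{a}'\text{:}\quad & \sum_{i=1}^{n} w_{1i}\cos(\lambda_j(i - t)) = 0, \quad
  \sum_{i=1}^{n} w_{1i}\sin(\lambda_j(i - t)) = 0, \quad j = 1, \ldots, q; \\
\text{b:}\quad & \sum_{i=1}^{n} w_{2i}(x_i - x_t)^j = 0, \quad 0 \le j \le p; \\
\text{b}'\text{:}\quad & \sum_{i=1}^{n} w_{2i}\cos(\lambda_j(i - t)) = 1, \quad
  \sum_{i=1}^{n} w_{2i}\sin(\lambda_j(i - t)) = 0, \quad j = 1, \ldots, q.
\end{aligned}
\tag{A.1}
\]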
Note that $w_1' = (e_1', 0')(X'KX)^{-1}X'K$ and $w_2' = (0', \iota_s')(X'KX)^{-1}X'K$. Hence we have
$w_1'X = (e_1', 0')$ and $w_2'X = (0', \iota_s')$. Observing the definition of $e_1$ and $\iota_s$ we obtain the
results in (A.1). Note that (A.1) ensures that $\hat g$, $\hat S$ and hence $\hat m$ are exactly unbiased
if $m$ is the sum of a polynomial of order no larger than $p$ and an exactly periodic
component $S$ with period $s$.
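The moment conditions in (A.1) can be checked numerically. The following Python sketch builds the two weight vectors for one interior point and evaluates some of the sums. Several ingredients are assumptions made for illustration and are not taken from this section: the bisquare kernel, the seasonal frequencies 2*pi*j/s, the ordering of the design columns (polynomial terms first, then cosine and sine pairs), and the selection vectors `c1` and `c2`. The polynomial columns are scaled by the bandwidth for numerical stability, which does not change the resulting weights.

```python
import numpy as np


def local_weights(n, t, h, p=1, s=12, q=2):
    """Return x and the weight vectors w1 (trend) and w2 (seasonal) at x_t."""
    i = np.arange(1, n + 1)
    x = (i - 0.5) / n
    u = (x - x[t - 1]) / h                               # scaled distance to x_t
    k = np.where(np.abs(u) <= 1, (1 - u**2) ** 2, 0.0)   # bisquare kernel (assumed)
    lam = 2 * np.pi * np.arange(1, q + 1) / s            # assumed frequencies
    poly = np.vander(u, p + 1, increasing=True)          # columns u^0, ..., u^p
    trig = np.column_stack(
        [f(l * (i - t)) for l in lam for f in (np.cos, np.sin)]
    )
    X = np.hstack([poly, trig])
    XtK = X.T * k                                        # X'K with K = diag(k)
    A = np.linalg.solve(XtK @ X, XtK)                    # (X'KX)^{-1} X'K
    c1 = np.zeros(X.shape[1])
    c1[0] = 1.0                                          # selects the intercept
    c2 = np.zeros(X.shape[1])
    c2[p + 1::2] = 1.0                                   # selects the cosine terms
    return x, c1 @ A, c2 @ A


n, t, h, p, s, q = 240, 100, 0.2, 1, 12, 2
x, w1, w2 = local_weights(n, t, h, p, s, q)
i = np.arange(1, n + 1)
d = x - x[t - 1]
lam1 = 2 * np.pi / s
checks = {
    "a, j=0": w1.sum(),                          # condition a with j = 0
    "a, j=1": w1 @ d,                            # condition a with j = 1
    "a', cos": w1 @ np.cos(lam1 * (i - t)),      # condition a'
    "b, j=0": w2.sum(),                          # condition b with j = 0
    "b', cos": w2 @ np.cos(lam1 * (i - t)),      # condition b'
    "b', sin": w2 @ np.sin(lam1 * (i - t)),      # condition b'
}
for name, value in checks.items():
    print(name, round(float(value), 10))
```

Up to rounding error, the sums for w1 reproduce the values 1 and 0 in a and a', and those for w2 reproduce the values 0 and 1 in b and b', because both weight vectors are rows of $(X'KX)^{-1}X'K$ and are therefore exactly orthogonal to all but one group of design columns.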
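1. Under A2, A3 we have, in the neighbourhood of $x_t$,
\[
g(x) = \sum_{j=0}^{p} \frac{g^{(j)}(x_t)}{j!}(x - x_t)^j
     + \frac{g^{(k)}(x_t + \theta(x - x_t))}{k!}(x - x_t)^k, \tag{A.2}
\]
where $0 < \theta < 1$, and
\[
S(x_i) = \sum_{j=1}^{q} \left( \beta_{2j}\cos\lambda_j(i - t) + [\beta_{3j}\sin\lambda_j(i - t)] \right). \tag{A.3}
\]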
This leads to $S(x_t) = \sum_{j=1}^{q}\beta_{2j}$. Following a$'$, we have
\[
B[\hat g(x_t)] = \sum_{i=1}^{n} w_{1i}[g(x_i) + S(x_i)] - g(x_t)
              = \sum_{i=1}^{n} w_{1i}\,g(x_i) - g(x_t), \tag{A.4}
\]
since $\sum_{i=1}^{n} w_{1i}S(x_i) = 0$. For $B(\hat S)$ we have
\[
B[\hat S(x_t)] = \sum_{i=1}^{n} w_{2i}[g(x_i) + S(x_i)] - S(x_t)
              = \sum_{i=1}^{n} w_{2i}\,g(x_i), \tag{A.5}
\]
since $\sum_{i=1}^{n} w_{2i}S(x_i) = \sum_{j=1}^{q}\beta_{2j} = S(x_t)$ following b$'$ and (A.3). Property b results in
\[
\sum_{i=1}^{n} w_{2i}\left\{ \sum_{j=0}^{p} \frac{g^{(j)}(x_t)}{j!}(x_i - x_t)^j \right\} = 0. \tag{A.6}
\]
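Hence
\[
\begin{aligned}
B[\hat S(x_t)] &= \sum_{i=1}^{n} w_{2i}\,\frac{g^{(k)}(x_t + \theta(x_i - x_t))}{k!}(x_i - x_t)^k \\
 &\doteq \frac{g^{(k)}(x_t)}{k!}\,h^k \sum_{i=1}^{n} w_{2i}\left(\frac{x_i - x_t}{h}\right)^k = o(h^k),
\end{aligned}
\tag{A.7}
\]
where the last equation is due to the fact
\[
\sum_{i=1}^{n} w_{2i}\left(\frac{x_i - x_t}{h}\right)^{k'} = o(1), \quad \text{for any } k' \ge 0. \tag{A.8}
\]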
Equation (A.8) holds, since the weights $w_{2i}$ are asymptotically periodic (see (7)).
This shows that $B(\hat S)$ is only due to the $k$-th order term in the Taylor expansion of $g$,
and the contribution of this term to $B(\hat S)$ is negligible compared with $B(\hat g)$. We obtain
\[
B[\hat S(x_t)] = o(B[\hat g(x_t)])
\]
and
\[
B[\hat m(x_t)] \doteq B[\hat g(x_t)].
\]
Observing that $B(\hat g)$ is the same as for a local polynomial fitting of order $p$, we obtain (9).
2. A detailed proof of (10) may be found in Feng (1999), where it is shown in particular
that the two weighting systems $w_1$ and $w_2$ are asymptotically orthogonal in the sense that
$\sum_{i=1}^{n} w_{1i}w_{2i} = o(\sum_{i=1}^{n} w_{1i}^2) = o(\sum_{i=1}^{n} w_{2i}^2)$. This follows from (A.8), since $K_p(u)$ is a polynomial
kernel.
3. Formula (11) follows from (9) and (10). Lemma 1 is proved. $\Diamond$
In the following, it will be explained why Lemma 2 and Theorem 1 should hold.
Detailed proofs are omitted, since these results are similar to those in nonparametric
regression without seasonality.
A sketched proof of Lemma 2: Case 1. Note that the two terms on the right hand side
of (18) are due to the contribution of $B(\hat g^{(k)})$ and $\mathrm{var}(\hat g^{(k)})$ (see e.g. the proof of Proposition
[...] $h_{I,j}$ [...] $h_A$ [...]
this case $B(\hat g^{(k)})$ is negligible and $\hat I$ is dominated by $\mathrm{var}(\hat g^{(k)})$, which tends to infinity as
$n \to \infty$. Observing that $w_i^k = O[(n h_{I,j}^{k+1})^{-1}]$, we have $\mathrm{var}(\hat g^{(k)}) = O(n^{-1} h_{I,j}^{-(2k+1)})$ and hence
$\hat I = O_p(n^{-1} h_{I,j}^{-(2k+1)})$. Inserting this in the formula for $h_j$ we obtain $h_j = O_p(h_{I,j})$, i.e. in
this case $h_{j-1}$ is inflated to a bandwidth of order $O_p(h_{I,j})$. Step 1 is proved. Results in
Steps 2 and 3 are clear.
Case 2. Note that Step 1$'$ will not appear if $h_0$ is of a larger order than $h_A$ such
that $h_0 \to 0$, since now A$'$1 is satisfied in the first iteration. In this case $h_1$ is already
consistent and only Step 2$'$ will appear. Step 1$'$ occurs only if $0 < h_0 < 0.5$ is taken to
be a constant. Now $B(\hat I_1)$ is a constant and hence $\hat I_1 = O_p(1) = O_p(I)$. Now we obtain
$h_1 = O_p(h_A)$, which is of the correct order but not yet consistent. The process will then
be changed into Step 2$'$ in the second iteration. Lemma 2 is proved. $\Diamond$
Remark A1. Theoretically, if the procedure is started with an $h_0$ such that $h_A = o(h_0)$
and $h_0 \to 0$ as $n \to \infty$, then $h_1$ will already be consistent. Hence such a starting bandwidth
is asymptotically preferable. Now the asymptotic behaviour of an iterative
plug-in bandwidth selector is easy to understand. If the sample size is small and the data
have a special structure, a starting bandwidth that is too large, e.g. $h_0^2 = h_{\max}$, may perhaps lead
to $\hat I_1 \doteq 0$. Then $h_j$ could not be deflated to the optimal bandwidth. In the applications
we have not yet found such a phenomenon. If this occurs, it is no problem for our proposal,
because it will be discovered by starting with the other bandwidth $h_0^1$.
A sketched proof of Theorem 1: The proof of Theorem 1 can be carried out based
on a formula given in the appendix in Beran and Feng (2002a). See also Beran and Feng
(2002b). They showed that, when convergence is reached, the rate of convergence of an
iterative plug-in bandwidth selector is quantified by:
\[
(\hat h - h_A)/h_A \doteq -\frac{1}{2k + 1 - 2\delta}\, I^{-1}(\hat I - I). \tag{A.9}
\]
Equation (A.9) shows that $B(\hat h)$ and $\mathrm{var}(\hat h)$ at the end of the proposed procedure are of
the corresponding orders as those of $\hat I$. $\mathrm{var}(\hat h)$ is dominated by the second term in (19)
of order $O(n^{-1} h_I^{-4k-1})$, where $h_I$ denotes the bandwidth for estimating $I$ used at the end
of the procedure, which is of order $O_p(n^{-1/7})$ for $k = 2$ and $O_p(n^{-1/13})$ for $k = 4$. In both
cases, i.e. $k = 2$ with $\alpha_3$ and $k = 4$ with $\alpha_2$, the order of the second term on the right hand
side of (18) is no larger than that of the first. Hence we have $B(\hat h) = O[B(\hat I)] = O(h_I^2)$.
Straightforward calculation leads to the results of Theorem 1. $\Diamond$
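As a worked instance of the last step, inserting the stated orders of $h_I$ into $B(\hat h) = O(h_I^2)$ gives the order of the bias term for the two cases. This only spells out the arithmetic; the complete statement of Theorem 1 is the one given in the main text.

```latex
\begin{align*}
k = 2:\quad & h_I = O_p\!\left(n^{-1/7}\right)
  \;\Rightarrow\; h_I^2 = O_p\!\left(n^{-2/7}\right), \\
k = 4:\quad & h_I = O_p\!\left(n^{-1/13}\right)
  \;\Rightarrow\; h_I^2 = O_p\!\left(n^{-2/13}\right).
\end{align*}
```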
References

Beran, J. (1999). SEMIFAR models - A semiparametric framework for modelling trends,
long range dependence and nonstationarity. Discussion Paper No. 99/16, Center of
Finance and Econometrics (CoFE), University of Konstanz.
Beran, J. and Feng, Y. (2002a). Local polynomial fitting with long-memory, short-memory
and antipersistent errors. The Annals of the Institute of Statistical Mathematics
(in press).
Beran, J. and Feng, Y. (2002b). Iterative plug-in algorithms for SEMIFAR models -
definition, convergence and asymptotic properties. To appear in Journal of Computational
and Graphical Statistics.
Beran, J., Feng, Y. and Heiler, S. (2000). Modifying the double smoothing bandwidth
selection in nonparametric regression. Discussion Paper, CoFE, No. 00/37, University
of Konstanz. Submitted.
Cleveland, W.S. (1979). Robust locally weighted regression and smoothing scatterplots.
J. Amer. Statist. Assoc. 74, No. 368, 829–836.
Fan, J. and Gijbels, I. (1996). Local Polynomial Modeling and its Applications. Chapman
& Hall, London.
Feng, Y. (1999). Kernel- and Locally Weighted Regression - with Application to Time
Series Decomposition. Verlag für Wissenschaft und Forschung, Berlin.
Feng, Y. and Heiler, S. (2000). Eine robuste datengesteuerte Version des Berliner
Verfahrens (in German). Wirtschaft und Statistik, 10/2000, 786–795.
Gasser, T., Kneip, A. and Köhler, W. (1991). A flexible and fast method for automatic
smoothing. J. Amer. Statist. Assoc. 86, 643–652.
Gasser, T., Müller, H.G. and Mammitzsch, V. (1985). Kernels for nonparametric curve
estimation. J. Roy. Statist. Soc. Ser. B 47, 238–252.
Gasser, T., Sroka, L. and Jennen-Steinmetz, C. (1986). Residual variance and residual
pattern in nonlinear regression. Biometrika 73, 625–33.
smoothing parameters from their optimum? (with discussion) J. Amer. Statist.
Assoc. 83, 86–99.
Hall, P., Kay, J.W. and Titterington, D.M. (1990). Asymptotically optimal difference-based
estimation of variance in nonparametric regression. Biometrika 77, 521–8.
Heiler, S. (1966). Analyse der Struktur wirtschaftlicher Prozesse durch Zerlegung von
Zeitreihen. Dissertation, Tübingen.
Heiler, S. (1970). Theoretische Grundlagen des "Berliner Verfahrens". In Wetzel, W.
(Ed.): Neuere Entwicklungen auf dem Gebiet der Zeitreihenanalyse, Sonderheft 1
zum Allg. Statistischen Archiv, 67–93.
Heiler, S. and Feng, Y. (1996). Datengesteuerte Zerlegung saisonaler Zeitreihen.
IFO-Studien 3/1996, 337–369.
Heiler, S. and Feng, Y. (2000). Data-driven decomposition of seasonal time series.
Journal of Statistical Planning and Inference 91, 351–363.
Heiler, S. and Michels, P. (1994). Deskriptive und Explorative Datenanalyse.
Oldenbourg-Verlag, München.
Herrmann, E. and Gasser, T. (1994). Iterative plug-in algorithm for bandwidth selection
in kernel regression estimation. Preprint, Darmstadt Institute of Technology and
University of Zurich.
Herrmann, E., Gasser, T. and Kneip, A. (1992). Choice of bandwidth for kernel regression
when residuals are correlated. Biometrika 79, 783–795.
Makridakis, Wheelwright and Hyndman (1998). Forecasting: Methods and Applications
(3rd edition). John Wiley, New York.
Müller, H.G. (1988). Nonparametric Analysis of Longitudinal Data. Springer-Verlag,
Berlin.
Ray, B.K. and Tsay, R.S. (1997). Bandwidth selection for kernel regression with long-range
dependence. Biometrika 84, 791–802.
1215–30.
Ruppert, D., Sheather, S.J. and Wand, M.P. (1995). An effective bandwidth selector for
local least squares regression. J. Amer. Statist. Assoc. 90, 1257–1270.
Ruppert, D. and Wand, M.P. (1994). Multivariate locally weighted least squares regression.
Ann. Statist. 22, 1346–1370.
Schlittgen, R. and Streitberg, B. (1991). Zeitreihenanalyse. R. Oldenbourg, München.
Yuanhua Feng
Department of Mathematics and Statistics
University of Konstanz
D-78457 Konstanz, Germany
Email: yuanhua.feng@uni-konstanz.de
Tel. +49-7531-88-7363
Fax. +49-7531-88-2407