
In this paper, we study in detail the properties of the Wishart Autoregressive model of order one, WAR(1), introduced by Gourieroux et al. (2009) to capture the dynamics of square, symmetric matrices, such as series of realized variance-covariance matrices, on the basis of the sample variance-covariance matrix distribution, known in the literature as the Wishart distribution. In particular, we analyze the properties of the estimator of the Wishart degrees of freedom (d.f.) under different stationarity conditions on the latent vector autoregressive (VAR) processes and under different distributional settings of the covariance matrix itself.

Non-degenerate Wishart distributions are characterized by d.f. larger than the dimension of the process. The estimation results from fitting the WAR(1) model to different series of realized covariance matrices reveal that the estimated d.f. are much smaller than the dimension of the process, indicating a degenerate Wishart process for the realized covariance matrices. These results diverge from the ones reported by Gourieroux et al. (2009), who estimated sound values (larger than the dimension of the process) based on a one-month sample of data.

One possible explanation for this discrepancy lies in the inconsistency between the stationarity assumption imposed by Gourieroux et al. (2009) to identify the model and the type of data used to estimate it. As a result, we relax the stationarity assumption on the latent VAR processes and show that under nonstationary (cointegration) conditions, the estimated Wishart d.f. exhibit a downward bias and converge in probability to a value that might be smaller than the dimension of the process. Although mathematically sound, this theory becomes questionable when applied to realized (co)variance series: although they exhibit highly persistent dynamics with large probability mass on extreme observations, there is no empirical or theoretical evidence that realized covariance processes are nonstationary.

A more plausible explanation for estimating Wishart d.f. smaller than the dimension of the process when the model is applied to daily realized covariance matrices is given by the divergence between the distributional assumption on the underlying covariance process comprised in the WAR(1) specification and the statistical properties of extreme observations typically detected in realized covariance series. Although in general minor, this divergence has in this case a dramatic effect on the estimated parameters of the WAR(1) model: the d.f. estimates based on samples which include extreme events describe degenerate WAR(1) processes of covariance matrices. Our empirical results show that reasonable estimates might be obtained only from short samples or samples stemming from calm financial periods.

Moreover, the WAR(1) specification assumes that portfolio realized volatilities are Gamma distributed, which implies that their coefficient of variation is a constant.

But based on our empirical findings and numerous previous results, the distribution of realized volatilities is closer to the log-normal specification, with a coefficient of variation that depends not only on the mean but also on the variance of the process, which in this case is time-varying (the volatility-of-volatility effect described by Corsi et al. (2008)). Thus, during periods of high and clustered volatility, estimating a WAR(1) process on series which are in fact approximately log-normally distributed generates unsound results which are difficult to interpret: the estimates of the d.f. are smaller than the dimension of the covariance process and consequently describe a degenerate Wishart distribution for the underlying realized covariance matrices. Although in these cases it loses its interpretation as a WAR(1) model, the corresponding autoregressive specification reveals high potential in forecasting multivariate volatilities (Chiriac & Voev (2009)).
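The constant-versus-varying coefficient of variation can be checked directly from the closed-form moments of the two distributions. The following sketch (plain Python; all parameter values are purely illustrative) shows that rescaling a Gamma variable leaves its CV untouched, while the log-normal CV moves with the log-scale volatility parameter:

```python
import math

def cv_gamma(shape, scale):
    # Gamma(k, theta): mean = k*theta, var = k*theta^2  ->  CV = 1/sqrt(k)
    mean = shape * scale
    std = math.sqrt(shape) * scale
    return std / mean

def cv_lognormal(mu, sigma):
    # LogN(mu, sigma^2): mean = exp(mu + sigma^2/2),
    # var = (exp(sigma^2) - 1) * exp(2*mu + sigma^2)  ->  CV = sqrt(exp(sigma^2) - 1)
    return math.sqrt(math.exp(sigma ** 2) - 1.0)

# Rescaling the Gamma variable (e.g., calm vs. turbulent regime) leaves the CV unchanged:
print(cv_gamma(4.0, 0.1), cv_gamma(4.0, 10.0))   # both 0.5
# ...while the log-normal CV moves with the volatility-of-volatility parameter sigma:
print(cv_lognormal(0.0, 0.3), cv_lognormal(0.0, 1.0))  # ~0.307 vs ~1.311
```

This is the mechanism behind the text's argument: a time-varying volatility of volatility is compatible with the log-normal shape but not with a Gamma shape at fixed d.f.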

Bibliography

Andersen, T. G., Bollerslev, T., Diebold, F. X. & Ebens, H. (2001), ‘The distribution of realized stock return volatility’, Journal of Financial Economics 61, 43–76.

Andersen, T. G., Bollerslev, T., Diebold, F. X. & Labys, P. (2001), ‘The distribution of realized exchange rate volatility’, Journal of the American Statistical Association 96, 42–55.

Asai, M., McAleer, M. & Yu, J. (2006), ‘Multivariate stochastic volatility: A review’, Econometric Reviews 25, 145–175.

Bauer, G. H. & Vorkink, K. (2007), Multivariate realized stock market volatility. Working Paper 2007-20, Bank of Canada.

Bauwens, L., Laurent, S. & Rombouts, J. (2006), ‘Multivariate GARCH models: A survey’, Journal of Applied Econometrics 21, 79–109.

Bonato, M. (2009), Estimating the degrees of freedom of the realized volatility Wishart autoregressive model. Working paper.

Bonato, M., Caporin, M. & Ranaldo, A. (2009), Forecasting realized (co)variances with a block Wishart autoregressive model. Working paper.

Chiriac, R. & Voev, V. (2009), ‘Modelling and forecasting multivariate realized volatility’, Journal of Applied Econometrics, forthcoming.

Corsi, F., Kretschmer, U., Mittnik, S. & Pigorsch, C. (2008), ‘Volatility of realized volatility’, Econometric Reviews 27, 46–78.

Gourieroux, C., Jasiak, J. & Sufana, R. (2009), ‘The Wishart autoregressive process of multivariate stochastic volatility’, Journal of Econometrics 150, 167–181.

Hamilton, J. D. (1994), Time Series Analysis, Princeton University Press, Princeton, New Jersey.

McAleer, M. & Medeiros, M. C. (2008), ‘Realized volatility: A review’, Econometric Reviews 27, 10–45.

Taylor, S. J. (1986), Modelling Financial Time Series, Wiley, New York.

Appendix A.1: Proofs

which leads to the MM estimator of K from Equation (1.6).

A.1.2 Proof of Proposition 1.2.2

Equation (A.1.4) can be further written as:

K = 2(E[α′Y_tα])² / V[α′Y_tα].

From Equation (A.1.5) we can derive K as follows:

K = α′E[Y_t]α / (α′Σ(∞)α) = E[S_t],

where z_{k,t} ≡ α′x_{k,t}/√(α′Σ(∞)α) ~ iid N(0, 1) and S_t ≡ ∑_{k=1}^{K} z²_{k,t} ∼ χ²(K). In this format, the MM estimator from Equation (1.6) can be written as:

K̂ = (1/T) ∑_{t=1}^{T} S_t = S̄. (A.1.6)
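In the stationary setting of this proof, the MM estimator is therefore just the sample mean of χ²(K) draws. A minimal Monte Carlo sketch (numpy assumed; the values of K and T are illustrative, and serial correlation of the latent draws is switched off for simplicity):

```python
import numpy as np

rng = np.random.default_rng(0)
K, T = 6, 20000  # illustrative degrees of freedom and sample size

# Standardized latent draws z_{k,t} ~ iid N(0,1); S_t = sum_k z_{k,t}^2 ~ chi^2(K)
z = rng.standard_normal((K, T))
S = (z ** 2).sum(axis=0)

K_hat = S.mean()  # MM estimator (A.1.6): the sample mean of S_t
print(K_hat)      # close to K = 6
```

With stationary draws the estimator concentrates around the true K, which is the consistency result the proof establishes.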

Stationarity of the x_{k,t} processes implies that S_t is a stationary stochastic process. Given that the x_{k,t} are serially correlated, the S_t are also serially correlated. Therefore, in order to derive the distribution of S̄ (i.e., of K̂), we apply the Central Limit Theorem (CLT) for stationary stochastic processes (see Hamilton (1994), p. 195), which states that:

√T(S̄ − E[S_t]) →d N(0, ∑_{j=−∞}^{∞} γ_j),

where γ_j denotes the j-th order autocovariance of S_t.

Then x_{k,t} can be recursively written as follows:

x_{k,t} = ε_{k,t} + ε_{k,t−1} + … + ε_{k,1}. (A.1.10)

Given that z_{k,t} follows a random walk, for which Hamilton (1994) shows that

(1/T²) ∑_{t=1}^{T} z²_{k,t} →d ∫₀¹ [W(s)]² ds,

where W(·) stands for the standard Brownian motion, we derive the asymptotic distribution of K̂ to be:

K̂ →d K ∫₀¹ [W(s)]² ds. (A.1.14)
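The limit ∫₀¹ [W(s)]² ds is a random variable with mean 1/2, so in the unit-root case K̂ fluctuates around a value well below K, which is the downward bias discussed in the main text. A small Monte Carlo sketch of the underlying Hamilton result (numpy assumed; T and the number of replications are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
T, reps = 500, 2000

# z_t = z_{t-1} + eps_t, eps_t ~ N(0,1): driftless random walks started at 0.
eps = rng.standard_normal((reps, T))
z = eps.cumsum(axis=1)

# Functional-CLT result: T^{-2} sum_t z_t^2 ->d int_0^1 W(s)^2 ds,
# a random limit with mean 1/2 -- not a constant, so no consistency.
stat = (z ** 2).sum(axis=1) / T ** 2
print(stat.mean())  # close to 0.5 = E[ int_0^1 W(s)^2 ds ]
```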

A.1.4 Proof of Proposition 1.2.4

Given x_{k,t} from Equation (1.21), we can derive the process Y_t to be:

Y_t = (I_n − H)Y_{t−1}(I_n − H)′ + A_t + A′_t + ∑_{k=1}^{K} ε_{k,t}ε′_{k,t}. (A.1.15)

Pre- and post-multiplying both sides of Equation (A.1.15) by C′ and C respectively, and using the cointegration property C′H = 0, we obtain the following expression:

C′Y_tC = C′Y_{t−1}C + C′A_tC + C′A′_tC + C′(∑_{k=1}^{K} ε_{k,t}ε′_{k,t})C. (A.1.16)

In Equation (A.1.16), A_t can be expanded to:

A_t = ∑_{k=1}^{K} (I_n − H)x_{k,t−1}ε′_{k,t},

and the conditional expectation of C′Y_tC is given by:

E[C′Y_tC | ℑ_{t−1}] = C′Y_{t−1}C + C′E[A_t | ℑ_{t−1}]C + C′E[A′_t | ℑ_{t−1}]C + C′E[∑_{k=1}^{K} ε_{k,t}ε′_{k,t} | ℑ_{t−1}]C = C′Y_{t−1}C + KC′ΣC. (A.1.19)

From Equation (A.1.19), we observe that the process C′Y_tC is a matrix random walk with drift. Consequently, the process C′Y_tC can be written as:

C′Y_tC = C′Y_{t−1}C + KC′ΣC + ν_t, (A.1.20)

where ν_t is a heteroscedastic error term of dimension r × r with zero conditional mean.

Rewriting Equation (A.1.20) recursively, we derive the unconditional expectation of C′Y_tC to be:

E[C′Y_tC] = tKC′ΣC. (A.1.21)

A.1.5 Proof of Proposition 1.2.5

Pre-multiplying Equation (1.21) by C′, we get:

u_{k,t} ≡ C′x_{k,t} = C′(I_n − H)x_{k,t−1} + C′ε_{k,t} = C′x_{k,t−1} − C′Hx_{k,t−1} + C′ε_{k,t}
= C′x_{k,t−1} + C′ε_{k,t} (since C′H = 0)
= u_{k,t−1} + η_{k,t}, (A.1.22)

where η_{k,t} ~ iid N(0, C′ΣC). Denote Ω ≡ C′ΣC. The process u_{k,t} is a random walk, for which V[u_{k,t}] = C′V[x_{k,t}]C = C′(tΣ)C = tC′ΣC = tΩ.

From Equation (A.1.21) and Equation (1.17) we deduce that:

E[C′Y_tC] = KC′Σ(∞)C = KtΩ. (A.1.23)

Similar to Equation (A.1.5), we derive K to be given by:

K = α′E[C′Y_tC]α / (tα′Ωα).

Similar to the unit root case, we derive the asymptotic distribution of K̂ to be:

K̂ →d K ∫₀¹ [W(s)]² ds. (A.1.27)

A.1.6 Non-linear least-squares method of estimating a WAR(1) model

Starting from the conditional mean of Y_t given in Equation (1.8), Gourieroux et al. (2009) write the WAR(1) process as a linear autoregressive process of order 1:

Y_{t+1} = MY_tM′ + KΣ + ϖ_{t+1}, (A.1.28)

where ϖ_t is an (n × n) matrix of stochastic errors with zero conditional mean and conditional heteroskedasticity. Gourieroux et al. (2009) show that this representation lends itself to a two-step estimation procedure, which provides consistent estimators of the model.
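It may help to see the WAR(1) process generated from its latent VAR(1) representation and checked against its first unconditional moment E[Y_t] = KΣ(∞). The sketch below (numpy assumed; n, K, M and Σ are illustrative choices, not values from the paper) simulates K independent latent processes and compares the sample mean of Y_t with KΣ(∞), where Σ(∞) is obtained by iterating the fixed point Σ(∞) = MΣ(∞)M′ + Σ:

```python
import numpy as np

rng = np.random.default_rng(2)
n, K, T = 2, 5, 40000          # dimension, degrees of freedom, sample size (illustrative)
M = np.array([[0.7, 0.1],
              [0.0, 0.5]])      # stationary latent autoregressive matrix (eigenvalues < 1)
Sigma = np.array([[1.0, 0.3],
                  [0.3, 0.8]])  # innovation covariance

# Simulate K independent latent VAR(1) processes and build Y_t = sum_k x_{k,t} x_{k,t}'
L = np.linalg.cholesky(Sigma)
x = np.zeros((K, n))            # rows hold the K latent vectors
Y_mean = np.zeros((n, n))
for t in range(T):
    x = x @ M.T + rng.standard_normal((K, n)) @ L.T
    Y_mean += x.T @ x / T       # running average of Y_t

# Stationary solution of Sigma_inf = M Sigma_inf M' + Sigma (fixed-point iteration)
Sigma_inf = Sigma.copy()
for _ in range(200):
    Sigma_inf = M @ Sigma_inf @ M.T + Sigma

print(Y_mean)          # sample mean of Y_t
print(K * Sigma_inf)   # theoretical unconditional mean E[Y_t] = K * Sigma(infinity)
```

The two printed matrices should agree up to simulation noise, which is the moment condition the MM estimator of K exploits.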

Step 1: The parameter matrices M and Σ̃ ≡ KΣ are estimated from the first-order conditional moment, which coincides with nonlinear least-squares on the series of positive definite matrices. The latent autoregressive matrix M is identifiable up to its sign. The parameters K and Σ are identified up to a scale factor.

(M̂, Σ̂) = argmin_{M, Σ̃} ∑_{t=2}^{T} ‖vech(Y_t) − vech(MY_{t−1}M′ + Σ̃)‖², (A.1.29)

where ‖·‖ represents the Euclidean norm and Σ̂ denotes the estimate of the product Σ̃ = KΣ.

Step 2: The parameters K and Σ are obtained from the second-order moments, under the assumption that x_{k,t}, and respectively Y_t, are stationary, as follows:

1. The stationary (marginal) distribution of Y_t is the centered Wishart distribution, denoted by W(K, 0, Σ(∞)), where Σ(∞) is defined in Equation (1.10).

Once M and Σ of dimension n × n have been estimated from the first-order method of moments on the series of square matrices Y_t, the estimator of Σ(∞) is derived from Equation (1.9) by applying the vech operator, which stacks the lower-triangular components of the symmetric matrices Σ̂(∞), M̂Σ̂(∞)M̂′ and Σ̂ into vectors of dimension n(n+1)/2, to both sides of the equation. Then Equation (1.10) can be written as:

vech(Σ̂(∞)) = vech(M̂Σ̂(∞)M̂′) + vech(Σ̂)
= M̂* vech(Σ̂(∞)) + vech(Σ̂), (A.1.30)

where M̂* is the matrix of dimension n(n+1)/2 × n(n+1)/2 satisfying vech(M̂XM̂′) = M̂* vech(X) for any symmetric matrix X. From Equation (A.1.30), we can derive the expression of vech(Σ̂(∞)) and, automatically by reshaping, the matrix Σ̂(∞):

(I_{n(n+1)/2} − M̂*) vech(Σ̂(∞)) = vech(Σ̂), (A.1.31)

vech(Σ̂(∞)) = (I_{n(n+1)/2} − M̂*)⁻¹ vech(Σ̂). (A.1.32)

2. By replacing Σ̂(∞) in equations (A.1.2) and (A.1.3), and given α of dimension n × 1, we derive the estimator of K to be given by:

K̂(α) = 2(α′Σ̂(∞)α)² / V̂[α′Y_tα] = 2(E[α′Y_tα])² / V̂[α′Y_tα], (A.1.33)

which turns out to be identical to the one from Equation (1.6). Based on the results derived in Section 1.2.1, this estimator of K is consistent and asymptotically normal with asymptotic variance ∑_{j=−∞}^{∞} γ_j.

3. A consistent estimator of Σ, which depends on the portfolio allocation, is:

Σ̂(α) = Σ̂ / K̂(α). (A.1.34)
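The vech-based solve in (A.1.30)–(A.1.32) can be implemented directly by building the n(n+1)/2-dimensional transform M̂* column by column from basis matrices. A sketch under hypothetical first-step estimates M̂ and Σ̂ (numpy assumed; the numerical values are invented for illustration):

```python
import numpy as np

def vech(A):
    # Stack the lower-triangular part of a symmetric matrix column-wise.
    n = A.shape[0]
    return np.concatenate([A[i:, i] for i in range(n)])

def unvech(v, n):
    # Inverse of vech for symmetric matrices.
    A = np.zeros((n, n))
    idx = 0
    for i in range(n):
        A[i:, i] = v[idx:idx + n - i]
        idx += n - i
    return A + np.tril(A, -1).T   # mirror the strictly lower part

def vech_transform(M):
    # Matrix M* of size n(n+1)/2 x n(n+1)/2 with vech(M X M') = M* vech(X)
    # for symmetric X, built column by column from basis matrices.
    n = M.shape[0]
    m = n * (n + 1) // 2
    cols = []
    for j in range(m):
        e = np.zeros(m)
        e[j] = 1.0
        cols.append(vech(M @ unvech(e, n) @ M.T))
    return np.column_stack(cols)

# Hypothetical first-step estimates (n = 2):
M_hat = np.array([[0.6, 0.2], [0.1, 0.5]])
Sigma_hat = np.array([[1.0, 0.2], [0.2, 0.7]])

# Equation (A.1.32): vech(Sigma_inf) = (I - M*)^{-1} vech(Sigma)
Ms = vech_transform(M_hat)
v = np.linalg.solve(np.eye(Ms.shape[0]) - Ms, vech(Sigma_hat))
Sigma_inf = unvech(v, 2)

# Sanity check against the defining fixed point Sigma_inf = M Sigma_inf M' + Sigma:
print(np.allclose(Sigma_inf, M_hat @ Sigma_inf @ M_hat.T + Sigma_hat))  # True
```

Solving the n(n+1)/2-dimensional linear system avoids any iteration and is well defined whenever the eigenvalues of M̂ lie inside the unit circle.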

Appendix B.1: Figures

Figure B.1.1: Kernel densities of portfolio realized volatilities. Dotted lines correspond to the fitted Gamma distributions, dashed lines correspond to fitted log-normal distributions and solid lines are the kernel densities of the underlying series; upper panel: estimation based on one month of data (May 2006); middle panel: one year of data (2003); lower panel: six and a half years of data (01.01.2001–30.06.2006).

Modelling and Forecasting Multivariate Realized Volatility

2.1 Introduction

Multivariate volatility modelling is of particular importance in the areas of risk management, portfolio management and asset pricing. Typical econometric approaches include multivariate GARCH models (for a comprehensive review see Bauwens et al. (2006)), stochastic volatility models (reviewed in Asai et al. (2006)) and, more recently, realized covariance measures (see Barndorff-Nielsen & Shephard (2004) and Andersen, Bollerslev, Diebold & Ebens (2001), among others). While in the GARCH and stochastic volatility framework the volatility process is latent, the realized covariance methods employ high-frequency data to enable precise estimation of the daily covariance of the underlying assets, thus making it effectively observable.

A prominent feature of volatility is its strong persistence, which motivated the development of the integrated GARCH (Engle & Bollerslev (1986)), the fractionally integrated GARCH (Baillie, Bollerslev & Mikkelsen (1996)) and the linear ARCH (Robinson (1991), Giraitis, Robinson & Surgailis (2000)) models. Realized volatility series tend to exhibit a slow decay in the autocorrelation function (see, e.g., Andersen & Bollerslev (1997), Andersen, Bollerslev, Diebold & Ebens (2001)), and are modeled by means of fractionally integrated ARMA (ARFIMA) processes by Andersen, Bollerslev, Diebold & Labys (2003), Oomen (2001) and Koopman, Jungbacker & Hol (2005), among others.

flexible model specifications, applicable to a large number of assets. Yet, there is little research on time series models for realized covariance matrices. The existing literature has typically focused on univariate analysis of realized volatilities or single realized covariance (correlation) series. Andersen et al. (2003) model log-realized volatilities and realized correlations with univariate ARFIMA models, while Corsi (2009) and Corsi & Audrino (2007) develop Heterogeneous Autoregressive (HAR) models to capture the strong persistence through a hierarchical autoregressive structure. A problem which arises in this context is that the matrix constructed from the variance and correlation forecasts obtained from disjoint models is not guaranteed to be positive definite.

In order to obtain a forecast of the entire covariance matrix, Voev (2007) proposes a methodology in which the univariate variance and covariance forecasts can be combined to produce a positive definite matrix forecast. A drawback of this approach is that the dynamic linkages among the variance and covariance series (e.g., volatility spillovers) are neglected. The Wishart Autoregressive (WAR) model of Gourieroux et al. (2009), and the model of Bauer & Vorkink (2007), who employ the matrix log transformation to guarantee positive definiteness of the forecast, are among the few proposed approaches for the dynamics of the whole realized covariance matrix.

The standard WAR model, however, is incapable of producing long-memory-type dependence patterns and is built on latent processes whose interpretation is difficult and which make the introduction of exogenous forecasting variables problematic.

The study of Bauer & Vorkink (2007) differs from ours in that its primary focus is to investigate the forecasting power of various predictive variables, such as past returns, risk-free interest rate, dividend yield, etc., while our main contribution is to improve upon the ability to characterize the dynamic aspects of volatility and to comprehensively analyze the resulting forecasting implications.

The approach developed in this paper involves three steps: first, decomposing the series of covariance matrices into Cholesky factors; second, forecasting the Cholesky series with a suitable time series model; and third, reconstructing the matrix forecast. The positivity of the forecast is thus ensured by “squaring” the Cholesky factors, which can be modelled without imposing parameter restrictions.
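The three steps can be sketched as follows (numpy assumed; the data are synthetic positive definite matrices, and a naive last-value forecast stands in for the time series model of the paper):

```python
import numpy as np

rng = np.random.default_rng(3)

# A short toy series of 2x2 realized covariance matrices (positive definite).
def random_pd(n):
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)

Y = [random_pd(2) for _ in range(50)]

# Step 1: decompose each matrix into its lower-triangular Cholesky factor
# and stack the free elements into an unrestricted vector.
X = np.array([np.linalg.cholesky(Yt)[np.tril_indices(2)] for Yt in Y])

# Step 2: forecast the Cholesky elements with any time series model; here a
# naive random-walk (last-value) forecast stands in for that step.
x_fc = X[-1]

# Step 3: rebuild the matrix forecast by "squaring" the factor; positive
# definiteness holds by construction, with no parameter restrictions needed.
L = np.zeros((2, 2))
L[np.tril_indices(2)] = x_fc
Y_fc = L @ L.T

print(np.all(np.linalg.eigvalsh(Y_fc) > 0))  # True: forecast is positive definite
```

Because the Cholesky elements are unrestricted real numbers, any forecast of them maps back to a valid covariance matrix, which is exactly the advantage over forecasting variances and correlations in disjoint models.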

A further advantage of the methodology is that the inclusion of exogenous predictive variables is, at least conceptually, straightforward. The idea of modelling the Cholesky factorization of a volatility matrix is not new. Tsay (2002) discusses its use as a re-parameterization of the latent covariance matrix in a traditional multivariate GARCH framework, while in Gallant & Tauchen (2001) a Cholesky-GARCH type of model is used in the context of efficient method of moments estimation. Interestingly, Pourahmadi (1999) suggests modelling the Cholesky factors of the inverse of the covariance matrix, which can be very appealing in cases where the inverse is the object of direct interest as, e.g., in the solution to a minimum-variance portfolio problem. More recently, the idea of modelling the Cholesky factorization of the realized covariance matrix that we advocate here has been put forward, although not implemented empirically, by Andersen et al. (2003).

The degree of parameterization (flexibility) of the time series model should be guided by the dimension of the matrix, as well as by the application we have in mind: do we aim at a good in-sample fit, or are we more interested in out-of-sample forecasting?

In this paper, our interest is in the latter, and hence we tend to favor very parsimonious specifications. The model is based on fractionally integrated processes and can be seen as an application of the multivariate ARFIMA model of Sowell (1989). Estimation is carried out using the conditional maximum likelihood (ML) method developed in Beran (1995). The conditional approach is preferred over the exact ML methods proposed in the univariate case by Sowell (1992) and An & Bloomfield (1993), since the exact ML approach requires the inversion of a Tn × Tn matrix, where T is the sample size and n is the dimension of the process. For a review of inference on and forecasting with ARFIMA models, we direct the reader to Doornik & Ooms (2004).

To assess the merits of our model in practice, we undertake a comprehensive out-of-sample forecasting study using recent data, partially covering the ongoing financial crisis. In the analysis, we consider 1-step (daily), 5-step (weekly) and 10-step (biweekly) horizons, using direct and iterative forecasts from a range of models based on both high-frequency and daily data. An issue we need to address in the context of volatility forecast evaluation is that ex-post measures of volatility are subject to estimation error. We use results from Hansen & Lunde (2006) and Patton (2009), who derive the necessary and sufficient conditions a loss function should satisfy in order for the ranking of the models to be robust to noise in the ex-post volatility proxy.

A further problem which arises is how to determine the best performing model(s) in a model-rich environment. Pairwise comparisons of loss functions can be misleading unless we consider some sort of Bonferroni bound correction, and would involve a geometrically increasing number of tests as the number of models increases. Fortunately, Hansen, Lunde & Nason (2009) have developed a methodology, the model confidence set (MCS) approach, which allows selecting a set of models containing the best one with a certain level of confidence, naturally adapting to the number of models and thus requiring only one test. Using the root mean squared error (RMSE) criterion, we show that the forecasts based on the fractionally integrated model proposed in this paper have the smallest error for all forecasting horizons. Applying the MCS approach reveals that our model is often the only one selected by the procedure. To get a feeling of what this improved statistical performance implies from a practitioner’s point of view, we analyze the performance of mean-variance efficient portfolios and document that our approach leads to a superior mean-variance trade-off. Similar studies have been carried out by Fleming, Kirby & Ostdiek (2003) and Liu (2009), but for a more restricted set of models and with a different evaluation methodology.

The paper is structured as follows: Sections 2.2 and 2.3 describe the conditional covariance models and the forecasting procedures, Section 2.4 reports estimation and forecasting results, and Section 2.5 concludes.