Models and Forecast Evaluation Criteria


2.3 Models and Forecast Evaluation Criteria

For each cryptocurrency, we relate five time series to one another: the returns of its exchange rate against the Euro, the volatility of these returns, and three of Google’s SVIs, namely those for the search terms cryptocurrency, Bitcoin, and a term related to the name of the respective cryptocurrency.

2.3.1 VAR Model for Returns and Volatility

From the basic asset pricing equation using the stochastic discount factor $m_{t+1}$ (cp. Cochrane 2008), the conditional expectation of future returns $R_{t+1}$ is given by

$$E_t[R_{t+1}] = R^f_t - R^f_t \operatorname{cov}_t(m_{t+1}, R_{t+1}),$$

where $R^f_t$ is the risk-free rate prevailing at time $t$.

Figure 2.2: Closing Prices, Volatility and Search Volume Indices

The graphs compare the closing price and volatility (left scale, black line) with Google’s search volume index for the coin name (right scale, blue line). The two upper graphs refer to Bitcoin, while the two bottom graphs refer to Ripple.

[Figure not reproduced. Panels: (a) Bitcoin Price and SVI; (b) Bitcoin Volatility and SVI; (c) Ripple Price and SVI; (d) Ripple Volatility and SVI. Horizontal axis: years (2014 to 2018 in the Bitcoin panels). Left axis: price or volatility. Right axis: SVI for "Bitcoin −Cash −Future" or for "Ripple".]

Assuming that the law of one price and the no-arbitrage condition hold, the stochastic discount factor can be related to fundamental pricing factors. In our case, Google search volume may proxy for, or co-vary with, one or several of these pricing factors. Hence, we assume that the future stochastic discount factor $m_{t+1}$ can be proxied by a function of the present SVIs. The specific functional form may be non-linear. However, as our focus lies on predicting returns or volatility, we approximate the conditional mean of returns by a linear function. As the conditional second central moment is a function of the first moment, and hence of the factors determining it, a linear approximation is suitable for the conditional variance as well.

Table 2.3: Model Specification Overview

            Included SVIs
Model       Coin Name       Cryptocurrency       Bitcoin
0           –               –                    –
1           ✓               –                    –
2           –               ✓                    –
3           –               –                    ✓
4           ✓               ✓                    –
5           ✓               ✓                    ✓

Thus, we estimate VAR models for either the returns or the volatility of cryptocurrencies of the following form:

$$x_{i,t} = \mu_i + \sum_{j=1}^{p_i} A_{i,j}\, x_{i,t-j} + \varepsilon_{i,t} \tag{2.1}$$

where $x_{i,t}$ is an $R \times 1$ vector that contains one or several SVIs and either the return or the volatility of the $i$-th cryptocurrency. The $A_{i,j}$ are $R \times R$ parameter matrices, while $\mu_i$ is a vector of constants. The innovations $\varepsilon_{i,t}$ are i.i.d. white noise, and $p_i$ is the lag length selected by the SIC.¹⁷

We consider six models, each estimated separately for returns and for volatility. Table 2.3 provides an overview of the SVIs that each model specification includes in addition to the autoregressive terms. Model 0, which reduces Equation (2.1) to a univariate AR(p) model, serves as the benchmark. Model 1 relates the search volume for a given coin to that coin’s returns or volatility. Model 2 considers the relevance of the general interest in cryptocurrencies for forecasting returns and volatility by including the SVI for the search term cryptocurrency. Model 3 assesses whether the interest in Bitcoin, the most prominent cryptocoin, helps to predict returns or volatility. With Models 4 and 5, we test whether the forecasts improve when the general interest of Google users in cryptocurrencies is combined with their interest in the respective cryptocoin. In the case of Bitcoin, Model 3 reduces to Model 1 and Model 5 reduces to Model 4, as the SVI for Bitcoin serves both as the coin-name SVI and as the proxy for general interest.
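
The mapping from model number to included SVIs in Table 2.3 can be written down compactly. The following is a minimal Python sketch for illustration only; the series labels (`svi_...`) and the function name `regressors` are hypothetical and are not taken from the published code. It also reproduces the Bitcoin special case, where the coin-name SVI and the Bitcoin SVI are the same series.

```python
# Which SVIs enter each model besides the autoregressive terms (Table 2.3).
MODEL_SVIS = {
    0: (),
    1: ("coin_name",),
    2: ("cryptocurrency",),
    3: ("bitcoin",),
    4: ("coin_name", "cryptocurrency"),
    5: ("coin_name", "cryptocurrency", "bitcoin"),
}


def regressors(model, coin):
    """Return the distinct SVI series of a model for a given coin.

    The labels are hypothetical. For Bitcoin the coin-name SVI and the
    Bitcoin SVI coincide, so duplicates are dropped (order is preserved).
    """
    series = []
    for label in MODEL_SVIS[model]:
        name = "svi_" + (coin if label == "coin_name" else label)
        if name not in series:
            series.append(name)
    return series
```

With these definitions, `regressors(3, "bitcoin")` and `regressors(1, "bitcoin")` return the same list, as do Models 5 and 4 for Bitcoin, whereas for Ripple all six specifications remain distinct.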

The models are estimated by OLS. Estimation is conducted in R (2018) using the packages forecast (Hyndman and Khandakar 2008) and vars (Pfaff 2008). Data and code are available at https://tinyurl.com/y7chh5r6.
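
To make the estimation step concrete, the following Python sketch estimates a VAR(p) equation by equation with OLS and iterates it forward to obtain multi-step forecasts. It is an illustration under stated assumptions, not the R code used for the results: the function names (`solve`, `ols`, `fit_var`, `forecast`) are hypothetical, the lag length is taken as given rather than selected by the SIC, and only the Python standard library is used.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


def fit_var(data, p):
    """Estimate a VAR(p) equation by equation.

    data is a list of observations, each a list of R values. Returns one
    coefficient list per equation, ordered as
    [constant, lag-1 variables..., lag-p variables...].
    """
    X = [[1.0] + [v for j in range(1, p + 1) for v in data[t - j]]
         for t in range(p, len(data))]
    n_vars = len(data[0])
    return [ols(X, [data[t][r] for t in range(p, len(data))])
            for r in range(n_vars)]


def forecast(data, coefs, p, steps):
    """Iterate the fitted VAR forward; every variable is forecast jointly."""
    history = [obs[:] for obs in data]
    for _ in range(steps):
        x = [1.0] + [v for j in range(1, p + 1) for v in history[-j]]
        history.append([sum(c * v for c, v in zip(eq, x)) for eq in coefs])
    return history[len(data):]
```

The `forecast` function shows why a system is needed for horizons beyond one day: each step feeds the predicted SVI back in as a regressor for the next step.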

17 While an autoregressive model for the prediction of one variable suffices to predict one day ahead, forecasting returns or volatility over several days with the help of Google’s SVI requires a VAR in order to also predict the SVI development.

2.3.2 Evaluation Measures

Google’s SVIs help to predict returns or volatility only if the other model specifications outperform Model 0 according to the following measures. To evaluate the in-sample and out-of-sample fit of the models, we calculate the root mean squared error (RMSE) as

$$\mathrm{RMSE}_m = \sqrt{\frac{1}{T-p-1} \sum_{t=p} \left(x_{t+1} - \hat{x}_{m,t+1}\right)^2},$$

where $x_{t+1}$ is the observed variable of interest, which is either the return series or the volatility series, $m$ denotes any of the Models 0 to 5, and $\hat{x}_{m,t+1}$ denotes the forecasted value.
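
As an illustration of the RMSE definition, a short Python sketch follows. The function name `rmse` and the optional `denom` argument are assumptions made for this example: by default the squared errors are averaged over the number of forecast pairs, and the normalisation used in the text can be passed explicitly.

```python
import math


def rmse(actual, forecast, denom=None):
    """Root mean squared error of paired realizations and forecasts.

    denom is the normalising constant under the square root. It defaults to
    the number of pairs (an assumption of this sketch); pass the text's
    value, e.g. T - p - 1, to reproduce that normalisation.
    """
    squared_errors = [(x - f) ** 2 for x, f in zip(actual, forecast)]
    if denom is None:
        denom = len(squared_errors)
    return math.sqrt(sum(squared_errors) / denom)
```

For example, `rmse([2.0, 2.0], [0.0, 0.0])` averages two squared errors of 4 and returns 2.0.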

We then use the test developed by Clark and West (2006, 2007) for nested models, including their critical values, to assess whether the inclusion of Google’s SVI significantly reduces the RMSE relative to Model 0. The null hypothesis of the test is that the models have the same forecast error, whereas the alternative is that Model $m$ has a smaller forecast error than the benchmark Model 0. The test statistic is calculated as

$$z = \mathrm{RMSE}_0 - \left(\mathrm{RMSE}_m - \kappa_m\right),$$

where $\kappa_m = \frac{1}{T-p} \sum_{t=p}^{T} \left(\hat{x}_{m,t+1} - \hat{x}_{0,t+1}\right)$. In our case, the models are only partially nested. Hence, it is not clear upfront which model is the more parsimonious one. As a consequence, the adjustment $\kappa_m$ can be positive or negative. We therefore require upfront that the RMSE of the model including the SVIs is strictly lower than the RMSE of the benchmark model.
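
The statistic and the upfront requirement can be sketched in a few lines of Python. The sketch mirrors the formulas exactly as they are stated in this section; it is not a reimplementation of the Clark and West procedure and does not cover their critical values. All function names and the `denom` argument are hypothetical.

```python
def kappa(fc_model, fc_bench, denom=None):
    """Average difference between the forecasts of Model m and Model 0.

    denom defaults to the number of forecast pairs (an assumption of this
    sketch); the text normalises by T - p.
    """
    diffs = [m - b for m, b in zip(fc_model, fc_bench)]
    if denom is None:
        denom = len(diffs)
    return sum(diffs) / denom


def z_statistic(rmse_bench, rmse_model, kappa_m):
    """z = RMSE_0 - (RMSE_m - kappa_m), as stated in the text."""
    return rmse_bench - (rmse_model - kappa_m)


def passes_upfront_check(rmse_bench, rmse_model):
    """The model with SVIs must have a strictly lower RMSE than Model 0."""
    return rmse_model < rmse_bench
```

Because `kappa` averages signed differences, its value can take either sign, which is why the strict RMSE comparison is checked before the statistic is interpreted.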

We also run a Mincer-Zarnowitz regression (?) of the realizations $x_{t+1}$ on the fitted values $\hat{x}_{m,t+1}$ from the respective model to evaluate the in-sample fit. Out of sample, the fitted values are replaced by the forecasted values. The regression equation thus reads as follows:

$$x_{t+1} = a_0 + a_1 \hat{x}_{m,t+1} + e_{m,t+1}.$$

The $R^2$ of this regression (denoted by $R^2_{MZ}$ in the following) serves as a measure of the quality of the in-sample fit or of the out-of-sample forecast performance.
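
Since the Mincer-Zarnowitz regression has a single regressor, its coefficients and its R² have closed forms. The Python sketch below computes them from sums of squares and cross products; the function name `mincer_zarnowitz` is an assumption of this example.

```python
def mincer_zarnowitz(actual, predicted):
    """Regress realizations on predictions; return (a0, a1, r_squared)."""
    n = len(actual)
    mean_x = sum(actual) / n
    mean_f = sum(predicted) / n
    # Centered sums of squares and cross products.
    s_ff = sum((f - mean_f) ** 2 for f in predicted)
    s_xx = sum((x - mean_x) ** 2 for x in actual)
    s_fx = sum((f - mean_f) * (x - mean_x) for f, x in zip(predicted, actual))
    a1 = s_fx / s_ff
    a0 = mean_x - a1 * mean_f
    # With one regressor, R^2 is the squared sample correlation.
    r_squared = s_fx ** 2 / (s_ff * s_xx)
    return a0, a1, r_squared
```

Note that a high R² only indicates that realizations and predictions move together linearly: in the example `mincer_zarnowitz([1.0, 3.0, 5.0, 7.0], [0.0, 1.0, 2.0, 3.0])` the R² equals 1 even though the intercept is 1 and the slope is 2.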

For the volatility models, we also calculate the quasi-likelihood loss function (QL) as in Patton (2011), who shows that the QL is robust to noise in the volatility proxy (the Garman and Klass (1980) volatility measure in our case). The QL is calculated as follows:

$$\mathrm{QL}_m = \frac{1}{T-p} \sum_{t=p} \left( \frac{\sigma^2_{t+1}}{\hat{\sigma}^2_{m,t+1}} - \log\!\left( \frac{\sigma^2_{t+1}}{\hat{\sigma}^2_{m,t+1}} \right) - 1 \right).$$

The better the forecast, the smaller the QL measure.
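
A minimal Python sketch of the QL loss follows. The function name `ql_loss` is hypothetical, and the loss is averaged over the number of supplied pairs, which is an assumption of this example rather than the exact normalisation of the text.

```python
import math


def ql_loss(realized_var, forecast_var):
    """Quasi-likelihood loss of variance forecasts against a variance proxy.

    Each term ratio - log(ratio) - 1 is zero when the forecast equals the
    proxy and positive otherwise.
    """
    terms = []
    for s2, f2 in zip(realized_var, forecast_var):
        ratio = s2 / f2
        terms.append(ratio - math.log(ratio) - 1.0)
    return sum(terms) / len(terms)
```

The loss is asymmetric. With a proxy of 1.0, a forecast of 0.5 gives 1 - log 2 (about 0.31), whereas a forecast of 2.0 gives log 2 - 0.5 (about 0.19), so under-predicting the variance is penalised more heavily.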

As volatility enters the model in logarithmic form, we transform it back to the standard, non-logarithmic measure of Garman and Klass (1980) before evaluation. Although forecasting the logarithmic transform of a variable and then transforming it back has its problems (cp. Granger and Newbold 1976), Lütkepohl and Xu (2012) show that forecasting the logs can result in dramatic gains in forecast precision when the resulting variance is more homogeneous. This is the case in our application.¹⁸
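
The two back-transformations discussed in footnote 18 differ only by a variance correction in the exponent. The following Python sketch makes this explicit; the function names are hypothetical, and `log_variance` stands for the variance of the variable in logarithms, which has to be estimated in practice.

```python
import math


def naive_level_forecast(log_forecast):
    """Naive back-transformation: exponentiate the forecast of the log."""
    return math.exp(log_forecast)


def corrected_level_forecast(log_forecast, log_variance):
    """Back-transformation with the log-normal mean correction.

    Adds half the variance of the log variable before exponentiating.
    """
    return math.exp(log_forecast + 0.5 * log_variance)
```

The corrected forecast exceeds the naive one by the factor `exp(0.5 * log_variance)`, so the two coincide when the variance of the log variable is zero and stay close when it is small.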

Furthermore, we conduct Wald tests to (i) check the model fit and (ii) see whether the SVIs Granger-cause returns or volatility. The respective test statistics $w$ are constructed as Wald statistics of a univariate model that corresponds to the return or volatility equation in the equation system (2.1). Hence, $w$ is

$$w = (R\hat{a} - r)' \left(R \hat{\Sigma} R'\right)^{-1} (R\hat{a} - r),$$

where $R$ is the matrix that linearly combines the vector of parameter estimates $\hat{a}$, and $r$ is a vector of real numbers containing the numeric restrictions imposed on the linear combinations so formed. $\hat{\Sigma}$ is the estimated asymptotic variance-covariance matrix of the parameter estimates. We use the heteroskedasticity-consistent jackknife estimator of Efron (1982), as recommended by Long and Ervin (2000), to estimate the variance-covariance matrix $\hat{\Sigma}$:

$$\hat{\Sigma} = (X'X)^{-1} X' \operatorname{diag}\!\left( \frac{e_t^2}{(1-h_t)^2} \right) X (X'X)^{-1},$$

where the residuals are denoted as $e_t = x_t - \mu - \sum_{j=1}^{p} x_{t-j} a_j$ (with $x_t$ representing either the returns or the volatility). The matrix which collects all regressors in this equation is $X = (x_{t-1}, \ldots, x_{t-p})$, and $h_t = x_t (X'X)^{-1} x_t'$ is the leverage of observation $t$, with $x_t$ in this expression denoting the $t$-th row of $X$. Dividing $e_t^2$ by $(1-h_t)^2$ inflates the squared residuals of high-leverage observations, so that influential outliers contribute more to the variance estimate. Asymptotically, the test statistic $w$ follows a $\chi^2$ distribution with degrees of freedom equal to the number of hypotheses, $q$.
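
The Wald statistic with this variance estimator, which Long and Ervin (2000) label HC3, can be sketched in plain Python as follows. This is an illustration under assumptions, not the code behind the reported results: the helper names are hypothetical, matrices are stored as lists of rows, and the p-value from the χ² distribution is left out.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def inverse(m):
    """Matrix inverse by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    a = [list(map(float, row)) + [float(i == j) for j in range(n)]
         for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [v / p for v in a[col]]
        for r in range(n):
            if r != col:
                f = a[r][col]
                a[r] = [v - f * c for v, c in zip(a[r], a[col])]
    return [row[n:] for row in a]


def hc3_wald(X, y, R, r):
    """Wald statistic for H0: R a = r with the HC3 covariance estimator.

    X must contain every regressor of the equation, including a column of
    ones if the equation has a constant.
    """
    k = len(X[0])
    Xt = transpose(X)
    xtx_inv = inverse(matmul(Xt, X))
    a_hat = matmul(xtx_inv, matmul(Xt, [[v] for v in y]))  # k x 1 column
    resid = [yi - fit[0] for yi, fit in zip(y, matmul(X, a_hat))]
    # Leverage h_t = x_t (X'X)^{-1} x_t' for every row x_t of X.
    lev = [matmul(matmul([row], xtx_inv), [[v] for v in row])[0][0]
           for row in X]
    weights = [e ** 2 / (1.0 - h) ** 2 for e, h in zip(resid, lev)]
    meat = [[sum(w * row[i] * row[j] for w, row in zip(weights, X))
             for j in range(k)] for i in range(k)]
    sigma = matmul(matmul(xtx_inv, meat), xtx_inv)
    diff = [[v[0] - ri] for v, ri in zip(matmul(R, a_hat), r)]
    middle = inverse(matmul(matmul(R, sigma), transpose(R)))
    return matmul(matmul(transpose(diff), middle), diff)[0][0]
```

For a Granger-causality test, each row of `R` would select one lag coefficient of an SVI and `r` would be a vector of zeros, so that the number of rows equals the number of hypotheses $q$.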

18 Granger and Newbold (1976) also suggest multiplying by a corrective term that mimics the expectation of a log-normally distributed variable, computed from the first two moments of the underlying normally distributed variable in logarithms. This yields the optimal forecast in levels, $y^{opt}_{t+h|t} = \exp\{x_{t+h|t} + \tfrac{1}{2}\sigma_x^2\}$. Lütkepohl and Xu (2012) find that the naïve transform of the logarithmic forecast, $x^{naïve}_{t+h|t} = e^{x_{t+h|t}}$, performs just as well as the optimal transformation suggested by Granger and Newbold (1976). We tested both and come to the same conclusion. Thus, we only use the naïve transformation.

From the document Essays on the Statistics of Financial Markets (pages 65-70)