1.4 Empirical Results
1.4.2 Euro Area Results
To check the robustness of the above findings, we repeat our analysis for the Euro Area (EA). The situation for the EA is, however, more complicated. Google does not allow the searches on which the SVIs are calculated to be restricted to the EA, so we have to find proxies for the Euro price SVIs by setting other geographical limits. As a first proxy, we simply take worldwide searches for Euro price levels. This may include searches for Euro price levels from countries outside the EA and may thus add noise to the EA IPSO, which could weaken its correlation with inflation and consumption within the EA. In addition, we take the two largest economies in the EA (France and Germany) and examine whether searches from within these two economies are related to EA consumption or inflation. The Google user base in Germany or France, however, only partially matches the user base of the Euro Area as a whole. Possible
Table 1.10: Clark-West Test Results
The table displays the test statistics of the test developed by Clark and West (2006, 2007). In the test, an AR(1) for inflation or consumption is compared with an equivalent VAR(1) which also includes the IPSO as defined in Equation (1.7). For worldwide searches as well as searches from France and Germany, the test is based on forecasts of inflation and of the growth of consumption loans in the Euro Area. For searches from the US, the US time series are used. S denotes the frequency for which centered seasonal dummies are included. The null hypothesis of the test is that the AR(1) yields the same RMSPE as the VAR(1). It can be rejected at the 10% significance level when the test statistic exceeds 1.280 and at the 5% significance level when the test statistic is larger than 1.645. The column 'Origin' displays the two-letter country code for the origin of the searches with which the IPSO is constructed. If worldwide searches are used for the construction, 'all' is displayed.
Series                 Origin     S = 0    S = 12
Monthly Inflation      all         1.52     -0.18
                       FR         -0.16     -0.03
                       DE         -0.49     -0.79
                       US          1.59      1.83
Monthly Consumption    all        -0.04      0.52
                       FR         -0.32      1.30
                       DE         -1.39      0.87
                       US         -1.63     -0.69
Daily BEIR             US          0.07         –
relations with EA inflation or consumption might therefore also be obscured. Furthermore, the European Central Bank (ECB) and other European institutions do not provide monthly estimates of private consumption expenditure. We therefore resort to the available monthly measure of consumption loans granted to private households. Strictly speaking, we are thus not analyzing the relation of the IPSO with consumption growth in the Euro Area, but its relation with the growth in consumption loans granted to private households. The results for the EA are displayed in Table 1.11.
They suggest that, if no seasonal component is included, bidirectional Granger causality between inflation and the IPSO based on worldwide or French searches can be detected at the 10% significance level or better. For the IPSO based on worldwide searches, a contemporaneous correlation can also be detected at the 10% significance level. When German searches are used, Table 1.11 shows Granger causality at the 1% significance level only from inflation to the IPSO (p = 0.006), not from the IPSO to inflation (p = 0.330).
However, these results are not robust to the inclusion of centered seasonal dummies.
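The Granger-causality p-values in Table 1.11 rest on testing whether lags of one variable help predict the other. The following is a minimal pure-Python sketch of that idea for a single equation, not the authors' implementation: it computes the F statistic that compares a regression of y on its own lags with one that adds lags of x. The names `ols_rss` and `granger_f` and the simulated series are hypothetical, and the p-value step (which requires the F distribution) is omitted.

```python
import random


def ols_rss(X, y):
    """Residual sum of squares from regressing y on the columns of X."""
    k = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        piv = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    return sum(
        (yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(X, y)
    )


def granger_f(y, x, p):
    """F statistic for H0: the p lags of x do not help predict y."""
    rows = range(p, len(y))
    target = [y[t] for t in rows]
    # Restricted model: constant and p own lags of y.
    restricted = [[1.0] + [y[t - l] for l in range(1, p + 1)] for t in rows]
    # Unrestricted model: additionally p lags of x.
    unrestricted = [
        r + [x[t - l] for l in range(1, p + 1)] for r, t in zip(restricted, rows)
    ]
    rss_r = ols_rss(restricted, target)
    rss_u = ols_rss(unrestricted, target)
    dof = len(target) - (2 * p + 1)
    return ((rss_r - rss_u) / p) / (rss_u / dof)


# Simulated illustration (hypothetical data): x leads y by one period.
random.seed(0)
x = [random.gauss(0, 1) for _ in range(200)]
y = [0.0] + [0.8 * x[t - 1] + 0.5 * random.gauss(0, 1) for t in range(1, 200)]
f_xy = granger_f(y, x, 1)  # does x Granger-cause y?
f_yx = granger_f(x, y, 1)  # does y Granger-cause x?
print(f_xy, f_yx)
```

In this simulated setting the statistic for the direction from x to y should be far larger than the one for the reverse direction, which mirrors how the IPSO→ and →IPSO columns are read.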
In-sample, for consumption loan growth without seasonal dummies, Granger causality at the 10% significance level or better is found for the IPSO constructed from worldwide searches. For the index constructed from French searches, the IPSO Granger-causes consumption loan growth at every conventional significance level when controlling for
Table 1.11: EUR-IPSO: Causality Tests
The table shows the lag length p selected by the BIC, the seasonal component controlled for, S, as well as the p-values for the Granger causality and contemporaneous causality tests. Furthermore, the in-sample RMSE is reported. The column denoted IPSO→ shows the p-value for the test of whether the IPSO Granger-causes inflation or consumption, respectively; the column →IPSO shows the result for whether the IPSO is Granger-caused by the respective variable.
                                   Granger
Series                  S    p    IPSO→    →IPSO    Contemp.      RMSE
Euro Area Inflation
  Worldwide Searches    0    2    0.000    0.062       0.092     0.481
                       12    1    0.115    0.521       0.886     0.220
  German Searches       0    6    0.330    0.006       0.266     0.405
                       12    1    0.593    0.997       0.479     0.221
  French Searches       0    2    0.027    0.100       0.971     0.496
                       12    1    0.510    0.764       0.364     0.221
Euro Area Consumption
  Worldwide Searches    0    2    0.009    0.236       0.855    10.753
                       12    2    0.359    0.175       0.546     6.351
  German Searches       0    2    0.793    0.783       0.126    11.107
                       12    2    0.168    0.557       0.121     6.322
  French Searches       0    2    0.618    0.077       0.852    11.082
                       12    2    0.001    0.608       0.759     6.093
seasonality. For the IPSO constructed from worldwide searches, Granger causality from the IPSO to consumption loan growth is found at every conventional significance level only when seasonality is not controlled for.
As with the US data, the out-of-sample results are somewhat more promising. Looking at the RMSPEs of the out-of-sample forecasts reported in Tables 1.12, 1.13 and 1.14, we find that, when controlling for annual seasonality (S = 12), the inclusion of the IPSO always reduces the RMSPE for inflation by around 30%. In the same setting, the R²_MZ for inflation is largely unaffected, increasing or decreasing only marginally.
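The two out-of-sample measures used here can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' code: the RMSPE is taken to be the root mean squared prediction error, and R²_MZ is assumed to be the R² (in percent, as in the tables) of a Mincer-Zarnowitz regression of the realized values on a constant and the forecast. The function names and the example numbers are hypothetical.

```python
import math


def rmspe(actual, forecast):
    """Root mean squared prediction error of a forecast series."""
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)


def r2_mz(actual, forecast):
    """R^2 (in percent) of the regression actual = a + b * forecast + e.

    With a single regressor, R^2 equals the squared sample correlation
    between the realized values and the forecasts.
    """
    n = len(actual)
    mean_a = sum(actual) / n
    mean_f = sum(forecast) / n
    cov = sum((a - mean_a) * (f - mean_f) for a, f in zip(actual, forecast))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_f = sum((f - mean_f) ** 2 for f in forecast)
    return 100.0 * cov ** 2 / (var_a * var_f)


# Hypothetical realized values and forecasts, for illustration only.
actual = [1.0, 2.0, 3.0, 4.0]
forecast = [1.1, 1.9, 3.2, 3.8]
print(rmspe(actual, forecast), r2_mz(actual, forecast))
```

Note that the two measures can disagree: a forecast that is a biased but perfectly linear transformation of the realized values has an R²_MZ of 100 while its RMSPE is far from zero, which is one reason to report both.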
Table 1.12: Out-of-Sample Fit: Euro Area Inflation and Consumption (Worldwide)
The table displays the RMSPE and R²_MZ for the models with the IPSO (i.e., VAR(p_i) models) and without the IPSO (AR(p_i) models) for the listed macroeconomic time series. For each time series, different seasonality components are controlled for (S ∈ {0, 4, 12} for inflation and consumption and S ∈ {0, 5, 30} for the US-BEIR) and the respective RMSPE and R²_MZ are reported. If the difference between the RMSPEs of the models with and without the IPSO is negative, then including the IPSO helps in predicting the respective time series. For the difference of the R²_MZ it is the other way around: if the difference is positive, then the IPSO increases the quality of the out-of-sample prediction.
                          S    with IPSO    without IPSO    Difference
Euro Area Inflation
  RMSPE                   0        0.377           0.774        -0.397
                         12        0.168           0.231        -0.064
  R²_MZ                   0         6.27            0.03          6.24
                         12        82.64           83.65         -1.01
Euro Area Consumption
  RMSPE                   0        7.642          18.979       -11.337
                         12        3.999          15.855       -11.856
  R²_MZ                   0        19.15            7.30         11.85
                         12        80.98           19.96         61.02
For consumption loan growth, the reduction in the RMSPE is drastic for the indices based on worldwide, French and German searches alike. When centered seasonal dummies are included (S = 12), including the IPSO reduces the RMSPE to less than a third of that of the benchmark model for all three indices; without seasonal dummies, it falls to roughly 40% of the benchmark. We also find that, for consumption loan growth, including the IPSO together with centered monthly seasonal dummies always raises R²_MZ strongly, to levels above 70%.
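The relative size of these reductions can be checked directly from the RMSPE rows of Tables 1.12 to 1.14. The short sketch below only redoes that arithmetic; the dictionary layout and variable names are hypothetical, while the numbers are copied from the tables.

```python
# (RMSPE with IPSO, RMSPE without IPSO), keyed by (series, origin, S).
# Values copied from Tables 1.12 (worldwide), 1.13 (DE) and 1.14 (FR).
rmspe_pairs = {
    ("inflation", "all", 12): (0.168, 0.231),
    ("inflation", "DE", 12): (0.165, 0.231),
    ("inflation", "FR", 12): (0.164, 0.231),
    ("consumption", "all", 0): (7.642, 18.979),
    ("consumption", "DE", 0): (7.860, 18.979),
    ("consumption", "FR", 0): (7.531, 18.979),
    ("consumption", "all", 12): (3.999, 15.855),
    ("consumption", "DE", 12): (4.368, 15.855),
    ("consumption", "FR", 12): (3.973, 15.855),
}

# Ratio of the RMSPE with the IPSO to the benchmark RMSPE without it.
ratios = {key: with_ipso / without for key, (with_ipso, without) in rmspe_pairs.items()}

for (series, origin, s), ratio in sorted(ratios.items()):
    print(f"{series:12s} {origin:3s} S={s:2d}  ratio={ratio:.3f}  reduction={1 - ratio:.1%}")
```

A ratio below one third corresponds to the RMSPE being cut to less than a third of the benchmark, and one minus the ratio is the relative reduction.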
For the EA forecasts of inflation, when controlling for annual seasonality, the null hypothesis of the Clark-West test that a simple AR(1) model has the same RMSPE as a VAR(1) model cannot be rejected at any conventional significance level. When not controlling for seasonality, the Clark-West test is significant at the 10% level for the IPSO based on worldwide searches. For EA consumption, the Clark-West test indicates a significant reduction of the RMSPE (at the 10% level) only for the IPSO based on French searches, and only when controlling for seasonality.
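For readers unfamiliar with the test, the following is a minimal sketch of the MSPE-adjusted statistic of Clark and West (2007) for one-step-ahead forecasts from nested models. It is not the authors' code: the function name `clark_west` and the example numbers are hypothetical, and no autocorrelation-robust variance is used, which is an assumption that is only reasonable for one-step-ahead forecasts.

```python
import math
import statistics


def clark_west(y, f_small, f_large):
    """MSPE-adjusted test statistic for nested forecasting models.

    y        realized values
    f_small  forecasts of the parsimonious model (e.g., an AR(1))
    f_large  forecasts of the larger nesting model (e.g., a VAR(1))
    """
    # Adjusted loss differential: squared error of the small model minus
    # the squared error of the large model, corrected for the noise that
    # the extra parameters add to the large model's forecast.
    adj = [
        (yt - a) ** 2 - ((yt - b) ** 2 - (a - b) ** 2)
        for yt, a, b in zip(y, f_small, f_large)
    ]
    # t-statistic of the mean of adj, i.e., of a regression on a constant.
    se = statistics.stdev(adj) / math.sqrt(len(adj))
    return statistics.fmean(adj) / se


# Hypothetical data: the larger model tracks y closely, the small one does not.
y = [1.0, -1.0, 2.0, -2.0, 1.5, -1.5]
f_small = [0.0] * 6
f_large = [0.9, -0.9, 1.8, -1.8, 1.4, -1.4]
stat = clark_west(y, f_small, f_large)
print(stat)
```

The statistic is compared with one-sided standard normal critical values, which is why the caption of Table 1.10 lists 1.280 and 1.645 as the 10% and 5% thresholds.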
Again, we find that the IPSO mimics the seasonality in the macroeconomic time series.
Once seasonal dummies are included, all in-sample results for the Euro Area vanish, except for the Granger causality from the IPSO based on French searches to consumption loan growth. Looking only at the out-of-sample results, however, the improvements from including the IPSO in forecasting inflation are sizable, and the gains when predicting consumption loan growth are very large.
Table 1.13: Out-of-Sample Fit: Euro Area Inflation and Consumption (German Searches)
The table displays the RMSPE and R²_MZ for the models with the IPSO (i.e., VAR(p_i) models) and without the IPSO (AR(p_i) models) for the listed macroeconomic time series. For each time series, different seasonality components are controlled for (S ∈ {0, 4, 12} for inflation and consumption and S ∈ {0, 5, 30} for the US-BEIR) and the respective RMSPE and R²_MZ are reported. If the difference between the RMSPEs of the models with and without the IPSO is negative, then including the IPSO helps in predicting the respective time series. For the difference of the R²_MZ it is the other way around: if the difference is positive, then the IPSO increases the quality of the out-of-sample prediction.
                          S    with IPSO    without IPSO    Difference
Euro Area Inflation
  RMSPE                   0        0.415           0.774        -0.359
                         12        0.165           0.231        -0.067
  R²_MZ                   0         5.59            0.03          5.56
                         12        83.31           83.65         -0.34
Euro Area Consumption
  RMSPE                   0        7.860          18.979       -11.119
                         12        4.368          15.855       -11.487
  R²_MZ                   0        15.89            7.30          8.59
                         12        76.96           19.96         57.00
Table 1.14: Out-of-Sample Fit: Euro Area Inflation and Consumption (French Searches)
The table displays the RMSPE and R²_MZ for the models with the IPSO (i.e., VAR(p_i) models) and without the IPSO (AR(p_i) models) for the listed macroeconomic time series. For each time series, different seasonality components are controlled for (S ∈ {0, 4, 12} for inflation and consumption and S ∈ {0, 5, 30} for the US-BEIR) and the respective RMSPE and R²_MZ are reported. If the difference between the RMSPEs of the models with and without the IPSO is negative, then including the IPSO helps in predicting the respective time series. For the difference of the R²_MZ it is the other way around: if the difference is positive, then the IPSO increases the quality of the out-of-sample prediction.
                          S    with IPSO    without IPSO    Difference
Euro Area Inflation
  RMSPE                   0        0.377           0.774        -0.397
                         12        0.164           0.231        -0.068
  R²_MZ                   0         5.29            0.03          5.26
                         12        83.67           83.65          0.02
Euro Area Consumption
  RMSPE                   0        7.531          18.979       -11.448
                         12        3.973          15.855       -11.882
  R²_MZ                   0        21.77            7.30         14.47
                         12        81.77           19.96         61.82
1.5 Summary
Google search queries are a popular addition to autoregressive models used for prediction.
Their use is justified if one is willing to accept the assumption that people gather information before taking action. From an econometric point of view, including Google’s search queries or any derivatives based on search queries like our IPSO adds an additional source of information to the autoregressive model and allows a faster adjustment of the dynamics compared to a pure autoregressive model.
While Google data are therefore an attractive variable for improving predictions, they are not directly available at all desired frequencies over long time horizons. We have therefore proposed an algorithm that allows the construction of multi-annual search volume indices from subsequently downloaded subsamples for the same search query, provided that these subsamples overlap sufficiently. The method also makes it possible to render more than five SVIs comparable, five being the maximum that Google allows to be compared directly on its website. A detailed evaluation of our algorithm and a comparison with other approaches to concatenating SVIs (a naive concatenation scheme and a method based on time frame comparison) show that our algorithm is capable of circumventing all potential pitfalls (zeros in the index or sudden spikes) while preserving the statistical properties of the benchmark SVIs.
We illustrated the use of our algorithm in an application to forecast US and European inflation and consumption measures, thereby discussing again potential pitfalls in gathering adequate datasets. Multiple Google SVIs were made comparable and were aggregated to an Index for the Prices Searched Online (IPSO) which constitutes an expected average price level of individuals who engage in buying. The index based on US searches precedes the monthly US inflation rate and is contemporaneously correlated with monthly US inflation and consumption growth. When forecasting monthly US or Euro Area inflation out-of-sample, the RMSPE can be reduced by around 30% when the IPSO is included.
Similarly, the prediction of Euro Area consumption loan growth is decisively improved when the IPSO is included in the prediction model.
Appendix
Suppose we download SVIs for a search term $j \in M$ and region $i$ for two overlapping time frames $T_A$ and $T_B$. According to Equation (1.2), we can describe the SVIs at each point in time of the overlap, i.e., $t \in T_A \cap T_B$, as

\[ SVI_{A,t} = \alpha_A + \beta_A s_t + \nu_{A,t}, \tag{1.8} \]
\[ SVI_{B,t} = \alpha_B + \beta_B s_t + \nu_{B,t}. \tag{1.9} \]

Since the region $i$ and the search term $j$ are fixed, we drop the respective subscripts in Equations (1.8) and (1.9). Furthermore, we use $A$ and $B$ to clearly relate the objects to the reference time frames $T_A$ and $T_B$. Solving both equations for $s_t$ and equating the results yields

\[ \frac{SVI_{A,t} - \alpha_A - \nu_{A,t}}{\beta_A} = \frac{SVI_{B,t} - \alpha_B - \nu_{B,t}}{\beta_B}. \tag{1.10} \]

Solving expression (1.10) for $SVI_{A,t}$ yields Equation (1.3):

\[ SVI_{A,t} = \underbrace{\alpha_A - \frac{\beta_A}{\beta_B}\,\alpha_B}_{=\,\gamma} + \underbrace{\frac{\beta_A}{\beta_B}}_{=\,\delta}\, SVI_{B,t} + \underbrace{\nu_{A,t} - \frac{\beta_A}{\beta_B}\,\nu_{B,t}}_{=\,\varepsilon_t} = \gamma + \delta\, SVI_{B,t} + \varepsilon_t. \]
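The linear relation derived above suggests how two overlapping downloads can be put on a common scale. The sketch below is an illustration under stated assumptions, not the algorithm as specified in the main text: it estimates γ and δ by ordinary least squares on the overlap and uses them to express the non-overlapping part of the later frame in units of the earlier one. The function names `fit_overlap` and `concatenate`, the latent series, and all parameter values are hypothetical.

```python
def fit_overlap(svi_a, svi_b):
    """OLS estimates (gamma, delta) of SVI_A = gamma + delta * SVI_B + eps,
    computed on observations from the overlap of the two time frames."""
    n = len(svi_a)
    mean_a = sum(svi_a) / n
    mean_b = sum(svi_b) / n
    sxx = sum((b - mean_b) ** 2 for b in svi_b)
    sxy = sum((b - mean_b) * (a - mean_a) for a, b in zip(svi_a, svi_b))
    delta = sxy / sxx
    gamma = mean_a - delta * mean_b
    return gamma, delta


def concatenate(svi_a, svi_b, k):
    """Append frame B to frame A, where the last k points of A and the
    first k points of B cover the same dates."""
    gamma, delta = fit_overlap(svi_a[-k:], svi_b[:k])
    rescaled_tail = [gamma + delta * v for v in svi_b[k:]]
    return svi_a + rescaled_tail


# Noise-free illustration with a hypothetical latent search intensity s_t.
s = [10, 12, 15, 11, 14, 18, 16, 13, 17]
alpha_a, beta_a = 5.0, 2.0   # frame A covers t = 0..5
alpha_b, beta_b = 1.0, 0.5   # frame B covers t = 3..8
svi_a = [alpha_a + beta_a * v for v in s[:6]]
svi_b = [alpha_b + beta_b * v for v in s[3:]]

gamma, delta = fit_overlap(svi_a[-3:], svi_b[:3])
spliced = concatenate(svi_a, svi_b, 3)
print(gamma, delta)
print(spliced)
```

Without noise, the estimates coincide with γ = α_A − (β_A/β_B) α_B and δ = β_A/β_B from the derivation, so the spliced series equals what frame A would have reported over the full period. With noisy data, the quality of the splice depends on the length of the overlap.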