

4.2 Methodology of NCoVaR


Let us first build some intuition for Conditional Value-at-Risk and Granger causality separately, and then combine the two into CoVaR-NGraCo (Conditional Value-at-Risk-Nonlinear Granger Causality), or NCoVaR for short. In the standard setting we consider two institutions, $i$ and $j$, whose returns on assets are given by $X^i$ and $X^j$, respectively. Since we are concerned with systemic risk, we set $j$ to be some aggregate variable, so that we investigate the relationship between institution $i$ and the system as a whole. Following the original CoVaR literature, let us define $\mathrm{VaR}_\gamma$ as the left $\gamma$-quantile of the unconditional returns of a given institution. (In practice $\gamma$ is chosen from $\{0.01, 0.05, 0.1\}$.) For institution $i$ we therefore have

$$P(X^i \le \mathrm{VaR}^i_\gamma) = \gamma, \tag{4.1}$$

or equivalently

$$\mathrm{VaR}^i_\gamma = \inf\{x^i : F_{X^i}(x^i) \ge \gamma\}, \tag{4.2}$$

where $F_{X^i}$ is the cumulative distribution function of $X^i$. (For institution $j$, the notation is analogous throughout the chapter.) The intuition behind CoVaR is to evaluate $\mathrm{VaR}_\gamma$ of institution $j$ conditional on some event associated with institution $i$. In particular, Adrian and Brunnermeier (2011) consider two conditioning events: institution $i$ is at its $\mathrm{VaR}^i_\gamma$ or at its median ($\mathrm{VaR}^i_{\gamma=0.5} = \mathrm{Median}^i$). The difference between the two gives an estimate of the risk contribution of institution $i$ to $j$, denoted by $\Delta\mathrm{CoVaR}$.
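Eq. (4.2) can be illustrated numerically. The following is a minimal sketch, assuming a plain order-statistic estimate of the left $\gamma$-quantile; the function name `empirical_var` and the simulated returns are hypothetical and serve only as an illustration.

```python
import numpy as np

def empirical_var(returns, gamma):
    """Left gamma-quantile inf{x : F_n(x) >= gamma} of the empirical CDF F_n."""
    x = np.sort(np.asarray(returns, dtype=float))
    n = x.size
    # F_n at the k-th order statistic equals k/n, so we need the smallest k
    # with k/n >= gamma; the small tolerance guards against rounding in gamma*n.
    k = int(np.ceil(gamma * n - 1e-12))
    k = min(max(k, 1), n)
    return x[k - 1]

# Simulated heavy-tailed returns (illustration only, not data from the study).
rng = np.random.default_rng(0)
r = 0.01 * rng.standard_t(df=4, size=2000)

var_05 = empirical_var(r, 0.05)   # VaR at gamma = 0.05
median = empirical_var(r, 0.50)   # VaR at gamma = 0.5, i.e. the median
```

By construction, at least a fraction $\gamma$ of the sample lies at or below the returned value, which mirrors the infimum in Eq. (4.2).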

In our study we follow a reasoning similar to that of Adrian and Brunnermeier (2011), but we add a (discrete) time dimension. For any period $t$, let us denote the information set of future returns by $\mathcal{G}_{X^i_t}$ and the information set of past and/or current returns by $\mathcal{F}_{X^i_t}$. Following Granger (1969), we say that the returns of institution $i$ Granger cause those of institution $j$ if $\mathcal{F}_{X^i_t}$ contains additional information on $\mathcal{G}_{X^j_t}$ which is not already contained in $\mathcal{F}_{X^j_t}$ alone. We formulate the definition of conditional Granger causality analogously: the returns of institution $i$ Granger cause those of institution $j$ if, conditional on some past or current events of those institutions (denoted by $A(\mathcal{F}_{X^i_t})$ and $B(\mathcal{F}_{X^j_t})$, respectively), $\mathcal{F}_{X^i_t}$ contains additional information on $\mathcal{G}_{X^j_t}$ which is not already contained in $\mathcal{F}_{X^j_t}$ alone.

Given the intuition behind CoVaR and general Granger causality, we may now turn to NCoVaR. Similarly to $\Delta\mathrm{CoVaR}$, we test the difference in Granger causal risk effects from institution $i$ on $j$ between two conditioning events: when institution $i$ is and/or was in trouble (around its $\mathrm{VaR}^i_\gamma$) and when it is and/or was around the median of its returns. An advantage of allowing institutions to be around (and not exactly at) their $\mathrm{VaR}_\gamma$ or median levels is that we can thereby account for possible nonlinearities in the corresponding distributions, something the original methodology could not capture. In particular, we consider a ball of radius $\mu$ ($\mu > 0$) centered at $\mathrm{VaR}_\gamma$ or at the median. (The following reasoning holds for $\mathcal{G}$ and $\mathcal{F}$ being multivariate, provided that $\mathrm{VaR}_\gamma$ and the medians are taken over the marginals.) We also allow for conditioning on the past and/or current realizations of $X^j_t$. To formalize this we give the following definition of NCoVaR.

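Definition 4.2.1. Given any stationary bivariate process $\{(X^i_t, X^j_t)\}$, we say that $\{X^i_t\}$ is a nonlinear CoVaR Granger cause of $\{X^j_t\}$ if

$$P\left(\lVert \mathcal{G}_{X^j_t} - \mathrm{VaR}^j_\gamma \rVert \le \mu \,\middle|\, \lVert \mathcal{F}_{X^i_t} - \mathrm{VaR}^i_\gamma \rVert \le \mu,\ B(\mathcal{F}_{X^j_t})\right) \neq P\left(\lVert \mathcal{G}_{X^j_t} - \mathrm{VaR}^j_\gamma \rVert \le \mu \,\middle|\, \lVert \mathcal{F}_{X^i_t} - \mathrm{Median}^i \rVert \le \mu,\ B(\mathcal{F}_{X^j_t})\right),$$

where $\mu > 0$, $\lVert \cdot \rVert$ is the Euclidean distance measure, $\mathcal{G}$ denotes a set of future realizations, $\mathcal{F}$ denotes a set of past and/or current realizations of the corresponding variables, and $B(\cdot)$ reflects some event over its argument.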

In this study, we consider two possible scenarios. In the first, we assume that institution $j$ is already in distress, so that potential Granger causal risk effects from institution $i$ not only induce even higher losses on $j$ but can also hinder its recovery. The second scenario is closer to traditional risk analysis, where future troubles in institution $j$ come directly from the past problems of institution $i$. One may therefore reformulate Def. 4.2.1 in the form of two possible scenarios, which we investigate in detail below.

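Scenario 1. Given any stationary bivariate process $\{(X^i_t, X^j_t)\}$, we say that $\{X^i_t\}$ is a nonlinear CoVaR Granger cause of $\{X^j_t\}$ in tail if

$$P\left(\lVert \mathcal{G}_{X^j_t} - \mathrm{VaR}^j_\gamma \rVert \le \mu \,\middle|\, \lVert \mathcal{F}_{X^i_t} - \mathrm{VaR}^i_\gamma \rVert \le \mu,\ \lVert \mathcal{F}_{X^j_t} - \mathrm{VaR}^j_\gamma \rVert \le \mu\right) \neq P\left(\lVert \mathcal{G}_{X^j_t} - \mathrm{VaR}^j_\gamma \rVert \le \mu \,\middle|\, \lVert \mathcal{F}_{X^i_t} - \mathrm{Median}^i \rVert \le \mu,\ \lVert \mathcal{F}_{X^j_t} - \mathrm{VaR}^j_\gamma \rVert \le \mu\right),$$

where $\mu > 0$, $\lVert \cdot \rVert$ is the Euclidean distance measure, $\mathcal{G}$ denotes a set of future realizations and $\mathcal{F}$ denotes a set of past and/or current realizations of the corresponding variables.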

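Scenario 2. Given any stationary bivariate process $\{(X^i_t, X^j_t)\}$, we say that $\{X^i_t\}$ is a nonlinear CoVaR Granger cause of $\{X^j_t\}$ in median if

$$P\left(\lVert \mathcal{G}_{X^j_t} - \mathrm{VaR}^j_\gamma \rVert \le \mu \,\middle|\, \lVert \mathcal{F}_{X^i_t} - \mathrm{VaR}^i_\gamma \rVert \le \mu,\ \lVert \mathcal{F}_{X^j_t} - \mathrm{Median}^j \rVert \le \mu\right) \neq P\left(\lVert \mathcal{G}_{X^j_t} - \mathrm{VaR}^j_\gamma \rVert \le \mu \,\middle|\, \lVert \mathcal{F}_{X^i_t} - \mathrm{Median}^i \rVert \le \mu,\ \lVert \mathcal{F}_{X^j_t} - \mathrm{Median}^j \rVert \le \mu\right),$$

where $\mu > 0$, $\lVert \cdot \rVert$ is the Euclidean distance measure, $\mathcal{G}$ denotes a set of future realizations and $\mathcal{F}$ denotes a set of past and/or current realizations of the corresponding variables.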

In practice it is impossible to condition on infinite sets of future or past realizations of the variables of interest. Therefore, we reformulate $\mathcal{G}$ and $\mathcal{F}$ as finite sets of future periods or lags, respectively. We limit ourselves to the canonical setting where $\mathcal{G}_{X^j_t} = X^j_{t+1}$, as it is the one most commonly used in practical Granger causality testing; our reasoning holds, however, for any $\mathcal{G}_{X^j_t} = X^j_{t+k}$, $1 \le k < \infty$. Similarly, we replace $\mathcal{F}_{X^i_t}$ and $\mathcal{F}_{X^j_t}$ by $X^i_{t,l_i} = \{X^i_{t-l_i+1}, \dots, X^i_t\}$ and $X^j_{t,l_j} = \{X^j_{t-l_j+1}, \dots, X^j_t\}$, where $l_i \ge 1$ and $l_j \ge 1$ denote the number of lags of the corresponding variable.
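The finite-lag construction can be sketched in code. The helper below is a hypothetical illustration (the name `build_lag_vectors` is not from the source): it aligns the one-step-ahead value $X^j_{t+1}$ with the lag windows of length $l_i$ and $l_j$ ending at period $t$, using only those $t$ for which both windows are complete.

```python
import numpy as np

def build_lag_vectors(x_i, x_j, l_i=1, l_j=1):
    """Return (z, lags_i, lags_j) with z[t] = X^j_{t+1} and rows of lags_i, lags_j
    equal to (X_{t-l+1}, ..., X_t) for the respective series."""
    x_i = np.asarray(x_i, dtype=float)
    x_j = np.asarray(x_j, dtype=float)
    if x_i.ndim != 1 or x_i.shape != x_j.shape:
        raise ValueError("x_i and x_j must be 1-D arrays of equal length")
    if l_i < 1 or l_j < 1:
        raise ValueError("l_i and l_j must be at least 1")
    n = x_i.size
    start = max(l_i, l_j) - 1       # first period with both lag windows complete
    t = np.arange(start, n - 1)     # last usable t leaves room for X^j_{t+1}
    z = x_j[t + 1]
    lags_i = np.stack([x_i[t - l_i + 1 + m] for m in range(l_i)], axis=1)
    lags_j = np.stack([x_j[t - l_j + 1 + m] for m in range(l_j)], axis=1)
    return z, lags_i, lags_j

# Toy series, chosen so that the alignment is easy to read off.
z, lags_i, lags_j = build_lag_vectors(np.arange(6), 10 + np.arange(6), l_i=2, l_j=1)
```

Each row of the three outputs refers to the same period $t$, so stacking them column-wise gives one observation of the $(l_i + l_j + 1)$-dimensional vector used in the test.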

In Granger causality testing, the goal is to find evidence against the null hypothesis of no causality, which according to Def. 4.2.1 is represented by equality of the conditional probabilities.

We assume that the process $\{(X^i_t, X^j_t)\}$ is strictly stationary. In that case, the null hypothesis is a statement about the invariant distribution, evaluated at conditional $\mathrm{VaR}_\gamma$ levels, of the $(l_i + l_j + 1)$-dimensional vector $W_t = (Z_t, X^i_{t,l_i}, X^j_{t,l_j})$, where we substitute $Z_t = X^j_{t+1}$. (For clarity, and to emphasize that we consider the invariant distribution of $W_t$, we drop the time index, so that $W = (Z, X^i, X^j)$.) Formally, the null hypothesis for Scenarios 1 and 2 can be rewritten as

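$$f_{Z \mid X^i, X^j}(z_\gamma \mid x^i_\gamma, x^j_\bullet) = f_{Z \mid X^i, X^j}(z_\gamma \mid x^i_m, x^j_\bullet), \tag{4.3}$$

where $z_\gamma = \mathrm{VaR}^Z_\gamma$, $x^i_\gamma = \mathrm{VaR}^i_\gamma$, $x^i_m = \mathrm{Median}^i$, and the subscript $\bullet$ distinguishes between Scenarios 1 and 2 as $x^j_\gamma = \mathrm{VaR}^j_\gamma$ or $x^j_m = \mathrm{Median}^j$, respectively. It is helpful to restate the problem in terms of ratios of joint densities evaluated at given quantiles: under the null, the density of $Z$ evaluated around its $\mathrm{VaR}_\gamma$ level and conditional on specific events in $X^i$ and $X^j$ is equal to the same density conditional on a different set of events in $X^i$ and $X^j$. Therefore, the joint probability density function, together with its marginals, must satisfy

$$\frac{f_{Z,X^i,X^j}(z_\gamma, x^i_\gamma, x^j_\bullet)}{f_{X^i,X^j}(x^i_\gamma, x^j_\bullet)} = \frac{f_{Z,X^i,X^j}(z_\gamma, x^i_m, x^j_\bullet)}{f_{X^i,X^j}(x^i_m, x^j_\bullet)}. \tag{4.4}$$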

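Since Eq. (4.4) holds for any quantile of the vector $(Z, X^i, X^j)$ in the support of $Z, X^i, X^j$, Eq. (4.4) may be equivalently rewritten as

$$\frac{f_{Z,X^i,X^j}(z_\gamma, x^i_\gamma, x^j_\bullet)}{f_{X^i,X^j}(x^i_m, x^j_\bullet)} = \frac{f_{X^i,X^j}(x^i_\gamma, x^j_\bullet)}{f_{X^i,X^j}(x^i_m, x^j_\bullet)} \, \frac{f_{Z,X^i,X^j}(z_\gamma, x^i_m, x^j_\bullet)}{f_{X^i,X^j}(x^i_m, x^j_\bullet)}. \tag{4.5}$$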

Analogously to Baek and Brock (1992) or Hiemstra and Jones (1994), a natural methodology to assess Eq. (4.5) comes from the test for conditional independence. However, as shown by Diks and Panchenko (2005) and Diks and Panchenko (2006), these tests can severely over-reject in the Granger causal setting because of their dependence on the conditional variance. Diks and Panchenko (2006) propose to add a positive weight function $g(z_\gamma, x^i_m, x^j_\bullet)$; given that the null should hold in the support of the joint densities, it may be equivalently written as

$$\tau_g \equiv \left( \frac{f_{Z,X^i,X^j}(z_\gamma, x^i_\gamma, x^j_\bullet)}{f_{X^i,X^j}(x^i_m, x^j_\bullet)} - \frac{f_{X^i,X^j}(x^i_\gamma, x^j_\bullet)}{f_{X^i,X^j}(x^i_m, x^j_\bullet)} \, \frac{f_{Z,X^i,X^j}(z_\gamma, x^i_m, x^j_\bullet)}{f_{X^i,X^j}(x^i_m, x^j_\bullet)} \right) g(z_\gamma, x^i_m, x^j_\bullet) = 0. \tag{4.6}$$

Diks and Panchenko (2006) discuss several possibilities for choosing $g(z_\gamma, x^i_m, x^j_\bullet)$. In this study we focus on $g(z_\gamma, x^i_m, x^j_\bullet) = f_{X^i,X^j}(x^i_m, x^j_\bullet)^2$, as the estimator of $\tau_g$ then has a corresponding U-statistic representation, which brings the desired asymptotic normality properties for weakly dependent data. Substituting into Eq. (4.6), one finds that

$$\tau = f_{Z,X^i,X^j}(z_\gamma, x^i_\gamma, x^j_\bullet) \, f_{X^i,X^j}(x^i_m, x^j_\bullet) - f_{X^i,X^j}(x^i_\gamma, x^j_\bullet) \, f_{Z,X^i,X^j}(z_\gamma, x^i_m, x^j_\bullet). \tag{4.7}$$
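The substitution leading to Eq. (4.7) can be spelled out in one line. The shorthand $A$, $B$, $C$, $D$ is introduced here only for illustration, and $\bullet$ stands for the scenario-dependent subscript ($\gamma$ in Scenario 1, $m$ in Scenario 2).

```latex
\begin{aligned}
A &= f_{Z,X^i,X^j}(z_\gamma, x^i_\gamma, x^j_\bullet), &
B &= f_{X^i,X^j}(x^i_m, x^j_\bullet), \\
C &= f_{X^i,X^j}(x^i_\gamma, x^j_\bullet), &
D &= f_{Z,X^i,X^j}(z_\gamma, x^i_m, x^j_\bullet), \\
\tau_g &= \left( \frac{A}{B} - \frac{C}{B}\,\frac{D}{B} \right) B^2 = AB - CD = \tau .
\end{aligned}
```

Since the weight $g = B^2$ is strictly positive wherever the density $B$ is, $\tau = 0$ holds exactly when the ratio condition in Eq. (4.5) holds.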

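To evaluate the data-driven representation of $\tau$, we rely on kernel methods. In particular, we consider the local density estimator

$$\hat{f}_W(w) = \frac{\varepsilon^{-d_W}}{n} \sum_{k=1}^{n} K\!\left( \frac{w - w_k}{\varepsilon} \right), \tag{4.8}$$

where $n$ is the sample size, $\varepsilon$ is the bandwidth parameter (similar to $\mu$ in Def. 4.2.1), $d_W$ is the dimension of the vector $W$ and $K(\cdot)$ is a bounded Borel function $\mathbb{R}^{d_W} \to \mathbb{R}$ satisfying

$$\int |K(t)| \, dt < \infty, \quad \int K(t) \, dt = 1 \quad \text{and} \quad |tK(t)| \to 0 \ \text{as} \ |t| \to \infty. \tag{4.9}$$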

In practice, $K(\cdot)$ is often chosen to be a probability density function (Wand and Jones, 1995).

In order to guarantee the consistency of the pointwise density estimators, we assume that the bandwidth parameter $\varepsilon$ comes from a sequence $\varepsilon_n$ which decreases slowly with the sample size, i.e.

$$\varepsilon_n \to 0 \quad \text{and} \quad n\varepsilon_n \to \infty \quad \text{as} \ n \to \infty. \tag{4.10}$$

Parzen (1962) shows that under conditions (4.9) and (4.10), and provided that $f$ is continuous at $w$, the estimate of the density $f$ at a given point $w$ is consistent.
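The local density estimator in Eq. (4.8) can be sketched as follows. This is a minimal illustration which assumes a standard normal product kernel as one admissible choice of $K$; the function names and the simulated sample are hypothetical.

```python
import numpy as np

def gaussian_kernel(t):
    """Standard normal product kernel on R^d (an assumed choice of K)."""
    d = t.shape[-1]
    return np.exp(-0.5 * np.sum(t ** 2, axis=-1)) / (2.0 * np.pi) ** (d / 2.0)

def kde_at_point(w, sample, eps, kernel=gaussian_kernel):
    """Eq. (4.8): eps^{-d_W} / n * sum_k K((w - w_k) / eps), evaluated at one point w."""
    sample = np.asarray(sample, dtype=float)
    if sample.ndim == 1:
        sample = sample[:, None]            # treat a 1-D sample as n points in R^1
    w = np.atleast_1d(np.asarray(w, dtype=float))
    n, d_w = sample.shape
    return eps ** (-d_w) / n * np.sum(kernel((w - sample) / eps))

# Illustration on simulated standard normal data: the estimate at 0 should be
# close to the true density 1/sqrt(2*pi), up to smoothing bias and sampling noise.
rng = np.random.default_rng(1)
x = rng.standard_normal(5000)
estimate = kde_at_point(0.0, x, eps=0.3)
truth = 1.0 / np.sqrt(2.0 * np.pi)
```

Shrinking `eps` reduces the smoothing bias but raises the variance of the estimate, which is the trade-off behind condition (4.10).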

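For a given bandwidth $\varepsilon$, a natural estimator for $\tau$ is

$$T_n(\varepsilon) = C \sum_{k=1}^{n} \sum_{p=1}^{n} \left[ K\!\left( \frac{(z_\gamma, x^i_\gamma, x^j_\bullet)^T - (z_k, x^i_k, x^j_k)^T}{\varepsilon} \right) K\!\left( \frac{(x^i_m, x^j_\bullet)^T - (x^i_p, x^j_p)^T}{\varepsilon} \right) - K\!\left( \frac{(x^i_\gamma, x^j_\bullet)^T - (x^i_k, x^j_k)^T}{\varepsilon} \right) K\!\left( \frac{(z_\gamma, x^i_m, x^j_\bullet)^T - (z_p, x^i_p, x^j_p)^T}{\varepsilon} \right) \right], \tag{4.11}$$

where $\varepsilon$ is the bandwidth and

$$C = \frac{\varepsilon^{-d_Z - 2d_{X^i} - 2d_{X^j}}}{n^2}. \tag{4.12}$$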

(We sum over two indices as this allows us to calculate the variance of $T_n(\varepsilon)$ explicitly.) The asymptotic distribution of the test statistic can be derived from the properties of second-order U-statistics, as described by Serfling (1980) and van der Vaart (1998).
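The double sum in Eq. (4.11) can be written out directly. The sketch below assumes scalar $Z$, $X^i$ and $X^j$ (one lag each, so all dimensions equal 1) and a standard normal product kernel; both are illustrative assumptions, as are the function names and the simulated data. Because each summand is a product of one term indexed by $k$ and one indexed by $p$, the double sum also equals a difference of products of the density estimates from Eq. (4.8), which the second function uses as a cross-check.

```python
import numpy as np

def gaussian_kernel(t):
    """Standard normal product kernel (assumed choice of K for illustration)."""
    d = t.shape[-1]
    return np.exp(-0.5 * np.sum(t ** 2, axis=-1)) / (2.0 * np.pi) ** (d / 2.0)

def t_n_double_sum(z, x_i, x_j, z_g, xi_g, xi_m, xj_pt, eps):
    """Eq. (4.11) as an explicit double sum over k and p, with d_Z = d_Xi = d_Xj = 1."""
    w3 = np.column_stack([z, x_i, x_j])     # rows (z_k, x^i_k, x^j_k)
    w2 = np.column_stack([x_i, x_j])        # rows (x^i_k, x^j_k)
    n = w3.shape[0]
    k_a = gaussian_kernel((np.array([z_g, xi_g, xj_pt]) - w3) / eps)   # index k
    k_b = gaussian_kernel((np.array([xi_m, xj_pt]) - w2) / eps)        # index p
    k_c = gaussian_kernel((np.array([xi_g, xj_pt]) - w2) / eps)        # index k
    k_d = gaussian_kernel((np.array([z_g, xi_m, xj_pt]) - w3) / eps)   # index p
    c = eps ** (-5.0) / n ** 2              # Eq. (4.12): exponent -(1 + 2 + 2)
    return c * np.sum(np.outer(k_a, k_b) - np.outer(k_c, k_d))

def t_n_product(z, x_i, x_j, z_g, xi_g, xi_m, xj_pt, eps):
    """Same quantity written as a difference of products of density estimates."""
    w3 = np.column_stack([z, x_i, x_j])
    w2 = np.column_stack([x_i, x_j])
    n = w3.shape[0]
    f3 = lambda pt: eps ** (-3.0) / n * gaussian_kernel((np.array(pt) - w3) / eps).sum()
    f2 = lambda pt: eps ** (-2.0) / n * gaussian_kernel((np.array(pt) - w2) / eps).sum()
    return (f3([z_g, xi_g, xj_pt]) * f2([xi_m, xj_pt])
            - f2([xi_g, xj_pt]) * f3([z_g, xi_m, xj_pt]))

# Simulated toy data in which X^j depends on the previous value of X^i.
rng = np.random.default_rng(2)
n_obs = 301
xi_full = rng.standard_normal(n_obs)
xj_full = rng.standard_normal(n_obs)
xj_full[1:] += 0.5 * xi_full[:-1]
z, x_i, x_j = xj_full[1:], xi_full[:-1], xj_full[:-1]

# Evaluation points: tail quantiles and median; x^j is set at its tail (Scenario 1).
args = (np.quantile(z, 0.05), np.quantile(x_i, 0.05), np.median(x_i),
        np.quantile(x_j, 0.05), 0.5)
t1 = t_n_double_sum(z, x_i, x_j, *args)
t2 = t_n_product(z, x_i, x_j, *args)
```

The product form needs only four single sums, so it is cheaper to evaluate; the double-sum form is the one that makes the U-statistic structure visible.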

Theorem 4.2.1. Under the conditions described by Eqs. (4.9) and (4.10), for a given set of $\mathrm{VaR}_\gamma$ levels and a given bandwidth parameter sequence $\varepsilon_n$, the test statistic $T_n(\varepsilon_n)$ satisfies

$$\sqrt{n} \, \frac{T_n(\varepsilon_n) - \tau}{S_n} \xrightarrow{d} N(0, 1),$$

where $S_n$ is a heteroskedasticity and autocorrelation consistent estimator of the asymptotic standard deviation of $\sqrt{n}(T_n(\varepsilon_n) - \tau)$.
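To give a flavor of what a heteroskedasticity and autocorrelation consistent estimator looks like, the sketch below computes a long-run variance with Bartlett weights in the spirit of Newey and West (1987). The construction of the per-observation terms that enter $S_n$ is not shown in this section, so the input `u` is a hypothetical placeholder series, and the function name and lag choice are assumptions made for illustration.

```python
import numpy as np

def newey_west_lrv(u, lags):
    """Long-run variance of the series u using Bartlett weights 1 - h / (lags + 1)."""
    u = np.asarray(u, dtype=float)
    n = u.size
    u = u - u.mean()
    lrv = np.dot(u, u) / n                       # lag-0 autocovariance
    for h in range(1, lags + 1):
        gamma_h = np.dot(u[h:], u[:-h]) / n      # lag-h autocovariance
        lrv += 2.0 * (1.0 - h / (lags + 1.0)) * gamma_h
    return lrv

# Placeholder AR(1) series; in an application u would hold the per-observation
# contributions to the statistic, which are not constructed here.
rng = np.random.default_rng(3)
e = rng.standard_normal(1000)
u = np.empty_like(e)
u[0] = e[0]
for t in range(1, e.size):
    u[t] = 0.5 * u[t - 1] + e[t]

s_n = np.sqrt(newey_west_lrv(u, lags=10))        # HAC standard deviation estimate
```

With positive autocorrelation, as in the placeholder series, the estimate exceeds the naive sample variance, which is the correction the weak dependence calls for.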

The proof of Theorem 4.2.1 can be found in Appendix 4.A. As argued by Diks and Panchenko (2006), although the test statistic is not positive definite, the one-sided version of the test, i.e. rejecting on larger values, turns out to yield better performance.

In this study we choose $\gamma$ to be 0.05, as it is the most commonly applied VaR significance level. We calculate $\mathrm{VaR}_\gamma$ from the empirical quantile function (Jones, 1992). Following the literature on nonparametric Granger causality testing (Hiemstra and Jones, 1994; Diks and Panchenko, 2006), we take the square kernel function.¹ The square kernel form of the estimator in Eq. (4.8) can be rewritten as

$$\hat{f}^{SQ}_W(w) = (2\varepsilon)^{-d_W} n^{-1} \sum_{k=1}^{n} I(\lVert w - w_k \rVert < \varepsilon), \tag{4.13}$$

where $I(\lVert w - w_k \rVert < \varepsilon)$ is the indicator function taking the value 1 if $\lVert w - w_k \rVert < \varepsilon$ and zero otherwise, and $\lVert \cdot \rVert$ is the supremum norm over all dimensions.

¹ The asymptotic properties of the test statistic are, however, robust to any kernel specification, provided that it satisfies conditions (4.9) and (4.10).
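The square-kernel estimator in Eq. (4.13) amounts to counting the observations inside a hypercube of half-width $\varepsilon$ around $w$. A minimal sketch follows; the function name and the toy sample are hypothetical.

```python
import numpy as np

def square_kernel_density(w, sample, eps):
    """Eq. (4.13): (2*eps)^{-d_W} / n times the number of sample points whose
    supremum-norm distance to w is strictly below eps."""
    sample = np.asarray(sample, dtype=float)
    if sample.ndim == 1:
        sample = sample[:, None]             # treat a 1-D sample as n points in R^1
    w = np.atleast_1d(np.asarray(w, dtype=float))
    n, d_w = sample.shape
    inside = np.max(np.abs(sample - w), axis=1) < eps
    return (2.0 * eps) ** (-d_w) * inside.sum() / n

# Four points in R^2; three of them lie within sup-norm distance 1 of the origin.
pts = np.array([[0.0, 0.0], [0.5, 0.5], [2.0, 2.0], [0.9, -0.9]])
density = square_kernel_density([0.0, 0.0], pts, eps=1.0)
```

Here the count is 3 out of 4 points and the cube has volume $2^2 = 4$, so the estimate equals $3/16$.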

4.2.1 Optimal bandwidth

Although the asymptotic normality of the test statistic holds for an arbitrary decreasing sequence of bandwidths as long as it satisfies the condition in Eq. (4.10), the choice of bandwidth influences the power of the test to a great extent (Silverman, 1998). Therefore, in order to improve the performance of the test, we calculate the optimal size of the bandwidth explicitly. Following Wand and Jones (1995) and Silverman (1998), the optimal bandwidth minimizes the Mean Squared Error (MSE) of $T_n(\varepsilon_n)$, which may be decomposed into the sum of the variance and the squared bias of $T_n(\varepsilon_n)$. It is worth pointing out that the optimal bandwidth values of $T_n(\varepsilon_n)$ do not violate the consistency properties of any of the density estimators.

Corollary 4.2.1. Under the conditions given by Eqs. (4.9) and (4.10), the MSE-optimal sequence of bandwidths of $T_n(\varepsilon_n)$ guarantees consistency of any of the pointwise density estimators contributing to $T_n(\varepsilon_n)$.

The proof of Corollary 4.2.1 is given in Appendix 4.B. In fact, the MSE-optimal rate of convergence of the bandwidth of $T_n(\varepsilon_n)$ is slightly faster than that of the individual density estimators, but still much slower than $n^{-1}$. This is caused by the increased variance of a product of two estimators compared to their individual variances. Therefore, in order to balance this effect in the MSE, the sequence of optimal bandwidths of $T_n(\varepsilon_n)$ should decrease at a slightly faster rate as $n \to \infty$, but never as fast as $n^{-1}$. In testing for systemic risk this proves to be of great importance, as with a bandwidth parameter decreasing only slightly with the sample size we are still able to capture the majority of returns which lie to the left of $\mathrm{VaR}_\gamma$.

In evaluating the optimal bandwidth value we rely on Monte Carlo methods. To correct for weak dependence, we apply the autocorrelation consistent estimator of the variance of $T_n(\varepsilon)$ proposed in Newey and West (1987). It can be verified that, for a given bandwidth $\varepsilon$, the bias of $T_n(\varepsilon)$ may be calculated from a Taylor expansion around any point as

$$\begin{aligned} E[T_n(\varepsilon)] - \tau = \frac{1}{2} \kappa_2 \varepsilon^2 \Big[ & f_{Z,X^i,X^j}(z_r, x^i_r, x^j_\bullet) \, \nabla^2 f_{X^i,X^j}(x^i_s, x^j_\bullet) + f_{X^i,X^j}(x^i_s, x^j_\bullet) \, \nabla^2 f_{Z,X^i,X^j}(z_r, x^i_r, x^j_\bullet) \\ & - f_{X^i,X^j}(x^i_r, x^j_\bullet) \, \nabla^2 f_{Z,X^i,X^j}(z_r, x^i_s, x^j_\bullet) - f_{Z,X^i,X^j}(z_r, x^i_s, x^j_\bullet) \, \nabla^2 f_{X^i,X^j}(x^i_r, x^j_\bullet) \Big] + o(\varepsilon^2), \end{aligned} \tag{4.14}$$

where $\kappa_2$ is the second moment of the kernel and $\nabla^2 f_W(w)$ is the trace of the second derivative of the density evaluated at point $w$. Up to an error of order $o(\varepsilon^2)$, Eq. (4.14) has a plug-in estimator, which can be easily calculated using kernel methods (Wand and Jones, 1995).