
2.2. Vine copulas

2.2.5. Simulation and fitting of vine copulas


One of the main reasons why vine copulas are considered a very useful tool for modeling dependence in practice is the availability of software, for example for simulation and fitting. These implementations are contained in the R library VineCopula (Schepsmeier et al., 2017) for the parametric simplified framework.

Handling non-parametric simplified and parametric non-simplified vines is numerically challenging, but there is software available: kdevine (Nagler, 2017) and gamCopula (Vatter and Nagler, 2016), respectively. We use VineCopula for all numerical vine copula related applications throughout this thesis. The parametric bivariate copulas used as candidate models are Gaussian, Student t, Clayton, Gumbel, Frank, Joe, BB1, BB6, BB7, BB8, Tawn type 1 and Tawn type 2 as well as their survival versions and 90/270 degree rotations (for details see Schepsmeier et al., 2017).

For simulation and Monte Carlo integration it is important that we can sample from vine copula distributions. Stöber and Czado (2012) and Joe (2014) provide sampling algorithms for arbitrary vine copulas. They are based on the inverse Rosenblatt transformation (Rosenblatt, 1952), which is given by $T_c: [0,1]^d \to [0,1]^d$, $w = (w_1, \ldots, w_d)^\top \mapsto (T_{c,1}(w), \ldots, T_{c,d}(w))^\top$. The components of $T_c(w)$ can be defined recursively by $T_{c,m_{d,d}}(w) = w_{m_{d,d}}$ and
$$T_{c,m_{j,j}}(w) = C^{-1}_{m_{j,j}|m_{j+1,j+1},\ldots,m_{d,d}}\big(w_{m_{j,j}} \,\big|\, T_{c,m_{j+1,j+1}}(w), \ldots, T_{c,m_{d,d}}(w)\big) \qquad (2.5)$$
for $j = 1, \ldots, d-1$, where $m_{j,j}$ denotes the $j$th diagonal entry of the structure matrix of the vine copula. The corresponding Rosenblatt transform is given by $T_c^{-1}: [0,1]^d \to [0,1]^d$, $u = (u_1, \ldots, u_d)^\top \mapsto (T^{-1}_{c,1}(u), \ldots, T^{-1}_{c,d}(u))^\top$, where $T^{-1}_{c,m_{d,d}}(u) = u_{m_{d,d}}$ and
$$T^{-1}_{c,m_{j,j}}(u) = C_{m_{j,j}|m_{j+1,j+1},\ldots,m_{d,d}}\big(u_{m_{j,j}} \,\big|\, u_{m_{j+1,j+1}}, \ldots, u_{m_{d,d}}\big). \qquad (2.6)$$

The sampling algorithm then works as follows: First, sample $w_j \sim \mathrm{uniform}(0,1)$ for $j = 1, \ldots, d$. Then, apply the inverse Rosenblatt transform $T_c$ to the uniform sample, i.e. $u = (u_1, \ldots, u_d)^\top = T_c(w)$, where $w = (w_1, \ldots, w_d)^\top$ is mapped from the (uniform) w-scale to the (warped) u-scale in the following way:

• $u_{m_{d,d}} := w_{m_{d,d}}$,

• $u_{m_{d-1,d-1}} = C^{-1}_{m_{d-1,d-1}|m_{d,d}}\big(w_{m_{d-1,d-1}} \,\big|\, u_{m_{d,d}}\big)$, ...

• $u_{m_{1,1}} = C^{-1}_{m_{1,1}|m_{2,2},\ldots,m_{d,d}}\big(w_{m_{1,1}} \,\big|\, u_{m_{2,2}}, \ldots, u_{m_{d,d}}\big)$.

Note that the appearing (inverse) conditional distribution functions can be obtained easily for vine copulas (Stöber and Czado, 2012, Section 5.3). This sampling algorithm is implemented in VineCopula as RVineSim.
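For illustration, the following R sketch specifies a small three-dimensional vine by hand and samples from it with RVineSim; the chosen structure, families and parameters are arbitrary examples and not taken from any data set.

    library(VineCopula)

    # Illustrative 3-dimensional vine: structure matrix (diagonal 3, 2, 1),
    # pair-copula families (1 = Gaussian, 3 = Clayton, 4 = Gumbel) and parameters.
    Matrix <- matrix(c(3, 0, 0,
                       2, 2, 0,
                       1, 1, 1), 3, 3, byrow = TRUE)
    family <- matrix(c(0, 0, 0,
                       3, 0, 0,
                       4, 1, 0), 3, 3, byrow = TRUE)
    par    <- matrix(c(0,   0,   0,
                       2,   0,   0,
                       1.5, 0.5, 0), 3, 3, byrow = TRUE)
    RVM <- RVineMatrix(Matrix = Matrix, family = family,
                       par = par, par2 = matrix(0, 3, 3))

    # Draw 1000 observations; internally this applies the inverse Rosenblatt
    # transform of Equation (2.5) to independent uniforms.
    u_sim <- RVineSim(1000, RVM)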

Using the tree representation of vine copulas, Dißmann et al. (2013) developed a sequential estimation method that fits a simplified parametric vine, i.e. the structure as well as the corresponding pair-copula families and parameters, to a given data set tree-by-tree. Dißmann's algorithm is the most frequently used procedure for fitting vine copulas and works as follows: First, the empirical Kendall's τ values are calculated for all pairs. Then, a spanning tree maximizing the sum of absolute Kendall's τ values is determined such that most dependence is captured in the first tree of the vine. For every edge the maximum-likelihood estimate for each possible pair-copula from the candidate set is determined. Then, the pair-copula with the highest likelihood, AIC or BIC is assigned to the edge. Having specified the first tree, the pseudo-data for the second tree are determined by applying the fitted conditional distribution functions. For the second tree, the empirical Kendall's τ values for all edges admissible with respect to the proximity condition are determined. Then, as for the first tree, a maximal spanning tree with corresponding optimal pair-copulas is selected. This procedure is repeated until all d−1 trees of the vine copula are specified. For a more detailed description see Dißmann et al. (2013). This algorithm is also implemented in VineCopula as the function RVineStructureSelect.
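Continuing the sketch above, Dißmann's algorithm can be invoked as follows; the candidate family set is restricted here purely for illustration (codes 1–6 correspond to the Gaussian, Student t, Clayton, Gumbel, Frank and Joe families), and the fitted object can be inspected directly.

    library(VineCopula)

    # Sequential tree-by-tree selection of structure, families and parameters.
    fit <- RVineStructureSelect(u_sim, familyset = c(1, 2, 3, 4, 5, 6),
                                selectioncrit = "AIC")

    fit$Matrix                       # selected structure matrix
    fit$family                       # selected pair-copula families
    fit$par                          # estimated first parameters
    RVineLogLik(u_sim, fit)$loglik   # log-likelihood of the fitted vine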

3. Model distances for vine copulas

The contents of this chapter are a lightly edited reproduction of the published contents in Killiches et al. (2017b) and of parts of the submitted contents in Killiches et al. (2017c).

Sections 3.1 and 3.7 consist of modified parts of both Killiches et al. (2017b) and Killiches et al. (2017c). Sections 3.2 and 3.3 are based on Killiches et al. (2017b) and Sections 3.4 to 3.6 present contents of Killiches et al. (2017c).

3.1. Introduction

With growing data sets and increasing computing power, statistical data analysis has developed considerably within the last decade. The necessity of proper dependence modeling has become evident at least since the financial crisis of 2007. Using vine copulas is a popular option to approach this task. The advantage of these models is that they are flexible and numerically tractable even in high dimensions.

Since it is interesting in many cases to determine how much two models differ, some authors like Stöber et al. (2013) and Schepsmeier (2015) use the Kullback–Leibler (KL) distance (Kullback and Leibler, 1951), also known as KL divergence, as a model distance between vines. A symmetrized version of the KL distance is given by the Jeffreys distance (JD) (Jeffreys, 1946). In model selection for copulas the KL distance is frequently used (see for example Chen and Fan, 2005, 2006; Diks et al., 2010). In the context of vine copulas, Joe (2014, Section 5.7) used the KL distance to calculate the sample size necessary to discriminate between two densities. Investigating the simplifying assumption, Hobæk Haff et al. (2010) used the KL distance to find the simplified vine closest to a given non-simplified vine, and Stöber et al. (2013) gauged the strength of non-simplifiedness of the trivariate Farlie-Gumbel-Morgenstern (FGM) copula for different dependence parameters.

Similarly, Spanhel and Kurz (2015) use the KL distance to assess the quality of simplified vine copula approximations. However, all popular distance measures require multivariate integration, which is why they can only deal with up to three- or four-dimensional models in a reasonable amount of time.

In this chapter we will address the question of how to measure the distance between two vine copulas even in high dimensions and show how to use distance measures for model selection in two applications. For this purpose, we develop methods based on the Kullback–Leibler distance, where we use the fact that it can be expressed as the sum over expectations of KL distances between univariate conditional densities. By cleverly approximating these expectations in different ways, we introduce three new distance measures with varying focuses. The approximate Kullback–Leibler distance (aKL) aims to approximate the true Kullback–Leibler distance via structured Monte Carlo integration and is a computationally tractable distance measure in up to five dimensions. The diagonal Kullback–Leibler distance (dKL) focuses on the distance between two vine copulas on specific conditioning vectors, namely those lying on certain diagonals in the space. We show that even though the resulting distance measure does not approximate the KL distance in a classical sense, it still reproduces its qualitative behavior quite well. While this way of measuring distances between vines is fast in up to ten dimensions, we still have to reduce the number of evaluation points in order to get a numerically tractable distance measure for dimensions 30 and higher. By concentrating on only one specific diagonal we achieve this, defining the single diagonal Kullback–Leibler distance (sdKL). The lack of symmetry of the KL distance and its substitutes is overcome by developing similar approximations to the Jeffreys distance. In numerous examples and applications we illustrate that the proposed methods are valid distance measures and outperform benchmark approaches like Monte Carlo integration regarding computational time. Moreover, in order to enable the assessment of the size of our developed distance measures, we provide a baseline calibration based on the comparison of specific Gaussian copulas to the independence copula.

Further, we show possible fields of application for the dKL and sdKL in model selection. For this purpose we develop a hypothesis test that answers the question whether the distance between two models from nested model classes is significant. Then we show how to select the best model out of a list of candidate models with the help of a model distance based measure. Finally, we also use the new distance measures and the developed hypothesis test to determine the optimal truncation level of a fitted vine copula, a task recently discussed by Brechmann et al. (2012) and Brechmann and Joe (2015). Truncation methods aim to enable high-dimensional vine copula modeling by severely reducing the number of used parameters without changing the fit of the resulting model too much.

The remainder of this chapter is organized as follows: In Section 3.2 we develop the above-mentioned modified model distances for vine copulas and perform several plausibility checks on their performance. Section 3.3 contains a simulation study comparing the performances of all introduced distance measures. In order to facilitate model selection using model distances, we provide a hypothesis test based on parametric bootstrapping in Section 3.4. In Section 3.5 we show how the model distances can be used to assess the best model fit out of a set of candidate models. As a final application, the determination of the optimal truncation level of a vine copula is discussed in Section 3.6. Section 3.7 concludes the chapter with some summarizing comments.

3.2. Model distances for vines

There are many motivations to measure the model distance between different vines. For example, Stöber et al. (2013) try to find the simplified vine with the smallest distance to a given non-simplified vine. Further, it might be of interest to measure the distance between a vine copula and a Gaussian copula, both fitted to the same data set, in order to assess the need for the more complicated model. Common methods to measure such distances are the Kullback–Leibler distance and the Jeffreys distance.

In order to simplify notation, for the remainder of this chapter we assume that the diagonal of a $d$-dimensional structure matrix is given by $1:d$. This assumption comes without any loss of generality: Property 2 from Definition 2.1 implies that the diagonal of any vine structure matrix is a permutation of $1:d$, where we use the notation $r:s$ to describe the vector $(r, r+1, \ldots, s)^\top$ for $r \le s$. Hence, relabeling the variables suffices to obtain the desired property.

Further, for a simplified vine we define the associated matched Gaussian vine, i.e. the vine with the same structure matrix and the same Kendall's τ values associated with the pair-copulas, but with only Gaussian pair-copulas.

Definition 3.1 (Matched Gaussian vine). For a simplified vine copula $\mathcal{R} = (M, B, P^{(1)}, P^{(2)})$ let $K = (k_{i,j})_{i,j=1}^{d}$ denote the lower-triangular matrix that contains the corresponding Kendall's τ values. Then, the matched Gaussian vine of $\mathcal{R}$ is given by the vine copula $\tilde{\mathcal{R}} = (M, \tilde{B}, \tilde{P}^{(1)}, \tilde{P}^{(2)})$, where $\tilde{B}$ is a family matrix in which all entries are Gaussian pair-copulas, $\tilde{P}^{(1)} = (\tilde{p}^{(1)}_{i,j})_{i,j=1}^{d}$ is the parameter matrix with $\tilde{p}^{(1)}_{i,j} = \sin\!\big(\tfrac{\pi}{2} k_{i,j}\big)$, and $\tilde{P}^{(2)}$ is a zero matrix.
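A minimal R sketch of this construction, assuming an RVineMatrix object RVM (such as the one above); RVinePar2Tau is used to extract the Kendall's τ matrix of the pair-copulas.

    library(VineCopula)

    # Matched Gaussian vine of Definition 3.1: keep the structure and the Kendall's
    # tau values, replace every pair-copula by a Gaussian one.
    matched_gaussian <- function(RVM) {
      d   <- nrow(RVM$Matrix)
      K   <- RVinePar2Tau(RVM)                 # Kendall's tau of each pair-copula
      fam <- matrix(0, d, d)
      fam[lower.tri(fam)] <- 1                 # family code 1 = Gaussian
      p1  <- matrix(0, d, d)
      p1[lower.tri(p1)] <- sin(pi / 2 * K[lower.tri(K)])   # tau -> correlation
      RVineMatrix(Matrix = RVM$Matrix, family = fam,
                  par = p1, par2 = matrix(0, d, d))
    }

    RVM_gauss <- matched_gaussian(RVM)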

3.2.1. Kullback–Leibler distance

Kullback and Leibler (1951) introduced a measure that indicates the distance between two $d$-dimensional statistical models with densities $f, g: \mathbb{R}^d \to [0,\infty)$. The so-called Kullback–Leibler distance between $f$ and $g$ is defined as
$$\mathrm{KL}(f, g) := \int_{x \in \mathbb{R}^d} \ln\!\left(\frac{f(x)}{g(x)}\right) f(x)\, dx. \qquad (3.1)$$

The KL distance between $f$ and $g$ can also be expressed as an expectation with respect to $f$:
$$\mathrm{KL}(f, g) = \mathbb{E}_f\!\left[\ln\!\left(\frac{f(X)}{g(X)}\right)\right], \qquad (3.2)$$
where $X \sim f$. Note that the KL distance is non-negative and equal to zero if and only if $f = g$. It is not symmetric, i.e. in general $\mathrm{KL}(f, g) \neq \mathrm{KL}(g, f)$ for arbitrary densities $f$ and $g$. To clarify the order of the arguments, in the following we denote $f$ as the reference density. Further, since symmetry is one of the properties of a distance, the Kullback–Leibler distance is not a distance in the classical sense and thus is often referred to as Kullback–Leibler divergence. A symmetrized version of the KL distance is given by the Jeffreys distance (Jeffreys, 1946), which is defined as
$$\mathrm{JD}(f, g) = \mathrm{KL}(f, g) + \mathrm{KL}(g, f). \qquad (3.3)$$
Since the Jeffreys distance is just a sum of two Kullback–Leibler distances, we will in the following sections concentrate on the KL distance and apply our results to the Jeffreys distance in Section 3.3.2.

Under the assumption that $f$ and $g$ have identical marginals, i.e. $f_j = g_j$, $j = 1, \ldots, d$, the KL distance between $f$ and $g$ is equal to the KL distance between their corresponding copula densities. This is due to the fact that the KL distance is invariant under one-to-one transformations of the marginals (Cover and Thomas, 2012). Hence, if we let $c^f$ and $c^g$ be the copula densities corresponding to $f$ and $g$, respectively, and assume that $f$ and $g$ have the same marginal densities, we obtain
$$\mathrm{KL}(f, g) = \mathrm{KL}(c^f, c^g). \qquad (3.4)$$

In this chapter we are mainly interested in comparing different models that are ob-tained by fitting a data set. Since we usually first estimate the margins and afterwards the dependence structure (cf. IFM method in Joe, 1997, Section 10.1), the assumption of identical margins is always fulfilled. Hence, we will in the following concentrate on calculating the Kullback–Leibler distance between copula densities.
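The two-step workflow can be sketched as follows; rank-based pseudo-observations (pobs) are used here as a simple stand-in for the parametric margin fits of the IFM method, and the toy data reuse the vine RVM from the earlier sketch.

    library(VineCopula)

    set.seed(42)
    x <- qnorm(RVineSim(500, RVM))   # toy data: vine dependence, standard normal margins

    # Step 1: remove the margins (copula-scale data in (0,1)).
    u <- pobs(x)

    # Step 2: fit competing dependence models to the same copula-scale data.
    fit_vine  <- RVineStructureSelect(u, familyset = NA)   # all parametric families
    fit_gauss <- RVineStructureSelect(u, familyset = 1)    # Gaussian pair-copulas only
    # Both fits share the margins by construction, so comparing them amounts to
    # comparing their copula densities, cf. Equation (3.4).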

Having a closer look at the definition of the KL distance, we see that for its calculation a $d$-dimensional integral has to be evaluated. In general, this cannot be done analytically and, further, is numerically infeasible in high dimensions. For example, Schepsmeier (2015) stresses the difficulty of numerical integration in dimensions 8 and higher. In this section, we propose modifications of the Kullback–Leibler distance designed to be computationally tractable and still measure model distances adequately. These modifications are all based on the following proposition, which shows that the KL distance between $d$-dimensional copula densities $c^f$ and $c^g$ can be expressed as the sum over expectations of KL distances between univariate conditional densities.

Proposition 3.2. For two copula densities $c^f$ and $c^g$ it holds:
$$\mathrm{KL}(c^f, c^g) = \sum_{j=1}^{d} \mathbb{E}_{c^f_{(j+1):d}}\Big[\mathrm{KL}\big(c^f_{j|(j+1):d}(\cdot \,|\, U_{(j+1):d}),\; c^g_{j|(j+1):d}(\cdot \,|\, U_{(j+1):d})\big)\Big], \qquad (3.5)$$
where $U_{(j+1):d} \sim c^f_{(j+1):d}$ and $(d+1):d := \emptyset$. Further, $c^f_{j|(j+1):d}(\cdot \,|\, u_{j+1}, \ldots, u_d)$ denotes the univariate conditional density of $U_j \,|\, (U_{j+1}, \ldots, U_d)^\top = (u_{j+1}, \ldots, u_d)^\top$ implied by the density $c^f$.

We will prove an even more general version of Proposition 3.2 that holds for arbitrary densities $f$ and $g$:
$$\mathrm{KL}(f, g) = \sum_{j=1}^{d} \mathbb{E}_{f_{(j+1):d}}\Big[\mathrm{KL}\big(f_{j|(j+1):d}(\cdot \,|\, X_{(j+1):d}),\; g_{j|(j+1):d}(\cdot \,|\, X_{(j+1):d})\big)\Big],$$
where $X_{(j+1):d} \sim f_{(j+1):d}$ and $f_{j|(j+1):d}(\cdot \,|\, x_{j+1}, \ldots, x_d)$ denotes the univariate conditional density of $X_j \,|\, (X_{j+1}, \ldots, X_d)^\top = (x_{j+1}, \ldots, x_d)^\top$ implied by $f$. Proposition 3.2 then follows directly from this statement.

Proof. Recall that using recursive conditioning we can write the density $f$ as
$$f(x_1, \ldots, x_d) = \prod_{j=1}^{d} f_{j|(j+1):d}\big(x_j \,\big|\, x_{(j+1):d}\big).$$
Thus, the Kullback–Leibler distance between $f$ and $g$ can be written in the following way:
$$\begin{aligned}
\mathrm{KL}(f, g) &= \int_{x \in \mathbb{R}^d} \ln\!\left(\frac{f(x)}{g(x)}\right) f(x)\, dx \\
&= \int_{x \in \mathbb{R}^d} \sum_{j=1}^{d} \ln\!\left(\frac{f_{j|(j+1):d}(x_j \,|\, x_{(j+1):d})}{g_{j|(j+1):d}(x_j \,|\, x_{(j+1):d})}\right) f(x)\, dx \\
&= \sum_{j=1}^{d} \int_{x_d \in \mathbb{R}} \cdots \int_{x_1 \in \mathbb{R}} \ln\!\left(\frac{f_{j|(j+1):d}(x_j \,|\, x_{(j+1):d})}{g_{j|(j+1):d}(x_j \,|\, x_{(j+1):d})}\right) f(x_1, \ldots, x_d)\, dx_1 \cdots dx_d \\
&= \sum_{j=1}^{d} \int_{x_d \in \mathbb{R}} \cdots \int_{x_j \in \mathbb{R}} \ln\!\left(\frac{f_{j|(j+1):d}(x_j \,|\, x_{(j+1):d})}{g_{j|(j+1):d}(x_j \,|\, x_{(j+1):d})}\right) \left(\int_{x_{j-1} \in \mathbb{R}} \cdots \int_{x_1 \in \mathbb{R}} f(x_1, \ldots, x_d)\, dx_1 \cdots dx_{j-1}\right) dx_j \cdots dx_d \\
&= \sum_{j=1}^{d} \int_{x_d \in \mathbb{R}} \cdots \int_{x_j \in \mathbb{R}} \ln\!\left(\frac{f_{j|(j+1):d}(x_j \,|\, x_{(j+1):d})}{g_{j|(j+1):d}(x_j \,|\, x_{(j+1):d})}\right) f_{j:d}(x_j, \ldots, x_d)\, dx_j \cdots dx_d \\
&= \sum_{j=1}^{d} \int_{x_d \in \mathbb{R}} \cdots \int_{x_{j+1} \in \mathbb{R}} \left(\int_{x_j \in \mathbb{R}} \ln\!\left(\frac{f_{j|(j+1):d}(x_j \,|\, x_{(j+1):d})}{g_{j|(j+1):d}(x_j \,|\, x_{(j+1):d})}\right) f_{j|(j+1):d}(x_j \,|\, x_{(j+1):d})\, dx_j\right) f_{(j+1):d}(x_{(j+1):d})\, dx_{j+1} \cdots dx_d \\
&= \sum_{j=1}^{d} \mathbb{E}_{f_{(j+1):d}}\Big[\mathrm{KL}\big(f_{j|(j+1):d}(\cdot \,|\, X_{(j+1):d}),\; g_{j|(j+1):d}(\cdot \,|\, X_{(j+1):d})\big)\Big].
\end{aligned}$$

Proposition 3.2 is especially useful if $c^f$ and $c^g$ are vine copula densities, since for a vine copula with structure matrix $M$ the appearing (univariate) conditional density $c_{j|(j+1):d}$ of $U_j \,|\, (U_{j+1}, \ldots, U_d)^\top = (u_{j+1}, \ldots, u_d)^\top$ can be easily obtained by taking the product over all pair-copula expressions corresponding to the entries in the $j$th column of $M$. We will prove this in Proposition 3.3.

Proposition 3.3. Let $U = (U_1, \ldots, U_d)^\top$ be a random vector with vine copula density $c$ and corresponding structure matrix $M = (m_{i,j})_{i,j=1}^{d}$. Then, for $j < d$
$$c_{j|(j+1):d}(u_j \,|\, u_{j+1}, \ldots, u_d) = \prod_{k=j+1}^{d} c_{m_{k,j}, m_{j,j}; m_{k+1,j}, \ldots, m_{d,j}}\Big(C_{m_{k,j}|m_{k+1,j},\ldots,m_{d,j}}\big(u_{m_{k,j}} \,|\, u_{m_{k+1,j}}, \ldots, u_{m_{d,j}}\big),\; C_{m_{j,j}|m_{k+1,j},\ldots,m_{d,j}}\big(u_{m_{j,j}} \,|\, u_{m_{k+1,j}}, \ldots, u_{m_{d,j}}\big);\; u_{m_{k+1,j}}, \ldots, u_{m_{d,j}}\Big). \qquad (3.6)$$

Proof. From Equation 2.4 we know that the vine copula density can be written as a product over the pair-copula expressions corresponding to the matrix entries. In Property 2.8 (ii), Dißmann et al. (2013) state that deleting the first row and column from a $d$-dimensional structure matrix yields a $(d-1)$-dimensional trimmed structure matrix. Due to Property 2 from Definition 2.1 the entry $m_{1,1} = 1$ does not appear in the remaining matrix. Hence, we obtain the density $c_{2:d}$ by taking the product over all pair-copula expressions corresponding to the entries of the trimmed matrix. Iterating this argument yields that the entries of the matrix $M_k := (m_{i,j})_{i,j=k+1}^{d}$, resulting from cutting the first $k$ rows and columns from $M$, represent the density $c_{(k+1):d}$. In general, we have
$$c_{j|(j+1):d}(u_j \,|\, u_{j+1}, \ldots, u_d) = \frac{c_{j:d}(u_j, \ldots, u_d)}{c_{(j+1):d}(u_{j+1}, \ldots, u_d)}.$$
The numerator and denominator can be obtained as the products over all pair-copula expressions corresponding to the entries of $M_{j-1}$ and $M_j$, respectively. Thus, $c_{j|(j+1):d}$ is simply the product over the expressions corresponding to the entries in the first column of $M_{j-1}$. This proves Equation 3.6.

As an example, combining the results from Proposition 3.2 and Proposition 3.3, for four-dimensional copula densities $c^f$ and $c^g$ we can write:
$$\mathrm{KL}(c^f, c^g) = \mathbb{E}_{c^f_{2:4}}\Big[\mathrm{KL}\big(c^f_{1|2:4}(\cdot \,|\, U_{2:4}),\; c^g_{1|2:4}(\cdot \,|\, U_{2:4})\big)\Big] + \mathbb{E}_{c^f_{3,4}}\Big[\mathrm{KL}\big(c^f_{2|3,4}(\cdot \,|\, U_{3,4}),\; c^g_{2|3,4}(\cdot \,|\, U_{3,4})\big)\Big] + \mathbb{E}_{c^f_{4}}\Big[\mathrm{KL}\big(c^f_{3|4}(\cdot \,|\, U_{4}),\; c^g_{3|4}(\cdot \,|\, U_{4})\big)\Big] + 0, \qquad (3.7)$$
where for instance
$$c^f_{1|2:4}(u_1 \,|\, u_2, u_3, u_4) = c^f_{1,2}(u_1, u_2)\, c^f_{1,3;2}\big(C^f_{1|2}(u_1|u_2), C^f_{3|2}(u_3|u_2); u_2\big)\, c^f_{1,4;2,3}\big(C^f_{1|2,3}(u_1|u_2, u_3), C^f_{4|2,3}(u_4|u_2, u_3); u_2, u_3\big).$$
The zero in the last line of Equation 3.7 results from the fact that $c^f_4(u_4) = c^g_4(u_4) = 1$ for all $u_4 \in [0,1]$. This is generally the case for the $d$th summand in Equation 3.5, which will therefore be omitted in the following. Further note that the last non-zero term of Equation 3.7 can also be written as $\mathrm{KL}\big(c^f_{3,4}(\cdot, \cdot),\, c^g_{3,4}(\cdot, \cdot)\big)$.

Of course the evaluation of the KL distance with this formula still implicitly requires the calculation of a $d$-dimensional integral, since the expectation in the first summand of Equation 3.5 demands a $(d-1)$-dimensional integral of the KL distance between univariate densities. A commonly used method to approximate expectations is Monte Carlo (MC) integration (see for example Caflisch, 1998): For a random vector $X \in \mathbb{R}^d$ with density $f: \mathbb{R}^d \to [0,\infty)$ and a scalar-valued function $h: \mathbb{R}^d \to \mathbb{R}$, the expectation $\mathbb{E}_f[h(X)] = \int_{\mathbb{R}^d} h(x) f(x)\, dx$ can be approximated by
$$\mathbb{E}_f[h(X)] \approx \frac{1}{N_{\mathrm{MC}}} \sum_{i=1}^{N_{\mathrm{MC}}} h(x_i), \qquad (3.8)$$
where $\{x_i\}_{i=1}^{N_{\mathrm{MC}}}$ is an i.i.d. sample of size $N_{\mathrm{MC}}$ distributed according to the density $f$.
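For orientation, a plain Monte Carlo estimate of the KL distance via Equations (3.2) and (3.8) can be sketched for two bivariate copulas; here a Gumbel copula $c^f$ with Kendall's τ = 0.5 serves as reference density and is compared to its matched Gaussian copula $c^g$ (both choices are purely illustrative).

    library(VineCopula)

    set.seed(1)
    N_MC <- 1e5
    tau  <- 0.5

    # Sample from the reference copula c^f (Gumbel) and evaluate both log-densities.
    u_mc  <- BiCopSim(N_MC, family = 4, par = BiCopTau2Par(4, tau))
    log_f <- log(BiCopPDF(u_mc[, 1], u_mc[, 2], family = 4, par = BiCopTau2Par(4, tau)))
    log_g <- log(BiCopPDF(u_mc[, 1], u_mc[, 2], family = 1, par = sin(pi / 2 * tau)))

    KL_fg <- mean(log_f - log_g)   # MC estimate of KL(c^f, c^g); it is random and can
    KL_fg                          # even turn out negative for small N_MC (see below)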

However, the slow convergence rate of this method has been subject to criticism. Moreover, Do (2003) argues that when approximating the KL distance via Monte Carlo integration the random nature of the method is an unwanted property. Additionally, MC integration might produce negative approximations of KL distances even though it can be shown theoretically that the KL distance is non-negative.

As an alternative to Monte Carlo integration, in the next sections we propose several ways to approximate the expectation in Equation 3.5 by replacing it with the average over a $(d-j)$-dimensional non-random grid $\mathcal{U}_j$, such that
$$\mathrm{KL}(c^f, c^g) \approx \sum_{j=1}^{d-1} \frac{1}{|\mathcal{U}_j|} \sum_{u_{(j+1):d} \in \mathcal{U}_j} \mathrm{KL}\big(c^f_{j|(j+1):d}(\cdot \,|\, u_{(j+1):d}),\; c^g_{j|(j+1):d}(\cdot \,|\, u_{(j+1):d})\big). \qquad (3.9)$$

Note that, being a sum over univariate KL distances, this approximation produces non-negative results, regardless of the grids $\mathcal{U}_j$, $j = 1, \ldots, d-1$. Now, the question remains how to choose the grids $\mathcal{U}_j$ such that the approximation is on the one hand fast to calculate and on the other hand still maintains the main properties of the KL distance. We will provide three possible answers to this question, yielding different distance measures, and investigate their performances.

Throughout the subsequent sections we assume the following setting: Let $\mathcal{R}^f$ and $\mathcal{R}^g$ be two $d$-dimensional vines with copula densities $c^f$ and $c^g$, respectively. We assume that their vine structure matrices have the same entries on the diagonals, i.e. $\mathrm{diag}(M^f) = \mathrm{diag}(M^g)$.

Note that, although this assumption is a restriction, there are still $2^{\binom{d-2}{2}+d-2}$ different vine decompositions with equal diagonals of the structure matrix, which is shown in Proposition 3.4.² As before, without loss of generality we set the diagonals equal to $1:d$.

² This includes, for example, C- and D-vines having the same diagonal.

Proposition 3.4. Let $\sigma = (\sigma_1, \ldots, \sigma_d)^\top$ be a permutation of $1:d$. Then, there exist $2^{\binom{d-2}{2}+d-2}$ different vine decompositions whose structure matrix has the diagonal $\sigma$.

Proof. The number of vine decompositions whose structure matrix has the same diagonal $\sigma$ can be calculated as the quotient of the number of valid structure matrices and the number of possible diagonals. Morales-Nápoles (2011) shows that there are $\frac{d!}{2} \cdot 2^{\binom{d-2}{2}}$ different vine decompositions. In each of the $d-1$ steps of the algorithm for encoding a vine decomposition in a structure matrix (see Stöber and Czado, 2012) we have two possible choices, such that there are $2^{d-1}$ structure matrices representing the same vine decomposition. Hence, there are in total $\frac{d!}{2} \cdot 2^{\binom{d-2}{2}} \cdot 2^{d-1}$ valid structure matrices. Further, there are $d!$ different diagonals. Thus, for a fixed diagonal $\sigma$ there exist
$$\frac{\frac{d!}{2} \cdot 2^{\binom{d-2}{2}} \cdot 2^{d-1}}{d!} = 2^{\binom{d-2}{2}+d-2}$$
different vine decompositions.
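A quick numerical reading of this count for small dimensions (purely illustrative):

    # Number of vine decompositions sharing a fixed structure-matrix diagonal,
    # i.e. 2^(choose(d-2, 2) + d - 2), evaluated for d = 3, ..., 6.
    n_fixed_diag <- function(d) 2^(choose(d - 2, 2) + d - 2)
    sapply(3:6, n_fixed_diag)   # 2 8 64 1024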

3.2.2. Approximate Kullback–Leibler distance

We illustrate the idea of the approximate Kullback–Leibler distance using the example of two three-dimensional vines $\mathcal{R}^f$ and $\mathcal{R}^g$. For the first summand ($j = 1$) of Equation 3.9, the KL distance between $c^f_{1|2,3}(\cdot \,|\, u_2, u_3)$ and $c^g_{1|2,3}(\cdot \,|\, u_2, u_3)$ is calculated for all pairs $(u_2, u_3)^\top$ contained in the grid $\mathcal{U}_1$. In this example we assume that the pair-copula $c^f_{2,3}$ is a Gumbel copula with parameter $\theta = 6$ (implying a Kendall's τ value of 0.83). Regarding the choice of the grid, if we used the Monte Carlo method, $\mathcal{U}_1$ would contain a random sample of $c^f_{2,3}$. Recall from Section 2.2.5 that such a sample can be generated by simulating from a uniform distribution on $[0,1]^2$ and applying the inverse Rosenblatt transformation $T_{c^f_{2,3}}$. Figure 3.1 displays a sample of size 900 on the (uniform) w-scale and its transformation via $T_{c^f_{2,3}}$ to the (warped) u-scale.

Figure 3.1.: Sample of size 900 from the uniform distribution (left) and corresponding warped sample under transformation $T_{c^f_{2,3}}$, which is a sample from a Gumbel copula with $\theta = 6$ (right).

As mentioned before, we do not want our distance measure to be random. This motivates us to introduce the concept of structured Monte Carlo integration: Instead of sampling from the uniform distribution on the w-scale, we use a structured grid $\mathcal{W}$, which is an equidistant lattice on the two-dimensional unit cube,³ and transform it to the warped u-scale by applying the inverse Rosenblatt transformation $T_{c^f_{2,3}}$ (cf. Equation 2.5). Figure 3.2 shows an exemplary structured grid with 30 grid points per margin.

³ Since most copulas have an infinite value at the boundary of the unit cube, we usually restrict ourselves to $[\varepsilon, 1-\varepsilon]^d$ for a small $\varepsilon > 0$.

Figure 3.2.: Structured grid with 30 grid points per margin (left) and corresponding warped grid under transformation $T_{c^f_{2,3}}$ (right).
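A minimal sketch of this warping step, assuming a bivariate Gumbel pair-copula $c^f_{2,3}$ with $\theta = 6$ as in the example and that RVineSim accepts pre-specified uniforms through its U argument (which are then pushed through the inverse Rosenblatt transform):

    library(VineCopula)

    eps <- 0.01
    n   <- 30
    W   <- as.matrix(expand.grid(w1 = seq(eps, 1 - eps, length.out = n),
                                 w2 = seq(eps, 1 - eps, length.out = n)))

    # Two-dimensional vine consisting of the single Gumbel pair-copula (theta = 6).
    RVM23 <- RVineMatrix(Matrix = matrix(c(2, 0,
                                           1, 1), 2, 2, byrow = TRUE),
                         family = matrix(c(0, 0,
                                           4, 0), 2, 2, byrow = TRUE),
                         par    = matrix(c(0, 0,
                                           6, 0), 2, 2, byrow = TRUE),
                         par2   = matrix(0, 2, 2))

    # Non-random structured grid on the w-scale, warped to the u-scale (cf. Figure 3.2).
    U_grid <- RVineSim(nrow(W), RVM23, U = W)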

Applying this procedure for all grids $\mathcal{U}_j$, $j = 1, \ldots, d-1$, yields the approximate Kullback–Leibler distance.

Definition 3.5 (Approximate Kullback–Leibler distance). Let $\mathcal{R}^f$ and $\mathcal{R}^g$ be as described above. Further, let $n \in \mathbb{N}$ be the number of grid points per margin and $\varepsilon > 0$. Then, the approximate Kullback–Leibler distance (aKL) between $\mathcal{R}^f$ (reference vine) and $\mathcal{R}^g$ is defined as the right-hand side of Equation 3.9, where the grid $\mathcal{U}_j$ is obtained by transforming an equidistant grid $\mathcal{G}_j$ on $[\varepsilon, 1-\varepsilon]^{d-j}$ with $n$ points per margin via the inverse Rosenblatt transform $T_{c^f_{(j+1):d}}$ associated with the copula density $c^f_{(j+1):d}$. Note that by construction $|\mathcal{G}_j| = n^{d-j}$.

Proposition 3.6 shows that the approximate KL distance in fact approximates the true
