Group-Based Evolutionary Models - Algebraic Statistics

The Jukes-Cantor model for either binary or DNA sequences and the Kimura models with two or three parameters belong to the class of group-based models. These models have the property that a linear change of coordinates given by the discrete Fourier transform translates the ideal of phylogenetic invariants into a toric ideal. More specifically, the symbols of the alphabet in a group-based model can be labelled by the elements of a finite group in such a way that the probability of a transition between group elements (from $g$ to $h$) depends only on their difference ($g-h$). By replacing the original coordinates $p_{i_1,\ldots,i_n}$ by Fourier coordinates $q_{i_1,\ldots,i_n}$, the ideal of phylogenetic invariants becomes toric.

An evolutionary model on the state space $[n]$ is called group-based if there is an abelian group $G$ with elements $g_1, \ldots, g_n$ and a mapping $\psi : G \to \mathbb{R}$ such that the $n \times n$ instantaneous rate matrix $Q = (Q_{ij})$ satisfies the condition

\[ Q_{ij} = \psi(g_j - g_i), \quad 1 \le i, j \le n. \tag{7.37} \]

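Example 7.18. Consider the cyclic group $G = \mathbb{Z}_4$. The group table for the differences $g-h$ of group elements $g, h \in G$ (rows $g$, columns $h$) has the form

\[
\begin{array}{c|cccc}
- & 0 & 1 & 2 & 3 \\ \hline
0 & 0 & 3 & 2 & 1 \\
1 & 1 & 0 & 3 & 2 \\
2 & 2 & 1 & 0 & 3 \\
3 & 3 & 2 & 1 & 0
\end{array}
\tag{7.38}
\]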

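and can be mapped onto the entries of the instantaneous rate matrix for Kimura's two-parameter model K80,

\[
\begin{pmatrix}
\cdot & \alpha & \beta & \alpha \\
\alpha & \cdot & \alpha & \beta \\
\beta & \alpha & \cdot & \alpha \\
\alpha & \beta & \alpha & \cdot
\end{pmatrix}.
\tag{7.39}
\]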

♦
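The construction in Example 7.18 can be sketched in a few lines of Python. This is only an illustration: the numerical rates `alpha` and `beta` are made-up values, and the diagonal entry $\psi(0)$ is chosen so that every row sums to zero, which is the usual convention for the dots in a rate matrix.

```python
# Build a group-based rate matrix Q[i][j] = psi(g_j - g_i) over the cyclic group Z_4.
# The rates alpha and beta are illustrative values, not taken from the text.
n = 4
alpha, beta = 0.1, 0.3

# psi on Z_4: psi(1) = psi(3) = alpha and psi(2) = beta.
# psi(0) is set so that each row of Q sums to zero (assumed convention for the dots).
psi = {1: alpha, 2: beta, 3: alpha}
psi[0] = -(psi[1] + psi[2] + psi[3])

Q = [[psi[(j - i) % n] for j in range(n)] for i in range(n)]

for row in Q:
    print(["%.2f" % x for x in row])
```

The off-diagonal pattern of the printed matrix reproduces (7.39): the entry in row $i$ and column $j$ depends only on $j - i$ modulo 4.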

7.7 Group-Based Evolutionary Models 161

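Example 7.19. Take the Klein group $G = \mathbb{Z}_2 \times \mathbb{Z}_2$. The group table for the differences $g-h$ of group elements $g, h \in G$ has the form

\[
\begin{array}{c|cccc}
- & (0,0) & (0,1) & (1,0) & (1,1) \\ \hline
(0,0) & (0,0) & (0,1) & (1,0) & (1,1) \\
(0,1) & (0,1) & (0,0) & (1,1) & (1,0) \\
(1,0) & (1,0) & (1,1) & (0,0) & (0,1) \\
(1,1) & (1,1) & (1,0) & (0,1) & (0,0)
\end{array}
\tag{7.40}
\]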

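and can be mapped onto the entries of the instantaneous rate matrix for Kimura's three-parameter model K81,

\[
\begin{pmatrix}
\cdot & \alpha & \beta & \gamma \\
\alpha & \cdot & \gamma & \beta \\
\beta & \gamma & \cdot & \alpha \\
\gamma & \beta & \alpha & \cdot
\end{pmatrix}.
\tag{7.41}
\]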

♦

We show that the substitution matrices share the same dependencies. To this end, we need to summarize some basic facts about characters and discrete Fourier transforms. For this, let $G$ be a finite abelian group. We denote the group operation by addition, the neutral element by $0$, and the $l$-fold multiple of a group element $g \in G$ as $l \cdot g = g + \cdots + g$ ($l$ times). Let $\mathbb{C}^*$ denote the set of non-zero complex numbers; we can regard $\mathbb{C}^*$ as an abelian group with ordinary multiplication.

A character of $G$ is a group homomorphism mapping $G$ into $\mathbb{C}^*$; that is, $\chi : G \to \mathbb{C}^*$ is a character if $\chi(g_1 + g_2) = \chi(g_1)\chi(g_2)$ for all elements $g_1, g_2 \in G$. The set of characters of $G$ is denoted by $\hat{G}$. The set $\hat{G}$ is non-empty, since it contains the trivial character $\varepsilon$ defined by $\varepsilon(g) = 1$ for all $g \in G$.

Lemma 7.20. The set of characters of $G$ forms an abelian group under pointwise multiplication; that is,

\[ (\chi\chi')(g) = \chi(g)\chi'(g), \quad \chi, \chi' \in \hat{G},\ g \in G. \]

The groups $G$ and $\hat{G}$ are isomorphic.

Proof. First, the operation defined on $\hat{G}$ is associative and commutative, which follows directly from the associativity and commutativity of complex multiplication. Thus $\hat{G}$ forms a commutative semigroup.

The trivial character $\varepsilon$ is the neutral element of $\hat{G}$ and so $\hat{G}$ forms a commutative monoid.

Suppose the group $G$ has order $n$. Then for each character $\chi \in \hat{G}$ and each $g \in G$, we have

\[ \chi(g)^n = \chi(n \cdot g) = \chi(0) = 1. \]

Thus the image of $G$ under $\chi$ lies in the group of $n$-th roots of unity. The norm of an $n$-th root of unity $\zeta$ is $|\zeta| = \sqrt{\zeta\bar{\zeta}} = 1$, where $\bar{z}$ denotes the conjugate of a complex number $z$. Thus $\zeta\bar{\zeta} = 1$ and hence $\bar{\zeta} = \zeta^{-1}$. Therefore, the inverse of a character $\chi$ is given by $\chi^{-1}(g) = \overline{\chi(g)}$ for all $g \in G$. It follows that $\hat{G}$ forms an abelian group.

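Second, the fundamental theorem of finite abelian groups says that each finite abelian group $G$ is a direct sum of cyclic groups $Z_1, \ldots, Z_k$. Let $z_i$ be a generating element of $Z_i$ with order $n_i$; that is, $Z_i = \{l \cdot z_i \mid 0 \le l \le n_i - 1\}$, $1 \le i \le k$. Let $\zeta_i$ denote a primitive $n_i$-th root of unity; that is, $\zeta_i^{n_i} = 1$ and $\zeta_i^{j} \ne 1$ for $1 \le j \le n_i - 1$.

Define $\chi_i \in \hat{G}$ such that $\chi_i(l_i \cdot z_i) = \zeta_i^{l_i}$, where $0 \le l_i \le n_i - 1$, $1 \le i \le k$, and extend such that

\[ \chi_i(l_1 \cdot z_1 + \cdots + l_k \cdot z_k) = \zeta_i^{l_i}, \quad 0 \le l_i \le n_i - 1,\ 1 \le i \le k. \]

Let $g \in G$. Write $g = l_1 \cdot z_1 + \cdots + l_k \cdot z_k$, where $0 \le l_j \le n_j - 1$, $1 \le j \le k$, and put

\[ \varphi(g) = \chi_1^{l_1} \cdots \chi_k^{l_k}. \tag{7.42} \]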

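For each $g, h \in G$, we have by definition $\varphi(g + h) = \varphi(g)\varphi(h)$. Hence, $\varphi$ is a group homomorphism.

Let $g \in G$ such that $\varphi(g) = \varepsilon$. Writing $g = l_1 \cdot z_1 + \cdots + l_k \cdot z_k$ gives

\[ 1 = \varepsilon(z_i) = (\chi_1^{l_1} \cdots \chi_k^{l_k})(z_i) = \chi_1(z_i)^{l_1} \cdots \chi_k(z_i)^{l_k} = \chi_i(z_i)^{l_i} = \zeta_i^{l_i}, \quad 1 \le i \le k. \]

It follows that $n_i$ is a divisor of $l_i$ for each $1 \le i \le k$. Since $0 \le l_i \le n_i - 1$, this forces $l_i = 0$ and thus $g = 0$. Hence, the mapping is one-to-one.

Let $\chi \in \hat{G}$. Since $\chi(z_i)$ is an $n_i$-th root of unity, we have $\chi(z_i) = \zeta_i^{e_i} = \chi_i(z_i)^{e_i}$ for some $0 \le e_i \le n_i - 1$, $1 \le i \le k$. Thus $\chi = \chi_1^{e_1} \cdots \chi_k^{e_k}$ and hence the mapping is onto. □

The group $\hat{G}$ is called the dual group or character group of $G$.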

Lemma 7.21. Let $G$ and $H$ be finite abelian groups. The dual group of the direct product $G \times H = \{(g, h) \mid g \in G, h \in H\}$ is isomorphic to $\hat{G} \times \hat{H}$.

Proof. Let $\chi$ be a character of $G \times H$. The restriction of $\chi$ to $G$ is a character of $G$ and the restriction to $H$ is a character of $H$; we denote the restricted characters by $\chi_G$ and $\chi_H$, respectively. This gives $\chi(g, h) = \chi_G(g) \cdot \chi_H(h)$ for all $(g, h) \in G \times H$. The mapping $\chi \mapsto (\chi_G, \chi_H)$ provides the required isomorphism. □

Example 7.22. The dual group of the cyclic group $G = \mathbb{Z}_n = \{0, 1, \ldots, n-1\}$ is the group $\hat{G} = \{\chi^b \mid 0 \le b \le n-1\}$ with

\[ \chi^b(a) = \zeta^{ab}, \quad 0 \le a, b \le n-1, \tag{7.43} \]

where $\zeta$ is a primitive $n$-th root of unity. ♦
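As a quick numerical illustration of Example 7.22, the following Python sketch builds the characters of $\mathbb{Z}_n$ and checks the homomorphism property together with the rule $\chi^b \chi^c = \chi^{b+c}$. The choice $n = 6$, the root $\zeta = \exp(2\pi i/n)$ and the helper name `character` are assumptions made for this example.

```python
import cmath

def character(n, b):
    """Return chi^b on Z_n as a function a -> zeta^(a*b), with zeta = exp(2*pi*i/n)."""
    zeta = cmath.exp(2j * cmath.pi / n)
    return lambda a: zeta ** (a * b)

n = 6
chars = [character(n, b) for b in range(n)]

# Homomorphism property: chi(a1 + a2) = chi(a1) * chi(a2) for all a1, a2 in Z_n.
is_hom = all(
    abs(chi((a1 + a2) % n) - chi(a1) * chi(a2)) < 1e-9
    for chi in chars for a1 in range(n) for a2 in range(n)
)

# Pointwise product of characters: chi^b * chi^c = chi^(b + c mod n).
closes = all(
    abs(chars[b](a) * chars[c](a) - chars[(b + c) % n](a)) < 1e-9
    for b in range(n) for c in range(n) for a in range(n)
)
print(is_hom, closes)
```

The second check mirrors the isomorphism $G \cong \hat{G}$ of Lemma 7.20 in the cyclic case: multiplying characters corresponds to adding their indices modulo $n$.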

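Example 7.23. The dual group of the additive group $G = \mathbb{Z}_2^k$ of order $2^k$ has the characters

\[ (\chi_1^{a_1} \cdots \chi_k^{a_k})(b_1 \cdot z_1 + \cdots + b_k \cdot z_k) = \prod_{i=1}^{k} \chi_i^{a_i}(b_i \cdot z_i) = \prod_{i=1}^{k} (-1)^{a_i b_i} = (-1)^{\langle a, b \rangle}, \tag{7.44} \]

where $0 \le a_i, b_i \le 1$, $1 \le i \le k$. ♦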

Let $G$ be a finite abelian group and let $L^2(G) = \{f \mid f : G \to \mathbb{C}\}$ be the set of all complex-valued functions on $G$. This set becomes a complex vector space by defining addition

\[ (f_1 + f_2)(g) = f_1(g) + f_2(g), \quad f_1, f_2 \in L^2(G),\ g \in G, \tag{7.45} \]

and scalar multiplication

\[ (af)(g) = a \cdot f(g), \quad f \in L^2(G),\ g \in G,\ a \in \mathbb{C}. \tag{7.46} \]

Define the delta functions $\delta_g$, $g \in G$, on $G$ by

\[ \delta_g(h) = \begin{cases} 1, & \text{if } g = h, \\ 0, & \text{otherwise.} \end{cases} \]

The vector space $L^2(G)$ has the delta functions as a $\mathbb{C}$-basis, since each function $f \in L^2(G)$ has the Fourier expansion

\[ f(g) = \sum_{h \in G} f(h)\,\delta_h(g), \quad g \in G. \]

A multiplication on the $\mathbb{C}$-space $L^2(G)$ is given as

\[ (f_1 * f_2)(g) = \sum_{h \in G} f_1(h)\, f_2(g - h), \quad g \in G,\ f_1, f_2 \in L^2(G). \]

This operation is associative and is called convolution or Hadamard product.
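The convolution can be tried out on $G = \mathbb{Z}_4$ with a short Python sketch. Functions are stored as lists indexed by the group elements; the three test functions are arbitrary illustrative values, and exact rational arithmetic is used so that associativity can be checked with `==`.

```python
from fractions import Fraction as Fr

def convolve(f1, f2):
    """(f1 * f2)(g) = sum_h f1(h) f2(g - h) on Z_n, with functions stored as lists."""
    n = len(f1)
    return [sum(f1[h] * f2[(g - h) % n] for h in range(n)) for g in range(n)]

f1 = [Fr(1, 2), Fr(1, 2), Fr(0), Fr(0)]
f2 = [Fr(1, 4), Fr(0), Fr(3, 4), Fr(0)]
f3 = [Fr(0), Fr(1, 3), Fr(1, 3), Fr(1, 3)]
delta0 = [Fr(1), Fr(0), Fr(0), Fr(0)]   # delta function at the neutral element

print(convolve(f1, f2))

# Associativity, commutativity, and delta_0 acting as the unit of the convolution.
associative = convolve(convolve(f1, f2), f3) == convolve(f1, convolve(f2, f3))
commutative = convolve(f1, f2) == convolve(f2, f1)
unit = convolve(f1, delta0) == f1
print(associative, commutative, unit)
```

If `f1` and `f2` are read as probability distributions of two independent random elements of $\mathbb{Z}_4$, then `convolve(f1, f2)` is the distribution of their sum.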

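An inner product on the vector space $L^2(G)$ is defined by

\[ \langle f_1, f_2 \rangle = \sum_{g \in G} f_1(g)\,\overline{f_2(g)}, \quad f_1, f_2 \in L^2(G). \tag{7.47} \]

The delta functions on $G$ define an orthonormal basis of $L^2(G)$ with respect to this inner product, since we have

\[ \langle \delta_g, \delta_h \rangle = \sum_{l \in G} \delta_g(l)\,\overline{\delta_h(l)} = \begin{cases} 1, & \text{if } g = h, \\ 0, & \text{otherwise,} \end{cases} \quad g, h \in G. \]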

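Theorem 7.24 (Orthogonality Relations). Let $G$ be a finite abelian group.

• For all characters $\chi$ and $\psi$ of $G$, we have

\[ \langle \chi, \psi \rangle = \begin{cases} |G|, & \text{if } \chi = \psi, \\ 0, & \text{otherwise.} \end{cases} \tag{7.48} \]

• For all elements $g$ and $h$ in $G$, we have

\[ \sum_{\chi \in \hat{G}} \chi(g)\,\overline{\chi(h)} = \begin{cases} |G|, & \text{if } g = h, \\ 0, & \text{otherwise.} \end{cases} \tag{7.49} \]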

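Proof. First, we have

\[ \langle \chi, \psi \rangle = \sum_{g \in G} \chi(g)\,\overline{\psi(g)} = \sum_{g \in G} (\chi\psi^{-1})(g)\,\overline{\varepsilon(g)} = \langle \chi\psi^{-1}, \varepsilon \rangle, \]

since $\overline{\psi} = \psi^{-1}$. Thus we can reduce to the case $\psi = \varepsilon$. Put

\[ S = \langle \chi, \varepsilon \rangle = \sum_{g \in G} \chi(g). \]

If $\chi$ is the trivial character, then $S = |G|$ and the result follows. Otherwise, there is a group element $h \in G$ with $\chi(h) \ne 1$. Multiplying the above equation by $\chi(h)$, we obtain

\[ \chi(h) S = \chi(h) \sum_{g \in G} \chi(g) = \sum_{g \in G} \chi(h + g) = \sum_{g \in G} \chi(g) = S. \]

Thus we have $\chi(h) S = S$ with $\chi(h) \ne 1$ and hence $S = 0$. This proves the first assertion.

Second, we have

\[ \sum_{\chi \in \hat{G}} \chi(g)\,\overline{\chi(h)} = \sum_{\chi \in \hat{G}} \chi(g - h), \quad g, h \in G. \]

If $g = h$, each summand equals 1 and the result follows. Otherwise, put $l = g - h \ne 0$; there is a character $\psi$ with $\psi(l) \ne 1$. Define $S = \sum_{\chi} \chi(l)$. Since $\chi \mapsto \psi\chi$ permutes $\hat{G}$, we have $\psi(l) S = \sum_{\chi} (\psi\chi)(l) = S$ and hence $S = 0$. □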
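Both orthogonality relations can be checked numerically on a small example. The sketch below uses $G = \mathbb{Z}_2 \times \mathbb{Z}_3$, whose characters are products of characters of the factors by Lemma 7.21; the choice of group and the function names `chi` and `inner` are assumptions made for illustration.

```python
import cmath
from itertools import product

# G = Z_2 x Z_3; its characters are indexed by the elements b of G itself.
orders = (2, 3)
G = list(product(*(range(m) for m in orders)))

def chi(b, a):
    """Character indexed by b in G, evaluated at a in G."""
    value = 1
    for m, bi, ai in zip(orders, b, a):
        value *= cmath.exp(2j * cmath.pi * ai * bi / m)
    return value

def inner(f1, f2):
    """<f1, f2> = sum over g of f1(g) * conjugate(f2(g))."""
    return sum(f1(g) * f2(g).conjugate() for g in G)

# First relation (7.48): <chi, psi> = |G| if chi = psi, and 0 otherwise.
first_ok = all(
    abs(inner(lambda g: chi(b, g), lambda g: chi(c, g)) - (len(G) if b == c else 0)) < 1e-9
    for b in G for c in G
)

# Second relation (7.49): sum over characters of chi(g) * conj(chi(h)).
second_ok = all(
    abs(sum(chi(b, g) * chi(b, h).conjugate() for b in G) - (len(G) if g == h else 0)) < 1e-9
    for g in G for h in G
)
print(first_ok, second_ok)
```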

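The discrete Fourier transform (DFT) $F : L^2(G) \to L^2(\hat{G})$ assigns to each function $f \in L^2(G)$ a function $Ff = \hat{f}$ defined by

\[ \hat{f}(\chi) = \sum_{g \in G} f(g)\,\overline{\chi(g)}, \quad \chi \in \hat{G}. \tag{7.50} \]

In particular, the DFT of the delta function $\delta_g$, $g \in G$, is given by

\[ \hat{\delta}_g(\chi) = \sum_{h \in G} \overline{\chi(h)}\,\delta_g(h) = \overline{\chi(g)}, \quad \chi \in \hat{G}. \tag{7.51} \]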

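Theorem 7.25. Let $G$ be a finite abelian group.

• Linearity: The DFT $F : L^2(G) \to L^2(\hat{G})$ is a $\mathbb{C}$-space isomorphism.

• Convolution: The DFT turns convolution into multiplication,

\[ \widehat{f_1 * f_2}(\chi) = \hat{f}_1(\chi) \cdot \hat{f}_2(\chi), \quad \chi \in \hat{G},\ f_1, f_2 \in L^2(G). \]

• Inversion: For each function $f \in L^2(G)$,

\[ f(g) = \frac{1}{|G|} \sum_{\chi \in \hat{G}} \chi(g)\,\hat{f}(\chi), \quad g \in G. \]

• Parseval identity: For all functions $f_1, f_2 \in L^2(G)$,

\[ \langle f_1, f_2 \rangle = \frac{1}{|G|} \langle \hat{f}_1, \hat{f}_2 \rangle. \]

• Translation: For each $h \in G$ and $f \in L^2(G)$, define $f^h(g) = f(h + g)$. Then we have

\[ \widehat{f^h}(\chi) = \chi(h)\,\hat{f}(\chi), \quad f \in L^2(G),\ \chi \in \hat{G},\ h \in G. \]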
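The four computational statements of the theorem can be checked numerically for a cyclic group. The following Python sketch does this for $G = \mathbb{Z}_5$ with two arbitrary test functions; it assumes the convention $\hat{f}(\chi) = \sum_g f(g)\overline{\chi(g)}$ with $\chi^b(a) = \zeta^{ab}$ and $\zeta = \exp(2\pi i/n)$, and all helper names are illustrative.

```python
import cmath

n = 5
zeta = cmath.exp(2j * cmath.pi / n)

def dft(f):
    """f_hat(chi^b) = sum_a f(a) * conj(chi^b(a)), where chi^b(a) = zeta^(a*b)."""
    return [sum(f[a] * zeta ** (-a * b) for a in range(n)) for b in range(n)]

def inverse_dft(f_hat):
    """f(a) = (1/|G|) * sum_b chi^b(a) * f_hat(chi^b)."""
    return [sum(zeta ** (a * b) * f_hat[b] for b in range(n)) / n for a in range(n)]

def convolve(f1, f2):
    return [sum(f1[h] * f2[(g - h) % n] for h in range(n)) for g in range(n)]

def inner(f1, f2):
    return sum(x * complex(y).conjugate() for x, y in zip(f1, f2))

def close(u, v):
    return all(abs(x - y) < 1e-9 for x, y in zip(u, v))

f1 = [1.0, 2.0, 0.5, -1.0, 3.0]
f2 = [0.0, 1.0, 4.0, 2.0, -2.0]
F1, F2 = dft(f1), dft(f2)

convolution_ok = close(dft(convolve(f1, f2)), [x * y for x, y in zip(F1, F2)])
inversion_ok = close(inverse_dft(F1), f1)
parseval_ok = abs(inner(f1, f2) - inner(F1, F2) / n) < 1e-9

h = 2
shifted = [f1[(h + g) % n] for g in range(n)]          # f^h(g) = f(h + g)
translation_ok = close(dft(shifted), [zeta ** (h * b) * F1[b] for b in range(n)])
print(convolution_ok, inversion_ok, parseval_ok, translation_ok)
```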

Proof. First, the DFT provides a linear map, since

\[ (\hat{f}_1 + \hat{f}_2)(\chi) = \hat{f}_1(\chi) + \hat{f}_2(\chi) = \sum_{g \in G} \overline{\chi(g)}\,[f_1(g) + f_2(g)] = \sum_{g \in G} \overline{\chi(g)}\,[f_1 + f_2](g) = \widehat{f_1 + f_2}(\chi). \]

The inversion formula implies that this mapping is one-to-one. Since both spaces have the same dimension, it follows by linear algebra that the mapping is also onto. Hence, the mapping is a vector space isomorphism.

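Third, due to linearity, we may consider only the basis elements $\delta_h$, $h \in G$. By the orthogonality relations (7.49) and (7.51), the right-hand side gives

\[ \frac{1}{|G|} \sum_{\chi \in \hat{G}} \chi(g)\,\hat{\delta}_h(\chi) = \frac{1}{|G|} \sum_{\chi \in \hat{G}} \chi(g)\,\overline{\chi(h)} = \delta_h(g). \]

Fourth, the orthogonality relations (7.49) give

\[ \langle \hat{f}_1, \hat{f}_2 \rangle = \sum_{\chi \in \hat{G}} \hat{f}_1(\chi)\,\overline{\hat{f}_2(\chi)} = \sum_{g, h \in G} f_1(g)\,\overline{f_2(h)} \sum_{\chi \in \hat{G}} \overline{\chi(g)}\,\chi(h) = |G| \sum_{g \in G} f_1(g)\,\overline{f_2(g)} = |G|\,\langle f_1, f_2 \rangle. \]

Finally, substituting $g \mapsto g - h$ in the definition of the DFT gives

\[ \widehat{f^h}(\chi) = \sum_{g \in G} f(h + g)\,\overline{\chi(g)} = \sum_{g \in G} f(g)\,\overline{\chi(g - h)} = \chi(h)\,\hat{f}(\chi). \]

□

Example 7.26. Consider the cyclic group $G = \mathbb{Z}_n$. By taking the basis of $L^2(G)$ given by the delta functions, the proof of Lemma 7.20 and (7.51) show that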

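\[ \hat{\delta}_a(\chi^b) = \overline{\chi^b(a)} = 1/\chi^b(a) = \zeta^{-ab}, \quad 0 \le a, b \le n-1, \tag{7.52} \]

where $\zeta = \exp(2\pi i/n)$ is a primitive $n$-th root of unity. Thus the matrix of the DFT is given by

\[ A_n = \left( \zeta^{-ab} \right)_{0 \le a, b \le n-1}. \]

In the quaternary case $n = 4$, where $\zeta = i$, we obtain

\[ A_4 = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & -i & -1 & i \\ 1 & -1 & 1 & -1 \\ 1 & i & -1 & -i \end{pmatrix}. \]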


> F := FourierTransform ( Z, normalization = full):
> ptlist := convert ( F, 'list'):
> complexplot ( ptlist, x = -50..225, style = point);

♦

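Fig. 7.14. DFT of cubic function ($n = 9$).

This mapping is provided by the $2^k \times 2^k$ Hadamard matrix $H_{2^k} = \left( (-1)^{\langle a, b \rangle} \right)$. These matrices can be recursively defined as follows,

\[ H_2 = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \quad H_{2^{k+1}} = \begin{pmatrix} H_{2^k} & H_{2^k} \\ H_{2^k} & -H_{2^k} \end{pmatrix} = H_{2^k} \otimes H_2, \quad k \ge 1. \tag{7.59} \]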
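The agreement between the recursive definition and the entrywise description $(-1)^{\langle a, b \rangle}$ can be confirmed with a small Python sketch. The helper names `kron`, `hadamard` and `sign_matrix` are made up for this illustration, and the elements of $\mathbb{Z}_2^k$ are assumed to be listed in lexicographic order.

```python
from itertools import product

def kron(A, B):
    """Kronecker product of two matrices given as lists of lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def hadamard(k):
    """H_{2^k} for k >= 1, built by repeated Kronecker products with H_2."""
    H2 = [[1, 1], [1, -1]]
    H = H2
    for _ in range(k - 1):
        H = kron(H, H2)
    return H

def sign_matrix(k):
    """Matrix with entries (-1)^<a,b>, a and b running over Z_2^k in lexicographic order."""
    elems = list(product((0, 1), repeat=k))
    return [[(-1) ** sum(x * y for x, y in zip(a, b)) for b in elems] for a in elems]

k = 3
H = hadamard(k)
same = H == sign_matrix(k)

# Rows of a Hadamard matrix are pairwise orthogonal: H * H^T = 2^k * I.
size = 2 ** k
orthogonal = all(
    sum(H[i][m] * H[j][m] for m in range(size)) == (size if i == j else 0)
    for i in range(size) for j in range(size)
)
print(same, orthogonal)
```

The orthogonality check is the second relation of Theorem 7.24 specialized to $\mathbb{Z}_2^k$, where all characters are real.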

♦

We will see that if the instantaneous rate matrix $Q$ is group-based, the corresponding substitution matrices $\exp(Qt)$ will also be group-based.

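Lemma 7.29. Let $G$ be an abelian group of order $n$. The eigenvalues of an $n \times n$ group-based instantaneous rate matrix $Q$ satisfying (7.37) are

\[ \lambda_\chi = \sum_{h \in G} \overline{\chi(h)}\,\psi(h), \quad \chi \in \hat{G}. \tag{7.60} \]

The transition probabilities of the corresponding time-continuous Markov model are

\[ P_{gh}(t) = \frac{1}{|G|} \sum_{\chi \in \hat{G}} \chi(h - g)\, e^{\lambda_\chi t}, \quad t \ge 0. \tag{7.61} \]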

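Proof. First, define the $n \times n$ matrix $B = (\chi(g))_{\chi, g}$. We have

\[ (BQ)_{\chi, h} = \sum_{g \in G} \chi(g)\,\psi(h - g) = \sum_{k \in G} \chi(h - k)\,\psi(k) = \chi(h) \sum_{k \in G} \overline{\chi(k)}\,\psi(k) = \lambda_\chi\, \chi(h). \tag{7.62} \]

Hence, the rows of $B$ are the left eigenvectors of $Q$. This shows the first assertion.

Second, let $D$ be the $n \times n$ diagonal matrix with diagonal entries $\lambda_\chi$. By (7.62), we have $Q = B^{-1} D B$, where $B^{-1} = \frac{1}{|G|} \overline{B}^{\,T}$ by the orthogonality relations. Hence $\exp(Qt) = B^{-1} \exp(Dt)\, B$ and

\[ P_{gh}(t) = \frac{1}{|G|} \sum_{\chi \in \hat{G}} \overline{\chi(g)}\, e^{\lambda_\chi t}\, \chi(h) = \frac{1}{|G|} \sum_{\chi \in \hat{G}} \chi(h - g)\, e^{\lambda_\chi t}. \]

This proves the second assertion. □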
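The lemma can be illustrated for Kimura's three-parameter model on the Klein group. The Python sketch below computes the eigenvalues from the characters, evaluates the closed formula for $P(t)$, and compares it with a truncated Taylor series of $\exp(Qt)$. The rate values, the time $t$ and the number of series terms are assumptions chosen for the example; since the characters of $\mathbb{Z}_2 \times \mathbb{Z}_2$ are real, complex conjugation plays no role here.

```python
import math
from itertools import product

# Klein group Z_2 x Z_2 with illustrative K81 rates (assumed values).
G = list(product((0, 1), repeat=2))
alpha, beta, gamma = 0.2, 0.1, 0.05
psi = {(0, 1): alpha, (1, 0): beta, (1, 1): gamma}
psi[(0, 0)] = -(alpha + beta + gamma)

def sub(g, h):
    """Difference g - h in Z_2 x Z_2."""
    return tuple((x - y) % 2 for x, y in zip(g, h))

def chi(a, g):
    """Character of Z_2^2 indexed by a: chi_a(g) = (-1)^<a,g>."""
    return (-1) ** sum(x * y for x, y in zip(a, g))

# Eigenvalues of Q, one per character.
lam = {a: sum(chi(a, h) * psi[h] for h in G) for a in G}

def P_formula(t):
    """Transition probabilities from the character formula."""
    return [[sum(chi(a, sub(h, g)) * math.exp(lam[a] * t) for a in G) / len(G)
             for h in G] for g in G]

def P_series(t, terms=40):
    """exp(Qt) by a truncated Taylor series, used only as an independent check."""
    n = len(G)
    Q = [[psi[sub(h, g)] for h in G] for g in G]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        term = [[sum(term[i][k] * Q[k][j] for k in range(n)) * t / m for j in range(n)]
                for i in range(n)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

t = 1.5
A, B = P_formula(t), P_series(t)
agree = all(abs(A[i][j] - B[i][j]) < 1e-9 for i in range(4) for j in range(4))
print(agree)
```

Because `P_formula` depends on $g$ and $h$ only through `sub(h, g)`, the resulting substitution matrix is again group-based.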

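Example 7.30 (Singular). Consider the binary JC model for the 1,3 claw tree (Fig. 7.15).

Fig. 7.15. The 1,3 claw tree.

Let $\pi = (\pi_0, \pi_1)$ denote the probability distribution of the root and let the transition probability matrices along the branches be given as

\[ \begin{pmatrix} \alpha_0 & \alpha_1 \\ \alpha_1 & \alpha_0 \end{pmatrix}, \quad \begin{pmatrix} \beta_0 & \beta_1 \\ \beta_1 & \beta_0 \end{pmatrix}, \quad \begin{pmatrix} \gamma_0 & \gamma_1 \\ \gamma_1 & \gamma_0 \end{pmatrix}. \]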

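Then the algebraic statistical model is defined by the mapping $f : \mathbb{R}^4 \to \mathbb{R}^8$ with marginal probabilities

\[
\begin{aligned}
p_{000} &= \pi_0\alpha_0\beta_0\gamma_0 + \pi_1\alpha_1\beta_1\gamma_1, & p_{001} &= \pi_0\alpha_0\beta_0\gamma_1 + \pi_1\alpha_1\beta_1\gamma_0, \\
p_{010} &= \pi_0\alpha_0\beta_1\gamma_0 + \pi_1\alpha_1\beta_0\gamma_1, & p_{011} &= \pi_0\alpha_0\beta_1\gamma_1 + \pi_1\alpha_1\beta_0\gamma_0, \\
p_{100} &= \pi_0\alpha_1\beta_0\gamma_0 + \pi_1\alpha_0\beta_1\gamma_1, & p_{101} &= \pi_0\alpha_1\beta_0\gamma_1 + \pi_1\alpha_0\beta_1\gamma_0, \\
p_{110} &= \pi_0\alpha_1\beta_1\gamma_0 + \pi_1\alpha_0\beta_0\gamma_1, & p_{111} &= \pi_0\alpha_1\beta_1\gamma_1 + \pi_1\alpha_0\beta_0\gamma_0.
\end{aligned}
\]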

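The discrete Fourier transform gives a linear change of coordinates in the parameter space by using (7.56),

\[
\begin{aligned}
\pi_0 &= \tfrac{1}{2}(r_0 + r_1), & \alpha_0 &= \tfrac{1}{2}(a_0 + a_1), & \beta_0 &= \tfrac{1}{2}(b_0 + b_1), & \gamma_0 &= \tfrac{1}{2}(c_0 + c_1), \\
\pi_1 &= \tfrac{1}{2}(r_0 - r_1), & \alpha_1 &= \tfrac{1}{2}(a_0 - a_1), & \beta_1 &= \tfrac{1}{2}(b_0 - b_1), & \gamma_1 &= \tfrac{1}{2}(c_0 - c_1).
\end{aligned}
\]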

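Simultaneously, it provides a linear change of coordinates in the probability space by making use of (7.58),

\[ q_{ijk} = \sum_{r=0}^{1} \sum_{s=0}^{1} \sum_{t=0}^{1} (-1)^{ir + js + kt}\, p_{rst}. \]

More specifically, we obtain

\[
\begin{aligned}
q_{000} &= p_{000} + p_{001} + p_{010} + p_{011} + p_{100} + p_{101} + p_{110} + p_{111}, \\
q_{001} &= p_{000} - p_{001} + p_{010} - p_{011} + p_{100} - p_{101} + p_{110} - p_{111}, \\
q_{010} &= p_{000} + p_{001} - p_{010} - p_{011} + p_{100} + p_{101} - p_{110} - p_{111}, \\
q_{011} &= p_{000} - p_{001} - p_{010} + p_{011} + p_{100} - p_{101} - p_{110} + p_{111}, \\
q_{100} &= p_{000} + p_{001} + p_{010} + p_{011} - p_{100} - p_{101} - p_{110} - p_{111}, \\
q_{101} &= p_{000} - p_{001} + p_{010} - p_{011} - p_{100} + p_{101} - p_{110} + p_{111}, \\
q_{110} &= p_{000} + p_{001} - p_{010} - p_{011} - p_{100} - p_{101} + p_{110} + p_{111}, \\
q_{111} &= p_{000} - p_{001} - p_{010} + p_{011} - p_{100} + p_{101} + p_{110} - p_{111}.
\end{aligned}
\]

After these coordinate changes, the model has the monomial representation

\[
\begin{aligned}
q_{000} &= r_0 a_0 b_0 c_0, & q_{001} &= r_1 a_0 b_0 c_1, & q_{010} &= r_1 a_0 b_1 c_0, & q_{011} &= r_0 a_0 b_1 c_1, \\
q_{100} &= r_1 a_1 b_0 c_0, & q_{101} &= r_0 a_1 b_0 c_1, & q_{110} &= r_0 a_1 b_1 c_0, & q_{111} &= r_1 a_1 b_1 c_1.
\end{aligned}
\]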

This model is toric and the phylogenetic invariants are given by binomials that can be established by the following program,


> ring r = 0, (r(0..1),a(0..1),b(0..1),c(0..1),q(0..7)), dp;
> ideal i0 = q(0)-r(0)*a(0)*b(0)*c(0);
> ideal i1 = q(1)-r(1)*a(0)*b(0)*c(1);
> ideal i2 = q(2)-r(1)*a(0)*b(1)*c(0);
> ideal i3 = q(3)-r(0)*a(0)*b(1)*c(1);
> ideal i4 = q(4)-r(1)*a(1)*b(0)*c(0);
> ideal i5 = q(5)-r(0)*a(1)*b(0)*c(1);
> ideal i6 = q(6)-r(0)*a(1)*b(1)*c(0);
> ideal i7 = q(7)-r(1)*a(1)*b(1)*c(1);
> ideal i = i0+i1+i2+i3+i4+i5+i6+i7;
> ideal j = std(i);
> eliminate (j, r(0)*r(1)*a(0)*a(1)*b(0)*b(1)*c(0)*c(1));

The output provides the following invariants,

_[1]=q(1)*q(6)-q(0)*q(7)
_[2]=q(2)*q(5)-q(0)*q(7)
_[3]=q(0)*q(4)-q(0)*q(7)
_[4]=q(2)*q(4)*q(6)-q(2)*q(6)*q(7)
_[5]=q(1)*q(4)*q(6)-q(1)*q(6)*q(7)
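The coordinate changes of this example can also be checked with exact arithmetic in Python. The sketch below picks arbitrary rational parameter values (an assumption made only for illustration), computes the leaf probabilities and their Fourier coordinates, and confirms the monomial representation $q_{ijk} = r_{i+j+k}\, a_i b_j c_k$ (index of $r$ taken modulo 2). It then checks the three quadratic binomial relations that follow directly from that monomial form, namely that $q_{001}q_{110}$, $q_{010}q_{101}$ and $q_{011}q_{100}$ all equal $q_{000}q_{111}$.

```python
from fractions import Fraction as Fr
from itertools import product

# Illustrative parameter values (assumptions); each pair sums to 1.
pi = [Fr(2, 5), Fr(3, 5)]
al = [Fr(9, 10), Fr(1, 10)]
be = [Fr(4, 5), Fr(1, 5)]
ga = [Fr(7, 10), Fr(3, 10)]

bits = list(product((0, 1), repeat=3))

# Joint leaf probabilities: a leaf keeps the root state h with probability x[0]
# and flips it with probability x[1]; i ^ h is the difference in Z_2.
p = {
    (i, j, k): sum(pi[h] * al[i ^ h] * be[j ^ h] * ga[k ^ h] for h in (0, 1))
    for (i, j, k) in bits
}

# Fourier coordinates in probability space.
q = {
    (i, j, k): sum((-1) ** (i * u + j * v + k * w) * p[(u, v, w)] for (u, v, w) in bits)
    for (i, j, k) in bits
}

def fourier(x):
    """Fourier coordinates of a parameter pair: y0 = x0 + x1, y1 = x0 - x1."""
    return [x[0] + x[1], x[0] - x[1]]

r, a, b, c = fourier(pi), fourier(al), fourier(be), fourier(ga)

monomial_ok = all(
    q[(i, j, k)] == r[(i + j + k) % 2] * a[i] * b[j] * c[k] for (i, j, k) in bits
)

# Quadratic binomials implied by the monomial form.
q0q7 = q[(0, 0, 0)] * q[(1, 1, 1)]
quadrics_ok = (
    q[(0, 0, 1)] * q[(1, 1, 0)] == q0q7
    and q[(0, 1, 0)] * q[(1, 0, 1)] == q0q7
    and q[(0, 1, 1)] * q[(1, 0, 0)] == q0q7
)
print(monomial_ok, quadrics_ok)
```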

♦

Example 7.31. Consider the 1,$n$ claw tree with root $r$ and $n \ge 1$ leaves and take an abelian group $G$ of order $n$. Let $\pi$ denote the probability distribution of the root and let the transition probability matrices $P^{(ri)}$, $1 \le i \le n$, along the branches be given as

\[ P^{(ri)}(X_i = g \mid X_r = h) = f^{(ri)}(g - h), \quad g, h \in G,\ 1 \le i \le n. \tag{7.63} \]

The joint probability of the group-based model is then given by

\[ p(g_1, \ldots, g_n) = P(X_1 = g_1, \ldots, X_n = g_n) = \sum_{h \in G} \pi(h) \prod_{i=1}^{n} P^{(ri)}(X_i = g_i \mid X_r = h) = \sum_{h \in G} \pi(h) \prod_{i=1}^{n} f^{(ri)}(g_i - h). \]

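In order to find the discrete Fourier transform of this probability density with respect to the group $G^n$, the root distribution is replaced by the new function $\tilde{\pi} : G^n \to \mathbb{C}$ as follows,

\[ \tilde{\pi}(h_1, \ldots, h_n) = \begin{cases} \pi(h_1), & \text{if } h_1 = \ldots = h_n, \\ 0, & \text{otherwise.} \end{cases} \tag{7.64} \]

This definition gives

\[ p(g_1, \ldots, g_n) = \sum_{(h_1, \ldots, h_n) \in G^n} \tilde{\pi}(h_1, \ldots, h_n) \prod_{i=1}^{n} f^{(ri)}(g_i - h_i). \tag{7.65} \]

If we define

\[ f(g_1, \ldots, g_n) = \prod_{i=1}^{n} f^{(ri)}(g_i), \quad g_1, \ldots, g_n \in G, \tag{7.66} \]

the joint probability distribution $p$ can be written as the convolution of two functions on $G^n$,

\[ p(g_1, \ldots, g_n) = (\tilde{\pi} * f)(g_1, \ldots, g_n), \quad g_1, \ldots, g_n \in G. \tag{7.67} \]

Taking the discrete Fourier transform yields

\[ q(\chi_1, \ldots, \chi_n) = \hat{\tilde{\pi}}(\chi_1, \ldots, \chi_n) \cdot \hat{f}(\chi_1, \ldots, \chi_n). \tag{7.68} \]

In particular, the discrete Fourier transform of the function $f$ has the form

\[ \hat{f}(\chi_1, \ldots, \chi_n) = \sum_{(g_1, \ldots, g_n) \in G^n} \prod_{i=1}^{n} \overline{\chi_i(g_i)}\, f^{(ri)}(g_i) = \prod_{i=1}^{n} \widehat{f^{(ri)}}(\chi_i). \tag{7.69} \]

Moreover, the discrete Fourier transform of the root distribution is

\[ \hat{\tilde{\pi}}(\chi_1, \ldots, \chi_n) = \sum_{h \in G} \pi(h)\, \overline{\chi_1(h) \cdots \chi_n(h)} = \hat{\pi}(\chi_1 \cdots \chi_n). \tag{7.70} \]

It follows that the discrete Fourier transform of the joint probabilities has the monomial representation

\[ q(\chi_1, \ldots, \chi_n) = \hat{\pi}(\chi_1 \cdots \chi_n) \cdot \prod_{i=1}^{n} \widehat{f^{(ri)}}(\chi_i). \tag{7.71} \]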
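The monomial representation of the claw tree can be verified numerically for a small case. The Python sketch below takes $G = \mathbb{Z}_3$ and two leaves, with made-up values for the root distribution and the branch functions; it assumes the DFT convention $\hat{f}(\chi) = \sum_g f(g)\overline{\chi(g)}$ with $\chi^b(a) = \zeta^{ab}$, so that the product $\chi^{b_1} \cdots \chi^{b_n}$ is the character with index $b_1 + \cdots + b_n$ modulo 3.

```python
import cmath
from itertools import product
from math import prod

n = 3                                  # G = Z_3, written additively
zeta = cmath.exp(2j * cmath.pi / n)
pi = [0.5, 0.3, 0.2]                   # root distribution (illustrative values)
f = [[0.8, 0.1, 0.1],                  # f[i][d]: leaf i differs from the root by d
     [0.6, 0.3, 0.1]]
leaves = len(f)

def dft(func):
    """func_hat(chi^b) = sum_a func(a) * conj(chi^b(a)), with chi^b(a) = zeta^(a*b)."""
    return [sum(func[a] * zeta ** (-a * b) for a in range(n)) for b in range(n)]

index = list(product(range(n), repeat=leaves))

# Joint leaf distribution: sum over the root state h.
p = {g: sum(pi[h] * prod(f[i][(g[i] - h) % n] for i in range(leaves)) for h in range(n))
     for g in index}

# DFT of p over G^n; a character of G^n is a product of characters of the factors.
q = {b: sum(p[g] * zeta ** (-sum(x * y for x, y in zip(g, b))) for g in index)
     for b in index}

pi_hat = dft(pi)
f_hat = [dft(fi) for fi in f]

# Monomial form: pi_hat at the product character times one factor per branch.
monomial_ok = all(
    abs(q[b] - pi_hat[sum(b) % n] * prod(f_hat[i][b[i]] for i in range(leaves))) < 1e-9
    for b in index
)
print(monomial_ok)
```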

♦

This example is the base case for the induction in the general case. Let $T$ be a rooted binary tree with root $r$ and $n$ leaves. For each node $v$ in $T$ different from the root, write $a(v)$ for the unique parent of $v$ in $T$. The transition from $a(v)$ to $v$ is given by the substitution matrix $P^{(v)}$. Suppose the states of the random variables are the elements of a finite abelian group $G$. Then the joint probability distribution of the labelling of the leaves can be written as

\[ p(g_1, \ldots, g_n) = \sum \pi(g_r) \prod_{\substack{v \in V(T) \\ v \ne r}} P^{(v)}_{g_{a(v)},\, g_v}, \tag{7.72} \]

where the sum extends over all states of the interior nodes of the tree $T$. We assume that the transition matrix entries $P^{(v)}_{g_{a(v)},\, g_v}$ depend only on the difference of the group elements $g_{a(v)}$ and $g_v$. We denote this entry by $f^{(v)}(g_{a(v)} - g_v)$. Thus the group-based model has the joint probability distribution

\[ p(g_1, \ldots, g_n) = \sum \pi(g_r) \prod_{\substack{v \in V(T) \\ v \ne r}} f^{(v)}(g_{a(v)} - g_v). \tag{7.73} \]

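Theorem 7.32. Let $p(g_1, \ldots, g_n)$ be the joint probability distribution of a group-based model parametrized as in (7.73). The corresponding discrete Fourier transform has the form

\[ q(\chi_1, \ldots, \chi_n) = \hat{\pi}(\chi_1 \cdots \chi_n) \cdot \prod_{\substack{v \in V(T) \\ v \ne r}} \widehat{f^{(v)}}\Big( \prod_{l \in \Lambda(v)} \chi_l \Big), \tag{7.74} \]

where $\Lambda(v)$ is the set of leaves that have the node $v$ as an ancestor (a leaf being counted as its own ancestor).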

The formula (7.72) is a polynomial representation of the evolutionary model, while formula (7.74) provides a monomial representation of the same model. Since the groups $G$ and $\hat{G}$ are isomorphic, the monomial representation can be rewritten as follows,

\[ q_{g_1, \ldots, g_n} \mapsto \hat{\pi}(g_1 + \ldots + g_n) \cdot \prod_{\substack{v \in V(T) \\ v \ne r}} \widehat{f^{(v)}}\Big( \sum_{l \in \Lambda(v)} g_l \Big). \tag{7.75} \]

We can regard this formula as the monomial mapping from a polynomial ring in $|G|^n$ unknowns

\[ q_{g_1, \ldots, g_n} = q(g_1, \ldots, g_n) \tag{7.76} \]

to the polynomial ring in the unknowns $\hat{\pi}(g)$ and $\widehat{f^{(v)}}(g)$, which are indexed by the nodes of $T$ and the elements of $G$.
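The monomial mapping can be tested on the smallest tree that is not a claw tree. The Python sketch below uses $G = \mathbb{Z}_2$ and a three-leaf tree in which the root $r$ has the interior node $v$ and leaf 3 as children, and $v$ has leaves 1 and 2 as children; the tree shape, the node labels and all numerical values are assumptions made for illustration. Over $\mathbb{Z}_2$ the difference of two states is their XOR, so the orientation of the difference does not matter.

```python
from fractions import Fraction as Fr
from itertools import product

# Tree: root r has children v (interior) and leaf 3; v has children leaf 1 and leaf 2.
# All numbers below are illustrative assumptions.
pi = [Fr(1, 3), Fr(2, 3)]
f = {
    "v": [Fr(9, 10), Fr(1, 10)],
    1: [Fr(4, 5), Fr(1, 5)],
    2: [Fr(7, 10), Fr(3, 10)],
    3: [Fr(3, 5), Fr(2, 5)],
}
bits = list(product((0, 1), repeat=3))

# Joint leaf distribution: sum over the states of the interior nodes r and v.
p = {
    (g1, g2, g3): sum(
        pi[gr] * f["v"][gr ^ gv] * f[1][gv ^ g1] * f[2][gv ^ g2] * f[3][gr ^ g3]
        for gr in (0, 1) for gv in (0, 1)
    )
    for (g1, g2, g3) in bits
}

def hat(x):
    """DFT on Z_2: x_hat(0) = x0 + x1, x_hat(1) = x0 - x1."""
    return [x[0] + x[1], x[0] - x[1]]

# Fourier coordinates of the leaf distribution.
q = {
    (i, j, k): sum((-1) ** (i * g1 + j * g2 + k * g3) * p[(g1, g2, g3)]
                   for (g1, g2, g3) in bits)
    for (i, j, k) in bits
}

# Monomial form: each edge contributes f_hat evaluated at the sum of the leaf labels
# below it; here Lambda(v) = {1, 2} and Lambda(leaf) = {leaf}.
pi_hat = hat(pi)
f_hat = {name: hat(values) for name, values in f.items()}
monomial_ok = all(
    q[(i, j, k)]
    == pi_hat[(i + j + k) % 2] * f_hat["v"][(i + j) % 2]
    * f_hat[1][i] * f_hat[2][j] * f_hat[3][k]
    for (i, j, k) in bits
)
print(monomial_ok)
```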
