
Proof of Theorem 2.5

From the document Metric Learning for Structured Data (pages 184-195)

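an edit script over $\Delta_{\mathcal{A}}$ such that $\bar{\delta}''(\tilde{x}) = \tilde{z}$, and we obtain $d_c(\tilde{x}, \tilde{z}) \le c(\bar{\delta}'', \tilde{x}) = c(\bar{\delta}, \tilde{x}) + c(\bar{\delta}', \tilde{y}) = d_c(\tilde{x}, \tilde{y}) + d_c(\tilde{y}, \tilde{z})$.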

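Lemma A.2. Let $\mathcal{A}$ be an alphabet, let $X$ be a forest over $\mathcal{A}$, and let $k \in \{1, \ldots, |X|\}$. Then, it holds:

\[ \forall i \in \{1, \ldots, |X|\}: k \in \mathrm{anc}_X(i) \Rightarrow k < i \le \mathrm{rl}_X(i) \le \mathrm{rl}_X(k) \tag{A.6} \]

\[ \forall i \in \{1, \ldots, |X|\}: k < i \le \mathrm{rl}_X(k) \Rightarrow k \in \mathrm{anc}_X(i) \tag{A.7} \]

\[ \forall i, j \in \mathrm{anc}_X(k): i < j \Leftrightarrow i \in \mathrm{anc}_X(j) \tag{A.8} \]

Proof. We first provide a proof for Equations A.6 and A.7, and then go on to prove Equation A.8. Our first proof works via induction over the size of the subtree $\tilde{x}_k$.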

If $|\tilde{x}_k| = 1$, then $k$ cannot be an ancestor of any element and, likewise, there exists no $i$ such that $k < i \le \mathrm{rl}_X(k) = k$ or $k < \mathrm{rl}_X(i) \le \mathrm{rl}_X(k) = k$. Therefore, the base case holds for Equations A.6 and A.7.

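Now, assume that $|\tilde{x}_k| > 1$. Let $\tilde{x}_k = x_k(\tilde{x}^k_1, \ldots, \tilde{x}^k_{R_k})$. Further, let for all $r \in \{1, \ldots, R_k + 1\}$: $k_r := k + \sum_{l=1}^{r-1} |\tilde{x}^k_l| + 1$. Recall that, according to Lemma A.1, for all $r \in \{1, \ldots, R_k\}$ it holds: $\tilde{x}_{k_r} = \tilde{x}^k_r$. Further note that $\mathrm{rl}_X(k_r) = k_r + |\tilde{x}^k_r| - 1 = k + \sum_{l=1}^{r} |\tilde{x}^k_l| = k_{r+1} - 1$.

Finally, it holds: $k_{R_k+1} = k + \sum_{l=1}^{R_k} |\tilde{x}^k_l| + 1 = k + |\tilde{x}_k| = \mathrm{rl}_X(k) + 1$.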

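Regarding Equation A.6, we consider $k \in \mathrm{anc}_X(i)$. Then, per definition of ancestors, one of the following two cases applies.

$\mathrm{par}_X(i) = k$: In that case, let $r$ be the index such that $i = k_r$. Then, it holds:

\[ k < k + \sum_{l=1}^{r-1} |\tilde{x}^k_l| + 1 = k_r = i \le \mathrm{rl}_X(i) = \mathrm{rl}_X(k_r) = k_{r+1} - 1 \le k_{R_k+1} - 1 = \mathrm{rl}_X(k) \]

$\mathrm{par}_X(i) \ne k$: In that case, there is some $j \in \mathrm{anc}_X(i)$ such that $\mathrm{par}_X(j) = k$, otherwise $k$ would not be in $\mathrm{anc}_X(i)$. Let $r$ be the index such that $k_r = j$. Then, per induction hypothesis (I.H.), we know that

\[ k < k + \sum_{l=1}^{r-1} |\tilde{x}^k_l| + 1 = k_r = j \overset{\text{I.H.}}{<} i \le \mathrm{rl}_X(i) \overset{\text{I.H.}}{\le} \mathrm{rl}_X(j) = \mathrm{rl}_X(k_r) = k_{r+1} - 1 \le k_{R_k+1} - 1 = \mathrm{rl}_X(k) \]

which concludes the proof.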

Regarding Equation A.7, we consider $k < i \le \mathrm{rl}_X(k)$. Then, there exists exactly one $r$ such that $k_r \le i < k_{r+1}$. Now, if $k_r = i$, we obtain $\mathrm{par}_X(i) = k$, which in turn implies $k \in \mathrm{anc}_X(i)$. If $k_r < i$, then $k_r < i \le k_{r+1} - 1 = \mathrm{rl}_X(k_r)$. Therefore, per induction, it holds: $k_r \in \mathrm{anc}_X(i)$. Due to the definition of ancestors, we also know that $k \in \mathrm{anc}_X(k_r)$. Therefore, $k \in \mathrm{anc}_X(i)$, which concludes the proof.

Now, consider Equation A.8. We perform an inductive proof over the size of the ancestor set |ancX(k)|.

If $\mathrm{anc}_X(k)$ is empty or contains only a single element, then the claim holds trivially.

If $|\mathrm{anc}_X(k)| > 1$, consider $i := \mathrm{par}_X(k)$. Then, per definition of ancestors, we have $\mathrm{anc}_X(k) = \{i\} \cup \mathrm{anc}_X(i)$. Because $|\mathrm{anc}_X(i)| < |\mathrm{anc}_X(k)|$, our induction hypothesis applies and the claim holds for all pairwise comparisons within $\mathrm{anc}_X(i)$. It remains to show that the claim holds for all pairwise comparisons $(i, j)$ with $j \in \mathrm{anc}_X(k)$. There are only two possible cases for $j$. Either $i = j$, in which case the claim holds trivially, or $j \in \mathrm{anc}_X(i)$. In that case, Equation A.6 implies $j < i$, which means that the claim holds as well. This concludes the proof.
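The three statements of Lemma A.2 can also be checked numerically on small forests. The following is a minimal sketch, not part of the original proof; the nested `(label, children)` representation and the names `preorder`, `ancestors`, and `check_lemma_a2` are assumptions made for illustration only.

```python
from itertools import product


def preorder(forest):
    """Return 1-indexed lists (labels, par, rl) for a forest.

    The forest is a list of (label, children) pairs (an assumed encoding).
    par[i] is the pre-order index of the parent of node i (0 for roots),
    rl[i] is the pre-order index of the outermost right leaf of subtree i.
    """
    labels, par, rl = [None], [None], [None]

    def visit(node, parent):
        label, children = node
        labels.append(label)
        par.append(parent)
        rl.append(None)
        i = len(labels) - 1
        for child in children:
            visit(child, i)
        # the last index visited inside the subtree is its right-most leaf
        rl[i] = len(labels) - 1

    for root in forest:
        visit(root, 0)
    return labels, par, rl


def ancestors(par, i):
    """Set of all proper ancestors of node i."""
    out = set()
    while par[i] != 0:
        i = par[i]
        out.add(i)
    return out


def check_lemma_a2(forest):
    """Assert Equations A.6, A.7, and A.8 for every node of the forest."""
    labels, par, rl = preorder(forest)
    n = len(labels) - 1
    anc = {i: ancestors(par, i) for i in range(1, n + 1)}
    for k, i in product(range(1, n + 1), repeat=2):
        if k in anc[i]:
            assert k < i <= rl[i] <= rl[k]  # (A.6)
        if k < i <= rl[k]:
            assert k in anc[i]  # (A.7)
    for k in range(1, n + 1):
        for i, j in product(anc[k], repeat=2):
            assert (i < j) == (i in anc[j])  # (A.8)
    return True
```

For the forest $a(b(c, d), e), f$ the sketch yields the parents 0, 1, 2, 2, 1, 0 and the outermost right leaves 5, 4, 3, 4, 5, 6, and all three assertions pass.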

Next, we generalize the notion of tree mappings between trees (refer to Definition 2.14) to tree mappings between forests.

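Definition A.3 (Mappings). Let $\mathcal{A}$ be an alphabet and let $X, Y$ be forests over $\mathcal{A}$. Then, we define a tree mapping $M$ between $X$ and $Y$ as a subset $M \subseteq \{1, \ldots, |X|\} \times \{1, \ldots, |Y|\}$ such that the following conditions hold for all entries $(i, j), (i', j') \in M$.

\[ i \ge i' \iff j \ge j' \quad \text{(pre-order preservation)} \tag{A.9} \]

\[ i \in \mathrm{anc}_X(i') \iff j \in \mathrm{anc}_Y(j') \quad \text{(ancestral preservation)} \tag{A.10} \]

We define the left-complement of $M$ as $I(M, X, Y) := \{ i \in \{1, \ldots, |X|\} \mid \nexists j \in \{1, \ldots, |Y|\}: (i, j) \in M \}$ and we define the right-complement of $M$ as $J(M, X, Y) := \{ j \in \{1, \ldots, |Y|\} \mid \nexists i \in \{1, \ldots, |X|\}: (i, j) \in M \}$. Finally, we define the cost of $M$ according to some cost function $c$ over $\mathcal{A}$ as follows.

\[ c(M, X, Y) := \sum_{(i,j) \in M} c(x_i, y_j) + \sum_{i \in I(M,X,Y)} c(x_i, -) + \sum_{j \in J(M,X,Y)} c(-, y_j) \]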
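Definition A.3 translates almost literally into code. The sketch below is illustrative only: the 1-indexed label and parent lists, the gap stand-in `GAP`, the unit cost function, and all function names are assumptions of this example, not notation from the text.

```python
from itertools import product

GAP = "-"  # assumed stand-in for the gap symbol in c(x, -) and c(-, y)


def unit_cost(a, b):
    """Assumed example cost function: 0 for equal symbols, 1 otherwise."""
    return 0 if a == b else 1


def ancestors(par, i):
    """Proper ancestors of node i; par is 1-indexed, par[i] == 0 for roots."""
    out = set()
    while par[i] != 0:
        i = par[i]
        out.add(i)
    return out


def is_tree_mapping(M, par_x, par_y):
    """Check conditions (A.9) and (A.10) for all pairs of entries of M."""
    for (i, j), (i2, j2) in product(M, repeat=2):
        if (i >= i2) != (j >= j2):  # pre-order preservation
            return False
        if (i in ancestors(par_x, i2)) != (j in ancestors(par_y, j2)):
            return False  # ancestral preservation violated
    return True


def complements(M, size_x, size_y):
    """Left-complement I and right-complement J of M as sorted lists."""
    mapped_i = {i for i, _ in M}
    mapped_j = {j for _, j in M}
    I = [i for i in range(1, size_x + 1) if i not in mapped_i]
    J = [j for j in range(1, size_y + 1) if j not in mapped_j]
    return I, J


def mapping_cost(M, x, y, c=unit_cost):
    """Cost c(M, X, Y); x and y are 1-indexed label lists (index 0 unused)."""
    I, J = complements(M, len(x) - 1, len(y) - 1)
    return (sum(c(x[i], y[j]) for i, j in M)
            + sum(c(x[i], GAP) for i in I)
            + sum(c(GAP, y[j]) for j in J))
```

For $\tilde{x} = a(b)$ and $\tilde{y} = c(d)$, encoded as `x = [None, "a", "b"]`, `par_x = [None, 0, 1]` and analogously for $\tilde{y}$, the set $\{(1, 2)\}$ passes the check with cost 3 under unit costs, whereas $\{(1, 2), (2, 1)\}$ violates pre-order preservation.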

In a next step, we show that we can always find an edit script which is exactly as expensive as the tree mapping in question. Conversely, we can always find a tree mapping which is at most as expensive as the edit script in question. This very fact permits us to search for cheapest tree mappings instead of cheapest edit scripts, as we show in the next lemma. First, however, we define an alternative distance based on tree mappings, which we will then show to be equivalent.

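Definition A.4 (Forest Edit Distance, Forest Mapping Distance). Let $\mathcal{A}$ be an alphabet and let $X, Y$ be forests over $\mathcal{A}$. Further, let $c$ be a cost function over $\mathcal{A}$. Then, we define the forest edit distance $d_c(X, Y)$ between $X$ and $Y$ as

\[ d_c(X, Y) := \min_{\bar{\delta} \in \Delta_{\mathcal{A}}^*} \{ c(\bar{\delta}, X) \mid \bar{\delta}(X) = Y \} \tag{A.11} \]

Further, we define the forest tree mapping distance $D_c(X, Y)$ between $X$ and $Y$ as

\[ D_c(X, Y) := \min_{M \subseteq \{1, \ldots, |X|\} \times \{1, \ldots, |Y|\}} \{ c(M, X, Y) \mid M \text{ is a tree mapping between } X \text{ and } Y \} \tag{A.12} \]

In the next lemma, we demonstrate that, under some conditions on the cost function, $d_c$ and $D_c$ are equivalent.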

Lemma A.3. Let $\mathcal{A}$ be an alphabet and let $X, Y$ be forests over $\mathcal{A}$. Further, let $c$ be a cost function over $\mathcal{A}$. Then, it holds:

1. For any tree mapping $M$ between $X$ and $Y$ there exists an edit script $\bar{\delta}_M \in \Delta_{\mathcal{A}}^*$ such that $\bar{\delta}_M(X) = Y$ and $c(\bar{\delta}_M, X) = c(M, X, Y)$.

2. If $c$ fulfills the triangular inequality and is self-equal, then for any edit script $\bar{\delta} \in \Delta_{\mathcal{A}}^*$ with $\bar{\delta}(X) = Y$ there exists a tree mapping $M_{\bar{\delta}}$ between $X$ and $Y$ such that $c(M_{\bar{\delta}}, X, Y) \le c(\bar{\delta}, X)$.

3. If $c$ fulfills the triangular inequality and is self-equal, then $d_c(X, Y) = D_c(X, Y)$.

Proof. We will consider each claim in turn.

Regarding the first claim, we define two more auxiliary sets, namely $I^C(M, X, Y) := \{ i \in \{1, \ldots, |X|\} \mid \exists j \in \{1, \ldots, |Y|\}: (i, j) \in M \}$ and $J^C(M, X, Y) := \{ j \in \{1, \ldots, |Y|\} \mid \exists i \in \{1, \ldots, |X|\}: (i, j) \in M \}$. Then, we can construct $\bar{\delta}_M$ as the concatenation of three edit scripts $\bar{\delta}^{\mathrm{rep}}_M$, $\bar{\delta}^{\mathrm{del}}_M$, and $\bar{\delta}^{\mathrm{ins}}_M$ as follows. We define $\bar{\delta}^{\mathrm{rep}}_M$ as the list of $\mathrm{rep}_{i, y_j}$ for all $(i, j) \in M$ in lexical ascending order, first sorted according to $i$ and then according to $j$. Per construction, this edit script replaces all $x_i$ with the mapped label $y_j$ according to the tree mapping $M$.

Next, we define $\bar{\delta}^{\mathrm{del}}_M$ as the list of $\mathrm{del}_i$ for all $i \in I(M, X, Y)$ in descending order. Per construction, $\bar{\delta}^{\mathrm{del}}_M(X)$ contains exactly those $x_i$ such that $i \in I^C(M, X, Y)$.

Finally, we define $\bar{\delta}^{\mathrm{ins}}_M$ as the list of $\mathrm{ins}_{\mathrm{par}_Y(j),\, y_j,\, r_Y(j),\, r_Y(j) + R_{M,X,Y}(j)}$ for all $j \in J(M, X, Y)$ in ascending order, where we define $R_{M,X,Y}(j)$ recursively as

\[ R_{M,X,Y}(j) := |\mathrm{adj}_Y(j) \cap J^C(M, X, Y)| + \sum_{j' \in \mathrm{adj}_Y(j) \cap J(M,X,Y)} R_{M,X,Y}(j') \]

and where $\mathrm{adj}_Y(j) = \{ j' \mid \mathrm{par}_Y(j') = j \}$. Per construction, $\bar{\delta}^{\mathrm{ins}}_M$ inserts all labels of $Y$ which are missing in $\bar{\delta}^{\mathrm{rep}}_M \bar{\delta}^{\mathrm{del}}_M(X)$. The definition of $r_Y(j)$ and $R_{M,X,Y}(j)$ ensures that label $y_j$ is inserted at the correct position and uses all children which are mapped to labels in $X$ and are descendants of $\tilde{y}_j$ in $Y$.

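For $\bar{\delta}_M := \bar{\delta}^{\mathrm{rep}}_M \bar{\delta}^{\mathrm{del}}_M \bar{\delta}^{\mathrm{ins}}_M$ we thus obtain $\bar{\delta}_M(X) = Y$ and

\[
\begin{aligned}
c(\bar{\delta}_M, X) &= c(\bar{\delta}^{\mathrm{rep}}_M, X) + c(\bar{\delta}^{\mathrm{del}}_M, \bar{\delta}^{\mathrm{rep}}_M(X)) + c(\bar{\delta}^{\mathrm{ins}}_M, \bar{\delta}^{\mathrm{rep}}_M \bar{\delta}^{\mathrm{del}}_M(X)) \\
&= \sum_{(i,j) \in M} c(x_i, y_j) + \sum_{i \in I(M,X,Y)} c(x_i, -) + \sum_{j \in J(M,X,Y)} c(-, y_j) = c(M, X, Y).
\end{aligned}
\]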

Regarding the second claim, we perform an inductive proof. As base case, consider the empty edit script $\bar{\delta} = \epsilon$, which implies that $\bar{\delta}(X) = Y = X$. In that case, we define $M_{\bar{\delta}} = \{ (i, i) \mid i \in \{1, \ldots, |X|\} \}$. Accordingly, we obtain $c(M_{\bar{\delta}}, X, Y) = c(M_{\bar{\delta}}, X, X) = \sum_{i=1}^{|X|} c(x_i, x_i)$. Because $c$ is self-equal, $c(x_i, x_i)$ is zero for all $i$, which in turn implies that $c(M_{\bar{\delta}}, X, Y) = 0 = c(\epsilon, X)$ as desired.

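Now, consider a non-empty edit script $\bar{\delta} = \delta_1 \ldots \delta_{T+1}$ over $\Delta_{\mathcal{A}}$ such that $\bar{\delta}(X) = Y$ and let $\bar{\delta}' := \delta_1 \ldots \delta_T$ as well as $Y' := \bar{\delta}'(X)$. Due to induction, we know that there exists a tree mapping $M_{\bar{\delta}'}$ between $X$ and $Y'$ such that $c(M_{\bar{\delta}'}, X, Y') \le c(\bar{\delta}', X)$. Now, consider the final edit $\delta_{T+1}$. If $\delta_{T+1}(Y') = Y' = Y$, we define $M_{\bar{\delta}} := M_{\bar{\delta}'}$. Because $M_{\bar{\delta}'}$ is a valid tree mapping between $X$ and $Y'$, it is also a valid tree mapping between $X$ and $Y = Y'$. Further, for the cost we obtain $c(\bar{\delta}, X) = c(\bar{\delta}', X) \overset{\text{Induction}}{\ge} c(M_{\bar{\delta}'}, X, Y') = c(M_{\bar{\delta}}, X, Y)$.

It remains to consider all cases in which $Y = \delta_{T+1}(Y') \ne Y'$. We distinguish the following cases.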

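$\delta_{T+1} = \mathrm{rep}_{j, y_j}$ for some $j \in \{1, \ldots, |Y|\}$. Then, we define $M_{\bar{\delta}} := M_{\bar{\delta}'}$. $M_{\bar{\delta}}$ is a tree mapping between $X$ and $Y$ because the ancestral structure of $Y'$ and $Y$ is exactly the same and $M_{\bar{\delta}'}$ was per induction a valid tree mapping between $X$ and $Y'$.

Further, if there exists an $i$ such that $(i, j) \in M_{\bar{\delta}'}$ we obtain:

\[
\begin{aligned}
c(\bar{\delta}, X) &= c(\bar{\delta}', X) + c(y'_j, y_j) \overset{\text{Induction}}{\ge} c(M_{\bar{\delta}'}, X, Y') + c(y'_j, y_j) \\
&= c(M_{\bar{\delta}}, X, Y) - c(x_i, y_j) + c(x_i, y'_j) + c(y'_j, y_j) \\
&\overset{\text{triang.}}{\ge} c(M_{\bar{\delta}}, X, Y) - c(x_i, y_j) + c(x_i, y_j) = c(M_{\bar{\delta}}, X, Y)
\end{aligned}
\]

Conversely, if there is no $i$ such that $(i, j) \in M_{\bar{\delta}'}$ we obtain:

\[
\begin{aligned}
c(\bar{\delta}, X) &= c(\bar{\delta}', X) + c(y'_j, y_j) \overset{\text{Induction}}{\ge} c(M_{\bar{\delta}'}, X, Y') + c(y'_j, y_j) \\
&= c(M_{\bar{\delta}}, X, Y) - c(-, y_j) + c(-, y'_j) + c(y'_j, y_j) \\
&\overset{\text{triang.}}{\ge} c(M_{\bar{\delta}}, X, Y) - c(-, y_j) + c(-, y_j) = c(M_{\bar{\delta}}, X, Y)
\end{aligned}
\]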

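$\delta_{T+1} = \mathrm{del}_j$ for some $j \in \{1, \ldots, |Y'|\}$. Then, for all $j' \in \{1, \ldots, j-1\}$ it holds $\mathrm{anc}_Y(j') = \mathrm{anc}_{Y'}(j')$, and for all $j' \in \{j+1, \ldots, |Y'|\}$ it holds $\mathrm{anc}_Y(j') = \{ j'' \mid j'' \in \mathrm{anc}_{Y'}(j'), j'' < j \} \cup \{ j'' - 1 \mid j'' \in \mathrm{anc}_{Y'}(j'), j'' \ge j \}$. Accordingly, we define $M_{\bar{\delta}} := \{ (i, j') \in M_{\bar{\delta}'} \mid j' < j \} \cup \{ (i, j' - 1) \mid (i, j') \in M_{\bar{\delta}'}, j' > j \}$ such that $M_{\bar{\delta}}$ is a valid tree mapping between $X$ and $Y$.

Further, if there exists an $i$ such that $(i, j) \in M_{\bar{\delta}'}$ we obtain:

\[
\begin{aligned}
c(\bar{\delta}, X) &= c(\bar{\delta}', X) + c(y'_j, -) \overset{\text{Induction}}{\ge} c(M_{\bar{\delta}'}, X, Y') + c(y'_j, -) \\
&= c(M_{\bar{\delta}}, X, Y) - c(x_i, -) + c(x_i, y'_j) + c(y'_j, -) \\
&\overset{\text{triang.}}{\ge} c(M_{\bar{\delta}}, X, Y) - c(x_i, -) + c(x_i, -) = c(M_{\bar{\delta}}, X, Y)
\end{aligned}
\]

Conversely, if there is no $i$ such that $(i, j) \in M_{\bar{\delta}'}$ we obtain:

\[
\begin{aligned}
c(\bar{\delta}, X) &= c(\bar{\delta}', X) + c(y'_j, -) \overset{\text{Induction}}{\ge} c(M_{\bar{\delta}'}, X, Y') + c(y'_j, -) \\
&= c(M_{\bar{\delta}}, X, Y) + c(-, y'_j) + c(y'_j, -) \\
&\overset{\text{triang.}}{\ge} c(M_{\bar{\delta}}, X, Y) + c(-, -) \overset{\text{self-eq.}}{=} c(M_{\bar{\delta}}, X, Y)
\end{aligned}
\]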

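$\delta_{T+1} = \mathrm{ins}_{\mathrm{par}(j), y_j, l, r}$ for some j ∈ {1, …, |Y|}, l ≤ r ∈ {1, …, |$¯(ỹ_j)|}. Then, for all $j' < j$ it holds: $\mathrm{anc}_Y(j') = \mathrm{anc}_{Y'}(j')$. For all $j'$ with $j \in \mathrm{anc}_Y(j')$ it holds: $\mathrm{anc}_Y(j') = \{ j'' \in \mathrm{anc}_{Y'}(j'-1) \mid j'' < j \} \cup \{ j \} \cup \{ j'' + 1 \mid j'' \in \mathrm{anc}_{Y'}(j'-1), j'' \ge j \}$. Finally, for all $j'$ with $j' > j$ and $j \notin \mathrm{anc}_Y(j')$ it holds: $\mathrm{anc}_Y(j') = \{ j'' \in \mathrm{anc}_{Y'}(j'-1) \mid j'' < j \} \cup \{ j'' + 1 \mid j'' \in \mathrm{anc}_{Y'}(j'-1), j'' \ge j \}$. In other words, the ancestors for all $j' < j$ are maintained, while the ancestors for $j' > j$ in $Y$ are the ancestors of $j' - 1$ in $Y'$, except for $j$, which may be added as an ancestor. Accordingly, we define $M_{\bar{\delta}} := \{ (i, j') \in M_{\bar{\delta}'} \mid j' < j \} \cup \{ (i, j' + 1) \mid (i, j') \in M_{\bar{\delta}'}, j' \ge j \}$ such that $M_{\bar{\delta}}$ is a valid tree mapping between $X$ and $Y$.

Further, for the cost we obtain:

\[ c(\bar{\delta}, X) = c(\bar{\delta}', X) + c(-, y_j) \overset{\text{Induction}}{\ge} c(M_{\bar{\delta}'}, X, Y') + c(-, y_j) = c(M_{\bar{\delta}}, X, Y) \]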

Therefore, in all cases, we obtain $c(M_{\bar{\delta}}, X, Y) \le c(\bar{\delta}, X)$, which concludes the proof by induction.

Finally, the third claim follows from the previous two. In particular, consider the following proof by contradiction. If $D_c(X, Y) < d_c(X, Y)$, then there exists a tree mapping $M$ between $X$ and $Y$ such that $c(M, X, Y) < d_c(X, Y)$. However, we have shown that we can construct an edit script $\bar{\delta}_M$ such that $\bar{\delta}_M(X) = Y$ and $c(\bar{\delta}_M, X) = c(M, X, Y)$. Therefore, $d_c(X, Y) \le c(\bar{\delta}_M, X) = c(M, X, Y) < d_c(X, Y)$, which is a contradiction. Conversely, if $d_c(X, Y) < D_c(X, Y)$, then there exists an edit script $\bar{\delta}$ such that $\bar{\delta}(X) = Y$ and $c(\bar{\delta}, X) < D_c(X, Y)$. However, we have shown that we can construct a tree mapping $M_{\bar{\delta}}$ between $X$ and $Y$ such that $c(M_{\bar{\delta}}, X, Y) \le c(\bar{\delta}, X)$. Therefore, $D_c(X, Y) \le c(M_{\bar{\delta}}, X, Y) \le c(\bar{\delta}, X) < D_c(X, Y)$, which is also a contradiction. This only leaves the option $D_c(X, Y) = d_c(X, Y)$, which concludes the proof.
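The index bookkeeping in the proof of the second claim can be sketched in a few lines. This is an illustrative simplification, not the construction as stated in the lemma: each edit is reduced to a pair `(kind, j)`, where `j` is the pre-order index of the affected node (for an insertion, the index of the new node after insertion), and the names `update_mapping` and `script_to_mapping` are hypothetical.

```python
def update_mapping(M, edit):
    """Update a mapping (set of (i, j) pairs) for one edit on the right forest.

    edit is (kind, j) with kind in {"rep", "del", "ins"}; j is the pre-order
    index of the replaced, deleted, or newly inserted node (an assumed,
    simplified encoding of the edits used in the proof).
    """
    kind, j = edit
    if kind == "rep":
        # replacements leave the ancestral structure untouched
        return set(M)
    if kind == "del":
        # entries mapped to j vanish, later indices shift down by one
        return ({(i, jp) for i, jp in M if jp < j}
                | {(i, jp - 1) for i, jp in M if jp > j})
    if kind == "ins":
        # indices at or after the new node shift up by one
        return ({(i, jp) for i, jp in M if jp < j}
                | {(i, jp + 1) for i, jp in M if jp >= j})
    raise ValueError(f"unknown edit kind: {kind}")


def script_to_mapping(size_x, edits):
    """Start from the identity mapping of the base case and apply all edits."""
    M = {(i, i) for i in range(1, size_x + 1)}
    for edit in edits:
        M = update_mapping(M, edit)
    return M
```

Starting from the identity mapping on a single node, the sequence `[("rep", 1), ("ins", 1), ("del", 2)]` passes through $\{(1, 1)\}$ and $\{(1, 2)\}$ and ends in the empty mapping.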

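As an example for the first construction in Lemma A.3, consider the trees $\tilde{x} = a(b)$ and $\tilde{y} = c(d)$, as well as the tree mapping $M = \{(1, 2)\}$. $M$ would be translated into the following three edit scripts. First, $\bar{\delta}^{\mathrm{rep}}_M = \mathrm{rep}_{1, y_2} = \mathrm{rep}_{1, d}$; second, $\bar{\delta}^{\mathrm{del}}_M = \mathrm{del}_2$; and third, $\bar{\delta}^{\mathrm{ins}}_M = \mathrm{ins}_{\mathrm{par}_{\tilde{y}}(1),\, y_1,\, r_{\tilde{y}}(1),\, r_{\tilde{y}}(1) + R_{M,\tilde{x},\tilde{y}}(1)} = \mathrm{ins}_{0, c, 1, 2}$. Note that the third construction works because

\[ R_{M,\tilde{x},\tilde{y}}(1) = |\mathrm{adj}_{\tilde{y}}(1) \cap J^C(M, \tilde{x}, \tilde{y})| + \sum_{j' \in \mathrm{adj}_{\tilde{y}}(1) \cap J(M, \tilde{x}, \tilde{y})} R_{M,\tilde{x},\tilde{y}}(j') = |\{2\} \cap \{2\}| + 0 = 1 \]

Accordingly, the tree mapping $M = \{(1, 2)\}$ would be translated into the edit script $\bar{\delta}_M = \mathrm{rep}_{1,d}\, \mathrm{del}_2\, \mathrm{ins}_{0,c,1,2}$, which does indeed result in $\bar{\delta}_M(\tilde{x}) = \mathrm{del}_2\, \mathrm{ins}_{0,c,1,2}(d(b)) = \mathrm{ins}_{0,c,1,2}(d) = c(d) = \tilde{y}$. The costs are $c(\bar{\delta}_M, \tilde{x}) = c(a, d) + c(b, -) + c(-, c) = c(M, \tilde{x}, \tilde{y})$.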

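As an example for the second construction in Lemma A.3, consider the trees $\tilde{x} = a$ and $\tilde{y} = b$, as well as the edit script $\bar{\delta} = \mathrm{rep}_{1,c}\, \mathrm{ins}_{0,b,1,2}\, \mathrm{del}_2$. This edit script would be translated into a tree mapping as follows. First, we initialize our tree mapping as $M_{\epsilon} = \{(1, 1)\}$. Next, consider the first edit, $\delta_1 = \mathrm{rep}_{1,c}$, which transforms $\tilde{x}$ into $\mathrm{rep}_{1,c}(a) = c$. The corresponding tree mapping remains $M_{\mathrm{rep}_{1,c}} = \{(1, 1)\}$. Next, consider the second edit, $\delta_2 = \mathrm{ins}_{0,b,1,2}$, which transforms $c$ into $\mathrm{ins}_{0,b,1,2}(c) = b(c)$. The corresponding tree mapping would thus be $M_{\mathrm{rep}_{1,c}\, \mathrm{ins}_{0,b,1,2}} = \{(1, 2)\}$. Finally, consider the third edit, $\delta_3 = \mathrm{del}_2$, which transforms $b(c)$ into $\mathrm{del}_2(b(c)) = \tilde{y}$. The corresponding tree mapping would thus become $M_{\bar{\delta}} = \emptyset$. For the costs we obtain

\[ c(M_{\bar{\delta}}, \tilde{x}, \tilde{y}) = c(a, -) + c(-, b) \overset{\text{triang.}}{\le} c(a, c) + c(-, b) + c(c, -) = c(\bar{\delta}, \tilde{x}) \]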
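Both worked examples can be replayed on a small forest data structure. The sketch below assumes particular semantics for the three edits, since their formal definition is not part of this section: $\mathrm{rep}_{i,y}$ relabels node $i$, $\mathrm{del}_i$ removes node $i$ and promotes its children to its former position, and $\mathrm{ins}_{p,y,l,r}$ inserts a new node $y$ as child number $l$ of node $p$ (with $p = 0$ denoting the top level) and lets it adopt the former children $l, \ldots, r-1$. The nested-list encoding and all function names are likewise illustrative.

```python
import copy


def locate(forest, index):
    """Return (siblings, pos) with siblings[pos] the node at pre-order index."""
    count = 0

    def walk(siblings):
        nonlocal count
        for pos, node in enumerate(siblings):
            count += 1
            if count == index:
                return siblings, pos
            found = walk(node[1])
            if found is not None:
                return found
        return None

    found = walk(forest)
    if found is None:
        raise IndexError(f"no node with pre-order index {index}")
    return found


def rep(forest, i, label):
    """Relabel node i (assumed semantics of rep_{i,label})."""
    forest = copy.deepcopy(forest)
    siblings, pos = locate(forest, i)
    siblings[pos][0] = label
    return forest


def delete(forest, i):
    """Remove node i and splice its children into its place (assumed del_i)."""
    forest = copy.deepcopy(forest)
    siblings, pos = locate(forest, i)
    siblings[pos:pos + 1] = siblings[pos][1]
    return forest


def ins(forest, p, label, l, r):
    """Insert label as child l of node p, adopting children l..r-1 (assumed)."""
    forest = copy.deepcopy(forest)
    if p == 0:
        children = forest  # top level of the forest
    else:
        siblings, pos = locate(forest, p)
        children = siblings[pos][1]
    children[l - 1:r - 1] = [[label, children[l - 1:r - 1]]]
    return forest


# First example: rep_{1,d} del_2 ins_{0,c,1,2} applied to a(b).
x = [["a", [["b", []]]]]
step1 = rep(x, 1, "d")               # d(b)
step2 = delete(step1, 2)             # d
step3 = ins(step2, 0, "c", 1, 2)     # c(d)

# Second example: rep_{1,c} ins_{0,b,1,2} del_2 applied to a.
z = delete(ins(rep([["a", []]], 1, "c"), 0, "b", 1, 2), 2)  # b
```

Each function returns a modified copy, so intermediate forests such as `step1` and `step2` remain available for inspection.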

By virtue of Lemma A.3 we can compute the cheapest tree mapping between two forests instead of the cheapest edit script which transforms one forest into the other, as long as our cost function fulfills the triangular inequality and is self-equal. This already simplifies our problem significantly because there is only a finite number of valid tree mappings between two input forests, while there is an infinite number of edit scripts. However, the number of tree mappings is in $O(2^{|\tilde{x}| \cdot |\tilde{y}|})$ such that an exhaustive enumeration is infeasible. Instead, Zhang and Shasha (1989) propose a dynamic programming scheme which relies on decomposing the edit distance between two input forests into edit distances between subforests. In particular, we define subforests as follows.

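Definition A.5 (subforest). Let $\mathcal{A}$ be an alphabet, let $X$ be a forest over $\mathcal{A}$, and let $k \in \mathbb{N}$, $i \in \mathbb{Z}$. Then, we define the subforest $X[k, i]$ from $k$ to $i$ recursively as follows. If $X = \epsilon$, then $X[k, i] := \epsilon$. Otherwise, let $X = x(X_1), X_0$ for some $x \in \mathcal{A}$ and some forests $X_1, X_0 \in \mathcal{T}(\mathcal{A})$. In that case, we define:

\[
X[k, i] :=
\begin{cases}
\epsilon & \text{if } k > i \lor k > |X| \\
(X_1, X_0)[k - 1, i - 1] & \text{if } 1 < k \le i \\
x(X_1[1, i - 1]), X_0[1, i - |X_1| - 1] & \text{if } 1 = k \le i
\end{cases}
\tag{A.13}
\]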

For example, the subforest $(a, b, c)[2, 3]$ would be $b, c$. The subforest $\tilde{x}[2, 4]$ for $\tilde{x} = a(b(c, d), e)$ would be $b(c, d)$. In general, subforests maintain the structure of the input forest, as the following lemma demonstrates.
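The recursion in Equation A.13 can be transcribed directly. In this sketch, forests are tuples of `(label, children)` pairs and the helper `t` builds a node; both are conventions chosen for this example rather than notation from the text.

```python
def t(label, *children):
    """Build a tree node (label, children) from a label and child nodes."""
    return (label, tuple(children))


def size(forest):
    """Number of nodes |X| of a forest."""
    return sum(1 + size(children) for _, children in forest)


def subforest(X, k, i):
    """Subforest X[k, i] following the three cases of Equation A.13."""
    if not X:
        return ()
    if k > i or k > size(X):
        return ()  # first case: empty result
    (x, X1), X0 = X[0], X[1:]
    if k > 1:
        # second case: drop the first root, its children move up one level
        return subforest(X1 + X0, k - 1, i - 1)
    # third case: keep the first root and cut both parts to the index range
    return ((x, subforest(X1, 1, i - 1)),) + subforest(X0, 1, i - size(X1) - 1)


abc = (t("a"), t("b"), t("c"))
x = (t("a", t("b", t("c"), t("d")), t("e")),)
print(subforest(abc, 2, 3))  # the forest b, c
print(subforest(x, 2, 4))    # the single tree b(c, d)
```

The two calls reproduce the examples given after Definition A.5; calling `subforest(x, 1, 5)` returns `x` itself, in line with Lemma A.4 for $i = 1$.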

Lemma A.4. Let $\mathcal{A}$ be an alphabet, and let $X \ne \epsilon$ be a forest over $\mathcal{A}$. Then, for any $i \in \{1, \ldots, |X|\}$ it holds: $X[i, \mathrm{rl}_X(i)] = \tilde{x}_i$, that is, the subforest from $i$ to $\mathrm{rl}_X(i)$ is the $i$th subtree according to pre-order.

Proof. Note that $X \ne \epsilon$ and $i \le \mathrm{rl}_X(i) \le |X|$ such that the first case of Equation A.13 does not apply. Now, let $X = x(X_1), X_0$ for some $x \in \mathcal{A}$ and some forests $X_1, X_0 \in \mathcal{T}(\mathcal{A})$ and consider the third case of Equation A.13, that is, $i = 1$. In that case, we obtain $X[1, \mathrm{rl}_X(1)] = X[1, |X_1| + 1] = x(X_1[1, |X_1|]), X_0[1, 0] = x(X_1[1, |X_1|])$. Recursive application of case 3 yields $x(X_1[1, |X_1|]) = \ldots = x(X_1, \epsilon[1, 0]) = x(X_1) = \tilde{x}_1$.

Now, consider case 2 of Equation A.13, that is, $i > 1$, and distinguish the following subcases.

If $\mathrm{par}_X(i) = 0$, let $X = \tilde{x}_1, \ldots, \tilde{x}_R$ and let $r \in \{1, \ldots, R\}$ be the index such that $\tilde{x}_i = \tilde{x}_r$. Accordingly, $i = \sum_{l=1}^{r-1} |\tilde{x}_l| + 1$. Further, let $\tilde{x}_i = \tilde{x}_r = x_i(X_i)$ for some forest $X_i$. Now, recursive application of case 2 of Equation A.13 yields $X[i, \mathrm{rl}_X(i)] = (X_1, X_0)[i - 1, \mathrm{rl}_X(i) - 1] = \ldots = (\tilde{x}_2, \ldots, \tilde{x}_R)[i - |\tilde{x}_1|, \mathrm{rl}_X(i) - |\tilde{x}_1|] = \ldots = (\tilde{x}_r, \ldots, \tilde{x}_R)[1, |\tilde{x}_r|]$. At this point, case 3 of Equation A.13 applies and yields $(\tilde{x}_r, \ldots, \tilde{x}_R)[1, |\tilde{x}_r|] = x_i(X_i[1, |X_i|]), (\tilde{x}_{r+1}, \ldots, \tilde{x}_R)[1, 0] = x_i(X_i[1, |X_i|]) = \ldots = \tilde{x}_i$, which concludes the proof.

Using the concept of subforests, we can now go on to establish the Bellman equations which will form the basis for the dynamic programming Algorithm 2.1.

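Lemma A.5. Let $\mathcal{A}$ be an alphabet and let $X, Y$ be non-empty forests over $\mathcal{A}$. Further, let $c$ be a cost function over $\mathcal{A}$.

Then, for any $i \in \{1, \ldots, |X| + 1\}$, $j \in \{1, \ldots, |Y| + 1\}$, $k \in \mathrm{anc}_X(i) \cup \{i\}$, and $l \in \mathrm{anc}_Y(j) \cup \{j\}$ it holds:

\[ D_c(\epsilon, \epsilon) = 0 \tag{A.14} \]

\[ D_c(X[i, \mathrm{rl}_X(k)], \epsilon) = c(x_i, -) + D_c(X[i+1, \mathrm{rl}_X(k)], \epsilon) \tag{A.15} \]

\[ D_c(\epsilon, Y[j, \mathrm{rl}_Y(l)]) = c(-, y_j) + D_c(\epsilon, Y[j+1, \mathrm{rl}_Y(l)]) \tag{A.16} \]

\[
\begin{aligned}
D_c(X[i, \mathrm{rl}_X(k)], Y[j, \mathrm{rl}_Y(l)]) = \min\big\{ & c(x_i, -) + D_c(X[i+1, \mathrm{rl}_X(k)], Y[j, \mathrm{rl}_Y(l)]), \\
& c(-, y_j) + D_c(X[i, \mathrm{rl}_X(k)], Y[j+1, \mathrm{rl}_Y(l)]), \\
& c(x_i, y_j) + D_c(X[i+1, \mathrm{rl}_X(i)], Y[j+1, \mathrm{rl}_Y(j)]) \\
& \quad + D_c(X[\mathrm{rl}_X(i)+1, \mathrm{rl}_X(k)], Y[\mathrm{rl}_Y(j)+1, \mathrm{rl}_Y(l)]) \big\}
\end{aligned}
\tag{A.17}
\]

\[
\begin{aligned}
D_c(X[i, \mathrm{rl}_X(k)], Y[j, \mathrm{rl}_Y(l)]) = \min\big\{ & c(x_i, -) + D_c(X[i+1, \mathrm{rl}_X(k)], Y[j, \mathrm{rl}_Y(l)]), \\
& c(-, y_j) + D_c(X[i, \mathrm{rl}_X(k)], Y[j+1, \mathrm{rl}_Y(l)]), \\
& D_c(\tilde{x}_i, \tilde{y}_j) + D_c(X[\mathrm{rl}_X(i)+1, \mathrm{rl}_X(k)], Y[\mathrm{rl}_Y(j)+1, \mathrm{rl}_Y(l)]) \big\}
\end{aligned}
\tag{A.18}
\]

\[
\begin{aligned}
D_c(\tilde{x}_i, \tilde{y}_j) = \min\big\{ & c(x_i, -) + D_c(X[i+1, \mathrm{rl}_X(i)], Y[j, \mathrm{rl}_Y(j)]), \\
& c(-, y_j) + D_c(X[i, \mathrm{rl}_X(i)], Y[j+1, \mathrm{rl}_Y(j)]), \\
& c(x_i, y_j) + D_c(X[i+1, \mathrm{rl}_X(i)], Y[j+1, \mathrm{rl}_Y(j)]) \big\}
\end{aligned}
\tag{A.19}
\]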

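Proof. First, consider Equations A.14, A.15, and A.16. In all these cases, only the empty tree mapping $M = \emptyset$ is possible because at least one input forest is empty. The cost of the empty tree mapping for any two forests $X$ and $Y$ is

\[ c(\emptyset, X, Y) = \sum_{i=1}^{|X|} c(x_i, -) + \sum_{j=1}^{|Y|} c(-, y_j) \]

This cost decomposes as desired, in particular:

\[
\begin{aligned}
c(\emptyset, \epsilon, \epsilon) &= 0, \\
c(\emptyset, X[i, \mathrm{rl}_X(k)], \epsilon) &= c(x_i, -) + c(\emptyset, X[i+1, \mathrm{rl}_X(k)], \epsilon), \text{ and} \\
c(\emptyset, \epsilon, Y[j, \mathrm{rl}_Y(l)]) &= c(-, y_j) + c(\emptyset, \epsilon, Y[j+1, \mathrm{rl}_Y(l)])
\end{aligned}
\]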

Next, consider Equations A.17 and A.18. In particular, let $M$ be a tree mapping between the subforests $X[i, \mathrm{rl}_X(k)]$ and $Y[j, \mathrm{rl}_Y(l)]$ such that $c(M, X[i, \mathrm{rl}_X(k)], Y[j, \mathrm{rl}_Y(l)]) = D_c(X[i, \mathrm{rl}_X(k)], Y[j, \mathrm{rl}_Y(l)])$. To avoid symbol clutter, we will use the shorthands $X_i := X[i, \mathrm{rl}_X(k)]$, $X_{i+1} := X[i+1, \mathrm{rl}_X(k)]$, $Y_j := Y[j, \mathrm{rl}_Y(l)]$, and $Y_{j+1} := Y[j+1, \mathrm{rl}_Y(l)]$. Now, one of the following three cases has to apply:

$1 \in I(M, X_i, Y_j)$: In this case, $M' := \{ (i' - 1, j') \mid (i', j') \in M \}$ is a tree mapping between $X_{i+1}$ and $Y_j$. Further, it holds $c(M', X_{i+1}, Y_j) = D_c(X_{i+1}, Y_j)$. Otherwise, there would exist a tree mapping $\tilde{M}'$ between $X_{i+1}$ and $Y_j$, such that $c(\tilde{M}', X_{i+1}, Y_j) < c(M', X_{i+1}, Y_j)$. In that case, consider $\tilde{M} := \{ (i' + 1, j') \mid (i', j') \in \tilde{M}' \}$, which is a tree mapping between $X_i$ and $Y_j$, such that:

\[
\begin{aligned}
D_c(X_i, Y_j) &\le c(\tilde{M}, X_i, Y_j) = c(x_i, -) + c(\tilde{M}', X_{i+1}, Y_j) \\
&< c(x_i, -) + c(M', X_{i+1}, Y_j) = c(M, X_i, Y_j) = D_c(X_i, Y_j)
\end{aligned}
\]

which is a contradiction. Therefore, it holds:

\[ D_c(X_i, Y_j) = c(M, X_i, Y_j) = c(x_i, -) + c(M', X_{i+1}, Y_j) = c(x_i, -) + D_c(X_{i+1}, Y_j) \tag{A.20} \]

$1 \in J(M, X_i, Y_j)$: In this case, $M' := \{ (i', j' - 1) \mid (i', j') \in M \}$ is a tree mapping between $X_i$ and $Y_{j+1}$. Further, it holds $c(M', X_i, Y_{j+1}) = D_c(X_i, Y_{j+1})$. Otherwise, there would exist a tree mapping $\tilde{M}'$ between $X_i$ and $Y_{j+1}$, such that $c(\tilde{M}', X_i, Y_{j+1}) < c(M', X_i, Y_{j+1})$. In that case, consider $\tilde{M} := \{ (i', j' + 1) \mid (i', j') \in \tilde{M}' \}$, which is a tree mapping between $X_i$ and $Y_j$, such that:

\[
\begin{aligned}
D_c(X_i, Y_j) &\le c(\tilde{M}, X_i, Y_j) = c(-, y_j) + c(\tilde{M}', X_i, Y_{j+1}) \\
&< c(-, y_j) + c(M', X_i, Y_{j+1}) = c(M, X_i, Y_j) = D_c(X_i, Y_j)
\end{aligned}
\]

which is a contradiction. Therefore, it holds:

\[ D_c(X_i, Y_j) = c(M, X_i, Y_j) = c(-, y_j) + c(M', X_i, Y_{j+1}) = c(-, y_j) + D_c(X_i, Y_{j+1}) \tag{A.21} \]

$1 \in I^C(M, X_i, Y_j) \land 1 \in J^C(M, X_i, Y_j)$: In this case, we first show that $(1, 1) \in M$. If that were not the case, there would exist an $i \in \{1, \ldots, |X_i|\}$ and a $j \in \{1, \ldots, |Y_j|\}$, such that $(1, j) \in M$, $(i, 1) \in M$, and $i \ne 1$ or $j \ne 1$. If $i > 1$, Equation 2.21 implies that $j < 1$, which is a contradiction. Conversely, if $j > 1$, Equation 2.21 implies that $i < 1$, which is a contradiction. Therefore, $i = j = 1$ and, thus, $(1, 1) \in M$.

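Now, Equation 2.22 implies that for all $(i', j') \in M$ it must hold: $1 \in \mathrm{anc}_{X_i}(i') \iff 1 \in \mathrm{anc}_{Y_j}(j')$. In conjunction with Equation A.6, we obtain $1 \le i' \le |\tilde{x}_i| \iff 1 \le j' \le |\tilde{y}_j|$. Accordingly, $M$ must be decomposable as $M = M_1 \cup M_2$ where for all $(i', j') \in M_1$ it holds $i' \le |\tilde{x}_i|$ and $j' \le |\tilde{y}_j|$; and for all $(i', j') \in M_2$ it holds $i' > |\tilde{x}_i|$ and $j' > |\tilde{y}_j|$. This, in turn, implies that $M_1$ is a tree mapping between $X[i, \mathrm{rl}_X(i)] = \tilde{x}_i$ and $Y[j, \mathrm{rl}_Y(j)] = \tilde{y}_j$ (both equalities by Lemma A.4), and $M_2' := \{ (i' - |\tilde{x}_i|, j' - |\tilde{y}_j|) \mid (i', j') \in M_2 \}$ is a tree mapping between $X' := X[\mathrm{rl}_X(i) + 1, \mathrm{rl}_X(k)]$ and $Y' := Y[\mathrm{rl}_Y(j) + 1, \mathrm{rl}_Y(l)]$.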

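Further, it holds $c(M_1, \tilde{x}_i, \tilde{y}_j) = D_c(\tilde{x}_i, \tilde{y}_j)$. Otherwise, there would exist a tree mapping $\tilde{M}_1$ between $\tilde{x}_i$ and $\tilde{y}_j$, such that $c(\tilde{M}_1, \tilde{x}_i, \tilde{y}_j) < c(M_1, \tilde{x}_i, \tilde{y}_j)$. In that case, consider $\tilde{M} := \tilde{M}_1 \cup M_2$, which is a tree mapping between $X_i$ and $Y_j$ such that:

\[
\begin{aligned}
D_c(X_i, Y_j) &\le c(\tilde{M}, X_i, Y_j) = c(\tilde{M}_1, \tilde{x}_i, \tilde{y}_j) + c(M_2', X', Y') \\
&< c(M_1, \tilde{x}_i, \tilde{y}_j) + c(M_2', X', Y') = c(M, X_i, Y_j) = D_c(X_i, Y_j)
\end{aligned}
\]

which is a contradiction. Also, it holds $c(M_2', X', Y') = D_c(X', Y')$. Otherwise, there would exist a tree mapping $\tilde{M}_2'$ between $X'$ and $Y'$, such that $c(\tilde{M}_2', X', Y') < c(M_2', X', Y')$. In that case, consider $\tilde{M} := M_1 \cup \{ (i' + |\tilde{x}_i|, j' + |\tilde{y}_j|) \mid (i', j') \in \tilde{M}_2' \}$, which is a tree mapping between $X_i$ and $Y_j$ such that:

\[
\begin{aligned}
D_c(X_i, Y_j) &\le c(\tilde{M}, X_i, Y_j) = c(M_1, \tilde{x}_i, \tilde{y}_j) + c(\tilde{M}_2', X', Y') \\
&< c(M_1, \tilde{x}_i, \tilde{y}_j) + c(M_2', X', Y') = c(M, X_i, Y_j) = D_c(X_i, Y_j)
\end{aligned}
\]

which is a contradiction. Therefore, we obtain:

\[
\begin{aligned}
D_c(X_i, Y_j) &= c(M, X_i, Y_j) = c(M_1, \tilde{x}_i, \tilde{y}_j) + c(M_2', X', Y') \\
&= D_c(\tilde{x}_i, \tilde{y}_j) + D_c\big(X[\mathrm{rl}_X(i) + 1, \mathrm{rl}_X(k)], Y[\mathrm{rl}_Y(j) + 1, \mathrm{rl}_Y(l)]\big)
\end{aligned}
\tag{A.22}
\]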

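Finally, consider the term $D_c(\tilde{x}_i, \tilde{y}_j)$. Because $(1, 1) \in M_1$, it follows that $M_1' := \{ (i' - 1, j' - 1) \mid (i', j') \in M_1 \setminus \{(1, 1)\} \}$ is a tree mapping between $X'_{i+1} := X[i+1, \mathrm{rl}_X(i)]$ and $Y'_{j+1} := Y[j+1, \mathrm{rl}_Y(j)]$. Further, it holds $c(M_1', X'_{i+1}, Y'_{j+1}) = D_c(X'_{i+1}, Y'_{j+1})$. If that were not the case, there would exist a tree mapping $\tilde{M}_1'$ between $X'_{i+1}$ and $Y'_{j+1}$, such that $c(\tilde{M}_1', X'_{i+1}, Y'_{j+1}) < c(M_1', X'_{i+1}, Y'_{j+1})$. In that case, consider $\tilde{M}_1 := \{(1, 1)\} \cup \{ (i' + 1, j' + 1) \mid (i', j') \in \tilde{M}_1' \}$, which is a tree mapping between $\tilde{x}_i$ and $\tilde{y}_j$, such that:

\[
\begin{aligned}
D_c(\tilde{x}_i, \tilde{y}_j) &\le c(\tilde{M}_1, \tilde{x}_i, \tilde{y}_j) = c(x_i, y_j) + c(\tilde{M}_1', X'_{i+1}, Y'_{j+1}) \\
&< c(x_i, y_j) + c(M_1', X'_{i+1}, Y'_{j+1}) = c(M_1, \tilde{x}_i, \tilde{y}_j) = D_c(\tilde{x}_i, \tilde{y}_j)
\end{aligned}
\]

which is a contradiction. Therefore, it holds:

\[
\begin{aligned}
D_c(\tilde{x}_i, \tilde{y}_j) &= c(M_1, \tilde{x}_i, \tilde{y}_j) = c(x_i, y_j) + c(M_1', X'_{i+1}, Y'_{j+1}) \\
&= c(x_i, y_j) + D_c(X[i+1, \mathrm{rl}_X(i)], Y[j+1, \mathrm{rl}_Y(j)])
\end{aligned}
\tag{A.23}
\]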

Note that these three cases are exhaustive, that is, one of the Equations A.20, A.21, or A.22 has to apply. Further, the cheapest option of these three has to apply, otherwise $c(M, X_i, Y_j) > D_c(X_i, Y_j)$, which would be a contradiction. The minimum of Equations A.20, A.21, and A.22 yields Equation A.18. If we then plug Equation A.23 into Equation A.18 we obtain Equation A.17.

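Finally, consider Equation A.19. We obtain this equation by setting $k = i$ and $l = j$ in Equation A.17, thus yielding:

\[
\begin{aligned}
D_c(\tilde{x}_i, \tilde{y}_j) \overset{\text{Lemma A.4}}{=} D_c(X[i, \mathrm{rl}_X(i)], Y[j, \mathrm{rl}_Y(j)]) = \min\big\{ & c(x_i, -) + D_c(X[i+1, \mathrm{rl}_X(i)], Y[j, \mathrm{rl}_Y(j)]), \\
& c(-, y_j) + D_c(X[i, \mathrm{rl}_X(i)], Y[j+1, \mathrm{rl}_Y(j)]), \\
& c(x_i, y_j) + D_c(X[i+1, \mathrm{rl}_X(i)], Y[j+1, \mathrm{rl}_Y(j)]) \\
& \quad + D_c(X[\mathrm{rl}_X(i)+1, \mathrm{rl}_X(i)], Y[\mathrm{rl}_Y(j)+1, \mathrm{rl}_Y(j)]) \big\}
\end{aligned}
\]

Note that $X[\mathrm{rl}_X(i) + 1, \mathrm{rl}_X(i)] = \epsilon$ and $Y[\mathrm{rl}_Y(j) + 1, \mathrm{rl}_Y(j)] = \epsilon$. Therefore, $D_c(X[\mathrm{rl}_X(i) + 1, \mathrm{rl}_X(i)], Y[\mathrm{rl}_Y(j) + 1, \mathrm{rl}_Y(j)]) = D_c(\epsilon, \epsilon) \overset{\text{Eq. A.14}}{=} 0$, which in turn yields Equation A.19.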

An example for the decompositions in Equations A.18 and A.19 is shown in Figure A.1.
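Equations A.14 to A.17 can be evaluated directly by a memoized recursion on forests. The sketch below works on the forests themselves rather than on index ranges: removing the first root $x$ of a forest $x(X_1), X_0$ leaves $X_1, X_0$ (the role of $X[i+1, \mathrm{rl}_X(k)]$), while $X_1$ and $X_0$ take the roles of $X[i+1, \mathrm{rl}_X(i)]$ and $X[\mathrm{rl}_X(i)+1, \mathrm{rl}_X(k)]$. The tuple encoding, the `GAP` stand-in, and the unit cost function are assumptions of this example; the recursion is exponentially less efficient than the keyroot-based dynamic program and is meant for checking small cases only.

```python
from functools import lru_cache

GAP = "-"  # assumed stand-in for the gap symbol


def unit_cost(a, b):
    """Assumed example cost function: 0 for equal symbols, 1 otherwise."""
    return 0 if a == b else 1


def t(label, *children):
    """Build a tree node (label, children)."""
    return (label, tuple(children))


def forest_mapping_distance(X, Y, c=unit_cost):
    """Evaluate D_c(X, Y) for forests given as tuples of (label, children)."""

    @lru_cache(maxsize=None)
    def D(X, Y):
        if not X and not Y:
            return 0  # (A.14)
        if not Y:
            (x, X1), X0 = X[0], X[1:]
            return c(x, GAP) + D(X1 + X0, Y)  # (A.15)
        if not X:
            (y, Y1), Y0 = Y[0], Y[1:]
            return c(GAP, y) + D(X, Y1 + Y0)  # (A.16)
        (x, X1), X0 = X[0], X[1:]
        (y, Y1), Y0 = Y[0], Y[1:]
        return min(  # (A.17)
            c(x, GAP) + D(X1 + X0, Y),          # delete x
            c(GAP, y) + D(X, Y1 + Y0),          # insert y
            c(x, y) + D(X1, Y1) + D(X0, Y0),    # replace x with y
        )

    return D(X, Y)


# The example subforests of Figure A.1: X_i = b(c, d), e and Y_j = f(g).
X = (t("b", t("c"), t("d")), t("e"))
Y = (t("f", t("g")),)
print(forest_mapping_distance(X, Y))  # 4 under unit costs
```

Under unit costs the value 4 arises, for instance, from mapping $b$ to $f$ and $c$ to $g$ (two replacements) and deleting $d$ and $e$.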

Using these decompositions, we can finally prove the invariants of Algorithm 2.1, which then imply the correctness of the algorithm.

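Lemma A.6. Let $\mathcal{A}$ be an alphabet, let $\tilde{x}$ and $\tilde{y}$ be trees over $\mathcal{A}$, and let $c$ be a cost function over $\mathcal{A}$. Then, after each completion of lines 6-26 in Algorithm 2.1 for the input $\tilde{x}$, $\tilde{y}$, and $c$ it holds for all $i \in \{k, \ldots, \mathrm{rl}_{\tilde{x}}(k)\}$ and all $j \in \{l, \ldots, \mathrm{rl}_{\tilde{y}}(l)\}$:

\[ D_{i,j} = D_c(\tilde{x}[i, \mathrm{rl}_{\tilde{x}}(k)], \tilde{y}[j, \mathrm{rl}_{\tilde{y}}(l)]) \text{ and} \tag{A.24} \]

\[ d_{i,j} = D_c(\tilde{x}_i, \tilde{y}_j) \tag{A.25} \]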

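Figure A.1: An illustration of the decompositions in Equations A.18 (top) and A.19 (bottom) for the example subforests $X_i = b(c, d), e$ and $Y_j = f(g)$. Top: $D_c\big((b(c,d), e), f(g)\big)$ branches into $c(b, -) + D_c\big((c, d, e), f(g)\big)$ (delete $b$), $D_c\big(b(c,d), f(g)\big) + D_c(e, \epsilon)$ (replace $b$ with $f$), and $c(-, f) + D_c\big((b(c,d), e), g\big)$ (insert $f$). Bottom: $D_c\big(b(c,d), f(g)\big)$ branches into $c(b, -) + D_c\big((c, d), f(g)\big)$ (delete $b$), $c(b, f) + D_c\big((c, d), g\big)$ (replace $b$ with $f$), and $c(-, f) + D_c\big(b(c,d), g\big)$ (insert $f$).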

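Proof. We perform an inductive argument over $i$ and $j$ in descending order. First, consider the base cases. If $i = \mathrm{rl}_{\tilde{x}}(k) + 1$ and $j = \mathrm{rl}_{\tilde{y}}(l) + 1$, we obtain $D_c(\tilde{x}[\mathrm{rl}_{\tilde{x}}(k) + 1, \mathrm{rl}_{\tilde{x}}(k)], \tilde{y}[\mathrm{rl}_{\tilde{y}}(l) + 1, \mathrm{rl}_{\tilde{y}}(l)]) = D_c(\epsilon, \epsilon) \overset{\text{Eq. A.14}}{=} 0$, which is correctly computed in line 6.

Further, if $i \le \mathrm{rl}_{\tilde{x}}(k)$ and $j = \mathrm{rl}_{\tilde{y}}(l) + 1$, we obtain $D_c(\tilde{x}[i, \mathrm{rl}_{\tilde{x}}(k)], \tilde{y}[\mathrm{rl}_{\tilde{y}}(l) + 1, \mathrm{rl}_{\tilde{y}}(l)]) = D_c(\tilde{x}[i, \mathrm{rl}_{\tilde{x}}(k)], \epsilon) \overset{\text{Eq. A.15}}{=} c(x_i, -) + D_c(\tilde{x}[i+1, \mathrm{rl}_{\tilde{x}}(k)], \tilde{y}[\mathrm{rl}_{\tilde{y}}(l) + 1, \mathrm{rl}_{\tilde{y}}(l)])$, which is correctly computed in lines 7-9 for all $i \in \{k, \ldots, \mathrm{rl}_{\tilde{x}}(k)\}$.

Similarly, if $j \le \mathrm{rl}_{\tilde{y}}(l)$ and $i = \mathrm{rl}_{\tilde{x}}(k) + 1$, we obtain $D_c(\tilde{x}[\mathrm{rl}_{\tilde{x}}(k) + 1, \mathrm{rl}_{\tilde{x}}(k)], \tilde{y}[j, \mathrm{rl}_{\tilde{y}}(l)]) = D_c(\epsilon, \tilde{y}[j, \mathrm{rl}_{\tilde{y}}(l)]) \overset{\text{Eq. A.16}}{=} c(-, y_j) + D_c(\tilde{x}[\mathrm{rl}_{\tilde{x}}(k) + 1, \mathrm{rl}_{\tilde{x}}(k)], \tilde{y}[j+1, \mathrm{rl}_{\tilde{y}}(l)])$, which is correctly computed in lines 10-12 for all $j \in \{l, \ldots, \mathrm{rl}_{\tilde{y}}(l)\}$.

Now, consider the case $i \le \mathrm{rl}_{\tilde{x}}(k)$ and $j \le \mathrm{rl}_{\tilde{y}}(l)$. Per induction, we already know that

\[
\begin{aligned}
D_{i+1,j} &= D_c(\tilde{x}[i+1, \mathrm{rl}_{\tilde{x}}(k)], \tilde{y}[j, \mathrm{rl}_{\tilde{y}}(l)]), \\
D_{i,j+1} &= D_c(\tilde{x}[i, \mathrm{rl}_{\tilde{x}}(k)], \tilde{y}[j+1, \mathrm{rl}_{\tilde{y}}(l)]), \\
D_{i+1,j+1} &= D_c(\tilde{x}[i+1, \mathrm{rl}_{\tilde{x}}(k)], \tilde{y}[j+1, \mathrm{rl}_{\tilde{y}}(l)]), \text{ and} \\
D_{\mathrm{rl}_{\tilde{x}}(i)+1, \mathrm{rl}_{\tilde{y}}(j)+1} &= D_c(\tilde{x}[\mathrm{rl}_{\tilde{x}}(i) + 1, \mathrm{rl}_{\tilde{x}}(k)], \tilde{y}[\mathrm{rl}_{\tilde{y}}(j) + 1, \mathrm{rl}_{\tilde{y}}(l)]).
\end{aligned}
\]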

Now, distinguish the following cases.

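If $\mathrm{rl}_{\tilde{x}}(i) = \mathrm{rl}_{\tilde{x}}(k)$ and $\mathrm{rl}_{\tilde{y}}(j) = \mathrm{rl}_{\tilde{y}}(l)$, we obtain $\tilde{x}[i, \mathrm{rl}_{\tilde{x}}(k)] = \tilde{x}[i, \mathrm{rl}_{\tilde{x}}(i)] \overset{\text{Lemma A.4}}{=} \tilde{x}_i$ and $\tilde{y}[j, \mathrm{rl}_{\tilde{y}}(l)] = \tilde{y}[j, \mathrm{rl}_{\tilde{y}}(j)] \overset{\text{Lemma A.4}}{=} \tilde{y}_j$, such that $D_c(\tilde{x}[i, \mathrm{rl}_{\tilde{x}}(k)], \tilde{y}[j, \mathrm{rl}_{\tilde{y}}(l)]) = D_c(\tilde{x}_i, \tilde{y}_j)$, which can be computed according to Equation A.19. Therefore, lines 16-18 of Algorithm 2.1 ensure that $D_{i,j} = D_c(\tilde{x}[i, \mathrm{rl}_{\tilde{x}}(k)], \tilde{y}[j, \mathrm{rl}_{\tilde{y}}(l)])$. Further, because $D_{i,j}$ is now equivalent to $D_c(\tilde{x}_i, \tilde{y}_j)$, line 19 is correct as well.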

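If $\mathrm{rl}_{\tilde{x}}(i) \ne \mathrm{rl}_{\tilde{x}}(k)$ or $\mathrm{rl}_{\tilde{y}}(j) \ne \mathrm{rl}_{\tilde{y}}(l)$, the decomposition of $D_c(\tilde{x}[i, \mathrm{rl}_{\tilde{x}}(k)], \tilde{y}[j, \mathrm{rl}_{\tilde{y}}(l)])$ according to Equation A.18 applies. Accordingly, lines 21-23 of Algorithm 2.1 ensure $D_{i,j} = D_c(\tilde{x}[i, \mathrm{rl}_{\tilde{x}}(k)], \tilde{y}[j, \mathrm{rl}_{\tilde{y}}(l)])$, under the condition that $d_{i,j} = D_c(\tilde{x}_i, \tilde{y}_j)$. We know that this condition holds if we have executed lines 6-26 before with the keyroots $k_{\tilde{x}}(i)$ and $k_{\tilde{y}}(j)$. Because the loops in lines 4-5 iterate the keyroots in descending order, it remains to show that $k < k_{\tilde{x}}(i)$ and $l \le k_{\tilde{y}}(j)$, or $k \le k_{\tilde{x}}(i)$ and $l < k_{\tilde{y}}(j)$.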

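First, consider the case $\mathrm{rl}_{\tilde{x}}(i) \ne \mathrm{rl}_{\tilde{x}}(k)$. In that case, $k \ne k_{\tilde{x}}(i)$ and $k \ne i$, otherwise $\mathrm{rl}_{\tilde{x}}(k) = \mathrm{rl}_{\tilde{x}}(i) = \mathrm{rl}_{\tilde{x}}(k_{\tilde{x}}(i))$, which is a contradiction. Further, it must hold $k < i \le \mathrm{rl}_{\tilde{x}}(k)$, otherwise $i$ would not be accessed in the loop in line 13. In turn, Equation A.7 implies that $k \in \mathrm{anc}_{\tilde{x}}(i)$. Further, due to the definition of keyroots, $k_{\tilde{x}}(i) \le i \le \mathrm{rl}_{\tilde{x}}(i) = \mathrm{rl}_{\tilde{x}}(k_{\tilde{x}}(i))$. Now, if $k_{\tilde{x}}(i) = i$, we obtain $k < i = k_{\tilde{x}}(i)$ as desired. Otherwise, we obtain $k_{\tilde{x}}(i) < i \le \mathrm{rl}_{\tilde{x}}(k_{\tilde{x}}(i))$, such that Equation A.7 implies that $k_{\tilde{x}}(i) \in \mathrm{anc}_{\tilde{x}}(i)$. Now, assume that $k_{\tilde{x}}(i) < k$. In that case, Equation A.8 implies $k_{\tilde{x}}(i) \in \mathrm{anc}_{\tilde{x}}(k)$. Consequently, Equation A.6 tells us that $\mathrm{rl}_{\tilde{x}}(k) \le \mathrm{rl}_{\tilde{x}}(k_{\tilde{x}}(i))$. However, due to $k \in \mathrm{anc}_{\tilde{x}}(i)$, Equation A.6 also tells us that $\mathrm{rl}_{\tilde{x}}(k) \ge \mathrm{rl}_{\tilde{x}}(i) = \mathrm{rl}_{\tilde{x}}(k_{\tilde{x}}(i))$, such that $\mathrm{rl}_{\tilde{x}}(k) = \mathrm{rl}_{\tilde{x}}(k_{\tilde{x}}(i)) = \mathrm{rl}_{\tilde{x}}(i)$, which is a contradiction. Therefore, we can conclude that $k < k_{\tilde{x}}(i)$. It remains to show that $l \le k_{\tilde{y}}(j)$. If $\mathrm{rl}_{\tilde{y}}(l) = \mathrm{rl}_{\tilde{y}}(j)$, then $l = k_{\tilde{y}}(j)$, because the minimum is unique. Otherwise, $\mathrm{rl}_{\tilde{y}}(l) \ne \mathrm{rl}_{\tilde{y}}(j)$.

If $\mathrm{rl}_{\tilde{y}}(j) \ne \mathrm{rl}_{\tilde{y}}(l)$, we know that $l \ne k_{\tilde{y}}(j)$ and $l \ne j$, otherwise $\mathrm{rl}_{\tilde{y}}(l) = \mathrm{rl}_{\tilde{y}}(j) = \mathrm{rl}_{\tilde{y}}(k_{\tilde{y}}(j))$, which is a contradiction. Further, it must hold $l < j \le \mathrm{rl}_{\tilde{y}}(l)$, otherwise $j$ would not be accessed in the loop in line 14. In turn, Equation A.7 implies that $l \in \mathrm{anc}_{\tilde{y}}(j)$. Further, due to the definition of keyroots, $k_{\tilde{y}}(j) \le j \le \mathrm{rl}_{\tilde{y}}(j) = \mathrm{rl}_{\tilde{y}}(k_{\tilde{y}}(j))$. Now, if $k_{\tilde{y}}(j) = j$, we obtain $l < j = k_{\tilde{y}}(j)$ as desired. Otherwise, we obtain $k_{\tilde{y}}(j) < j \le \mathrm{rl}_{\tilde{y}}(k_{\tilde{y}}(j))$, such that Equation A.7 implies that $k_{\tilde{y}}(j) \in \mathrm{anc}_{\tilde{y}}(j)$. Now, assume that $k_{\tilde{y}}(j) < l$. In that case, Equation A.8 implies $k_{\tilde{y}}(j) \in \mathrm{anc}_{\tilde{y}}(l)$. Consequently, Equation A.6 tells us that $\mathrm{rl}_{\tilde{y}}(l) \le \mathrm{rl}_{\tilde{y}}(k_{\tilde{y}}(j))$. However, due to $l \in \mathrm{anc}_{\tilde{y}}(j)$, Equation A.6 also tells us that $\mathrm{rl}_{\tilde{y}}(l) \ge \mathrm{rl}_{\tilde{y}}(j) = \mathrm{rl}_{\tilde{y}}(k_{\tilde{y}}(j))$, such that $\mathrm{rl}_{\tilde{y}}(l) = \mathrm{rl}_{\tilde{y}}(k_{\tilde{y}}(j)) = \mathrm{rl}_{\tilde{y}}(j)$, which is a contradiction.

Therefore, we can conclude that $l < k_{\tilde{y}}(j)$. It remains to show that $k \le k_{\tilde{x}}(i)$. If $\mathrm{rl}_{\tilde{x}}(k) = \mathrm{rl}_{\tilde{x}}(i)$, then $k = k_{\tilde{x}}(i)$, because the minimum is unique. Otherwise, $\mathrm{rl}_{\tilde{x}}(k) \ne \mathrm{rl}_{\tilde{x}}(i)$, which implies $k < k_{\tilde{x}}(i)$, as we have shown above.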

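Now, we can finally complete the proof. First, note that Lemma A.3 implies that $d_c(\tilde{x}, \tilde{y})$ is equivalent to $D_c(\tilde{x}_1, \tilde{y}_1)$ if $c$ fulfills the triangular inequality and is self-equal.

Further, Lemma A.6 tells us that the output of Algorithm 2.1, $d_{1,1}$, is equal to $D_c(\tilde{x}_1, \tilde{y}_1)$ if keyroots $k \in K(\tilde{x})$ and $l \in K(\tilde{y})$ exist such that $k \le 1 \le \mathrm{rl}_{\tilde{x}}(k)$ and $l \le 1 \le \mathrm{rl}_{\tilde{y}}(l)$. Per definition of outermost right leaves, we know that $\mathrm{rl}_{\tilde{x}}(1) = |\tilde{x}|$ and $\mathrm{rl}_{\tilde{y}}(1) = |\tilde{y}|$. Further, per definition of keyroots, we know that $k_{\tilde{x}}(|\tilde{x}|) = \min\{ k \mid \mathrm{rl}_{\tilde{x}}(k) = \mathrm{rl}_{\tilde{x}}(|\tilde{x}|) \} = 1$, and $k_{\tilde{y}}(|\tilde{y}|) = \min\{ l \mid \mathrm{rl}_{\tilde{y}}(l) = \mathrm{rl}_{\tilde{y}}(|\tilde{y}|) \} = 1$ because 1 is the lowest possible index. Therefore, $1 \in K(\tilde{x})$ and $1 \in K(\tilde{y})$, which concludes the overall proof.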
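Algorithm 2.1 itself is not reproduced in this section. The following sketch is reconstructed only from the invariants in Lemma A.6 (Equations A.24 and A.25) and the decompositions A.18 and A.19, so its line structure need not match the numbered lines referenced in the proof. The tree encoding, the `GAP` stand-in, the unit cost function, and all function names are assumptions of this example.

```python
GAP = "-"  # assumed stand-in for the gap symbol


def unit_cost(a, b):
    """Assumed example cost function: 0 for equal symbols, 1 otherwise."""
    return 0 if a == b else 1


def t(label, *children):
    """Build a tree node (label, children)."""
    return (label, tuple(children))


def preorder(tree):
    """1-indexed pre-order labels and outermost right leaves of a tree."""
    labels, rl = [None], [None]

    def visit(node):
        label, children = node
        labels.append(label)
        rl.append(None)
        i = len(labels) - 1
        for child in children:
            visit(child)
        rl[i] = len(labels) - 1

    visit(tree)
    return labels, rl


def keyroots(rl):
    """Smallest index per outermost right leaf, in descending order."""
    first = {}
    for k in range(1, len(rl)):
        first.setdefault(rl[k], k)
    return sorted(first.values(), reverse=True)


def tree_edit_distance(x_tree, y_tree, c=unit_cost):
    x, rlx = preorder(x_tree)
    y, rly = preorder(y_tree)
    m, n = len(x) - 1, len(y) - 1
    # d[i][j] stores D_c between the subtrees rooted at i and j (A.25)
    d = [[0] * (n + 2) for _ in range(m + 2)]
    for k in keyroots(rlx):
        for l in keyroots(rly):
            # D[i][j] stores D_c(x[i, rl(k)], y[j, rl(l)]) (A.24)
            D = [[0] * (n + 2) for _ in range(m + 2)]
            ek, el = rlx[k] + 1, rly[l] + 1  # indices of the empty forests
            for i in range(rlx[k], k - 1, -1):
                D[i][el] = c(x[i], GAP) + D[i + 1][el]  # (A.15)
            for j in range(rly[l], l - 1, -1):
                D[ek][j] = c(GAP, y[j]) + D[ek][j + 1]  # (A.16)
            for i in range(rlx[k], k - 1, -1):
                for j in range(rly[l], l - 1, -1):
                    delete = c(x[i], GAP) + D[i + 1][j]
                    insert = c(GAP, y[j]) + D[i][j + 1]
                    if rlx[i] == rlx[k] and rly[j] == rly[l]:
                        # both subforests are subtrees: use (A.19)
                        D[i][j] = min(delete, insert,
                                      c(x[i], y[j]) + D[i + 1][j + 1])
                        d[i][j] = D[i][j]
                    else:
                        # general case (A.18), reusing the stored d[i][j]
                        D[i][j] = min(delete, insert,
                                      d[i][j] + D[rlx[i] + 1][rly[j] + 1])
    return d[1][1]


x_tree = t("a", t("b", t("c"), t("d")), t("e"))
y_tree = t("f", t("g"))
print(tree_edit_distance(x_tree, y_tree))  # 5 under unit costs
```

Iterating the keyroots in descending order is what guarantees that `d[i][j]` has already been filled in whenever the general case reads it, which is the ordering argument made in the proof of Lemma A.6.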
