D Proofs for Positive Reduction for $\mathcal{EL}_{\mathsf{lhs}}$ TBoxes

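Since our target TBox $\mathcal{T}$ contains $\mathcal{EL}_{\mathsf{lhs}}$ concept inclusions, we can find counterexamples by posing atomic queries $(\mathcal{T},\mathcal{A}) \models A(b)$ to the oracle, with $A \in \mathsf{N_C} \cap \Sigma_{\mathcal{T}}$ and $b \in \mathsf{Ind}(\mathcal{A})$. We start our proofs in this section with a straightforward argument for this fact.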

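Proof of Lemma 4. If $(\mathcal{A}, D(a))$ is a positive counterexample then there exist a concept name $A$ and an individual $b \in \mathsf{Ind}(\mathcal{A})$ such that $(\mathcal{A}, A(b))$ is also a counterexample.

Proof. As $(\mathcal{A}, D(a))$ is a positive counterexample, $(\mathcal{T},\mathcal{A}) \models D(a)$ and $(\mathcal{H},\mathcal{A}) \not\models D(a)$. Then, by Lemma 14, $\mathcal{I}_{\mathcal{T},\mathcal{A}} \models D(a)$ and $\mathcal{I}_{\mathcal{H},\mathcal{A}} \not\models D(a)$, where $\mathcal{I}_{\mathcal{T},\mathcal{A}}$ and $\mathcal{I}_{\mathcal{H},\mathcal{A}}$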

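In order to compute a tree interpretation of an $\mathcal{EL}$ concept $C$ we first minimize an ABox $\mathcal{A}$ so that $|\mathcal{A}|$ is bounded by $|\mathcal{T}|$. Recall that we denote by $\mathcal{A}^{-a}$ the result of removing from $\mathcal{A}$ all ABox assertions where $a$ occurs. That is, $\mathcal{A}^{-a} = \mathcal{A} \setminus \mathcal{A}^{a}$, where $\mathcal{A}^{a} = \{r(a,b) \mid b \in \mathsf{Ind}(\mathcal{A}), r \in \Sigma_{\mathcal{T}} \cap \mathsf{N_R}\} \cup \{r(b,a) \mid b \in \mathsf{Ind}(\mathcal{A}), r \in \Sigma_{\mathcal{T}} \cap \mathsf{N_R}\} \cup \{A(a) \mid A \in \Sigma_{\mathcal{T}} \cap \mathsf{N_C}\}$. Also, $\mathcal{A}^{-r(a,b)}$ is obtained by removing a role assertion $r(a,b)$ from $\mathcal{A}$. Let $\mathcal{I}_{\mathcal{A}}$ be the canonical model of an ABox $\mathcal{A}$. We say that $\mathcal{I}_{\mathcal{A}}$ is a countermodel if $\mathcal{I}_{\mathcal{A}} \not\models \mathcal{T}$ and $\mathcal{I}_{\mathcal{A}} \models \mathcal{H}$. We define $\mathcal{A}$ as minimal if the following conditions are satisfied:

1. $\mathcal{I}_{\mathcal{A}}$ is a countermodel;

2. $\mathcal{I}_{\mathcal{A}^{-a}} \models \mathcal{T}$ for every $a \in \mathsf{Ind}(\mathcal{A})$; and

3. $\mathcal{I}_{\mathcal{A}^{-r(a,b)}} \models \mathcal{T}$ for every role assertion $r(a,b) \in \mathcal{A}$.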
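The three conditions can be checked mechanically on small examples. The following is a minimal sketch, assuming a hypothetical encoding that is not taken from the paper: an ABox is a pair of sets of tuples, an $\mathcal{EL}$ concept is a pair (concept names, existential restrictions), and a TBox is a list of pairs read as $C \sqsubseteq A$. For brevity it removes all assertions mentioning an individual, without restricting to the signature $\Sigma_{\mathcal{T}}$.

```python
# Hypothetical encoding (an assumption of this sketch, not the paper's):
#   concept assertions: set of (concept_name, individual)
#   role assertions:    set of (role_name, source, target)
#   EL concept:         (set of concept names, list of (role_name, EL concept))
#   TBox:               list of (EL concept, concept_name), read as C ⊑ A

def holds(concept, x, concepts, roles):
    """True if individual x is in the extension of the EL concept."""
    names, existentials = concept
    if any((name, x) not in concepts for name in names):
        return False
    return all(
        any(holds(filler, y, concepts, roles)
            for (r, src, y) in roles if r == role and src == x)
        for role, filler in existentials
    )

def individuals(concepts, roles):
    return {a for _, a in concepts} | {x for _, a, b in roles for x in (a, b)}

def is_model(tbox, concepts, roles):
    """True if every inclusion C ⊑ A holds at every individual."""
    return all(
        (name, x) in concepts
        for lhs, name in tbox
        for x in individuals(concepts, roles)
        if holds(lhs, x, concepts, roles)
    )

def is_minimal(target, hypothesis, concepts, roles):
    # Condition 1: the ABox is a countermodel.
    if is_model(target, concepts, roles) or not is_model(hypothesis, concepts, roles):
        return False
    # Condition 2: dropping any individual yields a model of the target.
    for a in individuals(concepts, roles):
        kept_concepts = {(n, x) for (n, x) in concepts if x != a}
        kept_roles = {(r, x, y) for (r, x, y) in roles if a not in (x, y)}
        if not is_model(target, kept_concepts, kept_roles):
            return False
    # Condition 3: dropping any role assertion yields a model of the target.
    return all(is_model(target, concepts, roles - {ra}) for ra in roles)

# Target TBox {∃r.B ⊑ A}, empty hypothesis.
B = ({"B"}, [])
exists_r_B = (set(), [("r", B)])
target = [(exists_r_B, "A")]
hypothesis = []

minimal_abox = ({("B", "b")}, {("r", "a", "b")})
padded_abox = ({("B", "b")}, {("r", "a", "b"), ("r", "b", "c")})
print(is_minimal(target, hypothesis, *minimal_abox))
print(is_minimal(target, hypothesis, *padded_abox))
```

In this example the first ABox passes all three conditions, while the second fails Condition 2 because the individual `c` can be removed without losing the violation of $\exists r.B \sqsubseteq A$ at `a`.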

The following lemma shows that the size of a minimal ABox is polynomial in $|\mathcal{T}|$.

Lemma 15. Let $\mathcal{I}_{\mathcal{A}}$ be the canonical model of a minimal ABox $\mathcal{A}$. Then, $|\Delta^{\mathcal{I}_{\mathcal{A}}}| \le |\mathcal{T}|$.

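Proof. By Condition 1 of minimal ABoxes, $\mathcal{I}_{\mathcal{A}}$ is a countermodel. So $\mathcal{I}_{\mathcal{A}} \not\models \mathcal{T}$. Then there is $C \sqsubseteq A \in \mathcal{T}$ such that $a \in (C \setminus A)^{\mathcal{I}_{\mathcal{A}}}$, for some $a \in \Delta^{\mathcal{I}_{\mathcal{A}}}$. As $a \in C^{\mathcal{I}_{\mathcal{A}}}$, by Lemma 10 there is a homomorphism $h : \mathcal{I}_C \to \mathcal{I}_{\mathcal{A}}$ mapping $\rho_C$ to $a$, where $\rho_C$ is the root of $\mathcal{I}_C$. We need to show that $h$ is surjective. Suppose this is not the case. Then, there is $d \in \Delta^{\mathcal{I}_{\mathcal{A}}}$ such that $d \notin \mathsf{Im}_h$, where $\mathsf{Im}_h = \{e \in \Delta^{\mathcal{I}_{\mathcal{A}}} \mid e = h(p) \text{ for some } p \in \Delta^{\mathcal{I}_C}\}$. Now, denote as $\mathcal{I}_{\mathcal{A}^{-d}}$ the result of removing $d \notin \mathsf{Im}_h$ from $\mathcal{I}_{\mathcal{A}}$. Since $d$ is not in the image of $h$, $h$ is still a homomorphism into $\mathcal{I}_{\mathcal{A}^{-d}}$, so $a \in C^{\mathcal{I}_{\mathcal{A}^{-d}}}$. Since $\mathcal{I}_{\mathcal{A}^{-d}}$ is a subinterpretation of $\mathcal{I}_{\mathcal{A}}$, if $a \notin A^{\mathcal{I}_{\mathcal{A}}}$ then $a \notin A^{\mathcal{I}_{\mathcal{A}^{-d}}}$. So $a \in (C \setminus A)^{\mathcal{I}_{\mathcal{A}^{-d}}}$, which means that $\mathcal{I}_{\mathcal{A}^{-d}} \not\models \mathcal{T}$. This contradicts the fact that $\mathcal{I}_{\mathcal{A}^{-d}} \models \mathcal{T}$ for any element $d$ from $\Delta^{\mathcal{I}_{\mathcal{A}}}$ (Condition 2 of minimal ABoxes). Since $C \sqsubseteq A \in \mathcal{T}$, we know that $|\Delta^{\mathcal{I}_C}| \le |\mathcal{T}|$. Thus, $|\Delta^{\mathcal{I}_{\mathcal{A}}}| \le |\Delta^{\mathcal{I}_C}| \le |\mathcal{T}|$. □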

Algorithm 2, presented in Section 3, minimizes $\mathcal{A}$ to an $\mathcal{A}'$ with the properties described above. Since the algorithm receives a positive counterexample, we know that $\mathcal{I}_{\mathcal{A}}$ is not a model of $\mathcal{T}$, that is, $\mathcal{I}_{\mathcal{A}} \not\models \mathcal{T}$. In order to satisfy Condition 1 above and reduce $\mathcal{A}$ (Conditions 2 and 3), Algorithm 2 applies the rules ‘Concept Saturate’, ‘Domain Minimize’ and ‘Role Minimize’, as described in Section 3.

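Proof of Lemma 5. Given a positive counterexample $(\mathcal{A}, D(a))$ with $D \in \mathsf{N_C}$, Algorithm 2 computes in polynomially many steps with respect to $|\mathcal{A}|$, $|\mathcal{H}|$, and $|\mathcal{T}|$ an ABox $\mathcal{A}'$ such that $|\mathsf{Ind}(\mathcal{A}')| \le |\mathcal{T}|$ and $(\mathcal{A}', A(b))$ is a positive counterexample, for some concept name $A$ and individual $b \in \mathsf{Ind}(\mathcal{A}')$.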

Proof. By Lemma 15, if $\mathcal{A}$ is a minimal ABox then $|\mathsf{Ind}(\mathcal{A})| \le |\mathcal{T}|$. Then, we only need the following claims to show this lemma.

Claim 1 Algorithm 2 computes a minimal ABox $\mathcal{A}$.

For Condition 1, we have that, in Line 3, Algorithm 2 concept saturates $\mathcal{A}$ with $\mathcal{H}$. Then, after computing Line 3, we have that $\mathcal{I}_{\mathcal{A}} \models \mathcal{H}$. Since the minimization rules described above do not remove any concept name implied by $\mathcal{H}$, the ABox computed by the algorithm is a model of $\mathcal{H}$ in all steps that follow Line 3. By definition of the rules, at least one counterexample is entailed by $(\mathcal{T},\mathcal{A})$, which is the counterexample where the rules are being applied. So for all iterations of Algorithm 2, $\mathcal{I}_{\mathcal{A}} \not\models \mathcal{T}$.

For Condition 2, suppose there is $\mathcal{A}^{-d}$ such that $\mathcal{I}_{\mathcal{A}^{-d}} \not\models \mathcal{T}$. Then there is $C \sqsubseteq A \in \mathcal{T}$ and $a \in \Delta^{\mathcal{I}_{\mathcal{A}^{-d}}}$ such that $a \in (C \setminus A)^{\mathcal{I}_{\mathcal{A}^{-d}}}$. This contradicts the fact that, in Line 6, domain minimization was applied in $\mathcal{A}$ for all counterexamples. Thus, $\mathcal{I}_{\mathcal{A}^{-d}} \models \mathcal{T}$. The argument is similar for role minimization (Condition 3).

Claim 2 Algorithm 2 runs in polynomially many steps with respect to $|\mathcal{A}|$ and $|\mathsf{N_C} \cap \Sigma_{\mathcal{T}}|$, where $\mathsf{N_C} \cap \Sigma_{\mathcal{T}}$ are the concept names in the vocabulary.

We know that the number of possible concept name assertions in $\mathcal{A}$ is $|\mathsf{N_C} \cap \Sigma_{\mathcal{T}}| \cdot |\mathsf{Ind}(\mathcal{A})|$. So, in Line 3, the number of applications of the rule Concept Saturate with $\mathcal{H}$ is bounded by $|\mathsf{N_C} \cap \Sigma_{\mathcal{T}}| \cdot |\mathsf{Ind}(\mathcal{A})|$. Also, the number of iterations in Line 4 is at most

(Figure 4a). Assume Algorithm 2 starts minimizing $\mathcal{A}$ in Line 4 with the counterexample $A_1(b)$. The algorithm eliminates $s(b,d)$ and $t(d,b)$ from $\mathcal{A}$. As a result, $A_3(d)$ is not a counterexample any more. In the next iteration, the algorithm tries to minimize $\mathcal{A}$ with $A_2(a)$, which does not eliminate any other assertion from $\mathcal{A}$. So $\mathcal{A}$ is now minimal. The result of minimizing $\mathcal{A}$ is shown in Figure 4b; it now contains only $A_1(b)$ and $A_2(a)$ as counterexamples.

We have seen that a minimal ABox $\mathcal{A}$ is a countermodel bounded by $|\mathcal{T}|$. Algorithm 3, presented in Section 3, is based on two operations: (i) minimization, presented above, and (ii) unfolding. The unfolding operation doubles the length of a cycle in $\mathcal{A}$. By increasing the length of cycles and then minimizing, the algorithm proceeds unfolding elements until $\mathcal{A}$ is tree shaped. We say that $\mathcal{A}$ has an (undirected) cycle if there is a finite sequence $a_0 \cdot r_1 \cdot a_1 \cdot \ldots \cdot r_k \cdot a_k$ such that (i) $a_0 = a_k$ and (ii) there are mutually distinct assertions of the form $r_{i+1}(a_i, a_{i+1})$ or $r_{i+1}(a_{i+1}, a_i)$ in $\mathcal{A}$, for $0 \le i < k$.
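Whether an ABox contains an undirected cycle in this sense can be tested with a union-find pass over its role assertions. The sketch below assumes a hypothetical encoding of role assertions as `(role, source, target)` tuples; the function name is illustrative and does not come from the paper. Because the assertions in a cycle only need to be mutually distinct, a self-loop and a pair of parallel or opposite assertions between the same two individuals both count as cycles.

```python
def has_undirected_cycle(role_assertions):
    """True if the distinct role assertions, read as undirected edges, close a cycle."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for _, a, b in set(role_assertions):
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            # a and b were already connected (or a == b): this edge closes a cycle.
            return True
        parent[root_a] = root_b
    return False

tree = [("r", "a", "b"), ("s", "a", "c")]
two_cycle = [("r", "a", "b"), ("s", "b", "a")]
self_loop = [("r", "a", "a")]
triangle = [("r", "a", "b"), ("r", "b", "c"), ("r", "c", "a")]
print([has_undirected_cycle(x) for x in (tree, two_cycle, self_loop, triangle)])
```

This only detects the existence of a cycle; it does not return the sequence $a_0 \cdot r_1 \cdot \ldots \cdot a_k$ that the unfolding operation needs.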

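For a cycle $c = a_0 \cdot r_1 \cdot a_1 \cdot \ldots \cdot r_k \cdot a_k$, denote as $\mathsf{nodes}(c) = \{a_0, a_1, \ldots, a_{k-1}\}$ the set of individuals that occur in $c$. Also, $\mathsf{roles}(c) = \{r_1, r_2, \ldots, r_k\}$ is the set of roles that occur in $c$. We denote by $\hat{a}$ the copy of an element $a$ created by the unfolding operation described below. The set of copies of individuals that occur in $c$ is denoted by $\mathsf{nodes}(\hat{c}) = \{\hat{a}_0, \hat{a}_1, \ldots, \hat{a}_{k-1}\}$. Let $\mathcal{I}_{\mathcal{A}}$ be the canonical interpretation of an ABox $\mathcal{A}$. An element $a \in \Delta^{\mathcal{I}_{\mathcal{A}}}$ is folded if there is a cycle $c = a_0 \cdot r_1 \cdot a_1 \cdot \ldots \cdot r_k \cdot a_k$ with $a = a_0 = a_k$. Without loss of generality we assume that $r_1(a_0, a_1) \in \mathcal{A}$. The unfolding of $c$ is described below.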

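1. We first open the cycle by removing $r_1(a_0, a_1)$ from $\mathcal{A}$. So $r_1^{\mathcal{I}_{\mathcal{A}}} := r_1^{\mathcal{I}_{\mathcal{A}}} \setminus \{(a_0, a_1)\}$.

2. Then we create copies of the nodes in the cycle:

– $\Delta^{\mathcal{I}_{\mathcal{A}}} := \Delta^{\mathcal{I}_{\mathcal{A}}} \cup \{\hat{b} \mid b \in \mathsf{nodes}(c)\}$

– $A^{\mathcal{I}_{\mathcal{A}}} := A^{\mathcal{I}_{\mathcal{A}}} \cup \{\hat{b} \mid b \in A^{\mathcal{I}_{\mathcal{A}}}\}$

– $r^{\mathcal{I}_{\mathcal{A}}} := r^{\mathcal{I}_{\mathcal{A}}} \cup \{(\hat{b}, \hat{d}) \mid (b, d) \in r^{\mathcal{I}_{\mathcal{A}}}\} \cup \{(\hat{b}, e) \mid (b, e) \in r^{\mathcal{I}_{\mathcal{A}}}, e \notin \mathsf{nodes}(c)\}$

Since copies are created only for the nodes of the cycle, $\hat{b}$ and $\hat{d}$ in the last two items range over $b, d \in \mathsf{nodes}(c)$.

3. As a third step we close the cycle again, now with double size. So we update $r_1^{\mathcal{I}_{\mathcal{A}}} := r_1^{\mathcal{I}_{\mathcal{A}}} \cup \{(a_0, \hat{a}_1), (\hat{a}_0, a_1)\}$.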
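The three steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: assertions are encoded as tuples, the copy $\hat{b}$ of a node `b` is represented by the hypothetical pair `("copy", b)`, and copies are created only for nodes of the cycle.

```python
def unfold(concepts, roles, cycle_nodes, first_edge):
    """Unfold a cycle; first_edge = (r1, a0, a1) is the assertion removed in Step 1.

    concepts: set of (concept_name, individual)
    roles:    set of (role_name, source, target)
    """
    if first_edge not in roles:
        raise ValueError("first_edge must be a role assertion of the ABox")
    r1, a0, a1 = first_edge
    nodes = set(cycle_nodes)
    copy = {b: ("copy", b) for b in nodes}  # hypothetical naming for b-hat

    # Step 1: open the cycle.
    opened = set(roles) - {first_edge}

    # Step 2: copy concept memberships and outgoing edges of the cycle nodes.
    new_concepts = set(concepts) | {(name, copy[b]) for (name, b) in concepts if b in nodes}
    new_roles = set(opened)
    for r, b, d in opened:
        if b in nodes:
            # Target is the copy of d inside the cycle, d itself outside of it.
            new_roles.add((r, copy[b], copy[d] if d in nodes else d))

    # Step 3: close the cycle again, now with double length.
    new_roles |= {(r1, a0, copy[a1]), (r1, copy[a0], a1)}
    return new_concepts, new_roles

# Triangle a -> b -> c -> a with A(a); unfolding gives a cycle of length 6.
concepts = {("A", "a")}
roles = {("r", "a", "b"), ("r", "b", "c"), ("r", "c", "a")}
new_concepts, new_roles = unfold(concepts, roles, ["a", "b", "c"], ("r", "a", "b"))
print(sorted(new_roles, key=str))
```

On the triangle, the result is the cycle $a \to \hat{b} \to \hat{c} \to \hat{a} \to b \to c \to a$, and the concept assertion $A(a)$ is duplicated for $\hat{a}$.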

We now show that our unfolding maintains the invariant that if $A(a)$ is a counterexample for $(\mathcal{T},\mathcal{A})$ relative to $(\mathcal{H},\mathcal{A})$ then $A(a)$ will remain a counterexample after applying this operation over an arbitrary cycle in $\mathcal{A}$. This is obtained by Lemmas 16 and 17.

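Lemma 16. Let $\mathcal{A}'$ be the result of unfolding a cycle $c$ in $\mathcal{A}$. Then the following relation $S \subseteq \Delta^{\mathcal{I}_{\mathcal{A}}} \times \Delta^{\mathcal{I}_{\mathcal{A}'}}$ is a simulation $\mathcal{I}_{\mathcal{A}} \Rightarrow \mathcal{I}_{\mathcal{A}'}$:

– for $a \in \Delta^{\mathcal{I}_{\mathcal{A}}} \setminus \mathsf{nodes}(c)$, $(a^{\mathcal{I}_{\mathcal{A}}}, a^{\mathcal{I}_{\mathcal{A}'}}) \in S$;

– for $a \in \mathsf{nodes}(c)$, $(a^{\mathcal{I}_{\mathcal{A}}}, a^{\mathcal{I}_{\mathcal{A}'}}) \in S$ and $(a^{\mathcal{I}_{\mathcal{A}}}, \hat{a}^{\mathcal{I}_{\mathcal{A}'}}) \in S$.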

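Proof. We need to show that $S$ is a simulation $\mathcal{I}_{\mathcal{A}} \Rightarrow \mathcal{I}_{\mathcal{A}'}$. That is, for $d, d_1 \in \Delta^{\mathcal{I}_{\mathcal{A}}}$ and $e, e_1 \in \Delta^{\mathcal{I}_{\mathcal{A}'}}$:

1. for all concept names $A \in \Sigma_{\mathcal{T}}$ and all $(d, e) \in S$, if $d \in A^{\mathcal{I}_{\mathcal{A}}}$ then $e \in A^{\mathcal{I}_{\mathcal{A}'}}$;

2. for all role names $r \in \Sigma_{\mathcal{T}}$, all $(d_1, e_1) \in S$ and all $d_2 \in \Delta^{\mathcal{I}_{\mathcal{A}}}$, if $(d_1, d_2) \in r^{\mathcal{I}_{\mathcal{A}}}$ then there exists $e_2 \in \Delta^{\mathcal{I}_{\mathcal{A}'}}$ such that $(e_1, e_2) \in r^{\mathcal{I}_{\mathcal{A}'}}$ and $(d_2, e_2) \in S$.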
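The two simulation conditions are easy to test on a finite example. The sketch below assumes a hypothetical tuple encoding of interpretations and uses the string `"a_hat"` for the copy $\hat{a}$; it checks the relation $S$ of Lemma 16 on the unfolding of a single reflexive edge $r(a, a)$.

```python
def is_simulation(S, concepts1, roles1, concepts2, roles2):
    """Check Points 1 and 2 for a relation S between two finite interpretations."""
    for d, e in S:
        # Point 1: every concept name that holds at d must hold at e.
        for name, x in concepts1:
            if x == d and (name, e) not in concepts2:
                return False
        # Point 2: every r-successor d2 of d needs an r-successor e2 of e with (d2, e2) in S.
        for r, x, d2 in roles1:
            if x != d:
                continue
            if not any(r2 == r and src == e and (d2, e2) in S
                       for r2, src, e2 in roles2):
                return False
    return True

# Before unfolding: A(a), r(a, a). After unfolding: a cycle of length two.
concepts_before = {("A", "a")}
roles_before = {("r", "a", "a")}
concepts_after = {("A", "a"), ("A", "a_hat")}
roles_after = {("r", "a", "a_hat"), ("r", "a_hat", "a")}

S = {("a", "a"), ("a", "a_hat")}
print(is_simulation(S, concepts_before, roles_before, concepts_after, roles_after))
print(is_simulation({("a", "a")}, concepts_before, roles_before, concepts_after, roles_after))
```

The first check succeeds. The second fails, which illustrates why $S$ must relate each cycle node to its copy as well: the only $r$-successor of $a$ after unfolding is $\hat{a}$.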

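For Point 1 we have that by definition of the unfolding operation (Step 2), if $a \in A^{\mathcal{I}_{\mathcal{A}}}$ then $a \in A^{\mathcal{I}_{\mathcal{A}'}}$ and $\hat{a} \in A^{\mathcal{I}_{\mathcal{A}'}}$. Point 2 follows from Claims 1 and 2 below.

Claim 1 If $a^{\mathcal{I}_{\mathcal{A}}}$ has an $r$-successor $b$ then $a^{\mathcal{I}_{\mathcal{A}'}}$ has an $r$-successor $d$ with $(b, d) \in S$.

By definition of the unfolding operation (Step 1), $r_1(a_0, a_1)$ is the only role assertion removed from $\mathcal{A}$. In Step 3 we include $(a_0, \hat{a}_1)$ to $r_1^{\mathcal{I}_{\mathcal{A}}}$. By definition of $S$, we have that $(a_1, \hat{a}_1) \in S$. So if $a^{\mathcal{I}_{\mathcal{A}}}$ has an $r$-successor $b$ then $a^{\mathcal{I}_{\mathcal{A}'}}$ has an $r$-successor $d$ with $(b, d) \in S$.

Claim 2 If $a^{\mathcal{I}_{\mathcal{A}}}$ has an $r$-successor $b$ then $\hat{a}^{\mathcal{I}_{\mathcal{A}'}}$ has an $r$-successor $e$ with $(b, e) \in S$.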

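For $(a_0, a_1) \in r_1^{\mathcal{I}_{\mathcal{A}}}$, in Step 3 we include $(\hat{a}_0, a_1)$ to $r_1^{\mathcal{I}_{\mathcal{A}'}}$. By definition of $S$, $(a_1^{\mathcal{I}_{\mathcal{A}}}, a_1^{\mathcal{I}_{\mathcal{A}'}}) \in S$. Otherwise, in Step 2 we have that for all $r$-successors $b^{\mathcal{I}_{\mathcal{A}}}$ of $a^{\mathcal{I}_{\mathcal{A}}}$ such that $b \notin \mathsf{nodes}(c)$, $(\hat{a}^{\mathcal{I}_{\mathcal{A}'}}, b^{\mathcal{I}_{\mathcal{A}'}}) \in r^{\mathcal{I}_{\mathcal{A}'}}$. By definition of $S$, $(b^{\mathcal{I}_{\mathcal{A}}}, b^{\mathcal{I}_{\mathcal{A}'}}) \in S$. Also, in Step 2, for the $r$-successors $b^{\mathcal{I}_{\mathcal{A}}}$ of $a^{\mathcal{I}_{\mathcal{A}}}$ such that $b \in \mathsf{nodes}(c)$, we have that $(\hat{a}^{\mathcal{I}_{\mathcal{A}'}}, \hat{b}^{\mathcal{I}_{\mathcal{A}'}}) \in r^{\mathcal{I}_{\mathcal{A}'}}$, where $\hat{b}^{\mathcal{I}_{\mathcal{A}'}}$ is the copy of $b^{\mathcal{I}_{\mathcal{A}'}}$. Again, by definition of $S$, $(b^{\mathcal{I}_{\mathcal{A}}}, \hat{b}^{\mathcal{I}_{\mathcal{A}'}}) \in S$. □

Lemma 17. Let $\mathcal{A}'$ be the result of unfolding a cycle $c$ in $\mathcal{A}$. Let $h_* : \mathcal{I}_{\mathcal{A}'} \to \mathcal{I}_{\mathcal{A}}$ be the following mapping:

– for $a \in \Delta^{\mathcal{I}_{\mathcal{A}}} \setminus \mathsf{nodes}(c)$, $h_*(a^{\mathcal{I}_{\mathcal{A}'}}) = a^{\mathcal{I}_{\mathcal{A}}}$;

– for $a \in \mathsf{nodes}(c)$, $h_*(a^{\mathcal{I}_{\mathcal{A}'}}) = a^{\mathcal{I}_{\mathcal{A}}}$ and $h_*(\hat{a}^{\mathcal{I}_{\mathcal{A}'}}) = a^{\mathcal{I}_{\mathcal{A}}}$.

Then, $h_* : \mathcal{I}_{\mathcal{A}'} \to \mathcal{I}_{\mathcal{A}}$ is a homomorphism.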
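As a sanity check of the statement, the mapping $h_*$ can be tested on a small unfolded cycle. The sketch below assumes a hypothetical tuple encoding and writes the copies as strings ending in `_hat`; the unfolded triangle is written out by hand from the three unfolding steps.

```python
def is_homomorphism(h, concepts_src, roles_src, concepts_dst, roles_dst):
    """True if h preserves all concept and role memberships of the source."""
    concepts_ok = all((name, h[x]) in concepts_dst for name, x in concepts_src)
    roles_ok = all((r, h[x], h[y]) in roles_dst for r, x, y in roles_src)
    return concepts_ok and roles_ok

# Original ABox: triangle a -> b -> c -> a with A(a).
concepts_before = {("A", "a")}
roles_before = {("r", "a", "b"), ("r", "b", "c"), ("r", "c", "a")}

# Result of unfolding the triangle, with r(a, b) as the opened assertion.
concepts_after = {("A", "a"), ("A", "a_hat")}
roles_after = {
    ("r", "b", "c"), ("r", "c", "a"),                  # kept from Step 1
    ("r", "b_hat", "c_hat"), ("r", "c_hat", "a_hat"),  # copies from Step 2
    ("r", "a", "b_hat"), ("r", "a_hat", "b"),          # closing edges from Step 3
}

# h_* sends every element and every copy to the original element.
h_star = {"a": "a", "b": "b", "c": "c", "a_hat": "a", "b_hat": "b", "c_hat": "c"}
print(is_homomorphism(h_star, concepts_after, roles_after, concepts_before, roles_before))

# A mapping that sends the copy of a to b does not preserve A.
broken = dict(h_star, a_hat="b")
print(is_homomorphism(broken, concepts_after, roles_after, concepts_before, roles_before))
```

Here `h_star` passes the check while `broken` does not, since $\hat{a}$ satisfies $A$ but $b$ does not.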

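Proof. By definition of the unfolding operation, no concept name assertion is removed from $\mathcal{A}'$. So $a \in A^{\mathcal{I}_{\mathcal{A}'}}$ iff $a \in A^{\mathcal{I}_{\mathcal{A}}}$. Also, in Step 2 of the unfolding operation we have that $\hat{a} \in A^{\mathcal{I}_{\mathcal{A}'}}$ iff $a \in A^{\mathcal{I}_{\mathcal{A}}}$. So if $a \in A^{\mathcal{I}_{\mathcal{A}'}}$ or $\hat{a} \in A^{\mathcal{I}_{\mathcal{A}'}}$ then $h_*(a) = h_*(\hat{a}) = a \in A^{\mathcal{I}_{\mathcal{A}}}$. Now, for $(a, b) \in r^{\mathcal{I}_{\mathcal{A}'}}$, we make a case distinction:

– $a, b \notin \mathsf{nodes}(\hat{c})$: in this case, the unfolding operation does not include any new role assertion. Then, $(a, b) \in r^{\mathcal{I}_{\mathcal{A}'}}$ implies $(a, b) \in r^{\mathcal{I}_{\mathcal{A}}}$.

– $\hat{a}, \hat{b} \in \mathsf{nodes}(\hat{c})$: by Step 2 of the unfolding operation, if $(\hat{a}, \hat{b}) \in r^{\mathcal{I}_{\mathcal{A}'}}$ then $(a, b) \in r^{\mathcal{I}_{\mathcal{A}}}$.

– $\hat{a} \in \mathsf{nodes}(\hat{c})$ and $b \notin \mathsf{nodes}(\hat{c})$: for $(\hat{a}_0, a_1) \in r_1^{\mathcal{I}_{\mathcal{A}'}}$ we know that $(a_0, a_1) \in r_1^{\mathcal{I}_{\mathcal{A}}}$. Otherwise, by Step 2, if $(\hat{a}, b) \in r^{\mathcal{I}_{\mathcal{A}'}}$ then $(a, b) \in r^{\mathcal{I}_{\mathcal{A}}}$.

– $a \notin \mathsf{nodes}(\hat{c})$ and $\hat{b} \in \mathsf{nodes}(\hat{c})$: by the definition of the unfolding operation there is only one case, in Step 3, which is $(a_0, \hat{a}_1) \in r_1^{\mathcal{I}_{\mathcal{A}'}}$. In this case we know that $(a_0, a_1) \in r_1^{\mathcal{I}_{\mathcal{A}}}$.

In all cases we have that for $a, b \in \Delta^{\mathcal{I}_{\mathcal{A}'}}$, $(a, b) \in r^{\mathcal{I}_{\mathcal{A}'}}$ implies $(h_*(a), h_*(b)) \in r^{\mathcal{I}_{\mathcal{A}}}$. □

Before we show Lemma 6 we need the following lemma, which shows the progress of our unfolding operations.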

Lemma 18. Let $\mathcal{I}_n$ be the minimal ABox computed in the $n$-th iteration in Line 5 of Algorithm 3. Assume $\mathcal{I}_n$ has a cycle. For all $n \ge 0$, $|\Delta^{\mathcal{I}_{n+1}}| > |\Delta^{\mathcal{I}_n}|$.

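Proof. By assumption $\mathcal{I}_n$ has a cycle $c$. Let $\mathcal{I}_n'$ be the result of unfolding $c$ and $\mathcal{I}_{n+1}$ be the result of minimizing $\mathcal{I}_n'$. Let $h_* : \mathcal{I}_n' \to \mathcal{I}_n$ be the homomorphism defined in Lemma 17. Let $g = h_*|_{\Delta^{\mathcal{I}_{n+1}}}$ be $h_*$ restricted to $\Delta^{\mathcal{I}_{n+1}} \subseteq \Delta^{\mathcal{I}_n'}$. Since $\mathcal{I}_{n+1}$ is a subinterpretation of $\mathcal{I}_n'$, $g : \mathcal{I}_{n+1} \to \mathcal{I}_n$ is a homomorphism.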

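Claim 1 $g : \mathcal{I}_{n+1} \to \mathcal{I}_n$ is a surjective homomorphism.

Suppose $g$ is not surjective. Since $\mathcal{I}_{n+1}$ is a countermodel (Condition 1 of minimal ABoxes) there is $C \sqsubseteq A \in \mathcal{T}$ such that $a \in (C \setminus A)^{\mathcal{I}_{n+1}}$, with $a \in \Delta^{\mathcal{I}_{n+1}}$. Let $\mathcal{J}$ be the subinterpretation of $\mathcal{I}_n$ determined by the range of $g$. By the unfolding definition, $a \in A^{\mathcal{I}_{n+1}}$ iff $g(a) \in A^{\mathcal{I}_n}$. Then $g(a) \in (C \setminus A)^{\mathcal{J}}$. Since $\mathcal{I}_n$ is minimal, if $g$ is not surjective then this contradicts Condition 2 of minimal ABoxes.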

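Claim 2 Suppose $g : \mathcal{I}_{n+1} \to \mathcal{I}_n$ is an injective homomorphism. Then, for $d_1, d_2 \in \Delta^{\mathcal{I}_{n+1}}$, $(g(d_1), g(d_2)) \in r^{\mathcal{I}_n}$ implies $(d_1, d_2) \in r^{\mathcal{I}_{n+1}}$.

Suppose this is not the case and there are $d_1, d_2 \in \Delta^{\mathcal{I}_{n+1}}$ such that $(g(d_1), g(d_2)) \in r^{\mathcal{I}_n}$ and $(d_1, d_2) \notin r^{\mathcal{I}_{n+1}}$. Let $\mathcal{J}$ be the result of removing $(g(d_1), g(d_2))$ from $r^{\mathcal{I}_n}$. Since $g$ is injective, the map $g' : \mathcal{I}_{n+1} \to \mathcal{J}$ that agrees with $g$ is also a homomorphism. As $\mathcal{I}_{n+1}$ is a countermodel (Condition 1 of minimal ABoxes) there is $C \sqsubseteq A \in \mathcal{T}$ such that $a \in (C \setminus A)^{\mathcal{I}_{n+1}}$, with $a \in \Delta^{\mathcal{I}_{n+1}}$. Then, $g'(a) \in C^{\mathcal{J}}$. By the unfolding definition, $a \in A^{\mathcal{I}_{n+1}}$ iff $g(a) \in A^{\mathcal{I}_n}$. By definition of $\mathcal{J}$, $g(a) \in A^{\mathcal{I}_n}$ iff $g'(a) \in A^{\mathcal{J}}$. Then $g'(a) \in (C \setminus A)^{\mathcal{J}}$. Since $\mathcal{J}$ is $\mathcal{I}_n$ with $(g(d_1), g(d_2))$ removed from $r^{\mathcal{I}_n}$, this contradicts the fact that $\mathcal{I}_n$ is role minimal (Condition 3 of minimal ABoxes).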

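Claim 3 $g : \mathcal{I}_{n+1} \to \mathcal{I}_n$ is not an injective homomorphism.

As $g$ is surjective (Claim 1), $a$ or $\hat{a}$ is in $\Delta^{\mathcal{I}_{n+1}}$ for each $a \in \mathsf{nodes}(c)$. Suppose that $g$ is injective. Then, for each $a \in \mathsf{nodes}(c)$, exactly one of $\{a, \hat{a}\}$ is in $\Delta^{\mathcal{I}_{n+1}}$. Recall that the cycle $c$ is a sequence $a_0 \cdot r_1 \cdot a_1 \cdot \ldots \cdot r_k \cdot a_k$, with $a_0 = a_k$, where we assumed w.l.o.g. that $(a_0, a_1) \in r_1^{\mathcal{I}_n}$. Assume $a_0 \in \Delta^{\mathcal{I}_{n+1}}$ (the case where $\hat{a}_0 \in \Delta^{\mathcal{I}_{n+1}}$ is similar). Now, we make a case distinction:

– $k = 1$: in this case, the cycle is a reflexive element. That is, $a_0 = a_1$ and, then, $(a_0, a_0) \in r_1^{\mathcal{I}_n}$. By definition of the unfolding operation, $(a_0, a_0) \notin r_1^{\mathcal{I}_{n+1}} \subseteq r_1^{\mathcal{I}_n'}$. So if $a_0 \in \Delta^{\mathcal{I}_{n+1}}$ then $(g(a_0), g(a_0)) \in r_1^{\mathcal{I}_n}$ and $(a_0, a_0) \notin r_1^{\mathcal{I}_{n+1}}$, which contradicts Claim 2.

– $k > 1$: By definition of the unfolding operation, $r_1(a_0, \hat{a}_1)$ and $r_1(\hat{a}_0, a_1)$ are the only role assertions between elements in $\mathsf{nodes}(c)$ and $\mathsf{nodes}(\hat{c})$ in $\mathcal{I}_n'$, where $\mathcal{I}_n'$ is $\mathcal{I}_n$ with cycle $c$ unfolded. This means that (*) neither $(\hat{a}_i, a_{i+1})$ nor $(a_{i+1}, \hat{a}_i)$ is in $r_{i+1}^{\mathcal{I}_{n+1}} \subseteq r_{i+1}^{\mathcal{I}_n'}$, for $1 \le i < k$. By assumption $a_0 \in \Delta^{\mathcal{I}_{n+1}}$. Then, for $1 \le i < k$, $\hat{a}_i \in \Delta^{\mathcal{I}_{n+1}}$, otherwise we would obtain a contradiction with Claim 2. Since $a_0 = a_k \in \Delta^{\mathcal{I}_{n+1}}$ and $g$ is injective, $\hat{a}_0 = \hat{a}_k \notin \Delta^{\mathcal{I}_{n+1}}$. By the same argument, as $\hat{a}_{k-1} \in \Delta^{\mathcal{I}_{n+1}}$ we have that $a_{k-1} \notin \Delta^{\mathcal{I}_{n+1}}$. By definition of $c$, either $(a_{k-1}, a_k)$ or $(a_k, a_{k-1})$ is in $r_k^{\mathcal{I}_n}$. Together with the fact (*) that neither $(\hat{a}_{k-1}, a_k)$ nor $(a_k, \hat{a}_{k-1})$ is in $r_k^{\mathcal{I}_{n+1}} \subseteq r_k^{\mathcal{I}_n'}$, this contradicts Claim 2.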

Then $g$ is not injective. Since $g$ is surjective (Claim 1) and not injective (Claim 3), $|\Delta^{\mathcal{I}_{n+1}}| > |\Delta^{\mathcal{I}_n}|$. □

Proof of Lemma 6. Algorithm 3 computes a minimal tree shaped ABox $\mathcal{A}$ with size polynomial in $|\mathcal{T}|$ and runs in polynomially many steps in $|\mathcal{T}|$ and $|\mathcal{A}|$.

Proof. The fact that the computed ABox is tree shaped follows from Line 3. Also, by Lemma 5 the size of the ABox is bounded by $|\mathcal{T}|$. So it remains to show that Algorithm 3 terminates after polynomially many steps in $|\mathcal{T}|$ and $|\mathcal{A}|$. By Lemma 5, Lines 2 and 5 are polynomial in $|\mathcal{A}|$ and $|\mathsf{N_C} \cap \Sigma_{\mathcal{T}}|$. Also, unfolding a cycle $c$ in Line 4 is linear in $|\mathcal{A}|$. It remains to show that the number of iterations is bounded by $|\mathcal{T}|$. Let $\mathcal{I}_n$ be the minimal ABox computed in the $n$-th iteration in Line 5 of Algorithm 3. By Lemma 5, for every iteration $n$ of Algorithm 3, in Line 5 $|\Delta^{\mathcal{I}_n}|$ is bounded by $|\mathcal{T}|$. By Lemma 18, after each $(n+1)$-th iteration of the algorithm, $|\Delta^{\mathcal{I}_{n+1}}|$ increases by at least one element with respect to $|\Delta^{\mathcal{I}_n}|$. So the number of iterations is bounded by $|\mathcal{T}|$. □
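The counting step at the end of the proof can be written as a single chain. In the sketch below, $m$ denotes the index of the last iteration; this symbol is introduced here for illustration and does not appear in the original argument. The strict inequalities come from Lemma 18, the upper bound from Lemma 5, and the lower bound from the fact that a countermodel has a non-empty domain.

```latex
% m is notation introduced for this sketch: I_0, ..., I_m are the minimal
% ABoxes produced in Line 5, and I_n has a cycle for every n < m.
\[
  1 \le |\Delta^{\mathcal{I}_0}| < |\Delta^{\mathcal{I}_1}| < \dots < |\Delta^{\mathcal{I}_m}| \le |\mathcal{T}|
  \;\Longrightarrow\;
  |\Delta^{\mathcal{I}_n}| \ge n + 1 \text{ for all } n \le m
  \;\Longrightarrow\;
  m \le |\mathcal{T}| - 1 .
\]
```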
