
E Proofs for Limits of Polynomial Time Learnability

To show Lemma 7, we first show Lemma 22, which uses Lemmas 20 and 21 from [12].

We also require the following lemma from [11], which characterizes concept inclusions entailed by acyclic $\mathcal{EL}$ TBoxes.

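Lemma 19 ([11]). Let $\mathcal{T}$ be an acyclic $\mathcal{EL}$ TBox, $r$ a role name and $D$ an $\mathcal{EL}$ concept expression. Suppose that $\mathcal{T} \models \bigsqcap_{1 \le i \le n} A_i \sqcap \bigsqcap_{1 \le j \le m} \exists r_j.C_j \sqsubseteq D$, where $A_i$ are concept names for $1 \le i \le n$, $C_j$ are $\mathcal{EL}$ concept expressions for $1 \le j \le m$, and $m, n \ge 0$. Then

– if $D$ is a concept name such that $\mathcal{T}$ does not contain an inclusion $D \equiv C$, for some concept expression $C$, then there exists $A_i$, $1 \le i \le n$, such that $\mathcal{T} \models A_i \sqsubseteq D$;

– if $D$ is of the form $\exists r.D'$ then either (i) there exists $A_i$, $1 \le i \le n$, such that $\mathcal{T} \models A_i \sqsubseteq \exists r.D'$ or (ii) there exists $r_j$, $1 \le j \le m$, such that $r_j = r$ and $\mathcal{T} \models C_j \sqsubseteq D'$.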

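Lemma 20. Let $B = F_1 \sqcap \dots \sqcap F_n$, where $F_i \in \{E_i, \bar{E}_i\}$. For any $0 \le m \le n$, any sequence of role names $\boldsymbol{\sigma} = \sigma_1 \dots \sigma_m$, any $L = (\boldsymbol{\sigma}_1, \dots, \boldsymbol{\sigma}_n) \in \mathcal{L}_n$ and any $\mathcal{EL}$ concept expression $C$ over $\Sigma_n$, if $\mathcal{T}_L^B \models C \sqsubseteq \exists\boldsymbol{\sigma}.B$ then either:

1. $m = n$, $\boldsymbol{\sigma} = \boldsymbol{\sigma}_i$, for some $1 \le i \le n$, and $C$ is of the form $A \sqcap C'$, $A_i \sqcap C'$ or $B_i \sqcap C'$, for some $\mathcal{EL}$ concept expression $C'$; or

2. $\models C \sqsubseteq \exists\boldsymbol{\sigma}.B$.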

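Proof. We prove the lemma by induction on $m$. Since for all $F_i$ occurring in $B$, $\mathcal{T}_L^B$ does not contain an inclusion $F_i \equiv C$, where $C$ is an $\mathcal{EL}$ concept expression, by Lemma 19, there is a concept name $Z$ such that $\mathcal{T}_L^B \models Z \sqsubseteq F_i$. Then, for $m = 0$, $C$ is of the form $Z \sqcap C'$, where $Z$ is a concept name, $C'$ is an $\mathcal{EL}$ concept expression and $\mathcal{T}_L^B \models Z \sqsubseteq F_i$. This is only possible if $Z$ is $F_i$ itself. As this holds for all $F_i$, we have that $\models C \sqsubseteq B$.

For $m > 0$, by Lemma 19 we have one of the following two cases:

– $C$ is of the form $Z \sqcap C'$, for some concept name $Z$ and some $\mathcal{EL}$ concept expression $C'$ such that $\mathcal{T}_L^B \models Z \sqsubseteq \exists\boldsymbol{\sigma}.B$. It is easy to see that this is only possible if $m = n$, $\boldsymbol{\sigma} = \boldsymbol{\sigma}_i$ and $Z$ is one of $A$, $A_i$ or $B_i$.

– $C$ is of the form $\exists\sigma_1.C' \sqcap C''$ for some concept expressions $C'$ and $C''$ such that $\mathcal{T}_L^B \models C' \sqsubseteq \exists\sigma_2.\cdots\exists\sigma_m.B$. By the induction hypothesis, $\models C' \sqsubseteq \exists\sigma_2.\cdots\exists\sigma_m.B$. But then $\models C \sqsubseteq \exists\boldsymbol{\sigma}.B$.

□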

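Lemma 21 ([12]). For any acyclic $\mathcal{EL}$ TBox $\mathcal{T}$, any inclusion $A \sqsubseteq C \in \mathcal{T}$ and any concept expression of the form $\exists t.D$ we have $\mathcal{T} \models A \sqsubseteq \exists t.D$ if, and only if, $\mathcal{T} \models C \sqsubseteq \exists t.D$.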

We are now ready for Lemma 22.

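Lemma 22. For all $\mathcal{EL}$ concept inclusions $C \sqsubseteq D$ over $\Sigma_n$ where $B$ is not a subconcept of $C$:

– either $\mathcal{T}_L^B \models C \sqsubseteq D$ for every $L \in \mathcal{L}_n$ or

– the number of $L \in \mathcal{L}_n$ such that $\mathcal{T}_L^B \models C \sqsubseteq D$ does not exceed the size of $D$.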

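Proof. To prove this lemma we argue by induction on the structure of $D$ and show the following.

Claim 1. For all $\mathcal{EL}$ concept inclusions $C \sqsubseteq D$ over $\Sigma_n$ where $B \in \mathcal{B}_n$ is not a subconcept of $C$, if there is $L \in \mathcal{L}_n$ and $B \in \mathcal{B}_n$ such that $\mathcal{T}_L^B \models C \sqsubseteq D$ then:

– either $\mathcal{T}_L^B \models C \sqsubseteq D$ for every $L \in \mathcal{L}_n$ and every $B \in \mathcal{B}_n$ or

– for each $L \in \mathcal{L}_n$ such that $\mathcal{T}_L^B \models C \sqsubseteq D$ there is $\boldsymbol{\sigma}$ in $L$ and a sequence of roles $t_1, \dots, t_m$, $m \ge 0$, such that $\models D \sqsubseteq \exists t_1.\cdots\exists t_m.\exists\boldsymbol{\sigma}.\top$, where $t_j \in \{r, s\}$, $1 \le j \le m$.

We assume throughout the proof that in all cases $B$ is not a subconcept of $C$ and that there exists some $L' \in \mathcal{L}_n$ such that $\mathcal{T}_{L'}^B \models C \sqsubseteq D$.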

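Base case: $D$ is a concept name. We make the following case distinction.

– $D$ is one of $X_i$, $A_i$, $B_i$, $E_i$ or $\bar{E}_i$ for $1 \le i \le n$. By Lemma 19, $C$ is of the form $Z \sqcap C'$, for some concept name $Z$, and $\mathcal{T}_{L'}^B \models Z \sqsubseteq D$. For such $D$, this can only be the case if $Z = D$. But then for every $L \in \mathcal{L}_n$ we have $\mathcal{T}_L^B \models C \sqsubseteq D$.

– $D$ is $X_0$. By Lemma 19, $C$ is of the form $Z \sqcap C'$, for some concept name $Z$, and $\mathcal{T}_{L'}^B \models Z \sqsubseteq X_0$. This is the case if either $Z = X_0$, or $Z$ is one of $A$, $A_i$, $B_i$, $1 \le i \le n$. In either case, for every $L \in \mathcal{L}_n$ we have $\mathcal{T}_L^B \models C \sqsubseteq X_0$.

– $D$ is $A$. If $C$ is of the form $A \sqcap C'$ or, for all $i$, $1 \le i \le n$, $A_i$ or $B_i$ is a conjunct of $C$, then for every $L \in \mathcal{L}_n$ we have $\mathcal{T}_L^B \models C \sqsubseteq A$. Assume now that $C$ is not of this form. Then for some $j$ such that $1 \le j \le n$, $C$ is neither of the form $A \sqcap C'$ nor of the form $A_j \sqcap C'$ nor of the form $B_j \sqcap C'$. Let $L = (\boldsymbol{\sigma}_1, \dots, \boldsymbol{\sigma}_n) \in \mathcal{L}_n$ be such that $\mathcal{T}_L^B \models C \sqsubseteq A$. Notice that $\mathcal{T}_L^B \models C \sqsubseteq A$ if, and only if, $\mathcal{T}_L^B \models C \sqsubseteq X_0 \sqcap \exists\boldsymbol{\sigma}_1.B \sqcap \dots \sqcap \exists\boldsymbol{\sigma}_n.B$. By Lemma 20, for such a $\mathcal{T}_L^B$ we must have $\models C \sqsubseteq \exists\boldsymbol{\sigma}_j.B$, which is impossible as $B$ is not a subconcept of $C$.

Thus, if $D$ is a concept name and $B$ is not a subconcept of $C$, then either for every $L \in \mathcal{L}_n$ we have $\mathcal{T}_L^B \models C \sqsubseteq D$ or there exists no $L \in \mathcal{L}_n$ such that $\mathcal{T}_L^B \models C \sqsubseteq D$.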

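Induction step. If $D = D_1 \sqcap D_2$, then $\mathcal{T}_L^B \models C \sqsubseteq D$ if, and only if, $\mathcal{T}_L^B \models C \sqsubseteq D_i$ for each $i \in \{1, 2\}$. So the lemma follows from the induction hypothesis.

For $D = \exists t.D'$, suppose that there is $L \in \mathcal{L}_n$ such that $\mathcal{T}_L^B \models C \sqsubseteq D$. Then, by Lemma 19, either (i) there exists a conjunct $Z$ of $C$, $Z$ a concept name, such that $\mathcal{T}_L^B \models Z \sqsubseteq \exists t.D'$ or (ii) there exists a conjunct $\exists t.C'$ of $C$ with $\mathcal{T}_L^B \models C' \sqsubseteq D'$. Consider cases (i) and (ii).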

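(i) Let $Z$ be a conjunct of $C$ such that $Z$ is a concept name and $\mathcal{T}_L^B \models Z \sqsubseteq \exists t.D'$. Notice that $Z$ cannot be $E_i$ or $\bar{E}_i$, as for no $L \in \mathcal{L}_n$ we have $\mathcal{T}_L^B \models E_i \sqsubseteq \exists t.D'$ or $\mathcal{T}_L^B \models \bar{E}_i \sqsubseteq \exists t.D'$. Consider the remaining possibilities.

• $Z$ is one of $X_i$, $0 \le i \le n$. It is easy to see that for $L, L' \in \mathcal{L}_n$ we have $\mathcal{T}_L^B \models X_i \sqsubseteq \exists t.D'$ if, and only if, $\mathcal{T}_{L'}^B \models X_i \sqsubseteq \exists t.D'$. Thus, for every $L \in \mathcal{L}_n$ we have $\mathcal{T}_L^B \models Z \sqsubseteq \exists t.D'$.

• $Z$ is one of $A_i$, $B_i$ for $1 \le i \le n$. By Lemma 21, $\mathcal{T}_L^B \models Z \sqsubseteq \exists t.D'$ if, and only if, $\mathcal{T}_L^B \models X_0 \sqcap \exists\boldsymbol{\sigma}_i.B \sqsubseteq \exists t.D'$. By Lemma 19, either $\mathcal{T}_L^B \models X_0 \sqsubseteq \exists t.D'$ or $\mathcal{T}_L^B \models \exists\boldsymbol{\sigma}_i.B \sqsubseteq \exists t.D'$. If $\mathcal{T}_L^B \models X_0 \sqsubseteq \exists t.D'$ then for every $L \in \mathcal{L}_n$ we have $\mathcal{T}_L^B \models C \sqsubseteq \exists t.D'$. Now, suppose that $\exists t.D'$ is such that $\mathcal{T}_L^B \not\models X_0 \sqsubseteq \exists t.D'$ and $\mathcal{T}_L^B \models \exists\boldsymbol{\sigma}_i.B \sqsubseteq \exists t.D'$. By inductive applications of Lemma 19, this is only possible when $\models \exists t.D' \sqsubseteq \exists\boldsymbol{\sigma}_i.\top$. Notice that since all $\boldsymbol{\sigma}_i$ are unique, there exists exactly one $L \in \mathcal{L}_n$ (namely, $L$ is $L'$) such that $\mathcal{T}_L^B \models Z \sqsubseteq \exists\boldsymbol{\sigma}_i.F$, where $\models B \sqsubseteq F$.

• $Z$ is $A$. Suppose that for some $L = (\boldsymbol{\sigma}_1, \dots, \boldsymbol{\sigma}_n) \in \mathcal{L}_n$ we have $\mathcal{T}_L^B \models A \sqsubseteq \exists t.D'$, equivalently $\mathcal{T}_L^B \models X_0 \sqcap \exists\boldsymbol{\sigma}_1.B \sqcap \dots \sqcap \exists\boldsymbol{\sigma}_n.B \sqsubseteq \exists t.D'$. By Lemma 19, either $\mathcal{T}_L^B \models X_0 \sqsubseteq \exists t.D'$ or $\mathcal{T}_L^B \models \exists\boldsymbol{\sigma}_i.B \sqsubseteq \exists t.D'$, for some $i$ with $1 \le i \le n$. So, as above, unless $\mathcal{T}_L^B \models X_0 \sqsubseteq \exists t.D'$ we have that $\models \exists t.D' \sqsubseteq \exists\boldsymbol{\sigma}_i.\top$, as required.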

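(ii) Let $\exists t.C'$ be a conjunct of $C$ with $\mathcal{T}_L^B \models C' \sqsubseteq D'$. The induction hypothesis implies that either (a) for every $L \in \mathcal{L}_n$ we have that $\mathcal{T}_L^B \models C' \sqsubseteq D'$ or (b) for each $L \in \mathcal{L}_n$ such that $\mathcal{T}_L^B \models C' \sqsubseteq D'$ there is $\boldsymbol{\sigma}$ in $L$ and a sequence of roles $t_1, \dots, t_m$, $m \ge 0$, such that $\models D' \sqsubseteq \exists t_1.\cdots\exists t_m.\exists\boldsymbol{\sigma}.\top$, where $t_j \in \{r, s\}$, $1 \le j \le m$. In case (a), we have that for every $L \in \mathcal{L}_n$, $\mathcal{T}_L^B \models C \sqsubseteq \exists t.D'$. In case (b), if for each $L \in \mathcal{L}_n$ such that $\mathcal{T}_L^B \models C' \sqsubseteq D'$ there is $\boldsymbol{\sigma}$ such that $\models D' \sqsubseteq \exists t_1.\cdots\exists t_m.\exists\boldsymbol{\sigma}.\top$, then the same holds for $\exists t.D'$ (notice that for every $L \in \mathcal{L}_n$ and every $B \in \mathcal{B}_n$ we have that $\mathcal{T}_L^B \models C' \sqsubseteq D'$ iff $\mathcal{T}_L^B \models \exists t.C' \sqsubseteq \exists t.D'$).

To summarize, either $\mathcal{T}_L^B \models C \sqsubseteq \exists t.D'$ for every $L \in \mathcal{L}_n$ and every $B \in \mathcal{B}_n$, or $\mathcal{T}_L^B \models C \sqsubseteq \exists t.D'$ implies that $\models \exists t.D' \sqsubseteq \exists t_0.\cdots\exists t_m.\exists\boldsymbol{\sigma}.\top$, $m \ge 0$, for some $\boldsymbol{\sigma}$ in $L$. Since all $\boldsymbol{\sigma}$ are unique for each $L \in \mathcal{L}_n$, the number of different $L \in \mathcal{L}_n$ such that $\mathcal{T}_L^B \models C \sqsubseteq \exists t.D'$ does not exceed $|D|$. □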

Before we proceed to the proof of Lemma 7, we need Lemma 23.

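Definition 3. The unravelling $\mathcal{A}^u$ of $\mathcal{A}$ into a (possibly infinite) tree is defined as:

– $\mathsf{Ind}(\mathcal{A}^u)$ is the set of sequences $b_0 r_0 \cdots r_{n-1} b_n$ with $b_0, \dots, b_n \in \mathsf{Ind}(\mathcal{A})$, $r_0, \dots, r_{n-1} \in \mathsf{N_R}$ and $r_i(b_i, b_{i+1}) \in \mathcal{A}$;

– for each $A(b) \in \mathcal{A}$ and $\alpha = b_0 r_0 \cdots r_{n-1} b_n \in \mathsf{Ind}(\mathcal{A}^u)$ with $b_n = b$, we have $A(\alpha) \in \mathcal{A}^u$;

– for each $\alpha = b_0 r_0 \cdots r_{n-1} b_n \in \mathsf{Ind}(\mathcal{A}^u)$ with $n > 0$, we have $r_{n-1}(b_0 r_0 \cdots r_{n-2} b_{n-1}, \alpha) \in \mathcal{A}^u$.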
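The following Python sketch illustrates Definition 3 restricted to a fixed root and a bounded depth, which is the finite fragment used later in the proof of Lemma 23. The function name `unravel` and the encoding of assertions as tuples are assumptions made for this illustration only; they do not come from the paper.

```python
from collections import defaultdict

def unravel(concept_assertions, role_assertions, root, depth):
    """Unravel an ABox from `root` up to `depth` role steps.

    concept_assertions: set of (concept_name, individual) pairs, i.e. A(b)
    role_assertions:    set of (role_name, b, c) triples, i.e. r(b, c)
    Each individual of the unravelling is a tuple (b0, r0, b1, ..., bn).
    """
    successors = defaultdict(list)
    for role, src, dst in role_assertions:
        successors[src].append((role, dst))

    names = defaultdict(set)
    for concept, ind in concept_assertions:
        names[ind].add(concept)

    individuals = [(root,)]
    roles = set()
    frontier = [(root,)]
    for _ in range(depth):
        next_frontier = []
        for path in frontier:
            # Extend the path by every role assertion leaving its last element.
            for role, dst in successors[path[-1]]:
                child = path + (role, dst)
                roles.add((role, path, child))
                individuals.append(child)
                next_frontier.append(child)
        frontier = next_frontier

    # A path inherits the concept names of its last element.
    concepts = {(c, path) for path in individuals for c in names[path[-1]]}
    return individuals, concepts, roles
```

On a cyclic ABox the full unravelling is infinite, so the depth bound is what keeps the output finite; each cycle is simply unfolded into longer and longer paths.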

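Lemma 23. For any ABox $\mathcal{A}$ and $\mathcal{EL}$ concept expression $D$ over $\Sigma_n$ there is a concept expression $C_{\mathcal{A}}$ such that $a \in C_{\mathcal{A}}^{\mathcal{I}_{\mathcal{A}}}$ and, for every $L \in \mathcal{L}_n$ and $B \in \mathcal{B}_n$:

$(\mathcal{T}_L^B, \mathcal{A}) \models D(a)$ iff $\mathcal{T}_L^B \models C_{\mathcal{A}} \sqsubseteq D$.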

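Proof. Let $\mathcal{A}^u$ be the unravelling of $\mathcal{A}$. Let $\mathcal{T}_L^B$ be a TBox for some arbitrary $B \in \mathcal{B}_n$ and $L \in \mathcal{L}_n$. By definition of $\mathcal{A}^u$ we have that $(\mathcal{T}_L^B, \mathcal{A}) \models D(a)$ iff $(\mathcal{T}_L^B, \mathcal{A}^u) \models D(a)$.

Denote by $\mathcal{A}^{u,k}_a$ the subtree of $\mathcal{A}^u$ which is rooted in $a \in \mathsf{Ind}(\mathcal{A}^u)$ and has depth $k \in \mathbb{N}$. Let $(\mathcal{T}_L^B)'$ be the result of removing $X_0 \sqcap \exists\boldsymbol{\sigma}_1.B \sqcap \dots \sqcap \exists\boldsymbol{\sigma}_n.B \sqsubseteq A$ from $\mathcal{T}_L^B$. Then, $((\mathcal{T}_L^B)', \mathcal{A}^u) \models D(a)$ iff $((\mathcal{T}_L^B)', \mathcal{A}^{u,|D|}_a) \models D(a)$. Let $\mathcal{I}_{(\mathcal{T}_L^B)', \mathcal{A}^u}$ be the canonical model of $(\mathcal{T}_L^B)'$ and $\mathcal{A}^u$. By definition of $\mathcal{T}_L^B$, one can make it a canonical model of $\mathcal{T}_L^B$ and $\mathcal{A}^u$ by including $d \in A^{\mathcal{I}_{(\mathcal{T}_L^B)', \mathcal{A}^u}}$ whenever $d \in (X_0 \sqcap \exists\boldsymbol{\sigma}_1.B \sqcap \dots \sqcap \exists\boldsymbol{\sigma}_n.B)^{\mathcal{I}_{(\mathcal{T}_L^B)', \mathcal{A}^u}}$. Then, $(\mathcal{T}_L^B, \mathcal{A}^u) \models D(a)$ iff $(\mathcal{T}_L^B, \mathcal{A}^{u,|D|+n}_a) \models D(a)$. Let $C_{\mathcal{A}}$ be the concept expression corresponding to the tree interpretation of $\mathcal{A}^{u,|D|+n}_a$ rooted in $a$. We have that, for every $L \in \mathcal{L}_n$ and $B \in \mathcal{B}_n$, $(\mathcal{T}_L^B, \mathcal{A}) \models D(a)$ iff $\mathcal{T}_L^B \models C_{\mathcal{A}} \sqsubseteq D$. □

We can now proceed to the proof of Lemma 7. We say that an $\mathcal{EL}$ concept expression $C$ occurs in an ABox $\mathcal{A}$ if there exists $a \in \mathsf{Ind}(\mathcal{A})$ such that $\mathcal{A} \models C(a)$. For $a, b \in \mathsf{Ind}(\mathcal{A})$, a role chain from $a$ to $b$ is a sequence $a_0 \cdot t_0 \cdot \ldots \cdot t_{n-1} \cdot a_n$ with $a_0 = a$, $a_n = b$ and $t_i(a_i, a_{i+1}) \in \mathcal{A}$, where $0 \le i \le n-1$ and $t_i \in \{r, s\}$.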
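Whether a role chain from $a$ to $b$ exists is a plain reachability question over the role assertions of the ABox. A minimal sketch, assuming role assertions are encoded as `(role, source, target)` triples and that the empty chain ($n = 0$, so $a = b$) counts as a role chain; the function name `has_role_chain` is hypothetical:

```python
from collections import deque

def has_role_chain(role_assertions, a, b, roles=("r", "s")):
    """Return True if some role chain a0.t0...t_{n-1}.an leads from a to b.

    role_assertions: iterable of (role_name, x, y) triples, i.e. t(x, y).
    Only role names listed in `roles` are followed (breadth-first search).
    """
    edges = [(x, y) for (t, x, y) in role_assertions if t in roles]
    seen = {a}
    queue = deque([a])
    while queue:
        x = queue.popleft()
        if x == b:
            return True
        for src, dst in edges:
            if src == x and dst not in seen:
                seen.add(dst)
                queue.append(dst)
    return False
```

The search follows assertions only in their stated direction, matching the requirement $t_i(a_i, a_{i+1}) \in \mathcal{A}$.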

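Proof of Lemma 7. For any ABox $\mathcal{A}$, any $\mathcal{EL}$ concept assertion $D(a)$ over $\Sigma_n$, and any $a \in \mathsf{Ind}(\mathcal{A})$, if there is $L \in \mathcal{L}_n$ and $B \in \mathcal{B}_n$ such that $(\mathcal{T}_L^B \cup \mathcal{T}, \mathcal{A}) \models D(a)$ then:

– either $(\mathcal{T}_L^B \cup \mathcal{T}, \mathcal{A}) \models D(a)$, for every $L \in \mathcal{L}_n$ and $B \in \mathcal{B}_n$, or

– $(\mathcal{T}_L^B \cup \mathcal{T}, \mathcal{A}) \models D(a)$ for at most $|D|$ elements $L \in \mathcal{L}_n$, or

– $(\mathcal{T}_L^B \cup \mathcal{T}, \mathcal{A}) \models D(a)$ for at most $|\mathcal{A}|$ elements $B \in \mathcal{B}_n$.

Proof. We make a case distinction: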

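1. For all $i$, $1 \le i \le n$, $E_i \sqcap \bar{E}_i$ does not occur in $\mathcal{A}$: first notice that in this case, for every $\mathcal{EL}$ concept expression $C$ over $\Sigma_n$, $a \in \mathsf{Ind}(\mathcal{A})$ and $\mathcal{T}_L^B \in S$:

$(\mathcal{T}_L^B \cup \mathcal{T}, \mathcal{A}) \models C(a)$ iff $(\mathcal{T}_L^B, \mathcal{A}) \models C(a)$.

For any $\mathcal{A}$ and $\mathcal{EL}$ concept expression $D$ over $\Sigma_n$, by Lemma 23, there is a concept expression $C_{\mathcal{A}}$ such that $a \in C_{\mathcal{A}}^{\mathcal{I}_{\mathcal{A}}}$ and, for every $L \in \mathcal{L}_n$ and $B \in \mathcal{B}_n$:

$(\mathcal{T}_L^B, \mathcal{A}) \models D(a)$ iff $\mathcal{T}_L^B \models C_{\mathcal{A}} \sqsubseteq D$.

If there is no $B \in \mathcal{B}_n$ such that $B$ occurs in $\mathcal{A}$ then the lemma follows from Lemma 22. Notice that although our construction of $C_{\mathcal{A}}$ is not polynomial, Lemma 22 does not impose any restriction on the size of $C_{\mathcal{A}}$. Otherwise, since for all $i$, $1 \le i \le n$, $E_i \sqcap \bar{E}_i$ does not occur in $\mathcal{A}$, the number of $B \in \mathcal{B}_n$ such that $B$ occurs in $\mathcal{A}$ is linear in the size of $\mathcal{A}$. So the number of $B \in \mathcal{B}_n$ such that $(\mathcal{T}_L^B \cup \mathcal{T}, \mathcal{A}) \models D(a)$ does not exceed the size of $\mathcal{A}$.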

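2. There is $i$, $1 \le i \le n$, such that $E_i \sqcap \bar{E}_i$ occurs in $\mathcal{A}$: let $(E_i \sqcap \bar{E}_i)^{\mathcal{A}}$ be the set of individuals $b \in \mathsf{Ind}(\mathcal{A})$ such that both $E_i(b) \in \mathcal{A}$ and $\bar{E}_i(b) \in \mathcal{A}$. By construction of $\mathcal{T}$, for every ABox $\mathcal{A}$ and every $\mathcal{EL}$ concept expression $D$ over $\Sigma_n$ we have that $(\mathcal{T}, \mathcal{A}) \models D(b)$, where $b \in (E_i \sqcap \bar{E}_i)^{\mathcal{A}}$. Then, in particular, for every $L \in \mathcal{L}_n$ we have that $(\mathcal{T}_L^B \cup \mathcal{T}, \mathcal{A}) \models D(b)$. For $a \in \mathsf{Ind}(\mathcal{A}) \setminus (E_i \sqcap \bar{E}_i)^{\mathcal{A}}$ we make a case distinction:

– there is a role chain from $a$ to some $b \in (E_i \sqcap \bar{E}_i)^{\mathcal{A}}$: by definition of $\mathcal{T}$, as $(E_i \sqcap \bar{E}_i) \sqsubseteq A$ for every $1 \le i \le n$ and every $A \in \Sigma_n \cap \mathsf{N_C}$, we have that $(\mathcal{T}, \mathcal{A}) \models (E_1 \sqcap \bar{E}_1)(b)$. Then, since $\{\exists r.(E_1 \sqcap \bar{E}_1) \sqsubseteq (E_1 \sqcap \bar{E}_1),\ \exists s.(E_1 \sqcap \bar{E}_1) \sqsubseteq (E_1 \sqcap \bar{E}_1)\} \subseteq \mathcal{T}$, we have that $(\mathcal{T}, \mathcal{A}) \models (E_1 \sqcap \bar{E}_1)(a)$. In this case, by the argument above, for every $L \in \mathcal{L}_n$ and every $\mathcal{EL}$ concept expression $D$ over $\Sigma_n$, we have that $(\mathcal{T}_L^B \cup \mathcal{T}, \mathcal{A}) \models D(a)$.

– for all $b \in (E_i \sqcap \bar{E}_i)^{\mathcal{A}}$, there is no role chain from $a$ to $b$: let $\mathcal{A}' = \mathcal{A} \setminus \{E_i(b), \bar{E}_i(b) \mid b \in (E_i \sqcap \bar{E}_i)^{\mathcal{A}}\}$. Since no such $b$ is reachable from $a$ by a role chain, we have that, for every $\mathcal{EL}$ concept expression $D$, $\mathcal{A} \models D(a)$ iff $\mathcal{A}' \models D(a)$. By definition of $\mathcal{A}'$, $E_i \sqcap \bar{E}_i$ does not occur in $\mathcal{A}'$, so the lemma follows as in Case 1.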

□

The next lemma from [12] prepares the proof of Lemma 8.

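Lemma 24 ([12]). For any $0 \le i \le n$ and $\Sigma_n$-concept $D$, if $\mathcal{T}_0 \not\models X_i \sqsubseteq D$ then there exists a sequence of role names $t_1, \dots, t_l$ such that $\models D \sqsubseteq \exists t_1.\cdots\exists t_l.Y$ and $\mathcal{T}_0 \not\models X_i \sqsubseteq \exists t_1.\cdots\exists t_l.Y$, where $Y$ is either $\top$ or a concept name, $0 \le l \le n - i + 1$.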

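Proof of Lemma 8. For any $n > 1$ and any $\mathcal{EL}$ TBox $\mathcal{H}$ in $\Sigma_n$ with $|\mathcal{H}| < 2^n$, there exists an ABox $\mathcal{A}$, an individual $a \in \mathsf{Ind}(\mathcal{A})$ and an $\mathcal{EL}$ concept expression $D$ over $\Sigma_n$ such that (i) the size of $\mathcal{A}$ plus the size of $D$ does not exceed $6n$ and (ii) if $(\mathcal{H}, \mathcal{A}) \models D(a)$ then $(\mathcal{T}_L^B, \mathcal{A}) \models D(a)$ for at most one $L \in \mathcal{L}_n$ and if $(\mathcal{H}, \mathcal{A}) \not\models D(a)$ then for every $L \in \mathcal{L}_n$ we have $(\mathcal{T}_L^B \cup \mathcal{T}, \mathcal{A}) \models D(a)$.

Proof. We have $\mathcal{T}_L^B \cup \mathcal{T} \models C \sqsubseteq D$ iff $(\mathcal{T}_L^B \cup \mathcal{T}, \mathcal{A}_{C,a}) \models D(a)$, where $\mathcal{A}_{C,a}$ is an ABox with canonical model isomorphic to the tree interpretation of $C$ with root $\rho_C$ mapped to $a \in \mathsf{Ind}(\mathcal{A})$. Hence, to prove this lemma it suffices to show the following claim.

Claim 1. For any $n > 1$ and any $\mathcal{EL}$ TBox $\mathcal{H}$ in $\Sigma_n$ with $|\mathcal{H}| < 2^n$, there exists an $\mathcal{EL}$ concept inclusion $C \sqsubseteq D$ over $\Sigma_n$ such that (i) the size of $C \sqsubseteq D$ does not exceed $6n$ and (ii) if $\mathcal{H} \models C \sqsubseteq D$ then $\mathcal{T}_L^B \cup \mathcal{T} \models C \sqsubseteq D$ for at most one $L \in \mathcal{L}_n$ and if $\mathcal{H} \not\models C \sqsubseteq D$ then for every $L \in \mathcal{L}_n$ we have $\mathcal{T}_L^B \cup \mathcal{T} \models C \sqsubseteq D$.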

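We define an exponentially large TBox $\mathcal{T}$ and use it to prove that one can select an $\mathcal{EL}$ concept inclusion $C \sqsubseteq D$ in such a way that either $\mathcal{H} \models C \sqsubseteq D$ and $\mathcal{T} \not\models C \sqsubseteq D$, or vice versa. Then, the oracle can return $(\mathcal{A}_{C,a}, D(a))$ as a counterexample, where $\mathcal{A}_{C,a}$ is the tree-shaped ABox corresponding to the $\mathcal{EL}$ concept expression $C$ rooted in $a \in \mathsf{Ind}(\mathcal{A})$.

To define $\mathcal{T}$, for any sequence $\mathbf{b} = b_1 \dots b_n$, where every $b_i$ is either $0$ or $1$, we denote by $C_{\mathbf{b}}$ the conjunction $\bigsqcap_{i \le n} C_i$, where $C_i = A_i$ if $b_i = 1$ and $C_i = B_i$ if $b_i = 0$.

Then we define

$\mathcal{T} = \mathcal{T}_0 \cup \{C_{\mathbf{b}} \sqsubseteq A \sqcap X_0 \mid \mathbf{b} \in \{0,1\}^n\}$.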
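To make the size of this TBox concrete, the sketch below enumerates the added inclusions for a small $n$. Concept names are represented as plain strings such as `"A1"` and `"B2"`, and both function names are hypothetical; only the shape of $C_{\mathbf{b}}$ is taken from the definition above.

```python
from itertools import product

def conjunction_for(bits):
    """Return the conjuncts of C_b: A_i if b_i = 1 and B_i if b_i = 0 (1-indexed)."""
    return tuple(f"A{i}" if bit == 1 else f"B{i}" for i, bit in enumerate(bits, start=1))

def exponential_inclusions(n):
    """Yield one (lhs, rhs) pair per b in {0,1}^n, standing for C_b and (A and X0)."""
    for bits in product((0, 1), repeat=n):
        yield conjunction_for(bits), ("A", "X0")

# Each left-hand side has exactly n conjuncts, and there are 2**n of them.
inclusions = list(exponential_inclusions(3))
```

Every left-hand side is short (linear in $n$), while the number of inclusions doubles with each increment of $n$; this gap is what the counting argument in case (3) below exploits.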

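Let $\mathcal{A}_{C_{\mathbf{b}}}$ be the ABox corresponding to a concept expression $C_{\mathbf{b}}$, as defined above. Since, for all $i$, $1 \le i \le n$, $E_i \sqcap \bar{E}_i$ does not occur in $\mathcal{A}_{C_{\mathbf{b}}}$, we have that $(\mathcal{T}_L^B \cup \mathcal{T}, \mathcal{A}_{C_{\mathbf{b}}}) \models C(a)$ iff $(\mathcal{T}_L^B, \mathcal{A}_{C_{\mathbf{b}}}) \models C(a)$. Hence, in the following we only consider $\mathcal{T}_L^B$. Consider the possibilities for $\mathcal{H}$ and $\mathcal{T}$.

(1) If $\mathcal{H} \not\models \mathcal{T}$ then there exists an inclusion $C \sqsubseteq D \in \mathcal{T}$ such that $\mathcal{H} \not\models C \sqsubseteq D$.

Clearly, $C \sqsubseteq D$ is entailed by $\mathcal{T}_L^B$, for every $L \in \mathcal{L}_n$, and the size of $C \sqsubseteq D$ does not exceed $6n$, so $C \sqsubseteq D$ is as required.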

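(2) Suppose that for some $\mathbf{b} \in \{0,1\}^n$ and a concept expression of the form $\exists t.D'$ we have $\mathcal{H} \models C_{\mathbf{b}} \sqsubseteq \exists t.D'$ and $\mathcal{T} \not\models C_{\mathbf{b}} \sqsubseteq \exists t.D'$. To 'minimise' $C_{\mathbf{b}} \sqsubseteq \exists t.D'$, notice that $\mathcal{T}_0 \not\models X_0 \sqsubseteq \exists t.D'$. Then, by Lemma 24, there exists a sequence of role names $t_1, \dots, t_l$, for $0 \le l \le n + 1$, and $Y$ being $\top$ or a concept name, such that $\models \exists t.D' \sqsubseteq \exists t_1.\cdots\exists t_l.Y$, so $\mathcal{H} \models C_{\mathbf{b}} \sqsubseteq \exists t_1.\cdots\exists t_l.Y$, and $\mathcal{T}_0 \not\models X_0 \sqsubseteq \exists t_1.\cdots\exists t_l.Y$. Clearly, the size of $C_{\mathbf{b}} \sqsubseteq \exists t_1.\cdots\exists t_l.Y$ does not exceed $6n$. It remains to prove that $\mathcal{T}_L^B \models C_{\mathbf{b}} \sqsubseteq \exists t_1.\cdots\exists t_l.Y$ for at most one $L \in \mathcal{L}_n$.

Suppose for some $L \in \mathcal{L}_n$ we have $\mathcal{T}_L^B \models C_{\mathbf{b}} \sqsubseteq \exists t_1.\cdots\exists t_l.Y$. By Lemma 19, there is $A_j$ or $B_j$ such that $\mathcal{T}_L^B \models A_j \sqsubseteq \exists t_1.\cdots\exists t_l.Y$ (or $\mathcal{T}_L^B \models B_j \sqsubseteq \exists t_1.\cdots\exists t_l.Y$, respectively). As $\mathcal{T}_0 \not\models X_0 \sqsubseteq \exists t_1.\cdots\exists t_l.Y$, it is easy to see that this is only possible when $l = n$, $(t_1, t_2, \dots, t_n) = \boldsymbol{\sigma}_j$, and $Y$ is implied by $B$. Since every $\boldsymbol{\sigma}_j$ is unique, for every $L' \in \mathcal{L}_n$ such that $L' \ne L$ we have $\mathcal{T}_{L'}^B \not\models C_{\mathbf{b}} \sqsubseteq \exists\boldsymbol{\sigma}_j.Y$.

Thus, $C_{\mathbf{b}} \sqsubseteq \exists t_1.\cdots\exists t_l.Y$ is as required.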

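(3) Finally, suppose that Cases 1 and 2 above do not apply. Then $\mathcal{H} \models \mathcal{T}$ and for every $\mathbf{b} \in \{0,1\}^n$ and every $\mathcal{EL}$ concept expression over $\Sigma_n$ of the form $\exists t.D'$: if $\mathcal{H} \models C_{\mathbf{b}} \sqsubseteq \exists t.D'$ then $\mathcal{T}_0 \models X_0 \sqsubseteq \exists t.D'$. We show that unless there exists an inclusion $C \sqsubseteq D$ satisfying the conditions of the lemma, $\mathcal{H}$ contains at least $2^n$ different inclusions, which contradicts $|\mathcal{H}| < 2^n$.

Fix $\mathbf{b} \in \{0,1\}^n$. As $\mathcal{H} \models \mathcal{T}$ we have $\mathcal{H} \models C_{\mathbf{b}} \sqsubseteq A$. Then there must exist at least one inclusion $C \sqsubseteq A \sqcap D \in \mathcal{H}$ such that $\mathcal{H} \models C_{\mathbf{b}} \sqsubseteq C$ and $\not\models C \sqsubseteq A$. Let $C = Z_1 \sqcap \dots \sqcap Z_m \sqcap \exists t_1.C'_1 \sqcap \dots \sqcap \exists t_l.C'_l$, where $Z_1, \dots, Z_m$ are different concept names. As $\mathcal{H} \models C_{\mathbf{b}} \sqsubseteq \exists t_j.C'_j$ we have $\mathcal{T}_0 \models X_0 \sqsubseteq \exists t_j.C'_j$, for $j = 1, \dots, l$. As $\mathcal{H} \models \mathcal{T}$ we have $\mathcal{H} \models X_0 \sqsubseteq \exists t_j.C'_j$, for $j = 1, \dots, l$. So $\mathcal{H} \models Z_1 \sqcap \dots \sqcap Z_m \sqcap X_0 \sqsubseteq A$.

Suppose that for some $i$ with $1 \le i \le n$ there exists no $j$ with $1 \le j \le m$ such that $Z_j$ is either $A_i$ or $B_i$. Then we have $\mathcal{T}_L^B \not\models Z_1 \sqcap \dots \sqcap Z_m \sqcap X_0 \sqsubseteq A$, for any $L \in \mathcal{L}_n$. Notice that in the worst case $Z_1 \sqcap \dots \sqcap Z_m$ contains the conjunction of all $\Sigma_n$-concept names except $A_i$, $B_i$, so the size of $Z_1 \sqcap \dots \sqcap Z_m \sqcap X_0 \sqsubseteq A$ does not exceed $6n$, and $Z_1 \sqcap \dots \sqcap Z_m \sqcap X_0 \sqsubseteq A$ is as required.

Assume that $Z_1 \sqcap \dots \sqcap Z_m \sqcap X_0$ contains a conjunct $B_i$ such that $b_i \ne 0$. Then $\mathcal{H} \models C_{\mathbf{b}} \sqsubseteq B_i$ and for no $L \in \mathcal{L}_n$ we have $\mathcal{T}_L^B \models C_{\mathbf{b}} \sqsubseteq B_i$. The size of $C_{\mathbf{b}} \sqsubseteq B_i$ does not exceed $6n$, so it is as required.

Assume that $Z_1 \sqcap \dots \sqcap Z_m \sqcap X_0$ contains a conjunct $A_i$ such that $b_i \ne 1$. Then $\mathcal{H} \models C_{\mathbf{b}} \sqsubseteq A_i$ and for no $L \in \mathcal{L}_n$ we have $\mathcal{T}_L^B \models C_{\mathbf{b}} \sqsubseteq A_i$. The size of $C_{\mathbf{b}} \sqsubseteq A_i$ does not exceed $6n$, so it is as required.

The only remaining option is that $Z_1 \sqcap \dots \sqcap Z_m \sqcap X_0$ contains exactly the $A_i$ with $b_i = 1$ and exactly the $B_i$ with $b_i = 0$.

This argument applies to arbitrary $\mathbf{b} \in \{0,1\}^n$. Thus, if there exists no inclusion $C \sqsubseteq D$ satisfying the conditions of the lemma, then $\mathcal{H}$ contains at least $2^n$ inclusions.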

□

Proof of Theorem 5. The $\mathcal{EL}$ data retrieval framework is not polynomially exact learnable.

Proof. Assume that TBoxes are polynomial time learnable in the open data model. Then there exists a learning algorithm whose running time is bounded at any stage by a polynomial $p(n, m)$. Choose $n$ such that $\lfloor 2^n / n \rfloor > (p(n, 6n))^2$ and let $S_1 = \mathcal{L}_n$ and $S_2 = \mathcal{B}_n$. We follow Angluin's strategy of removing elements from $S_1$ and $S_2$ in such a way that the learner cannot distinguish between any of the remaining TBoxes $\mathcal{T}_L^B$ encoded by $L \in S_1$ and $B \in S_2$. The strategy is as follows.

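Given a membership query $(\mathcal{T}_L^B \cup \mathcal{T}, \mathcal{A}) \models D(a)$, with $\mathcal{A} \models C(a)$, if $\mathcal{T}_L^B \cup \mathcal{T} \models C \sqsubseteq D$ for every $L \in \mathcal{L}_n$ and every $B \in \mathcal{B}_n$, then the answer is 'yes'; otherwise the answer is 'no' and all $L \in \mathcal{L}_n$ and $B \in \mathcal{B}_n$ with $\mathcal{T}_L^B \cup \mathcal{T} \models C \sqsubseteq D$ are removed from $S_1$ and $S_2$, respectively. By Lemma 7, at most $|D|$ elements can be removed from $S_1$ or at most $|\mathcal{A}|$ elements can be removed from $S_2$. Given an equivalence query with $\mathcal{H}$, the answer is 'no', and Lemma 8 guarantees a concept inclusion $C \sqsubseteq D$ not entailed by $\mathcal{H}$ such that $\mathcal{T}_{L'}^B \cup \mathcal{T} \models C \sqsubseteq D$ for at most one $L' \in \mathcal{L}_n$. Then a counterexample $(\mathcal{T}, \mathcal{A}) \models D(a)$ with $\mathcal{A} \models C(a)$ and of size bounded by $6n$ is produced (we take the size of a query or a counterexample $(\mathcal{T}, \mathcal{A}) \models D(a)$ to be the size of $\mathcal{A}$ plus the size of the concept expression $D$).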
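The membership-query half of this strategy can be sketched as follows. This is only an illustration under stated assumptions: `entails(L, B)` is a hypothetical oracle deciding whether $\mathcal{T}_L^B \cup \mathcal{T} \models C \sqsubseteq D$ for the query at hand, `all_L` and `all_B` stand for $\mathcal{L}_n$ and $\mathcal{B}_n$, and the function name is invented here.

```python
def answer_membership(entails, all_L, all_B, S1, S2):
    """Answer one membership query and prune the candidate sets S1, S2 in place.

    Returns True ('yes') if every candidate TBox entails the queried inclusion.
    Otherwise returns False ('no') and removes every L and B that occur in
    some entailing pair, so no remaining candidate contradicts the answer.
    """
    if all(entails(L, B) for L in all_L for B in all_B):
        return True  # 'yes': nothing needs to be removed
    for L in all_L:
        for B in all_B:
            if entails(L, B):
                S1.discard(L)
                S2.discard(B)
    return False  # 'no'
```

The point of the pruning is that after a 'no' answer every surviving pair is consistent with it, and Lemma 7 bounds how many candidates a single such answer can eliminate.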

As all counterexamples produced are bounded by $6n$, the overall running time of the algorithm is bounded by $p(n, 6n)$. Hence, the learner asks no more than $p(n, 6n)$ queries and the size of every query does not exceed $p(n, 6n)$. By Lemmas 7 and 8, at most $(p(n, 6n))^2$ elements are removed from $S_1$ and $S_2$ during the run of the algorithm. But then the algorithm cannot distinguish between any TBoxes $\mathcal{T}_L^B$ and $\mathcal{T}_{L'}^{B'}$ for $L \ne L' \in S_1$ and $B \ne B' \in S_2$ based on the given answers, and we have derived a contradiction.

□
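The choice of $n$ in the proof above is always possible because $2^n / n$ eventually outgrows any fixed polynomial in $n$. As a numeric illustration, the sketch below searches for the least such $n$; the concrete bound `p(n, m) = (n + m)**2` is a made-up example, not a bound from the paper, and the helper name `smallest_n` is likewise hypothetical.

```python
def smallest_n(p, start=1, limit=10_000):
    """Return the least n in [start, limit] with floor(2**n / n) > p(n, 6*n)**2."""
    for n in range(start, limit + 1):
        if (2 ** n) // n > p(n, 6 * n) ** 2:
            return n
    raise ValueError("no suitable n found below the limit")

# Hypothetical running-time bound, chosen only for illustration.
def p(n, m):
    return (n + m) ** 2

n = smallest_n(p)
```

For this particular `p` the search stops at a moderate value of `n`; a polynomial of higher degree pushes the threshold further out but never removes it.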