A Model for Learning Description Logic Ontologies Based on Exact Learning

Boris Konev

University of Liverpool, United Kingdom

Ana Ozaki

University of Liverpool, United Kingdom

Frank Wolter

University of Liverpool, United Kingdom

Abstract

We investigate the problem of learning description logic (DL) ontologies in Angluin et al.'s framework of exact learning via queries posed to an oracle. We consider membership queries of the form "is a tuple a of individuals a certain answer to a data retrieval query q in a given ABox and the unknown target ontology?" and completeness queries of the form "does a hypothesis ontology entail the unknown target ontology?". Given a DL L and a data retrieval query language Q, we study polynomial learnability of ontologies in L using data retrieval queries in Q and provide an almost complete classification for DLs that are fragments of EL with role inclusions and of DL-Lite, and for data retrieval queries that range from atomic queries and EL/ELI-instance queries to conjunctive queries. Some results are proved by non-trivial reductions to learning from subsumption examples.

Introduction

Building an ontology is prone to errors, time consuming, and costly. The research community has addressed this problem in many different ways, for example, by supplying tool support for editing ontologies (Musen 2013; Bechhofer et al. 2001; Day-Richter et al. 2007), developing reasoning support for debugging ontologies (Wang et al. 2005; Schlobach et al. 2007), supporting modular ontology design (Stuckenschmidt, Parent, and Spaccapietra 2009), and by investigating automated ontology generation from data or text (Cimiano, Hotho, and Staab 2005; Buitelaar, Cimiano, and Magnini 2005; Lehmann and Völker 2014; Borchmann and Distel 2011; Ma and Distel 2013). One major problem when building an ontology is the fact that domain experts are rarely ontology engineering experts and that, conversely, ontology engineers are typically not familiar with the domain of the ontology.

An ontology building project therefore often relies on the successful communication between an ontology engineer (familiar with the semantics of ontology languages) and a domain expert (familiar with the domain of interest). In this paper, we consider a simple model of this communication process and analyse, within this model, the computational complexity of reaching a correct and complete domain ontology. We assume that

• the domain expert knows the domain ontology and its vocabulary without being able to formalize or communicate this ontology;

• the domain expert is able to communicate the vocabulary of the ontology and shares it with the ontology engineer.

Thus, the domain expert and ontology engineer have a common understanding of the vocabulary of the ontology. The ontology engineer knows nothing else about the domain.

• the ontology engineer can pose queries to the domain expert, which the domain expert answers truthfully. Assuming that the domain expert can interpret data in her area of expertise, the main queries posed by the ontology engineer are based on data retrieval examples:

– assume a data instance A and a data retrieval query q(x) are given. Is the tuple a of individuals a certain answer to the query q(x) in A and the ontology O?

In addition, we require a way for the ontology engineer to find out whether she has reconstructed the target ontology already and, if this is not the case, to request an example illustrating the incompleteness of the reconstruction. We abstract from defining a communication protocol for this, but assume for simplicity that the following query can be posed by the ontology engineer:

– Is this ontology H complete? If not, return a data instance A, a query q(x), and a tuple a such that a is a certain answer to q(x) in A and the ontology O and is not a certain answer to q(x) in A and the ontology H.

Given this model, our question is whether the ontology engineer can learn the target ontology O and which computational resources are required for this, depending on the ontology language in which the ontology O and the hypothesis ontology H are formulated. Our model obviously abstracts from a number of fundamental problems in building ontologies and communicating about them. In particular, it makes the assumption that the domain expert knows the domain ontology and its vocabulary (without being able to formalize it), despite the fact that finding an appropriate vocabulary for a domain of interest is a major problem in ontology design (Lehmann and Völker 2014). We make this assumption here in order to isolate the problem of communication about the logical relationships between known vocabulary items and its dependence on the ontology language within which the relationships can be formulated.


The model described above is an instance of Angluin et al.'s framework of exact learning via queries to an oracle (Angluin 1987). The queries using data retrieval examples can be regarded as membership queries posed by a learner to an oracle, and the completeness query based on a hypothesis H can be regarded as an equivalence query by the learner to the oracle. Formulated in Angluin's terms, we are thus interested in whether there exists a deterministic learning algorithm that poses membership and equivalence queries of the above form to an oracle and that polynomially learns an arbitrary ontology over a given ontology language.

As usual in the exact learning literature, we consider two distinct notions of polynomial learnability: polynomial time learnability and polynomial query learnability. If one can learn TBoxes¹ in a given DL L with a deterministic algorithm using polynomially many polynomial size queries, then we say that TBoxes in L are polynomial query learnable. If one can learn TBoxes in L with a deterministic algorithm in polynomial time, then we say that TBoxes in L are polynomial time learnable. Precise definitions are given below.

Clearly, polynomial time learnability implies polynomial query learnability. The converse does not hold for arbitrary learning problems. Intuitively, when studying polynomial time learnability one takes into account the potentially costly transformations of counterexamples to equivalence queries provided by the oracle, which the learning algorithm has to perform when it analyses the counterexamples. In contrast, when studying polynomial query learnability one abstracts from such intermediate computations and focuses on the learning protocol itself. It turns out that for the DLs considered in this paper there is in many cases no difference between polynomial time and polynomial query learnability; the only exception, however, is rather interesting and will be discussed in detail below.

We investigate polynomial learnability for seven DLs and four query languages. The DLs are EL and its fragments EL_lhs and EL_rhs, in which complex concepts are allowed only on the left-hand and, respectively, right-hand side of concept inclusions. We also consider their extensions ELH, ELH_lhs, and ELH_rhs with role inclusions. In addition, we consider the DL-Lite dialect DL-Lite^H, which is defined as the extension of ELH_rhs with inverse roles. We thus consider significant fragments of the OWL 2 EL and OWL 2 QL profiles of the web ontology language OWL. The introduction of the fragments EL_lhs and EL_rhs is motivated by the fact that EL TBoxes typically cannot be polynomially learned (see below). In data retrieval examples we consider the following standard query languages: atomic queries (AQs), EL-instance queries (EL-IQs), ELI-instance queries (ELI-IQs), and conjunctive queries (CQs).

Our results regarding polynomial query learnability of TBoxes are presented in Table 1. In the table, EL(H) ranges over EL and ELH, and '–' denotes that the query language is not expressive enough to determine a unique (up to logical equivalence) TBox in the corresponding DL using data retrieval examples. Thus, in those cases no learning algorithm exists, whereas in all other cases one can easily construct a learning algorithm that makes exponentially many queries.

¹ In the DL context we identify ontologies with TBoxes.

Note that the table shows that for the EL-dialects polynomial query learnability does not depend on whether role inclusions are present (though some proofs are considerably harder with role inclusions). A particularly interesting result is that EL_rhs TBoxes are polynomially query learnable using IQs in data retrieval examples but not using CQs. Thus, a more expressive language for communication does not always lead to more efficient communication.

The bottom row shows polynomial query learnability results for the case in which concept subsumptions rather than data retrieval examples are used in the communication between the learner and the oracle. Except for polynomial query learnability of ELH_lhs (which we prove in this paper), the results for subsumption are from (Konev et al. 2014).² Our polynomial query learnability results for data retrieval examples are by reductions to learnability using concept subsumptions. Our focus on data retrieval examples rather than subsumptions is motivated by the observation that domain experts are often more familiar with querying data in their domain than with the logical notion of subsumption between complex concepts.

We now discuss our results for polynomial time learnability. As mentioned above, all negative polynomial query learnability results transfer to negative polynomial time learnability results. Moreover, in both the subsumption and the data retrieval frameworks our proofs of positive polynomial query learnability results for EL_lhs, ELH_lhs, EL_rhs, and ELH_rhs actually establish polynomial time learnability. In fact, the only case in which we have not been able to extend a polynomial query learnability result to a polynomial time learnability result is for DL-Lite^H TBoxes: it remains open whether DL-Lite^H TBoxes can be learned in polynomial time using subsumption or ELI-IQs in data retrieval queries. The reason is interesting: checking whether an ELI-IQ is entailed by a DL-Lite^H TBox and ABox is NP-complete in combined complexity (Kikot, Kontchakov, and Zakharyaschev 2011), and such entailment checks are required in our polynomial query learning algorithm to transform counterexamples provided by the oracle. It remains open whether our learning algorithm can be modified in such a way that no such entailment checks are required. In contrast to DL-Lite^H, in ELH_rhs the corresponding entailment problem is in PTime in combined complexity (Bienvenu et al. 2013), and so a polynomial time learning algorithm can use entailment checks.

Finally, we note that the two open problems in Table 1 for polynomial query learnability are open for polynomial time learnability as well.

² The authors of (Konev et al. 2014) consider polynomial time learnability only. As polynomial time learnability implies polynomial query learnability, the corresponding results in Table 1 follow. Note that the learning algorithm for DL-Lite^H TBoxes given in (Konev et al. 2014) only shows polynomial query learnability of DL-Lite^H TBoxes using subsumption queries but does not show polynomial time learnability of DL-Lite^H TBoxes using subsumption queries (it is wrongly assumed that checking T ⊨ C ⊑ D is in PTime for DL-Lite^H TBoxes T and concept inclusions C ⊑ D). In fact, polynomial time learnability of DL-Lite^H TBoxes using subsumption queries is an open problem (see below). All other polynomial time learnability results in (Konev et al. 2014) hold.

Table 1: Positive (✓) and negative (✗) results regarding polynomial query learnability.

  Framework       | EL(H)_lhs | EL(H)_rhs | EL(H) | DL-Lite^H
  Data, AQs       |     ✓     |     –     |   –   |     –
  Data, EL-IQs    |     ✓     |     ✓     |   ✗   |     –
  Data, ELI-IQs   |     ✓     |     ✓     |   ?   |     ✓
  Data, CQs       |     ✓     |     ✗     |   ✗   |     ?
  Subsumption     |     ✓     |     ✓     |   ✗   |     ✓

Throughout this paper we focus on polynomial query learnability and only provide a brief discussion of our polynomial time learnability results. A more detailed discussion of polynomial time learnability as well as other proof details are provided in an appendix of this paper, available from http://cgi.csc.liv.ac.uk/frank/publ/publ.html

Related Work. Apart from Angluin's classical learning algorithm for propositional Horn, we highlight investigations of exact learnability of fragments of FO Horn (Reddy and Tadepalli 1999; Arias and Khardon 2002; Arias, Khardon, and Maloberti 2007; Selman and Fern 2011) and, more recently, schema mappings (ten Cate, Dalmau, and Kolaitis 2012). ELH_lhs can be seen as a fragment of FO Horn which, in contrast to many existing approaches, allows recursion and does not impose bounds on the number of variables per clause. In DL, exact learning has been studied for the description logic CLASSIC in (Frazier and Pitt 1996; Cohen and Hirsh 1994), where it is shown that CLASSIC concept expressions (but not TBoxes) can be learned in polynomial time. In this case, membership queries ask whether the target concept subsumes a given concept. Related work on machine learning in DL also includes learning DL concept expressions using refinement operators (Lehmann and Hitzler 2010) and completing knowledge bases using formal concept analysis (Baader et al. 2007).

All exact learning frameworks for logical theories considered so far are based on interpretations (Angluin, Frazier, and Pitt 1992; ten Cate, Dalmau, and Kolaitis 2012; Klarman and Britz 2015) or entailments (Frazier and Pitt 1993; Reddy and Tadepalli 1998; Arias and Khardon 2002). In this paper we introduce a new class of examples based on certain answers to data retrieval queries.

Preliminaries

Let N_C and N_R be countably infinite sets of concept and role names, respectively. We begin by introducing members of the EL family of DLs (Baader, Brandt, and Lutz 2005).

An EL concept expression is formed according to the rule C, D ::= A | ⊤ | C ⊓ D | ∃r.C, where A ranges over N_C and r ranges over N_R. An EL concept inclusion (CI) takes the form C ⊑ D, where C and D are EL concept expressions. An EL TBox T is a finite set of EL CIs. An EL role inclusion (RI) takes the form r ⊑ s, where r, s ∈ N_R, and an EL RBox R is a finite set of EL role inclusions. The union of an EL TBox and an EL RBox is called an ELH TBox. We also consider the fragments EL_lhs and EL_rhs of EL in which concepts on the right-hand side and, respectively, left-hand side of CIs must be concept names. Thus, ∃r.A ⊑ B is an EL_lhs CI but not an EL_rhs CI, and A ⊑ ∃r.B is an EL_rhs CI but not an EL_lhs CI. By ELH_lhs and ELH_rhs we denote the extensions of these fragments with EL RIs.

A role is a role name or an inverse role r⁻ with r ∈ N_R. The language DL-Lite^H is obtained from ELH_rhs by admitting both role names and inverse roles in concept expressions and in role inclusions and by admitting, in addition to concept names, basic concepts ∃r.⊤, with r a role, on the left-hand side of CIs. Call an EL concept expression using inverse roles an ELI concept expression. Then DL-Lite^H coincides with the extension of the language DL-Lite_R (without disjointness constraints) introduced in (Calvanese et al. 2007) with arbitrary ELI concept expressions on the right-hand side of CIs.

The signature Σ_T of a TBox T is the set of concept and role names that occur in T.

In description logics, data are stored in ABoxes. Let N_I be a countably infinite set of individual names. An ABox A is a finite non-empty set containing assertions A(a) and r(a, b), where a, b are individuals in N_I, A is a concept name, and r is a role. Ind(A) denotes the set of individuals that occur in A. A is a singleton ABox if it contains only one ABox assertion.

We consider the main query languages for retrieving data from ABoxes using DL TBoxes. An atomic query (AQ) q takes the form A(a) or r(a, b), where A ∈ N_C, r ∈ N_R, and a, b ∈ N_I. An EL-instance query (EL-IQ) q takes the form C(a) or r(a, b), where C is an EL concept expression, r ∈ N_R, and a, b ∈ N_I. ELI-instance queries (ELI-IQs) are defined in the same way by replacing EL concept expressions with ELI concept expressions. Finally, a conjunctive query (CQ) q is a first-order sentence ∃x φ(a, x), where φ is a conjunction of atoms of the form r(t1, t2) or A(t), and where t1, t2, t can be individual names from the tuple a or individual variables from the tuple x. We often slightly abuse notation and denote by AQ the set of AQs, and similarly for EL-IQs, ELI-IQs, and CQs.

The size of a concept expression C (TBox T, ABox A, query q), denoted by |C| (and, respectively, |T|, |A|, and |q|), is the length of the word that represents it.
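For illustration only (this sketch is ours, not part of the paper), the following Python fragment shows one way to represent EL concept expressions, concept inclusions, and their sizes as plain data structures; the exact symbol-counting convention in size() is an assumption, since the text only fixes the size as the length of the representing word.

from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Top:            # the concept ⊤
    pass

@dataclass(frozen=True)
class Name:           # a concept name A
    name: str

@dataclass(frozen=True)
class And:            # C ⊓ D
    left: "Concept"
    right: "Concept"

@dataclass(frozen=True)
class Exists:         # ∃r.C
    role: str
    filler: "Concept"

Concept = Union[Top, Name, And, Exists]

@dataclass(frozen=True)
class CI:             # a concept inclusion C ⊑ D
    lhs: Concept
    rhs: Concept

def size(c: Concept) -> int:
    """One possible size measure: count one symbol per name/operator occurrence."""
    if isinstance(c, (Top, Name)):
        return 1
    if isinstance(c, And):
        return size(c.left) + size(c.right) + 1
    if isinstance(c, Exists):
        return size(c.filler) + 2   # one symbol for ∃ plus one for the role name
    raise TypeError(c)

# Example: ∃r.A ⊑ B is an EL_lhs CI, A ⊑ ∃r.B is an EL_rhs CI.
ci_lhs = CI(Exists("r", Name("A")), Name("B"))
ci_rhs = CI(Name("A"), Exists("r", Name("B")))
print(size(ci_lhs.lhs), size(ci_rhs.rhs))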

The semantics of DLs is defined as usual (Baader et al. 2003). For an interpretation I, we write I ⊨ α to state that a CI, RI, ABox assertion, or query α is true in I. An interpretation I is a model of a knowledge base (KB) (T, A) if I ⊨ α for all α ∈ T ∪ A. We set (T, A) ⊨ α, and say that α is entailed by (T, A), if I ⊨ α for all models I of (T, A).

A learning framework F is a triple (X, L, μ), where X is a set of examples (also called the domain or instance space), L is a set of learning concepts, and μ is a mapping from L to 2^X. Given a DL L, the subsumption learning framework F_S(L), studied in (Konev et al. 2014), is defined as (X, L, μ), where L is the set of all TBoxes that are formulated in L; X is the set of concept and role inclusions α that can occur in TBoxes of L; and μ(T) is defined as {α ∈ X | T ⊨ α}, for every T ∈ L. It should be clear that μ(T) = μ(T') iff the TBoxes T and T' entail the same set of inclusions, that is, they are logically equivalent.

For a DL L and a query language Q, we study the data retrieval learning framework F_D(L, Q) defined as (X, L, μ), where L is again the set of all TBoxes that are formulated in L; X is the set of data retrieval examples of the form (A, q), where A is an ABox and q ∈ Q; and μ(T) = {(A, q) ∈ X | (T, A) ⊨ q}. We only consider data retrieval frameworks F_D(L, Q) in which μ(T) = μ(T') iff the TBoxes T and T' are logically equivalent. Note that this is not the case for the pairs (L, AQ) with L among EL(H)_rhs, EL(H), DL-Lite^H, and for the pair (DL-Lite^H, EL-IQ) (see Table 1). For example, for the EL TBoxes T1 = {A ⊑ ∃r.⊤} and T2 = {A ⊑ ∃r.∃r.⊤} we have (T1, A) ⊨ q iff (T2, A) ⊨ q for every ABox A and AQ q. Thus, T1 and T2 cannot be distinguished using data retrieval examples based on AQs, and so EL TBoxes cannot be learned using such examples.

We now give a formal definition of polynomial query learnability within a learning framework. Given a learning framework F = (X, L, μ), we are interested in the exact identification of a target learning concept l ∈ L by posing queries to oracles. Let MEM_{l,X} be the oracle that takes as input some x ∈ X and returns 'yes' if x ∈ μ(l) and 'no' otherwise. We say that x is a positive example for l if x ∈ μ(l) and a negative example for l if x ∉ μ(l). A membership query is a call to the oracle MEM_{l,X}. Similarly, for every l ∈ L, we denote by EQ_{l,X} the oracle that takes as input a hypothesis learning concept h ∈ L and returns 'yes' if μ(h) = μ(l), or a counterexample x ∈ μ(h) ⊕ μ(l) otherwise, where ⊕ denotes the symmetric set difference. An equivalence query is a call to the oracle EQ_{l,X}.

We say that a learning framework (X, L, μ) is exact learnable if there is an algorithm A such that, for any target l ∈ L, the algorithm A always halts and outputs l' ∈ L with μ(l) = μ(l'), using membership and equivalence queries answered by the oracles MEM_{l,X} and EQ_{l,X}, respectively. A learning framework (X, L, μ) is polynomial query exact learnable if it is exact learnable by an algorithm A such that, at any stage in a run, the sum of the sizes of the inputs to membership and equivalence queries made by A up to that stage is bounded by a polynomial p(|l|, |x|), where l is the target and x ∈ X is the largest counterexample seen so far (Arias 2004).

An important class of learning algorithms, to which in particular all algorithms presented in (Konev et al. 2014; Frazier and Pitt 1993; Reddy and Tadepalli 1998) belong, consists of those algorithms in which the hypothesis h of any equivalence query is of polynomial size in l and satisfies μ(h) ⊆ μ(l). The counterexamples returned by the EQ_{l,X} oracle are then always positive. We say that such algorithms use positive bounded equivalence queries. The learning algorithms studied in this paper are of this kind and, therefore, the equivalence queries posed to the domain expert are in fact completeness queries that ask whether the hypothesis entails the target TBox.
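As an illustration (ours, not the paper's), the following Python sketch shows the overall shape of an exact learning run with the two oracles: the learner repeatedly poses an equivalence query and, while processing the returned counterexample, may pose further membership queries. The oracle interfaces and the refine step are placeholders for whatever concrete DL and query language is used.

from typing import Callable, Optional, TypeVar

Example = TypeVar("Example")
Hypothesis = TypeVar("Hypothesis")

def exact_learn(
    membership: Callable[[Example], bool],                    # MEM_{l,X}: is x in μ(l)?
    equivalence: Callable[[Hypothesis], Optional[Example]],   # EQ_{l,X}: None means 'yes', else a counterexample
    initial: Hypothesis,
    refine: Callable[[Hypothesis, Example, Callable[[Example], bool]], Hypothesis],
) -> Hypothesis:
    """Generic exact-learning loop: ask equivalence queries until the oracle says 'yes';
    each counterexample is processed (possibly using membership queries) to build the
    next hypothesis."""
    h = initial
    while True:
        counterexample = equivalence(h)
        if counterexample is None:   # μ(h) = μ(l): target identified up to equivalence
            return h
        # With positive bounded equivalence queries the counterexample is always positive,
        # i.e. it is entailed by the target but not by the current hypothesis.
        h = refine(h, counterexample, membership)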

Polynomial Query Learnability

In this section we prove the positive results presented in Table 1 for the data retrieval setting by reduction to the subsumption setting. We employ the following result, which is based on (Konev et al. 2014) except for F_S(ELH_lhs), which is proved in the appendix by extending the proof for F_S(EL_lhs) in (Konev et al. 2014).

Figure 1: An ABox A = {r(a, a), s(a, a), A(a)} and its unravelling up to level n (a full binary tree of depth n with all nodes labelled A and edges labelled r and s).

Theorem 1. The subsumption learning frameworks F_S(EL_lhs), F_S(ELH_lhs), F_S(EL_rhs), F_S(ELH_rhs), and F_S(DL-Lite^H) are polynomial query exact learnable with membership and positive bounded equivalence queries.

We begin by illustrating the idea of the reduction for EL_lhs and AQs. To learn a TBox from data retrieval examples we run a learning-from-subsumptions algorithm as a 'black box'. Every time the learning-from-subsumptions algorithm makes a membership or an equivalence query, we rewrite the query into the data setting and pass it on to the data retrieval oracle. The oracle's answer, rewritten back to the subsumption setting, is given to the learning-from-subsumptions algorithm. When the learning-from-subsumptions algorithm terminates, we return the learnt TBox. This reduction is made possible by the close relationship between data retrieval and subsumption examples. For every TBox T and inclusion C ⊑ B, one can interpret the concept expression C as a labelled tree and encode this tree as an ABox A_C with root ρ_C such that T ⊨ C ⊑ B iff (T, A_C) ⊨ B(ρ_C).

Membership queries in the subsumption setting can then be answered with the help of a data retrieval oracle due to the relation between subsumptions and AQs described above: an inclusion C ⊑ B is a (positive) subsumption example for some target TBox T if, and only if, (A_C, B(ρ_C)) is a (positive) data retrieval example for the same target T. To handle equivalence queries, we need to be able to rewrite data retrieval counterexamples returned by the data retrieval oracle into the subsumption setting. For every TBox T and data retrieval query (A, B(a)) one can construct a concept expression C_A such that (T, A) ⊨ B(a) iff T ⊨ C_A ⊑ B. Such a concept expression C_A can be obtained by unravelling A into a tree-shaped ABox and representing it as a concept expression. This unravelling, however, can increase the ABox size exponentially. Thus, to obtain a polynomial query bound on the learning process, C_A ⊑ B cannot simply be returned as an answer to a subsumption equivalence query.

For example, for a target TBox T = {∃rⁿ.A ⊑ B} and a hypothesis H = ∅, the data retrieval query (A, B(a)), where A = {r(a, a), s(a, a), A(a)}, is a positive counterexample. The tree-shaped unravelling of A up to level n is a full binary tree of depth n, as shown in Fig. 1. On the other hand, the non-equivalence of T and H can already be witnessed by (A', B(a)), where A' = {r(a, a), A(a)}. The unravelling of A' up to level n produces a linear size ABox {r(a, a_2), r(a_2, a_3), ..., r(a_{n-1}, a_n), A(a), A(a_2), ..., A(a_n)}, corresponding to the left-most path in Fig. 1, which, in turn, is of linear size w.r.t. the target inclusion ∃rⁿ.A ⊑ B. Notice that A' is obtained from A by removing the s(a, a) edge and checking, using membership queries, whether (T, A') ⊨ q still holds. In other words, one might need to ask further membership queries in order to rewrite answers to data retrieval equivalence queries given by the data retrieval oracle into the subsumption setting.
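A minimal sketch of the unravelling used in this example (our own illustration; the assertion encoding and the naming of fresh individuals by paths are our assumptions): starting from an individual, the ABox is unfolded into a tree-shaped ABox up to a given depth. For A = {r(a, a), s(a, a), A(a)} this produces the full binary tree of Fig. 1, while for A' = {r(a, a), A(a)} it produces a single path.

from typing import Set, Tuple

Assertion = Tuple  # ("C", concept_name, individual) or ("R", role, individual, individual)

def unravel(abox: Set[Assertion], root: str, depth: int) -> Set[Assertion]:
    """Unfold an ABox into a tree-shaped ABox of the given depth, starting at `root`.
    Fresh individuals are named by the path that leads to them."""
    result: Set[Assertion] = set()

    def copy_concepts(ind: str, node: str) -> None:
        for a in abox:
            if a[0] == "C" and a[2] == ind:
                result.add(("C", a[1], node))

    def expand(ind: str, node: str, d: int) -> None:
        copy_concepts(ind, node)
        if d == 0:
            return
        for a in abox:
            if a[0] == "R" and a[2] == ind:
                role, succ = a[1], a[3]
                child = node + "." + role + "." + succ
                result.add(("R", role, node, child))
                expand(succ, child, d - 1)

    expand(root, root, depth)
    return result

abox_full = {("R", "r", "a", "a"), ("R", "s", "a", "a"), ("C", "A", "a")}
abox_path = {("R", "r", "a", "a"), ("C", "A", "a")}
print(len(unravel(abox_full, "a", 3)))  # full binary tree: grows exponentially with the depth
print(len(unravel(abox_path, "a", 3)))  # single path: grows linearly with the depth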

We address the need for rewriting counterexamples by introducing an abstract notion of reduction between exact learning frameworks. To simplify, we assume that both learning frameworks use the same set of learning concepts L and only consider positive bounded equivalence queries. We say that a learning framework F = (X, L, μ) positively polynomial query reduces to F' = (X', L, μ') if, for any l, h ∈ L, μ(h) ⊆ μ(l) if, and only if, μ'(h) ⊆ μ'(l); and for some polynomials p1(·), p2(·), and p3(·,·) there exist a function f_MEM : X' → X, translating an F' membership query to F, and a partial function f_EQ : L × L × X → X', defined for every (l, h, x) such that |h| ≤ p1(|l|) and translating an answer to an F equivalence query to F', such that:

• for all x' ∈ X' we have x' ∈ μ'(l) iff f_MEM(x') ∈ μ(l);

• for all x ∈ X we have x ∈ μ(l) \ μ(h) iff f_EQ(l, h, x) ∈ μ'(l) \ μ'(h);

• |f_MEM(x')| ≤ p2(|x'|);

• the sum of the sizes of the inputs to queries used to compute f_EQ(l, h, x) is bounded by p3(|l|, |x|), |f_EQ(l, h, x)| ≤ p3(|l|, |x|), and l can only be accessed by calls to the oracle MEM_{l,X}.

Note that even though f_EQ takes h as input, the polynomial query bound on computing f_EQ(l, h, x) does not depend on the size of h, as f_EQ is only defined for h polynomial in the size of l.

Theorem 2. Let F = (X, L, μ) and F' = (X', L, μ') be learning frameworks. If there exists a positive polynomial query reduction from F to F' and a polynomial query learning algorithm for F' that uses membership queries and positive bounded equivalence queries, then F is polynomial query exact learnable.

We use Theorem 2 to prove polynomial query learnability of F_D(DL-Lite^H, ELI-IQ) and F_D(ELH_lhs, AQ) by reduction to F_S(DL-Lite^H) and, respectively, F_S(ELH_lhs). The remaining positive results in Table 1 are similar and given in the appendix.

The function f_MEM required in Theorem 2 is easily defined by setting f_MEM(r ⊑ s) := ({r(a, b)}, s(a, b)) (for distinct a, b ∈ N_I) and f_MEM(C ⊑ D) := (A_C, D(ρ_C)), since (T, {r(a, b)}) ⊨ s(a, b) iff T ⊨ r ⊑ s and (T, A_C) ⊨ D(ρ_C) iff T ⊨ C ⊑ D.
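The following sketch (ours, not from the paper; the tuple-based concept syntax and individual names are our assumptions) shows the encoding behind f_MEM: an EL concept expression, viewed as a labelled tree, is turned into an ABox A_C with a root individual ρ_C, so that the subsumption membership query C ⊑ D becomes the data retrieval membership query (A_C, D(ρ_C)).

from itertools import count
from typing import Set, Tuple

# EL concepts as nested tuples: ("top",), ("name","A"), ("and",C,D), ("exists","r",C)
Assertion = Tuple  # ("C", concept_name, individual) or ("R", role, individual, individual)

def abox_of_concept(concept, root: str = "rho_C") -> Set[Assertion]:
    """Encode an EL concept expression as a tree-shaped ABox A_C with root individual `root`."""
    fresh = count()
    abox: Set[Assertion] = set()

    def visit(c, node: str) -> None:
        tag = c[0]
        if tag == "top":
            return                          # ⊤ contributes no assertion
        if tag == "name":
            abox.add(("C", c[1], node))
        elif tag == "and":
            visit(c[1], node)
            visit(c[2], node)
        elif tag == "exists":
            child = f"x{next(fresh)}"
            abox.add(("R", c[1], node, child))
            visit(c[2], child)

    visit(concept, root)
    return abox

# Example: C = ∃r.(A ⊓ ∃s.B); then T ⊨ C ⊑ D iff (T, A_C) ⊨ D(rho_C).
C = ("exists", "r", ("and", ("name", "A"), ("exists", "s", ("name", "B"))))
print(sorted(abox_of_concept(C)))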

Conversely, given a positive counterexample (A, r(a, b)) (that is, (T, A) ⊨ r(a, b) and (H, A) ⊭ r(a, b) for target TBox T and hypothesis H), there always exists s(a, b) ∈ A such that ({s(a, b)}, r(a, b)) is a positive counterexample as well. Thus, we define f_EQ(T, H, (A, r(a, b))) := s ⊑ r. In what follows we define the image of f_EQ for counterexamples of the form (A, C(a)).

Algorithm 1: Reducing a positive counterexample
1: function REDUCECOUNTEREXAMPLE(A, C(a))
2:   Find a role saturated and parent/child merged C(a)
3:   if C = C_0 ⊓ ... ⊓ C_n then
4:     Find C_i, 0 ≤ i ≤ n, such that (H, A) ⊭ C_i(a)
5:     C := C_i
6:   if C = ∃r.C' and there is s(a, b) ∈ A such that
7:       (T, {s(a, b)}) ⊨ r(a, b) and (T, A) ⊨ C'(b) then
8:     REDUCECOUNTEREXAMPLE(A, C'(b))
9:   else
10:    Find a singleton A' ⊆ A such that
11:        (T, A') ⊨ C(a) but (H, A') ⊭ C(a)
12:    return (A', C(a))

Construction of f_EQ for F_D(DL-Lite^H, ELI-IQ). Given a target T and a hypothesis H such that T ⊨ H, Algorithm 1 transforms every positive counterexample (A, C(a)) into a positive counterexample (A', D(b)), where A' ⊆ A is a singleton ABox (i.e., of the form {A(a)} or {r(a, b)}). Using the equivalences (T, {A(b)}) ⊨ D(b) iff T ⊨ A ⊑ D and (T, {r(b, c)}) ⊨ D(b) iff T ⊨ ∃r.⊤ ⊑ D, we then obtain a positive subsumption counterexample, which will be the image of (T, H, (A, C(a))) under f_EQ.

Given a positive data retrieval counterexample (A, C(a)), Algorithm 1 exhaustively applies the role saturation and parent/child merging rules introduced in (Konev et al. 2014). We say that an ELI-IQ C(a) is role saturated for (T, A) if (T, A) ⊭ C'(a) whenever C' is the result of replacing an occurrence of a role r by some role s with T ⊭ r ⊑ s and T ⊨ s ⊑ r. To define parent/child merging, we identify each ELI concept C with a finite tree T_C whose nodes are labeled with concept names and whose edges are labeled with roles. For example, if C = ∃t.(A ⊓ ∃r.∃r.∃r.B) ⊓ ∃s.⊤, then Fig. 2a illustrates T_C. Now, we say that an ELI-IQ C(a) is parent/child merged for T and A if for all nodes n1, n2, n3 in T_C such that n2 is an r-successor of n1, n3 is an s-successor of n2, and T ⊨ r ≡ s, we have (T, A) ⊭ C'(a), where C' is the concept that results from identifying n1 and n3. For instance, the concept in Fig. 2c is the result of identifying the leaf labeled with B in Fig. 2b with the parent of its parent. The corresponding role saturation and parent/child merging rules are formulated in the obvious way.

In Algorithm 1 the learner uses membership queries in Lines 2, 7 and 10-11. We present a run for T = {A ⊑ ∃s.B, s ⊑ r} and H = {s ⊑ r}. Assume the oracle gives as counterexample (A, C(a)), where A = {t(a, b), A(b), s(a, c)} and C = ∃t.(A ⊓ ∃r.∃r.∃r.B) ⊓ ∃s.⊤ (Fig. 2a). Role saturation produces C = ∃t.(A ⊓ ∃s.∃s.∃s.B) ⊓ ∃s.⊤ (Fig. 2b). Then, applying parent/child merging twice, we obtain C = ∃t.(A ⊓ ∃s.B) ⊓ ∃s.⊤ (Fig. 2c and 2d).

Since (H, A) ⊭ ∃t.(A ⊓ ∃s.B)(a), after Lines 3-5 Algorithm 1 updates C by choosing the conjunct ∃t.(A ⊓ ∃s.B). As C is of the form ∃t.C' and there is t(a, b) ∈ A such that (T, A) ⊨ C'(b), the algorithm recursively calls the function "ReduceCounterExample" with A ⊓ ∃s.B (b). Now, since (H, A) ⊭ ∃s.B(b), after Lines 3-5, C is updated to ∃s.B. Finally, C is of the form ∃s.C' and there is no s'(b, c) ∈ A such that (T, A) ⊨ C'(c). So the algorithm proceeds to Lines 10-11, where it chooses A(b) ∈ A. Since (T, {A(b)}) ⊨ ∃s.B(b) and (H, {A(b)}) ⊭ ∃s.B(b), we have that T ⊨ A ⊑ ∃s.B and H ⊭ A ⊑ ∃s.B.

Figure 2: Concept C being role saturated and parent/child merged (panels (a)-(d)).

The following two lemmas state the main properties of Algorithm 1. A detailed analysis is given in the appendix.

Lemma 3. Let (A, C(a)) be a positive counterexample. Then the following holds:

1. if C is a basic concept, then there is a singleton A' ⊆ A such that (T, A') ⊨ C(a);

2. if C is of the form ∃r.C' and C is role saturated and parent/child merged, then either there is s(a, b) ∈ A (where r, s are roles) such that (T, {s(a, b)}) ⊨ r(a, b) and (T, A) ⊨ C'(b), or there is a singleton A' ⊆ A such that (T, A') ⊨ C(a).

Lemma 4. For any DL-Lite^H target T and any DL-Lite^H hypothesis H with size polynomial in |T|, given a positive counterexample (A, C(a)), Algorithm 1 computes, with polynomially many polynomial size queries in |T|, |A| and |C|, a positive counterexample (A', D(b)), where A' ⊆ A is a singleton ABox.

Proof. (Sketch) Let (A, C(a)) be the input of "ReduceCounterExample". The computation of Line 2 requires polynomially many polynomial size queries in |C| and |T|. If C has more than one conjunct, then it is updated in Lines 3-5, so C becomes either (1) a basic concept or (2) of the form ∃r.C'. By Lemma 3, in case (1) there is a singleton A' ⊆ A such that (T, A') ⊨ C(a), computed by Lines 10-11 of Algorithm 1. In case (2) either there is a singleton A' ⊆ A such that (T, A') ⊨ C(a), computed by Lines 10-11 of Algorithm 1, or we obtain a counterexample with a refined C. Since the size of the refined counterexample is strictly smaller after every recursive call of "ReduceCounterExample", the total number of calls is bounded by |C|. ❏

Using Theorem 1 and Theorem 2 we now obtain that F_D(DL-Lite^H, ELI-IQ) is polynomial query exact learnable.

Construction of f_EQ for F_D(ELH_lhs, AQ). We first transform a positive counterexample of the form (A, A(a)) into a positive counterexample of the form (A', B(ρ_{A'})), with A' a tree-shaped ABox rooted in ρ_{A'}. We then define the image of (T, H, (A, A(a))) under f_EQ as C_{A'} ⊑ B, where C_{A'} is the EL concept expression corresponding to A'.

Algorithm 2: Minimizing an ABox A
1: function MINIMIZEABOX(A)
2:   Concept saturate A with H
3:   for every A ∈ N_C ∩ Σ_T and a ∈ Ind(A) such that
4:       (T, A) ⊨ A(a) and (H, A) ⊭ A(a) do
5:     Domain Minimize A with A(a)
6:     Role Minimize A with A(a)
7:   return A

Our algorithm is based on two operations: minimization, computed by Algorithm 2, and cycle unfolding. Algorithm 2 minimizes a given ABox with the following three rules (a code sketch of the minimization loop is given after the rules):

(Concept saturate A with H) If A(a) ∉ A and (H, A) ⊨ A(a), then replace A by A ∪ {A(a)}, where A ∈ N_C ∩ Σ_T and a ∈ Ind(A).

(Domain Minimize A with A(a)) If (A, A(a)) is a counterexample and (T, A_b) ⊨ A(a), then replace A by A_b, where A_b is the result of removing from A all ABox assertions in which b occurs.

(Role Minimize A with A(a)) If (A, A(a)) is a counterexample and (T, A_{r(b,c)}) ⊨ A(a), then replace A by A_{r(b,c)}, where A_{r(b,c)} is obtained by removing a role assertion r(b, c) from A.
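The sketch below is our own illustration of the minimization loop; entails(tbox, abox, assertion) is an assumed entailment oracle (in the learning setting, entailment by the target T would be realized via membership queries). It mirrors the three rules: saturate with concept assertions entailed by H, then greedily drop individuals and role assertions while the chosen atomic query stays entailed by the target (dropping assertions cannot make H entail more, so the counterexample property is preserved).

from typing import Callable, Set, Tuple

Assertion = Tuple  # ("C", concept_name, individual) or ("R", role, a, b)
Entails = Callable[[object, Set[Assertion], Assertion], bool]  # entails(tbox, abox, assertion)

def minimize_abox(abox: Set[Assertion], query: Assertion, target, hypothesis,
                  entails: Entails) -> Set[Assertion]:
    """Shrink `abox` while (target, abox) still entails `query`."""
    abox = set(abox)
    concept_names = {a[1] for a in abox if a[0] == "C"}   # stands in for N_C ∩ Σ_T here
    individuals = {a[2] for a in abox if a[0] == "C"} | \
                  {x for a in abox if a[0] == "R" for x in (a[2], a[3])}
    # Concept saturation with the hypothesis: add A(a) whenever (H, abox) ⊨ A(a).
    for name in concept_names:
        for ind in individuals:
            fact = ("C", name, ind)
            if fact not in abox and entails(hypothesis, abox, fact):
                abox.add(fact)
    # Domain minimization: drop all assertions mentioning an individual if the query survives.
    for ind in sorted(individuals):
        smaller = {a for a in abox if ind not in a[2:]}
        if entails(target, smaller, query):
            abox = smaller
    # Role minimization: drop single role assertions if the query survives.
    for assertion in sorted(a for a in abox if a[0] == "R"):
        if assertion in abox and entails(target, abox - {assertion}, query):
            abox = abox - {assertion}
    return abox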

Lemma 5. For any ELH_lhs target T and any ELH_lhs hypothesis H with size polynomial in |T|, given a positive counterexample (A, A(a)), Algorithm 2 computes, with polynomially many polynomial size queries in |A| and |T|, an ABox A' such that |A'| ≤ |T| and there exists an AQ A'(a') such that (A', A'(a')) is a positive counterexample.

It remains to show that the ABox can be made tree-shaped. We say that an ABox A has an (undirected) cycle if there is a finite sequence a_0 · r_1 · a_1 · ... · r_k · a_k such that (i) a_0 = a_k and (ii) there are mutually distinct assertions of the form r_{i+1}(a_i, a_{i+1}) or r_{i+1}(a_{i+1}, a_i) in A, for 0 ≤ i < k. The unfolding of a cycle c = a_0 · r_1 · a_1 · ... · r_k · a_k in a given ABox A is obtained by replacing c by the cycle c' = a_0 · r_1 · a_1 · ... · r_{k-1} · a_{k-1} · r_k · b_{a_0} · r_1 · ... · b_{a_{k-1}} · r_k · a_0, where the b_{a_i} are fresh individual names, 0 ≤ i ≤ k-1, in such a way that (i) if r(a_i, d) ∈ A, for an individual d not in the cycle, then r(b_{a_i}, d) ∈ A; and (ii) if A(a_i) ∈ A then A(b_{a_i}) ∈ A.
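For illustration (ours, not the paper's), the following sketch detects an undirected cycle in an ABox, which is the test in Line 3 of Algorithm 3: every role assertion is treated as one undirected edge (so parallel assertions between the same individuals, and self-loops, count as cycles), and an edge whose endpoints are already connected closes a cycle.

from typing import Dict, Set, Tuple

Assertion = Tuple  # ("C", concept_name, individual) or ("R", role, a, b)

def has_undirected_cycle(abox: Set[Assertion]) -> bool:
    """True iff the ABox has an (undirected) cycle in the sense defined above."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for a in abox:
        if a[0] != "R":
            continue
        ra, rb = find(a[2]), find(a[3])
        if ra == rb:           # endpoints already connected: this assertion closes a cycle
            return True
        parent[ra] = rb
    return False

A1 = {("R", "r", "a", "a"), ("C", "A", "a")}                        # self-loop: cycle
A2 = {("R", "r", "a", "b"), ("R", "s", "a", "b")}                   # two distinct assertions: cycle
A3 = {("R", "t", "a", "b"), ("R", "s", "a", "c"), ("C", "A", "b")}  # tree-shaped: no cycle
print(has_undirected_cycle(A1), has_undirected_cycle(A2), has_undirected_cycle(A3))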

We prove in the appendix that after every cycle unfolding/minimisation step in Algorithm 3 the ABox A on the one hand becomes strictly larger and on the other hand does not exceed the size of the target TBox T. Thus Algorithm 3 terminates after a polynomial number of steps, yielding a tree-shaped (by Line 3) ABox A such that (A, B(ρ_A)) is a positive counterexample.

Lemma 6. For any ELH_lhs target T and any ELH_lhs hypothesis H with size polynomial in |T|, given a positive counterexample (A, A(a)), Algorithm 3 computes, with polynomially many polynomial size queries in |T| and |A|, a tree-shaped ABox A rooted in ρ_A and a concept name B ∈ N_C such that (A, B(ρ_A)) is a positive counterexample.

Using Theorem 1 and Theorem 2 we obtain that the learning framework F_D(ELH_lhs, AQ) is polynomial query exact learnable.

Algorithm 3: Computing a tree-shaped ABox
1: function FINDTREE(A)
2:   MINIMIZEABOX(A)
3:   while there is a cycle c in A do
4:     Unfold a ∈ Ind(A) in cycle c
5:     MINIMIZEABOX(A)
6:   Find B ∈ N_C ∩ Σ_T such that for the root ρ_A of A
7:       (T, A) ⊨ B(ρ_A) but (H, A) ⊭ B(ρ_A)
8:   return (A, B(ρ_A))

Limits of Polynomial Query Learnability

We prove that EL_rhs TBoxes are not polynomial query learnable using data retrieval examples with CQs. This is in contrast to the polynomial query learnability of DL-Lite^H and EL_rhs TBoxes using data retrieval examples with ELI-IQs and, respectively, EL-IQs. Surprisingly, the negative result holds already if queries of the form ∃x.A(x) are admitted in addition to EL-IQs.

To prove our result, we define a superpolynomial set S of TBoxes and show that (i) any polynomial size membership query can distinguish at most polynomially many TBoxes from S; and (ii) there exist superpolynomially many polynomial size data retrieval examples that the oracle can give as counterexamples and that distinguish at most one TBox from S. To present the TBoxes in S, fix two role names r and s. For any sequence σ = σ_1 σ_2 ... σ_n with σ_i ∈ {r, s}, the expression ∃σ.C stands for ∃σ_1.∃σ_2. ... ∃σ_n.C. Denote by L the set of all such sequences σ, of which there are N = 2^n many. For every such sequence σ, consider the EL_rhs TBox T_σ defined as

  T_σ = {A ⊑ ∃σ.M} ∪ T_0,   where
  T_0 = {A ⊑ X_0, M ⊑ ∃r.M ⊓ ∃s.M} ∪ {X_i ⊑ ∃r.X_{i+1} ⊓ ∃s.X_{i+1} | 0 ≤ i < n}.

Here the X_i are used to generate a binary tree of depth n from the ABox {A(a)}. The inclusion A ⊑ ∃σ.M singles out one path in this tree for each T_σ. Finally, whenever M holds, then each T_σ generates an infinite binary tree with Ms. Denote by Γ_n = {r, s, A, M, X_0, ..., X_n} the signature of the TBoxes T_σ ∈ S. Notice that T_0 is easy to learn. Moreover, if T_0 is known to the learner and only IQs are available in responses to equivalence queries, then a single equivalence query can force the oracle to reveal T_σ, as A ⊑ ∃σ.M can be found 'inside' every counterexample. On the other hand, if CQs are used, then the oracle can provide counterexamples of the form ({A(a)}, ∃x.M(x)), without giving any useful information about T_σ. Points (i) and (ii) above follow from Lemma 7 and, respectively, Lemma 8, proved in the appendix.
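To make the construction concrete, the sketch below (our own illustration; the string-based concept syntax is an ad hoc assumption) generates the TBox T_σ for a given sequence σ over {r, s}; there are 2ⁿ such TBoxes and they all share the fixed part T_0.

from typing import List, Tuple

CI = Tuple[str, str]  # a concept inclusion written as (left-hand side, right-hand side)

def t_zero(n: int) -> List[CI]:
    """The fixed part T_0, generating a binary tree of depth n from {A(a)}."""
    tbox: List[CI] = [("A", "X0"), ("M", "EXISTS r.M AND EXISTS s.M")]
    for i in range(n):
        tbox.append((f"X{i}", f"EXISTS r.X{i+1} AND EXISTS s.X{i+1}"))
    return tbox

def t_sigma(sigma: str) -> List[CI]:
    """T_σ = {A ⊑ ∃σ.M} ∪ T_0 for a sequence σ over the role names r and s."""
    assert set(sigma) <= {"r", "s"}
    nested = "M"
    for role in reversed(sigma):          # build ∃σ_1. ... ∃σ_n.M from the inside out
        nested = f"EXISTS {role}.{nested}"
    return [("A", nested)] + t_zero(len(sigma))

for lhs, rhs in t_sigma("rsr"):
    print(lhs, "SUBCLASS-OF", rhs)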

Lemma 7. For any ABox A and CQ q over Γ_n, either:

• for every T_σ ∈ S, (T_σ, A) ⊨ q; or

• the number of T_σ ∈ S such that (T_σ, A) ⊨ q does not exceed |q|.

Lemma 8. For any n > 1 and any EL_rhs TBox H over Γ_n, there are a singleton ABox A over Γ_n and a query q that is an EL-IQ over Γ_n with |q| ≤ n + 1 or of the form q = ∃x.M(x), such that either:

• (H, A) ⊨ q and (T_σ, A) ⊨ q for at most one T_σ ∈ S; or

• (H, A) ⊭ q and for every T_σ ∈ S we have (T_σ, A) ⊨ q.

Lemmas 7 and 8 together imply that EL_rhs TBoxes are not polynomial query learnable using CQs in data retrieval examples. Moreover, it is sufficient to admit CQs of the form ∃x.M(x) in addition to EL-IQs.

The two lemmas above hold for ELH_rhs, EL, and ELH as well. This proves the negative polynomial query learnability results involving CQs in Table 1. The negative polynomial query learnability results for F_D(EL, EL-IQ) and F_D(ELH, EL-IQ) are proved in the appendix by a nontrivial extension of the negative polynomial query learnability result for EL TBoxes from subsumptions in (Konev et al. 2014).

Polynomial Time Learnability

We briefly comment on our results for polynomial time learnability. The learning algorithm for F_S(ELH_lhs) in the appendix of this paper and the learning algorithms for F_S(EL_lhs), F_S(EL_rhs), and F_S(ELH_rhs) given in (Konev et al. 2014) are in fact polynomial time algorithms. Thus, we obtain:

Theorem 9. The subsumption learning frameworks F_S(EL_lhs), F_S(ELH_lhs), F_S(EL_rhs), and F_S(ELH_rhs) are polynomial time exact learnable with membership and positive bounded equivalence queries.

One can then modify the notion of positive polynomial query reducibility to an appropriate notion of positive polynomial time reducibility and provide positive polynomial time reductions to prove that the results of Table 1 for EL_lhs, ELH_lhs, EL_rhs, and ELH_rhs hold for polynomial time learnability as well.

Open Problems

A great number of challenging problems remain open. Firstly, it would be of great interest to find out whether F_S(DL-Lite^H) and F_D(DL-Lite^H, ELI-IQ) are not only polynomial query learnable but also polynomial time learnable. We conjecture that this is not the case (if P ≠ NP) but have not yet found a way of proving this. Secondly, as stated in Table 1, polynomial query learnability of F_D(EL, ELI-IQ) and F_D(DL-Lite^H, CQ) remain open problems. Polynomial time learnability of those frameworks is open as well. In both cases we conjecture non-polynomial query (and, therefore, time) learnability, but a significant modification of the techniques introduced here will be required to prove this.

Finally, it would be of interest to apply modified versions of the algorithms presented here to obtain worst-case exponential but practical algorithms for frameworks that are not polynomial query learnable. Examples one might consider are the DLs EL and ELH with either subsumption queries or data retrieval queries.

Acknowledgements. Ozaki is supported by the Science without Borders scholarship programme.


References

Angluin, D.; Frazier, M.; and Pitt, L. 1992. Learning conjunctions of Horn clauses. Machine Learning 9:147-164.

Angluin, D. 1987. Queries and concept learning. Machine Learning 2(4):319-342.

Arias, M., and Khardon, R. 2002. Learning closed Horn expressions. Inf. Comput. 178(1):214-240.

Arias, M.; Khardon, R.; and Maloberti, J. 2007. Learning Horn expressions with LOGAN-H. Journal of Machine Learning Research 8:549-587.

Arias, M. 2004. Exact learning of first-order expressions from queries. Ph.D. Dissertation, Citeseer.

Baader, F.; Calvanese, D.; McGuinness, D.; Nardi, D.; and Patel-Schneider, P. 2003. The Description Logic Handbook: Theory, Implementation and Applications. Cambridge University Press.

Baader, F.; Ganter, B.; Sertkaya, B.; and Sattler, U. 2007. Completing description logic knowledge bases using formal concept analysis. In IJCAI, volume 7, 230-235.

Baader, F.; Brandt, S.; and Lutz, C. 2005. Pushing the EL envelope. In IJCAI, 364-369. Professional Book Center.

Bechhofer, S.; Horrocks, I.; Goble, C.; and Stevens, R. 2001. OilEd: a reason-able ontology editor for the semantic web. In KI. Springer. 396-408.

Bienvenu, M.; Ortiz, M.; Šimkus, M.; and Xiao, G. 2013. Tractable queries for lightweight description logics. In AAAI, 768-774. AAAI Press.

Blackburn, P.; van Benthem, J. F. A. K.; and Wolter, F. 2006. Handbook of Modal Logic, Volume 3 (Studies in Logic and Practical Reasoning). New York, NY, USA: Elsevier Science Inc.

Borchmann, D., and Distel, F. 2011. Mining of EL-GCIs. In The 11th IEEE International Conference on Data Mining Workshops. Vancouver, Canada: IEEE Computer Society.

Buitelaar, P.; Cimiano, P.; and Magnini, B., eds. 2005. Ontology Learning from Text: Methods, Evaluation and Applications. IOS Press.

Calvanese, D.; De Giacomo, G.; Lembo, D.; Lenzerini, M.; and Rosati, R. 2007. Tractable reasoning and efficient query answering in description logics: The DL-Lite family. Journal of Automated Reasoning 39(3):385-429.

Cimiano, P.; Hotho, A.; and Staab, S. 2005. Learning concept hierarchies from text corpora using formal concept analysis. J. Artif. Intell. Res. (JAIR) 24:305-339.

Cohen, W. W., and Hirsh, H. 1994. Learning the CLASSIC description logic: Theoretical and experimental results. In KR, 121-133. Morgan Kaufmann.

Day-Richter, J.; Harris, M. A.; Haendel, M.; Lewis, S.; et al. 2007. OBO-Edit: an ontology editor for biologists. Bioinformatics 23(16):2198-2200.

Frazier, M., and Pitt, L. 1993. Learning from entailment: An application to propositional Horn sentences. In ICML, 120-127.

Frazier, M., and Pitt, L. 1996. CLASSIC learning. Machine Learning 25(2-3):151-193.

Kikot, S.; Kontchakov, R.; and Zakharyaschev, M. 2011. On (in)tractability of OBDA with OWL 2 QL. CEUR Workshop Proceedings.

Klarman, S., and Britz, K. 2015. Ontology learning from interpretations in lightweight description logics. In ILP.

Konev, B.; Ludwig, M.; Walther, D.; and Wolter, F. 2012. The logical difference for the lightweight description logic EL. J. Artif. Intell. Res. (JAIR) 44:633-708.

Konev, B.; Lutz, C.; Ozaki, A.; and Wolter, F. 2014. Exact learning of lightweight description logic ontologies. In KR.

Lehmann, J., and Hitzler, P. 2010. Concept learning in description logics using refinement operators. Machine Learning 78(1-2):203-250.

Lehmann, J., and Völker, J. 2014. Perspectives on Ontology Learning, volume 18. IOS Press.

Lutz, C.; Piro, R.; and Wolter, F. 2011. Description logic TBoxes: Model-theoretic characterizations and rewritability. In IJCAI, 983-988.

Ma, Y., and Distel, F. 2013. Learning formal definitions for SNOMED CT from text. In AIME, 73-77.

Musen, M. A. 2013. Protégé ontology editor. Encyclopedia of Systems Biology, 1763-1765.

Reddy, C., and Tadepalli, P. 1998. Learning first-order acyclic Horn programs from entailment. In ICML, 23-37. Morgan Kaufmann.

Reddy, C., and Tadepalli, P. 1999. Learning Horn definitions: Theory and an application to planning. New Generation Comput. 17(1):77-98.

Schlobach, S.; Huang, Z.; Cornet, R.; and Van Harmelen, F. 2007. Debugging incoherent terminologies. Journal of Automated Reasoning 39(3):317-349.

Selman, J., and Fern, A. 2011. Learning first-order definite theories via object-based queries. In ECML/PKDD (3), 159-174.

Stuckenschmidt, H.; Parent, C.; and Spaccapietra, S., eds. 2009. Modular Ontologies: Concepts, Theories and Techniques for Knowledge Modularization, volume 5445 of Lecture Notes in Computer Science. Springer.

ten Cate, B.; Dalmau, V.; and Kolaitis, P. G. 2012. Learning schema mappings. In ICDT, 182-195.

Wang, H.; Horridge, M.; Rector, A.; Drummond, N.; and Seidenberg, J. 2005. Debugging OWL-DL ontologies: A heuristic approach. In The Semantic Web - ISWC 2005. Springer. 745-757.

Technical Tools

We start by introducing basic tools for studying the DLs considered in this paper. These include the canonical (also called universal or minimal) model of DL knowledge bases, the well-known link between homomorphisms, relational structures, and CQ evaluation, and also the translation between tree-shaped interpretations and ELI-IQs. The DLs studied in this paper are fragments of the DL ELIH, where an ELIH TBox consists of a finite set of CIs C ⊑ D, with C, D ELI concepts, and a finite set of RIs r ⊑ s, with r, s roles. The semantics of ELIH is given by interpretations. An interpretation I = (Δ^I, ·^I) consists of a non-empty set Δ^I and a function ·^I that assigns to each concept name A a set A^I ⊆ Δ^I and to each role name r a binary relation r^I ⊆ Δ^I × Δ^I. To interpret an ABox A, we consider interpretations I which also assign to each a ∈ Ind(A) an element a^I ∈ Δ^I, where we assume that a^I ≠ b^I whenever a ≠ b (the unique name assumption). The extension C^I of an ELI concept expression C is inductively defined as follows:

• ⊤^I = Δ^I;

• (C ⊓ D)^I = C^I ∩ D^I;

• (∃r.C)^I = {d ∈ Δ^I | ∃e ∈ C^I : (d, e) ∈ r^I};

• (∃r⁻.C)^I = {d ∈ Δ^I | ∃e ∈ C^I : (e, d) ∈ r^I}.

An interpretation I satisfies:

• a concept inclusion C ⊑ D, in symbols I ⊨ C ⊑ D, if C^I ⊆ D^I;

• a role inclusion r ⊑ s, in symbols I ⊨ r ⊑ s, if r^I ⊆ s^I;

• an instance assertion C(a), in symbols I ⊨ C(a), if a^I ∈ C^I;

• a role assertion r(a, b), in symbols I ⊨ r(a, b), if (a^I, b^I) ∈ r^I.

We say that an interpretation I is a model of a TBox T (an ABox A) if I ⊨ α for all α ∈ T (α ∈ A). A CI (an RI) α follows from a TBox T if every model of T is a model of α, in symbols T ⊨ α. We use ⊨ α to denote that α follows from the empty TBox. A knowledge base (KB) is a pair K = (T, A) consisting of a TBox T and an ABox A. A query q follows from K = (T, A) if every model of (T, A) is a model of q, in symbols (T, A) ⊨ q. If the DL at hand allows inverse roles, then we assume that the ABox A is closed under inverses, i.e. r(a, b) ∈ A iff r⁻(b, a) ∈ A. If we add an assertion r(a, b) to A, then we do so assuming that r⁻(b, a) is also added and, so, the resulting A is again closed under inverses. Similarly, if an assertion r(a, b) is removed from A, then we do so assuming that r⁻(b, a) is also removed.

Trees and Homomorphisms. A path in an ELI concept expression C is a finite sequence C_0 · r_1 · C_1 · ... · r_k · C_k, where C_0 = C, k ≥ 0, and ∃r_{i+1}.C_{i+1} is a top-level conjunct of C_i, for 0 ≤ i < k. The set paths(C) contains all paths in C. We also define tail(p) = {A | A is a top-level conjunct of C_k}, where C_k is the last concept expression in the path p.

Definition 10. The interpretation I_C of an ELI concept expression C is defined as follows:

• Δ^{I_C} = paths(C);

• A^{I_C} = {p ∈ paths(C) | A ∈ tail(p)};

• r^{I_C} = {(p, p') ∈ paths(C) × paths(C) | p' = p · r · D}.

We denote the root of I_C by ρ_C.

The next lemma relates homomorphisms from interpretations I_C into interpretations I to the extension of the concept C in I. Given two interpretations I and J, a homomorphism h : I → J is a mapping from Δ^I to Δ^J such that

• if d ∈ A^I, then h(d) ∈ A^J, for all A ∈ N_C;

• if (d, d') ∈ r^I, then (h(d), h(d')) ∈ r^J, for all r ∈ N_R.

The proof of the following lemma is straightforward.

Lemma 11. Let C be an ELI concept expression and let I be an interpretation with d ∈ Δ^I. Then d ∈ C^I if, and only if, there is a homomorphism h : I_C → I such that h(ρ_C) = d.
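The following sketch (ours, not part of the paper; the dictionary representation of interpretations and the tuple concept syntax are our assumptions) evaluates an ELI concept expression in a finite interpretation directly from the semantics; by Lemma 11 this coincides with checking for a homomorphism from I_C that maps ρ_C to the given element.

from typing import Dict, Set, Tuple

ConceptExt = Dict[str, Set[str]]              # A ↦ A^I
RoleExt = Dict[str, Set[Tuple[str, str]]]     # r ↦ r^I

# ELI concepts as nested tuples:
# ("top",), ("name","A"), ("and",C,D), ("exists","r",C), ("exists_inv","r",C) for ∃r⁻.C

def holds(concept, d: str, concepts: ConceptExt, roles: RoleExt) -> bool:
    """Decide whether d ∈ C^I for the ELI concept `concept`."""
    tag = concept[0]
    if tag == "top":
        return True
    if tag == "name":
        return d in concepts.get(concept[1], set())
    if tag == "and":
        return holds(concept[1], d, concepts, roles) and holds(concept[2], d, concepts, roles)
    if tag == "exists":
        return any(holds(concept[2], e, concepts, roles)
                   for (x, e) in roles.get(concept[1], set()) if x == d)
    if tag == "exists_inv":
        return any(holds(concept[2], e, concepts, roles)
                   for (e, x) in roles.get(concept[1], set()) if x == d)
    raise ValueError(concept)

# Example: with a --r--> b and b ∈ A, we have a ∈ (∃r.A)^I and b ∈ (∃r⁻.⊤)^I.
concepts = {"A": {"b"}}
roles = {"r": {("a", "b")}}
print(holds(("exists", "r", ("name", "A")), "a", concepts, roles))   # True
print(holds(("exists_inv", "r", ("top",)), "b", concepts, roles))    # True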

Canonical Models for ELIH

We introduce the canonical model I_{T,A} of a knowledge base consisting of a TBox T and an ABox A, and the canonical model I_{C,T} of an ELI concept expression C and a TBox T. We start by introducing the canonical model I_A of an ABox A.

Definition 12. The canonical model I_A = (Δ^{I_A}, ·^{I_A}) of an ABox A is defined as follows (a small code sketch follows the definition):

• Δ^{I_A} = {a | a ∈ Ind(A)};

• A^{I_A} = {a | A(a) ∈ A, A ∈ N_C};

• r^{I_A} = {(a, b) | r(a, b) ∈ A, r ∈ N_R}.
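A minimal sketch of Definition 12 (our own illustration, reusing the assertion encoding from the earlier sketches): the canonical model I_A simply reads off the individuals, concept memberships, and role edges from the ABox.

from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

Assertion = Tuple  # ("C", concept_name, individual) or ("R", role, a, b)

@dataclass
class Interpretation:
    domain: Set[str] = field(default_factory=set)
    concepts: Dict[str, Set[str]] = field(default_factory=dict)            # A ↦ A^I
    roles: Dict[str, Set[Tuple[str, str]]] = field(default_factory=dict)   # r ↦ r^I

def canonical_model_of_abox(abox: Set[Assertion]) -> Interpretation:
    """The canonical model I_A of an ABox A (Definition 12)."""
    interp = Interpretation()
    for a in abox:
        if a[0] == "C":
            interp.domain.add(a[2])
            interp.concepts.setdefault(a[1], set()).add(a[2])
        elif a[0] == "R":
            interp.domain.update({a[2], a[3]})
            interp.roles.setdefault(a[1], set()).add((a[2], a[3]))
    return interp

# The example ABox A = {r(a, b), A(b), s(a, c)} used in the text below.
A = {("R", "r", "a", "b"), ("C", "A", "b"), ("R", "s", "a", "c")}
print(canonical_model_of_abox(A))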

The canonical model I_{T,A} of an ELIH knowledge base K = (T, A) is defined as the union of a sequence of interpretations I_0, I_1, .... We define I_0 by extending I_A with

  r^{I_0} = {(a, b) | s(a, b) ∈ A, T ⊨ s ⊑ r},

where r, s are roles. Assume now that I_n has been defined. Its domain Δ^{I_n} consists of sequences a_0 · r_0 · C_0 · r_1 · C_1 · ... · r_m · C_m, where a_0 ∈ Ind(A). To define I_{n+1}, we introduce some notation. For sequences

  p = a_0 · r_0 · C_0 · r_1 · C_1 · ... · r_m · C_m

and

  q = C'_0 · r'_1 · C'_1 · ... · r'_{m'} · C'_{m'}

we define the concatenation p · s · q of p and q through a role s as

  a_0 · r_0 · C_0 · r_1 · C_1 · ... · r_m · C_m · s · C'_0 · r'_1 · C'_1 · ... · r'_{m'} · C'_{m'}.

Now let k ≤ n be minimal such that there are C ⊑ D ∈ T and p ∈ Δ^{I_k} with p ∈ C^{I_k} but p ∉ D^{I_n}. Let D be of the form ⊓_{1≤i≤l} A_i ⊓ ⊓_{1≤j≤l'} ∃s_j.E_j, where the A_i are concept names, the s_j are roles, and the E_j are ELI concept expressions, with 1 ≤ i ≤ l, 1 ≤ j ≤ l', and l, l' ≥ 0. Then we define I_{n+1} as follows:

  Δ^{I_{n+1}} = Δ^{I_n} ∪ {p · s_j · q | q ∈ paths(E_j), 1 ≤ j ≤ l'};

for all A ∈ N_C:

  A^{I_{n+1}} = A^{I_n} ∪ {p · s_j · q | A ∈ tail(q), 1 ≤ j ≤ l'} ∪ {p | A_i = A, 1 ≤ i ≤ l};

for all roles r:

  r^{I_{n+1}} = r^{I_n} ∪ {(p · s_j · q, p · s_j · q') | (q, q') ∈ s^{I_{E_j}}, T ⊨ s ⊑ r, 1 ≤ j ≤ l'}
                        ∪ {(p, p · s_j · E_j) | T ⊨ s_j ⊑ r, 1 ≤ j ≤ l'}.


This concludes the inductive definition of the sequence I_0, I_1, .... Finally, we set I_{T,A} = ⋃_{n≥0} I_n. As an example, let A = {r(a, b), A(b), s(a, c)} and T = {A ⊑ ∃s.B}. Figures 3(a) and 3(b) show the interpretations I_A and I_{T,A}, respectively.

Figure 3: Canonical models for A = {r(a, b), A(b), s(a, c)} and T = {A ⊑ ∃s.B}: (a) the canonical model I_A; (b) the canonical model I_{T,A}.

The following two lemmas summarize the main properties of I_{T,A}. The proofs are straightforward.

Lemma 13. Let q be a CQ and (T, A) an ELIH knowledge base. Then I_{T,A} ⊨ q if and only if (T, A) ⊨ q.

The canonical model I_{C,T} of an ELI concept expression C and an ELIH TBox T is defined as I_{T,A_C}.

Lemma 14. Let C and D be ELI concept expressions and T an ELIH TBox. Then I_{C,T} ⊨ D(ρ_C) if, and only if, T ⊨ C ⊑ D.

Definition of Polynomial Time Learnability

We employ the following standard definition of polynomial time exact learnability. A learning framework (X, L, μ) is polynomial time exact learnable if it is exact learnable by an algorithm A such that at every step of the computation (we count each call to an oracle as one step of computation) the time used by A up to that step is bounded by a polynomial p(|l|, |x|), where l is the target and x ∈ X is the largest counterexample seen so far.

The following proposition is a direct consequence of the fact that any polynomial time algorithm only generates polynomial size output.

Proposition 15. If a learning framework F is polynomial time exact learnable, then F is polynomial query exact learnable.

In what follows, whenever possible, we prove polynomial time learnability for the positive results in Table 1. By Proposition 15 this implies polynomial query learnability.

Proofs for Theorem 1 and Theorem 9

It remains to prove polynomial time exact learnability of ELH_lhs TBoxes in the subsumption framework. To this end, we extend the polynomial time learnability proof for EL_lhs TBoxes presented in (Konev et al. 2014) to ELH_lhs TBoxes.

The main challenge in allowing role inclusions is that the product construction, which is fundamental for the learning algorithm presented in (Konev et al. 2014), has to take into account the role hierarchy. In particular, the product construction can lead to non-tree-shaped interpretations which may not be easily mapped into polynomial size tree interpretations, as was done in the construction for EL_lhs.

We start by giving a brief overview of the algorithm provided in (Konev et al. 2014), show that two naive attempts to extend it with role inclusions fail, and then demonstrate how it can be modified. Before that, we need to introduce some notions.

We often work with interpretations which have the structure of a tree. A directed graph G is a pair (V, E), where V is a set of vertices and E ⊆ V × V is a set of ordered pairs of vertices (called edges) connecting the vertices. A path in a directed graph G = (V, E) is a finite sequence d_0 · d_1 · ... · d_k, k ≥ 0, where (d_i, d_{i+1}) ∈ E for 0 ≤ i < k. The set paths(G, d) contains all paths in G starting from d ∈ V; that is, if d_0 · d_1 · ... · d_k ∈ paths(G, d) then d_0 = d. Let tail_G(p) = d_k be the last element in the path p = d_0 · d_1 · ... · d_k. A directed graph G is tree shaped if there is a unique element (the root), denoted by ρ_G, such that (i) for every d ∈ V there is p ∈ paths(G, ρ_G) such that d = tail_G(p), and (ii) for all distinct p_1, p_2 ∈ paths(G, ρ_G), we have that tail_G(p_1) ≠ tail_G(p_2).

Definition 16 (Tree shaped interpretation). An interpretation I is tree shaped if the directed graph G_I = (Δ^I, {(d, d') | ∃r ∈ N_R, (d, d') ∈ r^I}) is tree shaped and r^I ∩ s^I = ∅ for distinct r, s ∈ N_R.

Given a tree shaped interpretation I, we denote by ρ_I the unique element of Δ^I that is the root of G_I. Every tree shaped interpretation I can be viewed as an EL concept expression C_I in a straightforward way. Also, the tree interpretation I_C of an EL concept expression C (Definition 10) is tree shaped.

Definition 17 (Product). The product of two interpretations I and J is the interpretation I × J with

• Δ^{I×J} = Δ^I × Δ^J;

• (d, e) ∈ A^{I×J} if d ∈ A^I and e ∈ A^J;

• ((d, e), (d', e')) ∈ r^{I×J} if (d, d') ∈ r^I and (e, e') ∈ r^J.

One can show that the product of tree shaped interpretations is a disjoint union of tree shaped interpretations. If I and J are tree shaped interpretations, we denote by I ×_ρ J the maximal tree shaped interpretation that is contained in I × J and has root (ρ_I, ρ_J). Products preserve the truth of EL concept expressions (Lutz, Piro, and Wolter 2011):

Lemma 18. For all EL concepts C: d ∈ C^I and e ∈ C^J iff (d, e) ∈ C^{I×J}.
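A minimal sketch of Definition 17 (our own illustration, with the same dictionary representation of finite interpretations used in the earlier sketches); by Lemma 18, an element (d, e) of the product satisfies an EL concept exactly when both d and e do.

from typing import Dict, Set, Tuple

ConceptExt = Dict[str, Set[str]]                 # A ↦ A^I
RoleExt = Dict[str, Set[Tuple[str, str]]]        # r ↦ r^I

def product(domain1: Set[str], concepts1: ConceptExt, roles1: RoleExt,
            domain2: Set[str], concepts2: ConceptExt, roles2: RoleExt):
    """The product I × J of two finite interpretations (Definition 17)."""
    domain = {(d, e) for d in domain1 for e in domain2}
    concepts = {A: {(d, e) for (d, e) in domain
                    if d in concepts1.get(A, set()) and e in concepts2.get(A, set())}
                for A in set(concepts1) | set(concepts2)}
    roles = {r: {((d, e), (d2, e2))
                 for (d, d2) in roles1.get(r, set())
                 for (e, e2) in roles2.get(r, set())}
             for r in set(roles1) | set(roles2)}
    return domain, concepts, roles

# Example: two one-edge interpretations; the product contains the edge ((a, x), (b, y)).
dom1, con1, rol1 = {"a", "b"}, {"A": {"b"}}, {"r": {("a", "b")}}
dom2, con2, rol2 = {"x", "y"}, {"A": {"y"}}, {"r": {("x", "y")}}
print(product(dom1, con1, rol1, dom2, con2, rol2)[2])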

Definition 19 (Simulation). Let I, J be interpretations, d_0 ∈ Δ^I and e_0 ∈ Δ^J. A relation S ⊆ Δ^I × Δ^J is a simulation from (I, d_0) to (J, e_0) if (d_0, e_0) ∈ S and the following conditions are satisfied (a code sketch computing the maximal simulation follows the definition):

• for all concept names A ∈ N_C and all (d, e) ∈ S, if d ∈ A^I then e ∈ A^J;

• for all role names r ∈ N_R, all (d, e) ∈ S, and all d' ∈ Δ^I, if (d, d') ∈ r^I then there exists e' ∈ Δ^J such that (e, e') ∈ r^J and (d', e') ∈ S.
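The following sketch (ours, not part of the paper) computes the maximal simulation between two finite interpretations by the standard fixpoint refinement: start from all pairs satisfying the concept-name condition and repeatedly remove pairs violating the role condition; there is a simulation from (I, d_0) to (J, e_0) iff the pair (d_0, e_0) survives.

from typing import Dict, Set, Tuple

ConceptExt = Dict[str, Set[str]]                 # A ↦ A^I
RoleExt = Dict[str, Set[Tuple[str, str]]]        # r ↦ r^I

def maximal_simulation(dom1: Set[str], con1: ConceptExt, rol1: RoleExt,
                       dom2: Set[str], con2: ConceptExt, rol2: RoleExt) -> Set[Tuple[str, str]]:
    """Greatest simulation from interpretation I to interpretation J (Definition 19)."""
    # Start with all pairs that respect the concept-name condition.
    sim = {(d, e) for d in dom1 for e in dom2
           if all(e in con2.get(A, set()) for A, ext in con1.items() if d in ext)}

    def ok(d: str, e: str) -> bool:
        # Every r-successor d2 of d must be matched by an r-successor e2 of e with (d2, e2) in sim.
        for r, edges in rol1.items():
            for (x, d2) in edges:
                if x == d and not any((e, e2) in rol2.get(r, set()) and (d2, e2) in sim
                                      for e2 in dom2):
                    return False
        return True

    changed = True
    while changed:
        changed = False
        for pair in list(sim):
            if not ok(*pair):
                sim.discard(pair)
                changed = True
    return sim

# Example: a single r-edge ending in A is simulated by a matching r-edge ending in A.
dom1, con1, rol1 = {"a", "b"}, {"A": {"b"}}, {"r": {("a", "b")}}
dom2, con2, rol2 = {"x", "y"}, {"A": {"y"}}, {"r": {("x", "y")}}
print(("a", "x") in maximal_simulation(dom1, con1, rol1, dom2, con2, rol2))  # True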
