Fixed Parameter Tractable Reasoning in DLs via Decomposition

František Simančík, Boris Motik, and Markus Krötzsch
Department of Computer Science, University of Oxford, UK

1 Introduction

DL reasoning is of high computational complexity even for basic DLs such as ALCI [3, Chapter 3]. Intuitively, due to disjunctions (or-branching) and/or existential quantifiers (and-branching), a DL reasoner may need to investigate (at least) exponentially many combinations of concepts. A range of highly tuned optimizations, such as absorption, dependency-directed backtracking, blocking, and caching [3, Chapter 9], can be used to tame these sources of complexity. None of these techniques, however, provides formal tractability guarantees. Such guarantees can be obtained by restricting the language expressivity, as done in the EL [2], DL-Lite [4,1], and DLP [8] families of DLs.

Tractable DLs typically do not support disjunctions, which eliminates or-branching, and they either significantly restrict universal quantification (as in EL and DL-Lite) or disallow existential quantification (as in DLP), which eliminates or reduces and-branching.

Obtaining tractability guarantees for hard computational problems has been extensively studied in parameterized complexity [5]. The general idea is to measure the “hardness” of a problem instance of size n using a nonnegative integer parameter k, and the goal is to solve the problem in time that becomes polynomial in n whenever k is fixed.

A particular goal is to identify fixed parameter tractable (FPT) problems, which can be solved in time f(k) · n^c, where c is a constant and f is an arbitrary computable function that depends only on k. Note that not every problem that becomes tractable if k is fixed is in FPT. For example, checking whether a graph of size n contains a clique of size k can clearly be performed in time O(n^k), which is polynomial if k is a constant; however, since k is in the exponent of n, this does not prove membership in FPT.
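The clique example can be made concrete with a minimal Python sketch. The function name `has_clique` and the adjacency-set encoding of the graph are illustrative assumptions, not part of the paper. The function tries every k-element subset of the vertices, so its running time grows like n^k: polynomial for each fixed k, but with k in the exponent of n.

```python
from itertools import combinations

def has_clique(adj, k):
    """Return True if the undirected graph `adj` (vertex -> set of neighbours)
    contains a clique of size k. All k-subsets of the vertices are tried."""
    for subset in combinations(adj, k):
        # A subset is a clique if every pair of its vertices is adjacent.
        if all(v in adj[u] for u, v in combinations(subset, 2)):
            return True
    return False

# Hypothetical example graph: a triangle 1-2-3 with a pendant vertex 4.
graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
print(has_clique(graph, 3))
print(has_clique(graph, 4))
```

An FPT algorithm would instead confine the dependence on k to a factor f(k) that multiplies a polynomial of fixed degree in n.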

Note that every problem is FPT if the parameter is the problem’s size, so a useful parameterization should allow increasing the size arbitrarily while keeping the parameter bounded. Various problems in AI were successfully parameterized by exploiting the graph-theoretic notions of tree decompositions and treewidth [6,7,10], which we recapitulate next. A hypergraph is a pair G = ⟨V, H⟩ where V is a set of vertices and H ⊆ 2^V is a set of hyperedges. A tree decomposition of G is a pair ⟨T, L⟩ where T is an undirected tree whose sets of vertices (also called bags) and edges are denoted with B(T) and E(T), and L : B(T) → 2^V is a labeling of B(T) by subsets of V such that (T1) for each v ∈ V, the set {b ∈ B(T) | v ∈ L(b)} induces a connected subtree of T, and (T2) for each e ∈ H, there exists a bag b ∈ B(T) such that e ⊆ L(b).

The width of ⟨T, L⟩ is defined as max_{b ∈ B(T)} |L(b)| − 1. Finally, the treewidth of G is the minimum width among all possible tree decompositions of G. Consider now an instance N of the SAT problem, where N is a finite set of clauses (i.e., disjunctions of possibly negated propositional variables). The notions of tree decompositions and treewidth of N are defined w.r.t. the hypergraph G_N = ⟨V_N, H_N⟩ where V_N is the set of propositional variables occurring in N, and H_N contains the hyperedge {p_1, …, p_k} for each clause (¬)p_1 ∨ … ∨ (¬)p_k ∈ N. When parameterized by treewidth, SAT is FPT [10]. Intuitively, the treewidth of N shows how many propositional variables must be considered simultaneously in order to check the satisfiability of N; thus, bounding the treewidth has the effect of bounding or-branching.
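The definitions of this paragraph can be checked mechanically. The following Python sketch builds the hyperedges of G_N for a small SAT instance and tests conditions (T1) and (T2) for a candidate tree decomposition. The DIMACS-style integer encoding of literals, the function names, and the example instance are assumptions made for illustration; the sketch also assumes that the given bag edges already form a tree.

```python
from collections import defaultdict

def clause_hypergraph(clauses):
    """One hyperedge (set of variables) per clause; a literal is a nonzero
    integer whose sign encodes polarity (an assumed encoding)."""
    return [frozenset(abs(lit) for lit in clause) for clause in clauses]

def is_tree_decomposition(bags, tree_edges, hyperedges):
    """Check (T1) and (T2). `bags` maps a bag id to its label L(b);
    `tree_edges` is assumed to form an undirected tree over the bag ids."""
    neighbours = defaultdict(set)
    for a, b in tree_edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    for v in set().union(*bags.values()):
        # (T1): the bags whose label contains v must be connected in the tree.
        holding = {b for b, label in bags.items() if v in label}
        start = next(iter(holding))
        seen, stack = {start}, [start]
        while stack:
            for n in (neighbours[stack.pop()] & holding) - seen:
                seen.add(n)
                stack.append(n)
        if seen != holding:
            return False
    # (T2): every hyperedge must be contained in the label of some bag.
    return all(any(e <= label for label in bags.values()) for e in hyperedges)

def width(bags):
    """Width of a decomposition: size of the largest label minus one."""
    return max(len(label) for label in bags.values()) - 1

# N = {p1 ∨ ¬p2, p2 ∨ p3, ¬p3 ∨ p4}
clauses = [[1, -2], [2, 3], [-3, 4]]
hyperedges = clause_hypergraph(clauses)
bags = {0: {1, 2}, 1: {2, 3}, 2: {3, 4}}
edges = [(0, 1), (1, 2)]
print(is_tree_decomposition(bags, edges, hyperedges), width(bags))
```

For this chain-shaped instance the decomposition has width 1, so at most two variables ever need to be considered together.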

Inspired by these results, we present a novel DL reasoning algorithm that ensures fixed parameter tractability. To this end, in Section 3 we introduce a notion of a decomposition D of a signature Σ. Intuitively, D is a graph that restricts the propagation of information between the atomic concepts in Σ. A decomposition of Σ can be seen as one or more tree decompositions, each reflecting the propagation of information due to or-branching, interconnected to reflect the propagation of information due to and-branching. We identify a parameter of D called width; intuitively, this parameter determines an upper bound on the number of concepts that must be considered simultaneously to solve a reasoning problem. Let O be an ALCI ontology normalized to contain only axioms of the form ⨅_i A_i ⊑ ⨆_j B_j, A ⊑ ∃R.B, and A ⊑ ∀R.B, where A_(i) and B_(j) are atomic concepts, and R is a (possibly inverse) role. We present a resolution-based reasoning calculus that runs in time O(f(d) · |D| · |O|), where d is the width of D, |D| is the size of D, and |O| is the number of axioms in O. Our calculus is not complete for all D: it is not guaranteed to derive all consequences that might be of interest. To remedy that, we introduce a notion of D being admissible for O and the relevant consequences, and we show that admissibility guarantees completeness.

Ideally, given O and the relevant consequences, one would identify an admissible decomposition D of smallest width and then run our calculus in order to obtain an FPT algorithm. In Section 4, however, we show that, for certain O, all admissible decompositions of smallest width have exponentially many vertices. This is in contrast to tree decompositions (e.g., for each instance of SAT, a tree decomposition of minimal width exists in which the number of vertices is linear in the size of the instance) and is due to the fact that, in addition to or-branching, our decompositions analyze information flow due to and-branching as well. We therefore further restrict the notion of admissible decompositions in several ways. For each of the resulting notions, one can compute a decomposition of width at most d (if one exists) in time f(d) · |O|^c with f a computable function and c an integer constant; together with our resolution-based calculus, we thus obtain an FPT calculus for reasoning with normalized ALCI ontologies.

In Section 5 we show that the minimum decomposition width of several commonly used ontologies is much smaller than the respective ontology’s size. This suggests that decomposition width provides a “reasonable” measure of ontology complexity, and that our approach might even provide practical tractability guarantees.

Our results can be applied to SHI ontologies by transforming away role hierarchies and transitivity and normalizing the ontology in a preprocessing step. Such transformations, however, are don’t-care nondeterministic, and the minimum decomposition width of the normalization result might depend on the nondeterministic choices. In this paper we thus restrict our attention to normalized ALCI ontologies, and we leave an investigation of how normalization affects the minimum width for future work.

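R1: infer A ⊑ A

R2: from K_1 ⊑ M_1 ⊔ A and A ⊓ K_2 ⊑ M_2, infer K_1 ⊓ K_2 ⊑ M_1 ⊔ M_2

R3: infer K ⊑ M, provided that K ⊑ M ∈ O

R4: from B ⊓ ⨅_i D_i ⊑ ⨆_j E_j or ⨅_i D_i ⊑ ⨆_j E_j, infer A ⊓ ⨅_i C_i ⊑ ⨆_j F_j, provided that A ⊑ ∃R.B ∈ O, C_i ⊑ ∀R.D_i ∈ O, and E_j ⊑ ∀inv(R).F_j ∈ O

Fig. 1. A simple resolution calculus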

The proofs of all results presented in this paper are available in the technical report at http://www.comlab.ox.ac.uk/boris.motik/pubs/smk11dl-decomposition.pdf.

2 Source of Complexity in DL Reasoning

In order to motivate the results presented in the following sections, in this section we present a very simple calculus that is not FPT, and we discuss the rough idea for making the calculus FPT. The calculus is based on resolution, and is similar to the calculus presented in [9]. Resolution can often provide worst-case optimal calculi whose best case complexity is significantly lower than the worst case complexity; indeed, the calculus from [9] has demonstrated excellent practical performance.

The calculus manipulates clauses, that is, expressions of the form K ⊑ M, where K is a finite conjunction of atomic concepts, and M is a finite disjunction of atomic concepts.

With sig(K), sig(M), and sig(K ⊑ M) we denote the sets of atomic concepts occurring in K, M, and K ⊑ M, respectively. We consider two disjunctions (resp. conjunctions) to be the same whenever they mention the same atoms; that is, we disregard the order and the multiplicity of atoms. We write empty K and M as ⊤ and ⊥, respectively.

Furthermore, we say that a clause K′ ⊑ M′ is a strengthening of a clause K ⊑ M if sig(K′) ⊆ sig(K) and sig(M′) ⊆ sig(M). We write K ⊑ M ∈̂ N if the set of clauses N contains at least one strengthening of the clause K ⊑ M.

Given a normalized ontology O, our calculus constructs a derivation, that is, a sequence S_0, S_1, … of sets of clauses such that S_0 = ∅, and for each i > 0, set S_i is obtained from S_{i−1} by applying a rule from Fig. 1. Rules R1 and R2 implement propositional resolution, and rule R3 ensures that each clause in O is taken into account. Rule R4 handles role restrictions; letter R stands for a role (i.e., R need not be atomic), and inv(R) is the inverse role of R; finally, note that the atom B in the premise of the rule is optional. Intuitively, the rule says that, if B, D_i, and ¬E_j jointly imply a contradiction, but A ⊑ ∃R.B, C_i ⊑ ∀R.D_i, and ¬F_j ⊑ ∀R.¬E_j hold, then A, C_i, and ¬F_j jointly imply a contradiction too. Reasoning with the second premise is analogous.

The saturation is defined as S := ⋃_i S_i. The calculus infers a clause K ⊑ M, written O ⊢ K ⊑ M, if K ⊑ M ∈̂ S. It is straightforward to see that the calculus is sound: if O ⊢ K ⊑ M, then O ⊨ K ⊑ M. Typically, resolution is used as a refutation-complete calculus; however, it is possible to show that the variant of resolution presented here is complete in the following stronger sense: if O ⊨ K ⊑ M, then O ⊢ K ⊑ M; note that this means that the calculus infers at least one strengthening of each clause entailed by O. This stronger notion of completeness can be useful in practice; for example, O can be classified using a single run of the calculus, which is not the case for calculi (such as tableau) that are only refutationally complete.
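The propositional part of the simple calculus is easy to prototype. The sketch below covers only rules R1 to R3 and the strengthening test written ∈̂ in the text; rule R4 is deliberately omitted. A clause K ⊑ M is encoded as a pair of frozensets, and the function names and the example ontology are assumptions made for illustration rather than anything prescribed by the paper.

```python
def saturate(atoms, ontology_clauses):
    """Close a clause set under R1-R3. A clause K ⊑ M is a pair (K, M)
    of frozensets of atomic concept names."""
    derived = {(frozenset([a]), frozenset([a])) for a in atoms}   # R1: A ⊑ A
    derived |= set(ontology_clauses)                               # R3
    changed = True
    while changed:
        changed = False
        for k1, m1 in list(derived):
            for k2, m2 in list(derived):
                for a in m1 & k2:                                  # R2 on atom a
                    new = (k1 | (k2 - {a}), (m1 - {a}) | m2)
                    if new not in derived:
                        derived.add(new)
                        changed = True
    return derived

def infers(derived, k, m):
    """True if `derived` contains a strengthening of K ⊑ M."""
    return any(k2 <= k and m2 <= m for k2, m2 in derived)

# Hypothetical ontology: A ⊑ B ⊔ C, B ⊑ D, C ⊑ D.
fs = frozenset
onto = {(fs("A"), fs("BC")), (fs("B"), fs("D")), (fs("C"), fs("D"))}
S = saturate("ABCD", onto)
print(infers(S, fs("A"), fs("D")))   # is A ⊑ D inferred?
print(infers(S, fs("D"), fs("A")))   # is D ⊑ A inferred?
```

Because the saturation is queried with a strengthening test, a single run answers every subsumption question over the given atoms, which mirrors the remark about classification.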

Let d be the number of atomic concepts in O. Since each clause is uniquely identified by the atomic concepts that occur in K and/or M, the calculus can derive at most 4^d clauses, which is exponential in |O|. The high complexity of DL reasoning arises because one may have to consider exponentially many combinations of concepts, and this fact fundamentally underpins all DL reasoning algorithms. Clearly, a tractable algorithm should consider only polynomially many combinations. For example, reasoning algorithms for EL exploit the fact that only polynomially many combinations are “relevant” and that all of them can be constructed deterministically. In the following sections, we ensure tractability of reasoning in a radically different way. Instead of restricting the ontology language, we show that by restricting the structure of the ontology with a suitable parameter one can limit the number of concepts that must be simultaneously considered, which effectively limits the exponent in the above calculation. Since the base of the exponent does not depend on |O|, we will thus obtain an FPT reasoning calculus.

3 Reasoning with Decompositions

In this section we develop the notions of decomposition, decomposition admissibility, and the resolution calculus. We start by introducing the notion of decomposition.

Definition 1. Let Σ = ⟨Σ_A, Σ_R⟩ be a DL signature, where Σ_A is a finite set of atomic concepts and Σ_R is a finite set of atomic roles; let Σ_R⁻ = {R⁻ | R ∈ Σ_R} be the set of inverse roles of Σ_R; and let ε be a symbol not contained in Σ_A ∪ Σ_R ∪ Σ_R⁻.

A decomposition of Σ is a labeled graph D = ⟨V, E, sig⟩, where V is a finite set of vertices, E ⊆ V × V × (Σ_R ∪ Σ_R⁻ ∪ {ε}) is a set of directed edges labeled by a role or by ε, and sig : V → 2^{Σ_A} is a labeling of each vertex with a set of atomic concepts. The width of D is defined as wd(D) := max_{v ∈ V} |sig(v)|.

Note that D is not defined w.r.t. an ontology, but w.r.t. a signature Σ, and we will establish a link between D and O shortly in our notion of admissibility. This is mainly so as to gather all conditions that guarantee completeness in one place. We discuss the intuition behind this definition after presenting the resolution-based calculus.

Definition 2. Let Σ be a DL signature, let D = ⟨V, E, sig⟩ be a decomposition of Σ, and let O be a normalized ALCI ontology over Σ. The resolution calculus for D and O is defined as follows.

A clause system for D is a function S that assigns to each vertex v ∈ V a set of clauses S(v). A derivation of the calculus is a sequence of clause systems S_0, S_1, S_2, … such that S_0(v) = ∅ for each v ∈ V and, for each i > 0, S_i is obtained from S_{i−1} by an application of a derivation rule from Fig. 2; we assume that each derivation is fair in the usual sense. The saturation is the clause system S defined by S(v) := ⋃_i S_i(v) for each v ∈ V. The calculus infers a clause K ⊑ M at vertex v, written O, v ⊢_D K ⊑ M, if K ⊑ M ∈̂ S(v); furthermore, the calculus infers a clause K ⊑ M, written O ⊢_D K ⊑ M, if a vertex v ∈ V exists such that O, v ⊢_D K ⊑ M.

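R1: add A ⊑ A to S(v), provided that A ∈ sig(v)

R2: from K_1 ⊑ M_1 ⊔ A ∈ S(v) and A ⊓ K_2 ⊑ M_2 ∈ S(v), add K_1 ⊓ K_2 ⊑ M_1 ⊔ M_2 to S(v)

R3: add K ⊑ M to S(v), provided that K ⊑ M ∈ O and sig(K ⊑ M) ⊆ sig(v)

R4: from B ⊓ ⨅_i D_i ⊑ ⨆_j E_j ∈ S(u) or ⨅_i D_i ⊑ ⨆_j E_j ∈ S(u), add A ⊓ ⨅_i C_i ⊑ ⨆_j F_j to S(v), provided that A ⊑ ∃R.B ∈ O, C_i ⊑ ∀R.D_i ∈ O, E_j ⊑ ∀inv(R).F_j ∈ O, ⟨u, v, R⟩ ∈ E, and sig(A ⊓ ⨅_i C_i ⊑ ⨆_j F_j) ⊆ sig(v)

R5: from K ⊑ M ∈ S(u), add K ⊑ M to S(v), provided that ⟨u, v, ε⟩ ∈ E and sig(K ⊑ M) ⊆ sig(v)

Fig. 2. The decomposition calculus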

The calculus is complete (sound) if O ⊨ K ⊑ M implies (is implied by) O ⊢_D K ⊑ M for each clause K ⊑ M over Σ. Given a set of clauses C over Σ, the calculus is C-complete if O ⊨ K ⊑ M implies O ⊢_D K ⊑ M for each K ⊑ M ∈ C.

While the simple calculus from Section 2 saturates a single set of clauses, the resolution calculus for D and O saturates one set of clauses per decomposition vertex. In particular, for a vertex v ∈ V, set S(v) contains only clauses whose propositional atoms are all contained in sig(v), so v identifies a propositional subproblem of O. Rules R1–R3 implement propositional resolution “within” each vertex v. Rule R5 propagates propositional consequences from vertex u to vertex v connected by an ε-labeled edge; thus, the ε-labeled edges of D “connect” the subproblems of O in accordance with or-branching. Finally, rule R4 propagates modal consequences from a vertex u to a vertex v connected by an R-labeled edge; thus, the R-labeled edges of D “connect” the subproblems of O in accordance with and-branching. A clause is inferred if at least one saturated set S(v) contains a strengthening of the clause.
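To make the interplay of rules R1 to R5 concrete, the following is a minimal Python sketch of a naive saturation loop for the calculus of Fig. 2. The whole encoding is an assumption made for illustration: clauses are pairs of frozensets, an inverse role carries a trailing "-", ε-edges are labeled "eps", and the function names are hypothetical. The loop simply reapplies all rules until nothing changes, so it illustrates the rules themselves and makes no attempt to match the running time discussed for the calculus.

```python
from itertools import product

def inv(role):
    """Inverse role; 'R-' is assumed to denote the inverse of 'R'."""
    return role[:-1] if role.endswith("-") else role + "-"

def saturate(sig, edges, clauses, exists, foralls):
    """sig: vertex -> set of atoms; edges: (u, v, label) triples;
    clauses: (K, M) pairs; exists: (A, R, B) for A ⊑ ∃R.B;
    foralls: (C, R, D) for C ⊑ ∀R.D."""
    S = {v: set() for v in sig}

    def add(v, k, m):
        # A clause may enter S(v) only if all of its atoms are in sig(v).
        if (k | m) <= sig[v] and (k, m) not in S[v]:
            S[v].add((k, m))
            return True
        return False

    changed = True
    while changed:
        changed = False
        for v in sig:
            for a in sig[v]:                                     # R1
                changed |= add(v, frozenset([a]), frozenset([a]))
            for k, m in clauses:                                 # R3
                changed |= add(v, k, m)
            for (k1, m1), (k2, m2) in product(list(S[v]), repeat=2):
                for a in m1 & k2:                                # R2
                    changed |= add(v, k1 | (k2 - {a}), (m1 - {a}) | m2)
        for u, v, label in edges:
            for k, m in list(S[u]):
                if label == "eps":                               # R5
                    changed |= add(v, k, m)
                    continue
                for a, r, b in exists:                           # R4
                    if r != label:
                        continue
                    # Each atom D on the left needs some C with C ⊑ ∀R.D;
                    # the optional atom B may instead be dropped (None).
                    k_opts = [[c for c, r2, d in foralls if r2 == r and d == x]
                              + ([None] if x == b else []) for x in k]
                    # Each atom E on the right needs some F with E ⊑ ∀inv(R).F.
                    m_opts = [[f for e, r2, f in foralls
                               if r2 == inv(r) and e == x] for x in m]
                    for ks in product(*k_opts):
                        for ms in product(*m_opts):
                            new_k = frozenset([a]) | frozenset(
                                c for c in ks if c is not None)
                            changed |= add(v, new_k, frozenset(ms))
    return S

# Hypothetical ontology: A ⊑ ∃R.B, C ⊑ ∀R.D, B ⊓ D ⊑ ⊥.
sig = {"u": {"B", "D"}, "v": {"A", "C"}}
edges = {("u", "v", "R")}
clauses = {(frozenset({"B", "D"}), frozenset())}
S = saturate(sig, edges, clauses, [("A", "R", "B")], [("C", "R", "D")])
print((frozenset({"A", "C"}), frozenset()) in S["v"])   # A ⊓ C ⊑ ⊥ at v?
```

In the example, the contradiction B ⊓ D ⊑ ⊥ derived at the successor vertex u is carried back over the R-labeled edge to v as A ⊓ C ⊑ ⊥, which is exactly the job of rule R4.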

Note that rules R1–R3 consider only one vertex at a time, whereas rules R4 and R5 involve two vertices. Thus, although this was not our initial motivation, the calculus seems to exhibit significant parallelization potential. We leave a thorough investigation of the reasoning problem in terms of parallel complexity classes for future work.

The notion of C-completeness takes into account that one might be interested not only in refutational completeness, but in the derivation of all clauses from some set C. For example, if one is interested in the classification of O, then C would contain all clauses of the form A ⊑ B with A and B atomic concepts occurring in O.

The following proposition determines the complexity of the calculus in terms of the size of D and the number |O| of axioms in O. It essentially observes two key facts: first, since the clauses in each S(v) are restricted to atomic concepts in sig(v), the maximum number of clauses in S(v) is determined solely by wd(D); and second, given a node or a pair of nodes, all rules can be applied in time that also depends solely on wd(D). Once we limit the size of D, this proposition will provide us with an FPT algorithm.

Proposition 1. Let D = ⟨V, E, sig⟩ and O be as in Definition 2. The saturation of the resolution calculus for D and O can be computed in time O(f(wd(D)) · (|V| + |E|) · |O|), where f is some computable function.
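The first of the two observations behind the proposition is a simple count, spelled out below as a worked bound. Each atomic concept in sig(v) either occurs in K or not and, independently, either occurs in M or not; this is the same argument that gives 4^d in Section 2. The explicit constants are our own reading of that argument and are not quoted from the paper.

```latex
\[
  |\mathcal{S}(v)|
  \;\le\; \underbrace{2^{|\mathrm{sig}(v)|}}_{\text{choices of } \mathrm{sig}(K)}
  \cdot \underbrace{2^{|\mathrm{sig}(v)|}}_{\text{choices of } \mathrm{sig}(M)}
  \;=\; 4^{|\mathrm{sig}(v)|}
  \;\le\; 4^{\mathrm{wd}(\mathcal{D})},
  \qquad\text{hence}\qquad
  \sum_{v \in \mathcal{V}} |\mathcal{S}(v)| \;\le\; |\mathcal{V}| \cdot 4^{\mathrm{wd}(\mathcal{D})}.
\]
```

The total number of clauses is therefore linear in the number of vertices once the width is fixed, in contrast to the bound 4^d for the single clause set of Section 2.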

The rules of our calculus are clearly sound for arbitrary decompositions D and ontologies O; however, the calculus is not complete in general. As a trivial example, note that the decomposition with the empty vertex and edge sets satisfies Definition 1, and that our calculus does not infer any clause using such D. Therefore, we next introduce the notion of admissibility, which we later show to be sufficient for completeness.

Definition 3. Let D = ⟨V, E, sig⟩ be a decomposition of a DL signature Σ = ⟨Σ_A, Σ_R⟩.

Let W ⊆ V be an arbitrary set of vertices. The signature of W is defined as sig(W) := ⋃_{w ∈ W} sig(w). The ε-projection of D w.r.t. W is the undirected graph D_W that contains the undirected edge {u, v} for each ⟨u, v, ε⟩ ∈ E with u, v ∈ W. Set W is ε-connected if, for all u, v ∈ W, vertices {w_0, w_1, …, w_n} ⊆ W exist such that w_0 = u, w_n = v, and ⟨w_{i−1}, w_i, ε⟩ ∈ E for each 1 ≤ i ≤ n; furthermore, W is an ε-component of D if W is ε-connected, and each W′ such that W ⊊ W′ ⊆ V is not ε-connected.

Decomposition D is admissible for an ontology O if ⟨u, v, ε⟩ ∈ E implies ⟨v, u, ε⟩ ∈ E for all u, v ∈ V, and if each ε-component W of D satisfies the following properties:

(i) D_W is an undirected tree;

(ii) for each atomic concept A ∈ sig(W), the set {w ∈ W | A ∈ sig(w)} is ε-connected;

(iii) for each clause K ⊑ M ∈ O such that sig(K) ⊆ sig(W), a vertex w ∈ W exists such that sig(K ⊑ M) ⊆ sig(w);

(iv) for each axiom A ⊑ ∃R.B ∈ O such that A ∈ sig(W), an ε-component U of D and vertices w ∈ W and u ∈ U exist such that

– ⟨u, w, R⟩ ∈ E,
– A ∈ sig(w),
– B ∈ sig(u),
– for each C ⊑ ∀R.D ∈ O, if C ∈ sig(W) then C ∈ sig(w) and D ∈ sig(u), and
– for each E ⊑ ∀inv(R).F ∈ O, if E ∈ sig(U) then E ∈ sig(u) and F ∈ sig(w).

A clause K ⊑ M is covered by D if an ε-component W of D and a vertex w ∈ W exist such that sig(K) ∪ [sig(M) ∩ sig(W)] ⊆ sig(w). Decomposition D is admissible for C if each clause in C is covered by D.

Definition 3 incorporates two largely orthogonal ideas. First, each ε-component W of D reflects the propositional constraints on domain elements of a particular type in a model of O. To deal with or-branching, each W is a tree decomposition formed by undirected ε-labeled edges. Conditions (i)–(iii) are analogous to (T1) and (T2) in Section 1, but (iii) is more general: instead of requiring sig(K ⊑ M) ⊆ sig(w) for each K ⊑ M ∈ O and some w ∈ W, Condition (iii) takes into account that, if sig(K) ⊈ sig(W), then K ⊑ M can be satisfied by making the atomic concepts in sig(K) \ sig(W) false on the appropriate domain element; thus, sig(K ⊑ M) ⊆ sig(w) must hold for some w ∈ W only if sig(K) ⊆ sig(W). Admissibility for C uses an analogous idea.

Second, to deal with and-branching, the ε-components of D are interconnected via role-labeled edges. If a concept A occurs in an ε-component W and in an axiom of O of the form A ⊑ ∃R.B, then a domain element corresponding to W might need to have an R-successor; to reflect that, D must contain an ε-component U, and vertices w ∈ W and u ∈ U connected by an R-labeled edge must exist such that A ∈ sig(w) and B ∈ sig(u). Furthermore, in order to address the universal quantifiers over R, if C ⊑ ∀R.D ∈ O and C ∈ sig(W), then C ∈ sig(w) and D ∈ sig(u) must hold, and analogously for universals over inv(R). These conditions ensure that w and u contain all atomic concepts that might be relevant for modal reasoning, which in turn allows our calculus to infer all relevant constraints on atomic concepts.
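The propositional half of Definition 3 can be checked with a few graph traversals. The sketch below computes the ε-components of a decomposition and tests the symmetry requirement together with conditions (i) to (iii); condition (iv) and the covering of goal clauses are left out. The encoding (ε-edges labeled "eps", clauses as pairs of frozensets) and all function names are assumptions made for illustration.

```python
def eps_components(vertices, edges):
    """Split `vertices` into ε-components; ε-edges are read as undirected,
    which is harmless once their symmetry has been checked."""
    nbrs = {v: set() for v in vertices}
    for u, v, label in edges:
        if label == "eps":
            nbrs[u].add(v)
            nbrs[v].add(u)
    seen, components = set(), []
    for start in vertices:
        if start in seen:
            continue
        comp, stack = {start}, [start]
        while stack:
            for n in nbrs[stack.pop()] - comp:
                comp.add(n)
                stack.append(n)
        seen |= comp
        components.append(comp)
    return components, nbrs

def connected(subset, nbrs):
    """True if `subset` is connected via ε-edges that stay inside it."""
    if not subset:
        return True
    start = next(iter(subset))
    comp, stack = {start}, [start]
    while stack:
        for n in (nbrs[stack.pop()] & subset) - comp:
            comp.add(n)
            stack.append(n)
    return comp == subset

def check_i_to_iii(sig, edges, clauses):
    eps = {(u, v) for u, v, label in edges if label == "eps"}
    if any((v, u) not in eps for u, v in eps):
        return False                          # ε-edges must be symmetric
    components, nbrs = eps_components(sig, edges)
    for W in components:
        undirected = {frozenset(e) for e in eps if e[0] in W}
        if len(undirected) != len(W) - 1:     # (i): connected with |W|-1 edges
            return False
        sig_W = set().union(*(sig[w] for w in W))
        for a in sig_W:                       # (ii)
            if not connected({w for w in W if a in sig[w]}, nbrs):
                return False
        for k, m in clauses:                  # (iii)
            if k <= sig_W and not any((k | m) <= sig[w] for w in W):
                return False
    return True

# Hypothetical decomposition with two ε-components, {1, 2} and {3}.
sig = {1: {"A", "B"}, 2: {"B", "C"}, 3: {"D"}}
edges = {(1, 2, "eps"), (2, 1, "eps"), (3, 1, "R")}
clauses = {(frozenset("A"), frozenset("B")), (frozenset("B"), frozenset("C"))}
print(check_i_to_iii(sig, edges, clauses))
```

Adding the clause A ⊑ C to the example makes the check fail, because no single vertex of the component {1, 2} contains both A and C.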

The following theorem shows that admissibility indeed ensures completeness.

Theorem 1. Let O be an ontology, let C be a set of clauses, and let D = ⟨V, E, sig⟩ be a decomposition that is admissible for O and C. Then, the resolution calculus for D and O is C-complete.

Ideally, given an ontology O and a set of clauses C, one would identify an admissible decomposition D of smallest width and then apply the resolution calculus for D and O to obtain an FPT algorithm. The following theorem shows, however, that this idea does not work, since it is not the case that, for each ontology O, there exists a decomposition of minimal width that is admissible for O and whose size is polynomial in |O|. In order to address this problem, in Section 4 we further restrict the notion of admissibility.

Theorem 2. A family of ALCI ontologies {O_n} exists such that each decomposition admissible for O_n and C = {C ⊑ ⊥} of minimal width has size exponential in |O_n|.

4 Constructing Decompositions of Polynomial Size

In Section 4.3 we present a general method for computing admissible decompositions of polynomial size, for which we obtain the desired FPT result. This method embodies two largely orthogonal ideas, each of which we present separately for didactic purposes.

In particular, in Section 4.1 we present an approach for analyzing and-branching, and in Section 4.2 we present an approach for analyzing or-branching.

4.1 Analyzing And-Branching via Deductive Overestimation

In this section we present an approach for analyzing and-branching, which is inspired by the reasoning algorithm for EL [2]. The approach uses an overestimation of the subsumption relation to construct the decomposition. It manipulates expressions of the form K ⇝ A, where K is a conjunction of atomic concepts, and A is an atomic concept.

Given an ALCI ontology O and a set of clauses C, the deductive overestimation for O and C is the relation ⇝ obtained by exhaustive application of the rules shown in Fig. 3.

E1: infer K ⇝ A_1, …, K ⇝ A_n, provided that A_1 ⊓ … ⊓ A_n ⊑ B_1 ⊔ … ⊔ B_m ∈ C and K = A_1 ⊓ … ⊓ A_n

E2: from K ⇝ A_1, …, K ⇝ A_n, infer K ⇝ B_1, …, K ⇝ B_m, provided that A_1 ⊓ … ⊓ A_n ⊑ B_1 ⊔ … ⊔ B_m ∈ O

E3: from K ⇝ A, infer B ⇝ B, provided that A ⊑ ∃R.B ∈ O

E4: from K ⇝ A and K ⇝ C, infer B ⇝ D, provided that A ⊑ ∃R.B ∈ O and C ⊑ ∀R.D ∈ O

E5: from K ⇝ A and B ⇝ E, infer K ⇝ F, provided that A ⊑ ∃R.B ∈ O and E ⊑ ∀inv(R).F ∈ O

Fig. 3. Computing the deductive overestimation for O and C

Intuitively, K ⇝ A states that an object whose existence is required to satisfy K can become an instance of A. On EL ontologies ⇝ coincides with the subsumption relation, but on more expressive ontologies ⇝ overestimates the subsumption relation. In order to check whether a clause K ⊑ M ∈ C is entailed by O, rule E1 introduces an instance of all atomic concepts in K. Rule E2 addresses the fact that, if some object α is an instance of A_1, …, A_n and O contains a clause A_1 ⊓ … ⊓ A_n ⊑ B_1 ⊔ … ⊔ B_m, then the object must be an instance of some B_i. Since a polynomial overestimation method that reasons by case is unlikely to exist, rule E2 overestimates the subsumption relation by saying that α can be an instance of all B_1, …, B_m. Rule E3 takes into account that, given A ⊑ ∃R.B ∈ O, each instance of A needs an R-successor that is an instance of B. Analogously to the EL reasoning calculus, in order to obtain a polynomial overestimation method, rule E3 “reuses” the same successor to satisfy multiple existential restrictions to the same concept B. Finally, rules E4 and E5 implement modal reasoning.

Having computed ⇝, we construct the decomposition D_E = ⟨V, E, sig⟩ of the symbols occurring in O and C as shown below. Note that D_E contains no ε-labeled edges, as this decomposition method does not analyze or-branching. By Theorems 1 and 3, the resolution calculus for D_E and O is C-complete.

V := {v_K | K ⇝ A for some A}
sig(v_K) := {A | K ⇝ A}
E := {⟨v_B, v_K, R⟩ | K ⇝ A and A ⊑ ∃R.B ∈ O}

Theorem 3. Decomposition D_E is admissible for O and C.
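The rules of Fig. 3 and the construction of D_E amount to a fixpoint computation followed by a grouping step, which the following Python sketch illustrates. The relation is stored as a set of pairs (K, A) with K a frozenset of atoms, so that a vertex v_K is simply represented by K. This encoding, the trailing "-" for inverse roles, the function names, and the example ontology are assumptions made for illustration; the loop is a naive one and not an optimized implementation.

```python
def inv(role):
    """Inverse role; 'R-' is assumed to denote the inverse of 'R'."""
    return role[:-1] if role.endswith("-") else role + "-"

def overestimate(goal_clauses, clauses, exists, foralls):
    """Deductive overestimation as a set of pairs (K, A).
    exists: (A, R, B) for A ⊑ ∃R.B; foralls: (C, R, D) for C ⊑ ∀R.D."""
    rel = {(k, a) for k, _ in goal_clauses for a in k}            # E1
    changed = True
    while changed:
        changed = False
        reach = {}
        for k, a in rel:
            reach.setdefault(k, set()).add(a)
        new = set()
        for k, atoms in reach.items():
            for body, head in clauses:                            # E2
                if body <= atoms:
                    new |= {(k, b) for b in head}
            for a, r, b in exists:
                if a not in atoms:
                    continue
                kb = frozenset([b])
                new.add((kb, b))                                  # E3
                for c, r2, d in foralls:
                    if r2 == r and c in atoms:                    # E4
                        new.add((kb, d))
                    if r2 == inv(r) and c in reach.get(kb, ()):   # E5
                        new.add((k, d))
        if not new <= rel:
            rel |= new
            changed = True
    return rel

def build_decomposition(rel, exists):
    """D_E: one vertex per K, sig(v_K) = {A | (K, A) in rel}, role edges only."""
    sig = {}
    for k, a in rel:
        sig.setdefault(k, set()).add(a)
    edges = {(frozenset([b]), k, r)
             for k, atoms in sig.items() for a, r, b in exists if a in atoms}
    return sig, edges

# Hypothetical input: O = {A ⊑ C, A ⊑ ∃R.B, C ⊑ ∀R.D, D ⊑ ∀R⁻.F}, goal A ⊑ ⊥.
fs = frozenset
rel = overestimate({(fs("A"), fs())}, {(fs("A"), fs("C"))},
                   [("A", "R", "B")], [("C", "R", "D"), ("D", "R-", "F")])
sig, edges = build_decomposition(rel, [("A", "R", "B")])
print(sorted(sig[fs("A")]), sorted(sig[fs("B")]))
```

In the example the vertex for A collects A, C, and F, the vertex for B collects B and D, and the single edge runs from the B vertex to the A vertex with label R.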

4.2 Analyzing Or-Branching via Tree Decomposition

We now present an approach for computing admissible decompositions that analyzes or-branching. The approach handles the clauses in O as explained in Section 1 for SAT, and it imposes additional constraints in order to satisfy condition (iv) of Definition 3.

Given a normalized ontology O and a set of clauses C, we define the hypergraph G_{O,C} = ⟨V, H⟩ such that V and H are the smallest sets satisfying the following properties. For each atomic concept A occurring in O or C, we have A ∈ V. For each clause K ⊑ M ∈ O, we have sig(K ⊑ M) ∈ H. For each A ⊑ ∃R.B ∈ O, set H contains hyperedges dom_{A ⊑ ∃R.B} and ran_{A ⊑ ∃R.B} defined as shown below, where C_i ⊑ ∀R.D_i, 1 ≤ i ≤ n, and E_j ⊑ ∀inv(R).F_j, 1 ≤ j ≤ m, are all axioms in O of the respective forms:

dom_{A ⊑ ∃R.B} := {A, C_1, …, C_n, F_1, …, F_m},
ran_{A ⊑ ∃R.B} := {B, D_1, …, D_n, E_1, …, E_m}.

Finally, sig(K ⊑ M) ∈ H for each K ⊑ M ∈ C.

Given a tree decomposition ⟨T, L⟩ of G_{O,C}, we construct (don’t-care nondeterministically) a decomposition D_T = ⟨V, E, sig⟩ as follows. The vertices of D_T are the bags of T, that is, V := B(T). The signatures of D_T are the labels of T, that is, sig := L. The ε-edges of D_T are the edges of T, that is, for each {u, v} ∈ E(T), we have ⟨u, v, ε⟩ ∈ E. Finally, for each A ⊑ ∃R.B ∈ O, choose vertices u, v ∈ V such that ran_{A ⊑ ∃R.B} ⊆ L(u) and dom_{A ⊑ ∃R.B} ⊆ L(v) and set ⟨u, v, R⟩ ∈ E; such u and v exist due to property (T2) of the definition of tree decompositions in Section 1.

Theorem 4. Every decomposition D_T is admissible for O and C.
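The hypergraph of this section is straightforward to assemble from a normalized ontology. The sketch below collects the clause hyperedges and the dom and ran hyperedges for each existential axiom; computing a tree decomposition of the result is left to an external tool and is not shown. The axiom encoding, the trailing "-" for inverse roles, and the function names are assumptions made for illustration.

```python
def inv(role):
    """Inverse role; 'R-' is assumed to denote the inverse of 'R'."""
    return role[:-1] if role.endswith("-") else role + "-"

def build_hypergraph(clauses, goal_clauses, exists, foralls):
    """Return (vertices, hyperedges) of the hypergraph for O and C.
    clauses / goal_clauses: (K, M) frozenset pairs;
    exists: (A, R, B) for A ⊑ ∃R.B; foralls: (C, R, D) for C ⊑ ∀R.D."""
    hyperedges = set()
    for k, m in list(clauses) + list(goal_clauses):
        hyperedges.add(frozenset(k | m))       # sig(K ⊑ M)
    for a, r, b in exists:
        dom, ran = {a}, {b}
        for c, r2, d in foralls:
            if r2 == r:                        # C ⊑ ∀R.D: C joins dom, D joins ran
                dom.add(c)
                ran.add(d)
            if r2 == inv(r):                   # E ⊑ ∀inv(R).F: E joins ran, F joins dom
                ran.add(c)
                dom.add(d)
        hyperedges |= {frozenset(dom), frozenset(ran)}
    vertices = set()
    for e in hyperedges:
        vertices |= e
    for c, _, d in foralls:                    # atoms occurring only in ∀-axioms
        vertices |= {c, d}
    return vertices, hyperedges

# Hypothetical input: O = {A ⊑ C, A ⊑ ∃R.B, C ⊑ ∀R.D, E ⊑ ∀R⁻.F}, goal A ⊑ ⊥.
fs = frozenset
V, H = build_hypergraph({(fs("A"), fs("C"))}, {(fs("A"), fs())},
                        [("A", "R", "B")], [("C", "R", "D"), ("E", "R-", "F")])
print(sorted(sorted(e) for e in H))
```

The dom and ran hyperedges force all atoms that can interact across one R-edge into a single bag on each side, which is what condition (iv) of Definition 3 asks for.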

4.3 Analyzing And- and Or-Branching Simultaneously

We now show how to combine the approaches for analyzing and- and or-branching to obtain a C-decomposition of a normalized ALCI ontology O and a set of clauses C.

The procedure consists of three steps. First, we compute the relation ⇝ as described in Section 4.1. This step analyzes the and-branching inherent in O and C.

Second, for all K such that K ⇝ A for some A, we simultaneously define hypergraphs G_K = ⟨V_K, H_K⟩ where V_K := {A | K ⇝ A}, and H_K are the smallest sets satisfying the following conditions. For each clause K′ ⊑ M′ ∈ O with sig(K′ ⊑ M′) ⊆ V_K, we have sig(K′ ⊑ M′) ∈ H_K. For each axiom A ⊑ ∃R.B ∈ O such that A ∈ V_K, set H_K contains hyperedge dom_{K, A ⊑ ∃R.B} and set H_B contains hyperedge ran_{K, A ⊑ ∃R.B} defined below, where C_i ⊑ ∀R.D_i, 1 ≤ i ≤ n, and E_j ⊑ ∀inv(R).F_j, 1 ≤ j ≤ m, are all axioms in O of the respective forms such that C_i ∈ V_K and E_j ∈ V_B:

dom_{K, A ⊑ ∃R.B} := {A, C_1, …, C_n, F_1, …, F_m},
ran_{K, A ⊑ ∃R.B} := {B, D_1, …, D_n, E_1, …, E_m}.

Finally, [sig(K ⊑ M) ∩ V_K] ∈ H_K for each K ⊑ M ∈ C.

Third, we compute a tree decomposition ⟨T_K, L_K⟩ for each hypergraph G_K; without loss of generality we assume that all sets B(T_K) are disjoint. We then construct the decomposition D_C = ⟨V, E, sig⟩ as follows. The vertices of D_C are the bags of the tree decompositions, that is, V := ⋃_K B(T_K). The signatures of D_C are the labels of the tree decompositions, that is, sig := ⋃_K L_K. The ε-edges of D_C are the edges of the tree decompositions, that is, ⟨u, v, ε⟩ ∈ E for each {u, v} ∈ E(T_K). Finally, for each axiom A ⊑ ∃R.B ∈ O and each K such that A ∈ V_K, choose u ∈ B(T_B) and v ∈ B(T_K) such that ran_{K, A ⊑ ∃R.B} ⊆ L_B(u) and dom_{K, A ⊑ ∃R.B} ⊆ L_K(v) and set ⟨u, v, R⟩ ∈ E; such u and v exist due to property (T2) of the definition of tree decompositions in Section 1.

The class of all C-decompositions of O and C consists of all decompositions obtained in the way specified above. Note that the first step (computation of ⇝) is deterministic, but the third step is not, as each G_K may admit several tree decompositions.

The C-width of O and C is the minimal width of any C-decomposition of O and C.

Theorem 5. Every decomposition D_C is admissible for O and C.

To show that DL reasoning is FPT if the C-width is bounded, we next estimate the effort required for computing a C-decomposition of O and C. With ‖O‖ and ‖C‖ we denote the sizes of (i.e., the numbers of symbols required to encode) O and C, respectively.

Table 1. Upper bounds on C-width for classification

Ontology                                   |Σ_A|     |Σ_A^norm|   wd(D_E)   wd(D_C)
SNOMED CT (http://ihtsdo.org/snomed-ct/)   315,489   516,703      349       100
SNOMED CT-SEP (see [9] for reference)      54,973    149,839      1,196     168
FMA (http://fma.biostr.washington.edu/)    41,700    81,685       1,166     35
GALEN (http://opengalen.org/)              23,136    49,245       646       54
OBI (http://obi-ontology.org/)             2,955     4,296        304       45

Proposition 2. An algorithm exists that takes as input a positive integer d, a normalized ALCI ontology O, and a set of clauses C, that runs in time O(g(d) · (‖O‖ + ‖C‖)^5) for g a computable function, and that computes a C-decomposition of O and C of width at most d whenever at least one such decomposition exists.

We can now formulate the main FPT result forC-decompositions.

Theorem 6. Let d be a positive integer, let O be a normalized ALCI ontology, and let K ⊑ M be a clause. The problem of deciding whether a C-decomposition of O and C = {K ⊑ M} of width at most d exists, and if so, whether O ⊨ K ⊑ M, is FPT.

5 Experimental Results

It can be argued that FPT is interesting only if the parameter can be substantially smaller than the input size. In order to judge the “usefulness” of C-width as a complexity measure, we measured the C-width of several ontologies (listed in Table 1) that are often used for evaluating DL reasoners. We weakened all ontologies to ALCHI by discarding all unsupported features, we applied the structural transformation from [9], and we eliminated role inclusion axioms by unfolding the role hierarchy into universal restrictions to obtain normalized ALCI ontologies. Note that there are several different ways of formulating and optimizing structural transformation, and each could produce an ontology of a different C-width, so our results are not necessarily optimal.

After normalization, we computed the deductive overestimation and the decomposition D_E as described in Section 4.1, we constructed the hypergraphs G_K as described in Section 4.3, and we fed all of them into TreeD¹, a library for computing tree decompositions, to construct a C-decomposition D_C. For each ontology we considered two sets of goal clauses: C_1 = {A ⊑ ⊥ | A ∈ Σ_A}, which corresponds to checking satisfiability of all atomic concepts, and C_2 = {A ⊑ B | A, B ∈ Σ_A}, which corresponds to classification. In theory, the C-width of O and C_1 can be smaller than the C-width of O and C_2; however, we have not observed a difference between the two in practice, so we present here only the results for classification. Also, please note that TreeD was able to produce only tree decompositions of approximately, rather than exactly, minimal width; hence, our results provide only an upper bound on the C-width.

¹ http://www.itu.dk/people/sathi/treed/

The results of our experiments are shown in Table 1. For each ontology we list the number of atomic concepts in the original ontology (|Σ_A|), the number of atomic concepts after normalization (|Σ_A^norm|), and the widths of the two decompositions that we constructed. Notice that although some of the tested ontologies contain tens or even hundreds of thousands of concepts, the width of D_C rarely exceeds one hundred, and it is always several orders of magnitude smaller than the total number of concepts in the ontology. This suggests that our notion of a decomposition might even prove to be useful in practice, provided that our resolution algorithm is suitably optimized.
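The gap between ontology size and width can be read off Table 1 directly. The short Python sketch below divides the number of atomic concepts after normalization by the reported upper bound wd(D_C) for each ontology; the dictionary layout and variable names are assumptions made for illustration, while the numbers are copied from the table.

```python
# (atomic concepts after normalization, upper bound on wd(D_C)) from Table 1
table = {
    "SNOMED CT": (516703, 100),
    "SNOMED CT-SEP": (149839, 168),
    "FMA": (81685, 35),
    "GALEN": (49245, 54),
    "OBI": (4296, 45),
}

ratios = {name: atoms / width for name, (atoms, width) in table.items()}
for name, ratio in ratios.items():
    print(f"{name}: concepts / width = {ratio:.0f}")
```

The ratio ranges from roughly 95 for OBI to roughly 5,167 for SNOMED CT, that is, from about two to nearly four orders of magnitude.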

6 Conclusion

We presented a DL reasoning algorithm that is fixed parameter tractable for a suitable notion of the input width. We see two main challenges for our future work. On the theoretical side, our approach should be extended to more complex ontology languages; handling counting seems particularly challenging. On the practical side, our algorithm should be optimized for practical use. A particular challenge is to combine the construction of a decomposition with actual reasoning and thus save preprocessing time.

References

1. Artale, A., Calvanese, D., Kontchakov, R., Zakharyaschev, M.: The DL-Lite Family and Relations. Journal of Artificial Intelligence Research 36, 1–69 (2009)

2. Baader, F., Brandt, S., Lutz, C.: Pushing the EL Envelope. In: Kaelbling, L.P., Saffiotti, A. (eds.) Proc. of the 19th Int. Joint Conference on Artificial Intelligence (IJCAI 2005). pp. 364–369. Morgan Kaufmann Publishers, Edinburgh, UK (July 30–August 5 2005)

3. Baader, F., Calvanese, D., McGuinness, D., Nardi, D., Patel-Schneider, P.F. (eds.): The Description Logic Handbook: Theory, Implementation and Applications. Cambridge University Press, 2nd edn. (August 2007)

4. Calvanese, D., De Giacomo, G., Lembo, D., Lenzerini, M., Rosati, R.: Tractable Reasoning and Efficient Query Answering in Description Logics: The DL-Lite Family. Journal of Automated Reasoning 39(3), 385–429 (2007)

5. Downey, R.G., Fellows, M.R.: Parameterized Complexity. Springer (1999)

6. Gottlob, G., Pichler, R., Wei, F.: Bounded Treewidth as a Key to Tractability of Knowledge Representation and Reasoning. In: Proc. of the 21st Nat. Conf. on Artificial Intelligence (AAAI 2006). pp. 250–256. AAAI Press, Boston, MA, USA (2006)

7. Gottlob, G., Scarcello, F., Sideri, M.: Fixed-parameter complexity in AI and nonmonotonic reasoning. Artificial Intelligence 138(1–2), 55–86 (2002)

8. Grosof, B.N., Horrocks, I., Volz, R., Decker, S.: Description Logic Programs: Combining Logic Programs with Description Logic. In: Proc. of the 12th Int. World Wide Web Conference (WWW 2003). pp. 48–57. ACM Press, Budapest, Hungary (May 20–24 2003)

9. Simančík, F., Kazakov, Y., Horrocks, I.: Consequence-Based Reasoning beyond Horn Ontologies. In: Proc. of the 22nd Int. Joint Conf. on Artificial Intelligence (IJCAI 2011) (July 16–22 2011), to appear

10. Szeider, S.: On Fixed-Parameter Tractable Parameterizations of SAT. In: Giunchiglia, E., Tacchella, A. (eds.) Proc. of the 6th Int. Conf. on Theory and Applications of Satisfiability Testing (SAT 2003), Selected Revised Papers. LNCS, vol. 2919, pp. 188–202. Springer, Santa Margherita Ligure, Italy (May 5–8 2003)