
Targeting temporal query answering, we finally lift the rewriting approach introduced in Section 8.1.3 to the temporal setting. We show that, under the assumptions formulated in the previous sections, positive temporal $\mathcal{QL}_1$ queries also enjoy a rewritability property w.r.t. temporal knowledge bases formulated in $\mathcal{L}$, and how the certain answers to positive temporal $\mathcal{QL}_1$ queries over $\mathcal{L}$ can be computed. Recall our assumptions on the properties of the query languages $\mathcal{QL}_1$ and $\mathcal{QL}_2$ and the logic $\mathcal{L}$:

(P1) Consistency of knowledge bases in $\mathcal{L}$ should be decidable. This is a basic prerequisite for any reasoning procedure, in particular for query answering.

(P2) The logic $\mathcal{L}$ should have the canonical model property w.r.t. $\mathcal{QL}_1$ (see Definition 8.9). This property often represents a first step towards a rewritability result and is similarly important for our result.

(P3) $\mathcal{QL}_1$ queries should be $\mathcal{QL}_2$ rewritable w.r.t. $\mathcal{L}$. In particular, we make use of $\Delta^{\mathcal{T}}$, $\mathcal{D}_{\mathcal{K}}$, and $\phi^{\mathcal{T}}$ introduced in Definition 8.11.

(P4) The set of answers to any $\mathcal{QL}_2$ query w.r.t. a finite interpretation should be computable.

Observe that (P4) is not needed to obtain our result, but without it the result would have no practical use.

Before providing the rewritability result, we first lift the constructions of Definitions 8.9 and 8.11 to the temporal setting. For this, consider a temporal $\mathcal{QL}_1$ query $\Phi$ and a TKB $\mathcal{K} = \langle \mathcal{T}, (\mathcal{A}_i)_{0 \le i \le n} \rangle$ that we assume to be consistent. This assumption is possible by (P1): the consistency of the TKB can be decided by regarding the KBs $\mathcal{K}_i = \langle \mathcal{T}, \mathcal{A}_i \rangle$ with $i \in [0, n]$ separately. That is, we can also assume these atemporal KBs to be consistent, and thus define the sequences $\mathcal{I}_{\mathcal{K}} := (\mathcal{I}_{\mathcal{K}_i})_{i \ge 0}$ and $\mathcal{D}_{\mathcal{K}} := (\mathcal{D}_{\mathcal{K}_i})_{i \ge 0}$ of canonical and finite interpretations by (P2) and (P3), respectively; for all $i > n$, we consider the canonical and finite interpretations of $\langle \mathcal{T}, \emptyset \rangle$. Due to the assumption that each $\mathcal{I}_{\mathcal{K}_i}$ is countably infinite and Condition (ii) of Definition 8.6, we can without loss of generality assume that these canonical models have the same domain. Similarly, the finite interpretations $\mathcal{D}_{\mathcal{K}_i}$ have the common domain $\Delta^{\mathcal{T}}$. Thus, both are valid sequences of interpretations according to our semantics. We then define the temporal $\mathcal{QL}_2$ query $\Phi^{\mathcal{T}}$ as the one obtained from $\Phi$ by replacing every $\mathcal{QL}_1$ query $\phi$ occurring in $\Phi$ by the $\mathcal{QL}_2$ query $\phi^{\mathcal{T}}$, and get the following rewritability result.

Theorem 8.15 Let $\mathcal{QL}_1$ and $\mathcal{QL}_2$ be query languages and $\mathcal{L}$ be a logic that has the canonical model property w.r.t. $\mathcal{QL}_1$, such that $\mathcal{QL}_1$ queries are $\mathcal{QL}_2$ rewritable w.r.t. $\mathcal{L}$.

8.3 A Rewritability Result

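Then, for every consistent TKB $\mathcal{K} = \langle \mathcal{T}, (\mathcal{A}_i)_{0 \le i \le n} \rangle$, every temporal $\mathcal{QL}_1$ query $\Phi$, and every $i \in [0, n]$, we have:

$$\mathrm{Cert}(\Phi, \mathcal{K}, i) = \mathrm{Ans}(\Phi, \mathcal{I}_{\mathcal{K}}, i) = \mathrm{Ans}(\Phi^{\mathcal{T}}, \mathcal{D}_{\mathcal{K}}, i).$$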
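The right-hand side of the theorem suggests a simple evaluation scheme, which the following minimal Python sketch illustrates: rewrite every atemporal subquery, then evaluate the resulting temporal query over a sequence of finite interpretations. All names (`Atom`, `And`, `Next`, `Until`, `rewrite`, `answers`, `evaluate`, `REWRITINGS`) are hypothetical and not taken from the thesis. The sketch assumes unary facts only, that all subqueries share the same answer variables (so conjunction becomes set intersection), and a simplified finite-trace semantics in which nothing holds beyond the last time point; the example rewriting assumes a TBox stating that every `Student` is a `Person`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    query: object  # an atemporal query, opaque to the temporal layer


@dataclass(frozen=True)
class And:
    left: object
    right: object


@dataclass(frozen=True)
class Next:
    arg: object


@dataclass(frozen=True)
class Until:
    left: object
    right: object


def rewrite(phi, rewrite_atomic):
    """Replace every atemporal query in phi by its rewriting."""
    if isinstance(phi, Atom):
        return Atom(rewrite_atomic(phi.query))
    if isinstance(phi, Next):
        return Next(rewrite(phi.arg, rewrite_atomic))
    if isinstance(phi, (And, Until)):
        return type(phi)(rewrite(phi.left, rewrite_atomic),
                         rewrite(phi.right, rewrite_atomic))
    raise TypeError(f"unsupported query: {phi!r}")


def answers(phi, interps, i, evaluate):
    """Answers to phi at time point i over the finite sequence interps."""
    if isinstance(phi, Atom):
        return set(evaluate(phi.query, interps[i]))
    if isinstance(phi, And):
        return (answers(phi.left, interps, i, evaluate)
                & answers(phi.right, interps, i, evaluate))
    if isinstance(phi, Next):
        if i + 1 >= len(interps):
            return set()  # finite-trace assumption: no successor, no answers
        return answers(phi.arg, interps, i + 1, evaluate)
    if isinstance(phi, Until):
        result = set()
        left_so_far = None  # answers to phi.left at all points in [i, k)
        for k in range(i, len(interps)):
            right = answers(phi.right, interps, k, evaluate)
            result |= right if left_so_far is None else right & left_so_far
            left = answers(phi.left, interps, k, evaluate)
            left_so_far = left if left_so_far is None else left_so_far & left
        return result
    raise TypeError(f"unsupported query: {phi!r}")


def evaluate(query, database):
    """A rewritten query is a tuple of predicate names, read as a union."""
    return {(individual,) for (pred, individual) in database if pred in query}


# Hypothetical rewritings w.r.t. the assumed TBox {Student ⊑ Person}.
REWRITINGS = {"Person": ("Person", "Student"), "Employee": ("Employee",)}

phi = Until(Atom("Person"), Atom("Employee"))
phi_T = rewrite(phi, REWRITINGS.__getitem__)
databases = [
    {("Student", "ann")},
    {("Person", "ann"), ("Employee", "bob")},
    {("Employee", "ann")},
]
result = answers(phi_T, databases, 0, evaluate)
```

In this toy run, `ann` is returned at time point 0 only because the rewriting of `Person` also matches the `Student` fact; the temporal operators themselves are evaluated without any reference to the ontology, which mirrors the structure of the construction of $\Phi^{\mathcal{T}}$.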

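Proof. We first prove $\mathrm{Cert}(\Phi, \mathcal{K}, i) \subseteq \mathrm{Ans}(\Phi, \mathcal{I}_{\mathcal{K}}, i)$. Let $a \in \mathrm{Cert}(\Phi, \mathcal{K}, i)$. Then, for every sequence $\mathfrak{I} = (\mathcal{I}_j)_{0 \le j \le n}$ with $\mathfrak{I} \models \mathcal{K}$, we have $\mathfrak{I}, i \models_{\mathcal{QL}_1} a(\Phi)$. By (P2) and Definition 8.14, we get $\mathcal{I}_{\mathcal{K}}, i \models_{\mathcal{QL}_1} a(\Phi)$ or, equivalently, $a \in \mathrm{Ans}(\Phi, \mathcal{I}_{\mathcal{K}}, i)$.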

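It remains to prove the following two claims:

(1) $\mathrm{Ans}(\Phi, \mathcal{I}_{\mathcal{K}}, i) \subseteq \mathrm{Ans}(\Phi^{\mathcal{T}}, \mathcal{D}_{\mathcal{K}}, i)$,

(2) $\mathrm{Ans}(\Phi^{\mathcal{T}}, \mathcal{D}_{\mathcal{K}}, i) \subseteq \mathrm{Cert}(\Phi, \mathcal{K}, i)$.

We show both by induction on the structure of $\Phi$.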

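For the base case, we regard an atemporal $\mathcal{QL}_1$ query $\Phi$. For (1), let $a \in \mathrm{Ans}(\Phi, \mathcal{I}_{\mathcal{K}}, i)$. Since $\Phi$ is a $\mathcal{QL}_1$ query, the semantics yields that $a \in \mathrm{Ans}(\Phi, \mathcal{I}_{\mathcal{K}_i})$. From (P3), we obtain $a \in \mathrm{Ans}(\Phi^{\mathcal{T}}, \mathcal{D}_{\mathcal{K}_i})$, and thus $a \in \mathrm{Ans}(\Phi^{\mathcal{T}}, \mathcal{D}_{\mathcal{K}}, i)$, by the semantics of temporal $\mathcal{QL}_2$ queries.

For (2), let $a \in \mathrm{Ans}(\Phi^{\mathcal{T}}, \mathcal{D}_{\mathcal{K}}, i)$. Since $\Phi^{\mathcal{T}}$ is a $\mathcal{QL}_2$ query, this implies $a \in \mathrm{Ans}(\Phi^{\mathcal{T}}, \mathcal{D}_{\mathcal{K}_i})$. Because of (P3), we have $a \in \mathrm{Cert}(\Phi, \mathcal{K}_i)$. This means that for every interpretation $\mathcal{I}$ with $\mathcal{I} \models \mathcal{A}_i$ and $\mathcal{I} \models \mathcal{T}$, we have $\mathcal{I} \models_{\mathcal{QL}_1} a(\Phi)$. Hence, for every sequence $\mathfrak{I} = (\mathcal{I}_j)_{0 \le j \le n}$ with $\mathfrak{I} \models \mathcal{K}$, we have $\mathcal{I}_i \models_{\mathcal{QL}_1} a(\Phi)$. Since $\Phi$ is a $\mathcal{QL}_1$ query, the latter condition is equivalent to $a \in \mathrm{Ans}(\Phi, \mathfrak{I}, i)$, and thus we get $a \in \mathrm{Cert}(\Phi, \mathcal{K}, i)$.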

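Let now $\Phi$ be of the form $\Phi_1 \wedge \Phi_2$. For (1), assume that $\mathcal{I}_{\mathcal{K}}, i \models_{\mathcal{QL}_1} a(\Phi)$, and thus we have $\mathcal{I}_{\mathcal{K}}, i \models_{\mathcal{QL}_1} a(\Phi_1)$ and $\mathcal{I}_{\mathcal{K}}, i \models_{\mathcal{QL}_1} a(\Phi_2)$. By the induction hypothesis, $\mathcal{D}_{\mathcal{K}}, i \models_{\mathcal{QL}_2} a(\Phi_1^{\mathcal{T}})$ and $\mathcal{D}_{\mathcal{K}}, i \models_{\mathcal{QL}_2} a(\Phi_2^{\mathcal{T}})$, and thus we get $\mathcal{D}_{\mathcal{K}}, i \models_{\mathcal{QL}_2} a(\Phi^{\mathcal{T}})$ by the definition of $\Phi^{\mathcal{T}}$ and the semantics.

For (2), we assume that $\mathcal{D}_{\mathcal{K}}, i \models_{\mathcal{QL}_2} a(\Phi^{\mathcal{T}})$, and thus $\mathcal{D}_{\mathcal{K}}, i \models_{\mathcal{QL}_2} a(\Phi_1^{\mathcal{T}})$ and $\mathcal{D}_{\mathcal{K}}, i \models_{\mathcal{QL}_2} a(\Phi_2^{\mathcal{T}})$. Hence, we have $a \in \mathrm{Cert}(\Phi_1, \mathcal{K}, i)$ and $a \in \mathrm{Cert}(\Phi_2, \mathcal{K}, i)$ by the induction hypothesis. Thus, for every $\mathfrak{I}$ with $\mathfrak{I} \models \mathcal{K}$, it holds that $\mathfrak{I}, i \models_{\mathcal{QL}_1} a(\Phi_1)$ and $\mathfrak{I}, i \models_{\mathcal{QL}_1} a(\Phi_2)$. This is equivalent to $a \in \mathrm{Cert}(\Phi_1 \wedge \Phi_2, \mathcal{K}, i)$.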

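Let now $\Phi$ be of the form $\bigcirc_F \Phi_1$. For (1), we take $\mathcal{I}_{\mathcal{K}}, i \models_{\mathcal{QL}_1} a(\bigcirc_F \Phi_1)$. By the temporal semantics, we have $\mathcal{I}_{\mathcal{K}}, i+1 \models_{\mathcal{QL}_1} a(\Phi_1)$. By the induction hypothesis, we get $\mathcal{D}_{\mathcal{K}}, i+1 \models_{\mathcal{QL}_2} a(\Phi_1^{\mathcal{T}})$, which implies $\mathcal{D}_{\mathcal{K}}, i \models_{\mathcal{QL}_2} a(\Phi^{\mathcal{T}})$ by the definition of $\Phi^{\mathcal{T}}$.

For (2), let $\mathcal{D}_{\mathcal{K}}, i \models_{\mathcal{QL}_2} a(\Phi^{\mathcal{T}})$. Hence, we have $\mathcal{D}_{\mathcal{K}}, i+1 \models_{\mathcal{QL}_2} a(\Phi_1^{\mathcal{T}})$, which implies $a \in \mathrm{Cert}(\Phi_1, \mathcal{K}, i+1)$ by the induction hypothesis. This means that, for every $\mathfrak{I} \models \mathcal{K}$, we have $\mathfrak{I}, i+1 \models_{\mathcal{QL}_1} a(\Phi_1)$ and hence $\mathfrak{I}, i \models_{\mathcal{QL}_1} a(\bigcirc_F \Phi_1)$, which shows that $a \in \mathrm{Cert}(\Phi, \mathcal{K}, i)$.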

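For the next inductive case, let $\Phi$ be of the form $\Phi_1 \mathbin{\mathsf{U}} \Phi_2$. For (1), we assume that $\mathcal{I}_{\mathcal{K}}, i \models_{\mathcal{QL}_1} a(\Phi_1 \mathbin{\mathsf{U}} \Phi_2)$, and thus there is a $k \ge i$ such that $\mathcal{I}_{\mathcal{K}}, k \models_{\mathcal{QL}_1} a(\Phi_2)$ and $\mathcal{I}_{\mathcal{K}}, j \models_{\mathcal{QL}_1} a(\Phi_1)$ for all $j \in [i, k[$. By the induction hypothesis, we obtain $\mathcal{D}_{\mathcal{K}}, k \models_{\mathcal{QL}_2} a(\Phi_2^{\mathcal{T}})$ and $\mathcal{D}_{\mathcal{K}}, j \models_{\mathcal{QL}_2} a(\Phi_1^{\mathcal{T}})$ for all $j \in [i, k[$. The definitions of $\models_{\mathcal{QL}_2}$ and $\Phi^{\mathcal{T}}$ yield that $\mathcal{D}_{\mathcal{K}}, i \models_{\mathcal{QL}_2} a(\Phi^{\mathcal{T}})$.

For (2), we assume that $\mathcal{D}_{\mathcal{K}}, i \models_{\mathcal{QL}_2} a(\Phi^{\mathcal{T}})$. By the definition of $\Phi^{\mathcal{T}}$, there is a $k \ge i$ with $\mathcal{D}_{\mathcal{K}}, k \models_{\mathcal{QL}_2} a(\Phi_2^{\mathcal{T}})$ and $\mathcal{D}_{\mathcal{K}}, j \models_{\mathcal{QL}_2} a(\Phi_1^{\mathcal{T}})$ for all $j \in [i, k[$. The induction hypothesis yields $a \in \mathrm{Cert}(\Phi_2, \mathcal{K}, k)$ and $a \in \mathrm{Cert}(\Phi_1, \mathcal{K}, j)$ for all $j \in [i, k[$. As a consequence, we have for every $\mathfrak{I} \models \mathcal{K}$ that $\mathfrak{I}, i \models_{\mathcal{QL}_1} a(\Phi_1 \mathbin{\mathsf{U}} \Phi_2)$.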

The remaining cases can be proven in a similar way. For example, the arguments for $\bigcirc_P \Phi_1$ can be obtained from those for the case $\bigcirc_F \Phi_1$ by replacing $i+1$ by $i-1$, and correspondingly for $\Phi_1 \mathbin{\mathsf{S}} \Phi_2$ and $\Phi_1 \mathbin{\mathsf{U}} \Phi_2$. The cases for $\Box_F$ and $\Box_P$ follow from similar arguments.

8.4 Summary

In this chapter, we have further generalized the abstract setting for ontology-based temporal query answering introduced in Chapter 3. In particular, we have shown that the temporal query answering problem is rewritable in certain cases if negation is not allowed in the queries. This generality makes it possible to augment many existing query answering approaches with temporal features: both the temporal queries and the ontologies can be instantiated with arbitrary queries and logical theories, as long as they satisfy certain requirements. We have also shown that many formalisms already applied or proposed in the literature satisfy these requirements.

The rewritability result is still of a theoretical nature, but we have proposed algorithms for answering PTQs in [BLT15].⁸ The focus of the latter work is on the temporal database monitoring problem: the continuous evaluation of a fixed set of PTQs over a temporal knowledge base that contains data about the past (and present) and grows over time. We have identified three different approaches for solving that problem.

The most straightforward option is to evaluate PTQs in a database system that supports temporal information or in a data stream processing system. The advantage of this approach is that the optimization techniques of the database can be exploited directly. Yet, it requires storing the whole history of past data, even if only a small part of it is necessary to answer the query, and re-evaluating the query at each time point using a temporal database query language, such as ATSQL [CTB01]. The feasibility thus depends on the amount of data that has to be considered. Note that many existing stream processing systems limit the latter by adopting a “sliding view” semantics, in which only a fixed number of past time points is used to evaluate PTQs.
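The trade-off between storing the full history and restricting evaluation to a fixed number of past time points can be made concrete with a small Python sketch. It is only an illustration under simplifying assumptions: the function `sometime_in_past` and its parameters are hypothetical, the only temporal operator considered is "at some point in the past or now", and each time point is represented directly by the set of answers to the atemporal subquery.

```python
from collections import deque


def sometime_in_past(stream, window=None):
    """Re-evaluate 'sometime in the past (or now)' at every time point.

    stream: iterable of answer sets, one per time point.
    window: number of most recent time points kept; None keeps everything.
    """
    buffer = deque(maxlen=window)  # maxlen=None stores the whole history
    results = []
    for current in stream:
        buffer.append(set(current))
        # Naive re-evaluation: scan every stored time point again.
        results.append(set().union(*buffer))
    return results


stream = [{"ann"}, set(), set(), {"bob"}]
full_history = sometime_in_past(stream)
windowed = sometime_in_past(stream, window=2)
```

With the full history, `ann` remains an answer at every time point; with a window of two time points, she is dropped once her last occurrence leaves the window. The windowed variant needs bounded memory, but it answers a different query than the one posed over the complete history.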

Second, we can apply an approach proposed by [Cho95; Tom04], which achieves a so-called bounded history encoding; that is, the amount of data that is required to answer the given queries is bounded. The algorithm updates this encoding continuously as new data becomes available. However, since the original proposal disregards future operators, they have to be eliminated in an extra step; in [BLT15], we describe how this can be done. Although this elimination is independent of the length of the history, it involves a theoretical non-elementary blowup in the size of the query, due to the application of the separation theorem (see Lemma 2.20). An advantage of the history encoding from [Cho95; Tom04] is that it can be implemented inside a database system using views and triggers, which could yield good performance in spite of the possibly very large size of the query. Generally, this option is the best of the three if the PTQs contain no future operators or if one can find a small equivalent representation without future operators.
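The idea behind a bounded history encoding can be illustrated for a single past operator. The Python sketch below is not the algorithm of [Cho95; Tom04]; it only shows the standard recurrence for "since", where the answers at the current time point are computed from the current data and the result stored for the previous time point, so that no older data has to be kept. The class name `SinceMonitor` and the representation of each time point by two answer sets are assumptions made for this example, and both subqueries are assumed to have the same answer variables.

```python
class SinceMonitor:
    """Incrementally maintains the answers to 'q1 since q2'.

    Recurrence: since[i] = q2[i] ∪ (q1[i] ∩ since[i-1]), with since[-1] = ∅.
    Only the previous result is stored, never the whole history.
    """

    def __init__(self):
        self.previous = set()  # auxiliary state: answers at the last time point

    def step(self, answers_q1, answers_q2):
        """Process the data of the next time point and return the answers."""
        current = set(answers_q2) | (set(answers_q1) & self.previous)
        self.previous = current
        return set(current)


monitor = SinceMonitor()
stream = [
    (set(), {"ann"}),   # q2 holds for ann: the 'since' interval starts
    ({"ann"}, set()),   # q1 keeps the interval alive
    (set(), set()),     # q1 fails: ann is dropped
    ({"ann"}, set()),   # q1 alone cannot restart the interval
]
results = [monitor.step(q1, q2) for q1, q2 in stream]
```

The state kept between time points is one answer set per temporal subquery, independent of how long the stream has been running. In a database setting, such auxiliary sets would correspond to auxiliary relations that are updated whenever new data arrives.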

As the most general solution, we propose a new algorithm, which is an adaptation of the one proposed in [Cho95]. Our algorithm allows for rigid unary predicates (see Section 3.1), works directly with future operators, and also achieves a bounded history encoding. In particular, we can limit the influence of the future operators on the time and space requirements to a single exponential factor. However, it is not straightforward to implement this algorithm inside a database system. While it is theoretically the most efficient solution, it remains to be seen how implementations of it perform in practice.

⁸ Recall that we consider a slightly different semantics in [BLT15].

9 Conclusions

The goal of this thesis was to systematically analyze ontology-based access to temporal data in terms of computational complexity, and rewritability to existing formalisms. In this chapter, we summarize our achievements and describe directions of future research.

9.1 Summary of Results

In this work, we have focused on a temporal query answering scenario that reflects the needs of today's applications: the temporal queries are based on LTL, one of the most important temporal logics; the ontologies are written in standard lightweight logics; and the data model can capture data streams.

In Chapter 3, we have introduced an abstract temporal query language that combines atemporal queries $\mathcal{QL}$ via the operators of linear temporal logic and allows accessing temporal data through ontologies. In particular, it generalizes existing temporal query languages and, at the same time, provides a framework for the design of new formalisms and general investigations. We have proven that its direct application yields only containment in NExpTime (NP) w.r.t. combined (data) complexity,¹ which does not fit the low complexity we usually have with lightweight DLs; regarding entailment, we thus get containment in co-NExpTime (co-NP). For that reason, we have proposed a new approach based on the original algorithm for solving the satisfiability problem in LTL, which requires only polynomial space. This approach is similarly general w.r.t. the query language and DL considered, but leaves one part of the TQ satisfiability problem, r-satisfiability, open and thus to be solved for concrete TQs and DLs.

In Chapter 4, we have studied the combined complexity of the satisfiability problem in $\mathcal{DL}$-LTL for DLs $\mathcal{DL}$ meeting a few rather weak requirements. Moreover, we have shown that the latter are satisfied in the popular DL EL and many DL-Lite fragments. Nevertheless, $\mathcal{DL}$-LTL has turned out to be powerful enough to show a lower bound of NExpTime if rigid symbols are considered. For the case without rigid symbols, we have shown containment in PSpace, based on our new approach. For EL and DL-Lite^H_horn, this PSpace result also holds w.r.t. rigid symbols if the considered concept inclusions are global. An overview of these results is given in Figure 4.1.

In Chapter 5, we have focused on TCQ entailment in EL and shown results similar to those for EL-LTL w.r.t. combined complexity (see Figure 9.1): our general approach can be applied to design a polynomial-space algorithm, and co-NExpTime-hardness can be shown similarly. However, in contrast to EL-LTL, the former also holds for the case with rigid concept names. This shows that the local GCIs allowed in EL-LTL are rather powerful; note that EL-LTL formulas without them can be seen as TCQs w.r.t. a global ontology. Regarding data complexity, tractability only holds for the case without rigid symbols.

¹ Recall that, in order to obtain this complexity, the satisfiability of conjunctions of $\mathcal{QL}$ queries and negated $\mathcal{QL}$ queries w.r.t. a KB in $\mathcal{DL}$ has to be decidable nondeterministically in polynomial time.

                             Data complexity                      Combined complexity
Logic                        (i)        (ii)       (iii)          (i)         (ii)          (iii)
DL-Lite^[|H]_[core|horn]     ALogTime   ALogTime   ALogTime       PSpace      PSpace        PSpace
    Sources: Th. 6.31; Th. 6.37; [SC85]; Co. 6.19
EL                           P          co-NP      co-NP          PSpace      PSpace        co-NExpTime
    Sources: [Cal+06], Th. 5.19; Th. 5.21; Co. 5.20; [SC85]; Co. 5.16; Th. 5.18, Co. 5.17
ALC-SHQ (a)                  co-NP      co-NP      ExpTime        ExpTime     co-NExpTime   2-ExpTime
DL-Lite_[krom|bool]          co-NP      co-NP      ≤ ExpTime      ExpTime     co-NExpTime   2-ExpTime
    Sources: [Cal+05]; Co. 7.5, Th. 7.6; Th. 7.7, Th. 7.8; Th. 7.9
DL-Lite^H_[krom|bool]        co-NP      co-NP      ExpTime        2-ExpTime   2-ExpTime     2-ExpTime
    Sources: Co. 7.12; Co. 7.12; Co. 7.5; [BBL15a]

Figure 9.1: The complexity of TCQ entailment considering (i) no rigid symbols, (ii) rigid concept names, and (iii) rigid role names. Our results are highlighted. All complexities except those marked with ≤ are tight; ≥ denotes hardness, ≤ containment.

(a) [BBL15b]

In Chapter 6, we have focused on TCQ entailment in Horn fragments of DL-Lite, including role inclusions, which offer expressive features that are rather similar to those of EL. But the results we have shown are considerably better: containment in PSpace and ALogTime w.r.t. combined and data complexity, respectively. Although we could not achieve FO rewritability, the latter result is interesting since containment in ALogTime is considered an indicator for the existence of efficient parallel implementations [AB09, Thm. 6.27]; also recall that, in many applications, data complexity better captures resource consumption than combined complexity (see Section 2.3). To achieve both of these results, we have shown that polynomially many stored assertions and queries, if guessed before processing, suffice to test r-satisfiability; that is, to test whether the data of one observation moment does not contradict rigid knowledge about past moments.

In Chapter 7, we have considered TCQ entailment regarding the two DLs DL-Lite_krom and DL-Lite_bool, also with role inclusions, which are not Horn logics. These DLs are rather expressive, but do not allow modeling qualified existential restrictions on the left-hand side of concept inclusions. As described above, we have identified this feature as a cause of complexity for TCQ answering; recall that it is the main difference between EL and DL-Lite_horn, which instead allows for inverse roles. For that reason, the results of Chapter 7 are especially interesting. We have shown that entailment in DL-Lite_bool can be reduced to entailment in DL-Lite_krom. TCQ entailment in these logics generally has turned out to be as complex as in more expressive DLs, such as SHQ [BBL15b]. Further, we have shown that role inclusions, which allow expressing qualified existential restrictions on the right-hand side of concept inclusions, lead to 2-ExpTime-completeness in combined complexity, which is even higher than the results proven for very expressive DLs [BBL15b]. Note that further results on TCQ entailment in DLs extending SHQ are presented in [BBL15a]. An interesting observation regarding those DLs is that the complexity of CQ entailment is often already higher than the one of TCQ entailment