
In order to integrate a relational database into an RDF graph, Ontology Based Data Access (OBDA) is used. This section presents a formal framework for OBDA, which has been used to formalize the OBDA system implemented in this thesis.

3.1. Mapping

In ontology based data access, a relational schema S is mapped onto an ontology O by means of a mapping M, such that SPARQL queries can be issued against an instance of S. The ontology O thereby serves as the global schema of the data. Once a mapping has been defined, a user of the OBDA system does not need any knowledge of the underlying relational data: the user can simply issue SPARQL queries against the ontology and retrieve the desired information. In the following, mappings from relational data onto RDF data are defined.

Definition 47 (Mapping Templates)

Let A be an attribute of a relation. A mapping template θ is an arbitrary string in which substrings of the form {A} may occur; these substrings denote the template variables of θ (see footnote 7). Furthermore, att(θ) denotes the set of attribute names that occur in the template variables of θ.

Example 34

The string http://www.uni.com/student/{ID} is a mapping template with one template variable in it, namely {ID}.

Definition 48 (Evaluation of Mapping Templates)

Let A be an attribute of a relation and let tu be a tuple in a relation. The evaluation of a single template variable {A} is defined as follows:

\[
\llbracket \{A\} \rrbracket_{tu} := \mathit{str}(tu[A])
\]

Here, str denotes the function that creates a string from a given input. The evaluation ⟦θ⟧_tu of a mapping template θ for a given tuple tu is the string obtained by replacing each template variable in θ with the evaluation of that template variable. Furthermore, if R is the relation schema of tu, then att(θ) ⊆ att(R) must hold.

Example 35

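Consider the tuple tu = ⟨1, Alice, Computer Science⟩ from table 1. The evaluation of the mapping template http://www.uni.com/student/{ID} for tu is:

\[
\llbracket \texttt{http://www.uni.com/student/\{ID\}} \rrbracket_{tu}
= \texttt{http://www.uni.com/student/}\,\llbracket \{ID\} \rrbracket_{tu}
= \texttt{http://www.uni.com/student/1}
\]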

Footnote 7: If { or } is used within θ other than as markup for a template variable, it has to be escaped as \{ or \}. Consequently, \ also has to be escaped as \\ if it is not used as an escape character.
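
Definition 48 and the escaping rule of footnote 7 can be illustrated with a small Python sketch. The function names `att` and `evaluate_template` and the representation of a tuple as a dictionary are assumptions made for illustration only; they are not taken from the implementation described in this thesis.

```python
import re

# One token is either an escaped character (\{, \} or \\) or a template
# variable of the form {A}.
_TOKEN = re.compile(r"\\([{}\\])|\{([^{}\\]+)\}")


def att(template: str) -> set:
    """Return att(θ): the attribute names used as template variables."""
    return {m.group(2) for m in _TOKEN.finditer(template) if m.group(2) is not None}


def evaluate_template(template: str, tu: dict) -> str:
    """Return ⟦θ⟧_tu: replace each {A} by str(tu[A]) and drop escape characters."""
    missing = att(template) - set(tu)
    if missing:
        # Definition 48 requires att(θ) ⊆ att(R).
        raise KeyError(f"template uses unknown attributes: {sorted(missing)}")

    def replace(m):
        if m.group(1) is not None:    # escaped character: emit it literally
            return m.group(1)
        return str(tu[m.group(2)])    # template variable: str(tu[A])

    return _TOKEN.sub(replace, template)


# The tuple of Example 35, represented as a dictionary (an assumption).
tu = {"ID": 1, "NAME": "Alice", "FIELD": "Computer Science"}
print(evaluate_template("http://www.uni.com/student/{ID}", tu))
# prints http://www.uni.com/student/1
```

Escaped braces are copied to the output without being treated as template variables, so a template such as `\{x\}/{NAME}` evaluates to `{x}/Alice` for the tuple above.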

Definition 49 (Mapping Rule)

Let ϕ be a relational algebra expression, θ1 and θ2 mapping templates and iri ∈ I, then a mapping rule is:

ϕ ↝ (θ1, iri, θ2)

Example 36

The following mapping rule defines that a triple is created for each student in table 1, where the subject contains the ID, the predicate is always the IRI http://www.uni.com/name, and the object is an IRI that includes the value stored in the NAME column of the relation STUDENT:

STUDENT ↝ (http://www.uni.com/student/{ID},
           http://www.uni.com/name,
           http://www.uni.com/student/{NAME})

Definition 50 (Evaluation of Mapping Rule)

The evaluation of a mapping rule over an instance s of a relational schema is a set of triples:

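\[
\llbracket \phi \rightsquigarrow (\theta_1, iri, \theta_2) \rrbracket_{s}
= \{ (\llbracket \theta_1 \rrbracket_{tu}, iri, \llbracket \theta_2 \rrbracket_{tu}) \mid tu \in \llbracket \phi \rrbracket_{s} \}
\]

Example 37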

The evaluation of the mapping rule shown in example 36 results in the two triples depicted in listing 11.

<http://www.uni.com/student/1> <http://www.uni.com/name> <http://www.uni.com/student/Alice> .
<http://www.uni.com/student/2> <http://www.uni.com/name> <http://www.uni.com/student/Bob> .

Listing 11: RDF triples resulting from evaluation of mapping rules.
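
The evaluation of a mapping rule from Definition 50 can be sketched in Python as follows. The sketch rests on several simplifying assumptions that are not part of the formal framework: the relational algebra expression ϕ is restricted to the name of a stored relation, an instance s is a dictionary from relation names to lists of tuples, and templates are evaluated with Python's `str.format_map`, which ignores the escaping rule of footnote 7. The names `MappingRule` and `evaluate_rule` are hypothetical.

```python
from typing import NamedTuple


class MappingRule(NamedTuple):
    source: str     # relation name; stands in for the expression ϕ (assumption)
    subject: str    # mapping template θ1
    predicate: str  # constant IRI
    obj: str        # mapping template θ2


def evaluate_rule(rule, instance):
    """Return the set of triples produced by the rule over the instance s."""
    rows = instance[rule.source]  # ⟦ϕ⟧_s, here simply the stored relation
    return {
        (rule.subject.format_map(tu), rule.predicate, rule.obj.format_map(tu))
        for tu in rows
    }


# The relation STUDENT from table 1.
s = {
    "STUDENT": [
        {"ID": 1, "NAME": "Alice", "FIELD": "Computer Science"},
        {"ID": 2, "NAME": "Bob", "FIELD": "Computer Science"},
    ]
}

# The mapping rule of Example 36.
rule = MappingRule(
    "STUDENT",
    "http://www.uni.com/student/{ID}",
    "http://www.uni.com/name",
    "http://www.uni.com/student/{NAME}",
)

for triple in sorted(evaluate_rule(rule, s)):
    print("<%s> <%s> <%s> ." % triple)
```

Run on the relation from table 1, the sketch prints the two triples of listing 11.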

Definition 51 (Mapping)

A mapping M is a set of mapping rules.

3.2. Formal Framework for Ontology Based Data Access

Now that all inputs of an OBDA system have been defined, the formal framework for OBDA itself is introduced. Its definitions are based on [12] and [1].

Definition 52 (OBDA Specification)

An OBDA specification (S, M, O) specifies how the relational schema S is mapped onto the ontology O by the mapping M, such that the evaluation of every mapping rule in M results in valid RDF triples.

With the help of an OBDA specification, an instance s of a relational schema, and therefore the relational data in s, can be mapped onto the ontology O.

Definition 53 (OBDA instance)

An OBDA instance is the tuple ((S, M, O), s), where s is an instance of the relational schema S.

SPARQL queries can be issued against an OBDA instance. They return sets of variable mappings that correspond to the triples created by the evaluation of each mapping rule in the mapping M. To also obtain results that are not explicitly stored in the data but can be inferred with the help of the ontology, two approaches can be used. In the first approach, the input mapping is saturated with additional rules, such that the mapping also creates all implicit triples.

Definition 54 (Mapping Saturation)

For a given mapping M and an ontology O, the function sat(M, O) produces a saturated mapping M′, where M ⊆ M′. Thereby, M′ is the set of mapping rules that produces all triples produced by M and all implicit triples that can be inferred based on O.

Example 38

Consider the ontology consisting of one triple:

{(http://www.uni.com/BachelorStudent,
  http://www.w3.org/2000/01/rdf-schema#subClassOf,
  http://www.uni.com/Student)}

This ontology defines that each bachelor student is also a student. Consider the following mapping M.

{STUDENT ↝ (http://www.uni.com/student/{ID},
            http://www.w3.org/1999/02/22-rdf-syntax-ns#type,
            http://www.uni.com/BachelorStudent)}

Based on the ontology, a mapping rule that defines that each bachelor student is also a student has to be added to the mapping in order to create a saturated mapping.

The saturated mapping M′ is depicted below.

{STUDENT ↝ (http://www.uni.com/student/{ID},
            http://www.w3.org/1999/02/22-rdf-syntax-ns#type,
            http://www.uni.com/BachelorStudent),
 STUDENT ↝ (http://www.uni.com/student/{ID},
            http://www.w3.org/1999/02/22-rdf-syntax-ns#type,
            http://www.uni.com/Student)}
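
The saturation step of Example 38 can be sketched in Python as a fixpoint computation. The sketch is deliberately limited: it only considers rdfs:subClassOf axioms and only saturates rules whose predicate is rdf:type, whereas Definition 54 covers all triples that can be inferred from O. Representing a mapping rule as a tuple (source, θ1, iri, θ2) and an ontology as a set of triples is an assumption made for illustration.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
UNI = "http://www.uni.com/"


def sat(mapping, ontology):
    """Return a superset of the mapping that is closed under rdfs:subClassOf."""
    saturated = set(mapping)
    changed = True
    while changed:  # repeat until no new rule can be derived
        changed = False
        for source, subj, pred, obj in list(saturated):
            if pred != RDF_TYPE:
                continue
            for sub, p, sup in ontology:
                if p == SUBCLASS_OF and sub == obj:
                    # Every instance of the subclass is also an instance
                    # of the superclass, so add a rule that types it as such.
                    new_rule = (source, subj, RDF_TYPE, sup)
                    if new_rule not in saturated:
                        saturated.add(new_rule)
                        changed = True
    return saturated


# Ontology and mapping of Example 38.
O = {(UNI + "BachelorStudent", SUBCLASS_OF, UNI + "Student")}
M = {("STUDENT", UNI + "student/{ID}", RDF_TYPE, UNI + "BachelorStudent")}

for rule in sorted(sat(M, O)):
    print(rule)
```

Because the loop runs until no rule is added, chains of subclass axioms are followed transitively, and the input mapping is always contained in the result.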

The second approach to also consider implicit knowledge when querying an OBDA instance is to extend queries according to the given ontology.

Definition 55 (Query Extension)

For a SPARQL SELECT query Q and an ontology O, the function extend(Q, O) extends the query Q based on the ontology O to the query Q′. Thereby, Q′ returns all variable bindings that would have been returned if Q were executed on an RDF graph that includes all implicit triples based on O.

Example 39

The query depicted in listing 12 retrieves all vertices that are of the type student.

Consider the ontology from example 38. The ontology defines that each bachelor student is also a student. Therefore, the query can be extended to the query depicted in listing 13. Thereby, the union of all students and bachelor students is created to obtain all implicit results.

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX uni: <http://www.uni.com/>
SELECT ?s WHERE {
  ?s rdf:type uni:Student
}

Listing 12: Input query to an OBDA system.

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX uni: <http://www.uni.com/>
SELECT ?s WHERE {
  { ?s rdf:type uni:Student } UNION
  { ?s rdf:type uni:BachelorStudent }
}

Listing 13: Ontology based extended query.
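
The extension from listing 12 to listing 13 can be sketched in Python for the special case of a query that consists of a single rdf:type pattern. A general extend(Q, O) would have to parse and rewrite arbitrary SPARQL queries, which this sketch does not attempt. The function names `subclasses` and `extend_type_query` are hypothetical, and the generated query uses full IRIs instead of prefixed names.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
UNI = "http://www.uni.com/"


def subclasses(cls, ontology):
    """Return cls followed by every class that is transitively a subclass of cls."""
    found, frontier = [cls], [cls]
    while frontier:
        current = frontier.pop()
        for sub, p, sup in sorted(ontology):
            if p == SUBCLASS_OF and sup == current and sub not in found:
                found.append(sub)
                frontier.append(sub)
    return found


def extend_type_query(var, cls, ontology):
    """Build a SELECT query that unions the type pattern over cls and its subclasses."""
    branches = [
        "{ ?%s <%s> <%s> }" % (var, RDF_TYPE, c) for c in subclasses(cls, ontology)
    ]
    return "SELECT ?%s WHERE { %s }" % (var, " UNION ".join(branches))


# Ontology of Example 38 and the class queried in listing 12.
O = {(UNI + "BachelorStudent", SUBCLASS_OF, UNI + "Student")}
print(extend_type_query("s", UNI + "Student", O))
```

For the ontology of Example 38, the generated query has one UNION branch for Student and one for BachelorStudent, which corresponds to listing 13.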

In order to obtain results from the underlying relational database, the SPARQL query has to be rewritten to a SQL query that retrieves the desired results. The SPARQL query is rewritten based on the mapping. The SQL query is then issued against the underlying relational database, and the result of the SQL query is transformed into the respective SPARQL results, which are then returned to the user of the system.

Definition 56 (Relation to Variable Binding Transformation)

Given a relation schema R and an instance r of this schema, the function transform(r) transforms the relation into a set of variable bindings:

\[
\mathit{transform}(r) = \{ \mu \mid tu \in r \text{ and } \mu = \{ (\mathit{toVar}(A), tu[A]) \mid A \in att(R) \text{ and } tu[A] \neq \mathrm{NULL} \} \}
\]

Thereby, the static function toVar(A) creates a SPARQL variable from an attribute name.

Example 40

Consider the relation depicted in table 1. The result of transform(STUDENT) is:

transform(STUDENT) =
{{(?ID, 1), (?NAME, Alice), (?FIELD, Computer Science)},
 {(?ID, 2), (?NAME, Bob), (?FIELD, Computer Science)}}
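
Definition 56 translates almost directly into Python. In the following sketch a relation instance is a list of dictionaries and NULL is represented by `None`; both are assumptions made for illustration. Each variable binding µ is a `frozenset` of (variable, value) pairs so that the result can itself be a set, as in the definition.

```python
def to_var(attribute):
    """Create a SPARQL variable name from an attribute name."""
    return "?" + attribute


def transform(r):
    """Transform a relation instance into a set of variable bindings."""
    return {
        frozenset(
            (to_var(a), value) for a, value in tu.items() if value is not None
        )
        for tu in r
    }


# The relation STUDENT from table 1.
STUDENT = [
    {"ID": 1, "NAME": "Alice", "FIELD": "Computer Science"},
    {"ID": 2, "NAME": "Bob", "FIELD": "Computer Science"},
]

for binding in sorted(sorted(mu, key=str) for mu in transform(STUDENT)):
    print(binding)
```

Attributes whose value is NULL are left out of the binding, so the corresponding variable stays unbound instead of being bound to a placeholder value.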

Definition 57 (Query Rewriting)

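Given a SPARQL query Q and an OBDA instance ((S, M, O), s), the function rewrite(Q, M) rewrites Q to a SQL query such that:

\[
\mathit{transform}\bigl( \llbracket \mathit{rewrite}(\mathit{extend}(Q, O), \mathit{sat}(M, O)) \rrbracket_{s} \bigr)
= \llbracket Q \rrbracket_{\llbracket \mathit{sat}(M, O) \rrbracket_{s}}
\]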

In figure 4 the dataflow in an OBDA system is depicted. The mapping saturation and query extension based on the ontology are the first steps in the figure. After that the extended SPARQL query is translated to SQL with the help of the mapping. The resulting SQL query is executed on the underlying relational database and the query results are transformed to variable bindings.
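
The translate, execute and transform steps can be sketched end to end with Python's built-in `sqlite3` module. This is a toy illustration under strong assumptions, not the rewriting algorithm of the system described in this thesis: the query is a single triple pattern with a variable subject and constant predicate and object, every mapping rule reads from one stored relation, templates contain no escaped characters, and at least one rule matches the pattern. The saturated mapping of Example 38 is written out by hand, and the query of listing 12 is used without extension for brevity.

```python
import re
import sqlite3

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
UNI = "http://www.uni.com/"


def template_to_sql(template):
    """Translate a mapping template into a SQL string concatenation."""
    parts = []
    for piece in re.split(r"(\{[^{}]+\})", template):
        if piece.startswith("{"):   # template variable {A}
            parts.append('CAST("%s" AS TEXT)' % piece[1:-1])
        elif piece:                 # literal text, quoted for SQL
            parts.append("'%s'" % piece.replace("'", "''"))
    return " || ".join(parts)


def rewrite(pattern, mapping):
    """Rewrite the pattern (variable, predicate IRI, object IRI) to SQL."""
    var, predicate, obj = pattern
    selects = [
        'SELECT %s AS "%s" FROM "%s"' % (template_to_sql(subj), var, source)
        for source, subj, iri, o in sorted(mapping)
        if iri == predicate and o == obj
    ]
    return " UNION ".join(selects)  # UNION removes duplicate rows


def transform(cursor):
    """Turn SQL result rows into a set of variable bindings."""
    names = [column[0] for column in cursor.description]
    return {
        frozenset(("?" + n, v) for n, v in zip(names, row) if v is not None)
        for row in cursor
    }


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE STUDENT (ID INTEGER, NAME TEXT, FIELD TEXT)")
db.executemany(
    "INSERT INTO STUDENT VALUES (?, ?, ?)",
    [(1, "Alice", "Computer Science"), (2, "Bob", "Computer Science")],
)

saturated = {  # sat(M, O) from Example 38, written out by hand
    ("STUDENT", UNI + "student/{ID}", RDF_TYPE, UNI + "BachelorStudent"),
    ("STUDENT", UNI + "student/{ID}", RDF_TYPE, UNI + "Student"),
}

sql = rewrite(("s", RDF_TYPE, UNI + "Student"), saturated)
bindings = transform(db.execute(sql))
print(sql)
print(sorted(sorted(mu) for mu in bindings))
```

The resulting bindings map ?s to the IRIs of both students, which is the answer the query of listing 12 has over the triples produced by the saturated mapping.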

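[Figure 4 (diagram): the Mapping and the Ontology enter the step "Saturate Mapping", which yields the Saturated Mapping; the SPARQL Query and the Ontology enter the step "Extend Query", which yields the Extended SPARQL Query; the step "Translate Query" produces the SQL Query; the step "Execute SQL Query" runs it on the Relational Database and yields the SQL Results; the step "Transform Results" yields the SPARQL Results.]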

Figure 4: Dataflow in an OBDA system.

Figure 5: Dataflow in UltrawrapOBDA.