
4. Ultrawrap

4.2. Tripleview Optimization

To reduce the execution time of SPARQL queries posed against the OBDA system, tripleviews may be optimized. Sequeda names three possible optimizations of the tripleviews:

1. Addition of primary key columns.

2. Creation of separate tripleviews for different data types.

3. Materialization of views.

1. Addition of primary key columns:

Indices improve the performance of relational databases by reducing the number of disk accesses required to execute a query. An index maps a key value, such as a primary key, to a pointer to the physical location on disk where the corresponding tuple is stored.

Sequeda argues that SQL optimizers cannot leverage indices to speed up query execution, because the subject column S and the object column O of a tripleview do not correspond to the primary keys of the triple's source relation. Therefore, two additional columns can be added to a tripleview: S_pk, which holds the primary key of the tuple from which the subject is taken, and O_pk, which does the same for the object. O_pk is NULL if O is a literal rather than an IRI.

Because the views the system works with are actually queries that are executed whenever a view is accessed, the data itself remains stored in the source relations. Queries that join on the additional primary key columns can therefore exploit the indices of the source relations, which speeds up execution.

Example 42

Consider the description view from example 41. Adding the primary keys from the source relation to the tripleview results in the view depicted in table 20. Here, the value in the primary key column for the object is NULL, because the objects are literals.

descriptionView

S | S_pk | P | O | O_pk
http://www.uni.com/course/c1 | c1 | http://www.uni.com/description | "Teaches the basics of mathematics." | NULL
http://www.uni.com/course/c2 | c2 | http://www.uni.com/description | "Physics is one of the most fundamental scientific disciplines, and its main goal is to understand how the universe behaves." | NULL
http://www.uni.com/course/c3 | c3 | http://www.uni.com/description | "Teaches relational algebra and SQL." | NULL

Table 20: Triple view having additional primary key columns.
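The view from example 42 can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the COURSES relation from table 18 is not reproduced in this section, so its column names (ID, DESCRIPTION) are assumed, and SQLite stands in for whichever database the OBDA system actually uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Assumed shape of the COURSES source relation (columns are hypothetical).
CREATE TABLE COURSES (ID TEXT PRIMARY KEY, DESCRIPTION TEXT);
INSERT INTO COURSES VALUES
  ('c1', 'Teaches the basics of mathematics.'),
  ('c3', 'Teaches relational algebra and SQL.');

-- Tripleview with the two additional primary key columns S_pk and O_pk.
CREATE VIEW descriptionView AS
SELECT 'http://www.uni.com/course/' || ID AS S,
       ID                                AS S_pk,
       'http://www.uni.com/description'  AS P,
       DESCRIPTION                       AS O,
       NULL                              AS O_pk  -- object is a literal
FROM COURSES;
""")

# Filtering on S_pk compares against the key column of the source relation
# instead of the concatenated subject IRI.
rows = conn.execute(
    "SELECT S, S_pk, O_pk FROM descriptionView WHERE S_pk = ?", ("c1",)
).fetchall()
print(rows)
```

The condition on S_pk refers directly to the primary key column of COURSES, which is the kind of predicate an optimizer can match against an index, whereas a condition on S refers to a computed string.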

2. Creation of separate tripleviews for different data types:

The second proposed tripleview optimization creates separate views depending on the datatype of the object column in a tripleview. In the first step, separate tripleviews were created depending on the predicate of a triple or the class of an instance, as described above. The triples in such a tripleview may come from different source relations, and all values in these tripleviews were cast to the datatype varchar. Sequeda argues that the size of the object column in a tripleview is then the same as that of the largest column among the source relation columns that are mapped to the object column. This leads to poor query performance. Therefore, separate tripleviews are created for the same property when the object columns have different datatypes.

BOOKS

ID | NAME | DESCRIPTION
b1 | Basics of Databases | This book covers the basic topics of databases.
b2 | Physics in a Nutshell | A collection of physics formula.

Table 21: Relation holding information about literature used at a university.

Example 43

Consider the BOOKS relation depicted in table 21, which holds information about books used for teaching at a university. Furthermore, consider a mapping rule that creates triples from this relation, where the subject corresponds to the ID, the predicate is http://www.uni.com/description and the object is a literal that corresponds to the value stored in the DESCRIPTION column. The mapping looks like:

BOOKS ↝ (http://www.uni.com/books/{ID}, http://www.uni.com/description, "{DESCRIPTION}")    (5)

Now assume that the DESCRIPTION column in the BOOKS relation is of type varchar(50) and that the DESCRIPTION column in the COURSES relation depicted in table 18 is of type varchar(150). Even though the mapping rules define that triples with the predicate http://www.uni.com/description should be created from both relations, the triples would not be stored in the same tripleview, because the object columns are of different datatypes. Instead, two tripleviews would be created for the property http://www.uni.com/description: one for the triples whose object column has the datatype varchar(50) and one for those whose object column has the datatype varchar(150).
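The splitting rule from example 43 can be sketched as a grouping step: one tripleview per combination of predicate and object datatype. The function name `group_tripleviews` and the tuple layout of the mappings are hypothetical and chosen only for this illustration; they do not reflect the actual implementation.

```python
from collections import defaultdict

# Each mapping: (source relation, predicate IRI, datatype of the object column).
mappings = [
    ("COURSES", "http://www.uni.com/description", "varchar(150)"),
    ("BOOKS", "http://www.uni.com/description", "varchar(50)"),
]


def group_tripleviews(mappings):
    """Group source relations into one tripleview per (predicate, datatype)."""
    views = defaultdict(list)
    for relation, predicate, datatype in mappings:
        views[(predicate, datatype)].append(relation)
    return dict(views)


views = group_tripleviews(mappings)
for (predicate, datatype), relations in views.items():
    print(predicate, datatype, relations)
```

With the two mappings above, the same property ends up in two separate tripleviews, one fed by COURSES and one fed by BOOKS.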

3. Materialization of views:

In UltrawrapOBDA a distinction is drawn between tripleviews and materialized tripleviews. Tripleviews are stored as queries, which are executed whenever a view is accessed. Materialized tripleviews, on the other hand, are stored as actual relations. This means that the underlying query does not have to be executed when the materialized tripleview is accessed.
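The difference between the two kinds of tripleview can be illustrated with sqlite3. SQLite has no materialized views, so this sketch emulates one with CREATE TABLE ... AS SELECT; the COURSES columns and the name descriptionViewMat are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Assumed shape of the COURSES source relation (columns are hypothetical).
CREATE TABLE COURSES (ID TEXT PRIMARY KEY, DESCRIPTION TEXT);
INSERT INTO COURSES VALUES ('c1', 'Teaches the basics of mathematics.');

-- Tripleview: stored as a query and evaluated on every access.
CREATE VIEW descriptionView AS
SELECT 'http://www.uni.com/course/' || ID AS S,
       'http://www.uni.com/description'  AS P,
       DESCRIPTION                       AS O
FROM COURSES;

-- Emulated materialized tripleview: the query result is stored as a relation.
CREATE TABLE descriptionViewMat AS SELECT * FROM descriptionView;
""")

# A later change to the source relation is visible through the view only.
conn.execute(
    "INSERT INTO COURSES VALUES ('c3', 'Teaches relational algebra and SQL.')"
)
view_rows = conn.execute("SELECT COUNT(*) FROM descriptionView").fetchone()[0]
mat_rows = conn.execute("SELECT COUNT(*) FROM descriptionViewMat").fetchone()[0]
print(view_rows, mat_rows)
```

The stored copy occupies additional space and, in this emulation, is not refreshed when the source relation changes, while the plain view re-runs its query on each access.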

Sequeda argues that materializing every tripleview leads to the best query execution times of UltrawrapOBDA, at the cost of additional space. Materializing no tripleview requires no additional space, but leads to higher query execution times. To keep the space required for storing tripleviews small while still achieving low execution times, Sequeda proposes to materialize only leaf views. In the case of properties, leaf views are the tripleviews whose property has no subproperties. In the case of classes, leaf views are the tripleviews whose class has no subclasses.

Example 44

Consider the ontology shown in listing 3, which defines that bachelor students and master students are subclasses of the class student. Furthermore, bachelor and master students do not have any subclasses, and there is no instance of student that is not also a bachelor or master student. Therefore, the leaf views are the views for the bachelor student and master student classes, and these tripleviews are materialized. Finally, the non-materialized student tripleview is defined as the union of the tripleviews of its subclasses, i.e. bachelor and master student.
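The selection of leaf views in example 44 can be sketched as follows. The class names and the dictionary encoding of the subclass hierarchy are hypothetical stand-ins for the ontology in listing 3, and the helper functions are illustrative only.

```python
# Direct subclasses of each class (hypothetical encoding of the ontology).
subclasses = {
    "Student": ["BachelorStudent", "MasterStudent"],
    "BachelorStudent": [],
    "MasterStudent": [],
}


def leaf_classes(subclasses):
    """Classes without subclasses; their tripleviews get materialized."""
    return sorted(cls for cls, subs in subclasses.items() if not subs)


def union_members(cls, subclasses):
    """Leaf views whose union defines the tripleview of a class."""
    subs = subclasses[cls]
    if not subs:
        return [cls]
    members = []
    for sub in subs:
        members.extend(union_members(sub, subclasses))
    return members


materialized = leaf_classes(subclasses)
student_union = union_members("Student", subclasses)
print(materialized, student_union)
```

Here only the bachelor and master student views are materialized, and the student view is expressed as the union of those two.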