Evaluation Results - Global-as-View Ontology-Based Data Access for Relational Data [09/2019]


6.3. Evaluation Results

The first step of both the reimplemented system and the optimized reimplemented system is to saturate the mapping. The newly generated saturated mapping is equivalent to the saturated mapping provided with the benchmark. This suggests that the mapping saturation for the ontological term subClassOf is implemented correctly.
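The idea of saturating a mapping with respect to subClassOf can be sketched as follows. This is a minimal illustration, not the implementation evaluated here: the class names, the SQL strings, and the helper functions `superclasses` and `saturate` are all hypothetical. Each class inherits the source queries of all of its direct and indirect subclasses.

```python
from collections import defaultdict


def superclasses(cls, sub_class_of):
    """Return all direct and indirect superclasses of cls."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in sub_class_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def saturate(mapping, sub_class_of):
    """Propagate the source queries of every class to all its superclasses."""
    saturated = defaultdict(set)
    for cls, queries in mapping.items():
        saturated[cls] |= set(queries)
        for parent in superclasses(cls, sub_class_of):
            saturated[parent] |= set(queries)
    return dict(saturated)


# Hypothetical ontology: Student subClassOf Person, Person subClassOf Agent.
sub_class_of = {"Student": ["Person"], "Person": ["Agent"]}
# Hypothetical mapping from classes to SQL source queries.
mapping = {
    "Student": ["SELECT id FROM student"],
    "Person": ["SELECT id FROM person"],
}

saturated = saturate(mapping, sub_class_of)
for cls in sorted(saturated):
    print(cls, sorted(saturated[cls]))
```

In this sketch the queries per class are kept in a set, so a query that reaches a superclass along several paths is stored only once.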

Table 41 gives the average query execution times for the individual depths and selectivities for Ultrawrap^OBDA, the reimplemented system, the optimized reimplemented system and Ontop. For instance, D5_20 in the first column means that the ontology had a depth of 5 and that the selectivity was 20.

The results show that the execution times of the reimplementation and the optimized reimplementation are lower than the execution times of Ultrawrap^OBDA and Ontop in all cases. Furthermore, the execution times of the reimplementation and the optimized reimplementation are very similar to each other.

The highest average execution time per query for each OBDA system occurs for the ontology with depth 2 and a selectivity of 2. Ultrawrap^OBDA needs 9795ms on average to execute a query, the reimplementation needs 1798ms, the optimized reimplementation needs 1790ms and Ontop needs 2206ms. This means that Ultrawrap^OBDA needs about 5.5 times as long as the reimplementation and the optimized reimplementation. Ontop, on the other hand, needs only about 1.23 times as long as the newly implemented systems. The shortest execution times are achieved for the ontology with depth 2 and a selectivity of 100, except for Ontop, which is marginally faster at a selectivity of 50 (677ms). The average execution time per query for Ultrawrap^OBDA in this setting is 550ms, while Ontop is slower with 721ms. The reimplemented system needs only 121ms to execute a query and the optimized reimplementation needs about 111ms per query. Relative to the mean of the two newly implemented systems (116ms), Ultrawrap^OBDA needs about 4.7 times as long and Ontop needs about 6.2 times as long.
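The ratios for the two extreme settings can be recomputed from the values in table 41. The sketch below assumes that the baseline is the mean of the reimplementation and the optimized reimplementation. That choice is an assumption made for illustration, so small rounding differences from the quoted factors are possible. The dictionary layout and the function name `slowdown` are hypothetical.

```python
# Average execution times in ms for the two extreme settings (from table 41).
times_ms = {
    "D2_2": {
        "Ultrawrap^OBDA": 9795,
        "Reimplementation": 1798,
        "Optimized": 1790,
        "Ontop": 2206,
    },
    "D2_100": {
        "Ultrawrap^OBDA": 550,
        "Reimplementation": 121,
        "Optimized": 111,
        "Ontop": 721,
    },
}


def slowdown(setting, system):
    """Ratio of a system's time to the mean of both reimplementations.

    Using the mean of the two reimplementations as the baseline is an
    assumption of this sketch.
    """
    row = times_ms[setting]
    baseline = (row["Reimplementation"] + row["Optimized"]) / 2
    return row[system] / baseline


for setting in times_ms:
    for system in ("Ultrawrap^OBDA", "Ontop"):
        print(f"{setting} {system}: {slowdown(setting, system):.2f}x")
```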

∗Since Ultrawrap^OBDA is not publicly available, the results for this system are taken from [1]; Ultrawrap^OBDA was therefore evaluated with a different experimental setting.

Depth and Select.  Ultrawrap^OBDA∗  Reimplementation  Optimized Reimplementation  Ontop
D6_100                         903               412                         402   1395
D6_50                          960               443                         446   1145
D6_20                          952               469                         453   1024
D6_10                         1150               589                         580   1073
D6_5                          2237               834                         790   1222
D6_2                          3528              1770                        1736   2031
D5_100                         876               400                         371   1277
D5_50                          904               392                         379   1046
D5_20                         1094               475                         470   1010
D5_10                         1087               543                         550   1044
D5_5                          2196               858                         846   1297
D5_2                          3934              1718                        1709   2111
D4_100                        1148               426                         361   1293
D4_50                         1205               428                         411   1090
D4_20                         1259               472                         449    978
D4_10                         1525               535                         532   1004
D4_5                          2973               874                         872   1328
D4_2                          4990              1789                        1759   2027
D3_100                        1040               305                         300   1116
D3_50                         1253               322                         313    979
D3_20                         1417               365                         362    879
D3_10                         1793               453                         452    998
D3_5                          2537               689                         678   1233
D3_2                          6697              1746                        1753   2183
D2_100                         550               121                         111    721
D2_50                          640               133                         142    677
D2_20                          995               200                         198    710
D2_10                         1955               362                         359    881
D2_5                          3835               718                         713   1260
D2_2                          9795              1798                        1790   2206
Average                       2156               687                         676   1274

Table 41: Execution times of SPARQL queries on OBDA systems in ms.

In general, Ultrawrap^OBDA seems to be faster than Ontop in settings with higher selectivities such as 100 or 50, whereas Ontop seems to be faster than Ultrawrap^OBDA with lower selectivities such as 2 or 5. Since Ultrawrap^OBDA was evaluated with a different experimental setup and since that evaluation is 6 years old, the comparison with Ultrawrap^OBDA has to be treated with care. Figure 8 shows the average execution time per query over all depths and selectivities.

Figure 8: Average execution time per query.

The chart shows that the average execution time of Ultrawrap^OBDA is the highest with 2156ms per query; on average, Ultrawrap^OBDA thus needs 3.16 times as long as the newly implemented systems. Ontop has an average query execution time of 1274ms and is faster than Ultrawrap^OBDA but slower than the newly implemented systems. On average, Ontop needs 1.87 times as long as the newly implemented systems.

Note that Ontop neither creates any views nor stores any additional data in the relational database; instead, it creates SQL queries that are executed directly on the unchanged relational database. The unoptimized reimplementation needs 687ms on average and the optimized reimplementation needs 676ms on average to execute a query. The reimplementation and the optimized reimplementation therefore have comparable execution times, even though the optimized version supports exclusive superclass instances and needs less space to materialize views.

The space required to store the relational data of the Texas Benchmark without any views or materialized views is 1219MB. Materializing views requires additional space. There is no information on how much additional space Ultrawrap^OBDA needs to materialize views, and Ontop does not create any additional views. Therefore, only the space required by the reimplementation and the optimized reimplementation is compared. Figure 9 presents the additional space needed for materializing views.

Figure 9: Additional Space Required for Materializing Views.

The chart shows that the optimized version requires only 91MB of additional storage space for the materialized views, whereas the unoptimized version requires 205MB. The optimization thus reduces the required space by 114MB, or about 56%. While the optimization reduced the storage space, the execution time remained essentially unchanged, as shown in figure 8.
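The storage figures can be checked with a few lines of arithmetic. The following sketch only recomputes values stated in the text. The variable names are hypothetical, and the total database sizes are derived here under the assumption that base data and materialized views simply add up.

```python
# Storage figures in MB as reported in the text.
BASE_DATA_MB = 1219          # Texas Benchmark data without any views
UNOPTIMIZED_VIEWS_MB = 205   # additional space, unoptimized materialized views
OPTIMIZED_VIEWS_MB = 91      # additional space, optimized materialized views

saved_mb = UNOPTIMIZED_VIEWS_MB - OPTIMIZED_VIEWS_MB
relative_saving = saved_mb / UNOPTIMIZED_VIEWS_MB

# Assumption of this sketch: total size = base data + materialized views.
total_unoptimized_mb = BASE_DATA_MB + UNOPTIMIZED_VIEWS_MB
total_optimized_mb = BASE_DATA_MB + OPTIMIZED_VIEWS_MB

print(f"saved: {saved_mb} MB ({relative_saving:.1%} of the view storage)")
print(f"total: {total_unoptimized_mb} MB vs. {total_optimized_mb} MB")
```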

In [20] it is argued that database benchmarks tend to measure execution times but often neglect the completeness and correctness of query results. Since fast query execution is not helpful if the result set is incorrect or incomplete, the virtualized graph created in the compilation phase has been compared to the results of Ontop. Executing the query SELECT * WHERE {?s ?p ?o} returns a result set containing all triples in the OBDA system. This result set was created for both Ontop and the newly implemented system. The comparison shows that each result in the result set of Ontop also exists in the result set of the new system and vice versa. However, the result set of the newly implemented system included duplicate results. These duplicates were caused by the mapping saturation, because the subclass relationship did not have a tree structure. Other potential causes of duplicate results are a careless definition of mapping rules or the implemented bag semantics of SPARQL. Nevertheless, apart from the duplicates, the reimplemented system and Ontop returned the same results, and thus it seems that both systems return complete and correct results.
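A comparison of this kind can be sketched as follows. The triples are made up and the function `compare_results` is hypothetical; it is not the procedure used in this evaluation. The sketch compares the two result sets under set semantics and separately reports triples that occur more than once in the candidate result.

```python
from collections import Counter


def compare_results(reference, candidate):
    """Compare two lists of (s, p, o) triples.

    Returns the triples missing from the candidate, the triples found only
    in the candidate, and the candidate triples occurring more than once.
    """
    reference_counts = Counter(reference)
    candidate_counts = Counter(candidate)
    missing = set(reference_counts) - set(candidate_counts)
    extra = set(candidate_counts) - set(reference_counts)
    duplicates = {t: n for t, n in candidate_counts.items() if n > 1}
    return missing, extra, duplicates


# Hypothetical result of SELECT * WHERE {?s ?p ?o} on the reference system.
reference = [
    ("ex:alice", "rdf:type", "ex:Student"),
    ("ex:alice", "rdf:type", "ex:Person"),
]
# Hypothetical result of the new system: same triples, one of them twice,
# as can happen when a superclass is reached along two subclass paths.
candidate = [
    ("ex:alice", "rdf:type", "ex:Student"),
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:alice", "rdf:type", "ex:Person"),
]

missing, extra, duplicates = compare_results(reference, candidate)
print("equal as sets:", not missing and not extra)
print("duplicates:", duplicates)
```

Under set semantics the two hypothetical results are equal, while the duplicate count exposes the extra row that bag semantics would preserve.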
