
Methods and Cost Models for XPath Query Processing in Main Memory Databases

Henning Rode
Master Thesis
Universität Konstanz

October 2003


Contents

1 Introduction
  1.1 Outline
  1.2 The Monet MMDB
  1.3 The XPath Query Language
  1.4 Query Optimization in XPath

2 XPath Evaluation
  2.1 MMDBMS specific Adaptations
  2.2 General Pre-/Postorder Properties
    2.2.1 Tree Awareness
    2.2.2 Ancestor and Descendant Sizes
  2.3 XPath Operators
    2.3.1 Descendant and Ancestor Axis
    2.3.2 Following and Preceding Axis
    2.3.3 Child Axis
    2.3.4 Parent Axis
    2.3.5 Other Axes
    2.3.6 Node Tests
  2.4 Conclusion

3 Result Size Estimation
  3.1 Related Research
  3.2 Preliminaries
  3.3 Child Axis
  3.4 Descendant Axis
    3.4.1 Sampling
  3.5 Parent Axis
    3.5.1 Fan-Out Specific Groups
    3.5.2 Tag-Name Specific Groups
    3.5.3 Summarization of Groups
  3.6 Ancestor Axis
    3.6.1 Level Model of the Pruned Context Set
    3.6.2 Overlap-Free Ancestor Estimation
    3.6.3 Tag-Name Specific Context Sets
  3.7 Preceding-/Following-Sibling Axis
    3.7.1 Tag-Name Specific Context Sets
  3.8 Other Axis Steps
  3.9 Node Tests
  3.10 Experimental Studies
    3.10.1 Setup
    3.10.2 Accuracy
    3.10.3 Performance
    3.10.4 Storage Requirements
  3.11 Conclusion

4 Cost Models
  4.1 Hierarchical Memory Access Models
    4.1.1 Data Regions and Access Patterns
  4.2 Data Access of the XPath Operations
    4.2.1 Determining a Generic Access Pattern
    4.2.2 Axis Specific Parameters
  4.3 Calibration of the CPU Costs
    4.3.1 CPU Cost Functions for the Axis Step Algorithms
  4.4 Experimental Studies
    4.4.1 Setup
    4.4.2 Cache Miss Tests
    4.4.3 Calibration Tests
    4.4.4 Accuracy Tests
    4.4.5 Estimation Based Tests
  4.5 Conclusion

5 Future Work
  5.1 Improving the Evaluation of Step Expressions
  5.2 Open Cost Modeling Tasks
  5.3 Searching for Optimal Query Plans


Abstract

Recent work on XPath evaluation has produced efficient relational index structures for maintaining and querying XML through a DBMS. Built on top of a relational encoding named the XPath Accelerator, this thesis takes a closer look at its utilization within the scope of query processing.

Basic XPath operations, such as axis steps and simple node tests, are the focus of this study. Appropriate database operations for their evaluation are introduced in the context of the main memory DBMS Monet. Where the existing database operators fail to exploit the tree properties of XML data, new algorithms have been developed, designed specifically for the evaluation of XPath axes.

As an important step towards a cost analysis for the proposed XPath operations, result size estimation is discussed with respect to the trade-off between accuracy and expense. Different methods show how statistical data as well as sampling techniques can be used for estimating the result sizes of simple axis steps.

The generation of cost functions mainly considers the time that the XPath operations spend on data access. Even in main memory databases, CPU processing usually stalls on outstanding memory fetches. Our cost functions therefore explicitly analyze the cache usage of the operations, adopting a hierarchical memory access model.

Detailed tests demonstrate the accuracy and performance of the proposed result size and cost estimation techniques.


1 Introduction

Applications with XML-based data storage rely on specialized languages, such as XQuery and XPath, when querying and retrieving particular information from large volumes of XML data. Increasing document sizes, however, make efficient query processing ever more challenging. A number of recent research projects have tackled this issue with a variety of approaches. Besides SAX- or DOM-based processing of the textual XML representation, documents are often stored and queried within database systems. The latter approach can be further classified: some systems use a new kind of DBMS for native storage of XML data, while others map the tree-shaped XML data onto conventional relational tables, aiming to exploit the performance of existing RDBMSs.

This study is linked to a particular project of the latter category, the Pathfinder XQuery Compiler, which however opts for a specialized main memory database. The outcomes of the following work should nevertheless be adaptable for the most part to serve XPath processing in any kind of RDBMS.

Efficient XPath query processing on top of a database back-end is concerned with at least the following issues:

• finding appropriate storage and index structures,

• developing specialized algorithms to serve typical XPath operations,

• tactical and strategical query optimization.

Since recent research has already presented efficient encodings for maintaining and querying XML in relational tables [GKT04], we will mainly focus on the second and third task.

The term tactical optimization refers to the choice of the best matching algorithm for executing an operation in a given situation. For example, in the case of a join operation, database systems usually dispose of more than one implementation of its execution. Depending on the context conditions, e.g., whether the operands are sorted or whether an index exists, the most efficient algorithm should be identified. In contrast, strategical optimization, acting at a higher level, means searching for the best query plan with respect to the order of single operations. Database systems usually perform this kind of optimization to find the best join order or to push down selections. Although tactical and strategical optimization address different problems, it is often impossible to separate their tasks: strategical decisions obviously have an impact on the choice of algorithms, and vice versa.

In order to enable such optimization, cost models are required for comparing the performance of different query plans and applied algorithms in advance. They should provide an accurate yet fast estimate of the time needed for the execution of a given operation. Hence, the design of new algorithms for efficiently serving XPath operations should also include a cost model of their expected performance.

1.1 Outline

To give an overview of the structure of this study, we start with two short introductory excurses on the specifics of the applied MMDB system and the XPath query language. Chapter 2 subsequently introduces the data model, specially designed for storing and efficiently querying XML data within the database. It is also shown how XPath operations can be further improved by replacing standard database operations with new algorithmic approaches. Preceding the actual cost analysis of the proposed methods, Chapter 3 addresses the issue of result size estimation, which can be regarded as a largely independent but nevertheless important subtopic of cost modeling. The need for accurate size estimates becomes apparent when deriving data access costs for the XPath operations in the last chapter. Both result size estimation and further cost modeling are accompanied by an experimental study, showing the accuracy and performance achieved in a set of defined test cases. The experiments also provide a background for the final discussion of the achievements made as well as of the needs for further research.

1.2 The Monet MMDB

Compared to conventional RDBMSs, our chosen main memory database system MonetDB [MDB] comes with a number of non-standard facilities. Based on research on the CPU performance and cache utilization of typical database operations [Bon02, MBK00, Man02, MBK02], the Monet database kernel deploys new concepts for the storage and access of data tables. In the following, only the most important issues with respect to the design of our XPath algorithms are touched upon.

The BAT Concept Monet comes with the restriction of supporting only binary tables on the physical level of storage, named binary association tables (BATs). Relations with more than two attributes are fragmented vertically and have to be joined for a full table view. In terms of the BAT algebra, the first column of a binary table is called the head, the second the tail.

The main aim of the BAT data model is to keep records as small as possible, resulting in a higher utilization of the caches. Following the observation that on current hardware architectures main memory access has become the new bottleneck for database operations [Bon02], improving cache performance becomes a basic concern of any kind of data access. Typical database operations, e.g., selections on a single property or joins comparing two fields, involve only a few attributes of their argument relations. If record sizes are large, the data relevant to an operation is interleaved with large blocks of non-relevant data in adjacent memory locations. It is then unlikely that more than one needed data item is accessed within a single cache line load; furthermore, the total number of cached data items is bounded by the number of available cache lines. Hence, processing time on any current hardware system would be dominated by the time for memory access.

Since BATs ensure small record sizes and therefore a higher locality of relevant data, cache utilization is improved significantly. Figure 1.1 visualizes the benefits of traversing the data stored in a BAT compared to a conventional data table.


Figure 1.1: Cache-line covering of large-sized vs. small BAT records (cache lines over records with relevant and non-relevant attributes).

The void Data Type On the other hand, having only BATs would lead to a high redundancy of data storage. Vertical fragmentation comes with the need of maintaining the primary key in every single table and, even worse, complex queries cause a high number of additional key joins that would considerably undermine performance. To overcome these problems, Monet provides the feature of void columns, containing so-called virtual object identifiers. Instead of storing a consecutive sequence of integer numbers, which is the most commonly used data type for unique key values, the system just keeps the offset and the length of such a dense sequence.

Hence, a BAT containing a column of type void is handled like a one-dimensional array. This approach not only avoids redundancy, but also significantly accelerates random key access: searching for a key value in a void column is reduced to a simple positional lookup at memory address offset + key. We can thus regard a void column as a kind of index on the data items of the BAT's tail. Special join implementations exploit the void property of their operands, thus minimizing the overhead caused by additional key joins.

Notice that the array structure of a void BAT requires its combination with a fixed-length data type. To avoid restrictions on the combination of types, the Monet system stores variable-sized data in a separate heap, addressed by fixed-length pointers from within the respective BAT.
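The positional-lookup behaviour of a void column can be illustrated with a small sketch (the class name and layout are our own illustration, not Monet's actual implementation):

```python
# Illustrative sketch of a BAT whose head is a void column: only the
# offset of the dense key sequence is kept, so key lookup degenerates
# to array indexing (no search, no stored head values).
class VoidBAT:
    def __init__(self, offset, tail):
        self.offset = offset          # first virtual oid of the dense head
        self.tail = tail              # materialized tail values only

    def find(self, key):
        # positional lookup at "address" offset + key, as described above
        return self.tail[key - self.offset]

names = VoidBAT(1000, ["a", "b", "c", "d"])   # virtual oids 1000..1003
print(names.find(1002))                       # "c"
```

The same idea explains why void-based key joins are cheap: the probe side never searches, it only indexes.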

The MIL Language Monet strictly distinguishes between the database back-end and the different kinds of applications run on top of it, e.g., an SQL front-end or a specialized data mining tool. The intermediate language for the communication between the two parts, the Monet Interpreter Language (MIL), provides a small but rather straightforward interface to the back-end functionality. It mainly includes a set of specialized algebra operators working on the BAT data model and the supported data types. Furthermore, as a procedural language, MIL also provides basic control structures and allows users to define their own functions.

Whereas we use MIL for all kinds of standard database operations, the existing language constructs are not sufficient for effective XPath evaluation. Since MIL code is interpreted, loops in particular execute significantly slower than in any compiled language. For the implementation of the algorithms proposed in Chapter 2, we thus make use of MIL's extensibility, creating a new module of fast operators, implemented in C and specialized in supporting XPath operations.

1.3 The XPath Query Language

XPath [BBC+03] is designed to address parts of XML documents. Using a path notation, it allows navigating through the structure of an XML document.


Axis Step            | Result
v/child              | element child nodes of v
v/descendant         | recursive closure of the child axis
v/descendant-or-self | v itself and its descendants
v/parent             | parent of v
v/ancestor           | recursive closure of the parent axis
v/ancestor-or-self   | v itself and its ancestors
v/following          | element nodes following v in document order
v/preceding          | element nodes preceding v in document order
v/following-sibling  | following nodes having the same parent as v
v/preceding-sibling  | preceding nodes having the same parent as v
v/self               | v
v/attribute          | attribute nodes of v

Table 1.1: Overview of axis semantics originated in context node v.

Being a functional language, XPath consists basically of expressions. XPath expressions always evaluate to a sequence of items, i.e., an ordered set of zero or more items. All information that affects the result of an expression is called the expression's context. In this study, we concentrate on a subset of XPath expressions, namely the path expressions, as they represent the most specific part of the XPath language. A simple example of a path expression may help to explain the introduced terms:

/descendant::address/child::email.

The rooted path expression can be divided into two step expressions. The descendant step, being the first one, originates in the root node. Its result sequence is passed over as the context to the subsequently evaluated child step. Both axis steps in the example are combined with name tests to further filter the result sequences of the respective steps. Thus, the above path expression selects

all elements named “email” that have an “address” parent and that are in the same document.

Besides the descendant and the child axes, the XPath language specifies ten further axes describing the position of the nodes in the XML tree with respect to a context node. [BBC+03] further categorizes these axes into forward and reverse ones, based on whether the nodes selected by the axis lie before or after the current context node in document order.

Table 1.1 lists all existing axes; for the exact definitions of the axis semantics the reader is referred to [BBC+03]. In our case the distinction between reverse and forward axes plays a minor role; instead we introduce the term major axes for the ancestor, descendant, preceding, and following axes, which partition the entire set of nodes in the document (see Sec. 2.2).

XPath defines the result sequence of any step expression to be sorted in ascending document order. Informally, document order corresponds to a depth-first, left-to-right traversal of the nodes in the tree representation of an XML document. A precise definition is given at the beginning of Chapter 2 in the context of introducing preorder values. Although XPath always regards a set of nodes as an ordered sequence, we only use this term whenever the current ordering of nodes is important in the considered context. Otherwise we refer to the context set CS, in which the step originates, and the node set NS representing the base set of XML tree nodes over which the evaluation takes place. Both CS and NS completely parameterize an axis step S and determine its result set RS: RS ← S(CS, NS).


Every step expression in XPath is combined with a node test. A node test can be either a name test, as in the above example, or a kind test, which enables the selection of nodes based on their kind. Name tests filter out element nodes having the specified tag-name, except for the attribute axis, in which case the attributes of the given name are selected instead. Other kinds of selections are expressed in XPath by the optional specification of predicates. Any boolean expression is allowed as a predicate, indicating whether a single node that has passed the node test qualifies for the result or not. However, the impact of predicates on the result size and evaluation performance of step expressions is beyond the scope of this work.

With the basic structure of path expressions being /S1/S2/.../Sn, the process of evaluation is performed from left to right. Step expression Si is evaluated, and its result serves as the dynamic context for Si+1. If the context set of a step expression contains more than a single node, [BBC+03] describes its processing as successive single evaluations for each of its context nodes, followed by a final merging of the results, which performs a sorting in document order and a duplicate elimination.
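This left-to-right evaluation process can be sketched as follows (a minimal illustration with our own toy axis functions; node identifiers are the preorder ranks of the sample document of Fig. 2.1):

```python
# Minimal sketch of the evaluation process described above: each step is
# evaluated once per context node, and the merged result is sorted in
# document order with duplicates eliminated.
def eval_path(steps, context, ns):
    """steps: list of functions (node, ns) -> iterable of nodes."""
    for step in steps:
        result = set()
        for c in context:                # successive single evaluations
            result.update(step(c, ns))
        context = sorted(result)         # document order + dedup
    return context

# Toy axes over the document of Fig. 2.1, given as a parent->children map:
children = {0: [1, 7, 8, 11], 1: [2, 3, 6], 3: [4, 5], 8: [9, 10]}

def child(c, ns):
    return children.get(c, [])

def descendant(c, ns):
    out, stack = [], list(children.get(c, []))
    while stack:                         # recursive closure of child
        n = stack.pop()
        out.append(n)
        stack.extend(children.get(n, []))
    return out

print(eval_path([descendant, child], [0], None))  # [2, 3, 4, 5, 6, 9, 10]
```

The final `sorted(set(...))` mirrors the merging step of [BBC+03]; node tests would simply filter each step's result before merging.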

1.4 Query Optimization in XPath

When trying to find starting points for query optimization, we have to identify exchangeable parts within query plans that might differ in their execution performance but remain semantically equivalent. However, path expressions are neither commutative nor associative with respect to the ordering of their steps.

[WJLY03] resolves the problem by regarding an axis step as a containment join of the two sets CS and NS, redefining CS, NS, and RS as sets of node pairs, i.e., every c ∈ CS is a node tuple of the form ⟨c1, c2⟩. The result of an axis step can then be specified as

S(CS, NS) = {⟨c1, n2⟩ | ⟨c1, c2⟩ ∈ CS, ⟨n1, n2⟩ ∈ NS, n1 on axis S of c2}.

Any path expression CS/S0/S1/.../Sn performed on document D is translated to

Sn(... S1(S0(CS′, NS′), NS′) ..., NS′),  CS′ = {⟨c, c⟩ | c ∈ CS},  NS′ = {⟨n, n⟩ | n ∈ D}.

The use of these node pairs allows the query plans emerging from a path expression to be rewritten. For any two step expressions Si, Si+1, with Si+1 following directly on Si in the considered path, the step evaluation can be exchanged as follows:

Si+1(Si(CS, NS), NS) ≡ Si(CS, Si+1(NS, NS)).

In order to meet XPath semantics, the final result of the path expression has to be post-processed to yield a sorted, duplicate free node sequence.
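The exchange rule can be checked on a toy document (our own sketch, not the thesis' code; the containment join is implemented naively as a nested loop over pre/post pairs of Fig. 2.1):

```python
# Toy check of the step-reordering equivalence from [WJLY03] on the
# document of Fig. 2.1; nodes are identified by their preorder ranks.
table = [(0, 11), (1, 5), (2, 0), (3, 3), (4, 1), (5, 2), (6, 4),
         (7, 6), (8, 9), (9, 7), (10, 8), (11, 10)]
post = dict(table)
nodes = [p for p, _ in table]

def level(v):  # number of ancestors (Eq. 2.2)
    return sum(1 for u in nodes if u < v and post[u] > post[v])

def on_axis(n, c, axis):
    desc = n > c and post[n] < post[c]            # descendant test
    return desc if axis == "descendant" else desc and level(n) == level(c) + 1

def step(axis, CS, NS):
    # S(CS, NS) = {<c1,n2> | <c1,c2> in CS, <n1,n2> in NS, n1 on axis S of c2}
    return {(c1, n2) for (c1, c2) in CS for (n1, n2) in NS
            if on_axis(n1, c2, axis)}

NS0 = {(n, n) for n in nodes}
CS0 = {(0, 0)}                                    # context: the root a
lhs = step("child", step("descendant", CS0, NS0), NS0)
rhs = step("descendant", CS0, step("child", NS0, NS0))
print(lhs == rhs)  # True
```

Note how `rhs` materializes one pair per parent/child edge before the descendant join, illustrating the intermediate-result blow-up discussed next.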

Although this approach enables query optimization to a great extent, the node pairs introduce a significant overhead in terms of intermediate result sizes, especially for the major XPath axes. Whereas the evaluation process defined in [BBC+03] requires a duplicate elimination for the result of each step expression, the node pairs need to store each combination of nodes related by the specified axis, which contradicts the aim of finding query plans with smaller intermediate results.

A different way of query optimization is suggested by the authors of [OMFB02].

They searched for symmetries in the semantics of path expressions and derived a large set of path equivalences. Applying simple rewriting rules allows parts of a path to be exchanged for equivalent expressions that enable faster evaluation.

Figure 1.2: Query plans for the example path expression, changing the evaluation order of axis step S and node test N inside the step expressions.

The found symmetries are mainly concerned with rewriting reverse into forward axes, since reverse axes cause far higher evaluation costs in the case of SAX-like stream-based XML processing. In the context of DBMS-based querying, however, the differences between forward and reverse axes become negligible. Furthermore, the suggested path rewriting often leads to a higher complexity of the resulting expressions. For example, reverse axes are often replaced by forward axes combined with additional predicates. Thus, the approach would even cause additional costs for predicate evaluation in our case.

Recent research has generated more ideas on XPath query optimization, for instance, [KG02] performs logical optimization of path queries using information obtained from the corresponding DTDs. However, all these approaches consider the best query plan only in terms of an abstractly defined optimality. Without any knowledge about the actual implementation of the algorithms used for path evaluation, they cannot provide a concrete cost model for a given query plan.

We will thus pursue the long-term objective of query optimization from the opposite side. As described in the outline of this thesis (Sec. 1.1), we start our analysis at the level of the algorithms with the aim to provide detailed physical cost models for each introduced XPath operation. The models may be applied later by a query optimizer, to judge between possible query plans in any given situation.

As the above-described methods of path rewriting and containment join ordering would not be beneficial in our case, we constrain the optimization problem to identifying the best query plan for the evaluation of single step expressions. Similar to entire path expressions, the order of the axis step and the combined node test can become an issue of optimization. Figure 1.2 depicts two semantically equivalent query plans executing axis step S and node test N. The second version can be regarded as a typical selection push-down, reducing in advance the node set cardinality of the axis step. Intuitively, we would expect the node test push-down to show better performance in any case. Surprisingly, however, our further analysis will identify considerable advantages of the first version.


2 The XPath Accelerator and its Axis Evaluation

As mentioned before, this study is part of the Pathfinder project, whose principal aim is the construction of an XQuery engine. The project's underlying data model, the so-called XPath accelerator, described in detail in [GKT04], is the first subject of this section.

This thesis' goal is to exploit the performance of a relational database system for XPath processing. But using an RDBMS means working on relational tables, which do not allow a natural storage of tree-shaped XML data. To resolve this problem, an encoding is needed that maps the structure of an XML document and also supports efficient querying of all XPath axes. Recent work on the subject has shown that the pre/post plane [Gru02] is a very efficient XML encoding, at least for query-intensive usage.

In a nutshell, all nodes of the XML tree are labeled with preorder and postorder values. These two enumerations of the nodes are sufficient to represent the tree structure of the document. To be more precise, we can define an order on the sequence of XML nodes. If a, b are nodes in an XML document D:

a < b, if a appears before b in a sequential read of D.    (2.1)

For element nodes, whose start tag is separated from the end tag, only the start tag is taken into account. This order is called document order [FMM+03]. The enumeration of nodes in document order assigns an integer value to every node v ∈ D, called the preorder value pre(v).

If the end tag is considered instead of the start tag, a similar order can be defined on the XML node sequence. Again, the enumeration according to that order assigns an integer value to every node v ∈ D, the postorder value post(v).

With the above definitions of pre- and postorder, we can derive the respective values from the textual representation of D and store them in a simple relational table containing the tuples ⟨pre(v), post(v)⟩ for every node v ∈ D. Figure 2.1 shows the transformation of a small sample document into a pre/post table. This process can be executed efficiently with the help of a SAX [SAX] based parser. The SAX events startElement and endElement in combination with a stack suffice to build the pre/post table within one sequential read, where the stack never contains more elements than height(T_D), the height of the XML tree T_D corresponding to D. See [Gru02] for more detail on the implementation of the SAX callback procedures.
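A minimal sketch of such a SAX-based pre/post labeling (using Python's xml.sax for illustration, not the thesis' actual implementation) might look as follows:

```python
import xml.sax

class PrePostHandler(xml.sax.ContentHandler):
    """Builds the pre/post table in one sequential read; the stack depth
    never exceeds the height of the XML tree."""
    def __init__(self):
        self.pre = 0        # counts start tags (document order)
        self.post = 0       # counts end tags
        self.stack = []     # preorder ranks of the currently open elements
        self.table = []     # (pre, post) rows, emitted at each endElement

    def startElement(self, name, attrs):
        self.stack.append(self.pre)
        self.pre += 1

    def endElement(self, name):
        self.table.append((self.stack.pop(), self.post))
        self.post += 1

doc = "<a><b><c/><d><e/><f/></d><g/></b><h/><i><j/><k/></i><l/></a>"
handler = PrePostHandler()
xml.sax.parseString(doc.encode(), handler)
print(sorted(handler.table))   # reproduces the pre/post table of Fig. 2.1
```

Running this on the sample document of Fig. 2.1 yields exactly the table shown there, e.g. ⟨0, 11⟩ for the root a and ⟨2, 0⟩ for the first-closed element c.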

<a>
  <b>
    <c/>
    <d>
      <e/>
      <f/>
    </d>
    <g/>
  </b>
  <h/>
  <i>
    <j/>
    <k/>
  </i>
  <l/>
</a>

(a) XML Document

pre  post
0    11
1    5
2    0
3    3
4    1
5    2
6    4
7    6
8    9
9    7
10   8
11   10

(b) Pre/Post Table

Figure 2.1: Textual representation and pre/post table for an XML document.

For a complete database storage of XML documents, additional node-specific data, such as tag-names and node kinds, has to be collected as well. Since this data belongs to each particular node, the pre/post table can easily be extended to store these additional attributes. Either pre- or postorder values may be chosen as the primary key because of their uniqueness. A relation containing this data may look like this:

pre | post | tag-name | text | kind

Notice that the described mapping between the pre/post table and the textual representation of D is bijective, meaning that the XML document structure can be restored completely from the pre/post table in the database.

2.1 Main Memory DB specific Adaptations of the Data Model

The XML encoding introduced so far can be implemented in any relational database system. A more detailed description of the data model actually used in the Pathfinder project includes specific main memory database related adaptations. Our chosen MMDB system Monet comes with the restriction that it supports only binary tables (BATs) on the physical level of storage. Relations with more than two attributes have to be fragmented vertically. Setting up the XPath accelerator on the basis of Monet thus means that the single table containing all document data first has to be split into several BATs, one for each attribute. In each of these tables the primary key has to be maintained. Preorder values are chosen for this purpose: being dense and ascending integers, they are suitably represented by a void column that causes no additional storage overhead. Table 2.1 shows the fully fragmented pre/post relation.

This data model is aimed at enabling a highly efficient evaluation of all XPath axes. The discussion of this issue follows in the next section, but to complete the introduction of the data model it is important to mention that fast support for the child/parent as well as the sibling axes requires another table, doc_level, holding the preorder identifier of a node and its level in the document tree:

level(v) = |v/ancestor|.    (2.2)

doc_prepost | pre  | post    | preorder and postorder ranks
doc_level   | pre  | level   | preorder ID of a node and its level
doc_tag     | pre  | name    | preorder ID and tag-name of all element nodes
doc_text    | pre  | text    | preorder ID and text value of all text nodes
doc_pi      | pre  | pi      | preorder ID and value of all processing instruction nodes
doc_com     | pre  | comment | preorder ID and value of all comment nodes
doc_aname   | attr | name    | attribute ID and name of each attribute
doc_avalue  | attr | value   | attribute ID and value of each attribute
doc_aowner  | attr | owner   | attribute ID and preorder ID of its owner node

Table 2.1: Representation of the XML document with BATs.

Notice that the structural redundancy introduced by the level BAT leads only to a small storage overhead, as 1-byte integer values suffice to keep the level information: for typical XML instances, the number of hierarchical levels remains quite small. The well-known shakespeare.xml [Bos], for instance, an XML document containing all plays of Shakespeare, has a tree height of 7. Thus, we can safely expect height(T_D) < 255.

For evaluation performance reasons, attribute nodes are numbered separately. For most of the axis steps, XPath semantics excludes attributes from the result. The cost of selections to filter out attributes is thus saved if axis step operations can work directly on attribute-free tables. Nevertheless, attributes as well as other nodes may happen to reside in the same context sequence. It is thus important to choose a numbering scheme for attributes that uses the same data type but does not interfere with the preorder numbering of other nodes. A possible solution is to mark attributes by a leading indicator bit.

Experiments have shown that the overall storage volume of the database increases by a factor of ≈ 1.5 in comparison to the textual representation of the XML document.

2.2 General Pre-/Postorder Properties

Starting with the derivation of general properties, we summarize knowledge about the pre-/postorder encoding that can be employed for the effective evaluation of the major XPath axes, as shown in the cases presented in the following sections.

If we look at a single node v ∈ D, with D being the set of all non-attribute nodes of document D, the major XPath axes satisfy the following equation:

D = v/self ∪ v/descendant ∪ v/ancestor ∪ v/following ∪ v/preceding,    (2.3)

with every set on the right-hand side being disjoint from any other. The validity of this equation can be proven in a straightforward manner by expressing the axis steps in pre-/postorder semantics.

Figure 2.2: Tree and pre/post plane representation of the XML document of Fig. 2.1: (a) tree T_D of document D; (b) pre/post plane of document D. Dotted lines indicate the document regions as seen from the chosen context node d.

The pre/post-numbered document tree in Fig. 2.2 shows that the set v/descendant can be determined as follows:

v/descendant = {v′ ∈ D | (pre(v′) > pre(v)) ∧ (post(v′) < post(v))}.    (2.4)

Analogous equations can be derived for all other major axes:

v/ancestor  = {v′ ∈ D | (pre(v′) < pre(v)) ∧ (post(v′) > post(v))},    (2.5)
v/following = {v′ ∈ D | (pre(v′) > pre(v)) ∧ (post(v′) > post(v))},    (2.6)
v/preceding = {v′ ∈ D | (pre(v′) < pre(v)) ∧ (post(v′) < post(v))},    (2.7)
v/self      = {v′ ∈ D | (pre(v′) = pre(v)) ∧ (post(v′) = post(v))}.    (2.8)

A visual interpretation of these equations is given in Fig. 2.2(b), where the pre- and postorder values populate a two-dimensional field, the so-called pre/post plane.

We can say that, with respect to any single node, the major XPath axes partition the pre/post plane into four document regions. Obviously, the document regions are disjoint, and their union together with the context node itself yields the complete set D.

With the above pre-/postorder equations for the major XPath axes, we are able to evaluate these axes for single-node context sequences using selections on two attributes, namely the pre- and postorder values. In this way, XPath semantics is already translated into standard database operations, efficiently performed by any RDBMS, especially when supported by available index structures.
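As a sketch, the four major axes can be evaluated as plain two-attribute selections over the pre/post table of Fig. 2.1 (illustrative code, not the Monet operators):

```python
# Evaluating the major axes as range selections on the pre/post table
# (Eqs. 2.4-2.7); nodes are identified by their preorder ranks.
table = [(0, 11), (1, 5), (2, 0), (3, 3), (4, 1), (5, 2), (6, 4),
         (7, 6), (8, 9), (9, 7), (10, 8), (11, 10)]
post = dict(table)

def axis(v_pre, name):
    p, q = v_pre, post[v_pre]
    tests = {
        "descendant": lambda p2, q2: p2 > p and q2 < q,   # Eq. (2.4)
        "ancestor":   lambda p2, q2: p2 < p and q2 > q,   # Eq. (2.5)
        "following":  lambda p2, q2: p2 > p and q2 > q,   # Eq. (2.6)
        "preceding":  lambda p2, q2: p2 < p and q2 < q,   # Eq. (2.7)
    }
    t = tests[name]
    return [p2 for p2, q2 in table if t(p2, q2)]

# Context node d has pre = 3 (see Fig. 2.1/2.2):
print(axis(3, "descendant"))   # [4, 5]  -- e and f
print(axis(3, "ancestor"))     # [0, 1]  -- a and b
```

Note that this naive version scans the whole table for every step; avoiding exactly this full scan is the subject of the following subsection.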

2.2.1 Tree Awareness

Usual selection algorithms, however, would need to scan the whole pre/post relation to evaluate such range queries. Enhancing the database operations to become more "tree aware", i.e., giving them knowledge about tree properties, enables further improvements of their performance.

If we think of a preorder-sorted sequence of nodes, the descendants and following nodes of any single node v are not arbitrarily distributed within this sequence, but follow v in two dense blocks: first all descendants, then all following nodes. In front of v, on the other hand, blocks of preceding nodes are interrupted by the single ancestors of v, as visualized in Fig. 2.3.

In order to search the document for all descendants of v, the selection algorithm could be optimized first to determine the bounds of the descendant block and then

Figure 2.3: Document regions as seen from context node d in a preorder-sorted sequence (blocks of Ancestor and Preceding nodes alternate in front of d; the Descendant block and the Following block come after it).

to select all nodes within it without any further value comparison, which could considerably speed up execution. This process description includes two kinds of tree-aware optimization techniques, to which we refer in the following as:

Skipping If we know in advance that parts of the document table do not contain nodes of the searched axis, they are skipped while scanning the table.

Copy without Test If tree properties assure that a certain block of nodes belongs to the result, it can be copied without any further check on the properties of the nodes contained in the block.

Skipping and copying without test achieve their full strength if employed on a node set BAT of type ⟨void,oid⟩. Monet's ability to directly address a record at a certain offset allows skipping without having to access the records in between.

Copying without test in the case of a densely numbered node sequence even results in writing without reading. When deriving cost models in Chapter 4, operations are thus distinguished between the oid- and void-versions of the algorithm.

2.2.2 Ancestor and Descendant Sizes

Another important observation on the pre-/postorder ranks of a node concerns their implication on the number of its ancestors and descendants. Again we concentrate on a single node v ∈ D. Since preorder values count XML start tags, pre(v) is determined by all start tags visited before the start tag of v in document order, which are the start tags of all ancestor and preceding nodes of v:

pre(v) = |v/ancestor| + |v/preceding| + 1.

Analogously, post(v) is defined by the preceding and descendant nodes of v, because their end tags are seen before the end tag of v:

post(v) = |v/descendant| + |v/preceding| + 1.

The combination of these two dependencies reveals an interesting relationship between the pre- and postorder values, namely:

post(v) − pre(v) = |v/descendant| − |v/ancestor|

post(v) − pre(v) + level(v) = |v/descendant|.   (2.9)

Although the number of ancestors is usually unknown, it is bounded by the height of the document tree, height(T_D):

height(T_D) = max_{v∈D} |v/ancestor|.   (2.10)


Recalling that height(T_D) remains a rather small value for typical XML instances, we can specify tight bounds for the number of descendants of a single node v:

post(v) − pre(v) ≤ |v/descendant| ≤ post(v) − pre(v) + height(T_D).   (2.11)

The usefulness of this estimation becomes apparent when thinking of the mentioned tree-aware optimization techniques, which need precise predictions on the bounds of descendant blocks.
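Relationship (2.9) and bound (2.11) are easy to verify mechanically. The following toy check is only illustrative; the node ranks are hand-computed for an assumed five-node document a(b(c,d),e), with level(v) = |v/ancestor| and the root on level 0:

```python
# Toy check of eq. (2.9): |v/descendant| = post(v) - pre(v) + level(v),
# and of bound (2.11), on the assumed document a(b(c,d),e).

nodes = {  # name -> (pre, post, level)
    "a": (1, 5, 0), "b": (2, 3, 1), "c": (3, 1, 2),
    "d": (4, 2, 2), "e": (5, 4, 1),
}
descendants = {"a": 4, "b": 2, "c": 0, "d": 0, "e": 0}
height = 2  # height(T_D) = max level in this toy tree

for name, (pre, post, level) in nodes.items():
    size = post - pre + level                          # eq. (2.9)
    assert size == descendants[name]
    assert post - pre <= size <= post - pre + height   # bound (2.11)
```

Running the loop confirms that the bound is tight: for leaf nodes the lower bound post(v) − pre(v) can be negative, while the true descendant count is reached exactly when level(v) is added.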

2.3 XPath Operators

It has been shown that with the XPath accelerator, XML documents can be stored in relational databases. However, the essential goal of this encoding is not efficient relational storage alone, but also the acceleration of XPath evaluation.

So far we have introduced general approaches using pre-/postorder values for efficiently supporting XPath evaluation. However, XPath requires set-oriented, or more precisely sequence-oriented, operators for all axis steps rather than operators working on single nodes. With the background described above, a "node by node" execution schema could easily be developed that computes axis steps separately for every node inside a context sequence. Although this approach directly corresponds to the semantic description of the axis steps given in [BBC+03], it would be quite inefficient for large context sets. In the following sections we introduce more advanced algorithms for evaluating axis steps from any given context set.

To anticipate one of the most important results: we can find suitable algorithms for all XPath axes such that a single sequential scan of the data suffices, even for entire context sets. As a side effect of sequential processing, all operations preserve document order without having to apply further sorting routines. This ensures sortedness of all intermediate results and, hence, sorted input for the operators that follow. It also matches XPath semantics, which require the document order of the output. For simplicity and uniformity, each XPath operation receives its input and presents its output as a BAT containing only the preorder values in ascending order.

2.3.1 Descendant and Ancestor Axis

Recent work [GKT03] has developed a new join algorithm, the so-called staircase join, that encapsulates "tree awareness" inside the database operators for ancestor and descendant step evaluation. For explanation, we first stick to the descendant case only and split its execution into two logical parts: an initial preprocessing of the context set and the final staircase join evaluation thereafter.

Pruning If we choose the node sequence (b, d, i, j) in the sample document (Fig. 2.2) as the context set for a descendant step, a look at the document tree makes us realize that all descendants of the nodes d and j are already contained in the result of (b, i)/descendant. Generally, for any two nodes v, w ∈ D:

w ∈ v/descendant ⇒ w/descendant ⊂ v/descendant,
w ∈ v/ancestor ⇒ w/ancestor ⊂ v/ancestor.

For a descendant step, it is thus equivalent with respect to the result to reduce the context set by excluding those nodes which are themselves descendants of other context nodes. The method performing this preprocessing, called pruning, scans the context set once in ascending preorder. With a marker maxpost storing the highest postorder value visited so far, it simply skips the nodes with postorder values smaller

Figure 2.4: Pruning for the context sequence (b, d, i, j) builds a staircase-shaped search region in the pre/post plane.

than maxpost, while copying all other nodes to the resulting pruned context set CS_pr (Algorithm 1).

It is important to notice that pruning not only reduces the context set, but also guarantees that all remaining context nodes relate to each other on the preceding/following axis. Together with the vertical and horizontal limits of the pre/post plane, CS_pr defines exact bounds of the search region. Figure 2.4 shows such a pruned set and the region it marks. The staircase-like shape of that region gives the following algorithm its name.

Staircase Join After the preprocessing is done, it remains to scan the pre/post plane for all the nodes within the staircase region. The basic approach here is to vertically partition the pre/post plane along the preorder values of all context nodes, which means evaluating the staircase "step by step". For every partition, all nodes within it are tested as to whether their postorder value is above or below the step boundary. Partitioning the pre/post plane, however, does not cause additional work. Since the pre/post table is sorted on preorder values, it suffices to scan it in ascending order while the postorder predicate for the comparison changes dynamically with each step. We could therefore characterize the algorithm as a merge join with a dynamic range predicate. The basic framework of the staircase join as described here is presented in Algorithm 2.

Algorithm 1: Context set pruning for descendant staircase join

prune_context_desc(context: table(pre,post) sorted in ascending preorder) ≡
begin
    result ← new table(pre,post);
    maxpost ← 0;
    foreach c_i in context do
        if post(c_i) > maxpost then
            insert c_i into result;
            maxpost ← post(c_i);
    return result;
end

Algorithm 2: Staircase join algorithm for descendant axis

staircasejoin_desc(doc_prepost: table(pre,post), context: table(pre,post)) ≡
begin
    result ← new table(pre,post);
    foreach pair (c_i, c_{i+1}) in context do
        scanpartition_desc(pre(c_i), pre(c_{i+1}), post(c_i));
    c ← last node in context;
    n ← last node in doc_prepost;
    scanpartition_desc(pre(c), pre(n), post(c));
    return result;
end

scanpartition_desc(pre_from, pre_to, post_max) ≡
begin
    for i from pre_from to pre_to do
        if post(doc_prepost[i]) < post_max then
            append doc_prepost[i] to result;
        else
            break; /* skipping */
end

Although we described the pruning of the context set as a separate preprocessing phase, it can just as well be integrated into the main evaluation procedure. Looking ahead from context node c_i at the next one, c_{i+1}, the node c_{i+1} is disregarded and therefore simply skipped if it lies in descendant position to c_i. "On the fly" pruning thus avoids the intermediate writing and re-reading of CS_pr.

Examination of the staircase join result reveals further advantages of the algorithm. Due to the sequential scan over the pre/post relation, the result set remains preorder-sorted and duplicate-free, in contrast to a "node by node" axis evaluation of the context set. Hence, no additional postprocessing is needed to meet XPath semantics.

The basic staircase join is already very efficient in evaluating descendant steps because it accesses all the data in single sequential scans. Nevertheless, it is possible to further optimize its execution by introducing more "tree aware" adaptations. Figure 2.3 in the last section shows that for preorder-sorted nodes all descendants of a single node follow that node in a dense block. This knowledge allows us to apply the already mentioned techniques:

Skipping Regarding a single partition of the staircase, the first appearance of a node with a postorder value exceeding the staircase boundary indicates the end of the descendant block corresponding to the current context node c_i. All further nodes within that partition lie beyond the postorder limit post(c_i) and therefore can be skipped, which means that the scanning cursor on the pre/post table is moved to the preorder value of the next context node, pre(c_{i+1}).

Copy without Test The inequality (2.11) defines lower bounds for the number of nodes within descendant blocks, which are very close to their actual sizes. In order to save CPU costs of postorder comparisons, the nodes within these lower bounds can be copied to the result set without any further test.

Applying both techniques limits the number n of postorder comparisons during the whole staircase join by

n ≤ height(T_D) · |CS_pr|.
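For illustration, the whole descendant evaluation, on-the-fly pruning plus partition scanning with skipping, fits in a few lines of Python. This is a sketch under the assumption of a dense pre/post table where doc[p−1] holds the node with preorder rank p, not the actual Monet operator; the copy-without-test refinement is omitted for brevity:

```python
def staircase_join_desc(doc, context):
    """doc: (pre, post) pairs sorted by pre, dense, so doc[p-1] has preorder p.
    context: (pre, post) pairs in ascending preorder.
    Returns all descendants, preorder-sorted and duplicate-free."""
    result = []
    maxpost = 0
    for pre_c, post_c in context:
        if post_c < maxpost:
            continue              # on-the-fly pruning: a descendant of an
                                  # earlier context node contributes nothing new
        maxpost = post_c
        i = pre_c                 # doc[i] is the node right after the context node
        # scan the partition; the first post > post_c ends the dense
        # descendant block, so we can jump to the next context node (skipping)
        while i < len(doc) and doc[i][1] < post_c:
            result.append(doc[i])
            i += 1
    return result
```

Because the pruned context nodes relate only on the preceding/following axis, their descendant blocks are disjoint and appear in ascending preorder, so the concatenated result needs no resorting or duplicate elimination.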


Ancestor Axis Pruning could similarly be applied to a context set for ancestor steps, in which case all those nodes that are themselves ancestors of other context nodes are eliminated. The pruning algorithm, however, would have to process the context nodes in reverse order and thus could not be done "on the fly". In contrast to the descendant axis, the tree properties of the node set ensure correct staircase join evaluation also for non-pruned context sequences. Scanning of any partition [c_i, c_{i+1}[ only has to include c_i itself, to check whether c_i is an ancestor of c_{i+1}.

As a further difference to the descendant axis, ancestors of a single node v are not clustered together, but are located separately between blocks of preceding nodes (Fig. 2.3). Therefore, skipping and copying without test cannot be applied analogously. Nevertheless, it is possible to perform a slightly less effective skipping of non-relevant preceding blocks. If, while scanning the pre/post plane, a preceding node x with respect to the current context node c_i is encountered, all descendants of this node, i.e., x/descendant, are on the preceding axis of c_i as well. Since we are able to define a lower bound for the size of this block |x/descendant|, the scanning cursor can be advanced by that number without further tests.

2.3.2 Following and Preceding Axis

Based on the same approach as for the descendant and ancestor axes, the processing begins with the pruning of the context set. In the case of the following and preceding axes, we can strengthen the conditions for set inclusion. For any two nodes v, w ∈ D, the following nodes of w are contained completely in the following set of v if w is itself on the following axis of v or, secondly, if w is an ancestor of v:

w ∈ v/following ∪ v/ancestor ⇒ w/following ⊆ v/following,
w ∈ v/preceding ∪ v/ancestor ⇒ w/preceding ⊆ v/preceding.

The translation of these conditions into pre/post semantics further simplifies the analysis:

w ∈ v/following ∪ v/ancestor ⇔ post(w) > post(v),
w ∈ v/preceding ∪ v/ancestor ⇔ pre(w) < pre(v).

Obviously, for an arbitrary context set CS this means that pruning always reduces its nodes to a singleton sequence:

CS/following = v/following, where v ∈ CS such that post(v) = min_{w∈CS} post(w),
CS/preceding = v/preceding, where v ∈ CS such that pre(v) = max_{w∈CS} pre(w).

There is no need to develop special algorithms for pruning on the following/preceding axis because database systems already support the needed min/max search. Whereas a standard lookup for min/max values has to access all data items within a table, the guaranteed preorder sortedness of the context sequence enables more efficient pruning. Obviously it saves costs when searching for max(pre), but exploiting tree properties also enables a simplified lookup of min(post). Note that at most height(T_D) nodes can be in descendant/ancestor relationship with each other. In other words, within a set of height(T_D) + 1 nodes there are at least two nodes related to each other on the preceding/following axis. Applying this tree knowledge ensures finding min(post) within the first height(T_D) nodes of a preorder-sorted context set.
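A sketch of this shortened pruning (names are illustrative; following the argument above, the post-minimal node must occur among the first height(T_D)+1 entries of the preorder-sorted context, because every context node preceding it in preorder is one of its ancestors):

```python
def prune_following(context, height):
    """Reduce a preorder-sorted context (list of (pre, post) pairs) for a
    following step to the single node with minimal postorder rank.  Only
    the first height+1 entries need to be inspected."""
    return min(context[:height + 1], key=lambda v: v[1])

def prune_preceding(context):
    """For a preceding step the node with maximal preorder rank wins,
    i.e. simply the last entry of the preorder-sorted context."""
    return context[-1]
```

So instead of a full min/max scan over the context BAT, both prunings touch only a constant-size prefix, respectively a single record.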

After pruning is done, the main evaluation phase reduces to the following/preceding step from a single node. We have already explained in Section 2.2 how these


operations can be executed using two selections. Again, it is possible to improve the performance by supporting selections with “tree-awareness”.

Similar to descendants, all following nodes of a node build a dense block inside the pre/post relation, which is limited "on the right" by the end of the table and "on the left" by the end of the descendant block (Fig. 2.3). Using (2.11) for descendant estimation, we can determine a very tight region to search for the unknown left boundary of the following block. The total number of postorder comparisons is thus limited by height(T_D), because all nodes inside the specified block can be copied to the result without further testing.

The Monet system even allows saving the cost of writing the result set. Applying the slice operation creates a new view-like result BAT. Instead of copying tuples, a second BAT descriptor is generated, pointing to the same table, bounded, however, by the given offsets of the respective start and end records of the slice.

In case of the preceding axis, a technique symmetrical to ancestor skipping can be applied. Instead of skipping the descendants of the preceding nodes, these blocks qualify for the result without postorder test.

2.3.3 Child Axis

Unlike the four major axes, which represent region queries on the pre/post plane, the child and the parent axes are rather poorly supported when only pre- and postorder values are available. To evaluate a child step from a single node v, the following equivalence can be employed:

v/child ≡ v/descendant EXCEPT v/descendant/descendant.   (2.12)

Note that in the above equivalence v cannot be substituted by an arbitrary context set. If, e.g., a context set includes two nodes v, w with w being a descendant of v, all children of w would wrongly be excluded from the result, since they also appear in v/descendant/descendant. It is thus not possible to apply any pruning technique as used for the major axis steps.

The naive approach of evaluating child steps separately for each node in the context set leads to the execution of multiple descendant steps and unions on intermediate results obtained from single nodes, thus causing unacceptable performance overhead. To avoid these problems, we introduce another algorithm that evaluates the child/parent axis in one sequential scan of all input and output tables. The basic idea of the following approach is to use a small stack whenever the execution has to deal with context nodes in descendant/ancestor relation.

A further difference with respect to the major axes is that the evaluation takes place on the doc_level table instead of analyzing postorder values. Using level information here is not strictly necessary, but it contributes to the simplification of the algorithms and their explanation. Its main advantage will become apparent when we analyze the data access costs (Chapter 4).

Whereas level information obviously makes it easy to determine the child-level nodes of a context node v, a closer look reveals that the preorder sortedness of the level table also allows specifying the beginning and the end of the descendant block of v. Figure 2.5 visualizes the document in a pre/level plane. The descendant block of node b is finished as soon as the preorder numbering encounters the first following-sibling h on the same level as b. In the case of node k, no descendant nodes exist because the next node l is already a following-sibling of an ancestor of k. Generally, tree properties ensure that the first node outside the descendant block in ascending preorder is a following-sibling of the context node or of one of its ancestors.

With respect to preorder and level information, the descendant block of any node v

Figure 2.5: Pre/level plane of the example document (the end-marking node ■ marks the end of the last context node ◦'s gray-colored descendant region; possible children are found on the dashed lines).

Figure 2.6: Context nodes in descendant/ancestor relation and their overlapping descendant regions.

• starts at pre(v) + 1, if at least one child node exists,

• ends with the first node x with level(x) ≤ level(v), or when reaching the bounds of the level table.

The algorithmic idea for evaluating the given equivalence (2.12) is thus to walk sequentially through the level table starting at pre(v) + 1 and to collect all nodes on the child level of v until the end of the descendant block is reached. For a sorted context sequence containing only nodes in following/preceding relation to each other, a child step can be evaluated in this way within a single scan of the whole level table.

Consider the context nodes b, d of Fig. 2.6, which are in ancestor/descendant relationship. Node d, as a descendant of b, is encountered during the scan for the children of b. As already mentioned, the solution here is to use a stack, which allows interrupting the search for the children of b by pushing b onto the stack and first collecting all children of d. Afterwards, b is fetched back from the stack and the search for its children is resumed. Applying this method to arbitrary context sequences, the stack size never grows beyond height(T_D), since there can never be two equal-leveled nodes on the stack at the same time. Algorithm 3 presents the described method in detail.

The introduced algorithm already applies skipping techniques that suspend the sequential read of the level table. Whenever the stack is empty and the search for the current context node is finished, the table-scanning cursor is moved behind the next context node. In the case of the child step algorithm, skipping thus ensures that exactly the descendant region of the context set is scanned. With only preorder and level information, it is impossible to apply skipping while moving from one child to the next: without postorder values available, we cannot determine the lower bounds of the non-relevant descendant region belonging to each child.

Similar to the operations on the major axes, the child step evaluation procedure guarantees uniqueness and sortedness of the result due to the sequential processing of the node set BAT. However, the proposed procedure requires the node set always


Algorithm 3: Child step algorithm

child_step(doc_level: table(pre,level), context: table(pre,level)) ≡
begin
    result ← new table(pre,void);
    c ← first node in context;  c_last ← last node in context;
    p ← doc_level[pre(c) + 1];  p_last ← last node in doc_level;
    label1:
    while c < c_last do
        while level(p) > level(c) do
            if level(p) = level(c) + 1 then
                append p to result;
            if p = next node in context then
                stack.push(c);
                c ← next node in context;  p ← next node in doc_level;
                goto label1;
            p ← next node in doc_level;
        if stack.empty() then
            c ← next node in context;  p ← doc_level[pre(c) + 1];
        else
            c ← stack.pop();
    for c_last and further nodes on stack do
        while (p ≤ p_last) and (level(p) > level(c)) do
            if level(p) = level(c) + 1 then
                append p to result;
            p ← next node in doc_level;
        c ← stack.pop();
    return result;
end
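For concreteness, the same single-scan, stack-based idea can be rendered in executable form. This is a Python sketch, not the thesis's Monet operator; it assumes the dense layout doc_level[p−1] = level of the node with preorder rank p, and a context given as ascending preorder ranks:

```python
def child_step(doc_level, context):
    """One sequential scan of the level table with a small stack of 'open'
    context nodes (cf. Algorithm 3).  doc_level[p-1] is the level of the node
    with preorder rank p; context lists context preorder ranks in ascending
    order.  Output: child preorder ranks, sorted and duplicate-free."""
    result, stack = [], []
    if not context:
        return result
    stack.append(context[0])          # open the first context node
    k, p = 1, context[0] + 1          # p: preorder rank of the scan cursor
    while p <= len(doc_level):
        # close descendant blocks that ended before p
        while stack and doc_level[p - 1] <= doc_level[stack[-1] - 1]:
            stack.pop()
        if not stack:
            if k == len(context):
                break                 # all context nodes processed
            stack.append(context[k])  # skipping: jump behind next context node
            p = context[k] + 1
            k += 1
            continue
        if doc_level[p - 1] == doc_level[stack[-1] - 1] + 1:
            result.append(p)          # p is a child of the innermost open node
        if k < len(context) and p == context[k]:
            stack.append(p)           # nested context node: suspend the outer one
            k += 1
        p += 1
    return result
```

The stack never holds two nodes on the same level, so its depth stays within the document height, and the single forward cursor p guarantees a sorted, duplicate-free result.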

to contain all nodes of the document in dense ascending order. It thus cannot be regarded as a join combining two independent operand sets, in contrast to the staircase join, which can evaluate the descendant/ancestor axis on arbitrary subsets of D. The same constraint holds for all further level-based axis step operations described below. From the viewpoint of strategic query optimization, it is thus impossible to push down the node test in the case of these axes.

2.3.4 Parent Axis

In the case of the parent axis, it suffices to modify the algorithm for child step evaluation described in the previous section. Again, the sorted level table allows determining the parent of a given context node v:

v/parent = first node x after v in descending preorder with level(x) < level(v).   (2.13)

The “preorder-level” semantics of the parent axis reveals the main difference to the child step procedure: the scan of the level table has to be performed in reverse order, since parents have lower preorder ranks than their children. As a consequence, the context sequence has to be processed in reverse as well.
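For a single node, equation (2.13) translates directly into a short reverse scan (an illustrative sketch with the same assumed doc_level layout, doc_level[p−1] = level of the node with preorder rank p):

```python
def parent_of(doc_level, pre_v):
    """Return the preorder rank of v's parent, or None for the root, by
    scanning backwards from v to the first node on a smaller level
    (eq. 2.13)."""
    for p in range(pre_v - 1, 0, -1):
        if doc_level[p - 1] < doc_level[pre_v - 1]:
            return p
    return None
```

The nodes skipped on the way back are exactly v's preceding siblings and their subtrees, all on levels ≥ level(v), so the first smaller level necessarily belongs to the parent.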

Another difference concerns the uniqueness of the result set. Whereas two context nodes always have disjoint children sets, they can have the same parent node. Translated into the algorithmic context, this means that while searching for the parent of v, another context node w may be encountered which has its parent on

Figure 2.7: Context nodes ◦ and their parents ■ in the pre/level plane. The parents are found at the end of the dashed lines.

the same level as v (for instance, context nodes g and d in Fig. 2.7). In this case w and v have the same parent, which has already been collected in the ongoing search. The other case to be handled is when the current search for the parent of context node v encounters another context node w having level(w) > level(v) (see context nodes i and g in Fig. 2.7). Similarly to the child axis case, a stack can be used to suspend the ongoing search in order to first handle the newly encountered node w, whose parent is reached before the parent of v in descending preorder. Applying the just described policy of ignoring siblings again ensures that the stack size never grows beyond height(T_D).

Another detail concerning the implementation is that the table-bounds check required in the child step algorithm can be simplified here, because, except for the root, the searched parent node is guaranteed to exist in the level table. Leaving out the root as the first (respectively, the last) node of the context sequence is thus enough to avoid failures.

The reverse processing of the context set and level table leads to a result sequence in descending order, in contrast to all other axis operations. To avoid the necessity of final resorting, entries can be written in reverse order into the allocated memory of the result BAT.

2.3.5 Other Axes

XPath defines six further axes: self, descendant-or-self, ancestor-or-self, following-sibling, preceding-sibling and attribute. This section shows that the methods introduced so far can be adapted to serve these further axis steps as well.

Self This axis, being the simplest one, does not cause any work. Only if node tests are involved do selections occur on the current context set. Their evaluation, however, is not a matter of axis step evaluation itself.

Descendant-/Ancestor-or-Self A straightforward approach in this case would be to employ the union operation of the DBMS on the context and the result sets of a descendant/ancestor axis. In order to meet the requirements of uniqueness and document ordering of the result, costly post-processing would then be necessary. A solution avoiding any kind of post-processing is simply to change the comparison operator on postorder ranks in the scanpartition procedure (Algorithm 2) to allow equality: post(v) ≤ post_max. Furthermore, the scanned regions have to be extended to include the context nodes themselves. These small adaptations suffice to merge context nodes and their descendants (respectively, their ancestors) while performing sequential table processing.


Algorithm 4: Parent step algorithm

parent_step(doc_level: table(pre,level), context: table(pre,level)) ≡
begin
    result ← new table(pre,void);
    c ← last node in context;  c_last ← first node in context;
    if c_last = root node then
        c_last ← next node in context;
    p ← doc_level[pre(c) − 1];
    label1:
    while c > c_last do
        while level(p) ≥ level(c) do
            if p = previous node in context then
                if level(p) ≠ level(c) then
                    stack.push(c);
                c ← previous node in context;  p ← previous node in doc_level;
                goto label1;
            p ← previous node in doc_level;
        append p to result;
        if stack.empty() then
            c ← previous node in context;  p ← doc_level[pre(c) − 1];
        else
            c ← stack.pop();
    for c_last and further nodes on stack do
        while level(p) ≥ level(c) do
            p ← previous node in doc_level;
        append p to result;
        c ← stack.pop();
    return result;
end

Following-/Preceding-Sibling Evaluation of the following-sibling/preceding-sibling axis is closely related to the child axis task, which is not surprising, as following/preceding siblings are themselves children of the same parent node.

For the following-siblings of a single node v, the search region is narrowed down to

• start at pre(v) + 1,

• end at the first node x in ascending preorder with level(x) < level(v).

Nodes within that region on the same level as v are collected for the result. Comparing these conditions with the ones for the child axis shows that minor modifications of the child step algorithm suffice to implement the search for following-siblings. The same procedure, but processing all involved BATs in reverse document order, outputs preceding-siblings instead.

Attribute The remaining attribute axis differs from all others, since it can be efficiently supported in terms of standard database operations. The data model provides the BAT doc_aowner maintaining the relation between attributes and their owner elements. Hence, evaluation of the attribute axis results in a join formulated by the following MIL statement:

query(CS/attribute) ≡ join(CS.mirror, doc_aowner.reverse).reverse

Due to the preorder numbering of the attributes and their owner elements, the BAT doc_aowner is sorted on both the head and the tail. All attributes belonging to the same element reside clustered inside doc_aowner. Monet can thus employ a sort-merge join when executing the above statement, which preserves the sortedness of its result. Uniqueness is ensured as attribute identifiers represent primary keys in the doc_aowner table.
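The effect of the MIL statement can be mimicked by an ordinary merge join over the two sorted columns (illustrative Python, not MIL; the (attr_id, owner) tuple layout is an assumption):

```python
def attribute_step(context, doc_aowner):
    """context: ascending element preorder ranks.
    doc_aowner: (attr_id, owner) pairs sorted on both columns.
    A single merge pass yields each context element's attributes,
    already sorted and duplicate-free."""
    result, j = [], 0
    for owner in context:
        while j < len(doc_aowner) and doc_aowner[j][1] < owner:
            j += 1                      # advance past attributes of smaller owners
        while j < len(doc_aowner) and doc_aowner[j][1] == owner:
            result.append(doc_aowner[j][0])
            j += 1
    return result
```

Since both inputs are sorted and the cursor j only moves forward, the join reads each table once, mirroring the sort-merge behavior described above.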

2.3.6 Node Tests

From a database point of view, both name and kind tests represent selections; e.g., a name test on name tag is expressed by the selection σ_{name=tag}(CS). In the fully fragmented setting of Monet, name test selections are performed on the tag-name table doc_tag. Since axis steps do not work on this relation, an additional key join is necessary:

query(CS/axis::tag) ≡ CS.axisstep.mirror.join(doc_tag).uselect(tag)

There are more ways of combining axis steps and name tests, leading to different query plans, but since there is no general rule for defining the best solution, plan selection becomes an issue of query optimization.

In order to accelerate tag-name selections themselves, the data model can be enhanced by introducing an enumeration type for all occurring tag-names, which brings about the following two advantages:

• In terms of CPU performance, integer comparisons can be evaluated considerably faster than string comparisons.

• Integer values have fixed length and thus can be stored inside a BAT. String values, on the contrary, are maintained on a separate heap for variable-sized data items, whereas the BATs contain only pointers to the respective values. This implies additional random memory accesses for every string comparison.

Besides speeding up the selections, the enumeration encoding also reduces the overall storage requirements for tag-names. As XML documents usually contain only a limited number of different tag-names, the encoding table for the generated enumeration type is expected to remain small, so that the storage for the tag-name table shrinks significantly.
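A minimal sketch of such an enumeration (dictionary) encoding; the function names and layout are illustrative, not Monet's API:

```python
def encode_tags(doc_tag):
    """Replace string tag-names by dense integer codes.  Selections then
    compare fixed-width integers instead of heap-resident strings."""
    code, encoded = {}, []
    for tag in doc_tag:
        if tag not in code:
            code[tag] = len(code)       # enumeration table stays small
        encoded.append(code[tag])
    return encoded, code

def name_test(encoded, code, tag):
    """Name test over the encoded column: return the (1-based) preorder
    ranks of all nodes carrying the given tag-name."""
    t = code[tag]
    return [pre for pre, c in enumerate(encoded, start=1) if c == t]
```

The string table shrinks to one entry per distinct tag-name, and every subsequent name test is a scan over a fixed-width integer column.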

A last remark concerns the usage of the uselect operation. Monet does not support the concept of pipelining between operators; minimization of intermediate results therefore becomes an important issue. In this case it can be achieved by selecting only the heads of the relevant tuples.

2.4 Conclusion

Based on the existing encoding of the XPath accelerator and the staircase join idea, we extended the set of efficient algorithms to support further axes. Thus, all XPath axes now have an associated specialized database operation that

• accesses all data in single sequential scans,

• restricts node set access as far as possible,

• directly provides preorder-sorted output.

In case of the preceding and following axes, we showed how the execution can be accelerated by shortening the pruning procedure, on the one hand, and by using tree-aware optimization techniques, such as copying without test, on the other hand.


The main contribution of this chapter, however, lies in the development of a new type of level-based algorithms, used for the evaluation of the child, parent and both sibling axes. We showed that a small stack suffices to enable sequential processing of the node set, thus ensuring sorted output. Since the algorithms only need preorder and level information, they can be run on the doc_level table, which further reduces the data access due to the four times smaller level entries. However, compared to the operations on the major axes, the level-based algorithms are so far constrained to operate on dense node sets, and skipping is possible only in a rather limited number of cases.


Result Size Estimation

As mentioned in the introduction, adequate cost models for database operations require knowledge about the amount of data to be processed. Usually the input sizes of all operands as well as the cardinality of the operation's output play a crucial role in calculating memory access times. Therefore, it is essential to provide appropriate result estimates for each operator, in this case for all XPath axes and node tests.

Optimization is, however, not the only application field for result estimates. Upper bounds for result sizes are also used for more precise memory allocation on the implementation level of operators.

The estimation problem can be specified as follows: we are given an arbitrary context sequence CS and want to approximate |CS/axis| or |nodetest(CS)|, i.e., the number of distinct document nodes found on the defined axis step, or the number of nodes inside CS which qualify with respect to the given node test predicate.

Estimating the result size of a complete path expression lies beyond the scope of this work. Especially the major XPath axes require more information than the context set size to provide an accurate estimate of their result sizes. Consider, for instance, a step along the following axis from a single node: the result may be an empty sequence, but also a set containing nearly all nodes of the document, depending on the preorder rank of the context node. Thus, the presence of the context set becomes an important condition for employing some of our proposed methods.

In some cases the static context analysis already reveals further information about the context set, for example if the root node is present or if the set contains leaves only. In the following study we always consider the XPath-typical case where all nodes in the context set passed the name test of the previous step expression and hence have identical tag-names.

The chapter starts with a short overview of related research. After introducing common notations, we separately develop and discuss result size estimation techniques for each axis step. A final experimental study presents the results of various kinds of tests to determine the accuracy and performance of the proposed methods.

3.1 Related Research

Result estimates as well as the cost models introduced in the next chapter always imply a trade-off between prediction quality on the one hand and tight space and time limitations for their execution on the other. Existing work on size estimation in the context of XML query processing [AAN01, WPJ02, WJLY03] has proposed elaborate calculation models, but these either do not address the same task or incur unacceptable time or space overhead for the required statistical information.

