
3.3 Optimizations

3.3.1 Static Optimizations

3.3.1.6 FLWOR expressions

In XQuery, numerous other equivalences exist, a fact which makes the language very flexible. For instance, many queries can be written as either pure XPath, using location paths, or XQuery, using the iterative FLWOR expression. The latter is more verbose, but some consider it to be more easily readable, particularly if queries get more complex:

for $item in doc('xmark')/descendant::item
where $item/payment = 'Creditcard'
return $item

The following XPath expression will yield the same result:

doc('xmark')/descendant::item[payment = 'Creditcard']
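The equivalence of the two formulations can be illustrated outside of XQuery as well. The following Python sketch uses the limited XPath support of the standard xml.etree.ElementTree module; the small XML string is a hypothetical stand-in for the 'xmark' document, and the explicit loop only approximates the FLWOR semantics (it inspects the first payment child, whereas the XQuery comparison is existential over all payment children).

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature stand-in for the 'xmark' document.
XML = """
<site>
  <item id="i1"><payment>Creditcard</payment></item>
  <item id="i2"><payment>Cash</payment></item>
  <region><item id="i3"><payment>Creditcard</payment></item></region>
</site>
"""
root = ET.fromstring(XML)

# FLWOR style: iterate over all descendant items, test each one, return it.
flwor_result = [item for item in root.iter("item")
                if item.findtext("payment") == "Creditcard"]

# Path style: the same test is written as a predicate of the step.
path_result = root.findall(".//item[payment='Creditcard']")

print([item.get("id") for item in flwor_result])
print([item.get("id") for item in path_result])
```

Both variants visit the items in document order and select the same elements.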

Some FLWOR queries cannot be expressed in XPath. For example, no XPath equivalent exists for the ORDER BY clause, which sorts iterated values. XQuery is also needed to post-process items one by one, as shown in this example:


for $n in 1 to 10 return $n * 2

To avoid implementing query optimizations twice, both variants are first normalized to one common representation. The existing XPath representation appears to be most appropriate in our scope, as the predicate tests are already part of the expressions that might be suitable for index rewritings. Accordingly, FLWOR queries are normalized by rewriting the optional WHERE clause to one or more predicates and attaching them to the expressions defined by the variable declarations. Before the clause can be rewritten, two preconditions must be met:

1. No FOR clause may specify a positional or a full-text scoring variable.

2. A recursive algorithm checks if all occurrences of the variables that are introduced by FOR can be removed from the WHERE expression and substituted with a context item expression (.). The substitution is prohibited whenever the new context item reference conflicts with an update of the context item at runtime, which is the case, e.g., if the checked variable is enclosed in a deeper predicate or specified in the middle of a path expression. The occurrence test can be safely skipped for sub-expressions in which the variable is shadowed by another variable with the same name.
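The second precondition can be sketched as a recursive walk over a small expression tree. The following Python model is a minimal sketch and not the actual implementation: the classes Var, Step, Bin, Path, Filter and For, as well as the functions uses() and removable(), are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass

@dataclass
class Var:            # variable reference, e.g. $item
    name: str

@dataclass
class Step:           # axis step or literal; never contains a variable
    text: str

@dataclass
class Bin:            # comparison, logical or arithmetic operator
    left: object
    right: object

@dataclass
class Path:           # steps[0] sees the outer context, later steps a new one
    steps: list

@dataclass
class Filter:         # base[pred]...; each predicate gets a new context
    base: object
    preds: list

@dataclass
class For:            # nested "for $var in seq return body"
    var: str
    seq: object
    body: object

def uses(e, var):
    """True if variable `var` occurs in `e` without being shadowed."""
    if isinstance(e, Var):
        return e.name == var
    if isinstance(e, Bin):
        return uses(e.left, var) or uses(e.right, var)
    if isinstance(e, Path):
        return any(uses(s, var) for s in e.steps)
    if isinstance(e, Filter):
        return uses(e.base, var) or any(uses(p, var) for p in e.preds)
    if isinstance(e, For):
        # the body is skipped if the inner clause shadows the variable
        return uses(e.seq, var) or (e.var != var and uses(e.body, var))
    return False

def removable(e, var):
    """True if every occurrence of `var` in `e` may be replaced by '.'."""
    if isinstance(e, Bin):
        return removable(e.left, var) and removable(e.right, var)
    if isinstance(e, Path):
        # only the first step is evaluated with the outer context item
        return (removable(e.steps[0], var)
                and not any(uses(s, var) for s in e.steps[1:]))
    if isinstance(e, Filter):
        return (removable(e.base, var)
                and not any(uses(p, var) for p in e.preds))
    if isinstance(e, For):
        return removable(e.seq, var) and (e.var == var or removable(e.body, var))
    return True  # plain variable reference, step or literal

# $item/payment = 'Creditcard'
simple = Bin(Path([Var("item"), Step("payment")]), Step("'Creditcard'"))
# x[@id = $item/@ref]: variable inside a deeper predicate
deeper = Filter(Step("x"), [Bin(Step("@id"), Path([Var("item"), Step("@ref")]))])
# a/$item/b: variable in the middle of a path
middle = Path([Step("a"), Var("item"), Step("b")])
# for $item in ... return a/$item: inner clause shadows the variable
shadowed = For("item", Step("1 to 3"), Path([Step("a"), Var("item")]))

print(removable(simple, "item"), removable(deeper, "item"),
      removable(middle, "item"), removable(shadowed, "item"))
```

In this model, a reference is accepted only where the context item is still the one of the enclosing binding; predicates and non-initial path steps are treated as the places where the context item is updated.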

If the checks are successful, WHERE is rewritten as shown in Algorithm 11:

Line 1: The WHERE expression is stored in tests. If it is a logical AND expression, it is replaced by its arguments, as single predicates can be optimized more easily in subsequent steps (see footnote 7).

Line 2: An array targets is created, which, for all tests, contains pointers to the variable bindings (i.e., the expression of the FOR or LET clause). By default, all pointers are set to 0 and thus reference the first (outermost) binding.

Line 3-13: The most suitable FOR binding is now chosen for each test and stored in best: A second loop starts from the innermost binding. If the binding is a FOR clause, it is selected as the new best candidate. If the current test uses the variable in question at least once, the binding referenced by best is chosen as the target for attaching the test, and the check of the remaining, outer bindings is skipped. If no best target candidate has been selected yet, which happens, e.g., if the innermost variable is declared by a LET clause, the optimization is canceled. If none of the variables is used by the test, it will be evaluated by the outermost binding.

Footnote 7: A single predicate, in which multiple tests are combined with an AND expression, can just as well be represented via multiple predicates, provided that no positional predicates are used.

Algorithm 11 FLWOR.CompileWhere()

 1  tests := expressions specified in WHERE clause
 2  targets := integer array, initialized with 0
 3  for t := 0 to #tests - 1 do
 4    best := null
 5    for b := #BINDINGS - 1 to 0 do
 6      best := b if BINDINGS[b] is a For clause
 7      if tests[t] uses BINDINGS[b].VAR then
 8        return if best = null
 9        targets[t] := best
10        break
11      end if
12    end for
13  end for
14  for t := 0 to #tests - 1 do
15    bind := BINDINGS[targets[t]]
16    expr := tests[t] with all bind.VAR references replaced by a context item
17    wrap expr with fn:boolean() function if type is numeric
18    add expr as predicate to bind.EXPR
19  end for
20  eliminate WHERE clause
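The target selection in lines 1 to 13 of Algorithm 11 can be sketched in Python as follows. This is an illustration under simplifying assumptions: the Binding class and the function choose_targets() are hypothetical, and each test is reduced to the set of variable names it uses.

```python
from dataclasses import dataclass

@dataclass
class Binding:
    kind: str   # "for" or "let"
    var: str    # variable name without '$'

def choose_targets(bindings, tests_vars):
    """tests_vars[t] is the set of variable names used by test t.
    Returns one binding index per test, or None if the rewriting is canceled."""
    targets = [0] * len(tests_vars)                  # line 2: outermost binding
    for t, used in enumerate(tests_vars):            # line 3
        best = None                                  # line 4
        for b in range(len(bindings) - 1, -1, -1):   # line 5: innermost first
            if bindings[b].kind == "for":            # line 6
                best = b
            if bindings[b].var in used:              # line 7
                if best is None:                     # line 8: only LETs seen so far
                    return None
                targets[t] = best                    # line 9
                break                                # line 10
    return targets

# for $a ... for $b ... let $c ...
nested = [Binding("for", "a"), Binding("for", "b"), Binding("let", "c")]
print(choose_targets(nested, [{"a"}, {"b"}, set()]))

# for $a in 1 let $b := 2 where $b = 3: the innermost variable is a LET
print(choose_targets([Binding("for", "a"), Binding("let", "b")], [{"b"}]))
```

A test that uses no variable at all keeps its default target 0 and is thus attached to the outermost binding.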

Line 14-17: In a second loop over all tests, the target binding for the current test is stored in bind. All references to the target variable in the test are recursively substituted by a context item. If static typing indicates that the expression expr will yield a numeric result, it is wrapped with an fn:boolean() function call to prevent the evaluated value from being mistaken for a positional test.
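The need for the fn:boolean() wrapper can be made concrete with a small simulation. The following Python sketch assumes a simplified predicate semantics (a numeric predicate value is compared with the item position, any other value is reduced to a truth value); filter_seq() is a hypothetical helper, and the example corresponds to a WHERE clause such as where $n - 5 over the sequence (5, 6, 7).

```python
def filter_seq(seq, pred):
    """Apply one predicate: a numeric value acts as a positional test,
    any other value is reduced to a truth value (simplified)."""
    out = []
    for pos, item in enumerate(seq, start=1):
        value = pred(item)
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            keep = value == pos      # positional test
        else:
            keep = bool(value)       # truth value of the test
        if keep:
            out.append(item)
    return out

seq = [5, 6, 7]

# WHERE always evaluates the truth value of its expression.
where_result = [n for n in seq if bool(n - 5)]

# Attaching the numeric expression unchanged turns it into a positional test.
naive = filter_seq(seq, lambda n: n - 5)

# Wrapping it in a boolean conversion restores the WHERE semantics.
wrapped = filter_seq(seq, lambda n: bool(n - 5))

print(where_result, naive, wrapped)
```

Without the wrapper, the predicate values 0, 1 and 2 are compared with the positions 1, 2 and 3, so no item is selected.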

Line 18: If the existing expression bind.EXPR is a path expression, expr will be attached as a predicate to the last axis step. If the expression is a filter expression, expr will be added as a predicate to this expression. Otherwise, the expression will be converted to a filter expression with expr attached as a single predicate.
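The three cases of line 18 can be sketched with a small Python model. The classes Step, Path and Filter and the functions attach() and show() are hypothetical; expressions other than paths and filters are kept as plain strings and parenthesized when they are converted, which is an assumption of this sketch.

```python
from dataclasses import dataclass, field

@dataclass
class Step:
    test: str                                  # e.g. "descendant::item"
    preds: list = field(default_factory=list)

@dataclass
class Path:
    root: str                                  # e.g. "doc('xmark')"
    steps: list

@dataclass
class Filter:
    base: str                                  # filtered expression as text
    preds: list = field(default_factory=list)

def attach(expr, pred):
    """Attach predicate text `pred` to the binding expression `expr`."""
    if isinstance(expr, Path):
        expr.steps[-1].preds.append(pred)      # predicate of the last axis step
        return expr
    if isinstance(expr, Filter):
        expr.preds.append(pred)                # additional predicate
        return expr
    return Filter("(" + expr + ")", [pred])    # convert to a filter expression

def show(expr):
    """Serialize the model back to query text."""
    brackets = lambda preds: "".join("[" + p + "]" for p in preds)
    if isinstance(expr, Path):
        return expr.root + "".join("/" + s.test + brackets(s.preds)
                                   for s in expr.steps)
    return expr.base + brackets(expr.preds)

path = Path("doc('xmark')", [Step("descendant::item")])
print(show(attach(path, "payment = 'Creditcard'")))
print(show(attach(Filter("(1, 2, 3)", [". > 1"]), ". < 3")))
print(show(attach("1 to 10", ". > 5")))
```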

Line 20: Finally, WHERE is removed from the FLWOR expression.

As indicated before, the substitution process will never attach predicates to inner LET clauses. The following query demonstrates the need to differentiate between FOR and LET:

Original: for $a in 1 let $b := 2 where $b = 3 return $a
Modified: for $a in 1 let $b := 2[. = 3] return $a


The first query returns an empty sequence, as the comparison in the WHERE clause will never be true. The second query, in which the WHERE expression has been attached to the LET clause, returns 1, as LET always causes exactly one iteration, regardless of whether zero or more items are bound to $b. If a WHERE expression is attached to an outermost LET clause, however, the query remains correct, as the attached predicate is independent of all inner bindings.
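The different behavior of the two queries can be reproduced with a direct simulation. The following Python sketch models sequences as lists and is only meant to illustrate the iteration semantics of FOR and LET; the function names original() and modified() are chosen for this example.

```python
def original():
    # for $a in 1 let $b := 2 where $b = 3 return $a
    result = []
    for a in [1]:
        b = [2]                          # LET binds the whole sequence
        if any(x == 3 for x in b):       # WHERE discards the tuple
            result.append(a)
    return result

def modified():
    # for $a in 1 let $b := 2[. = 3] return $a
    result = []
    for a in [1]:
        b = [x for x in [2] if x == 3]   # the predicate empties the sequence ...
        result.append(a)                 # ... but LET still yields one tuple
    return result

print(original(), modified())
```

The predicate only changes the value bound to $b, whereas the WHERE clause decides whether the current tuple is returned at all.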

After the WHERE clause has been replaced, the inlining of variables, as described in 3.3.1.2, might lead to a complete elimination of the FLWOR expression: The query presented in the beginning of this section will be automatically rewritten to its XPath equivalent; see Figure 3.5 for the original and the optimized query plans.

Figure 3.5: FLWOR expression: original and optimized query