2.3 XML Linking

2.3.1 XPointer

Section 2.2.1 discussed the XML query language XPath and showed how an XML tree can be traversed in order to select specific nodes. In addition to the basic addressing mechanisms defined in XPath, some XML applications such as XLink [XLi01] (cf. Section 2.3.3) and XInclude [XIn04] (cf. Section 2.3.2) need more sophisticated techniques for special addressing purposes. For instance, with XPath it is not possible to address ranges of XML trees with arbitrary start and end points. These and further addressing mechanisms are specified in the W3C XPointer Framework [XPt03b] and its accompanying scheme specifications.

The XPointer Framework introduces the basic constructs for addressing parts of XML documents. XPointers usually appear in URIs as fragment identifiers in order to address parts of specific documents:

http://www.example.org/file.xml#xpointer-expression

Here, the part following the # character (xpointer-expression) is the fragment identifier.
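As a minimal sketch of how an application might separate the document URI from the XPointer carried in the fragment identifier, the following Python snippet uses the standard library function urllib.parse.urldefrag. The helper name split_pointer and the pointer element(/1/2) are assumptions chosen only for illustration.

```python
from urllib.parse import urldefrag


def split_pointer(uri):
    """Split a URI into the document URI and its fragment identifier."""
    document_uri, fragment = urldefrag(uri)
    return document_uri, fragment


# Illustrative URI; the pointer element(/1/2) is only a placeholder.
document_uri, pointer = split_pointer(
    "http://www.example.org/file.xml#element(/1/2)"
)
print(document_uri)
print(pointer)
```

The document URI identifies the resource to be retrieved, while the pointer is evaluated against the retrieved document afterwards.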

The XPointer specification distinguishes between shorthand pointers and scheme-based pointers.

Shorthand Pointers

Using DTDs or XML Schema, elements can be associated with an ID. A shorthand pointer addresses at most one element, namely the one having a specific ID. It is an error, however, if no element corresponds to a shorthand pointer.

In order to define a shorthand pointer, the user simply has to supply the identifying value. This concept is similar to anchors in HTML and the following URI shows how to reference the element representing Germany in the Mondial database using the shorthand pointer “D”:

http://dbis.informatik.uni-goettingen.de/Mondial/mondial.xml#D

Note that in HTML, anchors have to be explicitly defined in a remote document in order to be referenced by external URIs, while in XML it suffices to supply a schema (DTD or XML Schema) that specifies the IDs. Also, IDs are stable with respect to any restructuring of the XML document, i.e. IDs can be kept unchanged while the document and its structure evolve.
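The behaviour of shorthand pointers can be sketched in Python as follows. This is an illustration under stated assumptions, not a conforming XPointer processor: xml.etree.ElementTree does not evaluate ID declarations from a DTD, so the name of the ID-typed attribute is passed explicitly (assumed here to be car_code), and the sample document is a hand-made fragment in the style of the Mondial database.

```python
import xml.etree.ElementTree as ET

# Hand-made sample in the style of Mondial (not the real data set).
SAMPLE = """<mondial>
  <country car_code="D"><name>Germany</name></country>
  <country car_code="F"><name>France</name></country>
</mondial>"""


def resolve_shorthand(root, pointer, id_attr="car_code"):
    """Return the element whose ID-typed attribute equals the pointer.

    id_attr is an assumption: ElementTree does not read ID declarations
    from a DTD, so the attribute name has to be supplied by the caller.
    """
    for elem in root.iter():
        if elem.get(id_attr) == pointer:
            return elem
    # A shorthand pointer that locates no element is an error.
    raise LookupError(f"shorthand pointer {pointer!r} matches no element")


root = ET.fromstring(SAMPLE)
germany = resolve_shorthand(root, "D")
print(germany.find("name").text)
```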

Scheme-based Pointers

In addition to the straightforward addressing properties of shorthand pointers, the XPointer Framework also offers more flexible pointers. These scheme-based pointers have the following form:

scheme-name(scheme-expr)

The scheme-name indicates the name of the XPointer scheme to be used, while the scheme-expr inside the parentheses has to conform to the given scheme. The W3C proposes three schemes: element(), xmlns() and xpointer(), which will be discussed in the following paragraphs. Additionally, users may define their own schemes and thus achieve even more flexibility for their applications. In this thesis, however, we will focus on the predefined schemes.

XPointer element() Scheme. For addressing one specific XML element, the XPointer element() Scheme [XPt03a] offers basic mechanisms. Similar to shorthand pointers as described above, a single ID value can be given in an element() expression. The element with the given ID is then located; e.g. the expression

element(D)

addresses the “Germany” element of the Mondial database. Note that, in contrast to shorthand pointers, element() expressions may specify ID values that do not locate any element.

These simple expressions based on ID values can be extended with so-called child sequences, i.e. sequences of integers separated by slashes (/) (see footnote 9). Given in combination with an ID value, the first integer n addresses the nth child of the element corresponding to the ID, and each following integer locates the appropriate child of the previously addressed element. For instance, if applied to the Mondial database, the expression

element(D/8/1)

addresses the first child of the 8th child of the element identified by “D”.

Without an ID value, a child sequence applied to an XML document always has to start with /1 in order to address the root element. The remaining integers then successively locate the corresponding children in document order.
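The evaluation of element() expressions, with and without a leading ID value, can be sketched as follows. The function name resolve_element_scheme, the explicit id_attr parameter (assumed to be car_code) and the small sample document are assumptions made for illustration; the sketch returns None when no element is located, reflecting that element() expressions need not locate an element.

```python
import xml.etree.ElementTree as ET

# Hand-made sample in the style of Mondial (not the real data set).
SAMPLE = """<mondial>
  <country car_code="D">
    <name>Germany</name>
    <capital>Berlin</capital>
  </country>
</mondial>"""


def resolve_element_scheme(root, expr, id_attr="car_code"):
    """Evaluate the body of element(...): an optional ID plus a child sequence."""
    head, *steps = expr.split("/")
    if head:
        # An ID value is given: start at the element carrying that ID.
        current = next(
            (e for e in root.iter() if e.get(id_attr) == head), None
        )
    else:
        # No ID value: the child sequence must start with /1 (the root element).
        if not steps or steps[0] != "1":
            return None
        current, steps = root, steps[1:]
    for step in steps:
        if current is None:
            return None
        children = list(current)  # child elements in document order
        index = int(step)         # child sequences are 1-based
        current = children[index - 1] if 1 <= index <= len(children) else None
    return current


root = ET.fromstring(SAMPLE)
print(resolve_element_scheme(root, "D/2").text)
print(resolve_element_scheme(root, "/1/1/1").text)
print(resolve_element_scheme(root, "X/1"))
```

In this sample, D/2 selects the second child of the element identified by "D", and /1/1/1 descends from the root element to the first child of its first child.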

XPointer xpointer() Scheme. The most sophisticated scheme-based pointers can be specified using the XPointer xpointer() Scheme [XPt02], which is based on XPath. Expressions conforming to the xpointer() scheme have the following form:

xpointer(xpointer-expr).

For example, the XPointer

http://.../Mondial/mondial.xml#xpointer(//country[@car_code="D"])

addresses the node that represents Germany in http://.../Mondial/mondial.xml. Thus, the concepts of XPath can be applied in a straightforward way in xpointer() expressions for selecting parts of XML documents. In this thesis, we consider only XPath expressions in place of xpointer-expr. Expressions given in the other schemes can easily be mapped to the xpointer() scheme using appropriate XPath constructs, e.g. the id() function for selecting a specific element via its ID.
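A minimal sketch of evaluating such an xpointer() expression in Python is given below. It relies on xml.etree.ElementTree, which supports only a subset of XPath and accepts only relative paths; the sketch therefore prefixes the expression with "." and handles only expressions that start with //. The function name and the sample document are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hand-made sample in the style of Mondial (not the real data set).
SAMPLE = """<mondial>
  <country car_code="D"><name>Germany</name></country>
  <country car_code="F"><name>France</name></country>
</mondial>"""


def evaluate_xpointer_scheme(root, xpath_expr):
    """Evaluate the XPath inside xpointer(...) against the document root.

    Only expressions starting with // are handled here: ElementTree
    requires relative paths, so the expression is anchored with ".".
    """
    if not xpath_expr.startswith("//"):
        raise ValueError("this sketch only handles expressions starting with //")
    return root.findall("." + xpath_expr)


root = ET.fromstring(SAMPLE)
matches = evaluate_xpointer_scheme(root, "//country[@car_code='D']")
for country in matches:
    print(country.find("name").text)
```

A full XPath engine would be needed for the id() function and for the complete XPath language; the subset used here is sufficient for attribute-based selections such as the one shown above.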

On the other hand, the specification defines additional constructs for addressing strings, points and ranges. For instance, range-to(xpath-expr) returns the range from the context location to the point defined by xpath-expr. Note that with these constructs, the addressed parts of XML documents do not necessarily conform to well-formed XML. We do not discuss these concepts here because they have been defined mainly for browsing purposes or for the document-centric viewpoint of XML instances.

Evaluation of XPointers. Given the different XPointer schemes, an XPointer expression has the following structure:

xptr-expr_1 xptr-expr_2 ... xptr-expr_n

The evaluation takes place from left to right, and the first expression that evaluates to a non-empty node sequence supplies the result of the XPointer expression. Consider for example the following expression:

xpointer(//country[name='Deutschland'])xpointer(//country[@car_code='D'])

When applying the above XPointer expression to the Mondial database, the first pointer part results in the empty sequence because all countries are given with English names. Thus, the second part is applied and returns the element representing Germany. For simplicity, in the remainder of this work we base our investigations on the evaluation of XPointers that contain only one expression of the xpointer scheme. Our results can be applied iteratively to several successive expressions in a straightforward way.

Footnote 9: Note that in XML, ID values are not allowed to start with a digit.
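The left-to-right evaluation of an XPointer with several pointer parts can be sketched as follows. The code is an illustration under stated assumptions: it handles only xpointer() parts whose XPath starts with // (evaluated with the XPath subset of xml.etree.ElementTree), it ignores the escaping rules of the XPointer Framework, and the function names and sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-made sample in the style of Mondial (not the real data set).
SAMPLE = """<mondial>
  <country car_code="D"><name>Germany</name></country>
  <country car_code="F"><name>France</name></country>
</mondial>"""

POINTER = (
    "xpointer(//country[name='Deutschland'])"
    "xpointer(//country[@car_code='D'])"
)


def split_pointer_parts(pointer):
    """Split 'scheme(expr)scheme(expr)...' into (scheme, expr) pairs."""
    pointer = pointer.strip()
    parts, pos = [], 0
    while pos < len(pointer):
        open_idx = pointer.index("(", pos)
        scheme = pointer[pos:open_idx].strip()
        depth, end = 0, open_idx
        while end < len(pointer):
            if pointer[end] == "(":
                depth += 1
            elif pointer[end] == ")":
                depth -= 1
                if depth == 0:
                    break
            end += 1
        if depth != 0:
            raise ValueError("unbalanced parentheses in pointer")
        parts.append((scheme, pointer[open_idx + 1:end]))
        pos = end + 1
    return parts


def evaluate(root, pointer):
    """Return the result of the first pointer part that locates any node."""
    for scheme, body in split_pointer_parts(pointer):
        if scheme != "xpointer":
            continue  # other schemes are outside the scope of this sketch
        result = root.findall("." + body)
        if result:
            return result
    return []


root = ET.fromstring(SAMPLE)
result = evaluate(root, POINTER)
print([country.get("car_code") for country in result])
```

On the sample document, the first part matches nothing because the names are given in English, so the second part supplies the result.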

XPointer xmlns() Scheme. The XPointer xmlns() Scheme [XPt03c] declares namespace prefixes for XPointer expressions:

xmlns(mon=http://.../Mondial/)xpointer(//mon:country)

Here, the namespace prefix “mon” is bound to the Mondial namespace given by the URI “http://.../Mondial/”. This URI is used only to identify the namespace in a unique way; it is not considered a target to be selected. The declared namespace prefix is then used directly in the XPointer expression to the right of the xmlns() directive for selecting all country elements.
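The effect of an xmlns() part on a subsequent xpointer() part can be sketched with the namespace mapping accepted by findall in xml.etree.ElementTree. Since the actual Mondial namespace URI is elided above, the URI http://example.org/mondial-ns used below is a hypothetical placeholder, as are the function name parse_xmlns_part and the sample document.

```python
import xml.etree.ElementTree as ET

# Hypothetical placeholder; the actual namespace URI is not given here.
NS_URI = "http://example.org/mondial-ns"

# Hand-made sample whose elements live in the placeholder namespace.
SAMPLE = f"""<mon:mondial xmlns:mon="{NS_URI}">
  <mon:country car_code="D"/>
  <mon:country car_code="F"/>
</mon:mondial>"""


def parse_xmlns_part(body):
    """Turn the body of xmlns(prefix=uri) into a prefix-to-URI mapping."""
    prefix, _, uri = body.partition("=")
    return {prefix.strip(): uri.strip()}


root = ET.fromstring(SAMPLE)
bindings = parse_xmlns_part(f"mon={NS_URI}")

# The binding is used only to resolve the prefix in the XPath expression.
countries = root.findall(".//mon:country", bindings)
print([country.get("car_code") for country in countries])
```

The prefix chosen in the pointer is independent of the prefix used inside the document; only the namespace URI has to match.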