XML Databases
8. Updates + XSLT, 16.12.09
Silke Eckstein Andreas Kupfer
Institut für Informationssysteme
Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de
8. Updates + XSLT
8.1 Introduction
8.2 Full document replacement
8.3 XQuery Update Facility
8.4 XSLT & the XSLTRANSFORM function
8.5 Overview
8.6 References
8.1 Introduction
• Three general techniques for modifying XML documents:
– Full document replacement
• Replace existing document with an updated one
– XQuery Update Facility
• Standardized extension to XQuery
• Modify, insert or delete individual elements and attributes within an XML document
– Extensible Stylesheet Language Transformation (XSLT)
• Apply a style sheet to an XML document
• Use XSLTRANSFORM function to do this in SQL statements
8.2 Full document replacement
• Replacing a full XML document
– Use regular SQL UPDATE statement to replace a full XML document in a table with a new document
• treats XML document as a "black box"
• application needs to provide new document
– UPDATE statement needs to select a single row
• predicate on the relational columns of the table
• predicates on an XML element value
• predicates on an XML attribute value
• predicates on XML and relational values
– New documents can be provided via parameter markers
• Example
UPDATE customer SET info =
'<?xml version="1.0" encoding="UTF-8" ?>
<customerinfo Cid="1010">
<name>Larry Trotter</name>
<addr country="England">
<street>5 Rosewood</street>
...
</addr>
<phone type="work">416-555-1358</phone>
</customerinfo>' WHERE cid = 1000;
• Using parameter markers or host variables
– to provide the new XML document:
UPDATE customer SET info = ? WHERE cid = 1000
UPDATE customer SET info = :hvar WHERE cid = 1000
– ... and to provide the relational value:
UPDATE customer SET info = ? WHERE cid = ?
UPDATE customer SET info = :hvar WHERE cid = :hvar2
• Replacing an existing XML document with a NULL value
– removes the document from the row without deleting the row:
UPDATE customer SET info = NULL WHERE cid = 1000
• Two more examples:
UPDATE customer SET info = ?
WHERE XMLEXISTS('$INFO/customerinfo[name = "Larry Trotter"]')
AND cid = 1000;
UPDATE customer SET info = ?
WHERE XMLEXISTS('$INFO/customerinfo/phone[@type = "work"
and text() = "416-555-1358"]');
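The parameter-marker form of full document replacement can be tried out without a DB2 installation. The following is a minimal sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in: SQLite has no XML column type, so info is a plain TEXT column and XMLEXISTS predicates are not available. The helper name replace_document and the shortened documents are illustrative assumptions, not part of the lecture material.

```python
import sqlite3

def replace_document(conn, cid, new_doc):
    """Replace the full document of one customer; new_doc=None stores NULL."""
    cur = conn.execute("UPDATE customer SET info = ? WHERE cid = ?", (new_doc, cid))
    return cur.rowcount  # number of rows that were updated

# Assumption: an in-memory table that mimics the customer table of the slides.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cid INTEGER PRIMARY KEY, info TEXT)")
conn.execute(
    "INSERT INTO customer VALUES (1000, '<customerinfo><name>Old Name</name></customerinfo>')"
)

# The application has to provide the complete new document ("black box").
new_doc = "<customerinfo><name>Larry Trotter</name></customerinfo>"
updated = replace_document(conn, 1000, new_doc)
stored = conn.execute("SELECT info FROM customer WHERE cid = 1000").fetchone()[0]

# Replacing the document with NULL removes it but keeps the row itself.
replace_document(conn, 1000, None)
after_null = conn.execute("SELECT cid, info FROM customer").fetchall()
print(updated, stored, after_null)
```

Both the document and the relational value are passed as parameters, which corresponds to the "info = ? WHERE cid = ?" variant above.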
8.3 XQuery Update Facility
• XQuery Update Facility
– Standardized extension to XQuery
– Allows modifying, inserting or deleting individual elements or attributes within an XML document
– Makes updating easier and performs better than full document replacement
– Allows nodes to be modified in the following ways:
• Replace the value of a node
• Replace a node with a new one
• Insert a new node (at a specific location)
• Delete a node
• Rename a node
• Modify multiple nodes in a document in a single statement
• Update multiple documents in a single statement
• XQuery Update Facility: New XQuery expressions
ExprSingle ::= FLWORExpr
| QuantifiedExpr
| TypeswitchExpr
| IfExpr
| InsertExpr
| DeleteExpr
| RenameExpr
| ReplaceExpr
| TransformExpr
| OrExpr
Syntax and examples taken from the W3C web site.
– N.B. Updating expressions (insert, delete, rename, replace) lead to a loss of type/validation information at the affected nodes. Such information may be recovered by revalidation.
• Node insertion
– An insert expression is an updating expression that inserts copies of zero or more nodes into a designated position with respect to a target node.
Syntax
InsertExpr ::= "insert" ("node" | "nodes") SourceExpr InsertExprTargetChoice TargetExpr
InsertExprTargetChoice ::= (("as" ("first" | "last"))? "into") | "after" | "before"
SourceExpr ::= ExprSingle
TargetExpr ::= ExprSingle
• Node insertion: Examples
Insert a year element after the publisher of the first book.
insert node <year>2005</year>
after fn:doc("bib.xml")/books/book[1]/publisher
Navigating by means of several bound variables, insert a new police report into the list of police reports for a particular accident.
insert node $new-police-report
as last into fn:doc("insurance.xml")/policies
/policy[id = $pid]
/driver[license = $license]
/accident[date = $accdate]
/police-reports
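For readers without an XQuery Update processor at hand, the positional semantics of "after" and "as last into" can be imitated on an in-memory tree. This is only a sketch, assuming Python's xml.etree.ElementTree; the helper insert_after, the sample document and the note element are hypothetical. Unlike the XQuery insert expression, it inserts the node itself rather than a copy.

```python
import xml.etree.ElementTree as ET

def insert_after(parent, sibling, new_node):
    """Insert new_node as the child of parent that directly follows sibling."""
    position = list(parent).index(sibling)
    parent.insert(position + 1, new_node)

# Assumption: a tiny stand-in for bib.xml.
bib = ET.fromstring(
    "<books><book><title>T1</title><publisher>P1</publisher><price>10</price></book></books>"
)
first_book = bib.find("book")  # corresponds to /books/book[1]

# insert node <year>2005</year> after .../book[1]/publisher
year = ET.Element("year")
year.text = "2005"
insert_after(first_book, first_book.find("publisher"), year)

# "as last into" corresponds to a plain append to the target's children.
first_book.append(ET.Element("note"))
print([child.tag for child in first_book])
```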
• Node deletion
– A delete expression deletes zero or more nodes from an XDM instance.
– The keywords node and nodes may be used interchangeably, regardless of how many nodes are actually deleted.
Syntax
DeleteExpr ::= "delete" ("node" | "nodes") TargetExpr
TargetExpr ::= ExprSingle
Delete the last author of the first book in a given bibliography.
delete node
fn:doc("bib.xml")/books/book[1]/author[last()]
Delete all email messages that are more than 365 days old.
delete nodes /email/message
[fn:current-date() - date > xs:dayTimeDuration("P365D")]
• Node replacement
Syntax
ReplaceExpr ::= "replace" ("value" "of")? "node" TargetExpr "with" ExprSingle
TargetExpr ::= ExprSingle
– Replace takes two forms, depending on whether value of is specified:
• If value of is not specified, a replace expression replaces one node with a new sequence of zero or more nodes. The replacement nodes occupy the position in the node hierarchy that was formerly occupied by the node that was replaced.
– Hence, an attribute node can be replaced only by zero or more attribute nodes, and an element, text, comment, or processing instruction node can be replaced only by zero or more element, text, comment, or processing instruction nodes.
• If value of is specified, a replace expression is used to modify the value of a node while preserving its node identity.
• Node replacement: Examples
Replace the publisher of the first book with the publisher of the second book.
replace node fn:doc("bib.xml")/books/book[1]/publisher
with fn:doc("bib.xml")/books/book[2]/publisher
Increase the price of the first book by ten percent.
replace value of node fn:doc("bib.xml")/books/book[1]/price
with fn:doc("bib.xml")/books/book[1]/price * 1.1
• Renaming nodes
– A rename expression replaces the name property of a data model node with a new QName.
Syntax
RenameExpr ::= "rename" "node" TargetExpr "as" NewNameExpr
NewNameExpr ::= ExprSingle
Rename the first author element of the first book to principal-author.
rename node fn:doc("bib.xml")/books/book[1]/author[1]
as "principal-author"
Rename the first author element of the first book to the QName that is the value of the variable $newname.
rename node fn:doc("bib.xml")/books/book[1]/author[1]
as $newname
• Renaming is local!
– The effects of a rename expression are limited to its target node; descendants are not affected.
– A global change of names or namespaces needs explicit iteration.
Example (Change all QNames from prefix abc to xyz and new namespace URI http://xyz/ns for node $root and its descendants.)
for $node in $root//abc:*
let $localName := fn:local-name($node),
$newQName := fn:concat("xyz:", $localName)
return (
rename node $node as fn:QName("http://xyz/ns", $newQName),
for $attr in $node/@abc:*
let $attrLocalName := fn:local-name($attr),
$attrNewQName := fn:concat("xyz:", $attrLocalName)
return
rename node $attr as fn:QName("http://xyz/ns", $attrNewQName)
)
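The locality of renaming can also be observed outside XQuery. The following is a rough analogy, assuming Python's xml.etree.ElementTree, where changing an element's tag plays the role of rename node; the sample document and its contents are made up for illustration, and namespaces are left out.

```python
import xml.etree.ElementTree as ET

# Assumption: an author element that happens to contain another author element.
doc = ET.fromstring("<author><first>Ann</first><author>nested</author></author>")

# Renaming one node changes only this node; its descendants keep their names.
doc.tag = "principal-author"
tags_after_local = [el.tag for el in doc.iter()]

# A global rename needs explicit iteration over all affected nodes.
for el in list(doc.iter("author")):
    el.tag = "principal-author"
tags_after_global = [el.tag for el in doc.iter()]
print(tags_after_local, tags_after_global)
```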
• Node transformation
– ... creates modified copies of existing nodes. Each copied node obtains a new node identity. The resulting XDM instance can contain both newly created and previously existing nodes.
– Node transformation is a non-updating expression, since it does not modify existing nodes!
– Idea:
1. Bind variables of copy clause (non-updating expressions),
2. update copies (only!) as per modify clause,
3. construct result by return (copied/modified and/or other nodes).
Syntax
TransformExpr ::= "copy" "$"VarName ":=" ExprSingle ("," "$"VarName ":=" ExprSingle)*
"modify" ExprSingle
"return" ExprSingle
• Node transformation: Examples
Return a sequence consisting of all employee elements that have Java as a skill, excluding their salary child elements.
for $e in //employee[skill = "Java"]
return
copy $je := $e
modify delete node $je/salary
return $je
Copy a node, modify the copy, then return the original and the modified copy.
let $oldx := /a/b/x
return
copy $newx := $oldx
modify (rename node $newx as "newx",
replace value of node $newx with $newx * 2)
return ($oldx, $newx)
– N.B. Underlying persistent data not changed by these examples!
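The copy / modify / return pattern of the first example can be mirrored in a general-purpose language. This is a minimal sketch, assuming Python's copy.deepcopy and xml.etree.ElementTree in place of an XQuery processor; the function name without_salary and the staff document are hypothetical.

```python
import copy
import xml.etree.ElementTree as ET

def without_salary(employee):
    """Mimics: copy $je := $e  modify delete node $je/salary  return $je"""
    je = copy.deepcopy(employee)         # copy: the copy is a new, independent node
    for salary in je.findall("salary"):  # modify: only the copy is touched
        je.remove(salary)
    return je                            # return the modified copy

# Assumption: a small in-memory document with two employees.
staff = ET.fromstring(
    "<staff>"
    "<employee><name>A</name><skill>Java</skill><salary>100</salary></employee>"
    "<employee><name>B</name><skill>C</skill><salary>90</salary></employee>"
    "</staff>"
)
java_people = [without_salary(e) for e in staff.findall("employee[skill='Java']")]
print([ET.tostring(e, encoding="unicode") for e in java_people])
```

The original staff tree still contains both salary elements afterwards, which matches the remark that the underlying data is not changed.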
• On the semantics of the XQuery Update Facility
– Formally specifying the exact semantics of the XQuery UF is non-trivial for several reasons:
• Formal update semantics are always a lot more involved than retrieval semantics.
• Updates and bulk operations do not go together well (cf. SQL set-oriented updates).
• XUF uses a notion of "snapshots" and "pending update lists" to work around some of the subtleties.
• The details are beyond the scope of this lecture.
8.4 XSLT – Intro
• XSL Languages
– It started with XSL and ended up with XSLT, XPath and XSL-FO
• It started with XSL
– XSL stands for EXtensible Stylesheet Language.
– The World Wide Web Consortium (W3C) started to develop XSL because there was a need for an XML-based Stylesheet Language.
• CSS = Style Sheets for HTML
– HTML uses predefined tags, and the meaning of each tag is well understood.
– The <table> tag in HTML defines a table, and a browser knows how to display it.
– Adding styles to HTML elements is simple. Telling a browser to display an element in a special font or color is easy with CSS.
• XSL = Style Sheets for XML
– XML does not use predefined tags (we can use any tag names we like), and therefore the meaning of each tag is not well understood.
– A <table> tag could mean an HTML table, a piece of furniture, or something else, and a browser does not know how to display it.
– XSL describes how the XML document should be displayed!
• XSL - More Than a Style Sheet Language
– XSL consists of three parts:
– XSLT - a language for transforming XML documents
– XPath - a language for navigating in XML documents
– XSL-FO - a language for formatting XML documents
8.4 XSLT
• XSLT
– Extensible Stylesheet Language – Transformations
– A language to describe transformations from source to target tree structures (= XML documents)
– A transformation in XSLT
• Is described by a well-formed XML document called a stylesheet
• Can use elements of the XSLT namespace as well as of other namespaces
• Contains template rules to execute the transformation
• XSLT processor
[Figure: XSLT stylesheet and XML document → XSLT tree and source tree → transformation process → result tree → result document]
• Template rules
– A rule consists of a pattern and a template.
– The pattern is compared to the nodes of the source document tree.
– The template can be instantiated to create a part of the target tree. It can contain elements of the XSLT namespace, which are instructions to create fragments.
• XSLT processing model
– By processing a list of source nodes, fragments of the target tree can be created.
– The list starts with the root node only.
– A node is processed
• By selecting the best matching pattern from all rules (resolving any conflicts).
• The template of the best matching rule is instantiated with the current node as context node.
– A template usually contains instructions to select further source tree nodes for processing.
– Recursively repeat the selection of matching rules, instantiation and selection of new source nodes until the list is empty.
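The processing model can be made concrete with a toy interpreter. The following is only a sketch in Python, not an XSLT processor: rules are (match function, template function) pairs, the first matching rule wins instead of the best one (no priorities), and the fallback merely recurses into child elements, whereas the real built-in rules also copy text nodes to the output. All names and the sample data are hypothetical.

```python
import xml.etree.ElementTree as ET

def process(nodes, rules, out):
    """Process a list of source nodes, appending result fragments to out."""
    for node in nodes:
        for matches, template in rules:
            if matches(node):  # simplification: first match wins
                template(node, rules, out)
                break
        else:
            # Stand-in for the default rule: select the children for processing.
            process(list(node), rules, out)

def is_name_field(node):
    """Pattern, roughly field[@name='Name']."""
    return node.tag == "field" and node.get("name") == "Name"

def name_template(node, rules, out):
    """Template: create an h2 fragment from the current (context) node."""
    out.append("<h2>" + node.text + "</h2>")

source = ET.fromstring(
    "<resultset>"
    "<row><field name='Pid'>1</field><field name='Name'>Super Pizza</field></row>"
    "<row><field name='Pid'>2</field><field name='Name'>Pizza Roma</field></row>"
    "</resultset>"
)
out = []
process([source], [(is_name_field, name_template)], out)  # list starts with the root only
print(out)
```

Processing starts with a list containing only the root and ends when no further nodes are selected, which is the recursion described in the last bullet.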
• Structure of a stylesheet
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
id={id}
extension-element-prefixes={token}
exclude-result-prefixes={token}
version={number}>
<!-- Content: (xsl:import*, top-level-elements)-->
</xsl:stylesheet>
– Elements and attributes with the XSLT namespace must be recognized by the XSLT processor
– PIs and comments are ignored
• Top level elements
– E.g. xsl:import, xsl:include, and most importantly xsl:template
<xsl:template
match = {pattern}
name = {qname}
priority = {number}
mode = {qname}>
<!-- Content: (xsl:param*, template) -->
</xsl:template>
• A pattern specifies a set of conditions on a node
– Uses a set of alternative (|-separated) location paths restricted to the child and attribute axes.
– The use of '/' and '//' as well as the 'id' and 'key' functions is possible.
– Pattern predicates ('[…]') can use all XPath expressions.
• Multiple matching patterns
– If multiple patterns match a node, the conflict is resolved by priorities (cf. priority attribute)
• Imported rules have a lower priority than rules of the primary stylesheet
• Alternatives are processed as if each alternative is defined by a single rule
• ChildOrAttributeAxisSpecifier::QName patterns have priority 0
• ChildOrAttributeAxisSpecifier::NCName patterns have priority -0.25
• ChildOrAttributeAxisSpecifier::NodeTest patterns have priority -0.5
• All other patterns have priority 0.5
• XSLT contains default rules
– Process the document recursively
– But have lower priority than rules in the stylesheet
– Example:
<xsl:template match="*|/">
<xsl:apply-templates/>
</xsl:template>
• Rules
– Can be named and be called in templates of other rules
– Can have parameters which can be passed along on their invocation; default values can be defined
– The mode attribute allows a rule to be processed multiple times and with different results
– If a template is invoked by name with xsl:call-template, its match, mode and priority attributes are not considered; likewise, the name attribute plays no role when a template is selected by xsl:apply-templates
• Templates
– Can contain literal elements (non-XSLT namespace) and elements of the XSLT namespace (instructions).
– If the rule is selected, the template can construct fragments of the result tree.
– Processing depends on the context.
– Default behaviour is to write all elements which are not in the XSLT namespace to the result tree.
– Must be well-formed XML.
– Can contain instructions.
• Instructions to process nodes recursively
<xsl:apply-templates
select = {node set expression}
mode = {qname}>
<!-- Content: (xsl:sort | xsl:with-param)* -->
</xsl:apply-templates>
– Without the select attribute, all children of the context node are processed
– select can be an (XPath) expression to select nodes
• Could result in non-terminating recursion!
• Instructions to create a node
<xsl:element
name = {qname}
namespace = {uri-reference}
use-attribute-sets = {qname} >
<!-- Content: template -->
</xsl:element>
– Name attribute is required, but can be calculated
– Other create instructions are similar
• xsl:attribute, xsl:attribute-set, xsl:text (to create a text/leaf node with whitespace), xsl:processing-instruction, xsl:comment
• Instructions for flow control
– Conditional processing
<xsl:if
test = {boolean expression}>
<!-- Content: template -->
</xsl:if>
– The test attribute is required.
– The test expression is evaluated and the result is converted to a boolean. If it is true, the template is instantiated.
• Instructions for flow control
– Multiple choice ("if-then-else" / "switch")
<xsl:choose>
<!-- Content: (xsl:when+, xsl:otherwise?) -->
</xsl:choose>
<xsl:when
test = {boolean expression}>
<!-- Content: template -->
</xsl:when>
<xsl:otherwise>
<!-- Content: template -->
</xsl:otherwise>
– If multiple xsl:when elements are true, only the first one is processed (no "break" needed)
– If no xsl:when element is true and there is no xsl:otherwise, no content is created
• Repetition
<xsl:for-each
select = {node-set expression}>
<!-- Content: (xsl:sort*, template) -->
</xsl:for-each>
– The template is instantiated for each node selected by the node set expression.
– On instantiation the current node becomes the context node and all selected nodes are the node list.
– If there is no explicit sort statement, the nodes are processed in document order.
• "Calculation" of output text
<xsl:value-of
select = {string expression}
disable-output-escaping = "yes" | "no" />
– The selected object is converted to a string value and is inserted as content of the instantiated text node.
• Other statements for sorting, numbering, variables, …
– see http://www.w3.org/TR/xslt
• Some advice
– The name "variable" is misleading: once bound, the value of an XSLT variable cannot be changed!
– The context node is changed by for-each!
• Example stylesheet
<?xml version="1.1"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">
  <xsl:template match="resultset">
    <html>
      <head/>
      <body>
        <h1>
          <xsl:text>Summary about </xsl:text>
          <xsl:value-of select="count(child::*)"/>
          <xsl:text> Pizzeria</xsl:text>
          <xsl:if test="count(child::*) > 1">
            <xsl:text>s</xsl:text>
          </xsl:if>
        </h1>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>
  <xsl:template match="field[attribute::name='Name']">
    <h2>
      <xsl:value-of select="."/>
    </h2>
  </xsl:template>
</xsl:stylesheet>
• Example input document
<?xml version="1.0"?>
<resultset statement="SELECT * FROM t1"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <row>
    <field name="Pid">1</field>
    <field name="Name">Super Pizza</field>
    <field name="Category">4</field>
    <field name="Location">1</field>
  </row>
  <row>…</row>
</resultset>
8.5 Overview
Introduction and Basics
1. Introduction
2. XML Basics
3. Schema Definition
4. XML Processing
Querying XML
5. XPath & SQL/XML Queries
6. XQuery Data Model
7. XQuery
XML Updates
8. XML Updates & XSLT
Producing XML
9. Mapping relational data to XML
Storing XML
10. XML storage
11. Relational XML storage
12. Storage Optimization
Systems
13. Technology Overview
8.6 References
• "Database-Supported XML Processors", [Gru08]
– Th. Grust
– Lecture, Uni Tübingen, WS 08/09
• "XML und Datenbanken", [Tür08]
– Can Türker
– Lecture, University of Zürich, 2008
• DB2 pureXML CookBook [NK09]
– Matthias Nicola and Pav Kumar-Chatterjee
– IBM Press, 2009, ISBN 9780138150471
Questions, Ideas, Comments
• Now, or ...
• Room: IZ 232
• Office hour: Tuesday, 12:30 – 13:30, or by appointment
• Email: eckstein@ifis.cs.tu-bs.de