Silke Eckstein Andreas Kupfer
Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de
XML Databases
6. SQL/XML
6.1Introduction
6.2 Publishing relational data in XML 6.3 XML data type
6.4 Queries 6.5 Validation
6.6 SQL/XML standard 6.7 Overview
6.8 References
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 2
6. SQL/XML
• Creating XML documents from a database –Introduced in the last chapter
–On a more or less conceptual level
• Not handled so far
–Creating XML documents inside a database –Retrieving data from XML documents –Changing XML document content
• Solution: Integration in database –SQL/XML
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 3
6.1 Introduction
• SQL/XML
–Storage of XML in all big commercial DBMS available –Proprietary solution for embedding in SQL
–SQL/XML = Part 14 of the SQL-Standard: XML functionality
–Incorporates the corresponding standards for XML (XML Schema, XQuery)
–Basic idea:
•Mapping of SQL concepts to XML (see last chapter)
•Own datatype to store XML
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 4 [Kud07]
6.1 Introduction
• SQL/XML
–Datatype XML with belonging functions –Mapping between
SQL and XML
–Embedding XQuery in SQL
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 5 [Tür08]
6.1 Introduction
SQL database XML datatype SQL XQuery
<City>
<Name>
Braunschweig
</Name>
<Zip>38100</Zip>
<Zip>38106</Zip>
<State>
Niedersachsen
</State>
</City>
<City>
<City>
Storing XML documents inside
the database as values of type XML
Generating XML documents using SQL/XML functions
Mapping between SQl and XML
• Mapping SQL database to XML
–SQL charset to unicode (depends on implementation) –SQL identifiers to XML names
–SQL data types to XML schema data types –SQL values to XML values
–SQL tables to XML and XML schema documents –SQL schemas to XML and XML schema documents –SQL catalogues to XML and XML schema documents
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 6 [Tür08]
6.1 Introduction
Name Balance
Joe 2000
Jim 3500
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 7 [Tür08]
<ACCOUNT>
<row>
<NAME>Joe</NAME>
<BALANCE>2000</BALANCE>
</row>
<row>
<NAME>Jim</NAME>
<BALANCE>3500</BALANCE>
</row>
</ACCOUNT>
Mapping SQL tables <xsd:complexType name="ROW.ACCOUNT">
<xsd:sequence>
<xsd:element name="NAME"
type="CHAR_20"/>
<xsd:element name="BALANCE"
type="NUMERIC_12_2"/>
</xsd:sequence>
</xsd:complexType>
<xsd:complexType name="TABLE.ACCOUNT">
<xsd:annotation><xsd:appinfo>
<xqlxml:sqlname type="BASE TABLE"
localName="ACCOUNT"/>
</xsd:appinfo></xsd:annotation>
<xsd:sequence>
<xsd:element name="row"
type="ROW.ACCOUNT"/>
</xsd:sequence>
</xsd:complexType>
<xsd:element name="ACCOUNT"
type="TABLE.ACCOUNT"/>
CREATE TABLEAccount (
Name CHAR(20), BalanceNUMERIC(12,2), );
Mapping SQL table columns to XML elements
Mapping table rows to XML
<row>
elements
• Relational table: Cities
• Many possible XML documents
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 8
6.1 Introduction
City Zip State
Braunschweig 38100 Niedersachsen
Braunschweig 38106 Niedersachsen
Hannover 30159 Niedersachsen
...
<City>
<Name>Braunschweig</Name>
<Zip>38100</Zip>
<Zip>38106</Zip>
<State>Niedersachsen</State>
</City>
...
...
<State name="Niedersachsen">
<City name="Braunschweig">
<Zip>38100</Zip>
<Zip>38106</Zip>
</City>
</State>
...
6.1Introduction
6.2 Publishing relational data in XML 6.3 XML data type
6.4 Queries 6.5 Validation
6.6 SQL/XML standard 6.7 Overview
6.8 References
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 9
6. SQL/XML
• XMLELEMENTcreates an XML element –Example: creating name and content
–Can contain attributes, comments and other elements and options
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 10 [Kud07]
6.2 Publishing relational data
XMLELEMENT( NAME "City", 'Bad Oeynhausen' ) Creates
<City>Bad Oeynhausen</City>
XMLELEMENT( NAME "City", XMLCOMMENT ( "Example 2" ), XMLATTRIBUTES('Bayern' AS "State",
'80469' AS "Zip" ),'München' ) Creates
<City State="Bayern" Zip="80469"><!– Example 2 -->
München</City>
• XMLELEMENTreferencing the database –Can be used directly from an SQL statement
•Creates
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 11 [Kud07]
6.2 Publishing relational data
SELECT XMLELEMENT( NAME "City",
XMLCOMMENT ( "Example 3" ),
XMLATTRIBUTES( "State", "Zip" AS "PLZ" ),
"City" ) FROM Cities WHERE …;
<City STATE="Niedersachsen" PLZ="38100">
<!– Example 3 -->
Braunschweig
</City>
• XMLELEMENTnesting –Example
•Creates
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 12 [Kud07]
6.2 Publishing relational data
SELECT XMLELEMENT( NAME "City",
XMLELEMENT( NAME "Name", "City" ), XMLELEMENT( NAME "State", "State" ), XMLELEMENT( NAME "Zip", "Zip" ) ) FROM Cities WHERE …;
<City>
<Name>Braunschweig</Name>
<State>Niedersachsen</State>
<Zip>38100</Zip>
</City>
• XMLELEMENTsyntax diagram
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 13 [IBM]
6.2 Publishing relational data
• XMLFOREST
–Constructs a forest of elements without attributes
•Creates
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 14 [Kud07]
6.2 Publishing relational data
<City>Braunschweig</City><State>Niedersachsen</State>
<City>Braunschweig</City><State>Niedersachsen</State>
<City>Braunschweig</City><State>Niedersachsen</State>
<City>Hannover</City><State>Niedersachsen</State>
SELECT XMLFOREST ( "City", "State" ) FROM Cities;
• XMLFORESTsyntax diagram
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 15 [IBM]
6.2 Publishing relational data
• XMLCONCAT
–Concatenates multiple XML fragments into a single XML pattern
–Compare outputs
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 16 [Pow07]
6.2 Publishing relational data
SELECT XMLELEMENT("city", City) AS "CITY", XMLELEMENT("zip", Zip) AS "ZIP", XMLELEMENT("state", State) AS "STATE"
FROM Cities;
SELECT XMLCONCAT(
XMLELEMENT("city", CITY), XMLELEMENT("zip", ZIP), XMLELEMENT("state", STATE) ) FROM Cities;
• XMLAGG
–Aggregates seperate lines of output into a single string
•Creates
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 17 [Tür08]
6.2 Publishing relational data
City Zipcodes
Braunschweig <Zip>38100</Zip>
<Zip>38106</Zip>
Hannover <Zip>30159</Zip>
SELECT CITY, XMLAGG(
XMLELEMENT(NAME "Zip", Zip)) AS
"Zipcodes"
FROM Cities GROUP BY City;
• XMLAGG –Allows sorting
•Creates
–Disadvantage: Can only aggregate a single element, and thus fields are concatenated
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 18 [Pow07]
6.2 Publishing relational data
SELECT XMLAGG(
XMLELEMENT("address", Zip||' '||City) ORDER BY Zip DESC)
FROM Cities;
<address>38106 Braunschweig</address>
<address>38100 Braunschweig</address>
<address>30159 Hannover</address>
6.1Introduction
6.2 Publishing relational data in XML 6.3 XML data type
6.4 Queries 6.5 Validation
6.6 SQL/XML standard 6.7 Overview
6.8 References
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 19
6. SQL/XML
• Storing XML in relational databases is possible as –Character data (VARCHAR, Character Large OBject) –New data type XML
• A value of the data type XML can contain –whole XML document
–XML element –a set of XML elements
• All XML publishing operators from chapter 6.2 create values of the data type XML, not a string
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 20
6.3 XML data type
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 21 [Tür08]
6.3 XML data type
XML(SEQUENCE)
XML(CONTENT(ANY))
XML(CONTENT(XMLSCHEMA)) XML(CONTENT(UNTYPED))
XML(DOCUMENT(ANY))
XML(DOCUMENT(UNTYPED)) XML(DOCUMENT(XMLSCHEMA))
NULL or document node Untyped elements &
attributes,
elements not NULL 1 element child
Validated against schema
1 element child Validated against schema
1 element child
• Specification of XML type
•Modifiers are optional
•Primary type modifier –DOCUMENT(XML document) –CONTENT(XML element)
–SEQUENCE(sequence of XML elements)
•Secondary type modifier –UNTYPED
–XMLSCHEMA(typed) –ANY(may be typed)
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 22 [Tür08]
6.3 XML data type
XML [({DOCUMENT|CONTENT|SEQUENCE}
[({ANY|UNTYPED|XMLSCHEMA schema name})])]
• Create a table that is an XML data type in itself –CREATE TABLE XMLDOCUMENT OF XMLTYPE;
• Create a table containing an XMLType data type column
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 23 [Pow07]
6.3 XML data type
CREATE TABLE XML (
ID NUMBER NOT NULL,
XML XMLTYPE,
CONSTRAINT XPK PRIMARY KEY (ID) );
• Example: Definition of an XML type column
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 24
6.3 XML data type
ID Name
123 <Groups>Annabelle</Groups>
234 <Groups>Magdalena, Marius</Groups>
345 <?xml version 1.0?>
<Groups>
<Person>Patrick</Person>
<Person>Robert</Person>
</Groups>
654 <Groups>Rebecca</Groups>
<Groups>Torben</Groups>
CREATE TABLE Groups (
ID INTEGER,
Name XML );
• Characteristics –Allowed values:
•XML documents (including prolog)
•XML content according to XML 1.0 (includes pure text comments, PI?)
•NULL
–No comparison possible (compare CLOB in SQL)
•User can define an order, if comparison is necessary –No corresponding type in programming languages for
embedding in SQL available
–Standard defines operators to convert to other SQL data types
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 25 [Kud07]
6.3 XML data type
• Parsing & Serialization –XMLParse:
•Parses a string value using an XML parser
•Produces value whose specific type is XML(DOCUMENT(ANY)), or …CONTENT…, or –XMLSerialize
•Transforms an XML value into a string value (CHAR, VARCHAR, CLOB, or BLOB)
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 26 [Mel05]
6.3 XML data type
<City>
<Name>
Braunschweig
</Name>
<Zip>38100</Zip>
<Zip>38106</Zip>
<State>
Niedersachsen
</State>
</City>
<City>
<Name>
Braunschweig
</Name>
<Zip>38100</Zip>
<Zip>38106</Zip>
<State>
Niedersachsen
</State>
</City>
6.1Introduction
6.2 Publishing relational data in XML 6.3 XML data type
6.4 Queries 6.5 Validation
6.6 SQL/XML standard 6.7 Overview
6.8 References
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 27
6. SQL/XML
• Motivation
–How can SQL applications locate and retrieve information
in XML documents stored in an SQL database cell?
–Invoking XML query language within SQL statements
•Retrieve information — in SELECT list
•Locate information — in WHERE clause –Details on XML query language XQuery later
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 28 [Mel05]
6.4 Queries
• XMLQuery
–A new SQL expression, invoked as a pseudo- function, whose data type can be an XML type
—such as XML(CONTENT(ANY))—or an ordinary SQL type
• XMLExists
–A new SQL predicate, invoked as a pseudo-function, returning truewhen the contained XQuery
expression returns anything other than the empty sequence (false) or SQL null value (unknown)
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 29 [Mel05]
6.4 Queries
• XMLQuerysyntax
•
–Example
•
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 30 [Tür08]
6.4 Queries
CityList
<State name="Niedersachsen"><City>Braunschweig</City></State>
<State name="Niedersachsen"><City>Hannover</City></State>
SELECT XMLQUERY(
'<State name="{$Name}"><City>{$City}</City></State>' PASSING State as $Name, City AS $City NULL ON EMPTY) AS CityList FROM Cities;
XMLQUERY(<XQuery expression>
[PASSING <argument list>]
{NULL | EMPTY} ON EMPTY)
argument list := <SQL value> AS <XQuery variable>
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 31 [Tür08]
6.4 Queries
ID Paper
123 <Paper>…<author>Alice</author><title>Perpetual Motion</title><year>1999</year></Paper>
345 <Paper><year>2005</year><author>Bob</author><author>Charlie
</author><title>Beer</title>…</Paper>
ID AuthorNames 123 <Authors>Alice</Authors>
345 <Authors>Bob</Authors>
<Authors>Charlie</Authors>
SELECT ID, XMLQUERY(
'FOR $a IN $p//author RETURN
<Authors>{$a/text()}</Authors>' PASSING Paper AS "p") AS AuthorNames FROM Papers;
CREATE TABLE Papers (ID INTEGER, Paper XML);
• XMLTABLE
–Provides an SQL viewof XML data
•Output is notof the XML type
–Evaluates an XQuery “row pattern” with optional arguments (as with XMLQuery)
–Element/attribute values mapped to columns using XQuery
“column patterns”
–Names & types of columns required; default values optional –Syntax:
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 32 [Mel05]
6.4 Queries
XMLTABLE (<XQuery expression>
PASSING <argument list>
COLUMNS <column list>)
column := <name> <type> PATH <path expression>
• XMLTable: Example
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 33 [Tür08]
6.4 Queries
ID About Created
345 Beer 2005
SELECT ID, t.*
FROM Papers p, XMLTABLE(
'for $root in $papers
where $root//author/text() = "Bob"
return $root/Paper' PASSING p.Paper as "papers"
COLUMNS
About VARCHAR(30) PATH '/Paper/title', Created INTEGER PATH '/Paper/year' ) AS t;
6.1Introduction
6.2 Publishing relational data in XML 6.3 XML data type
6.4 Queries 6.5 Validation
6.6 SQL/XML standard 6.7 Overview
6.8 References
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 34
6. SQL/XML
• Validation of XML
–Is like integrity constraints in DBs –Requires an XML Schema
–XML Schemas may be registeredwith the SQL-server
•Implementation-defined mechanism
•Known by SQL name & by target namespace URI –Schema does need a unique name
•Used by XMLValidate(), IS VALID, and to restrict values of
XML(DOCUMENT-or-CONTENT(XMLSCHEMA))
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 35 [Mel05]
6.5 Validation
• Schema registration
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 36
6.5 Validation
Register XMLSCHEMA
'http://www.Alfred-Moos.de/GrussSchema.xsd' FROM 'file://c:/XML_Schemata/GrussSchema.xsd' AS GrussSchema
COMPLETE
;
CREATE TABLE Dokument_XML (Dokument_XML_Nr CHAR (4)
NOT NULL PRIMARY KEY, Dokument XML,
CONSTRAINT validieren
CHECK (Dokument IS VALIDATED ACCORDING TO XMLSCHEMA ID GrussSchema )
)
;
• Schema definition –Syntax
–
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 37 [Tür08]
6.5 Validation
XML(CONTENT(XMLSCHEMA) <schema> [<elements>]))
<schema> := URI <namespace> [LOCATION <loc>]
| NO NAMESPACE [LOCATION <loc>]
| ID <registered schema name>
<element> := [NAMESPACE <namespace>]
ELEMENT <element name>
• New functions and predicates:
–XMLValidate
•Validates an XML value against an XML Schema (or target namespace), returning new XML value with type annotations –IS VALID
•Tests an XML value to determine whether or not it is valid according to an XML Schema (or target namespace); return true/falsewithout altering the XML value itself
–IS DOCUMENT
•determines whether an XML value satisfies the (SQL/XML) criteria for an XML document
–IS CONTENT
•determines whether an XML value satisfies the (SQL/XML) criteria for XML content
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 38 [Mel05]
6.5 Validation
• Benefits of schema registration –Security issues
•Schemas cannot “disappear” without SQL- server knowing about it
•Schemas cannot be “hijacked” (altered in inappropriate ways) without SQL-server knowing about it
•Documents cannot be marked “valid”
against schemas unless SQL-server knows about them
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 39 [Mel05]
6.5 Validation
• Predefined schemas (build-in namespaces)
–xs:http://www.w3.org/2001/XMLSchema
–xsi:http://www.w3.org/2001/XMLSchema-instance –sqlxml:http://standards.iso.org/iso/9075/2003/sqlxml –More depending on the DB implementation
• Completely supported per XML+Namespaces:
–XMLElement, XMLForest, XMLTable –Default namespace, explicit namespace (prefix)
–Declare namespace within scopes of WITHclause, column definitions, constraint definitions, insert/delete/update statements, compound statements
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 40 [Mel05]
6.5 Validation
6.1Introduction
6.2 Publishing relational data in XML 6.3 XML data type
6.4 Queries 6.5 Validation
6.6 SQL/XML standard 6.7 Overview
6.8 References
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 41
6. SQL/XML
• SQL/XML standard published as –ISO/IEC 9074-14:2003
•Mappings and Publishing Functions –ISO/IEC 9075-14:2006
•Adds XQuery, including Data Model, Validation –ISO/IEC 9075-14:2008
•Updates
•Something else?
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 42 [Mel05]
6.6 SQL/XML standard
• SQL/XML:2003 plus –Additional publishing functions –XQuery data model
–More precise XML type (modifiers) –XMLQuery, XMLTable
–XMLValidate, IS VALID
–XMLExists, IS DOCUMENT, IS CONTENT –Casting between XML type and SQL types
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 43 [Mel05]
6.6 SQL/XML standard
• Overview of some operators for the XML type
–XMLELEMENT – creates an XML element node
–XMLFOREST – creates a sequence of XML element nodes from a table
–XMLCOMMENT – creates an XML comment node –XMLTEXT – creates a text node
–XMLPI – creates a processing instruction –XMLAGG – aggregates XML values of a group –XMLCONCAT – concatenates XML type values –XMLTRANSFORM – applies an XSL to a document
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 44 [Tür08]
6.6 SQL/XML standard
• ... Overview of some operators for the XML type
–XMLPARSE – a well-formed SQL text to XML value –XMLSERIALIZE – converts an XML value to a SQL text –XMLDOCUMENT – creates an XML document node
from an XML value
–XMLVALIDATE – validates an XML value with a schema –XMLQUERY – evaluates an XQuery expression
–XMLTABLE – transforms an XQuery result to a SQL table –XMLITERATE – transforms an XQery sequence to a SQL
table
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 45 [Tür08]
6.6 SQL/XML standard
• Review of SQL/XML –Two components
•A data type XML to store XML data
•Functions to map relational structures to XML –Only construction operators
•No extraction of values or search
•But construction operators are based on XQuery –Mapping of tables, schemas, catalogues ignores some
information from the relational schema
•UNIQUE
•REFERENCES
•CHECK
–Further extensions are expected
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 46 [Kud07]
6.6 SQL/XML standard
1. Introduction 2. XML Basics 3. Schema definition 4. XML query languages I 5. Mapping relational data
to XML 6. SQL/XML 7. XML processing
8. XML query languages II 9. XML storage I 10. XML storage - index 11. XML storage - native 12. Updates / Transactions 13. Systems
14. XML Benchmarks
6.7 Overview
47 XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
• "XML und Datenbanken" [Tür08]
–Can Türker
–Lecture, University of Zurich, 2008
• Beginning XML Databases. [Pow07]
–Gavin Powell
–Wiley & Sons, 2007, ISBN 0471791202
• "XML-Datenbanken", [Kud07]
–Thomas Kudraß
–Lecture, HTWK Leipzig, WS2007/2008
• "SQL/XML", [Mel05]
–Jim Melton, –Oracle Corp. 2005
48
6.8 References
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
• XQuery und SQL/XML in DB2-Datenbanken:
Verwaltung und Erzeugung von XML- Dokumenten in DB2 [Moo08]
–Alfred Moos
–Vieweg+Teubner, 2008
• ISO/IEC 9075-14:2003 Information Technology - Database Languages - SQL - Part 14: XML-Related Specifications (SQL/XML)
• DB2 SQL-Reference, IBM, March 2008 [IBM]
49
6.8 References
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
• Now, or ...
• Room: IZ 232
• Office our: Tuesday, 12:30 – 13:30 Uhr or on appointment
• Email: eckstein@ifis.cs.tu-bs.de
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 50