XML Databases
3. Schema Definition, 10.11.08
Silke Eckstein Andreas Kupfer
Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de
3.1 Introduction
3.2 Document Type Definitions – DTDs
3.3 XML Schema
3.4 DTDs vs. XML Schema
3.5 Validation
3.6 Overview
3.7 References
3. Schema Definition
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
3.1 Introduction
• Structure of XML documents
–XML prolog
–Document Type Definition (DTD)
–Document instance
–Documents have to be well-formed (see last week)
<Bücher>
<Buch>
<Autor id="1234567890">Rainer Eckstein</Autor>
<Autor id="1234568723">Silke Eckstein</Autor>
<Titel>XML und Datenmodellierung</Titel>
<Untertitel>XML-Schema ...</Untertitel>
<Verlag id="3-89864">dpunkt.Verlag</Verlag>
</Buch>
</Bücher>
• Valid XML
–More often than not, applications that operate on XML data require the XML input data to conform to a specific XML dialect.
–This requirement is stricter than mere XML well-formedness.
–The (hard-coded) application logic relies on, e.g.,
•the presence or absence of specifically named elements [attributes],
•the order of child elements within an enclosing element,
•attributes having exactly one of several expected values, ...
–If the input data fails to meet these requirements, results are often disastrous.
[Scholl07]
• DTDs – Document Type Definitions
–The XML Recommendation includes technology that enables applications to rigidly specify the XML dialect (the document type) they expect to see: DTDs (Document Type Definitions).
–XML parsers use the DTD to ensure that input data is not only well-formed but also conforms to the DTD (XML speak: input data is valid).
•Valid XML documents ⊂ well-formed XML documents
• XML Schema
–Besides DTDs, there exists another schema description language: XML Schema
–It is more sophisticated than DTDs but has a less compact syntax.
[Scholl07]
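To make the difference between well-formed and valid concrete, here is a minimal sketch of a DTD for the book list shown earlier. The content models and attribute declarations are assumptions inferred from that single sample instance; they are not part of the lecture material.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Bücher [
  <!-- Assumed content models, inferred from the sample instance -->
  <!ELEMENT Bücher (Buch*)>
  <!ELEMENT Buch (Autor+, Titel, Untertitel?, Verlag)>
  <!ELEMENT Autor (#PCDATA)>
  <!ELEMENT Titel (#PCDATA)>
  <!ELEMENT Untertitel (#PCDATA)>
  <!ELEMENT Verlag (#PCDATA)>
  <!-- CDATA rather than ID: the sample values start with a digit -->
  <!ATTLIST Autor id CDATA #REQUIRED>
  <!ATTLIST Verlag id CDATA #REQUIRED>
]>
<Bücher>
  <Buch>
    <Autor id="1234567890">Rainer Eckstein</Autor>
    <Autor id="1234568723">Silke Eckstein</Autor>
    <Titel>XML und Datenmodellierung</Titel>
    <Verlag id="3-89864">dpunkt.Verlag</Verlag>
  </Buch>
</Bücher>
```

Under these assumptions, dropping the Titel element or moving Verlag in front of Autor leaves the document well-formed but makes it invalid.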
• Document validation is critical if
–distinct organizations (B2B) need to share XML data: they also share the DTDs / the schemas,
–applications need to discover and explore yet unknown XML dialects,
–high-speed XML throughput is required (once the input is validated, we can abandon a lot of runtime checks).
3.2 Document Type Definitions – DTDs
• A document's DTD is directly attached to its XML text using a DOCTYPE declaration:
DOCTYPE Declaration
<?xml version="1.1"?>
<!DOCTYPE t d_e d_i>
<t>
...
</t>
–The DOCTYPE declaration follows the XML declaration (<?xml ... ?>) (comments <!-- ... -->, processing instructions <? ... ?> in between are OK).
–The first parameter t of the DOCTYPE declaration is required to match the document's root element tag.
–The document type definition itself consists of an external subset (d_e ≡ SYSTEM "uri") as well as an internal subset (d_i ≡ [ ... ], i.e., embedded in the document itself).
[Scholl07]
• Internal and external DTD
–Both subsets are optional. Should clashes occur, declarations in the internal subset override those in the external subset.
DOCTYPE Declaration – external DTD
<?xml version="1.1"?>
<!DOCTYPE greeting SYSTEM "hello.dtd">
<greeting>Hello, world!</greeting>
DOCTYPE Declaration – internal DTD
<?xml version="1.1" encoding="UTF-8" ?>
<!DOCTYPE greeting [
<!ELEMENT greeting (#PCDATA)>
]>
<greeting>Hello, world!</greeting>
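A minimal sketch of the clash rule for a document that uses both subsets. It assumes a hypothetical external subset hello.dtd whose content is shown in the leading comment; the entity name who is likewise an assumption. The internal subset is read before the external one, and the first declaration of an entity is binding, so the internal value is used.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumed content of the external subset hello.dtd:
       <!ELEMENT greeting (#PCDATA)>
       <!ENTITY who "world">
-->
<!DOCTYPE greeting SYSTEM "hello.dtd" [
  <!-- Read first, so this declaration of "who" is the binding one -->
  <!ENTITY who "Braunschweig">
]>
<greeting>Hello, &who;!</greeting>
```

A validating parser would report the content of greeting as "Hello, Braunschweig!". This kind of overriding applies to entity and attribute-list declarations; declaring the same element type twice is a validity error.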
• The ELEMENT Declaration
–The DTD ELEMENT declaration, in some sense, defines the vocabulary available in an XML dialect.
–Any XML element t to be used in the dialect needs to be introduced via
<!ELEMENT t cm>
•The content model cm of the element defines which element content is considered valid.
•Whenever an application encounters a t element anywhere in a valid document, it may assume that t's content conforms to cm.
[Scholl07]
• Element content models
Content model   Valid content
ANY             arbitrary well-formed XML content
EMPTY           no content allowed (attributes OK)
Children        only child elements, no character data; order and occurrence of child elements must match a regular expression over tag names and the constructors , | + * ?
Mixed           character data, optionally interspersed with child elements (but see constraints below)
• Example:
<!ELEMENT adresse (strasse, nummer, plz, ort)>
<!ELEMENT strasse (#PCDATA)>
<!ELEMENT nummer (#PCDATA)>
<!ELEMENT plz (#PCDATA)>
<!ELEMENT ort (#PCDATA)>
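For illustration, a complete document that is valid with respect to the address declarations above. The element values are made-up sample data.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE adresse [
  <!ELEMENT adresse (strasse, nummer, plz, ort)>
  <!ELEMENT strasse (#PCDATA)>
  <!ELEMENT nummer (#PCDATA)>
  <!ELEMENT plz (#PCDATA)>
  <!ELEMENT ort (#PCDATA)>
]>
<adresse>
  <!-- All four children are required, in exactly this order -->
  <strasse>Musterstraße</strasse>
  <nummer>1</nummer>
  <plz>12345</plz>
  <ort>Musterstadt</ort>
</adresse>
```

Swapping plz and ort, or omitting nummer, would violate the sequence content model and make the document invalid.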
• N.B.
–A DTD with <!ELEMENT t ANY> gives the application no clue about t's content. Use judiciously.
–An <!ELEMENT t EMPTY> declaration forbids any content for t elements.
Example: (X)HTML img, br tags:
[Scholl07]
XHTML 1.0 Strict DTD
<!ELEMENT img EMPTY>
...
<!ELEMENT br EMPTY>
• Content model "Children"
–Regular expressions provide control over the exact order and occurrence of child nodes below an element node:
Reg. exp.      Semantics
t (tag name)   child element with tag t
c1, c2         c1 followed by c2
c1 | c2        c1 or, alternatively, c2
c+             c, one or more times
c*             c, zero or more times
c?             optional c
–Example (abstract DTD):
<!ELEMENT A (B|C,(D|E)*)>
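Concrete DTD syntax does not allow mixing | and , within a single parenthesized group, so the abstract content model needs explicit grouping before a parser accepts it. The sketch below assumes the intended reading is "B or C, followed by any number of D or E"; the EMPTY declarations for B to E are placeholders added only to make the example self-contained.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE A [
  <!-- Assumed grouping: (B or C), then zero or more of (D or E) -->
  <!ELEMENT A ((B|C),(D|E)*)>
  <!ELEMENT B EMPTY>
  <!ELEMENT C EMPTY>
  <!ELEMENT D EMPTY>
  <!ELEMENT E EMPTY>
]>
<A><C/><D/><E/><D/></A>
```

With this grouping, <A><B/></A> is also valid, whereas <A><D/></A> (no leading B or C) and <A><B/><C/></A> (both alternatives) are not.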
• Content model "Mixed"
–A mixture of character data and child elements
–The types of the child elements may be constrained, but not their order or their number of occurrences:
<!ELEMENT anreisebeschreibung
(#PCDATA | auto | bahn | flugzeug)* >
<anreisebeschreibung>
You can reach our house in several ways:
<bahn> by train: 1 km from Warnemünde station </bahn>
<auto> by car: 19 km from motorway A19 Rostock - Berlin </auto>
<flugzeug> by plane: 55 km from Rostock-Laage, 235 km from Berlin-Tegel </flugzeug>
You will find us right on the waterfront promenade.
</anreisebeschreibung>
• Elements with mixed content are typical for document-centric XML
–Free text interspersed with markup
•to highlight something
•to provide certain structures, e.g., addresses or tables
•etc.
–For elements with mixed content, white space (#PCDATA) is regarded as significant and thus reported to the application.
•In element-only content models, a validating XML parser flags white space between child elements as ignorable, so the application can skip it.
• Example: DTD and valid XML encoding academic titles
[Scholl07]
<?xml version="1.1"?>
<!DOCTYPE academic [
<!ELEMENT academic (Prof?,
(Dr, (rernat|emer|phil)*)?, Firstname, Middlename*, Lastname) >
<!ELEMENT Prof EMPTY >
<!ELEMENT Dr EMPTY >
<!ELEMENT rernat EMPTY >
<!ELEMENT emer EMPTY >
<!ELEMENT phil EMPTY >
<!ELEMENT Firstname (#PCDATA) >
<!ELEMENT Middlename (#PCDATA) >
<!ELEMENT Lastname (#PCDATA) >
]>
<academic>
<Prof/><Dr/><emer/>
<Firstname>
Don
</Firstname>
<Middlename>
E
</Middlename>
<Lastname>
Knuth
</Lastname>
</academic>
• The ATTLIST Declaration
–Using the DTD ATTLIST declaration, validation of XML documents is extended to attributes.
–The ATTLIST declaration associates a list of attribute names ai with their owning element named t:
ATTLIST Declaration
<!ATTLIST t a1 τ1 d1
…
an τn dn >
•The attribute types τi define which values are valid for attribute ai.
•The defaults di indicate if ai is required or optional (and, if absent, if a default value should be assumed for ai).
•In XML, the attributes of an element are unordered. The ATTLIST declaration prescribes no order of attribute usage.
[Scholl07]
• Via attribute types, control over the valid attribute values can be exercised:
Attribute type τi   Semantics
CDATA               character data (no <, but &lt;, ...)
(v1|v2|...|vm)      enumerated literal values
ID                  value is a document-wide unique identifier for the owner element
IDREF               references an element via its ID attribute
• Example:
Academic.xml (fragment)
<!ELEMENT academic (Firstname, Middlename*, Lastname) >
<!ATTLIST academic
title (Prof|Dr) #REQUIRED
type CDATA #IMPLIED >
<academic title="Dr" type="rer.nat."> ... </academic>
[Scholl07]
• Attribute defaulting in DTDs:
Attribute default di   Semantics
#REQUIRED              element must have attribute ai
#IMPLIED               attribute ai is optional
v (a value)            attribute ai is optional; if absent, default value v for ai is assumed
#FIXED v               attribute ai is optional; if present, it must have value v; if absent, v is assumed
[Scholl07]
• Examples of attribute-list declarations:
<!ATTLIST termdef
id ID #REQUIRED
name CDATA #IMPLIED>
<!ATTLIST list
type (bullets|ordered|glossary) "ordered">
<!ATTLIST form
method CDATA #FIXED "POST">
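The following sketch shows how these three declarations play out in an instance. The root element doc and the element declarations are assumptions added only so that the document can be validated on its own.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE doc [
  <!-- Hypothetical wrapper and element declarations -->
  <!ELEMENT doc (termdef | list | form)*>
  <!ELEMENT termdef (#PCDATA)>
  <!ELEMENT list EMPTY>
  <!ELEMENT form EMPTY>
  <!ATTLIST termdef
    id   ID    #REQUIRED
    name CDATA #IMPLIED>
  <!ATTLIST list
    type (bullets|ordered|glossary) "ordered">
  <!ATTLIST form
    method CDATA #FIXED "POST">
]>
<doc>
  <!-- id is required; name may be omitted -->
  <termdef id="t1">schema</termdef>
  <!-- type is absent, so the default "ordered" is assumed -->
  <list/>
  <list type="glossary"/>
  <!-- method is absent, so the fixed value "POST" is assumed -->
  <form/>
  <!-- if present, method must be exactly "POST" -->
  <form method="POST"/>
</doc>
```

A termdef without id, a list with type="numbered", or a form with method="GET" would each be reported as a validity error.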
• Cross-referencing via ID and IDREF
–Well-formed XML documents essentially describe tree-structured data.
–Attributes of type ID and IDREF may be used to encode graph structures in XML. A validating XML parser can check such a graph encoding for consistent connectivity.
–To establish a directed edge between two XML document nodes a and b
1. attach a unique identifier to node b (using an ID attribute),
2. refer to b from a via this identifier (using an IDREF attribute),
3. for an outdegree > 1 (see below), use an IDREFS attribute.
[Scholl07]
• Example
Graph.xml
<?xml version="1.1"?>
<!DOCTYPE graph [
<!ELEMENT graph (node+) >
<!ELEMENT node ANY > <!-- attach arbitrary data to a node -->
<!ATTLIST node
id ID #REQUIRED
edges IDREFS #IMPLIED > <!-- we may have nodes with outdegree 0 -->
]>
<graph>
<node id="A">a</node>
<node id="B" edges="A C">b</node>
<node id="C" edges="D">c</node>
<node id="D">d</node>
<node id="E" edges="D D">e</node>
</graph>
[Scholl07]
• Drawbacks of the ID/IDREF concept
–IDs have to be unique document-wide
–Only attributes can be referenced
–Example:
<!ELEMENT person (name)>
<!ATTLIST person id ID #REQUIRED>
<!ELEMENT department EMPTY>
<!ATTLIST department id ID #REQUIRED>
<!ELEMENT project EMPTY>
<!ATTLIST project personInCharge IDREF #REQUIRED>
<person id='p0001'>
<name>Meier</name>
</person>
…
<department id='d0001'/>
…
<project personInCharge='p0001'/>
<project personInCharge='d0001'/>
–Not possible: guaranteeing that only persons are referenced as persons in charge.
(But: see XML Schema below ...)
• Attributes versus elements:
Elements                   Attributes
Cardinalities 1, ?, +, *   #REQUIRED, #IMPLIED
Alternatives               No alternatives
No defaults                Defaults
No fixed values            Fixed values
No enumeration types       Enumeration types
Content with spaces        No spaces in attribute values
Order                      No order
• Usage of attributes and elements:
Elements                            Attributes
Representation of data              Representation of metadata
"Visible" information               Additional information (for interpretation / processing)
Data objects and their components   Characteristics
Suitable for complex information    Suitable for unstructured, non-hierarchical data
–Alternative conditions through attributes, not through presence or absence of elements
• Other DTD features
–User-defined entities via <!ENTITY e d> declarations (usage: &e;)
<!ENTITY phb "The Pointy-Haired Boss">
–Parameter entities ("DTD macros") via <!ENTITY % e d> (usage: %e;)
<!ENTITY % ident "ID #REQUIRED">
...
<!ATTLIST character id %ident; >
–Conditional sections in DTDs via <![INCLUDE[ ... ]]> and <![IGNORE[ ... ]]>
<!ENTITY % withCharacterIDs "INCLUDE" >
<![%withCharacterIDs;[
<!ATTLIST bubble
speaker %ident;
to %ident; >
]]>
<!ATTLIST bubble
tone (angry|question|...) #IMPLIED >
–Note: parameter entity references inside declarations and conditional sections are only permitted in the external subset; a conditional section must enclose complete declarations.
[Scholl07]
• Concluding remarks
–DTD syntax:
•Pro: compact, easy to understand
•Con: not in XML
–DTD functionality:
•no distinguishable types (everything is character data)
•no further value constraints (e.g., cardinality of sequences)
•no built-in scoping (but: use XML namespaces, xmlns)
–From a database perspective, DTDs are a poor schema definition language.
(But: see XML Schema below ...)
[Scholl07]
3.3 XML Schema
• XML Schema
–With XML Schema, the W3C provides a schema description language for XML documents that goes way beyond the capabilities of the "native" DTD concept.
Specifically:
1. XML Schema descriptions are valid XML documents themselves.
2. XML Schema provides a rich set of built-in data types (modelled after the SQL and Java type systems).
3. Far-reaching control over the values a data type can assume (facets).
4. Users can extend this type system via user-defined types.
5. XML element (and attribute) types may even be derived by inheritance.
[Scholl07]
• XML Schema Recommendations
1. XML Schema Part 0: Primer, 2nd Edition, W3C Recommendation 28 October 2004, http://www.w3.org/TR/xmlschema-0/ [XMLSchema0-04]
• non-normative document intended to provide an easily readable description of the XML Schema facilities
2. XML Schema Part 1: Structures, 2nd Edition, W3C Recommendation 28 October 2004, http://www.w3.org/TR/xmlschema-1/
• how to define elements, attributes, content models etc.
3. XML Schema Part 2: Datatypes, 2nd Edition, W3C Recommendation 28 October 2004, http://www.w3.org/TR/xmlschema-2/
• standard datatypes and mechanisms to build user-defined datatypes
• Some XML Schema Constructs
1. Declaring an element
<xsd:element name="author"/>
No further typing specified: the author element defaults to xsd:anyType, i.e., its content is not constrained.
2. Declaring an element with bounded occurrence
<xsd:element name="author" minOccurs="1"
maxOccurs="unbounded"/>
Absence of minOccurs/maxOccurs implies exactly once.
3. Declaring a typed element
<xsd:element name="Birthday" type="xsd:date"/>
Content of Birthday takes the format YYYY-MM-DD.
4. XML Schema distinguishes 3 kinds of simple types: atomic types, list types and union types.
• simple types cannot contain elements or attributes and
• are either built-in or derived from other simple types (e.g., using restrictions)
[Scholl07]
• XML Schema's built-in simple types (examples):
–decimal, double, float
–integer
–boolean
–time
–hexBinary
–string, normalizedString, token
–language, Name, NCName
–DTD types (ID, IDREF, IDREFS, etc.)
• Simple types can be restricted using facets:
Restricting the value space of a simple type (enumeration)
<xsd:simpleType name="Tone">
<xsd:restriction base="xsd:string">
<xsd:enumeration value="question"/>
<xsd:enumeration value="angry"/>
<xsd:enumeration value="screaming"/>
</xsd:restriction>
</xsd:simpleType>
Restricting the value space of a simple type (regular expression)
<xsd:simpleType name="AreaCode">
<xsd:restriction base="xsd:string">
<xsd:pattern value="0[0-9]+"/>
<xsd:minLength value="3"/>
<xsd:maxLength value="5"/>
</xsd:restriction>
</xsd:simpleType>
• Other facets: length, maxInclusive, minExclusive, …
[Scholl07]
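To see the facets at work, the AreaCode type can be embedded in a complete schema and attached to an element. This is a sketch only: the element name areaCode and the sample values in the comment are assumptions, not part of the lecture material.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:simpleType name="AreaCode">
    <xsd:restriction base="xsd:string">
      <!-- All three facets must hold at the same time -->
      <xsd:pattern value="0[0-9]+"/>
      <xsd:minLength value="3"/>
      <xsd:maxLength value="5"/>
    </xsd:restriction>
  </xsd:simpleType>
  <!-- Hypothetical element that uses the restricted type -->
  <xsd:element name="areaCode" type="AreaCode"/>
  <!-- Accepted:  <areaCode>0531</areaCode>    <areaCode>030</areaCode>
       Rejected:  <areaCode>531</areaCode>     (pattern: no leading 0)
                  <areaCode>01</areaCode>      (shorter than minLength)
                  <areaCode>012345</areaCode>  (longer than maxLength) -->
</xsd:schema>
```

The pattern facet always has to match the whole value, so no explicit anchors are needed in the regular expression.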
• Complex types
–… are built from simple types using type constructors.
–An xsd:complexType may be used anonymously (no name attribute).
Declaring sequenced content
<xsd:complexType name="Address" >
<xsd:sequence>
<xsd:element name="name" type="xsd:string"/>
<xsd:element name="street" type="xsd:string"/>
<xsd:element name="city" type="xsd:string"/>
<xsd:element name="state" type="xsd:string"/>
<xsd:element name="zip" type="xsd:decimal"/>
</xsd:sequence>
</xsd:complexType>
<xsd:element name="address" type="Address"/>
• Anonymous type declaration
–No reuse possible
–More compact representation
Declaring anonymous types
<xsd:element name="address">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="name" type="xsd:string"/>
<xsd:element name="street" type="xsd:string"/>
<xsd:element name="city" type="xsd:string"/>
<xsd:element name="state" type="xsd:string"/>
<xsd:element name="zip" type="xsd:decimal"/>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
• New complex types may be derived from an existing (base) type.
[Scholl07]
Deriving a new complex type
<xsd:complexType name="UKAddress">
<xsd:complexContent>
<xsd:extension base="Address">
<xsd:sequence>
<xsd:element name="postcode"
type="UKPostcode"/>
</xsd:sequence>
<xsd:attribute name="exportCode"
type="xsd:positiveInteger"
fixed="1"/>
</xsd:extension>
</xsd:complexContent>
</xsd:complexType>
• Content Models
Example, Nested Choice and Sequence Groups
<xsd:complexType name="PurchaseOrderType">
<xsd:sequence>
<xsd:choice>
<xsd:group ref="shipAndBill"/>
<xsd:element name="singleUSAddress" type="USAddress"/>
</xsd:choice>
<xsd:element ref="comment" minOccurs="0"/>
<xsd:element name="items" type="Items"/>
</xsd:sequence>
<xsd:attribute name="orderDate" type="xsd:date"/>
</xsd:complexType>
<xsd:group name="shipAndBill">
<xsd:sequence>
<xsd:element name="shipTo" type="USAddress"/>
<xsd:element name="billTo" type="USAddress"/>
</xsd:sequence>
</xsd:group>
• Mixed Content
–With attribute mixed="true", an xsd:complexType admits mixed content.
–In contrast to DTDs, order and number of occurrences of elements can be specified.
Example: Snippet of Customer Letter
<letterBody>
<salutation>Dear Mr.<name>Robert Smith</name>.
</salutation>
Your order of <quantity>1</quantity>
<productName>Baby Monitor</productName> shipped from our warehouse on <shipDate>1999-05-21</shipDate>.
....
</letterBody>
Example: Snippet of Schema for Customer Letter
<xsd:element name="letterBody">
<xsd:complexType mixed="true">
<xsd:sequence>
<xsd:element name="salutation">
<xsd:complexType mixed="true">
<xsd:sequence>
<xsd:element name="name" type="xsd:string"/>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
<xsd:element name="quantity"
type="xsd:positiveInteger"/>
<xsd:element name="productName" type="xsd:string"/>
<xsd:element name="shipDate" type="xsd:date"
minOccurs="0"/>
<!-- etc. -->
</xsd:sequence>
</xsd:complexType>
</xsd:element>
• Attributes are declared within their owner element.
Declaring attributes
<xsd:element name="strip">
<xsd:complexType>
<xsd:attribute name="copyright"/>
<xsd:attribute name="year" type="xsd:gYear"/> ...
</xsd:complexType>
</xsd:element>
–Other xsd:attribute modifiers:
•use (required, optional, prohibited),
•fixed,
•default.
[Scholl07]
• XML schemas and target namespaces
–A schema can be viewed as a collection (vocabulary) of type definitions and element declarations whose names belong to a particular namespace called a target namespace.
–Target namespaces enable us to distinguish between definitions and declarations from different vocabularies.
•E.g., target namespaces enable us to distinguish between the declaration for element in the XML Schema language vocabulary and a declaration for element in a hypothetical chemistry language vocabulary.
•The former is part of the http://www.w3.org/2001/XMLSchema target namespace, and the latter is part of another target namespace.
• Example, Purchase Order Schema with Target Namespace, po1.xsd:
[XMLSchema0-04]
<schema xmlns="http://www.w3.org/2001/XMLSchema"
xmlns:po="http://www.example.com/PO1"
targetNamespace="http://www.example.com/PO1"
elementFormDefault="unqualified"
attributeFormDefault="unqualified">
<element name="purchaseOrder"
type="po:PurchaseOrderType"/>
<element name="comment" type="string"/>
…
<complexType name="PurchaseOrderType">
<sequence>
<element name="shipTo" type="po:USAddress"/>
<element name="billTo" type="po:USAddress"/>
<element ref="po:comment" minOccurs="0"/>
<!-- etc. -->
</sequence>
<!-- etc. -->
</complexType>
<complexType name="USAddress">
<sequence>
<element name="name" type="string"/>
<element name="street" type="string"/>
<!-- etc. -->
</sequence>
</complexType>
<!-- etc. -->
</schema>
• Example, a Purchase Order with Unqualified Locals, po1.xml
[XMLSchema0-04]
<?xml version="1.1"?>
<apo:purchaseOrder xmlns:apo="http://www.example.com/PO1"
orderDate="1999-10-20">
<shipTo country="US">
<name>Alice Smith</name>
<street>123 Maple Street</street>
<!-- etc. -->
</shipTo>
<billTo country="US">
<name>Robert Smith</name>
<street>8 Oak Avenue</street>
<!-- etc. -->
</billTo>
<apo:comment>Hurry, my lawn is going wild!</apo:comment>
<!-- etc. -->
</apo:purchaseOrder>
• Uniqueness constraints, keys and referential integrity
–xsd:unique element consists of
•one selector element
–selects a set of elements for which uniqueness has to be guaranteed
•one or more field elements
–identify elements or attributes which have to have unique values
•based on XPath expressions (see next week's lecture)
–xsd:key element
•analogous to the xsd:unique element
–xsd:keyref element
•also analogous to the xsd:unique element
•with an additional attribute refer, which contains the name of an xsd:key element
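As a sketch of how xsd:key and xsd:keyref address the ID/IDREF drawback from section 3.2, the schema below restricts personInCharge to identifiers of person elements. The root element company, the constraint names and the use of xsd:string for the identifiers are assumptions made for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical root element that scopes the constraints -->
  <xsd:element name="company">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="person" maxOccurs="unbounded">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name="name" type="xsd:string"/>
            </xsd:sequence>
            <xsd:attribute name="id" type="xsd:string" use="required"/>
          </xsd:complexType>
        </xsd:element>
        <xsd:element name="department" minOccurs="0" maxOccurs="unbounded">
          <xsd:complexType>
            <xsd:attribute name="id" type="xsd:string" use="required"/>
          </xsd:complexType>
        </xsd:element>
        <xsd:element name="project" minOccurs="0" maxOccurs="unbounded">
          <xsd:complexType>
            <xsd:attribute name="personInCharge" type="xsd:string" use="required"/>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
    <!-- Key: every person has an id, unique among persons only -->
    <xsd:key name="personKey">
      <xsd:selector xpath="person"/>
      <xsd:field xpath="@id"/>
    </xsd:key>
    <!-- Foreign key: personInCharge must match some person id -->
    <xsd:keyref name="personInChargeRef" refer="personKey">
      <xsd:selector xpath="project"/>
      <xsd:field xpath="@personInCharge"/>
    </xsd:keyref>
  </xsd:element>
</xsd:schema>
```

Under this schema, a project with personInCharge='d0001' is rejected when d0001 is only a department id, which the DTD variant could not express. Because the key is scoped to person elements, a department may also reuse an identifier without a clash.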
• Other XML Schema Concepts
–Fixed and default element content,
–support for null values,
–reuse concepts (inheritance, model groups).
3.4 DTDs vs. XML Schema
Document Type Definition                         XML Schema
Not XML                                          XML (XML tools can be used)
Compact syntax                                   Verbose
No datatypes (everything is character data)      Sophisticated type system
Few possibilities to constrain the schema        Diverse possibilities to constrain the schema
•Only "*", "+", "?" as cardinality constraints   •Full flexibility for cardinality constraints
•Only rudimentary key concept                    •Mature key concept
Reuse only through entities                      Reuse concepts available (type definitions, model groups etc.)
3.5 Validation
(Figure; source: Mario Jeckle, www.jeckle.de)
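For XML Schema, one common way to tell a validating parser which schema to apply is the xsi:schemaLocation hint on the root element. The sketch below reuses the purchase order namespace from po1.xsd and assumes that the schema file lies next to the instance document; the children are only indicated by a comment.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<apo:purchaseOrder xmlns:apo="http://www.example.com/PO1"
                   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                   xsi:schemaLocation="http://www.example.com/PO1 po1.xsd"
                   orderDate="1999-10-20">
  <!-- shipTo, billTo, comment and items as in po1.xml -->
</apo:purchaseOrder>
```

The attribute value is a pair of namespace name and schema location. It is only a hint; a processor may be configured to use a different schema for that namespace. For DTDs, the DOCTYPE declaration plays the corresponding role.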
3.6 Overview
1. Introduction
2. XML Basics
3. Schema definition
4. XML query languages I
5. Mapping relational data to XML
6. SQL/XML
7. XML processing
8. XML query languages II
9. XML storage I
10. XML storage - index
11. XML storage - native
12. Updates / Transactions
13. Systems
14. XML Benchmarks
3.7 References
• http://www.w3.org/ [W3C]
• Extensible Markup Language (XML) 1.1 (2nd Edition) [XML06]
–W3C Recommendation 16 August 2006, edited in place 29 September 2006
–http://www.w3.org/TR/xml11
• M. Scholl, "XML and Databases", Lecture, Uni Konstanz, WS07/08 [Scholl07]
• XML und Datenmodellierung
–R. and S. Eckstein
–Dpunkt-Verlag, 2004, ISBN 3898642224
• XML in a Nutshell [HM04]
–Harold & Means
–O'Reilly, 2004, ISBN 0596007647
• Vorlesung XML
–M. Jeckle
–http://www.jeckle.de/vorlesung/xml/script.html#XMLSchema
• Now, or ...
• Room: IZ 232
• Office hour: Tuesday, 12:30 – 13:30, or by appointment
• Email: eckstein@ifis.cs.tu-bs.de