(1)

Exercise 4

(2)

Exercise 4

Questions about the lecture "XPath & Co."?

XPath in Depth

XPath Example

Sample questions

XML Tools: working with XPath

XML In Use: RSS

(3)

XPath in Depth

(4)

The XPath data model

• In XPath, every XML document is a tree of nodes

• There are seven types of node

- Root (NOT the root element)

- Element

- Attribute (NOT a child of its parent element)

- Text

- Namespace (NOT a child of its parent element)

- Processing instruction

- Comment

• What is NOT part of the XPath tree

- XML declaration

- Document type declaration
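The difference between the root node and the root element can be seen with a small sketch using Python's standard `xml.dom.minidom`. This is the DOM rather than the XPath data model, so it is only an analogy, and the sample document is made up for the illustration.

```python
from xml.dom import minidom

# Hypothetical sample: XML declaration, a comment, then the root element.
XML = ('<?xml version="1.0"?>'
       '<!-- simple XML document -->'
       '<book><title>XPath</title></book>')

doc = minidom.parseString(XML)

# The document node plays the role of the XPath root node.
# Its children are the comment and the root element; the XML
# declaration does not show up as a node.
root_node_children = [child.nodeName for child in doc.childNodes]

# The root element is just one child of the root node.
root_element_name = doc.documentElement.tagName

print(root_node_children)
print(root_element_name)
```

Note that the DOM does expose a document type declaration as a node, whereas the XPath tree does not, which is why the sample avoids one.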

(5)

XPath data model – XML document

<?xml version="1.0"?>

<!-- simple XML document -->

<book>

<sample>

<![CDATA[

// C++ comment

if ( this->getX() < 5 && value[ 0 ] != 3 ) cerr << this->displayError();

]]>

</sample>

C++ How to Program by Deitel &amp; Deitel

</book>

(6)

XPath data model – XPath tree

Root node != Root element

(7)

Namespace nodes

<?xml version="1.0"?>

<!-- Fig. 11.3 : simple2.xml -->

<!-- Processing instructions and namespaces -->

<head>

<title>Processing Instructions and Namespace Nodes</title>

</head>

…

xmlns and xmlns:prefix attributes are NOT attribute nodes

(8)

Namespace nodes

What is wrong here?

Namespace nodes represent a namespace in the scope of an element!

(9)

XPath node values

• Each node has a particular (string) value, which it returns if selected by an XPath expression

- Root node: the concatenated text of all text nodes in the XML document

- Element node: the complete, parsed text between the element's start and end tags (all tags, comments and processing instructions removed, all CDATA sections and entity references resolved)

- Attribute node: the normalized attribute value

- Text node: the text content with all CDATA sections and entity references resolved

- Namespace node: the namespace URI

- Processing instruction node: the data of the processing instruction

- Comment node: the text content of the comment
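The string value of an element node can be approximated with Python's standard `xml.etree.ElementTree`, as sketched below. The sample element is hypothetical, and ElementTree is not a full XPath implementation; it merely happens to drop comments and resolve entities and CDATA in a comparable way.

```python
import xml.etree.ElementTree as ET

# Hypothetical element mixing text, a comment, a child element,
# an entity reference and a CDATA section.
XML = ("<person>John <!-- middle name --><b>Q.</b> Public &amp; "
       "<![CDATA[<Sons>]]></person>")

person = ET.fromstring(XML)

# Concatenate all descendant text. The default parser discards the
# comment, resolves &amp; and unwraps the CDATA section.
string_value = "".join(person.itertext())

print(string_value)  # John Q. Public & <Sons>
```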

(10)

XPath node lists

• Often, an XPath expression finds more than one match for the context node in the document

• In XPath, this is considered the context node list

• In XSLT, for example, each node in the list will be considered in turn

<xsl:template match="//person">

<xsl:value-of select="//person" />

</xsl:template>

(11)

XPath node sets

• A node list can be passed to any function which accepts the node-set datatype

• One function, id(), returns a node-set (of all elements that have an attribute of type ID whose value is one of the whitespace-separated tokens in the input string), e.g. in XSLT

<xsl:template match="id('aa ab')">

Matches the elements which have an ID-type attribute with the value aa or ab

<xsl:value-of select="id('aa ab')" />

Returns the string value of the first such element in document order (XSLT 1.0 converts a node-set to a string by taking its first node)

(12)

XPath predicates

• Each location step may have zero or more predicates

/person/profession[.="doctor"][position()=2]

How to interpret this?

<person>

<profession>doctor</profession>

<profession>nurse</profession>

</person>

(13)

XPath predicates (2)

• XPath resolves predicates from left to right

/person/profession[.="doctor"][position()=2]

• First, [.="doctor"] keeps only the profession elements whose value is "doctor"

(14)

XPath predicates (3)

• XPath resolves predicates from left to right

/person/profession[.="doctor"][position()=2]

• Then, [position()=2] selects the second node of the list that remains after the first predicate
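The left-to-right evaluation can be modelled with plain Python lists. This is only a sketch of the semantics, not an XPath engine; the list of three professions and the helper names `filter_equals` and `filter_position` are assumptions made for the illustration.

```python
# Hypothetical <profession> children of one <person>, in document order.
professions = ["doctor", "nurse", "doctor"]

def filter_equals(nodes, value):
    """Model of the predicate [.="value"]."""
    return [n for n in nodes if n[1] == value]

def filter_position(nodes, pos):
    """Model of [position()=pos]: positions are 1-based and counted
    within the list handed over by the previous predicate."""
    return [n for i, n in enumerate(nodes, start=1) if i == pos]

# Pair each value with its document position to see which node survives.
nodes = list(enumerate(professions, start=1))

# /person/profession[.="doctor"][position()=2]
left_to_right = filter_position(filter_equals(nodes, "doctor"), 2)

# /person/profession[position()=2][.="doctor"]
swapped = filter_equals(filter_position(nodes, 2), "doctor")

print(left_to_right)  # [(3, 'doctor')]
print(swapped)        # []
```

For a person with only the two professions doctor and nurse, both variants select nothing, because only one profession survives the first predicate.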

(15)

XPath functions (contd.)

• XPath 1.0 has 27 built-in functions

• Other specifications which use XPath, e.g. XSLT or XPointer, extend this function list

• Some XPath/XSLT processors allow for user-defined extension functions

• More string functions

- concat(string 1, string 2…) returns string

- contains(string 1, string 2) returns boolean

• More number functions

- ceiling(number n) returns smallest whole number >= n

- floor(number n) returns largest whole number <= n
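As a quick check of the rounding definitions, Python's `math.ceil` and `math.floor` follow the same rules for ordinary finite numbers. This is an analogy in Python, not XPath itself, and the variable names are only illustrative.

```python
import math

# ceiling(n): smallest whole number >= n; floor(n): largest whole number <= n
rounded = {
    2.1: (math.ceil(2.1), math.floor(2.1)),
    2.0: (math.ceil(2.0), math.floor(2.0)),   # a whole number maps to itself
    -2.5: (math.ceil(-2.5), math.floor(-2.5)),
}

# Rough Python counterparts of concat() and contains()
full_name = "".join(["John", " ", "Edwards"])
has_ed = "Ed" in full_name

print(rounded)
print(full_name, has_ed)
```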

(16)

XPath functions (2)

• String manipulation

- substring(string s, number index, number length)

- substring(string s, number index)

- substring-after(string s1, string s2)

- substring-before(string s1, string s2)

- translate(string s1, string s2, string s3)

e.g. translate("I don't like the letter l","l","_")

I don't like the letter l → I don't _ike the _etter _
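The behaviour of translate() can be sketched in Python as follows. The function name `xpath_translate` is an assumption for this illustration; it models the XPath 1.0 rules (character-by-character mapping, deletion when the third argument is shorter, first occurrence wins) without being an XPath implementation.

```python
def xpath_translate(s: str, src: str, dst: str) -> str:
    """Sketch of XPath 1.0 translate(s, src, dst)."""
    mapping = {}
    for i, ch in enumerate(src):
        if ch in mapping:
            continue  # the first occurrence of a character in src wins
        # Characters without a counterpart in dst are deleted (None).
        mapping[ch] = dst[i] if i < len(dst) else None
    out = []
    for ch in s:
        repl = mapping.get(ch, ch)  # unmapped characters are kept as-is
        if repl is not None:
            out.append(repl)
    return "".join(out)

example = xpath_translate("I don't like the letter l", "l", "_")
print(example)  # I don't _ike the _etter _
```

Passing an empty third argument removes the listed characters, e.g. `xpath_translate("2024-01-31", "-", "")`.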

(17)

XPath functions (3)

• lang(string language-code) returns boolean

The nearest xml:lang attribute on the context node or one of its ancestors determines the language of the node

If no such xml:lang attribute exists, lang() returns false

(18)

XPath functions (4)

• name() returns string

• name(node-set nodes) returns string

Returns qualified name (e.g. html:body) of the context node or the first node in the node-set

• local-name() returns string

• local-name(node-set nodes) returns string

As above, returning only the local name (after the namespace prefix) e.g. for <html:body> returns the string „body“

• namespace-uri() returns string

• namespace-uri(node-set nodes) returns string

As above, returning only the namespace URI of the node (not the
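A rough Python counterpart to local-name() and namespace-uri() is sketched below with the standard `xml.etree.ElementTree`. The helper `split_tag` is a made-up name. ElementTree stores element names as `{uri}local` and does not keep the prefix, so name() itself cannot be reproduced this way.

```python
import xml.etree.ElementTree as ET

XML = '<html:body xmlns:html="http://www.w3.org/1999/xhtml"/>'
body = ET.fromstring(XML)

def split_tag(tag):
    """Split ElementTree's '{uri}local' notation into (uri, local)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return "", tag  # element is in no namespace

namespace_uri, local_name = split_tag(body.tag)

print(local_name)     # body
print(namespace_uri)  # http://www.w3.org/1999/xhtml
```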

(19)

XPath functions (5)

• Handling whitespace in XML is often necessary, as the XML parser normally passes all whitespace and line breaks into the XML data model without changes

• normalize-space() takes a string and normalizes it:

- leading and trailing whitespace is stripped

- each sequence of whitespace is reduced to a single space character

- line breaks count as whitespace, so they are collapsed as well

• e.g. what is the XML element content for:

<person> John Edwards </person>

- Value of person is " John Edwards "

- Value of normalize-space(person) is "John Edwards"
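The normalization rules can be sketched in a few lines of Python. The function name `normalize_space` is chosen for the illustration, and the sketch assumes XML's definition of whitespace (space, tab, carriage return, line feed).

```python
import re

def normalize_space(s: str) -> str:
    """Sketch of XPath normalize-space() for a given string."""
    # Collapse every run of XML whitespace into one space ...
    collapsed = re.sub(r"[ \t\r\n]+", " ", s)
    # ... then drop the single space left at either end, if any.
    return collapsed.strip(" ")

value = normalize_space("  John \n\t Edwards  ")
print(repr(value))  # 'John Edwards'
```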

(20)

XPath Examples

(21)

Examples

Select the root element AAA:

/AAA

<AAA>
  <BBB/>
  <CCC/>
  <BBB/>
  <DDD>
    <BBB/>
  </DDD>
  <CCC/>
</AAA>

Select all CCC elements that are children of the element AAA:

/AAA/CCC

<AAA>
  <BBB/>
  <CCC/>
  <BBB/>
  <DDD>
    <BBB/>
  </DDD>
  <CCC/>
</AAA>
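These two paths can be tried out with Python's standard `xml.etree.ElementTree`, as a minimal sketch. ElementTree supports only a subset of XPath and evaluates paths relative to an element, so `/AAA/CCC` is written as `./CCC` starting from the root element AAA.

```python
import xml.etree.ElementTree as ET

XML = "<AAA><BBB/><CCC/><BBB/><DDD><BBB/></DDD><CCC/></AAA>"
root = ET.fromstring(XML)  # root is the element AAA

# /AAA: the root element itself
root_tag = root.tag

# /AAA/CCC: the CCC children of AAA, relative to the root element
ccc_children = root.findall("./CCC")

print(root_tag, len(ccc_children))  # AAA 2
```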

(22)

Examples

Select all BBB elements:

//BBB

<AAA>
  <BBB/>
  <CCC/>
  <BBB/>
  <DDD>
    <BBB/>
  </DDD>
  <CCC>
    <DDD>
      <BBB/>
    </DDD>
  </CCC>
</AAA>

Select all BBB elements that are children of a DDD element:

//DDD/BBB

<AAA>
  <BBB/>
  <CCC/>
  <BBB/>
  <DDD>
    <BBB/>
  </DDD>
  <CCC>
    <DDD>
      <BBB/>
    </DDD>
  </CCC>
</AAA>
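A minimal sketch with Python's standard `xml.etree.ElementTree` shows how many nodes the two descendant paths select. It assumes the document of this slide is closed with `</AAA>`, and it writes `//` as `.//` because ElementTree evaluates paths relative to an element.

```python
import xml.etree.ElementTree as ET

XML = ("<AAA><BBB/><CCC/><BBB/><DDD><BBB/></DDD>"
       "<CCC><DDD><BBB/></DDD></CCC></AAA>")
root = ET.fromstring(XML)

# //BBB: every BBB element anywhere below the root element
all_bbb = root.findall(".//BBB")

# //DDD/BBB: every BBB element whose parent is a DDD element
bbb_in_ddd = root.findall(".//DDD/BBB")

print(len(all_bbb), len(bbb_in_ddd))  # 4 2
```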

(23)

Examples

Select all BBB elements that have exactly three ancestors:

/*/*/*/BBB

<AAA>
  <XXX>
    <DDD>
      <BBB/>
      <FFF/>
    </DDD>
  </XXX>
  <CCC>
    <BBB>
      <BBB/>
    </BBB>
  </CCC>
</AAA>

Select all elements:

//*

<AAA>
  <XXX>
    <DDD>
      <BBB/>
      <FFF/>
    </DDD>
  </XXX>
  <CCC>
    <BBB>
      <BBB/>
    </BBB>
  </CCC>
</AAA>
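The wildcard paths `/*/*/*/BBB` and `//*` can be checked with a short sketch using Python's standard `xml.etree.ElementTree`. Since ElementTree paths start at an element, the first `/*` step is the root element itself and the remaining steps are written relative to it.

```python
import xml.etree.ElementTree as ET

XML = ("<AAA><XXX><DDD><BBB/><FFF/></DDD></XXX>"
       "<CCC><BBB><BBB/></BBB></CCC></AAA>")
root = ET.fromstring(XML)

# /*/*/*/BBB: BBB elements exactly three levels below the root element
deep_bbb = root.findall("./*/*/BBB")

# //*: every element in the document, including the root element
all_elements = list(root.iter())

print(len(deep_bbb), len(all_elements))  # 2 8
```

The BBB that is a direct child of CCC is not selected by the first path, because it has only two element ancestors.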

(24)

Examples

Select all id attributes:

//@id

<AAA>
  <BBB id="b1"/>
  <BBB id="b2"/>
  <BBB name="bbb"/>
  <BBB/>
</AAA>

Select the last BBB child of the element AAA:

/AAA/BBB[last()]

<AAA>
  <BBB/>
</AAA>
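Attribute selection and last() can be approximated with Python's standard `xml.etree.ElementTree`, as sketched here. ElementTree cannot return attribute nodes, so the sketch collects the attribute values from the elements instead; reusing the four-BBB document for the last() example is an assumption made for the illustration.

```python
import xml.etree.ElementTree as ET

XML = '<AAA><BBB id="b1"/><BBB id="b2"/><BBB name="bbb"/><BBB/></AAA>'
root = ET.fromstring(XML)

# //@id: collect the value of every id attribute in document order
ids = [elem.get("id") for elem in root.iter() if "id" in elem.attrib]

# /AAA/BBB[last()]: the last BBB child of the root element AAA
last_bbb = root.findall("./BBB[last()]")

print(ids)                 # ['b1', 'b2']
print(last_bbb[0].attrib)  # {}
```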

(25)

Examples

Select all following siblings of every CCC element:

//CCC/following-sibling::*

<AAA>
  <BBB>
    <CCC/>
    <DDD/>
  </BBB>
  <XXX>
    <EEE/>
    <CCC/>
    <FFF/>
    <FFF>
      <GGG/>
    </FFF>
  </XXX>
</AAA>

Select all CCC and all BBB elements:

//CCC | //BBB

<AAA>
  <BBB/>
  <CCC/>
  <DDD>
    <CCC/>
  </DDD>
  <EEE/>
</AAA>
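Python's standard `xml.etree.ElementTree` supports neither the following-sibling axis nor the union operator, so the sketch below models both by hand. The helper `following_siblings` is a made-up name, and its result is not guaranteed to be in document order for arbitrary documents.

```python
import xml.etree.ElementTree as ET

def following_siblings(root, tag):
    """Model of //tag/following-sibling::* (each node listed once)."""
    result = []
    for parent in root.iter():
        children = list(parent)
        for i, child in enumerate(children):
            if child.tag == tag:
                for sibling in children[i + 1:]:
                    if not any(sibling is seen for seen in result):
                        result.append(sibling)
    return result

doc1 = ET.fromstring("<AAA><BBB><CCC/><DDD/></BBB>"
                     "<XXX><EEE/><CCC/><FFF/><FFF><GGG/></FFF></XXX></AAA>")
sibling_tags = [e.tag for e in following_siblings(doc1, "CCC")]

# //CCC | //BBB: one pass in document order keeps every match exactly once
doc2 = ET.fromstring("<AAA><BBB/><CCC/><DDD><CCC/></DDD><EEE/></AAA>")
union_tags = [e.tag for e in doc2.iter() if e.tag in {"CCC", "BBB"}]

print(sibling_tags)  # ['DDD', 'FFF', 'FFF']
print(union_tags)    # ['BBB', 'CCC', 'CCC']
```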

(26)

Sample questions

(27)

Question 1

Which of the following XPath expressions will select all shoe children of the current node that have a width attribute with the value of "EEEE"?

A. shoe[attribute::width="EEEE"]

B. shoe[@width="EEEE"]

C. shoe(@width="EEEE")

D. child::shoe[attribute::width="EEEE"]

E. child::shoe(attribute::width="EEEE")

(28)

Question 2

What is the result of the following XPath expression?

translate("asparagus","abc","ABC")

A. asparagus

B. Asparagus

C. AspArAgus

D. AspBrCgus

E. ABCaragus

(29)

Question 3

Which of the following is the correct syntax for expressing the IF-THEN-ELSE flow of control in XPath 1.0?

A. ( a = b ? c : d )

B. cond ( a = b, c, d )

C. if ( a = b ) then ( c ) else ( d )

D. if ( a = b ) then c else d

E. There is no such construct in XPath 1.0.

(30)

Question 4

<?xml version="1.0" encoding="UTF-8"?>

<periodicTable>

<chemicalElement>

<atomicNumber>47</atomicNumber>

<atomicWeight>107.8682</atomicWeight>

</chemicalElement>

<chemicalElement>

<atomicNumber>79</atomicNumber>

<atomicWeight>196.96654</atomicWeight>

</chemicalElement>

</periodicTable>

Which XPath expression would select just the child elements of all chemicalElement elements of the periodicTable element?

A. /periodicTable/chemicalElement*

B. /periodicTable/*

C. /*/*/*

(31)

Question 5

Which XPath expression selects all species elements that have a mutation element?

A. species[mutation]

B. species(mutation)

C. species/mutation

D. species[@mutation]

(32)

Question 6

Which XPath expression selects the first manifold node that has a riemann attribute?

A. manifold[@riemann[1]]

B. manifold[position()=1][@riemann]

C. manifold[1]/[@riemann]

D. manifold[@riemann][1]

E. [1]manifold[@riemann]

(33)

Question 7

<?xml version="1.0" encoding="UTF-8"?>

<menu>

<entree>

<dish>Jambalaya</dish>

</entree>

<entree>

<dish>Borshch</dish>

</entree>

</menu>

Which XPath expression selects the price of Borshch?

A. //entree[@cuisine='Borshch']/price

B. //entree[dish='Borshch']/price

C. //entree[@dish='Russian']/price

D. //entree[text()='Borshch']/price

E. //entree['Borshch']/price

(34)

XML Tools: working with XPath

(35)

Tools for XPath

A simple Web form for providing XML documents and XPath expressions; it shows a match result (the context node list) rather than node values

http://www.futurelab.ch/xmlkurs/xpath.de.html

Inside the Eclipse SDK you will need to install a new plugin, e.g. the XPath-Developer plugin

http://www.bastian-bergerhoff.com/eclipse/features/web/XPathDeveloper/using.html

(36)

XPath-Developer

• Once the plugin is installed, you should find a category named XPath-Developer under Window > Show View > Other....

• The plugin currently contributes one view (the XPath-View).

(37)

XPath-Developer (2)

• As with many Eclipse editors, you can use TAB-completion for XPath expressions

• To get a list of proposals for completion of the current expression, hit 'Ctrl+Space' while editing in the area for expressions.

(38)

(39)

RSS: Syndication with XML

(40)

RSS on the Web

• On many web pages you will see this symbol

• It means there is an "RSS feed" (a URL serving an RSS document)

• With an RSS feed reader you could get this from that URL

(41)

What is it used for? How is it used?

• RSS stands for "Really Simple Syndication" (in RSS 2.0)

• It is used to publish frequently updated content, e.g. news, discussion forums, blog entries, podcasts …

• It enables users to keep up to date with their favourite sites without having to check them manually

• You need an RSS feed reader (now available in Firefox, IE 7, Yahoo Mail, Google …)

• Subscribe to a feed by entering the RSS URL or clicking on the RSS icon within a "supporting" application

• The reader checks the feed regularly, retrieving any updates it finds

(42)

What is RSS actually? How does it work?

• Did you realise it already? It is a simple XML document

• We'll look at RSS 2.0, which is the most "XML friendly" format (e.g. supports extension through namespaces)

• There is also RSS 0.91, the most used RSS version. It is dubbed "Rich Site Summary"

(43)

An RSS 2.0 document (pt 1)

<?xml version="1.0"?>

<rss version="2.0">

<channel>

<title>Liftoff News</title>

<link>http://liftoff.msfc.nasa.gov/</link>

<description>Liftoff to Space Exploration.</description>

<language>en-us</language>

<pubDate>Tue, 10 Jun 2003 04:00:00 GMT</pubDate>

<lastBuildDate>Tue, 10 Jun 2003 09:41:01 GMT</lastBuildDate>

<docs>http://blogs.law.harvard.edu/tech/rss</docs>

<generator>Weblog Editor 2.0</generator>

<managingEditor>editor@example.com</managingEditor>

<webMaster>webmaster@example.com</webMaster>

(44)

An RSS 2.0 document (pt 2)

<item>

<title>Star City</title>

<link>http://liftoff.msfc.nasa.gov/news/2003/news-starcity.asp</link>

<description>How do Americans get ready to work with Russians aboard the International Space Station? They take a crash course in culture,

language and protocol at Russia's Star City.</description>

<pubDate>Tue, 03 Jun 2003 09:39:21 GMT</pubDate>

<guid>http://liftoff.msfc.nasa.gov/2003/06/03.html#item573</guid>

</item>

<item> ……. </item>

</channel>

</rss>
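Because an RSS feed is plain XML, its items can be read with simple path expressions. The sketch below uses Python's standard `xml.etree.ElementTree` on a shortened copy of the feed above; shortening the feed and embedding it as a string are choices made for the illustration, and a real reader would fetch the document from the feed URL.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample feed shown above.
RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Liftoff News</title>
    <link>http://liftoff.msfc.nasa.gov/</link>
    <item>
      <title>Star City</title>
      <link>http://liftoff.msfc.nasa.gov/news/2003/news-starcity.asp</link>
      <pubDate>Tue, 03 Jun 2003 09:39:21 GMT</pubDate>
    </item>
  </channel>
</rss>"""

rss = ET.fromstring(RSS)  # rss is the root element <rss>

# /rss/channel/title, relative to the root element
channel_title = rss.findtext("./channel/title")

# /rss/channel/item: one (title, link) pair per item
items = [(item.findtext("title"), item.findtext("link"))
         for item in rss.findall("./channel/item")]

print(channel_title)
for title, link in items:
    print(title, link)
```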

(45)

Why XML?

• What are the advantages of XML for the RSS format?

- Validation through XML Schema

- Extensibility through XML Namespaces

- Adaptability through XSLT

• E.g. consider an XSLT stylesheet for RSS to HTML

- Instant RSS file viewing in your browser

• Or XSLT for RSS to WML

- Easy access to RSS items on your mobile phone

• Or even XSLT for RSS to SMIL

- Generate simple multimedia presentations of the latest items in your favourite RSS feed!