Exercise 4
Questions about the lecture "XPath & Co."?
XPath in Depth
XPath Example
Sample questions
XML Tools: working with XPath
XML In Use: RSS
XPath in Depth
The XPath data model
• In XPath, every XML document is a tree of nodes
• There are seven types of node
- Root (NOT the root element)
- Element
- Attribute (NOT a child of the parent node)
- Text
- Namespace (NOT a child of the parent node)
- Processing instruction
- Comment
• What is NOT part of the XPath tree
- XML declaration
- Document type declaration
XPath data model – XML document
<?xml version="1.0"?>
<!-- simple.xml -->
<!-- simple XML document -->
<book title="C++ How to Program" edition="3">
<sample>
<![CDATA[
//C++ comment
if ( this->getX() < 5 && value[ 0 ] != 3 ) cerr << this->displayError();
]]>
</sample>
C++ How to Program by Deitel &amp; Deitel
</book>
XPath data model – XPath tree
Root node != Root element
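The difference between the root node and the root element can be made visible with a small sketch. It uses Python's `xml.dom.minidom`, which implements the DOM rather than the XPath data model, so it is only an analogy: the DOM `Document` object stands in for the XPath root node. The shortened `book` document is an assumption made for illustration.

```python
from xml.dom import Node, minidom

# Shortened variant of simple.xml (assumption: content reduced for brevity).
doc = minidom.parseString(
    '<?xml version="1.0"?>'
    "<!-- simple.xml -->"
    '<book title="C++ How to Program" edition="3">text</book>'
)

# The Document object plays the role of the root node ...
root_node = doc
# ... while the root element is only one of its children.
root_element = doc.documentElement

# The XML declaration does not show up as a node; the comment does.
child_types = [child.nodeType for child in root_node.childNodes]

# Attributes are reachable from the element but are not among its children.
attribute_is_child = any(
    child.nodeType == Node.ATTRIBUTE_NODE for child in root_element.childNodes
)

print(root_node.nodeType == Node.DOCUMENT_NODE)
print(child_types == [Node.COMMENT_NODE, Node.ELEMENT_NODE])
print(root_element.tagName, attribute_is_child)
```

The root node has two children here (the comment and the `book` element), which is why `/` and `/book` select different nodes.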
Namespace nodes
<?xml version="1.0"?>
<!-- Fig. 11.3 : simple2.xml -->
<!-- Processing instructions and namespaces -->
<html xmlns="http://www.w3.org/TR/REC-html40">
<head>
<title>Processing Instructions and Namespace Nodes</title>
</head>
…
xmlns and xmlns:prefix attributes are NOT attribute nodes
Namespace nodes
What is wrong here?
Namespace nodes represent a namespace in the scope of an element!
XPath node values
• Each node has a particular (string) value, which it returns if selected by an XPath expression
- Root node: the entire text of the XML document (all character data, without markup)
- Element node: the complete, parsed text between the element's start and end tags (all tags, comments and processing instructions removed, all CDATA sections and entity references resolved)
- Attribute node: the normalized attribute value
- Text node: the text content with all CDATA sections and entity references resolved
- Namespace node: the namespace URI
- Processing instruction node: the data of the processing instruction
- Comment node: the text content of the comment
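The string value of an element node can be approximated in Python by joining all descendant text. This is a minimal sketch using `xml.etree.ElementTree`, not an XPath engine; the `sample` fragment is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed fragment: a CDATA section, an entity reference, a child element
# and a comment inside one element.
sample = ET.fromstring(
    "<sample><![CDATA[x < 5]]> &amp; <b>y</b><!-- dropped --></sample>"
)

# itertext() walks all text inside the element in document order. The CDATA
# section and the entity reference arrive already resolved, and tags and
# comments contribute nothing.
string_value = "".join(sample.itertext())
print(string_value)  # x < 5 & y
```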
XPath node lists
• Often, an XPath expression finds more than one match for the context node in the document
• In XPath, this is considered the context node list
• In XSLT, for example, each node in the list will be considered in turn
<xsl:template match="//person">
<xsl:value-of select="//person" />
</xsl:template>
XPath node sets
• A node list can be provided to any function which accepts the node-set datatype
• One function, id(), returns a node-set (of all elements that have an attribute of type ID whose value is one of the whitespace-separated tokens in the input string), e.g. in XSLT
<xsl:template match="id('aa ab')">
Matches the elements which have an ID-type attribute with the value aa or ab
<xsl:value-of select="id('aa ab')" />
Returns the string value of the first such element in document order (in XSLT 1.0, value-of uses only the first node of a node-set)
XPath predicates
• Each location step may have zero or more predicates
/person/profession[.="doctor"][position()=2]
How to interpret this?
<person>
<profession>doctor</profession>
<profession>nurse</profession>
<profession>nurse</profession>
<profession>doctor</profession>
</person>
XPath predicates (2)
• XPath resolves predicates from left to right
/person/profession[.="doctor"][position()=2]
<person>
<profession>doctor</profession>
<profession>nurse</profession>
<profession>nurse</profession>
<profession>doctor</profession>
</person>
XPath predicates (3)
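• XPath resolves predicates from left to right: [.="doctor"] first keeps the 1st and 4th profession elements, then [position()=2] selects the second of those, i.e. the 4th profession element
/person/profession[.="doctor"][position()=2]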
<person>
<profession>doctor</profession>
<profession>nurse</profession>
<profession>nurse</profession>
<profession>doctor</profession>
</person>
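The left-to-right evaluation can be traced step by step. In this sketch the two predicates are applied by hand with list operations instead of by an XPath engine, so that the intermediate node list is visible; the variable names are chosen for illustration.

```python
import xml.etree.ElementTree as ET

person = ET.fromstring(
    "<person>"
    "<profession>doctor</profession>"
    "<profession>nurse</profession>"
    "<profession>nurse</profession>"
    "<profession>doctor</profession>"
    "</person>"
)

# Location step /person/profession: all profession children in document order.
step = list(person.findall("profession"))

# First predicate [.="doctor"]: keep nodes whose string value is "doctor".
after_first = [node for node in step if node.text == "doctor"]

# Second predicate [position()=2]: positions are counted within the already
# filtered list, so this is the second doctor, not the second child.
after_second = [
    node for position, node in enumerate(after_first, start=1) if position == 2
]

# 1-based position of the selected node among all profession children.
selected_index = step.index(after_second[0]) + 1
print(len(after_first), selected_index)  # 2 4
```

Swapping the predicates changes the result: `[position()=2][.="doctor"]` would first pick the second child (a nurse) and then filter it out, leaving an empty node-set.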
XPath functions (contd.)
• XPath 1.0 has 27 built-in functions
• Others which use XPath, e.g. XSLT or XPointer, extend this function list
• Some XPath/XSLT parsers allow for user-defined extension functions
• More string functions
- concat(string s1, string s2, …) returns string
- contains(string s1, string s2) returns boolean
• More number functions
- ceiling(number n) returns the smallest whole number ≥ n
- floor(number n) returns the largest whole number ≤ n
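The boundary cases of these functions are easy to check with Python's standard-library counterparts. This is only an analogy: Python returns integers where XPath returns numbers, and the sample values are arbitrary.

```python
import math

# ceiling/floor: a whole number is returned unchanged (the ">=" and "<=" case).
ceiling_results = [math.ceil(2.1), math.ceil(3.0), math.ceil(-2.5)]
floor_results = [math.floor(2.9), math.floor(3.0), math.floor(-2.5)]

# concat and contains correspond to string joining and the "in" operator.
concatenated = "".join(["XPath", " ", "1.0"])
contains_path = "Path" in concatenated

print(ceiling_results, floor_results)  # [3, 3, -2] [2, 3, -3]
print(concatenated, contains_path)     # XPath 1.0 True
```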
XPath functions (2)
• String manipulation
- substring(string s, number index, number length)
- substring(string s, number index)
- substring-after(string s1, string s2)
- substring-before(string s1, string s2)
- translate(string s1, string s2, string s3)
e.g. translate("I don't like the letter l", "l", "_")
I don't like the letter l → I don't _ike the _etter _
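The character-by-character behaviour of translate() can be re-implemented in a few lines. The function `xpath_translate` below is a hypothetical helper written for this sketch, following the XPath 1.0 rules: each character of s2 is replaced by the character at the same position in s3, and characters of s2 without a counterpart in s3 are removed.

```python
def xpath_translate(s1, s2, s3):
    """Sketch of XPath 1.0 translate(); the helper name is illustrative."""
    mapping = {}
    for index, char in enumerate(s2):
        if char in mapping:
            continue  # the first occurrence in s2 determines the replacement
        # None marks characters that have no counterpart in s3 (they are removed).
        mapping[char] = s3[index] if index < len(s3) else None
    result = []
    for char in s1:
        if char not in mapping:
            result.append(char)           # untouched character
        elif mapping[char] is not None:
            result.append(mapping[char])  # replaced character
    return "".join(result)


slide_example = xpath_translate("I don't like the letter l", "l", "_")
print(slide_example)                             # I don't _ike the _etter _
print(xpath_translate("bar", "abc", "ABC"))      # BAr
print(xpath_translate("--aaa--", "abc-", "ABC")) # AAA
```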
XPath functions (3)
• lang(string language-code) returns boolean
The nearest xml:lang attribute on the context node or one of its ancestors determines the language of the node
If no such xml:lang attribute exists, lang() returns false
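The lookup of the nearest xml:lang attribute can be sketched as a walk up the ancestor chain. The helper `lang` and the sample document are assumptions for illustration; `ElementTree` keeps no parent pointers, so a parent map is built first. Matching ignores case and accepts sublanguages such as en-GB for the code en.

```python
import xml.etree.ElementTree as ET

# ElementTree stores xml:lang under the reserved XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def lang(node, parents, code):
    """Sketch of XPath lang(): the nearest xml:lang on node or an ancestor decides."""
    current = node
    while current is not None:
        value = current.get(XML_LANG)
        if value is not None:
            value, wanted = value.lower(), code.lower()
            return value == wanted or value.startswith(wanted + "-")
        current = parents.get(current)
    return False  # no xml:lang attribute anywhere on the ancestor chain


doc = ET.fromstring(
    '<doc xml:lang="en-GB"><p>colour</p><p xml:lang="de">Farbe</p></doc>'
)
parents = {child: parent for parent in doc.iter() for child in parent}
english, german = doc.findall("p")

print(lang(english, parents, "en"))  # True, inherited from <doc>
print(lang(german, parents, "en"))   # False, overridden by xml:lang="de"
```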
XPath functions (4)
• name() returns string
• name(node-set nodes) returns string
Returns qualified name (e.g. html:body) of the context node or the first node in the node-set
• local-name() returns string
• local-name(node-set nodes) returns string
As above, returning only the local name (after the namespace prefix), e.g. for <html:body> returns the string "body"
• namespace-uri() returns string
• namespace-uri(node-set nodes) returns string
As above, returning only the namespace URI of the node (not the
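The split between local name and namespace URI can be seen in how Python's `ElementTree` stores element names as `{uri}local`. The helper `split_tag` is an assumption for this sketch. `ElementTree` does not keep the prefix, so the qualified name returned by name() cannot be reproduced this way.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    '<html:html xmlns:html="http://www.w3.org/TR/REC-html40">'
    "<html:body/>"
    "</html:html>"
)
body = doc[0]


def split_tag(tag):
    """Split an ElementTree tag of the form {uri}local into (uri, local)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return "", tag  # element in no namespace


namespace_uri, local_name = split_tag(body.tag)
print(local_name)     # body
print(namespace_uri)  # http://www.w3.org/TR/REC-html40
```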
XPath functions (5)
• Handling whitespace in XML is often necessary, because the XML parser normally passes all whitespace and line breaks into the XML data model unchanged
• normalize-space() takes a string and normalizes it:
- leading and trailing whitespace is stripped
- each sequence of whitespace is reduced to one space character
- line breaks count as whitespace, so they are collapsed as well
• e.g. what is the XML element content for:
<person> John Edwards </person>
- Value of person is " John Edwards "
- Value of normalize-space(person) is "John Edwards"
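The three normalization steps can be written as a short Python function. This is a sketch of the behaviour, not a call into an XPath engine; `normalize_space` is a hypothetical helper, and the input with extra spaces and a line break is an assumed variant of the person example.

```python
import re


def normalize_space(text):
    """Sketch of XPath normalize-space() for the XML whitespace characters."""
    # Collapse every run of space, tab, CR and LF into a single space ...
    collapsed = re.sub(r"[ \t\r\n]+", " ", text)
    # ... then drop the single leading/trailing space that may remain.
    return collapsed.strip(" ")


raw_value = "  John \n   Edwards\t"
print(repr(normalize_space(raw_value)))  # 'John Edwards'
```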
XPath Examples
Examples
Select the root element AAA:
<AAA>
<BBB/>
<CCC/>
<BBB/>
<BBB/>
<DDD>
<BBB/>
</DDD>
<CCC/>
</AAA>
/AAA
Select all CCC elements that are children of the element AAA:
<AAA>
<BBB/>
<CCC/>
<BBB/>
<BBB/>
<DDD>
<BBB/>
</DDD>
<CCC/>
</AAA>
/AAA/CCC
Examples
<AAA>
<BBB/>
<CCC/>
<BBB/>
<DDD>
<BBB/>
</DDD>
<CCC>
<DDD>
<BBB/>
<BBB/>
</DDD>
</CCC>
</AAA>
//BBB
<AAA>
<BBB/>
<CCC/>
<BBB/>
<DDD>
<BBB/>
</DDD>
<CCC>
<DDD>
<BBB/>
<BBB/>
</DDD>
</CCC>
</AAA>
//DDD/BBB
Examples
<AAA>
<XXX>
<DDD>
<BBB/>
<FFF/>
</DDD>
</XXX>
<CCC>
<BBB>
<BBB>
<BBB/>
</BBB>
</BBB>
</CCC>
</AAA>
/*/*/*/BBB
<AAA>
<XXX>
<DDD>
<BBB/>
<FFF/>
</DDD>
</XXX>
<CCC>
<BBB>
<BBB>
<BBB/>
</BBB>
</BBB>
</CCC>
</AAA>
//*
Examples
<AAA>
<BBB id="b1"/>
<BBB id="b2"/>
<BBB name="bbb"/>
<BBB/>
</AAA>
//@id
<AAA>
<BBB/>
<BBB/>
<BBB/>
<BBB/>
</AAA>
/AAA/BBB[last()]
Examples
<AAA>
<BBB>
<CCC/>
<DDD/>
</BBB>
<XXX>
<EEE/>
<CCC/>
<FFF/>
<FFF>
<GGG/>
</FFF>
</XXX>
</AAA>
//CCC/following-sibling::*
<AAA>
<BBB/>
<CCC/>
<DDD>
<CCC/>
</DDD>
<EEE/>
</AAA>
//CCC | //BBB
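Several of the example expressions can be tried out with Python's `xml.etree.ElementTree`. It supports only a subset of XPath, and its paths are written relative to the root element rather than as absolute paths. Axes such as following-sibling:: and the union operator | are not available there. The combined test document is an assumption put together from the examples above.

```python
import xml.etree.ElementTree as ET

# Assumed test document; "root" is the root element AAA.
root = ET.fromstring(
    "<AAA>"
    '<BBB id="b1"/><CCC/><BBB id="b2"/>'
    "<DDD><BBB/></DDD>"
    "<CCC><DDD><BBB/><BBB/></DDD></CCC>"
    "</AAA>"
)

ccc_children = root.findall("CCC")       # corresponds to /AAA/CCC
all_bbb = root.findall(".//BBB")         # corresponds to //BBB
bbb_in_ddd = root.findall(".//DDD/BBB")  # corresponds to //DDD/BBB
last_bbb = root.findall("BBB[last()]")   # corresponds to /AAA/BBB[last()]

# //@id selects attribute nodes. ElementTree cannot return those, so the
# sketch selects the elements carrying an id and reads the attribute value.
ids = [element.get("id") for element in root.findall(".//*[@id]")]

print(len(ccc_children), len(all_bbb), len(bbb_in_ddd))  # 2 5 3
print(last_bbb[0].get("id"), ids)                        # b2 ['b1', 'b2']
```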
Sample questions
Question 1
Which of the following XPath expressions will select all shoe children of the current node that have a width attribute with the value of "EEEE"?
A. shoe[attribute::width="EEEE"]
B. shoe[@width="EEEE"]
C. shoe(@width="EEEE")
D. child::shoe[attribute::width="EEEE"]
E. child::shoe(attribute::width="EEEE")
Question 2
What is the result of the following XPath expression?
translate("asagus","abc","ABC")
A. asagus
B. Asagus
C. AsAgus
D. AspBrCgus
E. ABCaragus
Question 3
Which of the following is the correct syntax for expressing the IF-THEN-ELSE flow of control in XPath 1.0?
A. ( a = b ? c : d )
B. cond ( a = b, c, d )
C. if ( a = b ) then ( c ) else ( d )
D. if ( a = b ) then c else d
E. There is no such construct in XPath 1.0.
Question 4
<?xml version="1.0" encoding="UTF-8"?>
<periodicTable>
<chemicalElement symbol="Ag">
<atomicNumber>47</atomicNumber>
<atomicWeight>107.8682</atomicWeight>
</chemicalElement>
<chemicalElement symbol="Au">
<atomicNumber>79</atomicNumber>
<atomicWeight>196.96654</atomicWeight>
</chemicalElement>
</periodicTable>
Which XPath expression would select just the child elements of all chemicalElement elements of the periodicTable element?
A. /periodicTable/chemicalElement*
B. /periodicTable/*
C. /*/*/*
Question 5
Which XPath expression selects all species elements that have a mutation element?
A. species[mutation]
B. species(mutation)
C. species/mutation
D. species[@mutation]
Question 6
Which XPath expression selects the first manifold node that has a riemann attribute?
A. manifold[@riemann[1]]
B. manifold[position()=1][@riemann]
C. manifold[1]/[@riemann]
D. manifold[@riemann][1]
E. [1]manifold[@riemann]
Question 7
<?xml version="1.0" encoding="UTF-8"?>
<menu>
<entree cuisine="Cajun">
<dish>Jambalaya</dish>
<price denomination="dollar">3.33</price>
</entree>
<entree cuisine="Russian">
<dish>Borshch</dish>
<price denomination="ruble">6.66</price>
</entree>
</menu>
Which XPath expression selects the price of Borshch?
A. //entree[@cuisine='Borshch']/price
B. //entree[dish='Borshch']/price
C. //entree[@dish='Russian']/price
D. //entree[text()='Borshch']/price
E. //entree['Borshch']/price
XML Tools: working with XPath
Tools for XPath
A simple Web form for providing XML documents and XPath expressions; it shows a match result (the context node list) rather than node values
http://www.futurelab.ch/xmlkurs/xpath.de.html
Inside the Eclipse SDK you will need to install a new plugin, e.g. the XPath-Developer plugin
http://www.bastian-bergerhoff.com/eclipse/features/web/XPathDeveloper/using.html
XPath-Developer
• Once the plugin is installed, you should find a category named XPath-Developer under Window > Show View > Other...
• The plugin currently contributes one view (the XPath-View).
XPath-Developer (2)
• As with many Eclipse editors, you can use TAB-completion for XPath expressions
• To get a list of proposals for completion of the current expression, hit 'Ctrl+Space' while editing in the area for expressions.
RSS: Syndication with XML
RSS on the Web
• On many web pages you will see this symbol
• It means there is an "RSS feed" (a URL with an RSS document)
• With an RSS feed reader you can retrieve the document from that URL
What is it used for? How is it used?
• RSS stands for "Really Simple Syndication" (in RSS 2.0)
• Is used to publish frequently updated content e.g. news, discussion forums, blog entries, podcasts …
• Enables users to keep up to date with their favourite sites without having to check them manually
• You need an RSS feed reader (now available in Firefox, IE 7, Yahoo Mail, Google …)
• Subscribe to a feed by entering the RSS URL or clicking on the RSS icon within a "supporting" application
• The reader checks the feed regularly, retrieving any updates it finds
What is RSS actually? How does it work?
• Have you noticed it already? It is a simple XML document
• We'll look at RSS 2.0, which is the most "XML friendly" format (e.g. supports extension through namespaces)
• There is also RSS 0.91, the most used RSS version. It is dubbed "Rich Site Summary"
An RSS 2.0 document (pt 1)
<?xml version="1.0"?>
<rss version="2.0">
<channel>
<title>Liftoff News</title>
<link>http://liftoff.msfc.nasa.gov/</link>
<description>Liftoff to Space Exploration.</description>
<language>en-us</language>
<pubDate>Tue, 10 Jun 2003 04:00:00 GMT</pubDate>
<lastBuildDate>Tue, 10 Jun 2003 09:41:01 GMT</lastBuildDate>
<docs>http://blogs.law.harvard.edu/tech/rss</docs>
<generator>Weblog Editor 2.0</generator>
<managingEditor>editor@example.com</managingEditor>
<webMaster>webmaster@example.com</webMaster>
An RSS 2.0 document (pt 2)
<item>
<title>Star City</title>
<link>http://liftoff.msfc.nasa.gov/news/2003/news-starcity.asp</link>
<description>How do Americans get ready to work with Russians aboard the International Space Station? They take a crash course in culture,
language and protocol at Russia's Star City.</description>
<pubDate>Tue, 03 Jun 2003 09:39:21 GMT</pubDate>
<guid>http://liftoff.msfc.nasa.gov/2003/06/03.html#item573</guid>
</item>
<item> ……. </item>
<item> ……. </item>
</channel>
</rss>
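Because an RSS feed is plain XML, a feed reader can pull out the channel and item data with simple path expressions. The following is a minimal sketch using Python's `xml.etree.ElementTree` on a shortened copy of the document above. It reads from a string instead of fetching a URL, and a real reader would also need error handling.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the Liftoff News feed shown above (assumption: one item only).
RSS = """<?xml version="1.0"?>
<rss version="2.0">
<channel>
<title>Liftoff News</title>
<link>http://liftoff.msfc.nasa.gov/</link>
<description>Liftoff to Space Exploration.</description>
<item>
<title>Star City</title>
<link>http://liftoff.msfc.nasa.gov/news/2003/news-starcity.asp</link>
<pubDate>Tue, 03 Jun 2003 09:39:21 GMT</pubDate>
</item>
</channel>
</rss>"""

rss = ET.fromstring(RSS)

# Paths are relative to the root element <rss>.
channel_title = rss.findtext("channel/title")
items = [
    (item.findtext("title"), item.findtext("link"))
    for item in rss.findall("channel/item")
]

print(channel_title)
for title, link in items:
    print(title, link)
```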
Why XML?
• What are the advantages of XML for the RSS format?
- Validation through XML Schema
- Extensibility through XML Namespaces
- Adaptability through XSLT
• E.g. consider an XSLT stylesheet for RSS to HTML
- Instant RSS file viewing on your browser
• Or XSLT for RSS to WML
- Easy access to RSS items on your mobile phone
• Or even XSLT for RSS to SMIL
- Generate simple multimedia presentations of the latest items in your favourite RSS feed!
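The idea behind an RSS to HTML transformation can be sketched without an XSLT processor. The Python function below is a stand-in for such a stylesheet, not XSLT itself: it maps the channel title to a heading and each item to a list entry. The function name `rss_to_html`, the output layout and the example.com link are assumptions for illustration.

```python
import html
import xml.etree.ElementTree as ET


def rss_to_html(rss_text):
    """Render channel title and items of an RSS 2.0 document as an HTML fragment."""
    channel = ET.fromstring(rss_text).find("channel")
    lines = ["<h1>%s</h1>" % html.escape(channel.findtext("title", default="")), "<ul>"]
    for item in channel.findall("item"):
        title = html.escape(item.findtext("title", default=""))
        link = html.escape(item.findtext("link", default=""), quote=True)
        lines.append('<li><a href="%s">%s</a></li>' % (link, title))
    lines.append("</ul>")
    return "\n".join(lines)


# Hypothetical one-item feed; the link is a placeholder URL.
feed = (
    '<rss version="2.0"><channel><title>Liftoff News</title>'
    "<item><title>Star City</title><link>http://example.com/star-city</link></item>"
    "</channel></rss>"
)
page = rss_to_html(feed)
print(page)
```

An XSLT stylesheet would express the same mapping declaratively, with one template for the channel and one for each item.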