
RDF: A Data Model for the Semantic Web

The Resource Description Framework (RDF) [RDF00a] is a very popular data model in knowledge representation. It is a simple, logical model that is used to represent things (either real or imaginary) as resources. Resources are connected by properties, which are themselves resources. The revised RDF specification became a W3C recommendation in 2004. The data model of RDF is a directed graph in which every node is a resource that is connected to other resources. The connecting edges are labelled and correspond to the properties of a resource. Every expression is encoded in (sets of) triples of the form subject predicate object. Such triples are called statements. Subject and object are nodes in the graph structure. Both data and metadata can be encoded using the triple structure: :Peter :hasMother :Susan is a simple statement about the relationship between two instances. From the point of view of the graph structure, schema information such as :hasMother rdf:type rdf:Property is likewise just an edge in the graph, labelled with the resource identifier of the property.

2 This is only true as long as there is no unique name assumption.

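Figure 3.2: Graph Structure of (:Peter :hasMother :Susan). [Diagram not reproduced. The node :Peter is connected to the node :Susan by an edge labelled :hasMother; :hasMother is connected to rdf:Property by an edge labelled rdf:type.]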
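The triple structure can be sketched in a few lines of Python. This is only a minimal illustration, assuming that triples are modelled as plain tuples of strings and that prefixed names are left unresolved; the helper name outgoing_edges is hypothetical and does not belong to any RDF library.

```python
# A minimal sketch: an RDF graph as a set of (subject, predicate, object) triples.
# Prefixed names are kept as plain strings; no namespace resolution is attempted.
graph = {
    (":Peter", ":hasMother", ":Susan"),          # instance data
    (":hasMother", "rdf:type", "rdf:Property"),  # schema information
}


def outgoing_edges(graph, node):
    """Return the labelled edges (predicate, object) leaving `node`, sorted."""
    return sorted((p, o) for (s, p, o) in graph if s == node)


# Data and metadata share one structure: :hasMother is an edge label
# in the first triple and a node (the subject) in the second.
print(outgoing_edges(graph, ":Peter"))
print(outgoing_edges(graph, ":hasMother"))
```

Using a set means that stating the same triple twice has no effect, which matches the view of an RDF graph as a set of statements.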

One basic design element in RDF is the uniform resource identifier, abbreviated and more commonly known as URI. The nodes and the edge labels are identified by URIs, with the exception that the object of a statement can also be a literal. URIs are like addresses, and in fact they are used as such on the Web: URLs (uniform resource locators) are a special kind of URI. The notion of a locator indicates that URLs are meant to point to existing locations, reachable using HTTP. URIs are more general: for instance, the URI http://family.org#hasMother consists of a protocol specification (http://) followed by the hierarchical part (family.org) and finally the fragment (#hasMother). This is not a URL because it does not point to a valid address on the Web. In ontology design, the fragment usually identifies the local name of a resource (a concept, a property, or an individual name), whereas the rest of the URI is given as a namespace prefix. Nevertheless, this is just a way to improve readability for human readers; the name of the resource is the full URI. This is also reflected in the unusual notation using colons in the example above, where the default namespace (:) and the RDF namespace (rdf:) have been used (but no definitions for the namespaces have been given). Although URIs need not point to an existing location, they often do. Consider the following example, where the resource identifiers are given in full length:

Example 3.4

(http://example.org#Peter http://family.org#hasMother http://example.org#Susan)
(http://example.org#Susan http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://family.org#Mother)

The example uses the built-in predicate type from the RDF namespace http://www.w3.org/1999/02/22-rdf-syntax-ns#. This URI points to a real document on the Web where the resource is defined:

<rdf:Property

The other URIs in Example 3.4 do not point to an existing document. As long as a URI is valid (well-formed) and unique, the resource can be used in any ontology, and all information about the same resource adds up to a whole when those ontologies are brought together. It is important to keep in mind that ontological data is not necessarily local; it can be distributed over many different locations.
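The decomposition of a URI into protocol, hierarchical part and fragment can be reproduced with Python's standard urllib.parse module. The following is a minimal sketch; the function name split_uri is hypothetical, and it assumes the fragment-based naming convention described above (namespaces ending in #).

```python
from urllib.parse import urldefrag, urlsplit


def split_uri(uri):
    """Split a URI into (namespace, local name) at the fragment separator.

    Assumes the convention that the local name is the fragment and the
    namespace is everything up to and including the '#'.
    """
    base, fragment = urldefrag(uri)
    return base + "#", fragment


uri = "http://family.org#hasMother"
parts = urlsplit(uri)

print(parts.scheme)    # protocol specification
print(parts.netloc)    # hierarchical part
print(parts.fragment)  # fragment, used as the local name
print(split_uri(uri))
```

Note that parsing a URI this way says nothing about whether a document exists at that address; the functions only inspect the string.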

RDF provides the user with a set of built-in predicates such as rdf:type. This predicate states that the subject of a statement is a member of the class specified by the object of the statement. Furthermore, there are constructs for collections such as bags, sequences, and lists.

There are different ways in which RDF graphs can be serialised for output, storage and communication. For example, RDF can be serialised in XML, which is especially useful for the interchange of data. N3 is more suitable in terms of readability. Example 3.4 is given here in both serialisations, N3 and RDF/XML:

Example 3.5 (Extending Example 3.4 using namespace prefixes)

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix family: <http://family.org#> .
@prefix : <http://example.org#> .

:Peter family:hasMother :Susan .
:Susan rdf:type family:Mother .

Example 3.6 (The same RDF data, serialised as RDF/XML)

<?xml version="1.0" encoding="ISO-8859-1"?>
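The relation between prefixed names and full URIs can be made explicit with a small Python sketch. It assumes the three prefix declarations of Example 3.5 and handles only simple prefix:local names, not the complete N3 syntax; the names PREFIXES and expand are hypothetical.

```python
# Prefix table taken from the @prefix declarations of Example 3.5.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "family": "http://family.org#",
    "": "http://example.org#",  # default namespace, written as a bare colon
}


def expand(name, prefixes=PREFIXES):
    """Expand a prefixed name such as 'family:Mother' to its full URI."""
    prefix, _, local = name.partition(":")
    return prefixes[prefix] + local


triples = [
    (":Peter", "family:hasMother", ":Susan"),
    (":Susan", "rdf:type", "family:Mother"),
]
expanded = [tuple(expand(term) for term in triple) for triple in triples]

for triple in expanded:
    print(triple)
```

The expansion is purely textual, which illustrates that a prefix is only an abbreviation: the name of a resource remains its full URI.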

In Example 3.5 there are two statements, but the second seems to be contained in the first: if :Peter has a mother :Susan, then :Susan is of course a mother. This derivation is easily made with common-sense knowledge, but it is not possible to draw that conclusion logically from these RDF statements alone. For that purpose, there exists a schema layer on top of RDF, named RDF Schema (or RDFS) [RDF00b], which adds further concepts for the description of knowledge or, in other words, for the modelling of metadata.

Example 3.7

RDFS specifies additional axioms that can be used to define assertions. (N3 notation allows semicolons to separate property-value pairs that share the same subject.)

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix family: <http://family.org#> .

family:hasMother rdf:type rdf:Property ;
    rdfs:range family:Mother ;
    rdfs:domain family:Child .

It is now defined that every object of a family:hasMother relationship is of type family:Mother. The domain axiom from the RDFS namespace makes an assertion about the subject of a relationship: any resource that has a family:hasMother relationship is known to be a family:Child. Note that, in contrast to databases, these restrictions are assertions: they add knowledge to a resource instead of putting constraints on relationships. If, for example, the statement (:Susan family:hasMother :Peter) is added to the facts of the example, this is no contradiction. There is neither a definition that makes the family:hasMother relationship asymmetric or irreflexive, nor any information that leads to a contradiction when :Peter becomes a mother (although, in fact, :Peter will never be a mother).

Example 3.7 implicitly defines the resources family:Child and family:Mother to be classes (concepts). This deduction can be made because both the rdfs:domain property and the rdfs:range property relate properties to classes.
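The assertional reading of rdfs:domain and rdfs:range can be illustrated with a short Python sketch. It is a simplified illustration, not a complete RDFS reasoner: it assumes triples are tuples of prefixed-name strings and applies only the domain and range rules in a single pass. The function name infer_types is hypothetical.

```python
RDF_TYPE, DOMAIN, RANGE = "rdf:type", "rdfs:domain", "rdfs:range"

# Schema statements from Example 3.7 (domain and range only).
schema = {
    ("family:hasMother", RANGE, "family:Mother"),
    ("family:hasMother", DOMAIN, "family:Child"),
}

# Facts, including the odd but non-contradictory reverse statement.
facts = {
    (":Peter", "family:hasMother", ":Susan"),
    (":Susan", "family:hasMother", ":Peter"),
}


def infer_types(schema, facts):
    """Derive rdf:type statements from domain and range axioms."""
    inferred = set()
    for (prop, axiom, cls) in schema:
        for (s, p, o) in facts:
            if p != prop:
                continue
            if axiom == DOMAIN:
                inferred.add((s, RDF_TYPE, cls))  # the subject is in the domain class
            elif axiom == RANGE:
                inferred.add((o, RDF_TYPE, cls))  # the object is in the range class
    return inferred


for triple in sorted(infer_types(schema, facts)):
    print(triple)
```

Nothing is rejected here: the second fact simply leads to the additional conclusions that :Peter is a family:Mother and :Susan is a family:Child, which is the behaviour described above.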

Example 3.8 (Mothers and Children are Persons)

@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix family: <http://family.org#> .

family:Mother rdfs:subClassOf family:Person .
family:Child rdfs:subClassOf family:Person .

As a result, instances of family:Mother or family:Child are also instances of family:Person.
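The effect of rdfs:subClassOf on instances can be sketched as follows. The snippet assumes triples as tuples of strings and applies the subclass rule once, without following chains of subclasses; the function name infer_superclass_types is hypothetical.

```python
TYPE, SUBCLASS = "rdf:type", "rdfs:subClassOf"

triples = {
    ("family:Mother", SUBCLASS, "family:Person"),
    ("family:Child", SUBCLASS, "family:Person"),
    (":Susan", TYPE, "family:Mother"),
}


def infer_superclass_types(triples):
    """If x is an instance of C and C is a subclass of D, x is an instance of D."""
    subclass_pairs = {(s, o) for (s, p, o) in triples if p == SUBCLASS}
    return {
        (x, TYPE, sup)
        for (x, p, cls) in triples
        if p == TYPE
        for (sub, sup) in subclass_pairs
        if sub == cls
    }


print(infer_superclass_types(triples))
```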

RDF data can easily be put together: merging is simply a matter of combining graph structures that connect at identical nodes. Graphs can be split, separated, distributed, and supplemented in any fashion. Therefore, it makes no difference whether the facts are given in separate files (as above in Examples 3.5, 3.7 and 3.8) or in one file, as in the following example. In the course of this work, many of the examples will not display the whole contents of a knowledge base but will only add some details to existing facts (previous examples). In such cases the dependencies will be mentioned.

Example 3.9 (Examples 3.5, 3.7 and 3.8 in one file)

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix family: <http://family.org#> .
@prefix : <http://example.org#> .

:Peter family:hasMother :Susan .
:Susan rdf:type family:Mother .

family:hasMother rdf:type rdf:Property ;
    rdfs:range family:Mother ;
    rdfs:domain family:Child .
family:Mother rdfs:subClassOf family:Person .
family:Child rdfs:subClassOf family:Person .

Note that the explicit rdf:type information for :Susan is redundant, as it can be derived from (:Peter family:hasMother :Susan) together with (family:hasMother rdfs:range family:Mother).
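Merging and the redundancy of the explicit type statement can both be checked with a small Python sketch. It assumes that each example is represented as a set of triples (tuples of prefixed-name strings), so that merging becomes set union; the helper range_inferences is hypothetical and assumes at most one range per property.

```python
TYPE, RANGE = "rdf:type", "rdfs:range"
SUSAN_IS_MOTHER = (":Susan", TYPE, "family:Mother")

example_3_5 = {
    (":Peter", "family:hasMother", ":Susan"),
    SUSAN_IS_MOTHER,
}
example_3_7 = {
    ("family:hasMother", TYPE, "rdf:Property"),
    ("family:hasMother", RANGE, "family:Mother"),
    ("family:hasMother", "rdfs:domain", "family:Child"),
}
example_3_8 = {
    ("family:Mother", "rdfs:subClassOf", "family:Person"),
    ("family:Child", "rdfs:subClassOf", "family:Person"),
}

# Merging graphs is set union: identical nodes are identical strings.
merged = example_3_5 | example_3_7 | example_3_8


def range_inferences(triples):
    """Derive (object rdf:type Class) from rdfs:range statements."""
    ranges = {s: o for (s, p, o) in triples if p == RANGE}
    return {(o, TYPE, ranges[p]) for (s, p, o) in triples if p in ranges}


# Drop the explicit statement and check that it can still be derived.
without_explicit_type = merged - {SUSAN_IS_MOTHER}
print(len(merged))
print(SUSAN_IS_MOTHER in range_inferences(without_explicit_type))
```

Because the union of sets is commutative and idempotent, the order in which the files are combined, or combining one of them twice, does not change the resulting graph.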

Figure 3.3: Graph Structure of an RDF/RDFS Knowledge Base. [Diagram not reproduced. TBox: family:Child, family:Mother, family:Person, family:hasMother and rdf:Property, connected by rdfs:subClassOf, rdfs:domain, rdfs:range and rdf:type edges. ABox: :Peter and :Susan, connected by a family:hasMother edge and linked to the TBox by rdf:type edges.]

The data from Example 3.9 is presented as a graph in Figure 3.3. The upper part of the diagram shows the terminological knowledge, which is here called a TBox in analogy to Description Logic knowledge bases (see Section 2.2 for a formal introduction to Description Logics). The TBox is comparable to the schema information in a relational database. The lower part contains the assertional knowledge (ABox), to which all information about instantiations of concepts belongs. The notion of an ABox is again a reference to Description Logics. Note that this separation of TBox and ABox is only theoretical. In most knowledge bases all information is contained in the same graph. In Figure 3.3, the gray edges represent explicit knowledge and the red edges implicit knowledge. Implicit knowledge can be derived by a reasoning engine. It is not contained in the knowledge base as actual statements, but it can be obtained by queries to the reasoning engine. Queries over Semantic Web data are discussed next.
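The distinction between explicit and implicit knowledge can be illustrated with a naive forward-chaining sketch in Python. It is an assumption-laden simplification: only three rules (domain, range and subclass membership) are applied, whereas full RDFS entailment defines more, and a real reasoning engine would be far more efficient. The names step, closure and status are hypothetical.

```python
TYPE, DOMAIN, RANGE, SUBCLASS = "rdf:type", "rdfs:domain", "rdfs:range", "rdfs:subClassOf"

# Explicit statements of Example 3.9.
explicit = {
    (":Peter", "family:hasMother", ":Susan"),
    (":Susan", TYPE, "family:Mother"),
    ("family:hasMother", TYPE, "rdf:Property"),
    ("family:hasMother", RANGE, "family:Mother"),
    ("family:hasMother", DOMAIN, "family:Child"),
    ("family:Mother", SUBCLASS, "family:Person"),
    ("family:Child", SUBCLASS, "family:Person"),
}


def step(triples):
    """Apply the three rules once to every pair of triples."""
    new = set()
    for (s, p, o) in triples:
        for (s2, p2, o2) in triples:
            if p2 == DOMAIN and s2 == p:
                new.add((s, TYPE, o2))   # subject belongs to the domain class
            elif p2 == RANGE and s2 == p:
                new.add((o, TYPE, o2))   # object belongs to the range class
            elif p == TYPE and p2 == SUBCLASS and s2 == o:
                new.add((s, TYPE, o2))   # instance of a subclass
    return new


def closure(triples):
    """Repeat the rules until no new statement is derived (fixpoint)."""
    closed = set(triples)
    while True:
        new = step(closed) - closed
        if not new:
            return closed
        closed |= new


def status(triple, explicit):
    """Classify a statement as explicit, implicit or unknown."""
    if triple in explicit:
        return "explicit"
    return "implicit" if triple in closure(explicit) else "unknown"


implicit = closure(explicit) - explicit
for triple in sorted(implicit):
    print(triple)
print(status((":Peter", TYPE, "family:Person"), explicit))
```

In this sketch the implicit statements exist only in the computed closure; the explicit set is never modified, which mirrors the idea that derived knowledge is obtained by asking the reasoner rather than by storing it.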