
Lecture Notes in Computer Science 1

OntoSelect: Towards the Integration of an Ontology Library, Ontology Selection and Knowledge Markup

Paul Buitelaar DFKI Language Technology

Stuhlsatzenhausweg 3, D-66123 Saarbrücken, Germany

paulb@dfki.de

A central task in the Semantic Web effort is the annotation of data and documents with appropriate semantic information (i.e. knowledge markup or ontology population) derived from one or more ontologies published on the Semantic Web. The added knowledge allows automatic procedures (agents, web services, etc.) to interpret the underlying data and/or documents in a unique, formally specified way, thereby enabling autonomous information processing.

Most of the current work in knowledge markup is concerned with the annotation of concepts relative to a particular ontology that is typically developed specifically for the task at hand. A more realistic approach would be to access an ontology library and to select one or more appropriate ontologies. Although the large-scale development and publishing of ontologies is still at an early stage, many are already available (see e.g. the DAML ontology library¹, the OWL ontology library², or SchemaWeb³). Selecting the most appropriate ontology (or a combination of complementary ontologies) will therefore become an increasingly important subtask of knowledge markup.

Here we present an approach towards integrating the collection and classification of ontologies in a dynamic web-based ontology library, methods for the selection of an ontology from this library, and its use in knowledge markup. Building on the idea of the DAML and SchemaWeb ontology libraries, we aim to take this idea to its logical conclusion through the construction of a fully dynamic ontology library (OntoSelect) that will be updated continuously, organized in a meaningful way, and equipped with automatic support for ontology selection in knowledge markup.

The OntoSelect approach aims to provide an access point for ontologies on any possible topic or domain. Unlike the DAML and SchemaWeb libraries, however, OntoSelect is not based on a static registration of published ontologies. Instead, it includes a dynamic ontology crawling procedure that monitors the web for any newly published ontologies in the representation formats RDF/S, DAML or OWL.

Collected ontologies are analyzed using the OWL API⁴, which allows for the extraction of the structure and content of any RDF/S, DAML or OWL ontology. There are currently around 800 ontologies in the OntoSelect library, covering a wide range of topics and domains. Ontologies are stored in a database and are organized according to: format; ontology, class and property names; and class and property labels. Unfortunately, the assignment of labels is not yet widespread. From the perspective of semantic annotation and knowledge markup, however, this is an important aspect, as automatic annotation or markup of documents crucially depends on the availability of terminology for classes and/or properties.

1 http://www.daml.org/ontologies/

2 http://protege.stanford.edu/plugins/owl/ontologies.html

3 http://www.schemaweb.info/

4 http://owl.man.ac.uk/api.shtml
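The organization of the library by format, names and labels can be illustrated with a small sketch. The paper does not describe the actual database schema, so the table layout, the function names (`build_library`, `add_ontology`, `label_coverage`) and the sample data below are all assumptions made purely for illustration. The sketch also shows how one might measure how many classes and properties of an ontology actually carry a label.

```python
import sqlite3

def build_library(conn):
    # Hypothetical schema: one row per ontology, one row per class or property.
    conn.executescript("""
        CREATE TABLE ontology (
            id INTEGER PRIMARY KEY,
            name TEXT,
            format TEXT
        );
        CREATE TABLE entity (
            ontology_id INTEGER REFERENCES ontology(id),
            kind TEXT CHECK (kind IN ('class', 'property')),
            name TEXT,
            label TEXT
        );
    """)

def add_ontology(conn, name, fmt, entities):
    # entities: iterable of (kind, name, label); label is None if not assigned.
    cur = conn.execute(
        "INSERT INTO ontology (name, format) VALUES (?, ?)", (name, fmt)
    )
    oid = cur.lastrowid
    conn.executemany(
        "INSERT INTO entity VALUES (?, ?, ?, ?)",
        [(oid, kind, ent_name, label) for kind, ent_name, label in entities],
    )
    return oid

def label_coverage(conn, oid):
    # COUNT(label) ignores NULLs, so this is the fraction of labelled entities.
    total, labelled = conn.execute(
        "SELECT COUNT(*), COUNT(label) FROM entity WHERE ontology_id = ?",
        (oid,),
    ).fetchone()
    return labelled / total if total else 0.0

conn = sqlite3.connect(":memory:")
build_library(conn)
wine_id = add_ontology(conn, "wine", "OWL", [
    ("class", "Wine", "wine"),
    ("class", "Winery", None),
    ("property", "hasMaker", "has maker"),
    ("property", "locatedIn", None),
])
print(label_coverage(conn, wine_id))
```

An ontology with a low label coverage offers little terminology to match against text, which is why the availability of labels matters for automatic markup.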

OntoSelect provides a dynamically updated library of ontologies that may be used in a knowledge markup process. However, as the number of published ontologies is increasing rapidly, selecting the most appropriate one(s) is becoming increasingly difficult. To provide semi-automatic support for this, OntoSelect includes functionality for selecting ontologies for a given knowledge markup task, based on the following criteria that address ontology content and structure:

Coverage: How many of the terms in the document collection of the particular knowledge markup task are covered by the classes and properties in the ontology?

Structure: How detailed is the knowledge structure that the ontology represents?

Connectedness: Is the ontology connected to other ontologies and how well established are these?
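The three criteria above can be sketched as simple scoring functions. The paper does not give formulas, so the measures below are assumed proxies rather than the OntoSelect definitions: coverage as the fraction of document terms matched by class/property names or labels, structure as the number of properties per class, and connectedness as the number of library ontologies that refer to the ontologies this one imports. The `OntologySummary` type, the lexicographic ranking and the sample data are likewise hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class OntologySummary:
    name: str
    terms: set            # lower-cased class/property names and labels
    n_classes: int
    n_properties: int
    imports: set = field(default_factory=set)  # names of referenced ontologies

def coverage(doc_terms, onto):
    # Fraction of distinct document terms found among the ontology's terms.
    doc_terms = {t.lower() for t in doc_terms}
    if not doc_terms:
        return 0.0
    return len(doc_terms & onto.terms) / len(doc_terms)

def structure(onto):
    # Assumed proxy for structural detail: properties per class.
    return onto.n_properties / onto.n_classes if onto.n_classes else 0.0

def connectedness(onto, library):
    # Assumed proxy: for each imported ontology, count how many ontologies
    # in the library (including this one) import it.
    return sum(
        sum(1 for other in library if target in other.imports)
        for target in onto.imports
    )

def rank(doc_terms, library):
    # Assumption: coverage first, structure and connectedness as tie-breakers.
    return sorted(
        library,
        key=lambda o: (coverage(doc_terms, o), structure(o), connectedness(o, library)),
        reverse=True,
    )

wine = OntologySummary("wine", {"wine", "grape", "winery", "has maker"}, 3, 1, {"food"})
food = OntologySummary("food", {"food", "meal", "wine"}, 3, 0)
pizza = OntologySummary("pizza", {"pizza", "topping"}, 2, 1, {"food"})
library = [wine, food, pizza]

doc_terms = ["Wine", "Grape", "Cheese", "Winery"]
for onto in rank(doc_terms, library):
    print(onto.name, coverage(doc_terms, onto), connectedness(onto, library))
```

How the three scores should be weighted against each other is left open here; a weighted sum or a user-controlled ordering would fit the semi-automatic setting equally well.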

After selection of an appropriate ontology from the OntoSelect ontology library, a document collection under consideration will be marked up with the knowledge from this ontology. We are currently working towards an instance-based learning approach that considers knowledge markup as a classification task. Classifiers for the knowledge markup process will be generated by collecting occurrences (i.e. linguistic realizations of classes and properties: labels or class-/property-names with their linguistic contexts) from relevant text collections that are to be associated with each of the ontologies in the OntoSelect library.
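To make the idea of markup as instance-based classification concrete, the following is a minimal sketch of a k-nearest-neighbour classifier over linguistic contexts. It is not the classifier under development in OntoSelect: the bag-of-words context representation, the cosine similarity, the majority vote, the class names and the training sentences are all assumptions chosen for illustration.

```python
import math
from collections import Counter

def context_vector(tokens):
    # Bag-of-words representation of the context around an occurrence.
    return Counter(t.lower() for t in tokens)

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def classify(tokens, instances, k=3):
    # instances: list of (context_vector, ontology_class) pairs collected
    # from text; the k most similar stored contexts vote on the class.
    query = context_vector(tokens)
    nearest = sorted(
        instances, key=lambda inst: cosine(query, inst[0]), reverse=True
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical occurrences of the label "bank" for two different classes.
instances = [
    (context_vector("the bank approved the loan".split()), "FinancialInstitution"),
    (context_vector("the bank raised interest rates".split()), "FinancialInstitution"),
    (context_vector("we walked along the river bank".split()), "RiverBank"),
]

print(classify("the bank offers a loan".split(), instances))
print(classify("a walk along the river".split(), instances, k=1))
```

Because the stored instances keep their contexts, the same label can be assigned to different classes depending on the surrounding words, which is one way to approach label ambiguity.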

A central problem to be addressed here is the extraction of relevant terms in text and their appropriate classification by the constructed classifier. Additional problems to be addressed include multilinguality (e.g. the use of an English-based ontology in knowledge markup of German documents) and ambiguity (e.g. multiple definitions of the same concept in several ontologies, or multiple uses of the same label for different concepts within one ontology).

Acknowledgements

This research has been supported by research grants for the projects VIeWs (by the Saarland Ministry of Economic Affairs) and SmartWeb (by the German Ministry of Education and Research: 01 IMD01 A). Thanks to Thomas Eigner and Srikanth Ramaka for their work on the OntoSelect framework.
