4.4 A Unified Querying Formalism for Semantic Web Services

4.4.1 Design of a Unified Querying Formalism for Semantic Web Services

In addition to the requirements addressed in Section 4.2, which focus on the capabilities and integration of a querying approach for SWS, the query language itself has to fulfill different criteria [1, 38]:

Expressive power: A subset of the requirements that have often been applied to RDF query languages can also be adapted to other query languages [38].

This subset comprises:

• An appropriate, but also powerful expression syntax for navigating the component hierarchies, which are contained in the data.

• Functionality for querying the components and their attributes.

• Support for data types and a comparison of values.

• Capability to deal with optional parameters.

Schema awareness: A query language should be schema-aware, i.e., aware of the data model a query refers to. When queries are defined, this allows the underlying schema to be exploited for different purposes, e.g., for type checking or optimization. In addition, this structure consciousness is closely tied to the awareness of the underlying semantics and to the requirement for expressive power.

Semantics: Direct consequences of imprecise semantics are ambiguity and misunderstandings. Therefore, a formal description of the semantics of the query formalism is required.

Program manipulation: In order to allow for the manipulation (e.g., translation) of queries, the query language has to be machine-processable, i.e., its syntax and semantics must be simple and unambiguous enough to permit automated processing of a query.

However, a trade-off exists between user-friendliness and simple parsing, which also has to be considered. Hence, a simple and convenient structure has to be established while avoiding redundancy and sustaining readability.

Compositionality: This requirement addresses the decomposition potential of queries, so that larger queries can be split into a number of smaller queries. For this, the query language must permit the output of one query to be reused as the input of another query.

These requirements show that it is necessary to address two different aspects in order to enable a query formalism for SWS: First, the requirement for schema awareness shows the necessity to provide a data model the queries will be based on. Second, the query language itself has to be modeled in terms of its syntax, semantics, and expressive power in general. Compositionality is not explicitly regarded in the solution at hand, but could be addressed in further versions in order to optimize queries.

The solution at hand (cp. Section 4.4.2) exploits SPARQL as the foundation for a sophisticated SWS query formalism. Hence, a number of the abovementioned criteria are to some extent adopted from SPARQL, especially the expressive power. Nevertheless, the expressive power of SWS2QL differs from that of SPARQL, as the two address different data models. Accordingly, we will highlight the corresponding characteristics of SWS2QL.

In the following, we present the two models that provide the foundation of SWS2QL: the data model and the abstract query model.

Data Model

[Figure 4.5: Abstract Data Model for Services. Components shown: ServiceBinding, Service, Interface, Operation, Input, Output, Precondition, Effect, Provider. Attributes shown: accessURI, name, concept, description, address, phone, URL. Associations carry the multiplicities 1, 0..1, and *.]

A data model for SWS queries describes the components of a service as well as their properties and relationships to other service components. In Chapter 3, ATSM has been introduced, which is a common model for SAWSDL and OWL-S (cp. Appendix A.1.3). Hence, it already comprises the most important service components applied in service discovery but is restricted to those components usually needed in matchmaking. For its usage as foundation for query formulation, it needs to be slightly enhanced:

First of all, service requests are often stated by describing the “perfect” service as an example (cp. Section 4.1). In general, the creation of such a query by example will probably be restricted to more sophisticated service requesters. Nevertheless, the opportunity to specify an already known service in order to search for similar services (e.g., as a substitute for a given service) has to be considered within the query model. Therefore, the ServiceBinding component is also adopted into the model in order to be able to specify the access URI of the respective service. Furthermore, a service requester may also be interested in services offered by a specific service provider. This information is also provided by common registry standards. Therefore, the data model also incorporates a Provider component, even though it is not part of the ATSM service description. In addition, the structure of input and output parameters differs slightly between ATSM and the data model at hand: While ATSM differentiates between inputs/outputs (or message references, respectively; cp. Appendix A.1.3), in SWS2QL input and output parameters are direct subelements of an operation. Since the research community is still debating a common format for preconditions and effects in SWS formalisms such as SAWSDL and OWL-S, we include these elements in our abstract data model for services, but do not define any attributes for them.

Figure 4.5 shows an overview of the resulting service model. In general, the data model makes no claim to be complete, but aims at providing a self-contained, lean data model or schema capturing the most relevant information of a service, which can be individually customized. This data model is applied as an example and could be replaced by another model; of course, this would make it necessary to change the applied query syntax (see below).
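To make the structure of the data model more tangible, the following is a minimal Python sketch of its components. It is an illustration under assumptions, not the schema used by SWS2QL: the class and field names are hypothetical, the assignment of attributes to components is inferred from the labels of Figure 4.5, multiplicities are only approximated by lists and optional fields, and attaching preconditions and effects to an operation is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Parameter:
    """An input or output parameter; a direct subelement of an operation."""
    name: str
    concept: Optional[str] = None  # URI of the annotating ontology concept


@dataclass
class Operation:
    name: str
    concept: Optional[str] = None
    inputs: List[Parameter] = field(default_factory=list)
    outputs: List[Parameter] = field(default_factory=list)
    # Preconditions and effects exist in the model but carry no attributes.
    # Placing them on the operation is an assumption of this sketch.
    preconditions: List[object] = field(default_factory=list)
    effects: List[object] = field(default_factory=list)


@dataclass
class Interface:
    name: str
    concept: Optional[str] = None
    operations: List[Operation] = field(default_factory=list)


@dataclass
class Provider:
    name: str
    address: Optional[str] = None
    phone: Optional[str] = None
    url: Optional[str] = None


@dataclass
class ServiceBinding:
    access_uri: str


@dataclass
class Service:
    name: str
    description: Optional[str] = None
    interfaces: List[Interface] = field(default_factory=list)
    binding: Optional[ServiceBinding] = None
    provider: Optional[Provider] = None


# Hypothetical example instance; all names and URIs are made up.
example = Service(
    name="BookFinder",
    description="Finds books by title",
    interfaces=[
        Interface(
            name="SearchInterface",
            operations=[
                Operation(
                    name="findBook",
                    inputs=[Parameter("title", "ex:Title")],
                    outputs=[Parameter("book", "ex:Book")],
                )
            ],
        )
    ],
    binding=ServiceBinding("http://example.org/bookfinder"),
    provider=Provider(name="Example Corp"),
)
```

Replacing this model with another one, as mentioned above, would amount to exchanging these classes and adapting the query syntax that refers to them.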

Abstract Query Model

[Figure 4.6: Overview of Query Types. The query type is divided into syntactic types (wildcard, table) and semantic types (fullservice, lightservice).]

Based on the data model provided, it is possible to define an abstract model for queries, which provides the foundation to exploit and enhance an existing query syntax. In general, queries based on common query languages (e.g., SQL or SPARQL) encompass different sections (or components), which can be summarized as a result parameter component, a query statement component, and a component to specify solution modifiers for altering the respective result set [39]. Regarding SPARQL, these different components would be a pattern matching part, solution modifiers, and output. The latter is the specification of the variables of interest [39]. Since the first and the last component both refer to the results of a query, the enumeration can be shortened to a query statement component and a result component. The structure of both parts is discussed in the following.

Before elaborating on the semantics of the query statement component, it is necessary to address common use cases with respect to query types in order to derive a generic query statement structure.

For this, four basic query types are used, which are derived from the discussion of common approaches to query formulation in Sections 2.3.1 and 4.1. The query types are depicted in Figure 4.6. The first two query types are subtypes of syntactic, keyword-based approaches, while the last two types represent semantic-based approaches. In detail, wildcard search allows the user to define single terms as well as placeholders, while table-based search additionally permits stating attribute-value pairs [127]. The fullservice approach is used to refer to a complete service description representing the service request, i.e., a “query by example”, while the lightservice approach allows for the specification of arbitrary excerpts of a complete service description according to the needs of a service requester. For example, a fullservice query can be established by specifying the access URI of a service, so that the corresponding service description can be retrieved. In contrast, a lightservice query can be used to specify the properties of arbitrary elements like single interfaces or single operations combined with the input and output parameters a service offer has to provide.

As already mentioned in Section 4.2, a flexible query mechanism should also address the dynamic selection of different matchmakers controlled by the service requester, so that appropriate matchmakers can be applied for specific matching purposes. In doing so, the support for a comparison of values is enhanced or rather generalized, which increases the expressive power of the query language (see above).

Within the work at hand, the four query types serve as examples that refer to different matchmakers. For instance, in order to process the information based on one of the keyword-based approaches, the default registry capabilities could be applied, while the semantic-based approaches could be directed to a custom semantic matchmaker. Of course, it is also possible to integrate other matchmakers into the service registry and reference them in a query.
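The association between query types and matchmakers described above can be sketched as a simple lookup table. The following Python fragment is only an illustration: the function names, the toy service records, and the one-entry concept hierarchy are hypothetical, the keyword matchmaker stands in for the default registry capabilities by using shell-style wildcards, and the semantic matchmaker is a deliberately naive subclass check rather than a real reasoner.

```python
import fnmatch
from typing import Callable, Dict, List, Optional

# Toy service records and concept hierarchy (made up for this sketch).
SERVICES = [
    {"name": "BookFinder", "concept": "ex:Book"},
    {"name": "NovelSearch", "concept": "ex:Novel"},
    {"name": "WeatherInfo", "concept": "ex:Weather"},
]
SUBCLASS_OF = {"ex:Novel": "ex:Book"}  # ex:Novel is a subclass of ex:Book


def keyword_matchmaker(pattern: str, services: List[dict]) -> List[str]:
    """Stand-in for registry search: case-insensitive wildcard match on names."""
    return [
        s["name"]
        for s in services
        if fnmatch.fnmatchcase(s["name"].lower(), pattern.lower())
    ]


def semantic_matchmaker(concept: str, services: List[dict]) -> List[str]:
    """Naive semantic match: the service concept equals or specializes `concept`."""
    def is_a(candidate: Optional[str]) -> bool:
        while candidate is not None:
            if candidate == concept:
                return True
            candidate = SUBCLASS_OF.get(candidate)
        return False

    return [s["name"] for s in services if is_a(s["concept"])]


# Each query type is routed to one matchmaker; further matchmakers could be
# registered by adding entries to this table.
MATCHMAKERS: Dict[str, Callable[[str, List[dict]], List[str]]] = {
    "wildcard": keyword_matchmaker,
    "table": keyword_matchmaker,
    "fullservice": semantic_matchmaker,
    "lightservice": semantic_matchmaker,
}


def dispatch(query_type: str, argument: str, services: List[dict]) -> List[str]:
    """Forward a query argument to the matchmaker registered for its type."""
    if query_type not in MATCHMAKERS:
        raise ValueError(f"no matchmaker registered for query type {query_type!r}")
    return MATCHMAKERS[query_type](argument, services)


keyword_hits = dispatch("wildcard", "book*", SERVICES)
semantic_hits = dispatch("lightservice", "ex:Book", SERVICES)
```

In this toy setting the wildcard query finds only the service whose name starts with "book", whereas the semantic query also returns the service annotated with the more specific concept.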

In order to be able to associate the respective statements of a query with their corresponding matchmakers, query sections are introduced, each representing an instance of a specific query type. Besides the definition of multiple query sections referring to different query types within a service request, the combination (e.g., conjunction, inclusive or exclusive disjunction) of these sections also needs to be addressed within a query. For this, query sections have to be organized in some structure, i.e., a global query container, indicating the type of combination. The capabilities of query sections address the requirement of compositionality (see above) of a query language, since single query sections can be specified or several query sections can be combined into a complex query construct.

[Figure 4.7: Abstract Query Model. Elements shown: serviceQuery, queryContainer, querySection, element, any, resultSection, attribute.]

The actual query parameters are specified within a query in terms of query statements using the respective query syntax. They commonly refer to specific attributes of a component of the underlying data model, which also addresses the expressive power of a query language. In principle, the values of properties can be expressed using arbitrary data types. Nevertheless, a specific data type needs to be defined for each property in order to establish a machine-processable request.

In addition to the criteria of the request, the result parameters and the properties of the result set need to be specified within a query. The result parameters specify the desired objects to be returned, and the properties may refer to different aspects of the result set (e.g., the maximum number of results, sorting criteria). Both parts can be defined within the result section. Based on the components of a unified service query, an abstract query model is introduced, which serves as a blueprint for service queries. Its overall structure is depicted in Figure 4.7. Basically, service queries based on the model comprise a global query container and a result section. The global query container may contain either further query containers or query sections. In the latter case, the query sections are interconnected using a single combination type (AND, OR, XOR) defined in the global query container, whereas in the former case, a nested structure of query containers is established in order to organize the query sections using arbitrary combination types.

Within a query section, an arbitrary number of elements of any type can be stated, representing the arguments of a query. Regarding the combination of the query elements within a single query section, a conjunction (i.e., AND) is assumed for two reasons: On the one hand, a query section can be considered a semantic unit, where all elements belong to the same parent element (e.g., all input and output parameters defined within a query section belong to the same operation). On the other hand, the introduction of additional combination types within a single query section would increase complexity and would therefore not be very user-friendly.
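The combination rules described above can be illustrated with a small evaluation sketch in Python. It is not the XSD-based model of Appendix D.2: the class names, the representation of query elements as predicates over a flat service record, and the reading of XOR as "exactly one child matches" are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union

ServiceRecord = Dict[str, str]
Element = Callable[[ServiceRecord], bool]


@dataclass
class QuerySection:
    """An instance of one query type; its elements are implicitly AND-combined."""
    query_type: str
    elements: List[Element]

    def matches(self, service: ServiceRecord) -> bool:
        return all(element(service) for element in self.elements)


@dataclass
class QueryContainer:
    """Combines query sections, or nested containers, with one combination type."""
    combination: str  # "AND", "OR", or "XOR"
    children: List[Union["QueryContainer", QuerySection]]

    def matches(self, service: ServiceRecord) -> bool:
        results = [child.matches(service) for child in self.children]
        if self.combination == "AND":
            return all(results)
        if self.combination == "OR":
            return any(results)
        if self.combination == "XOR":
            # Assumption: exclusive disjunction means exactly one child matches.
            return sum(results) == 1
        raise ValueError(f"unknown combination type {self.combination!r}")


# Two hypothetical sections: a keyword-style one and a lightservice-style one.
by_name = QuerySection("wildcard", [lambda s: s["name"].startswith("Book")])
by_io = QuerySection(
    "lightservice",
    [lambda s: s["input"] == "ex:Title", lambda s: s["output"] == "ex:Book"],
)

book_finder = {"name": "BookFinder", "input": "ex:Title", "output": "ex:Book"}
novel_search = {"name": "NovelSearch", "input": "ex:Title", "output": "ex:Book"}

either = QueryContainer("OR", [by_name, by_io])
exactly_one = QueryContainer("XOR", [by_name, by_io])
# Nesting containers allows different combination types within one query.
nested = QueryContainer("AND", [either, exactly_one])
```

Here `book_finder` satisfies both sections and is therefore rejected by the XOR container, while `novel_search` satisfies only the second section and passes the nested query.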

In general, thresholds are defined in order to narrow the potential result set (cp. Section 4.2). A service requester can achieve this by prescribing a minimum level of similarity with respect to the properties or even the whole service offer. For this, similarity thresholds can be defined both for each element within a query section and for the whole service. The actual determination of similarities is accomplished by the responsible matchmaker; thus, the data type of the threshold value depends on the selected matchmaker. Therefore, threshold values are not restricted to numerical values but can also be expressed in terms of text-based constants. This allows for the utilization of similarity functions from the field of IR to determine the syntax-based similarity of terms as well as the application of reasoning techniques by specifying DoM values [125, 211].
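A minimal Python sketch of how numerical and text-based thresholds might be checked side by side is given below. The function name is hypothetical, and the concrete DoM constants and their ordering ("fail" < "subsumes" < "plugin" < "exact") are an assumption borrowed from common matchmaking literature; the source does not prescribe them, and a real matchmaker would define its own scale.

```python
from typing import List, Union

# Assumed degree-of-match scale, ordered from worst to best (illustrative only).
DOM_ORDER: List[str] = ["fail", "subsumes", "plugin", "exact"]

Similarity = Union[float, str]


def meets_threshold(similarity: Similarity, threshold: Similarity) -> bool:
    """Return True if `similarity` reaches at least `threshold`.

    Numerical values are compared directly; text-based constants are compared
    by their position in DOM_ORDER. Mixing both scales is rejected.
    """
    if isinstance(similarity, str) != isinstance(threshold, str):
        raise TypeError("similarity and threshold must use the same scale")
    if isinstance(threshold, str):
        return DOM_ORDER.index(similarity) >= DOM_ORDER.index(threshold)
    return similarity >= threshold


numeric_ok = meets_threshold(0.8, 0.75)          # IR-style similarity score
dom_ok = meets_threshold("exact", "plugin")      # reasoning-based DoM value
dom_too_weak = meets_threshold("subsumes", "plugin")
```

The same check could be applied per element of a query section or once for the whole service, depending on where the threshold was declared.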

Concerning the result section, an arbitrary number of attributes can be specified, which refer to the desired result parameters to be returned. In addition, properties of the result set can be defined within this section (e.g., a descending or ascending order of the results with respect to some attribute of the result set).

Besides the result set modifiers, the attributes which should be contained within the result set are also defined (e.g., the name of the service or the provider name). One of these attributes can also be used as a sorting criterion by specifying its ID as the value of an orderBy attribute. The abstract query model is defined using an XSD document, which is presented in Appendix D.2.
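The role of the result section can be sketched in Python as a projection with optional sorting and truncation. This is an illustration only: the class, its field names, and the sample records are hypothetical and do not reproduce the XSD of Appendix D.2; the check that the sorting criterion is one of the requested attributes follows the description above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ResultSection:
    attributes: List[str]              # IDs of the attributes to return
    order_by: Optional[str] = None     # ID of one requested attribute
    descending: bool = False
    max_results: Optional[int] = None

    def __post_init__(self) -> None:
        if self.order_by is not None and self.order_by not in self.attributes:
            raise ValueError("order_by must refer to one of the result attributes")

    def apply(self, matches: List[Dict[str, object]]) -> List[Dict[str, object]]:
        """Sort, truncate, and project the matched services."""
        rows = list(matches)
        if self.order_by is not None:
            rows.sort(key=lambda row: row[self.order_by], reverse=self.descending)
        if self.max_results is not None:
            rows = rows[: self.max_results]
        return [{attr: row[attr] for attr in self.attributes} for row in rows]


# Hypothetical matches as they might be returned by a matchmaker.
matches = [
    {"serviceName": "NovelSearch", "providerName": "Acme", "similarity": 0.7},
    {"serviceName": "BookFinder", "providerName": "Example Corp", "similarity": 0.9},
    {"serviceName": "ComicLookup", "providerName": "Acme", "similarity": 0.4},
]

section = ResultSection(
    attributes=["serviceName", "providerName"],
    order_by="serviceName",
    max_results=2,
)
result = section.apply(matches)
```

With these sample records, `result` contains the entries for BookFinder and ComicLookup, in that order, restricted to the two requested attributes.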