
Definitions and Related Work

2.3 Related Research

Publications on knowledge processing systems mainly contain big-picture views of the system components and do not provide enough details, or best practices, to be able to reimplement the system. Examples are Ginsawat et al. [47], Guetat et al. [49], Gu et al. [48], or Han et al. [50].

The common literature on knowledge based systems (Akerkar and Sajja [3], Beierle and Kern-Isberner [15], Russell and Norvig [113], or Wunderlich [137]) gives an overview of the architecture of knowledge based systems, as already depicted in figure 2.1. This is only a high-level view of the basic components and is not suitable for implementing a custom system.

Another problematic aspect is that most modern literature explains the development of knowledge based systems using the Prolog language. This makes it difficult to integrate knowledge processing into existing enterprise applications, because none of these is implemented in Prolog.

2.3.1 Search Methodology

To identify literature on software architecture for knowledge processing systems, parts of methodologies like mapping study / scoping review and structured literature review (see Petersen et al. [105] and Kitchenham [66]) were applied.

Knowledge based systems have been around for a long time (for example, MYCIN [117] since 1976). Research in this domain is still ongoing and has resulted in a large number of publications. These deal with many topics related to knowledge processing systems: knowledge acquisition, representation, storage, processing, and so on. The literature review should answer the following two questions:

1. Does documentation of software engineering techniques for the implementation of knowledge processing systems exist?

2. Do knowledge processing system descriptions reference a particular architecture documentation?

Searching online databases like Google Scholar, IEEE, Springer and ACM did not yield any resources beyond those listed in this section, even though the publications are considerably old. This is not a sign of an outdated and irrelevant research topic. The design, implementation and application of knowledge based systems is still relevant, but research on best practices for implementation is missing. The focus has been on the algorithms themselves.

Several difficulties exist in finding related work to answer the research questions. To overcome these, the following restrictions were applied:

1. Numerous terms: Over time, many terms for knowledge processing systems have appeared (see Avram [8]). In searching for related work, the following terms were considered:

• Expert System

• Knowledge Processing System

• Rule Based System

• Knowledge System

• Knowledge Based System

2. Large number of publications: Reading every publication found with the search terms listed above is impossible, or would at least take an unacceptable amount of time. Therefore, additional search terms were used in combination with the others:

• Software Architecture

• Software Engineering

• Design Patterns

• Implementation

3. Different interpretations of terms: The terms architecture, engineering and implementation also lead to publications on the design of rule sets, knowledge representation and the introduction of knowledge based systems in companies. These publications do not contain any information on how to design and implement the actual software architecture. Therefore, these were excluded.
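The combination of search terms described in restrictions 1 and 2 can be sketched as follows. This is only an illustration, assuming a plain Boolean AND of one system term and one restricting term; the helper name `build_queries` and the quoting style are assumptions, and the actual query syntax differs between the databases searched.

```python
from itertools import product

# Terms for knowledge processing systems (restriction 1).
SYSTEM_TERMS = [
    "Expert System",
    "Knowledge Processing System",
    "Rule Based System",
    "Knowledge System",
    "Knowledge Based System",
]

# Additional terms used to narrow the result set (restriction 2).
RESTRICTING_TERMS = [
    "Software Architecture",
    "Software Engineering",
    "Design Patterns",
    "Implementation",
]


def build_queries(system_terms, restricting_terms):
    """Pair every system term with every restricting term (hypothetical helper)."""
    return [
        f'"{system}" AND "{restrict}"'
        for system, restrict in product(system_terms, restricting_terms)
    ]


queries = build_queries(SYSTEM_TERMS, RESTRICTING_TERMS)
```

Restriction 3 is not covered by this sketch, because excluding publications that use the terms in a different sense requires reading at least their abstracts.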

After the identification of relevant papers, their references were checked for further publications related to the topic. Publications citing the identified papers were checked as well.

Because this literature review intends to show the current state of the art of knowledge processing system design and implementation, the publications were not restricted to any specific type or any specific journal or conference.

The main criterion for selection was that the software architecture (in the sense defined by Bass et al. [11], or Starke [119]) has to be mentioned in some way.

2.3.2 Knowledge Based System Architecture

Searching for publications on the design and architecture of knowledge processing systems mainly leads to papers on knowledge acquisition and design, providing no details on their internal structure (for example, Ogu et al. [98], Gu et al. [48], or Ginsawat et al. [47]).

This section contains an overview of published architectures of knowledge based systems, showing the lack of details provided for the development of such systems.

Ginsawat et al. [47] describe a knowledge management system software architecture for the domain of "software maintenance". They propose a rough separation of the system into so-called modules, such as a user interface module, an application module and a data access module. These can be compared to the layers of a layered architecture. The user interface module provides the primary interface for the user to interact with the system. The application module contains the main functionality of the system, and the data access module focuses on methods for storing and retrieving data to and from a storage system. The concrete implementation is done in the form of a set of Infrastructure Services (storage and communication), Knowledge Services (knowledge creation, sharing and reuse) and Presentation Services (personalization and visualization of the knowledge). The software is designed as a single application running on a web server with a relational database for storing the data.
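The three-module separation described above can be illustrated with a minimal sketch. It is not the implementation of Ginsawat et al. [47]: all class and method names are hypothetical, and a dictionary stands in for the relational database. The point shown is only that each layer talks to the layer directly below it.

```python
class DataAccessModule:
    """Stores and retrieves entries; a dict stands in for the relational database."""

    def __init__(self):
        self._entries = {}

    def save(self, key, text):
        self._entries[key] = text

    def load(self, key):
        return self._entries.get(key)


class ApplicationModule:
    """Holds the main functionality and talks only to the data access layer."""

    def __init__(self, data_access):
        self._data_access = data_access

    def add_knowledge(self, topic, text):
        # Normalize the input before it reaches the storage layer.
        self._data_access.save(topic.lower(), text.strip())

    def find_knowledge(self, topic):
        return self._data_access.load(topic.lower())


class UserInterfaceModule:
    """Entry point for the user; talks only to the application layer."""

    def __init__(self, application):
        self._application = application

    def show(self, topic):
        text = self._application.find_knowledge(topic)
        return text if text is not None else f"No entry for '{topic}'"


data_access = DataAccessModule()
application = ApplicationModule(data_access)
ui = UserInterfaceModule(application)
application.add_knowledge("Patching", " Apply hotfixes in the staging system first. ")
```

Because the user interface never touches the storage directly, the dictionary could be replaced by a real database without changing the upper two modules.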

Guetat et al. [49] combine the different layers (Presentation, Business Logic, Data) of a layered software architecture with the Model View Controller (MVC) pattern to support the requirements of "information system urbanization". They present a 5+1 layer concept consisting of the Interface, Navigation, Orchestration, Choreography, Services and Data Access layers. This architecture combines the layered architecture with extra extensibility, provided by the Orchestration and Choreography layers.

Maier and Sametinger [77] describe a knowledge management system software architecture that uses peer-to-peer technology. Knowledge is not only stored in a central repository, but is separated into three different workspaces: private, protected and public. This allows sharing parts of the knowledge while keeping other parts private.

The architecture described in Gu et al. [48] is based on the domain of Medical Genetic Testing. They describe the main components required by a knowledge management system (workflow management system, search engine, groupware) and use technologies like BPEL (Business Process Execution Language) as the programming language, RDF (Resource Description Framework) / WSDL (Web Service Description Language) and OWL (Web Ontology Language) for search (including database search) and the integration of a groupware system. Nevertheless, no specific remarks on the layering or non-functional requirements of the software system architecture are made.

A common architecture of a knowledge based system is described by Oberortner et al. [97]. It is separated into an inference engine and a knowledge base. The access to the system happens either directly via a graphical user interface, or via any remote communication interface (e.g., web service, or RPC).
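The separation into a knowledge base and an inference engine can be illustrated with a minimal sketch. The class names, the example facts and the choice of naive forward chaining are assumptions made for illustration; they are not taken from Oberortner et al. [97], and the user interface or remote access layer is omitted.

```python
class KnowledgeBase:
    """Holds facts and if-then rules (premises -> conclusion)."""

    def __init__(self):
        self.facts = set()
        self.rules = []  # list of (frozenset of premises, conclusion)

    def add_fact(self, fact):
        self.facts.add(fact)

    def add_rule(self, premises, conclusion):
        self.rules.append((frozenset(premises), conclusion))


class InferenceEngine:
    """Naive forward chaining: apply rules until no new fact is derived."""

    def __init__(self, knowledge_base):
        self.kb = knowledge_base

    def run(self):
        changed = True
        while changed:
            changed = False
            for premises, conclusion in self.kb.rules:
                # A rule fires when all premises are known facts.
                if premises <= self.kb.facts and conclusion not in self.kb.facts:
                    self.kb.facts.add(conclusion)
                    changed = True
        return self.kb.facts


kb = KnowledgeBase()
kb.add_fact("has_fever")
kb.add_fact("has_cough")
kb.add_rule(["has_fever", "has_cough"], "suspect_flu")
kb.add_rule(["suspect_flu"], "recommend_rest")
derived = InferenceEngine(kb).run()
```

The engine contains no domain knowledge of its own, so the same class can be reused with a different knowledge base.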

The system software architecture in Ma and Hemmje [50] uses a modular approach. The T-KM module handles tacit knowledge (knowledge that is hard to express explicitly), the E-KM module handles explicit knowledge (easy to store) and the I-E-KM module is responsible for collecting new knowledge (implicit knowledge). No specific remarks were made about the design of the modules themselves.

Hüttenegger [58] describes three main building blocks of a Knowledge Management System Software Architecture. The first block is a virtual information pool where all sorts of knowledge are stored. The second block is a central user interface that provides global access to the knowledge base. The third building block consists of automatic updating / extending methods for the knowledge base by using common techniques from the area of data mining and artificial intelligence.

A knowledge processing system for an E-Librarian Service is described by Linckels and Meinel [75]. It is based on web ontologies and describes the implementation by using a simple layered architecture. It consists of a knowledge, inference, communication and presentation layer.

Some descriptions of knowledge processing system architectures can be found. A few of them describe details in the form of UML class diagrams, like Chattratichat et al. [23] and Kouamou [68]. They describe how to integrate data mining algorithms, mainly provided by the WEKA framework³, and how to design the software architecture to keep it flexible.

Others define architectures of such systems on a high level. The central components are shown, but not how they communicate with each other or how their internals have to be implemented. These architectures mainly rely on a "layered" architecture pattern and just describe where the various components of a knowledge processing system should be placed. No inference can be made about how the parts can be integrated or what patterns should be used, and no details about components and connectors exist. Some of these architectures are described in Ginsawat et al. [47], Guetat et al. [49], Maier and Sametinger [77], Gu et al. [48], Ma and Hemmje [50] and Hüttenegger [58].

Even though there are a few detailed descriptions of components for knowledge based systems in the form of design patterns, these are not enough to design and implement a complete knowledge based system / framework. The proposed architectures in literature are either very high-level or handle only single specific parts of the system (for example, inference algorithm or storage of concrete types of knowledge).

Besides a long list of advantages of knowledge processing systems, Abdullah et al. [120] already mentioned the importance of integrating these systems into enterprise applications in 2006: “Furthermore, while traditionally knowledge systems have been designed as standalone applications, today they are becoming a part of an enterprise’s information system. While once they were a research laboratory technology, now they are commercial applications (see, for example, Liebowitz [74]).”

According to Google Scholar, this work has been cited 57 times (at the time of writing). None of these citing publications deals with the issues of developing (in the sense of "programming") or integrating such systems. The importance and usage of knowledge based systems is also underlined in surveys on their application domains by Pannu [102], Jabbar and Khan [59], Sharma and Jain [116], Avram [8] and also Abdullah et al. [120].

³ http://www.cs.waikato.ac.nz/ml/weka/

2.3.3 Design Patterns

One of the most famous resources on software design patterns for object-oriented programming is the book of the “Gang-of-Four”. It was an early approach to bring the idea of patterns as a solution to recurring problems, originally introduced by Alexander [4], to the domain of object-oriented software development. Another famous book on application architecture patterns was written by Fowler [43]. Many other books on architecture styles and patterns exist, often only describing the most common ones (e.g., Pomberger and Pree [107], Starke [119] or Posch et al. [108], to name just a few). A collection of architecture and design patterns can also be found in the publication of Avgeriou and Zdun [7].

Another famous example of literature on design patterns is the series “Pattern-Oriented Software Architecture”, consisting of five volumes, starting with "A System of Patterns" [22] and finishing with "On Patterns and Pattern Languages" [21].

2.3.4 Design Patterns for Knowledge Processing Systems

Some authors have already picked up the idea of using design patterns to document best practices:

Lalanda [72] described the application of design patterns for the development of multi-expert systems, based on the Blackboard design pattern (see Buschmann et al. [22]). This pattern focuses on the integration of different modules in a large system. The basis for this paper is another one by Lalanda [73], introducing the Shared Repository pattern.

An extensible knowledge based system architecture is described by Brown et al. [18] for the area of computer tomography. Again, the Blackboard design pattern is used as the central component for reading and writing data (knowledge). Moreover, the system is separated into different modules (model, inference engine and image processing routines). Additional extensibility is added by choosing an appropriate model for representing the knowledge.
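The idea behind the Blackboard pattern, independent modules that cooperate only by reading from and writing to a shared data store, can be illustrated with a minimal sketch. The two knowledge sources, the threshold of 100 and the simple control loop are hypothetical and chosen for illustration only; they reproduce neither the system of Brown et al. [18] nor that of Lalanda [72].

```python
class Blackboard:
    """Shared data store that all knowledge sources read from and write to."""

    def __init__(self, **initial):
        self.data = dict(initial)


class Thresholder:
    """Hypothetical knowledge source: turns raw pixel values into a binary mask."""

    def can_contribute(self, bb):
        return "pixels" in bb.data and "mask" not in bb.data

    def contribute(self, bb):
        bb.data["mask"] = [1 if p > 100 else 0 for p in bb.data["pixels"]]


class PixelCounter:
    """Hypothetical knowledge source: counts marked pixels once a mask exists."""

    def can_contribute(self, bb):
        return "mask" in bb.data and "marked" not in bb.data

    def contribute(self, bb):
        bb.data["marked"] = sum(bb.data["mask"])


def control_loop(bb, sources):
    """Let sources contribute until none of them can add anything new."""
    progress = True
    while progress:
        progress = False
        for source in sources:
            if source.can_contribute(bb):
                source.contribute(bb)
                progress = True


blackboard = Blackboard(pixels=[20, 150, 99, 200, 101])
# Registration order does not matter: each source waits for its input.
control_loop(blackboard, [PixelCounter(), Thresholder()])
```

The knowledge sources do not know each other, so a new module can be added by registering one more object with the control loop.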

Davis et al. [26] have used design patterns for combining knowledge based systems with business systems (enterprise applications). For this purpose, they introduced the Knowledge Director pattern. The pattern consists of the components Knowledge based system, Database, User Interface, Negotiator, Transcriber and State Registration. An example is also given that integrates a non-object-oriented knowledge based system. The work presented in chapter 4 is related to this research, especially the part on external integration, targeting a similar problem.

Arsanjani [6] introduces a pattern language for the definition of business rules: Rule Object. This pattern language provides design patterns for modelling business rules in the form of if ... then ... in source code. The Rule pattern presented in chapter 5 is an extension of this work. The work by Yoder et al. [138] also deals with the representation of business rules.
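Representing a business rule of the form if ... then ... as an object in source code can be illustrated with a minimal sketch. It shows only the general idea; the class `Rule`, its fields and the example rules are assumptions and correspond neither to the pattern language of Arsanjani [6] in detail nor to the Rule pattern of chapter 5.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    """A business rule as an object: 'if condition then action'."""

    name: str
    condition: Callable[[dict], bool]
    action: Callable[[dict], None]

    def apply(self, context: dict) -> bool:
        """Run the action if the condition holds; report whether the rule fired."""
        if self.condition(context):
            self.action(context)
            return True
        return False


def grant_discount(order: dict) -> None:
    order["discount"] = 0.1


def require_approval(order: dict) -> None:
    order["needs_approval"] = True


rules = [
    Rule("bulk discount", lambda o: o["quantity"] >= 100, grant_discount),
    Rule("large order approval", lambda o: o["total"] > 10_000, require_approval),
]

order = {"quantity": 120, "total": 4_500}
fired = [rule.name for rule in rules if rule.apply(order)]
```

Since the rules are ordinary objects held in a list, they can be added, removed or reordered without touching the code that evaluates them.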

Only a few of the architectures presented above explicitly describe the patterns used to build these systems (e.g., the MVC, or the Blackboard pattern). No paper exists that contains a collection of relevant patterns (e.g., as shown by Kohler and Kerkow [67]) that can be used to build a knowledge management and processing system.

The Open Group Architecture Framework [123] tries to bring the idea of architecture patterns to the domain of enterprise architecture. Besides a description of software architecture and design patterns, it also provides a list of resources for such patterns.

2.3.5 Summary

To sum up, no common architecture guideline exists. There is a lack of resources on how to implement or integrate custom knowledge processing systems.

Does documentation of software engineering techniques for the implementation of knowledge processing systems exist?

The literature review revealed that thorough documentation exists only for enterprise applications. For the implementation of knowledge processing systems, no dedicated documentation is available, only big-picture overviews. Documentation exists for small, specialized parts (e.g., processing algorithms, or knowledge storage). There is no guideline on how to implement one’s own custom knowledge processing system.

Do knowledge processing system descriptions reference a certain architecture documentation?

No common references to an architecture documentation, or guideline, could be identified. The most common "architecture" refers to the general description of knowledge processing / based systems shown in figure 2.1.

From the document: Application of architecture and design patterns in the context of knowledge processing or knowledge based systems / submitted by Stefan Nadschläger MSc (pages 50-55)