Evaluation and Discussion

6.2 Discussion and Outlook

6.2.3 Openness to new knowledge

Design patterns can be used to solve general architectural problems, such as keeping the system open for new knowledge and for additional or improved processing methods. To keep a software architecture extensible and maintainable, the principles of high cohesion and loose coupling (see, for example, Starke [119]) have to be applied. High cohesion means that one component should be responsible for only one technical aspect. Loose coupling means that components should not rely heavily on specific other components; their dependencies should be easily exchangeable.

The following architecture and design patterns support the implementation of high cohesion and loose coupling. When these patterns are applied in a knowledge processing system architecture, the system also becomes open to new knowledge.

SOA and Microservice

Service Oriented Architecture (SOA) [122] is an architecture pattern. The Microservice [111] pattern is a continuation of the idea of SOA.

Context: Software can be decomposed into small components that have to work together as one single system.

Problem: The system should stay maintainable, and single components should be developed in parallel by different teams.

Forces:

• Multiple teams develop the system in parallel.

• The system has to be split up into small components, each component handling only one single aspect (high cohesion).

• Single components run in parallel.

Solution: Use Service Oriented Architecture or Microservice to split the system into single components (so-called "services") that can be developed and executed in parallel.

Consequences:

• Parallel development of single services is easy.

• The number of classes increases and, with it, the system complexity.

• Application stays maintainable: Changes affect only single components, not the whole system.

• By selecting a suitable communication protocol / method / technology, single services can execute their tasks on their own and can therefore be run in parallel.

A Service Oriented Architecture consists of single services (components). Every service is responsible for a single specific technical aspect. This reduces the complexity in code changes and increases cohesion. Single services can be updated without having to change other code in the application. Often the services are just logical structures of the program code and run in the same process. This reduces the communication overhead.

When small services can be run on their own, this is called a Microservice architecture. Often, the services provide a web service interface that other services use for communication (e.g., SOAP [84] or REST [40]). The Microservice architecture has the benefit that services can be executed in parallel and in a distributed environment (e.g., in a Cloud).

This allows creating larger applications that scale well, even when they have to process large amounts of data.
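The idea of small services that each handle one aspect and can run in parallel can be sketched in plain Java. This is a minimal illustration only, not the implementation used in the prototypes: the names `TextService`, `LowerCaseService`, `WordCountService` and `ParallelServices` are hypothetical, and the services run as tasks in one process rather than as separately deployed web services.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical service contract: each service handles exactly one aspect.
interface TextService {
    String process(String input);
}

// One service, one responsibility: normalising case.
class LowerCaseService implements TextService {
    public String process(String input) {
        return input.toLowerCase();
    }
}

// Another independent service: counting whitespace-separated words.
class WordCountService implements TextService {
    public String process(String input) {
        String trimmed = input.trim();
        int count = trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
        return Integer.toString(count);
    }
}

class ParallelServices {
    // Runs every service on the same input as its own task and
    // returns the results in the order of the given service list.
    static List<String> runAll(List<TextService> services, String input) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, services.size()));
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (TextService service : services) {
                tasks.add(() -> service.process(input));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("service execution failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Because the caller only knows the `TextService` interface, a further service can be added to the list without touching the existing ones. Moving a service behind a REST or SOAP interface would replace the in-process call but leave the contract unchanged.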

Data Access Object (DAO)

The Data Access Object pattern is described by Deepak et al. in [5].

Context: Access to a data storage system in a software application.

Problem: Data access and manipulation should be encapsulated in a separate layer.

Forces:

• Data in a persistent data storage system has to be accessed and manipulated.

• The technological code to access the storage system has to be decoupled from the rest of the application.

• A uniform data access API is provided for different storage systems.

• Maintainability and portability are supported by encapsulating proprietary access code.

Solution: Use a Data Access Object to manage connections and to hide proprietary access code to a persistent storage system.

Consequences:

• Adds an extra layer to the application.

• Enables easier migration.

• Reduces complexity of the code and organizes all data access code into one single layer.

• Needs class hierarchy design (and therefore fits best in object-oriented programming languages).

The aim of this pattern is to separate data access and manipulation into layers (as shown in figure 6.8). Therefore, the concrete data storage technology is hidden from the rest of the application and can be exchanged without having any negative effect on the rest of the code.

In a system, multiple data access objects can exist. It is common to create one DAO for each table, or any logical compound entity. Moreover, it is important to provide an interface for every DAO. The interface does not contain any implementation, only method signatures, but ensures that the implementation can change without affecting other code.

The underlying storage technology does not matter in this case. Any kind of storage mechanism can be hidden by the DAO pattern. It is also possible to use multiple different storage technologies together. Because the interface does not change and provides the most common methods needed for a storage system, the implementation can change without affecting the knowledge processing algorithm itself.

Figure 6.8: Example of the DAO.
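The separation described above can be sketched as follows. This is a minimal example under stated assumptions: the entity `FieldRecord`, the interface `FieldRecordDao`, the in-memory implementation and the `FieldRenamer` business class are all hypothetical names chosen for illustration; a real DAO would typically wrap a database connection instead of a map.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical entity used only for this illustration.
class FieldRecord {
    final long id;
    final String name;

    FieldRecord(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

// The DAO interface: method signatures only, no storage details.
interface FieldRecordDao {
    FieldRecord find(long id);
    void save(FieldRecord record);
    void update(FieldRecord record);
    void delete(long id);
}

// One possible implementation. A relational or NoSQL variant would
// implement the same interface with its own access code.
class InMemoryFieldRecordDao implements FieldRecordDao {
    private final Map<Long, FieldRecord> store = new LinkedHashMap<>();

    public FieldRecord find(long id) {
        return store.get(id); // null if nothing is stored under this id
    }

    public void save(FieldRecord record) {
        if (store.containsKey(record.id)) {
            throw new IllegalStateException("already stored: " + record.id);
        }
        store.put(record.id, record);
    }

    public void update(FieldRecord record) {
        if (!store.containsKey(record.id)) {
            throw new IllegalStateException("not stored: " + record.id);
        }
        store.put(record.id, record);
    }

    public void delete(long id) {
        store.remove(id);
    }
}

// Business logic depends on the interface only, never on the storage technology.
class FieldRenamer {
    private final FieldRecordDao dao;

    FieldRenamer(FieldRecordDao dao) {
        this.dao = dao;
    }

    void rename(long id, String newName) {
        FieldRecord existing = dao.find(id);
        if (existing == null) {
            throw new IllegalArgumentException("unknown record: " + id);
        }
        dao.update(new FieldRecord(existing.id, newName));
    }
}
```

`FieldRenamer` compiles against `FieldRecordDao` alone, so exchanging `InMemoryFieldRecordDao` for another implementation requires no change in the business logic.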

Data Transfer Object (DTO)

The Data Transfer Object pattern is described by Deepak et al. in [5].

Context: Communication of a client with a server and data exchange between them.

Problem: Multiple pieces of data have to be transported between tiers.

Figure 6.9: DTO usage as described by Fowler [43].

Forces:

• Clients access and update data in other tiers.

• The number of remote requests has to be kept low.

• Network performance must not suffer due to a huge amount of remote calls from a client, accessing only bits of data.

Solution: A Data Transfer Object can be used to combine bits of data into one single object and only one remote call from the client is needed to access the data.

Consequences:

• Network traffic is reduced because the client makes fewer calls.

• Introduces possible stale data into the client.

• A mapping from the internal representation of data to the external one is needed.

• The client gets only the data it needs.

The data transfer object hides the internal representation of data structures from the user of a knowledge processing system. To this end, all internal data is copied to a transfer object whose properties can be similar to those of the internal data structure but may also differ.

The user (e.g., other services) can then interact with the system via these DTOs. This allows the internal structure to change without affecting the clients.

The Data Transfer Object can also be used to improve the performance by only sending the data that is really needed. Often the internal structure of an object contains more information than is needed for the client (e.g., an id, a reference, etc.). In this case, the data sent over the network can be reduced, or even aggregated data can be sent.

Figure 6.9 shows an example of this pattern. The mapping from the internal representations of data (firstName and lastName) to the DTO name is done in a business logic method.
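The mapping from firstName and lastName to a single name can be sketched in Java as follows. Only the three property names come from the text; the class names `Person`, `PersonDto` and `PersonService`, the `id` field and the space used as separator are assumptions made for this illustration.

```java
// Internal representation: carries more than a client needs (e.g., an id).
class Person {
    final long id;
    final String firstName;
    final String lastName;

    Person(long id, String firstName, String lastName) {
        this.id = id;
        this.firstName = firstName;
        this.lastName = lastName;
    }
}

// External representation sent to clients: one combined name, no internal id.
class PersonDto {
    final String name;

    PersonDto(String name) {
        this.name = name;
    }
}

// Business logic method that performs the mapping.
class PersonService {
    PersonDto toDto(Person person) {
        return new PersonDto(person.firstName + " " + person.lastName);
    }
}
```

If the internal class later splits or renames its fields, only `toDto` has to be adapted; clients that consume `PersonDto` are not affected.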

Dependency Injection

The following description of Dependency Injection (also called Inversion of Control) follows Fowler [44].

Context: Software contains multiple components that have dependencies on each other (e.g., services).

Problem: The components have to be coupled but have to preserve the "high cohesion" and "loose coupling" principles.

Forces:

• Components have to be coupled easily.

• Coupling has to be loose.

• Exchange of components has to be easy.

• High cohesion of services must be maintained.

Solution: Use the Dependency Injection pattern to avoid tightly coupled components.

Consequences:

• Components are loosely coupled and can be exchanged easily.

• Introduces overhead for needed configuration (e.g., XML, plain text, code, etc.).

• Complexity is increased due to the introduction of many new classes that separate concerns.

• The performance can be worse when using a dependency injection container that has to manage all instances and dependencies.

In the following explanation, the term service stands for a specific data structure in a single program (as described at the beginning of section 6.2.3). This can be compared to objects in an object-oriented programming language.

Dependency Injection offers flexibility and possibilities to integrate new knowledge processing services into a knowledge based system. This design pattern is also called Inversion of Control. The aim of this pattern is to avoid that a service has to search for references to other services by itself. Instead, a container manages all instances of services and "injects" them into other services that require this dependency.

The definition of which service needs another service happens mainly via an XML configuration file (other ways are also possible, but mainly depend on the programming language features, e.g., Java Annotations). In this file, the needed reference is defined. It is best practice to define interfaces for such services. This allows adding arbitrary implementations of this interface to the application (e.g., different implementations of algorithms for a specific knowledge processing task). The container then takes one concrete instance, defined in the XML file, and injects it into the service that needs this dependency.

Figure 6.10: Dependency Injection.

An example of a well-known dependency injection container is the Spring Framework³. It manages the instances of services and the injection of all dependencies at runtime. A huge benefit is that the application can be extended with new algorithms without having to change existing code. The new implementation can be registered in the XML configuration and will be injected at runtime. An example of this mechanism can be seen in figure 6.10.
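What such a container does can be reduced to a few lines of plain Java. The following sketch is a hand-written stand-in and does not use or imitate the API of the Spring Framework; all names (`Ranker`, `LengthRanker`, `VowelRanker`, `SearchService`, `TinyContainer`) are hypothetical, and the string passed to `createSearchService` merely plays the role that a reference in a configuration file would play.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical algorithm contract with two interchangeable implementations.
interface Ranker {
    int score(String text);
}

class LengthRanker implements Ranker {
    public int score(String text) {
        return text.length();
    }
}

class VowelRanker implements Ranker {
    public int score(String text) {
        // Remove everything that is not a vowel and count what is left.
        return text.replaceAll("[^aeiouAEIOU]", "").length();
    }
}

// The consuming service never looks up or constructs its dependency itself.
class SearchService {
    private final Ranker ranker;

    SearchService(Ranker ranker) {
        this.ranker = ranker;
    }

    int rank(String text) {
        return ranker.score(text);
    }
}

// Toy container: holds named instances and wires the one selected by configuration.
class TinyContainer {
    private final Map<String, Ranker> rankers = new HashMap<>();

    void register(String id, Ranker ranker) {
        rankers.put(id, ranker);
    }

    // 'rankerId' stands in for the reference given in a configuration file.
    SearchService createSearchService(String rankerId) {
        Ranker ranker = rankers.get(rankerId);
        if (ranker == null) {
            throw new IllegalArgumentException("no such instance: " + rankerId);
        }
        return new SearchService(ranker);
    }
}
```

Registering a further `Ranker` implementation changes neither `SearchService` nor the existing implementations; only the registration and the selected id differ.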

Discussion and Example

All three systems described in section 2.4.1 were implemented using the patterns described above. The basis is a layered architecture consisting of three layers: Data Access Layer, Business Logic Layer and Web Service Layer. The applications are implemented as a set of services using the Service Oriented Architecture (SOA) pattern.

Data exchange between client and the knowledge management and processing system happens via DTOs. Therefore, the internal data structure of the system is decoupled from data structures provided by the clients. The client can change its data structure, but it is guaranteed that the format for communication stays the same.

The access to a data storage system (e.g., relational database, NoSQL database, filesystem, etc.) happens via DAOs. The DAO hides the concrete storage system from the rest of the application, and it can implement access mechanisms for many different data storage systems. This allows adding any new kind of storage technology for knowledge. Only the code of the DAO has to be changed; the rest of the application can be left untouched.

Flexibility is incorporated by applying the Dependency Injection pattern. Every prototype consists of a set of services. Every service is only responsible for one single task and thus supports high cohesion. Often, services need other services to perform tasks (especially in the business logic layer). Therefore, services have dependencies on other services. To support loose coupling, the dependencies will be managed by a container and not by the services themselves.

³ https://projects.spring.io/spring-framework/

SOA and Microservice

Knowledge can become large, and therefore, the systems that work on this knowledge have to be scalable. By using the SOA, or the Microservice architecture pattern, the system can be split up into single services that can also be run in parallel. Therefore, the application also becomes cloud-ready and can make use of modern cloud platforms for parallel data processing.

All prototypes were developed using this architecture pattern and therefore consist of a large set of services. This allows executing algorithms in parallel and even running single services on their own (supporting a higher workload).

DAO and DTO

All three systems use multiple data sources. S-Mate uses a classical relational database and an index for fast search queries. Linkilike supports different social media platforms (e.g., Facebook or Google+). Moreover, in this project, multiple databases are used (relational and NoSQL). In the CLAFIS project, the needed field and weather data comes from different sources.

To provide uniform access to all data sources, the DAO pattern was used. The typical methods that are needed for data access (e.g., find, save, update, delete, etc.) are provided for every data source and changes only affect one single layer. Therefore, any data source could be added to the prototypes without having to use special access code in the business logic components.

Knowledge management and processing systems are typically only applications running on a server. For communication with a client who wants to use the functionality of the service, the data has to be transported over a network, using, for example, a web service.

To avoid changes to the client when a data source or the internal representation of data changes, the DTO pattern was used. The data needed by the client is copied to a Data Transfer Object. This allows the client to be independent of any changes that are made on the server.

A simple example of Dependency Injection

Dependency injection was used in all three prototypes. This pattern introduced the needed flexibility to exchange algorithms or DAOs.

In the CLAFIS project, a Disease Pressure Calculation Service is responsible for calculating the disease pressure metric. It needs weather data for the calculation. Therefore, an interface IWeatherDataProvider is created that specifies the method signature for getWeatherData(Date start, Date end). During development, or when travelling without an internet connection, the weather data can be simulated. For this purpose, an implementation of IWeatherDataProvider is created: WeatherDataProviderSimulator. Later in the project phase, additional implementations can be added: LocalDbWeatherDataProvider and WebWeatherDataProvider. So there are three services that all provide the same method.

The XML configuration could then look like this:

<bean id="dpmCalculator"
      class="at.jku.faw.clafis.DiseasePressureService">
    <property name="weatherDataProvider"
              ref="localDbWDProvider" />
</bean>

<bean id="dummyWDProvider"
      class="at.jku.faw.clafis.WeatherDataProviderSimulator" />

<bean id="webWDProvider"
      class="at.jku.faw.clafis.WebWeatherDataProvider" />

<bean id="localDbWDProvider"
      class="at.jku.faw.clafis.LocalDbWeatherDataProvider" />

The reference of the property weatherDataProvider can be changed to any other implementation at any time. This keeps the whole application flexible and extensible.
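On the Java side, the classes behind such a configuration could look roughly like the following sketch. The interface name, the method signature getWeatherData(Date start, Date end), the class names and the property name weatherDataProvider are taken from the text; everything else is an assumption made for illustration: the return type `List<Double>`, the constant simulated value, the `calculate` method and its placeholder metric (the actual disease pressure formula is not given here). Package declarations are omitted to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

// Interface named in the text; the return type List<Double> is an assumption.
interface IWeatherDataProvider {
    List<Double> getWeatherData(Date start, Date end);
}

// Offline stand-in: returns one fixed, made-up temperature per started day.
class WeatherDataProviderSimulator implements IWeatherDataProvider {
    private static final long DAY_MS = 24L * 60 * 60 * 1000;

    public List<Double> getWeatherData(Date start, Date end) {
        List<Double> temperatures = new ArrayList<>();
        for (long t = start.getTime(); t < end.getTime(); t += DAY_MS) {
            temperatures.add(20.0);
        }
        return temperatures;
    }
}

// The setter name matches the property "weatherDataProvider",
// which is what allows a container to inject the dependency.
class DiseasePressureService {
    private IWeatherDataProvider weatherDataProvider;

    public void setWeatherDataProvider(IWeatherDataProvider weatherDataProvider) {
        this.weatherDataProvider = weatherDataProvider;
    }

    // Placeholder metric (mean of the provided values), NOT the real
    // disease pressure calculation, which is not described in the text.
    public double calculate(Date start, Date end) {
        List<Double> data = weatherDataProvider.getWeatherData(start, end);
        if (data.isEmpty()) {
            return 0.0;
        }
        double sum = 0.0;
        for (double value : data) {
            sum += value;
        }
        return sum / data.size();
    }
}
```

`DiseasePressureService` refers only to `IWeatherDataProvider`, so switching from the simulator to a database-backed or web-backed provider is a matter of configuration rather than a code change.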

In the document: Application of architecture and design patterns in the context of knowledge processing or knowledge based systems / submitted by Stefan Nadschläger MSc (pages 144-153)