
Design Pattern Collection

4.4 Discussion

In this section, the identified patterns are discussed with respect to their usage in the three systems (introduced in section 2.4.1) and in the ATM example (introduced in section 2.4.2).

4.4.1 Data Access

This layer is important in knowledge processing systems because it handles the access to knowledge and data, which is an essential part of such an application. All three systems operate on common data, such as data describing a user. Therefore, a pattern like the DAO, or any other pattern of the same group, can be used to implement the access to this data, which is usually kept in a relational database. The benefits of such a pattern are its long, proven track record in such cases and its abstraction from the database in use, which makes it possible to extend and improve the system.

All three systems use additional data in the form of XML and JSON files / data streams.

Existing data access frameworks (like Hibernate⁶, for example) commonly do not support such formats. To solve this issue, patterns can be used to hide the concrete access to data.

⁶ http://hibernate.org/

For example, in Java, one choice would be to use the Java Persistence API (JPA), accessing a relational database with a so-called entity manager. Instead of an entity manager, a simple file reader can be used to access data stored in a text file. The user of a DAO instance does not have to care about the code behind it; the usage is uniform for all data storage types. Therefore, most patterns that abstract from the concrete data storage system and at least provide basic Create, Read, Update, and Delete (CRUD) operations can be used in the data access layer. These provide a uniform view on the data. An example that tries to achieve this is the Spring Data project⁷, which applies the Repository pattern.
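The uniform usage described above can be illustrated with a minimal sketch. It assumes a hypothetical `UserDao` interface with two interchangeable implementations: an in-memory map that merely stands in for a database-backed DAO (for instance, one delegating to a JPA entity manager) and a DAO that reads and writes a plain text file. All names and the `id;name` line format are assumptions made for illustration and are not taken from the three systems.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical CRUD contract; callers never see the storage technology.
interface UserDao {
    void create(long id, String name);
    Optional<String> read(long id);
    void update(long id, String name);
    void delete(long id);
}

// Stand-in for a database-backed DAO (assumption: a real one would use JPA or JDBC).
class InMemoryUserDao implements UserDao {
    private final Map<Long, String> rows = new LinkedHashMap<>();

    @Override public void create(long id, String name) { rows.put(id, name); }
    @Override public Optional<String> read(long id) { return Optional.ofNullable(rows.get(id)); }
    @Override public void update(long id, String name) { rows.put(id, name); }
    @Override public void delete(long id) { rows.remove(id); }
}

// File-based DAO: one "id;name" line per user (names must not contain line breaks).
class FileUserDao implements UserDao {
    private final Path file;

    FileUserDao(Path file) { this.file = file; }

    private Map<Long, String> load() {
        Map<Long, String> users = new LinkedHashMap<>();
        try {
            if (Files.exists(file)) {
                for (String line : Files.readAllLines(file)) {
                    if (line.isEmpty()) continue;
                    String[] parts = line.split(";", 2);
                    users.put(Long.parseLong(parts[0]), parts[1]);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return users;
    }

    private void store(Map<Long, String> users) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<Long, String> entry : users.entrySet()) {
            lines.add(entry.getKey() + ";" + entry.getValue());
        }
        try {
            Files.write(file, lines);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override public void create(long id, String name) { update(id, name); }
    @Override public Optional<String> read(long id) { return Optional.ofNullable(load().get(id)); }
    @Override public void update(long id, String name) {
        Map<Long, String> users = load();
        users.put(id, name);
        store(users);
    }
    @Override public void delete(long id) {
        Map<Long, String> users = load();
        users.remove(id);
        store(users);
    }
}

class DaoDemo {
    public static void main(String[] args) throws IOException {
        UserDao[] daos = {
            new InMemoryUserDao(),
            new FileUserDao(Files.createTempFile("users", ".txt"))
        };
        // The calling code is identical for both storage types.
        for (UserDao dao : daos) {
            dao.create(1L, "Alice");
            dao.update(1L, "Alice B.");
            System.out.println(dao.getClass().getSimpleName() + ": " + dao.read(1L).orElse("<none>"));
            dao.delete(1L);
        }
    }
}
```

Rewriting the whole file on every change keeps the sketch short; it is not meant as an efficient storage strategy.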

In the ATM example (shown in figure 5.7), it is assumed that the knowledge processing system accesses the customer data stored in a relational database system (for example, available users and credit cards). The rules that describe the business logic also have to be kept in a storage system so that they can be edited flexibly, even at runtime. Figure 4.5 shows how this is managed in an enterprise system architecture.

Figure 4.5: Data Access in the ATM example.

Requirements

All the patterns described in section 4.3.2 comply with the following requirements as described in table 4.1.

• High cohesion: All patterns enforce that one data access service only has to take care of one business entity,

• support of different storage technologies and

• abstraction from the technology code: Multiple storage systems can be used in parallel. The code for a particular persistence technology is hidden.

Loose coupling is not supported by Active Row and Row Data Gateway. These patterns contain the domain model, the data storage access code and, in the case of Active Row, business logic code, and therefore do not allow a clear separation between the data access layer and the business logic layer.

Extensibility with custom queries should be avoided in the Repository pattern. In contrast to the others, this pattern provides the benefit of a stable interface, and it can be used and extended by developers who are not familiar with the concrete data storage technology.

The patterns Row Data Gateway and Active Row support quick prototyping. They combine domain model and data access logic into one single service and can easily be used in prototyping without having to change a lot of other code.

⁷ http://projects.spring.io/spring-data/

4.4.2 Integration

External Systems

In all three projects introduced in section 2.4.1, external systems provide knowledge, and all of them access these external services via well-known remote communication technologies. Linkilike and CLAFIS are similar cases in which patterns like the Message Bus and the Data Transfer Object can be used. In the case of Linkilike, the external data are user profiles from social media systems, which have the same content but differ in format (e.g., XML, JSON, or plain text) or structure. In the knowledge processing system, these have to be reduced to one common, processable profile data structure. In the case of CLAFIS, the same situation occurs with weather data, which can also come from different providers in different formats and structures. The situation is different for S-Mate. In this project, various data from independent sources are accessed. Therefore, the data does not have to be reduced to one single data structure, but the sources have to be exchangeable. Here, too, the Message Bus pattern can be applied to keep the addresses of the external systems exchangeable.

Instead of the Message Bus pattern, any other from the list of patterns defined by Hohpe and Woolf [55] (see section 4.3.3) can be used. The Portal, Entity and Process Integration patterns define on which layer the integration is done. In the three knowledge processing system examples, this is commonly done by using Entity Aggregation. For exchanging common data, all three systems use Shared Database.
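The reduction of differently formatted provider data to one common structure, as described for Linkilike and CLAFIS, can be sketched with a Data Transfer Object and one adapter per provider. The two payload formats, the field names, and all class names below are hypothetical; real providers would deliver XML or JSON, and the parsing would be done with a suitable library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Common, processable profile structure (the DTO). Field names are illustrative.
final class ProfileDto {
    final String name;
    final String city;

    ProfileDto(String name, String city) {
        this.name = name;
        this.city = city;
    }
}

// Each external provider is wrapped by an adapter that yields the common DTO.
interface ProfileAdapter {
    ProfileDto toDto(String rawPayload);
}

// Hypothetical provider A sends key-value pairs, e.g. "name=Alice;city=Linz".
class KeyValueProfileAdapter implements ProfileAdapter {
    @Override
    public ProfileDto toDto(String rawPayload) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String pair : rawPayload.split(";")) {
            String[] keyValue = pair.split("=", 2);
            fields.put(keyValue[0].trim(), keyValue[1].trim());
        }
        return new ProfileDto(fields.get("name"), fields.get("city"));
    }
}

// Hypothetical provider B sends the same content in another order, e.g. "Linz|Alice".
class PipeProfileAdapter implements ProfileAdapter {
    @Override
    public ProfileDto toDto(String rawPayload) {
        String[] parts = rawPayload.split("\\|");
        return new ProfileDto(parts[1].trim(), parts[0].trim());
    }
}

class ProfileDemo {
    public static void main(String[] args) {
        ProfileDto a = new KeyValueProfileAdapter().toDto("name=Alice;city=Linz");
        ProfileDto b = new PipeProfileAdapter().toDto("Linz|Alice");
        // Both payloads end up in the same internal representation.
        System.out.println(a.name + " / " + a.city);
        System.out.println(b.name + " / " + b.city);
    }
}
```

Because the internal code only depends on `ProfileDto`, the internal data structure and the external formats can change independently of each other.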

In the ATM example, external systems can be responsible for accessing credit card data from other providers, or extended validity checking services. These can be provided by third parties via remote communication technology. Figure 4.6 shows an example.

Figure 4.6: External integration in the ATM example.

Requirements

Table 4.2 summarizes the mapping of the patterns to the requirements for external integration. Most of the patterns for the integration of external systems support loose coupling in the sense that the internal service is not affected by changes to the implementation details of an external service. Patterns that do not support this requirement require certain fixed information, for example, Point-to-Point Channel, Portal Integration, Process Integration, Shared Database and Presentation Integration. These patterns do not support communication with "unknown" other services. On the other hand, these patterns support prototyping because they can be implemented quickly. Patterns like DTO or Entity Aggregation support the internal change of the data structure without conflicting with an external representation. If multiple services have to be used, each having a different interface, the Entity Aggregation, Presentation Integration, or Process Integration patterns can be used. Some patterns support the notification of new results from asynchronous tasks (and therefore avoid polling for results): Message Bus and all variants, MOM Integration, Service-Oriented Integration and Publish-Subscribe. Patterns that are flexible in terms of communication with different remote technologies are Message Bridge and Datatype Channel. If asynchronous communication is important, the patterns MOM Integration, Service-Oriented Integration, Presentation Integration and Publish-Subscribe can be used.

Internal Systems

The internal structure of a system also has to be exchangeable. Knowledge processing systems are often created in research projects, in which different algorithms are tested and applied to data. An easy way to allow the exchange of single components in the code itself is to use the Dependency Injection pattern.
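As a minimal sketch of this idea, the following example uses plain constructor injection without any framework. The `ScoringAlgorithm` interface and its implementations are hypothetical placeholders for the algorithms that would be compared in a research project; the input array is assumed to be non-empty.

```java
// Hypothetical algorithm contract; the processor only depends on this interface.
interface ScoringAlgorithm {
    double score(double[] features);
}

// First candidate algorithm: arithmetic mean (assumes a non-empty array).
class MeanScoring implements ScoringAlgorithm {
    @Override
    public double score(double[] features) {
        double sum = 0.0;
        for (double value : features) {
            sum += value;
        }
        return sum / features.length;
    }
}

// Second candidate algorithm: maximum (assumes a non-empty array).
class MaxScoring implements ScoringAlgorithm {
    @Override
    public double score(double[] features) {
        double max = features[0];
        for (double value : features) {
            max = Math.max(max, value);
        }
        return max;
    }
}

// The dependency is handed in from outside instead of being created internally.
class KnowledgeProcessor {
    private final ScoringAlgorithm algorithm;

    KnowledgeProcessor(ScoringAlgorithm algorithm) {
        this.algorithm = algorithm;
    }

    double process(double[] features) {
        return algorithm.score(features);
    }
}

class InjectionDemo {
    public static void main(String[] args) {
        double[] features = {1.0, 2.0, 3.0};
        // Exchanging the algorithm requires no change to KnowledgeProcessor.
        System.out.println(new KnowledgeProcessor(new MeanScoring()).process(features));
        System.out.println(new KnowledgeProcessor(new MaxScoring()).process(features));
    }
}
```

A dependency injection container would perform the same wiring from configuration, so that the choice of algorithm moves out of the code entirely.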

Some of the identified patterns can be used in multiple ways. For example, the Gateway or the Blackboard patterns can be applied either in the data access layer or for integrating services. They support the abstraction of the access to external services or serve as possible solutions for integration problems.

In the ATM example, internal integration deals with the dependencies between services / components within the system, for example, between the service providing the logic for evaluating rules and the services for managing customers and credit cards. Figure 4.7 shows an excerpt of an example to demonstrate the case.

Requirements

All patterns for internal integration support the requirements for loose coupling and the communication with "unknown" services. Pipes and Filters is not a good choice if extensibility should be supported. Reconfiguration at runtime is supported by Service Locator, Dependency Injection, Listener / Observer and Whiteboard. Moreover, Blackboard supports asynchronous communication and the sharing of data. The Listener / Observer pattern also supports asynchronous communication. An overview of the requirements and patterns is given in table 4.3.
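Reconfiguration at runtime with the Listener / Observer pattern can be sketched as follows. The `RuleResultPublisher` class is a hypothetical name, and the sketch delivers events synchronously on the calling thread; asynchronous delivery would additionally require an executor or a queue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical publisher; listeners can be added and removed while the system runs.
class RuleResultPublisher {
    // CopyOnWriteArrayList allows safe iteration while listeners change.
    private final List<Consumer<String>> listeners = new CopyOnWriteArrayList<>();

    void subscribe(Consumer<String> listener) {
        listeners.add(listener);
    }

    void unsubscribe(Consumer<String> listener) {
        listeners.remove(listener);
    }

    // Notifies every currently registered listener (synchronously in this sketch).
    void publish(String result) {
        for (Consumer<String> listener : listeners) {
            listener.accept(result);
        }
    }
}

class ObserverDemo {
    public static void main(String[] args) {
        RuleResultPublisher publisher = new RuleResultPublisher();
        List<String> received = new ArrayList<>();
        Consumer<String> listener = received::add;

        publisher.subscribe(listener);
        publisher.publish("rule A fired");
        publisher.unsubscribe(listener); // reconfiguration at runtime
        publisher.publish("rule B fired");

        System.out.println(received);
    }
}
```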

Figure 4.7: Internal integration in the ATM example.

4.4.3 Parallelism and Cloud

Parallelism

In production, the performance of long-running algorithms and of the processing of large amounts of knowledge can be improved by parallelizing the processing algorithms. For this purpose, the standard parallelization technologies of different programming frameworks can be used. For example, the default parallelization functionality in Java supports Future, Task Parallelism and Fork / Join. If the resources of one server are not sufficient, patterns like Actors or Map-Reduce can be added to spread the calculation load over multiple (physical or virtual) nodes.
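A minimal sketch of Fork / Join combined with a Future, using only classes from `java.util.concurrent`, is given below. The summation task and its threshold are arbitrary illustrative choices and merely stand in for a real processing algorithm.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.Future;
import java.util.concurrent.RecursiveTask;

// Sums a slice of an array by recursively splitting it into two halves.
class SumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000; // illustrative cut-off for sequential work
    private final long[] data;
    private final int from; // inclusive
    private final int to;   // exclusive

    SumTask(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        int mid = (from + to) >>> 1;
        SumTask left = new SumTask(data, from, mid);
        SumTask right = new SumTask(data, mid, to);
        left.fork();                        // run the left half asynchronously
        long rightResult = right.compute(); // compute the right half in this thread
        return left.join() + rightResult;   // wait for the left half and combine
    }
}

class ForkJoinDemo {
    public static void main(String[] args) throws Exception {
        long[] data = new long[10_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        // submit() returns a Future, so the caller can continue and collect the result later.
        Future<Long> pending = ForkJoinPool.commonPool().submit(new SumTask(data, 0, data.length));
        System.out.println("sum = " + pending.get());
    }
}
```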

If the rules of the ATM example become more complicated, for example, because a large fictitious rule set and a large amount of data are included for tracing irregular activities in order to identify stolen credit cards, the evaluation algorithms can become slow, and parallel versions have to be used. An example is shown in figure 4.8.

Figure 4.8: Parallelism in the ATM example.

Requirements

In the list of patterns for parallel algorithm execution, the Map-Reduce pattern is a special one, as can be seen in table 4.4. It is the only one that was really designed to process large amounts of data and not only to provide fast processing. Like the patterns Actors and Master / Worker, it was also designed to provide physically distributed processing. Future is more an extension of the other patterns that makes it possible to get notified of results. Except for the Pipeline pattern, all patterns support scaling. The easiest way to integrate multiple algorithms is to use Pipeline, Actors or Master / Worker. Most of the patterns are supported by standard development frameworks (e.g., the Java Development Kit (JDK) or the .NET framework) and can be easily integrated; only Map-Reduce and Actors require more advanced libraries. Master / Worker, Actors and Pipeline allow the simultaneous usage of multiple algorithms.

Cloud

Knowledge processing systems often have to operate on large amounts of data. Therefore, the technologies and services of a cloud environment can be used. The identified patterns solve common problems one has to deal with in such an environment. The communication and transfer of data is expensive and decreases performance. Also, the large amount of data makes it impossible to store it all on one node. Communication with different nodes leads to an increased risk of losing data (e.g., when one node fails). Cache-aside, Event Sourcing, Materialized View and Sharding are simple patterns to handle data storage problems caused by the large amount of data. Circuit-Breaker, Compensation Transaction and Retry are useful for handling failures of the network communication. CQRS handles performance problems, and Runtime Reconfiguration simplifies the development. The three example systems make use of these patterns to be cloud-ready and to handle communication failures efficiently.
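Of the failure-handling patterns named above, Retry is the simplest to illustrate. The following sketch assumes a hypothetical `Retry.call` helper that repeats an operation after a runtime exception and doubles the waiting time after every failed attempt; the parameter names and the backoff strategy are illustrative assumptions.

```java
import java.util.function.Supplier;

// Hypothetical helper that repeats a failing operation a limited number of times.
final class Retry {
    private Retry() {}

    static <T> T call(Supplier<T> operation, int maxAttempts, long initialDelayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException lastFailure = null;
        long delay = initialDelayMillis;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.get();
            } catch (RuntimeException e) {
                lastFailure = e;
                if (attempt == maxAttempts) {
                    break; // no attempts left, give up
                }
                try {
                    Thread.sleep(delay);
                } catch (InterruptedException interrupted) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting to retry", interrupted);
                }
                delay *= 2; // exponential backoff between attempts
            }
        }
        throw lastFailure;
    }
}

class RetryDemo {
    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated remote call that fails twice before it succeeds.
        String result = Retry.call(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("service unavailable");
            }
            return "ok";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " calls");
    }
}
```

A Circuit-Breaker would complement this helper by refusing further calls for a while once too many attempts have failed, instead of retrying every request.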

To be able to serve many customers in many countries, the ATM example could run in a cloud environment (e.g., hosting parallel instances of services). An example of this is shown in figure 4.9.

Figure 4.9: Cloud in the ATM example.

Requirements

Patterns that support the performance of the communication with a cloud system are Cache-aside, CQRS and Sharding. Besides these three patterns, Materialized View can be used to support the processing of large amounts of data. To avoid data loss, Event Sourcing, Materialized View and Sharding can be used. Fault tolerance is supported by Cache-aside, CQRS and Sharding. Runtime Reconfiguration is a special case, as it supports the ease of development. All results are presented in table 4.5.

4.4.4 Server Communication

Knowledge processing systems seldom have a user interface. They are not intended to be accessed directly by users. Only simple management GUIs should be used for administrative tasks (e.g., changing special data entries). Therefore, the system has to provide its services via common remoting interfaces. In the case of the three example systems, the REST and SOAP technologies were used. REST is an architectural style, whereas SOAP uses the previously identified patterns Request-Response and Command / Document Message.

Requirements

The requirements defined in section 4.3.5 are met by all the described patterns (see table 4.6).
