
2.4 Chosen ANN Models

2.4.2 The Combinatorial Neural Model

The Combinatorial Neural Model (Machado and Rocha 1989, 1992; Machado et al. 1998) has been explored and developed during the past decade. Experiments with this model have demonstrated that it is well suited for classification problems, with excellent results in terms of accuracy (Leão and Rocha 1990; Feldens and Castilho 1997). The CNM integrates, in a straightforward architecture, symbolic and non-symbolic knowledge. This model has characteristics that are desirable in a classification system:

• Simplicity of neural learning - due to the neural network’s generalization capacity.

• Explanation capacity – the model can map a neural network’s knowledge into a symbolic representation.

• High-speed training - only one pass over the training examples is required.

• Immunity against some common neural network pitfalls – e.g. local optima, plateaus, etc.

• Incremental learning possibility - previously learned knowledge can be improved with new cases.

• Flexible uncertainty handling – it can accept fuzzy inputs, probabilistic inputs, etc., as all inputs fall into the interval [0, 1].

The CNM includes the mapping of previous knowledge to the neural network, training algorithms, and pruning criteria that extract only the significant pieces of knowledge. Detailed explanations of the possible learning algorithms can be found in (Machado et al. 1998). Here the Starter Reward and Punishment (SRP) and the Incremental Reward and Punishment (IRP) learning algorithms are considered, which are the original learning algorithms proposed for the model. These learning algorithms offer the possibility of building a CNM network based on knowledge elicited from an expert (using SRP) and of refining the knowledge already existing in the network with new examples (using IRP).

Figure 2.11 – The CNM network generation (input neurons as findings, combinatorial neurons, and output neurons as hypotheses, before and after the learning algorithm)

The CNM is a feed-forward network with one hidden layer. It has particular characteristics in the way its topology is constructed, in its neurons, in the connections among the neurons, and in its training algorithm.

The domain knowledge is mapped to the network through evidences and hypotheses (Figure 2.11). An evidence may have many distinct values, called findings, that must be evaluated separately by the neural network. The input layer represents the defined set of findings (also called literals). A finding can be a categorical or a numeric pattern feature. A categorical feature can only take on a finite number of values, while a numeric feature may take on any value from a certain real domain. A categorical feature requires at most one input neuron for each of the feature's possible values; the state of each such neuron is either zero or one. A numeric feature has to be partitioned into fuzzy sets, where each set corresponds to an input neuron whose state is the degree of membership in that fuzzy set. For example, if the evidence age is modeled, it probably has findings that can be modeled as fuzzy intervals (Kosko, 1992). The domain expert defines fuzzy sets for different ages (e.g. child, adolescent, adult, senior) (Figure 2.12). Each fuzzy set defined corresponds to a finding and, as a consequence, to a CNM input neuron.

Figure 2.12 – Fuzzy Sets

In this case, each input value lies in the [0, 1] interval (the value of a fuzzy set membership function), indicating the degree of membership of the training example in a certain concept, or the degree of confidence. In the example of Figure 2.12, the age 19 corresponds to a zero membership value for the child and senior fuzzy sets, to a 0.6 value for the adolescent fuzzy set, and to a 0.3 value for the adult fuzzy set.
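To make this encoding concrete, the sketch below shows how a numeric evidence such as age could be turned into input neuron activations. The trapezoidal shapes and their breakpoints are illustrative assumptions, not the expert-defined sets of Figure 2.12, so the resulting membership values differ from the ones quoted above.

```python
# Minimal sketch of the fuzzy-set encoding of a numeric evidence (here: age).
# The trapezoidal shapes and breakpoints are illustrative assumptions, not the
# expert-defined sets of Figure 2.12; each fuzzy set becomes one CNM input
# neuron whose state is the degree of membership of the presented value.

def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: rises on [a, b], equals 1 on [b, c], falls on [c, d]."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)

# Hypothetical fuzzy sets for the evidence "age"; one input neuron per set.
AGE_SETS = {
    "child":      lambda x: trapezoid(x, -1, 0, 10, 14),
    "adolescent": lambda x: trapezoid(x, 10, 14, 18, 22),
    "adult":      lambda x: trapezoid(x, 18, 25, 55, 65),
    "senior":     lambda x: trapezoid(x, 55, 65, 120, 121),
}

def encode_age(age):
    """Return the [0, 1] activation of each 'age' input neuron (finding)."""
    return {name: round(mf(age), 2) for name, mf in AGE_SETS.items()}

print(encode_age(19))  # membership degrees for age 19 under these illustrative sets
```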

The intermediate (combinatorial) layer is automatically generated. A neuron is added to this layer for each possible combination of evidences, from order 1 up to a maximum order given by the user.
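The following sketch illustrates how such a combinatorial layer could be enumerated; the function name and the tuple-based representation are assumptions made for illustration, not the actual CNM implementation.

```python
# Illustrative sketch of generating the combinatorial layer: one hidden neuron
# per combination of findings, from order 1 up to a user-given maximum order.
from itertools import combinations

def build_combinatorial_layer(findings, max_order):
    """Return one findings-combination per combinatorial neuron to be created."""
    layer = []
    for order in range(1, max_order + 1):
        layer.extend(combinations(findings, order))
    return layer

findings = ["child", "adolescent", "adult", "senior", "fever", "cough"]
hidden = build_combinatorial_layer(findings, max_order=3)
print(len(hidden))  # 6 + 15 + 20 = 41 combinations -> 41 combinatorial neurons
```

Because the number of combinations grows quickly with the number of findings and with the maximum order, this enumeration is also the source of the memory footprint concern raised at the end of this section.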

The output layer corresponds to the possible classes (hypotheses) to which an example could belong. Combinatorial neurons behave as conjunctions of findings that lead to a certain class. For that reason, the pth combinatorial neuron propagates input values according to a fuzzy AND operator (Equation 2.22), taking as its output the minimum value received by the inputs.

$$y_p = \min_{i \in I_p} s_{ip}(x_i)$$

Equation 2.22 – Fuzzy AND

In Equation 2.22 above, $I_p \subseteq \{1, \ldots, n\}$ indicates the appropriate input neurons, and either $s_{ip}(x_i) = x_i$ or $s_{ip}(x_i) = 1 - x_i$ (fuzzy negation). In the first case the synapse $(i, p)$ is called excitatory and in the latter case inhibitory.

Output neurons group the classification hypotheses by implementing a fuzzy OR operator (Equation 2.23), propagating the maximum value received at their inputs.

$$y_j = \max_{p} \bigl( w_p \, y_p \bigr), \qquad j = 1, \ldots, m$$

Equation 2.23 – Fuzzy OR

In Equation 2.23 above, m indicates the number of output neurons, and $w_p \in [0, 1]$ is the weight associated with the connection from the pth combinatorial neuron to the output neuron; the maximum is taken over the combinatorial neurons p connected to that output neuron.
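A minimal sketch of the propagation implied by Equations 2.22 and 2.23 is given below, assuming only excitatory synapses; the dictionary-based description of a single output neuron's pathways is an illustrative assumption.

```python
# Minimal sketch of CNM forward propagation following Equations 2.22 and 2.23,
# assuming only excitatory synapses and a single output neuron (hypothesis).

def fuzzy_and(inputs, combination):
    """Combinatorial neuron (Eq. 2.22): minimum of the connected findings."""
    return min(inputs[f] for f in combination)

def fuzzy_or(inputs, pathways):
    """Output neuron (Eq. 2.23): maximum of weight * combinatorial-neuron output."""
    return max(w * fuzzy_and(inputs, comb) for comb, w in pathways.items())

inputs = {"adolescent": 0.6, "adult": 0.3, "fever": 1.0}
# Hypothetical pathways feeding one output neuron, with weights in [0, 1].
pathways = {
    ("adolescent", "fever"): 0.9,
    ("adult", "fever"): 0.8,
}
print(fuzzy_or(inputs, pathways))  # max(0.9 * 0.6, 0.8 * 0.3) ~= 0.54
```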

The weight modifications are determined by a supervised algorithm that attempts to minimize the mean square error (Equation 2.24) incurred by the network when presented with a set of examples.

$$MSE(w) = \frac{1}{|E|} \sum_{e \in E} \bigl( \hat{y}(e) - y(e, w) \bigr)^2$$

Equation 2.24 – Mean square error calculation for the new set of weight values

In Equation 2.24 above, w is the weight vector of a CNM network, and the learning is done using a set of examples $E \subset [0,1]^n$. For an example $e \in E$, let $\hat{y}(e)$ denote the desired output, and $y(e, w)$ the output generated by the network with the weight vector w.
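A direct translation of Equation 2.24 could look like the sketch below; `network_output` is an assumed helper standing for the fuzzy AND/OR propagation of Equations 2.22 and 2.23.

```python
# Direct translation of Equation 2.24: the mean squared difference between the
# desired output y_hat(e) and the network output y(e, w) over the example set E.

def mean_square_error(examples, desired, network_output, weights):
    """MSE(w) = (1/|E|) * sum over e in E of (y_hat(e) - y(e, w))^2."""
    return sum((desired[i] - network_output(e, weights)) ** 2
               for i, e in enumerate(examples)) / len(examples)
```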

The Incremental Reward and Punishment (IRP) and the Starter Reward and Punishment (SRP) learning algorithms are based on the concept of rewards and punishments.

The connections between neurons (synapses) have weights and also pairs of accumulators for punishment (Pp) and reward (Rp). Before the training process, in the absence of previous knowledge, all weights are set to one and all accumulators to zero. During training, as each example is presented and propagated, all links that led to the proper classification have their reward accumulators incremented. Similarly, misclassifications increment the punishment accumulators of the paths that led to wrong outputs. Weights remain unchanged during the training process; only the accumulators are incremented.

The training process is generally done in one pass over the training examples. At the end of this sequential pass, the connections that had more punishments than rewards are pruned. The remaining connections have their weights changed using the accumulators.
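The accumulator bookkeeping described above can be sketched as follows. Only the single pass and the pruning of connections with more punishments than rewards come from the text; the amount by which the accumulators are incremented is an assumption made for illustration, and the final weight recomputation (Equations 2.25 and 2.26) is omitted.

```python
# Illustrative sketch of the reward/punishment bookkeeping of CNM training.

class Pathway:
    """A synapse from a combinatorial neuron to an output neuron (hypothesis)."""
    def __init__(self, combination, hypothesis):
        self.combination = combination  # findings feeding the combinatorial neuron
        self.hypothesis = hypothesis    # class this pathway leads to
        self.weight = 1.0               # all weights start at one
        self.reward = 0.0               # Rp accumulator
        self.punishment = 0.0           # Pp accumulator

def train_one_pass(pathways, examples):
    """Single pass over (inputs, correct_class) examples: accumulate, then prune."""
    for inputs, correct_class in examples:
        for p in pathways:
            flow = min(inputs.get(f, 0.0) for f in p.combination)  # fuzzy AND
            if flow == 0.0:
                continue  # this pathway propagated nothing for the example
            if p.hypothesis == correct_class:
                p.reward += flow        # assumed increment: the propagated value
            else:
                p.punishment += flow
    # Connections with more punishments than rewards are pruned; the survivors
    # would then have their weights recalculated from Rp and Pp (Eqs. 2.25, 2.26).
    return [p for p in pathways if p.reward >= p.punishment]
```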

2.4.2.1 The IRP learning

The IRP is used when the CNM neural network already exists and the goal is to add new knowledge to the network by learning new examples. In the case of the IRP, the learning proceeds as follows.

If, for a weight vector w, an example $e \in E$ yields $y(e, w) > \hat{y}(e)$, then every pathway p (a connection between a combinatorial neuron and an output neuron) in the network is rewarded proportionally to:

At the end of each learning iteration, the synapses with Rp < Pp are deleted (pruned), and for the others the accumulators are used to recalculate the value of wp. The remaining pathways with Rp > 0 and Pp = 0 (pathognomonic pathways) have their weights updated based on Equation 2.25:

Equation 2.25 – Pathognomonic pathway weight update

The constant t is an arbitrary acceptance threshold, with t ∈ (0, 1].

The other remaining pathways (ordinary pathways) have their weights updated based on Equation 2.26:

Equation 2.26 – Ordinary pathway weight update

At the end of the iteration, the values of Rp and Pp are passed on to the next iteration, enabling the incremental learning capability.

2.4.2.2 The SRP learning

The SRP learning applies the same general principle of rewards and punishments as the IRP learning, in a one-iteration procedure. Before applying the SRP, the accumulators are set to zero and the weights are set to one. All the data examples are then applied to the network in a single iteration, updating the reward and punishment accumulators. After all examples have been applied, the same weight updates used in the IRP are performed: the non-rewarded pathways are pruned and the others have their weights modified by Equations 2.25 and 2.26 above. The SRP is the starting point for CNM learning; after its application, the IRP can be used to increment the CNM knowledge.

Important aspects to consider on the object model for the CNM are:

• The input neurons simply pass the information through. Usually this information consists of values normalized between 0 and 1, resulting from a pre-processing of the input data.

• The combinatorial neurons implement the fuzzy AND function and the output neurons implement the fuzzy OR function.

• The connections among the neurons are feed-forward; there are no feedback connections, and the network is not fully connected.

• The number of input and output neurons is determined based on the domain knowledge the expert has, and the combinatorial neurons are created as the possible combinations of the input neurons, from order one up to a given maximum order (typically 3). The generation of the combinatorial layer may demand an enormous memory footprint in order to create the neurons and synapses for all the necessary combinations.

• The CNM synapses have reward and punishment accumulators that are used to decide whether the synapse must be pruned or not, and also to recalculate the new value for the synaptic weight.