
4.3 Using Networks of Sigma-Pi Units for the Learning and Query of Redundant Mappings, and for Robot Control

4.3.1 Networks of Sigma-Pi Units

Figure 4.11 shows a schematic view of a network of sigma-pi units. For every input variable $x_i$ there is a set of $t_i$ input units with receptive fields covering the domain of that input, which receive their activations $u_{i,j}$, $j \in \{1, \ldots, t_i\}$, through population coding, as in Equation 4.8. Synaptic connections with associated sigma-pi weights combine exactly one neuron from every set of input units.


Figure 4.11: Schematic view of a network of sigma-pi units, for simplicity in the case of two one-dimensional inputs. Two sets of input neurons have their receptive fields in the domains of the inputs $x_1$ and $x_2$, and receive activations $u_{1,j}$ and $u_{2,k}$, respectively. Possible synaptic connections with associated sigma-pi weights $w_{j,k}$ are shown as a grid of crosses. Each unit in each set of input neurons can be connected to every unit in the remaining sets of input units. Note that there are no lateral connections, and the crosses in the grid only represent possible connections for which the network can store sigma-pi weights.

Thus, if $S_i = \{1, \ldots, t_i\}$ is the set of indices of the input neurons corresponding to the $i$-th input, then each element $s$ of the Cartesian product $S$ of the sets $S_i$,

$$S = S_1 \times \cdots \times S_n \qquad (4.12)$$
$$\;\;\, = \{\, s = (s_1, \ldots, s_n) \mid s_i \in S_i \,\}, \qquad (4.13)$$

corresponds to a possible synaptic connection $w_s$ between the units $u_{1,s_1}, \ldots, u_{n,s_n}$.
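As a concrete illustration (not the exact data structure used in this work), the set $S$ and the weights $w_s$ can be represented by an $n$-dimensional array whose axes correspond to the sets of input units. The sketch below assumes NumPy, two inputs, and $t_1 = t_2 = 7$ units per input, with 0-based indices in place of the 1-based indices of the text:

```python
import itertools
import numpy as np

t = [7, 7]                    # t_i: number of input units for each input variable
W = np.zeros(t)               # one weight w_s for every element s of S = S_1 x ... x S_n

# S is the Cartesian product of the (0-based) index sets S_i = {0, ..., t_i - 1}
S = list(itertools.product(*(range(ti) for ti in t)))
assert len(S) == np.prod(t)   # each s = (s_1, ..., s_n) addresses exactly one weight W[s]
```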

Networks of sigma-pi units belong to the class of “higher order” neural networks, as the simple additive units of linear feed-forward neural networks are extended by multiplicative connections. Thus, whereas in first-order neural networks the net input to a unit is given by

$$\mathrm{net}_i = \sum_j w_{ij}\, u_j, \qquad (4.14)$$

for the sigma-pi units the activation function includes the multiplication of inputs,

$$\mathrm{net}_{i,j} = \sum_{s \in S,\, s_i = j} w_s\, u_{1,s_1} \cdots u_{n,s_n} \qquad (4.15)$$
$$\;\;\, = \sum_{s \in S,\, s_i = j} w_s \prod_{m=1}^{n} u_{m,s_m}. \qquad (4.16)$$
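A direct, unoptimized reading of Equation 4.16, assuming the weight array `W` and a list `u` of population-coded activation vectors as sketched above, could look as follows:

```python
import itertools
import numpy as np

def net_input(W, u, i):
    """Net input to every unit j of the i-th input set (Equation 4.16).

    W : n-dimensional weight array of shape (t_1, ..., t_n)
    u : list of n activation vectors, where u[m] has length t_m
    """
    net = np.zeros(W.shape[i])
    # sum over all s in S with s_i = j, weighting the product of activations by w_s
    for s in itertools.product(*(range(t) for t in W.shape)):
        net[s[i]] += W[s] * np.prod([u[m][s[m]] for m in range(W.ndim)])
    return net
```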


The introduction of these multiplicative connections allows units to gate one another (Rumelhart and McClelland, 1986): if one unit has zero activation, then the activations of the other units in the multiplicative connection have no effect.

In the implementation used in this work, the sum and product operators were replaced by the max and min operators, respectively, to avoid the need for normalization of network responses. Thus, the modified net input to a unit is given by

$$\mathrm{net}_{i,j} = \max_{s \in S,\, s_i = j} \left( w_s \cdot \min_{m=1}^{n} u_{m,s_m} \right). \qquad (4.17)$$
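Under the same assumed representation, the max-min variant of Equation 4.17 only swaps the two operators of the sketch above:

```python
import itertools
import numpy as np

def net_input_maxmin(W, u, i):
    """Net input as in Equation 4.17: the sum is replaced by max and the product by min."""
    net = np.zeros(W.shape[i])
    for s in itertools.product(*(range(t) for t in W.shape)):
        value = W[s] * min(u[m][s[m]] for m in range(W.ndim))
        net[s[i]] = max(net[s[i]], value)
    return net
```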

We want the network weights $w_s$, $s \in S$, to reflect the degree to which the input units determined by $s$ tend to be co-active during training. Thus, if the neurons are always activated together, then the weight should adopt a high value, and if one or more units are never active along with the others during training, then the weight should be zero (i.e. there is no connection). If, whenever one of the neurons was active, all of the other neurons determined by $s$ were co-activated only half of the time, then the connection weight should take a value around 0.5.

To achieve this, we can use a simple Hebbian learning rule. Given a training set of tuples of input vectors $(x_1, \ldots, x_n)$, the input neurons are activated according to Equation 4.8. The resulting activations $u_{i,j}$ are used to compute the network activation for all $s \in S$ (cf. Equation 4.16), which is then used to update the network weights as

$$\delta w_s = \lambda \cdot \prod_{m=1}^{n} u_{m,s_m} \qquad (4.18)$$

with learning rate λ. Again, for the implementation used in this work, the product operator was replaced with the min operator, giving

$$\delta w_s = \lambda \cdot \min_{m=1}^{n} u_{m,s_m}. \qquad (4.19)$$

Furthermore, one-shot learning from training samples can be realized by omitting $\lambda$ in the update rule (i.e., setting $\lambda = 1$) and updating the weights according to

$$w_s^t = \max\left(w_s^{t-1},\, \delta w_s^t\right), \qquad (4.20)$$

where $w_s^{t-1}$ is the weight before, and $w_s^t$ the weight after, processing the $t$-th training sample.
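A minimal sketch of this one-shot update (Equations 4.19 and 4.20 with $\lambda = 1$), again assuming the NumPy weight array from above; the broadcasting construction of the element-wise minimum is an implementation choice made here for brevity, not taken from the text:

```python
import numpy as np

def one_shot_update(W, u):
    """One-shot Hebbian update: w_s <- max(w_s, min_m u_{m,s_m}) for every s in S
    (Equations 4.19 and 4.20 with lambda = 1)."""
    n = W.ndim
    # delta[s] = min over m of u[m][s_m], built by broadcasting the activation vectors
    delta = np.asarray(u[0]).reshape([-1] + [1] * (n - 1))
    for m in range(1, n):
        shape = [1] * n
        shape[m] = -1
        delta = np.minimum(delta, np.asarray(u[m]).reshape(shape))
    np.maximum(W, delta, out=W)   # Equation 4.20: keep the larger of old weight and update
    return W
```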

After training, we want to query the network by specifying a task description in terms of target values for one or several input variables, and to retrieve possible values for the remaining variables in the form of population codes representing sets of redundant solutions. We specify which variables to constrain by defining a set $Q \subseteq \{1, \ldots, n\}$ containing the indices of the respective input domains. Given the notation for $S$ in Equation 4.13 and the net input in Equation 4.16, we can formulate the network query as

$$\tilde{u}_{i,j} = \sum_{s \in S,\, s_i = j} w_s \prod_{m \in Q} u_{m,s_m}, \qquad (4.21)$$


Figure 4.12: Graphical interpretation of the network query, using the example of a network that has learned the function $y = x^2$ in the interval $x \in [-2, 2]$. To retrieve all solutions for $x^2 = 1$, the population of input neurons in the $y$ domain is activated for $y = 1$ (see bottom right plot). This activation is fed into the network and, after multiplication with the synaptic weights, corresponds to an activation along the synaptic connections, shown in the top left plot. The net input to each unit in the readout is computed by accumulating this activity from all points on the grid to which the unit is connected.

where $u_{i,j}$ is the input activation that was specified in the query for all units in the sets of input units given by $Q$, and $\tilde{u}_{i,j}$ is the activation value that is retrieved from the network in response to the query. The activation of a unit is a sum over all elements $s$ of the Cartesian product $S$ in which that unit is itself a member, i.e. $s_i = j$. For each of these elements we compute the product of the activations of the units that were specified in the query, weighted by the connection weight $w_s$. Again, a modified version was used in the implementation for this work, which is

$$\tilde{u}_{i,j} = \max_{s \in S,\, s_i = j} \left( w_s \cdot \min_{m \in Q} u_{m,s_m} \right). \qquad (4.22)$$
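A corresponding sketch of the query of Equation 4.22, for a set `Q` of constrained input domains (same assumed representation as above; the sum-product form of Equation 4.21 is obtained by replacing max and min with sum and product):

```python
import itertools
import numpy as np

def query(W, u, Q, i):
    """Retrieve activations for the units of input set i (Equation 4.22).

    Q : indices of the input domains whose activations were specified in the query;
        u[m] only needs to be provided for m in Q.
    """
    out = np.zeros(W.shape[i])
    for s in itertools.product(*(range(t) for t in W.shape)):
        value = W[s] * min(u[m][s[m]] for m in Q)
        out[s[i]] = max(out[s[i]], value)
    return out
```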

There is an intuitive graphical interpretation of this query, see Figure 4.12. In the simple case where there are only two sets of input neurons, the network weights can be arranged in a planar grid, such that each node in the grid represents the multiplication of two neurons (cf. also Figure 4.11). Thus, all nodes in the grid together represent the Cartesian product of the sets of input neurons. If in a query we specify activations of the input neurons in one domain and compute the product of these activations with the associated weights, we obtain an activation along the synaptic connections, with one value for each node in the grid. In the next step, we accumulate for each unit all the values along the line of nodes connected to that unit, which gives us the retrieved activation value of the unit as the response to the query. The same picture can easily be extended to the higher-dimensional case, in which there are more than two inputs or the inputs have more than a single dimension. Here, the network weights are arranged in a hypercube instead of a grid, and the line of nodes corresponds to a slice through the hypercube.
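To make the example of Figure 4.12 concrete, the following hypothetical usage sketch reuses the `one_shot_update` and `query` sketches from above. The Gaussian tuning curves, the choice of receptive-field centers, and the readout threshold are assumptions made purely for illustration; the population coding actually used in this work is defined by Equation 4.8.

```python
import numpy as np

# Assumed population code: Gaussian tuning curves (not Equation 4.8 itself).
def encode(value, centers, sigma=0.25):
    return np.exp(-0.5 * ((value - centers) / sigma) ** 2)

cx = np.linspace(-2.0, 2.0, 21)           # receptive-field centers in the x domain
cy = np.linspace(0.0, 4.0, 21)            # receptive-field centers in the y domain
W = np.zeros((cx.size, cy.size))

for x in np.linspace(-2.0, 2.0, 401):     # one-shot training on samples of y = x^2
    one_shot_update(W, [encode(x, cx), encode(x ** 2, cy)])

# Query: constrain the y domain to y = 1 (Q = {1}) and read out the x population.
# The retrieved population code should show two bumps, near x = -1 and x = +1.
readout = query(W, [None, encode(1.0, cy)], Q=[1], i=0)
print(cx[readout > 0.5 * readout.max()])
```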



Figure 4.13: Schematic drawing of the kinematic chain used for the analyses described in the text. The chain is composed of $n$ segments, with $n \in \{2, 3, 4\}$, ending with either the blue, green, or red segment, respectively. All segments are of equal length $l$, and the total length of the chain is 1.


4.3.2 Evaluation of the Sparsity in Networks of Sigma-Pi Units when