
Chasing behavior with minimalistic control

In the document Neural dynamics of social behavior (pages 97-101)

4.3.1 Evolutionary setup

To evolve a basic chasing behavior we used the MRC (minimal recurrent controller) as initial RNN (cf. Figure 4.2a). This RNN, adapted from (Hülse et al., 2004) and originally evolved for the Khepera robot (Hülse and Pasemann, 2002), exhibits a highly robust obstacle avoidance and exploration behavior. Thus, we equipped one non-evolving robot with this RNN and let it continuously emit a sound signal, making it the moving target for the other, evolving robots. The task of these robots was to minimize the distance between themselves and the target. For this purpose we used the following fitness function:

F = \frac{1}{n} \sum_{i=1}^{n} \left( 1 - \frac{r_i}{1.5} \right), \qquad (4.3)

where n is the number of robots which have to follow the target and r_i is the distance between robot i and the target. Since the maximal detection range for sound signals is 1.5 meters, r_i was thresholded to this value. Thus, if r_i is larger than the threshold, robot i does not contribute to the fitness of the group.
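As a concrete reading of Eq. (4.3), the following minimal sketch (function and variable names are ours) computes the group fitness from the robot-target distances, clipping each distance to the 1.5 m detection range:

```python
def chasing_fitness(distances, r_max=1.5):
    """Fitness of Eq. (4.3): the average of (1 - r_i / r_max) over all
    robots, with each distance r_i thresholded to the maximal sound
    detection range r_max, so robots farther away than r_max contribute
    zero to the group fitness."""
    clipped = [min(r, r_max) for r in distances]
    return sum(1.0 - r / r_max for r in clipped) / len(clipped)
```

A robot sitting on the target contributes 1, a robot at or beyond 1.5 m contributes 0, and the group score is their mean.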

The evolutionary strategy is akin to the experiment described in Section 3.4.1 (p. 78). That is, one RNN from the evolving population was copied six times to control each of the six robots which are part of the group. Then the average performance of these six robots was taken to determine the selection criteria for this particular RNN. Note that this corresponds to an explicit averaging of the fitness function (cf. Section 2.5.4, p. 61).
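This evaluation scheme can be sketched as follows (the function names and the simulator interface are our assumptions, not part of the original setup):

```python
def evaluate_genotype(rnn, run_trial, group_size=6):
    """Explicit fitness averaging: the same RNN genotype controls each of
    the `group_size` robots in one trial, and the genotype's fitness is
    the mean of the per-robot scores. `run_trial` stands in for the
    simulator: it takes a list of controllers and returns one score per
    robot."""
    controllers = [rnn] * group_size  # six identical copies of one genotype
    scores = run_trial(controllers)
    return sum(scores) / len(scores)
```

Averaging over six homogeneous robots makes the selection criterion reflect the group-level performance of a single genotype rather than the luck of one individual.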

[Figure 4.2 graphic: (a) initial structure and (b) evolved solution, shown as network diagrams with synaptic weights; (c) snapshots 1-6 of the chasing behavior. I1, I2: left, right infrared sensor; I3, I4: left, right sound signal; O1, O2: left, right wheel speed.]
Figure 4.2: Neural network realizing chasing behavior. The initial (a) and the resulting evolved RNN (b) which realizes a chasing behavior as indicated in the right panel (c): The black robot (controlled by the initial RNN) continuously emits a sound signal and is, therefore, the moving target for the gray robots (controlled by the evolved RNN). Snapshot 1 shows the starting position of each robot. The interval between subsequent snapshots is 10 seconds of simulated real time (i.e., 100 time steps in simulation).

We have to admit that this fitness function does not rely on internal variables (cf. Section 2.5.5), because we used global variables which were not accessible to the agents themselves. However, this was done because the sound sensor is not able to measure the intensity of a signal and therefore cannot determine the distance to its source. Nevertheless, the fitness function is still implicit because it does not describe how to solve the required sub-behaviors, such as avoiding obstacles or other robots, or how to approach and follow the moving target.

To bootstrap the system the MRC was provided as an initial structure (Figure 4.2a).

Then, we used a so-called semi-restrictive method (Hülse et al., 2004; Hülse, 2007); that is, already existing structural elements are not allowed to be removed (even though their parameters can change), but new structural elements can be added anywhere in the network, which has new, initially unconnected, sensory inputs (the sound sensors in this case).
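A semi-restrictive variation step can be sketched as follows (the representation and parameter names are our own illustration, not the authors' implementation):

```python
import random

def semi_restrictive_mutate(weights, neurons, sigma=0.1, p_add=0.05, rng=None):
    """Sketch of a semi-restrictive variation step: existing synapses are
    never removed -- only their weights are perturbed -- while a new
    synapse may occasionally be added anywhere in the network, including
    from initially unconnected sensory inputs. `weights` maps (src, dst)
    neuron pairs to synaptic strengths."""
    rng = rng or random.Random()
    # Perturb parameters of all existing structural elements.
    mutated = {k: w + rng.gauss(0.0, sigma) for k, w in weights.items()}
    # Occasionally add a new structural element (never overwrite one).
    if rng.random() < p_add:
        src, dst = rng.choice(neurons), rng.choice(neurons)
        if (src, dst) not in mutated:
            mutated[(src, dst)] = rng.gauss(0.0, sigma)
    return mutated
```

The key invariant is that every synapse present before mutation is still present afterwards, so the initial obstacle avoidance structure can never be destroyed, only retuned and extended.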

4.3.2 Neural mechanisms

Figure 4.2b shows a successfully evolved RNN. Compared to the initial structure (Figure 4.2a), we see only minor changes of the original synaptic weights and only two new synaptic connections between the sound-detecting input neurons and the motor neurons. This very small module, applied to a group of six robots, realizes a highly robust chasing behavior (Figure 4.2c), which can be described as the integration of two tropisms: a negative one to avoid obstacles and other robots, and a positive one to follow the sound signal.

The most important structural elements of this RNN are the strong positive self-connections at O1 and O2 and the even loop between these two neurons. Such elements can exhibit hysteresis effects as a result of bi-stability (for a profound mathematical discussion see Pasemann, 1997, 2002). How such hysteresis effects realize robust obstacle avoidance behavior has already been discussed elsewhere (Hülse and Pasemann, 2002; Hülse et al., 2004; Hülse, 2007). However, we briefly summarize the main properties and features here because they are fundamental to the behavior in our case as well.

[Figure 4.3 graphic: bifurcation diagrams of the motor outputs o1 and o2; left panel: obstacle avoidance (varying i1, i2), right panel: sound following (varying i3, i4), with inset diagrams for i1 = 0.8 and i2 = 0.8.]

Figure 4.3: Neural mechanisms of the chasing behavior. Shown are bifurcation diagrams for the motor neurons O1 and O2 while varying all sensory inputs (inputs that were not varied were set to zero, except for the inset diagrams in the right panel, where one infrared sensor was always set to a value different from zero, as indicated).

First, we want to explain the obstacle avoidance behavior with the help of the bifurcation diagrams given in the left panel of Figure 4.3. When we vary the input of I1 (the left infrared sensor), we see that o1, which controls the left wheel, stays in the upper saturation domain of the activation function over the whole input range. Thus, the left wheel rotates forward with maximum speed. Considering o2, which controls the speed of the right wheel, we see that it jumps between the upper and the lower saturation domain at specific input values. We also see that the jump from the upper to the lower domain occurs at a higher input value than the jump back from the lower to the upper domain. Between these two values the system is bi-stable: it ends in either one of two fixed points depending on whether the input values are increasing or decreasing. That means the behavior of the system does not only depend on its current state, but also on its history.
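This history dependence can be reproduced with a minimal single-neuron sketch. The weights below are illustrative toy values of our own choosing, not the evolved weights of Figure 4.2b; the qualitative ingredients are the same: a sigmoid neuron with a strong positive self-connection and an inhibitory sensory input.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def settled_output(i, o, w_self=6.0, w_in=-5.0, bias=-1.5, steps=500):
    """Iterate a single sigmoid neuron with a strong positive
    self-connection until its output settles. The neuron is bi-stable
    over a band of inputs, so the fixed point it reaches depends on the
    previous output `o`, i.e. on the neuron's history."""
    for _ in range(steps):
        o = sigmoid(bias + w_self * o + w_in * i)
    return o

def sweep(inputs, o):
    """Settled output while the sensory input slowly follows `inputs`."""
    outs = []
    for i in inputs:
        o = settled_output(i, o)
        outs.append(o)
    return outs

# Sweep the input up, then back down. The output jumps from the upper to
# the lower saturation domain at a HIGHER input value than the one at
# which it jumps back up: a hysteresis loop.
grid = [k / 100.0 for k in range(101)]
ups = sweep(grid, o=1.0)
downs = sweep(list(reversed(grid)), o=ups[-1])
jump_down = next(i for i, o in zip(grid, ups) if o < 0.5)
jump_up = next(i for i, o in zip(reversed(grid), downs) if o > 0.5)
```

For these toy parameters the downward jump occurs near i = 0.39 while the upward jump occurs near i = 0.21; inputs between the two thresholds leave the output wherever its history put it.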

To get an impression of how this is related to the behavior of a robot, let us suppose a robot is approaching an obstacle to its left side. The closer the robot comes to the obstacle, the higher i1 gets. At a specific value of i1, o2 jumps from 1.0 to 0.0, leading to a backward rotation of the right wheel, which turns the robot away from the obstacle.

That in turn decreases i1, but o2 jumps back from 0.0 to 1.0 at a lower value of i1. The width of this hysteresis domain is determined by the strength of the self-connection at O2 and the synaptic strength of the even loop between O1 and O2. The larger the hysteresis domain, the larger the turning angle of the robot away from an obstacle

(see Hülse et al., 2004 for a deeper discussion of this correlation). The same can be found when we change i2, the right infrared sensor value, but this time we find a bi-stable region for o1, resulting in a turn to the left when an obstacle is detected on the right side of the robot.

Two main properties explain why this mechanism ensures a robust obstacle avoidance behavior (Hülse et al., 2004):

• Noise is filtered efficiently because small and fast changes of the sensory input do not result in small and fast changes of the output.

• Because the turning continues even though the input value is already lower than the value which initiated the turn, the robot is able to escape even acute angles or dead ends in the environment.
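Both properties can be illustrated by abstracting the hysteresis loop as a Schmitt trigger with two switching thresholds (the threshold values here are illustrative, not taken from the evolved network):

```python
def schmitt(inputs, low=0.2, high=0.4, state=1.0):
    """Abstract the hysteresis loop as a Schmitt trigger: the output only
    switches when the input crosses the FAR threshold for its current
    state. Small, fast fluctuations between `low` and `high` are filtered
    out, and a turn triggered by crossing `high` continues until the
    input has dropped all the way below `low`."""
    out = []
    for i in inputs:
        if state > 0.5 and i > high:
            state = 0.0  # turn starts: input exceeded the upper threshold
        elif state < 0.5 and i < low:
            state = 1.0  # turn ends only once input drops below the lower one
        out.append(state)
    return out
```

An input jittering between 0.25 and 0.35 never changes the output, while an input that rises above 0.4 keeps the output switched until it falls below 0.2, which is exactly the turn-until-clear behavior described above.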

The very same mechanism realizes the tropism toward a target, except that this time the robot is not turning away from the source of sensory changes, but adjusts its heading direction toward it (compare the left and right panels of Figure 4.3). The reason for this inverse reaction becomes obvious when we consider the connections from the different sensor modalities (infrared and sound sensors) to the motor neurons (cf.

Figure 4.2b).

Thus, two different behaviors are integrated into one control unit where the very same dynamical properties are used to realize either a positive or a negative tropism depending on the coupling between sensory input and motor output.

Which of the competing behaviors becomes apparent is determined by how strongly the different sensor modalities are connected to the control system. When the robot is confronted with both, that is, it detects an obstacle and a target signal at the same time and on the same side, the dominant behavior is determined by the strength of the synaptic weights projecting from the different sensor modalities. As we can see in Figure 4.2b, the synapses from the infrared sensors are stronger than the synapses from the sound sensors. The insets in the right panel of Figure 4.3 illustrate two occasions of conflict. For instance, when an obstacle on the left side of a robot is detected (high i1) and we vary i3 (representing sound signals to the left side of the robot), we see that the hysteresis domain of o1 is shifted to the right in such a way that o1 actually does not jump to the lower saturation domain, at least not within the range of values of i3. That means an orientation to the left is inhibited when, at the same time, the left infrared sensor detects close obstacles; the same holds for the other side of the robot (see the insets in the right panel of Figure 4.3 for varying i4). This means obstacle avoidance is always the dominant behavior, and heading toward a sound signal only takes place when the values of the infrared sensors are below a certain threshold.
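The dominance relation can be reduced to a one-line observation about the net sensory drive onto a motor neuron. The weights below are illustrative; only the qualitative relation |w_ir| > |w_sound| is taken from the evolved network:

```python
def motor_drive(i_ir, i_sound, w_ir=-50.0, w_sound=12.0):
    """Net sensory drive onto a motor neuron from an infrared and a sound
    input on the same side. Because the (inhibitory) infrared synapse is
    much stronger than the (excitatory) sound synapse, even a maximal
    sound signal cannot outweigh a strong obstacle signal, so obstacle
    avoidance dominates. (Weights are illustrative.)"""
    return w_ir * i_ir + w_sound * i_sound
```

With a close obstacle (i_ir = 0.8) the drive stays strongly negative even for a maximal sound signal, whereas with free infrared sensors the sound input alone determines the sign of the drive.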

Now that we have clarified the neural mechanism of the chasing and obstacle avoid-ance behavior, the next section will discuss what happens if we add a speaker to all robots (i.e., each robot becomes a potential target) and increase their number. Thus, we change a property of the robot, but we do not change anything concerning the control architecture.

[Figure 4.4 graphic: (a) development of the radial distribution function (rdf) over distance and time; (b) rdf snapshots at t = 0 and t = 100 time steps; (c) snapshots 1-8 of the aggregation process.]

Figure 4.4: Clustering of 120 robots. The top panel indicates the average number of neighbors of an individual at specific distances by means of a radial distribution function (see text for details). Shown is its development over time (a) and a snapshot at the beginning and after 100 time steps (b). The bottom panel (c) visualizes the aggregation process. Snapshot 1 shows the initial position of each robot. The interval between subsequent snapshots is ten seconds of simulated real time (i.e., 100 time steps in simulation).
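A simple, unnormalized reading of such a radial distribution function can be computed from robot positions as follows (bin width and range are our illustrative choices):

```python
import math

def radial_distribution(positions, bin_width=0.5, max_dist=8.0):
    """Histogram of pairwise distances: for each distance bin, the
    average number of neighbors a robot has at that range. This is a
    simple, unnormalized variant of the radial distribution function
    plotted in Figure 4.4."""
    n_bins = int(max_dist / bin_width)
    counts = [0] * n_bins
    n = len(positions)
    for a in range(n):
        for b in range(n):
            if a == b:
                continue
            xa, ya = positions[a]
            xb, yb = positions[b]
            d = math.hypot(xa - xb, ya - yb)
            if d < max_dist:
                counts[int(d / bin_width)] += 1
    return [c / n for c in counts]  # neighbors per robot, per distance bin
```

As the robots aggregate, mass accumulates in the short-distance bins of this histogram, which is exactly the shift visible between the t = 0 and t = 100 curves in the figure.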
