
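$$\int_{-\infty}^{f_{\max}} \frac{1}{\sqrt{2\pi v}}\, e^{-\frac{(x-m)^2}{2v}}\, dx = \mathrm{erf}_{m,v}(f_{\max}) = \frac{N-1}{N}$$

and obtain

$$f_{\max} = \mathrm{erf}_{m,v}^{-1}\left(1 - \frac{1}{N}\right).$$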
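As a minimal numerical sketch of this inversion, the function written erf with subscripts m and v is read here as the cumulative distribution function of a normal distribution with mean m and variance v; that reading is an assumption. The function name `predicted_best_fitness` and the example values for mean and variance are hypothetical and not taken from the text.

```python
from statistics import NormalDist


def predicted_best_fitness(mean: float, variance: float, pop_size: int) -> float:
    """Return f_max such that the normal CDF at f_max equals 1 - 1/N."""
    if pop_size < 2:
        raise ValueError("pop_size must be at least 2")
    if variance <= 0:
        raise ValueError("variance must be positive")
    dist = NormalDist(mu=mean, sigma=variance ** 0.5)
    # inv_cdf is the quantile function, i.e. the inverse of the CDF.
    return dist.inv_cdf(1.0 - 1.0 / pop_size)


# Hypothetical mean and variance; N = 5000 individuals.
f_max = predicted_best_fitness(mean=0.6, variance=0.0006, pop_size=5000)
```

The larger the population, the further the predicted best fitness lies above the mean, because the quantile 1 - 1/N moves deeper into the upper tail.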

4.1.4 Testing the predictions

Predictions are expected to be most accurate for a very large population size and a rather small selection coefficient. If the population size is too small, the actual fitness distribution cannot be approximated by a normal distribution. Similarly, a big selection coefficient in a finite population leads to an actual distribution of fitness after selection which is qualitatively different from a normal distribution, because in a finite population, the probability for an individual to have fitness above a certain threshold is 0. In an approximation based on normal distributions, however, all fitness values have a positive probability to occur. The outcome of selection in a finite population, therefore, is not simply a shift of the mean of the fitness distribution, but depends on the actually occurring fitness values.

Figure 4.1: Actual (thin line) and predicted (thick line) fitness distribution in a population of 5000 individuals in generation 0, 10, 20, 30 in the upper row and generation 40, 50, 100 and 200 in the lower row. The actual fitness distributions are derived from a single run of the evolutionary algorithm. (Eight panels; horizontal axis of each panel: f, from 0.5 to 1.)

In order to test the analytic predictions, the parameters of the evolutionary algorithm are set as follows: the population size is P = 5000 and the selection coefficient is S = 5. The predictions are based on the two recursion relations for mean and variance of fitness and thus only on the functions $f_m(x)$ and $f_v(x)$ for mean and variance of mutant fitness for parent fitness x. The best predictions are obtained when using not a single linear function for each of $f_m(x)$ and $f_v(x)$, but a set of 10 linear functions for different fitness ranges. These functions are derived from the correlation matrix by linear interpolation between the points of mean and variance of mutant fitness for all different classes. The 10 linear functions per parameter are nearly equal to each other, but the slight differences nevertheless matter for the accuracy of the predictions.
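The piecewise-linear construction described above can be sketched as follows. This is only an illustration under stated assumptions: the helper `make_piecewise_linear`, the 11 class centres, and all numerical values are hypothetical, and the real points would come from the correlation matrix, which is not reproduced here. Eleven points give ten linear pieces.

```python
from bisect import bisect_right


def make_piecewise_linear(xs, ys):
    """Return a function that interpolates linearly between (xs[i], ys[i])."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two points")
    if any(b <= a for a, b in zip(xs, xs[1:])):
        raise ValueError("xs must be strictly increasing")

    def f(x):
        # Index of the segment containing x, clamped to the first and last
        # segment so that values outside the range are extrapolated linearly.
        i = min(max(bisect_right(xs, x) - 1, 0), len(xs) - 2)
        slope = (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
        return ys[i] + slope * (x - xs[i])

    return f


# Hypothetical per-class statistics (made-up values, not from the text).
class_fitness = [0.40 + 0.05 * k for k in range(11)]
mutant_mean = [0.95 * x + 0.01 for x in class_fitness]
mutant_var = [0.0008 - 0.0004 * x for x in class_fitness]

f_m = make_piecewise_linear(class_fitness, mutant_mean)  # mean of mutant fitness
f_v = make_piecewise_linear(class_fitness, mutant_var)   # variance of mutant fitness
```

Calling `f_m(0.62)` or `f_v(0.62)` then evaluates whichever of the ten linear pieces covers the parent fitness 0.62.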

Figure 4.1 shows a comparison of the development of actual and predicted fitness distribution over generations. Actual and predicted mean and best fitness in the population evolving over 200 generations are depicted in Figure 4.2. The statistics of actual behavior of the evolutionary algorithm are derived from a single run. For high population size, different runs of the actual process do not show important differences.

Figure 4.2: Actual (thin line) and predicted (thick line) mean and best fitness in a population of individuals evolving on a TSP fitness landscape. The predictions are based on an analytic description of the evolutionary dynamics. (Two panels: mean fitness and best fitness, each from 0.4 to 0.8, plotted against generation up to 200.)

The accuracy of the predictions is remarkable. This shows not only that the TSP fitness landscape can be described by one-dimensional correlation statistics. It also demonstrates that the correlation matrix itself can be approximated in a very simple way. We reduced the high-dimensional fitness landscape to two sets of linear functions, which allow for an analytic prediction of the evolutionary process. In the next section this method of analytic description is applied to the NKp fitness landscapes of low neutrality.

4.2 NKp fitness landscapes

In an NKp fitness landscape, the fitness of a genotype is on average a sum of $N(1-p)$ random numbers. For not very high values of p, the fitness distribution therefore tends to be normal. Under mutation, on average $(K+1)(1-p^2)$ bits of the genotype vector change their contribution and mutant fitness is likewise approximately normally distributed, if p is not very high.

In the NKp landscape of low neutrality (N = 40, K = 3, and p = 0.3) that was investigated in Section 2.2, $N(1-p) = 28$ and $(K+1)(1-p^2) = 3.64$. Consequently, the fitness distribution, as well as mutant fitness for most fitness classes, is approximated quite well by a normal distribution (see Section 2.2 and Figure 2.30).
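The two quantities quoted for this landscape follow directly from the formulas of the preceding paragraph. The short check below reproduces the arithmetic; the function name `nkp_expected_counts` is hypothetical.

```python
def nkp_expected_counts(n: int, k: int, p: float):
    """Return (N(1 - p), (K + 1)(1 - p^2)) for an NKp landscape."""
    contributing = n * (1 - p)        # average number of summed random numbers
    changed = (k + 1) * (1 - p ** 2)  # average number of bits changing contribution
    return contributing, changed


contributing, changed = nkp_expected_counts(n=40, k=3, p=0.3)
print(round(contributing, 2), round(changed, 2))  # prints: 28.0 3.64
```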

Figure 4.3: The functions $f_m(x)$ and $f_v(x)$ for mean and variance of neighbor fitness depending on parent fitness x in the NKp landscape of low neutrality are linear in a good approximation. In the first picture the identity function is depicted in gray. (Two panels: $f_m$ against x, both from 0.2 to 0.5, and $f_v$ from 0.0005 to 0.0006 against x from 0.2 to 0.5.)

The analytic predictions for the evolution of the fitness distribution that we have developed for TSP landscapes can therefore, in principle, also be applied to the NKp landscape of low neutrality investigated in Section 2.2, as a one-dimensional correlation description has been successful for this landscape. The population size is set to P = 5000 and we apply the stochastic selection scheme already used for the evolutionary algorithm of the TSP (see the previous section). The selection coefficient is set to S = 10.

4.2.1 Testing the predictions

For each of the functions $f_m(x)$ and $f_v(x)$ of mean and variance of mutant fitness for parent fitness x, more than one linear function is used. As in the case of the TSP in the previous section, predictive accuracy improves when 10 linear functions for different fitness ranges are taken for both $f_m(x)$ and $f_v(x)$. They are derived from the correlation matrix by linear interpolation between mean values and variances of mutant fitness for different classes, see Figure 4.3. With the assumption of normally distributed mutant fitness, the correlation matrix now has a continuous representation by means of the functions $f_m(x)$ and $f_v(x)$.
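To illustrate what such a continuous representation allows, the sketch below propagates mean and variance through one mutation step. It rests on assumptions that go beyond the text: a single linear piece for each function, with mean a·x + b and variance c·x + d of mutant fitness, and a parent fitness distribution with mean m and variance v. Under these assumptions the laws of total expectation and total variance give a mutant mean of a·m + b and a mutant variance of c·m + d + a²·v. This is not claimed to be the recursion relations used for the predictions, selection is left out, and all coefficients are hypothetical.

```python
import random


def mutant_moments(m, v, a, b, c, d):
    """Mean and variance of mutant fitness after one mutation step.

    Assumes mutant mean a*x + b and mutant variance c*x + d for parent
    fitness x, with parent fitness having mean m and variance v.
    """
    mean = a * m + b
    var = (c * m + d) + a * a * v  # law of total variance
    return mean, var


def simulate(m, v, a, b, c, d, samples=200_000, seed=1):
    """Monte Carlo check: draw a normal parent, then a normal mutant."""
    rng = random.Random(seed)
    ys = []
    for _ in range(samples):
        x = rng.gauss(m, v ** 0.5)
        sigma = max(c * x + d, 0.0) ** 0.5  # guard against a negative variance
        ys.append(rng.gauss(a * x + b, sigma))
    mean = sum(ys) / samples
    var = sum((y - mean) ** 2 for y in ys) / samples
    return mean, var


# Hypothetical coefficients, chosen only for illustration.
params = dict(m=0.4, v=0.002, a=0.9, b=0.03, c=-0.0002, d=0.0007)
predicted = mutant_moments(**params)
```

Comparing `predicted` with `simulate(**params)` shows the closed-form moments and the sampled moments agreeing up to sampling noise.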

A comparison of actual and predicted evolution of fitness distribution is presented in Figure 4.4. The actual distribution is derived from a single run. For a population size of P = 5000, different runs of the evolutionary algorithm do not show important differences. A comparison of best and mean fitness per generation in one run of the actual algorithm and the analytic predictions is presented in Figure 4.5.

Figure 4.4: Actual (thin line) and predicted (thick line) fitness distribution in a population of 5000 individuals in generation 0, 20, 40 in the upper row and generation 60, 80 and 100 in the lower row. (Six panels; horizontal axis of each panel: f, with ticks at 0.2, 0.4 and 0.6.)

Figure 4.5: Actual (thin line) and predicted (thick line) mean and best fitness for evolution in an NKp landscape. The actual statistics are derived from a single run, whereas the predictions are based on an analytic approximation of the average dynamics on the level of fitness. (Two panels: mean fitness and best fitness, each from 0.3 to 0.6, plotted against generations up to 100.)

The predictions, which are only based on the two sets of linear functions for the mean and variance of mutant fitness, are very close to the actual dynamics. An analytic approximation of evolutionary dynamics based on normal distributions was successful for the TSP landscape and the NKp landscape of low neutrality. In the next subsection, however, some ideas for improving predictive accuracy and for more general approximations are presented.