$$\ldots\; e^{-\frac{(x-m)^2}{2v}}\,dx = \operatorname{erf}_{m,v}(f_{\max}) = \frac{N-1}{N}$$

and obtain

$$f_{\max} = \operatorname{erf}_{m,v}^{-1}\left(1 - \frac{1}{N}\right).$$
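As a rough illustration of the last formula, the following Python sketch evaluates it numerically. It assumes that erf_{m,v} denotes the cumulative distribution function of a normal distribution with mean m and variance v, as the integrand suggests. The function name `predicted_best_fitness` and the example values are hypothetical and not taken from the text.

```python
from statistics import NormalDist


def predicted_best_fitness(mean: float, variance: float, pop_size: int) -> float:
    """Return the fitness below which a fraction (N - 1)/N of a normal
    distribution with the given mean and variance lies.

    Assumption: erf_{m,v} in the text is the normal cumulative distribution
    function, so its inverse is the normal quantile function.
    """
    if pop_size < 2:
        raise ValueError("pop_size must be at least 2")
    if variance <= 0:
        raise ValueError("variance must be positive")
    dist = NormalDist(mu=mean, sigma=variance ** 0.5)
    return dist.inv_cdf(1.0 - 1.0 / pop_size)


if __name__ == "__main__":
    # Illustrative values only; they do not come from the text.
    print(predicted_best_fitness(mean=0.5, variance=0.01, pop_size=5000))
```

For fixed mean and variance, the predicted best fitness grows with N, because the quantile 1 − 1/N moves further into the upper tail.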

4.1.4 Testing the predictions

Predictions are expected to be most accurate for a very large population size and a rather small selection coefficient. If the population size is too small, the actual fitness distribution cannot be approximated by a normal distribution. Similarly, a large selection coefficient in a finite population leads to an actual distribution of fitness after selection which is qualitatively different from a normal distribution, because in a finite population the probability for an individual to have fitness above a certain threshold is 0. In an approximation based on normal distributions, however, all fitness values have a positive probability of occurring. The outcome of selection in a finite population is therefore not simply a shift of the mean of the fitness distribution, but depends on the fitness values that actually occur.

Figure 4.1: Actual (thin line) and predicted (thick line) fitness distribution in a population of 5000 individuals in generations 0, 10, 20, 30 in the upper row and generations 40, 50, 100 and 200 in the lower row. The horizontal axis of each panel is the fitness f. The actual fitness distributions are derived from a single run of the evolutionary algorithm.
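The contrast between a finite population and its normal approximation can be made concrete with a small Python sketch. The population here is simply sampled from a normal distribution with arbitrarily chosen mean 0.5 and variance 0.01; these values and the helper names `empirical_tail` and `normal_tail` are assumptions for illustration only.

```python
import random
from statistics import NormalDist


def empirical_tail(sample, threshold):
    """Fraction of individuals in the sample with fitness above the threshold."""
    return sum(1 for f in sample if f > threshold) / len(sample)


def normal_tail(mean, variance, threshold):
    """Probability mass of a normal distribution above the threshold."""
    return 1.0 - NormalDist(mean, variance ** 0.5).cdf(threshold)


# Hypothetical finite population of 5000 individuals (seeded for repeatability).
rng = random.Random(0)
population = [rng.gauss(0.5, 0.1) for _ in range(5000)]
best = max(population)

# No individual lies above the best one, so the empirical tail vanishes there,
# while the normal approximation still assigns positive probability.
print(empirical_tail(population, best))
print(normal_tail(0.5, 0.01, best))
```

Selection acting on the sampled population can only redistribute weight among the fitness values that are present, whereas selection acting on the normal approximation also draws on the tail beyond the best individual.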

In order to test the analytic predictions, the parameters of the evolutionary algorithm are set as follows: the population size is P = 5000 and the selection coefficient is S = 5. The predictions are based on the two recursion relations for mean and variance of fitness and thus only on the functions f_m(x) and f_v(x) for mean and variance of mutant fitness for parent fitness x. In fact, the best predictions are obtained when f_m(x) and f_v(x) are each represented not by a single linear function but by a set of 10 linear functions for different fitness ranges. These functions are derived from the correlation matrix by linear interpolation between the points of mean and variance of mutant fitness for all different classes. The 10 linear functions per parameter are nearly equal to each other, but the slight differences between them are nevertheless important for the accuracy of the predictions.
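The construction of a set of linear functions by interpolation between class points can be sketched as follows. This is a minimal Python illustration, not the implementation used for the results: the helper `make_piecewise_linear` and all numerical values are hypothetical, and 11 class points are assumed so that 10 linear pieces result.

```python
from bisect import bisect_right


def make_piecewise_linear(xs, ys):
    """Return a function that interpolates linearly between the points
    (xs[i], ys[i]). With n + 1 points this gives n linear pieces; outside
    the covered range the nearest piece is extended.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two points with matching lengths")
    if any(b <= a for a, b in zip(xs, xs[1:])):
        raise ValueError("xs must be strictly increasing")

    def f(x):
        # Index of the piece whose left end point is the last xs[i] <= x.
        i = bisect_right(xs, x) - 1
        i = min(max(i, 0), len(xs) - 2)
        slope = (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
        return ys[i] + slope * (x - xs[i])

    return f


# Hypothetical class points: parent fitness of 11 classes and made-up
# mean and variance of mutant fitness per class.
class_fitness = [i / 10 for i in range(11)]
mutant_mean = [0.02 + 0.95 * x for x in class_fitness]
mutant_variance = [0.0010 - 0.0004 * x for x in class_fitness]

f_m = make_piecewise_linear(class_fitness, mutant_mean)
f_v = make_piecewise_linear(class_fitness, mutant_variance)
print(f_m(0.55), f_v(0.55))
```

Using one linear piece per interval between neighboring classes keeps each function exact at the class points while allowing the slope to vary slightly from range to range.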

Figure 4.1 shows a comparison of the development of the actual and predicted fitness distributions over generations. Actual and predicted mean and best fitness in the population evolving over 200 generations are depicted in Figure 4.2. The statistics of the actual behavior of the evolutionary algorithm are derived from a single run. For a large population size, different runs of the actual process do not show important differences.

Figure 4.2: Actual (thin line) and predicted (thick line) mean and best fitness in a population of individuals evolving on a TSP fitness landscape. Two panels show mean fitness and best fitness, each plotted against generation (0 to 200). The predictions are based on an analytic description of the evolutionary dynamics.

The accuracy of the predictions is remarkable. This shows that the TSP fitness landscape can be described by one-dimensional correlation statistics. It also demonstrates that the correlation matrix itself can be approximated in a very simple way. We reduced the high-dimensional fitness landscape to two sets of linear functions, which allow for an analytic prediction of the evolutionary process. In the next section this method of analytic description is applied to the NKp fitness landscapes of low neutrality.

4.2 NKp fitness landscapes

In an NKp fitness landscape, the fitness of a genotype is on average a sum of N(1 − p) random numbers. For values of p that are not very high, the fitness distribution therefore tends to be normal. Under mutation, on average (K + 1)(1 − p²) bits of the genotype vector change their contribution, and mutant fitness is likewise approximately normally distributed if p is not very high.

In the NKp landscape of low neutrality (N = 40, K = 3, and p = 0.3) that was investigated in Section 2.2, N(1 − p) = 28 and (K + 1)(1 − p²) = 3.64. Consequently, the fitness distribution, as well as the distribution of mutant fitness for most fitness classes, is approximated quite well by a normal distribution (see Section 2.2 and Figure 2.30).
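The two quantities quoted for this landscape follow directly from the parameter values. A short Python check is given below; the function names are hypothetical and only wrap the two expressions from the text.

```python
import math


def expected_contributing_terms(n: int, p: float) -> float:
    """N(1 - p): average number of random numbers summed into the fitness."""
    return n * (1 - p)


def expected_changed_contributions(k: int, p: float) -> float:
    """(K + 1)(1 - p^2): average number of contributions changed by a mutation."""
    return (k + 1) * (1 - p ** 2)


# Parameters of the NKp landscape of low neutrality from the text.
N, K, p = 40, 3, 0.3
assert math.isclose(expected_contributing_terms(N, p), 28.0)
assert math.isclose(expected_changed_contributions(K, p), 3.64)
```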

Figure 4.3: The functions f_m(x) and f_v(x) for mean and variance of neighbor fitness depending on parent fitness x in the NKp landscape of low neutrality are linear to a good approximation. In the first picture the identity function is depicted in gray.

The analytic predictions for the evolution of the fitness distribution that we have developed for TSP landscapes can therefore, in principle, also be applied to the NKp landscape of low neutrality investigated in Section 2.2, since a one-dimensional correlation description has been successful for this landscape. The population size is set to P = 5000 and we apply the stochastic selection scheme already used for the evolutionary algorithm of the TSP (see the last section). The selection coefficient is S = 10.

4.2.1 Testing the predictions

Neither of the functions f_m(x) and f_v(x) for mean and variance of mutant fitness for parent fitness x is represented by a single linear function. As in the case of the TSP in the last section, predictive accuracy improves when 10 linear functions for different fitness ranges are used for each of f_m(x) and f_v(x). They are derived from the correlation matrix by linear interpolation between the mean values and variances of mutant fitness for different classes, see Figure 4.3. With the assumption of normally distributed mutant fitness, the correlation matrix now has a continuous representation by means of the functions f_m(x) and f_v(x).
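To illustrate what this continuous representation implies for the mutation step alone, consider a single linear piece f_m(x) = a·x + b and f_v(x) = c·x + d, and a parent fitness distribution with mean m and variance v. By the laws of total expectation and total variance, mutant fitness then has mean a·m + b and variance c·m + d + a²·v. The Python sketch below checks this against a simulation. It does not model selection and is not claimed to be the recursion used for the predictions; the coefficients a, b, c, d and all numerical values are assumptions chosen for illustration.

```python
import random


def mutant_moments(m, v, a, b, c, d):
    """Mean and variance of mutant fitness after one mutation step, for
    f_m(x) = a*x + b and f_v(x) = c*x + d and parent fitness with mean m
    and variance v (laws of total expectation and total variance).
    """
    return a * m + b, c * m + d + a * a * v


def simulate_mutants(m, v, a, b, c, d, n, seed=0):
    """Monte Carlo estimate of the same two moments, with normal parents
    and normally distributed mutant fitness given the parent fitness."""
    rng = random.Random(seed)
    values = []
    for _ in range(n):
        x = rng.gauss(m, v ** 0.5)
        local_var = max(c * x + d, 0.0)  # guard against a negative variance
        values.append(rng.gauss(a * x + b, local_var ** 0.5))
    mean = sum(values) / n
    variance = sum((y - mean) ** 2 for y in values) / (n - 1)
    return mean, variance


# Hypothetical coefficients and parent distribution.
print(mutant_moments(0.4, 0.01, 0.9, 0.03, 0.0, 0.0006))
print(simulate_mutants(0.4, 0.01, 0.9, 0.03, 0.0, 0.0006, n=20000))
```

The term a²·v shows how the slope of f_m(x) carries parent variance over to the mutants, while c·m + d is the average variance added by mutation itself.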

A comparison of the actual and predicted evolution of the fitness distribution is presented in Figure 4.4. The actual distribution is derived from a single run. For a population size of P = 5000, different runs of the evolutionary algorithm do not show important differences. A comparison of best and mean fitness per generation in one run of the actual algorithm and the analytic predictions is presented in Figure 4.5.

Figure 4.4: Actual (thin line) and predicted (thick line) fitness distribution in a population of 5000 individuals in generations 0, 20, 40 in the upper row and generations 60, 80 and 100 in the lower row. The horizontal axis of each panel is the fitness f.

Figure 4.5: Actual (thin line) and predicted (thick line) mean and best fitness for evolution in an NKp landscape. Two panels show mean fitness and best fitness, each plotted against generation (0 to 100). The actual statistics are derived from a single run, whereas the predictions are based on an analytic approximation of the average dynamics on the level of fitness.

The predictions, which are based only on the two sets of linear functions for the mean and variance of mutant fitness, are very close to the actual dynamics. An analytic approximation of evolutionary dynamics based on normal distributions was thus successful for both the TSP landscape and the NKp landscape of low neutrality. In the next subsection, however, some ideas for improving predictive accuracy and for more general approximations are presented.

In document: Correlation Analysis of Fitness Landscapes (pages 102-107)