Multi-objective genetic algorithms - Maximum-Score Diversity Selection

As we have just seen, one main component of genetic algorithms is the fitness function, which assigns each individual a single fitness value that is later used for selection.

However, in Chapter 4 we have seen that when dealing with multi-objective problems, no single quality criterion exists and solutions cannot be ranked in a total order. Nevertheless, fitness functions still exist, but each objective has its own, as in the single-objective case. There are two types of multi-objective genetic algorithms, which can be distinguished by the way they use the different fitness functions to rank individuals. One class uses the dominance relation, whereas the other uses an indicator function (see Section 4.4). Fortunately, all other ingredients of genetic algorithms (genetic representations, operators, selection strategies) can be used unmodified for multi-objective problems. Dominance-based algorithms are by far the most widely used. The main reason is that they are usually faster than indicator-based ones, since many indicators cannot be computed efficiently. A second class of algorithms is needed at all because the dominance relation rapidly loses significance as the number of objectives increases: the more objectives are involved, the more likely it is that a solution beats each of its competitors in at least one objective and is therefore non-dominated. This can be circumvented by directly optimizing the chosen indicator function, e.g. the hypervolume indicator.

However, since MSDS is a problem with only two objectives, there is no need to resort to the more complicated indicator-based algorithms, and we shall not discuss them further.

Dominance-based algorithms use the multiple fitness functions to establish the dominance relation between pairs of individuals. Intuitively, non-dominated individuals are fitter than dominated ones and should be preferred during combination. Still, it is unclear how two mutually non-dominated solutions are to be compared. Here another aspect of multi-objective optimization comes into play, namely the diversity of the non-dominated set. Unfortunately, multi-objective genetic algorithms face a problem similar to MSDS: getting as near to the real Pareto front as possible (= finding solutions of maximum score) while at the same time covering the complete front as well as possible (= selecting a diverse set of solutions on the front). Therefore the second ranking criterion in almost all multi-objective genetic algorithms deals with maintaining good diversity in the population. The exact way in which this is achieved is the main aspect in which dominance-based algorithms such as NPGA [34], SPEA2 [72], or NSGA-II [21] differ from each other.
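The dominance relation that all of these algorithms build on can be written down in a few lines. The following is only a minimal sketch: it assumes that every objective is to be maximized and that an individual is represented by its tuple of objective values. The names `dominates` and `non_dominated` are illustrative and do not come from any of the cited algorithms.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if objective vector a dominates b.

    Assumption: all objectives are maximized. a dominates b if it is at
    least as good in every objective and strictly better in at least one.
    """
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Return all points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    population = [(1, 3), (2, 2), (1, 1), (3, 1)]
    # (1, 1) is dominated by (2, 2); the other three are mutually non-dominated.
    print(non_dominated(population))
```

Note that two different points may both fail to dominate each other, which is exactly the case in which a second criterion such as diversity is needed.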

NPGA

NPGA (Niched Pareto Genetic Algorithm) was one of the first algorithms to be successfully applied to multi-objective problems. Selection is carried out in two steps. First, two individuals are randomly chosen to take part in a tournament. However, instead of making a direct one-to-one comparison, a second comparison set is drawn from the population. Each of the two candidates is then compared against each individual in the set. If one dominates the set and the other does not, it wins the tournament and is selected.

If this first round ends without a winner (both dominate the set or neither does), the tie is broken by applying niching techniques, which are also used in single-objective GAs (see [45] for an extensive overview). Niching tries to prevent the whole population from crowding around one (local) optimum and instead maintains several sub-populations around other (local) optima. One popular way is fitness sharing, where individuals in the same niche share their fitness values. The more crowded the niche, the more the individual fitness values are decreased; some of the solutions in that region are then discarded because other, slightly less fit solutions in less crowded niches have a higher value after fitness sharing. Sharing is performed by defining a niche sharing radius σ_sharing, counting the number of neighbors (measured in either genotype or phenotype space) inside the radius, and then degrading the raw fitness in proportion to the niche count. Since in the multi-objective case there is no single fitness value, NPGA only uses the niche count itself: individuals with fewer neighbors are preferred. Presumably for performance reasons, the niche count is only computed in the comparison set and not the whole population.

The complete selection process is depicted in Figure 5.4. Neither of the two marked candidates i and j dominates the comparison set, nor is either of them completely dominated. Therefore the number of individuals inside the two niches, defined by σ_sharing, decides the tournament, and individual i wins.

Figure 5.4: Ranking by dominance and niche count performed by NPGA.
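The two-step tournament can be sketched as follows. This is one possible reading of the procedure, not a reference implementation. It assumes maximized objectives, distances measured in objective space, and that a candidate loses the first round if any member of the comparison set dominates it. The function names and the parameter `sigma_sharing` are illustrative.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a dominates b, assuming all objectives are maximized."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def niche_count(candidate: Point, others: Sequence[Point], sigma_sharing: float) -> int:
    """Count the individuals in `others` that lie within the sharing radius."""
    return sum(1 for o in others if math.dist(candidate, o) < sigma_sharing)


def npga_tournament(cand_i: Point, cand_j: Point,
                    comparison_set: Sequence[Point], sigma_sharing: float) -> Point:
    """Return the winner of one NPGA-style tournament."""
    # Step 1: compare both candidates against the comparison set.
    i_dominated = any(dominates(c, cand_i) for c in comparison_set)
    j_dominated = any(dominates(c, cand_j) for c in comparison_set)
    if i_dominated != j_dominated:
        return cand_j if i_dominated else cand_i
    # Step 2: tie, so the candidate with fewer neighbors in its niche wins.
    # As in the text, the niche count is taken over the comparison set only.
    count_i = niche_count(cand_i, comparison_set, sigma_sharing)
    count_j = niche_count(cand_j, comparison_set, sigma_sharing)
    return cand_i if count_i <= count_j else cand_j


if __name__ == "__main__":
    comparison = [(2.0, 2.0), (2.1, 2.1), (5.0, 0.5)]
    print(npga_tournament((3.0, 3.0), (1.0, 1.0), comparison, 1.0))
    print(npga_tournament((1.0, 4.0), (2.05, 2.5), comparison, 1.0))
```

In a full algorithm the two candidates and the comparison set would be drawn at random from the population, for example with `random.sample`. They are passed in explicitly here to keep the sketch deterministic.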

The major drawback of NPGA is its lack of an elitism strategy, i.e. the best individuals are not necessarily carried over into the next generation. It has been shown that elitism improves convergence considerably, and therefore NPGA is outperformed by more recent algorithms such as SPEA2 or NSGA-II.

SPEA2

In contrast to NPGA, SPEA2 uses a so-called archive, in addition to the regular population, in which the best individuals are permanently stored and carried on into future generations. The algorithm starts with an empty archive and a randomly generated population. Then fitness values are calculated for all individuals, both in the population and the archive (which is non-empty after the first iteration). Fitness values are computed based on the so-called strength values, which are the numbers of individuals dominated by a solution. A solution that dominates no other individual has a strength value of 0. Then for each solution the strength values of all solutions dominating it are summed up to form the raw fitness value, so that non-dominated solutions have a raw fitness of 0. Figure 5.5 shows a sample population together with the strength and raw fitness values.

Figure 5.5: Fitness assignment in SPEA2. The first value beneath each solution indicates the strength value, whereas the second value indicates the raw fitness. Non-dominated solutions (in green) always have raw fitness values of zero. An example of the density computation is shown for k = 3.
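The strength and raw fitness values can be computed directly from the dominance relation. The sketch below assumes maximized objectives and treats population and archive as one combined list of objective tuples. The function name `spea2_raw_fitness` is illustrative.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a dominates b, assuming all objectives are maximized."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def spea2_raw_fitness(points: Sequence[Tuple[float, ...]]) -> Tuple[List[int], List[int]]:
    """Return (strength, raw fitness) for every point.

    strength[i]: number of points dominated by point i.
    raw[i]:      sum of the strengths of all points that dominate point i
                 (0 for non-dominated points; smaller is better).
    """
    n = len(points)
    strength = [sum(dominates(p, q) for q in points) for p in points]
    raw = [
        sum(strength[j] for j in range(n) if dominates(points[j], points[i]))
        for i in range(n)
    ]
    return strength, raw


if __name__ == "__main__":
    combined = [(3, 3), (1, 1), (2, 2), (0, 4)]
    strength, raw = spea2_raw_fitness(combined)
    for point, s, r in zip(combined, strength, raw):
        print(point, f"{s}/{r}")  # same "strength/raw fitness" notation as the figure
```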

In order to distinguish between individuals with the same raw fitness value, a density value is computed for each solution. It is computed by taking the inverse of the distance to the k-th nearest neighbor, where k is usually chosen to be the square root of the combined size of the population and the archive. This ensures that individuals in sparsely populated regions are favored (both raw fitness and density values are to be minimized). After fitness assignment, all non-dominated individuals (from both the population and the previous archive) are copied to the new archive. If the archive is not completely filled, dominated solutions are added according to their raw fitness and density values. If the archive is overfilled, solutions are removed based on their density values: the individual with the smallest distance to its k-th nearest neighbor is iteratively removed until the archive has the desired size. Finally, binary tournament selection is performed on the archive (not on the population!), and mutation and combination operators are applied to create the population for the next generation.
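The density computation can be sketched as below, under a few stated assumptions. Distances are Euclidean and measured in objective space, and k is the rounded-down square root of the combined size. A constant 2 is added to the distance before inverting, as in the original SPEA2 formulation, so that the density stays below 1; the text above mentions only the plain inverse. All names are illustrative, and at least two individuals are required.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def kth_neighbor_distance(index: int, points: Sequence[Point], k: int) -> float:
    """Distance from points[index] to its k-th nearest neighbor (k >= 1)."""
    distances = sorted(
        math.dist(points[index], p) for j, p in enumerate(points) if j != index
    )
    return distances[k - 1]


def spea2_density(points: Sequence[Point]) -> List[float]:
    """Density value per point; smaller values mean sparser regions."""
    # Assumption: k = floor(sqrt(combined size of population and archive)).
    k = max(1, int(math.sqrt(len(points))))
    return [
        1.0 / (kth_neighbor_distance(i, points, k) + 2.0)
        for i in range(len(points))
    ]


if __name__ == "__main__":
    combined = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (7.0, 0.0)]
    # The isolated point (7, 0) receives the smallest density value.
    print(spea2_density(combined))
```

Because the density is always smaller than 1, adding it to the integer raw fitness only separates individuals that have the same raw fitness and never reverses an ordering between different raw fitness values.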

While SPEA2, with its sophisticated fitness assignment and archive management procedures, has advantages over NSGA-II when it comes to many objectives, it is more complicated to implement. Since MSDS involves only two objectives, we have chosen the simpler NSGA-II algorithm.

NSGA-II

Similar to most other dominance-based approaches, the algorithm uses the dominance relation between all solutions. First, all globally non-dominated solutions are selected, put into the first so-called front, and temporarily removed from the population. After removal, another set of individuals exists that is now not dominated by any other remaining individual. These are selected in turn, put into the second front, and removed. This process is termed non-dominated sorting and continues until all solutions have been assigned to a front. The colors of the solutions shown in Figure 5.6 indicate the different fronts.
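The peeling procedure described above translates almost literally into code. The following sketch assumes maximized objectives and returns the fronts as lists of indices into the input. It is deliberately the simple version and not the faster bookkeeping variant used in the original NSGA-II publication. The function name is illustrative.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a dominates b, assuming all objectives are maximized."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated_sort(points: Sequence[Tuple[float, ...]]) -> List[List[int]]:
    """Split the points into fronts; fronts[0] is the non-dominated front."""
    remaining = list(range(len(points)))
    fronts: List[List[int]] = []
    while remaining:
        # Everything not dominated by a remaining individual forms the next front.
        front = [
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining)
        ]
        fronts.append(front)
        # Remove the front and repeat on what is left.
        remaining = [i for i in remaining if i not in front]
    return fronts


if __name__ == "__main__":
    population = [(1, 4), (2, 3), (3, 1), (1, 2), (2, 1), (0, 0)]
    for rank, front in enumerate(non_dominated_sort(population), start=1):
        print(rank, [population[i] for i in front])
```

The position of a front in the returned list, counted from 1, corresponds to the rank used during selection.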

The index of the front is now used as a rank when it comes to selecting individuals for crossover. However, the rank is again not the only quality measure. Additionally, the so-called crowding distance of each individual is taken into account. For this, the distances to its nearest neighbors along all objective axes are computed. They span a cuboid in which no other solutions reside, see Figure 5.6. For easier comparison and computation, the average length of all edges is taken as the crowding distance.
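A minimal sketch of the crowding distance computation for a single, non-empty front follows. Two details are assumptions rather than statements from the text: boundary solutions, which lack a neighbor on one side, receive an infinite distance (as in the original NSGA-II formulation), and the edge lengths are not normalized by the range of each objective. The function name is illustrative.

```python
import math
from typing import List, Sequence, Tuple


def crowding_distances(front: Sequence[Tuple[float, ...]]) -> List[float]:
    """Average edge length of the cuboid spanned by each point's neighbors."""
    n = len(front)
    m = len(front[0])  # number of objectives
    totals = [0.0] * n
    for axis in range(m):
        # Sort the front along this objective to find each point's neighbors.
        order = sorted(range(n), key=lambda i: front[i][axis])
        # Assumption: extreme points along an axis get an infinite distance.
        totals[order[0]] = totals[order[-1]] = math.inf
        for pos in range(1, n - 1):
            i = order[pos]
            lower = front[order[pos - 1]][axis]
            upper = front[order[pos + 1]][axis]
            totals[i] += upper - lower  # edge length along this axis
    return [total / m for total in totals]


if __name__ == "__main__":
    front = [(1.0, 4.0), (2.0, 3.0), (4.0, 1.0)]
    print(crowding_distances(front))
```

Giving the extreme points an infinite distance means they always win ties within their front, which helps to preserve the ends of the front.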

Figure 5.6: Computation of the crowding distance of individual i. The colors of the points indicate the different fronts they have been assigned to.

When it comes to selection, first the individuals' ranks are taken into account. If two individuals are in the same front, their crowding distance is used to break ties, favoring individuals with larger crowding distances. Individual i in the figure has a greater crowding distance than j and would therefore be preferred. This ensures that the algorithm spreads the solutions more uniformly over the Pareto front and does not get trapped in local optima too early. These two criteria define a partial order ≥_cd on all solutions:

$$ i \geq_{cd} j \iff (\mathrm{rank}(i) < \mathrm{rank}(j)) \lor \big( (\mathrm{rank}(i) = \mathrm{rank}(j)) \land (\mathrm{cdist}(i) > \mathrm{cdist}(j)) \big) $$

Differing from the original NSGA-II algorithm, which works with standard binary tournament selection, unbiased tournament selection [63] was used for the experiments instead. This approach reduces the variance in the probability of an individual being selected for a tournament. As a consequence, the loss of diversity in the next generation, which occurs because certain individuals are never picked for a tournament, is reduced. Algorithm 1 shows NSGA-II's main function. The key functions, non-dominated sorting and crowding distance assignment, are shown in Algorithms 2 and 3, respectively.
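The comparison ≥_cd and a tournament scheme in which no individual is left out can be sketched as follows. The pairing used here, a random permutation in which each individual meets its two cyclic neighbors, is an assumed simplification in the spirit of unbiased tournament selection and not necessarily the exact scheme of [63]. All function names are illustrative.

```python
import random
from typing import List, Sequence


def crowded_better(rank_i: int, cdist_i: float, rank_j: int, cdist_j: float) -> bool:
    """True if i is preferred over j: lower rank first, then larger crowding distance."""
    return rank_i < rank_j or (rank_i == rank_j and cdist_i > cdist_j)


def unbiased_binary_tournaments(ranks: Sequence[int], cdists: Sequence[float],
                                rng: random.Random) -> List[int]:
    """Run len(ranks) binary tournaments and return the winners' indices.

    Every individual takes part in exactly two tournaments, so nobody is
    skipped by chance as can happen with independently drawn pairs.
    """
    n = len(ranks)
    order = list(range(n))
    rng.shuffle(order)
    winners = []
    for pos in range(n):
        i, j = order[pos], order[(pos + 1) % n]
        if crowded_better(ranks[j], cdists[j], ranks[i], cdists[i]):
            winners.append(j)
        else:
            winners.append(i)
    return winners


if __name__ == "__main__":
    ranks = [1, 1, 2, 3]
    cdists = [2.0, 1.0, 5.0, 0.5]
    print(unbiased_binary_tournaments(ranks, cdists, random.Random(0)))
```

With this pairing the best individual wins both of its tournaments and the worst individual wins none, regardless of the permutation drawn.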

Now that we have an algorithm that can optimize multi-objective problems, the last thing missing in order to solve MSDS is a suitable genetic representation for the subsets. In the next section, several possibilities are presented together with their genetic operators.

Algorithm 1: NSGA-II

P_0 ← random initial population;
F ← non-dominated-sort(T_t);
C_{t+1} ← new-population(C_{t+1}); /* selection, crossover, mutation */
t ← t + 1;
end

5.3 Genetic representations and operators for
