
6.6 Evaluation

6.6.1 Distance Matrices

In the following, we evaluate the proposed methodology, which uses distance matrices to diversify the population.

Using an archive

Figure 6.8 illustrates the median value and the standard error of the IGD+ indicator over the instance size.

We additionally report the number of Pareto-optimal solutions (indicated by # of POS) for each problem instance. From a graphical perspective, it is evident that the results vary depending on the type and size of the problem instance.

In most cases, NSGA-II-FDAO (archive-only) obtained the best median value and the lowest standard error. Counting the number of experiments in which an algorithm was outperformed by at least one other algorithm confirmed that NSGA-II-FDAO was outperformed the least. Comparing this variant solely with the baseline algorithm shows that the baseline algorithm was outperformed 152 times, whereas the proposed variant was never outperformed. Table 6.1 shows the wins, losses and ties of each algorithm on all 256 problems [WM21b].

The variant with Fréchet distance only (NSGA-II-FDWOA, no archive) also performed well compared to the baseline algorithm alone. The proposed variation of the algorithm was outperformed 76 times (concerning the value of the IGD+ indicator) for the 256 chosen problems, whereas the baseline algorithm was outperformed 369 times. In the remaining experiments, there was no statistically significant difference.

Table 6.1: Wins, losses and ties of each algorithm pair (rows vs. column) with statistical significance at p < 0.01, Bonferroni correction applied, IGD+ indicator [WM21b].

                NSGA-II-FD    NSGA-II-FDAO    NSGA-II-FDWOA
NSGA-II         0/132/124     0/152/104       0/85/171
NSGA-II-FD                    0/45/211        25/34/197
NSGA-II-FDAO                                  51/33/172

The mixed variant NSGA-II-FD was never outperformed when compared only to the baseline algorithm. The baseline, by contrast, was outperformed 108 times with respect to the IGD+ indicator.

When analysing the results, we noted that the high number of ties indicates room for improvement. A possible cause lies in the Fréchet distance sorting: given two subsets of distinct paths with relatively short distance values inside each subset, every path receives a short distance value even if the two subsets are far apart from each other. Nonetheless, the results show an improvement over the baseline algorithm, since the proposed variants won on more problems and lost on fewer. The archive-only algorithm obtained the most promising results, supporting Patil’s statement that only the use of an archive can significantly affect the results [WM21b].

Comparison of different metrics

Settings In the experiments, we examined the proposed approaches on the benchmark test suite ASLETISMAC [WM22b] (cf. Chapter 4). We consider a two-dimensional space with three obstacle types (NO, LA and CH), the K3 neighbourhood, enabled backtracking and grid sizes of {15, 20, 24, 26, 28, 30}.

NO indicates no obstacles, whereas LA and CH introduce bulk and checkerboard obstacles, as shown in Figure 4.4. In this way, CH constrains the decision space to a few feasible paths. The K3 neighbourhood restricts the number of possible neighbours to eight, i.e., all surrounding cells. All these combinations result in 84 test instances. Given a solution represented by a path p = (n_i, n_{i+1}, ..., n_k) (a list of nodes), we evaluate it by five objectives to be minimised: 1) Euclidean length, 2) Delays, 3) Elevation, 4) Travelling time and 5) Smoothness (Curvature) (see Chapter 4).
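The K3 neighbourhood described above can be sketched in a few lines. This is only an illustration under assumptions: the square grid with integer (x, y) cells, the function name k3_neighbours and the blocked set used to model obstacle cells are hypothetical and are not taken from the benchmark suite.

```python
def k3_neighbours(cell, size, blocked=frozenset()):
    """Return the (up to eight) surrounding cells of `cell` on a size x size grid.

    `blocked` is a hypothetical set of obstacle cells; how the benchmark
    encodes obstacles is an assumption of this sketch.
    """
    x, y = cell
    result = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if dx == 0 and dy == 0:
                continue  # the cell itself is not a neighbour
            nx, ny = x + dx, y + dy
            inside = 0 <= nx < size and 0 <= ny < size
            if inside and (nx, ny) not in blocked:
                result.append((nx, ny))
    return result


if __name__ == "__main__":
    print(k3_neighbours((5, 5), size=15))  # interior cell
    print(k3_neighbours((0, 0), size=15))  # corner cell
```

An interior cell has eight neighbours, whereas border cells and cells next to obstacles have fewer, which is why obstacle layouts such as CH reduce the number of feasible paths.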

We use the same pathfinding operators as in [WM22b], i.e., a one-point crossover, which creates new offspring chromosomes by crossing two parent paths at one common point. For mutation, we use the proposed perimeter mutation, which mutates the middle point of two arbitrary points that lie within a specific network distance inside a given maximum radius and reconnects the paths afterwards. We thus compare the algorithms using the three incorporated distance metrics. Although the problem is a many-objective optimisation problem, we consider a smaller computational budget than in our previous studies, because we observed that the quality of results changed only marginally after 100 generations. With this decision, we trade some solution quality for shorter running times.
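The one-point crossover on paths can be illustrated as follows. This is a minimal sketch, not the implementation from [WM22b]: the function name, the exclusion of the start and end nodes as crossing points, and the use of the first occurrence of a repeated node are assumptions made here.

```python
import random


def one_point_crossover(parent_a, parent_b, rng=random):
    """Cross two paths (lists of nodes) at one node that both of them visit."""
    interior_b = set(parent_b[1:-1])
    # Candidate crossing points: interior nodes shared by both parents
    # (assumption: start and end nodes are not used as crossing points).
    common = [n for n in parent_a[1:-1] if n in interior_b]
    if not common:
        # No shared interior node: return unchanged copies of the parents.
        return list(parent_a), list(parent_b)
    node = rng.choice(common)
    # With backtracking a node may occur several times; this sketch
    # simply uses its first occurrence in each parent.
    i, j = parent_a.index(node), parent_b.index(node)
    child_a = parent_a[:i] + parent_b[j:]
    child_b = parent_b[:j] + parent_a[i:]
    return child_a, child_b


if __name__ == "__main__":
    a = [(0, 0), (1, 1), (2, 2), (3, 3)]
    b = [(0, 0), (0, 1), (1, 1), (2, 1), (3, 2), (3, 3)]
    print(one_point_crossover(a, b))
```

Because both children are spliced at a node that lies on both parents, each child remains a connected path from the start node to the end node.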

Figure 6.9: IGD+ and IGDX over instance sizes. Top, middle and bottom rows illustrate NO, CH and LA obstacles [WM22a]. (a) IGDX results, (b) IGD+ results. Panels: NO P1 BT K3, CH P1 BT K3, LA P1 BT K3. Axes: instance size (x), indicator value (left y), # of Pareto-optimal solutions (right y). Series: DTW-MED, FD-MED, HD-MED, # of POS.

Furthermore, we use a population size of µ = 100 to further account for fewer function evaluations. In real-world applications, obtaining results in a short time is often a requirement. For the experiments, we calculate the IGD+ indicator, which is a distance measurement between the obtained front of non-dominated solutions and the known true Pareto reference front [IMN15]. We also report the respective wins, losses and ties of each algorithm for all problem instances [WM22a].
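For orientation, the IGD+ indicator mentioned above can be computed as sketched below for minimisation problems. The sketch follows the commonly used definition (for each reference point, the smallest distance that counts only the components in which an obtained solution is worse, averaged over the reference front). The function name and the plain-tuple data layout are assumptions of this example, not the code used in the experiments.

```python
import math


def igd_plus(reference_front, obtained_front):
    """IGD+ for minimisation: lower values are better, 0 is ideal."""
    total = 0.0
    for z in reference_front:
        # Only components where the obtained point a is worse than the
        # reference point z (a_i > z_i) contribute to the distance.
        best = min(
            math.sqrt(sum(max(a_i - z_i, 0.0) ** 2 for a_i, z_i in zip(a, z)))
            for a in obtained_front
        )
        total += best
    return total / len(reference_front)


if __name__ == "__main__":
    reference = [(0.0, 1.0), (1.0, 0.0)]
    print(igd_plus(reference, [(0.0, 1.0), (1.0, 0.0)]))
    print(igd_plus(reference, [(1.0, 1.0)]))
```

A front that matches the reference front yields 0, and a single point at (1, 1) yields 1 for this two-point reference front.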

To assess the quality of solutions in the decision space, we also employ the IGDX indicator [AQY09], which measures the distance between the known Pareto-set and the found solutions. Again, we compare the results of the algorithms to each other and test for statistical significance using Bonferroni correction (because we perform multiple comparisons). Our null hypothesis states that the populations have equal medians [WM22a].
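The wins, losses and ties reported for each algorithm pair can be tallied as in the following sketch. It assumes that a two-sample test on equal medians has already produced one p-value per problem instance, that lower indicator medians are better, and that the direction of a significant difference is decided by comparing the medians. The function name tally and the parameter n_comparisons are hypothetical.

```python
def tally(results, alpha=0.01, n_comparisons=1):
    """Count wins/losses/ties of algorithm A against algorithm B.

    `results` holds one (median_a, median_b, p_value) triple per instance.
    The Bonferroni correction divides alpha by the number of comparisons.
    """
    threshold = alpha / n_comparisons
    wins = losses = ties = 0
    for median_a, median_b, p_value in results:
        if p_value >= threshold or median_a == median_b:
            ties += 1  # no statistically significant difference
        elif median_a < median_b:
            wins += 1  # A has the significantly lower (better) median
        else:
            losses += 1
    return wins, losses, ties


if __name__ == "__main__":
    outcomes = [(1.0, 2.0, 0.001), (2.0, 1.0, 0.001), (1.0, 2.0, 0.004)]
    print(tally(outcomes, alpha=0.01, n_comparisons=3))
```

In the usage example, the third instance counts as a tie: its p-value of 0.004 is below 0.01 but not below the corrected threshold of 0.01 / 3.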

Results Figure 6.9 illustrates the median values and standard error of the IGDX and IGD+ indicators, respectively, for the instance sizes and obstacle settings. Each row in the figure shows a different obstacle profile: NO, CH or LA.

We also report the number of Pareto-optimal solutions for each problem instance on the right axis, which we obtained using the exact approach. From a graphical perspective, it is evident that the results vary depending on the type and size of the problem instance. It is noteworthy that in the instance of LA P1 BT K3 (bottom left) of size 28, the algorithms using Fréchet and DTW distance obtained a small IGDX value, whereas the IGD+ value was relatively high.

This indicates that solutions near the optimal solutions in the decision space were found, but they were of low quality in the objective space. For the multi-objective pathfinding problem, we have shown that solutions close to the optimum in decision space are not necessarily close to the optimum in terms of the respective objective values. However, the same two algorithms obtained a worse IGDX value on the instance of size 30 for the same map type, whereas the algorithm incorporating the Hausdorff distance could still obtain a low value. Nevertheless, in the other two obstacle settings, the Hausdorff distance was not as stable as the other two distance metrics regarding IGDX values because backtracking was allowed. The metrics that take the flow of a curve into account therefore yielded better results. If one path moves back towards its origin whereas another path does not, the Hausdorff distance does not always reflect their similarity correctly in the sense of their flow. As a result, the Hausdorff distance can be small although the Fréchet and DTW distances give higher values, as the paths are more distinct.

Table 6.2: Wins, losses and ties of each algorithm pair (rows vs. column) with statistical significance at p < 0.01, Bonferroni correction applied, IGDX and IGD+ indicator [WM22a].

                   FD-MED     HD-MED
DTW-MED   IGDX     3/6/75     23/16/45
          IGD+     0/0/84     0/6/78
FD-MED    IGDX                23/14/47
          IGD+                0/2/82
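The effect of backtracking on the three distance metrics, as discussed above, can be reproduced with a small example. The code is a sketch under assumptions: it uses textbook formulations of the Hausdorff distance, the discrete Fréchet distance and DTW with Euclidean point distances, and the two example paths are made up for illustration. The variants used in the experiments may differ in detail.

```python
import math


def _dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


def hausdorff(P, Q):
    """Symmetric Hausdorff distance; ignores the order of the points."""
    def directed(A, B):
        return max(min(_dist(a, b) for b in B) for a in A)
    return max(directed(P, Q), directed(Q, P))


def discrete_frechet(P, Q):
    """Discrete Fréchet distance via dynamic programming."""
    n, m = len(P), len(Q)
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = _dist(P[i], Q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                prev = min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1])
                ca[i][j] = max(prev, d)
    return ca[-1][-1]


def dtw(P, Q):
    """Dynamic time warping cost (sum of matched point distances)."""
    n, m = len(P), len(Q)
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = _dist(P[i - 1], Q[j - 1])
            acc[i][j] = d + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]


if __name__ == "__main__":
    # Hypothetical paths: `straight` goes directly to the goal,
    # `backtrack` visits the same cells but moves back towards the origin.
    straight = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
    backtrack = [(0, 0), (1, 0), (2, 0), (3, 0), (2, 0), (1, 0),
                 (2, 0), (3, 0), (4, 0)]
    print(hausdorff(straight, backtrack))
    print(discrete_frechet(straight, backtrack))
    print(dtw(straight, backtrack))
```

For these two paths the Hausdorff distance is 0, because both visit exactly the same cells, whereas the discrete Fréchet distance is 1 and the DTW cost is 2, since both measures respect the order in which the cells are visited.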

Concerning the IGD+ indicator, all three algorithms obtained similar values.

The indicators’ results show that using the proposed niching methodology improves the quality of solutions in terms of closeness to the true Pareto-set.

However, there is room for improvement regarding the objective values. Because we limited the search to 10 000 function evaluations (which is relatively low for many-objective optimisation problems), the results regarding IGD+ were expected. In this study, the search process focused on employing measurements in the decision space to minimise the objective functions. Therefore, there was little improvement in the objective space that was measured by IGD+. The underlying problem is deceptive, which can result in paths that are close to an optimal solution in the decision space (measured by IGDX) but far from the optimum in the objective space. For a real-world application, the impact would be that a slight perturbation when executing or traversing a path can result in a severe deterioration in terms of objective functions. Future research could develop more advanced methodologies to focus on local optimisation. Often, only a small portion of the path needs to be changed to result in better objective values [WM22a].

Table 6.2 shows the wins, losses and ties of each algorithm on all 84 test problems. Again, the difference between the two performance indicators is evident, as most outcomes regarding IGD+ were ties. Several instances had a definite winner when the IGDX values were compared [WM22a].

Figure 6.10 shows a comparison of two variants of the algorithm using δ_dF, i.e., (1) taking the median value of all distances to other paths, denoted by FD-MED, and (2) taking the minimum value of the distances, denoted by FD-MIN. The two variants are again compared concerning IGDX (Figure 6.10a) and IGD+ (Figure 6.10b). Interestingly, regarding IGD+, FD-MED either won or was of the same quality as FD-MIN, whereas the latter won in several instances regarding IGDX. The reason is that the min() approach works better for local optimisation since the closest paths are used as a reference [WM22a].
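The difference between the median and the minimum aggregation can be sketched as follows. The example assumes that a symmetric matrix of pairwise path distances is already available. The function name diversity_scores and the example matrix are hypothetical, and how the scores enter the selection step is not covered here.

```python
from statistics import median


def diversity_scores(dist, aggregate=median):
    """Aggregate, for every path, its distances to all other paths.

    `dist` is a symmetric matrix (list of lists) of pairwise distances;
    pass `aggregate=min` for the minimum variant.
    """
    scores = []
    for i, row in enumerate(dist):
        others = [d for j, d in enumerate(row) if j != i]  # skip self-distance
        scores.append(aggregate(others))
    return scores


if __name__ == "__main__":
    # Hypothetical population: paths 0-2 are mutually close (distance 1),
    # paths 3-4 are mutually close, and the two groups are far apart (10).
    D = [
        [0, 1, 1, 10, 10],
        [1, 0, 1, 10, 10],
        [1, 1, 0, 10, 10],
        [10, 10, 10, 0, 1],
        [10, 10, 10, 1, 0],
    ]
    print(diversity_scores(D, aggregate=min))
    print(diversity_scores(D, aggregate=median))
```

With the minimum, every path in this example receives the score 1, because each path has a close neighbour inside its own group. The median instead yields 5.5 for the paths of the larger group and 10 for the paths of the smaller group, so it also reflects how far a path is from the rest of the population.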

A more detailed view is provided in Figure 6.11, where the respective indicator values over different sizes for a specific instance type are depicted. Clearly, FD-MIN obtained better or similar values in terms of the IGDX values but was outperformed by FD-MED with respect to IGD+.

Figure 6.10: Wins/Losses/Ties of the algorithm incorporating Fréchet distance with respect to the IGD+ and IGDX indicators over different instance size intervals [WM22a]. (a) IGDX results, (b) IGD+ results. Axes: size interval ([15,20], [20,24], [24,26], [26,28], [28,30]) (x), number of instances (y). Series: FD-MED, FD-MIN, Ties.

Figure 6.11: Indicator values of the instance CH P1 K3 BT for different sizes, comparing FD-MED and FD-MIN [WM22a]. (a) IGDX results, (b) IGD+ results. Axes: instance size (x), indicator value (left y), # of Pareto-optimal solutions (right y). Series: FD-MED, FD-MIN, # of POS.

When analysing the results, we noted that the high number of ties indicates room for improvement. The many ties are a result of the path similarity sorting. Sorting with two subsets of distinct paths that have relatively short distance values inside the subset will result in a short distance value for each path, although the two sets may be apart from each other [WM22a].