
3.3 Related Variable Grouping Methods

3.3.3 Interaction-based Methods

The third category comprises the interaction-based grouping methods. These methods aim to identify interacting variables (see Section 2.3.3) through an analysis, in most cases carried out before the actual optimisation phase of the algorithm starts. As a result, these methods consume function evaluations to examine the effect of changes in the variables on the fitness function(s).

A general difficulty of such interaction-detection mechanisms lies in the balance between the computational budget and the level of detail with which the analysis is carried out.

Without prior knowledge about the problem, the decision whether variables are considered interacting has to be based on definitions like the one given above in Section 2.3.3.

However, since the problems are regarded as black boxes, the given equations can only be tested with certain combinations of variable values at a time and cannot be evaluated analytically on a global scale. The large computational budget usually results from the fact that a large number of value combinations needs to be tested to determine, for each pair of variables, whether the interaction conditions are fulfilled. Furthermore, there is a risk of wrong judgements when variables interact only in local areas of the search space, while conditions like the ones in Section 2.3.3 are, for instance, only tested with the lower and upper bounds of the variables. This risk can be reduced by testing more value combinations for each pair of variables, which in turn increases the number of consumed evaluations again.

The definition of variable interaction is based on differences in objective function values, as shown in Definition 2.5. Therefore, the single-objective methods to check for these differences are not directly applicable to multi-objective problems, as the notion of “differences” between objective function vectors is no longer clear. For instance, it is possible that certain variables interact with respect to one of the m objective functions, but not with respect to the others. There are, however, ways to use these definitions in multi-objective optimisation. Specific mechanisms were implemented in MOEA/DVA and LMEA, and their implementations are outlined below. A more general way to apply single-objective grouping methods to multi-objective optimisation was also proposed in [26] by the author of the present thesis. In the following, selected single- and multi-objective interaction-based grouping methods are described in further detail.

Differential Grouping

Differential Grouping (DG) is an interaction-based grouping approach for single-objective optimisation introduced in 2014 [84]. Its goal is to detect the interaction between variables by comparing the amount of change in a single objective function in reaction to a change of a variable xi before and after another variable xj was changed [1]. Formally, the definition of interaction used in DG differs from the one commonly used in other works, which we introduced in Section 2.3.3. DG uses the following equations to determine interaction. Note that they define interaction for additively separable functions, so DG does not claim to find all interactions.


[Figure 3.13: Visualisation of Differential Grouping. Four solutions are created, e.g. ~x1 = (0,0,0,0,0,0), ~x2 = (1,0,0,0,0,0), ~x3 = (0,1,0,0,0,0), ~x4 = (1,1,0,0,0,0); the differences ∆1 = f(~x1) − f(~x2) and ∆2 = f(~x3) − f(~x4) are compared via |∆1 − ∆2| > ε.]

Definition 3.1 (Differential Grouping Interaction) Two decision variables xi and xj are interacting if ∀ a, b1 ≠ b2, δ ∈ R, δ ≠ 0 the following condition holds:

∆δ,xi f(~x)|xi=a, xj=b1 ≠ ∆δ,xi f(~x)|xi=a, xj=b2   (3.1)

where

∆δ,xi f(~x) = f(..., xi + δ, ...) − f(..., xi, ...)   (3.2)

Using these equations, DG answers the following question [1]: “When changing the value of xi, does the amount of change in f(~x) remain the same regardless of the value of another variable xj?” If this is the case, the variables xi and xj seem not to interact with each other, i.e. the fitness function is influenced by a change in each of them, but the amount of change stays the same. These variables can be separated into different groups, as the optimal values of xi are supposedly independent of the values of xj. If the condition is not fulfilled, the two variables seem to interact, so they are assigned to the same group.

In practice, the definition above is of course too restrictive, as it is highly unlikely that these differences are ever exactly equal, and variables which only interact minimally, i.e. the ∆ values in Eq. (3.1) are close together but not exactly equal, could be considered as non-interacting. To account for this, a threshold value ε is used in DG, which controls the maximally allowed amount of variation in fitness below which two variables are not regarded as interacting. The procedure of DG is shown exemplarily in Fig. 3.13. DG creates four different solutions, where the values of two variables are changed according to the above equations. In the implementation in [84], the respective lower and upper bounds of each of the variables were used as the values of a, b1, b2. The resulting differences in the function values are compared with the threshold ε. If |∆1 − ∆2| > ε, the variables are considered interacting and are put in the same group. The number and sizes of the groups are thus determined automatically by the algorithm, depending on how many variables interact with each other. All variables that do not interact with any other variable are gathered in an additional group, which in the end holds all non-interacting variables.
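To make the procedure concrete, the pairwise check can be sketched in a few lines of Python. This is a minimal illustration of the idea, using the variable bounds as sample points; the function and parameter names (and the toy objective functions) are ours, not part of the original DG implementation.

```python
import numpy as np

def dg_interact(f, lower, upper, i, j, eps=1e-3):
    """DG-style pairwise interaction check (illustrative sketch).

    Builds the four solutions from Fig. 3.13: start from the lower
    bounds, then perturb x_i, x_j, and both. The variables are taken
    to interact if |delta1 - delta2| exceeds the threshold eps.
    Each pairwise check costs four function evaluations.
    """
    x1 = lower.copy()
    x2 = x1.copy(); x2[i] = upper[i]      # change x_i
    x3 = x1.copy(); x3[j] = upper[j]      # change x_j
    x4 = x2.copy(); x4[j] = upper[j]      # change both
    delta1 = f(x1) - f(x2)                # effect of x_i at x_j = lower
    delta2 = f(x3) - f(x4)                # effect of x_i at x_j = upper
    return abs(delta1 - delta2) > eps

# An additively separable function: x0 and x1 should not interact ...
f_sep = lambda x: x[0] ** 2 + x[1] ** 2
# ... whereas a product term couples them.
f_int = lambda x: x[0] ** 2 + x[0] * x[1]

lo, up = np.zeros(2), np.ones(2)
print(dg_interact(f_sep, lo, up, 0, 1))  # False
print(dg_interact(f_int, lo, up, 0, 1))  # True
```

Note that this single check with the bounds as sample points exemplifies the risk discussed above: an interaction present only in a local region of the search space may be missed.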

In the literature, DG showed good performance in single-objective optimisation [84], but requires a large computational overhead to find the interactions. Although the algorithm implementation adds variables iteratively to the groups, and therefore might not need the complete n·(n−1) checks of variable pairs, each single check needs four function evaluations, and the total computational effort is quadratic in the number of decision variables. In [84], it is computed that, assuming the true interactions of the problem result in γ = n/l evenly sized groups with l variables each, the number of function evaluations consumed is in O(n²/l). Based on the analysis in the original article, for a fully separable problem with n = 2000 decision variables, DG requires n·(n−1) + 2·n = 4,002,000 function evaluations to perform the grouping, and the effort decreases with increasing numbers of interactions between variables [1]. Since the actually used computational budget depends on the interactions in the problem, it is hard to know beforehand how many function evaluations are needed until DG terminates its analysis. This can be undesirable, since it is then not clear how much computational budget is left for the actual optimisation of the problem. Some of the shortcomings of DG were addressed in its successor, described in the following section.
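For reference, the evaluation count quoted for the fully separable case can be checked with a one-line computation (the function name is ours):

```python
def dg_evals_fully_separable(n):
    # Worst case of DG per the analysis in [84]: n*(n-1) pairwise
    # checks plus 2*n additional evaluations.
    return n * (n - 1) + 2 * n

print(dg_evals_fully_separable(2000))  # 4002000
```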

Differential Grouping 2

Differential Grouping 2 (DG2) was proposed in 2017 [87] and addresses some of the drawbacks of DG, which are (1) the dependency on a threshold parameter, (2) the inability to deal with so-called “overlapping” components, and (3) the high computational overhead. Compared to its predecessor, DG2 obtains a better group quality and requires a lower computational budget. The results of evaluated solutions are stored and reused during the grouping process. In this way, the necessary function evaluations for DG2 are reduced to n(n+1)/2 + 1. For a 1000-variable problem this results in 500,501 evaluations. This makes DG2 superior to its predecessor, although for real-world problems this amount can still be infeasible, depending on the application.
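The quoted number follows directly from the formula; a quick check (function name ours):

```python
def dg2_evals(n):
    # DG2 stores and reuses evaluated solutions, reducing the total
    # cost of all pairwise checks to n*(n+1)/2 + 1 evaluations.
    return n * (n + 1) // 2 + 1

print(dg2_evals(1000))  # 500501
```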

Moreover, DG2 improves the quality of the found groups. DG had the disadvantage of iterating over the variables and adding a variable to a group as soon as the first interaction with another variable was found. Therefore, variables which interact with only one variable in one group but with many variables in a second group might end up in the first of these groups. DG2, in contrast, builds up the complete “graph” of interactions between all variable pairs and forms groups based on this information. For more details on the specific mechanisms of DG2, the reader is referred to the original publication.
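Forming groups from a complete pairwise interaction graph amounts to computing its connected components. The following sketch, assuming a boolean adjacency matrix as input, illustrates the idea; it is not DG2's actual implementation.

```python
import numpy as np

def groups_from_interaction_graph(adj):
    """Groups = connected components of the interaction graph.

    adj is a symmetric boolean (n, n) matrix where adj[i][j] marks a
    detected interaction between variables i and j. A depth-first
    search collects each component.
    """
    n = len(adj)
    seen, groups = set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, comp = [start], []
        seen.add(start)
        while stack:
            v = stack.pop()
            comp.append(v)
            for w in range(n):
                if adj[v][w] and w not in seen:
                    seen.add(w)
                    stack.append(w)
        groups.append(sorted(comp))
    return groups

# Hypothetical result: variables 0-1 and 2-3 interact, 4 is separable.
adj = np.zeros((5, 5), dtype=bool)
adj[0, 1] = adj[1, 0] = adj[2, 3] = adj[3, 2] = True
print(groups_from_interaction_graph(adj))  # [[0, 1], [2, 3], [4]]
```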

Interdependence Analysis in MOEA/DVA

The basis for the grouping mechanism of MOEA/DVA, called Interdependence Analysis in the original article, is the definition in Eq. (2.6). After creating an initial population, the variables are first divided into convergence- and diversity-related variables. The following interaction-based analysis is then applied to all variables, although the division into groups is done for the convergence-related variables only in the subsequent step.


For every combination of variables xi and xj (i, j ∈ {1, ..., n}), one solution is selected randomly from the current population of the algorithm. The method then creates three new individuals by sampling random values for xi and xj within their feasible domains and replacing the variable values in the chosen solution candidate. For each of the m objective functions, the interaction is checked separately, using these four solutions, according to Definition 2.5. The interdependence analysis possesses a parameter NIA, which determines how often this process is repeated in order to increase the probability of finding the variables’ interactions. To utilise the created solutions, the current population is updated after each check using the three newly created solutions: if the variables at hand are convergence-related and the new solutions dominate the current solution in the population, the new values for xi and xj are kept and the solution in the population is replaced.

In the following step, the interaction information for each of the objective functions is used to form the different groups. For the algorithm to consider two variables as interacting, both need to be convergence-related and have an interaction in at least one of the objective functions. In addition, any interaction between two variables leads to the inclusion of all variables that further interact with either of them into the same group. The article describes this concept as forming the groups as maximal connected subgraphs of the variable interaction graph. This concept may lead to potentially large groups if the problem contains many interacting variables or if the interactions differ strongly between the objective functions.
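A minimal sketch of this multi-objective combination, assuming hypothetical per-objective interaction matrices: interactions found in any objective are merged with a logical OR before the maximal connected subgraphs are formed (here via a simple union-find).

```python
import numpy as np

# Hypothetical per-objective interaction matrices for m = 2 objectives
# over n = 4 variables (results of the pairwise checks described above).
adj_f1 = np.zeros((4, 4), dtype=bool)
adj_f2 = np.zeros((4, 4), dtype=bool)
adj_f1[0, 1] = adj_f1[1, 0] = True   # x0 and x1 interact in f1 only
adj_f2[1, 2] = adj_f2[2, 1] = True   # x1 and x2 interact in f2 only

# An interaction in at least one objective counts as an interaction.
adj = adj_f1 | adj_f2

# Union-find over the combined graph yields the maximal connected
# subgraphs; note that x0 and x2 end up in one group via x1.
parent = list(range(4))
def find(v):
    while parent[v] != v:
        v = parent[v]
    return v
for i in range(4):
    for j in range(4):
        if adj[i, j]:
            parent[find(i)] = find(j)
groups = {}
for v in range(4):
    groups.setdefault(find(v), []).append(v)
print(sorted(groups.values()))  # [[0, 1, 2], [3]]
```

The example also illustrates the potential for large groups mentioned above: two variables that never interact directly (x0 and x2) are still grouped together through a chain of interactions.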

Regarding the computational budget, it is stated in [24] that the interaction analysis needs 3n(n−1)·NIA / 2 function evaluations. This is a larger amount than needed by the DG and DG2 algorithms, which is due to the parameter NIA. The interactions in MOEA/DVA are checked using random values instead of fixed ones (i.e. the upper and lower bounds in DG). Therefore, it can potentially discover more local interactions, but to increase the chances of finding these interactions in all objective functions, the procedure is repeated NIA times. As a result, to analyse a 1000-variable problem, the algorithm needs approximately NIA · 1,500,000 function evaluations. Using a value of NIA = 6, as was done for instance in the experiments in [25], the algorithm uses almost 9,000,000 evaluations to analyse the problem, which consumed most of the evaluations used in the experiments of the article.
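These evaluation counts can be verified with a short computation (the function name is ours):

```python
def moeadva_ia_evals(n, nia):
    # Interdependence analysis cost as stated in [24]:
    # 3 * n * (n - 1) / 2 evaluations, repeated NIA times.
    return 3 * n * (n - 1) * nia // 2

print(moeadva_ia_evals(1000, 1))  # 1498500  (~1.5 million per repetition)
print(moeadva_ia_evals(1000, 6))  # 8991000  (~9 million for NIA = 6)
```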

Interaction Analysis in LMEA

The interaction-based grouping method in LMEA was called Interaction Analysis [25] and works similarly to the method used in MOEA/DVA. Interaction between two variables is assumed according to the above Definition 2.5. As in MOEA/DVA, an interaction in one of the objective functions is sufficient to regard two variables as interacting.

The difference to the MOEA/DVA method lies in the way variables are assigned to the created groups. The interaction analysis builds the groups iteratively by adding the variables one after another. Each variable xi is checked for interactions with all variables in the already existing groups. If an interaction with at least one of these variables exists, xi is added to the respective group. This interaction check is done using the mentioned equations and a number of randomly chosen solutions from the current population. The number of solutions drawn and used for the interaction checks is called nCor in the article and roughly corresponds to the parameter NIA in MOEA/DVA. Groups are formed by iteratively building the union of sets which share an interaction with a variable.

Therefore, the groups are formed in the same way as in MOEA/DVA, meaning that a single interacting pair of variables between two groups is sufficient to join them into one large group, even if many of the variables inside this group do not interact with each other.
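The iterative assignment can be sketched as follows; `interacts(i, j)` is a stand-in for the check performed with the nCor randomly drawn solutions, and all names are illustrative rather than LMEA's actual code.

```python
def lmea_style_grouping(variables, interacts):
    """Iteratively assign variables to groups (illustrative sketch).

    Each variable joins every existing group it interacts with; if it
    links several groups, those groups are merged into one.
    """
    groups = []
    for v in variables:
        linked = [g for g in groups if any(interacts(v, w) for w in g)]
        if not linked:
            groups.append([v])            # start a new group
        else:
            merged = [v]
            for g in linked:              # union all linked groups
                merged.extend(g)
                groups.remove(g)
            groups.append(sorted(merged))
    return groups

# Hypothetical interaction structure: 0-1 interact and 2-3 interact.
pairs = {(0, 1), (1, 0), (2, 3), (3, 2)}
print(lmea_style_grouping(range(4), lambda i, j: (i, j) in pairs))
# [[0, 1], [2, 3]]
```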

In contrast to MOEA/DVA, LMEA uses only the convergence-related variables in the analysis. For problems with a larger number of diversity-related variables, this may lead to a lower computational overhead. However, the general computational costs are similar to those of MOEA/DVA. In the experiments of the original article, the parameter nCor was set to 6, meaning that the analysis again takes up to 9,000,000 evaluations for a 1000-variable problem.

Random-based Dynamic Grouping

One of the most striking disadvantages of interaction-based grouping methods like DG or the ones used in MOEA/DVA and LMEA is the necessity of a large computational budget to find the interacting variables. To address this issue, a Random-based Dynamic Grouping strategy called RDG was proposed in 2016 [67]. The aim of this RDG strategy is a reduction of the necessary function evaluations when creating the groups based on interaction information. The RDG method was implemented into the same framework as the MOEA/DVA algorithm and therefore also divides only the convergence-related variables into groups [6].

The authors of RDG argue that, especially in many-objective optimisation, groups that are suitable for all objective functions might be hard to find or non-existent, and that interactions which exist only locally between variables can only be found with a large computational overhead. Therefore, they apply random groups in RDG: in each iteration of the main loop of MOEA/DVA, new random groups are created and optimised based on the CC-inspired framework. The dynamic property of the method refers to the sizes of the groups. All groups are always created randomly, but the probabilities of choosing the group sizes vary and are updated based on the success of previous usages of these sizes. At the beginning of the algorithm, all group sizes have the same probability of being chosen. When a group size was used to create random groups, the different groups are optimised, and a performance metric is used afterwards to determine how much of the old population is dominated by the population that was just optimised with the given group size. Through this mechanism, the algorithm iteratively