Uncrowded Hypervolume Improvement: COMO-CMA-ES and the Sofomore Framework
Abstract.
We present a framework to build a multiobjective optimization algorithm from single-objective ones. This framework addresses the p × n-dimensional problem of finding p solutions in an n-dimensional search space, maximizing a quality indicator by dynamic subspace optimization. Each single-objective algorithm optimizes the indicator function given p − 1 fixed solutions. Crucially, dominated solutions minimize their distance to the empirical Pareto front defined by these p − 1 solutions. We instantiate the framework with CMA-ES as single-objective optimizer. The new algorithm, COMO-CMA-ES, is empirically shown to converge linearly on bi-objective convex-quadratic problems and is compared to MO-CMA-ES, NSGA-II and SMS-EMOA.
1. Introduction
Multiobjective optimization problems must be solved frequently in practice. In contrast to the optimization of a single objective, solving a multiobjective problem involves handling trade-offs or incomparabilities between the objective functions, such that the aim is to approximate the Pareto set—the set of all Pareto-optimal or nondominated solutions. One might be interested in obtaining an approximation of unbounded size (the more points the better) or just in a fixed number of points approximating the Pareto set. Evolutionary Multiobjective Optimization (EMO) algorithms aim at such an approximation in a single algorithm run, whereas more classical approaches, e.g. optimizing a weighted sum of the objectives with changing weights, operate in multiple runs.
The first EMO algorithms simply changed the selection of an existing single-objective evolutionary algorithm while keeping the exact same search operators. The population at a given iteration then provided an approximation of the Pareto set. This idea led to the practically highly successful NSGA-II algorithm (Deb et al., 2002) that employs a two-step fitness assignment: after a first nondominated ranking (Goldberg, 1989), solutions with equal nondomination rank are further distinguished by their crowding distance—based on the distance of each solution to its neighbors in objective space. However, it has been pointed out that NSGA-II does not converge to the Pareto set in a mathematical sense due to so-called deteriorative cycles: if all population members are nondominated at some point in time, only the crowding distance is optimized, which does not indicate any search direction towards the Pareto set. As a result, solutions that were nondominated at some point in time can later be replaced by previously dominated ones, resulting in cyclic rather than convergent behavior (Berghammer et al., 2010).
To improve the convergence properties of EMO algorithms, different approaches have been introduced later on, most notably indicator-based algorithms and especially algorithms based on the hypervolume indicator. They replace the crowding distance of NSGA-II with the (hypervolume) indicator contribution, see e.g. (Igel et al., 2007; Beume et al., 2007). The hypervolume indicator has the advantage of being the only known strictly monotone quality indicator (Knowles et al., 2006) (see also the next section); thus, its optimization results in solution sets that are subsets of the Pareto set.
The optimization goal of indicator-based algorithms such as SMS-EMOA (Beume et al., 2007) or MO-CMA-ES (Igel et al., 2007) is to find the best set of p solutions with respect to a given quality indicator, i.e. the set with the largest quality indicator value among all sets of size p. This optimal set of solutions is known as the optimal p-distribution (Auger et al., 2009). In principle, the search for the optimal p-distribution can be formalized as an n × p-dimensional optimization problem, where p is the number of solutions and n is the dimension of the search space.
As we will discuss later, it turns out that this optimization problem is not only of too high dimension in practice but also flat in large regions of the search space if the hypervolume indicator is the underlying quality indicator. The combination of nondominated ranking and hypervolume contribution as in SMS-EMOA or MO-CMA-ES corrects for this flatness, but also introduces search directions that point towards already existing nondominated solutions and not towards not-yet-covered regions of the Pareto set. In this paper, we show that we can correct the flat region of the hypervolume indicator and introduce a search bias towards yet-uncovered regions of the Pareto set by adding the distance to the empirical nondomination front, which leads to the new notion of Uncrowded Hypervolume Improvement. We then define a (dynamic) fitness function that can be optimized by single-objective algorithms. From there, going back to the original idea of EMO algorithms to use single-objective optimizers to build an EMO, we define the Single-objective Optimization FOr Optimizing Multiobjective Optimization pRoblEms framework (Sofomore) to build, in an elegant manner, a multiobjective algorithm from a set of single-objective optimizers. Each single-objective algorithm optimizes (iteratively or in parallel) a dynamic fitness that depends on the output of the other optimizers.
We instantiate the Sofomore framework with the state-of-the-art single-objective algorithm CMA-ES. We show experimentally that the ensuing COMO-CMA-ES (Comma-Selection Multiobjective CMA-ES) exhibits linear convergence towards the optimal p-distribution on a wide variety of bi-objective convex quadratic functions. In contrast, default implementations of SMS-EMOA (where the reference point is fixed) and of NSGA-II do not exhibit this linear convergence. The comparison between COMO-CMA-ES and a previous MATLAB implementation of the elitist MO-CMA-ES also shows the same or an improved convergence speed for COMO-CMA-ES, except on the double-sphere function.
The paper is structured as follows. In the next section, we start with preliminaries on multiobjective optimization and quality indicators. Section 3 discusses the fitness landscape of indicator-based and especially hypervolume-based quality measures and eventually introduces our Sofomore framework. Section 4 details the new COMO-CMA-ES algorithm as an instantiation of Sofomore with CMA-ES. Section 5 experimentally validates the new algorithm and compares it with three existing algorithms from the literature, and Section 6 discusses the results and concludes the paper.
2. Preliminaries
In the following, we assume without loss of generality the minimization of a vector-valued function f = (f₁, …, f_m): x ∈ ℝⁿ ↦ f(x) ∈ ℝ^m that maps a search point x from the search space ℝⁿ of dimension n to the objective space ℝ^m. This minimization of f is generally formalized in terms of the weak Pareto dominance relation, for which we write that a search point x weakly Pareto-dominates another search point y (written in short as x ⪯ y or, with an abuse of notation, as f(x) ⪯ f(y)) if and only if fᵢ(x) ≤ fᵢ(y) for all 1 ≤ i ≤ m. Note also that we can naturally extend the (weak) Pareto dominance relation to subsets A, B ⊆ ℝⁿ as A ⪯ B if and only if for all b ∈ B, there exists a ∈ A such that a ⪯ b. If the relation is strict for at least one objective function, we say that x Pareto-dominates y (and write x ≺ y). The set of nondominated search points constitutes the so-called Pareto set; its image under f is called the Pareto front. In the remainder, we will also use the term empirical nondominated front or empirical Pareto front 𝒫(S) for the objective vectors that are on the boundary of the (objective space) region dominating a reference point r ∈ ℝ^m, and not dominated by any element of f(S) = {f(s) : s ∈ S} with S ⊆ ℝⁿ:
(1)  𝒫(S) = {u ∈ ℝ^m : u ⪯ r} ∩ ∂D,  with D = {u ∈ ℝ^m : ∄ v ∈ f(S) such that v ⪯ u},

where ∂D is the boundary of the nondominated region D. Note that 𝒫(S) is the Pareto front when S contains the Pareto set.
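These dominance relations are straightforward to express in code. The following is a minimal Python sketch for minimization; the function names are ours, not the paper's:

```python
def weakly_dominates(u, v):
    """u weakly Pareto-dominates v iff u_i <= v_i for every objective i."""
    return all(ui <= vi for ui, vi in zip(u, v))

def dominates(u, v):
    """u Pareto-dominates v: weak dominance, strict in at least one objective."""
    return weakly_dominates(u, v) and any(ui < vi for ui, vi in zip(u, v))

def set_weakly_dominates(A, B):
    """Set extension: A weakly dominates B iff every b in B is weakly
    dominated by some a in A."""
    return all(any(weakly_dominates(a, b) for a in A) for b in B)

def nondominated(points):
    """The nondominated subset of a list of objective vectors."""
    return [u for u in points if not any(dominates(v, u) for v in points)]
```

The nondominated filter directly yields the Pareto set approximation of a finite set of evaluated points.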
Indicator-Based Set Optimization Problems
Pareto sets and Pareto fronts are, under mild assumptions, (m − 1)-dimensional manifolds. In practice, we are often interested in a finite-size approximation of these sets with, let us say, p search points. To assess the quality of a Pareto set approximation S, a quality indicator I assigns a real-valued quality I(S) to S. Formally speaking, this transforms the original multiobjective optimization of f into the single-objective set problem of finding the so-called optimal p-distribution (Auger et al., 2012)
(2)  S* ∈ argmax { I(S) : S ⊆ ℝⁿ, |S| ≤ p }
as the set of search points of cardinality p (or lower) with the highest indicator value among all sets of this size (Auger et al., 2009).
Natural candidates for practically relevant quality indicators are monotone or even strictly monotone indicators such as the epsilon-indicator (Zitzler and Künzli, 2004), the R2 indicator (Hansen and Jaszkiewicz, 1998), or the hypervolume indicator ((Zitzler and Thiele, 1998a; Auger et al., 2009), still the only known strictly monotone indicator family to date). We remind that an indicator I is called monotone if A ⪯ B implies I(A) ≥ I(B)—or in other words, if it does not contradict the weak Pareto dominance relation. If in addition A ≺ B implies I(A) > I(B), we say that I is strictly monotone.
Hypervolume, Hypervolume Contribution, and Hypervolume Improvement
Because the hypervolume indicator (Zitzler and Thiele, 1998a; Auger et al., 2009) and its weighted variants are the only known strictly monotone indicators, we will later on use it as well in our framework. The hypervolume HV(S) (Zitzler and Thiele, 1998b) of a finite set S of solutions with respect to the reference point r ∈ ℝ^m is defined as HV(S) = λ({u ∈ ℝ^m : ∃ s ∈ S such that f(s) ⪯ u ⪯ r}), where λ is the Lebesgue measure on the objective space ℝ^m and f is the objective function. In the case of two objective functions, the hypervolume indicator value of p nondominated solutions s₁, …, s_p with f₁(s₁) < f₁(s₂) < ⋯ < f₁(s_p) can also be written as the sum of the areas of axis-parallel rectangles: HV(S) = Σᵢ₌₁ᵖ (r₁ − f₁(sᵢ)) · (f₂(sᵢ₋₁) − f₂(sᵢ)), with f₂(s₀) := r₂.
Furthermore, the hypervolume contribution of a search point s ∈ S to a solution set S with respect to the reference point r is the hypervolume indicator value that we lose when we remove s from the set (Bringmann and Friedrich, 2011): HVC(s, S) = HV(S) − HV(S \ {s}). Analogously, the hypervolume improvement of a search point x with respect to S is the hypervolume gained when adding x to the set: HVI(x, S) = HV(S ∪ {x}) − HV(S).
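For two objectives, all three quantities follow directly from the rectangle decomposition above. A minimal Python sketch for minimization, where objective vectors and the reference point r are 2-tuples (helper names are ours):

```python
def pareto_filter(points):
    """Nondominated subset, sorted by increasing first objective."""
    front = []
    for p in sorted(set(map(tuple, points))):  # sorted by f1, then f2
        if not front or p[1] < front[-1][1]:
            front.append(p)
    return front

def hypervolume(points, r):
    """Area dominated by `points` and bounded by r, computed as a sum of
    axis-parallel rectangles over the sorted nondominated points."""
    hv, prev_f2 = 0.0, r[1]
    for f1, f2 in pareto_filter(points):
        if f1 < r[0] and f2 < prev_f2:
            hv += (r[0] - f1) * (prev_f2 - f2)
            prev_f2 = f2
    return hv

def contribution(s, points, r):
    """Hypervolume lost when removing s from the set."""
    rest = [q for q in points if tuple(q) != tuple(s)]
    return hypervolume(points, r) - hypervolume(rest, r)

def improvement(x, points, r):
    """Hypervolume gained when adding x to the set."""
    return hypervolume(list(points) + [x], r) - hypervolume(points, r)
```

For example, with the set {(1, 3), (2, 2), (3, 1)} and r = (5, 5), the hypervolume is 8 + 3 + 2 = 13 and the contribution of (2, 2) is 1.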
3. Sofomore: Building Multiobjective from Single-Objective Algorithms
Quality indicators have been introduced as a way to measure the quality of a set of objective vectors, but also to define a multiobjective optimization problem as the single-objective set problem of maximizing the quality indicator as in (2). This naturally defines a single-objective n × p-dimensional problem to be maximized:
(3)  I_F : (x₁, …, x_p) ∈ ℝ^{n×p} ↦ I({x₁, …, x_p}).
Because n and in particular n × p are typically large in practice, we usually do not attempt to solve a multiobjective optimization problem by directly optimizing (3). Nevertheless, when I is the hypervolume indicator, Hernández et al. suggest using a Newton method to directly solve (3). It assumes that I_F is twice continuously differentiable, in which case the gradient and Hessian of I_F can be computed analytically (Hernandez et al., 2018). Yet, directly attacking (3) is not possible in general because dominated points have a zero (sub)gradient and the Newton direction is therefore zero. Thus, Hernández et al. need to start from a set of nondominated points, close enough to the Pareto set, which in practice requires coupling the approach with another algorithm (Hernandez et al., 2018).
Instead of directly optimizing (3), our proposed Sofomore framework performs iterative subspace optimization of the function I_F and penalizes the flat landscape of I_F in dominated regions. More precisely, the basic idea behind Sofomore is to optimize I_F subspace- or component-wise, by iteratively fixing all but one search point and optimizing the indicator only with respect to the remaining point while the other search points are temporarily fixed. Hence we maximize the p functions
(4)  x ∈ ℝⁿ ↦ I({x₁, …, xᵢ₋₁, x, xᵢ₊₁, …, x_p}),  for 1 ≤ i ≤ p.
If the placement of each of the p search points xᵢ (1 ≤ i ≤ p) is optimized iteratively by fixing a different point set each time, as we suggest in our Sofomore framework below, we are in the setting of optimizing a dynamic fitness. More details on this aspect of our Sofomore framework will be given below in Section 3.2.
3.1. A Fitness Function for Subspace Optimization
If we use as quality indicator in (4) a (strictly) monotone indicator like the hypervolume indicator, the overall fitness is flat in the interior of the regions where points are dominated. Hence, we suggest not to optimize (4) directly but to unflatten it in dominated areas of the search space without changing the optimization goal.
Any solution that is dominated by the other points in S will receive zero fitness when we use as indicator in (4) the hypervolume indicator of the entire set with respect to the reference point, or when we replace it with the hypervolume improvement of the solution with respect to the other points. This situation is depicted in the first column of Figure 1 where, for a fixed set of six arbitrarily chosen search points, the hypervolume improvement's level sets (of equal fitness) in both search and objective space are shown. This flat fitness with zero gradient does not allow steering the search towards better search points, which has also been highlighted by Hernández et al. (Hernandez et al., 2018).
A common approach to guide an optimization algorithm in the dominated space is to use the hypervolume (contribution) as secondary fitness after nondominated sorting (Goldberg, 1989), as done for example in the SMS-EMOA (Beume et al., 2007) or the MO-CMA-ES (Igel et al., 2007). The idea is that all search points with a worse nondomination rank get assigned a fitness that is worse than that of search points with a better nondomination rank. Within a set of points of the same rank, the hypervolume contribution with respect to all points of that rank is used to refine the fitness. The middle column of Figure 1 shows the resulting level sets of equal fitness. As we can see, this fitness assignment distinguishes between dominated solutions, i.e. the fitness is not flat anymore. Yet it still has another major disadvantage: the search direction in the dominated area (perpendicular to the level sets) points towards already existing nondominated solutions. Attracting dominated solutions towards nondominated ones is however undesirable, as they will compete for the same hypervolume area. Instead, we want dominated points to enter the uncrowded space between nondominated points, thereby complementing their hypervolume contribution (improvement).
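The first step of this two-step fitness assignment, the nondominated ranking, can be sketched in a few lines of Python (within one rank, the hypervolume contribution as computed above would then break ties; names are ours):

```python
def dominates(u, v):
    return (all(a <= b for a, b in zip(u, v))
            and any(a < b for a, b in zip(u, v)))

def nondomination_ranks(points):
    """Rank 0 for nondominated points, rank 1 for points that become
    nondominated once rank 0 is removed, and so on (Goldberg, 1989)."""
    ranks = [None] * len(points)
    rank = 0
    while any(x is None for x in ranks):
        alive = [i for i, x in enumerate(ranks) if x is None]
        for i in alive:
            # i belongs to the current front if no other alive point dominates it
            if not any(dominates(points[j], points[i]) for j in alive if j != i):
                ranks[i] = rank
        rank += 1
    return ranks
```

Each pass peels off one nondominated front, so the loop terminates after as many passes as there are fronts.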
Uncrowded Hypervolume Improvement
For this purpose, we define the Uncrowded Hypervolume Improvement (UHVI), based on the hypervolume improvement for nondominated search points and on the Euclidean distance to the nondominated region for dominated search points. More concretely, the UHVI of a search point x with respect to a finite set S ⊆ ℝⁿ and the reference point r is defined as
(5)  UHVI(x, S) = HVI(x, S) if x is nondominated with respect to S, and UHVI(x, S) = −d(f(x), 𝒫(S)) otherwise,
where d(f(x), 𝒫(S)) is the distance between the objective vector f(x) and the empirical nondomination front 𝒫(S) of the set S as defined in (1).
We define the fitness for a search point x with respect to the other solutions x₁, …, xᵢ₋₁, xᵢ₊₁, …, x_p in S as

(6)  x ∈ ℝⁿ ↦ UHVI(x, S \ {xᵢ}).
Note that this fitness is continuous on the empirical nondomination front, where both the hypervolume improvement and the considered distance are zero.
Figure 2 illustrates this fitness for one nondominated and two dominated search points (blue plusses) with respect to a set of six other search points (black crosses). The right-hand column of Figure 1 shows the level sets of this fitness. The newly introduced hypervolume-improvement-and-distance-based fitness shows smooth level sets, both in search and in objective space. Maybe most importantly, in the dominated area, the direction of fitness improvement (perpendicular to the level sets) now points towards the gaps in the current Pareto front approximation.
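For two objectives, the UHVI of Eq. (5) can be computed exactly: the empirical front is a staircase, and the distance term is a minimum over point-to-segment distances. A self-contained Python sketch for minimization follows (all helper names are ours):

```python
import math

def pareto_filter(points):
    """Nondominated subset, sorted by increasing first objective."""
    front = []
    for p in sorted(set(map(tuple, points))):
        if not front or p[1] < front[-1][1]:
            front.append(p)
    return front

def hypervolume(points, r):
    """2-D hypervolume as a sum of axis-parallel rectangles."""
    hv, prev_f2 = 0.0, r[1]
    for f1, f2 in pareto_filter(points):
        if f1 < r[0] and f2 < prev_f2:
            hv += (r[0] - f1) * (prev_f2 - f2)
            prev_f2 = f2
    return hv

def seg_dist(p, a, b):
    """Euclidean distance from point p to the segment [a, b]."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    den = dx * dx + dy * dy
    t = 0.0 if den == 0 else max(0.0, min(1.0,
        ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / den))
    return math.hypot(p[0] - a[0] - t * dx, p[1] - a[1] - t * dy)

def front_distance(u, fvals, r):
    """Distance from objective vector u to the empirical nondominated front
    of Eq. (1): the staircase bounding the dominated region, clipped at r."""
    front = pareto_filter(fvals)
    segments, prev_y = [], r[1]
    for i, (x, y) in enumerate(front):
        segments.append(((x, prev_y), (x, y)))            # vertical piece
        next_x = front[i + 1][0] if i + 1 < len(front) else r[0]
        segments.append(((x, y), (next_x, y)))            # horizontal piece
        prev_y = y
    return min(seg_dist(u, a, b) for a, b in segments)

def uhvi(u, fvals, r):
    """UHVI of the objective vector u w.r.t. the objective vectors in fvals:
    hypervolume improvement if nondominated, minus the distance to the
    empirical front otherwise."""
    if any(all(vi <= ui for vi, ui in zip(v, u)) and tuple(v) != tuple(u)
           for v in fvals):                               # u is dominated
        return -front_distance(u, fvals, r)
    return hypervolume(list(fvals) + [u], r) - hypervolume(fvals, r)
```

With the front {(1, 3), (2, 2), (3, 1)} and r = (5, 5), the dominated point (2.5, 2.5) obtains UHVI −0.5 (its distance to the staircase), while the nondominated point (1.5, 1.5) obtains its hypervolume improvement of 1.25.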
3.2. Iteratively Optimizing the Fitness: The Sofomore Framework
Having discussed a fitness assignment that is worth optimizing, we come back to our initial idea of subspace optimization and define the underlying algorithmic framework behind Sofomore.
At first, we consider a single-objective optimizer in an abstract manner as an iterative algorithm with state θₜ updated as θₜ₊₁ = F(θₜ, f, Uₜ), where f is the single-objective function optimized by the optimizer and Uₜ encodes possible random variables sampled within one iteration if we consider a randomized algorithm (Uₜ can be taken as constant in the case of a deterministic optimizer). The transition function F contains all updates done within the algorithm in one iteration.
We assume that in each iteration t, the optimizer returns a best estimate of the optimum, often called incumbent solution or recommendation. This is the solution that the optimizer would return if we stopped it at iteration t. We denote this incumbent as x̂(θₜ)—a mapping from the state of the algorithm to the estimate of the optimum given this state.
The overall idea behind the subspace optimization and the Sofomore framework can then be formalized as in Algorithm 1: after initializing p single-objective algorithms with their states θ¹, …, θᵖ and denoting their transition functions as F¹, …, Fᵖ, we consider their incumbents or recommendations x̂(θⁱ) (1 ≤ i ≤ p) as the p search points that are expected to approximate the optimal p-distribution.
In each step of the Sofomore framework, we choose one of the p algorithms (denoted by its number i, with 1 ≤ i ≤ p) and run it for a certain number of iterations on the fitness (6) to update the recommendation x̂(θⁱ) while keeping all other recommendations fixed. It is important to note that the fitness used for algorithm i actually changes dynamically with the optimization because it depends on all the other incumbents x̂(θʲ) with j ≠ i which, over time, are expected to move towards the Pareto set as well.
Algorithm 1 proposes a generic framework in which the order in which the single-objective algorithms are run and the numbers of iterations they are run for are not explicitly defined. A simple strategy would be to choose the algorithms at random or in a given, fixed order and run each single-objective algorithm for a fixed number of time steps. But more elaborate strategies can also be envisioned, for example based on the idea of multi-armed bandits (Bubeck and Cesa-Bianchi, 2012): we can log the changes in the fitness value of each incumbent over time and favor as the next chosen algorithms the ones that gave the highest expected fitness improvements. Note also that the single-objective algorithms' types may differ, such that we can combine local with global algorithms or even change the algorithms over time, allow restarts, etc. In the following experimental validation of our concept, however, we choose a single type of optimization algorithm and a simple, random strategy to decide which instance to run next.
With a simple change, Algorithm 1 can be made parallelizable (resulting in slightly different search dynamics though): postponing the updates of the incumbents x̂(θⁱ) until every algorithm has been touched at least once makes the optimizations of the p fitness functions independent, such that they can be performed in parallel.
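To make the framework concrete, the following toy instantiation of Algorithm 1 uses p elitist (1+1) hill climbers with a fixed step-size as stand-in single-objective optimizers and a double-sphere problem as stand-in test function; all names and parameter values below are ours, chosen for illustration only:

```python
import math
import random

def pareto_filter(points):
    front = []
    for p in sorted(set(map(tuple, points))):
        if not front or p[1] < front[-1][1]:
            front.append(p)
    return front

def hypervolume(points, r):
    hv, prev = 0.0, r[1]
    for f1, f2 in pareto_filter(points):
        if f1 < r[0] and f2 < prev:
            hv += (r[0] - f1) * (prev - f2)
            prev = f2
    return hv

def seg_dist(p, a, b):
    dx, dy = b[0] - a[0], b[1] - a[1]
    den = (dx * dx + dy * dy) or 1.0
    t = max(0.0, min(1.0, ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / den))
    return math.hypot(p[0] - a[0] - t * dx, p[1] - a[1] - t * dy)

def uhvi(u, fvals, r):
    """Uncrowded hypervolume improvement of Eq. (5) for two objectives."""
    if any(all(vi <= ui for vi, ui in zip(v, u)) and tuple(v) != tuple(u)
           for v in fvals):   # dominated: minus distance to the staircase front
        front, segs, prev = pareto_filter(fvals), [], r[1]
        for i, (x, y) in enumerate(front):
            nx = front[i + 1][0] if i + 1 < len(front) else r[0]
            segs += [((x, prev), (x, y)), ((x, y), (nx, y))]
            prev = y
        return -min(seg_dist(u, a, b) for a, b in segs)
    return hypervolume(list(fvals) + [u], r) - hypervolume(fvals, r)

def double_sphere(x):
    """Toy bi-objective problem: minimize ||x||^2 and ||x - 1||^2."""
    return (sum(xi ** 2 for xi in x), sum((xi - 1) ** 2 for xi in x))

def sofomore(fun, n, p, sweeps, sigma, r, rng):
    """Algorithm 1 sketch: p (1+1) hill-climber 'kernels'; in each sweep the
    kernels are updated once, in randomly permuted order, each maximizing the
    dynamic UHVI fitness w.r.t. the other kernels' incumbents."""
    X = [[rng.uniform(0, 1) for _ in range(n)] for _ in range(p)]
    for _ in range(sweeps):
        for i in rng.sample(range(p), p):          # random permutation
            others = [fun(x) for j, x in enumerate(X) if j != i]
            cand = [xk + sigma * rng.gauss(0, 1) for xk in X[i]]
            if uhvi(fun(cand), others, r) >= uhvi(fun(X[i]), others, r):
                X[i] = cand                         # (1+1)-style acceptance
    return X
```

Because each kernel only accepts candidates that do not decrease its UHVI with respect to the other incumbents, the hypervolume of the incumbent set never decreases over the run.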
Relation of Sofomore with other existing algorithms
We briefly discuss how some existing algorithms and algorithm frameworks relate to the new Sofomore proposal.
The coupling of single-objective algorithms to form a multiobjective one has been done before, especially in the MOEA/D framework (Zhang and Li, 2007). In MOEA/D, static search directions (in objective space) are defined via (single-objective) scalarizing functions. Each of them is optimized in parallel, with solutions potentially shared between neighboring search directions. On the contrary, the fitness in Sofomore is dynamic, depending on the other incumbents. Classical approaches to multiobjective optimization that optimize a set of scalarizing functions solve static optimization problems without any interaction between them (Miettinen, 1999).
Many other EMO algorithms, such as NSGA-II, SMS-EMOA, or MO-CMA-ES, are not covered by the Sofomore framework. One simple reason is that the UHVI is newly defined.
The already mentioned Newton algorithm on the hypervolume indicator fitness of (Hernandez et al., 2018) is probably the closest existing approach to Sofomore, but (Hernandez et al., 2018) needs to initialize the Newton algorithm with a set of nondominated solutions in order for the algorithm to make progress, due to the flat regions of its objective function. Also algorithms for expensive multiobjective optimization based on the optimization of the expected hypervolume improvement (Wagner et al., 2010) can be seen as related to Sofomore, although the proposal of new solutions in algorithms like SMS-EGO (Ponweiser et al., 2008) or S-metric-based ExI (Emmerich and Klinkenberg, 2008) uses Gaussian processes to model the objective functions. These algorithms, in contrast to Sofomore, iteratively propose a single solution based on the expected hypervolume improvement over all known solutions and do not aim at successively replacing a single recommendation by another (better) one. It is interesting to note that algorithms like SMS-EGO and S-metric-based ExI employ the expected hypervolume indicator improvement as fitness, while the approach of Keane (Keane, 2006) “uses the Euclidean distance to the nearest vector in the Pareto front” (Wagner et al., 2010).
4. COMO-CMA-ES
In this section, we instantiate Sofomore with CMA-ES as single-objective optimizer.
Regarding the choice of which optimization algorithm to run (and for how long), we opt for a simple strategy: we sample a permutation from S_p, the set of all permutations of {1, …, p}, uniformly at random and use this fixed permutation to touch each algorithm once in the order of the permutation. Once all algorithms have been touched, we resample a new permutation. We run each algorithm for a single iteration: letting the algorithms run for too long a period right from the start seems suboptimal, and since the fitness is dynamic, we do not need to optimize it too precisely. We mainly have two requirements for the choice of single-objective optimizers: (i) an optimization algorithm has to be stoppable at any iteration and resumable thereafter, and (ii) an optimization algorithm needs to be able to give a good recommendation of the best estimate of the optimum, given its current state. The Covariance Matrix Adaptation Evolution Strategy (CMA-ES, (Hansen and Ostermeier, 2001)) is a natural choice. Not only is it a state-of-the-art algorithm for difficult black-box optimization problems, but it also fulfills our requirements. In CMA-ES, the state of the algorithm is composed of a step-size σ and the parameters of a multivariate normal distribution, namely a mean vector m representing the favorite solution and a covariance matrix C. In addition, two n-dimensional evolution paths speed up step-size and covariance matrix adaptation. For each instance, the incumbent solution is the mean m of the CMA-ES algorithm. A convenient implementation of CMA-ES is via the ask-and-tell interface (Collette et al., 2010), where the ask function returns λ candidate solutions and the tell function updates the state from their fitness values. The interface makes it easy to stop and resume the optimization and to integrate the dynamic fitness of Sofomore, see Algorithm 2. We call this instantiation of the Sofomore framework COMO-CMA-ES. The CMA-ES instances are called kernels.
We see in particular how CMA-ES is integrated into Sofomore via its ask-and-tell interface. After choosing the next kernel i, the corresponding CMA-ES instance samples λ solutions (“ask”). It then evaluates them on the uncrowded hypervolume improvement based fitness defined in Eq. (6)—given that all other kernels are fixed. After sorting the solutions with respect to their fitness, COMO-CMA-ES feeds the sampled points with their fitness values back to the CMA-ES instance (“tell”), which updates all its internal algorithm parameters. Finally, the new mean of the corresponding CMA-ES instance updates the list of COMO-CMA-ES's proposed solutions. Note here that CMA-ES usually does not evaluate the mean of the sample distribution, which is therefore done in line 22.
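The ask-and-tell loop itself can be sketched independently of the optimizer. Below, a deliberately crude Gaussian evolution strategy (isotropic sampling, recombination of the better half, success-based step-size control) serves as a self-contained stand-in for a pycma CMA-ES kernel; all class names and constants are our own:

```python
import random

class GaussianKernel:
    """Minimal stand-in for one CMA-ES kernel with an ask-and-tell interface."""

    def __init__(self, x0, sigma, popsize, rng):
        self.mean, self.sigma = list(x0), sigma
        self.popsize, self.rng, self.best = popsize, rng, None

    def ask(self):
        """Sample popsize candidate solutions around the current mean."""
        return [[m + self.sigma * self.rng.gauss(0, 1) for m in self.mean]
                for _ in range(self.popsize)]

    def tell(self, X, F):
        """Update mean and step-size from the fitness values (maximization)."""
        order = sorted(range(len(X)), key=lambda i: -F[i])
        mu = max(1, len(X) // 2)
        self.mean = [sum(X[i][k] for i in order[:mu]) / mu
                     for k in range(len(self.mean))]
        best = F[order[0]]
        # crude success rule: grow the step-size while the best fitness
        # still improves, shrink it otherwise
        self.sigma *= 1.15 if (self.best is None or best > self.best) else 0.85
        self.best = best if self.best is None else max(best, self.best)

def como_step(kernels, fitness, rng):
    """One step of the Sofomore loop: pick a kernel at random and update it
    on the dynamic fitness, which may depend on the other incumbents."""
    i = rng.randrange(len(kernels))
    others = [k.mean for j, k in enumerate(kernels) if j != i]
    X = kernels[i].ask()
    kernels[i].tell(X, [fitness(x, others) for x in X])
```

In COMO-CMA-ES, the fitness passed to `como_step` would be the UHVI of Eq. (6) evaluated with respect to the other kernels' incumbents (their means).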
5. Experimental Validation
In this section, we present numerical experiments with COMO-CMA-ES. Though, in principle, the algorithm can be defined for any number of objectives, we present results only for two objectives. We use the pycma Python package (Hansen et al., 2019) for CMA-ES as single-objective optimizer, without further parameter tuning.
5.1. Test Functions and Performance Measures
For a matrix A and two vectors x and y, we denote

(7)  ‖x − y‖²_A := (x − y)ᵀ A (x − y).
We also denote by 0 the all-zeros vector, by 1 the all-ones vector, and by eᵢ the unit vector with its only nonzero entry at position i. Starting from a positive diagonal matrix D = diag(d₁, …, d_n) and two independent orthogonal matrices O₁ and O₂, we consider the classes of bi-objective convex quadratic problems sep, one and two defined as follows (Toure et al., 2019):

sep: x ↦ ( ‖x‖²_D , ‖x − u‖²_D ),

one: x ↦ ( ‖x‖²_{O₁ᵀDO₁} , ‖x − u‖²_{O₁ᵀDO₁} ),

two: x ↦ ( ‖x‖²_{O₁ᵀDO₁} , ‖x − u‖²_{O₂ᵀDO₂} ), with O₁ ≠ O₂,

where in each case the shift u is proportional to e₁ and scaled such that the second objective's quadratic form takes the value 1 at x = 0. If D is the identity matrix, we call the problems sphere-sep in the first case and bisphere in the second and third cases (the rotations are ineffective). If dᵢ = 10^{6(i−1)/(n−1)} for 1 ≤ i ≤ n, we denote the problems elli-sep, elli-one or elli-two. If d₁ = 1, dᵢ = 10⁴ for 1 < i < n, and d_n = 10⁸, we have cigtab-sep, cigtab-one or cigtab-two.
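In code, the three problem classes can be sketched for n = 2 as follows; note that the scaling of the shift vector u (proportional to e₁ and normalized such that uᵀH₂u = 1) is our reading of the construction above, so treat the details as an assumption:

```python
import math

def rotation(theta):
    """2-D rotation matrix."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def conjugate(O, d):
    """O^T diag(d) O for a 2x2 orthogonal O and eigenvalues d."""
    return [[sum(O[k][i] * d[k] * O[k][j] for k in range(2))
             for j in range(2)] for i in range(2)]

def quad(H, v):
    """Quadratic form v^T H v."""
    return sum(v[i] * H[i][j] * v[j] for i in range(2) for j in range(2))

def make_problem(kind, d, theta1=0.0, theta2=0.0):
    """Bi-objective convex quadratic problem of class 'sep', 'one' or 'two'
    for n = 2, built from eigenvalues d and rotation angles theta1, theta2."""
    if kind == "sep":                      # both objectives separable
        H1 = H2 = [[d[0], 0.0], [0.0, d[1]]]
    elif kind == "one":                    # one shared rotation
        H1 = H2 = conjugate(rotation(theta1), d)
    else:                                  # 'two': two independent rotations
        H1 = conjugate(rotation(theta1), d)
        H2 = conjugate(rotation(theta2), d)
    # shift u = e_1 scaled such that u^T H2 u = 1 (our assumption on the scaling)
    u = [1.0 / math.sqrt(H2[0][0]), 0.0]
    return lambda x: (quad(H1, x),
                      quad(H2, [x[0] - u[0], x[1] - u[1]]))
```

With this normalization, the sep and one problems have the Pareto front {(t², (1 − t)²) : 0 ≤ t ≤ 1}: the midpoint of the Pareto segment maps to (0.25, 0.25).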
We fix the reference point r for all experiments. The scalings above ensure that the reference point is dominated by all Pareto fronts considered, and that the sep and one problems have the same Pareto front as the bisphere (see (Toure et al., 2019)). Note that the expression of this Pareto front does not depend on the dimension n.
We use two performance measures in each run of an algorithm. First, the convergence gap, defined as the difference between an offset called hv_max and the hypervolume, called hv, of the p points found by the algorithm (the kernels' incumbents in the case of COMO-CMA-ES, the population for the other algorithms tested). Second, the archive gap, defined as the difference between an offset called hvarchive_max and the hypervolume, called hvarchive, of all nondominated points found by the algorithm. The offset hv_max is set for each problem to the maximum hypervolume value of p kernels found so far anytime the problem was optimized on our machines, plus a small number. For the sep and one problems, we take hvarchive_max as the hypervolume of the theoretical Pareto front. For the two class of problems, we use the analytic expression of their Pareto set (Toure et al., 2019) to sample a large number of points on the Pareto set and compute their hypervolume as hvarchive_max.
5.2. Linear convergence of COMO-CMA-ES
We investigate the convergence of COMO-CMA-ES for different dimensions and numbers of kernels, and display the results on the sphere-sep, elli-sep, cigtab-sep and elli-two functions. The global initial step-size and the initial lower and upper bounds (line 8 of Algorithm 2) are fixed across runs. In Figure 3, we observe linear convergence of the convergence gap (first column) on all test functions, starting roughly when all displayed ratios of nondominated points reach 1 (second column). The last three columns of Figure 3 illustrate the eigenspectra of the kernels' covariance matrices. The first two columns reveal two phases.
First, the kernels' incumbents approach the nondominated region; the number of evaluations per kernel that this takes differs between the test functions. Afterwards, the convergence gap converges linearly. In our setting, there are λ + 1 evaluations per kernel during the update of a kernel, from which the linear convergence rates per function evaluation can be derived, both for the *-sep functions (which have the same Pareto set and front) and for elli-two.
At the beginning of the runs on elli-sep, there is no point dominating the reference point, which means that the algorithm started far from the Pareto front. Looking at elli-two, we confirm that it has a different Pareto front than the three other problems.
The Uncrowded Hypervolume Improvement depends on the other kernels' incumbents and therefore changes in each iteration. Yet, the last three columns are similar to what one would observe when optimizing a single-objective convex-quadratic function with the corresponding Hessian matrix. After a large enough number of iterations, the probability that the incumbents and their offspring are in the Pareto set becomes close to 1. Then, if the set of incumbents is a subset of the Pareto set and x is nondominated, Eq. (6) becomes the hypervolume improvement of x with respect to the other incumbents. For our test functions, its Hessian on these smooth bi-objective problems is a linear combination of the single objectives' Hessian matrices, up to a rank-one matrix and its transpose (the gradients are colinear on the Pareto set of bi-objective convex quadratic problems (Toure et al., 2019)). That might give a glimpse of the behaviour seen in the last three columns of Figure 3.
5.3. Comparing COMO-CMA-ES with MO-CMA-ES, NSGA-II and SMS-EMOA
We compare four multiobjective algorithms: COMO-CMA-ES, MO-CMA-ES (Igel et al., 2007), NSGA-II (Deb et al., 2002) and SMS-EMOA (Beume et al., 2007), by testing them on the classes of bi-objective convex-quadratic problems above. We draw once and for all one rotation for elli-one and two different rotations for elli-two. The Simulated Binary Crossover operator (SBX) and the polynomial mutation are used for NSGA-II (run with the evoalgos package (Wessing, 2017)) and SMS-EMOA (run with the MATLAB version by Fabian Kretzschmar and Tobias Wagner (Wagner and Trautmann, 2010)); the crossover and mutation probabilities as well as the distribution indexes of both operators are set to commonly used values. We use the version of MO-CMA-ES from (Voß et al., 2010). The number of kernels of COMO-CMA-ES corresponds to the population size of the other algorithms, and the dimensions considered are 5 and 10. The global initial step-size of COMO-CMA-ES is fixed, with the initial lower and upper bounds (line 8 of Algorithm 2) set to the all-zeros and all-ones vectors. The initial population for the three other algorithms is sampled uniformly at random in [0, 1]ⁿ.
We run each multiobjective optimization several times and display the convergence gap (of the population or of the kernels' incumbent solutions) and the archive gap.
In Figure 4, the values of the convergence gap reached by COMO-CMA-ES and MO-CMA-ES are several orders of magnitude lower than for the two other algorithms. On the 5-dimensional bisphere, COMO-CMA-ES and MO-CMA-ES appear to show linear convergence, where the latter appears to be about 30% faster than the former. On the cigtab-sep function, COMO-CMA-ES is initially slow, but catches up later in the run. In all other cases, COMO-CMA-ES shows superior performance for the convergence gap. On the 10-dimensional cigtab-sep, COMO-CMA-ES shows a plateau between 2000 and 4000 evaluations per kernel. This kind of plateau cannot be observed for MO-CMA-ES, while the observed final convergence speed is better for COMO-CMA-ES than for MO-CMA-ES. The observed plateau is typical for the behavior of non-elitist multi-recombinative CMA-ES on the tablet function, because CSA barely reduces an initially large step-size before the tablet shape has been adapted, which is related to the neutral subspace defect found in (Krause et al., 2017). Elitism as in the MO-CMA-ES, on the other hand, also helps to decrease an initially too large step-size.
Although COMO-CMA-ES was not designed to perform well on the archive gap, it consistently shows the best results over all experiments. Only on the cigtab-sep does NSGA-II reach and slightly surpass the archive gap of COMO-CMA-ES late in the run. This suggests, as expected from the known dependency between optimal step-size and population size (Hansen et al., 2015), that COMO-CMA-ES adds valuable diversity while at the same time approaching the optimal p-distribution on the Pareto front.
6. Conclusions
We have proposed (i) the Sofomore framework to define multiobjective optimizers from single-objective ones, (ii) a fitness for dominated solutions defined as the distance to the empirical Pareto front (yielding the Uncrowded Hypervolume Improvement, UHVI), and (iii) the non-elitist “comma” CMA-ES to instantiate the framework (COMO-CMA-ES). We observe that COMO-CMA-ES converges linearly towards the optimal p-distribution of the hypervolume indicator on several bi-objective convex quadratic problems. COMO-CMA-ES appears to be robust to independently rotating the Hessian matrices of convex-quadratic problems, even if such rotations transform the Pareto set from a line segment into a bent curve. In our limited experiments, COMO-CMA-ES generally performed better than MO-CMA-ES, SMS-EMOA and NSGA-II w.r.t. the convergence gap and the archive gap, while COMO-CMA-ES was solely designed to optimize the convergence gap. We conjecture that the advantage on the archive gap is due to (i) the large stationary variance obtained with non-elitist evolution strategies and (ii) the fitness assignment of dominated solutions, which favors the vacant (uncrowded) space between nondominated solutions and hence serves as an implicit crowding-distance penalty.
Acknowledgements
Part of this research has been conducted in the context of a research collaboration between Storengy and Inria. We particularly thank F. Huguet and A. Lange from Storengy for their strong support, practical ideas and expertise.
References
 Auger et al. (2009) Anne Auger, Johannes Bader, Dimo Brockhoff, and Eckart Zitzler. 2009. Theory of the hypervolume indicator: optimal distributions and the choice of the reference point. In Foundations of Genetic Algorithms (FOGA 2009). ACM, Orlando, Florida, USA, 87–102.
 Auger et al. (2012) Anne Auger, Johannes Bader, Dimo Brockhoff, and Eckart Zitzler. 2012. Hypervolume-based multiobjective optimization: Theoretical foundations and practical implications. Theoretical Computer Science 425 (2012), 75–103.
 Berghammer et al. (2010) R. Berghammer, T. Friedrich, and F. Neumann. 2010. Set-based Multiobjective Optimization, Indicators, and Deteriorative Cycles. In Genetic and Evolutionary Computation Conference (GECCO 2010). ACM, Portland, Oregon, 495–502. https://doi.org/10.1145/1830483.1830574
 Beume et al. (2007) N. Beume, B. Naujoks, and M. Emmerich. 2007. SMS-EMOA: Multiobjective Selection Based on Dominated Hypervolume. European Journal of Operational Research 181, 3 (2007), 1653–1669.
 Bringmann and Friedrich (2011) Karl Bringmann and Tobias Friedrich. 2011. Convergence of hypervolume-based archiving algorithms I: Effectiveness. In Proceedings of the 13th Annual Conference on Genetic and Evolutionary Computation. ACM, Dublin, Ireland, 745–752.
 Bubeck and Cesa-Bianchi (2012) Sébastien Bubeck and Nicolò Cesa-Bianchi. 2012. Regret analysis of stochastic and nonstochastic multi-armed bandit problems. Foundations and Trends® in Machine Learning 5, 1 (2012), 1–122.
 Collette et al. (2010) Yann Collette, Nikolaus Hansen, Gilles Pujol, Daniel Salazar Aponte, and Rodolphe Le Riche. 2010. On Object-Oriented Programming of Optimizers - Examples in Scilab. In Multidisciplinary Design Optimization in Computational Mechanics, Rajan Filomeno Coelho and Piotr Breitkopf (Eds.). Wiley, New Jersey, 499–538. https://hal.inria.fr/inria-00476172
 Deb et al. (2002) K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan. 2002. A Fast and Elitist Multiobjective Genetic Algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation 6, 2 (2002), 182–197.
 Emmerich et al. (2005) Michael Emmerich, Nicola Beume, and Boris Naujoks. 2005. An EMO algorithm using the hypervolume measure as selection criterion. In International Conference on Evolutionary Multi-Criterion Optimization. Springer, Guanajuato, Mexico, 62–76.
 Emmerich and Klinkenberg (2008) Michael Emmerich and Jan-Willem Klinkenberg. 2008. The computation of the expected improvement in dominated hypervolume of Pareto front approximations. Technical Report 4-2008. Leiden Institute of Advanced Computer Science (LIACS).
 Goldberg (1989) D. E. Goldberg. 1989. Genetic Algorithms in Search, Optimization, and Machine Learning. AddisonWesley, Reading, Massachusetts.
 Hansen and Jaszkiewicz (1998) M. P. Hansen and A. Jaszkiewicz. 1998. Evaluating The Quality of Approximations of the Non-Dominated Set. Technical Report IMM-REP-1998-7. Institute of Mathematical Modeling, Technical University of Denmark.
 Hansen et al. (2019) Nikolaus Hansen, Youhei Akimoto, and Petr Baudis. 2019. CMA-ES/pycma on Github. Zenodo, DOI:10.5281/zenodo.2559634. (Feb. 2019). https://doi.org/10.5281/zenodo.2559634
 Hansen et al. (2015) Nikolaus Hansen, Dirk V. Arnold, and Anne Auger. 2015. Evolution strategies. In Springer Handbook of Computational Intelligence. Springer, Berlin, 871–898.
 Hansen and Ostermeier (2001) N. Hansen and A. Ostermeier. 2001. Completely Derandomized Self-Adaptation in Evolution Strategies. Evolutionary Computation 9, 2 (2001), 159–195.
 Hernandez et al. (2018) V. A. S. Hernandez, O. Schütze, H. Wang, A. Deutz, and M. Emmerich. 2018. The Set-Based Hypervolume Newton Method for Bi-Objective Optimization. IEEE Transactions on Cybernetics (2018). In print.
 Igel et al. (2007) C. Igel, N. Hansen, and S. Roth. 2007. Covariance matrix adaptation for multiobjective optimization. Evolutionary Computation 15, 1 (2007), 1–28.
 Keane (2006) Andy J. Keane. 2006. Statistical improvement criteria for use in multiobjective design optimization. AIAA journal 44, 4 (2006), 879–891.
 Knowles et al. (2006) J. Knowles, L. Thiele, and E. Zitzler. 2006. A Tutorial on the Performance Assessment of Stochastic Multiobjective Optimizers. TIK Report 214. Computer Engineering and Networks Laboratory (TIK), ETH Zurich.
 Krause et al. (2017) Oswin Krause, Tobias Glasmachers, and Christian Igel. 2017. Qualitative and quantitative assessment of step size adaptation rules. In Proceedings of the 14th ACM/SIGEVO Conference on Foundations of Genetic Algorithms. ACM, Copenhagen, Denmark, 139–148.
 Miettinen (1999) K. Miettinen. 1999. Nonlinear Multiobjective Optimization. Kluwer, Boston, MA, USA.
 Ponweiser et al. (2008) Wolfgang Ponweiser, Tobias Wagner, Dirk Biermann, and Markus Vincze. 2008. Multiobjective Optimization on a Limited Budget of Evaluations Using Model-Assisted S-Metric Selection. In Parallel Problem Solving from Nature (PPSN 2008). Springer, Dortmund, Germany, 784–794.
 Toure et al. (2019) Cheikh Toure, Anne Auger, Dimo Brockhoff, and Nikolaus Hansen. 2019. On Bi-Objective Convex-Quadratic Problems. In International Conference on Evolutionary Multi-Criterion Optimization. Springer, East Lansing, Michigan, USA, 3–14.
 Voß et al. (2010) T. Voß, N. Hansen, and C. Igel. 2010. Improved Step Size Adaptation for the MO-CMA-ES. In Genetic and Evolutionary Computation Conference (GECCO 2010), J. Branke et al. (Eds.). ACM, Portland, OR, USA, 487–494.
 Wagner et al. (2010) Tobias Wagner, Michael Emmerich, André Deutz, and Wolfgang Ponweiser. 2010. On expected-improvement criteria for model-based multiobjective optimization. In International Conference on Parallel Problem Solving from Nature. Springer, Krakow, Poland, 718–727.
 Wagner and Trautmann (2010) Tobias Wagner and Heike Trautmann. 2010. Online convergence detection for evolutionary multiobjective algorithms revisited. In IEEE Congress on Evolutionary Computation. IEEE, Barcelona, Spain, 1–8.
 Wessing (2017) Simon Wessing. 2017. evoalgos: Modular evolutionary algorithms. Python package version 1. (2017). https://pypi.python.org/pypi/evoalgos [Online; accessed 31January2019].
 Yang et al. (2019) Kaifeng Yang, Michael Emmerich, André Deutz, and Thomas Bäck. 2019. Multi-Objective Bayesian Global Optimization using expected hypervolume improvement gradient. Swarm and Evolutionary Computation 44 (2019), 945–956.
 Zhang and Li (2007) Q. Zhang and H. Li. 2007. MOEA/D: A Multiobjective Evolutionary Algorithm Based on Decomposition. IEEE Transactions on Evolutionary Computation 11, 6 (2007), 712–731. https://doi.org/10.1109/TEVC.2007.892759
 Zitzler and Künzli (2004) Eckart Zitzler and Simon Künzli. 2004. Indicator-based selection in multiobjective search. In International Conference on Parallel Problem Solving from Nature. Springer, Birmingham, UK, 832–842.
 Zitzler and Thiele (1998a) E. Zitzler and L. Thiele. 1998a. Multiobjective Optimization Using Evolutionary Algorithms - A Comparative Case Study. In Conference on Parallel Problem Solving from Nature (PPSN V) (LNCS), Vol. 1498. Springer, Amsterdam, The Netherlands, 292–301.
 Zitzler and Thiele (1998b) Eckart Zitzler and Lothar Thiele. 1998b. Multiobjective optimization using evolutionary algorithms - A comparative case study. In International Conference on Parallel Problem Solving from Nature. Springer, Amsterdam, The Netherlands, 292–301.