ND-Tree-based update: a Fast Algorithm for the Dynamic Non-dominance Problem
Abstract
In this paper we propose a new method called ND-Tree-based update (or shortly ND-Tree) for the dynamic non-dominance problem, i.e. the problem of online update of a Pareto archive composed of mutually non-dominated points. It uses a new ND-Tree data structure in which each node represents a subset of points contained in a hyperrectangle defined by its local approximate ideal and nadir points. By building subsets containing points located close to each other in the objective space and using basic properties of the local ideal and nadir points, we can efficiently avoid searching many branches in the tree. ND-Tree may be used in multiobjective evolutionary algorithms and other multiobjective metaheuristics to update an archive of potentially non-dominated points. We prove that the proposed algorithm has sublinear time complexity under mild assumptions. We experimentally compare ND-Tree to the simple list, Quadtree, and M-Front methods using artificial and realistic benchmarks with up to 10 objectives and show that with this new method a substantial reduction of the number of point comparisons and of the computational time can be obtained. Furthermore, we apply the method to the non-dominated sorting problem, showing that it is highly competitive with some recently proposed algorithms dedicated to this problem.
I Introduction
In this paper we consider the dynamic non-dominance problem [1], i.e. the problem of online update of a Pareto archive with a new candidate point. The Pareto archive is composed of mutually non-dominated points and this property must remain fulfilled following the addition of the new point.
The dynamic non-dominance problem typically arises in multiobjective evolutionary algorithms (MOEAs) and, more generally, in other multiobjective metaheuristics (MOMHs), whose goal is to generate a good approximation of the Pareto front. Many MOEAs and other MOMHs use an external archive of potentially non-dominated points, i.e. a Pareto archive containing points not dominated by any other points generated so far, see e.g. [2, 3, 4, 5, 6, 7, 8, 9]. We consider here MOEAs that iteratively generate new candidate points and use them immediately to update a Pareto archive. Updating a Pareto archive A with a new point y means that:

y is added to the Pareto archive if it is non-dominated w.r.t. every point in the Pareto archive,

all points dominated by y are removed from the Pareto archive.
The time needed to update a Pareto archive, in general, increases with a growing number of objectives and a growing number of points. In some cases it may become a crucial part of the total running time of a MOEA. The simplest data structure for storing a Pareto archive is a plain list. When the archive is updated with a new point y, y is compared to all points in the Pareto archive until either all points are checked or a point dominating y is found. In order to speed up the process of updating a Pareto archive some authors proposed the use of specialized data structures and algorithms, e.g. Quadtree [10]. However, the results of computational experiments reported in the literature are not conclusive and in some cases such data structures may in fact increase the update time compared to the simple list.
A frequently used approach allowing a reduction of the time needed to update a Pareto archive is the use of bounded archives [11], where the number of points is limited and some potentially non-dominated points are discarded. Please note, however, that such an approach always reduces the quality of the archive. In particular, one of the discarded points could be the one that would be selected by the decision maker if the full archive was known. Bounded archives may be especially disadvantageous in the case of many-objective problems, since with a growing number of dimensions it becomes more and more difficult to represent a large set with a smaller sample of points. The use of bounded archives may also lead to some technical difficulties in MOEAs (see [11]). In summary, if an unbounded archive can be efficiently managed and updated, it is advantageous to use this kind of archive.
In this paper, our contribution is fourfold: firstly, we propose a new method, called ND-Tree-based update, for the dynamic non-dominance problem. The method is based on a dynamic division of the objective space into hyperrectangles, which allows us to avoid many comparisons of objective function values. Secondly, we show that the new method has sublinear time complexity under mild assumptions. Thirdly, a thorough experimental study on different types of artificial and realistic sets shows that we can obtain substantial computational time reductions compared to state-of-the-art methods. Finally, we apply ND-Tree-based update to the non-dominated sorting problem, obtaining promising results in comparison to some recently proposed dedicated algorithms.
The remainder of the paper is organized as follows. Basic definitions related to multiobjective optimization are given in Section II. In Section III, we present the state of the art of the methods used for online updating of a Pareto archive. The main contribution of the paper, i.e. the ND-Tree-based update method, is described in Section IV. Computational experiments are reported and discussed in Section V. In Section VI, ND-Tree-based update is applied to the non-dominated sorting problem.
II Basic definitions
II-A Multiobjective optimization
We consider a general multiobjective optimization (MO) problem with a feasible set X of solutions and M objective functions f_1, ..., f_M to minimize. The image of the feasible set in the objective space is a set Y of points y = f(x) = (f_1(x), ..., f_M(x)), where x ∈ X.
In MO, points are usually compared according to the Pareto dominance relation:
Definition 1.
Pareto dominance relation: we say that a point y dominates a point z if, and only if, y_j ≤ z_j for each objective j ∈ {1, ..., M} and y_j < z_j for at least one objective j. We denote this relation by y ≻ z.
Definition 2.
Non-dominated point: a point y ∈ Y corresponding to a feasible solution is called non-dominated if there does not exist any other point y' ∈ Y such that y' ≻ y. The set of all non-dominated points is called the Pareto front.
Definition 3.
Coverage relation: we say that a point y covers a point z if y ≻ z or y = z. We denote this relation by y ⪰ z.
Please note that the coverage relation is sometimes referred to as weak dominance [12].
Definition 4.
Mutually non-dominated relation: we say that two points are mutually non-dominated, or non-dominated w.r.t. each other, if neither of the two points covers the other one.
Definition 5.
Pareto archive (A): a set of points such that any pair of points in the set are mutually non-dominated, i.e. for all y, z ∈ A with y ≠ z, y and z are mutually non-dominated.
In the context of MOEAs, the Pareto archive A contains the mutually non-dominated points generated so far (i.e. up to a given iteration of a MOEA) that approximate the Pareto front. In other words, A contains points that are potentially non-dominated at a given iteration of the MOEA.
Please note that in MOEAs not only points but also representations of solutions are preserved in the Pareto archive, but the above definition is sufficient for the purposes of this paper.
The new method, ND-Tree-based update, is based on the (approximate) local ideal and nadir points that we define below.
Definition 6.
The local ideal point of a subset S ⊆ Y, denoted z*(S), is the point in the objective space composed of the best coordinates of all points belonging to S, i.e. z*_j(S) = min_{y ∈ S} y_j, j = 1, ..., M. A point z such that z ⪰ z*(S) will be called an approximate local ideal point.
Naturally, the (approximate) local ideal point covers all points in S.
Definition 7.
The local nadir point of a subset S ⊆ Y, denoted znad(S), is the point in the objective space composed of the worst coordinates of all points belonging to S, i.e. znad_j(S) = max_{y ∈ S} y_j, j = 1, ..., M. A point z such that znad(S) ⪰ z will be called an approximate local nadir point.
Naturally, the (approximate) local nadir point is covered by all points in S.
II-B Dynamic non-dominance problem
The problem of updating a Pareto archive (also called the non-dominance problem) can be divided into two classes: the static non-dominance problem is to find the set of non-dominated points among a given set of points. The other class is the dynamic non-dominance problem [1] that typically occurs in MOEAs. We formally define this problem as follows. Consider a candidate point y and a Pareto archive A. The problem is to update A with y and consists in the following operations. If y is covered by at least one point in A, y is discarded and A remains unchanged. Otherwise, y is added to A. Moreover, if some points in A are dominated by y, all these points are removed from A, in order to keep only mutually non-dominated points (see Algorithm 1).
In this work we consider only the dynamic non-dominance problem. Note that in general static problems may be solved more efficiently than their dynamic counterparts, since they have access to richer information. Indeed, some efficient algorithms for the static non-dominance problem have been proposed, see [13, 14, 15, 16].
MOEAs and other MOMHs usually update the Pareto archive using the dynamic version of the non-dominance problem, i.e. the Pareto archive is updated with each newly generated candidate point. In some cases it could be possible to store all candidate points and then solve the static non-dominance problem. The latter approach has, however, some disadvantages:

MOEAs need to store not only points in the objective space but also full representations of solutions in the decision space. Thus, storing all candidate points with the corresponding solutions may be very memory-consuming.

Some MOEAs use the Pareto archive during the run of the algorithm, i.e. the Pareto archive is not just the final output of the algorithm. For example, in [17] one of the parents is selected from the Pareto archive. In [7] the success of adding a new point to the Pareto archive influences the probability of selecting weight vectors in further iterations. The same applies to other MOMHs as well. For example, the Pareto local search (PLS) method [18] works directly with the Pareto archive and searches the neighborhood of each solution from the archive. In such methods, computation of the Pareto archive cannot be postponed until the end of the algorithm.
Note that, as suggested in [19], the dynamic non-dominance problem may also be used to speed up the non-dominated sorting procedure used in many MOEAs. As the Pareto archive contains all non-dominated points generated so far, the first front is immediately known and the non-dominated sorting may be applied only to the subset of dominated points. Using this technique, Drozdík et al. showed that their new method called M-Front could obtain better performance than Deb's fast non-dominated sorting [20] and the Jensen-Fortin algorithm [21, 22], one of the fastest non-dominated sorting algorithms.
III State of the art
We present here a number of methods for the dynamic non-dominance problem proposed in the literature and used in the comparative experiment. This review is not supposed to be exhaustive. Other methods can be found in [23, 24] and reviews in [25, 26]. We describe the linear list, Quadtree and one recent method, M-Front [19].
III-A Linear List
III-A1 General case
In this structure, a new point y is compared to all points in the list until a covering point is found or all points are checked. The point is only added if it is non-dominated w.r.t. all points in the list, that is, in the worst case we need to browse the whole list before adding a point. The complexity in terms of the number of point comparisons is thus in O(N), with N the size of the list.
III-A2 Bi-objective case: sorted list
When only two objectives are considered, we can use the following specific property: if we sort the list according to one objective (let's say the first), the non-dominated list is also sorted according to the second objective. Therefore, roughly speaking, updating the list can be efficiently done in the following way. We first determine the potential position i of the new candidate point in the sorted list according to its value on the first objective, with a binary search. If the new point is not dominated by the preceding point in the list (if there is one), the new point is non-dominated and can be inserted at position i. If the new point has been added, we need to check if there are some dominated points: we browse the next points in the list, until a point is found that has a better evaluation according to the second objective. All the points found that have a worse evaluation according to the second objective have to be removed, since they are dominated by the new point.
The worst-case complexity is still in O(N), since it can happen that a new point has to be compared to all the other points (in the special case where we add a new point in the first position and all the points in the sorted list are dominated by this new point). But on average, experiments show that the behavior of this structure for handling bi-objective updating problems is much better than that of the simple list.
The algorithm of this method is given in Algorithm 2 (for the sake of clarity, we only present the case where the candidate point has a value on the first objective distinct from those of all the other points in the archive).
III-B Quadtree
The use of the Quadtree for storing potentially non-dominated points was proposed by Habenicht [27] and further developed by Sun and Steuer [28] and Mostaghim and Teich [10]. In a Quadtree, points are located in both internal nodes and leaves. Each node may have children corresponding to each possible combination of results of comparisons on each objective, where a point can either be better or not worse. In the case of mutually non-dominated points, 2^M − 2 children are possible, since the combinations corresponding to dominating or covered points are not used. The Quadtree allows fast checking of whether a new point is dominated or covered. A weak point of this data structure is that when an existing point is removed, its whole subtree has to be reinserted into the structure. Thus, removal of a dominated point is in general costly.
III-C M-Front
M-Front has been proposed relatively recently by Drozdík et al. [19]. The idea of M-Front is as follows. Assume that in addition to the new point y a reference point q, relatively close to y and belonging to the Pareto archive, is known. The authors define two sets of archive points, delimited on each objective by the objective values of y and q, and prove that if a point is dominated by y then it belongs to the first set, and if a point dominates y then it belongs to the second set. Thus, it is sufficient to compare the new point to these two sets only. To find all points with objective values in a certain interval, M-Front uses additional indexes, one for each objective. Each index sorts the Pareto archive according to one objective.
To find a reference point close to y, M-Front uses the k-d tree data structure. The k-d tree is a binary tree in which each intermediate node divides the space into two parts based on the value of one objective. While going down the tree the algorithm cycles over the particular objectives, selecting one objective for each level. Drozdík et al. [19] suggest storing references to points in leaf nodes only, while intermediate nodes keep only split values.
IV ND-Tree-based update
IV-A Presentation
In this section we present the main contribution of the paper. The new method for updating a Pareto archive is based on the idea of a recursive division of the archive into subsets contained in different hyperrectangles. This division allows us to considerably reduce the number of comparisons to be made.
More precisely, consider a subset S composed of mutually non-dominated points and a new candidate point y. Assume that some approximate local ideal point z*(S) and approximate local nadir point znad(S) of S are known. In other words, all points in S are contained in the axis-parallel hyperrectangle defined by z*(S) and znad(S).
We can define the following simple properties that allow us to compare a new point y to the whole set S:

Property 1: If y is covered by znad(S), then y is covered by each point in S and thus can be rejected. This property is a straightforward consequence of the transitivity of the coverage relation.

Property 2: If y covers z*(S), then each point in S is covered by y. This property is also a straightforward consequence of the transitivity of the coverage relation.

Property 3: If y is non-dominated w.r.t. both z*(S) and znad(S), then y is non-dominated w.r.t. each point in S.
Proof.
If y is non-dominated w.r.t. znad(S) then there is at least one objective on which y is worse than znad(S) and thus worse than each point in S. If y is non-dominated w.r.t. z*(S) then there is at least one objective on which y is better than z*(S) and thus better than each point in S. So, there is at least one objective on which y is better and at least one objective on which y is worse than each point in S.∎
If none of the above properties holds, i.e. y is neither covered by znad(S), does not cover z*(S), nor is non-dominated w.r.t. both z*(S) and znad(S), then all situations are possible, i.e. y may either be non-dominated w.r.t. all points in S, covered by some points in S, or dominate some points in S. This can be illustrated by showing examples of each of the situations. Consider for example a set S with z*(S) = (0, 0, 0) and znad(S) = (2, 2, 2), e.g. S = {(1, 1, 1), (0, 2, 2), (2, 0, 2), (2, 2, 0)}. A new point (0, 1, 1) dominates a point in S, a new point (1, 1, 2) is dominated (thus covered) by a point in S, and points (0, 0, 3) and (3, 0, 0) are non-dominated w.r.t. all points in S.
The properties are graphically illustrated for the bi-objective case in Figure 1. As can be seen in this figure, in the bi-objective case, if y is covered by z*(S) and is non-dominated w.r.t. znad(S), then y is dominated by at least one point in S. Note, however, that this does not hold in the case of three and more objectives, as shown in the above example: the point (0, 0, 3) is covered by z*(S), non-dominated w.r.t. znad(S), and is not dominated by any point in S.
In fact, it is possible to distinguish more specific situations when none of the three properties holds, e.g. a situation when a new point may be covered but cannot dominate any point, but since we do not distinguish them in the proposed algorithm, we do not define them formally.
The above properties allow us, in some cases, to quickly compare a new candidate point to all points in set S without the need for further comparisons to individual points belonging to S. Such further comparisons are necessary only if none of the three properties holds. Intuitively, the closer the approximate local ideal and nadir points are to each other, the more likely it is that further comparisons can be avoided. To obtain close approximate local ideal and nadir points we should:

Split the whole set of non-dominated points into subsets of points located close to each other in the objective space.

Have good approximations of the exact local ideal and nadir points. On the other hand, calculation of the exact points may be computationally demanding and a reasonable approximation may ensure the best overall efficiency.
Based on these properties, we can now define the NDTree data structure.
Definition 8.
The ND-Tree data structure is a tree with the following properties:

With each node n is associated a set of points S(n).

Each leaf node n contains a list of points L(n) and S(n) = L(n).

For each internal node n, S(n) is the union of the disjoint sets associated with all children of n.

Each node n stores an approximate ideal point z*(n) and an approximate nadir point znad(n).

If n' is a child of n, then z*(n) ⪰ z*(n') and znad(n') ⪰ znad(n).
The algorithm for updating a Pareto archive with ND-Tree is given in Algorithm 3. The idea of the algorithm is as follows. We start by checking if the new point y is covered by or non-dominated w.r.t. all points in the archive by going through the nodes of the ND-Tree and skipping the children (and thus their subtrees) for which Property 3 holds. This procedure is presented in Algorithm 4.
The new point y is first compared to the approximate ideal point z*(n) and nadir point znad(n) of the current node n. If y is covered by znad(n), it is immediately rejected (Property 1). If z*(n) is covered by y, the node and its whole subtree are deleted (Property 2). Otherwise, if y is covered by z*(n) or y covers znad(n), the node needs to be analyzed. If n is an internal node, we call the algorithm recursively for each child. If n is a leaf node, y may be dominated by or dominate some points of L(n) and it is necessary to browse the whole list of the node. If a point dominating y is found, y is rejected, and if a point dominated by y is found, that point is deleted from L(n).
If, after checking the ND-Tree, the new point y was found to be non-dominated, it is inserted by adding it to a close leaf (Algorithm 5). To find a proper leaf, we start from the root and always select the child with the closest distance to y. As a distance measure we use the Euclidean distance to the middle point, i.e. the point lying in the middle of the line segment connecting the approximate ideal and nadir points.
Once we have reached a leaf node n, we add the point to the list of the node and possibly update the ideal and nadir points of the node (Algorithm 7). However, if the size of L(n) becomes larger than the maximum allowed size of a leaf set, we need to split the node into a predefined number of children. To create children that contain points that are more similar to each other than to those in other children, we use a simple clustering heuristic based on the Euclidean distance (see Algorithm 6).
The approximate local ideal and nadir points are updated only when a point is added. We do not update them when points are removed, since that is a more complex operation. This is why we deal with approximate (not exact) local ideal and nadir points.
IV-B Comparison to existing methods
Like the other methods, ND-Tree-based update uses a tree structure to speed up the process of updating the Pareto archive. The tree and its use are, however, quite different from the Quadtree or the k-d tree used in M-Front. For example, both the Quadtree and the k-d tree partition the objective space based on comparisons on particular objectives, while in ND-Tree the space is partitioned based on the distances between points. Both the Quadtree and the k-d tree have strictly defined degrees: in the k-d tree it is always two (binary tree), while in the Quadtree it depends on the number of objectives. In ND-Tree the degree is a parameter. In the Quadtree the points are kept in both internal nodes and leaves, while ND-Tree keeps points in leaves only. In M-Front the k-d tree is used to find an approximate nearest neighbor and the Pareto archive is updated using other structures (sorted lists for each objective). In our case, ND-Tree is the only data structure used by the algorithm.
IV-C Computational complexity
IV-C1 Worst case
The worst case for the UpdateNode and Insert algorithms is when we need to compare the new point to each intermediate node of the tree. For example, consider the following particular case: two objectives, an ND-Tree with maximum leaf size equal to 2 and 2 children per node, constructed by processing a specially arranged sequence of mutually non-dominated points. The sequence is constructed in such a way that the first two points are put in one leftmost leaf and the third point creates a separate leaf. Each further point is closer to the child on the left side but finally, after splitting the leftmost node, the new point creates a new leaf. The ND-Tree obtained is shown in Figure 2.
Consider now that the archive is updated with the point (0.5, 0.5). This point will need to be compared to all intermediate nodes and then to both points in the leftmost leaf. In this case:

T(N) = T(N − 1) + 4   (1)

where N is the number of points in the archive and T(N) is the number of point comparisons needed to update an archive of size N. The term 4 appears because we check two children and for each child the approximate ideal and nadir points are compared. Solving the recurrence we get:

T(N) ∈ Θ(N)   (2)

Thus, the algorithm has linear worst-case time complexity.
We are not aware of any result showing that the worst-case time complexity of any algorithm for the dynamic non-dominance problem may be lower than linear in terms of point comparisons. So, our method does not improve the worst-case complexity, but according to our experiments it performs significantly better in practical cases.
IV-C2 Best case
Assume first that the candidate point is not covered by any point in the archive. In the optimistic case, at each intermediate node the points are equally split into the predefined number of children and there is only one child that has to be further processed (i.e. there is only one child for which none of the three properties holds). In fact we could consider an even more optimistic distribution of points, where the only node that has to be processed contains just one point, but the equal split is a much more realistic assumption. In this case:

T(N) = T(N/d) + 2d   (3)

where d is the number of children, and thus T(N) ∈ O(log N). If the candidate point is covered by a point in the archive, the UpdateNode algorithm may stop even earlier and there will be no need to run the Insert algorithm.
IV-C3 Average case
Analysis, and even definition, of the average case for such complex algorithms is quite difficult. The simplest case to analyze is when each intermediate node has two children, which allows us to follow the analysis of well-known algorithms like binary search or Quicksort. If a node has N points, then one of the two children may have k points and the other child the remaining number of points. Assuming that each split has equal probability and only one child is selected:

T(N) = c + (1/N) ∑_{k=0}^{N−1} T(k)   (4)

Multiplying both sides by N:

N T(N) = cN + ∑_{k=0}^{N−1} T(k)   (5)

Assuming that N > 1, the same holds for N − 1:

(N − 1) T(N − 1) = c(N − 1) + ∑_{k=0}^{N−2} T(k)   (6)

Subtracting equations 5 and 6:

N T(N) − (N − 1) T(N − 1) = c + T(N − 1)   (7)

Simplifying:

T(N) = T(N − 1) + c/N   (8)

Solving this recurrence:

T(N) = T(1) + c (H_N − 1)   (9)

where H_N is the Nth harmonic number. Using well-known properties of harmonic numbers we get T(N) ∈ Θ(log N).
We can expect, however, that in realistic cases more than one child will need to be further processed in the UpdateNode algorithm, because the candidate point may cover the approximate nadir points, or may be covered by the approximate ideal points, of more than one child. Assume that the probability of selecting both children is equal to p. Then:

T(N) = c + ((1 + p)/N) ∑_{k=0}^{N−1} T(k)   (10)

Following the above reasoning we get:

T(N) = (1 + p/N) T(N − 1) + c/N   (11)

Solving this recurrence, we obtain:

T(N) ∈ Θ(N^p) for p > 0   (12)

Since

N^p ∈ o(N) for p < 1   (13)

We have:

T(N) ∈ o(N)   (14)

and the algorithm remains sublinear for any p < 1. In the worst case, both children always need to be selected, so p = 1 and T(N) ∈ Θ(N), which confirms the analysis presented above.
This analysis may give only approximate insight into the behaviour of the algorithm, since in the UpdateNode algorithm p will not be constant at each level. We may rather expect that while going down the tree from the root towards the leaves, the probability that two children will need to be processed will decrease, because the approximate nadir and ideal points of the children will lie closer. Anyway, this analysis shows that the performance of the UpdateNode algorithm may be improved by decreasing the probability that a child has to be processed. This is why we try to locate in one node points lying close to each other in the Insert and Split algorithms.
In the Insert algorithm exactly one child is processed at each level, so the time complexity remains logarithmic in the average case.
The main part of the Split algorithm has constant time complexity, since it depends only on the maximum size of a leaf set, which is a constant parameter of the algorithm.
The UpdateIdealNadir algorithm goes up the tree starting from a leaf, which is equivalent to going down the tree and selecting just one child. So, its analysis is exactly the same as that of the Insert algorithm.
We also need to consider the complexity of the operation of removing a node and its subtree. In the worst case, the removed node is the root, and thus all points need to be removed. Such a situation is very unlikely, since it happens only when the new point dominates all points in the current archive. Typically, the new point will dominate only a few points.
V Computational experiments
We will show results obtained with ND-Tree and the other methods in two different cases:

Results for artificially generated sets, which allow us to easily control the number of points in the sets and the quality of the points.

Results for sets generated by a MOEA, namely MOEA/D [2] for the multiobjective knapsack problem.
We compare the simple list, sorted list (bi-objective case), Quadtree, M-Front and ND-Tree for these sets according to CPU time [ms]. To avoid the influence of implementation details, all methods were implemented from scratch in C++ in as homogeneous a way as possible, i.e. when possible the same code was used to perform the same operations, like Pareto dominance checks.
For the implementation of Quadtree, we use the Quadtree2 version as described by Mostaghim and Teich [10].
For M-Front, we follow as much as possible the description found in the paper of Drozdík et al. [19]. However, the authors do not give the precise algorithm of the k-d tree used in their method. In our implementation of the k-d tree, when a leaf is reached a new division is made on the current level's objective; the split value is the average of the values of the new point and of the point in the leaf. Also, as in Drozdík et al. [19], the approximate nearest neighbor is found exactly as in the standard exact nearest neighbor search, but only four evaluations of the distance are allowed. Note that at https://sites.google.com/site/ndtreebasedupdate/ we present results of an additional experiment showing that for a higher number of objectives the details of the implementation of the k-d tree do not have any substantial influence on the running time.
We also noticed that a number of elements of M-Front can be improved to further reduce the running time. The improvement is, in our opinion, significant enough to call the new method M-Front-II. In particular, in some cases M-Front-II was several times faster than the original M-Front in our computational experiments. The modifications we introduced are as follows:

In the original M-Front the two sets are built explicitly and only then are the points contained in these sets compared to the new point y. In M-Front-II we build them only implicitly, i.e. we immediately compare to y the points that would be added to the sets.

We analyze the objectives in such an order that we start with the set of points that may dominate the new point y. Since many new points are dominated, this allows the search to stop immediately when a point dominating y is found. Note that a similar mechanism is in fact used in the original M-Front, but only after the two sets are explicitly built.

The last modification is more technical. M-Front uses linked lists (std::list in C++) to store the indexes and a hash table (std::unordered_map in C++) to link points with their positions in these lists. We observed, however, that even basic operations like iterating over the list are much slower with linked lists than with static or dynamic arrays (like std::vector in C++). Thus we use dynamic arrays for the indexes. In this case, however, there is no constant iterator that could be used to link the points with their positions in these indexes. So, we use a binary search to locate the position of a point in the sorted index whenever it is necessary. The overhead of the binary search is smaller than the savings due to the use of the faster indexes.
For ND-Tree we use 20 as the maximum size of a leaf set; the number of children is likewise a fixed parameter. These values of the parameters were found to perform well in many cases. We analyze the influence of these parameters later in this section.
The code, as well as the test instances and data sets, are available at https://sites.google.com/site/ndtreebasedupdate/. All of the results have been obtained on an Intel Core i7-5500U CPU at 2.4 GHz.
V-A Artificial sets
V-A1 Basic, globally convex sets
The artificial sets are composed of points with M objectives to minimize. The sets are created as follows. We randomly generate points subject to a constraint that places all the non-dominated points on or inside a hypersphere; the hypersphere is a model of the true Pareto front. In order to control the quality of the generated points, we also add a quality constraint parameterized by ε, which controls the maximum distance from the hypersphere: with a small ε, only high-quality points are generated. We believe that this is a good model for points generated by real MOEAs, since a good MOEA should generate points lying close to the true Pareto front. We have generated data sets composed of 100 000 and 200 000 points for M = 2 to 10. In the main experiment we use the data sets with 100 000 points, because for the larger sets the running times of some methods became very long. Also, because of the very high running times for sets with many objectives, in the main experiment we used sets with up to 6 objectives. For each value of M, five different quality levels q1 (best) to q5 (worst), corresponding to increasing values of ε, are considered. The fraction of non-dominated points grows both with increasing quality and number of objectives, and in extreme cases all points may be non-dominated (see Table I).
TABLE I: Number of nondominated points (out of 100 000) in the artificial sets

Objectives  Quality  convex  nonconvex  clustered
2           q1          519        379        449
2           q2          713        613        552
2           q3         1046       1037        785
2           q4         1400       1454       1059
2           q5         2735       2748       1781
3           q1         4588       2587       3729
3           q2         6894       5344       5514
3           q3        12230      11497       9720
3           q4        19095      18648      15502
3           q5        53813      53255      44173
4           q1        14360       6853      11963
4           q2        21680      16420      18120
4           q3        39952      37709      35460
4           q4        64664      63140      57725
4           q5        98283      98243      97137
5           q1        28944      13437      23966
5           q2        42246      34357      38028
5           q3        77477      75796      71063
5           q4        96002      95867      93842
5           q5        99975      99975      98521
6           q1        45879      22956      40483
6           q2        65195      57966      61096
6           q3        96687      96480      94978
6           q4        99788      99786      99652
6           q5       100000     100000      99999
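This kind of construction can be sketched as rejection sampling in a shell around a hypersphere. The bounds below (points of [0, 1]^m kept when their Euclidean norm lies within q of 1) are illustrative assumptions, not the exact constraints used in the experiments:

```python
import math
import random

def sample_shell(m, n, q, seed=0):
    """Illustrative generator: keep uniform points of [0, 1]^m whose
    Euclidean norm lies in [1, 1 + q], i.e. within distance q of a
    unit hypersphere centered at the origin (an assumed placement).
    A smaller q yields points closer to the sphere modeling the front."""
    rng = random.Random(seed)
    points = []
    while len(points) < n:
        y = tuple(rng.random() for _ in range(m))
        r = math.sqrt(sum(v * v for v in y))
        if 1.0 <= r <= 1.0 + q:
            points.append(y)
    return points
```

Tightening q reduces the shell thickness and hence raises the fraction of mutually nondominated points among the samples.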
V-A2 Globally nonconvex sets
In order to test whether the global convexity of the above sets influences the behavior of the tested methods, we have also generated sets whose Pareto fronts are globally nonconvex. They were obtained by simply changing the sign of each objective in the basic sets.
V-A3 Clustered sets
In these sets, the points are located in small clusters. We have generated sets composed of 100 clusters of 1000 points each (the sets are thus composed of 100 000 points). The sets have been obtained as follows: starting from the 200 000 points of the basic convex sets, we select one random point and then the 999 points closest to it (using the Euclidean distance). We repeat this operation 100 times to obtain the 100 clusters of each set.
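The cluster construction can be sketched as below, using small illustrative sizes. It assumes that selected points are removed from the pool, so that clusters are disjoint (the text above does not state this explicitly):

```python
import math
import random

def make_clustered(points, n_clusters, cluster_size, seed=0):
    """Pick a random point, gather its (cluster_size - 1) nearest
    neighbours by Euclidean distance, and repeat. Selected points are
    removed from the pool (an assumption), keeping clusters disjoint."""
    rng = random.Random(seed)
    pool = list(points)
    clusters = []
    for _ in range(n_clusters):
        center = pool.pop(rng.randrange(len(pool)))
        pool.sort(key=lambda p: math.dist(p, center))
        clusters.append([center] + pool[:cluster_size - 1])
        pool = pool[cluster_size - 1:]
    return clusters
```

With the paper's sizes (100 clusters of 1000 points from a 200 000-point pool), a spatial index would speed up the nearest-neighbour step; the full sort here is only for clarity.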
The shapes of exemplary biobjective globally convex, globally nonconvex and clustered data sets can be seen at https://sites.google.com/site/ndtreebasedupdate/.
Each method was run 10 times for each set, with the points processed in a different random order for each run. The average running times for the basic sets are presented in Figures 3 to 7. We report averages since the individual values were generally well distributed around the average with small deviations. Please note that, because of the large differences, the running times are presented on a logarithmic scale.
In addition, in Figure 8 we illustrate the evolution of the running times with the number of objectives for the data sets of intermediate quality q3. In this case we use sets with up to 10 objectives. With 7 and more objectives even the sets of intermediate quality q3 contain almost only nondominated points (see Table I). This is why the running times of the simple list are practically constant for 7 and more objectives: the simple list boils down in this case to comparing each new point to all points in the list. The running times of all other methods, including NDTree, increase with a growing number of objectives, but NDTree remains 5.5 times faster than the second-best method (Quadtree) for sets with 10 objectives.
TABLE II: Number of point comparisons per millisecond

Method    Comparisons per ms
List                  26 752
MFront                 3 476
MFrontII               8 370
NDTree                17 733
Quadtree               9 040
Furthermore, we measured the number of comparisons of points with the dominance relation for the data sets of intermediate quality q3. For NDTree-based update this count also includes comparisons to the approximate ideal and nadir points, and for Quadtree, comparisons to subnodes. The results are presented in Figure 9. Please note that the results for the two versions of MFront overlap in the figure.
The differences in running times cannot be fully explained by the number of point comparisons, because the methods differ significantly in the number of point comparisons performed per millisecond (see Table II). This ratio is highest for the simple list, because this method performs very few additional operations. It is also relatively high for NDTree. The other methods perform many operations besides point comparisons that strongly influence their running times. This is particularly clear when comparing MFront and MFrontII: these methods perform the same number of point comparisons, but MFrontII is several times faster than MFront, because the latter builds its auxiliary sets explicitly and uses slower linked lists. Overall, NDTree performs the fewest point comparisons. These results indicate that NDTree-based update substantially reduces the number of comparisons with respect to the simple list: for example, when all points in a data set are nondominated, the simple list has to compare each new point to every point already in the archive, while NDTree-based update requires only a small fraction of these comparisons on average.
The results obtained for the nonconvex and clustered sets were very similar to the results for the basic sets. Thus, in Figures 10 to 13 we show only exemplary results for the three- and six-objective cases. These results indicate that the running times of the tested methods are not substantially affected by the global shape of the Pareto front.
V-A4 Discussion of the results for artificial sets
The main observations from this experiment are:

NDTree performs best in terms of CPU time for all test sets with three and more objectives. In some cases the differences from the other methods reach two orders of magnitude, and in some cases the difference from the second-best method is one order of magnitude. NDTree also behaves very predictably: its running time grows slowly with an increasing number of objectives and an increasing fraction of nondominated points.

For biobjective instances the sorted list is the best choice. In this case, MFront and MFrontII also behave very well, since they become very similar to the sorted list.

The simple list obtains its best performance for data sets with many dominated points, such as those of lowest quality. In this case the new point is often dominated by many points, so the search is quickly stopped after finding a dominating point.

Quadtree performs very badly for data sets with many dominated points; e.g., for biobjective instances it is the worst method in all cases. In this case many points added to Quadtree are later removed, and the removal of a point from Quadtree is a costly operation: as discussed above, when an existing point is removed, its whole subtree has to be reinserted into the structure. On the other hand, it is the second-best method for most data sets with six and more objectives.

MFrontII is much faster than MFront on data sets with a larger fraction of dominated points. In this case MFrontII may find a dominating point quickly, without explicitly building the whole auxiliary sets.

The performance of both MFront and MFrontII deteriorates with an increasing number of objectives; with six and more objectives MFront is the slowest method in all cases. Intuitively, this can be explained by the fact that MFront (in both versions) uses each objective individually to reduce the search space. In the case of two objectives the values of one objective carry a lot of information, since the order on one objective also induces the order on the other. The more objectives, the less information we get from an order on one of them. Furthermore, the auxiliary sets are in fact unions of corresponding sets for the particular objectives, which also makes them grow. Finally, in the many-objective case, a reference point that is close in Euclidean distance need not be very close on each objective, since the distance will rather reflect a balance of moderate differences over many coordinates.
In an additional experiment we analyzed the evolution of the running times of all methods with an increasing number of points. We used one intermediate globally convex data set of quality q3, with 200 000 points, and 10 runs for each method (see Figure 14). The CPU time is the cumulative time of processing a given number of points. In addition, since the running time is much smaller for NDTree, its results are presented separately in Figure 15. We see that NDTree is the fastest method for any number of points and that its cumulative running time grows almost linearly with the number of points; in other words, the time of processing a single point is almost constant. Please note that, unlike in the other figures, a linear scale is used in these two figures in order to make them more informative.
NDTree has two parameters, the maximum size (number of points) of a leaf and the number of children, so the question arises how sensitive it is to their settings. To study this we again use the intermediate data set of quality q3 and run NDTree with various parameter values, see Figure 16. Please note that the number of children cannot be larger than the maximum size of a leaf, since after exceeding the maximum size the leaf is split into the given number of children. We see that NDTree is not very sensitive to the two parameters: the CPU time remains between about one and three seconds regardless of their values. The best results are obtained with 20 for the maximum size of a leaf and 6 for the number of children.
In our opinion these results confirm that NDTree performs well for a wide range of parameter values. In fact, it would remain the best method for this data set under any of the parameter settings tested.
V-B Sets generated by MOEA/D
In order to test whether the observations made for the artificial sets hold for sets generated by real evolutionary algorithms, we use sets of points generated by the well-known MOEA/D algorithm [2] for multiobjective multidimensional knapsack problem instances with 2 to 6 objectives. We used the code available at http://dces.essex.ac.uk/staff/zhang/webofmoead.htm [2] and the 500-item instances with 2 to 4 objectives distributed with it. The profits and weights of these instances were randomly generated uniformly between 1 and 100. We generated the 5- and 6-objective instances ourselves by adding random profits and weights, also between 1 and 100. MOEA/D was run for at least 100 000 iterations, and the first 100 000 points generated by the algorithm were stored for the purpose of this experiment. The numbers of nondominated points are given in Table III.
Figure 17 presents the running times of each of the tested methods, as well as the running time of MOEA/D excluding the time needed to update the Pareto archive. These results confirm that the observations made for the artificial sets also hold in the case of real sets: NDTree is the fastest method for three and more objectives, Quadtree performs particularly badly in the biobjective case, and both versions of MFront deteriorate relatively with a growing number of objectives. Furthermore, these results show that the time of updating the Pareto archive may be higher than the remaining running time of MOEA/D. In particular, for the six-objective instance the running time of MFront is 5 times higher than the remaining running time of MOEA/D; the running times of the simple list, Quadtree, and MFrontII are comparable to it, and only the running time of NDTree is 10 times shorter. This confirms that the selection of an appropriate method for updating the Pareto archive may have a crucial influence on the running time of a MOEA.
TABLE III: Number of nondominated points in the sets generated by MOEA/D

Objectives  Nondominated points
2                          140
3                         1789
4                         5405
5                        10126
6                        16074
We have also generated sets of points by applying another MOMH, namely Pareto local search [18] to solve the multiobjective traveling salesman problem (MOTSP). These results can be found at https://sites.google.com/site/ndtreebasedupdate/ (the same conclusions apply).
VI NDTree-based nondominated sorting
NDTree-based update may also be applied to the problem of nondominated sorting. This problem arises in nondominated-sorting-based MOEAs, e.g. NSGA-III [29], where a population of solutions needs to be assigned to different fronts based on the dominance relations between them.
We solve the nondominated sorting problem in a very straightforward way: we find the first front by updating an initially empty Pareto archive with each point in the population; then the points of the first front are removed from the population and the next front is found in the same way. This process is repeated until all fronts are found.
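This front-peeling scheme can be sketched as follows, assuming minimized objectives. A plain list archive stands in for NDTree here; the dominance logic is identical, only the update speed differs:

```python
def dominates(a, b):
    """a dominates b (minimization): no worse everywhere, better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def update_archive(archive, p):
    """Dynamic nondominance update with a plain list (stand-in for NDTree):
    reject p if dominated, else insert it and evict points it dominates."""
    if any(dominates(q, p) for q in archive):
        return False
    archive[:] = [q for q in archive if not dominates(p, q)]
    archive.append(p)
    return True

def nondominated_sort(population):
    """Peel off fronts by repeatedly building a Pareto archive."""
    remaining, fronts = list(population), []
    while remaining:
        front = []
        for p in remaining:
            update_archive(front, p)
        fronts.append(front)
        remaining = [p for p in remaining if p not in front]
    return fronts
```

For a population of tuples such as [(1, 2), (2, 1), (2, 2), (3, 3)], the first front is {(1, 2), (2, 1)}, then {(2, 2)}, then {(3, 3)}. (Duplicate points in the population would need an extra tie-breaking rule in this sketch.)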
We compare this approach to some recently proposed nondominated sorting algorithms, i.e. ENS-BS/SS [30] and DDA-NS [31]. The ENS-BS/SS algorithm sorts the points lexicographically based on the values of the objectives. It then considers each solution in this order to efficiently find the last front containing a point that dominates the considered point. For each front, this method in fact solves the dynamic nondominance problem with the simple list, and if the considered solution is nondominated within a front, it needs to be compared to each solution in that front. If there is just one front in the population, the method boils down to solving the dynamic nondominance problem with the simple list and requires a quadratic number of comparisons. The DDA-NS algorithm sorts the population according to each objective and builds a comparison matrix for each objective. Then it uses matrix operations, whose cost in general grows quadratically with the population size, to build the fronts. We also compare our algorithm to MFrontII [19], applied in the same way as NDTree-based update. Please note that, similarly to what was done in [31], we apply MFrontII to find each front, while in [19] only the first front was found by MFront.
Also, like [30, 31] we used populations of size 5000. We used both random populations drawn with uniform probability from a hypercube, and populations composed of 5000 randomly selected points from our data sets of intermediate quality q3. For each number of objectives 10 populations were drawn.
The results are presented in Figures 18 to 21. We report both CPU times and the numbers of comparisons of points; we also show the number of fronts on the right axis. Please note that we do not show the number of comparisons for DDA-NS, since this algorithm does not explicitly compare points with the dominance relation. The slowest algorithm is DDA-NS. Our results are quite contradictory to those reported in [31], where DDA-NS is the fastest algorithm in most cases. Please note, however, that in [31] the algorithms were implemented in MATLAB, which, as the authors note, is very efficient in matrix operations. On the other hand, the CPU times reported in [31] are orders of magnitude higher than in our experiment, which rather suggests that the MATLAB implementations of MFront and ENS-BS/SS are quite inefficient.
ENS-BS/SS performs well for populations with many fronts, but its performance deteriorates when the number of fronts is small. As mentioned above, if there is just one front in the population, the method boils down to the dynamic nondominance problem solved with the simple list. This is why, for populations drawn from our sets, which contain points of higher quality than random populations, and for higher numbers of objectives, where the populations often contain just one front, the number of comparisons saturates at a constant level.
In most cases NDTree-based nondominated sorting is the most efficient method in terms of both CPU time and the number of comparisons. These results are very promising but only preliminary, and further experiments, especially with populations generated by real MOEAs, are necessary. Please note that, as suggested in [19], the practical efficiency of nondominated sorting in the context of a MOEA may be further improved by maintaining the first front in a Pareto archive.
VII Conclusions
We have proposed a new method for the dynamic nondominance problem. According to the theoretical analysis, the method remains sublinear with respect to the number of points in the archive under mild assumptions, and the time of processing a single point observed in the computational experiments is almost constant. The results of computational experiments with artificial data sets of various global shapes, as well as with sets of points generated by two different multiobjective methods, i.e. the MOEA/D evolutionary algorithm and the Pareto local search metaheuristic, indicate that the proposed method outperforms the competing methods on problems with three and more objectives. In the biobjective case the best choice remains the sorted list.
We believe that with the proposed method for updating a Pareto archive, new state-of-the-art results could be obtained for many multiobjective problems with more than two objectives. Indeed, the results of our computational experiments indicate that the choice of an appropriate method for updating the Pareto archive may have a crucial influence on the running time of multiobjective evolutionary algorithms and other metaheuristics, especially for higher numbers of objectives.
Interesting directions for further research are to adapt NDTree to be able to deal with archives of a relatively large but bounded size or to solve the static nondominance problem.
We have also obtained promising results by applying NDTree-based update to nondominated sorting and comparing it to some state-of-the-art algorithms for this problem. Further experiments are, however, necessary, especially with populations generated by real MOEAs. Furthermore, the good results of the ENS algorithm for populations with many fronts suggest that a combination of ENS with NDTree may be an interesting direction for further research.
Acknowledgment
The research of Andrzej Jaszkiewicz was funded by the Polish National Science Center, grant no. UMO-2013/11/B/ST6/01075.
References
 [1] O. Schütze, “A new data structure for the nondominance problem in multiobjective optimization,” in Proceedings of the 2nd International Conference on Evolutionary Multicriterion Optimization, ser. EMO’03. Berlin, Heidelberg: SpringerVerlag, 2003, pp. 509–518.
 [2] Q. Zhang and H. Li, “MOEA/D: A multiobjective evolutionary algorithm based on decomposition,” Evolutionary Computation, IEEE Transactions on, vol. 11, no. 6, pp. 712–731, 2007.
 [3] E. Zitzler, M. Laumanns, and L. Thiele, “SPEA2: Improving the Strength Pareto Evolutionary Algorithm,” Computer Engineering and Networks Laboratory (TIK), Swiss Federal Institute of Technology (ETH) Zurich, Zurich, Switzerland, Tech. Rep. 103, May 2001.
 [4] C. A. C. Coello, G. T. Pulido, and M. S. Lechuga, “Handling multiple objectives with particle swarm optimization,” IEEE Transactions on Evolutionary Computation, vol. 8, no. 3, pp. 256–279, June 2004.
 [5] S. Agrawal, B. K. Panigrahi, and M. K. Tiwari, “Multiobjective particle swarm algorithm with fuzzy clustering for electrical power dispatch,” IEEE Transactions on Evolutionary Computation, vol. 12, no. 5, pp. 529–541, Oct 2008.
 [6] B. Li, K. Tang, J. Li, and X. Yao, “Stochastic ranking algorithm for manyobjective optimization based on multiple indicators,” IEEE Transactions on Evolutionary Computation, vol. 20, no. 6, pp. 924–938, Dec 2016.
 [7] X. Cai, Y. Li, Z. Fan, and Q. Zhang, “An external archive guided multiobjective evolutionary algorithm based on decomposition for combinatorial optimization,” IEEE Transactions on Evolutionary Computation, vol. 19, no. 4, pp. 508–523, Aug 2015.
 [8] S. Yang, M. Li, X. Liu, and J. Zheng, “A gridbased evolutionary algorithm for manyobjective optimization,” IEEE Transactions on Evolutionary Computation, vol. 17, no. 5, pp. 721–736, Oct 2013.
 [9] L. Tang and X. Wang, “A hybrid multiobjective evolutionary algorithm for multiobjective optimization problems,” IEEE Transactions on Evolutionary Computation, vol. 17, no. 1, pp. 20–45, Feb 2013.
 [10] S. Mostaghim and J. Teich, “Quadtrees: A data structure for storing Pareto sets in multiobjective evolutionary algorithms with elitism,” in EMO, Book Series: Advanced Information and Knowledge, A. Abraham, L. Jain, and R. Goldberg, Eds. SpringerVerlag, 2004, pp. 81–104.
 [11] J. Fieldsend, R. Everson, and S. Singh, “Using unconstrained elite archives for multiobjective optimization,” IEEE Transactions on Evolutionary Computation, vol. 7, no. 3, pp. 305–323, June 2003.
 [12] D. Brockhoff, Theory of Randomized Search Heuristics: Foundations and Recent Developments. World Sc. Publ. Comp., 2010, ch. Theoretical aspects of evolutionary multiobjective optimization, pp. 101–139.
 [13] H. T. Kung, F. Luccio, and F. P. Preparata, “On finding the maxima of a set of vectors,” J. ACM, vol. 22, no. 4, pp. 469–476, Oct. 1975.
 [14] H. N. Gabow, J. L. Bentley, and R. E. Tarjan, “Scaling and related techniques for geometry problems,” in Proceedings of the Sixteenth Annual ACM Symposium on Theory of Computing, ser. STOC ’84. New York, NY, USA: ACM, 1984, pp. 135–143.
 [15] F. P. Preparata and M. I. Shamos, Computational Geometry: An Introduction. New York, NY, USA: SpringerVerlag New York, Inc., 1985.
 [16] P. Gupta, R. Janardan, M. Smid, and B. Dasgupta, “The rectangle enclosure and pointdominance problems revisited,” International Journal of Computational Geometry & Applications, vol. 7, no. 5, pp. 437–455, 1997.
 [17] K. Deb, M. Mohan, and S. Mishra, “Evaluating the domination based multiobjective evolutionary algorithm for a quick computation of paretooptimal solutions,” Evol. Comput., vol. 13, no. 4, pp. 501–525, Dec. 2005.
 [18] L. Paquete, M. Chiarandini, and T. Stützle, “Pareto local optimum sets in the biobjective traveling salesman problem: an experimental study,” in Metaheuristics for Multiobjective Optimisation, X. Gandibleux, M. Sevaux, K. Sörensen, and V. T’kindt, Eds. Berlin: Springer. Lecture Notes in Economics and Mathematical Systems Vol. 535, 2004, pp. 177–199.
 [19] M. Drozdík, Y. Akimoto, H. Aguirre, and K. Tanaka, “Computational cost reduction of nondominated sorting using the M-front,” IEEE Transactions on Evolutionary Computation, vol. 19, no. 5, pp. 659–678, 2015.
 [20] K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan, “A fast and elitist multiobjective genetic algorithm: NSGAII,” IEEE Transactions on Evolutionary Computation, vol. 6, no. 2, pp. 182–197, Apr 2002.
 [21] M. Jensen, “Reducing the runtime complexity of multiobjective EAs: The NSGAII and other algorithms,” IEEE Transactions on Evolutionary Computation, vol. 7, no. 5, pp. 503–515, Oct 2003.
 [22] F.A. Fortin, S. Grenier, and M. Parizeau, “Generalizing the improved runtime complexity algorithm for nondominated sorting,” in Proceedings of the 15th Annual Conference on Genetic and Evolutionary Computation. New York, NY, USA: ACM, 2013, pp. 615–622.
 [23] O. Schütze, S. Mostaghim, M. Dellnitz, and J. Teich, “Covering Pareto sets by multilevel evolutionary subdivision techniques,” in Proceedings of the Second Int. Conf. on Evolutionary MultiCriterion Optimization. Berlin: Springer, 2003, pp. 118–132.
 [24] J. E. Fieldsend, R. M. Everson, and S. Singh, “Using unconstrained elite archives for multiobjective optimization,” IEEE Transactions on Evolutionary Computation, vol. 7, no. 3, pp. 305–323, 2003.
 [25] N. Altwaijry and M. Bachir Menai, “Data structures in multiobjective evolutionary algorithms,” Journal of Computer Science and Technology, vol. 27, no. 6, pp. 1197–1210, 2012.
 [26] S. Mostaghim, J. Teich, and A. Tyagi, “Comparison of data structures for storing Paretosets in MOEAs,” in Evolutionary Computation, 2002. CEC ’02., vol. 1, May 2002, pp. 843–848.
 [27] W. Habenicht, Essays and Surveys on MCDM: Proceedings of the Fifth International Conference on Multiple Criteria Decision Making, Mons, Belgium, August 9–13, 1982. Springer, 1983, ch. Quad Trees, a Datastructure for Discrete Vector Optimization Problems, pp. 136–145.
 [28] M. Sun and R. Steuer, “Quad trees and linear list for identifying nondominated criterion vectors,” INFORMS Journal on Computing, vol. 8, no. 4, pp. 367–375, 1996.
 [29] K. Deb and H. Jain, “An evolutionary manyobjective optimization algorithm using referencepointbased nondominated sorting approach, part i: Solving problems with box constraints,” IEEE Transactions on Evolutionary Computation, vol. 18, no. 4, pp. 577–601, Aug 2014.
 [30] X. Zhang, Y. Tian, R. Cheng, and Y. Jin, “An efficient approach to nondominated sorting for evolutionary multiobjective optimization,” IEEE Transactions on Evolutionary Computation, vol. 19, no. 2, pp. 201–213, April 2015.
 [31] Y. Zhou, Z. Chen, and J. Zhang, “Ranking vectors by means of the dominance degree matrix,” IEEE Transactions on Evolutionary Computation, vol. 21, no. 1, pp. 34–51, Feb 2017.