# Maximizing Non-monotone/Non-submodular Functions by Multi-objective Evolutionary Algorithms

## Abstract

Evolutionary algorithms (EAs) are a kind of nature-inspired general-purpose optimization algorithm, and have shown empirically good performance in solving various real-world optimization problems. However, due to their highly randomized and complex behavior, the theoretical analysis of EAs is difficult and remains an ongoing challenge, which has attracted considerable research attention. During the last two decades, promising results on the running time analysis (one essential theoretical aspect) of EAs have been obtained, but most of them focused on isolated combinatorial optimization problems, which do not reflect the general-purpose nature of EAs. To provide a general theoretical explanation of the behavior of EAs, it is desirable to study the performance of EAs on a general class of combinatorial optimization problems. To the best of our knowledge, this direction has rarely been touched, and the only known result is the provably good approximation guarantees of EAs for the problem class of maximizing monotone submodular set functions with matroid constraints, which includes many NP-hard combinatorial optimization problems. The aim of this work is to contribute to this line of research. As many combinatorial optimization problems also involve non-monotone or non-submodular objective functions, we consider these two general problem classes: maximizing non-monotone submodular functions without constraints and maximizing monotone non-submodular functions with a size constraint. We prove that a simple multi-objective EA called GSEMO can generally achieve good approximation guarantees in polynomial expected running time.

## 1 Introduction

Evolutionary algorithms (EAs) [3] are a kind of randomized metaheuristic optimization algorithm, inspired by the evolution process of natural species, i.e., natural selection and survival of the fittest. Starting from a random population of solutions, EAs iteratively apply reproduction operators to generate a set of offspring solutions from the current population, and then apply a selection operator to weed out bad solutions. EAs have been applied to diverse areas (e.g., antenna design [18], bioinformatics [23] and data mining [24]) and can produce human-competitive results [20]. Compared with the application, the theoretical analysis of EAs is, however, far behind. Many researchers thus have been devoted to understanding the behavior of EAs from a theoretical point of view, which is still an ongoing challenge.

In the last two decades, a lot of progress has been made on the running time analysis of EAs, which is one essential theoretical aspect. The running time measures how many objective (i.e., fitness) function evaluations an EA needs until finding an optimal solution or an approximate solution. The running time analysis of EAs started with artificial example problems. In [7], a simple single-objective EA called (1+1)-EA was shown to solve two well-structured pseudo-Boolean problems, OneMax and LeadingOnes, in $\Theta(n \log n)$ and $\Theta(n^2)$ expected running time (where $n$ is the problem size), respectively. These two problems are to maximize the number of 1-bits of a solution and the number of consecutive 1-bits counting from the left of a solution, respectively. Both of them have a short path with increasing fitness to the optimum. For some problems (e.g., SPC) where there is a short path with constant fitness to the optimum, the (1+1)-EA can also find an optimal solution in polynomial time [19]. But when the problem (e.g., Trap) has a deceptive path (i.e., a path with increasing fitness away from the optimum), the (1+1)-EA will need exponential running time [15]. More results can be found in [2].
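To make the two example problems concrete, the following is a minimal Python sketch of OneMax, LeadingOnes and a (1+1)-EA. The bit-wise mutation rate of 1/n, the "accept if not worse" rule, the stopping test `fx < n` (which relies on both problems having optimal value n) and all function names are assumptions chosen for illustration; the sketch is not taken from [7].

```python
import random


def one_max(x):
    """Number of 1-bits in the bit list x."""
    return sum(x)


def leading_ones(x):
    """Number of consecutive 1-bits counted from the left of x."""
    count = 0
    for bit in x:
        if bit != 1:
            break
        count += 1
    return count


def one_plus_one_ea(fitness, n, max_evals, rng):
    """(1+1)-EA sketch: keep a single solution, flip each bit independently
    with probability 1/n, and accept the offspring if it is not worse.
    Stops when the fitness reaches n or the evaluation budget is used up.
    Returns (solution, number of fitness evaluations)."""
    x = [rng.randint(0, 1) for _ in range(n)]
    fx = fitness(x)
    evals = 1
    while fx < n and evals < max_evals:
        y = [1 - b if rng.random() < 1.0 / n else b for b in x]
        fy = fitness(y)
        evals += 1
        if fy >= fx:
            x, fx = y, fy
    return x, evals


# Hypothetical usage: count evaluations on a small OneMax instance.
solution, used = one_plus_one_ea(one_max, 8, 100000, random.Random(0))
```

The returned evaluation count corresponds to the running time measure described above, i.e., the number of fitness evaluations until an optimal solution is found.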

The analysis on simple artificial problems disclosed theoretical properties of EAs (e.g., which problem structures are easy or hard for EAs), and also helped to develop approaches for analyzing more complex problems. The running time analysis of EAs was then extended to combinatorial optimization problems. For some P-solvable problems, EAs were shown to be able to find an optimal solution in polynomial time. For example, the minimum spanning tree problem can be solved by the (1+1)-EA and a simple multi-objective EA called GSEMO in [29] and [28] expected time, respectively. Note that , and are the number of edges, the number of nodes and the maximum edge weight of a graph, respectively. For some NP-hard problems, EAs were shown to be able to achieve good approximation ratios in polynomial time. For example, for the partition problem, the (1+1)-EA can achieve a -approximation ratio in expected time [36]; for the minimum set cover problem, the expected running time of the GSEMO until obtaining a -approximation ratio is [11], where , and denote the size of the ground set, the number of subsets and the maximum cost of a subset, respectively. More running time results of EAs on combinatorial optimization problems can be found in [30].

For the analysis of the GSEMO (which is a multi-objective EA) on single-objective optimization problems (e.g., minimum spanning tree and minimum set cover), the original single-objective problem is transformed into a multi-objective problem, which is then solved by the GSEMO. Note that multi-objective optimization here is just an intermediate process (which might be beneficial [11]), and we still focus on the quality of the best solution w.r.t. the original single-objective problem, in the population found by the GSEMO. Running time analysis of EAs on real multi-objective optimization problems has also been investigated, where the running time is measured by the number of fitness evaluations until finding the Pareto front (which represents different optimal tradeoffs between the multiple objectives) or an approximation of the Pareto front. For example, Giel [12] proved that the GSEMO can solve the bi-objective pseudo-Boolean problem LOTZ in expected time; for the NP-hard bi-objective minimum spanning tree problem, it was shown that the GSEMO can obtain a -approximation ratio in pseudo-polynomial time [27].

The analysis on combinatorial optimization problems helped to reveal the ability of EAs. However, most of the previous promising results were obtained for isolated problems, while EAs are known to be general-purpose optimization algorithms, which can be applied to various problems. Thus, it is more desirable to provide a general theoretical explanation of the behavior of EAs, that is, to theoretically study the performance of EAs on a general class of combinatorial optimization problems.

To the best of our knowledge, only two pieces of work in this direction have been reported. Reichel and Skutella [35] first studied the problem class of maximizing linear functions with matroid constraints, which includes some well-known combinatorial optimization problems such as maximum matching, Hamiltonian path, etc. They proved that the (1+1)-EA can obtain a -approximation ratio in expected running time, where , and denote the size of the ground set, the minimum rank of the ground set w.r.t. one matroid and the maximum weight of an element, respectively. Later, a generalization of this problem class, i.e., the objective function is relaxed to satisfy the monotone and submodular property, was considered in [10]. The (1+1)-EA was shown able to achieve a -approximation ratio in expected time, where and . Friedrich and Neumann [10] also studied a specific non-monotone case for the objective function, i.e., the objective function is symmetric. They proved that the expected running time until the GSEMO obtains a -approximation ratio for maximizing symmetric submodular functions with matroid constraints is .

The aim of this paper is to contribute to this line of research. Considering that the objective function of many combinatorial optimization problems can be non-monotone (not necessarily symmetric) or non-submodular, we study the performance of EAs on these two general problem classes, maximizing non-monotone submodular functions without constraints and maximizing monotone non-submodular functions with a size constraint. Note that the objective function is a set function which maps a subset of the ground set to a real value, and a size constraint means that the size of a subset is not larger than a budget . We prove that for both problem classes, the GSEMO can obtain a good approximation guarantee in polynomial expected running time. Our main results can be summarized as follows.

For the problem class of maximizing non-monotone submodular functions without constraints, with special instances including maximum cut [13], maximum facility location [1] and variants of the maximum satisfiability problem [14], we prove that the GSEMO can achieve a constant approximation ratio of in expected running time (i.e., **Theorem ?**), where is the size of the ground set and .

For the problem class of maximizing monotone non-submodular functions with a size constraint, with special instances including sparse regression [6], dictionary selection [21] and robust influence maximization [5], we prove the approximation guarantee of the GSEMO w.r.t. each notion of “approximate submodularity”, which measures how close a general set function is to submodularity.

(1) In [22], a set function is -approximately submodular if the diminishing return property holds with some deviation , i.e., for any and , . is submodular iff . We prove that the GSEMO within expected time can find a subset with (i.e., **Theorem ?**), where is the base of the natural logarithm and denotes the optimal function value.

(2) In [6], the approximate submodularity notion of a set function is characterized by a quantity called submodularity ratio. is submodular iff . We prove that the GSEMO within expected time can find a subset with (i.e., **Theorem ?**).

(3) In [17], a set function is -approximately submodular if there exists a submodular set function such that , . is submodular iff . We prove that the GSEMO within expected time can find a subset with (i.e., **Theorem ?**).

Note that since EAs are general-purpose algorithms which utilize a small amount of problem knowledge, we cannot expect them to beat the best problem-specific algorithm. For the first problem class, the approximation ratio nearly obtained by the GSEMO is worse than the best known one , which was previously obtained by the double greedy algorithm [4]. For the second problem class, the approximation ratio obtained by the GSEMO w.r.t. each notion of approximate submodularity reaches the best known one, which was previously obtained by the standard greedy algorithm [22]. Particularly, when the objective function is submodular, the obtained approximation ratio by the GSEMO becomes , which is optimal in general [25], and also consistent with the previous result in [10]. Our analytical results on general problem classes together with the previous ones [35] provide a theoretical explanation for the empirically good behaviors of EAs in diverse applications.

The rest of this paper is organized as follows. Section 2 introduces some preliminaries. The running time analyses for non-monotone and non-submodular cases are then presented in Sections 3 and 4, respectively. Section 5 concludes the paper.

## 2 Preliminaries

In this section, we first introduce the two problem classes studied in this paper, and then introduce the multi-objective evolutionary algorithm GSEMO and also how to optimize the studied problem classes by the GSEMO.

### 2.1 Non-monotone/Non-submodular Function Maximization

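Let $\mathbb{R}$ and $\mathbb{R}^{+}$ denote the set of reals and non-negative reals, respectively. Given a finite nonempty set $V$, we study the functions $f: 2^V \rightarrow \mathbb{R}$ defined on subsets of $V$. A set function $f$ is monotone if for any $X \subseteq Y \subseteq V$, $f(X) \leq f(Y)$. Without loss of generality, we assume that the monotone function is normalized, i.e., $f(\emptyset) = 0$. A set function $f$ is submodular [26] if for any $X \subseteq Y \subseteq V$ and $v \notin Y$,

$$f(X \cup \{v\}) - f(X) \geq f(Y \cup \{v\}) - f(Y), \tag{1}$$

or equivalently for any $X \subseteq Y \subseteq V$,

$$f(Y) - f(X) \leq \sum_{v \in Y \setminus X} \big(f(X \cup \{v\}) - f(X)\big). \tag{2}$$

We assume that a set function is given by a value oracle, i.e., for a given subset $X$, an algorithm can query an oracle to obtain the value $f(X)$.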
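As an illustration of these definitions, the sketch below brute-forces the monotonicity and diminishing-return conditions on a tiny ground set, treating a plain Python callable as the value oracle. The helper names, the triangle-graph cut function and the tolerance parameter are assumptions made for this example; enumerating all subsets is exponential and only sensible for toy instances.

```python
from itertools import combinations


def subsets(ground):
    """Yield every subset of the ground set as a frozenset."""
    items = list(ground)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def is_monotone(f, ground):
    """Check f(X) <= f(Y) for all X subset-of Y by exhaustive enumeration."""
    all_sets = list(subsets(ground))
    return all(f(X) <= f(Y) for X in all_sets for Y in all_sets if X <= Y)


def is_submodular(f, ground, tol=1e-12):
    """Check the diminishing return property: for all X subset-of Y and
    v not in Y, the gain of v at X is at least the gain of v at Y."""
    ground = frozenset(ground)
    all_sets = list(subsets(ground))
    for X in all_sets:
        for Y in all_sets:
            if not X <= Y:
                continue
            for v in ground - Y:
                gain_x = f(X | {v}) - f(X)
                gain_y = f(Y | {v}) - f(Y)
                if gain_x < gain_y - tol:
                    return False
    return True


# Toy value oracle: cut function of a triangle graph on nodes 0, 1, 2.
EDGES = [(0, 1), (1, 2), (0, 2)]


def cut_value(X):
    """Number of edges with exactly one endpoint in X."""
    return sum(1 for u, v in EDGES if (u in X) != (v in X))
```

On this toy instance the cut function passes the submodularity check but fails the monotonicity check, which matches the role of maximum cut as a non-monotone submodular example.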

Our first studied problem class as presented in Definition ? is to maximize a non-monotone submodular set function. Without loss of generality, we assume that the objective function is non-negative. This problem generalizes many NP-hard combinatorial optimization problems including maximum cut [13], maximum facility location [1], variants of the maximum satisfiability problem [14], etc. The best known approximation guarantee is , which was achieved by the double greedy algorithm [4].

The second studied problem class is presented in Definition ?. The goal is to find a subset with at most elements such that a given monotone non-submodular set function is maximized. This problem generalizes many NP-hard problems including sparse regression [6], dictionary selection [21] and robust influence maximization [5], etc. When the objective function is submodular, it was well known that the standard greedy algorithm achieves the optimal approximation guarantee [25]. For non-submodular cases, several notions of “approximate submodularity” [22] were introduced to measure to what extent a general set function has the submodular property. For each approximately submodular notion, the best known approximation guarantee was achieved by the standard greedy algorithm [22], which iteratively adds one element with the largest improvement until elements are selected.
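The standard greedy algorithm mentioned above can be sketched as follows. The function name `greedy`, the toy coverage objective and its data are hypothetical and only serve as an example of a monotone set function given by a value oracle; ties are broken by the iteration order of the ground set, which is an arbitrary choice.

```python
def greedy(f, ground, k):
    """Standard greedy sketch: start from the empty set and, for k rounds,
    add the element with the largest marginal gain on f."""
    selected = frozenset()
    for _ in range(k):
        candidates = [v for v in ground if v not in selected]
        if not candidates:
            break
        best = max(candidates, key=lambda v: f(selected | {v}) - f(selected))
        selected = selected | {best}
    return selected


# Hypothetical toy objective: number of items covered by the chosen sets.
SETS = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6, 7},
    "d": {1},
}


def coverage(X):
    """Size of the union of the sets whose names are in X."""
    covered = set()
    for name in X:
        covered |= SETS[name]
    return len(covered)


chosen = greedy(coverage, list(SETS), 2)
```

Each round queries the oracle once per remaining element, so the sketch uses on the order of k times the ground-set size evaluations.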

In [22], the approximate submodularity as presented in Definition ? was defined based on the diminishing return property (i.e., Eq. ()). That is, the approximately submodular degree depends on how large a deviation of the diminishing return property can hold with. It is easy to see that is submodular iff . The standard greedy algorithm was proved to find a subset with [22], where denotes the optimal function value.

In [6], the submodularity ratio as presented in Definition ? was introduced to measure the closeness of a set function to submodularity. It is easy to see from Eq. () that is submodular iff for any and . When is clear in the paper, we will use shortly. The standard greedy algorithm was proved to find a subset with [6].

The above two notions of approximate submodularity are based on the equivalent statements Eqs. (Equation 1) and (Equation 2) of submodularity, while in [17], the approximate submodularity of a set function as presented in Definition ? was defined based on the closeness to other submodular functions. It is easy to see that is submodular iff . The standard greedy algorithm was proved to find a subset with [17].

### 2.2 Multi-objective Evolutionary Algorithms

To investigate the performance of EAs optimizing the two problem classes in Definitions ? and ?, we consider a simple multi-objective EA called GSEMO, which has been widely used in previous theoretical analyses [11]. The GSEMO as presented in Algorithm ? is used for maximizing multi-objective pseudo-Boolean problems with objective functions (). Note that a pseudo-Boolean function naturally characterizes a set function , since a subset of can be naturally represented by a Boolean vector , where the -th bit means that , and means that . Throughout the paper, we will not distinguish and its corresponding subset for notational convenience.
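The correspondence between subsets and Boolean vectors can be written down in a few lines. The helper names below and the convention that the ground set is an ordered list are assumptions for illustration.

```python
def subset_to_bits(subset, ground):
    """Encode a subset as a 0/1 tuple: bit i is 1 iff ground[i] is in the subset."""
    return tuple(1 if v in subset else 0 for v in ground)


def bits_to_subset(bits, ground):
    """Decode a 0/1 tuple back into the subset it represents."""
    return frozenset(v for v, b in zip(ground, bits) if b == 1)


def as_pseudo_boolean(f, ground):
    """Wrap a set function f so that it can be evaluated on 0/1 tuples."""
    return lambda bits: f(bits_to_subset(bits, ground))
```

With this wrapper, any value oracle for a set function can be handed to an algorithm that works on bit strings.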

Before introducing the GSEMO, we first introduce some basic concepts in multi-objective maximization. Since the objectives to be maximized usually conflict, there is no canonical complete order on the solution space. The comparison between two solutions relies on the *domination* relationship. For two solutions $x$ and $x'$, with $f_i$ denoting the $i$-th objective function, the domination relationship can be formally stated as follows:

$x$ *weakly dominates* $x'$ (i.e., $x$ is *better* than $x'$, denoted by $x \succeq x'$) if $f_i(x) \geq f_i(x')$ for all $i$;

$x$ *dominates* $x'$ (i.e., $x$ is *strictly better* than $x'$, denoted by $x \succ x'$) if $x \succeq x'$ and $f_i(x) > f_i(x')$ for some $i$.

If neither $x$ is better than $x'$ nor $x'$ is better than $x$, we say that they are *incomparable*. A solution is *Pareto optimal* if there is no other solution that dominates it. The set of objective vectors of all the Pareto optimal solutions constitutes the *Pareto front*. The goal of multi-objective optimization is to find the Pareto front, that is, to find at least one corresponding solution for each objective vector in the Pareto front.
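The domination concepts can be expressed directly on objective vectors. The following sketch assumes that all objectives are to be maximized and that each solution is represented by a tuple of its objective values; the function names are chosen for this example.

```python
def weakly_dominates(a, b):
    """a is at least as good as b in every objective (maximization)."""
    return all(ai >= bi for ai, bi in zip(a, b))


def dominates(a, b):
    """a weakly dominates b and is strictly better in some objective."""
    return weakly_dominates(a, b) and any(ai > bi for ai, bi in zip(a, b))


def incomparable(a, b):
    """Neither vector weakly dominates the other."""
    return not weakly_dominates(a, b) and not weakly_dominates(b, a)


def pareto_front(vectors):
    """Objective vectors that are not dominated by any other given vector."""
    unique = set(vectors)
    return {v for v in unique if not any(dominates(u, v) for u in unique)}
```

For instance, with objective vectors of the form (function value, negated subset size), a vector with a larger value and a larger size is incomparable to one with a smaller value and a smaller size.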

The procedure of the GSEMO algorithm is presented in Algorithm ?. Starting from a random solution (lines 1-2), it iteratively tries to improve the quality of the solutions in the population (lines 3-12). In each iteration, a new solution is generated by randomly flipping bits of an archived solution selected from the current population (lines 4-5); the complementary set of is also generated (line 6); these two newly generated solutions are then used to update the population (lines 7-11). In the updating procedure, if is not dominated by (i.e., not strictly worse than) any previously archived solution (line 8), it will be added into , and meanwhile those previously archived solutions weakly dominated by (i.e., worse than) will be removed from (line 9). It is easy to see that the population will always contain a set of incomparable solutions due to the domination-based comparison. Note that the GSEMO here is a little different from its original version [11]. In the original GSEMO algorithm, only is generated and used to update the population in each iteration, while here we also generate the complementary set of and use both of them to update the population.

For optimizing the two problems in Definitions ? and ? by the GSEMO, each problem is transformed into a bi-objective maximization problem

where and . That is, the GSEMO is to maximize the objective function and minimize the subset size simultaneously. Note that denotes the number of 1-bits of a solution . When the GSEMO terminates after running a number of iterations, the best solution w.r.t. the original single-objective problem in the resulting population will be returned. For the first problem in Definition ?, the solution with the largest value in (i.e., ) will be returned. For the second problem in Definition ?, the solution with the largest value satisfying the size constraint in (i.e., ) will be returned. The running time of the GSEMO is measured by the number of fitness evaluations until the best solution w.r.t. the original single-objective problem in the population reaches some approximation guarantee for the first time. Since only the new solutions and need to be evaluated in each iteration of the GSEMO, the number of fitness evaluations is just the double of the number of iterations of the GSEMO.
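The procedure described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' Algorithm: the two objectives are taken to be the function value and the negated number of 1-bits, mutation flips each bit independently with probability 1/n (the usual choice for the GSEMO), and the names `gsemo`, `feasible` and the 4-cycle cut example are hypothetical.

```python
import random


def weakly_dominates(a, b):
    return all(ai >= bi for ai, bi in zip(a, b))


def dominates(a, b):
    return weakly_dominates(a, b) and any(ai > bi for ai, bi in zip(a, b))


def gsemo(f, n, iterations, rng, feasible=lambda x: True):
    """GSEMO sketch on 0/1 tuples of length n.

    Maximizes (f(x), -|x|); in each iteration both the mutated offspring
    and its complement are used to update the population. Returns the
    feasible solution with the largest f value in the final population,
    or None if no archived solution is feasible."""
    def objectives(x):
        return (f(x), -sum(x))

    start = tuple(rng.randint(0, 1) for _ in range(n))
    population = {start: objectives(start)}
    for _ in range(iterations):
        parent = rng.choice(list(population))  # uniform selection
        child = tuple(1 - b if rng.random() < 1.0 / n else b for b in parent)
        complement = tuple(1 - b for b in child)
        for y in (child, complement):
            fy = objectives(y)
            # Skip y if some archived solution strictly dominates it.
            if any(dominates(fz, fy) for fz in population.values()):
                continue
            # Remove archived solutions weakly dominated by y, then add y.
            population = {z: fz for z, fz in population.items()
                          if not weakly_dominates(fy, fz)}
            population[y] = fy
    candidates = [x for x in population if feasible(x)]
    return max(candidates, key=lambda x: population[x][0], default=None)


# Hypothetical usage: unconstrained maximum cut on a 4-cycle.
CYCLE = [(0, 1), (1, 2), (2, 3), (3, 0)]


def cycle_cut(x):
    return sum(1 for u, v in CYCLE if x[u] != x[v])


best = gsemo(cycle_cut, 4, 5000, random.Random(1))
```

For the size-constrained problem, passing something like `feasible=lambda x: sum(x) <= k` restricts the returned solution to subsets within the budget. The sketch evaluates two new solutions per iteration, in line with the evaluation count stated above.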

Note that multi-objective optimization here is just an intermediate process, which has been shown helpful for solving some single-objective combinatorial optimization problems [11]. We still focus on the quality of the best solution w.r.t. the original single-objective problem, in the population found by the GSEMO, rather than the quality of the population w.r.t. the transformed multi-objective optimization problem.

## 3 Non-monotone Submodular Function Maximization without Constraints

In this section, we theoretically analyze the performance of the GSEMO for maximizing non-monotone submodular functions without constraints. We prove in Theorem ? that the GSEMO can achieve a constant approximation ratio of nearly in polynomial time. Note that denotes the optimal function value. Inspired by the proof of Theorem 4 in [10], our proof idea is to follow the behavior of the local search algorithm [9], which iteratively tries to improve a solution by inserting or deleting one element. The proof relies on Lemma ?, which shows that it is always possible to improve a solution until a good approximation has been achieved. This lemma is extracted from Lemma 3.4 in [9].

We divide the optimization process into three phases: (1) starts from an initial random solution and finishes upon finding the special solution (i.e., ); (2) starts after phase (1) and finishes upon finding a solution with the objective value at least ; (3) starts after phase (2) and finishes upon finding a solution with the desired approximation guarantee. We analyze the expected running time of each phase separately, and then sum them up to get an upper bound on the total expected running time of the GSEMO.

For phase (1), we consider the minimum number of 1-bits of the solutions in the population , denoted by . That is, . Assume that currently , and let be the corresponding solution, i.e., . It is easy to see that cannot increase because cannot be weakly dominated by a solution with more 1-bits. In each iteration of the GSEMO, to decrease , it is sufficient to select in line 4 of Algorithm ? and flip only one 1-bit of in line 5. This is because the newly generated solution now has the smallest number of 1-bits (i.e., ) and no solution in can dominate it; thus it will be included into . Let denote the largest size of during the run of the GSEMO. The probability of selecting in line 4 of Algorithm ? is due to uniform selection, and the probability of flipping only one 1-bit of in line 5 is , since has 1-bits. Thus, the probability of decreasing by at least 1 in each iteration of the GSEMO is at least . Note that . We can then get that the expected number of iterations of phase (1) (i.e., reaches 0) is at most

Note that the solution will always be kept in once generated, since it has the smallest subset size 0 and no other solution can weakly dominate it.

For phase (2), it is sufficient that in one iteration of the GSEMO, the solution is selected in line 4, and only a specific 0-bit corresponding to the best single element (i.e., ) is flipped in line 5. That is, the solution is generated. Since the objective function is submodular and non-negative, we easily have . After generating the solution , it will then be used to update the population , which makes always contain a solution weakly dominating , i.e., and . Thus, we only need to analyze the expected number of iterations of the GSEMO until generating the solution . Since the probability of selecting in line 4 of the GSEMO is at least and the probability of flipping only a specific 0-bit in line 5 is , the expected number of iterations of phase (2) is .

As in [9], we call a solution a -approximate local optimum if for any and for any . According to Lemma ?, we know that a -approximate local optimum satisfies that . For phase (3), we thus only need to analyze the expected number of iterations until generating a -approximate local optimum in line 5 of Algorithm ?. This is because both and will be used to update the population , and then for either one of and , will always contain one solution weakly dominating it, which implies that . We then consider the largest value of the solutions in the population , denoted by . That is, . After phase (2), , and let be the corresponding solution, i.e., . It is obvious that cannot decrease, because cannot be weakly dominated by a solution with a smaller value. As long as is not a -approximate local optimum, we know that a new solution with can be generated through selecting in line 4 of Algorithm ? and flipping only one specific 1-bit (i.e., deleting one specific element from ) or one specific 0-bit (i.e., adding one specific element into ) in line 5, the probability of which is at least . Since now has the largest value and no other solution in can dominate it, it will be included into . Thus, can increase by at least a factor of with probability at least in each iteration. Such an increase on is called a successful step. Thus, a successful step needs at most expected number of iterations. It is also easy to see that until generating a -approximate local optimum, the number of successful steps is at most . Thus, the expected number of iterations of phase (3) is at most
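The local search behavior that the proof follows can be illustrated with a small sketch. It assumes the definition used in [9]: a solution is a (1+α)-approximate local optimum if neither deleting one of its elements nor adding one outside element yields a value larger than (1+α) times its current value. The parameter name `alpha`, the function names and the triangle-graph cut objective are assumptions made for this example; the specific constant used in the analysis is not reproduced here.

```python
def neighbor_of(x, v):
    """Delete v from x if present, otherwise insert it."""
    return x - {v} if v in x else x | {v}


def is_approx_local_optimum(f, x, ground, alpha):
    """True iff no single insertion or deletion improves f(x) by more than
    a factor of (1 + alpha)."""
    x = frozenset(x)
    bound = (1 + alpha) * f(x)
    return all(f(neighbor_of(x, v)) <= bound for v in ground)


def local_search(f, ground, alpha, start=()):
    """Repeatedly apply the first single-element move that improves f by
    more than a factor of (1 + alpha); stop at an approximate local optimum.
    Terminates for non-negative f because f strictly increases each move."""
    x = frozenset(start)
    improved = True
    while improved:
        improved = False
        for v in ground:
            y = neighbor_of(x, v)
            if f(y) > (1 + alpha) * f(x):
                x, improved = y, True
                break
    return x


# Toy non-negative submodular objective: cut function of a triangle graph.
TRIANGLE = [(0, 1), (1, 2), (0, 2)]


def triangle_cut(X):
    return sum(1 for u, v in TRIANGLE if (u in X) != (v in X))


local_opt = local_search(triangle_cut, [0, 1, 2], 0.1)
```

Each accepted move plays the role of a successful step in the analysis above: the objective value grows by at least the chosen factor, which bounds how many such steps can occur.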

From the procedure of the GSEMO, we know that the solutions maintained in must be incomparable. Thus, each value of one objective can correspond to at most one solution in . Because the second objective can only belong to , we have . Hence, the expected running time of the GSEMO for finding a solution with the objective function value at least is

## 4 Monotone Non-submodular Function Maximization with a Size Constraint

In this section, we analyze the performance of the GSEMO for maximizing monotone non-submodular functions with a size constraint. We prove the polynomial-time approximation guarantee of the GSEMO w.r.t. each notion of approximate submodularity (i.e., Definitions ?- ?), respectively. Theorem ? gives the approximation guarantee of the GSEMO w.r.t. the -approximately submodular notion in Definition ?, where denotes the optimal function value. Note that this approximation guarantee obtained by the GSEMO reaches the best known one, which was previously obtained by the standard greedy algorithm [22]. Inspired by the proof of Theorem 2 in [10], our proof idea is to follow the behavior of the standard greedy algorithm, which iteratively adds one element with the currently largest improvement on . The proof relies on the property of in Lemma ?, that for any , there always exists another element, the inclusion of which can bring an improvement on roughly proportional to the current distance to the optimum.

Let be an optimal solution, i.e., . We denote the elements in by , where . Then, we have

where the first inequality is by the monotonicity of , and the last inequality is derived by Definition ? since is -approximately submodular. Let . Then, we have

We divide the optimization process into two phases: (1) starts from an initial random solution and finishes upon finding the special solution ; (2) starts after phase (1) and finishes upon finding a solution with the desired approximation guarantee. As in the analysis of phase (1) in the proof of Theorem ?, we know that the population will contain the solution after iterations in expectation.

For phase (2), we consider a quantity , which is defined as

That is, denotes the maximum value of such that in the population , there exists a solution with and . We analyze the expected number of iterations until , which implies that there exists one solution in satisfying that and . That is, the desired approximation guarantee is reached.

The current value of is at least 0, since the population contains the solution , which will always be kept in once generated. Assume that currently . Let be a corresponding solution with the value , i.e., and . It is easy to see that cannot decrease because cleaning from (line 9 of Algorithm ?) implies that is weakly dominated by a newly generated solution , which must satisfy that and . By Lemma ?, we know that flipping one specific 0 bit of (i.e., adding a specific element) can generate a new solution , which satisfies that . Then, we have

where the last inequality is derived by . Since , will be included into ; otherwise, must be dominated by one solution in (line 8 of Algorithm ?), and this implies that has already been larger than , which contradicts the assumption . After including , . Thus, can increase by at least 1 in one iteration with probability at least , where is a lower bound on the probability of selecting in line 4 of Algorithm ? and is the probability of flipping a specific bit of while keeping other bits unchanged in line 5. Then, it needs at most expected number of iterations to increase . Thus, after at most iterations in expectation, must have reached .

As in the proof of Theorem ?, we know that . Thus, by summing up the expected running time of the two phases, we get that the expected running time of the GSEMO for finding a solution with and is .

Theorem ? gives the approximation guarantee of the GSEMO w.r.t. the submodularity ratio in Definition ?. Note that it was proved that the standard greedy algorithm can find a subset with and [6]. Thus, Theorem ? shows that the GSEMO can achieve nearly this best known approximation guarantee. The proof of Theorem ? is similar to that of Theorem ?. The main difference is that a different inductive inequality on is used in the definition of the quantity , due to the change of the adopted notion of approximate submodularity. For concise illustration, we will mainly show the difference in the proof of Theorem ?.

The proof is similar to that of Theorem ?. We use a different , which is defined as

It is easy to verify that implies that the desired approximation guarantee is reached, since there must exist one solution in satisfying that and . Assume that currently and is a corresponding solution, i.e., and . We then only need to show that flipping one specific 0 bit of can generate a new solution with . By Lemma ?, we know that flipping one specific 0 bit of can generate a new solution , which satisfies that . Then, we have

where the second inequality is by , and the last inequality is by , which can be easily derived from and decreasing with . Thus, the theorem holds.

Theorem ? gives the approximation guarantee of the GSEMO w.r.t. the -approximately submodular ratio in Definition ?. Note that the standard greedy algorithm obtains the best known approximation guarantee [17]. Compared with this, the approximation guarantee of the GSEMO is slightly better, since

The proof of Theorem ? is also similar to that of Theorem ?, except that a different inductive inequality on is used in the definition of the quantity , due to the change of the adopted notion of approximate submodularity.

Let be an optimal solution, i.e., . Let . Since is -approximately submodular as in Definition ?, we use to denote one corresponding submodular function satisfying that for all , . Then, we have

where the first inequality is by the submodularity of (i.e., Eq. (Equation 2)), the second inequality is by for any , and the last inequality is by the definition of and . By reordering the terms, we get

Since and (where the last inequality is by the monotonicity of ), we have

By reordering the terms, the lemma holds.

The proof is similar to that of Theorem ?. We use a different , which is defined as

It is easy to verify that implies that the desired approximation guarantee is reached. Assume that currently and is a corresponding solution, i.e., and . We then only need to show that flipping one specific 0 bit of can generate a new solution with . By Lemma ?, we know that flipping one specific 0 bit of can generate a new solution , which satisfies that . Then, we have

where the last inequality is by . Thus, the theorem holds.

Thus, we have shown that for each notion of approximate submodularity, the GSEMO can always obtain the best known approximation guarantee. Particularly, when the objective function is submodular, the quantity characterizing the approximately submodular degree in Definitions ?- ? becomes , and , respectively; thus the approximation guarantees obtained by the GSEMO in Theorems ?- ? all become , which is optimal in general [25], and also consistent with the previous result in [10].

## 5 Conclusion

This paper theoretically studies the approximation performance of EAs for two general classes of combinatorial optimization problems: maximizing non-monotone submodular functions without constraints and maximizing monotone non-submodular functions with a size constraint. We prove that in polynomial expected running time, a simple multi-objective EA called GSEMO can achieve a constant approximation guarantee of nearly for the first problem class, and can achieve the best known approximation guarantee for the second problem class. These results, together with the previous ones for the problem class of maximizing monotone submodular functions with matroid constraints [10], provide a theoretical explanation for the empirically good performance of EAs in various applications. A question for future work is whether simple single-objective EAs such as the (1+1)-EA can achieve good approximation guarantees on the two studied problem classes. Most of the existing studies considered problems where the objective function is a set function. We will also try to investigate the performance of EAs for optimizing functions over the integer lattice or continuous domains.

### References

[1] **An 0.828-approximation algorithm for the uncapacitated facility location problem.** A. A. Ageev and M. I. Sviridenko. Discrete Applied Mathematics.

[2] Theory of Randomized Search Heuristics: Foundations and Recent Developments. A. Auger and B. Doerr.

[3] Evolutionary Algorithms in Theory and Practice: Evolution Strategies, Evolutionary Programming, Genetic Algorithms. T. Bäck.

[4] **A tight linear time (1/2)-approximation for unconstrained submodular maximization.** N. Buchbinder, M. Feldman, J. Naor, and R. Schwartz. SIAM Journal on Computing.

[5] **Robust influence maximization.** W. Chen, T. Lin, Z. Tan, M. Zhao, and X. Zhou. In *Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’16)*, pages 795–804, San Francisco, CA, 2016.

[6] **Submodular meets spectral: Greedy algorithms for subset selection, sparse approximation and dictionary selection.** A. Das and D. Kempe. In *Proceedings of the 28th International Conference on Machine Learning (ICML’11)*, pages 1057–1064, Bellevue, WA, 2011.

[7] **A rigorous complexity analysis of the (1+1) evolutionary algorithm for separable functions with Boolean inputs.** S. Droste, T. Jansen, and I. Wegener. Evolutionary Computation.

[8] **On the analysis of the (1+1) evolutionary algorithm.** S. Droste, T. Jansen, and I. Wegener. Theoretical Computer Science.

[9] **Maximizing non-monotone submodular functions.** U. Feige, V. S. Mirrokni, and J. Vondrak. SIAM Journal on Computing.

[10] **Maximizing submodular functions under matroid constraints by evolutionary algorithms.** T. Friedrich and F. Neumann. Evolutionary Computation.

[11] **Approximating covering problems by randomized search heuristics using multi-objective models.** T. Friedrich, J. He, N. Hebbinghaus, F. Neumann, and C. Witt. Evolutionary Computation.

[12] **Expected runtimes of a simple multi-objective evolutionary algorithm.** O. Giel. In *Proceedings of the 2003 IEEE Congress on Evolutionary Computation (CEC’03)*, pages 1918–1925, Canberra, Australia, 2003.

[13] **Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming.** M. X. Goemans and D. P. Williamson. Journal of the ACM.

[14] **Some optimal inapproximability results.** J. Håstad. Journal of the ACM.

[15] **Drift analysis and average time complexity of evolutionary algorithms.** J. He and X. Yao. Artificial Intelligence.

[16] **Robust influence maximization.** X. He and D. Kempe. In *Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’16)*, pages 885–894, San Francisco, CA, 2016.

[17] **Maximization of approximately submodular functions.** T. Horel and Y. Singer. In *Advances in Neural Information Processing Systems 29 (NIPS’16)*, pages 3045–3053, Barcelona, Spain, 2016.

[18] **Automated antenna design with evolutionary algorithms.** G. S. Hornby, A. Globus, D. S. Linden, and J. D. Lohn. In *Proceedings of the 2006 American Institute of Aeronautics and Astronautics Conference on Space*, pages 19–21, San Jose, CA, 2006.

[19] **Evolutionary algorithms - how to cope with plateaus of constant fitness and when to reject strings of the same fitness.** T. Jansen and I. Wegener. IEEE Transactions on Evolutionary Computation.

[20] **What’s AI done for me lately? Genetic programming’s human-competitive results.** J. R. Koza, M. A. Keane, and M. J. Streeter. IEEE Intelligent Systems.

[21] **Submodular dictionary selection for sparse representation.** A. Krause and V. Cevher. In *Proceedings of the 27th International Conference on Machine Learning (ICML’10)*, pages 567–574, Haifa, Israel, 2010.

[22] **Near-optimal sensor placements in Gaussian processes: Theory, efficient algorithms and empirical studies.** A. Krause, A. Singh, and C. Guestrin. Journal of Machine Learning Research.

[23] **An evolutionary clustering algorithm for gene expression microarray data analysis.** P. C. Ma, K. C. Chan, X. Yao, and D. K. Chiu. IEEE Transactions on Evolutionary Computation.

[24] **A survey of multiobjective evolutionary algorithms for data mining: Part I.** A. Mukhopadhyay, U. Maulik, S. Bandyopadhyay, and C. A. C. Coello. IEEE Transactions on Evolutionary Computation.

[25] **Best algorithms for approximating the maximum of a submodular set function.** G. L. Nemhauser and L. A. Wolsey. Mathematics of Operations Research.

[26] **An analysis of approximations for maximizing submodular set functions – I.** G. L. Nemhauser, L. A. Wolsey, and M. L. Fisher. Mathematical Programming.

[27] **Expected runtimes of a simple evolutionary algorithm for the multi-objective minimum spanning tree problem.** F. Neumann. European Journal of Operational Research.

[28] **Minimum spanning trees made easier via multi-objective optimization.** F. Neumann and I. Wegener. Natural Computing.

[29] **Randomized local search, evolutionary algorithms, and the minimum spanning tree problem.** F. Neumann and I. Wegener. Theoretical Computer Science.

[30] Bioinspired Computation in Combinatorial Optimization: Algorithms and Their Computational Complexity. F. Neumann and C. Witt.

[31] **Computing minimum cuts by randomized search heuristics.** F. Neumann, J. Reichel, and M. Skutella. Algorithmica.

[32] **An analysis on recombination in multi-objective evolutionary optimization.** C. Qian, Y. Yu, and Z.-H. Zhou. Artificial Intelligence.

[33] **On constrained Boolean Pareto optimization.** C. Qian, Y. Yu, and Z.-H. Zhou. In *Proceedings of the 24th International Joint Conference on Artificial Intelligence (IJCAI’15)*, pages 389–395, Buenos Aires, Argentina, 2015.

[34] **Parallel Pareto optimization for subset selection.** C. Qian, J.-C. Shi, Y. Yu, K. Tang, and Z.-H. Zhou. In *Proceedings of the 25th International Joint Conference on Artificial Intelligence (IJCAI’16)*, pages 1939–1945, New York, NY, 2016.

[35] **Evolutionary algorithms and matroid optimization problems.** J. Reichel and M. Skutella. Algorithmica.

[36] **Worst-case and average-case approximations by simple randomized search heuristics.** C. Witt. In *Proceedings of the 22nd Annual Symposium on Theoretical Aspects of Computer Science (STACS’05)*, pages 44–56, Stuttgart, Germany, 2005.

[37] **On the approximation ability of evolutionary optimization with application to minimum set cover.** Y. Yu, X. Yao, and Z.-H. Zhou. Artificial Intelligence.