Estimating Approximation Errors of Elitist Evolutionary Algorithms


Cong Wang, School of Science, Wuhan University of Technology, Wuhan 430070, China
Yu Chen, School of Science, Wuhan University of Technology, Wuhan 430070, China, ychen@whut.edu.cn
Jun He, School of Science and Technology, Nottingham Trent University, Nottingham NG11 8NS, UK
Chengwang Xie, School of Computer and Information Engineering, Nanning Normal University, Nanning 530299, China
Abstract

When evolutionary algorithms (EAs) are unlikely to locate precise global optimal solutions with satisfactory performance, it is important to substitute an alternative theoretical routine for the analysis of hitting time/running time. To narrow the gap between theory and application, this paper is dedicated to the analysis of the approximation error of EAs. First, we propose a general result on the upper and lower bounds of approximation errors. Then, several case studies are performed to present the routine of error analysis, and the theoretical results show close connections between approximation errors and the eigenvalues of transition matrices. The analysis validates the applicability of error analysis, demonstrates the significance of the estimation results, and exhibits its potential for the theoretical analysis of elitist EAs.

Keywords:
Evolutionary Algorithm · Approximation Error · Markov Chain · Budget Analysis.

1 Introduction

For theoretical analysis, the convergence performance of evolutionary algorithms (EAs) is widely evaluated by the expected first hitting time (FHT) and the expected running time (RT) [1], which quantify the respective numbers of iterations and function evaluations (FEs) needed to hit the global optimal solutions. General methods for the estimation of FHT/RT have been proposed via the theory of Markov chains [2, 3], drift analysis [4, 5], switch analysis [6], and their applications combined with the partition of fitness levels [7], etc.

Although popularly employed in theoretical analysis, a simple application of FHT/RT is not practical when the optimal solutions are difficult to hit. One of these "difficult" cases is the optimization of continuous problems. The optimal set of a continuous optimization problem is usually a zero-measure set, which cannot be hit in finite time by generally designed EAs, and so the FHT/RT could be infinite in most cases. A remedy for this difficulty is to take a positive-measure set as the destination of the population iteration; thus it is natural to take an approximation set for a given precision as the hitting set in FHT/RT estimation [8, 9, 10, 11]. Another "difficult" case is the optimization of NP-complete (NPC) problems, which cannot be solved by EAs in polynomial FHT/RT. For this case, it is much more interesting to investigate the quality of the approximate solutions obtained in polynomial FHT/RT. In this way, researchers have estimated the approximation ratios of the solutions that EAs can obtain for various NPC combinatorial optimization problems in polynomial expected FHT/RT [12, 13, 14, 15, 16, 17].

However, the aforementioned methods could be impractical once we have little information about the global optima of the investigated problems, because it is then difficult to "guess" what threshold can result in a polynomial FHT/RT. Since the approximation error after a given number of iterations is usually employed to numerically compare the performances of EAs, some researchers have tried to analyze EAs by theoretically estimating the expected approximation error $e_t$. Rudolph [18] proved that under the condition $E[e_{t+1} \mid e_t] \le c\, e_t$ with $c \in (0,1)$, the sequence $\{e_t,\, t = 0, 1, \dots\}$ converges in mean geometrically to 0, that is, $E[e_t] = O(c^t)$. He and Lin [19] studied the geometric average convergence rate of the error sequence $\{e_t,\, t = 0, 1, \dots\}$, defined by $R_t = 1 - \left(E[e_t]/e_0\right)^{1/t}$. Starting from this definition, it is straightforward to claim that $E[e_t] = (1 - R_t)^t\, e_0$.

A work closely related to the analysis of the approximation error is the fixed budget analysis proposed by Jansen and Zarges [20, 21], who aimed to bound the fitness value $f(x_t)$ within a fixed time budget $t$. However, Jansen and Zarges did not present general results for an arbitrary time budget $t$: in fixed budget analysis, a bound of the approximation error holds for some small budget $t$ but might be invalid for a large one. He [22] made a first attempt to obtain an analytic expression of the approximation error for a class of elitist EAs. He proved that if the transition matrix associated with an EA is an upper triangular matrix with unique diagonal entries, then for any $t \ge 1$ the approximation error can be expressed as $e_t = \sum_{k=1}^{L} c_k \lambda_k^t$, where $\lambda_1, \dots, \lambda_L$ are eigenvalues of the transition submatrix and $c_1, \dots, c_L$ are corresponding coefficients. He et al. [23] also demonstrated the possibility of approximation estimation by estimating the one-step convergence rate $e_{t+1}/e_t$; however, this was not sufficient to validate its applicability to other problems, because only two cases with trivial convergence rates were investigated.

This paper is dedicated to the estimation of the approximation error for any iteration number $t$. We make a first attempt to perform a general error analysis of EAs and demonstrate its feasibility by case studies. The rest of this paper is organized as follows. Section 2 presents some preliminaries. In Section 3, a general result on the upper and lower bounds of the approximation error is proposed, and some case studies are performed in Section 4. Finally, Section 5 concludes this paper.

2 Preliminaries

In this paper, we consider a combinatorial optimization problem

$$\max_{x \in S} f(x), \qquad (1)$$

where the objective function $f(x)$ takes only finitely many values on the finite feasible set $S$. Denote its optimal solution as $x^*$, and the corresponding objective value as $f^* = f(x^*)$. The quality of a feasible solution $x$ is quantified by its approximation error $e(x) = |f(x) - f^*|$. Since there are only finitely many solutions of problem (1), there exist finitely many feasible values of $e(x)$, denoted as $e^{(0)} < e^{(1)} < \dots < e^{(L)}$. Obviously, the minimum value $e^{(0)}$ is the approximation error of the optimal solution $x^*$, and so $e^{(0)}$ takes the value 0. We say that $x$ is located at status $i$ if $e(x) = e^{(i)}$; in the OneMax problem studied in Section 4, for instance, the status of a solution is simply its number of 0-bits. Then, there are in total $L + 1$ statuses for all feasible solutions. Status 0 consists of all optimal solutions and is called the optimal status; the other statuses are the non-optimal statuses.

Suppose that a feasible solution of problem (1) is coded as a bit-string $x \in \{0,1\}^n$, and the elitist EA described in Algorithm 1 is employed to solve it. When the one-bit mutation is employed, the algorithm is called a random local search (RLS); if the bitwise mutation is used, it is named a (1+1) evolutionary algorithm ((1+1)EA). Then, the error sequence $\{e_t = e(x_t),\, t = 0, 1, \dots\}$ is a Markov chain. Assisted by the initial probability distribution of the individual status, $\mathbf{q}^{(0)} = \left(q_0^{(0)}, q_1^{(0)}, \dots, q_L^{(0)}\right)^T$, the evolution process of the (1+1) elitist EA can be depicted by the transition probability matrix

$$\mathbf{P} = \left(p_{i,j}\right)_{(L+1) \times (L+1)}, \qquad (2)$$

where $p_{i,j}$ is the probability to transfer from status $j$ to status $i$.

1: counter $t \leftarrow 0$;
2: randomly initialize a solution $x_0$;
3: while the stopping criterion is not satisfied do
4:    generate a new candidate solution $y_t$ from $x_t$ by mutation;
5:    set $x_{t+1} = y_t$ if $f(y_t) > f(x_t)$; otherwise, let $x_{t+1} = x_t$;
6:    $t \leftarrow t + 1$;
7: end while
Algorithm 1: A framework of the elitist EA
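For concreteness, the following Python sketch (our illustration; the function names and parameter values are ours, not from the original paper) instantiates Algorithm 1 with the two mutation operators mentioned above, using OneMax as an example fitness function.

```python
import random

def onemax(x):
    """Example fitness: number of 1-bits (to be maximized)."""
    return sum(x)

def one_bit_mutation(x):
    """RLS: flip exactly one uniformly chosen bit."""
    y = x[:]
    i = random.randrange(len(y))
    y[i] = 1 - y[i]
    return y

def bitwise_mutation(x):
    """(1+1)EA: flip each bit independently with probability 1/n."""
    n = len(x)
    return [1 - b if random.random() < 1.0 / n else b for b in x]

def elitist_ea(f, n, mutate, budget):
    """Algorithm 1: keep the offspring only if it is strictly better."""
    x = [random.randint(0, 1) for _ in range(n)]   # line 2
    for _ in range(budget):                        # line 3
        y = mutate(x)                              # line 4
        if f(y) > f(x):                            # line 5, elitist selection
            x = y
    return x

# usage: RLS and (1+1)EA on OneMax with n = 20 and a budget of 1000 iterations
print(onemax(elitist_ea(onemax, 20, one_bit_mutation, 1000)))
print(onemax(elitist_ea(onemax, 20, bitwise_mutation, 1000)))
```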

Since the elitist selection is employed, the probability to transfer from status $j$ to status $i$ is zero whenever $i > j$. Then, the transition probability matrix $\mathbf{P}$ is upper triangular, and we can partition it as

$$\mathbf{P} = \begin{pmatrix} 1 & \mathbf{p}^T \\ \mathbf{0} & \mathbf{R} \end{pmatrix}, \qquad (3)$$

where $\mathbf{p} = \left(p_{0,1}, \dots, p_{0,L}\right)^T$, $\mathbf{0} = (0, \dots, 0)^T$, and

$$\mathbf{R} = \begin{pmatrix} p_{1,1} & p_{1,2} & \cdots & p_{1,L} \\ & p_{2,2} & \cdots & p_{2,L} \\ & & \ddots & \vdots \\ & & & p_{L,L} \end{pmatrix}. \qquad (4)$$

Thus, the expected approximation error at iteration $t$ is

$$e_t = \mathbf{e}^T \mathbf{R}^t \mathbf{q}_0, \qquad (5)$$

where $\mathbf{e} = \left(e^{(1)}, \dots, e^{(L)}\right)^T$, $\mathbf{q}_0 = \left(q_1^{(0)}, \dots, q_L^{(0)}\right)^T$, and $\mathbf{R}$ is the submatrix representing the transition probabilities between non-optimal statuses [24]. Because the sum of each column of $\mathbf{P}$ is equal to 1, the first row is confirmed by $p_{0,j} = 1 - \sum_{i=1}^{L} p_{i,j}$, and in the following we only consider the transition submatrix $\mathbf{R}$ for the estimation of the approximation error.
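As a quick numerical illustration (our own sketch; the submatrix, error levels and initial distribution below are made-up toy values), the expected error (5) can be evaluated directly with NumPy:

```python
import numpy as np

# toy 3-status example: upper-triangular submatrix R (column sums <= 1),
# increasing error levels e, and an initial distribution q0
R = np.array([[0.9, 0.2, 0.1],
              [0.0, 0.7, 0.2],
              [0.0, 0.0, 0.6]])
e = np.array([1.0, 2.0, 3.0])
q0 = np.array([0.2, 0.3, 0.5])

def expected_error(R, e, q0, t):
    """Equation (5): e_t = e^T R^t q0."""
    return e @ np.linalg.matrix_power(R, t) @ q0

for t in (0, 1, 10, 100):
    print(t, expected_error(R, e, q0, t))
```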

According to the shape of $\mathbf{R}$, we can further divide the search processes of elitist EAs into two different categories.

  1. Step-by-step search: if the transition probabilities satisfy

    $$p_{i,j} = 0, \quad \forall\, i < j - 1, \qquad (6)$$

    it is called a step-by-step search. Then, the transition submatrix is

    $$\mathbf{R} = \begin{pmatrix} p_{1,1} & p_{1,2} & & \\ & p_{2,2} & \ddots & \\ & & \ddots & p_{L-1,L} \\ & & & p_{L,L} \end{pmatrix}, \qquad (7)$$

    which means the elitist EA cannot transfer between non-optimal statuses that are not adjacent to each other.

  2. Multi-step search: if there exists some $i < j - 1$ such that $p_{i,j} > 0$, we call it a multi-step search. A multi-step search can transfer between non-adjacent statuses, which endows it with a better global exploration ability and, probably, a better convergence speed.

Note that this classification is problem-dependent, because the statuses depend on the problem to be optimized. So, the RLS could be either a step-by-step search or a multi-step search. However, the (1+1)EA is necessarily a multi-step search, because the bitwise mutation can jump between any two statuses. When $\mathbf{p}$ in (3) is non-zero, at least one column sum of $\mathbf{R}$ is less than 1, which means the process can jump from at least one non-optimal status directly to the optimal status. In contrast, a step-by-step search represented by (7) must satisfy $p_{j-1,j} = 1 - p_{j,j}$ for $j = 2, \dots, L$.

3 Estimation of General Approximation Bounds

3.1 General Bounds of the Step-by-step Search

Let $\mathbf{R}$ be the submatrix (7) of a step-by-step search. Its eigenvalues are its diagonal entries

$$\lambda_j = p_{j,j}, \quad j = 1, \dots, L, \qquad (8)$$

each of which represents the probability of remaining at the present status after one iteration. Then, it is natural to expect that the greater the eigenvalues are, the slower the step-by-step search converges. Inspired by this idea, we can estimate general bounds for a step-by-step search by enlarging or reducing its eigenvalues. The derivation of the general bounds is based on the following lemma.

Lemma 1

Denote

$$\left(f_1^{(t)}, \dots, f_L^{(t)}\right) = \mathbf{e}^T \mathbf{R}^t, \qquad (9)$$

where $\mathbf{R}$ is the submatrix (7) of a step-by-step search and $\mathbf{e} = \left(e^{(1)}, \dots, e^{(L)}\right)^T$. Then, $f_j^{(t)}$ is monotonically increasing with respect to $\lambda_i$, $\forall\, i, j \in \{1, \dots, L\}$.

Proof

This lemma can be proved by mathematical induction. Throughout, set $e^{(0)} = f_0^{(t)} = 0$ and recall from (7) that $p_{j-1,j} = 1 - \lambda_j$.

  1. When $t = 1$, we have

    $$f_j^{(1)} = e^{(j-1)}\left(1 - \lambda_j\right) + e^{(j)} \lambda_j = e^{(j-1)} + \left(e^{(j)} - e^{(j-1)}\right)\lambda_j. \qquad (10)$$

    Note that $\lambda_j$ is not greater than 1 because it is an element of the probability transition matrix $\mathbf{P}$. Then, from the fact that $e^{(j-1)} < e^{(j)}$, we conclude that $f_j^{(1)}$ is monotonically increasing with respect to $\lambda_i$, $\forall\, i, j \in \{1, \dots, L\}$. Meanwhile, (10) also implies that

    $$f_1^{(1)} \le f_2^{(1)} \le \dots \le f_L^{(1)}, \qquad f_j^{(1)} \le e^{(j)}. \qquad (11)$$

  2. Suppose that when $t = k$, $f_j^{(k)}$ is monotonically increasing with respect to $\lambda_i$ for all $i, j \in \{1, \dots, L\}$, and it holds that

    $$f_1^{(k)} \le f_2^{(k)} \le \dots \le f_L^{(k)}, \qquad f_j^{(k)} \le e^{(j)}. \qquad (12)$$

    First, the monotonicity indicated by (12) implies that

    $$f_j^{(k)} - f_{j-1}^{(k)} \ge 0, \quad j = 1, \dots, L. \qquad (13)$$

    Meanwhile, according to equation (9) we know $\mathbf{e}^T \mathbf{R}^{k+1} = \left(\mathbf{e}^T \mathbf{R}^{k}\right)\mathbf{R}$, that is,

    $$f_j^{(k+1)} = f_{j-1}^{(k)}\left(1 - \lambda_j\right) + f_j^{(k)}\lambda_j = f_{j-1}^{(k)} + \left(f_j^{(k)} - f_{j-1}^{(k)}\right)\lambda_j. \qquad (14)$$

    Combining (12), (13) and (14), we know that $f_j^{(k+1)}$ is monotonically increasing with respect to $\lambda_j$, because its coefficient $f_j^{(k)} - f_{j-1}^{(k)}$ in (14) is nonnegative; it is also monotonically increasing with respect to every $\lambda_i$ with $i \ne j$, because it is a convex combination of $f_{j-1}^{(k)}$ and $f_j^{(k)}$, both of which are monotonically increasing with respect to $\lambda_i$. Moreover, since $f_j^{(k+1)}$ lies between $f_{j-1}^{(k)}$ and $f_j^{(k)}$, the invariants (12) are preserved: $f_j^{(k+1)} \le e^{(j)}$ and $f_{j-1}^{(k+1)} \le f_{j-1}^{(k)} \le f_j^{(k+1)}$. Hence the claim also holds for $t = k + 1$.

In conclusion, $f_j^{(t)}$ is monotonically increasing with respect to $\lambda_i$, $\forall\, i, j \in \{1, \dots, L\}$. ∎

Denote

$$\lambda_{\max} = \max_{1 \le j \le L} \lambda_j, \qquad \lambda_{\min} = \min_{1 \le j \le L} \lambda_j. \qquad (15)$$

If we enlarge or shrink all eigenvalues of $\mathbf{R}$ to the maximum value $\lambda_{\max}$ and the minimum value $\lambda_{\min}$, respectively, we get two transition submatrices $\mathbf{R}_{\max}$ and $\mathbf{R}_{\min}$ of the form (7), where $\mathbf{R}_{\max}$ has diagonal entries $\lambda_{\max}$ and superdiagonal entries $1 - \lambda_{\max}$, and $\mathbf{R}_{\min}$ has diagonal entries $\lambda_{\min}$ and superdiagonal entries $1 - \lambda_{\min}$. Then, $\mathbf{R}_{\max}$ depicts a search process converging more slowly than the one $\mathbf{R}$ represents, and $\mathbf{R}_{\min}$ is the transition submatrix of a process converging faster than what $\mathbf{R}$ represents.

Theorem 3.1

The expected approximation error of a step-by-step search represented by (5) and (7) is bounded by

$$\mathbf{e}^T \mathbf{R}_{\min}^t\, \mathbf{q}_0 \;\le\; e_t \;\le\; \mathbf{e}^T \mathbf{R}_{\max}^t\, \mathbf{q}_0. \qquad (16)$$
Proof

Note that

$$e_t = \mathbf{e}^T \mathbf{R}^t \mathbf{q}_0 = \left(f_1^{(t)}, \dots, f_L^{(t)}\right) \mathbf{q}_0,$$

where $\mathbf{q}_0$ is a non-zero vector composed of non-negative components. Then, by Lemma 1 we can conclude that $e_t$ is also monotonically increasing with respect to $\lambda_i$, $\forall\, i \in \{1, \dots, L\}$. Since $\mathbf{R}_{\min}$ and $\mathbf{R}_{\max}$ are obtained from $\mathbf{R}$ by shrinking all eigenvalues to $\lambda_{\min}$ and enlarging them to $\lambda_{\max}$, respectively, we get the result (16). ∎

Theorem 3.1 provides a general result on the upper and lower bounds of the approximation error. From the above arguments we can figure out that the lower bound and the upper bound are achieved once the transition submatrix $\mathbf{R}$ degenerates to $\mathbf{R}_{\min}$ and $\mathbf{R}_{\max}$, respectively. That is to say, they are indeed the "best" results achievable for general bounds. Recall that $\lambda_j = p_{j,j}$: starting from status $j$, $\lambda_j$ is the probability that the (1+1) elitist EA stays at status $j$ after one iteration. Then, the greater $\lambda_j$ is, the harder the step-by-step search transfers to the sub-level status $j - 1$. So, the performance of a step-by-step search depicted by $\mathbf{R}$ would, in the worst case, not be worse than that of $\mathbf{R}_{\max}$; meanwhile, it would not be better than that of $\mathbf{R}_{\min}$, which constitutes a bottleneck for improving the performance of the step-by-step search. A numerical check of (16) on a toy submatrix is sketched below.
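The following sketch (ours; the bidiagonal submatrix is a toy example) checks the sandwich bound (16) numerically:

```python
import numpy as np

def bidiagonal(diag):
    """Step-by-step submatrix (7): given diagonal, superdiagonal 1 - diag[j]."""
    R = np.diag(diag).astype(float)
    for j in range(1, len(diag)):
        R[j - 1, j] = 1.0 - diag[j]
    return R

lam = np.array([0.9, 0.5, 0.8, 0.6])   # staying probabilities (eigenvalues)
e = np.array([1.0, 2.0, 3.0, 4.0])     # increasing error levels
q0 = np.array([0.1, 0.2, 0.3, 0.4])    # initial distribution

R = bidiagonal(lam)
R_max = bidiagonal(np.full(4, lam.max()))   # all eigenvalues enlarged
R_min = bidiagonal(np.full(4, lam.min()))   # all eigenvalues shrunk

def err(M, t):
    return e @ np.linalg.matrix_power(M, t) @ q0

for t in (1, 5, 20, 100):
    lower, exact, upper = err(R_min, t), err(R, t), err(R_max, t)
    assert lower <= exact <= upper          # inequality (16)
    print(t, round(lower, 4), round(exact, 4), round(upper, 4))
```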

3.2 General Bounds of the Multi-step Search

Denoting the transition submatrix of a multi-step search as

$$\mathbf{R} = \begin{pmatrix} p_{1,1} & p_{1,2} & \cdots & p_{1,L} \\ & p_{2,2} & \cdots & p_{2,L} \\ & & \ddots & \vdots \\ & & & p_{L,L} \end{pmatrix}, \qquad (17)$$

we can bound its approximation error by defining two auxiliary transition matrices: a step-by-step matrix

$$\mathbf{R}_u = \begin{pmatrix} p_{1,1} & 1 - p_{2,2} & & \\ & p_{2,2} & \ddots & \\ & & \ddots & 1 - p_{L,L} \\ & & & p_{L,L} \end{pmatrix} \qquad (18)$$

and a diagonal matrix

$$\mathbf{R}_l = \mathrm{diag}\left(p_{1,1}, p_{2,2}, \dots, p_{L,L}\right). \qquad (19)$$
Lemma 2

Let $\mathbf{R}$, $\mathbf{R}_u$ and $\mathbf{R}_l$ be the transition matrices defined by (17), (18) and (19), respectively. Given any nonnegative vector $\mathbf{e} = \left(e^{(1)}, \dots, e^{(L)}\right)^T$ satisfying $e^{(1)} \le e^{(2)} \le \dots \le e^{(L)}$ and the corresponding initial distribution $\mathbf{q}_0$, it holds that

$$\mathbf{e}^T \mathbf{R}_l^t\, \mathbf{q}_0 \;\le\; \mathbf{e}^T \mathbf{R}^t\, \mathbf{q}_0 \;\le\; \mathbf{e}^T \mathbf{R}_u^t\, \mathbf{q}_0. \qquad (20)$$
Proof

It is trivial to prove that $\mathbf{e}^T \mathbf{R}_l^t \mathbf{q}_0 \le \mathbf{e}^T \mathbf{R}^t \mathbf{q}_0$. Because $\mathbf{R}_l$ retains part of the non-zero elements of $\mathbf{R}$, every entry of $\mathbf{R}_l^t$ is a partial sum of the corresponding entry of $\mathbf{R}^t$. Since all elements included in $\mathbf{e}$, $\mathbf{q}_0$ and the matrix powers are nonnegative, it holds that $\mathbf{e}^T \mathbf{R}_l^t \mathbf{q}_0 \le \mathbf{e}^T \mathbf{R}^t \mathbf{q}_0$.

Moreover, the second inequality can be proved by mathematical induction. Denote

$$\left(f_1^{(t)}, \dots, f_L^{(t)}\right) = \mathbf{e}^T \mathbf{R}^t, \qquad (21)$$

$$\left(g_1^{(t)}, \dots, g_L^{(t)}\right) = \mathbf{e}^T \mathbf{R}_u^t, \qquad (22)$$

where $f_0^{(t)} = g_0^{(t)} = 0$. Combining the ordering $e^{(1)} \le \dots \le e^{(L)}$ with the column sums $\sum_{i=1}^{j} p_{i,j} \le 1$, we know that for any nonnegative vector $\mathbf{v}$ with $v_0 = 0$ and $v_1 \le \dots \le v_L$,

$$\sum_{i=1}^{j} v_i\, p_{i,j} \;\le\; v_{j-1}\left(1 - p_{j,j}\right) + v_j\, p_{j,j}, \quad j = 1, \dots, L. \qquad (23)$$

  1. When $t = 1$, (21), (22) and (23) (applied with $\mathbf{v} = \mathbf{e}$) imply that $f_j^{(1)} \le g_j^{(1)}$, $j = 1, \dots, L$.

  2. Assume that (20) holds when $t = k$, i.e., $f_j^{(k)} \le g_j^{(k)}$ for all $j$. Then, since $f_j^{(k+1)} = \sum_{i=1}^{j} f_i^{(k)} p_{i,j}$, the assumption implies that

    $$f_j^{(k+1)} \le \sum_{i=1}^{j} g_i^{(k)}\, p_{i,j}. \qquad (24)$$

    Meanwhile, because $\mathbf{R}_u$ is of the step-by-step form (7), the proof of Lemma 1 shows that $g_1^{(k)} \le \dots \le g_L^{(k)}$. Then, applying (23) with $\mathbf{v} = \left(g_1^{(k)}, \dots, g_L^{(k)}\right)$ yields

    $$\sum_{i=1}^{j} g_i^{(k)}\, p_{i,j} \;\le\; g_{j-1}^{(k)}\left(1 - p_{j,j}\right) + g_j^{(k)}\, p_{j,j} = g_j^{(k+1)}.$$

    Combining it with (24), we can conclude that $f_j^{(k+1)} \le g_j^{(k+1)}$. So, the result also holds for $t = k + 1$.

In conclusion, since $\mathbf{q}_0$ is nonnegative, it holds that $\mathbf{e}^T \mathbf{R}^t \mathbf{q}_0 \le \mathbf{e}^T \mathbf{R}_u^t \mathbf{q}_0$. ∎

Theorem 3.2

The approximation error of the multi-step search defined by (17) is bounded by

$$\sum_{j=1}^{L} e^{(j)}\, p_{j,j}^{\,t}\, q_j^{(0)} \;\le\; e_t \;\le\; \mathbf{e}^T \mathbf{R}_{\max}^t\, \mathbf{q}_0, \qquad (25)$$

where $\mathbf{R}_{\max}$ is the step-by-step matrix whose diagonal entries all equal $\lambda_{\max} = \max_{1 \le j \le L} p_{j,j}$, as in Theorem 3.1.

Proof

From Lemma 2 we know that

$$\mathbf{e}^T \mathbf{R}_l^t\, \mathbf{q}_0 \;\le\; e_t \;\le\; \mathbf{e}^T \mathbf{R}_u^t\, \mathbf{q}_0. \qquad (26)$$

Moreover, applying Theorem 3.1 to the step-by-step search represented by $\mathbf{R}_u$, we know that

$$\mathbf{e}^T \mathbf{R}_u^t\, \mathbf{q}_0 \;\le\; \mathbf{e}^T \mathbf{R}_{\max}^t\, \mathbf{q}_0. \qquad (27)$$

Combining (26) and (27) with $\mathbf{e}^T \mathbf{R}_l^t \mathbf{q}_0 = \sum_{j=1}^{L} e^{(j)} p_{j,j}^{\,t} q_j^{(0)}$, we get the theorem proved. ∎
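A numerical check of (25)-(27) on a randomly generated multi-step submatrix (our own sketch; the random construction is arbitrary) reads:

```python
import numpy as np

rng = np.random.default_rng(0)
L = 5

# random upper-triangular multi-step submatrix with column sums below 1
R = np.triu(rng.uniform(0.0, 1.0, (L, L)))
R /= R.sum(axis=0) * rng.uniform(1.1, 1.5, L)

e = np.arange(1, L + 1, dtype=float)      # increasing error levels
q0 = np.full(L, 1.0 / L)
diag = np.diag(R).copy()

R_l = np.diag(diag)                       # diagonal matrix (19)
R_u = np.diag(diag)                       # step-by-step matrix (18)
R_max = np.diag(np.full(L, diag.max()))   # Theorem 3.1 matrix for R_u
for j in range(1, L):
    R_u[j - 1, j] = 1.0 - diag[j]
    R_max[j - 1, j] = 1.0 - diag.max()

def err(M, t):
    return e @ np.linalg.matrix_power(M, t) @ q0

for t in (1, 5, 25):
    assert err(R_l, t) <= err(R, t) <= err(R_u, t) <= err(R_max, t)
    print(t, err(R_l, t), err(R, t), err(R_max, t))
```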

3.3 Analytic Expressions of General Bounds

Theorems 3.1 and 3.2 show that computation of general bounds for approximation errors is based on the computability of and , where and are defined by (15) and (19), respectively.

  1. Analytic Expression of : The submatrix can be split as , where

    Because multiplication of and is commutative, the binomial theorem [25] holds and we have

    (28)

    where

    (29)

    Note that is a nilpotent matrix of index  111In linear algebra, a nilpotent matrix is a square matrix such that for some positive integer . The smallest such is called the index of  [26]., and

    (30)

    Then, from (29), (30) and (28) we know

    1. if ,

      (31)
    2. if ,

      (32)
  2. Analytic Expression of : For the diagonal matrix , it holds that

    (33)
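As a sanity check (our own sketch; the size, eigenvalue and exponent are toy values), the closed form (31)/(32) can be compared against a direct matrix power:

```python
import numpy as np
from math import comb

L, lam, t = 5, 0.7, 8

# R_max = lam * I + N, with 1 - lam on the superdiagonal
N = np.diag(np.full(L - 1, 1.0 - lam), k=1)
R_max = lam * np.eye(L) + N

# closed form (28)-(32): only the first min(t, L-1) superdiagonals survive
closed = sum(comb(t, k) * lam ** (t - k) * (1.0 - lam) ** k * np.eye(L, k=k)
             for k in range(min(t, L - 1) + 1))

assert np.allclose(closed, np.linalg.matrix_power(R_max, t))
print(closed.round(4))
```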

4 Case-by-case Estimation of Approximation Error

In Section 3, general bounds of the approximation error were obtained by ignoring most of the elements of the submatrix $\mathbf{R}$. Thus, these bounds are general but not necessarily tight. In this section, we perform several case-by-case studies to demonstrate a feasible routine of error analysis, where the RLS and the (1+1)EA are employed to solve the popular OneMax problem and the Needle-in-Haystack problem.

Problem 1 (OneMax)

$$\max\; f(x) = \sum_{i=1}^{n} x_i, \qquad x = (x_1, \dots, x_n) \in \{0,1\}^n.$$

Problem 2 (Needle-in-Haystack)

$$\max\; f(x) = \prod_{i=1}^{n} x_i, \qquad x = (x_1, \dots, x_n) \in \{0,1\}^n.$$

4.1 Error Estimation for the OneMax Problem

Application of RLS on the unimodal OneMax problem generates a step-by-step search, the transition submatrix of which is

(34)

Eigenvalues and corresponding eigenvectors of are

(35)
Theorem 4.1

When the initial solution is uniformly sampled from $\{0,1\}^n$, the expected approximation error of the RLS for the OneMax problem is

$$e_t = \frac{n}{2}\left(1 - \frac{1}{n}\right)^t. \qquad (36)$$

Proof

Denote $\Lambda = \mathrm{diag}\left(\lambda_1, \dots, \lambda_n\right)$ and $\mathbf{V} = \left(\boldsymbol{\eta}_1, \dots, \boldsymbol{\eta}_n\right)$. Then we know that

$$\mathbf{R}\,\mathbf{V} = \mathbf{V}\,\Lambda. \qquad (37)$$

$\mathbf{R}$ has $n$ distinct eigenvalues, and so it can be diagonalized as $\mathbf{R} = \mathbf{V}\Lambda\mathbf{V}^{-1}$ [27]. Then, we have

$$e_t = \mathbf{e}^T \mathbf{V} \Lambda^t \mathbf{V}^{-1} \mathbf{q}_0, \qquad (38)$$

where $\mathbf{e} = (1, 2, \dots, n)^T$ and $\mathbf{q}_0 = \frac{1}{2^n}\left(\binom{n}{1}, \dots, \binom{n}{n}\right)^T$. Direct computation with the eigenvectors in (35) yields

$$\mathbf{e}^T \mathbf{V} = (1, 0, \dots, 0), \qquad \left(\mathbf{V}^{-1}\mathbf{q}_0\right)_1 = \sum_{j=1}^{n} j \binom{n}{j} \frac{1}{2^n} = \frac{n}{2}, \qquad (39)$$

where the first identity follows from $\sum_{i=1}^{j} i\,(-1)^{j-i}\binom{j}{i} = 0$ for $j \ge 2$, and the second from the fact that the first row of $\mathbf{V}^{-1}$ is $\mathbf{e}^T$. Substituting (39) into (38) we get the result

$$e_t = \lambda_1^t \cdot \frac{n}{2} = \frac{n}{2}\left(1 - \frac{1}{n}\right)^t. \qquad ∎$$
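Theorem 4.1 can be cross-checked numerically (our own sketch) by comparing the exact Markov-chain computation (5) with the closed form (36):

```python
import numpy as np
from math import comb

n = 12

# RLS on OneMax: status j = number of 0-bits, bidiagonal submatrix (34)
R = np.zeros((n, n))
for j in range(1, n + 1):
    R[j - 1, j - 1] = 1 - j / n          # stay at status j
    if j < n:
        R[j - 1, j] = (j + 1) / n        # move from status j+1 to j
e = np.arange(1, n + 1, dtype=float)     # error of status j is j
q0 = np.array([comb(n, j) / 2 ** n for j in range(1, n + 1)])  # uniform init

for t in (0, 1, 10, 50):
    exact = e @ np.linalg.matrix_power(R, t) @ q0
    formula = n / 2 * (1 - 1 / n) ** t   # Theorem 4.1, equation (36)
    assert abs(exact - formula) < 1e-10
    print(t, exact, formula)
```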

Theorem 4.2

When the initial solution is uniformly sampled from $\{0,1\}^n$, the expected approximation error of the (1+1)EA for the OneMax problem is bounded from above by

$$e_t \le \frac{n}{2}\left(1 - \frac{1}{n}\left(1 - \frac{1}{n}\right)^{n-1}\right)^t. \qquad (40)$$

Proof

According to the definition of the population status, the status index $j$ is the number of 0-bits in $x$. Once one of the 0-bits is flipped to a 1-bit and all the other bits keep unchanged, the generated solution will be accepted, and the status transfers from $j$ to $j - 1$. Recalling that the probability that this case happens is $\frac{j}{n}\left(1 - \frac{1}{n}\right)^{n-1}$, we know that

$$p_{j-1,j} \ge \frac{j}{n}\left(1 - \frac{1}{n}\right)^{n-1}, \quad j = 1, \dots, n.$$

Denote $c = \left(1 - \frac{1}{n}\right)^{n-1}$ and

$$\tilde{\mathbf{R}} = \begin{pmatrix} 1 - \frac{c}{n} & \frac{2c}{n} & & \\ & 1 - \frac{2c}{n} & \ddots & \\ & & \ddots & \frac{nc}{n} \\ & & & 1 - \frac{nc}{n} \end{pmatrix},$$

and by Lemma 2 and the monotonicity stated in Lemma 1 we know that

$$e_t \le \mathbf{e}^T \tilde{\mathbf{R}}^t \mathbf{q}_0. \qquad (41)$$

With $n$ distinct eigenvalues, $\tilde{\mathbf{R}}$ can be diagonalized:

$$\tilde{\mathbf{R}} = \mathbf{V} \Lambda \mathbf{V}^{-1}, \qquad (42)$$

where $\Lambda = \mathrm{diag}\left(\tilde\lambda_1, \dots, \tilde\lambda_n\right)$, $\mathbf{V} = \left(\boldsymbol{\eta}_1, \dots, \boldsymbol{\eta}_n\right)$, and $\tilde\lambda_j$ and $\boldsymbol{\eta}_j$ are the eigenvalues and the corresponding eigenvectors:

$$\tilde\lambda_j = 1 - \frac{jc}{n}, \qquad \left(\boldsymbol{\eta}_j\right)_i = \begin{cases} (-1)^{j-i}\binom{j}{i}, & i \le j, \\ 0, & i > j. \end{cases} \qquad (43)$$

It is obvious that $\mathbf{V}$ is invertible, and its inverse is the upper triangular matrix

$$\mathbf{V}^{-1} = \left(\binom{j}{i}\right)_{n \times n}, \qquad (44)$$

with entries $\binom{j}{i}$ for $i \le j$. Similar to the result illustrated in (39), we know that

$$\mathbf{e}^T \mathbf{V} = (1, 0, \dots, 0), \qquad \left(\mathbf{V}^{-1} \mathbf{q}_0\right)_1 = \frac{n}{2}. \qquad (45)$$

Combining (41), (42), (43), (44) and (45) we know that

$$e_t \le \tilde\lambda_1^t \cdot \frac{n}{2} = \frac{n}{2}\left(1 - \frac{1}{n}\left(1 - \frac{1}{n}\right)^{n-1}\right)^t. \qquad ∎$$
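A quick Monte Carlo check of the bound (40) (our own sketch; `runs` controls the sampling accuracy):

```python
import random

def one_plus_one_ea_error(n, t):
    """Number of 0-bits after t iterations of the (1+1)EA on OneMax."""
    x = [random.randint(0, 1) for _ in range(n)]
    fx = sum(x)
    for _ in range(t):
        y = [1 - b if random.random() < 1 / n else b for b in x]
        fy = sum(y)
        if fy > fx:
            x, fx = y, fy
    return n - fx

n, t, runs = 20, 50, 20000
mc = sum(one_plus_one_ea_error(n, t) for _ in range(runs)) / runs
bound = n / 2 * (1 - (1 / n) * (1 - 1 / n) ** (n - 1)) ** t   # equation (40)
print(mc, "<=", bound)
```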

4.2 Error Estimation for the Needle-in-Haystack Problem

The landscape of the Needle-in-Haystack problem is a plateau: all solutions except the global optimum share the same function value 0, and only the global optimum $x^* = (1, \dots, 1)$ has the non-zero function value 1. Hence the approximation error of every non-optimal solution equals 1, and the status of a solution $x$ is defined by the total number of its 0-bits.

Theorem 4.3

When the initial solution is uniformly sampled from $\{0,1\}^n$, the expected approximation error of the RLS for the Needle-in-Haystack problem is bounded by

$$1 - \frac{n+1}{2^n} \;\le\; e_t \;\le\; 1 - \frac{n+1}{2^n} + \frac{n}{2^n}\left(1 - \frac{1}{n}\right)^t. \qquad (46)$$

Proof

When the RLS is employed to solve the Needle-in-Haystack problem, any mutation that does not generate the global optimum produces a solution of equal fitness, which is rejected by the elitist selection of Algorithm 1. The optimum can only be reached from status 1, with probability $\frac{1}{n}$, so the transition submatrix is the diagonal matrix

$$\mathbf{R} = \mathrm{diag}\left(1 - \frac{1}{n},\, 1,\, \dots,\, 1\right). \qquad (47)$$

Then,

$$e_t = \mathbf{e}^T \mathbf{R}^t \mathbf{q}_0 = \left(1 - \frac{1}{n}\right)^t q_1^{(0)} + \sum_{j=2}^{n} q_j^{(0)}. \qquad (48)$$

Since

$$q_1^{(0)} = \frac{n}{2^n}, \qquad \sum_{j=2}^{n} q_j^{(0)} = 1 - \frac{n+1}{2^n},$$

we can conclude (46). ∎
Theorem 4.3 indicates that both the upper bound and the lower bound converge to the positive constant $1 - \frac{n+1}{2^n}$ when $t \to \infty$, which implies that the RLS cannot converge in mean to the global optimal solution of the Needle-in-Haystack problem. Because the RLS only searches adjacent statuses and only accepts strictly better solutions, it cannot reach the optimal status once the initial solution is not located at status 0 or status 1.
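To visualize the stagnation predicted by Theorem 4.3, the following sketch (ours) evaluates $e_t$ with the diagonal submatrix (47); the error levels off at $1 - (n+1)/2^n$ instead of decaying to zero:

```python
import numpy as np
from math import comb

n = 10

# RLS on Needle-in-Haystack: only status 1 can reach the optimum
R = np.eye(n)
R[0, 0] = 1 - 1 / n
e = np.ones(n)                                    # every non-optimal error is 1
q0 = np.array([comb(n, j) / 2 ** n for j in range(1, n + 1)])

limit = 1 - (n + 1) / 2 ** n
for t in (0, 10, 100, 1000):
    et = e @ np.linalg.matrix_power(R, t) @ q0
    print(t, et, ">=", limit)
```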

Theorem 4.4

When the initial solution is uniformly sampled from $\{0,1\}^n$, the expected approximation error of the (1+1)EA for the Needle-in-Haystack problem is bounded by

$$\left(1 - \frac{1}{2^n}\right)\left(1 - \frac{1}{n}\left(1 - \frac{1}{n}\right)^{n-1}\right)^t \;\le\; e_t \;\le\; \left(1 - \frac{1}{2^n}\right)\left(1 - \frac{1}{n^n}\right)^t. \qquad (49)$$

Proof

When the (1+1)EA is employed to solve the Needle-in-Haystack problem, a mutation is accepted only if it generates the global optimum; starting from status $j$, this happens with probability $s_j = \left(\frac{1}{n}\right)^j \left(1 - \frac{1}{n}\right)^{n-j}$, the probability that the $j$ 0-bits are flipped while the $n - j$ 1-bits keep unchanged. So the transition probability submatrix is the diagonal matrix

$$\mathbf{R} = \mathrm{diag}\left(1 - s_1,\, 1 - s_2,\, \dots,\, 1 - s_n\right). \qquad (50)$$

Then,

$$e_t = \sum_{j=1}^{n} \left(1 - s_j\right)^t q_j^{(0)}. \qquad (51)$$

Since

$$\frac{1}{n^n} = s_n \le s_j \le s_1 = \frac{1}{n}\left(1 - \frac{1}{n}\right)^{n-1}, \qquad \sum_{j=1}^{n} q_j^{(0)} = 1 - \frac{1}{2^n},$$

we can conclude (49). ∎

5 Conclusion

To make theoretical results more instructive for algorithm development and application, this paper investigates the performance of EAs by estimating the approximation error for any iteration budget $t$. The general bounds included in Theorems 3.1 and 3.2 demonstrate that the performance bottlenecks of EAs are decided by the maximum and minimum eigenvalues of the transition submatrix $\mathbf{R}$. Meanwhile, Theorems 4.1, 4.2, 4.3 and 4.4 present estimations of the approximation error of the RLS and the (1+1)EA for two benchmark problems, which shows that our analysis scheme is applicable to elitist EAs regardless of the shapes of their transition matrices. Moreover, the estimation results demonstrate that approximation errors are closely related to the eigenvalues of the transition matrices, which provides useful information for the performance improvement of EAs. Our future work is to further apply error analysis to real-world combinatorial problems to show its applicability in the theoretical analysis of EAs.

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grants 61303028 and 61763010, in part by the Guangxi "BAGUI Scholar" Program, and in part by the Science and Technology Major Project of Guangxi under Grant AA18118047.

References

  • [1] Oliveto, P., He, J., Yao, X.: Time complexity of evolutionary algorithms for combinatorial optimization: A decade of results. International Journal of Automation and Computing 4(3), 281–293 (2007)
  • [2] He, J., Yao, X.: Towards an analytic framework for analysing the computation time of evolutionary algorithms. Artificial Intelligence 145(1-2), 59–97 (2003)
  • [3] Ding, L., Yu, J.: Some techniques for analyzing time complexity of evolutionary algorithms. Transactions of the Institute of Measurement and Control 34(6), 755–766 (2012)
  • [4] He, J., Yao, X.: Drift analysis and average time complexity of evolutionary algorithms. Artificial intelligence 127(1), 57–85 (2001)
  • [5] Doerr, B., Johannsen, D., Winzen, C.: Multiplicative drift analysis. Algorithmica 64(4), 673–697 (2012)
  • [6] Yu, Y., Qian, C., Zhou, Z.H.: Switch analysis for running time analysis of evolutionary algorithms. IEEE Transactions on Evolutionary Computation 19(6), 777–792 (2014)
  • [7] Droste, S., Jansen, T., Wegener, I.: On the analysis of the (1+1) evolutionary algorithm. Theoretical Computer Science 276(1-2), 51–81 (2002)
  • [8] Chen, Y., Zou, X., He, J.: Drift conditions for estimating the first hitting times of evolutionary algorithms. International Journal of Computer Mathematics 88(1), 37–50 (2011)
  • [9] Huang, H., Xu, W., Zhang, Y., Lin, Z., Hao, Z.: Runtime analysis for continuous (1+1) evolutionary algorithm based on average gain model. Scientia Sinica Informationis 44(6), 811–824 (2014)
  • [10] Zhang, Y., Huang, H., Hao, Z., Hu, G.: First hitting time analysis of continuous evolutionary algorithms based on average gain. Cluster Computing 19(3), 1323–1332 (2016)
  • [11] Akimoto, Y., Auger, A., Glasmachers, T.: Drift theory in continuous search spaces: expected hitting time of the (1+1)-ES with 1/5 success rule. In: Proceedings of the Genetic and Evolutionary Computation Conference. pp. 801–808. ACM (2018)
  • [12] Yu, Y., Yao, X., Zhou, Z.H.: On the approximation ability of evolutionary optimization with application to minimum set cover. Artificial Intelligence 180–181, 20–33 (2012)
  • [13] Lai, X., Zhou, Y., He, J., Zhang, J.: Performance analysis of evolutionary algorithms for the minimum label spanning tree problem. IEEE Transactions on Evolutionary Computation 18(6), 860–872 (2014)
  • [14] Zhou, Y., Lai, X., Li, K.: Approximation and parameterized runtime analysis of evolutionary algorithms for the maximum cut problem. IEEE Transactions on Cybernetics 45(8), 1491–1498 (2015)
  • [15] Zhou, Y., Zhang, J., Wang, Y.: Performance analysis of the (1+1) evolutionary algorithm for the multiprocessor scheduling problem. Algorithmica 73(1), 21–41 (2015)
  • [16] Xia, X., Zhou, Y., Lai, X.: On the analysis of the (1+1) evolutionary algorithm for the maximum leaf spanning tree problem. International Journal of Computer Mathematics 92(10), 2023–2035 (2015)
  • [17] Peng, X., Zhou, Y., Xu, G.: Approximation performance of ant colony optimization for the TSP(1,2) problem. International Journal of Computer Mathematics 93(10), 1683–1694 (2016)
  • [18] Rudolph, G.: Convergence rates of evolutionary algorithms for a class of convex objective functions. Control and Cybernetics 26, 375–390 (1997)
  • [19] He, J., Lin, G.: Average convergence rate of evolutionary algorithms. IEEE Transactions on Evolutionary Computation 20(2), 316–321 (2016)
  • [20] Jansen, T., Zarges, C.: Fixed budget computations: A different perspective on run time analysis. In: Proceedings of the 14th Annual Conference on Genetic and Evolutionary Computation. pp. 1325–1332. ACM (2012)
  • [21] Jansen, T., Zarges, C.: Performance analysis of randomised search heuristics operating with a fixed budget. Theoretical Computer Science 545, 39–58 (2014)
  • [22] He, J.: An analytic expression of relative approximation error for a class of evolutionary algorithms. In: Proceedings of 2016 IEEE Congress on Evolutionary Computation (CEC 2016). pp. 4366–4373 (July 2016)
  • [23] He, J., Jansen, T., Zarges, C.: Unlimited budget analysis. In: Proceedings of the Genetic and Evolutionary Computation Conference Companion. pp. 427–428. ACM (2019)
  • [24] He, J., Chen, Y., Zhou, Y.: A theoretical framework of approximation error analysis of evolutionary algorithms. arXiv preprint arXiv:1810.11532 (2018)
  • [25] Aigner, M.: Combinatorial theory. Springer Science & Business Media (2012)
  • [26] Herstein, I.N.: Topics in algebra. John Wiley & Sons (2006)
  • [27] Lay, D.C.: Linear algebra and its applications. Pearson Education (2003)