Whale swarm algorithm with iterative counter for multimodal function optimization

Bing Zeng zengbing2016@126.com Xinyu Li Liang Gao gaoliang@mail.hust.edu.cn Yuyan Zhang School of Mechanical Science and Engineering, Huazhong University of Science and Technology, 1037 Luoyu Road, Wuhan, China
Abstract

Most real-world optimization problems come with multiple global and/or local optima. Therefore, a growing number of niching metaheuristic algorithms, which aim to find multiple optima in a single run, have been developed to solve such multimodal optimization problems. However, two difficulties urgently need to be solved for most existing niching metaheuristic algorithms: how to set the optimal values of niching parameters for different optimization problems, and how to jump out of local optima efficiently. These two difficulties largely limit their practicality. Based on the Whale Swarm Algorithm (WSA) we proposed previously, this paper presents a new multimodal optimizer named WSA with Iterative Counter (WSA-IC) to address these two difficulties. On the one hand, WSA-IC improves the iteration rule of the original WSA for multimodal optimization, which removes the need to specify different values of the attenuation coefficient for different problems in order to form multiple subpopulations, without introducing any niching parameter. On the other hand, WSA-IC enables the identification of extreme points during iterations relying on two new parameters (i.e., the stability threshold and the fitness threshold), so as to jump out of the located extreme points. The proposed WSA-IC is compared with several niching metaheuristic algorithms on the CEC2015 niching benchmark test functions and five additional classical multimodal functions with high dimensions. The experimental results demonstrate that WSA-IC statistically outperforms other niching metaheuristic algorithms on most test functions.

keywords:
Whale swarm algorithm, multimodal optimization, metaheuristic algorithm, niching

1 Introduction

Most real-world optimization problems are multimodal gao2017jaya (); tasgetiren2017iterated (); lin2016effective (); ismkhan2017effective (); yi2016parallel (); dahi2016quantum (), i.e., their objective functions often contain multiple global or local optima. If a traditional numerical method is used to solve a multimodal optimization problem, it has to be run many times, locating a different optimum in each run, to pick out the global optima, which takes a lot of time and effort. In this scenario, using metaheuristic algorithms, whether evolutionary algorithms (EAs) or swarm-based algorithms, to solve these problems has become a hot research topic, as they are easy to implement and can converge to solutions that are as good as possible. However, many metaheuristic algorithms, such as the Genetic Algorithm (GA), Particle Swarm Optimization (PSO) and Differential Evolution (DE), are primarily designed to search for a single global optimum, whereas it is desirable to locate multiple global optima so that engineers can choose the most appropriate solution. In addition, some metaheuristic algorithms easily fall into local optima. Therefore, many techniques have been proposed to help metaheuristic algorithms find as many global optima as possible. These techniques are commonly known as niching methods li2010niching (), which are committed to promoting and maintaining the formation of multiple stable subpopulations within a single population for locating multiple optima. Some representative niching methods include crowding de1975analysis (), fitness sharing goldberg1987genetic (), clustering yin1993fast (), restricted tournament selection harik1995finding (), parallelization bessaou2000island (), speciation deb1989investigation () and population topologies kennedy2002population (). Several of them are presented below; more references and discussions about niching methods can be found in reference li2016seeking ().

Crowding was first proposed by De Jong de1975analysis () to preserve genetic diversity, so as to improve the global search ability of the algorithm for locating multiple optima. In the crowding method, an offspring with better fitness replaces the most similar individual from a subset (i.e., a crowd) of the population. The similarity is generally measured by distance, including Hamming distance for binary encoding and Euclidean distance for real-valued encoding thomsen2004multimodal (), which means that the smaller the distance between two individuals is, the higher their similarity is. The individuals of the subset are randomly selected from the population, and the size of the subset is a user-specified parameter called the crowding factor (CF) that is often set to 2 or 3. However, low CF values will lead to replacement errors, i.e., the offspring replaces another individual with little similarity, which lowers the population diversity. To reduce replacement errors, deterministic crowding mahfoud1992crowding () and probabilistic crowding mengshoel1999probabilistic () were proposed, and Thomsen suggested simply setting CF equal to the population size thomsen2004multimodal ().
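To make the replacement step concrete, here is a minimal C++ sketch of basic crowding for real-valued minimization problems; the function name crowdingReplace, the Individual struct and the data layout are our own illustration, not code from the cited works.

```cpp
#include <cmath>
#include <random>
#include <vector>

// One candidate solution and its fitness (minimization is assumed here).
struct Individual {
    std::vector<double> x;
    double fitness;
};

static double euclidean(const std::vector<double>& a, const std::vector<double>& b) {
    double s = 0.0;
    for (std::size_t k = 0; k < a.size(); ++k) s += (a[k] - b[k]) * (a[k] - b[k]);
    return std::sqrt(s);
}

// Basic crowding: the offspring competes with the most similar member of a
// randomly drawn subset of size CF and replaces it only if the offspring is better.
void crowdingReplace(std::vector<Individual>& pop, const Individual& offspring,
                     std::size_t CF, std::mt19937& rng) {
    std::uniform_int_distribution<std::size_t> pick(0, pop.size() - 1);
    std::size_t closest = pick(rng);
    double bestDist = euclidean(pop[closest].x, offspring.x);
    for (std::size_t k = 1; k < CF; ++k) {          // draw the rest of the crowd
        std::size_t idx = pick(rng);
        double d = euclidean(pop[idx].x, offspring.x);
        if (d < bestDist) { bestDist = d; closest = idx; }
    }
    if (offspring.fitness < pop[closest].fitness) pop[closest] = offspring;
}
```

Setting CF to the population size, as suggested by Thomsen, approximately makes the offspring compete with its nearest neighbor in the whole population, which is the behavior used by CDE in Section 2.2.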

Goldberg and Richardson goldberg1987genetic () proposed the fitness sharing mechanism, which uses sharing functions to permit the formation of multiple subpopulations. When using this method, the shared fitness of all the individuals needs to be calculated according to Eq. 1 before executing the operators of an algorithm, as the operations on the individuals are based on their shared fitness.

$$f'_i = \frac{f_i}{m_i} \qquad (1)$$

where $f_i$ and $f'_i$ are the original fitness and shared fitness of individual $i$ respectively; $m_i$ is the niche count of individual $i$ with respect to the other individuals, and is formulated as $m_i=\sum_{j=1}^{N} sh(d_{ij})$, wherein $N$ is the population size and $sh(d_{ij})$ is the sharing function over individuals $i$ and $j$, which is calculated as follows.

$$sh(d_{ij}) = \begin{cases} 1-\left(\dfrac{d_{ij}}{\sigma_{share}}\right)^{\alpha}, & d_{ij} < \sigma_{share} \\ 0, & \text{otherwise} \end{cases} \qquad (2)$$

where $\alpha$ is a constant, usually set to 1; $d_{ij}$ is the distance between individuals $i$ and $j$; and $\sigma_{share}$ is the sharing distance, which is usually set to the value of the peak radius. However, this method assumes that all the peaks have equal height and width. Obviously, prior knowledge of the fitness landscape is required for this method to set the value of $\sigma_{share}$.
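As a concrete illustration of Eqs. 1 and 2, the following C++ sketch computes the shared fitness of a population; the function name sharedFitness and the data layout are ours, and the maximization convention of the original sharing scheme is kept.

```cpp
#include <cmath>
#include <vector>

// Shared fitness following Eqs. (1)-(2): f'_i = f_i / m_i, where the niche
// count m_i sums the triangular sharing function over the whole population.
std::vector<double> sharedFitness(const std::vector<std::vector<double>>& pos,
                                  const std::vector<double>& rawFitness,
                                  double sigmaShare, double alpha = 1.0) {
    const std::size_t n = pos.size();
    std::vector<double> shared(n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        double m = 0.0;                               // niche count m_i
        for (std::size_t j = 0; j < n; ++j) {
            double d = 0.0;                           // Euclidean distance d_ij
            for (std::size_t k = 0; k < pos[i].size(); ++k)
                d += (pos[i][k] - pos[j][k]) * (pos[i][k] - pos[j][k]);
            d = std::sqrt(d);
            if (d < sigmaShare)                       // Eq. (2)
                m += 1.0 - std::pow(d / sigmaShare, alpha);
        }
        shared[i] = rawFitness[i] / m;                // Eq. (1); m >= 1 since d_ii = 0
    }
    return shared;
}
```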

Speciation deb1989investigation () is another popular niching technique, which is used to form parallel subpopulations, i.e., species, according to the similarity between individuals. The similarity is also measured by distance, such as Euclidean distance. This niching technique employs one user-specified parameter called the species distance to divide the population into a set of species. The detailed procedure of forming species in every generation is as follows, assuming a maximization problem. The first step is to find the species seeds that dominate their own species. Firstly, an empty seed set is created. The individuals are sorted in decreasing order of fitness, and the first individual of the sorted population is added to the seed set. Then, the remaining individuals are examined one by one in order, and it is determined whether each of them lies within a distance of half the species distance of any species seed in the set; if not, it is added to the seed set. After all the individuals have been traversed, the seed set contains all the species seeds. Next comes the step of adding the individuals to their corresponding species: for each species seed, the individuals that lie within half the species distance of it are added to its species, and an individual that has already been added to a species is skipped. Although the speciation method is able to divide the population into multiple subpopulations, it has a major shortcoming: its parameter, i.e., the species distance, is hard to set precisely for different optimization problems. In view of this, inspired by the Multinational Evolutionary Algorithm ursem1999multinational (), Stoean et al. stoean2007disburdening () proposed the “detect-multimodal” mechanism to establish species, which removes the need for the distance parameter. The “detect-multimodal” mechanism utilizes a set of interior points between two individuals to detect whether there is a valley between them in the fitness landscape, so as to determine whether the two individuals track different extreme points. If the fitness values of all the interior points are better than the worse fitness value of the two individuals, they are considered to follow the same extreme point, i.e., to be located in the same peak of the fitness landscape, as shown in Fig. 1(a). On the contrary, if there exists at least one interior point whose fitness value is worse than the worse fitness value of the two individuals, at least one valley is considered to exist between them, i.e., they are considered to track different extreme points, as shown in Fig. 1(b). Those individuals following the same extreme point are added to the same species. Although the “detect-multimodal” mechanism does not utilize a species distance to divide the population into multiple species, it employs another parameter called the “number of gradations”, i.e., the number of interior points, which also depends on the problem to be solved.

(a) two individuals follow the same extreme point
(b) two individuals follow different extreme points
Fig. 1: Sketch maps of the “detect-multimodal” mechanism.
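The check sketched in Fig. 1 can be written compactly; the following C++ sketch (our own illustration, keeping the maximization convention of the figure) samples a fixed number of interior points on the segment between two individuals and reports whether a valley separates them.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// "Detect-multimodal"-style test: if every interior point on the segment a-b is
// at least as good as the worse of the two endpoints, a and b are assumed to
// follow the same extreme point; otherwise a valley lies between them.
bool followSamePeak(const std::vector<double>& a, const std::vector<double>& b,
                    const std::function<double(const std::vector<double>&)>& f,
                    int gradations) {
    const double worse = std::min(f(a), f(b));        // maximization convention
    std::vector<double> p(a.size());
    for (int g = 1; g <= gradations; ++g) {
        const double t = static_cast<double>(g) / (gradations + 1);
        for (std::size_t k = 0; k < a.size(); ++k) p[k] = a[k] + t * (b[k] - a[k]);
        if (f(p) < worse) return false;               // valley detected (Fig. 1(b))
    }
    return true;                                      // same peak (Fig. 1(a))
}
```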

Thus it can be seen that some niching methods need to set parameters that require prior knowledge of the fitness landscape in order to divide the population into multiple subpopulations. However, for some real-world optimization problems, such prior knowledge of the fitness landscape is very difficult or almost impossible to obtain li2010niching (). Therefore, these niching methods are difficult to apply to real-world optimization problems. In this paper, a new multimodal optimization algorithm called Whale Swarm Algorithm with Iterative Counter (WSA-IC), based on our preliminary work in Zeng2017 (), is proposed. By improving the iteration rule of the original WSA for multimodal optimization, WSA-IC removes the need to specify optimal parameter values for different problems to form multiple subpopulations, without introducing any niching parameter. In addition, WSA-IC enables the identification of extreme points to jump out of the located extreme points during iterations.

The remainder of this paper is organized as follows. A brief overview of multimodal optimization algorithms is presented in Section 2. Section 3 describes the proposed WSA-IC in detail. The experimental setup, containing the algorithms compared, the test functions, the parameter settings and the performance metrics, is presented in Section 4. The next section presents the experimental results and analysis to evaluate WSA-IC. The last section presents the conclusions and topics for further research.

2 Related works

As more and more niching methods have been put forward, a large number of multimodal optimization algorithms combining metaheuristic algorithms with these niching methods have been proposed. In this section, a brief overview of multimodal optimization algorithms is presented. According to whether prior knowledge of the fitness landscape is needed, these multimodal optimization algorithms are classified into prior-knowledge based methods and non-prior-knowledge based methods. More references and discussions about multimodal optimization algorithms can be found in references li2016seeking (); das2011real ().

2.1 Prior knowledge based methods

Species Conserving Genetic Algorithm (SCGA) was proposed by Li et al. li2002species () by introducing the speciation and species conservation techniques into the classical GA. In each iteration, the current population is partitioned into multiple subpopulations (i.e., species) using the speciation technique deb1989investigation () before executing the genetic operators. Moreover, after executing the genetic operators, all the species seeds are either conserved to the next generation or replaced by better members of the same species, which contributes significantly to the preservation of the global and local optima found so far. Li et al. showed that the additional overhead of SCGA caused by these two techniques is not higher than that introduced by GA with sharing goldberg1987genetic (), and that SCGA performs far better than the Genetic Algorithm with Sharing (SGA) in the success rate of locating the global optima.

Li li2005efficient () proposed the Species-based DE (SDE) algorithm to solve multimodal optimization problems by introducing the speciation technique. In the SDE algorithm, when the number of members of a species is less than a predefined value, the algorithm randomly generates new individuals within the radius of the species seed until the species size reaches the predefined value. Then, each identified species runs the conventional DE algorithm by itself. In addition, if the fitness of an offspring is the same as that of its species seed, the offspring is replaced by a randomly generated new individual. These two mechanisms improve the efficiency of the SDE algorithm significantly.

The speciation technique was also introduced into conventional PSO by Li li2004adaptively () to solve multimodal optimization problems. In each iteration of Species-based PSO (SPSO), after the population is divided into multiple species and the species seeds are determined, each species seed is assigned to its member individuals as the neighborhood best. Then, each individual updates its position according to the iterative equations of velocity and position in PSO. The experimental results showed that SPSO is comparable to or better than SNGA beasley1993sequential (), SCGA and NichePSO brits2002niching () over a set of multimodal functions.

Stoean et al. stoean2007disburdening () proposed the Topological Species Conservation (TSC) algorithm, which utilizes the “detect-multimodal” mechanism to remove the need for a distance parameter when selecting species seeds and forming species. In the TSC algorithm, all the individuals that track the same extreme point are in the same species, which corresponds to the real structure of the optimization function. The species seeds can also be conserved to the next generation. However, the TSC algorithm needs excessive fitness evaluations in the seed selection procedure, especially when the number of interior points gets larger. To improve the computational efficiency of the TSC algorithm, i.e., to save fitness evaluations, Stoean et al. stoean2010multimodal () proposed the Topological Species Conservation Version 2 (TSC2) algorithm. In the TSC2 algorithm, the current free individual chooses the seeds one by one in ascending order of distance to it to perform the “detect-multimodal” procedure, until the return value is true or this individual is considered a new seed, because the species dominated by a closer seed is more likely to track the same peak as the current free individual. With this method, the TSC2 algorithm saves considerable fitness evaluations. In addition, when the optimization function has a large number of local optima, the TSC algorithm might split the population into too many seeds that would only be conserved to the next generation, which would significantly reduce its search ability. TSC2 therefore introduces a maximum number of seeds to guarantee the algorithm’s search ability.

Deb and Saha deb2010finding () first converted a single-objective multimodal optimization problem into a bi-objective optimization problem, such that the multiple global and local optima of the original problem become members of the weak Pareto-optimal set of the transformed problem. One of the objectives of the transformed problem is the objective function of the original problem. With regard to the other objective, a gradient-based approach is first employed, based on the property that the derivative of the objective function at a minimum point is equal to zero. However, the derivatives of the objective function at maximum and saddle points are also equal to zero, and the objective functions of some optimization problems may be non-differentiable at the minimum points. Therefore, more pragmatic neighborhood-count based approaches are developed for establishing the second objective, which is assigned the count of the number of neighboring solutions that are better than the current solution in terms of their objective function values. During iterations, the non-dominated ranks of different solutions rely on two parameters that are used to distinguish two optima.

2.2 Non-prior knowledge based methods

Thomsen thomsen2004multimodal () proposed the Crowding-based DE (CDE) algorithm by introducing the crowding method into conventional DE for multimodal function optimization. In the CDE algorithm, the similarity of two individuals is measured by the Euclidean distance between them. The fitness value of an offspring is only compared with that of the most similar individual in the current population, and the offspring replaces the most similar individual if it has better fitness. This replacement scheme helps the population maintain diversity in the search space, which contributes greatly to the location of multiple optima. Thomsen showed that the CDE algorithm performs better than a fitness sharing DE variant over a group of multimodal functions.

History-based topological speciation (HTS) was proposed by Li and Tang li2015history () and incorporated into CDE with the species conservation technique for multimodal optimization. HTS is a parameter-free speciation method that captures the landscape topography relying exclusively on the search history, which avoids the additional sampling and function evaluations associated with existing topology-based methods. The experimental results showed that HTS performs better than existing topology-based methods when the function evaluation budget is limited.

Liang et al. liang2006comprehensive () proposed the Comprehensive Learning Particle Swarm Optimizer (CLPSO) for multimodal function optimization. In CLPSO, all particles’ best previous positions can potentially be used to guide a particle’s flight, i.e., each dimension of a particle may learn from the corresponding dimension of a different particle’s best previous position. The velocity updating equation of CLPSO is shown as follows.

$$V_i^d = w \cdot V_i^d + c \cdot rand_i^d \cdot \left(pbest_{f_i(d)}^d - X_i^d\right) \qquad (3)$$

where $w$ is an inertia weight, $c$ is an acceleration constant, $X_i^d$ denotes the $d$-th dimension of particle $i$’s position, $V_i^d$ represents the $d$-th dimension of particle $i$’s velocity, and $rand_i^d$ is a random number between 0 and 1 associated with dimension $d$. For particle $i$, a set $f_i=[f_i(1), f_i(2), \ldots, f_i(D)]$, where $D$ denotes the dimension of the fitness function, is built to store the serial numbers of the particles whose best previous positions particle $i$ should learn from at the corresponding dimensions. $pbest_{f_i(d)}^d$ denotes the $d$-th dimension of particle $f_i(d)$’s best previous position. The values of the elements in $f_i$ depend on the learning probability $Pc_i$, which can take different values for different particles. For example, a random number is generated for assigning $f_i(d)$; if this random number is greater than $Pc_i$, $f_i(d)$ is assigned the index $i$ itself; otherwise, $f_i(d)$ is assigned the serial number of a particle selected from the population with a tournament selection procedure. If particle $i$ does not find a better position after a certain number of iterations called the refreshing gap $m$, $f_i$ is reassigned for particle $i$.
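To illustrate how the exemplar list $f_i$ is built, here is a short C++ sketch of the assignment step described above (minimization is assumed for the tournament); the function name buildExemplars and the data layout are our own, not the authors’ code.

```cpp
#include <random>
#include <vector>

// Build the exemplar list f_i for particle 'self': for each dimension, with
// probability Pc the particle learns from the winner of a 2-way tournament on
// pbest fitness; otherwise it learns from its own pbest in that dimension.
std::vector<int> buildExemplars(int self, int dims, double Pc,
                                const std::vector<double>& pbestFitness,
                                std::mt19937& rng) {
    std::uniform_real_distribution<double> u01(0.0, 1.0);
    std::uniform_int_distribution<int> pick(0, static_cast<int>(pbestFitness.size()) - 1);
    std::vector<int> exemplar(dims, self);            // default: learn from itself
    for (int d = 0; d < dims; ++d) {
        if (u01(rng) < Pc) {                          // learn from another particle
            const int a = pick(rng), b = pick(rng);   // 2-way tournament
            exemplar[d] = (pbestFitness[a] < pbestFitness[b]) ? a : b;
        }
    }
    return exemplar;
}
```

The list is rebuilt whenever the particle fails to improve for a number of consecutive iterations equal to the refreshing gap.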

Li li2007multimodal () proposed the Fitness-Euclidean-distance Ratio based PSO (FERPSO) algorithm, which utilizes the FER value to avoid specifying any niching parameter, for multimodal function optimization. The FER value between particle $j$ and particle $i$ is defined as follows.

$$FER_{(j,i)} = \alpha \cdot \frac{f(\vec{p}_j) - f(\vec{p}_i)}{\lVert \vec{p}_j - \vec{p}_i \rVert} \qquad (4)$$

where $\vec{p}_i$ and $\vec{p}_j$ are the best previous positions of particle $i$ and particle $j$ respectively, and $\alpha$ is a scaling factor formulated as follows.

$$\alpha = \frac{\lVert s \rVert}{f(\vec{p}_g) - f(\vec{p}_w)} \qquad (5)$$

where $\vec{p}_g$ and $\vec{p}_w$ are the best previous positions of the best-fit particle and the worst-fit particle in the current population respectively. $\lVert s \rVert$ is the size of the search space, which is estimated by its diagonal distance $\sqrt{\sum_{k=1}^{D}(x_k^u - x_k^l)^2}$ (where $D$ denotes the dimension of the search space, i.e., the number of variables, and $x_k^u$ and $x_k^l$ are the upper and lower bounds of the $k$-th variable, respectively). In every iteration, each particle needs to calculate the FER value between itself and every other particle to find the neighboring point, denoted by $\vec{p}_{n_i}$, corresponding to the maximal FER value. Then, each particle updates its velocity according to Eq. 6. Over successive iterations, subpopulations tracking different peaks will be formed, so as to locate multiple optima.

$$\vec{v}_i \leftarrow \chi \left( \vec{v}_i + \vec{R}_1[0,\varphi] \otimes (\vec{p}_i - \vec{x}_i) + \vec{R}_2[0,\varphi] \otimes (\vec{p}_{n_i} - \vec{x}_i) \right) \qquad (6)$$

where $\vec{v}_i$ and $\vec{x}_i$ are the velocity and position of particle $i$ respectively; $\vec{R}_1[0,\varphi]$ and $\vec{R}_2[0,\varphi]$ denote two vectors comprising random values generated between 0 and $\varphi$; $\varphi$ is a positive constant; and $\chi$ is a constriction coefficient.
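The following C++ sketch (our own illustration, written for minimization) shows how a particle could pick its FER-based neighbor per Eqs. 4 and 5; for the minimization convention the fitness difference is simply negated relative to the maximization form above.

```cpp
#include <cmath>
#include <limits>
#include <vector>

// Return the index of the pbest with the largest FER value relative to particle i
// (Eqs. (4)-(5), adapted to minimization: a lower fitness is better).
int ferNeighbour(int i, const std::vector<std::vector<double>>& pbest,
                 const std::vector<double>& pbestFitness,
                 double searchDiagonal, double fBest, double fWorst) {
    const double alpha = searchDiagonal / (fWorst - fBest);   // scaling factor, Eq. (5)
    int best = -1;
    double bestFER = -std::numeric_limits<double>::infinity();
    for (std::size_t j = 0; j < pbest.size(); ++j) {
        if (static_cast<int>(j) == i) continue;
        double d = 0.0;                                        // ||p_j - p_i||
        for (std::size_t k = 0; k < pbest[i].size(); ++k)
            d += (pbest[j][k] - pbest[i][k]) * (pbest[j][k] - pbest[i][k]);
        d = std::sqrt(d);
        if (d == 0.0) continue;
        const double fer = alpha * (pbestFitness[i] - pbestFitness[j]) / d;
        if (fer > bestFER) { bestFER = fer; best = static_cast<int>(j); }
    }
    return best;
}
```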

PSO niching algorithms using a ring topology, such as r2pso, r3pso, r2pso-lhc and r3pso-lhc, were also proposed by Li li2010niching () for multimodal function optimization. These ring topology based PSO niching algorithms also remove the need to specify any niching parameter. Taking one of them for example, a particle’s neighboring best point $\vec{p}_{n_i}$, shown in Eq. 6, is set as the best one among the best previous positions of its immediate neighbors (i.e., the left and right neighbors identified by population indices). With the ring topology method, these PSO algorithms are able to form multiple subpopulations over successive iterations. Li showed that the PSO algorithms using a ring topology can provide comparable or better performance than SPSO and FERPSO on some test functions.

Qu et al. qu2012differential () proposed a neighborhood based mutation and integrated it with three niching DE algorithms, i.e., CDE, SDE and sharing DE thomsen2004multimodal (), for multimodal function optimization. In neighborhood mutation, the subpopulations are formed relying on the parameter neighborhood size $m$. During iterations, each individual calculates the Euclidean distances between itself and the other individuals in the population, and then the $m$ individuals nearest to the current individual are selected to form a subpopulation. The offspring of each individual is generated by applying the corresponding DE algorithm within the subpopulation that the individual belongs to. After a certain number of iterations, the subpopulations will track different extreme points of the multimodal function to be optimized. Generally, the parameter $m$ can be set to a value between 1/20 and 1/5 of the population size.
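A minimal C++ sketch of the neighborhood idea is given below (our own illustration, assuming $m \ge 3$): the $m$ individuals nearest to the current one are collected, and a DE/rand/1-style mutant is built from three distinct members of that neighborhood.

```cpp
#include <algorithm>
#include <numeric>
#include <random>
#include <vector>

// Form the neighbourhood of individual i (its m nearest neighbours) and return a
// DE/rand/1-style mutant vector built from three distinct neighbours.
std::vector<double> neighbourhoodMutant(int i, int m, double F,
                                        const std::vector<std::vector<double>>& pop,
                                        std::mt19937& rng) {
    std::vector<std::size_t> idx(pop.size());
    std::iota(idx.begin(), idx.end(), 0);
    auto dist2 = [&](std::size_t j) {                  // squared distance to individual i
        double s = 0.0;
        for (std::size_t k = 0; k < pop[i].size(); ++k)
            s += (pop[i][k] - pop[j][k]) * (pop[i][k] - pop[j][k]);
        return s;
    };
    std::sort(idx.begin(), idx.end(),
              [&](std::size_t a, std::size_t b) { return dist2(a) < dist2(b); });
    // idx[0] is i itself (distance 0); the neighbourhood is idx[1..m].
    std::uniform_int_distribution<int> pick(1, m);
    int r1 = pick(rng), r2 = pick(rng), r3 = pick(rng);
    while (r2 == r1) r2 = pick(rng);
    while (r3 == r1 || r3 == r2) r3 = pick(rng);
    std::vector<double> v(pop[i].size());
    for (std::size_t k = 0; k < v.size(); ++k)
        v[k] = pop[idx[r1]][k] + F * (pop[idx[r2]][k] - pop[idx[r3]][k]);
    return v;
}
```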

The locally informed PSO (LIPS) algorithm was proposed by Qu et al. qu2013distance () for multimodal function optimization. LIPS makes use of the local information (best previous positions of several neighbors) to guide the search of each particle. The velocity updating equation of LIPS is shown as follows.

$$V_i^d = w\left( V_i^d + \varphi \left( P_i^d - X_i^d \right) \right), \qquad P_i^d = \frac{\sum_{j=1}^{nsize} \left(\varphi_j \cdot nbest_j^d\right) / nsize}{\varphi} \qquad (7)$$

where $w$ is an inertia weight, $X_i^d$ denotes the $d$-th dimension of particle $i$’s position, and $V_i^d$ is the $d$-th dimension of particle $i$’s velocity. $nsize$ is the neighborhood size, which is dynamically increased from 2 to 5 during iterations; $\varphi_j$ is a random number generated in $[0, 4.1/nsize]$, and $\varphi=\sum_{j=1}^{nsize}\varphi_j$; $nbest_j$ is the best previous position of the $j$-th nearest neighbor to the $i$-th particle’s best previous position. With this technique, the LIPS algorithm eliminates the requirement of specifying any niching parameter and improves the local search ability. Qu et al. showed that the LIPS algorithm outperforms several well-known niching algorithms, including ring topology PSO variants, SPSO, FERPSO, SDE and CDE, over 30 standard benchmark functions, not only in success rate but also with regard to accuracy.

Wang et al. wang2015mommop () proposed Multiobjective Optimization for Multimodal Optimization Problems (MOMMOP), which transforms a Multimodal Optimization Problem (MMOP) into a Multiobjective Optimization Problem (MOP) with two conflicting objectives, so that all the global optima of the original MMOP become the Pareto optimal solutions of the transformed problem. With MOMMOP, an MMOP is transformed into a MOP as follows.

$$\begin{cases} f_1(\vec{x}) = \dfrac{x_1 - \underline{x}_1}{\overline{x}_1 - \underline{x}_1} + \lambda \cdot \dfrac{f(\vec{x}) - f_{best}}{f_{worst} - f_{best}} \\[2ex] f_2(\vec{x}) = \dfrac{\overline{x}_1 - x_1}{\overline{x}_1 - \underline{x}_1} + \lambda \cdot \dfrac{f(\vec{x}) - f_{best}}{f_{worst} - f_{best}} \end{cases} \qquad (8)$$

where $\vec{x}$ is a solution, $x_i$ ($1 \le i \le D$) is the $i$-th variable, and $D$ denotes the number of variables. $f_1$ and $f_2$ are the two conflicting objectives of the transformed problem. $f(\vec{x})$ is the objective function value of $\vec{x}$ with respect to the original problem. $f_{best}$ and $f_{worst}$ denote the best and worst objective function values during the evolution, respectively. $\overline{x}_1$ and $\underline{x}_1$ are the upper and lower bounds of the first variable, respectively. $\lambda$ is the scaling factor, which gradually increases during the evolution. Because some optima may have the same values in certain variables, for the sake of locating multiple global optima, each variable is used to design a bi-objective optimization problem similar to Eq. 8. If a solution $\vec{x}_1$ Pareto dominates another solution $\vec{x}_2$ on all the bi-objective optimization problems, $\vec{x}_1$ is considered to dominate $\vec{x}_2$. What’s more, to make the population distribute more reasonably, another comparison criterion is proposed: a solution $\vec{x}_1$ dominates another solution $\vec{x}_2$ if

$$f(\vec{x}_1) < f(\vec{x}_2) \quad \text{and} \quad dist\big(normalization(\vec{x}_1, \vec{x}_2)\big) < 0.01 \qquad (9)$$

where $f(\vec{x}_1)$ and $f(\vec{x}_2)$ are the objective function values of $\vec{x}_1$ and $\vec{x}_2$, respectively, with respect to the original problem, and $dist(normalization(\vec{x}_1, \vec{x}_2))$ denotes the Euclidean distance between the normalized $\vec{x}_1$ and $\vec{x}_2$ (i.e., each variable is normalized as $(x_i-\underline{x}_i)/(\overline{x}_i-\underline{x}_i)$, where $i \in \{1, \ldots, D\}$). If $dist(normalization(\vec{x}_1, \vec{x}_2)) < 0.01$, $\vec{x}_1$ and $\vec{x}_2$ are considered to be quite similar to each other.

2.3 Our motivations

Based on the above overview, we can find that many multimodal optimization algorithms need to set niching parameters that require prior knowledge of the fitness landscape. However, for some real-world optimization problems, it is very difficult or almost impossible to obtain such prior knowledge. What’s more, few existing multimodal optimization algorithms can effectively identify and get rid of the located extreme points during iterations, because they have no idea whether a subpopulation has already located the extreme point of a peak before the end of the run. Therefore, many function evaluations are wasted when an extreme point has been located early, and the global search ability of the algorithm is also restricted when a subpopulation keeps tracking an extreme point located early.

Based on the above analysis, our main motivations in this paper are summarized as follows.

  1. Improving the iteration rule of the original WSA to remove the need to specify optimal values of the attenuation coefficient for different problems to form multiple subpopulations, without adding any niching parameter.

  2. Enabling the identification of extreme points and jumping out of the located extreme points during iterations, relying on two new parameters named the stability threshold and the fitness threshold, so as to eliminate unnecessary function evaluations and improve the global search ability.

3 Whale swarm algorithm with iterative counter

Firstly, this section briefly introduces the original WSA. Then, the improvements of WSA for multimodal function optimization are presented. Next, the implementation of WSA-IC is described in detail. We assume that the problems to be solved by the algorithms are minimization problems, and let the fitness functions be the same as the objective functions.

3.1 Whale swarm algorithm

We proposed WSA for function optimization in Zeng2017 (), inspired by the whales’ behavior of communicating with each other via ultrasound for hunting. As can be seen from reference Zeng2017 (), WSA performs well in maintaining population diversity and has a strong local search ability, which contribute significantly to locating the global optima with high accuracy. In WSA, a whale $X$ updates its position under the guidance of its “better and nearest” whale $Y$ according to the following equation.

$$x_i^{t+1} = x_i^t + \mathrm{rand}\left(0,\ \rho_0 e^{-\eta d_{X,Y}}\right) \cdot \left(y_i^t - x_i^t\right) \qquad (10)$$

where $x_i^t$ and $x_i^{t+1}$ denote the $i$-th element of $X$’s position at iterations $t$ and $t+1$ respectively, and $y_i^t$ represents the $i$-th element of $Y$’s position at iteration $t$. $\rho_0$ is the intensity of the ultrasound source, which can be set to 2 for almost all cases; $e$ denotes the natural constant; $\eta$ is the attenuation coefficient; and $d_{X,Y}$ is the Euclidean distance between $X$ and $Y$. $\mathrm{rand}(0,\ \rho_0 e^{-\eta d_{X,Y}})$ denotes a random value generated uniformly between 0 and $\rho_0 e^{-\eta d_{X,Y}}$. According to Eq. 10, a whale moves positively and randomly under the guidance of a “better and nearest” whale that is close to it, and moves negatively and randomly under the guidance of one that is quite far away from it.
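A direct C++ transcription of Eq. 10 might look as follows (a sketch under our own data layout; a fresh random value is drawn per dimension, and with $\eta=0$, as used later by WSA-IC, the attenuation term reduces to 1):

```cpp
#include <cmath>
#include <random>
#include <vector>

// Move whale X one step under the guidance of its "better and nearest" whale Y
// according to Eq. (10): x_i <- x_i + rand(0, rho0 * exp(-eta * d_XY)) * (y_i - x_i).
void wsaMove(std::vector<double>& x, const std::vector<double>& y,
             double rho0, double eta, std::mt19937& rng) {
    double d2 = 0.0;                                   // Euclidean distance d_XY
    for (std::size_t i = 0; i < x.size(); ++i) d2 += (y[i] - x[i]) * (y[i] - x[i]);
    const double dXY = std::sqrt(d2);
    const double upper = rho0 * std::exp(-eta * dXY);  // ultrasound intensity after attenuation
    std::uniform_real_distribution<double> u(0.0, upper);
    for (std::size_t i = 0; i < x.size(); ++i)
        x[i] += u(rng) * (y[i] - x[i]);                // random step in each dimension
}
```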

The general framework of WSA is shown in Fig. 2, where the term in line 6 denotes the number of members of the whale swarm, namely the swarm size, and the term in line 7 is the $i$-th whale in the swarm. From Fig. 2, it can be seen that WSA has a fairly simple structure. In every iteration, before moving, each whale needs to find its “better and nearest” whale as shown in Fig. 3, where the term in line 6 is the fitness value of the corresponding whale.


Fig. 2: The general framework of WSA.

Fig. 3: The pseudo code of finding a whale’s “better and nearest” whale.

3.2 The improvements of WSA

3.2.1 The improvement on iteration rule of WSA

Although the original WSA performs well in forming multiple parallel subpopulations and maintaining population diversity, it needs different optimal values of the attenuation coefficient to be specified for different problems, which reduces its practicality. Thus, in this section we improve the iteration rule of WSA to remove the need to specify optimal values of the attenuation coefficient for different problems, on the premise of ensuring the formation of multiple subpopulations and the ability of local exploitation. Firstly, we assume that the intensity of ultrasound is not attenuated in water, i.e., $\eta=0$, which means that each whale can correctly understand the message sent out by any other whale in the search area. Therefore, a whale will move positively and randomly under the guidance of its “better and nearest” whale, whether that whale is close to it or far away from it. So, when a whale and its “better and nearest” whale track different extreme points, the whale may move far away from the extreme point it tracks due to the guidance of its “better and nearest” whale that follows another extreme point, which will weaken WSA’s ability of local exploitation. Taking a one-dimensional function optimization problem for example, as can be seen from Fig. 4, the whale $X$ is near one extreme point while its “better and nearest” whale $Y$ is near another extreme point. In this case, $X$ may move to a worse point or even go to another peak under the guidance of $Y$, which will impede the location of the extreme point previously tracked by $X$. Obviously, this situation is not conducive to locating multiple global optima with WSA.


Fig. 4: A sketch map of a whale and its “better and nearest” whale tracking different extreme points.

To solve the above problem effectively, we improve the rule of location update for each whale as follows. Firstly, a copy of a whale $X$ is generated; then, the copy moves under the guidance of $X$’s “better and nearest” whale according to Eq. 10. If the position of the copy after moving is better than that of $X$ (i.e., the fitness value of the copy after moving is less than that of $X$), $X$ moves to the copy’s position; otherwise, $X$ remains unchanged. In a word, if a whale finds a better position by Eq. 10 in an iteration, it moves to the better position; otherwise, it remains quiescent at its original position. So, in the case shown in Fig. 4, the probability of whale $X$ moving away from the extreme point it tracks is greatly reduced, because it is difficult for whale $X$ to find a better position by Eq. 10 under the guidance of its “better and nearest” whale $Y$. In other words, whale $X$ will stay at its original position with great probability to guide the movement of other whales. Once there appears at least one whale that follows the same extreme point as $X$ and is better than $X$, $X$ will converge to the extreme point under the guidance of the nearest one among those better whales in the next iteration. Therefore, this improvement contributes significantly to forming multiple subpopulations and enhancing the ability of local exploitation of the improved WSA, which is very conducive to locating multiple global optima, despite $\eta=0$. What’s more, this improvement does not add any niching parameter.
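The improved update rule can be summarized in a few lines of C++ (a sketch assuming the wsaMove helper from the sketch in Section 3.1 and a user-supplied fitness function; the Whale struct is our own illustration):

```cpp
#include <functional>
#include <random>
#include <vector>

// wsaMove is the Eq. (10) helper from the sketch in Section 3.1.
void wsaMove(std::vector<double>& x, const std::vector<double>& y,
             double rho0, double eta, std::mt19937& rng);

struct Whale {
    std::vector<double> pos;
    double fitness;
    int counter = 0;   // iterative counter, used by WSA-IC in Section 3.2.2
};

// Greedy variant of the WSA move: the whale accepts the position produced by
// Eq. (10) (with eta = 0) only if it improves the whale's fitness.
void improvedMove(Whale& X, const Whale& betterNearest, double rho0,
                  const std::function<double(const std::vector<double>&)>& f,
                  std::mt19937& rng) {
    std::vector<double> trial = X.pos;                 // move a copy of X, not X itself
    wsaMove(trial, betterNearest.pos, rho0, /*eta=*/0.0, rng);
    const double trialFitness = f(trial);
    if (trialFitness < X.fitness) {                    // minimization: better position found
        X.pos = trial;
        X.fitness = trialFitness;
        X.counter = 0;                                 // reset the iterative counter
    }
    // Otherwise X stays where it is; WSA-IC then checks the counter (Fig. 6).
}
```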

3.2.2 Identifying and escaping from the located extreme points during iterations

In the field of multimodal optimization, identifying the located extreme points effectively and jumping out of these extreme points to save unnecessary function evaluations during iterations is very important for metaheuristic algorithms to locate the global optimum or optima. Although the improved WSA described in Section 3.2.1 can ensure the formation of multiple subpopulations and the ability of local exploitation, it cannot yet identify the located extreme points and escape from them during iterations. In view of this, we propose two new parameters, i.e., the stability threshold and the fitness threshold, which aim to help each whale identify the located optima and jump out of them during iterations, so as to save unnecessary function evaluations and improve the global search ability. In this paper, the stability threshold is a predefined number of iterations utilized to judge whether a whale has reached a steady state or not, where a whale having reached a steady state means that it has located the extreme point it tracks. The fitness threshold is a predefined value utilized to judge whether a solution is a current global optimum or not. If a whale does not find a better position after a number of successive iterations equal to the stability threshold, it is considered to have already reached a steady state and located an extreme point. If the difference between its fitness value and the fitness value of the best one among the current global optima is less than the fitness threshold, the whale’s position is considered a current global optimum; otherwise, the whale’s position is considered a local optimum. If the whale’s position is a current global optimum, this optimum is recorded. Then, the whale that has reached the steady state is randomly reinitialized in the search area to jump out of the located extreme point. To judge whether a whale has failed to find a better position for that many successive iterations, each whale keeps an iterative counter. In this paper, the improved WSA is called WSA with Iterative Counter (WSA-IC).

3.3 The detailed procedure of WSA-IC

Fig. 5 presents the pseudo code of WSA-IC. For WSA-IC, it is worth noting that the initialization of a whale contains two operations: initializing the whale’s position randomly and assigning 0 to its iterative counter. The improvement on the iteration rule of WSA described in Section 3.2.1 can be seen from Fig. 5. If a whale’s “better and nearest” whale exists (line 8 in Fig. 5), a copy of this whale is generated first (line 9 in Fig. 5); then, the copy moves under the guidance of the “better and nearest” whale according to Eq. 10 (line 10 in Fig. 5). If the position of this copy after moving is better than that of the original whale (line 12 in Fig. 5), the copy replaces the original whale (line 13 in Fig. 5).


Fig. 5: The pseudo code of WSA-IC.

The details of identifying and escaping from the located extreme points during iterations in WSA-IC are as follows. If a whale finds a better position in an iteration, 0 is assigned to its iterative counter (line 14 in Fig. 5); otherwise, the whale checks its iterative counter (see Fig. 5). The detailed procedure of checking a whale’s iterative counter is demonstrated in Fig. 6. As we can see from Fig. 6, it is first determined whether the whale’s iterative counter is equal to the stability threshold or not. If the whale’s iterative counter is not equal to the stability threshold (line 2 in Fig. 6), the counter increases by 1 (line 3 in Fig. 6); otherwise, the whale is considered to have already reached a steady state and located an extreme point. If the whale has reached a steady state, it is determined whether the located extreme point is a current global optimum or not (line 5 in Fig. 6), and if so, this extreme point is recorded. Then, the whale that has reached the steady state is randomly reinitialized (line 6 in Fig. 6) to jump out of the located extreme point and search for the global optima. Thus it can be seen that, with the stability threshold, the proposed WSA-IC can jump out of the located extreme points on the premise of keeping enough local search ability.


Fig. 6: The pseudo code of checking a whale’s iterative counter.
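A C++ sketch of the counter check in Fig. 6 is given below, continuing the Whale struct from the sketch in Section 3.2.1; the callback recordIfGlobalOptimum stands for the procedure of Fig. 7, which is sketched after that figure.

```cpp
#include <functional>
#include <random>
#include <vector>

// Whale is the struct defined in the sketch of Section 3.2.1.
// Check a whale's iterative counter (cf. Fig. 6). Ts is the stability threshold.
void checkCounter(Whale& X, int Ts,
                  const std::function<void(const Whale&)>& recordIfGlobalOptimum,
                  const std::vector<double>& lower, const std::vector<double>& upper,
                  const std::function<double(const std::vector<double>&)>& f,
                  std::mt19937& rng) {
    if (X.counter != Ts) {
        ++X.counter;                                   // not yet steady: keep counting
        return;
    }
    // X has reached the steady state, i.e., it has located an extreme point.
    recordIfGlobalOptimum(X);                          // record it if it is a current global optimum
    for (std::size_t k = 0; k < X.pos.size(); ++k) {   // random reinitialization in the search area
        std::uniform_real_distribution<double> u(lower[k], upper[k]);
        X.pos[k] = u(rng);
    }
    X.fitness = f(X.pos);
    X.counter = 0;
}
```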

The detailed procedure of judging whether a solution is a current global optimum or not is demonstrated in Fig. 7. Firstly, it is judged whether the fitness value of the solution is less than the fitness value of the best one among the current global optima set. If the fitness value of this solution is less than that value (line 2 in Fig. 7), this solution must be a current global optimum. Before the best fitness value is updated (line 6 in Fig. 7) and the new current global optimum is recorded (line 7 in Fig. 7), it is judged whether the optima recorded before are still current global optima. If the difference between the previously best fitness value and the whale’s fitness is greater than the fitness threshold (line 3 in Fig. 7), all the previously recorded elements of the set are no longer current global optima, so the set needs to be cleared (line 4 in Fig. 7). If the fitness value of this solution is greater than the best fitness value (line 8 in Fig. 7), it is judged whether this solution is still a current global optimum. If the difference between the fitness value of this solution and the best fitness value is not greater than the fitness threshold (line 9 in Fig. 7), this solution is considered a current global optimum, so it is added to the set (line 10 in Fig. 7).


Fig. 7: The pseudo code of judging whether a solution is a current global optimum.
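The bookkeeping of Fig. 7 can be sketched as follows (again our own illustration; Tf stands for the fitness threshold and bestFitness for the fitness of the best member of the current global optima set, initialized to +infinity):

```cpp
#include <vector>

// Whale is the struct defined in the sketch of Section 3.2.1.
// Maintain the set of current global optima (cf. Fig. 7).
void recordIfGlobalOptimum(const Whale& X, std::vector<Whale>& optima,
                           double& bestFitness, double Tf) {
    if (X.fitness < bestFitness) {
        // X is a new current global optimum; older entries may no longer qualify.
        if (bestFitness - X.fitness > Tf) optima.clear();
        bestFitness = X.fitness;
        optima.push_back(X);
    } else if (X.fitness - bestFitness <= Tf) {
        optima.push_back(X);                           // within the fitness threshold of the best
    }
    // Otherwise X is only a local optimum and is not recorded.
}
```

In the counter-check sketch above, this function would be passed in through a lambda that captures optima, bestFitness and Tf.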

At the end of the iterations, some whales whose iterative counters have not reached the stability threshold may nevertheless have already located current global optima. Therefore, the step in Fig. 7 is conducted for each whale in the last generation (line 23 in Fig. 5).

3.4 Parameters setting of WSA-IC

As we can see from the detailed steps above, WSA-IC contains four algorithm-specific parameters, i.e., the intensity of the ultrasound source $\rho_0$, the attenuation coefficient $\eta$, the stability threshold and the fitness threshold. $\rho_0$ and $\eta$ are two constants, always set to 2 and 0 respectively. The fitness threshold should be set to a comparatively small value between 0 and the difference between the global second best fitness and the global best fitness, if the problem to be solved has at least one local optimum, as in the example of the one-dimensional function in Fig. 8. The two marked solutions in Fig. 8 denote the global optimum and the global second best solution respectively, and the difference between their objective function values is quite small. For the function in Fig. 8, the fitness threshold should therefore be set to a very small value between 0 and that difference. For almost all problems, especially those without prior knowledge, the fitness threshold can be set to a very small positive value. For those benchmark test functions whose global optima are given, the fitness threshold can be set to the value of the predefined fitness error (i.e., the level of accuracy) that is utilized to judge whether a solution is a real global optimum. The stability threshold may need to be set to different values for different problems. According to a large number of experimental results, it is reasonable to set the stability threshold to $100D$, where $D$ is the function dimension.


Fig. 8: A function with at least one local optimum.

4 Experimental setup

The proposed WSA-IC and the other algorithms compared are all implemented in the C++ programming language with Microsoft Visual Studio 2015 and executed on a PC with a 3.2 GHz (3.6 GHz turbo) Intel Core i5-3470 processor, 4 GB RAM and the 64-bit Microsoft Windows 10 operating system. The 5 popular niching metaheuristic algorithms compared are listed as follows.

  1. LIPS qu2013distance (): The locally informed PSO.

  2. FERPSO li2007multimodal (): Fitness-Euclidean distance ratio PSO.

  3. SDE li2005efficient (): Speciation-based DE.

  4. CDE thomsen2004multimodal (): The original crowding DE.

  5. SPSO li2004adaptively (): Speciation-based PSO.

Apart from the above 5 niching metaheuristic algorithms, WSA-IC is also compared with WSA Zeng2017 (). It is worth noting that the different evolutionary rules of different algorithms result in different computational complexity. As all the compared algorithms are implemented in the same development environment and run on the same computer, we use CPU time as the stopping criterion for the sake of fairness. Obviously, the more global optima an algorithm finds, and the higher the accuracy of these optima when the stopping criterion is satisfied, the better the algorithm performs.

4.1 Test functions

We use 20 multimodal benchmark functions to test these algorithms. Basic information about these test functions is summarized in Table 1, in which the blank entries in the last column for F16–F20 indicate that these functions have many local optima whose number we have not counted. In Table 1, the first 15 multimodal functions come from CEC2015 Suganthan2015cec (), and the last 5 functions are classical multimodal functions with high dimensions. The CEC2015 functions can be divided into two parts: the first 8 functions are expanded scalable functions and the remaining 7 functions are composition functions. All these CEC2015 functions come with search space shift and rotation, which makes them more difficult to solve, while the last 5 multimodal functions are only shifted. More details of these test functions are presented in the document “Definitions of CEC2015 niching benchmark 20141228”, which can be downloaded from the website given in reference Suganthan2015cec (). For functions F2, F3, F5, F6, F7, F8, F9, F11, F12 and F13 the objective is to locate all the global optima, while for the rest the target is to escape from the local optima to find the global optimum. All these test functions are minimization problems.

Fn. Test function name Dimensions No. of global optima No. of local optima
F1 Expanded Two-Peak Trap 5 1 15
F2 Expanded Five-Uneven-Peak Trap 5 32 0
F3 Expanded Equal Minima 4 625 0
F4 Expanded Decreasing Minima 5 1 15
F5 Expanded Uneven Minima 3 125 0
F6 Expanded Himmelblau’s Function 4 16 0
F7 Expanded Six-Hump Camel Back 6 8 0
F8 Modified Vincent Function 3 216 0
F9 Composition Function 1 10 10 0
F10 Composition Function 2 10 1 9
F11 Composition Function 3 10 10 0
F12 Composition Function 4 10 10 0
F13 Composition Function 5 10 10 0
F14 Composition Function 6 10 1 19
F15 Composition Function 7 10 1 19
F16 Griewank 50 1
F17 Ackley 100 1
F18 Rosenbrock 100 1
F19 Rastrigin 100 1
F20 Expanded Scaffer’s F6 100 1
Search range:
Table 1: Test functions.

4.2 Parameters setting

To compare the performance of the multimodal optimization algorithms in this paper, all the test functions are treated as black-box problems, though their global optima can be obtained by derivation. Thus, the known global optima of these test functions cannot be used by the algorithms during iterations. After each algorithm finishes its iterations, the fitness error ε, i.e., the level of accuracy, is used to judge whether a solution is a real global optimum. If the difference between the fitness value of a solution and the fitness value of the known global optimum is lower than ε, this solution is considered a real global optimum. In our experiments, the fitness error ε, the population sizes and the CPU time used by these algorithms for the test functions are listed in Table 2. In the third column of Table 2, the number before the slash denotes the population size of WSA-IC, while the number after the slash is the population size of the 6 compared algorithms. Because the individuals in the population of WSA-IC are able to identify and jump out of the located extreme points effectively during iterations, WSA-IC can keep a relatively small population size with respect to the other algorithms, which contributes to reducing its computational complexity. It is worth noting that a function with more optima or higher dimension may require a larger population size or more CPU time.

Fn. Fitness error ε Pop. size of WSA-IC / pop. size of other algorithms CPU time (s)
F1 0.00000001 40 / 200 6
F2 0.00000001 60 / 200 200
F3 0.00000001 50 / 2000 1500
F4 0.00000001 30 / 100 180
F5 0.00000001 40 / 800 80
F6 0.00000001 40 / 200 20
F7 0.000001 30 / 100 30
F8 0.0001 100 / 1000 1500
F9 0.00000001 500 / 3000 1800
F10 0.00000001 500 / 1000 500
F11 0.00000001 100 / 800 800
F12 0.00000001 400 / 600 800
F13 0.00000001 50 / 300 260
F14 0.00000001 400 / 2000 500
F15 0.00000001 100 / 600 1000
F16 0.00000001 100 / 200 300
F17 0.00000001 40 / 300 150
F18 0.00000001 60 / 300 1200
F19 0.00000001 80 / 300 1000
F20 0.00000001 60 / 200 600
Table 2: Setting of parameters associated with test functions.

The parameter values of the compared algorithms are set the same as those in their respective original references. Table 3 lists the values of the main parameters of these algorithms.

Algorithms Parameters
LIPS w=0.729844, nsize=2~5
FERPSO χ=0.729844, φ=4.1
SDE CR=0.9, F=0.5, m=10
CDE CR=0.9, F=0.5, CF=population size
SPSO χ=0.729844, φ1=2.05, φ2=2.05
WSA ρ0=2
WSA-IC ρ0=2, η=0, stability threshold=100D, fitness threshold=ε
  • w: inertia weight; nsize: neighborhood size;

  • χ: constriction factor; φ, φ1, φ2: acceleration coefficients;

  • CR: crossover rate; F: scaling factor; m: species size;

  • CF: crowding factor;

Table 3: Setting of main parameters of algorithms.

The attenuation coefficient $\eta$ of WSA for these test functions is listed in Table 4. Table 5 shows the values of the species radius of SDE and SPSO for these test functions.

Fn. F1 F2 F3 F4 F5 F6 F7 F8 F9 F10
0.0001 0.1 0.14 0.00005 0.16 0.16 0.001 0.3 0.09 0.001
Fn. F11 F12 F13 F14 F15 F16 F17 F18 F19 F20
0.01 0.001 0.001 0.001 0.001 0.005 0.01 0.014 0.005 0.01
Table 4: Setting of attenuation coefficient of WSA for test functions.
Fn. F1 F2 F3 F4 F5 F6 F7 F8 F9 F10 F11 F12 F13 F14 F15 F16 F17 F18 F19 F20
Species radius of SDE 25 8 15 10 8 40 10 15 10 25 25 30 15 10 10 0.5 0.5 0.2 0.5 0.5
Species radius of SPSO 60 200 30 50 30 60 70 60 100 100 100 140 150 120 150 1000 900 950 2000 2000
Table 5: Setting of species radius for test functions.

4.3 Performance metrics

To fairly compare the performance of WSA-IC with the other 6 algorithms, we have conducted 51 independent runs of each algorithm on each test function. The following four metrics are used to measure the performance of all the algorithms.

  1. Success Rate (SR) li2004adaptively (): the percentage of runs in which all the global optima are successfully located using the given level of accuracy.

  2. Average Number of Optima Found (ANOF) Suganthan2015cec (): the average number of global optima found over 51 runs.

  3. Quality of optima found: the mean of fitness values of optima found over 51 runs, reflecting the accuracy of optima found.

  4. Convergence rate: the rate of an algorithm converging to the global optimum over function evaluations.

5 Experimental results and analysis

In this section, the results of the comparative experiments are presented and analyzed in detail.

5.1 Quantity of optima found

This section presents and analyzes the quantity of optima found by these algorithms. Firstly, all the algorithms are compared on “Success Rate”, which is the most popular metric used to test the performance of multimodal optimization algorithms in terms of locating multiple global optima. Then, the metric “Average Number of Optima Found” is employed to further compare the performance of the algorithms on locating multiple global optima, as some algorithms cannot achieve a nonzero SR on some functions with multiple global optima.

5.1.1 Success Rate

The SR of each algorithm on each test function is presented in Table 6, in which each number within parentheses denotes the rank of the algorithm on the corresponding function in terms of SR, and a bold number means the corresponding algorithm performs best on the function. The same SR value on a function means that the corresponding algorithms have the same rank for that function. The last row of Table 6 shows the total rank of each algorithm over all the test functions, which is the summation of the algorithm’s individual ranks on each function. It can be seen from Table 6 that WSA-IC performs best on most of the test functions in terms of SR. Especially on F3, F5 and F8, which have massive numbers of global optima, WSA-IC performs far better than the other algorithms, which cannot achieve nonzero SR values on these three functions. It is worth noting that F9–F15 are composition functions with search space shift and rotation, whose global optima are more difficult to locate, so that none of the algorithms achieves a nonzero SR value on F9–F14. For the composition function F15, WSA-IC, LIPS and FERPSO all obtain the maximal SR value, i.e., 1. What’s more, for the high-dimensional multimodal functions F16, F18 and F19, WSA-IC also achieves much higher SR values than most of the other multimodal optimization algorithms. The better performance of WSA-IC in terms of SR is also supported by its total rank of 21, which is much better than those achieved by the other algorithms.

Fn. LIPS FERPSO SPSO SDE CDE WSA WSA-IC
F1 0.69 0.67 0 1 0 0.12 1
(3) (4) (6) (1) (6) (5) (1)
F2 0 0 0 0 0 0 1
(2) (2) (2) (2) (2) (2) (1)
F3 0 0 0 0 0 0 1
(2) (2) (2) (2) (2) (2) (1)
F4 0.31 0.43 0.33 0.80 1 0 1
(5) (6) (4) (3) (1) (7) (1)
F5 0 0 0 0 0 0 1
(2) (2) (2) (2) (2) (2) (1)
F6 0.12 0 0 0 1 0 1
(3) (4) (4) (4) (1) (4) (1)
F7 0.06 0 0 0.59 1 0 1
(4) (5) (5) (3) (1) (5) (1)
F8 0 0 0 0 0 0 1
(2) (2) (2) (2) (2) (2) (1)
F9 0 0 0 0 0 0 0
(1) (1) (1) (1) (1) (1) (1)
F10 0 0 0 0 0 0 0
(1) (1) (1) (1) (1) (1) (1)
F11 0 0 0 0 0 0 0
(1) (1) (1) (1) (1) (1) (1)
F12 0 0 0 0 0 0 0
(1) (1) (1) (1) (1) (1) (1)
F13 0 0 0 0 0 0 0
(1) (1) (1) (1) (1) (1) (1)
F14 0 0 0 0 0 0 0
(1) (1) (1) (1) (1) (1) (1)
F15 1 1 0 0.98 0.12 0.06 1
(1) (1) (7) (4) (5) (6) (1)
F16 1 0.49 0 0 0 0.42 0.96
(1) (3) (5) (5) (5) (4) (2)
F17 0 0 0 0 0 0 0
(1) (1) (1) (1) (1) (1) (1)
F18 0 0 0 0 0 0 0.88
(2) (2) (2) (2) (2) (2) (1)
F19 0 0 0 0 0 0 0.98
(2) (2) (2) (2) (2) (2) (1)
F20 0 0 0 0 0 0 0
(1) (1) (1) (1) (1) (1) (1)
Total rank 37 43 51 40 39 51 21
Table 6: SR and ranks (in parentheses) of algorithms on F1–F20.

5.1.2 Average Number of Optima Found

As the sample size in this paper is 51, which is greater than 30, we have conducted a two-independent-samples Z-test for WSA-IC to judge whether the differences between its population and the populations of the other algorithms, respectively represented by their independent samples, are significant on each test function at the significance level 0.05, based on the means and variances of the number of optima found in the two independent samples. Table 7 presents the ANOF of each algorithm on each test function, together with the standard deviation of the number of optima found. The symbol “+” means that the difference between the population of WSA-IC and the population of the compared algorithm is significant and WSA-IC performs better than the compared algorithm, while the symbol “=” means that the difference is not significant. The symbol “−” means that the difference is significant and WSA-IC performs worse than the compared algorithm. A bold number in Table 7 means that the corresponding algorithm performs best on the function in terms of ANOF. It can be seen from Table 7 that WSA-IC has the best performance in terms of ANOF on F1–F8, F15, F18 and F19, which echoes the best SR values of WSA-IC on these test functions as shown in Table 6. For the two composition functions F10 and F14 and the two high-dimensional functions F17 and F20, none of the algorithms obtains a nonzero ANOF, which means that none of them finds the global optima of these functions. For the composition function F12, WSA-IC performs only a little worse than SDE in terms of ANOF, while for the composition functions F9 and F11, WSA-IC performs far better than the other compared algorithms. The better performance of WSA-IC in terms of the number of optima found is also supported by the total numbers of the symbols “+”, “=” and “−” in the last three rows of Table 7. As we can see from Table 7, nonzero counts of the symbol “−” occur only when WSA-IC is compared with LIPS and SDE (2 and 1, respectively), and the number of “+” is larger than that of “=” for every compared algorithm. The better performance of WSA-IC is firstly due to the improvement of the location update rule of WSA when $\eta=0$, i.e., a whale moves to a new position under the guidance of its “better and nearest” whale only if this new position is better than its original position, which ensures the formation of multiple subpopulations and keeps the ability of local exploitation. More importantly, the method of identifying and jumping out of the located extreme points during iterations improves the global search ability as far as possible, which contributes significantly to the location of multiple global optima.
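For reference, the statistic of such a two-independent-samples Z-test has the usual form shown below (our notation; it is not reproduced from the compared papers):

$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}, \qquad n_1 = n_2 = 51,$$

where $\bar{x}_1$, $\bar{x}_2$ and $s_1^2$, $s_2^2$ are the sample means and variances of the number of optima found by the two algorithms; at the 0.05 significance level the difference is judged significant when $|z| > 1.96$.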

Fn. LIPS FERPSO SPSO SDE CDE WSA WSA-IC
F1 0.69±0.46 + 0.67±0.47 + 0±0 + 1±0 = 0±0 + 0.12±0.32 + 1±0
F2 20.92±1.90 + 7.43±1.42 + 0±0 + 13.77±1.55 + 0±0 + 1.96±0.74 + 32±0
F3 288.47±8.92 + 148.14±7.32 + 0±0 + 163.73±212.52 + 1.94±1.31 + 10.06±1.42 + 625±0
F4 0.31±0.46 + 0.43±0.50 + 0.33±0.47 + 0.80±0.40 + 1±0 = 0±0 + 1±0
F5 107.39±3.34 + 84.29±4.36 + 0±0 + 15.65±1.54 + 0±0 + 6.22±0.89 + 125±0
F6 14.39±0.97 + 11.39±1.36 + 0.20±0.40 + 4.37±0.48 + 16±0 = 2.73±0.79 + 16±0
F7 5.53±1.33 + 2.73±1.07 + 0.24±0.42 + 7.47±0.70 + 8±0 = 0.51±0.50 + 8±0
F8 112.27±4.72 + 70.29±4.23 + 11.31±3.54 + 23.53±1.67 + 204.59±3.86 + 10.84±1.41 + 216±0
F9 7.08±0.27 − 3.55±1.05 + 0±0 + 0±0 + 0±0 + 2.59±0.69 + 4.22±1.13
F10 0±0 = 0±0 = 0±0 = 0±0 = 0±0 = 0±0 = 0±0
F11 0.98±0.14 + 0.10±0.30 + 0.35±0.48 + 0.67±0.46 = 0±0 + 0.39±0.49 + 0.80±0.40
F12 0.45±0.50 − 0.02±0.14 = 0±0 = 0.82±0.38 − 0±0 = 0±0 = 0.14±0.34
F13 1±0 = 0.96±0.19 = 0±0 + 1±0 = 1±0 = 0.04±0.19 + 0.94±0.24
F14 0±0 = 0±0 = 0±0 = 0±0 = 0±0 = 0±0 = 0±0
F15 1±0 = 1±0 = 0±0 + 0.98±0.14 = 0.12±0.32 + 0.06±0.24 + 1±0
F16 1±0 + 0.49±0.50 + 0±0 + 0±0 + 0±0 + 0.41±0.49 = 0.96±0.19
F17 0±0 = 0±0 = 0±0 = 0±0 = 0±0 = 0±0 = 0±0
F18 0±0 + 0±0 + 0±0 + 0±0 + 0±0 + 0±0 + 0.88±0.32
F19 0±0 + 0±0 + 0±0 + 0±0 + 0±0 + 0±0 + 0.98±0.14
F20 0±0 = 0±0 = 0±0 = 0±0 = 0±0 = 0±0 = 0±0
+ 10 13 15 11 11 14
= 8 7 5 8 9 6
− 2 0 0 1 0 0
Table 7: ANOF of algorithms on F1–F20.

5.2 Quality of optima found

This section compares the performance of these algorithms in terms of the quality of the optima found. Table 8 presents the mean of the fitness values of the optima found over 51 runs on all the test functions, and the standard deviations of the fitness values of the optima found are also listed in parentheses. It is worth noting that the fitness values of the optima found by all the algorithms on the CEC2015 niching test functions (i.e., F1–F15 in Table 1) have had 100*Fn. subtracted (Fn. denotes the serial number of a function), for comparing the performance of all the algorithms on the quality of the optima found. We have also conducted the two-independent-samples Z-test between WSA-IC and the other compared algorithms. A bold number in Table 8 means that the corresponding algorithm performs best on the function in terms of the quality of the optima found. It can be seen from Table 8 that WSA-IC has the best performance on F1, F2, F4, F6–F9, F14 and F17–F19. The better performance of WSA-IC in terms of the quality of the optima found is also supported by the total numbers of the symbols “+”, “=” and “−” in the last three rows of Table 8, in which the number of “−” for each compared algorithm is much smaller than the numbers of “+” and “=”.

Fn. LIPS FERPSO SPSO SDE CDE WSA WSA-IC
F1 8.68E+00 + 1.33E+01 + 3.66E-02 + 3.62E-10 = 4.32E-01 + 5.73E+01 + 0.00E+00
(1.60E+01) (1.89E+01) (7.61E-02) (1.80E-09) (2.95E-01) (3.19E+01) (0.00E+00)
F2 3.58E-14 = 6.11E-14 = 2.60E-01 + 5.51E-10 = 2.40E-04 = 6.75E-01 + 1.39E-16
(2.52E-13) (6.71E-14) (2.71E-01) (4.50E-10) (3.04E-04) (4.77E+00) (9.85E-16)
F3 4.14E-12 = 2.41E-12 = 3.57E-03 = 2.52E-10 = 1.10E-08 = 0.00E+00 = 3.25E-14
(9.06E-12) (9.77E-12) (1.88E-03) (5.00E-10) (3.29E-09) (0.00E+00) (1.83E-13)
F4 5.79E-02 = 7.23E-02 + 2.39E-01 = 2.11E-02 = 1.45E-14 = 1.29E+00 + 3.34E-15
(4.35E-02) (8.17E-02) (9.09E-01) (4.59E-02) (2.48E-14) (5.35E-01) (1.75E-14)
F5 2.38E-11 = 2.58E-12 = 5.49E-04 = 3.26E-10 = 8.58E-05 = 0.00E+00 = 2.12E-15
(3.48E-11) (1.51E-11) (3.72E-04) (3.66E-10) (5.74E-05) (0.00E+00) (9.89E-15)
F6 1.67E-13 = 4.69E-14 = 2.37E-02 = 3.56E-10 = 3.32E-11 = 2.69E-11 = 2.55E-14
(1.17E-12) (6.00E-14) (2.84E-02) (7.27E-10) (1.15E-10) (1.13E-10) (5.29E-14)
F7 5.58E-07 = 5.58E-07 = 1.22E-02 = 6.15E-07 = 5.58E-07 = 1.79E+00 + 5.58E-07
(1.11E-14) (6.06E-14) (1.61E-02) (6.42E-08) (1.29E-14) (1.97E+00) (3.34E-15)
F8 5.82E-08 = 2.12E-07 = 6.38E-05 = 5.92E-06 = 1.31E-06 = 3.02E-06 = 2.36E-08
(2.08E-07) (3.95E-07) (9.64E-06) (4.10E-06) (5.85E-07) (3.47E-06) (6.89E-08)
F9 1.54E-11 = 1.47E-10 = 1.53E+00 + 1.53E+00 + 1.53E+00 + 5.21E-10 = 6.86E-14
(1.03E-10) (4.23E-10) (3.57E-09) (7.90E-14) (2.31E-09) (1.00E-09) (5.99E-14)
F10 2.94E+01 − 4.95E+03 + 1.26E+04 + 3.41E+01 − 5.10E+01 + 9.30E+03 + 3.71E+01
(4.16E+00) (6.63E+03) (4.06E+03) (1.03E+01) (1.13E+01) (6.54E+03) (1.27E+01)
F11 3.80E-06 = 2.65E-02 = 4.52E-02 = 1.64E-01 = 6.55E-02 = 3.11E-03 = 4.17E-02
(2.68E-05) (6.26E-02) (5.74E-02) (2.89E-01) (2.64E-02) (8.68E-03) (8.79E-02)
F12 1.48E-01 − 1.66E-01 − 5.04E+01 + 1.28E-04 − 1.36E-01 − 1.49E+01 + 4.52E-01
(1.56E-01) (5.91E-02) (7.69E+01) (3.43E-04) (2.08E-02) (5.66E+01) (2.59E-01)
F13 2.27E-13 − 8.08E+00 − 4.98E+01 + 1.78E-13 − 4.50E-13 − 3.25E+02 + 1.19E+01
(0.00E+00) (4.17E+01) (1.31E+01) (1.93E-13) (9.54E-14) (7.85E+01) (4.76E+01)
F14 1.61E+02 + 1.05E+02 + 3.07E+02 + 1.39E+02 + 6.51E+02 + 4.59E+02 + 6.27E+01
(1.93E+02) (9.29E+01) (2.38E+01) (4.01E+01) (3.16E+01) (1.79E+02) (2.88E+01)
F15 2.32E-13 = 4.46E-14 = 4.92E+01 + 3.73E+00 + 1.06E-06 = 2.37E+02 + 0.00E+00
(3.15E-14) (9.03E-14) (1.34E+01) (2.64E+01) (2.25E-06) (7.19E+01) (0.00E+00)
F16 1.83E-13 = 8.26E-02 = 1.20E+00 + 4.78E-02 + 6.37E-04 = 9.45E-03 = 1.58E-03
(9.03E-14) (2.63E-01) (2.13E-02) (1.08E-02) (9.34E-04) (1.05E-02) (1.02E-02)
F17 2.13E+01 + 2.00E+01 = 2.13E+01 + 2.11E+01 + 2.13E+01 + 2.00E+01 = 2.00E+01
(3.12E-02) (8.07E-03) (2.22E-02) (1.99E-02) (2.72E-02) (3.21E-04) (6.66E-05)
F18 1.11E+02 + 5.59E+07 + 2.89E+07 + 1.91E+02 + 1.74E+09 + 9.30E+01 + 4.92E+01
(2.57E+01) (1.20E+08) (7.33E+06) (3.35E+00) (3.66E+08) (3.87E+01) (1.98E+02)
F19 1.98E+02 + 1.10E+03 + 4.08E+03 + 7.33E+02 + 1.81E+04 + 3.20E+03 + 1.37E-01
(2.95E+01) (7.07E+02) (2.26E+02) (2.62E+01) (8.62E+02) (6.18E+02) (9.66E-01)
F20 4.07E+01 − 3.68E+01 − 4.39E+01 = 4.08E+01 − 4.65E+01 + 4.41E+01 + 4.37E+01
(1.22E+00) (2.40E+00) (5.23E-01) (7.15E-01) (2.04E-01) (6.58E-01) (7.35E-01)
+ 6 6 12 7 8 12
= 10 11 8 9 10 8
− 4 3 0 4 2 0
Table 8: Quality of optima found by algorithms on F1F20.

Furthermore, Fig. 9 shows the box plot of the mean fitness values of the optima found per run over 51 runs. As can be seen from Fig. 9, the dispersion of the mean fitness values of the optima found by WSA-IC is quite small on most of the test functions compared with the other algorithms. WSA-IC only has outliers on F11, F13, F14, F16 and F20, while most of the other algorithms have more outliers on these functions. Therefore, WSA-IC shows good stability in the accuracy of the optima found on these test functions relative to the other algorithms. The better quality of the optima found by WSA-IC also stems from the improved location update rule of WSA, i.e., a whale moves to a new position under the guidance of its "better and nearest" whale only if the new position is better than its original one, which preserves the local exploitation ability. For example, when several whales follow the same extreme point, the best whale among them will, with high probability, stay where it is and guide the others to converge to that extreme point. Besides, the method of identifying and jumping out of located extreme points during the iterations improves the global search ability as much as possible for finding the global optima. For example, if some whales converge to a solution near a global optimum, other whales that have already reached a steady state will be reinitialized by this method and may move to positions near those convergent whales, which accelerates their convergence to the global optimum.
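To make the improved update rule concrete, the sketch below shows a whale moving under the guidance of its "better and nearest" whale and accepting the move only when it improves on the current position. Minimization is assumed, the function names are illustrative, and the random step bounded by 2.0 is a stand-in rather than the paper's exact update formula.

```python
import random

def better_and_nearest(i, population, fitness):
    """Return the position of the whale that is better than whale i (lower
    fitness, minimization assumed) and nearest to it in Euclidean distance,
    or None if whale i is already the best in the population."""
    guide, guide_dist = None, float("inf")
    xi, fi = population[i], fitness[i]
    for j, (xj, fj) in enumerate(zip(population, fitness)):
        if j == i or fj >= fi:
            continue
        dist = sum((a - b) ** 2 for a, b in zip(xi, xj)) ** 0.5
        if dist < guide_dist:
            guide, guide_dist = xj, dist
    return guide

def greedy_move(xi, guide, f, step=2.0):
    """Generate a candidate position toward the guiding whale and accept it
    only if it improves on the current position (the greedy rule described
    above); `step` is an illustrative bound on the random step length."""
    candidate = [a + random.uniform(0.0, step) * (g - a) for a, g in zip(xi, guide)]
    return candidate if f(candidate) < f(xi) else xi
```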


Fig. 9: Box plot of algorithms on F1–F20.

5.3 Convergence rate

The previous two sections show that the proposed WSA-IC performs better and more consistently than the other algorithms in terms of both the quantity and the quality of the optima found on most test functions. To demonstrate the efficiency of WSA-IC in locating the global optima, this section compares WSA-IC with the other algorithms, except FERPSO and WSA (whose populations may prematurely converge to one solution, or to several solutions with the same fitness value, and stop iterating), in terms of convergence rate. Six functions (F1, F4, F9, F14, F18 and F19; F9 has no local optima, while the others all contain local optima) are used for this test. The convergence curves of all the algorithms on these functions are depicted in Fig. 10, in which the abscissa is the number of function evaluations and the ordinate is the mean fitness value of the current global optima over 51 runs. As shown in Fig. 10(c), for function F9 without local optima, SPSO, SDE and CDE all fail to converge to the global optima, and WSA-IC converges to the global optima much faster than LIPS. Although LIPS can converge to the global optima of F9, it obtains a much lower ANOF on F9 than WSA-IC, as shown in Table 7. Moreover, for functions F1, F4, F14, F18 and F19, which contain multiple local optima, WSA-IC reaches the global optima with a faster convergence rate than the other algorithms on F1 and F4, as shown in Fig. 10(a) and Fig. 10(b), and converges to better solutions than the other algorithms on F14, F18 and F19, as shown in Fig. 10(d), Fig. 10(e) and Fig. 10(f). In particular, the convergence rates of WSA-IC on F14 and F19 are much faster than those of the other algorithms. Therefore, it can be concluded that WSA-IC has excellent convergence-rate performance relative to the other algorithms.
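The curves in Fig. 10 are obtained by averaging the best-so-far fitness over the 51 runs at each number of function evaluations. A minimal sketch of this bookkeeping is given below; the helper `run_once`, assumed to return the best-so-far fitness after every function evaluation of a single run, is hypothetical.

```python
def mean_convergence_curve(run_once, n_runs=51):
    """Element-wise mean of best-so-far fitness traces over n_runs independent
    runs; the result corresponds to one curve in a plot such as Fig. 10."""
    curves = [run_once() for _ in range(n_runs)]
    n_evals = min(len(curve) for curve in curves)   # align runs of unequal length
    return [sum(curve[k] for curve in curves) / n_runs for k in range(n_evals)]
```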

Fig. 10: Convergence rate of algorithms on six test functions: (a) F1, (b) F4, (c) F9, (d) F14, (e) F18, (f) F19.

5.4 The effect of different parameter values

As mentioned in Section 3.4, two of the parameters are constants, always set to 2 and 0, respectively (the latter being the attenuation coefficient). For almost all problems, especially those without prior knowledge, the fitness threshold can be set to a fixed default value. Thus, only the stability threshold may need to be assigned different values for different problems. This section presents the ANOF obtained by WSA-IC on all the test functions with different stability threshold values, as shown in Table 9. A clear visual comparison is given in Fig. 11, where the ANOF values obtained with the different stability threshold values are normalized on each test function, with 1 referring to the best ANOF value and 0 to the worst. It can be seen from Table 9 and Fig. 11 that WSA-IC achieves the best ANOF values on most test functions with a stability threshold of 100. Therefore, the stability threshold can be set to 100 for almost all continuous optimization problems.
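Read together with Section 3.4, the role of the two thresholds can be sketched as follows: a whale that has not improved for a number of consecutive iterations equal to the stability threshold is treated as having located an extreme point; if that point's fitness is within the fitness threshold of the best current global optimum, it is archived as a global optimum, and the whale is then reinitialized. The names `t_s`, `t_f` and `handle_stable_whale` and the placeholder value 1e-4 below are illustrative only; the paper's symbols and exact bookkeeping are not reproduced here.

```python
import random

def handle_stable_whale(whale, f, bounds, archive, t_s=100, t_f=1e-4):
    """Bookkeeping step of the iterative counter (a sketch, minimization
    assumed). `whale` is a dict with its position, fitness and a counter of
    consecutive non-improving iterations; `archive` stores the current global
    optima as (position, fitness) pairs. t_s and t_f stand in for the
    stability threshold and fitness threshold; 1e-4 is only a placeholder."""
    if whale["counter"] < t_s:
        return  # not yet stable; the whale keeps iterating normally
    # The whale is considered to have located an extreme point.
    best_known = min((fit for _, fit in archive), default=whale["fitness"])
    if whale["fitness"] - best_known < t_f:
        archive.append((list(whale["position"]), whale["fitness"]))
    # Jump out of the located extreme point: reinitialize the whale at random.
    whale["position"] = [random.uniform(lo, hi) for lo, hi in bounds]
    whale["fitness"] = f(whale["position"])
    whale["counter"] = 0
```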

Fn. (stability threshold) 20 40 60 80 100 120 140 160 180 200
F1 1 1 1 1 1 1 1 1 1 1
F2 32 32 32 32 32 32 32 32 31.98 32
F3 472.94 586.94 622.20 624.92 625 625 625 624.96 624.90 624.96
F4 0.69 0.92 0.96 0.98 1 1 1 1 1 1
F5 125 125 125 125 125 125 125 124.94 124.94 124.90
F6 16 16 16 16 16 15.98 15.94 15.94 15.96 15.96
F7 8 8 8 8 8 8 8 8 7.98 8
F8 216 215.98 216 216 216 215.92 215.88 215.88 215.88 215.82
F9 5.69 4.69 4.24 3.71 4.22 3.84 3.86 4.04 4.06 3.71
F10 0 0 0 0 0 0 0 0 0 0
F11 0.82 0.76 0.63 0.65 0.80 0.73 0.67 0.73 0.73 0.67
F12 0.08 0.02 0.06 0.06 0.14 0.08 0.08 0.08 0.04 0.06
F13 0.69 0.82 0.63 0.67 0.94 0.61 0.69 0.49 0.51 0.49
F14 0 0 0 0 0 0 0 0 0 0
F15 0.94 0.80 0.92 0.88 1 0.92 0.65 0.67 0.71 0.55
F16 0.84 0.88 0.92 0.73 0.96 0.76 0.92 0.90 0.92 0.76
F17 0 0 0 0 0 0 0 0 0 0
F18 0.71 0.78 0.57 0.78 0.88 0.64 0.69 0.67 0.75 0.67
F19 1 0.98 0.96 0.98 0.98 0.96 0.94 0.88 0.92 0.82
F20 0 0 0 0 0 0 0 0 0 0
Table 9: ANOF of WSA-IC with different stability threshold values on F1–F20.

Fig. 11: Overview of the ANOF obtained by WSA-IC with different stability threshold values on each function.

6 Conclusions and future research

A new multimodal optimizer named Whale Swarm Algorithm with Iterative Counter (WSA-IC), based on our preliminary work in Zeng2017 (), is proposed in this paper. Firstly, WSA-IC improves the iteration rule of the original WSA with the attenuation coefficient set to 0, i.e., a whale moves to a new position under the guidance of its "better and nearest" whale only if the new position is better than its original one. As a result, WSA-IC removes the need to specify different values of the attenuation coefficient for different problems in order to form multiple subpopulations, without introducing any niching parameter, and the local exploitation ability is also preserved. Secondly, WSA-IC identifies extreme points and jumps out of them during the iterations, relying on two new parameters, i.e., the stability threshold and the fitness threshold. If a whale has not found a better position after a number of successive iterations equal to the stability threshold, it is considered to have located an extreme point and is reinitialized, which eliminates unnecessary function evaluations and improves the global search ability. If the difference between the fitness value of the located extreme point and the fitness value of the best one among the current global optima is less than the fitness threshold, the located extreme point is accepted as a current global optimum. The values of these two thresholds are easy to set for different problems. The experimental results clearly show that WSA-IC performs statistically better than the other niching metaheuristic algorithms on most test functions across the comprehensive metrics.

The main contributions of this paper are summarized into four aspects as follows.

  1. WSA-IC removes the need to specify an optimal niching parameter for different problems, which increases its practicality.

  2. WSA-IC can efficiently identify and jump out of located extreme points during the iterations, so as to locate as many global optima as possible in a single run, which further increases its practicality.

  3. The algorithm-specific parameters of WSA-IC are easy to assign for different problems, which also increases its practicality.

  4. The population size of WSA-IC does not need to match the number of optima of the optimization problem exactly. Generally, WSA-IC can keep a relatively small population size, which contributes significantly to reducing the computational complexity.

In the future, we will focus on the following aspects.

  1. Introducing other metaheuristic or heuristic algorithms for the best whale in each iteration to perform a neighborhood search, so as to further improve the local search ability and the quality of the optima found.

  2. Designing new methods for escaping from located extreme points, instead of random reinitialization, so as to spread the population over the entire solution space as widely as possible.

Acknowledgements

This research was supported by the National Natural Science Foundation of China (NSFC) (51435009, 51421062 and 61232008) and the National Key Technology Support Program (2015BAF01B04).

References

  • (1) K. Gao, Y. Zhang, A. Sadollah, A. Lentzakis, R. Su, Jaya, harmony search and water cycle algorithms for solving large-scale real-life urban traffic light scheduling problem, Swarm and Evolutionary Computation.
  • (2) M. F. Tasgetiren, D. Kizilay, Q.-K. Pan, P. N. Suganthan, Iterated greedy algorithms for the blocking flowshop scheduling problem with makespan criterion, Computers & Operations Research 77 (2017) 111–126.
  • (3) G. Lin, W. Zhu, M. M. Ali, An effective hybrid memetic algorithm for the minimum weight dominating set problem, IEEE Transactions on Evolutionary Computation 20 (6) (2016) 892–907.
  • (4) H. Ismkhan, Effective heuristics for ant colony optimization to handle large-scale problems, Swarm and Evolutionary Computation 32 (2017) 140–149.
  • (5) J. Yi, X. Li, C.-H. Chu, L. Gao, Parallel chaotic local search enhanced harmony search algorithm for engineering design optimization, Journal of Intelligent Manufacturing (2016) 1–24.
  • (6) Z. A. E. M. Dahi, C. Mezioud, A. Draa, A quantum-inspired genetic algorithm for solving the antenna positioning problem, Swarm and Evolutionary Computation 31 (2016) 24–63.
  • (7) X. Li, Niching without niching parameters: particle swarm optimization using a ring topology, IEEE Transactions on Evolutionary Computation 14 (1) (2010) 150–169.
  • (8) K. A. De Jong, Analysis of the behavior of a class of genetic adaptive systems.
  • (9) D. E. Goldberg, J. Richardson, et al., Genetic algorithms with sharing for multimodal function optimization, in: Genetic algorithms and their applications: Proceedings of the Second International Conference on Genetic Algorithms, Hillsdale, NJ: Lawrence Erlbaum, 1987, pp. 41–49.
  • (10) X. Yin, N. Germay, A fast genetic algorithm with sharing scheme using cluster analysis methods in multimodal function optimization, in: Artificial neural nets and genetic algorithms, 1993, pp. 450–457.
  • (11) G. R. Harik, Finding multimodal solutions using restricted tournament selection., in: ICGA, 1995, pp. 24–31.
  • (12) M. Bessaou, A. Pétrowski, P. Siarry, Island model cooperating with speciation for multimodal optimization, in: Parallel Problem Solving from Nature PPSN VI, Springer, 2000, pp. 437–446.
  • (13) K. Deb, D. E. Goldberg, An investigation of niche and species formation in genetic function optimization, in: Proceedings of the 3rd international conference on genetic algorithms, Morgan Kaufmann Publishers Inc., 1989, pp. 42–50.
  • (14) J. Kennedy, R. Mendes, Population structure and particle swarm performance, in: Evolutionary Computation, 2002. CEC’02. Proceedings of the 2002 Congress on, Vol. 2, IEEE, 2002, pp. 1671–1676.
  • (15) X. Li, M. Epitropakis, K. Deb, A. Engelbrecht, Seeking multiple solutions: an updated survey on niching methods and their applications, IEEE Transactions on Evolutionary Computation.
  • (16) R. Thomsen, Multimodal optimization using crowding-based differential evolution, in: Evolutionary Computation, 2004. CEC2004. Congress on, Vol. 2, IEEE, 2004, pp. 1382–1389.
  • (17) S. W. Mahfoud, Crowding and preselection revisited, Urbana 51 (1992) 61801.
  • (18) O. J. Mengshoel, D. E. Goldberg, Probabilistic crowding: Deterministic crowding with probabilistic replacement, in: Proc. of the Genetic and Evolutionary Computation Conference (GECCO-99), 1999, p. 409.
  • (19) R. K. Ursem, Multinational evolutionary algorithms, in: Evolutionary Computation, 1999. CEC 99. Proceedings of the 1999 Congress on, Vol. 3, IEEE, 1999, pp. 1633–1640.
  • (20) C. L. Stoean, M. Preuss, R. Stoean, D. Dumitrescu, Disburdening the species conservation evolutionary algorithm of arguing with radii, in: Proceedings of the 9th annual conference on Genetic and evolutionary computation, ACM, 2007, pp. 1420–1427.
  • (21) B. Zeng, L. Gao, X. Li, Whale Swarm Algorithm for Function Optimization, Springer International Publishing, Cham, 2017, pp. 624–639. doi:10.1007/978-3-319-63309-1_55.
  • (22) S. Das, S. Maity, B.-Y. Qu, P. N. Suganthan, Real-parameter evolutionary multimodal optimization—a survey of the state-of-the-art, Swarm and Evolutionary Computation 1 (2) (2011) 71–88.
  • (23) J.-P. Li, M. E. Balazs, G. T. Parks, P. J. Clarkson, A species conserving genetic algorithm for multimodal function optimization, Evolutionary computation 10 (3) (2002) 207–234.
  • (24) X. Li, Efficient differential evolution using speciation for multimodal function optimization, in: Proceedings of the 7th annual conference on Genetic and evolutionary computation, ACM, 2005, pp. 873–880.
  • (25) X. Li, Adaptively choosing neighbourhood bests using species in a particle swarm optimizer for multimodal function optimization, in: Genetic and Evolutionary Computation Conference, Springer, 2004, pp. 105–116.
  • (26) D. Beasley, D. R. Bull, R. R. Martin, A sequential niche technique for multimodal function optimization, Evolutionary computation 1 (2) (1993) 101–125.
  • (27) R. Brits, A. P. Engelbrecht, F. Van den Bergh, A niching particle swarm optimizer, in: Proceedings of the 4th Asia-Pacific conference on simulated evolution and learning, Vol. 2, Singapore: Orchid Country Club, 2002, pp. 692–696.
  • (28) C. Stoean, M. Preuss, R. Stoean, D. Dumitrescu, Multimodal optimization by means of a topological species conservation algorithm, IEEE Transactions on Evolutionary Computation 14 (6) (2010) 842–864.
  • (29) K. Deb, A. Saha, Finding multiple solutions for multimodal optimization problems using a multi-objective evolutionary approach, in: Proceedings of the 12th annual conference on genetic and evolutionary computation, ACM, 2010, pp. 447–454.
  • (30) L. Li, K. Tang, History-based topological speciation for multimodal optimization, IEEE Transactions on Evolutionary Computation 19 (1) (2015) 136–150.
  • (31) J. J. Liang, A. K. Qin, P. N. Suganthan, S. Baskar, Comprehensive learning particle swarm optimizer for global optimization of multimodal functions, IEEE transactions on evolutionary computation 10 (3) (2006) 281–295.
  • (32) X. Li, A multimodal particle swarm optimizer based on fitness euclidean-distance ratio, in: Proceedings of the 9th annual conference on Genetic and evolutionary computation, ACM, 2007, pp. 78–85.
  • (33) B.-Y. Qu, P. N. Suganthan, J.-J. Liang, Differential evolution with neighborhood mutation for multimodal optimization, IEEE transactions on evolutionary computation 16 (5) (2012) 601–614.
  • (34) B.-Y. Qu, P. N. Suganthan, S. Das, A distance-based locally informed particle swarm model for multimodal optimization, IEEE Transactions on Evolutionary Computation 17 (3) (2013) 387–402.
  • (35) Y. Wang, H.-X. Li, G. G. Yen, W. Song, Mommop: Multiobjective optimization for locating multiple optimal solutions of multimodal optimization problems, IEEE transactions on cybernetics 45 (4) (2015) 830–843.
  • (36) http://www.ntu.edu.sg/home/EPNSugan/index_files/CEC2015/CEC2015.htm.