Feasibility Preserving Constraint-Handling Strategies for Real Parameter Evolutionary Optimization
Evolutionary Algorithms (EAs) are being routinely applied for a variety of optimization tasks, and real-parameter optimization in the presence of constraints is one such important area. During constrained optimization EAs often create solutions that fall outside the feasible region; hence a viable constraint-handling strategy is needed. This paper focuses on the class of constraint-handling strategies that repair infeasible solutions by bringing them back into the search space and explicitly preserve feasibility of the solutions. Several existing constraint-handling strategies are studied, and two new single parameter constraint-handling methodologies based on parent-centric and inverse parabolic probability (IP) distribution are proposed. The existing and newly proposed constraint-handling methods are first studied with PSO, DE, GAs, and simulation results on four scalable test-problems under different location settings of the optimum are presented. The newly proposed constraint-handling methods exhibit robustness in terms of performance and also succeed on search spaces comprising up-to variables while locating the optimum within an error of . The working principle of the IP based methods is also demonstrated on (i) some generic constrained optimization problems, and (ii) a classic ‘Weld’ problem from structural design and mechanics. The successful performance of the proposed methods clearly exhibits their efficacy as a generic constrained-handling strategy for a wide range of applications.
Keywords constraint-handling, nonlinear and constrained optimization, particle swarm optimization, real-parameter genetic algorithms, differential evolution.
Optimization problems are wide-spread in several domains of science and engineering. The usual goal is to minimize or maximize some pre-defined objective(s). Most of the real-world scenarios place certain restrictions on the variables of the problem i.e. the variables need to satisfy certain pre-defined constraints to realize an acceptable solution.
The most general form of a constrained optimization problem (with inequality constraints, equality constraints and variable bounds) can be written as a nonlinear programming (NLP) problem:
The NLP problem defined above contains decision variables (i.e. is a vector of size ), greater-than-equal-to type inequality constraints (less-than-equal-to can be expressed in this form by multiplying both sides by ), and equality-type constraints. The problem variables are bounded by the lower () and upper () limits. When only the variable bounds are specified then the constraint-handling strategies are often termed as the boundary-handling methods. 111For the rest of the paper, by constraint-handling we imply tackling all of the following: variable bounds, inequality constraints and equality constraints. And, by a feasible solution it is implied that the solution satisfies all the variable bounds, inequality constraints, and equality constraints. The main contribution of the paper is to propose an efficient constraint-handling method that operates and generates only feasible solutions during optimization.
In classical optimization, the task of constraint-handling has been addressed in a variety of ways: (i) using penalty approach developed by Fiacoo and McCormick , which degrades the function value in the regions outside the feasible domain, (ii) using barrier methods which operate in a similar fashion but strongly degrade the function values as the solution approaches a constraint boundary from inside the feasible space, (iii) performing search in the feasible directions using methods such gradient projection, reduced gradient and Zoutendijk’s approach  (iv) using the augmented Lagrangian formulation of the problem, as commonly done in linear programming and sequential quadratic programming (SQP). For a detailed account on these methods along with their implementation and convergence characteristics the reader is referred to [22, 5, 14]. The classical optimization methods reliably and effectively solve convex constrained optimization problems while ensuring convergence and therefore widely used in such scenarios. However, same is not true in the presence of non-convexity. The goal of this paper is to address the issue of constraint-handling for evolutionary algorithms in real-parameter optimization, without any limitations to convexity or a special form of constraints or objective functions.
In context to the evolutionary algorithms the constraint-handling has been addressed by a variety of methods; including borrowing of the ideas from the classical techniques. These include (i) use of penalty functions to degrade the fitness values of infeasible solutions such that the degraded solutions are given less emphasis during the evolutionary search. A common challenge in employing such penalty methods arises from choosing an appropriate penalty parameter () that strikes the right balance between the objective function value, the amount of constraint violation and the associated penalty. Usually, in EA studies, a trial-and-error method is employed to estimate . A study  in 2000 suggested a parameter-less approach of implementing the penalty function concepts for population-based optimization method. A recent bi-objective method  was reported to find the appropriate values adaptively during the optimization process. Other studies [27, 26] have employed the concepts of multi-objective optimization by simultaneously considering the minimization of the constraint violation and optimization of the objective function, (ii) use of feasibility preserving operators, for example, in  specialized operators in the presence of linear constraints were proposed to create new and feasible-only individuals from the feasible parents. In another example, generation of feasible child solutions within the variable bounds was achieved through Simulated Binary Crossover (SBX)  and polynomial mutation operators . The explicit feasibility of child solutions was ensured by redistributing the probability distribution function in such a manner that the infeasible regions were assigned a zero probability for child-creation . Although explicit creation of feasible-only solutions during an EA search is an attractive proposition, but it may not be possible always since generic crossover or mutation operators or other standard EAs do not gaurantee creation of feasible-only solutions, (iii) deployment of repair strategies that bring an infeasible solution back into the feasible domain. Recent studies [19, 11, 10, 3] investigated the issue of constraint-handling through repair techniques in context to PSO and DE, and showed that the repair mechanisms can introduce a bias in the search and hinder exploration. Several repair methods proposed in context PSO [16, 12, 1] exploit the information about location of the optimum and fail to perform when the location of optimum changes . These issues are universal and often encountered with all EAs (as shown in later sections). Furthermore, the choice of the evolutionary optimizer, the constraint-handling strategy, and the location of the optima with respect to the search space, all play an important role in the optimization task. To this end, authors have realized a need for a reliable and effective repair-strategy that explicitly preserves feasibility. An ideal evolutionary optimizer (evolutionary algorithm and its constrained-handling strategy) should be robust in terms of finding the optimum, irrespective of the location of the optimal location in the search space. In rest of the paper, the term constraint-handling strategy refers to explicit feasibility preserving repair techniques.
First we review the existing constraint-handling strategies and then propose two new constraint-handling schemes, namely, Inverse Parabolic Methods (IPMs). Several existing and newly proposed constrained-handling strategies are first tested on a class of benchmark unimodal problems with variable bound constraints. Studying the performance of constraint-handling strategies on problems with variable bounds allows us to gain better understanding into the operating principles in a simplistic manner. Particle Swarm Optimization, Differential Evolution and real-coded Genetic Algorithms are chosen as evolutionary optimizers to study the performance of different constraint-handling strategies. By choosing different evolutionary optimizers, better understanding on the functioning of constraint-handlers embedded in the evolutionary frame-work can be gained. Both, the search algorithm and constraint-handling strategy must operate efficiently and synergistically in order to successfully carry out the optimization task. It is shown that the constraint-handling methods possessing inherent pre-disposition; in terms of bringing infeasible solutions back into the specific regions of the feasible domain, perform poorly. Deterministic constraint-handling strategies such as those setting the solutions on the constraint boundaries result in the loss of population diversity. On the other hand, random methods of bringing the solutions back into the search space arbitrarily; lead to complete loss of all useful information carried by the solutions. A balanced approach that utilizes the useful information from the solutions and brings them back into the search space in a meaningful way is desired. The newly proposed IPMs are motivated by these considerations. The stochastic and adaptive components of IPMs (utilizing the information of the solution’s feasible and infeasible locations), and a user-defined parameter () render them quite effective.
The rest of the paper is organized as follows: Section 2 reviews existing constraint-handling techniques commonly employed for problems with variable bounds. Section 3 provides a detailed description on two newly proposed IPMs. Section 4 provides a description on the benchmark test problems and several simulations performed on PSO, GAs and DE with different constraint-handling techniques. Section 5 considers optimization problems with larger number of variables. Section 6 shows the extension and applicability of proposed IPMs for generic constrained problems. Finally, conclusions and scope for future work are discussed in Section 7.
2 Feasibility Preserving Constraint-Handling Approaches for Optimization Problems with Variable Bounds
Several constraint-handling strategies have been proposed to bring solutions back into the feasible region when constraints manifest as variable bounds. Some of these strategies can also be extended in presence of general constraints. An exhaustive recollection and comparison of all the constraint-handling techniques is beyond the scope of this study. Rather, we focus our discussions on the popular and representative constraint-handling techniques.
The existing constraint-handling methods for problems with variable bounds can be broadly categorized into two groups: Group techniques that perform feasibility check variable wise, and Group techniques that perform feasibility check vector-wise. According to Group techniques, for every solution, each variable is tested for its feasibility with respect to its supplied bounds and made feasible if the corresponding bound is violated. Here, only the variables violating their corresponding bounds are altered, independently, and other variables are kept unchanged. According to Group techniques, if a solution (represented as a vector) is found to violate any of the variable bounds, it is brought back into the search space along a vector direction into the feasible space. In such cases, the variables that explicitly do not violate their own bounds may also get modified.
It is speculated that for variable-wise separable problems, that is, problems where variables are not linked to one another, techniques belonging to Group are likely to perform well. However, for the problems with high correlation amongst the variables (usually referred to as linked-problems), Group techniques are likely to be more useful. Next, we provide description of these constraint-handling methods in detail 222The implementation of several strategies as C codes can be obtained by emailing email@example.com or firstname.lastname@example.org.
2.1 Random Approach
This is one of the simplest and commonly used approaches for handling boundary violations in EAs . This approach belongs to Group . Each variable is checked for a boundary violation and if the variable bound is violated by the current position, say , then is replaced with a randomly chosen value in the range , as follows:
Figure 1 illustrates this approach. Due to the random choice of the feasible location, this approach explicitly maintains diversity in the EA population.
2.2 Periodic Approach
This strategy assumes a periodic repetition of the objective function and constraints with a period . This is carried out by mapping a violated variable in the range to , as follows:
In the above equation, % refers to the modulo operator. Figure 2 describes the periodic approach. The above operation brings back an infeasible solution in a structured manner to the feasible region. In contrast to the random method, the periodic approach is too methodical and it is unclear whether such a repair mechanism is supportive of preserving any meaningful information of the solutions that have created the infeasible solution. This approach belongs to Group .
2.3 SetOnBoundary Approach
As the name suggests, according to this strategy a violated variable is reset on the bound of the variable which it violates.
Clearly this approach forces all violated solutions to lie on the lower or on the upper boundaries, as the case may be. Intuitively, this approach will work well on the problems when the optimum of the problem lies exactly on one of the variable boundaries. This approach belongs to Group .
2.4 Exponentially Confined (Exp-C) Approach
This method was proposed in . According to this approach, a particle is brought back inside the feasible search space variable-wise in the region between its old position and the violated bound. The new location is created in such a manner that higher sampling probabilities are assigned to the regions near the violated boundary. The developers suggested the use of an exponential probability distribution, shown in Figure 4. The motivation of this approach is based on the hypothesis that a newly created infeasible point violates a particular variable boundary because the optimum solution lies closer to that variable boundary. Thus, this method will probabilistically create more solutions closer to the boundaries, unless the optimum lies well inside the restricted search space. This approach belongs to Group .
Assuming that the exponential distribution is , the value of can be obtained by integrating the probability from to (where or , as the case may be). Thus, the probability distribution is given as . For any random number within , the feasible solution is calculated as follows:
2.5 Exponential Spread (Exp-S) Approach
This is a variation of the above approach, in which, instead of confining the probability to lie between and the violated boundary, the exponential probability is spread over the entire feasible region, that is, the probability is distributed from lower boundary to the upper boundary with an increasing probability towards the violated boundary. This requires replacing with (when the lower boundary is violated) or (when the upper boundary is violated) in the Equation 5 as follows:
The probability distribution is shown in Figure 4. This approach also belongs to Group .
2.6 Shrink Approach
This is a vector-wise approach and belongs to Group in which the violated solution is set on the intersection point of the line joining the parent point (), child point (, and the violated boundary. Mathematically, the mapped vector is created as follows:
where is computed as the minimum of all positive values of intercept
for a violated boundary
and for a violated boundary
This operation is shown in Figure 5. In the case shown,
needs to be
computed for variable bound only.
3 Proposed Inverse Parabolic (IP) Constraint-Handling Methods
The exponential probability distribution function described in the previous section brings violated solutions back into the allowed range variable-wise, but ignores the distance of the violated solution with respect to the violated boundary. The distance from the violated boundary carries useful information for remapping the violated solution into the feasible region. One way to utilize this distance information is to bring solutions back into the allowed range with a higher probability closer to the boundary, when the fallen-out distance (, as shown in Figure 6) is small. In situations, when points are too far outside the allowable range, that is, the fallen-out distance is large, particles are brought back more uniformly inside the feasible range. Importantly, when the fallen-out distance is small (meaning that the violated child solution is close to the variable boundary), the repaired point is also close to the violated boundary but in the feasible side. Therefore, the nature of the exponential distribution should become more and more like a uniform distribution as the fallen-out distance becomes large.
Let us consider Figure 6 which shows a violated solution and its parent solution . Let denote the distance between the violated solution and the parent solution. Let and be the intersection points of the line joining and with the violated boundary and the non-violated boundary, respectively. The corresponding distances of these two points from are and , respectively. Clearly, the violated distance is . We now define an inverse parabolic probability distribution function from along the direction as:
where is the upper bound of allowed by the constraint-handling scheme (we define later) and is a pre-defined parameter. By calculating and equating the cumulative probability equal to one, we find:
The probability is maximum at (at the violated boundary) and reduces as the solution enters the allowable range. Although this characteristic was also present in the exponential distribution, the above probability distribution is also a function of violated distance , which acts like a variance to the probability distribution. If is small, then the variance of the distribution is small, thereby resulting in a localized effect of creating a mapped solution. For a random number , the distance of the mapped solution from in the allowable range is given as follows:
The corresponding mapped solution is as follows:
Note that the IP method makes a vector-wise operation and is sensitive to the relative locations of the infeasible solution, the parent solution, and the violated boundary.
The parameter has a direct external effect of inducing small or large variance to the above probability distribution. If is large, the variance is large, thereby having uniform-like distribution. Later we shall study the effect of the parameter . A value 1.2 is found to work well in most of the problems and is recommended. Next, we describe two particular constraint-handling schemes employing this probability distribution.
3.1 Inverse Parabolic Confined (IP-C) Method
In this approach, the probability distribution is confined between , thereby making . Here, a mapped solution lies strictly between violated boundary location () and the parent ().
3.2 Inverse Parabolic Spread (IP-S) Method
4 Results and Discussions
In this study, first we choose four standard scalable unimodal test functions (in presence of variable bounds): Ellipsoidal (), Schwefel (), Ackley (), and Rosenbrock (), described as follows:
In the unconstrained space, , and have a minimum at , whereas has a minimum at . All functions have minimum value . is the only variable separable problem. is a challenging test problem that has a ridge which poses difficulty for several optimizers. In all the cases the number of variables is chosen to be .
For each test problem three different scenarios corresponding to the relative location of the optimum with respect to the allowable search range are considered. This is done by selecting different variable bounds, as follows:
- On the Boundary:
Optimum is exactly on one of the variable boundaries (for , and , and for , ),
- At the Center:
Optimum is at the center of the allowable range (for , and , and for , ), and
- Close to Boundary:
Optimum is near the variable boundary, but not exactly on the boundary (for , and , and for , ).
These three scenarios are shown in the Figure 7 for a two-variable problem having variable bounds: and . Although in practice, the optimum can lie anywhere in the allowable range, the above three scenarios pose adequate representation of different possibilities that may exist in practice.
For each test problem, the population is initialized uniformly in the allowable range. We count the number of function evaluations needed for the algorithm to find a solution close to the known optimum solution and we call this our evaluation criterion . Choosing a high accuracy (i.e. small value of ) as the termination criteria minimizes the chances of locating the optimum due to random effects, and provides a better insight into the behavior of a constraint-handling mechanism.
To eliminate the random effects and gather results of statistical importance, each algorithm is tested on a problem times (each run starting with a different initial population). A particular run is terminated if the evaluation criterion is met (noted as a successful run), or the number of function evaluations exceeds one million (noted as an unsuccessful run). If only a few out of the runs are successful, then we report the number of successful runs in the bracket. In this case, the best, median and worst number of function evaluations are computed from the successful runs only. If none of the runs are successful, we denote this by marking (DNC) (Did Not Converge). In such cases, we report the best, median and worst attained function values of the best solution at the end of each run. To distinguish the unsuccessful results from successful ones, we present the fitness value information of the unsuccessful runs in italics.
An in-depth study on the constraint-handling techniques is carried out in this paper. Different locations of the optimum are selected and systematic comparisons are carried out for PSO, DE and GAs in Sections 4.1, 4.3 and 4.4 , respectively.
4.1 Results with Particle Swarm Optimization (PSO)
In PSO, decision variable and the velocity terms are updated independently. Let us say, that the initial position is , the newly created position is infeasible and represented by , and the repaired solution is denoted by .
If the velocity update is based on the infeasible solution as:
then, we refer to this as “Velocity Unchanged”. However, if the velocity update is based on the repaired location as:
then, we refer to this as “Velocity Recomputed”. This terminology is used for rest of the paper. For inverse parabolic (IP) and exponential (Exp) approaches, we use “Velocity Recomputed” strategy only. We have performed “Velocity Unchanged” strategy with IP and exponential approaches, but the results were not as good as compared to “Velocity Recomputed” strategy. For the SetOnBoundary approach, we use the “Velocity Recomputed” strategy and two other strategies discussed as follows.
Another strategy named “Velocity Reflection” is used, which simply implies that if a particle is set on the -th boundary, then is changed to . The goal of the velocity reflection is to explicitly allow particles to move back into the search space. In the “Velocity Set to Zero” strategy, if a particle is set on the -th boundary, then the corresponding velocity component is set to zero i.e. . For the shrink approach, both “Velocity Recomputed” and “Velocity Set to Zero” strategies are used.
For PSO, a recently proposed hyperbolic  constraint-handling approach is also included in this study. This strategy operates by first calculating velocity according to the standard mechanism 15, and in the case of violation a linear normalization is performed on the velocity to restrict the solution from jumping out of the constrained boundary as follows:
Essentially, the closer the particle gets to the boundary (e.g., only slightly smaller than ), the more difficult it becomes to reach the boundary. In fact, the particle is never completely allowed to reach the boundary as the velocity tends to zero. We emphasize again that this strategy is only applicable to PSO. A standard PSO is employed in this study with a population size of 100. The results for all the above scenarios with PSO are presented in Tables 1 to 4.
|in [0,10]: On the Boundary|
|IP Confined||Recomputed||112,400 (6)||126,500||145,900|
|Exp. Spread||Recomputed||3.79e+00 (DNC)||8.37e+00||1.49e+01|
|SetOnBoundary||Set to Zero||1,000 (40)||1,600||5,300|
|Shrink||Set to Zero||17,900||31,900||49,800|
|Hyperbolic||Modified (Eq. (17))||36,400||41,700||48,700|
|in [-10,10]: At the Center|
|SetOnBoundary||Reflected||3.30e-06 (DNC)||8.32e+01(DNC)||2.95e+02 (DNC)|
|SetOnBoundary||Set to Zero||111,900||132,200||149,700|
|Shrink.||Set to Zero||108,400||125,100||143,600|
|Hyperbolic||Modified (Eq. (17))||101,300||117,700||129,700|
|in [-1,10]: Close to Boundary|
|IP Confined||Recomputed||120,100 (44)||171,200||301,200|
|Periodic||Recomputed||8.09e+02 (DNC)||2.01e+03 (DNC)||5.53e+03(DNC)|
|Periodic||Unchanged||2.16e+03 (DNC)||4.36e+03 (DNC)||6.87e+03 (DNC)|
|Random||Unchanged||8.17e+02 (DNC)||1.96e+03 (DNC)||2.68e+03 (DNC)|
|SetOnBoundary||Recomputed||2.50e+00 (DNC)||1.25e+01 (DNC)||5.75e+02 (DNC)|
|SetOnBoundary||Reflected||1.86e+00 (DNC)||7.76e+00 (DNC)||5.18e+01 (DNC)|
|SetOnBoundary||Set to Zero||1.00e+00 (DNC)||5.00e+00 (DNC)||4.21e+02 (DNC)|
|Shrink||Recomputed||5.00e-01 (DNC)||3.00e+00 (DNC)||1.60e+01 (DNC)|
|Shrink||Set to Zero||108,300 (8)||130,300||143,000|
|Hyperbolic||Modified (Eq. (17))||93,100||108,300||119,000|
|in [0,10]: On the Boundary|
|IP Spread||Recomputed||150,600 (49)||220,900||328,000|
|IP Confined||Recomputed||4.17e+00 (DNC)||6.53e+00||8.79e+00|
|Exp. Spread||Recomputed||2.76e-01 (DNC)||9.62e-01||2.50e+00|
|SetOnBoundary||Set to Zero||1,100||1,800||3,100|
|Shrink.||Set Zero||3.65e+00 (DNC)||6.28e+00||8.35e+00|
|Hyperbolic||Modified (Eq. (17))||24,600 (25)||26,100||28,000|
|in [-10,10]: At the Center|
|IP Spread||Recomputed||53,900 (46)||58,600||66,500|
|IP Confined||Recomputed||54,800 (49)||59,200||64,700|
|SetOnBoundary||Set to Zero||53,900 (49)||59,600||67,700|
|Shrink||Set to Zero||55,700 (49)||58,900||62,000|
|Hyperbolic||Modified (Eq. (17))||52,900 (49)||56,200||64,400|
|in [-1,10]: Close to Boundary|
|IP Spread||Recomputed||54,600 (5)||55,100||56,600|
|IP Confined||Recomputed||63,200 (1)||63,200||63,200|
|Exp. Confined||Recomputed||1.42e+00 (DNC)||2.17e+00||2.92e+00|
|SetOnBoundary||Set to Zero||1.70e+00 (DNC)||2.63e+00||3.26e+00|
|Shrink||Set to Zero||2.01e+00 (DNC)||3.96e+00||6.76e+00|
|Hyperbolic||Modified (Eq. (17))||50,000 (39)||53,500||58,100|
The extensive simulation results are summarized using the following method. For each (say ) of the 14 approaches, the corresponding number of the successful applications () are recorded. Here, an application is considered to be successful if more than 45 runs out of 50 runs are able to find the optimum within the specified accuracy. It is observed that IP-S is successful in 10 out of 12 problem instances. Exponential confined approach (Exp-C) is successful in 9 cases. To investigate the required number of function evaluations (FE) needed to find the optimum, by an approach (say ), we compute the average number of needed to solve a particular problem () and construct the following objective for -th approach:
where FE is the FEs needed by the -th approach to solve the -th problem. Figure 8 shows the performance of each (-th) of the 14 approaches on the two-axes plot ( and ).
The best approaches should have large values and small values. This results in a trade-off between three best approaches which are marked in filled circles. All other 11 approaches are dominated by these three approaches. The SetBound (SetOnBoundary) with velocity set to zero performs in only six out of 12 problem instances. Thus, we ignore this approach. There is a clear trade-off between IP-S and Exp-C approaches. IP-S solves one problem more, but requires more FEs in comparison to Exp-C. Hence, we recommend the use of both of these methods vis-a-vis all other methods used in this study.
Other conclusions of this extensive study of PSO with different constraint-handling methods are summarized as follows:
The constraint-handling methods show a large variation in the performance depending on the choice of test problem and location of the optimum in the allowable variable range.
When the optimum is on the variable boundary, periodic and random allocation methods perform poorly. This is expected intuitively.
When the optimum is on the variable boundary, methods that set infeasible solutions on the violated boundary (SetOnBoundary methods) work very well for obvious reasons, but these methods do not perform well for other cases.
When the optimum lies near the center of the allowable range, most constraint-handling approaches work almost equally well. This can be understood intuitively from the fact that tendency of particles to fly out of the search space is small when the optimum is in the center of the allowable range. For example, the periodic approaches fail in all the cases but are able to demonstrate some convergence characteristics for all test problems, when the optimum is at the center. When the optimum is on the boundary or close to the boundary, then the effect of the chosen constraint-handling method becomes critical.
The shrink method (with “Velocity Recomputed” and “Velocity Set Zero” strategies) succeeded in 10 of the 12 cases.
4.2 Parametric Study of
The proposed IP approaches involve a parameter affecting the variance of the probability distribution for the mapped variable. In this section, we perform a parametric study of to determine its effect on the performance of the IP-S approach.
Following values are chosen: , , , and . To have an idea of the effect of , we plot the probability distribution of mapped values in the allowable range () for in Figure 9. It can be seen that for and 1,000, the distribution is almost uniform.
Figure 10 shows the effect of on problem. For the same termination criterion we find that and perform better compared to other values. With larger values of the IP-S method does not even find the desired solution in all 50 runs.
4.3 Results with Differential Evolution (DE)
Differential evolution, originally proposed in , has gained popularity as an efficient evolutionary optimization algorithm. The developers of DE proposed a total of ten different strategies . In  it was shown that performance of DE largely depended upon the choice of constraint-handling mechanism. We use Strategy 1 (where the offspring is created around the population-best solution), which is most suited for solving unimodal problems . A population size of was chosen with parameter values of and . Other parameters are set the same as before. We use as our termination criterion. Results are tabulated in Tables 5 to 8. Following two observations can be drawn:
For problems having optimum at one of the boundaries, SetOnBoundary approach performs the best. This is not a surprising result.
However, for problems having the optimum near the center of the allowable range, almost all eight algorithms perform in a similar manner.
For problems having their optimum close to one of the boundaries, the proposed IP and existing exponential approaches perform better than the rest of the approaches with DE.
Despite the differences, somewhat similar performances of different constraint-handling approaches with DE indicates that the DE is an efficient optimization algorithm and its performance is somewhat less dependent on the choice of constraint-handling scheme compared to the PSO algorithm.
|in [0,10]: On the Boundary|
|in [-10,10]: At the Center|
|in [-1,10]: Close to Boundary|
|in [0,10]: On the Boundary|
|in [-10,10]: At the Center|
|in [-1,10]: Close to Boundary|
|in [0,10]: On the Boundary|
|Exp. Spread Dist.||66,300||69,250||71,300|
|Exp. Confined Dist||32,750||34,600||36,200|
|in [-10,10]: At the Center|
|in [-1,10]: Close to Boundary|
|in [1,10]: On the Boundary|
|in [-8,10]: At the Center|
|IP Spread||133,350 (41)||887,250||995,700|
|IP Confined||712,500 (44)||854,800||991,400|
|Exp. Spread||390,700 (48)||866,150||998,950|
|Exp. Confined||138,550 (40)||883,500||994,350|
|in [0,10]: Close to Boundary|
|IP Confined||46,400 (46)||95,900||891,450|
|Exp. Spread||49,550 (49)||85,900||968,200|
|Exp. Confined||87,300 (43)||829,200||973,350|
4.4 Results with Real-Parameter Genetic Algorithms (RGAs)
We have used two real-parameter GAs in our study here:
Standard-RGA using the simulated binary crossover (SBX)  operator and the polynomial mutation operator . In this approach, variables are expressed as real numbers initialized within the allowable range of each variable. The SBX and polynomial mutation operators can create infeasible solutions. Violated boundary, if any, is handled using one of the approaches studied in this paper. Later we shall investigate a rigid boundary implementation of these operators which ensures creation of feasible solutions in every recombination and mutation operations.
Elitist-RGA in which two newly created offsprings are compared against the two parents, and the best two out of these four solutions are retained as parents (thereby introducing elitism). Here, the offspring solutions are created using non-rigid versions of SBX and polynomial mutation operators. As before, we test eight different constraint-handling approaches and, later explore a rigid boundary implementation of the operators in presence of elite preservation.
Parameters for RGAs are chosen as follows: population size of 100, crossover probability =0.9, mutation probability =0.05, distribution index for crossover =2, distribution index for mutation =100. The results for the Standard-RGA are shown in Tables 9 to 12 for four different test problems. Tables 13 to 16 show results using the Elitist-RGA. Following key observations can be made:
For all the four test problems, Standard-RGA shows convergence only in the situation when optima is on the boundary.
Elitist-RGA shows good convergence on when the optimum is on the boundary and, only some convergence is noted when the optima is at the other locations. For other three problems, convergence is only obtained when optimum is present on the boundary.
Overall, the performance of Elite-RGA is comparable or slightly better compared to Standard-RGA.
The Did Not Converge cases can be explained on the fact that the SBX operator has the property of creating solutions around the parents; if parents are close to each other. This property is a likely cause of premature convergence as the population gets closer to the optima. Furthermore the results suggest that the elitism implemented in this study (parent-child comparison) is not quite effective.
Although RGAs are able to locate the optima, however, they are unable to fine-tune the optima due to undesired properties of the generation scheme. This emphasizes the fact that generation scheme is primarily responsible for creating good solutions, and the constraint-handling methods cannot act as surrogate for generating efficient solutions. Each step of the evolutionary search should be designed effectively in order to achieve overall success. On the other hand one could argue that strategies such as increasing the mutation rate (in order to promote diversity so as to avoid pre-mature convergence) should be tried, however, creation of good and meaningful solutions in the generation scheme is rather an important and a desired fundamental-feature.
As expected, when the optima is on the boundary SetOnBoundary finds the optima most efficiently within a minimum number of function evaluations. Like in PSO the performance of Exp. Confined is better than Exp. Spread. Periodic and Random show comparable or slightly worse performances (these mechanisms don’t have any preference of creating solutions close to the boundary and actually promote spread of the population).
|Exp. Spread||103,100 (6)||718,900||931,200|
|Exp. Spread||21,200 (47)||180,000||772,200|
|Exp. Spread||58,100 (29)||355,900||994,000|
|in [1,10]: On the Boundary|
|IP Spread||12,400 (39)||15,800||20,000|
|IP Confined||9,400 (39)||11,800||13,600|
|Exp. Spread||9.73e+00 (DNC)||1.83e+00||2.43e+01|
|in [1,10]: On the Boundary|
|IP Spread||980,200 (1)||980,200||980,200|
|IP Confined||479,000 (1)||479,000||479,000|
|Exp. Spread||2.06e-01 (DNC)||4.53e-01||4.86e-01|
|Exp. Confined||954,400 (1)||954,400||954,400|
|IP Spread||803,400 (5)||886,100||947,600|
|IP Confined||643,300 (2)||643,300||963,000|
|Exp. Spread||593,300 (3)||628,900||863,500|
|Exp. Confined||655,400 (3)||940,500||946,700|
|IP Spread||9,900 (13)||12,500||14,000|
|IP Confined||10,100 (12)||12,100||14,400|
|Exp. Spread||8,500 (10)||11,000||15,400|
|Exp. Confined||6,600 (30)||7,800||8,900|
in [1,10]: On the Boundary
4.4.1 RGAs with Rigid Boundary
|Optimum on the boundary|
|Optimum in the center|
|Optimum close to the edge of the boundary|
We also tested RGA (standard and its elite version) with a rigid bound consideration in its operators. In this case, the probability distributions of both SBX and polynomial mutation operator are changed in a way so as to always create a feasible solution. It is found that the Standard-RGA with rigid bounds shows convergence only when optimum is on the boundary (Table 17). The performance of Elite-RGA with rigid bounds is slightly better (Table 18). Overall, SBX operating within the rigid bounds is found to perform slightly better compared to the RGAs employing boundary-handling mechanisms. However, as mentioned earlier, in the scenarios where the generation scheme cannot guarantee creation of feasible only solutions there is a necessary need for constraint-handling strategies.
|Optimum on the boundary|
|Optimum in the center|
|Optimum close to the boundary edge|
5 Higher-Dimensional Problems
As the dimensionality of the search space increases it becomes difficult for a search algorithm to locate the optimum. Constraint-handling methods play even a more critical role in such cases. So far in this study, DE has been found to be the best algorithm. Next, we consider all four unimodal test problems with an increasing problem size: , , , , , and . For all problems the variable bounds were chosen such that optima occured near the center of the search space. No population scaling is used for DE. The DE parameters are chosen as and . For we were able to achieve a high degree of convergence and results are shown in Figure 11. As seen from the figure, although it is expected that the required number of function evaluations would increase with the number of variables, the increase is sub-quadratic. Each case is run times and the termination criteria is set as . All 20 runs are found to be successful in each case, demonstrating the robustness of the method in terms of finding the optimum with a high precision. Particularly problems with large variables, complex search spaces and highly nonlinear constraints, such a methodology should be useful in terms of applying the method to different real-world problems.
It is worth mentioning that authors independently tested other scenarios for large scale problems with corresponding optimum on the boundary, and in order to achieve convergence with IP methods we had to significantly reduce the values of . Without lowering , particularly, PSO did not show any convergence. As expected in larger dimensions the probability to sample a point closer to the boundary decreases and hence a steeper distribution is needed. However, this highlights the usefulness of the parameter to modify the behavior of the IPMs so as to yield the desired performance.
6 General Purpose Constraint-Handling
So far we have carried out simulations on problems where constraints have manifested as the variable bounds. The IP methods proposed in this paper can be easily extended and employed for solving nonlinear constrained optimization problems (inclusive of variable bounds).333By General Purpose Constraint-Handling we imply tackling of all variable bounds, inequality constraints and equality constraints.
As an example, let us consider a generic inequality constraint function: – the -th constraint in a set of inequality constraints. In an optimization algorithm, every created (offspring) solution at an iteration must be checked for its feasibility. If satisfies all inequality constraints, the solution is feasible and the algorithm can proceed with the created solution. But if does not satisfy one or more of constraints, the solution is infeasible and should be repaired before proceeding further.
Let us illustrate the procedure using the inverse parabolic (IP) approach described in Section 3; though other constraint-handling methods discussed before may also be used. The IP approaches require us to locate intersection points and : two bounds in the direction of (), where is one of the parent solutions that created the offspring solution (see Figure 12). The critical intersection point can be found by finding multiple roots of the direction vector with each constraint and then choosing the smallest root.
We define a parameter as the extent of a point from , as follows:
Substituting above expression for 444Note that here for calculating points should not be confused with parameter introduced in the proposed constraint-handling methods.in the -th constraint function, we have the following root-finding problem for the -th constraint:
Let us say the roots of the above equation are for . The above procedure can now be repeated for all inequality constraints and corresponding roots can be found one at a time. Since the extent of to reach from is given as
we are now ready to compute the two closest bounds (lower and upper bounds) on for our consideration, as follows:
IP-S and IP-C approaches presented in Section 3 now can be used to map the violated variable value into the feasible region using (the lower bound), (the upper bound) and (location of parent). It is clear that the only difficult aspect of this method is to find multiple intersection points in presence of nonlinear constraints.
For the sake of completeness, we show here that the two bounds for each variable: and used in previous sections can be also be treated uniformly using the above described approach. The two bounds can be written together as follows:
Note that a simultaneous non-positive value of each of the bracketed terms is not possible, thus only way to satisfy the above left side is to make each bracketed term non-negative. The above inequality can be considered as a quadratic constraint function, instead of two independent variable bounds and treated as a single combined nonlinear constraint and by finding both roots of the resulting quadratic root-finding equation.
Finally, the above procedure can also be extended to handle equality constraints () with a relaxation as follows: . Again, they can be combined together as follows:
Alternatively, the above can also be written as and may be useful for non-gradient based optimization methods, such as evolutionary algorithms. We now show the working of the above constraint handling procedure on a number of constrained optimization problems.
6.1 Illustrations on Nonlinear Constrained Optimization
First, we consider the three unconstrained problems used in previous sections as , but now add an inequality constraint by imposing a quadratic constraint that makes the solutions fall within a radius of one-unit from a chosen point :
There are no explicit variable bounds in the above problem. By choosing different locations of the center of the hyper-sphere (), we can have different scenarios of the resulting constrained optimum. If the minimum of the objective function (without constraints) lies at the origin, then setting the unconstrained minimum is also the solution to the constrained problem, and this case is similar to the “Optimum at the Center” (but in the context of constrained optimization now). The optimal solution is at with . DE with IP-S and previous parameter settings is applied to this new constrained problem, and the results from 50 different runs for this case are shown in Table 19.
As a next case, we consider . The constrained minimum is now different from that of the unconstrained problem, as the original unconstrained minimum is no more feasible. This case is equivalent to “Optima on the Constraint Boundary”. Since the optimum value is not zero as before, instead of choosing a termination criterion based on value, we allocate a maximum of one million function evaluations for a run and record the obtained optimized solution. The best fitness values for as , and are shown in Table 20. For each function, we verified that the obtained optimized solution satisfies the KKT optimality conditions [22, 5] suggesting that a truly optimum solution has been found by this procedure.
|640.93 0.00||8871.06 0.39||6.56 0.00|
Next, we consider two additional nonlinear constrained optimization
problems (TP5 and TP8) from  and a well-studied structural design and mechanics problem (‘Weld’) taken from .
The details on the mechanics of the welded structure and the beam deformation can be found in [23, 25].
These problems have
multiple nonlinear inequality constraints and our goal is to demonstrate the performance of our proposed constraint-handling
methods. We used DE with
IP-S method with the following
parameter settings: , , , and for all three problems.
A maximum of 200,000 function evaluations were allowed and a termination criteria of
from the known optima is chosen.
The problem definitions of TP5, TP8 and ‘Weld’ are as follows: