# An approximate Itô-SDE based simulated annealing algorithm for multivariate design optimization problems

###### Abstract

This research concerns design optimization problems involving numerous design parameters and large computational models. These problems generally consist in non-convex constrained optimization problems in large and sometimes complex search spaces. The classical simulated annealing algorithm rapidly loses its efficiency in high search space dimension. In this paper a variant of the classical simulated annealing algorithm is constructed by incorporating (1) an Itô stochastic differential equation generator (ISDE) for the transition probability and (2) a polyharmonic splines interpolation of the cost function. The control points are selected iteratively during the research of the optimum. The proposed algorithm explores efficiently the design search space to find the global optimum of the cost function as the best control point. The algorithm is illustrated on two applications. The first application consists in a simple function in relatively high dimension. The second is related to a Finite Element model.

Keywords: Engineering Design; Simulated annealing; Itô stochastic differential equation; polyharmonic spline interpolation)

## 1 Introduction

Recent advances in computational mechanics offer engineers the possibility of constructing very predictive computational models of complex mechanical structures. Such computational models can then be used to assess the performance of the structure over specific design scenarios. This increase of predictability is inevitably accompanied by an increase of the size of the computational models. In most industrial sectors, it is common for engineers to deal with computational models having tens of millions of degrees of freedom. The forward problem consisting in calculating the response of an already designed system can be performed using usual computational resources. Nevertheless the design of a system represented by a large computational model is a hard task since it generally consists in a large non-convex constrained optimization problem in a large and sometimes complex search space and requires numerous runs of the cost function.

There exists several methods in the literature in order to find or approximate the global maximum of a non-convex constrained optimization problem [Pardalos and Romeijn(2002)]. Among these methods, simulated annealing methods [Kirkpatrick, Gelatt, and Vecchi(1983), Černý(1985)] are attractive metaheuristic random search methods which consist in assigning the design parameters a probability distribution whose logarithm is proportional to the cost function to be minimized. Then a Markov Chain is generated and by varying the coefficient of proportionality (inverse of the temperature) in an appropriate way, the Markov Chain converges to the global optimum. This methodology, for which the basic version [Kirkpatrick, Gelatt, and Vecchi(1983), Černý(1985)] is easy to implement, has been improved by many researchers (see for instance [Pardalos and Romeijn(2002), Ingber(1993), Suman and Kumar(2006)]). In [Zolfaghari and Liang(1999)] the authors have combined the simulated annealing with a tabu search approach in order to avoid solution re-visits. In [Mingjun and Huanwen(2004)] the authors introduce a chaotic initialization and a chaotic sequence generation to improve the convergence rate of the simulated annealing method. Many researches have been devoted to the extension of simulated annealing method to multiobjective optimization problems [Suman and Kumar(2006)]. The improvement of the annealing schedule has also interested many researchers (see for instance [Kouvelis and Chiang(1992)] for the choice of the initial temperature and [Ingber(1989), Noutani and Andresen(1998), Azizi and Zolfaghari(2004), Triki, Collette, and Siarry(2004)] for the cooling schedule). Convergence results related to simulated annealing algorithm have been studied in [Aarts and Laarhoven(1985), Lundy and Mees(1986), Faigle and Schrader(1988), Faigle and Kern(1991), Granville, Krivanek, and Rasson(1994)] for the case where the temperature is held constant until a stationary distribution is reached and in [Mitra, Romeo, and Sangiovanni-Vincentelli(1986), Hajek(1988), Connors and Kumar(1989), Borkar(1992), Anily and Federgruen(1987), Yao and Li(1991)] for the case where the temperature always changes (and then no stationary distribution can be reached). For cost functions involving large computational models, the simulated annealing algorithm can become prohibitively time consuming. To circumvent this difficulty, an adaptative surface response can be introduced for the approximation of the cost function [Wang, Dong, and Aitchison(2001)].

The basic version of the simulated annealing [Kirkpatrick, Gelatt, and Vecchi(1983), Černý(1985)] is based on a Metropolis-Hastings algorithm [Hastings (1970)] for which a proposal distribution has to be provided. If the dimension of the search spaces increases, it is not easy to defined this proposal distribution and the Markov chain can become very slow due to numerous rejections. In this case, for large computational models, the use of the classical simulated annealing algorithm can become prohibitive due to the large number of runs of the cost function that are needed.

The objective of this paper is to improve the simulated annealing algorithm in order to design, in high-dimensional search space, complex mechanical systems represented by large computational models. The methodology proposed in the present paper is based on two ingredients. The first one consists in replacing the Metropolis-Hastings algorithm by an Itô stochastic differential equation (ISDE) generator [Soize(1994), Soize(2008)] which belongs to class of Monte Carlo Markov Chain Methods (MCMC). This generator is very well adapted in high dimension and does not require the specification of a proposal distribution. Nevertheless it requires the gradient of the cost function to be evaluated thousands times making it unusable. To circumvent this difficulty, the second ingredient of the methodology proposed here consists, as in [Wang, Dong, and Aitchison(2001)], in introducing response surface function in order to approximate the cost function. Since we are interested in multivariate optimization problems, a radial interpolation surface (polyharmonic) is preferred to a polynomial approximation (as used in[Wang, Dong, and Aitchison(2001)]). This interpolation is well adapted in relatively high input dimension and allows the gradient of the cost function to be calculated explicitly at each step of the Markov Chain. An adaptative strategy is proposed in order to enrich sequentially, during the numerical integration of the ISDE, the set of the control points that are used to construct the interpolation surface. At the end of the simulation the best control point is selected as the global optimum.

In Section 2, the design optimization problem and the classical simulated annealing algorithm are presented. Then Section 3 is devoted to the presentation the ISDE-based simulated annealing algorithm. The constraints on the design parameters are taken into account by introducing a regularisation of the indicator function. In Section 4, the polyharmonic interpolation of the cost function is introduced and the complete approximate ISDE-based simulated annealing algorithmdesign optimization algorithm is presented. The methodology is illustrated through two applications in Section 5. The first application consists in the classical Ackley’s test function in high dimension. The second one consists in a Finite Element structure submitted to an acceleration of the soil and for which the maximal displacement has to be minimized by designing the stiffnesses of supporting springs.

## 2 Design optimization problem and classical simulated annealing algorithm

In this section the design optimization problem is formulated and the simulated annealing probability distribution is constructed classically using the Maximum Entropy (MaxEnt) principle.

Let be the vector of the design parameters with values in . Let be the admissible set for vector . It is assumed that the design optimization problem to be solved consists in searching the global minimum of a non-convex positive cost function . Then the optimal value of vector is calculated such that

(1) |

For instance, in case of a least square problem, the function can be written as

(2) |

in which is a performance vector-valued function and is a target vector. In random search methods, a random vector with values in is associated with the vector . In the simulated annealing the probability density function (pdf) of random vector is constructed by using the MaxEnt principle [Shannon(1948), Jaynes(1957), Kapur and Kevasan(1992)] under the constraints defined by the available information related to random vector . This information is written as

(3) |

in which is the mathematical expectation. Equation(3-b) is related to the normalization of the pdf and Eq. (3-c) is related to the finiteness of the mean value of the cost function which is interpreted as an energy in thermodynamics. The entropy related to pdf is defined by

(4) |

where is the natural logarithm. This functional measures the relative uncertainty of . Let be the set of all the possible pdfs of random vector , satisfying the constraints defined by Eq. (3). Then the MaxEnt principle consists in constructing the pdf as the unique pdf in which maximizes entropy . Then by introducing a positive Lagrange multiplier associated with Eq. (3-b) and a Lagrange multiplier associated with Eq. (3-c), it can be shown (see [Jaynes(1957), Kapur and Kevasan(1992)]) that the MaxEnt solution, is defined by

(5) |

in which is a normalization constant. In the context of the simulated annealing algorithm, the Lagrange multiplier is written as (or with ) in which is the temperature which controls the sharpness of pdf . Then random variable is rewritten and its pdf is rewritten as

(6) |

A large value of the temperature makes tend to the uniform distribution and a small value of the temperature makes tend to a Dirac distribution centred at the optimal value. Then the classical simulated annealing algorithm consists in constructing a sequence using the Metropolis-Hastings (MH) algorithms [Hastings (1970)] with the pdf while decreasing slowly the temperature (see for instance [Noutani and Andresen(1998)] for the cooling strategies). The algorithm is summarized in Algorithm 1.

In this algorithm 1, it is often set for , skipping the burning phase. In this case, there is only one loop and the temperature is changed at each iteration. Under some conditions related to the rate of decreasing of the temperature, the algorithm converges to the global optimum. Two difficulties are usually referred for the classical simulated annealing algorithm:(1) it requires numerous calculation of the cost function making it unusable for large computational model and (2) if the dimension of random design vector is large, the Metropolis-Hastings algorithms may have difficulties to explore efficiently the design search space. In the next sections two modifications are introduced to circumvent these two difficulties. The first one consists in replacing the Metropolis-Hastings generator by an ISDE generator in order to explore efficiently the design search space in high dimension. The second one consists in introducing a polyharmonic spline interpolation of the function for which the list of control points is enriched during the algorithm.

## 3 Construction of a new ISDE-based simulated annealing algorithm

The ISDE generator consists in constructing the pdf of random vector as the density of the invariant measure associated with the stationary solution of a second-order nonlinear ISDE. This methodology, which has similarities with the Hamiltonian (or Hybrid) Monte Carlo method [Duane, et al(1987), Salazar and Toral(1997)], has recently been revisited [Soize(1994), Soize(2008)] in the context of the generation of random vectors in high dimension for which the pdf is constructed using the MaxEnt principle with complex available information. The advantages of this generator compared to the other MCMC generators such as the Metropolis-Hastings algorithm (see [Hastings (1970)]) are: (1) there is no need to introduce a proposal distribution (uniform in case of random walk) for the exploration of the admissible space, (2) a damping can be introduced in order to rapidly reach the invariant measure and (3) mathematical results concerning ISDEs can be used for convergence analyses. Below, we first present the ISDE generator followed by its application for generating independent realizations of . Then a new simulated algorithm based on the use of this ISDE generator is presented

### 3.1 ISDE generator

Let be a positive function. Let be a stochastic process with values in satisfying, for all , the following ISDE [Itô(1951), Soize(1994)]

(7) |

with the initial conditions

(8) |

In Eq. (7), is the gradient of function with respect to . The matrix is a symmetric positive-definite damping matrix, the lower triangular matrix is such that and is the normalized Wiener stochastic process indexed by . The random initial condition is a second-order random variable independent of the Wiener stochastic process . Then it can be proven (see [Soize(1994), Soize(2008)]) that, if is continuous on , if is locally bounded on (i.e. is bounded on all compact set in ), and if

(9) |

(10) |

(11) |

then the ISDE defined by Eqs. (7) and (8) admits an invariant measure defined by the pdf which is written as

(12) |

It can then be deduced that, for , the stochastic process tends to a stationary stochastic process in probability distribution, for which the one-order marginal probability distribution is such that

(13) |

Therefore, using an independent realization of the Wiener stochastic process and an independent realization of the initial condition , an independent realization random variable related to the probability distribution (13) can be constructed as the solution of the ISDE defined by Eqs. (7) and (8), for sufficiently large. In other words, independent realizations of any random variable having a probability distribution of the form (13) (verifying the required continuity and regularity properties for function ) can be obtained by solving the ISDE defined by Eqs. (7) and (8). The value of for which the invariant measure is approximatively reached depends on the choice of the damping matrix and on the probability distribution of the random initial conditions. The damping induced by the matrix has to be sufficiently large in order to rapidly kill the transient response but a too large damping introduces increasing errors in the numerical integration of the ISDE.

The solution of the ISDE can be constructed numerically using a for instance a modified Euler integration schemes which has a better accuracy than a classical Euler integration schemes (see [Soize(2008), Burrage(2007)]). Let be the integration step size and let be the sampling points, M being a positive integer. Let and be the time series related to the discretization of random processes and . Then, after having generated the random initial values and , the following updating rules applies

(14) |

The vector is a second-order Gaussian centered random vector with covariance matrix equal to and the random vectors are mutually independent.

### 3.2 ISDE generator for random variable

Equation (6) can be rewritten as

(15) |

Comparing equations (13) and (15), we naturally introduce the function

(16) |

Unfortunately, due to the presence of the indicator function , the function is not differentiable and then cannot be used within an ISDE generator. To solve this issue, the indicator function is replaced by the regularized indicator function . For simple manifolds, explicit regularized indicator function can be constructed. Two simple examples are given below (see [Batou and Soize(2014)]):

Example 1: Positive design variables

Assume that the constraints on are

(17) |

In this case, the regularized indicator function can be constructed as follow

(18) |

where are shape parameters

Example 2: Design variables belonging to independent intervals

Assume that the constraints on are

(19) |

where and , are lower and upper bounds. In this case, the regularized indicator function can be constructed as follow

(20) |

For more complex manifolds, a general kernel-smoothing regularization approach has been proposed in [Guilleminot and Soize(2014)].

Then a regularized ISDE can be introduced by replacing, in Eq.(7), by such that

(21) |

and then independent realizations of random variable can be constructed.

### 3.3 ISDE based simulated annealing algorithm

Now that an ISDE generator for random variable is constructed, we can use this generator for replacing the MH algorithm in the classical simulated annealing algorithm 1. This ISDE based simulated annealing algorithm is summarized in Algorithm 2.

At each changing of the temperature, three parameters have to be set: the steps in the ISDE , the damping matrix and the number of steps in the ISDE. By analogy to second-order dynamical systems, the time step that guaranties the stability of the modified Euler scheme is calculated using the eigenvalues related the conservative part of the dynamical equation. In the ISDE (7), the corresponding ”mass” matrix is identity and the ”stiffness” matrix for given ”displacement” is the hessian matrix of function . Then at each iteration , the step is calculated as where is an integer such that and is the largest eigenvalue of the hessian matrix . When approaching the border of the admissible domain , the ISDE can become very ”stiff”, which may yield stability issues. To avoid this, the integer has to be chosen large enough but not too large in order to avoid a too slow integration of the ISDE. Alternatively, an adaptative step-size algorithm can be used to automatically choose the best step-size at each iteration of the Euler Scheme [Guilleminot and Soize(2014)]. The damping matrix has to be large enough in order to kill rapidly the transient response. But a too large damping will introduce errors during the integration. A good compromise consists in imposing the largest damping rate equal to . This can achieved approximatively by choosing . The number of steps of can be chosen as invariant and then determined through an off-line convergence analysis. The convergence can also be controlled during the integration of each ISDE yielding different values for .

## 4 Approximation of the cost function

In this section, the function in the ISDE is approximated using multivariate polyharmonic splines. This approximation allows to reduce the computational cost for large computational models. Furthermore it allows to derive explicitly the gradient , avoiding the calculation of the gradient numerically. The multivariate polyharmonic splines approximation of function is written as

(22) |

in which is the vector of the -order polyharmonic splines defined at control points and which are such that, for

(23) |

For the odd and even cases, if then . In Eq. (22), is the weight vectors in . Classically, this vector should be calculated by solving the set of equations

(24) |

yielding independent equations allowing vectors to be calculated. The function in Eq. (21) is then approximated by the function defined by

(25) |

whose gradient is

(26) |

with

(27) |

where

(28) |

It should be noted that for , the approximated gradient is not defined at the control points. Then only the case will be considered. It can be shown that as , function in Eq.(22) is such that

(29) |

Then since the limit of as is if the domain is bounded and zero otherwise, a sufficient condition for Eq.(9) holds is . This condition condition can be enforced by imposing the following condition

(30) |

when solving Eq.(24) (introduction of a Lagrange multiplier). The verification of Eq.(11) is straightforward using the preceding remark and the continuity of function . Concerning the condition Eq.(10), we have . Since (1) the function generally decreases very rapidly out of and (2) for large values of the components of , (a) the exponential decreases rapidly under the condition Eq.(30) and (b) the gradient is polynomial, then it can be deduced that the two integrals are finite and then Eq.(11) holds.

In this section, we have used polyharmonic splines for approximating the function . Other radial basis functions such as Gaussian or Hardy [Buhmann(2003)] types could also be used. The polyharmonic functions have been preferred since they do not require shape parameters to be estimated. It should be noted that, even if radial interpolation methods are well adapted for the interpolation of multivariate functions, they suffer, like other surface response, of the curse of dimensionality. Therefore, they are not adapted if the input research space dimension is very high.

## 5 Approximate ISDE-based simulated annealing algorithm

The approximate ISDE-based simulated annealing algorithm is similar to Algorithm 2, with the difference that function is replaced by its polyharmonic splines approximation introduced in the previous section. Furthermore after each iteration , the polyharmonic splines approximation is enriched by adding a new control point as the end value reached during the integration of the ISDE. The algorithm is initialized by generating control points randomly in . The complete algorithm is summarized in Algorithm 3.

This procedure allows the added control points to be distributed according to the temperature-dependent distribution. This algorithm can be parallelized by integrating simultaneously several ISDEs with different randomly generated initial conditions at each iteration .

## 6 Application

### 6.1 Ackley’s test function

The objective here is to illustrate and show the efficiency of both Algorithms 2 and 3. For this purpose, we use the Ackley’s function which is a classical function for testing multivariate optimization algorithms. This function is defined by

(31) |

with the following constraints

(32) |

For this case the global maximum is reached at

(33) |

The number of local minima increases rapidly with dimension.

#### 6.1.1 Exact ISDE-based simulated annealing algorithm

For case , the ISDE-based simulated annealing sequence, described in Algorithm 2, is generated. The total number of ISDE integrations is . For each ISDE, there are steps. The temperature decreasing law is chosen as

(34) |

in which , , . This function is represented in Fig. 3.

To take into account the constraints, the regularized indicator function in Eq.(20) with . Figure 4 shows the values generated using Algorithm 2.

It can be seen the capacity of the algorithm to escape from local minima and to converge to the global minimum. There are still fluctuations at the end of the simulation since the end-temperature is not zero.

Let’s now increase the dimension to . The parameter of the temperature decreasing law are , , . Figure 5 shows the values generated using Algorithm 2).

It can be seen in this figure that all the components converge to the global minimum. To show the efficiency of this algorithm, these results are compared with the ones obtained using the classical simulated annealing algorithm (1). The same temperature law, and the same increment size calculation method (based on the Hessian of cost function) has been used. Nevertheless, since the calculations of the gradient in Algorithm (2) is more expensive than the simple function evaluations in Algorithm (1), steps for each MH sequence has been set in order to have the same calculation time. The results of the optimisation using the classical simulated annealing algorithm are shown in Fig. 6.

It can be seen in this figure that the result obtained using the ISDE based algorithm are better than those obtained using the classical SA algorithm.

#### 6.1.2 Approximate ISDE-based simulated annealing algorithm

This section illustrates Algorithm 3 for which the cost function is approximated using a polyharmonic spline approximation at order . The initial number of control points is . The number of enrichments is . For each ISDE, there are steps. It should be noted that for this simple application the computation time is larger using the adaptative Algorithm 3 than using the Algorithm 2 (without approximation of the cost function). The objective here is just to validate the methodology. The computational gain of Algorithm 3 will be illustrated in the next application which consists in a Finite Element simulation. For case , Fig. 7 shows the values . It can be seen again that the algorithm converges rapidly to the global minimum.

The approximate cost function obtained after the enrichments is plotted on Fig. 8. Compared with Fig. 2, it can be seen a good approximation of the cost function is the visited regions.

For case , Fig. 10 shows the values . It can be seen the ability of the algorithm to converge to the global minimum when increasing the dimension.

### 6.2 Finite Element model

We are interested in the maximal response of the 2-storey structure represented in Fig. 10 submitted to the soil acceleration plotted in Fig. 11.

This structure is made up with two m plates (blue color), eight vertical beams (red color), eight horizontal beams (red color). The structure is linked to the ground by four linear springs. The bottom plate has thickness m, Young’s modulus Pa, mass density kg/m, Poisson ratio . The top plate has thickness m, Young’s modulus Pa, mass density kg/m and Poisson ratio . All the vertical beams have length m, diameter m (circular section) Young’s modulus Pa, mass density kg/m and Poisson ratio . All the horizontal beams have length m, diameter m (circular section) Young’s modulus Pa and Poisson ratio . The north, east, south and west horizontal beams have mass density kg/m, kg/m, kg/m and kg/m respectively. The initial value for the springs’ stiffnesses (same value for the three directions) are N/m. The structure is subjected to the seismic ground acceleration plotted in Fig. 11 along -direction.

We are interested in the displacement at the observation point located at m on the top plate. The norm of this displacement is plotted in Fig. 12.

The objective is to design the stiffnesses of the springs in order to minimize the maximum value of the norm of the displacement. Then .To perform this task, the approximated ISDE-based simulated algorithm 3 is used with a polyharmonic approximation at order . The admissible domains for the stiffnesses are N/m. The initial number of control points is . The number of enrichments is . The number of steps for each ISDE between two enrichments is . The temperature decreasing law is plotted in Fig. 13.

The optimal control point is N/m. The evolution of the stiffnesses and the exact cost function during the simulation are shown in Figs. 14 and 15.

It can be seen it these figures that the optimum is rapidly reach using a few number of function calls. The exact and approximate univariate variations of the cost function around the optimal value are illustrated in Fig. 16.The exact and approximate bivariate variations of the cost function around the optimal value are illustrated in Figs. 17 and 18. It can be seen in these figures that the global optimum has been found despite the presence of local minima. Furthermore, it can be seen that the polyharmonic splines provide a good approximation of the cost function around the optimal value.

It should be noted that these figures seem to show that the algorithm has reach the global minimum but there is no guaranty unless all the research space is rigorously explored.

The new norm of the displacement of the observation point calculated using the designed structure is plotted on Fig.19. The previous maximum displacement was m. The new one, after optimization, is m.

## 7 Conclusions

We have presented an original variant of the simulated annealing algorithm (1) by incorporating an ISDE generator for the temperature-dependent probability distribution in order to explore efficiently the search space and (2) by interpolating the cost function using polyharmonic splines in order to calculate explicitly and efficiently the cost function and its gradient. Several Markov Chain can be generated in parallel in order to accelerate the enrichment of the interpolation surface. The algorithm has been illustrated on two applications. The first application has shown the potentiality of the algorithm in high dimension and its ability to escape from local minima. The second application shows the potentiality of the algorithm for large computational models. In addition, all the improvements that have been proposed in [Ingber(1993), Pardalos and Romeijn(2002)] for the classical simulated annealing algorithm could be used to improve the algorithm proposed in the present paper. Two remaining issues will be treated in future works. The first one concerns the choice the temperature law which has not been discussed in this paper since the convergence properties for classical simulated annealing are not preserved exactly. The second issue concerns the choice of the size step for the integration of the ISDE. Since the cost function is know explicitly by its polyharmonic spline representation, an adaptative step size could be studied and implemented in the algorithm.

## References

- [Aarts and Laarhoven(1985)] Aarts E. H. L, van Laarhoven P. J. M. 1985. “Statistical cooling: A general approach to combinatorial optimization problems.” Philips Journal of Research 40(4), 193-226.
- [Anily and Federgruen(1987)] Anily S., Federgruen A. 1987. “Simulated annealing methods with general acceptance probabilities.” Journal of Applied Probability 24(3) 657-667.
- [Azizi and Zolfaghari(2004)] Azizi N., Zolfaghari S. 2004. “Adaptive temperature control for simulated annealing: a comparative study.” Computers & Operations Research 31(14), 2439-2451.
- [Batou and Soize(2014)] Batou A., Soize C. 2014. “Generation of accelerograms compatible with design specifications using information theory.” Bulletin of Earthquake Engineering 12(2), 769-794.
- [Borkar(1992)] Borkar V. S. 1992. “Pathwise recurrence orders and simulated annealing.” Journal of Applied Probability 1992, 29(2), 472-476.
- [Buhmann(2003)] Buhmann MD. 2003. Radial Basis Functions: Theory and Implementations, Cambridge University Press, New York.
- [Burrage(2007)] Burrage K. 2007. “Numerical methods for second-order stochastic differential equation.” SIAM Journal of Scientific Computing 29(1), 245-264.
- [Černý(1985)] Černý V. 1985. “Thermodynamical approach to the traveling salesman problem: An efficient simulation algorithm, Journal of Optimization Theory and Applications 45, 41-51.
- [Connors and Kumar(1989)] Connors D. P., Kumar P. R. 1989. “Simulated annealing type markov-chains and their order balance-equations.” SIAM Journal on Control and Optimization 1989 27(6), 1440-1461.
- [Duane, et al(1987)] Duane S., Kennedy A. D., Pendleton B. J., Roweth D. 1987. “Hybrid Monte Carlo.” Physics Letters B 195(2), 216-222
- [Faigle and Schrader(1988)] Faigle U., Schrader R. 1988. “On the convergence of stationary distributions in simulated annealing algorithms.” Information Processing Letters 27(4), 189-194.
- [Faigle and Kern(1991)] Faigle U., Kern W. 1991. “Note on the convergence of simulated annealing algorithms.” SIAM Journal on Control and Optimization 29(1), 153-159.
- [Granville, Krivanek, and Rasson(1994)] Granville V., Krivanek M., Rasson J. P. 1994. “Simulated annealing - a proof of convergence.” IEEE Transactions on Pattern Analysis and Machine Intelligence 16(6), 652-656.
- [Guilleminot and Soize(2014)] Guilleminot J., Soize C. 2014. “Itô SDE-based generator for a class of non-Gaussian vector-valued random fields in uncertainty quantification.” SIAM Journal on Scientific Computing, Society for Industrial and Applied Mathematics Methods and Algorithms for Scientific Computing, 2014, 36(6), A2763-A2786.
- [Hajek(1988)] Hajek B. 1988. “Cooling schedules for optimal annealing.” Mathematics of Operations Research 13(2) 311-329.
- [Hastings (1970)] Hastings W. K. 1970. “Monte Carlo sampling methods using Markov Chains and their applications.” Biometrica 109 , 57-97.
- [Ingber(1989)] Ingber L. 1989. “Very fast simulated re-annealing.” Mathematical and Computer Modelling 12, 8, 967-973.
- [Ingber(1993)] Ingber L. 1993. “Simulated annealing: Practice versus theory.” Mathematical and Computer Modelling 18,(11), 29-57.
- [Itô(1951)] Itô K.. 1951. “On stochastic differential equations, Memoirs American Mathematical Society 4.
- [Jaynes(1957)] Jaynes E. T. 1957. “Information theory and statistical mechanics.” Physical Review 106(4), 620–630 and 108(2), 171-190.
- [Kapur and Kevasan(1992)] Kapur J. N., Kevasan H. K. 1992. Entropy Optimization Principles with Applications, Academic Press, San Diego.
- [Kirkpatrick, Gelatt, and Vecchi(1983)] Kirkpatrick S., Gelatt Jr C. D., Vecchi M. P. 1983. “Optimization by Simulated Annealing.” Science 220 (4598), 671-680.
- [Kouvelis and Chiang(1992)] Kouvelis P., Chiang W. C. 1992. “A simulated annealing procedure for single row layout problems in flexible manufacturing systems.” International Journal of Production Research 30, 717-732.
- [Lundy and Mees(1986)] Lundy M., Mees A. 1986. “Convergence of an annealing algorithm.” Mathematical Programming 34(1), 111-124.
- [Mingjun and Huanwen(2004)] Mingjun J, Huanwen T. 2004. “Application of chaos in simulated annealing.” Chaos, Solitons & Fractals 21, 933-944.
- [Mitra, Romeo, and Sangiovanni-Vincentelli(1986)] Mitra D., Romeo F., Sangiovanni-Vincentelli A. L. 1986. “Convergence and finite time behavior of simulated annealing.” Advances in Applied Probability 18(3), 747-77.
- [Noutani and Andresen(1998)] Noutani Y., Andresen B. 1998. “A comparison of simulated annealing cooling strategies.” J Phys A: Math Gen 41(31), 8373-8385.
- [Pardalos and Romeijn(2002)] Pardalos P., Romeijn E. 2002. Handbook of global optimization, volume 2, Kluwer: Dodrecht.
- [Salazar and Toral(1997)] Salazar R., Toral R. 1997. “Simulated Annealing Using Hybrid Monte Carlo.” Journal of Statistical Physics 89 (5-6), 1047-1060.
- [Shannon(1948)] Shannon C. E. 1948. “A mathematical theory of communication.” Bell System Technology Journal 27, 379-423 and 623-659.
- [Suman and Kumar(2006)] Suman B., Kumar P. A. 2006. “Survey of simulated annealing as a tool for single and multiobjective optimization.” Journal of the Operational Research Society 57, 1143-1160.
- [Soize(1994)] Soize C.. 1994. The Fokker-Planck Equation for Stochastic Dynamical Systems and its Explicit Steady State Solution, World Scientific: Singapore.
- [Soize(2008)] Soize C.. 2008. “Construction of probability distributions in high dimension using the maximum entropy principle. Applications to stochastic processes, random fields and random matrices.” International Journal for Numerical Methods in Engineering 76(10), 1583-1611.
- [Triki, Collette, and Siarry(2004)] Triki E., Collette Y., Siarry P. 2004. “A theoretical study on the behavior of simulated annealing leading to a new cooling schedule.” European Journal of Operational Research 166(1), 77-92.
- [Wang, Dong, and Aitchison(2001)] Wang G., Dong Z., Aitchison P. 2001. “Adaptive Response Surface Method - A Global Optimization Scheme for Computation-intensive Design Problems.” Journal of Engineering Optimization 33, 707-734.
- [Yao and Li(1991)] Yao X., Li G. 1991. “General simulated annealing.” Journal of Computer Science and Technology 6(4), 329-338.
- [Zolfaghari and Liang(1999)] Zolfaghari S., Liang M. 1999. “Jointly solving the group scheduling and machining speed selection problems: A hybrid tabu search and simulated annealing approach.” International journal of production research 37, 2377-2397