A FrankWolfe Based BranchandBound Algorithm
[1ex] for MeanRisk Optimization
C. Buchheim, M. De Santis, F. Rinaldi, L. Trieu
Fakultät für Mathematik
TU Dortmund
Vogelpothsweg 87  44227 Dortmund  Germany
Institut für Mathematik
AlpenAdriaUniversität Klagenfurt
Universitätsstrasse 6567, 9020 Klagenfurt  Austria
Dipartimento di Matematica
Università di Padova
Via Trieste, 63  35121 Padova  Italy
email (Buchheim): christoph.buchheim@tudortmund.de
email (De Santis): marianna.desantis@aau.at
email (Rinaldi): rinaldi@math.unipd.it
email (Trieu): long.trieu@math.tudortmund.de
Abstract
We present an exact algorithm for meanrisk optimization subject to a budget constraint, where decision variables may be continuous or integer. The risk is measured by the covariance matrix and weighted by an arbitrary monotone function, which allows to model riskaversion in a very individual way. We address this class of convex mixedinteger minimization problems by designing a branchandbound algorithm, where at each node, the continuous relaxation is solved by a nonmonotone FrankWolfe type algorithm with awaysteps. Experimental results on portfolio optimization problems show that our approach can outperform the MISOCP solver of CPLEX 12.6 for instances where a linear riskweighting function is considered.
Keywords. mixedinteger programming, meanrisk optimization, global optimization
AMS subject classifications. 90C10, 90C57, 90C90
1 Introduction
We consider mixedinteger knapsack problems of the form
where is the vector of nonnegative decision variables, the index set specifies which variables have to take integer values. In many practical applications, the objective function coefficients are uncertain, while and are known precisely. E.g., in portfolio optimization problems, the current prices and the budget are given, but the returns are unknown at the time of investment. The robust optimization approach tries to address such uncertainty by considering worstcase optimal solutions, where the worstcase is taken over a specified set of probable scenarios called the uncertainty set of the problem. Formally, we thus obtain the problem
(1) 
The coefficients may also be interpreted as random variables. Assuming a multivariate normal distribution, a natural choice for the set is an ellipsoid defined by the means and a positive definite covariance matrix of . In this case, Problem (1) turns out to be equivalent (see, e.g., [3]) to the nonlinear knapsack problem
(2) 
where the factor corresponds to the chosen confidence level. It can be used to balance the mean and the risk in the objective function and hence to model the riskaversion of the user. Ellipsoidal uncertainty sets have been widely considered in robust optimization [1, 2, 3].
In fact, meanrisk models such as (2) have been studied intensively in portfolio optimization, since Markowitz addressed them in his seminal paper dating back to 1952 [22]. Originally, the risk term was often given as instead of , which generally leads to a different optimal balance between mean and risk. In our approach, we allow to describe the weight of the risk by any convex, differentiable and nondecreasing function . Typical choices for the function could be , yielding (2), or , which gives a convex MIQP problem. However, it may also be a reasonable choice to neglect small risks while trying to avoid a large risk as far as possible, this could be modeled by an exponential function
In summary, our aim is to compute exact solutions for problems of the form
(3) 
1.1 Our contribution
The main contribution of this paper is an exact algorithm to solve Problem (3), i.e. a class of convex nonlinear mixedinteger programming problems. We propose a branchandbound method that suitably combines a FrankWolfe like algorithm [11] with a branching strategy already succesfully used in the context of mixedinteger programming problems (see [4] and references therein).
Our approach for solving the continuous relaxation in each subproblem (i.e. the problem obtained by removing the integrality constraints) exploits the simple structure of the feasible set of (3) as well as the specific structure of the objective function. It uses awaysteps as proposed by Guélat and Marcotte [17] as well as a nonmonotone line search.
Our motivation to choose a FrankWolfe like method is twofold. On the one hand, the algorithm, at each iteration, gives a valid dual bound for the original mixedinteger nonlinear programming problem, thus enabling fast pruning of the nodes in the branchandbound tree. On the other hand, the running time per iteration is very low, because the computation of the descent direction and the update of the objective function can be performed in an efficient way, as it will be further explained in the next sections. These two properties, along with the possibility of using warmstarts, are the key to a fast enumeration of the nodes in the branchandbound algorithm we have designed.
1.2 Organization of the paper
The remaining sections of the paper are organized as follows. In Section 2 we describe a modified FrankWolfe method to efficiently compute the dual bounds for the node relaxations. The section also includes an indepth convergence analysis of the algorithm. In Section 3 we shortly explain the main ideas of our branchandbound algorithm, including the branching strategy, upper and lower bound computations and several effective warmstart strategies to accelerate the dual bound computation. In Section 4 we test our algorithm on realworld instances. We show computational results and compare the performances of our algorithm and of CPLEX 12.6 for different riskweighting functions . Finally, in Section 5 we summarize the results and give some conclusions.
2 A modified version of the FrankWolfe method for the fast computation of valid dual bounds
A continuous convex relaxation of (the minimization version of) Problem (3), simply obtained by removing the integrality constraints in the original formulation, is the following:
(4) 
By the transformation , Problem (4) becomes
(5) 
where , and is the dimensional vector with all entries equal to one. For the following, let denote the feasible set of (5).
In this section, we consider the FrankWolfe algorithm with awaysteps proposed by Guélat and Marcotte [17], and define a nonmonotone version for solving Problem (5). We also analyze its convergence properties. This algorithm is then embedded into our branchandbound framework.
The original method described in [17] uses an exact line search to determine, at a given iteration, the stepsize along the descent direction that yields the new iterate. When the exact line search is too expensive (i.e. too many objective function and gradient evaluations are required), different rules can be used for the stepsize calculation; see e.g. [12]. In particular, inexact line search methods can be applied to calculate the stepsize [10], such as the Armijo or Goldstein line search rules. Typical line search algorithms try a sequence of candidate values for the stepsize, stopping as soon as some welldefined conditions on the resulting reduction of the objective function value are met. Since the evaluation of the objective function at the trial points can be performed in constant time (see Section 2.3), line search methods are inexpensive in our context. Furthermore, from our numerical experience, using a nonmonotone Armijo line search turned out to be the best choice in practice. With this choice, a stepsize that yields a (safeguarded) growth of the objective function can be accepted (see e.g. [13, 14, 15, 16]).
The outline of our approach is given in Algorithm 1. At each iteration , the algorithm first computes a descent direction, choosing among a standard towardstep and an awaystep direction, as clarified in Section 2.2. Then, in case optimality conditions are not satisfied, it calculates a stepsize along the given direction by means of a nonmonotone line search, see Section 2.3, updates the point, and starts a new iteration.
In Section 2.1, we will discuss how to decide whether the origin is an optimal solution of Problem (5). If this is not the case, we always choose a starting point better than the origin. The points produced at each iteration thus satisfy , so that and , where
This is done in order to avoid obtaining the origin in any of the following iterations, as the objective function may not be differentiable in .
For the following, we summarize some important properties of Problem (5).
Lemma 1.
Assume that is not an optimal solution of Problem (5) and a point exists such that . Then,

the set is compact;

the function is continuously differentiable in ;

the function is Lipschitz continuous in ;

the function is Lipschitz continuous in with Lipschitz constant , where is the Lipschitz constant of the function .
Proof.
For (a), it suffices to note that is a closed subset of the compact set , while (b) holds since . As is differentiable on the compact set , we obtain (c). Finally, to prove (d), let denote the unique symmetric matrix satisfying . Then
∎
In particular, it follows from (d) that is uniformly continuous in .
2.1 Checking optimality in the origin
A first difficulty in dealing with Problem (5) arises from the fact that the objective function may not be differentiable in the origin . We thus aim at checking, in a first phase of our algorithm, whether the origin is an optimizer of Problem (5). If so, we are done. Otherwise, our strategy is to avoid the origin as an iterate of our algorithm, as discussed in more detail in the following sections.
Since Problem (5) is convex, the origin is a global optimal solution if and only if there exists a subgradient such that for all . From standard results of convex analysis (see e.g. Theorem 2.3.9 in Clarke [6]), we have that and we derive that
where is the unit ball in . Thus is an optimal solution for Problem (5) if and only if
(6) 
Since implies and for all , Condition (6) is equivalent to
(7) 
Note that Condition (7) is never satisfied if , since and . Consequently, the origin is not an optimal solution of Problem (5) in this case. In general, Condition (7) allows to decide whether the origin is optimal by solving a convex quadratic optimization problem with nonnegativity constraints, namely
2.2 Computation of a feasible descent direction
For the computation of a feasible descent direction we follow the awaystep approach described in [17]. At every iteration , we either choose a towardstep or an awaystep. We first solve the following linearized problem (corresponding to the towardstep),
(8) 
and define as . The maximum stepsize that guarantees feasibility of the point chosen along is . Once the towardstep direction is computed, we consider the problem corresponding to the awaystep,
(9) 
and define as . In this case, the maximum stepsize guaranteeing feasibility is
If , the point may become infeasible in case the nonnegativity constraint on is violated. On the other hand, if , the point can only violate the constraint . Therefore, needs to be chosen as:
Note that, according to this rule, may be an infeasible steplength. Note also that, in case the equality constraints are not enforced in Problem (9), could be trivially zero.
In order to choose between the two directions, we use a criterion similar to the one presented in [17]: if
(10) 
with a suitably chosen constant value, we choose the awaystep direction, setting and . Otherwise we select the towardstep direction, setting and . The condition is needed to ensure convergence, as will become clear in Section 2.4 below.
In both Problems (8) and (9), we need to optimize a linear function over a simplex. This reduces to computing the objective function value at each vertex of the simplex, i.e., for and in (8) and for and all with in (9). Consequently, after computing the gradient , both solutions can be obtained at a computational cost of .
2.3 Computation of a suitable stepsize
When using exact line searches, the FrankWolfe method with awaysteps converges linearly if the objective function satisfies specific assumptions; see e.g. [17, 21]. When an exact line search approach is too expensive, we combine the awaystep approach with nonmonotone inexact line searches. Even if the FrankWolfe method is not guaranteed to converge linearly in the latter case, it yields very good results in practice, as will be shown in the numerical experience section.
In the nonmonotone line search used in our algorithm, a stepsize is accepted as soon as it yields a point which allows a sufficient decrease with respect to a given reference value. A classical choice for the reference value is the maximum among the last objective function values computed, where is a positive integer constant. See Algorithm 2 for the details of our line search method.
The maximum stepsize used in Line of Algorithm 2 is set to if the towardstep direction is chosen; it is set to , otherwise.
The following result states that Algorithm 2 terminates in a finite number of steps. It can be proved using similar arguments as in the proof of Proposition 3 in [16].
Proposition 1.
For each , assume that . Then Algorithm 2 determines, in a finite number of iterations of the while loop in Lines 3–5, a stepsize such that
From a practical point of view, it is important that the computation of the objective function values of the trial points can be accelerated by using incremental updates. Therefore, during the entire algorithm for solving Problem (5), we keep the values , , and uptodate. In the line search, if a towardstep is applied and , we exploit the fact that all expressions
can be computed in constant time. Similarly, for , we obtain
In particular, if can be evaluated in constant time, the same is true for the computation of the objective value . Moreover, when the line search is successful and the next iterate is chosen, the same formula as above can be used to compute and in constant time, while can be updated in linear time using
The case of an awaystep can be handled analogously.
2.4 Convergence analysis of the nonmonotone FrankWolfe algorithm
We now analyze the convergence properties of the nonmonotone FrankWolfe algorithm NMMFW with awaysteps (Algorithm 1). All the proofs of the following theoretical results can be found in the Appendix.
Lemma 2.
Suppose that NMMFW produces an infinite sequence . Then

for all ;

the sequence is nonincreasing and converges to a value .
Proof.
For the proof, see Appendix.
∎
Lemma 3.
Suppose that NMMFW produces an infinite sequence . Then
Proof.
For the proof, see Appendix.
∎
Lemma 4.
Suppose that NMMFW produces an infinite sequence . Then
Proof.
For the proof, see Appendix.
∎
Theorem 1.
Proof.
For the proof, see Appendix.
∎
We notice that, due to the use of the line search, there is no need to make any particular assumption on the gradient of the objective function (such as Lipschitz continuity) for proving the convergence of Algorithm NMMFW.
2.5 Lower bound computation
When using Algorithm NMMFW within a branchandbound framework as we will present in Section 3, the availability of valid dual bounds during the execution of NMMFW can help to prune the current node before termination of the algorithm, and thus to decrease the total running time of the branchandbound scheme.
Considering Problem (5), we can define the following dual function [7, 19] for all :
From the definition of and taking into account the convexity of , we have the following weak duality result:
(11) 
where again denotes an optimal solution of Problem (5). We thus obtain a dual bound in each iteration for free, given by
Note that this equation follows from how our direction is chosen, according to (10) (see Section 2.2 for further details). We can stop Algorithm NMMFW as soon as exceeds the current best upper bound in the branchandbound scheme. Furthermore, strong duality holds in (11) (in the sense that ); see e.g. [7] and the references therein.
3 BranchandBound algorithm
In order to deal with integer variables in Problem (3), we embedded Algorithm 1 into a branchandbound framework. Aiming at a fast enumeration of the branchandbound tree, we follow the ideas that have been successfully applied in, e.g., [4]. In this section, we give a short overview over the main features of the branchandbound scheme.
3.1 Branching and enumeration strategy
At every node in our branchandbound scheme, we branch by fixing a single integer variable to one of its feasible values. The enumeration order of the children nodes is by increasing distance to the value of this variables in the solution of the continuous relaxation , computed by Algorithm 1. If the closest integer value to is , we thus consecutively fix to integer values , and so on. If the closest integer is , we analogously start with fixing to the integer value . By optimality of and by the fact that the problem is convex, the resulting lower bounds are nondecreasing when fixing to either increasing values greater than or decreasing values less than . In particular, when being able to prune a node, all siblings beyond this node can be pruned as well.
3.2 Lower bounds after fixing
An advantage of branching by fixing variables as opposed to branching by splitting up variable domains is that the subproblems in the enumeration process of the search tree essentially maintain the same structure. Fixing a variable in Problem (4) just corresponds to moving certain coefficients from the matrix to a linear or constant part under the square root, and from the vector to a constant part outside the square root. More precisely, assume that the variables with indices in have been fixed to values . The problem then reduces to the minimization of
(12) 
over the feasible region , where the matrix is obtained by deleting the rows and columns corresponding to , the vector is obtained by deleting the columns corresponding to , and the remaining terms are updated appropriately.
Note that the relaxation of Problem (12) has a slightly more general form than the original Problem (4), since the data and may be nonzero as a result of fixing variables. However, the algorithm for solving Problem (4) discussed in Section 2 can easily be applied to the relaxation of Problem (12) as well, the only difference being in the computation of the gradient. In fact, in case at least one variable has been fixed to a nonzero value, we obtain since . In particular, the objective function becomes globally differentiable in this case.
3.3 Upper bounds
As an initial upper bound in the branching tree, we use a simple heuristic, adapted from a greedy heuristic by Julstrom [20] for the quadratic knapsack problem. Analogously to the notation used in the theory of knapsack problems the profit ratio of an item is defined as the sum of all profits that one gains by putting item into the knapsack, divided by its weight. Transferred to our application, we have
for all . Julstrom proposed to sort all items in a nondecreasing order with respect to and, starting from the first item, successively set until the capacity of the knapsack is reached. The remaining variables are set to zero.
We adapt this algorithm by allowing multiple copies of each item, i.e. , where is the current capacity of the knapsack.
During the branchandbound enumeration, we do not use any heuristics for improving the primal bound, since the fast enumeration using a depthfirst search usually leads to the early identification of good feasible solutions and hence to fast updates of the bound. Once all integer variables have been fixed, we compute the optimal solution of the subproblem in the reduced continuous subspace.
3.4 Warmstarts
With the aim of speedingup our branchandbound scheme, we use a warmstart procedure by taking over information from the parent node. For this, let be the optimal solution in the parent node and define by removing the entry of corresponding to the variable that has been fixed last. If is feasible for the current node relaxation, we always use it as a starting point for NMMFW, otherwise we choose one of the following feasible points according to our chosen warmstarting rule:

the first unit vector ;

the projection of onto the feasible region;

or the unit vector with .
The resulting warmstarting rules are denoted by (), (), and (), respectively. This notation is meant to emphasize that we either use or, if not possible, one of the other choices depending on the selected rule.
Note that the point can be computed by the algorithm originally proposed by Held et al. [18] that was recently rediscovered by Duchi et al. [9]. For the latter version the overall complexity has been proved to be . The unit vector is chosen by again adapting ideas of the greedy heuristic by Julstrom [20]. It represents the vertex of where the potential increase of the objective function due to the remaining items is minimized, if setting .
4 Numerical experience
In order to investigate the potential of our algorithm FWBB when applied to Problem (3), we implemented it in C++ and Fortran 90 and performed an extensive computational evaluation. As benchmark data set, we used historical realdata capital market indices from the Standard & Poor’s 500 index (S&P 500) that were used and made public by Cesarone et al. [5]. This data set was used for solving a Limited Asset Markowitz (LAM) model. For each of the 500 stocks the authors obtained 265 weakly price data, adjusted for dividends, from Yahoo Finance for the period from March 2003 to March 2008. Stocks with more than two consecutive missing values were disregarded. The missing values of the remaining stocks were interpolated, resulting in an overall of 476 stocks. Logarithmic weekly returns, expected returns and covariance matrices were computed based on the period March 2003 to March 2007.
By choosing stocks at random from the 476 available ones, we built mixedinteger portfolio optimization instances of different sizes. Namely, we built 10 problems with 100, 10 with 150 and 10 with 200 stocks, considering (so half of the variables are constrained to be integer). We considered three different values for , representing the budget of the investor, namely , , and , yielding a total of 90 instances.
All experiments were carried out on Intel Xeon processors running at 2.60 GHz. All running times were measured in cpu seconds and the timelimit was set to one cpu hour. In the following, we first present a numerical evaluation related to our algorithm FWBB: we explore the benefits obtained from using the nonmonotone line search and using warmstart alternatives. Then, we present a comparison of FWBB with the MISOCP and the MIQP solver of CPLEX 12.6, for the two cases and , respectively. Finally, to show the generality of our approach, we report the results of numerical tests for a nonstandard riskweighting function .
4.1 Benefits of the nonmonotone line search and warmstarts
The NMMFWalgorithm devised in Section 2 uses a nonmonotone line search; in our implementation of FWBB we set . In order to show the benefits of the nonmonotone version of FWBB we report in Table 1 a comparison between the nonmonotone version (NMFWBB) and the monotone one (MFWBB), on instances with and budget constraint . We considered as warmstart choice. In Table 1 we report, for each dimension, the number of instances solved within the time limit (), the average running times (time), and the average numbers of iterations of NMMFW in each node of the enumeration tree (it). All averages are taken over the set of instances solved within the time limit. Using the nonmonotone line search, FWBB is able to solve a greater number of instances within the time limit. Furthermore, NMFWBB gives in general better performance in terms of running times, while the number of iterations is very similar, showing the advantage of allowing stepsizes with a safeguarded growth of the objective function.
NMFWBB  MFWBB  

#  time  it  #  time  it  
100  10  1.6  314.3  10  0.8  294.9 
150  10  7.1  307.9  9  69.0  300.7 
200  8  32.4  277.8  8  340.7  256.0 
In order to investigate the benefits of the warmstart choices , , , we again ran the different versions of FWBB on instances with and budget constraint . We compare the three warmstart possibilities presented above with the following alternatives: () always choose ; () always choose . In Table 2 we show the results related to the five different starting point choices. We can observe that the best choice among those considered, according to the number of instances solved within the time limit, is . We also observe that, when , choosing is better than considering or as starting points, highlighting the benefits of using warmstarts.
#  time  #  time  #  time  #  time  #  time  

100  10  0.2  10  0.8  10  0.8  10  1.6  10  0.5 
150  10  3.8  10  3.9  10  7.1  10  7.1  10  6.1 
200  7  220.2  7  223.5  8  31.7  8  32.4  9  46.0 
4.2 Comparison with Cplex 12.6
In this section, we present a numerical comparison on instances with and . We compare FWBB with the MISOCP and the MIQP solver of CPLEX 12.6, respectively. Concerning FWBB, we consider the two nonmonotone versions, FWBBP and FWBBG, using and , respectively. We use an absolute optimality tolerance of for all algorithms.
Comparison on instances with .
In order to compare FWBB with CPLEX 12.6, we modeled (3) as an equivalent mixedinteger secondorder cone program (MISOCP):
We chose , where . The value of controls the amount of risk the investor is willing to take. In theory, can take any value in (0,1], where a small value implies a big weight on the riskterm and means that the risk is not taken into account. Numerical tests on single instances showed that any value of in (0,0.9] leads to the trivial optimal solution zero, i.e. not investing anything is the optimal decision for the investor. Therefore, we restricted our experiments to the three values of mentioned above.
In Table 3, we report for each algorithm the following data: numbers of instances solved within the time limit (), average running times (time), average numbers of branchandbound nodes (nodes). All averages are taken over the set of instances solved within the time limit. We show the computational results for the three different values of and . We can see that FWBB suffers from an increasing right hand side , which however holds for CPLEX 12.6 as well, even to a larger extent. The choice of does not significantly effect the performance of FWBB, while CPLEX 12.6 performs better on instances with large . Altogether, we can observe that FWBBP is able to solve the largest number of instances within the time limit. When the number of solved instances is the same, both version of FWBB outperform the MISOCP solver of CPLEX 12.6 in terms of cpu time. Note that the average number of branchandbound nodes in FWBB is much larger than that needed by CPLEX 12.6. This highlights how solving the continuous relaxations by NMFWBB leads to a fast enumeration of the branchandbound nodes. Besides Table 3, we visualize our running time results by performance profiles in Figure 1, as proposed in [8]. They confirm that, in terms of cpu time, FWBBP outperforms the MISOCP solver of CPLEX 12.6 significantly.
In our experiments, we noticed that in some cases FWBB and CPLEX provide slightly different minimizers, yielding slightly different optimal objective function values. While on certain instances the optimal solution of FWBB is slightly superior to CPLEX, on other instances it is the other way round. We observed a relative difference from the best solution of the order of .
inst  FWBBP  FWBBG  CPLEX 12.6  

#  time  nodes  #  time  nodes  #  time  nodes  
100  0.91  10  0.17  1.61e+03  10  0.33  1.63e+03  10  17.00  3.81e+03  
100  0.91  10  0.09  8.29e+02  10  0.22  8.33e+02  10  279.15  7.90e+03  
100  0.91  10  0.30  3.74e+02  10  0.42  4.28e+02  3  51.01  2.77e+03  
100  0.95  10  0.02  2.59e+02  10  0.04  2.65e+02  10  1.89  4.66e+02  
100  0.95  10  0.04  3.19e+02  10  0.09  3.14e+02  10  59.10  2.98e+03  
100  0.95  10  0.47  5.87e+02  10  1.57  2.08e+03  5  364.90  4.39e+03  
100  0.99  10  0.01  1.70e+02  10  0.01  1.78e+02  10  0.15  3.70e+01  
100  0.99  10  0.04  5.81e+02  10  0.04  6.64e+02  10  0.70  1.85e+02  
100  0.99  10  16.51  1.57e+04  10  260.60  3.71e+05  9  503.62  1.03e+04  
150  0.91  10  0.14  6.56e+02  10  4.45  1.06e+04  10  52.53  3.18e+03  
150  0.91  10  0.40  1.73e+03  10  46.4  4.83e+04  6  707.94  6.70e+03  
150  0.91  10  2.15  2.01e+03  9  1.77  1.41e+03  5  47.32  1.81e+03  
150  0.95  10  0.15  9.76e+02  10  0.24  1.05e+03  10  11.49  1.04e+03  
150  0.95  10  0.17  6.75e+02  10  5.70  8.89e+03  8  225.78  3.04e+03  
150  0.95  10  6.14  6.15e+03  10  7.11  6.16e+03  5  834.79  6.23e+03  
150  0.99  10  0.04  2.56e+02  10  0.06  2.66e+02  10  35.08  5.81e+02  
150  0.99  10  0.10  2.20e+02  10  0.23  5.08e+02  10  6.69  6.14e+02  
150  0.99  10  0.78  8.67e+02  10  0.82  8.80e+02  9  422.15  3.53e+03  
200  0.91  10  4.81  1.71e+04  10  5.80  1.35e+04  10  465.62  9.48e+03  
200  0.91  9  19.83  7.89e+04  9  116.78  1.89e+05  3  879.46  9.14e+03  
200  0.91  10  22.99  1.86e+04  10  32.44  2.20e+04  3  204.92  3.80e+03  
200  0.95  10  0.37  1.33e+03  10  0.54  1.29e+03  10  75.50  3.46e+03  
200  0.95  10  0.82  1.38e+03  10  1.40  1.48e+03  5  44.64  1.55e+03  
200  0.95  9  45.98  3.74e+04  8  32.39  2.05e+04  7  38.77  2.42e+03  
200  0.99  10  2.17  1.57e+04  10  2.00  1.57e+04  10  2.44  2.76e+03  
200  0.99  10  0.49  6.04e+02  10  0.95  9.12e+02  9  277.08  2.61e+03  
200  0.99  10  11.14  1.06e+04  9  67.57  5.90e+04  9  183.80  2.13e+03 
Comparison on instances with .
If we consider as riskweighting function , Problem (3) reduces to a convex quadratic mixedinteger problem, and the objective function is differentiable everywhere in the feasible set. In Table 4 we report the comparison among FWBBP, FWBBG and the MIQP solver of CPLEX 12.6. We considered . All algorithms were able to solve all the instances very quickly. The MIQP solver of CPLEX 12.6 shows the best cpu times, although both versions of FWBB are also very fast, even if they enumerate a higher number of nodes.
We would like to remark that our branchandbound algorithm does not exploit the (quadratic) structure of the objective function, since it is designed to solve a more general class of problems than MIQPs. Nevertheless, the algorithm gives competitive results also when dealing with those problems.
inst  FWBBP  FWBBG  CPLEX 12.6  

#  time  nodes  #  time  nodes  #  time  nodes  
100  10  0.06  4.10e+02  10  0.06  4.10e+02  10  0.04  1.12e+01  
100  10  0.07  6.38e+02  10  0.07  6.38e+02  10  0.03  1.58e+01  
100  10  0.23  9.86e+02  10  0.23  9.86e+02  10  0.03  2.59e+01  
150  10  0.12  7.66e+02  10  0.12  7.65e+02  10  0.06  1.92e+01  
150  10  0.19  9.76e+02  10  0.18  9.76e+02  10  0.06  2.06e+01  
150  10  0.19  8.80e+02  10  0.19  8.81e+02  10  0.06  9.90e+00  
200  10  0.61  3.28e+03  10  0.61  3.28e+03  10  0.11  2.07e+01  
200  10  0.94  5.24e+03  10  0.91  5.24e+03  10  0.11  1.91e+01  
200  10  0.41  1.46e+03  10  0.42  1.46e+03  10  0.12  2.40e+01 
4.3 Results with a nonstandard riskweighting function
As a further experiment, we tested our instances considering a different riskweighting function , namely
such that the investor’s riskaversion increases exponentially in the risk after exceeding a certain threshold value . In Table 5, we report the results of FWBBP, considering three choices . We observe that for both and our algorithm FWBBP is able to solve all instances within the time limit, and that instances get more difficult for FWBBP with increasing .
inst  
#  time  nodes  #  time  nodes  #  time  nodes  
100  10  0.09  4.8e+02  10  0.17  7.2e+02  10  0.09  5.4e+02  
100  10  0.07  4.1e+02  10  0.27  7.5e+02  10  243.87  3.3e+05  
100  10  0.30  8.6e+02  10  31.14  5.1e+04  5  401.57  6.2e+05  
150  10  0.17  9.0e+02  10  1.31  4.5e+03  10  0.19  3.2e+02  
150  10  0.33  1.5e+03  10  2.52  7.9e+03  10  193.40  2.0e+04  
150  10  0.56  2.2e+03  10  6.50  1.2e+04  4  565.76  7.1e+05  
200  10  1.41  6.6e+03  10  14.96  4.7e+04  10  7.40  7.7e+03  
200  10  1.09  3.2e+03  10  50.47  1.1e+05  7  929.46  7.9e+05  
200  10  0.82  2.5e+03  10  30.25  3.0e+04  5  138.07  1.5e+05  
In order to investigate the influence of the riskweighting function on the optimal solution, we compared different functions for an instance of dimension under the constraint . The results are given in Table 6. We report, for each riskweighting function depending on a specific risk parameter (riskpar), the objective function value obtained (obj), the value of the return term in the objective function evaluated at the optimal solution (), the number of nonzero entries in the optimal solution (), and the maximal entry in the optimal solution ().
riskpar  obj  

0.3684  2.2452  16  58  
1.4454  6.0911  4  280  
4.2161  6.4523  3  320  
0.0513  0.1021  16  2  
0.0905  0.1715  15  3.14  
0.5258  0.5900  16  11.87  
3.4991  3.5348  7  113 
Not surprisingly, the results show that a larger weight on the riskterm leads to a smaller expected return in the optimal solution. At the same time, a large weight on the risk favors a diversified portfolio, so that the number of nonzeros increases with the weight on the risk, at the same and the maximal amount invested into a single investment decreases. However, the precise dependencies are defined by the function . In Figure 2, we show contour plots for the different types of functions considered here.
5 Conclusions
We presented a branchandbound algorithm for a large class of convex mixedinteger minimization problems arising in portfolio optimization. Dual bounds are obtained by a modified version of the FrankWolfe method. This is motivated mainly by two reasons. On the one hand, the FrankWolfe algorithm, at each iteration, gives a valid dual bound for the original mixedinteger problem, therefore it may allow an early pruning of the node. On the other hand, the cost per iteration is very low, since the computation of the descent direction and the update of the objective function can be performed in a very efficient way. Furthermore, the devised FrankWolfe method benefits from the use of a nonmonotone Armijo line search. Within the branchandbound scheme, we propose different warmstarting strategies. The branchandbound algorithm has been tested on a set of realworld instances for the capital budgeting problem, considering different classes of riskweighting functions. Experimental results show that the proposed approach significantly outperforms the MISOCP solver of CPLEX 12.6 for instances where a linear riskweighting function is considered.
6 Appendix
Proof of Lemma 2:
Proof.
First note that the definition of ensures and hence for all , which proves (i). For (ii), we have that
Since by the definition of the line search, we derive , which proves that the sequence is nonincreasing. By (i), this sequence is bounded from below by the minimum of on , which exists by Lemma 1, and hence converges.
∎
Proof of Lemma 3:
Proof.
For each , choose with . We prove by induction that for any fixed integer we have
(13) 
We now assume that (13) holds for and we prove that it holds for index . We have
so that the same reasoning as before yields
(14) 
The left hand side of (14) converges to zero since (13) holds for and the term converges to (by the inductive hypothesis), as well as because of Lemma 2 (and the fact that is bounded by ). Then,
so that . Again, uniform continuity of over yields (13) for index .
To conclude the proof, let and note that for any we can write
Therefore, since the summation vanishes and converges to from Lemma 2, taking the limit for and observing we obtain the result.
∎
Proof of Lemma 4:
Proof.
First note that for all . Let be the stepsize used by NMMFW at iteration . Then,
By Lemma 3, the left hand side converges to zero, hence
(15) 
Since is continuously differentiable on the compact set by Lemma 1 and is bounded on , the sequence is bounded. It thus suffices to show that any convergent subsequence of converges to zero.
We assume by contradiction that a subsequence exists with
Since the sequences and are bounded, we can switch to an appropriate subsequence and assume that and exist. From (15) we obtain
(16) 
and the continuity of the gradient in implies