# Oracle-Based Robust Optimization via Online Learning

## Abstract

Robust optimization is a common framework for optimization under uncertainty in which the problem parameters are not known exactly, but are known to belong to some given uncertainty set. In the robust optimization framework the problem solved is a min-max problem in which a solution is judged according to its performance on the worst possible realization of the parameters. In many cases, a straightforward solution of a robust optimization problem of a certain type requires solving an optimization problem of a more complicated type, which in some cases is even NP-hard. For example, solving a robust conic quadratic program, such as those arising in robust SVM, with ellipsoidal uncertainty leads in general to a semidefinite program. In this paper we develop a method for approximately solving a robust optimization problem using tools from online convex optimization, where in every stage a standard (non-robust) optimization program is solved. Our algorithms find an approximate robust solution using a number of calls to an oracle that solves the original (non-robust) problem that is inversely proportional to the square of the target accuracy.

## 1 Introduction

The Robust Optimization (RO; see [2]) framework addresses a fundamental problem of many convex optimization problems: slight inaccuracies in data give rise to significant fluctuations in the solution. While there are different approaches to handling uncertainty in the parameters of an optimization problem, the RO approach chooses a solution that performs best against the worst possible parameter. When the objective function is convex in the parameters and concave in the uncertainty, and when the uncertainty set is convex, the overall optimization problem is convex.

Despite its theoretical and empirical success, a significant hindrance to adopting RO for large-scale problems is the increased computational complexity. In particular, the robust counterpart of an optimization problem is often a more difficult, albeit usually convex, mathematical problem. For example, the robust counterpart of conic quadratic programming with ellipsoidal uncertainty constraints becomes a semidefinite program, for which current solvers are significantly slower.

RO has recently gained traction as a tool for analyzing machine learning algorithms and for devising new ones. In a sequence of papers, Xu, Caramanis and Mannor show that several standard machine learning algorithms such as Lasso and norm regularized support vector machines have an RO interpretation [26]. Beyond these works, robustness is a desired property for many learning algorithms. Indeed, making standard algorithms robust to outliers or to perturbation in the data has been proposed in several works; see [16]. However, in these cases, the problem eventually solved is more complicated than the original problem. For example, in [25] the original problem is a standard support vector machine, but when robustifying it to input uncertainty, one has to solve a second-order conic program (in the non-separable case). Another example is [24] where the uncertainty is a probability distribution over inputs. In that case, the original SVM becomes a second-order conic program as well.

The following question arises: can we (approximately) solve a robust counterpart of a given optimization problem using only an algorithm for the original optimization formulation? In this paper we answer this question in the affirmative: we give two meta-algorithms that receive as input an oracle to the original mathematical problem and approximate the robust counterpart by invoking the oracle a polynomial number of times. In both approaches, the number of iterations to obtain an approximate robust solution is a function of the approximation guarantee and the complexity of the uncertainty set, and does not directly depend on the dimension of the problem. Our methods differ in the assumptions regarding the uncertainty set and the dependence of the constraints on the uncertainty. The first method allows any concave function of the noise terms but is limited to convex uncertainty sets. The second method allows arbitrary uncertainty sets as long as a “pessimization oracle” (as termed by [18]) exists, that is, an oracle that finds the worst-case noise for a given feasible solution. Our methods are formally described as templates, or meta-algorithms, and are general enough to be applied even if the robust counterpart is NP-hard (see footnote 1).

Our approach for achieving efficient oracle-based RO is to reduce the robust formulation to a zero-sum game, which we solve by a primal-dual technique based on tools from online learning. Such primal-dual methods originated from the study of approximation algorithms for linear programs [20] and have recently proved invaluable in understanding Lagrangian relaxation methods (see e.g. [1]) and in sublinear-time optimization techniques [10]. We show how to apply this methodology to oracle-based RO. Along the way, we contribute some extensions to the existing online learning literature itself, notably giving a new Follow-the-Perturbed-Leader algorithm for regret minimization that works with (additive) approximate linear oracles.

Finally, we demonstrate examples and applications of our methods to various RO formulations including linear, semidefinite and quadratic programs. The latter application builds on recently developed efficient linear-time algorithms for the trust region problem [13].

Related work. Robust optimization is by now a field of study by itself, and the reader is referred to [3] for further information and references. The computational bottleneck associated with robust optimization was addressed in several papers. [8] propose to sample constraints from the uncertainty set, and obtain an “almost-robust” solution with high probability with enough samples. The main problem with their approach is that the number of samples can become large for a high-dimensional problem.

For certain types of discrete robust optimization problems, [4] propose solving the robust version of the problem via a number of solutions of the original problem that scales with the dimension. [18] give an iterative cutting plane procedure for attaining essentially the same goal as ours, and demonstrate impressive practical performance. However, the overall running time of their method can be exponential in the dimension.

Organization. The rest of the paper is organized as follows. In Section 2 we present the model and set the notation for the rest of the paper. In Section 3.1 we describe the simpler of our two meta-algorithms: a meta-algorithm for approximately solving RO problems that employs dual subgradient steps, under the assumption that the robust problem is convex with respect to the noise variables. In Section 3.2 we remove the latter convexity assumption and only assume that the problem of finding the worst-case noise assignment can be solved by invoking a “pessimization oracle”. This approach is more general than the subgradient-based method, and in Section 4 we exhibit perhaps our strongest example, solving robust quadratic programs using this technique. Section 4 also contains examples of applications of our technique to robust linear and semidefinite programming. We conclude in Section 5.

## 2 Preliminaries

We start this section with the standard formulation of RO. We then recall some basic results from online learning.

### 2.1 Robust Optimization

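Consider a convex mathematical program in the following general formulation:

$$
\begin{aligned}
\text{minimize} \quad & f_0(x) \\
\text{subject to} \quad & f_i(x, u_i) \le 0, \quad i = 1, \dots, m, \\
& x \in \mathcal{D}.
\end{aligned}
$$

Here $f_0, f_1, \dots, f_m$ are convex functions, $\mathcal{D} \subseteq \mathbb{R}^n$ is a convex set in Euclidean space, and $u = (u_1, \dots, u_m)$ is a fixed parameter vector. The robust counterpart of this formulation is given by

$$
\begin{aligned}
\text{minimize} \quad & f_0(x) \\
\text{subject to} \quad & f_i(x, u_i) \le 0 \quad \forall\, u_i \in \mathcal{U}, \quad i = 1, \dots, m, \\
& x \in \mathcal{D},
\end{aligned}
$$

where the parameter vector $u$ is constrained to be in a set $\mathcal{U}^m = \mathcal{U} \times \cdots \times \mathcal{U}$ called the uncertainty set. It is without loss of generality to assume that the uncertainty set has this specific form of a Cartesian product; see e.g. [2]. Here we also assume that the uncertainty set is symmetric (that is, its projection onto each dimension is the same set $\mathcal{U}$). This assumption is made only to simplify notation and can be relaxed easily.

The following observation is standard: we can reduce the above formulation to a feasibility problem via a binary search over the optimal value of the robust counterpart, replacing the objective with the constraint $f_0(x) - t \le 0$, with $t$ being our current guess of the optimal value (of course, assuming the range of feasible values is known a priori). For ease of notation, we rename $f_0$ by shifting it by $t$, and can write the first constraint as simply $f_0(x) \le 0$. With these observations, we can reduce the robust counterpart to the feasibility problem

$$
\text{find } x \in \mathcal{D} \quad \text{such that} \quad f_i(x, u_i) \le 0 \quad \forall\, u_i \in \mathcal{U}, \quad i = 1, \dots, m,
$$

where the shifted objective is counted among the constraints.

We say that $x \in \mathcal{D}$ is an $\epsilon$-approximate solution to this problem if $x$ meets each constraint up to $\epsilon$, that is, if it satisfies $f_i(x, u_i) \le \epsilon$ for all $u_i \in \mathcal{U}$ ($i = 1, \dots, m$).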
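The binary-search reduction described above can be sketched in a few lines. This is only an illustration under stated assumptions: `feasible` is a hypothetical callback (not part of the paper) that reports whether some point satisfies all constraints with objective value at most `c`, feasibility is assumed monotone in `c`, and the bounds `lo` and `hi` on the optimal value are assumed known.

```python
def min_value_by_bisection(feasible, lo, hi, tol=1e-6):
    """Approximate the smallest c in [lo, hi] for which feasible(c) is True.

    Assumes monotonicity: if feasible(c) holds, then feasible(c') holds
    for every c' >= c. `feasible` is a hypothetical feasibility oracle.
    """
    if not feasible(hi):
        return None  # no feasible objective value in the given range
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if feasible(mid):
            hi = mid  # some solution has objective value <= mid
        else:
            lo = mid  # the optimal value is larger than mid
    return hi


# Toy usage: minimize x subject to 1 - x <= 0 and x in [0, 5].
# A point with objective value <= c exists exactly when c >= 1.
def toy_feasible(c):
    return c >= 1.0


best = min_value_by_bisection(toy_feasible, 0.0, 5.0)
```

Each call to `feasible` corresponds to one feasibility problem of the kind treated in the rest of the paper, so the binary search multiplies the overall cost only by a logarithmic factor in the desired precision.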

### 2.2 Online Convex Optimization and Regret Minimization

Our derivations below use tools from online learning, namely algorithms for minimizing regret in the general prediction framework of Online Convex Optimization (OCO). In OCO (see footnote 2), the online predictor iteratively produces a decision $x_t \in \mathcal{K}$ from a convex decision set $\mathcal{K} \subseteq \mathbb{R}^n$. After a decision is generated, a concave reward function $f_t$ is revealed, and the decision maker receives a reward of $f_t(x_t)$. The standard performance metric in online learning is called regret, given by

$$
\mathrm{Regret}_T = \max_{x^* \in \mathcal{K}} \sum_{t=1}^{T} f_t(x^*) - \sum_{t=1}^{T} f_t(x_t).
$$

The reward function is not known to the decision maker before selecting a decision and it is, in principle, arbitrary and even possibly chosen by an adversary. We henceforth make crucial use of this robustness against adversarial choice of reward functions: the reward functions we shall use will be chosen by a dual optimization problem, thereby directing the entire algorithm towards a correct solution. We refer the reader to [9] for more details on online learning and online convex optimization.

Two regret-minimization algorithms that we shall use henceforth (at least in spirit) are Online Gradient Descent [28] and Follow the Perturbed Leader [15].

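Online Gradient Descent (OGD). In OGD, the decision maker predicts according to the rule

$$
x_{t+1} = \Pi_{\mathcal{K}}\big[ x_t + \eta \nabla f_t(x_t) \big],
$$

where $x_t$ is the decision at round $t$, $f_t$ is the reward function revealed at round $t$, $\eta > 0$ is a step size, and $\Pi_{\mathcal{K}}$ is the Euclidean projection operator onto the decision set $\mathcal{K}$. Hence, the OGD algorithm takes a projected step in the direction of the gradient of the current reward function. Even though the next reward function can be arbitrary, it can be shown that this algorithm achieves sublinear regret.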
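The OGD rule above can be illustrated with a small sketch. It assumes, purely for the example, that the decision set is a Euclidean ball (so the projection is a rescaling), that rewards are linear, and that the step size is set to $D/(G\sqrt{T})$ for diameter $D$ and gradient bound $G$, as in the standard analysis of [28]; all function names are hypothetical.

```python
import math


def project_onto_ball(x, radius=1.0):
    """Euclidean projection onto the ball of the given radius."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm <= radius:
        return list(x)
    return [radius * v / norm for v in x]


def online_gradient_ascent(gradients, x0, eta, radius=1.0):
    """Play x_t, observe the gradient of the reward at x_t, take a projected step."""
    x = list(x0)
    decisions = []
    for grad in gradients:
        decisions.append(x)
        g = grad(x)
        x = project_onto_ball([xi + eta * gi for xi, gi in zip(x, g)], radius)
    return decisions


# Toy run: the same linear reward f_t(x) = c . x in every round.
T = 100
c = [1.0, 0.0]
eta = 2.0 / (1.0 * math.sqrt(T))  # D / (G sqrt(T)) with D = 2 and G = 1
decisions = online_gradient_ascent([lambda x: c] * T, [0.0, 0.0], eta)
earned = sum(c[0] * x[0] + c[1] * x[1] for x in decisions)
regret = T * 1.0 - earned  # the best fixed decision is x = c, earning 1 per round
```

In this toy run the iterate reaches the boundary point `c` after a handful of steps and stays there, so the regret stays far below the $GD\sqrt{T} = 20$ level that the standard analysis allows.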

Follow the Perturbed Leader (FPL). The FPL algorithm works in a similar setting as OGD, but with two crucial differences:

1. The decision set $\mathcal{K}$ does not need to be convex. This is a significant advantage of the FPL approach, which we make use of in our application to robust quadratic programming (see Section 4.2).

2. FPL assumes that the reward functions are linear, i.e. $f_t(x) = f_t \cdot x$ with $f_t \in \mathbb{R}^n$.

[15] suggest the following method for online decision making that relies on a linear optimization procedure $M$ over the set $\mathcal{K}$ that computes $M(f) = \arg\max_{x \in \mathcal{K}} f \cdot x$ for all $f \in \mathbb{R}^n$. FPL chooses $x_t$ by first drawing a perturbation $p_t \in [0, 1/\eta]^n$ uniformly at random, and computing:

$$
x_t = M(f_1 + \cdots + f_{t-1} + p_t).
$$

The regret of this algorithm is bounded as follows.

For our purposes, and in order to be able to work with an approximate optimization oracle to the original mathematical program, we need to adapt the original FPL algorithm to work with noisy oracles. This adaptation is made precise in Section 3.2.
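As a rough illustration of the (exact-oracle) FPL rule of [15], the sketch below runs it over a finite, non-convex decision set, where linear maximization is a simple scan. The decision set, the reward sequence, the value of `eta`, and all function names are assumptions made for the example.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def linear_oracle(points, f):
    """Exact linear maximization over a finite (non-convex) decision set."""
    return max(points, key=lambda x: dot(x, f))


def follow_the_perturbed_leader(points, rewards, eta, rng):
    """At each round, play the maximizer of the perturbed cumulative reward."""
    dim = len(rewards[0])
    cumulative = [0.0] * dim
    decisions = []
    for f in rewards:
        # Fresh perturbation, uniform in the cube [0, 1/eta]^dim.
        perturbation = [rng.uniform(0.0, 1.0 / eta) for _ in range(dim)]
        perturbed = [c + p for c, p in zip(cumulative, perturbation)]
        decisions.append(linear_oracle(points, perturbed))
        # The reward vector is revealed only after the decision is made.
        cumulative = [c + v for c, v in zip(cumulative, f)]
    return decisions


# Toy run: two isolated points; the first coordinate is always rewarded.
points = [(1.0, 0.0), (0.0, 1.0)]
rewards = [(1.0, 0.0)] * 200
decisions = follow_the_perturbed_leader(points, rewards, eta=0.1, rng=random.Random(0))
earned = sum(dot(x, f) for x, f in zip(decisions, rewards))
best_fixed = max(sum(dot(x, f) for f in rewards) for x in points)
regret = best_fixed - earned
```

Once the cumulative reward of the better point exceeds the largest possible perturbation ($1/\eta = 10$ here), the perturbation can no longer change the choice, so at most the first ten rounds can contribute to the regret in this example.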

## 3 Oracle-Based Robust Optimization

### 3.1 Dual-Subgradient Meta-Algorithm

In this section we formally state and prove our first (and simpler) result: an oracle-based approximate robust optimization algorithm that is based on subgradient descent.

Throughout the section we assume the availability of an optimization oracle for the original optimization problem of the form given in Figure ?, which we denote by . Such an optimization oracle approximately solves formulation for any fixed noise , in the sense that it either returns an -feasible solution (that meets each constraint up to ) or correctly declares that the problem is infeasible.

In this section we assume that for all :

1. For all , the function is concave in ;

2. The set is convex.

Under these assumptions, the robust formulation is in fact a convex-concave saddle-point problem that can be solved in polynomial time using interior-point methods. However, recall that our goal is to solve the robust problem by invoking a solver of the original (non-robust) optimization problem.

In the setting of this section, we shall make use of the following definitions. Let be an upper bound over the diameter of , that is . Let be a constant such that for all and .

With the above assumptions and definitions, we can now present an oracle-based robust optimization algorithm, given in Algorithm ?. The algorithm is comprised of primal-dual iterations, where the dual part of the algorithm updates the noise terms according to the current primal solution, via a low-regret update. For this algorithm, we prove:

First, suppose that the algorithm returns “infeasible”. By the definition of the oracle , this happens if for some , there does not exist such that

This implies that the robust counterpart cannot be feasible, as there exists an admissible perturbation that makes the original problem infeasible.

Next, suppose that a solution is returned. The premise of the oracle implies that for all and (otherwise, the algorithm would have returned “infeasible”), whence

On the other hand, from the regret guarantee of the Online Gradient Descent algorithm we have

Combining and , we conclude that for all ,

where the final inequality follows from the convexity of the functions with respect to . Hence, for every we have

implying that is an -approximate robust solution.
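The primal-dual iterations analyzed above can be sketched on a toy instance. Everything concrete here is an assumption made for illustration: a scalar decision variable, two constraints of the form $(a_i + P u_i)\,x - b_i \le 0$ with noise $u_i \in [-1, 1]$, a hand-written exact oracle for the nominal problem, and the step size $\eta = D/(G\sqrt{T})$. The ordering of the oracle call and the noise update within a round is a simplification and may differ from the paper's algorithm by an index shift.

```python
import math

# Toy data (illustrative assumptions): constraints
#   f_i(x, u_i) = (A[i] + P * u_i) * x - B[i] <= 0  for all u_i in [-1, 1],
# with a scalar decision x in [X_LO, X_HI].
A = [-1.0, 1.0]
B = [-0.5, 2.4]
P = 0.5
X_LO, X_HI = 0.0, 2.0


def f(i, x, u):
    return (A[i] + P * u) * x - B[i]


def oracle(u):
    """Solve the nominal problem for fixed noise u; None means infeasible."""
    lo, hi = X_LO, X_HI
    for a, b, ui in zip(A, B, u):
        c = a + P * ui  # the constraint reads c * x <= b
        if c > 0:
            hi = min(hi, b / c)
        elif c < 0:
            lo = max(lo, b / c)
        elif b < 0:
            return None
    return lo if lo <= hi else None


def dual_subgradient(T, eta):
    u = [0.0] * len(A)
    xs = []
    for _ in range(T):
        x = oracle(u)  # primal step: non-robust solve for the current noise
        if x is None:
            return None  # some admissible noise makes the problem infeasible
        xs.append(x)
        # Dual step: the gradient of f_i in u_i is P * x; ascend, then
        # project back onto the interval [-1, 1].
        u = [max(-1.0, min(1.0, ui + eta * P * x)) for ui in u]
    return sum(xs) / len(xs)  # averaged primal iterate


T = 400
eta = 2.0 / (1.0 * math.sqrt(T))  # D / (G sqrt(T)) with D = 2, G = P * X_HI = 1
x_bar = dual_subgradient(T, eta)
# f_i is linear in u_i, so the worst case over [-1, 1] is at an endpoint.
worst = max(f(i, x_bar, u) for i in range(len(A)) for u in (-1.0, 1.0))
```

On this instance the robust feasible set is the interval $[1, 1.6]$. The averaged iterate lands slightly below 1, so it violates the first robust constraint by a small amount, well within the $GD/\sqrt{T} = 0.1$ tolerance one would expect from an inverse-square dependence on the accuracy.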

### 3.2 Dual-Perturbation Meta-Algorithm

We now give our more general and intricate oracle-based approximation algorithm for RO. In contrast to the previous simple subgradient-based method, in this section we do not need the uncertainty structure to be convex. Instead, in addition to an oracle to solve the original mathematical program, we also assume the existence of an efficient “pessimization oracle” (as termed by [18]), namely an oracle that approximates the worst-case noise for any given feasible solution . Formally, assume that for all the following hold:

1. For all , the function is linear in , i.e. can be written as for some functions and ;

2. There exists a linear optimization procedure that given a vector , computes a vector such that

On the surface, the linearity assumption seems very strong. However, note that we do not assume the convexity of the set . This means that the dual subproblem (that amounts to finding the worst-case noise for a given ) is not necessarily a convex program. Nevertheless, our approach can still approximate the robust formulation as long as a procedure is available.

In the rest of the section we use the following notations. Let be an upper bound over the diameter of , that is . Let and be constants such that and for all and .

We can now present our second oracle-based meta-algorithm, described in Algorithm ?. Similarly to our dual-subgradient method, the algorithm is based on primal-dual iterations. However, in the dual part we now rely on the approximate pessimization oracle for updating the noise terms. This algorithm provides the following convergence guarantee.

We begin by analyzing the dual part of the algorithm, namely, the rule by which the variables are updated. While this rule is essentially an FPL-like update, we cannot apply Lemma ? directly for two crucial reasons. First, the update uses an approximate linear optimization procedure instead of an exact one as required by FPL. Second, the reward vectors being observed by the dual algorithm are random variables that depend on its internal randomization (i.e., on the random variables ). Nevertheless, by analyzing a noisy version of the FPL algorithm (in Section 3.3 below) we can prove the following bound.

Fix some . Note that the distribution from which the dual algorithm draws is a deterministic function of the primal variables . Hence, we can apply Lemma 4.1 of [9], together with the regret bound of Theorem ? (see Section 3.3 below), and obtain that

where denotes the expectation conditioned on . Next, note that the random variables for form a martingale difference sequence with respect to , and

Hence, by Azuma’s inequality (see e.g., Lemma A.7 in [9]), with probability at least ,

Summing inequalities and , we obtain the lemma.

Equipped with the above lemma, we can now prove Theorem ?.

First, suppose that the algorithm returns “infeasible”. By the definition of the oracle , this happens if for some , there does not exist such that

This implies that the robust counterpart cannot be feasible.

Next, suppose that a solution is returned (note that must lie in the set as we assume that is convex). This ensures that for all and (otherwise, the algorithm would have returned “infeasible”), whence

On the other hand, Lemma ? implies that for each we have

with probability at least . Recalling that for all and applying a union bound, we obtain that with probability at least ,

Using our choice of now gives that with probability at least ,

Combining and , we conclude that with probability at least , for all ,

where the final inequality follows from the convexity of the functions with respect to . Hence, with probability at least , for every we have

implying that is an -approximate robust solution.
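A toy sketch of the dual-perturbation iterations may help fix ideas. The instance is hypothetical: the uncertainty set is the non-convex two-point set $\{-1, +1\}$, each constraint is written as $g_i(x)\,u_i + h_i(x)$ with $g_i(x) = P x$ and $h_i(x) = a_i x - b_i$, and the pessimization oracle is an exact scan over the two points. The perturbation is drawn from the symmetric interval $[-1/\eta, 1/\eta]$, a choice made for this one-dimensional example so that the perturbation can actually change the selected noise.

```python
import random

# Toy data (illustrative assumptions), with constraints
#   g_i(x) * u_i + h_i(x) <= 0  for all u_i in U.
A = [-1.0, 1.0]
B = [-0.5, 2.4]
P = 0.5
X_LO, X_HI = 0.0, 2.0
U = (-1.0, 1.0)  # non-convex two-point uncertainty set


def g(i, x):
    return P * x


def h(i, x):
    return A[i] * x - B[i]


def primal_oracle(u):
    """Solve the nominal problem for fixed noise u; None means infeasible."""
    lo, hi = X_LO, X_HI
    for a, b, ui in zip(A, B, u):
        c = a + P * ui  # the constraint reads c * x <= b
        if c > 0:
            hi = min(hi, b / c)
        elif c < 0:
            lo = max(lo, b / c)
        elif b < 0:
            return None
    return lo if lo <= hi else None


def pessimize(v):
    """Linear maximization over U: the noise maximizing u * v."""
    return max(U, key=lambda u: u * v)


def dual_perturbation(T, eta, rng):
    cumulative = [0.0] * len(A)  # running sums of g_i(x_t)
    xs = []
    for _ in range(T):
        # Dual step: follow the perturbed leader through the pessimization oracle.
        u = [pessimize(c + rng.uniform(-1.0 / eta, 1.0 / eta)) for c in cumulative]
        x = primal_oracle(u)  # primal step: non-robust solve
        if x is None:
            return None
        xs.append(x)
        cumulative = [c + g(i, x) for i, c in enumerate(cumulative)]
    return sum(xs) / len(xs)


x_bar = dual_perturbation(400, 0.1, random.Random(0))
worst = max(g(i, x_bar) * u + h(i, x_bar) for i in range(len(A)) for u in U)
```

In early rounds the perturbation dominates and the selected noise is essentially random; once the cumulative coefficient exceeds $1/\eta$ the worst-case noise $u_i = +1$ is chosen in every round, and the averaged iterate approaches the robust feasible region.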

### 3.3 Follow the Approximate Perturbed Leader

As mentioned above, in our analysis we require a noisy version of the FPL algorithm, namely a variant capable of using an approximate linear optimization procedure over the decision domain rather than an exact one. Here we analyze such a variant and prove Theorem ?, which is used in the proof of Lemma ? above.

Assume we have a procedure for -approximating linear programs over a (not necessarily convex) domain , that is, for all the output of satisfies

for some constant . We analyze the following version of the FPL algorithm: at round choose by first choosing a perturbation uniformly at random, and computing:

We show that the error introduced by the noisy optimization procedure does not harm the regret too much. Formally, we prove:

Throughout this section we use the notation as a shorthand for the sum . Following the analysis of [15], we first prove that being the approximate leader yields approximately zero regret.

The proof is by induction on . For the claim is trivial. Next, assuming correctness for some value of we have

which completes the proof.

Next, we bound the regret of a hypothetical algorithm that on round uses the unobserved function for predicting .

Imagine a fictitious round in which a reward vector is observed. Then, using Lemma ? we can write

Using the guarantee of , we can bound the first term on the right hand side as

Putting things together, for we have

where the final inequality follows from Hölder’s inequality, since and .

Our final lemma bounds the expected difference in quality between the prediction made by the hypothetical algorithm to the one made by the approximate FPL algorithm.

Lemma 3.2 in [15] shows that the cubes and overlap in at least fraction. On this intersection, the random variables and are identical. Otherwise, they can differ by at most . This gives the claim.

We can now prove our regret bound.

Since we are bounding the expected regret, we can simply assume that with uniformly distributed in the cube . Combining the above lemmas, we see that

The claimed regret bound now follows from our choice of .

## 4 Examples and Applications

In this section we provide several examples for the applicability of our results. All the problems we consider are stated as feasibility problems. For concreteness, we focus on ellipsoidal uncertainty sets, being the most common model of data uncertainty.

### 4.1 Robust Linear Programming

A linear program (LP) in the standard form is given by

The robust counterpart of this optimization problem is a second-order conic program (SOCP) that can be solved efficiently, see e.g. [3]. In many cases of interest there exist highly efficient solvers for the original LP problem, as in the important case of network flow problems where the special combinatorial structure allows for algorithms that are much more efficient than generic LP solvers. However, this combinatorial structure is lost for its corresponding robust network flow problem. Hence, solving the robust problem using an oracle-based approach might be favorable in these cases. For the same reason, our technique is relevant even in the case of polyhedral uncertainty, where the robust counterpart remains an LP but possibly without the special structure of the original formulation.

In the discussion below, we assume that the feasible domain of the LP is inscribed in the Euclidean unit ball (this can be ensured via standard scaling techniques). Notice this also implies that the feasible domain of the corresponding robust formulation is inscribed in the same ball.

A robust linear program with ellipsoidal noise is given by:

where is a matrix controlling the shape of the ellipsoidal uncertainty, are the nominal parameter vectors, and is the -dimensional Euclidean unit ball.
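To see why the robust counterpart is a second-order conic program, it may help to work out the worst case of a single constraint. The notation here is an assumption, since the formulation above is not fully legible: we write the $i$'th robust constraint as $(a_i + P u_i)^\top x - b_i \le 0$ for all $\|u_i\|_2 \le 1$, with $a_i$ the nominal vector and $P$ the shape matrix.

```latex
\max_{\|u_i\|_2 \le 1} \; (a_i + P u_i)^\top x - b_i
  \;=\; a_i^\top x - b_i + \max_{\|u_i\|_2 \le 1} u_i^\top (P^\top x)
  \;=\; a_i^\top x + \|P^\top x\|_2 - b_i .
```

The inner maximum follows from the Cauchy-Schwarz inequality and is attained at $u_i = P^\top x / \|P^\top x\|_2$ whenever $P^\top x \ne 0$. Under this notation, the robust constraint is equivalent to $a_i^\top x + \|P^\top x\|_2 \le b_i$, which is a second-order cone constraint.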

Dual-Subgradient Algorithm. The robust linear program is amenable to our OGD-based meta-algorithm (Algorithm ?), as the constraints are linear with respect to the noise terms . In this case we have , so that in each iteration of the algorithm, the update of the variables takes the simple form

Specializing Theorem ? to the case of robust LPs, we obtain the following.

Note that for all and ,

Setting and in Theorem ?, we obtain the statement.
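The dual update for robust LPs can be sketched as follows, under the assumption (not legible in the text above) that the $i$'th constraint is $(a_i + P u_i)^\top x - b_i \le 0$ with $\|u_i\|_2 \le 1$, so that the gradient with respect to $u_i$ is $P^\top x$ and the projection onto the unit ball is a rescaling. The helper names and the example data are hypothetical.

```python
import math


def mat_transpose_vec(P, x):
    """Compute P^T x for a matrix P given as a list of rows."""
    rows, cols = len(P), len(P[0])
    return [sum(P[r][c] * x[r] for r in range(rows)) for c in range(cols)]


def robust_lp_noise_step(u, P, x, eta):
    """One projected gradient ascent step on the noise vector u.

    The gradient of (a + P u)^T x - b with respect to u is P^T x; the
    result is rescaled only if it leaves the Euclidean unit ball.
    """
    v = [ui + eta * gi for ui, gi in zip(u, mat_transpose_vec(P, x))]
    norm = math.sqrt(sum(vi * vi for vi in v))
    scale = max(1.0, norm)
    return [vi / scale for vi in v]


# Example: a small step stays inside the ball, a large one is rescaled.
P = [[2.0, 0.0], [0.0, 1.0]]
x = [0.6, 0.8]
u_small = robust_lp_noise_step([0.0, 0.0], P, x, eta=0.1)
u_large = robust_lp_noise_step([0.0, 0.0], P, x, eta=1.0)
```

Since the update costs one matrix-vector product per constraint, the dual part of each iteration is cheap compared with the oracle call for the nominal LP.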

Dual-Perturbation Algorithm. Since the constraints of the robust LP are linear in the uncertainties , we can also apply our FPL-based meta-algorithm to the problem . Using the notations of Section 3.2, we have . Hence, the computation of the noise variables can be done in closed-form, as follows:

In this case, Theorem ? implies:

Using the notations of Section 3.2 with , we have

Making the substitutions into the guarantees in Theorem ? completes the proof.
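The closed-form computation of the noise in the FPL-based method amounts to maximizing a linear function over the Euclidean unit ball, whose maximizer is the normalized coefficient vector. The sketch below assumes that the coefficient vector is the cumulative sum of the terms $P^\top x_t$ plus a perturbation drawn from the cube $[0, 1/\eta]^d$; the function names are hypothetical.

```python
import math
import random


def worst_case_noise(v):
    """Maximize u . v over the Euclidean unit ball: u = v / ||v||."""
    norm = math.sqrt(sum(vi * vi for vi in v))
    if norm == 0.0:
        return [0.0] * len(v)  # every feasible u is optimal; pick the origin
    return [vi / norm for vi in v]


def perturbed_noise(cumulative, eta, rng):
    """FPL-style choice of noise from cumulative coefficients plus a perturbation."""
    perturbation = [rng.uniform(0.0, 1.0 / eta) for _ in cumulative]
    return worst_case_noise([c + p for c, p in zip(cumulative, perturbation)])


# Example: for v = (3, 4) the maximizer is (0.6, 0.8) with value ||v|| = 5.
v = [3.0, 4.0]
u_star = worst_case_noise(v)
value = sum(a * b for a, b in zip(u_star, v))
u_fpl = perturbed_noise([3.0, 4.0], eta=0.5, rng=random.Random(1))
```

Because this step is available in closed form, no separate pessimization solver is needed for ellipsoidal uncertainty in the LP case.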

We see that the asymptotic performance of Algorithm ? is factor- worse than that of Algorithm ?, in the case of robust LP problems.

### 4.2 Robust Quadratic Programming

We next consider convex quadratic programs (QPs). As in the case of LPs, we assume that the feasible domain of the program is inscribed in the Euclidean unit ball. The robust counterpart of this optimization problem is a semidefinite program [3]. Current state-of-the-art solvers can handle QPs that are two to three orders of magnitude larger than the SDPs they can handle, motivating our results. Indeed, our approach avoids reducing the robust program to an SDP and approximates it using a QP solver.

A robust QP with ellipsoidal uncertainties is given by 3

where are fixed matrices and . Here denotes the ’th entry of the noise vector .

Notice that Algorithm ? does not apply to formulation , as the constraints are certainly not concave with respect to the noise terms (in fact, they are convex in , as we show below). This motivates the need for our FPL-based meta-algorithm.

Dual-Perturbation Algorithm. We now show that the problem falls in the scope of Section 3.2, and the assumptions required there hold for this program. We let denote the total magnitude of the admissible noise, and assume that the Frobenius norms of the nominal matrices are upper bounded by .

The following lemma shows that the ’th constraint is in fact a convex quadratic in .

Define and for . We have

where is a matrix whose columns are . We see that the first claim holds for , and , all of which are independent of .

It is left to bound the coefficients in the above quadratic form. Note that for all with , so that

Hence,

and

which proves the second claim.

The above lemma demonstrates the well-known fact that the problem of finding the worst-case noise in robust QP with ellipsoidal uncertainty is a maximization of a convex quadratic over the unit ball (see [3]), a mathematical program known as the trust region subproblem. For this well-studied problem, fast approximation algorithms are available [17] that are able to avoid solving an SDP (see also the recent linear-time approximation algorithm of [13]).

Finally, we compute the number of iterations required for Algorithm ? to converge.

According to Lemma ?, the ’th constraint in can be written in the linear form by setting and and , with , , and , and . That is, for the analysis only, we imagine that we work over a transformed, non-convex uncertainty set,

(Recall that the convergence properties of Algorithm ? do not require the convexity of the uncertainty set.) Notice that maximizing the linear function with respect to is equivalent to maximizing the function over , as established by the oracle to the non-robust problem. With the above definitions and the notations of Section 3.2, we have for all that

and for all , it holds that

Hence, for all and ,

Hence, we may set

Theorem ? with the above quantities now implies that Algorithm ? needs at most iterations for -approximating the problem .

### 4.3 Robust Semidefinite Programming

A semidefinite program (SDP) is given by

where is the cone of positive semidefinite matrices, are nominal parameter matrices, and denotes the dot-product of the matrices and . Again, we assume that the feasible domain of the SDP is inscribed in the Euclidean unit ball (defined by the Frobenius matrix norm).

The robust counterpart of an SDP program is, in general, NP-hard even with simple ellipsoidal uncertainties [2]. Nevertheless, using our framework we are able to approximate robust SDP programs to within an arbitrary precision, as we now describe.

A robust SDP program with ellipsoidal uncertainties takes the following form:

where are fixed matrices and .

Dual-Subgradient Algorithm. Similarly to robust LPs, our OGD-based meta-algorithm can be applied to the robust SDP program as the constraints are linear with respect to the noise terms . In the present case we have , so that the update of the noise variables takes the simple form

For the resulting algorithm, we have the following.

By Cauchy-Schwarz, for all and ,

Therefore, we may take and in the bound of Theorem ?, giving our claim.

Finally, we note that Algorithm ? also applies to robust SDPs, but gives a guarantee worse by a factor of .

## 5 Conclusion

In this paper we considered using online learning approaches for effectively solving robust optimization problems without transforming the problem to a different, more complex, class of problems. We showed that if the original problem is convex and comes equipped with an oracle capable of approximating it, then we can solve the robust problem approximately by employing an online learning approach that invokes the oracle a polynomial number of times. Essentially, our approach is applicable to any robust optimization problem for which we can efficiently approximate the worst-case noise for any given feasible solution, and is particularly efficient when the latter task can be accomplished via subgradient methods.

Our approach opens up avenues for solving large-scale robust optimization problems that are more common in data analysis and machine learning. The key observation is that the number of iterations of the online learning algorithms is independent of the dimension of the problem. This means that as long as the original problem is solvable efficiently (e.g., support vector machines) the robust problem does not become much more difficult if the accuracy of the solution can be compromised.

Our approach can be used to solve other RO problems of interest. For example, solving robust multi-stage decision problems such as Markov decision processes [21] is of interest; see [19] for discussion of robust Markov decision processes. Standard (non-robust) Markov decision processes are solvable using linear programming. However, their robust counterpart is in general not amenable to linear programming and a dynamic programming approach is needed to solve the stochastic game between the decision maker and Nature. This approach does not seem to scale up to large problems where approximate dynamic programming is needed. Using an online approach as we suggested may prove very useful since solving the original problem seems easy (solving a linear program) and finding the worst-case noise is also not too difficult, depending on the noise model. We leave the important case of multi-stage problems for future research.

Finally, it would be interesting to adapt our approach to robust combinatorial optimization, where few disciplined robust optimization methods are available. While our methods assume the original problem to be convex, our main interaction with the problem is through a black-box oracle (that may be available for non-convex problems), so it seems that the convexity requirement might be relaxed in certain cases of interest.

### Footnotes

1. Recall that we are only providing an approximate solution, and thus our algorithms formally constitute a “polynomial time approximation scheme” (PTAS).
2. Here we present OCO as the problem of online maximization of concave reward functions rather than online minimization of convex cost functions. While the latter is more common, both formulations are equivalent.
3. For simplicity, the uncertainties we consider here are only in the matrices (and not in the vectors ). In a similar, albeit more technical way we can also analyze our algorithm with general ellipsoidal uncertainties.

### References

1. The multiplicative weights update method: a meta-algorithm and applications.
S. Arora, E. Hazan, and S. Kale. Theory of Computing
2. Robust optimization - methodology and applications.
A. Ben-Tal and A. Nemirovski. Math. Program.
3. Robust Optimization
A. Ben-Tal, L. E. Ghaoui, and A. Nemirovski.
4. Robust discrete optimization and network flows.
D. Bertsimas and M. Sim. Math. Program.
5. Theory and applications of robust optimization.
D. Bertsimas, D. B. Brown, and C. Caramanis. SIAM Review
6. Robust sparse hyperplane classifiers: Application to uncertain molecular profiling data.
C. Bhattacharyya, L. R. Grate, M. I. Jordan, L. El Ghaoui, and I. S. Mian. Journal of Computational Biology
7. A second order cone programming formulation for classifying missing data.
C. Bhattacharyya, K. S. Pannagadatta, and A. J. Smola. In L. K. Saul, Y. Weiss, and L. Bottou, editors, Advances in Neural Information Processing Systems (NIPS17), Cambridge, MA, 2004. MIT Press.
8. Uncertain convex programs: randomized solutions and confidence levels.
G. Calafiore and M. C. Campi. Mathematical Programming
9. Prediction, Learning, and Games
N. Cesa-Bianchi and G. Lugosi.
10. Sublinear optimization for machine learning.
K. L. Clarkson, E. Hazan, and D. P. Woodruff. J. ACM
11. Approximating semidefinite programs in sublinear time.
D. Garber and E. Hazan. In 25th Annual Conference on Neural Information Processing Systems (NIPS), pages 1080–1088, 2011.
12. The convex optimization approach to regret minimization.
E. Hazan. Optimization for machine learning
13. A linear-time algorithm for trust region problems.
E. Hazan and T. Koren. arXiv preprint arXiv:1401.6757
14. Beating SGD: Learning SVMs in sublinear time.
E. Hazan, T. Koren, and N. Srebro. In Advances in Neural Information Processing Systems, pages 1233–1241, 2011.
15. Efficient algorithms for online decision problems.
A. T. Kalai and S. Vempala. J. Comput. Syst. Sci.
16. A robust minimax approach to classification.
G. R. Lanckriet, L. El Ghaoui, C. Bhattacharyya, and M. I. Jordan. Journal of Machine Learning Research
17. Computing a trust region step.
J. J. Moré and D. C. Sorensen. SIAM Journal on Scientific and Statistical Computing
18. Cutting-set methods for robust convex optimization with pessimizing oracles.
A. Mutapcic and S. P. Boyd. Optimization Methods and Software
19. Robust control of Markov decision processes with uncertain transition matrices.
A. Nilim and L. El Ghaoui. Operations Research
20. Fast approximation algorithms for fractional packing and covering problems.
S. A. Plotkin, D. B. Shmoys, and É. Tardos. Mathematics of Operations Research
21. Markov Decision Processes
M. L. Puterman.
22. A semidefinite framework for trust region subproblems with applications to large scale minimization.
F. Rendl and H. Wolkowicz. Mathematical Programming
23. Online learning and online convex optimization.
S. Shalev-Shwartz. Found. Trends Mach. Learn.
24. Second order cone programming approaches for handling missing and uncertain data.
P. K. Shivaswamy, C. Bhattacharyya, and A. J. Smola. Journal of Machine Learning Research
25. Robust support vector machines for classification and computational issues.
T. Trafalis and R. Gilbert. Optimization Methods and Software
26. Robustness and regularization of support vector machines.
H. Xu, C. Caramanis, and S. Mannor. Journal of Machine Learning Research
27. Robust regression and lasso.
H. Xu, C. Caramanis, and S. Mannor. IEEE Transactions on Information Theory
28. Online convex programming and generalized infinitesimal gradient ascent.
M. Zinkevich. In ICML, pages 928–936, 2003.