Stability and instability in saddle point dynamics Part II: The subgradient method
Abstract
In part I we considered the problem of convergence to a saddle point of a concave-convex function via gradient dynamics, and gave an exact characterization of their asymptotic behaviour. In part II we consider a general class of subgradient dynamics that restrict the trajectories to an arbitrary convex domain. We show that, despite the nonlinear and nonsmooth character of these dynamics, their limit set is comprised of solutions to only linear ODEs. In particular, we show that the latter are solutions to subgradient dynamics on affine subspaces, a smooth class of dynamics whose asymptotic properties have been exactly characterized in part I. Various convergence criteria are formulated using these results, and several examples and applications are discussed throughout the manuscript.
I Introduction
In [18] we studied the asymptotic behaviour of the gradient method when applied to a general concave-convex function in an unconstrained domain, and provided an exact characterization of its limiting solutions. Nevertheless, in many applications, such as primal/dual algorithms in optimization problems, it becomes necessary to constrain the system states to a prescribed convex set, e.g. positivity constraints on Lagrange multipliers or constraints on physical quantities like data flow, and prices/commodities in economics [20], [24], [37], [12]. The subgradient method is used in such cases: it is a version of the gradient method with a projection term additionally included in the vector field, so as to ensure that the trajectories do not leave the desired set.
In discrete time, there is an extensive literature on the subgradient method, via its application in optimization problems (see e.g. [33]). However, in many applications, for example power networks [41, 10, 22, 7, 8, 23, 38, 28, 32] and classes of data network problems [24], [37], [12], [30], continuous time models are considered. It is thus important to have a good understanding of the subgradient dynamics in a continuous time setting, which could also facilitate analysis and design by establishing links with other more abstract results in dynamical systems theory.
A main complication in the study of the subgradient method arises from the fact that this is a nonsmooth system, i.e. a nonlinear ODE with a discontinuous vector field due to the projections involved. This prohibits the direct application of classical Lyapunov or LaSalle theorems (e.g. [25]), which is reflected in the direct approach used by Arrow, Hurwicz and Uzawa in [1] that avoids the use of such tools. More recently, the work of Feijer and Paganini [12] unified the previously ad hoc and application-focused analysis of primal-dual gradient dynamics in network optimisation, and proposed that the switching in the dynamics be interpreted in the framework of hybrid automata, where a LaSalle invariance principle was recently obtained in [31]. However, as recently pointed out in [4], there are cases where the assumptions required in [31] do not hold. In [4], the LaSalle invariance principle for discontinuous Carathéodory systems is applied to prove convergence of the subgradient method under positivity constraints and the assumption of strict concavity. Further results on the asymptotic properties of the subgradient method under positivity constraints were derived in [5], where global convergence was also shown under a condition of local strict concavity-convexity. In [35] the subgradient method is used to solve linear programs with inequality constraints. In general, proving convergence for the subgradient method, even in simple cases, is a nontrivial problem that requires the nonsmooth character of the system to be explicitly addressed.
Our aim in this paper is to provide a framework of results that allow one to study the asymptotic behaviour of the subgradient method in a general setting, where the trajectories are constrained to an arbitrary convex domain, and the concave-convex function considered is not necessarily strictly concave-convex. One of our main results is to show that, despite the nonlinear and nonsmooth character of the subgradient dynamics, their limiting solutions satisfy explicit linear differential equations.
In particular, we show that the solutions of these linear ODEs are limiting solutions of subgradient dynamics on an affine subspace, which is a class of dynamics that fits within the framework studied in Part I [18]. These dynamics can therefore be exactly characterized, which allows convergence to a saddle point to be proved for broad classes of problems.
The results in this paper are illustrated by means of examples that demonstrate also the complications in the dynamic behaviour of the subgradient method relative to the unconstrained gradient method. We also apply our results to modification schemes in network optimization, that provide convergence guarantees while maintaining a decentralized structure in the dynamics.
The methodology used for the derivations in the paper is also of independent technical interest. In particular, the notion of a face of a convex set is used to characterize the ODEs associated with the limiting behaviour of the subgradient dynamics. Furthermore, some more abstract results on corresponding semiflows have been used to address the complications associated with the nonsmooth character of subgradient dynamics.
The paper is structured as follows. Section II provides preliminaries from convex analysis and dynamical systems theory that will be used within the paper. The problem formulation is given in section III and the main results are presented in section IV, where various examples that illustrate those are also discussed. Applications to modification methods in network optimization are given in section V. The proofs of the results are given in sections VI and VII and an application to the problem of multipath routing is discussed in Appendix B.
II Preliminaries
We use the same notation and definitions as in part I of this work [18] and we refer the reader to the preliminaries section therein. The notions below from convex analysis and analysis of dynamical systems will additionally be used throughout the paper.
IIA Convex analysis
We recall first for convenience the following notions defined in part I [18] that will be frequently used in this manuscript. For a closed convex set and , we denote the normal cone to through as . When is an affine space is independent of and is denoted . If is in addition nonempty, then we denote the projection of onto as . Also for vectors , denotes the Euclidean metric and the Euclidean norm.
Concaveconvex functions and saddle points
For a function that is concave-convex on the (standard) notion of a saddle point was given in part I [18]. We now consider restricted to a nonempty closed convex set , in which case the notion of saddle point needs to be modified to incorporate the constraints.
Definition (Restricted saddle point).
Let be nonempty closed and convex. For a concave-convex function , we say that is a restricted saddle point of if for all and with we have the inequality .
If in addition then is a restricted saddle point if and only if the vector of partial derivatives lies in the normal cone .
Any restricted saddle point in the interior of is also a saddle point. If is closed and convex and is a restricted saddle point, then is also a restricted saddle point.
However, it does not hold in general that if has a saddle point, and is closed convex and nonempty, then has a restricted saddle point (an explicit example illustrating this is given later in subsection IVB(ii)). In this manuscript we will only consider cases where at least one restricted saddle point exists, leaving the problem of showing existence to the specific application.
Concave programming
Concave programming (see e.g. [3]) is concerned with the study of optimization problems of the form
(1) 
where , are concave functions and is nonempty closed and convex. Under some mild assumptions, the solutions to such problems are saddle points of the Lagrangian
(2) 
where are the Lagrange multipliers. This is stated in the Theorem below.
Theorem.
Faces of convex sets
Some of the main results of this manuscript refer to faces of a convex set. We refer the reader to [16, Chap. 1.8.] for further discussion of such topics.
Definition (Face of a convex set).
Given a nonempty closed convex set , a face of is a subset of that has both the following properties:

is convex.

For any line segment , if then .
For the reader's convenience we recall some standard properties of faces:
(a) The intersection of two faces of is a face of .
(b) The empty set and itself are both faces of . If a face is neither or it is called a proper face.
(c) If is a face of and is a face of , then is a face of .
(d) For a face of , the normal cone is independent of the choice of . In these cases we drop the dependence and write it as .
(e) may be written as the disjoint union:
(4)
Property (a) above leads to the following definition.
Definition (Minimal face containing a set).
For a convex set and a subset we define the minimal face containing as
which is a face by property (a) above.
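As a sketch of what this definition amounts to, the minimal face can be written as the intersection of all faces that contain the given subset. The symbols below are assumptions made for illustration ($K$ for the convex set, $A \subseteq K$ for the subset, and $F_K(A)$ as a hypothetical name for the minimal face) and need not match the notation of part I:

```latex
F_K(A) \;=\; \bigcap \bigl\{\, F \;:\; F \text{ is a face of } K,\ A \subseteq F \,\bigr\}
```

The family on the right is never empty, because $K$ is itself a face of $K$.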
IIB Dynamical systems
Definition (Flows and semiflows).
A triple is a flow (resp. semiflow) if is a metric space, is a continuous map from (resp. ) to which satisfies the two properties

For all , .

For all , (resp. ),
(5)
When there is no confusion over which (semi)flow is meant, we shall denote as . For sets (resp. ) and we define .
Definition (ω-limit set).
Given a semiflow we denote the set of ω-limit points of trajectories as
(6) 
where denotes the closure of in .
Definition (Invariant sets).
For a semiflow we say that a set is positively invariant if . If is also a flow we say that is negatively invariant if . If for all then we say is invariant.
Definition (Sub(semi)flow).
For a flow (resp. semiflow) and an invariant (resp. positively invariant) set we obtain the subflow (resp. subsemiflow) by restricting to act on and denote it as .
Definition (Global convergence).
We say that a (semi)flow is globally convergent, if for all initial conditions , the trajectory converges to the set of equilibrium points of as , i.e.
In part I of this work much of the analysis relied on a specific form of stability, linked to incremental stability, which we reproduce below for the convenience of the reader.
Definition (Pathwise stability).
We say that a semiflow is pathwise stable if for any two trajectories the distance is nonincreasing in time.
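To make the definition concrete, the following minimal sketch checks pathwise stability numerically for a linear gradient method. The function phi(x, y) = -x²/2 + xy, the helper names and the initial conditions are all assumptions chosen for illustration and do not come from the paper. For a linear system the difference of two trajectories is itself a trajectory, so it is enough to track the norm of the propagated initial difference.

```python
import numpy as np

# Assumed toy example (not from the paper): phi(x, y) = -x**2/2 + x*y,
# concave in x and convex (linear) in y. Its gradient method is
# x' = d(phi)/dx = -x + y,  y' = -d(phi)/dy = -x,  i.e. z' = A z.
A = np.array([[-1.0, 1.0],
              [-1.0, 0.0]])

def flow_matrix(A, t):
    """Return exp(t*A) via eigendecomposition (A is assumed diagonalizable)."""
    eigvals, eigvecs = np.linalg.eig(A)
    return (eigvecs @ np.diag(np.exp(t * eigvals)) @ np.linalg.inv(eigvecs)).real

def distances(A, z0, w0, times):
    """Euclidean distance between the trajectories started at z0 and w0."""
    return np.array([np.linalg.norm(flow_matrix(A, t) @ (z0 - w0)) for t in times])

times = np.linspace(0.0, 10.0, 201)
d = distances(A, np.array([1.0, 0.0]), np.array([-0.5, 2.0]), times)

# Pathwise stability: the distance never increases along the time grid.
is_nonincreasing = bool(np.all(np.diff(d) <= 1e-9))
print(is_nonincreasing)
```

The exact matrix exponential is used instead of a forward Euler step because an explicit discretisation can slightly increase distances even when the continuous-time flow does not.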
As will be discussed in the paper, the dynamics of a pathwise stable semiflow restricted to its limit set belong to the class defined below.
Definition ((Semi)Flow of isometries).
We say that a (semi)flow is a (semi)flow of isometries if for every (resp. ), the function is an isometry, i.e. for all it holds that .
Finally, we will need the notion of Carathéodory solutions of differential equations.
Definition (Carathéodory solution).
We say that a trajectory is a Carathéodory solution to a differential equation , if is an absolutely continuous function of , and for almost all times , the derivative exists and is equal to .
III Problem formulation
The main object of study in this work is the subgradient method on an arbitrary concave-convex function in and an arbitrary convex domain . We first recall the definition of the gradient method, which is studied in part I of this work [18].
Definition (Gradient method).
Given a concave-convex function on , we define the gradient method as the flow on generated by the differential equation
(7)  
The subgradient method is obtained by restricting the gradient method to a convex set by the addition of a projection term to the differential equation (7).
Definition (Subgradient method).
Given a nonempty closed convex set and a function that is concave-convex on , we define the subgradient method on as a semiflow on consisting of Carathéodory solutions of
(8)  
by a transformation of coordinates.
Remark.
For (nonaffine) convex sets the subgradient method (8) is a nonsmooth system. The vector field is discontinuous due to the convex projection term, independently of the regularity of the function or of the boundary of . This is in contrast to the gradient method (7), which is a smooth system, as it inherits the regularity of the function .
The equilibrium points of the subgradient method on are exactly the restricted saddle points.
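As an illustration of the discontinuity described in the remark above, here is a minimal sketch for the special case of positivity constraints. It assumes that the projected vector field has the common form f(z) - P_N(f(z)), where N is the normal cone of the constraint set at z and f is the unconstrained gradient vector field; this form, and the function names, are assumptions of the sketch rather than the paper's stated notation. For the nonnegative orthant the projection simply zeroes the components that would push a boundary coordinate out of the set.

```python
import numpy as np

def projected_field(z, f, tol=1e-12):
    """Return f - P_N(f), where N is the normal cone of the nonnegative
    orthant at z: components pushing out of the orthant are zeroed."""
    z = np.asarray(z, dtype=float)
    v = np.array(f, dtype=float)  # copy so the caller's f is not modified
    blocked = (z <= tol) & (v < 0.0)  # on the boundary and pointing outwards
    v[blocked] = 0.0
    return v

def is_equilibrium(z, f, tol=1e-12):
    """True when the projected field vanishes at z (f lies in the normal cone)."""
    return bool(np.all(np.abs(projected_field(z, f, tol)) <= tol))

# The first coordinate sits on the boundary and is pushed outwards, so it is blocked.
print(projected_field([0.0, 1.0], [-1.0, -1.0]))
```

The jump between a blocked and an unblocked component as a coordinate reaches zero is the source of the discontinuity, regardless of how smooth f is. The `is_equilibrium` check corresponds to the statement that equilibria are the points where the unconstrained vector field lies in the normal cone.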
We briefly summarise the contributions of this work in the bullet points below.

We show that the subgradient dynamics, despite being nonlinear and nonsmooth, have an ω-limit set that is comprised of solutions to only linear ODEs.

These solutions are shown to belong to the limit set of the subgradient method on affine subspaces. This links with part I [18] of this two part work, where the limiting solutions of such systems have been exactly characterized. Based on this characterization of the limiting solutions, a convergence result for subgradient dynamics is also presented.

Various applications of the results above are considered. In particular, we give a proof of the convergence of the subgradient method applied to any strictly concave-convex function for an arbitrary convex domain. Furthermore, we apply our results to modification methods in network optimization that provide convergence guarantees while maintaining a decentralized structure in the dynamics. An application to the problem of multipath routing is also discussed.
IV Main Results
This section states the main results of the paper. The results are divided into three subsections. To facilitate the readability of section IV we outline below the main Theorems that will be presented and the way these are related.
In subsection IVA we consider pathwise stable semiflows, an abstraction we use for the subgradient dynamics in order to develop tools for their analysis that are valid despite their nonsmooth character. In particular, Proposition IVA gives an invariance principle for such semiflows, which applies without any smoothness assumption on the dynamics. We then additionally incorporate projections that constrain the trajectories within a closed convex set. Our key result, Theorem IVA, says that for these semiflows the dynamics on the limit set are smooth.
In subsection IVB we apply these tools to the subgradient method (8). In Theorem IVB we show that the limiting solutions of the (nonsmooth) subgradient method on a convex set are given by the dynamics of the (smooth) subgradient method on an affine subspace. This allows us to obtain Corollary IVB, a criterion for global asymptotic stability of the subgradient method.
In subsection IVC we combine Theorem IVB with the results of Part I of this work [18] (reproduced in Appendix A for the convenience of the reader) to obtain a general convergence criterion (Theorem IVC) for the subgradient method.
These results are illustrated with examples throughout. The proofs of the results are given in section VI.
IVA Pathwise stability and convex projections
If one wishes to extend the results of Part I of this work [18] to the subgradient method on a nonempty closed convex set , then one runs into two problems, both coming from the discontinuity of the vector field in (8). The first is that the previously simple application of LaSalle’s theorem would become much more technical, needing tools from nonsmooth analysis. The second, more fundamental, problem is that LaSalle’s theorem only gives convergence to a set of trajectories, and it remains to characterise this set. The trajectories in this set still satisfy an ODE with a discontinuous vector field, and we do not have uniqueness of the solution backwards in time, although we still have a semiflow.
To solve these issues we reinterpret the prior results in terms of a simple property which is still present in the subgradient method.
The main tool used to prove the results in [18] was pathwise stability (subsection IIB), which says that the Euclidean distance between any two solutions is nonincreasing with time (we will later prove such a result for the subgradient method). Intuitively, one would expect that the distance between any two of the limiting solutions would be constant. A more abstract way of saying this is that the subflow obtained by considering the gradient method with initial conditions in the limit set is a flow of isometries. In fact, this can be proved for any pathwise stable semiflow, as stated in Proposition IVA below.
Proposition.
Let be a pathwise stable semiflow (see subsection IIB) with which has an equilibrium point . Let be its limit set. Then the subsemiflow (see subsection IIB) defines a flow of isometries (see subsection IIB). Moreover, is a convex set.
Note here that is a flow rather than a semiflow. This comes from the simple observation that an isometry is always invertible, so we can define, for , as .
Remark.
Care should be taken in interpreting the backwards flow given by subsection IVA. There could be multiple trajectories in that meet at a point in at time , but exactly one of these trajectories will lie in for all times .
We would like to note that we are not the first to make this observation. Indeed, we deduce this result from a more general result in [6] which was published in 1970.
We consider pathwise stable differential equations which are projected onto a convex set, and make the following set of assumptions.
the semiflow of Carathéodory solutions of  (9)  
It should be noted that the final inequality in (9) holds for the subgradient method (8), which is evident from the proof of the pathwise stability of the gradient method presented in [18, Appendix B].
A simple first result is that the projected dynamics are still pathwise stable.
Lemma.
Let (9) hold. Then is pathwise stable.
Our main result on such projected differential equations is that, even though the projection term gives a discontinuous vector field, when we restrict our attention to the limit set, the vector field is . This allows us to replace nonsmooth analysis with smooth analysis when studying the asymptotic behaviour of such systems.
Theorem.
Let (9) hold and assume that the semiflow has an equilibrium point. Let be its limit set. Then defines a flow of isometries given by solutions to the following differential equation, which has a vector field,
(10) 
Here is the affine span of the (unique) minimal face (see subsubsection IIA3) of that contains the set of equilibrium points of the semiflow.
Remark.
The existence of a minimal face of that contains the set of equilibrium points is a simple consequence of the definition of a face (see subsubsection IIA3 and the discussion that follows). The important part of subsection IVA is that the dynamics on are given by (10), i.e. the projection operator in (8) becomes which does not depend on the position .
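The position independence noted in the remark above can be illustrated with a short sketch: on an affine subspace the normal cone is the orthogonal complement of the subspace directions, so the projection is a single constant matrix. The function names and the example subspace below are hypothetical, and the columns of the basis matrix are assumed to be linearly independent.

```python
import numpy as np

def normal_projector(B):
    """Orthogonal projector onto the orthogonal complement of span(B).

    B is an (n, k) array whose columns are assumed linearly independent
    and span the directions of the affine subspace.
    """
    Q, _ = np.linalg.qr(B)  # columns of Q: orthonormal basis of span(B)
    return np.eye(B.shape[0]) - Q @ Q.T

def field_on_subspace(f_value, B):
    """Return f - P_N(f), which keeps only the component of f along span(B)."""
    return f_value - normal_projector(B) @ f_value

# Hypothetical subspace {z in R^3 : z_1 = 0}, spanned by e_2 and e_3.
B = np.array([[0.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0]])
v = field_on_subspace(np.array([3.0, -1.0, 2.0]), B)
print(v)
```

Because the projector is the same matrix at every point, the projected vector field is as regular as the unprojected one, which is why smooth analysis becomes available on the limit set.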
IVB The subgradient method
We now apply these results to the subgradient method. Our first result reduces the study of the convergence on general convex domains, where the subgradient method is nonsmooth, to the study of convergence of the subgradient method on affine spaces, which is a smooth dynamical system studied in [18]. We also show that when an internal saddle point exists, the limiting behaviour of the subgradient method is determined by that of the corresponding unconstrained gradient method.
As in part I of this work [18], given a concave-convex function we define the following

is the set of saddle points of

is the set of solutions to the gradient method (7) (i.e. no projections included) that lie at a constant distance from any saddle point.
Theorem.
Let be nonempty, closed and convex. Let be , concave-convex on and have a restricted saddle point. Let denote the subgradient method (8) on and be its limit set. Then is convex, and defines a flow of isometries. Furthermore, the following hold:
(i) The trajectories of solve the ODE:
(11)
where is the affine span of , with being the minimal face containing all restricted saddle points.
(ii) If there exists a saddle point of in the interior of , then
(12)
where is as defined before the theorem statement.
Remark.
The ODE (11) is the subgradient method on the affine subspace . A main significance of subsection IVB is the fact that the solutions of (11) in can be characterized using the results in part I [18]. In particular, it follows from Theorem A in Appendix A that these satisfy explicit linear ODEs. This therefore shows that even though the subgradient dynamics are nonlinear and nonsmooth their limit set is comprised of solutions to only linear ODEs (stated in subsection IVC).
Remark.
Later, in subsection IVC we use the results in [18] on the subgradient method on affine subspaces together with subsection IVB to obtain a convergence criterion for the subgradient method. This is used subsequently to give proofs for the applications considered in section V.
Remark.
It will be discussed in the proof of subsection IVB that subsection IVB(ii) is a special case of subsection IVB(i) where the projection term in (11) is equal to zero. In subsection IVB(ii) there is a simple characterization of the limiting solutions of the subgradient method, as just the limiting solutions of the corresponding gradient method that lie in . Note that the set in (12) was exactly characterized in [18].
Remark.
A simple consequence of (12) is that if there exists a saddle point in the interior of then the subgradient method is globally convergent if the corresponding unconstrained gradient method is globally convergent.
We now present several examples to illustrate the application of subsection IVB in some simple cases.
The first example corresponds to a case where the unconstrained gradient method (7) is globally convergent, but the subgradient method is not.
Example.
Define the concave-convex function
(13) 
where . This has a single saddle point at , and is the Lagrangian of the optimisation problem
(14) 
where variable in function is the Lagrange multiplier associated with the constraint. On this function the gradient method is the linear system
(15) 
It is easily verified that all the eigenvalues of this matrix lie in the left half plane, so that the gradient method is globally convergent. Now consider the family of convex sets defined by
(16) 
for . The subgradient method on is given by the system
(17)  
The convergence of the subgradient method on depends crucially on the value of . There are three cases:

(i) : In this case the saddle point lies in the interior of so that subsection IVB(ii) applies, and as the unconstrained gradient method is globally convergent, so is the subgradient method on .
(ii) : Here the unconstrained saddle point lies outside . A simple computation shows that the point is the only restricted saddle point. subsection IVB(i) can be used here. The only proper face of is the set
(18)
The subgradient method on is the system
(19)
together with the equality . This matrix has imaginary eigenvalues , showing that the subgradient method on is not globally convergent. It is easy to verify that some of these oscillatory solutions are also solutions of the subgradient method on . Therefore the subgradient method on is not globally convergent when .
(iii) : In this case the saddle point lies on the boundary of . subsection IVB(i) applies, and the analysis of the subgradient method on is the same as in case (ii) above. However, when we check whether any oscillatory solutions of the subgradient method on are also solutions of the subgradient method on , we find that there are no such solutions. Indeed, for a trajectory to be a solution to both the subgradient method on and the subgradient method on we must have both and by (17). Then (17) implies that and then that . So the only such solution is the saddle point. Therefore the subgradient method on is globally convergent.
This shows that the subgradient method on undergoes a bifurcation at .
The following example illustrates that the subgradient method can be globally convergent when the gradient method is not.
Example.
Define the concave-convex function
(20) 
This has a single saddle point at and corresponds to the optimisation problem
(21) 
where the constraint is relaxed via the Lagrange multiplier . The gradient method applied to is the linear system
(22) 
whose matrix has eigenvalues so the gradient method is not globally convergent. We again consider the subgradient method on the closed convex set defined by (16) for splitting into three cases:

(i) : As in subsection IVB(i) the saddle point lies in the interior of . As the unconstrained gradient method is not globally convergent, subsection IVB(ii) implies that the subgradient method on is also not globally convergent.
(ii) : The subgradient method on is given by
(23)
The saddle point lies outside . For to be a restricted saddle point, (23) implies that , but this is impossible in , so there are no restricted saddle points. This can also be understood in terms of the optimisation problem (21) which has empty feasible set if we impose the further condition that . This means that none of our results apply, but a direct analysis of (23) shows that so that as , and the system is not globally convergent.
(iii) : Solving (23) for the restricted saddle points yields the continuum . None of these lie in the interior of , so subsection IVB(ii) does not apply and subsection IVB(i) is used to analyze the asymptotic behaviour. The only proper face of is defined by (18). On , the subgradient method is the system
(24)
together with the equality , which is clearly globally convergent, noting that the set of restricted saddle points is . Therefore the subgradient method on is also globally convergent.
So in this case the subgradient method on starts nonconvergent for , becomes globally convergent for and finally loses all its equilibrium points when .
Although the minimal face in subsection IVB(i) is given as the intersection of all faces that contain restricted saddle points, it can be useful to obtain convergence criteria that do not depend upon knowledge of all restricted saddle points. We note that if the subgradient method is globally convergent on any affine span of a face of , then global convergence is implied.
Corollary.
Let be nonempty, closed and convex. Let be and concave-convex on . Let have a restricted saddle point. Assume that, for any face of that contains a restricted saddle point, the subgradient method on is globally convergent. Then the subgradient method on is globally convergent.
Example.
To illustrate this result, let us consider the case of positivity constraints, where are restricted to . Here the faces of are given by sets of the form
where and are sets of indices. The affine span of such a face is then given by
(25) 
Thus, by subsection IVB, checking convergence of the subgradient method in this case may be done by checking convergence of the gradient method with any arbitrary set of coordinates fixed as zero.
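The check described in this example can be sketched for the special case of a linear gradient method, which is an assumption of this sketch. The function phi(x1, x2, y) = -x1²/2 + y(x1 + x2) and all names below are hypothetical and chosen only for illustration. Fixing a set of coordinates at zero reduces the dynamics to the corresponding principal submatrix, and for a pathwise stable linear system a nonzero purely imaginary eigenvalue of that submatrix signals sustained oscillations on the affine span of the face.

```python
import itertools
import numpy as np

# Assumed toy example (not from the paper): phi(x1, x2, y) = -x1**2/2 + y*(x1 + x2),
# whose unconstrained gradient method is the linear system z' = A z, z = (x1, x2, y).
A = np.array([[-1.0,  0.0, 1.0],
              [ 0.0,  0.0, 1.0],
              [-1.0, -1.0, 0.0]])

def oscillates(M, tol=1e-9):
    """True if M has a nonzero purely imaginary eigenvalue."""
    ev = np.linalg.eigvals(M)
    return bool(np.any((np.abs(ev.real) < tol) & (np.abs(ev.imag) > tol)))

def oscillating_faces(A):
    """Index sets I such that fixing z_i = 0 for i in I gives sustained oscillations."""
    n = A.shape[0]
    bad = []
    for k in range(n + 1):
        for fixed in itertools.combinations(range(n), k):
            free = [i for i in range(n) if i not in fixed]
            if not free:
                continue  # every coordinate is fixed: the face is a single point
            reduced = A[np.ix_(free, free)]  # dynamics on the span {z_i = 0, i in fixed}
            if oscillates(reduced):
                bad.append(fixed)
    return bad

print(oscillating_faces(A))
```

In this toy system the unconstrained dynamics are convergent, yet fixing the first coordinate at zero leaves a pure rotation in the remaining two. Whether such oscillations are actual trajectories of the constrained dynamics still has to be checked separately, as in the examples above.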
In some cases the faces of the constraint set have an interpretation in terms of the specific problem.
Example.
Consider the optimisation problem
(26) 
where are concave functions in . This is associated with the Lagrangian
(27) 
where is a vector of Lagrange multipliers
(28) 
which is associated with the modified optimisation problem
(29) 
where, compared to (26), the inequality constraints are replaced by equality constraints, and some subset of the constraints are removed.
If is concave-convex on then subsection IVB applies. We obtain that the subgradient method on applied to is globally convergent, if, for any , the gradient method applied to the Lagrangian corresponding to the modified optimisation problem (29) is globally convergent.
IVC A general convergence criterion
By combining subsection IVB with the results on the limiting solutions of the (smooth) subgradient method on affine subspaces given in [18] we obtain the following convergence criterion for the subgradient method on arbitrary convex sets and arbitrary concave-convex functions. This states that the subgradient method is globally convergent if it has no nonconstant trajectory satisfying an explicit linear ODE.
To state the theorem we recall from [18] the definition of the following matrices of partial derivatives of a concave-convex function
(30)  
The theorem is stated under the assumption that is a restricted saddle point. The general case is obtained by a translation of coordinates.
Theorem.
Let be nonempty, closed and convex in with . Let be concave-convex on and have as a restricted saddle point. Let be the minimal face of that contains all restricted saddle points and let be the affine span of . Let be the orthogonal projection matrix onto the orthogonal complement of . Let also and be the matrices defined in (30).
Then if the subgradient method (8) on applied to has no nonconstant trajectory that satisfies both the following

the linear ODE
(31) 
for all and ,
(32)
then the subgradient method is globally convergent.
Remark.
Although the condition (32) appears difficult to verify, it is only necessary to show that the condition does not hold (by nontrivial trajectories) in order to prove global convergence. This turns out to be easy in many cases, for example in the proofs of the convergence of the modification methods discussed in section V (subsubsection VB4).
Remark.
V Applications
In this section we apply the results of section IV to obtain global convergence in a number of cases. First we consider the subgradient method applied to a strictly concave-convex function on an arbitrary convex domain. Then we look at some examples of modification methods, relevant in network optimization, where the concave-convex function is modified to provide guarantees of convergence. The application of one such modification method to the problem of multipath routing is also discussed in Appendix B.
The proofs for this section are provided in section VII.
VA Convergence for strictly concave-convex functions on arbitrary convex domains
The convergence of the subgradient method when applied to functions that are strictly concave-convex (i.e. at least one of the concavity or convexity is strict) was proved by Arrow, Hurwicz and Uzawa [1] under positivity constraints. More recently, [12] and [4] revisited this result, giving more modern proofs in the case where the concave-convex function has the form (2) with and strictly concave, with further extensions provided in [5] for concave-convex functions with positivity constraints in one of the variables. The case of restriction of a general concave-convex function to an arbitrary convex set appears to be unknown in the literature (the theory for discrete time subgradient methods is more complete, see e.g. [33]). Using the results established in the previous section we prove here that for a nonempty closed convex set the subgradient method on applied to a strictly concave-convex function is globally convergent.
Theorem.
Let be nonempty, closed and convex. Let be and strictly concave-convex on , and have a restricted saddle point. Then the subgradient method (8) on is globally convergent.
Remark \theremark.
It follows from the proof of Theorem VA that it is sufficient for the concaveconvex function to be strictly concaveconvex only in an open ball about the saddle point.
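As an informal illustration of the statement above, the following minimal sketch integrates a projected saddle point flow with a forward-Euler scheme. Everything in it is an assumption made for illustration and is not taken from the paper: the function phi(x, y) = -x^2/2 + x*y (strictly concave in x, linear in y), the box [1, 2] x [-1, 3], the step size, and the helper names `clip` and `projected_saddle_flow`. We also assume the convention that x ascends and y descends the function. The discrete projected iteration is only a numerical stand-in for the continuous-time subgradient method (8), not an implementation of it.

```python
def clip(value, lower, upper):
    """Project a scalar onto the interval [lower, upper]."""
    return max(lower, min(upper, value))


def projected_saddle_flow(x0, y0, step=0.01, n_steps=2000):
    """Forward-Euler sketch of projected ascent in x and descent in y.

    Hypothetical example: phi(x, y) = -x**2 / 2 + x * y restricted to the
    box [1, 2] x [-1, 3]. This approximates, but is not, the continuous-time
    subgradient dynamics.
    """
    x, y = x0, y0
    for _ in range(n_steps):
        dphi_dx = -x + y   # partial derivative of phi with respect to x
        dphi_dy = x        # partial derivative of phi with respect to y
        x_next = clip(x + step * dphi_dx, 1.0, 2.0)   # ascent step, then projection
        y_next = clip(y - step * dphi_dy, -1.0, 3.0)  # descent step, then projection
        x, y = x_next, y_next
    return x, y


x_final, y_final = projected_saddle_flow(2.0, 3.0)
print(x_final, y_final)
```

For this particular example the restricted saddle point on the box is (1, -1): y minimizes x*y over [-1, 3] at y = -1 because x is positive, and x then maximizes -x^2/2 - x over [1, 2] at x = 1. The iterates settle there, with both constraints active, even though the unconstrained saddle point (0, 0) lies outside the box.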
VB Modification methods for convergence
We will consider methods for modifying so that the (sub)gradient method converges to a saddle point. The methods that will be discussed are relevant in network optimisation (see e.g. [1], [12]), as they preserve the localised structure of the dynamics. It should be noted that these modifications do not necessarily render the function strictly concaveconvex, and hence the convergence proofs are more involved. We show below that the results in section IV provide a systematic and unified way of proving convergence by making use of Theorem IVC, while also allowing these methods to be considered in the more general setting of an arbitrary convex domain.
Auxiliary variables method
Given a concaveconvex function defined on a convex domain , we define the modified concaveconvex function as
(33)  
where is a vector of auxiliary variables, and is a constant matrix that satisfies for a restricted saddle point of .
We define the augmented convex domain as . Note that the additional auxiliary variables are not restricted and are allowed to take values in the whole of . Also note that the identity matrix always satisfies the assumptions upon above.
Remark \theremark.
An important feature of this modification (and also the ones that will be considered below) is the fact that there is a correspondence between restricted saddle points of and restricted saddle points of , with the values of at the saddle points remaining unchanged. In particular, if is a restricted saddle point of , then is a restricted saddle point of . In the reverse direction, if is a restricted saddle point of then and is a restricted saddle point of .
Remark \theremark.
The significance of this method will become more clear in the multipath routing problem discussed in Appendix B. In particular, this method allows convergence to be guaranteed in network optimization problems without introducing additional information transfer among nodes. Special cases of this method have also been used in [9], [19] in applications in economic and power networks.
Penalty function method
For this and the next method we will assume that the concaveconvex function is a Lagrangian originating from a concave optimization problem (see subsubsection IIA2). We will assume that the Lagrangian satisfies
(34)  
We consider a so-called penalty method (see e.g. [14]). This method adds a penalising term to the Lagrangian based directly on the constraint functions. The new Lagrangian is defined by
(35)  
It is easy to see that the saddle points of and are the same.
Remark \theremark.
This modification method is also often applied to network optimization problems, i.e. problems where is of the form and each of the is a function of only a few of the components of . Similarly each component, , of the constraints depends on only a few of the components of . The subgradient method for such problems applied to (34) has a decentralized structure. When applied to the modified version (35) the dynamics will still have a decentralized structure, but will often also involve additional information exchange between neighboring nodes, e.g. when is linear, due to the nonlinearity of the function .
Remark \theremark.
This method has been considered previously by many authors (see [12] and the references therein).
Constraint modification method
We next recall a method proposed by Arrow et al. [1] and later studied in [12]. Here we instead modify the constraints to enforce strict concavity. The Lagrangian (34) is modified to become:
(36)  
It is clear that the value of at the saddle points of the modified and original Lagrangian will be the same. In analogy with Remark VB2, this method also preserves the decentralized structure of the subgradient method for network optimization problems, but may require additional information transfer.
Remark \theremark.
Previous works [1],[12],[4] have proved convergence of this method with positivity constraints, i.e. . subsubsection VB4 below applies to any constraint set which is a product set with , both nonempty closed and convex.
Convergence results
We now give a global convergence result for each of the methods described above on general convex domains.
Theorem \thetheorem (Convergence of modification methods).
Assume that , and satisfy one of the following:

Auxiliary variable method: Let be concaveconvex on a nonempty closed convex set. Let and be defined by (33) and the text directly below it.
Also assume that has a restricted saddle point. Then the subgradient method (8) applied to on domain in and domain in is globally convergent.
Remark \theremark.
Each of the convergence results in subsubsection VB4 is proved using subsection IVC. It should also be noted that none of the modification methods necessarily produces a strictly concaveconvex function . Global convergence to a saddle point is nevertheless guaranteed by ensuring that no trajectory, other than a saddle point, satisfies conditions (31), (32) in subsection IVC.
VI Proofs of the main results
In this section we prove the main results of the paper, which are stated in section IV.
VIA Outline of the proofs
We first give a brief outline of the derivations of the results to improve their readability.
Pathwise stability and convex projections
In subsection VIB we prove the results described in subsection IVA.
We revisit some of the literature on topological dynamical systems [6], quoting a more general result, subsection VIB, from which subsection IVA is deduced. Then subsection IVA is proved using the convexity of the domain . The combination of these results allows us to prove the main result of the subsection, subsection IVA, using the fact that the convex projection term cannot break the isometry property of the flow on the limit set.
Subgradient method
In subsections VIC, VID we prove the results in subsections IVB, IVC, respectively, using the results in subsection IVA.
VIB Convergence to a flow of isometries
In this section we provide the proofs of subsection IVA, subsection IVA and subsection IVA.
We begin by revisiting the literature on topological dynamical systems, in which a type of incremental stability is studied, and show how this leads to an invariance principle for pathwise stability.
Definition \thedefinition (Equicontinuous semiflow).
We say that a flow (resp. semiflow) is equicontinuous if for any and there is a such that if then
(37) 
Remark \theremark.
In the control literature equicontinuity of a semiflow would correspond to ‘semiglobal nonasymptotic incremental stability’, but we shall keep the term equicontinuity for brevity and consistency with [6].
Definition \thedefinition (Uniformly almost periodic flow).
We say that a flow is uniformly almost periodic if for any there is a syndetic set , (i.e. for some compact set ), for which
(38) 
For the reader's convenience we reproduce the results that we will use, namely [6, Theorem 8] and [11, Proposition 4.4].
Theorem \thetheorem (G. Della Riccia [6]).
Let be an equicontinuous semiflow and let be either locally compact or complete. Let be its limit set. Then is an equicontinuous semiflow of homeomorphisms of onto . This generates an equicontinuous flow.
The backwards flow given by subsection VIB is only unique on , (see subsection IVA which also applies here).
Proposition \theproposition (R. Ellis [11]).
Let be a flow, with compact. Then the following are equivalent:

The flow is equicontinuous.

The flow is uniformly almost periodic.
In our case we study pathwise stability which is a particular form of equicontinuity. We prove stronger results in this special case.
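To make the notions of equicontinuity and uniform almost periodicity more concrete, the sketch below checks them numerically on a simple example chosen for illustration: the rotation flow of x' = y, y' = -x, which is the gradient dynamics of the bilinear function phi(x, y) = x*y under the assumed convention that x ascends and y descends. The example, the test points and the helper names `rotation_flow` and `dist` are our own assumptions and do not appear in the paper. Since each time-t map is a rotation, distances between trajectories are preserved, and every trajectory returns to its starting point at the times 2*pi*k, which form a syndetic set.

```python
import math


def rotation_flow(t, point):
    """Exact time-t flow map of x' = y, y' = -x (a hypothetical example)."""
    x, y = point
    c, s = math.cos(t), math.sin(t)
    return (c * x + s * y, -s * x + c * y)


def dist(p, q):
    """Euclidean distance between two points of the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


a, b = (1.0, 0.0), (0.5, -2.0)

# Isometry: the distance between two trajectories does not change with time.
for t in (0.3, 1.7, 10.0):
    drift = dist(rotation_flow(t, a), rotation_flow(t, b)) - dist(a, b)
    print(f"t = {t}: change in distance = {drift:.2e}")

# Almost periodicity: the flow returns to its starting point at t = 2*pi*k.
for k in (1, 2, 3):
    gap = dist(rotation_flow(2 * math.pi * k, a), a)
    print(f"k = {k}: distance from start = {gap:.2e}")
```

Both printed quantities vanish up to floating-point rounding. This is the simplest instance of a flow of isometries of the kind discussed in this subsection.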
Proof of subsection IVA.
By subsection VIB is an equicontinuous flow with an equilibrium point . Let be arbitrary, and define
(39) 
As the flow is equicontinuous, is a closed bounded subset of and hence compact, and moreover, the union of the sets over is . By subsection VIB the flow is uniformly almost periodic. By pathwise stability, is nonincreasing along the direct product flow, and is a continuous function on a compact set. Hence we have the inequality, for any two points ,
(40)  
We claim that the two limits are equal. Indeed, by uniform almost periodicity there are sequences and as for which
(41) 
and the analogous limits hold for for the same sequences . Hence, by continuity of , we have
(42) 
Hence is constant. By picking big enough, this holds for any , which completes the proof that the subsemiflow generates a flow of isometries.
It remains to show that is convex. To this end let be two trajectories of . Let and define . By the same argument as used in the proof of [18, Proposition 28] we deduce that is a trajectory of the original semiflow, but (as argued above) by uniform almost periodicity of we have a sequence of times for which as and the same limit for . Hence also, showing that is in the limit set. ∎
We now work under the set of assumptions (9) and consider projected pathwise stable differential equations.
Proof of subsection IVA.
Let and be two arbitrary solutions to the projected ODE, and define . Then is absolutely continuous and for almost all times we have,
(43)  