
# The role of convexity on saddle-point dynamics: Lyapunov function and robustness

## 1 Introduction

Saddle-point dynamics and its variations have been used extensively in the design and analysis of distributed feedback controllers and optimization algorithms in several domains, including power networks, network flow problems, and zero-sum games. The analysis of the global convergence of this class of dynamics typically relies on some global strong/strict convexity-concavity property of the saddle function defining the dynamics. The main aim of this paper is to refine this analysis by unveiling two ways in which convexity-concavity of the saddle function plays a role. First, we show that local strong convexity-concavity is enough to conclude global asymptotic convergence, thus generalizing previous results that rely on global strong/strict convexity-concavity instead. Second, we show that, if global strong convexity-concavity holds, then one can identify a novel Lyapunov function for the projected saddle-point dynamics for the case when the saddle function is the Lagrangian of a constrained optimization problem. This, in turn, implies a stronger form of convergence, namely input-to-state stability (ISS), and has important implications for the practical implementation of the saddle-point dynamics.

## 2 Preliminaries

This section introduces our notation and preliminary notions on convex-concave functions, discontinuous dynamical systems, and input-to-state stability.

### 2.1 Notation

Let $\mathbb{R}$, $\mathbb{R}_{\ge 0}$, and $\mathbb{N}$ denote the set of real, nonnegative real, and natural numbers, respectively. We let $\|\cdot\|$ denote the $2$-norm on $\mathbb{R}^n$ and the respective induced norm on $\mathbb{R}^{n \times m}$. Given $x \in \mathbb{R}^n$, $x_i$ denotes the $i$-th component of $x$, and $x_{i:j}$ denotes the vector $(x_i, \dots, x_j)$ for $i < j$. For vectors $u \in \mathbb{R}^n$ and $w \in \mathbb{R}^p$, the vector $(u; w) \in \mathbb{R}^{n+p}$ denotes their concatenation. For $a, b \in \mathbb{R}$ with $b \ge 0$, we let

$$[a]_b^+ = \begin{cases} a, & \text{if } b > 0, \\ \max\{0, a\}, & \text{if } b = 0. \end{cases}$$

For vectors $a, b \in \mathbb{R}^n$, $[a]_b^+$ denotes the vector whose $i$-th component is $[a_i]_{b_i}^+$, for $i \in \{1, \dots, n\}$. Given a set $\mathcal{S} \subset \mathbb{R}^n$, we denote by $\mathrm{cl}(\mathcal{S})$, $\mathrm{int}(\mathcal{S})$, and $|\mathcal{S}|$ its closure, interior, and cardinality, respectively. The distance of a point $x$ to the set $\mathcal{S}$ in $2$-norm is $\|x\|_{\mathcal{S}} = \inf_{y \in \mathcal{S}} \|x - y\|$. The projection of $x$ onto a closed set $\mathcal{S}$ is defined as the set $\mathrm{proj}_{\mathcal{S}}(x) = \{y \in \mathcal{S} \mid \|x - y\| = \|x\|_{\mathcal{S}}\}$. When $\mathcal{S}$ is also convex, $\mathrm{proj}_{\mathcal{S}}(x)$ is a singleton for any $x \in \mathbb{R}^n$. For a matrix $A \in \mathbb{R}^{n \times n}$, we use $A \succeq 0$, $A \succ 0$, $A \preceq 0$, and $A \prec 0$ to denote that $A$ is positive semidefinite, positive definite, negative semidefinite, and negative definite, respectively. For a symmetric matrix $A$, $\lambda_{\min}(A)$ and $\lambda_{\max}(A)$ denote the minimum and maximum eigenvalue of $A$. For a real-valued function $F : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}$, $(x, y) \mapsto F(x, y)$, we denote by $\nabla_x F$ and $\nabla_y F$ the column vectors of partial derivatives of $F$ with respect to the first and second arguments, respectively. Higher-order derivatives follow the convention $\nabla_{xy} F = \frac{\partial^2 F}{\partial x \partial y}$, $\nabla_{xx} F = \frac{\partial^2 F}{\partial x^2}$, and so on. A function $\alpha : \mathbb{R}_{\ge 0} \to \mathbb{R}_{\ge 0}$ is class $\mathcal{K}$ if it is continuous, strictly increasing, and $\alpha(0) = 0$. The unbounded class $\mathcal{K}$ functions are called class $\mathcal{K}_\infty$ functions. A function $\beta : \mathbb{R}_{\ge 0} \times \mathbb{R}_{\ge 0} \to \mathbb{R}_{\ge 0}$ is class $\mathcal{KL}$ if, for any $t \ge 0$, the map $r \mapsto \beta(r, t)$ is class $\mathcal{K}$ and, for any $r \ge 0$, the map $t \mapsto \beta(r, t)$ is continuous and decreasing with $\beta(r, t) \to 0$ as $t \to \infty$.
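As a quick sanity check, the componentwise operator $[a]_b^+$ can be sketched in a few lines of code. This helper is purely illustrative (the name `proj_plus` is ours, not the paper's):

```python
import numpy as np

def proj_plus(a, b):
    """Componentwise operator [a]_b^+: keeps a_i when b_i > 0,
    and clips a_i at zero when b_i = 0."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return np.where(b > 0, a, np.maximum(a, 0.0))
```

When $b_i = 0$, a negative $a_i$ is zeroed out; this is exactly what keeps a nonnegative variable driven by such a vector field from leaving $\mathbb{R}_{\ge 0}$.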

### 2.2 Saddle points and convex-concave functions

Here, we review notions of convexity, concavity, and saddle points from [21]. A function $f : \mathcal{X} \to \mathbb{R}$ is convex if

$$f(\lambda x + (1 - \lambda) x') \le \lambda f(x) + (1 - \lambda) f(x'),$$

for all $x, x' \in \mathcal{X}$ (where $\mathcal{X} \subset \mathbb{R}^n$ is a convex domain) and all $\lambda \in [0, 1]$. A convex differentiable $f$ satisfies the following first-order convexity condition:

$$f(x') \ge f(x) + (x' - x)^\top \nabla f(x),$$

for all $x, x' \in \mathcal{X}$. A twice differentiable function $f$ is locally strongly convex at $x \in \mathcal{X}$ if $f$ is convex and $\nabla^2 f(x) \succeq m I$ for some $m > 0$ (note that, by continuity of $\nabla^2 f$, this is equivalent to having $\nabla^2 f \succ 0$ in a neighborhood of $x$). Moreover, a twice differentiable $f$ is strongly convex if $\nabla^2 f(x) \succeq m I$ for all $x \in \mathcal{X}$, for some $m > 0$. A function $f$ is concave, locally strongly concave, or strongly concave if $-f$ is convex, locally strongly convex, or strongly convex, respectively. A function $F : \mathcal{X} \times \mathcal{Y} \to \mathbb{R}$ is convex-concave (on $\mathcal{X} \times \mathcal{Y}$) if, given any point $(\tilde{x}, \tilde{y}) \in \mathcal{X} \times \mathcal{Y}$, the map $x \mapsto F(x, \tilde{y})$ is convex and the map $y \mapsto F(\tilde{x}, y)$ is concave. When the space $\mathcal{X} \times \mathcal{Y}$ is clear from the context, we refer to this property as $F$ being convex-concave in $(x, y)$. A point $(x_*, y_*) \in \mathcal{X} \times \mathcal{Y}$ is a saddle point of $F$ on the set $\mathcal{X} \times \mathcal{Y}$ if $F(x_*, y) \le F(x_*, y_*) \le F(x, y_*)$, for all $x \in \mathcal{X}$ and $y \in \mathcal{Y}$. The set of saddle points of a convex-concave function $F$ is convex. The function $F$ is locally strongly convex-concave at a saddle point $(x_*, y_*)$ if it is convex-concave and either $\nabla_{xx} F(x_*, y_*) \succeq m I$ or $\nabla_{yy} F(x_*, y_*) \preceq -m I$ for some $m > 0$. Finally, $F$ is globally strongly convex-concave if it is convex-concave and either $x \mapsto F(x, y)$ is strongly convex for all $y \in \mathcal{Y}$ or $y \mapsto F(x, y)$ is strongly concave for all $x \in \mathcal{X}$.
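For intuition, $F(x, y) = x^2 - y^2$ is strongly convex-concave with unique saddle point at the origin; the defining saddle inequalities can be spot-checked numerically (an illustrative check, not part of the paper's development):

```python
import numpy as np

# F(x, y) = x^2 - y^2 is convex in x and concave in y, with saddle
# point (0, 0): F(0, y) <= F(0, 0) <= F(x, 0) for all x and y.
F = lambda x, y: x**2 - y**2
grid = np.linspace(-2.0, 2.0, 41)
assert all(F(0.0, y) <= F(0.0, 0.0) <= F(x, 0.0) for x in grid for y in grid)
```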

### 2.3 Discontinuous dynamical systems

Here we present notions of discontinuous dynamical systems following [22, 23]. Let $f : \mathbb{R}^n \to \mathbb{R}^n$ be Lebesgue measurable and locally bounded. Consider the differential equation

 ˙x=f(x). (1)

A map $\gamma : [0, T) \to \mathbb{R}^n$ is a (Caratheodory) solution of (1) on the interval $[0, T)$ if it is absolutely continuous on $[0, T)$ and satisfies $\dot{\gamma}(t) = f(\gamma(t))$ almost everywhere in $[0, T)$. We use the terms solution and trajectory interchangeably. A set $\mathcal{S} \subset \mathbb{R}^n$ is invariant under (1) if every solution starting in $\mathcal{S}$ remains in $\mathcal{S}$. For a solution $\gamma$ of (1) defined on the time interval $[0, \infty)$, the omega-limit set $\Omega(\gamma)$ is defined by

$$\Omega(\gamma) = \Big\{ y \in \mathbb{R}^n \;\Big|\; \exists \{t_k\}_{k=1}^{\infty} \subset [0, \infty) \text{ with } \lim_{k \to \infty} t_k = \infty \text{ and } \lim_{k \to \infty} \gamma(t_k) = y \Big\}.$$

If the solution $\gamma$ is bounded, then $\Omega(\gamma) \neq \emptyset$ by the Bolzano-Weierstrass theorem [24, p. 33]. Given a continuously differentiable function $V : \mathbb{R}^n \to \mathbb{R}$, the Lie derivative of $V$ along (1) at $x \in \mathbb{R}^n$ is $\mathcal{L}_f V(x) = \nabla V(x)^\top f(x)$. The next result is a simplified version of [22, Proposition 3].

###### Proposition 2.1

(Invariance principle for discontinuous Caratheodory systems): Let $\mathcal{S} \subset \mathbb{R}^n$ be compact and invariant. Assume that, for each point $x \in \mathcal{S}$, there exists a unique solution of (1) starting at $x$ and that its omega-limit set is invariant too. Let $V : \mathbb{R}^n \to \mathbb{R}$ be a continuously differentiable map such that $\mathcal{L}_f V(x) \le 0$ for all $x \in \mathcal{S}$. Then, any solution of (1) starting in $\mathcal{S}$ converges to the largest invariant set contained in $\mathrm{cl}(\{x \in \mathcal{S} \mid \mathcal{L}_f V(x) = 0\})$.
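A smooth special case illustrates how the proposition is used: for $\dot{x}_1 = x_2$, $\dot{x}_2 = -x_1 - x_2$ with $V(x) = \frac{1}{2}\|x\|^2$, one has $\mathcal{L}_f V = -x_2^2 \le 0$, and the largest invariant set inside $\{x_2 = 0\}$ is the origin, so all trajectories converge there. The Euler sketch below (ours, not from [22]) confirms this numerically:

```python
import numpy as np

def simulate_oscillator(x0, h=1e-3, T=40.0):
    """Forward-Euler integration of x1' = x2, x2' = -x1 - x2."""
    x = np.array(x0, dtype=float)
    for _ in range(int(T / h)):
        x1, x2 = x
        x = x + h * np.array([x2, -x1 - x2])
    return x

# V = ||x||^2 / 2 decays along trajectories; LaSalle gives convergence to 0.
final = simulate_oscillator([2.0, -1.0])
assert np.linalg.norm(final) < 1e-6
```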

### 2.4 Input-to-state stability

Here, we review the notion of input-to-state stability (ISS) following [25]. Consider a system

 ˙x=f(x,u), (2)

where $x \in \mathbb{R}^n$ is the state, $u : \mathbb{R}_{\ge 0} \to \mathbb{R}^m$ is the input, assumed measurable and locally essentially bounded, and $f : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^n$ is locally Lipschitz. Assume that, starting from any point in $\mathbb{R}^n$, the trajectory of (2) is defined on $[0, \infty)$ for any given control. Let $\mathrm{Eq}(f) \subset \mathbb{R}^n$ denote the set of equilibrium points of the unforced system. Then, the system (2) is input-to-state stable (ISS) with respect to $\mathrm{Eq}(f)$ if there exist $\beta \in \mathcal{KL}$ and $\gamma \in \mathcal{K}$ such that each trajectory of (2) satisfies

$$\|x(t)\|_{\mathrm{Eq}(f)} \le \beta(\|x(0)\|_{\mathrm{Eq}(f)}, t) + \gamma(\|u\|_\infty)$$

for all $t \ge 0$, where $\|u\|_\infty$ is the essential supremum (see [24, p. 185] for the definition) of $u$. This notion captures the graceful degradation of the asymptotic convergence properties of the unforced system as the size of the disturbance input grows. One convenient way of showing ISS is by finding an ISS-Lyapunov function. An ISS-Lyapunov function with respect to the set $\mathrm{Eq}(f)$ for system (2) is a differentiable function $V : \mathbb{R}^n \to \mathbb{R}_{\ge 0}$ such that

1. there exist $\alpha_1, \alpha_2 \in \mathcal{K}_\infty$ such that for all $x \in \mathbb{R}^n$,

 $$\alpha_1(\|x\|_{\mathrm{Eq}(f)}) \le V(x) \le \alpha_2(\|x\|_{\mathrm{Eq}(f)}); \qquad (3)$$
2. there exist a continuous, positive definite function $\alpha_3$ and $\rho \in \mathcal{K}_\infty$ such that

 $$\nabla V(x)^\top f(x, v) \le -\alpha_3(\|x\|_{\mathrm{Eq}(f)}) \qquad (4)$$

for all $x \in \mathbb{R}^n$ and $v \in \mathbb{R}^m$ for which $\|x\|_{\mathrm{Eq}(f)} \ge \rho(\|v\|)$.
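As a standard scalar illustration (a textbook example, not taken from [25] verbatim), consider $\dot{x} = -x + u$, for which $\mathrm{Eq}(f) = \{0\}$ and $V(x) = \frac{1}{2}x^2$ is an ISS-Lyapunov function:

```latex
\nabla V(x)\, f(x, v) = -x^2 + x v \le -\tfrac{1}{2} x^2
\quad \text{whenever } |x| \ge 2|v|,
```

so conditions (3)-(4) hold with $\alpha_1(r) = \alpha_2(r) = \alpha_3(r) = r^2/2$ and $\rho(r) = 2r$, and Proposition 2.2 then yields ISS with respect to $\{0\}$.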

###### Proposition 2.2

(ISS-Lyapunov function implies ISS): If (2) admits an ISS-Lyapunov function, then it is ISS.

## 3 Problem statement

In this section, we provide a formal statement of the problem of interest. Consider a twice continuously differentiable function $F : \mathbb{R}^n \times \mathbb{R}^p_{\ge 0} \times \mathbb{R}^m \to \mathbb{R}$, $(x, y, z) \mapsto F(x, y, z)$, which we refer to as the saddle function. With the notation of Section 2.2, we set $\mathcal{X} = \mathbb{R}^n$ and $\mathcal{Y} = \mathbb{R}^p_{\ge 0} \times \mathbb{R}^m$, and assume that $F$ is convex-concave on $\mathcal{X} \times \mathcal{Y}$. Let $\mathrm{Saddle}(F)$ denote its (non-empty) set of saddle points. We define the projected saddle-point dynamics for $F$ as

$$\dot{x} = -\nabla_x F(x, y, z), \qquad (5a)$$
$$\dot{y} = [\nabla_y F(x, y, z)]_y^+, \qquad (5b)$$
$$\dot{z} = \nabla_z F(x, y, z). \qquad (5c)$$

When convenient, we use the map $X_{\text{p-sp}} : \mathbb{R}^n \times \mathbb{R}^p_{\ge 0} \times \mathbb{R}^m \to \mathbb{R}^n \times \mathbb{R}^p \times \mathbb{R}^m$ to refer to the dynamics (5). Note that the domain $\mathbb{R}^n \times \mathbb{R}^p_{\ge 0} \times \mathbb{R}^m$ is invariant under $X_{\text{p-sp}}$ (this follows from the definition of the projection operator) and its set of equilibrium points precisely corresponds to $\mathrm{Saddle}(F)$ (this follows from the defining property of saddle points and the first-order condition for convexity-concavity of $F$). Thus, a saddle point $(x_*, y_*, z_*)$ satisfies

$$\nabla_x F(x_*, y_*, z_*) = 0, \qquad \nabla_z F(x_*, y_*, z_*) = 0, \qquad (6a)$$
$$\nabla_y F(x_*, y_*, z_*) \le 0, \qquad y_*^\top \nabla_y F(x_*, y_*, z_*) = 0. \qquad (6b)$$
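For concreteness, when $F$ is the Lagrangian $f(x) + y^\top g(x) + z^\top (Ax - b)$ of a constrained optimization problem (cf. Remark 3.1 for this setting), conditions (6) are precisely the KKT conditions:

```latex
\nabla f(x_*) + \nabla g(x_*)\, y_* + A^\top z_* = 0, \qquad A x_* = b,
\qquad g(x_*) \le 0, \qquad y_* \ge 0, \qquad y_*^\top g(x_*) = 0,
```

where $\nabla g(x_*) \in \mathbb{R}^{n \times p}$ stacks the gradients of the component functions of $g$, and the last identity is complementary slackness.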

Our interest in the dynamics (5) is motivated by two bodies of work in the literature: one that analyzes primal-dual dynamics, corresponding to (5a) together with (5b), for solving inequality constrained network optimization problems, see e.g., [3, 5, 14, 11]; and the other analyzing saddle-point dynamics, corresponding to (5a) together with (5c), for solving equality constrained problems and finding Nash equilibria of zero-sum games, see e.g., [19] and references therein. By considering (5a)-(5c) together, we aim to unify these lines of work. Below, we explain further the significance of the dynamics in solving specific network optimization problems.

###### Remark 3.1

(Motivating examples): Consider the following constrained convex optimization problem

$$\min \{ f(x) \mid g(x) \le 0, \; Ax = b \},$$

where $f : \mathbb{R}^n \to \mathbb{R}$ and $g : \mathbb{R}^n \to \mathbb{R}^p$ are convex continuously differentiable functions, $A \in \mathbb{R}^{m \times n}$, and $b \in \mathbb{R}^m$. Under zero duality gap, saddle points of the associated Lagrangian correspond to the primal-dual optimizers of the problem. This observation motivates the search for the saddle points of the Lagrangian, which can be done via the projected saddle-point dynamics (5). In many network optimization problems, $f$ is the summation of individual costs of agents, and the constraints, defined by $g$ and $(A, b)$, are such that each of their components is computable by one agent interacting with its neighbors. This structure renders the projected saddle-point dynamics of the Lagrangian implementable in a distributed manner. Motivated by this, the dynamics is widespread in network optimization scenarios. For example, in optimal dispatch of power generators [11, 12, 13, 14], the objective function is the sum of the individual cost functions of the generators, the inequalities consist of generator capacity constraints and line limits, and the equalities encode the power balance at each bus. In congestion control of communication networks [4, 26, 5], the cost function is the summation of the negatives of the utilities of the communicated data, the inequalities define constraints on channel capacities, and the equalities encode the data balance at each node.
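As a toy instance of this setup, consider $\min\{x^2 \mid 1 - x \le 0\}$, whose Lagrangian $L(x, y) = x^2 + y(1 - x)$ has the unique saddle point $(x_*, y_*) = (1, 2)$. A forward-Euler discretization of the projected saddle-point dynamics (5) (with the equality-constraint variable $z$ absent) recovers it; the code below is an illustrative sketch under these assumptions, not an implementation from the paper:

```python
import numpy as np

def projected_saddle_point(x0, y0, h=0.01, steps=5000):
    """Euler discretization of (5a)-(5b) for L(x, y) = x^2 + y*(1 - x):
    x' = -(2x - y)  and  y' = [1 - x]_y^+ (projection keeps y >= 0)."""
    x, y = float(x0), float(y0)
    for _ in range(steps):
        grad_x = 2 * x - y                      # gradient of L in x
        grad_y = 1 - x                          # gradient of L in y
        dy = grad_y if y > 0 else max(grad_y, 0.0)
        x, y = x - h * grad_x, max(y + h * dy, 0.0)
    return x, y

x, y = projected_saddle_point(0.0, 0.0)
# Trajectory approaches the primal-dual optimizer (1, 2).
assert abs(x - 1.0) < 1e-3 and abs(y - 2.0) < 1e-3
```

Note that the projection in the $y$-update is only active on the boundary $y = 0$, exactly as in (5b).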

Our main objectives are to identify conditions that guarantee that the set of saddle points is globally asymptotically stable under the dynamics (5) and formally characterize the robustness properties using the concept of input-to-state stability. The rest of the paper is structured as follows. Section 4 investigates novel conditions that guarantee global asymptotic convergence relying on LaSalle-type arguments. Section 5 instead identifies a strict Lyapunov function for constrained convex optimization problems. This finding allows us in Section 6 to go beyond convergence guarantees and explore the robustness properties of the saddle-point dynamics.

## 4 Local properties of the saddle function imply global convergence

Our first result of this section provides a novel characterization of the omega-limit set of the trajectories of the projected saddle-point dynamics (5).

###### Proposition 4.1

(Characterization of the omega-limit set of solutions of $X_{\text{p-sp}}$): Given a twice continuously differentiable, convex-concave function $F$, each point in the set $\mathrm{Saddle}(F)$ is stable under the projected saddle-point dynamics $X_{\text{p-sp}}$, and the omega-limit set of every solution is contained in the largest invariant set in $\mathcal{E}(F)$, where

$$\mathcal{E}(F) = \big\{ (x, y, z) \in \mathbb{R}^n \times \mathbb{R}^p_{\ge 0} \times \mathbb{R}^m \;\big|\; (x - x_*; y - y_*; z - z_*) \in \ker(\bar{H}(x, y, z, x_*, y_*, z_*)) \text{ for all } (x_*, y_*, z_*) \in \mathrm{Saddle}(F) \big\}, \qquad (7)$$

and

$$\bar{H}(x, y, z, x_*, y_*, z_*) = \int_0^1 H(x(s), y(s), z(s))\, ds,$$
$$(x(s), y(s), z(s)) = (x_*, y_*, z_*) + s\, (x - x_*, y - y_*, z - z_*),$$
$$H(x, y, z) = \begin{bmatrix} -\nabla_{xx} F & 0 & 0 \\ 0 & \nabla_{yy} F & \nabla_{yz} F \\ 0 & \nabla_{zy} F & \nabla_{zz} F \end{bmatrix}(x, y, z). \qquad (8)$$
Proof.

The proof follows from the application of the LaSalle Invariance Principle for discontinuous Caratheodory systems (cf. Proposition 2.1). Let $(x_*, y_*, z_*) \in \mathrm{Saddle}(F)$ and let $V_1 : \mathbb{R}^n \times \mathbb{R}^p_{\ge 0} \times \mathbb{R}^m \to \mathbb{R}_{\ge 0}$ be defined as

$$V_1(x, y, z) = \tfrac{1}{2}\big(\|x - x_*\|^2 + \|y - y_*\|^2 + \|z - z_*\|^2\big). \qquad (9)$$

The Lie derivative of $V_1$ along (5) is

$$\begin{aligned}
\mathcal{L}_{X_{\text{p-sp}}} V_1(x, y, z) &= -(x - x_*)^\top \nabla_x F(x, y, z) + (y - y_*)^\top [\nabla_y F(x, y, z)]_y^+ \\
&\quad + (z - z_*)^\top \nabla_z F(x, y, z) \\
&= -(x - x_*)^\top \nabla_x F(x, y, z) + (y - y_*)^\top \nabla_y F(x, y, z) \\
&\quad + (z - z_*)^\top \nabla_z F(x, y, z) \\
&\quad + (y - y_*)^\top \big( [\nabla_y F(x, y, z)]_y^+ - \nabla_y F(x, y, z) \big) \\
&\le -(x - x_*)^\top \nabla_x F(x, y, z) + (y - y_*)^\top \nabla_y F(x, y, z) \\
&\quad + (z - z_*)^\top \nabla_z F(x, y, z), \qquad (10)
\end{aligned}$$

where the last inequality follows from the fact that $(y_i - y_{*,i}) \big( ([\nabla_y F(x, y, z)]_y^+)_i - (\nabla_y F(x, y, z))_i \big) \le 0$ for each $i \in \{1, \dots, p\}$. Indeed, if $y_i > 0$, then $([\nabla_y F]_y^+)_i = (\nabla_y F)_i$ and the term vanishes; and if $y_i = 0$, then $([\nabla_y F]_y^+)_i - (\nabla_y F)_i \ge 0$ and $y_i - y_{*,i} = -y_{*,i} \le 0$, which implies that the product is nonpositive. Next, denoting $\lambda = (y; z)$ and $\lambda_* = (y_*; z_*)$, we simplify the above inequality as

$$\begin{aligned}
\mathcal{L}_{X_{\text{p-sp}}} V_1(x, y, z) &\le -(x - x_*)^\top \nabla_x F(x, \lambda) + (\lambda - \lambda_*)^\top \nabla_\lambda F(x, \lambda) \\
&\overset{(a)}{\le} -(x - x_*)^\top \int_0^1 \big( \nabla_{xx} F(x(s), \lambda(s)) (x - x_*) + \nabla_{\lambda x} F(x(s), \lambda(s)) (\lambda - \lambda_*) \big)\, ds \\
&\quad + (\lambda - \lambda_*)^\top \int_0^1 \big( \nabla_{x \lambda} F(x(s), \lambda(s)) (x - x_*) + \nabla_{\lambda \lambda} F(x(s), \lambda(s)) (\lambda - \lambda_*) \big)\, ds \\
&\overset{(b)}{=} [x - x_*; \lambda - \lambda_*]^\top \bar{H}(x, \lambda, x_*, \lambda_*) \begin{bmatrix} x - x_* \\ \lambda - \lambda_* \end{bmatrix} \overset{(c)}{\le} 0,
\end{aligned}$$

where (a) follows from the fundamental theorem of calculus, using the notation $x(s) = x_* + s(x - x_*)$ and $\lambda(s) = \lambda_* + s(\lambda - \lambda_*)$ and recalling from (6) that $\nabla_x F(x_*, \lambda_*) = 0$ and $(\lambda - \lambda_*)^\top \nabla_\lambda F(x_*, \lambda_*) \le 0$; (b) follows from the definition of $\bar{H}$ using $\nabla_{\lambda x} F = (\nabla_{x \lambda} F)^\top$; and (c) follows from the fact that $\bar{H}$ is negative semidefinite. Now, using the fact that $\mathcal{L}_{X_{\text{p-sp}}} V_1$ is nonpositive at any point, one can deduce, see e.g. [20, Lemmas 4.2-4.4], that starting from any point a unique trajectory of $X_{\text{p-sp}}$ exists, is contained in a compact sublevel set of $V_1$ at all times, and its omega-limit set is invariant. These facts imply that the hypotheses of Proposition 2.1 hold and so we deduce that the solutions of the dynamics converge to the largest invariant set where the Lie derivative is zero, that is, the set

$$\mathcal{E}(F, x_*, y_*, z_*) = \big\{ (x, y, z) \in \mathbb{R}^n \times \mathbb{R}^p_{\ge 0} \times \mathbb{R}^m \;\big|\; (x; y; z) - (x_*; y_*; z_*) \in \ker(\bar{H}(x, y, z, x_*, y_*, z_*)) \big\}. \qquad (11)$$

Finally, since $(x_*, y_*, z_*) \in \mathrm{Saddle}(F)$ was chosen arbitrarily, we get that the solutions converge to the largest invariant set contained in $\mathcal{E}(F)$, concluding the proof.

Note that the proof of Proposition 4.1 shows that the Lie derivative of the function $V_1$ is nonpositive, but not necessarily strictly negative, outside the set $\mathrm{Saddle}(F)$. From Proposition 4.1 and the definition (7), we deduce that if a point belongs to the omega-limit set (and is not a saddle point), then the line integral of the Hessian block matrix (8) from any saddle point to it cannot be full rank. Elaborating further,

1. if the block $\int_0^1 \nabla_{xx} F(x(s), y(s), z(s))\, ds$ is full rank for a saddle point $(x_*, y_*, z_*)$ and the point $(x, y, z)$ belongs to the omega-limit set, then $x = x_*$, and

2. if the block $\int_0^1 \begin{bmatrix} \nabla_{yy} F & \nabla_{yz} F \\ \nabla_{zy} F & \nabla_{zz} F \end{bmatrix}(x(s), y(s), z(s))\, ds$ is full rank for a saddle point $(x_*, y_*, z_*)$, then every point $(x, y, z)$ in the omega-limit set satisfies $(y, z) = (y_*, z_*)$.

These properties are used in the next result, which shows that local strong convexity-concavity at a saddle point, together with global convexity-concavity of the saddle function, is enough to guarantee global convergence.

###### Theorem 4.2

(Global asymptotic stability of the set of saddle points under $X_{\text{p-sp}}$): Given a twice continuously differentiable, convex-concave function $F$ which is locally strongly convex-concave at a saddle point, the set $\mathrm{Saddle}(F)$ is globally asymptotically stable under the projected saddle-point dynamics $X_{\text{p-sp}}$, and the convergence of trajectories is to a point.

Proof.

Our proof proceeds by characterizing the set $\mathcal{E}(F)$ defined in (7). Let $(x_*, y_*, z_*) \in \mathrm{Saddle}(F)$ be a saddle point at which $F$ is locally strongly convex-concave. Without loss of generality, assume that $\nabla_{xx} F(x_*, y_*, z_*) \succ 0$ (the case of negative definiteness of the other Hessian block can be reasoned analogously). Let $(x, y, z) \in \mathcal{E}(F)$ (recall the definition of this set in (7)). Since $\nabla_{xx} F(x_*, y_*, z_*) \succ 0$ and $F$ is twice continuously differentiable, we have that $\nabla_{xx} F$ is positive definite in a neighborhood of $(x_*, y_*, z_*)$ and so

$$\int_0^1 \nabla_{xx} F(x(s), y(s), z(s))\, ds \succ 0,$$

where $x(s) = x_* + s(x - x_*)$, $y(s) = y_* + s(y - y_*)$, and $z(s) = z_* + s(z - z_*)$. Therefore, by definition of $\bar{H}$ in (8), it follows that $(x - x_*; y - y_*; z - z_*) \in \ker(\bar{H}(x, y, z, x_*, y_*, z_*))$ forces $x = x_*$, and so $\mathcal{E}(F) \subset \{x_*\} \times \mathbb{R}^p_{\ge 0} \times \mathbb{R}^m$. From Proposition 4.1, the trajectories of $X_{\text{p-sp}}$ converge to the largest invariant set contained in $\mathcal{E}(F)$. To characterize this set, let $(x_*, y(0), z(0)) \in \mathcal{E}(F)$ and let $t \mapsto (x_*, y(t), z(t))$ be a trajectory of $X_{\text{p-sp}}$ that is contained in $\mathcal{E}(F)$, and hence has a constant first component. From (10), we get

$$\begin{aligned}
\mathcal{L}_{X_{\text{p-sp}}} V_1(x, y, z) &\le -(x - x_*)^\top \nabla_x F(x, y, z) + (y - y_*)^\top \nabla_y F(x, y, z) + (z - z_*)^\top \nabla_z F(x, y, z) \\
&\le F(x, y, z) - F(x, y_*, z_*) + F(x_*, y, z) - F(x, y, z) \\
&= F(x_*, y_*, z_*) - F(x, y_*, z_*) + F(x_*, y, z) - F(x_*, y_*, z_*) \le 0, \qquad (12)
\end{aligned}$$

where in the second inequality we have used the first-order convexity property of the map $x \mapsto F(x, y, z)$ and the first-order concavity property of the map $(y, z) \mapsto F(x, y, z)$. Now, since the trajectory lies in the set where the Lie derivative of $V_1$ vanishes and $x(t) = x_*$ for all $t \ge 0$, the above inequality gives $F(x_*, y(t), z(t)) = F(x_*, y_*, z_*)$ for all $t \ge 0$. Thus, $\frac{d}{dt} F(x_*, y(t), z(t)) = 0$ for all $t \ge 0$, which yields

$$\nabla_y F(x_*, y(t), z(t))^\top [\nabla_y F(x_*, y(t), z(t))]^+_{y(t)} + \|\nabla_z F(x_*, y(t), z(t))\|^2 = 0.$$

Note that both terms in the above expression are nonnegative and so we get $[\nabla_y F(x_*, y(t), z(t))]^+_{y(t)} = 0$ and $\nabla_z F(x_*, y(t), z(t)) = 0$ for all $t \ge 0$. In particular, this holds at $t = 0$; together with $\nabla_x F(x_*, y(0), z(0)) = -\dot{x}(0) = 0$, we conclude $(x_*, y(0), z(0)) \in \mathrm{Saddle}(F)$. Hence, the largest invariant set contained in $\mathcal{E}(F)$ is contained in $\mathrm{Saddle}(F)$, and $\mathrm{Saddle}(F)$ is globally asymptotically stable. Combining this with the fact that individual saddle points are stable, one deduces the pointwise convergence of trajectories along the same lines as in [27, Corollary 5.2].

A closer look at the proof of the above result reveals that the same conclusion also holds under milder conditions on the saddle function. In particular, $F$ need only be twice continuously differentiable in a neighborhood of the saddle point, and the local strong convexity-concavity hypothesis can be relaxed to a condition on the line integral of the Hessian blocks of $F$. We state this stronger result next.

###### Theorem 4.3

(Global asymptotic stability of the set of saddle points under $X_{\text{p-sp}}$): Let $F : \mathbb{R}^n \times \mathbb{R}^p_{\ge 0} \times \mathbb{R}^m \to \mathbb{R}$ be convex-concave and continuously differentiable with locally Lipschitz gradient. Suppose there exist a saddle point $(x_*, y_*, z_*) \in \mathrm{Saddle}(F)$ and a neighborhood of this point on which $F$ is twice continuously differentiable, and that either of the following holds:

1. for all $(x, y, z) \in \mathbb{R}^n \times \mathbb{R}^p_{\ge 0} \times \mathbb{R}^m$ with $x \neq x_*$,

 $$\int_0^1 \nabla_{xx} F(x(s), y(s), z(s))\, ds \succ 0,$$
2. for all $(x, y, z) \in \mathbb{R}^n \times \mathbb{R}^p_{\ge 0} \times \mathbb{R}^m$ with $(y, z) \neq (y_*, z_*)$,

 $$\int_0^1 \begin{bmatrix} \nabla_{yy} F & \nabla_{yz} F \\ \nabla_{zy} F & \nabla_{zz} F \end{bmatrix}(x(s), y(s), z(s))\, ds \prec 0,$$

where $x(s)$, $y(s)$, and $z(s)$ are given in (8). Then, $\mathrm{Saddle}(F)$ is globally asymptotically stable under the projected saddle-point dynamics $X_{\text{p-sp}}$ and the convergence of trajectories is to a point.

We omit the proof of this result for space reasons: the argument is analogous to the proof of Theorem 4.2, with the integral of the Hessian blocks replaced by the integral of generalized Hessian blocks (see [28, Chapter 2] for the definition of the latter), since the function is not twice continuously differentiable everywhere.

###### Example 4.4

(Illustration of global asymptotic convergence): Consider $F : \mathbb{R}^2 \times \mathbb{R}_{\ge 0} \times \mathbb{R} \to \mathbb{R}$ given as

$$F(x, y, z) = f(x) + y(-x_1 - 1) + z(x_1 - x_2), \qquad (13)$$

where

$$f(x) = \begin{cases} \|x\|^4, & \text{if } \|x\| \le \tfrac{1}{2}, \\[2pt] \tfrac{1}{16} + \tfrac{1}{2}\big(\|x\| - \tfrac{1}{2}\big), & \text{if } \|x\| \ge \tfrac{1}{2}. \end{cases}$$

Note that $F$ is convex-concave on $\mathbb{R}^2 \times (\mathbb{R}_{\ge 0} \times \mathbb{R})$ and $\mathrm{Saddle}(F)$ is the singleton consisting of the origin. Also, $f$ is continuously differentiable on the entire domain and its gradient is locally Lipschitz. Finally, $F$ is twice continuously differentiable in a neighborhood of the saddle point, and hypothesis (i) of Theorem 4.3 holds. Therefore, we conclude from Theorem 4.3 that the trajectories of the projected saddle-point dynamics of $F$ converge globally asymptotically to the saddle point. Figure 1 shows an execution.
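The convergence in Example 4.4 can also be probed numerically. The following Euler sketch of (5) for the $F$ in (13) is illustrative only; the step size, horizon, and initial condition are our arbitrary choices, not values from the paper:

```python
import numpy as np

def grad_f(x):
    """Gradient of the piecewise f in Example 4.4 (C^1 across ||x|| = 1/2)."""
    n = np.linalg.norm(x)
    if n <= 0.5:
        return 4.0 * n**2 * x        # gradient of ||x||^4
    return 0.5 * x / n               # gradient of 1/16 + (||x|| - 1/2)/2

def simulate_example(x, y, z, h=0.005, T=300.0):
    """Euler discretization of (5) for F(x,y,z) = f(x) + y(-x1-1) + z(x1-x2)."""
    x = np.array(x, dtype=float)
    for _ in range(int(T / h)):
        gx = grad_f(x) + np.array([-y + z, -z])   # gradient of F in x
        gy = -x[0] - 1.0                          # gradient of F in y
        gz = x[0] - x[1]                          # gradient of F in z
        dy = gy if y > 0 else max(gy, 0.0)        # projection [.]_y^+
        x, y, z = x - h * gx, max(y + h * dy, 0.0), z + h * gz
    return x, y, z

x, y, z = simulate_example([1.0, -1.0], 1.0, 1.0)
# The full state drifts toward the saddle point at the origin; the decay is
# slow (the quartic f is flat near 0), so we only check substantial progress.
assert np.linalg.norm(np.hstack([x, y, z])) < 1.0   # initial norm was 2.0
```

The slow tail is consistent with the theory: near the origin $\nabla_{xx} F$ vanishes, so convergence is asymptotic but not exponential.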

###### Remark 4.5

(Comparison with the literature): Theorems 4.2 and 4.3 complement the available results in the literature concerning the asymptotic convergence properties of saddle-point [3, 19, 17] and primal-dual dynamics [5, 20]. The former dynamics corresponds to (5) when the variable $y$ is absent, and the latter to (5) when the variable $z$ is absent. For both saddle-point and primal-dual dynamics, existing global asymptotic stability results require assumptions on the global properties of $F$, in addition to the global convexity-concavity of $F$, such as global strong convexity-concavity [3], or global strict convexity-concavity and its generalizations [19]. In contrast, the novelty of our results lies in establishing that certain local properties of the saddle function are enough to guarantee global asymptotic convergence.

## 5 Lyapunov function for constrained convex optimization problems

Our discussion above has established the global asymptotic stability of the set of saddle points resorting to LaSalle-type arguments (because the function defined in (9) is not a strict Lyapunov function). In this section, we identify instead a strict Lyapunov function for the projected saddle-point dynamics when the saddle function  corresponds to the Lagrangian of a constrained optimization problem, cf. Remark 3.1. The relevance of this result stems from two facts. On the one hand, the projected saddle-point dynamics has been employed profusely to solve network optimization problems. On the other hand, although the conclusions on the asymptotic convergence of this dynamics that can be obtained with the identified Lyapunov function are the same as in the previous section, having a Lyapunov function available is advantageous for a number of reasons, including the study of robustness against disturbances, the characterization of the algorithm convergence rate, or as a design tool for developing opportunistic state-triggered implementations. We come back to this point in Section 6 below.

###### Theorem 5.1

(Lyapunov function for $X_{\text{p-sp}}$): Let $F : \mathbb{R}^n \times \mathbb{R}^p_{\ge 0} \times \mathbb{R}^m \to \mathbb{R}$ be defined as

$$F(x, y, z) = f(x) + y^\top g(x) + z^\top (Ax - b), \qquad (14)$$

where $f : \mathbb{R}^n \to \mathbb{R}$ is strongly convex and twice continuously differentiable, $g : \mathbb{R}^n \to \mathbb{R}^p$ is convex and twice continuously differentiable, $A \in \mathbb{R}^{m \times n}$, and $b \in \mathbb{R}^m$. For each $(x, y, z) \in \mathbb{R}^n \times \mathbb{R}^p_{\ge 0} \times \mathbb{R}^m$, define the index set of active constraints

$$J(x, y, z) = \{ j \in \{1, \dots, p\} \mid y_j = 0 \text{ and } (\nabla_y F(x, y, z))_j < 0 \}.$$

Then, the associated function $V_2$ is nonnegative everywhere in its domain and vanishes if and only if $(x, y, z) \in \mathrm{Saddle}(F)$. Moreover, for any trajectory $t \mapsto (x(t), y(t), z(t))$ of $X_{\text{p-sp}}$, the map $t \mapsto V_2(x(t), y(t), z(t))$

1. is differentiable almost everywhere; if $(x(t), y(t), z(t)) \notin \mathrm{Saddle}(F)$ for some $t \ge 0$, then $\frac{d}{dt} V_2(x(t), y(t), z(t)) < 0$ provided the derivative exists; furthermore, for any sequence of times $\{t_k\}_{k=1}^\infty$ such that $t_k \to t$ and $\frac{d}{dt} V_2(x(t_k), y(t_k), z(t_k))$ exists for every $k$, we have $\limsup_{k \to \infty} \frac{d}{dt} V_2(x(t_k), y(t_k), z(t_k)) < 0$,

2. is right-continuous and, at any point of discontinuity $t$, satisfies $V_2(x(t), y(t), z(t)) \le \lim_{s \to t^-} V_2(x(s), y(s), z(s))$.

As a consequence, $\mathrm{Saddle}(F)$ is globally asymptotically stable under $X_{\text{p-sp}}$ and the convergence of trajectories is to a point.

Proof.

We start by partitioning the domain based on the active constraints. Let $I \subset \{1, \dots, p\}$ and

$$D(I) = \{ (x, y, z) \in \mathbb{R}^n \times \mathbb{R}^p_{\ge 0} \times \mathbb{R}^m \mid J(x, y, z) = I \}.$$

Note that for $I, I' \subset \{1, \dots, p\}$ with $I \neq I'$, we have $D(I) \cap D(I') = \emptyset$. Moreover,

$$\mathbb{R}^n \times \mathbb{R}^p_{\ge 0} \times \mathbb{R}^m = \bigcup_{I \subset \{1, \dots, p\}} D(I).$$

For each $I \subset \{1, \dots, p\}$, define the function