A novel iterative method to approximate structured singular values


Nicola Guglielmi (Dipartimento di Ingegneria Scienze Informatiche e Matematica (DISIM), Università degli Studi di L’Aquila, Via Vetoio - Loc. Coppito, and Gran Sasso Science Institute (GSSI), via Crispi 7, I-67010 L’Aquila, Italy. Email: guglielm@univaq.it), Mutti-Ur Rehman (Gran Sasso Science Institute (GSSI), via Crispi 7, I-67010 L’Aquila, Italy. Email: mutti.abbasi@gssi.infn.it), and Daniel Kressner (EPFL-SB-MATHICSE-ANCHP, Station 8, CH-1015 Lausanne, Switzerland. Email: daniel.kressner@epfl.ch)
21 March 2016
Abstract

A novel method for approximating structured singular values (also known as μ-values) is proposed and investigated. These quantities constitute an important tool in the stability analysis of uncertain linear control systems as well as in structured eigenvalue perturbation theory. Our approach consists of an inner-outer iteration. In the outer iteration, a Newton method is used to adjust the perturbation level. The inner iteration solves a gradient system associated with an optimization problem on the manifold induced by the structure. Numerical results and a comparison with the well-known Matlab function mussv, implemented in the Matlab Control Toolbox, illustrate the behavior of the method.

Keywords: structured singular value, μ-value, spectral value set, block diagonal perturbations, stability radius, differential equation, low-rank matrix manifold.

AMS subject classifications: 15A18, 65K05

1 Introduction

The structured singular value (SSV) [14] is an important and versatile tool in control, as it allows one to address a central problem in the analysis and synthesis of control systems: to quantify the stability of a closed-loop linear time-invariant system subject to structured perturbations. The class of structures addressed by the SSV is very general and covers all types of parametric uncertainties that can be incorporated into the control system via real or complex linear fractional transformations. We refer to [1, 3, 4, 8, 9, 10, 14, 17, 20] and the references therein for examples and applications of the SSV.

The versatility of the SSV comes at the expense of computational complexity: computing the SSV exactly is notoriously hard, in fact NP-hard [2]. Algorithms used in practice thus aim at providing upper and lower bounds, often resulting in a coarse estimate of the exact value. An upper bound of the SSV provides sufficient conditions to guarantee robust stability, while a lower bound provides sufficient conditions for instability and often also allows one to determine structured perturbations that destabilize the closed-loop linear system.

The widely used function mussv in the Matlab Control Toolbox computes an upper bound of the SSV using diagonal balancing / LMI techniques [19, 5]. The lower bound is computed by a generalization of the power method developed in [18, 15]. This algorithm resembles a mixture of the power methods for computing the spectral radius and the largest singular value, which is not surprising, since the SSV can be viewed as a generalization of both. When the algorithm converges, it returns a lower bound of the SSV, which is always an equilibrium point of the iteration. However, in contrast to the standard power method, there are, in general, several stable equilibrium points, and not all of them correspond to the SSV. In turn, one cannot guarantee convergence to the exact value but only to a lower bound. We remark that, despite this drawback, mussv is a very reliable and powerful routine, which reflects the state of the art in the approximation of the SSV.
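For reference, the two classical iterations that this power method blends can each be written in a few lines of Python. This is only an illustration of the two trivial special cases (the helper names are ours), not the mussv lower-bound algorithm itself; the first iteration assumes the largest eigenvalue is dominant.

import numpy as np

def spectral_radius_power(M, iters=500, seed=0):
    # Power iteration for the eigenvalue of largest modulus (assumes it is dominant).
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(M.shape[0]) + 1j * rng.standard_normal(M.shape[0])
    for _ in range(iters):
        x = M @ x
        x /= np.linalg.norm(x)
    return abs(x.conj() @ M @ x)          # modulus of the Rayleigh quotient

def largest_singular_value_power(M, iters=500, seed=0):
    # Power iteration on M^* M for the largest singular value of M.
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(M.shape[1]) + 1j * rng.standard_normal(M.shape[1])
    for _ in range(iters):
        x = M.conj().T @ (M @ x)
        x /= np.linalg.norm(x)
    return np.linalg.norm(M @ x)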

In this paper, we present a new approach to computing a lower bound of the SSV associated with general mixed real/complex perturbations. The main ingredient of our new algorithm is a gradient system that evolves perturbations on a certain matrix manifold towards critical perturbations. Among the theoretical properties established for this gradient system, we prove a monotonicity property that indicates robustness and can also be exploited in the numerical discretization. We show several numerical examples for which our algorithm provides tighter bounds than those computed by mussv.

1.1 Overview of the article

Section 2 provides the basic framework for the proposed methodology. In particular, we explain how the computation of the SSV can be addressed by an inner-outer algorithm, where the outer algorithm determines the perturbation level and the inner algorithm determines a (local) extremizer of the structured spectral value set. Moreover, an example illustrates that the output of mussv may fail to satisfy a necessary condition for optimality.

In Section 3 we develop the inner algorithm for the case of complex structured perturbations. An important characterization of extremizers shows that we can restrict ourselves to a manifold of structured perturbations with normalized and low-rank blocks. A gradient system for finding extremizers on this manifold is established and analyzed.

Section 4 extends the results of Section 3 to perturbations with complex full blocks and mixed complex/real repeated scalar blocks.

The outer algorithm is addressed in Section 5, where a Newton method for determining the correct perturbation level is developed. The algorithm proposed in this work is presented in Section 5.3.

Finally, in Section 6, we present a range of numerical experiments to compare the quality of the lower bounds obtained with our algorithm to those obtained with mussv.

2 Framework

We consider a matrix $M \in \mathbb{C}^{n \times n}$ and an underlying perturbation set with prescribed block diagonal structure,

$$\mathcal{B} = \left\{ \mathrm{diag}\big(\delta_1 I_{r_1}, \ldots, \delta_s I_{r_s},\, \Delta_1, \ldots, \Delta_f\big) :\ \delta_i \in \mathbb{C}\ (\text{or } \mathbb{R}),\ \Delta_j \in \mathbb{C}^{m_j \times m_j}\ (\text{or } \mathbb{R}^{m_j \times m_j}) \right\}$$   (1)

where $I_{r_i}$ denotes the $r_i \times r_i$ identity matrix. Each of the scalars $\delta_i$ and the matrices $\Delta_j$ may be constrained to stay real in the definition of $\mathcal{B}$. The integer $s$ denotes the number of repeated scalar blocks (that is, scalar multiples of the identity) and $f$ denotes the number of full blocks. This implies $r_1 + \cdots + r_s + m_1 + \cdots + m_f = n$. In order to distinguish complex and real scalar blocks, we assume that the first $s'$ scalar blocks are complex while the (possibly) remaining $s - s'$ blocks are real. Similarly, we assume that the first $f'$ full blocks are complex and the (possibly) remaining $f - f'$ blocks are real. The literature (see, e.g., [14]) usually does not consider real full blocks, that is, $f' = f$. In fact, in control theory, full blocks arise from uncertainties associated to the frequency response of a system, which is complex-valued.
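To fix ideas, the following Python sketch assembles a perturbation with the block structure (1) from given scalars and full blocks; the helper name and argument layout are ours and are not part of the method described in this paper.

import numpy as np
from scipy.linalg import block_diag

def assemble_perturbation(deltas, r, full_blocks):
    # Delta = diag(delta_1 I_{r_1}, ..., delta_s I_{r_s}, Delta_1, ..., Delta_f)
    # deltas: the repeated scalars delta_i, r: their multiplicities r_i,
    # full_blocks: the (m_j x m_j) full blocks Delta_j.
    scalar_blocks = [d * np.eye(ri) for d, ri in zip(deltas, r)]
    return block_diag(*scalar_blocks, *full_blocks)

# one complex repeated scalar block of size 2 and one 2x2 complex full block
Delta = assemble_perturbation([0.3 + 0.4j], [2],
                              [np.array([[0.1, 0.2j], [0.0, 0.5]])])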

For simplicity, we assume that all full blocks are square, although this is not necessary and our method extends to the non-square case in a straightforward way. Similarly, the chosen ordering of blocks should not be viewed as a limiting assumption; it merely simplifies notation.

The following definition is given in [14], where $\|\cdot\|_2$ denotes the matrix 2-norm and $I$ the $n \times n$ identity matrix. {definition} Let $M \in \mathbb{C}^{n \times n}$ and consider a set $\mathcal{B}$ of the form (1). Then the SSV (or μ-value) is defined as

$$\mu_{\mathcal{B}}(M) := \frac{1}{\min\{\|\Delta\|_2 :\ \Delta \in \mathcal{B},\ \det(I - M\Delta) = 0\}}.$$   (2)

In Definition (2) and in the following, we use the convention that the minimum over an empty set is $+\infty$. In particular, $\mu_{\mathcal{B}}(M) = 0$ if $\det(I - M\Delta) \neq 0$ for all $\Delta \in \mathcal{B}$.

Note that $\mu_{\mathcal{B}}(\cdot)$ is a positively homogeneous function, i.e., $\mu_{\mathcal{B}}(\alpha M) = \alpha\, \mu_{\mathcal{B}}(M)$ for every $\alpha \geq 0$.

For $\mathcal{B} = \mathbb{C}^{n \times n}$, it follows directly from Definition (2) that $\mu_{\mathcal{B}}(M) = \|M\|_2$. For general $\mathcal{B}$, the SSV can only become smaller and we thus have the upper bound $\mu_{\mathcal{B}}(M) \leq \|M\|_2$. This can be refined further by exploiting the properties of $\mathcal{B}$, see [20]. These relations between $\mu_{\mathcal{B}}(M)$ and $\|M\|_2$, the largest singular value of $M$, justify the name structured singular value for $\mu_{\mathcal{B}}(M)$.

The important special case when $\mathcal{B}$ only allows for complex perturbations, that is, $s = s'$ and $f = f'$, deserves particular attention. In this case we will write $\mu_{\mathcal{B}}^{\mathbb{C}}(M)$ instead of $\mu_{\mathcal{B}}(M)$. Note that $\Delta \in \mathcal{B}$ implies $e^{\mathrm{i}\varphi}\Delta \in \mathcal{B}$ for any $\varphi \in \mathbb{R}$. In turn, there is $\Delta \in \mathcal{B}$ such that $\det(I - M\Delta) = 0$ if and only if there is $\Delta' \in \mathcal{B}$, with the same norm, such that $M\Delta'$ has the eigenvalue $1$, which implies $\rho(M\Delta') \geq 1$. This gives the following alternative expression:

$$\mu_{\mathcal{B}}^{\mathbb{C}}(M) = \frac{1}{\min\{\|\Delta\|_2 :\ \Delta \in \mathcal{B},\ \rho(M\Delta) \geq 1\}},$$   (3)

where $\rho(\cdot)$ denotes the spectral radius of a matrix. For any nonzero eigenvalue $\lambda$ of $M$, the matrix $\Delta = \lambda^{-1} I$ satisfies the constraints of the minimization problem in (3). This establishes the lower bound $\rho(M) \leq \mu_{\mathcal{B}}^{\mathbb{C}}(M)$ for the case of purely complex perturbations. Note that $\mu_{\mathcal{B}}^{\mathbb{C}}(M) = \rho(M)$ for $\mathcal{B} = \{\delta I :\ \delta \in \mathbb{C}\}$. Hence, both the spectral radius and the matrix 2-norm are included as (trivial) special cases of the SSV.
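For purely complex perturbations, these two extreme cases give immediately computable bounds. A minimal Python sketch (function name ours):

import numpy as np

def mu_trivial_bounds(M):
    # rho(M) <= mu(M) <= ||M||_2 for a purely complex perturbation set.
    rho = max(abs(np.linalg.eigvals(M)))   # spectral radius: lower bound
    sigma = np.linalg.norm(M, 2)           # largest singular value: upper bound
    return rho, sigma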

2.1 A motivating example

Consider the matrix

where $\mathrm{i}$ denotes the imaginary unit, along with the perturbation set

Applying the Matlab function mussv (in all experiments we have used mussv with its default parameters) yields the bounds

(4)

The large difference between the lower and upper bounds is caused by the lower bound. The perturbation determining the lower bound is given by with

and . The scaling has been chosen such that . However, not all blocks of have unit norm; the repeated scalar block of has norm . We will see in Theorem 4.1 below that this violates a necessary optimality condition for an extremizer, which states that, under suitable conditions which are fulfilled here, the spectral norm of all blocks of a normalized extremizer should be one.

Applying our new algorithm, Algorithm 1 below, we obtain the perturbation with

and , determining the lower bound

which makes the estimate (4) substantially sharper. Note that both blocks of have unit norm.
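The necessary optimality condition just mentioned is easy to check numerically. The sketch below (our own helper, with the block sizes passed explicitly) returns the 2-norms of the diagonal blocks of a given perturbation; for a normalized extremizer they should all be equal to one.

import numpy as np

def block_norms(Delta, r, m):
    # 2-norms of the diagonal blocks of Delta, ordered as in (1):
    # r = sizes of repeated scalar blocks, m = sizes of full blocks.
    norms, k = [], 0
    for size in list(r) + list(m):
        norms.append(np.linalg.norm(Delta[k:k + size, k:k + size], 2))
        k += size
    return norms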

2.2 A reformulation based on structured spectral value sets

The structured spectral value set of $M$ with respect to a perturbation level $\epsilon > 0$ is defined as

$$\Sigma_{\mathcal{B}}^{\epsilon}(M) = \{\lambda \in \Lambda(\epsilon M \Delta) :\ \Delta \in \mathcal{B},\ \|\Delta\|_2 \leq 1\},$$   (5)

where $\Lambda(\cdot)$ denotes the spectrum of a matrix. Note that for purely complex $\mathcal{B}$, the set (5) is simply a disk centered at $0$. The set

(6)

allows us to express the SSV defined in (2) as

that is, as a structured distance to singularity problem. We have that $1 \in \Sigma_{\mathcal{B}}^{\epsilon}(M)$ if and only if $\epsilon \geq 1/\mu_{\mathcal{B}}(M)$.

For a purely complex perturbation set $\mathcal{B}$, we can use (3) to alternatively express the SSV as

$$\mu_{\mathcal{B}}(M) = \frac{1}{\min\{\epsilon > 0 :\ \Sigma_{\mathcal{B}}^{\epsilon}(M) \not\subset \mathcal{D}\}}.$$   (7)

We have that $\Sigma_{\mathcal{B}}^{\epsilon}(M) \not\subset \mathcal{D}$, where $\mathcal{D}$ denotes the open complex unit disk, if and only if $\epsilon \geq 1/\mu_{\mathcal{B}}(M)$.

2.3 Overview of the proposed methodology

Let us consider the minimization problem

$$\min\{|1 - \lambda| :\ \lambda \in \Sigma_{\mathcal{B}}^{\epsilon}(M)\}$$   (8)

for some fixed $\epsilon > 0$. By the discussion above, the SSV is the reciprocal of the smallest value of $\epsilon$ for which the minimum in (8) equals zero. This suggests a two-level algorithm: In the inner algorithm, we attempt to solve (8). In the outer algorithm, we vary $\epsilon$ by an iterative procedure which exploits the knowledge of the exact derivative of the extremizer with respect to $\epsilon$. We address (8) by solving a system of ODEs. In general, this only yields a local minimum of (8) which, in turn, gives an upper bound for $1/\mu_{\mathcal{B}}(M)$ and hence a lower bound for $\mu_{\mathcal{B}}(M)$. Due to the lack of global optimality criteria for (8), the only way to increase the robustness of the method is to compute several local optima.
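The two-level structure can be summarized schematically as follows. This Python sketch is not the algorithm developed in Section 5; the inner solver inner_extremizer is a placeholder assumed to return the locally extremal eigenvalue modulus at level epsilon together with its derivative with respect to epsilon (purely complex case).

def ssv_lower_bound(M, eps0, inner_extremizer, max_outer=20, tol=1e-10):
    # Schematic inner-outer iteration: the outer Newton iteration adjusts eps until
    # the locally maximal eigenvalue modulus equals 1, so that 1/eps is a lower
    # bound for the SSV.
    eps = eps0
    for _ in range(max_outer):
        abs_lam, dabs_lam_deps = inner_extremizer(M, eps)   # inner gradient flow (placeholder)
        if abs(abs_lam - 1.0) < tol:
            break
        eps -= (abs_lam - 1.0) / dabs_lam_deps              # Newton step in eps
    return 1.0 / eps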

The case of a purely complex perturbation set $\mathcal{B}$ can be addressed analogously by letting the inner algorithm determine local optima for

$$\max\{|\lambda| :\ \lambda \in \Sigma_{\mathcal{B}}^{\epsilon}(M)\},$$   (9)

which then yields a lower bound for $\mu_{\mathcal{B}}(M)$.

3 Purely complex perturbations

In this section, we consider the solution of the inner problem (9) in the estimation of $\mu_{\mathcal{B}}(M)$ for $M \in \mathbb{C}^{n \times n}$ and a purely complex perturbation set

$$\mathcal{B} = \left\{ \mathrm{diag}\big(\delta_1 I_{r_1}, \ldots, \delta_s I_{r_s},\, \Delta_1, \ldots, \Delta_f\big) :\ \delta_i \in \mathbb{C},\ \Delta_j \in \mathbb{C}^{m_j \times m_j} \right\}.$$

3.1 Extremizers

We will make use of the following standard eigenvalue perturbation result, see, e.g., [11, Section II.1.1]. Here and in the following, we denote by $\dot{A}(t)$ the derivative $\frac{d}{dt}A(t)$.

{lemma}

Consider a smooth matrix family $A : \mathbb{R} \to \mathbb{C}^{n \times n}$ and let $\lambda(t)$ be an eigenvalue of $A(t)$ converging to a simple eigenvalue $\lambda_0$ of $A_0 = A(0)$ as $t \to 0$. Then $\lambda(t)$ is analytic near $t = 0$ with

$$\frac{d\lambda}{dt}\Big|_{t=0} = \frac{y_0^*\, \dot{A}(0)\, x_0}{y_0^* x_0},$$

where $x_0$ and $y_0$ are right and left eigenvectors of $A_0$ associated to $\lambda_0$, that is, $A_0 x_0 = \lambda_0 x_0$ and $y_0^* A_0 = \lambda_0 y_0^*$.
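The derivative formula of the lemma is easy to validate numerically. In the Python sketch below (helper names ours), it is compared against a finite difference for the largest eigenvalue of a random matrix family, assuming that this eigenvalue stays simple.

import numpy as np

def largest_eig_triplet(A):
    # Largest (in modulus) eigenvalue of A with right/left eigenvectors x, y
    # satisfying A x = lam x and y^* A = lam y^*.
    w, V = np.linalg.eig(A)
    k = np.argmax(abs(w))
    wl, U = np.linalg.eig(A.conj().T)
    y = U[:, np.argmin(abs(wl.conj() - w[k]))]
    return w[k], V[:, k], y

def eig_derivative(A, dA):
    # lambda'(0) = y^* dA x / (y^* x) for the family t -> A + t dA.
    lam, x, y = largest_eig_triplet(A)
    return (y.conj() @ dA @ x) / (y.conj() @ x)

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
dA = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
h = 1e-7
fd = (largest_eig_triplet(A + h * dA)[0] - largest_eig_triplet(A)[0]) / h
print(abs(fd - eig_derivative(A, dA)))   # small, of the order of h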

Our goal is to solve the maximization problem (9), which requires finding a perturbation $\Delta \in \mathcal{B}$ such that $\rho(\epsilon M\Delta)$ is maximal among all $\Delta \in \mathcal{B}$ with $\|\Delta\|_2 \leq 1$. In the following, we call $\lambda$ a largest eigenvalue if $|\lambda|$ equals the spectral radius. {definition} A matrix $\Delta \in \mathcal{B}$ such that $\|\Delta\|_2 \leq 1$ and such that $\epsilon M\Delta$ has a largest eigenvalue that locally maximizes the modulus over $\Sigma_{\mathcal{B}}^{\epsilon}(M)$ is called a local extremizer.

The following result provides an important characterization of local extremizers.

{theorem}

Let

$$\Delta = \mathrm{diag}\big(\delta_1 I_{r_1}, \ldots, \delta_s I_{r_s},\, \Delta_1, \ldots, \Delta_f\big)$$

be a local extremizer of $\Sigma_{\mathcal{B}}^{\epsilon}(M)$. We assume that $\epsilon M\Delta$ has a simple largest eigenvalue $\lambda$, with the right and left eigenvectors $x$ and $y$ scaled such that $y^* x > 0$. Partitioning

$$x = \big(x_1^\top, \ldots, x_s^\top, x_{s+1}^\top, \ldots, x_{s+f}^\top\big)^\top, \qquad y = \big(y_1^\top, \ldots, y_s^\top, y_{s+1}^\top, \ldots, y_{s+f}^\top\big)^\top,$$   (10)

such that the size of the components $x_k$, $y_k$ equals the size of the $k$th block in $\Delta$, we additionally assume that

$$y_k^* x_k \neq 0, \qquad k = 1, \ldots, s,$$   (11)
$$\|x_{s+j}\|\, \|y_{s+j}\| \neq 0, \qquad j = 1, \ldots, f.$$   (12)

Then

$$|\delta_1| = \cdots = |\delta_s| = \|\Delta_1\|_2 = \cdots = \|\Delta_f\|_2 = 1,$$

that is, all blocks of $\Delta$ have unit 2-norm. {proof} The result is proved by contradiction. We first assume that $\|\Delta_h\|_2 < 1$ for some $h$ and consider the matrix-valued function

(13)

which satisfies and for sufficiently small. Since is simple, we can apply Lemma 3.1 to and obtain

(14)

Inserting (13) and exploiting (12), we obtain

which contradicts the extremality of .

Let us now assume that $|\delta_h| < 1$ for some $h$ and consider the matrix-valued function

which again satisfies and for sufficiently small. In analogy to the first part, Assumption (11) implies

This again gives a contradiction.

Remark 3.1

Note that Assumptions (11) and (12) as well as the simplicity of are generic and commonly found in the literature on algorithms for the SSV, see, e.g., [14, Sec. 7.2].

The following theorem allows us to replace the full blocks in a local extremizer by rank-1 matrices. {theorem} Let be a local extremizer and let be defined and partitioned as in Theorem 3.1. Assuming that (12) holds, every block has a singular value with associated singular vectors and for some . Moreover, the matrix

is also a local extremizer, i.e., . {proof} Let , and consider the matrix-valued function

which has 2-norm bounded by for . By Theorem 3.1, , which implies . Consequently,

Combined with the extremality assumption, we obtain . This implies that has singular vectors and , which completes the proof.

Remark 3.2

Theorem 3.1 allows us to restrict the perturbations in the structured spectral value set (5) to those with rank-1 blocks, which was also shown in [14]. Since the Frobenius and the matrix 2-norms of a rank-1 matrix are equal, we can equivalently search for extremizers within the submanifold

$$\mathcal{B}_1 = \left\{ \mathrm{diag}\big(\delta_1 I_{r_1}, \ldots, \delta_s I_{r_s},\, \Delta_1, \ldots, \Delta_f\big) \in \mathcal{B} :\ |\delta_i| = 1,\ \|\Delta_j\|_F = 1 \right\}.$$   (15)
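The norm identities used in this remark are easy to illustrate numerically: keeping only the dominant singular triplet of a full block preserves its 2-norm, and for the resulting rank-1 block the Frobenius and 2-norms coincide. The sketch below shows only these identities, not the construction of the preceding theorem.

import numpy as np

def dominant_rank1_part(B):
    # sigma_1 * u_1 v_1^* from the dominant singular triplet of B.
    U, s, Vh = np.linalg.svd(B)
    return s[0] * np.outer(U[:, 0], Vh[0, :])

B = np.array([[0.6, 0.2j], [0.1, 0.8]])
B1 = dominant_rank1_part(B)
# the three printed values coincide: 2-norm preserved, Frobenius = 2-norm for rank 1
print(np.linalg.norm(B, 2), np.linalg.norm(B1, 2), np.linalg.norm(B1, 'fro'))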

3.2 A system of ODEs to compute extremal points of $\Sigma_{\mathcal{B}}^{\epsilon}(M)$

In order to compute a local maximizer for $|\lambda|$, with $\lambda \in \Sigma_{\mathcal{B}}^{\epsilon}(M)$, we will first construct a matrix-valued function $\Delta(t)$, where $\Delta(t) \in \mathcal{B}_1$, such that a largest eigenvalue of $\epsilon M\Delta(t)$ has maximal local increase. We then derive a system of ODEs satisfied by this choice of $\Delta(t)$.

Orthogonal projection onto $\mathcal{B}$

In the following, we make use of the Frobenius inner product $\langle A, B \rangle = \mathrm{trace}(A^* B)$ for two matrices $A, B \in \mathbb{C}^{n \times n}$. We let

$$P_{\mathcal{B}} : \mathbb{C}^{n \times n} \to \mathcal{B}$$   (16)

denote the orthogonal projection, with respect to the Frobenius inner product, of a matrix onto $\mathcal{B}$. To derive a compact formula for this projection, we use the pattern matrix

$$I_{\mathcal{B}} = \mathrm{diag}\big(\mathbf{1}_{r_1}, \ldots, \mathbf{1}_{r_s},\, \mathbf{1}_{m_1}, \ldots, \mathbf{1}_{m_f}\big),$$   (17)

where $\mathbf{1}_k$ denotes the $k \times k$ matrix of all ones. {lemma} For $Z \in \mathbb{C}^{n \times n}$, let

$$Z_{\mathcal{B}} = \mathrm{diag}\big(Z_1, \ldots, Z_s,\, Z_{s+1}, \ldots, Z_{s+f}\big) = Z \circ I_{\mathcal{B}}$$

denote the block diagonal matrix obtained by entrywise multiplication of $Z$ with the matrix defined in (17). Then the orthogonal projection of $Z$ onto $\mathcal{B}$ is given by

$$P_{\mathcal{B}}(Z) = \mathrm{diag}\big(\gamma_1 I_{r_1}, \ldots, \gamma_s I_{r_s},\, Z_{s+1}, \ldots, Z_{s+f}\big),$$   (18)

where $Z_i$ denotes the $i$th diagonal block of $Z_{\mathcal{B}}$ and $\gamma_i = \mathrm{trace}(Z_i)/r_i$. {proof} The result follows directly from the fact that

$$\langle Z - P_{\mathcal{B}}(Z), \Omega \rangle = 0$$

holds for every $\Omega \in \mathcal{B}$.

If $Z$ is a rank-one matrix, with the partitioning

then the diagonal blocks of the orthogonal projection $P_{\mathcal{B}}(Z)$ are again rank-one matrices and, moreover, .
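For the purely complex structure considered in this section, the projection of the lemma above can be implemented directly. A minimal Python sketch (block sizes passed explicitly, names ours):

import numpy as np

def project_onto_B(Z, r, m):
    # Orthogonal projection (Frobenius inner product) of Z onto the purely complex
    # block structure: off-(block-)diagonal entries are discarded, a full block is
    # kept as is, and a repeated scalar block is replaced by (trace/size) * I.
    P = np.zeros_like(Z, dtype=complex)
    k = 0
    for ri in r:
        P[k:k + ri, k:k + ri] = (np.trace(Z[k:k + ri, k:k + ri]) / ri) * np.eye(ri)
        k += ri
    for mj in m:
        P[k:k + mj, k:k + mj] = Z[k:k + mj, k:k + mj]
        k += mj
    return P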

The local optimization problem

Let us recall the setting from Section 3.1: We assume that $\lambda$ is a simple eigenvalue of $\epsilon M\Delta$ with eigenvectors $x$ and $y$ normalized such that

$$\|x\| = \|y\| = 1, \qquad y^* x > 0.$$   (19)

As a consequence of Lemma 3.1, see also (14), we have

$$\frac{d}{dt}\,|\lambda(t)| = \frac{1}{|\lambda|}\, \mathrm{Re}\!\left( \overline{\lambda}\, \frac{y^*\, \epsilon M \dot{\Delta}(t)\, x}{y^* x} \right),$$   (20)

where $x = x(t)$ and $y = y(t)$ are the eigenvectors associated to $\lambda(t)$ and the dependence on $t$ is intentionally omitted on the right-hand side.

Letting $\Delta(t) \in \mathcal{B}_1$, with $\mathcal{B}_1$ as in (15), we now aim at determining a direction $Z = \dot{\Delta}(t)$ that locally maximizes the increase of the modulus of $\lambda(t)$. This amounts to determining

(21)

as a solution of the optimization problem

(22)

The target function in (22) follows from (20), while the constraints in (21) and (22) ensure that $Z$ is in the tangent space of $\mathcal{B}_1$ at $\Delta$. In particular, (22) implies that the norms of the blocks of $\Delta$ are conserved. Note that (22) only becomes well-posed after imposing an additional normalization on the norm of $Z$. The scaling chosen in the following lemma aims at giving each block of $Z$ unit norm.

{lemma}

With the notation introduced above and partitioned as in (10), a solution of the optimization problem (22) is given by

with

(23)
(24)

Here, is the reciprocal of the absolute value of the right-hand side in (23), if this is different from zero, and otherwise. Similarly, is the reciprocal of the Frobenius norm of the matrix on the right-hand side in (24), if this is different from zero, and otherwise. If all right-hand sides are different from zero then . {proof} The equality

implies that the maximization problem (22) decouples, which allows us to maximize for each block of individually.

For a full block , the term is maximized by the orthogonal projection of onto the (real linear) subspace . This gives (24), with the scaling chosen such that unless .

For a block , the term is maximized by projecting onto . This gives (23), with the scaling chosen such that unless . {corollary} The result of Lemma 3 can be expressed as

(25)

where is the orthogonal projection from Definition 16, and are diagonal matrices with positive. {proof} The statement is an immediate consequence of Lemma 3.

The system of ODEs

Lemma 3 and Corollary 3 suggest considering the following differential equation on the manifold $\mathcal{B}_1$:

(26)

where $x = x(t)$ is an eigenvector, of unit norm, associated to a simple eigenvalue $\lambda(t)$ of $\epsilon M\Delta(t)$ for some fixed $\epsilon > 0$. Note that $x$ and $\lambda$ depend on $t$ as well. The differential equation (26) is a gradient system because, by definition, the right-hand side is the projected gradient of $|\lambda|$.

The following result follows directly from Lemmas 3.1 and 3. {theorem} Let $\Delta(t) \in \mathcal{B}_1$ satisfy the differential equation (26). If $\lambda(t)$ is a simple eigenvalue of $\epsilon M\Delta(t)$, then $|\lambda(t)|$ increases monotonically with $t$.
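A crude way to experiment with this monotonicity is a projected-gradient ascent with an explicit Euler step and block renormalization. The Python sketch below is a schematic stand-in for the gradient system (26), not its exact form: the free gradient of the eigenvalue modulus is assembled from the eigenvalue-derivative lemma, projected onto the block structure, and each block is renormalized after the step. The perturbation level epsilon is omitted (it only rescales the eigenvalue), the rank-1 form of Section 3.3 is not enforced, and all names are ours; for small step sizes h the objective increases, in line with the theorem above.

import numpy as np

def gradient_of_modulus(M, Delta):
    # Free gradient G of |lambda_max(M @ Delta)| w.r.t. Delta (Frobenius geometry),
    # obtained from the eigenvalue-derivative lemma: d|lambda| = Re <G, dDelta>.
    A = M @ Delta
    w, V = np.linalg.eig(A)
    k = np.argmax(abs(w)); lam, x = w[k], V[:, k]
    wl, U = np.linalg.eig(A.conj().T)
    y = U[:, np.argmin(abs(wl.conj() - lam))]            # left eigenvector
    c = lam / (abs(lam) * (x.conj() @ y))                # note x^* y = conj(y^* x)
    return abs(lam), c * np.outer(M.conj().T @ y, x.conj())

def euler_step(M, Delta, r, m, h):
    # One explicit Euler step with block renormalization (schematic, not eq. (26)).
    val, G = gradient_of_modulus(M, Delta)               # val = |lambda| before the step
    New = Delta.copy().astype(complex)
    k = 0
    for i, size in enumerate(list(r) + list(m)):
        sl = slice(k, k + size)
        Gi = G[sl, sl]
        if i < len(r):                                   # repeated scalar block
            Gi = (np.trace(Gi) / size) * np.eye(size)    # project onto span{I}
        Di = Delta[sl, sl] + h * Gi
        if i < len(r):
            d = Di[0, 0]
            New[sl, sl] = (d / abs(d)) * np.eye(size)    # keep |delta_i| = 1
        else:
            New[sl, sl] = Di / np.linalg.norm(Di, 'fro') # keep unit Frobenius norm
        k += size
    return val, New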

The following lemma establishes a useful property for the analysis of stationary points of (26). {lemma} Let $\Delta(t) \in \mathcal{B}_1$ satisfy the differential equation (26). If $\lambda(t)$ is a nonzero simple eigenvalue of $\epsilon M\Delta(t)$, with right and left eigenvectors $x(t)$ and $y(t)$ scaled according to (19), then

(27)

where . {proof} For convenience, we again omit the dependence on and let . Assume – in contradiction to the statement – that . Because of the block diagonal structure of , this implies

(28)

On the other hand,

Exploiting the normalization (19) and the simplicity of $\lambda$, we obtain . This, however, contradicts (28).

The differential equation (26) can be expressed in terms of the blocks of $\Delta$, that is, through $\delta_i$ and $\Delta_j$, as follows:

(29)

with the scalars , defined in Lemma 3.

Because of $|\delta_i| = 1$, we can reparametrize $\delta_i = e^{\mathrm{i}\theta_i}$ and rewrite the first set of equations in (29) as a system of ODEs in the angles $\theta_i$. Setting $\delta_i = e^{\mathrm{i}\theta_i}$, we obtain

which gives With the normalization imposed by , this finally yields

In particular, this means that if and only if ; maximizers correspond to .

Remark 3.3

The choice of the scaling, originating from Lemma 3, to achieve unit norm of all blocks in (25) is completely arbitrary. Other choices would also be acceptable, and investigating an optimal one in terms of speed of convergence to stationary points would be an interesting issue.

The following result characterizes stationary points of (26). {theorem} Assume that is a solution of (26) and is a largest simple nonzero eigenvalue of with right/left eigenvectors , . Moreover, suppose that Assumptions (11) and (12) hold for and . Then

(30)

for a specific real diagonal matrix . Moreover if has (locally) maximal modulus over the set then is positive.

{proof}

By (20), implies . Inserting (29) shows that each block of is necessarily zero and hence . The other direction of the first equivalence in (30) is trivial. The second equivalence in (30) follows directly from (26). By Theorem 3.1, all blocks of have norm and hence none of the scalars defining can be zero. Thus, is nonsingular.

Assuming that has (locally) maximal modulus, we now prove positivity of by contradiction. Suppose that the th full block of is equal to with , implying . Consider an ODE with the th block and initial datum , while leaving all other blocks of unaltered. Such an ODE clearly decreases the norm of for , for some (implying that does not exceed ). By the usual derivative formula from Lemma 3.1, the largest eigenvalue of is such that , which contradicts local maximality.

Similarly, consider a repeated scalar block and assume that the th block of is equal to with , which means . Consider, similarly to the previous case, an ODE with the th block and initial datum . Again, decreases in a sufficiently small time interval and increases in the same interval, again yielding a contradiction.

3.3 Projection of full blocks on rank-1 manifolds

In order to exploit the rank-1 property of extremizers established in Theorem 3.1, we can proceed in complete analogy to [6] in order to obtain for each full block $\Delta_j$ an ODE on the manifold of (complex) rank-1 matrices. We express $\Delta_j$ as

$$\Delta_j = \sigma_j u_j v_j^*,$$

where $u_j$ and $v_j$ have unit norm. The parameters $\dot{\sigma}_j$, $\dot{u}_j$, $\dot{v}_j$ are uniquely determined by $\dot{\Delta}_j$, $u_j$ and $v_j$ when imposing the orthogonality conditions $u_j^* \dot{u}_j = 0$ and $v_j^* \dot{v}_j = 0$.

In the differential equation (29) we replace the right-hand side by its orthogonal projection onto the tangent space (and also remove the normalization constant) to obtain

(31)

Note that the orthogonal projection of a matrix $Z$ onto the tangent space of the manifold of rank-1 matrices at $\Delta_j = \sigma_j u_j v_j^*$ is given by

$$Z \mapsto Z v_j v_j^* + u_j u_j^* Z - u_j\, (u_j^* Z v_j)\, v_j^*.$$
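A minimal Python sketch of this tangent-space projection (function name ours), applicable to any matrix Z and unit-norm vectors u and v:

import numpy as np

def project_rank1_tangent(Z, u, v):
    # Orthogonal projection of Z onto the tangent space of the rank-1 manifold
    # at sigma * u v^* (u, v of unit norm): Z v v^* + u u^* Z - u (u^* Z v) v^*.
    Zv = Z @ v
    uZ = u.conj() @ Z
    return (np.outer(Zv, v.conj()) + np.outer(u, uZ)
            - (u.conj() @ Zv) * np.outer(u, v.conj()))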

Following the arguments of [6], the equation is equivalent to