# Optimal actuator design for minimizing the worst-case control energy

###### Abstract:

We consider the actuator design problem for linear systems. Specifically, we aim to identify an actuator which requires the least amount of control energy to drive the system from an arbitrary initial condition to the origin in the worst case. Put differently, we investigate the minimax problem of minimizing the control energy over the worst possible initial conditions. Recall that the least amount of control energy needed to drive a linear controllable system from any initial condition on the unit sphere to the origin is upper-bounded by the inverse of the smallest eigenvalue of the associated controllability Gramian, and moreover, the upper bound is sharp. The minimax problem can thus be viewed as the optimization problem of minimizing this upper bound via actuator design. In spite of its simple and natural formulation, this problem is difficult to solve. In fact, properties such as the stability of the system matrix, which are not related to controllability, now play important roles. We focus in this paper on the special case where the system matrix is positive definite (and hence the system is completely unstable). Under this assumption, we are able to provide a complete solution to the optimal actuator design problem and highlight the difficulty in solving the general problem.

Xudong Chen*, M.-A. Belabbas†

* Department of ECEE, University of Colorado at Boulder, Boulder, CO 80309 USA (e-mail: xudong.chen@colorado.edu).

† Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA (e-mail: belabbas@illinois.edu).

Keywords: Linear control system, Optimal actuator design, Minimax problem, Matrix analysis

## 1. Introduction

We consider in this paper the following single-input linear control system

 \dot{x}(t)=Ax(t)+bu(t),\hskip 20.0ptx(t)\in\mathbb{R}^{n}, (1)

together with an infinite horizon quadratic cost function:

 \eta:=\intop^{\infty}_{0}u(t)^{\top}u(t)dt,

which penalizes the energy consumption for driving system (1) from an initial condition x_{0} to the origin. It is well known that if system (1) is controllable (Brockett (1970)), i.e., the controllability matrix {\mathrm{C}}(A,b):=[b,Ab,\ldots,A^{n-1}b] is non-singular, then the minimal energy consumption (with respect to the initial condition x_{0}) is given by

 \eta_{\min}(x_{0},b)=x_{0}^{\top}W_{A}(b)^{-1}x_{0}, (2)

where W_{A}(b) is given by

 W_{A}(b):=\intop^{\infty}_{0}e^{-At}bb^{\top}e^{-A^{\top}t}dt. (3)

We note here that if we replace -A with A in (3), then the resulting matrix W_{A}(b) is the controllability Gramian associated with (1). From (2), it should be clear that if an initial condition x^{\prime}_{0} (resp. the actuator b^{\prime}) is a scalar multiple of x_{0} (resp. b), i.e., x^{\prime}_{0}=\alpha x_{0} (resp. b^{\prime}=\beta b), then

 \eta_{\min}(x^{\prime}_{0},b^{\prime})=\frac{\alpha^{2}}{\beta^{2}}\eta_{\min}(x_{0},b).

Following this fact, we normalize in the sequel the initial condition x_{0}, as well as the actuator b, such that \|x_{0}\|=\|b\|=1. We further note that if the matrix A is stable, i.e., all the eigenvalues of A lie in the open left half plane, then one can set u(t)\equiv 0, and hence \eta_{\min}(x_{0},b)=0 for any pair (x_{0},b). On the other hand, if all the eigenvalues of A lie in the open right half plane (or equivalently, -A is stable), then the open-loop system (1) is unstable. In particular, there does not exist a pair (x_{0},b) such that \eta_{\min}(x_{0},b)=0. For this reason, we assume in the sequel that all the eigenvalues of A lie in the open right half plane, and we call such a system matrix A completely unstable.

Problem formulation. In this paper, we optimize b, with \|b\|=1 fixed, for minimizing the energy consumption \eta_{\min}(x_{0},b) whereas the initial condition x_{0} is chosen so as to maximize \eta_{\min}(x_{0},b) for a fixed b. More specifically, we investigate the following minimax problem:

 \begin{array}{l}\phi:=\min_{b}\max_{x_{0}}x_{0}^{\top}W_{A}(b)^{-1}x_{0},\\ \mathrm{s.t.}\quad\|x_{0}\|=1\quad\mathrm{and}\quad\|b\|=1.\end{array} (4)

For an arbitrary real symmetric matrix M, we denote by \lambda_{\max}M (resp. \lambda_{\min}M) the largest (resp. smallest) eigenvalue of M. Note that the matrix W_{A}(b) is symmetric and positive semi-definite. Thus, for a fixed vector b\in S^{n-1}, where S^{n-1} denotes the unit sphere in \mathbb{R}^{n}, we have

 \max_{x\in S^{n-1}}x^{\top}W_{A}(b)^{-1}x=\lambda_{\max}W_{A}(b)^{-1}=\left(\lambda_{\min}W_{A}(b)\right)^{-1}. (5)

In the case W_{A}(b) is singular, we set \lambda_{\max}W_{A}(b)^{-1} to be infinity. From (5), we obtain

 \phi=\min_{b\in S^{n-1}}\max_{x\in S^{n-1}}x^{\top}W_{A}(b)^{-1}x=\min_{b\in S^{n-1}}\lambda_{\max}W_{A}(b)^{-1}.

Put differently, the original minimax problem can also be viewed as an optimization problem that minimizes the largest eigenvalue of W_{A}(b)^{-1} (or equivalently, maximizes the smallest eigenvalue of W_{A}(b)) over b\in S^{n-1}. We further define \arg\phi to be the set of pairs (x,b) in S^{n-1}\times S^{n-1} satisfying the following properties:

1. For the vector b, we have \lambda_{\max}W_{A}(b)^{-1}=\phi, i.e., the choice of b minimizes \lambda_{\max}W_{A}(b)^{-1}.

2. The vector x is an eigenvector of W_{A}(b)^{-1} corresponding to its largest eigenvalue, and hence

 x^{\top}W_{A}(b)^{-1}x=\phi.

Our objective is thus to compute both \phi and \arg\phi.

There have been a few studies in recent years related to the general problem of actuator design (and its dual, sensor design) for minimal control energy. For example, Belabbas (2016) investigated how to place an actuator of system (1) so as to minimize an infinite-horizon quadratic cost function:

 \eta=\lim_{T\to\infty}\frac{1}{T}\intop_{0}^{T}(x^{\top}Qx+u^{\top}u)dt.

However, the initial condition x_{0} there is not chosen to be in the worst case, but rather treated as a random variable drawn from a rotationally invariant distribution. A fairly complete solution was provided there, under the assumptions that the system matrix A is stable and that the norm of the actuator \|b\| is not too large. There are other investigations into this problem, most of which are ad hoc or application specific (see, for example, Hiramoto et al. (2000); Chen and Rowley (2014); Singiresu et al. (1991)). Besides the works on finite-dimensional linear systems, there has also been ample work on the infinite-dimensional case: we refer to Morris (2011) and references therein. For problems specifically concerned with minimizing control energy, we mention the general investigation of various control energy measures in Pasqualetti et al. (2014). We also note the work of Olshevsky (2016), which considered a similar problem for discrete-time dynamical systems and established bounds on the smallest eigenvalue of the corresponding discrete-time controllability Gramian, and the work of Dhingra et al. (2014), in which the authors use an L1 optimization approach to promote sparsity of a controller in related scenarios.

Outline of contribution. We solve in the paper the minimax problem (4) for the case where the system matrix A is positive definite, i.e., A is a symmetric matrix with positive eigenvalues. We compute explicitly the value of \phi. Furthermore, we provide a complete characterization of the set \arg\phi, i.e., we determine the optimal actuators, as well as the corresponding worst-case initial conditions. Even though we have made the assumption that A is symmetric so as to simplify the problem (as we will see at the beginning of Section 2, it suffices to consider the case where A is a diagonal matrix), the analysis needed for solving the minimax problem is far from trivial. Indeed, the properties we establish for the set \arg\phi provide insights for solving the minimax problem in a general context (i.e., A is arbitrary). For example, we show in the paper that if a pair (x,b) lies in \arg\phi, then the signs of the entries of x and b exhibit an interlacing pattern:

 \operatorname{sgn}(x)=\pm\begin{bmatrix}-1&&&\\ &1&&\\ &&\ddots&\\ &&&(-1)^{n}\end{bmatrix}\operatorname{sgn}(b)

where \operatorname{sgn}(\cdot) denotes the sign function applying on a vector entry-wise (a precise definition is given in the next section). Such a property has also been observed via simulations for general cases where A is not necessarily symmetric.

The remainder of the paper is organized as follows: In Section 2, we first introduce definitions and key notation, and then state the main result of the paper, Theorem 1, which completely determines \phi and \arg\phi. The rest of Section 2 establishes various properties (e.g., the interlacing sign pattern for a pair (x,b)\in\arg\phi) that are needed to prove the main result. We then provide conclusions and an outlook. The paper ends with an appendix which contains a proof of a technical result.

## 2. Main result

We assume that the system matrix A is positive definite, and denote by \lambda_{1},\ldots,\lambda_{n} its eigenvalues. We further make the generic assumption that the eigenvalues of A are pairwise distinct. We re-arrange the order of the \lambda_{i}’s so that

 0<\lambda_{1}<\cdots<\lambda_{n}.

Now, let \Theta be the orthogonal matrix that diagonalizes A, i.e., A=\Theta\Lambda\Theta^{\top}, with \Lambda a diagonal matrix given by \Lambda:=\operatorname{diag}(\lambda_{1},\ldots,\lambda_{n}). Note that if we let x^{\prime}:=\Theta^{\top}x and b^{\prime}:=\Theta^{\top}b, then by computation,

 x^{\top}W_{A}(b)^{-1}x=x^{\prime\top}W_{\Lambda}(b^{\prime})^{-1}x^{\prime}.

This, in particular, implies that

 \min_{b\in S^{n-1}}\max_{x\in S^{n-1}}x^{\top}W_{A}(b)^{-1}x=\min_{b^{\prime}\in S^{n-1}}\max_{x^{\prime}\in S^{n-1}}x^{\prime\top}W_{\Lambda}(b^{\prime})^{-1}x^{\prime}.

Thus, by following this fact, we can assume without loss of any generality that the matrix A is itself a diagonal matrix, i.e., A=\Lambda. We will take such an assumption for the remainder of the paper. Further, for ease of notation, we will suppress the sub-index A of the matrix W_{A}(b), and simply write W(b).

We will now describe the main result of the paper. To proceed, we first introduce some definitions and notations. Let {\mathbf{1}}\in\mathbb{R}^{n} be a vector of all ones. We define a positive definite matrix \Psi as follows:

 \Psi:=W({\mathbf{1}})=\intop^{\infty}_{0}e^{-At}{\mathbf{1}}{\mathbf{1}}^{\top}e^{-A^{\top}t}dt.

Since A=\operatorname{diag}(\lambda_{1},\ldots,\lambda_{n}), we obtain by computation that \Psi is a Cauchy matrix (Schechter (1959)) given by

 \Psi=\left[\frac{1}{\lambda_{i}+\lambda_{j}}\right]_{ij}, (6)

i.e., 1/(\lambda_{i}+\lambda_{j}) is the ij-th entry of \Psi. Note that with the matrix \Psi defined above, one can express W(b) as follows:

 W(b)=\operatorname{diag}(b)\Psi\operatorname{diag}(b).
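As a quick sanity check of (6) and of the factorization above, the following minimal sketch builds \Psi in exact rational arithmetic and compares one entry of W(b) with a trapezoidal approximation of the integral (3). The eigenvalues, the vector b, the truncation horizon, and all function names are illustrative assumptions and do not come from the paper.

```python
from fractions import Fraction
import math


def cauchy_psi(lams):
    """Psi[i][j] = 1 / (lam_i + lam_j), the Cauchy matrix in (6)."""
    return [[Fraction(1, li + lj) for lj in lams] for li in lams]


def gramian(lams, b):
    """W(b) = diag(b) Psi diag(b) for A = diag(lams)."""
    psi = cauchy_psi(lams)
    n = len(lams)
    return [[b[i] * psi[i][j] * b[j] for j in range(n)] for i in range(n)]


def gramian_entry_by_quadrature(lams, b, i, j, horizon=20.0, steps=100_000):
    """Trapezoidal approximation of entry (i, j) of the integral in (3)."""
    h = horizon / steps

    def integrand(t):
        return math.exp(-lams[i] * t) * b[i] * b[j] * math.exp(-lams[j] * t)

    total = 0.5 * (integrand(0.0) + integrand(horizon))
    total += sum(integrand(k * h) for k in range(1, steps))
    return total * h


lams = [1, 2, 3]   # hypothetical eigenvalues of A (positive integers)
b = [1, -2, 3]     # hypothetical actuator, left unnormalized to stay exact
W = gramian(lams, b)
approx = gramian_entry_by_quadrature(lams, b, 0, 1)
print(W[0][1], approx)  # exact entry next to its quadrature estimate
```

The identity holds for any b, so the sketch does not normalize b; normalizing only rescales W(b) by a positive constant.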

We further need the following definition:

###### Definition 1 (Signature matrix)

A real n\times n matrix M is a signature matrix if it is a diagonal matrix, and the absolute value of each diagonal entry is one.

We denote by \Sigma the set of all n\times n signature matrices, and by \sigma an element in \Sigma. It should be clear that \Sigma is a finite set, with 2^{n} diagonal matrices in total. Among these signature matrices, there is a special signature matrix \sigma_{*} of particular interest, which is defined as follows:

 \sigma_{*}:=\begin{bmatrix}-1&&&\\ &1&&\\ &&\ddots&\\ &&&(-1)^{n}\end{bmatrix}. (7)

Note that the diagonal entries of \sigma_{*} exhibit an interlacing sign pattern. With the definitions and notations above, we are now in a position to state the main result of the paper:

###### Theorem 1

The following hold for \phi and \arg\phi:

1. Let \Psi and \sigma_{*} be defined in (6) and (7), respectively. Then, \phi={\mathbf{1}}^{\top}\sigma_{*}\Psi^{-1}\sigma_{*}{\mathbf{1}}.

2. The entries of the vector \sigma_{*}\Psi^{-1}\sigma_{*}{\mathbf{1}} are positive. Let v_{*} be the (unique) vector of positive entries such that

 \begin{bmatrix}v^{2}_{*,1}\\ \vdots\\ v^{2}_{*,n}\end{bmatrix}=\frac{1}{\phi}\sigma_{*}\Psi^{-1}\sigma_{*}{\mathbf{1}}. (8)

Then, \arg\phi has 2^{n+1} elements:

 \arg\phi=\{(\pm\sigma v_{*},\sigma\sigma_{*}v_{*})\mid\sigma\in\Sigma\}. (9)
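To make the formulas of Theorem 1 concrete, here is a minimal sketch that evaluates \phi and the squared entries of v_{*} in exact rational arithmetic by solving \Psi y=\sigma_{*}{\mathbf{1}} instead of forming \Psi^{-1}. The eigenvalues chosen and the helper names `solve_exact` and `optimal_design` are hypothetical and serve only as an illustration.

```python
from fractions import Fraction


def solve_exact(mat, rhs):
    """Solve mat @ y = rhs exactly by Gauss-Jordan elimination (mat invertible)."""
    n = len(mat)
    aug = [row[:] + [r] for row, r in zip(mat, rhs)]
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        pivot = aug[col][col]
        aug[col] = [x / pivot for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def optimal_design(lams):
    """Return (phi, [v_{*,1}^2, ..., v_{*,n}^2]) for A = diag(lams)."""
    n = len(lams)
    signs = [(-1) ** (i + 1) for i in range(n)]          # diagonal of sigma_* in (7)
    psi = [[Fraction(1, li + lj) for lj in lams] for li in lams]
    y = solve_exact(psi, [Fraction(s) for s in signs])   # y = Psi^{-1} sigma_* 1
    w = [s * yi for s, yi in zip(signs, y)]              # w = sigma_* Psi^{-1} sigma_* 1
    phi = sum(w)                                         # phi = 1^T w
    return phi, [wi / phi for wi in w]                   # right-hand side of (8)


phi, v_sq = optimal_design([1, 2])   # hypothetical eigenvalues 1 < 2
print(phi, v_sq)                     # phi = 102, squared entries 7/17 and 10/17
```

For these eigenvalues one finds \sigma_{*}\Psi^{-1}\sigma_{*}{\mathbf{1}}=(42,60)^{\top}, so \phi=102 and the optimal actuators are (\pm\sqrt{7/17},\pm\sqrt{10/17}).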

The remainder of the section is devoted to the proof of Theorem 1. We will first establish in Subsection 2.1 the sign pattern for a pair (x,b)\in\arg\phi. Then, in Subsection 2.2, we reformulate the minimax problem as an optimization problem maximizing \lambda_{\min}W(b), and establish a necessary condition for a vector b\in S^{n-1} to be a critical point of the function \lambda_{\min}W(b). A complete proof of Theorem 1 is provided in Subsection 2.3.

### 2.1. Sign pattern of optimal pairs

To proceed, we first introduce some definitions and notation. We denote by \operatorname{sgn}(\cdot) the sign function, i.e., for a scalar x\in\mathbb{R}, we let

 \operatorname{sgn}(x):=\left\{\begin{array}[]{ll}1&x>0,\\ 0&x=0,\\ -1&x<0.\end{array}\right.

Then, for an arbitrary matrix A=[a_{ij}]_{ij}\in\mathbb{R}^{m\times n} (or simply a vector), we define \operatorname{sgn}(A) by letting \operatorname{sgn} act on each entry of A:

 \operatorname{sgn}(A):=\left[\operatorname{sgn}(a_{ij})\right]_{ij}.

We further let \operatorname{abs}(A) be the nonnegative matrix obtained by replacing each entry of A with its absolute value. We note that if v is a vector, then \operatorname{abs}(v) can be expressed as follows:

 \operatorname{abs}(v)=\operatorname{diag}(\operatorname{sgn}(v))v.

With the definitions above, we establish below the following result:

###### Proposition 1

Let (x_{*},b_{*}) be a pair in \arg\phi. Then, all the entries of x_{*} and of b_{*} are nonzero. Moreover,

 \operatorname{sgn}(x_{*})=\pm\sigma_{*}\operatorname{sgn}(b_{*}), (10)

where \sigma_{*} is the signature matrix defined in (7).

We establish below Proposition 1. First, recall that the matrix \Psi\in\mathbb{R}^{n\times n}, defined in (6), is positive definite. It then follows that

 W(b)=\operatorname{diag}(b)\Psi\operatorname{diag}(b)

is positive semi-definite, and is positive definite if and only if the entries of b are all nonzero. Indeed, the number of zero eigenvalues of W(b) is the same as the number of zero entries of b. We also note that for a fixed vector b\in\mathbb{R}^{n},

 \max_{x\in S^{n-1}}x^{\top}W(b)^{-1}x=\lambda_{\max}W(b)^{-1}.

Hence, if b_{1},b_{2}\in S^{n-1} are two vectors such that b_{1} has no zero entry and b_{2} has at least one zero entry, then

 \lambda_{\max}W(b_{1})^{-1}<\lambda_{\max}W(b_{2})^{-1}=\infty.

The arguments above then imply the following fact:

###### Lemma 1

If (x_{*},b_{*}) is in \arg\phi, then all the entries of b_{*} are nonzero.

Now, following Lemma 1, we fix a vector b\in\mathbb{R}^{n} with no zero entry. It then follows that the matrix W(b) is invertible, with its inverse given by

 W(b)^{-1}=\operatorname{diag}(b)^{-1}\Psi^{-1}\operatorname{diag}(b)^{-1}.

We now compute explicitly \Psi^{-1}. To do so, we first recall a relevant fact about the determinant of a Cauchy matrix:

###### Lemma 2

Let \{\alpha_{i}\}^{n}_{i=1} and \{\beta_{i}\}^{n}_{i=1} be two sets of positive numbers, and M be an n\times n Cauchy matrix:

 M:=\left[\frac{1}{\alpha_{i}+\beta_{j}}\right]_{ij}.

Then, the determinant of M is given by

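 \det(M)=\prod^{n}_{k=1}\frac{1}{\alpha_{k}+\beta_{k}}\prod_{1\leq i<j\leq n}\frac{(\alpha_{j}-\alpha_{i})(\beta_{j}-\beta_{i})}{(\alpha_{i}+\beta_{j})(\alpha_{j}+\beta_{i})}. (11)

In particular, \det(M)>0 whenever both the \alpha_{i}'s and the \beta_{i}'s are strictly increasing in i, which is the case in every use of the lemma below.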

This fact has certainly been observed before (see, for example, Schechter (1959)). We reproduce a proof in the Appendix for the sake of completeness.
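The determinant identity can be checked exactly on a small instance. The sketch below assumes the classical form of the Cauchy determinant, namely the product of (\alpha_{j}-\alpha_{i})(\beta_{j}-\beta_{i}) over i<j divided by the product of (\alpha_{i}+\beta_{j}) over all i,j, and compares it with a determinant computed by fraction-exact Gaussian elimination. The sample values and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def det_exact(mat):
    """Determinant by Gaussian elimination in exact rational arithmetic."""
    m = [row[:] for row in mat]
    n, det = len(m), Fraction(1)
    for col in range(n):
        piv = next((r for r in range(col, n) if m[r][col] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != col:
            m[col], m[piv] = m[piv], m[col]
            det = -det                      # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return det


def cauchy_det_closed_form(alpha, beta):
    """Classical Cauchy determinant for M[i][j] = 1 / (alpha_i + beta_j)."""
    num = Fraction(1)
    for i, j in combinations(range(len(alpha)), 2):
        num *= (alpha[j] - alpha[i]) * (beta[j] - beta[i])
    den = Fraction(1)
    for a in alpha:
        for c in beta:
            den *= a + c
    return num / den


alpha = [Fraction(1), Fraction(3), Fraction(4)]   # hypothetical, increasing
beta = [Fraction(2), Fraction(5), Fraction(7)]    # hypothetical, increasing
M = [[1 / (a + c) for c in beta] for a in alpha]
print(det_exact(M) == cauchy_det_closed_form(alpha, beta))
```

Because both sequences are increasing here, every factor in the numerator is positive, which is the situation that arises for \Psi and for its submatrices.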

With the lemma at hand, we return to the computation of \Psi^{-1}: First, we write

 \Psi^{-1}=\frac{1}{\det(\Psi)}\operatorname{Co}(\Psi)^{\top},

where \operatorname{Co}(\Psi)=[\operatorname{Co}(\Psi)_{ij}]_{ij} is the cofactor matrix of \Psi:

 \operatorname{Co}(\Psi)_{ij}=(-1)^{i+j}\det\left(M_{ij}\right),

with M_{ij} the submatrix of \Psi obtained by deleting the i-th row and the j-th column of \Psi. Since \Psi is a Cauchy matrix, we use (11) to obtain

 \det(\Psi)=\prod^{n}_{k=1}\frac{1}{2\lambda_{k}}\prod_{1\leq i<j\leq n}\left(\frac{\lambda_{j}-\lambda_{i}}{\lambda_{j}+\lambda_{i}}\right)^{2}.

We further note that each submatrix M_{ij} is also a Cauchy matrix, built from the increasing sequences \{\lambda_{k}\}_{k\neq i} and \{\lambda_{k}\}_{k\neq j}. Thus, using (11) again, we have

 \det\left(M_{ij}\right)=\frac{\prod_{\substack{p<q\\ p,q\neq i}}(\lambda_{q}-\lambda_{p})\prod_{\substack{p<q\\ p,q\neq j}}(\lambda_{q}-\lambda_{p})}{\prod_{p\neq i}\prod_{q\neq j}(\lambda_{p}+\lambda_{q})}.

In particular, \det(M_{ij})>0. Combining the computations above, we have the following fact for \Psi^{-1}:

###### Lemma 3

Let \Psi^{ij} be the ij-th entry of \Psi^{-1}. Then,

 \Psi^{ij}=\frac{4\lambda_{i}\lambda_{j}}{\lambda_{i}+\lambda_{j}}\prod_{k\neq i}\frac{\lambda_{i}+\lambda_{k}}{\lambda_{i}-\lambda_{k}}\prod_{k\neq j}\frac{\lambda_{j}+\lambda_{k}}{\lambda_{j}-\lambda_{k}}. (12)

In particular, \Psi^{-1} has the checkerboard sign pattern:

 \operatorname{sgn}\left(\Psi^{-1}\right)=\left[(-1)^{i+j}\right]_{ij}.
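The checkerboard pattern is easy to confirm on an example. The following minimal sketch inverts \Psi exactly and tests the sign of every entry; the eigenvalues and the helper names are hypothetical, and the check is an illustration rather than a proof.

```python
from fractions import Fraction


def invert_exact(mat):
    """Inverse by Gauss-Jordan elimination in exact rational arithmetic."""
    n = len(mat)
    aug = [row[:] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(mat)]
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        pivot = aug[col][col]
        aug[col] = [x / pivot for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def has_checkerboard_signs(mat):
    """True if sgn(mat[i][j]) = (-1)^(i+j) for all i, j (no zero entries)."""
    n = len(mat)
    return all(
        mat[i][j] != 0 and (mat[i][j] > 0) == ((i + j) % 2 == 0)
        for i in range(n) for j in range(n)
    )


lams = [1, 2, 5, 7]   # hypothetical eigenvalues, strictly increasing
psi = [[Fraction(1, li + lj) for lj in lams] for li in lams]
psi_inv = invert_exact(psi)
print(has_checkerboard_signs(psi_inv))
```

Equivalently, every entry of \sigma_{*}\Psi^{-1}\sigma_{*} is positive, which is the form in which the sign pattern is used in the sequel.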

We recall that a square matrix M is said to be irreducible if there does not exist a permutation matrix P such that PMP^{\top} is a block upper triangular matrix. We also recall from the Perron-Frobenius theorem (see, for example, Gantmakher (1998)) that if M is an irreducible matrix of positive entries, then M has a unique largest eigenvalue \lambda. Furthermore, if v is an eigenvector of M corresponding to the eigenvalue \lambda, then by appropriate scaling, one has that \|v\|=1 and \operatorname{sgn}(v)={\mathbf{1}}. The following fact is then an immediate consequence of Lemma 3:

###### Corollary 1

The matrix \Psi is positive definite. It has a unique largest eigenvalue \lambda_{\max} and a unique smallest eigenvalue \lambda_{\min}. Moreover, if we let v_{\max} (resp. v_{\min}) be the eigenvector of \Psi corresponding to the eigenvalue \lambda_{\max} (resp. \lambda_{\min}), then by appropriate scaling, we have

 \operatorname{sgn}(v_{\max})={\mathbf{1}}\hskip 5.0pt\mbox{ and }\hskip 5.0pt% \operatorname{sgn}(v_{\min})=\sigma_{*}{\mathbf{1}}.

Proof. First, note that \Psi is an irreducible matrix of positive entries, and hence from the Perron-Frobenius theorem, it has a unique largest eigenvalue \lambda_{\max}, and we can choose v_{\max} such that \operatorname{sgn}(v_{\max})={\mathbf{1}}. On the other hand, from Lemma 3, we have that \sigma_{*}\Psi^{-1}\sigma_{*} is also an irreducible matrix of positive entries. We let \lambda^{\prime}_{\max} be the unique largest eigenvalue of \sigma_{*}\Psi^{-1}\sigma_{*}, and u_{\max} be an eigenvector of \sigma_{*}\Psi^{-1}\sigma_{*} corresponding to the eigenvalue \lambda^{\prime}_{\max} with \operatorname{sgn}(u_{\max})={\mathbf{1}}. Since \Psi^{-1} is similar to \sigma_{*}\Psi^{-1}\sigma_{*} via the signature matrix \sigma_{*}, we have that \lambda^{\prime}_{\max} is the unique largest eigenvalue of \Psi^{-1}, with \sigma_{*}u_{\max} a corresponding eigenvector. Note, in particular, that \operatorname{sgn}(\sigma_{*}u_{\max})=\sigma_{*}{\mathbf{1}}. Now, let v_{\min}:=\sigma_{*}u_{\max}. It then follows that

 \Psi v_{\min}=\frac{1}{\lambda^{\prime}_{\max}}v_{\min},

with 1/\lambda^{\prime}_{\max} the unique smallest eigenvalue of \Psi. \hfill\blacksquare
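The construction in the proof can be mirrored numerically. The sketch below runs a plain power iteration on the positive matrix \sigma_{*}\Psi^{-1}\sigma_{*} to approximate u_{\max}, sets v_{\min}=\sigma_{*}u_{\max}, and measures how well v_{\min} satisfies the eigenvalue equation for \Psi. The eigenvalues, the iteration count, and the helper names are assumptions chosen for illustration; power iteration is used here only because it needs no external library.

```python
from fractions import Fraction
import math


def invert_exact(mat):
    """Inverse by Gauss-Jordan elimination in exact rational arithmetic."""
    n = len(mat)
    aug = [row[:] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(mat)]
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        pivot = aug[col][col]
        aug[col] = [x / pivot for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def matvec(mat, v):
    return [sum(a * x for a, x in zip(row, v)) for row in mat]


def power_iteration(mat, iters=500):
    """Unit-norm approximation of the dominant eigenvector, started from ones."""
    v = [1.0] * len(mat)
    for _ in range(iters):
        w = matvec(mat, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


lams = [1, 2, 3]                                   # hypothetical eigenvalues
n = len(lams)
sigma = [(-1) ** (i + 1) for i in range(n)]        # diagonal of sigma_*
psi = [[Fraction(1, li + lj) for lj in lams] for li in lams]
psi_inv = invert_exact(psi)
positive = [[float(sigma[i] * psi_inv[i][j] * sigma[j]) for j in range(n)]
            for i in range(n)]                     # sigma_* Psi^{-1} sigma_*
u_max = power_iteration(positive)
v_min = [s * u for s, u in zip(sigma, u_max)]      # v_min = sigma_* u_max
psi_float = [[float(x) for x in row] for row in psi]
image = matvec(psi_float, v_min)
mu = sum(a * b for a, b in zip(image, v_min))      # Rayleigh quotient of v_min
residual = max(abs(a - mu * b) for a, b in zip(image, v_min))
print(mu, residual)
```

A small residual indicates that v_{\min} is, up to rounding, an eigenvector of \Psi whose entries alternate in sign as in \sigma_{*}{\mathbf{1}}.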

With the preliminary results established above, we are now in a position to prove Proposition 1.

Proof of Proposition 1. Let (x_{*},b_{*}) be a pair in \arg\phi. From Lemma 1, the entries of b_{*} are nonzero. We recall that \operatorname{abs}(b_{*})\in\mathbb{R}^{n} is defined by replacing each entry of b_{*} with its absolute value, and

 \operatorname{abs}(b_{*})=\operatorname{diag}(\operatorname{sgn}(b_{*}))b_{*}.

Now, we define a matrix M as follows:

 M:=\sigma_{*}\operatorname{diag}(\operatorname{abs}(b_{*}))^{-1}\Psi^{-1}\operatorname{diag}(\operatorname{abs}(b_{*}))^{-1}\sigma_{*}.

Then, from Lemma 3, M is an irreducible matrix of positive entries. Appealing to the Perron-Frobenius theorem, we know that there is a unique largest eigenvalue \lambda of M. We further let v be the eigenvector of M corresponding to the eigenvalue \lambda, with \|v\|=1 and \operatorname{sgn}(v)={\mathbf{1}}.

Now, recall that W(b_{*})^{-1} is given by:

 W(b_{*})^{-1}=\operatorname{diag}(b_{*})^{-1}\Psi^{-1}\operatorname{diag}(b_{*% })^{-1}.

We thus obtain

 M=\sigma_{*}\operatorname{diag}(\operatorname{sgn}(b_{*}))W(b_{*})^{-1}\operatorname{diag}(\operatorname{sgn}(b_{*}))\sigma_{*}.

In other words, M and W(b_{*})^{-1} are related by a similarity transformation, via the signature matrix \sigma_{*}\operatorname{diag}(\operatorname{sgn}(b_{*})). This, in particular, implies that the matrix W(b_{*})^{-1} has \lambda as its unique largest eigenvalue. Moreover, if we let

 u:=\sigma_{*}\operatorname{diag}(\operatorname{sgn}(b_{*}))v,

then u is an eigenvector of W(b_{*})^{-1} corresponding to the eigenvalue \lambda. We further note that

 \|u\|=\|\sigma_{*}\operatorname{diag}(\operatorname{sgn}(b_{*}))v\|=1,

and hence for fixed b_{*}, the only possible solutions for x_{*}\in S^{n-1} (such that (x_{*},b_{*})\in\arg\phi) are given by x_{*}=\pm u. Then, using the fact that \operatorname{sgn}(v)={\mathbf{1}}, we conclude that

 \operatorname{sgn}(x_{*})=\pm\operatorname{sgn}(u)=\pm\sigma_{*}\operatorname{sgn}(b_{*}),

which completes the proof. \hfill\blacksquare

### 2.2. Critical points of the potential function

For a vector b\in S^{n-1}, we let \xi(b) be the smallest eigenvalue of the matrix W(b). We think of

 \xi:b\mapsto\lambda_{\min}W(b)

as a potential function defined over S^{n-1}. Let Z be a proper subset of S^{n-1} defined by collecting any vector v\in S^{n-1} with nonzero entries:

 Z:=\left\{v\in S^{n-1}\mid|v_{i}|>0,\hskip 5.0pt\forall\,i=1,\ldots,n\right\}.

We note that for any b\in Z, the matrix W(b) is nonsingular, and hence \xi(b)>0. By the same arguments as used to establish Lemma 1, we know that \max_{b\in S^{n-1}}\xi(b) is achieved by a vector in Z; indeed, the set S^{n-1}\setminus Z is comprised of the global minima of \xi. We also note that if b\in Z, then from the proof of Proposition 1, the smallest eigenvalue of W(b) is unique. Thus, the corresponding eigenspace is of dimension one. Now, let b_{*}\in Z be a global maximum point of \xi, and x_{*}\in S^{n-1} be an eigenvector of W(b_{*}) corresponding to the smallest eigenvalue of W(b_{*}). Then, it should be clear that (x_{*},b_{*})\in\arg\phi. Conversely, if (x_{*},b_{*}) is in \arg\phi, then b_{*} maximizes \xi(b). It thus suffices to locate the global maxima of \xi.

For a vector b\in S^{n-1}, we denote by D_{b}\xi the derivative of \xi at b. The map D_{b}\xi sends a vector v in T_{b}S^{n-1}—the tangent space of S^{n-1} at b—to a real number. The vector b is said to be a critical point of \xi if D_{b}\xi is identically zero, i.e., D_{b}\xi(v)=0 for all v\in T_{b}S^{n-1}. Note that a local maximum point of \xi is necessarily a critical point of \xi. The following fact then presents a necessary condition for a vector b\in Z to be a critical point of \xi:

###### Proposition 2

Let b\in Z be a critical point of \xi, with \lambda:=\xi(b) the (unique) smallest eigenvalue of W(b). Let x\in S^{n-1} be an eigenvector of W(b) corresponding to the eigenvalue \lambda. Then, the following holds:

 \left\{\begin{array}[]{l}W(b)x=\lambda x\\ W(x)b=\lambda b.\end{array}\right. (13)

We establish below Proposition 2. To proceed, we evaluate D_{b}\xi for a vector b\in Z. First, note that the tangent space of S^{n-1} at b is given by

 T_{b}S^{n-1}=\left\{v\in\mathbb{R}^{n}\mid v^{\top}b=0\right\}. (14)

We then note the following fact: Let M be an arbitrary symmetric matrix, with \lambda a simple eigenvalue and v\in S^{n-1} a corresponding eigenvector. Suppose that we perturb M to (M+\epsilon N) for N symmetric and \epsilon sufficiently small. Then, up to the first order of \epsilon, the perturbed eigenvalue \lambda(\epsilon) of (M+\epsilon N) is given by \lambda(\epsilon)=\lambda+\epsilon v^{\top}Nv. We further recall, from the proof of Proposition 1, that for any vector b\in Z, the matrix W(b)^{-1} (resp. W(b)) has a unique largest (resp. smallest) eigenvalue. The arguments above then imply the following fact:

###### Lemma 4

Let b\in Z, and v\in T_{b}S^{n-1}. Then,

 D_{b}\xi(v)=2v^{\top}W(x)b, (15)

where x\in S^{n-1} is an eigenvector of W(b) corresponding to the smallest eigenvalue of W(b).

Proof. The lemma follows directly from computation; indeed, if we perturb b to b+\epsilon v, then up to the first order of \epsilon, the perturbed matrix W(b+\epsilon v) is given by

 W(b+\epsilon v)=W(b)+\epsilon\left(\operatorname{diag}(v)\Psi\operatorname{diag}(b)+\operatorname{diag}(b)\Psi\operatorname{diag}(v)\right).

Since W(b) has a unique smallest eigenvalue, for \epsilon small, the perturbed matrix W(b+\epsilon v) also has a unique smallest eigenvalue, which is given by

 \lambda_{\min}W(b+\epsilon v)=\lambda_{\min}W(b)+2\epsilon x^{\top}\operatorname{diag}(v)\Psi\operatorname{diag}(b)x+{\mathrm{o}}(\epsilon).

We further note that

 x^{\top}\operatorname{diag}(v)\Psi\operatorname{diag}(b)x=v^{\top}W(x)b,

and hence D_{b}\xi(v)=2v^{\top}W(x)b. \hfill\blacksquare
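The derivative formula (15) can be compared with a finite-difference quotient. The sketch below does this for n=2, where the smallest eigenvalue of a symmetric 2\times 2 matrix and its eigenvector have closed forms. The eigenvalues, the base point b, the step size, and all function names are hypothetical choices made for this illustration.

```python
import math


def gramian2(lams, b):
    """W(b) = diag(b) Psi diag(b) for n = 2, stored as (w11, w12, w22)."""
    l1, l2 = lams
    return (b[0] * b[0] / (2 * l1), b[0] * b[1] / (l1 + l2), b[1] * b[1] / (2 * l2))


def smallest_eig(w):
    """Smallest eigenvalue and a unit eigenvector of [[w11, w12], [w12, w22]]."""
    w11, w12, w22 = w
    lam = (w11 + w22) / 2 - math.hypot((w11 - w22) / 2, w12)
    x = (w12, lam - w11)          # solves (w11 - lam) x1 + w12 x2 = 0, needs w12 != 0
    norm = math.hypot(x[0], x[1])
    return lam, (x[0] / norm, x[1] / norm)


def xi(lams, b):
    """xi(b) = smallest eigenvalue of W(b)."""
    return smallest_eig(gramian2(lams, b))[0]


def derivative_formula(lams, b, v):
    """Right-hand side of (15): 2 v^T W(x) b with x the unit eigenvector for xi(b)."""
    _, x = smallest_eig(gramian2(lams, b))
    w11, w12, w22 = gramian2(lams, x)
    wxb = (w11 * b[0] + w12 * b[1], w12 * b[0] + w22 * b[1])
    return 2 * (v[0] * wxb[0] + v[1] * wxb[1])


lams = (1.0, 2.0)                                  # hypothetical eigenvalues
theta = 0.7                                        # hypothetical base point on the circle
b = (math.cos(theta), math.sin(theta))
v = (-math.sin(theta), math.cos(theta))            # tangent vector: v^T b = 0
eps = 1e-6
b_plus = (b[0] + eps * v[0], b[1] + eps * v[1])
b_minus = (b[0] - eps * v[0], b[1] - eps * v[1])
finite_difference = (xi(lams, b_plus) - xi(lams, b_minus)) / (2 * eps)
print(finite_difference, derivative_formula(lams, b, v))
```

The perturbed points leave the unit circle only at second order in the step size, so the central difference still approximates the derivative along the tangent direction.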

With the preliminaries above, we are now in a position to prove Proposition 2:

Proof of Proposition 2. Since b\in Z is a critical point of \xi, we have that for any vector v\in T_{b}S^{n-1}, D_{b}\xi(v)=0. From (15), we have v^{\top}W(x)b=0. Since the expression above holds for all v\in T_{b}S^{n-1}, we know from (14) that W(x)b=\mu b, for some constant \mu. It now suffices to show that \mu=\lambda. To see this, we first note that

 b^{\top}W(x)b=\mu b^{\top}b=\mu. (16)

On the other hand, we have W(b)x=\lambda x, and hence

 x^{\top}W(b)x=\lambda x^{\top}x=\lambda. (17)

Since the left hand side of (16) coincides with the left hand side of (17), we conclude that \mu=\lambda. \hfill\blacksquare

### 2.3. Proof of Theorem 1

We prove here Theorem 1. The proof relies on the use of Proposition 2. More specifically, we prove Theorem 1 by establishing the following fact as a corollary to Proposition 2:

###### Corollary 2

There are 2^{n} isolated critical points of the potential function \xi over the set Z. They are given by \{\sigma v_{*}\mid\sigma\in\Sigma\}, where v_{*} is a positive vector defined in (8). Furthermore, the following properties hold:

1. The function \xi holds the same value at each of these critical points:

 \xi(\sigma v_{*})=\frac{1}{{\mathbf{1}}^{\top}\sigma_{*}\Psi^{-1}\sigma_{*}{\mathbf{1}}}.

Thus, the 2^{n} critical points form the global maxima of the function \xi.

2. For a critical point \sigma v_{*} of \xi, the two vectors \pm\sigma\sigma_{*}v_{*} are the eigenvectors of the matrix W(\sigma v_{*}) corresponding to its (unique) smallest eigenvalue.

Proof. Let b\in Z be a critical point of \xi, \lambda be the smallest eigenvalue of W(b), and x\in S^{n-1} be an eigenvector of W(b) corresponding to the eigenvalue \lambda. Then, from (13), we have that

 \left\{\begin{array}{l}\Psi\operatorname{diag}(b)x=\lambda\operatorname{diag}(b)^{-1}x\\ \Psi\operatorname{diag}(x)b=\lambda\operatorname{diag}(x)^{-1}b.\end{array}\right. (18)

Since \lambda\neq 0 and

 \operatorname{diag}(b)x=\operatorname{diag}(x)b,

we obtain from (18) that

 \operatorname{diag}(b)^{-1}x=\operatorname{diag}(x)^{-1}b.

In other words, if we let b_{i} (resp. x_{i}) be the i-th entry of b (resp. x), then the expression above implies that

 |x_{i}|=|b_{i}|,\hskip 10.0pt\forall i=1,\ldots,n. (19)

On the other hand, from Proposition 1, we have

 \operatorname{sgn}(x)=\pm\sigma_{*}\operatorname{sgn}(b).

We then combine this fact with (19), and obtain that

 x=\pm\sigma_{*}b. (20)

From (18) and (20), we then have

 \Psi\sigma_{*}\begin{bmatrix}b^{2}_{1}\\ \vdots\\ b^{2}_{n}\end{bmatrix}=\lambda\sigma_{*}{\mathbf{1}},

and hence

 \begin{bmatrix}b^{2}_{1}\\ \vdots\\ b^{2}_{n}\end{bmatrix}=\lambda\sigma_{*}\Psi^{-1}\sigma_{*}{\mathbf{1}}. (21)

We note that \Psi^{-1} has the checkerboard sign pattern, and hence the entries of the matrix \sigma_{*}\Psi^{-1}\sigma_{*} on the right hand side of (21) are positive.

With (21) at hand, we can compute explicitly the scalar \lambda and the vector (b^{2}_{1},\ldots,b^{2}_{n}): First, for the scalar \lambda, we use the fact that \sum^{n}_{i=1}b^{2}_{i}=1, and obtain

 \lambda=\left({\mathbf{1}}^{\top}\sigma_{*}\Psi^{-1}\sigma_{*}{\mathbf{1}}\right)^{-1}.

It then follows that

 \begin{bmatrix}b^{2}_{1}\\ \vdots\\ b^{2}_{n}\end{bmatrix}=\frac{\sigma_{*}\Psi^{-1}\sigma_{*}{\mathbf{1}}}{{\mathbf{1}}^{\top}\sigma_{*}\Psi^{-1}\sigma_{*}{\mathbf{1}}}=\begin{bmatrix}v^{2}_{*,1}\\ \vdots\\ v^{2}_{*,n}\end{bmatrix}.

Note that the expression above uniquely determines the set of critical points of \xi over Z: There are 2^{n} critical points of \xi, in one-to-one correspondence with the signature matrices:

 \left\{\sigma v_{*}\mid\sigma\in\Sigma\right\}.

Moreover, the function \xi holds the same value at each of these critical points:

 \xi(\sigma v_{*})=\lambda=\left({\mathbf{1}}^{\top}\sigma_{*}\Psi^{-1}\sigma_{*}{\mathbf{1}}\right)^{-1}.

This establishes the first item of the corollary. The second item directly follows from (20). \hfill\blacksquare
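For n=2 the conclusion can be cross-checked by brute force. The sketch below evaluates \xi on a fine grid of the unit circle (restricted to the open first quadrant, since \xi is invariant under sign flips of the entries of b) and compares the grid maximum with 1/\phi and with the squared entries of v_{*} obtained from the 2\times 2 inverse of \Psi. The eigenvalues, the grid size, and the function names are hypothetical, and a grid search is of course only a numerical illustration.

```python
import math


def xi(lams, b):
    """Smallest eigenvalue of W(b) = diag(b) Psi diag(b) for n = 2."""
    l1, l2 = lams
    w11, w12, w22 = b[0] ** 2 / (2 * l1), b[0] * b[1] / (l1 + l2), b[1] ** 2 / (2 * l2)
    return (w11 + w22) / 2 - math.hypot((w11 - w22) / 2, w12)


def closed_form(lams):
    """phi and (v_{*,1}^2, v_{*,2}^2) from Theorem 1, specialized to n = 2."""
    l1, l2 = lams
    a, c, d = 1 / (2 * l1), 1 / (l1 + l2), 1 / (2 * l2)   # entries of Psi
    det = a * d - c * c
    row_sums = ((d + c) / det, (a + c) / det)             # sigma_* Psi^{-1} sigma_* 1
    phi = row_sums[0] + row_sums[1]
    return phi, (row_sums[0] / phi, row_sums[1] / phi)


def grid_maximum(lams, steps=20_000):
    """Maximize xi over b = (cos t, sin t) with t on a grid of (0, pi/2)."""
    value, t = max(
        (xi(lams, (math.cos(t), math.sin(t))), t)
        for t in (math.pi / 2 * k / steps for k in range(1, steps))
    )
    return value, (math.cos(t) ** 2, math.sin(t) ** 2)


lams = (1.0, 2.0)                       # hypothetical eigenvalues
phi, v_sq = closed_form(lams)
best_value, best_b_sq = grid_maximum(lams)
print(1 / phi, best_value)              # both close to 1/102
print(v_sq, best_b_sq)                  # both close to (7/17, 10/17)
```

With these eigenvalues the grid maximum agrees with 1/\phi=1/102 up to the grid resolution, and the maximizer has squared entries close to (7/17, 10/17).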

## 3. Conclusions

We considered in the paper the actuator design problem for a linear control system so as to minimize the worst case control energy, where worst case is to be understood with respect to the initial state of the system. This problem is in general difficult, and we focussed here only on systems for which the infinitesimal generator of the dynamics A is positive definite. Under this assumption, we provided a complete characterization of the optimal actuators and the corresponding worst case initial states. We also evaluated the value of the worst-case energy needed for the optimal actuator. Along the way, we highlighted several structural properties of the set of optimal actuators and their corresponding worst case initial states, such as the interlacing sign pattern of their entries. Future work may focus on a general case where A is not necessarily symmetric, and the cases where one has multiple actuators.

## Appendix

We prove here Lemma 2. The proof is carried out by induction on the dimension of the matrix M. For the base case, we have that M is a scalar given by 1/(\alpha_{1}+\beta_{1}). Thus, \det(M)=1/(\alpha_{1}+\beta_{1}), and hence (11) holds.

For the inductive step, we assume that Lemma 2 holds for n=k, and prove for n=k+1. To proceed, we first partition the matrix M into 2\times 2 blocks as follows:

 M=\begin{bmatrix}M_{11}&u\\ v^{\top}&\frac{1}{\alpha_{k+1}+\beta_{k+1}}\end{bmatrix},

with M_{11} a k\times k matrix. Next, we apply the following elementary row operations to the matrix M:

 \begin{bmatrix}I_{k\times k}&-(\alpha_{k+1}+\beta_{k+1})u\\ 0&1\end{bmatrix}\begin{bmatrix}M_{11}&u\\ v^{\top}&\frac{1}{\alpha_{k+1}+\beta_{k+1}}\end{bmatrix},

by which we obtain the following matrix:

 \begin{bmatrix}M_{11}-(\alpha_{k+1}+\beta_{k+1})uv^{\top}&0\\ v^{\top}&\frac{1}{\alpha_{k+1}+\beta_{k+1}}\end{bmatrix}.

Note that the elementary row operations above do not change the determinant of M, and hence

 \det(M)=\frac{\det\left(M_{11}-(\alpha_{k+1}+\beta_{k+1})uv^{\top}\right)}{\alpha_{k+1}+\beta_{k+1}}. (22)

It thus suffices to evaluate the determinant of the matrix M_{11}-({\alpha_{k+1}+\beta_{k+1}})uv^{\top}.

To do so, we first obtain, by computation, the following expression:

 M_{11}-({\alpha_{k+1}+\beta_{k+1}})uv^{\top}=D_{\alpha}M^{\prime}D_{\beta}, (23)

where M^{\prime}\in\mathbb{R}^{k\times k} is given by

 M^{\prime}:=\left[\frac{1}{\alpha_{i}+\beta_{j}}\right]_{1\leq i,j\leq k},

and D_{\alpha},D_{\beta}\in\mathbb{R}^{k\times k} are diagonal matrices given by

 D_{\alpha}:=\begin{bmatrix}\frac{\alpha_{k+1}-\alpha_{1}}{\alpha_{1}+\beta_{k+1}}&&\\ &\ddots&\\ &&\frac{\alpha_{k+1}-\alpha_{k}}{\alpha_{k}+\beta_{k+1}}\end{bmatrix}

and

 D_{\beta}:=\begin{bmatrix}\frac{\beta_{k+1}-\beta_{1}}{\alpha_{k+1}+\beta_{1}}&&\\ &\ddots&\\ &&\frac{\beta_{k+1}-\beta_{k}}{\alpha_{k+1}+\beta_{k}}\end{bmatrix}.

Indeed, the ij-th entry of the left hand side of (23) is

 \frac{1}{\alpha_{i}+\beta_{j}}-\frac{\alpha_{k+1}+\beta_{k+1}}{(\alpha_{i}+\beta_{k+1})(\alpha_{k+1}+\beta_{j})}=\frac{(\alpha_{k+1}-\alpha_{i})(\beta_{k+1}-\beta_{j})}{(\alpha_{i}+\beta_{k+1})(\alpha_{i}+\beta_{j})(\alpha_{k+1}+\beta_{j})}.

From (22) and (23), we have

 \det(M)=\frac{\det(M^{\prime})}{\alpha_{k+1}+\beta_{k+1}}\prod^{k}_{i=1}\frac{(\alpha_{k+1}-\alpha_{i})(\beta_{k+1}-\beta_{i})}{(\alpha_{i}+\beta_{k+1})(\alpha_{k+1}+\beta_{i})}. (24)

We then appeal to the induction hypothesis and obtain the determinant of M^{\prime}:

 \det(M^{\prime})=\prod^{k}_{i=1}\frac{1}{\alpha_{i}+\beta_{i}}\prod_{1\leq i<j\leq k}\frac{(\alpha_{j}-\alpha_{i})(\beta_{j}-\beta_{i})}{(\alpha_{i}+\beta_{j})(\alpha_{j}+\beta_{i})}. (25)

Combining (24) and (25), we conclude that (11) holds. This completes the proof. \hfill\blacksquare

## References

• Belabbas (2016) Belabbas, M.A. (2016). Geometric methods for optimal sensor design. Proceedings of the Royal Society, Series A Math Phys Eng Sci, 472(2185), 20150312.
• Brockett (1970) Brockett, R.W. (1970). Finite Dimensional Linear Systems. Wiley, New York.
• Chen and Rowley (2014) Chen, K.K. and Rowley, C.W. (2014). Fluid flow control applications of H2 optimal actuator and sensor placement. In American Control Conference (ACC), 2014, 4044–4049. IEEE.
• Dhingra et al. (2014) Dhingra, N.K., Jovanović, M.R., and Luo, Z.Q. (2014). An ADMM algorithm for optimal sensor and actuator selection. In Decision and Control (CDC), 2014 IEEE 53rd Annual Conference on, 4039–4044. IEEE.
• Gantmakher (1998) Gantmakher, F. (1998). The Theory of Matrices, volume 131. American Mathematical Soc.
• Hiramoto et al. (2000) Hiramoto, K., Doki, H., and Obinata, G. (2000). Optimal sensor/actuator placement for active vibration control using explicit solution of algebraic Riccati equation. Journal of Sound and Vibration, 229(5), 1057 – 1075.
• Morris (2011) Morris, K. (2011). Linear-quadratic optimal actuator location. IEEE Transactions on Automatic Control, 56(1), 113–124.
• Olshevsky (2016) Olshevsky, A. (2016). Eigenvalue clustering, control energy, and logarithmic capacity. Systems & Control Letters, 96, 45–50.
• Pasqualetti et al. (2014) Pasqualetti, F., Zampieri, S., and Bullo, F. (2014). Controllability metrics, limitations and algorithms for complex networks. IEEE Transactions on Control of Network Systems, 1(1), 40–52.
• Schechter (1959) Schechter, S. (1959). On the inversion of certain matrices. Mathematical Tables and Other Aids to Computation, 13(66), 73–77.
• Singiresu et al. (1991) Singiresu, S.R., Panand, T.S., and Venkayya, V.B. (1991). Optimal placement of actuators in actively controlled structures using genetic algorithms. AIAA Journal, 29(6), 942–943.