Bilinear Iterative Rational Krylov Algorithm
This material is based upon work supported by Council of Scientific and Industrial Research (India) Grant Number 25/(0220)/13/EMR-II.
Models arising from different physical applications are often very large. Simulation with such systems is expensive, so one usually obtains a reduced model (by model reduction) that replicates the input-output behaviour of the original full model. A recently proposed algorithm for model reduction of bilinear dynamical systems, the Bilinear Iterative Rational Krylov Algorithm (BIRKA), does so in a locally optimal way. This algorithm requires solving very large linear systems of equations. Usually these systems are solved by direct methods (e.g., LU), which are very expensive. A better choice is iterative methods (e.g., Krylov subspace methods). However, iterative methods are inexact: they solve a given linear system only up to a certain tolerance, and hence introduce errors in the linear solves. We prove that under some mild assumptions BIRKA is stable with respect to the error introduced by these inexact linear solves. We also analyze the accuracy of the reduced system obtained from using these inexact solves, and support all our results by numerical experiments.
Keywords: Bilinear Dynamical Systems, Model Reduction, Iterative Solves, Krylov Subspace Methods, Petrov-Galerkin, Backward Stability.
MSC: 34C20, 41A05, 65F10, 65G99
A dynamical system describes a relation between two or more measurable quantities by a set of differential equations. The system may be linear or nonlinear. A bilinear dynamical system is one such weakly nonlinear system. The system can be described both in the time domain and in the frequency domain. In the time domain, a Multiple Input Multiple Output (MIMO) bilinear dynamical system with m inputs and p outputs is represented as follows Dirac3 (); Dirac5 ():
where for , and . Also, , and . It is not possible to write a single transfer function for a complete bilinear dynamical system; therefore, in Dirac5 (); Dirac10 () the authors represent the bilinear dynamical system in the frequency domain by a series of subsystem transfer functions, i.e.,
where and are the frequencies. The transfer function of the order subsystem is given as follows Dirac5 ():
where ; and are the identity matrices of size and , respectively; and denotes Kronecker product (defined later).
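For concreteness, a MIMO bilinear system in the widely used time-domain form x'(t) = A x(t) + Σ_k N_k x(t) u_k(t) + B u(t), y(t) = C x(t) can be simulated directly. The sketch below assumes this standard form (the exact notation of (1) may differ) and uses a simple explicit Euler step; all names are illustrative.

```python
import numpy as np

def simulate_bilinear(A, N, B, C, u, x0, dt):
    """Explicit-Euler simulation of the (assumed) standard bilinear form
        x'(t) = A x + sum_k N_k x u_k(t) + B u(t),    y(t) = C x.
    u has shape (steps, m); N is a list of m matrices, one per input."""
    x = x0.copy()
    outputs = []
    for uk in u:
        # bilinear coupling: each input u_k multiplies its own N_k x term
        bilinear = sum(uk[k] * (Nk @ x) for k, Nk in enumerate(N))
        x = x + dt * (A @ x + bilinear + B @ uk)
        outputs.append(C @ x)
    return np.array(outputs)
```

Setting all N_k to zero recovers the linear case discussed next.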
If in (1), the matrix is a zero matrix, then the system is a linear dynamical system. That is, a MIMO linear dynamical system is represented as
The transfer function of the linear dynamical system in the frequency domain is defined as follows:
In general, dynamical systems corresponding to real-world applications are extremely large. Simulation and computation with such systems require a large amount of space and time. By using model reduction techniques Dirac6 (), these large dynamical systems are reduced to a smaller size, which makes simulation and computation easier. Model reduction can be done in many ways, e.g., by using balanced truncation, Hankel approximations, or Krylov projection Dirac6 (). Projection methods, which obtain the reduced model by projecting the original full model onto a lower dimensional subspace, are quite popular. In the literature, there are several techniques for projecting a dynamical system Dirac6 (); Dirac4 (); serkphd (); zbai (); kahujaphd (); Dirac7 (). The Petrov-Galerkin projection is one such technique that yields nice properties in the reduced model. Interpolation is usually used to obtain the subspaces involved in the Petrov-Galerkin projection.
Based upon the theory of Petrov-Galerkin based interpolatory model reduction, the authors in Dirac4 (); inexirka (); mimo () have proposed the Iterative Rational Krylov Algorithm (IRKA) for model reduction of linear dynamical systems. IRKA provides a reduced model that is optimal (the kind of optimality is discussed in the next section). Similar to IRKA, the authors in Dirac3 (); Dirac5 (); doi:10.1137/130947830 (); tobiasphd () have proposed the Bilinear Iterative Rational Krylov Algorithm (BIRKA) for model reduction of bilinear dynamical systems.
The main computational bottleneck in reducing large models (or dynamical systems) is solving large sparse linear systems of equations. The reason is that model reducers typically use direct solvers, e.g., LU factorization, to solve such linear systems of equations, which is expensive. The solution to this scaling problem is to use iterative methods, e.g., Krylov subspace methods.
Krylov subspace methods have been applied to IRKA in ahuja2012recycling (); sarahsms (); sarahsphd (). Iterative methods are inexact, i.e., they solve linear systems of equations only up to a certain stopping tolerance. Hence, it becomes important to check whether the model reduction algorithm (IRKA or BIRKA) is stable with respect to these inexact solves. In other words, we need to check that small errors in the linear solves do not substantially deteriorate the quality of the reduced model. For IRKA, this stability analysis has been done in Dirac2 (). We do the same for BIRKA, i.e., prove that BIRKA is stable with respect to inexact linear solves. This work should give users more confidence in using iterative solvers for BIRKA.
In the next section (Section 2), we discuss model reduction by a Petrov-Galerkin based interpolatory model reduction framework. We discuss stability of BIRKA in Section 3. In Section 4, we analyze invertibility assumptions of all involved matrices as well as the accuracy of the reduced system obtained from a backward stable BIRKA. We support our theory with numerical experiments in Section 5, and give concluding remarks as well as future directions in Section 6. For the rest of this paper we use the terms and notations as listed below.
In literature Dirac3 (), the norm of a bilinear dynamical system is defined as
where is an identity matrix of size . If the type of norm is not specified, then for functions it is the norm above, and for matrices it is the 2-norm.
The Kronecker product between two matrices (of size ), and (of size ) is defined as
where is an element of the matrix , and the order of is .
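This block definition is exactly what NumPy's np.kron implements: for A of size m × n and B of size p × q, block (i, j) of A ⊗ B is a_ij B, and the result has size mp × nq.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])

# A (x) B places a_ij * B in block (i, j); for A of size m x n and
# B of size p x q, the result has size mp x nq.
K = np.kron(A, B)
assert K.shape == (4, 4)
# top row of blocks: a_11 * B and a_12 * B
assert np.array_equal(K[:2, :2], 1 * B)
assert np.array_equal(K[:2, 2:], 2 * B)
```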
operator on a matrix is defined as
Also, denotes the set of real numbers and denotes the discrete subset of real numbers.
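The matrix-to-vector operator defined above is presumably the standard vec (column-stacking) operator. A quick sketch of it, together with the identity vec(AXB) = (Bᵀ ⊗ A) vec(X) linking it to the Kronecker product, which underlies the Kronecker-structured linear systems appearing later:

```python
import numpy as np

rng = np.random.default_rng(1)
A, X, Bm = rng.random((3, 3)), rng.random((3, 2)), rng.random((2, 2))

# vec stacks the columns of a matrix on top of each other
vec = lambda M: M.reshape(-1, order='F')

# identity linking vec and the Kronecker product:
#   vec(A X B) = (B^T (x) A) vec(X)
lhs = vec(A @ X @ Bm)
rhs = np.kron(Bm.T, A) @ vec(X)
assert np.allclose(lhs, rhs)
```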
2 Petrov-Galerkin Based Interpolatory Model Reduction Framework
According to the Petrov-Galerkin projection, the residual of a dynamical system obtained after projecting on a lower dimensional subspace, is made orthogonal to some other subspace defined by a test basis. Let denote the residual of this dynamical system, then according to the Petrov-Galerkin condition, , where denotes any test subspace.
The subspace on which we project, and the orthogonal subspace, are not known to us. We can pick these subspaces arbitrarily, but then we cannot guarantee good input-output behaviour from the reduced model. For the reduced model to provide a high-fidelity approximation to the input-output behaviour of the original full model, we use interpolation to obtain these subspaces. In Dirac4 (), the authors give an algorithm for model reduction of linear dynamical systems called IRKA (Iterative Rational Krylov Algorithm). IRKA is a Petrov-Galerkin based interpolatory model reduction algorithm. For a certain type of linear dynamical system, IRKA converges locally to a local minimum of the underlying optimization problem flagg12Convergence (). For the optimality discussion in the linear case we refer the reader to Dirac4 () and flagg12Convergence (). We discuss optimality in the bilinear case below.
Next, we apply Petrov-Galerkin based interpolatory model reduction to a bilinear dynamical system. This is a short summary of the original work in Dirac3 () and Dirac5 (). After reduction, the bilinear system (1) can be represented as Dirac3 ()
where for with . We want to approximate in an appropriate norm, and hence, should be nearly equal to for all admissible inputs. Let the two r-dimensional subspaces, and , be chosen in such a way that , where are matrices. We project the original full model (1) to a lower dimensional subspace, i.e., , and enforce the Petrov-Galerkin condition Dirac3 (); Dirac5 ()
Comparing the above equations with (6), we get
where is assumed to be invertible. Obtaining such an invertible matrix is not hard Dirac3 (). Different selections of the subspaces and give different reduced models, but we choose the subspaces and by enforcing interpolation. In the case of bilinear systems, there are two ways of doing interpolation Dirac5 ().
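As a sketch of this projection step, assuming the standard Petrov-Galerkin formulas Â = (WᵀV)⁻¹WᵀAV, N̂_k = (WᵀV)⁻¹WᵀN_kV, B̂ = (WᵀV)⁻¹WᵀB, Ĉ = CV (an educated reconstruction; the paper's exact equations are not reproduced here):

```python
import numpy as np

def pg_reduce(A, N, B, C, V, W):
    """Petrov-Galerkin reduction of a bilinear system, assuming the standard
    projected matrices (a reconstruction, not the paper's exact equations):
        Ah = (W^T V)^{-1} W^T A V,   Nh_k likewise,
        Bh = (W^T V)^{-1} W^T B,     Ch = C V."""
    M = np.linalg.solve(W.T @ V, W.T)   # (W^T V)^{-1} W^T, assumed invertible
    return M @ A @ V, [M @ Nk @ V for Nk in N], M @ B, C @ V
```

In the Galerkin case W = V with orthonormal columns, (WᵀV)⁻¹ is the identity and the formulas reduce to one-sided projection.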
A bilinear system can be represented by a series of subsystem transfer functions. If we apply certain interpolation conditions on a finite number of subsystems, then it is called subsystem interpolation Dirac5 (). Another way is Volterra series interpolation. Here, interpolation is done on a weighted sum of all Volterra kernel transfer functions given by (1). We refer the reader to Dirac5 (); opac-b1090598 () for a detailed discussion of the definition of the Volterra series, the Volterra kernels, and the subsequent derivations.
Since the subsystem interpolation approach is unable to satisfy any optimality condition Dirac5 () (i.e., that the error between the original full model and the reduced model is minimal in some norm), our focus is on the Volterra series interpolation. We need to know how to build and such that the conditions of the Volterra series interpolation are satisfied. We also need to decide where to interpolate so that we get an optimal reduced model. Here, we focus on optimality.
In a bilinear system, the following error system expression is differentiated to obtain the optimality conditions Dirac3 ()
where and are the initial guesses for the reduced system. Also, . Performing interpolation at the mirror images of the reduced system poles helps achieve optimality. Theorem 1 below summarizes this, where the poles of the transfer function of every reduced subsystem (say ) are computed (say, represented by ), negated (leading to ), and finally, interpolation is performed at these points.
Dirac5 () Let be a bilinear system of order n. Let be an optimal approximation of order r. Then, satisfies the following multi-point Volterra series interpolation conditions:
where and are residues and poles of the transfer function associated with , respectively.
Obtaining the residues and the poles of the optimal reduced model is not possible since we do not have such a system. In Dirac3 (); doi:10.1137/130947830 () the authors propose Bilinear Iterative Rational Krylov Algorithm (BIRKA), which at convergence, ensures that the conditions of Theorem 1 are satisfied. BIRKA gives a locally optimal reduced model. Algorithm 1 lists BIRKA.
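A compact sketch of a SISO BIRKA iteration (one bilinear matrix N), reconstructed from the Benner-Breiten formulation: at each step the reduced matrices are eigendecomposed, two Kronecker-structured (Sylvester-type) linear systems are solved for the bases V and W, and the reduced model is updated by Petrov-Galerkin projection until the reduced poles stagnate. Exact transposes, scalings, and the initialization may differ from the paper's Algorithm 1; all names are illustrative.

```python
import numpy as np

def birka(A, N, B, C, r, tol=1e-8, maxit=50):
    """Sketch of a SISO BIRKA iteration; a reconstruction, not the paper's
    exact Algorithm 1.  A, N: (n, n); B: (n, 1); C: (1, n)."""
    n = A.shape[0]
    # hypothetical initial reduced model
    Ah = -np.diag(np.arange(1.0, r + 1.0))
    Nh, Bh, Ch = np.zeros((r, r)), np.ones((r, 1)), np.ones((1, r))
    ev_old = np.sort(np.linalg.eigvals(Ah))
    In, Ir = np.eye(n), np.eye(r)
    for _ in range(maxit):
        lam, R = np.linalg.eig(Ah)                  # reduced poles
        Rinv = np.linalg.inv(R)
        Bt, Ct, Nt = Rinv @ Bh, Ch @ R, Rinv @ Nh @ R
        L = np.diag(lam)
        # line-3b analogue: A V + V Lam + N V Nt^T + B Bt^T = 0,
        # solved in Kronecker (vec) form -- the large solves in the text
        K1 = np.kron(Ir, A) + np.kron(L, In) + np.kron(Nt, N)
        V = np.linalg.solve(K1, -(B @ Bt.T).reshape(-1, order='F'))
        # line-3c analogue: A^T W + W Lam + N^T W Nt + C^T Ct = 0
        K2 = np.kron(Ir, A.T) + np.kron(L, In) + np.kron(Nt.T, N.T)
        W = np.linalg.solve(K2, -(C.T @ Ct).reshape(-1, order='F'))
        V = np.linalg.qr(np.real(V.reshape(n, r, order='F')))[0]
        W = np.linalg.qr(np.real(W.reshape(n, r, order='F')))[0]
        M = np.linalg.solve(W.T @ V, W.T)           # Petrov-Galerkin projection
        Ah, Nh, Bh, Ch = M @ A @ V, M @ N @ V, M @ B, C @ V
        ev = np.sort(np.linalg.eigvals(Ah))
        if np.linalg.norm(ev - ev_old) < tol * np.linalg.norm(ev_old):
            break                                   # reduced poles stagnated
        ev_old = ev
    return Ah, Nh, Bh, Ch
```

The two dense Kronecker solves above are exactly the systems that the later sections propose to replace by iterative (inexact) solves.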
3 Backward Stability
In general, numerical algorithms for a problem are continuous in nature, but a digital computer solves them in a discrete manner. The reason is the limited representation of real / complex numbers. Since complex numbers can be represented by pairs of real numbers, we focus on the latter only. Let be a function giving a finite approximation to a real number. It provides the rounded equivalent as Dirac8 ()
for all ,
where is the machine precision. Also, for every operation between any two finite numbers, the result is exact up to a relative error, i.e., for all x, y
where can be any of the following operations: .
Consider a continuous mathematics algorithm . Say executing this algorithm on a digital computer (that uses finite precision arithmetic) is represented as . To check how good the approximated algorithm is, one usually computes the accuracy of . We say an algorithm is accurate if Dirac8 ()
where . From the above equation, we find that computing the accuracy is not possible since we do not know . An easier parameter for checking the goodness of is stability. There are multiple notions of stability. One such notion is backward stability, which says that an algorithm is backward stable if Dirac8 ()
This notion of backward stability is useful since one can easily compute the accuracy of the result/output for a backward stable algorithm.
Dirac8 () If is a backward stable algorithm, and is the condition number of the problem, then the relative error
where is the machine precision (or perturbation in ).
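The bound in this theorem can be observed numerically: for an ill-conditioned system solved by a backward stable method, the backward error (residual) sits at machine-precision level while the forward error grows roughly like the condition number times the machine precision. A small illustration with the notoriously ill-conditioned Hilbert matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
# Hilbert matrix: H_ij = 1 / (i + j + 1), very ill-conditioned
A = 1.0 / (np.arange(n)[:, None] + np.arange(n)[None, :] + 1.0)
x_true = rng.standard_normal(n)
b = A @ x_true

x = np.linalg.solve(A, b)      # LU with partial pivoting: backward stable in practice
kappa = np.linalg.cond(A)
eps = np.finfo(float).eps

rel_err = np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
# forward error is (roughly) bounded by condition number times eps
assert rel_err <= 100 * kappa * eps
# ...even though the backward error (residual) stays at machine-precision level
assert np.linalg.norm(A @ x - b) / (np.linalg.norm(A) * np.linalg.norm(x)) < 100 * eps
```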
Let us look at lines 3b and 3c in BIRKA (Algorithm 1). There we need to solve linear systems to compute and , respectively. Solving these linear systems by direct methods (such as LU factorization, Gaussian elimination, etc.) is too expensive (time complexity of , where is the system size). Moreover, the linear systems here have sparse matrices. For such systems, iterative methods, e.g., Krylov subspace methods Dirac6 (), are preferred because of their reduced complexity (time complexity of , where is the number of nonzeros in the matrix). (In fact, the matrices here are block sparse. Iterative methods for hard-to-solve linear systems usually require a preconditioner, and this block sparsity can be exploited in designing preconditioners. E.g., in luo2015optimization (), the authors have designed an Incomplete LU (ILU) factorization for efficiently solving block sparse linear systems; their techniques can be used for designing better ILU-preconditioned iterative methods for the block sparse linear systems here.)
Iterative methods are inexact in nature, which means they do not solve linear systems, say Ax = b, exactly. Instead Ax = b + is solved, where is the stopping tolerance. Our aim is to determine whether, if one uses an iterative solver (also called an inexact solver from now on) in IRKA or BIRKA, these algorithms remain stable with respect to the error introduced by the inexact solves. As earlier, we check for backward stability. For IRKA, the backward stability analysis has been done in Dirac2 ().
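A minimal iterative solver makes this "Ax = b plus a small residual" view concrete: conjugate gradients (shown here for a symmetric positive definite system, as a simple stand-in for the Krylov methods above) stops once the residual norm drops below a prescribed tolerance, so the computed x exactly solves a nearby system.

```python
import numpy as np

def cg(A, b, tol=1e-8, maxit=500):
    """Minimal conjugate gradients (A symmetric positive definite),
    stopping when the residual norm drops below tol * ||b||."""
    x = np.zeros_like(b)
    r = b - A @ x
    p = r.copy()
    rs = r @ r
    for _ in range(maxit):
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) <= tol * np.linalg.norm(b):
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x, r      # r is the residual: the system solved exactly is A x = b - r

rng = np.random.default_rng(0)
M = rng.standard_normal((50, 50))
A = M @ M.T + 50 * np.eye(50)          # SPD and well conditioned
b = rng.standard_normal(50)
x, r = cg(A, b, tol=1e-10)
# the inexact solve is an exact solve of a perturbed system
assert np.linalg.norm(A @ x - (b - r)) < 1e-8
assert np.linalg.norm(r) <= 1e-10 * np.linalg.norm(b)
```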
Let in BIRKA be calculated exactly, and be the functional representation of the interpolation process that uses and in BIRKA (i.e., exact BIRKA). Similarly, let and be calculated inexactly (i.e., by an iterative solver), and be the functional representation of the interpolation process that uses in BIRKA (i.e., inexact BIRKA). Then, from the backward stability definition, BIRKA is backward stable if
where is the perturbed full model corresponding to the error in the linear solves for and in inexact BIRKA. This perturbation is denoted by . Next, we look at the above two conditions for stability in the two different sub-sections below.
3.1 Satisfying the First Condition of Backward Stability
Let the original full order model be represented as . Recall from Algorithm 1, the following:
Also, let the residuals associated with iterative solves for computing and be and , respectively. Then, the above equations lead to
Let and . The Petrov-Galerkin projection connects the reduced model matrices (obtained by inexact BIRKA) to the original full model matrices as
where this reduced model is represented as .
By the backward stability definition, we next find a perturbed full model whose exact interpolation gives the same reduced model as that obtained by inexact interpolation of the original full model. Let the perturbed full model be represented as or , where are the constant perturbation matrices. Then, we have
As earlier, and . Using the Petrov-Galerkin projection to connect the reduced model matrices (obtained by exact BIRKA) with the perturbed full model matrices we get
Similarly, , and .
From the above, we note that if , then . Similarly, if , then ; if , then ; and if , then . Using the Petrov-Galerkin framework for the inexact solves in (12) and (13), we can easily achieve some of the above relations. We discuss this next.
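The relations in this paragraph can be illustrated numerically: any perturbation of a full-order matrix that is annihilated by the projection (i.e., Wᵀ Z = 0, or Z V = 0) leaves the Petrov-Galerkin reduced matrix unchanged. A small check, with all names illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
n, r = 8, 3
A = rng.standard_normal((n, n))
V = np.linalg.qr(rng.standard_normal((n, r)))[0]
W = np.linalg.qr(rng.standard_normal((n, r)))[0]

# standard Petrov-Galerkin reduction (W^T V)^{-1} W^T M V
reduce = lambda M: np.linalg.solve(W.T @ V, W.T @ M @ V)

# a perturbation whose columns are orthogonal to range(W): W^T Z = 0
Z = (np.eye(n) - W @ W.T) @ rng.standard_normal((n, n))
assert np.allclose(W.T @ Z, 0)
# exact reduction of the perturbed matrix equals reduction of the original
assert np.allclose(reduce(A + Z), reduce(A))

# likewise for a perturbation annihilated on the other side: Z2 V = 0
Z2 = rng.standard_normal((n, n)) @ (np.eye(n) - V @ V.T)
assert np.allclose(Z2 @ V, 0)
assert np.allclose(reduce(A + Z2), reduce(A))
```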
3.1.1 The Petrov-Galerkin Framework for Inexact Solves
The Petrov-Galerkin framework by definition implies finding the solution of a linear system of equations such that its residual at every point is orthogonal to some other suitable subspace van2003iterative (). In our context, we define the Petrov-Galerkin framework as below.
Next, we consider perturbations in and individually, and use the Petrov-Galerkin framework discussed above. First, if we take the perturbation in only, then (20) is equivalent to
In the above, if we multiply from the left in the first equation and from the right in the second equation, then we get
From the Petrov-Galerkin framework (19), , and hence,
Similarly, if we take the perturbation in any one matrix, then (20) is equivalent to
Again in the above, if we multiply from the left in the first equation and from the right in the second equation, then we get
Using the Petrov-Galerkin framework (19) in above we get
To achieve the desired result, i.e., , we need to be invertible. This cannot always be guaranteed. Thus, we drop the perturbation analysis with matrices.
Finally, if we only take the perturbations and , in the matrices and , respectively, then (20) is equivalent to
As in the last two paragraphs, multiplying by from the left in the first equation above, multiplying by from the right in the second equation above, and using the Petrov-Galerkin framework (19), we get
As above, to achieve the desired result, i.e., and , we need and to be invertible. This cannot always be guaranteed because these are non-square matrices. Thus, we drop the perturbation analysis with and matrices both.
Hence, (22) implies that if we consider the perturbation in matrix only and use a Petrov-Galerkin framework for the inexact linear solves, then
The theorem below summarizes this.
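The conclusion above can also be checked numerically: if an inexact solve leaves a residual η, then a perturbation of A alone, of the size of η, makes the inexact solution exact for the perturbed model. A small illustration, where F is a hypothetical right-hand side standing in for the systems solved in BIRKA:

```python
import numpy as np

rng = np.random.default_rng(4)
n, r = 8, 3
A = rng.standard_normal((n, n)) - 5 * np.eye(n)
F = rng.standard_normal((n, r))      # hypothetical right-hand sides

# inexact solve: V_in = A^{-1} F plus a small error, leaving residual eta
V_in = np.linalg.solve(A, F) + 1e-8 * rng.standard_normal((n, r))
eta = F - A @ V_in

# perturbation in A alone that makes the inexact solution exact:
#   (A + Z) V_in = F   with   Z = eta V_in^+
Z = eta @ np.linalg.pinv(V_in)
assert np.allclose((A + Z) @ V_in, F)
# and the perturbation is of the order of the solve residual
assert np.linalg.norm(Z) <= np.linalg.norm(eta) * np.linalg.norm(np.linalg.pinv(V_in))
```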
3.2 Satisfying the Second Condition of Backward Stability
Next, we show that the second condition of backward stability, given in (10), is also satisfied. According to (10), the difference between the original full model and the perturbed full model should be of the order of the perturbation, i.e.,
We satisfy the above condition in the absolute sense, since is independent of . That is,