Algebraic Tools for the Analysis of State Space Models


Nicolette Meshkat Department of Mathematics and Computer Science, Santa Clara University, Santa Clara, CA nmeshkat@scu.edu Zvi Rosen Department of Mathematics, University of Pennsylvania, Philadelphia, PA zvihr@math.upenn.edu  and  Seth Sullivant Department of Mathematics, North Carolina State University, Raleigh, NC smsulli2@ncsu.edu
Abstract.

We present algebraic techniques to analyze state space models in the areas of structural identifiability, observability, and indistinguishability. While the emphasis is on surveying existing algebraic tools for studying ODE systems, we also present a variety of new results. In particular: On structural identifiability, we present a method using linear algebra to find identifiable functions of the parameters of a model for unidentifiable models. On observability, we present techniques using Gröbner bases and algebraic matroids to test algebraic observability of state space models. On indistinguishability, we present a sufficient condition for distinguishability using computational algebra and demonstrate testing indistinguishability.

Key words and phrases:
Identifiability and Observability and Indistinguishability and State space models

1. Introduction

Consider a dynamic systems model in the following state space form:

(1)
$$\dot{x}(t) = f(x(t), u(t), p), \qquad y(t) = g(x(t), p)$$

Here $x(t)$ is the state variable vector, $u(t)$ is the input vector (or control vector), $y(t)$ is the output vector, and $p$ is a parameter vector composed of unknown real parameters. In this modeling framework the only observed quantities are the input and output trajectories, $u(t)$ and $y(t)$ (or more realistically, the trajectories observed at some finite number of time points $t_1, \ldots, t_k$), together with the underlying modeling structure (that is, the functions $f$ and $g$). State space models are widely used throughout the applied sciences, including the areas of control [27, 52, 58, 67], systems biology [22], economics and finance [34, 76], and probability and statistics [11, 39].

A simple example of a state space model is a linear compartment model.

Example 1.1.

Consider the following ODE:

This model is called the linear 2-compartment model and will be referenced in later sections. Here $x(t) = (x_1(t), x_2(t))$ is the state variable vector, $u(t)$ is the input (or control), $y(t)$ is the output, and $p$ is the unknown parameter vector of rate constants.

Although the analysis of the behavior and use of state space models falls under the umbrella of dynamical systems research, tools from algebra can be used to analyze these models when the functions $f$ and $g$ are rational functions. Algebraic methods typically focus on determining which key properties a model satisfies a priori, before the model is used to analyze data. The point of the present paper is to give an overview of these algebraic techniques and to show how they can be applied to analyze state space models. We focus on three main problems where algebraic techniques can be helpful: determining structural identifiability, observability, and indistinguishability of the models. We provide an overview of techniques for these problems coming from computational algebra, and we also introduce some new results coming from matroid theory.

2. State Space Models

In this section, we provide a more detailed introduction to state space models, and the basic theoretical problems of identifiability, observability, and indistinguishability that we will address in this paper. We also provide a detailed introduction to the linear compartment models that will be an important set of examples that we use to illustrate the theory.

Consider a general state space model

(2)
$$\dot{x}(t) = f(x(t), u(t), p), \qquad y(t) = g(x(t), p)$$

as in the introduction, with state vector $x(t)$, input vector $u(t)$, and output vector $y(t)$.

The state space model (2) is called identifiable if the unknown parameter vector can be recovered from observation of the input and output alone. The model is observable if the trajectories of the state space variables can be recovered from observation of the input and output alone. Two state space models are indistinguishable if for any choice of parameters in the first model, there is a choice of parameters in the second model that will yield the same dynamics in both models, and vice versa. Before getting into the technical details of these definitions for state space models, we introduce some key examples of state space models that we will use to illustrate the main concepts throughout the paper.

Example 2.1 (SIR Model).

A commonly used model in epidemiology is the Susceptible-Infected-Recovered model (SIR model) ([8],[9],[10],[46],[62]) which has the following form:

The interpretation of the state variables is that $S(t)$ is the number of susceptible individuals at time $t$, $I(t)$ is the number of infected individuals at time $t$, and $R(t)$ is the number of recovered individuals at time $t$. The unknown parameters are the birth/death rate $\mu$, the transmission parameter $\beta$, the recovery rate $\gamma$, the total population $N$, and the proportion $k$ of the infected population that is measured. In this model, we assume that we only observe the trajectory $y(t) = k I(t)$, an (unknown) proportion of the infected population. Note that this simple model has no input/control.

Identifiability and observability analysis in this model are concerned with determining which unmeasured quantities can be determined from only the observed output trajectory $y(t)$. Identifiability specifically concerns the unobserved parameters $\mu$, $\beta$, $\gamma$, $N$, and $k$, whereas observability is concerned with the unobserved state variables $S(t)$, $I(t)$, and $R(t)$. ∎
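
Since the SIR equations themselves are not reproduced above, the following minimal numerical sketch may help fix notation. The right-hand sides, the symbols mu, beta, gamma, N, k, and all numerical values are assumptions chosen for illustration (one common SIR parametrization consistent with the description above), not necessarily the exact equations of this example.

```python
# Hedged sketch: one standard SIR parametrization with birth/death rate mu,
# transmission rate beta, recovery rate gamma, total population N, and an
# observed proportion k of the infected compartment.
import numpy as np
from scipy.integrate import solve_ivp

def sir_rhs(t, state, mu, beta, gamma, N):
    S, I, R = state
    dS = mu * N - beta * S * I / N - mu * S
    dI = beta * S * I / N - gamma * I - mu * I
    dR = gamma * I - mu * R
    return [dS, dI, dR]

mu, beta, gamma, N, k = 0.01, 0.5, 0.1, 1000.0, 0.3   # assumed values
sol = solve_ivp(sir_rhs, (0.0, 100.0), [990.0, 10.0, 0.0],
                args=(mu, beta, gamma, N), dense_output=True)
y = k * sol.y[1]   # only y(t) = k * I(t) is observed
```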

A commonly used family of state space models are the linear compartment models. We outline these models here. Let $G$ be a directed graph with vertex set $V$ and set of directed edges $E$. Each vertex $i \in V$ corresponds to a compartment in our model and each edge $j \to i$ corresponds to a direct flow of material from the $j$th compartment to the $i$th compartment. Let $In, Out, Leak \subseteq V$ be three sets of compartments: the set of input compartments, output compartments, and leak compartments respectively. To each edge $j \to i$ we associate an independent parameter $a_{ij}$, the rate of flow from compartment $j$ to compartment $i$. To each leak node $i \in Leak$, we associate an independent parameter $a_{0i}$, the rate of flow from compartment $i$ leaving the system.

To such a graph and set of leaks we associate the matrix $A(G, Leak)$ in the following way:
$$A(G, Leak)_{ij} = \begin{cases} -a_{0i} - \sum_{k : i \to k \in E} a_{ki} & \text{if } i = j,\\ a_{ij} & \text{if } j \to i \in E,\\ 0 & \text{otherwise},\end{cases}$$
where $a_{0i} = 0$ when $i \notin Leak$.

For brevity, we will often use $A$ to denote $A(G, Leak)$. Then we construct a system of linear ODEs with inputs and outputs associated to the quadruple $(G, In, Out, Leak)$ as follows:

(3)
$$\dot{x}(t) = A x(t) + u(t), \qquad y_i(t) = x_i(t) \ \text{ for } i \in Out,$$

where $u_j(t) \equiv 0$ for $j \notin In$. The coordinate functions $x_i(t)$ are the state variables, the functions $y_i(t)$ for $i \in Out$ are the output variables, and the nonzero functions $u_j(t)$ are the inputs. The resulting model is called a linear compartment model.
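
The matrix $A(G, Leak)$ is easy to assemble programmatically. The sketch below encodes the convention just described (an off-diagonal entry $a_{ij}$ for each edge $j \to i$, with outgoing and leak rates subtracted on the diagonal); the function name and edge-list encoding are our own choices for illustration, not notation from the paper.

```python
# Sketch: assemble the compartmental matrix A for a linear compartment model
# (G, In, Out, Leak).  Edges are pairs (j, i) meaning flow from compartment j
# to compartment i with rate a_ij; each leak compartment i has rate a_0i.
import sympy as sp

def compartmental_matrix(n, edges, leaks):
    A = sp.zeros(n, n)
    for (j, i) in edges:                      # flow j -> i with rate a_{ij}
        a = sp.Symbol(f"a_{i}{j}", positive=True)
        A[i - 1, j - 1] += a                  # off-diagonal entry
        A[j - 1, j - 1] -= a                  # outgoing flow leaves compartment j
    for i in leaks:                           # leak out of compartment i
        A[i - 1, i - 1] -= sp.Symbol(f"a_0{i}", positive=True)
    return A

# Assumed 2-compartment example: exchange between 1 and 2, leaks from both.
A = compartmental_matrix(2, edges=[(1, 2), (2, 1)], leaks=[1, 2])
print(A)   # Matrix([[-a_01 - a_21, a_12], [a_21, -a_02 - a_12]])
```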

We use the following convention for drawing linear compartment models [22]. Numbered vertices represent compartments, outgoing arrows from the compartments represent leaks, an edge with a circle coming out of a compartment represents an output, and an arrowhead pointing into a compartment represents an input.

Figure 1. A 2-compartment model with $In = \{1\}$, $Out = \{1\}$, and $Leak = \{1, 2\}$.
Example 2.2.

For the compartment model in Figure 1, the ODE system has the form given in Example 1.1. Since this model has a leak in every compartment, the diagonal entries of $A$ are algebraically independent of the other entries. In this situation, we can re-write the diagonal entries of the matrix as $a_{11}$ and $a_{22}$. Thus we have the following ODE system:

3. Differential Algebra Approach To Identifiability

In this paper we focus on the structural versions of identifiability, observability, and indistinguishability (that is, structural identifiability, structural observability, and structural indistinguishability). That means we study when these properties hold assuming that we are able to observe trajectories perfectly. Practical versions of these problems concern how noise affects the ability to, e.g., infer parameters of the models. Structural answers are important because the structural version of a condition is necessary to ensure that the practical version holds. On the other hand, practical versions of these problems depend on the specific data-dependent context in which the data might be observed, and might further depend on the particular underlying unknown parameter choices. We will drop “structural” throughout the paper since this will be implicit in the majority of our discussion.

To make the definitions of identifiability, observability, and indistinguishability precise we will use tools from differential algebra. In this approach, we must form the input-output equations associated to our model by performing differential elimination. We carry out operations in the differential ring

with the derivation $\tfrac{d}{dt}$ with respect to time, such that the parameters are constants with respect to the derivation (so, e.g., $\dot{x}$ denotes $\tfrac{dx}{dt}$). Differential algebra was developed by Ritt [59] and Kolchin [40] and has its best-known applications in the study of algebraic solutions to systems of differential equations [63].

The goal of this differential elimination process for state space models is to eliminate the state variables and their derivatives, so that the resulting equations are purely in terms of the input variables, output variables, and the parameters. The equations that result from applying the differential elimination process are called the input-output equations. We obtain input-output equations in the following form:

where the coefficients $c_i(p)$ are rational functions in the parameter vector $p$ and the terms they multiply are differential monomials in $u$ and $y$. Let $c(p)$ denote the vector of coefficients of the input-output equations, which are rational functions in the parameter vector $p$. This coefficient vector induces a map $p \mapsto c(p)$, called the coefficient map, that plays an important role in the study of identifiability and indistinguishability.

For general state space models of the form (2) we can also use ordinary Gröbner basis calculations to determine the input/output equation.

Proposition 3.1.

Consider a state space model of the form (2) where $f$ and $g$ are polynomial functions and where there are $n$ state-space variables, one output variable, and $m$ input variables. Let $I$ be the ideal generated by the system equations, the output equation, and their derivatives up to order $n$, regarded as polynomials in the state, input, and output variables and their derivatives.

Then the elimination ideal obtained by intersecting $I$ with the subring generated by the input and output variables and their derivatives is not the zero ideal, and hence contains an input-output equation.

Although Proposition 3.1 is known in the literature [28, 36], we include a proof because it will illustrate some useful ideas that we will use in other new results later on. Note that although this is stated for a single output, one can apply Proposition 3.1 one output at a time to find input/output equations for each output separately and hence obtain Proposition 3.2.

Proof.

Note that $I$ is a prime ideal since, with a carefully chosen lexicographic term order, it has as its initial ideal

which is a prime ideal. Since $I$ is prime, we can consider the algebraic matroid associated to this ideal. To say that the elimination ideal is not the zero ideal is equivalent to saying that the set consisting of the input and output variables and their derivatives is a dependent set in the associated algebraic matroid. The initial ideal also shows that this ideal is a complete intersection, so it has codimension equal to the number of equations involved. Subtracting this codimension from the total number of variables in our polynomial ring (the state, input, and output variables together with their derivatives) shows that the dimension of the corresponding variety is smaller than the number of input and output variables and their derivatives. Hence these variables must be dependent, i.e. there must exist a relation among them. ∎

For multiple outputs, one can again take derivatives up to order $n$ and show that there must exist an input-output equation for each output:

Proposition 3.2.

Consider a state space model of the form (2) where $f$ and $g$ are polynomial functions and where there are $n$ state-space variables, $k$ output variables, and $m$ input variables. Let $I$ be the ideal generated by the system equations, the output equations, and their derivatives up to order $n$.

Then, for each output $y_i$, the elimination ideal obtained by intersecting $I$ with the subring generated by $y_i$, the input variables, and their derivatives is not the zero ideal, and hence contains an input-output equation in $y_i$.

Proof.

We follow the proof of Proposition 3.1. The generators again form a complete intersection, so the codimension of $I$ equals the number of equations involved, and the dimension of the corresponding variety is the total number of variables (the state, input, and output variables and their derivatives) minus this codimension. For each output $y_i$, the set consisting of $y_i$, the input variables, and their derivatives has more elements than this dimension. Thus these variables must be dependent, i.e. there must exist a relation for each $i$. ∎

Note that one could also work with smaller ideals, using derivatives only up to some order less than $n$, as in [47]. In some instances this might produce an input-output equation, but the dimension count that guarantees the existence of an input-output equation only applies when derivatives up to order $n$ are used.
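
Proposition 3.1 suggests a concrete recipe: introduce new indeterminates for the derivatives of $x$, $u$, and $y$, impose the differentiated system equations, and eliminate the state variables with a lexicographic Gröbner basis. The sketch below carries this out for an assumed standard form of the linear 2-compartment model of Examples 1.1 and 2.2; the right-hand sides written in the code are our assumption, used only for illustration.

```python
# Sketch of Proposition 3.1 in practice: derivatives of x, u, y become new
# indeterminates, the differentiated system equations generate an ideal, and
# a lex Groebner basis eliminates the state variables.
import sympy as sp

a11, a12, a21, a22 = sp.symbols("a11 a12 a21 a22")
x1, x2, dx1, dx2, ddx1, ddx2 = sp.symbols("x1 x2 dx1 dx2 ddx1 ddx2")
y, dy, ddy, u, du = sp.symbols("y dy ddy u du")

eqs = [
    dx1 - (a11 * x1 + a12 * x2 + u),      # assumed: x1' = a11 x1 + a12 x2 + u
    dx2 - (a21 * x1 + a22 * x2),          # assumed: x2' = a21 x1 + a22 x2
    ddx1 - (a11 * dx1 + a12 * dx2 + du),  # derivative of the first equation
    ddx2 - (a21 * dx1 + a22 * dx2),       # derivative of the second equation
    y - x1, dy - dx1, ddy - ddx1,         # output y = x1 and its derivatives
]
state_vars = [x1, x2, dx1, dx2, ddx1, ddx2]
G = sp.groebner(eqs, *state_vars, y, dy, ddy, u, du, order="lex")
io_eqs = [g for g in G.exprs if not g.free_symbols & set(state_vars)]
print(io_eqs)
# expect (up to scaling) ddy - (a11 + a22)*dy + (a11*a22 - a12*a21)*y - du + a22*u
```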

Example 3.3.

Consider the SIR model from Example 2.1. The ideal in this example is:

This model has no input, so in this case we get a single output equation in the output variable $y$ and the parameters. The output equation is:

The coefficients of this differential equation, indexed by the differential monomials that appear, assemble into the coefficient vector, which gives a function from the parameter space to the space of coefficients:

The dynamics of the input and output only depend on the input-output equation up to a nonzero constant multiple. Hence, the coefficient map is only truly well-defined up to scalar multiplication. There are two natural ways to deal with this issue. The most appealing for an algebraist is to treat the coefficient map as a map into projective space. The second approach is to force the equation to have a fixed form that avoids this issue, by normalizing the equation to be monic, i.e. dividing through by one of the coefficients. We will take the second approach in this paper. In the output equation in Example 3.3, one possible normalization yields the coefficient map

In the standard differential algebra approach to identifiability, we assume that the coefficients of the input-output equations can be recovered uniquely from the input-output data, and thus treat them as known quantities. This is a reasonable assumption when the input is a general enough function and the parameters are generic: in this case the dynamics will yield a unique differential equation. The identifiability question is then: can the parameters of the model be recovered from the coefficients of the input-output equations?

Definition 3.4.

Let $c(p)$ denote the vector of coefficients of the input-output equations, which are rational functions in the parameter vector $p$, and which we assume to be normalized so that the input-output equations are monic. We consider $c$ as a function from some natural open, biologically relevant parameter space $\Theta$.

  • The model is globally identifiable if $c$ is a one-to-one function on $\Theta$.

  • The model is generically globally identifiable if there is a dense open subset $U \subseteq \Theta$ such that the restriction of $c$ to $U$ is one-to-one.

  • The model is locally identifiable if around any point $p \in \Theta$ there is an open neighborhood $U$ such that the restriction of $c$ to $U$ is a one-to-one function.

  • The model is generically locally identifiable if there is a dense open subset $V \subseteq \Theta$ such that for all $p \in V$ there is an open neighborhood $U$ of $p$ such that the restriction of $c$ to $U$ is a one-to-one function.

  • The model is unidentifiable if there is a $p \in \Theta$ such that the fiber $c^{-1}(c(p))$ is infinite.

  • The model is generically unidentifiable if there is a dense subset $V \subseteq \Theta$ such that for all $p \in V$ the fiber $c^{-1}(c(p))$ is infinite.

As can be seen, there are many different variations on the notion of identifiability. Because problems that arise on sets of measure zero can ruin the strongest form of global identifiability, one usually needs to add the generic conditions to get meaningful results. In this paper, we will consider state space models (2) where $f$ and $g$ are polynomial (or rational) functions. This ensures, via the differential elimination procedure, that the coefficient functions are rational functions of the parameters. For linear compartment models these can always be taken to be polynomial functions.

In this paper, we will also focus almost exclusively on generic local identifiability and generic unidentifiability, and will use the following result to determine which of these conditions the model satisfies.

Proposition 3.5.

The model is generically locally identifiable if and only if the rank of the Jacobian of $c$, evaluated at a generic point, is equal to the number of parameters. Conversely, if the rank of the Jacobian of $c$ is less than the number of parameters for all choices of the parameters, then the model is generically unidentifiable.

Proof.

Since the coefficients in $c$ are all polynomial or rational functions of the parameters, the model is generically locally identifiable if and only if the image of $c$ has dimension equal to the number of parameters. The dimension of the image of such a map is equal to the rank of its Jacobian evaluated at a generic point. ∎

Example 3.6 (SIR Model).

From Example 3.3, we have the following coefficient map:

We obtain the Jacobian with respect to the chosen parameter ordering:

Since the rank of the Jacobian at a generic point is strictly less than the number of parameters, the model is generically unidentifiable.
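
The rank test of Proposition 3.5 is mechanical once the coefficient map is written down. The helper below evaluates the Jacobian at random rational points as a stand-in for a generic point; the three-parameter map used to exercise it is a toy assumption standing in for the actual SIR coefficient vector.

```python
# Sketch of the Jacobian rank test of Proposition 3.5.  Replace the toy
# coefficient map below with the coefficients of the model of interest.
import random
import sympy as sp

def generic_rank(coeffs, params, trials=3):
    """Rank of the Jacobian of the coefficient map at random rational points."""
    J = sp.Matrix(coeffs).jacobian(params)
    best = 0
    for _ in range(trials):
        point = {p: sp.Rational(random.randint(1, 10**6), random.randint(1, 10**6))
                 for p in params}
        best = max(best, J.subs(point).rank())
    return best

mu, beta, gamma = sp.symbols("mu beta gamma", positive=True)
coeffs = [mu + gamma, mu * gamma, beta * gamma]   # assumed toy coefficient map
params = [mu, beta, gamma]
print(generic_rank(coeffs, params), "of", len(params))
# rank == number of parameters means generically locally identifiable
```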

3.1. Input-output equations for linear models

There have been several methods proposed to find the input-output equations of nonlinear ODE models [3, 5, 25, 26, 42, 47, 53], but for linear models the problem is much simpler. We use Cramer’s rule in the following theorem, whose proof can be found in [49]:

Theorem 3.7.

Let $\partial$ denote the differential operator $\tfrac{d}{dt}$, and let $(\partial I - A)_{ji}$ be the submatrix of $\partial I - A$ obtained by deleting the $j$th row and the $i$th column. Then the input-output equations are of the form:

where, for a given $i \in Out$, the equation is divided by the greatest common divisor of $\det(\partial I - A)$ and the minors $\det\left((\partial I - A)_{ji}\right)$ with $j \in In$.

Example 3.8 (Linear Compartment Model).

For the linear 2-compartment model from Example 2.2, we obtain the following input-output equation:

Thus we have the following coefficient map:

We obtain the Jacobian with respect to the chosen parameter ordering:

Since the rank of the Jacobian at a generic point is strictly less than the number of parameters, the model is generically unidentifiable.
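
For linear compartment models, the whole pipeline (input-output equation via Theorem 3.7, coefficient extraction, and the Jacobian rank test) can be scripted by treating the differential operator formally as a polynomial variable $s$. The sketch below does this for the assumed 2-compartment form of Example 2.2, with input and output in compartment 1; the matrix written below and the choice of minor are assumptions made for illustration.

```python
# Sketch of Theorem 3.7 for an assumed 2-compartment model: the left-hand side
# of the input-output equation is det(sI - A) acting on y1, and the right-hand
# side involves the minor obtained by deleting row 1 and column 1.  The
# differential operator d/dt is represented formally by the variable s.
import sympy as sp

a11, a12, a21, a22, s = sp.symbols("a11 a12 a21 a22 s")
A = sp.Matrix([[a11, a12], [a21, a22]])

lhs = sp.expand((s * sp.eye(2) - A).det())           # acts on y1
rhs = sp.expand((s * sp.eye(2) - A)[1:, 1:].det())   # acts on u1

# Coefficients of the (monic) input-output equation as functions of the parameters:
coeffs = [lhs.coeff(s, 1), lhs.coeff(s, 0), rhs.coeff(s, 0)]
params = [a11, a12, a21, a22]
J = sp.Matrix(coeffs).jacobian(params)
print(coeffs)                        # [-(a11 + a22), a11*a22 - a12*a21, -a22]
print(J.rank(), "of", len(params))   # 3 of 4: generically unidentifiable
```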

4. Identifiable functions

One issue that arises in identifiability analysis of state space models is what to do with a model that is generically unidentifiable. In some circumstances, the natural approach is to develop a new model with fewer parameters that is identifiable. In other circumstances, the given model is forced upon us by the biology, and we cannot change it. When working with such a generically unidentifiable model, we would still like to determine which functions of the parameters can be determined from given input and output data.

Definition 4.1.

Let $c$ be the coefficient map and let $q$ be another function of the parameters. We say the function $q$ is

  • identifiable from $c$ if for all $p_1, p_2 \in \Theta$, $c(p_1) = c(p_2)$ implies $q(p_1) = q(p_2)$;

  • generically identifiable from $c$ if there is an open dense subset $U \subseteq \Theta$ such that $q$ is identifiable from $c$ on $U$;

  • rationally identifiable from $c$ if there is a rational function $\phi$ such that $q = \phi \circ c$ on a dense open subset of $\Theta$;

  • locally identifiable from $c$ if there is an open dense subset $U \subseteq \Theta$ such that for all $p \in U$, there is an open neighborhood $V$ of $p$ such that $q$ is identifiable from $c$ on $V$;

  • non-identifiable from $c$ if there exist $p_1, p_2 \in \Theta$ such that $c(p_1) = c(p_2)$ but $q(p_1) \neq q(p_2)$; and

  • generically non-identifiable from $c$ if there is a subset $V \subseteq \Theta$ of nonzero measure such that for all $p \in V$ the set $\{q(p') : c(p') = c(p)\}$ is infinite.

Example 4.2.

From the linear 2-compartment model in Example 3.8, let and let . Then the functions are rationally identifiable since

Because we work with polynomial and rational maps $c$ and $q$ in this work, the majority of these conditions can be phrased in algebraic language and checked using computer algebra.

Proposition 4.3.
  1. The function $q$ is rationally identifiable from $c$ if and only if $\mathbb{R}(c(p), q(p)) = \mathbb{R}(c(p))$ as field extensions.

  2. The function $q$ is locally identifiable from $c$ if and only if $q(p)$ is algebraic over $\mathbb{R}(c(p))$.

  3. The function $q$ is generically non-identifiable from $c$ if and only if $q(p)$ is transcendental over $\mathbb{R}(c(p))$.

To explain how to use Proposition 4.3 to check the various identifiability conditions we need to introduce some terminology. Associated to a set $S$ we have the vanishing ideal $I(S)$ defined by

When $S$ is the image of a rational map, the vanishing ideal can be computed using Gröbner bases and elimination [16]. Associated to the pair of the coefficient map $c$ and the function $q$ whose identifiability we want to test, we have the augmented map $p \mapsto (c(p), q(p))$ and the augmented vanishing ideal

Proposition 4.4.

[30, Proposition 3] Suppose that $h$ is a polynomial in the augmented vanishing ideal in which the variable corresponding to $q$ appears, and write $h = h_d q^d + \cdots + h_1 q + h_0$, where the $h_i$ are polynomials in the coefficient variables and $h_d$ is not in the vanishing ideal of the image of $c$.

  1. If $h$ is linear in $q$ (that is, $d = 1$), then $q$ is rationally identifiable from $c$ by the formula $q = -h_0/h_1$. If in addition $h_1$ does not vanish anywhere on the image of $c$, then $q$ is globally identifiable.

  2. If $h$ has higher degree $d$ in $q$, then $q$ is locally identifiable, and generically there are at most $d$ possible values of $q$ among all parameter vectors with the same image under $c$.

  3. If no such polynomial exists, then $q$ is generically non-identifiable from $c$.
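
Proposition 4.4 can be applied computationally: compute the augmented vanishing ideal by elimination and inspect the degree, in the extra variable, of the polynomials that survive. The sketch below does this for the assumed 2-compartment coefficient map and the candidate function $a_{12}a_{21}$; both the coefficient expressions and the candidate are assumptions used for illustration.

```python
# Sketch of Proposition 4.4: eliminate the parameters from the graph of
# (c, q) and look for a relation that is linear in the extra variable q0.
import sympy as sp

a11, a12, a21, a22 = sp.symbols("a11 a12 a21 a22")
c1, c2, c3, q0 = sp.symbols("c1 c2 c3 q0")

params = [a11, a12, a21, a22]
gens = [
    c1 - (a11 + a22),               # assumed coefficients of the input-output equation
    c2 - (a11 * a22 - a12 * a21),
    c3 - a22,
    q0 - a12 * a21,                 # candidate function q(p) = a12*a21
]
G = sp.groebner(gens, *params, c1, c2, c3, q0, order="lex")
eliminated = [g for g in G.exprs if not g.free_symbols & set(params)]
print(eliminated)
# expect something like [q0 + c2 - c1*c3 + c3**2], which is linear in q0,
# so q is rationally identifiable from c by Proposition 4.4
```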

For local identifiability of a function, it is also possible to check the condition using a Jacobian calculation; this follows easily from Proposition 3.5.

Proposition 4.5.

Let $c$ be the coefficient map. A function $q$ is locally identifiable from $c$ if the gradient $\nabla q$ lies in the span of the rows of the Jacobian $J(c)$. Equivalently, consider the augmented map $p \mapsto (c(p), q(p))$. Then $q$ is locally identifiable from $c$ if and only if the dimension of the image of the augmented map equals the dimension of the image of $c$.
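
Proposition 4.5 gives an even cheaper linear-algebra test: append the gradient of the candidate function to the Jacobian of the coefficient map and compare ranks. The sketch below applies it to the same assumed 2-compartment coefficient map and candidate function.

```python
# Sketch of Proposition 4.5: q is locally identifiable from c exactly when
# appending the gradient of q to the Jacobian of c does not increase the rank.
import sympy as sp

a11, a12, a21, a22 = sp.symbols("a11 a12 a21 a22")
params = [a11, a12, a21, a22]
coeffs = [a11 + a22, a11 * a22 - a12 * a21, a22]   # assumed coefficient map
q = a12 * a21                                      # candidate function

J = sp.Matrix(coeffs).jacobian(params)
Jq = sp.Matrix(coeffs + [q]).jacobian(params)
print(J.rank(), Jq.rank())   # equal ranks => q is locally identifiable from c
```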

5. Finding identifiable functions

The previous section showed how to check, given the coefficient map $c$ and another function of the parameters $q$, whether $q$ is identifiable from $c$ (under various variations on the definition of identifiability). In some circumstances, there are natural functions whose identifiability we want to check (e.g. the individual underlying parameters, or certain specific functions with biological interpretations). However, when these fail to be identifiable, one would like tools to discover new functions that are identifiable in a given state space model. In practice the goal is to find a simple set of functions that generates the field $\mathbb{R}(c(p))$ (for globally identifiable functions), or a set of functions that are algebraic over $\mathbb{R}(c(p))$ and over which $\mathbb{R}(c(p))$ is algebraic (for locally identifiable functions). The notion “simple” is intentionally left vague; typically, we mean functions of low degree that involve few parameters. While there is no general purpose method guaranteed to solve these problems, there are some useful heuristic approaches that seem to work well in practice. We highlight some of these methods in the present section.

One approach to finding identifiable functions is to use Gröbner bases. Specifically, one can find a Gröbner basis of the ideal generated by the polynomials $c_i(p) - c_i(\hat{p})$, where $\hat{p}$ is a fixed (e.g. randomly chosen) parameter vector. We state the main result from [48].

Proposition 5.1.

[48, Theorem 1] If $f(p) - f(\hat{p})$ is an element of a Gröbner basis of this ideal for some elimination ordering of the parameter vector $p$, then $f$ is globally identifiable. If instead $f(p) - f(\hat{p})$ is a factor of an element in the Gröbner basis for some elimination ordering of the parameter vector $p$, then $f$ is locally identifiable.

In practice, the Gröbner basis computations can be performed by picking a random rational point $\hat{p}$ and computing a Gröbner basis in the polynomial ring in the parameters. This certifies identifiability with high probability. An elimination ordering is used since elements in the Gröbner basis at the end of the order are likely to be sparse.
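
The random-point strategy just described is easy to script. The sketch below specializes an assumed 2-compartment coefficient map at a random rational point, forms the corresponding ideal, and prints a lex Gröbner basis whose elements of the form $f(p) - f(\hat{p})$ exhibit identifiable functions.

```python
# Sketch of the random-point Groebner basis approach of Proposition 5.1.
# The coefficient map below is the assumed 2-compartment map used earlier;
# replace it with the coefficients of the model of interest.
import random
import sympy as sp

a11, a12, a21, a22 = sp.symbols("a11 a12 a21 a22")
params = [a11, a12, a21, a22]
coeffs = [a11 + a22, a11 * a22 - a12 * a21, a22]

pstar = {p: sp.Rational(random.randint(1, 100), random.randint(1, 100))
         for p in params}                        # random rational parameter point
ideal = [c - c.subs(pstar) for c in coeffs]      # generators c_i(p) - c_i(p*)
G = sp.groebner(ideal, *params, order="lex")
for g in G.exprs:
    print(sp.factor(g))
# Typical output: a11 - <value>, a12*a21 - <value>, a22 - <value>, exhibiting
# a11, a22, and a12*a21 as identifiable functions of the parameters.
```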

The main issue with the Gröbner basis approach to finding identifiable functions is that it is unclear a priori how many Gröbner bases one needs to compute in order to generate a full set of algebraically independent identifiable functions. Since Gröbner basis computations can become computationally expensive, we provide another approach to finding identifiable functions in this paper, using linear algebra with the Jacobian matrix $J(c)$. Specifically, we describe a sort of converse of Proposition 4.5, which allows us to take appropriate elements in the row span of $J(c)$ and deduce that they came from an identifiable function. We first prove a result in the homogeneous case and then extend to arbitrary coefficient maps via homogenization.

Theorem 5.2.

Let each coefficient $c_i$ of the input-output equations be a homogeneous function of degree $d_i$ in the parameters. Let $v = \sum_i \lambda_i \nabla c_i$ be a vector in the row span of the Jacobian $J(c)$ over the field $\mathbb{R}(c_1(p), \ldots, c_k(p))$ (that is, each $\lambda_i \in \mathbb{R}(c_1(p), \ldots, c_k(p))$). Then the dot product $v \cdot p$ is a rationally identifiable function. If instead each $\lambda_i$ is only locally identifiable, then $v \cdot p$ is locally identifiable.

To prove Theorem 5.2 we make use of the Euler homogeneous function theorem.

Proposition 5.3 (Euler’s Homogeneous Function Theorem).

Let $c$ be a homogeneous function of degree $d$ in the parameter vector $p$. Then $p \cdot \nabla c(p) = d \cdot c(p)$.

Proof of Theorem 5.2.

Let $\lambda = (\lambda_1, \ldots, \lambda_k)$ be the row vector of the $\lambda_i$, so that the function in question has the form
$$v \cdot p = \sum_{i} \lambda_i \, (\nabla c_i \cdot p).$$
The rows of $J(c)$ are the gradients of the $c_i$'s. Since these functions are homogeneous, we have $\nabla c_i \cdot p = d_i \, c_i(p)$ by Euler's homogeneous function theorem. But then
$$v \cdot p = \sum_i \lambda_i \, d_i \, c_i(p),$$
which expresses $v \cdot p$ as a polynomial function in elements of $\mathbb{R}(c_1(p), \ldots, c_k(p))$, so $v \cdot p$ is rationally identifiable. If each $\lambda_i$ were only locally identifiable, $v \cdot p$ would belong to an algebraic extension of $\mathbb{R}(c_1(p), \ldots, c_k(p))$ and hence be locally identifiable. ∎

Theorem 5.2 must be used in conjunction with Gaussian elimination and Proposition 4.4 or 4.5. Indeed, our strategy in implementations is to attempt Gaussian elimination, starting from the Jacobian matrix $J(c)$. At each step when we want to perform an elementary row operation, we use Proposition 4.4 or 4.5 to check whether the corresponding multiplier is rationally identifiable or locally identifiable. An approach based completely on linear algebra would only make use of Proposition 4.5, in which case we only deduce local identifiability.

Example 5.4.

Let $c$ be the coefficient map of the linear 2-compartment model in Example 4.2. Then the Jacobian $J(c)$ is given by

Then, applying Gaussian elimination with identifiable multipliers, we obtain:

This implies that the corresponding dot products with the parameter vector are all locally identifiable. Thus, the functions from Example 4.2 are locally identifiable.
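
The strategy described before Example 5.4 can be carried out symbolically. In the sketch below (again for the assumed 2-compartment coefficient map) only row operations whose multipliers are themselves identifiable functions are performed, and each resulting row is dotted with the parameter vector; by Euler's theorem the results are locally identifiable functions.

```python
# Sketch of Theorem 5.2 in use on the assumed 2-compartment coefficient map.
# Only row operations with identifiable multipliers (here a22 and a11 - a22,
# both expressible in the coefficients) are applied to the Jacobian.
import sympy as sp

a11, a12, a21, a22 = sp.symbols("a11 a12 a21 a22")
params = sp.Matrix([a11, a12, a21, a22])
coeffs = [a11 + a22, a11 * a22 - a12 * a21, a22]   # homogeneous of degrees 1, 2, 1

J = sp.Matrix(coeffs).jacobian(params)
r0, r1, r2 = J[0, :], J[1, :], J[2, :]
r1 = r1 - a22 * r0              # multiplier a22 is identifiable
r1 = r1 - (a11 - a22) * r2      # multiplier a11 - a22 is identifiable
r0 = r0 - r2
dots = [sp.expand((r * params)[0]) for r in (r0, r1, r2)]
print(dots)   # [a11, -2*a12*a21, a22]: dividing by the degrees shows that
              # a11, a12*a21, and a22 are locally identifiable
```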

Remark.

Note that in Example 4.2, we obtained that the functions in question are rationally identifiable, whereas in Example 5.4, we only obtained that they are locally identifiable. This is the cost of not using a Gröbner basis.

Remark.

The identifiable functions obtained using linear algebra on the Jacobian matrix depend heavily on the specific column ordering of the Jacobian matrix chosen. Thus, for a given column ordering (corresponding to a given parameter ordering), we may not generate the “simplest” locally identifiable functions. We do, however, always generate identifiable functions, as opposed to the Gröbner basis approach, in which there is no guarantee of generating elements (or factors of elements) of the form $f(p) - f(\hat{p})$ for a given elimination ordering.

Example 5.5.

From the SIR Model in Example 3.3, we can form the following coefficient map, ignoring constant coefficients:

Thus we obtain the following Jacobian with respect to the chosen parameter ordering:

From this we get the row-reduced Jacobian:

Thus, dotting each row vector with the parameter vector and dividing each resulting polynomial by its degree, we find locally identifiable functions of the parameters.

When the coefficient functions are not homogeneous, we can homogenize them by a new variable $t$ and add $t$ to the list of identifiable functions. This results in a similar identifiability result.

Theorem 5.6.

Let $c^h$ be the homogenization of the coefficient map $c$ by the new variable $t$. Let $v$ be a vector in the row span of the Jacobian $J(c^h)$ over the field generated by the homogenized coefficients and $t$. Then the dot product of $v$ with the extended parameter vector $(p, t)$ is a rationally identifiable function. If the multipliers are only locally identifiable, then this dot product is locally identifiable.

Proof.

Clearly the dot product is rationally identifiable over the field generated by the homogenized coefficients and $t$, by Theorem 5.2. We need to show that setting $t = 1$ preserves identifiability. Setting $t = 1$ in each homogenized coefficient recovers the original coefficient, so the field generated by the specialized coefficients is contained in $\mathbb{R}(c(p))$. Hence a function that is rational (respectively algebraic) in the homogenized coefficients and $t$ specializes, at $t = 1$, to a function that is rational (respectively algebraic) in the original coefficients. ∎

Example 5.7.

Let be the map . Then the homogenized map is the map . Then the Jacobian is given by

Then applying Gaussian elimination over , we obtain:

Thus, dotting each row vector with the extended parameter vector, we obtain locally identifiable functions. Dividing by the degree and setting $t = 1$, we obtain locally identifiable functions of the original parameters.
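
The homogenization step of Theorem 5.6 can also be scripted. The sketch below homogenizes a small assumed non-homogeneous coefficient map by an extra variable t, appends t itself (identifiable by construction), and checks the Euler dot-product identity on the homogenized map; the map is a toy assumption, not the one from Example 5.7.

```python
# Sketch of the homogenization step behind Theorem 5.6, on an assumed toy map.
import sympy as sp

a, b, t = sp.symbols("a b t")
coeffs = [a + b, a * b + b]                        # assumed non-homogeneous map
homog = [sp.Poly(c, a, b).homogenize(t).as_expr() for c in coeffs]
print(homog)                                       # [a + b, a*b + b*t]

params = sp.Matrix([a, b, t])
J = sp.Matrix(homog + [t]).jacobian(params)        # t is identifiable by construction
print([sp.expand((J[i, :] * params)[0]) for i in range(J.rows)])
# [a + b, 2*a*b + 2*b*t, t]: each dot product is deg(c_i) * c_i by Euler's
# theorem; setting t = 1 transfers the conclusions to the original map.
```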

6. Observability

In this section we explore how algebraic and combinatorial tools can be used to determine whether or not the state variables are observable. Roughly speaking, a state variable is observable if it can be recovered from observation of the input and output alone. We will use algebraic language to make this precise and explain how Gröbner bases and matroids can be used to check this condition.

Definition 6.1.

Consider a state space model of form (2).

  • The state variable $x_i$ is generically observable if, given the input and output trajectories and a generic parameter value, there is a unique trajectory for $x_i$ compatible with the given input/output trajectory.

  • The state variable $x_i$ is rationally observable if, given the input and output trajectories and a generic parameter value, there is a rational function of the parameters, the inputs, the outputs, and finitely many of their derivatives whose value along the trajectory equals $x_i(t)$.

  • The state variable $x_i$ is generically locally observable if, given a generic parameter vector, there is an open neighborhood of the trajectory $x_i(t)$ containing no other trajectory that is compatible with the input/output data.

  • The state variable $x_i$ is generically unobservable if, given the input and output trajectories and a generic parameter value, there are infinitely many trajectories for $x_i$ compatible with the given input/output trajectory.

As usual, when $f$ and $g$ are polynomial functions, we can give equivalent algebraic formulations of many of these conditions, together with algebraic methods for checking them.
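
One common computational counterpart of these definitions is the observability rank test: stack the output and its successive Lie derivatives along the vector field and check the rank of their Jacobian with respect to the states. The sketch below applies this test to an assumed SIR-type system with output $y = kI$; it complements, but is not identical to, the ideal-theoretic criterion of Proposition 6.2 below, and the model equations written in the code are assumptions for illustration.

```python
# Sketch of an observability rank test on an assumed SIR-type model.
import sympy as sp

S, I, R = sp.symbols("S I R")
mu, beta, gamma, N, k = sp.symbols("mu beta gamma N k", positive=True)

states = [S, I, R]
f = sp.Matrix([mu * N - beta * S * I / N - mu * S,
               beta * S * I / N - (mu + gamma) * I,
               gamma * I - mu * R])
g = k * I                                    # observed output

def lie_derivative(h, f, states):
    """Directional derivative of h along the vector field f."""
    return (sp.Matrix([h]).jacobian(states) * f)[0]

obs = [g]
for _ in range(len(states) - 1):
    obs.append(sp.simplify(lie_derivative(obs[-1], f, states)))

O = sp.Matrix(obs).jacobian(states)
print(O.rank(), "of", len(states))
# rank < 3 here: R never influences the output in this model, so it is unobservable
```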

The following proposition gives algebraic conditions for observability. More details on the differential algebra involved can be found in [31].

Proposition 6.2.

Consider a state space model of form (2). Let $J$ be the differential ideal generated by the polynomials defining the system, namely the components of $\dot{x} - f(x, u, p)$ and $y - g(x, p)$. Let $h$ be a polynomial involving a state variable $x_i$, and write it as $h = h_d x_i^d + \cdots + h_1 x_i + h_0$, where each $h_j$ involves only the parameters, the inputs, the outputs, and their derivatives. Then

  • If such an $h$ lies in $J$ and is linear in $x_i$ (that is, $d = 1$), then $x_i$ is rationally observable.