# On the bijectivity of families of exponential/generalized polynomial maps

Stefan Müller, Josef Hofbauer, Georg Regensburger
###### Abstract

We start from a parametrized system of generalized polynomial equations (with real exponents) for positive variables, involving generalized monomials with positive parameters. Existence and uniqueness of a solution for all parameters (and for all right-hand sides) is equivalent to the bijectivity of a family of generalized polynomial/exponential maps.

We characterize the bijectivity of the family of exponential maps in terms of two linear subspaces arising from the coefficient and exponent matrices, respectively. In particular, we obtain conditions in terms of sign vectors of the two subspaces and a nondegeneracy condition involving the exponent subspace itself. Thereby, all criteria can be checked effectively.

Moreover, we characterize when the existence of a unique solution is robust with respect to small perturbations of the exponents or/and the coefficients. In particular, we obtain conditions in terms of sign vectors of the linear subspaces or, alternatively, in terms of maximal minors of the coefficient and exponent matrices.
Keywords: Birch’s theorem, global invertibility, Hadamard’s theorem, Descartes’ rule, sign vectors, oriented matroids, perturbations, robustness
AMS subject classification: 12D10 26C10 52B99 52C40

Stefan Müller, Josef Hofbauer
Faculty of Mathematics, University of Vienna, Oskar-Morgenstern-Platz 1, 1090 Wien, Austria
Georg Regensburger
Institute for Algebra, Johannes Kepler University Linz, Altenberger Straße 69, 4040 Linz, Austria
Corresponding author: st.mueller@univie.ac.at

## 1 Introduction

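Given two matrices $W, \tilde W \in \mathbb{R}^{d \times n}$ with $d \le n$ and full rank, consider the parametrized system of generalized polynomial equations

$$\sum_{j=1}^{n} w_{ij}\, c_j\, x_1^{\tilde w_{1j}} \cdots x_d^{\tilde w_{dj}} = y_i, \quad i = 1, \ldots, d$$

for positive variables $x_i > 0$ (and right-hand sides $y_i$), involving the ‘monomials’ $c_j\, x_1^{\tilde w_{1j}} \cdots x_d^{\tilde w_{dj}}$, $j = 1, \ldots, n$, in particular, the positive parameters $c_j > 0$. In other words, $x \in \mathbb{R}^d_>$, $y \in \mathbb{R}^d$, and $c \in \mathbb{R}^n_>$. As in the theory of fewnomials [15, 23], the monomials are given; however, a positive parameter is associated with every monomial.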

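Writing the vector of monomials as $c \circ x^{\tilde W} \in \mathbb{R}^n_>$, thereby introducing $x^{\tilde W} \in \mathbb{R}^n_>$ as $(x^{\tilde W})_j = x_1^{\tilde w_{1j}} \cdots x_d^{\tilde w_{dj}}$ and denoting componentwise multiplication by $\circ$, yields the compact form

$$W \left( c \circ x^{\tilde W} \right) = y.$$

Note that, for the existence of a positive solution $x$, the right-hand side $y$ must lie in the interior of $C = \operatorname{cone} W$, the polyhedral cone generated by the columns of $W$. The question arises whether the above equation system has a unique positive solution $x$, for all right-hand sides $y$ in the interior $C^\circ$ and all positive parameters $c$. This question is equivalent to whether the generalized polynomial map $f_c \colon \mathbb{R}^d_> \to C^\circ$,

$$f_c(x) = W \left( c \circ x^{\tilde W} \right)$$

or, equivalently, the exponential map $F_c \colon \mathbb{R}^d \to C^\circ$,

$$F_c(x) = W \left( c \circ e^{\tilde W^T x} \right)$$

is bijective for all $c > 0$.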
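The equivalence of the two maps rests on the substitution $x \mapsto e^x$ (componentwise), which gives $f_c(e^x) = F_c(x)$. The following is a minimal numerical sketch of this identity, assuming NumPy; the matrices, the parameter vector, and the function names are illustrative choices and do not come from the text.

```python
import numpy as np

def monomials(x, W_tilde):
    # x^{W~}: the j-th entry is prod_i x_i ** W~[i, j]; x must be positive.
    return np.prod(x[:, None] ** W_tilde, axis=0)

def f(x, W, W_tilde, c):
    # Generalized polynomial map f_c(x) = W (c o x^{W~}).
    return W @ (c * monomials(x, W_tilde))

def F(x, W, W_tilde, c):
    # Exponential map F_c(x) = W (c o exp(W~^T x)).
    return W @ (c * np.exp(W_tilde.T @ x))

# Illustrative data (assumed for this sketch): d = 2, n = 3.
W = np.array([[1.0, 0.0, -1.0], [0.0, 1.0, 0.0]])
W_tilde = np.array([[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]])
c = np.array([1.0, 2.0, 0.5])
x = np.array([0.3, -0.7])

# f_c evaluated at exp(x) agrees with F_c evaluated at x.
assert np.allclose(f(np.exp(x), W, W_tilde, c), F(x, W, W_tilde, c))
```

Working with $F_c$ rather than $f_c$ moves the domain from the positive orthant to all of $\mathbb{R}^d$, which is the form used for the global inversion arguments.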

In the context of generalized chemical reaction networks [18, 19], the question is equivalent to whether every set of complex-balanced equilibria (an ‘exponential manifold’) intersects every stoichiometric compatibility class (an affine subspace) in exactly one point. For a motivation from Chemical Reaction Network Theory, see Appendix A or [9]. In the context of classical chemical reaction networks, the assumption of mass-action kinetics implies $\tilde W = W$, and in this case there is indeed exactly one complex-balanced equilibrium in every stoichiometric compatibility class.

In case $\tilde W = W$, the map $F_c$ also appears in toric geometry [11], where it is related to moment maps, and in statistics [20], where it is related to log-linear models. The following result guarantees the bijectivity of $F_c$ for all $c > 0$. It is a variant of Birch’s Theorem [25, 20, 6].

###### Theorem 1 ([11], Section 4.2).

Let $\tilde W = W$. Then the map $F_c$ is a real analytic isomorphism of $\mathbb{R}^d$ onto $C^\circ$ for all $c > 0$.

In this work, we characterize the bijectivity of the map $F_c$ for all $c > 0$ (for given coefficients $W$ and exponents $\tilde W$) in terms of (sign vectors of) the linear subspaces $S = \ker W$ and $\tilde S = \ker \tilde W$. Thereby we extend previous results, in particular, sufficient conditions for bijectivity [18, 17, 9]. Moreover, we characterize the robustness of bijectivity with respect to small perturbations of the exponents $\tilde W$ or/and the coefficients $W$, corresponding to small perturbations of the subspaces $\tilde S$ and $S$ (in the Grassmannian).

Our main technical tool is Hadamard’s global inversion theorem, which essentially states that a $C^1$-map is a diffeomorphism if and only if it is locally invertible and proper. By previous results [8, 18], the map $F_c$ is locally invertible for all $c > 0$ if and only if it is injective for all $c > 0$, which can be characterized in terms of sign vectors of the subspaces $S$ and $\tilde S$. Most importantly, we show that $F_c$ is proper if and only if it is ‘proper along rays’ and that properness for all $c > 0$ can be characterized in terms of sign vectors of $S$ and $\tilde S$, together with a nondegeneracy condition depending on the subspace $\tilde S$ itself.

The crucial role of sign vectors in the characterization of existence and uniqueness of positive solutions to parametrized polynomial equations suggests a comparison with Descartes’ rule of signs for univariate polynomials. Consider a univariate polynomial and order the monomials by their exponents. Count the sign changes in the sequence of (nonzero) coefficients and the positive roots (where multiple roots are counted separately). Then, Descartes’ rule [24] states that the number of positive roots is at most the number of sign changes and that the difference of the two numbers is even. As shown by Laguerre [16, 14], the same statement holds for generalized monomials (with real exponents). More recently, it has been shown that the upper bound is sharp [1]: for a given sign sequence, there exist coefficients for which the bound is attained. Hence a sharp Descartes’ rule states that a univariate polynomial has exactly one positive root for all coefficients with given signs if and only if there is exactly one sign change. Indeed, this statement follows from our main result (for univariate polynomials). Hence our main result can be seen as a multivariate generalization of the sharp Descartes’ rule for exactly one positive solution.
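Descartes’ rule can be illustrated numerically for ordinary polynomials (integer exponents). The sketch below is only an illustration, assuming NumPy and its `numpy.roots` routine; the helper names `sign_changes` and `positive_roots` and the tolerance are hypothetical choices, and the examples avoid multiple roots, which a numerical root finder may not resolve reliably.

```python
import numpy as np

def sign_changes(coeffs):
    # Number of sign changes in the sequence of nonzero coefficients.
    signs = [np.sign(a) for a in coeffs if a != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

def positive_roots(coeffs, tol=1e-9):
    # coeffs are ordered from highest to lowest degree, as numpy.roots expects.
    roots = np.roots(coeffs)
    return sum(1 for r in roots if abs(r.imag) < tol and r.real > tol)

# Illustrative polynomials: x^2 - 3x + 2, x^3 - 1, and x^2 - x + 1.
for p in ([1, -3, 2], [1, 0, 0, -1], [1, -1, 1]):
    s, r = sign_changes(p), positive_roots(p)
    # Descartes' rule: r <= s and s - r is even.
    assert r <= s and (s - r) % 2 == 0
```

The second polynomial has exactly one sign change and exactly one positive root, which is the situation that the sharp rule singles out.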

### Organization of the work

In Section 2, we introduce the family of exponential maps $F_c$ with $c > 0$ and discuss previous results on injectivity.

In Section 3, we present our main result, Theorem 13, characterizing the bijectivity of the family $F_c$, and the crucial Lemmas 11 and 16, regarding the properness of $F_c$. In Subsection 3.1, we discuss two extreme cases regarding the geometry of $C$, the polyhedral cone generated by the columns of $W$. Namely, $C = \mathbb{R}^d$ or $C$ is pointed. In the latter case, we present necessary conditions for the surjectivity of $F_c$. In Subsection 3.2, we show that the bijectivity of the family $F_c$ cannot be characterized in terms of sign vectors only, cf. Example 21. Still, there are sufficient conditions for bijectivity in terms of sign vectors or in terms of faces of the Newton polytope.

In Section 4, we study the robustness of bijectivity. In Subsection 4.1, we consider perturbations of the exponents $\tilde W$ and show that robustness of bijectivity is equivalent to robustness of injectivity, which can be characterized in terms of sign vectors, cf. Theorem 32. The criterion involves the closure of a set of sign vectors and represents another sufficient condition for bijectivity. In Subsection 4.2, we consider perturbations of the coefficients $W$ and characterize robustness of bijectivity again in terms of sign vectors (including another closure condition), cf. Theorem 38. In particular, robustness of bijectivity implies that either $C = \mathbb{R}^d$ or $C$ is pointed. In the latter case, the faces of $C$ are minimally generated. Finally, in Subsection 4.3, we express the closure condition in terms of maximal minors of $W$ and $\tilde W$. Further, we consider general perturbations (of both exponents and coefficients) and characterize robustness of bijectivity in terms of sign vectors and maximal minors, cf. Theorem 42.

Finally, we provide appendices on (A) a motivation from Chemical Reaction Network Theory, (B) oriented matroids, and (C) a theorem of the alternative.

### Notation

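We denote the positive real numbers by $\mathbb{R}_>$ and the nonnegative real numbers by $\mathbb{R}_\ge$. We write $x > 0$ for $x \in \mathbb{R}^n_>$ and $x \ge 0$ for $x \in \mathbb{R}^n_\ge$. For vectors $x, y \in \mathbb{R}^n$, we denote their scalar product by $x \cdot y \in \mathbb{R}$ and their componentwise (Hadamard) product by $x \circ y \in \mathbb{R}^n$.

For a vector $x \in \mathbb{R}^n$, we obtain the sign vector $\operatorname{sign}(x) \in \{-,0,+\}^n$ by applying the sign function componentwise, and we write

$$\operatorname{sign}(S) = \{ \operatorname{sign}(x) \mid x \in S \}$$

for a subset $S \subseteq \mathbb{R}^n$.

For a vector $x$ with $x \in \mathbb{R}^n$ or $x \in \{-,0,+\}^n$, we denote its support by $\operatorname{supp}(x) = \{ i \mid x_i \neq 0 \}$. For a subset $S \subseteq \mathbb{R}^n$, we say that a nonzero vector $x \in S$ has (inclusion-)minimal support, if $\operatorname{supp}(x') \subseteq \operatorname{supp}(x)$ implies $\operatorname{supp}(x') = \operatorname{supp}(x)$ for all nonzero $x' \in S$.

For a sign vector $\tau \in \{-,0,+\}^n$, we introduce

$$\tau^- = \{ i \mid \tau_i = - \}, \quad \tau^0 = \{ i \mid \tau_i = 0 \}, \quad \text{and} \quad \tau^+ = \{ i \mid \tau_i = + \}.$$

In particular, $\operatorname{supp}(\tau) = \tau^- \cup \tau^+$. For a subset $\Sigma \subseteq \{-,0,+\}^n$, we write

$$\Sigma_\oplus = \Sigma \cap \{0,+\}^n.$$

The inequalities $0 < -$ and $0 < +$ induce a partial order on $\{-,0,+\}^n$: for sign vectors $\tau, \rho \in \{-,0,+\}^n$, we write $\tau \le \rho$ if the inequality holds componentwise. The product on $\{-,0,+\}$ is defined in the obvious way. For $\tau, \rho \in \{-,0,+\}^n$, we write $\tau \cdot \rho = 0$ ($\tau$ and $\rho$ are orthogonal) if either $\tau_i \rho_i = 0$ for all $i$ or there exist $i, j$ with $\tau_i \rho_i = -$ and $\tau_j \rho_j = +$. For a set $\Sigma \subseteq \{-,0,+\}^n$, we introduce the orthogonal complement

$$\Sigma^\perp = \{ \tau \in \{-,0,+\}^n \mid \tau \cdot \rho = 0 \text{ for all } \rho \in \Sigma \}.$$

Moreover, for $\tau, \rho \in \{-,0,+\}^n$, we define the composition $\tau \circ \rho \in \{-,0,+\}^n$ as $(\tau \circ \rho)_i = \tau_i$ if $\tau_i \neq 0$ and $(\tau \circ \rho)_i = \rho_i$ otherwise.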
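The operations on sign vectors introduced above are easy to experiment with. The following is a minimal sketch, assuming NumPy and encoding the signs $-, 0, +$ as the integers $-1, 0, 1$; the function names are hypothetical and chosen only for this illustration.

```python
import numpy as np

def sign_vector(x):
    # Apply the sign function componentwise; result as a tuple of -1, 0, 1.
    return tuple(int(s) for s in np.sign(x))

def orthogonal(tau, rho):
    # tau . rho = 0: all componentwise products vanish,
    # or both a negative and a positive product occur.
    prods = [t * r for t, r in zip(tau, rho)]
    return all(p == 0 for p in prods) or (min(prods) < 0 < max(prods))

def compose(tau, rho):
    # (tau o rho)_i = tau_i if tau_i != 0, and rho_i otherwise.
    return tuple(t if t != 0 else r for t, r in zip(tau, rho))

assert sign_vector([2.0, 0.0, -3.5]) == (1, 0, -1)
assert orthogonal((1, 1, 0), (1, -1, 1))       # products +, -, 0
assert not orthogonal((1, 0, 0), (1, 1, 0))    # only a positive product
assert compose((1, 0, -1), (-1, -1, 1)) == (1, -1, -1)
```

Orthogonality of sign vectors mirrors orthogonality of real vectors: a vanishing scalar product requires either that all products are zero or that positive and negative terms can cancel.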

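For a matrix $W \in \mathbb{R}^{d \times n}$, we denote its column vectors by $w^1, \ldots, w^n \in \mathbb{R}^d$. For any natural number $n$, we define $[n] = \{1, \ldots, n\}$. For $W \in \mathbb{R}^{d \times n}$ with $d \le n$ and $I \subseteq [n]$ of cardinality $d$, we denote the square submatrix of $W$ with column indices in $I$ by $W_I$.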

## 2 Families of exponential maps

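Let $W \in \mathbb{R}^{d \times n}$, $\tilde W \in \mathbb{R}^{\tilde d \times n}$ be matrices with $d, \tilde d \le n$ and full rank. Further, let

$$C = \operatorname{cone} W \subseteq \mathbb{R}^d$$

be the cone generated by the columns of $W$. Since $W$ has full rank, the cone $C$ has nonempty interior $C^\circ$. Finally, let $c \in \mathbb{R}^n_>$. We define the exponential map

$$\begin{aligned} F_c \colon \mathbb{R}^{\tilde d} &\to C^\circ \subseteq \mathbb{R}^d \\ x &\mapsto W \left( c \circ e^{\tilde W^T x} \right) = \sum_{i=1}^{n} c_i \, e^{\tilde w^i \cdot x} \, w^i \end{aligned} \tag{1}$$

and the related subspaces

$$S = \ker W \subseteq \mathbb{R}^n \quad \text{and} \quad \tilde S = \ker \tilde W \subseteq \mathbb{R}^n. \tag{2}$$

Note that injectivity and surjectivity of $F_c$ only depend on $S$ and $\tilde S$. In fact, let $V \in \mathbb{R}^{d \times n}$, $\tilde V \in \mathbb{R}^{\tilde d \times n}$ be such that $S = \ker V$, $\tilde S = \ker \tilde V$, and let

$$G_c(x) = V \left( c \circ e^{\tilde V^T x} \right)$$

be the corresponding exponential map. Then $V = U W$, $\tilde V = \tilde U \tilde W$ for invertible matrices $U \in \mathbb{R}^{d \times d}$, $\tilde U \in \mathbb{R}^{\tilde d \times \tilde d}$, and

$$G_c(x) = U F_c(\tilde U^T x).$$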
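The last identity can be confirmed numerically. The sketch below is an illustration under stated assumptions: it uses NumPy, randomly drawn matrices (a random square matrix is invertible with probability one), and a hypothetical helper `exp_map`; none of these names or values come from the text.

```python
import numpy as np

def exp_map(A, A_tilde, c, x):
    # x -> A (c o exp(A~^T x)), the exponential map defined by A and A~.
    return A @ (c * np.exp(A_tilde.T @ x))

rng = np.random.default_rng(0)
d, d_tilde, n = 2, 3, 4                     # illustrative dimensions
W = rng.normal(size=(d, n))
W_tilde = rng.normal(size=(d_tilde, n))
U = rng.normal(size=(d, d))                 # change of basis for the rows of W
U_tilde = rng.normal(size=(d_tilde, d_tilde))
c = rng.uniform(0.5, 2.0, size=n)
x = rng.normal(size=d_tilde)

V, V_tilde = U @ W, U_tilde @ W_tilde       # same kernels as W and W~
lhs = exp_map(V, V_tilde, c, x)             # G_c(x)
rhs = U @ exp_map(W, W_tilde, c, U_tilde.T @ x)   # U F_c(U~^T x)
assert np.allclose(lhs, rhs)
```

Since $U$ and $\tilde U$ are invertible, $G_c$ is injective (surjective) exactly when $F_c$ is, which is why only the two kernels matter.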

### 2.1 Previous results on injectivity

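In the context of multiple equilibria in mass-action systems [7] and geometric modeling [8], where $d = \tilde d$, it was shown that the map $F_c$ is injective for all $c > 0$ if and only if $F_c$ is a local diffeomorphism for all $c > 0$.

###### Theorem 2 (Theorem 7 and Corollary 8 in [8]).

Let $F_c$ be as in (1) with $d = \tilde d$. Then the following statements are equivalent:

1. $F_c$ is injective for all $c > 0$.

2. $\det\left(\frac{\partial F_c}{\partial x}\right) \neq 0$ for all $x \in \mathbb{R}^d$ and all $c > 0$.

3. $\det(W_I) \det(\tilde W_I) \ge 0$ for all subsets $I \subseteq [n]$ of cardinality $d$ (or ‘$\le 0$’ for all $I$) and $\det(W_I) \det(\tilde W_I) \neq 0$ for some $I$.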
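The maximal-minor condition in statement 3 involves finitely many determinants and can be checked by enumeration. The following is a minimal sketch, assuming NumPy and floating-point determinants with a small tolerance; the function names and the test matrices are illustrative assumptions, and an exact check would require rational or symbolic arithmetic.

```python
from itertools import combinations
import numpy as np

def minor_products(W, W_tilde):
    # det(W_I) * det(W~_I) for all column index sets I of cardinality d.
    d, n = W.shape
    return np.array([np.linalg.det(W[:, I]) * np.linalg.det(W_tilde[:, I])
                     for I in map(list, combinations(range(n), d))])

def minors_condition(W, W_tilde, tol=1e-12):
    # All products >= 0 (or all <= 0), and at least one product nonzero.
    p = minor_products(W, W_tilde)
    same_sign = np.all(p >= -tol) or np.all(p <= tol)
    return bool(same_sign and np.any(np.abs(p) > tol))

W = np.array([[1.0, 0.0, -1.0], [0.0, 1.0, 0.0]])
A = np.array([[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]])   # products: 1, 0, 1
B = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])   # products: 1, 0, -1

assert minors_condition(W, A)
assert not minors_condition(W, B)
```

The number of index sets grows as $\binom{n}{d}$, so this brute-force enumeration is only practical for small matrices.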

In [18], we gave an alternative proof of this result and extended it to the case , by using the sign vectors of the subspaces and .

###### Theorem 3 (Theorem 3.6 in [18]).

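Let $F_c$ be as in (1) and $S, \tilde S$ be as in (2). Then the following statements are equivalent:

1. $F_c$ is injective for all $c > 0$.

2. $F_c$ is an immersion for all $c > 0$.
($\frac{\partial F_c}{\partial x}$ is injective for all $x$ and all $c > 0$.)

3. $\operatorname{sign}(S) \cap \operatorname{sign}(\tilde S^\perp) = \{0\}$.

Theorems 2 and 3 characterize the injectivity of $F_c$ with $d = \tilde d$ for all $c > 0$ equivalently in terms of maximal minors and sign vectors.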
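The link between the Jacobian and the maximal minors can be made explicit: differentiating (1) gives the Jacobian $W \operatorname{diag}(c \circ e^{\tilde W^T x}) \tilde W^T$, and the Cauchy–Binet formula expands its determinant as a sum over index sets $I$ of $\det(W_I) \det(\tilde W_I)$ times positive weights. The sketch below checks this expansion numerically; it assumes NumPy, and the matrices and function names are illustrative, not taken from the text.

```python
from itertools import combinations
import numpy as np

def jacobian(W, W_tilde, c, x):
    # Jacobian of F_c at x: W diag(c o exp(W~^T x)) W~^T.
    return W @ np.diag(c * np.exp(W_tilde.T @ x)) @ W_tilde.T

def det_via_minors(W, W_tilde, c, x):
    # Cauchy-Binet: sum over I of det(W_I) det(W~_I) prod_{i in I} weight_i.
    d, n = W.shape
    weights = c * np.exp(W_tilde.T @ x)
    total = 0.0
    for I in map(list, combinations(range(n), d)):
        total += (np.linalg.det(W[:, I]) * np.linalg.det(W_tilde[:, I])
                  * np.prod(weights[I]))
    return total

W = np.array([[1.0, 0.0, -1.0], [0.0, 1.0, 0.0]])
W_tilde = np.array([[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]])
rng = np.random.default_rng(1)
c = rng.uniform(0.5, 2.0, size=3)
x = rng.normal(size=2)

assert np.isclose(np.linalg.det(jacobian(W, W_tilde, c, x)),
                  det_via_minors(W, W_tilde, c, x))
```

If all products of minors have the same sign and one of them is nonzero, every term in the expansion has that sign and the determinant cannot vanish, whatever the values of $c > 0$ and $x$.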

###### Corollary 4.

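Let $S, \tilde S$ be subspaces of $\mathbb{R}^n$ of dimension $n - d$ (with $d \le n$). For every $W, \tilde W \in \mathbb{R}^{d \times n}$ (with full rank $d$) such that $S = \ker W$ and $\tilde S = \ker \tilde W$, the following statements are equivalent.

1. $\operatorname{sign}(S) \cap \operatorname{sign}(\tilde S^\perp) = \{0\}$.

2. $\det(W_I) \det(\tilde W_I) \ge 0$ for all subsets $I \subseteq [n]$ of cardinality $d$ (or ‘$\le 0$’ for all $I$) and $\det(W_I) \det(\tilde W_I) \neq 0$ for some $I$.

In the language of oriented matroids, Corollary 4 relates chirotopes (maximal minors of $W$ and $\tilde W$) to vectors (sign vectors of $S$ and $\tilde S$), see also Appendix B. Thereby, the sign vector condition is symmetric with respect to $S$ and $\tilde S$.

###### Corollary 5 (Corollary 3.8 in [18]).

Let $S, \tilde S$ be subspaces of $\mathbb{R}^n$ of equal dimension. Then

$$\operatorname{sign}(S) \cap \operatorname{sign}(\tilde S^\perp) = \{0\} \quad \text{if and only if} \quad \operatorname{sign}(\tilde S) \cap \operatorname{sign}(S^\perp) = \{0\}.$$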

See also [5] for a direct proof of Corollaries 4 and 5.

## 3 Bijectivity

A necessary condition for the bijectivity of the map $F_c$ is $d = \tilde d$. In the rest of the paper, we consider $F_c$ as in (1) with $d = \tilde d$ and the related subspaces $S, \tilde S$ as in (2).

A first sufficient condition for the bijectivity of the map $F_c$ for all $c > 0$ (in terms of sign vectors of $S$ and $\tilde S$) was given in [18], thereby extending Theorem 1 (Birch’s Theorem).

###### Theorem 6 (Proposition 3.9 in [18]).

If and , then the map is a real analytic isomorphism for all .

As it will turn out, is sufficient for bijectivity, and the technical condition in [18] is not needed, cf. Corollary 15. We note that Theorems 2, 3, and 6 allowed a first multivariate generalization of Descartes’ rule of signs for at most/exactly one positive solution, see [17].

In order to characterize the bijectivity of the map $F_c$ for all $c > 0$, we start with the following observation.

###### Proposition 7.

The following statements are equivalent.

1. $F_c$ is bijective for all $c > 0$.

2. $F_c$ is a diffeomorphism for all $c > 0$.

3. $F_c$ is a real analytic isomorphism for all $c > 0$.

###### Proof.

Let $F_c$ be bijective for all $c > 0$. In particular, it is injective, and $\det\left(\frac{\partial F_c}{\partial x}\right) \neq 0$ for all $x$ and $c > 0$, by Theorems 2 or 3. Hence, $F_c$ is a local diffeomorphism for all $c > 0$. Further, $F_c$ is real analytic and hence a local real analytic isomorphism for all $c > 0$. ∎

Most importantly, we will use Hadamard’s global inversion theorem [13].

###### Theorem 8.

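A $C^1$-map $F \colon \mathbb{R}^n \to \mathbb{R}^n$ is a diffeomorphism if and only if the Jacobian $\det\left(\frac{\partial F}{\partial x}\right) \neq 0$ for all $x \in \mathbb{R}^n$ and $|F(x)| \to \infty$ whenever $|x| \to \infty$.

In fact, we need a slightly more general version of this result, which follows from the general invertibility theorem in [3], see also [12].

###### Theorem 9.

Let $U \subseteq \mathbb{R}^n$ be open and convex. A $C^1$-map $F \colon \mathbb{R}^n \to U$ is a diffeomorphism if and only if the Jacobian $\det\left(\frac{\partial F}{\partial x}\right) \neq 0$ for all $x \in \mathbb{R}^n$ and $F$ is proper.

Recall that a map $F \colon \mathbb{R}^n \to U$ is proper, if $F^{-1}(K)$ is compact for each compact subset $K$ of $U$. This is obviously necessary for the inverse to be continuous.

###### Lemma 10.

Let $U \subseteq \mathbb{R}^n$ be open. A continuous map $F \colon \mathbb{R}^n \to U$ is proper if and only if, for sequences $x_n$ in $\mathbb{R}^n$ with $|x_n| = 1$ and $x_n \to x$ and $t_n$ in $\mathbb{R}_>$ with $t_n \to \infty$, $F(x_n t_n) \to y$ implies $y \in \partial U$.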

###### Proof.

Suppose is proper and , but . Take a closed ball around . Then contains the unbounded sequence and hence is not compact, a contradiction.

Conversely, let be a compact subset of . We need to show that every sequence in has an accumulation point. Since is closed, we only need to show that has a bounded subsequence. Suppose not, then . Since , there is a subsequence (call it again) such that . Now there is another subsequence (call it again) such that , that is, the sequence on the unit sphere converges. With , we have , a contradiction. ∎

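In particular, if $F$ is proper, then, for all nonzero $x \in \mathbb{R}^n$, $F(xt) \to y$ as $t \to \infty$ implies $y \in \partial U$. That is, if the function values converge along a ray, then the limit lies on the boundary of the range.

By Lemma 11 below, the map $F_c$ under consideration is proper, if it is ‘proper along rays’. Before we prove this result, we discuss the behaviour of $F_c$ along a ray. For $x \in \mathbb{R}^d$ and $\lambda \in \mathbb{R}$, we introduce

$$I_{x,\lambda} = \{ i \mid \tilde w^i \cdot x = \lambda \}$$

and write

$$F_c(xt) = \sum_{\lambda} \sum_{i \in I_{x,\lambda}} c_i \, e^{\lambda t} \, w^i.$$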

Now, for nonzero , either as or . In the first case, there is such that

$$F_c(xt) \, e^{-\lambda t} \to \sum_{i \in I_{x,\lambda}} c_i \, w^i \neq 0$$

as . In the second case, for all and

$$F_c(xt) \to \sum_{i \in I_{x,0}} c_i \, w^i \in C.$$

If , then .
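The behaviour along a ray can be explored numerically by grouping the column indices according to the value of $\tilde w^i \cdot x$ and looking for the largest value whose partial sum of $c_i w^i$ does not vanish. The following is a minimal sketch, assuming NumPy; the matrices, the rounding used to compare floating-point values, and the function names are illustrative assumptions.

```python
import numpy as np

def ray_index_sets(W_tilde, x, decimals=9):
    # Group column indices i by lambda = w~^i . x (rounded for comparison).
    lam = np.round(W_tilde.T @ x, decimals)
    return {float(v): np.flatnonzero(lam == v) for v in np.unique(lam)}

def dominant_term(W, W_tilde, c, x, tol=1e-12):
    # Largest lambda whose sum of c_i w^i over the index set is nonzero.
    groups = ray_index_sets(W_tilde, x)
    for lam in sorted(groups, reverse=True):
        I = groups[lam]
        s = W[:, I] @ c[I]
        if np.linalg.norm(s) > tol:
            return lam, s
    return None, np.zeros(W.shape[0])

W = np.array([[1.0, 0.0, -1.0], [0.0, 1.0, 0.0]])
W_tilde = np.array([[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]])
c = np.array([1.0, 2.0, 0.5])
F = lambda z: W @ (c * np.exp(W_tilde.T @ z))

x = np.array([1.0, 0.0])
lam, s = dominant_term(W, W_tilde, c, x)
t = 30.0
# The rescaled values F_c(xt) e^{-lambda t} approach the partial sum s.
assert np.allclose(F(x * t) * np.exp(-lam * t), s)
```

For the opposite direction $x = (-1, 0)$ in this example, the dominant value is $\lambda = 0$, and the function values converge along the ray to a point with vanishing second coordinate, which lies on the boundary of the cone generated by the columns of the chosen $W$ (the closed upper half-plane).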

###### Lemma 11.

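The map $F_c$ is proper, if

$$F_c(xt) \to y \ \text{ as } \ t \to \infty \quad \text{implies} \quad y \in \partial C \tag{$\ast$}$$

for all nonzero $x \in \mathbb{R}^d$.

###### Proof.

We assume that the ray condition (∗) holds for all nonzero $x \in \mathbb{R}^d$.

Let $x \in \mathbb{R}^d$ with $|x| = 1$. In order to apply Lemma 10, we consider sequences $x_n$ in $\mathbb{R}^d$ with $|x_n| = 1$ and $x_n \to x$ and $t_n$ in $\mathbb{R}_>$ with $t_n \to \infty$.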

To begin with, we show that as implies as . Suppose , that is, there is such that as . For close to , we have the partition

$$I_{x,\lambda} = I_{x',\mu_1} \cup \cdots \cup I_{x',\mu_p}$$

with close to and hence . Most importantly, there exists a largest such that . Otherwise,

$$\sum_{i \in I_{x,\lambda}} c_i \, w^i = \sum_{i \in I_{x',\mu_1}} c_i \, w^i + \ldots + \sum_{i \in I_{x',\mu_p}} c_i \, w^i = 0.$$

Additionally, there may exist an even larger with . In any case, there is such that

$$F_c(x't) \, e^{-\lambda' t} \to \sum_{i \in I_{x',\lambda'}} c_i \, w^i \neq 0$$

as and hence with independent of ; that is, as . Hence as ; that is, , as claimed.

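In case $C = \mathbb{R}^d$ ($\partial C = \emptyset$), the ray condition (∗) implies $|F_c(xt)| \to \infty$ as $t \to \infty$ and hence $|F_c(x_n t_n)| \to \infty$ as $n \to \infty$. By Lemma 10, $F_c$ is proper.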

In case , assume as . Then, as , by the argument above. In particular, for all and . The vectors with and lie in the lineality space of , and hence

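$$\operatorname{cone}(w^i \mid i \in I_{x,\lambda} \text{ with } \lambda > 0) \subseteq \partial C.$$

By the ray condition (∗), , and hence

$$\operatorname{cone}(w^i \mid i \in I_{x,0}) \subseteq \partial C.$$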

Finally, we write

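$$F_c(x_n t_n) = \sum_{i=1}^{n} c_i \, e^{\tilde w^i \cdot x_n t_n} \, w^i = \sum_{\lambda} \sum_{i \in I_{x,\lambda}} c_i \, e^{\tilde w^i \cdot x_n t_n} \, w^i.$$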

For close to , we have close to for , in particular, for . The limit as implies

$$\sum_{\lambda \ge 0} \sum_{i \in I_{x,\lambda}} c_i \, e^{\tilde w^i \cdot x_n t_n} \, w^i \to y',$$

and hence . By Lemma 10, is proper. ∎

Let as along the ray given by and as for a sequence (with and ), approaching the ray. In the proof of Lemma 11, we have shown that, if , then , where is the lineality space of . In general, if , then . Note that there are only finitely many index sets and hence finitely many limit points (for fixed ), whereas every arises as a limit point (if is surjective).

Using Theorem 9 (Hadamard’s global inversion theorem) together with Theorems 2 or 3 and Lemma 11, we summarize our findings.

###### Corollary 12.

The map $F_c$ is bijective for all $c > 0$ if and only if $F_c$ is injective for all $c > 0$ and the ray condition (∗) holds for all nonzero $x \in \mathbb{R}^d$ and all $c > 0$.

By Theorems 2 or 3, the injectivity of $F_c$ (for all $c > 0$) can be characterized in terms of sign vectors of the subspaces $S$ and $\tilde S$. By Lemma 16 below, the ray condition (∗) (for all nonzero $x$ and all $c > 0$) can be characterized in terms of sign vectors of $S$ and $\tilde S$ together with a nondegeneracy condition depending on sign vectors of $S$ and on the subspace $\tilde S$ itself.

Hence, as our main result, we characterize the bijectivity of $F_c$ (for all $c > 0$) in terms of the subspaces $S$ and $\tilde S$.

###### Theorem 13.

The map is a diffeomorphism for all if and only if

1. ,

2. for every nonzero , there is a nonzero such that , and

3. the pair is nondegenerate.

To complete the statement, we have to define nondegeneracy.

###### Definition 14.

Let be subspaces of . A vector with a positive component is called nondegenerate if

• there is (a nonzero) with for some such that or

• for , there is a nonzero such that .

The pair is called nondegenerate if every with a positive component is nondegenerate.

First, we note that Theorem 13 immediately implies Theorems 1 and 6 (Birch’s Theorem and its first extension).

###### Corollary 15.

The map is a diffeomorphism for all if .

###### Proof.

Note that , cf. [26, Prop. 6.8]. Hence, implies conditions (i) and (ii) in Theorem 13. Now, for with a positive component , consider with and . Obviously, , that is, , and is nondegenerate, as required by condition (iii). ∎

Second, we note that condition (i) in Theorem 13 can also be characterized in terms of maximal minors of the matrices and , cf. Corollary 4. Moreover, condition (ii) can be reformulated using faces of the cones and :

1. for every proper face of with , there is a proper face of with such that .

Indeed, a face of with corresponds to a supporting hyperplane with normal vector such that for and otherwise (for lying on the positive side of the hyperplane). Hence is characterized by the nonnegative sign vector with . Analogously, a face of with is characterized by a nonnegative sign vector with . Clearly, is equivalent to . (For more details on sign vectors and face lattices, see Appendix B.)

Third, before we prove Lemma 16 below, we discuss how the ray condition (∗) implies conditions (ii) and (iii).

Let $x \in \mathbb{R}^d$ be nonzero, and assume that the ray condition (∗) holds for all $c > 0$. Then, for all $c > 0$, either there is $\lambda > 0$ such that

$$F_c(xt) \, e^{-\lambda t} \to \sum_{i \in I_{x,\lambda}} c_i \, w^i \neq 0$$

as $t \to \infty$ or

$$F_c(xt) \to \sum_{i \in I_{x,0}} c_i \, w^i \in \partial C.$$

Note that the sets $I_{x,\lambda}$ are disjoint, and the sums involve different coefficients $c_i$ for different $\lambda$. Hence,

• there is such that for all or

• for all .

To see this, assume (a), that is, there exists such that for all . Then, for all , that is, (b).

Now, let .

If , then defines a proper face of with index set . Indeed, for and otherwise. Condition (b) implies that (the interior of) the lies in a proper face of with index set . That is, , as required by condition (ii).

If , let , having a positive component. Condition (a) implies that there is with such that for all . Equivalently, there is with such that . That is, is nondegenerate, as required by condition (iii). Moreover, let and hence . Condition (b) implies that there is a proper face of , characterized by a nonzero sign vector , such that . Again, is nondegenerate, as required by condition (iii).

###### Lemma 16.

The ray condition (∗) holds for all nonzero $x \in \mathbb{R}^d$ and for all $c > 0$ if and only if conditions (ii) and (iii) in Theorem 13 hold.

###### Proof.

To show necessity and sufficiency of (ii) and (iii), we vary over all nonzero .

Let be nonzero and .

1. If , then defines a proper face of and as . Necessity and sufficiency of (ii): The ray condition (∗) for all is equivalent to for all . That is, there is a proper face of characterized by a nonzero such that . Equivalently, , that is, (ii) for .

By varying over all nonzero (with ), all nonzero are covered.

2. If , then has a positive component. Necessity and sufficiency of (iii): The ray condition (∗) for all is equivalent to

• either there is such that as

• or ,

for all . That is,

• there is such that for all or

• for all ,

thereby using that the sets are disjoint and the sums involve different coefficients  for different . Equivalently,

• there is with such that for all with , that is, there is with such that , or

• for and hence , there is a proper face of , characterized by a nonzero such that ,

that is, (iii) for .

By varying over all nonzero (with ), all with a positive component are covered.

### 3.1 Special cases: C=Rd or C is pointed

We discuss the conditions for bijectivity in Theorem 13 for two extreme cases, regarding the geometry of the cones and .

If (that is, ), then condition (ii) is equivalent to . Hence, if and is bijective for all , then . However, the converse does not hold.

###### Example 17.

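Let $F_c$ be given by the matrices

$$\tilde W = \begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & -1 \end{pmatrix} \quad \text{and} \quad W = \begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}.$$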

Then and is bijective for all . However, .

If (that is, is pointed and no column of is zero), then condition (iii) holds (since ), and conditions (i) and (ii) imply (by Proposition 19 below). Hence, if and is bijective for all , then . However, the converse does not hold.

###### Example 18.

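Let $F_c$ be given by the matrices

$$\tilde W = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix} \quad \text{and} \quad W = \begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}.$$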

Then and is bijective for all . However,