On the bijectivity of families of exponential/generalized polynomial maps
We start from a parametrized system of generalized polynomial equations (with real exponents) for positive variables, involving generalized monomials with positive parameters. Existence and uniqueness of a solution for all parameters (and for all right-hand sides) is equivalent to the bijectivity of a family of generalized polynomial/exponential maps.
We characterize the bijectivity of the family of exponential maps in terms of two linear subspaces arising from the coefficient and exponent matrices, respectively. In particular, we obtain conditions in terms of sign vectors of the two subspaces and a nondegeneracy condition involving the exponent subspace itself. Thereby, all criteria can be checked effectively.
Moreover, we characterize when the existence of a unique solution is robust with respect to small perturbations of the exponents and/or the coefficients. In particular, we obtain conditions in terms of sign vectors of the linear subspaces or, alternatively, in terms of maximal minors of the coefficient and exponent matrices.
Keywords: Birch’s theorem, global invertibility, Hadamard’s theorem, Descartes’ rule, sign vectors, oriented matroids, perturbations, robustness
AMS subject classification: 12D10 26C10 52B99 52C40
Stefan Müller Josef Hofbauer
Faculty of Mathematics, University of Vienna, Oskar-Morgenstern-Platz 1, 1090 Wien, Austria
Institute for Algebra, Johannes Kepler University Linz, Altenberger Straße 69, 4040 Linz, Austria
Corresponding author: firstname.lastname@example.org
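Given two matrices $W = (w^1, \ldots, w^n)$, $\tilde W = (\tilde w^1, \ldots, \tilde w^n) \in \mathbb{R}^{d \times n}$ with $d \le n$ and full rank, consider the parametrized system of generalized polynomial equations
$$\sum_{j=1}^n w_{ij} \, c_j \, x_1^{\tilde w_{1j}} \cdots x_d^{\tilde w_{dj}} = y_i , \quad i = 1, \ldots, d ,$$
for positive variables $x_i$ (and right-hand sides $y_i$), involving the ‘monomials’ $c_j \, x_1^{\tilde w_{1j}} \cdots x_d^{\tilde w_{dj}}$, $j = 1, \ldots, n$, in particular, the positive parameters $c_j$. In other words, $x \in \mathbb{R}^d_>$, $y \in \mathbb{R}^d$, and $c \in \mathbb{R}^n_>$. As in the theory of fewnomials [15, 23], the monomials are given; however, a positive parameter is associated with every monomial.
Writing the vector of monomials as $c \circ x^{\tilde W} \in \mathbb{R}^n_>$, thereby introducing $x^{\tilde W} \in \mathbb{R}^n_>$ as $(x^{\tilde W})_j = x_1^{\tilde w_{1j}} \cdots x_d^{\tilde w_{dj}}$ and denoting componentwise multiplication by $\circ$, yields the compact form
$$W \left( c \circ x^{\tilde W} \right) = y .$$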
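Note that, for the existence of a positive solution $x \in \mathbb{R}^d_>$, the right-hand side $y$ must lie in the interior of $C = \operatorname{cone}(W)$, the polyhedral cone generated by the columns of $W$. The question arises whether the above equation system has a unique positive solution $x$, for all right-hand sides $y \in C^\circ$ and all positive parameters $c$. This question is equivalent to whether the generalized polynomial map $f_c \colon \mathbb{R}^d_> \to C^\circ$,
$$f_c(x) = \sum_{j=1}^n c_j \, x_1^{\tilde w_{1j}} \cdots x_d^{\tilde w_{dj}} \, w^j ,$$
or, equivalently, the exponential map $F_c \colon \mathbb{R}^d \to C^\circ$,
$$F_c(x) = \sum_{j=1}^n c_j \, e^{\tilde w^j \cdot x} \, w^j ,$$
is bijective for all $c > 0$.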
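As a small numerical illustration, the following Python sketch evaluates both maps. It assumes that the compact form of the system is $W (c \circ x^{\tilde W}) = y$, so that the generalized polynomial map sends a positive $x$ to $\sum_j c_j x^{\tilde w^j} w^j$ and the exponential map sends $\xi$ to $\sum_j c_j e^{\tilde w^j \cdot \xi} w^j$. The function names and the example matrices are hypothetical; the only point is that the two maps agree under the substitution $x = e^\xi$ (componentwise).

```python
import math

def monomials(Wt, x):
    """Vector x^{Wt}: the j-th entry is prod_i x_i ** Wt[i][j] (x positive)."""
    d, n = len(Wt), len(Wt[0])
    return [math.prod(x[i] ** Wt[i][j] for i in range(d)) for j in range(n)]

def gen_poly_map(W, Wt, c, x):
    """Assumed form f_c(x) = W (c o x^{Wt}) for positive x."""
    m = monomials(Wt, x)
    return [sum(W[i][j] * c[j] * m[j] for j in range(len(c)))
            for i in range(len(W))]

def exp_map(W, Wt, c, xi):
    """Assumed form F_c(xi) = sum_j c_j exp(wt^j . xi) w^j."""
    d, n = len(Wt), len(Wt[0])
    e = [math.exp(sum(Wt[i][j] * xi[i] for i in range(d))) for j in range(n)]
    return [sum(W[i][j] * c[j] * e[j] for j in range(n))
            for i in range(len(W))]

# Hypothetical data: d = 2 equations, n = 3 monomials, real exponents.
W = [[1, 0, 1], [0, 1, 1]]        # coefficient matrix
Wt = [[1, 0, 0.5], [0, 1, 2]]     # exponent matrix
c = [1.0, 2.0, 0.5]               # positive parameters
xi = [0.3, -0.7]
x = [math.exp(v) for v in xi]     # substitution x = e^xi

lhs = gen_poly_map(W, Wt, c, x)
rhs = exp_map(W, Wt, c, xi)
```

Here `lhs` and `rhs` coincide up to rounding, which is the sense in which bijectivity of one map is equivalent to bijectivity of the other.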
In the context of generalized chemical reaction networks [18, 19], the question is equivalent to whether every set of complex-balanced equilibria (an ‘exponential manifold’) intersects every stoichiometric compatibility class (an affine subspace) in exactly one point. For a motivation from Chemical Reaction Network Theory, see Appendix D. In the context of classical chemical reaction networks, the assumption of mass-action kinetics implies $\tilde W = W$, and in this case there is indeed exactly one complex-balanced equilibrium in every stoichiometric compatibility class.
In case $\tilde W = W$, the map $F_c$ also appears in toric geometry, where it is related to moment maps, and in statistics, where it is related to log-linear models. The following result guarantees the bijectivity of $F_c$ for all $c > 0$. It is a variant of Birch’s Theorem [25, 20, 6].
Theorem 1 (, Section 4.2).
Let $\tilde W = W$. Then the map $F_c$ is a real analytic isomorphism of $\mathbb{R}^d$ onto $C^\circ$ for all $c > 0$.
In this work, we characterize the bijectivity of the map $F_c$ for all $c > 0$ (for given coefficients $W$ and exponents $\tilde W$) in terms of (sign vectors of) the linear subspaces $S = \ker W$ and $\tilde S = \ker \tilde W$. Thereby we extend previous results, in particular, sufficient conditions for bijectivity [18, 17, 9]. Moreover, we characterize the robustness of bijectivity with respect to small perturbations of the exponents $\tilde W$ and/or the coefficients $W$, corresponding to small perturbations of the subspaces $\tilde S$ and $S$ (in the Grassmannian).
Our main technical tool is Hadamard’s global inversion theorem, which essentially states that a $C^1$-map is a diffeomorphism if and only if it is locally invertible and proper. By previous results [8, 18], the map $F_c$ is locally invertible for all $c > 0$ if and only if it is injective for all $c > 0$, which can be characterized in terms of sign vectors of the subspaces $S$ and $\tilde S$. Most importantly, we show that $F_c$ is proper if and only if it is ‘proper along rays’ and that properness for all $c > 0$ can be characterized in terms of sign vectors of $S$ and $\tilde S$, together with a nondegeneracy condition depending on the subspace $\tilde S$ itself.
The crucial role of sign vectors in the characterization of existence and uniqueness of positive solutions to parametrized polynomial equations suggests a comparison with Descartes’ rule of signs for univariate polynomials. Consider a univariate polynomial and order the monomials by their exponents. Now, let $s$ be the number of sign changes in the sequence of (nonzero) coefficients, and let $p$ be the number of positive roots (where multiple roots are counted separately). Then, Descartes’ rule states that $p \le s$ and $s - p$ is even. As shown by Laguerre [16, 14], the same statement holds for generalized monomials (with real exponents). More recently, it has been shown that the upper bound is sharp: for a given sign sequence, there exist coefficients such that $p = s$. Hence a sharp Descartes’ rule states that a univariate polynomial has exactly one positive root for all coefficients with given signs if and only if there is exactly one sign change. Indeed, this statement follows from our main result (for univariate polynomials). Hence our main result can be seen as a multivariate generalization of the sharp Descartes’ rule for exactly one positive solution.
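The sign-change count in Descartes’ rule is straightforward to compute. The following Python sketch is an illustration only: the function names are hypothetical, terms are given as (exponent, coefficient) pairs with distinct (possibly real) exponents, and the returned number is the upper bound on the number of positive roots, not the number of roots itself.

```python
def sign_changes(coeffs):
    """Number of sign changes in the sequence of nonzero coefficients."""
    signs = [1 if a > 0 else -1 for a in coeffs if a != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

def descartes_bound(terms):
    """terms: iterable of (exponent, coefficient) with distinct exponents.

    Orders the monomials by exponent and counts the sign changes.
    """
    ordered = [a for _, a in sorted(terms)]
    return sign_changes(ordered)

# x^2 - 3x + 2 = (x - 1)(x - 2): two sign changes, two positive roots.
two = descartes_bound([(2, 1), (1, -3), (0, 2)])

# x^2.5 - 1 (a generalized polynomial): one sign change,
# hence exactly one positive root (namely x = 1).
one = descartes_bound([(2.5, 1), (0, -1)])
```

In the second example the count is one, which is exactly the situation covered by the sharp rule: one sign change forces exactly one positive root for every choice of coefficients with these signs.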
Organization of the work
In Section 2, we introduce the family of exponential maps $F_c$ with $c \in \mathbb{R}^n_>$ and discuss previous results on injectivity.
In Section 3, we present our main result, Theorem 13, characterizing the bijectivity of the family $F_c$, and the crucial Lemmas 11 and 16, regarding the properness of $F_c$. In Subsection 3.1, we discuss two extreme cases regarding the geometry of $C$, the polyhedral cone generated by the columns of $W$: either $C = \mathbb{R}^d$ or $C$ is pointed. In the latter case, we present necessary conditions for the surjectivity of $F_c$. In Subsection 3.2, we show that the bijectivity of the family $F_c$ cannot be characterized in terms of sign vectors only, cf. Example 21. Still, there are sufficient conditions for bijectivity in terms of sign vectors or in terms of faces of the Newton polytope.
In Section 4, we study the robustness of bijectivity. In Subsection 4.1, we consider perturbations of the exponents $\tilde W$ and show that robustness of bijectivity is equivalent to robustness of injectivity, which can be characterized in terms of sign vectors, cf. Theorem 32. The criterion involves the closure of a set of sign vectors and represents another sufficient condition for bijectivity. In Subsection 4.2, we consider perturbations of the coefficients $W$ and characterize robustness of bijectivity again in terms of sign vectors (including another closure condition), cf. Theorem 38. In particular, robustness of bijectivity implies that either $C = \mathbb{R}^d$ or $C$ is pointed. In the latter case, the faces of $C$ are minimally generated. Finally, in Subsection 4.3, we express the closure condition in terms of maximal minors of $W$ and $\tilde W$. Further, we consider general perturbations (of both exponents and coefficients) and characterize robustness of bijectivity in terms of sign vectors and maximal minors, cf. Theorem 42.
Finally, we provide appendices on (A) a motivation from Chemical Reaction Network Theory, (B) oriented matroids, and (C) a theorem of the alternative.
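We denote the positive real numbers by $\mathbb{R}_>$ and the nonnegative real numbers by $\mathbb{R}_\ge$. We write $x > 0$ for $x \in \mathbb{R}^n_>$ and $x \ge 0$ for $x \in \mathbb{R}^n_\ge$. For vectors $x, y \in \mathbb{R}^n$, we denote their scalar product by $x \cdot y \in \mathbb{R}$ and their componentwise (Hadamard) product by $x \circ y \in \mathbb{R}^n$.
For a vector $x \in \mathbb{R}^n$, we obtain the sign vector $\operatorname{sign}(x) \in \{-, 0, +\}^n$ by applying the sign function componentwise, and we write
$$\operatorname{sign}(S) = \{ \operatorname{sign}(x) \mid x \in S \}$$
for a subset $S \subseteq \mathbb{R}^n$.
For a vector $x$ with $x \in \mathbb{R}^n$ or $x \in \{-, 0, +\}^n$, we denote its support by $\operatorname{supp}(x) = \{ i \mid x_i \neq 0 \}$. For a subset $S$ of $\mathbb{R}^n$ or $\{-, 0, +\}^n$, we say that a nonzero vector $x \in S$ has (inclusion-)minimal support, if $\operatorname{supp}(x') \subseteq \operatorname{supp}(x)$ implies $\operatorname{supp}(x') = \operatorname{supp}(x)$ for all nonzero $x' \in S$.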
For a sign vector , we introduce
In particular, . For a subset , we write
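The inequalities $0 < -$ and $0 < +$ induce a partial order on $\{-, 0, +\}^n$: for sign vectors $\sigma, \tau \in \{-, 0, +\}^n$, we write $\sigma \le \tau$ if the inequality holds componentwise. The product on $\{-, 0, +\}$ is defined in the obvious way. For $\sigma, \tau \in \{-, 0, +\}^n$, we write $\sigma \perp \tau$ ($\sigma$ and $\tau$ are orthogonal) if either $\sigma_i \tau_i = 0$ for all $i$ or there exist $i, j$ with $\sigma_i \tau_i = -$ and $\sigma_j \tau_j = +$. For a set $T \subseteq \{-, 0, +\}^n$, we introduce the orthogonal complement
$$T^\perp = \{ \sigma \in \{-, 0, +\}^n \mid \sigma \perp \tau \text{ for all } \tau \in T \} .$$
Moreover, for $\sigma, \tau \in \{-, 0, +\}^n$, we define the composition $\sigma \circ \tau \in \{-, 0, +\}^n$ as $(\sigma \circ \tau)_i = \sigma_i$ if $\sigma_i \neq 0$ and $(\sigma \circ \tau)_i = \tau_i$ otherwise.
For a matrix $W \in \mathbb{R}^{d \times n}$, we denote its column vectors by $w^1, \ldots, w^n \in \mathbb{R}^d$. For any natural number $n$, we define $[n] = \{1, \ldots, n\}$. For $W \in \mathbb{R}^{d \times n}$ with $d \le n$ and $I \subseteq [n]$ of cardinality $d$, we denote the square submatrix of $W$ with column indices in $I$ by $W_I$.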
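The sign-vector operations introduced above are easy to experiment with. The following Python sketch is only an illustration: it encodes the signs $-$, $0$, $+$ as the integers -1, 0, 1, and it assumes the usual oriented-matroid reading of the definitions (a partial order with $0$ below both $-$ and $+$, orthogonality if all componentwise products vanish or both a negative and a positive product occur, and a composition that keeps the first nonzero entry). All function names are hypothetical.

```python
def sign(x):
    """Componentwise sign of a real vector, encoded as -1, 0, 1."""
    return tuple((v > 0) - (v < 0) for v in x)

def supp(v):
    """Support: indices of the nonzero entries (0-based)."""
    return {i for i, s in enumerate(v) if s != 0}

def leq(sigma, tau):
    """sigma <= tau in the order induced by 0 < - and 0 < +."""
    return all(s == 0 or s == t for s, t in zip(sigma, tau))

def orthogonal(sigma, tau):
    """All products zero, or both a negative and a positive product occur."""
    prods = [s * t for s, t in zip(sigma, tau)]
    return all(p == 0 for p in prods) or (min(prods) < 0 < max(prods))

def compose(sigma, tau):
    """Composition: take sigma_i where it is nonzero, tau_i otherwise."""
    return tuple(s if s != 0 else t for s, t in zip(sigma, tau))

# Orthogonal real vectors have orthogonal sign vectors:
x, y = (1.0, 1.0, 0.0), (2.0, -2.0, 5.0)   # x . y = 0
ok = orthogonal(sign(x), sign(y))
```

The converse fails in general: orthogonality of sign vectors only says that the signs do not rule out a vanishing scalar product.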
2 Families of exponential maps
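Let $W = (w^1, \ldots, w^n)$, $\tilde W = (\tilde w^1, \ldots, \tilde w^n) \in \mathbb{R}^{d \times n}$ be matrices with $d \le n$ and full rank. Further, let
$$C = \operatorname{cone}(W) = \Big\{ \sum_{j=1}^n \alpha_j \, w^j \;\Big|\; \alpha \in \mathbb{R}^n_\ge \Big\} \subseteq \mathbb{R}^d$$
be the cone generated by the columns of $W$. Since $W$ has full rank, the cone has nonempty interior $C^\circ$. Finally, let $c \in \mathbb{R}^n_>$. We define the exponential map
$$F_c \colon \mathbb{R}^d \to C^\circ \subseteq \mathbb{R}^d , \quad x \mapsto \sum_{j=1}^n c_j \, e^{\tilde w^j \cdot x} \, w^j \tag{1}$$
and the related subspaces
$$S = \ker W \quad \text{and} \quad \tilde S = \ker \tilde W .$$
Note that injectivity and surjectivity of $F_c$ only depend on $S$ and $\tilde S$. In fact, let $W'$, $\tilde W' \in \mathbb{R}^{d \times n}$ be such that $\ker W' = S$, $\ker \tilde W' = \tilde S$, and let
$$F'_c \colon \mathbb{R}^d \to (C')^\circ \subseteq \mathbb{R}^d , \quad x \mapsto \sum_{j=1}^n c_j \, e^{\tilde w'^j \cdot x} \, w'^j$$
be the corresponding exponential map. Then $W' = U W$, $\tilde W' = \tilde U \tilde W$ for invertible matrices $U$, $\tilde U \in \mathbb{R}^{d \times d}$, and
$$F'_c(x) = U \, F_c(\tilde U^{\mathsf T} x) .$$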
2.1 Previous results on injectivity
Theorem 2 (Theorem 7 and Corollary 8 in ).
Let be as in (1) with . Then the following statements are equivalent:
is injective for all .
for all and all .
for all subsets of cardinality (or ‘’ for all ) and for some .
In , we gave an alternative proof of this result and extended it to the case , by using the sign vectors of the subspaces and .
Theorem 3 (Theorem 3.6 in ).
Let be subspaces of of dimension (with ). For every (with full rank ) such that and , the following statements are equivalent.
for all subsets of cardinality (or ‘’ for all ) and for some .
In the language of oriented matroids, Corollary 4 relates chirotopes (maximal minors of $W$ and $\tilde W$) to vectors (sign vectors of $S$ and $\tilde S$), see also Appendix E. Thereby, the sign vector condition is symmetric with respect to $S$ and $\tilde S$.
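A condition on maximal minors can be checked by brute force for small matrices. The following Python sketch assumes that the minor condition reads as follows: the products $\det(W_I) \det(\tilde W_I)$, taken over all index sets $I$ of cardinality $d$, are all nonnegative or all nonpositive, and at least one of them is nonzero. This is one reading of the statement, not a quotation of it; the function names are hypothetical, index sets are 0-based, and exact rational arithmetic is used to avoid rounding issues in the sign tests.

```python
from fractions import Fraction
from itertools import combinations

def det(M):
    """Exact determinant by Gaussian elimination over the rationals."""
    A = [[Fraction(v) for v in row] for row in M]
    n, result = len(A), Fraction(1)
    for k in range(n):
        pivot = next((r for r in range(k, n) if A[r][k] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != k:
            A[k], A[pivot] = A[pivot], A[k]
            result = -result          # a row swap flips the sign
        result *= A[k][k]
        for r in range(k + 1, n):
            f = A[r][k] / A[k][k]
            for col in range(k, n):
                A[r][col] -= f * A[k][col]
    return result

def maximal_minors(W):
    """Map each d-subset I of column indices to det(W_I)."""
    d, n = len(W), len(W[0])
    return {I: det([[W[i][j] for j in I] for i in range(d)])
            for I in combinations(range(n), d)}

def minors_sign_condition(W, Wt):
    """Assumed condition: products of minors share one sign, not all zero."""
    mW, mWt = maximal_minors(W), maximal_minors(Wt)
    prods = [mW[I] * mWt[I] for I in mW]
    same_sign = all(p >= 0 for p in prods) or all(p <= 0 for p in prods)
    return same_sign and any(p != 0 for p in prods)

W = [[1, 0, 1], [0, 1, 1]]
birch_case = minors_sign_condition(W, W)                        # exponents equal coefficients
mixed_case = minors_sign_condition(W, [[1, 0, -1], [0, 1, 1]])  # one product is negative
```

In the case of equal matrices all products are squares, so the assumed condition holds whenever the matrix has full rank; in the second example the products are 1, 1, and -1, so it fails.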
Corollary 5 (Corollary 3.8 in ).
Let be subspaces of of equal dimension. Then
Theorem 6 (Proposition 3.9 in ).
If and , then the map is a real analytic isomorphism for all .
As it will turn out, is sufficient for bijectivity, and the technical condition in  is not needed, cf. Corollary 15. We note that Theorems 2, 3, and 6 allowed a first multivariate generalization of Descartes’ rule of signs for at most/exactly one positive solution, see .
In order to characterize the bijectivity of the map $F_c$ for all $c > 0$, we start with the following observation.
The following statements are equivalent.
(i) $F_c$ is bijective for all $c > 0$.
(ii) $F_c$ is a diffeomorphism for all $c > 0$.
(iii) $F_c$ is a real analytic isomorphism for all $c > 0$.
Most importantly, we will use Hadamard’s global inversion theorem .
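A $C^1$-map $F \colon \mathbb{R}^d \to \mathbb{R}^d$ is a diffeomorphism if and only if the Jacobian $\det \frac{\partial F}{\partial x} \neq 0$ for all $x \in \mathbb{R}^d$ and $|F(x)| \to \infty$ whenever $|x| \to \infty$.
Let $U \subseteq \mathbb{R}^d$ be open and convex. A $C^1$-map $F \colon \mathbb{R}^d \to U$ is a diffeomorphism if and only if the Jacobian $\det \frac{\partial F}{\partial x} \neq 0$ for all $x \in \mathbb{R}^d$ and $F$ is proper.
Recall that a map $F \colon X \to Y$ is proper, if $F^{-1}(K)$ is compact for each compact subset $K$ of $Y$. This is obviously necessary for the inverse to be continuous.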
Let be open. A continuous map is proper if and only if, for sequences in with and and in with , implies .
Suppose is proper and , but . Take a closed ball around . Then contains the unbounded sequence and hence is not compact, a contradiction.
Conversely, let be a compact subset of . We need to show that every sequence in has an accumulation point. Since is closed, we only need to show that has a bounded subsequence. Suppose not, then . Since , there is a subsequence (call it again) such that . Now there is another subsequence (call it again) such that , that is, the sequence on the unit sphere converges. With , we have , a contradiction. ∎
In particular, if $F$ is proper, then, for all nonzero $x \in \mathbb{R}^d$, $F(tx) \to y$ as $t \to \infty$ implies $y \in \partial U$. That is, if the function values converge along a ray, then the limit lies on the boundary of the range.
By Lemma 11 below, the map under consideration is proper, if it is ‘proper along rays’. Before we prove this result, we discuss the behaviour of along a ray. For and , we introduce
Now, for nonzero , either as or . In the first case, there is such that
as . In the second case, for all and
If , then .
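To make the behaviour along a ray concrete, the following Python sketch assumes that the exponential map has the form $F_c(x) = \sum_j c_j e^{\tilde w^j \cdot x} w^j$ and groups the indices $j$ by the rate $\lambda_j = \tilde w^j \cdot x$. Then $F_c(tx) = \sum_\lambda e^{\lambda t} v_\lambda$ with $v_\lambda = \sum_{j \colon \lambda_j = \lambda} c_j w^j$, so the largest rate with $v_\lambda \neq 0$ decides whether the values diverge (positive rate) or converge (no positive rate survives). The helper names and the example data are hypothetical, and rates are compared exactly, so integer or rational input is assumed.

```python
from collections import defaultdict

def ray_profile(W, Wt, c, x):
    """Group indices j by the rate wt^j . x and sum c_j w^j per rate."""
    d, n = len(W), len(c)
    groups = defaultdict(lambda: [0.0] * d)
    for j in range(n):
        lam = sum(Wt[i][j] * x[i] for i in range(len(Wt)))
        for i in range(d):
            groups[lam][i] += c[j] * W[i][j]
    return dict(groups)

def dominant_rate(profile):
    """Largest rate whose coefficient vector is nonzero (None if none)."""
    rates = [lam for lam, v in profile.items() if any(a != 0 for a in v)]
    return max(rates) if rates else None

W = [[1, 0, 1], [0, 1, 1]]   # here exponents equal coefficients
c = [1, 1, 1]

diverging = dominant_rate(ray_profile(W, W, c, [1, -1]))   # rates 1, -1, 0
profile = ray_profile(W, W, c, [0, -1])                    # rates 0, -1, -1
converging = dominant_rate(profile)
limit = profile[0]            # limit of F_c(t x) as t grows
```

In the second ray the dominant rate is zero and the limit is the first column of the coefficient matrix, which lies on the boundary of the cone generated by the columns, in line with the remark that limits along rays must lie on the boundary of the range.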
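The map $F_c$ is proper, if
$$F_c(tx) \to y \text{ as } t \to \infty \quad \text{implies} \quad y \in \partial C \tag{$*$}$$
for all nonzero $x \in \mathbb{R}^d$.
We assume that the ray condition ($*$) holds for all nonzero $x \in \mathbb{R}^d$.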
Let with . In order to apply Lemma 10, we consider sequences in with and and in with .
To begin with, we show that as implies as . Suppose , that is, there is such that as . For close to , we have the partition
with close to and hence . Most importantly, there exists a largest such that . Otherwise,
Additionally, there may exist an even larger with . In any case, there is such that
as and hence with independent of ; that is, as . Hence as ; that is, , as claimed.
In case , assume as . Then, as , by the argument above. In particular, for all and . The vectors with and lie in the lineality space of , and hence
By the ray condition ($*$) of Lemma 11, , and hence
Finally, we write
For close to , we have close to for , in particular, for . The limit as implies
and hence . By Lemma 10, is proper. ∎
Let as along the ray given by and as for a sequence (with and ), approaching the ray. In the proof of Lemma 11, we have shown that, if , then , where is the lineality space of . In general, if , then . Note that there are only finitely many index sets and hence finitely many limit points (for fixed ), whereas every arises as a limit point (if is surjective).
The map $F_c$ is bijective for all $c > 0$ if and only if $F_c$ is injective for all $c > 0$ and the ray condition ($*$) of Lemma 11 holds for all nonzero $x \in \mathbb{R}^d$ and all $c > 0$.
By Theorems 2 or 3, the injectivity of $F_c$ (for all $c > 0$) can be characterized in terms of sign vectors of the subspaces $S$ and $\tilde S$. By Lemma 16 below, the ray condition ($*$) of Lemma 11 (for all nonzero $x$ and all $c > 0$) can be characterized in terms of sign vectors of $S$ and $\tilde S$ together with a nondegeneracy condition depending on sign vectors of $S$ and on the subspace $\tilde S$ itself.
Hence, as our main result, we characterize the bijectivity of $F_c$ (for all $c > 0$) in terms of the subspaces $S$ and $\tilde S$.
The map is a diffeomorphism for all if and only if
for every nonzero , there is a nonzero such that , and
the pair is nondegenerate.
To complete the statement, we have to define nondegeneracy.
Let be subspaces of . A vector with a positive component is called nondegenerate if
there is (a nonzero) with for some such that or
for , there is a nonzero such that .
The pair is called nondegenerate if every with a positive component is nondegenerate.
The map is a diffeomorphism for all if .
Second, we note that condition (i) in Theorem 13 can also be characterized in terms of maximal minors of the matrices and , cf. Corollary 4. Moreover, condition (ii) can be reformulated using faces of the cones and :
for every proper face of with , there is a proper face of with such that .
Indeed, a face of with corresponds to a supporting hyperplane with normal vector such that for and otherwise (for lying on the positive side of the hyperplane). Hence is characterized by the nonnegative sign vector with . Analogously, a face of with is characterized by a nonnegative sign vector with . Clearly, is equivalent to . (For more details on sign vectors and face lattices, see Appendix E.)
Let be nonzero, and assume that the ray condition ($*$) of Lemma 11 holds for all . Then, for all , either there is such that
Note that the sets are disjoint, and the sums involve different coefficients for different . Hence,
there is such that for all or
for all .
To see this, assume (a), that is, there exists such that for all . Then, for all , that is, (b).
Now, let .
If , then defines a proper face of with index set . Indeed, for and otherwise. Condition (b) implies that (the interior of) the lies in a proper face of with index set . That is, , as required by condition (ii).
If , let , having a positive component. Condition (a) implies that there is with such that for all . Equivalently, there is with such that . That is, is nondegenerate, as required by condition (iii). Moreover, let and hence . Condition (b) implies that there is a proper face of , characterized by a nonzero sign vector , such that . Again, is nondegenerate, as required by condition (iii).
To show necessity and sufficiency of (ii) and (iii), we vary over all nonzero .
Let be nonzero and .
If , then defines a proper face of and as . Necessity and sufficiency of (ii): The ray condition ($*$) of Lemma 11 for all is equivalent to for all . That is, there is a proper face of characterized by a nonzero such that . Equivalently, , that is, (ii) for .
By varying over all nonzero (with ), all nonzero are covered.
If , then has a positive component. Necessity and sufficiency of (iii): The ray condition ($*$) of Lemma 11 for all is equivalent to
either there is such that as
for all . That is,
there is such that for all or
for all ,
thereby using that the sets are disjoint and the sums involve different coefficients for different . Equivalently,
there is with such that for all with , that is, there is with such that , or
for and hence , there is a proper face of , characterized by a nonzero such that ,
that is, (iii) for .
By varying over all nonzero (with ), all with a positive component are covered.
3.1 Special cases: $C = \mathbb{R}^d$ or $C$ is pointed
We discuss the conditions for bijectivity in Theorem 13 for two extreme cases, regarding the geometry of the cones and .
If (that is, ), then condition (ii) is equivalent to . Hence, if and is bijective for all , then . However, the converse does not hold.
Let be given by the matrices
Then and is bijective for all . However, .
If (that is, is pointed and no column of is zero), then condition (iii) holds (since ), and conditions (i) and (ii) imply (by Proposition 19 below). Hence, if and is bijective for all , then . However, the converse does not hold.
Let be given by the matrices
Then and is bijective for all . However,