Effective Subdivision Algorithm for Isolating Zeros
of Real Systems of Equations, with Complexity Analysis
We describe a new algorithm Miranda for isolating the simple zeros of a function within a box . The function and its partial derivatives must have interval forms, but need not be polynomial. Our subdivision-based algorithm is “effective” in the sense that our algorithmic description also specifies the numerical precision that is sufficient to certify an implementation with any standard BigFloat number type. The main predicate is the Moore-Kioustelides (MK) test, based on Miranda’s Theorem (1940). Although the MK test is well-known, this paper appears to be the first synthesis of this test into a complete root isolation algorithm.
We provide a complexity analysis of our algorithm based on intrinsic geometric parameters of the system. Our algorithm and complexity analysis are developed using 3 levels of description (Abstract, Interval, Effective). This methodology provides a systematic pathway for achieving effective subdivision algorithms in general.
Solving multivariate zero-dimensional systems of equations is a fundamental task with many applications. We focus on the problem of isolating simple real zeros of a real function
within a given bounded box . We do not require to be polynomial, only each and its partial derivatives have interval forms. We require that has only isolated simple zeros in , which is the box sharing the same center of and of with twice that of . We call the region-of-interest (ROI) of the input instance. This formulation of root isolation is called111 Sometimes, an algorithm is called “local” if it works in small enough neighborhoods (like Newton iteration), and “global” if no such restriction is needed. Clearly, this is a different local/global distinction. a local problem in (imbach-pan-yap:ccluster:18, ), in contrast to the global problem of isolating all roots of . The local problem is very important in higher dimensions because the global problem has complexity that is exponential in . In geometric applications we typically can identify ROI’s and can solve the corresponding local problem much faster than the global problem. Moreover, if is not polynomial, the global problem might not be solvable: E.g., , . But it is solvable as a local problem as in (yap-sagraloff-sharma:cluster:13, ).
In their survey of root finding in polynomial systems, Sherbrooke and Patrikalakis (sherbrooke-patrikalakis:93, ) noted 3 main approaches: (1) algebraic techniques, (2) homotopy, (3) subdivision. They objected to the first two approaches on “philosophical grounds”, meaning that it is not easy in these methods to restrict its computation to some ROI . Of course, one could solve the global problem and discard solutions that do not lie in . But its complexity would not be a function of the roots in . Such local complexity behavior are provable in the univariate case (e.g., (becker+4:cluster:16, )), and we will also show similar local complexity in the algorithm of this paper.
Focusing on the subdivision approach, we distinguish two types of subdivision: algebraic and analytic. In algebraic subdivision, is polynomial and one exploits representations of polynomials such as Bernstein form or B-splines (mourrain-pavone:subdiv-polsys:09, ; elber-kim:solver:01, ; sherbrooke-patrikalakis:93, ; garloff-smith:bernstein-systems:01, ; garloff-smith:subdiv-algo:01, ). Analytic subdivision (neumaier:equations:bk-90, ; kearfott:bk, ; hentenryck+2:solving:97, ) supports a broader class of functions; this is formalized in (yap-sagraloff-sharma:cluster:13, ) and includes all the functions obtained from composition of standard elementary functions or hypergeometric functions. Many algebraic algorithms come with complexity analysis, while the analytic algorithms typically lack such analysis, unless one views convergence analysis as a weak form of complexity analysis. This lack is natural because many analytic algorithms are what theoretical computer science call “heuristics” with no output guarantees. Any guarantees would be highly222 The issue of “unconditional algorithms” is a difficult one in analytic settings. Even the algorithm in this paper is conditional: we require the zeros of to be simple within . But one should certainly specify any conditions upfront and try to avoid conditions which are “algorithm-induced” (see (yap:degeneracies, )). conditional (cf. (hentenryck+2:solving:97, )). To our knowledge, the existing subdivision algorithms, both the algebraic ones and the analytic ones, suffer from a gap: they require an input to serve as termination criterion (mourrain-pavone:subdiv-polsys:09, ; elber-kim:solver:01, ; sherbrooke-patrikalakis:93, ; garloff-smith:bernstein-systems:01, ; garloff-smith:subdiv-algo:01, ). Without this additional , the termination of the algorithms becomes unclear.
1.1. Generic Root Isolation Algorithms
It is useful to formulate a “generic algorithm” for local root isolation (cf. (cxy, )). We postulate 5 abstract modules: three box tests (exclusion , existence , Jacobian ) and two box operators (subdivision and contraction). Our tests (or predicates, which we use interchangeably) are best described using a notation: for any set , denotes the number of roots, counted with multiplicity, of in . These tests are abstractly defined by these implications:
Unlike exact predicates, these tests are “one-sided” (cf. (yap-sagraloff-sharma:cluster:13, )) since their failure may have no implications for the negation of the predicate. For root isolation, we need both and to prove uniqueness. These 3 tests can be instantiated in a variety of ways. The exclusion test is instantiated differently depending on the type of subdivision: exploiting the convex hull property of Bernstein coefficients (in algebraic case) or using interval forms of (in analytic case). For , we can use various tests coming from degree theory or fixed point theory (e.g., (alefeld+3:existence:04, )). This paper is focused on a test based on Miranda’s Theorem. The Jacobian test is related to the determinant of the Jacobian matrix but more geometric forms (e.g., cone test (elber-kim:solver:01, )) can be formulated. Next consider the box operators: An -dimensional box may be subdivided into subboxes in ways (). In practice, and some heuristic will choose one of the binary splits (see (garloff-smith:bernstein-systems:01, ) for 3 heuristics). We contract to where is a box returned by a interval Newton-type operator. Let us say the contraction “succeeds” if the width is less than . But success is not guaranteed, and so this operator always needs to be paired with some subdivision operator that never fails. It is well-known that can also provide exclusion and uniqueness tests:
Given the above 5 modules, we are ready to synthesize them into a root isolation algorithm: In broad outline, the algorithm maintains a queue of candidate boxes. Initially, contains only the ROI , the algorithm loops until is empty:
Simple Isolate() Output: sequence of isolating boxes for roots in While If continue; discard and repeat loop If has a unique root output and continue; If if contraction succeeds .push() else .push()
Simple Isolate gives a synthetic framework of the root isolating algorithms. In practice, an algorithm needs not to consist of all the predicates. Some of them will be sufficient. As mentioned above, the existing algorithms involve an input as a criterion for termination. Besides the fact that some papers lay greater emphasis on root approximation than on root isolation, an important reason for this phenomenon is that the predicates and analysis in the existing papers are not able to support the termination of the algorithms without .
For the existing algebraic subdivision algorithms, most of them have no existence or Jacobian test (sherbrooke-patrikalakis:93, ; mourrain-pavone:subdiv-polsys:09, ; garloff-smith:bernstein-systems:01, ; garloff-smith:subdiv-algo:01, ), others lack detailed discussion on the relationship between the success of these tests and the size of the boxes (elber-kim:solver:01, ). For the analytic subdivision algorithms, the interval Newton type operators are the most favorable ones to serve as exclusion and uniqueness test. Extensive investigations have been performed on them (neumaier:equations:bk-90, ; kearfott:bk, ). For instance, (neumaier:equations:bk-90, , Chapter 5) gives detailed sufficient condition for the strong convergence of the operators. But it is still unproven that when a box is sufficiently small, the operators will give a definite result either to exclude the box or to confirm the uniqueness of a root in it. Therefore, an extra is necessary to ensure the termination of the algorithms. But the dependence on naturally results in two issues: the output boxes may not be isolating, i.e., they may contain no root, or more than one roots. In this paper, we present an algorithm that makes up this gap.
1.2. How to derive effective algorithms
In this paper, we describe Miranda, a subdivision algorithm for root isolation, roughly along the above outline. We forgo the use of the contraction operator as it will not figure in our analysis. For simplicity, assume that all our boxes are hypercubes (equi-dimensional boxes); this means our subdivision splits each box into children. With a little more effort, our analysis can handle boxes with bounded aspect ratios and thus support the bisection-based algorithms. As noted, termination depends on instantiations of our 3 tests: our exclusion and Jacobian tests are standard in the interval literature. Our existence test, called MK test, is from Moore-Kioustelides (MK) (moore-kioustelidis:test:80, ). Our algorithm is similar333 In (lien-sharma-vegter-yap:arrange:14z, , Appendix), only termination was proved (up to the abstract level) with no complexity analysis. We will correct an error there. to one in the Appendix of (lien-sharma-vegter-yap:arrange:14z, ). In the normal manner of theoretical algorithms, one would proceed to “prove that Miranda is correct and analyze its complexity”. This will be done, but the way we proceed is aimed at some broader issues discussed next.
Effectivity: how could we convert a mathematically precise algorithm (like Miranda) into an “effective algorithm”, i.e., certified and implementable. One might be surprised that there is an issue. The non-trivially of this question can be illustrated from the history of isolating univariate roots: for about 30 years, it is known that the “benchmark problem” of isolating all the roots of an integer polynomial with -bit coefficients and degree has bit-complexity , a bound informally described as “near-optimal”. This is achieved by the algorithm of Schönhage and Pan (1981-1992). But this algorithm has never been implemented. What is the barrier? Basically, it is the formidable problem of mapping algorithms in the Real RAM model (bigahu, ) or BSS model (bcss:bk, ) into a bit-based Turing-computable model – see (yap:praise:09, ).
In contrast, recent progress in subdivision algorithms for univariate roots finally succeeded in achieving comparable complexity bounds of , and such algorithms were implemented shortly after! Thus, these subdivision algorithms were “effective”. For two parallel accounts of this development, see (sagraloff-mehlhorn:real-roots:16, ; kobel-rouillier-sagraloff:for-real:16, ) for the case of real roots, and to (becker+3:cisolate:18, ; becker+4:cluster:16, ; imbach-pan-yap:ccluster:18, ) for complex roots. What is the power conferred by subdivision? We suggest this: the subdivision framework provides a natural way to control the numerical precision necessary to ensure correct operations of the algorithm. Moreover, the typical one-sided tests of subdivision avoid the “Zero Problem” and can be effectively implemented using approximations with suitable rounding modes.
In this paper, we capture this pathway to effectivity by introducing 3 Levels of (algorithmic) Abstractions: (A) Abstract Level, (I) Interval Level, and (E) Effective Level. We normally identify Level (A) with the mathematical description of an algorithm or Real RAM algorithms. At level (I), the set extensions of the functions are replaced by the interval forms (see Section 2 for definitions). At the Effective Level, the algorithm approximate real numbers by BigFloat or dyadic numbers, i.e., . As illustration, consider the exclusion test (viewed as abstract) has correspondences in the next three levels:
where is the exact range of on , is the interval form of , and the effective form where the endpoints are dyadic numbers. The 3 range functions here are related as
An abstract algorithm is first mapped into an interval algorithm . But the algorithm still involves real numbers. So we must map to an effective algorithm . Correctness must ultimately be shown at the Effective Level; the standard missing link in numerical (even “certified”) algorithms is that one often stops at Abstract or Interval Levels.
We need to mention that the effectivity of an algorithm has no implications for the efficiency of the algorithm.
Complexity: The complexity of analytic algorithms is often restricted to convergence analysis. But in this paper, we will provide explicit bounds on complexity as a function of the geometry of the roots in . This complexity can be captured at each of our 3 levels, but we always begin by proving our theorems at the Abstract Level, subsequently transferred to the other levels. Although it is the Effective Level that really matters, it would be a mistake to directly attempt such an analysis at the Effective level: that would obscure the underlying mathematical ideas, incomprehensible and error prone. The 3-level description enforces an orderly introduction of new concerns appropriate to each level. Like structured programming, the design of effective algorithms needs some structure. Currently, outside of the subdivision framework, it is hard to see a similar path way to effectivity.
1.3. Literature Survey
There is considerable literature associated with each of our three tests: the exclusion test comes down to bounding range of functions, a central topic in Interval Analysis (ratschek-rokne:range:bk, ). The Jacobian test is connected to the question of local injectivity of functions, the Bieberbach conjecture (or de Branges Theorem), Jacobian Conjecture, and theory of univalent functions. In our limited space, we focus on the “star” of our 3 tests, i.e., the existence test. It is the most sophisticated of the 3 tests in the sense that some nontrivial global/topological principle is always involved in existence proofs. In our case, the underlying principle is the fixed point theorem of Brouwer, in the form of Miranda’s Theorem (1940), and intimately related to degree theory.
We compare two box tests and in terms of their relative efficacy: say is as efficacious as , written , if for all , succeeds implies that succeeds. The relative efficacy of several existence tests have been studied (goldsztejn2007comparison, ; frommer2004comparison, ; frommer-lang:miranda-type-tests:05, ; alefeld+3:existence:04, ). Goldsztejn considers four common existence tests, and argues that “in practice” there is an efficacy hierarchy
where (K) refers to Krawcyzk, (HS) to Hansen-Sengupta, (FLS) to Frommer-Lang-Schnurr, and (IN) to Interval-Newton. Note that (K), (HS) and (IN) are all based on Newton-type operators (see (2)). Our Moore-Kioustelidis (MK) test is essentially (FLS). We say “essentially” because the details of defining the tests may vary to render the comparisons invalid. In our MK tests, we evaluate on each box face using the Mean Value Form expansion at the center of the face. But the above analysis assumes an expansion is at the center of the box, which is less accurate. But we may also compare these tests in terms of their complexity (measured by the worst case number of arithmetic operations, or number of function evaluations); a complexity-efficacy tradeoff may be expected. Finally, evaluating these tests in isolation does not tell us how they might perform in the context of an algorithm. It is therefore premature to decide on the best existence test.
In section 2, we introduce some basic concepts of interval arithmetic and establish notations. Section 3 introduces the key existence test based on Miranda’s theorem. Section 4 proves conditions that ensure the success of these existence test. Section 5 introduces two Jacobian tests. Section 6 describes our main algorithm. Section 7 is the complexity analysis of our algorithm. We conclude in Section 8. All proofs are relegated to the Appendix.
2. Interval Forms
We first establish notations for standard concepts of interval arithmetic. Bold fonts indicate vector variables: e.g., or .
Let denote the set of compact intervals in . Extend this to for the set of compact -boxes. In the remaining paper, we assume that all -boxes are hypercubes (i.e., the width in each dimension is the same). For any box , let denote its center and be the width of any dimension. Besides boxes, we will also use ball geometry: let denote the closed ball centered at of radius . If , is just the point . For any , let denote the box centered at of width , called the -dilation of . The -dilation of is defined likewisely.
Let be two sets. We will quantify their “distance apart” in two ways: their usual Hausdorff distance is denoted and their separation, is denoted as . Note that is a metric on closed subsets of but is no metric.
Consider two kinds of extensions of a function . First, the set extension of refers to the function (still denoted by ) that maps to . The second kind of extension is not unique: an interval form of is any function , satisfying two properties: (i) (inclusion) ; (ii) (convergence) if with then and . For short, we call a box form of . If , we have corresponding set extension and interval forms .
The notation “” is generic; we use subscripts to indicate specific box forms. Thus, the mean value form of is
where is the gradient of (viewed as a column vector) and is the transpose. The box is now centered at the origin, i.e., . The appearance of the generic “” in the definition of means that is still not fully specified. In our complexity analysis, we assume that for any box form, if not fully specified, will have at least linear convergence. Next, we intend to convert the interval form to some effective version .
3. Miranda and MK Tests
In the rest of this paper, we fix
to be a -function (twice continuously differentiable), and and its partial derivatives have interval forms. We further postulate that has only finitely many simple zeros in where is the bounded region of interest. A zero of is simple if the Jacobian matrix is non-singular. For any set , let denote the multiset of zeros of in . We assume that is analytic and its zeros are counted with the proper multiplicity. Then is the size of the multiset . We may write and when is understood. The magnitude of any bounded set is defined as .
We consider a classical test from Miranda (1940) to confirm that a box contains a zero of . If the box is written as with , then it has two -th faces, namely
and , defined similarly. Write to mean either or . Consider the following box predicate called444 We call it “simple” as we ignore some common generalizations that allow an interchange of “” with “”, or replace by for any arbitrary permutation of the indices. the simple Miranda Test:
where is given in (4). The following result is classic:
Proposition 3.1 ().
If holds then .
Next, we introduce the MK Test test that amounts an application of the simple Miranda test to the box , using a preconditioned form of :
Abstract MK Test Input: and box Output: true iff succeeds 1. Take a point with well-defined. 2. Construct a “preconditioned version” : 3. Apply the Simple Miranda Test to over : For : If or , (*) return false 4. Return true.
The notation “” in (*) refers to faces of the box , not the -dilation of the faces of . Here “MK” refers to Moore and Kiousteliades (moore-kioustelidis:test:80, ); the preconditioning idea first appearing in (kioustelidis:miranda:78, ). The MK Test was first introduced in (lien-sharma-vegter-yap:arrange:14z, ). Notice that in MK(), the Miranda test is performed on instead of . It is intended to address the difficult case where the root is close to the boundary of a box.
Note that MK() is mathematically exact and generally not implementable (even if it were possible, we may still prefer approximations). We first define its interval form, denoted : simply by replacing in line (*) by interval forms . Finally, we must define the effective form (Section 8).
4. On Sure Success of MK Test
The success of the MK test implies the existence of roots. In this section, we prove some (quantitative) converses.
We need preliminary facts about mean value forms. Given , the notation denotes a number of the form , where ; thus “” hides the implicit in the definition. This notation is not symmetric: and are generally different. This notation extends to matrices: let and be two matrices, then . Also, let denote the vector where . For , we write to denote the line segment connecting and . We write and for the infinity norms of vector and matrix . For a bounded convex set , define the matrix with entries where
Below, may be a disc or a line . Denote by the Jacobian matrix of at . We write as when is understood. The following is a simple application of the Mean Value Theorem (MVT):
Lemma 4.1 (Mvt).
Given two points , we have:
4.1. Sure Success of abstract MK Test
In this and the next subsection, we consider boxes that contain a root of . We prove conditions that ensures the success of the MK Test. We first prove this for the abstract test . The next section extends this result to the interval test .
The key definition here is a bound which depends on and . We prove that if , then the abstract MK test will succeed on . By a critical point we mean where the determinant of is zero. By definition, a root of is simple if is not a critical point.
Suppose and are two bounded sets in . Define
We see that both and are finite if does not contain a critical point of . Consider the following function
We then define to be the smallest such that , i.e., .
Lemma 4.2 ().
For any simple root of , is well-defined.
From now on, let denote the disc
The following lemma corrects an gap in the appendix of (lien-sharma-vegter-yap:arrange:14z, ).
Lemma 4.3 ().
Let be a box containing a simple root of and with well-defined. If , the preconditioned system satisfies that for all ,
4.2. Sure Success of Interval MK Test
We now extend the previous subsection on the abstract MK Test to the interval version . Again, assume is a box containing exactly one root of . We will give which is analogous to and prove that if , then will succeed.
To prove the existence of such a as mentioned above, we need to make some assumptions on the property of the box functions. As in (moore:bk-95, ), a box function is called Lipschitz in a region if there exists a constant such that
We call any such a Lipschitz constant of on . For our theorem, we need to know the specific box function in order to derive a Lipschitz constant. Consider the mean value form on a region .
Lemma 4.4 ().
Let be a continuously differentiable function defined on a convex region . Then is a Lipschitz constant for on .
Consider the sign tests of :
where is the -th component of the system . We consider the mean value form and assume that the components of are evaluated via the linear combination of for .
We now prove that if is small enough, will succeed. Recalling the Hausdorff distance on intervals, we have this bound from (neumaier:equations:bk-90, ).
Proposition 4.5 ().
Let be a continuously differentiable function. Then
For the next theorem, define
where is a Lipschitz constant for on (for all ).
Theorem 4.6 ().
Let be a box containing a simple root of width and with well-defined.
If for each , then succeeds with .
If with , then succeeds.
5. Two Jacobian Conditions
We define the Jacobian test as follows:
The order of operations in should be clearly understood: first we compute the interval Jacobian matrix , i.e., entries in this matrix are the intervals . Then we compute the determinant of the interval matrix. Also note that we use instead of . The following is well-known in interval computation (see (aberth:precise:bk, , Corollary to Theorem 12.1)):
Proposition 5.1 ().
If holds then .
We next introduce the following strict Jacobian test:
where denotes the expression obtained by computing the determinant of the Jacobian matrix with functional entries . Finally, we evaluate on . Note that and so the strict test is more efficacious. Unfortunately, it is known that does not imply . Nevertheless, we now show that it can serve as a uniqueness test in conjunction with the MK test:
Theorem 5.2 ().
If both and succeed then .
It follows that we could use in our Miranda algorithm in the introduction.
6. The Miranda Algorithm
Our main algorithm for root isolation is given in Figure 1. We use and (respectively) for its existence and Jacobian tests. It remains to specify the exclusion test :
The algorithm in Figure 1 is abstract. To introduce the interval version
: such that ;
In the definition of (Section 3), replace each by .
For the effective version, we use the tests and , which will be discussed in Section 8.
Termination of each version of Miranda follows from the complexity analysis below. We first show the partial correctness:
Theorem 6.1 (Partial Correctness).
If Miranda halts, the output queue is correct.
7. Complexity Upper Bounds
In this section, we derive a lower bound on the size of boxes produced by Miranda. That is, any box with width will either be output or rejected. This implies that the subdivision tree is no deeper than , yielding an upper bound on computational complexity. This bound will be expressed in terms of quantities determined by the zeros in . We first prove this for the abstract Miranda, then extend the results to and . From the algorithm, we see that a box is output if holds in line 10; it is rejected if one of the 2 following cases is true: (1) holds or (2) it is contained in where is an output box, as indicated in line 12. The boxes that contain a root of will be finally verified by the former predicate and the boxes that contain no root of will eventually be rejected in one of the 2 cases.
To prove the existence of such a , we need to look into
the tests , and .
We will give bounds , and
for the tests respectively and show that
for any box produced in the algorithm
(1) if , it will pass when ,
(2) if , it will pass when ;
(3) if , there are 2 cases: (a) if keeps a certain distance from the roots, it passes when ; (b) if is close enough to the roots, it will be rejected by line 12 of the algorithm when .
We have essentially proved item (1) in the Section 4. More precisely, for each root , we had defined a constant . We now set
7.1. Sure Success for and
We study conditions to ensure the success of the tests JC and . We will introduce constants in analogy to (15).
First consider . Let box contain a simple root . By Mean Value Theorem, (see (6) for definition). Since , it holds (). Denoting and , we get and . By applying the rules and where , are intervals, we may verify by induction that for any permutation . Hence, it follows .
Set to be the smallest positive root of the equation
The following lemma implies the existence of :
Lemma 7.1 ().
If box contains a simple root and then succeeds.
Thus we may choose and set
Lemma 7.2 (Lemma A).
If and then and holds.
Corollary 7.3 ().
Each root in will be output in a box of width .
Let be a region that excludes discs around roots:
where is the interior of . Denote the zero set of as for and define . Since all the roots in are removed from the set , we can verify that for all . Combining with the compactness of , we obtain . Finally we set
Lemma 7.4 (Lemma B).
Suppose . If
, succeeds when . If , succeeds when .
The Lemma follows naturally from Lemma A and B:
Lemma 7.5 (Lemma C).
Every box produced by the Miranda has width .
7.2. Sure Success for and
We now consider the interval tests and under the assumption that the underlying interval forms involved are Lipschitz. Let be a global Lipschitz constant for and for all in . We will develop corresponding bounds , . Observe that if we replace the bounds , , in the abstract version by the bounds , all the statements and proofs in the previous section remain valid. So in this section, we do not repeat the statements, except to give the bounds and .
First look at the test . With the same arguments as in abstract level, we obtain
where is the smallest positive root of the
With and , we get an interval analogue of Lemma A:
Lemma 7.6 (Lemma
If and with , then and succeeds.
Next look at the test