Root Refinement for Real Polynomials.

Root Refinement for Real Polynomials.

Michael Kerber Stanford University, Stanford, USA and Max-Planck-Center for Visual Computing and Communication, Saarbrücken, Germany (mkerber@mpi-inf.mpg.de).    Michael Sagraloff Max-Planck Institut für Informatik, Saarbrücken, Germany (msagralo@mpi-inf.mpg.de).
Abstract

We consider the problem of approximating all real roots of a square-free polynomial . Given isolating intervals, our algorithm refines each of them to a width of or less, that is, each of the roots is approximated to bits after the binary point. Our method provides a certified answer for arbitrary real polynomials, only considering finite approximations of the polynomial coefficients and choosing a suitable working precision adaptively. In this way, we get a correct algorithm that is simple to implement and practically efficient. Our algorithm uses the quadratic interval refinement method; we adapt that method to be able to cope with inaccuracies when evaluating , without sacrificing its quadratic convergence behavior. We prove a bound on the bit complexity of our algorithm in terms of the degree of the polynomial, the size and the separation of the roots, that is, parameters exclusively related to the geometric location of the roots. Our bound is near optimal and significantly improves previous work on integer polynomials. Furthermore, it essentially matches the best known theoretical bounds on root approximation which are obtained by very sophisticated algorithms. We also investigate the practical behavior of the algorithm and demonstrate how closely the practical performance matches our asymptotic bounds.

1 Introduction

The problem of computing the real roots of a polynomial in one variable is one of the best studied problems in mathematics. If one asks for a certified method that finds all roots, it is common to write the solutions as a set of disjoint isolating intervals, each containing exactly one root; for that reason, the term real root isolation is common in the literature. Simple, though efficient methods for this problem have been presented, for instance, based on Descartes’ rule of signs [7], or on Sturm’s theorem [8]. Recently, the focus of research shifted to polynomials with real coefficients which are approximated during the algorithm. It is worth remarking that this approach does not just generalize the integer case but has also leads to practical [11, 21] and theoretical [22] improvements of it.

We consider the related real root refinement problem: assuming that isolating intervals of a polynomial are known, refine them to a width of or less, where is an additional input parameter. Clearly, the combination of root isolation and root refinement, also called strong root isolation, yields a certified approximation of all roots of the polynomial to an absolute precision of or, in other words, to bits after the binary point in binary representation.

We introduce an algorithm, called Aqir, to solve the root refinement problem for arbitrary square-free polynomials with real coefficients. Most of the related approaches are formulated in the REAL-RAM model where exact operations on real numbers are assumed to be available at unit cost. In contrast, our approach works only with approximations of the input and exclusively performs approximate but certified arithmetic. Here, we assume the existence of an oracle which, for an arbitrary positive integer , provides approximations of the coefficients of the input polynomial to an error of less than . In the analysis of our algorithm, we also quantify the size of in the worst case. The refinement uses the quadratic interval refinement method [1] (QIR for short) which is a quadratically converging hybrid of the bisection and the secant method. We adapt the method to work with an increasing working precision and use interval arithmetic to validate the correctness of the outcome. In this way, we obtain an algorithm that always returns a correct root approximation, is simple to implement on an actual computer (given that arbitrary approximations of the coefficients are accessible), and is adaptive in the sense that it might succeed with a much lower working precision than predicted by the worst-case bound.

We provide a bound on the bit complexity of our algorithm. To state it properly, we first define several magnitudes depending on the polynomial which remain fixed throughout the paper. Let

(1.1)

be a square-free polynomial of degree with and (throughout the paper, means the logarithm with base ). We denote the (complex) roots of by , and, w.l.o.g., we can assume that the roots are numbered such that the first roots are exactly the real roots of . For each , denotes the separation of , and the logarithmic root bound of . An interval is called isolating for a root if contains and no other root of . We set for the center and for the width of .

Main Result. Given initial isolating intervals for the roots of , our algorithm refines all intervals to the width using

(1.2)

bit operations, where means that we ignore logarithmic factors. To do so, our algorithm requires the coefficients of at a precision of at most

bits after the binary point.

We remark that, if dominates all other input parameters, the bound in (1.2) is optimal up to logarithmic factors because reading the output already takes bit operations in the presence of real roots.

For the analysis, we divide the sequence of QIR steps in the refinement process into a linear sequence where the method behaves like bisection in the worst case, and a quadratic sequence where the interval is converging quadratically towards the root, following the approach in [12]. We do not require any conditions on the initial intervals except that they are disjoint and cover all real roots of ; an initial normalization phase modifies the intervals to guarantee the efficiency of our refinement strategy.

We give two variants of our algorithm; for the first variant which we consider to be more practical, we use approximate polynomial evaluation at single points only, whereas, for the second (more theoretical) variant, we group up to evaluations together and use fast approximate multipoint evaluation [15]. The idea behind the second approach is that we perform polynomial evaluations simultaneously for all intervals at the same cost as for a single classical evaluation.111Very recent work [25] introduces an alternative method for real root refinement which, for the task of refining a single isolating interval, achieves comparable running times as Aqir. In a preliminary version of their conference paper (which has been sent by the authors to M. Sagraloff in April 2013), the authors claim that using approximate multipoint evaluation also yields an improvement by a factor for their method. Given the results from this paper, this seems to be correct, however, their version of the paper did not contain a rigorous argument to bound the precision demand for the fast multipoint evaluation. This has been achieved first in [15]. This yields the complexity bound in (1.2) which is by a factor better (if dominates all other input parameters) than the bound as achieved by the variant without multipoint evaluations.

We remark that, using the root solver from [22], initial isolating intervals can be obtained with bit operations using coefficient approximations of to bits after the binary point. Hence, our complexity result from (1.2) also gives a bound on the strong root isolation problem.

The case of integer coefficients is often of special interest, and, with respect to the QIR method, the problem has been investigated by previous work [12] for this specific case. In that work, the complexity of root refinement was bounded by . We lower this bound and arrive at a complexity of

(1.3)

The improvement stems from a combination of several ideas that we describe separately: In comparison to the purely exact method from [12], we get rid of one factor of because, for Aqir, we consider a different approach for evaluating the sign of at rational points (the main operation in the refinement procedure) than for the classical QIR method: for an interval of size , the evaluation of at the endpoints of the interval has a complexity of when using exact rational arithmetic because the function values can consist of up to bits. However, we show that we can still compute the sign of the function value with certified numerical methods using the substantially smaller working precision of . We remark that the latter result certainly only applies to points whose distance to a root is not much smaller than , thus, for Aqir, we modified the QIR method in way such that the latter requirement is assured; this improvement is described in Sections 3 to 5. The latter modifications yield an algorithm with bit complexity for all real roots. Another factor of in the second term is then shaved off by using approximate multipoint evaluation in the algorithm as already mentioned above.

Finally, we mix the ideas from [22] with our approach. The term in the complexity is due to the fact that, in the worst case, our refinement algorithm performs bisections until the isolating interval has reached a certain threshold. We change the algorithm such that it performs a hybrid of bisections and Newton-like steps initially, and switches to QIR after reaching the threshold. This further reduces the complexity to We remark that this last optimization is restricted to the case of integer polynomials, whereas the first two improvements apply to our general setup and lead to the main result (1.2) stated above.

We have implemented the exact version of the QIR algorithm and the approximate variant Aqir that realizes our first improvement step. We report on experimental results when applying both versions to two families of random input instances. We focus on the comparison of both variants when increasing one of the input parameters. We demonstrate that, for increasing degree of the input polynomial, refining a single root scales quadratically for the exact version and linearly for the approximate version. Hence, by choosing a smaller working precision, we get rid of a factor of both in theory and in practice.

Related work. The problem of accurate root approximation is omnipresent in mathematical applications; certified methods are of particular importance in the context of computations with algebraic objects, for instance, when computing the topology of algebraic curves [6, 10] or when solving systems of multivariate equations [2].

The idea of combining bisection with a faster converging method to find roots of continuous functions has been first introduced in Dekker’s method and elaborated in Brent’s method; see [5] for a summary. However, these approaches assume exact arithmetic for their convergence results.

For polynomial equations, numerous algorithms are available, for instance, the Jenkins-Traub algorithm or Durand-Kerner iteration; although they usually approximate the roots very fast in practice [4], general worst-case bounds on their arithmetic complexity are not available. In fact, for some variants, even termination cannot be guaranteed in theory; we refer to the survey [20] for extensive references on these and further methods.

The theoretical complexity of root approximation has been investigated by Pan [18, 19]. Assuming all roots to be in the unit disc, he achieves a bit complexity of for approximating all roots to an accuracy of , which matches our bound (if is the dominant input parameter) for the first variant of Aqir which does not use fast multipoint evaluation. His approach even works for polynomials with multiple roots. However, as Pan admits in [20], the algorithm is difficult to implement and so is the complexity analysis when taking rounding errors in intermediate steps into account. In addition, the method is global in a sense that all complex roots are approximated in parallel, hence it does not profit from situations where the number of real roots is small. A very recent approach [17] for root isolation and refinement uses Pan’s factorization algorithm from [18] as a key ingredient. For square-free polynomials, the corresponding algorithm achieves a comparable bit complexity bound for the refinement as our asymptotically fast variant of Aqir. However, this does not turn our present results obsolete: First, our approach considerably differs from all existing global root finding algorithms which combine the splitting circle method [24] with techniques from numerical analysis (Newton iteration, GraeffeÕs method, discrete Fourier transforms) and fast algorithms for polynomial and integer multiplication. Second, our algorithm is adaptive in the sense that its computational complexity is directly related to the number of real roots which is often much smaller than the degree of the polynomial. Third, because of its simpleness and the low algorithmic overhead222In particular, this holds for the first variant of Aqir which does not use fast multipoint evaluation. We consider the second variant based on fast approximate multipoint evaluation more to be a theoretical proof of concept that the overall approach may yield almost optimal complexity bounds., it is well suited for an efficient implementation.

We improve upon the conference version of this paper [13] in several ways: in our bit complexity result, we remove the dependence on the coefficient size and, thus, relate the hardness of root approximation to parameters that exclusively depend on the geometric location of the roots. In addition, we redefine the threshold for the interval width that guarantees quadratic convergence (Definition 12); in this way, we get rid of the magnitude , which is a pure artifact of the analysis of [13]. Moreover, the improvements on the complexity result using multipoint evaluations and hybrid Newton steps, as well as the experimental evaluations did not appear in [13].

Outline. We summarize the (exact) QIR method in Section 2. Our Aqir algorithm that only uses approximate coefficients is described in Section 3. Its precision demand is analyzed in Section 4. Based on that analysis of a single refinement step, the complexity bound of root refinement is derived in Section 5. Some experimental comparison between QIR with exact and approximate coefficients is presented in Section 6. Further asymptotic improvements using multipoint evaluation and special techniques for integer polynomials are described in Section 7. We end with concluding remarks in Section 8.

2 Review on exact QIR

Input: square-free, isolating,
Output: with isolating for and

1:procedure eqir()
2:     if , return Bisection(),.
3:     
4:     
5:     
6:     if , return
7:     if and , return
8:     if and , return
9:     Otherwise, return ,.
10:end procedure
Algorithm 1 Eqir: Exact Quadratic Interval Refinement

Abbott’s QIR method [1, 12] is a hybrid of the simple (but inefficient) bisection method with a quadratically converging variant of the secant method. We refer to this method as Eqir, where “E” stands for “exact” in order to distinguish from the variant presented in Section 3.333To avoid confusion, the approximate version presented later is also “exact” in the sense that the refined intervals are isolating, but the intermediate computations are only approximate. Given an isolating interval for a real root of , we consider the secant through and (see also Figure 3.1). This secant intersects the real axis in the interval , say at -coordinate . For small enough, the secant should approximate the graph of the function above quite well and, so, should hold. An Eqir step tries to exploit this fact:

The isolating interval is (conceptually) subdivided into subintervals of same size, using equidistant grid points. Each subinterval has width . Then , the closest grid point to , is computed and the sign of is evaluated. If that sign equals the sign of , the sign of is evaluated. Otherwise, is evaluated. If the sign changes between the two computed values, the interval or the interval , respectively, is set as new isolating interval for . In this case, the Eqir step is called successful. Otherwise, the isolating interval remains unchanged, and the Eqir step is called failing. See Algorithm 1 for a description in pseudo-code.

In [12], the root refinement problem is analyzed using the just described Eqir method for the case of integer coefficients and exact arithmetic with rational numbers. For that, a sequence of Eqir steps is performed with initially. After a successful Eqir step, is squared for the next step; after a failing step, is set to . If drops to , a bisection step is performed, and is set to for the next step. In [12], a bound on the size of an interval is provided to guarantee success of every Eqir and, thus, quadratic convergence of the overall method.

3 Approximate QIR

The most important numerical operation in an Eqir step is the computation of for values . Note that is needed for determining the closest grid point to the secant (Step 4 of Algorithm 1), and its sign is required for checking for sign changes in subintervals (Steps 5-8).

What are the problems if is a bitstream polynomial as in (1.1), so that can only be evaluated up to a certain precision? First of all, can only be computed approximately, too, which might lead to checking the wrong subinterval in the algorithm if is close to the center of a subinterval. Even more seriously, if is zero, then, in general, its sign can never be evaluated using any precision. Even if we exclude this case, the evaluation of can become costly if is too close to a root of . The challenge is to modify the QIR method such that it can cope with the uncertainties in the evaluation of , requires as few precision as possible in a refinement step and still shows a quadratic convergence behavior eventually.

Bisection is a subroutine called in the QIR method if ; before we discuss the general case, we first describe our variant of the bisection in the bitstream context. Note that we face the same problem: might be equal or almost equal to zero at , the center of . We will overcome this problem by evaluating at several -coordinates “in parallel”. For that, we subdivide into 4 equally wide parts using the subdivision points for . We also assume that the sign of at is already known. We choose a starting precision and compute using interval arithmetic in precision (cf. Section 4 for details). If less than out of signs have been determined using precision , we set and repeat the calculation with increased precision. Once the sign at at least subdivision points is determined, we can determine a subinterval of at most half the size of that contains (Algorithm 2). We will refer to this algorithm as “bisection”, although the resulting interval may sometimes be only a quarter of the original size. Note that can only become zero at one of the subdivision points which guarantees termination also in the bitstream context. Moreover, at least of the subdivision points have a distance of at least to . This asserts that the function value at these subdivision points is reasonably large and leads to an upper bound of the required precision (Lemma 5).

Figure 3.1: Illustration of an Aqir step for .

We next describe our bitstream variant of the QIR method that we call approximate quadratic interval refinement, or Aqir for short (see also Figure 3.1 for the illustration of an Aqir step for ). Compared to the exact variant, we replace two substeps. In Step 4, we replace the computation of as follows: For a working precision , we evaluate and via interval arithmetic with precision (blue vertical intervals in the above figure) and evaluate with interval arithmetic accordingly (cf. Section 4). Let denote the resulting interval (in Figure 3.1, is the intersection of the stripe defined by the interval evaluations of and with the real axis). If the width of is more than , we set to and retry. Otherwise, let be the integer closest to and set . For as before and (red dots) for , the following Lemma shows that the computed indeed approximates on the -grid:

Input: square-free, isolating,
Output: isolating with .

1:procedure Approximate_Bisection()
2:     
3:     
4:     
5:     while  contains more than one zero do
6:         for i=2,…,4 do
7:               If , set
8:         end for
9:         
10:     end while
11:     Find , such that
12:     return
13:end procedure
Algorithm 2 Approximate Bisection
Lemma 1.

Let be inside the subinterval . Then, or . Moreover, let be the point that is closer to . If , then .

Proof.

Let and the interval computed by interval arithmetic as above, with width at most . Since , it follows that . By construction, . Therefore, and, thus, it follows that can only be rounded to or . Furthermore, for , implies that . It follows that by triangle inequality, so must be rounded to . The case is analogous. ∎

The second substep to replace in the QIR method is to check for sign changes in subintervals in Steps 5-8. As before, we set . Instead of comparing the signs at and , we choose seven subdivision points (red crosses in Figure 3.1), namely

(3.1)

In case that or , we only choose the points of (3.1) that lie in . For a working precision , we evaluate the sign of at all subdivision points using interval arithmetic. If the sign remains unknown for more than one point, we set to and retry. After the sign is determined for all except one of the points, we look for a sign change in the sequence. If such a sign change occurs, we set the corresponding interval as isolating and call the Aqir step successful. Otherwise, we call the step failing and keep the old isolating interval. As in the exact case, we square up after a successful step, and reduce it to its square root after a failing step. See Algorithm 3 for a complete description.

Note that, in case of a successful step, the new isolating interval satisfies . Also, similar to the bisection method, the function can only be zero at one of the chosen subdivision points, and the function is guaranteed to be reasonably large for all but one of them, which leads to a bound on the necessary precision (Lemma 7). The reader might wonder why we have chosen a non-equidistant grid involving the subdivision points . The reason is that these additional points allow us to give a success guarantee of the method under certain assumptions in the following lemma which is the basis to prove quadratic convergence if the interval is smaller than a certain threshold (Section 5.2).

Lemma 2.

Let be an isolating interval for some root of , and as before. If , then Aqir() succeeds.

Proof.

Let be the subdivision point selected by the Aqir method. We assume that ; otherwise, a similar (simplified) argument applies. By Lemma 1, and, thus, . It follows that the leftmost two points of (3.1) have a different sign than the rightmost two points of (3.1). Since the sign of is evaluated for at least one value on each side, the algorithm detects a sign change and, thus, succeeds. ∎

Input: square-free, isolating, ,
Output: with isolating and

1:procedure Aqir()
2:     if , return Approximate_Bisection(),.
3:     
4:     
5:      while has width , set
6:     
7:     if ,
8:     if ,
9:     if ,
10:     
11:     while  contains more than one zero do
12:         for i=1,…,s do
13:               If , set
14:         end for
15:         
16:     end while
17:     If return
18:     Otherwise, return
19:end procedure
Algorithm 3 Approximate Quadratic interval refinement

4 Analysis of an AQIR step

The running time of an Aqir step depends on the maximal precision needed in the two while loops (Step 5, Steps 11-15) of Algorithm 3. The termination criterion of both loops is controlled by evaluations of the form , where is some polynomial expression and is the current working precision.

We specify recursively what we understand by evaluating in precision with interval arithmetic. For that, we define for and to be the maximal such that for some integer . The same way is the minimal with of the same form. We extend this definition to arithmetic expressions by the following rules (we leave out for brevity):

Finally, we define the interval . By definition, the exact value of is guaranteed to be contained in . We assume that polynomials are evaluated according to the Horner scheme, and when evaluating with precision , the above rules apply in each arithmetic step. The next lemma provides a worst case bound on the size of the resulting interval under certain conditions. We further remark that, in an actual implementation, is usually much smaller than the worst case bound derived here. Nevertheless, our complexity analysis is based on the latter bound. Throughout the following considerations , denotes an integer upper bound on the root bound , that is, , and, in particular for all roots of .

Lemma 3.

Let be a polynomial as in (1.1), with , and . Then,

(4.1)
(4.2)

In particular, has a width of at most .

Proof.

We do induction on . The statement is clearly true for . For , we write with the constant coefficient of and of degree . Note that, for any real value , , same for . Therefore, we can bound as follows (again, leaving out for simplicity):

Note that where or . Moreover, we can write with . Therefore, we can rearrange

By a simple inductive proof on the degree, we can show that both and are bounded by . Using that and the induction hypothesis yields

The bound for follows in the same way. ∎

For the sake of simplicity, we decided to assume fixed-point arithmetic, that means, determines the number of bits after the binary point. We refer the interested reader to [16, Thm. 12], where a corresponding result for floating-point arithmetic is given.

We analyze the required working precision of approximate bisection and of an Aqir step next. We exploit that, whenever we evaluate at subdivision points, of them have a certain minimal distance to the root in the isolating interval. The following lemma gives a lower bound on for such a point , given that it is sufficiently far away from any other root of .

Lemma 4.

Let be as in (1.1), a real root of and be a real value with distance to all real roots . Then,

(recall the notations from Section 1 for the definitions of and )

Proof.

For each non-real root of , there exists a complex conjugate root and, thus, we have for all as well. It follows that

where the last inequality uses that and, thus, . ∎

We next analyze an approximate bisection step.

Lemma 5.

Let be a polynomial as in (1.1), be an isolating interval for a root of and . Then, Algorithm 2 applied on requires a maximal precision of

and its bit complexity is bounded by .

Proof.

Consider the three subdivision points , where , and an arbitrary real root of . Note that because the segment from to spans at least a quarter of . Moreover, , and so

It follows that has a distance to of at least . Hence, we can apply Lemma 4 to each , that is, we have . Since the signs of at the endpoints of are known, it suffices to compute the signs of at two of the three subdivision points. For at least two of these points, the distance of to is at least , thus, we have for at least two points. Then, due to Lemma 3, we can use interval arithmetic with a precision to compute these signs if satisfies

which is equivalent to . Since we double the precision in each step, we will eventually succeed with a precision smaller than . The bit complexity for an arithmetic operation with fixed precision is . Namely, since the absolute value of each subdivision point is bounded by , the results in the intermediate steps have magnitude and we consider bits after the binary point. At each subdivision point, we have to perform arithmetic operations for the computation of , thus, the costs for these evaluations are bounded by bit operations. Since we double the precision in each iteration, the total costs are dominated by the last successful evaluation and, thus, we have to perform bit operations. ∎

We proceed with the analysis of an Aqir step. In order to bound the required precision, we need additional properties of the isolating interval.

Definition 6.

Let be as in (1.1), be an isolating interval of a root of . We call normal444The reader may notice that the definition of ”normal” depends on the upper bound on . Throughout our argument, we assume that such an initial is given. We will finally choose a which approximates up to an (addative) error of . if

  • ,

  • for every and , and


In simple words, a normal isolating interval has a reasonable distance to any other root of , and the function value at the endpoints is reasonably large. We will later see that it is possible to get normal intervals by a sequence of approximate bisection steps.

Lemma 7.

Let be a polynomial as in (1.1), be a normal isolating interval for a root of with , and let . Then, the Aqir step for requires a precision of at most

and, therefore, its bit complexity is bounded by

Moreover, the returned interval is again normal.

Proof.

We have to distinguish two cases. For , we consider the two while-loops in Algorithm 3. In the first loop (Step 5), we evaluate via interval arithmetic, doubling the precision until the width of the resulting interval is less than or equal to . The following considerations show that we can achieve this if satisfies

(4.3)

W.l.o.g., we assume . If satisfies the above condition, then, due to Lemma 3, is contained within the interval

and is contained within the interval

where the latter result uses the fact that and have different signs. It follows that is contained within , and a simple computation shows that has width less than . Hence, since has absolute value less than , has width less than as well. The bound (4.3) on also writes as

and since we double in each iteration, computing via interval arithmetic up to an error of demands for a precision

Since is normal and because of the posed condition on , we can bound this by

We turn to the second while loop of Algorithm 3 (Steps 11-15) where is evaluated at the subdivision points as defined in (3.1). Since the interval is normal, we can apply Lemma 4 to each of the seven subdivision points. Furthermore, at least six of these points have distance to the root and, thus, for these points, is larger than . Then, according to Lemma 4.3, it suffices to use a precision that fulfills

The same argumentation as above then shows that the point evaluation will be performed with a maximal precision of less than