Optimal sampling rates for approximating analytic functions from pointwise samples


Ben Adcock
Department of Mathematics
Simon Fraser University
Canada

   Rodrigo B. Platte
School of Mathematical and Statistical Sciences
Arizona State University
USA
   Alexei Shadrin
DAMTP, Centre for Mathematical Sciences
University of Cambridge
United Kingdom
Abstract

We consider the problem of approximating an analytic function on a compact interval from its values at a set of distinct points. When the points are equispaced, a recent result (the so-called impossibility theorem) has shown that the best possible convergence rate of a stable method is root-exponential in the number of points, and that any method with faster exponential convergence must also be exponentially ill-conditioned at a certain rate. This result hinges on a classical theorem of Coppersmith & Rivlin concerning the maximal behaviour of polynomials bounded on an equispaced grid. In this paper, we first generalize this theorem to arbitrary point distributions. Next, we present an extension of the impossibility theorem valid for general nonequispaced points, and apply it to the case of points that are equidistributed with respect to modified Jacobi measures. This leads to a necessary sampling rate for stable approximation from such points. We prove that this rate is also sufficient, and therefore exactly quantify (up to constants) the precise sampling rate for approximating analytic functions from such node distributions with stable numerical methods. Numerical results – based on computing the maximal polynomial via a variant of the classical Remez algorithm – confirm our main theorems. Finally, we discuss the implications of our results for polynomial least-squares approximations. In particular, we theoretically confirm the well-known heuristic that stable least-squares approximation using polynomials of a given degree is possible only once the number of points is sufficiently large for there to be a subset of the nodes that mimics the behaviour of the Chebyshev nodes of that degree.

1 Introduction

The concern of this paper is the approximation of an analytic function from its values on an arbitrary set of points in [−1,1]. It is well known that if such points follow a Chebyshev distribution then the function can be stably (up to a log factor in the number of points) approximated by its polynomial interpolant, with a convergence rate that is geometric in the number of points. Conversely, when the points are equispaced, polynomial interpolants do not necessarily converge uniformly on [−1,1] as the number of points increases; an effect known as Runge’s phenomenon. Such an approximation is also exponentially ill-conditioned in the number of points, meaning that divergence is witnessed in finite precision arithmetic even when theoretical convergence is expected.

Many different numerical methods have been proposed to overcome Runge’s phenomenon by replacing the polynomial interpolant by an alternative approximation (see [BoydRunge, PlaKle, TrefPlatteIllCond] and references therein). This raises the fundamental question: how successful can such approximations be? For equispaced points, this question was answered recently in [TrefPlatteIllCond]. Therein it was proved that no stable method for approximating analytic functions from equispaced nodes can converge better than root-exponentially fast in the number of points, and moreover any method that converges exponentially fast must also be exponentially ill-conditioned.

A well-known method for approximating analytic functions is least-squares fitting with a polynomial of degree n. Although a classical approach, this technique has become increasingly popular in recent years as a means for computing so-called polynomial chaos expansions with application to uncertainty quantification (see [ChkifaEtAl, DavenportEtAlLeastSquares, MiglioratiEtAlFoCM, NarayanZhouCCP] and references therein), as well as in data assimilation in reduced-order modelling [BinevEtAlDataAssim, DeVoreEtAlDataAssimBanach]. A consequence of the impossibility theorem of [TrefPlatteIllCond] is that polynomial least-squares is an optimal stable and convergent method for approximating one-dimensional analytic functions from equispaced data, provided the polynomial degree used in the least-squares fit scales like the square root of the number of grid points. Using similar ideas, it has also recently been shown that polynomial least-squares is an optimal, stable method for extrapolating analytic functions [DemanetTownsendExtrap].
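As an illustration of this square-root scaling, the following minimal sketch (our own, not code from the paper; the Runge test function and all parameter values are our choices) fits data on equispaced points with a polynomial whose degree is proportional to the square root of the number of samples.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def ls_fit(f, m, n):
    """Degree-n least-squares fit to f sampled at m+1 equispaced points on [-1, 1]."""
    x = np.linspace(-1.0, 1.0, m + 1)
    A = C.chebvander(x, n)                      # Chebyshev Vandermonde matrix, shape (m+1, n+1)
    coef, *_ = np.linalg.lstsq(A, f(x), rcond=None)
    return coef

f = lambda x: 1.0 / (1.0 + 25.0 * x**2)         # Runge's example
xx = np.linspace(-1.0, 1.0, 2001)
for m in [50, 200, 800]:
    n = int(np.sqrt(m))                         # degree ~ sqrt(number of points)
    err = np.max(np.abs(f(xx) - C.chebval(xx, ls_fit(f, m, n))))
    print(m, n, err)
```

With this choice the fit remains well conditioned in floating-point arithmetic, whereas taking the degree proportional to the number of points (i.e. interpolation) reproduces the instability described above.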

1.1 Contributions

The purpose of this paper is to investigate the limits of stability and accuracy for approximating analytic functions from arbitrary sets of points. Of particular interest is the case of points whose behaviour lies between the two extremes of Chebyshev and equispaced grids. Specifically, suppose a given set of points exhibits some clustering near the endpoints, but not the characteristic quadratic clustering of Chebyshev grids. Generalizing that of [TrefPlatteIllCond], our main result precisely quantifies both the best achievable error decay rate for a stable approximation and the resulting ill-conditioning if one seeks faster convergence.

This result follows from an extension of a classical theorem of Coppersmith & Rivlin on the maximal behaviour of a polynomial of degree n bounded on a grid of equispaced points [copprivlinpolygrowth]. In Theorem 3.1 we extend the lower bound proved in [copprivlinpolygrowth] to arbitrary sets of points. We next present an abstract impossibility result (Lemma 4.1) valid for arbitrary sets of points. To illustrate this result in a concrete setting, we then specialize to the case of nodes which are equidistributed with respect to modified Jacobi measures. Such measures take the form

μ(x) = g(x) (1 − x)^α (1 + x)^β,

where α, β > −1 and g is bounded above and below by positive constants almost everywhere, and include equispaced (α = β = 0) and Chebyshev (α = β = −1/2) nodes as special cases. In our main result, Theorem 4.2, we prove an extended impossibility theorem for such nodes. Generalizing [TrefPlatteIllCond], two important consequences of this theorem are as follows:

  • If max{α, β} > −1/2, i.e. the nodes cluster more slowly than quadratically at one of the endpoints, any method that converges exponentially fast in the number of points at a geometric rate must also be exponentially ill-conditioned in the number of points at a geometric rate.

  • The best possible convergence rate for a stable approximation is subgeometric with index 1/(2(1 + γ)), where γ = max{α, β}. That is, the error is at best of order ρ^{−m^{1/(2(1+γ))}} for some ρ > 1 as the number of points m tends to infinity.

We also give a full characterization of the trade-off between exponential ill-conditioning and exponential convergence at subgeometric rates lying strictly between these two extremes.

Although not a result about polynomials per se, this theorem is closely related to the behaviour of discrete least-squares fitting with polynomials of degree n. Indeed, the quantity we estimate in Theorem 3.1 is equivalent, up to a factor of √(m+1), to the infinity-norm condition number of such an approximation. By using polynomial least-squares as our method, in Proposition LABEL:p:least_squares_optimal we show that the rate described in (ii) is not only necessary for stable recovery but also sufficient. Specifically, when the polynomial degree n is chosen as

(1.1)

the polynomial least-squares approximation is stable and converges exponentially fast for all functions analytic in an appropriate complex region. The fact that such a method is optimal in view of the generalized impossibility theorem goes some way towards justifying the recent popularity of discrete least-squares techniques in computing polynomial chaos expansions, for example.
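For orientation, a reading of the scaling (1.1) that is consistent with the rates derived later in the paper (quadratic in the equispaced case, cubic for nodes distributed according to the Chebyshev weight of the second kind, linear for Chebyshev nodes) – our interpretation, with γ = max{α, β} denoting the larger endpoint exponent of the weight – is

n ≍ m^{1/(2(1+γ))},   γ = max{α, β},

so that, in particular, n ≍ √m for equispaced points.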

Besides these results, in §LABEL:s:Remez we also introduce an algorithm for computing the maximal polynomial for an arbitrary set of nodes. This algorithm, which is based on a result of Schönhage [schonhagepolyinterp], is a variant of the classical Remez algorithm for computing best uniform approximations (see, for example, [PachonTrefethenRemez, powell]). We use this algorithm to present numerical results in the paper confirming our various theoretical estimates.
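The Remez-type method itself is described later in the paper. As a cross-check, the quantity it computes can also be estimated by brute force, since for each fixed evaluation point the largest attainable value of a polynomial bounded at the nodes is the solution of a linear program. The sketch below takes this route; it is our own illustration (names and parameters are arbitrary) and is far less efficient than a Remez iteration.

```python
import numpy as np
from numpy.polynomial.chebyshev import chebvander
from scipy.optimize import linprog

def max_poly_bound(nodes, n, xx):
    """Estimate B(m, n) = sup{ ||p||_inf : deg p <= n, |p(x_j)| <= 1 at the nodes }."""
    V = chebvander(nodes, n)                    # row j: T_0(x_j), ..., T_n(x_j)
    A_ub = np.vstack([V, -V])                   # encode |p(x_j)| <= 1
    b_ub = np.ones(2 * len(nodes))
    best = 1.0
    for x in xx:
        c = -chebvander(np.array([x]), n)[0]    # maximise p(x), i.e. minimise -p(x)
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(None, None))
        if res.success:
            best = max(best, -res.fun)
    return best

nodes = np.linspace(-1.0, 1.0, 41)              # m + 1 = 41 equispaced points
print(max_poly_bound(nodes, n=12, xx=np.linspace(-1.0, 1.0, 501)))
```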

Finally, let us note that one particular consequence of our results is a confirmation of a popular heuristic for polynomial least-squares approximation (see, for example, [boyd2009divergence]). Namely, the number of nonequispaced nodes required to stably recover a polynomial approximation of degree n is of the same order as the number of nodes required for there to exist a subset of n+1 nodes which mimics the distribution of the n+1 Chebyshev nodes. In Proposition LABEL:p:interlace we show that the same sufficient condition for boundedness of the maximal polynomial also implies the existence of a subset of the original nodes which interlaces the Chebyshev nodes. In particular, for nodes that are equidistributed according to a modified Jacobi weight function one has this interlacing property whenever the same scaling of the number of nodes with the degree holds, which is identical to the necessary and sufficient condition (1.1) for stability of the least-squares approximation.

2 Preliminaries

Our focus in this paper is on functions defined on compact intervals, which we normalize to the interval [−1,1]. Unless otherwise stated, all functions will be complex-valued, and in particular, polynomials may have complex coefficients. Throughout, {x_j}_{j=0}^{m}, with −1 = x_0 < x_1 < ⋯ < x_m = 1, will denote the m+1 nodes at which a function is sampled. We include both endpoints in this set for convenience. All results we prove remain valid (with minor alterations) for the case when either or both endpoints are excluded.

We require several further pieces of notation. Where necessary throughout the paper, n will denote the degree of a polynomial, and P_n the set of polynomials of degree at most n. We write ‖f‖_∞ for the uniform norm of a function f on [−1,1] and ‖f‖_{m,∞} = max_{j=0,…,m} |f(x_j)| for the discrete uniform semi-norm of f on the grid of points. We write ⟨·,·⟩ for the Euclidean inner product on C^{m+1} and ‖·‖ for the Euclidean norm. Correspondingly, we let ⟨·,·⟩_m and ‖·‖_m be the discrete semi-inner product and semi-norm on the grid respectively.

We will say that a sequence {c_m} of nonnegative numbers converges to zero exponentially fast if c_m = O(ρ^{−m^τ}) for some ρ > 1 and τ > 0. If τ = 1 then we say the convergence is geometric, and if τ < 1 or τ > 1 then it is subgeometric or supergeometric respectively. When τ = 1/2 we also refer to this convergence as root-exponential. Given two nonnegative sequences {a_m} and {b_m}, we write a_m ≍ b_m as m → ∞ if there exist constants c₁, c₂ > 0 such that c₁ b_m ≤ a_m ≤ c₂ b_m for all large m. Finally, we will on occasion use the notation a ≲ b to mean that there exists a constant c > 0, independent of all relevant parameters, such that a ≤ c b.

2.1 The impossibility theorem for equispaced points

We first review the impossibility theorem of [TrefPlatteIllCond]. Let {x_j}_{j=0}^{m} be a grid of m+1 equispaced points in [−1,1] and suppose that {F_m} is a family of mappings such that F_m(f) depends only on the values of f on this grid. We define the (absolute) condition numbers κ(F_m) as

(2.1)
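For reference, the absolute condition number of [TrefPlatteIllCond] can be written as follows (our notation; the outer supremum is taken over the admissible functions f):

κ(F_m) = sup_f lim_{δ→0⁺} sup_{h ∈ C([−1,1]), 0 < ‖h‖_∞ ≤ δ} ‖F_m(f + h) − F_m(f)‖_∞ / ‖h‖_∞.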

Suppose that E ⊂ ℂ is a compact set. We write B(E) for the Banach space of functions that are continuous on E and analytic in its interior, with norm ‖f‖_E = sup_{z∈E} |f(z)|.

Theorem 2.1 ([TrefPlatteIllCond]).

Let E be a compact set containing [−1,1] in its interior and suppose that {F_m} is an approximation procedure based on equispaced grids of m+1 points such that, for some C > 0, ρ > 1 and 1/2 ≤ τ ≤ 1, we have

‖f − F_m(f)‖_∞ ≤ C ρ^{−m^τ} ‖f‖_E

for all f ∈ B(E). Then the condition numbers (2.1) satisfy

κ(F_m) ≥ σ^{m^{2τ−1}}

for some σ > 1 and all large m.

Specializing to τ = 1, this theorem states that exponential convergence at a geometric rate implies exponential ill-conditioning at a geometric rate. Conversely, stability of any method is only possible when τ = 1/2, corresponding to at best root-exponential convergence in m.

2.2 Coppersmith & Rivlin’s bound

The proof of Theorem 2.1, although it does not pertain to polynomials or polynomial approximation specifically, relies on a result of Coppersmith & Rivlin on the maximal behaviour of polynomials bounded on an equispaced grid. To state this result, we first introduce the notation

(2.2)
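One standard way to write the quantity introduced in (2.2), in the notation assumed above (nodes x_0, …, x_m and polynomials P_n of degree at most n), is

B(m,n) = sup{ ‖p‖_∞ : p ∈ P_n, max_{j=0,…,m} |p(x_j)| ≤ 1 }.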

Note that in the special case n = m, this is just the Lebesgue constant of polynomial interpolation in the given nodes, i.e. the uniform-norm operator norm of the map sending a function to its polynomial interpolant of degree m at the nodes.

Theorem 2.2 ([copprivlinpolygrowth]).

Let {x_j}_{j=0}^{m} be an equispaced grid of m+1 points in [−1,1] and suppose that √m ≲ n ≤ m. Then there exist constants b ≥ a > 1 such that

a^{n²/m} ≤ B(m,n) ≤ b^{n²/m}.

Two key implications of this result are as follows. First, a polynomial of degree n bounded on m+1 equispaced points can grow at most exponentially large in n²/m in between those points. Second, one needs quadratically many equispaced points, i.e. m ≍ n², in order to prohibit growth of the maximal behaviour of a polynomial of degree n bounded on an equispaced grid. We remark in passing that when n = m, so that B(m,m) is the Lebesgue constant, one also has the well-known estimate for large m (see, for example, [TrefethenATAP, Chpt. 15]).
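For completeness, the classical asymptotic in question (our insertion; see [TrefethenATAP, Chpt. 15]) reads

Λ_m ∼ 2^{m+1} / (e m log m),   m → ∞,

where Λ_m denotes the Lebesgue constant of interpolation in m+1 equispaced points.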

  • Sufficiency of the scaling m ≍ n² is a much older result than Theorem 2.2, dating back to work of Schönhage [schonhagepolyinterp] and Ehlich & Zeller [EhlichZeller1, EhlichZeller2] in the 1960s. Ehlich also proved unboundedness of B(m,n) if m = o(n²) as n → ∞ [EhlichPoly]. More recently, Rakhmanov [RakhmanovPolyBds] has given a very precise analysis of not just B(m,n) but also the pointwise quantity B(m,n,x) = sup{ |p(x)| : p ∈ P_n, max_{j} |p(x_j)| ≤ 1 }, −1 ≤ x ≤ 1.

    2.3 Discrete least-squares fitting

    A simple algorithm that attains the bounds implied in Theorem 2.1 is discrete least-squares fitting with polynomials:

    F_{m,n}(f) := argmin_{p ∈ P_n} Σ_{j=0}^{m} |f(x_j) − p(x_j)|².    (2.3)

    Here n ≤ m is a parameter which is chosen to ensure specific rates of convergence. The following result determines the conditioning and convergence of this approximation (note that this result is valid for arbitrary sets of points, not just equispaced):

    Proposition 2.4.

    Let {x_j}_{j=0}^{m} be a set of m+1 distinct points in [−1,1] and suppose that n ≤ m. If f ∈ C([−1,1]) and F_{m,n}(f) is as in (2.3), then the error satisfies

    ‖f − F_{m,n}(f)‖_∞ ≤ (1 + κ(F_{m,n})) inf_{p ∈ P_n} ‖f − p‖_∞,

    where κ(F_{m,n}) is the condition number of F_{m,n}. Moreover,

    B(m,n) ≤ κ(F_{m,n}) ≤ √(m+1) B(m,n),    (2.4)

    where B(m,n) is as in (2.2). Furthermore, if n = m, i.e. F_{m,m}(f) is the polynomial interpolant of f of degree m, then

    κ(F_{m,m}) = Λ_m,    (2.5)

    where Λ_m = B(m,m) is the Lebesgue constant.

    Although this result is well known, we include a short proof for completeness:

    Proof.

    Since the points are distinct and n ≤ m, the least-squares solution exists uniquely. Notice that the mapping f ↦ F_{m,n}(f) is linear and a projection onto P_n. Hence

    and consequently we have

    It remains to estimate the condition number. Since is a polynomial, it follows that

    (2.6)

    Now observe that

    Since is the solution of a discrete least-squares problem it is a projection with respect to the discrete semi-inner product . Hence . Combining this with the previous estimate gives the upper bound . For the lower bound, we use (2.6) and the fact that is a projection to deduce that

    This completes the proof of (2.4). For (2.5) we merely use the definition of the Lebesgue constant. ∎
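    To connect Proposition 2.4 with computation: the infinity-norm condition number of the least-squares map can be evaluated numerically as the maximum, over a fine evaluation grid, of the ℓ¹ norm of the rows of the operator taking sample values to function values. The sketch below is our own illustration of this (all names and grids are our choices).

```python
import numpy as np
from numpy.polynomial.chebyshev import chebvander

def ls_condition_number(nodes, n, xx=None):
    """Infinity-norm condition number of degree-n discrete least squares on the given nodes."""
    if xx is None:
        xx = np.linspace(-1.0, 1.0, 4001)
    A = chebvander(nodes, n)                    # (m+1) x (n+1) Chebyshev Vandermonde matrix
    S = np.linalg.pinv(A)                       # maps sample values -> Chebyshev coefficients
    G = chebvander(xx, n) @ S                   # maps sample values -> values on the fine grid
    return np.max(np.sum(np.abs(G), axis=1))    # sup_x of the Lebesgue-type function

nodes = np.linspace(-1.0, 1.0, 101)
for n in [5, 10, 20, 40]:
    print(n, ls_condition_number(nodes, n))
```

    For equispaced nodes one should observe rapid growth of the printed values once the degree exceeds roughly the square root of the number of points, consistent with (2.4) and Theorem 2.2.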

    2.4 Examples of nonequispaced points

    To illustrate our main results proved later in the paper, we shall consider points that are equidistributed with respect to so-called modified Jacobi weight functions. These are defined as

    μ(x) = g(x) (1 − x)^α (1 + x)^β,    (2.7)

    where α, β > −1 and g satisfies c₁ ≤ g(x) ≤ c₂ almost everywhere for some constants c₁, c₂ > 0. Throughout, we assume the normalization

    ∫_{−1}^{1} μ(x) dx = 1,

    in which case the points x₀ < x₁ < ⋯ < x_m are defined implicitly by

    ∫_{−1}^{x_j} μ(x) dx = j/m,   j = 0, …, m.    (2.8)

    A special case of the modified Jacobi weight functions are the ultraspherical weight functions, defined as

    μ(x) = c_λ (1 − x²)^λ,    (2.9)

    for λ > −1, where c_λ is the normalization constant. Within this subclass, we shall consider a number of specific examples:

    • (U) The uniform weight function, λ = 0, corresponding to the equispaced points x_j = −1 + 2j/m.

    • (C1) The Chebyshev weight function of the first kind, λ = −1/2, corresponding to the Chebyshev points. Note that these points are roughly equispaced near the centre of the interval and quadratically spaced near the endpoints ±1. That is, the spacing near ±1 is of order m^{−2}.

    • (C2) The Chebyshev weight function of the second kind, λ = 1/2. Note that the corresponding points are roughly equispaced near the centre of the interval, but are sparse near the endpoints. In particular, the spacing near ±1 is of order m^{−2/3}.

    Recall that for (U) one requires a quadratic scaling m ≍ n² to ensure stability. Conversely, for (C1) any linear scaling of m with n suffices (see Remark 3.2). Since the points (C2) are so poorly distributed near the endpoints, we expect, and it will turn out to be the case, that a more severe scaling than quadratic is required for stability in this case.

    We shall also consider two further examples:

    • (UC) An under-clustered weight, with −1/2 < λ < 0. The corresponding points cluster at the endpoints, although not quadratically. In particular, the spacing near ±1 lies between m^{−1} and m^{−2}.

    • (OC) An over-clustered weight, with −1 < λ < −1/2. The corresponding points overcluster at the endpoints: the spacing near ±1 decays faster than m^{−2}.

    We expect (UC) to require a superlinear scaling of m with n for stability, although not as severe as the quadratic scaling in the case of (U). Conversely, in (OC) it transpires that linear scaling suffices, but unlike the case of (C1), the scaling factor c (where m = cn) must be sufficiently large.
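    To make these node families concrete, the sketch below (ours; the normalization follows (2.8) and the parameter name lam is our choice) generates points equidistributed with respect to an ultraspherical weight by numerically inverting its cumulative distribution function.

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq

def ultraspherical_nodes(m, lam):
    """m+1 points on [-1,1] equidistributed w.r.t. mu(x) = c*(1-x^2)**lam (normalized)."""
    w = lambda x: (1.0 - x * x) ** lam
    total, _ = quad(w, -1.0, 1.0)
    cdf = lambda x: quad(w, -1.0, x)[0] / total
    # x_j solves cdf(x_j) = j/m, with both endpoints included.
    xs = [-1.0]
    xs += [brentq(lambda x, t=j / m: cdf(x) - t, -1.0, 1.0) for j in range(1, m)]
    xs.append(1.0)
    return np.array(xs)

print(ultraspherical_nodes(10, lam=0.5))   # case (C2): sparse near the endpoints
```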

    3 Maximal behaviour of polynomials bounded on arbitrary grids

    We now seek to estimate the maximal behaviour of a polynomial of degree n that is bounded at the arbitrary nodes x₀ < x₁ < ⋯ < x_m. As in §2.2, we define

    (3.1)

    Once again we note that B(m,m) is the Lebesgue constant of polynomial interpolation.

    3.1 Lower bound for B(m,n)

    Let be the zeros of the Chebyshev polynomial , given by

    (3.2)
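    For a Chebyshev polynomial of degree n, the standard expression for its zeros (indexing assumed) is

    y_k = cos( (2k − 1)π / (2n) ),   k = 1, …, n,

    so that the y_k cluster quadratically near ±1.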

    The following theorem, our first main result of the paper, is a generalization of the lower bound of Coppersmith & Rivlin (Theorem 2.2) to arbitrary nodes:

    Theorem 3.1.

    Let and and suppose that and are the largest integers such that

    (3.3)

    and

    (3.4)

    respectively, where the are as in (3.2). If no such integer exists, set or respectively. If is given by (3.1) and , then

    where

    (3.5)

    If then we have the trivial bound .

    Proof.

    Suppose that without loss of generality. Let be the Chebyshev polynomial and define by

    Figure 1 illustrates the behaviour of . We first claim that for . Clearly, for we have . Suppose now that . Then, since ,

    By definition, we have for . Also, by (3.3),

    and therefore

    For (3.3) gives that . Also, since and we have

    Recall that for . Hence

    and therefore

    This completes the proof of the claim.

    We now wish to estimate from below. Following Figure 1, we choose the point midway between the endpoint and the leftmost node . Since this gives

    Notice that for and therefore for by (3.3). Hence

    (3.6)

    where in the second step we use (3.3) and the fact that and . Note that

    and that

    Hence we deduce that

    from which the result now follows. ∎

    Figure 2 shows the growth of , and the norm of the polynomial used to prove Theorem 3.1. In these examples, the nodes were generated using the density functions (C2), (U) and (UC). In all cases the polynomial degree was chosen as . Notice that the exponential growth rate of is well estimated by , while both quantities underestimate the rate of growth of .

    Figure 1: Plots of the polynomial used in the proof of Theorem 3.1 for two sets of points. In both cases the points were generated with – case (C2) in §2.4. For reference, the lower bound is also included.
    Figure 2: Semi-log plot of the growth of , , and for several values of and three node density functions described in §2.4: (C2), (U) and (UC). In all cases . The computation of the quantity is described in §LABEL:s:Remez.

    3.2 Lower bound for modified Jacobi weight functions

    Theorem 3.1 is valid for arbitrary sets of points . In order to derive rates of growth, we now consider points equidistributed according to modified Jacobi weight functions. For this, we first recall the following standard bounds for the Gamma function:

    (3.7)
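    One pair of standard Stirling-type bounds of the kind referred to here (our insertion) is

    √(2π) x^{x−1/2} e^{−x} ≤ Γ(x) ≤ e x^{x−1/2} e^{−x},   x ≥ 1,

    with equality on the right at x = 1.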
    Corollary 3.2.

    Let , where and almost everywhere. If then there exist constants and depending on and such that

    for all .

    Proof.

    By definition, the points satisfy

    Without loss of generality, suppose that . Let be such that . Then

    and therefore

    (3.8)

    We now apply Theorem 3.1 and (3.7) to get

    (3.9)

    where and is such that for .

    We next estimate . From (3.8) we find that provided

    or equivalently

    where is some constant. Hence we require . Let be a constant whose value will be chosen later. Note that the result holds trivially if . Hence we now assume that . Set

    and notice that , as required. Notice also that and . Substituting this value for into (3.9) and using these inequalities gives

    Since can be chosen arbitrarily, we now select sufficiently small so that the term in brackets is strictly greater than one. This gives for some . Therefore for suitable and all sufficiently large . To complete the proof we once again recall that the result holds trivially whenever for some constant . ∎

    This result shows that if for some then the maximal polynomial grows at least exponentially fast with rate

    In particular, the scaling m ≳ n^{2(1+γ)}, where γ = max{α, β}, is necessary for boundedness of the maximal polynomial. In §5.1 we will show that this rate is also sufficient.

    It is informative to relate this result to several of the examples introduced in §2.4. First, if α = β = 0, i.e. case (U), we recover the lower bound of Theorem 2.2. Conversely, for (C2) we have γ = 1/2, and thus the cubic scaling m ≳ n³ is necessary for stability. Finally, for the points (UC) the necessary scaling is superlinear but subquadratic in n. Note that Corollary 3.2 says nothing about the case γ = −1/2. We discuss this case further in Remark 3.2.

    The case of linear oversampling (m = cn for fixed c ≥ 1) warrants closer inspection:

    Corollary 3.3.

    Let , and be as in Corollary 3.2 and . Then

    In particular, the Lebesgue constants satisfy

    Figure 3: The growth of as a function of for the node densities (C2), (UC), and (OC). In each case is plotted for , , and .

    In other words, whenever the points cluster more slowly than quadratically at one of the endpoints (recall that γ = max{α, β} > −1/2 in the above result), linear oversampling (including the case m = n, i.e. interpolation) necessarily leads to exponential growth of the maximal polynomial at a geometric rate. Figure 3 illustrates this for the cases (C2) and (UC). Interestingly, the case (OC), although not covered by this result, also exhibits exponential growth, provided the oversampling factor is below a particular threshold.

    • In this paper we are primarily interested in lower bounds for B(m,n), since this is all that is required for the various impossibility theorems. However, in the case of ultraspherical weight functions an upper bound can be derived from results of Rakhmanov. In [RakhmanovPolyBds, Thm. 1(a)] it is stated that any polynomial of degree n which is bounded on a set of points which are equispaced with respect to the ultraspherical weight function (2.9) satisfies the following. (Rakhmanov's result excludes the endpoints from this set, whereas in our results we include these points. However, as noted earlier, our main theorems would remain valid, with minor changes, if these points were excluded.)

      |p(x)| ≤ C for all x in a subinterval of [−1, 1] excluding neighbourhoods of the endpoints, where C is a constant independent of n and m and depending only on the weight parameter. Remez’ inequality now gives that

      where is the Chebyshev polynomial. Suppose that . Then

      for some . Using the definition of , we obtain

      for and , where

      The exponent is exactly the same as in the lower bound for (Corollary 3.2). In other words, the two-sided bounds of Coppersmith & Rivlin (see Theorem 2.2) for equispaced nodes extend to nodes equidistributed according to ultraspherical weight functions.

    • Corollaries 3.2 and 3.3 do not apply to the nodes (C1). It is, however, well-known (see, for example, [TrefethenATAP, Thm. 15.2]) that

      in this case. Furthermore, a classical result of Ehlich & Zeller [EhlichZeller1] gives that

      where

      and , . (Similar to the previous footnote, this excludes the endpoints ±1.) In particular, this result implies boundedness of B(m,n) in the case of linear oversampling, i.e. m = cn for any c > 1.

    4 An extended impossibility theorem

    For ρ > 1, let E_ρ be the Bernstein ellipse with parameter ρ. That is, the ellipse with foci at ±1 and semiminor and semimajor axis lengths summing to ρ. Given a domain Ω, we let A(Ω) be the set of functions that are analytic on Ω. The following lemma – whose proof follows the same ideas as that of Theorem 2.1 – shows how the condition number of an exponentially convergent method for approximating analytic functions can be bounded in terms of the quantity B(m,n) for suitable n.
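    In symbols (our insertion, using the usual parametrization), the Bernstein ellipse with parameter ρ > 1 is

    E_ρ = { (z + z^{−1})/2 : z ∈ ℂ, |z| = ρ },

    whose semiminor and semimajor axes have lengths (ρ − ρ^{−1})/2 and (ρ + ρ^{−1})/2 respectively; these indeed sum to ρ.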

    Lemma 4.1 (Abstract impossibility lemma).

    Given points , let be an approximation procedure such that depends only on the values for any . Suppose that

    (4.1)

    for all f and m, where E is a compact set containing [−1,1] in its interior, and the remaining constants are independent of m and f. If is such that

    where is such that the , then the condition number defined in (2.1) satisfies

    where is as in (3.1).

    Proof.

    Let . Since is entire (4.1) gives

    Also, due to a classical result of Bernstein [BernsteinPolyBound, Sec. 9] (see also [TrefPlatteIllCond, Lem. 1]), one has . Hence

    It now follows that