Quadrature as a least-squares and minimax problem

Quadrature as a least-squares and minimax problem

Mário M. Graça Departamento de Matemática, Instituto Superior Técnico, Universidade Técnica de Lisboa, Av. Rovisco Pais, 1049–001 Lisboa, Portugal.
Abstract

The vector of weights of an interpolatory quadrature rule with preassigned nodes is shown to be the least-squares solution of an overdetermined linear system here called the fundamental system of the rule. It is established the relation between and the minimax solution of the fundamental system, and shown the constancy of the -norms of the respective residual vectors which are equal to the principal moment of the rule. Associated to and we define several parameters, such as the angle of a rule, in order to assess the main properties of a rule or to compare distinct rules. These parameters are tested for some Newton-Cotes, Fejér, Clenshaw-Curtis and Gauss-Legendre rules.

Key-words: Quadrature rule, least-squares solution, minimax solution, principal moment.

MSC2010: 65D30, 65D32, 65F20, 65D05.

1 Introduction

We establish several connections between the computation of the weights of an interpolatory quadrature rule with given nodes and the least-squares and minimax solutions of a certain inconsistent linear system of equations in unknowns. This system, hereafter called the fundamental system of the rule, is obtained by applying the undetermined coefficients method to a convenient basis of polynomials, which depends on the given abscissas. We prove that the vector of weights has the property of having a constant residual norm, which is independent of the norm used, equal to the so called principal moment of the rule. General references for interpolatory quadrature rules and its applications are, for instance, [8], [17], [14], [9], [18].

In order to approximate the integral (with a nonnegative weight function, continuous and integrable on the open interval ) we assume that an ordered set of nodes is given, say (with belonging or not to the interval of integration). We consider the quadrature rule

(1)

The entries of the vector are the unknown weights of the rule. The rule (1) is assumed to be interpolatory, meaning that it is exact at least for all polynomials of degree less or equal to :

where denotes the vector space of polynomials of degree less or equal to . The rule is said to have degree of exactness (or simply degree ) if it is exact for all polynomials of degree (i.e., for all ) but it is not exact for some polynomial in .

It is well known that any interpolatory rule has at least degree and at most (see for instance [17], Ch. 7). Consequently, the respective weights can be obtained using the so called undetermined coefficients method applied to a basis of . Traditionally one considers the standard basis , with , and the undetermined coefficients method applied to this basis leads to the following overdetermined linear system of equations in unknowns

(2)

where the moments are assumed to be finite, and equal to

Since the degree of the rule is generally unknown, one cannot decide a priori what is the relevant set of equations in the system (2) (see [6], Ch. 2, for the discussion on the choice of relevant equations from an overdetermined system). Furthermore, the matrix of the system is generally dense and so is natural to ask whether there exists another basis in which the system (2) has a simpler form. In our work [15] we show that choosing a basis as in Definition 2.1, we easily replace (2) by a relevant linear system with equations in unknowns. Such system will be called here the fundamental system of the rule, since it is univocally determined by the set of nodes which in turn determines the basis . Thus, the choice of this particular basis may be seen as the right way to extract the relevant set of equations from the system (2). Otherwise (taking the standard basis of polynomials) the computation of a “solution”of (2) relies on heavy tools such as Pólya’s algorithm [19], detailed in [6] (see also [4]).

In Section 2 we study the main properties of the fundamental system associated to the rule . This system has the remarkable property of containing a triangular subsystem , from which the weights can be computed by backward substitution. The last entry of the vector contains a non null quantity which we call the principal moment of the rule. The computation of the parameter gives automatically the degree of exactness of the rule (see the work of the author [15]). Therefore, the principal moment can be seen as a kind of rule’s signature.

The main results are given in Propositions 2.1 and 2.2. The first proposition shows that the least-squares solution of the fundamental system coincides with the vector of weights , and in Proposition 2.2-(a) we prove that the minimax solution of the fundamental system differs from the least-squares solution by a correction vector . This correction (and so the minimax solution) is recursively computed by solving a certain triangular system.

Several parameters related to the least-squares and minimax solutions are defined in order to preview the asymptotic behavior of a rule or to compare distinct rules. For a fixed number of nodes, the first two main quantities to be considered are respectively and . Other interesting parameter is the angle between and , which we call the angle of the rule.

Let us now explain the heuristic behind the choice of these parameters. The most remarkable property is that and have residual vectors and of constant norms (see Proposition 2.2 (b)) in the following sense

This property allows to preview that a convergent rule should have their parameters and aproaching zero as goes to . Moreover, the angle of a rule should also be close to zero for sufficiently large. As observed in Section 3, the behavior of the norm parameters and for the Newton-Cotes rules do not decrease as increases (see Figure 1), and so one concludes that the Newton-Cotes rules cannot be convergent.

In Section 3 we test and compare the behavior of several parameters for the Fejér (F), Clenshaw-Curtis (CC) and Gauss-Legendre rules (GL), for varying from to . We give numerical and graphical evidence that the Fejér and Clenshaw-Curtis rules have an angle close to the angle of Gauss-Legendre rule. This is a plausible explanation for a phenomenon observed by several authors, in particular by Lloyd N. Trefethen in [20]: in spite of its low degree of exactness, for large , the CC rule is almost as accurate as GL.

The least-squares/minimax approach to interpolatory quadrature rules gives a great deal of information to assess the main properties of a rule, with no explicit need to appeal to interpolation theory. Indeed one can deduce an error expression, (see Proposition 2.1 (c)), in the case the integrand function is assumed to be sufficiently smooth. Moreover we do not need to make any assumptions on the smoothness of the integrand function (apart from the existence of ) to compute the principal moment of a rule as well as other relevant parameters such as the residual norms or the angle of the rule.

In conclusion, the minimax/least-squares approach to interpolatory quadrature here discussed gives rise to interesting analytic and geometric aspects opening new research directions. In particular, we think it is worth to further explore the role of the referred parameters on the fundamental questions of the convergence and accuracy of an interpolatory rule. Another direction of research is to relate the least-squares/minimax approach with the available methods for simultaneous computation of nodes and weights using Jacobi matrices (see [13] and [14], Ch. 10).

2 The fundamental system of a rule

Given an ordered set of nodes we construct a polynomial basis , defined in (3) and (4), dependent on the nodes. The undetermined coefficients method is then applied to leading to an overdetermined system , with equations in unknowns.

Definition 2.1.

(Canonical basis)

Given a set of ordered nodes , the polynomials

(3)

(where the polynomial has degree ) is a basis of .This basis is here called the canonical basis associated to .

To form a basis of (where will be specified in accordance with the particular rule under consideration), we complete the canonical basis with the polynomials , , , defined by

(4)
Remark 2.1.

Note that for any , the zeros of the polynomial are exactly . Also the nodes are zeros of each polynomial given in (4), for any .

The canonical basis and the extended basis have been used in a previous work for the simultaneous computation of the degree of a rule and its weights via the undetermined coefficients method. We recall now the main result in [15].

Theorem 2.1.

[15]

Let be the set of given nodes, , and consider the basis ,, of , and the basis , , of , where and are defined in (3) and (4). Denote by the moments , .

  • The undetermined coefficient method applied to the basis determines (uniquely) the weights of the rule (1) for approximating the integral . These weights are

    (5)
  • The undetermined coefficient method applied to the basis , determines the degree of exactness as being the integer for which

  • If is of class , the error expression of the rule is

    (6)

    for some .

For any polynomial of the basis one has . Consequently, the undetermined coefficients method applied to the basis leads always to an overdetermined system of equations and unknowns, of triangular type, which we call the fundamental system of the rule. We denote this system by emphasizing that the system is univocally determined by the set of nodes and its associated basis . In matricial form, the fundamental system is given by

(7)

where and . The entries in are , , . Since and , , the matrix has rank , and so the recursive scheme (5) holds.

2.1 The least-squares solution of the fundamental system

The fundamental system (7) can be written in the following form

(8)

where is the (upper triangular) submatrix of obtained by deleting its last row.

It is easy to show that the solution is a least-squares solution of the fundamental system (7). Indeed, the least-squares solution of (8) is the solution of the normal system , and

and

Then,

(9)

As the system has a unique solution, this solution coincides with its least-squares solution. So, from (9) it follows that the weights’s vector coincides with the least-squares solution of the fundamental system (8).

Recall that the -norm of a vector is defined as

The limiting case is the norm .

The next result gives an essential property of the residual vector at the least-squares solution of the fundamental system .

Proposition 2.1.

Consider the interpolatory rule (1) and its fundamental system (8).

(a) The vector of the weights of the rule is the least-squares solution of the fundamental system.

(b) For any -norm, the residual vector at the least-squares solution, , satisfies the equalities

(10)
Proof.

It remains to show the equalities in (10). Since

the result follows trivially by definition of the -norm. ∎

Remark 2.2.

The -norms are monotone in the sense that (see [6] p. 43), thus the equalities observed in (10) are not verified in general. As referred before, is equivalent to the system (2) which arises when the undetermined coefficient system is applied by brute force to the standard polynomial basis . So, the equations in the fundamental system identify the relevant equations of the system (2), capturing the essential characteristics of the particular quadrature rule, namely its degree. Furthermore its residual vector satisfies the exceptional conditions given in Proposition 2.1.

2.2 The minimax solution of the fundamental system

It is well known that for a general overdetermined linear system a minimax solution is not easy to compute. However, when the system is a full rank system of equations in unknowns its minimax solution can be easily obtained from the least-squares solution. We start by stating a result in [6] which relates the minimax and the least-squares solutions of this type of systems.

In what follows, given a system of equations in unknowns , we denote by the residual vector .

Theorem 2.2.

([6], p. 41) Let be the least-squares solution of a system of linear equations in unknowns: . Assume that the system is of rank . Then, the minimax solution is the exact solution of the system , where and

(11)

We now apply Theorem 2.2 to the fundamental system given in (8), in order to obtain its minimax solution. The components of the residual vector at the least-squares solution are

Assuming the convention , we define the following vector of signs,

By Theorem 2.2 the minimax solution of the fundamental system is the (unique) solution of the system . That is, the solution of the system

(12)

Solving for the last equation of (12), we conclude that the required positive value is simply the magnitude of the moment , thus

as it should be by (11) in Theorem 2.2. Substituting the value of in the first equations of (12) we obtain the following (upper triangular) system:

The minimax solution of the system is then the unique solution of the system . So, satisfies

(13)

On the other hand, the least-squares solution of the fundamental system satisfies . Subtracting this equation from (13), we obtain

We remark that at the minimax solution all the residuals have the same magnitude , since by (13) we have . So,

Furthermore, by Proposition 2.1-(b) it follows that , for any -norm. Thus, .

The next proposition summarizes what we have just proved.

Proposition 2.2.

The fundamental system (8) has the following properties:

(a) The minimax solution is equal to , where is the least-squares solution of the system. The vector is the solution of the following upper triangular system

(14)

and is the principal moment of the rule.

(b) The residual vectors at the minimax and least-squares solutions satisfy

(15)
Example 2.1.

(Simpson’s rule)

Consider the integral . The Simpson’s rule is the most celebrated quadrature rule which uses equally spaced nodes, say . So,

The polynomials defining the canonical basis are: , , , and . The respective moments are , , , , .

Since is the first non null moment with index greater than , it follows from Theorem 2.1-(b) that this rule has degree . The fundamental system for Simpson’s rule is

The weights are easily computed (recursively) by solving the triangular subsystem :

Thus,

Of course, by changing the domain of integration from to one gets the traditional form of this rule: , where .

One can easily confirm the result of Proposition 2.1-(a) by solving the normal system , that is,

Let us now compute the minimax solution from the least-squares solution . By Proposition 2.2-(a), the minimax solution of the fundamental system satisfies , where is the solution of . So, the vector is the solution of the system

giving . Thus, the minimax solution is . One can easily confirm the result of Proposition 2.2-(b) on the residual norm:

The Simpson’s rule angle, that is the angle between its least-squares and minimax solutions, is

In paragraph 3.1.1 we pursue with other closed Newton-Cotes rules with a greater number of nodes.

3 Some relevant parameters

The properties of the least-squares and the minimax solutions of the fundamental system of an interpolatory rule , discussed in the previous section, suggest the study of several features of a quadrature rule by considering the behavior of certain parameters associated to the fundamental system. The first question is know what are the suitable parameters for previewing the accuracy of a given rule, its convergence, or even to compare distinct rules for the same integrand and weight functions.

We define next a few relevant parameters associated to a fundamental system and we justify their relevance. In paragraph 3.1 we compute these parameters for a certain number of nodes using some classical quadrature rules. To which extend these parameters classify a quadrature rule (or distinct rules), and the quest for other useful parameters within this least-squares/minimax setup, will deserve further studies.

To emphasize the dependence on whenever needed we index by the quantities. As shown in Propositions 2.1 and 2.2-(b), for each , we have

(16)

These equalities reveal that the “quality”of the least-squares and minimax solutions is measured by the principal moment of the rule. In fact, the smaller is the principal moment the lesser is the magnitude of the residual vectors and at the least-squares and minimax solutions respectively.

It is well known that an interpolatory rule is convergent if and only if satisfies the following boundness condition ([5], p. 267 or [17], Ch. 12)

(17)

This result suggests that in addition to the magnitude one should also consider the following norm parameters

(18)

Of course, in (18) one can consider any other -norm due to the equivalence of norms in .

In Figure 1 we illustrate the practical relevance of the parameters and for the Newton-Cotes and the Gauss-Legendre rules, with the number of nodes varying from to . In the graphics of Figure 1 the thicker polygonal line connects the values of (represented by the dots) whereas the lighter line connects the values of . It is clear from the respective graphics that for the Newton-Cotes rules the values of and do not decrease for , and so this rule cannot be convergent. We shall also remark that the minimax parameter attains a minimum at . This phenomenon is certainly related to the occurrence of negative weights for in the Newton-Cotes rules ([1], p. 269), however a detailed study of this phenomenon is out of the scope of this work.

Figure 1: Comparison of the norm parameters and (polygonal line) for Newton-Cotes (left) and Gauss-Legendre rules (right), with .

If a rule is convergent, that is , then by the error formula (6) in Theorem 2.1, we have . So, by (16) we get

Then, in the case of a convergent rule, one should also have

One way of measuring the closeness of and is through what we call the angle of a rule , which we define as the angle between the minimax solution and least-squares solution :

(19)

where denotes the standard inner product in . From what we said before, a rule whose angle does not decreases from a certain number of nodes onwards cannot be a convergent rule. This is the case of Newton-Cotes rules (see paragraph 3.1.1).

In Table 1 we compare the values of some parameters for Newton-Cotes (NC), Clenshaw-Curtis (CC), Fejér (F) and Gauss-Legendre (GL) rules, for nodes and the weight function . These rules are discussed in more detail in Section 3.1. The symbol denotes the degree of the respective rule, and the error coefficient , taking into account (6), is defined in (20). The computations have been carried out in a laptop using the Mathematica system and double precision arithmetic.

Table 1: Global comparison for NC, F, CC, GL, with nodes.

In Table 1, there is a remarkable evidence for the angle of a rule to be a good measure for its convergence. In fact, Table 1 shows that the angle of the last three rules is close to zero while the angle for the Newton-Cotes rule is far from zero. This means that the minimax and the least-squares solutions are asymptotically aligned for the F, CC, and GL rules, whereas this does not happen for NC. This probably explains why in spite of its low degree of exactness, the Fejér and Clenshaw-Curtis rules have asymptotically comparable accuracies to the Gaussian rules. This surprising fact has attracted the attention of many authors, in particular after the paper [20] of Lloyd N. Trefethen, where it is shown that the CC and GL rules have close errors for sufficiently large . The angular values in Table 1 indicate that the angle of a rule deserves to be considered as (new) parameter useful to explain the previously referred phenomenon which holds for several rules: in spite of having very different degrees of exactness asymptotically the rules exhibit almost the same accuracy.

For sufficiently smooth integrand functions , Theorem 2.1-(c) gives the formula for the error in terms of the principal moment and of the degree of the rule. This justifies to consider the following parameter which we call here the coefficient of error. We define the coefficient of error of a rule , of degree , as

(20)

where denotes the principal moment of the rule. We say that the rule is better than the rule , if .

Remark 3.1.

Another reason for not using unbounded rules (i.e. those for which (17) does not hold), such as the Newton-Cotes rules, is of computational nature. When the values , have an error, say , it is easy to prove that the error in the rule satisfies

and so, one prefers a rule with a small norm .

For the sake of completeness, in the following proposition we relate the minimax and the least-squares solutions with the condition number of the (triangular) submatrix of the fundamental system.

Proposition 3.1.

Let and be respectively the least-squares and the minimax solutions of the fundamental system (8) of the rule , and its principal moment. Then,

  • (21)
  • The parameter satisfies

    (22)

    where is the condition number of the matrix for the subordinate -norm.

Proof.

From (14) in Proposition 2.2, we have

(23)

where is the vector . Then,

For (b), again by (23), we have , and so

The result follows from these inequalities. ∎

Remark 3.2.

In the inequalities (21) and (22) the influence of the nodes is present through the matrix , while the influence of the functions and is mainly captured by the principal moment of the rule. The parameter in (22) is useful to control numerical instabilities in the computation of the least-squares and minimax solutions of the fundamental system, a system which is in general ill-conditioned due to the fact of the matrix being in general almost singular for large . These issues are however out of scope of the present work.

3.1 Worked examples

In this section we compare some of the parameters defined in the previous section for the Newton-Cotes (NC), Fejér (F), Clenshaw-Curtis (CC) and Gauss-Legendre (GL) rules. All the computations were carried out in a laptop using the system Mathematica and machine precision.

3.1.1 Newton-Cotes rules

Previously (see Figure 1 and Table 1) the parameters and were tested for the Newton-Cotes rules. Here we compare the angle of these rules as well as the norm (recall that ), for a number of nodes .

Figure 2: Newton-Cotes rules: on the left the angle , expressed in degrees, and on the right the distance .

The graphics in Figure 2 show that the angle between the minimax and least-squares solutions, ), has small variation for but suddenly oscillates widely. This remarkable behavior is closely related to the undesirable numerical properties of the Newton-Cotes rules due to the occurrence of negative weights for . Moreover, from onwards the distance , between the minimax solution and the least-squares solution , increases and so the rule cannot be convergent.

3.1.2 Clenshaw-Curtis rules

The Clenshaw-Curtis (CC) [7] rules are defined in , being the points fixed nodes. For , we consider the “practical”abscissas (see [8], p. 86 ) defined by

The total number of nodes is . Let us briefly recall our least-squares/ minimax approach giving the calculations for this rule when . First, we sort the nodes in increasing order to obtain the canonical set , that is, we take , , and . The respective triangular system is

(24)

and its solution. So, the rule is

As the principal moment is this rule has degree , like Simpson’s rule. However, this rule may be considered “better”than Simpson’s rule since its moment is smaller ().

Figure 3: Clenshaw-Curtis rules: the dashed line joins the values and the other line the values .

In Figure 3 we display the values of the -norm of the least-squares solution versus the minimax for the referred CC rules. Here we observe an interesting behavior when compared with the Newton-Cotes case: the magnitude of the solutions and become closer as increases. This suggests that the Clenshaw-Curtis rules are convergent, as in fact they are.

Figure 4: Clenshaw-Curtis rules: on the left the angle and on the right .

The Figure 4 shows values of the angle of the CC rules as well the distance (in the maximum norm) between the least-squares and minimax solutions. It is observable an enormous reduction and smoothing in the referred angle in comparison with the Newton-Cotes case. Here, as the value of increases the angle reduces almost to zero, suggesting that both the solutions and are asymptotically coincident.

3.1.3 Fejér rules

The Fejér first rule [12] with nodes uses as abscissas , which are the zeros of the Chebyshev polynomial , in ([3], p. 539). For , the nodes are , and . The moments are , and

The solution of the rule’s fundamental system is . Its degree is obviously , like in the Simpson’s rule. As the principal moment of this rule, , is smaller in modulus than the principal moment of the Simpson’s rule, we can say that Fejér’s rule is better than Simpson’s rule.

Figure 5: Fejér rules: the dashed line connects the values of