Fast Algorithms for the Multidimensional
Jacobi Polynomial Transform
Abstract
We use the well-known observation that the solutions of Jacobi's differential equation can be represented via nonoscillatory phase and amplitude functions to develop a fast algorithm for computing multidimensional Jacobi polynomial transforms. More explicitly, it follows from this observation that the matrix corresponding to the discrete Jacobi transform is the Hadamard product of a numerically low-rank matrix and a multidimensional discrete Fourier transform (DFT) matrix. The application of the Hadamard product can be carried out via fast Fourier transforms (FFTs), resulting in a nearly optimal algorithm to compute the multidimensional Jacobi polynomial transform.
Keywords. Multidimensional Jacobi polynomial transform, nonuniform transforms, nonoscillatory phase function, randomized low-rank approximation, fast Fourier transform.
1 Introduction
The one-dimensional forward discrete Jacobi transform consists of evaluating an expansion of the form
(1) $f(x) = \sum_{\nu=0}^{n-1} a_\nu\, P_\nu^{(\alpha,\beta)}(x)$
where $P_\nu^{(\alpha,\beta)}$ denotes the order-$\nu$ Jacobi polynomial of the first kind corresponding to the parameters $\alpha$ and $\beta$, at a collection of points $x_1, \dots, x_n$. The one-dimensional inverse discrete Jacobi transform is the process of computing the coefficients $a_0, \dots, a_{n-1}$ in an expansion of the form (1) given its values at a collection of $n$ distinct points. For the sake of brevity, we will generally drop the adjective "discrete" and simply use the terms "forward Jacobi transform" and "inverse Jacobi transform" when referring to these operations. Often, the points are the nodes of the $n$-point Gauss-Jacobi quadrature rule
(2) $\int_{-1}^{1} p(x)\,(1-x)^{\alpha}(1+x)^{\beta}\,dx \approx \sum_{j=1}^{n} p(x_j)\, w_j$
that is exact when $p$ is a polynomial of degree less than or equal to $2n-1$. However, it is useful to consider more general sets of points as well. In the case in which the points are the nodes of the Gauss-Jacobi quadrature rule, we say the corresponding transform and inverse transform are uniform; otherwise, we describe them as nonuniform. To form an orthonormal matrix representing the transform for numerical purposes, we will later rescale the transform in (1) with quadrature weights.
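As a concrete illustration of the exactness property of the Gauss-Jacobi rule, the following Python sketch compares two rules of different sizes on the same monomial; since both rules are exact for polynomials of degree up to $2n-1$, they must agree to machine precision. The use of `scipy.special.roots_jacobi` and the particular parameter values are our own illustrative choices, not part of the algorithm of this paper.

```python
import numpy as np
from scipy.special import roots_jacobi

# n-point Gauss-Jacobi rule for the weight (1-x)^a (1+x)^b on [-1, 1].
a, b = 0.25, -0.3
x8,  w8  = roots_jacobi(8,  a, b)
x20, w20 = roots_jacobi(20, a, b)

# Both rules are exact for polynomials of degree <= 2n-1, so they must
# agree on p(x) = x^5 to machine precision.
val8  = np.dot(w8,  x8**5)
val20 = np.dot(w20, x20**5)
```

The agreement of `val8` and `val20` is the discrete analogue of the statement that (2) holds exactly for low-degree polynomials.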
Many methods for rapidly applying the Jacobi transform, and various special cases of it, have been developed. Examples include the algorithms of [1, 2, 3, 4] for applying the Legendre transform, and those of [5] for forming and evaluating expansions in general Jacobi polynomials. See also [6] and its references for extensive information on numerical algorithms for forming and manipulating Chebyshev expansions. Almost all such algorithms can be placed into one of two categories. Algorithms in the first category, such as [7, 8, 9, 10, 11], make use of the structure of the connection matrices which take the coefficients in the expansion of a function in terms of one set of Jacobi polynomials to the coefficients in the expansion of the same function with respect to a different set of Jacobi polynomials. They typically operate by computing the Chebyshev coefficients of an expansion and then applying a connection matrix, or a series of connection matrices, to obtain the coefficients in the desired expansion. The computation of the Chebyshev expansion can be carried out efficiently in $O(n \log n)$ operations via the nonuniform FFT (NUFFT) [12, 13], and the application of each connection matrix can be performed in a number of operations which grows linearly or quasilinearly with $n$ via the fast multipole method (FMM) [3, 14, 15, 5] or the fast eigendecomposition of semiseparable matrices [16, 17, 5].
The second class of algorithms, of which [18] is a major example, makes use of butterfly methods and similar techniques for the application of oscillatory matrices. Most algorithms in this category, including that of [18], require a relatively expensive precomputation. However, [19] describes an algorithm based on this approach whose total running time, including all precomputations, is quasilinear in $n$ when both parameters $\alpha$ and $\beta$ are in the interval $(-1/2, 1/2)$. It makes use of the observation that the solutions of Jacobi's differential equation can be accurately represented via nonoscillatory phase and amplitude functions to apply the Jacobi transform rapidly. That certain differential equations admit nonoscillatory phase functions has long been known [20]; a general theorem was established in [21] and a numerical method for the computation of nonoscillatory phase functions was given in [22]. In the case of Jacobi's differential equation, there exist a smooth amplitude function $M_\nu^{(\alpha,\beta)}$ and a smooth phase function $\psi_\nu^{(\alpha,\beta)}$ such that
(3) $\tilde P_\nu^{(\alpha,\beta)}(t) = M_\nu^{(\alpha,\beta)}(t)\,\cos\big(\psi_\nu^{(\alpha,\beta)}(t)\big)$
and
(4) $\tilde Q_\nu^{(\alpha,\beta)}(t) = M_\nu^{(\alpha,\beta)}(t)\,\sin\big(\psi_\nu^{(\alpha,\beta)}(t)\big),$
where $\tilde P_\nu^{(\alpha,\beta)}$ and $\tilde Q_\nu^{(\alpha,\beta)}$ are referred to as the modified Jacobi functions of the first and second kind, respectively. They are defined in terms of the Jacobi functions of the first and second kinds $P_\nu^{(\alpha,\beta)}$ and $Q_\nu^{(\alpha,\beta)}$ (see [20] for definitions) via the formulas
(5) $\tilde P_\nu^{(\alpha,\beta)}(t) = d_\nu^{(\alpha,\beta)}\, \sin^{\alpha+\frac{1}{2}}\!\big(\tfrac{t}{2}\big)\, \cos^{\beta+\frac{1}{2}}\!\big(\tfrac{t}{2}\big)\, P_\nu^{(\alpha,\beta)}(\cos t)$
and
(6) $\tilde Q_\nu^{(\alpha,\beta)}(t) = d_\nu^{(\alpha,\beta)}\, \sin^{\alpha+\frac{1}{2}}\!\big(\tfrac{t}{2}\big)\, \cos^{\beta+\frac{1}{2}}\!\big(\tfrac{t}{2}\big)\, Q_\nu^{(\alpha,\beta)}(\cos t),$
where
(7) $d_\nu^{(\alpha,\beta)} = \sqrt{\dfrac{(2\nu+\alpha+\beta+1)\,\Gamma(\nu+1)\,\Gamma(\nu+\alpha+\beta+1)}{\Gamma(\nu+\alpha+1)\,\Gamma(\nu+\beta+1)}}.$
The normalization constant $d_\nu^{(\alpha,\beta)}$ is chosen to ensure that the $L^2(0,\pi)$ norm of $\tilde P_\nu^{(\alpha,\beta)}$ is $1$ when $\nu$ is an integer. Indeed, the set $\big\{\tilde P_\nu^{(\alpha,\beta)} : \nu = 0, 1, 2, \dots\big\}$ is an orthonormal basis for $L^2(0,\pi)$. The change of variables $x = \cos t$ makes the singularities in the phase and amplitude functions for Jacobi's differential equation more tractable. Obviously, there is no substantial difference between treating expansions of the type (1) and those of the form
(8) $g(t) = \sum_{\nu=0}^{n-1} a_\nu\, \tilde P_\nu^{(\alpha,\beta)}(t),$
and we will consider the latter form here; more specifically, the latter form with scaling by quadrature weights. Moreover, we will denote by $t_1, \dots, t_n$ and $w_1, \dots, w_n$ the nodes and weights of the trigonometric Gauss-Jacobi quadrature rule
(9) $\int_{0}^{\pi} p(\cos t)\, \sin^{2\alpha+1}\!\big(\tfrac{t}{2}\big)\, \cos^{2\beta+1}\!\big(\tfrac{t}{2}\big)\, dt \approx \sum_{j=1}^{n} p(\cos t_j)\, w_j$
obtained by applying the change of variables $x = \cos t$ to (2). Again, we refer to transforms which use the set of points $t_1, \dots, t_n$ as uniform and those which use some other set of points as nonuniform.
In [19], it was observed that the high-degree part of the summation in (8), treated via the phase-function representation, could be interpreted as the application of a structured matrix that is the real part of a Hadamard product of a NUFFT matrix and a numerically low-rank matrix. This leads to a method for its computation which takes quasilinear time [13, 23, 24]. Expansions in terms of the functions of the second kind can be handled similarly. This nonoscillatory representation method will be described in more detail in Section 2.2.
In this paper, we generalize the one-dimensional Jacobi polynomial transform of [19] to the multidimensional case. The transformation matrix in the multidimensional case is still the real part of a Hadamard product of a multidimensional NUFFT matrix and a numerically low-rank matrix in the form of tensor products. However, the numerical rank of the low-rank matrix may grow quickly with the dimension if the algorithm in [19] is applied directly, and hence the multidimensional NUFFT in [12, 13] might not be sufficiently efficient. To obtain a viable multidimensional Jacobi polynomial transform, we reformulate the Hadamard product into a new one, the Hadamard product of an oversampled Fourier transform matrix and a numerically low-rank matrix, and propose a fast randomized SVD to achieve near optimality in the low-rank approximation. This leads to an efficient implementation of two-dimensional and three-dimensional Jacobi polynomial transforms. Moreover, as will be demonstrated numerically, the new method proposed in this paper is more robust than the method in [19] and works for $\alpha$ and $\beta$ in a wider regime than the one required in [19].
The remainder of the paper is organized as follows. In Section 2, we will first briefly introduce the fast SVD via randomized sampling and the theory of nonoscillatory phase functions. In Section 3, a variant of the fast onedimensional Jacobi polynomial transform in [19] is given; it is followed by a description of our fast multidimensional Jacobi polynomial transform. In Section 4, numerical results which demonstrate the efficiency of our algorithm are described. Finally, we will conclude our discussion in Section 5.
2 Preliminaries
In this section, we will revisit the linear-scaling randomized SVD introduced in [25] and the nonoscillatory phase and amplitude functions in [19] to make the presentation self-contained.
2.1 Approximate SVD via Randomized Sampling
For a numerically low-rank matrix $K \in \mathbb{C}^{m \times n}$ with an algorithm for evaluating an arbitrary entry, [25] introduced an algorithm to construct a rank-$k$ approximate SVD, $K \approx U \Sigma V^{*}$, from randomly selected rows and columns of $K$, where $(\cdot)^{*}$ denotes the conjugate transpose, $U$ and $V$ have orthonormal columns, and $\Sigma$ is a diagonal matrix with the approximate singular values on its diagonal in descending order.
Here, we adopt the standard MATLAB notation for a submatrix: given a row index set $I$ and a column index set $J$, $K(I, J)$ is the submatrix with entries from the rows in $I$ and the columns in $J$; we also use "$:$" to denote the entirety of the columns or rows, i.e., $K(I, :)$ and $K(:, J)$. With these handy notations, we briefly introduce the randomized SVD as follows.
Algorithm 2.1.
Approximate SVD via randomized sampling

Let $\Pi_{col}$ and $\Pi_{row}$ denote the index sets of the important columns and rows of $K$ that are used to form the column and row bases. Initially, $\Pi_{col} = \emptyset$ and $\Pi_{row} = \emptyset$.

Randomly sample $qk$ rows and denote their index set by $S_{row}$. Let $K_{sub} = K(S_{row}, :)$. Here $q$ is a multiplicative oversampling parameter. Perform a pivoted QR decomposition of $K_{sub}$ to get
(10) $K_{sub}\, P = Q R,$
where $P$ is the resulting permutation matrix and $R$ is an upper triangular matrix. Define the important column index set $\Pi_{col}$ to be the indices of the first $k$ columns picked within the pivoted QR decomposition.

Randomly sample $qk$ columns and denote their index set by $S_{col}$. Let $K_{sub} = K(:, S_{col})$. Perform a pivoted LQ decomposition of $K_{sub}$ to get
(11) $P\, K_{sub} = L Q,$
where $P$ is the resulting permutation matrix and $L$ is a lower triangular matrix. Define the important row index set $\Pi_{row}$ to be the indices of the first $k$ rows picked within the pivoted LQ decomposition.

Repeat Steps 2 and 3 a few times to ensure that $\Pi_{col}$ and $\Pi_{row}$ sufficiently sample the important columns and rows of $K$.

Apply the pivoted QR factorization to $K(:, \Pi_{col})$ and let $Q_{col}$ be the matrix consisting of the first $k$ columns of the resulting $Q$ matrix. Similarly, apply the pivoted QR factorization to $K(\Pi_{row}, :)^{*}$ and let $Q_{row}$ be the matrix consisting of the first $k$ columns of the resulting $Q$ matrix.

We seek a middle matrix $M$ such that $K \approx Q_{col}\, M\, Q_{row}^{*}$. To solve this problem efficiently, we approximately reduce it to a least-squares problem of a smaller size. Let $S'_{row}$ and $S'_{col}$ be the index sets of a few extra randomly sampled rows and columns. A simple least-squares solution to the problem
(12) $\min_{M}\, \big\| K(S'_{row}, S'_{col}) - Q_{col}(S'_{row}, :)\, M\, \big(Q_{row}(S'_{col}, :)\big)^{*} \big\|$
gives $M = \big(Q_{col}(S'_{row}, :)\big)^{\dagger}\, K(S'_{row}, S'_{col})\, \big(\big(Q_{row}(S'_{col}, :)\big)^{*}\big)^{\dagger}$, where $(\cdot)^{\dagger}$ stands for the pseudo-inverse.

Compute an SVD $M = U_M\, \Sigma\, V_M^{*}$. Then the low-rank approximation of $K$ is given by
(13) $K \approx \big(Q_{col}\, U_M\big)\, \Sigma\, \big(Q_{row}\, V_M\big)^{*}.$
In our numerical implementation, iterating Steps 2 and 3 twice is empirically sufficient to achieve accurate low-rank approximations via Algorithm 2.1. Arguments similar to those in [26] for a randomized CUR factorization can be applied to quantify the error and success probability of Algorithm 2.1 rigorously. At this point, however, we focus only on the application of Algorithm 2.1 to the fast Jacobi polynomial transform, without theoretical analysis.
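To make the sampling scheme concrete, the following Python sketch implements a simplified, one-pass version of the steps above for a real matrix (so the conjugate transpose reduces to the plain transpose). The function name, the oversampling factor `q`, and the fixed seeds are our own illustrative choices; the paper's actual implementation iterates the row/column sampling and targets a relative-error tolerance rather than a fixed rank.

```python
import numpy as np
from scipy.linalg import qr

def sampled_svd(K, k, q=3, seed=0):
    """Rank-k approximate SVD of K built from O(qk) sampled rows/columns.
    A simplified one-pass sketch of the sampling scheme described above."""
    m, n = K.shape
    rng = np.random.default_rng(seed)
    # Step 2: sample rows; pivoted QR picks k "important" columns.
    rows = rng.choice(m, size=min(q * k, m), replace=False)
    _, _, cpiv = qr(K[rows, :], mode='economic', pivoting=True)
    icols = cpiv[:k]
    # Step 3: sample columns; pivoted QR of the transpose picks k rows.
    cols = rng.choice(n, size=min(q * k, n), replace=False)
    _, _, rpiv = qr(K[:, cols].T, mode='economic', pivoting=True)
    irows = rpiv[:k]
    # Step 5: orthonormal column/row bases from the selected columns/rows.
    Qc, _ = np.linalg.qr(K[:, icols])
    Qr, _ = np.linalg.qr(K[irows, :].T)
    # Step 6: middle matrix from a small least-squares problem restricted
    # to a few extra sampled rows/columns (K ~ Qc @ M @ Qr.T).
    srows = np.union1d(irows, rng.choice(m, size=min(q * k, m), replace=False))
    scols = np.union1d(icols, rng.choice(n, size=min(q * k, n), replace=False))
    M = (np.linalg.pinv(Qc[srows, :])
         @ K[np.ix_(srows, scols)]
         @ np.linalg.pinv(Qr[scols, :]).T)
    # Step 7: SVD of the small middle matrix gives the final factors.
    Um, S, Vmh = np.linalg.svd(M)
    return Qc @ Um, S, Qr @ Vmh.T

# Exact rank-5 test matrix: the factorization should recover it closely.
rng = np.random.default_rng(1)
K = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 150))
U, S, V = sampled_svd(K, k=5)
err = np.linalg.norm(U @ np.diag(S) @ V.T - K) / np.linalg.norm(K)
```

Note that only $O(qk)$ rows and columns of $K$ are ever touched, which is what makes the factorization cost linear in $m + n$ rather than $mn$.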
Note that our goal in the Jacobi polynomial transform is to construct a low-rank approximation of a low-rank matrix, i.e., $K \approx U \Sigma V^{*}$, up to a fixed relative error $\epsilon$, rather than up to a fixed rank. Algorithm 2.1 can also be embedded into an iterative process that gradually increases the rank parameter $k$ to achieve the desired accuracy. We denote the rank parameter by $r_\epsilon$ when it is large enough to obtain the accuracy $\epsilon$.
When $K$ originates from the discretization of a smooth function $\kappa(x, y)$ at the grid points $\{x_i\}_{i=1}^{m}$ and $\{y_j\}_{j=1}^{n}$, i.e., $K_{ij} = \kappa(x_i, y_j)$, another standard method for constructing a low-rank factorization of $K$ is Lagrange interpolation at Chebyshev grids in $x$ or $y$. For example, let $\{t_l\}_{l=1}^{r}$ denote the set of $r$ Chebyshev grid points in the domain of $x$, let $U$ be the matrix whose $l$-th column consists of the values of the $l$-th Lagrange basis polynomial corresponding to $\{t_l\}$ at the points $\{x_i\}$, and let $V$ be the matrix such that the $(l, j)$-th entry of $V^{*}$ is equal to $\kappa(t_l, y_j)$; then $K \approx U V^{*}$ by Lagrange interpolation. Usually, an oversampling parameter $q$ is used via setting $r = qk$. Then a rank-$k$ truncated SVD of $U V^{*}$ gives a compressed rank-$k$ approximation of $K$.
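The interpolation-based factorization can be sketched in a few lines of Python. The kernel $\kappa$, the grids, and the rank below are our own illustrative choices (a smooth exponential kernel rather than the oscillatory kernels arising in the transform); the point is only that sampling $\kappa$ at Chebyshev nodes in one variable yields a small-rank factorization.

```python
import numpy as np

# Low-rank factorization of K[i, j] = kappa(x_i, y_j) by Lagrange
# interpolation of kappa in x at r Chebyshev nodes.
kappa = lambda x, y: np.exp(x * y)          # smooth on [0, 1] x [0, 1]
x = np.linspace(0.0, 1.0, 300)
y = np.linspace(0.0, 1.0, 250)
K = kappa(x[:, None], y[None, :])

r = 12                                      # interpolation rank
# Chebyshev (first-kind) nodes mapped to [0, 1].
t = 0.5 * (1 + np.cos(np.pi * (2 * np.arange(r) + 1) / (2 * r)))

# U[i, l] = l-th Lagrange cardinal polynomial at x_i (barycentric form);
# assumes no x_i coincides exactly with a node t_l.
w = 1.0 / np.prod(t[:, None] - t[None, :] + np.eye(r), axis=1)
d = x[:, None] - t[None, :]
U = (w / d) / np.sum(w / d, axis=1, keepdims=True)

V = kappa(t[:, None], y[None, :])           # kernel sampled at the nodes
err = np.linalg.norm(U @ V - K) / np.linalg.norm(K)
```

For a smooth kernel the error decays spectrally in `r`, but, as the text notes, the rank `r` produced this way is generally larger than the optimal rank delivered by a truncated SVD.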
The fast nonuniform FFT in [13] and the one-dimensional Jacobi polynomial transform in [19] adopted low-rank approximations via Lagrange interpolation to deal with the low-rank terms in their Hadamard products of low-rank and (nonuniform) FFT matrices, without an extra truncated SVD. In this paper, we propose to use the randomized SVD via random sampling to obtain a nearly optimal rank in the low-rank approximation.
2.2 Nonoscillatory phase and amplitude functions
Given a pair of parameters $\alpha$ and $\beta$ in $(-1/2, 1/2)$ and a maximum degree of interest $N$, we revisit the fast algorithms in [19] for constructing nonoscillatory phase and amplitude functions $\psi_\nu^{(\alpha,\beta)}$ and $M_\nu^{(\alpha,\beta)}$ such that
(14) $\tilde P_\nu^{(\alpha,\beta)}(t) = M_\nu^{(\alpha,\beta)}(t)\,\cos\big(\psi_\nu^{(\alpha,\beta)}(t)\big)$
and
(15) $\tilde Q_\nu^{(\alpha,\beta)}(t) = M_\nu^{(\alpha,\beta)}(t)\,\sin\big(\psi_\nu^{(\alpha,\beta)}(t)\big)$
for $t \in (0, \pi)$ and degrees $\nu$ between a moderate lower bound $\nu_0$ and $N$. The polynomials with $t$ and $\nu$ out of these ranges can be evaluated by the well-known three-term recurrence relations or various asymptotic expansions; the lower bound $\nu_0$ was chosen to obtain optimal numerical performance.
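For the low-degree regime, the three-term recurrence referred to above is the standard one for Jacobi polynomials. The following Python sketch evaluates $P_n^{(\alpha,\beta)}(x)$ this way; the function name is our own, and the recurrence coefficients are the classical ones.

```python
import numpy as np

def jacobi_poly(n, a, b, x):
    """Evaluate the Jacobi polynomial P_n^{(a,b)}(x) by the standard
    three-term recurrence, as used for moderate degrees before switching
    to the phase-function representation."""
    x = np.asarray(x, dtype=float)
    p_prev = np.ones_like(x)                     # P_0(x) = 1
    if n == 0:
        return p_prev
    p = 0.5 * (a + b + 2) * x + 0.5 * (a - b)    # P_1(x)
    for k in range(2, n + 1):
        c = 2 * k + a + b
        a1 = 2 * k * (k + a + b) * (c - 2)
        a2 = (c - 1) * (a * a - b * b)
        a3 = (c - 1) * c * (c - 2)
        a4 = 2 * (k + a - 1) * (k + b - 1) * c
        # a1 * P_k = (a2 + a3 * x) * P_{k-1} - a4 * P_{k-2}
        p, p_prev = ((a2 + a3 * x) * p - a4 * p_prev) / a1, p
    return p

x = np.linspace(-1, 1, 5)
p2 = jacobi_poly(2, 0.0, 0.0, x)   # reduces to Legendre P_2 for a = b = 0
```

For $\alpha = \beta = 0$ the recurrence reproduces the Legendre polynomials, which provides a simple sanity check.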
Next, we present a few facts regarding the phase and amplitude functions related to Jacobi's differential equation. It is well known that the functions $\tilde P_\nu^{(\alpha,\beta)}$ and $\tilde Q_\nu^{(\alpha,\beta)}$ satisfy the second-order differential equation
(16) $y''(t) + q_\nu^{(\alpha,\beta)}(t)\, y(t) = 0, \qquad 0 < t < \pi,$
where
$q_\nu^{(\alpha,\beta)}(t) = \left(\nu + \frac{\alpha+\beta+1}{2}\right)^{2} + \frac{\frac{1}{4}-\alpha^{2}}{4\sin^{2}\!\big(\frac{t}{2}\big)} + \frac{\frac{1}{4}-\beta^{2}}{4\cos^{2}\!\big(\frac{t}{2}\big)}.$
We refer to equation (16) as the Jacobi differential equation. Following the derivation in [19], we can show that the pair of real-valued, linearly independent solutions $\tilde P_\nu^{(\alpha,\beta)}$, $\tilde Q_\nu^{(\alpha,\beta)}$ of (16) satisfies
(17) $\tilde P_\nu^{(\alpha,\beta)}(t) = \sqrt{W}\, \frac{\cos\big(\psi_\nu^{(\alpha,\beta)}(t)\big)}{\sqrt{\big(\psi_\nu^{(\alpha,\beta)}\big)'(t)}}$
and
(18) $\tilde Q_\nu^{(\alpha,\beta)}(t) = \sqrt{W}\, \frac{\sin\big(\psi_\nu^{(\alpha,\beta)}(t)\big)}{\sqrt{\big(\psi_\nu^{(\alpha,\beta)}\big)'(t)}},$
where $W$ is the necessarily positive constant Wronskian of the pair $\big\{\tilde P_\nu^{(\alpha,\beta)}, \tilde Q_\nu^{(\alpha,\beta)}\big\}$, and $\psi_\nu^{(\alpha,\beta)}$ is the nonoscillatory phase function of the pair satisfying
(19) $\big(\psi_\nu^{(\alpha,\beta)}\big)'(t) = \frac{W}{\big(\tilde P_\nu^{(\alpha,\beta)}(t)\big)^{2} + \big(\tilde Q_\nu^{(\alpha,\beta)}(t)\big)^{2}}$
with an appropriately chosen integration constant related to $W$. Note that $\big(\psi_\nu^{(\alpha,\beta)}\big)'(t) > 0$ since $W > 0$. Hence, the nonoscillatory amplitude function of the pair can be defined as
(20) $M_\nu^{(\alpha,\beta)}(t) = \sqrt{\big(\tilde P_\nu^{(\alpha,\beta)}(t)\big)^{2} + \big(\tilde Q_\nu^{(\alpha,\beta)}(t)\big)^{2}} = \sqrt{\frac{W}{\big(\psi_\nu^{(\alpha,\beta)}\big)'(t)}}.$
Through a straightforward computation, it can be verified that the square of the amplitude function, $N(t) = \big(M_\nu^{(\alpha,\beta)}(t)\big)^{2}$, satisfies the third-order linear ordinary differential equation (ODE)
(21) $N'''(t) + 4\, q_\nu^{(\alpha,\beta)}(t)\, N'(t) + 2\, \big(q_\nu^{(\alpha,\beta)}\big)'(t)\, N(t) = 0.$
Obviously,
(22) $N(t) = \big(\tilde P_\nu^{(\alpha,\beta)}(t)\big)^{2} + \big(\tilde Q_\nu^{(\alpha,\beta)}(t)\big)^{2}.$
Hence, we can specify the values of $N$ and its first two derivatives at a point on the interval $(0,\pi)$ for any $\nu$ using various asymptotic expansions for $\tilde P_\nu^{(\alpha,\beta)}$ and $\tilde Q_\nu^{(\alpha,\beta)}$. Afterwards, we uniquely determine $N$ by solving the ODE (21) using a variant of the integral equation method of [12] (or any standard method for stiff problems).
To obtain the values of $\psi_\nu^{(\alpha,\beta)}$, we first calculate the values of
(23) $\big(\psi_\nu^{(\alpha,\beta)}\big)'(t) = \frac{W}{N(t)}$
via (20). Next, we obtain the values of the function $\psi_0$ defined via
(24) $\psi_0(t) = \int_{t_0}^{t} \big(\psi_\nu^{(\alpha,\beta)}\big)'(s)\, ds$
at any point in $(0, \pi)$ via spectral integration, where $t_0$ is a fixed reference point. There is an unknown constant $C$ connecting $\psi_0$ with $\psi_\nu^{(\alpha,\beta)}$; that is,
(25) $\psi_\nu^{(\alpha,\beta)}(t) = \psi_0(t) + C.$
To evaluate $C$, we first use a combination of asymptotic and series expansions to calculate $\tilde P_\nu^{(\alpha,\beta)}$ at the point $t_0$. Since $\tilde P_\nu^{(\alpha,\beta)}(t) = M_\nu^{(\alpha,\beta)}(t)\,\cos\big(\psi_\nu^{(\alpha,\beta)}(t)\big)$, it follows that
(26) $\cos\big(\psi_0(t_0) + C\big) = \frac{\tilde P_\nu^{(\alpha,\beta)}(t_0)}{M_\nu^{(\alpha,\beta)}(t_0)},$
and $C$ can be calculated in the obvious fashion.
The above discussion has introduced an algorithm to evaluate $\psi_\nu^{(\alpha,\beta)}$ and $M_\nu^{(\alpha,\beta)}$ on the whole domain $(0, \pi)$. To obtain a fast algorithm, we note that it is enough to carry out these calculations on a set of selected grid points, since we can evaluate $\psi_\nu^{(\alpha,\beta)}$ and $M_\nu^{(\alpha,\beta)}$ elsewhere via interpolation from their values on those grid points.
It is well known that a polynomial $p$ (here the role of $p$ is played by $M_\nu^{(\alpha,\beta)}$ or $\psi_\nu^{(\alpha,\beta)}$) can be evaluated in a numerically stable fashion using the barycentric Chebyshev interpolation formula [27]
(27) $p(x) \approx \left. \sum_{j=0}^{k} \frac{w_j\, p(x_j)}{x - x_j} \middle/ \sum_{j=0}^{k} \frac{w_j}{x - x_j} \right.$
where $x_0, \dots, x_k$ are the nodes of the $(k+1)$-point Chebyshev grid on a sufficiently small interval (such that $p$ is well approximated by a polynomial of degree at most $k$) and
(28) $w_j = (-1)^j\, \delta_j, \qquad \delta_j = \begin{cases} 1/2, & j = 0 \text{ or } j = k, \\ 1, & \text{otherwise}. \end{cases}$
Hence, it is sufficient to evaluate $N$ and $\psi_0$, which lead to $M_\nu^{(\alpha,\beta)}$ and $\psi_\nu^{(\alpha,\beta)}$, at a tensor product of Chebyshev grid points in $t$ and $\nu$, and to calculate them at an arbitrary location via a bivariate piecewise barycentric Chebyshev interpolation.
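A minimal Python sketch of the one-dimensional barycentric formula (27)-(28) on a Chebyshev grid of the second kind follows; the function name is our own, and for brevity the sketch assumes no query point coincides exactly with a grid node (a production implementation would handle that case separately).

```python
import numpy as np

def cheb_interp(fvals, c, d, xq):
    """Barycentric interpolation from the (k+1)-point Chebyshev grid
    (second kind) on [c, d] carrying the samples fvals; assumes no
    query point in xq lands exactly on a node."""
    k = len(fvals) - 1
    j = np.arange(k + 1)
    t = 0.5 * (c + d) + 0.5 * (d - c) * np.cos(np.pi * j / k)  # nodes
    w = (-1.0) ** j             # barycentric weights (-1)^j * delta_j
    w[0] *= 0.5
    w[-1] *= 0.5                # endpoint weights are halved
    C = w / (np.asarray(xq)[:, None] - t[None, :])
    return (C @ fvals) / C.sum(axis=1)

# Interpolate cos on [0, pi] from its values on a 31-point Chebyshev grid.
k = 30
nodes = 0.5 * np.pi + 0.5 * np.pi * np.cos(np.pi * np.arange(k + 1) / k)
xq = np.linspace(0.1, 3.0, 7)
vals = cheb_interp(np.cos(nodes), 0.0, np.pi, xq)
```

The stability of this form (as opposed to the Lagrange form with explicit node polynomials) is what makes it the standard choice for piecewise Chebyshev evaluation.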
More explicitly, let be the least integer such that , and let be equal to twice the least integer such that . Next, we define for , for , and for . Now we let
(29) 
denote the nodes of the piecewise Chebyshev grid on the intervals
(30) 
with points in total; let
(31) 
denote the nodes of the point piecewise Chebyshev grid on the intervals
(32) 
with points in total. The piecewise Chebyshev grids (29) and (31) form a tensor product
(33) 
Then $\psi_\nu^{(\alpha,\beta)}$ and $M_\nu^{(\alpha,\beta)}$ can be obtained via barycentric Chebyshev interpolation in $t$ and $\nu$ from their values at the tensor product grid, restricted to the piece that the evaluation point belongs to.
Here we summarize the operation complexity of the whole algorithm above; a detailed discussion can be found in [19]. ODE solvers are applied for a small number of selected values of $\nu$, and the resulting solutions are evaluated at a small number of selected values of $t$, making the total running time of the procedure just described a lower-order term in the cost of the overall transform.
3 Fast Jacobi polynomial transforms
Without loss of generality, we assume that $\psi_\nu^{(\alpha,\beta)}$, $M_\nu^{(\alpha,\beta)}$, $\tilde P_\nu^{(\alpha,\beta)}$, and $\tilde Q_\nu^{(\alpha,\beta)}$ can be evaluated in $O(1)$ operations for any $t$ and $\nu$. A fast algorithm for computing the Gauss-Jacobi quadrature nodes and weights in the Jacobi polynomial transform in (8) has also been introduced in [19]. We refer the reader to [19] for details and assume that the quadrature nodes and weights are available in this section.
3.1 Onedimensional transform and its inverse
Here we propose a new variant of the one-dimensional Jacobi polynomial transform for $\tilde P_\nu^{(\alpha,\beta)}$ in [19]; the new algorithm simplifies the discussion of the fast algorithm. The transform for $\tilde Q_\nu^{(\alpha,\beta)}$ is similar. For our purpose, we consider the order-$n$ uniform forward Jacobi polynomial transform with a scaling; that is, we calculate the vector of values
(34) $\Big(g(t_1)\sqrt{w_1},\ g(t_2)\sqrt{w_2},\ \dots,\ g(t_n)\sqrt{w_n}\Big)^{T}$
given the vector
(35) $\big(a_0, a_1, \dots, a_{n-1}\big)^{T}$
of the coefficients in the expansion
(36) $g(t) = \sum_{\nu=0}^{n-1} a_\nu\, \tilde P_\nu^{(\alpha,\beta)}(t);$
here, $t_1, \dots, t_n$ and $w_1, \dots, w_n$ are the nodes and weights of the $n$-point trigonometric Gauss-Jacobi quadrature rule corresponding to the parameters $\alpha$ and $\beta$. The "nonuniform" forward Jacobi transform does not require trigonometric Gauss-Jacobi quadrature nodes and weights.
In the uniform transform, the properties of the trigonometric Gauss-Jacobi quadrature rule and the weighting by square roots of the quadrature weights in (34) ensure that the matrix taking (35) to (34) is orthogonal; this matrix is
(37) $\mathcal{J} = D J, \qquad D = \mathrm{diag}\big(\sqrt{w_1}, \dots, \sqrt{w_n}\big),$
where $J$ is the matrix representing the sum in (36), i.e., $J_{j\nu} = \tilde P_\nu^{(\alpha,\beta)}(t_j)$. Hence, the inverse transform is a simple matrix-vector multiplication with the transpose of $\mathcal{J}$. In the nonuniform case, $\mathcal{J}$ is usually a very ill-conditioned matrix and the inverse transform requires solving a challenging system of linear equations, which will be discussed in a separate paper in the future.
It follows from (14) that the $(j, \nu)$-th entry of the high-degree part of $\mathcal{J}$ is
(38) $\sqrt{w_j}\, M_\nu^{(\alpha,\beta)}(t_j)\, \cos\big(\psi_\nu^{(\alpha,\beta)}(t_j)\big),$
while the part consisting of low-degree polynomials is computed via three-term recurrence relations or various asymptotic expansions. Let $\mathcal{J}_1$ be the submatrix of $\mathcal{J}$ corresponding to the part of polynomials with degree less than $\nu_0$, and $\mathcal{J}_2$ be the rest of $\mathcal{J}$; then $\mathcal{J} = [\,\mathcal{J}_1\ \ \mathcal{J}_2\,]$.
Note that $\mathcal{J}_2$ is the real part of a discrete Fourier integral transform that can be evaluated with quasilinear complexity by the algorithm of [23] for applying Fourier integral transforms, a special case of that in [24] for general oscillatory integral transforms. Motivated by this observation, the algorithm in [19] constructs a low-rank matrix $L$ whose $(j, \nu)$-th entry is
(39) $L_{j\nu} = \sqrt{w_j}\, M_\nu^{(\alpha,\beta)}(t_j)\, e^{\,i\left(\psi_\nu^{(\alpha,\beta)}(t_j) - \nu t_j\right)}.$
Denote the nonuniform FFT matrix, whose $(j, \nu)$-th entry is
(40) $F_{j\nu} = e^{\,i \nu t_j},$
as $F$. Then
(41) $\mathcal{J}_2 = \mathrm{Re}\,(L \circ F),$
where "$\circ$" denotes the Hadamard product. It can be immediately seen from (41) that $\mathcal{J}_2$ can be applied through a small number of NUFFTs [12, 13], and $\mathcal{J}_1$ through a simple direct summation, with a cost that scales quasilinearly in $n$.
More specifically, let a low-rank approximation of $L$ be
(42) $L \approx \sum_{l=1}^{r} u_l\, v_l^{*},$
where $u_l$ and $v_l$ are column vectors for each $l$. Then the transform from (35) to (34) can be approximated by the sum
(43) $\mathcal{J}\, a \approx \mathcal{J}_1\, a_1 + \mathrm{Re}\left(\sum_{l=1}^{r} \mathrm{diag}(u_l)\, F\, \mathrm{diag}(\bar v_l)\, a_2\right),$
where $a_1$ is the subvector of $a$ consisting of its first $\nu_0$ entries, $a_2$ is the subvector of $a$ consisting of the remaining entries, and $\mathrm{diag}(\cdot)$ denotes a diagonal matrix with the given column vector on its diagonal. The formula (43) can be carried out by $r$ NUFFTs in $O(r\, n \log n)$ arithmetic operations.
In practice, inspired by the NUFFT in [13], we can replace the Hadamard product of a low-rank matrix and a NUFFT matrix in (41) with the Hadamard product of a low-rank matrix and a DFT matrix, whose matrix-vector products can be carried out more efficiently with a small number of FFTs.
Consider another matrix $G$ whose $(j, \nu)$-th entry is defined via
(44) $G_{j\nu} = e^{\,2\pi i\, \nu\, \lfloor n t_j / (2\pi) \rceil / n},$
where $\lfloor x \rceil$ denotes the integer nearest to $x$. Then we have
(45) $\mathcal{J}_2 = \mathrm{Re}\,(H \circ G),$
where $H$ is a matrix whose $(j, \nu)$-th entry is
(46) $H_{j\nu} = \sqrt{w_j}\, M_\nu^{(\alpha,\beta)}(t_j)\, e^{\,i\left(\psi_\nu^{(\alpha,\beta)}(t_j) - 2\pi \nu \lfloor n t_j / (2\pi) \rceil / n\right)}.$
It is obvious that $G$ in (45) is a row permutation of an inverse DFT matrix. Note that the difference between the phase functions of $H$ and $L$ is less than $\pi$. Hence, $H$ is also a low-rank matrix, and we can apply the randomized SVD in Algorithm 2.1 to construct a low-rank approximation of $H$ in nearly linear time with respect to $n$.
Suppose we have constructed an approximate rank-$r_\epsilon$ SVD of $H$ up to a desired accuracy using Algorithm 2.1; that is, suppose that we have computed the factorization
(47) $H \approx U \Sigma V^{*},$
where $U$ and $V$ have orthonormal columns and $\Sigma$ is a positive definite diagonal matrix. By rearranging the factors above, we have
(48) $H \approx \sum_{l=1}^{r_\epsilon} u_l\, v_l^{T},$
where $u_l$ denotes the $l$-th column vector of $U \Sigma$, $v_l^{T}$ denotes the $l$-th row vector of $V^{*}$, and $(\cdot)^{T}$ denotes the matrix transpose. Once (48) is ready, we have
(49) $\mathcal{J}_2\, a_2 \approx \mathrm{Re}\left(\sum_{l=1}^{r_\epsilon} \mathrm{diag}(u_l)\, G\, \mathrm{diag}(v_l)\, a_2\right),$
where $\mathrm{diag}(v_l)$ denotes a diagonal matrix with $v_l$ on its diagonal. Formula (49) indicates that the Jacobi polynomial transform can be evaluated efficiently via $r_\epsilon$ inverse FFTs, which requires $O(r_\epsilon\, n \log n)$ arithmetic operations and $O(r_\epsilon\, n)$ memory.
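The key identity behind this step, that a Hadamard product of a rank-$r$ matrix with a DFT matrix can be applied with $r$ FFTs, can be verified directly in a few lines of Python. The sketch below uses the forward-DFT sign convention of `numpy.fft.fft` rather than the inverse DFT of the text, and random real factors in place of the SVD factors of $H$; only the algebraic structure is being illustrated.

```python
import numpy as np

# Apply (L ∘ F) g via r FFTs, where F is the N x N DFT matrix and
# L = sum_l u_l v_l^T has rank r:  (L ∘ F) g = sum_l diag(u_l) F (v_l * g).
rng = np.random.default_rng(0)
N, r = 256, 4
U = rng.standard_normal((N, r))
V = rng.standard_normal((N, r))
g = rng.standard_normal(N)

# Fast application: one length-N FFT per rank-one term, O(r N log N) total.
fast = np.zeros(N, dtype=complex)
for l in range(r):
    fast += U[:, l] * np.fft.fft(V[:, l] * g)

# Dense reference: form L ∘ F explicitly, O(N^2) storage and time.
k = np.arange(N)
F = np.exp(-2j * np.pi * np.outer(k, k) / N)
dense = ((U @ V.T) * F) @ g
```

The row permutation of $G$ and the final real part are cheap elementwise operations and do not change the operation count.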
Compared to (43), used in the original fast Jacobi transform in [19], the number of inverse FFTs in (49) has been optimized, so Formula (49) takes fewer operations to compute a multidimensional transform. Note that, compared to the original method in [19], although our new method using (49) spends more time constructing a low-rank approximation, the optimized rank of the low-rank approximation accelerates the application of the multidimensional Jacobi transform. In addition, as mentioned previously, the new method works in a larger range of $\alpha$ and $\beta$. In Section 4, we will provide numerical comparisons to demonstrate the superiority of the new method over the method in [19].
The fast inverse Jacobi transform in the uniform case can be carried out in a similar manner, since $\mathcal{J}$ is an orthonormal matrix. In fact, by transposing the factorization above, the inverse transform can be computed via
(50) $\mathcal{J}_2^{T}\, b \approx \mathrm{Re}\left(\sum_{l=1}^{r_\epsilon} \mathrm{diag}(v_l)\, G^{T}\, \mathrm{diag}(u_l)\, b\right),$
where $G^{T}$ is a permutation of the DFT matrix. Therefore, the inverse Jacobi polynomial transform can also be computed via FFTs.
It is worth emphasizing that the fast algorithm introduced in this section also works for nonuniform Jacobi polynomial transforms, since the low-rankness of $H$ is independent of the sample points $t_j$ and of the quadrature weights used in the uniform Jacobi polynomial transform. Numerical examples verifying this will be given later. The inverse transform in the nonuniform case requires solving a highly ill-conditioned linear system, which will be reported in a separate paper in the future.
3.2 Twodimensional transform and its inverse
In this section, we extend our algorithm to the two-dimensional case, using the new method developed in the last section. We focus only on the transform for $\tilde P_\nu^{(\alpha,\beta)}$; the one for $\tilde Q_\nu^{(\alpha,\beta)}$ is similar. We will adopt the MATLAB notation $\mathrm{vec}(Z)$ for the vector resulting from reshaping the matrix $Z$ into a vector. We also use notations similar to those in Section 3.1 for the matrices analogous to their counterparts there, with a superscript specifying the corresponding variable in the spatial domain. In the rest of this section, we always assume that a low-rank approximation
(51) $H^{x} \approx U^{x}\, \Sigma^{x}\, (V^{x})^{*}$
has been obtained by Algorithm 2.1 up to a desired accuracy, for the superscript "$x$" standing for either spatial variable.
Given locations $\{x_i\}_{i=1}^{n}$ and $\{y_j\}_{j=1}^{n}$ in $(0, \pi)$, with no substantial difference from the one-dimensional case, the forward and inverse two-dimensional Jacobi polynomial transforms arise from the Jacobi expansion
(52) $g(x_i, y_j) = \sum_{\nu_1=0}^{n-1} \sum_{\nu_2=0}^{n-1} Z_{\nu_1 \nu_2}\, \tilde P_{\nu_1}^{(\alpha,\beta)}(x_i)\, \tilde P_{\nu_2}^{(\alpha,\beta)}(y_j),$
where $Z$ denotes the matrix of expansion coefficients. The forward and inverse transforms can be defined analogously to the one-dimensional case. When both $\{x_i\}$ and $\{y_j\}$ are exactly the nodes of the trigonometric Gauss-Jacobi quadrature rule, the corresponding transform is referred to as the uniform transform. With additional weight matrices $D^{x}$ and $D^{y}$, the forward transformation matrix taking $\mathrm{vec}(Z)$ to the weighted values of $g$ is orthogonal; it is the Kronecker product (denoted "$\otimes$") of the one-dimensional transformation matrices. When $\{x_i\}$, $\{y_j\}$, and the weights are not given by the trigonometric Gauss-Jacobi quadrature rule, we refer to the corresponding transform as a nonuniform transform.
With the low-rank approximations in (51) available, we have
(53) 
where $O$ denotes a zero matrix. In our numerical implementation, since $O$ does not contribute to the numerical result, only the entries needed by the first three terms above are applied. Formula (53) indicates that 2D uniform Jacobi polynomial transforms can be evaluated via a small number of 1D inverse FFTs and 2D inverse FFTs, which results in quasilinear arithmetic operations and memory. We would like to emphasize that this algorithm also works for the 2D nonuniform forward transform, since it is actually a tensor form of the one-dimensional transform. Numerical examples verifying this will be given later.
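The tensor structure that makes the 2D transform reduce to 1D factors is the standard vec/Kronecker identity, which can be checked directly in Python. The matrices below are arbitrary illustrative choices; in the transform, the roles of `A` and `B` are played by the (weighted) one-dimensional transformation matrices.

```python
import numpy as np

# Kronecker identity behind the 2D transform:
# (A ⊗ B) vec(Z) = vec(B Z A^T), with vec stacking columns (Fortran order).
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 6))
B = rng.standard_normal((5, 3))
Z = rng.standard_normal((3, 6))    # shape: (cols of B) x (cols of A)

via_kron = np.kron(A, B) @ Z.flatten(order='F')   # O(n^4) if formed densely
via_dims = (B @ Z @ A.T).flatten(order='F')       # two small 1D applications
```

Applying the factors dimension by dimension avoids ever forming the Kronecker product, which is what keeps the multidimensional transform quasilinear.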
The fast 2D inverse Jacobi transform in the uniform case can be carried out in a similar manner, since both one-dimensional transformation matrices are orthonormal. In fact, the 2D inverse transform can be computed via
(54) 
where $O$ again denotes a zero matrix. In our numerical implementation, since $O$ does not contribute to the numerical result, we need to compute only the entries required by the first three terms above. Therefore, 2D inverse Jacobi polynomial transforms can also be evaluated via a small number of 1D FFTs and 2D FFTs.
3.3 Threedimensional transform and its inverse
In this section, we continue to extend our algorithm to the 3D Jacobi polynomial transform and its inverse. Analogously, we discuss only the transform for $\tilde P_\nu^{(\alpha,\beta)}$; the transform for $\tilde Q_\nu^{(\alpha,\beta)}$ is similar. The notation of Section 3.2 is inherited here.
Given locations $\{x_i\}$, $\{y_j\}$, and $\{z_k\}$ in $(0, \pi)$, with no substantial difference from the lower-dimensional cases, the forward and inverse three-dimensional Jacobi polynomial transforms arise from the Jacobi expansion
(55) $g(x_i, y_j, z_k) = \sum_{\nu_1=0}^{n-1} \sum_{\nu_2=0}^{n-1} \sum_{\nu_3=0}^{n-1} Z_{\nu_1 \nu_2 \nu_3}\, \tilde P_{\nu_1}^{(\alpha,\beta)}(x_i)\, \tilde P_{\nu_2}^{(\alpha,\beta)}(y_j)\, \tilde P_{\nu_3}^{(\alpha,\beta)}(z_k),$
where $Z$ denotes a three-dimensional tensor containing the expansion coefficients. The forward and inverse transforms can be defined analogously to the one- and two-dimensional cases. When $\{x_i\}$, $\{y_j\}$, and $\{z_k\}$ are all exactly the nodes of the trigonometric Gauss-Jacobi quadrature rule, the corresponding transform is referred to as the uniform transform; otherwise, we refer to it as a nonuniform transform.
In the uniform transform, in order to take advantage of orthogonality, we consider the tensor product of the three one-dimensional transformation matrices, which takes $\mathrm{vec}(Z)$ to the weighted values of the expansion. Once the related low-rank approximations,
(56) 
have been obtained, we have
(57) 
where
(58) 
In our numerical implementation, since the zero blocks do not contribute to the numerical result, only the entries needed by the first seven terms above are applied. Formula (58) indicates that the 3D uniform Jacobi polynomial transform can be evaluated via