A trivariate interpolation algorithm using a cube-partition searching procedure
Abstract
In this paper we propose a fast algorithm for trivariate interpolation, based on the partition of unity method for constructing a global interpolant by blending local radial basis function interpolants and using locally supported weight functions. The partition of unity algorithm is efficiently implemented and optimized by connecting the method with an effective cube-partition searching procedure. More precisely, we construct a cube structure, which partitions the domain and strictly depends on the size of its subdomains, so that the new searching procedure and, accordingly, the resulting algorithm enable us to efficiently deal with a large number of nodes. Complexity analysis and numerical experiments show the high efficiency and accuracy of the proposed interpolation algorithm.
Key words. meshless approximation, fast algorithms, partition of unity methods, radial basis functions, scattered data.
AMS subject classifications. 65D05, 65D15, 65D17.
1 Introduction
The problem of constructing fast algorithms for multivariate approximation of scattered data points has recently attracted many researchers working in various areas of applied mathematics and scientific computing, such as interpolation, approximation theory, neural networks, computer aided geometric design (CAGD) and machine learning, to name a few. Thus, we often need numerical algorithms that allow us to efficiently deal with a large number of points, not only in one or two dimensions but also in higher dimensions, as usually occurs in several applications (see, e.g., [15, 26] and references therein).
Though there exist several numerical algorithms and alternative techniques for bivariate interpolation of scattered data, the problem of efficiently interpolating many thousands or millions of three-dimensional data points has received comparatively little attention in the literature, with the exception of a few cases such as [3, 13, 19, 23, 24]; a comparison of radial basis function (RBF) methods in the 3D setting can be found in [5].
Since mesh-based methods require some sort of underlying computational mesh, e.g. a triangulation of the domain, their construction is a rather difficult task, already in two dimensions, where mesh generation usually turns out to be one of the most time-consuming parts. For this reason, in the following we focus on meshfree or meshless approximation. More precisely, here we consider the partition of unity method, which involves the use of RBFs as local approximants and of locally supported weight functions (see [25]). Further details on the origin of the partition of unity method can be found in [2, 20]. Moreover, some other examples of local approaches involving modified Shepard's methods and different searching procedures can be found in [1, 4, 11, 18, 19, 22, 23, 24].
Starting from the previous work [10], where an efficient algorithm with a new cell-based searching procedure is presented for bivariate interpolation of large scattered data sets, in this paper we extend it directly to the trivariate case, obtaining a new fast interpolation algorithm, which can briefly be summarized in three stages as follows:

partition the domain into a suitable number of cubes;

consider an optimized cube-partition searching procedure establishing the minimal number of cubes to be examined, in order to localize the subset of nodes belonging to each subdomain;

apply the partition of unity method combined with local RBFs.
In particular, the algorithm is characterized by the construction of a cube-partition searching procedure, which originates from the repeated use of a quicksort routine with respect to different directions and enables us to pass from unordered to ordered data structures. Moreover, this technique is strictly related to the construction of a partition of the domain in cubes and depends on the size of its subdomains, thus producing a nearest-neighbor searching procedure, which is particularly efficient in local interpolation methods. Numerical experiments show the efficiency and accuracy of the cube algorithm.
The paper is organized as follows. In Section 2 we recall some theoretical results, giving a general description of the partition of unity method, which makes use of RBFs as local approximants. In Section 3 we present in detail the cube-partition algorithm for trivariate interpolation, which is efficiently implemented and optimized by using a nearest-neighbor searching procedure. Computational complexity and storage requirements of the interpolation algorithm are analyzed as well. In Section 4 we show numerical results concerning efficiency and accuracy of the partition of unity algorithm. Finally, Section 5 deals with conclusions and future work.
2 Partition of unity interpolation
Let $\mathcal{X}_N = \{\boldsymbol{x}_i,\ i = 1, \dots, N\}$ be a set of distinct data points or nodes, arbitrarily distributed in a domain $\Omega \subseteq \mathbb{R}^s$, $s \geq 1$, with an associated set $\mathcal{F}_N = \{f_i,\ i = 1, \dots, N\}$ of data values or function values, which are obtained by sampling some (unknown) function $f: \Omega \rightarrow \mathbb{R}$ at the nodes, i.e., $f_i = f(\boldsymbol{x}_i)$, $i = 1, \dots, N$.
The basic idea of the partition of unity interpolation is to start with a partition of the open and bounded domain $\Omega$ into $d$ subdomains $\Omega_j$ such that $\Omega \subseteq \bigcup_{j=1}^{d} \Omega_j$, with some mild overlap among the subdomains. Associated with these subdomains we choose a partition of unity, i.e. a family of compactly supported, nonnegative, continuous functions $W_j$ with $\mathrm{supp}(W_j) \subseteq \Omega_j$ such that
$$\sum_{j=1}^{d} W_j(\boldsymbol{x}) = 1, \qquad \boldsymbol{x} \in \Omega. \tag{2.1}$$
For each subdomain $\Omega_j$ we consider a local approximant $R_j: \Omega_j \rightarrow \mathbb{R}$ and then form the global approximant
$$\mathcal{I}(\boldsymbol{x}) = \sum_{j=1}^{d} R_j(\boldsymbol{x}) W_j(\boldsymbol{x}), \qquad \boldsymbol{x} \in \Omega. \tag{2.2}$$
Here $R_j$ defines a RBF interpolant of the form
$$R_j(\boldsymbol{x}) = \sum_{k=1}^{n_j} c_k \, \phi\left(\|\boldsymbol{x} - \boldsymbol{x}_k\|\right),$$
where $\phi: [0, \infty) \rightarrow \mathbb{R}$ represents a radial basis function, $\|\cdot\|$ denotes the Euclidean norm, and $n_j$ indicates the number of data points in $\Omega_j$. Furthermore, $R_j$ satisfies the interpolation conditions
$$R_j(\boldsymbol{x}_i) = f(\boldsymbol{x}_i), \qquad i = 1, \dots, n_j. \tag{2.3}$$
Note that if the local approximants satisfy the interpolation conditions (2.3), then the global approximant also interpolates at each node, i.e.
$$\mathcal{I}(\boldsymbol{x}_i) = f(\boldsymbol{x}_i), \qquad i = 1, \dots, N.$$
Solving the $j$th interpolation problem (2.3) leads to a system of linear equations of the form
$$\sum_{k=1}^{n_j} c_k \, \phi\left(\|\boldsymbol{x}_i - \boldsymbol{x}_k\|\right) = f(\boldsymbol{x}_i), \qquad i = 1, \dots, n_j,$$
or simply
$$A \boldsymbol{c} = \boldsymbol{f},$$
where $A = \left(\phi(\|\boldsymbol{x}_i - \boldsymbol{x}_k\|)\right)_{i,k=1}^{n_j}$, $\boldsymbol{c} = (c_1, \dots, c_{n_j})^T$ and $\boldsymbol{f} = (f_1, \dots, f_{n_j})^T$.
In particular, the interpolation problem is well-posed, i.e., a solution to the problem exists and is unique, if and only if the matrix $A$ is nonsingular. A sufficient condition for nonsingularity is that $A$ is positive definite. In fact, if $A$ is positive definite, then all its eigenvalues are positive and therefore $A$ is nonsingular (see, e.g., [15]).
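To make the local solve concrete, the following sketch (an illustration of ours, not the paper's code, assuming a Gaussian basis) assembles the matrix $A$ and solves the local system with NumPy:

```python
import numpy as np

def local_rbf_interpolant(nodes, values, phi):
    """Assemble A_ik = phi(||x_i - x_k||) for the nodes of one subdomain
    and solve the local interpolation system A c = f of (2.3)."""
    diff = nodes[:, None, :] - nodes[None, :, :]
    A = phi(np.sqrt((diff ** 2).sum(axis=-1)))
    return np.linalg.solve(A, values)

def evaluate_rbf(x, nodes, coeffs, phi):
    """Evaluate R_j(x) = sum_k c_k phi(||x - x_k||)."""
    r = np.sqrt(((x - nodes) ** 2).sum(axis=-1))
    return coeffs @ phi(r)

# Gaussian basis: its interpolation matrix is positive definite, so the
# local problem is well-posed.  The shape parameter 5.0 is arbitrary.
gaussian = lambda r: np.exp(-(5.0 * r) ** 2)

rng = np.random.default_rng(0)
X = rng.random((20, 3))            # nodes falling in one subdomain
f = np.sin(X.sum(axis=1))          # sampled data values
c = local_rbf_interpolant(X, f, gaussian)
# The interpolation conditions R_j(x_i) = f(x_i) hold at every node.
assert np.allclose(evaluate_rbf(X[0], X, c, gaussian), f[0])
```

Since the matrix is positive definite, a Cholesky factorization could replace the generic solver; the plain `solve` keeps the sketch minimal.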
Though the theory of RBFs is considered here, for brevity we do not report basic definitions and theorems, referring to [6, 15, 17, 26] for a more detailed analysis. We then give the following definition (see [25]).
Definition 2.1
Let $\Omega \subseteq \mathbb{R}^s$ be a bounded set. Let $\{\Omega_j\}_{j=1}^{d}$ be an open and bounded covering of $\Omega$. This means that all $\Omega_j$ are open and bounded and that $\Omega \subseteq \bigcup_{j=1}^{d} \Omega_j$. Set $\delta_j = \mathrm{diam}(\Omega_j) = \sup_{\boldsymbol{x}, \boldsymbol{y} \in \Omega_j} \|\boldsymbol{x} - \boldsymbol{y}\|$. We call a family of nonnegative functions $\{W_j\}_{j=1}^{d}$ with $W_j \in C^k(\mathbb{R}^s)$ a $k$-stable partition of unity with respect to the covering $\{\Omega_j\}_{j=1}^{d}$ if

(a) $\mathrm{supp}(W_j) \subseteq \Omega_j$;

(b) $\sum_{j=1}^{d} W_j \equiv 1$ on $\Omega$;

(c) for every $\beta \in \mathbb{N}_0^s$ with $|\beta| \leq k$ there exists a constant $C_\beta > 0$ such that
$$\left\| D^\beta W_j \right\|_{L^\infty(\Omega_j)} \leq C_\beta / \delta_j^{|\beta|}$$
for all $j = 1, \dots, d$.
In agreement with the statements in [25], we require additional regularity assumptions on the covering $\{\Omega_j\}_{j=1}^{d}$.
Definition 2.2
Suppose that $\Omega \subseteq \mathbb{R}^s$ is bounded and $\mathcal{X}_N = \{\boldsymbol{x}_i,\ i = 1, \dots, N\} \subseteq \Omega$ is given. An open and bounded covering $\{\Omega_j\}_{j=1}^{d}$ is called regular for $(\Omega, \mathcal{X}_N)$ if the following properties are satisfied:

(a) for each $\boldsymbol{x} \in \Omega$, the number of subdomains $\Omega_j$ with $\boldsymbol{x} \in \Omega_j$ is bounded by a global constant $C$;

(b) each subdomain $\Omega_j$ satisfies an interior cone condition;

(c) the local fill distances $h_{\mathcal{X}_j, \Omega_j}$, where $\mathcal{X}_j = \mathcal{X}_N \cap \Omega_j$, are uniformly bounded by the global fill distance $h_{\mathcal{X}_N, \Omega}$, i.e.
$$h_{\mathcal{X}_j, \Omega_j} \leq C \, h_{\mathcal{X}_N, \Omega}, \qquad j = 1, \dots, d.$$
Property (a) is required to ensure that the sum in (2.2) is actually a sum over at most $C$ summands. Since $C$ is independent of $N$, unlike $d$, which should be proportional to $N$, this is essential to avoid losing convergence orders. It is crucial for an efficient evaluation of the global interpolant that only a constant number of local approximants has to be evaluated; to this end, it should be possible to locate those indices in constant time. Properties (b) and (c) are important for employing the error estimates on RBF interpolants (see [26]).
Moreover, we are able to formulate the following theorem, which yields the polynomial precision and controls the growth of the error estimates, denoting by $\pi_m(\mathbb{R}^s)$ the set of polynomials of degree at most $m$ (see, e.g., [26]).
Theorem 2.3
Suppose that $\Omega \subseteq \mathbb{R}^s$ is compact and satisfies an interior cone condition with angle $\theta \in (0, \pi/2)$ and radius $\gamma > 0$. Let $m \in \mathbb{N}$ be fixed. Then there exist constants $h_0, C_1, C_2 > 0$ depending only on $s$, $\theta$, $\gamma$ and $m$ such that, for all $\mathcal{X}_N = \{\boldsymbol{x}_i,\ i = 1, \dots, N\} \subseteq \Omega$ with $h_{\mathcal{X}_N, \Omega} \leq h_0$ and all $\boldsymbol{x} \in \Omega$, there exist functions $\tilde{u}_i: \Omega \rightarrow \mathbb{R}$, $i = 1, \dots, N$, such that

(a) $\sum_{i=1}^{N} \tilde{u}_i(\boldsymbol{x}) \, p(\boldsymbol{x}_i) = p(\boldsymbol{x})$, for all $p \in \pi_m(\mathbb{R}^s)$;

(b) $\sum_{i=1}^{N} |\tilde{u}_i(\boldsymbol{x})| \leq C_1$;

(c) $\tilde{u}_i(\boldsymbol{x}) = 0$, provided that $\|\boldsymbol{x} - \boldsymbol{x}_i\| > C_2 \, h_{\mathcal{X}_N, \Omega}$.
Therefore, after defining the space $C^k_\nu(\mathbb{R}^s)$ of all functions $f \in C^k(\mathbb{R}^s)$ whose derivatives of order $|\beta| = k$ satisfy $D^\beta f(\boldsymbol{x}) = \mathcal{O}(\|\boldsymbol{x}\|^\nu)$ for $\|\boldsymbol{x}\| \rightarrow 0$, we consider the following convergence result (see, e.g., [15, 26]).
Theorem 2.4
Let $\Omega \subseteq \mathbb{R}^s$ be open and bounded and suppose that $\mathcal{X}_N = \{\boldsymbol{x}_i,\ i = 1, \dots, N\} \subseteq \Omega$. Let $\phi \in C^k_\nu(\mathbb{R}^s)$ be a strictly conditionally positive definite function of order $m$. Let $\{\Omega_j\}_{j=1}^{d}$ be a regular covering for $(\Omega, \mathcal{X}_N)$ and let $\{W_j\}_{j=1}^{d}$ be $k$-stable for $\{\Omega_j\}_{j=1}^{d}$. Then the error between $f \in \mathcal{N}_\phi(\Omega)$, where $\mathcal{N}_\phi$ is the native space of $\phi$, and its partition of unity interpolant (2.2) can be bounded by
$$\left| D^\beta f(\boldsymbol{x}) - D^\beta \mathcal{I}(\boldsymbol{x}) \right| \leq C \, h_{\mathcal{X}_N, \Omega}^{(k + \nu)/2 - |\beta|} \, |f|_{\mathcal{N}_\phi(\Omega)},$$
for all $\boldsymbol{x} \in \Omega$ and all $|\beta| \leq k/2$.
Comparing this convergence result with the global error estimates (see, e.g., [26]), we note that the partition of unity preserves the local approximation order for the global fit. This means that we can efficiently compute large RBF interpolants by solving small RBF interpolation problems (possibly in parallel) and then gluing them together with the global partition of unity $\{W_j\}_{j=1}^{d}$. In other words, the partition of unity approach is a simple and effective technique for decomposing a large problem into many small problems, while at the same time ensuring that the accuracy obtained for the local fits is carried over to the global one. In particular, the partition of unity method can be thought of as a Shepard-type interpolation with higher-order data, since local approximations instead of data values are used.
Finally, we remark that, among the several possible weight functions in (2.2), a possible choice is given by Shepard's weight
$$W_j(\boldsymbol{x}) = \frac{\varphi_j(\boldsymbol{x})}{\sum_{k=1}^{d} \varphi_k(\boldsymbol{x})}, \qquad j = 1, \dots, d, \tag{2.4}$$
where $\varphi_j(\boldsymbol{x}) = \|\boldsymbol{x} - \bar{\boldsymbol{x}}_j\|^{-1}$ is the inverse of the Euclidean norm $\|\boldsymbol{x} - \bar{\boldsymbol{x}}_j\|$, $\bar{\boldsymbol{x}}_j$ being the centre of the subdomain $\Omega_j$. It constitutes a partition of unity as in (2.1).
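A weight of the form (2.4) is easy to evaluate in a vectorized way. The sketch below is our illustration, not the paper's code; note that it uses a locally supported inverse-distance bump (a Franke-Little style variant, our assumption) so that only the subdomains actually containing the point contribute, as the partition of unity framework requires:

```python
import numpy as np

def shepard_weights(x, centres, delta):
    """Shepard-type weights in the spirit of (2.4): W_j = phi_j / sum_k phi_k.
    phi_j is a locally supported inverse-distance bump (our assumption),
    vanishing outside the ball of centre c_j and radius delta."""
    r = np.sqrt(((x - centres) ** 2).sum(axis=-1))
    phi = (np.maximum(delta - r, 0.0) / (delta * (r + 1e-14))) ** 2
    return phi / phi.sum()

# Four subdomain centres; the point x = (0.3, 0.3, 0.3) lies inside
# three of the four balls of radius 0.5.
centres = np.array([[0.25, 0.25, 0.25], [0.75, 0.25, 0.25],
                    [0.25, 0.75, 0.25], [0.75, 0.75, 0.75]])
w = shepard_weights(np.array([0.3, 0.3, 0.3]), centres, delta=0.5)
assert np.isclose(w.sum(), 1.0)    # the weights form a partition of unity
assert w.argmax() == 0             # the nearest centre carries most weight
assert w[3] == 0.0                 # the far subdomain does not contribute
```

The normalization guarantees the sum-to-one property (2.1) wherever the point is covered by at least one subdomain.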
3 Cube-partition algorithm
In this section we propose a new algorithm for trivariate interpolation of large scattered data sets lying in the domain $\Omega = [0, 1]^3$. This algorithm, which is based on the partition of unity method for constructing a global interpolant by blending RBFs as local approximants and using locally supported weight functions, is efficiently implemented and optimized by connecting the interpolation method with an effective cube-partition searching procedure. More precisely, the considered approach is characterized by the construction of a cube-based structure, which partitions the domain in cubes and strictly depends on the size of its subdomains. This technique is a direct extension to the three-dimensional case of the square-partition searching procedure presented in [10] for bivariate interpolation, which we briefly recall in Subsection 3.1.
Note that the paper [10] follows preceding works, where efficient searching procedures based on the partition of the domain in strips or spherical zones are considered (see [1, 7, 8, 10]).
3.1 Review of the 2D square-partition searching procedure
The construction of the 2D searching procedure described in [10] is obtained by making a partition of the bivariate domain in square cells, which are achieved by generating two orthogonal families of parallel strips (see Figure 1). This approach is combined with the repeated use of a quicksort routine with respect to different directions. At first, we sort all the points along the $y$-axis, thus constructing a first family of strips parallel to the $x$-axis. Afterwards, we order the points contained in each strip with respect to the $x$-direction, and finally we build the second family of strips parallel to the $y$-axis. The outcome is a square-based structure, which allows us to pass from unordered to ordered data structures. Following this idea, we can suitably split up the original data set into ordered and well-organized data subsets. More precisely, we may act as follows:

organize all the data by means of a quicksort procedure applied along the $y$-axis;

consider a first family of strips, parallel to the $x$-axis, and order the points of each strip along the $x$-direction by using a quicksort procedure;

create a second family of strips, parallel to the $y$-axis, which orthogonally intersect the first strip family, thus producing a partition of the bivariate domain in square cells (see Figure 2).
Note that a specific square cell is denoted by a double index notation in square brackets, i.e. $[k_1, k_2]$, $k_1, k_2 = 1, \dots, q$.
In order to obtain an efficient searching technique for the localization of points, we connect the interpolation method with the square-based partition structure, exploiting the data structure and the domain partition previously considered. This result is obtained by assuming that the square side is equal to the subdomain radius. Though this choice might seem trivial, in practice such an imposition means that the search for the nearby points, an essential step of local methods such as the partition of unity method, is limited at most to nine squares: the square on which the considered point lies, and the eight neighboring squares (see Figures 1 and 2). The combination of square cell and subdomain sizes constitutes an optimal choice, since it allows us to search for the closest points considering only a very small number of them, i.e. taking those points belonging to one of the nine square cells and a priori ignoring all the other ones. Finally, for all those points belonging to the first and last square cells, namely the ones located on or close to the boundary of the domain, we reduce the total number of square cells to be examined.
3.2 The 3D cube-partition searching procedure
As in the 2D case, the basic idea in constructing the 3D searching procedure comes from the repeated use of a quicksort routine with respect to (three) different directions, i.e. along the $z$-axis, the $y$-axis and the $x$-axis, enabling us to pass from unordered to ordered data structures. This process is strictly related to the construction of a partition of the domain, here the unit cube $\Omega = [0, 1]^3$, in smaller cubes. They are obtained by generating three orthogonal families of parallelepipeds, while at the same time the original data set is suitably split up into ordered and well-organized data subsets. More precisely, in order to obtain the cube-based structure and then the resulting searching procedure, we may act as follows:

organize all the data by means of a quicksort procedure applied along the $z$-axis;

consider a first family of parallelepipeds, parallel to the $xy$-plane, and order the points of each parallelepiped by using a quicksort procedure;

create a second family of parallelepipeds, parallel to the $xz$-plane, which orthogonally intersect the first family, and order the points of each parallelepiped by using a quicksort procedure;

construct a third family of parallelepipeds, parallel to the $yz$-plane, which orthogonally intersect the two previous families, thus producing a partition of $\Omega$ in cubes (see Figure 3).
Now, exploiting the data structure and the domain partition, we construct an efficient searching technique to be used in the localization of points, effectively connecting the partition of unity scheme with the cube-partition structure. This result is obtained by assuming that the cube side is equal to the subdomain radius $\delta$. From this assumption it follows that the search for the nearby points is limited at most to twenty-seven ($3^3$) cubes: the cube on which the considered point lies, and the twenty-six neighboring cubes (see Figure 4). From now on, to locate a specific cube, we use a triple index notation in square brackets, i.e. $[k_1, k_2, k_3]$, $k_1, k_2, k_3 = 1, \dots, q$.
We note that the combination of cube and subdomain sizes provides an optimal choice, since it allows us to search for the closest points considering only a very small number of them (that is, only those points belonging to one of the twenty-seven cubes) and a priori ignoring all the other points of $\Omega$. Obviously, for all those points belonging to cubes close to the boundary of $\Omega$, a reduction of the total number of cubes to be examined will be required. Further details on this searching procedure are contained in Subsection 3.3, where we give a detailed description of the proposed algorithm.
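The twenty-seven-cube search can be sketched as follows; this is a minimal illustration with hypothetical names (and a direct integer bucketing in place of the sorting passes described above), not the paper's C/C++ implementation:

```python
import numpy as np

def build_cubes(points, q):
    """Assign each point of the unit cube to one of the q^3 cubes of side
    1/q; returns a dict mapping a 0-based index (k1, k2, k3) to the list
    of point indices falling in that cube."""
    idx = np.minimum((points * q).astype(int), q - 1)  # clamp coords equal to 1
    cubes = {}
    for i, key in enumerate(map(tuple, idx)):
        cubes.setdefault(key, []).append(i)
    return cubes

def neighbourhood(key, q):
    """The (at most) 27 cubes to examine: the cube itself plus its
    neighbours, trimmed at the boundary of the domain."""
    k1, k2, k3 = key
    return [(a, b, c)
            for a in range(max(k1 - 1, 0), min(k1 + 1, q - 1) + 1)
            for b in range(max(k2 - 1, 0), min(k2 + 1, q - 1) + 1)
            for c in range(max(k3 - 1, 0), min(k3 + 1, q - 1) + 1)]

def points_in_ball(centre, delta, points, cubes, q):
    """Gather candidates from the neighbouring cubes only, then keep the
    points within distance delta of the centre."""
    key = tuple(np.minimum((np.asarray(centre) * q).astype(int), q - 1))
    cand = np.array([i for k in neighbourhood(key, q) for i in cubes.get(k, [])],
                    dtype=int)
    d = np.sqrt(((points[cand] - centre) ** 2).sum(axis=1))
    return cand[d <= delta]

rng = np.random.default_rng(1)
P = rng.random((200, 3))       # scattered nodes in the unit cube
q = 5                          # cubes per direction
delta = 1.0 / q                # cube side equal to the subdomain radius
cubes = build_cubes(P, q)
near = points_in_ball(np.array([0.41, 0.52, 0.63]), delta, P, cubes, q)
# Because the cube side equals delta, the 27-cube scan is exhaustive:
assert set(near.tolist()) == set(
    np.where(np.sqrt(((P - np.array([0.41, 0.52, 0.63])) ** 2).sum(axis=1))
             <= delta)[0].tolist())
```

The final assertion checks the key property of the procedure: since the cube side equals $\delta$, every point within distance $\delta$ of the centre must lie in one of the (at most) twenty-seven examined cubes, so the restricted scan agrees with a brute-force scan over all the nodes.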
3.3 Cube algorithm
INPUT: $N$, number of data; $\mathcal{X}_N = \{\boldsymbol{x}_i,\ i = 1, \dots, N\}$, set of data points; $\mathcal{F}_N = \{f_i,\ i = 1, \dots, N\}$, set of data values; $d$, number of subdomains; $\mathcal{C}_d = \{\bar{\boldsymbol{x}}_j,\ j = 1, \dots, d\}$, set of subdomain points (centres); $M$, number of evaluation points; $\mathcal{E}_M = \{\boldsymbol{\xi}_i,\ i = 1, \dots, M\}$, set of evaluation points. OUTPUT: $\mathcal{A}_M = \{\mathcal{I}(\boldsymbol{\xi}_i),\ i = 1, \dots, M\}$, set of approximated values. Stage 1. The set $\mathcal{X}_N$ of nodes and the set $\mathcal{E}_M$ of evaluation points are ordered with respect to a common direction (e.g. the $z$-axis) by applying a quicksort procedure. Stage 2. For each subdomain point $\bar{\boldsymbol{x}}_j$, $j = 1, \dots, d$, a local spherical subdomain $\Omega_j$ is constructed, whose radius $\delta$ depends on the number $d$ of subdomains, i.e.
$$\delta = \frac{\sqrt{3}}{d^{1/3}}. \tag{3.1}$$
Although other choices are possible, this value is suitably chosen, supposing to have a nearly uniform node distribution and assuming that the ratio $N/d$ is kept roughly constant. Stage 3. A triple structure of intersecting parallelepipeds is constructed as follows:

a first family of parallelepipeds, parallel to the $xy$-plane, is considered taking
$$q = \left\lceil \frac{1}{\delta} \right\rceil, \tag{3.2}$$
and a quicksort procedure is applied to order the nodes belonging to each parallelepiped;

a second family of parallelepipeds, parallel to the $xz$-plane, is constructed and a quicksort procedure is used to order the nodes belonging to each of the resulting parallelepipeds;

a third family of parallelepipeds, parallel to the $yz$-plane, is considered.
Note that each of the three families of parallelepipeds is ordered and numbered from 1 to $q$; the choice in (3.2) follows directly from the side length of the domain, i.e. the unit cube, and the subdomain radius $\delta$. Stage 4. The unit cube is partitioned by a cube-based structure consisting of $q^3$ cubes, whose side length is $\delta$. Then, the sets $\mathcal{X}_N$, $\mathcal{C}_d$ and $\mathcal{E}_M$ are partitioned by the cube structure into subsets $\mathcal{X}_{N_k}$, $\mathcal{C}_{d_k}$ and $\mathcal{E}_{M_k}$, $k = 1, \dots, q^3$, where $N_k$, $d_k$ and $M_k$ are the numbers of points in the $k$th cube.
This stage can be summarized in Algorithm 1.
Stage 5. In order to identify the cubes to be examined in the searching procedure, we adopt the following rule, which is composed of three steps:

the cube side is chosen equal to the subdomain radius $\delta$, and the ratio between these quantities is denoted by $i^*$ (here $i^* = 1$);

the value $i^*$ provides the number of cubes to be examined for each point by the rule $(2 i^* + 1)^3$, which obviously here gives $27$. In practice, this means that the search for the nearby points is limited at most to twenty-seven cubes: the cube on which the considered point lies, and the twenty-six neighboring cubes;

for each cube $[k_1, k_2, k_3]$, $k_1, k_2, k_3 = 1, \dots, q$, a cube-partition searching procedure is considered, examining the points from the cube $[k_1 - 1, k_2 - 1, k_3 - 1]$ to the cube $[k_1 + 1, k_2 + 1, k_3 + 1]$. For the points of the first and last cubes (those close to the boundary of the unit cube), we reduce the total number of cubes to be examined, setting $k_1 - 1 = 1$ and/or $k_2 - 1 = 1$ and/or $k_3 - 1 = 1$ (when $k_1 = 1$ and/or $k_2 = 1$ and/or $k_3 = 1$), and $k_1 + 1 = q$ and/or $k_2 + 1 = q$ and/or $k_3 + 1 = q$ (when $k_1 = q$ and/or $k_2 = q$ and/or $k_3 = q$).
Then, after defining which and how many cubes are to be examined, the cube-partition searching procedure (see Algorithm 2) is applied:

for each subdomain point $\bar{\boldsymbol{x}}_j$ of $\mathcal{C}_d$, $j = 1, \dots, d$, to determine all the nodes belonging to the subdomain $\Omega_j$. The number of nodes of the subdomain centred at $\bar{\boldsymbol{x}}_j$ is counted and stored in $n_j$, $j = 1, \dots, d$;

for each evaluation point $\boldsymbol{\xi}_i$ of $\mathcal{E}_M$, $i = 1, \dots, M$, in order to find all the subdomains of centre $\bar{\boldsymbol{x}}_j$ and radius $\delta$ containing it. The number of subdomains containing the $i$th evaluation point is counted and stored in $r_i$, $i = 1, \dots, M$.
Stage 6. A local RBF interpolant $R_j$, $j = 1, \dots, d$, is found for each subdomain point. Stage 7. The local approximants $R_j(\boldsymbol{\xi}_i)$ and the weight functions $W_j(\boldsymbol{\xi}_i)$, $j = 1, \dots, d$, are computed for each evaluation point. Stage 8. Applying the global interpolant (2.2), the approximated values $\mathcal{I}(\boldsymbol{\xi}_i)$, $i = 1, \dots, M$, are computed at the evaluation points.
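Stages 5 to 8 can be condensed into a short end-to-end sketch. This is an illustration of ours under simplifying assumptions: Gaussian local interpolants, locally supported Shepard-type weights, and a brute-force distance scan standing in for the cube-partition search:

```python
import numpy as np

def pu_interpolate(nodes, values, centres, delta, eval_pts, alpha=8.0):
    """Minimal partition of unity interpolation: solve one Gaussian RBF
    system per subdomain, then blend the local fits with locally
    supported Shepard-type weights."""
    phi = lambda r: np.exp(-(alpha * r) ** 2)
    # Localization and local interpolants (a brute-force scan stands in
    # here for the cube-partition search of Stage 5).
    local = []
    for cen in centres:
        mask = np.sqrt(((nodes - cen) ** 2).sum(axis=1)) <= delta
        Xj, fj = nodes[mask], values[mask]
        D = np.sqrt(((Xj[:, None] - Xj[None, :]) ** 2).sum(-1))
        local.append((Xj, np.linalg.solve(phi(D), fj)))
    # Evaluate the weights and blend the local fits (Stages 7-8).
    out = np.zeros(len(eval_pts))
    for m, x in enumerate(eval_pts):
        r = np.sqrt(((x - centres) ** 2).sum(axis=1))
        w = (np.maximum(delta - r, 0.0) / (delta * (r + 1e-14))) ** 2
        w /= w.sum()
        for j in np.nonzero(w)[0]:
            Xj, cj = local[j]
            out[m] += w[j] * (cj @ phi(np.sqrt(((x - Xj) ** 2).sum(axis=1))))
    return out

# Scattered nodes, sampled values, and a coarse covering of the unit cube.
rng = np.random.default_rng(2)
X = rng.random((150, 3))
f = np.sin(np.pi * X[:, 0]) * X[:, 1] + X[:, 2]
C = np.array([[a, b, c] for a in (0.25, 0.75)
                        for b in (0.25, 0.75) for c in (0.25, 0.75)])
vals = pu_interpolate(X, f, C, delta=0.45, eval_pts=X[:5])
# The global fit inherits the interpolation property of the local fits.
assert np.max(np.abs(vals - f[:5])) < 1e-6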
3.4 Complexity analysis
The algorithm is based on the construction of a cube-partition searching procedure, which enables us to efficiently determine all the points belonging to each subdomain $\Omega_j$, $j = 1, \dots, d$, so that we can compute the local RBF interpolants to be used in the partition of unity scheme. Assuming that the covering $\{\Omega_j\}_{j=1}^{d}$ is regular and local and that the set of data points is quasi-uniform, we analyze the complexity of this code.
The cube-partition algorithm involves the use of the standard quicksort routine, which on average requires a time complexity $\mathcal{O}(n \log n)$, where $n$ is the number of points to be sorted. Specifically, we have a distribution phase consisting of building the data structure, in which the computational cost is of order $\mathcal{O}(N \log N)$ for the sorting of all nodes and $\mathcal{O}(M \log M)$ for the sorting of all evaluation points in Stage 1. Then, in Stage 3 the quicksort routine is repeatedly used with respect to different directions, considering a reduced number of points (see Subsections 3.2 and 3.3). Since the number of centres in each subdomain is bounded by a constant (see Definition 2.2), we need $\mathcal{O}(1)$ space and time for each subdomain to solve the local RBF interpolation problems. In fact, in order to obtain the local RBF interpolants, we have to solve linear systems of (relatively) small sizes, i.e. $n_j \times n_j$, with $n_j \ll N$, thus requiring a constant running time $\mathcal{O}(1)$, $j = 1, \dots, d$, for each subdomain (see Stage 6). Then, in Stages 5, 7 and 8 we also need a constant cost $\mathcal{O}(1)$ for each evaluation point of $\mathcal{E}_M$; in other words, we have a constant time to get the value of the global fit (2.2). Finally, the algorithm requires $\mathcal{O}(N)$, $\mathcal{O}(d)$ and $\mathcal{O}(M)$ storage for the data, and $\mathcal{O}(n_j)$, $j = 1, \dots, d$, locations for the coefficients of each local RBF interpolant.
4 Numerical experiments
In this section we present a few numerical tests to show the performance of the cube-partition algorithm, numerically analyzing the efficiency and accuracy of the local interpolation scheme on some sets of scattered data. The code is implemented in C/C++, while the numerical results are carried out on an Intel Core i7-4500U 1.8 GHz processor. In the experiments we consider node distributions of increasing size consisting of Halton nodes, generated by using the MATLAB program haltonseq.m (see [15]). The cube-partition algorithm is run considering $d$ subdomain points and $M$ evaluation (or grid) points, which are contained in the unit cube $\Omega = [0, 1]^3$. Here, for the global interpolant (2.2) we use Shepard's weight (2.4).
The performance of the interpolation algorithm is verified by taking the data values from two trivariate Franke-type test functions (see, e.g., [19, 22]), and by using the Gaussian (G), Matérn (M4) and Wendland (W4) functions as local RBF interpolants, i.e.
$$\phi_G(r) = e^{-\alpha^2 r^2}, \qquad \phi_{M4}(r) = e^{-\beta r} \left(\beta^2 r^2 + 3 \beta r + 3\right), \qquad \phi_{W4}(r) = \left(1 - \gamma r\right)_+^6 \left(35 \gamma^2 r^2 + 18 \gamma r + 3\right),$$
where $\alpha$, $\beta$ and $\gamma$ are the shape parameters, $r$ is the Euclidean distance, and $(\cdot)_+$ denotes the truncated power function. Note that the Gaussian and Matérn functions are globally supported basis functions, whereas the Wendland function is compactly supported (see [26]).
Some information about the execution of the interpolation algorithm described in Section 3 is reported in Table 1, namely the number of cubes partitioning the domain and the CPU times (in seconds) obtained by running the cube-partition algorithm. Moreover, since we are interested in pointing out the effectiveness of the proposed algorithm, in Table 1 we also show the CPU times obtained by using the same interpolation method, but without partitioning the domain in cubes and, accordingly, without the corresponding searching procedure. This analysis emphasizes that the use of a cube structure gives a considerable saving of time, mainly when the number of points to be handled becomes very large.
[Table 1: number of cubes and CPU times (in seconds) with and without the cube-partition searching procedure; the times with the cube structure are 1.1, 7.9 and 62.7 for the three data sets of increasing size.]
Analyzing the performance of the algorithm, we observe that the cube-partition searching procedure turns out to be powerful and efficient, since the CPU times reported in Table 1 are mainly due to the solution of linear systems having matrices of relatively large size, usually more than a hundred rows and columns.
Now, in order to investigate the accuracy of the method, we compute the root mean square error (RMSE), whose formula is
$$\mathrm{RMSE} = \sqrt{\frac{1}{M} \sum_{i=1}^{M} \left| f(\boldsymbol{\xi}_i) - \mathcal{I}(\boldsymbol{\xi}_i) \right|^2},$$
analyzing its behavior by varying the values of the shape parameters $\alpha$, $\beta$ and $\gamma$ of the Gaussian, Matérn and Wendland functions (see Figure 5). These graphs allow us to find the optimal values of $\alpha$, $\beta$ and $\gamma$, i.e. those values for which we obtain the smallest RMSEs (see Tables 2 and 3). Note that each evaluation is carried out by choosing equispaced values of the shape parameters in a fixed interval. Analyzing the error tables and graphs, we can see that the Matérn and Wendland functions have a greater stability than the Gaussian RBF, but the latter gives us a greater accuracy, although its interpolation matrices might be subject to ill-conditioning problems for small values of $\alpha$. This behavior is what we expect from a theoretical standpoint, but here it is validated by numerical tests. Moreover, we remark that several numerical experiments (not reported here for brevity) have been carried out using other test functions, and the results show a uniform behavior.
[Tables 2 and 3: RMSEs obtained with the G, M4 and W4 basis functions for the two test functions.]
Finally, to show that the CPU times in Table 1 essentially depend on the size of the interpolation matrices, we repeat the numerical tests fixing a maximum number of nodes for each subdomain, namely only considering the nodes closest to the subdomain centres. In this way we get a significant reduction of the execution times with respect to those reported in Table 1. Nevertheless, this reduction of CPU times is paid, in general, with only a slight loss of accuracy, since the behavior of the RMSEs is similar to that shown in Figure 5.
In conclusion, in Table 4 we also report the RMSEs obtained by applying the cube-partition algorithm on sets of grid points.
[Table 4: RMSEs of the cube-partition algorithm with the G, M4 and W4 basis functions on sets of grid points.]
5 Conclusions and future work
In this paper we propose a new local algorithm for trivariate interpolation of scattered data points. It is based on the construction of a partition of the domain in cubes, enabling us to optimally implement a cube-partition searching procedure in order to efficiently detect the nodes belonging to each subdomain of the partition of unity method. This technique works well and quickly even when the amount of data to be interpolated is very large. Moreover, the proposed algorithm is flexible, since different choices of local interpolants are allowed, and completely automatic.
As regards future work, we are interested in refining the cube algorithm by adopting suitable data structures such as kd-trees and range trees, connecting these data structures with the special partition of the domain in cubes. Moreover, we are going to extend the proposed algorithm to higher dimensions. Then, even though the choice of low-order basis functions such as the Matérn and Wendland functions gives a good trade-off between stability and accuracy, we still need to deal with the ill-conditioning problem of high-order basis functions. On the one hand, we might consider suitable preconditioning techniques for RBF interpolation matrices, as already done in [9] for RBF collocation matrices; on the other hand, one could study alternative strategies for a stable evaluation of interpolants via the Hilbert-Schmidt SVD as in [12, 16], or new stable bases as in [14, 21].
References
 [1] G. Allasia, R. Besenghi, R. Cavoretto, and A. De Rossi, Scattered and track data interpolation using an efficient strip searching procedure, Appl. Math. Comput., 217 (2011), pp. 5949–5966.
 [2] I. Babuška and J. M. Melenk, The partition of unity method, Internat. J. Numer. Methods Engrg., 40 (1997), pp. 727–758.
 [3] R. K. Beatson, W. A. Light, and S. Billings, Fast solution of the radial basis function interpolation equations: Domain decomposition methods, SIAM J. Sci. Comput., 22 (2000), pp. 1717–1740.
 [4] M. W. Berry and K. S. Minser, Algorithm 798: Highdimensional interpolation using the modified Shepard method, ACM Trans. Math. Software, 25 (1999), pp. 353–366.
 [5] M. Bozzini and M. Rossini, Testing methods for 3D scattered data interpolation, Monogr. Real Acad. Ci. Exact. Fis.Quim. Nat. Zaragoza, 20 (2002), pp. 111–135.
 [6] M. D. Buhmann, Radial Basis Functions: Theory and Implementations, Cambridge Monogr. Appl. Comput. Math., vol. 12, Cambridge Univ. Press, Cambridge, 2003.
 [7] R. Cavoretto and A. De Rossi, Fast and accurate interpolation of large scattered data sets on the sphere, J. Comput. Appl. Math., 234 (2010), pp. 1505–1521.
 [8] R. Cavoretto and A. De Rossi, Spherical interpolation using the partition of unity method: An efficient and flexible algorithm, Appl. Math. Lett., 25 (2012), pp. 1251–1256.
 [9] R. Cavoretto, A. De Rossi, M. Donatelli, and S. SerraCapizzano, Spectral analysis and preconditioning techniques for radial basis function collocation matrices, Numer. Linear Algebra Appl., 19 (2012), pp. 31–52.
 [10] R. Cavoretto and A. De Rossi, A meshless interpolation algorithm using a cellbased searching procedure, Comput. Math. Appl., 67 (2014), pp. 1024–1038.
 [11] R. Cavoretto, A numerical algorithm for multidimensional modeling of scattered data points, to appear in Comput. Appl. Math., (2014).
 [12] R. Cavoretto, G. E. Fasshauer, and M. McCourt, An introduction to the HilbertSchmidt SVD using iterated Brownian bridge kernels, to appear in Numer. Algorithms, (2014).
 [13] J. Cherrie, R. Beatson, and G. Newsam, Fast evaluation of radial basis functions: Methods for generalized multiquadrics in $\mathbb{R}^n$, SIAM J. Sci. Comput., 23 (2002), pp. 1549–1571.
 [14] S. De Marchi and G. Santin, A new stable basis for radial basis function interpolation, J. Comput. Appl. Math., 253 (2013), pp. 1–13.
 [15] G. E. Fasshauer, Meshfree Approximation Methods with MATLAB, World Scientific Publishers, River Edge, NJ, 2007.
 [16] G. E. Fasshauer and M. J. McCourt, Stable evaluation of Gaussian radial basis function interpolants, SIAM J. Sci. Comput., 34 (2012), pp. A737–A762.
 [17] A. Iske, Scattered data approximation by positive definite kernel functions, Rend. Sem. Mat. Univ. Pol. Torino, 69 (2011), pp. 217–246.
 [18] M. A. Iyer, L. T. Watson, and M. W. Berry, SHEPPACK: A Fortran 95 package for interpolation using the modified Shepard algorithm, in Proceedings of the Annual Southeast Conference, R. Menezes et al., eds., ACM, New York, 2006, pp. 476–481.
 [19] D. Lazzaro and L. B. Montefusco, Radial basis functions for the multivariate interpolation of large scattered data sets, J. Comput. Appl. Math., 140 (2002), pp. 521–536.
 [20] J. M. Melenk and I. Babuška, The partition of unity finite element method: basic theory and applications, Comput. Methods Appl. Mech. Engrg., 139 (1996), pp. 289–314.
 [21] M. Pazouki and R. Schaback, Bases for kernelbased spaces, J. Comput. Appl. Math., 236 (2011), pp. 575–588.
 [22] R. J. Renka, Multivariate interpolation of large sets of scattered data, ACM Trans. Math. Software, 14 (1988), pp. 139–148.
 [23] R. J. Renka, Algorithm 661: QSHEP3D: Quadratic Shepard method for trivariate interpolation of scattered data, ACM Trans. Math. Software, 14 (1988), pp. 151–152.
 [24] W. I. Thacker, J. Zhang, L. T. Watson, J. B. Birch, M. A. Iyer, and M. W. Berry, Algorithm 905: SHEPPACK: Modified Shepard algorithm for interpolation of scattered multivariate data, ACM Trans. Math. Software, 37 (2010), Art. 34, pp. 1–20.
 [25] H. Wendland, Fast evaluation of radial basis functions: Methods based on partition of unity, in Approximation Theory X: Wavelets, Splines, and Applications, C. K. Chui, L. L. Schumaker, J. Stöckler, eds., Vanderbilt Univ. Press, Nashville, TN, 2002, pp. 473–483.
 [26] H. Wendland, Scattered Data Approximation, Cambridge Monogr. Appl. Comput. Math., vol. 17, Cambridge Univ. Press, Cambridge, 2005.