A Nearly Optimal Multigrid Method for General Unstructured Grids^{†}^{†}thanks: This work was supported by NSF DOE DESC0006903, DMS1217142 and Center for Computational Mathematics and Applications at Penn State.
Abstract
In this paper, we develop a multigrid method on unstructured shaperegular grids. For a general shaperegular unstructured grid of elements, we present a construction of an auxiliary coarse grid hierarchy on which a geometric multigrid method can be applied together with a smoothing on the original grid by using the auxiliary space preconditioning technique. Such a construction is realized by a cluster tree which can be obtained in operations for a grid of elements. This tree structure in turn is used for the definition of the grid hierarchy from coarse to fine. For the constructed grid hierarchy we prove that the convergence rate of the multigrid preconditioned CG for an elliptic PDE is . Numerical experiments confirm the theoretical bounds and show that the total complexity is in .
Key words: clustering, multigrid, auxiliary space, finite elements
1 Introduction
We consider a sparse linear system
(1) 
that arises from the discretization of an elliptic partial differential equation. In recent decades, multigrid (MG) methods have been well established as one of the most efficient iterative solvers for (1). Moreover, intensive research has been done to analyze the convergence of MG. In particular, it can be proven that the geometric multigrid (GMG) method has linear complexity in terms of computational and memory complexity for a large class of elliptic boundary value problems.
Roughly speaking, there are two different types of theories that have been developed for the convergence of GMG. For the first kind theory that makes critical use of elliptic regularity of the underlying partial differential equations as well as approximation and inverse properties of the discrete hierarchy of grids, we refer to Bank and Dupont [1], Braess and Hackbusch [4], Hackbusch[26], and Bramble and Pasciak[5]. The second kind of theory makes minor or no elliptic regularity assumption, we refer to Yserentant [51], Bramble, Pasciak and Xu [7], Bramble, Pasciak, Wang and Xu [6], Xu [45, 46] and Yserentant [52], and Chen, Nochetto and Xu [15, 49].
The GMG method, however, relies on a given hierarchy of geometric grids. Such a hierarchy of grids is sometimes naturally available, for example, due to an adaptive grid refinement or can be obtained in some special cases by a coarsening algorithm [16]. But in most cases in practice, only a single (fine) unstructured grid is given. This makes it difficult to generate a sequence of nested meshes. To circumvent this difficulty, nonnested geometric multigrid and relevant convergence theories have been developed. One example of such kind of theory is by Bramble, Pasciak and Xu [8]. In this work, optimal convergence theories are established under the assumption that a nonnested sequence of quasiuniform meshes can be obtained. Another example is the work by Bank and Xu [2] that gives a nearly optimal convergence estimate for a hierarchical basis type method for a general shaperegular grid in two dimensions. This theory is based on nonnested geometric grids that have nested sets of nodal points from different levels.
One feature in the aforementioned MG algorithms and their theories is that the underlying multilevel finite element subspaces are not nested, which is not always desirable from both theoretical and practical points of view. To avoid the nonnestedness, many different MG techniques and theories have been explored in the literature. One such a theory was developed by Xu [47] for a seminested MG method with an unstructured but quasiuniform grid based on an auxiliary grid approach. Instead of generating a sequence of nonnested grids from the initial grid, this method is based on a single auxiliary structured grid whose size is comparable to the original quasiuniform grid. While the auxiliary grid is not nested with the original grid, it contains a natural nested hierarchy of coarse grids. Under the assumption that the original grid is quasiuniform, an optimal convergence theory was developed in [47] for second order elliptic boundary problems with Dirichlet boundary conditions.
Totally nested multigrid methods can also be obtained for general unstructured grids, for example, algebraic multigrid (AMG) methods. Most AMG methods, although their derivations are purely algebraic in nature, can be interpreted as nested MG when they are applied to finite element systems based on a geometric grid. AMG methods are usually very robust and converge quickly for Poissonlike problems [9, 34]. There are many different types of AMG methods: the classical AMG [36, 10], smoothed aggregation AMG [40, 43, 12], AMGe [30, 31], unsmoothed aggregation AMG [14, 3] and many others. Highly efficient sequential and parallel implementations are also available for both CPU and GPU systems [27, 11, 44]. AMG methods have been demonstrated to be one of the most efficient solvers for many practical problems [39]. Despite of the great success in practical applications, AMG still lacks solid theoretical justifications for these algorithms except for twolevel theories [41, 43, 37, 38, 19, 18, 13]. For a truly multilevel theory, using the theoretical framework developed in [6, 46], Vaněk, Mandel, and Brezina [42] provide a theoretical bound for the smoothed aggregation AMG under some assumption about the aggregations. Such an assumption has been recently investigated in [13] for aggregations that are controlled by auxiliary grids that are similar to those used in [47].
The aim of this paper is to extend the algorithm and theory in Xu [47] to shape regular grids that are not necessarily quasiuniform. The lack of quasiuniformity of the original grid makes the extension nontrivial for both the algorithm and the theory. First, it is difficult to construct auxiliary hierarchical grids without increasing the grid complexity, especially for grids on complicated domains. The way we construct the hierarchical structure is to generate a cluster tree, based on the geometric information of the original grid [22, 24, 23, 21]. This auxiliary cluster tree has also been used as a coarsening process of the UAAMG [44]. Secondly, it is also not straightforward to establish optimal convergence for the geometric multigrid applied to hierarchy of auxiliary grids that can be highly locally refined.
The rest of the paper is organized as follows. In §2, we discuss some basic assumptions on the given triangulation and review multigrid theories and the auxiliary space method. An abstract analysis is provided based on four assumptions. In §3 we introduce the detailed construction of the structured auxiliary space by an auxiliary cluster tree and an improved treatment of the boundary region for Neumann boundary conditions. In §4, we describe the auxiliary space multgrid preconditioner (ASMG) and estimate the condition number by verifying the assumptions of the abstract theory. Finally, in §5, we provide some numerical examples to verify our theory.
For simplicity of exposition, the algorithm and theory are presented in this paper for the two dimensional case by using a quadtree. They can be generalized for the three dimensional case without intrinsic difficulties by using an octree.
2 Preliminaries
We present a multigrid method for second order elliptic problems on a complicated domain (). Let be the initial Hilbert space with inner product and energy norm . The variational problem we want to solve is
(2) 
In order to simplify the presentation, we restrict ourselves to the Poisson problem discretized by piecewise linear finite elements. Assume that the nodal basis functions of are and the local elements supporting patch of is where is the node index set. For any , there exists such that . In this case, the system matrix has entries of the form
with basis functions that are continuous and piecewise affine on triangles of the triangulation of the polygonal and connected domain ,
2.1 Properties of the triangulation
The triangulation is assumed to be conforming and shaperegular in the sense that the ratio of the circumcircle and inscribed circle is bounded uniformly [17], and it is a Kmesh in the sense that the ratio of diameters between neighboring elements is bounded uniformly. All elements are assumed to be shaperegular but not necessarily quasiuniform, so the diameters can vary globally and allow a strong local refinement.
The vertices of the triangulation are denoted by . Some of the vertices are Dirichlet nodes, , where we impose essential Dirichlet boundary conditions, and some are Neumann vertices, , where we impose natural Neumann boundary conditions.
The following construction will be given for the case , but a generalisation to is straightforward .
For each of the triangles , we use the barycenter
where are the three vertices of the triangle , as in Figure 1.
Notation: We denote the minimal distance between the triangle barycenters of the grid by
The diameter of is denoted by
In order to prove the desired nearly linear complexity estimate, we have to assume that the refinement level of the grid is algebraically bounded in the following sense.
Assumption 1.
We assume that for a small number , e.g. .
The above assumption allows an algebraic grading towards a point but it forbids a geometric grading. The assumption is sufficient but not necessary: the construction of the auxiliary grids might still be of complexity or less if Assumption 1 is not valid, but it would require more technical assumptions in order to prove this.
2.2 Auxiliary space preconditioning theory
The auxiliary space method, developed in [33, 47], is for designing preconditioners by using the auxiliary spaces which are not necessarily subspaces of the original space. Here, the original space is the finite element space for the given grid and the preconditioner is the multigrid method on the sequence of FE spaces for the auxiliary grids .
The idea of the method is to generate an auxiliary space with inner product and energy norm . Between the spaces there is a suitable linear transfer operator , which is continuous and surjective. is the dual operator of in the default inner products
In order to solve the linear system , we require a preconditioner B defined by
(3) 
where is the smoother and is the preconditioner of .
The estimate of the condition number is given below.
Theorem 1.
Assume that there are nonnegative constants , and , such that

the smoother is bounded in the sense that
(4) 
the transfer operator is bounded,
(5) 
the transfer is stable, i.e. for all there exists and such that
(6) 
the preconditioner on the auxiliary space is optimal, i.e. for any , there exists , such that
(7)
Then, the condition number of the preconditioned system defined by (3) can be bounded by
(8) 
We also can combine the smoother S and the auxiliary grid correction multiplicatively with a preconditioner in the form [29, 28]
(9) 
which leads to Algorithm 1. The combined preconditioner, under suitable scaling assumptions performs no worse than its components.
Theorem 2.
Suppose there exists such that for all , we have
then the multiplicative preconditioner yields the bound
(10) 
for the condition number, and
(11) 
According to Theorem 1 and 2, our goal is to construct an auxiliary space in which we are able to define an efficient preconditioner. The preconditioner will be the geometric multigrid method on a suitably chosen hierarchy of auxiliary grids. Additionally, the space has to be close enough to so that the transfer from to fulfils (5) and (6). This goal is achieved by a density fitting of the finest auxiliary grid to . In order to prove (7), we use the multigrid theory for the auxiliary grids from the viewpoint of the method of subspace corrections.
2.3 An abstract multigrid theory
In the spirit of divide and conquer, we can decompose any space as the summation of subspaces . Since may not be a direct sum, the decomposition is not necessarily unique for .
We will use the following operators, for :

the projection in the inner product ;

the natural inclusion to ;

the projection in the inner product ;

the restriction of A to the subspace ;

an approximation of which means the smoother;

.
For any and , these operators fulfil the trivial equalities
Assume we know the value of , if we perform the subspace correction in a successive way, it reads in operator form as
The corresponding error equation is
Suppose that we have nested finite element spaces: and as the basis functions of . Define . Then, is the decomposition of . Then given , we have the decomposition . Similarly define as the projection from to . In this case, the successive subspace correction method for this decomposition is nothing but the simple GaussSeidel iteration. This algorithm is sometimes called the backslash () cycle. A Vcycle algorithm is obtained from the backslash cycle by performing more smoothings after the coarse grid corrections. Such an algorithm, roughly speaking, is like a backslash () cycle plus a slash () (a reversed backslash) cycle. It is simple to show that the convergence of the Vcycle is a consequence of the convergence of the backslash cycle. The detailed algorithm is given in Algorithm 2.
Now we present a convergence analysis based on three assumptions.
 (T) Contraction of Subspace Error Operator:

There exists such that
 (A1) Stable Decomposition:

For any , there exists a decomposition
 (A2) Strengthened CauchySchwarz (SCS) Inequality:

For any
The convergence theory of the method is as follows.
Theorem 3.
Let be a decomposition satisfying assumptions (A1) and (A2), and let the subspace smoothers satisfy (T). Then
3 Construction of the auxiliary gridhierarchy
In this section, we explain how to generate a hierarchy of auxiliary grids based on the given (unstructured) grid . The idea is to analyse and split the element barycenters by their geometric position regardless of the initial grid structure. Our aim is to obtain a structured hierarchy of grids that preserves some properties of the initial grid, e.g. the local mesh size. A similar idea has already been applied in [20, 32, 44].
3.1 Clustering and auxiliary boxtrees
We build an auxiliary tree structure by a geometrically regular subdivision of boxes. For the initial step we choose a (minimal) square bounding box of the domain :
Define the level of to be . Then we subdivide regularly, thus obtaining four children :
where and . The level of is , where . Finally, we apply the same subdivision process recursively, starting with and define the level of the boxes recursively (cf. Figure 2). This yields an infinite tree with root . Letting denote a box in this tree, we can define the cluster , which is a subset of , by
This yields an infinite cluster tree with root . We construct a finite cluster tree by not subdividing nodes which are below a minimal cardinality , e.g. . Define the nodes which have no child nodes as the leaf nodes. The cardinality is the number of the barycenters in . Leaves of the cluster tree contain at most indices. For any leaf node, its parent node contains at least 4 barycenters, then the total number of leaf nodes is bounded by the number of barycenters .
Remark 1.
The size of a leaf box can be much larger than the size of triangles that intersect with since a large box may only intersect with one very small element and will not be further subdivided.
Lemma 1.
Suppose Assumption 1 holds, the complexity for the construction of is .
Proof.
First, we estimate the depth of the cluster tree. Let be a node of the cluster tree and . By definition the distance between two nodes is at least
Therefore, the box has a diameter of at least . After each subdivision step the diameter of the boxes is exactly halved. Let denote the number of subdivisions after which was created. Then
Consequently, we obtain
so that by Assumption 1,
Therefore the depth of is in .
Next, we estimate the complexity for the construction of . The subdivision of a single node and corresponding box is of complexity . On each level of the tree , the nodes are disjoint, so that the subdivision of all nodes on one level is of complexity at most . For all levels this sums up to at most . ∎
Remark 2.
The boxes used in the clustering can be replaced by arbitrary shaped elements, e.g. triangles/tetrahedra or anisotropic elements — depending on the application or operator at hand. For ease of presentation we restrict ourselves to the case of boxes.
Remark 3.
The complexity of the construction can also be bounded from below by , as is the case for a uniform (structured) grid. However, this complexity arises only in the construction step and this step will typically be of negligible complexity.
Notice that the tree of boxes is not the regular grid that we need for the multigrid method. A further refinement as well as deletion of elements is necessary.
3.2 Closure of the auxiliary boxtree
The hierarchy of boxmeshes from Figure 2 is exactly what we want to construct: each box has at most one hanging node per edge, namely, the fineness of two neighbouring boxes differs by at most one level. In general this is not fulfilled.
We construct the grid hierarchy of nested uniform meshes starting from a coarse mesh consisting of only a single box , the root of the box tree. All boxes in the meshes to be constructed will either correspond to a cluster in the cluster tree or will be created by refinement of a box that corresponds to a leaf of the cluster tree.
Let be a level that is already constructed (the trivial start of the induction is given above).
We mark all elements of the mesh which are then refined regularly. Let be an arbitrary box in . The box corresponds to a cluster . The following two situations can occur:

If then is marked for refinement.

If , e.g. , then is not marked in this step.
After processing all boxes on level , it may occur that there are boxes on level that would have more than one hanging node on an edge after refinement of the marked boxes, cf. Figure 3. Since we want to avoid this, we have to perform a closure operation for all such elements and for all coarser levels .

Let be the set of all boxes on level having too many hanging nodes. All of these are marked for refinement. By construction a single refinement of each box is sufficient. However, a refinement on level might then produce too many hanging nodes in a box on level . Therefore, we have to form the lists , of boxes with too many hanging nodes successively on all levels and mark the elements.

At last we refine all boxes (on all levels) that are marked for refinement.
The result of the closure operation is depicted in Figure 4.
Each of the boxes in the closed grids lives on a unique level . It is important that a box is either refined regularly (split into four successors on the next level) or it is not refined at all. For each box that is marked in step 1, there are at most boxes marked during the closure step 3.
Lemma 2.
The complexity for the construction and storage of the (finite) box tree with boxes and corresponding cluster tree with clusters is of complexity , where is the number of barycenters, i.e., the number of triangles in the triangulation .
Proof.
For the level of the tree, let be the number of leaf boxes and be the boxes which have child boxes. Accordingly, the total number of the boxes on level is . By definition,
where and . Since we have
As a result,
The total work for generating the tree is .
Given , let denote the number of boxes in (the set of boxes that have more than 1 hanging node). Since every box in has to be a leaf box, we have . As the process of closing each hanging node will go through at most two boxes in any given level, the total number of the marked boxes in this closure process is bounded by
∎
3.3 Construction of a conforming auxiliary grid hierarchy
At last, we create a hierarchy of nested conforming triangulations by subdivision of the boxes and by discarding those boxes that lie outside the domain . For any box there can be at most one hanging node per edge. The possible situations and corresponding local closure operations are presented in Figure 5.
The closure operation introduces new elements on the next finer level.
The final hierarchy of triangular grids is nested and conforming without hanging nodes. All triangles have a minimum angle of degrees, i.e., they are shaperegular, cf. Figure 6.
The triangles in the quasiregular meshes have the following properties:

All triangles in that have children which are themselves further subdivided, are refined regularly (four congruent successors) as depicted here.

Each triangle that is subdivided but not regularly refined, has successors that will not be further subdivided.
The hierarchy of grids constructed so far covers on each level the whole box . This hierarchy has now to be adapted to the boundary of the given domain . In order to explain the construction we will consider the domain and triangulation ( triangles) from Figure 7.
The triangulation consists of shaperegular elements, it is locally refined, it contains many small inclusions, and the boundary of the domain is rather complicated.
3.4 Adaptation of the auxiliary grids to the boundary
The Dirichlet boundary: On the Dirichlet boundary we want to satisfy homogeneous boundary conditions (b.c.), i.e., (nonhomogeneous b.c. can trivially be transformed to homogeneous ones). On the given fine triangulation this is achieved by use of basis functions that fulfil the b.c. Since the auxiliary triangulations do not necessarily resolve the boundary, we have to use a slight modification.
Definition 1 (Dirichlet auxiliary grids).
We define the auxiliary triangulations by
In Figure 8 the Dirichlet auxiliary grids are formed by the blue boxes. All other elements (light green and dark green) are not used for the Dirichlet problem. On an auxiliary grid we impose homogeneous Dirichlet b.c. on the boundary
The auxiliary grids are still nested, but the area covered by the triangles grows with increasing the level number:
The Neumann boundary: On the Neumann boundary we want to satisfy natural (Neumann) b.c., i.e., . For the auxiliary triangulations , we will approximate the true b.c. by the natural b.c. on an auxiliary boundary.
Definition 2 (Neumann auxiliary grids).
Define the auxiliary triangulations by
In Figure 9 the Neumann auxiliary grids are formed by the blue boxes. All other elements (light green) are not used for the Neumann problem. On an auxiliary grid we impose natural Neumann b.c., the auxiliary grids are nonnested. The area covered by the triangles grows with decreasing level number:
Remark 4 (Mixed Dirichlet/Neumann b.c.).
The defintion of the grids for mixed boundary conditions of Dirichlet (on ) and Neumann type we use the grids
The b.c. on the auxiliary grid are of Neumann type except for neighbours of boxes where essential Dirichlet b.c. are imposed.
3.5 Near boundary correction
Since the boundaries of different levels do not coincide, the near boundary error cannot be reduced very well by the standard multigrid method for the Neumann boundary condition. So we introduce a nearboundary region where a correction for the boundary approximation will be done. The nearboundary region is defined in layers around the boundary :
Definition 3 (Nearboundary region).
We define the th nearboundary region on level of the auxiliary grids by
The idea for solving the linear system on level is to perform a nearboundary correction after the coarse grid correction. The errors introduced by the coarse grid correction is eliminated by solving the subsystem for the degrees of freedom in the nearboundary region . The extra computational complexity is because only the elements which are close to the boundary are considered.
Definition 4 (Partition of degrees of freedom).
Let denote the index set for the degrees of freedom on the auxiliary grid . We define the nearboundary degrees of freedom by
Let be a coefficient vector on level of the auxiliary grids. Then we extend the standard coarse grid correction by the solve step
The small system of nearboundary elements is solved by an matrix solver, cf. [25].
4 Convergence of the auxiliary grid method
In this section, we investigate and analyze the new algorithm by verifying the assumptions of the theorem of the auxiliary grid method.
4.1 Overall algorithm
Based on the auxiliary hierarchy we constructed in Section 3, we can define the auxiliary space preconditioner (3) and (9) as follows.
Let the auxiliary space and be generated from (2). Since we already have the hierarchy of grids , we can apply MG on the auxiliary space as the preconditioner . On the space , we can apply a traditional smoother , e.g. Richardson, Jacobi, or GaußSeidel. For the stiffness matrix (diagonal, lower and upper triangular part), the matrix representation of the Jacobi iteration is and for the GaußSeidel iteration it is . (More generally, one could use any smoother that features the spectral equivalence .)
The auxiliary grid may be overrefined, it can happen that an element intersects much smaller auxiliary elements :
(12) 
In this case, we do not have the local approximation and stability properties for the standard nodal interpolation operator. Therefore, we need a stronger interpolation between the original space and the auxiliary space. This is accomplished by the ScottZhang quasiinterpolation operator
for a triangulation [35]. Let be an dual basis to the nodal basis . We define the interpolation operator as
By definition, preserves piecewise linear functions and satisfies (cf. [35]) for all
(13) 
We define the new interpolation from the auxiliary space to by the ScottZhang interpolation and the reverse interpolation .
Then we can apply Theorem 1 for . In order to estimate the condition number, we need to verify that the multigrid preconditioner on the auxiliary space is bounded and the finest auxiliary grid and corresponding FE space we constructed yields a stable and bounded transfer operator and smoothing operator.
4.2 Convergence of the MG on the auxiliary grids
Firstly, we prove the convergence of the multigrid method on the auxiliary space. For the Dirichlet boundary, we have the nestedness
which induces the nestedness of the finite element spaces defined on the auxiliary grids :
In order to avoid overloading the notation, we will skip the superscript in the following.
In order to prove the convergence of the local multilevel methods by Theorem 3, we only need to verify the assumptions for the decompsition of
(14) 
where
Since is the local refinement of , the size of the triangles in may be different. We denote as a refinement of the grid where all elements are regularly refined such that all elements from are congruent to the smallest element of . The finite element spaces corresponding to are denoted by . In the triangulations we have
For an element we denote by the level number of the triangulation to which belongs, i.e. . For any vertex , if but , we define . The following properties about the generation of elements or vertices are [49, 15]
where is the set of vertices of .
4.2.1 Stable decomposition: Proof of (A1)
The purpose of this subsection is to prove the decomposition is stable.
Theorem 4.
For any , there exist function , , such that
(15) 
Proof.
Following the argument of [49, 15], we define the ScottZhang interpolation between different levels , , and .
By the definition, we can define the decomposition as
Assume , where . Then,
where is support of and the center vertex is .
By the inverse inequality, we can conclude
Invoking the approximability and stability and following the same argument of Lemma 6, we have
So,