A Multiscale Pyramid Transform for Graph Signals
Abstract
Multiscale transforms designed to process analog and discrete-time signals and images cannot be directly applied to analyze high-dimensional data residing on the vertices of a weighted graph, as they do not capture the intrinsic topology of the graph data domain. In this paper, we adapt the Laplacian pyramid transform for signals on Euclidean domains so that it can be used to analyze high-dimensional data residing on the vertices of a weighted graph. Our approach is to study existing methods and develop new methods for the four fundamental operations of graph downsampling, graph reduction, and filtering and interpolation of signals on graphs. Equipped with appropriate notions of these operations, we leverage the basic multiscale constructs and intuitions from classical signal processing to generate a transform that yields both a multiresolution of graphs and an associated multiresolution of a graph signal on the underlying sequence of graphs.
Signal processing on graphs, multiresolution, spectral graph theory, graph downsampling, Kron reduction, spectral sparsification, Laplacian pyramid, interpolation
I Introduction
Multiscale transform methods can reveal structural information about signals, such as singularities or irregular structural patterns, at different resolution levels. At the same time, via coarse-to-fine analysis, they provide a way to reduce the complexity and dimensionality of many signal processing tasks, often resulting in fast algorithms. However, multiscale transforms such as wavelets and filter banks designed to process analog and discrete-time signals and images cannot be directly applied to analyze high-dimensional data residing on the vertices of a weighted graph, as they do not capture the intrinsic topology of the underlying graph data domain (see [shuman_SPM] for an overview of the main challenges of this emerging field of signal processing on graphs).
To address this issue, classical wavelets have recently been generalized to the graph setting in a number of different ways. Reference [shuman_SPM] contains a more thorough review of these graph wavelet constructions, which include, e.g., spatially-designed graph wavelets [Crovella2003], diffusion wavelets [diffusion_wavelets], spectral graph wavelets [sgwt], lifting-based wavelets [jansen, narang_lifting_graphs], critically-sampled two-channel wavelet filter banks [narang_icip], critically-sampled spline wavelet filter banks [ekambaram_globalsip], and multiscale wavelets on balanced trees [gavish]. Multiresolutions of graphs also have a long history in computational science problems including graph clustering, numerical solvers for linear systems of equations (often arising from discretized differential equations), combinatorial optimization problems, and computational geometry (see, e.g., [teng][vishnoi] and references therein). We discuss some of the related work from these fields in Sections LABEL:Se:alt_down and LABEL:Se:alternatives.
In this paper, we present a modular framework for adapting Burt and Adelson’s Laplacian pyramid transform [burt_adelson] to the graph setting. Our main contributions are to (1) survey different methods for and desirable properties of the four fundamental graph signal processing operations of graph downsampling, graph reduction, generalized filtering, and interpolation of graph signals (Sections III through LABEL:Se:filtering); (2) present new graph downsampling and reduction methods, including downsampling based on the polarity of the largest Laplacian eigenvector and Kron reduction followed by spectral sparsification; and (3) leverage these fundamental operations to construct a new multiscale pyramid transform that yields both a multiresolution of a graph and a multiscale analysis of a signal residing on that graph (Section LABEL:Se:pyramid). We also discuss some implementation approximations and open issues in Section LABEL:Se:approximations, as it is important that the computational complexity of the resulting multiscale transform scales well with the number of vertices and edges in the underlying graph.
II Spectral Graph Theory Notation
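We consider connected, loopless (no edge connecting a vertex to itself), undirected, weighted graphs. We represent such a graph by the triplet $\mathcal{G}=\{\mathcal{V},\mathcal{E},w\}$, where $\mathcal{V}$ is a set of $N$ vertices, $\mathcal{E}$ is a set of edges, and $w:\mathcal{E}\rightarrow\mathbb{R}^{+}$ is a weight function that assigns a nonnegative weight to each edge. An equivalent representation is $\mathcal{G}=\{\mathcal{V},\mathcal{E},\mathbf{W}\}$, where $\mathbf{W}$ is an $N\times N$ weighted adjacency matrix with nonnegative entries
$$W_{ij} = \begin{cases} w(e), & \text{if } e\in\mathcal{E} \text{ connects vertices } i \text{ and } j, \\ 0, & \text{otherwise.} \end{cases}$$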
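In unweighted graphs, the entries of the adjacency matrix $\mathbf{W}$ are ones and zeros, with a one corresponding to an edge between two vertices and a zero corresponding to no edge. The degree matrix $\mathbf{D}$ is a diagonal matrix with an $i$th diagonal element $D_{ii}=d_i=\sum_{j\in\mathcal{N}_i}W_{ij}$, where $\mathcal{N}_i$ is the set of vertex $i$'s neighbors in $\mathcal{G}$. Its maximum element is $d_{\max}:=\max_{i\in\mathcal{V}}\{d_i\}$. We denote the combinatorial graph Laplacian by $\mathcal{L}:=\mathbf{D}-\mathbf{W}$, the normalized graph Laplacian by $\tilde{\mathcal{L}}:=\mathbf{D}^{-1/2}\mathcal{L}\mathbf{D}^{-1/2}$, and their respective eigenvalue and eigenvector pairs by $\{(\lambda_\ell,u_\ell)\}_{\ell=0,1,\ldots,N-1}$ and $\{(\tilde{\lambda}_\ell,\tilde{u}_\ell)\}_{\ell=0,1,\ldots,N-1}$. Then $\mathbf{U}$ and $\tilde{\mathbf{U}}$ are the matrices whose columns are equal to the eigenvectors of $\mathcal{L}$ and $\tilde{\mathcal{L}}$, respectively. We assume without loss of generality that the eigenvalues are monotonically ordered so that $0=\lambda_0<\lambda_1\leq\lambda_2\leq\ldots\leq\lambda_{N-1}$ (and likewise for the $\tilde{\lambda}_\ell$), and we denote the maximum eigenvalues and associated eigenvectors by $\lambda_{\max}=\lambda_{N-1}$, $\tilde{\lambda}_{\max}=\tilde{\lambda}_{N-1}$, $u_{\max}=u_{N-1}$, and $\tilde{u}_{\max}=\tilde{u}_{N-1}$. The maximum eigenvalue is said to be simple if $\lambda_{N-1}>\lambda_{N-2}$.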
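The notation above can be made concrete with a minimal NumPy sketch. The 4-vertex weighted path graph and all variable names are illustrative assumptions, not taken from the paper; the code only instantiates the definitions of the degree matrix, the two Laplacians, and their ordered spectra.

```python
import numpy as np

# Hypothetical example graph: a weighted path on 4 vertices
# (weights chosen only for illustration).
W = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 2.0, 0.0],
    [0.0, 2.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
])

d = W.sum(axis=1)                         # vertex degrees d_i
D = np.diag(d)                            # degree matrix
L = D - W                                 # combinatorial graph Laplacian
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))    # well defined: connected graph => d_i > 0
L_norm = D_inv_sqrt @ L @ D_inv_sqrt      # normalized graph Laplacian

# For symmetric matrices, eigh returns eigenvalues in ascending order,
# which matches the ordering convention lambda_0 <= ... <= lambda_{N-1}.
lam, U = np.linalg.eigh(L)
lam_norm, U_norm = np.linalg.eigh(L_norm)

lam_max, u_max = lam[-1], U[:, -1]        # largest eigenvalue and its eigenvector
is_simple = (lam[-1] - lam[-2]) > 1e-10   # numerical test for a simple lambda_max

print("eigenvalues of L:", lam)
print("lambda_max is simple:", is_simple)
```

Note that the sign of each eigenvector returned by `np.linalg.eigh` is arbitrary, so any quantity built from the polarity of `u_max` is only defined up to a global sign flip.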
III Graph Downsampling
Two key components of multiscale transforms for discrete-time signals are downsampling and upsampling. (We focus here on downsampling, as we are only interested in upsampling previously downsampled graphs. As long as we track the positions of the removed components of the signal, it is straightforward to upsample by inserting zeros back into those components of the signal.) To downsample a discrete-time signal by a factor of two, we remove every other component of the signal. To extend many ideas from classical signal processing to the graph setting, we need to define a notion of downsampling for signals on graphs. Yet, it is not at all obvious what it means to remove every other component of a signal defined on the vertices of a graph. In this section, we outline desired properties of a downsampling operator for graphs, and then go on to suggest one particular downsampling method.
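For reference, the classical operations mentioned above can be sketched in a few lines of NumPy. The signal values and variable names are illustrative assumptions; the point is only that upsampling by zero insertion is straightforward once the kept positions are tracked.

```python
import numpy as np

x = np.arange(8, dtype=float)        # hypothetical discrete-time signal
keep = np.arange(0, x.size, 2)       # positions of the retained components

x_down = x[keep]                     # downsample by two: drop every other component

x_up = np.zeros_like(x)              # upsample: put zeros back in the removed positions
x_up[keep] = x_down
```

On a graph, the analogue of `keep` is a subset of vertices, and choosing that subset is exactly the problem addressed in the rest of this section.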
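Let $\mathcal{D}:\mathcal{G}=\{\mathcal{V},\mathcal{E},\mathbf{W}\}\mapsto\mathcal{V}_1$ be a graph downsampling operator that maps a weighted, undirected graph $\mathcal{G}$ to a subset of vertices $\mathcal{V}_1\subset\mathcal{V}$ to keep. The complement $\mathcal{V}_1^c:=\mathcal{V}\setminus\mathcal{V}_1$ is the set of vertices that $\mathcal{D}$ removes from $\mathcal{V}$. Ideally, we would like the graph downsampling operator $\mathcal{D}$ to have the following properties: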

It removes approximately half of the vertices of the graph (or, equivalently, approximately half of the components of a signal on the vertices of the graph); i.e., $|\mathcal{D}(\mathcal{G})|=|\mathcal{V}_1|\approx\frac{N}{2}$.

The removed vertices are not connected to one another by edges of high weight, and neither are the kept vertices; i.e., if $i,j\in\mathcal{V}_1^c$, then $W_{ij}$ is low, and if $i,j\in\mathcal{V}_1$, then $W_{ij}$ is low.

It has a computationally efficient implementation.
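One way to check a candidate vertex selection against the first two properties is sketched below. The helper `selection_stats`, its return keys, and the 4-cycle example are hypothetical names and data introduced here for illustration; they are not part of the paper.

```python
import numpy as np

def selection_stats(W, keep):
    """Summarize a candidate vertex selection (hypothetical helper).

    W    : symmetric weighted adjacency matrix
    keep : indices of the vertices to keep (the set V_1)
    """
    W = np.asarray(W, dtype=float)
    mask = np.zeros(W.shape[0], dtype=bool)
    mask[list(keep)] = True

    total = W.sum() / 2.0                                  # each undirected edge appears twice in W
    within_kept = W[np.ix_(mask, mask)].sum() / 2.0        # weight of edges inside V_1
    within_removed = W[np.ix_(~mask, ~mask)].sum() / 2.0   # weight of edges inside V_1^c
    return {
        "fraction_kept": float(mask.mean()),               # ideally close to 1/2
        "within_kept_weight": float(within_kept),          # ideally low
        "within_removed_weight": float(within_removed),    # ideally low
        "cut_weight": float(total - within_kept - within_removed),
    }

# Illustrative use on an unweighted 4-cycle, keeping every other vertex.
W_cycle = np.array([
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
], dtype=float)
stats = selection_stats(W_cycle, keep=[0, 2])
print(stats)
```

A selection that satisfies the second property pushes most of the total edge weight into `cut_weight`, i.e., onto edges that join a kept vertex to a removed one.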
III-A Vertex Selection Using the Largest Eigenvector of the Graph Laplacian
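The graph downsampling method we suggest is to select the vertices to keep based on the polarity of the components of the largest eigenvector; namely, let
$$\mathcal{V}_+:=\{i\in\mathcal{V}:u_{\max}(i)\geq 0\}\quad\text{and}\quad\mathcal{V}_-:=\{i\in\mathcal{V}:u_{\max}(i)<0\}. \tag{1}$$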
We refer to this method as the largest eigenvector vertex selection method. A few remarks are in order regarding this choice of downsampling operator. First, the polarity of the largest eigenvector splits the graph into two components: $\mathcal{V}_+$, the vertices where $u_{\max}$ is nonnegative, and $\mathcal{V}_-$, the vertices where it is negative. In this paper, we choose to keep the vertices in $\mathcal{V}_+$, and eliminate the vertices in $\mathcal{V}_-$, but we could just as easily do the reverse, or keep the vertices in the larger of the two sets, $\mathcal{V}_{\text{big}}:=\arg\max_{\mathcal{V}'\in\{\mathcal{V}_+,\mathcal{V}_-\}}|\mathcal{V}'|$, for example. Second, for some graphs such as the complete graph, $\lambda_{\max}$ is a repeated eigenvalue, so the polarity of $u_{\max}$ is not uniquely defined. In this case, we arbitrarily choose an eigenvector from the eigenspace. Third, we could just as easily base the vertex selection on the polarity of the normalized graph Laplacian eigenvector, $\tilde{u}_{\max}$, associated with the largest eigenvalue, $\tilde{\lambda}_{\max}$. In some cases, such as the bipartite graphs discussed next, doing so yields exactly the same selection of vertices as downsampling based on the largest non-normalized graph Laplacian eigenvector; however, this is not true in general.
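The selection rule can be sketched as follows, assuming a dense adjacency matrix small enough for a full eigendecomposition. The function name `largest_eigenvector_selection` and the `normalized` flag are hypothetical; a large-scale implementation would more likely compute only the extreme eigenvector with an iterative solver.

```python
import numpy as np

def largest_eigenvector_selection(W, normalized=False):
    """Split the vertices by the polarity of the largest Laplacian eigenvector.

    Returns (keep, remove): indices with nonnegative and negative components.
    Illustrative sketch only; dense eigendecomposition costs O(N^3).
    """
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)
    L = np.diag(d) - W                       # combinatorial Laplacian
    if normalized:
        s = 1.0 / np.sqrt(d)                 # assumes no isolated vertices
        L = s[:, None] * L * s[None, :]      # D^{-1/2} L D^{-1/2}
    _, U = np.linalg.eigh(L)                 # eigenvalues in ascending order
    u_max = U[:, -1]                         # eigenvector of the largest eigenvalue
    keep = np.flatnonzero(u_max >= 0)        # nonnegative components
    remove = np.flatnonzero(u_max < 0)       # negative components
    return keep, remove

# Illustrative use on an unweighted path graph with 5 vertices.
N = 5
W_path = np.zeros((N, N))
for i in range(N - 1):
    W_path[i, i + 1] = W_path[i + 1, i] = 1.0

keep, remove = largest_eigenvector_selection(W_path)
print("keep:", keep, "remove:", remove)
```

Because the solver returns the eigenvector with an arbitrary sign, which of the two alternating vertex sets ends up in `keep` can differ between runs or platforms; this mirrors the remark above that one could just as easily keep the other set.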
In the following sections, we motivate the use of the largest eigenvector of the graph Laplacian from two different perspectives: first from a more intuitive view, as a generalization of downsampling techniques for special types of graphs, and then from a more theoretical point of view, by connecting the vertex selection problem to graph coloring, spectral clustering, and nodal domain theory.
III-B Special Case: Bipartite Graphs
There is one situation in which there exists a fairly clear notion of removing every other component of a graph signal: when the underlying graph is bipartite. A graph $\mathcal{G}=\{\mathcal{V},\mathcal{E},\mathbf{W}\}$ is bipartite if the set of vertices $\mathcal{V}$ can be partitioned into two subsets $\mathcal{V}_1$ and $\mathcal{V}_1^c=\mathcal{V}\setminus\mathcal{V}_1$ so that every edge $e\in\mathcal{E}$ links one vertex in $\mathcal{V}_1$ with one vertex in $\mathcal{V}_1^c$. In this case, it is natural to downsample by keeping all of the vertices in one of the subsets, and eliminating all of the vertices in the other subset. In fact, as stated in the following theorem, the largest eigenvector downsampling method does precisely this in the case of bipartite graphs.
Theorem 1 (Roth, 1989)
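For a connected, bipartite graph $\mathcal{G}=\{\mathcal{V}_1\cup\mathcal{V}_1^c,\mathcal{E},\mathbf{W}\}$, the largest eigenvalues, $\lambda_{\max}$ and $\tilde{\lambda}_{\max}$, of $\mathcal{L}$ and $\tilde{\mathcal{L}}$, respectively, are simple, and $\tilde{\lambda}_{\max}=2$. Moreover, the polarity of the components of the eigenvectors $u_{\max}$ and $\tilde{u}_{\max}$ associated with $\lambda_{\max}$ and $\tilde{\lambda}_{\max}$ both split $\mathcal{V}$ into the bipartition $\mathcal{V}_1$ and $\mathcal{V}_1^c$. That is, for $v=u_{\max}$ or $v=\tilde{u}_{\max}$,
$$\text{either } v(i)>0\ \forall i\in\mathcal{V}_1 \text{ and } v(i)<0\ \forall i\in\mathcal{V}_1^c, \quad\text{or } v(i)<0\ \forall i\in\mathcal{V}_1 \text{ and } v(i)>0\ \forall i\in\mathcal{V}_1^c. \tag{2}$$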
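If, in addition, $\mathcal{G}$ is $k$-regular ($d_i=k$ for all $i$), then $\lambda_{\max}=2k$, and
$$u_{\max}(i)=\tilde{u}_{\max}(i)=\begin{cases}\frac{1}{\sqrt{N}}, & \text{if } i\in\mathcal{V}_1, \\ -\frac{1}{\sqrt{N}}, & \text{if } i\in\mathcal{V}_1^c.\end{cases}$$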
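The regular bipartite case of the theorem can be checked numerically on a small example. The sketch below uses an unweighted 6-cycle, which is bipartite and 2-regular; the graph and variable names are assumptions chosen for illustration.

```python
import numpy as np

# Unweighted ring graph on N = 6 vertices: bipartite (even/odd vertices) and k-regular with k = 2.
N, k = 6, 2
W = np.zeros((N, N))
for i in range(N):
    j = (i + 1) % N
    W[i, j] = W[j, i] = 1.0

d = W.sum(axis=1)
L = np.diag(d) - W
L_norm = L / k                      # for a k-regular graph, D^{-1/2} L D^{-1/2} = L / k

lam, U = np.linalg.eigh(L)          # ascending eigenvalues
lam_norm = np.linalg.eigvalsh(L_norm)
u_max = U[:, -1]

print("lambda_max equals 2k:", np.isclose(lam[-1], 2 * k))
print("normalized lambda_max equals 2:", np.isclose(lam_norm[-1], 2.0))
print("|u_max(i)| equals 1/sqrt(N):", np.allclose(np.abs(u_max), 1 / np.sqrt(N)))
print("signs alternate around the ring:", bool(np.all(u_max * np.roll(u_max, -1) < 0)))
```

The eigenvector matches the closed form only up to a global sign, so the check compares magnitudes and the alternation of signs rather than the signs themselves.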