A short-graph Fourier transform via personalized PageRank vectors

Abstract

The short-time Fourier transform (STFT) is widely used to analyze the spectra of temporal signals that vary through time. Signals defined over graphs, due to their intrinsic complexity, exhibit large variations in their patterns. In this work we propose a new formulation for an STFT for signals defined over graphs. This formulation draws on recent ideas from spectral graph theory, using personalized PageRank vectors as its fundamental building block. Furthermore, this work establishes and explores the connection between local spectral graph theory and localized spectral analysis of graph signals. We accompany the presentation with synthetic and real-world examples, showing the suitability of the proposed approach.

Mariano Tepper and Guillermo Sapiro¹

Department of Electrical and Computer Engineering, Duke University

Keywords

Graph, localized Fourier transform, personalized PageRank, local spectral graph theory.

1 Introduction

The Fourier transform globally decomposes a temporal signal into its constituent frequencies, identifying their contribution to the signal formation. Often, temporal signals vary their behavior through time; in these cases, the Fourier transform, being global, falls short as a tool to analyze the characteristics of these signals. The short-time Fourier transform (STFT) [[1]] is used to analyze the Fourier spectrum of temporally localized sections of the signal. It is well known that there is a trade-off between resolution (sharpness) in time and in frequency: no analysis can be arbitrarily sharp in both domains simultaneously [[2], Sec. 2.6.2].

Formally, in the STFT a window function $g$, which is nonzero for only a short period of time, is slid along the time axis and multiplied by the input signal $f$; then the Fourier transform of the resulting signal is taken. For one-dimensional signals,

$$Sf(u, \xi) = \int_{-\infty}^{\infty} f(t)\, g(t - u)\, e^{-i \xi t}\, dt. \tag{1}$$

The changing spectrum is usually analyzed as a function of the time shift $u$ and is well suited to analyze time-varying signals. The above formula can be interpreted as the following three-step algorithm: (1) translate the window $g$ by $u$, (2) modulate the result by frequency $\xi$, and (3) take the inner product of the result with the signal $f$. This can be written as

$$Sf(u, \xi) = \langle f, M_\xi T_u g \rangle, \tag{2}$$

where $(T_u g)(t) = g(t - u)$ and $(M_\xi g)(t) = e^{i \xi t} g(t)$ are the translation and modulation operators, respectively.
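The translate, modulate, and correlate reading of the STFT can be illustrated on a finite, periodic signal. The sketch below is only a discrete analogue under our own assumptions (circular translation, frequency indexed by an integer `xi`, and the hypothetical function name `stft_coefficient`); it is not taken from the paper's code.

```python
import cmath
import math


def stft_coefficient(f, g, u, xi):
    """Discrete analogue of <f, M_xi T_u g> for a periodic signal of length n."""
    n = len(f)
    total = 0j
    for t in range(n):
        translated = g[(t - u) % n]                               # step 1: translate the window by u
        atom = translated * cmath.exp(2j * math.pi * xi * t / n)  # step 2: modulate by frequency xi
        total += f[t] * atom.conjugate()                          # step 3: inner product with the signal
    return total


n = 16
f = [math.cos(2 * math.pi * 3 * t / n) for t in range(n)]  # pure tone at frequency index 3
g = [1.0] * n                                              # rectangular window over the whole period
```

With a window that covers the whole period, the coefficient reduces to an ordinary discrete Fourier coefficient; shortening `g` trades frequency sharpness for temporal localization.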

Weighted graphs are a natural representational structure in most modern network applications (including, for example, social, energy, transportation, and sensor networks). These graphs are loaded with information, usually in the form of high-dimensional data (i.e., signals) that reside on the vertices (nodes) of graphs. Graph signal processing lies at the intersection of graph theory and computational harmonic analysis and seeks to process such signals on graphs. See [[3], [4]] for further details and references on this emerging field.

Graph signal processing has been successful at characterizing the equivalent of the Fourier transform in graph domains. Many different types of localized spectral transforms have been proposed in recent years, see [[3], Sec. IV] for a thorough discussion. This list includes a windowed Fourier transform [[5]], later described in this work.

One of the main examples is diffusion wavelets [[6]], which are based on compressed representations of powers of a graph diffusion operator. In parallel, local spectral techniques, in which personalized PageRank vectors play a prominent role, have become increasingly popular in the field of community detection in graphs [e.g., [7]]. As we will see in Section 2, the PageRank equation is defined recursively, and we can consider a single PageRank vector in place of a sequence of random walk vectors, or of powers of a diffusion operator [[8]].

In this work, we establish and explore for the first time the connection between local spectral graph methods and localized spectral analysis of graph signals. This work is a first step in this exploration, and introduces a short-graph Fourier transform inspired by the ideas of local graph analysis, using personalized PageRank vectors as fundamental building blocks of the method.

The remainder of the paper is organized as follows. In Section 2 we introduce our short-graph Fourier transform. Experimental results on synthetic and real graphs are presented in Section 3, showing the interesting characteristics of the proposed formulation. Finally, we provide some concluding remarks in Section 4.

2 From local spectral graph theory to a short-graph Fourier transform

We begin by introducing the notation and fundamental formulas used throughout the paper.

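Let $\mathbf{A}$ be a matrix. In the following, $(\mathbf{A})_{ij}$, $(\mathbf{A})_{:j}$, and $(\mathbf{A})_{i:}$ denote the $(i,j)$th entry of $\mathbf{A}$, the $j$th column of $\mathbf{A}$, and the $i$th row of $\mathbf{A}$, respectively.

We consider a graph $G = (V, \mathbf{W})$, where $V = \{1, \dots, n\}$ and $\mathbf{W} \in \mathbb{R}^{n \times n}$ is the weighted adjacency matrix. The weighted entry $(\mathbf{W})_{ij}$ represents in most applications a measure of similarity between vertices $i$ and $j$. We assume that $G$ is connected and undirected, i.e., $\mathbf{W} = \mathbf{W}^T$. The degree of a node $i$ is $d_i = \sum_{j} (\mathbf{W})_{ij}$. Let $\mathbf{D}$ be a diagonal matrix with entries $(\mathbf{D})_{ii} = d_i$. The Laplacian of $G$ is defined as $\mathbf{L} = \mathbf{D} - \mathbf{W}$, and the normalized Laplacian of $G$ is defined as $\mathcal{L} = \mathbf{D}^{-1/2} \mathbf{L} \mathbf{D}^{-1/2}$. We denote the eigendecompositions of $\mathbf{L}$ and $\mathcal{L}$ by $\mathbf{L} = \mathbf{U} \boldsymbol{\Lambda} \mathbf{U}^T$ and $\mathcal{L} = \tilde{\mathbf{U}} \tilde{\boldsymbol{\Lambda}} \tilde{\mathbf{U}}^T$, respectively. We assume that the eigenvalues, the diagonal entries of $\boldsymbol{\Lambda}$ and $\tilde{\boldsymbol{\Lambda}}$, are sorted in increasing order. Finally, the volume of a set of vertices $S \subseteq V$ is $\mathrm{vol}(S) = \sum_{i \in S} d_i$.

Let $\mathbf{f} \in \mathbb{R}^n$ be a signal over the graph vertices, i.e., $(\mathbf{f})_i$ is the signal value at vertex $i$. The classical Fourier transform can be defined as the transform that diagonalizes the Laplace operator. Similarly, the graph Fourier transform [[9]] is defined as $\hat{\mathbf{f}} = \mathbf{U}^T \mathbf{f}$, where $\mathbf{U}$ diagonalizes the graph Laplacian. The inverse graph Fourier transform is then simply defined as $\mathbf{f} = \mathbf{U} \hat{\mathbf{f}}$.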
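The definitions in this subsection can be sketched numerically as follows. This is an illustrative NumPy example on a small hypothetical graph, not the paper's graph-tool implementation; the helper name `laplacians` and the weights are assumptions made for the example.

```python
import numpy as np


def laplacians(W):
    """Degrees, Laplacian L = D - W, and normalized Laplacian D^{-1/2} L D^{-1/2}."""
    d = W.sum(axis=1)
    L = np.diag(d) - W
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L_norm = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
    return d, L, L_norm


# Hypothetical connected, undirected, weighted graph on 4 vertices.
W = np.array([[0.0, 1.0, 0.0, 0.5],
              [1.0, 0.0, 2.0, 0.0],
              [0.0, 2.0, 0.0, 1.0],
              [0.5, 0.0, 1.0, 0.0]])
d, L, L_norm = laplacians(W)

eigvals, U = np.linalg.eigh(L)       # eigenvalues returned in increasing order
f = np.array([1.0, -2.0, 0.5, 3.0])  # a signal on the vertices
f_hat = U.T @ f                      # graph Fourier transform
f_rec = U @ f_hat                    # inverse graph Fourier transform
```

Because the eigenvector matrix is orthogonal, the inverse transform recovers the signal exactly (up to floating-point error), and the smallest eigenvalue is zero for a connected graph.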

2.1 Local spectral graph theory

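The second eigenvalue of the graph Laplacian can be viewed as the solution to

$$\min_{\mathbf{x}} \; \mathbf{x}^T \mathbf{L} \mathbf{x} \quad \text{s.t.} \quad \mathbf{x}^T \mathbf{D} \mathbf{x} = 1, \quad \mathbf{x}^T \mathbf{D} \mathbf{1} = 0. \tag{3}$$

The optimal solution is a generalized eigenvector of $\mathbf{L}$ with respect to $\mathbf{D}$ and provides a map from the graph to the real line. This map encodes a measure of similarity (geodesic distance) between graph vertices. This property is exploited for clustering [[10]] and hashing [[11]], for example.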
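As an illustration of this map from the graph to the real line, the sketch below computes the second generalized eigenvector of the Laplacian with respect to the degree matrix through the normalized Laplacian. The example graph (two triangles joined by one edge) and the function name are hypothetical choices for the demonstration.

```python
import numpy as np


def second_generalized_eigenpair(W):
    """Second solution (lambda, x) of L x = lambda D x, scaled so that x^T D x = 1."""
    d = W.sum(axis=1)
    L = np.diag(d) - W
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L_norm = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
    eigvals, U = np.linalg.eigh(L_norm)
    x = d_inv_sqrt * U[:, 1]         # x = D^{-1/2} u turns an eigenvector of L_norm into a generalized one
    return eigvals[1], x, d, L


# Two triangles {0, 1, 2} and {3, 4, 5} joined by the single edge (2, 3).
W = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    W[i, j] = W[j, i] = 1.0
lam2, x, d, L = second_generalized_eigenpair(W)
```

On this graph the entries of `x` take one sign on the first triangle and the opposite sign on the second, which is the behavior that spectral clustering exploits.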

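In [[12]] the above problem is modified to incorporate a bias towards a target region (defined by one or more vertices) in the graph. This region is represented as an indicator vector $\mathbf{s}$, normalized such that $\mathbf{s}^T \mathbf{D} \mathbf{1} = 0$ and $\mathbf{s}^T \mathbf{D} \mathbf{s} = 1$. More precisely, given a set of nodes $S \subset V$ we define the unit vector $\mathbf{s}$ as

$$\mathbf{s} = \sqrt{\frac{\mathrm{vol}(S)\, \mathrm{vol}(\bar{S})}{\mathrm{vol}(V)}} \left( \frac{\mathbf{1}_S}{\mathrm{vol}(S)} - \frac{\mathbf{1}_{\bar{S}}}{\mathrm{vol}(\bar{S})} \right), \tag{4}$$

where $\bar{S} = V \setminus S$ and $\mathbf{1}_S$ denotes the indicator vector of $S$. The modified problem is given by [[12]]

$$\min_{\mathbf{x}} \; \mathbf{x}^T \mathbf{L} \mathbf{x} \quad \text{s.t.} \quad \mathbf{x}^T \mathbf{D} \mathbf{x} = 1, \quad \mathbf{x}^T \mathbf{D} \mathbf{1} = 0, \quad (\mathbf{x}^T \mathbf{D} \mathbf{s})^2 \geq \kappa. \tag{5}$$

The only modification is the addition of the constraint on $\mathbf{x}^T \mathbf{D} \mathbf{s}$. This can be interpreted as requiring the squared correlation of the solution with $\mathbf{s}$ to be at least $\kappa$.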
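The two normalization conditions on the seed vector are easy to check numerically. The sketch below assumes the seed vector has the form used by the local spectral method of [[12]]: a positive constant on the seed set, a negative constant on its complement, each scaled by the corresponding volume. The function names and the degree sequence are hypothetical.

```python
import math


def seed_vector(degrees, seed_set):
    """Seed vector s for `seed_set`, assuming the volume-normalized indicator form of [12]."""
    vol_total = sum(degrees)
    vol_s = sum(degrees[i] for i in seed_set)
    vol_c = vol_total - vol_s                       # volume of the complement
    scale = math.sqrt(vol_s * vol_c / vol_total)
    return [scale / vol_s if i in seed_set else -scale / vol_c
            for i in range(len(degrees))]


def d_inner(x, y, degrees):
    """Degree-weighted inner product x^T D y."""
    return sum(d * a * b for d, a, b in zip(degrees, x, y))


degrees = [2.0, 2.0, 3.0, 3.0, 2.0, 2.0]   # hypothetical degree sequence
s = seed_vector(degrees, {0})
ones = [1.0] * len(degrees)
```

With these helpers, the correlation constraint for a candidate `x` can be evaluated as `d_inner(x, s, degrees) ** 2`, to be compared against the chosen correlation parameter.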

Intuitively, as the solution to Equation (3) provides a notion of geodesic distance between graph nodes, the solution to Equation (5) provides a notion of geodesic distance from the seed set to the rest of the vertices. This link will become clear in the following.

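Theorem 1 ([[12]]).

Let $\mathbf{s}$ be a seed vector such that $\mathbf{s}^T \mathbf{D} \mathbf{1} = 0$ and $\mathbf{s}^T \mathbf{D} \mathbf{s} = 1$, and $\mathbf{s}^T \mathbf{D} \mathbf{v}_2 \neq 0$, where $\mathbf{v}_2$ is the second generalized eigenvector of $\mathbf{L}$ with respect to $\mathbf{D}$. In addition, let $\mathbf{x}^*$ be an optimal solution to Equation (5) with correlation parameter $\kappa \in [0, 1)$. Then, there exists some $\gamma \in (-\infty, \lambda_2)$, where $\lambda_2$ is the second generalized eigenvalue of $\mathbf{L}$ with respect to $\mathbf{D}$, and some $c \in [0, \infty]$ such that

$$\mathbf{x}^* = c\, (\mathbf{L} - \gamma \mathbf{D})^{+} \mathbf{D} \mathbf{s}. \tag{6}$$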
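The closed form stated by the theorem, read here as $\mathbf{x}^* = c\,(\mathbf{L} - \gamma \mathbf{D})^{+} \mathbf{D} \mathbf{s}$ following [[12]], can be evaluated directly on a small graph. The sketch below is an illustration under assumptions of ours: a ring graph, a single seed vertex, a hand-picked negative $\gamma$, and $c$ chosen so that $\mathbf{x}^T \mathbf{D} \mathbf{x} = 1$.

```python
import numpy as np


def local_spectral_solution(W, seed_set, gamma):
    """x = c (L - gamma D)^+ D s, with c chosen so that x^T D x = 1."""
    d = W.sum(axis=1)
    L = np.diag(d) - W
    in_seed = np.zeros(len(d), dtype=bool)
    in_seed[list(seed_set)] = True
    vol_s, vol_c = d[in_seed].sum(), d[~in_seed].sum()
    # Seed vector in the volume-normalized form of [12] (assumed here).
    s = np.sqrt(vol_s * vol_c / d.sum()) * np.where(in_seed, 1.0 / vol_s, -1.0 / vol_c)
    x = np.linalg.pinv(L - gamma * np.diag(d)) @ (d * s)
    x /= np.sqrt(x @ (d * x))
    return x, s, d


n = 12
W = np.zeros((n, n))
for i in range(n):
    W[i, (i + 1) % n] = W[(i + 1) % n, i] = 1.0   # unweighted ring
x, s, d = local_spectral_solution(W, {0}, gamma=-0.1)
```

On the ring, `x` is largest at the seed vertex and decreases with the distance to it, which matches the interpretation of the solution as a distance from the seed set.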


PageRank [[13]] assigns a numerical weight to each vertex of a graph, assessing its relative importance within the graph; its personalized variant is frequently used to localize the PageRank vector within a subset of the network [[8]]. The following proposition can be proven using simple algebraic manipulations and the definition of .

Proposition.

Let in Theorem 1. The vector , defined as , is the solution to the (degree normalized) personalized PageRank (PPR) equation

(7)

Theorem 1 and the proposition above connect Equation (5) with the personalized PageRank equation [[14]]. In the field of community detection, PPR vectors are used to find local communities around seed vertices [e.g., [8], [7], [15]], where a small but cohesive “seed set” of vertices is expanded to generate its enclosing community (vertices having a stronger relationship to the seed set than to the rest of the graph). In this context, PPR vectors arise as natural units of observation for localized analysis of graphs.
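The algebraic link between a PPR vector and a matrix of the form $\mathbf{L} - \gamma \mathbf{D}$ can be checked numerically. The sketch below assumes the standard (non-lazy) PPR recursion $\mathbf{p} = \alpha \mathbf{v} + (1 - \alpha) \mathbf{W} \mathbf{D}^{-1} \mathbf{p}$ with a teleportation distribution $\mathbf{v}$; this is a common textbook form and may differ in scaling from the equation used in the paper. Under this assumption, rearranging gives $\mathbf{D}^{-1} \mathbf{p} = \frac{\alpha}{1 - \alpha} (\mathbf{L} - \gamma \mathbf{D})^{-1} \mathbf{v}$ with $\gamma = -\alpha / (1 - \alpha)$.

```python
import numpy as np


def personalized_pagerank(W, seed, alpha):
    """Solve p = alpha * seed + (1 - alpha) * W D^{-1} p directly."""
    d = W.sum(axis=0)
    P = W / d                                   # divides column j by d[j]: random-walk matrix W D^{-1}
    return np.linalg.solve(np.eye(len(d)) - (1 - alpha) * P, alpha * seed)


# Hypothetical small weighted graph.
W = np.array([[0.0, 1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0, 0.0],
              [1.0, 1.0, 0.0, 2.0],
              [0.0, 0.0, 2.0, 0.0]])
d = W.sum(axis=1)
L = np.diag(d) - W
alpha = 0.2
gamma = -alpha / (1 - alpha)                    # negative, so L - gamma D is positive definite
seed = np.array([1.0, 0.0, 0.0, 0.0])           # all teleportation mass on vertex 0

p = personalized_pagerank(W, seed, alpha)
x = np.linalg.solve(L - gamma * np.diag(d), seed)   # (L - gamma D)^{-1} seed
```

The degree-normalized vector `p / d` coincides with `x` up to the factor `alpha / (1 - alpha)`, so a negative $\gamma$ plays the role of a teleportation probability.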

Furthermore, powers of a graph diffusion operator were identified in [[6]] as natural building blocks to define wavelets on graphs. Notice that the PPR vector is exactly equivalent to the recursive application of the diffusion operator  [[8]], which leads naturally to the notion of geodesic distance.
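The recursive view can be made concrete by expanding a PPR vector as a weighted sum of random-walk steps. The sketch below again assumes the standard non-lazy PPR recursion with teleportation probability `alpha` (our assumption, with hypothetical function names), and truncates the series $\alpha \sum_k (1 - \alpha)^k (\mathbf{W} \mathbf{D}^{-1})^k \mathbf{v}$ after a fixed number of terms.

```python
def random_walk_step(weights, p):
    """One step of the walk W D^{-1}: vertex j spreads p[j] to its neighbors by edge weight."""
    n = len(p)
    degrees = [sum(row) for row in weights]
    return [sum(weights[i][j] * p[j] / degrees[j] for j in range(n)) for i in range(n)]


def ppr_by_series(weights, seed, alpha, n_terms=200):
    """Truncated series alpha * sum_k (1 - alpha)^k (W D^{-1})^k seed."""
    p = [0.0] * len(seed)
    walk = list(seed)
    coeff = alpha
    for _ in range(n_terms):
        p = [pi + coeff * wi for pi, wi in zip(p, walk)]   # add the k-th diffusion step
        walk = random_walk_step(weights, walk)
        coeff *= 1 - alpha
    return p


# Unweighted 5-cycle with the seed on vertex 0.
weights = [[0, 1, 0, 0, 1],
           [1, 0, 1, 0, 0],
           [0, 1, 0, 1, 0],
           [0, 0, 1, 0, 1],
           [1, 0, 0, 1, 0]]
seed = [1.0, 0.0, 0.0, 0.0, 0.0]
p = ppr_by_series(weights, seed, alpha=0.15)
```

A single PPR vector therefore aggregates all powers of the diffusion operator, with geometrically decaying weights, instead of requiring each power to be stored separately.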

Given this evidence, we posit that the PPR vector is a fundamental tool to perform a localized spectral analysis of graph signals. This connection is the key observation of this work and drives our definition of a short-graph Fourier transform.

2.2 A short-graph Fourier transform

As described in the introduction, we need two elements to define a short-graph Fourier transform: a localization operator (classically, a translation) and a modulation operator.

Definition (Localization).

We define the local window at node as

(8)

where the solution to Equation (5) with and the maximum is taken entrywise.

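The window is defined in terms of its correlation with the seed vector $\mathbf{s}$, yielding a simple conceptual interpretation. In this work, we solve Equation (5) using Theorem 1. Given the eigendecomposition $\mathcal{L} = \tilde{\mathbf{U}} \tilde{\boldsymbol{\Lambda}} \tilde{\mathbf{U}}^T$ of the normalized Laplacian $\mathcal{L} = \mathbf{D}^{-1/2} \mathbf{L} \mathbf{D}^{-1/2}$, we have $\mathbf{L} - \gamma \mathbf{D} = \mathbf{D}^{1/2} \tilde{\mathbf{U}} (\tilde{\boldsymbol{\Lambda}} - \gamma \mathbf{I}) \tilde{\mathbf{U}}^T \mathbf{D}^{1/2}$. Then, whenever $\gamma$ is not an eigenvalue of $\mathcal{L}$ (e.g., for $\gamma < 0$),

$$\mathbf{x}^* = c\, \mathbf{D}^{-1/2} \tilde{\mathbf{U}} (\tilde{\boldsymbol{\Lambda}} - \gamma \mathbf{I})^{-1} \tilde{\mathbf{U}}^T \mathbf{D}^{1/2} \mathbf{s}. \tag{9}$$

Once the eigendecomposition is computed as a preprocessing step, this formula delivers an efficient method for obtaining $\mathbf{x}^*$, without any iterations or matrix inversions (except for the inversion of the diagonal matrix $\tilde{\boldsymbol{\Lambda}} - \gamma \mathbf{I}$). Interestingly, the spectral localization of $\mathbf{x}^*$ is determined by the product $\tilde{\mathbf{U}}^T \mathbf{D}^{1/2} \mathbf{s}$, i.e., by the correlation between $\mathbf{D}^{1/2} \mathbf{s}$ and each element of the graph Fourier basis.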
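The preprocessing idea can be sketched as follows: the normalized Laplacian is diagonalized once, and the solution for every seed vertex is then obtained with matrix products and a diagonal inversion only. The sketch relies on several assumptions of ours: the identity $(\mathbf{L} - \gamma \mathbf{D})^{-1} \mathbf{D} \mathbf{s} = \mathbf{D}^{-1/2} \tilde{\mathbf{U}} (\tilde{\boldsymbol{\Lambda}} - \gamma \mathbf{I})^{-1} \tilde{\mathbf{U}}^T \mathbf{D}^{1/2} \mathbf{s}$ for $\gamma < 0$, the constant $c = 1$, singleton seed sets with the seed vector form of [[12]], windows obtained by clipping negative entries to zero, and hypothetical graph size, weak edges, and $\gamma$.

```python
import numpy as np


def ppr_windows(W, gamma):
    """Column i: window for seed set {i}, computed from one eigendecomposition."""
    d = W.sum(axis=1)
    n = len(d)
    sqrt_d = np.sqrt(d)
    L_norm = (np.diag(d) - W) / np.outer(sqrt_d, sqrt_d)
    lam, U = np.linalg.eigh(L_norm)                 # preprocessing, shared by all seeds
    vol = d.sum()
    S = np.zeros((n, n))                            # column i holds the seed vector for {i}
    for i in range(n):
        scale = np.sqrt(d[i] * (vol - d[i]) / vol)
        S[:, i] = -scale / (vol - d[i])
        S[i, i] = scale / d[i]
    coeffs = U.T @ (sqrt_d[:, None] * S)            # correlations with the Fourier basis
    X = (U @ (coeffs / (lam - gamma)[:, None])) / sqrt_d[:, None]
    return np.maximum(X, 0.0), X, S                 # entrywise clipping (assumed window form)


n = 20
W = np.zeros((n, n))
for v in range(n):
    W[v, (v + 1) % n] = W[(v + 1) % n, v] = 1.0     # ring graph
for a, b in [(4, 5), (14, 15)]:                      # hypothetical weak edges
    W[a, b] = W[b, a] = 0.1
windows, X, S = ppr_windows(W, gamma=-0.05)
```

In this example every window attains its maximum at its own seed vertex, including the seeds adjacent to the weak edges.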

Since the eigendecomposition of the normalized Laplacian is used in Equation (9), we also use it in our graph modulation operator.

Definition (Graph modulation).

For , we define the graph modulation operator by

(10)

where denotes the entrywise multiplication.

is the identity operator, such as is in the classical modulation for temporal signals.

Definition.

Given the localization and modulation operators, we define the short-graph Fourier transform of a signal at vertex and frequency as

(11)

The spectrogram of is defined as

(12)

It is not hard to see that Equation (11) reduces to the standard STFT when the unweighted graph is a Cartesian grid.
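The overall structure of a windowed graph Fourier spectrogram can be sketched generically: each coefficient is the inner product of the signal with a localized window modulated by one element of the Fourier basis, and the spectrogram collects the squared magnitudes. The sketch below is not the paper's exact definition. It assumes a modulation of the form $\sqrt{n}\, \mathbf{w} \odot \mathbf{u}_k$ with eigenvectors of the combinatorial Laplacian, and a simplified window $(\mathbf{L} - \gamma \mathbf{D})^{-1} \mathbf{D} \mathbf{e}_i$; all names and parameter values are hypothetical.

```python
import numpy as np


def spectrogram_sketch(W, f, gamma):
    """Squared magnitudes of <f, M_k w_i> for every vertex i and frequency index k."""
    d = W.sum(axis=1)
    n = len(d)
    L = np.diag(d) - W
    _, U = np.linalg.eigh(L)                 # graph Fourier basis, increasing eigenvalues
    if U[0, 0] < 0:
        U[:, 0] *= -1                        # positive constant eigenvector: index 0 acts as the identity
    # Column i is a simplified PPR-type window (L - gamma D)^{-1} D e_i (assumption).
    windows = np.linalg.solve(L - gamma * np.diag(d), np.diag(d))
    # atoms[v, i, k] = sqrt(n) * w_i[v] * u_k[v]: window i modulated by eigenvector k.
    atoms = np.sqrt(n) * windows[:, :, None] * U[:, None, :]
    coeffs = np.einsum("v,vik->ik", f, atoms)
    return np.abs(coeffs) ** 2, windows, U


n = 8
W = np.zeros((n, n))
for v in range(n):
    W[v, (v + 1) % n] = W[(v + 1) % n, v] = 1.0      # unweighted ring
f = np.cos(2 * np.pi * 2 * np.arange(n) / n)
spec, windows, U = spectrogram_sketch(W, f, gamma=-0.5)
```

Row `i` of `spec` is the spectral signature of vertex `i`. With this choice of modulation, frequency index 0 leaves the window unchanged on a connected graph.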

Note.

Shuman et al. provide a different definition for a short-graph Fourier transform [[5]]. They define the graph modulation operator as , where denotes the entrywise multiplication. They also define a convolution operator as , and a translation operator as , where is the impulse function at vertex . This leads to the (more classical) definition . Defining an appropriate window (kernel) is not trivial in the graph space. It is possible to define it in the graph spectral space as and then invert the graph Fourier transform. In this work we do not aim at producing a better method than the one in [[5]] (although we will exemplify potential localization advantages of our definition). The method here proposed is based on radically different principles, which are of interest by themselves for the spectral study of graph signals.
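For comparison, a convolution-based localization in the spirit of [[5]] can be sketched by defining the window in the graph spectral domain and convolving it with an impulse. The heat kernel $e^{-\tau \lambda}$ used as the spectral window, the $\sqrt{n}$ normalization, and the parameter values are assumptions made for this illustration.

```python
import numpy as np


def translated_heat_window(W, center, tau):
    """sqrt(n) * (g * delta_center), with g defined spectrally as exp(-tau * lambda)."""
    d = W.sum(axis=1)
    L = np.diag(d) - W
    lam, U = np.linalg.eigh(L)
    g_hat = np.exp(-tau * lam)          # window defined in the graph spectral domain (assumed kernel)
    delta_hat = U[center, :]            # graph Fourier transform of the impulse at `center`
    return np.sqrt(len(d)) * (U @ (g_hat * delta_hat))   # multiply spectra, then invert the transform


n = 16
W = np.zeros((n, n))
for v in range(n):
    W[v, (v + 1) % n] = W[(v + 1) % n, v] = 1.0   # unweighted ring
w = translated_heat_window(W, center=5, tau=2.0)
```

On this unweighted ring the translated window peaks at the chosen vertex and is symmetric around it.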

3 Experimental Results

We implemented the proposed short-graph Fourier transform in Python, using the graph-tool library [[16]]. We make the code publicly available at https://github.com/marianotepper/sgft. In all examples we set , where takes a particular value in each example.

We begin by examining the localization operator defined in Section 2.2. For this, we use in Fig. 3 a linear graph, where localization can be easily interpreted and visualized. The main observation is that the window is properly localized when using the proposed approach, while this is not always the case with the convolutional approach.

Figure 1: The linear graph has vertices, where each vertex is connected to its two neighbors (with periodicity in the edges). In the weighted case, the graph weights for all edges are set to one, except for edges and (marked in red), which have a weight of .
Figure 2 (panels, left to right: unweighted graph with convolutional [[5]] and PPR windows; weighted graph with convolutional [[5]] and PPR windows): We compute localized windows around vertices , , and (red, blue and green curves, respectively). Note how the convolutional approach [[5]] fails to properly locate the window in the weighted case, as the window peak does not coincide with the desired vertex.
Figure 3: Window localization comparison. In the unweighted case (in which the graph Fourier and the standard Fourier transforms are equal), convolutional localization [[5]] works well; however, when the graph weights present a sharp discontinuity, it fails to provide an accurate result. In contrast, the proposed PPR approach works well in both cases.

In the second example, we present results on a 2D grid graph, see Fig. 5. When this graph is unweighted, the graph Fourier transform amounts to the classical 2D Fourier transform. In the unweighted and weighted cases, the proposed short-graph Fourier transform is able to clearly identify the two different signal regions. Naturally, since the weight discontinuity matches the boundary between both signal regions, the spectrograms of the weighted graph have better spatial and frequency localizations. The proposed PPR-based spectrogram exhibits better spatial and frequency localizations than the convolutional approach.

Subfigure (graph and input signal): The graph is a regular grid, where each vertex is connected to its four neighbors (with periodicity in the edges). The input signal is formed by two sinusoidal waveforms, as shown on the side. In the weighted case, the graph weights for the edges connecting both waveforms are set to while for the rest of the edges they are set to one.

Figure 4 (columns, left to right: unweighted graph with convolutional [[5]] and PPR methods; weighted graph with convolutional [[5]] and PPR methods): Top row: Localized windows at the central vertex of the grid. Center row: Spectrograms. Bottom row: For each vertex, color represents the index of the frequency with maximum magnitude, from light green (low frequencies) to dark green (high frequencies).
Figure 5: All spectrograms coarsely identify the two sections in the input signal with different patterns (in all cases, we use the first eigenvectors only). In the unweighted case, the proposed method works significantly better than the convolutional approach [[5]]. In the unweighted and weighted cases, the proposed PPR-based method has better spectral localization, i.e., for each vertex, fewer frequencies are selected.

For our last example, we use a real graph composed of weather stations distributed throughout the US. The input signal is the average temperature at each station in 2014. The localized PPR windows closely follow the graph topology, being more isotropic or anisotropic depending on the local graph topology. The spectrogram obtained with the proposed PPR-based method presents clear patterns, which are coherent with the spatial arrangement of the graph vertices.

Figure 6: Graph of weather stations in the US, with color representing the average annual temperature in 2014. The graph was built by connecting each station to its 6 spatial nearest neighbors (the corresponding edge weight is the spatial distance between both stations).
Figure 7: Different personalized PageRank windows.
Figure 8: Spectrogram obtained with the proposed technique (for better visualization, we only show the first 30 components).
Figure 9: For the vertex marked with a red square (and a red arrow), we show the correlation between its spectral signature (its column in the spectrogram) and the signatures of the other vertices.
Figure 10: Clear patterns appear in the spectrogram, where nodes with similar spectral signature are localized in spatially coherent areas (e.g., Florida). We use and restrict the computations to the first eigenvectors.
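The graph construction described in the caption of Figure 6 (each station connected to its 6 spatial nearest neighbors, with the spatial distance as edge weight) can be sketched as follows. The coordinates are random placeholders, the distance is plain Euclidean distance in the plane (real station coordinates would likely call for a geodesic distance), and the symmetrization rule is our assumption.

```python
import numpy as np


def knn_graph(points, k):
    """Symmetric k-nearest-neighbor graph whose edge weights are pairwise distances."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    n = len(points)
    W = np.zeros((n, n))
    for i in range(n):
        order = np.argsort(dist[i])
        neighbors = [j for j in order if j != i][:k]   # k closest points, excluding i itself
        W[i, neighbors] = dist[i, neighbors]
    return np.maximum(W, W.T)    # keep an edge if either endpoint selected it (assumption)


rng = np.random.default_rng(0)
points = rng.uniform(size=(50, 2))   # hypothetical station coordinates
W = knn_graph(points, k=6)
```

After symmetrization some vertices have more than 6 neighbors, since a vertex can be selected by stations that are not among its own nearest neighbors.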

4 Conclusions

In this work we presented an extension of the classical short-time Fourier transform (STFT) to signals defined over graphs. We have shown with different examples that this new short-graph Fourier transform can be a valuable tool for extracting information from signals on graphs.

More broadly, we established the connection between local spectral graph theory and localized spectral analysis of graph signals. This is the first work that studies the use of personalized PageRank vectors as fundamental building blocks for local spectral analysis of graph signals.

The STFT becomes ineffective when the signal includes structures having different time-frequency resolution, some very localized in time and others in frequency. Wavelets address this issue by changing the time and frequency resolution. Just as diffusion wavelets extended the definition to the graph domain using powers of a graph diffusion operator, we plan on extending the use of personalized PageRank vectors to produce an alternate definition of wavelets on graphs.

Footnotes

  1. Work partially supported by NSF, ONR, NGA, ARO, and NSSEFF.

References

  1. J. Allen, “Short term spectral analysis, synthesis, and modification by discrete Fourier transform,” IEEE Trans. Acoust., Speech, Signal Process., vol. 25, no. 3, 1977.
  2. M. Vetterli and J. Kovačević, Wavelets and Subband Coding, Prentice Hall, 1995.
  3. D. Shuman, S. Narang, P. Frossard, A. Ortega, and P. Vandergheynst, “The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains,” IEEE Signal Process. Mag., vol. 30, no. 3, pp. 83–98, 2013.
  4. A. Sandryhaila and J. Moura, “Discrete signal processing on graphs: Frequency analysis,” IEEE Trans. Signal Process., vol. 62, no. 12, pp. 3042–3054, 2014.
  5. D. Shuman, B. Ricaud, and P. Vandergheynst, “A windowed graph Fourier transform,” in SSP, 2012.
  6. R. Coifman and M. Maggioni, “Diffusion wavelets,” Appl. Comput. Harmon. Anal., vol. 21, no. 1, pp. 53–94, 2006.
  7. R. Andersen and K. Lang, “Communities from seed sets,” in WWW, 2006.
  8. R. Andersen, F. Chung, and K. Lang, “Local graph partitioning using PageRank vectors,” in FOCS, 2006.
  9. F. Chung, Spectral Graph Theory, vol. 92, American Mathematical Soc., 1997.
  10. J. Shi and J. Malik, “Normalized cuts and image segmentation,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 8, pp. 888–905, 2000.
  11. Y. Weiss, A. Torralba, and R. Fergus, “Spectral hashing,” NIPS, 2008.
  12. M. Mahoney, L. Orecchia, and N. Vishnoi, “A local spectral method for graphs: With applications to improving graph partitions and exploring data graphs locally,” J. Mach. Learn. Res., vol. 13, no. 1, pp. 2339–2365, 2012.
  13. L. Page, S. Brin, R. Motwani, and T. Winograd, “The PageRank citation ranking: Bringing order to the web,” Tech. Rep. 1999-66, Stanford InfoLab, 1999.
  14. G. Jeh and J. Widom, “Scaling personalized web search,” in WWW, 2003.
  15. K. Kloster and D. Gleich, “Personalized PageRank solution paths,” arXiv:1503.00322, 2015.
  16. T. Peixoto, “The graph-tool Python library,” figshare, 2014.