
# Distributed Time-Varying Graph Filtering

Elvin Isufi, Andreas Loukas, Andrea Simonetto, and Geert Leus. Authors contributed equally in the preparation of this manuscript. E. Isufi, A. Simonetto, and G. Leus are with the Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, 2628 CD Delft, The Netherlands. A. Loukas is with the Department of Telecommunication Systems, TU Berlin, Germany. E-mails: {e.isufi-1, a.simonetto, g.j.t.leus}@tudelft.nl, a.loukas@tu-berlin.de. This manuscript presents a generalisation of [1]. This research was supported in part by STW under the D2S2 project from the ASSYS program (project 10561).

###### Abstract

One of the cornerstones of the field of signal processing on graphs is the graph filter, a direct analogue of classical filters, but intended for signals defined on graphs. This work brings forth new insights on the distributed graph filtering problem. We design a family of autoregressive moving average (ARMA) recursions, which (i) are able to approximate any desired graph frequency response, and (ii) give exact solutions for specific graph signal denoising and interpolation problems. The philosophy of designing the ARMA coefficients independently from the underlying graph renders the ARMA graph filters suitable in static and, particularly, time-varying settings. The latter occur when the graph signal and/or graph topology change over time. We show that, in the case of a time-varying graph signal, our approach extends naturally to a two-dimensional filter, operating concurrently in the graph and regular time domains. We also derive the graph filter behavior, as well as sufficient conditions for filter stability when the graph and signal are time-varying. The analytical and numerical results presented in this paper illustrate that ARMA graph filters are practically appealing for static and time-varying settings, as predicted by the theoretical derivations.

Keywords— distributed graph filtering, signal processing on graphs, infinite impulse response graph filters, autoregressive moving average graph filters, time-varying graph signals, time-varying graphs.

## I Introduction

Due to their ability to capture the complex relationships present in many high-dimensional datasets, graphs have emerged as a favorite tool for data analysis. Indeed, in recent years we have seen significant efforts to extend classical signal processing methods to the graph setting, where, instead of regular low-dimensional signals (e.g., a temporal or spatial signal), one is interested in graph signals, i.e., signals defined over the nodes of a graph [2]. The introduction of a Fourier-like transform for graph signals brought the tools to analyze these signals not only in the node domain, but also in the graph frequency domain [2, 3, 4]. One of the key tools of graph signal analysis is the graph filter. In a direct analogy to classical filters, graph filters process a graph signal by selectively amplifying its graph Fourier coefficients. This renders them ideal for a wide range of tasks: graph signal smoothing and denoising [5, 6], classification [7, 8, 9], interpolation [10], segmentation [11], wavelet construction [12], and dictionary learning [13], among others.

Distributed implementations of filters on graphs emerged as a way of increasing the scalability of computation [6, 14, 15, 16]. In this way, a desired graph filtering operation is performed by only local information exchange between neighbors and there is no need for a node to have access to all the data. Nevertheless, being inspired by finite impulse response (FIR) graph filters, these methods are sensitive to time variations, such as time-varying signals and/or graphs. An alternative approach, namely distributed infinite impulse response (IIR) graph filtering, was recently proposed [17, 18]. Compared to FIR graph filters, IIR filters allow for the computation of a larger family of responses, and give exact rather than approximate solutions to specific denoising [5] and interpolation [10] problems. Yet the issue of time variations has so far been unresolved.

In a different context, we introduced IIR filter design (in fact, prior to [18]) using an autoregressive process called the potential kernel [19, 11]. These graph filters were shown to facilitate information processing tasks in sensor networks, such as smoothing and event region detection, but, due to their ad-hoc design, they only accomplished a limited subset of filtering objectives. In this paper, we build upon our prior work to develop more general autoregressive moving average (ARMA) graph filters of any order, using parallel or periodic concatenations of the potential kernel. The design philosophy of these graph filters allows for the approximation of any desired graph frequency response without knowing the structure of the underlying graph. In this way, we design the filter coefficients independently of the particular graph. This allows the ARMA filters to be universally applicable for any graph structure, and in particular when the graph varies over time, or when the graph structure is unknown to the designer.

Though ARMA graph filters belong to the class of IIR graph filters, they have a distinct design philosophy which bestows them the ability to filter graph signals not only in the graph frequency domain, but also in the regular temporal frequency domain (in case the graph signal is time-varying). Specifically, our design extends naturally to time-varying signals leading to two-dimensional ARMA filters: a filter in the graph domain as well as a filter in the time domain.

Our contributions are twofold:

(i) Distributed graph filters (Sections III and IV). We propose two types of autoregressive moving average (ARMA) recursions, namely the parallel and periodic implementation, which attain a rational graph frequency response. Both methods are implemented distributedly, attain fast convergence, and have message and memory requirements that are linear in the number of graph edges and the approximation order. Using a variant of Shanks’ method, we are able to design graph filters that approximate any desired graph frequency response. In addition, we give exact closed-form solutions for tasks such as Tikhonov and Wiener-based graph signal denoising and graph signal interpolation under smoothness assumptions.

(ii) Time-varying graphs and signals (Section V). We begin by providing a complete temporal characterization of ARMA graph filters w.r.t. time-varying graph signals. Our results show that the proposed recursions naturally extend to two-dimensional filters operating simultaneously in the graph-frequency domain and in the time-frequency domain. We also discuss the ARMA recursion behavior when both the graph topology and graph signal are time-varying. Specifically, we provide sufficient conditions for filter stability, and show that a decomposition basis exists (uniquely determined by the sequence of graph realizations), over which the filters achieve the same frequency response as in the static case.

Our results are validated by simulations in Section VI, and conclusions are drawn in Section VII.

#### Notation and terminology.

We indicate a scalar-valued variable by a normal letter (i.e., $x$ or $X$); a bold lowercase letter $\mathbf{x}$ indicates a vector variable and a bold uppercase letter $\mathbf{X}$ a matrix variable. With $[\mathbf{x}]_i$ and $[\mathbf{X}]_{i,j}$ we indicate the entries of $\mathbf{x}$ and $\mathbf{X}$, respectively. For clarity, if needed, we will refer to these entries also as $x_i$ and $X_{i,j}$, and to the $i$-th column of $\mathbf{X}$ as $\mathbf{x}_i$. We indicate by $|x|$ the absolute value of $x$ and by $\|\mathbf{x}\|$ and $\|\mathbf{X}\|$ the 2-norm of the vector $\mathbf{x}$ and the spectral norm of the matrix $\mathbf{X}$, respectively. To characterize convergence, we adopt the term linear convergence, which asserts that a recursion converges to its stationary value exponentially with time (i.e., linearly on a logarithmic scale) [20].

## II Preliminaries

Consider an undirected graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ of $N$ nodes and $M$ edges, where $\mathcal{V}$ indicates the set of nodes and $\mathcal{E}$ the set of edges. Let $\mathbf{x} \in \mathbb{R}^N$ be the graph signal defined on the graph nodes, whose $i$-th component $x_i$ represents the value of the signal at the $i$-th node, denoted as $u_i$.

#### Graph Fourier transform (GFT).

The GFT transforms a graph signal $\mathbf{x}$ into the graph frequency domain by projecting it onto the basis spanned by the eigenvectors of the graph Laplacian $\mathbf{L}$, typically defined as the discrete Laplacian $\mathbf{L}_{\rm d} = \mathbf{D} - \mathbf{W}$ or the normalized Laplacian $\mathbf{L}_{\rm n} = \mathbf{D}^{-1/2}\mathbf{L}_{\rm d}\mathbf{D}^{-1/2}$, with $\mathbf{W}$ the adjacency matrix and $\mathbf{D}$ the diagonal degree matrix. Since the Laplacian matrix of an undirected graph is symmetric, its eigenvectors form an orthonormal basis, and the forward and inverse GFTs of $\mathbf{x}$ are $\hat{\mathbf{x}} = \boldsymbol{\Phi}^\top\mathbf{x}$ and $\mathbf{x} = \boldsymbol{\Phi}\hat{\mathbf{x}}$, respectively, where the $n$-th column of the eigenvector matrix $\boldsymbol{\Phi}$ is indicated as $\boldsymbol{\phi}_n$. The corresponding eigenvalues are denoted as $\lambda_n$ and indicate the graph frequencies. For an extensive review of the properties of the GFT, we refer to [2, 3]. To avoid any restrictions on the generality of our approach, in the following we present our results for a general representation matrix $\mathbf{L}$. We only require that $\mathbf{L}$ is symmetric and local: for all $i \neq j$, $L_{ij} = 0$ whenever nodes $u_i$ and $u_j$ are not neighbors and $L_{ij} = L_{ji}$ otherwise. We derive our results for a class of graphs with general Laplacian matrices in some restricted set $\mathcal{L}$. We assume that, for every $\mathbf{L} \in \mathcal{L}$, the minimum eigenvalue is bounded below by $l_1$ and the maximum eigenvalue is bounded above by $l_2$. Hence, all considered graphs have a bounded spectral norm, i.e., $\|\mathbf{L}\| \leq \varrho$ with $\varrho = \max(|l_1|, |l_2|)$. For instance, when $\mathbf{L} = \mathbf{L}_{\rm d}$, we can take $l_1 = 0$ and $l_2 = 2d_{\max}$, with $d_{\max}$ related to the maximum degree of any of the graphs. When $\mathbf{L} = \mathbf{L}_{\rm n}$, we can consider $l_1 = 0$ and $l_2 = 2$.
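As a minimal numerical sketch of these definitions, the GFT can be computed from the eigendecomposition of the Laplacian; the toy graph and signal below are illustrative.

```python
import numpy as np

# Toy 4-node path graph; W, D, and the signal x are illustrative.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
D = np.diag(W.sum(axis=1))
L = D - W                        # discrete (combinatorial) Laplacian

lam, Phi = np.linalg.eigh(L)     # graph frequencies and eigenvectors
x = np.array([1.0, 2.0, 3.0, 4.0])
x_hat = Phi.T @ x                # forward GFT
x_rec = Phi @ x_hat              # inverse GFT
print(np.allclose(x_rec, x))     # True: the GFT is invertible
```

Since the eigenvectors returned by `eigh` are orthonormal, the forward and inverse transforms compose to the identity.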

#### Graph filters.

A graph filter $\mathbf{H}$ is an operator that acts upon a graph signal $\mathbf{x}$ by amplifying or attenuating its graph Fourier coefficients as

$$\mathbf{H}\mathbf{x} = \sum_{n=1}^{N} H(\lambda_n)\,\langle \mathbf{x}, \boldsymbol{\phi}_n \rangle\, \boldsymbol{\phi}_n, \qquad (1)$$

where $\langle \cdot, \cdot \rangle$ denotes the usual inner product. Let $\lambda_{\min}$ and $\lambda_{\max}$ be the minimum and maximum eigenvalues of $\mathbf{L}$ over all possible graphs. The graph frequency response $H : [\lambda_{\min}, \lambda_{\max}] \to \mathbb{C}$ controls how much $\mathbf{H}$ amplifies the signal component of each graph frequency:

$$H(\lambda_n) = \langle \mathbf{H}\mathbf{x}, \boldsymbol{\phi}_n \rangle / \langle \mathbf{x}, \boldsymbol{\phi}_n \rangle. \qquad (2)$$

We are interested in how we can filter a signal with a graph filter having a user-provided frequency response $H^*(\lambda)$. Note that this prescribed $H^*(\lambda)$ is a continuous function in the graph frequency $\lambda$ and describes the desired response for any graph. This approach brings benefits in those cases when the underlying graph structure is not known to the designer, or in cases where the graph changes over time. The corresponding filter coefficients are thus independent of the graph and universally applicable. Using universal filters, we can design a single set of coefficients that instantiates the same graph frequency response over different bases. To illustrate the universality property, consider the application of a universal graph filter to two different graphs $\mathcal{G}_1$ and $\mathcal{G}_2$ of $N_1$ and $N_2$ nodes, with graph frequency sets $\{\lambda_n^{(1)}\}_{n=1}^{N_1}$, $\{\lambda_n^{(2)}\}_{n=1}^{N_2}$ and eigenvectors $\{\boldsymbol{\phi}_n^{(1)}\}_{n=1}^{N_1}$, $\{\boldsymbol{\phi}_n^{(2)}\}_{n=1}^{N_2}$. The filter will attain the same response over both graphs, but, in each case, supported over a different set of graph frequencies: for $\mathcal{G}_1$, filtering results in $\mathbf{H}_1\mathbf{x} = \sum_{n=1}^{N_1} H^*(\lambda_n^{(1)}) \langle \mathbf{x}, \boldsymbol{\phi}_n^{(1)}\rangle \boldsymbol{\phi}_n^{(1)}$, whereas for $\mathcal{G}_2$ the filtering operation will be $\mathbf{H}_2\mathbf{x} = \sum_{n=1}^{N_2} H^*(\lambda_n^{(2)}) \langle \mathbf{x}, \boldsymbol{\phi}_n^{(2)}\rangle \boldsymbol{\phi}_n^{(2)}$. Thus, the universality lies in the ability to implement $H^*(\lambda)$ correctly on all graphs, which renders it applicable to time-varying graph topologies.

#### Distributed implementation.

It is well known that we can approximate a universal graph filter in a distributed way using a $K$-th order polynomial of $\mathbf{L}$, for instance using Chebyshev polynomials [6]. Define FIR$_K$ as the $K$-th order approximation given by

$$\mathbf{H} = h_0 \mathbf{I} + \sum_{k=1}^{K} h_k \mathbf{L}^k, \qquad (3)$$

where the coefficients $h_k$ can be found either by Chebyshev polynomial fitting [6] or in a least-squares sense, after a (fine) gridding of the frequency range, by minimizing

$$\int_\lambda \Big| \sum_{k=0}^{K} h_k \lambda^k - H^*(\lambda) \Big|^2 d\lambda. \qquad (4)$$

Observe that, in contrast to traditional graph filters, the order $K$ of the considered universal graph filters is not necessarily limited to $N-1$. By increasing $K$, we can approximate arbitrarily well any filter with a square-integrable frequency response. On the other hand, a larger FIR order implies a longer convergence time in a distributed setting, since each node requires information from all its $K$-hop neighbors to attain the desired filter response.

To perform the filtering distributedly in the network, we assume that each node $u_i$ is imbued with memory, computation, and communication capabilities and is in charge of calculating the local filter output $[\mathbf{H}\mathbf{x}]_i$. To do so, node $u_i$ has at its disposal direct access to $x_i$, as well as indirect access to the memory of its neighbors. For simplicity of presentation, we pose an additional restriction on the computation model: we assume that nodes operate in synchronous rounds, each one consisting of a message exchange phase and a local computation phase. In other words, in each round $u_i$ may compute any (polynomial-time computable) function which takes as input variables from its local memory as well as those from the memory of its neighbors in $\mathcal{G}$. Since the algorithms examined in this paper are, in effect, dynamical systems, in the following we adopt the term iteration as a synonym for round. Furthermore, we assume that each iteration lasts exactly one time instant. In this context, the convergence time of an algorithm is measured by the number of iterations the network needs to perform until the output closely reaches the steady state, i.e., the asymptotic output of the dynamical system.

The computation of FIR$_K$ is easily performed distributedly. Since $\mathbf{L}^k\mathbf{x} = \mathbf{L}(\mathbf{L}^{k-1}\mathbf{x})$, each node can compute the $k$-th term from the values of the $(k-1)$-th term in its neighborhood, in an iterative manner. The algorithm performing the FIR$_K$ graph filter terminates after $K$ iterations, and if the efficient recursive implementation of [6] is used, in total each node exchanges $K$ values with its neighbors, meaning that, overall, the network exchanges $\mathcal{O}(MK)$ variables, amounting to a communication cost of $\mathcal{O}(MK)$. Further, since for this computation each node keeps track of the values of its neighbors at every iteration, the network has a memory complexity of $\mathcal{O}(M)$.
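The iterative computation described above can be sketched as follows, where each application of $\mathbf{L}$ stands for one round of neighbor exchanges; the toy graph and coefficients are illustrative.

```python
import numpy as np

# Each multiplication by L corresponds to one neighbor-exchange round,
# so an FIR_K output is obtained after K rounds. Values are illustrative.
def fir_filter(L, x, h):
    y = x.copy()                 # y holds L^k x, starting with k = 0
    out = h[0] * y
    for hk in h[1:]:
        y = L @ y                # one more round: L^k x from L^(k-1) x
        out = out + hk * y
    return out

W = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)
L = np.diag(W.sum(1)) - W
x = np.array([1.0, -1.0, 2.0])
h = [0.5, 0.3, 0.1]              # K = 2 illustrative coefficients
direct = h[0] * x + h[1] * (L @ x) + h[2] * (L @ L @ x)
print(np.allclose(fir_filter(L, x, h), direct))   # True
```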

However, FIR filters exhibit poor performance when the graph signal and/or graph topology are time-varying, since the intermediate steps of the recursion cannot be computed exactly. This is for two reasons: (i) First, the distributed averaging is halted after $K$ iterations, and thus the filter output is not a steady state of the underlying iteration. Accumulated errors in the computation alter the trajectory of the dynamical system, rendering intermediate states and the filter output erroneous. (ii) Second, the input signal is only considered during the first iteration. To track a time-varying signal $\mathbf{x}_t$, a new FIR filter should be started at each time step with $\mathbf{x}_t$ as input, significantly increasing the computation, message, and memory complexities.

To overcome these issues and provide a more solid foundation for graph signal processing, we study ARMA graph filters.

## III ARMA Graph Filters

This section contains our main algorithmic contributions. First, Sections III-A and III-B present distributed algorithms for implementing filters with a complex rational graph frequency response

$$H(\lambda) = \frac{p_{\rm n}(\lambda)}{p_{\rm d}(\lambda)} = \frac{\sum_{k=0}^{K} b_k \lambda^k}{1 + \sum_{k=1}^{K} a_k \lambda^k}, \qquad (5)$$

where $p_{\rm n}(\lambda)$ and $p_{\rm d}(\lambda)$ are the complex numerator and denominator polynomials of order $K$. Note that this structure resembles the frequency response of temporal ARMA filters [21], in which case $\lambda = e^{j\omega}$, with $\omega$ being the temporal frequency. Though both polynomials are presented here to be of the same order, this is not a limitation: different orders for $p_{\rm n}(\lambda)$ and $p_{\rm d}(\lambda)$ are achieved trivially by setting specific coefficients $b_k$ or $a_k$ to zero.

### III-A ARMA$_1$ graph filters

Before describing the full-fledged ARMA$_K$ filters, it helps first to consider a first-order graph filter. Besides being simpler in its construction, an ARMA$_1$ lends itself as the basic building block for creating filters with a rational frequency response of any order (cf. Section III-B). We obtain ARMA$_1$ filters as an extension of the potential kernel [19]. Consider the following first-order recursion

$$\mathbf{y}_{t+1} = \psi \mathbf{L} \mathbf{y}_t + \varphi \mathbf{x}, \qquad (6a)$$
$$\mathbf{z}_{t+1} = \mathbf{y}_{t+1} + c\, \mathbf{x}, \qquad (6b)$$

for an arbitrary initialization $\mathbf{y}_0$ and where the coefficients $\psi$, $\varphi$, and $c$ are complex numbers (to be specified later on). For this recursion, we can prove our first result.

###### Theorem 1.

The frequency response of the ARMA$_1$ is

$$H(\lambda) = c + \frac{r}{\lambda - p}, \quad \text{subject to } |p| > \varrho, \qquad (7)$$

with residue $r = -\varphi/\psi$ and pole $p = 1/\psi$, and with $\varrho$ being the spectral radius bound of $\mathbf{L}$. Recursion (6) converges to it linearly, irrespective of the initial condition $\mathbf{y}_0$ and graph Laplacian $\mathbf{L}$.

###### Proof.

We can write the recursion (6a) at time $t$ in the expanded form

$$\mathbf{y}_t = (\psi\mathbf{L})^t \mathbf{y}_0 + \varphi \sum_{\tau=0}^{t-1} (\psi\mathbf{L})^\tau \mathbf{x}. \qquad (8)$$

When $|\psi|\,\|\mathbf{L}\| < 1$, as $t \to \infty$ this recursion approaches the steady state

$$\mathbf{y} = \lim_{t\to\infty} \mathbf{y}_t = \varphi \sum_{\tau=0}^{\infty} (\psi\mathbf{L})^\tau \mathbf{x} = \varphi\, (\mathbf{I} - \psi\mathbf{L})^{-1}\mathbf{x}, \qquad (9)$$

irrespective of the initial condition $\mathbf{y}_0$. Based on Sylvester's matrix theorem, the matrix $\mathbf{I} - \psi\mathbf{L}$ has the same eigenvectors as $\mathbf{L}$ and its eigenvalues are equal to $1 - \psi\lambda_n$. It is also well known that an invertible matrix shares the eigenvectors of its inverse, with reciprocal eigenvalues. Thus,

$$\mathbf{z} = \lim_{t\to\infty} \mathbf{z}_t = \mathbf{y} + c\,\mathbf{x} = \sum_{n=1}^{N} \Big( c + \frac{\varphi}{1 - \psi\lambda_n} \Big)\, \hat{x}_n\, \boldsymbol{\phi}_n, \qquad (10)$$

and the desired frequency response (7) follows by simple algebra, since $c + \varphi/(1 - \psi\lambda) = c + (-\varphi/\psi)/(\lambda - 1/\psi)$. We arrive at (10) by considering a specific realization of $\mathbf{L}$, for which the set of eigenvalues is discrete. However, the same result is achieved for every other graph realization matrix with a potentially different set of eigenvalues, still in $[l_1, l_2]$. Thus, we can write (7) for all $\lambda \in [l_1, l_2]$. ∎

The stability constraint in (7) can also be understood from a dynamical system perspective. Comparing recursion (6) to a discrete-time state-space equation, it becomes apparent that, when the condition $|\psi|\,\|\mathbf{L}\| < 1$ holds, recursion (6) achieves the frequency response (7). It is useful to observe that, since $\|\mathbf{L}\| \leq \varrho$, the stability region can be enlarged if we work with a shifted version of the Laplacian and thereby decrease the spectral radius bound $\varrho$. For instance, the following shifted Laplacians can be considered: $\mathbf{L} = \mathbf{L}_{\rm d} - \frac{l_2}{2}\mathbf{I}$ with $\lambda \in [-l_2/2,\, l_2/2]$ and $\varrho = l_2/2$, or $\mathbf{L} = \mathbf{L}_{\rm n} - \mathbf{I}$ with $\lambda \in [-1, 1]$ and $\varrho = 1$. Due to these benefits, we will use the shifted versions of the Laplacians in the filter design phase and in the numerical results. (Note that, by Sylvester's matrix theorem, the shifted versions of the Laplacians share the same eigenvectors as the original ones.)

Recursion (6) leads to a simple distributed implementation of a graph filter with a first-order rational frequency response: at each iteration $t$, each node $u_i$ updates its value based on its local signal $x_i$ and a weighted combination of the values of its neighbors. Since each node must exchange its value with each of its neighbors, the message/memory complexity at each iteration is of order $\mathcal{O}(M)$, which leads to an efficient implementation.
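A minimal sketch of recursion (6), verifying numerically that for $|\psi|\,\|\mathbf{L}\| < 1$ the iterates reach the steady state $c\,\mathbf{x} + \varphi(\mathbf{I} - \psi\mathbf{L})^{-1}\mathbf{x}$ of Theorem 1; the graph, signal, and coefficients are illustrative.

```python
import numpy as np

# ARMA_1 recursion (6) on a toy path graph; psi is scaled so that
# |psi| * ||L|| < 1, which guarantees convergence (Theorem 1).
W = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
L = np.diag(W.sum(1)) - W
rho = np.linalg.norm(L, 2)            # spectral norm of L
psi, phi, c = 0.9 / rho, 1.0, 0.5     # illustrative coefficients

x = np.array([1.0, 0.0, -1.0])
y = np.zeros(3)                       # arbitrary initial condition y_0
for _ in range(200):
    y = psi * (L @ y) + phi * x       # (6a)
z = y + c * x                         # (6b)

steady = c * x + phi * np.linalg.solve(np.eye(3) - psi * L, x)
print(np.allclose(z, steady))         # True
```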

###### Remark 1.

Note that there is an equivalence between ARMA$_1$ and FIR filters in approximating rational frequency responses. Indeed, the result obtained in (9) from the ARMA$_1$ in the limit can also be achieved by an FIR filter of infinite order. Further, from (8) we can see that, in finite time $t$ and with $\mathbf{y}_0 = \mathbf{0}$, the ARMA$_1$ output is equivalent to that of an FIR$_t$ filter with coefficients $h_\tau = \varphi\psi^\tau$, plus the direct term $c\,\mathbf{x}$.

This suggests that: (i) the same output as an FIR filter can be obtained from (6), and (ii) the ARMA$_1$ graph filter can be used to design FIR coefficients that approximate frequency responses of the form (7). As we will see later on, due to their implementation form (6), the ARMA filters are more robust than FIRs in a time-varying scenario (time-varying graph and/or time-varying graph signal).

### III-B ARMA$_K$ graph filters

Next, we use the ARMA$_1$ as a building block to derive distributed graph filters with a more complex frequency response. We present two constructions: the first uses a parallel bank of $K$ ARMA$_1$ filters, attaining linear convergence with a communication and memory cost per iteration of $\mathcal{O}(KM)$. The second uses periodic coefficients in order to reduce the memory cost to $\mathcal{O}(M)$, while preserving the linear convergence of the parallel ARMA$_K$ filters.

#### Parallel ARMA$_K$ filters.

A larger variety of filter responses can be obtained by adding the outputs of a parallel ARMA$_1$ filter bank. Let us denote with superscript $(k)$ the terms that correspond to the $k$-th ARMA$_1$ filter, for $k = 1, \dots, K$. With this notation in place, the output $\mathbf{z}_{t+1}$ of the ARMA$_K$ filter at time instant $t+1$ is

$$\mathbf{y}^{(k)}_{t+1} = \psi^{(k)} \mathbf{L}\, \mathbf{y}^{(k)}_t + \varphi^{(k)} \mathbf{x}, \qquad (11a)$$
$$\mathbf{z}_{t+1} = \sum_{k=1}^{K} \mathbf{y}^{(k)}_{t+1} + c\, \mathbf{x}, \qquad (11b)$$

where each $\mathbf{y}^{(k)}_0$ is arbitrary.

###### Theorem 2.

The frequency response of a parallel ARMA$_K$ is

$$H(\lambda) = c + \sum_{k=1}^{K} \frac{r_k}{\lambda - p_k}, \quad \text{subject to } |p_k| > \varrho, \qquad (12)$$

with residues $r_k = -\varphi^{(k)}/\psi^{(k)}$ and poles $p_k = 1/\psi^{(k)}$, and with $\varrho$ the spectral radius bound of $\mathbf{L}$. Recursion (11) converges to it linearly, irrespective of the initial conditions $\mathbf{y}^{(k)}_0$ and graph Laplacian $\mathbf{L}$.

###### Proof.

Follows straightforwardly from Theorem 1. ∎

The frequency response of a parallel ARMA$_K$ is therefore a rational function with numerator and denominator polynomials of order $K$ (presented here in partial fraction form). In addition, since we are simply running $K$ ARMA$_1$ filters in parallel, the communication and memory complexities are $K$ times those of the ARMA$_1$ graph filter. Note also that the considerations of Remark 1 extend to the parallel ARMA$_K$ filter.
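The parallel construction can be sketched as follows: $K$ ARMA$_1$ branches are iterated side by side and their outputs summed, with branch coefficients chosen here only so that each pole satisfies the stability condition (all values illustrative).

```python
import numpy as np

# Parallel ARMA_K (11) with K = 2 branches; branch coefficients are
# illustrative and satisfy |p_k| = |1/psi_k| > rho (stability).
W = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], dtype=float)
L = np.diag(W.sum(1)) - W
rho = np.linalg.norm(L, 2)
psis = [0.5 / rho, -0.3 / rho]
phis = [1.0, 0.7]
c = 0.2

x = np.array([1.0, 2.0, 3.0])
ys = [np.zeros(3) for _ in psis]
for _ in range(300):
    ys = [p * (L @ y) + f * x for y, p, f in zip(ys, psis, phis)]  # (11a)
z = sum(ys) + c * x                                                # (11b)

steady = c * x + sum(f * np.linalg.solve(np.eye(3) - p * L, x)
                     for p, f in zip(psis, phis))
print(np.allclose(z, steady))    # True
```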

#### Periodic ARMA$_K$ filters.

We can decrease the memory requirements of the parallel implementation by letting the filter coefficients vary periodically in time. Our periodic filters take the following form

$$\mathbf{y}_{t+1} = (\theta_t \mathbf{I} + \psi_t \mathbf{L})\, \mathbf{y}_t + \varphi_t \mathbf{x}, \qquad (13a)$$
$$\mathbf{z}_{t+1} = \mathbf{y}_{t+1} + c\, \mathbf{x}, \qquad (13b)$$

where $\mathbf{y}_0$ is arbitrary, the output $\mathbf{z}_t$ is valid every $K$ iterations, and the coefficients are periodic with period $K$: $\theta_t = \theta_{t \bmod K}$, $\psi_t = \psi_{t \bmod K}$, and $\varphi_t = \varphi_{t \bmod K}$, with $t \bmod K$ an integer in $\{0, \dots, K-1\}$.

###### Theorem 3.

The frequency response of a periodic ARMA$_K$ filter is

$$H(\lambda) = c + \frac{\sum_{k=0}^{K-1} \Big( \prod_{\tau=k+1}^{K-1} (\theta_\tau + \psi_\tau \lambda) \Big)\, \varphi_k}{1 - \prod_{k=0}^{K-1} (\theta_k + \psi_k \lambda)}, \qquad (14)$$

subject to the stability constraint

$$\Big| \prod_{k=0}^{K-1} (\theta_k + \psi_k \varrho) \Big| < 1, \qquad (15)$$

with $\varrho$ being the spectral radius bound of $\mathbf{L}$. Recursion (13) converges to it linearly, irrespective of the initial condition $\mathbf{y}_0$ and graph Laplacian $\mathbf{L}$.

(The proof is deferred to the appendix.)

By some algebraic manipulation, we can see that the frequency response of periodic ARMA$_K$ filters is also a rational function of order $K$. We can also observe that the stability criterion of the periodic ARMA$_K$ is more involved than that of the parallel implementation. As we are now dealing with $K$ ARMA$_1$ graph filters interleaved in time, to guarantee their joint stability one does not necessarily have to examine them independently (requiring, for instance, that $|\theta_k + \psi_k \varrho| < 1$ for each $k$). Instead, it is sufficient that the product in (15) is smaller than one in magnitude. To illustrate this, notice that if $K > 1$, the periodic ARMA$_K$ can be stable even if some of the ARMA$_1$ graph filters it is composed of are unstable.

When computing a periodic ARMA$_K$ distributedly, in each iteration each node stores and exchanges only its current value with its neighbors, yielding a memory complexity of $\mathcal{O}(M)$, rather than the $\mathcal{O}(KM)$ of the parallel implementation (after each iteration, the values are overwritten). On the other hand, since the output of the periodic ARMA$_K$ is only valid after $K$ iterations, the communication complexity is again $\mathcal{O}(KM)$. The low memory requirements of the periodic ARMA$_K$ render it suitable for resource-constrained devices.
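A sketch of the periodic recursion (13) with illustrative coefficients: a single state vector is updated while the coefficients cycle with period $K$, and the steady state can be checked against the fixed point of the per-period map.

```python
import numpy as np

# Periodic ARMA_K (13) with K = 2 and illustrative coefficients; the
# product of (|theta_k| + |psi_k| * rho) over one period is < 1 here,
# so the per-period map is a contraction.
W = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
L = np.diag(W.sum(1)) - W
rho = np.linalg.norm(L, 2)
K = 2
thetas = [0.2, 0.1]
psis = [0.3 / rho, 0.2 / rho]
phis = [1.0, 0.5]
c = 0.0

x = np.array([1.0, -2.0, 1.0])
y = np.zeros(3)
for t in range(50 * K):               # run whole periods
    k = t % K                         # periodic coefficients
    y = thetas[k] * y + psis[k] * (L @ y) + phis[k] * x   # (13a)
z = y + c * x                         # (13b), read once per period

# Fixed point of the period map: y -> T1 (T0 y + phi_0 x) + phi_1 x
I = np.eye(3)
T0 = thetas[0] * I + psis[0] * L
T1 = thetas[1] * I + psis[1] * L
steady = np.linalg.solve(I - T1 @ T0, phis[0] * (T1 @ x) + phis[1] * x)
print(np.allclose(z, steady))         # True
```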

## IV ARMA Filter Design

In this section we focus on selecting the coefficients of our filters. We begin by showing how to approximate any given frequency response with an ARMA$_K$ filter, using a variant of Shanks' method [22]. This approach gives us stable filters, ensuring the same selectivity as universal FIR graph filters. Section IV-B then provides explicit (and exact) filter constructions for two graph signal processing problems which were up to now only approximated: Tikhonov- and Wiener-based signal denoising [6, 23] and interpolation under smoothness assumptions [10].

### IV-A The Filter Design Problem

Given a graph frequency response $H^*(\lambda)$ and filter order $K$, our objective is to find the complex polynomials $p_{\rm n}(\lambda)$ and $p_{\rm d}(\lambda)$ of order $K$ that minimize

$$\int_\lambda \Big| \frac{p_{\rm n}(\lambda)}{p_{\rm d}(\lambda)} - H^*(\lambda) \Big|^2 d\lambda, \qquad (16)$$

while ensuring that the chosen coefficients result in a stable system (see the constraints in Theorems 2 and 3). From $p_{\rm n}(\lambda)$ and $p_{\rm d}(\lambda)$ one computes the filter coefficients $(\psi^{(k)}, \varphi^{(k)}, c)$ or $(\theta_k, \psi_k, \varphi_k, c)$ by algebraic manipulation.

###### Remark 2.

Even if we constrain ourselves to pass-band filters over the admissible frequency range, it is impossible to design our coefficients based on the classical design methods developed for IIR filters (e.g., Butterworth, Chebyshev). The same issue is present when using a rational fitting approach, e.g., Padé and Prony's method. This is due to the fact that, now, the filters are rational in the variable $\lambda$ and the notion of frequency stems neither from $e^{j\omega}$ nor from $j\omega$. This differentiates the problem from the design of continuous- and discrete-time filters. Further, the stability constraint of the ARMA$_K$ is different from classical filter design, where the poles of the transfer function must lie within (not outside) the unit circle.

To illustrate this remark, consider the Butterworth-like graph frequency response $H^*(\lambda) = 1/(1 + (\lambda/\lambda_c)^K)$, where $\lambda_c$ is the desired cut-off frequency. For $K = 2$, one finds that it has two complex conjugate poles at $\pm j\lambda_c$. Thus, the behavior of these filters depends on the cut-off frequency and stability is not always guaranteed. For this particular case, and for a parallel implementation, the filters are not stable whenever $\lambda_c \leq \varrho$.

#### Design method.

Similar to Shanks’ method, we approximate the filter coefficients as follows:

Denominator. Determine $a_k$ for $k = 1, \dots, K$ by first finding a $\hat{K}$-th order polynomial approximation $\hat{H}(\lambda)$ of $H^*(\lambda)$ using polynomial regression, and then solving the coefficient-wise system of equations $p_{\rm d}(\lambda)\hat{H}(\lambda) = p_{\rm n}(\lambda)$ for the terms of degree higher than $K$.

Numerator. Determine $b_k$ for $k = 0, \dots, K$ by solving, w.r.t. $b_k$, the least-squares problem of minimizing $\int_\lambda |p_{\rm n}(\lambda) - H^*(\lambda)\, p_{\rm d}(\lambda)|^2 d\lambda$.

Once the numerator ($b_k$) and denominator ($a_k$) coefficients of the target rational response are found:

(i) Parallel design. Perform the partial fraction expansion of $p_{\rm n}(\lambda)/p_{\rm d}(\lambda)$ to find the residues ($r_k$) and poles ($p_k$). Then, the filter coefficients $\psi^{(k)}$, $\varphi^{(k)}$, and $c$ can be found by exploiting their relation with $r_k$ and $p_k$ in Theorem 2.

(ii) Periodic design. Identify $\theta_k$ and $\psi_k$ by computing the roots of the (source) denominator polynomial in (14) and equating them to the roots of the (target) denominator $p_{\rm d}(\lambda)$. It is suggested to fix part of the coefficients beforehand, which has the effect of putting the two polynomials in similar form. Once the coefficients $\theta_k$ (and $\psi_k$) have been set, we obtain $\varphi_k$ by equating the numerator target and source polynomials.

The method is also suitable for numerator and denominator polynomials of different orders. We advocate, however, the use of equal orders, because this yields the highest approximation accuracy for a given communication/memory complexity.

The most crucial step is the approximation of the denominator coefficients. By fitting $\hat{H}(\lambda)$ to the continuous response $H^*(\lambda)$ instead of to graph-specific frequencies, we are able to compute the coefficients independently of the graph. Increasing $\hat{K}$ often leads to a (slight) increase in accuracy, but at the price of slower convergence and higher sensitivity to numerical errors (such as those caused by packet loss). Especially for sharp functions, such as the ones shown in Fig. 1, a high-order polynomial approximation results in very large coefficients, which affect the numerical stability of the filters and push the poles closer to the unit circle. For this reason, in the remainder of this paper we keep $\hat{K}$ small.

Though the proposed design method does not come with theoretical stability guarantees, it has been found to consistently produce stable filters. (This has been observed while working with shifted Laplacians, and especially with the shifted normalized Laplacian $\mathbf{L} = \mathbf{L}_{\rm n} - \mathbf{I}$.) Fig. 1 illustrates in solid lines the frequency responses of three ARMA$_K$ filters, designed to approximate a step function (top) and a window function (bottom). For reproducibility, Table I in the Appendix summarizes the filter coefficients of the parallel implementation for different $K$.
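The two least-squares steps of the design method can be sketched as follows on a frequency grid; the grid, target, and orders are illustrative, and the target response is deliberately chosen rational so that the fit recovers it. This is only a schematic of the procedure, not the paper's exact implementation.

```python
import numpy as np

# Schematic two-step fit: (1) polynomial fit of 1/H* for the denominator,
# (2) least-squares fit of the numerator against H* * p_d. The target
# H*(lambda) = 1/(1 + lambda^2) is rational, so the fit recovers it.
lam = np.linspace(-1.0, 1.0, 200)     # shifted-Laplacian frequency grid
H_target = 1.0 / (1.0 + lam ** 2)
K = 2

# Denominator: fit a degree-K polynomial to 1/H*, normalized so the
# constant coefficient equals 1, matching p_d in (5).
d = np.polyfit(lam, 1.0 / H_target, K)    # highest-degree first
d = d / d[-1]
a = d[::-1]                               # a[0] = 1, then a_1, ..., a_K

# Numerator: least-squares fit of p_n(lam) to H*(lam) * p_d(lam).
pd_vals = np.polyval(d, lam)
V = np.vander(lam, K + 1, increasing=True)   # columns 1, lam, lam^2
b, *_ = np.linalg.lstsq(V, H_target * pd_vals, rcond=None)

H_fit = (V @ b) / pd_vals
err = np.sqrt(np.mean((H_fit - H_target) ** 2))
print(err < 1e-6)                         # True: rational target recovered
```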

### IV-B Exact and Universal Graph Filter Constructions

We proceed to present exact (and in certain cases explicit) graph filter constructions for particular graph signal denoising and interpolation problems. In contrast to previous work, the proposed filters are universal, that is, they are designed without knowledge of the graph structure. Indeed, the filter coefficients are found independently from the eigenvalues of the graph Laplacian. This makes the ARMA filters suitable for any graph, and ideal for cases when the graph structure is unknown or when the complexity of the eigenvalue decomposition becomes prohibitive.

#### Tikhonov-based denoising.

Given a noisy signal $\mathbf{t} = \mathbf{x} + \mathbf{n}$, where $\mathbf{x}$ is the true signal and $\mathbf{n}$ is noise, the objective is to recover $\mathbf{x}$ [6, 2, 23]. When $\mathbf{x}$ is smooth w.r.t. the graph, denoising can be formulated as the regularized problem

$$\tilde{\mathbf{x}} = \arg\min_{\mathbf{x} \in \mathbb{C}^N} \|\mathbf{x} - \mathbf{t}\|_2^2 + w\, \mathbf{x}^\top \mathbf{L}^K \mathbf{x}, \qquad (17)$$

where the first term asks for a denoised signal that is close to $\mathbf{t}$, and the second uses the quadratic form of $\mathbf{L}^K$ to penalize signals that are not smooth. In (17), the admitted choices of $\mathbf{L}$ are limited to the discrete Laplacian $\mathbf{L}_{\rm d}$ or the normalized Laplacian $\mathbf{L}_{\rm n}$ (without shifting). The positive regularization weight $w$ allows us to trade off between the two terms of the objective function. Being a convex problem, the global minimum is found by setting the gradient to zero, resulting in

$$\tilde{\mathbf{x}} = (\mathbf{I} + w\mathbf{L}^K)^{-1}\mathbf{t} = \sum_{n=1}^{N} \frac{1}{1 + w\lambda_n^K}\, \langle \mathbf{t}, \boldsymbol{\phi}_n\rangle\, \boldsymbol{\phi}_n. \qquad (18)$$

It follows that (18) can be computed exactly with an ARMA$_K$ with frequency response

$$H(\lambda) = \frac{1}{1 + w\lambda^K} = \frac{1/w}{\prod_{k=1}^{K} (\lambda - p_k)}, \qquad (19)$$

with poles $p_k = w^{-1/K} e^{j\gamma_k}$, where $\gamma_k = (2k-1)\pi/K$ are the phases of the $K$-th roots of $-1$. From the stability condition of the parallel ARMA$_K$, we have stable filters as long as $|p_k| > \varrho$, which for the particular expression of $p_k$ becomes $w < \varrho^{-K}$.

The solution of (18) can also be computed by an ARMA$_K$ implemented on the shifted Laplacians, with a notable improvement in the stability of the filters. For the shifted Laplacian we can reformulate (19) as

$$H(\lambda) = \frac{1}{1 + w\big(\lambda + \frac{l_2}{2}\big)^K} = \frac{1/w}{\prod_{k=1}^{K} (\lambda - p_k)}, \qquad (20)$$

where now $p_k = -\frac{l_2}{2} + w^{-1/K} e^{j\gamma_k}$ for $k = 1, \dots, K$. Again, from the stability condition of the parallel ARMA$_K$, $|p_k| > \varrho$, or equivalently $|p_k|^2 > \varrho^2$, we now obtain stable filters as long as

$$\Big( -\frac{l_2}{2} + \frac{\cos(\gamma_k)}{\sqrt[K]{w}} \Big)^2 + \frac{\sin^2(\gamma_k)}{\sqrt[K]{w}^{\,2}} > \varrho^2, \qquad (21)$$

or equivalently

$$\Big( \frac{l_2^2}{4} - \varrho^2 \Big) \sqrt[K]{w}^{\,2} - l_2 \cos(\gamma_k)\, \sqrt[K]{w} + 1 > 0, \qquad (22)$$

is satisfied for $k = 1, \dots, K$. For the shifted normalized Laplacian, with $l_2 = 2$ and $\varrho = 1$, the stability condition simplifies to

$$2\cos(\gamma_k)\, \sqrt[K]{w} < 1, \qquad (23)$$

which is always met for the standard choices $K = 1$ (quadratic regularization) and $K = 2$ (total variation). (Even though $w$ is a free parameter, for $K = 1, 2$ the value of $\cos(\gamma_k)$ in (23) is either $-1$ or $0$, due to the expression of $\gamma_k$.) For these values of $K$, and for different values of the regularization weight $w$, we show in Fig. 2 the normalized error between the output of the ARMA$_K$ recursion and the solution of the optimization problem (17), as a function of time.

For both (19) and (20), the denominator coefficients of the corresponding parallel ARMA$_K$ filter can be found as $\psi^{(k)} = 1/p_k$. Meanwhile, the numerator coefficients are found in two steps: (i) express (19), (20) in the partial fraction form (12) to find the residues $r_k$, and (ii) take $\varphi^{(k)} = -r_k \psi^{(k)}$.
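As a sketch connecting the two parts: for $K = 1$ and a small enough weight $w$, an ARMA$_1$ with $\psi = -w$, $\varphi = 1$, $c = 0$ reaches the Tikhonov solution $(\mathbf{I} + w\mathbf{L})^{-1}\mathbf{t}$; the toy graph and values below are illustrative.

```python
import numpy as np

# Tikhonov denoising (17) with K = 1: the ARMA_1 with psi = -w, phi = 1,
# c = 0 has steady state (I + w L)^(-1) t. Graph (a 4-cycle), w, and t
# are illustrative; w is small enough that |psi| * ||L|| < 1.
W = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
L = np.diag(W.sum(1)) - W
w = 0.1
t = np.array([1.0, 0.2, 0.9, 0.1])    # noisy observation

y = np.zeros(4)
for _ in range(200):
    y = -w * (L @ y) + t              # ARMA_1 with psi = -w, phi = 1

tikhonov = np.linalg.solve(np.eye(4) + w * L, t)
print(np.allclose(y, tikhonov))       # True
```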

#### Wiener-based denoising.

When the statistical properties of the graph signal and noise are available, it is common to opt for a Wiener filter, i.e., the linear filter that minimizes the mean-squared error (MSE)

$$\tilde{\mathbf{H}} = \arg\min_{\mathbf{H} \in \mathbb{R}^{N\times N}} \mathbb{E}\big[ \|\mathbf{H}\mathbf{t} - \mathbf{x}\|_2^2 \big] \quad \text{and} \quad \tilde{\mathbf{x}} = \tilde{\mathbf{H}}\mathbf{t}, \qquad (24)$$

where, as before, $\mathbf{t} = \mathbf{x} + \mathbf{n}$ is the graph signal corrupted with additive noise. It is well known that, when $\mathbf{x}$ and $\mathbf{n}$ are zero-mean with covariances $\boldsymbol{\Sigma}_x$ and $\boldsymbol{\Sigma}_n$, respectively, the solution of (24) is

$$\tilde{\mathbf{x}} = \boldsymbol{\Sigma}_x (\boldsymbol{\Sigma}_x + \boldsymbol{\Sigma}_n)^{-1} \mathbf{t}, \qquad (25)$$

given that the matrix $\mathbold{\Sigma}_x+\mathbold{\Sigma}_n$ is non-singular. The above linear system can be solved by a graph filter when matrices $\mathbold{\Sigma}_x$ and $\mathbold{\Sigma}_n$ share the eigenvectors of the Laplacian matrix $\mathbold{L}$. Denote by $\sigma_x(\lambda_n)$ the eigenvalue of matrix $\mathbold{\Sigma}_x$ which corresponds to the $n$-th eigenvector $\mathbold{\phi}_n$ of $\mathbold{L}$, and correspondingly $\sigma_n(\lambda_n)$ for $\mathbold{\Sigma}_n$. We then have that

$$\tilde{\mathbold{x}} = \sum_{n=1}^{N}\frac{\sigma_x(\lambda_n)}{\sigma_x(\lambda_n)+\sigma_n(\lambda_n)}\langle\mathbold{t},\mathbold{\phi}_n\rangle\mathbold{\phi}_n. \qquad (26)$$

It follows that, when $\sigma_x(\lambda)$ and $\sigma_n(\lambda)$ are rational functions (of $\lambda$) of finite order, the Wiener filter corresponds to an ARMA graph filter. Notice that the corresponding filters are still universal, as the ARMA coefficients depend on the rational functions $\sigma_x(\lambda)$ and $\sigma_n(\lambda)$, but not on the specific eigenvalues of the graph Laplacian. In a different context, similar results have also been observed in semi-supervised learning [24].

Let us illustrate the above with an example. Suppose that $\mathbold{x}$ is normally distributed with covariance equal to the pseudoinverse of the Laplacian, $\mathbold{\Sigma}_x = \mathbold{L}^{\dagger}$. This is a popular and well-understood model for smooth signals on graphs, with strong connections to Gaussian Markov random fields [25]. In addition, let the noise be white with variance $w$, i.e., $\mathbold{\Sigma}_n = w\mathbold{I}$. Substituting this into (26), we find

$$\tilde{\mathbold{x}} = \sum_{n=1}^{N}\frac{1}{1+w\lambda_n}\langle\mathbold{t},\mathbold{\phi}_n\rangle\mathbold{\phi}_n, \qquad (27)$$

which is identical to the Tikhonov-based denoising for $K = 1$ and corresponds to an ARMA$_1$, which as previously shown always has a stable implementation. Note that even though stability is ensured for the considered case, it does not necessarily hold for every covariance matrix. Indeed, the stability of the filter must be examined in a problem-specific manner.
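The equivalence between (25) and the spectral form (27) can be checked numerically on a toy graph. The snippet below is a minimal sketch, assuming a 4-node path graph and the priors $\mathbold{\Sigma}_x = \mathbold{L}^{\dagger}$, $\mathbold{\Sigma}_n = w\mathbold{I}$; note that the $\lambda_n = 0$ component carries zero weight under the pseudoinverse prior, so the sum is taken over $\lambda_n > 0$:

```python
import numpy as np

# 4-node path graph (assumed toy topology)
N = 4
A = np.zeros((N, N))
for i in range(N - 1):
    A[i, i + 1] = A[i + 1, i] = 1
L = np.diag(A.sum(1)) - A
w = 0.3
t = np.array([1.0, -2.0, 0.5, 3.0])            # arbitrary noisy observation

# Wiener estimate (25): Sigma_x = pinv(L) (smooth-signal prior), Sigma_n = w*I
Sx = np.linalg.pinv(L)
x_wiener = Sx @ np.linalg.inv(Sx + w * np.eye(N)) @ t

# Spectral form (27); the lambda_n = 0 component gets zero weight here,
# since sigma_x(0) = 0 for the pseudoinverse prior.
lam, Phi = np.linalg.eigh(L)
x_spec = sum((1.0 / (1 + w * l)) * (t @ phi_n) * phi_n
             for l, phi_n in zip(lam, Phi.T) if l > 1e-10)
assert np.allclose(x_wiener, x_spec)
```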

#### Graph signal interpolation.

Suppose that only $M$ out of the $N$ values of a signal $\mathbold{x}$ are known, and let $\mathbold{t}$ be the vector which contains the known values and zeros otherwise. Under the assumption that $\mathbold{x}$ is smooth w.r.t. the (possibly normalized) Laplacian, we can estimate the unknowns by solving the regularized problem

$$\tilde{\mathbold{x}} = \operatorname*{argmin}_{\mathbold{x}\in\mathbb{R}^{N}} \|\mathbold{S}(\mathbold{x}-\mathbold{t})\|_2^2 + w\,\mathbold{x}^{\top}\mathbold{L}^{K}\mathbold{x}, \qquad (28)$$

where $\mathbold{S}$ is the diagonal matrix with $[\mathbold{S}]_{ii} = 1$ if $t_i$ is known and $[\mathbold{S}]_{ii} = 0$ otherwise. Such formulations have been used widely, both in the context of graph signal processing [10, 26] and earlier by the semi-supervised learning community [9, 27]. Similar to (17), this optimization problem is convex and its global minimum is found as

$$\tilde{\mathbold{x}} = (\mathbold{S}+w\mathbold{L}^{K})^{-1}\mathbold{t}. \qquad (29)$$

Most commonly $K = 1$, and (29) can be re-written as

$$\tilde{\mathbold{x}} = (\mathbold{I}+\hat{\mathbold{L}})^{-1}\mathbold{t} = \sum_{n=1}^{N}\frac{1}{1+\hat{\lambda}_n}\langle\mathbold{t},\hat{\mathbold{\phi}}_n\rangle\hat{\mathbold{\phi}}_n, \qquad (30)$$

which is an ARMA$_1$ filter designed for the matrix $\hat{\mathbold{L}} = \mathbold{S}-\mathbold{I}+w\mathbold{L}$, with $(\hat{\lambda}_n, \hat{\mathbold{\phi}}_n)$ the $n$-th eigenpair of $\hat{\mathbold{L}}$. For larger values of $K$, the interpolation cannot be computed distributedly using ARMA filters. That is because the corresponding basis matrix cannot be appropriately factorized into a series of local matrices.
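The identity between (29) and (30) for $K = 1$ can be verified numerically; the sketch below assumes a 5-node cycle graph and the shifted matrix $\hat{\mathbold{L}} = \mathbold{S}-\mathbold{I}+w\mathbold{L}$ as reconstructed above:

```python
import numpy as np

# 5-node cycle graph (assumed toy topology)
N = 5
A = np.zeros((N, N))
for i in range(N):
    A[i, (i + 1) % N] = A[(i + 1) % N, i] = 1
L = np.diag(A.sum(1)) - A
w = 0.2
S = np.diag([1.0, 0.0, 1.0, 1.0, 0.0])         # nodes 1 and 4 unobserved
t = S @ np.array([0.5, 0.0, -1.0, 2.0, 0.0])   # known values, zeros elsewhere

# Direct solution (29) for K = 1
x_direct = np.linalg.solve(S + w * L, t)

# Spectral form (30) on Lhat = S - I + w*L, so that I + Lhat = S + w*L
Lhat = S - np.eye(N) + w * L
lam, Phi = np.linalg.eigh(Lhat)
x_spec = sum((1.0 / (1 + l)) * (t @ phi_n) * phi_n
             for l, phi_n in zip(lam, Phi.T))
assert np.allclose(x_direct, x_spec)
```

Since $\mathbold{S}+w\mathbold{L}$ is positive definite for a connected graph with at least one observed node, no eigenvalue $\hat{\lambda}_n$ reaches $-1$ and the spectral sum is well defined.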

## V Time-Variations

At this point we have characterized the filtering and convergence properties of ARMA graph filters for static inputs. But do these properties hold when the graph and signal are a function of time? In the following, we characterize ARMA graph filters with respect to time-variations in the graph signal (cf. Section V-A), as well as in the graph topology (cf. Section V-B).

### V-a Joint Graph and Temporal Filters

To understand the impact of graph signal dynamics we broaden the analysis to a two-dimensional domain: the first dimension, as before, captures the graph (based on the graph Fourier transform), whereas the second dimension captures time (based on the Z-transform [21]). This technique allows us to provide a complete characterization of the ARMA filter subject to time-variations. First, we show that the ARMA filter output remains close to the correct time-varying solution (under sufficient conditions on the input), which implies that our algorithms exhibit a certain robustness to dynamics. Further, we realize that ARMA graph filters operate along both the graph and temporal dimensions. We find that a graph naturally dampens temporal frequencies in a manner that depends on its spectrum. Exploiting this finding, we extend the ARMA designs presented in Section III so as to also allow a measure of control over the temporal frequency response of the filters.

As previously, we start our exposition with the ARMA$_1$ recursion (6), but now the input graph signal is time dependent (thus, indexed with the subscript $t$):

$$\mathbold{y}_{t+1} = \psi\mathbold{L}\mathbold{y}_t + \varphi\mathbold{x}_t \qquad (31a)$$
$$\mathbold{z}_{t+1} = \mathbold{y}_{t+1} + c\,\mathbold{x}_t. \qquad (31b)$$

The dimension of the above recursion can be reduced by restricting the input graph signal to lie in the subspace of an eigenvector $\mathbold{\phi}$ with associated eigenvalue $\lambda$, i.e., $\mathbold{x}_t = x_t\mathbold{\phi}$, where now $x_t$ is a scalar and, similarly, we take $\mathbold{y}_t = y_t\mathbold{\phi}$.$^4$ ($^4$This is a standard way to derive the frequency response of the system.) By orthogonality of the basis, the filter only alters the magnitude relative to the eigenvector $\mathbold{\phi}$ and not the direction of the signal. Therefore, (31) is equivalent to

$$y_{t+1} = \psi\lambda y_t + \varphi x_t \qquad (32a)$$
$$z_{t+1} = y_{t+1} + c\,x_t, \qquad (32b)$$

where $y_t$ and $z_t$ are simply the magnitudes of the vectors lying in the eigenspace of $\mathbold{\phi}$, and we can write $\mathbold{y}_t = y_t\mathbold{\phi}$ and $\mathbold{z}_t = z_t\mathbold{\phi}$. Taking the Z-transform on both sides, we obtain the joint graph and temporal frequency transfer function

$$H(z,\lambda) = \frac{\varphi z^{-1}}{1-\psi\lambda z^{-1}} + c\,z^{-1}. \qquad (33)$$

It can be shown that the temporal-impulse response for each graph frequency is

$$\mathbold{h}_{t+1}(\lambda) = \left(\varphi(\psi\lambda)^{t} + c\,\delta[t]\right)\mathbold{\phi}, \qquad (34)$$

with $\delta[t]$ the impulse function. From (34) we deduce that the region of convergence (ROC) of the filter is $|z| > |\psi\lambda|$, for all $\lambda$, and that the filter is causal.

The joint transfer function characterizes completely the behavior of ARMA$_1$ graph filters for an arbitrary yet time-invariant graph: for a constant input ($z = 1$), we recover the result and stability condition of Theorem 1, while for all other temporal frequencies we obtain the standard temporal frequency response as well as the graph frequency one. As one can see, recursion (31) is an ARMA filter in the graph domain as well as in the time domain. Observe also that the poles of $H(z,\lambda)$ obey the fixed relationship $z = \psi\lambda$. This yields an interesting insight: the temporal frequency response of the filter differs along each graph frequency $\lambda$, meaning that temporal dynamics affecting signals lying in low graph frequency eigenspaces are dampened to a smaller extent.
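The closed-form impulse response (34) is easy to check against a direct simulation of the scalar recursion (32); the coefficient values below are arbitrary illustrative choices satisfying $|\psi\lambda| < 1$:

```python
import numpy as np

psi, phi, c, lam = 0.4, 0.8, -0.1, 1.5     # arbitrary values with |psi*lam| < 1
T = 20
x = np.zeros(T)
x[0] = 1.0                                 # impulse input x_t = delta[t]
y, out = 0.0, []
for t in range(T):
    y = psi * lam * y + phi * x[t]         # (32a)
    out.append(y + c * x[t])               # (32b): this is z_{t+1}

# Closed form (34): z_{t+1} = phi*(psi*lam)^t + c*delta[t]
closed = [phi * (psi * lam) ** t + (c if t == 0 else 0.0) for t in range(T)]
assert np.allclose(out, closed)
```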

As Theorems 4 and 5 below show, the results are readily generalized to higher order filters.

###### Theorem 4.

The joint graph and temporal frequency transfer function of a parallel ARMA is

$$H(z,\lambda) = \sum_{k=1}^{K}\frac{\varphi^{(k)}z^{-1}}{1-\psi^{(k)}\lambda z^{-1}} + c\,z^{-1}, \qquad (35)$$

subject to the stability conditions of Theorem 2.

###### Theorem 5.

The joint graph and temporal frequency transfer function of a periodic ARMA is

$$H(z,\lambda) = \frac{\sum_{k=0}^{K-1}\left(\prod_{\tau=k+1}^{K-1}(\theta_\tau+\psi_\tau\lambda)\right)\varphi_k\, z^{k-K}}{1-\left(\prod_{\tau=0}^{K-1}(\theta_\tau+\psi_\tau\lambda)\right)z^{-K}} + c\,z^{-1}, \qquad (36)$$

subject to the stability conditions of Theorem 3.

(The proofs are deferred to the appendix.)

As in the first order case, Theorems 4 and 5 describe completely the behavior of the parallel and periodic implementations. We can see that both filters are ARMA filters in the graph and temporal domain. In particular, the parallel and periodic filters have up to $K$ distinct poles, abiding respectively to

 z=ψ(k)λandz=K ⎷K−1∏τ=0θτ+ψτλ. (37)

To provide further insight, Fig. 3 plots the joint graph and temporal frequency response of a parallel and a periodic graph filter of third order, both designed (only in the graph domain) to approximate an ideal low-pass response. In the figure, the horizontal axis measures the graph frequency $\lambda$, with smaller $\lambda$ corresponding to lower variation terms. The temporal axis, on the other hand, measures the normalized temporal frequency such that, for zero temporal frequency, one obtains the standard graph frequency response.

We make two observations. First, for low temporal variations both graph filters ensure almost the same frequency response as in the static case, which suggests that these filters are most appropriate for slowly varying graph signals; for graph signals lying in certain eigenspaces, on the other hand, all temporal frequencies are damped. This is a phenomenon that transcends the filter implementation (parallel or periodic) and the particular filter coefficients; it is attributed to the shifting of the Laplacian in the design phase and to the multiplicative relation of the response poles. Second, the parallel concatenation of ARMA$_1$ filters results in a more stable implementation, better fit to tolerate temporal dynamics than the periodic implementation. As shown in Fig. 3, sufficiently fast temporal fluctuations cause the periodic filter output to blow up by an order of magnitude, effectively rendering the periodic implementation unusable. The poor stability of the periodic implementation is also seen from (36), where the product terms $\theta_\tau+\psi_\tau\lambda$ tend to push the poles closer to the unit circle; this is the price to pay for its small memory requirements. Due to its superior stability properties and convenient form (fewer coefficients and a simpler design), we suggest the parallel implementation for dealing with time-varying graph signals.
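For concreteness, the joint response (35) of a parallel filter can be evaluated on a grid; the coefficients below are arbitrary stable values, not a filter designed as in the paper. Setting $z = 1$ recovers the static graph frequency response:

```python
import numpy as np

# Arbitrary stable parallel ARMA_3 coefficients (illustrative only)
psi = np.array([0.3, -0.2, 0.15])
phi = np.array([0.5, 0.4, -0.1])
c = 0.05

def H_joint(z, lam):
    """Joint graph/temporal response (35) of the parallel ARMA."""
    return sum(phi[k] * (1 / z) / (1 - psi[k] * lam / z) for k in range(3)) + c / z

lam = np.linspace(-1, 1, 21)        # graph frequencies of the shifted Laplacian
H_static = sum(phi[k] / (1 - psi[k] * lam) for k in range(3)) + c
assert np.allclose(H_joint(1.0, lam), H_static)

# On the unit circle z = exp(j*omega), the temporal damping varies with lam
H_fast = H_joint(np.exp(1j * np.pi), lam)      # fastest temporal variation
assert np.all(np.isfinite(np.abs(H_fast)))
```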

### V-B Time-Varying Graphs and Signals

Time variations of the graph topology bring new challenges to the graph filtering problem. First, they render approaches that rely on knowledge of the graph spectrum ineffective. Approaches which ensure stability by designing the poles to lie outside the set of Laplacian eigenvalues of a given graph may lead to unstable filters on a different graph, where some eigenvalue may overshoot one of the poles. Due to their different design philosophy, the presented ARMA graph filters handle these issues naturally. We can, for instance, assume that the different graph realizations over time enjoy a common upper bound on their maximum eigenvalue $\lambda_{\max}$. In case this is not possible, or difficult to determine, we can always work with the normalized Laplacian and thus take $\lambda_{\max} = 2$. In this way, by designing the filter coefficients to ensure stability w.r.t. this bound, we automatically impose stability for all graph realizations. Furthermore, by designing the filter coefficients once for a continuous range of frequencies, the ARMA recursions also preserve the desired frequency response over different graph realizations.

The second major challenge is characterizing the graph filter behavior. Time-varying affine systems are notoriously difficult to analyze when they possess no special structure [28]. To this end, we devise a new methodology for time-varying graph filter analysis. We show that a decomposition basis always exists, over which ARMA graph filters (and as a consequence parallel ARMA filters) have the same frequency response as in the static case. Furthermore, this decomposition basis depends only on the sequence of graph realizations.

In case of a time-varying graph topology with a fixed number of nodes $N$, as well as a time-varying graph signal, the ARMA$_1$ recursion (6) can be written as

$$\mathbold{y}_{t+1} = \psi\mathbold{L}_t\mathbold{y}_t + \varphi\mathbold{x}_t \qquad (38a)$$
$$\mathbold{z}_{t+1} = \mathbold{y}_{t+1} + c\,\mathbold{x}_t, \qquad (38b)$$

where the time-varying graph is indicated by indexing the Laplacian with the subscript $t$. Expanding the recursion we find that, for any sequence of graph realizations with corresponding Laplacians $\mathbold{L}_0, \mathbold{L}_1, \ldots, \mathbold{L}_t$, the output signal is given by

$$\mathbold{z}_{t+1} = \psi^{t+1}\mathbold{\Phi}_L(t,0)\,\mathbold{y}_0 + \varphi\sum_{\tau=0}^{t}\psi^{\tau}\mathbold{\Phi}_L(t,t-\tau+1)\,\mathbold{x}_{t-\tau} + c\,\mathbold{x}_t, \qquad (39)$$

where $\mathbold{\Phi}_L(t,\tau) = \mathbold{L}_t\mathbold{L}_{t-1}\cdots\mathbold{L}_\tau$ for $t \geq \tau$, and $\mathbold{\Phi}_L(t,\tau) = \mathbold{I}$ otherwise.
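A quick numerical sanity check of the expansion (39) against the recursion (38), using a randomly generated sequence of Laplacians (a toy setup, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 4, 6                                   # N nodes, T timesteps
psi, phi, c = 0.2, 0.7, -0.3                  # arbitrary illustrative coefficients

def random_laplacian(rng, N):
    """Laplacian of a random undirected graph realization (toy example)."""
    A = np.triu((rng.random((N, N)) < 0.5).astype(float), 1)
    A = A + A.T
    return np.diag(A.sum(1)) - A

Ls = [random_laplacian(rng, N) for _ in range(T)]   # L_0, ..., L_{T-1}
xs = [rng.standard_normal(N) for _ in range(T)]     # x_0, ..., x_{T-1}
y0 = rng.standard_normal(N)

# Run recursion (38)
y = y0.copy()
for k in range(T):
    y = psi * Ls[k] @ y + phi * xs[k]
z_rec = y + c * xs[T - 1]                           # z_T

# Closed form (39), with Phi_L(t, tau) = L_t ... L_tau (identity if tau > t)
def PhiL(a, b):
    M = np.eye(N)
    for j in range(b, a + 1):
        M = Ls[j] @ M
    return M

t = T - 1
z_closed = (psi ** (t + 1) * PhiL(t, 0) @ y0
            + phi * sum(psi ** tau * PhiL(t, t - tau + 1) @ xs[t - tau]
                        for tau in range(t + 1))
            + c * xs[t])
assert np.allclose(z_rec, z_closed)
```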

Since the output depends on the entire sequence of graph realizations, the spectrum of any individual Laplacian is insufficient to derive the graph frequency response of the filter. To extend the spectral analysis to the time-varying setting, we define a joint Laplacian matrix that encompasses all the individual shifted graph Laplacians. The intuition behind our approach is to think of a time-varying graph as one large graph $\mathcal{G}_{tv}$ that contains all nodes of the graph realizations, as well as directional edges connecting the nodes at different timesteps. We then interpret the spectrum of the associated Laplacian matrix $\mathbold{L}_{tv}$ as the basis for our time-varying graph Fourier transform. This idea is a generalization of the joint graph construction [29], used to define a Fourier transform for graph signals which change with time (though in that work the graph was considered time-invariant). Similar to [29], we construct $\mathcal{G}_{tv}$ by replicating each node once for each timestep. Denote the replication of node $i$ at timestep $\tau$ as $i_\tau$. For each timestep $\tau$, $\mathcal{G}_{tv}$ then contains a directed edge from $j_\tau$ to $i_{\tau+1}$ whenever $j$ is a neighbor of $i$ in the graph realization at time $\tau$. Therefore, in contrast to previous work, here the edges between the replicated nodes are a function of time. By its construction, $\mathcal{G}_{tv}$ captures not only the topology of the different graphs, but also the temporal relation between them: since the exchange of information between two neighbors incurs a delay of one unit of time, at each timestep $t$ a node has access to the values of its neighbors at time $t-1$.

To proceed, define $\mathbold{P}$ to be the $(t+1)\times(t+1)$ cyclic shift matrix with ones below the diagonal, and construct $\mathbold{L}_{tv}$ as the permuted block-diagonal matrix

$$\mathbold{L}_{tv} = \mathrm{blkdiag}\left[\mathbold{L}_0,\mathbold{L}_1,\ldots,\mathbold{L}_t\right](\mathbold{P}\otimes\mathbold{I}). \qquad (40)$$

For consistency with the established theory on the GFT, when the graph is time-invariant we define $\mathbold{L}_{tv}$ analogously in terms of the single Laplacian $\mathbold{L}$. Let $\mathbold{e}_{t+1}$ be the $(t+1)$-dimensional canonical unit vector with $[\mathbold{e}_{t+1}]_{t+1} = 1$ and $[\mathbold{e}_{t+1}]_i = 0$ otherwise. Defining $\mathbold{s}$ as the vector of dimension $N(t+1)$ which encompasses all input graph signals, we can then write

$$\mathbold{\Phi}_L(t,t-\tau+1)\,\mathbold{x}_{t-\tau} = (\mathbold{e}_{t+1}^{\top}\otimes\mathbold{I})\,\mathbold{L}_{tv}^{\tau}\,\mathbold{s}. \qquad (41)$$

In those cases when the non-symmetric matrix $\mathbold{L}_{tv}$ enjoys an eigendecomposition, we have $\mathbold{L}_{tv} = \mathbold{Q}\mathbold{\Lambda}\mathbold{Q}^{-1}$, with $(\lambda_n, \mathbold{q}_n)$ the $n$-th eigenpair. Specifically, $\lambda_n$ is the $n$-th diagonal element of $\mathbold{\Lambda}$ and $\mathbold{q}_n$ is the $n$-th column of $\mathbold{Q}$. The total number of eigenpairs of $\mathbold{L}_{tv}$ is $N(t+1)$. To ease the notation, we will denote by $\bar{\mathbold{q}}_n$ the respective $n$-th column of $\mathbold{Q}^{-\top}$.
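The identity (41) can likewise be verified numerically; the sketch below builds $\mathbold{L}_{tv}$ as in (40) and assumes the stacking order $\mathbold{s} = [\mathbold{x}_0^{\top},\ldots,\mathbold{x}_t^{\top}]^{\top}$, which is our reading of the construction:

```python
import numpy as np

rng = np.random.default_rng(1)
N, t = 3, 4                                   # N nodes, t+1 timesteps

def random_laplacian(rng, N):
    """Laplacian of a random undirected graph realization (toy example)."""
    A = np.triu((rng.random((N, N)) < 0.6).astype(float), 1)
    A = A + A.T
    return np.diag(A.sum(1)) - A

Ls = [random_laplacian(rng, N) for _ in range(t + 1)]   # L_0, ..., L_t
xs = [rng.standard_normal(N) for _ in range(t + 1)]     # x_0, ..., x_t

# Cyclic shift P (ones below the diagonal, plus the wrap-around corner), then (40)
P = np.zeros((t + 1, t + 1))
for i in range(t):
    P[i + 1, i] = 1
P[0, t] = 1
blk = np.zeros((N * (t + 1), N * (t + 1)))
for i, L in enumerate(Ls):
    blk[i * N:(i + 1) * N, i * N:(i + 1) * N] = L
Ltv = blk @ np.kron(P, np.eye(N))

def PhiL(a, b):
    """Phi_L(a, b) = L_a L_{a-1} ... L_b, identity when b > a."""
    M = np.eye(N)
    for j in range(b, a + 1):
        M = Ls[j] @ M
    return M

# s stacks the inputs (ordering x_0, ..., x_t assumed); e_{t+1} picks the last block
s = np.concatenate(xs)
select = np.kron(np.eye(1, t + 1, t), np.eye(N))        # (e_{t+1}^T kron I)
for tau in range(t + 1):
    lhs = PhiL(t, t - tau + 1) @ xs[t - tau]
    rhs = select @ np.linalg.matrix_power(Ltv, tau) @ s
    assert np.allclose(lhs, rhs)
```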

Substituting (41) into the second term of (39) and rearranging the sums, we get

$$\varphi\sum_{\tau=0}^{t}\psi^{\tau}\mathbold{\Phi}_L(t,t-\tau+1)\,\mathbold{x}_{t-\tau} = \varphi(\mathbold{e}_{t+1}^{\top}\otimes\mathbold{I})\sum_{\tau=0}^{t}(\psi\mathbold{L}_{tv})^{\tau}\mathbold{s} = \varphi(\mathbold{e}_{t+1}^{\top}\otimes$$