Random walks and diffusion on networks


Naoki Masuda (naoki.masuda@bristol.ac.uk), Department of Engineering Mathematics, University of Bristol, Bristol, UK
Mason A. Porter, Department of Mathematics, University of California Los Angeles, Los Angeles, USA; Mathematical Institute, University of Oxford, Oxford, UK; CABDyN Complexity Centre, University of Oxford, Oxford, UK
Renaud Lambiotte, Department of Mathematics/Naxys, University of Namur, Namur, Belgium
Abstract

Random walks are ubiquitous in the sciences, and they are interesting from both theoretical and practical perspectives. They are one of the most fundamental types of stochastic processes; can be used to model numerous phenomena, including diffusion, interactions, and opinions among humans and animals; and can be used to extract information about important entities or dense groups of entities in a network. Random walks have been studied for many decades, both on regular lattices and (especially in the last couple of decades) on networks with a variety of structures. In the present article, we survey the theory and applications of random walks on networks, restricting ourselves to simple cases of a single, non-adaptive random walker. We distinguish three main types of random walks: discrete-time random walks, node-centric continuous-time random walks, and edge-centric continuous-time random walks. We first briefly survey random walks on a line, and then we consider random walks on various types of networks. We extensively discuss applications of random walks, including ranking of nodes (e.g., PageRank), community detection, respondent-driven sampling, and opinion models such as voter models.

keywords:
random walk, network, diffusion, Markov chain, point process

1 Introduction

Random walks (RWs) are popular models of stochastic processes with a very rich history Aldous2002book; Feller-book1; Feller1971book2; Hughes1995book; Kutner2016arxiv. (See https://www.youtube.com/watch?v=stgYW6M5o4k for an introduction to random walks for a public audience from the U.S. Public Broadcasting Service (PBS).) The term “random walk” was coined by Karl Pearson Pearson1905Nature, and the study of RWs dates back to the “Gambler’s Ruin” problem analyzed by Pascal, Fermat, Huygens, Bernoulli, and others Ore1960AmMathMonthly. Additionally, Albert Einstein formulated the stochastic motion (in the form of “Brownian motion”) of particles in continuous time that results from their collisions with atoms and molecules Einstein1905AnnPhys-brownian. Theoretical developments have involved mathematics (especially probability theory), computer science, statistical physics, operations research, and more. RW models have also been applied in various domains, ranging from locomotion and foraging of animals Viswanathan1999Nature; Codling2008JRSocInterface; Humphries2010Nature; Okubo2001book, the dynamics of neuronal firing Tuckwell1988book2; Gabbiani2010book, and decision-making in the brain Usher2001PsycholRev; Gold2007AnnuRevNeurosci to population genetics Ewens2010book, polymer chains fisher1966; isic1992, descriptions of financial markets CampbellLoMackinlay1996book; Mantegna1999book, evolution of research interests (through random walks on a problem space) jia2017, ranking systems Gleich2015SiamRev, dimension reduction and feature extraction from high-dimensional data (e.g., in the form of “diffusion maps”) Coifman2005PNAS-1; Coifman2006ApplComputHarmonAnal, and even sports statistics clauset2015; Godreche2017arxiv. RW theory can also help predict arrival times of diseases spreading on networks Iannelli2017PhysRevE. There exist several monographs and review papers on RWs. Many of them treat RWs on classical network topologies, such as regular lattices and Cayley trees (i.e., trees in which each node has the same number of neighboring nodes, which we henceforth call the node “degree”) Spitzer1976book; Weiss1994book; Hughes1995book; Redner2001book; Burioni2005JPhysA; Krapivsky2010book; Klafter2011book; Ibe2013book. Other monographs and surveys focus on RWs on fractal structures, revealing diffusion properties that are “anomalous” compared to RWs on regular lattices or Euclidean spaces Rammal1984JStatPhys; Havlin1987AdvPhys; Bouchaud1990PhysRep; Benavraham2000book; Burioni2005JPhysA; Benichou2014PhysRep. Other literature treats RWs on finite networks, which are equivalent to finite Markov chains (in the discrete-time case) Doyle1984book; Lovasz1993Boyal; Aldous2002book; Burioni2005JPhysA and are at the core of several stochastic algorithms.

In parallel, “network science” has emerged in recent years as a central approach to the study of complex systems ejam-special-2016; Newman2010book; Barabasi2016book; Boccaletti2006PhysRep. Networks are a natural representation of systems composed of interacting elements and allow one to examine the impact of structure on the dynamics and function of a system (as well as the impact of dynamics and function on network structure). Examples include friendship networks, international relationships, gene-regulatory networks, food webs, airport networks, the internet, and myriad more. In each case, one can represent the system’s connectivity structure as a set of nodes (representing the entities in the system) and edges (representing interactions among those entities). The study of networks is highly interdisciplinary, and it integrates theoretical and computational tools from subjects such as applied mathematics, statistical physics, computer science, engineering, sociology, economics, biology, and other domains. Many networks exhibit complex yet regular patterns that are explainable (sometimes arguably) by simple mechanisms. Network science has also had a strong impact on the understanding of dynamical processes because of the critical role of structure on spreading processes, synchronization, and others strogatz01; Barrat2008book; Porter2016book. As with RWs, numerous books and review papers have been written on networks, including textbooks Dorogovtsev2003book; Newman2010book; CohenHavlin2010book; Estrada2012book; Barabasi2016book, general review articles Newman2003SiamRev; Boccaletti2006PhysRep, and more specialized reviews on topics such as dynamical processes on networks Arenas2008PhysRep; Barrat2008book; Porter2016book, connections to statistical physics Albert2002RevModPhys; Dorogovtsev2008RevModPhys, temporal networks HolmeSaramaki2012PhysRep; Holme2015EurPhysJB; MasudaLambiotte2016book, multilayer networks Boccaletti2014PhysRep; Kivela2014JCompNetw; naturephysicsspreading, and community structure porter2009; Fortunato2010PhysRep; santo2016.

The main purpose of the present review is to bring together two broad subjects — RWs and networks — by discussing their many interconnections and their ensuing applications. RWs are often used as a model for diffusion, and there has been intense research on the impact of network architecture on the dynamics of RWs. Moreover, nontrivial network structure paves the way for different definitions of RWs, and different definitions can be “natural” from some perspective, while leading to different diffusive processes on the same network. Finally, RWs are at the core of several algorithms to uncover structural properties in networks. We will discuss these points further in the next three paragraphs.

First, RWs are often used as a model for diffusion, and there has been intense research on the impact of network architecture on the dynamics of RWs. The finiteness of a network — along with properties such as degree heterogeneity, community structure, and others — can make diffusion on networks both quantitatively and even qualitatively different from diffusion on regular or infinite lattices. RWs on networks are an example of a Markov chain in which the network is the state space of the random walker and the transition probabilities depend on the existence and weights of the edges between nodes. In this review, we will include a summary of results on the dependence of dynamical properties — including stationary distribution and mean first-passage time — on structural properties of an underlying network.

Second, the irregularity of underlying network structure opens the door for different definitions of RWs. Each is “natural” from some perspective, but they lead to different diffusive processes even when considering the same network. For example, it is useful to distinguish between discrete-time and continuous-time RWs. On networks in which degree (i.e., the number of neighbors) is heterogeneous (i.e., it depends on the node), one needs to subdivide continuous-time RWs further into two major types, depending on whether the random events that induce walker movement are generated on nodes or on edges; these two choices correspond to different types of propagators (normalized versus unnormalized Laplacian matrices). Different literatures use different variants of RWs, often implicitly. We distinguish different types of RWs and clarify the relationship between them, and we discuss formulations and results that are informed by empirical networks (such as networks with heavy-tailed degree distributions, multilayer networks, and temporal networks).

Finally, RWs lie at the core of many algorithms to uncover various types of structural properties of networks. Consider the notion of identifying “central” nodes, edges, or other substructures in networks Newman2010book. A powerful set of diagnostics (e.g., PageRank Brin1998conf; Gleich2015SiamRev and eigenvector centrality bonacich1972) is derived based on recursive arguments of the type “a node is important if it is connected to many important nodes”, and such derivations often rely on the trajectories of random walkers. Similarly, flow-based algorithms, which are based on trajectories of dynamical processes (e.g., random walks) being trapped within certain sets of nodes for a long time, are helpful for discovering mesoscale patterns in networks santo2016; Jeub2015PhysRevE. These techniques and algorithms open a wealth of applications that go well beyond classical applications of RWs. Their design benefits both explicitly and implicitly from developing an understanding of how RW dynamics are influenced by network structure and how different types of RWs behave on the same network.

There has been a vast amount of research on RWs on networks, and it is scattered across disparate corners of the scientific literature. It is impossible to cover everything, and we choose specific subsets of it to make our review cohesive, although we will occasionally include pointers to other parts of the landscape. First, we focus on the most standard types of RWs, in which a random walker moves to a neighbor with a probability proportional to edge weight, and their very close relatives. We only very rarely mention some of the numerous other types of random walks, which include correlated RWs Gillis1955ProcCambPhilSoc, self-avoiding RWs Domb1983JStatPhys; Madras1993book; Hughes1995book, zero-range processes EvansHanney2005JPhysA, multiplicative random processes Schenzle1979PhysRevA; Havlin1988PhysRevLett, adaptive RWs (including reinforced RWs Pemantle2007ProbSurv), branching RWs Schinazi1999book, Lévy flights Klafter2011book; Ibe2013book, elephant RWs SchutzTrimper2004PhysRevE, quantum walks Kempe2003ComtempPhys; Mulken2011PhysRep, mortal RWs mortal, and so on. These processes are of course fascinating, and many of the different flavors of RWs were developed with specific motivation from an application (e.g., a Pac-Man-like “hungry RW” hungry2016 has been used as a model for chemotaxis in a porous medium) or are inspired by applications such as animal movement Codling2008JRSocInterface; Okubo2001book or financial markets Mantegna1999book; one can find discussions of different flavors of RWs in Refs. Hughes1995book; Klafter2011book; Ibe2013book. Second, we will not cover many results for RWs on particular generative models of networks, except that we do give extensive attention to first-passage times for fractal and pseudo-fractal network models (see Section 3.2.5). Third, we will not discuss various important, rigorous results from mathematics and theoretical computer science. For such results, see Doyle1984book; Lovasz1993Boyal; Weiss1994book; Hughes1995book; Aldous2002book. We focus instead on results that we believe give physical insight on RW processes and their applications.

As a final warning, we focus exclusively on diffusive processes in which the total number of walkers (or, equivalently, the total probability of observing a walker) is a conserved quantity. (We thus consider “conservative” processes, though non-conservative processes are also interesting lerman; Yan2016PeerJComputSci.) The only exception is in Section LABEL:sub:voter_model, where we use “coalescing RWs” as an analytical tool. As we will see, this conservation rule translates into certain properties of the operator that drives the RW process. When transposed, the operator leads naturally to linear models for consensus dynamics (see Sections LABEL:sub:voter_model and LABEL:sub:DeGroot_model). Among notable non-conservative processes, which we do not cover in this review, are classical epidemic processes Anderson1991book; Barrat2008book; Pastorsatorras2015RevModPhys; Porter2016book, in which the number of entities (e.g., viruses or infected individuals) varies over time. In the linear regime, which corresponds to a small number of infected nodes, the propagator of infection events in simple epidemic processes such as susceptible–infected (SI) and susceptible–infected–recovered (SIR) models is the adjacency matrix Wang2003SRDS; Klemm2012SciRep. In contrast, a propagator of an RW is a type of Laplacian matrix, as we will discuss in detail in Section 3. If all nodes have the same degree, these Laplacian and adjacency matrices are related linearly, and their dynamics are essentially the same Godsil2001book; MasudaLambiotte2016book. However, they are generically different for heterogeneous networks, such as when degree depends on node identity. Therefore, the difference between conservative dynamics (described by a Laplacian matrix) and non-conservative dynamics (described by the adjacency matrix) tends to be more striking for heterogeneous networks than for homogeneous networks. Other spreading models that are also beyond the scope of this work include threshold models of social contagions valente-book; Porter2016book (e.g., for modeling adoption of behaviors) and reaction–diffusion dynamics ReactionDiffusion.

The rest of our review proceeds as follows. In Section 2, we discuss RWs on the line. In Section 3, we give a lengthy presentation of RWs on networks. We then discuss RWs on multilayer networks in Section 4.1 and RWs on temporal networks in Section 4.2. We discuss applications in Section LABEL:sec:applications, and we conclude in Section LABEL:sec:outlook.

2 Random walks on the line

In this section, we review some basic properties of RW processes on one-dimensional space (i.e., the infinite line). This section serves as a primer to later sections, in which we examine RWs on general networks. In this and later sections, we carefully distinguish between discrete-time and continuous-time models.

2.1 Discrete time

Consider a discrete-time RW (DTRW) process on the infinite line, which we identify with $\mathbb{R}$. There is a single walker. At each discrete time step, it moves from some point to some other point, including the case of moving from a point to itself. The length and direction of the move are both random variables. We assume that the probability that a walker located at $x'$ moves to the interval $[x, x + \mathrm{d}x]$ in one step is equal to $\phi(x - x')\,\mathrm{d}x$. The normalization is $\int_{-\infty}^{\infty} \phi(x)\,\mathrm{d}x = 1$, and we assume that moves at different times are independent.

Let’s derive the probability density $p(x; n)$ that a random walker is located at a point $x$ after $n$ steps. (For emphasis, we sometimes use the term “discrete time” or “event time” for $n$.) The master equation is given by

p(x; n) = \int_{-\infty}^{\infty} \phi(x - x')\, p(x'; n - 1)\, \mathrm{d}x' .    (1)

It is convenient to solve Eq. (1) for general $n$ and $\phi$ in the Fourier domain. We define the Fourier transform by

\hat{p}(k; n) = \int_{-\infty}^{\infty} p(x; n)\, e^{\mathrm{i}kx}\, \mathrm{d}x    (2)

and the inverse Fourier transform by

p(x; n) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \hat{p}(k; n)\, e^{-\mathrm{i}kx}\, \mathrm{d}k .    (3)

Note that $\hat{\phi}(k)$ is the “characteristic function” of a random variable with probability density $\phi$. The Fourier transform $\hat{\phi}(k)$ of $\phi$ is sometimes called the “structure function” of the RW. The Taylor expansion of $\hat{\phi}(k)$ around $k = 0$ yields

\hat{\phi}(k) = 1 + \mathrm{i} k \langle \Delta x \rangle - \frac{k^{2}}{2} \langle (\Delta x)^{2} \rangle + O(k^{3}) ,    (4)

where $\langle \cdot \rangle$ is the expectation unless we state otherwise. One can thereby obtain moments of $\phi$ from the derivatives of $\hat{\phi}(k)$ at $k = 0$.

The Fourier transform maps a convolution, such as Eq. (1), to a product; Eq. (1) thus yields

\hat{p}(k; n) = \hat{\phi}(k)\, \hat{p}(k; n - 1) .    (5)

If a random walker is located initially at $x = 0$, we obtain $p(x; 0) = \delta(x)$, where $\delta(x)$ is the Dirac delta function, which has Fourier transform $\hat{p}(k; 0) = 1$. We thereby obtain

\hat{p}(k; n) = \left[ \hat{\phi}(k) \right]^{n} .    (6)

Using the inverse Fourier transform in Eq. (3), we obtain a formal solution for $p(x; n)$ in the time domain:

p(x; n) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \left[ \hat{\phi}(k) \right]^{n} e^{-\mathrm{i}kx}\, \mathrm{d}k .    (7)

The qualitative behavior of the solution in Eq. (7) depends on the details of the structure function $\hat{\phi}(k)$. However, the asymptotic behavior of the RW as $n \to \infty$ depends only on some of the properties of $\phi$. When the first two moments of $\phi$ are finite, the solution converges to the Gaussian profile

p(x; n) \approx \frac{1}{\sqrt{2\pi n \sigma^{2}}} \exp\!\left[ -\frac{(x - n\mu)^{2}}{2 n \sigma^{2}} \right] ,    (8)

where $\mu = \langle \Delta x \rangle$ and $\sigma^{2} = \langle (\Delta x)^{2} \rangle - \langle \Delta x \rangle^{2}$. Equation (8) implies that the variance of $x$ grows linearly with time. This result is the “central limit theorem” for the sum of the sizes of the moves, which are independent random variables. This asymptotic regime is well-defined because the underlying space (i.e., the line) is infinitely large. One can derive these results in a similar manner when the underlying space is discrete (e.g., a one-dimensional lattice) Feller-book1; Weiss1994book; Hughes1995book; Redner2001book. In situations in which the second moment of the structure function diverges, the process exhibits superdiffusion, and the probability profile converges to a so-called “Lévy distribution” Klafter2011book; Ibe2013book.
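
To make the preceding limit concrete, the following minimal sketch (Python with NumPy; the uniform step density, walker count, and step count are illustrative choices rather than anything from the text) samples many independent DTRWs on the line and compares the empirical mean and variance of the position with the $n\mu$ and $n\sigma^2$ behavior implied by Eq. (8).

```python
import numpy as np

rng = np.random.default_rng(0)

n_walkers = 100_000   # number of independent walkers (illustrative)
n_steps = 50          # number of discrete time steps

# Single-step displacement density phi: a uniform density on [-0.5, 1.5],
# so mu = 0.5 and sigma^2 = (1.5 - (-0.5))^2 / 12 = 1/3 (an arbitrary example).
steps = rng.uniform(-0.5, 1.5, size=(n_walkers, n_steps))
positions = np.cumsum(steps, axis=1)          # x after 1, 2, ..., n_steps moves

mu, sigma2 = 0.5, (2.0 ** 2) / 12.0

print("step   mean(x)   n*mu    var(x)   n*sigma^2")
for k in (1, 10, 50):
    x = positions[:, k - 1]
    print(f"{k:4d}  {x.mean():8.3f} {k * mu:7.2f} {x.var():8.3f} {k * sigma2:9.3f}")
```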

2.2 Continuous time

In this section, we consider continuous-time RWs (CTRWs), which incorporate the timing of moves Montroll1965JMathPhys; Weiss1994book; Hughes1995book; Klafter2011book; Ibe2013book; Kutner2016arxiv. We assume that a walker waits between two moves for a duration $\tau$ that independently obeys the probability density function $\psi(\tau)$. In other words, the move events are generated by a renewal process Feller1971book2. If $\tau = 1$ with probability $1$ (i.e., $\psi(\tau) = \delta(\tau - 1)$), the CTRW reduces to the DTRW described in Section 2.1. In a standard CTRW, one assumes that the time of a move event and the selection of a destination in a given move are independent. Therefore, a combination of $\psi(\tau)$ and $\phi(\Delta x)$, where $\Delta x$ is the displacement in a single move, completely determines the dynamical properties of a random walker.

Let $t_n$ denote the time of the $n$th move. By definition, $t_n = \tau_1 + \cdots + \tau_n$, where each $\tau_m$ is independent and identically distributed (i.i.d.) and drawn from the distribution $\psi(\tau)$. Additionally, we can write

p(x; t) = \sum_{n=0}^{\infty} \chi(n; t)\, p(x; n) ,    (9)

where $p(x; t)$ is the probability that the walker is located at $x$ at time $t$, the quantity $p(x; n)$ is the probability that the walker is located at $x$ after $n$ steps, and $\chi(n; t)$ is the probability that the walker has moved $n$ times by time $t$. Note that it is crucial to distinguish $p(x; t)$ and $p(x; n)$, and we illustrate the difference between these probabilities with a schematic in Fig. 1. Equation (9) reflects the fact that a walker can visit $x$ at time $t$ after some number of steps.

Figure 1: Schematic of the standard continuous-time random walk (CTRW) on a one-dimensional lattice. (a) The position of the walker in physical time $t$ is described by $p(x; t)$. Note that $t_n$ represents the time of the $n$th move. (b) The position of the walker after $n$ moves is described by $p(x; n)$.

The probability $p(x; n)$ is given by the same solution, Eq. (7), as for the DTRW. To obtain $p(x; t)$ from Eq. (9), we need to examine $\chi(n; t)$, and we thus need to consider a renewal process generated by $\psi(\tau)$. According to the elementary renewal theorem Cox1962book, the mean of $n$ at time $t$ is

\langle n(t) \rangle \simeq \frac{t}{\langle \tau \rangle} \qquad (t \to \infty) .    (10)

Equation (10) indicates that $n(t)$ grows linearly with time on average, irrespective of the details of the distribution $\psi(\tau)$. However, realized values of $n(t)$ are random, inducing heterogeneity in the length of the RW “trajectory” (i.e., the walk measured in terms of the number of moves) observed at a given time $t$.

When the CTRW is driven by a Poisson process, $\psi(\tau)$ is the exponential distribution (i.e., $\psi(\tau) = \lambda e^{-\lambda \tau}$). In this case, $n(t)$ obeys the Poisson distribution with mean $\lambda t$. That is,

\chi(n; t) = \frac{(\lambda t)^{n}}{n!}\, e^{-\lambda t} .    (11)
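
As a quick numerical check of Eq. (11) (a sketch under the Poisson assumption above; the rate, observation window, and sample size are arbitrary choices of ours), one can draw exponential inter-event times, count the moves completed by time $t$, and compare the empirical distribution of the count with a Poisson distribution of mean $\lambda t$.

```python
import numpy as np
from math import exp, factorial

rng = np.random.default_rng(1)

lam, t_obs, n_samples = 1.5, 4.0, 50_000   # rate, observation window, realizations

counts = np.empty(n_samples, dtype=int)
for r in range(n_samples):
    t, n = 0.0, 0
    while True:
        t += rng.exponential(1.0 / lam)      # waiting time tau ~ lam * exp(-lam * tau)
        if t > t_obs:
            break
        n += 1
    counts[r] = n

print("n   empirical  Poisson(lam*t)")
for n in range(9):
    poisson = exp(-lam * t_obs) * (lam * t_obs) ** n / factorial(n)
    print(f"{n}  {np.mean(counts == n):9.4f}  {poisson:9.4f}")
```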

It requires some effort to derive $\chi(n; t)$ when $\psi(\tau)$ is a general distribution. To calculate the time of the $n$th event or the number of events in a given time interval, we need to sum i.i.d. variables that obey $\psi(\tau)$. The duration $\tau$ is nonnegative, so we take a Laplace transform

\hat{\psi}(s) = \int_{0}^{\infty} \psi(\tau)\, e^{-s\tau}\, \mathrm{d}\tau .    (12)

The Taylor expansion of Eq. (12) is given by

\hat{\psi}(s) = 1 - s \langle \tau \rangle + \frac{s^{2}}{2} \langle \tau^{2} \rangle - \cdots    (13)

and implies that $\hat{\psi}(s)$ generates the moments of $\tau$ if they exist. One computes the inverse Laplace transform by integrating in the complex plane:

\psi(\tau) = \frac{1}{2\pi \mathrm{i}} \int_{c - \mathrm{i}\infty}^{c + \mathrm{i}\infty} \hat{\psi}(s)\, e^{s\tau}\, \mathrm{d}s ,    (14)

where $c$ is a real constant that is larger than the real part of all singularities of $\hat{\psi}(s)$.

The probability that no event has occurred up to time $t$ is

\chi(0; t) = 1 - \int_{0}^{t} \psi(\tau)\, \mathrm{d}\tau ,    (15)

whose Laplace transform is

\hat{\chi}(0; s) = \frac{1 - \hat{\psi}(s)}{s} .    (16)

The probability that one event occurs in $[0, t]$ is

\chi(1; t) = \int_{0}^{t} \psi(t')\, \chi(0; t - t')\, \mathrm{d}t' .    (17)

By Laplace-transforming Eq. (17) and applying Eq. (16), we obtain

\hat{\chi}(1; s) = \hat{\psi}(s)\, \frac{1 - \hat{\psi}(s)}{s} .    (18)

By the same arguments, the probability density that $n$ events occur at times $t_1$, $t_2$, …, $t_n$ (with $t_1 < t_2 < \cdots < t_n \le t$), but at no other times in $[0, t]$, is given by $\psi(t_1)\, \psi(t_2 - t_1) \cdots \psi(t_n - t_{n-1})\, \chi(0; t - t_n)$. This yields Cox1962book; Grigolini2001Fractals

\hat{\chi}(n; s) = \left[ \hat{\psi}(s) \right]^{n} \frac{1 - \hat{\psi}(s)}{s} .    (19)

In the analysis of RWs, Eq. (19) relates two ways to count time: one is in terms of the number of moves ($n$), and the other is in terms of the physical time ($t$).

For a CTRW driven by a Poisson process, we obtain

\hat{\psi}(s) = \frac{\lambda}{s + \lambda} .    (20)

Substituting Eq. (20) into Eq. (19) yields

\hat{\chi}(n; s) = \frac{\lambda^{n}}{(s + \lambda)^{n + 1}} ,    (21)

whose inverse Laplace transform is Eq. (11).

By taking the Fourier transform of Eq. (9) with respect to $x$ and the Laplace transform of Eq. (9) with respect to $t$, and then using Eqs. (6) and (19), we obtain

\hat{p}(k; s) = \frac{1 - \hat{\psi}(s)}{s} \sum_{n=0}^{\infty} \left[ \hat{\phi}(k)\, \hat{\psi}(s) \right]^{n}    (22)
= \frac{1 - \hat{\psi}(s)}{s} \cdot \frac{1}{1 - \hat{\phi}(k)\, \hat{\psi}(s)} .    (23)

This result is central to the theory of CTRWs Montroll1965JMathPhys, and we will extend it to the case of general networks in Section 3.3. Taking the inverse transform of Eq. (23) with respect to both time and space yields $p(x; t)$, and we can examine the behavior of the RW for large $t$ by expanding $\hat{\psi}(s)$ for small $s$ or $\hat{\phi}(k)$ for small $k$.
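
The following minimal sketch (Python/NumPy; the exponential waiting-time density, Gaussian jump density, and parameter values are illustrative assumptions) simulates the CTRW directly from $\psi(\tau)$ and $\phi(\Delta x)$ and checks that the variance of the position at time $t$ is close to $\langle n(t)\rangle\,\sigma^2 = (t/\langle\tau\rangle)\,\sigma^2$, as Eqs. (9) and (10) suggest for zero-mean jumps.

```python
import numpy as np

rng = np.random.default_rng(2)

n_walkers = 50_000
t_obs = 20.0
mean_tau = 0.5        # mean waiting time <tau> (exponential; illustrative)
sigma = 1.0           # std of a single zero-mean Gaussian jump (illustrative)

final_x = np.zeros(n_walkers)
for w in range(n_walkers):
    t, x = 0.0, 0.0
    while True:
        t += rng.exponential(mean_tau)       # waiting time between moves
        if t > t_obs:
            break
        x += rng.normal(0.0, sigma)          # displacement of one move
    final_x[w] = x

# Eqs. (9)-(10): Var[x(t)] ~ <n(t)> * sigma^2 = (t / <tau>) * sigma^2 for large t.
print("empirical Var[x(t)]:", round(final_x.var(), 2))
print("prediction t/<tau> * sigma^2:", round(t_obs / mean_tau * sigma**2, 2))
```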

3 Random walks on networks

3.1 Notation

For our discussions, we assume that our networks are finite. However, to estimate how certain quantities scale with the number of nodes, we sometimes examine the limit $N \to \infty$. We allow our networks to have self-edges and multi-edges. We assume that the edge weights are nonnegative, so our networks are unsigned. For now, we assume that our networks are ordinary graphs (i.e., the best-studied types of networks), but we will consider multilayer networks in Section 4.1 and temporal networks in Section 4.2. Because introducing edge weights does not usually complicate RW problems, we assume that our networks are weighted unless we state otherwise, and we consider unweighted networks to be a special case of weighted networks. We also assume that our networks are directed unless we state otherwise. We summarize our main notation in Table 1.

$N$: number of nodes
$M$: number of edges
$v_i$: the $i$th node (where $1 \le i \le N$)
$A$: the weighted adjacency matrix of the network; the matrix component $A_{ij}$ represents the weight of the edge from node $v_i$ to node $v_j$. In an undirected network, $A_{ij} = A_{ji}$ (where $1 \le i, j \le N$). In an unweighted network, $A_{ij} \in \{0, 1\}$ (again with $1 \le i, j \le N$).
$L$: combinatorial Laplacian matrix
$L'$: random-walk normalized Laplacian matrix
$s_i$: the strength of node $v_i$ in an undirected network; it is defined by $s_i = \sum_{j=1}^{N} A_{ij}$. In an undirected and unweighted network, $s_i$ is equal to the degree of $v_i$, which we denote by $k_i$.
$s_i^{\text{in}}$: in-strength of $v_i$; it is defined by $s_i^{\text{in}} = \sum_{j=1}^{N} A_{ji}$. In an unweighted network, $s_i^{\text{in}}$ is equal to the in-degree of $v_i$, which we denote by $k_i^{\text{in}}$.
$s_i^{\text{out}}$: out-strength of $v_i$; it is defined by $s_i^{\text{out}} = \sum_{j=1}^{N} A_{ij}$. In an unweighted network, $s_i^{\text{out}}$ is equal to the out-degree of $v_i$, which we denote by $k_i^{\text{out}}$.
$\langle k \rangle$: mean degree, which is given by $\langle k \rangle = \frac{1}{N} \sum_{i=1}^{N} k_i$; $\langle \cdot \rangle$ indicates the sample mean of the degree for a network
$D$: the diagonal matrix whose $i$th diagonal element is equal to $s_i^{\text{out}}$ (where $1 \le i \le N$). In an undirected network, the $i$th diagonal element of $D$ is equal to $s_i$.
$n$: discrete time
$t$: continuous time
$p_i$: probability that a random walker visits $v_i$
$p_i^*$: stationary density of a random walker at $v_i$
$\approx$: approximately equal to
$\propto$: proportional to
Table 1: Main notation.

An undirected network is called “regular” if all nodes have the same degree. Notably, many mathematical results for RWs on networks are restricted to regular graphs Lovasz1993Boyal; Aldous2002book; Hoory2006BullAmMathSoc. In this review, we are interested in networks with heterogeneous degree distributions, which tend to be the norm rather than the exception in empirical networks in numerous domains Clauset2009SiamRev.

In our discussions, we assume that undirected networks are connected networks and that directed networks are “weakly connected” (i.e., that they are connected when one ignores the directions of the edges). It is clear (in the absence of jumps such as “teleportation” Gleich2015SiamRev to augment the RW) that a random walker is confined in the component in which it starts, and the analysis of RWs is then reduced to analysis within each component. See Newman2010book for extensive discussions of components and weakly connected components.

3.2 Discrete time

3.2.1 Definition and temporal evolution

Consider a DTRW on a directed network. We suppose that there is a single walker, which moves during each time step. When the walker is located at node $v_i$, it moves to an out-neighbor $v_j$ with a probability proportional to $A_{ij}$. The transition-probability matrix $T$ has elements $T_{ij}$, which give the probability that the walker moves from $v_i$ to $v_j$, of

T_{ij} = \frac{A_{ij}}{s_i^{\text{out}}} = \frac{A_{ij}}{\sum_{\ell=1}^{N} A_{i\ell}} ,    (24)

where we assume that $s_i^{\text{out}} > 0$ for all $i$. Other choices of $T_{ij}$, informed by the adjacency matrix $A$, are also possible. One example is a “degree-biased RW” in unweighted (and usually undirected) networks Eisler2005PhysRevE; WangWangYin2006PhysRevE; Fronczak2009PhysRevE; LeeYook2009EurPhysJB; Baronchelli2010PhysRevE; Bonaventura2014PhysRevE; in this case, the probability of moving from $v_i$ to a neighbor $v_j$ is proportional to $k_j^{\alpha}$, where $\alpha$ is a constant. If one replaces $A_{ij}$ by $A_{ij} k_j^{\alpha}$, then $T_{ij}$ given by Eq. (24) gives this degree-biased RW. Another example of a biased transition-probability matrix is a “maximum entropy RW” Demetrius2005PhysicaA; Gomezgardenes2008PhysRevE; Burda2009PhysRevLett; Delvenne2011PhysRevE; Sinatra2011PhysRevE.

Because a random walker must go somewhere — including perhaps the current node — in a given move, the following conservation condition holds:

\sum_{j=1}^{N} T_{ij} = 1 .    (25)

A DTRW on a finite network is a Markov chain on $N$ states. There is a huge literature (both pedagogical and more advanced) on Markov chains in general and for RWs in particular. This is especially true for finite state spaces (corresponding to finite networks) and for stationary Markov chains in which the transition probability does not depend on discrete time Kemeny1960-1976book; Papoulis1965-2002book; Iosifescu1980book; Stewart1994book; Norris1997book; TaylorKarlin1984-1998book; Aldous2002book; LevinPeresWilmer2009book; Blanchard2011book; Privault2013book. We draw from this literature to explain several properties of DTRWs in the rest of this section.

Let $p_i(n)$ denote the probability that node $v_i$ is visited at discrete time $n$. This probability evolves according to

p_j(n + 1) = \sum_{i=1}^{N} p_i(n)\, T_{ij} .    (26)

Additionally,

\sum_{i=1}^{N} p_i(n) = 1    (27)

for any $n \ge 1$ if Eq. (27) holds for $n = 0$. Equation (26) is equivalent to

p(n + 1) = p(n)\, T ,    (28)

where $p(n) = (p_1(n), \ldots, p_N(n))$ is a row vector. From Eq. (28), we see that

p(n) = p(0)\, T^{n} .    (29)
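
As a concrete illustration (a minimal sketch in Python/NumPy; the small weighted directed network and the initial condition are arbitrary examples of ours), the following code builds $T$ from Eq. (24), propagates a density with Eq. (28), and checks the result against Eq. (29).

```python
import numpy as np

# Weighted adjacency matrix of a small directed network (arbitrary example);
# A[i, j] is the weight of the edge from node i to node j.
A = np.array([[0.0, 2.0, 1.0, 0.0],
              [1.0, 0.0, 0.0, 3.0],
              [0.0, 1.0, 0.0, 1.0],
              [2.0, 0.0, 1.0, 0.0]])

s_out = A.sum(axis=1)                 # out-strengths
T = A / s_out[:, None]                # Eq. (24): T_ij = A_ij / s_i^out

p = np.array([1.0, 0.0, 0.0, 0.0])    # walker starts at node 0
n_steps = 5
for _ in range(n_steps):
    p = p @ T                         # Eq. (28): p(n+1) = p(n) T

# Eq. (29): p(n) = p(0) T^n
p_direct = np.array([1.0, 0.0, 0.0, 0.0]) @ np.linalg.matrix_power(T, n_steps)
print(np.allclose(p, p_direct), p.round(4))
```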

3.2.2 Stationary density

Consider the stationary density (i.e., the so-called “occupation probability”) $p^* = (p_1^*, \ldots, p_N^*)$, where $p_i^* = \lim_{n \to \infty} p_i(n)$ (with $1 \le i \le N$). Substituting $p^*$ into Eq. (28) yields

p^* = p^*\, T .    (30)

Therefore, the stationary density $p^*$ is the left eigenvector of $T$ with eigenvalue $1$. The corresponding right eigenvector is $(1, \ldots, 1)^{\top}$, where $\top$ represents transposition.

For a directed network that is “strongly connected” (i.e., a walker can travel from any node to any other node along directed edges Newman2010book), $p^*$ is unique. In undirected networks, one just needs a network to be connected, which we have assumed.

In undirected networks, we obtain the central result

p_i^* = \frac{s_i}{\sum_{\ell=1}^{N} s_\ell} ,    (31)

which one can verify by substituting Eq. (31) into Eq. (30). For unweighted networks, Eq. (31) reduces to $p_i^* = k_i / \sum_{\ell=1}^{N} k_\ell$. Regardless of other structural properties of a network, the stationary density is determined solely by strength (and thus by degree for unweighted networks). Equation (31) also holds for directed networks that satisfy $s_i^{\text{in}} = s_i^{\text{out}}$ (where $1 \le i \le N$). Such directed networks are sometimes called “balanced” Aldous2002book.
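
The following sketch (Python/NumPy; the small undirected example network is constructed ad hoc) computes the leading left eigenvector of $T$ numerically and compares it with the strength-based prediction of Eq. (31).

```python
import numpy as np

rng = np.random.default_rng(3)

# Random symmetric weighted adjacency matrix (undirected network, no self-edges).
N = 6
W = np.triu(rng.uniform(0.0, 1.0, (N, N)) * (rng.random((N, N)) < 0.6), k=1)
A = W + W.T
for u in range(N - 1):                # add a weighted path so the network is connected
    A[u, u + 1] = A[u + 1, u] = 1.0

s = A.sum(axis=1)                     # strengths
T = A / s[:, None]

# Left eigenvector of T with eigenvalue 1 (i.e., leading right eigenvector of T^T).
eigval, eigvec = np.linalg.eig(T.T)
idx = np.argmax(eigval.real)
p_star = np.abs(eigvec[:, idx].real)
p_star /= p_star.sum()

print("eigenvector  :", p_star.round(4))
print("s_i / sum(s) :", (s / s.sum()).round(4))   # Eq. (31)
```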

In undirected networks,

p_i^*\, T_{ij} = p_j^*\, T_{ji} .    (32)

In other words, for each edge, the flows of probability in the two directions must equal each other at equilibrium. This property, called “detailed balance” in statistical physics sethnabook and “time reversibility” in mathematics Lovasz1993Boyal; Aldous2002book, does not generally hold for directed networks.

Let’s consider a generalization of the degree-biased RW to weighted networks (i.e., a strength-biased RW) in which the probability that a random walker located at node $v_i$ or $v_j$ traverses the edge $(v_i, v_j)$ is proportional to $(s_i s_j)^{\alpha}$. It follows that

(33)

where is the neighborhood of . A strength-biased RW is equivalent to an RW on a modified undirected network whose weighted adjacency matrix is given by (see Fig. 2 for an example). The strength of node in this modified network is given by . By substituting into Eq. (31) in place of , we obtain the stationary density

(34)

For an unweighted network constructed using a “configuration model” Fosdick2016, a standard model of random networks, we obtain $p_i^* \propto k_i^{\alpha + 1}$ Colizza2008JTheorBiol; LinZhang2013PhysRevE; ZhangShanChen2013PhysRevE. In particular, we obtain $p_i^* = 1/N$ for all nodes when $\alpha = -1$. Therefore, in general, we expect that a node with a large strength tends to have a large $p_i^*$ when $\alpha > -1$ (including for the unweighted case $\alpha = 0$) and that the same node tends to have a small $p_i^*$ when $\alpha < -1$. For nodes with a large strength, we expect $p_i^*$ to increase as $\alpha$ increases.

Figure 2: Strength-biased RW. (a) An original undirected network, whose weighted adjacency matrix is given by . (b) The modified undirected network, whose weighted adjacency matrix is given by . The numbers attached to the edges represent the edge weight. We set .

For directed networks in general, one can write a first-order approximation to the stationary density from Eq. (30). We assume that we do not possess any information about the neighbors of $v_i$, so we replace $p_j^*$ and $s_j^{\text{out}}$ by their mean values:

p_i^* \approx \sum_{j=1}^{N} \frac{1}{N} \cdot \frac{A_{ji}}{\langle s^{\text{out}} \rangle} = \frac{s_i^{\text{in}}}{\sum_{\ell=1}^{N} s_\ell^{\text{out}}} = \frac{s_i^{\text{in}}}{\sum_{\ell=1}^{N} s_\ell^{\text{in}}} .    (35)

On both synthetic and empirical networks, Eq. (35) is reasonably accurate in some cases but not in others Donato2004EurPhysJB; Fortunato2006PNAS; Restrepo2006PhysRevLett; Davis2008JAmerSocInfoSciTech; Fortunato2008LNCS; Fersht2009PNAS; MasudaOhtsuki2009NewJPhys; Ghoshal2011NatComm.

3.2.3 Relaxation time

To determine the relaxation time to the stationary state, it is instructive to project the solution, Eq. (29), onto an appropriate basis of vectors and to represent it in terms of its modes. The procedure, which is analogous to taking a Fourier transform [see Eq. (2)], is sometimes called a “graph Fourier transform” Sandryhaila2013IEEETransSignalProc; Tremblay2014IEEESigProc and will be explained in this section [see Eqs. (43)–(45)].

For simplicity, we consider undirected networks. In general, the transition-probability matrix $T$ is asymmetric even for undirected networks, except for regular graphs. However, one can derive its eigenvalues and eigenvectors from those of the symmetric matrix

A^{\text{sym}} = D^{-1/2} A D^{-1/2} ,    (36)

which we can decompose as follows:

A^{\text{sym}} = \sum_{\ell=1}^{N} \lambda_\ell\, u_\ell u_\ell^{\top} ,    (37)

where $\lambda_\ell$ is the $\ell$th eigenvalue of $A^{\text{sym}}$ and $u_\ell$ is the corresponding normalized eigenvector (so that $\langle u_\ell, u_{\ell'} \rangle = \delta_{\ell \ell'}$, where $\langle \cdot, \cdot \rangle$ is the inner product), and $\delta_{\ell \ell'}$ is the Kronecker delta. Because $A^{\text{sym}}$ is symmetric, each eigenvalue $\lambda_\ell$ is real.

Because $T = D^{-1} A$, we have the following similarity relationship between $T$ and $A^{\text{sym}}$ Aldous2002book; Samukhin2008PhysRevE:

T = D^{-1/2} A^{\text{sym}} D^{1/2} ,    (38)

where we defined $D$ (a matrix whose nonzero entries lie only on the diagonal) in Section 3.1. Equation (38) implies that $T$ and $A^{\text{sym}}$ have the same eigenvalues. In particular, all eigenvalues of $T$ are real-valued, because that is the case for $A^{\text{sym}}$. The left and right eigenvectors of $T$ corresponding to the eigenvalue $\lambda_\ell$ are, respectively,

u_\ell^{\text{L}} = D^{1/2} u_\ell    (39)

and

u_\ell^{\text{R}} = D^{-1/2} u_\ell .    (40)

One can verify Eqs. (39) and (40) using Eq. (38) and the relation $A^{\text{sym}} u_\ell = \lambda_\ell u_\ell$.

Using

\sum_{\ell=1}^{N} u_\ell u_\ell^{\top} = I ,    (41)

we obtain the following mode expansion of the solution of the RW:

p(n) = p(0)\, T^{n} = \sum_{\ell=1}^{N} \lambda_\ell^{n} \left[ p(0)\, u_\ell^{\text{R}} \right] \left( u_\ell^{\text{L}} \right)^{\top} .    (42)

That is,

p(n) = \sum_{\ell=1}^{N} a_\ell(n) \left( u_\ell^{\text{L}} \right)^{\top} ,    (43)

where

a_\ell(n) = \lambda_\ell^{n}\, a_\ell(0) ,    (44)
a_\ell(0) = p(0)\, u_\ell^{\text{R}} ,    (45)

and $a_\ell(n)$ is the projection of $p(n)$ onto the $\ell$th eigenmode. Equations (43)–(45) map the state vector $p(n)$, which is defined on the nodes, to a vector $(a_1(n), \ldots, a_N(n))$ of eigenvector amplitudes (i.e., their coefficients). This transform, called the “graph Fourier transform”, generalizes the standard Fourier transform of an RW [see Eqs. (3) and (7)], and the eigenvectors of the transition-probability matrix play the role of the Fourier modes $e^{-\mathrm{i}kx}$.
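
One way to implement this transform numerically is sketched below (Python/NumPy; the example network, initial density, and time horizon are arbitrary choices): diagonalize $D^{-1/2} A D^{-1/2}$, obtain the mode amplitudes of $p(0)$, let each amplitude evolve as $\lambda_\ell^n$ times its initial value, and map back to the node space; the result agrees with direct iteration of the master equation.

```python
import numpy as np

# Small undirected weighted network (arbitrary example).
A = np.array([[0.0, 1.0, 2.0, 0.0],
              [1.0, 0.0, 1.0, 1.0],
              [2.0, 1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 0.0]])
s = A.sum(axis=1)
T = A / s[:, None]

D_sqrt = np.diag(np.sqrt(s))
D_isqrt = np.diag(1.0 / np.sqrt(s))
A_sym = D_isqrt @ A @ D_isqrt                  # symmetric matrix similar to T

lam, U = np.linalg.eigh(A_sym)                 # real eigenvalues, orthonormal eigenvectors

p0 = np.array([1.0, 0.0, 0.0, 0.0])            # initial density on the nodes
n = 7

# "Graph Fourier transform": amplitudes of p(0) on the eigenmodes.
a0 = p0 @ D_isqrt @ U                          # one amplitude per mode
# Each amplitude decays (or persists) as lambda_l^n; then transform back to node space.
p_n_modes = (a0 * lam**n) @ U.T @ D_sqrt

p_n_direct = p0 @ np.linalg.matrix_power(T, n)
print(np.allclose(p_n_modes, p_n_direct), p_n_modes.round(4))
```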

For the matrices $A^{\text{sym}}$ and $T$, the eigenvalues each satisfy $-1 \le \lambda_\ell \le 1$ Lovasz1993Boyal; Aldous2002book. Except in the special cases of multipartite graphs, the strict inequality $\lambda_\ell > -1$ also holds. In this case, the mode with $\lambda_\ell = 1$ corresponds to the stationary density, and we thus write $\lambda_1 = 1$. The right eigenvector that corresponds to this mode is $u_1^{\text{R}} \propto (1, \ldots, 1)^{\top}$. All modes for which $|\lambda_\ell| < 1$ decay to $0$. The eigenvalue $\lambda_1 = 1$ is the largest-magnitude eigenvalue, and the Perron–Frobenius theorem guarantees that all elements of $u_1^{\text{L}}$ and $u_1^{\text{R}}$ are positive. Similar results hold for directed networks, although we cannot take advantage of the symmetric structure of the matrix in general. In directed networks, the eigenvalues satisfy $|\lambda_\ell| \le 1$. When $|\lambda_\ell| < 1$ holds for all but one eigenvalue, which is the case except for directed variants of multipartite graphs with an even number of components, the mode with $\lambda_1 = 1$ corresponds to the stationary density. In this case, we obtain $u_1^{\text{L}} \propto p^{*\top}$ and $u_1^{\text{R}} \propto (1, \ldots, 1)^{\top}$. Again, the Perron–Frobenius theorem guarantees that all elements of $p^*$ are positive.

By letting $n \to \infty$ in Eq. (42), we obtain $p(n) \to a_1(0) \left( u_1^{\text{L}} \right)^{\top}$, where the subscript “$1$” indicates the mode corresponding to the dominant eigenvalue (which is equal to $1$). Because $\sum_{i=1}^{N} p_i(0) = 1$, it follows that $a_1(0) \left( u_1^{\text{L}} \right)^{\top} = p^*$ regardless of the initial condition $p(0)$. This is consistent with the fact that the mode with $\lambda_1 = 1$ gives the stationary density. By letting $n$ be large but finite, we obtain

p(n) \approx p^* + \lambda_2^{n}\, a_2(0) \left( u_2^{\text{L}} \right)^{\top} ,    (46)

where $\lambda_2$ is the second-largest (in magnitude) eigenvalue of $T$. In deriving Eq. (46), we only kept two terms, because $|\lambda_\ell|^{n} \ll |\lambda_2|^{n}$ for all eigenvalues with $|\lambda_\ell| < |\lambda_2|$, assuming that $n$ is large (where $3 \le \ell \le N$). Equation (46) indicates that the second-largest eigenvalue of $T$ governs the relaxation time. More generally, the relaxation speed is determined by the ratio between $|\lambda_2|$ and $\lambda_1 = 1$. The difference $1 - |\lambda_2|$ is often called the “spectral gap”. A large spectral gap (i.e., a small-magnitude $\lambda_2$) entails fast relaxation.
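
To illustrate how the spectral gap reflects structure (a sketch; the two example networks, two cliques joined by a single edge versus a complete graph, are our own choices), the following code computes $1 - |\lambda_2|$ and the implied relaxation scale $-1/\ln|\lambda_2|$ for both networks.

```python
import numpy as np

def spectral_gap(A):
    """Return 1 - |lambda_2| for the DTRW transition matrix T = D^{-1} A
    of an undirected network with adjacency matrix A."""
    s = A.sum(axis=1)
    D_isqrt = np.diag(1.0 / np.sqrt(s))
    lam = np.linalg.eigvalsh(D_isqrt @ A @ D_isqrt)   # eigenvalues of A_sym (= those of T)
    lam = np.sort(np.abs(lam))[::-1]
    return 1.0 - lam[1]

def two_cliques(m):
    """Two m-cliques connected by a single edge (a network with poor conductance)."""
    A = np.zeros((2 * m, 2 * m))
    A[:m, :m] = 1.0
    A[m:, m:] = 1.0
    np.fill_diagonal(A, 0.0)
    A[m - 1, m] = A[m, m - 1] = 1.0
    return A

def complete_graph(n):
    A = np.ones((n, n))
    np.fill_diagonal(A, 0.0)
    return A

for name, A in [("two cliques", two_cliques(10)), ("complete graph", complete_graph(20))]:
    gap = spectral_gap(A)
    print(f"{name:15s} spectral gap = {gap:.4f}  relaxation scale ~ {-1.0 / np.log(1.0 - gap):.1f} steps")
```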

The “Cheeger inequality” gives useful bounds on $\lambda_2$ Chung1997book-spectral. The “Cheeger constant”, which is also called “conductance”, is defined by

h = \min_{S} \frac{\sum_{v_i \in S,\, v_j \in \bar{S}} A_{ij}}{\min\left\{ \sum_{v_i \in S} s_i ,\ \sum_{v_i \in \bar{S}} s_i \right\}} ,    (47)

where $S$ is a set of nodes in a network, $\bar{S}$ is the complementary set of the nodes (i.e., $S \cap \bar{S} = \emptyset$ and $S \cup \bar{S}$ is the complete set of the nodes), and $\sum_{v_i \in S,\, v_j \in \bar{S}} A_{ij}$ is the total weight of the edges that connect $S$ and $\bar{S}$. In the minimization in Eq. (47), we seek a bipartition of a network such that the two parts are the most sparsely connected. (In other words, we want a minimum cut.) The denominator on the right-hand side of Eq. (47) prevents the selection of a very uneven bipartition, which would easily yield a small value for the numerator. The Cheeger inequality is

\frac{h^{2}}{2} \le 1 - \lambda_2 \le 2 h ,    (48)

so a small Cheeger constant implies a small spectral gap and hence slower relaxation. This result is intuitive, because one can partition a network with a small value of into two well-separated communities such that it is difficult for random walkers to cross from one community to the other. Note that there are various versions of Cheeger constants and inequalities. They give qualitatively similar — but quantitatively different — results Lovasz1993Boyal; Aldous2002book; Donetti2006JStatMech; Arenas2008PhysRep; Cvetkovic2010book; piet-book. As discussed in Ref. Jeub2015PhysRevE and references therein, such results are important considerations for community detection.

A fact related to the relaxation time is that the power method is a practical method to calculate the stationary density of an RW in a directed network Golub1996book. Suppose that we start with an arbitrary initial vector $p(0)$, excluding one that is orthogonal to $(1, \ldots, 1)^{\top}$, and repeatedly multiply it on the right by $T$ [i.e., iterate Eq. (28)]. After many iterations, we obtain an accurate estimate of $p^*$. Because any $p(0)$ that is orthogonal to $(1, \ldots, 1)^{\top}$ includes a negative entry, one can start the iterations with any probability vector $p(0)$. In practice, one may have to normalize $p(n)$ after each iteration (or after some number of iterations) to avoid the elements of $p(n)$ becoming too large or small.
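
A minimal sketch of the power method just described (Python/NumPy; the directed example network, the tolerance, and the choice to renormalize at every iteration are ours):

```python
import numpy as np

# Weighted adjacency matrix of a small, strongly connected directed network (arbitrary).
A = np.array([[0.0, 1.0, 2.0],
              [3.0, 0.0, 1.0],
              [1.0, 1.0, 0.0]])
T = A / A.sum(axis=1)[:, None]        # Eq. (24)

p = np.full(3, 1.0 / 3.0)             # any probability vector works as a starting point
for _ in range(1000):
    p_next = p @ T                    # one iteration of Eq. (28)
    p_next /= p_next.sum()            # renormalize to control round-off drift
    if np.abs(p_next - p).max() < 1e-12:
        p = p_next
        break
    p = p_next

print("power method:", p.round(6))
print("residual ||p - pT||:", np.abs(p - p @ T).max())
```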

3.2.4 Exit probability

One is often interested in the probability that a random walker terminates at a particular node, which is then called an “absorbing state”. Upon reaching an absorbing state, a stochastic process cannot escape from it. A node $v_i$ is “absorbing” if and only if $T_{ii} = 1$, which implies that $T_{ij} = 0$ (for $j \ne i$). A set of nodes is an “ergodic” set if (1) it is possible to go from $v_i$ to $v_j$ for any nodes $v_i$ and $v_j$ in the set and (2) the process does not leave the set once it has been reached. An absorbing node is an ergodic set that consists of a single node. A state in a Markov chain is said to be a “transient state” if it does not belong to an ergodic set.

When an RW has $N_{\text{T}}$ transient-state nodes and $N_{\text{A}}$ absorbing-state nodes, there are $N = N_{\text{T}} + N_{\text{A}}$ nodes in total. Without loss of generality, we relabel the nodes such that $v_1$, …, $v_{N_{\text{T}}}$ are transient and $v_{N_{\text{T}} + 1}$, …, $v_N$ are absorbing. The transition-probability matrix then has the following form:

T = \begin{pmatrix} Q & R \\ 0 & I \end{pmatrix} ,    (49)

where $Q$ is an $N_{\text{T}} \times N_{\text{T}}$ matrix that describes transitions between transient-state nodes, $R$ is an $N_{\text{T}} \times N_{\text{A}}$ matrix that describes transitions from transient-state nodes to absorbing-state nodes, and $I$ is the $N_{\text{A}} \times N_{\text{A}}$ identity matrix that corresponds to the individual absorbing-state nodes. Taking powers of Eq. (49) yields

T^{n} = \begin{pmatrix} Q^{n} & \left( \sum_{n'=0}^{n-1} Q^{n'} \right) R \\ 0 & I \end{pmatrix} .    (50)

Suppose that we start from transient-state node $v_i$ and want to calculate the mean number of visits to transient-state node $v_j$ before reaching an absorbing-state node. This number of visits is equal to the $(i, j)$th element of the matrix

F \equiv \sum_{n=0}^{\infty} Q^{n} = (I - Q)^{-1} ,    (51)

because the $(i, j)$th element of $Q^{n}$ is equal to the probability that a random walker starting from $v_i$ visits $v_j$ at discrete time $n$. The matrix $F$ is called the “fundamental matrix” associated with $Q$. The matrix on the right-hand side of Eq. (51) is called the “resolvent” of $Q$. Similar considerations arise in the study of “central” (i.e., important) nodes in networks Est12comm.

The “exit probability” (i.e., the “first-passage-time probability”) is defined as the probability that the walker terminates at an absorbing state $v_j$ (with $N_{\text{T}} + 1 \le j \le N$) when it starts from a transient state $v_i$. When there are multiple absorbing-state nodes, it is nontrivial to determine the exit probability. The probability that the walker reaches $v_j$ after exactly $n$ steps is given by the $(i, j - N_{\text{T}})$th element of $Q^{n-1} R$. Therefore, we obtain the exit probability in matrix form as follows:

B = \sum_{n=1}^{\infty} Q^{n-1} R = (I - Q)^{-1} R = F R .    (52)
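
The following sketch (Python/NumPy) applies these formulas to a small example of our choosing, an unbiased DTRW on a five-node path whose two end nodes are absorbing (a gambler's-ruin setup): it assembles $Q$ and $R$, computes the fundamental matrix of Eq. (51), and obtains the exit probabilities of Eq. (52).

```python
import numpy as np

# Unbiased DTRW on a path of 5 nodes (0, 1, 2, 3, 4) in which the two end nodes
# are absorbing; the transient nodes are 1, 2, 3. Rows and columns are ordered
# with transient nodes first and absorbing nodes last, as in Eq. (49).
Q = np.array([[0.0, 0.5, 0.0],        # transitions among transient nodes 1, 2, 3
              [0.5, 0.0, 0.5],
              [0.0, 0.5, 0.0]])
R = np.array([[0.5, 0.0],             # transitions from transient nodes to absorbing nodes 0, 4
              [0.0, 0.0],
              [0.0, 0.5]])

F = np.linalg.inv(np.eye(3) - Q)       # fundamental matrix, Eq. (51)
B = F @ R                              # exit probabilities, Eq. (52)

print("mean number of visits F:\n", F.round(3))
print("exit probabilities B:\n", B.round(3))
# For this gambler's-ruin chain, the probability of exiting at node 4 from
# transient node i is i/4, i.e., 0.25, 0.5, and 0.75 for nodes 1, 2, and 3.
```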

3.2.5 Mean first-passage and recurrence times

When does a random walker starting from a certain source node arrive at a target node for the first time? The answer to this question is known as the “first-passage time” (or “first-hitting time”) if the source and target nodes are different and is known as the “recurrence time” (or the “first-return time”) when the source and target nodes are identical. Let $m_{ij}$ (with $i \ne j$) denote the mean first-passage time (MFPT) from node $v_i$ to node $v_j$. The mean recurrence time is $m_{ii}$. For directed networks, we assume strongly connected networks throughout this section to guarantee that $m_{ij} < \infty$ (for $1 \le i, j \le N$). For reviews on first-passage problems on networks and other media, see Redner2001book; Benichou2014PhysRep.

General networks: Let’s first consider some general results. The following identity holds Kemeny1960-1976book; Papoulis1965-2002book; Stewart1994book; Aldous2002book:

m_{ij} = T_{ij} + \sum_{\ell = 1;\, \ell \ne j}^{N} T_{i\ell} \left( 1 + m_{\ell j} \right) = 1 + \sum_{\ell = 1;\, \ell \ne j}^{N} T_{i\ell}\, m_{\ell j} .    (53)

In its first step, a random walker moves from node $v_i$ to node $v_\ell$, which produces the $1$ on the right-hand side of Eq. (53). If $\ell = j$, then the walk terminates at $v_j$, resulting in a first-passage time of $1$. Otherwise, we seek the first passage from node $v_\ell$ (with $\ell \ne j$) to node $v_j$. This produces the second term on the right-hand side. Note that Eq. (53) is also valid when $i = j$.

In matrix notation, we write Eq. (53) as

M = J + T \left( M - M_{\text{d}} \right) ,    (54)

where $M = (m_{ij})$, all of the elements of the $N \times N$ matrix $J$ are equal to $1$, and $M_{\text{d}}$ is the diagonal matrix whose diagonal elements are equal to $m_{ii}$ (with $1 \le i \le N$). By left-multiplying Eq. (54) by $p^*$ and using $p^* T = p^*$ and $p^* J = (1, \ldots, 1)$, we obtain the mean recurrence time

m_{ii} = \frac{1}{p_i^*} .    (55)

Equation (55) is called “Kac’s formula” Aldous2002book; LevinPeresWilmer2009book; Blanchard2011book.

There are several different ways to evaluate the MFPT $m_{ij}$ (with $i \ne j$), and it is insightful to discuss different approaches.

One method is simply to iterate Eq. (53) Stewart1994book.

A second method to calculate the MFPT, for a given target node $v_j$, is to rewrite Eq. (53) as

m_{\cdot j} = \mathbf{1} + T^{(j)}\, m_{\cdot j} ,    (56)

where $m_{\cdot j} = (m_{1j}, \ldots, m_{j-1, j}, m_{j+1, j}, \ldots, m_{Nj})^{\top}$ and $\mathbf{1} = (1, \ldots, 1)^{\top}$ are $(N - 1)$-dimensional column vectors and $T^{(j)}$ is the submatrix of $T$ that excludes the $j$th row and $j$th column LinZhang2013PhysRevE. The formal solution of Eq. (56) is

m_{\cdot j} = \left( I^{(j)} - T^{(j)} \right)^{-1} \mathbf{1} = \left( L'^{(j)} \right)^{-1} \mathbf{1} ,    (57)

where $I^{(j)}$ is the submatrix of the identity matrix that excludes the $j$th row and $j$th column and $L'^{(j)} = I^{(j)} - T^{(j)}$, where $L'^{(j)}$ is the submatrix of the random-walk normalized Laplacian matrix $L' = I - T$ that excludes the $j$th row and $j$th column. The matrix $L'^{(j)}$ is sometimes called a “grounded Laplacian matrix” Miekkala1993Bit (although it is not a Laplacian matrix), and it is invertible because we have assumed strongly connected networks. One can derive and solve Eq. (57) separately for each $j$.
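
This second method amounts to a single linear solve per target node. The sketch below (Python/NumPy; the small undirected example network is arbitrary) removes the target node's row and column from $I - T$, solves for the MFPTs, and checks the implied mean recurrence time against Kac's formula, Eq. (55).

```python
import numpy as np

# Small connected undirected network (arbitrary example).
A = np.array([[0.0, 1.0, 1.0, 0.0, 0.0],
              [1.0, 0.0, 1.0, 1.0, 0.0],
              [1.0, 1.0, 0.0, 1.0, 1.0],
              [0.0, 1.0, 1.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 1.0, 0.0]])
s = A.sum(axis=1)
T = A / s[:, None]
N = len(A)

j = 2                                           # target node
keep = [i for i in range(N) if i != j]

# Grounded system: (I^{(j)} - T^{(j)}) m_{.j} = 1, i.e., Eqs. (56)-(57).
L_grounded = np.eye(N - 1) - T[np.ix_(keep, keep)]
m_to_j = np.linalg.solve(L_grounded, np.ones(N - 1))

# Mean recurrence time from Eq. (53) with i = j, compared with Kac's formula (55).
m_jj = 1.0 + T[j, keep] @ m_to_j
print("MFPTs to node", j, ":", dict(zip(keep, m_to_j.round(3))))
print("recurrence time:", round(m_jj, 3), " Kac: sum(s)/s_j =", round(s.sum() / s[j], 3))
```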

A third method to calculate the MFPT is to take advantage of relaxation properties of RWs Noh2004PhysRevLett. Let $p_{ij}(n)$ denote the probability that a walker starting at node $v_i$ visits node $v_j$ after $n$ moves. The master equation is

p_{ij}(n + 1) = \sum_{\ell=1}^{N} p_{i\ell}(n)\, T_{\ell j} .    (58)

Let $f_{ij}(n)$ denote the probability that the walker starting from $v_i$ arrives at $v_j$ for the first time after $n$ moves. We obtain

p_{ij}(n) = \delta_{n0}\, \delta_{ij} + \sum_{n'=1}^{n} f_{ij}(n')\, p_{jj}(n - n') .    (59)

Using a discrete-time Laplace transform (see, e.g., wilf2005 for an extensive discussion of such generating functions), defined by

\hat{p}_{ij}(s) = \sum_{n=0}^{\infty} p_{ij}(n)\, e^{-sn}    (60)

and

\hat{f}_{ij}(s) = \sum_{n=0}^{\infty} f_{ij}(n)\, e^{-sn} ,    (61)

we transform Eq. (59) to

\hat{p}_{ij}(s) = \delta_{ij} + \hat{f}_{ij}(s)\, \hat{p}_{jj}(s)    (62)

and thereby obtain

\hat{f}_{ij}(s) = \frac{\hat{p}_{ij}(s) - \delta_{ij}}{\hat{p}_{jj}(s)} .    (63)

Using Eq. (63) then yields

m_{ij} = \sum_{n=1}^{\infty} n\, f_{ij}(n) = - \left. \frac{\mathrm{d} \hat{f}_{ij}(s)}{\mathrm{d} s} \right|_{s = 0} .    (64)

To evaluate Eq. (64), we define

(65)

Equation (65) quantifies the relaxation speed at which approaches the stationary density. To write the Laplace transform, we multiply both sides of Eq. (65) by and sum over . We thereby obtain

(66)

Substituting Eq. (66) into Eq. (63) then yields

(67)

where represents a quantity that is much smaller than in the relevant asymptotic limit ( in the present case). Consequently,

(68)

which is consistent with Kac’s formula [see Eq. (55)]. For undirected networks, substituting into Eq. (68) yields

(69)

A fourth method to examine the MFPT is to estimate $m_{ij}$ using a mean-field approximation Kittas2008EPL; Perra2012PhysRevLett; Starnini2012PhysRevE. Regardless of the source node $v_i$, the target node $v_j$ is reached with an approximate probability of $p_j^*$ in each time step. Therefore,

m_{ij} \approx \sum_{n=1}^{\infty} n\, p_j^* \left( 1 - p_j^* \right)^{n-1} = \frac{1}{p_j^*} .    (70)

Equation (70) is a rather coarse approximation, and $m_{ij}$ can deviate considerably from $1/p_j^*$. More sophisticated mean-field approaches can likely do better, especially for networks with structures that are well-suited to the employed approximation.
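
To see how rough Eq. (70) can be (a sketch; the lollipop-style example network, a clique with a pendant path, is our own choice), the following code compares the exact MFPTs to a target node, obtained from the linear system of Eq. (56), with the mean-field estimate $1/p_j^*$.

```python
import numpy as np

def exact_mfpt_to(T, j):
    """Mean first-passage times m_ij to target j, via the linear system of Eq. (56)."""
    N = len(T)
    keep = [i for i in range(N) if i != j]
    m = np.linalg.solve(np.eye(N - 1) - T[np.ix_(keep, keep)], np.ones(N - 1))
    return dict(zip(keep, m))

# A "lollipop"-like network: a 4-clique (nodes 0-3) with a path 3-4-5-6 attached
# (an arbitrary example with heterogeneous structure).
N = 7
A = np.zeros((N, N))
A[:4, :4] = 1.0
np.fill_diagonal(A, 0.0)
for u, v in [(3, 4), (4, 5), (5, 6)]:
    A[u, v] = A[v, u] = 1.0

s = A.sum(axis=1)
T = A / s[:, None]

j = 0                                           # target node inside the clique
p_star_j = s[j] / s.sum()                       # Eq. (31)
mfpt = exact_mfpt_to(T, j)

print("mean-field estimate 1/p_j^* =", round(1.0 / p_star_j, 2))
for i, m in mfpt.items():
    print(f"exact m_({i},{j}) = {m:7.2f}")
```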

There have been many studies of MFPTs for various network models using both analytical and numerical approaches Redner2001book; Almaas2003PhysRevE; Masuda2004PhysRevE-rw; Hwang2012PhysRevE; Hwang2013PhysRevE; PengAgliariZhang2015Chaos. We will discuss some examples of undirected and unweighted networks. We focus mainly on the MFPT between different nodes, although it is of course also interesting to calculate recurrence times.

Regular networks: For a complete graph, $m_{ij}$ (with $i \ne j$) is independent of $i$ and $j$ because of the symmetry of the network. Therefore, Eq. (53) reduces to