Distributed Local Linear Parameter Estimation using Gaussian SPAWN
We consider the problem of estimating local sensor parameters, where the local parameters and sensor observations are related through linear stochastic models. Sensors exchange messages and cooperate with each other to estimate their own local parameters iteratively. We study the Gaussian Sum-Product Algorithm over a Wireless Network (gSPAWN) procedure, which is based on belief propagation, but uses fixed size broadcast messages at each sensor instead. Compared with the popular diffusion strategies for performing network parameter estimation, whose communication cost at each sensor increases with increasing network density, the gSPAWN algorithm allows sensors to broadcast a message whose size does not depend on the network size or density, making it more suitable for applications in wireless sensor networks. We show that the gSPAWN algorithm converges in mean and has mean-square stability under some technical sufficient conditions, and we describe an application of the gSPAWN algorithm to a network localization problem in non-line-of-sight environments. Numerical results suggest that gSPAWN converges much faster in general than the diffusion method, and has lower communication costs, with comparable root mean square errors.
A wireless sensor network (WSN) consists of many sensors or nodes capable of on-board sensing, computing, and communications. WSNs are used in numerous applications such as environmental monitoring, pollution detection, control of industrial machines and home appliances, event detection, and object tracking [Akyildiz2007, Bulusu2005, Tay2009, Tay2008]. In distributed estimation [Hong2007, Zhu2010, Speranzon2006], sensors cooperate with each other by passing information between neighboring nodes, which removes the need to transmit local data to a central fusion center. Compared with centralized schemes, which require transmission to a fusion center, distributed estimation schemes are therefore scalable, robust to node failures, and more energy efficient because of the shorter sensor-to-sensor communication ranges. Cooperation also improves local estimation accuracy [Gholami2012, Dardari2008]. For example, by cooperating with each other to perform localization, nodes that have information from an inadequate number of anchors can still successfully self-localize [Wymeersch2008, Wymeersch2009].
In this paper, we consider distributed local linear parameter estimation in a WSN in which each sensor cooperates with its neighbors to estimate its own local parameter, which is related to its own observations as well as its neighbors’ observations via a linear stochastic model. This is a special case of the more general problem in which sensors cooperate to estimate a global vector parameter of interest, as the local parameters can be collected into a single vector parameter. Many distributed estimation algorithms for this problem have been investigated in the literature, including consensus strategies [Boyd2006, Aysal2009, Zhu2011, Kriegleder2013], the incremental least mean square (LMS) algorithm [Lopes2007, Cattivelli2011], the distributed LMS consensus-based algorithm [Schizas2009] and diffusion LMS strategies (see the excellent tutorial [Sayed2013a] and the references therein). The consensus strategies are relatively less computationally intensive for each sensor. However, their performance in terms of the convergence rate and mean-square deviation (MSD) is often not as good as that of the diffusion LMS strategies [Tu2012]. The incremental LMS algorithm requires that sensors set up a Hamiltonian path through the network, which is impractical for a large WSN.
To ensure mean and mean-square convergence, the diffusion LMS strategy requires that a pre-defined step size in each diffusion update be smaller than twice the reciprocal of the maximum eigenvalue of the system matrix covariance [Sayed2013, Sayed2013a]. Since this eigenvalue is not known a priori, a very small step size is typically chosen in the diffusion LMS strategy. However, this leads to slow convergence rates, resulting in higher communication costs. Furthermore, the diffusion strategy is not specifically designed to perform local sensor parameter estimation in a WSN. For example, when sensors need to estimate their own clock skews and offsets in network synchronization [Leng2011], the diffusion LMS strategy either requires that every node estimate the same global parameter, which is a collection of all the sensor local clock skews and offsets, or at least that every node transmit estimates of the clock skews and offsets of all its neighbors (cf. the comparison with the diffusion algorithm in Section III for a detailed discussion). In both cases, the communication cost for each sensor per iteration increases with the density of the network, and the strategy does not make full use of the broadcast nature of wireless communications in a WSN. The distributed LMS consensus-based algorithm of [Schizas2009] requires the selection of a set of bridge sensors, and the passing of several messages between neighboring sensors, which again does not utilize the broadcast nature of a WSN. It is also more computationally complex than the diffusion algorithm, but has better MSD performance.
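As a rough illustration of the step-size constraint described above, the following Python sketch estimates the largest eigenvalue of a regressor covariance (second-moment) matrix from samples and evaluates the bound of twice its reciprocal. The synthetic regressor data, the helper name `max_step_size`, and the use of a sample estimate are assumptions made for illustration only; they are not taken from [Sayed2013, Sayed2013a].

```python
import numpy as np

def max_step_size(regressors):
    """Return 2 / lambda_max of the sample second-moment matrix of the rows.

    regressors: array of shape (num_samples, dim), one regression vector per row.
    """
    r_u = regressors.T @ regressors / regressors.shape[0]
    lam_max = np.linalg.eigvalsh(r_u)[-1]  # eigvalsh returns ascending eigenvalues
    return 2.0 / lam_max

# Hypothetical zero-mean regressors with per-coordinate standard deviations 1, 2, 3.
rng = np.random.default_rng(0)
u = rng.normal(size=(5000, 3)) * np.array([1.0, 2.0, 3.0])
mu_bound = max_step_size(u)  # close to 2 / 9 for this synthetic data
```

In practice the covariance is unknown before deployment, so a bound computed this way is only available after data has been collected, which is consistent with the conservative step sizes mentioned above.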
We therefore ask whether there exists a distributed estimation algorithm that has MSD performance similar to that of the diffusion algorithms, and that allows each sensor to broadcast a fixed-size message, regardless of the network size or density, to all its neighbors. Our work suggests that the answer is affirmative for a somewhat more restrictive data model than that used in the LMS literature [Sayed2013a], and can be found in the Sum-Product Algorithm over a Wireless Network (SPAWN) method, first proposed by [Wymeersch2009]. The SPAWN algorithm is based on belief propagation [Kschischang2001, Bishop2006]; the main differing characteristic is that the same message is broadcast to all neighboring nodes by each sensor, in contrast to traditional belief propagation where a different message is transmitted to each neighbor. The reference [Wymeersch2009], however, does not address the issue of error convergence in the SPAWN algorithm, and it is well known that loopy belief propagation may not converge [Pearl1988, Johnson2006, Weiss2001].
In this paper, we consider an adaptive version of the Gaussian SPAWN (gSPAWN) algorithm for distributed local linear parameter estimation. We assume that sensors have only local observations with respect to their neighbors, such as the pairwise distance measurement, the relative temperature with respect to one another, and the relative clock offset between two nodes. Similar to [Cattivelli2010, Takahashi2010, Sayed2013], we assume that the observations follow a linear model. Our main contribution in this paper is the derivation of sufficient conditions for the mean and MSD convergence of the gSPAWN algorithm. Note that although the gSPAWN algorithm is based on Gaussian belief propagation, the methods of [Weiss2001, Johnson2006] for analyzing the convergence of Gaussian belief propagation in a loopy graph do not apply, because the messages transmitted by each node in gSPAWN differ from those in Gaussian belief propagation. In fact, due to the broadcast nature of the messages in gSPAWN, our analysis is simpler than that in [Weiss2001, Johnson2006].
As an example, we apply the gSPAWN algorithm to cooperative self-localization in non-line-of-sight (NLOS) multipath environments. To the best of our knowledge, most distributed localization methods [Srirangarajan2008, Ihler2005, Zhu2011] consider only line-of-sight (LOS) signals, because NLOS environments introduce non-linearities in the system models, and measurement noises can no longer be modeled as Gaussian random variables. In this application, we assume that individual propagation paths can be resolved, and we adopt a ray tracing model to characterize the relationship between sensor locations, range and angle measurements [Miao2007, Seow2008, Xie2009]. When all scatterers in the environment are either parallel or orthogonal to each other, we show that the sensor location estimates given by the gSPAWN algorithm converge in the mean to the true sensor positions. We compare the performance of our algorithm with that of a peer-to-peer localization method and the diffusion LMS strategy, with numerical results suggesting that the gSPAWN algorithm has better average accuracy and convergence rate.
The rest of this paper is organized as follows. In Section II, we define the system model. In Section III, we describe the gSPAWN algorithm and compare it to the diffusion algorithm. We provide sufficient conditions for mean convergence and mean-square stability of the gSPAWN algorithm, and numerical comparison results in Section LABEL:Section:convergence. We then show an application of the gSPAWN algorithm to network localization in NLOS environments in Section LABEL:Section:Case. Finally, we summarize and conclude in Section LABEL:Section:Conclude.
Notations: We use upper-case letters to represent matrices and lower-case letters for vectors and scalars. Bold faced symbols are used to denote random variables. The conjugate of a matrix is . The transpose and conjugate transpose of are denoted as and , respectively. The minimum and maximum non-zero singular values of are denoted as and , respectively. The maximum absolute eigenvalue or spectral radius of is denoted as . When is Hermitian, we have . We write and if is positive semi-definite and positive definite, respectively. If all entries of are non-negative, we write . The operation is the Kronecker product between and . The vector is formed by stacking the columns of together into a column vector. The matrix is a block diagonal matrix consisting of the sub-matrices on the main diagonal. The symbol represents a identity matrix. We use to denote a vector of all zeroes. The operator denotes mathematical expectation. The density function of a multivariate Gaussian distribution with mean and covariance is given by .
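The matrix operations named in the notation above can be tried out numerically. The sketch below uses NumPy on small example matrices chosen purely for illustration; the helper names `vec` and `spectral_radius` are hypothetical and not part of the paper. It also checks the standard identity vec(AXB) = (Bᵀ ⊗ A) vec(X), which is commonly used when manipulating Kronecker products and column stacking.

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])
X = np.array([[5.0, 6.0], [7.0, 8.0]])

def vec(m):
    """Stack the columns of m into one vector (column-major order)."""
    return m.reshape(-1, order="F")

def spectral_radius(m):
    """Largest absolute eigenvalue of a square matrix."""
    return np.max(np.abs(np.linalg.eigvals(m)))

kron_ab = np.kron(A, B)                               # Kronecker product, shape (4, 4)
singular_values = np.linalg.svd(A, compute_uv=False)  # sorted in descending order
lhs = vec(A @ X @ B)                                  # vec(AXB)
rhs = np.kron(B.T, A) @ vec(X)                        # (B^T kron A) vec(X)

# For a Hermitian (here real symmetric) matrix, the spectral radius equals
# the largest singular value.
S = A + A.T
rho_s = spectral_radius(S)
sigma_max_s = np.linalg.svd(S, compute_uv=False)[0]
```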
II System Model
In this section, we describe our system model, and present some assumptions that we make throughout this paper. We consider a network of sensors , where each sensor wants to estimate a parameter . Sensor is set to be the reference node in the network, and is a global reference level for the other sensor parameters. For example, in a localization problem, we are often interested in finding the relative locations of nodes in the network with respect to (w.r.t.) a reference node. In the sensor network synchronization problem, we want to estimate the relative clock offset and skew of each sensor w.r.t. a reference sensor. In this paper, we use the terms “sensors” and “nodes” interchangeably. We say that sensors and are neighbors if they are able to communicate with each other. A network consisting of all sensors therefore corresponds to a graph , where the set of vertices , and the set of edges consists of communication links in the network. We let the set of neighbors of sensor be (which may or may not include sensor itself, depending on the application data model).
Sensors interact with neighbors and estimate their local parameters through in-network processing. In most applications, each sensor can obtain only local measurements w.r.t. its neighbors via communication between each other. Examples include the pseudo distance measurement between two sensors when performing wireless ranging, and the pairwise clock offset when performing clock synchronization. For each sensor , and neighbor , we consider the data model given by
where is the vector measurement obtained at sensor w.r.t. sensor at time , is a known system coefficient matrix, is an observed regression matrix at time , and is a measurement noise of dimensions . We note that this data model is a special case of the widely studied LMS data model (see e.g., [Sayed2013a]), since we can describe (1) by using a global parameter with appropriate stacking of the measurements . On the other hand, if every sensor is interested in the same global parameter , we have from (1) that , where is the regression matrix in [Sayed2013a]. (Footnote 1: The data model in [Sayed2013a] assumes that , but can be easily extended to .) Although we have assumed for convenience that all quantities have the same dimensions across sensors, our work can be easily generalized to the case where measurements and parameters have different dimensions at each sensor. We have the following assumptions similar to [Sayed2013a] on our data model.
Assumption 1.
(i) The regression matrices are independent over sensor indices and , and over time . They are also independent of the measurement noises for all , and . For all , is stationary over .
(ii) The measurement noises are stationary with zero mean, and , where is a positive semi-definite matrix.
(iii) For every , the matrix is positive definite.
In the LMS framework, each sensor seeks to estimate so that the overall network mean square error,
is minimized. Our goal is to design a distributed algorithm to perform local parameter estimation, which makes full use of the broadcast nature of the wireless medium over which sensors communicate. In particular, the messages broadcast by each sensor do not depend on the number of neighbors of the sensor or the network size. This is achieved by the gSPAWN algorithm. However, in order to ensure convergence of the gSPAWN algorithm, we need the following additional technical assumptions as well as Assumption LABEL:assumpt:model3, which will be discussed later in Section LABEL:subsect:Belief_means_converge. This makes our system model more restrictive than that considered in [Sayed2013a].
Assumption 2.
(i) The network graph is connected, and consists of strongly connected components. (Footnote 2: The notation denotes a graph with node and its incident edges removed.)
(ii) The reference sensor has an a priori estimate of its parameter , where is a zero mean Gaussian random variable with covariance matrix .
(iii) Let for all , , and . For all , we have if , and if .
Assumption 2(i) does not result in much loss of generality. If the graph is not connected, the same analysis can be repeated on each connected component of . Furthermore, in most practical applications, sensor measurements are symmetric, i.e., if sensor can obtain a measurement w.r.t. sensor , then sensor can also obtain a measurement w.r.t. sensor . Assumption 2(ii) also holds in our applications of interest, where typically, is taken to be a known reference value with . For example, when localizing sensors w.r.t. the reference node 0, there is no loss of generality in assuming that the reference node is at , the origin of the frame of reference. Assumption 2(iii), on the other hand, restricts our data model to those in which the system coefficient matrices and expected regression matrices do not differ by too much.
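The connectivity part of these assumptions can be checked on a given topology with a breadth-first search. The sketch below is a minimal illustration under the assumption that links are symmetric (so the graph is undirected and "strongly connected" reduces to "connected"); the function name `connected_components`, the adjacency-dictionary representation, and the 5-node example network are hypothetical and not taken from the paper.

```python
from collections import deque

def connected_components(adj, removed=frozenset()):
    """Return the connected components of an undirected graph as a list of sets.

    adj maps each node to the set of its neighbors; nodes in `removed`
    (and therefore their incident edges) are ignored.
    """
    seen, components = set(removed), []
    for start in adj:
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    component.add(nb)
                    queue.append(nb)
        components.append(component)
    return components

# Hypothetical 5-node network; node 0 plays the role of the reference sensor.
adj = {0: {1, 3}, 1: {0, 2}, 2: {1}, 3: {0, 4}, 4: {3}}
is_connected = len(connected_components(adj)) == 1
parts_without_ref = connected_components(adj, removed={0})
```

Here the full graph is connected, and removing the reference node splits the remaining sensors into two components, each of which can be analyzed separately.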
III The Gaussian SPAWN Algorithm
In this section, we briefly review the SPAWN algorithm, and derive the adaptive gSPAWN algorithm. Since belief propagation is a Bayesian estimation approach, we will temporarily view the parameters as random variables with uniform priors in this section. Suppose that we make the observations at a time . In the Bayesian framework, we are interested in finding values that maximize the a posteriori probability
where the notation denotes the conditional probability of given . (Footnote 3: Note the abuse of notation here: is used to denote both the random variable and its realization.) Since the distributions of and depend only on and , we have from (1) that
Let and . We construct a factor graph using as the variable nodes, and for each pair of neighboring sensors and with , we connect their corresponding variable nodes using the two factor nodes and . See Figure 1 for an example, where each random variable is represented by a circle and each factor node is represented by a square. We note that our factor graph construction is somewhat atypical compared to those used in traditional loopy belief propagation, where usually a single factor is used between the two variable nodes and . We separate it into two factors so that we can design an algorithm in which sensors broadcast the same message to all neighboring nodes, as described below.
To apply the sum-product algorithm on a loopy factor graph, we need to define a message passing schedule [Bishop2006]. We adopt a fully parallel schedule here. In each iteration, all factor nodes send messages to their neighboring variable nodes in parallel, followed by messages from variable nodes to neighboring factor nodes in parallel. We constrain the updates by allowing messages to flow in only one direction. Specifically, for every and for all , sends messages only to and not to , and sends messages only to and not to . The two types of messages sent between variable and factor nodes at the th iteration are the following:
Each factor node passes a message to the variable node representing ’s belief of ’s state. This message is given by
Each variable node broadcasts a belief to all neighboring factor nodes , and its belief is given by
The above scheme is called the SPAWN algorithm by [Wymeersch2009]. Note that the same belief message from the variable node is broadcast to all factor nodes , since node receives only from factor node and the belief does not include any messages from in the previous iterations. The communication cost therefore does not depend on the size or density of the network, making this scheme more suitable for a WSN. In the following, we derive the exact messages passed when , for all , are assumed to be Gaussian distributions. In addition, we allow the observations to be updated at each iteration so that the algorithm becomes adaptive in nature. We call this the gSPAWN algorithm. Although Gaussian distributions are assumed in the derivation of the gSPAWN algorithm, we show in Section LABEL:Section:convergence that this assumption is not required for mean and MSD convergence of the algorithm. (Footnote 4: This is analogous to the philosophy of using the best linear unbiased estimator (BLUE) for parameter estimation with non-Gaussian system models.)
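In sum-product algorithms, a variable node's belief is typically proportional to the product of its incoming messages. When those messages are Gaussian, the product is again Gaussian: the precisions (inverse covariances) add, and the mean is the precision-weighted combination of the message means. The Python sketch below illustrates only this generic Gaussian product rule; it is not the gSPAWN update derived in the paper, and the function name `fuse_gaussians` and the example messages are hypothetical.

```python
import numpy as np

def fuse_gaussians(means, covariances):
    """Mean and covariance of the normalized product of Gaussian densities."""
    precisions = [np.linalg.inv(c) for c in covariances]
    fused_cov = np.linalg.inv(sum(precisions))            # precisions add
    info = sum(p @ m for p, m in zip(precisions, means))  # precision-weighted means
    return fused_cov @ info, fused_cov

# Two hypothetical incoming messages about a 2-D parameter.
means = [np.array([0.0, 0.0]), np.array([2.0, 2.0])]
covs = [np.eye(2), np.eye(2)]
mean, cov = fuse_gaussians(means, covs)
```

With two equally confident messages, the fused mean is their midpoint and the fused covariance is halved. Because each Gaussian message is fully described by a mean vector and a covariance matrix, its size depends only on the parameter dimension, not on the number of neighbors.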