Distributed Quantization Networks
Abstract
Several key results in distributed source coding offer the intuition that little improvement in compression can be gained from inter-sensor communication when the information is coded in long blocks. However, when sensors are restricted to code their observations in small blocks (e.g., 1), intelligent collaboration between sensors can greatly reduce distortion. For networks where sensors are allowed to "chat" using a side channel that is unobservable at the fusion center, we provide an asymptotically exact characterization of distortion performance and optimal quantizer design in the high-resolution (low-distortion) regime using a framework called distributed functional scalar quantization (DFSQ). The key result is that chatting can dramatically improve performance even when inter-sensor communication is at very low rate, especially if the fusion center desires fidelity of a nonlinear computation applied to the source realizations rather than fidelity in representing the sources themselves. We also solve the rate allocation problem when communication links have heterogeneous costs, and we provide a detailed example to demonstrate the theoretical and practical gains from chatting. This example, for maximum computation, gives insight into the gap between chatting and distributed networks and into how to optimize the inter-sensor communication.
I Introduction
A longstanding consideration in distributed compression systems is whether sensors wishing to convey information to a fusion center should communicate with each other to improve efficiency. Architectures that only allow communication between individual sensors and the fusion center simplify the network's communication protocol and decrease sensor responsibilities. Moreover, information-theoretic results such as the Slepian–Wolf theorem show that distributed compression can perform as well as joint compression for lossless communication of correlated information sources [1]. Although this surprising and beautiful result does not extend fully, comparable results for lossy coding show that the rate loss from separate encoding can be small using Berger–Tung coding (see, e.g., [2]), again suggesting that communication between sensors has little or no utility.
Although it is tempting to use results from information theory to justify simple communication topologies, it is important to note that the Slepian–Wolf result depends on large blocklength; in the finite-blocklength regime, the optimality of distributed encoding does not hold [3]. This paper examines the use of communication among sensors when the compression blocklength is 1, a regime where collaboration, called chatting in this work, can greatly decrease the aggregate communication from sensors to the fusion center needed to meet a distortion criterion, as compared to a distributed network. We analyze chatting networks using the distributed functional scalar quantization (DFSQ) framework, which constrains sensors to use scalar quantizers to compress their observations and generalizes the fusion center's objective to fidelity in computing a function of the sources rather than in determining the sources themselves [4, 5]. Our problem model is shown in Fig. 1, where $N$ correlated but memoryless continuous-valued, discrete-time stochastic processes produce scalar realizations at each time instant. These realizations are scalar quantized by the $N$ sensors and transmitted to a fusion center. To aid this communication, sensors can collaborate with each other via a side channel that is unobservable to the fusion center. Since the quantization is scalar and the sources are memoryless, we remove the time index and model the sources as being drawn from a joint distribution at each time instant.
The side channel facilitating inter-sensor communication has practical implications. In typical communication systems, the transmission power needed for reliable communication increases superlinearly with distance and bandwidth [6]. Hence, it is much cheaper to design short, low-rate links between sensors than reliable, high-rate links to a fusion center. Moreover, milder transmission requirements provide more flexibility in determining the transmission media or communication modalities employed, which can allow inter-sensor communication to be orthogonal to the main network. One such example is cognitive radio, a paradigm where the wireless spectrum can have secondary users that communicate only when the primary users are silent [7]. This means secondary users have lower priority and hence lower reliability and rate, which is adequate for inter-sensor communication.
The main contributions of the paper are to precisely characterize the distortion performance of a distributed network when chatting is allowed and to identify the optimal quantizer design for each sensor. We show that collaboration can have significant impact on performance; in some cases, it can dramatically reduce distortion even when the chatting has extremely low rate. We also give necessary conditions on the chatting topology and protocol for successful decodability in the DFSQ framework, thus providing insight into the architecture design for chatting networks. Finally, we recognize that inter-sensor communication can occur on low-cost channels and solve the rate allocation problem in networks with heterogeneous links and different costs of transmission. The basic concepts of this work were introduced in [8]; this paper provides more complete and definitive coverage, including more results on rate allocation, a discussion on generalizing chatting messages, and details on the impact of various optimizations.
We begin by introducing related work, notation and prerequisite results in Section II. In Section III, we analyze the performance of chatting networks and discuss how to optimize the communication that occurs. We then determine the proper rate allocation for chatting networks in Section IV. Finally, we develop intuition for the behavior of chatting by considering a maximum computation network in Section V; this specific example demonstrates the incremental gains achieved by incorporating the different optimizations discussed in the paper.
II Preliminaries
II-A Previous Work
There is a large body of literature studying asymptotic performance of the distributed network in Fig. 1 without the chatting channel; a comprehensive review of these works and their connections to DFSQ appears in [4]. Similarly, connections to coding for computing (e.g., [9, 10]) are discussed there as well. Recent work on the finite-blocklength regime [11] has led to extensions in source coding [12, 13, 3]. In general, this analysis technique is meaningful for blocklengths as low as 100 but is unsuitable for the regimes traditionally considered in high-resolution theory.
We review results that relate to the chatting channel, focusing on Shannon-theoretic results. Kaspi and Berger provided inner bounds for the rate region of a two-encoder problem where one encoder can send information to the other using compress-and-forward techniques [14]. Recently, this bound has been generalized in [15], but the exact rate region is still unknown except in special cases. Chatting is related to source coding problems such as interaction [16, 17], omniscience [18], and data exchange [19]. However, these settings are more naturally suited for discrete-alphabet sources, and existing results rely on large-blocklength analysis.
II-B Quantization
The focus of this work is on compression of continuous-valued, finite-support sources using small blocks of data. Here, performance results from Shannon theory are overly optimistic since tools such as joint-typicality encoding and decoding are not reliable unless the system operates far from the distortion–rate bound. Instead, we consider the complementary asymptotic of high resolution, where the blocklength is small and the compression rate is large [22, 23, 24]. Before introducing the high-resolution asymptotic, we summarize the quantization model for the case of blocklength 1 and set up the notation used for the rest of the paper.
A scalar quantizer $Q_K$ is a mapping from the real line to a set of $K$ points $\{c_i\}_{i=1}^K$ called the codebook, where $Q_K(x) = c_i$ if $x \in P_i$ and the cells $\{P_i\}_{i=1}^K$ form a partition of $\mathbb{R}$. The quantizer is called regular if the partition cells are intervals containing the corresponding codewords. For simplicity, the codebook and partition are indexed from smallest to largest, implying $c_i < c_j$ if $i < j$, with $P_1$ and $P_K$ being the two unbounded cells. Define the granular region as $[c_1, c_K]$ and its complement as the overload region.
Uniform quantization, where partition cells in the granular region have equal length, is most common in practice, but nonuniform quantization can be better for compression if the source can be modeled properly. One way of constructing a nonuniform quantizer is via the compander model, where the scalar source is transformed using a nondecreasing and smooth compressor function $c$, then quantized using a uniform quantizer comprising $K$ levels on the granular region $[0,1]$, and finally passed through the expander function $c^{-1}$ (Fig. 2). Compressor functions are defined such that $\lim_{x \to -\infty} c(x) = 0$ and $\lim_{x \to \infty} c(x) = 1$. It is convenient to define a point density function as $\lambda(x) = c'(x)$. Because of the boundary conditions on $c$, there is a one-to-one correspondence between $c$ and $\lambda$; hence, a companding quantizer can be uniquely specified by a point density function and a codebook size, and is denoted $Q_{K,\lambda}$ in this work. The conversion of point density functions to finite-codeword quantizers is described in more detail in [5, Section II-B].
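The compand-quantize-expand pipeline above can be sketched in a few lines. This is only an illustration, not the paper's construction; the compressor $c(x) = x^2$ and its point density $\lambda(x) = 2x$ are arbitrary choices for a source supported on $[0,1]$:

```python
import numpy as np

def compand_quantize(x, c, c_inv, K):
    """Companding scalar quantizer: compress the source into [0, 1],
    quantize uniformly with K levels, then expand the cell midpoint
    back through the expander c_inv."""
    u = np.clip(c(np.asarray(x, float)), 0.0, 1.0)
    i = np.minimum((u * K).astype(int), K - 1)  # uniform cell index, 0..K-1
    return c_inv((i + 0.5) / K), i

# Illustrative compressor c(x) = x^2 on [0, 1]; its point density
# lambda(x) = c'(x) = 2x concentrates codewords near x = 1.
xhat, idx = compand_quantize(0.9, lambda x: x**2, np.sqrt, K=8)
```

Here the reconstruction is the expanded midpoint of the uniform cell, which is how a point density translates into actual codeword locations.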
II-C High-Resolution Theory
It is generally difficult to determine the distortion of a scalar quantizer exactly for an arbitrary codebook size $K$. However, the performance of $Q_{K,\lambda}$ can be precisely analyzed as the number of codewords becomes large, which is the basis of high-resolution theory. Assume the source $X$ is a continuous random variable, and define the mean squared error (MSE) distortion as
$d_{\mathrm{mse}}(K,\lambda) = \mathrm{E}\big[ (X - Q_{K,\lambda}(X))^2 \big]$,   (1)
where the expectation is with respect to the source density $f_X$. Under the additional assumption that the tails of $f_X$ decay sufficiently fast,
$d_{\mathrm{mse}}(K,\lambda) \simeq \dfrac{1}{12 K^2}\, \mathrm{E}\big[ \lambda^{-2}(X) \big]$,   (2)
where $\simeq$ indicates that the ratio of the two expressions approaches 1 as $K$ increases [25, 26]. Hence, the MSE performance of a scalar quantizer can be approximated by a simple relationship among the source distribution, point density, and codebook size, and this relation becomes more precise with increasing $K$. In fact, companding quantizers are asymptotically optimal, meaning that the quantizer optimized over $\lambda$ has distortion approaching the performance of the best quantizer found by any means [27, 28, 29]. Experimentally, the high-resolution approximation is accurate even for moderate $K$ [23, 30].
When the quantized values are to be communicated or stored, it is natural to map each codeword to a string of bits and consider the tradeoff between performance and communication rate $R$, defined as the expected number of bits per sample. In the simplest case, the codewords are indexed with equal-length labels and the communication rate is $R = \log_2 K$; this is called fixed-rate or codebook-constrained quantization. Since the distortion's dependence on the shape of the quantizer is explicit in the asymptote, calculus techniques can be used to optimize companders. For fixed rate, Hölder's inequality shows the optimal point density satisfies
$\lambda^*(x) \propto f_X^{1/3}(x)$,   (3)
and the resulting distortion is
$d_{\mathrm{mse}}^*(K) \simeq \dfrac{1}{12 K^2}\, \| f_X \|_{1/3}$,   (4)
with the notation $\|f\|_{1/3} = \big( \int f^{1/3}(x)\, dx \big)^3$ [31]. The limit conditions on $c$ imply the integral of $\lambda$ is unity. Thus, (3) specifies the point density uniquely; for clarity, we omit the normalization when presenting point density results.
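The constant in (4) is easy to evaluate numerically. A minimal sketch (function names are mine), using a uniform density on $[0,1]$ as a sanity check, for which the one-third-power norm is exactly 1:

```python
import numpy as np

def third_power_norm(f, lo, hi, n=100_000):
    """||f||_{1/3} = (integral of f(x)^(1/3) dx)^3 via the midpoint rule."""
    x = lo + (np.arange(n) + 0.5) * (hi - lo) / n
    return float((np.mean(f(x) ** (1.0 / 3.0)) * (hi - lo)) ** 3)

def hi_res_mse(f, lo, hi, K):
    """Optimal fixed-rate high-resolution MSE from (4): ||f||_{1/3} / (12 K^2)."""
    return third_power_norm(f, lo, hi) / (12.0 * K**2)

# Uniform density on [0, 1]: ||f||_{1/3} = 1, so D is about 1/(12 K^2),
# recovering the familiar uniform-quantizer distortion.
f_unif = lambda x: np.ones_like(x)
D = hi_res_mse(f_unif, 0.0, 1.0, K=16)
```

For nonuniform densities the same routine quantifies how much companding gains over uniform quantization at a given codebook size.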
In general, the codeword indices can be coded to produce bit strings of different lengths based on probabilities of occurrence; this is referred to as variable-rate quantization. If the decoding latency is allowed to be large, one can employ block entropy coding, and the communication rate approaches the entropy of the quantizer output. This particular scenario, called entropy-constrained quantization, can be analyzed using Jensen's inequality to show that the optimal point density is constant on the support of the input distribution [31]. The optimal quantizer is thus uniform, and the resulting distortion is
$d_{\mathrm{mse}}^*(R) \simeq \dfrac{1}{12}\, 2^{2 h(X)}\, 2^{-2R}$,   (5)

where $h(X)$ is the differential entropy of the source.
Note that block entropy coding suggests that the sources are transmitted in blocks even though the quantization is scalar. As such, (5) is an asymptotic result and serves as a lower bound on practical entropy coders with finite blocklengths that match the latency restrictions of a network.
II-D Distributed Functional Scalar Quantization
When the goal of acquisition is to approximate a computation applied to the sources, optimizing the compression to the source distribution can be suboptimal, and potentially worse than uniform quantization. This is most evident in distributed networks, since each sensor cannot determine the overall computation at its encoder. The distributed functional scalar quantization (DFSQ) framework accounts for the computational task at the fusion center, and the resulting quantizers can be substantially better than naive designs [4, 5]. In this setting, the distortion criterion is functional MSE (fMSE):
$d_{\mathrm{fmse}} = \mathrm{E}\Big[ \big( g(X_1^N) - \hat{g}(Q_{K,\lambda}(X_1^N)) \big)^2 \Big]$,   (6)
where $g$ is a scalar function of interest, $\hat{g}$ is the decoding function, and $Q_{K,\lambda}$ is scalar quantization performed componentwise on a vector, such that $Q_{K,\lambda}(x_1^N) = \big( Q_{K_1,\lambda_1}(x_1), \ldots, Q_{K_N,\lambda_N}(x_N) \big)$.
Before understanding how a quantizer changes fMSE, it is convenient to define how a computation locally affects distortion.
Definition 1 ([4]).
The $j$th functional sensitivity profile of a multivariate function $g$ is defined as
$\gamma_j(x) = \Big( \mathrm{E}\big[ |g_j(X_1^N)|^2 \,\big|\, X_j = x \big] \Big)^{1/2}$,   (7)
where $g_j$ is the partial derivative of $g$ with respect to its $j$th argument, evaluated at the point $X_1^N$.
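For intuition, the sensitivity profile can be estimated by Monte Carlo directly from Definition 1. A sketch (helper name and parameters are mine) for the maximum of $N$ i.i.d. Uniform(0,1) sources, the computation studied in Section V: there, the partial derivative $g_j$ equals 1 exactly when $X_j$ is the largest source, so $\gamma_j(x) = x^{(N-1)/2}$:

```python
import numpy as np

def sensitivity_max(x, N, trials=200_000, seed=0):
    """Monte Carlo estimate of gamma_j(x) in (7) for g = max of N i.i.d.
    Uniform(0,1) sources.  g_j = 1 iff X_j is the maximum, so
    gamma_j(x)^2 = P(all other N-1 sources <= x) = x^(N-1)."""
    rng = np.random.default_rng(seed)
    others = rng.uniform(size=(trials, N - 1))
    gj_sq = (others.max(axis=1) <= x).astype(float)  # |g_j|^2 given X_j = x
    return float(np.sqrt(gj_sq.mean()))

gamma = sensitivity_max(0.8, N=3)  # analytic value: 0.8^((3-1)/2) = 0.8
```

The same estimator applies to any differentiable computation for which samples of the sources are available, which is useful when no closed form for (7) exists.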
Given the sensitivity profile, the main result of DFSQ [4] says that the distortion of a set of companding quantizers $\{Q_{K_j,\lambda_j}\}_{j=1}^N$ has the asymptotic form
$d_{\mathrm{fmse}} \simeq \displaystyle\sum_{j=1}^{N} \dfrac{1}{12 K_j^2}\, \mathrm{E}\!\left[ \dfrac{\gamma_j^2(X_j)}{\lambda_j^2(X_j)} \right]$,   (8)
with conditional expectation decoder
$\hat{g}(q_1^N) = \mathrm{E}\big[ g(X_1^N) \,\big|\, Q_{K,\lambda}(X_1^N) = q_1^N \big]$,   (9)
provided the following conditions are satisfied:

MF1: The function $g$ is Lipschitz continuous and twice differentiable in every argument, except possibly on a set of Jordan measure 0.

MF2: The source pdf is continuous, bounded, and supported on a finite interval in each variable.

MF3: The function $g$ and the set of point densities allow each term of (8) to be defined and finite for all $j$.
Similar conditions are given in [5] for infinite-support distributions and a simpler decoder.
Following the same recipes to optimize over $\lambda_j$ as in the MSE setting, the relationship between distortion and communication rate is found. In both cases, the sensitivity acts to shift quantization points to where they can reduce the distortion in the computation. For fixed-rate quantization, the asymptotic minimum distortion is
$d_{\mathrm{fmse}}^{\mathrm{fr}} \simeq \displaystyle\sum_{j=1}^{N} \dfrac{1}{12 K_j^2}\, \big\| \gamma_j^2 f_{X_j} \big\|_{1/3}$,   (10)
where $f_{X_j}$ is the marginal distribution of $X_j$ and each optimal point density satisfies
$\lambda_j^*(x) \propto \big( \gamma_j^2(x)\, f_{X_j}(x) \big)^{1/3}$.   (11)
Meanwhile, for entropy-constrained quantization, the asymptotic minimum distortion is
$d_{\mathrm{fmse}}^{\mathrm{ec}} \simeq \displaystyle\sum_{j=1}^{N} \dfrac{1}{12}\, 2^{2\left( h(X_j) + \mathrm{E}\left[ \log_2 \gamma_j(X_j) \right] \right)}\, 2^{-2 R_j}$,   (12)
which results from point densities satisfying
$\lambda_j^*(x) \propto \gamma_j(x)$.   (13)
II-E Don't-Care Intervals
When the computation induces a sensitivity of 0 on some subintervals of the support, the high-resolution assumptions are violated and the asymptotic distortion performance may not be described by (8). This issue is addressed by carefully coding the event that the source lies in such a "don't-care" interval [4, Section VII] and then applying traditional high-resolution theory to the remaining support. This consideration is particularly relevant here because chatting among sensors can often force the conditional sensitivity to be 0, and proper coding can lead to greatly improved performance.
Consider a finite number of don't-care intervals in the support and let $A$ be the event that the source realization is not in their union. In the fixed-rate setting, one codeword is allocated to each don't-care interval, and the remaining codewords are used to form reconstruction points in the intervals of nonzero sensitivity. There is a small degradation in performance from the loss of these codewords, but it quickly becomes negligible as $K$ increases. In the entropy-constrained case, the additional flexibility in coding allows the encoder to split its message and reduce cost. The first part is an indicator variable revealing whether the source is in a don't-care interval; it can be coded at rate $H_b(P(A))$, where $H_b$ is the binary entropy function. The actual reconstruction message is sent only if event $A$ occurs, and its rate is amplified to approximately $R / P(A)$ to meet the average rate constraint. The multiplicative factor $1/P(A)$ is called the rate amplification.
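The message-splitting arithmetic can be made concrete. In this sketch (function names are mine), the indicator costs $H_b(p)$ bits with $p = P(A)$, and the reconstruction rate is chosen so the average exactly meets the budget $R$; at high rates the amplification approaches the $1/P(A)$ factor quoted above:

```python
import numpy as np

def binary_entropy(p):
    """H_b(p) = -p log2 p - (1-p) log2 (1-p), clipped away from 0 and 1."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return float(-p * np.log2(p) - (1 - p) * np.log2(1 - p))

def split_message_rates(R, p):
    """Split an average rate budget R between an indicator for event A
    (source outside all don't-care intervals, P(A) = p) and an amplified
    reconstruction message sent only when A occurs:
        H_b(p) + p * R_recon = R."""
    R_flag = binary_entropy(p)
    R_recon = (R - R_flag) / p  # amplified reconstruction rate
    return R_flag, R_recon

# With R = 4 bits and p = 1/2: a 1-bit flag plus 6 bits when A occurs.
R_flag, R_recon = split_message_rates(R=4.0, p=0.5)
```

The example makes the tradeoff visible: halving $P(A)$ roughly doubles the reconstruction rate available when the source actually matters.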
II-F Chatting
In [4, Section VIII], chatting was introduced in the setting where one sensor sends exactly one bit to another sensor. Under fixed-rate quantization, this collaboration can decrease the distortion by at most a factor of 4, by a property of quasinorms. Because utilizing that bit to send additional information to the fusion center instead would decrease distortion by exactly a factor of 4, this is considered a negative result. Here, there is an implicit assumption that links have equal cost per bit and that the network optimizes a total cost budget. In the entropy-constrained setting, chatting may be useful even when links have equal costs. An example was given to demonstrate that a single bit of chatting can decrease the distortion by an unbounded amount; more generally, the benefit of chatting varies with the source joint distribution and the decoder computation.
In previous work, there was no systematic theory on the performance and quantizer design of chatting networks, and collaboration in larger networks remained an open problem. In this paper, we extend previous results and provide a more complete discussion of how a chatting channel affects a distributed quantization network. A sample result is that chatting can be beneficial in the fixed-rate setting if the cost of communicating a bit to another sensor is lower than the cost of communicating a bit to the fusion center.
III Performance and Design of Chatting Networks
We model the chatting channel in Fig. 1 as a directed graph $G = (V, E)$, where the set of nodes $V$ is the set of all sensors and $E$ is the set of noiseless, directed chatting links. If $(i, j) \in E$, then for each source realization, Sensor $i$ sends Sensor $j$ a chatting message from a codebook of size $K^c_{i \to j}$. The parent and children sets of Sensor $j$ are denoted $\mathcal{P}(j)$ and $\mathcal{C}(j)$, respectively; when $(i, j) \in E$, $i$ is a parent of $j$ and $j$ is a child of $i$. The set of all chatting messages is denoted $M$ and the set of corresponding codebook sizes $K^c$. The chatting messages are communicated according to a schedule that the sensors and the fusion center know in advance; the set of chatting messages can therefore also be thought of as a sequence. We assume chatting occurs quickly, in that all communication is completed before the next discrete time instant (at which point new source realizations are measured). After chatting is complete, each sensor compresses its observation using a codebook dependent on the information gathered from chatting messages, and the result is noiselessly communicated to the fusion center with a message of codebook size $K_j$.
We now present the fMSE performance of chatting networks in the fixed-rate and entropy-constrained settings, and we show how to optimize the quantizers given the chatting graph and messages. We first analyze the network assuming the fusion center can successfully infer the codebook used by each sensor and hence recover the quantized values from the received messages. Later, in Section III-D, we provide conditions on the chatting graph and set of chatting messages such that the fusion center is successful with zero error, having benefited from already understanding the quantizer design.
Before studying fMSE, we need to extend the definition of functional sensitivity.
Definition 2.
Let $\mathcal{P}(j)$ be the set of parents of Sensor $j$ in the graph induced by chatting. The $j$th conditional sensitivity profile of the computation $g$ given all chatting messages $M = m$ is
$\gamma_j(x \mid m) = \Big( \mathrm{E}\big[ |g_j(X_1^N)|^2 \,\big|\, X_j = x,\; M = m \big] \Big)^{1/2}$.   (14)
Notice that only messages from parent sensors are relevant to $\gamma_j(\cdot \mid m)$. Intuitively, chatting messages reveal information about the parent sensors' quantized values and reshape the sensitivity appropriately. Depending on the encoding of chatting messages, this may induce don't-care intervals in the conditional sensitivity (where $\gamma_j(x \mid m) = 0$).
The distortion dependence on the number of codeword points and the conditional sensitivity profiles is given in the following theorem:
Theorem 1.
Given the source distribution, computation $g$, and point densities satisfying conditions MF1–3 for every possible realization of the chatting messages $M$, the asymptotic distortion of the conditional expectation decoder (9), given the codeword allocations $K_j(m)$ and chatting codebook sizes $K^c$, is
$d_{\mathrm{fmse}} \simeq \displaystyle\sum_{j=1}^{N} \mathrm{E}_M\!\left[ \dfrac{1}{12\, K_j^2(M)}\, \mathrm{E}\!\left[ \dfrac{\gamma_j^2(X_j \mid M)}{\lambda_{j,M}^2(X_j)} \,\middle|\, M \right] \right]$.   (15)
Proof:
Extend the proof of [4, Theorem 17] using the Law of Total Expectation. ∎
Compared to the DFSQ result, the performance of a chatting network can be substantially more difficult to compute, since the conditional sensitivity may differ with each realization of $M$ and affects the choice of the point density and codebook size. However, Sensor $j$'s dependence on $M$ is only through the subset of messages from its parent nodes. In Section V, we will see how structured architectures lead to tractable computations of fMSE. Following the techniques in [5], the theorem can be extended to account for infinite-support distributions and a simpler decoder. Some effort is necessary to justify the use of normalized point densities in the infinite-support case, especially in the entropy-constrained setting, but high-resolution theory applies there as well.
III-A Don't-Care Intervals
We have already alluded to the fact that chatting can induce don't-care intervals in the conditional sensitivity profiles of certain sensors. In this case, we must properly code for these intervals to ensure the high-resolution assumptions hold, as discussed in Section II-E.
For fixed-rate coding, where the rate of Sensor $j$ is $\log_2 K_j(m)$, this means shifting one codeword to the interior of each don't-care interval and applying standard high-resolution analysis over the union of all intervals where the conditional sensitivity is nonzero. The resulting distortion of a chatting network is then given as:
Corollary 1.
Assume the source distribution, computation $g$, and point densities satisfy conditions MF1–3 for every possible realization of $M$, with the additional requirement that the point density vanish wherever the conditional sensitivity is 0. Let $b_j(m)$ be the number of don't-care intervals in the conditional sensitivity of Sensor $j$ when $M = m$. The asymptotic distortion of such a chatting network whose communication links utilize fixed-rate coding is
$d_{\mathrm{fmse}} \simeq \displaystyle\sum_{j=1}^{N} \mathrm{E}_M\!\left[ \dfrac{1}{12\, \big( K_j(M) - b_j(M) \big)^2}\, \mathrm{E}\!\left[ \dfrac{\gamma_j^2(X_j \mid M)}{\lambda_{j,M}^2(X_j)} \,\middle|\, M \right] \right]$.   (16)
In the entropy-constrained setting, we must first code the event that the source is not in a don't-care interval given the chatting messages, and then code the source realization only if that event occurs. The resulting distortion of a chatting network is:
Corollary 2.
Assume the source distribution, computation $g$, and point densities satisfy conditions MF1–3 for every possible realization of $M$, with the additional requirement that the point density vanish wherever the conditional sensitivity is 0. Let $A_j$ be the event that $X_j$ is not in a don't-care interval given the chatting messages. The asymptotic distortion of such a chatting network whose communication links utilize entropy coding follows by applying the rate amplification of Section II-E to each term of (15).
We will use both corollaries in optimizing the design of chatting networks in the remainder of the paper.
III-B Fixed-Rate Quantization Design
We mirror the method used to determine (11) in the DFSQ setup, but now allow each sensor to choose from a set of codebooks depending on the incoming messages from parent sensors. The mapping between chatting messages and codebooks is known to the decoder at the fusion center, and each codebook corresponds to the optimal quantizer for the conditional sensitivity induced by the incoming messages. Let $U_j(m)$ be the union of the don't-care intervals of a particular conditional sensitivity. Then, using Corollary 1, the optimal point density for fixed-rate quantization satisfies
$\lambda_{j,m}^*(x) \propto \big( \gamma_j^2(x \mid m)\, f_{X_j}(x) \big)^{1/3}$ for $x \notin U_j(m)$.   (17)
Recall that the point density is the derivative of the compressor function in the compander model. Hence, codewords are placed at the solutions of $c(x) = (i - 1/2)/K$ for $i = 1, \ldots, K$. In addition, one codeword must be placed in each don't-care interval.
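Concretely, codeword placement amounts to inverting the compressor at the uniform midpoints. A minimal sketch with the illustrative compressor $c(x) = x^2$ (so the expander is the square root; these choices are mine, not the paper's):

```python
import numpy as np

def codewords_from_compressor(c_inv, K):
    """Place K codewords at the solutions of c(x) = (i - 1/2)/K, i = 1..K,
    i.e. expand the uniform midpoints through the expander c_inv."""
    i = np.arange(1, K + 1)
    return c_inv((i - 0.5) / K)

cw = codewords_from_compressor(np.sqrt, K=4)
# Codewords at sqrt(1/8), sqrt(3/8), sqrt(5/8), sqrt(7/8): they cluster
# toward 1, matching the point density lambda(x) = c'(x) = 2x.
```

Codebook switching with chatting then reduces to recomputing `c_inv` from the conditional point density (17) for each incoming message.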
III-C Entropy-Constrained Quantization Design
Using Corollary 2, the optimal point density when entropy coding is combined with scalar quantization has the form
$\lambda_{j,m}^*(x) \propto \gamma_j(x \mid m)$ for $x \notin U_j(m)$.   (18)
Note that rate amplification can arise through chatting, and this can allow distortion terms to decay faster than $2^{-2R_j}$. However, there is also a penalty from proper coding of don't-care intervals, corresponding to the rate of the indicator message. This loss is negligible in the high-resolution regime but may become important at moderate rates.
III-D Conditions on the Chatting Graph
We have observed that chatting can influence the optimal design of scalar quantizers through the conditional sensitivity, and that sensors will vary their quantization codebooks depending on the incoming messages from parent sensors. Under the assumption that the fusion center does not have access to the chatting messages, the success of compression is contingent on the fusion center identifying the codebook employed by every sensor from the messages it receives.
Definition 3.
A chatting network is codebook identifiable if the fusion center can determine the codebook used by every sensor from the messages it receives. That is, it can determine each sensor's codebook choice at each time instant.
We have argued that a chatting network can successfully communicate its compressed observations if it is codebook identifiable. The following are sufficient conditions on the chatting graph and messages such that the network is codebook identifiable:

1) The chatting graph is a directed acyclic graph.

2) The causality of the chatting schedule matches the graph: each sensor sends its chatting message only after receiving the messages from all of its parent sensors.

3) The quantizer at each sensor is a function of the source joint distribution and all incoming messages from its parent sensors.

4) At any discrete time, the message transmitted by a sensor is a function of its own quantized observation and the incoming messages from its parent sensors.
When each sensor's quantizer is regular and the encoder operates only on the quantized values, matching the DFSQ setup, the chatting messages can only influence the choice of codebook. In this setting, the above conditions become necessary as well. Alternatively, if sensors can locally fuse messages from parents with their own observations, other conditions for a network to be codebook identifiable may exist.
IV Rate Allocation in Chatting Networks
A consequence of chatting is that certain sensors can exploit their neighbors' acquisitions to refine their own. Moreover, a sensor can potentially utilize this side information to adjust its communication rate, in addition to changing its quantization, if the network is codebook identifiable. These features of chatting networks suggest intelligent rate allocation across sensors can yield significant performance gains. In addition, a strong motivation for inter-sensor interaction is that sensors may be geographically closer to each other than to a fusion center and hence require less transmit power, or can utilize low-bandwidth orthogonal channels that do not interfere with the main communication network. As a result, the cost of communicating a bit may vary across a network.
This section explores proper rate allocation to minimize the total cost of transmission in a chatting network, allowing asymmetry in the information content at each sensor and heterogeneity in the communication links. Consider the distributed network in Fig. 1. The cost per bit and the resource allocation of the communication link between Sensor $j$ and the fusion center are denoted $\rho_j$ and $B_j$, respectively, leading to a communication rate of $R_j = B_j / \rho_j$ from Sensor $j$ to the fusion center. Similarly, for a chatting link between Sensors $i$ and $j$, the cost per bit and resource allocation are denoted $\rho^c_{i \to j}$ and $B^c_{i \to j}$, respectively, corresponding to a chatting rate of $R^c_{i \to j} = B^c_{i \to j} / \rho^c_{i \to j}$. Consistent with previous notation, we denote the sets of costs per chatting bit, resource allocations on chatting links, and chatting rates by $\rho^c$, $B^c$, and $R^c$.
Given a total resource budget $B$, how should rates be allocated among these links? For simplicity, assume all chatting links employ fixed-rate quantization, which implies $R^c_{i \to j} = \log_2 K^c_{i \to j}$ for every $(i, j) \in E$. The distortion–cost tradeoff is then expressed as
$d(B) = \min\; d_{\mathrm{fmse}}$ subject to $\displaystyle\sum_{j} B_j + \sum_{(i,j) \in E} B^c_{i \to j} \le B$.   (19)
In general, this optimization is extremely difficult to describe analytically, since the distortion contribution of each sensor depends in a nontrivial way on the conditional sensitivity, which in turn depends on the design of the chatting messages. However, the relationship between the fusion-center rates and the overall system distortion is much simpler, as described in Theorem 1. Hence, once the chatting allocations $B^c$ are fixed, the optimal allocations $B_j$ are easily determined using extensions of traditional rate allocation techniques, described in Appendix A. In particular, the optimal allocation can be found by applying Lemmas 3 and 4 with a total cost constraint
$\displaystyle\sum_{j} B_j \le B - \sum_{(i,j) \in E} B^c_{i \to j}$.   (20)
A brute-force search over the chatting allocations $B^c$ then provides the best overall allocation, but this procedure is computationally expensive. More realistically, network constraints may limit the maximum chatting rate, which greatly reduces the search space.
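The inner step of this search, allocating the remaining budget across fusion-center links, admits a closed form for a distortion model of the generic high-resolution form $\sum_j a_j 2^{-2 R_j}$ with per-bit costs $\rho_j$. The sketch below is only illustrative: the constants $a_j$ and the exact machinery of Lemmas 3 and 4 are paper-specific, and nonnegativity of rates is ignored, as is common in high-resolution analyses:

```python
import numpy as np

def allocate_rates(a, rho, B):
    """Minimize sum_j a_j * 2^(-2 R_j) subject to sum_j rho_j * R_j = B.
    Setting derivatives proportional to costs gives
    R_j = 0.5*log2(a_j/rho_j) - shift, with the common shift fixed by
    the budget constraint."""
    a, rho = np.asarray(a, float), np.asarray(rho, float)
    u = 0.5 * np.log2(a / rho)                      # unshifted stationary rates
    shift = (np.sum(rho * u) - B) / np.sum(rho)     # enforce the budget exactly
    return u - shift

# Two links: the first has 4x the distortion constant, the second
# costs twice as much per bit; total budget B = 6.
R = allocate_rates([4.0, 1.0], [1.0, 2.0], 6.0)
```

At the optimum each link's marginal distortion reduction per unit cost is equal, the usual water-filling-style condition.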
In Fig. 3, we show optimal communication rates for the network described in Section V. We delay description of the specific network properties and aim only to illustrate how the cost allocations may change with the sensors or the chatting messages. Under fixed-rate coding, the allocation varies depending on the chatting graph. In the entropy-constrained setting, the allocation can also vary with the chatting messages, except at Sensor 1. This increased flexibility allows a wider range of rates, as well as improved performance in many situations.
V Maximum Computation
The results in the previous sections hold generally, and we now build some intuition about chatting using a specific distributed network performing a maximum computation. The choice of this computation is not arbitrary; we will show that it allows for a particular chatting architecture that makes it convenient to study large networks. Moreover, this network reveals some surprising insights into the behavior of chatting. While this paper restricts its attention solely to the maximum computation, more examples are discussed in [8].
V-A Problem Model
We consider a network where the fusion center aims to reproduce the maximum of $N$ sources, each independent and uniformly distributed on $[0, 1]$. The sensors measuring these sources are allowed to chat along a serial chain, meaning each sensor has at most one parent and one child (see Fig. 4). Initially, we consider the simplest such network, with the following assumptions:

1) The chatting is serial, meaning the chatting messages are sent in order along the chain from Sensor 1 to Sensor $N$.

2) Each chatting link is identical, with rate $R^c$, codebook size $K^c$, and cost per bit $\rho^c$.

3) The communication links between the sensors and the fusion center are allowed to have different rates. For simplicity, we assume their costs per bit to be homogeneous and normalized to 1.

4) The outgoing chatting message at Sensor 1 is the index of a uniformly quantized version of its observation with $K^c$ levels.

5) For $n > 1$, the chatting message from Sensor $n$ is the maximum of the index of Sensor $n$'s own uniformly quantized observation and the incoming chatting message from its parent.
Under this architecture, the chatting messages effectively correspond to a uniformly quantized observation of the maximum over all ancestor nodes:
$m_{n \to n+1} = \big\lceil K^c \max(X_1, \ldots, X_n) \big\rceil$,   (21)
where $m_{n \to n+1}$ is the index of the quantization codeword and can take values in $\{1, \ldots, K^c\}$. The simplicity of the chatting message here arises from the permutation invariance of the maximum function. We will exploit this structure to provide precise characterizations of system performance.
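The serial max-forwarding protocol is easy to simulate. A minimal sketch (function name and values are mine) confirming that each outgoing message equals the uniform-quantization index of the running maximum, because taking a maximum of indices commutes with the monotone quantizer:

```python
import numpy as np

def serial_chat_indices(x, Kc):
    """Serial chatting for the max network: each sensor forwards the larger
    of its own uniform-quantization index and its parent's message, so the
    n-th outgoing message is the index of max(x_1, ..., x_n)."""
    x = np.asarray(x, float)
    idx = np.minimum(np.ceil(x * Kc).astype(int), Kc)  # indices 1..Kc
    idx = np.maximum(idx, 1)                           # guard x = 0 exactly
    return np.maximum.accumulate(idx)

# Three sensors, Kc = 4 chatting levels: observations 0.31, 0.72, 0.55
# produce indices 2, 3, 3 along the chain.
m = serial_chat_indices([0.31, 0.72, 0.55], Kc=4)
```

Because the quantizer index is nondecreasing in its argument, accumulating maxima over indices and quantizing the running maximum give identical messages, which is exactly the structure exploited in the analysis below.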
V-B Quantizer Design
Using (7), we find the max function has sensitivity $\gamma(x) = x^{(N-1)/2}$ for every sensor. Without chatting, each sensor's quantizer would be the same, with a point density that is a function of the source distribution and sensitivity. Moreover, since the cost per bit of transmitting to the fusion center is the same for all sensors, the solution of the resource allocation problem assigns equal weight to each link. Hence, minimizing (10) yields the optimal fixed-rate distortion–cost tradeoff:
$d_{\mathrm{fr}}^*(B) \simeq \dfrac{N}{12} \left( \dfrac{3}{N+2} \right)^{3} 2^{-2B/N}$.   (22)
Similarly, the minimum of (12) leads to the optimal entropyconstrained distortion–cost tradeoff
$d_{\mathrm{ec}}^*(B) \simeq \dfrac{N}{12}\, e^{-(N-1)}\, 2^{-2B/N}$.   (23)
These high-resolution expressions provide scaling laws relating the distortion to the number of sensors. For them to hold, the total cost must increase linearly with $N$.
With chatting, we first need to determine the conditional sensitivity, which is given below for uniform sources:
Lemma 1.
Given the serial chatting architecture, the conditional sensitivity profile of Sensor $n$ corresponding to a received chatting message $m$ is
(24) 
Proof:
See Appendix B. ∎
We have already noted that the incident chatting message of Sensor $n$ is a uniformly quantized observation of the maximum of its ancestors' sources. Hence,
(25) 
Below, we give distortion asymptotics for the serial chatting network under both fixedrate and entropyconstrained quantization.
V-B1 Fixed-rate case
From Theorem 1, the asymptotic total fMSE distortion is
(26) 
Because Sensor 1 has no incoming chatting messages, its sensitivity is the unconditional sensitivity and the resulting distortion constant is
For the other sensors, the distortion contribution is
For each sensor after the first, every incoming message besides the lowest induces a don't-care interval, so one of the codewords is placed exactly at the interval's boundary.
We study the tradeoff between chatting rate and fMSE for several choices of the network parameters, using optimal cost allocation as determined by Lemma 3. In Fig. 5a, we observe that increasing the chatting rate yields improvements in fMSE, and this improvement becomes more pronounced as the number of sensors increases. However, this is contingent on the chatting cost being low. As discussed in Section II-D, chatting can lead to worse system performance if the cost of chatting is on the same order as the cost of communication given a total resource budget, as demonstrated by Fig. 5c. Although the main results of this work are asymptotic, we have asserted that the distortion equations are reasonable at finite rates. To demonstrate this, we design real quantizers under the same cost constraint and show that the resulting performance is comparable to the high-resolution approximations of Theorem 1. This is observed in Figs. 5a and 5c, which show that the asymptotic prediction of the distortion–rate tradeoff is accurate even at 4 bits/sample.
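The accuracy of the asymptotic expressions at moderate rates can also be spot-checked in a toy setting. The sketch below uses plain uniform quantizers and iid uniform sources, which are simplifying assumptions rather than the optimized designs above, and verifies the 2^(-2R) distortion decay behind the distortion–rate tradeoff:

```python
import random

def mc_fmse_max(N, R, trials=200_000, seed=0):
    """Monte Carlo fMSE of the max of N iid U[0,1] sources when each
    is quantized uniformly at R bits with midpoint reconstruction."""
    rng = random.Random(seed)
    K = 2 ** R
    err = 0.0
    for _ in range(trials):
        xs = [rng.random() for _ in range(N)]
        x_hat = [(min(int(x * K), K - 1) + 0.5) / K for x in xs]
        err += (max(xs) - max(x_hat)) ** 2
    return err / trials

d4, d6 = mc_fmse_max(3, 4), mc_fmse_max(3, 6)
print(round(d4 / d6))  # adding 2 bits per sensor cuts fMSE by about 2^4 = 16
```

Even at 4 bits/sample the empirical ratio is close to the high-resolution prediction, consistent with the finite-rate behavior reported in Fig. 5.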
V-B2 Entropy-constrained case
Generally, the total distortion in the entropyconstrained case is
(27) 
noting that each sensor is allowed to vary its communication rate with the chatting messages it receives. As in the fixed-rate setting, an incoming message will induce a don't-care interval in the conditional sensitivity. Defining the event that the observation does not fall in a don't-care interval for the received message, we have
(28) 
and .
As in the fixed-rate setting, we study the relationship between the chatting rate and fMSE, this time using the probabilistic allocation optimization of Lemma 4 in Appendix A. Due to the extra flexibility of allowing a sensor to vary its communication to the fusion center with the chatting messages it receives, we observe that increasing the chatting rate can improve performance more dramatically than in the fixed-rate case (see Fig. 5b). Surprisingly, chatting can also lead to inferior performance for some parameter combinations, even when the chatting cost is small. This phenomenon is discussed in greater detail below. In Fig. 5d, we compare different parameter choices to see how performance changes with the chatting rate. Unlike in the fixed-rate setting, in the entropy-constrained setting chatting can be useful even when its cost is close to the cost of communication to the fusion center.
V-C Generalizing the Chatting Messages
We have considered the case where a chatting message is the uniform quantization of the maximum of all ancestor nodes, as shown in (21). Although simple, this coding of chatting messages is not optimal. Here, we generalize chatting messages to understand how the performance can change with this design choice.
We begin by considering the same network under the restriction that the chatting rate is 1 bit, but allow the single partition boundary to vary rather than fixing it. For now, we keep the coding consistent for every sensor, so each chatting message indicates on which side of the boundary the relevant maximum lies. Distortions for a range of boundary positions and network sizes are shown in Fig. 6.
From these performance results, we see that the partition boundary should increase with the size of the network, but a precise characterization of the best boundary is difficult because of the complicated effect the conditional sensitivity has on both the distortion constants and the rate allocation. We can recover some of the results of Fig. 5 by placing the boundary at the uniform-quantization midpoint. It is now evident that this choice can be very suboptimal, especially as the network becomes large. In fact, we observe that for certain choices of the partition with entropy coding, the distortion with chatting can be larger than that of a traditional distributed network even though the chatting cost is 0. This unintuitive fact arises because the quantizers' dependence on the conditional sensitivity is fixed, and the benefits of a don't-care interval are offset by a more unfavorable conditional sensitivity. We emphasize that this phenomenon disappears as the rate becomes very large.
Since flexibility in the choice of the chatting encoder's partition can improve performance even at a chatting rate of 1 bit, we can expect greater gains when the chatting rate is increased. However, the only method currently developed for optimizing the partition boundaries is brute-force search using the conditional sensitivity derived in Appendix B. Another extension that leads to improved performance is to allow chatting encoders to employ different partitions. This more general framework yields strictly improved results, but some of the special structure of the serial chatting network is lost since the chatting message is no longer necessarily a quantized maximum of all ancestor sensors. The added complexity of either of these extensions makes their performance difficult to quantify.
V-D Optimizing a Chatting Network
In this paper, we have formulated a framework allowing lowrate collaboration between sensors in a distributed network. We have introduced several methods to optimize such a network, including nonuniform quantization, rate allocation, and design of chatting messages. Here, we combine these ingredients and see how each one impacts fMSE.
We will continue working with the maximum computation network from Fig. 4, with the network size and rate parameters fixed. We further assume the coding of chatting messages is the same for every sensor on the serial chain. We then consider the following scenarios:
We analyze the fMSE of each scenario compared to a distributed network without chatting. From Fig. 7, we see that incorporating rate allocation and chatting optimization yields substantial gains in the entropy-constrained setting. For fixed rate, the most meaningful improvement comes from allowing chatting, while additional optimization provides little further benefit. Up to this point, we have limited chatting to a fixed codebook size and have not allowed entropy coding. Lifting these restrictions increases system complexity but can provide even greater compression gains.
VI Conclusions
In this work, we explored how intersensor communication, termed chatting, can improve the approximation of a function of sensed data in a distributed network constrained to scalar quantization. We have motivated chatting from two directions: providing an analysis technique for distortion performance when low-blocklength limitations make Shannon theory too optimistic, and illustrating the potential gains over simplistic practical designs. There are many opportunities to leverage heterogeneous network design to aid information acquisition using the tools of high-resolution theory, and we have provided precise characterizations of distortion performance, quantizer design, and cost allocation to optimize distributed networks. Many challenges remain in analyzing chatting networks. Meaningful future directions include a more systematic understanding of how to design chatting messages, and applications where chatting may be feasible and beneficial.
One can consider “sensors” being distributed in time rather than space, with the decoder computing a function of samples from a random process. Connections of this formulation to structured vector quantizers are of independent interest.
Appendix A Rate Allocation for Distributed Networks
Consider the distributed network in Fig. 1 without the chatting channel. Each communication link between a sensor and the fusion center has a cost per bit and a cost allocation, which together determine that link's communication rate. Below, we solve the cost allocation problem under the assumption that companding quantizers are used and non-integer rates are allowed.
Lemma 2.
The optimal solution to
(29) 
has cost allocation
(30) 
where the constant is chosen such that the total cost constraint is met.
Proof:
This lemma extends the result from [32] or can be derived directly from the KKT conditions. ∎
Each allocation is calculated using only the functional sensitivity and the marginal source pdf. Although Lemma 2 always holds, we emphasize that its effectiveness in predicting the proper cost allocation in a distributed network is only rigorously shown for high cost (i.e., high rate) due to its dependence on (8). However, it can be experimentally verified that costs corresponding to moderate communication rates still yield near-optimal allocations.
When the solution of Lemma 2 is strictly positive, a closedform expression exists:
Lemma 3.
Assuming each allocation in (30) is strictly positive, it can be expressed as
(31) 
Proof:
The proof uses Lagrangian optimization. ∎
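The threshold structure in (30) can be illustrated numerically. As a sketch, consider the standard high-resolution surrogate objective D = Σ_n c_n 2^(−2 b_n) under a total-rate budget with nonnegative rates, which simplifies the paper's cost-weighted problem; the max(0, ·) solution is then found by bisection on the Lagrange multiplier. All names are illustrative.

```python
import math

def allocate_bits(consts, B, iters=100):
    """Minimize sum_n c_n * 2^(-2 b_n) subject to sum_n b_n = B and
    b_n >= 0.  The KKT conditions give b_n = max(0, 0.5*log2(c_n) + nu),
    with the constant nu found by bisection so the budget is met."""
    def total(nu):
        return sum(max(0.0, 0.5 * math.log2(c) + nu) for c in consts)
    lo, hi = -50.0, 50.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if total(mid) < B:
            lo = mid
        else:
            hi = mid
    nu = 0.5 * (lo + hi)
    return [max(0.0, 0.5 * math.log2(c) + nu) for c in consts]

print(allocate_bits([4.0, 1.0, 0.25], B=3.0))  # approx [2.0, 1.0, 0.0]
```

Links with larger distortion constants receive more rate, and sufficiently unimportant links are shut off entirely, mirroring the positivity condition that distinguishes Lemma 2 from the closed form in Lemma 3.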
If a sensor is allowed to vary its communication rate depending on the side information it receives, further gains can be enjoyed. This situation is natural in chatting networks, where the side information comprises the low-rate messages passed by neighboring sensors. Here, we introduce probabilistic cost allocation, yielding a distortion–cost tradeoff
(32) 
where the expectation is taken with respect to the side information. Each link has a cost allocation for every possible message while satisfying an average cost constraint. An analogous result to Lemma 2 can be derived; when the optimal allocation is strictly positive, it can again be expressed in closed form:
Lemma 4.
Assume the cost per bit of each sensor's communication link may vary with the side information that the sensor receives. Assuming each allocation in the solution to (32) is strictly positive, it can be expressed as
(33) 
where .
Appendix B Sensitivity of Maximum Computation Network
Assuming iid uniform sources, the sensitivity of each sensor in the maximum computation network in Fig. 4 without chatting is
When the chatting graph is a serial chain, each sensor has a lossy version of the information collected by its ancestor sensors. For the max function, chatting reduces the support of a sensor's estimate of the ancestors' maximum: the message reveals that the maximum of the ancestor sensors lies in a known subinterval. This side information creates three distinct regions in the conditional sensitivity. First, below this subinterval, the sensor's observation is assuredly less than the ancestors' maximum, and the sensitivity is 0 since the information at the sensor is irrelevant at the fusion center. Second, above the subinterval, the observation exceeds the ancestors' maximum and the sensitivity depends only on the number of descendant sensors. Finally, inside the subinterval, the sensor must take into consideration both ancestors and descendants, yielding the sensitivity
More specifically, when the messages correspond to uniform quantization, the subinterval endpoints are consecutive uniform quantization boundaries; substituting them into the expressions above gives Lemma 1.
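The three-interval structure derived above can be checked by Monte Carlo simulation, using the fact (from the DFSQ framework [4]) that the derivative of the max function is the indicator that a given observation is the overall maximum, so the sensitivity at a point is the square root of the probability of that event. Sources are taken iid uniform on [0, 1]; all names are illustrative.

```python
import random

def cond_sensitivity_mc(x, a, b, n_desc, trials=200_000, seed=0):
    """Estimate the conditional sensitivity of the max function at a
    sensor whose incoming chatting message reveals that the maximum
    of its ancestors lies in [a, b]:
        sensitivity(x) = sqrt(Pr(x is the overall max | message)).
    n_desc is the number of descendant sensors."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        anc = a + (b - a) * rng.random()                  # ancestor max given the message
        desc = max(rng.random() for _ in range(n_desc))   # descendant observations
        if x > anc and x > desc:
            hits += 1
    return (hits / trials) ** 0.5

# Below the revealed interval the sensitivity vanishes (don't-care);
# above it, only the n_desc descendants matter.
print(cond_sensitivity_mc(0.1, 0.25, 0.5, n_desc=2))  # prints 0.0
print(cond_sensitivity_mc(0.9, 0.25, 0.5, n_desc=2))  # close to 0.9**(2/2) = 0.9
```

The estimates match the closed-form pieces: 0 below the interval, x^(n_desc/2) above it, and a product of ancestor and descendant terms inside it, as in Lemma 1.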
References
 [1] D. Slepian and J. K. Wolf, “Noiseless coding of correlated information sources,” IEEE Trans. Inform. Theory, vol. IT-19, pp. 471–480, July 1973.
 [2] R. Zamir, “The rate loss in the Wyner–Ziv problem,” IEEE Trans. Inform. Theory, vol. 42, pp. 2073–2084, Nov. 1996.
 [3] V. Y. F. Tan and O. Kosut, “On the dispersions of three network information theory problems,” arXiv:1201.3901v2 [cs.IT], Feb. 2012.
 [4] V. Misra, V. K. Goyal, and L. R. Varshney, “Distributed scalar quantization for computing: Highresolution analysis and extensions,” IEEE Trans. Inform. Theory, vol. 57, pp. 5298–5325, Aug. 2011.
 [5] J. Z. Sun, V. Misra, and V. K. Goyal, “Distributed functional scalar quantization simplified,” arXiv:1206.1299v1 [cs.IT], June 2012.
 [6] D. Tse and P. Viswanath, Fundamentals of Wireless Communication. Cambridge, UK: Cambridge University Press, 2005.
 [7] T. Yucek and H. Arslan, “A survey of spectrum sensing algorithms for cognitive radio applications,” IEEE Comm. Surveys Tutorials, vol. 11, no. 1, pp. 116–130, 2009.
 [8] J. Z. Sun and V. K. Goyal, “Chatting in distributed quantization networks,” in Proc. 50th Ann. Allerton Conf. on Commun., Control and Comp., (Monticello, IL), Oct. 2012.
 [9] A. Orlitsky and J. R. Roche, “Coding for computing,” IEEE Trans. Inform. Theory, vol. 47, pp. 903–917, Mar. 2001.
 [10] H. Feng, M. Effros, and S. A. Savari, “Functional source coding for networks with receiver side information,” in Proc. 42nd Annu. Allerton Conf. Commun. Control Comput., pp. 1419–1427, Sept. 2004.
 [11] Y. Polyanskiy, H. V. Poor, and S. Verdu, “Channel coding rate in the finite blocklength regime,” IEEE Trans. Inform. Theory, vol. 56, pp. 2307–2359, May 2010.
 [12] A. Ingber and Y. Kochman, “The dispersion of lossy source coding,” in Proc. IEEE Data Compression Conf., (Snowbird, Utah), pp. 53–62, Mar. 2011.
 [13] V. Kostina and S. Verdu, “Fixed-length lossy compression in the finite blocklength regime,” IEEE Trans. Inform. Theory, vol. 58, pp. 3309–3338, June 2012.
 [14] A. H. Kaspi and T. Berger, “Rate–distortion for correlated sources with partially separated encoders,” IEEE Trans. Inform. Theory, vol. IT-28, pp. 828–840, Nov. 1982.
 [15] M. Sefidgaran and A. Tchamkerten, “On cooperation in multiterminal computation and rate distortion,” in Proc. IEEE Int. Symp. Inform. Theory, (Cambridge, MA), pp. 771–775, July 2012.
 [16] N. Ma and P. Ishwar, “Some results on distributed source coding for interactive function computation,” IEEE Trans. Inform. Theory, vol. 57, pp. 6180–6195, Sept. 2011.
 [17] N. Ma, P. Ishwar, and P. Gupta, “Interactive source coding for function computation in collocated networks,” IEEE Trans. Inform. Theory, vol. 58, pp. 4289–4305, July 2012.
 [18] S. Nitinawarat and P. Narayan, “Perfect omniscience, perfect secrecy, and Steiner tree packing,” IEEE Trans. Inform. Theory, vol. 56, pp. 6490–6500, Dec. 2010.
 [19] T. Courtade and R. Wesel, “Efficient universal recovery in broadcast networks,” in Proc. 48th Ann. Allerton Conf. on Commun., Control and Comp., (Monticello, IL), pp. 1542–1549, Oct. 2010.
 [20] E. Martinian, G. W. Wornell, and R. Zamir, “Source coding with distortion side information,” IEEE Trans. Inform. Theory, vol. 54, pp. 4638–4665, Oct. 2008.
 [21] T. Linder, R. Zamir, and K. Zeger, “High-resolution source coding for nondifference distortion measures: Multidimensional companding,” IEEE Trans. Inform. Theory, vol. 45, pp. 548–561, Mar. 1999.
 [22] A. Gersho and R. M. Gray, Vector Quantization and Signal Compression. Boston, MA: Kluwer Acad. Pub., 1992.
 [23] D. L. Neuhoff, “The other asymptotic theory of lossy source coding,” in Coding and Quantization (R. Calderbank, G. D. Forney, Jr., and N. Moayeri, eds.), vol. 14 of DIMACS Series in Discrete Mathematics and Theoretical Computer Science, pp. 55–65, American Mathematical Society, 1993.
 [24] R. M. Gray and D. L. Neuhoff, “Quantization,” IEEE Trans. Inform. Theory, vol. 44, pp. 2325–2383, Oct. 1998.
 [25] W. R. Bennett, “Spectra of quantized signals,” Bell Syst. Tech. J., vol. 27, pp. 446–472, July 1948.
 [26] P. F. Panter and W. Dite, “Quantizing distortion in pulsecount modulation with nonuniform spacing of levels,” Proc. IRE, vol. 39, pp. 44–48, Jan. 1951.
 [27] J. A. Bucklew and G. L. Wise, “Multidimensional asymptotic quantization theory with rth power distortion measures,” IEEE Trans. Inform. Theory, vol. IT-28, pp. 239–247, Mar. 1982.
 [28] S. Cambanis and N. L. Gerr, “A simple class of asymptotically optimal quantizers,” IEEE Trans. Inform. Theory, vol. IT-29, pp. 664–676, Sept. 1983.
 [29] T. Linder, “On asymptotically optimal companding quantization,” Prob. Contr. Inform. Theory, vol. 20, no. 6, pp. 475–484, 1991.
 [30] V. K. Goyal, “High-rate transform coding: How high is high, and does it matter?,” in Proc. IEEE Int. Symp. Inform. Theory, (Sorrento, Italy), p. 207, June 2000.
 [31] R. M. Gray and A. H. Gray, Jr., “Asymptotically optimal quantizers,” IEEE Trans. Inform. Theory, vol. IT-23, pp. 143–144, Feb. 1977.
 [32] A. Segall, “Bit allocation and encoding for vector sources,” IEEE Trans. Inform. Theory, vol. IT-22, pp. 162–169, Mar. 1976.