# Maximum Likelihood Signal Amplitude Estimation Based on Permuted Blocks of Differently Binary Quantized Observations of a Signal in Noise

Guanyu Wang, Jiang Zhu, Rick S. Blum, Paolo Braca and Zhiwei Xu
###### Abstract

Parameter estimation based on binary quantized observations is considered when the estimation system does not know which of a set of quantizers was used, without replacement, for each block of observations. Thus the estimation system receives permuted blocks of quantized samples of a signal in noise with unknown signal amplitude. Maximum likelihood (ML) estimators are used to estimate both the permutation matrix and the unknown signal amplitude under an arbitrary, but known, signal shape and known quantizer thresholds. Sufficient conditions are provided under which an ML estimator can be found in polynomial time. In addition, model identifiability is studied, and an alternating maximization algorithm with good initial estimates is proposed to solve the general problem. Finally, numerical simulations are performed to evaluate the performance of the ML estimators.

Keywords: Distributed estimation, quantization, unlabeled sensing, alternating maximization.

## I Introduction

Recently, parameter estimation problems under permuted observations have been studied extensively [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]. In [1], it is shown that the convex relaxation based on a Birkhoff polytope approach does not recover the permutation matrix, and a global branch and bound algorithm is proposed to estimate the permutation matrix. In the noiseless case with a random linear sensing matrix, it is shown that the permutation matrix can be recovered correctly with probability 1, given that the number of measurements is twice the number of unknowns [2]. In [3], the noise is taken into account and a condition under which the permutation matrix can be recovered with high probability is provided. In addition, a polynomial time algorithm is proposed for a scalar parameter case [3]. For unlabeled ordered sampling problems where the relative order of observations is known, an alternating maximization algorithm combined with dynamic programming is proposed [4].

In large wireless sensor networks (WSNs), the number of bits of information that each sensor sends to the fusion center may be larger than that of the identity information [8]. As a consequence, sensors may only send their measurements but not their identities. In this setting, estimation and detection algorithms should be redesigned to jointly estimate the underlying parameter and the permutation matrix, as shown in [9, 10]. In [9], a signal detection problem where the known signal is permuted in an unknown way is studied. The unlabeled scalar parameter estimation problem is studied in both analog and digital communication scenarios, and an alternating maximization algorithm is proposed in [10].

In this letter, we focus on estimation problems from unlabeled quantized samples. The main contributions of our work can be summarized as follows. First, a sufficient condition for the existence of a polynomial-time algorithm is provided for an unlabeled estimation problem. Second, the model is shown to be unidentifiable in some special cases. Third, good initial points are provided to improve the performance of an alternating maximization algorithm.

## II Problem Setup

Consider a parameter estimation problem where $N$ different binary quantizers are each used $K$ times to generate binary quantized samples, which will be utilized to estimate an unknown deterministic scalar signal amplitude parameter $\theta$. The binary samples are generated via

$$b_{ij}=Q_i(h_i\theta+w_{ij}),\quad i=1,\cdots,N,\ j=1,\cdots,K, \tag{1}$$

where $i$ and $j$ respectively identify one of the quantizers and one of the repeated applications of that quantizer, $h_i$ is the known coefficient characterizing the signal shape, $w_{ij}$ is a zero-mean Gaussian noise sample with variance $\sigma_w^2$, assumed independent across $i$ and $j$, $b_{ij}$ is the binary quantized sample, and $Q_i(\cdot)$ denotes a binary quantization of its argument that produces $1$ if the argument is larger than a scalar threshold $\tau_i$ and $0$ otherwise. The quantized data are transmitted over a binary channel with flipping probabilities $q_0$ and $q_1$, which are defined as $\Pr(u_{ij}=1\mid b_{ij}=0)=q_0$ and $\Pr(u_{ij}=0\mid b_{ij}=1)=q_1$, where $u_{ij}$ is the sample received at the output of the channel, which we call the fusion center (FC) [11]. We assume that $\theta$ is confined to a bounded interval, which effectively imposes dynamic range limitations on the quantizers [12].
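The observation model in (1), followed by the binary channel, can be simulated in a few lines. The sketch below is illustrative only: the function name `simulate_received_bits`, the seed, and all numerical values are assumptions made for this example and do not come from the letter.

```python
import random

def simulate_received_bits(theta, h, tau, sigma_w, K, q0, q1, rng):
    """Draw u_ij for i = 1..N, j = 1..K: quantize h_i*theta + w_ij, then flip."""
    U = []
    for h_i, tau_i in zip(h, tau):
        row = []
        for _ in range(K):
            w = rng.gauss(0.0, sigma_w)               # noise sample w_ij
            b = 1 if h_i * theta + w > tau_i else 0   # binary quantizer Q_i in (1)
            flip_prob = q0 if b == 0 else q1          # q0: 0 -> 1 flip, q1: 1 -> 0 flip
            u = 1 - b if rng.random() < flip_prob else b
            row.append(u)
        U.append(row)
    return U

# Hypothetical parameter values, chosen only to exercise the function.
rng = random.Random(0)
U = simulate_received_bits(theta=1.0, h=[0.5, 1.0, 1.5], tau=[0.0, 0.5, 1.0],
                           sigma_w=1.0, K=8, q0=0.05, q1=0.05, rng=rng)
eta = [sum(row) / len(row) for row in U]  # fraction of ones per quantizer
```

Each row of `U` holds the $K$ bits received from one quantizer, so the row index plays the role of the label that is lost in the unlabeled setting.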

### II-A Estimation with Labeled Data

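Classically, one can estimate the scalar parameter $\theta$ based on the binary quantized samples collected in a matrix $\mathbf{U}$, where $[\mathbf{U}]_{ij}=u_{ij}$. The probability mass function (PMF) of $u_{ij}$ is described by

$$\Pr(u_{ij}=1)=q_0+(1-q_0-q_1)\Phi\left(\frac{h_i\theta-\tau_i}{\sigma_w}\right)\triangleq p_i, \tag{2}$$

where $\Phi(\cdot)$ denotes the standard Gaussian cumulative distribution function. Thus, the log-likelihood function is

$$l(\boldsymbol{\eta};\theta)=K\sum_{i=1}^{N}\left(\eta_i\log p_i+(1-\eta_i)\log(1-p_i)\right), \tag{3}$$

where $\eta_i$ denotes the fraction of elements in the $i$th row of $\mathbf{U}$ that equal $1$, i.e., $\eta_i=\frac{1}{K}\sum_{j=1}^{K}u_{ij}$, and $p_i$ is given in (2). The MLE problem based on the ordered data is the maximization of $l(\boldsymbol{\eta};\theta)$ over $\theta$, which is a convex optimization problem and can be solved efficiently via numerical algorithms [13]. In addition, the Fisher Information (FI) given in [14] is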
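To make (2) and (3) concrete, the following sketch evaluates $p_i$ and the labeled log-likelihood, and maximizes the latter with a ternary search. The search interval `[-delta, delta]`, the ternary search itself (which presumes the objective is unimodal on that interval), the clipping constant, and all numbers are assumptions of this example, not choices made in the letter.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf  # standard Gaussian CDF

def p_vector(theta, h, tau, sigma_w, q0, q1):
    """p_i from (2) for every quantizer i."""
    return [q0 + (1.0 - q0 - q1) * PHI((hi * theta - ti) / sigma_w)
            for hi, ti in zip(h, tau)]

def log_likelihood(eta, theta, h, tau, sigma_w, q0, q1, K):
    """Labeled log-likelihood (3); probabilities are clipped to avoid log(0)."""
    total = 0.0
    for e, p in zip(eta, p_vector(theta, h, tau, sigma_w, q0, q1)):
        p = min(max(p, 1e-12), 1.0 - 1e-12)
        total += e * math.log(p) + (1.0 - e) * math.log(1.0 - p)
    return K * total

def labeled_mle(eta, h, tau, sigma_w, q0, q1, K, delta, iters=200):
    """Ternary search on [-delta, delta]; assumes a unimodal objective there."""
    lo, hi = -delta, delta
    for _ in range(iters):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if (log_likelihood(eta, a, h, tau, sigma_w, q0, q1, K)
                < log_likelihood(eta, b, h, tau, sigma_w, q0, q1, K)):
            lo = a
        else:
            hi = b
    return 0.5 * (lo + hi)

# Hypothetical example values.
theta_hat = labeled_mle(eta=[0.45, 0.55, 0.70], h=[0.5, 1.0, 1.5],
                        tau=[0.0, 0.5, 1.0], sigma_w=1.0,
                        q0=0.05, q1=0.05, K=20, delta=3.0)
```

Any one-dimensional solver could replace the ternary search; it is used here only because it needs no derivatives.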

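$$I(\theta)=\frac{K(1-q_0-q_1)^2}{\sigma_w^2}\sum_{i=1}^{N}\frac{h_i^2\,\varphi^2\left((h_i\theta-\tau_i)/\sigma_w\right)}{p_i(1-p_i)}, \tag{4}$$

where $\varphi(\cdot)$ denotes the probability density function of a standard normal random variable. Consequently, the Cramér-Rao lower bound (CRLB) is

$$\mathrm{CRLB}(\theta)=1/I(\theta), \tag{5}$$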

which is later used as a benchmark performance for ML estimation from labeled data in Section IV.

## III Estimation from Unlabeled Quantized Data

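In this section, estimation from unlabeled quantized samples is studied, where it is assumed that each quantizer transmits its data without including its own identity, or that this identity is somehow lost. The FC does not know which quantizer produced each received block of data, but knows that each block was produced by one of the $N$ quantizers. Note that $\mathbf{U}$ is the labeled (ordered) data, in which the row index identifies the quantizer that produced the data. The unlabeled samples $\tilde{\mathbf{U}}$ are thus a row permutation of the labeled samples, given by

$$\tilde{\mathbf{U}}=\boldsymbol{\Pi}\mathbf{U}, \tag{6}$$

where $\boldsymbol{\Pi}$ is an unknown $N\times N$ permutation matrix. Introduce the function $\pi(\cdot)$ such that $\pi(i)=j$ if the permutation matrix maps the $i$th row of $\mathbf{U}$ to the $j$th row of $\tilde{\mathbf{U}}$. The log-likelihood function is

$$l(\tilde{\boldsymbol{\eta}};\theta,\boldsymbol{\Pi})=K\sum_{i=1}^{N}\left(\tilde{\eta}_{\pi(i)}\log p_i+(1-\tilde{\eta}_{\pi(i)})\log(1-p_i)\right), \tag{7}$$

where $\tilde{\boldsymbol{\eta}}$ collects the row-wise fractions of ones in $\tilde{\mathbf{U}}$, i.e., $\tilde{\boldsymbol{\eta}}=\boldsymbol{\Pi}\boldsymbol{\eta}$. The ML estimation problem can be formulated as

$$\underset{\theta,\ \boldsymbol{\Pi}\in\mathcal{P}_N}{\text{maximize}}\ \ l(\tilde{\boldsymbol{\eta}};\theta,\boldsymbol{\Pi}), \tag{8}$$

where $\mathcal{P}_N$ denotes the set of $N\times N$ permutation matrices.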

### III-A θ known case

In this subsection, the permutation matrix recovery problem is studied in the case of known $\theta$. We show that the ML estimate of the permutation matrix can be computed efficiently, based on the following result.

###### Proposition

Given the ML estimation problem in (8) with known $\theta$, the ML estimate of the permutation matrix will reorder the rows of $\tilde{\mathbf{U}}$, and equivalently the elements of $\tilde{\boldsymbol{\eta}}$, to have the same relative order as the elements of $\mathbf{h}\theta-\boldsymbol{\tau}$.

###### Proof:

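Note that the objective function (7) can be decomposed as

$$l(\tilde{\boldsymbol{\eta}};\theta,\boldsymbol{\Pi})=K\sum_{i=1}^{N}\tilde{\eta}_{\pi(i)}s_i+K\sum_{i=1}^{N}\log(1-p_i), \tag{9}$$

where $s_i=\log\left(p_i/(1-p_i)\right)$. From (9), the ML estimate of the permutation matrix will reorder the rows of $\tilde{\mathbf{U}}$, and equivalently the elements of $\tilde{\boldsymbol{\eta}}$, to have the same relative order as the elements of $\mathbf{s}=[s_1,\cdots,s_N]^{\rm T}$ [9, 10]. Because $s_i$ is monotonically increasing with respect to $h_i\theta-\tau_i$, the elements of $\tilde{\boldsymbol{\eta}}$ should be reordered by the permutation matrix to have the same relative order as the elements of $\mathbf{h}\theta-\boldsymbol{\tau}$ to maximize the likelihood.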

If , then changing in would reverse the ordering. This might help to explain why two solutions appear in the subsequent Proposition III-B1 when is unknown.
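The reordering rule for known $\theta$ can be sketched as a pair of sorts: the quantizer with the $k$th smallest $p_i$ is matched with the received row having the $k$th smallest fraction of ones. The code below is a minimal sketch under the assumption $q_0+q_1<1$ (so that sorting by $p_i$ and sorting by $h_i\theta-\tau_i$ agree); the zero-based list `perm`, with `perm[i]` standing for $\pi(i)$, and all numerical values are conventions of this example.

```python
import math
from statistics import NormalDist

_PHI = NormalDist().cdf

def p_vector(theta, h, tau, sigma_w, q0, q1):
    """p_i from (2)."""
    return [q0 + (1.0 - q0 - q1) * _PHI((hi * theta - ti) / sigma_w)
            for hi, ti in zip(h, tau)]

def unlabeled_ll(eta_t, p, perm, K):
    """Log-likelihood (7); perm[i] is the received row assigned to quantizer i."""
    return K * sum(eta_t[perm[i]] * math.log(p[i])
                   + (1.0 - eta_t[perm[i]]) * math.log(1.0 - p[i])
                   for i in range(len(p)))

def ml_permutation_known_theta(eta_t, p):
    """Match the k-th smallest p_i with the k-th smallest received fraction."""
    quantizers_by_p = sorted(range(len(p)), key=lambda i: p[i])
    rows_by_eta = sorted(range(len(eta_t)), key=lambda r: eta_t[r])
    perm = [0] * len(p)
    for i, r in zip(quantizers_by_p, rows_by_eta):
        perm[i] = r
    return perm

# Hypothetical example values.
p = p_vector(theta=0.8, h=[0.3, 1.1, 0.7, 1.6, 0.9], tau=[0.4, 0.1, 0.9, 0.2, 0.5],
             sigma_w=1.0, q0=0.05, q1=0.05)
eta_tilde = [0.62, 0.35, 0.81, 0.48, 0.55]
perm_hat = ml_permutation_known_theta(eta_tilde, p)
ll_sorted = unlabeled_ll(eta_tilde, p, perm_hat, K=20)
```

For small $N$ the result can be compared against an exhaustive search over all $N!$ permutations, which should return the same maximum likelihood value.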

### III-B θ unknown case

In general, $\theta$ may be unknown. Consequently, we should jointly estimate $\theta$ and the permutation matrix $\boldsymbol{\Pi}$. However, finding the best permutation matrix is very challenging in most problems due to non-convexity. One could try all the possible permutation matrices, which costs $O(N!)$. Given a permutation matrix, we can solve for the ML estimate of $\theta$. Given $\theta$, finding the optimal permutation matrix amounts to reordering (sorting), which costs $O(N\log N)$, as we show in subsection III-A.

#### III-B1 Special cases for efficient estimation of Π under unknown θ

###### Proposition

Given the ML estimation problem in (8) with unknown , if there exist constants such that , the elements of should be reordered according to the order of the elements of or if , otherwise reordered according to or .

###### Proof:

We separately address the cases and . In the case of , must be a constant vector. Reordering according to is equivalent to reordering according to or . Given , we have . Consequently, and is reordered according to or .

The above proposition deals with three cases, i.e., is a constant vector, is a constant vector, and is a multiple of . In [10] it is shown that reordering yields the optimal MLE given . Proposition III-B1 extends the special case in [10] to more general cases. Consequently, we propose Algorithm 1, an efficient algorithm for parameter estimation.

Note that Algorithm 1 will generate two solutions and . Given system parameters and , it is important to determine whether the two solutions and will yield the same log-likelihood , i.e., whether the model is identifiable [15]. The following proposition is presented to justify that there exist cases in which the model is unidentifiable.

###### Proposition

Let and denote the ascending and descending ordered versions of , and and , where and are permutation matrices. Given and , the model is unidentifiable, i.e., , where and .

###### Proof:

Let be a permutation matrix such that has the same relative order as . Now we prove that has the same relative order as . Utilizing and , we obtain . Note that . Because has the same relative order as , has the same relative order as , and has the same relative order as .

Next we prove that holds. Because , we have

$$h_i\hat{\theta}_{s1}-\tau_i=h_i(\hat{\theta}_{s1}-c_0),\qquad h_i\hat{\theta}_{s2}-\tau_i=-h_i(\hat{\theta}_{s1}-c_0). \tag{10}$$

By examining (9) and utilizing , the second term of is equal to that of . For the first term, note that given and , the corresponding and in (9) can be viewed as evaluating at and according to (10), respectively. Because , we can conclude that is a permuted version of . The first term of (9) can be expressed as either or . Because and have the same relative order as , and and have the same relative order as , one has . Thus .
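The likelihood tie described in the proof can be checked numerically. The sketch below assumes, for illustration only, that the entries of $\mathbf{h}$ are symmetric about zero and that $\boldsymbol{\tau}=c_0\mathbf{h}$, so that the second amplitude is $2c_0$ minus the first, which is the relation implied by (10). These hypotheses, the helper names, and all numbers are assumptions of this example rather than a restatement of the proposition.

```python
import math
from statistics import NormalDist

_PHI = NormalDist().cdf

def p_vector(theta, h, tau, sigma_w, q0, q1):
    """p_i from (2)."""
    return [q0 + (1.0 - q0 - q1) * _PHI((hi * theta - ti) / sigma_w)
            for hi, ti in zip(h, tau)]

def best_ll_over_permutations(eta_t, p, K):
    """Maximum of (7) over permutations, obtained by pairing sorted values."""
    return K * sum(e * math.log(pi) + (1.0 - e) * math.log(1.0 - pi)
                   for e, pi in zip(sorted(eta_t), sorted(p)))

# Assumed setting: h symmetric about zero and tau = c0 * h.
h = [-2.0, -1.0, 1.0, 2.0]
c0 = 0.5
tau = [c0 * x for x in h]
theta_1 = 1.3
theta_2 = 2.0 * c0 - theta_1   # mirror image of theta_1 about c0, cf. (10)
eta_tilde = [0.90, 0.20, 0.60, 0.35]

ll_1 = best_ll_over_permutations(eta_tilde, p_vector(theta_1, h, tau, 1.0, 0.05, 0.1), K=20)
ll_2 = best_ll_over_permutations(eta_tilde, p_vector(theta_2, h, tau, 1.0, 0.05, 0.1), K=20)
# ll_1 and ll_2 agree up to rounding: the data cannot tell theta_1 from theta_2.
```

Under these assumptions the vector of $p_i$ at `theta_2` is a reordering of the one at `theta_1`, so the best permutation simply reverses and the two maximized likelihoods coincide.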

It can be shown that given , only one of lies in the interval , thus the model is identifiable. In the following, an alternating maximization algorithm is proposed and a method to select good initial points is also provided for the general case.

#### III-B2 Alternating maximization algorithm for general case

The problem structure suggests optimizing the two unknowns alternately, as shown in Algorithm 2.

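The alternating maximization in Algorithm 2 can be viewed as alternating projection with respect to $\theta$ and $\boldsymbol{\Pi}$. The objective function is $l(\tilde{\boldsymbol{\eta}};\theta,\boldsymbol{\Pi})$. In step $t$, given $\hat{\theta}_{t-1}$, we update the permutation matrix as $\hat{\boldsymbol{\Pi}}_{t-1}$, and the objective value is $l(\tilde{\boldsymbol{\eta}};\hat{\theta}_{t-1},\hat{\boldsymbol{\Pi}}_{t-1})$. Given $\hat{\boldsymbol{\Pi}}_{t-1}$, we obtain the ML estimate of $\theta$ as $\hat{\theta}_{t}$, and the objective value is $l(\tilde{\boldsymbol{\eta}};\hat{\theta}_{t},\hat{\boldsymbol{\Pi}}_{t-1})$, satisfying $l(\tilde{\boldsymbol{\eta}};\hat{\theta}_{t},\hat{\boldsymbol{\Pi}}_{t-1})\geq l(\tilde{\boldsymbol{\eta}};\hat{\theta}_{t-1},\hat{\boldsymbol{\Pi}}_{t-1})$. Given $\hat{\theta}_{t}$, we update the permutation matrix as $\hat{\boldsymbol{\Pi}}_{t}$, and the objective value is $l(\tilde{\boldsymbol{\eta}};\hat{\theta}_{t},\hat{\boldsymbol{\Pi}}_{t})$, satisfying $l(\tilde{\boldsymbol{\eta}};\hat{\theta}_{t},\hat{\boldsymbol{\Pi}}_{t})\geq l(\tilde{\boldsymbol{\eta}};\hat{\theta}_{t},\hat{\boldsymbol{\Pi}}_{t-1})$. Consequently, we have

$$l(\tilde{\boldsymbol{\eta}};\hat{\theta}_{t},\hat{\boldsymbol{\Pi}}_{t})\geq l(\tilde{\boldsymbol{\eta}};\hat{\theta}_{t-1},\hat{\boldsymbol{\Pi}}_{t-1}). \tag{11}$$

Given that the maximum with respect to each of $\theta$ and $\boldsymbol{\Pi}$ is unique, any accumulation point of the sequence generated by Algorithm 2 is a stationary point [16].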
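The alternating scheme and the monotonicity property (11) can be sketched as follows. This is not a transcription of Algorithm 2: the $\theta$ update is replaced by a plain grid search over an assumed interval `[-delta, delta]` (with the current value kept as a candidate, so the objective cannot decrease), and the names, stopping rule, and numbers are assumptions of this example.

```python
import math
from statistics import NormalDist

_PHI = NormalDist().cdf

def p_vector(theta, h, tau, sigma_w, q0, q1):
    """p_i from (2)."""
    return [q0 + (1.0 - q0 - q1) * _PHI((hi * theta - ti) / sigma_w)
            for hi, ti in zip(h, tau)]

def unlabeled_ll(eta_t, p, perm, K):
    """Log-likelihood (7); perm[i] is the received row assigned to quantizer i."""
    total = 0.0
    for i, pi in enumerate(p):
        pi = min(max(pi, 1e-12), 1.0 - 1e-12)   # guard against log(0)
        e = eta_t[perm[i]]
        total += e * math.log(pi) + (1.0 - e) * math.log(1.0 - pi)
    return K * total

def best_permutation(eta_t, p):
    """Permutation step: sort-and-match, as in the known-theta case."""
    by_p = sorted(range(len(p)), key=lambda i: p[i])
    by_eta = sorted(range(len(eta_t)), key=lambda r: eta_t[r])
    perm = [0] * len(p)
    for i, r in zip(by_p, by_eta):
        perm[i] = r
    return perm

def alternating_maximization(eta_t, theta_init, h, tau, sigma_w, q0, q1, K,
                             delta, tol=1e-7, max_iter=50, grid_size=2001):
    model = (h, tau, sigma_w, q0, q1)
    grid = [-delta + 2.0 * delta * k / (grid_size - 1) for k in range(grid_size)]
    theta = theta_init
    perm = best_permutation(eta_t, p_vector(theta, *model))
    history = [unlabeled_ll(eta_t, p_vector(theta, *model), perm, K)]
    for _ in range(max_iter):
        # theta step: grid search stand-in for a numerical ML solver.
        theta = max(grid + [theta],
                    key=lambda t: unlabeled_ll(eta_t, p_vector(t, *model), perm, K))
        # permutation step: re-sort for the new theta.
        perm = best_permutation(eta_t, p_vector(theta, *model))
        history.append(unlabeled_ll(eta_t, p_vector(theta, *model), perm, K))
        if history[-1] - history[-2] < tol:
            break
    return theta, perm, history

# Hypothetical example values.
theta_hat, perm_hat, history = alternating_maximization(
    eta_t=[0.62, 0.35, 0.81, 0.48], theta_init=0.0,
    h=[0.3, 1.1, 0.7, 1.6], tau=[0.4, 0.1, 0.9, 0.2],
    sigma_w=1.0, q0=0.05, q1=0.05, K=20, delta=3.0)
```

The list `history` records the objective after each pass and should be non-decreasing, mirroring (11); the point reached still depends on `theta_init`.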

#### III-B3 Good initial points

For alternating maximization algorithms dealing with nonconvex optimization problems, a good initial point is important for the algorithm to converge to the global optimum. In the following text, we provide good initial points for the alternating maximization algorithm.

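Suppose that the number of measurements $K$ is large. Consequently, as the number of measurements tends to infinity, the law of large numbers (LLN) implies

$$\eta_i\xrightarrow{p}q_0+(1-q_0-q_1)\Phi\left((h_i\theta-\tau_i)/\sigma_w\right), \tag{12}$$

where $\xrightarrow{p}$ denotes convergence in probability. Given , . In the following text, we only deal with case. The case that is very similar and is omitted here. Define and . Then should satisfy . Let $I_{l,u}(\tilde{\boldsymbol{\eta}})$ denote the elementwise projection of $\tilde{\boldsymbol{\eta}}$ onto the interval $[l,u]$. From (12) one obtains

$$\mathbf{m}\triangleq\sigma_w\Phi^{-1}\left(\frac{I_{l,u}(\tilde{\boldsymbol{\eta}})-q_0\mathbf{1}_N}{1-q_0-q_1}\right)\xrightarrow{p}\boldsymbol{\Pi}(\mathbf{h}\theta-\boldsymbol{\tau}).$$

Utilizing $\boldsymbol{\Pi}^{\rm T}\boldsymbol{\Pi}=\mathbf{I}_N$ yields

$$\mathbf{m}^{\rm T}\mathbf{m}\xrightarrow{p}\mathbf{h}^{\rm T}\mathbf{h}\,\theta^2-2\boldsymbol{\tau}^{\rm T}\mathbf{h}\,\theta+\boldsymbol{\tau}^{\rm T}\boldsymbol{\tau}, \tag{13}$$

which is a quadratic equation in $\theta$. The solutions are

$$\theta=\frac{\boldsymbol{\tau}^{\rm T}\mathbf{h}\pm\sqrt{(\boldsymbol{\tau}^{\rm T}\mathbf{h})^2-\mathbf{h}^{\rm T}\mathbf{h}\,(\boldsymbol{\tau}^{\rm T}\boldsymbol{\tau}-\mathbf{m}^{\rm T}\mathbf{m})}}{\mathbf{h}^{\rm T}\mathbf{h}}. \tag{14}$$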

The above two solutions can be used for the alternating maximization algorithm as initial points. Finally, the optimum with larger likelihood is chosen as ML estimator. In Section IV, to provide fair comparison of alternating maximization algorithm with good initial points, and are used as two initial points, and we choose the solution whose likelihood is larger as ML estimator.
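The construction of the two initial points can be sketched as follows: invert (12) elementwise to obtain $\mathbf{m}$, then solve the quadratic implied by (13). The clipping constant `eps` is a stand-in, assumed here, for the projection interval $[l,u]$; it only keeps the argument of $\Phi^{-1}$ strictly inside $(0,1)$. Clamping a negative discriminant to zero is likewise an assumption of this example, as are the function name and the numbers.

```python
import math
from statistics import NormalDist

def initial_points(eta_t, h, tau, sigma_w, q0, q1, eps=1e-6):
    """Two candidate initial amplitudes from the quadratic implied by (13)."""
    scale = 1.0 - q0 - q1                  # assumed positive
    inv_phi = NormalDist().inv_cdf         # inverse standard Gaussian CDF
    m = []
    for e in eta_t:
        x = (e - q0) / scale
        x = min(max(x, eps), 1.0 - eps)    # assumed stand-in for the projection
        m.append(sigma_w * inv_phi(x))
    hh = sum(a * a for a in h)                  # h^T h
    th = sum(t * a for t, a in zip(tau, h))     # tau^T h
    tt = sum(t * t for t in tau)                # tau^T tau
    mm = sum(v * v for v in m)                  # m^T m, invariant to row order
    disc = max(th * th - hh * (tt - mm), 0.0)   # clamp: noisy data can make it negative
    root = math.sqrt(disc)
    return (th - root) / hh, (th + root) / hh

# Hypothetical example values.
theta_a, theta_b = initial_points(eta_t=[0.62, 0.35, 0.81, 0.48],
                                  h=[0.5, 1.0, 1.5, 2.0], tau=[0.2, -0.1, 0.4, 0.3],
                                  sigma_w=1.0, q0=0.05, q1=0.05)
```

Because $\mathbf{m}^{\rm T}\mathbf{m}$ does not depend on the order of the received rows, the two candidates can be computed before any permutation is estimated, which is what makes them usable as starting values.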

## IV Numerical Results

In this section, numerical experiments are conducted to verify the theoretical results. For the first experiment, we evaluate the performance of ML estimator in the case of . Then, numerical results are implemented for general and . The number of Monte Carlo trials is . Other parameters are set as follows: , , , , and . The tolerance parameter in Algorithm 2 is set to be .

For the first numerical experiment, the special case mentioned in Proposition III-B1 is verified. In Fig. 1, is equispaced with , which corresponds to a ramp signal, and . Note that does not satisfy the condition in Proposition III-B1, thus the model may be identifiable. In addition, the MLE can be obtained via the reordering algorithm in Proposition III-B1. It can be seen that when the number of measurements is small, the ML estimator with labeled data works well, and there is an obvious gap between the MSE of the reordering estimator and the CRLB with labeled data. As increases, the MLE performances of both estimators become very close to the CRLB with labeled data.

For the second numerical experiment, the MSE performance of Algorithm 2 (for the general case) is evaluated. The elements of the vectors and are independently and randomly generated between . The results are plotted in Fig. 2. It can be seen that when is small, good initial points significantly improve the MSE performance of the alternating maximization algorithm with unlabeled data. As increases to , the MSE performances of both unlabeled ML estimators approach a common level which is larger than the CRLB achieved by the labeled data. Finally, both the unlabeled ML estimators achieve the CRLB around .

Note that the performance gap between the labeled and unlabeled ML estimators exists due to incorrect permutation matrix recovery. Accordingly, we define the permutation matrix recovery probability as , where denotes the true permutation matrix. In Fig. 3, we plot the permutation matrix recovery probability via MC trials. It can be seen that as the number of measurements increases, the recovery probability of the permutation matrix increases, i.e., the performance gap between the labeled and unlabeled ML estimators decreases with increasing $K$ and finally approaches zero. In addition, it can be seen that the probability of correct permutation matrix recovery in the first experiment is higher than that in the second experiment. This result implies that the performance of the unlabeled ML estimator approaches that of the labeled ML estimator more rapidly in the first experiment, which is demonstrated in Figs. 1 and 2. Note that the permutation matrix recovery probability for the two unlabeled estimators is almost the same for randomly generated , while there exists an MSE gap between the two estimators when is less than , demonstrating the importance of good initialization for this nonconvex ML estimation problem.

## V Conclusion

We study the joint parameter and permutation estimation problem from quantized data for a canonical (known signal shape) sensing model. A sufficient condition under which the ML estimation problem can be solved efficiently is provided. It is shown that in some settings the model can even be unidentifiable. Finally, good initial points are provided to improve the performance of an alternating maximization algorithm for general estimation problems, whose effectiveness is shown by numerical experiments.

## References

• [1] V. Emiya, A. Bonnefoy, L. Daudet and R. Gribonval, “Compressed sensing with unknown sensor permutation,” ICASSP, pp. 1040-1044, 2014.
• [2] J. Unnikrishnan, S. Haghighatshoar and M. Vetterli, “Unlabeled sensing with random linear measurements,” available at https://arxiv.org/pdf/1512.00115.pdf, 2015.
• [3] A. Pananjady, M. J. Wainwright and T. A. Courtade, “Linear regression with an unknown permutation: statistical and computational limits,” available at https://arxiv.org/abs/1608.02902, 2016.
• [4] S. Haghighatshoar and G. Caire, “Signal recovery from unlabeled samples,” available at https://arxiv.org/pdf/1701.08701.pdf, 2017.
• [5] A. Pananjady, M. J. Wainwright and T. A. Courtade, “Denoising linear models with permuted data,” available at http://arxiv.org/abs/1704.07461, 2017.
• [6] A. Abid, A. Poon and J. Zou, “Linear regression with shuffled labels,” http://arxiv.org/abs/1705.01342, 2017.
• [7] P. Braca, S. Marano, V. Matta, P. Willett, “Asymptotic efficiency of the PHD in multitarget/multisensor estimation,” IEEE Journal of Selected Topics in Signal Processing, vol. 7, no. 3, pp. 553-564, 2013.
• [8] L. Keller, M. J. Siavoshani, C. Fragouli and K. Argyraki, “Identity aware sensor networks,” Proceedings - IEEE INFOCOM, pp. 2177-2185, 2009.
• [9] S. Marano, V. Matta, P. Willett, P. Braca and R. S. Blum, “Hypothesis testing in the presence of Maxwell’s daemon: Signal detection by unlabeled observations,” ICASSP, pp. 3286-3290, 2017.
• [10] J. Zhu, H. Cao, C. Song and Z. Xu, “Parameter estimation via unlabeled sensing using distributed sensors,” to appear in IEEE Commun. Lett., 2017.
• [11] O. Ozdemir and P. K. Varshney, “Channel aware target location with quantized data in wireless sensor networks,” IEEE Trans. Signal Process., vol. 57, pp. 1190-1202, 2009.
• [12] H. C. Papadopoulos, G. W. Wornell and A. V. Oppenheim, “Sequential signal encoding from noisy measurements using quantizers with dynamic bias control,” IEEE Trans. Inf. Theory, vol. 47, no. 3, pp. 978-1002, Mar. 2001.
• [13] A. Ribeiro and G. B. Giannakis, “Bandwidth-constrained distributed estimation for wireless sensor networks-part I: Gaussian case,” IEEE Trans. Signal Process., vol. 54, no.3, pp. 1131-1143, 2006.
• [14] R. Niu and P. K. Varshney, “Target location estimation in sensor networks with quantized data,” IEEE Trans. Signal Process., vol. 54, no. 12, Dec. 2006.
• [15] C. D. M. Paulino and C. A. B. Pereira, “On identifiability of parametric statistical models,” Journal of the Italian Statistical Society, pp. 125-151, 1994.
• [16] D. P. Bertsekas, Nonlinear Programming, 2nd ed., Athena Scientific, Belmont, MA, 1999.