
# Sequential joint signal detection and signal-to-noise ratio estimation

## Abstract

The sequential analysis of the problem of joint signal detection and signal-to-noise ratio (SNR) estimation for a linear Gaussian observation model is considered. The problem is posed as an optimization setup where the goal is to minimize the number of samples required to achieve the desired (i) type I and type II error probabilities and (ii) mean squared error performance. This optimization problem is reduced to a more tractable formulation by transforming the observed signal and noise sequences to a single sequence of Bernoulli random variables; joint detection and estimation is then performed on the Bernoulli sequence. This transformation renders the problem easily solvable, and results in a computationally simpler sufficient statistic compared to the one based on the (untransformed) observation sequences. Experimental results demonstrate the advantages of the proposed method, making it feasible for applications having strict constraints on data storage and computation.

M. Fauß, K. G. Nagananda, A. M. Zoubir, and H. V. Poor¹

Signal Processing Group, Darmstadt University of Technology, D-64283 Darmstadt, Germany
Dept. of Electronics and Communications Engineering, PES University, Bangalore 560085, India
Dept. of Electrical Engineering, Princeton University, Princeton, NJ 08544, USA

**Keywords:** Sequential analysis, Bernoulli transformation, joint detection and estimation.

## 1 Introduction

The joint problem of distinguishing between different hypotheses and estimating the unknown parameters based on the outcome of the hypothesis test has received considerable attention in the literature [1]–[6]. Such a problem arises in a wide range of applications, including (i) radiographic inspection for detecting anomalies in manufactured objects and estimating their position and size [7], (ii) retrospective changepoint hypothesis testing to detect a change in the statistics and simultaneously estimate the time of change [8], [9], (iii) jointly detecting the presence of multiple objects and estimating their states using image observations [10], and (iv) distinguishing between two hypotheses and at the same time estimating the unknown parameters in the accepted hypothesis in a distributed framework [11]. Popular techniques for addressing this problem include reformulating the composite detection problem as a pure estimation problem [12]; in a Bayesian context, the maximum a posteriori estimate was shown to provide a solution to the joint detection and estimation problem [13]. The problem has also been addressed in a sequential setting, where the objective is to minimize the number of samples subject to a constraint on the combined detection and estimation cost [14], [15]. The generalized sequential probability ratio test was presented in [16], where a decision was obtained using the maximum likelihood estimate of the unknown parameter.

There is another class of problems where it is desirable to distinguish between the hypotheses and simultaneously estimate the signal-to-noise ratio (SNR), specifically the signal and noise powers, under the “signal present” hypothesis. For example, in speech processing, it was shown in [17], [18] that the performance of voice detection systems can be drastically improved by jointly estimating the noise power and a priori SNR. In [19], a scheduling scheme was shown to perform detection in an energy-efficient manner by jointly estimating the SNR. However, [20] reported that the techniques developed in some of the papers mentioned above were not readily applicable to the problem of joint detection and signal and noise power estimation. For a Bayesian formulation, it was shown in [20, Sec. III] that knowledge of the priors or of the distributions of the unknown parameters cannot be assumed for the problems addressed in [17]–[19]. Instead, an optimal solution for Gaussian observation models was presented using conjugate priors on the signal and noise powers [20, Sec. IV].

In this paper, we extend the problem of joint signal detection and SNR estimation, without a priori knowledge of the signal or noise powers, to a sequential setting and propose a novel method to address this problem. To the best of our knowledge, the sequential analysis of this problem has not been reported in the literature. The problem of distinguishing between two hypotheses (signal absent and signal present) and at the same time estimating the SNR in a Gaussian observation model is posed as an optimization setup, where we seek to minimize the number of samples required to achieve the desired (i) type I and type II error probabilities and (ii) mean squared error (MSE) performance. Our approach comprises transforming the observed signal and noise sequences to a single sequence of Bernoulli random variables, and then performing the detection-and-estimation task on the resulting Bernoulli sequence.

One of the main advantages of this transformation is that it significantly reduces the complexity of the optimization problem so that it can be solved more efficiently. Secondly, we obtain a computationally simpler sufficient statistic compared to the one that emerges when solving the problem directly. Moreover, we show that the proposed method allows for more degrees of freedom than an equivalent Bayesian solution. Experimental results show that (i) the expected number of measurements required to achieve the desired performance almost remains constant for increasing values of SNR, and (ii) many of the constraints in the transformed optimization problem are inactive which renders the problem easily solvable. As such, the method developed in this paper is feasible especially for applications with strict constraints on data storage and computation.

In Section 2, we present the problem statement. In Section 3, we detail the transformation of the observations to a Bernoulli sequence, and show how the original optimization problem can be reformulated into a setup which can be solved efficiently. Results of computer simulations are presented in Section 4. Concluding remarks are provided in Section 5.

## 2 Problem Formulation

The following linear Gaussian signal model is considered:

$$x[n] = s[n] + w[n], \quad n = 1, 2, \ldots, \tag{1}$$

where $x[n]$ denotes the observation sequence, while $s[n]$ and $w[n]$ denote sequences of i.i.d. zero-mean Gaussian random variables corresponding to the signal and noise, respectively. The variances $\sigma_s^2$ and $\sigma_w^2$ of the signal and noise are unknown, and the SNR is given by $\theta = \sigma_s^2/\sigma_w^2$. The problem is to distinguish between the two hypotheses

$$\begin{cases} \mathcal{H}_0: \theta = 0 & \text{(signal absent)}, \\ \mathcal{H}_1: \theta \geq \theta_{\min} & \text{(signal present)}, \end{cases} \tag{2}$$

and at the same time estimate the SNR under the hypothesis $\mathcal{H}_1$ using as few samples as possible, while satisfying predefined constraints on the type I and type II error probabilities. In (2), $\theta_{\min}$ denotes the minimum SNR for which reliable detection is to be guaranteed. We attempt to jointly solve the problem of signal detection and SNR estimation in a sequential setting. While the latter enables one to adapt the number of samples to the quality of the realizations, the former ensures that the dual objective of detection and estimation is achieved with a desired performance. Essentially, the problem can be formulated as the following optimization setup:

$$\min_{\psi,\delta,\hat\theta} \; E_{\theta^*}[N] \tag{3}$$

subject to

$$P_0(\delta_N = 1) \leq \alpha, \tag{4}$$

$$P_\theta(\delta_N = 0) \leq \beta(\theta) \quad \forall\, \theta \geq \theta_{\min}, \tag{5}$$

$$E_\theta\!\left[(\hat\theta_N - \theta)^2\right] \leq \gamma(\theta) \quad \forall\, \theta \geq \theta_{\min}, \tag{6}$$

where $N$ denotes the sample number at which the sequential test is terminated, $\psi$ and $\delta$ denote the stopping rule and the decision rule applied after each sample has been observed, and $\hat\theta$ is an estimator for $\theta$. The constant $\theta^*$ is a nominal SNR value under which the average sample number (ASN) is to be minimized. $P_0$ and $P_\theta$ denote the probabilities of an event under hypotheses $\mathcal{H}_0$ and $\mathcal{H}_1$, respectively. The type I and type II error probabilities are bounded by $\alpha$ and $\beta(\theta)$, respectively, with $\beta$ being allowed to depend on the true SNR. The mean squared error (MSE) of the estimator is bounded by a function $\gamma(\theta)$.

We assume knowledge of a sequence of noise-only realizations $\tilde w[n]$ that can either be recorded before performing the test or generated on the fly, for example, via an identical sensor that is shielded from the external signal but otherwise exposed to the same environmental conditions. Without $\tilde w[n]$, the testing problem cannot be solved for the setup considered in this paper.

## 3 Solution Methodology

Our approach comprises the following two steps: (i) the two sequences $x[n]$ and $\tilde w[n]$ are transformed into a single sequence of Bernoulli random variables, whose success probability is determined by the true SNR, and (ii) a sequential joint detection and estimation procedure is applied to this Bernoulli sequence.

### 3.1 Transformation to a Bernoulli sequence

The two Gaussian sequences $x[n]$ and $\tilde w[n]$ are transformed into a single Bernoulli sequence using Birnbaum’s sequential procedure [21] as follows: at every time step, we calculate the sums of the squares of the samples taken so far from both sequences and draw an additional sample from the sequence whose sum is smaller. Whenever the additional sample changes the order of the two sums, the procedure outputs a 0 or a 1, depending on which sequence the sample was drawn from. Essentially, let $S_x[k]$ and $S_w[k]$ denote the cumulative sums of squares of the samples drawn from $x$ and $\tilde w$, respectively, and let the $m$th change in the order of the two sums, i.e., the transition from $S_x < S_w$ to $S_x > S_w$ or vice versa, occur at the $k_m$th sample. Then the Bernoulli sequence is $b[m]$, where

$$b[m] = \begin{cases} 0, & \text{if the } k_m\text{th sample was drawn from } \tilde w, \\ 1, & \text{if the } k_m\text{th sample was drawn from } x. \end{cases} \tag{7}$$
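For intuition, i.i.d. Bernoulli variables whose success probability is in one-to-one correspondence with the SNR can also be obtained by a simpler pairwise construction: the sum of squares of two consecutive zero-mean Gaussian samples is exponentially distributed, and comparing two independent exponentials yields a Bernoulli variable with success probability $1/(\theta+2)$, which equals $0.5$ exactly when the signal is absent ($\theta = 0$). The sketch below implements this variant; it is meant purely as an illustration and is not necessarily identical to Birnbaum's procedure [21] (all function and variable names are ours).

```python
import random

def bernoulli_bits(x, w, pairs):
    """Pairwise illustrative variant of the Gaussian-to-Bernoulli
    transformation: the sum of squares of two consecutive N(0, s^2)
    samples is Exp(mean 2 s^2), and for two independent exponentials
    P(Ex < Ew) = var_w / (var_x + var_w).  With var_x = var_s + var_w
    this equals 1 / (theta + 2), i.e., 0.5 under the null hypothesis."""
    bits = []
    for m in range(pairs):
        ex = x[2 * m] ** 2 + x[2 * m + 1] ** 2  # exponential from x-pairs
        ew = w[2 * m] ** 2 + w[2 * m + 1] ** 2  # exponential from noise pairs
        bits.append(1 if ex < ew else 0)
    return bits

# Signal-absent case: both streams have unit variance, so bits are fair.
random.seed(0)
x = [random.gauss(0, 1) for _ in range(200)]
w = [random.gauss(0, 1) for _ in range(200)]
bits = bernoulli_bits(x, w, 100)
```

Under the signal-present hypothesis, the observation pairs have larger variance, so 1-bits become rarer, matching the one-sided reformulated hypothesis on the success probability.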

In [21], it was proved that, irrespective of the actual values of the signal and noise variances, the output $b[m]$ is a sequence of i.i.d. Bernoulli random variables whose success probability $\rho$ is in one-to-one correspondence with the SNR $\theta$. Therefore, (2) can be re-hypothesized in terms of $\rho$, i.e.,

$$\begin{cases} \mathcal{H}_0: \rho = 0.5 & \text{(signal absent)}, \\ \mathcal{H}_1: \rho \leq \rho_{\max} & \text{(signal present)}, \end{cases} \tag{8}$$

where $\rho_{\max}$ denotes the success probability corresponding to the minimum SNR $\theta_{\min}$. The optimization setup (3)–(6) can be reformulated as

$$\min_{\psi,\delta,\tilde\theta} \; E_{\rho^*}[M] \tag{9}$$

subject to

$$P_0(\delta_M = 1) \leq \alpha, \tag{10}$$

$$P_\rho(\delta_M = 0) \leq \beta(\rho) \quad \forall\, \rho \leq \rho_{\max}, \tag{11}$$

$$E_\rho\!\left[\bigl(\tilde\theta_M - \tfrac{1}{\rho}\bigr)^2\right] \leq \gamma(\rho) \quad \forall\, \rho \leq \rho_{\max}, \tag{12}$$

where $\tilde\theta$ is an estimator for $1/\rho$ that is biased in order to simplify the expression under the expected value. $P_0$ and $P_\rho$ denote the probabilities of an event under hypotheses $\mathcal{H}_0$ and $\mathcal{H}_1$, respectively, corresponding to (8). $M$ denotes the sample number of $b[m]$ at which the procedure is terminated. Since it takes multiple observations of $x$ and $\tilde w$ to generate one observation of $b$, $N$ is in general larger than $M$. Moreover, $N$ includes observations of $\tilde w$ as well as of $x$. However, by formulating the problem in terms of $b[m]$ and $M$, we aim at minimizing the required number of samples of $b$, irrespective of how many observations of $x$ and $\tilde w$ are used to generate them. Owing to this difference in the objective, the two problem formulations are not strictly equivalent, so that the proposed procedure cannot be guaranteed to be strictly optimal. A more detailed analysis of the loss incurred by applying Birnbaum’s transformation is a subject of future research. Also note that, depending on the cost involved in sampling from $x$ and $\tilde w$, one can modify the transformation to require more or fewer samples from a particular sequence. This additional potential for optimization is not taken into account in this work. For more details on the transformation and its near-optimality properties, see [21].

### 3.2 Joint detection and estimation

In order to solve (9), we first calculate its Lagrangian dual. For fixed Lagrange multipliers, this results in an unconstrained optimal stopping problem that can be solved by means of dynamic programming. We then choose the Lagrange multipliers such that the procedure satisfies the constraints on the error probabilities and the desired estimation accuracy. For analytical tractability, we relax the constraints under $\mathcal{H}_1$ to hold only for all $\rho$ in a discrete subset $\{\rho_1, \ldots, \rho_K\}$ of the feasible range. That is, we bound the type II error probability and the MSE only at a finite number of grid points; both quantities are, therefore, only approximately controlled for points in between. For the problem considered in this work, these approximations are shown to be reasonably accurate (see the examples in Section 4).

The Lagrangian dual problem of (9), with the continuum of constraints replaced by constraints at the grid points $\rho_1, \ldots, \rho_K$ and with Lagrange multipliers $\lambda = (\lambda_0, \ldots, \lambda_K)$ and $\mu = (\mu_1, \ldots, \mu_K)$, is given by

$$\max_{\lambda,\mu \geq 0} \left\{ L(\lambda,\mu) - \lambda_0 \alpha - \sum_{k=1}^{K} \bigl(\lambda_k \beta(\rho_k) + \mu_k \gamma(\rho_k)\bigr) \right\}, \tag{13}$$

$$L(\lambda,\mu) = \min_{\psi,\delta,\tilde\theta} \; E_{\rho^*}[M] + \lambda_0 P_0(\delta_M = 1) + \sum_{k=1}^{K} \left( \lambda_k P_{\rho_k}(\delta_M = 0) + \mu_k E_{\rho_k}\!\left[\bigl(\tilde\theta_M - \tfrac{1}{\rho_k}\bigr)^2\right] \right). \tag{14}$$

Following the techniques developed in [22], [23], (14) can be solved straightforwardly as follows, where we omit the details in the interest of space. Let $m_0$ and $m_1$ denote the number of 0’s and 1’s observed so far. The likelihood ratios of the corresponding observations under $\mathcal{H}_0$ and under $\rho = \rho_k$, each with respect to the distribution under the nominal value $\rho^*$, are given by

$$Z^{m_0,m_1}_0 = \left(\frac{0.5}{1-\rho^*}\right)^{m_0} \left(\frac{0.5}{\rho^*}\right)^{m_1}, \tag{15}$$

$$Z^{m_0,m_1}_k = \left(\frac{1-\rho_k}{1-\rho^*}\right)^{m_0} \left(\frac{\rho_k}{\rho^*}\right)^{m_1}. \tag{16}$$
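These likelihood ratios depend on the observations only through the counts $(m_0, m_1)$ and can be computed directly; a minimal sketch (function names are ours):

```python
def lr_h0(m0, m1, rho_star):
    """Likelihood ratio Z_0 of observing m0 zeros and m1 ones under H0
    (success probability 0.5), relative to the nominal value rho*."""
    return (0.5 / (1.0 - rho_star)) ** m0 * (0.5 / rho_star) ** m1

def lr_k(m0, m1, rho_k, rho_star):
    """Likelihood ratio Z_k of observing m0 zeros and m1 ones under
    success probability rho_k, relative to the nominal value rho*."""
    return ((1.0 - rho_k) / (1.0 - rho_star)) ** m0 * (rho_k / rho_star) ** m1
```

Note that with no observations ($m_0 = m_1 = 0$) both ratios equal one, and `lr_k` with `rho_k = 0.5` coincides with `lr_h0`, as expected from (15) and (16).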

We define $E^{m_0,m_1}_{\lambda} = \sum_{k=1}^{K} \lambda_k Z^{m_0,m_1}_k$ and $E^{m_0,m_1}_{\mu,j} = \sum_{k=1}^{K} \mu_k \rho_k^{-j} Z^{m_0,m_1}_k$ for $j = 0, 1, 2$. The optimal decision rule is given by

$$\delta^*_{m_0,m_1} = \begin{cases} 1, & \lambda_0 Z^{m_0,m_1}_0 \leq E^{m_0,m_1}_{\lambda}, \\ 0, & \lambda_0 Z^{m_0,m_1}_0 > E^{m_0,m_1}_{\lambda}, \end{cases} \tag{17}$$

and the optimal estimator by

$$\tilde\theta^*_{m_0,m_1} = \frac{E^{m_0,m_1}_{\mu,1}}{E^{m_0,m_1}_{\mu,0}}. \tag{18}$$
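Given the precomputed statistics, the decision rule (17) and the estimator (18) reduce to a few lines; a minimal sketch with illustrative names, taking the precomputed quantities as inputs:

```python
def decide_and_estimate(lam0, z0, e_lam, e_mu0, e_mu1):
    """Optimal decision (17) and estimator (18), given the statistics
    at state (m0, m1): z0 = Z_0, e_lam = E_lambda, e_mu0 = E_{mu,0},
    e_mu1 = E_{mu,1} (names are ours, not from the paper)."""
    decision = 1 if lam0 * z0 <= e_lam else 0  # decide "signal present" if 1
    theta_tilde = e_mu1 / e_mu0                # biased estimate of 1/rho
    return decision, theta_tilde
```

For example, `decide_and_estimate(1.0, 0.5, 1.0, 2.0, 4.0)` returns `(1, 2.0)`: the scaled null likelihood ratio does not exceed the weighted alternative, so the test decides in favor of the signal, and the ratio of weighted moments gives the estimate.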

The optimal stopping rule is obtained as

$$\psi^*_{m_0,m_1} = \begin{cases} 1, & G_{m_0,m_1} = R_{m_0,m_1}, \\ 0, & G_{m_0,m_1} > R_{m_0,m_1}, \end{cases} \tag{19}$$

where

$$G_{m_0,m_1} = \min\!\left[\lambda_0 Z^{m_0,m_1}_0,\; E^{m_0,m_1}_{\lambda}\right] + E^{m_0,m_1}_{\mu,2} - \frac{\bigl(E^{m_0,m_1}_{\mu,1}\bigr)^2}{E^{m_0,m_1}_{\mu,0}} \tag{20}$$

and $R_{m_0,m_1}$ is defined recursively as follows:

$$R_{m_0,m_1} = \min\!\left[G_{m_0,m_1},\; 1 + \rho^* R_{m_0,m_1+1} + (1-\rho^*) R_{m_0+1,m_1}\right]. \tag{21}$$

The quantities $G_{m_0,m_1}$ and $R_{m_0,m_1}$ correspond to the cost incurred for stopping immediately, or for stopping at the optimal time instant, given that $m_0$ 0’s and $m_1$ 1’s have been observed. The procedure is stopped the first time $G_{m_0,m_1} = R_{m_0,m_1}$. The term $\min[\lambda_0 Z^{m_0,m_1}_0, E^{m_0,m_1}_{\lambda}]$ in (20) signifies the detection cost if the decision rule (17) is employed, while the last two terms correspond to the estimation cost, i.e., the deviation of the estimator from the true SNR. At first glance, the optimal decision rule as well as the optimal estimator seem equivalent to Bayesian solutions for the following reason: the term $\lambda_k Z^{m_0,m_1}_k$ can be interpreted as the posterior probability of $\rho_k$ given the observations and a prior $\lambda$ that has been scaled by a cost coefficient. Similarly, the terms $E^{m_0,m_1}_{\mu,j}$ can be interpreted as the conditional moments of the posterior distribution of $\rho$ (or, $1/\rho$) with prior $\mu$, and the optimal estimator as the posterior expected value of $1/\rho$.

However, the proposed scheme is not equivalent to a Bayesian procedure. Note that $\lambda$ and $\mu$ both behave as “priors” for $\rho$, but can be chosen independently. For the proposed approach to be Bayesian, one would require $\lambda$ and $\mu$ to coincide up to a scaling factor. In the case of a single constraint under either hypothesis, the corresponding Lagrange multiplier can always be interpreted as a prior density scaled by a cost coefficient, so that the optimal method necessarily has a Bayesian equivalent; compare the classic likelihood ratio test [24]. In our approach, however, the two Lagrange multipliers $\lambda$ and $\mu$ correspond to two different constraints under the same distribution, so that there is no equivalence to the Bayesian setup.

Returning to the solution of (14), the main advantage of transforming the Gaussian sequences into a Bernoulli sequence is that the pair $(m_0, m_1)$ becomes a sufficient statistic of the observations. In comparison, directly solving the problem in the SNR domain requires a more complicated sufficient statistic, which includes the numbers of observations from $x$ and $\tilde w$ as well as their empirical variances. Given a maximum sample number $M_{\max}$, the matrices $R$ can be calculated via backward recursion, starting from $R_{m_0,m_1} = G_{m_0,m_1}$ for $m_0 + m_1 = M_{\max}$. Since the state variables are integers, this recursion is numerically stable for moderately large values of $m_0$ and $m_1$. The final element of the recursion is $R_{0,0}$, which is the cost at the beginning of the test (i.e., $m_0 = m_1 = 0$) when the optimal decision and stopping rules, given by (17) and (19), respectively, are used and $\lambda$ and $\mu$ are given, i.e., $L(\lambda,\mu) = R_{0,0}$. Once we are able to evaluate $L(\lambda,\mu)$, the Lagrange multipliers can be determined by solving (13) for $\lambda$ and $\mu$. Since $L$ is by construction jointly concave in $\lambda$ and $\mu$, this is a convex optimization problem that can be solved using standard algorithms, as shown in the next section.
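The backward recursion for (20) and (21) can be sketched in Python as follows. The quantities $E_\lambda$ and $E_{\mu,j}$ are taken to be $\lambda$- and $\mu$-weighted sums of the likelihood ratios in (16), which is our reading of the definitions given above; all function and variable names are illustrative.

```python
def cost_matrices(lam0, lam, mu, rhos, rho_star, m_max):
    """Backward recursion for the stopping cost G (20) and the optimal
    cost-to-go R (21) over all states (m0, m1) with m0 + m1 <= m_max.
    lam0 weights the type I error constraint; lam and mu weight the
    type II error and MSE constraints at the grid points rhos."""
    G, R = {}, {}
    for s in range(m_max, -1, -1):        # s = m0 + m1, processed backwards
        for m0 in range(s + 1):
            m1 = s - m0
            # Likelihood ratios (15) and (16) at state (m0, m1)
            z0 = (0.5 / (1 - rho_star)) ** m0 * (0.5 / rho_star) ** m1
            zk = [((1 - r) / (1 - rho_star)) ** m0 * (r / rho_star) ** m1
                  for r in rhos]
            e_lam = sum(l * z for l, z in zip(lam, zk))
            e_mu = [sum(m * r ** (-j) * z for m, r, z in zip(mu, rhos, zk))
                    for j in (0, 1, 2)]
            # Stopping cost (20): detection cost plus estimation cost
            g = min(lam0 * z0, e_lam)
            if e_mu[0] > 0:
                g += e_mu[2] - e_mu[1] ** 2 / e_mu[0]
            G[(m0, m1)] = g
            if s == m_max:                # recursion starts at the horizon
                R[(m0, m1)] = g
            else:                         # Bellman recursion (21)
                cont = (1 + rho_star * R[(m0, m1 + 1)]
                        + (1 - rho_star) * R[(m0 + 1, m1)])
                R[(m0, m1)] = min(g, cont)
    return G, R

# Example with a single grid point rho_1 = 0.3 and nominal rho* = 0.35;
# all multiplier values here are arbitrary illustrative choices.
G, R = cost_matrices(1.0, [1.0], [0.5], [0.3], 0.35, 20)
```

`R[(0, 0)]` then evaluates the Lagrangian for the given multipliers, which is what the outer optimization over $\lambda$ and $\mu$ in (13) consumes.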

## 4 Experimental Results

In this section, we present results of two numerical experiments, which provide interesting insights into the structure of the joint detection and estimation problem. The SNR is chosen in a fixed range with grid points at the integer values. The nominal SNR value and the target error probabilities are identical for both experiments. As a measure of the estimation accuracy, the relative (or normalized) MSE is used and bounded by a constant. In order to match the problem formulation in Section 2, this constraint can be expressed in terms of the absolute MSE and an SNR-dependent bound $\gamma(\theta)$ or, equivalently, in terms of $\rho$. The two experiments differ only in the value of the relative MSE bound.

For the numerical solution of (13), we employed the Subplex algorithm [25] as implemented in [26]. It applies the Nelder–Mead simplex algorithm [27] repeatedly on suitably chosen low-dimensional subspaces and does not require the calculation of gradients, which is computationally intensive for the recursively defined cost function (13). By limiting the search to subspaces, the algorithm can exploit the sparsity in the optimal Lagrange multipliers. Since there is no formal proof of convergence, the Subplex solution was subsequently verified by evaluating its first-order optimality conditions. The maximum number of Bernoulli samples $M_{\max}$ was set to a value that proved to be sufficient.

In Fig. 1, the optimal Lagrange multipliers are depicted for both experiments. It is interesting to note that in both cases the solution is sparse, which implies that most of the performance constraints are inactive. Considering the very coarse SNR grid, this outcome is rather unexpected and suggests that, in practice, very few constraints can be sufficient to bound the performance over large SNR intervals. This can also be seen in Fig. 2, where the type II error probabilities and the relative MSE are plotted over the range of SNRs. The results were obtained by averaging over a large number of Monte Carlo simulations, with the SNR interval sampled on a fine grid. Within the numerical accuracy, the performance requirements are met or exceeded for all SNR values in the feasible interval. The type II error probabilities in particular are well below the required bound for all SNR values. For the tighter MSE bound, the estimation constraint dominates the detection constraint to the point that the latter is virtually deactivated, i.e., the corresponding Lagrange multipliers are close to zero (see Fig. 1). The constraints on the relative MSE are stricter, so that the bound is reached over large regions of the SNR interval. Considering that the performance is only constrained at integer values of the SNR, it is remarkable that the requirements are satisfied over the entire interval.

The average number of samples drawn from the observation sequence and the reference sequence is shown in Fig. 3. As expected, the ASNs for both sequences are high at low SNRs and decrease for higher SNRs. Both ASNs reach a minimum around the nominal SNR value that was targeted in the minimization procedure. For large SNR values, the number of samples drawn from the reference sequence increases again, while the number of samples generated by the signal itself stays almost constant. This is a desirable property, considering that generating training samples is usually easier than taking physical measurements. The modes at low and high SNR values are due to the detection and estimation constraints, respectively.

## 5 Concluding remarks

We have considered the problem of joint signal detection and SNR estimation for a linear Gaussian model in a sequential framework. The central idea of our approach is to transform the observed sequences into a sequence of Bernoulli random variables. This transformation leads to a simpler reformulation of the main optimization problem, which can be solved efficiently. The expected minimum number of samples required to achieve the desired performance remains almost constant for increasing values of the SNR. We also obtain a sufficient statistic for the test that is very easy to compute. Experimental results indicate that many constraints in the optimization setup are inactive, which renders the problem easily solvable. These results indicate the feasibility of the proposed method for practical applications. Understanding the implications of non-stationary noise processes on the performance of the proposed approach, especially in the high-SNR regime, is one of the main avenues for future research.

### Footnotes

1. This research was supported in part by the U. S. National Science Foundation under Grants CNS-1456793 and ECCS-1343210.

### References

1. D. Middleton and R. Esposito, “Simultaneous optimum detection and estimation of signals in noise,” IEEE Transactions on Information Theory, vol. 14, no. 3, pp. 434–444, May 1968.
2. A. Fredriksen, D. Middleton, and V. VandeLinde, “Simultaneous signal detection and estimation under multiple hypotheses,” IEEE Transactions on Information Theory, vol. 18, no. 5, pp. 607–614, Sept. 1972.
3. G. V. Moustakides, G. H. Jajamovich, A. Tajer, and X. Wang, “Joint detection and estimation: Optimum tests and applications,” IEEE Transactions on Information Theory, vol. 58, no. 7, pp. 4215–4229, July 2012.
4. J. Chen, Y. Zhao, A. Goldsmith, and H. V. Poor, “Optimal joint detection and estimation in linear models,” in IEEE Conference on Decision and Control, Dec. 2013, pp. 4416–4421.
5. S. Li and X. Wang, “Joint composite detection and Bayesian estimation: A Neyman - Pearson approach,” in IEEE Global Conference on Signal and Information Processing, Dec. 2015, pp. 453–457.
6. S. Li and X. Wang, “Optimal joint detection and estimation based on decision-dependent Bayesian cost,” IEEE Transactions on Signal Processing, vol. 64, no. 10, pp. 2573–2586, May 2016.
7. L. Fillatre, I. Nikiforov, and F. Retraint, “A simple algorithm for defect detection from a few radiographies,” Journal of Computers, vol. 2, no. 6, pp. 26–34, Aug. 2007.
8. A. Vexler and C. Wu, “An optimal retrospective change point detection policy,” Scandinavian Journal of Statistics, vol. 36, no. 3, pp. 542–558, Sept. 2009.
9. S. Boutoille, S. Reboul, and M. Benjelloun, “A hybrid fusion system applied to off-line detection and change-points estimation,” Information Fusion, vol. 11, no. 13, pp. 325–337, Oct. 2010.
10. B. N. Vo, B. T. Vo, N. T. Pham, and D. Suter, “Joint detection and estimation of multiple objects from image observations,” IEEE Transactions on Signal Processing, vol. 58, no. 10, pp. 5129–5141, Oct. 2010.
11. H. Zhu, P. Zhang, Y. Lin, and J. Liu, “Joint detection and estimation fusion in distributed multiple sensor systems,” in International Conference on Information Fusion, July 2016, pp. 805–810.
12. A. Ghobadzadeh, S. Gazor, M. R. Taban, A. A. Tadaion, and M. Gazor, “Separating function estimation tests: A new perspective on binary composite hypothesis testing,” IEEE Transactions on Signal Processing, vol. 60, no. 11, pp. 5626–5639, Nov 2012.
13. A. Gelman, J. B. Carlin, H. S. Stern, D. B. Dunson, A. Vehtari, and D. B. Rubin, Bayesian Data Analysis, CRC Press, 3 edition, 2013.
14. Y. Yilmaz, G. V. Moustakides, and X. Wang, “Sequential joint detection and estimation,” Theory of Probability & Its Applications, vol. 59, no. 3, pp. 452–465, 2015.
15. Y. Yılmaz, S. Li, and X. Wang, “Sequential joint detection and estimation: Optimum tests and applications,” IEEE Transactions on Signal Processing, vol. 64, no. 20, pp. 5311–5326, Oct. 2016.
16. X. Li, J. Liu, and Z. Ying, “Generalized sequential probability ratio test for separate families of hypotheses,” Sequential Analysis, vol. 33, no. 4, pp. 539–563, 2014.
17. J. Sohn and W. Sung, “A voice activity detector employing soft decision based noise spectrum adaptation,” in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing, May 1998, vol. 1, pp. 365–368.
18. J. Sohn, N. S. Kim, and W. Sung, “A statistical model-based voice activity detection,” IEEE Signal Processing Letters, vol. 6, no. 1, pp. 1–3, Jan. 1999.
19. L. Le, D. M. Jun, and D. L. Jones, “Energy-efficient detection system in time-varying signal and noise power,” in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing, May 2013, pp. 2736–2740.
20. L. Le and D. L. Jones, “Optimal simultaneous detection and signal and noise power estimation,” in IEEE International Symposium on Information Theory, June 2014, pp. 571–575.
21. A. Birnbaum, “Sequential tests for variance ratios and components of variance,” Annals of Mathematical Statistics, vol. 29, no. 2, pp. 504–514, June 1958.
22. A. Novikov, “Optimal sequential tests for two simple hypotheses,” Sequential Analysis, vol. 28, no. 2, pp. 188–217, 2009.
23. M. Fauß and A. M. Zoubir, “A linear programming approach to sequential hypothesis testing,” Sequential Analysis, vol. 34, no. 2, pp. 235–263, 2015.
24. E. L. Lehmann and J. P. Romano, Testing Statistical Hypotheses, Springer, New York City, New York, USA, 3 edition, 2005.
25. T. H. Rowan, Functional Stability Analysis of Numerical Algorithms, Ph.D. thesis, The University of Texas at Austin, Austin, TX, USA, 1990, UMI Order No. GAX90-31702.
26. S. G. Johnson, “The NLopt nonlinear-optimization package,” http://ab-initio.mit.edu/nlopt.
27. J. A. Nelder and R. Mead, “A simplex method for function minimization,” The Computer Journal, vol. 7, no. 4, pp. 308–313, 1965.