A Convex Relaxation Approach to Higher-Order Statistical Approaches to Signal Recovery
In this work, we investigate an efficient numerical approach for solving higher order statistical methods for blind and semi-blind signal recovery from non-ideal channels. We develop numerical algorithms based on convex optimization relaxation for minimization of higher order statistical cost functions. The new formulation through convex relaxation overcomes the local convergence problem of existing gradient descent based algorithms and applies to several well-known cost functions for effective blind signal recovery including blind equalization and blind source separation in both single-input-single-output (SISO) and multi-input-multi-output (MIMO) systems. We also propose a fourth order pilot based cost function that benefits from this approach. The simulation results demonstrate that our approach is suitable for short-length packet data transmission using only a few pilot symbols.
Keywords: Blind signal recovery, semi/blind channel equalization, convex optimization, semidefinite programming, rank 1 approximation.
Blind signal recovery is a well-known problem in signal processing and communications. With a special goal of recovering unknown input signals to unknown linear systems based on the system output signals, this problem typically manifests itself either as blind equalization or blind source separation. In SISO systems or single-input-multiple-output (SIMO) systems, the goal of channel equalization is to undo the inter-symbol interference (ISI). In MIMO systems or in source separation, the objective is to mitigate both the ISI and the inter-channel interference (ICI) in order to successfully separate different sources. The advantage lies in the fact that blind algorithms do not require allocation of extra bandwidth to training signals whereas semiblind algorithms can substantially reduce the length of training signals. System implementations based on blind algorithms have appeared in downstream cable modem , HDTV [2, 3, 4].
The specially designed cost functions for blind signal recovery typically pose a challenge to the issue of global convergence and convergence speed. In literature, global convergent algorithms for blind equalization and source separation do exist for linear programming equalization  and space-time signal detection . However, without limiting system input signals to the special QAM class, most of the blind algorithms are based on non-quadratic and non-convex costs that utilize high-order statistics of the received signals. Well-known algorithms of this type include the constant modulus algorithm (CMA) [7, 8], the Shalvi-Weinstein algorithm (SWA) , and the minimum entropy deconvolution (MED) algorithm [10, 11]. In fact, without modifications, these stochastic gradient descent (SGD) algorithms typically admit multiple local minima [12, 13] and require large numbers of data and iterations to converge.
To improve convergence speed of blind channel equalization techniques, batch algorithms can effectively utilize the statistical information of the channel output signals. They can significantly shorten the convergence time [14, 15, 16, 17, 18, 19]. Unfortunately, local convergence remains a major obstacle, that requires good initialization strategies. Another approach to mitigate the local convergence problem is through relaxation by “lifting” the receiver parameters to a new parameterization that covers a larger parameter space over which the cost function is convex [20, 21]. Once the convex problem is uniquely solved, the solution is mapped back to the original restrictive parameter space. For example, the authors of  relaxed the outer-product matrix of the equalizer parameter space from a rank-1 matrix to an unrestricted square matrix over which the CMA cost becomes quadratic and can be solved globally via least squares (LS). From the relaxed solution, a series of linear operations map the solution back into the right equalizer coefficient space. In , the authors proposed a similar relaxation with a much more elegant algorithm that modifies the CMA cost function into a special sum. The modified CMA cost is first solved over a more restrictive semi-definite positive matrix space. The resulting convex optimization problem is then solved efficiently via semidefinite programing (SDP). Iterative mappings must also follow the SDP solution in order to determine the equalizer parameters .
In this work, we study means to improve the convergence of general blind and semiblind equalization algorithms by generalizing the cost modification principle presented in . We can therefore develop a more general batch implementation of well-known blind equalization and source separation algorithms that minimize cost functions involving the fourth order statistics of system output signals. More specifically, the new implementation leverages a convex optimization relaxation that can be applied to CMA, SWA, and MED algorithms. We show that our convex formulation requires less resource and improves the efficiency of the convex formulation in . We further generalize the formulation to accommodate the semi-blind algorithms when a small number of training symbols are available to assist the receiver signal recovery and separation. Our proposed method overcomes the local convergence problem which is a drawback of traditional gradient descent implementations.
The rest of the paper is organized as follows. In Section II, we introduce MIMO system model. Section III discusses batch MIMO blind source recovery using CMA cost as an example of real-valued fourth order functions. Section IV presents a convex formulation of the fourth order function and algorithm to find its global minima. In Section V, we extend formulation to other blind algorithms based on fourth order cost. A semiblind algorithm using fourth order function is proposed in Section VI. In Section VII, we present our simulation results and Section VIII contains some concluding remarks. In the Appendices, we include the formulation of converting cross-correlation cost and the training based cost into real-valued fourth order functions.
Ii Problem Statement and System Model
We consider a baseband MIMO system model with transmit antennas and receive antennas. This MIMO model covers SIMO and SISO channels. Let , denote random independent data symbols for transmit antenna at time . typically belong to finite set . Consistent with practical QAM systems, we assume that are mutually independent and identically distributed (i.i.d.) with variance and fourth order kurtosis .
The data streams are transmitted over multipath MIMO channels denoted by impulse responses , , , , with delay spread of samples. Assuming the MIMO channel is corrupted by i.i.d. additive white Gaussian noise (AWGN) with zero-mean and variance , the output of -th receiver can be expressed as
The goal of the MIMO receiver is to recover data stream , from the channel output . Following most of the works in this area, we shall focus only on linear FIR equalizers. Given sub-channel outputs, we apply a MIMO equalizer with parameter vector
where is considered as the order of individual equalizer and , . If the channel is flat fading (or non-frequency-selective), then we have a degenerated problem of blind source separation for which . The FIR blind MIMO equalizer then degenerates into a linear source separation matrix .
The linear receiver output has parallel output streams denoted by
where denote the convolution. Our objective is to optimize the linear receiver parameters such that the source data symbols are recovered by without interference as follows
Note that is a phase ambiguity that is inherent to the blind signal recovery which cannot be resolved without additional information; and is the output delay of the th signal recovery that does not affect the receiver performance. Upon optimum convergence, a simple memoryless decision device can be applied, and at high signal to noise ratio (SNR), we have
For convenience of notation, we further define
With these notations, we can write
In MIMO blind equalization, we would like to find equalizer parameter vectors such that the combined (channel-equalizer) response is free of inter-symbol and inter-channel interferences
Iii Constant Modulus Algorithm for MIMO Equalization
In this section, we provide a real-valued representation of the conventional batch CMA cost. This representation is one of the keys to reduce parameter space compared to the work of  in which the formulation is based on complex values. This representation is also applied to other blind channel equalization and blind source separation costs and will be shown in the latter sections.
Iii-a CMA for single source recovery
For the sake of clarity, we discuss the CMA cost as an example of selecting blind cost. To recover a particular source, the CMA cost function for the -th equalizer output sequence is defined as
where . The CMA cost can be represented as a function of the equalizer coefficients and the channel output statistics . Since the formulation here deals with single source, the source index is omitted.
Let and denote the real and imaginary parts of . Now define , and . We have the following relationship
As a result, we have
By denoting the rank-1 matrix and , we have
Note that both and are symmetric of dimension with . In other words, and can be mapped by the svec operator to lower dimensional subspace after ignoring redundant entries. Furthermore, we sort the entries in the usual lexicographic order. We define the following operators and vectors:
For a symmetric matrix , we define svec operator and its reverse operator as
For a vector , we define as vector whose entries are all second order (quadratic) terms and sorted in the usual lexicographic order.
There is one to one mapping between elements of and as consists of all upper triangle elements of . Nevertheless, the reverse operator needs further consideration. Given an arbitrary , we can form the corresponding , which is not necessarily a rank 1 matrix. Therefore, we define the reverse operator with approximation as follows: First, using to form the corresponding ; Next, find the rank 1 approximation of by its maximum eigenvalue and the corresponding eigenvector . The resulting matrix is .
With the above definitions, the output power can be rewritten as
Similarly, the fourth order moment of equalizer output is
Therefore, the CMA cost can be written into a fourth order function of or a quadratic function of as
The equalizer can be found by minimizing the polynomial function .
Iii-B Multi-source recovery
To recover multiple input source signals of a MIMO channel, we must form several single stream receivers to generate multiple receiver output streams that will represent distinct source signals . Typically, a blind algorithm is adept at recovering one of the possible source signals. However, because of the inherent properties of blind equalization, the receiver is unable to ascertain which signal may be recovered a priori. As a result, multiple single stream receivers may recover duplicate signals and miss some critical source signals. To avoid duplicate convergence, proper initialization of each may help lead to a diversified convergence as needed. However, there is unfortunately no guarantee that different initial values of will lead to . In order to ensure that different receiver vectors extract distinct signals, common criteria rely on the prior information that the source signals are i.i.d. and mutually independent.
where is a positive scalar and accounts for the correlation of the receiver outputs and . The minimization of ensures that the equalizers converge to different solutions corresponding to different sources. One can recover all sources either simultaneously or sequentially to reduce the parameter size.
In sequential source recovery, we start by extracting the first source by minimizing one single CMA cost. Assume the sources up to are equalized and separated, we can minimize the following cost for separating the source .
In this work, we consider the sequential approach. can be chosen as the sum of cross-cumulants  or as the sum of cross-correlations . Since the use of cross-cumulant is for coarse step separation  and may lead to poor convergence, here, we use the cross-correlation
where is an integer parameter to be chosen during system design. can also be written as a fourth order function of or a second order function of . Noticing that , for , are already known, we can write
where is a vector formed by the previously calculated equalizers and received statistics. The detailed calculation of is given in the Appendix A.
Eventually, the cost in (17) when using CMA and cross-correlation can be written as a fourth order (cost) function of equalizer parameter with zero odd-order coefficients similar to :
In the following section, we discuss the method to find global minima of the functions of this type.
Iv Semidefinite programming approach to minimize a fourth order polynomial
Iv-a General formulation
General formula to find the global minimum of a non-negative polynomial is discussed in [25, 26]. Here, we restrict the formulation for the fourth order polynomial with . The minimization of is equivalent to
This optimization is equivalent to lifting the horizontal hyperplane created by until it lies immediately beneath the hypersurface . The intersection points that the hyperplane and the hypersurface are the global minima of . This problem is convex with a convex cost and linear constraints in . The problem, however, still a hard problem to solve since the number of constraints is infinite. Although it is hard to find the optimal solution to this problem, we can modify and simplify the problem by narrowing down the search space. Since is non-negative, we are looking for a representation of as a sum of square. Following the work of , we define two convex cones of fourth order polynomials of , and :
We can rewrite the optimization problem in (21) in term of as
This problem can be cast into a convex semi-definite programming to be solved efficiently .
We recall the following lemma, which is a simplified version of the results in .
Given any forth-order polynomial , the following relation holds:
where and , , .
Therefore, we obtain an equivalent formulation of (23) as
In general, the solutions for (22) and (23) are not the same. Define and the solutions of (22), (23), respectively. Since, (23) is a restrictive version of (22), we have . Nevertheless, simulation tests in  showed that the solutions of (22) and (23) are nearly identical for arbitrary polynomial cost functions . In the next subsection, we will show that they are actually identical in the case of CMA cost.
Iv-B Minimization of fourth order cost functions without odd-order coefficients
Now, we would like to specialize the algorithm for solving CMA cost. Therefore, we focus on the functions that are fourth order without odd-order entries.
The CMA cost and other mentioned costs in this work belong to the function set . The minimization problem is simply
As the problem in (26) remains hard to solve, we reduce our feasible solution from to as
We can arrive at the following proposition:
Given any forth-order polynomial without odd-order coefficients, the following relationship holds:
where , , and .
From Lemma 1, we find such that and . We partition as
where corresponds to the coefficients for the product terms between entries of and . Consequently, the function can be rewritten as
Since does not have odd-order entries, we have
We can rewrite , where
This is equivalent to
where since by definition. ∎
Iv-C Semidefinite programing solution
Similar to , we let and define as the set of all distinct 4-tuples that are permutation of . Now define a subset of as
can be written into two ways as follows
where and represents the entries of and , respectively, for the product term between entries and in . Similarly, and denote the entries of and , respectively, for product term in .
The optimization in (34) can be rewritten as
This problem is convex with linear and semi-definite constraints. Therefore, it can be solved efficiently using available optimization tools such as Sedumi or SDPT-3 .
The proposed convex optimization is a real-valued and a simplified version of the complex-valued problem in . By modifying the approach given in , our method has two numerical benefits that lead to lower complexity. First, it exploits the special characteristic of fourth order statistics blind costs that have zero odd-order coefficients. As a result, the unknown parameter space is reduced. Second, the real-valued formulation captures all necessary information for equalization. Because in our real-valued formulation, the matrix does not increase in size, the real-valued formulation therefore further reduces the number of parameters to half.
After reaching the global solution of (25) via SDP, the result must be translated and mapped back to the original blind receiver parameter space. Here, we describe such post-processing procedure.
If the global optimum is achieved, there exists such that
where is the symmetric matrix formed by the elements of inside . Since is positive semidefinite, the solution must lie in the null space of . If the null space of has dimension 1, then we already have uniquely. Otherwise, we must look for inside the eigenspace corresponding to the smallest eigenvalues of . Furthermore, the last element of must equal to 1. We now describe an iterative algorithm for real-valued data that was modified from the one proposed in :
Step 1. Initialization: pick a random . Find the null space of by finding that consists of eigenvectors of whose eigenvalues are below a set threshold .
Step 2. For a , find the linear projection of onto :
Step 3. Normalize the last element to 1. Rescale as
where is the last element of . The resulting vector now is in null space of and its last element is 1. Let be the vector consisting of the first elements of as .
Step 4. Calculate , note that this is an approximation operation related to rank 1 approximation. Form .
Step 5. Repeat step 2 until converge. The equalizer is contained in as
Note that in step 3 of this existing post-processing, the last element of must be nonzero. In practice, this condition may not always be true. To overcome this weakness, we can require instead that the equalizer output have the same power as the transmitted symbols. We introduce an additional gain on , (or on ) such that produces the output with the same power as the transmitted symbol,
As a result, we have a more robust and new post-processing algorithm as follows:
Step 1. Initialization: pick a random . Find the null space of by finding matrix which consists of eigenvectors of corresponding to the eigenvalues less than threshold .
Step 2. For a , find which is the linear projection of onto as (40). Let be the vector consisting of the first elements of as .
Step 3. Calculate .
Step 4. Find using (43). The new equalizer is , and the new .
Step 5. Repeat step 2 until converge.
Iv-E Complexity discussion
The complexity of the system depends mostly on the size of . It affects the size of kurtosis and covariance matrices, the number of variables input to the SDP solver, the size of vectors and matrices in post-processing step. Similar to many other batch algorithms the complexity of forming the statistical matrices is . The SDP solver using interior point method requires the worst case complexity of , where is the number of variables. As the size of matrix is , the SDP-solver requires arithmetic operations. For the post-processing techniques, the dominant operation is the rank one approximation which is realized by singular value decomposition and its complexity is . We further discuss about the SDP solver complexity as it is dominant. Compared to the work in , we can see that, in big O notation, the worst case complexity do not change. However, in practice, convex optimization in  estimates complex equalizer taps whereas the proposed CO-CMA method estimates many real equalization taps resulting into reduction of worse case complexity. That is equivalent to reduce the complexity of the SDP solver by for the best case. This, in practice, is very significant.
In , the authors mentioned about sparsity of for the formulation of CMA cost where the first and the third order parts are zeros and a specially tailored SDP solver is needed. With our formulation, we show that it is not necessary to have such a solver. By omitting the odd order parts, the new formulation further reduces the number of variables from to . For , it is equivalent to reducing complexity.
Another advantage of our formulation over the formulation in  is that the vector comprises the real and imaginary parts of the equalizer whereas the corresponding component in  is made of the equalizer vector and its Hermitian. In , this dependency is not imposed in the post-processing step and may requires more iterations to converge. In fact, our simulations show that at most 3 iterations are requires to converge where as the work in  requires 5 iterations.
As compared to the traditional gradient decent algorithms, the convex optimization approach, which uses SDP solver for the fourth order cost, provides much better performance at the expanse of complexity. The gradient decent algorithms such as BGD-CMA or OS-CMA, have low complexity and suffer from local minima and, in many cases, provide poor performance. In contrast, the convex optimization approach at least finds a local minimum that achieves good performance. Due to large computational complexity, the convex optimization approach is not viable for real time applications. Nevertheless, our work achieves incremental complexity reduction for CO-CMA method.
V Generalization to Other Algorithms
The principle of the convex optimization relaxation and the accompanied iterative mapping procedure can be generalized beyond CMA. In fact, several well-known blind algorithms based on fourth order statistics can also be recast into convex optimization algorithms.
V-a Shalvi-Weinstein algorithm
Closely related to CMA is SWA . This algorithm is also based on the fourth order statistics by minimizing the following cost
Here, the source index is omitted. For , the SWA cost becomes the CMA cost with a constant difference.
Using real-valued vector notation, we can rewrite the SWA cost as
Clearly, the cost function is a fourth order function of the equalizer parameter vector , similar to the CMA. Hence, we can apply the proposed convex optimization method on this cost function for convergence enhancement.
V-B MED cost and the modified cost for convex optimization
Another related algorithm is the MED algorithm originally developed by Donoho in  as
Taking into account the negative kurtosis of the communication signals, i.e., , we form an equivalent cost
where is the Lagrangian multiplier for the power constraints. The cost can be further summarized as
Applying the real-valued formulation, we arrive at another fourth order blind cost function
Therefore, we can apply the proposed convex optimization method on MED for convergence enhancement.
Vi Application to Semiblind Equalization and Signal Recovery
In practice, there are often pilot symbols for channel estimation and equalization. Semiblind equalization and signal recovery are desirable when the linear system is too complex for the available pilot symbols to fully estimate or equalize. In such cases, integrating pilot symbols with blind recovery criterion into a semiblind equalization makes better sense. In this section, we present a semiblind algorithm by utilizing the available pilot samples to enhance the equalization performance using the aforementioned convex optimization formulation.
Our basic principle is to construct a special semiblind cost for minimization. Typically, such cost functions are usually formed as a linear combination of a blind criterion and a pilot based cost in the form of
Here the scalar depends on a number of factors such as the pilot sequence length, the number of sources, as well as the source signal constellations.
In order to apply the same convex optimization principle, we aim to develop a semiblind cost function that is also a fourth order function of the receiver parameters. This cost function should also only have even order coefficients. Here, for simplicity, we discuss the use of pilot in single source recovery mode (SISO or SIMO channel model), although our method can be easily generalized for multiple sources.
For a single signal source, we can omit the source index. Without loss of generality, let the transmitted symbol , be the training sequence. In the absence of noise and under perfect channel equalization, we have the following linear relationship between training symbols and received signals
where is the decision delay.
We convert the complex data representation to real-valued data representation by defining
The relationship between training vector and the received signal vector can be equivalently written as real-valued signal representation as follows
This relationship can be also written as a fourth order function of through the following steps.
First, under noise free condition, if the channel can be completely equalized, then the following criteria can be used for channel equalization
Therefore, the equalizer should minimize the following pilot-based fourth order cost function
where , , and are appropriately defined matrix, vector, and scalar that correspond to the fourth, the second, and the zero-th order coefficients of , respectively. Their details are presented in Appendix B.
This cost can be combined with one of the blind costs to form semi-blind costs which can be solved by the convex optimization
with , , and are the matrix, vector, scalar that correspond to fourth, second, zero order coefficients of the aforementioned blind costs, respectively.
Vii Simulation results
We now present numerical results to illustrate the performance of our new batch algorithms using convex optimization (CO) for CMA, SWA, MED and the fourth order cost semiblind algorithm. In batch algorithms, the expectations are replaced by the average of equalizer input data samples for each antenna. To show the advantages of our formulation, we compare our algorithms with other blind/semiblind algorithms using gradient descent method. For the best performance, batch gradient descent algorithms are used and marked as BGD.
The semiblind algorithms are marked as SB. The semiblind cost for comparison is selected with CMA cost for the blind part and LS cost for pilots in (49):
Here, the decision delays is chosen optimally.
We compare the algorithm performance in terms of the final receiver ISI. Quantitatively, the equalization effectiveness is measured by the normalized ISI for the th equalizer output defined as
For multiple source signals, we also define the sum normalized ISI as
There are several systems in our studies: SISO system with multipath, MIMO system with multipath and a mixing matrix. The convex optimization is proceeded using available optimization tool . The results are averaged over 500 Monte-Carlo realizations of inputs and channels, unless otherwise specified.
Vii-a SISO channel
We first test our convex optimization using CMA, SWA, MED costs on SISO channel and compare with BGD-CMA. The QPSK and 16-QAM inputs are passed through 3-tap Rayleigh channels with unit power profile and the receive signals are equalized by -tap equalizers. Tables I and II compare the ISI performance of different algorithms for QPSK and 16-QAM channel inputs, respectively. For convex optimization, SDP-3 with interior point method is used. For post-processing, the threshold is set at . The CO-CMA uses both post-processing techniques in  (marked as "pp1") and the newly proposed one (marked as "pp2"). Other CO algorithms apply only the proposed post-processing technique. The data length is to ensure good convergence.
|ISI (dB)||No noise (SNR=)||SNR=dB|
|ISI (dB)||No noise (SNR=)||SNR=dB|
The simulations show no significant performance difference among the algorithms using convex optimization. Under good condition, i.e., high SNR and long enough data size, the convex optimization could find the global minima for each case. In fact, the equalization objective does not depend on the costs. The difference among the costs with the choice of parameters are perhaps the convergence of the gradient descent implementations as confirmed by many works [17, 22, 32, 33]. Similar results can be observed in MIMO channel. Therefore, in the rest of the simulation results, we only consider CMA as an example of fourth order statistic based blind channel equalization cost. We also compare the CO algorithms with BGD-CMA in Tables I and