A Convex Relaxation Approach to Higher-Order Statistical Methods for Signal Recovery
Abstract
In this work, we investigate an efficient numerical approach to higher-order statistical methods for blind and semi-blind signal recovery over non-ideal channels. We develop numerical algorithms based on convex optimization relaxation for the minimization of higher-order statistical cost functions. The new formulation through convex relaxation overcomes the local-convergence problem of existing gradient-descent-based algorithms and applies to several well-known cost functions for effective blind signal recovery, including blind equalization and blind source separation in both single-input-single-output (SISO) and multi-input-multi-output (MIMO) systems. We also propose a fourth-order pilot-based cost function that benefits from this approach. The simulation results demonstrate that our approach is suitable for short-length packet data transmission using only a few pilot symbols.
Keywords: Blind signal recovery, semi-blind channel equalization, convex optimization, semidefinite programming, rank-1 approximation.
I. Introduction
Blind signal recovery is a well-known problem in signal processing and communications. With the goal of recovering unknown input signals to unknown linear systems based on the system output signals, this problem typically manifests itself either as blind equalization or as blind source separation. In SISO systems or single-input-multiple-output (SIMO) systems, the goal of channel equalization is to undo the inter-symbol interference (ISI). In MIMO systems or in source separation, the objective is to mitigate both the ISI and the inter-channel interference (ICI) in order to successfully separate the different sources. The advantage lies in the fact that blind algorithms do not require allocation of extra bandwidth to training signals, whereas semi-blind algorithms can substantially reduce the length of training signals. System implementations based on blind algorithms have appeared in downstream cable modems [1] and HDTV receivers [2, 3, 4].
The specially designed cost functions for blind signal recovery typically pose challenges of global convergence and convergence speed. In the literature, globally convergent algorithms for blind equalization and source separation do exist, e.g., linear programming equalization [5] and space-time signal detection [6]. However, without limiting the system input signals to the special QAM class, most blind algorithms are based on non-quadratic and non-convex costs that utilize higher-order statistics of the received signals. Well-known algorithms of this type include the constant modulus algorithm (CMA) [7, 8], the Shalvi-Weinstein algorithm (SWA) [9], and the minimum entropy deconvolution (MED) algorithm [10, 11]. In fact, without modifications, these stochastic gradient descent (SGD) algorithms typically admit multiple local minima [12, 13] and require large amounts of data and many iterations to converge.
To improve the convergence speed of blind channel equalization, batch algorithms can effectively utilize the statistical information of the channel output signals and significantly shorten the convergence time [14, 15, 16, 17, 18, 19]. Unfortunately, local convergence remains a major obstacle that requires good initialization strategies. Another approach to mitigating the local convergence problem is relaxation by “lifting” the receiver parameters to a new parameterization that covers a larger parameter space over which the cost function is convex [20, 21]. Once the convex problem is uniquely solved, the solution is mapped back to the original restrictive parameter space. For example, the authors of [20] relaxed the outer-product matrix of the equalizer parameter space from a rank-1 matrix to an unrestricted square matrix over which the CMA cost becomes quadratic and can be solved globally via least squares (LS). From the relaxed solution, a series of linear operations maps the solution back into the right equalizer coefficient space. In [21], the authors proposed a similar relaxation with a more elegant algorithm that modifies the CMA cost function into a special sum. The modified CMA cost is first solved over a more restrictive positive semidefinite matrix space. The resulting convex optimization problem is then solved efficiently via semidefinite programming (SDP). Iterative mappings must also follow the SDP solution in order to determine the equalizer parameters [21].
In this work, we study means to improve the convergence of general blind and semi-blind equalization algorithms by generalizing the cost modification principle presented in [21]. We can therefore develop a more general batch implementation of well-known blind equalization and source separation algorithms that minimize cost functions involving the fourth-order statistics of system output signals. More specifically, the new implementation leverages a convex optimization relaxation that can be applied to the CMA, SWA, and MED algorithms. We show that our convex formulation requires fewer resources and improves the efficiency of the convex formulation in [21]. We further generalize the formulation to accommodate semi-blind algorithms when a small number of training symbols is available to assist receiver signal recovery and separation. Our proposed method overcomes the local convergence problem, which is a drawback of traditional gradient descent implementations.
The rest of the paper is organized as follows. In Section II, we introduce the MIMO system model. Section III discusses batch MIMO blind source recovery using the CMA cost as an example of real-valued fourth-order functions. Section IV presents a convex formulation of the fourth-order function and an algorithm to find its global minima. In Section V, we extend the formulation to other blind algorithms based on fourth-order costs. A semi-blind algorithm using a fourth-order function is proposed in Section VI. In Section VII, we present our simulation results, and Section VIII contains some concluding remarks. In the Appendices, we include the formulations that convert the cross-correlation cost and the training-based cost into real-valued fourth-order functions.
II. Problem Statement and System Model
We consider a baseband MIMO system model with transmit antennas and receive antennas. This MIMO model covers SIMO and SISO channels as special cases. Let denote the random independent data symbols for transmit antenna at time . The symbols typically belong to a finite set . Consistent with practical QAM systems, we assume that the symbols are mutually independent and identically distributed (i.i.d.) with variance and fourth-order kurtosis .
The data streams are transmitted over multipath MIMO channels denoted by impulse responses , , , , with a delay spread of samples. Assuming the MIMO channel is corrupted by i.i.d. additive white Gaussian noise (AWGN) with zero mean and variance , the output of the th receiver can be expressed as
(1) 
The goal of the MIMO receiver is to recover data stream , from the channel output . Following most of the works in this area, we shall focus only on linear FIR equalizers. Given subchannel outputs, we apply a MIMO equalizer with parameter vector
where is the order of each individual equalizer and , . If the channel is flat fading (i.e., non-frequency-selective), then we have a degenerate problem of blind source separation for which . The FIR blind MIMO equalizer then degenerates into a linear source separation matrix .
The linear receiver output has parallel output streams denoted by
(2) 
where denotes convolution. Our objective is to optimize the linear receiver parameters such that the source data symbols are recovered by without interference, as follows
Note that is a phase ambiguity inherent to blind signal recovery that cannot be resolved without additional information, and is the output delay of the th recovered signal, which does not affect the receiver performance. Upon optimum convergence, a simple memoryless decision device can be applied, and at high signal-to-noise ratio (SNR), we have
For convenience of notation, we further define
With these notations, we can write
(3)  
(4) 
In MIMO blind equalization, we would like to find equalizer parameter vectors such that the combined (channel-equalizer) response is free of inter-symbol and inter-channel interference
(5) 
III. Constant Modulus Algorithm for MIMO Equalization
In this section, we provide a real-valued representation of the conventional batch CMA cost. This representation is one of the keys to reducing the parameter space compared with the work of [21], in which the formulation is based on complex values. The same representation also applies to other blind channel equalization and blind source separation costs, as shown in later sections.
III-A. CMA for single source recovery
For the sake of clarity, we discuss the CMA cost as a representative blind cost. To recover a particular source, the CMA cost function for the th equalizer output sequence is defined as
(6) 
where . The CMA cost can be represented as a function of the equalizer coefficients and the channel output statistics [22]. Since the formulation here deals with a single source, the source index is omitted.
Let and denote the real and imaginary parts of . Now define , and . We have the following relationship
(7)  
(8) 
As a result, we have
(9) 
By denoting the rank-1 matrix and , we have
(10) 
Note that both and are symmetric of dimension with . In other words, and can be mapped by the svec operator to a lower-dimensional subspace after discarding redundant entries. Furthermore, we sort the entries in the usual lexicographic order. We define the following operators and vectors:

For a symmetric matrix , we define the svec operator and its reverse operator as
(11) 
For a vector , we define as the vector whose entries are all second-order (quadratic) terms, sorted in the usual lexicographic order.
(12) There is a one-to-one mapping between the elements of and , as consists of all upper-triangular elements of . The reverse operator, however, needs further consideration. Given an arbitrary , we can form the corresponding , which is not necessarily a rank-1 matrix. Therefore, we define the reverse operator with approximation as follows: first, use to form the corresponding ; next, find the rank-1 approximation of from its maximum eigenvalue and the corresponding eigenvector . The resulting matrix is .
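As a minimal numerical sketch of these operators (the names `svec`, `smat`, and `rank1_reverse` are ours, and we omit the sqrt(2) off-diagonal scaling used by some SDP conventions, since the text only discards redundant entries):

```python
import numpy as np

def svec(M):
    """Stack the upper-triangular entries of a symmetric matrix
    into a vector, in lexicographic order."""
    n = M.shape[0]
    return np.array([M[i, j] for i in range(n) for j in range(i, n)])

def smat(s, n):
    """Inverse of svec: rebuild the symmetric n x n matrix."""
    M = np.zeros((n, n))
    k = 0
    for i in range(n):
        for j in range(i, n):
            M[i, j] = M[j, i] = s[k]
            k += 1
    return M

def rank1_reverse(s, n):
    """Reverse map with approximation: form the symmetric matrix,
    then keep only its dominant eigenpair (rank-1 approximation)."""
    M = smat(s, n)
    w, V = np.linalg.eigh(M)      # eigenvalues in ascending order
    lam, v = w[-1], V[:, -1]      # dominant eigenpair
    return lam * np.outer(v, v)
```

When the input vector already corresponds to a rank-1 matrix, the rank-1 approximation is exact, which is what the postprocessing step later relies on.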
With the above definitions, the output power can be rewritten as
(13) 
Similarly, the fourth-order moment of the equalizer output is
(14)  
Therefore, the CMA cost can be written as a fourth-order function of or a quadratic function of as
(15) 
The equalizer can be found by minimizing the polynomial function .
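To make the lifting concrete, the following real-valued toy (our own construction, with an assumed dispersion constant R = 1 and Gaussian placeholder data rather than the paper's complex QAM setting) checks numerically that the batch CMA cost equals a quadratic form in the lifted variable z = vec(w wᵀ):

```python
import numpy as np

rng = np.random.default_rng(0)
L = 4                                 # toy equalizer length (assumed)
R = 1.0                               # dispersion constant (assumed)
X = rng.standard_normal((1000, L))    # received-signal snapshots
w = rng.standard_normal(L)

# Direct batch CMA cost: mean((y^2 - R)^2) with y = w^T x.
y = X @ w
J_direct = np.mean((y**2 - R)**2)

# Lifted form: y^2 = <w w^T, x x^T>, so the cost is quadratic in
# z = vec(w w^T):  J = z^T C z - 2 c^T z + R^2.
Z = np.einsum('ni,nj->nij', X, X).reshape(len(X), -1)  # vec(x x^T)
z = np.outer(w, w).ravel()
C = Z.T @ Z / len(X)       # fourth-order statistics of the data
c = R * Z.mean(axis=0)     # second-order statistics of the data
J_lifted = z @ C @ z - 2 * c @ z + R**2

assert np.isclose(J_direct, J_lifted)
```

The statistics matrices C and c depend only on the received data, so they are computed once per batch; the nonconvexity is confined to the rank-1 structure of z.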
III-B. Multi-source recovery
To recover multiple input source signals of a MIMO channel, we must form several single-stream receivers that generate multiple output streams representing distinct source signals . Typically, a blind algorithm is adept at recovering one of the possible source signals. However, because of the inherent properties of blind equalization, the receiver cannot ascertain a priori which signal will be recovered. As a result, multiple single-stream receivers may recover duplicate signals and miss some critical source signals. To avoid duplicate convergence, proper initialization of each may help lead to diversified convergence as needed. However, there is unfortunately no guarantee that different initial values of will lead to . In order to ensure that different receiver vectors extract distinct signals, common criteria rely on the prior information that the source signals are i.i.d. and mutually independent.
Let denote a blind cost function, for example CMA, SWA, or MED, to recover the source . The cost function for multiple blind source recovery (BSR) is [23, 24]
(16) 
where is a positive scalar and accounts for the correlation between the receiver outputs and . The minimization of ensures that the equalizers converge to different solutions corresponding to different sources. One can recover all sources either simultaneously or sequentially; the sequential approach reduces the parameter size.
In sequential source recovery, we start by extracting the first source through minimizing a single CMA cost. Assuming the sources up to are equalized and separated, we can minimize the following cost to separate the source .
(17) 
In this work, we consider the sequential approach. can be chosen as the sum of cross-cumulants [24] or as the sum of cross-correlations [23]. Since the use of cross-cumulants is intended for coarse-step separation [24] and may lead to poor convergence, we use the cross-correlation here:
(18) 
where is an integer parameter to be chosen during system design. can also be written as a fourth-order function of or a second-order function of . Noting that , for , are already known, we can write
(19) 
where is a vector formed from the previously calculated equalizers and the received-signal statistics. The detailed calculation of is given in Appendix A.
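As a hedged illustration of why a cross-correlation penalty discourages duplicate convergence, the toy below uses our own definition of the penalty (squared cross-correlations over a small lag range; the paper's exact lag range and normalization are design parameters we do not know) and compares a duplicated stream against an independent one:

```python
import numpy as np

def xcorr_penalty(y_new, y_prev_list, delta=2):
    """Sum of squared sample cross-correlations over lags
    -delta..delta between a candidate output stream and the
    previously extracted streams (illustrative definition)."""
    N = len(y_new)
    total = 0.0
    for y_prev in y_prev_list:
        for d in range(-delta, delta + 1):
            if d >= 0:
                a, b = y_new[d:], y_prev[:N - d]
            else:
                a, b = y_new[:N + d], y_prev[-d:]
            total += (np.dot(a, b) / len(a))**2
    return total

rng = np.random.default_rng(1)
s1 = rng.choice([-1.0, 1.0], 5000)   # already-recovered stream
s2 = rng.choice([-1.0, 1.0], 5000)   # independent candidate

# A duplicate of a recovered stream is penalized heavily; an
# independent stream incurs only a small finite-sample residual.
assert xcorr_penalty(s1, [s1]) > 20 * xcorr_penalty(s2, [s1])
```

Adding such a term to the blind cost of the next equalizer steers its minimization away from solutions that reproduce already-extracted sources.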
Eventually, when using the CMA and the cross-correlation, the cost in (17) can be written as a fourth-order cost function of the equalizer parameter with zero odd-order coefficients, similar to :
(20) 
In the following section, we discuss a method to find the global minima of functions of this type.
IV. Semidefinite Programming Approach to Minimizing a Fourth-Order Polynomial
IV-A. General formulation
A general formulation for finding the global minimum of a nonnegative polynomial is discussed in [25, 26]. Here, we restrict the formulation to fourth-order polynomials with . The minimization of is equivalent to
max  (21)  
s.t. 
This optimization is equivalent to lifting the horizontal hyperplane created by until it lies immediately beneath the hypersurface . The points at which the hyperplane touches the hypersurface are the global minima of . This problem is convex, with a convex cost and linear constraints in . The problem, however, is still hard to solve since the number of constraints is infinite. Although it is hard to find the optimal solution to this problem, we can modify and simplify it by narrowing down the search space. Since is nonnegative, we look for a representation of as a sum of squares. Following the work of [21], we define two convex cones of fourth-order polynomials of , and :
We can rewrite the optimization problem in (21) in terms of as
max  (22)  
s.t. 
Since the problem in (22) is hard to solve, we can follow [21] and narrow down the feasible set from to the more restrictive set
max  (23)  
s.t. 
This problem can be cast as a convex semidefinite program and solved efficiently [27].
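A univariate toy makes the sum-of-squares restriction concrete: writing f(x) − γ as zᵀQz with z = [1, x, x²] leaves one free Gram-matrix parameter, and the largest γ for which some positive semidefinite Q exists recovers the global minimum. The sketch below is our own example with f(x) = x⁴ − 2x² + 1 = (x² − 1)²; a real solver would search over Q with an SDP rather than a one-dimensional grid:

```python
import numpy as np

def sos_certificate(gamma, ts=np.linspace(-3, 3, 601)):
    """Search for a PSD Gram matrix Q with f(x) - gamma = z^T Q z,
    z = [1, x, x^2], for f(x) = x^4 - 2x^2 + 1.  Matching
    coefficients fixes everything except how the x^2 coefficient
    splits between Q[0,2] and Q[1,1], i.e. one free parameter t."""
    for t in ts:
        Q = np.array([[1.0 - gamma, 0.0,        t],
                      [0.0,         -2.0 - 2*t, 0.0],
                      [t,           0.0,        1.0]])
        if np.linalg.eigvalsh(Q)[0] >= -1e-9:   # PSD check
            return Q
    return None

# gamma = 0 is the global minimum of f: a certificate exists there,
# but not for any larger gamma.
assert sos_certificate(0.0) is not None
assert sos_certificate(0.1) is None
```

For this f the relaxation is tight; Section IV-B shows that tightness also holds for the even fourth-order costs used in this paper.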
We recall the following lemma, which is a simplified version of the results in [27].
Lemma 1.
Given any fourth-order polynomial , the following relation holds:
(24) 
where and , , .
Therefore, we obtain an equivalent formulation of (23) as
max  
s.t.  
and  (25) 
The problem can be solved efficiently using semidefinite programming [28]. Solving (25) for , , we obtain the global solution for this problem.
In general, the solutions of (22) and (23) are not the same. Let and denote the solutions of (22) and (23), respectively. Since (23) is a restricted version of (22), we have . Nevertheless, simulation tests in [29] showed that the solutions of (22) and (23) are nearly identical for arbitrary polynomial cost functions . In the next subsection, we show that they are actually identical in the case of the CMA cost.
IV-B. Minimization of fourth-order cost functions without odd-order coefficients
We now specialize the algorithm to the CMA cost and therefore focus on fourth-order functions without odd-order terms.
Define
The CMA cost and the other costs mentioned in this work belong to the function set . The minimization problem is simply
max  (26)  
s.t. 
As the problem in (26) remains hard to solve, we reduce the feasible set from to as
max  (27)  
s.t. 
We can arrive at the following proposition:
Proof.
Thus, unlike the general formulation, the restrictive problem in (27) does not introduce any gap. We can now refine Lemma 1 into the following proposition:
Proposition 3.
Given any fourth-order polynomial without odd-order coefficients, the following relationship holds:
(28) 
where , , and .
Proof.
From Lemma 1, we find such that and . We partition as
(29) 
where corresponds to the coefficients for the product terms between entries of and . Consequently, the function can be rewritten as
(30) 
Since does not have odd-order terms, we have
(31)  
(32) 
We can rewrite , where
and
This is equivalent to
where since by definition. ∎
IV-C. Semidefinite programming solution
Following Proposition 3, the optimization problem in (27) is equivalent to solving
max  
s.t.  
and  (33) 
where , i.e., , and is defined as in (28). Here, corresponds to the coefficients for the product terms between entries of and and . The optimization in (33) can be recast into
(34) 
Similar to [21], we let and define as the set of all distinct 4-tuples that are permutations of . Now define a subset of as
can be written in two ways as follows
(35) 
and
(36) 
where and represent the entries of and , respectively, for the product term between entries and in . Similarly, and denote the entries of and , respectively, for the product term in .
The optimization in (34) can be rewritten as
s.t.  (37)  
(38) 
This problem is convex with linear and semidefinite constraints. Therefore, it can be solved efficiently using available optimization tools such as SeDuMi or SDPT3 [31].
The proposed convex optimization is a real-valued, simplified version of the complex-valued problem in [21]. By modifying the approach given in [21], our method gains two numerical benefits that lead to lower complexity. First, it exploits the special characteristic of fourth-order statistical blind costs, namely their zero odd-order coefficients, which reduces the unknown parameter space. Second, the real-valued formulation captures all the information necessary for equalization; because the matrix does not increase in size under this formulation, the number of parameters is further reduced by half.
IvD Postprocessing
After reaching the global solution of (25) via SDP, the result must be translated and mapped back to the original blind receiver parameter space. Here, we describe such a postprocessing procedure.
If the global optimum is achieved, there exists such that
(39) 
where is the symmetric matrix formed by the elements of inside . Since is positive semidefinite, the solution must lie in the null space of . If the null space of has dimension 1, then is obtained uniquely. Otherwise, we must look for inside the eigenspace corresponding to the smallest eigenvalues of . Furthermore, the last element of must equal 1. We now describe an iterative algorithm for real-valued data, modified from the one proposed in [21]:

Step 1. Initialization: pick a random . Find the null space of by forming , which consists of the eigenvectors of whose eigenvalues are below a set threshold .

Step 2. For a , find the linear projection of onto :
(40) 
Step 3. Normalize the last element to 1. Rescale as
(41) where is the last element of . The resulting vector is now in the null space of and its last element is 1. Let be the vector consisting of the first elements of .

Step 4. Calculate ; note that this is an approximation operation related to the rank-1 approximation. Form .

Step 5. Repeat from Step 2 until convergence. The equalizer is contained in as
Note that in Step 3 of this existing postprocessing, the last element of must be nonzero. In practice, this condition may not always hold. To overcome this weakness, we can instead require that the equalizer output have the same power as the transmitted symbols. We introduce an additional gain on (or on ) such that produces an output with the same power as the transmitted symbols,
(42) 
Hence,
(43) 
As a result, we obtain the following new, more robust postprocessing algorithm:

Step 1. Initialization: pick a random . Find the null space of by forming the matrix , which consists of the eigenvectors of corresponding to eigenvalues less than the threshold .

Step 2. For a given , find , the linear projection of onto , as in (40). Let be the vector consisting of the first elements of .

Step 3. Calculate .

Step 4. Find using (43). The new equalizer is , and the new .

Step 5. Repeat from Step 2 until convergence.
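The projection / rank-1 loop can be sketched numerically. The setup below is our own synthetic construction: we build a positive semidefinite G whose one-dimensional null space contains the lifted true vector (the "dimension 1" case noted above, where a single pass already recovers the solution); with a larger null space the same Steps 2-5 are repeated until convergence:

```python
import numpy as np

def q_of(w):
    """Quadratic monomials w_i * w_j (i <= j): unscaled svec of w w^T."""
    L = len(w)
    return np.array([w[i] * w[j] for i in range(L) for j in range(i, L)])

def smat(s, n):
    """Rebuild the symmetric matrix from its stacked upper triangle."""
    M = np.zeros((n, n)); k = 0
    for i in range(n):
        for j in range(i, n):
            M[i, j] = M[j, i] = s[k]; k += 1
    return M

rng = np.random.default_rng(2)
L = 3
w_true = rng.standard_normal(L)
x_true = np.append(q_of(w_true), 1.0)      # lifted solution, last entry 1

# Synthetic PSD G with null(G) = span(x_true):
v = x_true / np.linalg.norm(x_true)
G = np.eye(x_true.size) - np.outer(v, v)
V = v[:, None]                             # null-space basis (Step 1)

x = rng.standard_normal(x_true.size)       # random initialization
for _ in range(3):
    x = V @ (V.T @ x)                      # Step 2: project onto null(G)
    x = x / x[-1]                          # Step 3: last element -> 1
    lam, U = np.linalg.eigh(smat(x[:-1], L))   # Step 4: rank-1 approx.
    w = np.sqrt(max(lam[-1], 0.0)) * U[:, -1]
    x = np.append(q_of(w), 1.0)            # re-lift and repeat (Step 5)

assert np.allclose(np.abs(w), np.abs(w_true))  # recovered up to sign
```

The sign ambiguity in the last line is the real-valued counterpart of the phase ambiguity noted in Section II.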
IV-E. Complexity discussion
The complexity of the system depends mostly on the size of . It affects the size of the kurtosis and covariance matrices, the number of variables input to the SDP solver, and the size of the vectors and matrices in the postprocessing step. As with many other batch algorithms, the complexity of forming the statistical matrices is . The SDP solver using the interior point method has a worst-case complexity of , where is the number of variables. As the size of the matrix is , the SDP solver requires arithmetic operations. For the postprocessing techniques, the dominant operation is the rank-1 approximation, which is realized by singular value decomposition with complexity . We further discuss the SDP solver complexity since it dominates. Compared with the work in [21], the worst-case complexity in big-O notation does not change. In practice, however, the convex optimization in [21] estimates complex equalizer taps, whereas the proposed CO-CMA method estimates real equalization taps, resulting in a reduction of the worst-case complexity. That is equivalent to reducing the complexity of the SDP solver by in the best case, which is very significant in practice.
In [21], the authors noted the sparsity of in the formulation of the CMA cost, where the first- and third-order parts are zero, so that a specially tailored SDP solver is needed. With our formulation, we show that such a solver is not necessary. By omitting the odd-order parts, the new formulation further reduces the number of variables from to . For , this is equivalent to a further complexity reduction.
Another advantage of our formulation over that of [21] is that the vector comprises the real and imaginary parts of the equalizer, whereas the corresponding component in [21] is made of the equalizer vector and its Hermitian. In [21], this dependency is not imposed in the postprocessing step, which may require more iterations to converge. In fact, our simulations show that at most 3 iterations are required to converge, whereas the work in [21] requires 5 iterations.
Compared with traditional gradient descent algorithms, the convex optimization approach, which uses an SDP solver for the fourth-order cost, provides much better performance at the expense of complexity. Gradient descent algorithms such as BGD-CMA or OS-CMA have low complexity but suffer from local minima and, in many cases, provide poor performance. In contrast, the convex optimization approach at least finds a minimum that achieves good performance. Due to its large computational complexity, the convex optimization approach is not yet viable for real-time applications. Nevertheless, our work achieves an incremental complexity reduction for the CO-CMA method.
V. Generalization to Other Algorithms
The principle of convex optimization relaxation and the accompanying iterative mapping procedure can be generalized beyond the CMA. In fact, several well-known blind algorithms based on fourth-order statistics can also be recast as convex optimization algorithms.
V-A. Shalvi-Weinstein algorithm
Closely related to the CMA is the SWA [9]. This algorithm is also based on fourth-order statistics and minimizes the following cost
(44) 
Here, the source index is omitted. For , the SWA cost becomes the CMA cost up to a constant difference.
Using the real-valued vector notation, we can rewrite the SWA cost as
(45)  
Clearly, the cost function is a fourth-order function of the equalizer parameter vector , similar to the CMA cost. Hence, we can apply the proposed convex optimization method to this cost function for convergence enhancement.
V-B. MED cost and the modified cost for convex optimization
Another related algorithm is the MED algorithm, originally developed by Donoho in [11], formulated as
Maximize  
subject to 
Taking into account the negative kurtosis of the communication signals, i.e., , we form an equivalent cost
(47) 
where is the Lagrange multiplier for the power constraint. The cost can be written more compactly as
(48) 
Applying the real-valued formulation, we arrive at another fourth-order blind cost function
Therefore, we can apply the proposed convex optimization method to the MED cost for convergence enhancement.
VI. Application to Semi-blind Equalization and Signal Recovery
In practice, pilot symbols are often available for channel estimation and equalization. Semi-blind equalization and signal recovery are desirable when the linear system is too complex for the available pilot symbols to fully estimate or equalize. In such cases, it makes better sense to integrate the pilot symbols with a blind recovery criterion into a semi-blind equalizer. In this section, we present a semi-blind algorithm that utilizes the available pilot samples to enhance the equalization performance using the aforementioned convex optimization formulation.
Our basic principle is to construct a special semi-blind cost for minimization. Typically, such cost functions are formed as a linear combination of a blind criterion and a pilot-based cost in the form of
(49) 
Here the scalar depends on a number of factors, such as the pilot sequence length, the number of sources, and the source signal constellations.
In order to apply the same convex optimization principle, we aim to develop a semi-blind cost function that is also a fourth-order function of the receiver parameters and has only even-order coefficients. Here, for simplicity, we discuss the use of pilots in single-source recovery mode (SISO or SIMO channel model), although our method can easily be generalized to multiple sources.
For a single signal source, we can omit the source index. Without loss of generality, let the transmitted symbols , be the training sequence. In the absence of noise and under perfect channel equalization, we have the following linear relationship between the training symbols and the received signals
(50) 
where is the decision delay.
We convert the complex data representation into a real-valued representation by defining
(51) 
and
(52) 
The relationship between the training vector and the received signal vector can be equivalently written in real-valued form as follows
(53) 
This relationship can also be written as a fourth-order function of through the following steps.
First, under the noise-free condition, if the channel can be completely equalized, then the following criterion can be used for channel equalization
(54) 
Therefore, the equalizer should minimize the following pilot-based fourth-order cost function
(55) 
where , , and are an appropriately defined matrix, vector, and scalar corresponding to the fourth-, second-, and zeroth-order coefficients of , respectively. Their details are presented in Appendix B.
This cost can be combined with one of the blind costs to form a semi-blind cost that can be solved by the convex optimization
(56) 
where
(57)  
(58)  
(59) 
where , , and are the matrix, vector, and scalar corresponding to the fourth-, second-, and zeroth-order coefficients of the aforementioned blind costs, respectively.
VII. Simulation Results
We now present numerical results to illustrate the performance of our new batch algorithms using convex optimization (CO) for the CMA, SWA, and MED costs and the fourth-order semi-blind algorithm. In the batch algorithms, the expectations are replaced by the average over the equalizer input data samples for each antenna. To show the advantages of our formulation, we compare our algorithms with other blind/semi-blind algorithms using the gradient descent method. For the best performance, batch gradient descent algorithms are used and marked as BGD.
The semi-blind algorithms are marked as SB. The semi-blind cost for comparison is built from the CMA cost for the blind part and the LS cost for the pilots in (49):
(60) 
Here, the decision delay is chosen optimally.
We compare the algorithm performance in terms of the final receiver ISI. Quantitatively, the equalization effectiveness is measured by the normalized ISI for the th equalizer output, defined as
(61) 
For multiple source signals, we also define the sum normalized ISI as
(62) 
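Our reading of the normalized ISI metric can be sketched as follows (the combined channel-equalizer response and the exact normalization are assumptions on our part: interference power outside the dominant tap, relative to that tap, in dB):

```python
import numpy as np

def normalized_isi_db(s):
    """Normalized ISI of a combined channel-equalizer response s:
    total power outside the dominant tap relative to that tap."""
    p = np.abs(np.asarray(s))**2
    with np.errstate(divide='ignore'):
        return 10.0 * np.log10((p.sum() - p.max()) / p.max())

# Perfect zero-forcing response (a single spike): no residual ISI.
assert normalized_isi_db([0, 0, 1.0, 0]) == -np.inf
# One residual tap at 10% amplitude -> 1% power -> -20 dB.
assert np.isclose(normalized_isi_db([1.0, 0.1]), -20.0)
```

Summing this quantity over the equalizer outputs gives the sum normalized ISI used for the multi-source comparisons.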
Our study covers several systems: a SISO system with multipath, and a MIMO system with multipath and a mixing matrix. The convex optimization is carried out using an available optimization tool [31]. The results are averaged over 500 Monte Carlo realizations of inputs and channels, unless otherwise specified.
VII-A. SISO channel
We first test our convex optimization using the CMA, SWA, and MED costs on a SISO channel and compare with BGD-CMA. The QPSK and 16-QAM inputs are passed through 3-tap Rayleigh channels with a unit power profile, and the received signals are equalized by -tap equalizers. Tables I and II compare the ISI performance of the different algorithms for QPSK and 16-QAM channel inputs, respectively. For the convex optimization, SDPT3 with the interior point method is used. For postprocessing, the threshold is set at . CO-CMA uses both the postprocessing technique of [21] (marked as "pp1") and the newly proposed one (marked as "pp2"). The other CO algorithms apply only the proposed postprocessing technique. The data length is chosen to ensure good convergence.
ISI (dB)  No noise (SNR=)  SNR=dB 

Optimum    
CO-CMA (pp1)  
CO-CMA (pp2)  
CO-SWA ()  
CO-MED ()  
BGD-CMA(1)  
BGD-CMA(3)
ISI (dB)  No noise (SNR=)  SNR=dB 

Optimum    
CO-CMA (pp1)  
CO-CMA (pp2)  
CO-SWA ()  
CO-MED ()  
BGD-CMA(1)  
BGD-CMA(3)
The simulations show no significant performance difference among the algorithms using convex optimization. Under good conditions, i.e., high SNR and sufficiently long data records, the convex optimization could find the global minima in each case. In fact, the equalization objective does not depend on the particular cost. The differences among the costs, for given parameter choices, lie mainly in the convergence behavior of the gradient descent implementations, as confirmed by many works [17, 22, 32, 33]. Similar results can be observed for the MIMO channel. Therefore, in the rest of the simulation results, we consider only the CMA as an example of a fourth-order-statistics-based blind channel equalization cost. We also compare the CO algorithms with BGD-CMA in Tables I and II.