
# Lower Bounds and Approximations for the Information Rate of the ISI Channel

Yair Carmon and Shlomo Shamai, Technion, Israel Institute of Technology. Emails: yairc@tx.technion.ac.il, sshlomo@ee.technion.ac.il
###### Abstract

We consider the discrete-time intersymbol interference (ISI) channel model, with additive Gaussian noise and fixed i.i.d. inputs. In this setting, we investigate the expression put forth by Shamai and Laroia as a conjectured lower bound for the input-output mutual information after application of a MMSE-DFE receiver. A low-SNR expansion is used to prove that the conjectured bound does not hold under general conditions, and to characterize inputs for which it is particularly ill-suited. One such input is used to construct a counterexample, indicating that the Shamai-Laroia expression does not always bound even the achievable rate of the channel, thus excluding a natural relaxation of the original conjectured bound. However, this relaxed bound is then shown to hold for any finite entropy input and ISI channel, when the SNR is sufficiently high. Finally, new simple bounds for the achievable rate are proven, and compared to other known bounds. Information-Estimation relations and estimation-theoretic bounds play a key role in establishing our results.

## I Introduction and preliminaries

The discrete-time inter-symbol interference (ISI) communication channel model is given by,

$$y_k = \sum_{i=0}^{L-1} h_i x_{k-i} + n_k \qquad (1)$$

where $x = \{x_k\}$ is an independent identically distributed (i.i.d.) channel input sequence with average power $P_x$, and $y = \{y_k\}$ is the channel output sequence (we use the standard notation $x_a^b$ for the sequence $x_a,\ldots,x_b$, with the natural interpretation when $a=-\infty$ and/or $b=\infty$). The noise sequence $\{n_k\}$ is assumed to be i.i.d. zero-mean Gaussian, independent of the inputs, with average power $N_0$, and $h_0,\ldots,h_{L-1}$ are the ISI channel coefficients. We let $H(\theta) = \sum_{i=0}^{L-1} h_i e^{-ji\theta}$ denote the channel transfer function. For simplicity we assume that the input, ISI coefficients and noise are real, but all the results reported in this paper extend straightforwardly to a complex setting.

ISI is common in a wide variety of digital communication applications, and thus holds much interest from both practical and theoretical perspectives. In particular, evaluation of the maximum achievable rate of reliable communication sheds light on the fundamental loss caused by ISI, and aids in the design of coded communication systems. Since this model is ergodic, the rate of reliable communication is given by [1],

$$I = \lim_{N\to\infty}\frac{1}{2N+1}\,I\!\left(x_{-N}^{N};\,y_{-N}^{N}\right) \qquad (2)$$

When the input distribution is Gaussian, a closed form expression for $I$ is readily derived by transforming the problem into parallel channels (cf. [2]), and is given by

$$I_g = \frac{1}{2\pi}\int_{-\pi}^{\pi}\log\!\left(1+\frac{P_x}{N_0}|H(\theta)|^2\right)d\theta \qquad (3)$$

This rate is also the maximum information rate attainable by any i.i.d. input process — i.e. the i.i.d. channel capacity. However, in practical communication systems the channel inputs must take values from a finite alphabet, commonly referred to as a signal constellation. In this case no closed form expression for $I$ is known. In lieu of such an expression, $I$ can be approximated or bounded numerically, mainly by using simulation-based techniques [3, 4, 5, 6, 7, 8].
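As an illustration, the Gaussian-input rate (3) is a one-dimensional integral that is easy to evaluate numerically. The sketch below is not from the paper; the channel taps and SNR are illustrative assumptions, and the integral is approximated by averaging over a dense frequency grid:

```python
import numpy as np

# Illustrative 3-tap ISI channel and SNR (assumptions, not the paper's example)
h = np.array([0.407, 0.815, 0.407])
snr = 4.0  # Px / N0

# |H(theta)|^2 on a dense grid over [-pi, pi)
theta = np.linspace(-np.pi, np.pi, 1 << 16, endpoint=False)
H2 = np.abs(np.exp(-1j * np.outer(theta, np.arange(len(h)))) @ h) ** 2

# Eq. (3): I_g = (1/2pi) * integral of log(1 + snr*|H|^2), in nats,
# approximated by the grid average
I_g = np.mean(np.log1p(snr * H2))
```

By concavity of the logarithm, $I_g$ never exceeds the rate of a flat channel with the same total gain, which gives a quick sanity check on the computation.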

Simple closed form bounds on $I$ present an alternative to numerical approximation. It is straightforward to show that (see [9]),

$$I \ge I\!\left(x_0;\,\sum_{k}a_k y_{-k}\,\Big|\,x_{-\infty}^{-1}\right) \qquad (4)$$

where $\{a_k\}$ is an arbitrary set of coefficients. Substituting for $y_k$ according to the channel model (1), this bound can be simplified to

$$I \ge I\!\left(x_0;\,x_0+\sum_{k\ge1}\alpha_k x_k + m\right) \qquad (5)$$

with the coefficients $\{\alpha_k\}$ determined by $\{a_k\}$ and the channel, and $m$ a zero-mean Gaussian RV, independent of $\{x_k\}$, whose variance is likewise determined by $\{a_k\}$ and the channel. Different choices of coefficients $\{a_k\}$ provide different bounds for $I$. One appealing choice is the taps of the sample whitened matched filter (SWMF), for which $\alpha_k = 0$ for every $k \ge 1$ [10]. This choice yields the Shamai-Ozarow-Wyner bound [11]:

$$I \ge I_{\mathrm{SOW}} \triangleq I_x\!\left(\mathrm{SNR}_{\text{ZF-DFE}}\right) \qquad (6)$$

where

$$I_x(\gamma) \triangleq I\!\left(x_0;\,\sqrt{\gamma N_0/P_x}\,x_0 + n_0\right) \qquad (7)$$

is the input-output mutual information of a scalar additive Gaussian noise channel at SNR $\gamma$, with input distributed as a single ISI channel input. $\mathrm{SNR}_{\text{ZF-DFE}}$ stands for the output SNR of the unbiased zero-forcing decision feedback equalizer (ZF-DFE), which uses the SWMF as its front-end filter [12], and is given by

$$\mathrm{SNR}_{\text{ZF-DFE}} = \frac{P_x}{N_0}\exp\left\{\frac{1}{2\pi}\int_{-\pi}^{\pi}\log\!\left(|H(\theta)|^2\right)d\theta\right\} \qquad (8)$$

Since evaluation of $I_x(\gamma)$ and $\mathrm{SNR}_{\text{ZF-DFE}}$ amounts to simple one-dimensional integration, the Shamai-Ozarow-Wyner bound can be easily computed and analyzed. However, it is known to be quite loose at medium and low SNRs.
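Both ingredients of the Shamai-Ozarow-Wyner bound can be sketched in a few lines. The snippet below (illustrative, not from the paper: the taps, SNR, and BPSK input are assumptions) evaluates (8) by a frequency-grid average and $I_x(\gamma)$ for BPSK by Gauss-Hermite quadrature:

```python
import numpy as np

h = np.array([0.407, 0.815, 0.407])  # illustrative ISI taps (an assumption)
snr = 4.0                            # Px / N0

theta = np.linspace(-np.pi, np.pi, 1 << 16, endpoint=False)
H2 = np.abs(np.exp(-1j * np.outer(theta, np.arange(len(h)))) @ h) ** 2

# Eq. (8): output SNR of the unbiased ZF-DFE
snr_zf_dfe = snr * np.exp(np.mean(np.log(H2)))

def I_bpsk(gamma):
    """I_x(gamma) in bits for BPSK: y = sqrt(gamma)*x + n, x in {-1,+1}, n ~ N(0,1)."""
    n, w = np.polynomial.hermite_e.hermegauss(200)  # nodes/weights for N(0,1)
    w = w / w.sum()
    s = np.sqrt(gamma)
    # I = 1 - E[ log2(1 + exp(-LLR)) ], LLR = 2*s*(s + n) given x = +1
    return 1.0 - np.sum(w * np.logaddexp(0.0, -2.0 * s * (s + n))) / np.log(2.0)

I_sow = I_bpsk(snr_zf_dfe)  # Shamai-Ozarow-Wyner bound (6), in bits
```

The `logaddexp` form avoids overflow at the extreme quadrature nodes.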

Another choice of coefficients is the taps of the mean-squared whitened matched filter (MS-WMF), for which the variance of the noise term is minimized. The MS-WMF is used as the front-end filter of the MMSE-DFE [12]. Denoting the minimizing coefficients by $\{\hat\alpha_k\}$ and their corresponding Gaussian noise term by $\hat m$, the SNR at the output of the unbiased MMSE-DFE is given by

$$\mathrm{SNR}_{\text{DFE-U}} = \frac{E\,x_0^2}{E\!\left(\sum_{k\ge1}\hat\alpha_k x_k + \hat m\right)^2} = \exp\left\{\frac{1}{2\pi}\int_{-\pi}^{\pi}\log\!\left(1+\frac{P_x}{N_0}|H(\theta)|^2\right)d\theta\right\} - 1 \qquad (9)$$

and we denote the resulting bound by

$$I \ge I_{\mathrm{MMSE}} \triangleq I\!\left(x_0;\,x_0+\sum_{k\ge1}\hat\alpha_k x_k+\hat m\right) \qquad (10)$$

The bound $I_{\mathrm{MMSE}}$ is still difficult to handle numerically or analytically, because of the high complexity of the interference term $\sum_{k\ge1}\hat\alpha_k x_k$. Several techniques for further bounding $I_{\mathrm{MMSE}}$ were proposed, such as those in [9] and more recently in [13]. However, none of these methods provide bounds that are both simple and tight.

In [9] Shamai and Laroia conjectured that $I_{\mathrm{MMSE}}$ can be lower bounded by replacing the interfering inputs with i.i.d. Gaussian variables of the same variance, i.e.

$$I_{\mathrm{MMSE}} \ge I\!\left(x_0;\,x_0+\sum_{k\ge1}\hat\alpha_k g_k+\hat m\right) = I_x\!\left(\mathrm{SNR}_{\text{DFE-U}}\right) \triangleq I_{\mathrm{SL}} \qquad (11)$$

where $\{g_k\}$ are i.i.d. Gaussian variables with variance $P_x$, independent of $x_0$ and $\hat m$. The inequality (11) is known as the Shamai-Laroia conjecture (SLC). The expression $I_{\mathrm{SL}}$ was empirically shown to be a very tight approximation for $I$ in a large variety of SNRs and ISI coefficients. Since it is also elegant and easy to compute, the conjectured bound has seen much use despite remaining unproven; cf. [13, 8, 7, 14].
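Since $\mathrm{SNR}_{\text{DFE-U}}$ in (9) is also a one-dimensional integral, the Shamai-Laroia expression is as easy to evaluate as the Shamai-Ozarow-Wyner bound. A minimal sketch for BPSK inputs (the taps and SNR are illustrative assumptions, not the paper's example):

```python
import numpy as np

h = np.array([0.407, 0.815, 0.407])  # illustrative taps (an assumption)
snr = 4.0                            # Px / N0

theta = np.linspace(-np.pi, np.pi, 1 << 16, endpoint=False)
H2 = np.abs(np.exp(-1j * np.outer(theta, np.arange(len(h)))) @ h) ** 2

snr_zf_dfe = snr * np.exp(np.mean(np.log(H2)))          # eq. (8)
snr_dfe_u = np.exp(np.mean(np.log1p(snr * H2))) - 1.0   # eq. (9)

def I_bpsk(gamma):
    """I_x(gamma) in bits for BPSK over AWGN at SNR gamma."""
    n, w = np.polynomial.hermite_e.hermegauss(200)
    w = w / w.sum()
    s = np.sqrt(gamma)
    return 1.0 - np.sum(w * np.logaddexp(0.0, -2.0 * s * (s + n))) / np.log(2.0)

I_sow = I_bpsk(snr_zf_dfe)   # eq. (6)
I_sl = I_bpsk(snr_dfe_u)     # eq. (11): I_SL = I_x(SNR_DFE-U)
```

Because $\mathrm{SNR}_{\text{DFE-U}} \ge \mathrm{SNR}_{\text{ZF-DFE}}$ always holds, the Shamai-Laroia expression never falls below the Shamai-Ozarow-Wyner bound.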

In a recent paper [15], Abbe and Zheng disproved a stronger version of the SLC, by applying a geometrical tool using Hermite polynomials. This so-called "strong SLC" claims that (11) holds true for any choice of coefficients $\{\alpha_k\}$, and not just the MMSE coefficients $\{\hat\alpha_k\}$. The disproof in [15] is achieved by constructing a counterexample in which the interference is composed of a single tap (i.e. a single nonzero $\alpha_k$) and the input distribution is a carefully designed small perturbation of a Gaussian law. In this setting, it is shown that there exist SNRs and values of the interference tap for which the strong SLC fails. In order to apply this counterexample to the original SLC, one has to construct appropriate ISI coefficients and their matching MMSE-DFE, which is not trivial. Moreover, such a counterexample would use a continuous input distribution, leaving room to hypothesize that the SLC holds for practical finite-alphabet inputs.

The aim of this paper is to provide new insights into the validity of the SLC, as well as to provide new simple lower bounds for $I$. Information-Estimation relations [16] and related results [17] are instrumental in all of our analytic results, as they enable the derivation of novel bounds and asymptotic expressions for mutual information.

We begin by disproving the original ("weak") SLC, showing analytically that it does not hold when the SNR is sufficiently low, under very general settings. Our proof relies on the power series expansion of the input-output mutual information in the additive Gaussian channel [17]. This result allows us to construct specific counterexamples in which computations clearly demonstrate that the SLC does not hold. Furthermore, it provides insight into what makes the Shamai-Laroia expression such a good approximation, to the point where it was never before observed to fail at low SNRs.

With the SLC disproven, we are led to consider the weakened but still highly meaningful conjecture that $I_{\mathrm{SL}}$ lower bounds the achievable rate itself, i.e. $I \ge I_{\mathrm{SL}}$. We provide numerical results indicating that for sufficiently skewed binary inputs $I < I_{\mathrm{SL}}$ at some SNR, disproving the weakened bound in its most general form. Nonetheless, we prove that for any finite entropy input distribution and any ISI channel, the bound holds for sufficiently high SNR. This proof is carried out by showing that $I$ converges to the input entropy at a higher exponential rate than $I_{\mathrm{SL}}$.

Finally, new bounds for $I$ are proven using Information-Estimation techniques and bounds on the MMSE estimation of scaled sums of i.i.d. variables contaminated by additive Gaussian noise. A simple parametric bound is developed, whose parameters can either be straightforwardly optimized numerically or set to constant values in order to produce an even simpler, if sometimes less tight, expression. Numerical results are reported, showing the bounds to be useful at low to medium SNRs, and of comparable tightness to the bounds reported in [13].

The rest of this paper is organized as follows. Section II contains the disproof of the original SLC via low-SNR asymptotic analysis. Section III presents counterexamples to the original SLC as well as to the weakened bound $I \ge I_{\mathrm{SL}}$. Section IV details the proof of the bound in the high-SNR regime, and Section V establishes novel Information-Estimation based bounds on $I$. Section VI concludes this paper.

## II Low SNR analysis of the Shamai-Laroia approximation

In this section we prove that the conjectured bound (11) does not hold in the low SNR limit in essentially every scenario. Given a zero-mean RV $z$, let $s_z$ and $\kappa_z$ stand for its skewness and excess kurtosis, respectively. Note that $s_z = \kappa_z = 0$ for a Gaussian RV. Our result is formally stated as,

###### Theorem 1.

For every real ISI channel and any input with $s_x = 0$ and $\kappa_x \neq 0$, $I_{\mathrm{MMSE}} < I_{\mathrm{SL}}$ when $P_x/N_0$ is sufficiently small. When $s_x \neq 0$, there exist real ISI channels for which $I_{\mathrm{MMSE}} < I_{\mathrm{SL}}$ when $P_x/N_0$ is sufficiently small.

###### Proof:

The proof comprises rewriting $I_{\mathrm{MMSE}} - I_{\mathrm{SL}}$ as a combination of mutual informations in additive Gaussian channels, applying a fourth order Taylor-series expansion to each element, and showing that the resulting combination is always negative in the leading order.

First, let us state the Taylor expansion of the mutual information in a useful form. Suppose $\xi$ is a zero-mean random variable and let $\nu \sim \mathcal N(0,\sigma^2)$ be independent of $\xi$. It follows from equation (61) of [17] that,

$$I(\xi;\xi+\nu) = \frac{\rho}{2} - \frac{\rho^2}{4} + \frac{\rho^3}{6}\left[1-\frac{s_\xi^2}{2}\right] - \frac{\rho^4}{48}\left[\kappa_\xi^2 - 12 s_\xi^2 + 6\right] + O(\rho^5) \qquad (12)$$

where $\rho \triangleq E\xi^2/\sigma^2$. Let $\{\hat\alpha_k\}$ and $\hat m$ be the ISI coefficients and Gaussian noise term resulting from the application of the unbiased MMSE-DFE filter on the channel output, as defined in (10). In our proof we will make use of the following definitions, for $i \in \{0,1\}$:

$$\mu_i \triangleq \sum_{k\ge i}\hat\alpha_k x_k \qquad (13)$$
$$\tilde\mu_1 \sim \mathcal N\!\left(0,\,E\mu_1^2\right),\quad \tilde\mu_1 \perp x_0,\hat m \qquad (14)$$
$$\tilde\mu_0 \triangleq x_0 + \tilde\mu_1 \qquad (15)$$
$$\beta_i^2 \triangleq \sum_{k\ge i}\hat\alpha_k^2,\quad \gamma_i^3 \triangleq \sum_{k\ge i}\hat\alpha_k^3,\quad \delta_i^4 \triangleq \sum_{k\ge i}\hat\alpha_k^4 \qquad (16)$$
$$\epsilon_i = \frac{E\mu_i^2}{E\hat m^2} = \frac{E\tilde\mu_i^2}{E\hat m^2} = \frac{\beta_i^2 P_x}{E\hat m^2} \qquad (17)$$
$$I_{\mathrm{MMSE}}(i) \triangleq I(\mu_i;\,\mu_i+\hat m) \qquad (18)$$
$$I_{\mathrm{SL}}(i) \triangleq I(\tilde\mu_i;\,\tilde\mu_i+\hat m) \qquad (19)$$

where $\hat\alpha_0 \triangleq 1$. It is seen that

$$I_{\mathrm{MMSE}}(0) = I(x_0+\mu_1;\,x_0+\mu_1+\hat m) = I(x_0,\,x_0+\mu_1;\,x_0+\mu_1+\hat m)$$
$$= I(x_0;\,x_0+\mu_1+\hat m) + I(x_0+\mu_1;\,x_0+\mu_1+\hat m\,|\,x_0) = I_{\mathrm{MMSE}} + I_{\mathrm{MMSE}}(1)$$

and so $I_{\mathrm{MMSE}} = I_{\mathrm{MMSE}}(0) - I_{\mathrm{MMSE}}(1)$. Similarly, $I_{\mathrm{SL}} = I_{\mathrm{SL}}(0) - I_{\mathrm{SL}}(1)$. Letting $\Delta_i \triangleq I_{\mathrm{MMSE}}(i) - I_{\mathrm{SL}}(i)$, it follows that,

$$I_{\mathrm{MMSE}} - I_{\mathrm{SL}} = \Delta_0 - \Delta_1 \qquad (20)$$

Notice that $I_{\mathrm{MMSE}}(i)$ and $I_{\mathrm{SL}}(i)$ are each the mutual information between the input and output of an additive Gaussian channel, and can therefore readily be expanded according to (12), yielding

$$\Delta_0 = -\left(\frac{\epsilon_0^3}{12}-\frac{\epsilon_0^4}{4}\right)\left[s_{\mu_0}^2 - s_{\tilde\mu_0}^2\right] - \frac{\epsilon_0^4}{48}\left[\kappa_{\mu_0}^2-\kappa_{\tilde\mu_0}^2\right] + O(\epsilon_0^5) \qquad (21)$$
$$\Delta_1 = -\left(\frac{\epsilon_1^3}{12}-\frac{\epsilon_1^4}{4}\right)s_{\mu_1}^2 - \frac{\epsilon_1^4}{48}\,\kappa_{\mu_1}^2 + O(\epsilon_1^5) \qquad (22)$$

where we have used $s_{\tilde\mu_1} = \kappa_{\tilde\mu_1} = 0$, since $\tilde\mu_1$ is Gaussian, and

$$s_{\mu_i} = \frac{E\!\left(\sum_{k\ge i}\hat\alpha_k x_k\right)^3}{\beta_i^3 P_x^{3/2}} = \frac{\gamma_i^3}{\beta_i^3}\,s_x \qquad (23)$$
$$s_{\tilde\mu_0} = \frac{E(x_0+\tilde\mu_1)^3}{\beta_0^3 P_x^{3/2}} = \frac{s_x}{\beta_0^3} \qquad (24)$$
$$\kappa_{\mu_i} = \frac{E\!\left(\sum_{k\ge i}\hat\alpha_k x_k\right)^4}{\beta_i^4 P_x^2} - 3 = \frac{\delta_i^4}{\beta_i^4}\,\kappa_x \qquad (25)$$
$$\kappa_{\tilde\mu_0} = \frac{E(x_0+\tilde\mu_1)^4}{\left(P_x+\beta_1^2 P_x\right)^2} - 3 = \frac{\kappa_x}{\beta_0^4} \qquad (26)$$

Putting everything together, we get:

$$I_{\mathrm{MMSE}} - I_{\mathrm{SL}} = -\frac{\gamma_1^3 s_x^2}{6\beta_0^6}\,\epsilon_0^3 - \left(\frac{\delta_1^4\kappa_x^2}{24\beta_0^8} - \frac{\left(2\beta_0^2+\gamma_1^3\right)\gamma_1^3 s_x^2}{4\beta_0^8}\right)\epsilon_0^4 + O(\epsilon_0^5) \qquad (27)$$

For the case $s_x = 0$, (27) simplifies to,

$$I_{\mathrm{MMSE}} - I_{\mathrm{SL}} = -\frac{\delta_1^4\kappa_x^2}{24\beta_0^8}\,\epsilon_0^4 + O(\epsilon_0^5) \qquad (28)$$

and clearly, when $\kappa_x \neq 0$, we must have $I_{\mathrm{MMSE}} < I_{\mathrm{SL}}$ from some point onward as $\epsilon_0 \to 0$.

We now show that $\epsilon_0 \to 0$ when $P_x/N_0 \to 0$. In Appendix A we find that,

$$\epsilon_0 = \frac{\mathrm{SNR}_{\mathrm{LE}}\left(\mathrm{SNR}_{\mathrm{DFE}}-1\right)}{\mathrm{SNR}_{\mathrm{LE}}-1} - 1 \qquad (29)$$

where $\mathrm{SNR}_{\mathrm{LE}}$ and $\mathrm{SNR}_{\mathrm{DFE}}$ stand for the output SNRs of the (biased) MMSE linear and decision-feedback equalizers, respectively (see (91) and (92)). When $P_x/N_0$ is small, we have

$$\mathrm{SNR}_{\mathrm{DFE}} = \mathrm{SNR}_{\mathrm{LE}} = 1 + \left[\frac{1}{2\pi}\int_{-\pi}^{\pi}|H(\theta)|^2 d\theta\right]\frac{P_x}{N_0} + O\!\left(\left(\frac{P_x}{N_0}\right)^2\right) \qquad (30)$$

and therefore,

$$\epsilon_0 = \left[\frac{1}{2\pi}\int_{-\pi}^{\pi}|H(\theta)|^2 d\theta\right]\frac{P_x}{N_0} + O\!\left(\left(\frac{P_x}{N_0}\right)^2\right) \qquad (31)$$

and $\epsilon_0$ goes to zero when $P_x/N_0 \to 0$. This proves our statement in the case $s_x = 0$, since by (28) and (31) the leading term in the expansion of $I_{\mathrm{MMSE}} - I_{\mathrm{SL}}$ with respect to $P_x/N_0$ is guaranteed to be negative.

When $s_x \neq 0$, we will demonstrate that there exist ISI channels for which $I_{\mathrm{MMSE}} < I_{\mathrm{SL}}$ at low SNRs. Let us consider the two-tap channel $H(\theta) = \sqrt{1-q^2} + q e^{-j\theta}$ for some $0 < q < 1$. Carrying out the calculation according to [12] reveals that the residual ISI satisfies,

$$\hat\alpha_i = (-1)^{i+1}\left[a-\sqrt{a^2-1}\right]^i\,\frac{1}{2}\left[1+\sqrt{1-1/a^2}\right]\left(1+\left(\frac{P_x}{N_0}\right)^{-1}\right) \qquad (32)$$

where

$$a = \frac{1+N_0/P_x}{2q\sqrt{1-q^2}} \ge 1 \qquad (33)$$

Thus, for small $P_x/N_0$ one finds that

$$\gamma_1^3 = q^3\left(1-q^2\right)^{3/2} + O\!\left(\frac{P_x}{N_0}\right) \qquad (34)$$

Plugging (34) into (27) and (31), we conclude that for channels of the form $H(\theta) = \sqrt{1-q^2} + q e^{-j\theta}$ with $0 < q < 1$,

$$I_{\mathrm{MMSE}} - I_{\mathrm{SL}} = -\frac{1}{6}\,q^3\left(1-q^2\right)^{3/2}s_x^2\left(\frac{P_x}{N_0}\right)^3 + O\!\left(\left(\frac{P_x}{N_0}\right)^4\right) \qquad (35)$$

proving our statement for the case of non-zero skewness. ∎
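The expansion (12), on which the proof above rests, is easy to check numerically. The sketch below (not from the paper; the skewed binary input with $p = 0.2$ and $\rho = 0.05$ are illustrative assumptions) compares the fourth-order series with the exact mutual information, computed by direct integration of the Gaussian-mixture output density:

```python
import numpy as np

# Zero-mean, unit-variance skewed binary input: sqrt((1-p)/p) w.p. p,
# -sqrt(p/(1-p)) w.p. 1-p. The value p = 0.2 is an illustrative choice.
p = 0.2
v = np.array([np.sqrt((1 - p) / p), -np.sqrt(p / (1 - p))])
w = np.array([p, 1 - p])
s2 = float(w @ v**3) ** 2          # squared skewness s_xi^2
k2 = float(w @ v**4 - 3.0) ** 2    # squared excess kurtosis kappa_xi^2

rho = 0.05                         # E xi^2 / sigma^2, low-SNR regime
sigma2 = 1.0 / rho

# Exact I(xi; xi + nu) in nats: h(Y) - h(nu), Y a 2-component Gaussian mixture
y = np.linspace(-40.0, 40.0, 400001)
dy = y[1] - y[0]
f = sum(wi * np.exp(-(y - vi) ** 2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)
        for wi, vi in zip(w, v))
hY = np.sum(np.where(f > 0, -f * np.log(f), 0.0)) * dy
I_exact = hY - 0.5 * np.log(2 * np.pi * np.e * sigma2)

# Fourth-order expansion, eq. (12)
I_series = (rho / 2 - rho**2 / 4 + rho**3 / 6 * (1 - s2 / 2)
            - rho**4 / 48 * (k2 - 12 * s2 + 6))
```

At this $\rho$ the fourth-order series should track the exact value far more closely than the leading term $\rho/2$ alone.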

## III Counterexamples

In this section we use insights from Section II in order to construct specific counterexamples to the SLC in both its original form ($I_{\mathrm{MMSE}} \ge I_{\mathrm{SL}}$) and its weakened version ($I \ge I_{\mathrm{SL}}$). The section is composed of two parts. In the first part we compare $I_{\mathrm{MMSE}}$ and $I_{\mathrm{SL}}$ in the low-SNR regime for specific input distributions and an ISI channel, demonstrating Theorem 1 and verifying the series expansion derived in its proof. In the second part we compare $I$ and $I_{\mathrm{SL}}$, with the former estimated by means of Monte-Carlo simulation, in the medium-SNR regime and with the same ISI channel and input distributions used in the first part. Our results indicate that for highly skewed binary inputs, $I < I_{\mathrm{SL}}$ at some SNRs.

### III-A Low-SNR regime: $I_{\mathrm{MMSE}} < I_{\mathrm{SL}}$

Figure 1 demonstrates Theorem 1 and its inner workings, for a particular choice of ISI coefficients and two input distributions. The first distribution represents a symmetric source, which has zero skewness and nonzero excess kurtosis. The second distribution represents a zero-mean skewed binary source, which has large skewness and excess kurtosis. The ISI is formed by a three-tap impulse response ("Channel B" from [18], Chapter 10). Examining Figure 1 it is seen that in the low SNR regime $I_{\mathrm{MMSE}} - I_{\mathrm{SL}}$ is indeed negative and well approximated by the expansion (27), in agreement with Theorem 1 and in contradiction to the Shamai-Laroia conjecture.

In order to estimate $I_{\mathrm{MMSE}}$ as defined in (10), the infinite sequence of residual ISI taps is truncated to the first $K$ taps, with $K$ the minimal number for which the neglected taps are negligible. For the ISI channel used in our counterexample, $K$ moves from 8 at SNR $-26$ dB to 36 at SNR 10 dB. Experimentation indicates that the computation of $I_{\mathrm{MMSE}}$ and $I_{\mathrm{SL}}$ is accurate to within a small fraction of a bit.

To clearly observe the behavior predicted by Theorem 1, it is crucial to use an input distribution with high skewness or high kurtosis. Using the notation of (27), the difference $I_{\mathrm{MMSE}} - I_{\mathrm{SL}}$ is of the order of $s_x^2\epsilon_0^3$ and $\kappa_x^2\epsilon_0^4$. Computations reveal that for channels with moderate to high ISI, $\epsilon_0$ is small at low SNRs, which is also where the series approximation is valid. Hence, the difference term is many orders of magnitude smaller than the mutual information itself, and we must have $s_x^2$ and/or $\kappa_x^2$ large for the predicted low-SNR behavior to be distinguishable from numeric errors.

We emphasize that Theorem 1 guarantees that the SLC does not hold for any input distribution with nonzero skewness or excess kurtosis, including for example BPSK input, which has $s_x = 0$ and $\kappa_x = -2$. However, the above analysis shows that the universal low SNR behavior (27) is masked by numerical errors when common input distributions are used, due to the facts that by symmetry they have zero skewness, and that their excess kurtosis values are of order unity. This serves to explain why similar low SNR counterexamples to the SLC were not previously reported.
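To make the role of the input moments concrete, here is a small helper (an illustrative sketch, not the paper's code; the probability $p = 0.05$ is an assumption) that evaluates $s_x$ and $\kappa_x$ for discrete inputs. BPSK indeed gives $s_x = 0$ and $\kappa_x = -2$, while a strongly skewed binary source makes both $s_x^2$ and $\kappa_x^2$ large:

```python
import numpy as np

def skew_exkurt(values, probs):
    """Skewness and excess kurtosis of a discrete distribution."""
    v = np.asarray(values, dtype=float)
    p = np.asarray(probs, dtype=float)
    m = p @ v                       # mean
    c = v - m                       # centered values
    var = p @ c**2
    skew = (p @ c**3) / var**1.5
    exkurt = (p @ c**4) / var**2 - 3.0
    return skew, exkurt

s_bpsk, k_bpsk = skew_exkurt([-1.0, 1.0], [0.5, 0.5])

# Highly skewed binary source; p = 0.05 is an illustrative choice
p = 0.05
s_skew, k_skew = skew_exkurt([1.0, 0.0], [p, 1 - p])
```

For the Bernoulli-type source above, skewness grows like $(1-2p)/\sqrt{p(1-p)}$ as $p \to 0$, so both moments can be made as large as desired.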

### III-B Medium-SNR regime: $I < I_{\mathrm{SL}}$

Figure 2 displays $I$, $I_{\mathrm{MMSE}}$ and $I_{\mathrm{SL}}$, computed for the input distributions and ISI channel described above. The value of $I$ is computed by Monte-Carlo simulations as described in [7]. For each SNR, 20 simulations with a fixed input length were performed. The dots on the red curve indicate the averaged result of these simulations (which is equivalent to a single simulation with 20 times that input length), and the error bars indicate the minimum and maximum results among the 20 simulations.

For both input distributions, $I_{\mathrm{SL}}$ clearly exceeds $I_{\mathrm{MMSE}}$. In fact, further simulations indicate that in both cases $I_{\mathrm{SL}} > I_{\mathrm{MMSE}}$ for the entire SNR range, leaving little room to hope that the Shamai-Laroia conjecture is valid in the high-SNR regime. For the symmetric trinary source, it is seen that $I > I_{\mathrm{SL}}$ for all SNRs tested. However, for the skewed binary source, it is fairly certain that $I < I_{\mathrm{SL}}$ at some SNRs. This leads to the conclusion that even the modified conjecture does not hold in general.

The relation $I \ge I_{\mathrm{SL}}$ might still be true for all SNRs and ISI channels for some input distributions, such as BPSK, and might even hold for large families of input distributions, such as symmetric sources. Our simulations indicate that $I_{\mathrm{SL}}$ is always a tight approximation for $I$, and that it is much tighter than $I_{\mathrm{MMSE}}$ for sources with high skewness or excess kurtosis. Moreover, in the following section we establish that in the high-SNR regime, the inequality $I \ge I_{\mathrm{SL}}$ holds for any input distribution and any ISI channel.

## IV High SNR analysis of the Shamai-Laroia approximation

In this section we prove that the weakened Shamai-Laroia bound $I \ge I_{\mathrm{SL}}$ is valid for any input distribution and ISI channel, for sufficiently high SNR. The proof is carried out by bounding the exponential rates at which $I$ and $I_{\mathrm{SL}}$ converge to the input entropy as the input SNR grows, and showing that the former rate is strictly higher than the latter for every non-trivial ISI channel. The rate of convergence of $I$ is lower bounded using Fano's inequality and Forney's analysis of the probability of error of the maximum likelihood sequence detector of the input to the ISI channel given its output. The rate of convergence of $I_{\mathrm{SL}}$ is upper bounded using the I-MMSE relationship and genie-based bounds on the MMSE estimation of a single channel input from an observation contaminated by additive Gaussian noise.

For convenience, the results of this section assume a normalized input power. Let

$$g_{\text{ZF-DFE}} = \exp\left\{\frac{1}{2\pi}\int_{-\pi}^{\pi}\log\!\left(|H(\theta)|^2\right)d\theta\right\} \qquad (36)$$

denote the gain factor of the zero-forcing DFE; it is seen that $\mathrm{SNR}_{\text{ZF-DFE}}$ behaves as $g_{\text{ZF-DFE}}\,P_x/N_0$ when $P_x/N_0 \to \infty$. For every possible channel input value $v$, let $p(v)$ denote its probability of occurrence and let $H(x_0)$ be the input entropy. Finally, let $d_{\min}$ denote the minimal distance between different input values.

Our asymptotic bound for the achievable rate is formally stated as follows,

###### Lemma 1.

For any finite entropy input distribution and any finite length ISI channel there exist a function $F$, polynomial in $P_x/N_0$, and a constant $\delta_{\min}^2 > 0$ such that,

$$H(x_0) - I \le F\!\left(\frac{P_x}{N_0}\right)\exp\left(-\frac{P_x}{2N_0}\left(\frac{d_{\min}}{2}\right)^2\delta_{\min}^2\right) \qquad (37)$$

and $\delta_{\min}^2 \ge g_{\text{ZF-DFE}}$, with strict inequality whenever $|H(\theta)|$ is not constant (i.e. there is non-zero ISI).

###### Proof:

Since $x$ is i.i.d., $I = H(x_0) - H(x_0\,|\,y_{-\infty}^{\infty},x_{-\infty}^{-1})$, and hence

$$H(x_0) - I = H\!\left(x_0\,|\,y_{-\infty}^{\infty},x_{-\infty}^{-1}\right) \le H\!\left(x_0\,|\,y_{-\infty}^{\infty}\right) = H\!\left(x_0\,|\,\hat x_0^{\mathrm{ML}},y_{-\infty}^{\infty}\right) \le H\!\left(x_0\,|\,\hat x_0^{\mathrm{ML}}\right) \qquad (38)$$

where $\hat x_0^{\mathrm{ML}}$ is the maximum likelihood sequence estimate of $x_0$ given $y_{-\infty}^{\infty}$. By Fano's inequality,

$$H\!\left(x_0\,|\,\hat x_0^{\mathrm{ML}}\right) \le H\!\left(x_0,\mathbf 1\{x_0\ne\hat x_0^{\mathrm{ML}}\}\,\big|\,\hat x_0^{\mathrm{ML}}\right) \le H\!\left(\mathbf 1\{x_0\ne\hat x_0^{\mathrm{ML}}\}\right) + H\!\left(x_0\,\big|\,\mathbf 1\{x_0\ne\hat x_0^{\mathrm{ML}}\},\hat x_0^{\mathrm{ML}}\right) \qquad (39)$$
$$\le h_2\!\left(\Pr\!\left(x_0\ne\hat x_0^{\mathrm{ML}}\right)\right) + \Pr\!\left(x_0\ne\hat x_0^{\mathrm{ML}}\right)\log|\mathcal X| \qquad (40)$$

where $h_2(\cdot)$ is the binary entropy function and $\mathcal X$ is the set of possible values of $x_0$. By the analysis of the probability of error in maximum likelihood sequence estimation first performed by Forney [19] and then refined in [20, 21, 22], we know that

$$\Pr\!\left(x_0\ne\hat x_0^{\mathrm{ML}}\right) \le K'\,Q\!\left(\sqrt{\frac{P_x}{N_0}\left(\frac{d_{\min}}{2}\right)^2\delta_{\min}^2}\,\right) \qquad (42)$$

with $K'$ a constant, and $\delta_{\min}^2$ the minimum weighted and normalized distance between any two input sequences that first diverge at time $0$ and last diverge at some finite time $N-1$,

$$\delta_{\min}^2 = \inf_{N\ge1}\;\min_{\substack{x_0^{N-1},\,\tilde x_0^{N-1}\ \mathrm{s.t.}\\ x_0\ne\tilde x_0,\ x_{N-1}\ne\tilde x_{N-1}}}\delta^2\!\left(x_0^{N-1},\tilde x_0^{N-1}\right) \qquad (43)$$

where

$$\delta^2\!\left(x_0^{N-1},\tilde x_0^{N-1}\right) \triangleq \frac{1}{d_{\min}^2}\sum_{k=0}^{N+L-2}\left|\sum_{i=0}^{L-1}h_i\left(x_{k-i}-\tilde x_{k-i}\right)\right|^2 \qquad (44)$$

with the convention $x_k = \tilde x_k$ for $k \notin \{0,\ldots,N-1\}$.

Substituting (42) into (40) and taking (38) into account, along with the facts that $Q(x) \le \frac{1}{2}e^{-x^2/2}$ and that $h_2(p) \le p\log(e/p)$, yields the bound (37). It remains to show that $\delta_{\min}^2$ can be lower bounded by $g_{\text{ZF-DFE}}$. By keeping only the first and last summands in (44), we have that when $L \ge 2$, for any feasible pair of sequences $x_0^{N-1},\tilde x_0^{N-1}$,

$$\delta^2\!\left(x_0^{N-1},\tilde x_0^{N-1}\right) \ge \left|\left(\frac{x_0-\tilde x_0}{d_{\min}}\right)h_0\right|^2 + \left|\left(\frac{x_{N-1}-\tilde x_{N-1}}{d_{\min}}\right)h_{L-1}\right|^2 \ge |h_0|^2 + |h_{L-1}|^2 \qquad (45)$$

since by assumption $x_0 \ne \tilde x_0$ and $x_{N-1} \ne \tilde x_{N-1}$, and $x \ne \tilde x$ implies $|x - \tilde x| \ge d_{\min}$. Hence, $\delta_{\min}^2 \ge |h_0|^2 + |h_{L-1}|^2$ for $L \ge 2$.

We may assume without loss of generality that $H(\theta)$ is minimum phase (i.e. has no zeros outside the unit circle), because it may always be brought to this form by means of a whitened matched filter. When $H(\theta)$ is minimum phase it follows that $|h_0|^2 \ge g_{\text{ZF-DFE}}$, and thus we conclude that $\delta_{\min}^2 > g_{\text{ZF-DFE}}$, except for the zero-ISI case $L = 1$. For $L = 1$, $\delta_{\min}^2 = |h_0|^2 = g_{\text{ZF-DFE}}$. ∎
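The minimum-phase fact used in the last step, namely that $|h_0|^2$ matches the geometric-mean gain $g_{\text{ZF-DFE}}$ for a minimum-phase response, can be checked numerically. The two-tap filter below is an illustrative assumption, not an example from the paper:

```python
import numpy as np

# Minimum-phase two-tap channel (zero at z = -0.5, inside the unit circle),
# normalized to unit energy. Illustrative values.
h = np.array([1.0, 0.5]) / np.sqrt(1.25)

theta = np.linspace(-np.pi, np.pi, 1 << 16, endpoint=False)
H2 = np.abs(np.exp(-1j * np.outer(theta, np.arange(len(h)))) @ h) ** 2

# Eq. (36): geometric mean of |H(theta)|^2
g_zf_dfe = np.exp(np.mean(np.log(H2)))
```

For this channel $|h_0|^2 = g_{\text{ZF-DFE}}$, and $\delta_{\min}^2 \ge |h_0|^2 + |h_1|^2$ strictly exceeds it, exactly as Lemma 1 requires for a channel with non-zero ISI.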

Our asymptotic bound for the Shamai-Laroia expression is formally stated as follows,

###### Lemma 2.

For any finite entropy input distribution and any finite length ISI channel there exist a function $G$, polynomial in $P_x/N_0$, and constants $\hat K, \varepsilon > 0$ such that,

$$H(x_0) - I_{\mathrm{SL}} \ge G\!\left(\frac{P_x}{N_0}\right)\exp\left(-\frac{P_x}{2N_0}\left(\frac{d_{\min}}{2}\right)^2 g_{\text{ZF-DFE}} - \hat K\left(\frac{P_x}{N_0}\right)^{1-\varepsilon}\right) \qquad (46)$$
###### Proof:

We rewrite $I_x(\mathrm{snr})$, as defined in (7), using the I-MMSE relation [16],

$$H(x_0) - I_x(\mathrm{snr}) = \frac{1}{2}\int_{\mathrm{snr}}^{\infty}\mathrm{mmse}_{\bar x}(\gamma)\,d\gamma \qquad (47)$$

where $\bar x \triangleq x_0/\sqrt{P_x}$, and for any RV $z$,

$$\mathrm{mmse}_z(\gamma) \triangleq E\left(z - E\!\left[z\,|\,\sqrt\gamma\, z + n\right]\right)^2 \qquad (48)$$

with $n \sim \mathcal N(0,1)$ independent of $z$. Let $v_1$ and $v_2$ be two possible values of $\bar x$ such that $|v_1 - v_2| = d_{\min}$, and denote their probabilities $p(v_1)$ and $p(v_2)$, respectively, assuming without loss of generality that $p(v_1) \le p(v_2)$. Define a random variable $B \in \{0,1\}$, jointly distributed with $\bar x$, such that $\Pr(B=1) = 2p(v_1)$ and, given $B = 1$, $\bar x$ is distributed equiprobably on $\{v_1, v_2\}$. Since conditioning can only decrease MMSE we have

$$\mathrm{mmse}_{\bar x}(\gamma) \ge \mathrm{mmse}_{\bar x|B}(\gamma) \ge \Pr(B=1)\,\mathrm{mmse}_{\bar x|B=1}(\gamma) \qquad (49)$$

Now, $\Pr(B=1) = 2p(v_1)$, and $\mathrm{mmse}_{\bar x|B=1}(\gamma) = \left(\frac{d_{\min}}{2}\right)^2\mathrm{mmse}_b\!\left(\left(\frac{d_{\min}}{2}\right)^2\gamma\right)$, where $b$ is equiprobably distributed on $\{-1,+1\}$. The function $\mathrm{mmse}_b$ can be bounded as

$$\mathrm{mmse}_b(\gamma) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\left(1-\tanh(\sqrt\gamma\,y)\right)e^{-\frac12(y-\sqrt\gamma)^2}dy \qquad (50)$$
$$\ge \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}e^{-\sqrt\gamma(y+|y|)}\,e^{-\frac12(y-\sqrt\gamma)^2}dy = 2Q(\sqrt\gamma) \qquad (51)$$

where we have used $1-\tanh(x) \ge e^{-x-|x|}$. Using $Q(x) \le \frac{1}{x\sqrt{2\pi}}e^{-x^2/2}$ we find that,

$$\int_s^{\infty}Q(\sqrt\gamma)\,d\gamma = \frac{\sqrt s\,e^{-s/2}}{\sqrt{2\pi}} + (1-s)\,Q(\sqrt s) \underset{s>1}{\ge} \frac{e^{-s/2}}{\sqrt{2\pi s}} \qquad (52)$$

and so

$$\frac12\int_{\mathrm{snr}}^{\infty}\mathrm{mmse}_{\bar x}(\gamma)\,d\gamma \ge p(v_1)\int_{\left(\frac{d_{\min}}{2}\right)^2\mathrm{snr}}^{\infty}\mathrm{mmse}_b(\gamma)\,d\gamma \ge \frac{2\sqrt2\,p(v_1)}{\sqrt{\pi d_{\min}^2\,\mathrm{snr}}}\exp\left(-\frac12\left(\frac{d_{\min}}{2}\right)^2\mathrm{snr}\right) \qquad (53)$$

We remark that a bound similar to (49) was developed in [23]. However, the lower bound of [23] does not take into account non-equiprobable inputs.
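Both estimation-theoretic ingredients above, the binary-input MMSE bound (51) and the closed form (52), can be verified numerically. The following sketch (not from the paper; quadrature settings are arbitrary choices) computes $\mathrm{mmse}_b$ by Gauss-Hermite quadrature and checks the tail integral of $Q$:

```python
import numpy as np
from math import erfc, exp, pi, sqrt

def Q(x):
    """Gaussian tail function."""
    return 0.5 * erfc(x / sqrt(2.0))

def mmse_b(gamma):
    """MMSE of equiprobable b in {-1,+1} from sqrt(gamma)*b + n, n ~ N(0,1)."""
    n, w = np.polynomial.hermite_e.hermegauss(200)
    w = w / w.sum()
    # E[b | y] = tanh(sqrt(gamma)*y); by symmetry, condition on b = +1
    return 1.0 - float(np.sum(w * np.tanh(gamma + np.sqrt(gamma) * n) ** 2))

# (51): mmse_b(gamma) >= 2 Q(sqrt(gamma)) at a few illustrative SNRs
lb_ok = all(mmse_b(g) >= 2.0 * Q(sqrt(g)) for g in (0.1, 1.0, 4.0, 9.0))

# (52): closed form of the tail integral of Q(sqrt(gamma)), checked at s = 2
s = 2.0
dg = 1e-3
numeric = sum(Q(sqrt(s + k * dg)) for k in range(60000)) * dg
closed = sqrt(s) * exp(-s / 2) / sqrt(2 * pi) + (1 - s) * Q(sqrt(s))
```

The truncation at $s + 60$ is harmless since $Q(\sqrt\gamma)$ decays like $e^{-\gamma/2}$.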

The last step is to upper bound $\mathrm{SNR}_{\mathrm{DFE}}$ in terms of $\frac{P_x}{N_0}g_{\text{ZF-DFE}}$ for large $P_x/N_0$. We have

$$\log\left(\frac{\mathrm{SNR}_{\mathrm{DFE}}}{\frac{P_x}{N_0}\,g_{\text{ZF-DFE}}}\right) = \frac{1}{2\pi}\int_{-\pi}^{\pi}\log\left(1+\frac{N_0}{P_x|H(\theta)|^2}\right)d\theta \qquad (54)$$

If $|H(\theta)| > 0$ for every $\theta$, a simple bound is obtained by applying Jensen's inequality to the concave logarithm:

$$\mathrm{SNR}_{\mathrm{DFE}} \le \frac{P_x}{N_0}\,g_{\text{ZF-DFE}} + \frac{g_{\text{ZF-DFE}}}{g_{\text{ZF-LE}}} \qquad (55)$$

with $g_{\text{ZF-LE}} = \left[\frac{1}{2\pi}\int_{-\pi}^{\pi}|H(\theta)|^{-2}d\theta\right]^{-1}$ being the SNR gain factor of the linear zero-forcing equalizer. However, if the channel has spectral nulls, $g_{\text{ZF-LE}} = 0$ and the above bound is useless. In this case, let

$$\Omega = \left\{\theta\in[-\pi,\pi]\,:\,|H(\theta)|^2 < \sqrt{N_0/P_x}\right\} \qquad (56)$$

and, for $P_x/N_0 \ge 1$, bound (54) as

$$\frac{1}{2\pi}\int_{-\pi}^{\pi}\log\left(1+\frac{N_0}{P_x|H(\theta)|^2}\right)d\theta \le \frac{1}{2\pi}\int_{\Omega}\log\left(\frac{2\sqrt{N_0/P_x}}{|H(\theta)|^2}\right)d\theta + \frac{\left|\Omega^C\right|}{2\pi}$$