Generalized Belief Propagation for the
Noiseless Capacity and Information Rates of RunLength Limited
Constraints
Abstract
The performance of the generalized belief propagation algorithm to compute the noiseless capacity and mutual information rates of finitesize twodimensional and threedimensional runlength limited constraints is investigated. In both cases, the problem is reduced to estimating the partition function of graphical models with cycles. The partition function is then estimated using the regionbased free energy approximation technique. For each constraint, a method is proposed to choose the basic regions and to construct the region graph which provides the graphical framework to run the generalized belief propagation algorithm. Simulation results for the noiseless capacity of different constraints as a function of the size of the channel are reported. In the cases that tight lower and upper bounds on the Shannon capacity exist, convergence to the Shannon capacity is discussed. For noisy constrained channels, simulation results are reported for mutual information rates as a function of signaltonoise ratio.
I Introduction
Runlength limited (RLL) constraints are widely used in magnetic and optical recording systems. Such constraints reduce the effect of intersymbol interference and help in timing control. In trackoriented storage systems constraints are defined in one dimension.
We say a binary onedimensional (1D) sequence satisfies the RLL constraint if the runs of ’s have length at most and the runs of ’s between successive ’s have length at least . We suppose that .
The Shannon capacity of a 1D RLL constraint is defined as
(1) 
where denotes the number of binary 1D sequences of length that satisfy the RLL constraint, see [1, 2].
With the rise in demand for larger storage in smaller size and with recent developments in pageoriented storage systems, such as holographic data storage, twodimensional (2D) constraints have become more of interest [3]. In these systems, data is organized on a surface and constraints are defined in two dimensions.
A 2D binary array satisfies the RLL constraint if it satisfies a RLL constraint horizontally and a RLL constraint vertically. If a 2D binary array satisfies a 1D RLL constraint both horizontally and vertically, we simply say that it satisfies a 2D RLL constraint.
Example: 2D RLL constraint:
The 2D RLL constraint is satisfied in the following 2D binary array segment. In words, in every row and every column of the array there are at least two ’s between successive ’s; but the runs of ’s can be of any length (however, ’s can be diagonally adjacent).
The Shannon capacity of a 2D RLL constraint is defined as
(2) 
where denotes the number of 2D binary arrays of size that satisfy the RLL constraint.
Similarly, the Shannon capacity can be defined for higher dimensional constrained channels. For example, the Shannon capacity in three dimensions depends on , the number of threedimensional (3D) binary arrays of size that satisfy a RLL constraint.
The noiseless capacity of a constrained channel is an important quantity that provides an upper bound to the information rate of any encoder that maps arbitrary binary input into binary data that satisfies a given constraint. There are a number of techniques to compute the 1D Shannon capacity (for example combinatorial or algebraic approaches) [1]. In contrast to the 1D capacity, except for a few cases, exact values of two and higher dimensional (positive) Shannon capacities are not known, see [4, 5, 7, 6, 8, 9].
For noisy 1D constrained channels, simulationbased techniques proposed in [16, 17] can be used to compute mutual information rates. However, computing mutual information rates of noisy 2D RLL constraints has been an unsolved problem.
In this paper, the goal is to apply the generalized belief propagation (GBP) algorithm [10] for the abovementioned problems, namely, to compute an estimate of the capacity of noiseless 2D and 3D RLL constrained channels and mutual information rates of noisy 2D constrained channels. For both problems GBP turns out to yield very good approximate results.
Preliminary versions of the material of this paper have appeared in [11] and [12]. In [11], we applied GBP to compute the noiseless capacity of 2D and 3D RLL constrained channels. In [12], GBP was applied to compute mutual information rates of a 2D RLL constrained channel with relatively small size and only at high signaltonoise ratio (SNR). In this paper, we show that both problems reduce to estimating the partition function of graphical models with cycles. We then apply GBP to both problems and consider new constraints and larger sizes of grid.
Our main motivations for this research were the successful application of GBP for information rates of 2D finitestate channels with memory in [13], Kikuchi approximation for decoding of LDPC codes and partialresponse channels in [14], and treebased Gibbs sampling for the noiseless capacity and information rates of 2D constrained channels in [15, 12].
The outline of the paper is as follows. In Section II, we consider the problem of computing the partition function and discuss how this problem is related to computing the noiseless capacity and information rates of constrained channels. Region graphs, GBP, and regionbased free energy are outlined in Section III. Section IV discusses the capacity of noiseless 2D constraints. Numerical values and simulation results for the capacity of noiseless 2D and 3D RLL constraints are reported in Section IVA. In Section V, we apply GBP to compute mutual information rates of noisy 2D RLL constraints and report numerical experiments for mutual information rates in Section VA.
Ii Problem Setup
Consider a 2D channel of size with a set of random variables. Let denote a realization of and let x denote . We assume that each takes values in a finite set . Also let be the Cartesian product .
In constrained channels, not all sequences of symbols from the channel alphabet are admissible. Let be the set of admissible input sequences. We define the indicator function
(3) 
The partition function is defined as
(4) 
With the above definitions, is the number of sequences that satisfy a given constraint. Therefore, computing the capacity of constrained channels as expressed in (2), is closely related to computing the partition function as in (4).
Also note that with the above definitions
(5) 
is a probability mass function on .
For a noisy 2D channel, let X be the input and be the output of the channel. The mutual information rate is
(6) 
Let us suppose that is analytically available. In this case, the problem of estimating the mutual information rate reduces to estimating the entropy of the channel output, which is
(7) 
As in [16], we can approximate the expectation in (7) by drawing samples according to and use the empirical average as
(8) 
Therefore, the problem of estimating the mutual information rate reduces to computing for .
We will compute based on
(9) 
which for a fixed has also the form (4) and therefore requires the computation of a partition function.
RLL constraints impose restrictions on the values of variables that can be verified locally. For example, in a 2D RLL constraint no two (horizontally or vertically) adjacent variables can both have the value . The indicator function of this constraint factors into a product of kernels of the form
(10) 
with one such kernel for each adjacent pair .
The factorization with kernels as in (10) can be represented with a graphical model. In this paper, we focus on graphical models defined in terms of Forney factor graphs. Fig. 1 shows the Forney factor graph of a 2D RLL constraint where the boxes labeled “” are equality constraints [19]. (Fig. 1 may also be viewed as a factor graph as in [18] where the boxes labeled “” are the variable nodes).
In general, we suppose that the indicator function of an RLL constraint factors into a product of nonnegative local kernels each having some subset of x as arguments; i.e.
(11) 
where is a subset of x and each kernel has elements of as arguments.
In this case, the partition function in (4) can be written as
(12) 
If the factorization in (11) yields a cyclefree factor graph (with not too many states), the sum in (12), or equivalently the sum in (4), can be computed efficiently by the sumproduct message passing algorithm [18]. However, for the examples we study in this paper, like the Forney factor graph of a 2D RLL constraint in Fig. 1, factor graphs contain (many short) cycles. In such cases computing requires a sum with an exponential number of terms and therefore we are interested in applying approximate methods.
Due to the presence of many short cycles in the factor graph representation of 2D and 3D RLL constraints, loopy belief propagation often fails to converge. As a result, we apply GBP to estimate , which then leads to estimating the noiseless capacity and mutual information rates of RLL constraints.
Iii GBP and the region graph method
In statistical physics, defined in (4) is known as the partition function and the Helmholtz free energy is defined as
(13) 
The partition function and the Helmholtz free energy are important quantities in statistical physics since they carry information about all thermodynamic properties of a system.
A number of techniques have been developed in statistical physics to approximate the free energy. The method that we apply in this paper is known as the regionbased free energy approximation, in particular we use the cluster variation method to select a valid set of regions and counting numbers, see [10] and [20] for more details.
We start by introducing the region graph representation of our problem. Such a region graph will provide a graphical framework for GBP algorithm. For each RLL constraint, the size of the basic region is chosen based on the constraint parameters. For a 2D RLL constraint with finite and , the width and the height of the basic region is chosen as
and for the infinite case, the size is chosen as
Such a choice for the basic regions seems plausible since the validity of a given array can be determined by verifying the constraints in each region and sliding the basic regions along the rows and along the columns of the array. For a 2D RLL constraint, Fig. 2 shows the basic regions and Fig. 3 shows the region graph and the counting numbers associated with each region.
After forming the region graph using the cluster variation method, we perform GBP on this graph by sending messages between the regions while performing exact computations inside each region.
We will need the regionbased free energy to estimate the number of arrays that satisfy a given constraint. Therefore, we operate GBP on the corresponding region graph until convergence and use the obtained region beliefs to compute the regionbased free energy (as an estimate of ). The regionbased free energy can then be used to estimate the partition function using (13). We compute as
(14)  
Here denotes the set of all regions, is the counting number, stands for the set of variables in region , and is the set of factors in region . See Fig. 3.
Iv Capacity of Noiseless 2D RLL Constraints
For a 2D RLL constrained channel of width and of size , we run GBP on the corresponding region graph to compute and an estimate of . We can then compute
(15) 
where denotes the number of 2D binary arrays of size that satisfy the constraint.
In our numerical experiments in Section IVA, for different RLL constraints we show convergence of to the Shannon capacity as increases.
For example, let us consider a 2D RLL constraint with corresponding Forney factor graph in Fig. 1. For this constraint, we chose basic regions with size in a sliding window manner over the factor graph, see Fig. 2. Starting from such basic regions, we applied the cluster variation method on the factor graph in Fig. 2 to obtain the corresponding region graph depicted in Fig. 3. The counting numbers are shown next to each region.
Iva Numerical Experiments
Here we present the numerical results of applying GBP to estimate the finitesized noiseless capacity of RLL constraints.
Tight lower and upper bounds were given for the Shannon capacity of a 2D RLL constraint in [4]. The bounds were further improved in [22] and [23], now known to nine decimal digits.
(16) 
For this constraint, Fig. 5 shows defined in (15) versus the channel width over the interval . The estimation was performed using the parenttochild and twoway GBP algorithms. The two algorithms give almost identical results. The horizontal line in Fig. 5 shows the Shannon capacity for this channel in (16). For a channel of width , the estimated noiseless capacity is about .
Shown in Fig. 5 are plots of for 2D RLL constraints with from top to bottom, versus the channel width over the interval . Fig. 7 shows the plots of for 2D RLL and RLL constraints versus over the interval . From our simulation results, for a channel of width the estimated noiseless capacities for 2D RLL constraints with are about and the estimated noiseless capacities for 2D RLL and RLL are about . To the best of our knowledge, no theoretical upper or lower bounds exist for these constraints. All plots are obtained using the parenttochild algorithm. Note that 2D RLL plot in Fig. 5 is the same as the plot in Fig. 5.
Also shown in Fig. 7 is the plot of for a 2D RLL constraint versus over the interval . For a channel of width , the estimated noiseless capacity is about . Best known lower and upper bounds for the Shannon capacity of a 2D RLL constraint are given in [8] and [9] respectively, as
(17) 
Our proposed method can be generalized to compute the noiseless capacity of 3D and higher dimensional RLL constraints. For a 3D RLL constraint the following lower and upper bounds were introduced in [23]
(18) 
Fig. 8 shows the noiseless capacity estimates of a 3D RLL constraint, obtained using the parenttochild algorithm, versus the channel width . The horizontal dotted lines show the upper and lower bounds for the Shannon capacity. For a channel of width the GBP estimated capacity is about which falls within these bounds.
Simulation results and numerical values for the noiseless capacity of many other 2D RLL constraints are reported in [21].
IvB Bounds for the Shannon Capacity
For any finite , it is possible to compute lower and upper bounds on the Shannon (infinitesize) capacity using the capacity of a 2D RLL constrained channel of width .
For example, consider a 2D RLL constraint with local kernels as in (10). From tiling the whole plane with squares, it is clear that is an upper bound for the Shannon capacity . On the other hand, by tiling the plane with squares separated by allzero guard rows and allzero guard columns, we obtain .
From Fig. 5, the estimated capacity at is about , we thus obtain the following lower and upper bounds for the Shannon capacity
Note that although GBP performs remarkably well for 2D constrained channels, it is an approximate algorithm which yields approximations to the lower and upper bounds to the Shannon capacity. However in order to achieve a desired precision, the bounds could provide a criterion for choosing the value of .
V Information Rates of Noisy 2D RLL Constraints
As explained in Section II, the problem of computing mutual information rates reduces to computing the output probability. Therefore, the remaining tasks are

Drawing input samples from according to and therefrom creating output samples .

Computing for each .
We will compute based on
(19) 
where is a probability mass function on .
Let us assume uniform distribution over the admissible channel input configurations. Therefore we have
(20)  
(21) 
we also assume the channel is memoryless and factors as
(22) 
For such a noisy 2D constrained channel, the corresponding Forney factor graph, as an extension of Fig. 1, is shown in Fig. 9.
The input samples are generated as follows. We run GBP on Fig. 3 until convergence to compute the region beliefs at each region . The region beliefs are GBP approximations to the corresponding marginals . In our numerical experiments, each sample is then generated piecewise sequentially according to the beliefs in basic regions. For example, in the region graph of Fig. 3, after computing , sample is drawn according to , sample is drawn according to , etc. The input samples are then used to create output using (22).
The beliefs are directly proportional to the factor nodes involved in each region, which guarantees that the samples are drawn from . Moreover, since beliefs are good approximations to the marginal probabilities, one expects that the samples are drawn from a distribution close to , see [10].
In order to compute , as in Section III, we start from the factor graph in Fig. 9 to build the region graph representing the problem and run GBP on this region graph. Finally, the estimated are used to compute an estimate of as in (8).
Va Numerical Experiments
In our numerical experiments we consider RLL and RLL constrained channels with size and input alphabet .
Noise is assumed to be i.i.d. zero mean Gaussian with variance and independent of the input. We thus have
(25) 
and in (22) has kernels of the form
(26) 
SNR is defined as
(27) 
Shown in Fig. 11 is the estimated information rate vs. SNR over the interval dB for a noisy 2D RLL constraint. The horizontal dotted line shows the estimated noiseless capacity which can be read from Fig. 5 and is about for this size of channel.
Illustrated in Fig. 11 is the estimated information rate vs. SNR over the interval dB for a noisy 2D RLL channel. The horizontal dotted line shows the estimated noiseless capacity which can be read from Fig. 7 and is about for this size of channel.
The simulation results were obtained by averaging over realizations of the channel output.
Simulation results and numerical values for mutual information rates of many other 2D RLL constraints are reported in [21].
Vi Concluding remarks
We proposed a GBPbased method to estimate the noiseless capacity and mutual information rates of RLL constraints in two and three dimensions. For noiseless RLL constraints, the method was applied to estimate the finitesize capacity of different constraints and to show convergence to the Shannon capacity as the size of the channel increases. In particular, the proposed method can be used to estimate the noiseless capacity of RLL constraints in the cases that the capacity is not known to a useful accuracy. The method was also applied to estimate mutual information rates of noisy RLL constraints with additive white Gaussian noise and with a uniform distribution over the admissible input configurations. Our simulation results show mutual information rates of different constraints as a function of SNR.
Acknowledgements
The authors gratefully acknowledge the support of Prof. H.A. Loeliger. The first author wishes to thank Ori Shental for his helpful comments on GBP implementation. We would also like to thank the reviewers for their many helpful suggestions that helped to improve the presentation of our paper.
References
 [1] K. A. Schouhamer Immink, Codes for Mass Data Storage Systems. Eindhoven: Shannon Foundation Publishers, 2004.
 [2] C. E. Shannon, “A mathematical theory of communications,” Bell Sys. Tech. Journal, vol. 27, pp. 379–423, July 1948.
 [3] P. H. Siegel, “Informationtheoretic limits of twodimensional optical recording channels,” in Optical Data Storage (Proc. of SPIE, Vol. 6282, Eds. Ryuichi Katayama and Tuviah E. Schlesinger), Montreal, Quebec, Canada, April 2006.
 [4] N. J. Calkin and H.S. Wilf, ‘The number of independent sets in a grid graph,” SIAM J. Discr. Math., vol. 11, pp. 54–60, Feb. 1998.
 [5] S. Forchhammer and T.V. Laursen, “Entropy of bitstuffinginduced measures for twodimensional checkerboard constraints,” IEEE Trans. Inform. Theory, vol. 53, pp. 1537–1546, April 2007.
 [6] H. Ito, A. Kato, Z. Nagy, and K. Zeger, ‘Zero capacity region of multidimensional run length constraints,” The Electronic Journal of Combinatorics, vol. 6(1), 1999.
 [7] K. Kato and K. Zeger, ‘On the capacity of twodimensional runlength constrained channels,” IEEE Trans. Inform. Theory, vol. 45, pp. 1527–1540, July 1999.
 [8] E. Ordentlich and R. M. Roth, “Approximate enumerative coding for 2D constraints through ratios of matrix products,” Proc. 2009 IEEE Int. Symp. on Information Theory, Seoul, Korea, pp. 1050–1054.
 [9] I. Tal and R. M. Roth, “Concave programming upper bounds on the capacity of 2D constraints,” IEEE Trans. Inform. Theory, vol. 57, pp. 381–391, Jan. 2011.
 [10] J. S. Yedidia, W. T. Freeman, and Y. Weiss. “Constructing free energy approximations and generalized belief propagation algorithms,” IEEE Trans. Inform. Theory, vol. 51, pp. 2282–2312, July 2005.
 [11] G. Sabato and M. Molkaraie, “Generalized belief propagation algorithm for the capacity of multidimensional runlength limited constraints,” Proc. 2010 IEEE Int. Symp. on Information Theory, Austin, USA, June 13–18, pp. 1213–1217.
 [12] M. Molkaraie and H.A. Loeliger, “Estimating the information rate of noisy constrained 2D channels,” Proc. 2010 IEEE Int. Symp. on Information Theory, Austin, USA, June 13–18, pp. 1678–1682.
 [13] O. Shental, N. Shental, S. Shamai (Shitz), I. Kanter, A. J. Weiss, and Y. Weiss, “Discreteinput twodimensional Gaussian channels with memory: estimation and information rates via graphical models and statistical mechanics,” IEEE Trans. Inform. Theory, vol. 54, pp. 1500–1513, April 2008.
 [14] P. Pakzad and V. Anantharam, “Kikuchi approximation method for joint decoding of LDPC codes and partialresponse channels,” IEEE Trans. Communications, vol. 54, pp. 1149–1153, July 2006.
 [15] H.A. Loeliger and M. Molkaraie, “Estimating the partition function of 2D fields and the capacity of constrained noiseless 2D channels using treebased Gibbs sampling,” Proc. 2009 IEEE Information Theory Workshop, Taormina, Italy, Oct. 11–16, pp. 228–232.
 [16] D. Arnold, H.A. Loeliger, P. O. Vontobel, A. Kavčić, and W. Zeng, “Simulationbased computation of information rates for channels with memory,” IEEE Trans. Inform. Theory, vol. 52, no. 8, pp. 3498–3508, August 2006.
 [17] H. D. Pfister, J.B. Soriaga, and P. H. Siegel, “On the achievable information rates of finitestate ISI channels,” in Proc. 2001 IEEE Globecom, San Antonio, USA, Nov. 25–29, pp. 2992–2996.
 [18] F. R. Kschischang, B. J. Frey, and H.A. Loeliger, “Factor graphs and the sumproduct algorithm,” IEEE Trans. Inform. Theory, vol. 47, pp. 498–519, Feb. 2001.
 [19] H.A. Loeliger, “An introduction to factor graphs,” IEEE Signal Proc. Mag., Jan. 2004, pp. 28–41.
 [20] M. Welling, “On the choice of regions for generalized belief propagation,” Proc. 2004 conference on Uncertainty in artificial intelligence, Banff, Canada, pp. 585–592.
 [21] G. Sabato, Simulationbased techniques to study twodimensional ISI channels and constrained systems. Master thesis, Dept. Inform. Techn. & Electr. Eng, ETH Zürich, Switzerland, 2009.
 [22] W. Weeks and R. E. Blahut, “The capacity and coding gain of certain checkerboard codes,” IEEE Trans. Inform. Theory, vol. 44, pp. 1193–1203, May 1998.
 [23] Z. Nagy and K. Zeger, ‘Capacity bounds for the threedimensional runlength limited channels,” IEEE Trans. Inform. Theory, vol. 46, pp. 1030–1033, May 2000.