A Rate-Distortion Perspective on Multiple Decoding Attempts for Reed-Solomon Codes
Abstract
Recently, a number of authors have proposed decoding schemes for Reed-Solomon (RS) codes based on multiple trials of a simple RS decoding algorithm. In this paper, we present a rate-distortion (RD) approach to analyze these multiple-decoding algorithms for RS codes. This approach is first used to understand the asymptotic performance-versus-complexity tradeoff of multiple error-and-erasure decoding of RS codes. By defining an appropriate distortion measure between an error pattern and an erasure pattern, the condition for a single error-and-erasure decoding to succeed reduces to a form where the distortion is compared to a fixed threshold. Finding the best set of erasure patterns for multiple decoding trials then turns out to be a covering problem which can be solved asymptotically by rate-distortion theory. Next, this approach is extended to analyze multiple algebraic soft-decision (ASD) decoding of RS codes. Both analytical and numerical computations of the RD functions for the corresponding distortion measures are discussed. Simulation results show that the proposed algorithms perform better than other algorithms with the same complexity.
I Introduction
Reed-Solomon (RS) codes are one of the most widely used error-correcting codes in digital communication and data storage systems. This is primarily due to the fact that RS codes are maximum distance separable (MDS) codes, can correct long bursts of errors, and have efficient hard-decision decoding (HDD) algorithms, such as the Berlekamp-Massey (BM) algorithm, which can correct up to half the minimum distance, i.e., ⌊(d_min − 1)/2⌋ errors. An (n, k) RS code of length n and dimension k is known to have d_min = n − k + 1 due to its MDS nature.
Since the introduction of RS codes, considerable effort has been devoted to improving the decoding performance at the expense of complexity. A breakthrough result of Guruswami and Sudan (GS) introduced a hard-decision list-decoding algorithm based on algebraic bivariate interpolation and factorization techniques that can correct errors beyond half the minimum distance of the code [1]. Nevertheless, HDD algorithms do not fully exploit the information provided by the channel output. Koetter and Vardy (KV) later extended the GS decoder to an algebraic soft-decision (ASD) decoding algorithm by converting the probabilities observed at the channel output into algebraic interpolation conditions in terms of a multiplicity matrix [2]. Both of these algorithms, however, have significant computational complexity. Thus, multiple runs of error-and-erasure and error-only decoding with some low-complexity algorithm, such as the BM algorithm, have attracted renewed interest. These algorithms essentially first construct a set of either erasure patterns [3, 4], test patterns [5], or patterns combining both [6] and then attempt to decode using each pattern. There has also been recent interest in lowering the complexity per decoding trial, as can be seen in [7, 8, 9].
In the scope of multiple error-and-erasure decoding, there have been several algorithms using different sets of erasure patterns. After multiple decoding trials, these algorithms produce a list of candidate codewords and then pick the best codeword on this list, whose size is usually small. The idea behind multiple error-and-erasure decoding is to erase some of the least reliable symbols, since those symbols are more prone to be erroneous. The first algorithm of this type is Generalized Minimum Distance (GMD) decoding [3], which repeats error-and-erasure decoding while successively erasing an even number of the least reliable positions (LRPs), assuming that d_min is odd. More recent work by Lee and Kumar [4] proposes a soft-information successive (multiple) error-and-erasure decoding (SED) that achieves better performance but also increases the number of decoding attempts. Specifically, the Lee-Kumar SED algorithm runs multiple error-and-erasure decoding trials with every combination of an even number of erasures within the LRPs.
A natural question that arises is how to construct the "best" set of erasure patterns for multiple error-and-erasure decoding. Inspired by this, we first design a rate-distortion framework to analyze the asymptotic tradeoff between performance and complexity of multiple error-and-erasure decoding of RS codes. The framework is also extended to analyze multiple algebraic soft-decision (ASD) decoding. Next, we propose a group of multiple-decoding algorithms based on this approach that achieve a better performance-versus-complexity tradeoff than other algorithms. The multiple-decoding algorithm that achieves the best tradeoff turns out to be multiple error-only decoding using a set of patterns generated by random codes combined with covering codes. These are the main results of this paper.
I-A Outline of the paper
The paper is organized as follows. In Section II, we design an appropriate distortion measure and present a rate-distortion framework to analyze the performance-versus-complexity tradeoff of multiple error-and-erasure decoding of RS codes. Also in this section, we propose a general multiple-decoding algorithm that can be applied to error-and-erasure decoding. Then, in Section III, we discuss a numerical computation of the RD function which is needed for the proposed algorithm. In Section IV, we analyze both bit-level and symbol-level ASD decoding and design distortion measures so that they fit into the general algorithm. In Section V, we offer some extensions that help the algorithm achieve better performance and running time. Simulation results are presented in Section VI and, finally, conclusions are given in Section VII.
II Multiple Error-and-Erasure Decoding
In this section, we set up a rate-distortion framework to analyze multiple attempts of conventional hard-decision error-and-erasure decoding.
Let GF(q) be the Galois field with q elements. We consider an (n, k) RS code of length n and dimension k over GF(q). Assume that we transmit a codeword c = (c_1, ..., c_n) over some channel and receive a vector y ∈ Y^n, where Y is the receive alphabet for a single RS symbol. In this paper, we assume q = 2^m, and all simulations are based on transmitting each of the m bits in a symbol using Binary Phase-Shift Keying (BPSK) on an Additive White Gaussian Noise (AWGN) channel. For each codeword index i, we sort the symbol probabilities in decreasing order, so that we can specify the jth most reliable (i.e., jth most likely) symbol at codeword index i. To obtain the reliability of the codeword positions (indices), we construct the permutation σ given by sorting the probabilities of the most likely symbols in increasing order. Thus, codeword position σ(j) is the jth LRP. These notations will be used throughout this paper.
Example 1
Consider a small code and assume that the symbol probabilities are written in matrix form. Sorting each column in decreasing order gives the reliability order of the symbols at each index, and sorting the probabilities of the most likely symbols in increasing order gives the reliability order of the positions.
Condition 1

(Error-and-erasure decoding) A single error-and-erasure decoding attempt succeeds (i.e., the decoder returns the transmitted codeword) if

2ν + μ < d_min,   (1)

where ν is the number of errors in unerased positions and μ is the number of erasures.
II-A Conventional error and erasure patterns
Definition 1
(Conventional error and erasure patterns) We define e = (e_1, ..., e_n) ∈ {0, 1}^n and s = (s_1, ..., s_n) ∈ {0, 1}^n as an error pattern and an erasure pattern, respectively, where e_i = 1 means that an error occurs (i.e., the most likely symbol is incorrect) and s_i = 1 means that an erasure occurs at index i.
Example 2
If d_min is odd, then {(0, 0, 0, 0, ..., 0), (1, 1, 0, 0, ..., 0), (1, 1, 1, 1, 0, ..., 0), ..., (1, ..., 1, 0, ..., 0)}, where the last pattern has d_min − 1 ones, is the set of erasure patterns for the GMD algorithm. For the SED algorithm, the set of erasure patterns consists of all patterns with an even number of ones confined to the LRPs. Here, in each erasure pattern the letters are written in increasing reliability order of the codeword positions.
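As a concrete illustration, the two pattern families above can be sketched as follows. This is a minimal sketch under the assumptions stated in the comments; the function names and the list representation (1 = erase, positions written in increasing reliability order) are our own illustrative choices, not notation from the paper.

```python
from itertools import combinations

def gmd_patterns(n, d_min):
    """GMD-style erasure patterns: erase 0, 2, 4, ..., d_min - 1 of the
    least reliable positions (assumes d_min is odd).  Index 0 is the
    least reliable position."""
    patterns = []
    for s in range(0, d_min, 2):            # even numbers of erasures
        patterns.append([1] * s + [0] * (n - s))
    return patterns

def sed_patterns(n, l):
    """SED-style erasure patterns: every combination of an even number
    of erasures confined to the l least reliable positions."""
    patterns = []
    for w in range(0, l + 1, 2):            # even weights 0, 2, 4, ...
        for pos in combinations(range(l), w):
            p = [0] * n
            for i in pos:
                p[i] = 1
            patterns.append(p)
    return patterns

# For l = 12 the SED set has sum of C(12, even) = 2^11 = 2048 patterns.
assert len(sed_patterns(20, 12)) == 2 ** 11
assert len(gmd_patterns(20, 17)) == 9       # erase 0, 2, ..., 16 LRPs
```

Note that the SED set grows exponentially in the number of LRPs considered, which motivates the search for better pattern sets below.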
Let us revisit the question of how to construct the best set of erasure patterns for multiple error-and-erasure decoding. First, it can be seen that multiple error-and-erasure decoding succeeds if the condition (1) is satisfied during at least one round of decoding. Thus, our approach is to design a distortion measure that converts the condition (1) into a form where the distortion between an error pattern e and an erasure pattern s, denoted d(e, s), is less than a fixed threshold.
Definition 2
Given a letter-by-letter distortion measure δ(e_i, s_i), the distortion between an error pattern e and an erasure pattern s is defined by

d(e, s) = Σ_{i=1}^{n} δ(e_i, s_i).
Proposition 1
If we choose the letter-by-letter distortion measure as follows

δ(0, 0) = 0, δ(0, 1) = 1, δ(1, 0) = 2, δ(1, 1) = 1,   (2)

then the condition (1) for a successful error-and-erasure decoding reduces to the form where the distortion is less than a fixed threshold:

d(e, s) < d_min.
First, we define a_{uv} to count the number of pairs (e_i, s_i) equal to (u, v) for every u, v ∈ {0, 1}. Noticing that the number of errors in unerased positions is a_{10} and the number of erasures is a_{01} + a_{11}, the condition (1) for one error-and-erasure decoding attempt to succeed becomes 2a_{10} + a_{01} + a_{11} < d_min. By seeing that d(e, s) = 2a_{10} + a_{01} + a_{11}, we conclude the proof.

Next, we try to maximize the chance that this successful decoding condition is satisfied by at least one of the decoding attempts (i.e., for at least one erasure pattern s in the chosen set B). Mathematically, we want to build a set B of no more than 2^{nR} erasure patterns in order to

minimize Pr( min_{s ∈ B} d(e, s) ≥ d_min ).   (3)
The exact answer to this problem is difficult to find. However, one can see it as a covering problem where one wants to cover the space of error patterns using a minimum number of balls centered at the chosen erasure patterns.
This view leads to an asymptotic solution of the problem based on rate-distortion theory. More precisely, we view the error pattern as a source sequence and the erasure pattern as a reproduction sequence.
Rate-distortion theory shows that the set B of 2^{nR} reproduction sequences can be generated randomly so that the expected minimum distortion E[min_{s ∈ B} d(e, s)] approaches the minimum possible distortion for a given rate R. Thus, for large enough n, we have

min_{s ∈ B} d(e, s) ≈ n D(R)

with high probability. Here, R and D are closely related to the complexity and the performance, respectively, of the decoding algorithm. Therefore, we characterize the tradeoff between those two aspects using the relationship between R and D.
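The distortion measure of Proposition 1 and the resulting covering view can be made concrete with a small sketch. This is a minimal illustration assuming the binary pattern representation of Definition 1 (e_i = 1 for an error, s_i = 1 for an erasure); the helper names are hypothetical.

```python
# Letter-by-letter distortion from Proposition 1 (rows: e_i in {0,1},
# columns: s_i in {0,1}): an unerased error costs 2, an erasure costs 1.
DELTA = [[0, 1],   # e_i = 0 (most likely symbol correct)
         [2, 1]]   # e_i = 1 (most likely symbol wrong)

def distortion(e, s):
    """d(e, s) = sum over positions of delta(e_i, s_i)."""
    return sum(DELTA[ei][si] for ei, si in zip(e, s))

def trial_succeeds(e, s, d_min):
    """One error-and-erasure trial succeeds iff 2*(errors in unerased
    positions) + (erasures) < d_min, i.e. iff d(e, s) < d_min."""
    return distortion(e, s) < d_min

def some_trial_succeeds(e, patterns, d_min):
    """Multiple decoding succeeds iff some erasure pattern 'covers'
    the error pattern within distortion d_min."""
    return min(distortion(e, s) for s in patterns) < d_min

# Toy example with d_min = 5: three unerased errors fail (d = 6), but
# erasing two of the three error locations succeeds (d = 2 + 1 + 1 = 4).
e = [1, 1, 1, 0, 0, 0]
assert not trial_succeeds(e, [0] * 6, 5)
assert trial_succeeds(e, [0, 1, 1, 0, 0, 0], 5)
```

The set-construction problem in (3) is then exactly the problem of covering the likely error patterns with balls of radius d_min under this distortion.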
II-B Generalized error and erasure patterns
In this subsection, we consider a generalization of the conventional error and erasure patterns under the same framework to make better use of the soft information. At each index of the RS codeword, besides erasing the symbol, we can try to decode using not only the most likely symbol but also other ones as the hard-decision (HD) symbol. To handle up to the ℓ most likely symbols at each index i, we consider the following definition.
Definition 3
(Generalized error patterns and erasure patterns) Consider a positive integer ℓ. Let us define e ∈ {0, 1, ..., ℓ}^n as the generalized error pattern where, at index i, e_i = j implies that the jth most likely symbol is correct for 1 ≤ j ≤ ℓ, and e_i = 0 implies that none of the first ℓ most likely symbols is correct. Let s ∈ {0, 1, ..., ℓ}^n be the generalized erasure pattern used for decoding where, at index i, s_i = j implies that the jth most likely symbol is used as the hard-decision symbol for 1 ≤ j ≤ ℓ, and s_i = 0 implies that an erasure is used at that index.
Theorem 1
We choose the letter-by-letter distortion measure δ(e_i, s_i) defined in terms of the (ℓ+1) × (ℓ+1) matrix with entries δ(j, 0) = 1 for all j (an erasure), δ(j, j) = 0 for 1 ≤ j ≤ ℓ (a correct hard decision), and δ(j, j′) = 2 for j′ ≥ 1 with j ≠ j′ (an incorrect hard decision).

Using this, the condition (1) for a successful error-and-erasure decoding becomes d(e, s) < d_min.
The reasoning is similar to Proposition 1, using the fact that index i is erased when s_i = 0 and is in error when s_i ≥ 1 and e_i ≠ s_i. For each ℓ, we will refer to this generalized case as mBMℓ decoding.
Example 3
We consider the case of mBM2 decoding, where ℓ = 2. The distortion measure is given by the matrix

           s_i = 0   s_i = 1   s_i = 2
e_i = 0:     1         2         2
e_i = 1:     1         0         2
e_i = 2:     1         2         0
Here, at each codeword position, we consider the first and second most likely symbols as the two hard-decision choices, as in the Chase-type decoding method proposed by Bellorado and Kavcic [7].
II-C Proposed General Multiple-Decoding Algorithm
In this section, we propose a general multiple-decoding algorithm for RS codes based on the rate-distortion approach. This general algorithm applies not only to multiple error-and-erasure decoding but also to multiple-decoding of other decoding schemes that we will discuss later. The first step is designing a distortion measure that converts the condition for a single decoding attempt to succeed into the form where distortion is less than a fixed threshold. After that, decoding proceeds as described below.

Phase I: Compute the rate-distortion function.
Step 1: Transmit a number (say T) of arbitrary test RS codewords, indexed by time t, over the channel and compute a set of probability matrices, one per time t, whose (j, i) entry is the probability of the jth most likely symbol at position i at time t.
Step 2: For each time t, permute the columns of the matrix so that they are sorted by the probabilities of the most likely symbols in increasing order, reflecting the reliability order of codeword positions. Take the entrywise average of all T sorted matrices to get an average matrix.
Step 3: Compute the RD function of a source sequence (error pattern) with the probabilities of source letters derived from this average matrix and the designed distortion measure (see Section III and Section V-B). Determine the point on the RD curve that corresponds to a designated rate R, along with the test-channel input-probability distribution vector that achieves that point.

Phase II: Run the actual decoder.
Step 4: Based on the actual received signal sequence, compute the symbol probabilities and determine the permutation that gives the reliability order of codeword positions by sorting the probabilities of the most likely symbols in increasing order.
Step 5: Randomly generate a set of 2^{nR} erasure patterns using the test-channel input-probability distribution vector, and permute the indices of each erasure pattern by the permutation found in Step 4.
Step 6: Run multiple attempts of the corresponding decoding scheme (e.g., error-and-erasure decoding) using the set of erasure patterns in Step 5 to produce a list of candidate codewords.
Step 7: Use Maximum-Likelihood (ML) decoding to pick the best codeword on the list.
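Steps 4 and 5 of Phase II can be sketched as follows. This is a minimal illustration assuming erasure letters are drawn independently per position from the test-channel input distribution; the function name, the dict-based distribution `q`, and the permutation convention are our own illustrative choices, not from the paper.

```python
import random

def generate_erasure_patterns(q, num_patterns, perm, rng=random.Random(0)):
    """Step 5 sketch: draw each erasure letter independently from the
    test-channel input distribution q[i] (a dict letter -> probability,
    indexed in reliability order), then map reliability index i to the
    actual codeword position perm[i]."""
    n = len(q)
    patterns = []
    for _ in range(num_patterns):
        s = [0] * n
        for i in range(n):
            letters, probs = zip(*q[i].items())
            letter = rng.choices(letters, weights=probs)[0]
            s[perm[i]] = letter          # perm[i]: position of the i-th LRP
        patterns.append(s)
    return patterns

# Toy usage: 6 positions; erase (letter 1) the two LRPs with prob. 0.8,
# never erase the rest.  Position 3 is assumed to be the least reliable.
q = [{0: 0.2, 1: 0.8}] * 2 + [{0: 1.0, 1: 0.0}] * 4
perm = [3, 0, 1, 2, 4, 5]
patterns = generate_erasure_patterns(q, 4, perm)
assert len(patterns) == 4 and all(len(s) == 6 for s in patterns)
```

Each generated pattern then feeds one decoding trial in Step 6, and the ML step picks among the survivors.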
III Computing the Rate-Distortion Function
In this section, we present a numerical method to compute the RD function and the test-channel input-probability distribution that achieves a specific point on the RD curve. This probability distribution is needed to randomly generate the set of erasure patterns in the general multiple-decoding algorithm that we have proposed.
For an arbitrary discrete distortion measure, it can be difficult to compute the RD function analytically. Fortunately, the Blahut-Arimoto (BA) algorithm (see details in [12, 13]) gives an alternating minimization technique that efficiently computes the RD function of a single discrete source. More precisely, given a parameter λ, which represents the slope of the RD curve at a specific point, and an arbitrary all-positive initial test-channel input-probability distribution vector, the BA algorithm shows how to compute the rate-distortion point by means of computing the test-channel input-probability distribution vector and the test-channel transition probability matrix that achieve that point.
However, it is not straightforward to apply the BA algorithm to compute the RD function for a discrete source sequence (an error pattern in our context) of independent but nonidentical source components. In order to do that, we consider the group of source letters as a super-source letter, the group of reproduction letters as a super-reproduction letter, and the source sequence as a single source. The probability of each super-source letter factors into a product over components by independence. While we could apply the BA algorithm to this source directly, the complexity is a problem because the alphabet sizes for the super-source and super-reproduction letters grow exponentially with n. Instead, we avoid this computational challenge by choosing the initial test-channel input-probability distribution so that it can be factorized into a product of initial test-channel input-probability components. Then, we see that this factorization still holds after every step of the iterative process. By doing this, for each parameter λ we only need to compute the rate-distortion pair for each component (or index i) separately and sum them together. This is captured in the following theorem.
Theorem 2
(Factored Blahut-Arimoto algorithm) Consider a discrete source sequence of independent but nonidentical source components. Given a parameter λ, the rate R and the distortion D for this source sequence are given by

R = Σ_{i=1}^{n} R_i(λ),   D = Σ_{i=1}^{n} D_i(λ),

where the components R_i(λ) and D_i(λ) are computed by the BA algorithm with the same parameter λ. This pair of rate and distortion can be achieved by the test-channel input-probability distribution given by the product of the component distributions computed by the BA algorithm.
See Appendix A.
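The factored computation can be sketched as follows. This is a minimal sketch assuming the standard exponential-tilting form of the BA update with slope parameter lam < 0; the function names, the list-based distortion matrix, and the fixed iteration count are our own illustrative choices.

```python
import math

def ba_component(p, delta, lam, iters=500):
    """Blahut-Arimoto for one source component: source distribution p
    (list over source letters), distortion matrix delta[e][s_hat], and
    slope parameter lam < 0.  Returns (R, D, q) on the RD curve."""
    n_s = len(delta[0])
    q = [1.0 / n_s] * n_s                   # all-positive initial guess
    for _ in range(iters):
        # Test-channel update: c(s_hat | e) ~ q(s_hat) * exp(lam * delta).
        c = []
        for e, pe in enumerate(p):
            w = [q[s] * math.exp(lam * delta[e][s]) for s in range(n_s)]
            z = sum(w)
            c.append([wi / z for wi in w])
        # Marginal update: q(s_hat) = sum_e p(e) c(s_hat | e).
        q = [sum(p[e] * c[e][s] for e in range(len(p))) for s in range(n_s)]
    D = sum(p[e] * c[e][s] * delta[e][s]
            for e in range(len(p)) for s in range(n_s))
    R = sum(p[e] * c[e][s] * math.log2(c[e][s] / q[s])
            for e in range(len(p)) for s in range(n_s) if c[e][s] > 0)
    return R, D, q

def factored_ba(sources, delta, lam):
    """Theorem 2 sketch: run BA independently on each component with the
    same slope lam and sum the component rates and distortions."""
    pairs = [ba_component(p, delta, lam) for p in sources]
    return sum(r for r, _, _ in pairs), sum(d for _, d, _ in pairs)

# Toy check with the distortion matrix of Proposition 1: two positions
# with error probabilities 0.3 and 0.05.
DELTA = [[0, 1], [2, 1]]
R, D = factored_ba([[0.7, 0.3], [0.95, 0.05]], DELTA, lam=-3.0)
assert R >= 0 and D >= 0
```

Sweeping lam traces out the RD curve; the per-component distributions q can then be used directly in Step 5 of the general algorithm.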
IV Multiple Algebraic Soft-Decision (ASD) Decoding
In this section, we analyze and design a distortion measure to convert the condition for successful ASD decoding into a suitable form so that we can apply the general multiple-decoding algorithm to ASD decoding.
First, let us give a brief review of ASD decoding of RS codes. Given a set of n distinct elements α_1, ..., α_n in GF(q), each message polynomial f(X) of degree less than k yields a codeword by evaluating the message polynomial at these points, i.e., c_i = f(α_i) for i = 1, ..., n. Given a received vector, we can compute the a posteriori probability (APP) matrix whose (x, i) entry is the probability that c_i equals the symbol x given the channel output.
The ASD decoding as in [2] has the following main steps.

Multiplicity Assignment: Use a particular multiplicity assignment scheme (MAS) to derive a multiplicity matrix, denoted M, of nonnegative integer entries from the APP matrix.

Interpolation: Construct a bivariate polynomial Q(X, Y) of minimum weighted degree that passes through each point (α_i, x) with the multiplicity specified by the corresponding entry of M.

Factorization: Find all polynomials f(X) of degree less than k such that Y − f(X) is a factor of Q(X, Y) and re-evaluate these polynomials to form a list of candidate codewords.
In this paper, we denote m as the maximum multiplicity. Intuitively, higher multiplicity should be put on more likely symbols. Increasing m generally improves the performance of ASD decoding. However, one of the drawbacks of ASD decoding is that its decoding complexity grows roughly as the fourth power of m, so it increases sharply with m. Thus, in this section we work with small m to keep the complexity affordable.
One of the main contributions of [2] is a condition for successful ASD decoding represented in terms of two quantities, the score and the cost, defined as follows.
Definition 4
The score of a multiplicity matrix M with respect to a codeword c is defined as

S_M(c) = Σ_{i=1}^{n} M(c_i, i),

where M(x, i) denotes the multiplicity assigned to symbol x at position i. The cost of a multiplicity matrix M is defined as

C(M) = (1/2) Σ_{i=1}^{n} Σ_{x} M(x, i) (M(x, i) + 1).
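The score and cost can be computed directly from these definitions. In this sketch we represent each column of M as a map from symbols to multiplicities; this dict-based representation is our own illustrative choice, not notation from the paper.

```python
def score(M, c):
    """Score of multiplicity matrix M w.r.t. codeword c: the sum of the
    multiplicities that M places on the transmitted symbols.  M[i] is a
    dict mapping a symbol to its multiplicity at position i."""
    return sum(M[i].get(ci, 0) for i, ci in enumerate(c))

def cost(M):
    """Cost of M: sum of m*(m+1)/2 over all multiplicity entries."""
    return sum(mult * (mult + 1) // 2 for col in M for mult in col.values())

# Toy example: n = 3; multiplicity 2 on one symbol at the first two
# positions, multiplicity 1 on two symbols at the last position.
M = [{'a': 2}, {'b': 2}, {'a': 1, 'c': 1}]
assert score(M, ['a', 'b', 'c']) == 2 + 2 + 1
assert cost(M) == 3 + 3 + 1 + 1
```

Intuitively, the score rewards multiplicity placed on the true codeword symbols, while the cost charges for the total interpolation effort.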
Condition 2

(ASD decoding threshold, see [2]) ASD decoding succeeds (i.e., the transmitted codeword is on the list) if, up to minor terms,

S_M(c) ≥ √( 2 (k − 1) C(M) ).   (4)

To match the general framework, the ASD decoding threshold (4) should be converted to the form where the distortion is smaller than a fixed threshold.
IV-A Bit-level ASD case
In this subsection, we consider multiple trials of ASD decoding using bit-level erasure patterns. A bit-level error pattern b and a bit-level erasure pattern t have length nm, since each symbol has m bits. Similar to Definition 1 of a conventional error pattern and a conventional erasure pattern, b_i = 1 in a bit-level error pattern implies that a bit-level error occurs, and t_i = 1 in a bit-level erasure pattern implies that a bit-level erasure occurs.
From each bit-level erasure pattern, we can specify entries of the multiplicity matrix using the bit-level MAS proposed in [14] as follows: for each codeword position, assign multiplicity 2 to the symbol with no bit erased, assign multiplicity 1 to each of the two candidate symbols if there is 1 bit erased, and assign multiplicity zero to all the symbols if two or more bits are erased. All the other entries are zero by default. This MAS has a larger decoding region compared to the conventional error-and-erasure decoding scheme.
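The per-position assignment rule above can be sketched as follows. This is a minimal sketch in which symbols are represented as bit tuples; the function name and this representation are our own illustrative choices, not from [14].

```python
def bitlevel_mas(hard_bits, erased):
    """Bit-level MAS sketch for one codeword position: hard_bits is the
    bitwise hard decision for the symbol, erased flags which bit
    positions are erased.  Returns {symbol: multiplicity} for this
    column of the multiplicity matrix."""
    erased_idx = [i for i, flag in enumerate(erased) if flag]
    if len(erased_idx) == 0:
        return {tuple(hard_bits): 2}       # no bit erased: multiplicity 2
    if len(erased_idx) == 1:               # 1 bit erased: two candidate
        i = erased_idx[0]                  # symbols, multiplicity 1 each
        a, b = list(hard_bits), list(hard_bits)
        a[i], b[i] = 0, 1
        return {tuple(a): 1, tuple(b): 1}
    return {}                              # >= 2 bits erased: all zeros

assert bitlevel_mas([1, 0, 1], [False] * 3) == {(1, 0, 1): 2}
assert bitlevel_mas([1, 0, 1], [False, True, False]) == {(1, 0, 1): 1,
                                                         (1, 1, 1): 1}
assert bitlevel_mas([1, 0, 1], [True, True, False]) == {}
```

Note how a single erased bit splits the hard decision into exactly two candidate symbols, which is what enlarges the decoding region relative to erasing the whole symbol.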
Condition 3
(Bit-level ASD decoding threshold, see [14]) For an RS code of rate k/n, ASD decoding using the proposed bit-level MAS will succeed (i.e., the transmitted codeword is on the list) if
(5) 
where s_b is the number of bit-level erasures and e_b is the number of bit-level errors in unerased locations.
We can choose an appropriate distortion measure according to the following proposition, which is a natural bit-level extension of Proposition 1.
Proposition 2
If we choose the bit-level letter-by-letter distortion measure as follows
(6) 
then the condition (5) becomes
(7) 
The proof uses the same reasoning as the proof of Proposition 1.
Remark 1
We refer to the multiple-decoding of bit-level ASD as mbASD.
IV-B Symbol-level ASD case
In this subsection, we try to convert the condition for successful ASD decoding in general to the form that suits our goal. We will also determine which multiplicity assignment schemes allow us to do so.
Definition 5
(Multiplicity type) For some codeword position, let us assign multiplicity m_j to the jth most likely symbol for 1 ≤ j ≤ ℓ, where each m_j is a nonnegative integer no larger than the maximum multiplicity m. The remaining entries in the column are zeros by default. We call the sequence (m_1, m_2, ..., m_ℓ) the column multiplicity type for "top-ℓ" decoding.
First, we notice that a choice of multiplicity type in ASD decoding at each codeword position has a similar meaning to a choice of erasure decision in conventional error-and-erasure decoding. However, in ASD decoding we are more flexible and may have more types of erasures. For example, assigning multiplicity zero to all the symbols (the all-zero multiplicity type) at codeword position i corresponds to the case when we have a complete erasure at that position. Assigning the maximum multiplicity m to one symbol corresponds to the case when we choose that symbol as the hard-decision one. Hence, with some abuse of terminology, we also use the term (generalized) erasure pattern for the multiplicity assignment scheme in the ASD context. Each erasure letter gives the multiplicity type for the corresponding column of the multiplicity matrix M.
Definition 6
(Error and erasure patterns for ASD decoding) Consider a MAS with c multiplicity types. Let s ∈ {1, 2, ..., c}^n be an erasure pattern where, at index i, s_i = t implies that multiplicity type t is used for column i of the multiplicity matrix M. Notice that the definition of an error pattern in Definition 3 applies unchanged here.
Rate-distortion theory gives us the intuition that, in general, the more multiplicity types (erasure choices) we have, the better the performance of multiple ASD decoding as n becomes large. Thus, we want to find as many multiplicity types for "top-ℓ" decoding as possible that allow us to convert the condition for successful ASD decoding to the desired form.
Example 4
Choosing ℓ = 2 and m = 2, for example, gives four column multiplicity types for "top-2" decoding as follows: the first is (2, 0), where we assign multiplicity 2 to the most likely symbol; the second is (1, 1), where we assign equal multiplicity 1 to the first and second most likely symbols; the third is (0, 2), where we assign multiplicity 2 to the second most likely symbol; and the fourth is (0, 0), where we assign multiplicity zero to all the symbols at index i (i.e., the ith column of M is an all-zero column). As a corollary of Theorem 3 below, a distortion matrix that converts (4) to the correct form for this case can be obtained.
The following definition and theorem provide a set of allowable multiplicity types that allows the condition for successful ASD decoding to be converted into the form where distortion is less than a fixed threshold.
Definition 7
The set of allowable multiplicity types for "top-ℓ" decoding with maximum multiplicity m is defined to be

(8)

Taking the elements of this set in an arbitrary order, we let the tth multiplicity type in the allowable set be (m_1^{(t)}, ..., m_ℓ^{(t)}).
Example 5
The allowable set for m = 1 consists of all permutations of its base type. Meanwhile, the allowable set for m = 2 comprises all permutations of its base types, and we refer to the multiple ASD decoding algorithm using this set of multiplicity types as mASD2. The allowable set for m = 3 is obtained similarly, and this case is referred to as mASD3. We also consider another case, called mASD2a, that uses a reduced set of multiplicity types.
Theorem 3
Let c be the number of multiplicity types in a MAS for "top-ℓ" decoding with maximum multiplicity m. Let δ be a letter-by-letter distortion measure defined by δ(e_i, s_i) in terms of an (ℓ+1) × c matrix determined by the multiplicity types. Then, the condition (4) for successful ASD decoding of an RS code with sufficiently high rate is equivalent to
(9) 
[Sketch of proof] (See details in [16]) Let and be the score and cost of the multiplicity assignment. First, we show that in (4) implies that . Combining this inequality with the high-rate constraint in Theorem 3 implies that . From (4), we also know that and this implies that . But, the conditions of the theorem can also be used to show that . Combining this with gives a contradiction unless . Thus, we conclude that .
Therefore, the condition in (4) is equivalent to because is a consequence of and is satisfied by the high-rate constraint. Finally, one can show that is equivalent to with the chosen distortion matrix.
Remark 2
For a fixed , the size of is maximized when . Multiplicity types and any permutation of are always in the allowable set .
V Some Extensions and Generalizations
V-A Erasure patterns using covering codes
The RD framework we use is most suitable as n → ∞. For finite n, the random-coding approach may handle the few LRPs poorly. We can instead use good covering codes to handle these LRPs. In the scope of covering problems, one can use a covering code over a small alphabet (e.g., a perfect Hamming or Golay code) with a given covering radius to cover the whole space of vectors of the same length over that alphabet. The covering may still work well if the distortion measure is close to, but not exactly equal to, the Hamming distortion.
In order to take care of up to the ℓ most likely symbols at each of the LRPs of an RS codeword, we consider an ℓ-ary covering code whose codeword alphabet is {1, 2, ..., ℓ}. Then, we give a definition of the (generalized) error patterns and erasure patterns for this case. In order to draw similarities between this case and the previous cases, we still use the terminology "generalized erasure pattern," shortened to erasure pattern, even though error-only decoding is used. For error-only decoding, Condition 1 for successful decoding becomes 2ν < d_min, where ν is the number of symbol errors.
Definition 8
(Error and erasure patterns for error-only decoding) Let us define e ∈ {0, 1, ..., ℓ}^n as an error pattern where, at index i, e_i = j implies that the jth most likely symbol is correct for 1 ≤ j ≤ ℓ, and e_i = 0 implies that none of the first ℓ most likely symbols is correct. Let s ∈ {1, ..., ℓ}^n be an erasure pattern where, at index i, s_i = j implies that the jth most likely symbol is chosen as the hard-decision symbol.
Proposition 3
If we choose the letter-by-letter distortion measure defined in terms of the (ℓ+1) × ℓ matrix with entries

δ(0, j) = 1 for 1 ≤ j ≤ ℓ, δ(j, j) = 0 for 1 ≤ j ≤ ℓ, and δ(j, j′) = 1 for j ≠ j′ with j, j′ ≥ 1,   (10)
then the condition for successful error-only decoding becomes

d(e, s) < d_min / 2.   (11)
It follows directly from the fact that d(e, s) counts the number of symbol errors when the hard decisions are given by s.
Remark 3
If we delete the first row, which corresponds to the case where none of the first ℓ most likely symbols is correct, then the distortion measure is exactly the Hamming distortion.
Split covering approach:
We can break an error pattern e into two sub-error patterns: e′ on the least reliable positions and e″ on the most reliable positions. Similarly, we can break an erasure pattern s into two sub-erasure patterns s′ and s″. Let f be the number of positions among the LRPs where none of the first ℓ most likely symbols is correct, i.e., where e_i = 0. If we take the set of all sub-erasure patterns s′ to be an ℓ-ary covering code with covering radius r, then min_{s′} d(e′, s′) ≤ r + f, because the covering code has covering radius r and each position counted by f contributes distortion 1 regardless of the hard decision. Since d(e, s) = d(e′, s′) + d(e″, s″), in order to increase the probability that the condition (11) is satisfied we want to make d(e″, s″) as small as possible by the use of the RD approach. The following proposition summarizes how to generate a set of erasure patterns for multiple runs of error-only decoding.
Proposition 4
In each erasure pattern, the letter sequence at the LRPs is set to be a codeword of an ℓ-ary covering code. The letter sequence at the remaining positions is generated randomly by the RD method (see Section II-C) with a chosen rate and the distortion measure in (10). If this covering code has K codewords, the total number of erasure patterns is K times the number of randomly generated sub-patterns.
Example 6
For a (7,4,3) binary Hamming code, which has covering radius 1, we take care of the ℓ = 2 most likely symbols at each of the 7 LRPs. Each codeword of this Hamming code (with its binary alphabet relabeled as {1, 2}) specifies the hard decisions at the 7 LRPs, where the positions are written in increasing reliability order. The sub-erasure patterns for the remaining positions are generated randomly using the RD approach.
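The covering property used in this example can be checked by brute force. The particular generator matrix below is one common choice for a (7,4) Hamming code; since the code is perfect, every 7-bit word lies within Hamming distance 1 of some codeword.

```python
from itertools import product

# Generator matrix [I | P] of a (7,4) binary Hamming code (one common
# choice; any equivalent code has the same covering radius).
G = [[1, 0, 0, 0, 1, 1, 0],
     [0, 1, 0, 0, 1, 0, 1],
     [0, 0, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]

codewords = set()
for msg in product([0, 1], repeat=4):
    cw = tuple(sum(m * g for m, g in zip(msg, col)) % 2 for col in zip(*G))
    codewords.add(cw)

def covering_radius(codewords, n):
    """Max over all n-bit words of the distance to the nearest codeword."""
    return max(
        min(sum(a != b for a, b in zip(v, cw)) for cw in codewords)
        for v in product([0, 1], repeat=n))

assert len(codewords) == 16
assert covering_radius(codewords, 7) == 1   # perfect code: radius 1
```

Relabeling the code alphabet {0, 1} as the hard-decision choices {1, 2} gives the 16 sub-erasure patterns for the 7 LRPs used by the mBMHM74 algorithms.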
Remark 4
While it also makes sense to use a covering code for the LRPs of the erasure patterns and set the rest to the letter 1 (i.e., choose the most likely symbol as the hard decision), our simulation results show that the performance can be improved by using a combination of covering codes and random (i.e., generated by the RD approach) codes.
V-B Closed-form rate-distortion functions
For some simple distortion measures, we can compute the RD functions analytically in closed form. First, we view an error pattern as a sequence of independent but nonidentical random sources. Then, we compute the component RD functions at each index of the sequence and use convex optimization techniques to allocate the total rate and distortion to the various components. This method converges to the solution faster than the numerical method in Section III. The following two theorems describe how to compute the RD functions for the simple distortion measures of Propositions 1 and 2.
Theorem 4
(Conventional error-and-erasure decoding) The overall rate-distortion function is given by the sum of component rate-distortion functions, where the component rates and distortions can be found by a reverse water-filling procedure:

where the water level should be chosen so that the component distortions sum to the total distortion D. The function can be achieved by the corresponding test-channel input-probability distribution.
[Sketch of proof] (See [16] for details) With the distortion measure in (2), we follow the method in [17] to compute the rate-distortion function component and the test-channel input-probability distribution for each index i. Then, one can show that the optimal allocation of rate and distortion to the various components is given by a reverse water-filling procedure as in [18].
Theorem 5
(Bit-level ASD case in Proposition 2) The overall rate-distortion function in this case is given by the sum of component rate-distortion functions, where the distortion component at each index is given by

where the water level should be chosen so that the component distortions sum to the total distortion D. The function can be achieved by the corresponding test-channel input-probability distribution.
VI Simulation results
Using simulations, we consider the performance of the (255, 239) RS code over an AWGN channel with BPSK as the modulation format. The mBM1 curve corresponds to standard error-and-erasure BM decoding with multiple erasure patterns. For ℓ ≥ 2, the mBMℓ curves correspond to error-and-erasure BM decoding with multiple decoding trials using both erasures and the top-ℓ symbols. The mASDm curves correspond to multiple ASD decoding trials with maximum multiplicity m. The number of trial decoding patterns is 2^x, where x is given in parentheses in each algorithm's acronym (e.g., mBM2(11) uses 2^{11} patterns).
Fig. 2 shows the RD curves for various algorithms at a fixed SNR. For this code, the fixed threshold for decoding is d_min = 17. Therefore, one might expect that algorithms whose average distortion is less than 17 should achieve a low frame error rate (FER). The RD curve allows one to estimate the number of decoding patterns required to achieve this FER. Notice that the mBM1 algorithm at rate 0, which is very similar to conventional BM decoding, has an expected distortion of roughly 24. For this reason, the FER of conventional decoding is close to 1. The RD curve tells us roughly how many erasure patterns must be tried before the FER drops substantially, because this is where the expected distortion drops below 17. Likewise, the mBM2(11) algorithm has an expected distortion of less than 14, so we expect (and our simulations confirm) that its FER should be small. One weakness of this approach is that the RD function describes only the average distortion and does not directly consider the probability that the distortion is greater than 17. Still, we can make the following observations from the RD curve. Even at low rates, we see that the distortion achieved by mBM2 is roughly the same as mBM3, mASD2, and mASD3 but smaller than mASD2a and mBM1. This implies that mBM2 is no worse than the more complicated ASD-based approaches for a wide range of rates.
The FER of various algorithms can be seen in Fig. 2. Focusing on algorithms with 2^{11} decoding patterns allows us to make fair comparisons with SED(12,12). With the same number of decoding trials, mBM2(11) outperforms SED(12,12) by 0.3 dB at the same FER. Even mBM2(7), with many fewer decoding trials, outperforms both SED(12,12) and the KV algorithm. Among all our proposed algorithms with the same rate, mBMHM74(11) achieves the best performance. This algorithm uses the Hamming (7,4) covering code for the 7 LRPs and the RD approach for the remaining codeword positions. Meanwhile, the small differences in performance between mBM2(11), mBM3(11), mASD2(11), and mASD3(11) suggest that: (i) taking care of the two most likely symbols at each codeword position is good enough for multiple decoding of high-rate RS codes, and (ii) multiple runs of error-and-erasure decoding are almost as good as multiple runs of ASD decoding. Recall that this result is also correctly predicted by the RD analysis. Moreover, it is quite reasonable, since we know that the gain of GS decoding, with infinite multiplicity, over the BM algorithm is negligible for high-rate RS codes. To compare with the LCC Chase-type approach used in [7], we also consider the mBMHM74(4) algorithm, which uses the Hamming (7,4) covering code for the 7 LRPs and the hard-decision pattern for the remaining codeword positions. This shows that the covering code achieves better performance with the same number (2^4 = 16) of decoding attempts. The comparison is not entirely fair, however, because of their low-complexity approach to multiple decoding. We believe, nevertheless, that their technique can be generalized to covering codes.
VII Conclusion
A rate-distortion approach is proposed as a unified framework to analyze multiple decoding trials, with various algorithms, of RS codes in terms of performance and complexity. A connection is made between the complexity and performance (in some asymptotic sense) of these multiple-decoding algorithms and the rate and distortion of an associated RD problem. Covering codes are also combined with the rate-distortion approach to mitigate the suboptimality of random codes when the effective block length is not large. As part of this analysis, we also present numerical and analytical computations of the rate-distortion function for sequences of independent but nonidentical sources. Finally, the simulation results show that our proposed algorithms based on the RD approach achieve a better performance-versus-complexity tradeoff than previously proposed algorithms. One key result is that, for high-rate RS codes, multiple-decoding using the standard BM algorithm is as good as multiple-decoding using more complex ASD algorithms.
In this paper, we only discuss the rate-distortion approach to solve the problem in (3). However, the performance can be further improved by focusing on the rate-distortion error exponent. This allows us to approximately solve the covering problem for finite n rather than just as n → ∞. The complexity of multiple decoding can also be decreased by using clever techniques to lower the complexity per decoding trial (e.g., [7]). We will address these two improvements in a future paper.
Appendix A Proof of Theorem 2
First, let us recall that for each source component i, the BA algorithm computes the RD pair in the following steps:

Choose an arbitrary all-positive test-channel input-probability distribution vector q^{(0)}.

Iterate the following steps at iteration ν = 0, 1, 2, ...:

c^{(ν)}(ŝ | e) = q^{(ν)}(ŝ) e^{λ δ(e, ŝ)} / Σ_{ŝ′} q^{(ν)}(ŝ′) e^{λ δ(e, ŝ′)},

q^{(ν+1)}(ŝ) = Σ_{e} p(e) c^{(ν)}(ŝ | e),

where c^{(ν)}(ŝ | e) is the test-channel transition probability and λ ≤ 0 is the slope parameter. It is shown by BA that these iterates converge as ν → ∞.

The rate and distortion can then be computed by

D = Σ_{e, ŝ} p(e) c(ŝ | e) δ(e, ŝ)   and   R = Σ_{e, ŝ} p(e) c(ŝ | e) log ( c(ŝ | e) / q(ŝ) ).