An Iteratively Decodable Tensor Product Code with Application to Data Storage
Abstract
The error-pattern correcting code (EPCC) can be constructed to provide a syndrome decoding table targeting the dominant error events of an intersymbol interference channel at the output of the Viterbi detector. For the size of the syndrome table to be manageable and the list of possible error events to be reasonable in size, the codeword length of EPCC needs to be short enough. However, the rate of such a short code is too low for hard drive applications. To accommodate the required large redundancy, it is possible to record only a highly compressed function of the parity bits of EPCC’s tensor product with a symbol correcting code. In this paper, we show that the proposed tensor error-pattern correcting code (TEPCC) is linear time encodable and also devise a low-complexity soft iterative decoding algorithm for EPCC’s tensor product with q-ary LDPC (TEPCCLDPC). Simulation results show that TEPCCLDPC achieves nearly the same performance as single-level LDPC with a KB sector at reduction in decoding complexity. Moreover, KB TEPCCLDPC surpasses the performance of KB single-level LDPC at the same decoder complexity.
Tensor product codes, intersymbol interference, turbo equalization, error-pattern correction, q-ary LDPC, multilevel log-likelihood ratio, tensor symbol signatures, signature-correcting code, detection postprocessing.
I Introduction
The advent of high recording density enabling technologies, pioneered by galloping improvements in head and media design and manufacturing processes, has pushed for similar advances in read channel design and error correction coding, driving research efforts into developing channel-capacity-approaching coding schemes based on soft iterative decoding that are also implementation friendly [1, 2]. Soft decodable error correction codes (ECC), mainly low density parity check (LDPC) codes, would eventually replace the conventional Reed-Solomon (RS) outer ECC, which, despite its large minimum distance, possesses a dense parity check matrix that does not lend itself easily to powerful belief propagation (BP) decoding. There exists a vast literature on the various design aspects of LDPC coded systems for magnetic recording applications. This includes code construction [3, 4, 5, 6], efficient encoding [7, 8], decoder optimization [9, 10, 11], and performance evaluation [12, 13, 14]. In this work, we propose an LDPC coded system optimized for the magnetic recording channel that spans contributions in most of these areas.
The error-pattern correcting code (EPCC) was proposed in [15, 16, 17], motivated by the well-known observation that the error rate at the channel detector output of an ISI channel is dominated by a few specific known error cluster patterns. This is because the channel output energies associated with these error patterns are smaller than those of other patterns. A multiparity cyclic EPCC was first described in [16], with an RS outer ECC, possessing distinct syndrome sets for all such dominant error patterns. To reduce the code rate penalty, which is a severe SNR degradation in recording applications, a method to increase the code rate was introduced in [17] that also improved EPCC’s algebraic single and multiple error-pattern correction capability. In this method, the generator polynomial of the short base EPCC is multiplied by a primitive polynomial that is not already a factor of the generator polynomial. Also, the primitive polynomial degree is chosen so as to achieve a certain desired codeword length. Moreover, [17] describes a Viterbi detection postprocessor that provides error-event-reliability information aiding syndrome-mapping of EPCC to improve its correction accuracy. However, improving the EPCC code rate by extending its codeword length increases the probability of multiple dominant error patterns within the codeword, and this requires increasing the size of the syndrome table considerably to maintain the same correction power, which eventually results in prohibitive decoding complexity. To maintain correction power with a manageable size syndrome decoding table, [18] discusses a more efficient method based on a list decoding strategy that delivers satisfactory sector error rate (SER) gain with an outer RS ECC. Later, this list decoding scheme was formulated as a soft-input soft-output block in [19] and utilized to enhance the performance of turbo equalization based on convolutional codes (CC).
Nevertheless, the serial concatenation scheme that proved successful with RS hard decoding and CC-based turbo equalization does not work as well in serial concatenation of long-EPCC and LDPC. The reason is that when the LDPC decoder fails, especially in the waterfall region, the sector contains a large number of multiple error occurrences. When many such error events occur in a given EPCC codeword, decoding by any reasonable size list decoder is formidable. Thus, an inner EPCC cannot in any capacity reduce the SER of a serially concatenated outer LDPC. On the other hand, if the EPCC codeword length is decreased substantially, then the number of errors per codeword is reasonable, as long as the overall code rate is somehow kept high. Here, the concept of tensor product construction comes into play.
Tensor product parity codes (TPPC) were first proposed in [21] as the null space of the parity check matrix resulting from the tensor product of two other parity check matrices corresponding to a variety of code families. As a result, based on the choice of the concatenated codes, a TPPC would be classified as an error correction code if constructed from two ECCs, an error detection code (EDC) if constructed from two EDCs, and an error location code (ELC) if constructed from an EDC and an ECC in a special order. As a matter of fact, ELCs were introduced earlier in [24] and their algebraic properties studied in detail, but were later incorporated in the unified theme of TPPCs in [21]. Furthermore, a generalized class of hard-decodable ELCs was later suggested for application in the magnetic recording channel in [25]. In addition, TPPCs can be generalized by combining a number of codes on various extension fields with shorter binary codes. For this more general case, a decoding algorithm was developed in [26]. An ECC-type TPPC was applied to longitudinal magnetic recording in [22], and to perpendicular magnetic recording in [23]. In [22], a hard decodable tensor product code based on a single parity code and a BCH code is proposed as an inner code for RS. This code is suitable for low density longitudinal recording channels for which dominant errors have odd weights, such as and . Also, [22] proposes that the hard decoder passes the corrected parity bits to a Viterbi detector reflecting channel states and parity code states in order to compute the decoder output. Later, [23] presented two methods for combining a tensor-product single parity code with a distance-enhancing constrained code. This code combination achieved more satisfactory performance with RS as an outer code in high density perpendicular recording channels.
Our goal in this work is to utilize the concept of tensor product concatenation to construct high rate soft-decodable EPCCs on the symbol level of the outer ECC. The EPCC target error list is matched to the dominant error events normally observed in high density perpendicular recording channels. Since dominant error events in perpendicular recording are not only of odd weight [12], our EPCC must be a multiparity code. However, in this case, a Viterbi detector matching the channel and parity will have prohibitive complexity. In spite of this, the performance of the optimal decoder of the baseline parity-coded channel can be approached by the low complexity detection postprocessing technique in [18]. We also present in detail a low complexity highly parallel soft decoder for TEPCC and show that it achieves a better performance-complexity tradeoff compared to conventional iterative decoding schemes.
I-A Notations and Definitions

For a linear code , denotes the codeword length, denotes the user data length, and denotes the number of code parity bits.

For a certain parity check matrix corresponding to a linear code , a syndrome is the range of a perturbation of a codeword . A signature refers to the range under for any bit block, not necessarily a codeword formed of data and parity bits.

The multilevel log-likelihood ratio (mlLLR) of a random variable corresponding to the p.m.f. (probability mass function) , can be defined as: .

denotes a local segment of the sequence .

Assume and , then . Define . Also, and are two different representations of the recursive implementation of acting on the elements of the set .
I-B Acronyms

TPPC: Tensor Product Parity Code.

LDPC: q-ary Low Density Parity Check code.

RS: Reed-Solomon code.

BCJR: Bahl-Cocke-Jelinek-Raviv.

TEPCC: Tensor product Error Pattern Correction Code.

TEPCCLDPC and TEPCCRS: Tensor product of EPCC and LDPC or RS, respectively.

LLR: Log-Likelihood Ratio.

mlLLR: multilevel Log-Likelihood Ratio.

ML: Maximum Likelihood.

MAP: Maximum A Posteriori.

QC: Quasi-Cyclic.

SPA: Sum-Product Algorithm.
II Review of EPCC and the Tensor Product Coding Paradigm
In this section we give a brief review of the concept of EPCC, including the design of two example codes that will be utilized later in the simulation study. Also, we review the tensor product coding paradigm and present an encoding method that allows for EPCC-based linear-time-encodable TPPCs.
II-A EPCC Review and Examples
We review constructing a cyclic code targeting the set of dominant error events
represented as polynomials on that can occur at any starting position in a codeword of length . A syndrome of error type at position is defined as , where is the generator polynomial of the code and is the polynomial modulus operation.
A syndrome set for error type contains elements corresponding to all cyclic shifts of polynomial ; elements of are thus related by .
For unambiguous decoding of and , , we must have . This design requirement constrains to have distinct greatest common divisors with , for all targeted [16]. However, even if this constraint is satisfied, an element in can still map to more than one error position, i.e., the period of the syndrome set and period of  can be less than . Moreover, this constraint is only sufficient but not necessary. As shown in [16], there may exist a lower degree that can yield distinct syndrome sets for the targeted error polynomials, resulting in a higher rate EPCC. A search method to find this is already discussed in detail in [16] and [18]. We next give two example EPCC constructions that will be used throughout the paper. We target the dominant error events of the “ideal” equalized monic channel in AWGN, which is suitable as a partial response target in perpendicular magnetic recording read channels. For this channel, the dominant errors are given by: , , , etc., i.e. they can be represented as polynomials on for which all powers of have nonzero coefficients. The two EPCCs are:

Example 1: Targeting error polynomials up to degree , we get the generator polynomial of period via the search procedure of [16]. Choosing a codeword length of , distinct, nonoverlapping syndrome sets are utilized to distinguish the target errors. Then, syndrome set will have period , while all other sets have period . A syndrome set of period means that each syndrome decodes to one of possible error positions within the bit codeword. Nonetheless, can be decoded reliably via channel reliability information and the polarity of data support. The low code rate of makes this code unattractive as an inner code in a serial concatenation setup for recording channel applications. However, as we will see later, a tensor code setup makes it practical to use such powerful codes for recording applications.

Example 2: Targeting error polynomials up to degree , we have to record more redundancy. To this end, a cyclic code with parity bits, code rate , and a generator polynomial of period is found by the search procedure in [16]. Then, syndrome sets , , , and each have period and thus can be decoded without ambiguity, while syndrome sets , , , , and each have period , decoding to one of two positions. The worst is of period , which would decode to one of possible positions. Still, the algebraic decoder can quickly shrink this number to a few positions by checking the data support, and would then choose the one position with the highest local reliability.
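
The syndrome-set idea reviewed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the generator polynomial `G_HYP` (here 1 + x^3), the codeword length `N_HYP`, and the three-pattern target list `TARGETS_HYP` are hypothetical toy values chosen so that the result is easy to check by hand; they are not the generator polynomials obtained by the search procedure of [16]. Polynomials over GF(2) are stored as integers whose bit i holds the coefficient of x^i.

```python
# Toy illustration of EPCC syndrome sets (hypothetical parameters, not the paper's codes).

def gf2_mod(a: int, g: int) -> int:
    """Remainder of a(x) divided by g(x) over GF(2); bit i is the coefficient of x**i."""
    dg = g.bit_length() - 1
    while a.bit_length() - 1 >= dg:
        a ^= g << (a.bit_length() - 1 - dg)
    return a

def cyclic_shift(e: int, j: int, n: int) -> int:
    """x**j * e(x) mod (x**n + 1), i.e. a cyclic shift of an n-bit error pattern."""
    j %= n
    return ((e << j) | (e >> (n - j))) & ((1 << n) - 1)

def syndrome_set(e: int, g: int, n: int) -> list:
    """Syndromes of error pattern e(x) at every starting position 0..n-1."""
    return [gf2_mod(cyclic_shift(e, j, n), g) for j in range(n)]

def period(seq: list) -> int:
    """Smallest cyclic period of the syndrome sequence."""
    n = len(seq)
    for p in range(1, n + 1):
        if all(seq[j] == seq[(j + p) % n] for j in range(n)):
            return p
    return n

G_HYP = 0b1001                     # hypothetical g(x) = 1 + x^3
N_HYP = 6                          # hypothetical codeword length; g(x) divides x^6 + 1
TARGETS_HYP = [0b1, 0b11, 0b111]   # error polynomials 1, 1+x, 1+x+x^2

for e in TARGETS_HYP:
    s = syndrome_set(e, G_HYP, N_HYP)
    print(bin(e), sorted(set(s)), "period", period(s))
```

In this toy case the three syndrome sets are disjoint, so the error type is identified unambiguously, but the periods are smaller than the codeword length, so a syndrome maps to more than one starting position. This is the kind of ambiguity that the text resolves with channel reliability information and data support.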
II-B Tensor Product Parity Codes
II-B1 Construction and Properties of the TPPC Parity Check Matrix
Consider a binary linear code derived from the null space of parity check matrix , and assume corrects any error event that belongs to class . Also, consider a nonbinary linear code derived from the null space of parity check matrix and defined over elements of . Moreover, assume this code corrects any symbol error type that belongs to class . As a preliminary step, convert the binary matrix , column by column, into a string of elements of dimension . Then, construct the matrix
as a array of elements. Finally, convert the elements of into bit columns, based on the same primitive polynomial of degree used throughout the construction. The null space of the binary corresponds to a linear binary code . As shown in Fig. 1, a codeword is composed of blocks termed “tensor symbols”, each having bits. Also, it can be shown that can correct any collection of tensor symbol errors belonging to class , provided that all errors within each tensor symbol belong to class [21]. Note that a tensor symbol is not an actual codeword, and as such, using the terms “inner” and “outer” codes would not be completely accurate. As can be seen in Fig. 1, the first tensor symbols are all data bits to start with, and even the last tensor symbols, which are composed of data and parity bits, have nonzero syndromes under . Thus, a TPPC codeword does not correspond directly to either or , and as a result, the component codebooks they describe are not recorded directly on the channel. Another interesting property of the resulting TPPC is that the symbol-mapping of the sequence of tensor-symbol signatures under forms a codeword of , which we refer to as the “signature-correcting component code”.
II-B2 Encoding of Tensor Product Parity Codes
The encoding of a TPPC can be performed using its binary parity check matrix, but the corresponding binary generator matrix is not guaranteed to possess algebraic properties that enable linear time encodability. Thus, an implementation-friendly approach would be to utilize the encoders of the constituent codes, which can be chosen to be linear time encodable.
Consider a binary code that is the null space of parity check matrix , and a nonbinary code defined on . The tensor-product concatenation is a binary , where:
Assume that is a cyclic code, and is any of the linear time encodable codes, where we choose a quasi-cyclic (QC) component code for the purpose of this study. Then, the encoders of and communicate via the following algorithm to generate a codeword of , see Fig. 2:

Receive a block of bits from the data source, call it major block .

Divide major block into minor block of bits, and minor block of bits (i.e. bits).

Divide block into columns each of bits. Then, for each column, calculate the intermediate bit signature under the parity check matrix of . Using a feedback shift register (FSR) to calculate the signatures, the computational cost is operations per signature, and for this entire step.

Convert intermediate signatures from bit strings into symbols.

Encode the nonbinary signatures into a codeword of length . Using FSRs to encode the quasi-cyclic , the computational complexity of this step is .

Convert computed signatures back into bit strings.

Divide block into columns each of bits. Add blanks in each column to be filled with the parity bits of . Then, align each column with the signatures computed in the previous step, leaving blanks in each column.

Fill blanks in the previous step such that the signature of data plus parity blanks under equals the corresponding aligned signature from step (vi). The parity can be calculated using the systematic and the method of back substitution which requires a computational complexity per column.
The total computational complexity of this encoding algorithm is , i.e. it is , which is the TPPC codeword length. Thus, we have shown with some constraints that if and are linear time encodable, then is linear time encodable.
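
The encoding steps above can be sketched on a deliberately tiny example. The sketch below is an assumption-laden illustration, not the encoder used in the paper: the cyclic EPCC is replaced by a single parity check on `N1 = 4` bits (one parity bit, so signatures are single bits), and the quasi-cyclic signature-correcting code is replaced by a systematic binary Hamming(7,4) code with the hypothetical parity check matrix `H2`. All names (`signature`, `outer_encode`, `tppc_encode`) are introduced here for illustration only.

```python
# Toy TPPC encoder following the signature-based steps (hypothetical component codes).
from functools import reduce
from operator import xor

N1, RHO = 4, 1   # stand-in for EPCC: single parity check, RHO parity bits per tensor symbol
K2, N2 = 4, 7    # stand-in for the signature-correcting code: Hamming(7,4) over GF(2**RHO)
H2 = [[1, 1, 0, 1, 1, 0, 0],
      [1, 0, 1, 1, 0, 1, 0],
      [0, 1, 1, 1, 0, 0, 1]]

def signature(bits):
    """Signature of a bit block under the single-parity-check matrix [1 1 1 1]."""
    return reduce(xor, bits, 0)

def outer_encode(sigs):
    """Systematic encoding of K2 signature symbols into N2 symbols."""
    parity = [sum(h * s for h, s in zip(row[:K2], sigs)) % 2 for row in H2]
    return list(sigs) + parity

def tppc_encode(data):
    """Map K2*N1 + (N2-K2)*(N1-RHO) data bits to N2 tensor symbols of N1 bits."""
    assert len(data) == K2 * N1 + (N2 - K2) * (N1 - RHO)
    block_a, block_b = data[:K2 * N1], data[K2 * N1:]
    # All-data tensor symbols and their intermediate signatures.
    cols = [block_a[i * N1:(i + 1) * N1] for i in range(K2)]
    sigs = outer_encode([signature(c) for c in cols])
    # Remaining tensor symbols: fill the parity blank so that the symbol's
    # signature equals the parity signature produced by the outer encoder.
    w = N1 - RHO
    for i in range(N2 - K2):
        d = block_b[i * w:(i + 1) * w]
        cols.append(d + [signature(d) ^ sigs[K2 + i]])
    return cols

data = [(5 * i + 1) % 3 % 2 for i in range(25)]
codeword = tppc_encode(data)
print([signature(c) for c in codeword])   # this sequence is a Hamming(7,4) codeword
```

Only 3 parity bits are recorded for 25 data bits, even though each tensor symbol on its own is protected by a rate-3/4 parity check; this is the rate advantage of the tensor product construction, shown here at toy scale.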
III TEPCCRS Codes
To demonstrate the algebraic properties of TPPC codes, we present an example code suitable for recording applications with KB sector size. Consider two component codes:

A binary cyclic EPCC of example 2 above with rate , parity bits, and parity check matrix in :
(1) 
A RS over , of rate , , and parity symbols.
The resulting TPPC is a binary code, of rate , and redundancy of parity bits. For this code, a codeword is made of bit tensor symbols, of which any combination of or fewer tensor symbol errors is correctable, provided that each bit tensor symbol has a single or multiple occurrence of a dominant error that is correctable by EPCC, those being combinations of error polynomials up to degree . Furthermore, although the EPCC constituent code has a very low rate of , the resulting TEPCC has a high rate of . Notably, in the view of the bit EPCC, this reduction in recorded redundancy corresponds to an SNR improvement of dB in a channel with rate penalty , and dB in a channel with rate penalty .
III-A Hard Decoding of TEPCCRS Codes
Hard decoding of TEPCCRS directly reflects the code’s algebraic properties, and thus, serves to further clarify the concept of tensor product codes. Hence, we discuss the hard decoding approach before going into the design of soft decoding of TEPCC codes. The decoding algorithm is summarized by the following procedure, see Fig. 3:

After hard slicing the output of the Viterbi channel detector, the signature of each tensor symbol is calculated under . Each signature is then mapped into a Galois field symbol, where the sequence of nonbinary signatures constitutes an RS codeword, provided that the channel detector did not suffer any errors.

Any hard-input RS decoder, such as the Berlekamp-Massey decoder, acts to find a legitimate RS codeword based on the observed signature sequence.

If the number of signature symbols in error is larger than the RS correction power, RS decoding fails and the tensor product decoder halts.

Otherwise, if RS decoding is deemed successful, the corrected signature-symbol sequence is added to the original observed signature-symbol sequence to generate the “error syndrome-symbol” sequence.

Each error syndrome-symbol is mapped into an EPCC bit-syndrome of the corresponding tensor symbol.

Finally, EPCC decodes each tensor symbol to satisfy the error-syndrome generated by the component RS, in which it faces two scenarios:

A zero “error-syndrome” at the output of RS decoding indicates either no error occurred or a multiple error occurrence that has a zero EPCC-syndrome, which goes undetected. In this case, the EPCC decoder is turned off to save power.

A nonzero “error-syndrome” will turn EPCC correction on. If the error-syndrome indicates a single error occurrence in the target set, then the EPCC single error algebraic decoder is turned on. On the other hand, if the error-syndrome is not recognized, then EPCC list decoding is turned on with a reasonable-size list of test words.
Note that although the number of EPCC codewords (tensor symbols) is large, the decoder complexity is reasonable since EPCC decoding is turned on only for nonzero error-syndromes.
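
The flow of this hard decoder can be sketched on a toy tensor product code. The sketch makes several simplifying assumptions that are not part of the paper: the EPCC is replaced by a binary Hamming(7,4) code whose parity check column j is the 3-bit number j+1 (so it corrects single-bit errors rather than dominant error patterns), and the RS code is replaced by a length-3 repetition code over GF(8), decoded by majority vote. Symbols of GF(8) are stored as 3-bit integers, so field addition is a bitwise XOR. The function names are hypothetical.

```python
# Toy hard decoder for a tensor product code (hypothetical component codes).
from collections import Counter

N1 = 7  # stand-in for EPCC: Hamming(7,4), column j of its parity check matrix is j + 1

def signature(sym):
    """3-bit signature of a 7-bit tensor symbol under the Hamming parity check matrix."""
    s = 0
    for j, b in enumerate(sym):
        if b:
            s ^= j + 1
    return s

def outer_decode(sigs):
    """Stand-in for the RS decoder: majority vote of a repetition code over GF(8)."""
    value, count = Counter(sigs).most_common(1)[0]
    if count <= len(sigs) // 2:
        return None                      # outer decoding failure
    return [value] * len(sigs)

def tppc_hard_decode(symbols):
    observed = [signature(t) for t in symbols]      # signatures of hard decisions
    corrected = outer_decode(observed)              # legitimate signature codeword
    if corrected is None:
        return None                                 # tensor product decoder halts
    decoded = [list(t) for t in symbols]
    for t, s_obs, s_cor in zip(decoded, observed, corrected):
        err_syn = s_obs ^ s_cor                     # error syndrome = corrected + observed
        if err_syn:                                 # inner decoder runs only when needed
            t[err_syn - 1] ^= 1                     # syndrome lookup: flip the located bit
    return decoded

# Three tensor symbols sharing the signature 1; none of them is a Hamming codeword.
codeword = [[1, 0, 0, 0, 0, 0, 0], [0, 1, 1, 0, 0, 0, 0], [0, 0, 0, 1, 1, 0, 0]]
received = [list(t) for t in codeword]
received[1][5] ^= 1                                 # single bit error in the second symbol
print(tppc_hard_decode(received) == codeword)
```

Note that the tensor symbols of the toy codeword have a nonzero signature, which matches the remark in Section II-B1 that tensor symbols are not codewords of the binary component code.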

IV TEPCCLDPC Codes
We learned from the design of TEPCCRS that the component signature-correcting codeword length can be substantially shorter than the competing single level code. Although the minimum distance is bound to be hurt if the increased redundancy does not compensate for the shorter codeword length, employing iterative soft decoding of the component signature-correcting code can recover performance if designed properly. While LDPC codes have strictly lower minimum distances compared to comparable rate and code length RS codes, the sparsity of their parity check matrices allows for effective belief propagation (BP) decoding. BP decoding of LDPC codes consistently performs better than the best known soft decoding algorithm for RS codes. Since the TPPC expansion enables the use of to times shorter component LDPC compared to a competing single level LDPC, a class of LDPC codes efficient at such short lengths is critical. LDPC codes on high order fields represent such good candidates. In that respect, [29] showed that the performance of binary LDPC codes in AWGN can be significantly enhanced by a move to fields of higher orders (extensions of being an example). Moreover, [29] established that for a monotonic improvement in waterfall performance with field order, the parity check matrix for very short blocks has to be very sparse. Specifically, column weight codes over exhibit worse bit-error-rate (BER) as increases, whereas column weight codes over exhibit monotonically lower BER as increases. These results were later confirmed in [30], where they also showed through a density evolution study of large codes that optimum degree sequences favor a regular graph of degree in all symbol nodes. On the other hand, for satisfactory error floor performance, we found that using a column weight higher than was necessary. This becomes more important as the minimum distance decreases for lower .
For instance, we found that a column weight of improved the error floor behavior of LDPC at the expense of performance degradation in the waterfall region.
IV-A Design and Construction of LDPC
The low rate and relatively low column weight design of LDPC in a TPPC results in a very sparse parity check matrix, allowing the usage of high girth component LDPC codes. To optimize the girth for a given rate, we employ the progressive edge growth (PEG) algorithm [30] in LDPC code design. PEG optimizes the placement of a new edge connecting a particular symbol node to a check node on the Tanner graph, such that the largest possible local girth is achieved. Furthermore, PEG construction is very flexible, allowing arbitrary code rates, Galois field sizes, and column weights. In addition, a modified PEG construction with linear-time encoding can be achieved without noticeable performance degradation, facilitating the design of linear time encodable tensor product codes. Of the two approaches to achieve linear time encodability, namely, the upper triangular parity check matrix construction [30] and PEG construction with a QC constraint [31], we choose the latter approach, for which the designed codes have better error floor behavior. TEPCCLDPC lends itself to iterative soft decoding quite naturally. Next, we present a low complexity soft decoder utilizing this important feature.
IV-B Soft Decoding of TEPCCLDPC
To fully utilize the power of the component codes in TEPCCLDPC, we need to develop a soft iterative version of the hard decoder of TEPCCRS. To limit the complexity of the proposed soft decoder, suboptimal detection postprocessing is adopted instead of the maximum a posteriori (MAP) detector to evaluate tensor symbol signature reliabilities. The complexity of the optimal MAP detector matched to both the channel of memory length and of row length is exponential in . We present a practical soft detection scheme that separates soft channel detection from tensor symbol signature detection, yet, through a component signature-correcting LDPC in a TPPC setup, approaches the joint MAP performance through channel iterations. The main stages of the decoder are, see Fig. 4:

Detection postprocessing:

Utilizing a priori information from the previous decoding iteration, binary Viterbi generates the hard ML word based on channel observations, for which the error sequence is calculated and passed to the correlator bank.

A bank of local correlators estimates the probability of dominant error type/location pairs for all positions inside each tensor symbol.


Signature p.m.f. calculation:

For each tensor symbol, the list of most likely error patterns is constructed. This list includes single occurrences and a predetermined set of their combinations. The list is then divided into sublists, each under the signature value it satisfies.

For each tensor symbol, using each signature value’s error likelihood list, we find the signature p.m.f. of that symbol.


q-ary LDPC decoding:

Using the observed sequence of signature p.m.f.’s, we decode the component q-ary LDPC via FFT-based SPA.

For each tensor symbol, the LDPC-corrected signature p.m.f. is convolved with the observed signature p.m.f. at its input to generate the error-syndrome p.m.f.


EPCC decoding:

For each tensor symbol, we find the list of most probable error-syndromes and generate a list of test error words to satisfy each syndrome in the list.

A bank of parallel EPCC single-error correcting decoders generates a list of most probable codewords along with their reliabilities.


Bit-LLR feedback:

Using the codeword reliabilities we generate bit-level reliabilities that are fed back to the Viterbi detector and the detection postprocessing stage. Those bit-level reliabilities, serving as a priori information, favor paths which satisfy both the ISI and parity constraints.

We explain each of these steps in the following sections, but we replace any occurrence in the text of syndrome (signature) p.m.f. by syndrome (signature) multilevel log-likelihood ratios (mlLLR), as decoding will be entirely in the log domain for reasons explained below.
IV-B1 Detection Postprocessing
At this decoder stage we prepare a reliability matrix for error type/position pairs  captured in a tensor symbol of length  that is usable by the next stage to calculate the tensor symbol’s signature mlLLR:
where is the error pattern (type / position ) reliability measure computed by the maximum a posteriori (MAP)-based error-pattern correlator shown in Fig. 5. The bank of local correlators discussed here was also employed in [18] for AWGN channels, and in [19] for data-dependent noise environments. We now discuss how to generate these local metrics. Let be the channel detector input sequence , where is the bipolar representation of the recorded codeword sequence, is the partial response channel of length , and is zero-mean AWGN with variance . Also, let be the channel detector’s output error sequence. If a target error pattern sequence occurs at positions from to , then can be written as
(2) 
where is the channel response of the error sequence, and is given by , and . Note that we define the start of the tensor symbol at . So, if , then the error pattern starting position is in a preceding tensor symbol.
The reliability for each error pattern with starting position, , can be computed by the local a posteriori probabilities (ignoring tensor symbol boundaries for now):
(3) 
The most likely assumed error type/position pair in a tensor symbol maximizes the a posteriori probability ratio of its reliability to the reliability of the most probable error event (the competing event in this case would be the ML word itself, with no error occurrence assumed at the output of Viterbi detection). Hence, utilizing (IVB1) and Bayes rule, the ratio to maximize becomes
(4) 
where is the ML word’s noiseless channel response. Given the noise model, is a sequence of independent Gaussian random variables with variance . Therefore, maximizing (4) can be shown to be equivalent to maximizing the log-likelihood local measure [18]:
(5) 
where the a priori bias in (5) is evaluated as:
(6) 
where is the a priori LLR of the error-event bit at position as received from the outer soft decoder, and we are assuming here that error event sequences do not include bits, i.e., the ML sequence and error sequence do not agree for the entire duration of the error event. Equation (5) represents the “local” error-pattern correlator output in the sense that it essentially describes the correlator operation between and the channel output version of the dominant error pattern within the local region . However, equation (5) ignores that errors can span tensor symbol boundaries when or . For instance, an error in the first bit of the tensor symbol can result from a single error event in that bit, a double error event in the last bit of the preceding tensor symbol, a triple error event occurring two bits into the previous symbol, and so on. Hence, the probability of an error in the first bit is the sum of all these parent error event probabilities. Moreover, this can be easily generalized to boundary errors extending beyond the first bit. In a similar manner, an error in the last bit of a tensor symbol can result from a single error event in that bit, a double error event starting in that bit and continuing into the next tensor symbol, a triple error starting at the last bit and continuing into the next tensor symbol, and so on. Again, the probability of an error event in that bit is the sum of the probabilities of all these parent events. Moreover, we have to nullify the probability of the parent error events in the modified reliability matrix since they are already accounted for in the last bit’s reliability calculation. Furthermore, this can also be generalized to error events starting earlier than the last bit and extending into the next tensor symbol. In summary, to calculate a modified metric relevant to the current tensor symbol, we utilize the following procedure:

at , modify
independently for each , where is the maximum length of a targeted error pattern.

Starting at and , do:

.

, set .

Set , .

If go back to (i).

We assume here that dominant error events span only two tensor symbols at a time and that they do not include error free gaps, which is certainly true for the case study of this paper. Following this procedure we obtain the modified reliability matrix .
IV-B2 Signature mlLLR Calculation
For each tensor symbol , utilizing , we need to find the p.m.f. or the log domain mlLLR of its signature , for EPCC with parity bits. To limit the computational complexity of this calculation, we construct a signature only from the dominant errors and a subset of their multiple occurrences. Denote as the running estimate of the p.m.f. at , and as the running estimate of mlLLR. Denote a one dimensional index of as corresponding to the th row and th column of and error . We choose the dominant list as the patterns with the largest corresponding elements of having indexes . Based on this list, we developed the following procedure to compute :

Step (Single occurrences):
where , and is an operator that maps bit vectors into symbols.

Step (Double occurrences):
where is the error free distance between the two errors, is the error free distance of the channel beyond which the errors are independent.

…

Step ( occurrences):

Step (ML-signature reliability; computed so that the resulting signature p.m.f. sums to ):
(10) (11) 
Step (Normalization):
(13)
In steps through , to calculate the log-likelihood of signature assuming value , we sum the probabilities of all presumed single and multiple errors in the ML word whose signatures equal . This is equivalent to performing the operation in the log domain on error reliabilities dictated by . However, to limit the complexity of this stage, we only use a truncated set of possible error combinations, in all steps from to . Also, for signature values that do not correspond to any of the combinations, we set their reliability to , or more precisely, a reasonably large negative value in a practical decoder implementation. Since there are many such signature values, the corresponding constructed p.m.f. will be sparse.
In step , the likelihood of the ML signature value is computed so that the p.m.f. of the tensor symbol signature sums to 1. In this step, the operation in (11) is a reflection of the fact that in previous steps, through , some multiple error occurrences have the same signature as the ML tensor symbol value. So, we have to account for such error instances in the running estimate of the ML signature reliability. These events correspond to cases where error events are not detectable by , i.e., they belong to the null space of . In step , the mlLLR of the tensor symbol is centered around zero to prevent the LDPC SPA messages from saturating after a few BP iterations.
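The accumulation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the tuple layout `(start, end, syndrome, log_prob)`, the names `signature_log_pmf`, `center`, and `FLOOR`, the use of plain log-probabilities as error reliabilities, the truncation at double occurrences, and the choice of mean subtraction for centering are all assumptions made for illustration. Signatures are treated as integers, with XOR standing in for addition over a binary extension field.

```python
import math

FLOOR = -1e3  # assumed practical stand-in for a reliability of minus infinity


def max_star(a, b):
    """Jacobian logarithm log(exp(a) + exp(b)), treating FLOOR as log(0)."""
    if a <= FLOOR:
        return b
    if b <= FLOOR:
        return a
    hi, lo = max(a, b), min(a, b)
    return hi + math.log1p(math.exp(lo - hi))


def signature_log_pmf(ml_sig, errors, p, d_free):
    """Sparse log-domain p.m.f. over the 2**p signature values of one tensor symbol.

    errors: list of (start, end, syndrome, log_prob) tuples (hypothetical layout).
    """
    L = [FLOOR] * (1 << p)
    # Single occurrences: each error shifts the ML signature by its syndrome.
    for _, _, syn, lp in errors:
        s = ml_sig ^ syn
        L[s] = max_star(L[s], lp)
    # Double occurrences: only pairs separated by at least d_free error-free bits.
    for i in range(len(errors)):
        for j in range(i + 1, len(errors)):
            a, b = sorted((errors[i], errors[j]))  # order the pair by start position
            if b[0] - a[1] - 1 >= d_free:
                s = ml_sig ^ a[2] ^ b[2]
                L[s] = max_star(L[s], a[3] + b[3])
    # ML signature: leftover mass combined with combinations that map back onto it.
    undetected = L[ml_sig]
    mass = sum(math.exp(v) for v in L if v > FLOOR)
    no_error = math.log(max(1.0 - mass, 1e-12))
    L[ml_sig] = max_star(no_error, undetected)
    return L


def center(L):
    """Subtract the mean of the non-floor entries (one possible centering rule)."""
    finite = [v for v in L if v > FLOOR]
    mean = sum(finite) / len(finite)
    return [v - mean if v > FLOOR else FLOOR for v in L]
```

With two well-separated errors, only four of the eight signature values of a 3-bit signature receive a finite reliability, which shows the sparsity noted in the text.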
IV-B3 $q$-ary LDPC Decoding
Now, the sequence of signature mlLLRs is passed as multilevel channel observations to the LDPC decoder. We choose to implement the log-domain $q$-ary fast Fourier transform-based SPA (FFT-SPA) decoder in [35] for this purpose. The choice of log-domain decoding is essential, since if we used the signature p.m.f. as input, the SPA would run into numerical instability resulting from the sparse p.m.f. generated by the preceding stage.
The LDPC output a posteriori mlLLRs correspond to the signatures of tensor symbols, rather than the syndromes of errors expected by EPCC decoding. Similar to the decoder of TEPCCRS, the error-syndrome is the finite-field sum of the LDPC’s input channel observation of signature , , and output a posteriori signature reliability, . Moreover, the addition of hard signatures corresponds to the convolution of their p.m.f.’s, and this convolution in the probability domain corresponds to the following operation in the log domain:
The error-syndrome mlLLR is later normalized, similar to LDPC BP mlLLR message normalization, according to:
(15) 
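A minimal sketch of this log-domain convolution is given below, assuming field elements are represented as integers so that finite-field addition over a binary extension field becomes bitwise XOR. The function name `error_syndrome_mlllr` is hypothetical, and the final normalization, which subtracts the largest entry, is an assumed convention rather than a reproduction of (15).

```python
import math

NEG_INF = float("-inf")


def max_star(a, b):
    """Jacobian logarithm log(exp(a) + exp(b)), safe for minus-infinity inputs."""
    hi, lo = max(a, b), min(a, b)
    if hi == NEG_INF:
        return NEG_INF
    return hi + math.log1p(math.exp(lo - hi))


def error_syndrome_mlllr(L_in, L_post):
    """Log-domain convolution of two signature p.m.f.'s under XOR addition.

    L_in:   log-domain channel observation of the signature (LDPC input).
    L_post: log-domain a posteriori signature reliability (LDPC output).
    """
    q = len(L_in)
    out = [NEG_INF] * q
    for a in range(q):
        if L_in[a] == NEG_INF:
            continue  # a sparse input contributes nothing for this value
        for b in range(q):
            out[a ^ b] = max_star(out[a ^ b], L_in[a] + L_post[b])
    peak = max(out)  # assumed normalization: largest entry becomes 0
    return [v - peak for v in out]
```

The double loop costs on the order of the field size squared per tensor symbol, but the sparsity of the input lets most outer iterations be skipped.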
IV-B4 EPCC Decoding
An error-syndrome will decode to many possible error events due to the low minimum distance of the single-error correcting EPCC. However, EPCC relies on local channel side information to implement a list-decoding-like procedure that enhances its multiple error correction capability. Moreover, the short codeword length of EPCC reduces the probability of such multiple error occurrences considerably. To minimize power consumption, EPCC is turned on for a tensor symbol only if the most likely value of the error-syndrome mlLLR is nonzero, i.e., , indicating that a resolvable error has occurred. After this, a few syndrome values, in our case, most likely according to the mlLLR, are decoded in parallel. For each of these syndromes, the list decoding algorithm proceeds as follows [18, 19]:

A test error word list is generated by inserting the most probable combination of local error patterns into the ML tensor symbol.

An array of parallel EPCC single-pattern correcting decoders decodes the test words to produce a list of valid codewords that satisfy the current error-syndrome.

The probability of a candidate codeword is computed as the sum of likelihoods of its parent test word and the error pattern separating the two.

Each candidate codeword probability is biased by the likelihood of the error-syndrome it is supposed to satisfy.
In addition, when generating test words, we only combine independent error patterns that are separated by the error free distance of the ISI channel.
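The gating rule and the candidate scoring in the list above can be sketched as follows. This is only an illustration under assumptions: the names `syndromes_to_decode` and `candidate_score` are hypothetical, the number of syndromes `k` is a free parameter, and all likelihoods are taken to be log-domain values so that combining them is a sum.

```python
def syndromes_to_decode(L_syn, k):
    """Return the k most likely error-syndrome values, or [] if EPCC stays off.

    L_syn is the error-syndrome mlLLR of one tensor symbol, indexed by syndrome
    value; index 0 is the zero syndrome.
    """
    ranked = sorted(range(len(L_syn)), key=lambda s: L_syn[s], reverse=True)
    if ranked[0] == 0:
        return []  # most likely syndrome is zero: no resolvable error, skip EPCC
    return ranked[:k]


def candidate_score(test_word_ll, pattern_ll, syndrome_ll):
    """Log-domain score of one candidate codeword.

    test_word_ll: likelihood of the parent test word.
    pattern_ll:   likelihood of the error pattern separating it from the candidate.
    syndrome_ll:  bias from the error-syndrome the candidate satisfies.
    """
    return test_word_ll + pattern_ll + syndrome_ll
```

For example, `syndromes_to_decode([-2.0, -0.1, -5.0, -1.0], 2)` selects syndromes 1 and 3, whereas an mlLLR that peaks at index 0 leaves EPCC off for that tensor symbol.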
IV-B5 Soft Bit-level Feedback LLR Calculation
The list of candidate codewords and their probabilities is used to generate bit-level probabilities in a similar manner to [27, 19]. The conversion of word-level reliability into bit-level reliability for a given bit position can be done by grouping the candidate codewords into two groups, according to the binary value of the hard decision bit in that bit position, and then performing group-wise summing of the word-level probabilities. Three scenarios are possible for this calculation:

The candidate codewords do not all agree on the bit decision for location ; then, given the list of codewords and their accompanying a posteriori probabilities, the reliability of the coded bit is evaluated as
(16) where is the set of candidate codewords where , and is the set of candidate codewords where

Although rare for such short codeword lengths, in the event that all codewords do agree on the decision for , a method inspired by [27] is adopted for generating soft information as follows
(17) where is the bipolar representation of the agreed-upon decision, is a preset value for the maximum reliability at convergence of turbo performance, and the multiplier is a scaling factor. in the first global iterations and is increased to as more global iterations are performed and the confidence in bit decisions improves. Thus, this back-off control process reduces the risk of error propagation.

The heuristic scaling in (17) is again useful when EPCC is turned off for a tensor symbol, i.e., when the most likely error-syndrome is zero. Then, the base hard value of the tensor symbol corresponds to the most likely error event found as a side product in stage of the TEPCCLDPC decoder.
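The first two scenarios can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `bit_llrs`, the sign convention (positive LLR favors bit 1, which maps to bipolar +1), and the parameter names `l_max` and `beta` for the preset maximum reliability and the scaling factor are assumptions.

```python
import math


def log_sum_exp(xs):
    """Numerically stable log of a sum of exponentials."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def bit_llrs(candidates, log_probs, l_max, beta):
    """Convert word-level log-probabilities into bit-level LLRs.

    candidates: list of candidate codewords, each a list of 0/1 bits.
    log_probs:  word-level log-probability of each candidate.
    """
    llrs = []
    for k in range(len(candidates[0])):
        ones = [lp for c, lp in zip(candidates, log_probs) if c[k] == 1]
        zeros = [lp for c, lp in zip(candidates, log_probs) if c[k] == 0]
        if ones and zeros:
            # Candidates disagree: group-wise sum of word-level probabilities.
            llrs.append(log_sum_exp(ones) - log_sum_exp(zeros))
        else:
            # Candidates agree: scaled maximum reliability with the agreed sign.
            d = 1.0 if ones else -1.0
            llrs.append(beta * l_max * d)
    return llrs
```

Raising `beta` toward 1 over successive global iterations mirrors the back-off control described for the agreed-upon case.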
IV-C Stopping Criterion for TEPCCLDPC and RS Erasure Decoding
Due to the ambiguity in mapping tensor symbols to signatures and syndromes to errors in stages and of the decoder, respectively, and the possibility of non-targeted error patterns, or errors that have zero error-syndromes and are therefore transparent to , a second line of defense is essential to take care of undetected errors. Therefore, an outer RS code of small correction power is concatenated to TEPCCLDPC to take care of the imperfections of the component EPCC. Several concurrent functions are offered by this code, including:

Stopping Flag: If the RS syndrome is zero, global iterations are halted and decisions are released.

Outer ECC: Attempt to correct residual errors at the output of EPCC after each global iteration.

Erasure Decoding: If the RS syndrome is nonzero, then, for those tensor symbols for which EPCC was turned on, declare their bits as erasures. Next, find the corresponding RS symbol erasures, and attempt RS erasure decoding, which is capable of correcting up to such erasures. In this case, TEPCC acts as an error locating code.
V Simulation Results and Discussion
We compare three coding systems based on LDPC: conventional binary LDPC, ary LDPC, and TEPCCLDPC, where all the component LDPC codes are regular and constructed by PEG with a QC constraint. We study their sector error rate (SER) performance on the ideal equalized partial response target corrupted by AWGN, and with coding rate penalty . The nominal systems run at a coding rate of . The minimum SNR required to achieve reliable recording at this rate is dB, estimated by following the same approach as in [28].
V-A Single-level BLDPC & LDPC Simulation Results
In Fig. 6, we compare SER of the following LDPC codes, each constructed by PEG with a QC constraint:

A LDPC, of column weight , and circulant size bits. The channel detector is a state binary BCJR.

A LDPC, of codeword length bits, column weight , and circulant size of symbols. The channel detector is a symbol-BCJR with branches emanating from each of states.

A LDPC, of codeword length bits, column weight , and circulant size of symbols. The channel detector is a symbol-BCJR with branches emanating from each of states.

A LDPC, of codeword length bits, column weight , and circulant size of symbols. The channel detector is a symbol-BCJR with branches emanating from each of states.
For the binary LDPC turbo equalizer, we run a maximum of iterations, global, and LDPC BP iterations. For the $q$-ary turbo equalizers, on the other hand, we run a maximum of iterations. A column weight of gives the best waterfall performance of $q$-ary LDPC. However, LDPC exhibits an error floor as early as at SER , whereas a higher-order field of does not show such a tendency down to . Nevertheless, the prohibitive complexity of symbol-BCJR makes LDPC a more attractive choice. Still, we need to sacrifice LDPC’s waterfall performance gains to guarantee a lower error floor. For that purpose, we move to a column weight LDPC that is dB away at from the independent uniformly distributed capacity of the channel [28], and dB away from LDPC at the same SER. In this simulation study, we have observed that while binary LDPC can gain up to dB through channel iterations before gain saturates, LDPC and LDPC achieve very little iterative gain by going back to the channel, between to dB through channel iterations. One way to explain this phenomenon is that symbol-level LDPC decoding divides the bit stream into LDPC symbols that capture the error events introduced by the channel detector, rendering the binary intersymbol interference limited channel into a memoryless multilevel AWGN limited channel. Nonetheless, error events spanning symbol boundaries reintroduce correlations between LDPC symbols that are broken only by going back to the channel. In other words, were it not for such boundary effects, a $q$-ary LDPC equalizer would not exhibit any iterative turbo gain whatsoever. Even so, full-blown symbol-BCJR is still too complex to justify salvaging the small iterative gain by performing extra channel iterations [33]. This is where error event matched decoding comes into the picture, which leads us to the results of the next section.
V-B TEPCCLDPC Simulation Results
We first construct two TEPCCLDPC codes of rate , the same rate as the competing single-level LDPC. These TPPCs are based on the EPCC of Example 1. The codes constructed are:

TPPCA: A KB sector, binary TPPC, of rate , and parity bits, based on a component PEG-optimized QC LDPC, of rate , column weight , and circulant size .

TPPCB: A KB sector, binary TPPC, of rate , and parity bits, based on a component PEG-optimized QC LDPC, of rate , column weight , and circulant size .
First, we study the SER of TEPCCLDPC just up to the component LDPC decoder, and only at the first channel pass. This SER is a function of the Viterbi symbol error rate and the accuracy of generating signature mlLLRs, in addition to the component LDPC employed. This SER represents the best that the TPPC code can do, under the assumption of a perfect component EPCC, i.e., as long as LDPC generates a clean codeword of signature symbols, EPCC generates a clean codeword of data symbols. Fig. 7 shows the ideal SER of these two TPPC codes, assuming perfect EPCC, compared to single-level LDPC and LDPC. Ideal KB TPPC has about the same SER as single-level LDPC at SER. In KB TPPC, the component LDPC has half the codeword length of the single-level counterpart, saving of the decoder complexity, while delivering similar SER performance. The TPPC component LDPC faces a harsher channel than single-level LDPC, because the symbol error probability of bit data symbols is strictly less than the symbol error probability of bit signature symbols, where signature symbols are compressed down from bit data symbols. Also, the shorter codeword length of the component LDPC hurts its minimum distance. Still, these impairments are effectively compensated for by an increase in the redundancy of the TPPC component LDPC. On the other hand, if we match the codeword length of TPPC’s component LDPC to single-level LDPC, as part of constructing KB TPPC, then KB TPPC will have similar decoder complexity to KB single-level LDPC with about dB SNR advantage for KB ideal TPPC at SER.
Due to the imperfections of EPCC design, including miscorrection due to one-to-many syndrome to error position mapping, and undetected errors due to EPCC’s small minimum distance, achieving the ideal performance in Fig. 7 is not possible in one channel pass. In addition, an outer code is necessary to protect against undetected errors and provide a stopping flag for the iterative decoder. Hence, one can think of an implementation of the full TEPCCLDPC decoder that includes an outer RS for the KB case, and an outer RS for the KB case, so as to protect against EPCC residual errors. These outer RS codes are defined on and have rate . However, this concatenation setup will run at a lower code rate of , which can incur an SNR degradation larger than dB for a noise environment characterized by the rate penalty . In a more thoughtful approach, one can preserve the nominal code rate of and redistribute the redundancy between the inner TPPC and outer RS to achieve an improved tradeoff between miscorrection probability and the inner TPPC’s component LDPC code strength. In that spirit, we construct the following concatenated codes:

TPPCC: A KB sector, binary TPPC, of rate , and parity bits, based on a component PEG-optimized QC LDPC, of rate , column weight , and circulant size . An outer RS code of rate is included, resulting in a total system rate of .

TPPCD: A KB sector, binary TPPC, of rate , and parity bits, based on a component PEG-optimized QC LDPC, of rate , column weight , and circulant size . An outer RS code of rate is included, resulting in a total system rate of .
The control mechanism of iterative decoding for these codes is as follows: if EPCC results in fewer than RS symbol errors for the KB design or fewer than for the KB design, or if EPCC generates more errors than this but declares fewer than erasures for KB or erasures for KB, then decoding halts and decisions are released. Otherwise, one more channel iteration is done by passing EPCC soft bit-level LLRs to Viterbi detection and the bank of error-matched correlators.
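The halting rule can be written as a small predicate. The sketch below treats the two counts as given inputs and uses hypothetical threshold parameters `t_rs` and `e_rs` in place of the design-specific values; the function name `halt_decoding` is likewise an assumption.

```python
def halt_decoding(num_rs_symbol_errors, num_erasures, t_rs, e_rs):
    """Decide whether global iterations stop after the current EPCC pass.

    t_rs: RS symbol-error threshold of the outer code (hypothetical parameter).
    e_rs: RS erasure threshold of the outer code (hypothetical parameter).
    """
    if num_rs_symbol_errors < t_rs:
        return True  # few enough residual errors for the outer RS code
    # Too many errors for error-only decoding: fall back on the erasure count.
    return num_erasures < e_rs
```

When the predicate returns `False`, the soft bit-level LLRs are fed back for one more channel iteration, as described above.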
Simulation results in Fig. 8, for a noise environment of rate penalty , demonstrate that after channel iterations, the ideal and practical performances of the new TPPC codes almost lock, while incurring minimal SNR degradation. Also, KB TPPC saves of decoder complexity while achieving the same SER performance as single-level LDPC for an additional SNR cost of dB at SER . Hence, TPPCC represents a tradeoff between the lower complexity of LDPC and the performance advantage of LDPC, whereas KB TPPC has the same decoding complexity as single-level LDPC while furnishing dB gain at SER. In terms of channel detector implementation complexity, the complexity and latency of BCJR in the single-level code far exceed the overall complexity of the non-LDPC parts of the two-level TEPCCLDPC, including Viterbi detection. At the same time, signature mlLLR generation, EPCC decoding, and bit-LLR generation are all implemented tensor symbol by tensor symbol, achieving full parallelism on the tensor-symbol level. Furthermore, it is only when LDPC finds a syndrome error that EPCC decoding is turned on for each tensor symbol. To eliminate redundant computations in the iterative decoder, branch metric computation in Viterbi and (5) is only required at the first pass. For all subsequent iterations, however, only the a priori bias is updated in the second term of (5), and the branch update of Viterbi [34].
One very important feature of the TPPC setup, which single-level LDPC lacks, is its robustness to boundary error events. The presence of a syndrome constraint means that errors spanning boundaries are broken by EPCC when it attempts to independently satisfy the adjacent tensor symbol syndromes; then, in the next turbo iteration, adjacent tensor symbols are decorrelated. This mechanism enables TPPC to recover from these errors by iterative decoding. However, for errors with a zero error-syndrome, which go undetected by EPCC, outer RS protection comes in handy.
Based on the fact that TPPC enables an increase in the redundancy of its component LDPC, in addition to simulation results demonstrating the utility of such a lowered rate in combating the harsher compressed channel, we conjecture that as the sector length of both TPPC and single-level LDPC is driven to infinity, TPPC will achieve strict error rate SNR gains. This is mainly because of its surplus of redundancy compared to the single-level code at the same rate penalty, whereas channel conditions and EPCC correction power do not change with replication of tensor symbols, and the error rate performance of LDPC asymptotically approaches the noise threshold in the limit of infinite codeword length. Therefore, within a channel-capacity-achieving argument, in the limit of infinite codeword length, we take the view that TPPC will bridge the gap to capacity further than any single-level system could. Moreover, the advantage of TPPC for larger sector sizes is more timely than ever as the industry moves