Edge Coloring and Stopping Sets Analysis in Product Codes with MDS components

Fanny Jardel and Joseph J. Boutros
This manuscript was submitted to the IEEE Transactions on Information Theory, paper IT-15-1104, Dec. 2015. Fanny Jardel is with Telecom ParisTech, 75013 Paris, France (email: fannjard@gmail.com). She was with CEA, LIST, Communicating Systems Laboratory BC 94, Gif Sur Yvette, F91191, France. Joseph J. Boutros is with the Dept. of Electrical and Computer Engineering, Texas A&M University at Qatar, Education City, 23874 Doha, Qatar (email: boutros@tamu.edu).
Abstract

We consider non-binary product codes with MDS components and their iterative row-column algebraic decoding on the erasure channel. Both independent and block erasures are considered in this paper. A compact graph representation is introduced on which we define double-diversity edge colorings via the rootcheck concept. An upper bound of the number of decoding iterations is given as a function of the graph size and the color palette size . Stopping sets are defined in the context of MDS components and a relationship is established with the graph representation. A full characterization of these stopping sets is given up to a size , where and are the minimum Hamming distances of the column and row MDS components respectively. Then, we propose a differential evolution edge coloring algorithm that produces colorings with a large population of minimal rootcheck order symbols. The complexity of this algorithm per iteration is , for a given differential evolution parameter , where itself is small with respect to the huge cardinality of the coloring ensemble. The performance of MDS-based product codes with and without double-diversity coloring is analyzed in presence of both block and independent erasures. In the latter case, ML and iterative decoding are proven to coincide at small channel erasure probability. Furthermore, numerical results show excellent performance in presence of unequal erasure probability due to double-diversity colorings.

Keywords

Product codes, MDS codes, iterative decoding, codes on graphs, differential evolution, distributive storage, edge coloring, diversity, erasure channel, stopping sets.

I Introduction

The colossal amount of data stored or conveyed by network nodes requires a special design of coding structures to protect information against loss or errors and to facilitate its access. At the end-user level, coding is essential for transmitting information towards the network whether it is located in a single node or distributed over many nodes. At the network level, coding should help nodes to reliably save a large amount of data and to efficiently communicate with each other. Powerful capacity-achieving error-correcting codes developed in the last two decades are mainly efficient at large or asymptotic block length, e.g. low-density parity-check codes (LDPC) [23] and their spatially-coupled ensembles [35], parallel-concatenated convolutional (Turbo) codes [6][5], and polar codes derived from channel polarization [4]. Data transmission and storage in many of today's networks may require short-length packets that are not suitable for capacity-achieving codes. The current interest in finite-length channel coding rates [44] has put the spotlight back on code design for short and moderate block length. Many potential candidates are available for this non-asymptotic length context such as binary and non-binary BCH codes, including Reed-Solomon (RS) codes, Reed-Muller (RM) codes, and tensor product codes of all these linear block codes [40][7][39].

Product codes, introduced by Peter Elias in 1954 [19], are tensor products of two (or more) simple codes with a structure that is well-suited to iterative decoding via its graphical description. In the early decades after their invention, product codes received great attention due to their capability of correcting multiple burst errors [70][64], the availability of erasure-error bounded-distance decoding algorithms [66], the ability of correcting many errors beyond the guaranteed correction capacity [1], and their efficient implementation with a variable rate [68]. The pioneering work by Tanner [60] brought new tools to coding theory and put codes on graphs, including product codes, and their iterative decoding at the heart of modern coding theory [33][32][49]. The graph approach of coding led to new optimal cycle codes on Ramanujan/Cayley graphs [61] and to generalizations of LDPC and product codes, known as GLD codes, studied for the binary symmetric channel (BSC) and the Gaussian channel [9]. The excellent performance of iterative (turbo) decoding of product codes on the Gaussian channel [45] made them compete with Turbo codes and LDPC codes for short and moderate block length. The convergence rate and stability of iterative decoding of product codes were studied based on a geometric framework [55]. Product codes with mixed convolutional and block components were also found efficient in the presence of impulsive noise [22]. In addition, iterated Reed-Muller product codes were shown to exhibit good decoding thresholds for the binary erasure channel, but at high and low coding rates only [65].

The class of product codes in which the row and the column code are both Reed-Solomon codes has been extensively used for more than two decades in DVD storage media and in mobile cellular networks [69]. In these systems, the channel is modeled as a symbol-error channel without soft information, i.e. suited to algebraic decoding. Improvements were suggested for these RS-based product codes such as soft information provided by list decoding [52] within the iterative process in a Reddy-Robinson framework [48]. Also, RS-based product codes were directly decoded via a Guruswami-Sudan list decoder [28] after being generalized to bivariate polynomials [3]. For general tensor products of codes and interleaved codes, a recent efficient list decoding algorithm was published [24], with an improved list size in the binary case. On channels with soft information, RS-based product codes may be row-column decoded with soft-decision constituent decoders [20][30].

Tolhuizen found the Hamming weight distribution of both binary and non-binary product codes up to a weight less than  [62]. Enumeration of erasure patterns up to a weight less than was realized by Sendrier for product codes with MDS components [56]. Rosnes studied stopping sets of binary product codes under iterative ML-component-wise decoding [51], where the defined stopping sets and their analysis are based on the generalized Hamming distance [67][29].

I-A Paper content and structure

In this paper, we consider non-binary product codes with MDS components and their iterative algebraic decoding on the erasure channel. Both independent and block erasures are considered in our paper. The erasure channel is currently a major area of research in coding theory [36][37] because of its strong connections with theoretical computer science [37] and because its model makes it easy to understand the behavior of codes, such as LDPC codes [17], general linear block codes [54], and turbo codes [50]. Coding for block erasures was examined by Lapidoth in the context of convolutional codes [38]. This was a basis to later construct codes for the block-fading channel with additive white Gaussian noise [27][13]. The notion of rootcheck introduced in [13][12] for single-parity checknodes was applied to more general checknodes in GLD codes [11] and product codes [10] to achieve diversity on non-ergodic block-fading channels. The rootcheck concept is the main tool in this paper, in a way similar to [10], to define a compact graph representation and study iterative decoding in the presence of block erasures. Edge coloring is one of the most interesting problems in modern graph theory [8]. In this paper, edge coloring is a tool that, when combined with the rootcheck concept, yields double-diversity product codes. Our work is valid for finite-length MDS-based product codes only. Product codes for asymptotic block length were studied for single-parity code constituents [46] and for the erasure channel with a standard regular structure [53] and MDS-based irregular structures [2].

Whether a product code is endowed with an edge coloring or not, the analysis of stopping sets, their characterization and their enumeration is a fundamental task for designing codes for erasure channels and determining the decoder performance. In this sense, our work improves on the previous works cited above by Tolhuizen, Sendrier, and Rosnes. Besides this objective of stopping sets characterization, which is useful for independent channel erasures and erasures occurring in blocks of symbols, recent works on locality [25] stimulated us to search for edge colorings with a large population of edges that admit a minimal rootcheck order. Locality is a concept encountered in distributive storage [34][47] where classic coding theory is adapted to the nature of a network with distributed nodes with its own constraints of load in bandwidth and storage [18][42]. Furthermore, product codes with MDS components appear to be suited to distributive storage [21] owing to their simple and mature techniques of erasure resilience. In our search for good edge colorings, we provide a new algorithm based on the concept of differential evolution [59][43]. We use no crossover in our evolution loop; only a mutation of the population of bad edges is made to search for a better edge coloring. Our MDS-based product codes equipped with a double-diversity edge coloring are suited to distributed storage applications and to wireless networks where diversity is a key parameter.

The paper is structured as follows. Section II gives a list of mathematical notations. The graph representation of product codes is given in Section III, including compact and non-compact graphs. The rootcheck concept and its consequences are also found in Section III. The analysis of stopping sets is made in Section IV. Our edge coloring algorithm for bipartite graphs of product codes is described in Section V. Finally, in Section VI, we study the performance of product codes with MDS components on erasure channels and we give theoretical and numerical results before the conclusions in the last section.

I-B Main results

The main results in this paper are:

  • Establishing a new compact graph for product codes. The compact graph has many advantages, the main one being its ability to imitate a Tanner graph with parity-check nodes. The compact graph is also the basis for the differential evolution edge coloring. See Section III-B.

  • Iterative decoding analysis of finite-length product codes, mainly the proof of new bounds on the number of decoding iterations. See Theorem 1 and Corollary 1.

  • Proving new properties of stopping sets for product codes with MDS components. See Propositions 1&2, Corollaries 2-4, and Lemmas 1&2.

  • Complete enumeration and characterization of stopping sets up to a size , where are the minimum Hamming distances of the component codes. This stopping set enumeration goes beyond the weight of Tolhuizen’s Theorem 3 for codeword enumeration in the MDS components case. See Lemmas 3&4 and Theorems 2&3.

  • A new edge coloring algorithm (DECA) capable of producing double-diversity colorings despite the huge size of the coloring ensembles. See Section V-B.

  • Construction via the DECA algorithm of product codes maximizing the number of edges with root order , i.e. minimizing the locality when the process of repairing nodes is considered. See Section V-C.

  • First numerical results for MDS-based product codes on erasure channels showing how close iterative decoding is to ML decoding, mainly at small channel erasure probability. We prove that iterative decoding performs as well as ML decoding (the ratio of error probabilities tends to 1) for MDS-based product codes at small channel erasure probability. See Proposition 3, Corollary 5, and other performance results in Section VI-B.

  • Great advantage of double-diversity colorings of product codes (with respect to codes without coloring) in presence of unequal probability erasures. Thus, double-diversity colorings are efficient on both ergodic and non-ergodic erasure channels. See Section VI-C.

II Mathematical notation and Terminology

We start with the notation related to the product code and its row and column components. The impatient reader may skip this entire section and then refer to it later to clarify any notation within the text. Basic notions on product codes and fundamental properties are found in main textbooks [40][7][39] and the encyclopedia of telecommunications [32].
The column code $C_1$ is a linear block code over the finite field $\mathbb{F}_q$ with parameters $[n_1, k_1, d_1]_q$, which may be summarized by $[n_1, k_1]_q$ when no confusion is possible. The integer $q$ is the code alphabet size, $n_1$ is the code length, $k_1$ is the code dimension as a vector subspace of $\mathbb{F}_q^{n_1}$, and $d_1$ is the minimum Hamming distance of $C_1$. Similarly, the row code $C_2$ is a linear block code with parameters $[n_2, k_2, d_2]_q$. Let $G_1$ and $G_2$ be two matrices of size $k_1 \times n_1$ and $k_2 \times n_2$ containing in their rows a basis for the subspaces $C_1$ and $C_2$ respectively. From the two generator matrices $G_1$ and $G_2$, a product code $C_P$ is constructed as a subspace of $\mathbb{F}_q^{N}$ with a generator matrix $G_P = G_1 \otimes G_2$, where $N = n_1 n_2$ and $\otimes$ denotes the Kronecker product [40]. $C_P$ has dimension $K = k_1 k_2$ and minimum Hamming distance $d_P = d_1 d_2$. $C_1$ and $C_2$ are also called component codes, a terminology borrowed from concatenated codes. In [60] and [10], vertices associated to component codes are called subcode nodes.

A linear $[n, k, d]_q$ code is said to be MDS, i.e. Maximum Distance Separable, if it satisfies $d = n - k + 1$. Binary MDS codes are the trivial repetition codes and the single parity-check codes. In this paper, we only consider non-trivial non-binary MDS codes where . A linear code over $\mathbb{F}_q$ of rate $R$ is said to be MDS diversity-wise or MDS in the block-fading/block-erasure sense if it achieves a diversity order $L$ such that $L = 1 + \lfloor M(1 - R) \rfloor$, where $M$ is the number of degrees of freedom in the channel. The right-hand term is known as the block-fading Singleton bound [41][31]. In this paper, $M$ shall denote the number of colors, i.e. the palette size of an edge coloring. Assume that code symbols are partitioned into $M$ sub-blocks; a code is said to attain diversity $L$ if it is capable of correct decoding when $L - 1$ sub-blocks are erased by the channel. The reader should refer to [63], chapter 3, for an exact definition of diversity on fading channels with additive white Gaussian noise.

A product code shall be represented by a non-compact graph . is a complete bipartite graph where is the set of right vertices, is the set of left vertices, and is the set of edges representing the code symbols. A compact graph will also be introduced in the next section with . The number of edges (also called super-edges) in the compact graph is . A super-edge is equivalent to a super-symbol that represents symbols from . The ensemble of edge colorings is denoted and for and respectively. An edge coloring will be denoted by . Given , the rootcheck order of an edge is . The greatest among all edges will be referred to as . The number of edges satisfying is , this is the number of good edges and will be processed by the DECA algorithm in Section V. The DECA parameter shall represent the number of edges to be mutated, i.e. those edges being chosen in the population of bad edges satisfying .

Under iterative row-column decoding, the rootcheck order is equal to the number of decoding iterations required to solve the edge value (or the symbol associated to that edge). In this paper, one decoding iteration is equivalent to decoding all rows or decoding all columns. A sequence of row decoders followed by a sequence of column decoders is counted as two decoding iterations.

We now give a general definition of a stopping set. A detailed study is found in Section IV. The notion of a stopping set is useful for iterative decoding in the presence of erasures [17].

Definition 1

Let be a linear code. Assume that the symbols of a codeword are transmitted on an erasure channel. The decoder is using some deterministic decoding method. Consider a set of fixed positions where . The set is said to be a Stopping Set if fails in retrieving the transmitted codeword when all symbols on the positions given by are erased.

This paper focuses on stopping sets of a product code under iterative algebraic row-column decoding, i.e. referred to as type II stopping sets. The number of stopping sets of size is . The rectangular support of a stopping set can be seen as the smallest rectangle containing . After excluding rows and columns not involved in , the rectangular support has size where . The word error performance of shall be estimated on erasure channels, is the word error probability under Maximum Likelihood decoding and is the word error probability under iterative row-column decoding. Three erasure channels are considered: 1- The Symbol Erasure Channel, , where code symbols are independently erased with a probability , 2- The Color Erasure Channel, , where all symbols associated to the same color are block-erased with a probability . On the , block-erasure events are independent from one color to another. 3- The unequal probability Symbol Erasure Channel, , where symbol erasures are independent but their erasure probability varies from one color to another.
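The iterative row-column decoding referred to above can be sketched in a few lines. The sketch below is an illustration only, under the assumption that each MDS component decoder fills a row (or column) whenever it holds at most $d - 1$ erasures, and that one iteration decodes either all rows or all columns. The function name `iterative_erasure_decode` and its parameters are hypothetical and not taken from the paper.

```python
def iterative_erasure_decode(erased, d_row, d_col):
    """Peel an erasure pattern with bounded-distance MDS component decoders.

    erased : iterable of (row, col) positions erased by the channel.
    d_row  : minimum distance of the row code (fills <= d_row - 1 erasures).
    d_col  : minimum distance of the column code (fills <= d_col - 1 erasures).
    Returns (residual, iterations). The residual is empty on success;
    otherwise it is a set on which row-column decoding is stuck.
    The iteration count includes the final iterations that filled nothing.
    """
    remaining = set(erased)
    iterations = 0
    decode_rows = True   # assumption: the first iteration decodes the rows
    stalled = 0          # consecutive iterations that filled nothing
    while remaining and stalled < 2:
        axis = 0 if decode_rows else 1
        limit = (d_row if decode_rows else d_col) - 1
        # Count the erasures held by each row (or each column).
        counts = {}
        for pos in remaining:
            counts[pos[axis]] = counts.get(pos[axis], 0) + 1
        # A component with at most d - 1 erasures is completely filled.
        solved = {pos for pos in remaining if counts[pos[axis]] <= limit}
        remaining -= solved
        iterations += 1
        stalled = 0 if solved else stalled + 1
        decode_rows = not decode_rows
    return remaining, iterations
```

For instance, a $2 \times 2$ square of erasures is left untouched when both components have minimum distance 2, whereas it is filled in a single row iteration when both have minimum distance 3.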

III Graph representations for diversity

Efficient graph representation of codes was established by Tanner for different types of coding structures [60]. Bounds on the code parameters and iterative decoding algorithms were also proposed for codes on graphs [60]. In this paper, we study the edge coloring of a product code graph, where edges represent code symbols. As shown below, the original graph for a product code is too complex, i.e. it leads to a large ensemble of colorings. Hence, we introduce a compact graph where symbols are grouped together with the same color in order to reduce the size of the coloring ensemble. The compact graph also has another asset: grouping parity symbols together renders check nodes similar to parity-check nodes found in standard low-density parity-check codes [23] [49].

III-A Non-compact graph

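Consider a product code $C_P = C_1 \otimes C_2$ where $C_1$ is the column code and $C_2$ is the row code. The product code is defined over the finite field $\mathbb{F}_q$ and has length $N$ and dimension $K$ given by [40]

$$N = n_1 n_2, \qquad K = k_1 k_2. \qquad (1)$$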

Each code symbol simultaneously belongs to one row and to one column. Product codes studied in this paper are regular, in the sense that all columns are codewords of and all rows are codewords of . The graph of is built as follows. We use the same terminology as in [49]:

  • $n_1$ check nodes are drawn on the left. A left check node represents the coding constraint which states that a row belongs to $C_2$. The left check nodes are referred to as $C_2$ check nodes, or row check nodes, or equivalently left vertices.

  • $n_2$ check nodes are drawn on the right. A right check node represents the coding constraint which states that a column belongs to $C_1$. The right check nodes are referred to as $C_1$ check nodes, or column check nodes, or equivalently right vertices.

  • An edge is drawn between a left vertex and a right vertex. It represents a code symbol located on the row of the left vertex and on the column of the right vertex. The code symbol belongs to $\mathbb{F}_q$.

Figure 1: Non-compact bipartite graph of a product code , i.e. , , , and edges representing 16 symbols in .

In summary, the product code graph is a complete biregular bipartite graph built from left vertices, right vertices, and edges representing code symbols. The left degree is and the right degree is . Irregular product codes can be found in [2]. Our paper is restricted to regular product codes. Figure 1 shows the bipartite graph of a square regular symmetric product code . The graph structure reveals , , and . The dimensions and of the component codes have no effect on the number of vertices and edges in the product code graph. Indeed, a code can also be defined by the graph in Figure 1. The role of the dimensions and is played within the check constraints inside left and right vertices. Similarly, the size of the finite field defining the code cannot be revealed from the graph structure, i.e. the product code graph does not depend on .

Definition 2

The non-compact graph for a product code is a complete bipartite graph with left vertices and right vertices.

III-B Compact graph

In [10] where the diversity of binary product codes was considered, vertices of the non-compact graph were grouped together into super-vertices (or supernodes) because the different channel states lead to multiple classes of check nodes as in root-LDPC codes [13]. To render a graph-encodable code, supernodes in [10] were made by putting nodes together for a component code. Also, is not necessarily a divisor of .

Definition 3

The compact graph for a product code is a complete bipartite graph with left vertices and right vertices.


From the above definition, the number of edges in the compact graph is found to be

(2)

Assuming that divides and divides , a left check node in is equivalent to row constraints and a right check node in is equivalent to column constraints. An edge in the compact graph carries code symbols. To avoid confusion between edges of and , we may refer to those in as super-edges or equivalently as super-symbols. If is not multiple of , then the last row or column supernode will contain less than check nodes. Figure 2 depicts the compact graph of the product code. All product codes have a compact graph identical to that of , for all , even.
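The grouping of check nodes into supernodes can be illustrated as follows. This is a minimal sketch in which the group sizes `row_group` and `col_group` are hypothetical parameters; for MDS components, a natural choice (an assumption here, not a statement from the text) is the number of parity symbols of the column and row component respectively. When the group size does not divide the number of rows or columns, the last supernode is simply smaller, as described above.

```python
def compact_graph_shape(n_rows, n_cols, row_group, col_group):
    """Return (left supernodes, right supernodes, super-edges).

    Rows are grouped row_group at a time into left supernodes and
    columns col_group at a time into right supernodes. The last
    supernode on either side may hold fewer check nodes.
    """
    left = -(-n_rows // row_group)    # integer ceiling division
    right = -(-n_cols // col_group)
    return left, right, left * right


def super_edge_of(row, col, row_group, col_group):
    """Map a code symbol at (row, col) to the super-edge that carries it."""
    return row // row_group, col // col_group
```

With 12 rows and 12 columns grouped two at a time, the sketch yields 6 supernodes on each side and 36 super-edges, each carrying 4 symbols.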

Figure 2: Compact bipartite graph with two supernodes on each side for the product code , and supersymbols. Each super-symbol (i.e. super-edge) contains symbols (i.e. edges).

III-C Diversity and codes on graphs

From a coding point of view, diversity is the art of creating many replicas of the same information. From a channel point of view, diversity is the number of degrees of freedom available while transmitting information. In distributive storage, independent failure of individual machines is modeled by independent erasures of code symbols, while the outage of a cluster of machines is modeled as block erasures of code symbols. Assuming a storage domain with a large set of machines partitioned into clusters, diversity of distributed coding is defined as follows:

Definition 4

Consider a product code defined over . Assume that symbols are given different colors. Erasing one color is equivalent to erasing all symbols having this color. The code is said to achieve a diversity if it is capable of filling all erasures after erasing colors. The code is full-diversity when .


The integer may also be called the diversity order. For Gaussian channels with fading, the diversity order appears as the slope of the error probability, i.e. [13]. In the above definition, a cluster has been replaced by a color. We will use this terminology throughout the paper. Notice that coloring symbols is equivalent to edge coloring of the product code graph. The number of edges is in the non-compact graph and in the compact graph. In the sequel, all colorings are supposed to be perfectly balanced, i.e. divides both and and the number of edges having the same color is and for the non-compact graph and the compact graph respectively. More formally, our edge coloring is defined as follows: an edge coloring of is a mapping associating one color to every edge in ,

(3)

such that for , where is the inverse image of . Similarly, for and . The set of such mappings for and is denoted and respectively.
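A perfectly balanced edge coloring, as defined above, is easy to generate and to check. The sketch below stores a coloring as a dictionary from edges (left vertex, right vertex) to colors; this representation and the function names are illustrative choices, not notation from the paper.

```python
import random
from collections import Counter


def is_balanced(coloring, num_colors):
    """True if every color in range(num_colors) is used equally often."""
    if len(coloring) % num_colors != 0:
        return False
    share = len(coloring) // num_colors
    counts = Counter(coloring.values())
    # Counter returns 0 for a missing color, so unused colors are caught.
    return all(counts[c] == share for c in range(num_colors))


def random_balanced_coloring(n_left, n_right, num_colors, seed=None):
    """Draw a balanced coloring of the complete bipartite graph at random."""
    edges = [(i, j) for i in range(n_left) for j in range(n_right)]
    if len(edges) % num_colors != 0:
        raise ValueError("the palette size must divide the number of edges")
    share = len(edges) // num_colors
    # One slot per edge, each color repeated the same number of times.
    palette = [c for c in range(num_colors) for _ in range(share)]
    random.Random(seed).shuffle(palette)
    return dict(zip(edges, palette))
```

A uniformly shuffled palette gives a balanced coloring, but nothing guarantees that it achieves double diversity; that property has to be checked separately.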
Consider a coloring in . It can be embedded into by copying the color of a super-edge to its associated edges in . Thus, let be the subset of colorings in obtained by embedding all colorings of into . We have

(4)

The size of the edge coloring ensembles and is obviously not the same when , which occurs for both row and column component codes not equal to single parity-check codes. Indeed, when a palette of size is used to color edges, the total number of colorings of is

(5)

This number for the compact graph is

(6)

As an example, for the code and , there are edge colorings for the non-compact graph and edge colorings for the compact graph. It is clear that the construction of product codes for diversity is much easier when based on because its edge coloring ensemble is smaller. Furthermore, as described below, vertices in act in a way similar to standard LDPC check nodes making the design very simple. Furthermore, we will see in Section IV that edge colorings of the compact graph render larger stopping sets than colorings of the non-compact graph.
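To get a feeling for the size of these ensembles, the number of perfectly balanced colorings can be counted directly from the definition of the mapping: with a palette of distinguishable colors, it is the multinomial coefficient obtained by splitting the edge set into equal color classes. The short sketch below computes this count; it follows from the balance condition alone and is offered as an illustration, with a hypothetical function name.

```python
from math import factorial


def count_balanced_colorings(num_edges, num_colors):
    """Number of mappings edges -> colors with equal-size color classes.

    Colors are treated as distinguishable, so the count is the
    multinomial coefficient num_edges! / ((num_edges / num_colors)!)^num_colors.
    """
    if num_edges % num_colors != 0:
        raise ValueError("the palette size must divide the number of edges")
    share = num_edges // num_colors
    return factorial(num_edges) // factorial(share) ** num_colors
```

A graph with only 16 edges and a palette of 4 colors already admits 63,063,000 balanced colorings, and the count grows very quickly with the number of edges, which illustrates why a smaller edge set is attractive.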

The diversity order $L$ attained by a code can never exceed $M$, the latter being the diversity from a channel point of view. A tighter upper bound on $L$ showing the rate-diversity tradeoff is the block-fading Singleton bound. The Singleton bound for the maximal achievable diversity order is valid for all types of non-ergodic channels, including block-erasure and block-fading channels. The block-fading Singleton bound states that [31] [41]

$$L \le 1 + \lfloor M (1 - R) \rfloor \qquad (7)$$

where $R$ is the coding rate of the product code. Codes satisfying the equality in the above Singleton bound are referred to as diversity-wise MDS or block-fading MDS codes. From (7), we deduce that $R \le 1/M$ if $L = M$ (full-diversity coding). For example, we get with an edge coloring using colors and for colors. The coding rate can exceed $1/M$ when $L < M$ in applications where full diversity is not mandatory. An example suited to distributed storage is an edge coloring with a palette of colors, a diversity , and .
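The rate-diversity tradeoff can be explored numerically. The sketch below assumes the usual form of the block-fading Singleton bound, namely a diversity of at most $1 + \lfloor M(1-R) \rfloor$ for a palette of $M$ colors and a coding rate $R$; the function names and the use of exact fractions are illustrative choices.

```python
from fractions import Fraction
from math import floor


def max_diversity(num_colors, rate):
    """Largest diversity order allowed by the block-fading Singleton bound."""
    rate = Fraction(rate)  # exact arithmetic avoids floating-point floor errors
    return 1 + floor(num_colors * (1 - rate))


def max_rate(num_colors, diversity):
    """Largest coding rate compatible with a target diversity order.

    Obtained by inverting the bound: the floor term must reach
    diversity - 1, so the rate is at most 1 - (diversity - 1) / num_colors.
    """
    return 1 - Fraction(diversity - 1, num_colors)
```

Under this bound, full diversity with a palette of $M$ colors limits the rate to $1/M$, while asking only for double diversity allows a rate of up to $1 - 1/M$.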

III-D Rootcheck nodes and root symbols

In a way similar to root-LDPC codes and product codes built for block-fading channels [10][13], we introduce now the notion of root symbols and root-check nodes in product codes to be designed for distributive storage. A linear code with parity-check matrix can fill erasures at positions where the columns of are independent. These symbols correspond to separate edges in the non-compact graph and to a unique edge (supersymbol) in the compact graph. Therefore, for simplicity, we start by defining a root supersymbol in the compact graph where supernodes are equivalent to standard LDPC parity-check nodes.

Definition 5

Let be a compact graph of a product code, let be a given edge coloring, and let be a supersymbol. is a root supersymbol with respect to if it admits a neighbor vertex , or , such that all adjacent edges in satisfy .


In Definition 5, if then is a root supersymbol thanks to the product code column to which it belongs, i.e. can be solved in one iteration by its column component code when the color is erased. Likewise, is protected against erasures by its row component code if in the previous definition. Finally, a root supersymbol may be doubly protected by both its row and its column if both right and left neighbors and satisfy the condition of Definition 5.

Definition 6

Let be a non-compact graph of a product code, let be a given edge coloring, and let be a symbol. is a root symbol with respect to if it admits a neighbor vertex such that:
for at most adjacent edges if , or
for at most adjacent edges if .


As mentioned in the paragraph before Definition 5, Definition 6 implies that the root symbols with the same color should belong to positions of independent columns in the parity-check matrix of the component code . This constraint automatically disappears for MDS component codes since any set of columns of has full rank.

III-E The rootcheck order in product codes

Not all symbols of a product code are root symbols. Under iterative row-column decoding on channels with block erasures, some symbols may be solved in two decoding iterations or more. Some set of symbols may never be solved and are referred to as stopping sets [17][54][51]. Our study is restricted to erasing the symbols of one color out of . Hence, the rest of this paper is restricted to double diversity, . Absence of diversity is equivalent to . We establish now the root order of a symbol. For root symbols satisfying definitions 5 and 6, the root order is . For symbols that can be solved after two decoding iterations, we set . The formal definition of the root order can be written in the following recursive manner (for ).

Definition 7

Let be a compact graph of a product code, let be an edge coloring, and let be a super-symbol. has root order where:
1- Let be the column neighbor vertex of . adjacent to in and , we have .
2- Let be the row neighbor vertex of . adjacent to in and , we have .


The previous definition implies that if there exists no adjacent edge with the same color. Also, for an edge that does not admit a finite , we set . When color is erased, symbols belonging to the so-called stopping sets can never be solved (even after an infinite number of decoding iterations) and hence their root order is infinite. In the next section we review stopping sets as known in the literature and we study new stopping sets for product codes based on MDS components under iterative algebraic decoding. Definition 7 can be rephrased to make it suitable for the non-compact graph . We pursue this section to establish an upper bound of the largest finite root order valid for all edge colorings .
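The recursive definition of the root order lends itself to a simple peeling computation on the compact graph. The sketch below is written under an assumption about the recursion: a super-symbol of the erased color gets order 1 when its row or its column holds no other super-symbol of that color, and otherwise its order is one plus the smaller of the two largest orders found among same-color neighbors in its row and in its column. Super-symbols that are never resolved receive an infinite order. The dictionary representation of the coloring and the function name are hypothetical.

```python
import math


def root_orders(coloring, color):
    """Root order of every super-edge of the given color.

    coloring : dict mapping (left vertex, right vertex) to a color.
    Returns a dict edge -> order, with math.inf for edges caught in
    a stopping set when this color is erased.
    """
    targets = [edge for edge, c in coloring.items() if c == color]
    order = {}
    round_no = 0
    while len(order) < len(targets):
        round_no += 1
        newly = []
        for edge in targets:
            if edge in order:
                continue
            # Same-color neighbors sharing the left (row) or right (column) vertex.
            row = [f for f in targets if f[0] == edge[0] and f != edge]
            col = [f for f in targets if f[1] == edge[1] and f != edge]
            # Solvable now if one side has only previously solved neighbors.
            if all(f in order for f in row) or all(f in order for f in col):
                newly.append(edge)
        if not newly:
            break  # the remaining edges can never be solved
        for edge in newly:
            order[edge] = round_no  # committed after the scan: synchronous rounds
    for edge in targets:
        order.setdefault(edge, math.inf)
    return order
```

Taking the maximum of the returned orders over all colors gives a quick test: a coloring in which some color yields an infinite order cannot recover from the erasure of that color under this peeling rule.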

Theorem 1

Let be a product code with a compact graph . and we have:
Case 1: such that and , then

Define the minimum number of good edges,

Then, in Case 1,

(8)

Case 2: such that and , then

where is given by (2).

Proof

Case 1 corresponds to a product code with diversity , for a given color , which is capable of solving all symbols when that color is erased. The graph has no infinite root order symbols. is recursively built by starting from following two paths in the graph until reaching a common edge that has two neighboring vertices with edges of order . There are up to edges, including , having color equal to . The largest is attained in the middle of the longest path of length , hence which is translated into the stated result for Case 1. An illustrated instance is given for the reader in Example 1. Back to the path of length ending with edges of order on both sides, if the population of order edges is for the color , then the path can only use a maximum of edges. We get the inequality . By plugging instead of , this inequality becomes independent from the particular color. The stated inequality in (8) is obtained after grouping and on the left side.
Case 2 corresponds to bad edge coloring where the product code does not have double diversity, i.e. stopping sets do exist for the color . The order of may be infinite if is involved in a stopping set with another edge having the same color. Otherwise, consider the smallest stopping set of size four symbols (the smallest cycle in with edges of color ), then there remains edges of color . A path of length starting with and ending at may exist. The largest finite order in this path before reaching the stopping set is .

Corollary 1

Let be a product code with a compact graph . Let be an edge coloring. We define

(9)

attains double diversity under iterative row-column decoding if and only if . In this case, we say that is a double-diversity coloring and , can be solved after at most decoding iterations where .

For colorings in , we extend the same definition as in Corollary 1 and we say that is double-diversity if all edges have a finite rootcheck order. The parameter is important in practical applications to bound from above the amount of conveyed information within a network (whether it is a local-area or a wide-area network). In fact, in coding for distributed storage, the locality of a product code per decoding iteration is in under algebraic decoding of its row and column components. Here, the locality is the number of symbols to be accessed in order to repair an erased symbol [25]. Locality is for MDS components under ML decoding of the product code components. Finally, for a product code, the information transfer per symbol is bounded from above by

(10)

The exact transfer cost to fill all erasures with iterative decoding can be determined by multiplying each order with the corresponding edge population size. This exact cost may vary in a wide range from one coloring to another. The DECA algorithm presented in Section V dramatically reduces by enlarging the edge population with root order . The interdependence between and the population of order  was revealed in inequality (8). This inequality is useful in intermediate cases where is not attained, i.e. outside the case where all edges have order . The influence of the component decoding method on the performance of a product code via its stopping sets is discussed in Section IV.

Example 1

Consider a product code and a coloring with colors. The compact graph has edges. Instead of drawing , we draw the compact matrix representation of the product code in Fig. 3. Supersymbols corresponding to a color are shaded. Fig. 3 also shows a path in such that a maximal order is attained for . If has double diversity then will not exceed for all colors . Note that the parameters of this product code are such that is also equal to 5 for a with a diversity defect.

Figure 3: Compact matrix (left) and path in compact graph (right) for a product code showing a maximal root order of 5.

Example 2

Consider a product code and a coloring with colors. The compact graph has edges. The compact matrix and a path attaining are illustrated in Fig. 4. is chosen such that the first color has a cycle involving four supersymbols. Starting from the root supersymbol () it is possible to create a path in the graph such that is reached. Note that a double-diversity cannot exceed a root order .

Figure 4: Compact matrix (left) and path in compact graph (right) for a product code showing a maximal finite root order of 10.

The ideal situation is to construct a product code and its edge coloring in order to obtain for all edges. We investigate now the conditions on the product code rate and its components rates in this ideal situation. The analysis based on reveals the existence of a trade-off between minimizing the number of decoding iterations and the valid range of both coding rates for the product code components.

Firstly, let us look at the upper bound from Theorem 1. Without loss of generality, assume that divides . Then, we have

(11)

The total coding rate becomes

(12)

Using , we get

(13)

Finally, from (13) and Theorem 1, the upper bound of the root order for double-diversity edge coloring of the compact graph can be expressed as

(14)

Fix the product code rate , force the upper bound to , and take colors. Then the denominator in (14) should be less than or equivalently . This second-degree polynomial in is non-negative if and only if

(15)

and

(16)

As a result, with a palette of four colors, (15) tells us that for all edges is feasible for a product code with a rate less than . It is obvious that (15) is a very constraining condition because is an upper bound of for all . It is worth noting that and vary in a smaller range when approaches , which corresponds to a product code with balanced components.

In Section V-A, we will show unbalanced product codes where a sufficient condition on the component rates imposes order to all edges. The sufficient condition, not based on , is given by Lemma 5. Before introducing an efficient edge coloring algorithm in Section V, we first analyze stopping sets in product codes with MDS components in the next section. We then describe the relationship between stopping sets and the product code graph representation, and finally we enumerate obvious and non-obvious stopping sets. Stopping set enumeration is useful to determine the performance of a product code with and without edge coloring.

IV Stopping sets for MDS components

The purpose of this section is to prepare the way for determining the performance of iterative decoding of non-binary product codes. The analysis of stopping sets in a product code will yield a tight upper bound of its iterative decoding performance over a channel with independent erasures. The same analysis will be useful to accurately estimate the performance under edge coloring in presence of block and multiple erasure channels.

IV-A Decoding erasures

Definition 8

An erasure pattern is said to be ML-correctable if the ML decoder is capable of solving all its erased symbols.

For an erasure pattern which is not correctable under ML or iterative decoding, the decoding process may fill none or some of the erasures and then stay stuck on the remaining ones. Before describing the stopping sets of a product code, let us recall some fundamental results regarding the decoding of its row and column component codes. The ML erasure-filling capability of a linear code satisfies the following property.

Proposition 1

Let be a linear code with . Assume that is not MDS and the symbols of a codeword are transmitted on an erasure channel. Then, there exists an erasure pattern of weight greater than that is ML-correctable.

Proof.

Let be an parity-check matrix of with rank . For any integer in the range , there exists a set of linearly independent columns in . Choose an erasure pattern of weight with erasures located at the positions of the independent columns. Then, the ML decoder is capable of solving all these erasures by simple Gaussian reduction of .

For MDS codes, based on a proof similar to the proof of Proposition 1, we state a well-known result in the following corollary.

Corollary 2

Let be an MDS code. All erasure patterns of weight greater than are not ML-correctable.
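The criterion behind Proposition 1 and Corollary 2 can be checked numerically: an erasure pattern is ML-correctable exactly when the columns of the parity-check matrix located at the erased positions are linearly independent. The Python sketch below is illustrative only. The two codes it uses (a [6,3,4] Reed-Solomon-type code over GF(7) defined by a Vandermonde parity-check matrix, and the binary [7,4,3] Hamming code) and all function names are assumptions of ours, chosen for their small size; they are not the component codes studied in this paper.

```python
from itertools import combinations

def rank_mod_p(rows, p):
    """Rank of a matrix (list of rows) over the prime field GF(p)."""
    m = [[x % p for x in r] for r in rows]
    rank = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        # Look for a pivot in column c among the rows not yet used.
        pivot = next((r for r in range(rank, len(m)) if m[r][c]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][c], p - 2, p)  # inverse via Fermat's little theorem
        m[rank] = [(x * inv) % p for x in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][c]:
                f = m[r][c]
                m[r] = [(a - f * b) % p for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

def ml_correctable(H, erased, p):
    """True if the columns of H at the erased positions are independent."""
    sub = [[row[j] for j in erased] for row in H]
    return rank_mod_p(sub, p) == len(erased)

def correctable_patterns(H, weight, p):
    """All ML-correctable erasure patterns of a given weight."""
    n = len(H[0])
    return [e for e in combinations(range(n), weight)
            if ml_correctable(H, e, p)]

# Assumed MDS example: Vandermonde parity-check matrix, [6,3,4] over GF(7).
H_MDS = [[pow(a, i, 7) for a in range(1, 7)] for i in range(3)]
# Assumed non-MDS example: binary [7,4,3] Hamming code.
H_HAM = [[(j >> b) & 1 for j in range(1, 8)] for b in range(3)]

print(len(correctable_patterns(H_MDS, 3, 7)))  # weight n - k = 3
print(len(correctable_patterns(H_MDS, 4, 7)))  # weight n - k + 1 = 4
print(len(correctable_patterns(H_HAM, 3, 2)))  # weight equal to d = 3
```

In this toy setting, every weight-3 pattern of the MDS code is ML-correctable and no weight-4 pattern is, whereas the Hamming code, which is not MDS, still corrects some of the patterns whose weight equals its minimum distance.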

We conclude from the previous corollary that an algebraic decoder for an MDS code attains the word-error performance of its ML decoder. What about symbol-error performance? Indeed, for general binary and non-binary codes, the ML decoder may outperform an algebraic decoder since it is capable of filling some of the erasures when dealing with a pattern which is not ML-correctable. In the MDS case, the answer comes from the absence of spectral holes for any MDS code beyond its minimum distance. This basic result is proven via standard tools from algebraic coding theory [40][7]:

Proposition 2

Let be a non-binary MDS code (). For any satisfying and any support , where , there exists a codeword in of weight having as its own support.

Proof.

By assumption we have . Let be a parity-check matrix of with rank . Recall that the MDS property makes full-rank any set of columns of [40]. is written as , where . The positions of are anywhere inside the range , but for simplicity let us denote the columns of in the first positions. The last columns are denoted . For any , we have

where otherwise it contradicts . Now, select from such that: is arbitrary, is chosen outside the set , then is chosen outside the set , and so on, up to which is chosen outside the set . Here, the notation in is equivalent to the standard algebraic notation . The equality

produces a codeword of Hamming weight . Hence, there exists a codeword of weight with non-zero symbols in all positions given by .

Now, at the symbol level for an MDS code and an erasure pattern which is not ML-correctable (), we conclude from Proposition 2 that the ML decoder cannot solve any of the erasures because they are covered by a codeword. Consequently, an algebraic decoder for an MDS code also attains the symbol-error performance of the ML decoder. This behavior will have a direct consequence on the iterative decoding of a product code with MDS components: stopping sets are identical when dealing with algebraic and ML-per-component decoders.
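The absence of spectral holes stated in Proposition 2 can be verified exhaustively on a small code. The Python sketch below enumerates all codewords of an assumed [6,3,4] Reed-Solomon code over GF(7), obtained by evaluating polynomials of degree less than 3 at the points 1 to 6, and checks that every set of positions of size 4, 5, or 6 is the exact support of some codeword. The code parameters and the function names are illustrative choices of ours and are not taken from this paper.

```python
from itertools import combinations, product

P = 7                 # assumed field size (prime, so arithmetic is mod P)
POINTS = range(1, 7)  # assumed evaluation points, n = 6
K = 3                 # assumed dimension, hence d = n - K + 1 = 4

def codewords():
    """All evaluation codewords (f(1), ..., f(6)) with deg f < K."""
    for coeffs in product(range(P), repeat=K):
        yield tuple(
            sum(c * pow(x, e, P) for e, c in enumerate(coeffs)) % P
            for x in POINTS
        )

# Set of all supports that occur among the 7**3 = 343 codewords.
SUPPORTS = {
    frozenset(i for i, s in enumerate(cw) if s) for cw in codewords()
}

def every_support_is_covered(w):
    """True if each size-w set of positions is the support of a codeword."""
    n = len(POINTS)
    return all(frozenset(s) in SUPPORTS for s in combinations(range(n), w))

for w in (4, 5, 6):
    print(w, every_support_is_covered(w))
```

For this code, the three checks succeed, and no support of size 1, 2, or 3 occurs, in agreement with a minimum distance of 4. Consequently, any erasure pattern of weight at least 4 is entirely covered by a codeword, which is why none of its symbols can be recovered.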

A general description of a stopping set was given by Definition 1. The exact definition of a stopping set depends on the iterative decoding type. For product codes, four decoding methods are known:

  • Type I: ML decoder. This is a non-iterative decoder. It is based on a Gaussian reduction of the parity-check matrix of the product code.

  • Type II: Iterative algebraic decoder. At odd decoding iterations, component codes on each column are decoded via an algebraic decoder (bounded-distance) that fills up to erasures. Similarly, at even decoding iterations, component codes on each row are decoded via an algebraic decoder.

  • Type III: Iterative ML-per-component decoder. This decoder was considered by Rosnes in [51] for binary product codes. At odd decoding iterations, column codes are decoded via an optimal decoder (ML for ). At even decoding iterations, row codes are decoded via a similar optimal decoder (ML for ).

  • Type IV: Iterative belief-propagation decoder based on the Tanner graph of , as studied by Schwartz et al. for general linear block codes [54] and by Di et al. for low-density parity-check codes [17].

The three iterative decoders listed above give rise to three different kinds of stopping sets. As previously indicated, from Corollary 2 and Proposition 2, we concluded that type-II and type-III stopping sets are identical if component codes are MDS.
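The type II decoder can be simulated on the erasure positions alone, since an MDS component decoder fills all erasures of a row or a column whenever their number does not exceed the minimum distance minus one, and fills none otherwise. The Python sketch below is a minimal illustration under this assumption. The function name, the stopping rule (two consecutive passes without progress), and the values d1 = d2 = 3 used in the example are our own choices and not part of the decoder specification above.

```python
def iterative_erasure_decode(erased, d1, d2, max_iters=1000):
    """Row-column erasure decoding tracked on positions only.

    erased: iterable of (row, column) positions of erased symbols.
    d1: minimum distance of the column code, d2: of the row code.
    Returns the set of positions that remain erased when decoding stops.
    """
    remaining = set(erased)
    iters = 0
    idle = 0  # consecutive passes that solved nothing
    while remaining and idle < 2 and iters < max_iters:
        iters += 1
        by_col = iters % 2 == 1       # odd passes: columns, even: rows
        axis = 1 if by_col else 0     # index of the coordinate to group by
        limit = (d1 if by_col else d2) - 1
        counts = {}
        for pos in remaining:
            counts[pos[axis]] = counts.get(pos[axis], 0) + 1
        # A component with at most `limit` erasures is solved entirely.
        solved = {pos for pos in remaining if counts[pos[axis]] <= limit}
        remaining -= solved
        idle = 0 if solved else idle + 1
    return remaining

# A full 3 x 3 block of erasures with d1 = d2 = 3 blocks the decoder,
# whereas the same block with one corner removed is solved completely.
block = {(r, c) for r in range(3) for c in range(3)}
print(len(iterative_erasure_decode(block, 3, 3)))
print(len(iterative_erasure_decode(block - {(0, 0)}, 3, 3)))
```

The set returned by this function is either empty or a pattern on which both the row pass and the column pass make no progress, which is the situation formalized by the stopping set definitions of this section.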

IV-B Stopping set definition

Let be a -ary linear code of length , i.e. is a sub-space of dimension of . The support of , denoted by , is the set of distinct positions , , such that, for all , there exists a codeword with . This notion of support is applied to rows and columns in a product code.

Now, we define a rectangular support which is useful to represent a stopping set in a bi-dimensional product code. Let be a set of symbol positions in the product code. The set of row positions associated to is where and for all there exists . The set of column positions associated to is where and for all there exists . The rectangular support of is

(17)

i.e. the smallest rectangle including all columns and all rows of .
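The rectangular support can be computed directly from its definition: collect the rows and the columns touched by the set, then take their Cartesian product. The following Python sketch is a hedged illustration; the function names and the representation of symbol positions as (row, column) pairs are assumptions of ours.

```python
def rectangular_support(S):
    """Rows, columns, and rectangular support of a set of positions.

    S: iterable of (row, column) symbol positions.
    Returns (rows, cols, rect), where rect is the set of all positions
    in the smallest rectangle including every row and column of S.
    """
    S = set(S)
    rows = sorted({i for i, _ in S})
    cols = sorted({j for _, j in S})
    rect = {(i, j) for i in rows for j in cols}
    return rows, cols, rect

def fills_rectangle(S):
    """True if S coincides with its own rectangular support."""
    S = set(S)
    return S == rectangular_support(S)[2]

S = {(0, 1), (2, 1), (2, 4)}
rows, cols, rect = rectangular_support(S)
print(rows, cols, len(rect), fills_rectangle(S))
```

In this small example, the three positions span two rows and two columns, so the rectangular support contains four positions and the set does not fill it.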

Definition 9

Consider a product code . Let with and . Consider the rows of given by and the columns of given by . The set is a stopping set of type III for if there exist linear subcodes and such that and for all and for all .

The cardinality is called the size of the stopping set and will also be referred to in the sequel as the weight of . Recall that type II and type III stopping sets are identical when both and are MDS. Stopping sets of type III were studied for binary product codes by Rosnes [51]. His analysis is based on the generalized Hamming distance [67][29] because sub-codes involved in Definition 9 may have a dimension greater than 1. In the non-binary MDS case, according to Proposition 2, all these sub-codes have dimension 1, i.e. they are generated by a single non-zero codeword. Consequently, the generalized Hamming distance is not relevant when using MDS components. In such a case, the analysis of type II stopping sets is mainly combinatorial and does not require algebraic tools.

Stopping sets for decoder types II-IV can be characterized by four main properties summarized as follows.

  • Obvious or non-obvious sets. Obvious sets are also known as rank-1 sets. A stopping set is obvious if .

  • Primitive or non-primitive stopping sets. A stopping set is primitive if it cannot be partitioned into two or more smaller stopping sets. Notice that all stopping sets, whether they are primitive or not, are involved in the code performance.

  • Codeword or non-codeword. A stopping set is said to be a codeword stopping set if there exists a codeword in such that .

  • ML-correctable or non-ML-correctable. A stopping set cannot be corrected via ML decoding if it includes the support of a non-zero codeword.

In the remaining material of this paper, we restrict our study to type II stopping sets.

Example 3

Consider a product code. A stopping set of size is shown as a weight- matrix of size , where corresponds to an erased position:

(18)

We took for illustration. The rectangular support is shown in a compact representation as a matrix of size ,

(19)

The stopping set in (18) is obvious: it has the same size as its rectangular support. It corresponds to a matrix of rank 1. Each row and each column of has weight . Iterative row-column decoding based on component algebraic decoders fails to decode rows and columns since the number of erasures exceeds the erasure-filling capacity of the MDS components. This stopping set is not ML-correctable because it is a product-code codeword. In the sequel, all stopping sets (type II) shall be represented in this compact manner by a smaller rectangle of size .

Example 4

For the same product code used in the previous example, the following stopping sets of size are not obvious.

(20)
(21)

In compact form, their rectangular support is

(22)

These stopping sets have size and a rectangular support. For , it is also possible to build an obvious stopping set in a rectangle or a rectangle full of . is ML-correctable since it does not cover a product code codeword. covers a codeword hence it is not ML-correctable.

IV-C Stopping sets and subgraphs of product codes

A stopping set as defined by Definition 9 corresponds to erased edges in the non-compact graph introduced in Section III-A. Indeed, consider the size- stopping set given by (18) or (19). The nine symbol positions involve nine edges in , three row checknodes, and three column checknodes. Each of these six checknodes has three erased symbols making the decoder fail. This stopping set is equivalent to a subgraph of edges in as shown in Figure 5.

Figure 5: A sub-graph of representing the size- obvious stopping set. The graph has edges, left (row) checknodes, and right (column) checknodes. Only the stopping set edges are drawn.

The subgraph in Figure 5 has three length- cycles and two length- cycles. The small cycles of length- are associated with an erasure pattern with a rectangular support which is not a stopping set (). Similarly, length- cycles are not stopping sets and are associated with erasure patterns with a rectangular support. We will see in the next section that the minimum stopping set size is , i.e. it is equal to the minimum Hamming distance of the product code.

A subgraph of can be embedded into by splitting each super-edge into edges. The converse is not always true. The subgraph with nine edges in Figure 5 cannot be compressed into a subgraph of . For the product code, a supersymbol in contains four edges. Hence, a necessary condition for a stopping set in to become a valid stopping set in is to erase edges in groups of . Knowing that type II and type III stopping sets are identical when row and column codes and are MDS, Definition 9 leads to the following corollaries.

Corollary 3

Let be a product code with MDS components and having minimum Hamming distance and respectively. Assume that symbols (edges) of are sent over an erasure channel. A stopping set for the iterative decoder is a subgraph of such that all column vertices in have a degree greater than or equal to and all row vertices in have a degree greater than or equal to .
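The degree condition of Corollary 3 translates into a simple test on row and column weights of an erasure pattern. The Python sketch below implements this test and uses it in a brute-force search for the smallest stopping set on a toy grid. The function names, the 4 x 4 grid, and the distances d1 = 2 and d2 = 3 are hypothetical values chosen by us to keep the search small, and the empty set is excluded by convention.

```python
from itertools import combinations

def is_stopping_set(S, d1, d2):
    """Degree test: every involved column has weight >= d1 (column code
    distance) and every involved row has weight >= d2 (row code distance).
    The empty set is not counted as a stopping set here (our convention).
    """
    S = set(S)
    if not S:
        return False
    row_w, col_w = {}, {}
    for i, j in S:
        row_w[i] = row_w.get(i, 0) + 1
        col_w[j] = col_w.get(j, 0) + 1
    return (all(w >= d1 for w in col_w.values())
            and all(w >= d2 for w in row_w.values()))

def min_stopping_set_size(n_rows, n_cols, d1, d2):
    """Smallest stopping set size on an n_rows x n_cols grid, by brute
    force. Only practical for very small grids."""
    cells = [(i, j) for i in range(n_rows) for j in range(n_cols)]
    for size in range(1, len(cells) + 1):
        if any(is_stopping_set(c, d1, d2)
               for c in combinations(cells, size)):
            return size
    return None

block = {(r, c) for r in range(3) for c in range(3)}
print(is_stopping_set(block, 3, 3))             # full 3 x 3 block
print(is_stopping_set(block - {(0, 0)}, 3, 3))  # one corner removed
print(min_stopping_set_size(4, 4, 2, 3))        # toy search
```

On the toy grid, the search returns 6, the product of the two assumed distances, which is attained by a rectangle with two rows and three columns.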

Corollary 4

Let be a product code with MDS components and having minimum Hamming distance and respectively. Assume that supersymbols (super-edges) of are sent over an erasure channel. A stopping set for the iterative decoder is a subgraph of such that all column vertices in have a degree greater than or equal to and all row vertices in have a degree greater than or equal to