On Partial Maximally-Recoverable and Maximally-Recoverable Codes

S. B. Balaji and P. Vijay Kumar, Fellow, IEEE
Department of Electrical Communication Engineering, Indian Institute of Science, Bangalore.
Email: balaji.profess@gmail.com, vijay@ece.iisc.ernet.in
This work is supported in part by the National Science Foundation under Grant No. 0964507 and in part by the NetApp Faculty Fellowship program.

Abstract

An $[n,k]$ linear code $\mathcal{C}$ that is subject to locality constraints imposed by a parity-check matrix $H_0$ is said to be a maximally recoverable (MR) code if it can recover from any erasure pattern that some $k$-dimensional subcode of the null space of $H_0$ can recover from. The focus in this paper is on MR codes constrained to have all-symbol locality $r$. Given that it is challenging to construct MR codes having small field size, we present results in two directions. In the first, we relax the MR constraint and require only that, apart from being an optimum all-symbol locality code, the code must yield an MDS code when punctured in a single, specific pattern which ensures that each local code is punctured in precisely one coordinate and that no two local codes share the same punctured coordinate. We term these codes partially maximally recoverable (PMR) codes. We provide a simple construction for high-rate PMR codes and then provide a general, promising approach that needs further investigation. In the second direction, we present three constructions of MR codes with improved parameters, primarily the size of the finite field employed in the construction.

Index Terms

Distributed storage, codes with locality, maximally recoverable codes, partial-MDS codes.

I Introduction

In a distributed storage network, each file is regarded as a message, encoded into a codeword by adding redundancy, and stored in the network. Each code symbol is typically placed on a different node to provide resiliency against node failure. Both replication and Reed-Solomon (RS) codes are commonly employed to protect data, but each has its drawbacks. While replication incurs large overhead, RS codes are inefficient when it comes to node repair. The notion of codes with locality, introduced in [1], was motivated in part by this shortcoming of an RS code.

I-A Codes with Locality

Definition 1

[1] An code of block length and dimension is said to have all-symbol locality if for every code symbol in , the dual code contains a codeword with support satisfying and . We will call the recovery set for code symbol . We assume w.l.o.g. that . We will write to indicate an code with such all-symbol locality and if the code has minimum distance .
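As a minimal illustration of what locality buys during repair, the sketch below uses the simplest possible local code: groups of r data symbols protected by a single XOR parity, so every symbol has a recovery set of size r + 1. The function names and the XOR-based local code are assumptions made for illustration only; they are not the constructions studied in this paper, and no global parities are included.

```python
from functools import reduce
from operator import xor


def encode_with_local_parity(data, r):
    """Split data into groups of r symbols and append one XOR parity per group.

    Returns the codeword and the list of recovery sets (index lists of size <= r + 1).
    """
    codeword, recovery_sets = [], []
    for start in range(0, len(data), r):
        group = data[start:start + r]
        first = len(codeword)
        # The local parity makes the XOR of every group equal to zero.
        codeword.extend(group + [reduce(xor, group, 0)])
        recovery_sets.append(list(range(first, len(codeword))))
    return codeword, recovery_sets


def repair(codeword, recovery_sets, failed):
    """Rebuild the symbol at index `failed` from the other symbols of its recovery set."""
    group = next(s for s in recovery_sets if failed in s)
    helpers = [i for i in group if i != failed]
    return reduce(xor, (codeword[i] for i in helpers), 0), helpers


if __name__ == "__main__":
    cw, sets = encode_with_local_parity([5, 9, 12, 7], r=2)
    value, helpers = repair(cw, sets, failed=1)
    print(cw, sets, value, helpers)
```

In this toy setting a failed symbol is rebuilt by reading only the r other symbols of its recovery set, rather than k symbols as in an RS code.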

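Codes with all-symbol locality have the property that the number of code symbols that need to be accessed to repair a failed node is at most $r$, the locality parameter. The following bound on the minimum distance $d_{\min}$ of an $[n,k]$ code under a weaker notion called information-symbol locality was derived in [1]:

$$d_{\min} \;\le\; (n-k+1) - \left( \left\lceil \frac{k}{r} \right\rceil - 1 \right). \qquad (1)$$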

The same bound also applies to codes with all-symbol locality and is often (but not always) tight; see [2] for instance. The Pyramid codes introduced in [3] are shown in [1] to be an example of codes with information-symbol locality that are optimal with respect to this bound. The existence of codes with all-symbol locality was established in [1] for the case when . Codes with locality also go by the names locally repairable codes [4] or local reconstruction codes [5].
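For a quick numerical feel for the bound in (1), the following sketch evaluates it in its usual form from [1], namely that the minimum distance is at most n - k - ceil(k/r) + 2. The function name is hypothetical and the parameter triples are arbitrary examples, not taken from this paper.

```python
def locality_distance_bound(n, k, r):
    """Upper bound on minimum distance for an [n, k] code with locality r.

    Implements n - k - ceil(k / r) + 2 using integer arithmetic only.
    """
    ceil_k_over_r = -(-k // r)  # ceiling division without floating point
    return n - k - ceil_k_over_r + 2


if __name__ == "__main__":
    for n, k, r in [(16, 12, 6), (9, 4, 2), (10, 4, 4)]:
        print((n, k, r), locality_distance_bound(n, k, r))
```

When r = k the expression reduces to the Singleton bound n - k + 1, which is a useful sanity check on any implementation.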

A class of codes with all-symbol locality known as homomorphic self-repairing codes was constructed in [6] with the aid of linearized polynomials. An example provided in [6] is optimal with respect to the bound in (1). A general construction of optimal codes with all-symbol locality, based on the construction of Gabidulin maximum rank-distance codes, is provided in [7]. An upper bound on minimum distance, similar to that in (1) and applicable also to non-linear codes, was derived in [4]. Also provided in [4] is an explicit construction of a class of linear, optimal all-symbol locality codes possessing a vector alphabet. This construction is related to an earlier construction in [8] of codes termed simple regenerating codes. Most recently, Tamo and Barg [9] have provided general constructions for optimal codes with all-symbol locality.

I-B Maximally Recoverable Codes

The notion of a maximally recoverable code is most easily defined in terms of the generator matrix of the code.

Let $\mathcal{C}$ be an $[n,k]$ code that satisfies the all-symbol, locality-$r$ constraints imposed by a parity-check matrix $H_0$. Let $\mathcal{C}_0$ denote the null space of $H_0$ and $G_0$ be the corresponding generator matrix. Then $\mathcal{C}$, with generator matrix $G$, is said to be an MR code with respect to $H_0$ if for any collection of $k$ linearly independent columns in $G_0$, the corresponding columns of $G$ are also linearly independent.
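The generator-matrix form of the MR definition can be checked by brute force for toy parameters. The sketch below is only an illustration under stated assumptions: it works over a small prime field GF(7), the matrices G0, G_MR and G_BAD were worked out by hand for two local single-parity groups of length 3 plus one global parity, and the helper names are hypothetical. It enumerates every set of k columns, so it is exponential in n and suitable only for small examples.

```python
from itertools import combinations


def rank_mod_p(rows, p):
    """Rank of a matrix (list of rows) over the prime field GF(p)."""
    m = [[x % p for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(rank, len(m)) if m[i][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)  # modular inverse (Python 3.8+)
        m[rank] = [(x * inv) % p for x in m[rank]]
        for i in range(len(m)):
            if i != rank and m[i][col]:
                f = m[i][col]
                m[i] = [(a - f * b) % p for a, b in zip(m[i], m[rank])]
        rank += 1
    return rank


def columns(matrix, idx):
    """Submatrix formed by the columns listed in idx."""
    return [[row[j] for j in idx] for row in matrix]


def is_mr(G0, G, p):
    """True if every k-set of independent columns of G0 stays independent in G."""
    k, n = len(G), len(G[0])
    for S in combinations(range(n), k):
        if rank_mod_p(columns(G0, S), p) == k and rank_mod_p(columns(G, S), p) < k:
            return False
    return True


# Null space of H0 = [[1,1,1,0,0,0],[0,0,0,1,1,1]] over GF(7) (assumed toy example).
G0 = [[1, 0, 6, 0, 0, 0], [0, 1, 6, 0, 0, 0], [0, 0, 0, 1, 0, 6], [0, 0, 0, 0, 1, 6]]
# Subcode cut out by the global parity (0,1,2,0,1,2): distinct entries within each group.
G_MR = [[1, 0, 6, 0, 5, 2], [0, 1, 6, 0, 6, 1], [0, 0, 0, 1, 5, 1]]
# Subcode cut out by the global parity (1,1,2,0,1,2): repeated entry in the first group.
G_BAD = [[1, 0, 6, 0, 6, 1], [0, 1, 6, 0, 6, 1], [0, 0, 0, 1, 5, 1]]

if __name__ == "__main__":
    print(is_mr(G0, G_MR, 7), is_mr(G0, G_BAD, 7))
```

The second subcode fails because erasing coordinates 0, 1 and 3 leaves columns 2, 4 and 5, which are independent in G0 but dependent in G_BAD.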

The construction of optimum codes with locality given in [9] has field size on the order of the block length. A principal code constructed in their paper corresponds to a subcode of an RS code. The coordinates of this code are grouped together in accordance with cosets of a cyclic subgroup of the group of th roots of unity. The subcode of the RS code is selected so that the restriction of the RS code to a coset of size corresponds to evaluation of a polynomial of degree , thus providing locality. The degree of the encoding polynomials is shown to be such that the resulting codes are optimal with respect to the minimum distance bound in (1). The authors in [10] define a general notion of maximally recoverable codes and provide a construction for maximally recoverable codes of field size . In [11], a general form of parity-check matrix was considered with the aim of constructing MR codes. These codes are referred to in [11] as partial MDS codes. The authors provide conditions under which the proposed form of parity-check matrix defines an MR code and identify explicit parameter sets for which their construction results in an MR code. A particular instance of their construction has field size , where is the block length of the code. For the case of a single global parity check, the authors provide a construction where the field size is .

The authors of [12] construct codes termed sector-disk (SD) codes. These are codes which, for certain puncturing patterns associated with a combination of disk and sector failures, result in MDS codes. The authors provide a construction for the case of global parities for handling the correction of a single or double erasure in each local code and present, through computer search, a parameter range for which their construction satisfies the requirement of an SD code. In [13], the authors present a construction for maximally recoverable codes with global parities with field size of that can handle single erasures through local error correction. In [14], a construction of SD codes with 2 global parities is provided having field size of to handle one or two erasures in each local code. This was subsequently strengthened in [15], where a construction of SD codes and partial MDS codes was provided for 2 global parities having field size of that can handle any number of erasures through local error correction.

In [16], a family of explicit, MR codes for single local erasure correction is provided in which the number of global parities can be arbitrary. It is assumed here that where is the locality parameter of the code. The parity check matrix in [16] has the same form as in [11] except that the authors use variables to fill up the entries of the parity check matrix and then proceed to derive conditions needed to be satisfied by these variables in order to yield an MR code.

In [17], a relaxation in the definition of an MR code is proposed. Here the authors seek to correct a select set of erasure patterns. Each codeword is put into matrix form in such a way that each row corresponds to a local code. A vector is used to specify the number of columns of this code matrix in which erasure can occur, the maximum number of erasures allowed within each column as well as the maximum number of complete column erasures permitted. A construction satisfying these requirements is provided.

In the present paper, a relaxation of the MR criterion termed as a partial maximally recoverable (PMR) criterion is presented and a simple, high-rate construction provided. Also contained in the paper are three constructions of MR codes with improved parameters, primarily field size.

II Partial Maximum Recoverability

Given that the construction of MR codes having small field size is challenging, we seek here to construct codes that satisfy a weaker condition which we will refer to in this paper as the partial maximally recoverable (PMR) condition. Let be an code having all-symbol locality and whose minimum distance satisfies the bound in (1) with equality. Let denote the recovery sets. In the context of PMR codes, an admissible puncturing pattern is one in which the satisfy the condition:

A PMR code is then defined simply as an optimal all-symbol locality code which becomes an MDS code upon puncturing under some admissible puncturing pattern. The parity-check matrix of a PMR code is characterized below. We assume w.l.o.g. in the section below, that through symbol reordering.
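The admissibility condition can be made concrete with a small check. The sketch below encodes the verbal description given in the abstract (each local code is punctured in precisely one coordinate and no two local codes share the same punctured coordinate); the function name and the representation of recovery sets as index lists are assumptions made for illustration.

```python
def is_admissible(recovery_sets, punctured):
    """Check a puncturing pattern against a list of recovery sets.

    Admissible here means: every recovery set contains exactly one punctured
    coordinate, distinct recovery sets use distinct punctured coordinates,
    and no coordinate outside these choices is punctured.
    """
    punctured = set(punctured)
    chosen = []
    for s in recovery_sets:
        hit = punctured & set(s)
        if len(hit) != 1:
            return False  # this local code is punctured zero times or more than once
        chosen.append(next(iter(hit)))
    return len(set(chosen)) == len(recovery_sets) and set(chosen) == punctured


if __name__ == "__main__":
    disjoint = [[0, 1, 2], [3, 4, 5]]
    overlapping = [[0, 1, 2], [2, 3, 4]]
    print(is_admissible(disjoint, {2, 5}), is_admissible(overlapping, {2}))
```

The overlapping example shows why the second requirement matters: coordinate 2 hits both recovery sets exactly once, yet the two local codes would share the same punctured coordinate.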

II-A Characterizing for a PMR Code

Theorem II.1

Let be a PMR code as defined above for admissible puncturing pattern . Then can be assumed to have parity-check matrix of the form:

where is the parity-check matrix of an MDS code and is of the form:

in which each is a vector of Hamming weight at most .

Proof:

Clearly, can be assumed to be of the form

which can be transformed, upon row reduction to the form:

It is desired that upon puncturing the first coordinates (corresponding to coordinates of the identity matrix in the upper left), the code be MDS. But since the dual of a punctured code is the shortened code in the same coordinates, it follows that must be the parity-check matrix of an MDS code.

II-B A Simple Parity-Splitting Construction for a PMR Code when

We will assume throughout the rest of the paper that is an code where and having parameters given by:

Thus represents the number of “global” parity checks imposed on top of the “local” parity checks.

Assume that . Let be the parity-check matrix of an MDS code. Let be the last row of and be with the last row deleted, i.e.,

In the construction, we will require that also be the parity-check matrix of an MDS code and set . For example, this is the case when is either a Cauchy or a Vandermonde matrix. Let be the contiguous component vectors of defined through

Let be given by

Lemma II.2
Theorem II.3 (Parity-Splitting Construction)

The code having parity-check matrix given by

with as given above and , has locality , the PMR property and minimum distance achieving the bound

Proof:

We need to show that any columns of are linearly independent. From the properties of the matrix , it is not hard to see that it suffices to show that any columns of

are linearly independent. But the rowspace of contains the vector , hence it suffices to show that any columns of

are linearly independent, but this is clearly the case, since is the parity-check matrix of an MDS code having redundancy .
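The argument above leans on the defining property of an MDS parity-check matrix (any s columns of an s-row matrix are linearly independent) and on the remark that a Vandermonde matrix keeps this property when its last row is deleted. The sketch below checks both facts by brute force over an assumed small prime field GF(7); the helper names are hypothetical and the evaluation points are arbitrary.

```python
from itertools import combinations


def rank_mod_p(rows, p):
    """Rank of a matrix (list of rows) over the prime field GF(p)."""
    m = [[x % p for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(rank, len(m)) if m[i][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)  # modular inverse (Python 3.8+)
        m[rank] = [(x * inv) % p for x in m[rank]]
        for i in range(len(m)):
            if i != rank and m[i][col]:
                f = m[i][col]
                m[i] = [(a - f * b) % p for a, b in zip(m[i], m[rank])]
        rank += 1
    return rank


def vandermonde(points, rows, p):
    """Matrix whose i-th row holds the i-th powers of the evaluation points."""
    return [[pow(x, i, p) for x in points] for i in range(rows)]


def is_mds_parity_check(H, p):
    """True if every set of len(H) columns of H is linearly independent over GF(p)."""
    s, n = len(H), len(H[0])
    return all(
        rank_mod_p([[row[j] for j in cols] for row in H], p) == s
        for cols in combinations(range(n), s)
    )


if __name__ == "__main__":
    H = vandermonde([1, 2, 3, 4, 5, 6], rows=3, p=7)
    print(is_mds_parity_check(H, 7), is_mds_parity_check(H[:-1], 7))
```

Deleting the last row of a Vandermonde matrix on distinct points leaves a smaller Vandermonde matrix on the same points, which is why both checks succeed; repeating a point breaks the property.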

Remark 1

The construction gives rise to codes having parameters and hence, high rate:

III A General Approach to PMR Construction

We attempt to handle the general case

in this section and outline one approach. At this time, we are only able to provide constructions for selected parameters with and field size that is cubic in the block length of the code and hold out hope that this construction can be generalized.

The desired minimum distance of the PMR code (with H as given in Theorem II.3 and chosen to be a Vandermonde matrix) can be shown to equal in this case,

It follows that even the code on the right having parity-check matrix

must have the same value of and therefore, the sub matrix formed by any columns of must have full rank. Let be the support of this subset of columns of . Let this support have non-empty intersection with the support of local codes and the support of the intersection with the th code being of size . The corresponding sub matrix will then take on the form:

where are the polynomials whose evaluations provide the local parities. Since we want this matrix to have full rank, its left null space must be of dimension . Computing the dimension of this null space is equivalent to computing the number of solutions to

where is generic notation for a polynomial of degree . Let us define

and note that each will in general, have degree . Consider the matrix whose rows correspond to the coefficients of . It follows that the first columns of must have full rank.

III-A Restriction to the Case , i.e.,

We now assume that so that and we need the first columns of to have rank . We consider the sub matrix made up of the first two rows and first two columns of . The determinant of this upper-left matrix formed of is given by

where

This is equal to

Let and , , and . Then this becomes:

with which will be nonzero if the minimal polynomial of over has degree , unless all the coefficients are equal to zero.

Numerical Evidence

Computer verification was carried out for the case for over and over with where is the primitive element of and respectively for the two cases and is a fifth and a seventh root of unity, respectively (the choice of fifth and seventh roots of unity varies for each ). In both cases, it was found that the elements never simultaneously vanished in any instance.

IV Maximal Recoverable Codes

IV-A A Coset-Based Construction with Locality

Since this construction is based on Construction 1 in [9] of all-symbol locality codes, we briefly review the latter here.

Let , and be a power of a prime such that , for example, could equal . Let be a primitive element of and an element of order . Let

Note that are pairwise disjoint and partition . Let . Let the supports of the local codes be . Note that the monomial is constant on each of the sets . Let us set

where the second term is vacuous for , i.e., is not present when . Consider the code of block length and dimension where each polynomial is associated to a distinct codeword obtained by evaluating the polynomial at the elements of . This code possesses all-symbol locality and has minimum distance satisfying (1).

Note that the exponents in the monomial terms forming each polynomial satisfy . It is this property that gives the code its locality properties.
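The coset structure behind this construction can be explored numerically. The sketch below is an illustration under stated assumptions, not the code of [9] itself: it uses the prime field GF(13), locality r = 2 and the element 3 of order r + 1 = 3, all chosen only because they are small. It lists the cosets of the order-3 subgroup, confirms that the monomial x^(r+1) is constant on each coset, and evaluates a weighted sum over a coset that vanishes for every polynomial whose exponents avoid the residue r modulo r + 1 (the exponent pattern used in the Tamo-Barg construction).

```python
P = 13      # assumed prime field GF(13)
R = 2       # locality; each local code has length R + 1 = 3
OMEGA = 3   # element of order R + 1 in GF(13), since 3**3 % 13 == 1


def cosets(p, omega, size):
    """Cosets of the subgroup generated by omega, each listed as a, a*omega, a*omega^2, ..."""
    subgroup = [pow(omega, j, p) for j in range(size)]
    seen, result = set(), []
    for a in range(1, p):
        if a not in seen:
            coset = [(a * h) % p for h in subgroup]
            seen.update(coset)
            result.append(coset)
    return result


def evaluate(coeffs, x, p):
    """Evaluate a polynomial given as {exponent: coefficient} at x over GF(p)."""
    return sum(c * pow(x, e, p) for e, c in coeffs.items()) % p


def local_check(coeffs, coset, p, omega):
    """Weighted sum over a coset: sum_j omega^j * f(a * omega^j) mod p.

    It is zero whenever no exponent of f is congruent to R modulo R + 1.
    """
    return sum(pow(omega, j, p) * evaluate(coeffs, x, p) for j, x in enumerate(coset)) % p


if __name__ == "__main__":
    cs = cosets(P, OMEGA, R + 1)
    good = {0: 5, 1: 2, 3: 7, 4: 1}  # exponents 0, 1, 3, 4 avoid 2 mod 3
    print(cs)
    print([local_check(good, c, P, OMEGA) for c in cs])
    print(local_check({2: 1}, cs[0], P, OMEGA))
```

The vanishing sum is one linear relation among the three evaluations on a coset, which is exactly a local parity check; the monomial x^2 violates it, illustrating why the excluded exponent class matters.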

Our construction of an MR code here is based on the above construction with parameters given by so that and . Thus the local codes all have length . Let us denote the algebraic closure of by .

Theorem IV.1

Given positive integers with and

where

there exists an MR code with that is obtained from by puncturing the code at a carefully selected set of cosets .

Proof:

Please see Appendix A.

Example 1

Let . The condition in the theorem becomes whereas, the optimized construction given in [16] requires a field size of . The construction in [10] requires .

Iv-B Modification of Construction by Blaum et al. for

In [15], the authors provide a construction for an MR code (the code is referred to as a partial MDS code in their paper). We present a modification of this construction here. The modification essentially amounts to a different choice of finite-field elements in the construction of the parity-check matrix given in [15] for the partial MDS code. The modified parity-check matrix is provided below.

where

and

In the above, is a primitive element of and is a th root of unity for any and hence divides . Using the closed-form expression for the determinant given in [15], it can be seen that this construction yields an MR code with field size . Note that the field size is independent of .

V Non-Explicit Construction of MR Codes with Field Size

In this section we provide a construction for MR codes derived by ensuring that certain polynomial constraints hold; these constraints reflect the rank conditions that the parity-check matrix of an MR code has to satisfy. Our starting point is the canonical form of the parity-check matrix for an MR code given in Theorem II.1. In our construction, the sub-matrix is fixed and we show the existence of an assignment of values to the local parities corresponding to the elements of that results in an MR code. Our approach yields improved field size in comparison with the approach in Lemma 32 of [16].

Theorem V.1

There exists a choice of such that

is a maximally recoverable code for any with a field size of (for fixed ).

Proof:

The proof is skipped for lack of space.

The above construction can be extended in a straightforward manner to give maximally recoverable codes with field size of when the matrix is made up of blocks of local codes where we correct erasures in each local code.

Acknowledgment

The authors would like to thank P. Gopalan for introducing us to this problem and for subsequent, useful discussions.

References

  • [1] P. Gopalan, C. Huang, H. Simitci, and S. Yekhanin, “On the Locality of Codeword Symbols,” IEEE Trans. Inf. Theory, vol. 58, no. 11, pp. 6925–6934, Nov. 2012.
  • [2] N. Prakash, V. Lalitha, and P. V. Kumar, “Codes with locality for two erasures,” in IEEE International Symposium on Information Theory, 2014, pp. 1962–1966.
  • [3] C. Huang, M. Chen, and J. Li, “Pyramid codes: Flexible schemes to trade space for access efficiency in reliable data storage systems,” in Network Computing and Applications, 2007. NCA 2007. Sixth IEEE International Symposium on. IEEE, 2007, pp. 79–86.
  • [4] D. S. Papailiopoulos and A. G. Dimakis, “Locally repairable codes,” in Information Theory Proceedings (ISIT), 2012 IEEE International Symposium on, Jul. 2012, pp. 2771–2775.
  • [5] C. Huang, H. Simitci, Y. Xu, A. Ogus, B. Calder, P. Gopalan, J. Li, and S. Yekhanin, “Erasure coding in Windows Azure Storage,” in Proceedings of the 2012 USENIX conference on Annual Technical Conference, ser. USENIX ATC’12. Berkeley, CA, USA: USENIX Association, 2012. [Online]. Available: http://dl.acm.org/citation.cfm?id=2342821.2342823
  • [6] F. Oggier and A. Datta, “Self-repairing homomorphic codes for distributed storage systems,” in INFOCOM, 2011 Proceedings IEEE. IEEE, 2011, pp. 1215–1223.
  • [7] N. Silberstein, A. S. Rawat, and S. Vishwanath, “Adversarial Error Resilience in Distributed Storage Using MRD Codes and MDS Array Codes,” CoRR, vol. abs/1202.0800, 2012.
  • [8] D. Papailiopoulos, J. Luo, A. Dimakis, C. Huang, and J. Li, “Simple regenerating codes: Network coding for cloud storage,” in INFOCOM, 2012 Proceedings IEEE, march 2012, pp. 2801–2805.
  • [9] I. Tamo and A. Barg, “A family of optimal locally recoverable codes,” IEEE Transactions on Information Theory, vol. 60, no. 8, pp. 4661–4676, 2014.
  • [10] M. Chen, C. Huang, and J. Li, “On the maximally recoverable property for multi-protection group codes,” to appear.
  • [11] M. Blaum, J. Hafner, and S. Hetzler, “Partial-MDS Codes and their Application to RAID Type of Architectures,” CoRR, vol. abs/1205.0997, 2012.
  • [12] J. S. Plank and M. Blaum, “Sector-disk (SD) erasure codes for mixed failure modes in RAID systems,” TOS, vol. 10, no. 1, p. 4, 2014.
  • [13] M. Blaum, “Construction of PMDS and SD codes extending RAID 5,” CoRR, vol. abs/1305.0032, 2013.
  • [14] M. Blaum and J. S. Plank, “Construction of two SD codes,” CoRR, vol. abs/1305.1221, 2013.
  • [15] M. Blaum, J. S. Plank, M. Schwartz, and E. Yaakobi, “Construction of partial MDS (PMDS) and sector-disk (SD) codes with two global parity symbols,” CoRR, vol. abs/1401.4715, 2014.
  • [16] P. Gopalan, C. Huang, B. Jenkins, and S. Yekhanin, “Explicit maximally recoverable codes with locality,” arXiv preprint arXiv:1307.3150, 2013.
  • [17] M. Li and P. P. C. Lee, “STAIR codes: a general family of erasure codes for tolerating device and sector failures in practical storage systems,” in Proceedings of the 12th USENIX conference on File and Storage Technologies, FAST 2014, Santa Clara, CA, USA, February 17-20, 2014, pp. 147–162.

Appendix A Proofs of Theorems on Maximal Recoverability

Proof of Theorem IV.1: The code has optimum minimum distance w.r.t. locality [1]. Hence puncturing at any number of cosets (local codes) without changing $k$ will maintain the optimum minimum distance. We say that is an admissible puncturing pattern if and , all .

Let be the algebraic closure of . Throughout the proof, whenever we say a pattern or just , it refers to an admissible puncturing pattern for an code with all-symbol locality . Throughout the discussion, all codes referred to are polynomial evaluation codes, and we assume the set of evaluation positions of the code to be ordered. We also use to indicate the actual finite field elements at the positions indicated by the puncturing pattern in the set of evaluation positions of the code.

Maximal Recoverability:
Let .
We denote an encoding polynomial of by and we assume . Let denote the cyclic group of cube roots of unity. Let be a primitive element in . If are the roots of then it must satisfy:

where refers to the th elementary symmetric function. Let us denote the above set of conditions based on elementary symmetric functions on by .
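The link between the roots of a polynomial and its coefficients that this step relies on is Vieta's relation: the coefficient of x^(m-i) in a monic degree-m polynomial equals (-1)^i times the i-th elementary symmetric function of its roots. The sketch below checks this over an assumed small prime field; the function names are hypothetical. A vanishing coefficient therefore forces the corresponding elementary symmetric function of the roots to vanish, which is the type of condition imposed above.

```python
def elementary_symmetric(roots, p):
    """Return [e_0, e_1, ..., e_m] of the given roots, reduced modulo p."""
    e = [1]
    for t in roots:
        nxt = e + [0]
        for i in range(1, len(nxt)):
            # e_i(new set) = e_i(old set) + t * e_{i-1}(old set)
            nxt[i] = (nxt[i] + t * e[i - 1]) % p
        e = nxt
    return e


def monic_from_roots(roots, p):
    """Coefficients of prod (x - t) over GF(p), highest degree first."""
    coeffs = [1]
    for t in roots:
        coeffs = coeffs + [0]  # multiply by x
        for i in range(len(coeffs) - 1, 0, -1):
            coeffs[i] = (coeffs[i] - t * coeffs[i - 1]) % p  # subtract t * previous polynomial
    return coeffs


if __name__ == "__main__":
    p = 13
    roots = [1, 3, 9]  # the cube roots of unity in GF(13)
    print(elementary_symmetric(roots, p), monic_from_roots(roots, p))
```

For the cube roots of unity in GF(13) the product is x^3 - 1, so the first and second elementary symmetric functions are zero, matching the two missing middle coefficients.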

Suppose we have a maximally recoverable code based on the theorem, and let be the chosen cosets of evaluation positions for forming the codeword of the maximally recoverable code. If we puncture this code by a pattern , then for the resulting code (assuming $k$ does not change after puncturing) to be MDS we need . Based on the degree of , we know that . Hence out of roots of , we want at least roots to lie outside for any . In other words, it is enough to choose cosets such that for any which satisfies the condition , at most distinct elements will lie in the chosen cosets after puncturing by any .

Note that this condition will also ensure that the dimension of a length punctured code obtained by puncturing the code by a pattern is for any . If not, there are two distinct nonzero message polynomials which, after evaluation at the cosets of evaluation positions of the code, yield the same codeword after puncturing by a pattern to length. This means is another nonzero message or evaluation polynomial with zeros in the chosen cosets after puncturing by , but by the condition on the choice of cosets mentioned above (roots of satisfies ) there can be at most distinct zeros in the evaluation positions. This is a contradiction as (by the condition given in the theorem). Hence if we choose cosets such that for any pattern and any distinct elements from the cosets after puncturing by , none of from such that satisfies which are distinct from lie in the chosen cosets after puncturing by , then we are done.

Proposition 1

Let be a set of elements from satisfying . If it contains for some , then satisfies .

Proof:

Since satisfies , this implies for .

Hence

For , as has only elements.
Hence,

Hence

for ,

Since, and , this implies that

By induction, if we assume, then since , we have
( is the starting condition of the induction which we already proved).
Hence satisfies .

:
It is enough to choose cosets such that for any and any (contained in the chosen cosets) which are distinct and contain at most one element from each coset, none of the from such that satisfies , which are distinct from , lie in the chosen cosets after puncturing by for any disjoint from .

:
This is because if satisfying contains at least two elements from some coset for some , then, since the polynomial restricted to any coset is a degree polynomial, the third element from coset is also a root of . Hence the entire coset is contained in and, by similar reasoning, can be written as for some where contains at most one element from each coset and satisfies by Proposition 1.

Now, by the property of the chosen cosets, we have that for any distinct from the chosen cosets containing at most one element from each coset, any of which are distinct from such that satisfies will not lie inside the chosen cosets after puncturing by for any such that . W.l.o.g., this implies that the chosen cosets after puncturing by any can contain at most (writing only distinct elements) of the elements. Hence there can be at most roots out of roots inside the chosen cosets after puncturing by any . Hence we are done.

From here on, we say that a set of cosets satisfying the above claim satisfies .
We are going to put another set of conditions on a set of cosets. The necessity of these conditions will become clear in the proof.


A given set of cosets is said to satisfy condition if,
for any and any (contained in the chosen cosets) which are distinct and contain at most one element from each of cosets, the matrix given by

=

is non-singular, where .

Furthermore, for any and any (contained in the chosen cosets) which are distinct and contains at most one element from each of cosets, the matrix given by
=

is non-singular, where .

From here on we proceed to find a set of cosets satisfying and . We proceed by choosing one coset at each step inductively until we have chosen the required set of cosets.
At each step we select and add one coset to our list and throw away a collection of cosets from the cosets not chosen. Let the cosets chosen up to the $i$th step be and the cosets thrown away up to the $i$th step be , and let the total collection of cosets in the field be .

1) The first coset is chosen to be any coset. Hence consists of just the coset chosen. We don’t throw away any cosets at this step. Hence is empty. satisfies and trivially.

2) The second coset is also chosen to be any coset from . Hence consists of the chosen cosets.
:
For , and for any distinct elements , one from each coset in , any such that cannot be distinct from and lie in any of the cosets in . If it does, w.l.o.g. let and lie in the same coset which is in ; then , but every coset is a coset of cube roots of unity. Hence where $X$ is the third element from the same coset as . Hence , which implies , but $X$ is in the same coset as and is in the other coset in . Hence a contradiction.
This implies that the additive inverse of the sum of distinct elements from different cosets cannot lie in the same coset as the elements.
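The two-coset argument can be sanity-checked by exhaustive search in a small field. The sketch below is an illustration only, run over the assumed prime fields GF(13) and GF(7) (both contain cube roots of unity since 3 divides p - 1); the function names are hypothetical. For every pair of elements taken from two different cosets of the cube roots of unity, it looks for a third element, equal to minus their sum, that lies in one of the two cosets while being distinct from both summands.

```python
from itertools import combinations


def cube_root_cosets(p):
    """Cosets of the group of cube roots of unity in GF(p); requires 3 | (p - 1)."""
    omega = next(w for w in range(2, p) if pow(w, 3, p) == 1)
    subgroup = [1, omega, (omega * omega) % p]
    seen, result = set(), []
    for a in range(1, p):
        if a not in seen:
            coset = [(a * h) % p for h in subgroup]
            seen.update(coset)
            result.append(coset)
    return result


def third_element_violations(p):
    """Triples (a, b, theta) with a, b in different cosets, a + b + theta = 0,
    theta in one of those two cosets and theta different from both a and b."""
    bad = []
    for A, B in combinations(cube_root_cosets(p), 2):
        for a in A:
            for b in B:
                theta = (-(a + b)) % p
                if theta in A + B and theta not in (a, b):
                    bad.append((a, b, theta))
    return bad


if __name__ == "__main__":
    for p in (7, 13):
        print(p, cube_root_cosets(p), third_element_violations(p))
```

The search also confirms the fact used in the argument that the three elements of each coset sum to zero, since they are a common multiple of 1, ω and ω² for a primitive cube root of unity ω.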

For , , we need to pick 4 distinct elements, from distinct cosets but there are only 2 cosets in . Hence is satisfied.

:
For , , hence non-singular.
For , we need to pick and distinct elements from distinct cosets but there are only 2 cosets. Hence is satisfied.

:
For every two distinct elements chosen one from each of the 2 cosets in , find the third element such that and throw away the coset in which contains it. Since satisfies , will either not lie in any coset in or will not be distinct from . In the first case, we throw away the coset, and in the latter case, we do nothing. There are $3 \times 3 = 9$ possible summations