On the Duality and File Size Hierarchy of Fractional Repetition Codes

Bing Zhu, Kenneth W. Shum, and Hui Li

B. Zhu is with the School of Electronic and Computer Engineering, Peking University, P. R. China (e-mail: zhubing@sz.pku.edu.cn). K. W. Shum is with the Institute of Network Coding, The Chinese University of Hong Kong (e-mail: wkshum@inc.cuhk.edu.hk). H. Li is with the Shenzhen Key Lab of Information Theory and Future Network Architecture and the Future Network PKU Lab of National Major Research Infrastructure, Peking University Shenzhen Graduate School, Shenzhen 518055, P. R. China (e-mail: lih64@pkusz.edu.cn).
Abstract

Distributed storage systems that deploy erasure codes can provide benefits such as lower storage overhead and higher data reliability. In this paper, we focus on fractional repetition (FR) codes, which are a class of storage codes characterized by the features of uncoded exact repair and minimum repair bandwidth. We study the duality of FR codes, and investigate the relationship between the supported file size of an FR code and its dual code. Based on the established relationship, we derive an improved dual bound on the supported file size of FR codes. We further show that FR codes constructed from $t$-designs are optimal when the size of the stored file is sufficiently large. Moreover, we present the tensor product technique for combining FR codes, and elaborate on the file size hierarchy of resulting codes.

Index Terms: Distributed storage systems, regenerating codes, fractional repetition codes, combinatorial designs.

I Introduction

Modern distributed storage systems are often built on thousands of inexpensive servers and disk drives. In such an architecture, data objects are fragmented and spread across a massive collection of physically independent storage devices (e.g., Google file system [1] and Hadoop distributed file system [2]). However, due to the commodity nature of practical data storage servers, component failures are prevalent in real-world storage environments [3, 4]. To provide high reliability and availability, data redundancy should be employed in distributed storage systems.

The replication-based strategy is the simplest method to provide fault tolerance against failures [1, 2], where several copies of each data object are created and placed on different storage nodes. Although data replication is easy to implement and manage, it suffers from low storage efficiency. For the same level of redundancy, erasure coding can improve data reliability as compared to replication [5]. Maximum-distance-separable (MDS) codes are a class of erasure codes that provide the optimal trade-off between redundancy and reliability. In an erasure code based system, any data collector is able to reconstruct the original data file by contacting a certain number of nodes in the system. Upon failure of a node, the lost data should be recovered and stored in a replacement node by connecting to some surviving nodes (called helpers) in the system. Even though traditional erasure codes save storage space, they generally require downloading large amounts of data from helpers when repairing a single failed node. For example, an $(n,k)$ MDS code encodes a data object of $k$ fragments into $n$ storage nodes such that any subset of $k$ nodes suffices for data retrieval. However, the system needs to recover the entire file in order to repair a node failure, which results in a large consumption of network resources (e.g., disk read and network transfer).

Regenerating codes are a class of erasure codes proposed in [6] with the capability to minimize the bandwidth consumption during the repair process. An $(n, k, d, \alpha, \beta)$ regenerating code encodes a data file into $n\alpha$ coded packets, which are spread across a storage system consisting of $n$ nodes, each having a capacity of $\alpha$. The stored file can be recovered by downloading data from any $k$ storage nodes in the system. When a node fails, the lost coded packets can be regenerated by connecting to any set of $d$ surviving nodes and downloading $\beta$ packets from each node, with a total repair bandwidth of $d\beta$. In particular, minimum-bandwidth regenerating (MBR) codes can recreate a failed node with the minimum repair bandwidth, i.e., $d\beta = \alpha$. We refer the readers to [7]-[9] for explicit constructions of regenerating codes.

Although MBR codes enjoy the minimum repair bandwidth, they impose additional encoding complexity on the helper nodes contacted in the repair process. Specifically, each helper node needs to read all the packets it stores and transfer a linear combination of the retrieved data, which entails a large number of computations and disk read operations. Motivated by this, a simplified repair scheme, called repair-by-transfer, is presented in [7], wherein the lost packets are recovered by duplicating copies from some surviving nodes. Subsequently, El Rouayheb and Ramchandran [10] generalized the code constructions of [7] and introduced a new class of codes, termed fractional repetition (FR) codes, in which a two-layer encoding structure is employed to ensure data reconstruction and low-complexity node repair. The data objects are encoded in the first layer by an MDS code, and then the coded packets are replicated and stored in the system according to the FR code in the second layer. In the presence of node failures, each helper node transfers a portion of its stored data to the replacement node without performing additional encoding operations. By storing the transferred data, the replacement node maintains the same content as the failed node. In this manner, FR codes enable uncoded exact repairs at the MBR point. However, in contrast to traditional MBR codes, the node repair process of FR codes is table-based, which means that the failed node can be regenerated only by contacting some specific subsets of surviving nodes [10].

The capacity of a distributed storage system is the maximum amount of data that can be delivered to a data collector when contacting any $k$ out of $n$ storage nodes in the system [6]. The parameter $k$ is called the reconstruction degree. In [6], Dimakis et al. theoretically showed that the storage capacity of an MBR code based system is

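$$C_{\mathrm{MBR}}(n,k,d) = \left(kd - \binom{k}{2}\right)\beta, \qquad (1)$$

where $d$ is the number of helper nodes contacted during repair and $\beta$ is the number of packets downloaded from each of them.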

Due to the different requirements in the node repair process, the MBR capacity given in (1) is not applicable to FR codes. For example, the FR codes constructed in [10] have a capacity greater than or equal to that of MBR codes for . Indeed, the data reconstruction mechanism of FR codes is built on the outer MDS code. The supported file size of an FR code essentially equals the number of guaranteed distinct packets when downloading data from any collection of $k$ nodes. (Footnote 1: The supported file size of a given FR code is equivalent to the storage capacity of the FR code based system.) In principle, we can obtain the file size of a given FR code by exhaustively considering all possible combinations of $k$ nodes in the system. However, the computational complexity increases as $n$ and $k$ increase. On the other hand, knowledge of the supported file size is critical to the design of FR codes, since it can be set as the input size of the outer MDS code.

I-A Related Work

The concept of an FR code is introduced in the pioneering work [10], wherein the authors also proposed explicit code constructions from regular graphs and Steiner systems. Several recent studies extend the construction of FR codes to a larger set of parameters, which are mainly based on graph theory (e.g., bipartite cage graph [11] and extremal graph [12, 13]) and combinatorial designs (e.g., transversal designs [12], resolvable designs [14], group divisible designs [15], Hadamard designs [16], perfect difference families [17], relative difference sets [18] and partially ordered sets [19]). Further, Pawar et al. [20] proposed a randomized scheme for constructing FR codes, which is based on the balls-and-bins model. In [21], Anil et al. presented an incidence matrix based algorithm for designing FR codes, where they also enumerated FR codes up to a given number of nodes. Constructions of FR codes for dynamic data storage systems are considered in [22, 23], where the code parameters can evolve over time. The authors in [24]-[26] investigated the constructions of FR codes with small repair degrees (). Moreover, generalization of FR codes to heterogeneous storage networks is discussed in [27]-[31], where the system nodes have different storage capacities.

In addition to code constructions, some upper bounds on the maximum supported file size of FR codes with given parameters are also investigated in [10, 12, 16]. El Rouayheb and Ramchandran provided in [10] two upper bounds on the file size of FR codes. Subsequently, Silberstein and Etzion presented in [12] explicit code constructions that attain these bounds. Furthermore, Olmez and Ramamoorthy determined the supported file size for most of their code constructions [16].

I-B Our Contributions

In this paper, we investigate the duality of FR codes, and establish a close relationship between the supported file size of an FR code and its dual code. Specifically, our main contributions are three-fold.

  1. By jointly considering the relationship and the upper bound in [10], we provide an improved upper bound on the supported file size of FR codes, which is referred to as the dual bound.

  2. From the dual perspective, we show that FR codes based on $t$-designs are optimal when the size of the stored file is sufficiently large.

  3. We present the tensor product method for combining two FR codes. The file size hierarchy of the resulting code can be obtained from those of the component codes.

The rest of this paper is organized as follows. Section II introduces the necessary background and notations. Section III provides a dual bound on the supported file size of FR codes. Section IV shows that FR codes derived from $t$-designs are optimal for certain parameter ranges. Section V discusses the tensor product of FR codes. Finally, Section VI concludes the paper.

II Preliminaries

II-A Incidence Structure and $t$-Designs

An incidence structure is a triple $(P, \mathcal{B}, \mathcal{I})$, where $P$ and $\mathcal{B}$ are nonempty finite sets, and $\mathcal{I}$ is a subset of $P \times \mathcal{B}$. The elements in $P$ are called points, and the elements in $\mathcal{B}$ are called blocks. An element in $\mathcal{I}$ is called a flag, and we say that a point $p$ is incident with a block $B$ if $(p, B)$ is a flag in $\mathcal{I}$. We can also specify an incidence structure by an incidence matrix, which is a zero-one matrix with rows indexed by the blocks and columns indexed by the points, such that the entry corresponding to a point $p$ and a block $B$ is equal to $1$ if and only if $p$ is incident with $B$. If an incidence matrix has constant row sums and constant column sums, then the corresponding incidence structure is called a tactical configuration [32].

In this general setting, it is permissible that two distinct blocks are incident with the same set of points, and if this occurs, we say that there are repeated blocks. An incidence structure with no repeated blocks is called simple. In a simple incidence structure, we can identify a block with a subset of the point set $P$, and denote the incidence structure by $(P, \mathcal{B})$, where $\mathcal{B}$ is the block set.

A $t$-design is a simple incidence structure in which every block has the same size and any $t$ distinct points are contained in exactly $\lambda$ blocks, for some constants $t$ and $\lambda$. More precisely, for positive integers , , , and satisfying , a - design is a simple incidence structure such that (i) , (ii) for all , and (iii) any subset of $t$ points of occurs in exactly $\lambda$ blocks in . When $t = 1$, a $t$-design is nothing but a simple tactical configuration.

For example, consider a point set and a block set . We note that every pair of points appears in exactly two blocks. Thus, forms a - design.

Lemma 1.

([33, Theorem 9.7]) Suppose that is a - design. Let and be disjoint subsets of such that , , and . Then, there are exactly

(2)

blocks in that contain all the points in and none of the points in .

For the special case that , we obtain the number of blocks in a - design, which is given by

(3)

Moreover, if and , we have , implying that each point is contained in blocks.
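As a quick numerical check of Lemma 1, the following Python sketch evaluates the count in its standard textbook form, $\lambda\binom{v-i-j}{\kappa-i}/\binom{v-t}{\kappa-t}$, for a $t$-design on $v$ points with block size $\kappa$ and index $\lambda$, where $i$ prescribed points must be contained and $j$ prescribed points avoided. The symbol names, the function names, and the choice of the Fano plane as a test design are assumptions made for illustration and are not taken from the text above.

```python
from math import comb

def blocks_containing_avoiding(t, v, block_size, lam, i, j):
    """Number of blocks that contain i given points and avoid j other given points.

    Assumed form of the counting formula for a t-(v, block_size, lam) design;
    it is only valid when i + j <= t.
    """
    if i + j > t:
        raise ValueError("the formula requires i + j <= t")
    numerator = lam * comb(v - i - j, block_size - i)
    denominator = comb(v - t, block_size - t)
    return numerator // denominator

# Fano plane, a 2-(7, 3, 1) design, used here only as an illustrative test case.
FANO = [{1, 2, 3}, {1, 4, 5}, {1, 6, 7}, {2, 4, 6}, {2, 5, 7}, {3, 4, 7}, {3, 5, 6}]

def count_by_enumeration(blocks, inside, outside):
    """Directly count blocks containing all of `inside` and none of `outside`."""
    return sum(1 for b in blocks if inside <= b and not (outside & b))

total_blocks = blocks_containing_avoiding(2, 7, 3, 1, 0, 0)   # i = j = 0
replication = blocks_containing_avoiding(2, 7, 3, 1, 1, 0)    # i = 1, j = 0
mixed = blocks_containing_avoiding(2, 7, 3, 1, 1, 1)          # i = 1, j = 1
print(total_blocks, replication, mixed)
```

The two special cases mirror the remarks after the lemma: $i = j = 0$ gives the total number of blocks, and $i = 1$, $j = 0$ gives the number of blocks through a single point.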

II-B DRESS Code and Fractional Repetition Code

A Distributed Replication-based Exact Simple Storage (DRESS) code is a coding architecture that consists of an outer code and an inner code described as follows [10]. The outer code is an MDS code with dimension and length over a sufficiently large finite field. To distribute a data object of size , which is referred to as a data file, we first encode it by the outer MDS code, such that any out of the obtained coded packets are sufficient to reconstruct the data file. In the following, we will use symbols and packets interchangeably. The inner code is an incidence structure such that the symbols produced by the outer MDS code are indexed by the points in (i.e., ). Each storage node is associated with a unique block in , and stores the coded symbols indexed by the points in the corresponding block.

For a given reconstruction degree , the supported file size of the inner code is defined as

(4)

where the minimum is taken over all $k$-subsets of the block set . By definition, the value of refers to the number of guaranteed distinct packets one can download from any $k$ storage nodes. For a fixed value of $k$, we can choose an outer MDS code with length and dimension , such that any subset of $k$ nodes is sufficient for decoding the data object.
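The definition of the supported file size can be sketched as an exhaustive search over all $k$-subsets of blocks. The example below is a minimal illustration, not an efficient algorithm; the function names are hypothetical, and the test code is the complete-graph construction on five vertices (nodes are vertices, packets are edges), which is assumed here to match the complete-graph FR code of [10].

```python
from itertools import combinations

def supported_file_size(blocks, k):
    """Minimum number of distinct packets held by any k of the given nodes."""
    return min(len(set().union(*chosen)) for chosen in combinations(blocks, k))

def complete_graph_code(num_vertices):
    """One block per vertex; each block lists the edges incident with that vertex."""
    edges = list(combinations(range(num_vertices), 2))
    return [frozenset(e for e in edges if v in e) for v in range(num_vertices)]

blocks = complete_graph_code(5)   # 5 nodes, 4 packets per node, 10 packets in total
hierarchy = [supported_file_size(blocks, k) for k in range(1, len(blocks) + 1)]
print(hierarchy)  # [4, 7, 9, 10, 10]
```

The search visits $\binom{n}{k}$ subsets for $n$ nodes, which is why closed-form expressions and bounds on the file size are useful for larger codes.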

The design rationale of the inner code is to facilitate node repair. Upon failure of a storage node, each helper node simply passes the packets it has in common with the failed node for repair. In other words, DRESS codes enjoy the repair efficiency of the replication scheme, and are suitable for high-churn environments with frequent node joins/leaves (e.g., peer-to-peer distributed storage systems). Friedman et al. [34] evaluated the efficiency of DRESS codes in practical peer-to-peer environments, and showed that the concatenated scheme can achieve better features than each of the methods separately. Moreover, Itani et al. [35, 36] investigated the optimal repair cost of DRESS code based data storage systems, where they proposed efficient genetic algorithms for the single node failure and multiple node failure scenarios respectively.

In this paper, we concentrate on DRESS codes which employ a tactical configuration as the inner code. We define a fractional repetition (FR) code as a tactical configuration with points and blocks, in which every point is incident with blocks, and every block is incident with points, for some constants and . Hence, every coded packet is replicated times in the storage system, and each storage node contains packets. We refer to such an FR code as an -FR code, and call the parameter the repetition degree.

Since the incidence matrix of an FR code has constant row sum and constant column sum , we have the following basic relation

(5)

among the code parameters.

We illustrate how to distribute data packets across a storage system using the -FR code shown in Fig. 1. By using a MDS code as the outer code, we encode a data file consisting of source symbols to coded symbols. These coded symbols are then distributed to storage nodes according to the incidence structure in Fig. 1. Furthermore, we observe that a data collector contacting any nodes can obtain at least distinct coded packets, which are sufficient to decode the original data.

Figure 1: An FR code with repetition degree for a distributed storage system with nodes. The numbers in the blocks give the indices of packets stored in the node, i.e., each storage node contains packets.

Suppose that is an FR code. The dual of is defined as the FR code , where is the subset of defined by

We denote the dual of by . Notice that the incidence matrices of an FR code and its dual are transposes of each other. In [10], the authors refer to the dual FR code as the transpose code. We state two immediate properties of dual codes in the following lemma.

Lemma 2.

Let be an -FR code.

(i) The dual code of is a -FR code.

(ii) The double dual of is itself.

Example 1. The “repetition code” on storage nodes is an -FR code. The incidence matrix is the all-one matrix. By definition, the dual of this repetition code is a -FR code, which consists of a single storage node containing all the coded symbols.
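The dual construction and the double-dual property of Lemma 2 can be illustrated with a short Python sketch. Points and blocks are represented by integer indices, which is an implementation choice made here; the helper name `dual_code` and the complete-graph test code are likewise assumptions for illustration.

```python
from itertools import combinations

def dual_code(blocks, num_points):
    """Swap the roles of points and blocks.

    `blocks` is a list of sets of point indices in range(num_points).
    The dual has one block per original point, containing the indices of the
    original blocks incident with that point (the transposed incidence matrix).
    """
    return [frozenset(i for i, b in enumerate(blocks) if p in b)
            for p in range(num_points)]

# Complete-graph code on five vertices: packet index = position of the edge in `edges`.
edges = list(combinations(range(5), 2))
blocks = [frozenset(idx for idx, e in enumerate(edges) if v in e) for v in range(5)]

dual = dual_code(blocks, num_points=len(edges))     # 10 nodes, 2 packets each
double_dual = dual_code(dual, num_points=len(blocks))
print(len(dual), sorted(len(b) for b in dual)[0], double_dual == blocks)
```

In this example the dual has ten nodes storing two packets each, with every packet replicated four times, so the node count and packet count, and the node capacity and repetition degree, swap roles as stated in Lemma 2(i).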

III The Hierarchy of Supported File Size and The Dual Bound

III-A File Size Hierarchy of FR Codes

Suppose that is an -FR code. The supported file size of is a non-decreasing function of , i.e.,

(6)

We call the above the hierarchy of supported file size of . (Footnote 2: In [12], the authors introduced the notion of file size hierarchy of FR codes for . We extend this study in this paper by taking all the possible reconstruction degrees into consideration.) We also define by convention. Similarly, the file size hierarchy of the dual code is

(7)

Note that there is a close relationship between and . This property can be seen from the fact that if we can find an all-zero submatrix in the incidence matrix of , then we have

(8)

and

(9)

This motivates us to define

(10)

with the maximum taken over all subsets of size . By definition, is the largest integer such that we can find a all-zero submatrix in the incidence matrix of . From (6) and (7), we have

The following result follows directly from the relation of and , where and .

Lemma 3.

Let be an FR code and let be a given reconstruction degree. Denote as and as . Then, we have (i) , and (ii) .

Figure 2: The graphic illustration of and .

We now plot the points for , and for in the same figure. The results can be found in Fig. 2. A Pareto optimal point, say , is a vertex of the graph that satisfies

and

Therefore, we obtain

(11)

Based on the above analysis, we obtain the following theorem.

Theorem 4.

Let be an -FR code. With  as defined in (10), we have

(12)

Remark 1. We notice that the identities in (12) can be expressed in a more compact way by

(13)

where is the indicator function equal to if the condition is true and otherwise. In this case, the right-hand side term of (13) counts the number of such that  is strictly less than . Thus,

where .
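The link between all-zero submatrices and the supported file size can be checked numerically. The sketch below assumes one particular reading of the auxiliary function: for a number $s$ of points, it returns the largest number of blocks that avoid some $s$-subset of points, i.e., the tallest all-zero submatrix with $s$ columns. Under this reading, $k$ blocks can miss $s$ packets exactly when that number is at least $k$, so the file size equals the number of points minus the largest such $s$. The function names and the orientation of the auxiliary function are assumptions; the indexing used in (10) may differ.

```python
from itertools import combinations

def file_size(blocks, k):
    """Brute-force supported file size for reconstruction degree k."""
    return min(len(frozenset().union(*c)) for c in combinations(blocks, k))

def max_disjoint_blocks(blocks, points, s):
    """Largest number of blocks that avoid some s-subset of the points."""
    return max(sum(1 for b in blocks if not (b & set(subset)))
               for subset in combinations(points, s))

def file_size_from_zero_submatrices(blocks, points, k):
    """Number of points minus the widest all-zero submatrix having at least k rows."""
    theta = len(points)
    widest = max(s for s in range(theta + 1)
                 if max_disjoint_blocks(blocks, points, s) >= k)
    return theta - widest

# Complete-graph code on five vertices (nodes = vertices, packets = edges).
points = list(combinations(range(5), 2))
blocks = [frozenset(e for e in points if v in e) for v in range(5)]

profile = [max_disjoint_blocks(blocks, points, s) for s in range(len(points) + 1)]
derived = [file_size_from_zero_submatrices(blocks, points, k) for k in range(1, 6)]
direct = [file_size(blocks, k) for k in range(1, 6)]
print(profile, derived, direct)
```

Both computations are exponential in the code size and are meant only to make the staircase relationship concrete on a small example.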

Example 2. Let be the incidence structure obtained from the line graph of the complete graph on five vertices. This gives the -FR code with incidence matrix

as discussed in [10]. This is a matrix with constant row sum and constant column sum . The blocks in this FR code are

Figure 3: The relationship between and .

For , we can compute that the supported file size of the complete graph based FR code is

and the values of for are

Moreover, the supported file size hierarchy of is , i.e.,

Fig. 3 illustrates the relationship between and . We can obtain the two supported file size functions if we view the stair-case graph from two different perspectives, which are distinguished with different colors.

III-B An Improved Dual Bound

In [10], the authors showed that the supported file size  of an -FR code is upper bounded by

(14)

where is defined recursively by

Note that Theorem 4 provides a link between an FR code and its dual. Using the mechanism in the previous subsection, we can obtain an improved upper bound if we take the upper bound in (14) into consideration.
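A recursive bound of the kind referred to in (14) can be sketched as follows. The recursion used here is an assumption, written from the usual pigeonhole argument rather than copied from the text: if $j$ nodes hold $g$ distinct packets, then the remaining $n - j$ nodes hold $\rho g - j\alpha$ copies of those packets, so some remaining node shares at least $\lceil(\rho g - j\alpha)/(n-j)\rceil$ of them and adds at most $\alpha$ minus that many new packets. The symbols $n$, $\alpha$, $\rho$ denote the number of nodes, the node capacity, and the repetition degree, and the function name is hypothetical.

```python
def recursive_bound(n, alpha, rho, k):
    """Upper bound on the file size for reconstruction degree k (1 <= k <= n).

    Assumed recursion: g(1) = alpha and
    g(j + 1) = g(j) + alpha - ceil((rho * g(j) - j * alpha) / (n - j)).
    """
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    g = alpha
    for j in range(1, k):
        surplus = rho * g - j * alpha          # copies held outside the j chosen nodes
        shared = -((-surplus) // (n - j))      # ceiling division
        g = g + alpha - shared
    return g

# Parameters of the complete-graph code on five vertices: n = 5, alpha = 4, rho = 2.
bounds = [recursive_bound(5, 4, 2, k) for k in range(1, 6)]
print(bounds)  # [4, 7, 9, 10, 10]
```

For these parameters the bound coincides with the actual file size hierarchy of the complete-graph code, so the bound is tight in this small case.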

Theorem 5.

Given an FR code with parameters , we define the function recursively by

for . Then, for all , we have

(15)

Proof:

The function is the counterpart of the recursive bound (14) on the dual code. Thus,

(16)

Since

(17)

for all , in view of the remark after Theorem 4, we have

(18)

which completes the proof. ∎

We refer to the inequality in (15) as the dual bound on the supported file size.

Example 3. Consider an FR code with parameters . The bound in (14) suggests that the supported file size with reconstruction degree  is upper bounded by .

Moreover, the recursive bound applied to the dual code yields with

Then, the dual bound in (15) gives

This bound can be achieved by the -FR code listed in the database [21] with the following incidence matrix:

We observe that the four storage nodes associated to rows and contain precisely distinct packets. Thus, this FR code can support a file size of with , implying that it is optimal by the dual bound.

Table I: Comparison Between the Recursive Bound in [10] and the Dual Bound (columns: Code Parameter, Recursive Bound, Dual Bound)

Table I shows the comparison between the recursive bound in [10] and our dual bound for some parameter ranges.

IV Optimal FR Codes based on $t$-Designs

Another upper bound on the supported file size of an -FR code is derived in [10] as

(19)

From the dual perspective, we show that the bound in (19) is essentially the same as the following bound on the reconstruction degree , which was first obtained in [12].

Lemma 6.

([12, Lemma 32]) If we store a data file of size by using an -FR code , then the reconstruction degree is lower bounded by

(20)

Proof:

By applying the bound in (19) to the dual code of , we obtain

(21)

for . (We can remove the floor operator without loss of generality.) Hence,

(22)

Given an integer between and , we let be the integer that satisfies

By Theorem 4, we obtain

(23)

The proof of this lemma is completed by taking the ceiling of both sides. ∎

In what follows, we consider FR codes derived from $t$-designs. Recall that in a - design , each point of is contained in the same number of blocks. Therefore, we can obtain an FR code with repetition degree by taking .

We state the main result in the following theorem.

Theorem 7.

Let be a - design, and let be the FR code based on . Then, the supported file size is optimal for in the range , and is given by

(24)

Proof:

Let be an arbitrary -sized subset of , where . Based on Lemma 1, we obtain that the number of blocks in that are disjoint from is . Hence, for the constructed FR code , we have

(25)

which in conjunction with Theorem 4 gives the file size of .

Let be an integer such that . By substituting into (20), we obtain

(26)
(27)

Therefore, attains the lower bound in Lemma 6 for . ∎
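The argument in the proof can be illustrated on a small design. The sketch below assumes the standard counting formula for the number of blocks avoiding $s \le t$ given points, $\lambda\binom{v-s}{\kappa}/\binom{v-t}{\kappa-t}$, and derives the file size only for reconstruction degrees larger than the number of blocks avoiding $t$ points, where the formula determines the answer. The symbols $v$, $\kappa$, $\lambda$, the function names, the stated range, and the Fano plane test case are assumptions for illustration and may not coincide with the exact range in Theorem 7.

```python
from itertools import combinations
from math import comb

def file_size(blocks, k):
    """Brute-force supported file size for reconstruction degree k."""
    return min(len(frozenset().union(*c)) for c in combinations(blocks, k))

def blocks_avoiding(t, v, block_size, lam, s):
    """Blocks of a t-(v, block_size, lam) design disjoint from s <= t given points."""
    return lam * comb(v - s, block_size) // comb(v - t, block_size - t)

def design_file_size(t, v, block_size, lam, k):
    """File size of the design-based FR code for k above the count for s = t."""
    counts = [blocks_avoiding(t, v, block_size, lam, s) for s in range(t + 1)]
    if not counts[t] < k <= counts[0]:
        raise ValueError("k is outside the range covered by this sketch")
    # Largest number of points that k blocks can still miss simultaneously.
    missed = max(s for s in range(t) if counts[s] >= k)
    return v - missed

# Fano plane: a 2-(7, 3, 1) design with seven blocks.
FANO = [frozenset(b) for b in
        [{1, 2, 3}, {1, 4, 5}, {1, 6, 7}, {2, 4, 6}, {2, 5, 7}, {3, 4, 7}, {3, 5, 6}]]

predicted = [design_file_size(2, 7, 3, 1, k) for k in range(3, 8)]
measured = [file_size(FANO, k) for k in range(3, 8)]
print(predicted, measured)
```

For the Fano plane, four blocks avoid any single point and two blocks avoid any pair, so any three or four nodes guarantee six packets and any five or more nodes guarantee all seven.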

Remark 2. For the given file size , the authors proved in [12] that FR codes based on regular graphs can attain the bound in (20). In this paper, we show that FR codes constructed from $t$-designs require the smallest possible reconstruction degree for those file sizes ranging from to .

V Tensor Product of FR Codes

Let be an -FR code and an -FR code, satisfying the condition that

(28)

Denote the blocks in and by , and , respectively.

We define the tensor product of and , denoted by , as the FR code with points and blocks. The points are the pairs in , and the blocks are given by

Notice that the sizes of and are and , respectively, and they are equal by the hypothesis in (28). Moreover, we observe that each point in appears in exactly blocks. Therefore, the tensor product of and is an FR code with parameters .
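One way to read the tensor product construction is sketched below: each block of the first code is paired with every point of the second code, and every point of the first code is paired with each block of the second code. This reading is inferred from the grid-code example and from the block-size condition, so the exact form of the blocks, along with the function names, should be treated as an assumption.

```python
from itertools import product

def tensor_product(blocks1, points1, blocks2, points2):
    """Blocks of the combined code over the point set points1 x points2."""
    lifted1 = [frozenset(product(b, points2)) for b in blocks1]   # B x P2
    lifted2 = [frozenset(product(points1, b)) for b in blocks2]   # P1 x B'
    combined = lifted1 + lifted2
    if len({len(b) for b in combined}) != 1:
        raise ValueError("block sizes differ; the size condition on the codes fails")
    return combined

def trivial_code(n):
    """Each of the n nodes stores one unique packet."""
    return [frozenset({i}) for i in range(n)], list(range(n))

blocks, points = trivial_code(3)
grid = tensor_product(blocks, points, blocks, points)
for b in grid:
    print(sorted(b))
```

With two copies of the trivial code on three nodes, the result is a grid code: nine packets indexed by pairs, six nodes corresponding to the three rows and three columns, and every packet stored on exactly two nodes.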

Example 4. Let be the trivial -FR code in which each node stores a unique code symbol, i.e., and . Then the tensor product forms a -FR code. Specifically, the points are the pairs for , and the blocks are

This is the same as the grid code considered in [16].

Example 5. Let be the trivial -FR code as in the previous example. We can take the tensor product and obtain a -FR code. We call this the triple tensor product of . The points are the triples for . The blocks are

and each block contains points.

We shall list some simple properties about the tensor product of FR codes.

Lemma 8.

For , let be an -FR code, such that .

  1. and are isomorphic FR codes.

  2. .

Moreover, the file size hierarchy of can be computed by the following theorem.

Theorem 9.

Let be an -FR code, for , such that . Let and be defined as in (10). We have

(29)

for .

Proof:

The incidence matrix of is an binary matrix. Without loss of generality, we assume that the first rows correspond to the blocks generated by the blocks of and the other rows correspond to the blocks obtained from the blocks of . Now suppose that we choose blocks of , among which blocks are taken from the first rows and blocks are taken from the last rows.

We first consider the submatrix corresponding to the blocks. Based on the tensor product method, we have that the maximum integer such that there exists an all-zero submatrix in the matrix is , i.e., . By jointly considering the rows from the last rows, we obtain that the maximum integer such that there exists a all-zero submatrix in the matrix is , which completes the proof. ∎
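The counting idea in the proof can be verified on a small example. The sketch assumes the tensor product takes blocks of the form (block of the first code) x (all points of the second) and (all points of the first) x (block of the second). If $a$ blocks of the first kind and $k-a$ of the second kind are chosen, the packets they miss form a product set, so the file size for degree $k$ is the total number of points minus the largest product of missed-point counts over all splits. All names below are hypothetical, and the formula is a restatement under this assumed construction rather than a transcription of (29).

```python
from itertools import combinations, product

def file_size(blocks, k):
    """Brute-force supported file size; returns 0 for k = 0."""
    return min(len(frozenset().union(*c)) for c in combinations(blocks, k))

def hierarchy(blocks):
    return [file_size(blocks, k) for k in range(len(blocks) + 1)]

def tensor_product(blocks1, points1, blocks2, points2):
    return ([frozenset(product(b, points2)) for b in blocks1] +
            [frozenset(product(points1, b)) for b in blocks2])

def tensor_hierarchy(h1, theta1, h2, theta2):
    """Hierarchy of the combined code from the component hierarchies h1 and h2."""
    n1, n2 = len(h1) - 1, len(h2) - 1
    result = [0]
    for k in range(1, n1 + n2 + 1):
        missed = max((theta1 - h1[a]) * (theta2 - h2[k - a])
                     for a in range(max(0, k - n2), min(n1, k) + 1))
        result.append(theta1 * theta2 - missed)
    return result

# Component codes: complete graph on four vertices, and a trivial two-node code.
edges = list(combinations(range(4), 2))
k4_blocks = [frozenset(e for e in edges if v in e) for v in range(4)]
trivial_points = [0, 1]
trivial_blocks = [frozenset({0}), frozenset({1})]

combined = tensor_product(k4_blocks, edges, trivial_blocks, trivial_points)
predicted = tensor_hierarchy(hierarchy(k4_blocks), len(edges),
                             hierarchy(trivial_blocks), len(trivial_points))
measured = hierarchy(combined)
print(predicted, measured)
```

The two component codes are chosen so that all blocks of the combined code have six points, which is the size condition required for the tensor product to be an FR code.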

Figure 4: A -GFR code. Each entry corresponds to a distinct storage node, and the numbers in an entry correspond to the coded packets stored in the storage node.

Corollary 10.

Let and be positive integers. For , let be an -FR code, such that is equal to a constant for all . Let be the FR code obtained from by repeating each of the blocks in -fold. Then is an FR code with parameters

and the file size hierarchy can be determined by

(30)

for .

Example 6. Let and be integers larger than or equal to 2. Let denote the trivial -FR code with the identity matrix as the incidence matrix. For positive integers , consider the FR code

and denote it by a -GFR code. The resulting FR code has parameters

Fig. 4 shows how to distribute coded packets across storage nodes by a -GFR code. Since the file size hierarchy of is simply given by

we can apply Theorem 4 and Corollary 10 and obtain the file size hierarchy of the -GFR code as