Algebraic Codes and a New Physical Layer Transmission Protocol for Wireless Distributed Storage Systems
In a wireless storage system, having to communicate over a fading channel makes repair transmissions prone to physical layer errors. The first approach to combat fading is to utilize the existing optimal space-time codes. However, it was recently pointed out that such codes are in general too complex to decode when the number of helper nodes is larger than the number of antennas at the newcomer or data collector. In this paper, a novel protocol for wireless storage transmissions based on algebraic space-time codes is presented in order to improve the system reliability while enabling feasible decoding. The diversity-multiplexing gain tradeoff (DMT) of the system together with sphere-decodability even with a low number of antennas are used as the main design criteria, thus naturally establishing a DMT–complexity tradeoff. It is shown that the proposed protocol outperforms the simple time-division multiple access (TDMA) protocol, while still falling behind the optimal DMT.
Productive and active research in the area of data storage has taken place across several disciplines during the past few years. Recently, attention has been paid to distributed storage systems (DSSs), where information is no longer stored in a single device, but rather distributed among several storage nodes in a network, which is potentially wireless. One of the main advantages of storing information in a distributed manner is that the storage system can be made robust against failures by introducing some level of redundancy. This way, even if nodes fail, e.g., go offline, the original file can still be recovered from the surviving storage nodes. In contrast, if a file is stored in a single device and the device breaks, the file is usually lost. Distributed storage also enables the usage of cheap and small disks on small devices instead of having to constantly increase the size of the storage space on an individual device every time the amount of information to be stored increases. The construction and analysis of a robust distributed storage system is highly nontrivial, and requires a detailed mathematical description. There are several aspects of such a system that need to be considered.
The first aspect is that of the storage code. To profit from storing data in a distributed manner, a certain amount of redundancy needs to be introduced to the system in order to make it more robust. The most straightforward scenario is replication. This way, if a node is lost, the original information can be recovered by contacting only one of the surviving nodes, provided that all the nodes store a replica. However, this protocol requires transferring the entire file for the repair of the lost node, and is also not efficient in terms of storage space, as every storage node has to store the whole file.
More sophisticated protocols have been developed, always giving a tradeoff between the amount of data that needs to be stored in any of the storage nodes, and the amount of data that needs to be retrieved for the repair of a lost node. For more information, see e.g. [1, 2, 3, 4]. Some examples of real-life distributed storage systems are Apache Cassandra developed at Facebook, and Windows Azure created by Microsoft.
A second aspect that needs consideration is communication over the existing wireless channels, as it should be possible to store or retrieve a file using a wireless connection [5]. The mobility of a user has become crucial in everyday life, and we use wireless channels for data transmission instead of wired ones for increased flexibility. Unfortunately, the transmission of data across wireless channels is risky in terms of transmission errors, as waves traveling through the channel typically suffer from several effects imposed by nature.
From the mathematical point of view, this process of transmission can be significantly improved by using suitable coding techniques, a prominent one being space-time (ST) coding. Space-time codes are used when transmitting information over a wireless multiple-input multiple-output (MIMO) channel between terminals equipped with multiple antennas, and they provide a certain amount of redundancy by using several time slots for the transmission of the same information. A space-time code is a finite subset of complex $n_t \times T$ matrices, where $n_t$ is the number of transmit antennas and $T$ the number of channel uses. The main design criteria for ST code construction are 1) the rank criterion, 2) the determinant criterion, and 3) the diversity-multiplexing gain tradeoff (DMT) [6, 7, 8]. For a good survey on algebraic space-time codes we refer to [9, 10]. As a recent example, Apple’s new iPad Air, released November 2013, is equipped with two antennas instead of one and profits from the advantages of MIMO coding techniques, increasing the speed of data transmission as well as the reliability of its wireless communications.
I-A Contributions and related work
In most of the storage-related research the focus is on the (logical) network layer, while the physical layer is usually ignored. Nonetheless, some interesting works considering the physical layer do exist. In [11], a so-called partial downloading scheme is proposed that allows for data reconstruction with limited bandwidth by downloading only parts of the contents of the helper nodes. In [12], the use of a forward error correction code (e.g., LDPC code) is proposed in order to correct bit errors caused by fading. In [13], optimal storage codes are constructed for the error and erasure scenario.
Recently, space–time storage codes were introduced in [14] as a class of codes that should be able to resist fading of the signals during repair transmissions, while also maintaining the repair property of the underlying storage code. It was pointed out that the obvious way of constructing such codes, namely combining an optimal storage code and an optimal space-time code, results in infeasible decoding complexity when the number of helper nodes is larger than the number of antennas at the newcomer or data collector (DC). Motivated by this work, we tackle the complexity issue and design a novel yet simple protocol that has feasible decoding complexity for any number of helpers, while the helpers and the newcomer/DC are only required to have at most two antennas each.
Thus, the present paper deviates from earlier work on DSSs in that it addresses the actual encoding of the transmitted data in order to fight the effects caused by fading, continuing along the lines of [14]. In addition to encoding, a sphere-decodable transmission protocol for the encoded data will be studied. To make our system as realistic as possible, the protocol requires only up to two antennas at each end while still being sphere-decodable. This is in contrast to the previous optimal space-time codes [15] that would be in principle suitable for storage transmissions, but would have exponential complexity when the number of helpers is larger than the number of receive antennas. Furthermore, it is shown that the designed system achieves a significantly higher DMT than the TDMA protocol. The observed gap to the optimal DMT is due to the above complexity requirement, which establishes a natural DMT-complexity tradeoff.
I-B Distributed storage systems
In a distributed storage system, the data is stored over $n$ storage nodes by using an erasure code. If a maximum distance separable (MDS) code is used to introduce redundancy in the system, then a data collector can reconstruct the whole file by contacting any $k$ out of $n$ nodes. In addition to storing the file, the system has to be repaired by replacing a node with a new one whenever some node fails. This can be done by using, e.g., regenerating codes [1]. If the newcomer node replacing the failed node has to contact $d$ helper nodes in order to restore the contents of the lost node, we call the code an $(n, k, d)$ code, and $d$ is referred to as the repair degree. Recent work [1, 16] considers tradeoffs between the storage capacity, secrecy capacity, and repair bandwidth. Explicit storage code constructions achieving some of the tradeoffs can be found in [4, 2, 3], among many others. Regenerating codes by definition achieve the storage capacity–repair bandwidth tradeoff.
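To make the reconstruction and repair properties concrete, the following minimal sketch uses a toy single-parity code with three nodes, where any two nodes suffice both to reconstruct the file and to repair a lost node. This toy code and the function names `encode`, `reconstruct`, and `repair` are illustrative assumptions and are not the storage codes of [1, 2, 3, 4].

```python
from itertools import combinations


def encode(a: int, b: int) -> list[int]:
    """Toy single-parity MDS code: three nodes store a, b and a XOR b."""
    return [a, b, a ^ b]


def reconstruct(available: dict[int, int]) -> tuple[int, int]:
    """Recover the file (a, b) from the contents of any two nodes (indexed 0, 1, 2)."""
    if 0 in available and 1 in available:
        return available[0], available[1]
    if 0 in available:
        # We hold a and a XOR b, so b = a XOR (a XOR b).
        return available[0], available[0] ^ available[2]
    # We hold b and a XOR b, so a = b XOR (a XOR b).
    return available[1] ^ available[2], available[1]


def repair(helpers: dict[int, int]) -> int:
    """Rebuild the content of the failed node from the two surviving helper nodes.

    For this toy code, every node's content is the XOR of the other two.
    """
    x, y = helpers.values()
    return x ^ y


if __name__ == "__main__":
    nodes = encode(0b1011, 0b0110)
    for pair in combinations(range(3), 2):
        print(pair, reconstruct({i: nodes[i] for i in pair}))
```

In this toy case the repair of one fragment requires downloading two fragments, which illustrates why the amount of data transferred during repair is a design parameter of its own.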
II Storage Communication over Parallel Fading Channels
In this section, we review from [14] the basic idea of space-time coded storage transmissions. If a node fails in the system or a data collector (DC) wants to reconstruct the stored file, then the newcomer/DC node [Footnote 1: We denote both by for simplicity.] has to contact nodes for repair/reconstruction. We denote by the nodes contacted by . Each would like to send its contents to over a Rayleigh fading channel. In order to match the alphabets used for the stored information on one hand, and for the wireless transmission on the other hand, a bijective lift function is introduced to the system. Formally,
where is the set of possible encoded file fragments [Footnote 2: Encoded by an MDS or other erasure code.], and is a finite constellation of size equal to the size of . We define . To give some intuition to the problem at hand, we start by assuming that the channels used for communication are parallel to each other, e.g., via orthogonal frequency division multiplexing (OFDM) coding, and that has receive antennas. The newcomer receives the system of equations
where and are the random complex Gaussian variables describing the channel effect and noise, respectively. The newcomer then decodes these equations to produce ML-estimates of the symbols transmitted over the Rayleigh fading channel with fading components . We then compute for all , which are the encoded file fragments actually received by the incoming node . To complete the repair, the node performs the reconstruction algorithm required by the DSS on these file fragments. This setting can be formalized and generalized as follows:
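The chain of lifting, transmission over parallel fading channels, ML decoding, and inverse lifting can be sketched as follows. This is only an illustration under stated assumptions: fragments are taken to be 2-bit values, the constellation is 4-QAM, each helper uses its own scalar channel, and the receiver is assumed to know the fading coefficients. The names `lift`, `transmit`, `ml_decode`, and `repair_download` are hypothetical.

```python
import random

# Assumed toy lift: a bijection from 2-bit file fragments onto a 4-QAM constellation.
CONSTELLATION = {0b00: 1 + 1j, 0b01: -1 + 1j, 0b11: -1 - 1j, 0b10: 1 - 1j}
INVERSE_LIFT = {symbol: fragment for fragment, symbol in CONSTELLATION.items()}


def lift(fragment: int) -> complex:
    """Map an encoded file fragment to its constellation point."""
    return CONSTELLATION[fragment]


def complex_gaussian(rng: random.Random, std: float) -> complex:
    """Complex Gaussian sample with standard deviation std per real component."""
    return complex(rng.gauss(0.0, std), rng.gauss(0.0, std))


def transmit(fragments, noise_std, rng):
    """Send each helper's lifted fragment over its own Rayleigh fading channel."""
    received = []
    for fragment in fragments:
        h = complex_gaussian(rng, 2 ** -0.5)  # fading coefficient with E|h|^2 = 1
        noise = complex_gaussian(rng, noise_std)
        received.append((h, h * lift(fragment) + noise))
    return received


def ml_decode(h: complex, y: complex) -> complex:
    """ML estimate: the constellation point s minimizing |y - h s| (h known)."""
    return min(INVERSE_LIFT, key=lambda s: abs(y - h * s))


def repair_download(fragments, noise_std, seed=0):
    """Return the fragments as recovered by the newcomer after inverse lifting."""
    rng = random.Random(seed)
    return [INVERSE_LIFT[ml_decode(h, y)] for h, y in transmit(fragments, noise_std, rng)]


if __name__ == "__main__":
    sent = [0b00, 0b01, 0b11, 0b10]
    print(sent, repair_download(sent, noise_std=0.1))
```

The recovered fragments would then be passed to the reconstruction algorithm of the storage code; with noise present, a deep fade on any one channel can corrupt the corresponding fragment, which is the motivation for the coded schemes that follow.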
Definition II.1 ([14])
A space-time storage code consists of the following:
a DSS system with parameters defined as above, employing an MDS code or some other type of erasure code,
a constellation carved from or ,
a bijective lift function
where is the set of possible encoded file fragments, and
a space-time transmission protocol using as information symbols.
The storage transmission protocol proposed in this paper makes use of algebraic space-time codes, and hence our constellation will be carved from an algebraic lattice , and a lift function of the form
will be used. However, the above definition is more general and does not restrict the structure of the code. We refer to this special case as algebraic space-time storage code.
III Problem Formulation and Design Criterion for Nonparallel Channels
It was pointed out in [14] that the repair/reconstruction transmissions can be modeled as a multiple access channel (MAC), where multiple users are communicating simultaneously to a joint destination. Consider now a wireless distributed storage system with repairing [Footnote 3: For simplicity, we will exclusively talk about repair (), the file reconstruction process is analogous ().] nodes, each node equipped with transmit antennas, and receive antennas at the newcomer node. Let be the (Rayleigh distributed) channel matrix of size and be the code matrix transmitted by the th repairing node [Footnote 4: The authors are aware of the slight abuse of notation here, but hope that there is no danger of confusion as we will be using to indicate the number of nodes participating in the transmission.], where is the number of channel uses for transmission. The received signal matrix at the incoming node is given by
where is additive white Gaussian noise. Suppose further that after some lifting operation of the node contents, each resulting code matrix is taken from a rank- algebraic lattice space-time code and is in the form of
where ; then we can rewrite (1) as
or equivalently in vector form
where is the vectorization of matrix , , and is the corresponding matrix of size for (2). In order to have an efficient sphere-decoding algorithm for (3), we need such that after the QR-decomposition of , the matrix is an upper triangular matrix. This gives the criterion that the maximal number of independent QAM symbols to be sent by each repairing node is upper bounded by
in each channel use.
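One way to read this criterion is as a counting argument: the QR-decomposition produces a square, full-rank upper triangular factor only if the matrix being decomposed has at least as many rows (received equations) as columns (unknown symbols). The sketch below encodes that count. The symbol names `n_r` (receive antennas), `T` (channel uses), `K` (transmitting nodes), and `m` (independent symbols per node) are assumptions introduced for illustration, as is the resulting per-node bound of `n_r / K` symbols per channel use.

```python
from fractions import Fraction


def is_sphere_decodable(n_r: int, T: int, K: int, m: int) -> bool:
    """Counting check: n_r * T received equations must cover K * m unknown symbols."""
    return n_r * T >= K * m


def max_symbols_per_node_per_use(n_r: int, K: int) -> Fraction:
    """Largest m / T that satisfies n_r * T >= K * m, i.e. n_r / K."""
    return Fraction(n_r, K)


if __name__ == "__main__":
    # Two transmitting nodes and two receive antennas: one symbol per node per use.
    print(max_symbols_per_node_per_use(n_r=2, K=2))
    # Four simultaneous nodes with the same receiver exceed the budget.
    print(is_sphere_decodable(n_r=2, T=3, K=4, m=3))
```

Under this reading, limiting the number of simultaneously transmitting nodes is the natural way to keep the count satisfied when the receiver has only one or two antennas.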
The DMT-optimal MIMO multiple access channel (MAC) code [15] has parameters and , where is the smallest odd integer . Thus by (4), such a code cannot be efficiently sphere-decoded unless there are receive antennas at the replacing node.
The design goal in this section is to provide a transmission and coding scheme for the wireless DSS which is efficiently sphere-decodable. The performance of each scheme will be measured by the notion of diversity gain-multiplexing gain tradeoff (DMT). In particular, we say each repairing node transmits at multiplexing gain $r$ if the actual transmission rate is $R = r \log_2 \mathrm{SNR}$ bits per channel use, where SNR is the signal-to-noise power ratio. Given a fixed value of multiplexing gain $r$, we say a scheme achieves diversity gain $d(r)$ if its outage probability $P_{\mathrm{out}}(r)$, which is a lower bound on the probability of decoding error, satisfies
$$\lim_{\mathrm{SNR} \to \infty} \frac{\log P_{\mathrm{out}}(r)}{\log \mathrm{SNR}} = -d(r),$$
and we will write the above as $P_{\mathrm{out}}(r) \doteq \mathrm{SNR}^{-d(r)}$.
We will compare the proposed schemes with the DMT-optimal MIMO-MAC codes presented in [15], which achieve the following optimal DMT
where is the optimal DMT for point-to-point MIMO channel and is given by the piecewise linear function connecting the points for . Notice, however, that the codes in [15] require a high number of receive antennas to achieve the optimal DMT.
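The piecewise linear point-to-point tradeoff can be evaluated with a few lines of code. The sketch below assumes the standard result of [7], namely that the curve connects the points $(k, (n_t - k)(n_r - k))$ for $k = 0, 1, \ldots, \min\{n_t, n_r\}$; the function name `ptp_dmt` and the argument names are chosen here for illustration.

```python
def ptp_dmt(r: float, n_t: int, n_r: int) -> float:
    """Optimal point-to-point DMT at multiplexing gain r.

    Linear interpolation between the corner points (k, (n_t - k) * (n_r - k)),
    k = 0, ..., min(n_t, n_r). Beyond min(n_t, n_r) the diversity gain is zero.
    """
    if r < 0:
        raise ValueError("multiplexing gain must be non-negative")
    k_max = min(n_t, n_r)
    if r > k_max:
        return 0.0
    k = min(int(r), k_max - 1)  # index of the segment [k, k + 1] containing r
    d_left = (n_t - k) * (n_r - k)
    d_right = (n_t - k - 1) * (n_r - k - 1)
    return d_left + (r - k) * (d_right - d_left)


if __name__ == "__main__":
    for r in (0.0, 0.5, 1.0, 1.5, 2.0):
        print(r, ptp_dmt(r, n_t=2, n_r=2))
```

For a 2x2 channel the corner points are (0, 4), (1, 1), and (2, 0), so the maximal diversity gain 4 is only available at vanishing multiplexing gain.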
IV The Proposed Scheme: , , and users
Let denote the set of users, and let denote the set of all -subsets of , i.e.,
With the above, the proposed scheme is the following. Let be a random variable uniformly distributed over . Given , only replacing nodes and transmit during the period of . Note that for any . This means that in order to achieve an average multiplexing gain , each replacing node , when is chosen according to , i.e. , should actually transmit at multiplexing gain . Specifically, we have the following scheme:
Randomly pick from the ensemble .
The repairing nodes and transmit using the DMT-optimal MIMO-MAC code given in [15, Eq. (20)] for , two users, and multiplexing gain .
It should be noted that at each time instant, only two repairing nodes transmit to the replacing node; it therefore follows from (4) that the scheme is efficiently sphere-decodable. An explicit example will be provided in Section V.
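The scheduling step can be sketched as follows. With `K` denoting the number of repairing nodes (a symbol assumed here), each node belongs to `K - 1` of the `K (K - 1) / 2` possible pairs, so under a uniform choice it is active a fraction `2 / K` of the time and has to transmit at `K / 2` times its target average multiplexing gain while active. The function names are hypothetical.

```python
import random
from fractions import Fraction
from itertools import combinations


def pair_ensemble(K: int) -> list[tuple[int, int]]:
    """All 2-subsets of the user set {1, ..., K}."""
    return list(combinations(range(1, K + 1), 2))


def participation_probability(K: int, node: int) -> Fraction:
    """Probability that a given node belongs to a uniformly chosen pair."""
    pairs = pair_ensemble(K)
    return Fraction(sum(node in pair for pair in pairs), len(pairs))


def required_multiplexing_gain(K: int, r: Fraction) -> Fraction:
    """Gain a node must use while active so that its average gain equals r."""
    return r / participation_probability(K, 1)


def pick_pair(K: int, rng: random.Random) -> tuple[int, int]:
    """Step 1 of the scheme: draw the transmitting pair uniformly at random."""
    return rng.choice(pair_ensemble(K))


if __name__ == "__main__":
    K = 4
    print(participation_probability(K, 1))
    print(required_multiplexing_gain(K, Fraction(1, 4)))
    print(pick_pair(K, random.Random(0)))
```

The factor `K / 2` is half of the factor `K` that a round-robin schedule with a single active node would incur, which is where the improvement over TDMA comes from.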
The DMT performance achieved by this scheme is the following
In Fig. 1 we compare the DMT performance of this scheme to the DMT-optimal MIMO-MAC code, which has DMT given in (5), and the DMT of the time division multiple access (TDMA) based scheme. By this we mean that each repairing node takes turns in an orthogonal manner to transmit information to the replacing node at multiplexing gain . It can be seen that the first proposed scheme is much better than the TDMA-based one. The gap to the optimal MAC-DMT is still large, but the optimal codes would require exponential decoding complexity (since ). To decode the optimal MAC space-time codes by a sphere-decoder would require receive antennas, which is highly unrealistic, in particular for device-to-device (d2d) or peer-to-peer (p2p) networks consisting of, e.g., mobile phones and laptops having only one or two antennas.
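The qualitative comparison can be reproduced numerically under explicit assumptions. The sketch below assumes one transmit antenna per node, two receive antennas, and `K` repairing nodes; it uses the symmetric MAC tradeoff of [8] (the minimum over `k` of the point-to-point DMT of a `k n_t x n_r` channel at gain `k r`), models TDMA as a single active node transmitting at gain `K r`, and models the pairwise scheme as a two-user MAC operated at gain `K r / 2`. These modelling choices and all function names are assumptions made here, not expressions copied from the paper.

```python
def ptp_dmt(r: float, n_t: int, n_r: int) -> float:
    """Piecewise linear point-to-point DMT through (k, (n_t - k) * (n_r - k))."""
    k_max = min(n_t, n_r)
    if r > k_max:
        return 0.0
    k = min(int(r), k_max - 1)
    d_left = (n_t - k) * (n_r - k)
    d_right = (n_t - k - 1) * (n_r - k - 1)
    return d_left + (r - k) * (d_right - d_left)


def mac_dmt(r: float, n_t: int, n_r: int, users: int) -> float:
    """Symmetric optimal MAC DMT: min over k = 1..users of d*_{k n_t, n_r}(k r)."""
    return min(ptp_dmt(k * r, k * n_t, n_r) for k in range(1, users + 1))


def tdma_dmt(r: float, n_t: int, n_r: int, K: int) -> float:
    """One node at a time, each active 1/K of the time at gain K r."""
    return ptp_dmt(K * r, n_t, n_r)


def pairwise_dmt(r: float, n_t: int, n_r: int, K: int) -> float:
    """Two nodes at a time, each active 2/K of the time at gain K r / 2."""
    return mac_dmt(K * r / 2, n_t, n_r, users=2)


if __name__ == "__main__":
    K, n_t, n_r = 4, 1, 2
    for r in (0.0, 0.125, 0.25, 0.375, 0.5):
        print(r, tdma_dmt(r, n_t, n_r, K), pairwise_dmt(r, n_t, n_r, K),
              mac_dmt(r, n_t, n_r, K))
```

With these assumptions the pairwise scheme lies between the TDMA curve and the optimal MAC curve at every multiplexing gain, consistent with the ordering described above.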
V Explicit construction
Let us now construct an explicit algebraic space-time storage code. We consider the case of nodes, out of which two nodes at a given time are transmitting, both having one transmit antenna, . We define the field extension of degree 3, where and , and with integral basis so the ring of integers . The Galois group of is generated by .
Assume the nodes contain bit strings of length . We use as a lift function a restriction of a composite map consisting of the Gray map which maps -bit strings to the set of QAM symbols, and the map , where and , mapping QAM-vectors to vectors in an algebraic lattice.
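The first component of the lift, the Gray map from bit strings to QAM symbols, can be sketched as follows. The specific labelling (binary-reflected Gray code per axis, first half of the bits on the real axis, odd-integer levels) is an assumption made for illustration; the subsequent embedding of QAM vectors into the algebraic lattice is not shown.

```python
def gray_code(i: int) -> int:
    """Binary-reflected Gray code of the integer i."""
    return i ^ (i >> 1)


def gray_pam(bits_per_axis: int) -> dict[int, int]:
    """Map each Gray label to an odd-integer PAM level, e.g. -3, -1, 1, 3."""
    levels = 2 ** bits_per_axis
    return {gray_code(i): 2 * i - (levels - 1) for i in range(levels)}


def gray_qam(bits: str) -> complex:
    """Map an even-length bit string to a square QAM symbol.

    The first half of the bits selects the real part and the second half
    the imaginary part, each through a Gray-labelled PAM constellation.
    """
    if len(bits) == 0 or len(bits) % 2:
        raise ValueError("need a non-empty bit string of even length")
    half = len(bits) // 2
    pam = gray_pam(half)
    return complex(pam[int(bits[:half], 2)], pam[int(bits[half:], 2)])


if __name__ == "__main__":
    for word in ("00", "01", "11", "10", "0111"):
        print(word, gray_qam(word))
```

The point of the Gray labelling is that neighbouring constellation points differ in exactly one bit, so the most likely symbol errors corrupt only a single bit of the stored fragment.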
The transmission matrix for each pair of two helper nodes is now
VI Conclusions and future work
In this paper, we considered space-time storage codes and a related transmission protocol which is able to maintain and repair data that lies on storage systems operating over a wireless fading channel. Here, the focus was on embedding a storage code by a suitable lifting procedure into a multiple access channel space-time code, while maintaining sphere-decodability. This approach was motivated by the fact that optimal MAC space-time codes are not sphere-decodable but have exponential decoding complexity when . The proposed protocol improves upon TDMA, but still falls behind the optimal MAC-DMT. Hence, future work should concentrate on improved protocols that achieve a better DMT while retaining tolerable decoding complexity. Also, simulations should be carried out to verify the performance of the proposed protocol compared to the more straightforward combination of (optimal) regenerating codes and optimal MAC space-time codes.
[1] A. Dimakis, P. Godfrey, Y. Wu, M. Wainwright, and K. Ramchandran, “Network coding for distributed storage systems”, IEEE Trans. Inf. Theory, vol. 56, no. 9, pp. 4539–4551, Sep. 2010.
[2] A. G. Dimakis, K. Ramchandran, Y. Wu, and C. Suh, “A survey on network codes for distributed storage”, Proc. of the IEEE, vol. 99, no. 3, March 2011, arXiv:1004.4438, 2010.
[3] K. V. Rashmi, N. B. Shah, and P. V. Kumar, “Optimal Exact-Regenerating Codes for Distributed Storage at the MSR and MBR Points via a Product-Matrix Construction”, IEEE Trans. Inf. Theory, vol. 57, no. 8, August 2011.
[4] S. El Rouayheb and K. Ramchandran, “Fractional repetition codes for repair in distributed storage systems”, in Proc. 48th Annual Allerton, Monticello, IL, 2010.
[5] CEET, “The power of wireless cloud: An analysis of the energy consumption of wireless cloud”, CEET Centre for Energy-Efficient Telecommunications, Bell Labs and University of Melbourne, April 2013.
[6] V. Tarokh, N. Seshadri, and A. R. Calderbank, “Space-time codes for high data rate wireless communication: performance criterion and code construction”, IEEE Trans. Inf. Theory, vol. 44, no. 2, March 1998.
[7] L. Zheng and D. Tse, “Diversity and multiplexing: a fundamental tradeoff in multiple antenna channels”, IEEE Trans. Inf. Theory, vol. 49, no. 5, pp. 1073–1096, 2003.
[8] D. Tse, P. Viswanath, and L. Zheng, “Diversity–Multiplexing Tradeoff in Multiple-Access Channels”, IEEE Trans. Inf. Theory, vol. 50, no. 9, pp. 1859–1874, Sep. 2004.
[9] F. Oggier and E. Viterbo, “Algebraic number theory and code design for Rayleigh fading channels”, Commun. Inf. Theory, vol. 1, no. 3, pp. 333–416, 2004.
[10] F. Oggier, E. Viterbo, and J.-C. Belfiore, “Cyclic Division Algebras: A Tool for Space-Time Coding”, Foundations and Trends in Communications and Information Theory, vol. 4, no. 1, pp. 1–95, 2007.
[11] C. Gong, “On Partial Downloading for Wireless Distributed Storage Networks”, IEEE Trans. on Signal Processing, vol. 60, pp. 3278–3288, June 2012.
[12] N. Wang and J. Lin, “Joint Channel-Network Coding (JCNC) for Distributed Storage in Wireless Network”, Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol. 4, 2009.
[13] K. V. Rashmi, N. B. Shah, K. Ramchandran, and P. V. Kumar, “Regenerating codes for errors and erasures in distributed storage”, IEEE ISIT 2012, July 2012.
[14] C. Hollanti, D. Karpuk, A. Barreal, and H.-F. Lu, “Space–time storage codes for wireless distributed storage systems”, to appear in Global Wireless Summit, May 2014. arXiv:1404.6645.
[15] H.-F. Lu, C. Hollanti, R. Vehkalahti, and J. Lahtonen, “DMT optimal codes constructions for multiple-access MIMO channel”, IEEE Trans. Inf. Theory, vol. 57, no. 6, pp. 3594–3617, Jun. 2011.
[16] T. Ernvall, S. El Rouayheb, C. Hollanti, and V. Poor, “Heterogeneous distributed storage systems: capacity and security results”, J. on Selected Areas in Communications, Dec. 2013. Available at http://arxiv.org/abs/1211.0415.