New constructions of CSS codes obtained by moving to higher alphabets
We generalize a construction of non-binary quantum LDPC codes over $\mathbb{F}_{2^m}$ due to [KHIK11] and apply it in particular to toric codes. In this way we not only obtain codes with better rates than toric codes but also dramatically improve the performance of standard iterative decoding. Moreover, the new codes inherit the distance properties of the underlying toric codes and therefore have a minimum distance which grows as the square root of the length of the code for fixed $m$.
LDPC codes [Gal63] and their variants are one of the most satisfying answers to the problem of devising codes guaranteed by Shannon’s theorem. They display outstanding performance for a large class of error models with a fast decoding algorithm. Generalizing these codes to the quantum setting seems a promising way to devise powerful quantum error correcting codes for protecting, for instance, the very fragile superpositions manipulated in a quantum computer. It should be emphasized that a fast decoding algorithm could be even more crucial in the quantum setting than in the classical one. In the classical case, when error correction codes are used for communication over a noisy channel, the decoding time translates directly into communication delays. This has been the driving motivation to devise decoding schemes of low complexity, and is likely to be important in the quantum setting as well. However, there is an important additional motivation for efficient decoding in the quantum setting. Quantum computation is likely to require active stabilization. The decoding time thus translates into computation delays, and most importantly in error suppression delays. If errors accumulate faster than they can be identified, quantum computation may well become infeasible: fast decoding is an essential ingredient to fault-tolerant computation.
Quantum generalizations of LDPC codes have indeed been proposed in [MMM04]. However, it has turned out that the design of high performance quantum LDPC codes is much more complicated than in the classical setting. There are several reasons for this, the most obvious being that the parity-check matrices of quantum LDPC codes must satisfy certain orthogonality constraints. This significantly complicates the construction of such codes. In particular, the plain random constructions that work so well in the classical setting are pointless here. There have been a number of attempts at overcoming this difficulty, and a variety of methods for constructing quantum LDPC codes have been proposed [Pos01, Kit03, MMM04, COT05, COT07, LGF06, GFL08, HI07, IM07, Djo08, SMK08, Aly07, Aly08, HBD08, TZ09, TL10, KHIK11]. However, with the exception of [TZ09], which gives a construction of LDPC codes with minimum distance of the order of the square root of the blocklength, all of these constructions suffer from disappointingly small minimum distances: whenever they have non-vanishing rate and parity-check matrices with bounded row weight, their minimum distance is either proved to be bounded, or unknown and with little hope for unboundedness.
The point has been made several times that minimum distance is not everything, because there are complex decoding issues involved, whose behavior depends only in part on the minimum distance, and also because a poor asymptotic behavior may be acceptable when one limits oneself to practical lengths. This is illustrated for instance in our case by the codes constructed in [KHIK11] whose performance under iterative decoding is quite good even if their minimum distance might be bounded. Their construction can be summarized as follows. There are three ingredients:
The starting point is a CSS quantum code associated to a pair $(C_X, C_Z)$ of binary LDPC codes satisfying the CSS orthogonality condition $H_X H_Z^T = 0$ (see Section II), obtained from a construction due to [HI07]. These LDPC codes have parity-check matrices $H_X$ and $H_Z$ which are $(2, \ell)$-regular, meaning that each column contains exactly 2 ones and each row contains exactly $\ell$ ones.
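From this construction, a pair $(\tilde{C}_X, \tilde{C}_Z)$ of $q$-ary LDPC codes is deduced which satisfies the same orthogonality condition over $\mathbb{F}_q$, where $q$ is some power of two, $q = 2^m$. These codes have parity-check matrices $\tilde{H}_X$ and $\tilde{H}_Z$ of the same size as the binary matrices $H_X$ and $H_Z$ respectively, with a nonzero entry exactly where the corresponding entry of $H_X$ (respectively $H_Z$) is equal to $1$, that is
$$\tilde{h}^X_{ij} \neq 0 \iff h^X_{ij} = 1, \qquad \tilde{h}^Z_{ij} \neq 0 \iff h^Z_{ij} = 1,$$
where $\tilde{h}^X_{ij}$, $\tilde{h}^Z_{ij}$, $h^X_{ij}$, $h^Z_{ij}$ denote the entries in the $i$-th row and the $j$-th column of $\tilde{H}_X$, $\tilde{H}_Z$, $H_X$, $H_Z$ respectively.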
Denoting the length of the $q$-ary codes by $n$, and replacing each entry of their parity-check matrices $\tilde{H}_X$ and $\tilde{H}_Z$ in the finite field $\mathbb{F}_q$ with $q = 2^m$ elements by a binary matrix of size $m \times m$, through a ring isomorphism from $\mathbb{F}_q$ onto a certain subring of the binary $m \times m$ matrices (i.e. a one-to-one mapping preserving field addition and multiplication), a pair of binary parity-check matrices is obtained. They define a pair of binary codes of length $mn$ satisfying the CSS condition.
The point of this construction is that the new quantum code associated to the resulting pair of binary codes can now be decoded over the extension field $\mathbb{F}_q$, and this dramatically improves the performance, in the same way as the performance of classical binary regular LDPC codes is improved by moving to a larger extension field, as shown in [Hu02, HEA05].
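The third ingredient can be illustrated with a small sketch. It assumes one standard choice of ring isomorphism, namely sending a field element to the binary matrix of "multiplication by that element" in a polynomial basis; the specific map used in [KHIK11] may differ. The function names, the field $\mathbb{F}_8$ and the modulus $x^3 + x + 1$ are illustrative assumptions.

```python
import itertools

def gf_mul(a, b, m, poly):
    """Multiply a and b in GF(2^m). Elements are ints (bit k = coefficient of x^k);
    poly is the irreducible modulus, including its x^m bit."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> m) & 1:
            a ^= poly
    return r

def to_matrix(a, m, poly):
    """m x m binary matrix of the map y -> a*y in the basis 1, x, ..., x^(m-1).
    Column j holds the coordinates of a * x^j."""
    cols = [gf_mul(a, 1 << j, m, poly) for j in range(m)]
    return tuple(tuple((cols[j] >> i) & 1 for j in range(m)) for i in range(m))

def mat_add(A, B):
    return tuple(tuple(x ^ y for x, y in zip(ra, rb)) for ra, rb in zip(A, B))

def mat_mul(A, B):
    m = len(A)
    return tuple(
        tuple(sum(A[i][k] & B[k][j] for k in range(m)) % 2 for j in range(m))
        for i in range(m)
    )

M, POLY = 3, 0b1011  # assumed example: GF(8) defined by x^3 + x + 1
ELEMENTS = range(1 << M)
IMAGES = {a: to_matrix(a, M, POLY) for a in ELEMENTS}

# The map must preserve addition and multiplication and be one-to-one.
HOMOMORPHISM_OK = all(
    IMAGES[a ^ b] == mat_add(IMAGES[a], IMAGES[b])
    and IMAGES[gf_mul(a, b, M, POLY)] == mat_mul(IMAGES[a], IMAGES[b])
    for a, b in itertools.product(ELEMENTS, repeat=2)
)
INJECTIVE = len(set(IMAGES.values())) == 1 << M
```

Replacing every entry of a $q$-ary parity-check matrix by its image under such a map multiplies both dimensions of the matrix by $m$, which is why the binary codes have length $mn$.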
Our purpose in this article is to generalize the construction of [KHIK11] and to show that it can be applied to any pair $(C_X, C_Z)$ of binary LDPC codes satisfying the CSS condition whose parity-check matrices have exactly two ones per column. Applied to the toric code, it yields a code with the following features:
(i) it has the same two-dimensional structure as toric codes, which might turn out to be very helpful for its implementation. It represents, for instance, a quite attractive code choice for performing quantum fault-tolerant computation [Kit03].
(ii) it inherits the distance properties from the underlying toric code and has therefore a minimum distance which grows like the square root of the length,
(iii) the number of encoded qubits is no longer constant as for toric codes but grows as $2m$, where $m$ is the degree of the extension field,
(iv) whereas iterative decoding displays very bad performances when applied to toric codes, plain iterative decoding behaves much better for this new family of codes and when for instance, we obtain codes for which iterative decoding performs quite well (see Section V).
Apart from the practical relevance of the codes constructed, there is also a theoretical aspect. Our results show, for instance, that the construction strategy of [KHIK11] can produce families of CSS codes with a prescribed degree distribution on the check nodes and an unbounded minimum distance. It is questionable whether the codes constructed in [KHIK11] have this property (one of the drawbacks of the codes constructed there is that they start with a certain construction of quasi-cyclic CSS codes which can easily be proved to have bounded minimum distance).
II CSS codes and Tanner graphs
The codes constructed in this paper fall into the category of Calderbank-Shor-Steane (CSS) codes [CS96, Ste96], which belong to a more general class of quantum codes called stabilizer codes [Got97, CRSS98]. The first class is described with the help of a pair of mutually orthogonal binary codes, whereas the second class is given by an additive self-orthogonal code over $\mathbb{F}_4$ with respect to the trace hermitian product. Quantum codes on $n$ qubits are linear subspaces of a Hilbert space of dimension $2^n$ and do not necessarily have a compact representation in general. The nice feature of stabilizer codes is that they allow one to define such a space with the help of a very short representation, which is given here by a set of generators of the aforementioned additive code. Each generator is viewed as an element of the Pauli group on $n$ qubits and the quantum code is then nothing but the space stabilized by these Pauli group elements. Moreover, the set of errors that such a quantum code can correct can also be deduced directly from this discrete representation. For the subclass of CSS codes, this representation in terms of additive self-orthogonal codes is equivalent to a representation in terms of a pair $(C_X, C_Z)$ of binary linear codes satisfying the condition $C_Z^\perp \subseteq C_X$. The quantum minimum distance of such a CSS code is given by
$$d_Q = \min\{\, |x| : x \in (C_X \setminus C_Z^\perp) \cup (C_Z \setminus C_X^\perp) \,\}.$$
Such a code allows one to protect a subspace of $k_Q$ qubits against $\lfloor (d_Q - 1)/2 \rfloor$ errors, where
$$k_Q = \dim C_X - \dim C_Z^\perp \qquad (3)$$
is called the quantum dimension of the CSS code.
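As a small illustration of the CSS condition and of the quantum dimension, the following sketch works with parity-check matrices: the condition is equivalent to every row of one matrix being orthogonal to every row of the other, and the quantum dimension equals the length minus the two ranks over $\mathbb{F}_2$. The Steane code, built from the $[7,4]$ Hamming code, is used only as a familiar example and does not appear in the constructions of this paper; all function names are illustrative.

```python
def gf2_rank(rows):
    """Rank over GF(2) of a matrix given as a list of 0/1 lists."""
    basis = {}  # position of leading bit -> reduced row, stored as an int
    for row in rows:
        v = sum(bit << k for k, bit in enumerate(row))
        while v:
            top = v.bit_length() - 1
            if top not in basis:
                basis[top] = v
                break
            v ^= basis[top]
    return len(basis)

def css_condition_holds(hx, hz):
    """True when every row of hx is orthogonal (mod 2) to every row of hz."""
    return all(
        sum(a & b for a, b in zip(rx, rz)) % 2 == 0
        for rx in hx
        for rz in hz
    )

def quantum_dimension(hx, hz):
    """n - rank(hx) - rank(hz), i.e. dim C_X - dim C_Z^perp."""
    n = len(hx[0])
    return n - gf2_rank(hx) - gf2_rank(hz)

# Parity-check matrix of the [7,4] Hamming code (columns are 1..7 in binary).
HAMMING = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

STEANE_OK = css_condition_holds(HAMMING, HAMMING)
STEANE_K = quantum_dimension(HAMMING, HAMMING)
```

The same two functions apply unchanged to any pair of binary parity-check matrices, including those of the toric code.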
LDPC codes are linear codes which have a sparse parity-check matrix. They can be decoded by using the Tanner graph associated to such a parity-check matrix $H$. This graph is defined as follows. Assume that $H = (h_{ij})$ is an $r \times n$ matrix (where $n$ is the length of the code). The associated Tanner graph is bipartite and has:
vertex set $V \cup C$, where the first set $V$ is in bijection with the indices of the columns of $H$, say $V = \{1, \dots, n\}$, and is called the set of variable nodes, whereas the second set $C$ is called the set of check nodes and is in bijection with the indices of the rows of $H$: $C = \{\oplus_1, \dots, \oplus_r\}$.
edge set $E$; there is an edge between $\oplus_i$ and $j$ if and only if $h_{ij} \neq 0$, and the edge receives label $h_{ij}$ in this case.
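The definition of the Tanner graph translates directly into code. The sketch below stores the graph as a dictionary of labelled edges; the representation and the function names are illustrative choices, and the small matrix uses the integers 1, 2, 3 merely as stand-ins for nonzero field elements.

```python
def tanner_graph(h):
    """Return (variable_nodes, check_nodes, edges) for a parity-check matrix h.
    edges maps (check index, variable index) to the label h[i][j],
    with one entry per nonzero coefficient."""
    r, n = len(h), len(h[0])
    edges = {
        (i, j): h[i][j]
        for i in range(r)
        for j in range(n)
        if h[i][j] != 0
    }
    return list(range(n)), list(range(r)), edges

def degrees(h):
    """Degrees of the variable nodes and of the check nodes."""
    variables, checks, edges = tanner_graph(h)
    var_deg = [0] * len(variables)
    chk_deg = [0] * len(checks)
    for (i, j) in edges:
        chk_deg[i] += 1
        var_deg[j] += 1
    return var_deg, chk_deg

# Example: three checks and three variables forming a single cycle of length 6.
H_EXAMPLE = [
    [1, 2, 0],
    [0, 3, 1],
    [2, 0, 1],
]
VARIABLES, CHECKS, EDGES = tanner_graph(H_EXAMPLE)
VAR_DEG, CHK_DEG = degrees(H_EXAMPLE)
```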
A CSS code defined by a pair $(C_X, C_Z)$ of binary codes is said to be a quantum LDPC code if and only if $C_X$ and $C_Z$ are LDPC codes.
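We show in this section how to derive, for any integer $m$ and $q = 2^m$, from a pair $(C_X, C_Z)$ of binary LDPC codes with parity-check matrices $H_X$ and $H_Z$ satisfying
(1) $H_X H_Z^T = 0$,
(2) all the columns of $H_X$ and $H_Z$ contain exactly two ones,
a pair $(\tilde{C}_X, \tilde{C}_Z)$ of $q$-ary LDPC codes with parity-check matrices $\tilde{H}_X$ and $\tilde{H}_Z$ satisfying
(1) $\tilde{H}_X \tilde{H}_Z^T = 0$,
(2) all the columns of $\tilde{H}_X$ and $\tilde{H}_Z$ contain exactly two nonzero elements.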
This generalizes the construction of [KHIK11] to codes other than the ones obtained from [HI07], by using the ring isomorphism from the finite field $\mathbb{F}_q$ to a subring of the binary $m \times m$ matrices which is described in Subsection II.C of [KHIK11].
We show the existence of the couple by providing an efficient algorithm which outputs a couple of matrices meeting (1) and (2). To explain how the algorithm works let us bring in the following definition
To each row of we associate a parity-check matrix consisting of the submatrix of formed by the columns of such that and by keeping only the non zero rows in it. Let be the Tanner graph associated to this parity-check matrix.
The crucial point is the following lemma
The degree of every variable node of is two, whereas the degree of every check node is an even positive number.
The fact that the degree of every variable node is exactly two is a direct consequence of the fact that the columns of are all of weight since the columns of have exactly this property. The second claim about the degree of the check nodes is a consequence of . This can be verified as follows. Each check node corresponds to a row of which corresponds itself to some row of . We denote such a row by . The degree of the check node corresponding to is nothing but the weight of row of . It is equal to the number of ’s such that we both have . Notice that implies in particular that
This implies the aforementioned claim about the degree of the check node, since the aforementioned number of ’s is necessarily even in order to meet (4).
Since the degrees of all the vertices of is even, can be decomposed in an edge-disjoint subset of cycles . Each variable node vertex belongs to a unique cycle of this kind whereas a check node may belong to several cycles of . Our strategy to ensure that there is a choice of and meeting Condition (1) and is to look for solutions which satisfy for all rows of , all cycles of , and all check nodes belonging to
where we denote by the set of edges of . Notice that there are exactly two variable nodes which are adjacent to in . The first point is that the sum can be decomposed as a sum which implies that ensuring (5) implies (4) and therefore . Moreover the code associated to the cyclic Tanner graph is non trivial if and only if the product of its labels on its cycle is equal to . We define here for a Tanner graph the product over a cycle by
Definition 1 (product over a cycle of a Tanner graph)
Let be a cycle in the Tanner graph code. Then the product over this cycle is the product of all the coefficients of the edges over this cycle, with a power if it is a check-to-node edge, and if it is node-to-check. We denote this product by .
Indeed, it is well known that
Proposition 1
The code associated to a Tanner graph consisting of a single cycle is not reduced to the zero codeword if and only if the product of the labels over the cycle is equal to $1$. In such a case, all the non-zero codewords have only non-zero positions.
The proof of this proposition is given in the appendix.
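The statement about single-cycle Tanner graphs can be checked by exhaustive search on a small case. The sketch below does this over $\mathbb{F}_4$ for a cycle with three checks and three variables, trying every choice of nonzero labels. The indexing is an assumption made for the example: check $i$ is adjacent to variables $i$ and $i+1$ (indices modulo the cycle length) with labels `a[i]` and `b[i]`, and the cycle product takes the `b` labels with power $+1$ and the `a` labels with power $-1$. Since a product equals $1$ exactly when its inverse does, the opposite sign convention gives the same test.

```python
import itertools

def gf4_mul(a, b):
    """Multiply in GF(4) = GF(2)[x]/(x^2 + x + 1); elements are the ints 0..3."""
    r = 0
    for shift in range(2):
        if (b >> shift) & 1:
            r ^= a << shift
    if r & 0b100:
        r ^= 0b111
    return r

def gf4_inv(a):
    """Inverse of a nonzero element of GF(4), found by search."""
    return next(b for b in range(1, 4) if gf4_mul(a, b) == 1)

def cycle_product(a, b):
    """Product of b[i] / a[i] over the whole cycle."""
    p = 1
    for ai, bi in zip(a, b):
        p = gf4_mul(p, gf4_mul(bi, gf4_inv(ai)))
    return p

def codewords(a, b):
    """All x in GF(4)^l with a[i]*x[i] + b[i]*x[i+1] = 0 for every check i."""
    l = len(a)
    return [
        x
        for x in itertools.product(range(4), repeat=l)
        if all(
            (gf4_mul(a[i], x[i]) ^ gf4_mul(b[i], x[(i + 1) % l])) == 0
            for i in range(l)
        )
    ]

def check_proposition(l=3):
    """Test both claims for every nonzero labelling of a cycle with l checks."""
    for labels in itertools.product(range(1, 4), repeat=2 * l):
        a, b = labels[:l], labels[l:]
        words = codewords(a, b)
        if (len(words) > 1) != (cycle_product(a, b) == 1):
            return False
        if any(0 in x for x in words if any(x)):
            return False
    return True

PROPOSITION_HOLDS = check_proposition(3)
```

The check rests on the fact that each parity equation determines one coordinate from the previous one, so going once around the cycle multiplies the first coordinate by the cycle product (or its inverse).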
The algorithm for choosing the entries of and is described below as Algorithm 1.
The fact that the ’s can be chosen to be different from zero comes from the fact that the product of the labels along is equal to and from Proposition 1. It just amounts to choose a non-zero codeword in the code whose Tanner graph is given by and the labels of the edges are given by the ’s. This leads to two matrices and which satisfy Condition (1) and . Finally, it remains to explain how we choose the entries of . We will actually provide an algorithm which provides a stronger condition on the ’s, namely that
The fact that the product over all cycles of will be equal to (and not only the cycles of the subgraphs ) will be quite useful when applied to the toric code and this stronger condition can be met with Algorithm 2 which gives a very large choice for the coefficients.
(of correctness of Algorithm 2) Let be a cycle of . Let us prove that . This product can be written as
where counts the contribution to the product which involves terms which depend on . By denoting by and the two variable nodes adjacent to in the cycle and by and the two other check nodes which are adjacent in the cycle to and respectively we can decompose as
where gives the part of the contribution to stemming from edge by keeping only elements of the product which depend on . We observe now that , , and . This implies , which in turn implies that .
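To make the condition "product equal to $1$ over every cycle" concrete, here is one simple labelling that satisfies it. It is offered as an assumption for illustration and is not a transcription of Algorithm 2: every check $c$ receives a nonzero element `alpha[c]`, every variable $v$ a nonzero element `beta[v]`, and the edge between them is labelled `alpha[c] * beta[v]`. Along any cycle each node touches two consecutive edges carrying opposite powers, so its contribution cancels. The field $\mathbb{F}_4$, the example matrix and all names are illustrative.

```python
import itertools
import random

def gf4_mul(a, b):
    """Multiply in GF(4) = GF(2)[x]/(x^2 + x + 1); elements are the ints 0..3."""
    r = 0
    for shift in range(2):
        if (b >> shift) & 1:
            r ^= a << shift
    if r & 0b100:
        r ^= 0b111
    return r

def gf4_inv(a):
    return next(b for b in range(1, 4) if gf4_mul(a, b) == 1)

def product_labels(h, rng):
    """Replace every 1 of the binary matrix h by alpha[c] * beta[v] (assumed scheme)."""
    alpha = [rng.randrange(1, 4) for _ in h]
    beta = [rng.randrange(1, 4) for _ in h[0]]
    return [
        [gf4_mul(alpha[c], beta[v]) if h[c][v] else 0 for v in range(len(h[0]))]
        for c in range(len(h))
    ]

def cycle_product(labels, cycle):
    """cycle = [c0, v0, c1, v1, ...], alternating check and variable indices,
    closed implicitly. Check-to-variable edges get power +1,
    variable-to-check edges get power -1."""
    p = 1
    for t in range(0, len(cycle), 2):
        c, v = cycle[t], cycle[t + 1]
        c_next = cycle[(t + 2) % len(cycle)]
        p = gf4_mul(p, labels[c][v])
        p = gf4_mul(p, gf4_inv(labels[c_next][v]))
    return p

# Two checks, four variables, every column of weight two.
H_BINARY = [[1, 1, 1, 1], [1, 1, 1, 1]]
FOUR_CYCLES = [[0, v, 1, w] for v, w in itertools.permutations(range(4), 2)]

ALL_PRODUCTS_ONE = all(
    cycle_product(product_labels(H_BINARY, random.Random(seed)), cyc) == 1
    for seed in range(20)
    for cyc in FOUR_CYCLES
)
```

A labelling drawn uniformly at random, by contrast, generally has cycles whose product differs from $1$.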
Remark: One might wonder whether or not it is possible to obtain -ary versions of and which satisfy the orthogonality condition when the columns of and have weight greater than . While this can be easily done for certain structured constructions such as the one proposed in [TZ09], it is not clear how to achieve this in all generality. The difficulty is the following. Consider the code defined by a Tanner graph which is a subgraph of labelled by a certain choice of the and which consists in codewords of the form satisfying (4). All these codes (for ranging over all rows of ) should be not reduced to the zero codeword. While this is easily achieved in the case of column weight essentially by the fact that the number of check nodes of the Tanner graphs is always less than or equal to the number of variable nodes (since by Lemma 1 the degree of the check nodes is greater than or equal to and the degree of the variable nodes is constant and equal to ), this is not the case anymore when the column weight is higher.
IV An application: the extended toric code
IV-A Definition of the toric code and its extended version
The toric code (see [BK98] for more details) is a CSS code of length which encodes qubits. It is convenient to define the Tanner graphs and of the couple of binary codes of the CSS code simultaneously. Let and be the set of variable nodes of and respectively and we identify the variable node sets and of both codes, say . These graphs are defined as follows:
A check node is connected to variable nodes in both graphs (where addition is performed modulo ). The degree of the variable nodes is of course .
The construction, summarized on Fig 1, has the shape of a torus of length and width .
Even if this code has as many checks as qubits, its dimension is positive: the parity-check matrices $H_X$ and $H_Z$ associated to the two Tanner graphs each have rank one less than their number of rows, thus the dimension is $2$ (from (3)). The code has a rather large minimum distance [Kit03]; however, its performance when decoded with standard belief propagation is quite bad, because of the presence of many small cycles and also because the (classical) minimum distance of $C_X$ and $C_Z$ is only $4$.
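The rank deficiency and the resulting dimension can be verified numerically. The sketch below assumes the standard layout of the toric code on an $L \times L$ square lattice with periodic boundaries, qubits on edges, one check per vertex and one check per face; the indexing of nodes used in this section may differ, but the resulting code is the same up to relabelling when the torus is square. The function names are illustrative.

```python
def gf2_rank(rows):
    """Rank over GF(2) of a matrix given as a list of 0/1 lists."""
    basis = {}
    for row in rows:
        v = sum(bit << k for k, bit in enumerate(row))
        while v:
            top = v.bit_length() - 1
            if top not in basis:
                basis[top] = v
                break
            v ^= basis[top]
    return len(basis)

def toric_code(L):
    """Binary parity-check matrices (hx, hz) of the L x L toric code (assumed layout)."""
    n = 2 * L * L

    def h(i, j):  # horizontal edge from vertex (i, j) to vertex (i, j + 1)
        return (i % L) * L + (j % L)

    def v(i, j):  # vertical edge from vertex (i, j) to vertex (i + 1, j)
        return L * L + (i % L) * L + (j % L)

    hx, hz = [], []
    for i in range(L):
        for j in range(L):
            star = {h(i, j), h(i, j - 1), v(i, j), v(i - 1, j)}
            face = {h(i, j), h(i + 1, j), v(i, j), v(i, j + 1)}
            hx.append([1 if q in star else 0 for q in range(n)])
            hz.append([1 if q in face else 0 for q in range(n)])
    return hx, hz

def summary(L):
    hx, hz = toric_code(L)
    n = len(hx[0])
    orthogonal = all(
        sum(a & b for a, b in zip(rx, rz)) % 2 == 0 for rx in hx for rz in hz
    )
    column_weights = {sum(col) for m in (hx, hz) for col in zip(*m)}
    row_weights = {sum(row) for m in (hx, hz) for row in m}
    k = n - gf2_rank(hx) - gf2_rank(hz)
    return n, len(hx) + len(hz), orthogonal, column_weights, row_weights, k

TORIC_SUMMARY = summary(3)
```

For $L = 3$ the summary reports 18 qubits, 18 checks, column weight 2, row weight 4 and quantum dimension 2.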
Now we construct a -ary version of this code, in the same way as in Section III. In other terms, we just put some non-zero labels on the edges of the graph. For simplicity of notation we will further use to design , the label in on the edge between check and node . Labeling is performed through Algorithm 1 by choosing the coefficients and at random in Algorithm 2. We obtain a couple of -ary codes satisfying
We obtain the extended toric code by applying the aforementioned ring isomorphism to the entries of the parity-check matrices and of and : the resulting code has length . We denote the couple of binary codes defining this toric code by .
Strictly speaking, by applying Algorithm 1, the dimension of minus the dimension of could be smaller than . Indeed and might now be of full rank and we might have . This would imply that and the quantum dimension of the extended toric code would be . However, when we apply Algorithm 2 to choose the labels (so that the product of the labels over all cycles of is equal to ), then it will turn out that
so that . This means that
Theorem 1 (Dimension of the extended toric code)
If and are constructed such that verifies (6) and , then the extended toric code has dimension .
This is shown with the help of two lemmas:
If verifies (6), then it has -ary dimension .
From these two lemmas, we obtain that the dimension of and is , which gives
This implies that the quantum dimension of the extended toric code is
The proof of the two lemmas is given in the appendix.
IV-C Minimum distance
Choosing the product of the labels to be equal to on all cycles of brings another benefit : it allows to control the minimum distance, since we have in this case
The proof is given in the appendix. This implies that
Theorem 2 (minimum distance of the extended toric code)
The minimum distance of the extended toric code is .
The minimum distance of the extended toric code is the minimal weight of a word from or . The Hamming weight of such a word is greater than or equal to the Hamming weight of the word in or it corresponds to after taking the aforementioned ring isomorphism .
There is also an upper bound on the minimum distance: it is at most , since a word of weight in has minimal weight and maximal weight in .
We have implemented standard belief propagation over to decode extended toric codes for
several values of and (see Section III of [KHIK11]) but which correspond to the same final length , which is
here. We have
(i) , ,
(iii) , .
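The channel error model is the depolarizing channel with depolarizing probability $p$, meaning that each of the errors $X$, $Y$ and $Z$ occurs with probability $p/3$. The probability that a given qubit is affected by an $X$-type error, and likewise by a $Z$-type error, is therefore $2p/3$, which implies that each of the two constituent codes sees a binary symmetric channel of crossover probability $2p/3$.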
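The passage from the depolarizing channel to the binary symmetric channels seen by the two constituent codes is a short marginalization, sketched below with exact fractions. It assumes the usual convention that a depolarizing probability $p$ is split evenly between $X$, $Y$ and $Z$, and that a $Y$ error contributes both an $X$-type and a $Z$-type component; the function names are illustrative.

```python
from fractions import Fraction

def depolarizing(p):
    """Distribution of the single-qubit Pauli error for depolarizing probability p."""
    p = Fraction(p)
    return {"I": 1 - p, "X": p / 3, "Y": p / 3, "Z": p / 3}

def flip_probabilities(p):
    """Probabilities that the X-type part (X or Y) and the Z-type part (Z or Y)
    of the error are nonzero on a given qubit."""
    d = depolarizing(p)
    return d["X"] + d["Y"], d["Z"] + d["Y"]

EXAMPLE_FLIPS = flip_probabilities(Fraction(3, 100))
```

The two components are not independent, since a $Y$ error triggers both; decoding the two codes separately ignores this correlation.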
The performance of belief propagation is quite bad in the binary case (that is for standard toric code), even if the qubit error rate is rather low, the whole error is typically badly estimated. On the other hand the performances get better by moving from to and become quite good over . This is remarkable since the length of these CSS codes is constant but the rate increases with . For instance, the rate of the toric code is whereas the rate of the extended toric code over is equal to . It would be interesting to carry over the renormalizing approach of [DCP10] which improves dramatically belief propagation over standard toric codes and study how much it is able to improve the performance of standard belief propagation over these larger alphabets.
[Proof of Proposition 1] Let us consider a Tanner graph composed of a cycle , and let be the label on the edge between check and node . A codeword of the code associated to this Tanner graph is such that:
This system has non-trivial solutions if and only if the determinant of this system is , ie if:
which means that the product over the cycle is .
If this condition is verified, and one of the ’s is zero, for example , we can see from the system that have to be equal to zero too. So the non-zero codewords have only non-zero positions.
[Proof of Lemma 2]
We consider here two basic types of cycles in the Tanner graphs of and : the minimal cycles of length , and cycles of length that go through the length or the width of the torus, we call the last ones “big cycles”. An example is shown on Fig 3.
Definition 2 (minimal cycle)
A minimal cycle in the Tanner graph of or is a cycle of the form: with even, so that is a variable node.
Definition 3 (Big cycle)
A horizontal big cycle in the Tanner graph of or is a cycle of the form: , with even.
A vertical big cycle in the Tanner graph of or is a cycle , with even.
Our first observation is that it is enough to prove Condition (6) on the minimal cycles and the big cycles of the Tanner graph of , since the product of any other cycle in this Tanner graph can be decomposed as a product of products over these basic cycles.
Let us now consider a -cycle which lives in the union of the two Tanner graphs of and . It consists in two checks (see Fig 4) and , that are both connected to two variable nodes and . From the orthogonality constraint , we deduce that the labels on the edges of this cycle satisfy
We can reformulate this:
With the following definition, we obtain in this way that the product over such cycles of size is equal to .
Definition 4 (Product over a cycle - extended version)
The notion of product over a cycle can be extended to the union of the Tanner graphs of and . If is a cycle in this union, the product over this cycle is the product of all the labels of the edges over this cycle, with a power:
if the edge is check-to-variable node and belongs to the -part,
if the edge is variable node-to-check and belongs to the -part,
if the edge is check-to-variable node and belongs to the -part,
if the edge is variable node-to-check and belongs to the -part,
Now, let us look at a combination of such small cycles, as in Fig 5.
The product over all small cycles is :
By multiplying all these equations, we obtain:
which is exactly the product over a minimal cycle of .
Now, we consider another combination of 4-cycles such as in Fig 4, among one direction of the torus, as shown in Fig 6. It consists, in the subgraph of both Tanner graphs, in the variable and check nodes in the cartesian product . To simplify notation we have relabeled a variable node by , a variable node by , a check node corresponding to by and a check node corresponding to also by . It is summarized in Fig 6.
The product over all such cycles is , ie:
By multiplying all these equations, we get:
The first parenthesis is the product over a big cycle of , and the second parenthesis is the product over a big cycle of .
It shows that if the product over a big horizontal cycle is equal to in , then the product over a big horizontal cycle in is also equal to . There is a similar proof for the vertical cycles.
[Proof of Lemma 3]
First, we show that the dimension of is at least .
The idea is to construct a set of independent codewords associated to cycles of the Tanner graph of the code. This is obtained as follows. Since all variable nodes of this Tanner graph have degree $2$, we can consider the graph of the checks, where the vertices are the checks, and there is an edge between two vertices if and only if there is a variable node that is adjacent to the two checks. Informally, it is the same graph in which every “edge, variable node, edge” path is replaced by a single edge. We consider a spanning tree of this graph. An example of such a spanning tree is shown in Fig 7.
This spanning tree has of course checks, and therefore edges between these checks. There are other edges: let be such edges. For all , adding to the spanning tree provides a unique cycle, . Let be the corresponding cycle in the original Tanner graph. Now, the product over each such cycle is . From Proposition 1, each of these cycles provides a codeword of . These codewords are necessarily independent, since for all the positions which correspond to the edges , exactly one of these codewords has a non zero entry (for the edge it is precisely which has a non zero entry for this position).
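The counting argument can be reproduced with a short sketch. It assumes the graph of checks of a toric code on an $L \times L$ torus, with one vertex per check and one edge per variable node, builds a spanning tree with a union-find structure, and collects the remaining edges, each of which closes exactly one cycle with the tree. The layout and the function names are illustrative assumptions.

```python
def torus_check_graph(L):
    """Graph of the checks for an L x L torus (assumed layout): L*L vertices,
    and one edge per variable node joining the two checks adjacent to it."""
    def idx(i, j):
        return (i % L) * L + (j % L)

    edges = []
    for i in range(L):
        for j in range(L):
            edges.append((idx(i, j), idx(i, j + 1)))
            edges.append((idx(i, j), idx(i + 1, j)))
    return L * L, edges

def split_spanning_tree(num_vertices, edges):
    """Return (tree_edges, extra_edges); each extra edge closes one cycle."""
    parent = list(range(num_vertices))

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    tree, extra = [], []
    for (u, w) in edges:
        ru, rw = find(u), find(w)
        if ru == rw:
            extra.append((u, w))
        else:
            parent[ru] = rw
            tree.append((u, w))
    return tree, extra

NUM_CHECKS, CHECK_EDGES = torus_check_graph(3)
TREE_EDGES, EXTRA_EDGES = split_spanning_tree(NUM_CHECKS, CHECK_EDGES)
```

With $L = 3$ there are 9 checks and 18 variable nodes, so the tree has 8 edges and 10 edges remain, that is, half the length plus one.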
To show that this dimension is at most , we remove a certain check, say check . We want to show that the remaining checks are independent. To obtain this, we prove that for any syndrome, we can construct an error that gives this syndrome. In particular, we show that for every check , we can get the syndrome with at position .
Let be some path in the Tanner graph that links to . An example of such path is shown in Fig 8.
Now we construct an error that has in every position except the ’s:
is such that the syndrome in is , ie
such that has syndrome , ie , and so on and so forth.
Since has been removed, all the checks except are satisfied.
[Proof of Lemma 4]
Consider an element of minimal weight in the set . We are going to prove that its weight is greater than or equal to . A similar proof shows that this is also the case for the minimal weight elements of and this proves the lemma.
From Lemma 3, we know that the dimension of is and the dimension of is . Then the quotient has dimension , consequently we just need to find two independent codewords and , and any such can be written as , with , and at least one of either or should be non zero.
We claim that we can choose to be a codeword provided by a big vertical cycle of the Tanner graph of (obtained from Proposition 1), and being defined similarly with a big horizontal cycle. We also define and , provided by respectively a big horizontal cycle and a big vertical cycle of .
We notice that the following inner product is non zero