# Quantum serial turbo-codes

David Poulin¹, Jean-Pierre Tillich², and Harold Ollivier³

¹ Center for the Physics of Information, California Institute of Technology, Pasadena, CA 91125, USA.
² INRIA, Equipe Secret, Domaine de Voluceau BP 105, F-78153 Le Chesnay cedex, France.
³ Perimeter Institute for Theoretical Physics, Waterloo, ON, N2J 2W9, Canada.
###### Abstract

We present a theory of quantum serial turbo-codes, describe their iterative decoding algorithm, and study their performances numerically on a depolarization channel. Our construction offers several advantages over quantum LDPC codes. First, the Tanner graph used for decoding is free of 4-cycles that deteriorate the performances of iterative decoding. Secondly, the iterative decoder makes explicit use of the code’s degeneracy. Finally, there is complete freedom in the code design in terms of length, rate, memory size, and interleaver choice.

We define a quantum analogue of a state diagram that provides an efficient way to verify the properties of a quantum convolutional code, and in particular its recursiveness and the presence of catastrophic error propagation. We prove that all recursive quantum convolutional encoders have catastrophic error propagation. In our constructions, the convolutional codes have thus been chosen to be non-catastrophic and non-recursive. While the resulting families of turbo-codes have bounded minimum distance, from a pragmatic point of view the effective minimum distances of the codes that we have simulated are large enough not to degrade the iterative decoding performance up to reasonable word error rates and block sizes. With well-chosen constituent convolutional codes, we observe an important reduction of the word error rate as the code length increases.

**Keywords:** Belief propagation, convolutional codes, iterative decoding, quantum error correction, turbo-codes.

## I Introduction

For the fifty years that followed Shannon’s landmark paper [39] on information theory, the primary goal of the field of coding theory was the design of practical coding schemes that could come arbitrarily close to the channel capacity. Random codes were used by Shannon to prove the existence of codes approaching the capacity – in fact he proved that the overwhelming majority of codes are good in this sense. For symmetric channels this can even be achieved by linear codes. Unfortunately, decoding a generic linear code is an NP-hard problem [5], so such codes have no practical relevance. Making the decoding problem tractable thus requires the use of codes with even more structure.

The first few decades were dominated by algebraic coding theory. Codes such as Reed-Solomon codes [38] and Bose-Chaudhuri-Hocquenghem codes [21, 7] use the algebraic structure of finite fields to design codes with large minimal distances that admit efficient decoders. The most satisfying compromise nowadays is instead obtained from families of codes (sometimes referred to as “probabilistic codes”) with some element of randomness but sufficiently structured to be suitable for iterative decoding. They display good performances for a large class of error models with a decoding algorithm of reasonable complexity. The most prominent families of probabilistic codes are Gallager’s low density parity-check (LDPC) codes [16] and turbo-codes [6]. They are all decoded by a belief propagation algorithm which, albeit sub-optimal, has been shown to have astonishing performance even at rates very close to the channel capacity. Moreover, the randomness involved in the code design can facilitate the analysis of their average performance. Indeed, probabilistic codes are in many aspects related to quenched disordered physical systems, so standard statistical physics tools can be called into play [46, 29].

Quantum information and quantum error correction [41, 44, 4, 17, 23] are much younger theories and differ from their classical cousins in many aspects. For instance, there exists a quantum analogue of the Shannon channel capacity called the quantum channel capacity [12, 40, 26], which sets the maximum rate at which quantum information can be sent over a noisy quantum channel. Contrary to the classical case, we do not know how to efficiently compute its value for channels of practical significance, except for quite peculiar channels such as the quantum erasure channel where it is equal to one minus twice the erasure probability [3]. For the depolarizing channel – the quantum generalization of the binary symmetric channel – random codes do not achieve the optimal transmission rate in general. Instead, they provide a lower bound on the channel capacity, often referred to as the hashing bound. In fact, coding schemes have been designed to reliably transmit information on a depolarizing channel in a noise regime where the hashing bound is zero [14, 42].

The stabilizer formalism [17] is a powerful method in which a quantum code on n qubits can be seen as a classical linear code on 2n bits, but with a parity-check matrix whose rows are orthogonal relative to a symplectic inner product. Moreover, a special class of stabilizer codes, called CSS codes after their inventors [8, 43], can turn any pair of dual classical linear codes into a quantum code with related properties. The stabilizer formalism and the CSS construction allow one to import a great deal of knowledge directly from the classical theory, and one may hope to use them to leverage the power of probabilistic coding to the quantum domain. In particular, one may expect that, as in the classical case, quantum analogues of LDPC codes or turbo-codes could perform under iterative decoding as well as random quantum codes, i.e. that they could come arbitrarily close to the hashing bound.

For this purpose, it is also necessary to design a good iterative decoding algorithm for quantum codes. For a special class of noise models considered here – namely Pauli noise models – it turns out that a version of the classical belief propagation algorithm can be applied. For CSS codes in particular, each code in the pair of dual codes can be decoded independently as a classical code. However, this is done at the cost of neglecting some correlations between errors that impact the coding scheme’s performances. For some classes of stabilizer codes, the classical belief propagation can be improved to exploit the coset structure of degenerate errors, which improves the code’s performances. This is the case for concatenated block codes [35] and the turbo-codes we consider here, but we do not know how to exploit this feature for LDPC codes for instance. Finally, a quantum belief propagation algorithm was recently proposed [25] to enable iterative decoding of more general (non-Pauli) noise models. As in the classical case, quantum belief propagation also ties in with statistical physics [20, 24, 25, 36].

We emphasize that a fast decoding algorithm is crucial in quantum information theory. In the classical setting, when error correction codes are used for communication over a noisy channel, the decoding time translates directly into communication delays. This has been the driving motivation to devise fast decoding schemes, and is likely to be important in the quantum setting as well. However, there is an important additional motivation for efficient decoding in the quantum setting. Quantum computation is likely to require active stabilization. The decoding time thus translates into computation delays, and, most importantly, into error-suppression delays. If errors accumulate faster than they can be identified, quantum computation may well become infeasible: fast decoding is an essential ingredient to fault-tolerant computation (see however [13]).

The first attempts at obtaining quantum analogues of LDPC codes [28, 9, 19] have not yielded results as spectacular as their classical counterpart. This is due to several reasons. First, there are issues with the code design. Due to the orthogonality constraints imposed on the parity-check matrix, it is much harder to construct quantum LDPC codes than classical ones. In particular, constructing the code at random will certainly not do. The CSS construction is of no help since random sparse classical codes do not have sparse duals. In fact, it is still unknown whether there exist families of quantum LDPC codes with non-vanishing rate and unbounded minimum distance. Moreover, all known constructions seem to suffer from poor minimum distances for reasons which are not always fully understood. Second, there are issues with the decoder. The Tanner graph associated to a quantum LDPC code necessarily contains many 4-cycles, which are well known for their negative effect on the performances of iterative decoding. Moreover, quantum LDPC codes are by definition highly degenerate but their decoder does not exploit this property: rather, it is impaired by it [37].

On the other hand, generalizing turbo-codes to the quantum setting first requires a quantum analogue of convolutional codes. These have been introduced in [10, 11, 31, 32] and followed by further investigations [15, 18, 1]. Quantum turbo-codes can be obtained from the interleaved serial concatenation of convolutional codes. This idea was first introduced in [33]. There, it was shown that, on memoryless Pauli channels, quantum turbo-codes can be decoded similarly to classical serial turbo-codes. One of the motivations behind this work was to overcome some of the problems faced by quantum LDPC codes. For instance, graphical representations of serial quantum turbo-codes do not necessarily contain 4-cycles. Moreover, there is complete freedom in the code parameters. Both of these points are related to the fact that there are basically no restrictions on the choice of the interleaver used in the concatenation. Another advantage over LDPC codes is that the decoder makes explicit use of the coset structure associated to degenerate errors.

Despite these features, the iterative decoding performance of the turbo-code considered in [33] was quite poor, much poorer in fact than results obtained from quantum LDPC codes. The purpose of the present article is to discuss at length several issues omitted in [33], to provide a detailed description of the decoding algorithm, to suggest much better turbo-codes than the one proposed there, and, most importantly, to address the issue of catastrophic error propagation for recursive quantum convolutional encoders.

Non-catastrophic and recursive convolutional encoders are responsible for the great success of parallel and serial classical turbo-codes. In a serial concatenation scheme, an inner convolutional code that is recursive yields turbo-code families with unbounded minimum distance [22], while non-catastrophic error propagation is necessary for iterative decoding convergence. The last point can be circumvented in several ways (by doping for instance, see [45]) and some of these tricks can be adapted to the quantum setting, but are beyond the scope of this paper.

The proof [22] that serial turbo-codes have unbounded minimal distance carries over almost verbatim to the quantum setting. Thus, it is possible to design quantum turbo-codes with polynomially large minimal distances. However, we will demonstrate that all recursive quantum convolutional encoders have catastrophic error propagation. This phenomenon is related to the orthogonality constraints which appear in the quantum setting and to the fact that quantum codes are in a sense coset codes. As a consequence, such encoders are not suitable for (standard) serial turbo-code schemes.

In our constructions, the convolutional codes are therefore chosen to be non-catastrophic and non-recursive, so there is no guarantee that the resulting families of turbo-codes have a minimum distance which grows with the number of encoded qubits. Despite these limitations, we provide strong numerical evidence that their error probability decreases as we increase the block size at fixed rate – and this up to rather large block sizes. In other words, from a pragmatic point of view, the minimum distances of the codes that we have simulated are large enough not to degrade the iterative decoding performance up to moderate word error rates and block sizes.

The style of our presentation is motivated by the intention to accommodate a readership familiar with either classical turbo-codes or quantum information science. This unavoidably implies some redundancy and the expert reader may want to skip some sections, or perhaps glance at them to pick up the notation. In particular, the necessary background from classical coding theory and convolutional codes is presented in the next section using the circuit language of quantum information science. This framework is somewhat unconventional – block codes are defined using reversible matrices rather than parity-check or generating matrices, convolutional codes are defined via a reversible seed transformation instead of a linear filter built from shift registers and feed-back lines – yet requires little departure from standard presentations. The benefit is a very smooth transition between classical codes and quantum codes, which are the subject of Sec. III. Whenever possible, the definitions used in the quantum setting directly mirror those established in the classical setting. The other benefit of this framework is that it makes it possible to generate all quantum convolutional codes straightforwardly without being hampered by the orthogonality constraint. In fact, the codes we describe are in general not of the CSS class.

Section IV uses the circuit representation to define quantum convolutional codes and their associated state diagram. The state diagram is an important tool to understand the properties of a convolutional code. In particular, the detailed analysis of the state diagram of recursive convolutional encoders performed in Sec. IV-E will lead to the conclusion that they all have catastrophic error propagation. Section V is a detailed presentation of the iterative decoding procedure used for quantum turbo-codes. Finally, our numerical results on the codes’ word error rate and spectral properties are presented at Sec. VI.

## II Classical preliminaries

The main purpose of this section is to introduce a circuit representation of convolutional encoders which simplifies the generalization of several crucial notions to the quantum setting. For instance, it allows us to define in a straightforward way a state diagram for the quantum analogue of a convolutional code which arises naturally from this circuit representation. This state diagram will be particularly helpful for defining and studying fundamental issues related to turbo-codes such as recursiveness and non-catastrophicity of the constituent convolutional encoders. The circuit representation is also particularly well suited to present the decoding algorithm of quantum convolutional codes.

### II-A Linear block codes

A classical binary linear code C of dimension k and length n can be specified by a full-rank (n−k) × n parity-check matrix H over F_2:

 C = { c̄ ∈ F_2^n | H c̄^T = 0 }. (1)

Alternatively, the code can be specified by fixing the encoding of each information word c ∈ F_2^k through a linear mapping c ↦ cG for some full-rank k × n generator matrix G over F_2 that satisfies H G^T = 0. Since G has rank k, there exists an n × k matrix over F_2 that we denote by a slight abuse of notation by G^{-1} satisfying G G^{-1} = 1l_k, where for any integer m, 1l_m denotes the m × m identity matrix. Similarly, since H has rank n−k, there exists an n × (n−k) matrix H^{-1} over F_2 satisfying H H^{-1} = 1l_{n−k}.

###### Lemma 1

The right inverses G^{-1} and H^{-1} can always be chosen such that (H^{-1})^T G^{-1} = 0.

{proof}

Let H^{-1}′ = (1l_n + G^T (G^{-1})^T) H^{-1}. The substitution H^{-1} → H^{-1}′ preserves the property H H^{-1} = 1l_{n−k} (since H G^T = 0) and fulfills the desired requirement: (H^{-1}′)^T G^{-1} = (H^{-1})^T (1l_n + G^{-1} G) G^{-1} = (H^{-1})^T (G^{-1} + G^{-1}) = 0.

We will henceforth assume that the right inverses G^{-1} and H^{-1} are chosen to fulfill the condition of Lemma 1.
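As an illustration, the construction in the proof of Lemma 1 can be carried out numerically over GF(2). The sketch below (assuming numpy; the 3-bit repetition code is our own toy example) starts from arbitrary right inverses and applies the substitution from the proof:

```python
import numpy as np

# Toy example (our choice): the [3,1] repetition code.
G = np.array([[1, 1, 1]])                 # 1x3 generator
H = np.array([[1, 1, 0], [0, 1, 1]])      # 2x3 parity-check, H G^T = 0

# Hand-picked right inverses: G Ginv = 1l_1 and H Hinv = 1l_2.
Ginv = np.array([[1], [0], [0]])
Hinv = np.array([[1, 0], [0, 0], [0, 1]])

# Lemma 1 substitution (over GF(2), + and - coincide):
# Hinv <- (1l_n + G^T (Ginv)^T) Hinv keeps H Hinv = 1l and kills (Hinv)^T Ginv.
n = G.shape[1]
Hinv = (np.eye(n, dtype=int) + G.T @ Ginv.T) @ Hinv % 2

assert np.array_equal(G @ Ginv % 2, np.eye(1, dtype=int))
assert np.array_equal(H @ Hinv % 2, np.eye(2, dtype=int))
assert not np.any(Hinv.T @ Ginv % 2)   # Lemma 1 condition holds
```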

To study the analogy between classical linear binary codes and stabilizer codes, we view a rate k/n classical linear code and its encoding in a slightly unconventional fashion. We specify the encoding by an invertible n × n encoding matrix V over F_2. The code space is defined as

 C = { c̄ = (c : 0^{n−k}) V | c ∈ F_2^k }, (2)

where we use the following notation.

###### Notation 1

For an s-tuple a = (a_1, …, a_s) and a t-tuple b = (b_1, …, b_t) over some alphabet A, we denote by (a : b) the (s+t)-tuple formed by the concatenation of a followed by b.

Given the generator matrix G and parity-check matrix H of a code, the encoding matrix can be fixed to

 V = \begin{pmatrix} G \\ (H^{-1})^T \end{pmatrix}. (3)

This matrix is invertible:

 V^{-1} = ( G^{-1}, H^T ) (4)

and satisfies V V^{-1} = 1l_n following Lemma 1. Clearly, the encoding matrix specifies both the code space and the encoding. The output of the encoding matrix is in the code space if and only if the input is of the form (c : 0^{n−k}) where c ∈ F_2^k. This follows from the equalities H((c : s)V)^T = H G^T c^T + H H^{-1} s^T = s^T.

The encoding matrix also specifies the syndrome associated to each error. When transmitted on a bit-flip channel, a codeword c̄ will result in the message c̄ + p for some p ∈ F_2^n. The error p can be decomposed into an error syndrome s = (H p^T)^T and a logical error l = p G^{-1} as p = (l : s)V. This is conveniently represented by the circuit diagram shown at Fig. 1, in which time flows from left to right. In such diagrams, the inverse is obtained by reading the circuit from right to left, running time backwards. This circuit representation is at the core of our construction of quantum turbo-codes; it greatly simplifies all definitions and analyses.
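The encoding matrix and this syndrome/logical-error decomposition can be checked on a small example. This sketch (assuming numpy; the [3,1] repetition code and the particular right inverses are our own choices) builds V and V^{-1} as in Eqs. (3) and (4) and decomposes an error:

```python
import numpy as np

# Continuing the [3,1] repetition-code example (our choice).
G = np.array([[1, 1, 1]])
H = np.array([[1, 1, 0], [0, 1, 1]])
Ginv = np.array([[1], [0], [0]])
Hinv = np.array([[0, 0], [1, 0], [1, 1]])  # right inverse satisfying Lemma 1

V = np.vstack([G, Hinv.T]) % 2        # encoding matrix, Eq. (3)
Vinv = np.hstack([Ginv, H.T]) % 2     # its inverse, Eq. (4)
assert np.array_equal(V @ Vinv % 2, np.eye(3, dtype=int))

# Encode c = (1): the codeword is (c : 0^{n-k}) V.
c = np.array([[1, 0, 0]])             # (c : 0 0)
codeword = c @ V % 2                  # the all-ones codeword

# Decompose an error p into (l : s) = p V^{-1}.
p = np.array([[1, 0, 0]])
ls = p @ Vinv % 2
l, s = ls[0, :1], ls[0, 1:]
assert np.array_equal(s, (H @ p.T % 2).ravel())   # s is the usual syndrome
assert np.array_equal(l, (p @ Ginv % 2).ravel())  # l is the logical error
```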

A probability distribution P(p) on the error p incurred during transmission induces a probability distribution on logical transformations and syndromes

 P(l, s) = P(p) |_{p = (l : s)V}. (5)

We call P(l, s) the pullback of the probability P(p) through the gate V^{-1}. Maximum likelihood decoding consists in identifying the most likely logical transformation given the syndrome

 l^{ML}(s) = argmax_l P(l | s) (6)

where the conditional probability is defined the usual way

 P(l | s) = P(l, s) / Σ_{l′} P(l′, s). (7)

Similarly, we can define the bit-wise maximum likelihood decoder which performs a local optimization on each logical bit

 l_i^{ML}(s) = argmax_{l_i} P(l_i | s), (8)

where the marginal conditional probability is defined the usual way

 P(l_i | s) = Σ_{l_1, …, l_{i−1}, l_{i+1}, …, l_k} P(l_1, …, l_k | s). (9)
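For very short codes, these decoders can be evaluated by brute force. The following sketch (our own toy example: the 3-bit repetition code over an i.i.d. bit-flip channel with flip probability eps) computes argmax_l P(l | s) by summing the error probability over all errors with the given syndrome; since k = 1 here, the bit-wise and word-wise decoders coincide:

```python
import itertools
import numpy as np

# Brute-force maximum-likelihood decoding for the toy [3,1] repetition
# code, under an i.i.d. bit-flip channel of rate eps (our choices).
H = np.array([[1, 1, 0], [0, 1, 1]])
Ginv = np.array([[1], [0], [0]])
eps = 0.1

def decode(s):
    # Accumulate P(l, s) by summing P(p) over all errors p with syndrome s.
    scores = {0: 0.0, 1: 0.0}
    for p in itertools.product([0, 1], repeat=3):
        p = np.array(p)
        if not np.array_equal(H @ p % 2, s):
            continue
        l = int((p @ Ginv)[0] % 2)
        scores[l] += eps**p.sum() * (1 - eps)**(3 - p.sum())
    return max(scores, key=scores.get)  # argmax_l P(l | s)

# Syndrome (1,0) is produced by the weight-1 error (1,0,0), which flips
# the information bit, so the most likely logical error is l = 1.
assert decode(np.array([1, 0])) == 1
assert decode(np.array([0, 0])) == 0
```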

### Ii-B Convolutional codes

We define now a convolutional code as a linear code whose encoder has the form shown at Fig. 2. The circuit is built from repeated uses of a linear invertible seed transformation U acting on n + m bits, shifted by n bits from one use to the next. In this circuit, particular attention must be paid to the order of the inputs as they alternate between syndrome bits and logical bits. The total number of identical repetitions is called the duration of the code and is denoted N. The bits that connect gates from consecutive “time slices” are called memory bits; there are m of them. The encoding is initialized by setting the first m memory bits to 0. There are several ways to terminate the encoding, but we here focus on a padding technique. This simply consists in setting the logical bits of the last t time slices equal to 0, where t is a free parameter independent of N. The rate of the code is thus k(N − t)/(nN).

Note that in this diagram, we use a subscript to denote the different elements of a stream. For instance, P_i denotes the n-bit output string at time i. The jth bit of P_i would be denoted by a second subscript as P_{i,j}, or simply P_j when the particular time is clear from context. This convention will be used throughout the paper.

This definition of a convolutional code differs at first sight from the usual one based on linear filters built from shift registers and feed-back lines. An example of a linear filter for a rate 1/2 (systematic and recursive) convolutional encoder is shown at Fig. 3. Another common description of this encoder would be in terms of its rational transfer function, which relates the D-transform of the output stream to that of the input stream. Remember that the D-transform of a bit stream (a_i)_{i≥0} is given by a(D) = Σ_i a_i D^i. For the code of Fig. 3, the output’s D-transforms are

 p_1(D) = l(D) (10)
 p_2(D) = [(f_0 + f_1 D + … + f_m D^m) / (1 + q_1 D + … + q_m D^m)] l(D) (11)

where the inverse 1/(1 + q_1 D + … + q_m D^m) is the Laurent series defined by long division. The code can also be specified by the recursion relation

 w_i^j = w_{i−1}^{j−1}  for j > 1,
 w_i^1 = l_i + Σ_{j=1}^m q_j w_{i−1}^j,
 p_i^2 = f_0 ( Σ_{j=1}^m q_j w_{i−1}^j + l_i ) + Σ_{j=1}^m f_j w_{i−1}^j
       = f_0 l_i + Σ_{j=1}^m (f_j + f_0 q_j) w_{i−1}^j.
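These recursion relations translate directly into code. The sketch below (plain Python; the tap polynomials f(D) = 1 + D² and q(D) = 1 + D + D², with m = 2, are our own choice of example) encodes a logical stream with a rate-1/2 systematic recursive encoder:

```python
# Rate-1/2 systematic recursive encoder implementing the recursion above,
# with m = 2 and taps f(D) = 1 + D^2, q(D) = 1 + D + D^2 (our example).
f = [1, 0, 1]   # f_0, f_1, f_2
q = [1, 1]      # q_1, q_2
m = 2

def encode(logical):
    w = [0] * m                      # shift-register (memory) contents
    out = []
    for l in logical:
        fb = (l + sum(qj * wj for qj, wj in zip(q, w))) % 2  # w^1_i
        p2 = (f[0] * fb + sum(f[j] * w[j - 1] for j in range(1, m + 1))) % 2
        out.append((l, p2))          # p^1_i = l_i (systematic), p^2_i
        w = [fb] + w[:-1]            # shift: w^j_i = w^{j-1}_{i-1}
    return out

# Impulse input 1,0,0,...: the parity stream is the long-division
# expansion of f(D)/q(D).
stream = encode([1, 0, 0, 0, 0, 0])
```

Feeding in a single 1 followed by zeros produces a parity stream that never dies out, which is the infinite impulse response characteristic of a recursive encoder.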

These definitions are in fact equivalent to the circuit of Fig. 2 with the seed transformation specified by Fig. 4. Note that we can assume without loss of generality that f_m = 1 or q_m = 1 (or both), and these two cases lead to different seed transformations. The generalization to arbitrary linear filters is straightforward. In terms of matrices, the seed transformation associated to this convolutional code maps the memory, logical, and syndrome inputs to the physical and memory outputs, with U given by

 U = \begin{pmatrix} μ_P & μ_M \\ Λ_P & Λ_M \\ Σ_P & Σ_M \end{pmatrix}, (12)

in which the three row blocks have m, k, and n−k rows respectively, and the two column blocks have n and m columns (acting on P and M respectively),

where

 μ_P = \begin{pmatrix} 0 & f_1 + f_0 q_1 \\ ⋮ & ⋮ \\ 0 & f_m + f_0 q_m \end{pmatrix},  μ_M = \begin{pmatrix} q_1 & & & \\ ⋮ & & 1l_{m−1} & \\ q_m & 0 & ⋯ & 0 \end{pmatrix},

Λ_P = (1, f_0), and Λ_M = (1, 0, …, 0). The two other components, Σ_P and Σ_M, depend on which of the two cases mentioned above holds, and in each case they are fixed by the requirement that U be invertible.

Not only does the circuit of Fig. 2 produce the same encoding as the linear filter of Fig. 3, it also has the same memory states. More precisely, the value contained in the jth shift register at time i in Fig. 3 is equal to the value of the jth memory bit between gates i and i+1 on Fig. 2. This is important because it allows us to define the state diagram (see Sec. IV-B) directly from the circuit diagram Fig. 4.

Of particular interest are systematic recursive encoders that are defined as follows.

###### Definition 1 (Systematic encoder)

An encoder is systematic when the input stream is a sub-stream of the output stream.

###### Definition 2 (Recursive encoder)

A convolutional encoder is recursive when its rational transfer function involves genuine Laurent series (as opposed to simple polynomials).

Systematic encoders copy the input stream in clear in one of the output streams. Typically they have transfer functions of the form p_i(D) = l_i(D) for 1 ≤ i ≤ k and arbitrary p_i(D) for i > k, so p_i is a copy of l_i. The systematic character of the code considered in the above example is most easily seen from Fig. 3: p_1 is a copy of the input l. Systematic encoders are used to avoid catastrophic error propagation. This term will be defined formally in the quantum setting, but it essentially means that an error affecting a finite number of physical bits is mapped to a logical transformation on an infinite number of logical bits by the encoder inverse. Catastrophic encoders cannot be used directly in standard turbo-code schemes. The problem is that the first iteration of iterative decoding does not provide information on the logical bits. This is due to the fact that, as the length of the convolutional encoder tends to infinity and in the absence of prior information about the value of the logical bits, the logical bit error rate after decoding tends to 1/2.

A recursive encoder has an infinite impulse response: on an input of Hamming weight 1, it creates an output of infinite weight for a code of infinite duration N. Recursiveness is also related to the presence of feed-back in the encoding circuit, which is easily understood from the linear filter of Fig. 3. Except when the feed-back polynomial 1 + q_1 D + … + q_m D^m divides f_0 + f_1 D + … + f_m D^m, an encoder with feed-back will be recursive. It is essential to use recursive convolutional codes as constituents in classical turbo-code schemes to obtain families of turbo-codes of unbounded minimum distance and with performances which improve with the block size.

## III Quantum Mechanics and Quantum Codes

In this section, we review some basic notions of quantum mechanics, the stabilizer formalism, and the decoding problem for quantum codes. In Sec. III-B, stabilizer codes are defined the usual way, as subspaces of the Hilbert space stabilized by an Abelian subgroup of the Pauli group. We detail in Sec. III-C how these codes are decoded. Even though a stabilizer code is a continuous space, it can be defined and studied by using only discrete objects (parity-check matrix, encoding matrix, syndrome) which are quite close to those of classical linear codes. We discuss in Sec. III-D the relations between such quantum codes and classical linear codes but also highlight the crucial distinctions between them. Particular emphasis is put on the role of the encoder because it is a crucial ingredient for our definition of quantum turbo-codes. The encoder also provides an intuitive picture for the logical cosets, which are an important distinction between classical codes and quantum stabilizer codes.

### III-A Qubits and the Pauli group

A qubit is a physical system whose state is described by a unit-length vector in a two-dimensional Hilbert space. The two vectors of a given orthonormal basis are conventionally denoted by |0⟩ and |1⟩. We identify the Hilbert space with C^2 in the usual way with the help of such a basis. The state of a system comprising n qubits is a unit-length vector in the tensor product of n two-dimensional Hilbert spaces. It is a space of dimension 2^n which can be identified with C^{2^n}. It has a basis given by all tensor products of the form |x_1⟩ ⊗ ⋯ ⊗ |x_n⟩, where x_i ∈ {0, 1}, and the inner product between two basis elements |x_1⟩ ⊗ ⋯ ⊗ |x_n⟩ and |y_1⟩ ⊗ ⋯ ⊗ |y_n⟩ is the product of the inner products ⟨x_i|y_i⟩. In other words, this basis is orthonormal. It will be convenient to use the following notation.

###### Notation 2
 |0^n⟩ ≜ |0⟩ ⊗ ⋯ ⊗ |0⟩ (n times).

The error model we consider in this paper is a Pauli-memoryless channel which is defined with the help of the three Pauli matrices

 X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},  Y = \begin{pmatrix} 0 & −i \\ i & 0 \end{pmatrix},  Z = \begin{pmatrix} 1 & 0 \\ 0 & −1 \end{pmatrix}.

These matrices anti-commute with each other and satisfy the following multiplication table

| × | X | Y | Z |
|---|---|---|---|
| X | I | iZ | −iY |
| Y | −iZ | I | iX |
| Z | iY | −iX | I |

where I denotes the 2 × 2 identity matrix. The action of these operators on the state |ψ⟩ of a qubit is obtained by matrix multiplication |ψ⟩ ↦ P|ψ⟩, with |ψ⟩ viewed as an element of C^2.

These matrices generate the Pauli group 𝒢_1, which is readily seen to be the set

 {±I,±iI,±X,±iX,±Y,±iY,±Z,±iZ}.

They also form all the errors which may affect one qubit in our error model. If we have an n-qubit system, then the errors which may affect it belong to the Pauli group over n qubits which is defined by

 𝒢_n = 𝒢_1^{⊗n} = { ε P_1 ⊗ ⋯ ⊗ P_n | ε ∈ {±1, ±i}, P_i ∈ {I, X, Y, Z} }.

This group is generated by iI and by the X_i’s and Z_i’s for 1 ≤ i ≤ n, which are defined by:

###### Notation 3
 X_i ≜ I ⊗ ⋯ ⊗ I (i−1 times) ⊗ X ⊗ I ⊗ ⋯ ⊗ I (n−i times),
 Z_i ≜ I ⊗ ⋯ ⊗ I (i−1 times) ⊗ Z ⊗ I ⊗ ⋯ ⊗ I (n−i times).

In quantum mechanics two states are physically indistinguishable if they differ by a multiplicative constant. This motivates the definition of another group of errors, called the effective Pauli group, obtained by taking the quotient of 𝒢_n by {±I, ±iI}.

###### Definition 3 (Effective Pauli group)

The effective Pauli group G_n on n qubits is the set of equivalence classes [P] for P in 𝒢_n, where the equivalence class [P] is the set of elements of 𝒢_n which differ from P by a multiplicative constant. We will also use the notation G ≜ G_1.

All the effective Pauli groups are Abelian. G_1 is isomorphic to F_2^2, where the group operation of G_1 corresponds to bitwise addition over F_2^2. As a consequence, effective Pauli operators can be represented by pairs of bits. We will henceforth make use of the following representation

 I ↔ (0,0) (13) X ↔ (1,0) (14) Y ↔ (1,1) (15) Z ↔ (0,1) (16)

Note that G_n ≅ G_1^n, and we will either view, depending on the context, an element of G_n as an n-tuple with entries in G_1 or as a 2n-tuple with entries in F_2 obtained by replacing each coordinate by its corresponding binary representation. G_n is generated by the [X_i] and [Z_i], and we introduce the following notation.

###### Notation 4

For a = (a_1, …, a_n) in F_2^n, we denote by X(a) and Z(a) the only elements of G_n satisfying:

1. X(a) = [X_1^{a_1} ⋯ X_n^{a_n}], and

2. Z(a) = [Z_1^{a_1} ⋯ Z_n^{a_n}].

An important property of G_n is that any pair of elements either commutes or anti-commutes (when viewed as elements of 𝒢_n). This leads to the definition of an inner product “⋆” for elements P = (P_1, …, P_n) and Q = (Q_1, …, Q_n) of G_n such that P ⋆ Q ≜ Σ_{i=1}^n P_i ⋆ Q_i mod 2. Here, P_i ⋆ Q_i = 0 if P_i = I, Q_i = I, or P_i = Q_i; and P_i ⋆ Q_i = 1 otherwise.

###### Fact 1

P and Q commute if and only if P ⋆ Q = 0.
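Fact 1 can be verified exhaustively on one qubit. This sketch (assuming numpy) represents each effective Pauli operator by its binary pair and compares the ⋆ product against actual matrix commutation:

```python
import itertools
import numpy as np

# Check Fact 1 on single-qubit Paulis: [P] and [Q] commute iff P*Q = 0.
# Binary representation: I=(0,0), X=(1,0), Y=(1,1), Z=(0,1).
paulis = {
    "I": (np.eye(2, dtype=complex), (0, 0)),
    "X": (np.array([[0, 1], [1, 0]], dtype=complex), (1, 0)),
    "Y": (np.array([[0, -1j], [1j, 0]]), (1, 1)),
    "Z": (np.array([[1, 0], [0, -1]], dtype=complex), (0, 1)),
}

def star(a, b):
    # (x, z) * (x', z') = x z' + z x' mod 2: the symplectic form Lambda_1.
    return (a[0] * b[1] + a[1] * b[0]) % 2

for (P, a), (Q, b) in itertools.product(paulis.values(), repeat=2):
    commute = np.allclose(P @ Q, Q @ P)
    assert commute == (star(a, b) == 0)
```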

This product can also be defined with the help of the following matrix Λ_n, which will appear again later in the definition of symplectic matrices.

###### Notation 5
 Λ_n ≜ 1l_n ⊗ X.

By viewing now elements of G_n as binary 2n-tuples we have:

###### Definition 4 (Inner product)

Define the inner product ⋆ by P ⋆ Q ≜ P Λ_n Q^T.

G_n is an F_2-vector space and we use the inner product to define the orthogonal space of a subspace of G_n as follows.

###### Definition 5 (Orthogonal subspace)

Let V be a subset of G_n. We define V^⊥ by

 V^⊥ ≜ { P ∈ G_n : P ⋆ Q = 0 for every Q ∈ V }.

V^⊥ is always a subspace of G_n, and if the space spanned by V is of dimension d, then V^⊥ is of dimension 2n − d.
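The dimension statement can be checked by brute force on a small instance. In this sketch (assuming numpy; the choice of V spanned by X⊗X and Z⊗Z is our own example with n = 2), we enumerate all binary 4-tuples and keep those orthogonal to both generators:

```python
import itertools
import numpy as np

# Brute-force check of the dimension claim for n = 2: take V spanned by
# the binary images of X⊗X and Z⊗Z and enumerate V^⊥ exhaustively.
Lam = np.kron(np.eye(2, dtype=int), np.array([[0, 1], [1, 0]]))  # Λ_2
gens = np.array([[1, 0, 1, 0],    # X⊗X in coordinates (x1,z1,x2,z2)
                 [0, 1, 0, 1]])   # Z⊗Z

orth = [v for v in itertools.product([0, 1], repeat=4)
        if not np.any(gens @ Lam @ np.array(v) % 2)]

# dim V = 2, so V^⊥ has dimension 2n - 2 = 2, i.e. 2^2 = 4 elements.
assert len(orth) == 4
```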

Since two states are indistinguishable if they differ by a multiplicative constant, a Pauli error need only be specified by the effective Pauli group equivalence class to which it belongs. A very important quantum error model is the depolarizing channel. It is in a sense the quantum analogue of the binary symmetric channel.

###### Definition 6 (Depolarizing channel)

The depolarizing channel on n qubits of error probability p is an error model where all the errors which occur belong to G_n and the probability that a particular element E is chosen is equal to (p/3)^{w(E)} (1 − p)^{n − w(E)}, where the weight w(E) of a Pauli error is given by

###### Notation 6

w(E) is the number of coordinates of E which differ from I.

In other words, the coordinates of the error are chosen independently: there is no error on a given coordinate with probability 1 − p and there is an error on it of type X, Y or Z each with probability p/3.
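This coordinate-wise description gives a direct way to sample the channel. The sketch below (plain Python; the function names are ours) draws an error from the n-qubit depolarizing channel and evaluates the probability formula of Definition 6:

```python
import random

# Sampling the n-qubit depolarizing channel of error probability p: each
# coordinate independently stays I w.p. 1-p, or becomes X, Y or Z each
# w.p. p/3.
def depolarize(n, p, rng=random):
    return "".join(
        rng.choice("XYZ") if rng.random() < p else "I" for _ in range(n)
    )

# The probability of a given error E is (p/3)^w(E) (1-p)^(n-w(E)),
# where w(E) counts the non-identity coordinates.
def prob(error, p):
    w = sum(c != "I" for c in error)
    return (p / 3) ** w * (1 - p) ** (len(error) - w)
```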

### III-B Stabilizer codes: Hilbert space perspective

A quantum error correction code protecting a system of k qubits by embedding them in a larger system of n qubits is a 2^k-dimensional subspace of C^{2^n}. We say that it is a quantum code of length n and rate k/n. It can be specified by a unitary transformation V:

 𝒞 = { |ψ̄⟩ = V (|ψ⟩ ⊗ |0^{n−k}⟩) | |ψ⟩ ∈ C^{2^k} }. (17)

This definition directly reflects Eq. (2). As in the classical case, the matrix V specifies not only the code but also the encoding, that is the particular embedding of C^{2^k} into C^{2^n}. An important distinction however is that in the quantum case, the dimension of the matrix V is exponential in the number of qubits n. To obtain an efficiently specifiable code, we choose V from a subgroup of the unitary group over C^{2^n} called the Clifford group. In fact, not only are Clifford transformations over n qubits efficiently specifiable, they can also be implemented efficiently by a quantum circuit involving only elementary quantum gates on 1 and 2 qubits (see Theorem 10.6 in [30] for instance).

###### Definition 7 (Clifford transformation and Clifford group)

A Clifford transformation over n qubits is a unitary transformation over C^{2^n} which leaves the Pauli group over n qubits globally invariant under conjugation:

 V 𝒢_n V† = 𝒢_n.

The set of Clifford transformations is a group and is called the Clifford group over qubits.

This definition naturally leads to the action of the Clifford group on elements of the Pauli group.

###### Definition 8 (Action of Clifford transformation on Pauli)

A Clifford transformation V acts on the Pauli group as

 𝒢_n → 𝒢_n
 P ↦ P′ = V P V†.

It also acts on the effective Pauli group by the mapping [P] ↦ [V P V†].

The last mapping is F_2-linear and there is a square binary matrix of size 2n, which we also denote by V by a slight abuse of notation, such that

 [V P V†] = [P] V.

This matrix will be called the encoding matrix.

###### Definition 9 (Encoding matrix)

The encoding matrix associated to an encoding operation V, which is a Clifford transformation over n qubits, is the binary matrix of size 2n × 2n, also denoted V, such that for any P ∈ 𝒢_n we have

 [V P V†] = [P] V.

Clearly then, a Clifford transformation on n qubits can be specified by its associated encoding matrix on F_2^{2n} together with a collection of 2n phases. This shows that Clifford transformations are efficiently specifiable as claimed. It can readily be verified that the rows of V, denoted V_1, …, V_{2n}, are equal to

 V_{2i−1} = [V X_i V†] = X_i V, (18)
 V_{2i} = [V Z_i V†] = Z_i V. (19)

Since conjugation by a unitary matrix does not change the commutation relations, the above equations imply that the encoding matrix is a symplectic matrix, whose definition is recalled below.

###### Definition 10 (Symplectic transformation)

An n-qubit symplectic transformation is a 2n × 2n matrix U over F_2 that satisfies

 U Λ_n U^T = Λ_n.

By definition, symplectic transformations are invertible and preserve the inner product between n-qubit effective Pauli group elements. Conversely, every symplectic matrix corresponds to a (non-unique) Clifford transformation.
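The symplectic condition is easy to test numerically. As an example (assuming numpy), the binary encoding matrix of the CNOT gate, a standard two-qubit Clifford transformation, can be checked against Definition 10:

```python
import numpy as np

# Binary encoding matrix of the CNOT gate, rows ordered X_1, Z_1, X_2, Z_2
# and coordinates ordered (x1, z1, x2, z2). Under conjugation, CNOT maps
# X⊗I -> X⊗X, Z⊗I -> Z⊗I, I⊗X -> I⊗X, I⊗Z -> Z⊗Z.
U = np.array([
    [1, 0, 1, 0],   # [CNOT X_1 CNOT†] = X⊗X
    [0, 1, 0, 0],   # [CNOT Z_1 CNOT†] = Z⊗I
    [0, 0, 1, 0],   # [CNOT X_2 CNOT†] = I⊗X
    [0, 1, 0, 1],   # [CNOT Z_2 CNOT†] = Z⊗Z
])

Lam = np.kron(np.eye(2, dtype=int), np.array([[0, 1], [1, 0]]))  # Λ_2
assert np.array_equal(U @ Lam @ U.T % 2, Lam)  # U is symplectic
```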

A stabilizer code is thus a quantum code specified by Eq. (17), but with V in the Clifford group. The code (but not the encoding) can equivalently be specified with n−k independent mutually commuting elements of Gn of order 2 as follows:

###### Definition 11 (Stabilizer code)

The stabilizer code associated to the stabilizer set {H1,…,Hn−k}, where the Hi's are independent mutually commuting elements of Gn of order 2 and different from −I, is the subspace of (ℂ²)⊗n of elements stabilized by the Hi's, that is

 C = { |ψ̄⟩ | Hi|ψ̄⟩ = |ψ̄⟩, 1 ≤ i ≤ n−k }. (20)

This is the usual definition of stabilizer codes. The Hi play a role analogous to the rows of the parity-check matrix of a classical linear code, and this connection will be formalized in Subsection III-D. To see the equivalence between this definition and Eq. (17), set Hi = VZk+iV†. These operators are independent and of order 2 since they are conjugate to the Zk+i, which are independent and of order 2. Now, consider a state |ψ̄⟩ = V(|ψ⟩⊗|0n−k⟩) as defined in Eq. (17). For all 1 ≤ i ≤ n−k, we have

 Hi|ψ̄⟩ = VZi+kV† V(|ψ⟩⊗|0n−k⟩)   (21)
        = V(|ψ⟩⊗Zi|0n−k⟩) = |ψ̄⟩,   (22)

where we used the fact that Zi|0n−k⟩ = |0n−k⟩. Hence, |ψ̄⟩ satisfies the condition of Def. 11. Conversely, for any state |ψ̄⟩ satisfying Def. 11, we have

 Zk+iV†|ψ̄⟩ = V†Hi|ψ̄⟩   (23)
            = V†|ψ̄⟩,   (24)

which implies that the (k+i)-th qubit of V†|ψ̄⟩ must be in state |0⟩. Since this holds for all 1 ≤ i ≤ n−k, we conclude that the two definitions are equivalent. This equivalence has the following consequence:

###### Fact 2

A stabilizer code of length n associated to n−k independent generators is of dimension 2^k.

Since I, X, Y and Z are all of order 2, all the elements of order 2 in Gn are of the form ±P, where P is a tensor product of n matrices all chosen among the set {I,X,Y,Z}. Thus, we can specify the generators of the stabilizer code by giving only the associated effective Pauli group elements [Hi] together with a sign for each generator. Changing the sign of a stabilizer generator changes the code, but not its properties (this is strictly true for the Pauli channels considered here; for a general noise model, error correcting properties may actually depend on the sign of the stabilizer generators). More precisely, the set of Pauli errors which can be corrected by such a code does not depend on the signs which have been chosen. Hence, we can specify a family of “equivalent” codes by specifying, instead of the Hi's, the set of [Hi]. It is important to note that these elements have to be orthogonal: the fact that the Hi's commute translates into the orthogonality condition [Hi] ⋆ [Hj] = 0. Thus, the [Hi] span a linear space called the stabilizer space, that we denote C(I) for reasons that will become apparent later.

Thus, in analogy with classical linear codes, a stabilizer code (or more precisely an equivalence class thereof) can be efficiently specified by a 2n×2n binary encoding matrix. This matrix also provides an efficient description of the encoding up to a set of phases. There is another analogy with a classical encoding matrix that will be crucial for our definition of quantum turbo-codes. Assume that we concatenate two stabilizer codes and that these codes are encoded by Clifford transformations. The result of the concatenation is also a stabilizer code (because Clifford transformations form a group) and the resulting encoding matrix is just the product of the two encoding matrices of each constituent code. This reflects the fact that the encoding matrices provide a representation of the Clifford group.

###### Fact 3

Let V1 and V2 be two Clifford transformations over n qubits with encoding matrices V1 and V2 respectively (we use the same symbol for a Clifford transformation and its encoding matrix). Then V2V1 is a Clifford transformation with encoding matrix V1V2.

{proof}

Consider the Clifford transformation V = V2V1. It suffices to verify the statement on a generating set of the Pauli group:

 [VXiV†] = [V2V1XiV1†V2†] = [V1XiV1†]V2 = XiV1V2

The second equality above uses the fact that V1XiV1† belongs to Gn. The same kind of result holds for the Zi's and this completes the proof.
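Fact 3 can be illustrated directly on binary matrices. The sketch below (a toy example, using the same interleaved x/z convention assumed earlier) composes the encoding matrices of a Hadamard on qubit 1 followed by a CNOT; note the order of the factors is reversed relative to operator composition.

```python
import numpy as np

# Encoding matrices (rows X1,Z1,X2,Z2; interleaved x/z convention assumed):
# H on qubit 1 swaps X1 <-> Z1;  CNOT(1->2): X1 -> X1X2, Z2 -> Z1Z2.
H1 = np.array([[0,1,0,0],[1,0,0,0],[0,0,1,0],[0,0,0,1]])
CNOT = np.array([[1,0,1,0],[0,1,0,0],[0,0,1,0],[0,1,0,1]])

# Fact 3: applying H first, then CNOT, gives encoding matrix H1 @ CNOT
# (order reversed relative to the unitary composition CNOT * H).
M = H1 @ CNOT % 2

# The composite maps X1 -> Z1 (H sends X1 to Z1, which CNOT leaves alone):
assert np.array_equal(M[0], [0, 1, 0, 0])

# And the product is still symplectic.
L = np.kron(np.eye(2, dtype=int), np.array([[0, 1], [1, 0]]))
assert np.array_equal(M @ L @ M.T % 2, L)
```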

### III-C Decoding

When transmitted on a Pauli channel, an encoded state |ψ̄⟩ = V(|ψ⟩⊗|0n−k⟩) (where |ψ⟩ belongs to (ℂ²)⊗k) will result in a state P|ψ̄⟩ for some P ∈ Gn. Upon inverting the encoding we obtain the state

 V†P|ψ̄⟩ = V†PV(|ψ⟩⊗|0n−k⟩) = (L|ψ⟩)⊗(S|0n−k⟩),

where L belongs to Gk and S = S1⊗⋯⊗Sn−k belongs to Gn−k (and the Si's to G1). Notice that S|0n−k⟩ is equal, up to a phase, to |s⟩ where s = (s1,…,sn−k) and

 si = 0 if Si ∈ {I,Z},   (26)
 si = 1 otherwise.   (27)

Measuring the last n−k qubits reveals s, which is the analogue of a classical syndrome. This motivates the following definition.

###### Definition 12 (Error syndrome)

The syndrome associated to a Pauli error P is the binary vector s(P) = (s1,…,sn−k) defined by Equations (26) and (27).

Note that the syndrome can be obtained from the Hi's (which are defined as in the previous subsection by Hi = VZk+iV†) by

###### Proposition 1
 s(P)=([P]⋆Hi)1≤i≤n−k.
{proof}

By definition, si is equal to [V†PV] ⋆ Zk+i, and [V†PV] = [P]V−1. Since symplectic transformations preserve the symplectic inner product we deduce that si = [P]V−1 ⋆ Zk+i = [P] ⋆ Zk+iV = [P] ⋆ [Hi].

This proposition motivates the following definition of a parity-check matrix of a stabilizer code

###### Definition 13 (Parity-check matrix)

The parity-check matrix of a quantum code with stabilizer set {H1,…,Hn−k} is the binary matrix of size (n−k)×2n with rows [Hi].
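Definitions 12 and 13 can be illustrated on the 3-qubit bit-flip code (a standard textbook example, not one of the paper's constructions), with the interleaved x/z binary convention assumed throughout these sketches: the syndrome is obtained by ⋆-multiplying the error with each row of the parity-check matrix.

```python
import numpy as np

def star(p, q):
    """Symplectic product of two effective Pauli vectors
    (interleaved convention p = (x1,z1,x2,z2,...), assumed here)."""
    return int(np.dot(p[0::2], q[1::2]) + np.dot(p[1::2], q[0::2])) % 2

def syndrome(P, H):
    """s(P) = ([P] * Hi) for each row Hi of the parity-check matrix H."""
    return [star(P, Hi) for Hi in H]

# 3-qubit bit-flip code: stabilizer generators Z1Z2 and Z2Z3.
H = np.array([[0, 1, 0, 1, 0, 0],   # Z1 Z2
              [0, 0, 0, 1, 0, 1]])  # Z2 Z3
X2 = np.array([0, 0, 1, 0, 0, 0])   # error X on qubit 2
assert syndrome(X2, H) == [1, 1]    # anticommutes with both generators
```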

The calculation of the syndrome depends only on the effective Pauli error [P]. As we did for classical errors in Sec. II-A, it will be convenient to decompose the error as [P] = (L : S)V, with L ∈ Gk and S ∈ Gn−k. Like in the classical case, this is conveniently represented by the circuit diagram of Fig. 5. At this point however, the analogy with the classical case partially breaks down. As described in Section II-A, in the classical setting a bit-flip error decomposes into a logical part and a syndrome part; the syndrome part is the error syndrome and is therefore known. Decoding then consists in identifying the most likely logical part given knowledge of the syndrome. In the quantum case however, S is only partially determined by the error syndrome s. Indeed, we can decompose S as S = Sx + Sz (c.f. Notation 4), and notice that from (26) and (27), the syndrome reveals only Sx. More precisely, we have the following relation for the i-th component of Sx:

 Sxi = X if si = 1,
 Sxi = I otherwise.

Hence, two physical errors (L : Sx+Sz)V and (L : Sx+Sz′)V have the same error syndrome (by a slight abuse of terminology, we use the one-to-one correspondence between s and Sx to refer to both quantities as the error syndrome), so they cannot be distinguished. However, they also yield the same logical transformation L, so they can be corrected by the same operation (namely applying L again). Therefore, they cannot and need not be distinguished by the error syndrome: such errors are called degenerate. This reflects the fact that all errors of the form (I : Sz)V (with Sz ∈ {I,Z}n−k) have zero syndrome but do not need to be corrected. Such errors are singled out by the following definition.

###### Definition 14 (Harmless undetected errors)

The set of errors of the form (I : Sz)V, where Sz ranges over {I,Z}n−k, is called the set of harmless undetected errors.

All the other errors of zero syndrome (and which are therefore undetected) have a non-trivial action on the first k qubits after inverting the encoding transformation. This motivates the following definition.

###### Definition 15 (Harmful undetected errors)

The set of errors of the form (L : Sz)V, where Sz ranges over {I,Z}n−k and L ranges over Gk and is different from the identity, is called the set of harmful undetected errors.

Note that the set of errors of the form (I : Sz)V with Sz in {I,Z}n−k is also the subgroup spanned by the rows ZiV for k+1 ≤ i ≤ n, or what is the same, the subgroup spanned by the [Hi] for 1 ≤ i ≤ n−k. In other words:

###### Proposition 2

The set of harmless undetected errors is equal to C(I).

The fact that there are errors which do not need to be corrected has an important consequence. Contrary to the classical setting, where the most likely error satisfying the measured syndrome is sought, in the quantum case we look for the most likely coset of C(I) satisfying the measured syndrome. Such a coset is the set of errors of the form (L : Sx+Sz)V for fixed L and Sx.

###### Definition 16 (Logical coset)

Given an encoding matrix V, the logical coset associated to the logical transformation L ∈ Gk and to the syndrome Sx (belonging to {I,X}n−k) is defined as

 C(L,Sx) = {P = (L : Sz+Sx)V | Sz ∈ {I,Z}n−k} = (L : Sx)V + C(I).

When Sx is trivial we simply write C(L) instead of C(L,Sx).

What replaces the classical probability that a given information sequence has been sent given a measured syndrome is, in the quantum case, the probability that applying the transformation L to the first k qubits after performing the inverse of the encoding operation corrects the error on these qubits. It corresponds to the probability that the error belongs to the coset C(L,Sx), which is therefore equal to

 P(L|Sx) = P(L,Sx) / ∑L′ P(L′,Sx). (28)

where the probability P(L,Sx) is the pullback of the channel error probability P(P) through the encoding matrix:

 P(L,Sx) = ∑Sz∈{I,Z}n−k P(P)∣P=(L:Sx+Sz)V. (29)

Similarly to the classical setting, maximum likelihood decoding consists in identifying the most likely logical transformation L given the syndrome Sx. More formally:

###### Definition 17 (Maximum likelihood decoder)

The maximum likelihood decoder is defined by

 LML(Sx)=argmaxLP(L|Sx) (30)
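To make Eqs. (28)–(30) concrete, the following brute-force sketch computes the coset probabilities for the 3-qubit bit-flip code under an asymmetric independent X/Z channel. This is an illustrative toy example: the channel, the error rates, and the interleaved binary convention are assumptions, not the paper's constructions.

```python
import itertools
import numpy as np

def star(p, q):
    """Symplectic product (interleaved x/z convention assumed)."""
    return int(np.dot(p[0::2], q[1::2]) + np.dot(p[1::2], q[0::2])) % 2

# 3-qubit bit-flip code: stabilizers Z1Z2, Z2Z3; logicals X1X2X3, Z1Z2Z3.
H = [np.array([0, 1, 0, 1, 0, 0]), np.array([0, 0, 0, 1, 0, 1])]
Xbar, Zbar = np.array([1, 0, 1, 0, 1, 0]), np.array([0, 1, 0, 1, 0, 1])

def channel_prob(E, px=0.1, pz=0.05):
    """Independent X and Z flips on each qubit (example channel)."""
    x, z = E[0::2], E[1::2]
    return np.prod(np.where(x, px, 1 - px)) * np.prod(np.where(z, pz, 1 - pz))

def coset_probs(syndrome):
    """P(L, Sx) of Eq. (29): total probability of each logical coset."""
    probs = {}
    for bits in itertools.product([0, 1], repeat=6):
        E = np.array(bits)
        if [star(E, Hi) for Hi in H] != list(syndrome):
            continue
        # Two errors with equal syndrome lie in the same coset iff they
        # act identically on the logical operators.
        label = (star(E, Xbar), star(E, Zbar))
        probs[label] = probs.get(label, 0.0) + channel_prob(E)
    return probs

# Syndrome (1,1): the ML coset is the one containing the weight-1 error X2.
X2 = np.array([0, 0, 1, 0, 0, 0])
probs = coset_probs((1, 1))
assert max(probs, key=probs.get) == (star(X2, Xbar), star(X2, Zbar))
```

Note how the Y2 error has the same syndrome as X2 but a different logical action; the decoder ranks whole cosets, summing over the degenerate errors within each.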

The classical MAP decoding (or bit-wise decoding) also has a quantum analogue:

###### Definition 18 (Qubit-wise maximum likelihood decoder)

The qubit-wise maximum likelihood decoder is defined by

 LiML(Sx)=argmaxLiP(Li|Sx) (31)

where the marginal conditional probability is defined in the usual way:

 P(Li|Sx)=∑L1,…Li−1,Li+1,…LkP(L1,…Lk|Sx). (32)

Equation (29) differs from its classical analogue Eq. (5) by a summation over Sz, which reflects the coset structure of the code. Aside from this distinction, the maximum-likelihood decoders are defined as in the classical case.

### III-D Comparison between stabilizer codes and classical linear codes

One of the main advantages of the stabilizer formalism is that it allows one to discretize a seemingly continuous problem by studying the effect of Pauli errors (which are discrete) on the continuous code subspace. By classifying these errors, discrete quantities such as error syndromes or parity-check matrices arise naturally. In other words, stabilizer codes share many analogies with classical linear codes, but there are also some fundamental differences. Let us summarize these analogies and differences here. We assume in what follows that the relevant quantum quantities are defined for a stabilizer code of length n and rate k/n.

Syndrome and parity-check matrix. The parity-check matrix is a binary matrix of size (n−k)×2n. It differs from a classical parity-check matrix in two respects:

1. Its rows must be orthogonal with respect to the ⋆-product,

2. The syndrome of a Pauli error P in Gn is defined with the help of the ⋆-product (rather than by standard matrix multiplication): s(P) = ([P] ⋆ Hi)1≤i≤n−k.

Encoding matrix. It is a binary matrix of size 2n×2n and must be a symplectic matrix (and any symplectic matrix is the encoding matrix of a certain stabilizer code). Because it is symplectic, it plays a role analogous to both the classical encoding matrix and its inverse. Like the classical encoding matrix Eq. (3), it contains a generator matrix as a sub-matrix. Like the inverse of the classical encoding matrix Eq. (4), it also contains a parity-check matrix as a sub-matrix. The parity-check matrix is formed of the rows ZiV for k+1 ≤ i ≤ n, while the generator matrix consists of the rows XiV and ZiV for 1 ≤ i ≤ k. The remaining rows XiV for k+1 ≤ i ≤ n are sometimes referred to as “pure errors” [34]. Indeed, taking the rows of V as generators of Gn, the syndrome associated to an element of Gn depends only on its pure-error component. Hence, their classical analogue is the corresponding sub-matrix appearing in the classical encoding matrix Eq. (4).

The encoding matrix is associated to a (continuous) unitary encoding transformation V. Like in the classical case, the natural decoding process consists in inverting V and measuring the last n−k qubits, which yields a syndrome that is associated to a parity-check matrix.

Code. We may define the discrete stabilizer code as in the classical setting as the set of errors with zero syndrome, that is

###### Definition 19 (Discrete stabilizer code)

The discrete stabilizer code associated to the stabilizer set {H1,…,Hn−k}, where the Hi's are independent mutually orthogonal elements of Gn, is the subspace of Gn orthogonal to the Hi's, that is

 C={P∈Gn | Hi⋆P=0,1≤i≤n−k}, (33)

or more succinctly, the ⋆-orthogonal complement of the row space of the parity-check matrix.

Codewords. There is an important difference between the classical setting and the quantum setting here. Since all elements of a coset of C(I) have the same effect on the code space, we make no distinction between the elements of such cosets. Therefore the codewords in the quantum setting are grouped in cosets of C(I). Note that all elements of the coset C(I) are the analogue of the zero codeword. With the notation introduced in the previous subsection we have

 C=⋃L∈GkC(L). (34)

Minimum distance. In the classical setting, the minimum distance of a linear code is the smallest Hamming weight of a non-zero codeword. This definition carries over to the quantum setting with the coset C(I) playing the role of the zero codeword. Thus, the minimum distance of a code is the minimum weight of an element of C ∖ C(I). With this definition of the minimum distance d, it is straightforward to check that the number of errors which are corrected by a decoder which outputs the coset containing the element of lowest weight and satisfying the syndrome is equal to ⌊(d−1)/2⌋.
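The quantum minimum distance can be computed by brute force for the toy 3-qubit bit-flip code used in the earlier sketches (same assumed interleaved convention). Although this code corrects any single X error, its quantum distance is 1, because the zero-syndrome error Z1 lies outside the stabilizer coset C(I).

```python
import itertools
import numpy as np

def star(p, q):
    return int(np.dot(p[0::2], q[1::2]) + np.dot(p[1::2], q[0::2])) % 2

def weight(E):
    """Number of qubits on which the Pauli error acts non-trivially."""
    return sum(1 for i in range(len(E) // 2) if E[2*i] or E[2*i + 1])

# 3-qubit bit-flip code.
H = np.array([[0, 1, 0, 1, 0, 0], [0, 0, 0, 1, 0, 1]])
stabilizer_span = {tuple(np.array(c) @ H % 2)
                   for c in itertools.product([0, 1], repeat=2)}

d = min(weight(np.array(E))
        for E in itertools.product([0, 1], repeat=6)
        if all(star(np.array(E), Hi) == 0 for Hi in H)  # zero syndrome
        and tuple(E) not in stabilizer_span)            # harmful, not harmless

# Z1 commutes with both stabilizers but acts as a logical phase flip.
assert d == 1
```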

Information symbols. There is in the quantum setting a natural notion of information sequence corresponding to a Pauli error P, which consists in taking the element L in Gk such that there exists an Sx in {I,X}n−k for which P ∈ C(L,Sx).

## IV Quantum turbo-codes

In this section, we describe quantum turbo-codes obtained from interleaved serial concatenation of quantum convolutional codes. This first requires the definition of quantum convolutional codes. We will define them through their circuit representation as in [31] rather than through their parity-check matrix as in [15, 18, 1]: this allows us to define the state diagram in a natural way and is also quite helpful for describing the decoding algorithm.

### IV-A Quantum convolutional codes

A quantum convolutional encoder can be defined quite succinctly as a stabilizer code with encoding matrix given by the circuit diagram of Fig. 6. The circuit is built from repeated uses of the seed transformation, each copy shifted by n qubits relative to the previous one. In this circuit, particular attention must be paid to the order of the inputs as they alternate between stabilizer qubits and logical qubits. This is a slight deviation from the convention established in the previous section, and it is convenient to introduce the following notation to label the different qubits appearing in the encoding matrix of a quantum stabilizer code.

###### Definition 20

The positions corresponding to the logical qubits are called the logical positions, and the positions corresponding to the syndrome (stabilizer) qubits are called the syndrome positions.

The total number of identical repetitions of the seed transformation is called the duration of the code and is denoted N. The qubits that connect gates from consecutive time slices are called memory qubits. The encoding is initialized by setting the first m memory qubits in the |0⟩ state. To terminate the encoding, set the information qubits of the last M time slices in the |0⟩ state, where M is a free parameter independent of N. The rate of the code is thus