Numerical and analytical bounds on threshold error rates for hypergraph-product codes

# Numerical and analytical bounds on threshold error rates for hypergraph-product codes

Alexey A. Kovalev Department of Physics and Astronomy and Nebraska Center for Materials and Nanoscience, University of Nebraska, Lincoln, Nebraska 68588, USA    Sanjay Prabhakar Department of Physics and Astronomy and Nebraska Center for Materials and Nanoscience, University of Nebraska, Lincoln, Nebraska 68588, USA    Ilya Dumer Department of Electrical Engineering, University of California, Riverside, California 92521, USA    Leonid P. Pryadko Department of Physics & Astronomy, University of California, Riverside, California 92521, USA
August 1, 2019
###### Abstract

We study analytically and numerically decoding properties of finite rate hypergraph-product quantum LDPC codes obtained from random -regular Gallager codes, with a simple model of independent and errors. Several non-trival lower and upper bounds for the decodable region are constructed analytically by analyzing the properties of the homological difference, equal minus the logarithm of the maximum-likelihood decoding probability for a given syndrome. Numerical results include an upper bound for the decodable region from specific heat calculations in associated Ising models, and a minimum weight decoding threshold of approximately .

###### pacs:
72.20.Pa, 75.30.Ds, 72.20.My

## I Introduction

Coherence protection is one of the key technologies required for scalable quantum computation. Quantum error correction is one such technique. It enables scalable quantum computation with a polynomial overhead as long as the accuracy of elementary gates and measurements exceeds certain thresholdShor (1996). For a given family of quantum error correcting codes (QECCs), the actual threshold value depends, e.g., on hardware architecture, implementation of the elementary gates, and on the algorithm used for syndrome-based decoding. While these details are ultimately very important, for the purposes of comparing different families of QECCs, one is also interested in the threshold(s) computed in simple “channel” models where errors on different qubits are assumed independent and identically distributed (i.i.d.), with the assumption of perfect syndrome measurement. The resulting threshold is a single number which depends on the chosen algorithm for syndrome-based decoding. Among any decoders, the threshold is maximal for the (exponentially expensive) maximum-likelihood (ML) decoder.

While the original version of the threshold theorem was based on concatenated codesShor (1996); Knill et al. (1998); Fowler et al. (2004); Aliferis et al. (2006), much better thresholds are obtained with topological surfaceDennis et al. (2002); Wang et al. (2010a, 2011) and related topological color codesWang et al. (2010b); Landahl et al. (2011). Even though their thresholds are higher, all surface codes, and, generally, all codes local in dimensions, are necessarily zero-rate codesBravyi and Terhal (2009); Bravyi et al. (2010). Just like codes obtained by repeated concatenation, such local codes also require the overhead that takes a fraction of the code length that is increasing with , so that the overall overhead needs to increase faster than linear as the size of the computation grows.

Scalable quantum computation with a linear overhead can be potentially achievedGottesman (2014) using more general quantum LDPC (low density parity-check) codes. These are just stabilizer codes, with the property that each stabilizer generator involves a bounded number of qubits. Then, a non-zero fault-tolerant threshold is guaranteed if the distance scales logarithmically or faster with the block lengthKovalev and Pryadko (2013a); Dumer et al. (2015). At the same time, to achieve a finite rate, the generators must remain non-local whenever the qubits are laid out in dimensionsBravyi and Terhal (2009); Bravyi et al. (2010). There are several known infinite code families that achieve these requirementsTillich and Zemor (2009); Delfosse and Zémor (2010); Delfosse (2013); Kovalev and Pryadko (2013b); Guth and Lubotzky (2014).

The ML decoding probability for a quantum LDPC code can be formally expressed as the average of a ratio of partition functions for two associated random-bond Ising models (RBIM) Dennis et al. (2002); Bombin (2010), computed on the Nishimori lineNishimori (1980, 2001) in the plane, where is the error probability and is the temperature. A temperature away from the Nishimori line corresponds to a suboptimal decoder which assumes an incorrect input error probability in the decoding process. For topological codes local in dimensions, the decodable region is a subset of (and possibly coincides with) the thermodynamical phase of RBIM where certain extended topological defects have finite tensionDennis et al. (2002); Wang et al. (2003); Katzgraber et al. (2009); Bombin (2010); Bombin et al. (2012); Katzgraber and Andrist (2013); Jouzdani et al. (2014); Kovalev and Pryadko (2015). For finite-rate codes, decodability requires that the average defect tension be sufficiently largeKovalev and Pryadko (2015).

In this work we analyze error-correcting properties of the finite-rate family of hypergraph-product codesTillich and Zemor (2009) based on random -regular Gallager ensemble of classical LDPC codes, in conjunction with the phase diagrams of the two mutually dual associated RBIMs, constructed assuming independent and errors which happen independently with probability at each qubit (see Fig. 1). More specifically, we use a large-distance subset of the random Gallager codes. Each constructed code is a CSS codeCalderbank and Shor (1996); Steane (1996) with parameters . Here the asymptotic rate , and the distance scales as a square root of the code length . The first corresponding Ising model has interaction terms (“bonds”) and spins; each bond is a product of or spin variables, and each spin participates in up to seven bonds. These numbers are higher for the second (dual) model which includes a summation over additional spin variables corresponding to the codewords.

Analytically, we construct a lower bound for the decodable region, and study the relation between the decodability and thermodynamical phases of the corresponding Ising model. In particular, this produces a non-trivial upper temperature bound for the decodable region. Numerically, we use Metropolis updates in canonical ensemble simulations and in feedback-optimized parallel tempering Monte Carlo method to compute average specific heat as a function of temperature and the flipped bond probability ; extrapolation to infinite code distance gives the transition temperatures in the two models. We give an argument that it is the transition temperature of the dual model that gives a more accurate estimate of the upper temperature bound of the decodable region. We also use a decoder approximating the minimum-energy decoder to obtain a lower bound for the ML decoding threshold, with the result .

The rest of the paper is organized as follows. In Sec. II, we give a brief overview of classical and quantum error correcting codes, the Ising models related to ML decoding, and quantum hypergraph-product codes. We give the analytical bounds for the decodable region in Sec. III, with the proofs given in the Appendices. The numerical techniques and the corresponding results are presented in Sec. IV. We summarize our results and give some concluding remarks in Sec. V.

## Ii Background

### ii.1 Classical and quantum error correcting codes

A classical binary linear code with parameters is a -dimensional subspace of the vector space of all binary strings of length . Code distance is the minimal weight (number of non-zero elements) of a non-zero string in the code. A code can be specified in terms of the generator matrix whose rows are the basis vectors of the code. All vectors orthogonal to the rows of form the dual code . The generator matrix of the dual code, ,

 GPT=0,rankG+rankP=n, (1)

is also called dual of , . It is the parity check matrix of the original code, .

A quantum stabilizer code is a -dimensional subspace of the -qubit Hilbert space , a common eigenspace of all operators in an Abelian stabilizer group , , where the -qubit Pauli group is generated by tensor products of the and single-qubit Pauli operators. The stabilizer is typically specified in terms of its generators, . The weight of a Pauli operator is the number of qubits that it affects. The distance of a quantum code is the minimum weight of an operator which commutes with all operators from the stabilizer , but is not a part of the stabilizer, . Such operators correspond to the logical qubits and are called logical operators. A Pauli operator , where and , , can be mapped, up to a phase, to a binary vector . With this map, generators of the stabilizer group are mapped to rows of a generator matrix forming a binary classical linear codeCalderbank et al. (1998). We will also consider the matrix obtained in a similar fashion from independent logical operators.

For a more narrow set of CSS codesCalderbank and Shor (1996); Steane (1996) the stabilizer generators can be chosen as products of either only or only Pauli operators. The corresponding generator matrix is a direct sum, , where rows of the matrices and are orthogonal, . For any CSS code, independent logical operators can also be chosen as products of only or only Pauli operators, which gives . Rows of the matrix are orthogonal to rows of , , and they are linearly independent from rows of . Similarly, rows of the matrix are orthogonal to rows of , and they are linearly independent from rows of . For a CSS code of block length , these matrices have columns, and the number of encoded qubits is

 k=rankLx=rankLz=n−rankGx−rankGz. (2)

Rows of and , respectively, have weights that are bounded from below in terms of the corresponding CSS distances,

 dx≡minc∈C⊥Gz∖CGxwgt(c),dz≡minb∈C⊥Gx∖CGzwgt(b). (3)

The code distance is just .

In what follows, we concentrate on CSS codes. It will be convenient to assume that matrices and have full row rank (each has exactly rows), and specifically define the form of the dual matrix [see Eq. (1)] as a combination of rows of matrices and , and, similarly, the dual matrix as a combination of rows of and . Also, to simplify the notations, it will be convenient to drop the indices and and use the matrices and . The corresponding CSS distances (3) will be denoted as and .

### ii.2 Maximum likelihood decoding and random-bond Ising model

Consider a CSS code with generator matrices and , and an error model where bit-flip and phase-flip errors happen independently with the same probability . In such a case, decoding of and errors can be done separately. In the following, we only consider errors.

Generally, an error can be described by a length- binary vector ; errors obtained by adding linear combinations of rows of are mutually degenerate (equivalent), they act identically on the code. In the absence of measurement errors, one needs to figure out the degeneracy class of the error from the measured syndrome vector, . While it is easy to come up with a vector that satisfies these equations, so do vectors obtained by adding inequivalent codewords . For the maximum-likelihood (ML) decoding, one compares the probabilities of errors in different degeneracy sectors (inequivalent ), and chooses the most likely.

The probability of an error degenerate with is obtained as a sum of probabilities of errors for different binary . Such a sum can be readily seenDennis et al. (2002) to be proportional to the partition function of RBIM in Wegner’s formWegner (1971),

 Ze(G;Kp)=∑Si=±1n∏b=1eKp(−1)ebRb, (4)

where the interaction term for bond , , is defined by the column of the matrix , is the corresponding bit in the vector , and the coupling constant is the inverse Nishimori temperatureNishimori (1980, 2001), .

Similarly, for a given codeword , the probability of an error degenerate with is proportional to . Given the syndrome , the conditional probability that an error degenerate with actually happened can be written as the ratioKovalev and Pryadko (2015)

 P(e|s)=Ze(G;Kp)∑cZe+c(G;Kp)=Ze(G;Kp)Ze(H∗;Kp), (5)

where the sum in the denominator is proportional to the probability of the syndrome to happen. In Eq. (5), is a matrix dual of , see Eq. (1); for correct normalization, should be constructed from by adding exactly rows corresponding to mutually non-degenerate codewords . The conditional probability (5), with taken from the most likely degeneracy class for the syndrome , is the probability of successful ML decoding for the given syndrome. One can then calculate the average probability of successful ML decoding Kovalev and Pryadko (2015),

 Psucc(G,H;K,p)=[P(e|eHT)]p,K=Kp, (6)

where denotes the averaging over error vectors (each set bit occurs independently with probability ).

Notice that if we take a temperature away from the Nishimori line, , we are using a decoder with an incorrect , which would result in suboptimal decodingKovalev and Pryadko (2015). For an infinite sequence of codes , with increasing distance, we define the decodable region on the - plane as such where

 limt→∞Psucc(Gt,Ht;K,p)=1. (7)

The overlap of the decodable region with the Nishimori line gives the threshold error rate for ML decoding with the chosen sequence of codes. More generally, the extent of the decodable region away from the Nishimori line can be seen as a measure of the decoding robustness.

Generally, a code with the distance can correct any errors. If the errors on different (qu)bits happen independently with probability , a typical error has weight asymptotically close to ; the existence of a decodable region is guaranteed only if the asymptotic relative distance is finite. Thus, in general, the decoding threshold satisfies .

Existence of a finite threshold for (quantum and classical) LDPC codes with sublinear distance scaling where has been established by two of us in Ref. Kovalev and Pryadko, 2013a. The basic reason for the existence of a threshold is that at small enough , likely error configurations can be decomposed into relatively small connected clusters on the (qu)bit connectivity graph. Specifically, two (qu)bits are considered connected if there is a check (a stabilizer generator) with the support including both positions. For an LDPC code (quantum or classical), the connectivity graph has a bounded degree. Then, formation of the connected error clusters is described by the site percolation process on the connectivity graph; it has a finite thresholdHammersley (1961) for any graph with the maximum degree . Moreover, below this bound, the probability to encounter a large cluster decreases exponentially with the cluster size; this fact may be used to construct a syndrome-based decoderKovalev and Pryadko (2013a).

More accurate lower bounds for decoding thresholds in different error models (including phenomenological error model for syndrome measurement errors) are given in Ref. Dumer et al., 2015. Consider CSS codes whose generator matrices and have row weights not exceeding some fixed , and distance scaling logarithmically or faster with ,

 d≥Dlnn. (8)

Assuming independent and errors with equal probabilities , the corresponding lower bound reads

 2[p(1−p)]1/2≥(m−1)−1e−1/D. (9)

With distance scaling like a power of , with , one should use . The bound (9) was obtained by analyzing a minimum energy decoder, which corresponds to .

### ii.3 Duality

As demonstrated by WegnerWegner (1971), a general Ising model with the partition function (4) has a dual representation, which is a generalization of Kramers-WannierKramers and Wannier (1941) duality. The same duality has been first established in coding theory by MacWilliamsMacWilliams (1963) as a relation between weight polynomials of two dual codes. It is convenient to introduce a generalized partition function,

 Ze,m(G;K)≡∑Si=±1n∏b=1RmbbeK(−1)ebRb, (10)

that involves binary vectors of “electric” and “magnetic” charges. Then, the duality reads

 Ze,m(G;K)=(−1)e⋅mZm,e(G∗;K∗)A(K), (11)

where an matrix is the exact dual of (dimensions ), see Eq. (1), is the Kramers-Wannier dual of , , and the scaling factor depends on the dimensions of the matrices,

 A(K)=2r−r∗+rankG∗(sinhKcoshK)n/2. (12)

Notice that the electric charges in Eq. (10) define the negative bonds as in Eq. (4), while the magnetic charges select the bonds to be used in an average,

 (13)

which is the most general form of a spin correlation function that is not identically zeroWegner (1971).

### ii.4 Quantum hypergraph-product codes

In this work we specifically focus on the quantum hypergraph product (QHP) codesTillich and Zemor (2009); Tillich and Zémor (2014), an infinite family of quantum CSS codes which includes finite-rate LDPC codes with distance scaling as a square root of the block length. A general QHP code is defined in terms of a pair of binary matrices and with dimensions and . The corresponding stabilizer generators are formed by two blocks constructed as Kronecker productsKovalev and Pryadko (2012),

 Gx=(E2⊗H1,H2⊗E1),Gz=(HT2⊗˜E1,˜E2⊗HT1), (14)

where and , , are unit matrices of dimensions given by and ; the matrices and have and rows, respectively. Clearly, the ansatz (14) guarantees that the rows of and are orthogonal, . The block length of such a quantum code is the number of columns, .

We are using the construction originally proposed in Ref. Tillich and Zemor, 2009, namely, , where is assumed to have a full row rank. If the binary code with the check matrix , , has parameters , the corresponding QHP code has the parametersTillich and Zemor (2009); Tillich and Zémor (2014) , where , , , and .

We should mention that the QHP construction with is weakly self-dual, meaning that the matrices and in Eq. (14) can be transformed into each other by row and column permutations. As a result, in particular, for any binary matrix , the decoding probabilities (6) in and sectors must coincide, .

A family of quantum LDPC codes with distance scaling as a square root of the block size can be obtained, e.g., by taking from a random ensemble of classical LDPC codes, which are known to have finite rates and finite relative distances , and removing any linearly-dependent rows. We specifically consider the ensemble of regular -LDPC codes with column weight and row weight originally introduced by GallagerGallager (1962, 1963). For each code in , its parity-check matrix of size is divided into horizontal blocks of size . Here the first block consists of unit matrices of size . Any other block is obtained by some random permutation of columns of . Thus, all columns in each block have weight . This ensemble achieves the best asymptotic distance for a given designed code rate among the LDPC ensembles studied to dateLitsyn and Shevelev (2002). In practice, it often happens that one or few rows of thus constructed are linearly dependent, which gives a code with a larger rate, . It is easy to check, however, that asymptotic at rate equals the designed rate, . For the ensemble used in this work, the asymptotic relative distance is Litsyn and Shevelev (2002).

For brevity, we will refer to QHP codes constructed using the matrices from Gallager ensemble (with any linearly dependent rows dropped) as QHP codes.

## Iii Analytical bounds

Partition function (4) scales exponentially with the system size; one more commonly works with the corresponding logarithm, the (dimensionless) free energy

 Fe(G;K)=−lnZe(G;K), (15)

which is an extensive quantity, meaning that it scales linearly with the system size. Alternatively, one can also use the free energy density (per bond), , which usually has a well-defined thermodynamical limit. The logarithm of the ML decoding probability (5), up to a sign, equals the homological difference,

 ΔFe(G,H;K)≡Fe(G;K)−Fe(H∗;K). (16)

At , this quantity satisfies the inequalities

 0≤ΔF0(G,H;K)≤kln2, (17)

where the lower and the upper bounds are saturated, respectively, in the limits of zero and infinite temperatures. Combining duality (11) with Griffiths-Kelly-ShermanGriffiths (1967); Kelly and Sherman (1968) (GKS) inequalities for spin averages we also obtain

 ΔFe(G,H;K)−ΔF0(G,H;K)≥0. (18)

In addition, also at , the duality (11) gives

 ΔF0(G,H;K)=Rln2−ΔF0(H,G;K∗), (19)

where , and is the dimension of the CSS code, see Eq. (2). The proof of these expressions is given in Appendix A.

The relation of the homological difference averaged over the disorder, , and the corresponding quantity normalized per unit bond, , to decoding with asymptotic probability one, see Eq. (7), is given by the following Lemma (proved in App. B). \thmt@toks\thmt@toks For a sequence of quantum CSS codes defined by pairs of matrices , , where , given a finite and an error probability ,
(a) implies the point to be in the decodable region;
(b) implies the point to be outside of the decodable region.

### iii.1 Lower bound for decodable region

Here, we use part (a) of Lemma III to establish an existence bound for the decodable region. Specifically, we construct an upper bound for in a finite system, and use it to show the existence of a non-trivial region where , as long as the distance scales logarithmically or faster with the block length , see Eq. (8). In Appendix C we prove: \thmt@toks\thmt@toks Consider a sequence of quantum CSS codes , , of increasing lengths , where row weights of each and do not exceed a fixed , and the code distances , with some . Then the sequence , , converges to zero in the region

 (m−1)[e−2K(1−p)+e2Kp]
\thmt@toks\thmt@toks
###### Theorem 2.

The rightmost point of this region, the maximum value where Eq. (20) has a solution, satisfies the equation . The same bound was obtained previously in Ref. Dumer et al., 2015 using estimates based on minimum-energy decoding which corresponds to . Thus, present bound does not improve the existing lower bound for ML decoding threshold.

Further, the entire region (20) lies at temperatures above the Nishimori line (see Fig. 1). In particular, at the right-most cusp of this region, the temperature is exactly twice the Nishimori temperature at . The importance of Theorem III.1 is that we got a sense of the robustness of suboptimal decoding, where the ML decoder assumes an input value of larger than the actual one.

### iii.2 Upper temperature bound for decodable region

Here we combine part (b) of Lemma (III) with duality (11) to establish an upper temperature bound for the decodable region for a sequence of codes with asymptotic rate . We first argue that existence of a low-temperature homological region where , by duality, implies the existence of a high-temperature dual homological region where , and thus at any . Further, the derivative of with respect to is just the energy per bond, with negative sign; its magnitude does not exceed one. Therefore, there should be some minimal distance between the upper temperature bound of the homological region and the lower temperature bound at of the dual homological region. This gives an upper temperature bound for the homological region. By part (b) of Lemma (III), the same bound also works as an upper bound for the decodable region at any . These arguments give (see App. D):\thmt@toks\thmt@toks Consider a sequence of CSS codes defined by pairs of finite binary matrices with mutually orthogonal rows, , , where row weights of and do not exceed a fixed , the sequence of CSS distances is strictly increasing with , , and the sequence of rates converges, . Then, assuming equal probabilities of and errors, the upper temperature boundary of the decodable region, , satisfies the inequality

 Kmax−K∗max≥Rln2. (21)
\thmt@toks\thmt@toks
###### Theorem 3.

Explicitly, this gives an upper temperature bound for the location of the ML-decodable region for any CSS code family with asymptotic rate ,

 e2Kmax≥1+r+√(1+r)2+r2,r≡22R≥1. (22)

In the case this bound corresponds to the self-dual point, which equals to the upper bound of the decodable region of the square-lattice toric code (ferromagnetic phase of the square-lattice Ising model).

## Iv Numerical results

### iv.1 Justification

We first note that numerically, it is only possible to analyze systems of finite size. Numerical techniques used for predicting asymptotic large-size properties, such as finite-size scaling, are only good as long as such properties exist and change with the system size in a regular manner. For example, even though we know the existence of a non-zero decoding threshold, it is not a priori clear that the finite-size data would show a well defined crossing point, as seen on Fig. 2.

Similarly, a well-defined thermodynamical limit is known to exist for bulk quantities like magnetization or specific heat for Ising models on lattices that are local in dimensions, simply because the corrections due to the boundary scale as the surface area, which scales as a sublinear power of the volumeGriffiths (1972). Well-defined infinite-size limit (although not necessarily universal) also exists if one considers a sequence of models on increasing subgraphs of an infinite graph, where the boundary spins are either “free” (the couplings connecting them to outside are set to zero), or “wired” (the outside couplings are set to infinity). The existence of a thermodynamical limit in each of these cases follows from the GKS inequalitiesGriffiths (1967); Kelly and Sherman (1968), which require that spin correlations change monotonously with the system size (increase for wired and decrease for free boundary conditions).

The problem we are considering is different from either case, as a sequence of matrices (or the corresponding bipartite graphs) defines a sequence of finite few-body Ising models without boundaries. Further, the finite asymptotic rate of the considered code family guarantees the absenceBravyi and Terhal (2009); Bravyi et al. (2010) of a -dimensional layout of the qubits with local stabilizer generators, at any finite . The only rigorous result, proved in a companion paperJiang et al. (2018), is that a well defined limit for average free energy density for QHP codes exists for any in a finite region around the infinite temperature and, by duality, for , in a finite region around the zero temperature. This follows from the absolute convergence of the corresponding high-temperature series (HTS) established using the bound on high-order cumulantsFéray et al. (2016), and from the fact that a large random bipartite graph with vertex degrees and has few short cycles. The local Benjamini-Schramm limitBenjamini and Schramm (2001) of such a graph is a bipartite tree, meaning that the asymptotic coefficients of the high-temperature series expansion to any finite order can be computed by analyzing only the clusters present when in Eq. (14) corresponds to such a tree. The corresponding argument is a direct generalization of that in Refs. Borgs et al., 2013; Lovász, 2016, where the existence of a well defined limit for free energy density was analyzed for general models with up to two-body interactions.

One consequence of this argument is that in the asymptotic limit, we do not expect much difference between the use of matrices from the full Gallager ensemble, and the corresponding subset where for each size we pick only the matrices which result in the largest distance of the classical code . On the other hand, we expect that the use of such matrices should significantly improve the convergence in the high- and low-temperature regions where the corresponding series converge: with larger distance, a larger number of coefficients of the series would match those for the infinite-size system.

Unfortunately, even though the corresponding series can be analytically continued beyond the convergence radius, this does not guarantee the existence of a well-defined limit for thermodynamical quantities at all temperatures, as would be required to formally justify the use of finite size scaling. Therefore, numerical results presented in the following sections represent numerical trends in systems of relatively small size; they do not necessarily guarantee the existence of well defined transition(s).

### iv.2 Approximate minimum-weight decoding

To obtain an empirical lower bound for the ML decoding threshold, we constructed a cluster-based decoder using the approach suggested in Refs. Kovalev and Pryadko (2013a); Dumer et al. (2015) (see Sec. II.2). Specifically, given the syndrome vector , we construct a list of irreducible clusters up to the chosen cut-off weight . Each irreducible cluster should correct some syndrome bits without introducing new ones, and it should not contain a subcluster with the same property. As explained in Sec. II.2, with an LDPC code where the stabilizer weight is bounded, and for large enough , we expect this list with high probability to include all clusters present in the connected-cluster decomposition of the actual error. The actual decoding is done by solving a minimum-weight set cover problem: among the subsets of the cluster list with the property that every non-zero syndrome bit be covered exactly once, we want to find such that the sum of the cluster weights be minimal. This latter problem is solved in two steps: first, by running the LinearProgramming over integers in MathematicaWolfram Research, Inc. (2017) to arrive at a valid solution with a reasonably small weight, and then by trying to minimize the weight further with the help of a precomputed list of non-trivial irreducible codewordsDumer et al. (2015). In our calculations, for each disorder realization we generated irreducible clusters of weight up to , and, for each code, the list of irreducible codewords of weight up to .

Without the limits on the clusters’ and codewords’ weights, this procedure would be equivalent to minimum-weight decoding. Unfortunately, the corresponding complexity grows prohibitively (exponentially with the size of the code). Nevertheless, for smaller codes we were able to choose large enough and to estimate the minimum-weight decoding threshold, as seen from the convergence of the corresponding decoding probabilities.

The decoding complexity is determined by the sum of those for the construction of the cluster list and for solving the weighted set cover problem. The construction of the cluster list was analyzed in detail in Refs. Dumer et al., 2014, 2016, 2017. In particular, if the maximum weight of a stabilizer generator is , the corresponding complexity is . At small enough , the probability of a large cluster decays exponentially with its weight. Thus, in most cases, maximum cluster size scales logarithmically with the code length , and a sufficient cluster list can be prepared with the cost polynomial in .

On the other hand, the weighted set cover problem is NP-completeGarey and Johnson (1979); the corresponding cost is exponential in the length of the cluster list. Generally, this problem is equivalent to an integer linear programming (LP) problem. To find a valid (but not necessarily the minimal) solution, we use a call to the built-in Mathematica function LinearProgramming. While the details of its implementation are proprietary, it is our understanding that an integer solution is found by first solving the corresponding problem over reals using an algorithm with polynomial complexity, and then finding the nearest integer point in the LP polytope. With rare exception (few instances over the entire set of our simulations where we had to record decoder failure due to calculation time-out), LinearProgramming returns a valid solution which satisfies the constraints but does not necessarily have the smallest weight.

To reduce the weight further, we used a version of the approach used previouslyDumer et al. (2015) to construct an analytical bound for minimum-energy decoding threshold. Notice that the minimum-weight (same as minimum-energy) solution produces the same syndrome as , thus produces the zero syndrome, and in general can be decomposed into a sum of irreducible codewordsDumer et al. (2015), , such (a) that the supports of different do not overlap, (b) each of is a valid codeword, in the sense that it produces a zero syndrome, and (c) any cannot be decomposed further into a sum of non-overlapping codewords. Such a decomposition is not necessarily unique. It is easy to seeDumer et al. (2015) that weight of for any must not exceed that of . Thus, if we have a list of all non-trivial, , irreducible codewords, the equivalence class of the minimum-weight solution can be found by adding those that reduce the weight of , until the weight can no longer be reduced.

Notice that with a complete list of non-trivial irreducible codewords, the degeneracy class of the minimum weight solution can be correctly identified from any vector which produces the correct syndrome. In practice, since the weights of the irreducible codewords in our list are limited, the decoding success probability increases with the reduced weight of the initial vector .

Overall, for each code in our simulations, the majority of the computational time was spent on preparing the list of non-trivial irreducible codewords with weights .

The results of the described threshold simulations are presented in Fig. 2 (a) and (b), respectively, for QHP codes and for the toric codes. More precisely, in Fig. 2(a), we show the fraction of decoder failures for QHP codes constructed from three large-distance classical codes from Gallager ensemble, for different values of bit flip probability . The codes used have parameters , , and ; they were constructed from binary codes with parameters , , and . Code obtained from the binary code turned out too large for the present decoding technique; the corresponding data is not included in Fig. 2(a).

In our calculations, we used and , which was sufficient for convergence of the average decoding probability for used in the simulations. The well defined crossing point in Fig. 1(a) indicates a (pseudo)threshold for decoding of QHP codes in the vicinity of . Convergence of the average decoding probability with increasing and is an indication that this value is a good estimate of the minimum-weight decoding threshold.

For comparison, in Fig. 2(b), we show the corresponding results for the rotated toric codesBombin and Martin-Delgado (2007) with the parameters , with , and , where the crossing point is close to , the minimum-weight decoder threshold obtained using the minimum-weight matching algorithmDennis et al. (2002).

Notice that both for QHPs and for the toric code, the obtained threshold estimates are much larger than the corresponding analytical lower bounds from Ref. Dumer et al., 2015, and , respectively.

### iv.3 Monte Carlo simulations and the phase diagram

In this section we analyze numerically the low-disorder portion of the phase diagram of the two random-bond Ising models corresponding to the ML decoding of QHP codes with i.i.d. bit-flip errors. For a CSS code with generators and , the corresponding Ising models have the free energies and , see Eqs. (4) and (15), where is the binary error vector whose non-zero bits indicate the flipped bonds, and is the inverse temperature. These models, respectively, correspond to the numerator and the denominator of the conditional ML decoding probability (5).

The parameters of the four QHP codes used in the simulations are described in the previous section. For simulation efficiency, we attempted to minimize the weights of the rows of the matrices . To this end, starting with the matrix , we added one row at a time, corresponding to one of the minimum-weight vectors in , where is the previously constructed matrix. As a result, the row weights of each matrix did not exceed .

To calculate the averages, we performed feedback optimized parallel tempering Monte Carlo simulationsKatzgraber et al. (2006); Kubica et al. (2017), as well as the usual simulated annealing. In both cases we used standard Metropolis updates.

For both models, the observed scaling of the height of the specific heat maxima with , and the hysteresis which we could not eliminate for larger codes, are consistent with the discontinuous transitions. We also observe that the use of the parallel tempering method does not improve the convergence significantly; we attribute this to the discontinuity of the phase transition.

Samples of the computed specific heat (per bond) for the (3,4) QHP models, , where is the energy, is the number of bonds, and is the temperature, are shown in Fig. 3 (comparing different values of separately for distances , , and ). The specific heat values shown in Fig. 4 have been additionally divided by the number of bonds ; the corresponding maximum values are weakly increasing with the code distance for , see Fig. 4(a), and weakly decreasing for and , see Figs. 4(b) and 4(c). Such a slow dependence on the system size is consistent with a 1st order transition, where one expects . In Fig. 4(c) we also compare the data obtained using parallel tempering and the usual annealing. These data were obtained after Monte Carlo sweeps for QHP codes of distance and , and Monte Carlo sweeps for codes of distances and , for each of 128 realizations of disorder at every .

When the positions of the specific heat maxima are plotted as a function of (asymptotically, , although such a relation does not hold for the small codes used in the simulations), the corresponding points are seated close to a straight line, see Fig. 5(a). Respectively, we used the linear fit to extrapolate our finite-size data and extract more accurate critical point of the transition, , as a function of the flipped bond probability . The resulting phase boundary is shown in Fig. 1 with solid blue circles, along with the solid blue line which is the linear fit to the data. The fit indicates that for , the phase transition temperature is approximately linear in . It is also clear from Fig. 1 that at every , the phase boundary for this model is higher than the corresponding line for the square-lattice Ising model, plotted with dot-dashed line using the data from Ref. Thomas and Katzgraber, 2011.

The analysis for the dual QHP models was performed similarly (specific heat data not shown). The positions of the specific heat maxima as a function of for different values of are shown in Fig. 5(b), along with the corresponding linear fits. Notice that the points at show significant curvature which cannot be attributed to the statistical errors alone. By this reason we also tried a parabolic fit, which resulted in a substantially lower extrapolated compared with from the linear fit. In comparison, at , parabolic fit gives , which is not as significantly reduced compared to the linear fit result of .

The extrapolated positions of the specific heat maxima are plotted in Fig. 1 with solid red boxes, along with a solid red line which is the ad hoc linear fit to the data. (The two extrapolated values obtained from parabolic fits in Fig. 5(b) are shown in Fig. 1 with open red boxes.) The corresponding line is approximately parallel to that for the (3,4) QHP model. As expected (see Sec. II.3), the points at are located close to mutually dual positions. For the dual model, the extrapolation gives , which is close to obtained from .

Empirically, the transition temperatures in the two dual models are different. Under this conditionJiang et al. (2018) more precisely, assuming that large-system free energy density be non-singular at and below the lowest-temperature singular point of the transition temperature of the dual (3,4) QHP model should coincide with the homological transition, where reaches the lower bound of [cf. Eqs. (17) and (18)]. Above this temperature, . Thus, according to part (b) of Lemma III, the critical point of the dual model (red squares in Fig. 1) gives an upper bound for the decodable phase of QHP codes.

Fig. 1 also shows several analytical bounds for the decodable region. Magenta-shaded region corresponds to the lower bound for the decodable region given by Eq. (20) with . Its rightmost point is at the same as the energy-based analytical bound from Ref. Dumer et al. (2015), the corresponding point is indicated on the horizontal axis by the magenta arrow. The lower bound for the decodable region (pseudothreshold for energy-based decoding) is shown with the blue vertical arrow. Finally, a pair of gray arrows separated by the gray bar on the vertical axis show the bound from Theorem III.2 and the corresponding dual temperature, . As expectedJiang et al. (2018), the transitions temperatures for QHP and the dual QHP models at are outside of this interval.

## V Conclusion

In conclusion, we have studied error correction properties of the finite-rate family of quantum hypergraph product codes obtained from Gallager ensemble of classical binary codes, by combining the threshold calculation using a cluster-based decoder approximating minimum-energy decoding with the analysis of the phase diagram of the associated spin models (Fig. 1). Rigorous analytical bounds for the decodable region are constructed by analyzing the properties of the homological difference (16), equal to the logarithm of the conditional decoding probability with the negative sign.

The estimated minimum-weight decoding threshold error rate for this code family is in the vicinity of . This estimate is not so far from the perfect matching algorithm threshold of for the toric codes Dennis et al. (2002), and is much higher compared to the analytic lower bound of obtained in Ref. Dumer et al. (2015).

The most striking feature of the phase diagram of the associated spin models originating from the finite asymptotic rate [ for QHP codes] is the deviation of the transition lines from the self-dual temperature at . In fact, the transitions temperatures of the two dual models deviate from each other throughout the small- region we studied. We expect the multicritical points, where the corresponding transition lines intersect the Nishimori line, also to be different, contrary to the implicit assumption in Ref. Kovalev and Pryadko, 2015.

Notice that the horizontal position of the rightmost point of the region where Theorem III.1 guarantees decodability with asymptotic probability one (magenta-shaded region in Fig. 1) coincides with the analytic lower bound for the energy-based decoding from Ref. Dumer et al., 2015. While the former region is entirely located above the Nishimori line, the minimum-energy decoding threshold corresponds to . A point on the Nishimori line correspond to maximum-likelihood decoding at the corresponding . This guarantees that the portion of the Nishimori line for is also inside the decodable region. It is reasonable to expect that for , the entire interval of temperatures below the bound of Theorem III.1 would be in the decodable region. However, construction of the corresponding analytical bound is still an open problem.

###### Acknowledgements.
This work was supported in part by the NSF under Grants No. PHY-1415600 (AAK) and PHY-1416578 (LPP). The computations were performed utilizing the Holland Computing Center of the University of Nebraska.

## Appendix A Proof of Eqs. (17) to (19).

(i) The lower bound in Eq. (17),

 0≤ΔF0(G,H;K)≤kln2,

is trivial to prove, since is a sum of positive terms which include every term present in . To prove the upper bound, notice that for any , ; this can be proved by comparing the corresponding expansions in powers of . The expression for includes the summation over distinct defect vectors , thus , which gives the upper bound in Eq. (17).

(ii) The inequality

 ΔFe(G,H;K)−ΔF0(G,H;K)≥0

is derived with the help of the duality (11) which maps the l.h.s. into the difference of the logarithms of the averages,

 ΔFe−ΔF0 = lnZe(H∗;K)Z0(H∗;K)−lnZe(G;K)Z0(G;K) =

the difference is non-negative by the GKS second inequalityGriffiths (1967); Kelly and Sherman (1968) (average in the first term can be obtained from that on the right by applying an infinite field at the additional spins).

(iii) The duality relation

 ΔF0(G,H;K)=kln2−ΔF0(H,G;K∗).

is a simple consequence of Eq. (11) with and the definition of the dual matrices , . Let and denote the numbers of rows in and , respectively. By construction, the dual matrices and have and rows, and their ranks are , . We have,

 Z0(H∗,K)Z0(G,K) = 2r∗H−rH+rankH2rG−r∗G+rankG∗Z0(H,K∗)Z0(G∗,K∗) = 2kZ0(H,K∗)Z0(G∗,K∗).

Eq. (19) is obtained by taking the logarithm.

## Appendix B Proof of Lemma Iii

###### Proof.

Part (a) immediately follows from the convexity of the logarithm,

 [P(e|eHT)]p≥exp[lnZe(G;K)Ze(H∗;K)]p=e−ΔFp(G,H;K).

Part (b) follows from the trivial bounds on the partition function, , where is an matrix. This gives a lower bound for the conditional probability (5),

 lnP(e|s)≥ln(2re−Kn2r+keKn)=−n(2K+R)ln2. (23)

Now, for some , let us say that a “good” disorder configuration corresponds to , to obtain

where is the constant in the r.h.s. of Eq. (23), and is the net probability to encounter a bad configuration. A similar chain of inequalities gives an upper bound for :

 Psucc = [P(e|HTe)]p ≤ Pgood+(1−Pgood)(1−δ) = 1−(1−Pgood)δ;\ thus\ 1−Psucc ≥ (1−Pgood)δ=Pbadδ,

Combining with Eq. (24), this gives for the success probability (6), at a fixed :

 1−Psucc ≥ Pbadδ (25) ≥ δ[ΔFe]p+ln(1−δ)nM \lx@stackreln→∞= δ[Δfe]p(2K+R)ln2>0,

which limits from above, away from one.

## Appendix C Proof of Theorem iii.1

###### Theorem ??.

The statement of the theorem immediately follows from the positivity of , see Eq. (17), and the following Lemma:

###### Lemma 6.

Consider a pair of Ising models defined in terms of matrices and with orthogonal rows, such that the matrix has a maximum row weight . Let denote the CSS distance (3), the minimum weight of a defect . Denote , and assume that . Then, the disorder-averaged homological difference (16) satisfies

 [ΔF(G,H;K)]p≤n(m−1)dGCdG+11−(m−1)C. (26)
###### Proof.

It is convenient to represent the partition function (4) in the form

 Ze(G;K)=eKn∑ε≃0e−2Kwgt(e+ε),

where the notation indicates that is in the trivial degeneracy class, that is, it can be represented as a linear combination of rows of , , and is the total number of flipped bonds with the spins . In comparison,

 Ze(H∗;K)=eKn∑ε:HεT=0e−2Kwgt(e+ε);

here the summation is over all vectors which are orthogonal to the rows of . Let us consider a decomposition of any such binary vector into irreducible components, , where supports of different vectors in the decomposition do not overlap, if , while each component remains orthogonal to the rows of (such a decomposition is not necessarily unique). Further, group all of the components which are trivial, , into the vector , and the non-trivial components into the vector , so that , where , vector is a sum of non-trivial non-overlapping codewords , and the remainder is trivial, .

Given such a decomposition for each vector , we can construct an upper bound for the ratio,

 Ze(H∗;K)Ze(G;K) = ∑ε:HεT=0e−2Kwgt(e+ε)∑ε≃0e−2Kwgt(e+ε) ≤ ∑ε′∑ε′′≃0:ε′′∩ε′=∅e−2Kwgt(e+ε′+ε′′)∑ε′′≃0:ε′′∩ε′=∅e−2Kwgt(e+ε′′),

where the outside summation is over , a sum of non-overlapping irreducible codewords, and (for a given ) we reduced the denominator by dropping the terms which overlap with , to match the corresponding sum in the numerator. The ratios for each can now be trivially calculated in terms of the weight of in the support of , which we denote as . We have

 Ze(H∗;K)Ze(G;K)≤∑ε′e−2K[wgt(ε′)−2wgt(e′)],

and the corresponding average

 [Ze(H∗;K)Ze(G;K)]p≤∑ε′Cwgt(ε′),

where the constant . The summation is over sums of irreducible non-overlapping codewords, ; we can further increase the r.h.s. if we allow the overlaps between the codewords, to obtain

 [Ze(H∗;K)Ze(G;K)]p≤exp(∑cCwgt(c)),

where the summation is now done over irreducible codewords. The bound for