Optimization of the parity-check matrix density in QC-LDPC code-based McEliece cryptosystems
This work was supported in part by the MIUR project “ESCAPADE” (Grant RBFR105NLC) under the “FIRB - Futuro in Ricerca 2010” funding program.
Abstract
Low-density parity-check (LDPC) codes are one of the most promising families of codes to replace the Goppa codes originally used in the McEliece cryptosystem. In fact, it has been shown that by using quasi-cyclic low-density parity-check (QC-LDPC) codes in this system, drastic reductions in the public key size can be achieved, while maintaining fixed security levels. Recently, some proposals have appeared in the literature using codes with denser parity-check matrices, named moderate-density parity-check (MDPC) codes. However, the density of the parity-check matrices to be used in QC-LDPC code-based variants of the McEliece cryptosystem has never been optimized. This paper aims at filling this gap by proposing a procedure for selecting the density of the private parity-check matrix, based on the security level and the decryption complexity. We provide some examples of the system parameters obtained through the proposed technique.
I Introduction
The prospect of quantum computers has driven a renewed interest in public-key encryption schemes that are alternatives to widespread solutions like the Rivest, Shamir, Adleman (RSA) system, which is based on the integer factorization problem. The latter, in fact, would be solved in polynomial time by quantum computers, and hence would no longer represent a hard problem after their advent.
The McEliece and Niederreiter cryptosystems [1, 2], which exploit the hardness of the decoding problem to implement public-key cryptography, are among the most interesting alternatives to RSA. Secure instances of these systems are based on Goppa codes and, despite some revision of their parameters due to optimized cryptanalysis and increased computational power [3], they have never been seriously endangered by cryptanalysis. However, using Goppa codes has the major drawback of requiring large public keys, whose size increases quadratically in the security level. Several attempts to replace Goppa codes have been made over the years, but only a few have resisted cryptanalysis. Among them, variants based on QC-LDPC codes are very promising, since they achieve very small keys, with size increasing linearly in the security level. These variants are unbroken up to now, though some refinements have been necessary since their first proposal.
LDPC codes are state-of-the-art iteratively decoded codes, first introduced by Gallager [4], then rediscovered [5] and now used in many contexts [6]. Recently, LDPC codes have also been introduced in several security-related contexts, like physical layer security [7, 8, 9] and key agreement over wireless channels [10]. LDPC codes were initially thought to be insecure in the McEliece cryptosystem [11], and very large codes were required to avoid attacks [12]. This scenario changed when it was shown that the permutation matrix used to obtain the public key from the private key could be replaced with a more general matrix [13]. Although some adjustments were necessary after the first proposal, these matrices have made it possible to design secure and efficient instances of the system based on QC-LDPC codes [14, 15].
Recently, it has been shown that the use of permutation matrices, as in the original McEliece cryptosystem, can be restored by using codes with increased parity-check matrix density, named MDPC codes [16, 17]. MDPC codes also exhibit performance which does not degrade significantly when there are short cycles in their associated Tanner graph. This allows for a completely random code design, which has made it possible to obtain a security reduction to the hard problem of decoding a generic linear code [16].
In this paper, we compare LDPC and MDPC code-based McEliece proposals and provide a procedure to optimize the density of the parity-check matrices of the private code, in such a way as to reach a fixed security level and, at the same time, keep complexity to the minimum. The paper is organized as follows: in Section II, we assess the error correction performance of the codes of interest, and its dependence on the parity-check matrix density; in Section III, we estimate the security level of the system by considering the most dangerous structural and local attacks; in Section IV, we show how to optimize the private parity-check matrix density by taking into account complexity; in Section V, we provide some system design examples through the proposed procedure and, finally, in Section VI, we draw some concluding remarks.
II Error correction performance
QC-LDPC and quasi-cyclic moderate-density parity-check (QC-MDPC) code-based variants of the McEliece cryptosystem use codes with length , dimension and redundancy , where is a small integer (e.g., ), , and is a large integer (on the order of some thousands or more). The code rate is therefore . Since adopting a rather high code rate is important to reduce the encryption overhead on the cleartext, in this work we focus on the choice , such that the size of a cleartext is times that of the corresponding ciphertext.
The private key contains a quasi-cyclic (QC) parity-check matrix having the following form [18, 15]:
(1) 
where each is a circulant matrix with row and column weight . It follows that the row weight of is . So, the code defined by is an LDPC code or MDPC code, according to the definition in [16]. Actually, the border between LDPC and MDPC codes is not sharp: MDPC codes are LDPC codes too, but their parity-check matrix density is not optimal with regard to the error rate performance.
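The block structure described above can be sketched in a few lines. This is only an illustration under stated assumptions: the names `n0` (number of circulant blocks), `p` (circulant size) and `dv` (weight of each circulant) are hypothetical labels, the sizes are toy values, and the blocks are drawn completely at random rather than by any specific design from the cited works.

```python
import random

def circulant(first_row):
    """Circulant matrix whose i-th row is first_row cyclically shifted right by i."""
    p = len(first_row)
    return [[first_row[(j - i) % p] for j in range(p)] for i in range(p)]

def random_first_row(p, weight, rng):
    """Binary row of length p with exactly `weight` ones at random positions."""
    row = [0] * p
    for pos in rng.sample(range(p), weight):
        row[pos] = 1
    return row

def qc_parity_check(n0, p, dv, seed=0):
    """Place n0 random circulant blocks of size p and weight dv side by side."""
    rng = random.Random(seed)
    blocks = [circulant(random_first_row(p, dv, rng)) for _ in range(n0)]
    return [sum((blk[i] for blk in blocks), []) for i in range(p)]

# Toy sizes chosen only for readability (assumption, not from the paper).
H = qc_parity_check(n0=3, p=11, dv=3)
row_weights = {sum(row) for row in H}        # every row has weight n0 * dv
col_weights = {sum(col) for col in zip(*H)}  # every column has weight dv
```

Each circulant row is a cyclic shift of the same first row, so every column of a block has the same weight as every row, and concatenating `n0` blocks multiplies the row weight by `n0`.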
The private key also contains two other matrices: a non singular scrambling matrix and an non singular transformation matrix having average row and column weight . For the sake of simplicity, was always chosen as an integer in previous proposals [14, 15], and was a regular matrix. However, can also be slightly irregular, in such a way that can be rational. This provides a further degree of freedom in the design of the system parameters, which will be exploited in this paper.
Let be the private code generator matrix; the public key is obtained as for the McEliece cryptosystem and as for the Niederreiter version [19]. In order to preserve the QC nature of the public keys, the matrices and are also chosen to be QC, that is, formed by and circulant blocks, respectively. This way, and by using a suitable CCA2 secure conversion of the system [3], which allows using public keys in systematic form, the public key size becomes equal to bits, which is very small compared to Goppa code-based instances. On the other hand, the use of in QC form limits the resolution of , which cannot vary by less than , but this is not an important limitation in the present context. When using MDPC codes, the matrix reduces to a permutation matrix (i.e., ). In this case, by using a CCA2 secure conversion of the system, and can be eliminated [16], since the public generator matrix can be in systematic form and can be directly used as the public key. In fact, unlike with Goppa codes, when using MDPC codes, exposing does not allow an attacker to perform efficient decoding.
Though the public matrices are dense, the public code admits a valid parity-check matrix in the form , which, due to the sparse nature of both and , has column and row weight approximately equal to and , respectively. The matrix also affects the intentional error vectors used for encryption, since if Alice adds intentional errors for encrypting a message, then Bob must be able to correct up to errors to decrypt it [15].
Concerning the error correction performance of the private code, though for LDPC codes its evaluation without simulations is in general a hard task, we can get a reasonable estimate by computing the bit flipping (BF) decoding threshold [15]. We have computed this threshold, for , by considering a fixed and optimized decision threshold for the BF decoder, and letting vary between and . Since we are interested in studying the dependence of the BF threshold on the parity-check matrix density, we computed such a threshold for different column weights () ranging between and . The results obtained are reported in Fig. 1. We observe that the decoding threshold, so estimated, increases linearly in the code length, and generally decreases for increasing parity-check matrix densities, though with some local oscillations.
Actually, the BF threshold represents the waterfall threshold when using BF decoding on an infinite-length code without cycles in the Tanner graph, and hence it does not correspond to sufficiently low error rates when such a decoding algorithm is used on finite-length codes. However, several variations and improvements of the BF algorithm have been proposed for decoding LDPC codes, and they actually provide very low, and even negligible, residual error rates when the number of errors equals, or slightly exceeds, the BF threshold [15]. Even better performance can be achieved by using LDPC decoding algorithms based on soft decision, like the sum-product algorithm (SPA). Thus, for these codes, we can actually use the BF threshold as a measure of the number of errors that can be corrected with very high probability. An example in this sense is provided in Fig. 2, where the error correcting performance achieved by eight QC-LDPC codes with , and through SPA decoding is reported. The residual bit error rate (BER) and codeword error rate (CER) after decoding have been assessed through simulation. According to Fig. 1, the BF threshold for these codes is errors, and Fig. 2 confirms that it provides a conservative estimate of the number of correctable errors.
The same conclusion does not seem to be valid for MDPC codes, especially for high values. As an example, we have considered a code with , and . Its BF threshold is at errors; however, we have verified through simulations that, with intentional errors, the SPA achieves a residual CER of about . This result can be improved by resorting to BF decoding. In fact, for MDPC codes, which have many short cycles in their Tanner graphs, using soft information may result in worse performance than using good hard-decision decoding algorithms. For example, the BF decoder with variable and optimized decision thresholds is able to reach a residual CER of about . However, these residual error rates confirm that, for MDPC codes, the BF threshold may overestimate the number of correctable errors.
Fig. 2 also provides another important piece of information. The first four codes considered (denoted by rand) were designed completely at random, that is, by randomly choosing the positions of the ones in the first row of each circulant block. The second four codes considered (denoted by RDF) were instead designed by using random difference families (RDF) [13].
From the figure we observe that no significant difference appears between the two sets of curves. These codes have the lowest parity-check matrix density among those considered, that is, . A similar behavior was observed in [16] for MDPC codes with on the order of or more. This suggests that, for the parity-check matrix densities that are of interest for this kind of application, there is no substantial difference between completely random and constrained random code designs. A difference would instead appear for sparser matrices, like those of interest for application of LDPC codes to transmissions (that is, with on the order of some units), for which short cycles in the Tanner graph deteriorate the code minimum distance. Hence, it is reasonable to conclude that a completely random code design can be used in this context, independently of the parity-check matrix density of the private code. Therefore, the security reduction provided in [16] also applies to LDPC code-based variants of the McEliece cryptosystem, similarly to those using MDPC codes.
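To make the hard-decision approach concrete, the following is a minimal sketch of a textbook-style bit-flipping decoder. It is not the optimized variable-threshold decoder referred to above: as a simplifying assumption, at each iteration it flips every bit that participates in the largest number of unsatisfied parity checks. The function names and the tiny parity-check matrix are hypothetical and serve only as an illustration.

```python
def syndrome(H, word):
    """Binary syndrome of `word` with respect to the parity-check matrix H."""
    return [sum(h & w for h, w in zip(row, word)) % 2 for row in H]

def bit_flip_decode(H, received, max_iters=20):
    """Simplified bit flipping: flip the bits involved in the most unsatisfied checks.

    Returns (word, success), where success is True when the syndrome is zero.
    """
    word = list(received)
    for _ in range(max_iters):
        s = syndrome(H, word)
        if not any(s):
            return word, True
        # For each bit, count the unsatisfied checks it takes part in.
        counts = [sum(s[i] for i in range(len(H)) if H[i][j])
                  for j in range(len(word))]
        worst = max(counts)
        word = [b ^ 1 if c == worst else b for b, c in zip(word, counts)]
    return word, not any(syndrome(H, word))

# Toy circulant parity-check matrix of a length-3 repetition code.
H_toy = [[1, 1, 0],
         [0, 1, 1],
         [1, 0, 1]]
decoded, ok = bit_flip_decode(H_toy, [1, 0, 0])  # all-zero codeword, one error
```

A practical decoder for the code sizes discussed here would work on sparse representations and tune the flipping threshold per iteration; this sketch only shows the basic mechanism.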
III Security level
The most dangerous attacks against the considered systems are dual code attacks (DCA) and information set decoding attacks (ISDA) [15]. In order to estimate the work factor (WF) of these attacks, we consider the algorithm proposed in [20] to search for low weight codewords in a random linear code. Actually, some advances have recently appeared in the literature concerning decoding of binary random linear codes [21, 22]. However, these works are more focused on asymptotic evaluations rather than on actual operation counts, which are needed for our WF estimations. Also “ball collision decoding”, proposed in [23], achieves important WF reductions asymptotically, but these reductions are negligible for the considered code lengths and security levels.
DCA aim at obtaining the private key from the public key by searching for low weight codewords in the dual of the public code. This way, an attacker could find the rows of , and then use , which is sparse, to decode the public code through LDPC decoding algorithms. The row weight of is and the corresponding multiplicity is . Figure 3 reports the values of the WF of DCA, as functions of , for the shortest and the longest code lengths here considered. We observe that, for a fixed , the two curves differ by less than , hence DCA exhibit a weak dependence on .
ISDA instead aim at finding the error vector affecting an intercepted ciphertext. This can be done by searching for the minimum weight codewords of the extended code generated by . This task is facilitated by the QC nature of the codes we consider, since each blockwise cyclically shifted version of an intercepted ciphertext is another valid ciphertext. Hence, can be further extended by adding blockwise shifted versions of the intercepted ciphertext, and the attacker can search for one among as many shifted versions of the error vector. We have considered the optimum number of shifted ciphertexts that can be used by an attacker, and computed the WF of ISDA according to the above procedure. The results obtained are reported in Fig. 4, as functions of the number of intentional errors, for the smallest and the largest code lengths here considered. Also in this case, we observe that the WF of the attack has a weak dependence on the code length.
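The shift property exploited by this attack can be checked numerically. The sketch below, with hypothetical names and toy sizes, verifies that the syndrome of a block-wise cyclically shifted vector is the cyclically shifted syndrome when the parity-check matrix is a row of circulant blocks. A zero syndrome therefore stays zero, so shifted codewords are codewords and a shifted ciphertext carries the correspondingly shifted error vector.

```python
import random

def circulant(first_row):
    """Circulant matrix whose i-th row is first_row cyclically shifted right by i."""
    p = len(first_row)
    return [[first_row[(j - i) % p] for j in range(p)] for i in range(p)]

def shift(v, t):
    """Cyclic right shift of list v by t positions."""
    t %= len(v)
    return v[-t:] + v[:-t] if t else list(v)

def blockwise_shift(word, p, t):
    """Apply the same cyclic shift to each length-p block of `word`."""
    return [x for b in range(0, len(word), p) for x in shift(word[b:b + p], t)]

def syndrome(H, word):
    return [sum(h & w for h, w in zip(row, word)) % 2 for row in H]

rng = random.Random(1)
p, n0 = 7, 3  # toy sizes, assumptions for illustration only
blocks = [circulant([rng.randint(0, 1) for _ in range(p)]) for _ in range(n0)]
H = [sum((blk[i] for blk in blocks), []) for i in range(p)]

x = [rng.randint(0, 1) for _ in range(n0 * p)]
s = syndrome(H, x)
shift_ok = all(syndrome(H, blockwise_shift(x, p, t)) == shift(s, t)
               for t in range(p))
```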
From Fig. 4 we also observe that the ISDA WF (in ) increases linearly in the number of intentional errors, and we know from Fig. 1 that the decoding threshold increases linearly in the code length. Hence, provided that is chosen in such a way that DCA have WF equal to or higher than ISDA, the security level of the system increases linearly in the code length, which is a desirable feature for any cryptosystem.
IV Density optimization
Some features of the McEliece cryptosystem variants we study are not affected by the private parity-check matrix density. One of them is the key size. In fact, the public key is always a dense matrix and, hence, its size does not change between LDPC and MDPC code-based variants. The public key size can be reduced to the minimum by using , as in [16], but this reduces the code rate to , which is less than in the original McEliece cryptosystem and its most recent variants. We instead consider , which gives slightly larger keys, but also a more sensible code rate. In fact, due to the QC nature of the public matrices, the public key size remains very small, and increases linearly in the code length, that is, for the considered cryptosystem, in the security level. Some examples of key size can be found in [14, 15, 16], both for classical cryptosystem versions and CCA2 secure conversions.
Also the encryption complexity is not affected by the private matrix density, since encryption is performed through the dense public matrix. Concerning decryption, the following steps must be performed to decrypt a ciphertext [15]:

i) multiplication of the ciphertext by ;
ii) LDPC decoding;
iii) multiplication of the decoded information word by .
The last step is not affected by the private parity-check matrix density, while the complexity of the first two steps depends on it. More specifically, the matrix is sparse, hence the cost of step i) is proportional to its average column weight (). Since, once having fixed according to the desired security level against DCA, equals , complexity depends on the private code parity-check matrix density.
LDPC decoding is performed through iterative algorithms working on the code Tanner graph, which has a number of edges equal to the number of ones in the code parity-check matrix. Hence, for a given , the choice of and represents a trade-off between the complexity of steps i) and ii): increasing (and decreasing , at most down to , as in MDPC code-based variants) decreases the complexity of step i) and increases that of step ii), while increasing (and decreasing , as in [14, 15]) increases the complexity of step i) and decreases that of step ii).
In order to assess this tradeoff, we define two compact complexity metrics for steps i) and ii): is the number of operations needed to perform multiplication of a vector by and , where is the average number of decoding iterations, is proportional to the number of operations needed to perform LDPC decoding. In order to provide the actual count of binary operations, the latter should be further multiplied by the number of binary operations () performed along each edge of the Tanner graph. However, this quantity depends on the specific decoding algorithm used. In order to keep our analysis as general as possible, we first consider , and we will comment on the effect of higher values of later on.
Since , optimizing the tradeoff between steps i) and ii) reduces to choosing which minimizes:
(2) 
This must be performed by considering a value of able to guarantee sufficient security against DCA (see Fig. 3) and a value of such that the code is able to correct errors, where is chosen in such a way as to reach a sufficient security level against ISDA (see Fig. 4).
We observe that the minimum of (2) corresponds to . However, for , the private code might be unable to correct all errors, hence a smaller value of might be necessary. In addition, a high value of implies a small and, if becomes too small, the private parity-check matrix could be discovered by enumeration. On the other hand, by decreasing below , the value of (2) increases, and reaches a maximum for , which is the minimum allowed to have a non singular matrix . Based on these considerations, we can conclude that the optimum value of is always greater than , and lies between and . By considering a more realistic value of , would further increase. However, this would have no effect on the actual optimal value of , which, for the system parameters that are of practical interest, always remains below .
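The trade-off discussed above can be explored numerically with a small sketch. Since the symbols of (2) are not reproduced here, the cost model below is an assumption built from the verbal description: step i) costs about `n * m` operations for a transformation matrix of average weight `m`, step ii) costs about `i_ave * n * dv` operations (one per Tanner graph edge per iteration), and the product `m * dv` is held fixed at a value called `dv_public` here. All names and numbers are hypothetical.

```python
import math

def decryption_cost(n, m, dv_public, i_ave):
    """Assumed cost of steps i) and ii): n*m + i_ave*n*dv, with dv = dv_public / m."""
    return n * m + i_ave * n * dv_public / m

def unconstrained_optimum(dv_public, i_ave):
    """Value of m minimizing the assumed cost, ignoring correction capability limits."""
    return math.sqrt(i_ave * dv_public)

# Hypothetical figures, chosen only to make the arithmetic easy to follow.
n, dv_public, i_ave = 10000, 90, 10
m_star = unconstrained_optimum(dv_public, i_ave)
cost_at_optimum = decryption_cost(n, m_star, dv_public, i_ave)
cost_at_m1 = decryption_cost(n, 1, dv_public, i_ave)
```

Under this assumed model the cost is largest at `m = 1` and decreases as `m` grows towards the unconstrained optimum; in practice the usable `m` is capped by the number of errors the private code can correct, as the text explains.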
Finally, we also observe that a low value of also affects the total number of different matrices which can be chosen as . When , the matrix becomes a QC permutation matrix , that is, a matrix formed by circulant blocks with size , among which only one block per row and per column is a circulant permutation matrix, while all the other blocks are null. Hence, the total number of different choices for is . For example, by considering the parameters proposed in [16] for achieving bit security, which are , and , we would have, respectively, , and different choices for , which would be too few to guarantee security. However, this weakness can be avoided by resorting to a CCA2 secure conversion of the system, and hence eliminating and , as pointed out in [16]. On the other hand, when using higher values of , this potential weakness can easily be avoided, just for moderately high values of (like ), as needed for achieving high code rates.
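The count of QC permutation matrices follows directly from the description above: choose which block in each block row is non-null (a permutation of the blocks), then choose one of the possible cyclic shifts for each non-null block. The sketch below uses the hypothetical names `n0` (blocks per side) and `p` (circulant size), computes the resulting count `n0! * p**n0`, and cross-checks it by brute-force enumeration for tiny sizes.

```python
from itertools import permutations, product
from math import factorial

def count_qc_permutations(n0, p):
    """Number of QC permutation matrices with n0 x n0 circulant blocks of size p."""
    return factorial(n0) * p ** n0

def enumerate_qc_permutations(n0, p):
    """Brute-force count of distinct QC permutation matrices (tiny sizes only)."""
    seen = set()
    for perm in permutations(range(n0)):
        for shifts in product(range(p), repeat=n0):
            rows = []
            for block_row in range(n0):
                for i in range(p):
                    row = [0] * (n0 * p)
                    # Single one per row, inside the selected block, cyclically shifted.
                    row[perm[block_row] * p + (i + shifts[block_row]) % p] = 1
                    rows.append(tuple(row))
            seen.add(tuple(rows))
    return len(seen)

small_formula = count_qc_permutations(2, 3)
small_brute = enumerate_qc_permutations(2, 3)
```

Because the count grows only polynomially in `p` for a fixed number of blocks, it stays small when there are few blocks, which is the weakness noted in the text.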
V Design examples
We first consider the target of bit security. According to Figs. 3 and 4 (and assuming the shortest code length there considered, which provides a conservative estimate), this can be achieved, with , by choosing and . An MDPC code with length and has a BF threshold equal to errors, and we have verified that it is actually able to correct errors with very high probability. Hence these parameters provide a bit security system design with . Instead, if we fix (that is, ), we have . From Fig. 1 it results that an LDPC code with and has a BF threshold equal to errors, and we have shown in Section II that, for such sparse codes, the BF threshold actually provides a conservative estimate of the number of correctable errors. So, we have two system designs which achieve the same security level, but with different matrix densities. In these two cases, and by considering that a typical value of is , we have and .
As another example, we consider a bit security level. Similarly to the previous case, from Figs. 3 and 4 we obtain that this requires and . An MDPC code-based design can be obtained with code length (and ), which provides a BF threshold equal to errors. We have verified that such an MDPC code is actually able to correct errors with very high probability, hence this solution reaches bit security with . An LDPC code-based alternative can be obtained by using the same code length and , that is, . In this case, the BF threshold is equal to errors, hence the code is able to correct all the errors with very high probability. In these cases (and with ), we have and .
These examples confirm that, for a fixed security level, choosing sparser codes, and hence higher values of , is advantageous from the complexity viewpoint.
VI Conclusion
In this paper, we have analyzed the choice of the private parity-check matrix density in QC-LDPC code-based variants of the McEliece cryptosystem. We have shown that a given security level can be achieved by balancing the density of the private parity-check matrix and that of the matrix used to disguise into the public key.
Through some practical examples, we have shown that, from the complexity standpoint, it is generally preferable to decrease the density of the private parity-check matrix and to increase that of the transformation matrix . For this reason, LDPC code-based instances of the system are preferable to MDPC code-based instances if one wishes to keep complexity at its minimum, for a fixed security level.
References
 [1] R. J. McEliece, “A public-key cryptosystem based on algebraic coding theory,” DSN Progress Report, pp. 114–116, 1978.
 [2] H. Niederreiter, “Knapsack-type cryptosystems and algebraic coding theory,” Probl. Contr. and Inform. Theory, vol. 15, pp. 159–166, 1986.
 [3] D. J. Bernstein, T. Lange, and C. Peters, “Attacking and defending the McEliece cryptosystem,” in Post-Quantum Cryptography, ser. Lecture Notes in Computer Science. Springer Verlag, 2008, vol. 5299, pp. 31–46.
 [4] R. G. Gallager, “Low-density parity-check codes,” IRE Trans. Inform. Theory, vol. IT-8, pp. 21–28, Jan. 1962.
 [5] D. J. C. MacKay, “Good error correcting codes based on very sparse matrices,” IEEE Trans. Inform. Theory, vol. 45, no. 2, pp. 399–432, Mar. 1999.
 [6] E. Paolini and M. Chiani, “Construction of near-optimum burst erasure correcting low-density parity-check codes,” IEEE Trans. Commun., vol. 57, no. 5, pp. 1320–1328, May 2009.
 [7] M. Baldi, M. Bianchi, and F. Chiaraluce, “Non-systematic codes for physical layer security,” in Proc. IEEE Information Theory Workshop (ITW 2010), Dublin, Ireland, Aug. 2010.
 [8] ——, “Increasing physical layer security through scrambled codes and ARQ,” in Proc. IEEE International Conference on Communications (ICC 2011), Kyoto, Japan, Jun. 2011.
 [9] ——, “Coding with scrambling, concatenation, and HARQ for the AWGN wiretap channel: A security gap analysis,” IEEE Trans. Inf. Forensics Security, vol. 7, no. 3, pp. 883–894, Jun. 2012.
 [10] F. Renna, N. Laurenti, S. Tomasin, M. Baldi, N. Maturo, M. Bianchi, F. Chiaraluce, and M. Bloch, “Low-power secret key agreement over OFDM,” in Proc. ACM HotWiSec 2013, Budapest, Hungary, Apr. 2013.
 [11] C. Monico, J. Rosenthal, and A. Shokrollahi, “Using low density parity check codes in the McEliece cryptosystem,” in Proc. IEEE International Symposium on Information Theory (ISIT 2000), Sorrento, Italy, Jun. 2000, p. 215.
 [12] M. Baldi, F. Chiaraluce, R. Garello, and F. Mininni, “Quasi-cyclic low-density parity-check codes in the McEliece cryptosystem,” in Proc. IEEE International Conference on Communications (ICC 2007), Glasgow, Scotland, Jun. 2007, pp. 951–956.
 [13] M. Baldi and F. Chiaraluce, “Cryptanalysis of a new instance of McEliece cryptosystem based on QC-LDPC codes,” in Proc. IEEE International Symposium on Information Theory (ISIT 2007), Nice, France, Jun. 2007, pp. 2591–2595.
 [14] M. Baldi, M. Bodrato, and F. Chiaraluce, “A new analysis of the McEliece cryptosystem based on QC-LDPC codes,” in Security and Cryptography for Networks, ser. Lecture Notes in Computer Science. Springer Verlag, 2008, vol. 5229, pp. 246–262.
 [15] M. Baldi, M. Bianchi, and F. Chiaraluce. (2012) Security and complexity of the McEliece cryptosystem based on QC-LDPC codes. Accepted for publication in IET Information Security. [Online]. Available: http://arxiv.org/abs/1109.5827
 [16] R. Misoczki, J.-P. Tillich, N. Sendrier, and P. S. L. M. Barreto. (2012) MDPC-McEliece: New McEliece variants from moderate density parity-check codes. [Online]. Available: http://eprint.iacr.org/2012/409
 [17] F. P. Biasi, P. S. L. M. Barreto, R. Misoczki, and W. V. Ruggiero. (2012) Scaling efficient code-based cryptosystems for embedded platforms. [Online]. Available: http://arxiv.org/abs/1212.4317
 [18] M. Baldi, F. Bambozzi, and F. Chiaraluce, “On a family of circulant matrices for quasi-cyclic low-density generator matrix codes,” IEEE Trans. Inform. Theory, vol. 57, no. 9, pp. 6052–6067, Sep. 2011.
 [19] M. Baldi, M. Bianchi, F. Chiaraluce, J. Rosenthal, and D. Schipani. (2011) Enhanced public key security for the McEliece cryptosystem. Submitted to the Journal of Cryptology. [Online]. Available: http://arxiv.org/abs/1108.2462
 [20] C. Peters, “Information-set decoding for linear codes over ,” in Post-Quantum Cryptography, ser. Lecture Notes in Computer Science. Springer Verlag, 2010, vol. 6061, pp. 81–94.
 [21] A. May, A. Meurer, and E. Thomae, “Decoding random linear codes in ,” in ASIACRYPT 2011, ser. Lecture Notes in Computer Science. Springer Verlag, 2011, vol. 7073, pp. 107–124.
 [22] A. Becker, A. Joux, A. May, and A. Meurer, “Decoding random binary linear codes in : How 1 + 1 = 0 improves information set decoding,” in EUROCRYPT 2012, ser. Lecture Notes in Computer Science. Springer Verlag, 2012.
 [23] D. J. Bernstein, T. Lange, and C. Peters, “Smaller decoding exponents: ball-collision decoding,” in CRYPTO 2011, ser. Lecture Notes in Computer Science. Springer Verlag, 2011, vol. 6841, pp. 743–760.