List-decoding algorithms for lifted codes

Alan Guo, CSAIL, Massachusetts Institute of Technology, 32 Vassar Street, Cambridge, MA, USA. aguo@mit.edu. Research supported in part by NSF grants CCF-0829672, CCF-1065125, and CCF-6922462, and an NSF Graduate Research Fellowship.

Swastik Kopparty, Department of Mathematics & Department of Computer Science, Rutgers University. swastik.kopparty@rutgers.edu. Research supported in part by a Sloan Fellowship and NSF CCF-1253886.

Abstract

Lifted Reed-Solomon codes are a natural affine-invariant family of error-correcting codes which generalize Reed-Muller codes. They were known to have efficient local-testing and local-decoding algorithms (comparable to the known algorithms for Reed-Muller codes), but with significantly better rate. We give efficient algorithms for list-decoding and local list-decoding of lifted codes. Our algorithms are based on a new technical lemma, which says that codewords of lifted codes are low degree polynomials when viewed as univariate polynomials over a big field (even though they may be very high degree when viewed as multivariate polynomials over a small field).

1 Introduction

By virtue of their many powerful applications in complexity theory, there has been much interest in the study of error-correcting codes which support “local” operations. The operations of interest include local decoding, local testing, local correcting, and local list-decoding. Error correcting codes equipped with such local algorithms have been useful, for example, in proof-checking, private information retrieval, and hardness amplification.

The canonical example of a code which supports all the above local operations is the Reed-Muller code, which is a code based on evaluations of low-degree polynomials. Reed-Muller codes have nontrivial local algorithms across a wide range of parameters. In this paper, we will be interested in the constant rate regime. For a long time, Reed-Muller codes were the only known codes in this regime supporting nontrivial locality. Concretely, for every constant integer and every constant , there are Reed-Muller codes of arbitrarily large length , rate , constant relative distance , which are locally decodable/testable/correctable from fraction errors using queries. In particular, no nontrivial locality was known for Reed-Muller codes (or any other codes, until recently) with rate .

In the last few years, new families of codes were found which had interesting local algorithms in the high rate regime (i.e., with rate near $1$). These codes include multiplicity codes [KSY11, Kop12], lifted codes [GKS13, Guo13], expander codes [HOW13] and tensor codes [Vid10]. Of these, lifted codes are the only ones that are known to be both locally decodable and locally testable. This paper gives new and improved decoding and testing algorithms for lifted codes.

1.1 Lifted Codes and our Main Result

Lifted codes are a natural family of algebraic, affine-invariant codes which generalize Reed-Muller codes. We give a brief introduction to these codes now. (Technically we are talking about lifted Reed-Solomon codes, but for brevity we refer to them as lifted codes.) Let $q$ be a prime power, let $d < q$ and let $m > 1$ be an integer. Define the alphabet $\Sigma = \mathbb{F}_q$. We define the lifted code $\mathcal{C}$ to be a subset of $\Sigma^{\mathbb{F}_q^m}$, the space of functions from $\mathbb{F}_q^m$ to $\mathbb{F}_q$. A function $f$ is in $\mathcal{C}$ if for every line $L \subseteq \mathbb{F}_q^m$, the restriction of $f$ to $L$ is a univariate polynomial of degree at most $d$. Note that if $f$ is the evaluation table of an $m$-variate polynomial of degree at most $d$, then $f$ is automatically in $\mathcal{C}$. The surprising (and useful) fact is that if $d$ is large and $\mathbb{F}_q$ has small characteristic, then $\mathcal{C}$ has significantly more functions, but has the same distance as the Reed-Muller code. This leads to its improved rate relative to the corresponding Reed-Muller code, which only contains the evaluation tables of low degree polynomials.
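To make the definition concrete, the following is a minimal sketch (not taken from the paper) that checks the line-restriction condition by brute force in the smallest interesting case, $q = 4$, $d = 2$, $m = 2$. The encoding of $\mathbb{F}_4$ as the integers 0..3 (bit patterns of polynomials modulo $x^2 + x + 1$) and all function names are assumptions made for illustration. It uses the fact that a function $\mathbb{F}_4 \to \mathbb{F}_4$ has degree at most 2 exactly when its four values sum to zero, since that sum is the coefficient of $t^3$ in characteristic 2.

```python
from itertools import product

Q = 4  # field size; elements of GF(4) are encoded as the integers 0..3


def gf4_add(a, b):
    # Addition in characteristic 2 is bitwise XOR.
    return a ^ b


def gf4_mul(a, b):
    # Carry-less multiplication of 2-bit polynomials, reduced mod x^2 + x + 1.
    r = 0
    for i in range(2):
        if (b >> i) & 1:
            r ^= a << i
    if r & 0b100:
        r ^= 0b111
    return r


def restriction_has_degree_at_most_2(g):
    # g lists the 4 values of a function GF(4) -> GF(4).
    # The coefficient of t^3 in its interpolating polynomial is the sum of
    # the values, so degree <= 2 holds exactly when that sum is zero.
    total = 0
    for value in g:
        total = gf4_add(total, value)
    return total == 0


def in_lifted_rs_degree_2(f):
    # Check every line {b + t*a : t in GF(4)} with direction a != (0, 0).
    for a1, a2, b1, b2 in product(range(Q), repeat=4):
        if (a1, a2) == (0, 0):
            continue
        g = [f(gf4_add(b1, gf4_mul(t, a1)), gf4_add(b2, gf4_mul(t, a2)))
             for t in range(Q)]
        if not restriction_has_degree_at_most_2(g):
            return False
    return True


def x2y2(x, y):
    # The monomial x^2 * y^2, of total degree 4.
    return gf4_mul(gf4_mul(x, x), gf4_mul(y, y))


def x3(x, y):
    # The monomial x^3, of total degree 3.
    return gf4_mul(x, gf4_mul(x, x))
```

Under these assumptions, `x2y2` passes the check even though its total degree is 4, because on any line it expands to a combination of $1$, $t^2$ and $t^4$, and $t^4 = t$ as a function on $\mathbb{F}_4$. The monomial `x3` fails on the line $(t, 0)$.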

Our main result is an algorithm for list-decoding and local list-decoding of lifted codes. We show that lifted codes of distance $\delta$ can be efficiently list-decoded and locally list-decoded (in sublinear time) up to their “Johnson radius” ($1 - \sqrt{1-\delta}$). Combined with the local testability of lifted codes, this also implies that lifted codes can be locally tested in the high-error regime, up to the Johnson radius.
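As a quick numerical illustration, assuming the standard alphabet-independent form of the Johnson radius for a code of relative distance $\delta$ (the symbol $J$ is introduced here only for this example), the radius can be compared with the unique-decoding radius $\delta/2$:

```latex
\[
  J(\delta) = 1 - \sqrt{1 - \delta}, \qquad
  J(0.5) \approx 0.293 > 0.25 = \tfrac{0.5}{2}, \qquad
  J(0.9) \approx 0.684 > 0.45 = \tfrac{0.9}{2}.
\]
```

The gap over $\delta/2$ grows as $\delta$ approaches 1, which is the high-error regime where list decoding matters most.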

It is well known that Reed-Muller codes can be list-decoded and locally list-decoded up to the Johnson radius [PW04, STV99]. (To locally list-decode all the way up to the Johnson bound, one actually needs a variant of [STV99] given in [BK09]. There is another regime, where $q$ is constant, in which the Reed-Muller codes can be list-decoded beyond the Johnson bound, up to the minimum distance; see [GKZ08, Gop10, BL14].) Our result shows that a lifted code, which is a natural algebraic supercode of Reed-Muller codes, despite having a vastly greater rate than the corresponding Reed-Muller code, loses absolutely nothing in terms of any (local) algorithmic decoding / testing properties.

In the appendix, we also prove two other results as part of the basic toolkit for working with lifted codes.

  • Explicit interpolating sets: For a lifted code $\mathcal{C}$, we give a strongly explicit subset $S$ of $\mathbb{F}_q^m$ such that for every $g: S \to \mathbb{F}_q$, there is a unique lifted codeword $f$ from $\mathcal{C}$ with $f|_S = g$. The main interest in explicit interpolating sets for us is that they allow us to convert the sublinear-time local correction algorithm for lifted codes into a sublinear-time local decoding algorithm for lifted codes (earlier, the known sublinear-time local correction only implied low-query-complexity local decoding, without any associated sublinear-time local decoding algorithm).

  • Simple local decoding up to half the minimum distance: We note that there is a simple algorithm for local decoding of lifted codes up to half the minimum distance. This is a direct translation of the elegant weighted-lines local decoding algorithm for matching-vector codes [BET10] to the Reed-Muller code / lifted codes setting.

1.2 Methods

We first discuss our (global) list-decoding algorithm, which generalizes the list-decoding algorithm for Reed-Muller codes due to Pellikaan-Wu [PW04]. The main technical lemma underlying our algorithm says that codewords of lifted codes are low-degree when viewed as univariate polynomials. This generalizes the classical fact due to Kasami-Lin-Peterson [KLP68] underlying the Pellikaan-Wu decoding algorithm: that multivariate polynomials are low-degree when viewed as univariate polynomials (“Reed-Muller codes are subcodes of Reed-Solomon codes”).

The codewords of a lifted code are in general very high degree as $m$-variate polynomials over $\mathbb{F}_q$. There is a description of these codes in terms of spanning monomials [GKS13], but it is not even clear from this description that lifted codes have good distance. The handle that we get on lifted codes arises by considering the big field $\mathbb{F}_{q^m}$, and letting $\phi$ be an $\mathbb{F}_q$-linear isomorphism between $\mathbb{F}_{q^m}$ and $\mathbb{F}_q^m$. Given a function $f: \mathbb{F}_q^m \to \mathbb{F}_q$, we can consider the composed function $f \circ \phi$, and view it as a function from $\mathbb{F}_{q^m}$ to $\mathbb{F}_{q^m}$. Our technical lemma says that this function is low-degree as a univariate polynomial over $\mathbb{F}_{q^m}$ (irrespective of the choice of the map $\phi$).

Through this lemma, we reduce the problem of list-decoding lifted codes over the small field $\mathbb{F}_q$ to the problem of list-decoding univariate polynomials (i.e., Reed-Solomon codes) over the large field $\mathbb{F}_{q^m}$. This latter problem can be solved using the Guruswami-Sudan algorithm [GS99].

Our local list-decoding algorithm uses the above list-decoding algorithm. Following [AS03, STV99, BK09], local list-decoding of -variate Reed-Muller codes over reduces to (global) list-decoding of -variate Reed-Muller codes over (for some ). For the list-decoding radius to approach the Johnson radius, one needs . This is where the above list-decoding algorithm gets used.

Organization of this paper

Section 2 introduces notation and preliminary definitions and facts to be used in later proofs. Section 3 proves our main technical result, that lifted RS codes over the domain $\mathbb{F}_q^m$ are low degree when viewed as univariate polynomials over $\mathbb{F}_{q^m}$, as well as the consequence for global list decoding. Section 4 presents and analyzes the local list decoding algorithm for lifted RS codes, along with the consequence for local testability. Appendix A describes the explicit interpolating sets for arbitrary lifted affine-invariant codes. Appendix B presents and analyzes the local correction algorithm up to half the minimum distance.

2 Preliminaries

2.1 Notation

For a positive integer $n$, we use $[n]$ to denote the set $\{1, \ldots, n\}$. For sets $A$ and $B$, we use $\{A \to B\}$ to denote the set of functions mapping $A$ to $B$.

For a prime power , is the finite field of size . We think of a code as a family of functions , where is an extension field of , but each codeword is a vector of evaluations assuming some canonical ordering of elements in ; we abuse notation and say to mean .

If and line is a line in , this formally means is specified by some and the restriction of to , denoted by , means the function . Similarly, if is a plane, then it is specified by some and the restriction of to , denoted by , means the function .

2.2 Interpolating sets and decoding

Definition 2.1 (Interpolating set).

A set is an interpolating set for if for every there exists a unique such that .

Note that if is an interpolating set for , then .

Definition 2.2 (Local decoding).

Let be an alphabet and let be an encoding map. A -local decoding algorithm for is a randomized algorithm with oracle access to an input word and satisfies the following:

  1. If there is a message such that , then for every input , we have .

  2. On every input , always makes at most queries to .

We call the fraction of errors decodable, or the decoding radius, and we call the query complexity.

Definition 2.3 (Local correction).

Let be a code. A -local correction algorithm for is a randomized algorithm with oracle access to an input word and satisfies the following:

  1. If there is a codeword such that , then for every input , we have .

  2. On every input , always makes at most queries to .

As before, is the decoding radius and is the query complexity.

The definition and construction of interpolating sets is motivated by the fact that if we have an explicit interpolating set for a code , then we have an explicit systematic encoding for , which allows us to easily transform a local correction algorithm into a local decoding algorithm.
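The simplest instance of this idea is the Reed-Solomon code itself: for polynomials of degree less than $k$ over a prime field $\mathbb{F}_p$, any $k$ distinct points form an interpolating set, and Lagrange interpolation gives the systematic encoding. The sketch below illustrates only this special case, not the construction of Appendix A; the function name is hypothetical, and `pow(den, -1, p)` assumes Python 3.8 or later.

```python
def interpolate_and_encode(points, values, p):
    """Return the evaluations on all of F_p of the unique polynomial of degree
    < len(points) that takes the given values on the given distinct points."""
    codeword = []
    for x in range(p):
        total = 0
        for i, (xi, yi) in enumerate(zip(points, values)):
            # Lagrange basis polynomial for xi, evaluated at x.
            num, den = 1, 1
            for j, xj in enumerate(points):
                if j != i:
                    num = num * (x - xj) % p
                    den = den * (xi - xj) % p
            # den is nonzero mod p because the points are distinct.
            total = (total + yi * num * pow(den, -1, p)) % p
        codeword.append(total)
    return codeword
```

The message symbols appear verbatim at the positions of the interpolating set, so a local corrector for the code, queried at those positions, acts as a local decoder for the message.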

Definition 2.4 (List decoding).

Let be a code. A -list decoding algorithm for is an algorithm which takes as input a received word and outputs a list of size containing all such that . The parameter is the list-decoding radius and is the list size.
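For intuition only, the following sketch implements the definition by exhaustive search for a tiny Reed-Solomon code over a prime field $\mathbb{F}_p$. It is exponential in the dimension and is not the algorithm used in this paper; the function name and the choice of evaluation points $0, \ldots, p-1$ are assumptions made for illustration.

```python
from itertools import product


def brute_force_list_decode(received, p, degree, radius):
    """Return coefficient tuples (c_0, ..., c_degree) of all polynomials over
    F_p of degree <= `degree` whose evaluation vector on 0..p-1 differs from
    `received` in at most `radius` positions."""
    out = []
    for coeffs in product(range(p), repeat=degree + 1):
        errors = 0
        for x in range(p):
            value = sum(c * pow(x, k, p) for k, c in enumerate(coeffs)) % p
            if value != received[x]:
                errors += 1
        if errors <= radius:
            out.append(coeffs)
    return out
```

For example, over $\mathbb{F}_5$ the word `[1, 3, 4, 2, 4]` is the evaluation of $1 + 2x$ with one corrupted position. With `radius=2` the list contains only `(1, 2)`, while with `radius=3` further lines such as $1 + 4x$ appear.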

Definition 2.5 (Local list decoding).

Let be a code. A -local list decoding algorithm for is a randomized algorithm with oracle access to an input word and outputs a collection of randomized oracles with oracle access to satisfying the following:

  1. With high probability, it holds that for every such that , there exists a such that for every , .

  2. makes at most queries to , and on any input and for every , makes at most queries to .

As before, is the list decoding radius, is the list size, and is the query complexity.

2.3 Affine-invariant codes

Definition 2.6 (Affine-invariant code).

A code is affine-invariant if for every and affine permutation , the function is in .

Definition 2.7 (Degree set).

For a function , written as , its support is . If is an affine-invariant code, then its degree set is

Proposition 2.8 ([BGM11]).

If is a linear affine-invariant code, then .

In particular, if is an interpolating set for an affine-invariant code , then . Proposition 2.8 will be used in Appendix A.

2.4 Lifted codes

Definition 2.9 (Lift).

Let be an affine-invariant code. For integer , the -th dimensional lift of , , is the code

Let be the Reed-Solomon code of degree over ,

Definition 2.10 (Lifted Reed-Solomon code).

The -variate lifted Reed-Solomon code of degree over is the code

For positive integers , we say is in the -shadow of , or , if dominates digit-wise in base : in other words, if and are the -ary representations, then for all . We define the notion of -shadow for vectors recursively as follows. A vector is in the -shadow of , denoted by , if and . It follows easily from the definition that if , then . The following fact motivates these definitions.

Proposition 2.11 (Lucas’ theorem).

Let be positive integers and and let $p$ be a prime. The multinomial coefficient is nonzero modulo $p$ if and only if .
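The binomial case of Lucas' theorem is easy to check numerically. The sketch below implements the digit-wise shadow relation for integers as described above and compares it against `math.comb` (Python 3.8 or later); the helper names are hypothetical.

```python
from math import comb


def digits(n, p):
    """Base-p digits of n, least significant first."""
    out = []
    while n:
        n, r = divmod(n, p)
        out.append(r)
    return out


def in_shadow(a, b, p):
    """True if every base-p digit of a is at most the matching digit of b."""
    da, db = digits(a, p), digits(b, p)
    if len(da) > len(db):
        return False
    return all(x <= y for x, y in zip(da, db))


def lucas_agrees(p, limit):
    """Check that C(b, a) is nonzero mod p exactly when a is in the p-shadow of b,
    for all 0 <= a <= b < limit."""
    return all(
        (comb(b, a) % p != 0) == in_shadow(a, b, p)
        for b in range(limit) for a in range(b + 1)
    )
```

For instance, with $p = 2$ the integer 5 (binary 101) lies in the shadow of 7 (binary 111), while 2 (binary 010) does not lie in the shadow of 5, matching the fact that $\binom{5}{2} = 10$ is even.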

For integers and , we define the mod-star operator by if and if and . This is motivated by the fact that defines the same function as over .
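A small sketch of this operator, under the assumption that it is the usual exponent reduction (0 stays 0, and a positive exponent is reduced modulo $q - 1$ into the range $1, \ldots, q-1$). The check of the motivating fact is restricted to prime $q$ so that field arithmetic is plain modular arithmetic; the function name `mod_star` is hypothetical.

```python
def mod_star(a, q):
    """Reduce an exponent a >= 0 into {0, ..., q-1} without changing the
    function x -> x**a on the field with q elements."""
    if a == 0:
        return 0
    r = a % (q - 1)
    return r if r != 0 else q - 1


def same_function(a, q):
    """For prime q, check that x**a and x**mod_star(a, q) agree on all of F_q."""
    return all(pow(x, a, q) == pow(x, mod_star(a, q), q) for x in range(q))
```

Note that a positive multiple of $q - 1$ reduces to $q - 1$ rather than to 0: the monomial $x^{q-1}$ vanishes at $x = 0$, whereas the constant $x^0 = 1$ does not.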

Remark 2.12.

For , note that if and only if there is some integer such that .

Proposition 2.13 ([GKS13]).

The lifted Reed-Solomon code is spanned by monomials such that for every , , we have .

Proposition 2.14 ([GKS13]).

The lifted Reed-Solomon code has distance

2.5 Finite field isomorphisms

Let be the -linear trace map . Let be linearly independent over and let be the map . Since is -linear, is an -linear map and in fact it is an isomorphism. Observe that induces a -linear isomorphism defined by for all .
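The smallest case of this construction can be written out explicitly. The sketch below assumes the map has the form $z \mapsto (\mathrm{Tr}(\alpha_1 z), \ldots, \mathrm{Tr}(\alpha_m z))$ and instantiates it for the extension $\mathbb{F}_4$ over $\mathbb{F}_2$ with the basis $\{1, x\}$. The integer encoding of $\mathbb{F}_4$ (bit patterns of polynomials modulo $x^2 + x + 1$) and all names are assumptions made for illustration.

```python
def gf4_mul(a, b):
    # Carry-less multiplication of 2-bit polynomials, reduced mod x^2 + x + 1.
    r = 0
    for i in range(2):
        if (b >> i) & 1:
            r ^= a << i
    if r & 0b100:
        r ^= 0b111
    return r


def trace(z):
    # Trace from GF(4) to GF(2): z + z^2, which always lands in {0, 1}.
    return z ^ gf4_mul(z, z)


BASIS = (1, 2)  # the elements 1 and x, linearly independent over GF(2)


def phi(z):
    # Map an element of GF(4) to a vector in GF(2)^2 via traces against the basis.
    return tuple(trace(gf4_mul(alpha, z)) for alpha in BASIS)
```

Enumerating the four field elements shows that `phi` hits every vector of $\mathbb{F}_2^2$ exactly once and respects addition (XOR on both sides), so in this case it is an $\mathbb{F}_2$-linear isomorphism.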

3 Global list decoding

In this section, we present an efficient global list decoding algorithm for . Define , , and as in Section 2.5. The key new structural result, Theorem 3.2, states that is isomorphic to a subcode of . In particular, this lets us list decode by list decoding up to the Johnson radius. We will use this algorithm for as a subroutine in our local list decoding algorithm in Section 4.

3.1 Lifted Reed-Solomon codes are subcodes of Reed-Solomon codes

We begin with a lemma on monomials in lifted Reed-Solomon codes. We postpone the proof of this lemma to Section 3.2.

Lemma 3.1.

Let satisfy , where . Write , where and . Then .

We now state and prove our main structural theorem, which shows that codewords of an $m$-variate lifted Reed-Solomon code over $\mathbb{F}_q$ are low degree when viewed as univariate polynomials over $\mathbb{F}_{q^m}$.

Theorem 3.2.

Let . If , then .

Proof.

By Proposition 2.13 and linearity, it suffices to prove this for a monomial , where have the property that for every with , we have .

For , by the multinomial theorem we get the following expansion:

We now use Lucas’ theorem to understand the multinomial coefficients (in a similar manner to Lemma B.2 and Proposition 2.8 in [GKS13]), and this tells us that many terms in this sum equal $0$. So we get that is of the form:

To conclude the proof of this theorem, we just need to show that the only monomials that appear in the above expression are all such that is at most . Concretely, we need to show that whenever satisfy (1) for all , and (2) , then we have the bound

Recall that Proposition 2.13 allowed us to assume that have the property that for every , , we have . Therefore, for some and .

We now proceed to give upper and lower bounds on , which will then enable us to show that . We start with the upper bound:

We proceed with the lower bound. If , then . Suppose . Since is transitive, by Proposition 2.13, the monomial . Recall that . Thus by Lemma 3.1, . Therefore,

To summarize, if , then , and if , then . In both cases, we get that , as desired. ∎

Corollary 3.3.

There is a polynomial time global list decoding algorithm for that decodes up to fraction errors. In particular, if and , then and the list decoding algorithm decodes up to fraction errors as .

Proof.

Given , convert it to , and then run the Guruswami-Sudan list decoder for on to obtain a list with the guarantee that any with lies in . We require that any satisfying also satisfies , and this follows immediately from Theorem 3.2. The fact that when and follows immediately from Proposition 2.14. ∎

3.2 Proof of Lemma 3.1

We begin with three simple claims about the relation.

Claim 3.4.

If , then there exist such that for each and .

Proof.

The coefficient of in is . By Proposition 2.11, the hypothesis implies that this coefficient is nonzero modulo , hence there is some choice of such that is nonzero modulo . By Proposition 2.11, for each . ∎

Claim 3.5.

Let and . If , then there exists such that .

Proof.

Let . We have the identity

from the fact that the LHS counts the number of ways of choosing elements from , whereas the RHS counts the same thing by picking elements from and picking elements from . The LHS is by Proposition 2.11. Using the identity above, there must be some such that . Again, by Proposition 2.11, . ∎

Claim 3.6.

If , where and is a power of prime , then there exists such that .

Proof.

Write and , where . Then . But , therefore . Therefore, it suffices to find such that . If , then we can simply take . Otherwise, if , then , for if not, then and and therefore , a contradiction. By Claim 3.5, there exists such that . Set . ∎

We can now complete the proof of Lemma 3.1.

Proof of Lemma 3.1.

If , then the result trivially holds. Suppose . Then . Suppose, for the sake of contradiction, that . By Claim 3.6, there exists such that . By Claim 3.4, there exist such that for and . For , define . Then , and so by Proposition 2.13 we have . On the other hand, . We can lower bound this by

and upper bound this by

and so , a contradiction. ∎

4 Local list decoding

In this section, we present a local list decoding algorithm for , where which decodes up to radius for any constant , with list size and query complexity .

Local list decoder:

Oracle access to received word .

  1. Pick a random line in .

  2. Run Reed-Solomon list decoder (e.g. Guruswami-Sudan) on from fraction errors to get list of Reed-Solomon codewords.

  3. For each , output

where Correct is a local correction algorithm for the lifted codes for fraction errors, and is an oracle which takes as advice a line and a univariate polynomial and simulates oracle access to a function which is supposed to be close to a lifted RS codeword.

Oracle :

  1. If contains , i.e.  for some and , then output .

  2. Otherwise, let be the plane containing and , parametrized by .

    1. Use the global list decoder for bivariate lifted RS code given above to list decode from fraction errors and obtain a list .

    2. If there exists a unique such that , output , otherwise fail.

Analysis:

To show that this works, we just have to show that, with high probability over the choice of , for every lifted RS codeword such that , there is such that , i.e. .

We will proceed in two steps:

  1. First, we show that with high probability over , there is some such that .

  2. Next, we show that .

For the first step, note that if . Note that has mean with variance less than (by pairwise independence of points on a line), so by Chebyshev’s inequality the probability that is .
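The sampling property of lines used here can be checked exactly in a toy case. The sketch below enumerates every parametrized line in $\mathbb{F}_p^2$ for a prime $p$ and computes the exact mean and variance of the fraction of the line that falls in a fixed set $S$. The function name and the restriction to dimension 2 and prime $p$ are assumptions made for illustration; the bound $\mu(1-\mu)/p$ compared against is the usual pairwise-independence estimate, stated here as an assumption about the elided bound.

```python
from fractions import Fraction
from itertools import product


def line_agreement_stats(S, p):
    """Exact mean and variance, over all lines in F_p^2, of the fraction of
    points of the line that lie in S (p prime, S a set of pairs)."""
    fractions_on_lines = []
    # Each line is listed once per parametrization (b, a) with a != (0, 0);
    # every line has the same number of parametrizations, so this is uniform.
    for b1, b2, a1, a2 in product(range(p), repeat=4):
        if (a1, a2) == (0, 0):
            continue
        hits = sum(((b1 + t * a1) % p, (b2 + t * a2) % p) in S for t in range(p))
        fractions_on_lines.append(Fraction(hits, p))
    n = len(fractions_on_lines)
    mean = sum(fractions_on_lines) / n
    var = sum((x - mean) ** 2 for x in fractions_on_lines) / n
    return mean, var
```

The mean equals the density $\mu = |S|/p^2$ exactly, and the variance is at most $\mu(1-\mu)/p$, which is what makes Chebyshev's inequality effective once $p$ is large.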

For the second step, we want to show that . First consider the probability when we randomize as well. We get as long as and no element has . With probability , does not contain , and conditioned on this, is a uniformly random plane. It samples the space well, so with probability we have and hence . For the probability that no two codewords in agree on , view this as first choosing , then choosing within . The list size is a constant, polynomial in . So we just need to bound the probability that two bivariate lifted RS codewords agree on a uniformly random line. This is the same as the probability that a uniformly random line in is contained in the agreement set of two fixed bivariate lifted RS codewords, which we know has size at most . By a standard second moment bound, this probability is at most . Thus, with probability , is the unique codeword in which is consistent with on . Therefore,

As a corollary, we get the following testing algorithm.

Theorem 4.1.

For any , there is an -query algorithm which, given oracle access to a function , distinguishes between the cases where is -close to and where is -far.

Proof.

Let and let , so that and . Let be a local testing algorithm for with query complexity , which distinguishes between codewords and words that are -far from the code. The algorithm is to run the local list decoding algorithm on with error radius such that , to obtain a list of oracles . For each , we use random sampling to estimate the distance between and the function computed by to within additive error, and keep only the ones with estimated distance less than . Then, for each remaining , we run on . We accept if accepts some , otherwise we reject.

If is -close to , then it is -close to some codeword , and by the guarantee of the local list decoding algorithm there is some such that computes . Moreover, this will not be pruned by our distance estimation. Since is a codeword, this will pass the testing algorithm and so our algorithm will accept.

Now suppose is -far from , and consider any oracle output by the local list decoding algorithm and pruned by our distance estimation. The estimated distance between and the function computed by is at most , so the true distance is at most . Since is -far from any codeword, that means the function computed by is -far from any codeword, and hence will reject .

All of the statements made above were deterministic, but the testing algorithm and distance estimation are randomized procedures. However, at a price of constant blowup in query complexity, we can make their failure probabilities arbitrarily small constants, so that by a union bound the distance estimations and tests run by simultaneously succeed with large constant probability. ∎
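The distance-estimation step can be sketched as follows. This is a generic sampling estimator, not code from the paper: the sample count comes from Hoeffding's inequality, and the names `samples_needed`, `estimate_distance`, `eps` (additive error) and `eta` (failure probability) are hypothetical.

```python
import math
import random


def samples_needed(eps, eta):
    """Hoeffding bound: this many uniform samples estimate a fraction to
    within +-eps except with probability at most eta."""
    return math.ceil(math.log(2 / eta) / (2 * eps ** 2))


def estimate_distance(f, g, domain, eps, eta, rng=random):
    """Estimate the fraction of points of `domain` on which f and g differ.
    `domain` is a non-empty sequence; f and g are callables on its elements."""
    n = samples_needed(eps, eta)
    disagreements = 0
    for _ in range(n):
        x = rng.choice(domain)
        if f(x) != g(x):
            disagreements += 1
    return disagreements / n
```

The number of samples depends only on the accuracy and confidence parameters, not on the size of the domain, which is why this step adds only a constant number of queries per oracle.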

Acknowledgements

We thank the anonymous reviewers for their helpful and insightful comments.

References

  • [AS03] Sanjeev Arora and Madhu Sudan. Improved low-degree testing and its applications. Combinatorica, 23:365–426, 2003.
  • [BET10] Avraham Ben-Aroya, Klim Efremenko, and Amnon Ta-Shma. Local list decoding with a constant number of queries. In 51st Annual IEEE Symposium on Foundations of Computer Science, FOCS, pages 715–722, 2010.
  • [BGM11] Eli Ben-Sasson, Elena Grigorescu, Ghid Maatouk, Amir Shpilka, and Madhu Sudan. On sums of locally testable affine invariant properties. In APPROX-RANDOM, pages 400–411, 2011.
  • [BK09] K. Brander and S. Kopparty. List-decoding Reed-Muller over large fields upto the Johnson radius. Manuscript, 2009.
  • [BL14] Abhishek Bhowmick and Shachar Lovett. List decoding Reed-Muller codes over small fields. CoRR, abs/1407.3433, 2014.
  • [GKS13] A. Guo, S. Kopparty, and M. Sudan. New affine-invariant codes from lifting. In ITCS, pages 529–540, 2013.
  • [GKZ08] Parikshit Gopalan, Adam R. Klivans, and David Zuckerman. List-decoding Reed-Muller codes over small fields. In Proceedings of the 40th Annual ACM Symposium on Theory of Computing, STOC, pages 265–274, 2008.
  • [Gop10] Parikshit Gopalan. A Fourier-Analytic approach to Reed-Muller decoding. In 51st Annual IEEE Symposium on Foundations of Computer Science, FOCS, pages 685–694, 2010.
  • [GS99] Venkatesan Guruswami and Madhu Sudan. Improved decoding of Reed-Solomon and algebraic-geometric codes. IEEE Transactions on Information Theory, 45:1757–1767, 1999.
  • [Guo13] A. Guo. High rate locally correctable codes via lifting. Electronic Colloquium on Computational Complexity (ECCC), 20:53, 2013.
  • [HOW13] B. Hemenway, R. Ostrovsky, and M. Wootters. Local correctability of expander codes. In ICALP (1), pages 540–551, 2013.
  • [KLP68] Tadao Kasami, Shu Lin, and W. Wesley Peterson. Polynomial codes. IEEE Transactions on Information Theory, 14(6):807–814, 1968.
  • [Kop12] S. Kopparty. List-decoding multiplicity codes. In Electronic Colloquium on Computational Complexity (ECCC), TR12-044, 2012.
  • [KSY11] S. Kopparty, S. Saraf, and S. Yekhanin. High-rate codes with sublinear-time decoding. In STOC, pages 167–176, 2011.
  • [PW04] R. Pellikaan and X. Wu. List decoding of q-ary Reed-Muller codes. IEEE Transactions on Information Theory, 50(4):679–682, 2004.
  • [STV99] Madhu Sudan, Luca Trevisan, and Salil Vadhan. Pseudorandom generators without the XOR lemma. In 31st ACM Symposium on Theory of Computing (STOC), pages 537–546, 1999.
  • [Vid10] Michael Viderman. A note on high-rate locally testable codes with sublinear query complexity. Electronic Colloquium on Computational Complexity (ECCC), 17:171, 2010.

Appendix A Interpolating set for affine-invariant codes

In this section, we present, for any affine-invariant code , an explicit interpolating set , i.e. for any there exists a unique such that .

Define , , and as in Section 2.5. It is straightforward to verify that if and is an interpolating set for , then is an interpolating set for .

Theorem A.1.

Let be a nontrivial affine-invariant code with . Let be a generator, i.e. has order