We prove improved inapproximability results for hypergraph coloring using the low-degree polynomial code (aka the “short code” of Barak et al. [FOCS 2012]) and the techniques proposed by Dinur and Guruswami [FOCS 2013] to incorporate this code for inapproximability results.
In particular, we prove quasi-NP-hardness of the following problems on vertex hypergraphs:

Coloring a 2-colorable 8-uniform hypergraph with colors.

Coloring a 4-colorable 4-uniform hypergraph with colors.

Coloring a 3-colorable 3-uniform hypergraph with colors.
In each of these cases, the hardness results obtained are (at least) exponentially stronger than what was previously known for the respective cases. In fact, prior to this result, colors was the strongest quantitative bound on the number of colors ruled out by inapproximability results for colorable hypergraphs.
The fundamental bottleneck in obtaining coloring inapproximability results using the low-degree long code was a multipartite structural restriction in the PCP construction of Dinur-Guruswami. We are able to get around this restriction by simulating the multipartite structure implicitly by querying just one partition (albeit requiring 8 queries), which yields our result for 2-colorable 8-uniform hypergraphs. The result for 4-colorable 4-uniform hypergraphs is obtained via a “query doubling” method exploiting additional properties of the query test. For 3-colorable 3-uniform hypergraphs, we exploit the ternary domain to design a test with an additive (as opposed to multiplicative) noise function, and analyze its efficacy in killing high-weight Fourier coefficients via the pseudorandom properties of an associated quadratic form. The latter step involves extending the key algebraic ingredient of Dinur-Guruswami concerning testing binary Reed-Muller codes to the ternary alphabet.
1 Introduction
The last two decades have seen tremendous progress in understanding the hardness of approximating constraint satisfaction problems. Despite this progress, the status of approximate coloring of constant-colorable (hyper)graphs is not resolved and, in fact, there is an exponential (if not doubly exponential) gap between the best known approximation algorithms and inapproximability results. The current best known approximation algorithms require at least colors to color a constant-colorable (hyper)graph on vertices while the best inapproximability results only rule out at best (and in fact, in most cases, only ) colors.
Given this disparity between the positive and negative results, it is natural to ask why current inapproximability techniques get stuck at the polylogarithmic color barrier. The primary bottleneck in going past polylogarithmic colors is the use of the long code, a quintessential ingredient in almost all tight inapproximability results since it was first introduced by Bellare, Goldreich and Sudan [2]. The long code, as the name suggests, is the most redundant encoding, wherein an $n$-bit Boolean string $x$ is encoded by a $2^{2^n}$-bit string which consists of the evaluations of all Boolean functions on $n$ bits at the point $x$. It is this doubly exponential blowup of the long code which prevents coloring inapproximability from going past the polylogarithmic barrier. Recently, Barak et al. [1], while trying to understand the tightness of the Arora-Barak-Steurer algorithm for unique games, introduced the short code, also called the low-degree long code [4]. The low-degree long code is a puncturing of the long code in the sense that it contains only the evaluations of low-degree functions (as opposed to all functions). Barak et al. [1] introduced the low-degree long code to prove exponentially stronger integrality gaps for Unique Games, and to construct small set expanders whose Laplacians have many small eigenvalues.
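To make the size gap concrete, here is a minimal Python sketch (an illustration, not the construction used in the reductions) that encodes a short bit string with both codes over $\mathbb{F}_2$. The function names `long_code` and `low_degree_long_code` are hypothetical, and the sketch assumes that a degree-$d$ function is a multilinear polynomial of degree at most $d$, with one codeword position per coefficient vector.

```python
from itertools import product

def long_code(x):
    """Long code of x in {0,1}^n: the value at x of every Boolean function on n bits."""
    n = len(x)
    points = list(product((0, 1), repeat=n))          # all 2^n inputs
    idx = points.index(tuple(x))
    # A Boolean function is a truth table of length 2^n, so there are 2^(2^n) positions.
    return [table[idx] for table in product((0, 1), repeat=len(points))]

def low_degree_long_code(x, d):
    """Punctured code: keep only the polynomials of degree <= d over F_2."""
    n = len(x)
    # Multilinear monomials of degree <= d, given by indicator vectors of their variables.
    monomials = [s for s in product((0, 1), repeat=n) if sum(s) <= d]

    def mono_value(s):
        # Value of the monomial at x (the empty monomial evaluates to 1).
        return int(all(x[i] for i in range(n) if s[i]))

    values = [mono_value(s) for s in monomials]
    # One position per coefficient vector, i.e. per polynomial of degree <= d.
    return [sum(c * v for c, v in zip(coeffs, values)) % 2
            for coeffs in product((0, 1), repeat=len(monomials))]
```

For a 3-bit string the long code has $2^{2^3} = 256$ positions, while the degree-1 code has only $2^{4} = 16$; taking $d = n$ recovers a code of the full length.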
Since the low-degree long code is a derandomization of the long code, one might hope to use it as a more size-efficient surrogate for the long code in inapproximability results. In fact, Barak et al. [1] used it to obtain a more efficient version of the KKMO alphabet reduction [12] for Unique Games. However, using the low-degree long code towards improved reductions from Label Cover posed some challenges related to folding, and incorporating noise without giving up perfect completeness (which is crucial for results on coloring). Recently, Dinur and Guruswami [4] introduced a very elegant set of techniques to adapt the long code based inapproximability results to low-degree long codes. Using these techniques, they proved (1) improved inapproximability results for gap-4SAT for (long code based reductions show for ) and (2) hardness for a variant of approximate hypergraph coloring, with a gap of 2 and number of colors (where is the number of vertices). It is to be noted that the latter is the first result to go beyond the logarithmic barrier for a coloring-type problem. However, the Dinur-Guruswami [4] results do not extend to standard (hyper)graph coloring hardness due to a multipartite structural bottleneck in the PCP construction, which we elaborate below.
As mentioned earlier, the two main contributions of Dinur-Guruswami [4] are (1) a folding mechanism over the low-degree long code and (2) noise in the low-degree polynomials. The results of Bhattacharyya et al. [3] and Barak et al. [1] suggest that the product of linearly independent affine functions suffices to work as noise for the low-degree long code setting (with degree = ) in the sense that it attenuates the contribution of large-weight Fourier coefficients. However, this works only for PCP tests with imperfect completeness. Since approximate coloring results require perfect completeness, Dinur and Guruswami [4], inspired by the above result, develop a noise function which is the product of two random low-degree polynomials such that the sum of the degrees is at most . This necessitates restricting certain functions in the PCP test to be of smaller degree, which in turn requires the PCP tests to query two types of tables: one a low-degree long code of degree and another a low-degree long code of smaller degree. Though the latter table is a part of the former, a separate table is needed since otherwise the queries will be biased to the small-degree portion of the low-degree long code. This multipartite structure is what precludes them from extending their result to standard coloring results. (Clearly, if the query of the PCP tests straddles two tables, then the associated hypergraph is trivially 2-colorable.)
1.1 Hypergraph coloring results
In this work, we show how this multipartite structural restriction can be overcome, thus yielding (standard) coloring inapproximability results. The first of our results extends the Dinur-Guruswami [4] result for a variant of 6-uniform hypergraph coloring to a standard hypergraph coloring result, albeit of larger uniformity, namely 8.
Theorem 1.1 (2-colorable 8-uniform hypergraphs).
Assuming , there is no polynomial time algorithm which, when given as input an 8-uniform hypergraph on vertices, can distinguish between the following:

the hypergraph is 2-colorable,

the hypergraph has no independent set of size .
This result is obtained using the framework of Dinur-Guruswami [4] by showing that the two additional queries can be used to simulate queries into the smaller table via queries into the larger table.
We note that prior to this result, colors was the strongest quantitative bound on hardness for hypergraph coloring: Khot obtained such a result for coloring 7-colorable 4-uniform hypergraphs [10] while Dinur and Guruswami [4] obtained a similar (but incomparable) result for 2-colorable 6-uniform hypergraphs, both using the long code.
We observe that the 8-query PCP test used in the above inapproximability result has a stronger completeness guarantee than required to prove the above result: the 8 queries of the Not-All-Equal (NAE) PCP test, say in the completeness case satisfy
which is stronger than the required
Furthermore, for each , the queries and appear in the same table. This lets us perform the following “doubling of queries”: each location is now indexed by a pair of queries, e.g., and is expected to return 2 bits which are the answers to the two queries respectively. The stronger completeness property yields a 4-query PCP test over an alphabet of size 4 with the completeness property,
which suffices for the completeness for proving inapproximability results for 4-colorable 4-uniform hypergraphs. We show that the soundness analysis also carries over to yield the following hardness for 4-colorable 4-uniform hypergraphs.
Theorem 1.2 (4-colorable 4-uniform hypergraphs).
Assuming , there is no polynomial time algorithm which, when given as input a 4-uniform hypergraph on vertices, can distinguish between the following:

the hypergraph is 4-colorable,

the hypergraph has no independent set of size .
We remark that the doubling method mentioned above, when used in the vanilla long code setting (as opposed to the low-degree long code setting), already yields the following inapproximability: it is quasi-NP-hard to color a 4-colorable 4-uniform hypergraph with colors. This result already improves upon the above-mentioned result of Khot [10] for 7-colorable 4-uniform hypergraphs. Another feature of the doubling method is that although the underlying alphabet is of size 4, namely , it suffices for the soundness analysis to perform standard Fourier analysis over .
In the language of covering complexity, where the covering number of a CSP is the minimal number of assignments to the vertices so that each hyperedge is covered by at least one assignment, (the proof of) Theorem 1.2 demonstrates a Boolean 4CSP for which it is quasi-NP-hard to distinguish between covering number of 2 vs. . The previous best result for a Boolean 4CSP was 2 vs. , due to Dinur and Kol [6].
We then ask if we can prove coloring inapproximability for even smaller uniformity, i.e., 2 and 3 (graphs and 3-uniform hypergraphs respectively). We show that we can use a different noise function over $\mathbb{F}_3$ to obtain the following inapproximability result for 3-colorable 3-uniform hypergraphs.
Theorem 1.3 (3-colorable 3-uniform hypergraphs).
Assuming , there is no polynomial time algorithm which, when given as input a 3-uniform hypergraph on vertices, can distinguish between the following:

the hypergraph is 3-colorable,

the hypergraph has no independent set of size .
Prior to this result, the best inapproximability results for O(1)-colorable 3-uniform hypergraphs were as follows: Khot [11] showed that it is quasi-NP-hard to color a 3-colorable 3-uniform hypergraph with colors and Dinur, Regev and Smyth [7] showed that it is quasi-NP-hard to color a 2-colorable 3-uniform hypergraph with colors (observe that is exponentially larger than ). For 2-colorable 3-uniform hypergraphs, the result of Dinur et al. [7] only rules out colorability by , while a recent result due to Khot and Saket [13] shows that it is hard to find a sized independent set in a given vertex 2-colorable 3-uniform hypergraph assuming the $d$-to-$1$ games conjecture. Our improved inapproximability result is obtained by adapting Khot’s proof to the low-degree long code using the new noise function over $\mathbb{F}_3$. We remark that this result is not as strong as the previous two ( instead of ) since, for 3-uniform hypergraphs, the starting point is a multilayered smooth label cover instance instead of just label cover, which causes a blowup in size and a corresponding deterioration in the parameters.
1.2 Low-degree long code analysis via Reed-Muller testing
One of the key contributions of Barak et al. [1] was the discovery of a connection between Reed-Muller testing and the analysis of the low-degree long code. In particular, they showed the following. Let be the set of degree polynomials on variables over $\mathbb{F}_2$. For functions , let . Barak et al. observed that if is far from the set of degree polynomials, then one can bound the expectation for a random low-weight using a powerful result on Reed-Muller testing over $\mathbb{F}_2$ due to Bhattacharyya et al. [3]. This demonstrates that the noise function attenuates the contribution of high-order Fourier coefficients and is thus useful in the low-degree long code analysis. However, this noise has imperfect completeness and Dinur-Guruswami had to prove a new result on Reed-Muller testing over $\mathbb{F}_2$ to construct a noise function that allows for perfect completeness. They showed that if is far from , then was doubly exponentially small in (see Theorem 2.12 for a formal statement). This allowed them to extend some of the long code based inapproximability results with perfect completeness to the low-degree long code setting. Tests based on the above property need to access functions of different degrees (e.g., in the above discussion) and this results in a multipartite structure in the low-degree long code tables of [4]. The results for 2-colorable 8-uniform hypergraphs and 4-colorable 4-uniform hypergraphs are obtained using the above result of [4].
For the case of 3-uniform 3-colorable hypergraphs, we observe that if we extend the alphabet to ternary (i.e., $\mathbb{F}_3$ instead of $\mathbb{F}_2$), we can design a noise function that both has perfect completeness and does not result in a multipartite structural restriction. Let now denote the set of degree polynomials on variables over $\mathbb{F}_3$. We show that if is far from , then is doubly exponentially small in . This is proved by showing the following pseudorandom property of the associated quadratic form defined as where is the column vector of evaluations of all degree monomials at the point . If the distance of from polynomials of degree , denoted by is at least , then the rank of the matrix is exponential in and is otherwise equal to the distance . This rank bound is proved along the lines of [4] using the Reed-Muller tester analysis of Haramaty, Shpilka and Sudan [9] over general fields instead of the Bhattacharyya et al. [3] analysis over $\mathbb{F}_2$.
Organization
2 Preliminaries
2.1 Label cover
All our reductions start from an appropriate instance of the label cover problem, bipartite or multipartite. A bipartite label cover instance consists of a bipartite graph , label sets , and a set of projection constraints . We consider label cover instances obtained from instances in the following natural manner.
Definition 2.1 (repeated label cover).
Let be a instance with as the set of variables and the set of clauses. The repeated bipartite label cover instance is specified by:

A graph , where .

.

There is an edge if the tuple of variables can be obtained from the tuple of clauses by replacing each clause by a variable in it.

The constraint is simply the projection of the assignments on variables in all the clauses in to the assignments on the variables in .

For each there is a set of functions such that iff the assignment satisfies the th clause in . Note that depends only on the variables in the th clause.
A labeling satisfies an edge iff and satisfies all the clauses in . Let be the maximal fraction of constraints that can be satisfied by any labeling.
The following theorem is obtained by applying Raz’s parallel repetition theorem [15] with repetitions on hard instances of  where each variable occurs the same number of times [8].
Theorem 2.2.
There is an algorithm which on input a instance and outputs an repeated label cover instance in time with the following properties.

If , then .

If , then for some universal constant .
Moreover, the underlying graph is both left and right regular.
Multilayered smooth label cover:
For our hardness results for 3-uniform 3-colorable hypergraphs, we need a multipartite version of label cover, satisfying a smoothness condition.
Definition 2.3 (smoothness).
Let be a bipartite label cover instance specified by . Then is smooth iff for every and two distinct labels
where is a random neighbour of .
Definition 2.4 (repeated layered smooth label cover).
Let and be a instance with as the set of variables and the set of clauses. The repeated layered smooth label cover instance is specified by:

An partite graph with vertex sets . Elements of are tuples of the form where is a set of clauses and is a set of variables.

where which corresponds to all Boolean assignments to the clauses and variables corresponding to a vertex in layer .

For , denotes the set of edges between layers and . For , there is an edge iff can be obtained from by replacing some clauses in with variables occurring in the clauses respectively.

The constraint is the projection of assignments for clauses and variables in to that of .

For each , , there are functions , one for each clause in such that iff satisfies the clause . This function only depends on the coordinates in .
Given a labeling for all the vertices, an edge is satisfied iff satisfies all the clauses in , satisfies all the clauses in and . Let be the maximum fraction of edges in that can be satisfied by any labeling.
The following theorem was proved by Dinur et al. [5] in the context of hypergraph vertex cover inapproximability (also see [7]).
Theorem 2.5.
There is an algorithm which on input a instance and outputs a repeated layered smooth label cover instance in time with the following properties.

, the bipartite label cover instance on is smooth.

For , any layers , any such that , there exists distinct and such that the fraction of edges between and relative to is at least .

If , then there is a labeling for that satisfies all the constraints.

If , then
2.2 Low-degree long code
Let $\mathbb{F}_p$ be the finite field of size $p$ where $p$ is a prime. The results in this section apply when . The choice of $p$ will be clear from context and hence the dependence on $p$ of the quantities defined will be omitted. Let be the set of degree polynomials on variables over $\mathbb{F}_p$. Let . Note that is the set of all functions from to . is a vector space of dimension and is its subspace of dimension . The Hamming distance between and , denoted by , is the number of inputs on which and differ. When , . We say is far from if and is close to otherwise. Given , the dot product between them is defined as . For a subspace , the dual subspace is defined as . The following lemma relating dual spaces is well known.
Lemma 2.6.
.
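The statement of the lemma is not legible above. As a sanity check, the following Python sketch brute-forces the standard Reed-Muller duality, under the assumption that this is what the lemma asserts: the dual of the space of polynomials of degree at most $d$ on $n$ variables over $\mathbb{F}_p$ is the space of polynomials of degree at most $(p-1)n - d - 1$. The helper names are hypothetical.

```python
from itertools import product

def monomials(n, p, d):
    """Exponent vectors with individual degrees <= p-1 and total degree <= d."""
    return [e for e in product(range(p), repeat=n) if sum(e) <= d]

def evaluate(e, x, p):
    """Value of the monomial x^e at the point x over F_p (with 0^0 = 1)."""
    v = 1
    for xi, ei in zip(x, e):
        v = (v * pow(xi, ei, p)) % p
    return v

def duality_holds(n, p, d):
    points = list(product(range(p), repeat=n))
    low = monomials(n, p, d)
    high = monomials(n, p, (p - 1) * n - d - 1)
    # The two spaces have complementary dimensions inside the p^n-dimensional space.
    if len(low) + len(high) != p ** n:
        return False
    # Every monomial of degree <= d is orthogonal to every monomial of the claimed dual.
    return all(
        sum(evaluate(a, x, p) * evaluate(b, x, p) for x in points) % p == 0
        for a in low for b in high
    )
```

Checking monomials suffices because both spaces are spanned by them and the dot product is bilinear.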
We need the following Schwartz-Zippel-like lemma for degree polynomials.
Lemma 2.7 (Schwartz-Zippel lemma [9, Lemma 3.2]).
Let be a nonzero polynomial of degree at most with individual degrees at most . Then .
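The bound itself is missing from the statement above. The Python sketch below assumes it is the usual minimum-distance bound for generalized Reed-Muller codes, namely that a nonzero polynomial of degree at most $d$ over $\mathbb{F}_p$ is nonzero on at least a $p^{-d/(p-1)}$ fraction of points, and computes the true minimum support by exhaustive search for tiny parameters. The function name `min_support` is hypothetical.

```python
from itertools import product

def min_support(n, p, d):
    """Smallest number of nonzero values taken by a nonzero polynomial of degree <= d over F_p."""
    points = list(product(range(p), repeat=n))
    monos = [e for e in product(range(p), repeat=n) if sum(e) <= d]
    # Evaluation vector of each monomial over all points.
    table = []
    for e in monos:
        row = []
        for x in points:
            v = 1
            for xi, ei in zip(x, e):
                v = v * pow(xi, ei, p) % p
            row.append(v)
        table.append(row)
    best = len(points)
    for coeffs in product(range(p), repeat=len(monos)):
        if not any(coeffs):
            continue  # skip the zero polynomial
        support = sum(
            1 for j in range(len(points))
            if sum(c * row[j] for c, row in zip(coeffs, table)) % p != 0
        )
        best = min(best, support)
    return best
```

For $n = 2$, $p = 3$, $d = 2$ the search returns 3, which matches $p^{n - d/(p-1)} = 3$. The search is exponential in the number of monomials, so it is only usable for very small $n$ and $d$.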
We now define the low-degree long code (introduced as the short code by Barak et al. [1] in the case).
Definition 2.8 (low-degree long code).
For , the degree long code for is a function defined as
Note that for , this matches with the definition of the original long code over the alphabet .
Definition 2.9 (characters).
A character of is a function such that
The following lemma lists the basic properties of characters.
Lemma 2.10.
Let be the th roots of unity and for , .

The characters of are .

For any , if and only if .

For , is the constant function.

such that and (i.e., the constant function is (one of) the closest functions to in ). We call such a a minimum support function for the coset .

Characters form an orthonormal basis for the vector space of functions from to , under the inner product

Any function can be uniquely decomposed as
where is the set of minimum support functions, one for each of the cosets in , with ties broken arbitrarily.

Parseval’s identity: For any function , In particular, if , .
The following lemma relates characters over different domains related by coordinate projections.
Lemma 2.11.
Let and be a (coordinate) projection i.e., there exist indices such that . Then for ,
where .
Proof.
Dinur and Guruswami [4] proved the following theorem about Reed-Muller codes over $\mathbb{F}_2$ using the testing result of Bhattacharyya et al. [3].
Theorem 2.12 ([4, Theorem 1]).
Let be a multiple of and . If is far from , then
2.3 Folding over satisfying assignments
Lemma 2.13.
Let , be a set of points in and an arbitrary function. Then there exists a polynomial of degree at most such that agrees with on all points in .
Proof.
For any set , a function is said to be folded over a subspace if is constant over cosets of in .
Fact 2.14.
Given a function there is a unique function that is folded over such that for . We call the lift of .
Given , let
The following lemma shows that if a function is folded over , then it cannot have weight on small support characters that are nonzero on (this is a generalization of the corresponding lemma in [4] to arbitrary fields).
Lemma 2.15.
Let be such that , and suppose there exists with for some . If is folded over , then .
Proof.
Construct a polynomial which is zero at all points in the support of except at . From Lemma 2.13, it is possible to construct such a polynomial of degree at most . Then we have that and . Now
3 Correlation with a random square
In this section, we analyze the quantity
where is chosen uniformly at random and is a fixed function having distance exactly from .
Throughout this section, we work over the field $\mathbb{F}_3$. For , let and denote the monomial . Over $\mathbb{F}_3$, the individual degrees are at most $2$ (since $x^3 = x$ for every $x \in \mathbb{F}_3$). Hence, we assume without loss of generality that the coefficient vector . In this notation, where are chosen independently and uniformly at random from $\mathbb{F}_3$. For , let be the column vector of evaluations of all degree monomials at , i.e., . Then where is now thought of as the column vector and hence, .
We are thus interested in the quadratic form represented by the matrix . Observe that all belonging to the same coset in have the same value for and the matrix . Hence, by Lemma 2.10, we may assume without loss of generality that satisfies . The following lemma (an easy consequence of [14, Theorem 6.21]) shows that it suffices to understand the rank of .
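The formulas in this passage are not legible, so the following Python sketch rests on an assumed reading: a polynomial with coefficient vector $a$ evaluates at $x$ to $a^{T} e_x$, where $e_x$ lists the values at $x$ of all monomials of degree at most $d$, and the matrix of the quadratic form is $A = \sum_x \beta(x)\, e_x e_x^{T}$. Under that reading, the dot product of $\beta$ with the square of the polynomial equals $a^{T} A a$ over $\mathbb{F}_3$, and the sketch checks this on random inputs. All names are hypothetical.

```python
import random
from itertools import product

P = 3  # field size; the section works over F_3

def monomials(n, d):
    """Exponent vectors with individual degrees <= 2 and total degree <= d."""
    return [e for e in product(range(P), repeat=n) if sum(e) <= d]

def e_vec(x, monos):
    """Column vector e_x: values of all listed monomials at the point x."""
    out = []
    for e in monos:
        v = 1
        for xi, ei in zip(x, e):
            v = v * pow(xi, ei, P) % P
        out.append(v)
    return out

def build_A(beta, points, monos):
    """A = sum over x of beta(x) * e_x e_x^T, with entries reduced mod 3."""
    m = len(monos)
    A = [[0] * m for _ in range(m)]
    for x, b in zip(points, beta):
        ex = e_vec(x, monos)
        for i in range(m):
            for j in range(m):
                A[i][j] = (A[i][j] + b * ex[i] * ex[j]) % P
    return A

def check_identity(n, d, trials=50, seed=0):
    rng = random.Random(seed)
    points = list(product(range(P), repeat=n))
    monos = monomials(n, d)
    for _ in range(trials):
        beta = [rng.randrange(P) for _ in points]
        a = [rng.randrange(P) for _ in monos]
        A = build_A(beta, points, monos)
        # Left side: <beta, q^2> where q(x) = a^T e_x.
        lhs = sum(b * sum(ai * ei for ai, ei in zip(a, e_vec(x, monos))) ** 2
                  for x, b in zip(points, beta)) % P
        # Right side: the quadratic form a^T A a.
        rhs = sum(a[i] * A[i][j] * a[j]
                  for i in range(len(a)) for j in range(len(a))) % P
        if lhs != rhs:
            return False
    return True
```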
Lemma 3.1.
Let be a , symmetric matrix with entries from . The statistical distance of the random variable from uniform is .
In the next sequence of lemmas, we relate to . In particular, we show that is equal to if and is exponential in otherwise. Recall that over $\mathbb{F}_3$, is the set of all functions from to and .
Lemma 3.2.
.
Proof.
By assumption, satisfies . The lemma follows from the fact that are rank one matrices and . ∎
Lemma 3.3.
If , then .
Proof.
By assumption, satisfies and . Since and any nonzero polynomial with degree has support at least (Lemma 2.7), any vectors are linearly independent. In particular, the vectors for in are linearly independent. Consider any nonzero in the kernel of the matrix . The linear independence of ’s gives that for all . Hence, the kernel of resides in a codimensional space which implies that . ∎
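The rank statements can be explored numerically. The Python sketch below assumes that the matrix in question is $A = \sum_x \beta(x)\, e_x e_x^{T}$ over $\mathbb{F}_3$, with $e_x$ the vector of values at $x$ of all monomials of degree at most $d$, and computes its rank by Gaussian elimination modulo 3. The names `rank_mod_p` and `rank_of_A` are hypothetical.

```python
from itertools import product

P = 3  # field size

def e_vec(x, d):
    """Values at x of all monomials with individual degrees <= 2 and total degree <= d."""
    monos = [e for e in product(range(P), repeat=len(x)) if sum(e) <= d]
    vec = []
    for e in monos:
        v = 1
        for xi, ei in zip(x, e):
            v = v * pow(xi, ei, P) % P
        vec.append(v)
    return vec

def rank_mod_p(M):
    """Rank of an integer matrix over F_3 via Gauss-Jordan elimination."""
    M = [row[:] for row in M]
    rank, rows, cols = 0, len(M), len(M[0])
    for c in range(cols):
        piv = next((r for r in range(rank, rows) if M[r][c] % P), None)
        if piv is None:
            continue
        M[rank], M[piv] = M[piv], M[rank]
        inv = pow(M[rank][c], -1, P)  # modular inverse, needs Python 3.8+
        M[rank] = [v * inv % P for v in M[rank]]
        for r in range(rows):
            if r != rank and M[r][c] % P:
                f = M[r][c]
                M[r] = [(v - f * w) % P for v, w in zip(M[r], M[rank])]
        rank += 1
    return rank

def rank_of_A(beta, d):
    """beta maps points of F_3^n to field values; returns the rank of sum_x beta(x) e_x e_x^T."""
    vecs = {x: e_vec(x, d) for x in beta}
    m = len(next(iter(vecs.values())))
    A = [[sum(b * vecs[x][i] * vecs[x][j] for x, b in beta.items()) % P
          for j in range(m)] for i in range(m)]
    return rank_mod_p(A)
```

With $n = 2$ and $d = 2$, a function supported on two or four points gives rank equal to the support size. With $d = 1$, the all-ones function on $\mathbb{F}_3^2$ gives the zero matrix, which shows that the rank tracks the distance to the relevant dual space and not the raw support size.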
We conjecture that Lemma 3.3 holds for larger values of , but for our purposes we only need a lower bound on the rank when .
Lemma 3.4.
There exists a constant such that if and then .
Proof.
The proof of this lemma is similar to the proof of [4, Theorems 15,17] for the case and we follow it step by step. Define .
Claim 3.5.
.
Proof.
The matrix satisfies that , for all . Using this description of , we obtain the following description of .
Thus to prove Lemma 3.4, it suffices to show that Towards this end, we define
(3.1) 
In terms of , Lemma 3.4 now reduces to showing that . We obtain this lower bound by recursively bounding this quantity . The following serves as the base case of the recursion.
Claim 3.6.
For , , and for , .
Proof.
Let be the polynomial which attains the minimum in (3.1). The first part of the claim follows from the fact that if then .
Now for the second part. Since , there is a monomial with such that
If , and we are done. Otherwise, consider such that coordinatewise and . Suppose then which is a contradiction. Hence, and the second part of the claim follows. ∎
For the induction step, we need the following result from Haramaty, Shpilka and Sudan [9].
Claim 3.7 ([9, Theorems 4.16, 1.7]).
There exists a constant such that if , where is far from , then there exists nonzero such that are far from the restriction of to affine hyperplanes.
See Appendix A for a proof of Claim 3.7 from Theorems 4.16 and 1.7 of [9].
Claim 3.8.
If and , then
Proof.
From Claim 3.7, we get that there exists nonzero such that for all is far from . By applying a change of basis, we can assume that .
Let and where do not depend on . Note that are far from . Expanding the product , we have
Comparing terms, we observe that iff the following are true: