We present a new approach to constructing unconditional pseudorandom generators against classes of functions that involve computing a linear function of the inputs. We give an explicit construction of a pseudorandom generator that fools the discrete Fourier transforms of linear functions with seed-length that is nearly logarithmic (up to polyloglog factors) in the input size and the desired error parameter. Our result gives a single pseudorandom generator that fools several important classes of tests computable in logspace that have been considered in the literature, including halfspaces (over general domains), modular tests and combinatorial shapes. For all these classes, our generator is the first that achieves near logarithmic seed-length in both the input length and the error parameter. Getting such a seed-length is a natural challenge in its own right, which needs to be overcome in order to derandomize RL, a central question in complexity theory.
Our construction combines ideas from a large body of prior work, ranging from a classical construction of [NN93] to the recent gradually increasing independence paradigm of [KMN11, CRSW13, GMR12], while also introducing some novel analytic machinery which might find other applications.
A central goal of computational complexity is to understand the power that randomness adds to efficient computation. The main questions in this area are whether BPP = P and RL = L, which respectively assert that randomness can be eliminated from efficient computation at the price of a polynomial slowdown in time and a constant blowup in space. It is known that proving BPP = P will imply strong circuit lower bounds that seem out of reach of current techniques. In contrast, proving RL = L could well be within reach. Indeed, bounded-space algorithms are a natural computational model for which we know how to construct strong pseudorandom generators (PRGs) unconditionally.
Let RL denote the class of randomized algorithms with $O(\log n)$ work space which can access the random bits in a read-once pre-specified order. Nisan [Nis92] devised a PRG of seed length $O(\log^2 n)$ that fools RL with polynomially small error. This generator was subsequently used by Nisan [Nis94] to show that RL $\subseteq$ SC and by Saks and Zhou [SZ99] to prove that RL can be simulated in space $O(\log^{3/2} n)$. Constructing PRGs with the optimal $O(\log n)$ seed length for this class and showing that RL = L is arguably the outstanding open problem in derandomization (which might not require a breakthrough in lower bounds). Despite much progress in this area [INW94, NZ96, RR99, Rei08, RTV06, BRRY14, BV10, KNP11, De11, GMR12], there are few cases where we can improve on Nisan’s twenty-year-old bound of $O(\log^2 n)$ [Nis92].
1.1 Fourier shapes
A conceptual contribution of this work is to propose a class of functions in RL which we call Fourier shapes that unify and generalize the problem of fooling many natural classes of test functions that are computable in logspace and involve computing linear combinations of (functions of) the input variables. In the following, let $\mathbb{C}_1 = \{z \in \mathbb{C} : |z| \le 1\}$ be the unit disk in the complex plane.
An $(m,n)$-Fourier shape $f: [m]^n \to \mathbb{C}_1$ is a function of the form $f(x_1, \ldots, x_n) = \prod_{j=1}^{n} f_j(x_j)$ where each $f_j: [m] \to \mathbb{C}_1$. We refer to $m$ and $n$ as the alphabet size and the dimension of the Fourier shape respectively.
Clearly, $(m,n)$-Fourier shapes can be computed with $O(\log n)$ workspace, as long as the bit-complexity of $f_j(x_j)$ is logarithmic for each $j$, a condition that can be enforced without loss of generality. Since our goal is to fool functions $f: \{0,1\}^n \to \{0,1\}$, it might be unclear why we should consider complex-valued functions (or larger domains). The answer comes from the discrete Fourier transform, which maps integer random variables to $\mathbb{C}_1$. Concretely, consider a Boolean function $f: \{0,1\}^n \to \{0,1\}$ of the form $f(x) = g(\sum_j w_j x_j)$ where $x \in \{0,1\}^n$, $w_j \in \mathbb{Z}$, and $g: \mathbb{Z} \to \{0,1\}$ is a simple function like a threshold or a mod function. To fool such a function $f$, it suffices to fool the linear function $\sum_j w_j x_j$. A natural way to establish the closeness of distributions on the integers is via the discrete Fourier transform. The discrete Fourier transform of $Z = \sum_j w_j x_j$ at $\alpha \in [0,1]$ is given by
$$\mathbb{E}[\exp(2\pi i \alpha Z)] = \mathbb{E}\Big[\prod_{j=1}^{n} \exp(2\pi i \alpha w_j x_j)\Big],$$
which is a Fourier shape.
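To see concretely why the Fourier transform of a linear form is a Fourier shape, the following minimal sketch (not taken from the paper) compares a brute-force evaluation of $\mathbb{E}[\exp(2\pi i \alpha \sum_j w_j x_j)]$ over uniform bits with the product of the per-coordinate expectations. The function names, the weights and the value of $\alpha$ are illustrative assumptions.

```python
import cmath
import itertools


def dft_brute_force(weights, alpha):
    """E[exp(2*pi*i*alpha*sum_j w_j x_j)] over uniform x in {0,1}^n, by enumeration."""
    n = len(weights)
    total = 0j
    for x in itertools.product((0, 1), repeat=n):
        z = sum(w * b for w, b in zip(weights, x))
        total += cmath.exp(2j * cmath.pi * alpha * z)
    return total / 2 ** n


def dft_as_product(weights, alpha):
    """Same quantity via independence: the product over j of E[f_j(x_j)], where
    f_j(b) = exp(2*pi*i*alpha*w_j*b) takes values in the unit disk."""
    result = 1 + 0j
    for w in weights:
        result *= (1 + cmath.exp(2j * cmath.pi * alpha * w)) / 2
    return result


weights = [3, -1, 4, 1, -5]  # hypothetical integer weights
alpha = 0.37                 # hypothetical evaluation point in [0, 1]
print(abs(dft_brute_force(weights, alpha) - dft_as_product(weights, alpha)))
```

The two values agree up to floating-point error, which is exactly the factorization used above; the brute-force version is only feasible for very small dimension.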
Allowing a non-binary alphabet not only allows us to capture more general classes of functions (such as combinatorial shapes), it also makes the class more robust. For instance, given a Fourier shape $f$ on $n$ input bits, if we consider the input bits in blocks of length $\log m$, then the resulting function is still a Fourier shape over a larger input domain (in dimension $n/\log m$). This allows certain compositions of PRGs and simplifies our construction even for the case $m = 2$.
PRGs for Fourier shapes and their applications.
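A PRG is a function $G: \{0,1\}^r \to [m]^n$. We refer to $r$ as the seed-length of the generator. We say $G$ is explicit if the output of $G$ can be computed in time $\mathrm{poly}(n)$.
A PRG $G$ fools a class of functions $\mathcal{F}$ with error $\varepsilon$ (or $\varepsilon$-fools $\mathcal{F}$) if for every $f \in \mathcal{F}$,
$$\left| \mathbb{E}_{x \in_u [m]^n}[f(x)] - \mathbb{E}_{y \in_u \{0,1\}^r}[f(G(y))] \right| \le \varepsilon.$$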
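For very small parameters, the notion of fooling can be checked by exhaustive enumeration. The sketch below assumes the standard definition (the absolute difference between the expectation of a test under truly uniform inputs and under the generator's output); the toy generator and the two tests are hypothetical and chosen only to show one test that is fooled and one that is not.

```python
import itertools


def fooling_error(test, generator, seed_length, alphabet_size, dimension):
    """|E_x[test(x)] - E_y[test(generator(y))]| computed by exhaustive enumeration.

    x ranges over all tuples in {0..alphabet_size-1}^dimension and y over all
    bit tuples of length seed_length."""
    inputs = list(itertools.product(range(alphabet_size), repeat=dimension))
    true_mean = sum(test(x) for x in inputs) / len(inputs)
    seeds = list(itertools.product((0, 1), repeat=seed_length))
    pseudo_mean = sum(test(generator(y)) for y in seeds) / len(seeds)
    return abs(true_mean - pseudo_mean)


def toy_generator(seed):
    """Hypothetical generator: 2 seed bits -> 3 output bits (the third repeats the first)."""
    a, b = seed
    return (a, b, a)


def agree(x):
    """Test that detects the repeated bit."""
    return 1 if x[0] == x[2] else 0


def first_two(x):
    """The Fourier shape (-1)^(x_0 + x_1), which only reads the first two coordinates."""
    return (-1) ** (x[0] + x[1])


print(fooling_error(agree, toy_generator, 2, 2, 3))      # large error: not fooled
print(fooling_error(first_two, toy_generator, 2, 2, 3))  # zero error: fooled
```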
We motivate the problem of constructing PRGs for Fourier shapes by discussing how they capture a variety of well-studied classes like halfspaces (over general domains), combinatorial rectangles, modular tests and combinatorial shapes.
PRGs for halfspaces. Halfspaces are functions $h: \{\pm 1\}^n \to \{\pm 1\}$ that can be represented as
$$h(x) = \mathrm{sgn}(\langle w, x \rangle - \theta)$$
for some weight vector $w \in \mathbb{R}^n$ and threshold $\theta \in \mathbb{R}$, where $\mathrm{sgn}(a) = 1$ if $a \ge 0$ and $-1$ otherwise. Halfspaces are of central importance in computational complexity, learning theory and social choice. Lower bounds for halfspaces are trivial, whereas the problem of proving lower bounds against depth-$2$ $\mathsf{TC}^0$ or halfspaces of halfspaces is a frontier open problem in computational complexity. The problem of constructing explicit PRGs that can fool halfspaces is a natural challenge that has seen a lot of exciting progress recently [DGJ09, MZ13, Kan11b, Kan14, KM15]. The best known PRG construction for halfspaces is that of Meka and Zuckerman [MZ13] who gave a PRG with seed-length $O(\log n + \log^2(1/\varepsilon))$, which is $O(\log^2 n)$ for polynomially small error. They also showed that PRGs against RL with inverse polynomial error can be used to fool halfspaces, and thus constructing better PRGs for halfspaces is a necessary step towards progress for bounded-space algorithms. However, even for special cases of halfspaces (such as derandomizing the Chernoff bound), beating seed-length $O(\log^2 n)$ has proved difficult.
We show that a PRG for -Fourier shapes with error also fools halfspaces with error . In particular, PRGs fooling Fourier shapes with polynomially small error also fool halfspaces with small error.
PRGs for generalized halfspaces. PRGs for -Fourier shapes give us PRGs for halfspaces not just for the uniform distribution over the hypercube, but for a large class of distributions that have been studied in the literature. We can derive these results in a unified manner by considering the class of generalized halfspaces.
A generalized halfspace over is a function that can be represented as
where are arbitrary functions for and .
Derandomizing the Chernoff-Hoeffding bound. A consequence of fooling generalized halfspaces is to derandomize Chernoff-Hoeffding type bounds for sums of independent random variables which are ubiquitous in the analysis of randomized algorithms. We state our result in the language of “randomness-efficient samplers” (cf. [Zuc97]). Let be independent random variables over a domain and let be arbitrary bounded functions. The classical Chernoff-Hoeffding bounds [Hoe63] say that
There has been a long line of work on showing sharp tail bounds for pseudorandom sequences starting from [SSS95] who showed that similar tail bounds hold under limited independence. But all previous constructions for the polynomially small error regime required seed-length . PRGs for generalized halfspaces give Chernoff-Hoeffding tail bounds with polynomially small error, with seed-length .
PRGs for modular tests. An important class of functions in is that of modular tests, i.e., functions of the form , where , for , coefficients and . Such a test is computable in as long as . The case when corresponds to small-bias spaces, for which optimal constructions were first given in the seminal work of Naor and Naor [NN93]. The case of arbitrary was considered by [LRTV09] (see also [MZ09]); their generator gives seed-length . Thus for , their generator does not improve on Nisan’s generator even for constant error . PRGs fooling -Fourier shapes with polynomially small error also fool modular tests.
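One way to see the connection between modular tests and Fourier shapes is the identity $\mathbb{1}[s \equiv b \pmod M] = \frac{1}{M}\sum_{k=0}^{M-1} \exp(2\pi i k (s-b)/M)$, which writes the acceptance probability of a modular test as an average of $M$ Fourier-shape expectations (each multiplied by a unit-modulus phase). The sketch below checks this numerically for uniform bits; the symbols $M$ (modulus), $a_j$ (coefficients) and $b$ (target residue) are assumed names, and the code is an illustration rather than the paper's reduction.

```python
import cmath
import itertools


def accept_prob_direct(coeffs, modulus, target):
    """Pr over uniform x in {0,1}^n that sum_j a_j x_j = target (mod modulus)."""
    n = len(coeffs)
    hits = sum(
        1
        for x in itertools.product((0, 1), repeat=n)
        if sum(a * b for a, b in zip(coeffs, x)) % modulus == target % modulus
    )
    return hits / 2 ** n


def accept_prob_via_shapes(coeffs, modulus, target):
    """Same probability, as an average over k of E[prod_j exp(2*pi*i*k*a_j*x_j/M)]."""
    total = 0j
    for k in range(modulus):
        shape_mean = 1 + 0j
        for a in coeffs:
            # E over a uniform bit of exp(2*pi*i*k*a*x/M)
            shape_mean *= (1 + cmath.exp(2j * cmath.pi * k * a / modulus)) / 2
        total += cmath.exp(-2j * cmath.pi * k * target / modulus) * shape_mean
    return (total / modulus).real


print(accept_prob_direct([2, 3, 5, 7], 6, 1), accept_prob_via_shapes([2, 3, 5, 7], 6, 1))
```

Under this identity, an error of $\delta$ on each of the $M$ Fourier shapes translates into an error of at most $\delta$ on the modular test, since the test is an average of the shapes.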
PRGs for combinatorial shapes. Combinatorial shapes were introduced in the work of [GMRZ13] as a generalization of combinatorial rectangles and to address fooling linear sums in statistical distance. These are functions of the form
for functions and a function . The best previous generators of [GMRZ13] and [De14] for combinatorial shapes achieve a seed-length of , ; in particular, the best previous seed-length for polynomially small error was . PRGs for -Fourier shapes with error imply PRGs for combinatorial shapes.
Combinatorial rectangles are a well-studied subset of combinatorial shapes [EGL98, ASWZ96, LLSZ97, Lu02]. They are functions that can be written as for some arbitrary subsets . The best known PRG due to [GMR12, GY14] gives a seed-length of . Combinatorial rectangles are special cases of Fourier shapes so our PRG for -Fourier shapes also fools combinatorial rectangles, but requires a slightly longer seed. The alphabet-reduction step in our construction is inspired by the generator of [GMR12, GY14].
Achieving optimal error dependence via Fourier shapes.
We note that having generators for Fourier shapes with seed-length even when is polynomially small is essential in our reductions: we sometimes need error for Fourier shapes in order to get error for our target class of functions. Once we have this, starting with a sufficiently small polynomial results in polynomially small error for the target class of functions.
We briefly explain why previous techniques based on limit theorems were unable to achieve polynomially small error with optimal seed-length, by considering the setting of halfspaces under the uniform distribution on . Fooling halfspaces is equivalent to fooling all linear functions in Kolmogorov or cdf distance. Previous work on fooling halfspaces [DGJ09, MZ13] relies on the Berry-Esséen theorem, a quantitative form of the central limit theorem, to show that the cdf of regular linear functions is close to that of the Gaussian distribution, both under the uniform distribution and under the pseudorandom distribution. However, even for the majority function (which is the most regular linear function), the discreteness of means that the Kolmogorov distance from the Gaussian distribution is , even when is uniformly random. Approaches that show closeness in cdf distance by comparison to the Gaussian distribution seem unlikely to give polynomially small error with optimal seed-length.
We depart from the derandomized limit theorem approach taken by several previous works [DGJ09, DKN10, GOWZ10, HKM12, GMRZ13, MZ13] and work directly with the Fourier transform. A crucial insight (that is formalized in Lemma 9.2) is that fooling the Fourier transform of linear forms to within polynomially small error implies polynomially small Kolmogorov distance.
1.2 Our results
Our main result is the following:
There is an explicit generator that fools all -Fourier shapes with error , and has seed-length .
We now state various corollaries of our main result starting with fooling halfspaces.
There is an explicit generator that fools halfspaces over under the uniform distribution with error , and has seed-length .
The best previous generator due to [MZ13] had a seed-length of , which is for polynomially small error .
We also get a PRG with similar parameters for generalized halfspaces.
There is an explicit generator that -fools generalized halfspaces over , and has seed-length .
From this we can derive PRGs with seed-length for fooling halfspaces with error under the Gaussian distribution and the uniform distribution on the sphere. Indeed, we get the following bound for arbitrary product distributions over , which depends on the moment of each co-ordinate.
Let be a product distribution on such that for all ,
There exists an explicit generator such that if , then for every halfspace ,
The generator has seed-length .
The next corollary is a near-optimal derandomization of the Chernoff-Hoeffding bounds. To get a similar guarantee, the best known seed-length that follows from previous work [SSS95, MZ13, GOWZ10] was .
Let be independent random variables over the domain . Let be arbitrary bounded functions. There exists an explicit generator such that if where , then is distributed identically to and
has seed-length .
We get the first generator for fooling modular tests whose dependence on the modulus is near-logarithmic. The best previous generator from [LRTV09] had a seed-length of , which is for .
There is an explicit generator that fools all linear tests modulo for all with error , and has seed-length .
Finally, we get a generator with near-logarithmic seed-length for fooling combinatorial shapes. [GMRZ13] gave a PRG for combinatorial shapes with a seed-length of . This was improved recently by De [De14] who gave a PRG with seed-length ; in particular, the best previous seed-length for polynomially small error was .
There is an explicit generator that fools -combinatorial shapes to error and has seed-length .
1.3 Other related work
Starting with the work of Diakonikolas et al. [DGJ09], there has been a lot of interest in constructing PRGs for halfspaces and related classes such as intersections of halfspaces and polynomial threshold functions over the domain [DKN10, GOWZ10, HKM12, MZ13, Kan11b, Kan11a, Kan14]. Rabani and Shpilka [RS10] construct optimal hitting set generators for halfspaces over ; hitting set generators are weaker than PRGs.
Another line of work gives PRGs for halfspaces for the uniform distribution over the sphere (spherical caps) or the Gaussian distribution. For spherical caps, Karnin, Rabani and Shpilka [KRS12] gave a PRG with a seed-length of . For the Gaussian distribution, [Kan14] gave a PRG which achieves a seed-length of . Recently, [KM15] gave the first PRGs for these settings with seed-length . Fooling halfspaces over the hypercube is known to be harder than the Gaussian setting or the uniform distribution on the sphere; hence our result gives a construction with similar parameters up to a factor. At a high level, [KM15] also uses an iterative dimension reduction approach like in [KMN11, CRSW13, GMR12]; however, the final construction and its analysis are significantly different from ours.
Gopalan et al. [GOWZ10] gave a generator fooling halfspaces under product distributions with bounded fourth moments, whose seed-length is .
The present work completely subsumes a manuscript of the authors which essentially solved the special-case of derandomizing Chernoff bounds and a special class of halfspaces [GKM14].
2 Proof overview
We describe our PRG for Fourier shapes as in Theorem 1.1. The various corollaries are derived from this theorem using properties of the discrete Fourier transform of integer-valued random variables.
Let us first consider a very simple PRG: -wise independent distributions over . At a glance, it appears to do very poorly as it is easy to express the parity of a subset of bits as a Fourier shape and parities are not fooled even by -wise independence. The starting point for our construction is that bounded independence does fool a special but important class of Fourier shapes, namely those with polynomially small total variance.
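The obstacle posed by parities can be made concrete with a small, standard example (an illustration, not part of the paper's construction): the uniform distribution over $n$-bit strings with an even number of ones is $(n-1)$-wise independent, yet the parity Fourier shape $\prod_j (-1)^{x_j}$ has expectation $1$ under it and $0$ under the uniform distribution. All function names below are illustrative.

```python
import itertools


def even_parity_strings(n):
    """Support of the uniform distribution on n-bit strings with an even number of ones."""
    return [x for x in itertools.product((0, 1), repeat=n) if sum(x) % 2 == 0]


def is_k_wise_uniform(support, k):
    """Check that every k coordinates of a uniform sample from `support`
    are uniformly distributed over {0,1}^k."""
    n = len(support[0])
    for coords in itertools.combinations(range(n), k):
        counts = {}
        for x in support:
            key = tuple(x[c] for c in coords)
            counts[key] = counts.get(key, 0) + 1
        if len(counts) != 2 ** k or len(set(counts.values())) != 1:
            return False
    return True


def parity_shape_mean(support):
    """Expectation of prod_j (-1)^{x_j} over a uniform sample from `support`."""
    return sum((-1) ** sum(x) for x in support) / len(support)


n = 4
support = even_parity_strings(n)
all_strings = list(itertools.product((0, 1), repeat=n))
print(is_k_wise_uniform(support, n - 1))                          # (n-1)-wise independent
print(parity_shape_mean(support), parity_shape_mean(all_strings))  # 1.0 versus 0.0
```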
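For a complex valued random variable $Z$, define the variance of $Z$ as
$$\sigma^2(Z) = \mathbb{E}\big[|Z - \mathbb{E}[Z]|^2\big].$$
It is easy to verify that
$$\sigma^2(Z) = \mathbb{E}[|Z|^2] - |\mathbb{E}[Z]|^2,$$
so that if $Z$ takes values in the unit disk, then
$$\sigma^2(Z) \le 1 - |\mathbb{E}[Z]|^2.$$
The total-variance of a Fourier shape $f$ with $f(x) = \prod_j f_j(x_j)$ is defined as
$$\mathrm{Tvar}(f) = \sum_j \sigma^2(f_j(x_j)),$$
where each $x_j$ is uniformly random. To gain some intuition for why this is a natural quantity, note that $\mathrm{Tvar}(f)$ gives an easy upper bound on the expectation of a Fourier shape:
$$|\mathbb{E}_x[f(x)]| = \prod_j |\mathbb{E}[f_j(x_j)]| \le \prod_j \sqrt{1 - \sigma^2(f_j(x_j))} \le \exp(-\mathrm{Tvar}(f)/2).$$
This inequality suggests a natural dichotomy for the task of fooling Fourier shapes. It suggests that high variance shapes, where $\mathrm{Tvar}(f) \gg \log(1/\varepsilon)$ for target error $\varepsilon$, are easy in the sense that $|\mathbb{E}[f]|$ is small for such Fourier shapes. So a PRG for such shapes only needs to ensure that $|\mathbb{E}[f]|$ is also sufficiently small under the pseudorandom output.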
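The bound on the expectation can be checked numerically. The sketch below assumes the standard definition of variance for a complex random variable, $\mathbb{E}|Z - \mathbb{E}Z|^2$, takes the total variance to be the sum of the per-coordinate variances under uniform inputs, and verifies $|\mathbb{E}[f]| \le \exp(-\text{total variance}/2)$ on a hypothetical shape. The shape, the names and the constant $1/2$ in the exponent come from this sketch's own derivation and are not quoted from the paper.

```python
import cmath
import math


def mean_and_variance(values):
    """Mean and variance E|Z - E Z|^2 of Z uniform over the given complex values."""
    mu = sum(values) / len(values)
    var = sum(abs(v - mu) ** 2 for v in values) / len(values)
    return mu, var


def total_variance_bound(shape):
    """shape[j] lists the values f_j(0), ..., f_j(m-1), all in the unit disk.

    Returns (|E f|, total variance, exp(-total variance / 2)) for independent
    uniform coordinates."""
    abs_mean, tvar = 1.0, 0.0
    for values in shape:
        mu, var = mean_and_variance(values)
        abs_mean *= abs(mu)
        tvar += var
    return abs_mean, tvar, math.exp(-tvar / 2)


# Hypothetical shape over bits: f_j(0) = 1, f_j(1) = exp(2*pi*i*alpha*w_j).
alpha, weights = 0.21, [1, 2, 3, 4, 5, 6]
shape = [[1, cmath.exp(2j * cmath.pi * alpha * w)] for w in weights]
abs_mean, tvar, bound = total_variance_bound(shape)
print(abs_mean, tvar, bound)
```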
To complement the above, we show that if the total-variance is very small, then generators based on limited independence do fairly well. Concretely, our main technical lemma says that limited independence fools products of bounded (complex-valued) random variables, provided that the sum of their variances is small.
Let be -wise independent random variables taking values in . Then,
We defer discussion of the proof to Section 2.4, and continue the description of our construction. Recall that we are trying to fool a -Fourier shape with to error . It is helpful to think of the desired error as being fixed at the beginning and staying unchanged through our iterations, while and change during the iterations. Generating -wise independent distributions over takes random bits. Thus if we use -wise independence, we would achieve error , but with seed-length rather than .
On the other hand, if for a fixed constant , then choosing -wise independence is enough to get error while also achieving seed-length as desired. We exploit this observation by combining the use of limited independence with the recent iterative-dimension-reduction paradigm of [KMN11, CRSW13, GMR12]. Our construction reduces the problem of fooling Fourier shapes with through a sequence of iterations to fooling Fourier shapes where the total variance is polynomially small in in each iteration and then uses limited independence in each iteration.
To conclude our high-level description, our generator consists of three modular parts. The first is a generator for Fourier shapes with high total variance: . We then give two reductions to handle low variance Fourier shapes: an alphabet-reduction step reduces the alphabet down to and leaves unchanged, and a dimension-reduction step that reduces the dimension from to while possibly blowing up the alphabet to . We describe each of these parts in more detail below.
2.1 Fooling high-variance Fourier shapes
We construct a PRG with seed-length which -fools -Fourier shapes when for some sufficiently large constant . We build the generator in two steps.
In the first step, we build a PRG with seed-length which achieves constant error for -Fourier shapes with . In the second step, we drive the error down to as follows. We hash the coordinates into roughly buckets, so that for at least buckets, restricted to the coordinates within the bucket has total-variance at least . We use the PRG with constant error within each bucket, while the seeds across buckets are recycled using a PRG for small-space algorithms. This construction is inspired by the construction of small-bias spaces due to Naor and Naor [NN93]; the difference being that we use generators for space bounded algorithms for amplification, as opposed to expander random walks as done in [NN93].
2.2 Alphabet-reduction
The next building block in our construction is alphabet-reduction which helps us assume without loss of generality that the alphabet-size is polynomially bounded in terms of the dimension . This is motivated by the construction of [GMR12].
Concretely, we show that constructing an -PRG for -Fourier shapes can be reduced to that of constructing an -PRG for -Fourier shapes for . The alphabet-reduction step consists of steps where in each step we reduce fooling -Fourier shapes for , to that of fooling -Fourier shapes, at the cost of random bits.
We now describe a single step that reduces the alphabet from to . Consider the following procedure for generating a uniformly random element in :
For , sample uniformly random subsets
Sample uniformly at random from .
Output , where .
Our goal is to derandomize this procedure. The key observation is that once the subsets are chosen, we are left with a -Fourier shape as a function of . So the choice of can be derandomized using a PRG for Fourier shapes with alphabet , and it suffices to derandomize the choice of the ’s. A calculation shows that (because the ’s are uniformly random), derandomizing the choice of the ’s reduces to that of fooling a Fourier shape of total-variance . Lemma 2.1 implies that this can be done with limited independence.
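The two-stage sampling procedure above can be sketched as follows. Because the symbols in the source text are missing, the sketch makes explicit assumptions: each subset has size $\sqrt{m}$ (with $m$ a perfect square), subsets are sampled independently per coordinate, and the output symbol in coordinate $j$ is the element of the $j$-th subset at the sampled index. It illustrates the fully random procedure only, not its derandomization.

```python
import itertools
import math
import random


def sample_via_subsets(m, n, rng):
    """Sample x in {0..m-1}^n by first picking a random sqrt(m)-subset per
    coordinate and then a uniform index into each (sorted) subset."""
    d = math.isqrt(m)
    subsets = [sorted(rng.sample(range(m), d)) for _ in range(n)]
    y = [rng.randrange(d) for _ in range(n)]
    return [subsets[j][y[j]] for j in range(n)]


def single_coordinate_distribution(m):
    """Exact output distribution of one coordinate, obtained by enumerating
    all subsets of size sqrt(m) and all indices."""
    d = math.isqrt(m)
    counts = [0] * m
    for subset in itertools.combinations(range(m), d):
        for index in range(d):
            counts[subset[index]] += 1
    total = sum(counts)
    return [c / total for c in counts]


print(sample_via_subsets(16, 5, random.Random(0)))
print(single_coordinate_distribution(9))  # every symbol has probability 1/9
```

Once the subsets are fixed, the output depends only on the indices, which range over an alphabet of size $\sqrt{m}$; this is the sense in which one step shrinks the alphabet.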
2.3 Dimension-reduction for low-variance Fourier shapes
We show that constructing an -PRG for -Fourier shapes with can be reduced to that of -fooling -Fourier shapes for . Note that here we decreased the dimension at the expense of increasing the alphabet-size. However, this can be fixed by employing another iteration of alphabet-reduction. This is the reason why considering -Fourier shapes for arbitrary helps us even if we were only trying to fool -Fourier shapes. The dimension-reduction proceeds as follows:
We first hash the coordinates into roughly buckets using a -wise independent hash function for . Note that this only requires random bits.
For the coordinates within each bucket we use a -wise independent string in for . We use true independence across buckets. Note that this requires independent seeds of length .
While the above process requires too many random bits by itself, it is easy to analyze. We then reduce the seed-length by observing that if we fix the hash function , then what we are left with as a function of the seeds used for generating the symbols in each bucket is a -Fourier shape. So rather than using independent seeds, we can use the output of a generator for such Fourier shapes.
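The observation that, for a fixed hash function, the shape factors into one function per bucket seed can be checked directly. In the sketch below, `h` is a fixed list assigning coordinates to buckets and `expand` is a hypothetical stand-in for a limited-independence generator mapping a seed to a full-length string; neither is the paper's actual construction. The code verifies that evaluating the shape on the stitched-together input equals the product of the per-bucket factors.

```python
import cmath


def evaluate_shape(tables, x):
    """f(x) = prod_j tables[j][x[j]] for a Fourier shape given by value tables."""
    result = 1 + 0j
    for table, symbol in zip(tables, x):
        result *= table[symbol]
    return result


def bucket_factor(tables, h, b, expand, seed):
    """F_b(seed) = product of f_j(expand(seed)[j]) over coordinates j with h[j] == b."""
    x = expand(seed)
    result = 1 + 0j
    for j, table in enumerate(tables):
        if h[j] == b:
            result *= table[x[j]]
    return result


def stitched_input(h, seeds, expand):
    """Coordinate j takes its symbol from the string generated by its bucket's seed."""
    strings = [expand(s) for s in seeds]
    return [strings[h[j]][j] for j in range(len(h))]


n = 6
tables = [[1, cmath.exp(2j * cmath.pi * 0.1 * (j + 1))] for j in range(n)]
h = [0, 1, 0, 2, 1, 2]  # fixed hash: coordinate j goes to bucket h[j]


def expand(seed):
    # Hypothetical stand-in for a limited-independence generator.
    return [(seed >> (j % 3)) & 1 for j in range(n)]


seeds = [3, 5, 6]  # one seed per bucket
direct = evaluate_shape(tables, stitched_input(h, seeds, expand))
factored = 1 + 0j
for b, seed in enumerate(seeds):
    factored *= bucket_factor(tables, h, b, expand, seed)
print(abs(direct - factored))
```

Each factor has absolute value at most one, so as a function of the tuple of bucket seeds the whole expression is again a product of unit-disk-valued functions, one per bucket.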
2.4 Main Technical Lemma
The lemma can be seen as a generalization of a similar result proved for real-valued random variables in [GY14] (who also have an additional restriction on the means of the random variables). However, the generalization to complex-valued variables is substantial and seems to require different proof techniques.
We first consider the case where the ’s not only have small total-variance, but also have small absolute deviation from their means. Concretely, let where and . In this case, we do a variable change (taking the principal branch of the logarithm) to rewrite
We then argue that can be approximated by a polynomial of degree less than with small expected error. The polynomial is obtained by truncating the Taylor series expansion of the function. Once we have such a low-degree polynomial approximator, the claim follows as limited independence fools low-degree polynomials.
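As an illustration of the truncation step, the sketch below evaluates the degree-$(k-1)$ Taylor polynomial of the complex exponential and compares the truncation error with the elementary bound $|z|^k e^{|z|}/k!$. This bound is a standard fact used here for illustration; it is an assumption that it plays the role of the paper's own estimate, whose exact form and constants may differ.

```python
import cmath
import math


def truncated_exp(z, k):
    """Degree-(k-1) Taylor polynomial of exp at 0: sum over j < k of z^j / j!."""
    term, total = 1 + 0j, 0j
    for j in range(k):
        total += term
        term *= z / (j + 1)
    return total


def remainder_bound(z, k):
    """Elementary bound |exp(z) - truncated_exp(z, k)| <= |z|^k / k! * exp(|z|)."""
    return abs(z) ** k / math.factorial(k) * math.exp(abs(z))


z = 0.3 + 0.4j  # hypothetical point with |z| = 0.5
for k in (2, 4, 6, 8):
    print(k, abs(cmath.exp(z) - truncated_exp(z, k)), remainder_bound(z, k))
```

The error decays factorially in $k$ when $|z|$ is bounded, which is why a polynomial of modest degree suffices when the deviations are small.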
To handle the general case where ’s are not necessarily bounded, we use an inclusion-exclusion argument and exploit the fact that with high probability, not many of the ’s (say more than ) will deviate too much from their expectation. We leave the details to the actual proof.
3 Preliminaries
We start with some notation:
For and a hash function , define
be the unit disk in the complex plane.
For a complex valued random variable ,
Unless otherwise stated denote universal constants.
Throughout we assume that is sufficiently large and that are sufficiently small.
For positive functions we write when .
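For an integer-valued random variable $Z$, its Fourier transform is given as follows: for $\alpha \in [0,1]$, $\hat{Z}(\alpha) = \mathbb{E}[\exp(2\pi i \alpha Z)]$. Further, given the Fourier coefficients $\hat{Z}(\alpha)$, one can compute the probability density function of $Z$ as follows: for any integer $j$,
$$\Pr[Z = j] = \int_0^1 \exp(-2\pi i j \alpha)\, \hat{Z}(\alpha)\, d\alpha.$$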
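A minimal numerical sketch of Fourier inversion, under the assumption that the random variable takes values in $\{0, \ldots, N-1\}$: in that case the inversion integral over $[0,1]$ can be replaced exactly by an average over the $N$ points $\alpha = k/N$. The function names and the example distribution are illustrative.

```python
import cmath


def fourier_transform(pmf, alpha):
    """E[exp(2*pi*i*alpha*Z)] for Z with Pr[Z = j] = pmf[j], j = 0..len(pmf)-1."""
    return sum(p * cmath.exp(2j * cmath.pi * alpha * j) for j, p in enumerate(pmf))


def invert(transform, support_size):
    """Recover Pr[Z = j] from the transform sampled at alpha = k/N, N = support_size."""
    n = support_size
    samples = [transform(k / n) for k in range(n)]
    return [
        (sum(s * cmath.exp(-2j * cmath.pi * j * k / n) for k, s in enumerate(samples)) / n).real
        for j in range(n)
    ]


pmf = [0.1, 0.2, 0.3, 0.4]  # hypothetical distribution on {0, 1, 2, 3}
recovered = invert(lambda a: fourier_transform(pmf, a), len(pmf))
print(recovered)
```

The inversion is linear in the transform, so a pointwise error of $\delta$ in the Fourier coefficients changes each recovered probability by at most $\delta$.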
For we say that a family of hash functions is -biased if for any distinct indices and ,
We say that such a family is -wise independent if the above holds with for all .
We say that a distribution over is -biased or -wise independent if the corresponding family of functions is.
Such families of functions can be generated efficiently using small seeds.
For , there exist explicit -biased families of hash functions that can be generated efficiently from a seed of length . There are also explicit -wise independent families that can be generated efficiently from a seed of length .
Taking the pointwise sum of such generators modulo gives a family of hash functions that is both -biased and -wise independent generated from a seed of length .
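For intuition about limited independence, here is the classical polynomial construction of a $k$-wise independent hash family over a prime field, together with an exhaustive check for tiny parameters. This is a textbook construction used as an assumption for illustration (with domain and range both equal to $\{0,\ldots,p-1\}$ for a prime $p$); it is not claimed to be the specific family cited above, and it does not cover the biased families.

```python
import itertools


def polynomial_hash_family(p, k):
    """All hash functions h_c(i) = sum_t c[t] * i^t mod p, one per coefficient
    vector c in {0..p-1}^k, each stored as the table (h_c(0), ..., h_c(p-1)).

    For a prime p this family is k-wise independent; a member is described by
    k field elements, i.e. about k*log2(p) seed bits."""
    family = []
    for coeffs in itertools.product(range(p), repeat=k):
        family.append(tuple(
            sum(c * pow(i, t, p) for t, c in enumerate(coeffs)) % p
            for i in range(p)
        ))
    return family


def is_k_wise_independent(family, p, k):
    """Check that for any k distinct inputs, the k outputs of a uniformly random
    family member are uniform over {0..p-1}^k."""
    for points in itertools.combinations(range(p), k):
        counts = {}
        for h in family:
            key = tuple(h[i] for i in points)
            counts[key] = counts.get(key, 0) + 1
        if len(counts) != p ** k or len(set(counts.values())) != 1:
            return False
    return True


pairwise = polynomial_hash_family(5, 2)
print(len(pairwise), is_k_wise_independent(pairwise, 5, 2))
```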
3.1 Basic Results
We start with the simple observation that to -fool an -Fourier shape , we can assume the functions in have bit-precision . This observation will be useful when we use PRGs for small-space machines to fool Fourier shapes in certain parameter regimes.
If a PRG -fools -Fourier shapes when ’s have bit precision , then fools all -Fourier shapes with error at most .
Consider an arbitrary -Fourier shape with . Let be obtained by truncating the ’s to bits. Then, for all . Therefore, if we define , then for any , (as the ’s and ’s are in )
The claim now follows as the above inequality holds point-wise and by assumption, -fools . ∎
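The pointwise inequality in this argument is an instance of the standard hybrid (telescoping) bound: for numbers in the unit disk, $|\prod_j a_j - \prod_j b_j| \le \sum_j |a_j - b_j|$. The sketch below checks it numerically for values truncated to a fixed number of bits. Truncating the real and imaginary parts directly is an assumption of this sketch; the paper may truncate a different representation of the values.

```python
import cmath
import random


def product(values):
    """Product of a list of complex numbers."""
    result = 1 + 0j
    for v in values:
        result *= v
    return result


def truncate(z, bits):
    """Round the real and imaginary parts of z toward zero to `bits` binary
    digits; rounding toward zero keeps z inside the unit disk."""
    scale = 2 ** bits
    return complex(int(z.real * scale) / scale, int(z.imag * scale) / scale)


def check_hybrid_bound(values, bits):
    """Return (|prod a_j - prod b_j|, sum_j |a_j - b_j|) for b_j = truncate(a_j)."""
    rounded = [truncate(v, bits) for v in values]
    gap = abs(product(values) - product(rounded))
    budget = sum(abs(v - r) for v, r in zip(values, rounded))
    return gap, budget


rng = random.Random(1)
values = [cmath.rect(rng.random(), 2 * cmath.pi * rng.random()) for _ in range(20)]
gap, budget = check_hybrid_bound(values, 10)
print(gap, budget)
```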
We collect some known results about pseudorandomness and prove some other technical results that will be used later.
We shall use PRGs for small-space machines or read-once branching programs (ROBP) of Nisan [Nis92], [NZ96] and Impagliazzo, Nisan and Wigderson [INW94]. We extend the usual definitions of read-once branching programs to compute complex-valued functions; the results of [Nis92], [NZ96], [INW94] apply to this extended model readily.
Definition 4 (-ROBP).
An -ROBP is a layered directed graph with layers and vertices per layer with the following properties.
The first layer has a single start node and the vertices in the last layer are labeled by complex numbers from .
A vertex in layer , has edges to layer each labeled with an element of .
A graph as above naturally defines a function where on input one traverses the edges of the graph according to the labels and outputs the label of the final vertex reached.
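A minimal sketch of how such a layered program is evaluated. The data layout (`transitions[t][v][a]` for the edge labeled `a` out of vertex `v` in layer `t`, and `final_labels[v]` for the last layer) is an assumption made for illustration, as is the parity example.

```python
def evaluate_robp(transitions, final_labels, x):
    """Evaluate a read-once branching program on input x.

    transitions[t][v][a] is the vertex in layer t+1 reached from vertex v in
    layer t along the edge labeled a; final_labels[v] is the complex label of
    vertex v in the last layer. Vertex 0 of the first layer is the start node."""
    v = 0
    for t, symbol in enumerate(x):
        v = transitions[t][v][symbol]
    return final_labels[v]


def parity_robp(n):
    """Width-2 program over the binary alphabet whose state tracks the running
    parity; it computes the Fourier shape prod_j (-1)^{x_j}. Only vertex 0 of
    the first layer is ever used as the start node."""
    transitions = [[[0, 1], [1, 0]] for _ in range(n)]
    final_labels = [1, -1]
    return transitions, final_labels


transitions, final_labels = parity_robp(4)
print(evaluate_robp(transitions, final_labels, [1, 0, 1, 1]))
```

More generally, a Fourier shape whose partial products are stored to limited precision can be computed this way, with the vertex in each layer recording the current partial product.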
There exists an explicit PRG which -fools -branching programs and has seed-length .
Theorem 3.4 ([NZ96]).
For all and , there exists an explicit PRG which -fools -branching programs for and has seed-length .
The next two lemmas quantify load-balancing properties of -biased hash functions in terms of the -norms of vectors. Proofs can be found in Appendix A.
Let be an integer. Let and be either a -biased hash family for or a -wise independent family for . Then
For all , let be even and a -wise independent family, and ,
4 Fooling products of low-variance random variables
We now show one of our main technical claims that products of complex-valued random variables are fooled by limited independence if the sum of variances of the random variables is small. The lemma is essentially equivalent to saying that limited independence fools low-variance Fourier shapes.
Let be -wise independent random variables taking values in . Then,
More concretely, let be independent random variables taking values in . Let and . Let be a positive even integer and let be a -wise independent family of random variables with each distributed identically to . Then, we will show that for a sufficiently big constant,
We start with the following standard bound on moments of bounded random variables whose proof is deferred to Appendix B.
Let be random variables with , and . Then, for all even positive integers ,
We also use some elementary properties of the (complex-valued) log and exponential functions:
For with , , where we take the principal branch of the logarithm.
For and ,
For a random variable with , , and the principal branch of the logarithm function (phase between ), .
For any complex-valued random variable , .
Claims (1), (2) follow from the Taylor series expansions for the complex-valued log and exponential functions.
For (3), note that .
For (4), note that and similarly . The statement now follows from Jensen’s inequality applied to the random variable . ∎
We prove Lemma 4.1 or equivalently, Equation (3) by proving a sequence of increasingly stronger claims. We begin by proving that Equation (3) holds if ’s have small absolute deviation, i.e., lie in a disk of small radius about a fixed point.
Let and be as above. Furthermore, assume that for complex numbers and random variables so that with probability , for all . Let , and . Then we have that
Let , taking the principal branch of the logarithm function and let . Then, by Lemma 4.3 (1), (3), , so that