# Hitting-sets for ROABP and Sum of Set-Multilinear circuits

## Abstract

We give an $n^{O(\log n)}$-time ($n$ is the input size) blackbox polynomial identity testing algorithm for unknown-order read-once oblivious algebraic branching programs (ROABP). The best time-complexity known for this class was due to Forbes-Saptharishi-Shpilka (STOC 2014), and that too only for multilinear ROABP. We get rid of their exponential dependence on the individual degree. With this, we match the time-complexity for the unknown order ROABP with the known order ROABP (due to Forbes-Shpilka (FOCS 2013)) and also with the depth-3 set-multilinear circuits (due to Agrawal-Saha-Saxena (STOC 2013)). Our proof is simpler and involves a new technique called basis isolation.

The depth-3 model has recently gained much importance, as it has become a stepping-stone to understanding general arithmetic circuits. Its restriction to multilinearity has known exponential lower bounds but no nontrivial blackbox identity tests. In this paper, we take a step towards designing such hitting-sets. We give the first subexponential whitebox PIT for the sum of constantly many set-multilinear depth-3 circuits. To achieve this, we define notions of distance and base sets. Distance, for a multilinear depth-3 circuit (say, in $n$ variables and $k$ product gates), measures how far the partitions are from a mere refinement. The $1$-distance strictly subsumes the set-multilinear model, while $n$-distance captures general multilinear depth-3. We design a hitting-set in time $(nk)^{O(\Delta \log n)}$ for $\Delta$-distance. Further, we give an extension of our result to models where the distance is large (close to $n$) but it is small when restricted to certain base sets (of variables).

We also explore a new model of read-once algebraic branching programs (ROABP) where the factor-matrices are invertible (called invertible-factor ROABP). We design a hitting-set in time poly($\text{size}^{w^2}$) for width-$w$ invertible-factor ROABP. Further, we could do without the invertibility restriction when $w = 2$. Previously, the best result for width-2 ROABP was quasi-polynomial time (Forbes-Saptharishi-Shpilka, STOC 2014).

## 1 Introduction

The problem of Polynomial Identity Testing is that of deciding if a given polynomial is nonzero. The complexity of the question depends crucially on the way the polynomial is input to the PIT test. For example, if the polynomial is given as a set of coefficients of the monomials, then we can easily check whether the polynomial is nonzero in polynomial time. The problem has been studied for different input models. Most prominent among them is the model of arithmetic circuits. Arithmetic circuits are the arithmetic analog of boolean circuits and are defined over a field $\mathbb{F}$. They are directed acyclic graphs, where every node is a '$+$' or '$\times$' gate and each input gate is a constant from the field $\mathbb{F}$ or a variable from $\mathbf{x} = \{x_1, x_2, \dots, x_n\}$. Every edge has a weight from the underlying field $\mathbb{F}$. The computation is done in the natural way. Clearly, the output gate computes a polynomial in $\mathbb{F}[\mathbf{x}]$. We can restate the PIT problem as: Given an arithmetic circuit $C$, decide if the polynomial computed by $C$ is nonzero in time polynomial in the circuit size. Note that, given a circuit, computing the polynomial explicitly is not feasible, as it can have exponentially many monomials. However, given the circuit, it is easy to compute an evaluation of the polynomial by substituting the variables with constants.

Though there is no known deterministic algorithm for PIT, there are easy randomized algorithms, e.g. [Sch80]. These randomized algorithms are based on the theorem: A nonzero polynomial, evaluated at a random point, gives a nonzero value with a good probability. Observe that such an algorithm does not need to access the structure of the circuit, it just uses the evaluations; it is a blackbox algorithm. The other kind of algorithms, where the structure of the input is used, are called whitebox algorithms. Whitebox algorithms for PIT have many known applications. E.g. graph matching reduces to PIT. On the other hand, blackbox algorithms (or hitting-sets) have connections to circuit lower bound proofs. Arguably, this is currently the only concrete approach towards lower bounds, see [Mul12b, Mul12a]. See the surveys by Saxena [Sax09, Sax14] and Shpilka & Yehudayoff [SY10] for more applications.
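The randomized blackbox test described above can be sketched as follows. This is a minimal illustration over the integers, not the algorithm of [Sch80] verbatim; the function name `is_probably_nonzero`, the sample-set size `100 * degree_bound`, and the two example polynomials are assumptions made only for this sketch.

```python
import random

def is_probably_nonzero(blackbox, num_vars, degree_bound, trials=20, seed=0):
    """Randomized blackbox identity test based on random evaluations.

    blackbox: callable taking a list of ints and returning the polynomial's value.
    Points are drawn from a sample set of size 100 * degree_bound, so a nonzero
    polynomial of total degree <= degree_bound vanishes at a random point
    with probability at most 1/100.
    """
    rng = random.Random(seed)
    sample_size = 100 * max(degree_bound, 1)
    for _ in range(trials):
        point = [rng.randrange(sample_size) for _ in range(num_vars)]
        if blackbox(point) != 0:
            return True   # a nonzero evaluation certifies that the polynomial is nonzero
    return False          # every evaluation vanished: zero with high probability

# (x + y)^2 - x^2 - 2xy - y^2 is identically zero; (x + y)^2 - x^2 - y^2 = 2xy is not.
zero_poly = lambda p: (p[0] + p[1]) ** 2 - p[0] ** 2 - 2 * p[0] * p[1] - p[1] ** 2
nonzero_poly = lambda p: (p[0] + p[1]) ** 2 - p[0] ** 2 - p[1] ** 2
```

The test only calls `blackbox` on points, which is exactly what makes it a blackbox algorithm; it never inspects how the polynomial is computed.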

An Arithmetic Branching Program (ABP) is another interesting model of computing polynomials. It consists of a directed acyclic graph with a source and a sink. The edges of the graph have polynomials as their weights. The weight of a path is the product of the weights of the edges present in the path. The polynomial computed by the ABP is the sum of the weights of all the paths from the source to the sink. It is well known that for an ABP, the underlying graph can be seen as a layered graph such that all paths from the source to the sink have exactly one edge in each layer. The polynomial computed by the ABP can then be written as a matrix product, where each matrix corresponds to a layer. The entries in the matrices are weights of the corresponding edges. The maximum number of vertices in a layer, or equivalently, the dimension of the corresponding matrices, is called the width of the ABP. It is known that symbolic determinant and ABP are equivalent models of computation [Tod91, MV97]. Ben-Or & Cleve [BOC92] have shown that a polynomial computed by a formula of logarithmic depth and constant fan-in can also be computed by a width-3 ABP. Thus, ABP is a strong model for computing polynomials. The following chain of reductions shows the power of ABP and its constant-width version relative to other arithmetic computation models (see [BOC92] and [Nis91, Lemma 1]).

$$\text{Constant-depth Arithmetic Circuits} \;\le_p\; \text{Constant-width ABP} \;\le_p\; \text{Formulas} \;\le_p\; \text{ABP} \;\le_p\; \text{Arithmetic Circuits}$$

Our first result is for a special class of ABP called Read Once Oblivious Arithmetic Branching Programs (ROABP). An ABP is a read once ABP (ROABP) if the weights in its layers are univariate polynomials in distinct variables, i.e. the $i$-th layer has weights coming from $\mathbb{F}[x_{\pi(i)}]$, where $\pi$ is a permutation on the set $[n]$. When we know this permutation $\pi$, we call it an ROABP with known variable order (it is significant only in the blackbox setting).

Raz and Shpilka [RS05] gave a $\mathrm{poly}(n, w, \delta)$-time whitebox algorithm for $n$-variate polynomials computed by a width-$w$ ROABP with individual degree bound $\delta$. Recently, Forbes and Shpilka [FS12, FS13] gave an $(n\delta w)^{O(\log n)}$-time blackbox algorithm for the same, when the variable order is known. Subsequently, Forbes et al. [FSS14] gave a blackbox test for the case of unknown variable order, but with time complexity being $(n\delta w)^{O(\delta \log w \log n)}$. Note the exponential dependence on the degree. Their time complexity becomes quasi-polynomial in case of multilinear polynomials, i.e. $\delta = 1$.

In another work, Jansen et al. [JQS10b] gave a quasi-polynomial time blackbox test for a sum of constantly many multilinear “ROABP”. Their definition of “ROABP” is more stringent. They assume that every variable appears at most once in the ABP. Later, this result was generalized to “read- OABP” [JQS10a], where a variable can occur in at most one layer, and on at most edges. Our definition of ROABP seems much more powerful than both of these.

We improve the result of [FSS14] and match the time complexity for the unknown order case with the known order case (given by [FS12, FS13]). Unlike [FSS14], we do not have exponential dependence on the individual degree. Formally,

{theorem}

Let $C(\mathbf{x})$ be an $n$-variate polynomial computed by a width-$w$ ROABP (unknown order) with the degree of each variable bounded by $\delta$. Then there is an $(n\delta w)^{O(\log n)}$-time hitting set for $C$.

Remark. Our algorithm also works when the layers have their weights as general sparse polynomials (still over disjoint sets of variables) instead of univariate polynomials (see the detailed version in Section 3).

A polynomial computed by a width- ABP can be written as , where and is a polynomial over the matrix algebra. Like [ASS13, FSS14], we try to construct a basis (or extract the rank) for the coefficient vectors in . We actually construct a weight assignment on the variables, which isolates a basis in the coefficients in . This idea is inspired from the rank extractor techniques in [ASS13, FSS14]. Our approach is to directly work with , while [ASS13, FSS14] have applied a rank extractor to small subcircuits of , by shifting it carefully. In fact, the idea of basis isolating weight assignment evolved when we tried to find a direct proof, for the rank extractor in [ASS13], which does not involve subcircuits. But, our techniques go much further than both [ASS13, FSS14], as is evident from our strictly better time-complexity results.

The boolean analog of ROABP, read once ordered branching programs (ROBP), has been studied extensively with regard to the RL vs. L question. For ROBP, a pseudorandom generator (PRG) with seed length $O(\log^2 n)$ ($n^{O(\log n)}$ size sample set) is known in the case of known variable order [Nis90]. This is analogous to the [FS13] result for known order ROABP. On the other hand, in the unknown order case, the best known seed length is of size $n^{1/2 + o(1)}$ ($2^{n^{1/2 + o(1)}}$ size sample set) [IMZ12]. One can ask: Can the result for the unknown order case be matched with the known order case in the boolean setting as well? Recently, there has been partial progress in this direction by [SVW14].

The PIT problem has also been studied for various restricted classes of circuits. One such class is depth-3 circuits. Our second result is about a special case of this class. A depth-3 circuit is usually defined as a $\Sigma\Pi\Sigma$ circuit: The circuit gates are in three layers, the top layer has an output gate which is $+$, the second layer has all $\times$ gates and the last layer has all $+$ gates. In other words, the polynomial computed by a $\Sigma\Pi\Sigma$ circuit is of the form $C(\mathbf{x}) = \sum_{i=1}^{k} a_i \prod_{j=1}^{n_i} \ell_{ij}$, where $n_i$ is the number of input lines to the $i$-th product gate and $\ell_{ij}$ is a linear polynomial of the form $b_0 + \sum_{r=1}^{n} b_r x_r$. An efficient solution for depth-3 PIT is still not known. Recently, it was shown by Gupta et al. [GKKS13] that depth-3 circuits are almost as powerful as general circuits. A polynomial time hitting-set for a depth-3 circuit implies a quasi-poly-time hitting-set for general circuits. Till now, for depth-3 circuits, efficient PIT is known when the top fan-in is assumed to be constant [DS07, KS07, KS09, KS11, SS11, SS12, SS13] and for certain other restrictions [Sax08, SSS13, ASSS12].

On the other hand, there are exponential lower bounds for depth-3 multilinear circuits [RY09]. Since there is a connection between lower bounds and PIT [Agr05], we can hope that solving PIT for depth-3 multilinear circuits should also be feasible. This should also lead to new tools for general depth-3.

A polynomial is said to be multilinear if the degree of every variable in every term is at most $1$. The circuit is a multilinear circuit if the polynomial computed at every gate is multilinear. A polynomial time algorithm is known only for a sub-class of multilinear depth-3 circuits, called depth-3 set-multilinear circuits. This algorithm is due to Raz and Shpilka [RS05] and is whitebox. In a depth-3 multilinear circuit, since every product gate computes a multilinear polynomial, a variable occurs in at most one of the linear polynomials input to it. Thus, each product gate naturally induces a partition of the variables, where each color (i.e. part) of the partition contains the variables present in one linear polynomial $\ell_{ij}$ (the $j$-th input of the $i$-th product gate). Further, if the partitions induced by all the product gates are the same then the circuit is called a depth-3 set-multilinear circuit.
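The partition induced by a product gate, and the set-multilinearity condition, can be illustrated with a small sketch. The representation is an assumption made for this example only: a product gate is a list of linear forms, and each linear form is given just by the set of variable names that occur in it. The names `induced_partition` and `is_set_multilinear` are hypothetical.

```python
def induced_partition(product_gate):
    """Return the partition (set of colors) induced by one product gate.

    product_gate: list of linear forms, each given as a set of variable names.
    Returns None if some variable occurs in two linear forms, in which case
    the gate is not multilinear in the sense described above.
    """
    seen = set()
    for form in product_gate:
        if seen & form:
            return None
        seen |= form
    return frozenset(frozenset(form) for form in product_gate if form)

def is_set_multilinear(circuit):
    """A depth-3 circuit (list of product gates) is set-multilinear here
    if every gate is multilinear and all gates induce the same partition."""
    partitions = [induced_partition(gate) for gate in circuit]
    return None not in partitions and len(set(partitions)) <= 1

# Both gates use the colors {x1, x2} and {x3, x4}: set-multilinear.
same_partition = [[{"x1", "x2"}, {"x3", "x4"}], [{"x1", "x2"}, {"x3", "x4"}]]
# Each gate is multilinear, but the two induced partitions differ.
different_partitions = [[{"x1", "x2"}, {"x3", "x4"}], [{"x1", "x3"}, {"x2", "x4"}]]
```

The second circuit is the kind of input the distance notion introduced later is meant to measure: multilinear, but with partitions that do not agree.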

Agrawal et al. [ASS13] gave a quasi-polynomial time blackbox algorithm for the class of depth-3 set-multilinear circuits. But till now, no subexponential time test (not even whitebox) was known even for a sum of two set-multilinear circuits. We give a subexponential time whitebox PIT for a sum of constantly many set-multilinear circuits.

{theorem}

Let be a -variate polynomial, which is a sum of set-multilinear depth-3 circuits, each having top fan-in . Then there is a -time whitebox test for , where .

To achieve this, we define a new class of circuits, as a tool, called multilinear depth-3 circuits with $\Delta$-distance. A multilinear depth-3 circuit has $\Delta$-distance if there is an ordering on the partitions induced by the product gates, say $(P_1, P_2, \dots, P_k)$, such that for any color in the partition $P_i$, there exists a set of at most $\Delta - 1$ other colors in $P_i$ such that the set of variables in the union of these at most $\Delta$ colors are exactly partitioned in the upper partitions, i.e. $\{P_1, P_2, \dots, P_{i-1}\}$. As we will see, such sets of colors form equivalence classes of the colors at partition $P_i$. We call them friendly neighborhoods and they help us in identifying subcircuits. Intuitively, the distance measures how far away the partitions are from a mere refinement sequence of partitions, $P_1 \le P_2 \le \cdots \le P_k$. A refinement sequence of partitions will have distance $1$. On the other hand, general multilinear depth-3 circuits can have at most $n$-distance.

As it turns out, a polynomial computed by a depth-3 $\Delta$-distance circuit (top fan-in $k$) can also be computed by a width-$O(kn^{\Delta})$ ROABP (see Lemma 4.1.1). Thus, we get an $(nk)^{O(\Delta \log n)}$-time hitting set for this class, from Theorem 1. Next, we use a general result about finding a hitting set for a class $m$-base-sets-$\mathcal{C}$, if a hitting set is known for class $\mathcal{C}$. A polynomial is in $m$-base-sets-$\mathcal{C}$, if there exists a partition of the variables into $m$ base sets such that restricted to each base set (treat other variables as field constants), the polynomial is in class $\mathcal{C}$. We combine these two tools to prove Theorem 2. We show that a sum of constantly many set-multilinear circuits falls into the class $m$-base-sets-$\Delta$-distance, for $\Delta m = o(n)$.

Agrawal et al. [AGKS13] had achieved rank concentration, which implies a hitting set, for the class $m$-base-sets-$\Delta$-distance, but through complicated proofs. On the other hand, this work gives only a hitting set for the same class, but with the advantage of simplified proofs.

Our third result deals again with arithmetic branching programs. The results of [BOC92] and [SSS09] show that the constant-width ABP is already a strong model. Here, we study constant-width ABP with some natural restrictions.

We consider a class of ROABPs where all the matrices in the matrix product, except the left-most and the right-most matrices, are invertible. We give a blackbox test for this class of ROABP. In contrast to [FSS14] and our Theorem 1, this test works in polynomial time if the dimension of the matrices is constant.

Note that the class of ABP, where the factor matrices are invertible, is quite powerful, as Ben-Or and Cleve [BOC92] actually reduce formulas to width-3 ABP with invertible factors. Saha, Saptharishi and Saxena [SSS09] reduce depth-3 circuits to width-2 ABP with invertible factors. But the constraints of invertibility and read-once together seem to restrict the computing power of ABP. Interestingly, an analogous class of read-once boolean branching programs called permutation branching programs has been studied recently [KNP11, De11, Ste12]. These works give PRG for this class (for constant width) with seed-length $O(\log n)$, in the known variable order case. In other words, they give a polynomial size sample set which can fool these programs. For the unknown variable order case, Reingold et al. [RSV13] gave a PRG with seed-length $O(\log^2 n)$. Our polynomial size hitting sets for the arithmetic setting work for any unknown variable order. Hence, they are better as compared to the currently known results for the boolean case.

{theorem}

[Informal version] Let be a polynomial such that and and for all , is an invertible matrix (order of the variables is unknown). Let the degree bound on be for . Then there is a -time hitting-set for .

The proof technique here is very different from the first two theorems (here we show rank concentration over a non-commutative algebra, see the proof idea in Section 5). Our algorithm works even when the factor matrices have their entries as general sparse polynomials (still over disjoint sets of variables) instead of univariate polynomials (see the detailed version in Section 5). Running time in this case grows to quasi-polynomial (but is still better than Theorem 1 in several interesting cases).

If the matrices are $2 \times 2$, then we do not need the assumption of invertibility (see Theorem 5.3, Section 5.3). So, for width-2 ROABP our results are strictly stronger than [FSS14] and our Theorem 1. Here again, there is a comparable result in the boolean setting. PRGs with seed-length $O(\log n)$ (polynomial size sample set) are known for width-2 ROBP [BDVY13].

## 2 Preliminaries

Hitting Set. A set of points $\mathcal{H} \subseteq \mathbb{F}^n$ is called a hitting set for a class $\mathcal{C}$ of $n$-variate polynomials if for any nonzero polynomial $P$ in $\mathcal{C}$, there exists a point in $\mathcal{H}$ where $P$ evaluates to a nonzero value. An $f(n)$-time hitting set would mean that the hitting set can be generated in time $f(n)$ for input size $n$.
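As a toy illustration of this definition (not of any construction in this paper), any three distinct points form a hitting set for nonzero univariate polynomials of degree at most two, since such a polynomial has at most two roots. The helper name `is_hit` and the example polynomials are assumptions made for this sketch.

```python
def is_hit(points, blackbox):
    """True if the polynomial (given as a blackbox) is nonzero at some point."""
    return any(blackbox(point) != 0 for point in points)

# Three distinct points: a hitting set for nonzero univariate polynomials of degree <= 2.
points = [(0,), (1,), (2,)]

quadratic = lambda p: p[0] * (p[0] - 1)            # roots 0 and 1, nonzero at 2
cubic = lambda p: p[0] * (p[0] - 1) * (p[0] - 2)   # degree 3: vanishes on all three points
```

The cubic shows that the same point set is not a hitting set for the larger class of degree-3 polynomials: a hitting set is always relative to a class.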

### 2.1 Notation

denotes the set . denotes the set . denotes the set . will denote a set of variables. For a set of variables and for an exponent , will denote the monomial . The support of a monomial is the set of variables that have degree in that monomial. The support size of the monomial is the cardinality of its support. A polynomial is called -sparse if there are monomials in it with nonzero coefficients. For a polynomial , the coefficient of the monomial in is denoted by .

represents the set of all matrices over the field . will denote the algebra of matrices over the field . Let be any -dimensional algebra over the field . For any two elements and (having a natural basis representation in mind), their dot product is defined as ; and the product will denote the product in the algebra .

denotes the set of all possible partitions of the set . Elements in a partition are called colors (or parts).

### 2.2 Arithmetic Branching Programs

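An ABP is a directed graph with $d+1$ layers of vertices $\{V_0, V_1, \dots, V_d\}$ and a start node $u$ and an end node $t$ such that the edges are only going from $u$ to $V_0$, $V_{i-1}$ to $V_i$ for any $i \in [d]$, $V_d$ to $t$. A width-$w$ ABP has $|V_i| \le w$ for all $0 \le i \le d$. Let the set of nodes in $V_i$ be $\{v_{i,j} \mid j \in [w]\}$. All the edges in the graph have weights from $\mathbb{F}[\mathbf{x}]$, for some field $\mathbb{F}$. As a convention, the edges going from $u$ and coming to $t$ are assumed to have weights from the field $\mathbb{F}$.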

For an edge $e$, let us denote its weight by $W(e)$. For a path $p$ from $u$ to $t$, its weight $W(p)$ is defined to be the product of weights of all the edges in it, i.e. $W(p) = \prod_{e \in p} W(e)$. Consider the polynomial $C(\mathbf{x}) = \sum_{p} W(p)$, which is the sum of the weights of all the paths $p$ from $u$ to $t$. This polynomial $C(\mathbf{x})$ is said to be computed by the ABP.

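It is easy to see that this polynomial is the same as $S^{\top} \left(\prod_{i=1}^{d} D_i\right) T$, where $S, T \in \mathbb{F}^{w}$ and $D_i$ is a $w \times w$ matrix for $1 \le i \le d$ such that

$$\begin{aligned}
S(\ell) &= W(u, v_{0,\ell}) && \text{for } 1 \le \ell \le w, \\
D_i(k, \ell) &= W(v_{i-1,k}, v_{i,\ell}) && \text{for } 1 \le \ell, k \le w \text{ and } 1 \le i \le d, \\
T(k) &= W(v_{d,k}, t) && \text{for } 1 \le k \le w.
\end{aligned}$$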

#### ROABP

An ABP is called a read once oblivious ABP (ROABP) if the edge weights in the different layers are univariate polynomials in distinct variables. Formally, the entries in $D_i$ come from $\mathbb{F}[x_{\pi(i)}]$ for all $i \in [d]$, where $\pi$ is a permutation on the set $[d]$.
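The matrix-product view of an ROABP can be made concrete with a short evaluation routine. This is a sketch under assumed conventions: univariate entries are stored as coefficient lists `[c0, c1, ...]`, `order[i]` is the index of the variable read by layer `i`, and the names `eval_univariate` and `eval_roabp` are hypothetical.

```python
def eval_univariate(coeffs, value):
    """Evaluate a univariate polynomial given as [c0, c1, ...] by Horner's rule."""
    result = 0
    for c in reversed(coeffs):
        result = result * value + c
    return result

def eval_roabp(start, layers, order, end, point):
    """Evaluate S^T * D_1 * ... * D_d * T at `point`.

    start, end: lists of field constants (the vectors S and T).
    layers[i][k][l]: coefficient list of the univariate entry (k, l) of layer i.
    order[i]: index of the single variable read by layer i.
    """
    row = list(start)
    for layer, var in zip(layers, order):
        value = point[var]
        width = len(layer[0])
        # Row vector times the evaluated layer matrix.
        row = [sum(row[k] * eval_univariate(layer[k][l], value) for k in range(len(row)))
               for l in range(width)]
    return sum(r * t for r, t in zip(row, end))

# Width-2 example computing x0 * x1 + 1: both layers are diag(x, 1).
diag_x_1 = [[[0, 1], [0]], [[0], [1]]]
start, end = [1, 1], [1, 1]
layers = [diag_x_1, diag_x_1]
```

Evaluation only needs the point and the layer matrices, so the same routine serves a blackbox test once a candidate hitting set is available; the variable order `order` matters for building the program, not for querying it.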

#### sparse-factor ROABP

We call the ABP a sparse-factor ROABP if the edge weights in different layers are sparse polynomials in disjoint sets of variables. Formally, if there exists an unknown partition of the variable set $\mathbf{x}$ into $d$ sets $\{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_d\}$ such that $D_i \in \mathbb{F}^{w \times w}[\mathbf{x}_i]$ is an $s$-sparse polynomial, for all $i \in [d]$, then the corresponding ROABP is called an $s$-sparse-factor ROABP. It is read once in the sense that in the corresponding ABP, any particular variable contributes to at most one edge on any path.

### 2.3 Kronecker Map

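We will often use a weight function on the variables which separates a desired set of monomials. Let $w\colon \mathbf{x} \to \mathbb{N}$ be a weight function on the variables. Consider its natural extension to the set of all monomials as follows: $w\left(\prod_{i=1}^{n} x_i^{\gamma_i}\right) = \sum_{i=1}^{n} \gamma_i \, w(x_i)$, where each $\gamma_i$ is a nonnegative integer.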

{lemma}

[Efficient Kronecker map [Kro82, Agr05]] Let $M$ be the set of all monomials in $n$ variables $\mathbf{x}$ with maximum individual degree $\delta$. Let $A$ be a set of pairs of monomials from $M$. Then there exists a (constructible) set of $N$-many weight functions $w\colon \mathbf{x} \to \{1, \dots, 2N \log N\}$, such that at least one of them separates all the pairs in $A$, i.e. for any $(m, m') \in A$, $w(m) \neq w(m')$, where $N := |A| \, n \log(\delta + 1)$.

{proof}

Since we want to separate the $n$-variate monomials with maximum individual degree $\delta$, we use the naïve Kronecker map $W(x_i) = (\delta + 1)^{i-1}$ for all $i \in [n]$. It can be easily seen that $W$ will give distinct weights to any two monomials (with maximum individual degree $\delta$). But, the weights given by $W$ are exponentially high.

So, we take the weight function $W$ modulo $p$, for many small primes $p$. Each prime $p$ leads to a different weight function. That is our set of candidate weight functions. We need to bound the number $N$ of primes that ensures that at least one of the weight functions separates all the monomial pairs in $A$. We choose the smallest $N$ primes, say $P$ is the set. By the effective version of the Prime Number Theorem, the highest value in the set $P$ is at most $2N \log N$.

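To bound the number $N$ of primes: We want a $p \in P$ such that $\forall (m, m') \in A$, $W(m) \not\equiv W(m') \pmod{p}$. Which means,

$$\exists p \in P, \quad p \nmid \prod_{(m, m') \in A} \bigl(W(m) - W(m')\bigr).$$

In other words,

$$\prod_{p \in P} p \;\nmid \prod_{(m, m') \in A} \bigl(W(m) - W(m')\bigr).$$

This can be ensured by setting $\prod_{p \in P} p > \prod_{(m, m') \in A} |W(m) - W(m')|$. There are $|A|$ such monomial pairs and each $|W(m) - W(m')| < (\delta + 1)^n$. Also, $\prod_{p \in P} p > 2^N$. Hence, $N = |A| \, n \log(\delta + 1)$ suffices.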
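The search over primes in this proof can be sketched directly. The code below is an illustrative assumption rather than the exact procedure of the lemma: variables are 0-indexed (so the naïve map is $(\delta+1)^{i}$), primes are generated by trial division, and the loop stops at the first prime whose reduced weights give every pair in `pairs` distinct integer weights. The names `small_primes` and `separating_weights` are hypothetical.

```python
from itertools import count

def small_primes():
    """Yield 2, 3, 5, ... by trial division (adequate for a sketch)."""
    found = []
    for candidate in count(2):
        if all(candidate % p for p in found):
            found.append(candidate)
            yield candidate

def separating_weights(pairs, num_vars, delta):
    """Find a small prime p such that the weights (delta+1)**i mod p separate all pairs.

    pairs: list of (m1, m2), each monomial an exponent tuple of length num_vars
    with entries at most delta. Returns (p, per-variable weights).
    """
    if any(m1 == m2 for m1, m2 in pairs):
        raise ValueError("cannot separate a monomial from itself")
    for p in small_primes():
        weights = [pow(delta + 1, i, p) for i in range(num_vars)]
        weight_of = lambda m: sum(e * w for e, w in zip(m, weights))
        if all(weight_of(m1) != weight_of(m2) for m1, m2 in pairs):
            return p, weights

# Three pairs of monomials in 3 variables with individual degree at most 2.
pairs = [((1, 0, 0), (0, 1, 0)), ((2, 0, 0), (0, 0, 1)), ((1, 1, 0), (0, 0, 1))]
p, weights = separating_weights(pairs, num_vars=3, delta=2)
```

The loop always terminates: once $p$ exceeds $(\delta+1)^{n}$ the reduced weights coincide with the naïve Kronecker weights, which separate all distinct monomials. The lemma's point is that a much smaller prime already works when only the pairs in $A$ need to be separated.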

## 3 Hitting set for ROABP: Theorem 1

Like [ASS13] and [FSS14], we work with the vector polynomial. I.e. for a polynomial $C(\mathbf{x})$ computed by a width-$w$ ROABP, $C(\mathbf{x}) = S^{\top} \left(\prod_{i=1}^{d} D_i(x_{\pi(i)})\right) T$, we see the product $D := \prod_{i=1}^{d} D_i$ as a polynomial over the matrix algebra $\mathbb{F}^{w \times w}$. We can write the polynomial $C$ as the dot product $R \cdot D$, where $R = S T^{\top}$. The vector space spanned by the coefficients of $D$ is called the coefficient space of $D$. This space will have dimension at most $w^2$. We essentially try to construct a small set of vectors, by evaluating $D$, which can span the coefficient space of $D$. Clearly, if $C \neq 0$ then the dot product of $R$ with at least one of these spanning vectors will be nonzero. And thus, we get a hitting set.

Unlike [ASS13] and [FSS14], we directly work with the original polynomial $D$, instead of shifting it and breaking it into subcircuits. Our approach for finding the hitting set is to come up with a weight function on the variables which can isolate a basis for the coefficients of the polynomial $D$. This can be seen as a generalization of isolating a monomial for a polynomial in $\mathbb{F}[\mathbf{x}]$, which is a usual technique for PIT (e.g. sparse PIT [KS01]).

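We present our results for polynomials over an arbitrary algebra. Let $\mathbb{A}_k$ be a $k$-dimensional algebra over the field $\mathbb{F}$. Let $\mathbf{x}$ be a set of $n$ variables and let $D(\mathbf{x})$ be a polynomial in $\mathbb{A}_k[\mathbf{x}]$ with highest individual degree $\delta$. Let $M$ denote the set of all monomials over the variable set $\mathbf{x}$ with highest individual degree $\delta$.

Now, we will define a basis isolating weight assignment for a polynomial $D \in \mathbb{A}_k[\mathbf{x}]$ which would lead to a hitting set for the polynomial $C \in \mathbb{F}[\mathbf{x}]$, where $C = R \cdot D$, for some $R \in \mathbb{A}_k$.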

{definition}

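[Basis Isolating Weight Assignment] A weight function $w\colon \mathbf{x} \to \mathbb{N}$ is called a basis isolating weight assignment for a polynomial $D(\mathbf{x}) \in \mathbb{A}_k[\mathbf{x}]$ if there exists a set of monomials $S \subseteq M$ ($|S| \le k$) whose coefficients form a basis for the coefficient space of $D(\mathbf{x})$, such that

• for any $m, m' \in S$ with $m \neq m'$, $w(m) \neq w(m')$, and

• for any monomial $m \in M \setminus S$,

$$\mathrm{coef}_D(m) \in \operatorname{span}\{\mathrm{coef}_D(m') \mid m' \in S,\ w(m') < w(m)\}.$$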

The above definition is equivalent to saying that there exists a unique minimum weight basis (according to the weight function $w$) among the coefficients of $D$, and also the basis monomials have distinct weights. We skip the easy proof for this equivalence, as we will not need it. Note that a weight assignment which gives distinct weights to all the monomials is indeed a basis isolating weight assignment. But, it will involve exponentially large weights. To find an efficient weight assignment one must use some properties of the given circuit. First, we show how such a weight assignment would lead to a hitting set. We will actually show that it isolates a monomial in $C$.
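For small instances the definition can be checked mechanically. The sketch below works over the rationals and assumes the polynomial is given as a dictionary from exponent tuples to coefficient vectors; the names `reduce_against` and `is_basis_isolating` are hypothetical. It processes monomials weight class by weight class: a coefficient that is not in the span of strictly lighter coefficients must belong to the basis, and two such monomials of equal weight violate the distinct-weights condition.

```python
from fractions import Fraction
from itertools import groupby

def reduce_against(vector, echelon):
    """Reduce `vector` by the rows in `echelon` (each row has a 1 at its pivot index)."""
    vector = [Fraction(x) for x in vector]
    for pivot, row in echelon:
        if vector[pivot] != 0:
            factor = vector[pivot]
            vector = [a - factor * b for a, b in zip(vector, row)]
    return vector

def is_basis_isolating(coeffs, var_weights):
    """coeffs: {exponent_tuple: coefficient vector}; var_weights: weight of each variable."""
    weight = lambda m: sum(e * w for e, w in zip(m, var_weights))
    echelon = []  # spans the coefficients of all strictly lighter monomials
    for _, group in groupby(sorted(coeffs, key=weight), key=weight):
        independent = []
        for m in group:
            residual = reduce_against(coeffs[m], echelon)
            if any(residual):
                independent.append(residual)
        if len(independent) > 1:
            return False  # two basis monomials would share a weight
        for residual in independent:
            pivot = next(i for i, x in enumerate(residual) if x != 0)
            echelon.append((pivot, [x / residual[pivot] for x in residual]))
    return True

# D = (1,0) + (0,1) x0 + (0,1) x1 + (1,1) x0 x1, with coefficients in Q^2.
coeffs = {(0, 0): [1, 0], (1, 0): [0, 1], (0, 1): [0, 1], (1, 1): [1, 1]}
```

With weights `[1, 1]` the monomials $x_0$ and $x_1$ tie and neither coefficient lies in the span of the lighter coefficient $(1,0)$, so the assignment is not basis isolating; with weights `[1, 2]` it is.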

{lemma}

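Let $w$ be a basis isolating weight assignment for a polynomial $D(\mathbf{x}) \in \mathbb{A}_k[\mathbf{x}]$. And let $C = R \cdot D$ be a nonzero polynomial, for some $R \in \mathbb{A}_k$. Then, after the substitution $x_i = t^{w(x_i)}$ for all $i \in [n]$, the polynomial $C$ remains nonzero, where $t$ is an indeterminate. {proof} Let $D_m$ denote the coefficient $\mathrm{coef}_D(m)$. It is easy to see that after the mentioned substitution, the new polynomial $C'(t)$ is equal to $\sum_{m \in M} (R \cdot D_m) \, t^{w(m)}$.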

Let us say that $S \subseteq M$ is the set of monomials whose coefficients form the isolated basis for $D$. According to the definition of the basis isolating weight assignment, for any monomial $m \in M \setminus S$,

$$D_m \in \operatorname{span}\{D_{m'} \mid m' \in S,\ w(m') < w(m)\}. \tag{1}$$

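First, we claim that $\exists m' \in S$ such that $R \cdot D_{m'} \neq 0$. For the sake of contradiction, let us assume that $\forall m' \in S$, $R \cdot D_{m'} = 0$. Taking the dot product with $R$ on both the sides of Equation (1), we get that for any monomial $m \in M \setminus S$,

$$R \cdot D_m \in \operatorname{span}\{R \cdot D_{m'} \mid m' \in S,\ w(m') < w(m)\}.$$

Hence, $R \cdot D_m = 0$, $\forall m \in M$. That means $C(\mathbf{x}) = 0$, which contradicts our assumption.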

Now, let $m^*$ be the minimum weight monomial in $S$ whose coefficient gives a nonzero dot product with $R$, i.e. $m^* = \arg\min_{m \in S} \{w(m) \mid R \cdot D_m \neq 0\}$. There is a unique such monomial in $S$ because all the monomials in $S$ have distinct weights.

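We claim that $\mathrm{coef}_{C'}(t^{w(m^*)}) \neq 0$ and hence $C'(t) \neq 0$. To see this, consider any monomial $m$, other than $m^*$, with $w(m) = w(m^*)$. The monomial $m$ has to be in the set $M \setminus S$, as the monomials in $S$ have distinct weights. From Equation (1),

$$D_m \in \operatorname{span}\{D_{m'} \mid m' \in S,\ w(m') < w(m) = w(m^*)\}.$$

Taking dot product with $R$ on both the sides we get,

$$R \cdot D_m \in \operatorname{span}\{R \cdot D_{m'} \mid m' \in S,\ w(m') < w(m^*)\}.$$

But, by the choice of $m^*$, $R \cdot D_{m'} = 0$, for any $m' \in S$ with $w(m') < w(m^*)$. Hence, $R \cdot D_m = 0$, for any $m \neq m^*$ with $w(m) = w(m^*)$.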

So, the coefficient $\mathrm{coef}_{C'}(t^{w(m^*)})$ can be written as

$$\sum_{\substack{m \in M \\ w(m) = w(m^*)}} R \cdot D_m = R \cdot D_{m^*},$$

which, we know, is nonzero.

To construct a hitting set for $C'(t)$, we can try many possible field values of $t$. The number of such values needed is one more than the degree of $C'(t)$ after the substitution, and this degree is at most $n \delta \max_i w(x_i)$. Hence, the cost of the hitting set is dominated by the cost of the weight function, i.e. the maximum weight given to any variable and the time taken to construct the weight function.
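The substitution $x_i = t^{w(x_i)}$ used in this lemma is easy to experiment with. The sketch below assumes a polynomial over the integers stored as a dictionary from exponent tuples to coefficients; the name `substitute_weights` is hypothetical. It shows why ties in weight are dangerous: two monomials of equal weight can cancel after the substitution.

```python
from collections import defaultdict

def substitute_weights(poly, var_weights):
    """Apply x_i -> t**var_weights[i] to a polynomial {exponent_tuple: coeff}.

    Returns the resulting univariate polynomial as {power_of_t: coeff},
    with zero terms dropped.
    """
    result = defaultdict(int)
    for exponents, coeff in poly.items():
        power = sum(e * w for e, w in zip(exponents, var_weights))
        result[power] += coeff
    return {power: c for power, c in result.items() if c != 0}

# C = x0 - x1 is nonzero, but both monomials get weight 1 under w = (1, 1).
difference = {(1, 0): 1, (0, 1): -1}
collapsed = substitute_weights(difference, [1, 1])   # the two terms cancel
separated = substitute_weights(difference, [1, 2])   # t - t^2 survives
```

Once the univariate image is nonzero, evaluating it at more points than its degree gives the hitting set, which is why the maximum weight controls the cost.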

In the next step, we show that such a basis isolating weight assignment can indeed be found for a sparse-factor ROABP, but with cost quasi-polynomial in the input size. First, we observe that it suffices for the coefficients of the monomials not in $S$ to depend linearly on any coefficients with strictly smaller weight, not necessarily coming from $S$.

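###### Observation

If, for a polynomial $D \in \mathbb{A}_k[\mathbf{x}]$, there exists a weight function $w$ and a set of monomials $S \subseteq M$ ($|S| \le k$) such that for any monomial $m \in M \setminus S$,

$$\mathrm{coef}_D(m) \in \operatorname{span}\{\mathrm{coef}_D(m') \mid m' \in M,\ w(m') < w(m)\},$$

then we can also conclude that for any monomial $m \in M \setminus S$,

$$\mathrm{coef}_D(m) \in \operatorname{span}\{\mathrm{coef}_D(m') \mid m' \in S,\ w(m') < w(m)\}.$$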
{proof}

We are given that for any monomial $m \in M \setminus S$,

$$\mathrm{coef}_D(m) \in \operatorname{span}\{\mathrm{coef}_D(m') \mid m' \in M,\ w(m') < w(m)\}.$$

Any coefficient on the right hand side of this equation, which corresponds to an index in $M \setminus S$, can be replaced with some other coefficients, which have even smaller weight. If we keep doing this, we will be left with the coefficients only corresponding to the set $S$, because in each step we are getting smaller and smaller weight coefficients.

In our construction of the weight function, we will create the set $M \setminus S$ incrementally, i.e. in each step we will make more coefficients depend on strictly smaller weight coefficients. Finally, we will be left with only $k'$ (the rank of the coefficient space of $D$) many coefficients in $S$. We present the result for an arbitrary $k$-dimensional algebra $\mathbb{A}_k$, instead of just the matrix algebra.

{lemma}

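[Weight Construction] Let $\mathbf{x}$ be given by a union of $d$ disjoint sets of variables $\mathbf{x}_1 \sqcup \mathbf{x}_2 \sqcup \cdots \sqcup \mathbf{x}_d$, with $|\mathbf{x}| = n$. Let $D(\mathbf{x}) = P_1(\mathbf{x}_1) P_2(\mathbf{x}_2) \cdots P_d(\mathbf{x}_d)$, where $P_i \in \mathbb{A}_k[\mathbf{x}_i]$ is a sparsity-$s$, individual degree-$\delta$ polynomial, for all $i \in [d]$. Then, we can construct a basis isolating weight assignment for $D(\mathbf{x})$ with the cost being $(\mathrm{poly}(k, s, n, \delta))^{\log d}$. {proof} In our construction, the final weight function $w$ will be a combination of $(\log d + 1)$-many different weight functions, say $(w_0, w_1, \dots, w_{\log d})$. Let us say, their precedence is decreasing from left to right, i.e. $w_0$ has the highest precedence and $w_{\log d}$ has the lowest precedence. As mentioned earlier, we will build the set $M \setminus S$ (the set of monomials whose coefficients are in the span of strictly smaller weight coefficients than themselves) incrementally in $(\log d + 1)$ steps, using weight function $w_i$ in the $i$-th step.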

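Iteration $0$: Let $M_{0,1}, M_{0,2}, \dots, M_{0,d}$ be the sets of monomials and $C_{0,1}, C_{0,2}, \dots, C_{0,d}$ be the sets of coefficients in the polynomials $P_1, P_2, \dots, P_d$ respectively.

Notation. The product of two sets of monomials $M_1$ and $M_2$ is defined as $M_1 \times M_2 = \{m_1 m_2 \mid m_1 \in M_1, m_2 \in M_2\}$. The product of any two sets of coefficients $C_1$ and $C_2$ is defined as $C_1 \times C_2 = \{c_1 c_2 \mid c_1 \in C_1, c_2 \in C_2\}$.

The crucial property of the polynomial $D$ is that the set of coefficients in $D$, $C_0$, is just the product $C_{0,1} \times C_{0,2} \times \cdots \times C_{0,d}$. Similarly, the set of all the monomials in $D$, say $M_0$, can be viewed as the product $M_{0,1} \times M_{0,2} \times \cdots \times M_{0,d}$. Let $m \in M_0$ be a monomial, where $m = m_1 m_2 \cdots m_d$ and $m_i \in M_{0,i}$, for $i \in [d]$. Then $D_m$ will denote the coefficient $D_{m_1} D_{m_2} \cdots D_{m_d}$, where $D_{m_i}$ is the coefficient of $m_i$ in $P_i$.

Let us fix $w_0$ to be a weight function on the variables which gives distinct weights to all the monomials in $M_{0,i}$, for each $i \in [d]$. As $w_0$ assigns distinct weights to these monomials, so does the weight function $w$.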

For each $P_i$ we do the following:

• arrange the coefficients in $C_{0,i}$ in increasing order of their weight according to $w$ (or equivalently, according to $w_0$),

• choose a maximal set of linearly independent coefficients, in a greedy manner, going from lower weights to higher weights.

The fact that the weight functions $\{w_1, w_2, \dots, w_{\log d}\}$ are not defined yet does not matter because $w_0$ has the highest precedence. The total order given to the monomials in $M_{0,i}$ by $w_0$ is the same as given by $w$, irrespective of what the functions $\{w_1, w_2, \dots, w_{\log d}\}$ are chosen to be.

This gives us a basis for the coefficients of $P_i$, say $C'_{0,i}$. Let $M'_{0,i}$ denote the monomials in $P_i$ corresponding to these basis coefficients. From the construction of the basis, it follows that for any monomial $m \in M_{0,i} \setminus M'_{0,i}$,

$$D_m \in \operatorname{span}\{D_{m'} \mid m' \in M'_{0,i},\ w(m') < w(m)\}. \tag{2}$$

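Now, consider any monomial $m \in M_0$ which is not present in the set $M'_0 := M'_{0,1} \times M'_{0,2} \times \cdots \times M'_{0,d}$. Let $m = m_1 m_2 \cdots m_d$, where $m_i \in M_{0,i}$ for all $i \in [d]$. We know that for at least one $j \in [d]$, $m_j \in M_{0,j} \setminus M'_{0,j}$. Then using Equation (2) we can write the following about $D_m = D_{m_1} D_{m_2} \cdots D_{m_d}$,

$$D_m \in \operatorname{span}\{D_{m_1} \cdots D_{m_{j-1}} D_{m'_j} D_{m_{j+1}} \cdots D_{m_d} \mid m'_j \in M'_{0,j},\ w(m'_j) < w(m_j)\}.$$

This holds, because the algebra product is bilinear. Equivalently, for any monomial $m \in M_0 \setminus M'_0$,

$$D_m \in \operatorname{span}\{D_{m'} \mid m' \in M_0,\ w(m') < w(m)\}.$$

This is true because

$$w(m_1) + \cdots + w(m'_j) + \cdots + w(m_d) < w(m_1) + \cdots + w(m_j) + \cdots + w(m_d).$$

Hence, all the monomials in $M_0 \setminus M'_0$ can be put into $M \setminus S$, i.e. their corresponding coefficients depend on strictly smaller weight coefficients.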

Iteration : Now, let us consider monomials in the set . Let the corresponding set of coefficients be . Since, the underlying algebra has dimension at most and the coefficients in form a basis for , , for all . In the above product, let us make disjoint pairs of consecutive terms, and for each pair, multiply the two terms in it. Putting it formally, let us define to be the product and similarly , for all (if is odd, we can make it even by multiplying the identity element of in the end). Now, let , and , where . For any , has at most monomials.

Now, we fix the weight function such that it gives distinct weights to all the monomials in , for each . As separates these monomials, so does the weight function . Now, we repeat the same procedure of constructing a basis in a greedy manner for according to the weight function , for each . Let the basis coefficients for be and corresponding monomials be .

As argued before, any coefficient in , which is outside the set , is in the span of strictly smaller weight (than itself) coefficients. So, we can also put the corresponding monomials in where .

Iteration : We keep repeating the same procedure for -many rounds. After round , say the set of monomials we are left with is given by the product , where has at most monomials, for each and . In the above product, we make disjoint pairs of consecutive terms, and multiply the two terms in each pair. Let us say we get , where . Say, the corresponding set of coefficients is given by . Note that , for each .

We fix the weight function such that it gives distinct weights to all the monomials in the set , for each . We once again mention that fixing of does not affect the greedy basis constructed in earlier rounds and hence the monomials which were put in the set , because has less precedence than any , for .

For each , we construct a basis in a greedy manner going from lower weight to higher weight (according to the weight function ). Let this set of basis coefficients be and corresponding monomials be , for each . Let and . Arguing similarly as before, we can say that each coefficient in is in the span of strictly smaller weight coefficients (from ) than itself. Hence, the same can be said about any coefficient in the set . So, all the monomials in the set can be put into . Now, we are left with monomials for the next round.

Iteration : As in each round, the number of terms in the product gets halved, after rounds we will be left with just one term, i.e. . Now, we will fix the function which separates all the monomials in . By arguments similar to the above, we will be finally left with at most monomials in , which will all have distinct weights. It is clear that for every monomial in , its coefficient will be in the span of strictly smaller weight coefficients than itself.

Now, let us look at the cost of this weight function. In the first round, needs to separate at most many pairs of monomials. For each , needs to separate at most many pairs of monomials. From Lemma 2.3, to construct , for any , one needs to try -many weight functions each having highest weight at most (as is bounded by ). To get the correct combination of the weight functions we need to try all possible combinations of these polynomially many choices for each . Thus, we have to try many combinations.

To combine these weight functions we can choose a large enough number (greater than the highest weight a monomial can get in any of the weight functions), and define . The choice of ensures that the different weight functions cannot interfere with each other, and they also get the desired precedence order.
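One way to realize such a combination is a base- encoding, sketched here under our own assumptions: each weight function is represented as a list of non-negative integer weights (one per monomial), the bound `B` exceeds every individual weight, and functions fixed in earlier rounds take precedence over later ones. The helper name `combine_weights` is hypothetical.

```python
def combine_weights(weight_lists, B):
    """weight_lists[i][m] is the weight the i-th function gives monomial m.
    The combined weight orders monomials lexicographically, with earlier
    functions taking precedence (assumption made for this sketch)."""
    k = len(weight_lists)
    n = len(weight_lists[0])
    # B must exceed every individual weight so the "digits" cannot interfere.
    assert all(0 <= w < B for ws in weight_lists for w in ws)
    return [
        sum(ws[m] * B ** (k - 1 - i) for i, ws in enumerate(weight_lists))
        for m in range(n)
    ]


# Hypothetical example: two weight functions on three monomials.
w0 = [1, 1, 2]
w1 = [5, 3, 0]
combined = combine_weights([w0, w1], B=10)
print(combined)
```

Sorting the monomials by the combined weight gives the same order as sorting them by the tuple of individual weights, so a later function only breaks ties left by the earlier ones.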

The highest weight a monomial can get from the weight function would be . Thus, the cost of remains .

Combining Lemma 3 with Observation 3 and Lemma 3, we can get a hitting set for ROABP.

Theorem 1 (restated). Let be an -variate polynomial computed by a width-, -sparse-factor ROABP, with individual degree bound . Then there is a -time hitting set for .

Proof. As mentioned earlier, can be written as , for some , where . The underlying matrix algebra has dimension . The hitting set size will be dominated by the cost of the weight function constructed in Lemma 3. As the parameter in Lemma 3, i.e. the number of layers in the ROABP, is bounded by , the hitting set size will be .

## 4 Sum of constantly many set-multilinear circuits: Theorem 1

To find a hitting set for a sum of constantly many set-multilinear circuits, we build some tools. The first is depth-3 multilinear circuits with ‘small distance’. As it turns out, a multilinear polynomial computed by a depth- -distance circuit (top fan-in ) can also be computed by a width- ROABP (Lemma 4.1.1). Thus, we get a -time hitting set for this class, from Theorem 1. Next, we use a general result about finding a hitting set for a class -base-sets-, if a hitting set is known for class (Lemma 4.2). A polynomial is in -base-sets-, if there exists a partition of the variables into base sets such that restricted to each base set (treat other variables as field constants), the polynomial is in class . Finally, we show that a sum of constantly many set-multilinear circuits falls into the class -base-sets--distance, for . Thus, we get Theorem 1.

### 4.1 Δ-distance circuits

Recall that each product gate in a depth- multilinear circuit induces a partition on the variables. Let these partitions be .

Definition (Distance for a partition sequence). Let be the partitions of the variables . Then if such that equals a union of some colors in .

In other words, in every partition , each color has a set of colors called ‘friendly neighborhood’, , consisting of at most colors, which is exactly partitioned in the ‘upper partitions’. We call , an upper partition relative to (and , a lower partition relative to ), if . For a color of a partition , let denote its friendly neighborhood. The friendly neighborhood of a variable in a partition is defined as , where is the color in the partition that contains the variable .

Definition (-distance circuits). A multilinear depth- circuit has -distance if its product gates can be ordered to correspond to a partition sequence with .

Every depth- multilinear circuit is thus an -distance circuit. A circuit with a partition sequence, where the partition is a refinement of the partition , exactly characterizes a -distance circuit. All depth- multilinear circuits have distance between and . Also observe that the circuits with -distance strictly subsume set-multilinear circuits. E.g. a circuit, whose product gates induce two different partitions and , has -distance but is not set-multilinear.
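As a concrete check of these definitions, the following sketch computes the distance of a partition sequence in a given order. It rests on our own assumptions: the friendly neighborhood of a color is taken to be the set of colors of the same partition that end up in the same connected component once all colors of that partition and of the upper partitions are merged, a single partition is given distance 1, and every color is non-empty. The union-find approach and the name `partition_distance` are illustrative choices.

```python
def partition_distance(partitions):
    """partitions: list of partitions of the same variable set, each given
    as a list of disjoint non-empty sets (colors). Returns the distance of
    the sequence in the given order."""
    variables = set().union(*partitions[0])
    parent = {v: v for v in variables}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    def merge_colors(partition):
        # Put all variables of each color into one component.
        for color in partition:
            members = list(color)
            for v in members[1:]:
                parent[find(v)] = find(members[0])

    merge_colors(partitions[0])
    distance = 1
    for partition in partitions[1:]:
        merge_colors(partition)
        # Count the colors of this partition inside each component.
        sizes = {}
        for color in partition:
            root = find(next(iter(color)))
            sizes[root] = sizes.get(root, 0) + 1
        distance = max(distance, max(sizes.values()))
    return distance


# Hypothetical partitions on the variables 1..4.
singletons = [{1}, {2}, {3}, {4}]
pairs = [{1, 2}, {3, 4}]
d_refine = partition_distance([singletons, pairs])
d_reverse = partition_distance([pairs, singletons])
print(d_refine, d_reverse)
```

In this example the order matters: placing the finer partition first gives a smaller distance than the reverse order, which is why the definition of a -distance circuit allows the product gates to be reordered.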

Friendly neighborhoods - To get a better picture, we ask: Given a color of a partition in a circuit , how do we find its friendly neighborhood ? Consider a graph which has the colors of the partitions , as its vertices. For all