Characterizing Propositional Proofs
as Non-Commutative Formulas
(An extended abstract of this work, entitled “Non-commutative Formulas and Frege Lower Bounds: a New Characterization of Propositional Proofs”, appeared in the Proceedings of the 30th Annual Computational Complexity Conference (CCC), June 17-19, 2015.)
Does every Boolean tautology have a short propositional-calculus proof? Here, a propositional-calculus (i.e., Frege) proof is any proof starting from a set of axioms and deriving new Boolean formulas using a fixed set of sound derivation rules. Establishing any super-polynomial size lower bound on Frege proofs (in terms of the size of the formula proved) is a major open problem in proof complexity, and among a handful of fundamental hardness questions in complexity theory by and large. Non-commutative arithmetic formulas, on the other hand, constitute a quite weak computational model, for which exponential-size lower bounds were shown already back in 1991 by Nisan [STOC 1991], using a particularly transparent argument.
In this work we show that Frege lower bounds in fact follow from corresponding size lower bounds on non-commutative formulas computing certain polynomials (and that such lower bounds on non-commutative formulas must exist, unless NP = coNP). More precisely, we demonstrate a natural association between tautologies $T$ and non-commutative polynomials $p$, such that:
if $T$ has a polynomial-size Frege proof then $p$ has a polynomial-size non-commutative arithmetic formula; and conversely, when $T$ is a DNF, if $p$ has a polynomial-size non-commutative arithmetic formula over $GF(2)$ then $T$ has a Frege proof of quasi-polynomial size.
The argument is a characterization of Frege proofs as non-commutative formulas: we show that the Frege system is (quasi-) polynomially equivalent to a non-commutative Ideal Proof System (IPS), following the recent work of Grochow and Pitassi [FOCS 2014] that introduced a propositional proof system in which proofs are arithmetic circuits, and the work in [Tza11] that considered adding the commutator as an axiom in algebraic propositional proof systems. This also gives a characterization of propositional Frege proofs in terms of (non-commutative) arithmetic formulas that is tighter than (the formula version of IPS) in Grochow and Pitassi [FOCS 2014].
1.1 Propositional proof complexity
The field of propositional proof complexity aims to understand and analyze the computational resources required to prove propositional statements. The problems the field poses are fundamental, difficult, and of central importance to computer science and complexity theory as demonstrated by the seminal work of Cook and Reckhow [CR79], who showed the immediate relevance of these problems to the NP vs. coNP problem (and thus to the P vs. NP problem).
Among the major unsolved questions in propositional proof complexity is whether the standard propositional logic calculus, either in the form of the Sequent Calculus, or equivalently, in the axiomatic form of Hilbert style proofs (i.e., Frege proofs), is polynomially bounded; that is, whether every propositional tautology (namely, a formula that is satisfied by every assignment) has a proof whose size is polynomially bounded in the size of the formula proved (alternatively and equivalently, we can think of unsatisfiable formulas and their refutations). Here, we consider the size of proofs as the number of symbols it takes to write them down, where each formula in the proof is written as a Boolean formula (in other words we count the total number of logical gates appearing in the proof).
It has been known since Reckhow's work [Rec76] that all Frege proof systems (as well as the Gentzen sequent calculus with the cut rule [Gen35]) are polynomially equivalent to each other, and hence it does not matter precisely which rules, axioms, and logical connectives we use in the system. (Formally, a Frege proof system is any propositional proof system with a fixed number of axiom schemes and sound derivation rules that is also implicationally complete, and in which proof-lines are written as propositional formulas; see Definition 2.4.) Nevertheless, for concreteness, the reader can think of the Frege proof system as the following simple one (known as Shoenfield's system), consisting of only three axiom schemes (where $A \to B$ is an abbreviation of $\neg A \lor B$; and $A$, $B$, $C$ are any propositional formulas):
$A \to (B \to A)$
$(\neg A \to \neg B) \to ((\neg A \to B) \to A)$
$(A \to (B \to C)) \to ((A \to B) \to (A \to C))$
and a single inference rule (known as modus ponens): from $A$ and $A \to B$, derive $B$.
Complexity-wise, Frege is considered a very strong proof system, albeit a poorly understood one. The qualification strong here has several meanings: first, that no super-polynomial lower bound is known for Frege proofs. Second, that there are not even good hard candidates for the Frege system (see [BBP95, Raz15, Kra11, LT13] for further discussions on hard proof complexity candidates). Third, that for most hard instances (e.g., the pigeonhole principle and Tseitin tautologies) that are known to be hard for weaker systems (e.g., resolution, cutting planes, etc.), there are known polynomial bounds on Frege proofs. Fourth, that proving super-polynomial lower bounds on Frege proofs seems to a certain extent out of reach of current techniques (and is believed by some to be even harder than proving explicit circuit lower bounds [Raz15]). And finally, that by the common (mainly informal) correspondence between circuits and proofs (namely, the correspondence between a circuit class $\mathcal{C}$ and a proof system in which every proof-line is written as a circuit from $\mathcal{C}$), the Frege system corresponds to the circuit class of polynomial-size $O(\log n)$-depth circuits denoted $\mathsf{NC}^1$ (equivalently, of polynomial-size formulas [Spi71]), considered to be a strong computational model for which no (explicit) super-polynomial lower bounds are currently known. (To be more precise, one has to associate a circuit class $\mathcal{C}$ with a proof system in which a family of proofs is written such that every proof-line in the family is a circuit family from $\mathcal{C}$.)
Accordingly, proving lower bounds on Frege proofs is considered an extremely hard task. In fact, the best lower bound known today is only quadratic, which uses a fairly simple syntactic argument [Kra95]. If we put further impeding restrictions on Frege proofs, like restricting the depth of each formula appearing in a proof to a certain fixed constant, exponential lower bounds can be obtained [Ajt88, KPW95, PBI93]. Although these constant-depth Frege exponential-size lower bounds go back to Ajtai’s result from 1988, they are still in some sense the state-of-the-art in proof complexity lower bounds (beyond the important developments on weaker proof systems, such as resolution and its comparatively weak extensions). Constant-depth Frege lower bounds use quite involved probabilistic arguments, mainly specialized switching lemmas tailored for specific tautologies (namely, counting tautologies, the most notable of which are the Pigeonhole Principle tautologies). Even random CNF formulas near the satisfiability threshold are not known to be hard for constant-depth Frege (let alone hard for [unrestricted depth] Frege).
All of the above goes to emphasize the importance, basic nature and difficulty in understanding the complexity of strong propositional proof systems, while showing how little is actually known about these systems.
1.2 Prominent directions for understanding propositional proofs
As we already mentioned, there is a guiding line in proof complexity which states a correspondence between the complexity of circuits and the complexity of proofs. This correspondence is mainly informal, but there are seemingly good indications showing it might be more than a superficial analogy. One of the most compelling pieces of evidence for this correspondence is that there is a formal correspondence (cf. [CN10] for a clean formulation of this) between the first-order logical theories of bounded arithmetic (whose axioms state the existence of sets taken from a given complexity class $\mathcal{C}$) and propositional proof systems (in which proof-lines are circuits from $\mathcal{C}$).
Another aspect of the informal correspondence between circuit complexity and proof complexity is that circuit hardness sometimes can be used to obtain proof complexity hardness. The most notable examples of this are the lower bounds on constant-depth Frege proofs mentioned above: constant-depth Frege proofs can be viewed as propositional calculus operating with $\mathsf{AC}^0$ circuits, and the known lower bounds on constant-depth Frege proofs (cf. [Ajt88, KPW95, PBI93]) use techniques borrowed from $\mathsf{AC}^0$ circuit lower bounds. The success in moving from circuit hardness towards proof-complexity hardness has spurred a flow of attempts to obtain lower bounds on proof systems other than constant-depth Frege. For example, Pudlák [Pud99] and Atserias et al. [AGP02] studied proofs based on monotone circuits, motivated by known exponential lower bounds on monotone circuits [Raz85]. Raz and Tzameret [RT08b, RT08a, Tza08] investigated algebraic proof systems operating with multilinear formulas, motivated by lower bounds on multilinear formulas for the determinant, permanent and other explicit polynomials [Raz09, Raz06]. Atserias et al. [AKV04], Krajíček [Kra08] and Segerlind [Seg07] have considered proofs operating with ordered binary decision diagrams (OBDDs), and the second author [Tza11] initiated the study of proofs operating with non-commutative formulas (see Sec. 1.4 for a comparison with the current work). (We do not discuss here the important thread of results whose aim is to establish conditional lower bounds based on Nisan-Wigderson generators. This direction was developed in e.g. [ABSRW04, Raz15, Kra04, Kra10].)
Until quite recently it was unknown whether the correspondence between proofs and circuits is two-sided, namely, whether proof complexity hardness (of concrete known proof systems) can imply any computational hardness. An initial example of such an implication from proof hardness to circuit hardness was given by Raz and Tzameret [RT08b]. They showed that a separation between algebraic proof systems operating with arithmetic circuits and multilinear arithmetic circuits, resp., for an explicit family of polynomials, implies a separation between arithmetic circuits and multilinear arithmetic circuits.
In a recent significant development about the complexity of strong proof systems, Grochow and Pitassi [GP14] demonstrated a much stronger correspondence. They introduced a natural propositional proof system, called the Ideal Proof System (IPS for short), for which any super-polynomial size lower bound on IPS implies a corresponding size lower bound on arithmetic circuits (formally, that the permanent does not have polynomial-size arithmetic circuits). The IPS is defined as follows:
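Definition 1.1 (Ideal Proof System (IPS) [GP14]).
Let $F_1(\bar{x}) = \cdots = F_m(\bar{x}) = 0$ be a system of polynomial equations in the variables $x_1,\ldots,x_n$, where the polynomials $x_i^2 - x_i$, for all $1 \le i \le n$, are part of this system. An IPS refutation (or certificate) that the $F_i$'s have no common 0-1 solutions is a polynomial $\mathfrak{F}(\bar{x},\bar{y})$ in the variables $x_1,\ldots,x_n$ and $y_1,\ldots,y_m$, such that:
1. $\mathfrak{F}(x_1,\ldots,x_n,\bar{0}) = 0$; and
2. $\mathfrak{F}(x_1,\ldots,x_n,F_1(\bar{x}),\ldots,F_m(\bar{x})) = 1$.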
The essence of IPS is that a proof (or refutation) is a single polynomial that can be written simply as an arithmetic circuit or formula. The advantage of this formulation is that now we can obtain direct connections between circuit/formula hardness (i.e., “computational hardness”) and hardness of proofs. Grochow and Pitassi showed indeed that a super-polynomial lower bound on IPS written as an arithmetic circuit implies that the permanent does not have polynomial-size algebraic circuits (Valiant’s conjectured separation $\mathsf{VNP} \neq \mathsf{VP}$ [Val79]); and similarly, a super-polynomial lower bound on IPS written as an arithmetic formula implies that the permanent does not have polynomial-size algebraic formulas ($\mathsf{VNP} \neq \mathsf{VP}_e$, ibid).
Under certain assumptions, Grochow and Pitassi [GP14] were able to connect their result to standard propositional-calculus proof systems, i.e., Frege and Extended Frege. Their assumption was the following: Frege has polynomial-size proofs of the statement expressing that the PIT for arithmetic formulas is decidable by polynomial-size Boolean circuits (PIT for arithmetic formulas is the problem of deciding whether an input arithmetic formula computes the [formal] zero polynomial). They showed that, under this assumption, super-polynomial lower bounds on Frege proofs imply that the permanent does not have polynomial-size arithmetic circuits. (We focus only on the relevant results about Frege proofs from [GP14], and not the results about Extended Frege in [GP14]; the latter proof system operates, essentially, with Boolean circuits, in the same way that Frege operates with Boolean formulas, equivalently $\mathsf{NC}^1$ circuits.) This, in turn, can be considered as a (conditional) justification for the apparent long-standing difficulty of proving lower bounds on strong proof systems.
1.3 Overview of results and proofs
In this work we give a novel characterization of the propositional calculus, a fundamental and prominent object by itself, and by this contribute to the understanding of strong propositional proof systems, and to the fundamental search for lower bounds on these proofs. We formulate a very natural proof system, namely a non-commutative variant of the ideal proof system, which we show captures unconditionally (up to a quasi-polynomial-size increase, and in some cases only a polynomial increase) propositional Frege proofs. (We establish a slightly stronger characterization: the non-commutative IPS polynomially simulates Frege; and conversely, the complexity in which Frege simulates the non-commutative IPS depends on the degree of the non-commutative IPS refutation; e.g., the simulation is polynomial when refutations are of logarithmic degrees; see the note after Theorem 1.7.) A proof in the non-commutative IPS is simply a single non-commutative polynomial written as a non-commutative formula.
Our results thus give a compelling and simple new characterization of the proof complexity of propositional Frege proofs and bring new hope for achieving lower bounds on strong proof systems, by reducing the task of lower bounding Frege proofs to the following seemingly much more manageable task: proving matrix rank lower bounds on the matrices associated with certain non-commutative polynomials (in the sense of Nisan [Nis91]; see below for details).
The new characterization also tightens the recent results of Grochow and Pitassi [GP14] in the following sense:
The non-commutative IPS is polynomial-time checkable—whereas the original IPS was checkable in probabilistic polynomial-time; and
Frege proofs unconditionally quasi-polynomially simulate the non-commutative IPS—whereas Frege was shown to efficiently simulate IPS only assuming that the decidability of PIT for (commutative) arithmetic formulas by polynomial-size circuits is efficiently provable in Frege.
The tighter result shows that, at least for Frege, and in the framework of the ideal proof system, lower bounds on Frege proofs do not necessarily entail in themselves very strong computational lower bounds.
1.3.2 Some preliminaries: non-commutative polynomials and formulas
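A non-commutative polynomial over a given field $\mathbb{F}$ and with the variables $X := \{x_1, x_2, \ldots\}$ is a formal sum of monomials with coefficients from $\mathbb{F}$ such that the product of variables is non-commuting. For example, $x_1x_2 - x_2x_1 + x_3x_2x_3^2 - x_2x_3^3$, $x_1x_2 - x_2x_1$ and $0$ are three distinct polynomials in $\mathbb{F}\langle X\rangle$. The ring of non-commutative polynomials with variables $X$ and coefficients from $\mathbb{F}$ is denoted $\mathbb{F}\langle X\rangle$.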
A polynomial (i.e., a commutative polynomial) over a field is defined in the same way as a non-commutative polynomial except that the product of variables is commutative; in other words, it is a sum of (commutative) monomials.
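The distinction between the two kinds of polynomials can be made concrete with a minimal Python sketch. The representation (a dictionary from tuples of variable names to integer coefficients) and the helper names `add`, `mul` and `commutative_image` are illustrative assumptions, not notation from the paper.

```python
from collections import defaultdict

def add(p, q):
    """Sum of two polynomials given as {monomial-tuple: coefficient}."""
    out = defaultdict(int)
    for poly in (p, q):
        for mono, c in poly.items():
            out[mono] += c
    return {m: c for m, c in out.items() if c != 0}

def mul(p, q):
    """Non-commutative product: monomials are concatenated in order."""
    out = defaultdict(int)
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            out[m1 + m2] += c1 * c2
    return {m: c for m, c in out.items() if c != 0}

def commutative_image(p):
    """Forget the order of the variables inside each monomial."""
    out = defaultdict(int)
    for mono, c in p.items():
        out[tuple(sorted(mono))] += c
    return {m: c for m, c in out.items() if c != 0}

x1 = {("x1",): 1}
x2 = {("x2",): 1}
minus_one = {(): -1}

# x1*x2 - x2*x1 has two distinct monomials as a non-commutative polynomial,
# but collapses to the zero polynomial once the variables commute.
commutator = add(mul(x1, x2), mul(minus_one, mul(x2, x1)))
print(commutator)
print(commutative_image(commutator))
```

The empty dictionary stands for the zero polynomial, so the second printed value shows that the commutator vanishes only in the commutative setting.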
A non-commutative arithmetic formula (non-commutative formula for short) is a fan-in two labeled tree, with edges directed from leaves towards the root, such that the leaves are labeled with field elements (for a given field $\mathbb{F}$) or variables, and internal nodes (including the root) are labeled with plus or product gates. A product gate has an order on its two children (holding the order of non-commutative product). A non-commutative formula computes a non-commutative polynomial in the natural way (see Definition 2.5).
Exponential-size lower bounds on non-commutative formulas (over any field) were established by Nisan [Nis91]. The idea (in retrospect) is quite simple: first transform a non-commutative formula into an algebraic branching program (ABP; Definition 4.13); and then show that the number of nodes in the $i$th layer of an ABP computing a degree $d$ homogenous non-commutative polynomial $f$ is bounded from below by the rank of the degree $i$ partial-derivative matrix of $f$. (The degree $i$ partial-derivative matrix of $f$ is the matrix whose rows are all non-commutative monomials of degree $i$ and whose columns are all non-commutative monomials of degree $d-i$, such that the entry in row $M$ and column $N$ is the coefficient of the degree $d$ monomial $M \cdot N$ in $f$.) Thus, lower bounds on non-commutative formulas follow from quite immediate rank arguments (e.g., the partial derivative matrices associated with the permanent and determinant can easily be shown to have high ranks).
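The rank measure behind this argument can be sketched in a few lines of Python. This is only an illustration under stated assumptions: polynomials are dictionaries from monomial tuples to coefficients, the rank is taken over the rationals, and the function names `partial_derivative_matrix` and `rank` are hypothetical. Only prefixes and suffixes that actually occur are listed, since all-zero rows and columns do not change the rank.

```python
from fractions import Fraction

def partial_derivative_matrix(poly, k):
    """Rows: degree-k prefixes; columns: the remaining suffixes.
    Entry (M, N) is the coefficient of the monomial M*N in poly."""
    rows = sorted({m[:k] for m in poly})
    cols = sorted({m[k:] for m in poly})
    return [[Fraction(poly.get(r + c, 0)) for c in cols] for r in rows]

def rank(matrix):
    """Rank over the rationals by Gaussian elimination."""
    m = [row[:] for row in matrix]
    r = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

# x1*x2 - x2*x1: the degree-1 matrix is [[0, 1], [-1, 0]].
commutator = {("x1", "x2"): 1, ("x2", "x1"): -1}
# (x1 + x2)*(x1 + x2): the degree-1 matrix is the all-ones 2x2 matrix.
square = {("x1", "x1"): 1, ("x1", "x2"): 1, ("x2", "x1"): 1, ("x2", "x2"): 1}

print(rank(partial_derivative_matrix(commutator, 1)))
print(rank(partial_derivative_matrix(square, 1)))
```

The product of two linear forms has rank one, whereas the commutator has full rank two; in the setting described above, a high rank at some layer forces that layer of any ABP for the polynomial to be wide.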
1.3.3 Non-commutative ideal proof system
Recall the IPS refutation system from Definition 1.1 above. We use the idea introduced in [Tza11], which considered adding the commutator as an axiom in propositional algebraic proof systems, to define a refutation system that polynomially simulates Frege:
Definition 1.2 (Non-commutative IPS).
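Let $\mathbb{F}$ be a field. Assume that $F_1(\bar{x}) = \cdots = F_m(\bar{x}) = 0$ is a system of non-commutative polynomial equations from $\mathbb{F}\langle x_1,\ldots,x_n\rangle$, and suppose that the following set of equations (axioms) are included in the $F_i$'s:
- Boolean axioms:
$x_i \cdot (1 - x_i)$, for all $1 \le i \le n$;
- Commutator axioms:
$x_i \cdot x_j - x_j \cdot x_i$, for all $1 \le i < j \le n$.
Suppose that the $F_i$'s have no common 0-1 solutions. (One can check that the $F_i$'s have no common 0-1 solutions in $\mathbb{F}$ iff they do not have a common 0-1 solution in every $\mathbb{F}$-algebra.) A non-commutative IPS refutation (or certificate) that the system of $F_i$'s is unsatisfiable is a non-commutative polynomial $\mathfrak{F}(\bar{x},\bar{y})$ in the variables $x_1,\ldots,x_n$ and $y_1,\ldots,y_m$ (i.e., $\mathfrak{F} \in \mathbb{F}\langle \bar{x},\bar{y}\rangle$), such that:
1. $\mathfrak{F}(x_1,\ldots,x_n,\bar{0}) = 0$; and
2. $\mathfrak{F}(x_1,\ldots,x_n,F_1(\bar{x}),\ldots,F_m(\bar{x})) = 1$.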
We always assume that the non-commutative IPS refutation is written as a non-commutative formula. Hence the size of a non-commutative IPS refutation is the minimal size of a non-commutative formula computing the non-commutative IPS refutation.
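As a toy illustration of the definition, the following Python sketch checks a candidate certificate for a small unsatisfiable system. Everything concrete here is an assumption made for the example: the system ($x_1x_2 - 1$ and $x_2x_1$ together with the Boolean and commutator axioms), the placeholder variable names `y1` to `y5`, the candidate certificate $-y_1 + y_2 + y_5$, and the two conditions being checked (the certificate vanishes when the placeholders are set to zero, and equals 1 when they are replaced by the axioms, as in the IPS of [GP14]).

```python
from collections import defaultdict

def add(p, q):
    out = defaultdict(int)
    for poly in (p, q):
        for mono, c in poly.items():
            out[mono] += c
    return {m: c for m, c in out.items() if c != 0}

def mul(p, q):
    out = defaultdict(int)
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            out[m1 + m2] += c1 * c2  # concatenation keeps the variable order
    return {m: c for m, c in out.items() if c != 0}

def substitute(p, env):
    """Replace every variable v that occurs in env by the polynomial env[v]."""
    out = {}
    for mono, c in p.items():
        term = {(): c}
        for v in mono:
            term = mul(term, env.get(v, {(v,): 1}))
        out = add(out, term)
    return out

# Hypothetical unsatisfiable system over 0-1 values of x1, x2.
axioms = {
    "y1": {("x1", "x2"): 1, (): -1},            # x1*x2 - 1
    "y2": {("x2", "x1"): 1},                    # x2*x1
    "y3": {("x1",): 1, ("x1", "x1"): -1},       # Boolean axiom x1*(1 - x1)
    "y4": {("x2",): 1, ("x2", "x2"): -1},       # Boolean axiom x2*(1 - x2)
    "y5": {("x1", "x2"): 1, ("x2", "x1"): -1},  # commutator axiom
}

# Candidate certificate: -y1 + y2 + y5.
certificate = {("y1",): -1, ("y2",): 1, ("y5",): 1}

at_zero = substitute(certificate, {name: {} for name in axioms})
at_axioms = substitute(certificate, axioms)
print(at_zero)    # the zero polynomial is the empty dict
print(at_axioms)  # the constant polynomial 1
```

In this toy certificate the commutator placeholder `y5` is what cancels the monomials $x_1x_2$ and $x_2x_1$ against each other; both checks are formal identities between non-commutative polynomials, not evaluations at 0-1 points.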
Note: (i) It is important to note that identities 1 and 2 in Definition 1.2 are formal identities between non-commutative polynomials. It is possible to show that without the commutator axioms the system becomes incomplete, in the sense that there will be unsatisfiable systems of non-commutative polynomials (including the Boolean axioms) for which there are no non-commutative IPS refutations.
(ii) In order to prove in non-commutative IPS that a system of commutative polynomial equations (where each polynomial is expressed as an arithmetic formula) has no common roots, we write each polynomial as a non-commutative formula (in some way; note that there is no unique way to do this).
The main result of this paper is that the non-commutative IPS (over either $\mathbb{Q}$ or $\mathbb{Z}_q$, for any prime $q$) polynomially simulates Frege; and conversely, Frege quasi-polynomially simulates the non-commutative IPS (over $GF(2)$). We explain the results in what follows.
Non-commutative IPS simulates Frege
For the purpose of the next theorem we use a standard translation of propositional formulas into non-commutative arithmetic formulas:
Definition 1.3 ().
Let , for variables ; ; ; and by induction on the size of the propositional formula: ; and finally .
For a non-commutative formula denote by the non-commutative polynomial computed by . Thus, is a propositional tautology iff for every 0-1 assignment to the variables of the non-commutative polynomial.
Theorem 1.4 (First main theorem).
Let be either the rational numbers or , for a prime . The non-commutative IPS refutation system, when refutations are written as non-commutative formulas over , polynomially simulates the Frege system. More precisely, for every propositional tautology T, if T has a polynomial-size Frege proof then there is a non-commutative IPS certificate (over ) of that has a polynomial non-commutative formula size.
The fact that an arithmetic formula (or circuit) in the form of the IPS can simulate a propositional Frege proof was shown in [GP14]. The non-commutative IPS, on the other hand, is much more restrictive than the original (commutative) IPS: instead of using commutative polynomials (written as arithmetic formulas) we now use non-commutative polynomials (written as non-commutative arithmetic formulas). And as mentioned above, in order to maintain the completeness of the non-commutative IPS we must add the commutator axioms to the system. Thus, the question arises: how can we still polynomially simulate Frege in this restrictive framework? The answer to this, which also constitutes one of the main observations of the simulation, is that the commutator axioms are already used implicitly in propositional Frege proofs: every classical propositional calculus system has some (possibly implicit) structural rules that enable one to commute AND’s and OR’s (e.g., $A \land B$ is not the same formula as $B \land A$, from the perspective of the propositional calculus). In other words, Frege proofs operate with formulas as purely syntactic terms, and thus commutativity of AND and OR is not free for Frege proofs.
We now sketch in more detail the proof of Theorem 1.4. To simulate Frege proofs we use an intermediate proof system F-PC (standing for “formula polynomial calculus”) introduced by Grigoriev and Hirsch [GH03]. The F-PC proof system (Definition 2.8) can be thought of as a simple variant of the well-studied polynomial calculus (PC) system in which polynomials are written as arithmetic formulas (instead of sums of monomials as in PC).
Recall that a PC-refutation, as introduced by Clegg, Edmonds and Impagliazzo [CEI96], is simply a sequence of polynomials written as sums of monomials, where each polynomial is either taken from the initial unsatisfiable set of polynomials or was derived using two algebraic rules: from a pair of previously derived polynomials $f$ and $g$, derive $af + bg$ (for field elements $a, b$); and from a previously derived $f$, derive $x_i \cdot f$, for any variable $x_i$. The F-PC proof system makes the following two changes to PC (turning it into a provably much stronger system):
every polynomial in an F-PC proof is written as an arithmetic formula (instead of as a sum of monomials) and is treated as a purely syntactic object (like in Frege); and
we can derive new polynomials either by the two aforementioned PC rules, or by local rewriting rules operating on any subformula and expressing simple operations on polynomials (such as commutativity of addition and product, associativity, distributivity, etc.).
Grigoriev and Hirsch [GH03] showed that F-PC polynomially simulates Frege proofs, and that for tree-like Frege proofs the polynomial simulation yields tree-like F-PC proofs. Since tree-like Frege is polynomially equivalent to Frege (because Frege proofs can always be balanced to a depth that is logarithmic in their size; cf. [Kra95] for a proof), we get that tree-like F-PC polynomially simulates (dag-like) Frege proofs.
Therefore, to conclude Theorem 1.4 it suffices to prove that the non-commutative IPS polynomially simulates tree-like F-PC proofs. To do this, loosely speaking, we construct the non-commutative formula tree according to the structure of the tree-like F-PC proof, line by line.
Now, since we write refutations as non-commutative formulas we can use the polynomial-time deterministic Polynomial Identity Testing (PIT) algorithm for non-commutative formulas, devised by Raz and Shpilka [RS05], to check in deterministic polynomial-time the correctness of non-commutative IPS refutations:
The non-commutative IPS is a sound and complete refutation system in the sense of Cook-Reckhow [CR79]. That is, it is a sound and complete refutation system for unsatisfiable propositional formulas in which refutations can be checked for correctness in deterministic polynomial-time.
This should be contrasted with the original (commutative) IPS of [GP14], for which verification of refutations is done in probabilistic polynomial time using the standard Schwartz-Zippel [Sch80, Zip79] PIT algorithm.
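For contrast, the randomized test used for commutative formulas can be sketched as follows. This is a minimal illustration of the Schwartz-Zippel idea under stated assumptions, not the verifier of [GP14]: formulas are nested tuples, arithmetic is done modulo a fixed large prime `P` chosen arbitrarily for the sketch, and the names `evaluate` and `probably_zero` are hypothetical.

```python
import random

P = 2**61 - 1  # a large prime modulus (an arbitrary choice for this sketch)

def evaluate(formula, point):
    """Evaluate a formula tree at a point, with commuting values modulo P."""
    kind = formula[0]
    if kind == "var":
        return point[formula[1]] % P
    if kind == "const":
        return formula[1] % P
    left = evaluate(formula[1], point)
    right = evaluate(formula[2], point)
    return (left + right) % P if kind == "+" else (left * right) % P

def probably_zero(formula, variables, trials=20, rng=None):
    """One-sided test: a non-zero value proves the formula is not the zero
    polynomial; passing every trial only makes zeroness very likely."""
    rng = rng or random.Random()
    for _ in range(trials):
        point = {v: rng.randrange(P) for v in variables}
        if evaluate(formula, point) != 0:
            return False
    return True

x, y = ("var", "x"), ("var", "y")
# x*y - y*x is the zero polynomial when the variables commute.
commuting_zero = ("+", ("*", x, y), ("*", ("const", -1), ("*", y, x)))
# x*y + x is not the zero polynomial.
nonzero = ("+", ("*", x, y), x)

print(probably_zero(commuting_zero, ["x", "y"], rng=random.Random(0)))
print(probably_zero(nonzero, ["x", "y"], rng=random.Random(0)))
```

The test only ever evaluates the formula, so it cannot distinguish $xy$ from $yx$; this is one way to see why the deterministic check for non-commutative formulas needs a different algorithm, such as the one of [RS05] mentioned above.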
The major consequence of Theorem 1.4 is that to prove a super-polynomial Frege lower bound it suffices to prove a super-polynomial lower bound on non-commutative formulas computing certain polynomials. Specifically, it is enough to prove that any non-commutative IPS certificate (which is simply a non-commutative polynomial) has a super-polynomial non-commutative formula size; in other words, it suffices to show that any such certificate must have a super-polynomial total rank according to the associated partial-derivative matrices in the sense of Nisan [Nis91], as discussed before.
Frege simulates non-commutative IPS
We shall prove that Frege simulates the non-commutative IPS for CNFs (this is the case considered in [GP14]), over $GF(2)$ and with only a quasi-polynomial increase in size (and for some specific cases the simulation can become polynomial).
It will be convenient to use a translation of clauses to non-commutative formulas which is slightly different than Definition 1.3:
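Definition 1.6 ($\mathrm{tr}'(f)$ and $Q_i^{\phi}$).
Given a Boolean formula $f$ we define its non-commutative formula translation $\mathrm{tr}'(f)$ as follows. Let $\mathrm{tr}'(x) := 1 - x$ and $\mathrm{tr}'(\neg x) := x$, for $x$ a variable. Let $\mathrm{tr}'(\mathrm{false}) := 1$; $\mathrm{tr}'(\mathrm{true}) := 0$; and $\mathrm{tr}'(f_1 \lor \cdots \lor f_r) := \mathrm{tr}'(f_1) \cdots \mathrm{tr}'(f_r)$ (where the sequence of products stands for a (balanced) fan-in two tree of product gates with $\mathrm{tr}'(f_1), \ldots, \mathrm{tr}'(f_r)$ on the leaves). Further, for a CNF $\phi = \kappa_1 \land \cdots \land \kappa_m$, denote by $Q_i^{\phi}$ the non-commutative formula translation $\mathrm{tr}'(\kappa_i)$ of the clause $\kappa_i$.
Note that this way, the system of equations $Q_1^{\phi} = 0, \ldots, Q_m^{\phi} = 0$ is unsatisfiable iff $\phi$ is unsatisfiable.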
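The equivalence in this note can be checked by brute force on small examples. The sketch below assumes one natural reading of the translation, in which a positive literal $x$ becomes $1 - x$, a negative literal $\neg x$ becomes $x$, and a clause becomes the product of its literals, so that a clause is satisfied exactly when its polynomial evaluates to 0. The data layout (a clause as a list of `(variable, is_positive)` pairs) and all function names are illustrative assumptions.

```python
from itertools import product

def clause_poly_value(clause, assignment):
    """Value of the assumed clause translation at a 0-1 assignment:
    the product of (1 - x) for positive literals and x for negative ones."""
    value = 1
    for var, positive in clause:
        x = assignment[var]
        value *= (1 - x) if positive else x
    return value

def clause_truth(clause, assignment):
    """Ordinary Boolean truth value of the clause."""
    return any(assignment[v] == (1 if pos else 0) for v, pos in clause)

def assignments(variables):
    for bits in product((0, 1), repeat=len(variables)):
        yield dict(zip(variables, bits))

def has_common_zero(cnf, variables):
    """Is there a 0-1 point where every clause polynomial vanishes?"""
    return any(all(clause_poly_value(c, a) == 0 for c in cnf)
               for a in assignments(variables))

def satisfiable(cnf, variables):
    return any(all(clause_truth(c, a) for c in cnf)
               for a in assignments(variables))

# (x or y) and (not x) and (not y): unsatisfiable.
unsat_cnf = [[("x", True), ("y", True)], [("x", False)], [("y", False)]]
# (x or y) and (not x): satisfied by x = 0, y = 1.
sat_cnf = [[("x", True), ("y", True)], [("x", False)]]

for cnf in (unsat_cnf, sat_cnf):
    print(has_common_zero(cnf, ["x", "y"]), satisfiable(cnf, ["x", "y"]))
```

On 0-1 inputs the order of the factors in a clause product does not affect its value, which is why the sketch can use ordinary integer multiplication even though the translated objects are non-commutative formulas.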
Theorem 1.7 (Second main theorem).
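Let $\phi$ be a CNF and let $Q_1^{\phi}, \ldots, Q_m^{\phi}$ denote the corresponding non-commutative formulas for the clauses of $\phi$. If there is a non-commutative IPS refutation of size $s$ of $Q_1^{\phi}, \ldots, Q_m^{\phi}$ over $GF(2)$, then there is a Frege proof of size $s^{O(\log s)}$ of the tautology $\neg\phi$.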
Note: The proof of Theorem 1.7 achieves in fact a slightly stronger simulation than stated. That is, our simulation shows that if the degree of the non-commutative IPS refutation is and its formula depth is , then there is a Frege proof of with size . And in particular, Frege polynomially simulates non-commutative IPS refutations of degrees (for the number of variables in the CNF). However, for simplicity we shall always assume that the depth of the non-commutative IPS formula is logarithmic in its size (Lemma 4.2 shows that we can always balance non-commutative formulas), and so explicitly we only deal with the case where and .
The proof of Theorem 1.7 consists of several separate steps of independent interest. From the logical point of view, the argument is a short Frege proof of a reflection principle for the non-commutative IPS system. A reflection principle for a given proof system is a statement that says that if there exists a -proof of a formula then is also true. The argument becomes rather complicated because we need to prove properties of the PIT algorithm for non-commutative formulas devised by Raz and Shpilka [RS05] within the restrictive framework of propositional Frege proofs.
Our goal is then to prove $\neg\phi$ in Frege, given a non-commutative IPS refutation of the CNF $\phi$.
Step 1: balancing. We first balance the non-commutative IPS refutation, so that its depth is logarithmic in its size. We observe that the recent construction of Hrubeš and Wigderson [HW14] for balancing non-commutative formulas with division gates (incurring at most a polynomial increase in size) results in a division-free formula when the initial non-commutative formula is itself division-free. Therefore, we can assume that the non-commutative IPS certificate is already balanced (this step is independent of the Frege system).
Step 2: Booleanization. We then consider our balanced refutation, which is a non-commutative polynomial identity over $GF(2)$, as a Boolean tautology, by replacing plus gates with XORs and product gates with ANDs.
Step 3: reflection principle. We use a reflection principle to reduce the task of efficiently proving the negated CNF in Frege to the following task: show that any non-commutative formula identity over $GF(2)$, considered as a Boolean tautology, has a short Frege proof.
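The Booleanization step can be illustrated with a small Python sketch. The formula encoding (nested tuples with `"+"`, `"*"`, `"var"` and `"const"` nodes) and the example identity are assumptions made for illustration. The sketch evaluates a formula once arithmetically over $GF(2)$ and once as a Boolean formula with XOR and AND, and checks that a formula that is formally zero over $GF(2)$ is false under every assignment, so that its negation is a tautology.

```python
from itertools import product

def eval_gf2(formula, a):
    """Arithmetic evaluation modulo 2."""
    kind = formula[0]
    if kind == "var":
        return a[formula[1]]
    if kind == "const":
        return formula[1] % 2
    left, right = eval_gf2(formula[1], a), eval_gf2(formula[2], a)
    return (left + right) % 2 if kind == "+" else (left * right) % 2

def eval_bool(formula, a):
    """Boolean evaluation: plus gates become XOR, product gates become AND."""
    kind = formula[0]
    if kind == "var":
        return bool(a[formula[1]])
    if kind == "const":
        return bool(formula[1] % 2)
    left, right = eval_bool(formula[1], a), eval_bool(formula[2], a)
    return (left != right) if kind == "+" else (left and right)

def negation_is_tautology(formula, variables):
    """True iff the Booleanized formula is false under every assignment."""
    return all(not eval_bool(formula, dict(zip(variables, bits)))
               for bits in product((0, 1), repeat=len(variables)))

x, y, z = ("var", "x"), ("var", "y"), ("var", "z")
# (x + y)*z + (x*z + y*z) expands to 2*x*z + 2*y*z, which is formally zero
# over GF(2) even when the variables do not commute.
identity = ("+", ("*", ("+", x, y), z), ("+", ("*", x, z), ("*", y, z)))

print(negation_is_tautology(identity, ["x", "y", "z"]))
```

Checking all assignments takes exponential time in general; the point of the steps described here is that formal non-commutative identities admit short Frege proofs, which a brute-force truth table does not provide.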
Step 4: homogenization. This is the only step that is responsible for the quasi-polynomial size increase in Theorem 1.7. More precisely, this increase in size depends on the fact that for the purpose of establishing short Frege proofs for all non-commutative polynomial identities over $GF(2)$ (considered as Boolean tautological formulas) it is important that the formulas are written as a sum of homogenous non-commutative formulas.
Note that it is not known whether arithmetic formulas can be turned into a (sum of) homogenous formulas with only a polynomial increase in size (in contrast to the standard efficient homogenization of arithmetic circuits by Strassen [Str73] that does allow such a conversion). Nevertheless, Strassen’s standard procedure enables us to transform any polynomial-size arithmetic formula into a sum of homogenous formulas with only a quasi-polynomial increase in size: any formula of size $s$ computing a polynomial $f$ (and thus the degree of $f$ is also polynomial in $s$) can be transformed into a sum of homogenous formulas, each having size $s^{O(\log s)}$ and computing the corresponding homogenous part of $f$. (One can show that the same also holds for non-commutative formulas.)
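The recursion underlying this kind of homogenization can be sketched as follows: the degree-$k$ part of a sum is the sum of the degree-$k$ parts, and the degree-$k$ part of a product is the sum over $i$ of the degree-$i$ part of the left factor times the degree-$(k-i)$ part of the right factor. The Python sketch below applies this rule to non-commutative polynomials stored as dictionaries. It computes the homogeneous parts as polynomials rather than as formulas, so it illustrates the rule only and says nothing about formula size; the encoding and function names are assumptions.

```python
from collections import defaultdict

def add(p, q):
    out = defaultdict(int)
    for poly in (p, q):
        for mono, c in poly.items():
            out[mono] += c
    return {m: c for m, c in out.items() if c != 0}

def mul(p, q):
    out = defaultdict(int)
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            out[m1 + m2] += c1 * c2  # order of the factors is preserved
    return {m: c for m, c in out.items() if c != 0}

def homogeneous_parts(formula, max_degree):
    """Return [part_0, ..., part_max_degree], where part_k collects the
    monomials of degree exactly k of the polynomial computed by formula."""
    kind = formula[0]
    parts = [{} for _ in range(max_degree + 1)]
    if kind == "const":
        if formula[1] != 0:
            parts[0] = {(): formula[1]}
        return parts
    if kind == "var":
        if max_degree >= 1:
            parts[1] = {(formula[1],): 1}
        return parts
    left = homogeneous_parts(formula[1], max_degree)
    right = homogeneous_parts(formula[2], max_degree)
    if kind == "+":
        return [add(l, r) for l, r in zip(left, right)]
    for k in range(max_degree + 1):          # product gate
        for i in range(k + 1):
            parts[k] = add(parts[k], mul(left[i], right[k - i]))
    return parts

one, x, y = ("const", 1), ("var", "x"), ("var", "y")
formula = ("*", ("+", one, x), ("+", one, y))   # (1 + x)*(1 + y)
for degree, part in enumerate(homogeneous_parts(formula, 2)):
    print(degree, part)
```

In a formula-level version of this rule each product gate spawns up to $k+1$ pairs of sub-formulas per degree, which is where the size increase discussed above comes from.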
For the purpose of establishing a quasi-polynomial simulation of non-commutative IPS by Frege, it is sufficient to use the original Strassen’s homogenization procedure (as simulated inside Frege; cf. [HT12]). However, as the note after Theorem 1.7 indicates, we show a slightly stronger simulation result, using an efficient Frege simulation of a recent result due to Raz [Raz13] who showed how to transform an arithmetic formula into (a sum of) homogenous formulas in a manner which is more efficient than Strassen [Str73]. Specifically, in Lemma 4.8 we show that:
The same construction in [Raz13] also holds for non-commutative formulas;
This construction for non-commutative formulas can be carried out efficiently inside Frege. That is, if is a non-commutative formula of size and depth computing a homogenous non-commutative polynomial over of degree , then there exists a syntactic homogenous non-commutative formula computing the same polynomial and with size , such that Frege admits a proof of of size polynomial (in ).
Step 5: short proofs for homogenous non-commutative identities. Now that we have reduced our task to the task of showing that every non-commutative formula identity over $GF(2)$ (considered as a tautology) has a short Frege proof, and we have also agreed to first turn (inside Frege) our non-commutative identities into homogenous formulas (incurring up to a quasi-polynomial increase in formula size), it remains only to show how to efficiently prove homogenous non-commutative identities in Frege. (Formally, we shall in fact deal with syntactic homogenous formulas.)
To this end we essentially construct an efficient Frege proof of the correctness of the Raz and Shpilka PIT algorithm for non-commutative formulas [RS05]. This PIT algorithm uses some basic linear algebraic concepts that might be beyond the efficient-reasoning strength of Frege. However, since we only need to show the existence of short Frege proofs for the PIT algorithm’s correctness, we can supply witnesses for the desired linear algebraic objects needed in the proof (these witnesses will be a sequence of linear transformations).
A bigger obstacle is that it seems impossible to reason directly inside Frege about the algorithm of [RS05], since this algorithm first converts a non-commutative formula into an algebraic branching program (ABP); but the evaluation of ABPs (apparently) cannot be done with Boolean formulas (and accordingly Frege (apparently) cannot reason about the evaluation of ABPs). The reason for this apparent inability of Frege to reason efficiently about the evaluation of ABPs is that an ABP is a slightly more “sequential” object than a formula: an evaluation of an ABP with $d$ layers can be done by an iterated matrix multiplication of $d$ matrices, known to be doable with quasi-polynomial size formulas (or polynomial-size circuits of $O(\log^2 n)$ depth), while Frege is a system that operates with formulas. To overcome this obstacle we show how to perform Raz and Shpilka’s PIT algorithm directly on non-commutative formulas, without converting the formulas first into ABPs. This technical contribution takes quite a large part of the argument (Sec. 4.7).
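The “sequential” nature of ABP evaluation can be illustrated with a short Python sketch of iterated matrix multiplication. The layer matrices and function names are made up for the example. The sequential loop uses a chain of products whose length grows linearly with the number of layers, while the balanced recursion nests matrix products only to logarithmic depth; since each matrix product itself needs logarithmic depth to sum its terms, this is consistent with the depth and formula-size bounds quoted above.

```python
def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def sequential_product(mats):
    """Multiply layer by layer, as a straightforward ABP evaluation would."""
    result = mats[0]
    for m in mats[1:]:
        result = mat_mul(result, m)
    return result

def balanced_product(mats):
    """Divide and conquer; returns (product, nesting depth of mat_mul calls)."""
    if len(mats) == 1:
        return mats[0], 0
    mid = len(mats) // 2
    left, depth_left = balanced_product(mats[:mid])
    right, depth_right = balanced_product(mats[mid:])
    return mat_mul(left, right), 1 + max(depth_left, depth_right)

# Eight hypothetical layer-transition matrices of a width-2 program.
layers = [[[1, i], [0, 1]] for i in range(1, 9)]

print(sequential_product(layers))
print(balanced_product(layers))
```

Both evaluation orders give the same matrix because matrix multiplication is associative; only the shape of the computation differs.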
We are finally able to prove the following statement, which might be of independent interest:
If a non-commutative homogeneous formula over of size is identically zero, then the corresponding Boolean formula (where results by replacing with XOR and with AND in ) can be proved with a Frege proof of size at most .
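The translation in this statement can be illustrated with a small Python sketch. It assumes that the two arithmetic operations being replaced are addition and multiplication over $GF(2)$; the tuple encoding of formulas and all function names are hypothetical. The example formula $x \cdot y + x \cdot y$ is homogeneous and identically zero over $GF(2)$, so its Boolean translation is unsatisfiable.

```python
from itertools import product

# A formula is a nested tuple: ("var", name), ("const", 0 or 1),
# ("+", left, right) or ("*", left, right).

def eval_gf2(f, a):
    """Evaluate the arithmetic formula over GF(2) at the 0-1 assignment a."""
    kind = f[0]
    if kind == "var":
        return a[f[1]] % 2
    if kind == "const":
        return f[1] % 2
    left, right = eval_gf2(f[1], a), eval_gf2(f[2], a)
    return (left + right) % 2 if kind == "+" else (left * right) % 2

def eval_bool(f, a):
    """Evaluate the Boolean translation: '+' becomes XOR, '*' becomes AND."""
    kind = f[0]
    if kind == "var":
        return bool(a[f[1]])
    if kind == "const":
        return bool(f[1])
    left, right = eval_bool(f[1], a), eval_bool(f[2], a)
    return (left != right) if kind == "+" else (left and right)

def assignments(names):
    """All 0-1 assignments to the given variable names."""
    for bits in product([0, 1], repeat=len(names)):
        yield dict(zip(names, bits))

x, y = ("var", "x"), ("var", "y")
zero_formula = ("+", ("*", x, y), ("*", x, y))  # x*y + x*y = 0 over GF(2)

translation_agrees = all(
    eval_bool(zero_formula, a) == bool(eval_gf2(zero_formula, a))
    for a in assignments(["x", "y"]))
translation_unsatisfiable = all(
    not eval_bool(zero_formula, a) for a in assignments(["x", "y"]))
```

Checking all assignments only confirms that the Boolean formula is unsatisfiable; it says nothing about the size of a Frege proof of its negation, which is the content of the statement above.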
1.4 Comparison with previous work
Our main characterization of the Frege system is based on a non-commutative version of the IPS system from Grochow and Pitassi [GP14]. As described above, the non-commutative IPS gives a tighter characterization than the (commutative) IPS in [GP14], and comes close to capturing the Frege system tightly.
In the original (formula version of the) IPS, proofs are arithmetic formulas, and thus any super-polynomial lower bound on IPS refutations implies $\mathsf{VNP} \neq \mathsf{VP}_e$, or in other words, that the permanent does not have polynomial-size arithmetic formulas (Joshua Grochow [personal communication]). This shows that IPS lower bounds will be considerably difficult to obtain. For the non-commutative IPS, on the other hand, we face a seemingly much more favourable situation: an exponential-size lower bound on non-commutative IPS gives only a corresponding lower bound on non-commutative formulas, for which exponential-size lower bounds are already known [Nis91]. In other words, exponential-size lower bounds on Frege imply merely (at least in the context of the Ideal Proof System) corresponding lower bounds on non-commutative formulas, a result which is already known. In view of this, it seems that there is no strong concrete justification to believe that Frege lower bounds are beyond current techniques.
Let us also mention the work in [Tza11] that dealt with propositional proof systems over non-commutative formulas. In [Tza11] the choice was made to define all proof systems as polynomial calculus-style systems in which proof-lines are written as non-commutative formulas (as well as the more restricted class of ordered-formulas). This meant that the characterization of a proof system in terms of a single non-commutative polynomial is lacking from that work (as well as the consequences we obtained in the current work).
For a positive natural number $n$ we use the standard notation $[n]$ for $\{1, \dots, n\}$.
Definition 2.1 (Boolean formulas).
Given a set of input variables $x_1, \dots, x_n$, a Boolean formula on these input variables is a rooted finite tree of fan-in at most 2, with edges directed from leaves to the root. We consider the edges coming into nodes as ordered. (This is not important in general, but for Frege proofs it is in fact implicit that propositional formulas are ordered.) Internal nodes are labeled with the Boolean gates OR, AND and NOT, denoted $\lor, \land, \neg$, respectively, where the fan-in of $\lor$ and $\land$ is two and the fan-in of $\neg$ is one. The leaves are labeled either with input variables or with $0, 1$ (identified with the truth values false and true, resp.). The entire formula computes the function computed by the gate at the root. Given a formula $F$, the size of the formula is the number of Boolean gates in $F$, denoted $|F|$.
Given a pair of Boolean formulas and over the variables , we denote by the formula in which every occurrence of in is substituted by the formula .
We use the symbol to denote logical equivalence and we use the symbol to denote .
2.1 The Frege proof system
As outlined in the introduction, a Frege proof system is any standard propositional proof system for proving propositional tautologies having finitely many axiom schemes and deduction rules, and where proof-lines are written as Boolean formulas. The size of a Frege proof is the number of symbols it takes to write down the proof, namely the total of all the formula sizes appearing in the proof. Let us define Frege proofs in a more formal way.
Definition 2.2 (Frege (derivation) rule).
A Frege rule is a sequence of propositional formulas , for , written as . In case , the Frege rule is called an axiom scheme. A formula is said to be derived by the rule from if are all substitution instances of , for some assignment to the variables (that is, there are formulas such that , for all ). The Frege rule is said to be sound if whenever an assignment satisfies the formulas above the line, then it also satisfies the formula below the line.
Definition 2.3 (Frege proof).
Given a set of Frege rules, a Frege proof is a sequence of Boolean formulas such that every formula is either an axiom or was derived by one of the given Frege rules from previous formulas. If the sequence terminates with the Boolean formula $A$, then the proof is said to be a proof of $A$. The size of a Frege proof is the sum of all formula sizes in the proof.
A proof system is said to be sound if it admits proofs of only tautologies. A proof system is said to be implicationally complete if for every set of formulas $T$, if $T$ semantically implies $F$, then there is a proof of $F$ using (possibly) axioms from $T$.
Definition 2.4 (Frege proof system).
Given a set $P$ of sound Frege rules, we say that $P$ is a Frege proof system if $P$ is implicationally complete.
Note that a Frege proof system is always sound since the Frege rules are assumed to be sound. Frege is also complete (that is, it can prove all tautologies), by implicational completeness. We do not need to work with a specific Frege proof system, since a basic result in proof complexity by Reckhow [Rec76] states that any two Frege proof systems, even with different propositional connectives, are polynomially equivalent. For concreteness the reader can think of Shoenfield’s system from the introduction, noting it is indeed a Frege system.
The problem of demonstrating super-polynomial size lower bounds on propositional Frege proofs asks whether there is a family $(F_n)_{n=1}^{\infty}$ of propositional tautological formulas for which there is no polynomial $p$ such that the minimal Frege proof size of $F_n$ is at most $p(|F_n|)$, for all $n$.
2.2 Preliminary algebraic models of computation and proofs
Here we define arithmetic formulas (both commutative and non-commutative) as well as the algebraic propositional proof system Polynomial Calculus over Formulas ($\mathcal{F}$-$\mathcal{PC}$) introduced by Grigoriev and Hirsch [GH03].
Definition 2.5 (Non-commutative formula).
Let $\mathbb{F}$ be a field and $x_1, x_2, \dots$ be (algebraic) variables. A non-commutative arithmetic formula (or non-commutative formula for short) is a finite (ordered) labeled tree, with edges directed from the leaves to the root, and with fan-in at most two, such that there is an order on the edges coming into a node: the first edge is called the left edge and the second one the right edge. Every leaf of the tree (namely, a node of fan-in zero) is labeled either with an input variable or a field element. Every other node of the tree is labeled either with $+$ or $\times$ (in the first case the node is a plus gate and in the second case a non-commutative product gate). We assume that there is only one node of out-degree zero, called the root.
A non-commutative formula computes a non-commutative polynomial in $\mathbb{F}\langle x_1, \dots, x_n \rangle$ in the following way. A leaf computes the input variable or field element that labels it. A plus gate computes the sum of polynomials computed by its incoming nodes. A product gate computes the non-commutative product of the polynomials computed by its incoming nodes according to the order of the edges. (Subtraction is obtained using the constant $-1$.) The output of the formula is the polynomial computed at the root. The depth of a formula is the maximal length of a path from the root to a leaf. The size of a non-commutative formula is the total number of internal nodes (i.e., all nodes except the leaves) in its underlying tree, and is denoted similarly to the Boolean case by $|F|$.
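As an illustration of this definition, the following Python sketch expands a non-commutative formula into the polynomial it computes, and also reports its size and depth. The encoding is an assumption made for the example: formulas are nested tuples, coefficients are integers, and a polynomial is a dictionary from words (tuples of variable names) to nonzero coefficients. Full expansion can take exponential time in general, so this is a semantic illustration rather than an efficient procedure.

```python
def poly_add(p, q):
    """Sum of two polynomials; zero coefficients are dropped."""
    out = dict(p)
    for word, c in q.items():
        out[word] = out.get(word, 0) + c
    return {w: c for w, c in out.items() if c != 0}

def poly_mul(p, q):
    """Non-commutative product: words are concatenated in order."""
    out = {}
    for w1, c1 in p.items():
        for w2, c2 in q.items():
            out[w1 + w2] = out.get(w1 + w2, 0) + c1 * c2
    return {w: c for w, c in out.items() if c != 0}

def compute(f):
    """Polynomial computed by the formula f."""
    kind = f[0]
    if kind == "var":
        return {(f[1],): 1}
    if kind == "const":
        return {(): f[1]} if f[1] != 0 else {}
    left, right = compute(f[1]), compute(f[2])
    return poly_add(left, right) if kind == "+" else poly_mul(left, right)

def size(f):
    """Number of internal nodes (plus and product gates)."""
    return 0 if f[0] in ("var", "const") else 1 + size(f[1]) + size(f[2])

def depth(f):
    """Maximal length of a path from the root to a leaf."""
    return 0 if f[0] in ("var", "const") else 1 + max(depth(f[1]), depth(f[2]))

x, y, minus_one = ("var", "x"), ("var", "y"), ("const", -1)
# x*y + (-1)*(y*x): nonzero as a non-commutative polynomial.
commutator = ("+", ("*", x, y), ("*", minus_one, ("*", y, x)))
# x*y + (-1)*(x*y): identically zero.
trivial = ("+", ("*", x, y), ("*", minus_one, ("*", x, y)))

commutator_poly = compute(commutator)
trivial_poly = compute(trivial)
```

The first example shows why the order of the edges matters: $xy - yx$ is the zero polynomial in the commutative setting, but here the words $xy$ and $yx$ are distinct monomials.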
The definition of (a commutative) arithmetic formula is almost identical:
Definition 2.6 ((Commutative) arithmetic formula).
An arithmetic formula is defined in a similar way to a non-commutative formula, except that we ignore the order of multiplication (that is, a product node does not have order on its children and there is no order on multiplication when defining the polynomial computed by a formula).
Substitutions of non-commutative formulas into other non-commutative formulas are defined and denoted similarly to substitutions in Boolean formulas.
Note that we consider arithmetic formulas as syntactic objects. For example, and are different formulas. Furthermore, in the proof system defined below they should be derived from each other via an explicit application of a rewrite rule.
2.2.1 Polynomial calculus over formulas
The polynomial calculus over formulas system, denoted $\mathcal{F}$-$\mathcal{PC}$, was introduced by Grigoriev and Hirsch [GH03]. This system operates with (commutative) arithmetic formulas (as purely syntactic terms). $\mathcal{F}$-$\mathcal{PC}$ is a refutation system: an $\mathcal{F}$-$\mathcal{PC}$ refutation establishes that a collection of polynomials has no common 0-1 root. We can also treat $\mathcal{F}$-$\mathcal{PC}$ as a proof system for propositional tautologies: for every Boolean tautology $T$, the polynomial translation of $\neg T$ (Definition 1.3) does not have a 0-1 root, and therefore an $\mathcal{F}$-$\mathcal{PC}$ refutation of this translation can be considered as an $\mathcal{F}$-$\mathcal{PC}$ proof of the tautology $T$.
Definition 2.7 (Rewrite rule).
A rewrite rule is a pair of formulas $f, g$ denoted $f \to g$. Given a formula $\Phi$, an application of a rewrite rule $f \to g$ to $\Phi$ is the result of replacing at most one occurrence of $f$ in $\Phi$ by $g$ (that is, substituting a subformula $f$ inside $\Phi$ by the formula $g$). We write $f \leftrightarrow g$ to denote the pair of rewriting rules $f \to g$ and $g \to f$.
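A single application of a rewrite rule can be sketched in Python as follows. The tuple encoding of formulas and the function name are hypothetical, and the sketch works with one concrete instance of a rule (a fixed left-hand side and right-hand side) rather than a rule scheme. It replaces the first occurrence found in a left-to-right traversal, whereas the definition allows any single occurrence to be chosen.

```python
def rewrite_once(phi, lhs, rhs):
    """Replace one occurrence of the subformula lhs in phi by rhs.

    Returns (new_formula, changed). Formulas are nested tuples:
    ("var", name), ("const", value), ("+", l, r) or ("*", l, r).
    """
    if phi == lhs:
        return rhs, True
    if phi[0] in ("+", "*"):
        left, done = rewrite_once(phi[1], lhs, rhs)
        if done:
            return (phi[0], left, phi[2]), True
        right, done = rewrite_once(phi[2], lhs, rhs)
        if done:
            return (phi[0], phi[1], right), True
    return phi, False

x, y = ("var", "x"), ("var", "y")
xy, yx = ("*", x, y), ("*", y, x)

# An instance of the commutativity rule for products, applied to x*y + x*y.
phi = ("+", xy, xy)
rewritten, changed = rewrite_once(phi, xy, yx)
unchanged, changed_again = rewrite_once(x, xy, yx)
```

Only the left copy of `x*y` is rewritten, which matches the "at most one occurrence" condition: rewriting both copies takes two proof-lines.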
Definition 2.8 ($\mathcal{F}$-$\mathcal{PC}$ [GH03]).
Fix a field $\mathbb{F}$. Let be a collection of formulas computing polynomials from . (Note here that we are talking about formulas, treated as syntactic terms. Also notice that all the formulas in are considered as commutative formulas computing (commutative) polynomials; however, because the formulas are merely syntactic terms, we have an order on the children of internal nodes, and in particular the children of product gates are ordered.) Let the set of axioms be the following formulas:
- Boolean axioms
A sequence of formulas computing polynomials from is said to be an proof of from , if for every we have one of the following:
, for some ;
is a Boolean axiom;
was deduced by one of the following inference rules from previous proof-lines , for :
(Where are formulas constructed as displayed; e.g., is the formula with product gate at the root having the formulas and as children.)111111In [GH03] the product rule of is defined so that one can derive from , where is any formula, and not just a variable. However, it is easy to show that the definition of in [GH03] and our Definition 2.8 polynomially-simulate each other.
was deduced from previous proof-line , for , by one of the following rewriting rules expressing the polynomial-ring axioms (where range over all arithmetic formulas computing polynomials in ):
- Zero rule
- Unit rule
- Scalar rule
, where is a formula containing no variables (only field elements) that computes the constant .
- Commutativity rules
- Associativity rule
- Distributivity rule
(The semantics of an proof-line is the polynomial equation .)
An refutation of is a proof of the formula from . The size of an proof is defined as the total size of all formulas in and is denoted by .
Definition 2.9 (Tree-like ).
An $\mathcal{F}$-$\mathcal{PC}$ proof is tree-like if every derived arithmetic formula in the proof is used at most once (and if it is needed again, it must be derived once more).
For the purpose of comparing the relative complexity of different proof systems we have the concept of a simulation. Specifically, we say that a propositional proof system polynomially simulates another propositional proof system if there is a polynomial-time computable function that maps -proofs to -proofs of the same tautologies (if and use different representations for tautologies, we fix a translation (such as ) from one representation to the other). In case is computable in time (for the input-size), we say that -simulates . Specifically, if we say the simulation is quasi-polynomial. We say that and are polynomially equivalent in case polynomially simulates and polynomially simulates . (Our simulations will always be formally -simulations, though we might not always state explicitly that the map , from -proofs to -proofs is efficiently computable, and only show the existence of a -proof whose size is proportional to the corresponding -proof.)
Tree-like $\mathcal{F}$-$\mathcal{PC}$ polynomially simulates Frege.
Grigoriev and Hirsch showed the following:
Theorem 2.10 ([GH03]).
Tree-like $\mathcal{F}$-$\mathcal{PC}$ polynomially simulates Frege. More precisely, for every propositional tautology $T$, if $T$ has a polynomial-size Frege proof then there is a polynomial-size tree-like $\mathcal{F}$-$\mathcal{PC}$ proof of $T$ (over $GF(p)$, for $p$ a prime, or $\mathbb{Q}$).
Let us briefly explain how Grigoriev and Hirsch [GH03] obtained a simulation of Frege by tree-like $\mathcal{F}$-$\mathcal{PC}$ (in contrast to simply (dag-like) $\mathcal{F}$-$\mathcal{PC}$), as this is not an entirely trivial result (and one which, in turn, is important for understanding our simulation). Indeed, this simulation depends crucially on a somewhat surprising result of Krajíček, who showed that tree-like Frege and (dag-like) Frege are polynomially equivalent [Kra95]:
Tree-like Frege proofs polynomially simulate Frege proofs.
Grigoriev and Hirsch show (Theorem 3 in [GH03]) that $\mathcal{F}$-$\mathcal{PC}$ polynomially simulates Frege. Then, by inspection of this simulation, one can observe that tree-like Frege proofs are simulated by tree-like $\mathcal{F}$-$\mathcal{PC}$ proofs (which is sufficient to conclude the simulation due to the theorem above), namely:
Tree-like $\mathcal{F}$-$\mathcal{PC}$ polynomially simulates tree-like Frege.
3 Non-commutative ideal proof system polynomially simulates Frege
Here we show that the non-commutative IPS polynomially simulates Frege.
Theorem 3.1 (restatement of Theorem 1.4).
The non-commutative IPS refutation system (when refutations are written as non-commutative formulas) polynomially simulates the Frege system. More precisely, for every propositional tautology $T$, if $T$ has a polynomial-size Frege proof then there is a non-commutative IPS refutation of $\neg T$ (over $GF(p)$ for a prime $p$, or $\mathbb{Q}$) of polynomial size.
Recall that Raz and Shpilka [RS05] gave a deterministic polynomial-time PIT algorithm for non-commutative formulas (over any field):
Theorem 3.2 (PIT for non-commutative formulas [RS05]).
There is a deterministic polynomial-time algorithm that decides whether a given non-commutative formula over a field $\mathbb{F}$ computes the zero polynomial $0$. (We assume here that the elements of $\mathbb{F}$ have an efficient representation and the field operations are efficiently computable, e.g., the field of rationals.)
Now, since we write refutations as non-commutative formulas we can use the theorem above to check in deterministic polynomial-time the correctness of non-commutative IPS refutations, obtaining:
Corollary 3.3 (restatement of Corollary 1.5).
The non-commutative IPS is a sound and complete Cook-Reckhow refutation system. That is, it is a sound and complete refutation system for unsatisfiable propositional formulas in which refutations can be checked for correctness in deterministic polynomial-time.
3.1 Non-commutative IPS polynomially simulates tree-like $\mathcal{F}$-$\mathcal{PC}$
For convenience, let denote the commutator axiom , for , and let denote the vector of all the axioms. When we write where are formulas (e.g., and , resp.), we mean .
Non-commutative IPS polynomially simulates tree-like $\mathcal{F}$-$\mathcal{PC}$ (Definition 2.8). Specifically, if $\pi$ is a tree-like $\mathcal{F}$-$\mathcal{PC}$ proof of a tautology $T$ then there is a non-commutative IPS refutation of $\neg T$ of size polynomial in $|\pi|$.
Let be arithmetic formulas over the variables . We denote by the vector . Since an arithmetic formula is a syntactic term in which the children of gates are ordered, we can treat a (commutative) arithmetic formula as a non-commutative arithmetic formula by taking the order on the children of product gates to be the order of non-commutative multiplication.
Suppose has a -size tree-like refutation of the ’s (i.e., a proof of the polynomial from ), where each is an arithmetic formula. We construct a corresponding non-commutative IPS refutation of the ’s from this tree-like refutation. The following lemma suffices for this purpose:
For every , there exists a non-commutative formula such that
, where are the indices of the proof-lines involved in deriving .
For example, if is derived by and is derived by for some , then we say that are both involved in deriving . In other words, the lines involved in deriving a proof-line are all the proof-lines in the sub-tree of when we consider the underlying graph of the (tree-like) proof as a tree.
Note that if the lemma holds, then is a non-commutative IPS proof because it has the property that and . And its size is bounded by
We construct by induction on the length of the refutation . That is, for from to , we construct the non-commutative formula according to , as follows:
Base case: is an axiom for some .
Let . Obviously, and .
Case 1: is derived from the addition rule , for . Put where . Thus, and (where the rightmost inequality holds since is a tree-like refutation and hence ).
Case 2: is derived from the product rule , for and . Put . Then and .
Case 3: is derived from , for , by a rewriting rule which is not the commutative rule of multiplication (). Let . The non-commutative trivially satisfies the properties claimed since all the rewriting rules (excluding the commutative rule of multiplication) express the non-commutative polynomial-ring axioms, and thus cannot change the polynomial computed by a non-commutative formula. And .
Case 4: is derived from , for , by a single application of the commutative rule of multiplication. Then by Lemma 3.6 below, we can construct a non-commutative formula such that satisfies the desired properties (stated in Lemma 3.5).
Let be non-commutative formulas, such that can be derived from via the commutative rule of multiplication . Then there is a non-commutative formula in variables such that:
We define the non-commutative formula inductively as follows:
If , and , then is defined to be the formula constructed in Lemma 3.7 below.
If , .
Case 1: If , then let .
Case 2: If , then let .
If , .
Case 1: If , then let .
Case 2: If , then let .
By induction, the construction satisfies the desired properties. ∎
For any pair of two non-commutative formulas there exists a non-commutative formula in variables such that:
Let denote the smallest size of satisfying the above properties. We will show that by induction on .
Base case: .
In this case both and are constants or variables, thus .
In the following induction step, we consider the case where (which is symmetric for the case ).
Induction step: Assume that .
Case 1: The root of is addition.
Let . We have (after rearranging):
By induction hypothesis, we have .
Case 2: The root of is a product gate.
Let . By rearranging:
By induction hypothesis, we have . ∎
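The rearrangements in the two cases above are of the kind given by the standard commutator identities $[f, g_1 + g_2] = [f, g_1] + [f, g_2]$ and $[f, g_1 g_2] = [f, g_1]\, g_2 + g_1\, [f, g_2]$, where $[f, g] = fg - gf$. That these are exactly the identities intended in the displayed equations is an assumption, since the equations are not reproduced here. The following Python sketch checks both identities on sample non-commutative polynomials; the dictionary representation and all names are hypothetical.

```python
def add(p, q):
    """Sum of two non-commutative polynomials (word -> coefficient)."""
    out = dict(p)
    for word, c in q.items():
        out[word] = out.get(word, 0) + c
    return {w: c for w, c in out.items() if c != 0}

def scale(c, p):
    """Multiply every coefficient of p by the scalar c."""
    return {w: c * v for w, v in p.items() if c * v != 0}

def mul(p, q):
    """Non-commutative product: words are concatenated in order."""
    out = {}
    for w1, c1 in p.items():
        for w2, c2 in q.items():
            out[w1 + w2] = out.get(w1 + w2, 0) + c1 * c2
    return {w: c for w, c in out.items() if c != 0}

def comm(p, q):
    """Commutator [p, q] = p*q - q*p."""
    return add(mul(p, q), scale(-1, mul(q, p)))

def var(name):
    return {(name,): 1}

# Sample polynomials, chosen arbitrarily for the check.
f = add(mul(var("x"), var("y")), var("z"))
g1 = add(var("y"), scale(2, var("x")))
g2 = mul(var("z"), var("x"))

sum_lhs = comm(f, add(g1, g2))
sum_rhs = add(comm(f, g1), comm(f, g2))
prod_lhs = comm(f, mul(g1, g2))
prod_rhs = add(mul(comm(f, g1), g2), mul(g1, comm(f, g2)))
```

Each identity expresses a commutator of a larger formula through commutators of strictly smaller subformulas, which is what allows an induction on formula size to bottom out at commutators of variables and constants.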
4 Frege quasi-polynomially simulates non-commutative IPS
In this long section we prove Theorem 4.1 stating that the Frege system quasi-polynomially simulates the non-commutative IPS (over $GF(2)$). Together with Theorem 3.1, this gives a new characterization (up to a quasi-polynomial increase in size) of propositional Frege proofs as non-commutative arithmetic formulas.
We use the notation of Section 1.3.3 as follows: for a clause $k_i$ in a CNF $\phi = k_1 \land \dots \land k_m$, we denote by $Q_i$ the non-commutative formula translation of the clause (Definition 1.6). Thus, a positive literal $x_j$ translates to $1 - x_j$, a negated literal $\neg x_j$ translates to $x_j$, and a disjunction $f_1 \lor \dots \lor f_r$ translates to the product of the translations of $f_1, \dots, f_r$ (considered as a tree of product gates with the translated literals as leaves), and where the formulas are over $GF(2)$ (meaning that $1 - x_j$ is in fact $1 + x_j$). Recall that this way, for every 0-1 assignment (when we identify true with 1 and false with 0), $Q_i = 0$ iff $k_i$ is true.
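The last property can be checked mechanically on a small example. The Python sketch below assumes the translation convention in which a positive literal $x$ becomes $1 + x$, a negated literal becomes $x$, and a clause becomes the product of its translated literals, all over $GF(2)$; this convention is inferred from the requirement that the translation vanishes exactly when the clause is true. The clause encoding and function names are hypothetical.

```python
from itertools import product

# A literal is a pair (variable name, is_positive); a clause is a list of literals.

def literal_value(lit, a):
    """Value over GF(2) of the translated literal at the 0-1 assignment a."""
    name, positive = lit
    return (1 + a[name]) % 2 if positive else a[name] % 2

def clause_value(clause, a):
    """Value of the translated clause: the product of its translated literals."""
    value = 1
    for lit in clause:
        value = (value * literal_value(lit, a)) % 2
    return value

def clause_true(clause, a):
    """Truth value of the clause under a (true is 1, false is 0)."""
    return any(a[name] == (1 if positive else 0) for name, positive in clause)

# Sample clause: x OR (NOT y) OR z.
clause = [("x", True), ("y", False), ("z", True)]
names = ["x", "y", "z"]

zero_iff_true = all(
    (clause_value(clause, dict(zip(names, bits))) == 0)
    == clause_true(clause, dict(zip(names, bits)))
    for bits in product([0, 1], repeat=len(names)))

# The only falsifying assignment is x = 0, y = 1, z = 0.
falsifying_value = clause_value(clause, {"x": 0, "y": 1, "z": 0})
```

A product is zero exactly when some factor is zero, and a translated literal is zero exactly when the literal is true, which is why a disjunction maps to a product under this convention.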
Theorem 4.1 (Second main theorem; Restatement of Theorem 1.7).
For a 3CNF where are the corresponding polynomial equations for the clauses, if there is a non-commutative IPS refutation of size of over , then there is a Frege proof of size of .
As mentioned in the introduction, it will be evident that our proof in fact establishes a slightly tighter simulation of the non-commutative IPS by Frege. Specifically, if the degree of the non-commutative IPS refutation is and its formula depth is , then there is a Frege proof of with size . This will follow from our efficient simulation within Frege of Raz’ [Raz13] homogenization construction (Lemma 4.8). Nevertheless, for simplicity we shall always assume that the depth of the non-commutative IPS refutation formula is logarithmic in the size and that the degree of the refutation is at most , and thus will not take care to explicitly establish the dependence of the simulation on the parameters and .
The rest of the paper is dedicated to proving Theorem 4.1.
4.1 Balancing non-commutative formulas
First we show that a non-commutative formula of size $s$ can be balanced to an equivalent formula of depth $O(\log s)$, and thus we can assume that the non-commutative IPS certificate is already given as a balanced formula (this is needed for what follows). Both the statement of the balancing construction and its proof are similar to Proposition 4.1 in Hrubeš and Wigderson [HW14] (which in turn is similar to the case of commutative formulas with division gates in Brent [Bre74]). (Note that a formula of logarithmic depth (in the number of variables) must have polynomial size (in the number of variables).)
Assume that a non-commutative polynomial $p$ can be computed by a formula of size $s$. Then $p$ can be computed by a formula of depth $O(\log s)$ (and hence of polynomial size when $s$ is polynomial in the number of variables).
The proof is almost identical to Hrubeš and Wigderson’s proof of Proposition 4.1 in [HW14], which deals with rational functions and allows formulas with division gates. Thus, we only outline the argument in [HW14] and argue that if the given formula does not have division gates, then the new formula obtained by the balancing construction will not contain any division gate either.
Let be a non-commutative formula and let g be a gate in . We denote by the subformula of with the root being g and by the formula obtained by replacing in by the variable . We denote by the non-commutative polynomials in computed by and , respectively.
We simultaneously prove the following two statements by induction on , concluding the lemma:
Inductive statement: If is a non-commutative formula of size , then for sufficiently large and suitable constants , the following hold:
has a non-commutative formula of depth at most ;
if is a variable occurring at most once in , then:
where are non-commutative polynomials that do not contain , and each can be computed by a non-commutative formula of depth at most .
Base case: . In this case there is one gate g connecting two variables or constants. Thus, (i) in the inductive statement can be obtained immediately as it is already computed by a formula of depth . As for (ii), note that in the base case, is a formula with only one gate g. Assuming that is a variable occurring only once in , it is easy to construct non-commutative formulas so that for which the conditions in (ii) hold as follows:
Case 1: if g is a plus gate connecting the variable with a variable or constant , then we can write as .
Case 2: if g is a product gate connecting with (for , and in this order), then we can write as .
Case 3: if g is a product gate connecting with (for , and in this order), then we can write as .
Induction step: (i) is established (slightly informally) as follows. Find a gate g in such that both and are small (of size at most , and where is a new variable that does not occur in ). Then, by applying induction hypothesis on , there exist formulas of small depth such that . Thus,
To prove (ii), find an appropriate gate g on the path between and the output of (an appropriate g is a gate such that and are both small (of size at most ), where is a new variable not occurring in ). Use the inductive assumptions to write:
and compose these expressions to get
where , .
It is clear that the respective depth of and are all at most when is sufficiently large.
To finish the proof of (ii), it suffices to show that