Characterizing Propositional Proofs
as Non-Commutative Formulas^{1}
^{1} An extended abstract of this work entitled “Non-commutative Formulas and Frege Lower Bounds: a New Characterization of Propositional Proofs” appeared in Proceedings of the 30th Annual Computational Complexity Conference (CCC): June 17-19, 2015.
Abstract
Does every Boolean tautology have a short propositional-calculus proof? Here, a propositional-calculus (i.e., Frege) proof is any proof starting from a set of axioms and deriving new Boolean formulas using a fixed set of sound derivation rules. Establishing any superpolynomial size lower bound on Frege proofs (in terms of the size of the formula proved) is a major open problem in proof complexity, and among a handful of fundamental hardness questions in complexity theory at large. Noncommutative arithmetic formulas, on the other hand, constitute a quite weak computational model, for which exponential-size lower bounds were already shown by Nisan [STOC 1991], using a particularly transparent argument.
In this work we show that Frege lower bounds in fact follow from corresponding size lower bounds on noncommutative formulas computing certain polynomials (and that such lower bounds on noncommutative formulas must exist, unless NP = coNP). More precisely, we demonstrate a natural association between tautologies $T$ and noncommutative polynomials $p$, such that:

if $T$ has a polynomial-size Frege proof then $p$ has a polynomial-size noncommutative arithmetic formula; and conversely, when $T$ is a DNF, if $p$ has a polynomial-size noncommutative arithmetic formula over $GF(2)$ then $T$ has a Frege proof of quasipolynomial size.
The argument is a characterization of Frege proofs as noncommutative formulas: we show that the Frege system is (quasi-)polynomially equivalent to a noncommutative Ideal Proof System (IPS), following the recent work of Grochow and Pitassi [FOCS 2014], which introduced a propositional proof system in which proofs are arithmetic circuits, and the work in [Tza11], which considered adding the commutator as an axiom in algebraic propositional proof systems. This also gives a characterization of propositional Frege proofs in terms of (noncommutative) arithmetic formulas that is tighter than the one (via the formula version of IPS) in Grochow and Pitassi [FOCS 2014].
1 Introduction
1.1 Propositional proof complexity
The field of propositional proof complexity aims to understand and analyze the computational resources required to prove propositional statements. The problems the field poses are fundamental, difficult, and of central importance to computer science and complexity theory, as demonstrated by the seminal work of Cook and Reckhow [CR79], who showed the immediate relevance of these problems to the NP vs. coNP problem (and thus to the P vs. NP problem).
Among the major unsolved questions in propositional proof complexity is whether the standard propositional logic calculus, either in the form of the Sequent Calculus, or equivalently, in the axiomatic form of Hilbert-style proofs (i.e., Frege proofs), is polynomially bounded; that is, whether every propositional tautology (namely, a formula that is satisfied by every assignment) has a proof whose size is polynomially bounded in the size of the formula proved (alternatively and equivalently, we can think of unsatisfiable formulas and their refutations). Here, we consider the size of proofs as the number of symbols it takes to write them down, where each formula in the proof is written as a Boolean formula (in other words, we count the total number of logical gates appearing in the proof).
It has been known since Reckhow's work [Rec76] that all Frege proof systems^{2} (as well as the Gentzen sequent calculus with the cut rule [Gen35]) are polynomially equivalent to each other, and hence it does not matter precisely which rules, axioms, and logical connectives we use in the system. Nevertheless, for concreteness, the reader can think of the Frege proof system as the following simple one (known as Schoenfield’s system), consisting of only three axiom schemes (where $A \to B$ is an abbreviation of $\neg A \lor B$; and $A, B, C$ are any propositional formulas):
^{2} Formally, a Frege proof system is any propositional proof system with a fixed number of axiom schemes and sound derivation rules that is also implicationally complete, and in which proof-lines are written as propositional formulas (see Definition 2.4).
$A \to (B \to A)$
$(\neg A \to \neg B) \to ((\neg A \to B) \to A)$
$(A \to (B \to C)) \to ((A \to B) \to (A \to C))$
and a single inference rule (known as modus ponens): from $A$ and $A \to B$, derive $B$.
Complexity-wise, Frege is considered a very strong proof system, albeit a poorly understood one. The qualification strong here has several meanings: first, that no superpolynomial lower bound is known for Frege proofs. Second, that there are not even good hard candidates for the Frege system (see [BBP95, Raz15, Kra11, LT13] for further discussions on hard proof complexity candidates). Third, that for most hard instances (e.g., the pigeonhole principle and Tseitin tautologies) that are known to be hard for weaker systems (e.g., resolution, cutting planes, etc.), there are known polynomial bounds on Frege proofs. Fourth, that proving superpolynomial lower bounds on Frege proofs seems to a certain extent out of reach of current techniques (and is believed by some to be even harder than proving explicit circuit lower bounds [Raz15]). And finally, that by the common (mainly informal) correspondence between circuits and proofs (namely, the correspondence between a circuit class $\mathcal{C}$ and a proof system in which every proof-line is written as a circuit^{3} from $\mathcal{C}$), the Frege system corresponds to the circuit class of polynomial-size, logarithmic-depth circuits, denoted $\mathrm{NC}^1$ (equivalently, of polynomial-size formulas [Spi71]), considered to be a strong computational model for which no (explicit) superpolynomial lower bounds are currently known.
^{3} To be more precise, one has to associate a circuit class $\mathcal{C}$ with a proof system in which a family of proofs is written such that every proof-line in the family is a circuit family from $\mathcal{C}$.
Accordingly, proving lower bounds on Frege proofs is considered an extremely hard task. In fact, the best lower bound known today is only quadratic, and its proof uses a fairly simple syntactic argument [Kra95]. If we put further restrictions on Frege proofs, like restricting the depth of each formula appearing in a proof to a certain fixed constant, exponential lower bounds can be obtained [Ajt88, KPW95, PBI93]. Although these constant-depth Frege exponential-size lower bounds go back to Ajtai’s result from 1988, they are still in some sense the state of the art in proof complexity lower bounds (beyond the important developments on weaker proof systems, such as resolution and its comparatively weak extensions). Constant-depth Frege lower bounds use quite involved probabilistic arguments, mainly specialized switching lemmas tailored for specific tautologies (namely, counting tautologies, the most notable of which are the Pigeonhole Principle tautologies). Even random CNF formulas near the satisfiability threshold are not known to be hard for constant-depth Frege (let alone hard for [unrestricted depth] Frege).
All of the above emphasizes the importance, basic nature and difficulty of understanding the complexity of strong propositional proof systems, while showing how little is actually known about these systems.
1.2 Prominent directions for understanding propositional proofs
As we already mentioned, there is a guiding line in proof complexity which states a correspondence between the complexity of circuits and the complexity of proofs. This correspondence is mainly informal, but there are seemingly good indications that it might be more than a superficial analogy. One of the most compelling pieces of evidence for this correspondence is that there is a formal correspondence (cf. [CN10] for a clean formulation of this) between the first-order logical theories of bounded arithmetic (whose axioms state the existence of sets taken from a given complexity class $\mathcal{C}$) and propositional proof systems (in which proof-lines are circuits from $\mathcal{C}$).
Another aspect of the informal correspondence between circuit complexity and proof complexity is that circuit hardness sometimes can be used to obtain proof complexity hardness. The most notable example of this is the lower bounds on constant-depth Frege proofs mentioned above: constant-depth Frege proofs can be viewed as propositional calculus operating with $\mathrm{AC}^0$ circuits, and the known lower bounds on constant-depth Frege proofs (cf. [Ajt88, KPW95, PBI93]) use techniques borrowed from $\mathrm{AC}^0$ circuit lower bounds. The success in moving from circuit hardness towards proof-complexity hardness has spurred a flow of attempts to obtain lower bounds on proof systems other than constant-depth Frege. For example, Pudlák [Pud99] and Atserias et al. [AGP02] studied proofs based on monotone circuits, motivated by known exponential lower bounds on monotone circuits [Raz85]. Raz and Tzameret [RT08b, RT08a, Tza08] investigated algebraic proof systems operating with multilinear formulas, motivated by lower bounds on multilinear formulas for the determinant, permanent and other explicit polynomials [Raz09, Raz06]. Atserias et al. [AKV04], Krajíček [Kra08] and Segerlind [Seg07] have considered proofs operating with ordered binary decision diagrams (OBDDs), and the second author [Tza11] initiated the study of proofs operating with noncommutative formulas (see Sec. 1.4 for a comparison with the current work).^{4}
^{4} We do not discuss here the important thread of results whose aim is to establish conditional lower bounds based on Nisan-Wigderson generators. This direction was developed in e.g. [ABSRW04, Raz15, Kra04, Kra10].
Until quite recently it was unknown whether the correspondence between proofs and circuits is two-sided, namely, whether proof complexity hardness (of concrete known proof systems) can imply any computational hardness. An initial example of such an implication from proof hardness to circuit hardness was given by Raz and Tzameret [RT08b]. They showed that a separation between algebraic proof systems operating with arithmetic circuits and multilinear arithmetic circuits, resp., for an explicit family of polynomials, implies a separation between arithmetic circuits and multilinear arithmetic circuits.
In a recent significant development about the complexity of strong proof systems, Grochow and Pitassi [GP14] demonstrated a much stronger correspondence. They introduced a natural propositional proof system, called the Ideal Proof System (IPS for short), for which any superpolynomial size lower bound on IPS implies a corresponding size lower bound on arithmetic circuits, and more formally, implies that the permanent does not have polynomial-size arithmetic circuits. The IPS is defined as follows:
Definition 1.1 (Ideal Proof System (IPS) [GP14]).
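Let $F_1(\overline{x}) = F_2(\overline{x}) = \cdots = F_m(\overline{x}) = 0$ be a system of polynomials in the variables $x_1,\ldots,x_n$, where the polynomials $x_i^2 - x_i$, for all $1 \le i \le n$, are part of this system. An IPS refutation (or certificate) that the polynomials $F_i$ have no common 0-1 solutions is a polynomial $\mathfrak{F}(\overline{x},\overline{y})$ in the variables $x_1,\ldots,x_n$ and $y_1,\ldots,y_m$, such that:

1. $\mathfrak{F}(x_1,\ldots,x_n,\overline{0}) = 0$; and

2. $\mathfrak{F}(x_1,\ldots,x_n,F_1(\overline{x}),\ldots,F_m(\overline{x})) = 1$.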
The essence of IPS is that a proof (or refutation) is a single polynomial that can be written simply as an arithmetic circuit or formula. The advantage of this formulation is that now we can obtain direct connections between circuit/formula hardness (i.e., “computational hardness”) and hardness of proofs. Grochow and Pitassi indeed showed that a lower bound on IPS written as an arithmetic circuit implies that the permanent does not have polynomial-size algebraic circuits (Valiant’s conjectured separation [Val79]); and similarly, a lower bound on IPS written as an arithmetic formula implies that the permanent does not have polynomial-size algebraic formulas ($\mathrm{VNP} \neq \mathrm{VP}_e$, ibid.).
Under certain assumptions, Grochow and Pitassi [GP14] were able to connect their result to standard propositional-calculus proof systems, i.e., Frege and Extended Frege. Their assumption was the following: Frege has polynomial-size proofs of the statement expressing that polynomial identity testing (PIT) for arithmetic formulas is decidable by polynomial-size Boolean circuits (PIT for arithmetic formulas is the problem of deciding whether an input arithmetic formula computes the [formal] zero polynomial). They showed that,^{5} under this assumption, superpolynomial lower bounds on Frege proofs imply that the permanent does not have polynomial-size arithmetic circuits. This, in turn, can be considered as a (conditional) justification for the apparent long-standing difficulty of proving lower bounds on strong proof systems.
^{5} We focus only on the relevant results about Frege proofs from [GP14] (and not the results about Extended Frege in [GP14]; the latter proof system operates, essentially, with Boolean circuits, in the same way that Frege operates with Boolean formulas (equivalently $\mathrm{NC}^1$ circuits)).
1.3 Overview of results and proofs
1.3.1 Sketch
In this work we give a novel characterization of the propositional calculus (a fundamental and prominent object by itself) and thereby contribute to the understanding of strong propositional proof systems, and to the fundamental search for lower bounds on these proofs. We formulate a very natural proof system, namely a noncommutative variant of the ideal proof system, which we show captures unconditionally (up to a quasipolynomial-size increase, and in some cases only a polynomial increase^{6}) propositional Frege proofs. A proof in the noncommutative IPS is simply a single noncommutative polynomial written as a noncommutative formula.
^{6} We establish a slightly stronger characterization: the noncommutative IPS polynomially simulates Frege; and conversely, the complexity in which Frege simulates the noncommutative IPS depends on the degree of the noncommutative IPS refutation; e.g., the simulation is polynomial when refutations are of logarithmic degrees (see note after Theorem 1.7).
Our results thus give a compelling and simple new characterization of the proof complexity of propositional Frege proofs and bring new hope for achieving lower bounds on strong proof systems, by reducing the task of lower bounding Frege proofs to the following seemingly much more manageable task: proving matrix rank lower bounds on the matrices associated with certain noncommutative polynomials (in the sense of Nisan [Nis91]; see below for details).
The new characterization also tightens the recent results of Grochow and Pitassi [GP14] in the following sense:

1. The noncommutative IPS is polynomial-time checkable, whereas the original IPS was checkable in probabilistic polynomial time; and

2. Frege proofs unconditionally quasipolynomially simulate the noncommutative IPS, whereas Frege was shown to efficiently simulate IPS only assuming that the decidability of PIT for (commutative) arithmetic formulas by polynomial-size circuits is efficiently provable in Frege.
The tighter result shows that, at least for Frege, and in the framework of the ideal proof system, lower bounds on Frege proofs do not necessarily entail in themselves very strong computational lower bounds.
1.3.2 Some preliminaries: noncommutative polynomials and formulas
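A noncommutative polynomial over a given field $\mathbb{F}$ and with the variables $X := \{x_1, x_2, \ldots\}$ is a formal sum of monomials with coefficients from $\mathbb{F}$ such that the product of variables is noncommuting. For example, $x_1 x_2$, $x_2 x_1$ and $x_1 x_2 + x_2 x_1$ are three distinct polynomials in $\mathbb{F}\langle X \rangle$. The ring of noncommutative polynomials with variables $X$ and coefficients from $\mathbb{F}$ is denoted $\mathbb{F}\langle X \rangle$.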
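As a small illustration of the definition above, the following Python sketch stores a noncommutative polynomial as a dictionary from monomials (tuples of variable names, where order matters) to coefficients. The representation and the helper names `nc_add` and `nc_mul` are assumptions made for this example only, not notation from the paper.

```python
from collections import defaultdict
from fractions import Fraction


def nc_add(p, q, scale=1):
    """Return p + scale*q for polynomials stored as {monomial tuple: coefficient}."""
    out = defaultdict(Fraction)
    for mono, coef in p.items():
        out[mono] += coef
    for mono, coef in q.items():
        out[mono] += scale * coef
    # Drop zero coefficients so that equal polynomials have equal dicts.
    return {m: c for m, c in out.items() if c != 0}


def nc_mul(p, q):
    """Return p*q; monomials are concatenated, so variable order is preserved."""
    out = defaultdict(Fraction)
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            out[m1 + m2] += c1 * c2
    return {m: c for m, c in out.items() if c != 0}


x1 = {("x1",): Fraction(1)}
x2 = {("x2",): Fraction(1)}

x1x2 = nc_mul(x1, x2)
x2x1 = nc_mul(x2, x1)
commutator = nc_add(x1x2, x2x1, scale=-1)  # x1*x2 - x2*x1

print(x1x2 == x2x1)   # the two products are different polynomials
print(commutator)     # a nonzero polynomial in the noncommutative ring
```

In the commutative ring the commutator would collapse to zero; here it is a polynomial with two monomials, which is why commutativity has to be added explicitly as an axiom later on.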
A polynomial (i.e., a commutative polynomial) over a field is defined in the same way as a noncommutative polynomial except that the product of variables is commutative; in other words, it is a sum of (commutative) monomials.
A noncommutative arithmetic formula (noncommutative formula for short) is a fan-in two labeled tree, with edges directed from leaves towards the root, such that the leaves are labeled with field elements (for a given field $\mathbb{F}$) or variables, and internal nodes (including the root) are labeled with plus or product gates. A product gate has an order on its two children (holding the order of noncommutative product). A noncommutative formula computes a noncommutative polynomial in the natural way (see Definition 2.5).
Exponential-size lower bounds on noncommutative formulas (over any field) were established by Nisan [Nis91]. The idea (in retrospect) is quite simple: first transform a noncommutative formula into an algebraic branching program (ABP; Definition 4.13); and then show that the number of nodes in the $k$th layer of an ABP computing a degree $d$ homogeneous noncommutative polynomial $f$ is bounded from below by the rank of the degree $k$ partial-derivative matrix of $f$.^{7} Thus, lower bounds on noncommutative formulas follow from quite immediate rank arguments (e.g., the partial derivative matrices associated with the permanent and determinant can easily be shown to have high ranks).
^{7} The degree $k$ partial derivative matrix of $f$ is the matrix whose rows are all noncommutative monomials of degree $k$ and columns are all noncommutative monomials of degree $d-k$, such that the entry in row $M$ and column $N$ is the coefficient of the degree $d$ monomial $M \cdot N$ in $f$.
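The rank argument can be made concrete on toy instances. The sketch below builds the partial derivative matrix of a homogeneous noncommutative polynomial (rows indexed by monomial prefixes of one degree, columns by the complementary suffixes) and computes its rank over the rationals. The function names and the two example polynomials are illustrative assumptions and do not come from the paper.

```python
from fractions import Fraction
from itertools import product


def partial_derivative_matrix(poly, variables, d, k):
    """Rows: monomials of degree k; columns: monomials of degree d-k.
    Entry (M, N) is the coefficient of the degree-d monomial M*N in poly."""
    rows = list(product(variables, repeat=k))
    cols = list(product(variables, repeat=d - k))
    return [[Fraction(poly.get(r + c, 0)) for c in cols] for r in rows]


def rank(matrix):
    """Rank over the rationals by Gauss-Jordan elimination."""
    m = [row[:] for row in matrix]
    rk = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(len(m)):
            if i != rk and m[i][col] != 0:
                factor = m[i][col] / m[rk][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk


variables = ("x0", "x1")

# Palindrome polynomial of degree 4: sum of w * reverse(w) over all words w of length 2.
palindrome = {w + w[::-1]: 1 for w in product(variables, repeat=2)}

# (x0 + x1)^4: every degree-4 monomial has coefficient 1.
power = {m: 1 for m in product(variables, repeat=4)}

rank_palindrome = rank(partial_derivative_matrix(palindrome, variables, 4, 2))
rank_power = rank(partial_derivative_matrix(power, variables, 4, 2))
print(rank_palindrome, rank_power)
```

The palindrome polynomial gives a permutation matrix (full rank), whereas the power of a linear form gives the all-ones matrix (rank one), matching the intuition that the former needs a wide middle layer in any ABP and the latter does not.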
1.3.3 Noncommutative ideal proof system
Recall the IPS refutation system from Definition 1.1 above. We use the idea introduced in [Tza11], which considered adding the commutator as an axiom in propositional algebraic proof systems, to define a refutation system that polynomially simulates Frege:
Definition 1.2 (Noncommutative IPS).
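Let $\mathbb{F}$ be a field. Assume that $F_1(\overline{x}) = F_2(\overline{x}) = \cdots = F_m(\overline{x}) = 0$ is a system of noncommutative polynomial equations from $\mathbb{F}\langle x_1,\ldots,x_n \rangle$, and suppose that the following set of equations (axioms) are included in the $F_i$’s:
 Boolean axioms:

$x_i \cdot (1 - x_i)$, for all $1 \le i \le n$;
 Commutator axioms:

$x_i \cdot x_j - x_j \cdot x_i$, for all $1 \le i < j \le n$.
Suppose that the $F_i$’s have no common 0-1 solutions.^{8} A noncommutative IPS refutation (or certificate) that the system of $F_i$’s is unsatisfiable is a noncommutative polynomial $\mathfrak{F}(\overline{x},\overline{y})$ in the variables $x_1,\ldots,x_n$ and $y_1,\ldots,y_m$ (i.e., $\mathfrak{F} \in \mathbb{F}\langle \overline{x},\overline{y} \rangle$), such that:

1. $\mathfrak{F}(x_1,\ldots,x_n,\overline{0}) = 0$; and

2. $\mathfrak{F}(x_1,\ldots,x_n,F_1(\overline{x}),\ldots,F_m(\overline{x})) = 1$.
^{8} One can check that the $F_i$’s have no common 0-1 solutions in $\mathbb{F}$ iff they do not have a common 0-1 solution in every $\mathbb{F}$-algebra.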
We always assume that the noncommutative IPS refutation is written as a noncommutative formula. Hence the size of a noncommutative IPS refutation is the minimal size of a noncommutative formula computing the noncommutative IPS refutation.
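To make the definition tangible, here is a minimal Python sketch that checks a candidate certificate on a toy system. It assumes the two conditions are the standard IPS ones: substituting zero for the placeholder variables gives the zero polynomial, and substituting the axioms gives the constant 1, both as formal noncommutative identities. The toy system (with axioms $1 - x_1$, $1 - x_2$ and $x_1 x_2$), the placeholder names `y1`, `y2`, `y3`, and all helper functions are assumptions for illustration; the Boolean and commutator axioms are omitted because this particular certificate does not use them.

```python
from collections import defaultdict
from fractions import Fraction


def nc_add(p, q):
    out = defaultdict(Fraction)
    for poly in (p, q):
        for mono, coef in poly.items():
            out[mono] += coef
    return {m: c for m, c in out.items() if c != 0}


def nc_mul(p, q):
    out = defaultdict(Fraction)
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            out[m1 + m2] += c1 * c2  # concatenation keeps the variable order
    return {m: c for m, c in out.items() if c != 0}


def nc_substitute(p, assignment):
    """Replace variables by polynomials, left to right; unassigned variables stay."""
    result = {}
    for mono, coef in p.items():
        term = {(): Fraction(coef)}
        for v in mono:
            term = nc_mul(term, assignment.get(v, {(v,): Fraction(1)}))
        result = nc_add(result, term)
    return result


# Toy unsatisfiable system: x1 = 1, x2 = 1, and x1*x2 = 0.
axioms = {
    "y1": {(): 1, ("x1",): -1},   # 1 - x1
    "y2": {(): 1, ("x2",): -1},   # 1 - x2
    "y3": {("x1", "x2"): 1},      # x1 * x2
}

# Candidate certificate: y1 + x1*y2 + y3.
certificate = {("y1",): 1, ("x1", "y2"): 1, ("y3",): 1}

at_zero = nc_substitute(certificate, {y: {} for y in axioms})
at_axioms = nc_substitute(certificate, axioms)
print(at_zero, at_axioms)
```

Here the check expands the polynomial into monomials, which is exponential in general; the point of writing certificates as noncommutative formulas is that the same two identities can instead be verified by a deterministic PIT algorithm.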
Note: (i) It is important to note that identities 1 and 2 in Definition 1.2 are formal identities between noncommutative polynomials. It is possible to show that without the commutator axioms the system becomes incomplete in the sense that there will be unsatisfiable systems of noncommutative polynomials $F_1(\overline{x}) = \cdots = F_m(\overline{x}) = 0$ (where the $F_i$’s include the Boolean and commutator axioms) for which there are no noncommutative IPS refutations.
(ii) In order to prove in noncommutative IPS that a system of commutative polynomial equations $\{F_i = 0\}$ (where each $F_i$ is expressed as an arithmetic formula) has no common roots, we write each $F_i$ as a noncommutative formula (in some way; note that there is no unique way to do this).
The main result of this paper is that the noncommutative IPS (over either $\mathbb{Q}$ or $\mathbb{Z}_q$, for any prime $q$) polynomially simulates Frege; and conversely, Frege quasipolynomially simulates the noncommutative IPS (over $GF(2)$). We explain the results in what follows.
Noncommutative IPS simulates Frege
For the purpose of the next theorem we use a standard translation of propositional formulas into noncommutative arithmetic formulas:
Definition 1.3 ().
Let , for variables ; ; ; and by induction on the size of the propositional formula: ; and finally .
For a noncommutative formula denote by the noncommutative polynomial computed by . Thus, is a propositional tautology iff for every 0-1 assignment to the variables of the noncommutative polynomial.
Theorem 1.4 (First main theorem).
Let $\mathbb{F}$ be either the rational numbers $\mathbb{Q}$ or $\mathbb{Z}_q$, for a prime $q$. The noncommutative IPS refutation system, when refutations are written as noncommutative formulas over $\mathbb{F}$, polynomially simulates the Frege system. More precisely, for every propositional tautology $T$, if $T$ has a polynomial-size Frege proof then there is a noncommutative IPS certificate (over $\mathbb{F}$) for the translation of $\neg T$ (Definition 1.3) that has a polynomial noncommutative formula size.
The fact that an arithmetic formula (or circuit) in the form of the IPS can simulate a propositional Frege proof was shown in [GP14]. The noncommutative IPS, on the other hand, is much more restrictive than the original (commutative) IPS: instead of using commutative polynomials (written as arithmetic formulas) we now use noncommutative polynomials (written as noncommutative arithmetic formulas). And as mentioned above, in order to maintain the completeness of the noncommutative IPS we must add the commutator axioms to the system. Thus, the question arises: how can we still polynomially simulate Frege in this restrictive framework? The answer to this, which also constitutes one of the main observations of the simulation, is that the commutator axioms are already used implicitly in propositional Frege proofs: every classical propositional calculus system has some (possibly implicit) structural rules that enable one to commute ANDs and ORs (e.g., $A \land B$ is not the same formula as $B \land A$, from the perspective of the propositional calculus). In other words, Frege proofs operate with formulas as purely syntactic terms, and thus commutativity of AND and OR is not free for Frege proofs.
We now sketch in more detail the proof of Theorem 1.4. To simulate Frege proofs we use an intermediate proof system $\mathcal{F}$-PC (standing for “formula polynomial calculus”) introduced by Grigoriev and Hirsch [GH03]. The $\mathcal{F}$-PC proof system (Definition 2.8) can be thought of as a simple variant of the well-studied polynomial calculus (PC) system in which polynomials are written as arithmetic formulas (instead of sums of monomials as in PC).
Recall that a PC-refutation, as introduced by Clegg, Edmonds and Impagliazzo [CEI96], is simply a sequence of polynomials written as sums of monomials, where each polynomial is either taken from the initial unsatisfiable set of polynomials or was derived using two algebraic rules: from a pair of previously derived polynomials $f$ and $g$, derive $\alpha f + \beta g$ (for $\alpha, \beta$ field elements); and from a previously derived $f$, derive $x_i \cdot f$, for any variable $x_i$. The $\mathcal{F}$-PC proof system makes the following two changes to PC (turning it into a provably much stronger system):

1. every polynomial in an $\mathcal{F}$-PC proof is written as an arithmetic formula (instead of as a sum of monomials) and is treated as a purely syntactic object (like in Frege); and

2. we can derive new polynomials either by the two aforementioned PC rules, or by local rewriting rules operating on any subformula and expressing simple operations on polynomials (such as commutativity of addition and product, associativity, distributivity, etc.).
Grigoriev and Hirsch [GH03] showed that $\mathcal{F}$-PC polynomially simulates Frege proofs, and that for tree-like Frege proofs the polynomial simulation yields tree-like $\mathcal{F}$-PC proofs. Since tree-like Frege is polynomially equivalent to Frege, because Frege proofs can always be balanced to a depth that is logarithmic in their size (cf. [Kra95] for a proof), we get that tree-like $\mathcal{F}$-PC polynomially simulates (dag-like) Frege proofs.
Therefore, to conclude Theorem 1.4 it suffices to prove that the noncommutative IPS polynomially simulates tree-like $\mathcal{F}$-PC proofs. To do this, loosely speaking, we construct the noncommutative formula tree according to the structure of the tree-like $\mathcal{F}$-PC proof, line by line.
Now, since we write refutations as noncommutative formulas we can use the polynomial-time deterministic Polynomial Identity Testing (PIT) algorithm for noncommutative formulas, devised by Raz and Shpilka [RS05], to check in deterministic polynomial time the correctness of noncommutative IPS refutations:
Corollary 1.5.
The noncommutative IPS is a sound and complete refutation system in the sense of Cook-Reckhow [CR79]. That is, it is a sound and complete refutation system for unsatisfiable propositional formulas in which refutations can be checked for correctness in deterministic polynomial time.
This should be contrasted with the original (commutative) IPS of [GP14], for which verification of refutations is done in probabilistic polynomial time using the standard Schwartz-Zippel [Sch80, Zip79] PIT algorithm.
The major consequence of Theorem 1.4 is that to prove a superpolynomial Frege lower bound it suffices to prove a superpolynomial lower bound on noncommutative formulas computing certain polynomials. Specifically, it is enough to prove that any noncommutative IPS certificate (which is simply a noncommutative polynomial) has a superpolynomial noncommutative formula size; put differently, it suffices to show that any such certificate must have a superpolynomial total rank according to the associated partial-derivative matrices in the sense of Nisan [Nis91], as discussed before.
Frege simulates noncommutative IPS
We shall prove that Frege simulates the noncommutative IPS for CNFs (this is the case considered in [GP14]), over $GF(2)$ and with only a quasipolynomial increase in size (and for some specific cases the simulation can become polynomial).
It will be convenient to use a translation of clauses to noncommutative formulas which is slightly different from Definition 1.3:
Definition 1.6 ( and ).
Given a Boolean formula we define its noncommutative formula translation as follows. Let and , for a variable. Let ; ; and (where the sequence of products stands for a (balanced) fanin two tree of product gates with on the leaves). Further, for a CNF , denote by the noncommutative formula translation of the clause .
Note that this way, the system of equations is unsatisfiable iff is unsatisfiable.
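The following Python sketch illustrates the kind of clause translation described above, under an explicitly assumed convention: a positive literal $x$ is mapped to $1 - x$, a negative literal $\neg x$ is mapped to $x$, and a clause is mapped to the product of its literals' translations, so that a clause evaluates to 0 exactly when it is satisfied. This convention, the DIMACS-style clause encoding, and the function names are assumptions of the example, not a quotation of Definition 1.6. The brute-force loop checks, on two tiny CNFs, that the clause equations have a common 0-1 root iff the CNF is satisfiable.

```python
from itertools import product


def clause_value(clause, assignment):
    """Evaluate the assumed translation of a clause at a 0-1 assignment.
    clause: list of nonzero ints (3 means x3, -3 means NOT x3).
    The product is 0 iff at least one literal of the clause is satisfied."""
    value = 1
    for lit in clause:
        x = assignment[abs(lit)]
        value *= (1 - x) if lit > 0 else x
    return value


def cnf_satisfiable(cnf, n):
    """Brute-force Boolean satisfiability over n variables."""
    for bits in product((0, 1), repeat=n):
        a = dict(enumerate(bits, start=1))
        if all(any((a[abs(l)] == 1) == (l > 0) for l in clause) for clause in cnf):
            return True
    return False


def system_has_common_root(cnf, n):
    """Is there a 0-1 assignment on which every clause polynomial vanishes?"""
    for bits in product((0, 1), repeat=n):
        a = dict(enumerate(bits, start=1))
        if all(clause_value(clause, a) == 0 for clause in cnf):
            return True
    return False


sat_cnf = [[1, 2], [-1, 2]]              # satisfied by x2 = 1
unsat_cnf = [[1], [-1, 2], [-2]]         # x1, x1 -> x2, NOT x2

for cnf in (sat_cnf, unsat_cnf):
    print(cnf_satisfiable(cnf, 2), system_has_common_root(cnf, 2))
```

On 0-1 inputs the order of multiplication is irrelevant, so plain integer arithmetic suffices for this check even though the translated objects are noncommutative formulas.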
Theorem 1.7 (Second main theorem).
Let be a CNF and let denote the corresponding noncommutative formulas for the clauses of . If there is a noncommutative IPS refutation of size of over , then there is a Frege proof of size of the tautology .
Note: The proof of Theorem 1.7 achieves in fact a slightly stronger simulation than stated. That is, our simulation shows that if the degree of the noncommutative IPS refutation is and its formula depth is , then there is a Frege proof of with size . And in particular, Frege polynomially simulates noncommutative IPS refutations of degrees (for the number of variables in the CNF). However, for simplicity we shall always assume that the depth of the noncommutative IPS formula is logarithmic in its size (Lemma 4.2 shows that we can always balance noncommutative formulas), and so explicitly we only deal with the case where and .
The proof of Theorem 1.7 consists of several separate steps of independent interest. From the logical point of view, the argument is a short Frege proof of a reflection principle for the noncommutative IPS system. A reflection principle for a given proof system is a statement that says that if there exists a proof of a formula $F$ then $F$ is also true. The argument becomes rather complicated because we need to prove properties of the PIT algorithm for noncommutative formulas devised by Raz and Shpilka [RS05] within the restrictive framework of propositional Frege proofs.
Our goal is then to prove in Frege, given a noncommutative IPS refutation of .
Step 1: balancing. We first balance the noncommutative IPS refutation, so that its depth is logarithmic in its size. We observe that the recent construction of Hrubeš and Wigderson [HW14] for balancing noncommutative formulas with division gates (incurring at most a polynomial increase in size) results in a division-free formula when the initial noncommutative formula is itself division-free. Therefore, we can assume that the noncommutative IPS certificate is already balanced (this step is independent of the Frege system).
Step 2: Booleanization. We then consider our balanced refutation, which is a noncommutative polynomial identity over $GF(2)$, as a Boolean tautology, by replacing plus gates with XORs and product gates with ANDs.
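The Booleanization step can be illustrated on a toy identity. In the Python sketch below, a formula is a nested tuple; it is evaluated once arithmetically modulo 2 and once as a Boolean formula in which plus gates are read as XOR and product gates as AND. The tuple encoding, the function names and the example identity $x(y+1) + xy + x = 0$ are assumptions made for this illustration.

```python
from itertools import product

# Formula encoding (an assumption of this sketch):
#   ("var", name), ("const", 0 or 1), ("+", left, right), ("*", left, right)


def eval_gf2(f, a):
    """Evaluate the formula arithmetically over GF(2)."""
    tag = f[0]
    if tag == "var":
        return a[f[1]] % 2
    if tag == "const":
        return f[1] % 2
    left, right = eval_gf2(f[1], a), eval_gf2(f[2], a)
    return (left + right) % 2 if tag == "+" else (left * right) % 2


def eval_bool(f, a):
    """Evaluate the Booleanized formula: '+' becomes XOR, '*' becomes AND."""
    tag = f[0]
    if tag == "var":
        return bool(a[f[1]])
    if tag == "const":
        return bool(f[1])
    left, right = eval_bool(f[1], a), eval_bool(f[2], a)
    return (left != right) if tag == "+" else (left and right)


x, y, one = ("var", "x"), ("var", "y"), ("const", 1)

# x*(y+1) + x*y + x, which is the zero polynomial over GF(2)
# (the order of variables in each product is the same on both sides).
identity = ("+", ("+", ("*", x, ("+", y, one)), ("*", x, y)), x)

results = []
for bits in product((0, 1), repeat=2):
    a = {"x": bits[0], "y": bits[1]}
    results.append((eval_gf2(identity, a), eval_bool(identity, a)))
print(results)
```

Since the formula is identically zero over $GF(2)$, its Booleanized version is false on every assignment, so its negation is a tautology; the remaining steps are about giving such tautologies short Frege proofs.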
Step 3: reflection principle. We use a reflection principle to reduce the task of efficiently proving the target tautology in Frege to the following task: show that any noncommutative formula identity over $GF(2)$, considered as a Boolean tautology, has a short Frege proof.
Step 4: homogenization. This is the only step that is responsible for the quasipolynomial size increase in Theorem 1.7. More precisely, this increase in size depends on the fact that for the purpose of establishing short Frege proofs for all noncommutative polynomial identities over $GF(2)$ (considered as Boolean tautological formulas) it is important that the formulas are written as a sum of homogeneous noncommutative formulas.
Note that it is not known whether arithmetic formulas can be turned into a (sum of) homogeneous formulas with only a polynomial increase in size (in contrast to the standard efficient homogenization of arithmetic circuits by Strassen [Str73] that does allow such a conversion). Nevertheless, Strassen’s standard procedure enables us to transform any polynomial-size arithmetic formula into a sum of homogeneous formulas with only a quasipolynomial increase in size: any formula of polynomial size $s$ computing a polynomial $f$ (and thus the degree of $f$ is also polynomial) can be transformed into a sum of homogeneous formulas, each of size quasipolynomial in $s$ and computing the corresponding homogeneous part of $f$. (One can show that the same also holds for noncommutative formulas.)
For the purpose of establishing a quasipolynomial simulation of noncommutative IPS by Frege, it is sufficient to use Strassen’s original homogenization procedure (as simulated inside Frege; cf. [HT12]). However, as the note after Theorem 1.7 indicates, we show a slightly stronger simulation result, using an efficient Frege simulation of a recent result due to Raz [Raz13], who showed how to transform an arithmetic formula into (a sum of) homogeneous formulas in a manner which is more efficient than Strassen [Str73]. Specifically, in Lemma 4.8 we show that:

The same construction in [Raz13] also holds for noncommutative formulas;

This construction for noncommutative formulas can be carried out efficiently inside Frege. That is, if is a noncommutative formula of size and depth computing a homogenous noncommutative polynomial over of degree , then there exists a syntactic homogenous noncommutative formula computing the same polynomial and with size , such that Frege admits a proof of of size polynomial (in ).
Step 5: short proofs for homogeneous noncommutative identities. We have reduced our task to showing that every noncommutative formula identity over $GF(2)$ (considered as a tautology) has a short Frege proof, and we have also agreed to first turn (inside Frege) our noncommutative identities into homogeneous formulas (incurring up to a quasipolynomial increase in the formulas' size). It remains only to show how to efficiently prove homogeneous noncommutative identities in Frege. (Formally, we shall in fact deal with syntactic homogeneous formulas.)
To this end we essentially construct an efficient Frege proof of the correctness of the Raz and Shpilka PIT algorithm for noncommutative formulas [RS05]. This PIT algorithm uses some basic linear algebraic concepts that might be beyond the efficient-reasoning strength of Frege. However, since we only need to show the existence of short Frege proofs for the PIT algorithm’s correctness, we can supply witnesses for the desired linear algebraic objects needed in the proof (these witnesses will be a sequence of linear transformations).
A bigger obstacle is that it seems impossible to reason directly inside Frege about the algorithm of [RS05], since this algorithm first converts a noncommutative formula into an algebraic branching program (ABP); but the evaluation of ABPs (apparently) cannot be done with Boolean formulas (and accordingly Frege (apparently) cannot reason about the evaluation of ABPs). The reason for this apparent inability of Frege to reason efficiently about the evaluation of ABPs is that an ABP is a slightly more “sequential” object than a formula: an evaluation of an ABP with $d$ layers can be done by an iterated matrix multiplication of $d$ matrices, which is known to be doable with quasipolynomial-size formulas (or polynomial-size circuits of polylogarithmic depth), while Frege is a system that operates with formulas. To overcome this obstacle we show how to perform Raz and Shpilka’s PIT algorithm directly on noncommutative formulas, without first converting the formulas into ABPs. This technical contribution takes quite a large part of the argument (Sec. 4.7).
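To illustrate why ABP evaluation is naturally "sequential", the Python sketch below evaluates a small layered ABP at a numeric point by iterated vector-matrix multiplication, one layer at a time. The encoding (each layer is a matrix whose entries are linear forms given as dictionaries from variable names to coefficients) and the example ABP computing $x_1 x_2 + x_2 x_1$ are assumptions for this illustration; evaluating at commuting numbers only shows the shape of the computation, not the noncommutative polynomial itself.

```python
def eval_linear_form(form, point):
    """form: {variable name: coefficient}; point: {variable name: number}."""
    return sum(coef * point[var] for var, coef in form.items())


def eval_abp(layers, point):
    """Evaluate a layered ABP with a single source and a single sink.
    layers[k][i][j] is the linear form labelling the edge from node i of
    layer k to node j of layer k+1. The loop performs one vector-matrix
    product per layer, so the work is inherently done layer after layer."""
    vec = [1]  # value carried at the single source node
    for layer in layers:
        width_out = len(layer[0])
        vec = [
            sum(vec[i] * eval_linear_form(layer[i][j], point) for i in range(len(vec)))
            for j in range(width_out)
        ]
    return vec[0]


# ABP of width 2 computing x1*x2 + x2*x1:
#   source --x1--> a --x2--> sink
#   source --x2--> b --x1--> sink
layers = [
    [[{"x1": 1}, {"x2": 1}]],   # 1 x 2 matrix of linear forms
    [[{"x2": 1}], [{"x1": 1}]], # 2 x 1 matrix of linear forms
]

value = eval_abp(layers, {"x1": 3, "x2": 2})
print(value)
```

The middle layer here has two nodes, in line with the rank-two degree-1 partial derivative matrix of $x_1 x_2 + x_2 x_1$.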
We are finally able to prove the following statement, which might be of independent interest:
Theorem 1.8.
If a noncommutative homogeneous formula over of size is identically zero, then the corresponding Boolean formula (where results by replacing with XOR and with AND in ) can be proved with a Frege proof of size at most .
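The translation in the theorem can be illustrated with a small Python sketch, assuming the field is $GF(2)$ (which is what the XOR/AND replacement suggests). Formulas are encoded as nested tuples `(op, left, right)`; this encoding and the helper names are hypothetical. The sketch only checks by brute force that the Boolean translation of an identically-zero formula is unsatisfiable; it says nothing about the size of Frege proofs.

```python
from itertools import product

def bool_value(f, a):
    """Evaluate the Boolean translation of f: '+' becomes XOR, '*' becomes AND."""
    if isinstance(f, str):                    # variable leaf
        return bool(a[f])
    if isinstance(f, int):                    # constant leaf, read modulo 2
        return f % 2 == 1
    op, left, right = f
    x, y = bool_value(left, a), bool_value(right, a)
    return (x != y) if op == "+" else (x and y)

def is_unsatisfiable(f, variables):
    """True if the Boolean translation of f is false under every 0-1 assignment."""
    return not any(bool_value(f, dict(zip(variables, bits)))
                   for bits in product([0, 1], repeat=len(variables)))

# (x + y)*z + (x*z + y*z) is the zero polynomial over GF(2), also noncommutatively
zero_formula = ("+", ("*", ("+", "x", "y"), "z"),
                     ("+", ("*", "x", "z"), ("*", "y", "z")))
```

Note that the converse direction fails in general: for example $xy + yx$ is a nonzero noncommutative polynomial over $GF(2)$ although its Boolean translation is unsatisfiable.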
1.4 Comparison with previous work
Our main characterization of the Frege system is based on a noncommutative version of the IPS system from Grochow and Pitassi [GP14]. As described above, the noncommutative IPS gives a tighter characterization than the (commutative) IPS in [GP14], and comes close to capturing the Frege system tightly.
In the original (formula version of the) IPS, proofs are arithmetic formulas, and thus any super-polynomial lower bound on IPS refutations implies , or in other words, that the permanent does not have polynomial-size arithmetic formulas (Joshua Grochow [personal communication]). This shows that IPS lower bounds will be considerably difficult to obtain. For the noncommutative IPS, on the other hand, we face a seemingly much more favourable situation: an exponential-size lower bound on noncommutative IPS gives only a corresponding lower bound on noncommutative formulas, for which exponential-size lower bounds are already known [Nis91]. In other words, exponential-size lower bounds on Frege imply merely (at least in the context of the Ideal Proof System) corresponding lower bounds on noncommutative formulas, which are already known. In view of this, there seems to be no strong concrete justification for believing that Frege lower bounds are beyond current techniques.
Let us also mention the work in [Tza11] that dealt with propositional proof systems over noncommutative formulas. In [Tza11] the choice was made to define all proof systems as polynomial calculus-style systems in which proof-lines are written as noncommutative formulas (as well as the more restricted class of ordered formulas). This meant that the characterization of a proof system in terms of a single noncommutative polynomial is lacking from that work (as are the consequences we obtain in the current work).
2 Preliminaries
For a positive natural number $n$ we use the standard notation $[n]$ for $\{1,\dots,n\}$.
Definition 2.1 (Boolean formulas).
Given a set of input variables $\overline{x} = \{x_1,\dots,x_n\}$, a Boolean formula on the input variables $\overline{x}$ is a rooted finite tree of fan-in at most 2, with edges directed from leaves to the root. We consider the edges coming into nodes as ordered. (Footnote 9: This is not important in general, but for Frege proofs it is in fact implicit that propositional formulas are ordered.) Internal nodes are labeled with the Boolean gates OR, AND and NOT, denoted $\lor, \land, \neg$, respectively, where the fan-in of $\lor$ and $\land$ is two and the fan-in of $\neg$ is one. The leaves are labeled either with input variables or with $0, 1$ (identified with the truth values false and true, resp.). The entire formula computes the function computed by the gate at the root. Given a formula $F$, the size of the formula is the number of Boolean gates in $F$, denoted $|F|$.
Given a pair of Boolean formulas $A$ and $B$ over the variables $x_1,\dots,x_n$, we denote by $A[B/x_i]$ the formula in which every occurrence of $x_i$ in $A$ is substituted by the formula $B$.
We use the symbol to denote logical equivalence and we use the symbol to denote .
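As an illustration of Definition 2.1 and of substitution, the following Python sketch represents a Boolean formula as an ordered tree and implements size (the number of gates), substitution of a formula for a variable, and evaluation. The class `Node` and the string labels are hypothetical choices made for this sketch.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Node:
    label: str                       # "or", "and", "not", a variable name, "0" or "1"
    children: Tuple["Node", ...] = ()  # ordered; empty for leaves

def size(f: Node) -> int:
    """Number of Boolean gates (internal nodes) of the formula."""
    return (1 if f.children else 0) + sum(size(c) for c in f.children)

def substitute(f: Node, var: str, g: Node) -> Node:
    """Replace every leaf labeled `var` in f by the formula g."""
    if not f.children:
        return g if f.label == var else f
    return Node(f.label, tuple(substitute(c, var, g) for c in f.children))

def evaluate(f: Node, assignment: dict) -> bool:
    """Value computed at the root under a 0-1 assignment to the variables."""
    if not f.children:
        if f.label in ("0", "1"):
            return f.label == "1"
        return bool(assignment[f.label])
    vals = [evaluate(c, assignment) for c in f.children]
    if f.label == "not":
        return not vals[0]
    return (vals[0] or vals[1]) if f.label == "or" else (vals[0] and vals[1])

x1, x2, x3 = Node("x1"), Node("x2"), Node("x3")
A = Node("or", (x1, Node("not", (x2,))))     # x1 OR (NOT x2)
B = Node("and", (x3, x1))                    # x3 AND x1
A_sub = substitute(A, "x2", B)               # x1 OR NOT (x3 AND x1)
```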
2.1 The Frege proof system
As outlined in the introduction, a Frege proof system is any standard propositional proof system for proving propositional tautologies having finitely many axiom schemes and deduction rules, and where prooflines are written as Boolean formulas. The size of a Frege proof is the number of symbols it takes to write down the proof, namely the total of all the formula sizes appearing in the proof. Let us define Frege proofs in a more formal way.
Definition 2.2 (Frege (derivation) rule).
A Frege rule is a sequence of propositional formulas , for , written as . In case , the Frege rule is called an axiom scheme. A formula is said to be derived by the rule from if are all substitution instances of , for some assignment to the variables (that is, there are formulas such that , for all ). The Frege rule is said to be sound if whenever an assignment satisfies the formulas above the line, then it also satisfies the formula below the line.
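Soundness of a single rule is a finite check over truth assignments. The sketch below, under the assumption that formulas are encoded as nested tuples and that an implication connective `"imp"` is available (Frege systems may use different sets of connectives), tests soundness by brute force. The function names are hypothetical.

```python
from itertools import product

def ev(f, a):
    """Evaluate a formula given as a variable name or a tuple (op, args...)."""
    if isinstance(f, str):
        return a[f]
    op, *args = f
    vals = [ev(x, a) for x in args]
    if op == "not":
        return not vals[0]
    if op == "and":
        return vals[0] and vals[1]
    if op == "or":
        return vals[0] or vals[1]
    if op == "imp":
        return (not vals[0]) or vals[1]
    raise ValueError(op)

def variables(f):
    if isinstance(f, str):
        return {f}
    return set().union(*(variables(x) for x in f[1:]))

def is_sound(premises, conclusion):
    """True if every assignment satisfying all premises satisfies the conclusion."""
    vs = sorted(set().union(variables(conclusion), *(variables(p) for p in premises)))
    for bits in product([False, True], repeat=len(vs)):
        a = dict(zip(vs, bits))
        if all(ev(p, a) for p in premises) and not ev(conclusion, a):
            return False
    return True

modus_ponens = is_sound(["p", ("imp", "p", "q")], "q")
axiom_scheme = is_sound([], ("imp", "p", ("imp", "q", "p")))   # no premises
bad_rule = is_sound(["q", ("imp", "p", "q")], "p")
```

A rule with no premises (an axiom scheme) is sound exactly when its conclusion is a tautology, which is the second case above.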
Definition 2.3 (Frege proof).
Given a set of Frege rules, a Frege proof is a sequence of Boolean formulas such that every formula is either an axiom or was derived by one of the given Frege rules from previous formulas. If the sequence terminates with the Boolean formula , then the proof is said to be a proof of . The size of a Frege proof is the sum of all formula sizes in the proof.
A proof system is said to be sound if it admits proofs of only tautologies. A proof system is said to be implicationally complete if for every set of formulas $T$ and every formula $F$, if $T$ semantically implies $F$, then there is a proof of $F$ using (possibly) axioms from $T$.
Definition 2.4 (Frege proof system).
Given a set of sound Frege rules, we say that is a Frege proof system if is implicationally complete.
Note that a Frege proof is always sound since the Frege rules are assumed to be sound. Frege is also complete (that is, it can prove all tautologies), by implicational completeness. We do not need to work with a specific Frege proof system, since a basic result in proof complexity by Reckhow [Rec76] states that any two Frege proof systems, even with different propositional connectives, are polynomially equivalent. For concreteness the reader can think of Shoenfield’s system from the introduction, noting it is indeed a Frege system.
The problem of demonstrating superpolynomial size lower bounds on propositional Frege proofs asks whether there is a family of propositional tautological formulas for which there is no polynomial such that the minimal Frege proof size of is at most , for all .
2.2 Preliminary algebraic models of computation and proofs
Here we define arithmetic formulas (both commutative and noncommutative) as well as the algebraic propositional proof system Polynomial Calculus over Formulas (F-PC) introduced by Grigoriev and Hirsch [GH03].
Definition 2.5 (Noncommutative formula).
Let $\mathbb{F}$ be a field and $x_1, x_2, \dots$ be (algebraic) variables. A noncommutative arithmetic formula (or noncommutative formula for short) is a finite (ordered) labeled tree, with edges directed from the leaves to the root, and with fan-in at most two, such that there is an order on the edges coming into a node: the first edge is called the left edge and the second one the right edge. Every leaf of the tree (namely, a node of fan-in zero) is labeled either with an input variable or a field element. Every other node of the tree is labeled either with $+$ or $\times$ (in the first case the node is a plus gate and in the second case a noncommutative product gate). We assume that there is only one node of out-degree zero, called the root.
A noncommutative formula computes a noncommutative polynomial in $\mathbb{F}\langle x_1,\dots,x_n\rangle$ in the following way. A leaf computes the input variable or field element that labels it. A plus gate computes the sum of polynomials computed by its incoming nodes. A product gate computes the noncommutative product of the polynomials computed by its incoming nodes according to the order of the edges. (Subtraction is obtained using the constant $-1$.) The output of the formula is the polynomial computed at the root. The depth of a formula is the maximal length of a path from the root to a leaf. The size of a noncommutative formula $F$ is the total number of internal nodes (i.e., all nodes except the leaves) in its underlying tree, and is denoted similarly to the Boolean case by $|F|$.
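The way a formula computes a noncommutative polynomial can be sketched in Python by representing a polynomial as a map from words (tuples of variable names, so that order matters) to coefficients. Integer coefficients stand in for the field here, and the tuple encoding `(op, left, right)` of formulas is an assumption of the sketch.

```python
from collections import defaultdict

def poly_add(p, q):
    r = defaultdict(int)
    for m, c in list(p.items()) + list(q.items()):
        r[m] += c
    return {m: c for m, c in r.items() if c != 0}

def poly_mul(p, q):
    r = defaultdict(int)
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            r[m1 + m2] += c1 * c2        # concatenation keeps the order of factors
    return {m: c for m, c in r.items() if c != 0}

def compute(f):
    """f is a variable name, an integer constant, or (op, left, right) with op in {'+', '*'}."""
    if isinstance(f, str):
        return {(f,): 1}
    if isinstance(f, int):
        return {(): f} if f != 0 else {}
    op, left, right = f
    l, r = compute(left), compute(right)
    return poly_add(l, r) if op == "+" else poly_mul(l, r)

def size(f):
    """Number of internal nodes of the formula tree."""
    return 0 if isinstance(f, (str, int)) else 1 + size(f[1]) + size(f[2])

def commutative_image(p):
    """Forget the order of the variables inside each monomial."""
    r = defaultdict(int)
    for m, c in p.items():
        r[tuple(sorted(m))] += c
    return {m: c for m, c in r.items() if c != 0}

# x*y - y*x, with subtraction written using the constant -1
commutator = ("+", ("*", "x", "y"), ("*", -1, ("*", "y", "x")))
commutator_poly = compute(commutator)
```

The commutator is a nonzero noncommutative polynomial, while its commutative image is zero, which is exactly the difference between Definition 2.5 and the commutative notion defined next.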
The definition of (a commutative) arithmetic formula is almost identical:
Definition 2.6 ((Commutative) arithmetic formula).
An arithmetic formula is defined in a similar way to a noncommutative formula, except that we ignore the order of multiplication (that is, a product node does not have order on its children and there is no order on multiplication when defining the polynomial computed by a formula).
Substitutions of noncommutative formulas into other noncommutative formulas are defined and denoted similarly to substitutions in Boolean formulas.
Note that we consider arithmetic formulas as syntactic objects. For example, and are different formulas. Furthermore, in the proof system defined below they should be derived from each other via an explicit application of a rewrite rule.
2.2.1 Polynomial calculus over formulas
The polynomial calculus over formulas system, denoted F-PC, was introduced by Grigoriev and Hirsch [GH03]. This system operates with (commutative) arithmetic formulas (as purely syntactic terms). F-PC is a refutation system: an F-PC refutation establishes that a collection of polynomials has no 0-1 roots. We can also treat F-PC as a proof system for propositional tautologies: for every Boolean tautology $T$, (Definition 1.3) is a polynomial that does not have a 0-1 root, and therefore, an F-PC refutation of can be considered as an F-PC proof of the tautology $T$.
Definition 2.7 (Rewrite rule).
A rewrite rule is a pair of formulas $f, g$ denoted $f \to g$. Given a formula $\Phi$, an application of a rewrite rule $f \to g$ to $\Phi$ is the result of replacing at most one occurrence of $f$ in $\Phi$ by $g$ (that is, substituting a subformula $f$ inside $\Phi$ by the formula $g$). We write $f \leftrightarrow g$ to denote the pair of rewriting rules $f \to g$ and $g \to f$.
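A single application of a rewrite rule can be sketched as follows, assuming formulas are nested tuples `(op, left, right)` with string or integer leaves. The function takes one concrete instance of a rule (a left-hand side and a right-hand side) and replaces at most one occurrence; choosing the first occurrence in preorder is an arbitrary choice of this sketch.

```python
def rewrite_once(phi, lhs, rhs):
    """Replace the first occurrence (in preorder) of the subformula lhs in phi by rhs.

    Returns (new_formula, changed); if lhs does not occur, phi is returned unchanged.
    """
    if phi == lhs:
        return rhs, True
    if isinstance(phi, tuple):
        op, *kids = phi
        new_kids, changed = [], False
        for k in kids:
            if not changed:                  # stop after one replacement
                k, changed = rewrite_once(k, lhs, rhs)
            new_kids.append(k)
        return (op, *new_kids), changed
    return phi, False

xy, yx = ("*", "x", "y"), ("*", "y", "x")
phi = ("+", xy, xy)
rewritten, changed = rewrite_once(phi, xy, yx)   # only the left copy is rewritten
```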
Definition 2.8 (F-PC [GH03]).
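Fix a field $\mathbb{F}$. Let $F := \{f_1,\dots,f_m\}$ be a collection of formulas computing polynomials from $\mathbb{F}[x_1,\dots,x_n]$. (Footnote 10: Note here that we are talking about formulas (treated as syntactic terms). Also notice that all the formulas in F-PC are considered as commutative formulas computing (commutative) polynomials, though, because the formulas are merely syntactic terms we have an order on children of internal nodes, and in particular children of product gates are ordered.) Let the set of axioms be the following formulas:
 Boolean axioms: $x_i \cdot (1 - x_i)$, for all $1 \le i \le n$.
A sequence $\pi = (\Phi_1,\dots,\Phi_\ell)$ of formulas computing polynomials from $\mathbb{F}[x_1,\dots,x_n]$ is said to be an F-PC proof of $\Phi_\ell$ from $F$, if for every $i \in [\ell]$ we have one of the following:
(1) $\Phi_i = f_j$, for some $j \in [m]$;
(2) $\Phi_i$ is a Boolean axiom;
(3) $\Phi_i$ was deduced by one of the following inference rules from previous proof-lines $\Phi_j, \Phi_k$, for $j, k < i$:
 Product: from $\Phi$ derive $x_r \cdot \Phi$, for $r \in [n]$.
 Addition: from $\Phi$ and $\Theta$ derive $a \cdot \Phi + b \cdot \Theta$, for $a, b \in \mathbb{F}$.
(Where $\Phi$, $x_r \cdot \Phi$, $a \cdot \Phi$, $b \cdot \Theta$ are formulas constructed as displayed; e.g., $x_r \cdot \Phi$ is the formula with product gate at the root having the formulas $x_r$ and $\Phi$ as children.) (Footnote 11: In [GH03] the product rule of F-PC is defined so that one can derive $\Theta \cdot \Phi$ from $\Phi$, where $\Theta$ is any formula, and not just a variable. However, it is easy to show that the definition of F-PC in [GH03] and our Definition 2.8 polynomially simulate each other.)
(4) $\Phi_i$ was deduced from a previous proof-line $\Phi_j$, for $j < i$, by one of the following rewriting rules expressing the polynomial-ring axioms (where $f, g, h$ range over all arithmetic formulas computing polynomials in $\mathbb{F}[x_1,\dots,x_n]$):
 Zero rule: $0 \cdot f \leftrightarrow 0$
 Unit rule: $1 \cdot f \leftrightarrow f$
 Scalar rule: $t \leftrightarrow \alpha$, where $t$ is a formula containing no variables (only field elements) that computes the constant $\alpha \in \mathbb{F}$.
 Commutativity rules: $f + g \leftrightarrow g + f$, $f \cdot g \leftrightarrow g \cdot f$
 Associativity rule: $f + (g + h) \leftrightarrow (f + g) + h$, $f \cdot (g \cdot h) \leftrightarrow (f \cdot g) \cdot h$
 Distributivity rule: $f \cdot (g + h) \leftrightarrow (f \cdot g) + (f \cdot h)$
(The semantics of an F-PC proof-line $\Phi_i$ is the polynomial equation $\Phi_i = 0$.)
An F-PC refutation of $F$ is a proof of the formula $1$ from $F$. The size of an F-PC proof $\pi$ is defined as the total size of all formulas in $\pi$ and is denoted by $|\pi|$.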
Definition 2.9 (Tree-like F-PC).
A system F-PC is a tree-like F-PC if every derived arithmetic formula in the proof system is used only once (and if it is needed again, it must be derived once more).
For the purpose of comparing the relative complexity of different proof systems we have the concept of a simulation. Specifically, we say that a propositional proof system polynomially simulates another propositional proof system if there is a polynomialtime computable function that maps proofs to proofs of the same tautologies (if and use different representations for tautologies, we fix a translation (such as ) from one representation to the other). In case is computable in time (for the inputsize), we say that simulates . Specifically, if we say the simulation is quasipolynomial. We say that and are polynomially equivalent in case polynomially simulates and polynomially simulates . (Our simulations will always be formally simulations, though we might not always state explicitly that the map , from proofs to proofs is efficiently computable, and only show the existence of a proof whose size is proportional to the corresponding proof.)
Tree-like F-PC polynomially simulates Frege.
Grigoriev and Hirsch showed the following:
Theorem 2.10 ([GH03]).
Tree-like F-PC polynomially simulates Frege. More precisely, for every propositional tautology $T$, if $T$ has a polynomial-size Frege proof then there is a polynomial-size tree-like F-PC proof of $T$ (over $GF(p)$, for $p$ a prime, or $\mathbb{Q}$).
Let us briefly explain how Grigoriev and Hirsch [GH03] obtained a simulation of Frege by tree-like F-PC (in contrast to simply (dag-like) F-PC), as this is not an entirely trivial result (and it is, in turn, important for understanding our simulation). Indeed, this simulation depends crucially on a somewhat surprising result of Krajíček, who showed that tree-like Frege and (dag-like) Frege are polynomially equivalent [Kra95]:
Theorem ([Kra95]).
Treelike Frege proofs polynomially simulate Frege proofs.
Grigoriev and Hirsch show (Theorem 3 in [GH03]) that F-PC polynomially simulates Frege. Then, by inspection of this simulation, one can observe that tree-like Frege proofs are simulated by tree-like F-PC proofs (which is sufficient to conclude the simulation due to the theorem above), namely:
Lemma ([GH03]).
Tree-like F-PC polynomially simulates tree-like Frege.
3 Noncommutative ideal proof system polynomially simulates Frege
Here we show that the noncommutative IPS polynomially simulates Frege.
Theorem 3.1 (restatement of Theorem 1.4).
The noncommutative IPS refutation system (when refutations are written as noncommutative formulas) polynomially simulates the Frege system. More precisely, for every propositional tautology , if has a polynomialsize Frege proof then there is a noncommutative IPS refutation of (over for a prime , or ) of polynomial size.
Recall that Raz and Shpilka [RS05] gave a deterministic polynomialtime PIT algorithm for noncommutative formulas (over any field):
Theorem 3.2 (PIT for noncommutative formulas [RS05]).
There is a deterministic polynomial-time algorithm that decides whether a given noncommutative formula over a field $\mathbb{F}$ computes the zero polynomial $0$. (Footnote 12: We assume here that the elements of $\mathbb{F}$ have an efficient representation and the field operations are efficiently computable (e.g., the field of rationals).)
Now, since we write refutations as noncommutative formulas we can use the theorem above to check in deterministic polynomialtime the correctness of noncommutative IPS refutations, obtaining:
Corollary 3.3 (restatement of Corollary 1.5).
The noncommutative IPS is a sound and complete Cook-Reckhow refutation system. That is, it is a sound and complete refutation system for unsatisfiable propositional formulas in which refutations can be checked for correctness in deterministic polynomial time.
To prove Theorem 3.1, we will show in Section 3.1 that the noncommutative IPS polynomially simulates tree-like F-PC (Definition 2.8), which suffices to complete the proof, due to Theorem 2.10.
3.1 Noncommutative IPS polynomially simulates tree-like F-PC
For convenience, let denote the commutator axiom , for , and let denote the vector of all the axioms. When we write where are formulas (e.g., and , resp.), we mean .
Theorem 3.4.
Noncommutative IPS polynomially simulates treelike (Definition 2.8). Specifically, if is a treelike proof of a tautology then there is a noncommutative IPS refutation of of size polynomial in .
Proof.
Let be arithmetic formulas over the variables . We denote by the vector . Since an arithmetic formula is a syntactic term in which the children of gates are ordered we can treat a (commutative) arithmetic formula as a noncommutative arithmetic formula by taking the order on the children of products gates to be the order of noncommutative multiplication.
Suppose has a size treelike refutation of the ’s (i.e., a proof of the polynomial from ), where each is an arithmetic formula. We construct a corresponding noncommutative IPS refutation of the ’s from this treelike refutation. The following lemma suffices for this purpose:
Lemma 3.5.
For every , there exists a noncommutative formula such that

;

;

, where are the indices of the prooflines involved in deriving .
For example, if is derived by and is derived by for some , then we say that are both involved in deriving . In other words, the lines involved in deriving a proofline are all the prooflines in the subtree of when we consider the underlying graph of the (treelike) proof as a tree.
Note that if the lemma holds, then is a noncommutative IPS proof because it has the property that and . And its size is bounded by
Proof.
We construct by induction on the length of the refutation . That is, for from to , we construct the noncommutative formula according to , as follows:
Base case: is an axiom for some .
Let . Obviously, and .
Induction step:
Case 1: is derived from the addition rule , for . Put where . Thus, and (where the right most inequality holds since is a treelike refutation and hence ).
Case 2: is derived from the product rule , for and . Put . Then and .
Case 3: is derived from , for , by a rewriting rule which is not the commutative rule of multiplication (). Let . The noncommutative trivially satisfies the properties claimed since all the rewriting rules (excluding the commutative rule of multiplication) express the noncommutative polynomialring axioms, and thus cannot change the polynomial computed by a noncommutative formula. And .
Case 4: is derived from , for , by a single application of the commutative rule of multiplication. Then by Lemma 3.6 below, we can construct a noncommutative formula such that satisfies the desired properties (stated in Lemma 3.5).
∎
Lemma 3.6.
Let be noncommutative formulas, such that can be derived from via the commutative rule of multiplication . Then there is a noncommutative formula in variables such that:

;

;

.
Proof.
We define the noncommutative formula inductively as follows:

If , and , then is defined to be the formula constructed in Lemma 3.7 below.

If , .
Case 1: If , then let .
Case 2: If , then let .

If , .
Case 1: If , then let .
Case 2: If , then let .
By induction, the construction satisfies the desired properties. ∎
Lemma 3.7.
For any pair of two noncommutative formulas there exists a noncommutative formula in variables such that:

;

;

.
Proof.
Let denote the smallest size of satisfying the above properties. We will show that by induction on .
Base case: .
In this case both and are constants or variables, thus .
In the following induction step, we consider the case where (which is symmetric for the case ).
Induction step: Assume that .
Case 1: The root of is addition.
Let . We have (after rearranging):
By induction hypothesis, we have .
Case 2: The root of is a product gate.
Let . By rearranging:
By induction hypothesis, we have . ∎
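The two rearrangements used in the induction step are not displayed in this text. One natural reading, which is an assumption of this sketch rather than a quotation from the proof, is the standard pair of commutator identities $[f_1 + f_2, g] = [f_1, g] + [f_2, g]$ and $[f_1 f_2, g] = f_1 [f_2, g] + [f_1, g] f_2$, where $[f, g] = fg - gf$. The Python sketch below checks both identities on sample noncommutative polynomials (integer coefficients, words as tuples).

```python
from collections import defaultdict

def add(p, q, sign=1):
    """p + sign*q for noncommutative polynomials given as {word: coefficient}."""
    r = defaultdict(int)
    for m, c in p.items():
        r[m] += c
    for m, c in q.items():
        r[m] += sign * c
    return {m: c for m, c in r.items() if c != 0}

def mul(p, q):
    r = defaultdict(int)
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            r[m1 + m2] += c1 * c2        # word concatenation: order matters
    return {m: c for m, c in r.items() if c != 0}

def comm(p, q):
    """The commutator pq - qp."""
    return add(mul(p, q), mul(q, p), sign=-1)

def var(name):
    return {(name,): 1}

f1 = add(var("x"), mul(var("y"), var("z")))      # x + yz
f2 = mul(var("z"), var("x"))                     # zx
g = add(var("y"), var("x"))                      # y + x

# Case 1 (root of f is addition) and Case 2 (root of f is a product gate)
sum_case = comm(add(f1, f2), g) == add(comm(f1, g), comm(f2, g))
product_case = comm(mul(f1, f2), g) == add(mul(f1, comm(f2, g)), mul(comm(f1, g), f2))
```

Both identities reduce a commutator of a larger formula to commutators of its two children, which is what allows the induction on the formula size to go through.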
4 Frege quasipolynomially simulates noncommutative IPS
In this long section we prove Theorem 4.1, stating that the Frege system quasi-polynomially simulates the noncommutative IPS (over $GF(2)$). Together with Theorem 3.1, this gives a new characterization (up to a quasi-polynomial increase in size) of propositional Frege proofs as noncommutative arithmetic formulas.
We use the notation in Section 1.3.3 as follows: for a clause in a CNF , we denote by the noncommutative formula translation of the clause (Definition 1.6). Thus, translates to , translates to and translates to (considered as a tree of product gates with as leaves), and where the formulas are over (meaning that is in fact ). Recall that this way, for every 01 assignment (when we identify true with 1 and false with 0), iff is true.
Theorem 4.1 (Second main theorem; Restatement of Theorem 1.7).
For a 3CNF where are the corresponding polynomial equations for the clauses, if there is a noncommutative IPS refutation of size of over , then there is a Frege proof of size of .
As mentioned in the introduction, it will be evident that our proof in fact establishes a slightly tighter simulation of the noncommutative IPS by Frege. Specifically, if the degree of the noncommutative IPS refutation is and its formula depth is , then there is a Frege proof of with size . This will follow from our efficient simulation within Frege of Raz’ [Raz13] homogenization construction (Lemma 4.8). Nevertheless, for simplicity we shall always assume that the depth of the noncommutative IPS refutation formula is logarithmic in the size and that the degree of the refutation is at most , and thus will not take care to explicitly establish the dependence of the simulation on the parameters and .
The rest of the paper is dedicated to proving Theorem 4.1.
4.1 Balancing noncommutative formulas
First we show that a noncommutative formula of size $s$ can be balanced to an equivalent formula of depth $O(\log s)$, and thus we can assume that the noncommutative IPS certificate is already given as a balanced formula (this is needed for what follows). Both the statement of the balancing construction and its proof are similar to Proposition 4.1 in Hrubeš and Wigderson [HW14] (which in turn is similar to the case of commutative formulas with division gates in Brent [Bre74]). (Note that a formula of depth logarithmic in the number of variables must have size polynomial in the number of variables.)
Lemma 4.2.
Assume that a noncommutative polynomial $p$ can be computed by a formula of size $s$. Then $p$ can be computed by a formula of depth $O(\log s)$ (and hence of polynomial size when $s$ is polynomial in the number of variables).
Proof.
The proof is almost identical to Hrubeš and Wigderson’s proof of Proposition 4.1 in [HW14], which deals with rational functions and allows formulas with division gates. Thus, we only outline the argument in [HW14] and argue that if the given formula does not have division gates, then the new formula obtained by the balancing construction will not contain any division gates either.
Notation.
Let be a noncommutative formula and let g be a gate in . We denote by the subformula of with the root being g and by the formula obtained by replacing in by the variable . We denote by the noncommutative polynomials in computed by and , respectively.
We simultaneously prove the following two statements by induction on , concluding the lemma:
Inductive statement: If is a noncommutative formula of size , then for sufficiently large and suitable constants , the following hold:

has a noncommutative formula of depth at most ;

if is a variable occurring at most once in , then:
where are noncommutative polynomials that do not contain , and each can be computed by a noncommutative formula of depth at most .
Base case: . In this case there is one gate g connecting two variables or constants. Thus, (i) in the inductive statement can be obtained immediately, as it is already computed by a formula of depth . As for (ii), note that in the base case, is a formula with only one gate g. Assuming that is a variable occurring only once in , it is easy to construct noncommutative formulas so that for which the conditions in (ii) hold as follows:
Case 1: if g is a plus gate connecting the variable with a variable or constant , then we can write as .
Case 2: if g is a product gate connecting with (for , and in this order), then we can write as .
Case 3: if g is a product gate connecting with (for , and in this order), then we can write as .
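The displayed form in statement (ii) is missing from this text. Assuming it is the decomposition $A \cdot z \cdot B + C$ with $A, B, C$ free of $z$ (which is consistent with the three base cases), the following Python sketch computes such a decomposition recursively for any formula in which $z$ occurs exactly once. It illustrates only the algebra; it does not balance anything and is not the depth-reducing construction of [HW14]. The tuple encoding and all function names are hypothetical.

```python
from collections import defaultdict

def add(p, q):
    r = defaultdict(int)
    for m, c in list(p.items()) + list(q.items()):
        r[m] += c
    return {m: c for m, c in r.items() if c != 0}

def mul(p, q):
    r = defaultdict(int)
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            r[m1 + m2] += c1 * c2
    return {m: c for m, c in r.items() if c != 0}

def poly(f):
    """Noncommutative polynomial of a formula: variable, int, or (op, left, right)."""
    if isinstance(f, str):
        return {(f,): 1}
    if isinstance(f, int):
        return {(): f} if f else {}
    op, left, right = f
    return add(poly(left), poly(right)) if op == "+" else mul(poly(left), poly(right))

def contains(f, z):
    if isinstance(f, tuple):
        return contains(f[1], z) or contains(f[2], z)
    return f == z

def split(f, z):
    """Return (A, B, C), none containing z, with poly(f) = A*z*B + C.

    Assumes z occurs exactly once in f.
    """
    if f == z:
        return {(): 1}, {(): 1}, {}              # z = 1*z*1 + 0
    op, left, right = f
    if contains(left, z):
        A, B, C = split(left, z)
        R = poly(right)
        return (A, B, add(C, R)) if op == "+" else (A, mul(B, R), mul(C, R))
    A, B, C = split(right, z)
    L = poly(left)
    return (A, B, add(L, C)) if op == "+" else (mul(L, A), B, mul(L, C))

formula = ("*", ("+", "x", ("*", "y", "z")), "w")    # (x + y*z)*w
A, B, C = split(formula, "z")
recombined = add(mul(mul(A, poly("z")), B), C)
```

For the sample formula the sketch yields $A = y$, $B = w$ and $C = xw$, so that $(x + yz)w = y \cdot z \cdot w + xw$.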
Induction step: (i) is established (slightly informally) as follows. Find a gate g in such that both and are small (of size at most , and where is a new variable that does not occur in ). Then, by applying the induction hypothesis on , there exist formulas of small depth such that . Thus,
To prove (ii), find an appropriate gate g on the path between and the output of (an appropriate g is a gate such that and are both small (of size at most ), where is a new variable not occurring in ). Use the inductive assumptions to write:
and compose these expressions to get
where , .
It is clear that the respective depth of and are all at most when is sufficiently large.
To finish the proof of (ii), it suffices to show that