Generating Matrix Identities and Proof Complexity

Fu Li Institute for Theoretical Computer Science, The Institute for Interdisciplinary Information Sciences (IIIS), Tsinghua University, Beijing. Supported in part by the National Basic Research Program of China Grant 2011CBA00300, 2011CBA00301, the National Natural Science Foundation of China Grant 61033001, 61361136003 and NSFC grant 61373002.    Iddo Tzameret Department of Computer Science, Royal Holloway, University of London. Email: iddo.tzameret@gmail.com   Supported in part by the NSFC Grant 61373002.
Abstract

Motivated by the fundamental lower bounds questions in proof complexity, we initiate the study of matrix identities as hard instances for strong proof systems. A matrix identity of $d\times d$ matrices over a field $\mathbb{F}$ is a non-commutative polynomial $f(x_1,\dots,x_n)$ over $\mathbb{F}$ such that $f$ vanishes on every $d\times d$ matrix assignment to its variables.

We focus on arithmetic proofs, which are proofs of polynomial identities operating with arithmetic circuits and whose axioms are the polynomial-ring axioms (these proofs serve as an algebraic analogue of the Extended Frege propositional proof system, and over $\mathrm{GF}(2)$ they formally constitute a sub-system of Extended Frege [9]). We introduce a hierarchy of proof systems of decreasing strength within arithmetic proofs, in which the $d$th level is a sound and complete proof system for proving $d\times d$ matrix identities (over a given field). For each level $d\neq 2$ in the hierarchy, we establish a proof-size lower bound in terms of the number of variables in the matrix identity proved: we show the existence of a family of matrix identities $f_n$ with $n$ variables, such that any proof of $f_n = 0$ requires $\Omega(n^{2d})$ lines.

The lower bound argument uses fundamental results from the theory of algebras with polynomial identities together with a generalization of the arguments in [7]. Specifically, we establish an unconditional lower bound on the minimal number of generators needed to generate a matrix identity, where the generators are substitution instances of elements from any given finite basis of the matrix identities; a result that might be of independent interest.

We then set out to study matrix identities as hard instances for (full) arithmetic proofs. We present two conjectures, one about non-commutative arithmetic circuit complexity and the other about proof complexity, under which up to exponential-size lower bounds on arithmetic proofs (in terms of the arithmetic circuit size of the identities proved) hold. Finally, we discuss the applicability of our approach to strong propositional proof systems such as Extended Frege.

1 Background

Proving super-polynomial size lower bounds on strong propositional proof systems, like the Extended Frege system, is a major open problem in proof complexity, and in general is among a handful of fundamental hardness questions in computational complexity theory. An Extended Frege proof is simply a textbook logical proof system for establishing Boolean tautologies, in which one starts from basic tautological axioms written as Boolean formulas, and derives, step by step, new tautological formulas from previous ones by using a finite set of logically sound derivation rules; these include the so-called extension axiom, enabling one to denote a possibly big formula by a single new variable (where the variable is used neither before in the proof nor in the last line of the proof). It is not hard to show (see [11]) that Extended Frege can equivalently be defined as a logical proof system operating with Boolean circuits and without the extension axiom (an additional simple technical axiom is needed to formally define this proof system; see [11]).

Lower bounds on Extended Frege proofs can be viewed as lower bounds on certain nondeterministic algorithms for establishing the unsatisfiability of Boolean formulas (and thus as progress towards separating NP from coNP). They are also usually considered (somewhat informally) as related to establishing (explicit) Boolean circuit size lower bounds. In fact, such a lower bound has another highly significant consequence, placing it as a step towards separating P from NP: showing any super-polynomial lower bound on the size of Extended Frege proofs implies that, at least with respect to "polynomial-time reasoning" (namely, reasoning in the formal theory of arithmetic denoted PV), it is not possible to prove that $\mathsf{P}=\mathsf{NP}$; or in other words, it is consistent with PV that $\mathsf{P}\neq\mathsf{NP}$ (cf. [15]).

Accordingly, proving Extended Frege lower bounds is considered a very hard problem. In fact, even conditional lower bounds on strong proof systems, including Extended Frege, are not known and are considered very interesting (informally, we call a proof system strong if there are no known non-trivial size lower bounds on proofs in the system, and such lower bounds are believed to be outside the realm of current techniques); here, we mean a condition different from $\mathsf{NP}\neq\mathsf{coNP}$ (see [18]; the latter condition immediately implies that every propositional proof system admits a family of tautologies with no polynomial-size proofs [4]). The only size lower bound on Extended Frege proofs known to date is linear in $n$ (where $n$ is the size of the tautological formula proved; see [14] for a proof). Establishing super-linear size lower bounds on Extended Frege proofs is thus a highly interesting open problem.

That said, although proving Extended Frege lower bounds is a fundamental open problem in complexity, it is quite unclear whether such lower bounds are indeed far from reach or beyond current techniques (in contrast to other fundamental hardness problems in complexity, such as strong explicit Boolean circuit lower bounds, for which formal so-called barriers are known).

Another feature of proof complexity is that, in contrast to circuit complexity, even non-explicit hard instances for strong propositional proof systems, including Extended Frege, are not known to exist. For instance, simple counting arguments cannot establish super-linear size lower bounds on Extended Frege proofs (in contrast to Shannon's counting argument, which gives non-explicit lower bounds on circuit size, though it does not in itself yield complexity-class separations). Thus, establishing even the mere existence of non-explicit hard instances would already be sufficient for the purpose of lower bounding proof size in strong proof systems.

Furthermore, for strong proof systems there are almost no hard candidates, namely, tautologies that are believed to require long proofs in these systems (see Bonet, Buss and Pitassi [2]); except, perhaps, for random $k$-CNF formulas near the satisfiability threshold. But for the latter instances, even lower bounds on constant-depth Frege proofs are unknown. It is worth noting also that Razborov [19] and especially Krajíček (see e.g., [16]) have proposed some tautologies as hard candidates for strong proof systems.

Due to the lack of progress on establishing lower bounds on strong propositional proof systems, it is interesting, and potentially helpful, to turn our eyes to an algebraic analogue of strong propositional proof systems, and try first to prove nontrivial size lower bounds in such settings. Quite recently, such algebraic analogues of Extended Frege (and Frege, which is Extended Frege without the extension axiom) were investigated by Hrubeš and the second author [8, 9]. These proof systems, denoted $\mathbb{P}_c(\mathbb{F})$ and called simply arithmetic proofs, operate with algebraic equations of the form $F = G$, where $F$ and $G$ are algebraic circuits over a given field $\mathbb{F}$. An arithmetic proof of a polynomial identity is a sequence of identities between algebraic circuits derived by means of simple syntactic manipulations representing the polynomial-ring axioms (e.g., associativity, distributivity, unit element, field identities, etc.; see Definition 16). Although arithmetic proof systems are not propositional proof systems, namely they do not prove propositional tautologies, they can nevertheless be regarded as fragments of the propositional Extended Frege proof system when the field considered is $\mathrm{GF}(2)$. That is, every arithmetic proof over $\mathrm{GF}(2)$ of a polynomial identity (considered as a propositional tautology) can formally be viewed also as an Extended Frege proof. (In fact, it is probably true, though not formally verified, that arithmetic proofs are fragments of propositional proofs also over any other finite field, as well as over the ring of integers (when restricted to at most exponentially big integers): every polynomial identity proved with an arithmetic proof over the given field or ring can probably be proved with at most a polynomial increase in size in Extended Frege, once we fix a certain natural translation between polynomial identities over the field or ring and propositional tautologies. The reason for this is that one could plausibly polynomially simulate arithmetic proofs over such fields or rings with propositional proofs in which numbers are encoded as bit-strings.)

Apart from the hope that arithmetic proofs would shed light on propositional proof systems, the study of arithmetic proofs is motivated by the Polynomial Identity Testing (PIT) problem, namely the problem of deciding whether a given algebraic circuit computes the zero polynomial. As a decision problem, polynomial identity testing can be solved by an efficient randomized algorithm [21, 22], but no efficient deterministic algorithm is known. In fact, it is not even known whether there is a polynomial-time non-deterministic algorithm or, equivalently, whether PIT is in NP. An arithmetic proof system can thus be interpreted as a specific non-deterministic algorithm for PIT: in order to verify that an arithmetic circuit $F$ computes the zero polynomial, it is sufficient to guess an arithmetic proof of $F = 0$. Hence, if every true equality has a polynomial-size proof then PIT is in NP. Conversely, the arithmetic proof system captures the common syntactic procedures used to establish equality between algebraic expressions. Thus, showing the existence of identities that require super-polynomial arithmetic proofs would imply that those syntactic procedures are not enough to solve the PIT problem efficiently. (It is worth emphasizing again that arithmetic proofs are different from algebraic propositional proof systems like the Polynomial Calculus [3] and related systems. The latter prove propositional tautologies (a coNP language) while the former prove formal polynomial identities written as equations between algebraic circuits (a coRP language).)
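The randomized PIT algorithm mentioned above can be sketched in a few lines. The following stdlib-only sketch treats a circuit as a black-box evaluator and applies the Schwartz-Zippel bound: a nonzero polynomial of degree $k$ vanishes on at most a $k/|S|$ fraction of points of $S^n$. The sample circuit, the sample-set size, and the trial count are our own illustrative choices, not from the paper.

```python
import random

def poly_circuit(x):
    # A sample "circuit"; as written it computes
    # (x0 + x1)^2 - x0^2 - 2*x0*x1 - x1^2, i.e., the zero polynomial.
    return (x[0] + x[1])**2 - x[0]**2 - 2*x[0]*x[1] - x[1]**2

def is_probably_zero(circuit, nvars, degree_bound, trials=20):
    """Randomized PIT: evaluate at random points from a set S much larger
    than the degree bound; by Schwartz-Zippel a nonzero polynomial of
    degree k vanishes on at most a k/|S| fraction of the points."""
    S = range(100 * degree_bound)  # sample set with |S| >> degree_bound
    for _ in range(trials):
        point = [random.choice(S) for _ in range(nvars)]
        if circuit(point) != 0:
            return False  # a nonzero evaluation: definitely not zero
    return True  # zero polynomial, with high probability

print(is_probably_zero(poly_circuit, nvars=2, degree_bound=2))  # True
```

Note the one-sided error: a `False` answer is always correct, while a `True` answer is only correct with high probability; an arithmetic proof of $F = 0$, by contrast, certifies the answer deterministically.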

The emphasis in [8, 9] was mainly on demonstrating non-trivial upper bounds for arithmetic proofs (as well as lower bounds in very restricted settings). Since arithmetic proofs (at least over $\mathrm{GF}(2)$) can also be considered as propositional proofs, arithmetic proofs were found very useful in establishing short propositional proofs for the determinant identities and other statements from linear algebra [9]. As for lower bounds on arithmetic proofs (operating with arithmetic circuits), the same basic linear size lower bound known for Extended Frege [14] can be shown to hold for $\mathbb{P}_c(\mathbb{F})$. But any super-linear size lower bound, explicit or not, on $\mathbb{P}_c(\mathbb{F})$ proof size (for any field $\mathbb{F}$) is open. In [8] it was argued that proving lower bounds even on very restricted fragments of arithmetic proofs is a highly nontrivial open problem.

The state of affairs we have described up to now shows how little is known about strong propositional (and arithmetic) proof systems, and why it is highly interesting to introduce and develop novel approaches for lower bounding proofs such as arithmetic proofs, even if these approaches yield only conditional and possibly non-explicit lower bounds; and further, to propose new kinds of hard candidates for strong proof systems.

2 Overview of our results

In this work we initiate the study of matrix identities as hard instances for strong proof systems in various settings and under different assumptions. The term strong here stands for proof systems that operate with (Boolean or arithmetic) circuits, for which we do not know any non-trivial lower bound (see Sec. A.2 for the definitions of arithmetic circuits and non-commutative arithmetic circuits).

The ultimate goal of our suggested approach is proving Extended Frege lower bounds; however, in this work we focus for the most part on the seemingly (and relatively) easier task of proving lower bounds on arithmetic proofs ($\mathbb{P}_c(\mathbb{F})$), namely lower bounds on arithmetic proofs establishing polynomial identities between arithmetic circuits over a field $\mathbb{F}$.

We introduce a new decreasing hierarchy of proof systems, within arithmetic proofs, for establishing matrix identities of a given dimension (the first level of the hierarchy coincides with arithmetic proofs). We obtain unconditional (polynomial) lower bounds for these proof systems for matrix identities, in terms of the number of variables in the identities proved. We then present two natural conjectures, from arithmetic circuit complexity and proof complexity respectively, based on which one can obtain up to exponential-size lower bounds on arithmetic proofs ($\mathbb{P}_c(\mathbb{F})$) in terms of the size of the identities proved.

We start by explaining what matrix identities are, as well as providing some necessary background from algebra.

2.1 Matrix identities

For a field $\mathbb{F}$, let $A$ be a non-commutative (associative) $\mathbb{F}$-algebra; e.g., the algebra $\mathrm{Mat}_d(\mathbb{F})$ of $d\times d$ matrices over $\mathbb{F}$. (Formally, $A$ is an $\mathbb{F}$-algebra if $A$ is a vector space over $\mathbb{F}$ together with a distributive multiplication operation, where multiplication in $A$ is associative (but need not be commutative) and there exists a multiplicative unity in $A$.)

We shall always assume, unless explicitly stated otherwise, that the field $\mathbb{F}$ has characteristic 0.

A non-commutative polynomial over the field $\mathbb{F}$ in the variables $x_1,\dots,x_n$ is a formal sum of monomials, where the product of variables is non-commuting. Since most polynomials in this work are non-commutative, when we talk about polynomials we shall mean non-commutative polynomials, unless otherwise stated. The set of (non-commutative) polynomials in the variables $x_1,\dots,x_n$ over the field $\mathbb{F}$ is denoted $\mathbb{F}\langle x_1,\dots,x_n\rangle$.

We say that $f$ is a matrix identity of $\mathrm{Mat}_d(\mathbb{F})$ simply whenever $f$ is a non-commutative polynomial (with coefficients from $\mathbb{F}$) that is equal to zero under any assignment of matrices from $\mathrm{Mat}_d(\mathbb{F})$ to its variables. In other words, the polynomial $f(x_1,\dots,x_n)$ over $\mathbb{F}$ is an identity of the algebra $A$ (and specifically, of the matrix algebra $\mathrm{Mat}_d(\mathbb{F})$) if $f(\bar a) = 0$ for all $\bar a \in A^n$.
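The definition can be illustrated by evaluating a non-commutative polynomial on random matrices; the sketch below (a one-sided randomized check over integer matrices, not a proof, with all helper names our own) tests the commutator $xy - yx$, which is an identity of $\mathrm{Mat}_1$ but not of $\mathrm{Mat}_2$:

```python
import random

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def rand_mat(d):
    return [[random.randint(-9, 9) for _ in range(d)] for _ in range(d)]

def looks_like_identity(f, arity, d, trials=50):
    """Test, by random d x d integer-matrix assignments, whether the
    non-commutative polynomial f appears to vanish on Mat_d."""
    zero = [[0] * d for _ in range(d)]
    return all(f(*[rand_mat(d) for _ in range(arity)]) == zero
               for _ in range(trials))

# The commutator [x, y] = xy - yx:
commutator = lambda X, Y: mat_sub(mat_mul(X, Y), mat_mul(Y, X))

print(looks_like_identity(commutator, 2, d=1))  # True: 1x1 matrices commute
print(looks_like_identity(commutator, 2, d=2))  # False: 2x2 matrices do not
```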

2.2 Stratification

A matrix identity is a non-commutative polynomial vanishing under all assignments of matrices. If we consider the "matrix" algebra $\mathrm{Mat}_1(\mathbb{F})$ of $1\times 1$ matrices, its set of identities consists of all the non-commutative polynomials that vanish over field elements. Since, by definition, the field is commutative, the identities of $\mathrm{Mat}_1(\mathbb{F})$ are all non-commutative polynomials such that when the product is considered as commutative we obtain the zero polynomial; in other words, we can consider the identities of $\mathrm{Mat}_1(\mathbb{F})$ as the set of (standard, i.e., commutative) polynomial identities. Further, in our application we shall write all polynomials as non-commutative arithmetic circuits, and since a non-commutative arithmetic circuit is equivalent to a (commutative) arithmetic circuit (except that product gates have an order on their children), we can consider the set of identities of $\mathrm{Mat}_1(\mathbb{F})$ written as non-commutative circuits as the set of (commutative) polynomial identities written as (commutative) arithmetic circuits.

Using matrix identities of increasing dimensions we obtain a stratification of the language of (commutative) polynomial identities. Namely, we obtain the following strictly decreasing (with respect to containment) chain of identities:

(commutative) polynomial identities $=$ identities of $\mathrm{Mat}_1(\mathbb{F}) \supsetneq$ identities of $\mathrm{Mat}_2(\mathbb{F}) \supsetneq$ identities of $\mathrm{Mat}_3(\mathbb{F}) \supsetneq \cdots$ (1)

The fact that the identities of $\mathrm{Mat}_{d+1}(\mathbb{F})$ are also identities of $\mathrm{Mat}_d(\mathbb{F})$ is easy to show. The fact that the chain above is strictly decreasing can be proved either by elementary methods [12] or as a corollary of [1].

2.3 Corresponding proof systems and the main lower bound

We now introduce a novel hierarchy of proof systems within arithmetic proofs ($\mathbb{P}_c(\mathbb{F})$). For this we need the concept of a basis of the set of identities of a given $\mathbb{F}$-algebra (e.g., the matrix algebra $\mathrm{Mat}_d(\mathbb{F})$).

Basis.

We say that a set $B$ of non-commutative polynomials forms a basis for the identities of $A$ if the following holds: for every identity $f$ of $A$ there exist non-commutative polynomials $g_1,\dots,g_m$, for some $m$, that are substitution instances of polynomials from $B$, such that $f$ is in the two-sided ideal generated by $g_1,\dots,g_m$ (a substitution instance of a polynomial $g(x_1,\dots,x_k)$ is a polynomial $g(h_1,\dots,h_k)$, for some polynomials $h_1,\dots,h_k$).

Recall that arithmetic proofs ($\mathbb{P}_c(\mathbb{F})$; see Definition 16) are proofs that start from basic axioms (associativity, commutativity of addition and product, distributivity of product over addition, unit element axioms, etc.), in which we derive new equations between arithmetic circuits using rules for adding and multiplying two previously derived identities. Arithmetic proofs are sound and complete proof systems for the set of (commutative) polynomial identities, written as equations between arithmetic circuits.

Notice that if one takes out the commutativity axiom from arithmetic proofs, we get a proof system for establishing non-commutative polynomial identities written as non-commutative arithmetic circuits (we can assume that product gates appearing in arithmetic proofs have an order on their children).

The proof systems $\mathbb{P}_c^{(d)}(\mathbb{F})$.

For any field $\mathbb{F}$ (of characteristic 0), any $d$, and any basis $B$ of the identities of $\mathrm{Mat}_d(\mathbb{F})$, we define the following proof system, denoted $\mathbb{P}_c^{(d)}(\mathbb{F})$, which is sound and complete for the identities of $\mathrm{Mat}_d(\mathbb{F})$ (written as equations between non-commutative circuits): consider the proof system $\mathbb{P}_c(\mathbb{F})$ (Definition 16) and replace the commutativity axiom by a finite basis $B$ of the identities of $\mathrm{Mat}_d(\mathbb{F})$ (namely, add a new axiom $C = 0$ for each polynomial $f$ in the basis, where $C$ is a non-commutative algebraic circuit computing $f$). (Formally, we should fix a specific finite basis for the sake of definiteness of $\mathbb{P}_c^{(d)}(\mathbb{F})$; however, different choices of bases can only increase the number of lines in a $\mathbb{P}_c^{(d)}(\mathbb{F})$-proof by a constant factor.) Additionally, add the axioms of distributivity of product over addition from both left and right (this is needed because we no longer have the commutativity axiom in our system to simulate both distributivity axioms).

Note that $\mathbb{P}_c(\mathbb{F})$ can be considered as $\mathbb{P}_c^{(1)}(\mathbb{F})$, since the commutativity axiom $xy = yx$ is an axiom of $\mathbb{P}_c(\mathbb{F})$ and the commutator $xy - yx$ is a basis of the identities of $\mathrm{Mat}_1(\mathbb{F})$.

Figure 1: Illustration of the stratification of the language of polynomial identities and the corresponding proof systems for each language.

Our main result is an unconditional lower bound on the size (in fact, the number of lines; a proof-line is any equation between arithmetic circuits appearing in the proof) of $\mathbb{P}_c^{(d)}(\mathbb{F})$-proofs, for any $d \neq 2$, in terms of the number of variables in the matrix identity proved:

Theorem 1 (Main lower bound).

Let $\mathbb{F}$ be any field of characteristic 0. For any natural number $d \neq 2$ and every finite basis $B$ of the identities of $\mathrm{Mat}_d(\mathbb{F})$, there exists a family of identities $f_n$ over $\mathbb{F}$ with $n$ variables, such that any $\mathbb{P}_c^{(d)}(\mathbb{F})$-proof of $f_n = 0$ requires $\Omega(n^{2d})$ lines.

The proof of the main lower bound—which is the main technical contribution of our work—is explained in the following subsection, and is based on a complexity measure defined on matrix identities and their generation in a (two-sided) ideal. The complexity measure is interesting by itself, and can be applied to identities of any algebra with polynomial identities (PI-algebras; see [20, 6] for the theory of PI-algebras), and not only matrix identities.

Comments.

(i) When $d = 2$, our proof, which shows the lower bound for every basis of the identities of $\mathrm{Mat}_2(\mathbb{F})$, does not hold (see Sec. C.1.3 for an explanation).

(ii) The hard instance in the main lower bound theorem is non-explicit. Thus, we do not know if there are small non-commutative circuits computing the hard instances. This is the reason the lower bound holds only with respect to the number of variables in the hard instances and not with respect to their circuit size; the latter is the more desired result in proof complexity. Section 3 sets out an approach to achieve this latter result.

(iii) The proof systems $\mathbb{P}_c^{(d)}(\mathbb{F})$ are defined using a finite basis of the identities of $\mathrm{Mat}_d(\mathbb{F})$. A very interesting feature of our lower bound argument is that it is in fact an open problem to find explicit finite bases for the identities of $\mathrm{Mat}_d(\mathbb{F})$ (for $d \ge 3$; see sub-Section 2.3.1 on this).

(iv) We do not know if the hierarchy of proof systems $\mathbb{P}_c^{(d)}(\mathbb{F})$, for increasing $d$'s, is a strictly decreasing hierarchy (since we do not know whether $\mathbb{P}_c^{(d)}(\mathbb{F})$ has any speed-up over $\mathbb{P}_c^{(d+1)}(\mathbb{F})$ for identities of $\mathrm{Mat}_{d+1}(\mathbb{F})$).

In the following subsection we give a detailed overview of the lower bound argument.

2.3.1 Proving the main lower bound: generative complexity lower bounds

Here we explain in detail the complexity measure we define and how we obtain the lower bound on this measure. It is simple to show that our complexity measure is a lower bound on the minimal number of lines in a corresponding $\mathbb{P}_c^{(d)}(\mathbb{F})$-proof (for the case $d = 1$ this was observed in [7]).

The complexity measure.

Given an $\mathbb{F}$-algebra $A$ (e.g., $\mathrm{Mat}_d(\mathbb{F})$), a set $B$ of non-commutative polynomials, and an identity $f$ of $A$, define

$c_B(f)$

as the minimal number $m$ such that there exist $g_1,\dots,g_m$ for which $f$ is in the two-sided ideal generated by $g_1,\dots,g_m$, and every $g_i$ is a substitution instance of some polynomial from $B$. (Note that each substitution instance, even of the same polynomial from $B$, adds to $m$.) We sometimes call $c_B(f)$ the generative complexity of $f$ (with respect to $B$).

Example: Let $\mathbb{F}$ be an infinite field and consider the field $\mathbb{F}$ itself as an $\mathbb{F}$-algebra, denoted $\mathrm{Mat}_1(\mathbb{F})$. Then the identities of $\mathrm{Mat}_1(\mathbb{F})$ are all the polynomials from $\mathbb{F}\langle x_1,\dots,x_n\rangle$ that evaluate to $0$ under every assignment of elements of $\mathbb{F}$ to the variables $x_1,\dots,x_n$. Namely, these are the (non-commutative) polynomials that are identically zero when considered as commutative polynomials. For instance, $x_1x_2 - x_2x_1$ is a non-zero polynomial from $\mathbb{F}\langle x_1,x_2\rangle$ which is an identity of $\mathrm{Mat}_1(\mathbb{F})$.

It is not hard to show that a basis of the identities of the algebra $\mathrm{Mat}_1(\mathbb{F})$ is the commutator $x_1x_2 - x_2x_1$, denoted $[x_1,x_2]$. In other words, every identity of $\mathrm{Mat}_1(\mathbb{F})$ is generated (in the two-sided ideal) by substitution instances of the commutator. Considering, for instance, the identity $f := x_1x_2x_3 - x_1x_3x_2$, we can now ask what $c_{\{[x_1,x_2]\}}(f)$ is. The answer is $1$, since we need only one substitution instance of the commutator: $f = x_1 \cdot [x_2, x_3]$.
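Generation by substitution instances can be checked symbolically. The sketch below (representation and names are ours, for illustration only) encodes non-commutative polynomials as dictionaries from monomials (ordered tuples of variable names) to coefficients, and verifies that one substitution instance of the commutator generates the identity $x_1x_2x_3 - x_1x_3x_2$:

```python
from collections import defaultdict

# Non-commutative polynomials as {monomial-tuple: coefficient} maps;
# a monomial is an ordered tuple of variable names (order matters!).
def nc_mul(p, q):
    r = defaultdict(int)
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            r[m1 + m2] += c1 * c2      # concatenation: non-commuting product
    return {m: c for m, c in r.items() if c}

def nc_add(p, q):
    r = defaultdict(int)
    for poly in (p, q):
        for m, c in poly.items():
            r[m] += c
    return {m: c for m, c in r.items() if c}

def var(name): return {(name,): 1}
def neg(p):    return {m: -c for m, c in p.items()}

x1, x2, x3 = var('x1'), var('x2'), var('x3')

# The substitution instance [x2, x3] = x2*x3 - x3*x2 of the commutator ...
comm = nc_add(nc_mul(x2, x3), neg(nc_mul(x3, x2)))
# ... multiplied on the left by x1 (an ideal element generated by it):
lhs = nc_mul(x1, comm)
# equals the identity f = x1*x2*x3 - x1*x3*x2:
f = nc_add(nc_mul(nc_mul(x1, x2), x3), neg(nc_mul(nc_mul(x1, x3), x2)))
print(lhs == f)  # True
```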

Hrubeš [7] showed the following lower bound (using a slightly different terminology):

Theorem 2 (Hrubeš [7]).

For any field $\mathbb{F}$ and every $n$, there exists an identity $f_n$ of $\mathrm{Mat}_1(\mathbb{F})$ with $n$ variables, such that $c_{\{[x_1,x_2]\}}(f_n) \ge \Omega(n^2)$.

It is also not hard to show that $c_{\{[x_1,x_2]\}}(f) \le O(n^2)$ for any identity $f$ of $\mathrm{Mat}_1(\mathbb{F})$ with $n$ variables.

Lower bound on the complexity of generating matrix identities.

An algebra with polynomial identities, or in short a PI-algebra (PI stands for Polynomial Identities), is simply an $\mathbb{F}$-algebra that has a non-trivial identity; that is, there is a nonzero polynomial in $\mathbb{F}\langle x_1,\dots,x_n\rangle$ that is an identity of the algebra.

Let us now treat the $\mathbb{F}$-algebra $\mathrm{Mat}_d(\mathbb{F})$ of $d\times d$ matrices with entries from $\mathbb{F}$. We shall exploit results about the structure of the identities of matrix algebras and the general theory of PI-algebras to generalize Hrubeš' lower bound [7] above (excluding the case $d = 2$), from a lower bound of $\Omega(n^2)$ for generating identities of $\mathrm{Mat}_1(\mathbb{F})$ to a lower bound of $\Omega(n^{2d})$ for generating identities of $\mathrm{Mat}_d(\mathbb{F})$, for any $d \neq 2$ and any field $\mathbb{F}$ of characteristic 0:

Theorem 5 (Lower bound on generative complexity).

Let $\mathbb{F}$ be any field of characteristic 0. For every natural number $d \neq 2$ and every finite basis $B$ of the identities of $\mathrm{Mat}_d(\mathbb{F})$, there exists an identity $f_n$ over $\mathbb{F}$ with $n$ variables, such that $c_B(f_n) \ge \Omega(n^{2d})$.

Notice that, similar to [7], the lower bound in this theorem is non-explicit. We do not know of an upper bound (in terms of $n$) that holds on $c_B(f)$ for every identity $f$ with $n$ variables.

The main lower bound (Theorem 1) is a corollary of the following theorem (proved by a simple induction on the number of lines in a $\mathbb{P}_c^{(d)}(\mathbb{F})$-proof):

Theorem.

For every identity $F = 0$, where $F$ is a non-commutative circuit computing a non-commutative polynomial $f$ which is an identity of $\mathrm{Mat}_d(\mathbb{F})$, the number of lines of a $\mathbb{P}_c^{(d)}(\mathbb{F})$-proof of $F = 0$ is lower bounded, up to a constant factor (depending on the choice of the finite basis $B$), by $c_B(f)$.

Overview of the proof of Theorem 5.

The study of algebras with polynomial identities is a fairly developed subject in algebra (see the monographs by Drensky [6] and Rowen [20] on this topic). Within it, perhaps the best-studied topic concerns the identities of matrix algebras. In particular, the well-known theorem of Amitsur and Levitzki from 1950 [1] is the following:

Amitsur-Levitzki Theorem ([1]).

Let $S_k$ be the permutation group on $k$ elements and let $s_k$ denote the standard polynomial of degree $k$, as follows:

$$s_k(x_1,\dots,x_k) := \sum_{\sigma \in S_k} \mathrm{sgn}(\sigma)\, x_{\sigma(1)} x_{\sigma(2)} \cdots x_{\sigma(k)}.$$

Then, for any natural number $d$ and any field $\mathbb{F}$ (in fact, any commutative ring), the standard polynomial $s_{2d}$ of degree $2d$ is an identity of $\mathrm{Mat}_d(\mathbb{F})$.
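The theorem is easy to check empirically for small dimensions. The following stdlib-only sketch (helper names are ours) evaluates the standard polynomial $s_4$ on four random $2\times 2$ integer matrices and confirms that it vanishes, as Amitsur-Levitzki guarantees for $d = 2$, $k = 2d = 4$:

```python
import random
from itertools import permutations

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def sign(perm):
    # Sign of a permutation, computed via its inversion count.
    inv = sum(perm[i] > perm[j]
              for i in range(len(perm)) for j in range(i + 1, len(perm)))
    return -1 if inv % 2 else 1

def standard_identity(mats):
    """s_k(A_1,...,A_k) = sum over sigma in S_k of
    sgn(sigma) * A_sigma(1) * ... * A_sigma(k)."""
    k, d = len(mats), len(mats[0])
    total = [[0] * d for _ in range(d)]
    for perm in permutations(range(k)):
        prod = mats[perm[0]]
        for i in perm[1:]:
            prod = mat_mul(prod, mats[i])
        s = sign(perm)
        total = [[t + s * p for t, p in zip(tr, pr)]
                 for tr, pr in zip(total, prod)]
    return total

# s_4 on four random 2x2 matrices is always the zero matrix:
mats = [[[random.randint(-5, 5) for _ in range(2)] for _ in range(2)]
        for _ in range(4)]
print(standard_identity(mats) == [[0, 0], [0, 0]])  # True
```

Note that $s_2(x_1, x_2) = x_1x_2 - x_2x_1$ is exactly the commutator, which is why $s_{2d}$ can be viewed as a "higher dimensional commutativity axiom".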

Theorem 5 is proved in several steps, but the main argument can be divided into two main parts, described as follows:

Part 1:

Here we use the Amitsur-Levitzki Theorem: we show that when $d > 2$ there exists an identity $f$ with $n$ variables such that $c_{\{s_{2d}\}}(f) \ge \Omega(n^{2d})$. To this end, we generalize the method in [7] to "higher dimensional commutativity axioms": using a counting argument we show the existence of special polynomials (which we call s-polynomials; see Definition 10), each of which requires many substitution instances of $s_{2d}$ to generate (see Lemma 9). Then, we combine the s-polynomials into a single polynomial $f$ by adding new variables, such that $c_{\{s_{2d}\}}(f) \ge \Omega(n^{2d})$.

While [7] uses the commutator to define the s-polynomials, we consider the higher-order commutativity axiom $s_{2d}$ instead. It is possible to show that $s_{2d}$ has properties sufficient for the lower bound, similar to those of the commutator (see Lemmas 7, 8, 12).

Part 2:

Note that $\{s_{2d}\}$ is not a basis of the identities of $\mathrm{Mat}_d(\mathbb{F})$; namely, there are identities of $\mathrm{Mat}_d(\mathbb{F})$ that are not generated by substitution instances of $s_{2d}$ (also notice that $c_{\{s_{2d}\}}(f)$ can be defined for any $f$ in the ideal generated by such instances). The second part of the proof of Theorem 5 is dedicated to showing that when $d \neq 2$, for all finite bases $B$ of the identities of $\mathrm{Mat}_d(\mathbb{F})$, the following holds for the hard identity $f$ considered in the theorem: $c_B(f) \ge \epsilon \cdot c_{\{s_{2d}\}}(f)$, for some constant $\epsilon > 0$.

For this purpose, we find a special set $T$ of polynomials which serves as an "intermediate" set between $B$ and $\{s_{2d}\}$, such that the hard instance $f$ is generated by $T$, and all the polynomials in $T$ that contribute to the generation of $f$ can be generated already by substitution instances of $s_{2d}$. We then show (Lemma 17) that for any basis $B$, there is a specific set of polynomials of a special form, namely multi-homogeneous commutator polynomials (Definition 11), that can generate $B$. Based on the properties of multi-homogeneous commutator polynomials, we show that, for the hard instance $f$, only the generators of low degree in $T$ can contribute to the generation of $f$ (Lemma 21). We then prove that when $d \neq 2$, all these low-degree generators in $T$ can be generated by substitution instances of $s_{2d}$ (this is where we use the assumption that $d \neq 2$; see Lemma 20). We thus get the conclusion $c_B(f) \ge \Omega(n^{2d})$ when $d \neq 2$.

A very interesting feature of our proof (and theorem) is that it is in fact an open problem to describe bases of the identities of $\mathrm{Mat}_d(\mathbb{F})$, for any $d \ge 3$. For the case $d = 2$ the basis is known, by a result of Drensky [5] (see Section E.3). However, a highly nontrivial result of Kemer [13] shows that for any natural $d$ there exists a finite basis for the identities of $\mathrm{Mat}_d(\mathbb{F})$. Our proof shows, roughly, that for the hard instances in Theorem 5 no generators other than substitution instances of $s_{2d}$ can contribute to the generation of the hard instances.

3 Towards strong lower bounds on (full) arithmetic proofs

Here we continue the study of matrix identities as hard proof complexity instances, and set out a program to establish lower bounds on arithmetic proofs. We present two conjectures, interesting in their own right: one about non-commutative arithmetic circuit complexity and the other about proof complexity, based on which up to exponential-size lower bounds on arithmetic proofs (in terms of the non-commutative circuit size of the identity proved) follow. We discuss these conjectures in detail, together with the parameters that are needed for different kinds of lower bounds.

Informally, the two conjectures are as follows (recall the complexity measure $c_B(f)$ from Sec. 2.3.1, counting the minimal number of substitution instances of generators from a basis $B$ needed to generate an identity $f$):

Conjecture I.

(Informal) There exist non-commutative arithmetic circuits of small size that compute matrix identities of high generative complexity.

Conjecture II.

(Informal) Proving matrix identities by reasoning with polynomials whose variables range over matrices is as efficient as proving matrix identities by reasoning with polynomials whose variables range over the entries of the matrices.

3.1 Towards lower bounds on $\mathbb{P}_c^{(d)}(\mathbb{F})$ in terms of arithmetic-circuit size

Recall that a non-commutative arithmetic circuit is an arithmetic circuit that has an order on the children of product gates, and the product is performed according to this order (see Sec. A.2). To get a size lower bound on $\mathbb{P}_c^{(d)}(\mathbb{F})$ proofs in terms of the size of the circuit equations proved, we need to assume the existence of non-commutative arithmetic circuits of small size that compute matrix identities of high generative complexity:


Conjecture I.
For some fixed $d$, there exists a family of identities $f_n$ of $\mathrm{Mat}_d(\mathbb{F})$, with $n$ variables, such that $c_B(f_n) \ge \Omega(n^{2d})$, for some basis $B$ of the identities of $\mathrm{Mat}_d(\mathbb{F})$, and such that $f_n$ has a non-commutative arithmetic circuit of size $O(n^c)$, for some constant $c$.

Assuming the veracity of the above conjecture we obtain the following lower bound:

Polynomial lower bounds on $\mathbb{P}_c^{(d)}(\mathbb{F})$-proofs (assuming Conjecture I): there exists a family of identities $f_n$ of $\mathrm{Mat}_d(\mathbb{F})$ whose non-commutative arithmetic circuit size is $O(n^c)$, but every $\mathbb{P}_c^{(d)}(\mathbb{F})$-proof of $f_n = 0$ has size $\Omega(n^{2d})$.

Note that we know by Theorem 5 that the lower bound in Conjecture I is true for any $d \neq 2$ and for some specific family $f_n$. But we do not know whether this specific $f_n$ has small circuits, as required in Conjecture I.


3.2 Towards polynomial-size lower bounds on full arithmetic proofs

Here we consider the possibility that the polynomial-size lower bounds on proofs of matrix identities in $\mathbb{P}_c^{(d)}(\mathbb{F})$ transfer to lower bounds on (full) arithmetic proofs ($\mathbb{P}_c(\mathbb{F})$).

The natural way to formalize Conjecture II mentioned informally above is via the following translation: consider a nonzero identity $f$ of $\mathrm{Mat}_d(\mathbb{F})$, for some $d$. Then $f$ is a nonzero non-commutative polynomial in $\mathbb{F}\langle x_1,\dots,x_n\rangle$. If we substitute each (matrix) variable $x_i$ in $f$ by a $d\times d$ matrix of entry-variables $\big(x_i^{(r,s)}\big)_{r,s=1}^{d}$, then $f$ corresponds to $d^2$ commutative zero polynomials: the statement that $f$ is an identity says that for every entry position $(r,s)$ and for every possible assignment of field elements to the entry-variables of the matrix variables in $f$ (when the product and addition of matrices are done in the standard way), the $(r,s)$-entry evaluates to $0$. Accordingly, let $F$ be a non-commutative circuit computing $f$. Then, under the above substitution of entry-variables for each variable in $F$, we get $d^2$ non-commutative circuits, each computing the zero polynomial when considered as a commutative polynomial (see Definition 15). We denote the set of entry circuits corresponding to the identity $f$ by $\mathrm{ent}(F)$ (and we extend the notation naturally to equations between circuits: $\mathrm{ent}(F = G)$ denotes the set of $d^2$ equations between the corresponding entry circuits).

Example:

Let $d = 2$ and let $f := x_1 \cdot x_2$ (it is obviously not an identity of $\mathrm{Mat}_2(\mathbb{F})$, but we use it only for the sake of the example), and let $F$ be the corresponding circuit (in fact, formula) computing $f$. We substitute $2\times 2$ matrices of entry-variables for $x_1, x_2$ to get:

$$\begin{pmatrix} x_1^{(1,1)} & x_1^{(1,2)} \\ x_1^{(2,1)} & x_1^{(2,2)} \end{pmatrix} \cdot \begin{pmatrix} x_2^{(1,1)} & x_2^{(1,2)} \\ x_2^{(2,1)} & x_2^{(2,2)} \end{pmatrix}.$$

The $(1,1)$-entry non-commutative circuit (in fact, formula) among the resulting entry circuits is then: $x_1^{(1,1)} \cdot x_2^{(1,1)} + x_1^{(1,2)} \cdot x_2^{(2,1)}$.

It is not hard to show that , for every non-commutative circuit (where is the total sizes of all circuits in and is the size of ). We denote by

the minimal size of a () proof that contains (as proof-lines) all the circuit-equations in .
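To make the entry-wise translation concrete, the following sketch carries it out (using sympy, a choice of ours) for the Hall polynomial [[X,Y]², Z], a classical identity of 2×2 matrices; the entry-variable names are ad hoc. The matrix identity in 3 matrix variables becomes 4 commutative polynomials in 12 entry-variables, each of which expands to zero:

```python
# Entry-wise translation of a matrix identity, illustrated on the Hall
# polynomial h(X, Y, Z) = [[X, Y]^2, Z], a classical identity of 2x2 matrices.
import sympy as sp

# generic 2x2 matrices of entry-variables (names are ad hoc)
X = sp.Matrix(2, 2, sp.symbols('x0:4'))
Y = sp.Matrix(2, 2, sp.symbols('y0:4'))
Z = sp.Matrix(2, 2, sp.symbols('z0:4'))

C = X * Y - Y * X          # the commutator [X, Y]
H = C * C * Z - Z * C * C  # [[X, Y]^2, Z]

# the identity translates into 4 commutative polynomials, one per entry,
# and each of them expands to the zero polynomial
entries = [sp.expand(H[i, j]) for i in range(2) for j in range(2)]
print(entries)  # [0, 0, 0, 0]
```

(The Hall polynomial vanishes on 2×2 matrices because a traceless 2×2 matrix squares, by Cayley–Hamilton, to a scalar matrix, which commutes with everything.)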



Conjecture II.
Let be a positive natural number and let be a (finite) basis of the identities of . Assume that is an identity of , and let be a non-commutative arithmetic circuit computing . Then, the minimal number of lines in a proof of the collection of (entry-wise) equations corresponding to , is lower bounded (up to a constant factor) by . And in symbols: (2)
  

The conditional lower bound we get is now similar to that in Section 3.1, except that it holds for () and not only for fragments of ():

Polynomial lower bounds on arithmetic proofs () (assuming Conjectures I and II): There exists a family of identities of  whose non-commutative arithmetic circuit-size is but every -proof of has size .

We also present a propositional version of Conjecture II, by considering to be , adding to () the Boolean axioms and considering matrix identities for  (see Section E.2).

3.3 Towards exponential-size lower bounds on arithmetic proofs

Assuming Conjecture II above holds (i.e., Equation 2), we show under which further conditions one gets exponential-size lower bounds on arithmetic proofs (). The idea is to take the dimension of the matrix algebras as a parameter by itself. For this we need to set up the assumptions more carefully:

Assumptions:

  1. Refinement of Conjecture II: Assume that for any and any basis of the identities of  the number of lines in any proof of is at least , where is a number depending on and is a non-commutative arithmetic circuit computing (this is the same as Conjecture II except that now is not a constant).

  2. Assume that for any sufficiently large and any basis of the identities of , there exists a number such that for all sufficiently large there exists an identity with . (The existence of such identities is known from our unconditional lower bound.)

  3. Assume that for the in item 2 above: .

  4. (Variant of) Conjecture I: Assume that the non-commutative arithmetic circuit size of is at most .

Corollary (assuming Assumptions 1-4 above): There exists a polynomial size (in ) family of identities between non-commutative arithmetic circuits, for which any () proof requires exponential number of proof-lines.

Proof.

By the assumptions, every -proof of has size at least . Consider the family , where is a function of , and we take . Then, we get the following lower bound on the number of lines in any -proof of the family :

which (by Assumption 4) is exponential in the arithmetic circuit-size of the identities proved.     QED

Justification of assumptions.

We wish to justify to a certain extent the new Assumption 3 above (which lets us obtain the exponential lower bound). We shall use for this purpose the special hard polynomials that we proved exist in Theorem 5.

First, note that Assumption 2 holds for these ’s, by Theorem 5. In Section E.1 we show that the function for these ’s does not decrease too fast. And we use this fact to get the following conditional exponential lower bound:

Proposition.

Suppose Assumption 1 above holds (refinement of Conjecture II) and assume that . Then, there exists a family of non-commutative circuits (computing the family of polynomials ) such that the number of lines in any -proof of is at least .

Note that this will give us an exponential-size lower bound on () proofs only if moreover the arithmetic circuit size of is small enough (e.g., if Assumption 4 above holds).

4 Concluding remarks

This work originates from the fundamental goal of establishing lower bounds on strong proof systems. Our focus was on arithmetic proofs which serve as a useful [9] analogue of propositional Extended Frege proofs. Along the way, we have discovered an interesting hierarchy within arithmetic proofs: a hierarchy of sound and complete proof systems for matrix identities of increasing dimensions. In this hierarchy we have been able to establish unconditional nontrivial size-lower bounds (in terms of the number of variables in the identities proved).

We then used these results, together with two seemingly natural conjectures about non-commutative arithmetic circuits and proof complexity, to propose matrix identities as hard candidates for strong proof systems. We showed that using these two conjectures, one can obtain up to exponential-size lower bounds (in terms of the circuit-size of the identities proved).

Proving lower bounds on strong (propositional) proof systems is a fundamental open problem in the theory of computing; nevertheless, it is in fact not clear whether such lower bounds are beyond current techniques (in contrast to other fundamental hardness problems in complexity, such as explicit Boolean circuit lower bounds). In light of this, and of the fact that almost no hard candidates for strong proof systems are currently known (see [2, 16]), an important conceptual contribution of this paper, so to speak, is to supply such new hard candidates in the form of matrix identities. Moreover, as our work partially demonstrates, such matrix identities have structure that is helpful in proving proof-complexity lower bounds.

5 Relation to previous work

Relation to previous work by Hrubeš [7].

The problem of proving quadratic size lower bounds on arithmetic proofs was considered by Hrubeš in [7]. The work in [7] gave several conditions and open problems under which quadratic size lower bounds on arithmetic proofs would follow (and further showed that the general framework suggested may have the potential, at least in theory, to yield quadratic-size Extended Frege lower bounds). The current work can be viewed as an attempt to extend the approach suggested in Hrubeš [7] from an approach suitable for proving up to quadratic-size lower bounds on proofs (and potentially Extended Frege proofs) to an approach for proving much stronger lower bounds, namely an lower bound on proofs, for every positive and for every field of characteristic zero; and, under stronger assumptions, exponential lower bounds on proofs (and similarly, potentially, on Extended Frege proofs).

Relation to other previous works.

Apart from the connection to [7], we may consider the relation of the current work to the work of Hrubeš and Tzameret [9], which obtained polynomial-size (arithmetic and propositional) proofs for certain identities concerning matrices. As far as we can see, there is no direct relation between these two works: in the current work we study matrix identities in which the number of matrices (i.e., variables) grows (if the number of matrices in the matrix identities over  is then the number of variables in the translation of the identities to a set of identities is ), whereas in [9] the number of matrices was fixed and only the dimension of the matrices grows.

Note also that the matrix identities studied in [9] are not even translations (via ) of matrix identities over . For instance consider the identity from [9], where and are matrices. Then we get that:

is equal to . But notice that, e.g., in our translation of a matrix identity over , two variables that correspond to the same matrix cannot multiply each other, while in the example above, multiplies and multiplies , even though they are entries of the same matrix.

Technical appendix

Appendix A Formal preliminaries

a.1 Algebras with polynomial identities

For a natural number , put . We use lower case letters for constants from the underlying field, for variables and for vectors of variables, or upper case letters such as for polynomials and , for vectors of polynomials (when the arity of the vector is clear from the context).

A polynomial is a formal sum of monomials, where a monomial is a product of (possibly non-commuting) variables and a constant from the underlying field. For two polynomials and we say that is a substitution instance of if for some polynomials ; and we sometimes denote by . For a polynomial , denotes the polynomial that replaces by in respectively, where are distinct numbers from and .

For a vector of polynomials where is a positive integer, we also use the notation , to denote the vector of polynomials that replaces the coordinate in by a polynomial , where .

Definition 1.

Let A be a vector space over a field and be a distributive multiplication operation. If is associative, that is, for all in , then the pair is called an associative algebra over , or an -algebra, for short. (In general, an -algebra can be non-associative; but since we only discuss associative algebras in this paper, we use the notion of an -algebra to imply that the algebra is associative.)

Perhaps the most prominent example of an -algebra is the algebra of matrices, for some positive natural number , with entries from (with the usual addition and multiplication of matrices). We denote this algebra by . Note indeed that  is an associative algebra but not a commutative one (i.e., the multiplication of matrices is non-commutative because does not necessarily equal , for two matrices ).
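A minimal numpy check of this non-commutativity (the two matrices below are ad hoc witnesses of our choosing):

```python
import numpy as np

# two 2x2 matrices witnessing that matrix multiplication does not commute
A = np.array([[0, 1],
              [0, 0]])
B = np.array([[0, 0],
              [1, 0]])

print(A @ B)  # [[1 0]
              #  [0 0]]
print(B @ A)  # [[0 0]
              #  [0 1]]
```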

Definition 2.

Let denote the associative algebra of all polynomials such that the variables are non-commutative with respect to multiplication. We call the free algebra (over ).

For example, and are three distinct polynomials in .

Note that the set forms a non-commutative ring. We sometimes call the ring of non-commutative polynomials and call the polynomials from non-commutative polynomials. Throughout this paper, unless otherwise stated, a polynomial is meant to be a non-commutative polynomial, namely a polynomial from the free algebra .

We now introduce the concept of a polynomial identity algebra, PI-algebra for short:

Definition 3.

Let be an -algebra. An identity of is a polynomial such that:

A PI-algebra is simply an algebra that has a non-trivial identity, that is, there is a nonzero that is an identity of the algebra.

For example, every commutative -algebra is also a PI-algebra: for any , it holds that , and so is a nonzero polynomial identity of , for any positive . A concrete example of a commutative algebra is the usual ring of (commutative) polynomials with coefficients from a field  and variables , denoted usually .

An example of an algebra that is not a PI-algebra is the free algebra  itself. This is because a nonzero polynomial cannot be an identity of  (since the assignment that maps each variable to itself does not nullify ).

A two-sided ideal of an -algebra is a subset of such that for any (not necessarily distinct) elements from we have , for all .

Definition 4.

A T-ideal is a two-sided ideal of  that is closed under all endomorphisms (an algebra endomorphism of is an (algebra) homomorphism ), namely, is closed under all substitutions of variables by polynomials.

In other words, a T-ideal is a two-sided ideal , such that if then , for any .

It is easy to see the following:

Fact 3.

The set of identities of an (associative) algebra is a T-ideal.

A basis of a T-ideal is a set of polynomials whose substitution instances generate as an ideal:

Definition 5.

Let be a set of polynomials and let be a T-ideal in . We say that is a basis for or that is generated as a T-ideal by , if every can be written as:

for and (for all ).

Given , we write to denote the T-ideal generated by . Thus, a T-ideal is generated by if .

Examples: is simply the set of all polynomials from . is the set of all non-commutative polynomials that are zero if considered as commutative polynomials.

Note that the concept of a T-ideal is already somewhat reminiscent of logical proof systems, where generators of the T-ideal are like axioms schemes and generators of a two-sided ideal containing are like substitution instances of the axioms.


A polynomial is homogeneous if all its monomials have the same total degree. Given a polynomial , the homogeneous part of degree of , denoted , is the sum of all its monomials of total degree . We write to denote the th-homogeneous part of the circuit , and the vector denotes the vector consisting of the th-homogeneous parts of the circuits .

Definition 6.

denotes the standard identity of degree as follows:

where denotes the symmetric group on elements and is the sign of the permutation .
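The standard identity of Definition 6 can be evaluated directly. The sketch below (using numpy; the helper names are ours) numerically illustrates the Amitsur–Levitzki theorem, which states that the standard polynomial of degree 2n vanishes on all n×n matrices; here n = 2, so degree 4:

```python
# Numerical illustration of the Amitsur-Levitzki theorem: the standard
# polynomial s_d vanishes on n x n matrices whenever d >= 2n (here n = 2, d = 4).
from itertools import permutations
import numpy as np

def perm_sign(perm):
    """Sign of a permutation of 0..d-1, computed via its inversion count."""
    inv = sum(1 for i in range(len(perm))
                for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inv % 2 else 1

def standard_polynomial(mats):
    """Evaluate the standard polynomial s_d on a list of d square matrices."""
    d, n = len(mats), mats[0].shape[0]
    total = np.zeros((n, n), dtype=np.int64)
    for perm in permutations(range(d)):
        prod = np.eye(n, dtype=np.int64)
        for i in perm:              # signed sum over all d! orderings
            prod = prod @ mats[i]
        total += perm_sign(perm) * prod
    return total

rng = np.random.default_rng(0)
mats = [rng.integers(-5, 6, size=(2, 2)) for _ in range(4)]
print(standard_polynomial(mats))  # the 2x2 zero matrix
```

By contrast, s_2(x1, x2) = x1·x2 − x2·x1 is not an identity of 2×2 matrices, as the same evaluator confirms on the witnesses from the non-commutativity example above.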

For polynomials where , we define the generalized-commutator as follows:

A polynomial with variables is homogenous with degrees ( times) if in every monomial the power of every variable is precisely 1. In other words, every monomial is of the form , for some permutation of order and some scalar . For the sake of simplicity, we shall talk in the sequel about polynomial of degree , when referring to polynomial with degrees ( times). Thus, any polynomial with variables is homogenous of total-degree .

a.2 Arithmetic circuits

Definition 7.

Let be a field, and let be a set of input variables. An arithmetic (or algebraic) circuit is a directed acyclic graph, where the in-degree of nodes is at most . Every leaf of the graph (namely, a node of in-degree 0) is labelled with either an input variable or a field element. Every other node of the graph is labelled with either or (in the first case the node is a sum-gate and in the second case a product-gate). Every edge in the graph is labelled with an arbitrary field element. A node of out-degree is called an output-gate of the circuit.

Every node and every edge in an arithmetic circuit computes a polynomial in the commutative polynomial-ring in the following way. A leaf computes the input variable or field element that labels it. A sum-gate computes the sum of the polynomials computed by the two edges that reach it. A product-gate computes the product of the polynomials computed by the two edges that reach it. We say that a polynomial is computed by the circuit if it is computed by one of the circuit’s output-gates.

The size of a circuit is defined to be the number of edges in , and is denoted by .
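A circuit in the sense of Definition 7 can be evaluated gate by gate along a topological order of the DAG. The node encoding below is a hypothetical minimal one of our own (in particular, it omits the field-element labels on edges):

```python
# Gate-by-gate evaluation of an arithmetic circuit given in topological order.
# Nodes are ('var', name), ('const', c), ('add', i, j) or ('mul', i, j),
# where i and j index earlier nodes.  (Hypothetical encoding; edge labels
# from Definition 7 are omitted for brevity.)
def eval_circuit(nodes, assignment):
    vals = []
    for node in nodes:
        kind = node[0]
        if kind == 'var':           # leaf labelled with an input variable
            vals.append(assignment[node[1]])
        elif kind == 'const':       # leaf labelled with a field element
            vals.append(node[1])
        elif kind == 'add':         # sum-gate over two earlier nodes
            vals.append(vals[node[1]] + vals[node[2]])
        elif kind == 'mul':         # product-gate over two earlier nodes
            vals.append(vals[node[1]] * vals[node[2]])
    return vals[-1]                 # value at the (single) output-gate

# a circuit computing (x + 2) * x
circuit = [('var', 'x'), ('const', 2), ('add', 0, 1), ('mul', 2, 0)]
print(eval_circuit(circuit, {'x': 3}))  # 15
```

The same evaluator covers the non-commutative circuits of Definition 8 when the values are matrices, provided the `('mul', i, j)` operands are taken in the fixed left-to-right order of the children.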

Definition 8.

Let be a field, and let be a set of input variables. A non-commutative arithmetic circuit is defined similarly to the arithmetic circuits above, with the following additional feature: the two children of any product-gate are given in a fixed order.

Every node and every edge in a non-commutative arithmetic circuit computes a non-commutative polynomial in the free algebra in exactly the same way as in the arithmetic circuit case, except that at each product-gate the ordering of the children is taken into account in defining the polynomial computed at the gate.

The size of a non-commutative circuit is likewise defined to be the number of vertices in , and is denoted by .

Appendix B The complexity measure


Let be a PI-algebra (Definition 3) and let be the T-ideal (Definition 4) consisting of all identities of (see Fact 3). Assume that is a basis for the T-ideal , that is, . Then every is a consequence of , namely, can be written as a linear combination of substitution instances of polynomials from as follows:

(3)

for and (for all ).

A very natural question, from the complexity point of view, is the following: What is the minimal number of distinct substitution instances of generators from that must occur in (3)? Or in other words, how many distinct substitution instances of generators are needed to generate above? (Note that this differs from the minimal number of summands for which (3) holds, because the same substitution instance may necessarily occur more than once in the sum, due to the non-commutativity of multiplication.)

Formally, we have the following:

Definition 9 ().

For a set of polynomials , define as the smallest (finite) such that there exist substitution instances of polynomials from with

where is the two-sided ideal generated by .

If the set is a singleton , we shall sometimes write instead of .

Accordingly, we extend Definition 9 to a sequence of polynomials and let be the smallest such that there exist some substitution instances of polynomials from with

Note that is interesting only if is not already in the generating set. Hence, we need to make sure that the generating set does not contain , and the easiest way to do this (when considering the asymptotic growth of the measure) is by stipulating that the generating set is finite. Given an algebra, the question of whether there exists a finite generating set for the T-ideal of the identities of the algebra is a highly non-trivial problem, which goes by the name of the Specht problem. Fortunately, for matrix algebras we can use the solution of the Specht problem given by Kemer [13]: Kemer showed that for every matrix algebra there exists a finite basis of the T-ideal of the identities of . The problem of actually finding such a finite basis for most matrix algebras (namely, for all values of with ) remains open.

We have the following simple proposition (which is analogous to a certain extent to the fact that every two Frege proof systems polynomially simulate each other; see e.g. [14]):

Proposition 4.

Let be some -algebra and let and be two finite bases for the identities of . Then, there exists a constant (that depends only on ) such that for any identity of :

<