Using Elimination Theory to construct Rigid Matrices

Abhinav Kumar,   Satyanarayana V. Lokam,
Vijay M. Patankar,   Jayalal Sarma M.N.
abhinav@math.mit.edu, Department of Mathematics, Massachusetts Institute of Technology, Cambridge, USA. This work was started when the author was with Microsoft Research, Redmond, and later supported by NSF CAREER grant DMS-0952486.

satya@microsoft.com, Microsoft Research India, Bangalore, India.

vijay@isichennai.res.in, Indian Statistical Institute, Chennai Centre, Chennai, India. This work was started and major part of this work was completed while the author was with Microsoft Research India, Bangalore.

jayalal@cse.iitm.ac.in, Department of Computer Science & Engineering, Indian Institute of Technology Madras, Chennai, India. Work done while the author was with the Institute of Mathematical Sciences, Chennai, and Institute for Theoretical Computer Science, Tsinghua University, Beijing, China.

Abstract

The rigidity of a matrix $A$ for target rank $r$ is the minimum number of entries of $A$ that must be changed to ensure that the rank of the altered matrix is at most $r$. Since its introduction by Valiant [Val77], rigidity and similar rank-robustness functions of matrices have found numerous applications in circuit complexity, communication complexity, and learning complexity. Almost all $n \times n$ matrices over an infinite field have a rigidity of $(n-r)^2$. It is a long-standing open question to construct infinite families of explicit matrices even with superlinear rigidity when $r = \Omega(n)$.

In this paper, we construct an infinite family of complex matrices with the largest possible, i.e., $(n-r)^2$, rigidity. The entries of an $n \times n$ matrix in this family are distinct primitive roots of unity of orders roughly $\exp(n^2 \log n)$. To the best of our knowledge, this is the first family of concrete (but not entirely explicit) matrices having maximal rigidity and a succinct algebraic description.

Our construction is based on elimination theory of polynomial ideals. In particular, we use results on the existence of polynomials in elimination ideals with effective degree upper bounds (effective Nullstellensatz). Using elementary algebraic geometry, we prove that the dimension of the affine variety of matrices of rigidity at most $k$ is exactly $n^2 - (n-r)^2 + k$. Finally, we use elimination theory to examine whether the rigidity function is semicontinuous.

1 Introduction

Valiant [Val77] introduced the notion of matrix rigidity. The rigidity function $\mathrm{Rig}(A, r)$ of a matrix $A$ for target rank $r$ is defined to be the smallest number of entries of $A$ that must be changed to ensure that the altered matrix has rank at most $r$. It is easy to see that for every $n \times n$ matrix $A$ (over any field), $\mathrm{Rig}(A, r) \le (n-r)^2$. Valiant also showed that, over an infinite field, almost all matrices have rigidity exactly $(n-r)^2$. It is a long-standing open question to construct infinite families of explicit matrices with superlinear rigidity for $r = \Omega(n)$. Here, by an explicit family, we mean that the $n \times n$ matrix in the family is computable by a deterministic Turing machine in time polynomial in $n$ or by a Boolean circuit of size polynomial in $n$.

Lower bounds on rigidity of explicit matrices are motivated by their numerous applications in complexity theory. In particular, Valiant showed that lower bounds of the form $\mathrm{Rig}(A, \epsilon n) \ge n^{1+\delta}$ (where $\epsilon$ and $\delta$ are some positive constants) imply that the linear transformation defined by $A$ cannot be computed by arithmetic circuits of linear size and logarithmic depth consisting of gates that compute linear functions of their inputs. Since then, applications of lower bounds on rigidity and similar rank-robustness functions have been found in circuit complexity, communication complexity, and learning complexity (see [FKL01, For02, Raz89, Lok01, PP04, LS09]). For comprehensive surveys on this topic, see [Cod00], [Che05], and [Lok09].

Over finite fields, the best known lower bound for explicit $A$ was first proved by Friedman [Fri93] and is $\mathrm{Rig}(A, r) = \Omega\left(\frac{n^2}{r} \log \frac{n}{r}\right)$ for parity check matrices of good error-correcting codes. Over infinite fields, the same lower bound was proved by Shokrollahi, Spielman, and Stemann [SSS97] for Cauchy matrices, Discrete Fourier Transform matrices of prime order (see [Lok00]), and other families. Note that this type of lower bound reduces to the trivial $\mathrm{Rig}(A, r) = \Omega(n)$ when $r = \Omega(n)$. In [Lok06], lower bounds of the form $\mathrm{Rig}(A, r) = \Omega(n^2)$ were proved when $A = (\sqrt{p_{ij}})$ or when $A = (e^{2\pi\sqrt{-1}/p_{ij}})$, where $p_{ij}$ are the first $n^2$ primes. These matrices, however, are not explicit in the sense defined above.

In this paper, we construct an infinite family of complex matrices with the highest possible, i.e., $(n-r)^2$, rigidity. The entries of the $n \times n$ matrix in this family are primitive roots of unity of orders roughly $\exp(n^2 \log n)$. We show that the real parts of these matrices are also maximally rigid. Like the matrices in [Lok06], this family of matrices is not explicit in the sense of efficient computability described earlier. However, one of the motivations for studying rigidity comes from algebraic complexity. In the world of algebraic complexity, any element of the ground field (in our case $\mathbb{C}$) is considered a primitive or atomic object. In this sense, the matrices we construct are explicitly described algebraic entities. To the best of our knowledge, this is the first construction giving an infinite family of non-generic/concrete matrices with maximum rigidity. It is still unsatisfactory, though, that the roots of unity in our matrices have orders exponential in $n$. Earlier constructions in [Lok06] use roots of unity of orders polynomial in $n$, but the bounds on rigidity proved there are weaker: $n(n - cr)$ for some constant $c > 2$.

We pursue a general approach to studying rigidity based on elementary algebraic geometry and elimination theory. To set up the formalism of this approach, we begin by reproving Valiant’s result that the set of matrices of rigidity less than $(n-r)^2$ is contained in a proper Zariski closed set in $\mathbb{C}^{n^2}$, i.e., such matrices are solutions of a finite system of polynomial equations. (Footnote 1: We note that this set itself may not be Zariski closed, as was mistakenly claimed in some earlier results, e.g., [Lok01], [LTV03]. The example in Section 5.1.1 exhibits such a set that is not Zariski closed.) Hence a generic matrix has rigidity at least $(n-r)^2$. In fact, we prove a more general statement: the set of matrices of rigidity at most $k$ for target rank $r$ has dimension (as an affine variety) exactly $n^2 - (n-r)^2 + k$. This sheds light on the geometric structure of rigid matrices. We believe that our argument in this context is clearer and cleaner than an earlier work in the projective setting by [LTV03]. To look for specific matrices of high rigidity, we consider certain elimination ideals associated to matrices with rigidity at most $k$. A result in [DFGS91] using effective Nullstellensatz bounds (for instance, as in [Bro87, Kol88]) shows that an elimination ideal of a polynomial ideal must always contain a nonzero polynomial with an explicit degree upper bound (Theorem 9). We then use simple facts from algebraic number theory to prove that a matrix whose entries are primitive roots of unity of sufficiently high orders cannot satisfy any polynomial with such a degree upper bound. This gives us the claimed family of matrices of maximum rigidity.

Our primary objects of interest in this paper are the varieties of matrices with rigidity at most $k$. For a fixed $k$, we have a natural decomposition of this variety based on the patterns of changes. We prove that this natural decomposition is indeed a decomposition into irreducible components (Corollary 16). In fact, these components are defined by elimination ideals of determinantal ideals generated by all the $(r+1) \times (r+1)$ minors of an $n \times n$ matrix of indeterminates. Better effective upper bounds on the degree of a nonzero polynomial in the elimination ideal of determinantal ideals than those given by Theorem 9 would lead to similar improvements in the bound on the order of the primitive roots of unity we use to construct our rigid matrices. While determinantal ideals have been well studied in the mathematical literature, their elimination theory does not seem to have been as well studied. The application to rigidity might be a natural motivation for further investigating the elimination ideals that arise in this situation.

We next consider the question: given a matrix $A$, is there a small neighbourhood of $A$ within which the rigidity function is nondecreasing, i.e., such that every matrix in this neighbourhood has rigidity at least equal to that of $A$? This is related to the notion of semicontinuity of the rigidity function. We give a family of examples to show that the rigidity function is in general not semicontinuous. However, the specific matrices we produce with entries being roots of unity as above, by their very construction, have neighbourhoods within which rigidity is nondecreasing.

The rest of the paper is organized as follows. In Section 2, we introduce some definitions and notations and recall a basic result from elimination theory. Much of the necessary background from basic algebraic geometry is reviewed in Appendix A. We introduce our main approach in Section 3, reprove Valiant’s theorem, and compute the dimension of the variety of matrices of rigidity at most $k$. We present our new construction of maximally rigid matrices in Section 3.3. The connection to the elimination ideals of determinantal ideals is established in Section 4. In Section 5, we study semicontinuity of the rigidity function through examples and counterexamples.

2 Preliminaries

2.1 Definitions and Notations

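Let $F$ be a field. (Footnote 2: For the most part, we will use the field of complex numbers $\mathbb{C}$. However, many of our definitions make sense over an arbitrary field and the theorems we use from algebraic geometry hold over any algebraically closed field.) Then, by $M_n(F)$ we denote the algebra of $n \times n$ matrices over $F$. At times, when it is clear from the context, we will denote $M_n(F)$ by $M_n$. We use $M_{m \times n}(F)$ to denote the set of $m \times n$ matrices over $F$. For $X \in M_n(F)$, by $X_{ij}$ we will denote the $(i,j)$-th entry of $X$. Given $X \in M_n(F)$, the support of $X$ is defined as

\[ \mathrm{Supp}(X) := \{ (i,j) : X_{ij} \neq 0 \}. \]

Given a non-negative integer $k$, we define

\[ S(k) := \{ X \in M_n(F) : |\mathrm{Supp}(X)| \le k \}. \]

Thus, $S(k)$ is the set of matrices over $F$ with at most $k$ non-zero entries.

A pattern $\pi$ is a subset of the positions of an $n \times n$ matrix. Then, we define:

\[ S(\pi) := \{ X \in M_n(F) : \mathrm{Supp}(X) \subseteq \pi \}. \]

Note that $S(k) = \bigcup_{|\pi| = k} S(\pi)$.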

Definition 1.

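The rigidity function $\mathrm{Rig}(X, r)$ is the minimum number of entries we need to change in the matrix $X$ so that the rank becomes at most $r$:

\[ \mathrm{Rig}(X, r) := \min \{ |\mathrm{Supp}(T)| : T \in M_n(F),\ \operatorname{rank}(X + T) \le r \}. \]

Sometimes, we will allow $T$ to be chosen in $M_n(L)$ for $L$ an extension field of $F$. In this case we will denote the rigidity by $\mathrm{Rig}(X, r, L)$.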
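
To make the definition concrete, the following is a minimal brute-force sketch, not part of the paper. It assumes the field is GF(2), where changing an entry simply means flipping a bit, so the minimum in the definition can be found by exhaustive search on very small matrices. The helper names `rank_gf2` and `rigidity_gf2` are hypothetical, and the search is exponential in $n^2$.

```python
from itertools import combinations

def rank_gf2(rows, n):
    """Rank over GF(2) of a matrix whose rows are given as n-bit integer bitmasks."""
    rows = list(rows)
    rank = 0
    for col in range(n):
        # Look for a pivot row (at or below position `rank`) with a 1 in this column.
        pivot = next((i for i in range(rank, len(rows)) if (rows[i] >> col) & 1), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        # Clear this column in every other row.
        for i in range(len(rows)):
            if i != rank and (rows[i] >> col) & 1:
                rows[i] ^= rows[rank]
        rank += 1
    return rank

def rigidity_gf2(A, r):
    """Smallest number of entries of the 0/1 matrix A to flip so that rank over GF(2) is <= r.

    Illustration only (assumption: the field is GF(2), not the complex numbers).
    """
    n = len(A)
    base = [sum((A[i][j] & 1) << j for j in range(n)) for i in range(n)]
    positions = [(i, j) for i in range(n) for j in range(n)]
    for k in range(n * n + 1):                # try patterns of increasing size
        for pattern in combinations(positions, k):
            rows = base[:]
            for i, j in pattern:              # over GF(2), a change is a bit flip
                rows[i] ^= 1 << j
            if rank_gf2(rows, n) <= r:
                return k
    raise AssertionError("unreachable for r >= 0: zeroing the support gives rank 0")

identity3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print([rigidity_gf2(identity3, r) for r in range(4)])
```

For the $3 \times 3$ identity matrix the values stay below the general upper bound $(n-r)^2$ for every $r$, which is consistent with the trivial bound mentioned in the introduction.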

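Let $\mathrm{RIG}(n, r, k)$ denote the set of $n \times n$ matrices $X$ such that $\mathrm{Rig}(X, r) = k$. Similarly, we define $\mathrm{RIG}(n, r, \ge k)$ to be the set of matrices of rigidity at least $k$ and $\mathrm{RIG}(n, r, \le k)$ to be the set of matrices of rigidity at most $k$. For a pattern $\pi$ of size $k$, let $\mathrm{RIG}(n, r, \pi)$ be the set of matrices $X$ such that for some $T_\pi \in S(\pi)$ we have $\operatorname{rank}(X + T_\pi) \le r$. Then we have

\[ \mathrm{RIG}(n, r, \le k) = \bigcup_{|\pi| = k} \mathrm{RIG}(n, r, \pi). \]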

2.2 Elimination Theory and the Closure Theorem

We review much of the necessary background from algebraic geometry in Appendix A. Here we recall a basic result from Elimination Theory. As the name suggests, Elimination Theory deals with elimination of a subset of variables from a given set of polynomial equations and finding the reduced set of polynomial equations (not involving the eliminated variables). The main results of Elimination Theory, especially the Closure Theorem, describe a precise relation between the reduced ideal and the given ideal, and its corresponding geometric interpretation.

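Given an ideal $I = \langle f_1, \dots, f_s \rangle \subseteq F[x_1, \dots, x_n]$, the $l$-th elimination ideal $I_l$ is the ideal of $F[x_{l+1}, \dots, x_n]$ defined by

\[ I_l := I \cap F[x_{l+1}, \dots, x_n]. \]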

Theorem 2.

(Closure Theorem, page 125, Theorem 3 of [CLO07])
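Let $I$ be an ideal of $F[x_1, \dots, x_n]$, with $F$ algebraically closed, and let $I_l$ be the $l$-th elimination ideal associated to $I$. Let $V(I)$ and $V(I_l)$ be the subvarieties of $F^n$ and $F^{n-l}$ (the affine spaces over $F$ of dimension $n$ and $n-l$ respectively) defined by $I$ and $I_l$ respectively. Let $p$ be the natural projection map from $F^n$ to $F^{n-l}$ (projection map onto the $(x_{l+1}, \dots, x_n)$-coordinates). Then,

  1. $V(I_l)$ is the smallest (closed) affine variety containing $p(V(I))$. In other words, $V(I_l)$ is the Zariski closure of $p(V(I))$.

  2. When $V(I) \neq \emptyset$, there is an affine variety $W$ strictly contained in $V(I_l)$ such that $V(I_l) \setminus W \subseteq p(V(I))$.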
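
As a small illustration of both parts of the Closure Theorem (a standard textbook example, not taken from this paper), consider the hyperbola $xy = 1$ and eliminate $x$. The projection onto the $y$-axis misses the origin, so the image is not closed, but its Zariski closure is the whole line, which is exactly the variety of the (zero) elimination ideal:

```latex
\[
I = \langle xy - 1 \rangle \subseteq \mathbb{C}[x, y], \qquad
I_1 = I \cap \mathbb{C}[y] = \{0\},
\]
\[
p(V(I)) = \{\, y \in \mathbb{C} : y \neq 0 \,\}, \qquad
V(I_1) = \mathbb{C} = \overline{p(V(I))}, \qquad
W = \{0\} \subsetneq V(I_1), \quad V(I_1) \setminus W \subseteq p(V(I)).
\]
```

The same phenomenon is why the sets of matrices of bounded rigidity, which arise as projections, are only guaranteed to be contained in a closed set rather than being closed themselves.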

3 Use of Elimination Theory

3.1 Determinantal Ideals and their Elimination Ideals

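We would like to investigate the structure of the sets $\mathrm{RIG}(n, r, k)$ and $\mathrm{RIG}(n, r, \le k)$ and their Zariski closures

\[ W(n, r, k) := \overline{\mathrm{RIG}(n, r, k)}, \qquad W(n, r, \le k) := \overline{\mathrm{RIG}(n, r, \le k)} \]

in the $n^2$-dimensional affine space of $n \times n$ matrices. Note that we have the “upper bound” $\mathrm{Rig}(X, r) \le (n-r)^2$ and therefore $\mathrm{RIG}(n, r, \le (n-r)^2) = M_n(F)$. Let $X$ be an $n \times n$ matrix with entries being indeterminates $x_1, \dots, x_{n^2}$. For a pattern $\pi$ of $k$ positions, let $T_\pi$ be the $n \times n$ matrix with indeterminates $t_1, \dots, t_k$ in the positions given by $\pi$ and zeros elsewhere. Note that saying $X + T_\pi$ has rank at most $r$ is equivalent to saying that all its $(r+1) \times (r+1)$ minors vanish. Let us consider the ideal generated by these minors:

\[ I(n, r, \pi) := \left\langle \text{all } (r+1) \times (r+1) \text{ minors of } X + T_\pi \right\rangle \subseteq F[x_1, \dots, x_{n^2}, t_1, \dots, t_k]. \tag{1} \]

It then follows from the definition of rigidity that $\mathrm{RIG}(n, r, \pi)$ is the projection from $F^{n^2 + k}$ to $F^{n^2}$ of the algebraic set $V(I(n, r, \pi))$. Thus, if we define the elimination ideal

\[ EI(n, r, \pi) := I(n, r, \pi) \cap F[x_1, \dots, x_{n^2}], \]

then by the Closure Theorem (Theorem 2), we obtain

\[ W(n, r, \pi) := \overline{\mathrm{RIG}(n, r, \pi)} = V(EI(n, r, \pi)). \tag{2} \]

Note that

\[ W(n, r, \le k) = \bigcup_{|\pi| = k} W(n, r, \pi). \]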

3.2 Valiant’s Theorem

The following theorem due to Valiant [Val77, Theorem 6.4, page 172] says that a generic matrix has rigidity $(n-r)^2$. That is, for $k < (n-r)^2$, the dimension of $W(n, r, \le k)$ is strictly less than $n^2$.

A reader familiar with Valiant’s proof will realize that our proof is basically a rephrasing of Valiant’s proof in the language of algebraic geometry. The point of this proof is to set up the formalism and use it later, in particular when we compute the exact dimension of the rigidity variety $W(n, r, \le k)$.

Theorem 3.

(Valiant) Let $0 \le r \le n$ and $0 \le k < (n-r)^2$. Let $W(n, r, \le k)$ be as above. Then,

\[ \dim W(n, r, \le k) < n^2. \]

Proof.

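Let $\pi$ be a pattern of size $k$. For a choice of $s \le r$, we let $\tau$ denote a choice of $s$ rows and $s$ columns, and for a matrix $B$, let $B_\tau$ be the corresponding $s \times s$ submatrix of $B$, whose determinant is one of the $s \times s$ minors of $B$. For $s = 0$, we let $B_\tau$ be the empty matrix, with determinant defined to be $1$.

For $0 \le s \le r$ and $\tau$ as above, define $\mathrm{RIG}(n, r, \pi, \tau)$ to be the set of all matrices $X$ that satisfy the following properties: there exists some matrix $T$ such that

  1. $\operatorname{rank}(X + T) = s$,

  2. $\mathrm{Supp}(T) \subseteq \pi$, and

  3. $\det\big((X + T)_\tau\big) \neq 0$, where $\tau$ denotes the fixed minor as above.

Recall that $S(\pi)$ is the set of matrices whose support is contained in $\pi$. Let us also define

\[ \mathrm{RANK}(n, s, \tau) := \{ C \in M_n(F) : \operatorname{rank}(C) = s,\ \det(C_\tau) \neq 0 \}. \]

By definition, every element $X \in \mathrm{RIG}(n, r, \pi, \tau)$ can be written as $X = C + T$, with $C \in \mathrm{RANK}(n, s, \tau)$ and $T \in S(\pi)$ (recall that $S(\pi)$ is closed under negation).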

We first prove the following lemma:

Lemma 4.

\[ \dim \mathrm{RANK}(n, s, \tau) = n^2 - (n-s)^2. \]

Proof.

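Without loss of generality we can assume that $\tau$ is the upper left $s \times s$ minor. Thus we can write a $C \in \mathrm{RANK}(n, s, \tau)$ as

\[ C = \begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix}, \]

where $C_{11}$ is an $s \times s$ matrix whose determinant is non-zero.

Since the matrix $C_{11}$ is nonsingular of dimension equal to $\operatorname{rank}(C) = s$, it follows that the first $s$ columns are linearly independent and span the column space of $C$. Therefore each of the last $n - s$ columns is a linear combination of the first $s$ columns in exactly one way, and the linear combination is determined by the entries of $C_{12}$. Formally, we have the equation

\[ \begin{pmatrix} C_{12} \\ C_{22} \end{pmatrix} = \begin{pmatrix} C_{11} \\ C_{21} \end{pmatrix} C_{11}^{-1} C_{12}, \qquad \text{i.e.,} \quad C_{22} = C_{21} C_{11}^{-1} C_{12}. \]

The set of all $C_{11}$ is an affine open set of dimension $s^2$, and $C_{12}$ and $C_{21}$ can each range over $F^{s(n-s)}$. Hence, the algebraic set $\mathrm{RANK}(n, s, \tau)$ has dimension exactly $s^2 + 2s(n-s) = n^2 - (n-s)^2$. ∎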
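
The block structure used in this proof can be checked numerically. The following is a minimal sketch, not part of the paper, using exact rational arithmetic: given a matrix whose top-left $r \times r$ block is invertible, it overwrites only the bottom-right $(n-r) \times (n-r)$ block with $C_{21} C_{11}^{-1} C_{12}$, which forces the rank down to $r$ while changing at most $(n-r)^2$ entries. The function names (`rank`, `solve`, `matmul`, `lower_rank_by_corner`) are hypothetical helpers, and the sketch assumes $1 \le r < n$ and an invertible top-left block.

```python
from fractions import Fraction

def rank(M):
    """Rank of a rational matrix via Gaussian elimination."""
    M = [[Fraction(x) for x in row] for row in M]
    rk, rows, cols = 0, len(M), len(M[0])
    for c in range(cols):
        piv = next((i for i in range(rk, rows) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[rk], M[piv] = M[piv], M[rk]
        for i in range(rows):
            if i != rk and M[i][c] != 0:
                f = M[i][c] / M[rk][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[rk])]
        rk += 1
    return rk

def solve(A, B):
    """Return A^{-1} B by Gauss-Jordan elimination (assumes A is square and invertible)."""
    s = len(A)
    aug = [[Fraction(x) for x in A[i]] + [Fraction(x) for x in B[i]] for i in range(s)]
    for c in range(s):
        piv = next(i for i in range(c, s) if aug[i][c] != 0)
        aug[c], aug[piv] = aug[piv], aug[c]
        p = aug[c][c]
        aug[c] = [x / p for x in aug[c]]
        for i in range(s):
            if i != c and aug[i][c] != 0:
                f = aug[i][c]
                aug[i] = [a - f * b for a, b in zip(aug[i], aug[c])]
    return [row[s:] for row in aug]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def lower_rank_by_corner(A, r):
    """Replace the bottom-right (n-r) x (n-r) block of A by C21 * C11^{-1} * C12.

    Assumes 1 <= r < n and that the top-left r x r block C11 is invertible.
    """
    n = len(A)
    C11 = [row[:r] for row in A[:r]]
    C12 = [row[r:] for row in A[:r]]
    C21 = [row[:r] for row in A[r:]]
    C22 = matmul(C21, solve(C11, C12))
    B = [[Fraction(x) for x in row] for row in A]
    for i in range(n - r):
        for j in range(n - r):
            B[r + i][r + j] = C22[i][j]
    return B

A = [[2, 1, 0], [1, 3, 4], [5, 6, 7]]
B = lower_rank_by_corner(A, 1)
changed = sum(B[i][j] != A[i][j] for i in range(3) for j in range(3))
print(rank(A), rank(B), changed)
```

This is the same observation that underlies the upper bound $\mathrm{Rig}(X, r) \le (n-r)^2$ for matrices with a nonsingular $r \times r$ corner; the general case needs a suitable choice of rows and columns, which the sketch does not attempt.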

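Consider the following natural map $\Phi$:

\[ \Phi : \mathrm{RANK}(n, s, \tau) \times S(\pi) \to M_n(F), \tag{3} \]

taking $(C, T)$ to $C + T$. The image of $\Phi$ is exactly $\mathrm{RIG}(n, r, \pi, \tau)$ as defined at the beginning of this proof.

Also, note that $\dim\big(\mathrm{RANK}(n, s, \tau) \times S(\pi)\big) = n^2 - (n-s)^2 + k$ by Lemma 4, since $\dim S(\pi) = |\pi| = k$. We note that if there is a surjective morphism from an affine variety $Y$ to another affine variety $Z$, then $\dim Z \le \dim Y$ (a more formal statement appears as Lemma 25 in Appendix A). Thus for $s \le r$, we get

\[ \dim \overline{\mathrm{RIG}(n, r, \pi, \tau)} \le n^2 - (n-s)^2 + k \le n^2 - (n-r)^2 + k. \tag{4} \]

Note that

\[ \mathrm{RIG}(n, r, \le k) = \bigcup_{\pi, \tau} \mathrm{RIG}(n, r, \pi, \tau), \tag{5} \]

where the finite union ranges over all patterns $\pi$ of size $k$ and all choices $\tau$ of $s$ rows and $s$ columns with $0 \le s \le r$. Since $k < (n-r)^2$, the bound (4) is strictly less than $n^2$ for every term of this union, and that completes the proof of the theorem. ∎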

Thus we have proved that the set of matrices of rigidity strictly smaller than $(n-r)^2$ is contained in a proper closed affine variety of $F^{n^2}$, and thus is of dimension strictly less than $n^2$. In other words, a generic matrix, i.e., a matrix that lies outside a certain proper closed affine subvariety of $F^{n^2}$, is maximally rigid (even if we allow changes by elements of the algebraic closure $\overline{F}$, rather than just $F$). Therefore, over an infinite field (for instance, an algebraically closed field), there always exist maximally rigid matrices.

We now refine Valiant’s argument and prove the following exact bound on the dimension of $W(n, r, \le k)$. The main point of the proof is a lower bound on $\dim W(n, r, \le k)$.

Theorem 5.

Let $0 \le r \le n$ and $0 \le k \le (n-r)^2$. Then

\[ \dim W(n, r, \le k) = n^2 - (n-r)^2 + k. \]

Proof.

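By the above proof of Theorem 3 (see Equation (4)), we only need to prove that $\dim W(n, r, \le k)$ is at least $n^2 - (n-r)^2 + k$. By Equation (5) as above,

\[ W(n, r, \le k) = \bigcup_{\pi, \tau} \overline{\mathrm{RIG}(n, r, \pi, \tau)}. \]

Thus, to prove the theorem it is sufficient to prove that for some $s \le r$, and some $\pi$ and $\tau$:

\[ \dim \overline{\mathrm{RIG}(n, r, \pi, \tau)} \ge n^2 - (n-r)^2 + k. \]

We take $s = r$ and choose $\pi$ and $\tau$ as follows. Fix a pattern $\pi$ of size $k$ such that it is a subset of $\{ (i, j) : r+1 \le i, j \le n \}$. This is possible because $k \le (n-r)^2$. Let $\tau$ be the top left $r \times r$ minor. We now define:

\[ U := \left\{ \begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{21} C_{11}^{-1} C_{12} + T \end{pmatrix} : \det C_{11} \neq 0,\ C_{12} \in M_{r \times (n-r)}(F),\ C_{21} \in M_{(n-r) \times r}(F),\ \mathrm{Supp}(T) \subseteq \pi \right\}. \tag{6} \]

As an affine algebraic variety, $U$ is isomorphic to $GL_r(F) \times F^{2r(n-r)} \times F^{k}$, and thus $\dim U = r^2 + 2r(n-r) + k = n^2 - (n-r)^2 + k$. If we subtract the matrix

\[ \begin{pmatrix} 0 & 0 \\ 0 & T \end{pmatrix} \]

from the matrix above, we get a matrix

\[ \begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{21} C_{11}^{-1} C_{12} \end{pmatrix} \]

of rank exactly $r$, since the first $r$ columns are linearly independent ($C_{11}$ being invertible) and the last $n - r$ columns are a linear combination of the first $r$, obtained by multiplying on the right by the matrix $C_{11}^{-1} C_{12}$. Therefore, $U \subseteq \mathrm{RIG}(n, r, \pi, \tau)$, and hence

\[ \dim \overline{\mathrm{RIG}(n, r, \pi, \tau)} \ge \dim U = n^2 - (n-r)^2 + k. \qquad ∎ \]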

Remark 6.

A similar argument or line of study, though in the projective setting, is also found in [LTV03]. Our formalism and proofs seem clearer and simpler. Our theorem is also very explicit.
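
As a quick sanity check of the dimension formula in Theorem 5 (a worked instance added for illustration, with the values $n = 3$ and $r = 1$ chosen arbitrarily), each additional allowed change raises the dimension by exactly one until the whole space is filled at $k = (n-r)^2$:

```latex
\[
n^2 - (n-r)^2 + k \;=\; r(2n - r) + k,
\qquad
n = 3,\ r = 1:\quad
\begin{array}{c|ccccc}
k & 0 & 1 & 2 & 3 & 4 \\ \hline
\dim & 5 & 6 & 7 & 8 & 9
\end{array}
\]
```

The case $k = 0$ recovers the dimension $n^2 - (n-r)^2$ of the variety of matrices of rank at most $r$, and the case $k = (n-r)^2 = 4$ gives $n^2 = 9$, matching the fact that every matrix has rigidity at most $(n-r)^2$.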

3.3 Rigid Matrices over the field of Complex Numbers

Recall that to say that the rigidity of a matrix $A$ for target rank $r$ is at least $k$, it suffices to prove that the matrix $A$ is not in $W(n, r, \le k-1)$. We use this idea to achieve the maximum possible lower bound for the rigidity of a family of matrices over the field of complex numbers $\mathbb{C}$. As a matter of fact, we obtain matrices with real algebraic entries with rigidity $(n-r)^2$.

Theorem 7.

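Let $\Delta(n) = n^{4n^2}$ and let $p_{i,j} > \Delta(n)$ be distinct primes for $1 \le i, j \le n$. Let $K = \mathbb{Q}(\zeta_{1,1}, \dots, \zeta_{n,n})$ where $\zeta_{i,j} = e^{2\pi\sqrt{-1}/p_{i,j}}$. Let $A(n) := [\zeta_{i,j}] \in M_n(K)$. Then, for any field $L$ containing $K$,

\[ \mathrm{Rig}(A(n), r, L) = (n-r)^2 \quad \text{for every } 0 \le r \le n. \]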

Proof.

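For simplicity, we will index the $\zeta_{i,j}$ by $\zeta_\alpha$ for $\alpha = 1$ to $n^2$, and similarly $p_\alpha$. First, note that we may assume $1 \le r \le n-1$, since for $r = n$ the statement of the theorem is a tautology, and for $r = 0$ it is obvious. We prove the theorem by showing that

\[ A(n) \notin W(n, r, \le (n-r)^2 - 1). \]

Thus it is sufficient to prove that

\[ A(n) \notin W(n, r, \pi) \]

for any pattern $\pi$ with $|\pi| = (n-r)^2 - 1$. Let $\pi$ be any such pattern. To simplify notation, let us define $k := (n-r)^2 - 1$. By Theorem 3 we have:

\[ \dim W(n, r, \pi) < n^2. \]

Equivalently (by Hilbert’s Nullstellensatz),

\[ EI(n, r, \pi) \neq (0). \]

Proving that $A(n) \notin W(n, r, \pi)$ is equivalent to showing the existence of a $g \in EI(n, r, \pi)$ such that $g(A(n)) \neq 0$. The key to the proof of the theorem is to produce a polynomial $g$ of sufficiently low degree.

Claim 8.

There is a nonzero polynomial $g \in EI(n, r, \pi)$ of total degree less than $\Delta(n)$.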

To prove the claim, we use the following theorem:

Theorem 9.

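([DFGS91], Proposition 1.7 and Remark 1.8) Let $I = \langle f_1, \dots, f_s \rangle$ be an ideal in the polynomial ring $F[Y]$ over an infinite field $F$, where $Y = \{ y_1, \dots, y_m \}$. Let $d$ be the maximum total degree of a generator $f_i$. Let $Z = \{ y_{i_1}, \dots, y_{i_\ell} \} \subseteq Y$ be a subset of indeterminates of $Y$. If $I \cap F[Z] \neq (0)$ then there exists a non-zero polynomial $g \in I \cap F[Z]$ such that $g = \sum_{i=1}^{s} g_i f_i$, with $g_i \in F[Y]$ and $\deg(g_i f_i) \le d^{\mu}(d^{\mu} + 1)$, where $\mu = \min\{ s, m \}$.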

Remark 10.

Note that the proof of Theorem 9 relies on a slightly different notion of the degree of a variety than the usual definition in projective algebraic geometry. This definition was used in [Hei83] to prove the Bézout inequality. For an explanation of how the first sentence of Remark 1.8 of [DFGS91] follows from this inequality, we refer the reader to Proposition 2.3 of [HS80].

Let us apply Theorem 9 to our case - in the notation of this theorem our data is as follows: , , , set of all minors of size , for , where by we denote the -th minor of , and as defined in (1). We may as well assume , since for the claim is easy to verify by explicit calculation. Then we have:

By Theorem 9 there exists a nonzero

\[ g \in EI(n, r, \pi) \]

such that

\[ \deg(g) < \Delta(n). \]

We will now apply the following Lemma 11, which we prove later, to this situation.

Lemma 11.

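Let $N$ be a positive integer. Let $\theta_1, \dots, \theta_m$ be algebraic numbers such that for any $1 \le i \le m$, the field $\mathbb{Q}(\theta_i)$ is Galois over $\mathbb{Q}$ and such that

\[ [\mathbb{Q}(\theta_i) : \mathbb{Q}] \ge N \quad \text{and} \quad \mathbb{Q}(\theta_i) \cap \mathbb{Q}(\theta_1, \dots, \theta_{i-1}) = \mathbb{Q}. \]

Let $g \in \mathbb{Q}[x_1, \dots, x_m]$ be a nonzero polynomial such that $\deg(g) < N$. Then,

\[ g(\theta_1, \dots, \theta_m) \neq 0. \]

Let us set $N = \Delta(n)$, $m = n^2$, and $\theta_\alpha = \zeta_\alpha$ in Lemma 11. It is now easy to check that

\[ [\mathbb{Q}(\zeta_\alpha) : \mathbb{Q}] = p_\alpha - 1 \ge \Delta(n) \]

and

\[ \mathbb{Q}(\zeta_\alpha) \cap \mathbb{Q}(\zeta_1, \dots, \zeta_{\alpha - 1}) = \mathbb{Q}. \]

The latter follows from the fact that the prime $p_\alpha$ is totally ramified in $\mathbb{Q}(\zeta_\alpha)$ and is unramified in $\mathbb{Q}(\zeta_1, \dots, \zeta_{\alpha-1})$; see Theorem 4.10 in [Nar04]. Thus Lemma 11 is applicable and we get:

\[ g(A(n)) = g(\zeta_1, \dots, \zeta_{n^2}) \neq 0. \]

To complete the argument (for Theorem 7), we now prove Lemma 11.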

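Proof of Lemma 11: By induction on $m$. For $m = 1$ this is trivial. Now suppose that the statement is true when the number of variables is strictly less than $m$. Assuming that the statement is not true for $m$, we will arrive at a contradiction. This will prove the lemma.

Let $g \in \mathbb{Q}[x_1, \dots, x_m]$, $g \neq 0$, with $\deg(g) < N$ be such that

\[ g(\theta_1, \dots, \theta_m) = 0 \]

with $\theta_i$, $1 \le i \le m$, satisfying the conditions as in the lemma. Since the statement is true for $m - 1$ variables by the inductive hypothesis, without loss of generality, we can assume that all the variables and hence $x_m$ appears in $g$. Let us denote $\deg_{x_m}(g)$ by $\ell$. Let us write

\[ g(x_1, \dots, x_m) = \sum_{i=0}^{\ell} g_i(x_1, \dots, x_{m-1})\, x_m^{i}. \]

Note that $\ell < N$ and $\deg(g_i) < N$ for $0 \le i \le \ell$. Since $g \neq 0$, for some $i$ the polynomial $g_i \neq 0$. Thus, by the inductive hypothesis,

\[ g_i(\theta_1, \dots, \theta_{m-1}) \neq 0. \]

Thus $g(\theta_1, \dots, \theta_{m-1}, x_m) \neq 0$ as a polynomial in $x_m$. This implies that $\theta_m$ satisfies a non-zero polynomial over $\mathbb{Q}(\theta_1, \dots, \theta_{m-1})$ of degree less than $N$. Thus:

\[ [\mathbb{Q}(\theta_1, \dots, \theta_m) : \mathbb{Q}(\theta_1, \dots, \theta_{m-1})] < N. \tag{7} \]

On the other hand, since $\mathbb{Q}(\theta_m) \cap \mathbb{Q}(\theta_1, \dots, \theta_{m-1}) = \mathbb{Q}$ and the fields $\mathbb{Q}(\theta_i)$ are Galois over $\mathbb{Q}$, by Theorem 12 (stated below), we conclude that

\[ [\mathbb{Q}(\theta_1, \dots, \theta_m) : \mathbb{Q}(\theta_1, \dots, \theta_{m-1})] = [\mathbb{Q}(\theta_m) : \mathbb{Q}] \ge N. \]

This contradicts (7) above and proves the lemma.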

Theorem 12.

([Lan04], Theorem 1.12, page 266) Let $K$ be a Galois extension of $k$, let $F$ be an arbitrary extension of $k$, and assume that $K$, $F$ are subfields of some other field. Then $KF$ (the compositum of $K$ and $F$) is Galois over $F$, and $K$ is Galois over $K \cap F$. Let $H$ be the Galois group of $KF$ over $F$, and $G$ the Galois group of $K$ over $k$. If $\sigma \in H$ then the restriction of $\sigma$ to $K$ is in $G$, and the map $\sigma \mapsto \sigma|_K$ gives an isomorphism of $H$ on the Galois group of $K$ over $K \cap F$. In particular, $[KF : F] = [K : K \cap F]$.

This concludes the proof of Theorem 7. ∎
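
The single-variable case of Lemma 11 can be illustrated with a short sketch, which is not part of the paper. It relies on the standard fact that for a prime $p$ the minimal polynomial of $\zeta_p = e^{2\pi\sqrt{-1}/p}$ over $\mathbb{Q}$ is $\Phi_p(x) = 1 + x + \dots + x^{p-1}$, so an integer (or rational) polynomial vanishes at $\zeta_p$ exactly when its remainder modulo $\Phi_p$ is zero. The function names `remainder_mod_phi_p` and `vanishes_at_zeta_p` are hypothetical helpers introduced only for this illustration.

```python
def remainder_mod_phi_p(coeffs, p):
    """Remainder of a polynomial modulo Phi_p(x) = 1 + x + ... + x^(p-1), p prime.

    coeffs[i] is the coefficient of x^i. The result has exactly p - 1 coefficients.
    """
    r = list(coeffs)
    d = p - 1                      # degree of Phi_p
    while len(r) > d:
        lead = r.pop()             # leading coefficient, of degree `top`
        top = len(r)
        # Subtract lead * x^(top - d) * Phi_p; its top term cancels the popped one.
        for i in range(top - d, top):
            r[i] -= lead
    return r + [0] * (d - len(r))

def vanishes_at_zeta_p(coeffs, p):
    """True iff the polynomial is zero at a primitive p-th root of unity (p prime)."""
    return not any(remainder_mod_phi_p(coeffs, p))

p = 5
print(vanishes_at_zeta_p([1, 1, 1, 1, 1], p))      # Phi_5 itself
print(vanishes_at_zeta_p([-1, 0, 0, 0, 0, 1], p))  # x^5 - 1
print(vanishes_at_zeta_p([2, 0, 0, 1], p))         # x^3 + 2, degree 3 < p - 1
```

A nonzero polynomial of degree less than $p - 1$ is its own remainder, so it cannot vanish at $\zeta_p$; Lemma 11 extends this degree argument to several roots of unity at once by using linear disjointness of the fields involved.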

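Note that Theorem 7 is true for any family of matrices $A(n) = [\theta_{i,j}]$ provided the $\theta_{i,j}$ satisfy the hypotheses of Lemma 11 with $N = \Delta(n)$. Hence, we have:

Corollary 13.

Let $A(n) := [\zeta_{i,j} + \overline{\zeta_{i,j}}]$, where $\zeta_{i,j}$ are primitive roots of unity of order $p_{i,j}$ such that $p_{i,j} - 1 \ge 2\Delta(n)$ (here $\overline{\zeta}$ denotes the complex conjugate of $\zeta$). Then, $A(n) \in M_n(\mathbb{R})$ has $\mathrm{Rig}(A(n), r) = (n-r)^2$.

Proof.

We apply the remark above with $\theta_{i,j} = \zeta_{i,j} + \overline{\zeta_{i,j}}$, which generates the maximal real subfield of $\mathbb{Q}(\zeta_{i,j})$. These fields are Galois over $\mathbb{Q}$, and since $\mathbb{Q}(\zeta_{i,j} + \overline{\zeta_{i,j}}) \subseteq \mathbb{Q}(\zeta_{i,j})$, they satisfy the linear disjointness property which forms the second part of the assumption of Lemma 11. ∎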

4 Reduction to Determinantal Ideals

In this section, we show that the natural decomposition of the rigidity varieties is indeed a decomposition into irreducible affine algebraic varieties. In fact, these components turn out to be varieties defined by elimination ideals of determinantal ideals generated by all the $(r+1) \times (r+1)$ minors.

To improve the bounds on the orders of primitive roots of unity in Theorem 7, it suffices to improve the degree bounds given by Theorem 9 for the special case when $I$ is a determinantal ideal. However, we do not know of such an improvement even for the special case when $I$ is the determinantal ideal of a generic Vandermonde matrix.

To show the decomposition, we will continue to use the notation from Section 3. Consider the matrix . Let , where is the set of variables that are indexed by and is the set of remaining variables.

Let

be the ideal of generated by the minors of . Let

Notice that since is the elimination ideal of w.r.t. eliminating variables , a matrix lies in if and only if its entries lie in the variety defined by the ideal . Therefore, equals the elimination ideal defined in Section 3.1, by definition. Also, is the ideal generated by the minors of and its elimination ideal for the polynomial ring over the rationals generated by the variables .

Proposition 14.

(the ideal generated by in ) and . In particular, considered as ideals in .

Proof.

First, notice that in the minors of , the variable , for , always occurs in combination with as . Therefore, eliminating the variables will also automatically eliminate the variables , giving the equality of the generators of the ideals and . Therefore . More formally, consider the automorphism of defined by letting for each and for all . The ideal must equal the ideal , since is an isomorphism. But is generated by determinants of matrices only involving the variables and , whereas , so that is generated by polynomials only involving the variables of . Therefore . Taking the image under , we get .

The equation follows from similar considerations, noting that the variables for always occur in the combination in the minors which generate . Therefore eliminating them eliminates as well. More formally, consider the isomorphism defined by letting for each , while for and for . Then again we have . ∎

The following is a well-known theorem; see [HE71, Theorem 1] and [BV80, Chapter 2].

Theorem 15.

Let be the set of all rank matrices of . Then

  1. and .

  2. is a prime ideal of . In particular, is an irreducible variety.

Corollary 16.

In the natural decomposition , the are irreducible varieties.

Proof.

In general, if is a prime ideal of a commutative ring and if is a subring of , then is a prime ideal of . Using this, it follows that the elimination ideal is a prime ideal since is a prime ideal by Theorem 15.

By Proposition 14, considered as ideals in . We need to prove that is a prime ideal in . To prove this we use the following general fact: if where is transcendental over an integral domain then, , the ideal generated by in , is a prime ideal of . To see this, note that . Now, is an integral domain (this is equivalent to being prime), therefore so is . Therefore is a prime ideal. Now let and . Let which is a prime ideal of . Then, (Proposition 14) and furthermore, from the general comments as above, it follows that the latter is a prime ideal in . Thus, (by (2)) is an irreducible subvariety of . ∎

Finally, we end with the observation that Proposition 14 gives us a slight improvement on Theorem 7.

Theorem 17.

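Let $\Delta(n) = 2n^{2n^2}$. Let $p_{i,j}$ for $1 \le i, j \le n$ be distinct primes such that $p_{i,j} > \Delta(n)$. Let $K = \mathbb{Q}(\zeta_{1,1}, \dots, \zeta_{n,n})$ where $\zeta_{i,j} = e^{2\pi\sqrt{-1}/p_{i,j}}$. Let $A(n) := [\zeta_{i,j}] \in M_n(K)$. Then, for any field $L$ containing $K$,

\[ \mathrm{Rig}(A(n), r, L) = (n-r)^2. \]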

Proof.

The only change is the improvement on , which follows from Theorem 9 as before, using the fact that by Proposition 14 above. Since now there are only variables in all, we easily get the bound . (As before, we have assumed .) ∎

5 Topology of Rigidity with some Examples

In this section, we make some observations about the topological behavior of the rigidity function in $M_n(\mathbb{C})$. The main motivation is to examine if all matrices within a small neighborhood of a matrix $A$ are at least as rigid as $A$. For instance, the matrices from Theorem 7 have an open neighborhood around them within which the rigidity function is constant. This is a direct consequence of their very construction, since they are outside the closed sets $W(n, r, \le (n-r)^2 - 1)$. We ask if this is a general property of the rigidity function itself. The notion of semicontinuity of a function captures this property.

5.1 Semicontinuity of Rigidity

Intuitively, if a function is (lower) semicontinuous at a given point, then within a small neighborhood of that point, the function is nondecreasing. Formally,

Definition 18.

Semicontinuity: Let $Y$ be a topological space. A function $\phi : Y \to \mathbb{Z}$ is (lower) semicontinuous if, for each $k$, the set $\{ y \in Y : \phi(y) \le k \}$ is a closed subset of $Y$. That is, for each $y \in Y$ there is a neighbourhood $U$ of $y$ such that $\phi(y') \ge \phi(y)$ for $y' \in U$.

The rank function of a matrix, for example, is a lower semicontinuous function on the space of all $n \times n$ complex matrices. Unfortunately, the rigidity function does not in general have this nice property. We show below that there is an infinite family of matrices $\{ A_n \}$ such that, for all $n$ and any $\epsilon > 0$, there is a matrix $B_n$ that is $\epsilon$-close to $A_n$ but has rigidity strictly smaller than that of $A_n$.

We start with a