Limits on the Universal Method for Matrix Multiplication

Josh Alman (MIT CSAIL and EECS, jalman@mit.edu. Supported by two NSF Career Awards.)

Abstract

In this work, we prove limitations on the known methods for designing matrix multiplication algorithms. Alman and Vassilevska Williams [AW18b] recently defined the Universal Method, which substantially generalizes all the known approaches including Strassen’s Laser Method [Str87] and Cohn and Umans’ Group Theoretic Method [CU03]. We prove concrete lower bounds on the algorithms one can design by applying the Universal Method to many different tensors. Our proofs use new tools for upper bounding the asymptotic slice rank of a wide range of tensors. Our main result is that the Universal Method applied to any Coppersmith-Winograd tensor $CW_q$ cannot yield a bound on $\omega$, the exponent of matrix multiplication, better than $2.16805$. By comparison, it was previously only known that the weaker ‘Galactic Method’ applied to $CW_q$ could not achieve an exponent of $2$.

We also study the Laser Method (which is, in principle, a highly special case of the Universal Method) and prove that it is “complete” for matrix multiplication algorithms: when it applies to a tensor $T$, it achieves $\omega = 2$ if and only if it is possible for the Universal Method applied to $T$ to achieve $\omega = 2$. Hence, the Laser Method, which was originally used as an algorithmic tool, can also be seen as a lower bounding tool. For example, in their landmark paper, Coppersmith and Winograd [CW90] achieved a bound of $\omega \le 2.376$, by applying the Laser Method to $CW_q$. By our result, the fact that they did not achieve $\omega = 2$ implies a lower bound on the Universal Method applied to $CW_q$. Indeed, if it were possible for the Universal Method applied to $CW_q$ to achieve $\omega = 2$, then Coppersmith and Winograd’s application of the Laser Method would have achieved $\omega = 2$.

1 Introduction

One of the biggest open questions in computer science asks how quickly one can multiply two matrices. Progress on this problem is measured by giving bounds on $\omega$, the exponent of matrix multiplication, defined as the smallest real number such that two $n \times n$ matrices over a field can be multiplied using $n^{\omega + o(1)}$ field operations. Since Strassen’s breakthrough algorithm [Str69] showing that $\omega \le \log_2 7 < 2.81$, there has been a long line of work, resulting in the current best bound of $\omega < 2.373$ [Wil12, LG14], and it is popularly conjectured that $\omega = 2$.

The key to Strassen’s algorithm is an algebraic identity showing how $2 \times 2 \times 2$ matrix multiplication can be computed surprisingly efficiently (in particular, Strassen showed that the $2 \times 2 \times 2$ matrix multiplication tensor has rank at most $7$; see Section 3 for precise definitions). Arguing about the ranks of larger matrix multiplication tensors has proven to be quite difficult; in fact, even the rank of the $3 \times 3 \times 3$ matrix multiplication tensor isn’t currently known. Progress on bounding $\omega$ since Strassen’s algorithm has thus taken the following approach: Pick a tensor (trilinear form) $T$, typically not a matrix multiplication tensor, such that

  • Powers of $T$ can be efficiently computed (i.e. $T$ has low asymptotic rank), and

  • $T$ is useful for performing matrix multiplication, since large matrix multiplication tensors can be ‘embedded’ within powers of $T$.

Combined, these give an upper bound on the rank of matrix multiplication itself, and hence on $\omega$.

The most general type of embedding which is known to preserve the ranks of tensors as required for the above approach is a degeneration. In [AW18b], the author and Vassilevska Williams called this method of taking a tensor $T$ and finding the best possible degeneration of powers $T^{\otimes n}$ into matrix multiplication tensors the Universal Method applied to $T$, and the best bound on $\omega$ which can be proved in this way is written $\omega_u(T)$. They also defined two weaker methods: the Galactic Method applied to $T$, in which the ‘embedding’ must be a more restrictive monomial degeneration, resulting in the bound $\omega_g(T)$ on $\omega$, and the Solar Method applied to $T$, in which the ‘embedding’ must be an even more restrictive zeroing out, resulting in the bound $\omega_s(T)$ on $\omega$. Since monomial degenerations and zeroing outs are successively more restrictive types of degenerations, we have that for all tensors $T$,

$$\omega \le \omega_u(T) \le \omega_g(T) \le \omega_s(T).$$

These methods are very general; there are no known methods for computing $\omega_u(T)$, $\omega_g(T)$, or $\omega_s(T)$ for a given tensor $T$, and these quantities are even unknown for very well-studied tensors $T$. The two main approaches to designing matrix multiplication algorithms are the Laser Method of Strassen [Str87] and the Group-Theoretic Method of Cohn and Umans [CU03]. Both of these approaches show how to give upper bounds on $\omega_s(T)$ for particular structured tensors $T$ (and hence upper bound $\omega$ itself). In other words, they both give ways to find zeroing outs of tensors into matrix multiplication tensors, but not necessarily the best zeroing outs. In fact, it is known that the Laser Method does not always give the best zeroing out for a particular tensor $T$, since the improvements from [CW90] to later works [DS13, Wil12, LG14] can be seen as giving slight improvements to the Laser Method to find better and better zeroing outs. (These works apply the Laser Method to higher powers of the tensor $T = CW_q$, a technique which is still captured by the Solar Method.) The Group-Theoretic Method, like the Solar Method, is very general, and it is not clear how to optimally apply it to a particular group or family of groups.

All of the improvements on bounding $\omega$ for the past 30+ years have come from studying the Coppersmith-Winograd family of tensors $CW_q$. The Laser Method applied to powers of $CW_5$ gives the bound $\omega_s(CW_5) < 2.373$. The Group-Theoretic Method can also prove the best known bound $\omega < 2.373$, by simulating the Laser Method analysis of $CW_q$ (see e.g. [AW18a] for more details). Despite a long line of work on matrix multiplication, there are no known tensors which seem to come close to achieving the bounds one can obtain using $CW_q$. (The author and Vassilevska Williams [AW18b] study a generalization of $CW_q$ which can tie the best known bound, but its analysis is identical to that of $CW_q$. Our lower bounds in this paper will apply equally well to this generalized class as to $CW_q$ itself.) This leads to the first main question of this paper:

Question 1.1.

How much can we improve our bound on $\omega$ using a more clever analysis of the Coppersmith-Winograd tensor?

The author and Vassilevska Williams [AW18b] addressed this question by showing that there is a constant $c > 2$ so that for all $q$, $\omega_g(CW_q) > c$. In other words, the Galactic Method (monomial degenerations) cannot be used with $CW_q$ to prove $\omega = 2$. However, this leaves open a number of important questions: How close to $2$ can we get using monomial degenerations; could it be that $\omega_g(CW_q) \le 2.1$? Perhaps more importantly, what if we are allowed to use arbitrary degenerations; could it be that $\omega_u(CW_q) \le 2.1$, or even $\omega_u(CW_q) = 2$?

The second main question of this paper concerns the Laser Method. The Laser Method upper bounds $\omega_s(T)$ for any tensor $T$ with certain structure (which we describe in detail in Section 6), and has led to every improvement on $\omega$ since its introduction by Strassen [Str87].

Question 1.2.

When the Laser Method applies to a tensor $T$, how close does it come to optimally analyzing $T$?

As discussed, we know the Laser Method does not always give a tight bound on $\omega_s(T)$. For instance, Coppersmith-Winograd [CW90] applied the Laser Method to $CW_q$ to prove $\omega_s(CW_q) \le 2.376$, and then later work [DS13, Wil12, LG14] analyzed higher and higher powers of $CW_q$ to show $\omega_s(CW_q) < 2.373$. Ambainis, Filmus and Le Gall [AFLG15] showed that analyzing higher and higher powers of $CW_q$ itself with the Laser Method cannot yield an upper bound better than $2.3725$. What about for other tensors? Could there be a tensor $T$ such that applying the Laser Method to $T$ yields $\omega_s(T) \le c$ for some $c > 2$, but applying the Laser Method to high powers $T^{\otimes n}$ of $T$ yields $\omega_s(T) = 2$? Could applying an entirely different method to such a $T$, using arbitrary degenerations and not just zeroing outs, show that $\omega_u(T) = 2$?

1.1 Our Results

We give strong resolutions to both Question 1.1 and Question 1.2.

Universal Method Lower Bounds

To resolve Question 1.1, we prove a new lower bound for the Coppersmith-Winograd tensor:

Theorem 1.3.

$\omega_u(CW_q) \ge 2.16805$ for all $q$.

In other words, no analysis of $CW_q$, using any techniques within the Universal Method, can prove a bound on $\omega$ better than $2.16805$. This generalizes the main result of [AW18b] from the Galactic Method to the Universal Method, and gives a more concrete lower bound, increasing the bound from ‘a constant greater than $2$’ to $2.16805$. We also give stronger lower bounds for particular tensors in the family, including the specific tensor $CW_5$ which yields the current best bound on $\omega$; see the tables in Section 5.

Our proof of Theorem 1.3 proceeds by upper bounding $\tilde{S}(CW_q)$, the asymptotic slice rank of $CW_q$. The slice rank of a tensor, denoted $S(T)$, was first introduced by Blasiak et al. [BCC17] in the context of lower bounds against the Group-Theoretic Method. In order to study degenerations of powers of tensors, rather than just tensors themselves, we need to study an asymptotic version of slice rank, $\tilde{S}$. This is important since the slice rank of a product of two tensors can be greater than the product of their slice ranks, and as we will show, $\tilde{S}(CW_q)$ is much greater than $S(CW_q)$ for big enough $q$.

We will give three different tools for proving upper bounds on $\tilde{S}(T)$ for many different tensors $T$. These, combined with the known connection that upper bounds on the slice rank of $T$ yield lower bounds on $\omega_u(T)$, will imply our lower bound for $CW_q$ as well as many other tensors of interest, including: the same lower bound for any generalized Coppersmith-Winograd tensor $CW_{q,\sigma}$ as introduced in [AW18b], a similar lower bound for $cw_{q,\sigma}$, the generalized ‘simple’ Coppersmith-Winograd tensor missing its ‘corner terms’, and a lower bound for $T_q$, the structural tensor of the cyclic group $C_q$, matching the lower bounds obtained by [AW18a, BCC17]. In Section 5 we give tables of our precise lower bounds for these and other tensors.

The Galactic Method lower bounds of [AW18b] were proved by introducing a suite of tools for giving upper bounds on $\tilde{I}(T)$, the asymptotic independence number (sometimes also called the ‘galactic subrank’ or the ‘monomial degeneration subrank’) for many tensors $T$. We will show that our new tools are able to prove at least as high a lower bound on $\omega_u(T)$ as the tools of [AW18b] can prove on $\omega_g(T)$. We thus show that all of those previously known Galactic Method lower bounds hold for the Universal Method as well.

We also show how our slice rank lower bounds can be used to study other properties of tensors. Coppersmith and Winograd [CW90] introduced the notion of the value of a tensor $T$, which is useful when applying the Laser Method to a larger tensor $T'$ which contains $T$ as a subtensor. We show how our slice rank lower bounding tools yield a tight upper bound on the value of $t_{112}$, the notorious subtensor of $CW_q^{\otimes 2}$ which arises when applying the Laser Method to powers of $CW_q$. Although the value of $t_{112}$ appears in every analysis of $CW_q$ since [CW90], including [DS13, Wil12, LG14, LG12, GU18], the best lower bound on it has not improved since [CW90], and our new upper bound here helps explain why. See Sections 3.5 and 5.4 for more details.

We briefly note that our lower bound of $2.16805 > 2 + 1/6$ in Theorem 1.3 may be significant when compared to the recent algorithm of Cohen, Lee and Song [CLS18] which solves $n$-variable linear programs in time about $O(n^{\omega} + n^{2 + 1/6})$.

The Laser Method is “Complete”

We also show that for a wide class of tensors $T$, including $CW_q$, $cw_q$, $T_q$, and all the other tensors we study in Section 5, our tools are tight, meaning they not only give an upper bound on $\tilde{S}(T)$, but they also give a matching lower bound. Hence, for these tensors $T$, no better lower bound on $\omega_u(T)$ is possible by arguing only about $\tilde{S}(T)$.

The tensors we prove this for are what we call laser-ready tensors: tensors to which the Laser Method (as used by [CW90] on $CW_q$) applies; see Definition 6.1 for the precise definition. Tensors need certain structure to be laser-ready, but tensors with this structure are essentially the only ones for which successful techniques for upper bounding $\omega_u$ are known. In fact, every record-holding tensor in the history of matrix multiplication algorithm design has been laser-ready.

We show that for any laser-ready tensor $T$, the Laser Method can be used to construct a degeneration from $T^{\otimes n}$ to an independent tensor of size $\Lambda^{n - o(n)}$, where $\Lambda$ is the upper bound on $\tilde{S}(T)$ implied by one of our tools, Theorem 4.4. Combined, these imply that $\tilde{S}(T) = \Lambda$, showing that the upper bound from Theorem 4.4 is tight. This gives an intriguing answer to Question 1.2:

Theorem 1.4.

If $T$ is a laser-ready tensor, and the Laser Method applied to $T$ yields the bound $\omega_u(T) \le c$ for some $c > 2$, then $\omega_u(T) > 2$.

To reiterate: If $T$ is any tensor to which the Laser Method applies (as in Definition 6.1), and the Laser Method does not yield $\omega = 2$ when applied to $T$, then in fact $\omega_u(T) > 2$, and even the substantially more general Universal Method applied to $T$ cannot yield $\omega = 2$. Hence, the Laser Method, which was originally used as an algorithmic tool, can also be seen as a lower bounding tool. Conversely, Theorem 1.4 shows that the Laser Method is “complete”, in the sense that it cannot yield a bound on $\omega$ worse than $2$ when applied to a tensor which is able to prove $\omega = 2$.

Theorem 1.4 explains and generalizes a number of phenomena:

  • The fact that Coppersmith-Winograd [CW90] applied the Laser Method to the tensor $CW_q$ and achieved an upper bound greater than $2$ on $\omega$ implies that $\omega_u(CW_q) > 2$, and no arbitrary degeneration of powers of $CW_q$ can yield $\omega = 2$.

  • As mentioned above, it is known that applying the Laser Method to higher and higher powers of a tensor $T$ can successively improve the resulting upper bound on $\omega$. Theorem 1.4 shows that if the Laser Method applied to the first power of any tensor $T$ did not yield $\omega = 2$, then this sequence of Laser Method applications (which is a special case of the Universal Method) must converge to a value greater than $2$ as well. This generalizes the result of Ambainis, Filmus and Le Gall [AFLG15], who proved this about applying the Laser Method to higher and higher powers of the specific tensor $T = CW_q$.

  • Our result also generalizes the result of Kleinberg, Speyer and Sawin [KSS18], where it was shown that (what can be seen as) the Laser Method achieves a tight lower bound on $\tilde{S}(T_q^{lower})$, matching the upper bound of Blasiak et al. [BCC17]. Indeed, $T_q^{lower}$, the lower triangular part of $T_q$, is a laser-ready tensor.

Our proof of Theorem 1.4 also sheds light on a notion related to the asymptotic slice rank $\tilde{S}(T)$ of a tensor $T$, called the asymptotic subrank $\tilde{Q}(T)$ of $T$. $\tilde{Q}$ is a “dual” notion of asymptotic rank, and it is important in the definition of Strassen’s asymptotic spectrum of tensors [Str87].

It is not hard to see (and follows, for instance, from Propositions 3.3 and 3.4 below) that $\tilde{Q}(T) \le \tilde{S}(T)$ for all tensors $T$. However, there are no known separations between the two notions; whether there exists a tensor $T$ such that $\tilde{Q}(T) < \tilde{S}(T)$ is an open question. As a Corollary of Theorem 1.4, we prove:

Corollary 1.5.

Every laser-ready tensor $T$ has $\tilde{Q}(T) = \tilde{S}(T)$.

Since, as discussed above, almost all of the most-studied tensors are laser-ready, this might help explain why we have been unable to separate the two notions.

1.2 Other Related Work

Probabilistic Tensors and Support Rank

Cohn and Umans [CU13] introduced the notion of the support rank of tensors, and showed that upper bounds on the support rank of matrix multiplication tensors can be used to design faster Boolean matrix multiplication algorithms. Recently, Karppa and Kaski [KK19] used ‘probabilistic tensors’ as another way to design Boolean matrix multiplication algorithms.

In fact, our tools for proving asymptotic slice rank upper bounds can be used to prove lower bounds on these approaches as well. For instance, our results imply that finding a ‘weighted’ matrix multiplication tensor as a degeneration of a power of $CW_q$ (in order to prove a support rank upper bound) cannot result in a better exponent for Boolean matrix multiplication than $2.16805$.

This is because ‘weighted’ matrix multiplication tensors can degenerate into independent tensors just as large as their unweighted counterparts. Similarly, if a probabilistic tensor $\mathcal{T}$ is degenerated into a (probabilistic) matrix multiplication tensor, Karppa and Kaski show that this gives a corresponding support rank expression for matrix multiplication as well, and so upper bounds on $\tilde{S}(T)$ for any $T$ in the support of $\mathcal{T}$ also result in lower bounds on this approach.

Concurrent Work

Christandl, Vrana and Zuiddam [CVZ18a] independently proved some of the same lower bounds on $\omega_u$ as we do, including Theorem 1.3. Although we achieve the same upper bounds on $\tilde{Q}(T)$ for a number of tensors, our techniques seem different: we use simple combinatorial tools generalizing those from our prior work [AW18b], while their bounds use the seemingly more complicated machinery of Strassen’s asymptotic spectrum of tensors [Str91]. They thus phrase their results in terms of the asymptotic subrank $\tilde{Q}(T)$ of tensors rather than the asymptotic slice rank $\tilde{S}(T)$, and the fact that their bounds are often the same as ours is related to the fact we prove, in Corollary 1.5, that $\tilde{Q}(T) = \tilde{S}(T)$ for all of the tensors we study; see the bottom of Section 3.6 for a more technical discussion of the differences between the two notions. Our other results and applications of our techniques are, as far as we know, entirely new, including our matching lower bounds for $CW_q$, $cw_q$, and $T_q$, bounding the value of tensors, and all our results about the completeness of the Laser Method. By comparison, their ‘irreversibility’ approach only seems to upper bound $\tilde{Q}(T)$ itself.

1.3 Outline

In Section 2 we give an overview of the proofs of our main results. In Section 3 we introduce all the concepts and notation related to tensors which will be used throughout the paper. In particular, in Subsection 3.6 we introduce the relevant notions and basic properties related to slice rank. In Section 4 we present the proofs of our new lower bounding tools for asymptotic slice rank. In Section 5 we apply these tools to a number of tensors of interest including $CW_q$. Finally, in Section 6, we define and discuss the “completeness” of the Laser Method.

2 Proof Overview

We give a brief overview of the techniques we use to prove our main results, Theorems 1.3 and 1.4. All the technical terms we refer to here will be precisely defined in Section 3.

Section 3.6: Asymptotic Slice Rank and its Connection with Matrix Multiplication

The tensors we study are 3-tensors, which can be seen as trilinear forms over three sets of formal variables. The slice rank $S(T)$ of a tensor $T$ is a measure of the complexity of $T$, analogous to the rank of a matrix. In this paper we study the asymptotic slice rank $\tilde{S}(T)$ of tensors $T$:

$$\tilde{S}(T) := \limsup_{n \to \infty} \left[ S(T^{\otimes n}) \right]^{1/n}.$$

$\tilde{S}$ satisfies two key properties:

  1. Degenerations cannot increase the asymptotic slice rank of a tensor. In other words, if $A$ degenerates to $B$, then $\tilde{S}(B) \le \tilde{S}(A)$.

  2. Matrix multiplication tensors have high asymptotic slice rank.

This means that if a certain tensor $T$ has a small value of $\omega_u(T)$, or in other words, powers of $T$ can degenerate into large matrix multiplication tensors, then $T$ itself must have large asymptotic slice rank. Hence, in order to lower bound $\omega_u(T)$, it suffices to upper bound $\tilde{S}(T)$.

Section 4: Tools for Upper Bounding Asymptotic Slice Rank

In general, bounding $\tilde{S}(T)$ for a tensor $T$ can be much more difficult than bounding $S(T)$. This is because $S$ can be supermultiplicative, i.e. there are tensors $A$ and $B$ such that $S(A \otimes B) > S(A) \cdot S(B)$. Indeed, we will show that $\tilde{S}(T) > S(T)$ for many tensors $T$ of interest, including $T = CW_q$.

We will give three new tools for upper bounding $\tilde{S}(T)$ for many tensors $T$. Each applies to tensors with different properties:

  • Theorem 4.2: If $T$ is over $X, Y, Z$, then it is straightforward to see that if one of the variable sets is not too large, then $\tilde{S}(T)$ must be small: $\tilde{S}(T) \le \min\{|X|, |Y|, |Z|\}$. In this first tool, we show that if $T$ can be written as a sum $T = T_1 + \cdots + T_k$ of a few tensors, and each $T_i$ does not have many of one type of variable, then we can still derive an upper bound on $\tilde{S}(T)$.

  • Theorem 4.4: The second tool concerns partitions of the variable sets $X, Y, Z$. It shows that if $\tilde{S}(T)$ is large, then there is a probability distribution on the blocks of $T$ (subtensors corresponding to a choice of one part from each of the three partitions) so that the total probability mass assigned to each part of each partition is proportional to its size. Loosely, this means that $T$ must have many different ‘symmetries’, no matter how its variables are partitioned.

  • Theorem 4.8: Typically, for tensors $A$ and $B$, even if $\tilde{S}(A)$ and $\tilde{S}(B)$ are ‘small’, it may still be the case that $\tilde{S}(A + B)$ is large. This third tool shows that if $A$ has an additional property, then one can still bound $\tilde{S}(A + B)$. Roughly, the property that $A$ must satisfy is that not only is $\tilde{S}(A)$ small, but a related notion called the ‘x-rank’ of $A$ must also be small.

In particular, we will remark that our three tools for bounding $\tilde{S}(T)$ strictly generalize similar tools introduced by [AW18b] for bounding $\tilde{I}(T)$. Hence, we generalize their results bounding $\omega_g(T)$ for various tensors $T$ to bounds on $\omega_u(T)$.

Section 5: Universal Method Lower Bounds

We apply our tools to prove upper bounds on $\tilde{S}(T)$, and hence lower bounds on $\omega_u(T)$, for a number of tensors $T$ of interest. To prove Theorem 1.3, we show that all three tools can be applied to $CW_q$. We also apply our tools to many other tensors of interest including the generalized Coppersmith-Winograd tensors $CW_{q,\sigma}$, the generalized small Coppersmith-Winograd tensors $cw_{q,\sigma}$, the structural tensor $T_q$ of the cyclic group $C_q$ as well as its ‘lower triangular version’ $T_q^{lower}$, and the subtensor $t_{112}$ of $CW_q^{\otimes 2}$ which arises in [CW90, DS13, Wil12, LG14, LG12, GU18]. Throughout Section 5 we give many tables of concrete lower bounds that we prove for the tensors in all these different families.

Section 6: “Completeness” of the Laser Method

Finally, we study the Laser Method. The Laser Method applied to a tensor $T$ shows that powers $T^{\otimes n}$ can zero out into large matrix multiplication tensors. Using the properties of $\tilde{S}$ that we prove in Section 3.6, we will show that the Laser Method can also be applied to a tensor $T$ to prove a lower bound on $\tilde{S}(T)$. (More precisely, it actually proves a lower bound on $\tilde{Q}(T)$, the asymptotic subrank of $T$, which in turn lower bounds $\tilde{S}(T)$.)

We prove Theorem 1.4 by combining this construction with Theorem 4.4, one of our tools for upper bounding $\tilde{S}(T)$. Intuitively, both Theorem 4.4 and the Laser Method are concerned with probability distributions on blocks of a tensor, and both involve counting the number of variables in powers $T^{\otimes n}$ that are consistent with these distributions. We use this intuition to show that the upper bound given by Theorem 4.4 is equal to the lower bound given by the Laser Method.

3 Preliminaries

We begin by introducing the relevant notions and notation related to tensors and matrix multiplication. We will use the same notation introduced in [AW18b, Section 3], and readers familiar with that paper may skip to Subsection 3.5.

3.1 Tensor Basics

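For sets $X = \{x_1, \ldots, x_q\}$, $Y = \{y_1, \ldots, y_r\}$, and $Z = \{z_1, \ldots, z_s\}$ of formal variables, a tensor over $X, Y, Z$ is a trilinear form

$$T = \sum_{x_i \in X,\, y_j \in Y,\, z_k \in Z} \alpha_{ijk}\, x_i y_j z_k,$$

where the coefficients $\alpha_{ijk}$ come from an underlying field $\mathbb{F}$. The terms, which we write as $x_i y_j z_k$, are sometimes written as $x_i \otimes y_j \otimes z_k$ in the literature. We say $T$ is minimal for $X, Y, Z$ if, for each $x_i \in X$, there is a term involving $x_i$ with a nonzero coefficient in $T$, and similarly for $Y$ and $Z$ (i.e. $T$ can’t be seen as a tensor over a strict subset of the variables). We say that two tensors $T_1, T_2$ are isomorphic, written $T_1 \equiv T_2$, if they are equal up to renaming variables.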

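If $T_1$ is a tensor over $X_1, Y_1, Z_1$, and $T_2$ is a tensor over $X_2, Y_2, Z_2$, then the tensor product $T_1 \otimes T_2$ is a tensor over $X_1 \times X_2, Y_1 \times Y_2, Z_1 \times Z_2$ such that, for any $(x_1, x_2) \in X_1 \times X_2$, $(y_1, y_2) \in Y_1 \times Y_2$, and $(z_1, z_2) \in Z_1 \times Z_2$, the coefficient of $(x_1, x_2)(y_1, y_2)(z_1, z_2)$ in $T_1 \otimes T_2$ is the product of the coefficient of $x_1 y_1 z_1$ in $T_1$, and the coefficient of $x_2 y_2 z_2$ in $T_2$. For any tensor $T$ and positive integer $n$, the tensor power $T^{\otimes n}$ is the tensor over $X^n, Y^n, Z^n$ resulting from taking the tensor product of $n$ copies of $T$.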

If $T_1$ is a tensor over $X_1, Y_1, Z_1$, and $T_2$ is a tensor over $X_2, Y_2, Z_2$, then the direct sum $T_1 \oplus T_2$ is a tensor over $X_1 \sqcup X_2$, $Y_1 \sqcup Y_2$, $Z_1 \sqcup Z_2$, which results from forcing the variable sets to be disjoint (as in a normal disjoint union) and then summing the two tensors. For a nonnegative integer $m$ and tensor $T$ we write $m \odot T$ for the disjoint sum of $m$ copies of $T$.
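The two operations above can be sketched in a few lines of Python. This is only an illustration, not code from the paper: tensors are stored as dictionaries mapping a term `(x, y, z)` to its nonzero coefficient, and the toy tensors `A` and `B`, as well as all function names, are hypothetical.

```python
def tensor_product(t1, t2):
    """Tensor product: variables become pairs, coefficients multiply."""
    return {((x1, x2), (y1, y2), (z1, z2)): c1 * c2
            for (x1, y1, z1), c1 in t1.items()
            for (x2, y2, z2), c2 in t2.items()}


def direct_sum(t1, t2):
    """Direct sum: tag each variable with 0 or 1 to force disjoint variable sets."""
    out = {}
    for tag, t in enumerate((t1, t2)):
        for (x, y, z), c in t.items():
            out[((tag, x), (tag, y), (tag, z))] = c
    return out


def tensor_power(t, n):
    """n-fold tensor product of t with itself (n >= 1)."""
    result = t
    for _ in range(n - 1):
        result = tensor_product(result, t)
    return result


# Toy tensors, chosen only for illustration.
A = {("x0", "y0", "z0"): 1, ("x1", "y1", "z0"): 2}
B = {("x0", "y0", "z0"): 3, ("x0", "y1", "z1"): 1, ("x1", "y0", "z1"): 1}

AB = tensor_product(A, B)
A_plus_B = direct_sum(A, B)
```

With nonzero coefficients, the product has one term per pair of terms and the direct sum has one term per original term, so `AB` has 6 terms and `A_plus_B` has 5.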

3.2 Tensor Rank

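A tensor $T$ has rank one if there are values $a_i \in \mathbb{F}$ for each $x_i \in X$, $b_j \in \mathbb{F}$ for each $y_j \in Y$, and $c_k \in \mathbb{F}$ for each $z_k \in Z$, such that the coefficient of $x_i y_j z_k$ in $T$ is $a_i b_j c_k$, or in other words,

$$T = \Big(\sum_{x_i \in X} a_i x_i\Big) \Big(\sum_{y_j \in Y} b_j y_j\Big) \Big(\sum_{z_k \in Z} c_k z_k\Big).$$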

The rank of a tensor $T$, denoted $R(T)$, is the smallest number of rank one tensors whose sum (summing the coefficient of each term individually) is $T$. It is not hard to see that for tensors $T$ and positive integers $n$, we always have $R(T^{\otimes n}) \le R(T)^n$, but for some tensors $T$ of interest this inequality is not tight. We thus define the asymptotic rank of tensor $T$ as $\tilde{R}(T) := \inf_{n \in \mathbb{N}} \left( R(T^{\otimes n}) \right)^{1/n}$.
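The submultiplicativity of rank rests on the fact that the tensor product of two rank one tensors is again rank one, so a sum of $r$ rank one tensors tensored with itself is a sum of $r^2$ rank one tensors. The following Python sketch checks that fact on small integer vectors; the dictionary representation and all names are assumptions made for illustration.

```python
def rank_one(a, b, c):
    """Coefficients of (sum a_i x_i)(sum b_j y_j)(sum c_k z_k); a, b, c are dicts."""
    return {(i, j, k): a[i] * b[j] * c[k] for i in a for j in b for k in c}


def kron(u, v):
    """Kronecker product of two coefficient vectors, indexed by pairs."""
    return {(i, j): u[i] * v[j] for i in u for j in v}


def tensor_product(t1, t2):
    """Tensor product: variables become pairs, coefficients multiply."""
    return {((x1, x2), (y1, y2), (z1, z2)): c1 * c2
            for (x1, y1, z1), c1 in t1.items()
            for (x2, y2, z2), c2 in t2.items()}


# Hypothetical integer coefficient vectors.
a1, b1, c1 = {0: 1, 1: 2}, {0: 3, 1: -1}, {0: 2, 1: 5}
a2, b2, c2 = {0: 4, 1: 1}, {0: 1, 1: 1}, {0: -2, 1: 3}

lhs = tensor_product(rank_one(a1, b1, c1), rank_one(a2, b2, c2))
rhs = rank_one(kron(a1, a2), kron(b1, b2), kron(c1, c2))
```

Here `lhs` and `rhs` agree term by term, which is the rank one case of $R(A \otimes B) \le R(A) \cdot R(B)$.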

3.3 Matrix Multiplication Tensors

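For positive integers $a, b, c$, the matrix multiplication tensor $\langle a, b, c \rangle$ is a tensor over $\{x_{ij}\}_{i \in [a], j \in [b]}$, $\{y_{jk}\}_{j \in [b], k \in [c]}$, $\{z_{ki}\}_{k \in [c], i \in [a]}$ given by

$$\langle a, b, c \rangle = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{c} x_{ij}\, y_{jk}\, z_{ki}.$$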

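It is not hard to verify that for positive integers $a_1, a_2, b_1, b_2, c_1, c_2$, we have $\langle a_1, b_1, c_1 \rangle \otimes \langle a_2, b_2, c_2 \rangle \equiv \langle a_1 a_2, b_1 b_2, c_1 c_2 \rangle$. The exponent of matrix multiplication, denoted $\omega$, is defined as

$$\omega := \liminf_{a, b, c \in \mathbb{N}} 3 \log_{abc} \big( R(\langle a, b, c \rangle) \big).$$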

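Because of the tensor product property above, we can alternatively define $\omega$ in a number of ways:

$$\omega = \liminf_{n \in \mathbb{N}} \log_n \big( R(\langle n, n, n \rangle) \big) = \log_2 \big( \tilde{R}(\langle 2, 2, 2 \rangle) \big).$$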

For instance, Strassen [Str69] showed that $R(\langle 2, 2, 2 \rangle) \le 7$, which implies that $\omega \le \log_2 7$.
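For concreteness, the rank bound of $7$ corresponds to the standard textbook form of Strassen's identity, which multiplies two $2 \times 2$ matrices using seven products of linear combinations of entries. The sketch below states that identity and compares it with the naive product; the function names are illustrative and not taken from the paper.

```python
import math


def strassen_2x2(A, B):
    """Multiply 2x2 matrices with 7 multiplications (standard Strassen identity)."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    return ((m1 + m4 - m5 + m7, m3 + m5),
            (m2 + m4, m1 - m2 + m3 + m6))


def naive_2x2(A, B):
    """Multiply 2x2 matrices with the usual 8 multiplications."""
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2))
                       for j in range(2))
                 for i in range(2))


A = ((1, 2), (3, 4))
B = ((5, 6), (7, 8))
C = strassen_2x2(A, B)

# Exponent obtained by applying the identity recursively to block matrices.
strassen_exponent = math.log2(7)
```

Applied recursively to $2 \times 2$ block matrices, seven multiplications per level give roughly $n^{\log_2 7}$ operations, which is where the bound on $\omega$ comes from.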

3.4 Degenerations and the Universal Method

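We now describe a very general way to transform from a tensor $T_1$ over $X_1, Y_1, Z_1$ to a tensor $T_2$ over $X_2, Y_2, Z_2$. For a formal variable $\lambda$, pick maps $\alpha : X_1 \times X_2 \to \mathbb{F}[\lambda]$, $\beta : Y_1 \times Y_2 \to \mathbb{F}[\lambda]$, and $\gamma : Z_1 \times Z_2 \to \mathbb{F}[\lambda]$, which map pairs of variables to polynomials in $\lambda$, and pick an integer $h$. Then, when you replace each $x \in X_1$ with $\sum_{x' \in X_2} \alpha(x, x')\, x'$, each $y \in Y_1$ with $\sum_{y' \in Y_2} \beta(y, y')\, y'$, and each $z \in Z_1$ with $\sum_{z' \in Z_2} \gamma(z, z')\, z'$, in $T_1$, then the resulting tensor $T'$ is a tensor over $X_2, Y_2, Z_2$ with coefficients over $\mathbb{F}[\lambda]$. When $T'$ is instead viewed as a polynomial in $\lambda$ whose coefficients are tensors over $X_2, Y_2, Z_2$ with coefficients in $\mathbb{F}$, it must be that $T_2$ is the coefficient of $\lambda^h$, and the coefficient of $\lambda^{h'}$ is $0$ for all $h' < h$.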

If such a transformation is possible, we say $T_2$ is a degeneration of $T_1$. There are also two more restrictive types of degenerations:

  • $T_2$ is a monomial degeneration of $T_1$ if such a transformation is possible where the polynomials in the ranges of $\alpha, \beta, \gamma$ have at most one monomial, and furthermore, for each $x' \in X_2$ there is at most one $x \in X_1$ such that $\alpha(x, x') \ne 0$, and similarly for $\beta$ and $\gamma$. (Some definitions of monomial degenerations do not have this second condition, or equivalently, consider a monomial degeneration to be a ‘restriction’ composed with what we defined here. The distinction is not important for this paper, but we give this definition since it captures Strassen’s monomial degeneration from matrix multiplication tensors to independent tensors [Str86] (see also Proposition 3.5 below), and it is the notion that the prior work [AW18b] proved lower bounds against.)

  • $T_2$ is a zeroing out of $T_1$ if, in addition to the restrictions of a monomial degeneration, the ranges of $\alpha, \beta, \gamma$ must be $\{0, 1\}$.
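As a small illustration of the most restrictive notion, a zeroing out can be viewed as setting a chosen subset of variables to $0$ and keeping the rest. The Python sketch below, with hypothetical helper names and the $2 \times 2 \times 2$ matrix multiplication tensor $\sum_{i,j,k} x_{ij} y_{jk} z_{ki}$ as input, zeroes out four variables and checks that the surviving terms share no variables.

```python
def matmul_tensor(a, b, c):
    """Support of <a,b,c>: terms x_ij * y_jk * z_ki, each with coefficient 1."""
    return {(("x", i, j), ("y", j, k), ("z", k, i))
            for i in range(a) for j in range(b) for k in range(c)}


def zero_out(terms, zeroed):
    """Set every variable in `zeroed` to 0, dropping all terms that contain one."""
    return {t for t in terms if not any(v in zeroed for v in t)}


def is_independent(terms):
    """True if no variable appears in two different terms."""
    seen = set()
    for t in terms:
        for v in t:
            if v in seen:
                return False
            seen.add(v)
    return True


T = matmul_tensor(2, 2, 2)
# Zero out the off-diagonal x- and y-variables (an illustrative choice).
zeroed = {("x", 0, 1), ("x", 1, 0), ("y", 0, 1), ("y", 1, 0)}
D = zero_out(T, zeroed)
```

Only the terms with $i = j = k$ survive, so `D` has two terms on disjoint variables. This simple choice is not claimed to be optimal; it only shows the mechanics of a zeroing out.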

Degenerations are useful in the context of matrix multiplication algorithms because degenerations cannot increase the asymptotic rank of a tensor. In other words, if $T_2$ is a degeneration of $T_1$, then $\tilde{R}(T_2) \le \tilde{R}(T_1)$ [Bin80]. It is often hard to bound the rank of matrix multiplication tensors directly, so all known approaches proceed by bounding the rank of a different tensor $T$ and then showing that powers of $T$ degenerate into matrix multiplication tensors.

More precisely, all known approaches fall within the following method, which we call the Universal Method [AW18b] applied to a tensor $T$ of asymptotic rank $r = \tilde{R}(T)$: Consider all positive integers $n$, and all ways to degenerate $T^{\otimes n}$ into a disjoint sum $\bigoplus_{i=1}^{F} \langle a_i, b_i, c_i \rangle$ of matrix multiplication tensors, resulting in an upper bound on $\omega$ by the asymptotic sum inequality [Sch81] of $\sum_{i=1}^{F} (a_i b_i c_i)^{\omega/3} \le r^n$. Then, $\omega_u(T)$, the bound on $\omega$ from the Universal Method applied to $T$, is the $\inf$ over all such $n$ and degenerations, of the resulting upper bound on $\omega$.

In [AW18b], two weaker versions of the Universal Method are also defined: the Galactic Method, in which the degeneration must be a monomial degeneration, resulting in a bound $\omega_g(T)$, and the Solar Method, in which the degeneration must be a zeroing out, resulting in a bound $\omega_s(T)$. To be clear, all three of these methods are very general, and we don’t know the values of $\omega_u(T)$, $\omega_g(T)$, or $\omega_s(T)$ for almost any nontrivial tensors $T$. In fact, all the known approaches to bounding $\omega$ proceed by giving upper bounds on $\omega_s(T)$ for some carefully chosen tensors $T$; the most successful has been the Coppersmith-Winograd family of tensors $CW_q$, which has yielded all the best known bounds on $\omega$ since the 80’s [CW82, DS13, Wil12, LG14]. Indeed, the two most successful approaches, the Laser Method [Str87] and the Group-Theoretic Approach [CU03], ultimately use zeroing outs of tensors. We refer the reader to [AW18b, Sections 3.3 and 3.4] for more details on these approaches and how they relate to the notions used here.

3.5 Tensor Value

Coppersmith and Winograd [CW90] defined the value of a tensor in their analysis of the tensor. For a tensor , and any , the -value of , denoted , is defined as follows: Consider all positive integers , and all ways to degenerate into a direct sum of matrix multiplication tensors. Then, is given by

We can then equivalently define as the of , over all such that . We can see from the power mean inequality that for all , although this bound is often not tight as there can be better degenerations of depending on the value of .

3.6 Asymptotic Slice Rank

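The main new notions we will need in this paper relate to the slice rank of tensors. We say a tensor $T$ over $X, Y, Z$ has x-rank $1$ if it is of the form

$$T = \Big(\sum_{x \in X} \alpha_x\, x\Big) \otimes \Big(\sum_{y \in Y} \sum_{z \in Z} \beta_{y,z}\, y z\Big) = \sum_{x \in X} \sum_{y \in Y} \sum_{z \in Z} \alpha_x \beta_{y,z}\, x y z$$

for some choices of the $\alpha$ and $\beta$ coefficients over the base field. More generally, the x-rank of $T$, denoted $S_x(T)$, is the minimum number of tensors of x-rank 1 whose sum is $T$. We can similarly define the y-rank, $S_y(T)$, and the z-rank, $S_z(T)$. Then, the slice rank of $T$, denoted $S(T)$, is the minimum $k$ such that there are tensors $T_X$, $T_Y$, and $T_Z$ with $T = T_X + T_Y + T_Z$ and $S_x(T_X) + S_y(T_Y) + S_z(T_Z) = k$.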

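Unlike tensor rank, the slice rank is not submultiplicative in general, i.e. there are tensors $A$ and $B$ such that $S(A \otimes B) > S(A) \cdot S(B)$. For instance, it is not hard to see that $S(CW_5) = 3$, but since it is known [Wil12, LG14] that $\omega_u(CW_5) < 2.373$, it follows (e.g. from Theorem 3.9 below) that $\tilde{S}(CW_5) \ge 7^{2/2.373} > 5.15$. We are thus interested in the asymptotic slice rank, $\tilde{S}(T)$, of tensors $T$, defined as

$$\tilde{S}(T) := \limsup_{n \to \infty} \left[ S(T^{\otimes n}) \right]^{1/n}.$$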

We note a few simple properties of slice rank which will be helpful in our proofs:

Lemma 3.1.

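For tensors $A$ and $B$:

  1. $S(A) \le S_x(A) \le R(A)$,

  2. $S_x(A \otimes B) \le S_x(A) \cdot S_x(B)$,

  3. $S(A + B) \le S(A) + S(B)$, and $S_x(A + B) \le S_x(A) + S_x(B)$,

  4. $S(A \otimes B) \le S(A) \cdot \max\{S_x(B), S_y(B), S_z(B)\}$, and

  5. If $A$ is a tensor over $X, Y, Z$, then $S_x(A) \le |X|$ and hence $S(A) \le \min\{|X|, |Y|, |Z|\}$.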

Proof.

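(1) and (2) are straightforward. (3) follows since the sum of the slice rank (resp. x-rank) expressions for $A$ and for $B$ gives a slice rank (resp. x-rank) expression for $A + B$. To prove (4), let $m = \max\{S_x(B), S_y(B), S_z(B)\}$, and note that if $A = A_X + A_Y + A_Z$ such that $S_x(A_X) + S_y(A_Y) + S_z(A_Z) = S(A)$, then

$$A \otimes B = A_X \otimes B + A_Y \otimes B + A_Z \otimes B,$$

and so

$$S(A \otimes B) \le S_x(A_X \otimes B) + S_y(A_Y \otimes B) + S_z(A_Z \otimes B) \le S_x(A_X) S_x(B) + S_y(A_Y) S_y(B) + S_z(A_Z) S_z(B) \le S(A) \cdot m.$$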

Finally, (5) follows since, for instance, any tensor with only one x-variable has x-rank 1. ∎

Asymptotic slice rank is interesting in the context of matrix multiplication algorithms because of the following facts.

Definition 3.2.

For a positive integer $q$, the independent tensor of size $q$, denoted $\langle q \rangle$, is the tensor $\sum_{i=1}^{q} x_i y_i z_i$ with $q$ terms that do not share any variables.

Proposition 3.3 ([TS16] Corollary 2).

If $A$ and $B$ are tensors such that $A$ has a degeneration to $B$, then $S(B) \le S(A)$, and hence $\tilde{S}(B) \le \tilde{S}(A)$.

Proposition 3.4 ([Tao16] Lemma 1; see also [BCC17] Lemma 4.7).

For any positive integer $q$, we have $S(\langle q \rangle) = \tilde{S}(\langle q \rangle) = q$, where $\langle q \rangle$ is the independent tensor of size $q$.

Proposition 3.5 ([Str86] Theorem 4; see also [AW18b] Lemma 4.2).

For any positive integers $a, b, c$, the matrix multiplication tensor $\langle a, b, c \rangle$ has a (monomial) degeneration to an independent tensor of size at least $0.75 \cdot abc / \max\{a, b, c\}$.

Corollary 3.6.

For any positive integers $a, b, c$, we have $\tilde{S}(\langle a, b, c \rangle) = abc / \max\{a, b, c\}$.

Proof.

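Assume without loss of generality that $c \ge a, b$. For any positive integer $n$, we have that $\langle a, b, c \rangle^{\otimes n} \equiv \langle a^n, b^n, c^n \rangle$ has a degeneration to an independent tensor of size at least $0.75 \cdot a^n b^n$, meaning $S(\langle a, b, c \rangle^{\otimes n}) \ge 0.75 \cdot a^n b^n$ and hence $S(\langle a, b, c \rangle^{\otimes n})^{1/n} \ge 0.75^{1/n} \cdot ab$, which means $\tilde{S}(\langle a, b, c \rangle) \ge ab$. Meanwhile, $\langle a, b, c \rangle$ has $ab$ different $x$-variables, so it must have $S_x(\langle a, b, c \rangle) \le ab$ and more generally, $S(\langle a, b, c \rangle^{\otimes n}) \le S_x(\langle a, b, c \rangle^{\otimes n}) \le (ab)^n$, which means $\tilde{S}(\langle a, b, c \rangle) \le ab$. ∎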

To summarize: we know that degenerations cannot increase asymptotic slice rank, and that matrix multiplication tensors have a high asymptotic slice rank. Hence, if $T$ is a tensor such that $\omega_u(T)$ is ‘small’, meaning a power of $T$ has a degeneration to a disjoint sum of many large matrix multiplication tensors, then $T$ itself must have ‘large’ asymptotic slice rank. This can be formalized identically to [AW18b, Theorem 4.1 and Corollary 4.3] to show:

Theorem 3.7.

For any tensor ,

Corollary 3.8.

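For any tensor $T$, if $\omega_u(T) = 2$, then $\tilde{S}(T) = \tilde{R}(T)$. Moreover, for every constant $s < 1$, there is a constant $w > 2$ such that every tensor $T$ with $\tilde{S}(T) \le \tilde{R}(T)^s$ must have $\omega_u(T) \ge w$.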

Almost all the tensors we consider in this note are variable-symmetric tensors, and for these tensors we can get a better lower bound on $\omega_u(T)$ from an upper bound on $\tilde{S}(T)$. We say that a tensor $T$ over $X, Y, Z$ is variable-symmetric if $|X| = |Y| = |Z|$, and the coefficient of $x_i y_j z_k$ equals the coefficient of $x_j y_k z_i$ in $T$ for all $(x_i, y_j, z_k) \in X \times Y \times Z$.

Theorem 3.9.

For a variable-symmetric tensor $T$ we have $\omega_u(T) \ge 2 \log(\tilde{R}(T)) / \log(\tilde{S}(T))$.

Proof.

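As in the proof of [AW18b, Theorem 4.1], by definition of $\omega_u(T)$, we know that for every $\delta > 0$, there is a positive integer $n$ such that $T^{\otimes n}$ has a degeneration to $F \odot \langle a, b, c \rangle$ for integers $F, a, b, c$ such that $\omega_u(T) + \delta \ge 3 \log(\tilde{R}(T)^n / F) / \log(abc)$. In fact, since $T$ is symmetric, we know $T^{\otimes n}$ also has a degeneration to $F \odot \langle b, c, a \rangle$ and to $F \odot \langle c, a, b \rangle$, and so $T^{\otimes 3n}$ has a degeneration to $F^3 \odot \langle abc, abc, abc \rangle$. As above, it follows that $\tilde{S}(T)^{3n} \ge F^3 \cdot (abc)^2$. Rearranging, we see

$$\log(abc) \le \frac{3n \log(\tilde{S}(T)) - 3 \log F}{2}.$$

Hence,

$$\omega_u(T) + \delta \ge \frac{3 \left( n \log(\tilde{R}(T)) - \log F \right)}{\log(abc)} \ge 2 \cdot \frac{n \log(\tilde{R}(T)) - \log F}{n \log(\tilde{S}(T)) - \log F} \ge 2 \cdot \frac{\log(\tilde{R}(T))}{\log(\tilde{S}(T))},$$

where the last step follows because $\tilde{R}(T) \ge \tilde{S}(T)$ and so subtracting the same quantity from both the numerator and denominator cannot decrease the value of the fraction. This holds for all $\delta > 0$ and hence implies the desired result. ∎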
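To make the use of Theorem 3.9 concrete, here is a minimal sketch that turns an upper bound on asymptotic slice rank into a lower bound on $\omega_u$. It assumes the bound takes the form $\omega_u(T) \ge 2 \log \tilde{R}(T) / \log \tilde{S}(T)$, which is the shape the rearrangement in the proof above produces; the function name and the numeric inputs are hypothetical and are not values from the paper.

```python
import math


def omega_u_lower_bound(asym_rank, slice_rank_upper):
    """Lower bound 2*log(R)/log(S) on omega_u for a variable-symmetric tensor.

    asym_rank        -- the asymptotic rank R of the tensor (assumed known)
    slice_rank_upper -- any upper bound S on its asymptotic slice rank,
                        with 1 < S <= R
    """
    if not (1 < slice_rank_upper <= asym_rank):
        raise ValueError("need 1 < slice_rank_upper <= asym_rank")
    return 2 * math.log(asym_rank) / math.log(slice_rank_upper)


# Illustrative inputs only: asymptotic rank 7 and two candidate slice rank bounds.
trivial = omega_u_lower_bound(7, 7)     # no gap between the ranks gives only 2
nontrivial = omega_u_lower_bound(7, 6)  # any gap pushes the bound above 2
```

The first call reflects Corollary 3.8: when the two quantities coincide the method gives nothing beyond the trivial bound of $2$, and any multiplicative gap yields a bound strictly above $2$.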

Slice Rank versus Subrank

For a tensor $T$, let $Q(T)$ denote the largest integer $q$ such that there is a degeneration from $T$ to $\langle q \rangle$. The asymptotic subrank of $T$ is defined as $\tilde{Q}(T) := \limsup_{n \to \infty} \left[ Q(T^{\otimes n}) \right]^{1/n}$. Propositions 3.3 and 3.4 above imply that $\tilde{Q}(T) \le \tilde{S}(T)$ for all tensors $T$. Similarly, it is not hard to see that Theorems 3.7 and 3.9 hold with $\tilde{S}$ replaced by $\tilde{Q}$. One could thus conceivably hope to prove stronger lower bounds than those in this paper by bounding $\tilde{Q}$ instead of $\tilde{S}$. However, we will prove in Corollary 6.5 below that $\tilde{Q}(T) = \tilde{S}(T)$ for every tensor we study in this paper, so such an improvement using $\tilde{Q}$ is impossible. More generally, there are currently no known tensors $T$ for which the best known upper bound on $\tilde{Q}(T)$ is smaller than the best known upper bound on $\tilde{S}(T)$ (including the new bounds of [CVZ18b, CVZ18a]). Hence, novel tools for upper bounding $\tilde{Q}$ would be required for such an approach to proving better lower bounds on $\omega_u(T)$.

3.7 Partition Notation

In a number of our results, we will be partitioning the terms of tensors into blocks defined by partitions of the three variable sets. Here we introduce some notation for some properties of such partitions; these definitions all depend on the particular partition of the variables being used, which will be clear from context.

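Suppose $T$ is a tensor minimal over $X, Y, Z$, and let $X = X_1 \cup \cdots \cup X_{k_X}$, $Y = Y_1 \cup \cdots \cup Y_{k_Y}$, $Z = Z_1 \cup \cdots \cup Z_{k_Z}$ be partitions of the three variable sets. For $(i, j, k) \in [k_X] \times [k_Y] \times [k_Z]$, let $T_{ijk}$ be $T$ restricted to $X_i, Y_j, Z_k$ (i.e. $T$ with $X \setminus X_i$, $Y \setminus Y_j$, and $Z \setminus Z_k$ zeroed out), and let $L = \{T_{ijk} \mid T_{ijk} \ne 0\}$. $T_{ijk} \in L$ is called a block of $T$. For $i \in [k_X]$ let $L_{X_i} = \{T_{ij'k'} \in L \mid (j', k') \in [k_Y] \times [k_Z]\}$, and define similarly $L_{Y_j}$ and $L_{Z_k}$.

We will be particularly interested in probability distributions $p : L \to [0, 1]$. Let $P(L)$ be the set of such distributions. For such a $p \in P(L)$, and for $i \in [k_X]$, let $p(X_i) := \sum_{T_{ijk} \in L_{X_i}} p(T_{ijk})$, and similarly $p(Y_j)$ and $p(Z_k)$. Then, define $p_X \in \mathbb{R}$ by

$$p_X := \prod_{i \in [k_X]} \left( \frac{|X_i|}{p(X_i)} \right)^{p(X_i)},$$

and $p_Y$ and $p_Z$ similarly. This expression, which arises naturally in the Laser Method, will play an important role in our upper bounds and lower bounds.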
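As a hedged illustration of this quantity, the sketch below evaluates it from the part sizes $|X_i|$ and the marginals $p(X_i)$ alone. It assumes the defining display is the product $\prod_i (|X_i| / p(X_i))^{p(X_i)}$, the form that arises when counting the $x$-variables of $T^{\otimes n}$ consistent with $p$, and it uses the convention that a part of probability $0$ contributes a factor of $1$. The function name and the example partitions are hypothetical.

```python
import math


def p_x(part_sizes, marginals):
    """Evaluate prod_i (|X_i| / p(X_i)) ** p(X_i), treating 0 ** 0 as 1.

    part_sizes -- the sizes |X_i| of the parts of the partition of X
    marginals  -- the total probability p(X_i) assigned to each part
    """
    if len(part_sizes) != len(marginals):
        raise ValueError("one marginal per part is required")
    if abs(sum(marginals) - 1.0) > 1e-9:
        raise ValueError("marginals must sum to 1")
    log_value = 0.0
    for size, prob in zip(part_sizes, marginals):
        if prob > 0:
            log_value += prob * math.log(size / prob)
    return math.exp(log_value)


# Two singleton parts with equal mass behave like 2 free choices per coordinate.
two_singletons = p_x([1, 1], [0.5, 0.5])
# All mass on one part of size 5 recovers the size of that part.
one_part = p_x([1, 5, 1], [0.0, 1.0, 0.0])
```

Working in logarithms avoids overflow and makes the zero-probability convention explicit.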

3.8 Tensor Rotations and Variable-Symmetric Tensors

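If $T$ is a tensor over $X, Y, Z$, then the rotation of $T$, denoted $\mathrm{rot}(T)$, is the tensor over $Y, Z, X$ such that for any $(x_i, y_j, z_k) \in X \times Y \times Z$, the coefficient of $x_i y_j z_k$ in $T$ is equal to the coefficient of $y_j z_k x_i$ in $\mathrm{rot}(T)$. Tensor $T$ is variable-symmetric if $T \equiv \mathrm{rot}(T)$.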

If is a variable-symmetric tensor minimal over , then partitions , , of the variable sets are called -symmetric if (using the notation of the previous subsection) , for all , and the block for all . For the