# Invariant polytopes
of linear operators

with applications to regularity of wavelets

and of subdivisions
† The first author is supported by INdAM GNCS (Gruppo Nazionale di Calcolo Scientifico); the second author is supported by RFBR grants nos. 13-01-00642 and 14-01-00332, and by a grant of the Dynasty Foundation.

###### Abstract

We generalize the recent invariant polytope algorithm for computing the joint spectral radius and extend it to a wider class of matrix sets. This, in particular, makes the algorithm applicable to sets of matrices that have finitely many spectrum maximizing products. A criterion of convergence of the algorithm is proved.

As an application we solve two challenging computational open problems. First, we find the regularity of the Butterfly subdivision scheme for various parameters . In the “most regular” case , we prove that the limit function has Hölder exponent and its derivative is “almost Lipschitz” with logarithmic factor . Second, we compute the Hölder exponent of Daubechies wavelets of high order.

Keywords: matrix, joint spectral radius, invariant polytope algorithm, dominant products, balancing, subdivision schemes, Butterfly scheme, Daubechies wavelets

## 1 Introduction

The joint spectral radius of a set of matrices (or linear operators) originated in the early sixties with Rota and Strang [34] and has found countless applications in functional analysis, dynamical systems, wavelets, combinatorics, number theory, automata, formal languages, etc. (see the bibliographies in [11, 16, 17, 24]). We focus on finite sets of matrices, although all the results extend to arbitrary compact sets. Unless stated otherwise, we assume a fixed basis in and identify an operator with the corresponding matrix. Everywhere below is an arbitrary family of -matrices, and is the set of all products of matrices from (with repetitions permitted). A product is called simple if it is not a power of a shorter product.

###### Definition 1

The joint spectral radius of a family is [34]

(1) |

This limit exists and does not depend on the matrix norm. In case (i.e., ), according to Gelfand’s theorem, the joint spectral radius is , i.e., it coincides with the usual spectral radius , which is the maximal modulus of the eigenvalues of (see, for instance, [3, 15]).
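Since the quantity under the limit is a maximum over all products of a given length, it can be approximated directly by brute force for small families. The sketch below does this in pure Python for a hypothetical pair of 2×2 matrices (not taken from the paper), using the maximum absolute row sum as the matrix norm.

```python
from itertools import product as cartesian

# Illustrative 2x2 family (an assumption for this sketch, not from the
# paper): the first matrix has spectral radius 1 and dominates.
A = [[[1.0, 0.0], [0.0, 0.5]],
     [[0.0, 1.0], [0.5, 0.0]]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inf_norm(X):  # maximum absolute row sum
    return max(abs(X[i][0]) + abs(X[i][1]) for i in range(2))

def jsr_upper(mats, k):
    """Max over all products of length k of ||Pi||^(1/k); by the
    definition above, this converges to the joint spectral radius."""
    best = 0.0
    for choice in cartesian(mats, repeat=k):
        P = choice[0]
        for M in choice[1:]:
            P = matmul(P, M)
        best = max(best, inf_norm(P))
    return best ** (1.0 / k)

print(jsr_upper(A, 8))  # -> 1.0 for this family (its JSR is exactly 1)
```

For this particular family every product has norm at most one while powers of the first matrix have norm exactly one, so the approximation is already exact; in general the sequence only approaches the joint spectral radius as the length grows.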

The joint spectral radius has the following geometrical meaning: is the infimum of numbers for which there is a norm in such that in the induced operator norm, we have . In particular, if and only if all operators from are contractions in some (common) norm. Thus, the joint spectral radius is the indicator of simultaneous contractivity of operators .

Another interpretation is due to the discrete dynamical system:

where each matrix is chosen from independently for every and . Then is the exponent of fastest possible growth of trajectories of the system: the maximal upper limit among all the trajectories is equal to . In particular, if and only if the system is stable, i.e., as , for every trajectory.

The problem of computing or estimating the joint spectral radius is notoriously hard. This is natural in view of the negative complexity results of Blondel and Tsitsiklis [6, 7]. Several methods of approximate computation have been elaborated in the literature [2, 4, 5, 9, 11, 16, 18, 28, 29, 30, 32]. They work well in low dimensions (mostly not exceeding ). When the dimension grows, either the estimate becomes rough or the running time grows dramatically. In recent work [17] we derived the invariant polytope algorithm, which finds the exact value of for the vast majority of matrix families in dimensions up to . For sets of nonnegative matrices it works faster and finds the exact value in dimensions up to and higher. We conjectured in [17] that the set of matrix families for which this algorithm finds within finite time is of full Lebesgue measure. Several open problems from applications have been solved by this method. However, it cannot handle one important case, which often emerges in practice: the case of several spectrum maximizing products. In this paper we generalize the method, making it applicable to a wider class of matrix families, including that special case. To formulate the problem we need some more facts and notation.

The following double inequality for the joint spectral radius is well known:

(2) |

Moreover, both parts of this inequality converge to as . In fact, (by definition) and (see [3]). All algorithms for approximate computation of the joint spectral radius are based on this inequality. First, one finds the greatest value of (the left-hand side of (2)) among all reasonably small , then one minimizes the value (the right-hand side of (2)) by choosing an appropriate matrix norm . Thus, maximizing the lower bound and minimizing the upper one, we approximate the joint spectral radius. Sometimes these two bounds meet, which gives the exact value of . This happens when one finds a spectrum maximizing product and a norm for which both inequalities in (2) become equalities.
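Both sides of the double inequality (2) can be evaluated by enumeration for small lengths. The following sketch, again for an illustrative 2×2 family assumed for this example (not from the paper), computes the lower bound from spectral radii and the upper bound from operator norms; for this family the two bounds already coincide, certifying the joint spectral radius.

```python
import math
from itertools import product as cartesian

# Illustrative normalized family (an assumption for this sketch).
A = [[[1.0, 0.0], [0.0, 0.5]],
     [[0.0, 1.0], [0.5, 0.0]]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def spectral_radius(X):
    # Eigenvalues of a 2x2 matrix via the characteristic polynomial.
    tr = X[0][0] + X[1][1]
    det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    disc = tr * tr / 4.0 - det
    if disc >= 0:                      # real eigenvalues
        s = math.sqrt(disc)
        return max(abs(tr / 2.0 + s), abs(tr / 2.0 - s))
    return math.sqrt(det)              # complex pair: modulus sqrt(det)

def inf_norm(X):
    return max(abs(X[i][0]) + abs(X[i][1]) for i in range(2))

def jsr_bounds(mats, k):
    """Lower and upper bounds of (2): max rho(Pi)^(1/k), max ||Pi||^(1/k)."""
    lo = hi = 0.0
    for choice in cartesian(mats, repeat=k):
        P = choice[0]
        for M in choice[1:]:
            P = matmul(P, M)
        lo = max(lo, spectral_radius(P) ** (1.0 / k))
        hi = max(hi, inf_norm(P) ** (1.0 / k))
    return lo, hi

lo, hi = jsr_bounds(A, 6)
print(lo, hi)   # both equal 1.0 here: the bounds meet, so the JSR is 1
```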

###### Definition 2

A simple product is called a spectrum maximizing product (s.m.p.) if the value is maximal among all products of matrices from of all lengths .

Let us remark that an s.m.p. maximizes the value among all products of our matrices, not just among products of length . Observe that for any s.m.p. we have . Indeed, from (2) it follows that . If this inequality were strict, then there would exist such that (because of the convergence property), which contradicts the maximality of . Thus, to find the joint spectral radius it suffices to prove that a given product is an s.m.p. The invariant polytope algorithm [17] proves the s.m.p. property of a chosen product by a recursive construction of a polytope (or, more generally, , although here for simplicity we consider real polytopes) such that .

In the Minkowski norm generated in by this polytope, we have and is said to be an extremal norm for . Applying (2) for (left hand side inequality) and for (right hand side) we conclude that .

Note that if a product is s.m.p., then so is each of its cyclic permutations. If there are no other s.m.p., then we say that the s.m.p. is unique meaning that it is unique up to cyclic permutations.

The disadvantage of the polytope algorithm is that it is guaranteed to work only if the family has a unique s.m.p. Otherwise the algorithm may not be able to construct the desired polytope within finite time, even if it exists. The uniqueness-of-s.m.p. condition, although believed to be generic, is not satisfied in many practical cases. For example, it often happens that several matrices are s.m.p.’s (of length ). The extension of our algorithm presented in this paper works with an arbitrary (finite) number of s.m.p.’s. We prove a theoretical criterion of convergence of the algorithm and apply it to solve two long-standing open problems: computing the Hölder regularity of the Butterfly subdivision scheme (Section 5) and computing the regularity of Daubechies wavelets of high order (Section 6).

## 2 Statement of the problem

We begin with a short description of the invariant polytope algorithm from [17] for computing the joint spectral radius and finding an extremal polytope norm of a given family . We make the usual assumption that the family is irreducible, i.e., its matrices do not have a common nontrivial invariant subspace. Recall that for a reducible family the computation of the joint spectral radius reduces to several problems of smaller dimensions [11]. For a given set we denote by the convex hull of and by the symmetrized convex hull. The sign denotes, as usual, the asymptotic equivalence of two values (i.e., equivalence up to multiplication by a constant).

Initialization. First, we fix some number and find a simple product with the maximal value among all products of lengths . We call this product a candidate s.m.p. and try to prove that it is actually an s.m.p. Denote and normalize all the matrices as . Thus we obtain the family and the product such that . For the sake of simplicity we assume that the eigenvalue of of largest modulus is real, in which case it is . We assume it is ; the case of is treated in the same way. The eigenvector corresponding to this eigenvalue is called the leading eigenvector. The vectors , are leading eigenvectors of cyclic permutations of . The set is called the root. Then we construct a sequence of finite sets and their subsets as follows:

Zero iteration. We set .

-th iteration, . We have a finite set and its subset . We set and, for every , check whether is an interior point of (this is an LP problem). If so, we omit this point and take the next pair ; otherwise we add to and to . When all pairs are exhausted, both and are constructed. Let . We have

Termination. The algorithm halts when , i.e., (no new vertices are added in the -th iteration). In this case , and hence is an invariant polytope, is an s.m.p., and . End of the algorithm.

Actually, the algorithm works with the sets only; the polytopes are needed only to illustrate the idea. Thus, in each iteration of the algorithm, we construct a polytope , store all its vertices in the set and record the set of newly appeared (after the previous iteration) vertices. Every time we check whether . If so, then is an invariant polytope, for all , where is the Minkowski norm associated to the polytope , and is an s.m.p. Otherwise, we update the sets and and continue.
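The iteration just described can be sketched in the plane, where the LP interior test admits a naive replacement by an explicit convex-hull computation. The family below is illustrative (an assumption for this sketch, not from the paper): the first matrix plays the role of the candidate s.m.p., normalized to spectral radius one, and the algorithm starts from its leading eigenvector.

```python
# Minimal 2-D sketch of the invariant polytope iteration.  The LP
# interior test is replaced by an explicit convex-hull test, which is
# practical only in the plane.
TOL = 1e-9

def matvec(A, v):
    return (A[0][0]*v[0] + A[0][1]*v[1], A[1][0]*v[0] + A[1][1]*v[1])

def cross(o, a, b):
    return (a[0]-o[0])*(b[1]-o[1]) - (a[1]-o[1])*(b[0]-o[0])

def convex_hull(pts):                      # Andrew's monotone chain, CCW
    pts = sorted(set(pts))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def strictly_inside(p, hull):
    if len(hull) < 3:
        return False
    return all(cross(hull[i], hull[(i+1) % len(hull)], p) > TOL
               for i in range(len(hull)))

def close(p, q):
    return abs(p[0]-q[0]) < TOL and abs(p[1]-q[1]) < TOL

def invariant_polytope(mats, v0, max_iter=50):
    V, new = [v0], [v0]                    # vertex set and new vertices
    for _ in range(max_iter):
        cand = [matvec(A, v) for A in mats for v in new]
        new = []
        for p in cand:
            if any(close(p, q) or close(p, (-q[0], -q[1])) for q in V):
                continue                   # already a vertex (up to sign)
            sym = V + [(-q[0], -q[1]) for q in V]
            if strictly_inside(p, convex_hull(sym)):
                continue                   # absorbed by the current polytope
            V.append(p)
            new.append(p)
        if not new:                        # no new vertices: termination
            return V, True
    return V, False

# A1 is the candidate s.m.p. (rho = 1), (1, 0) its leading eigenvector.
A1 = [[1.0, 0.0], [0.0, 0.5]]
A2 = [[0.0, 1.0], [0.5, 0.0]]
V, ok = invariant_polytope([A1, A2], (1.0, 0.0))
print(ok, V)   # -> True [(1.0, 0.0), (0.0, 0.5)]
```

For this toy family the algorithm terminates after the second iteration: the symmetrized convex hull of the two vertices is a rhombus that both matrices map into itself, certifying that the joint spectral radius equals one.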

If the algorithm terminates within finite time, then it proves that the chosen candidate is indeed an s.m.p. and gives the corresponding polytope norm. Although there are simple examples of matrix families for which the algorithm does not terminate, we believe that such cases are rare in practice. In fact, in all numerical experiments made with randomly generated matrices and with matrices from applications, the algorithm did terminate in finite time, providing an invariant polytope. The only special case when it does not work is when there are several different s.m.p.’s (up to cyclic permutations). In this case the algorithm never converges, as follows from the criterion proved in [17]. The criterion uses the notion of a dominant product, which is a strengthening of the s.m.p. property. A product is called dominant for the family if there is a constant such that the spectral radius of each product of matrices from the normalized family which is neither a power of nor one of its cyclic permutations is smaller than . A dominant product is an s.m.p., but, in general, not vice versa.

Theorem A [17]. For a given set of matrices and for a given initial product , the invariant polytope algorithm terminates within finite time if and only if is dominant and its leading eigenvalue is unique and simple.

Note that if there is another s.m.p. which is neither a power of nor a power of one of its cyclic permutations, then is not dominant. Therefore, from Theorem A we conclude:

###### Corollary 1

If a family has more than one s.m.p., apart from taking powers or cyclic permutations, then, for every initial product, the invariant polytope algorithm does not terminate within finite time.

The problem occurs when a family has several s.m.p.’s, a situation which, although not generic, does arise in relevant applications. Mostly these are s.m.p.’s of length , i.e., some of the matrices of the family have the same spectral radius and dominate the others. This happens, for instance, for transition matrices of refinement equations and wavelets (see Sections 5, 6). In the next section we show that the algorithm can be modified and extended to families with finitely many spectrum maximizing products.

Let a family have candidate s.m.p.’s . These products are assumed to be simple and different (up to cyclic permutations). Denote by the length of . Thus, . Let be the normalized family, be a leading eigenvector of (it is assumed to be real) and be the corresponding roots, . The first idea is to start constructing the invariant polytope with all roots simultaneously, i.e., with the initial set of vertices

However, this algorithm may fail to converge as the following example demonstrates.

###### Example 1

Let and be operators in : is a contraction with factor towards the axis along the vector , is a contraction with factor towards the axis along the vector . The matrices of these operators are

Clearly, both and have a unique simple leading eigenvalue ; is the leading eigenvector of and is the leading eigenvector of .

The algorithm with two candidate s.m.p.’s and with initial vertices does not converge. Indeed, the set has an extreme point , which tends to the point as , but never reaches it.

On the other hand, the same algorithm with initial vertices terminates immediately after the first iteration with the invariant polytope (rhombus) . Indeed, one can easily check that . Therefore, and are both s.m.p. and .

This example shows that if the algorithm does not converge with the leading eigenvectors (or with the roots ), it may nevertheless converge if one multiplies these eigenvectors (or the roots) by some numbers . In Example 1 we have . We call a vector of positive multipliers a balancing vector.

Thus, if a family has several candidate s.m.p.’s, then one can balance its leading eigenvectors (or its roots), i.e., multiply them by the entries of some balancing vector, and start the invariant polytope algorithm. In the next section we prove that a balancing vector for which the algorithm converges does exist and can be efficiently found, provided all the products are dominant (in the natural extension of dominance from a single product to a set of products). The corresponding extension of the invariant polytope algorithm is given by Algorithm 1.

A crucial point in Algorithm 1 is Step 5, where suitable scaling factors have to be given. Algorithm 1 then essentially repeats the invariant polytope algorithm, replacing the root by the union of roots . Finding scaling factors that ensure convergence of the algorithm is a nontrivial problem. A methodology to compute them in an optimal way (in other words, to balance the leading eigenvectors) is derived in the next section.

###### Remark 1

So far we have assumed that the leading eigenvalue is real. This is not a restriction, because the invariant polytope algorithm generalizes to the complex case as well (Algorithm C in [17, 18, 22]). For the sake of simplicity, in this paper we consider only the case of a real leading eigenvalue.

## 3 Balancing leading eigenvectors. The main results

### 3.1 Definitions, notation, and auxiliary facts

Let be some products of matrices from of lengths , respectively. They are assumed to be simple and different up to cyclic permutations. We also assume that all these products are candidate s.m.p.’s; in particular, . We set and denote by the supremum of the norms of all products of matrices from . Since is irreducible and , this supremum is finite [3]. By we denote the family of adjoint matrices; the definition of is analogous.

To each product we associate a word over the alphabet . By we denote the concatenation of words and ; in particular, ( times); the length of the word is denoted by . A word is simple if it is not a power of a shorter word. In the sequel we identify words with the corresponding products of matrices from .
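The simplicity condition above can be tested directly: a word of length n is a power of a shorter word exactly when it equals some repetition of a proper prefix whose length divides n. A minimal sketch:

```python
def is_simple(word):
    """True iff the word is not a power of a strictly shorter word."""
    n = len(word)
    for d in range(1, n):
        if n % d == 0 and word[:d] * (n // d) == word:
            return False
    return True

print(is_simple("abab"))   # -> False: ("ab")^2
print(is_simple("aab"))    # -> True
```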

###### Assumption 1

We now make the main assumption: each product has a real leading eigenvalue (Remark 1), which in this case is either or . For the sake of simplicity, we assume that all (the case is treated in the same way). We denote by the corresponding leading eigenvector of (one of them, if it is not unique).

Clearly, the corresponding adjoint matrix also has a leading eigenvalue , and a real leading eigenvector . If the leading eigenvalue is unique and simple, then . In this case we normalize the adjoint leading eigenvector by the condition .

Take arbitrary and consider the product , where and (for the sake of simplicity we omit the indices ). The vectors are the leading eigenvectors of cyclic permutations of the product . The set is a root of the tree from which the polytope algorithm starts. Let be the polytope produced after the -th iteration of the algorithm started with the product or, which is the same, with the root . This polytope is the symmetrized convex hull of the set of all alive vertices of the tree on the first levels. In particular, and . We denote by and the unions of the corresponding sets over all . If Algorithm 1 terminates within finite time, then is finite and is a polytope. If is an s.m.p., i.e., if , then, by irreducibility, the set is bounded [3].

Now assume that all have unique simple leading eigenvalues, in which case all the adjoint leading eigenvectors are normalized by the condition . For an arbitrary pair , and for arbitrary , we denote

(3) |

Thus, is the length of the orthogonal projection of the convex body onto the vector . In particular, . Sometimes we omit the superscript if its value is specified. For given and for a point , we denote

(4) |

Note that for , the sets and have the same symmetrized convex hull . Therefore, the maxima of the function over these two sets coincide with the maximum over . Hence, comparing (3) and (4) gives

For an arbitrary balancing vector , we write . Our aim is to find a balancing vector such that the polytope algorithm starting simultaneously with the roots terminates within finite time.

###### Definition 3

Let . A balancing vector is called -admissible if

(5) |

An -admissible vector is called admissible.

Since the value is non-decreasing in , we see that the -admissibility for some implies the same holds true for all smaller . In particular, an admissible vector is -admissible for all .

We begin with two auxiliary facts needed in the proofs of our main results.

###### Lemma 1

If a matrix and a vector , are such that , then has an eigenvalue for which , where depends only on the dimension .

See [36] for the proof. The following combinatorial fact is well-known:

###### Lemma 2

Let be two simple nonempty words of a finite alphabet, be natural numbers. If the word contains a subword such that , then is a cyclic permutation of .

Now we extend the key property of dominance to a set of candidate s.m.p.’s.

###### Definition 4

Products are called dominant for the family if all the numbers , are equal (denote this common value by ) and there is a constant such that the spectral radius of each product of matrices from the normalized family which is neither a power of some nor a power of one of its cyclic permutations is smaller than .

### 3.2 Criterion of convergence of Algorithm 1

If the products are dominant, then they are s.m.p.’s, but, in general, not vice versa. The s.m.p. property means that the function ( is the length of ), defined on the set of products of the normalized family , attains its maximum (equal to one) at the products and at their powers and cyclic permutations. The dominance property means, in addition, that for all other products this function is smaller than some . This property may seem too strong; however, the following theorem shows that it is rather general.

###### Theorem 1

Algorithm 1 with the initial products and with a balancing vector terminates within finite time if and only if these products are all dominant, their leading eigenvalues are unique and simple, and is admissible.

The proof is in Appendix 1.

###### Remark 2

Theorem 1 implies that if the algorithm terminates within finite time, then the leading eigenvalues of products must be unique and simple. That is why we defined admissible balancing vectors for this case only.

If Algorithm 1 produces an invariant polytope, then our candidate s.m.p.’s are not only s.m.p.’s but also dominant products. Numerical experiments suggest that the situation when the algorithm terminates within finite time (and hence there are dominant products) should be generic.

### 3.3 The existence of an admissible balancing vector

By Theorem 1, if all our candidate s.m.p.’s are dominant and have unique simple leading eigenvalues, then, balancing the corresponding roots by weights, we can run the algorithm and construct an invariant polytope, provided the balancing vector is admissible. A natural question arises whether an admissible vector always exists. The next theorem gives an affirmative answer. Before formulating it, we need an auxiliary result.

###### Lemma 3

For given coefficients the system (5) has a solution if and only if for every nontrivial cycle on the set , we have (with )

(6) |

Proof. The necessity is simple: for an arbitrary cycle we multiply the inequalities and obtain (6). To prove sufficiency, we slightly increase all the numbers so that (6) still holds for all cycles. This is possible because the total number of cycles is finite. We set and , where the maximum is computed over all paths with . Note that if a path contains a cycle, then removing it increases the product , since the corresponding product along the cycle is smaller than one. This means that, in the definition of , it suffices to take the maximum over all simple paths (without repeated vertices), i.e., over a finite set.

It is easy to see that . Reducing all back to their original values, we now obtain strict inequalities.
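The displayed system (5) is not reproduced in this excerpt; assuming it has the multiplicative form t_i ≥ c_ij · t_j suggested by the proof (multiplying the inequalities along a cycle yields the cycle condition (6)), Lemma 3 becomes a standard difference-constraint problem after taking logarithms, solvable by Bellman–Ford relaxation. A hedged sketch under that assumption:

```python
import math

# Assumed form of (5): t_i >= c_ij * t_j for given pairs (i, j) with
# c_ij > 0.  In logarithmic variables x_i = log t_i these are difference
# constraints x_i >= x_j + log c_ij; a violated cycle condition (6)
# shows up as a positive-weight cycle during Bellman-Ford relaxation.
def balancing_multipliers(r, coeffs):
    """coeffs: dict mapping (i, j) -> c_ij.  Returns multipliers t,
    or None if some cycle product is >= 1 (system infeasible)."""
    x = [0.0] * r
    edges = [(j, i, math.log(c)) for (i, j), c in coeffs.items()]
    for _ in range(r):
        updated = False
        for j, i, w in edges:
            if x[i] < x[j] + w - 1e-12:     # relax x_i >= x_j + w
                x[i] = x[j] + w
                updated = True
        if not updated:                     # all constraints satisfied
            return [math.exp(xi) for xi in x]
    return None                             # positive cycle detected

print(balancing_multipliers(2, {(0, 1): 0.5, (1, 0): 0.5}))  # -> [1.0, 1.0]
print(balancing_multipliers(2, {(0, 1): 2.0, (1, 0): 2.0}))  # -> None
```

In the first call the cycle product is 0.25 < 1 and a solution exists; in the second it is 4 > 1, matching the necessity direction of the lemma.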

###### Theorem 2

If the products are dominant and have unique simple leading eigenvalues, then they have an admissible balancing vector.

Proof. In view of Lemma 3, it suffices to show that for every cycle on the set we have . We denote this quantity by and take arbitrary . There is a product of matrices from such that . Without loss of generality we assume that this scalar product is positive. Since the product has a unique simple leading eigenvalue , it follows that for every we have as . Applying this to the vector , we conclude that whenever is large enough. Thus, for the product , the vector is close to . Analogously, for each , we find a product such that the vector is close to . Therefore, for the product , the vector is close to . Note that , where is the supremum of the norms of all products of matrices from . If , then invoking Lemma 1 we conclude that , where can be made arbitrarily small by taking . Due to the dominance assumption, it follows that is a power of some , where is the set of the products and of their cyclic permutations. Taking large enough, we apply Lemma 2 to the words and conclude that is a cyclic permutation of . Similarly, is a cyclic permutation of . This is impossible, because and the products are not cyclic permutations of each other. The contradiction proves that , which completes the proof of the theorem.

###### Remark 3

In a recently published paper [25], Möller and Reif present another approach to the computation of the joint spectral radius. Developing ideas from [23], they come up with an elegant branch-and-bound algorithm which, in contrast to the classical branch-and-bound method [16], can find the exact value. Although its running time is typically larger than that of our invariant polytope algorithm [17] (we compare the two algorithms in Example 2 below), it has several advantages. In particular, it uses the same scheme for the cases of one and of many s.m.p.’s. It would be interesting to analyze possible applications of the balancing idea to that algorithm.

### 3.4 How to find the balancing vector

Thus, an admissible balancing vector does exist, provided our candidate s.m.p.’s are dominant products and their leading eigenvalues are unique and simple. To find , we take some , compute the values by evaluating the polytopes , set and solve the following LP problem in the variables :

(7) |

If , then an -admissible vector does not exist. In this case, we have to either increase or find other candidate s.m.p.’s. If , then we have a -admissible vector . This vector is optimal in the sense that the minimal ratio between and over all is the largest possible.

###### Remark 4

To find an admissible vector one needs to solve the LP problem (7) for . However, in this case the evaluation of the coefficients may, a priori, require infinite time. Therefore, we solve this problem for some finite and then run Algorithm 1 with the obtained balancing vector . If the algorithm terminates within finite time, then is indeed admissible (Theorem 1). Otherwise, we cannot conclude that there is no admissible balancing vector or that our candidate s.m.p.’s are not dominant. We then try to increase and find a new vector .

Thus, Step 5 of Algorithm 1 consists in choosing a reasonably large and solving the LP problem (7). If this yields , then a balancing vector does not exist; hence the algorithm will never converge, and we have to find another candidate s.m.p. If , then the vector is -admissible. If the algorithm does not converge with this , we increase and solve (7) again.

###### Remark 5

Our approach also works well if the family has a unique s.m.p. but there are other simple products for which the values , although smaller than , are close to it. In this case the (original) invariant polytope algorithm sometimes converges slowly, performing many iterations and producing many vertices. This is natural: if, say, with a very small positive , then the dominance of over takes effect only after many iterations. Our approach suggests collecting all those “almost s.m.p.” candidates, adding them to , finding the balancing multipliers for their roots by solving the LP problem (7), and running Algorithm 1 with the initial set . In most practical cases, this modification significantly speeds up the algorithm.

Another modification of the invariant polytope algorithm is considered in the next section.

###### Example 2

Consider the following example introduced by Deslaurier and Dubuc in [12], associated to an eight-point subdivision scheme,

The joint spectral radius of was found in [25], where it was shown that both and are s.m.p.’s. Its computation required the construction of a binary tree with levels and the consideration of about matrix products (i.e., vertices of the tree). Applying our Algorithm 1 with the candidate s.m.p.’s and and with a balancing vector for the leading eigenvectors and of and , respectively, we construct the invariant polytope with vertices in steps:

Thus, in our case it suffices to construct a binary tree with levels and consider of its vertices.

## 4 Introducing extra initial vertices

The same approach developed for the case of many s.m.p. can be used to introduce extra initial vertices. Sometimes Algorithm 1 converges slowly because the family is not well-conditioned: its matrices have a common “almost invariant subspace” of some dimension