Primitive sets of nonnegative matrices and synchronizing automata

Balázs Gerencsér Alfréd Rényi Institute of Mathematics, Hungarian Academy of Sciences
Budapest, Hungary
gerencser.balazs@renyi.mta.hu
Vladimir V. Gusev Raphaël M. Jungers ICTEAM Institute, Université catholique de Louvain
Louvain-la-Neuve, Belgium
{vladimir.gusev, raphael.jungers}@uclouvain.be
Abstract

A set of nonnegative matrices $\mathcal{M}=\{M_1, M_2, \ldots, M_k\}$ is called primitive if there exist indices $i_1, i_2, \ldots, i_m$ such that $M_{i_1}M_{i_2}\cdots M_{i_m}$ is positive (i.e. has all its entries $>0$). The length of the shortest such product is called the exponent of $\mathcal{M}$. The concept of primitive sets of matrices comes up in a number of problems within control theory, non-homogeneous Markov chains, automata theory, etc. Recently, connections between synchronizing automata and primitive sets of matrices were established. In the present paper, we significantly strengthen these links by providing equivalence results, both in terms of combinatorial characterization and computational aspects.

We study the maximal exponent among all primitive sets of $n\times n$ matrices, which we denote by $\exp(n)$. We prove that $\lim_{n\to\infty}\frac{\log \exp(n)}{n}=\frac{\log 3}{3}$, and moreover, we establish that this bound leads to a resolution of the Černý problem for carefully synchronizing automata. We also study the set of matrices with no zero rows and columns, denoted by $\mathrm{NZ}$, due to its intriguing connections to the Černý conjecture and the recent generalization of Perron-Frobenius theory for this class. We characterize the computational complexity of different problems related to the exponent of $\mathrm{NZ}$ matrix sets, and present a quadratic bound on the exponents of sets belonging to a special subclass. Namely, we show that the exponent of a set of $n\times n$ matrices having total support is bounded by $2n^2-5n+5$.

Keywords: Nonnegative matrices, primitive sets of matrices, the exponent of a matrix set, carefully synchronizing automata, the Černý conjecture
Subject classification: F.1.1 Models of Computation, G.2.1 Combinatorics

1 Introduction

A nonnegative matrix $M$ of size $n\times n$ is called primitive if $M^k$ is positive (i.e. has all its entries larger than zero) for a positive integer $k$. This notion was introduced by Frobenius in 1912 during the development of so-called Perron-Frobenius theory. This theory has found numerous applications since then: in the theory of Markov chains, economics, population modelling, centrality measures in networks, see [21, Chapter 8] for an introduction to the topic. Motivated by various applications, Protasov and Voynov introduced the following generalization of this notion to sets of matrices [26]: a finite set of (entrywise) nonnegative matrices $\mathcal{M}=\{M_1, M_2, \ldots, M_k\}$ is called primitive if $M_{i_1}M_{i_2}\cdots M_{i_m}$ is (entrywise) positive for some indices $i_1, i_2, \ldots, i_m\in\{1,\ldots,k\}$. The length of the shortest such product is called the exponent of $\mathcal{M}$. We will denote the value of the largest exponent among all sets of $n\times n$ matrices by $\exp(n)$. For example, the matrix set in Fig. 1 is primitive, since the product is entrywise positive, and its exponent is equal to . Since the actual values of positive entries of matrices in $\mathcal{M}$ do not influence the exponent, in the rest of our paper we will implicitly assume that entries of all matrices are equal to 0 or 1. Moreover, we assume that the product of two matrices $A$ and $B$ of size $n\times n$ is also a $(0,1)$-matrix, defined as follows (formally speaking, we consider the matrices over the Boolean semiring): $(AB)_{ij}=1$ if $\sum_{k=1}^{n}A_{ik}B_{kj}>0$, and $(AB)_{ij}=0$ otherwise.
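The definitions above can be checked on small instances by a breadth-first search over Boolean products. The following is a minimal sketch, not the authors' algorithm: the function names `bool_product` and `exponent` and the three example matrices are illustrative choices, and the search visits up to $2^{n^2}$ distinct Boolean matrices, so it is only practical for very small $n$.

```python
from collections import deque


def bool_product(a, b):
    """Product of two square 0/1 matrices over the Boolean semiring."""
    n = len(a)
    return tuple(
        tuple(int(any(a[i][k] and b[k][j] for k in range(n))) for j in range(n))
        for i in range(n)
    )


def exponent(matrices):
    """Length of the shortest entrywise positive product, or None if none exists."""
    # Only the zero pattern matters, so every entry is reduced to 0 or 1.
    mats = [tuple(tuple(int(x > 0) for x in row) for row in m) for m in matrices]
    n = len(mats[0])
    positive = tuple(tuple(1 for _ in range(n)) for _ in range(n))
    seen = set(mats)
    queue = deque((m, 1) for m in seen)
    while queue:
        product, length = queue.popleft()
        if product == positive:
            return length
        for m in mats:
            nxt = bool_product(product, m)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, length + 1))
    return None


# Illustrative inputs (not taken from the paper's Fig. 1).
WIELANDT_3 = [[0, 1, 0], [0, 0, 1], [1, 1, 0]]  # single primitive matrix
SWAP = [[0, 1], [1, 0]]                         # permutation, not primitive alone
UPPER = [[1, 1], [0, 1]]                        # triangular, not primitive alone

print(exponent([WIELANDT_3]))   # 5
print(exponent([SWAP, UPPER]))  # 3
print(exponent([SWAP]))         # None
```

The second call shows a primitive set none of whose members is primitive on its own: the product `UPPER * SWAP * UPPER` is the shortest positive one.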

Primitive sets of matrices have received a lot of attention for different reasons. We refer the reader to the introduction in [4] for an account of applications of primitive matrix sets to stochastic control theory and to the consensus problem. The connections to contractive matrix families and scrambling matrices are given in detail in [26, Section 5]. Primitive sets of matrices further arise in the study of time-inhomogeneous Markov chains [14], and are of importance in mathematical ecology [19]. Furthermore, primitive sets of matrices are tightly related to Boolean networks, which are widely used in biology to model gene regulatory networks. A special class of Boolean networks, the disjunctive networks, can be seen as a set of matrices over the Boolean semiring. While researchers in theoretical biology are mostly interested in the attractors and the limit cycles for different types of update schedules, see for example [12], we are mainly interested in whether the all-one state can be reached and how fast.

The subfamily of nonnegative matrices that have no zero rows and columns, denoted by $\mathrm{NZ}$, will be of major interest to us for the following reasons. A matrix $M$ is called irreducible if for every $i,j$ there exists a positive integer $k$ such that $(M^k)_{ij}>0$. A set of matrices $\{M_1,\ldots,M_k\}$ is irreducible if the matrix $\sum_{i=1}^{k}M_i$ is irreducible. As usual, we will denote by $e_i$ the $i$th vector of the canonical basis in $\mathbb{R}^n$: the $i$th entry of $e_i$ is one, all the others are zeros. We say that a matrix $M$ acts as a permutation on a partition $\Omega_1,\ldots,\Omega_r$ of the vectors of the canonical basis if there exists a permutation $\sigma$ of $\{1,\ldots,r\}$ such that for all $i$ and all $e_j\in\Omega_i$, the vector $Me_j$ belongs to the subspace spanned by $\Omega_{\sigma(i)}$. A classical theorem of Perron-Frobenius theory states that an irreducible matrix $M$ is primitive if and only if there is no partition $\Omega_1,\ldots,\Omega_r$ of the canonical basis vectors for $r\ge 2$ such that $M$ acts as a permutation on $\Omega_1,\ldots,\Omega_r$. Protasov and Voynov generalized this theorem to sets of matrices belonging to $\mathrm{NZ}$ [26]: an irreducible set of matrices belonging to $\mathrm{NZ}$ is primitive if and only if there is no partition $\Omega_1,\ldots,\Omega_r$ for $r\ge 2$ such that every matrix of the set acts as a permutation on $\Omega_1,\ldots,\Omega_r$. Thus, the class of primitive matrices belonging to $\mathrm{NZ}$ can be viewed as the right class for a Perron-Frobenius-type theory of matrix sets. This characterization also leads to an efficient algorithm that decides whether a set of matrices belonging to $\mathrm{NZ}$ is primitive.

1.1 Synchronizing automata

A deterministic finite state automaton $\mathcal{A}$ is a triple $\langle Q,\Sigma,\delta\rangle$ (the classical definition also involves an initial state and a set of final states; since they don’t play any role in our considerations, we omit them), where $Q$ is a finite set of states, $\Sigma$ is a finite set of input symbols called the alphabet, and $\delta$ is a transition function $\delta: Q\times\Sigma\to Q$. The image of a state $q$ under the action of a word $w$ is denoted by $q\cdot w$. An automaton $\mathcal{A}$ is called synchronizing if there exist a word $w$ and a state $f$ such that for every state $q$ we have $q\cdot w=f$. Any such word is called a synchronizing or reset word. The length of the shortest such word is called the reset threshold of $\mathcal{A}$. Synchronizing automata naturally appear in different areas of research. For example, they were used to model sensorless parts orienting problems: given a part and a set of available actions that can change its spatial orientation, find a sequence of actions that would bring the part to a desired orientation independently of the initial position [22]. Clearly, if we consider an automaton $\mathcal{A}$ with the set of spatial orientations as the set of states, and the available actions as letters, then the “orienting sequence” corresponds to a synchronizing word of $\mathcal{A}$. We refer the reader to [28] for a survey of the main results and other applications. A recent account of applications of synchronizing automata in group theory can be found in [2]. The persistent interest of the research community in the topic is also driven by one of the most famous open problems in automata theory. Namely, the Černý conjecture states that the reset threshold of an $n$-state automaton is at most $(n-1)^2$ [6, 7]. This bound is reached by the $n$-state Černý automaton $\mathcal{C}_n$, see [28, p. 18], but despite intensive efforts of researchers, the best upper bound $(n^3-n)/6$ was obtained more than 30 years ago in [25, 9] and independently in [18].
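For small automata the reset threshold can be computed directly by a breadth-first search over subsets of states. The sketch below is illustrative only: the names `reset_threshold` and `cerny`, the encoding of a letter as a list of images, and the labelling of the states as $0,\ldots,n-1$ are assumptions made here, and the search is exponential in the number of states.

```python
from collections import deque


def reset_threshold(n, letters):
    """Length of the shortest reset word of a complete DFA, or None.

    `letters` maps a letter name to a list `delta` with delta[q] the image of q.
    """
    start = frozenset(range(n))
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        subset, length = queue.popleft()
        if len(subset) == 1:
            return length
        for delta in letters.values():
            image = frozenset(delta[q] for q in subset)
            if image not in seen:
                seen.add(image)
                queue.append((image, length + 1))
    return None


def cerny(n):
    """Cerny automaton: 'a' is a cyclic shift, 'b' merges state n-1 into state 0."""
    return {
        "a": [(q + 1) % n for q in range(n)],
        "b": [0 if q == n - 1 else q for q in range(n)],
    }


for n in (3, 4, 5):
    print(n, reset_threshold(n, cerny(n)))  # 3 4, 4 9, 5 16
```

The printed values agree with $(n-1)^2$, the bound of the Černý conjecture.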

The notion of a synchronizing automaton can be generalized in three different ways to nondeterministic automata [16]. We will focus our attention on the one most relevant for us. An automaton $\mathcal{A}$ is a partial automaton if the transition function $\delta$ is partial, i.e. there might be undefined transitions for some pairs of states and letters. A partial automaton $\mathcal{A}$ is carefully synchronizing if there exist a word $w$ and a state $f$ such that $q\cdot w$ is defined and equal to $f$ for every state $q$. Any such word is called a carefully synchronizing word. The length of the shortest such word we will denote by $\mathrm{car}(\mathcal{A})$. We will denote by $\mathrm{car}(n)$ the maximum of $\mathrm{car}(\mathcal{A})$ among all $n$-state partial automata. Essentially, carefully synchronizing automata model the problem of bringing a simple finite-state device to a known state with a single input sequence, while avoiding undefined transitions, which are undesirable or can break the device. In matrix terms, this amounts to considering a set of matrices with at most one 1-entry per row, and asking for a product with one (entrywise) positive column.
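Careful synchronization can be explored with the same subset search as for complete automata, with the extra rule that a letter may only be applied to a subset on which it is defined everywhere. A minimal sketch under assumptions made here: undefined transitions are encoded as `None`, the function name `careful_word` and the three-state example are hypothetical, and letter names are single characters.

```python
from collections import deque


def careful_word(n, letters):
    """Shortest carefully synchronizing word of a partial automaton, or None.

    `letters` maps a one-character name to a list `delta`;
    delta[q] is the image of state q, or None if the transition is undefined.
    """
    start = frozenset(range(n))
    seen = {start}
    queue = deque([(start, "")])
    while queue:
        subset, word = queue.popleft()
        if len(subset) == 1:
            return word
        for name, delta in letters.items():
            # A letter is usable only if it is defined on every current state.
            if any(delta[q] is None for q in subset):
                continue
            image = frozenset(delta[q] for q in subset)
            if image not in seen:
                seen.add(image)
                queue.append((image, word + name))
    return None


# Hypothetical example: 'b' is undefined on state 2.
LETTERS = {"a": [1, 2, 0], "b": [0, 0, None], "c": [0, 1, 1]}

print(careful_word(3, LETTERS))                                   # cb
print(careful_word(3, {"a": [1, 2, 0], "b": [0, 0, None]}))       # None
```

In the second call no letter that is defined on every state shrinks the state set, so no carefully synchronizing word exists.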

1.2 Our contributions

Our results can be informally arranged into three different groups. The contributions of the first group significantly improve the understanding of the relationships between primitive sets of matrices and synchronizing automata. The work within this framework started in [1], where well-known examples of primitive matrices with large exponent were used to construct series of automata with relatively large reset thresholds, so-called “slowly synchronizing automata”. In [4] it was shown that a bound on the reset threshold of $n$-state automata implies a bound on the exponent of $\mathrm{NZ}$ matrix sets. We significantly improve these results. We show that the growth rate of $\exp(n)$ is equal to $\Theta(\mathrm{car}(n))$. Thus, in a certain sense, the study of the exponents of sets of matrices is equivalent to the study of carefully synchronizing automata. We also formulate an analogous result for primitive $\mathrm{NZ}$ matrix sets. Namely, we introduce a special class of automata such that the growth rate of the reset thresholds of automata in this class is equivalent to the growth rate of the exponents of $\mathrm{NZ}$ matrix sets. We propose and formalize a new open question of whether a quadratic bound on the exponents of $\mathrm{NZ}$ matrix sets leads to a breakthrough on the Černý conjecture.

The contributions of the second group are of combinatorial nature. Our main result states that $\lim_{n\to\infty}\frac{\log \exp(n)}{n}=\frac{\log 3}{3}$, and equivalently, $\lim_{n\to\infty}\frac{\log \mathrm{car}(n)}{n}=\frac{\log 3}{3}$. From the automata theory point of view our contribution can be seen as the resolution of the Černý-like problem for carefully synchronizing automata. From the point of view of matrix theory, our result is a generalization of the classical theorem by Wielandt that the exponent of a single $n\times n$ matrix is at most $(n-1)^2+1$, see for example [15, Corollary 8.5.9]. It also answers the question of establishing the growth rate of $\exp(n)$ posed in [4]. Another contribution in this group is a partial result for $\mathrm{NZ}$ matrix sets. Recall that a matrix $M$ has total support if every non-zero element of $M$ lies on a positive diagonal, i.e. for every $i,j$ such that $M_{ij}>0$ there exists a permutation $\sigma$ with the following properties: $\sigma(i)=j$ and for every $k$ we have $M_{k\sigma(k)}>0$. We prove that the exponent of a set of $n\times n$ matrices having total support is bounded by $2n^2-5n+5$. In the proof we utilize the well-known theorem by Kari that the reset threshold of an $n$-state Eulerian automaton is bounded by $n^2-3n+3$. This result suggests that the bounds for other classes of synchronizing automata might be used to obtain upper bounds on the exponent in special classes of matrix sets.

The contributions of the last group are related to the computational complexity of finding the exponent of an $\mathrm{NZ}$ matrix set. Given a set $\mathcal{M}$ of two matrices belonging to $\mathrm{NZ}$ and possibly an integer $k$ encoded in binary, we establish the exact computational complexity of the following problems:

  1. the problem of deciding whether is -complete;

  2. the problem of deciding whether is -complete;

  3. the problem of computing is -complete.

Furthermore, we show that unless , for every positive there is no polynomial-time algorithm that computes the exponent of an matrix set with the approximation ratio , even in the case of only three matrices in the set. These results are based on a single relatively simple reduction from automata with a sink state to sets of matrices belonging to .

The paper is organized as follows. Section 2 deals with the primitive sets of matrices in the general case. We show that and prove that . Section 3 is devoted to the matrix sets. In subsection 3.1 we introduce the class such that . We also present a quadratic bound on the exponent of a set of matrices having total support. In subsection 3.2 we deal with the complexity issues related to the computation of the exponent of matrix sets.

2 The general case

Recall that we denote the value of the largest exponent among all sets of $n\times n$ matrices by $\exp(n)$. The growth rate of $\exp(n)$ is one of the most basic questions one can ask about primitive sets of matrices. Furthermore, an upper bound on $\exp(n)$ gives a bound on the running time of the straightforward algorithm that decides whether a given set of matrices is primitive: we iterate through all the possible products of length up to $\exp(n)$ and check whether they contain a positive matrix. Since the problem is NP-hard [4, Theorem 6], such a simple algorithm might be the best we can hope for. The best known bounds on $\exp(n)$ were presented in [4, Theorem 10]:

Theorem. If consists of matrices of size then . Moreover, if , then for all there exists a sequence of positive integers tending to infinity such that .

Recall that we denote the maximum of among all -state automata by . In the upcoming theorem we are going to show that grows asymptotically as . Thus, we can utilize the known bounds on to infer the bounds on . Furthermore, in the next subsection we will be able to significantly improve the known upper bound on , and equivalently, on . Before stating the theorem we require one last definition. Given a (partial or complete) automaton , an adjacency matrix of a letter is defined as follows: if , and otherwise. In Fig. 1 the matrix is an adjacency matrix of the letter .
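The correspondence between letters and adjacency matrices can be made concrete as follows. This is a sketch under the convention that rows index source states and columns index target states, which matches the earlier remark that these matrices have at most one 1-entry per row; the helper names and the two example letters are hypothetical.

```python
def adjacency_matrix(n, delta):
    """0/1 matrix of a letter; delta[q] is the image of q, or None if undefined."""
    return [[1 if delta[q] == p else 0 for p in range(n)] for q in range(n)]


def bool_product(a, b):
    """Product over the Boolean semiring."""
    n = len(a)
    return [
        [int(any(a[i][k] and b[k][j] for k in range(n))) for j in range(n)]
        for i in range(n)
    ]


def positive_columns(m):
    """Indices of the columns of m that contain no zero entry."""
    n = len(m)
    return [j for j in range(n) if all(m[i][j] for i in range(n))]


# Hypothetical letters on states {0, 1, 2}; 'b' is undefined on state 2.
C = adjacency_matrix(3, [0, 1, 1])
B = adjacency_matrix(3, [0, 0, None])

print(positive_columns(bool_product(C, B)))  # [0]
print(positive_columns(bool_product(B, C)))  # []
```

The product `C * B` has a positive column because the word `cb` is defined on every state and maps all of them to state 0; the word `bc` is undefined on state 2, which shows up as a zero row.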

Figure 1: The matrix set and the corresponding non-deterministic automaton .
Theorem.

Let be the maximum value of the exponent among all sets of matrices. Let be the maximum value of among all -state partial automata , then .

Proof.

The proof of the first part of the theorem is inspired by [4, Theorem 16]. While, in [4], the result was restricted to matrices, we extend it here to all primitive sets of matrices. Furthermore, we make it deterministic, which will be crucial for Theorem 3.1. Let us consider an arbitrary primitive set of matrices . We are going to show now that , which implies . We will achieve this by presenting products of matrices in with the following properties:

  1. the th column of is positive for some and the length of is at most ;

  2. the th row of is positive for some and the length of is at most ;

  3. and the length of is at most .

These properties clearly imply that is positive and the length of is at most . Thus, .
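The implication can be checked numerically on a toy instance. In the sketch below the names `A`, `C` and `B` stand for the three products with the listed properties (the names and the concrete matrices are chosen here for illustration): column 1 of `A` is positive, row 2 of `B` is positive, and `C` has a 1 at position (1, 2), so every entry of the Boolean product `A * C * B` is forced to be 1.

```python
def bool_product(a, b):
    """Product over the Boolean semiring."""
    n = len(a)
    return [
        [int(any(a[i][k] and b[k][j] for k in range(n))) for j in range(n)]
        for i in range(n)
    ]


A = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]  # column 1 is positive
C = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]  # entry (1, 2) is positive
B = [[0, 0, 0], [0, 0, 0], [1, 1, 1]]  # row 2 is positive

ACB = bool_product(bool_product(A, C), B)
print(ACB)  # [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
```

Entry $(x,y)$ of the product is at least $A_{x1}C_{12}B_{2y}=1$, which is the whole argument.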

We will construct the product by utilizing a partial automaton defined as follows: a partial function is a letter of if and only if there is a matrix with the properties: for all , if is defined, , otherwise the th row has no positive entries. First, we are going to show that is carefully synchronizing, then we will use the shortest carefully synchronizing word of to obtain the matrix product .

We construct a carefully synchronizing word of with the help of an auxiliary non-deterministic automaton defined in the following manner: the set of states of is equal to ; for each matrix we add a letter such that is the adjacency matrix of , see Fig. 1. It is straightforward to verify that for every we have if and only if there is a path from to in labelled by the word . Since is primitive, there exists a positive product of matrices in . Therefore, there exists a word such that for every pair of states of there is a path between them labelled by .

It remains to show that the word can be transformed to a carefully synchronizing word of . Let us fix a state . There are paths in labelled by and for every the path goes from the state to the state . Furthermore, we can impose an additional property on these paths. Namely, if at a step paths and are in the same state, then their continuations coincide. Indeed, let and . Then we can substitute the path with the path , which still goes from to and it is labelled by . Observe now, that the paths can be easily treated as paths leading to the state in the partial automaton : by construction for each letter of and states with a property for , there exists a letter of such that for each ; due to this fact and the unique continuation property of the paths, we conclude that there exists a word over the alphabet of that labels the paths from every state to the state in . Thus, is carefully synchronizing.

Let be the shortest carefully synchronizing word of . It is easy to see that a product contains a column of ones, where is the adjacency matrix of for . Since for every there is a matrix such that , we obtain a product with the properties: has a column of ones and its length is bounded by .

The product is constructed in the same manner by applying the reasoning of the previous paragraphs to a matrix set . The resulting product has a column of ones and the length at most . The existence of the product easily follows from the fact that is strongly connected (otherwise the set of matrices is not primitive). Thus, for every pair of states there exists a path of length at most that brings to .

Now, given a carefully synchronizing -state automaton with the reset threshold equal to we will construct a primitive set of matrices such that . It will imply . Let be a row vector of 1’s, and be a row vector with the only non-zero entry equal to 1 at position . Let . The set of matrices is defined as a union , where is a set of the adjacency matrices of letters of the partial automaton . Since is carefully synchronizing, there is a product of matrices in such that the th column is positive for some . If we multiply by the matrix on the right, we obtain a positive matrix product. Thus, is primitive.

It remains to show that . Let be the shortest positive product of matrices in . Note, that contains at least one matrix from , since every product of matrices in contains at most one 1 in each row. Let , where the product doesn’t contain matrices from and . Observe that contains a positive column. Otherwise, will have a zero row due to the presence of a zero row in . Therefore, the length of the product is at least and we obtain the desired inequality. ∎
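The second half of the proof can be tried out on a toy automaton. The sketch below rests on an assumption about the added matrices: for each state $i$ it adds the matrix whose $i$th row is all ones and whose other rows are zero. Under that assumption the exponent of the resulting set is one more than the length of the shortest carefully synchronizing word. All function names and the example automaton are hypothetical.

```python
from collections import deque


def bool_product(a, b):
    n = len(a)
    return tuple(
        tuple(int(any(a[i][k] and b[k][j] for k in range(n))) for j in range(n))
        for i in range(n)
    )


def exponent(mats):
    """Shortest length of an all-ones Boolean product, or None."""
    n = len(mats[0])
    positive = tuple(tuple(1 for _ in range(n)) for _ in range(n))
    seen = set(mats)
    queue = deque((m, 1) for m in seen)
    while queue:
        product, length = queue.popleft()
        if product == positive:
            return length
        for m in mats:
            nxt = bool_product(product, m)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, length + 1))
    return None


def careful_length(n, letters):
    """Length of the shortest carefully synchronizing word, or None."""
    start = frozenset(range(n))
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        subset, length = queue.popleft()
        if len(subset) == 1:
            return length
        for delta in letters.values():
            if all(delta[q] is not None for q in subset):
                image = frozenset(delta[q] for q in subset)
                if image not in seen:
                    seen.add(image)
                    queue.append((image, length + 1))
    return None


def matrix_set(n, letters):
    """Adjacency matrices of the letters plus the assumed extra matrices."""
    adjacency = [
        tuple(tuple(int(delta[q] == p) for p in range(n)) for q in range(n))
        for delta in letters.values()
    ]
    # Assumption: extra matrix i has an all-ones row i and zero rows elsewhere.
    extra = [
        tuple(tuple(int(q == i) for _ in range(n)) for q in range(n))
        for i in range(n)
    ]
    return adjacency + extra


LETTERS = {"a": [1, 2, 0], "b": [0, 0, None], "c": [0, 1, 1]}
print(careful_length(3, LETTERS))        # 2
print(exponent(matrix_set(3, LETTERS)))  # 3
```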

Corollary.

The growth rate of is and .

Proof.

The first part of the claim follows from the result of Zs. Gazdag et al. [11, Theorem 3]: . Thus, . The second part follows from the result of Martyugin [20]. He constructed a series of carefully synchronizing automata with the length of the shortest carefully synchronizing word equal to . Thus, . ∎

2.1 Improving the upper bound on the exponent

The goal of this section is to significantly improve the bound on and, equivalently, on . We will present a new upper bound on the length of the shortest carefully synchronizing word by modifying constructions from [11].

Recall that a partition of a set is a collection of pairwise disjoint non-empty sets whose union is equal to . Given a partition of , a set is called a transversal of with respect to the partition if for each there is a unique such that . A set is a partial transversal with respect to if for each there is at most one such that .

Example.

For a partition of the sets and are transversals and and are partial transversals. The set is neither transversal, nor partial transversal. Let be an -element set and be an arbitrary partition. We will denote by the number of different transversals with respect to and by the number of different partial transversals of size . Let be the largest value of among all partitions of into parts. Similarly, let be the largest value of among all partitions of into parts. If the value of is clear from the context, then we will often write and to simplify notation. We will make use of the following bounds on and :
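Both counts have simple closed forms that are easy to verify by brute force: the number of transversals is the product of the part sizes, and the number of partial transversals of size $k$ is the $k$th elementary symmetric polynomial of the part sizes. The sketch below uses a partition chosen here for illustration (it is not the one from the example above), and all function names are hypothetical. The last helper relies on the balancing argument that a maximizing partition has part sizes differing by at most one.

```python
from itertools import combinations
from math import prod


def count_transversals(parts):
    """Number of sets containing exactly one element of every part."""
    return prod(len(p) for p in parts)


def count_partial_transversals(parts, k):
    """Number of k-element sets containing at most one element of every part."""
    e = [1] + [0] * k  # e[j] = j-th elementary symmetric polynomial so far
    for p in parts:
        for j in range(k, 0, -1):
            e[j] += e[j - 1] * len(p)
    return e[k]


def brute_force_partial(parts, k):
    """Same count by explicit enumeration, for cross-checking."""
    part_of = {x: i for i, p in enumerate(parts) for x in p}
    return sum(
        1
        for s in combinations(sorted(part_of), k)
        if len({part_of[x] for x in s}) == k
    )


def max_transversals(n, m):
    """Largest number of transversals over partitions of n elements into m parts."""
    q, r = divmod(n, m)  # r parts of size q + 1, m - r parts of size q
    return (q + 1) ** r * q ** (m - r)


PARTS = [[1, 2], [3], [4, 5]]  # illustrative partition of {1, ..., 5}
print(count_transversals(PARTS))             # 4
print(count_partial_transversals(PARTS, 2))  # 8
print(brute_force_partial(PARTS, 2))         # 8
print(max(max_transversals(6, m) for m in range(1, 7)))  # 9
```

The last line illustrates where the base $3^{n/3}$ comes from: for $n=6$ the maximum $9=3^2$ is attained with two parts of size three.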

Lemma.
  1. for .

  2. for .

  3. for .

  4. for .

Proof.
  1. It is the statement of Proposition 5 in [11].

  2. Let be a partition of into parts such that , where . If is the size of the th part of , then it is easy to see that . Observe that for any we have . Otherwise, by moving an element from the th part to the th part of the partition , we will increase the number of transversals: . Therefore, for a given range of values , every is equal to or . Let be the number of ’s equal to , then is the number of ’s equal to . Since , we derive that , and the desired bound follows.

  3. Let be a partition of into parts such that , where . If is the size of the th part of , then by the inequality of arithmetic and geometric means we have

    Let us bound the right hand side. Note that . For , we have . Therefore, the largest value of the function is achieved at . Thus, .

  4. Let be a partition of into parts such that and let be the size of the th part of .

Theorem.

Let be the maximum value of the exponent among all sets of matrices. Let be the maximum value of among all -state partial automata , then , and equivalently .

Proof.

We will show that is at most for any once for some threshold . Since is by [20], the statement will clearly follow. Due to Theorem 2 we will have the same statement for .

Let be a carefully synchronizing -state partial automaton with the set of states . We will construct a carefully synchronizing word of via the following iterative procedure:

  1. Let be a letter that is defined on every state and satisfies , where denotes the cardinality of a set. Since is carefully synchronizing, there exists at least one such letter.

  2. Choose a positive integer . Let be a word of the form

    where the words are defined iteratively for : is the shortest word such that is defined on every state and .

Note, that the word is well-defined, since at every step the set of possible words for contains carefully synchronizing words of . Our procedure further ensures that for every . Thus, is indeed a carefully synchronizing word. Our goal now is to bound the length of . The bound on presented in [11] was obtained using the presented procedure with the parameter ultimately fixed to 1. By choosing for every a sufficiently large satisfying certain conditions, we get a significant improvement. We proceed by bounding the length of intermediate words . To simplify the presentation and without loss of generality, we will further assume that is divisible by .
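The iterative procedure can be sketched as a greedy loop: at each stage, find the shortest word that is defined on the current set of states and shrinks it by at least a given amount. The code below is an illustrative simplification, not the exact procedure analysed in the proof. The parameter `step` plays the role of the tunable parameter (with `step=1` corresponding to the variant with the parameter fixed to 1), and the function names, the `None` encoding of undefined transitions, and the example automaton are assumptions made here.

```python
from collections import deque


def apply_letter(subset, delta):
    """Image of a set of states under a letter, or None if undefined somewhere."""
    if any(delta[q] is None for q in subset):
        return None
    return frozenset(delta[q] for q in subset)


def shortest_shrinking_word(subset, letters, step):
    """Shortest word defined on `subset` whose image has at most
    max(1, len(subset) - step) states, together with that image."""
    target = max(1, len(subset) - step)
    seen = {subset}
    queue = deque([(subset, "")])
    while queue:
        current, word = queue.popleft()
        if len(current) <= target:
            return word, current
        for name, delta in letters.items():
            image = apply_letter(current, delta)
            if image is not None and image not in seen:
                seen.add(image)
                queue.append((image, word + name))
    return None


def greedy_careful_word(n, letters, step=1):
    """Concatenate shortest shrinking words until one state remains."""
    current = frozenset(range(n))
    word = ""
    while len(current) > 1:
        found = shortest_shrinking_word(current, letters, step)
        if found is None:
            return None  # the automaton is not carefully synchronizing
        piece, current = found
        word += piece
    return word


def run_word(n, letters, word):
    """Image of the full state set under `word`, or None if undefined."""
    current = frozenset(range(n))
    for name in word:
        current = apply_letter(current, letters[name])
        if current is None:
            return None
    return current


LETTERS = {"a": [1, 2, 0], "b": [0, 0, None], "c": [0, 1, 1]}
w = greedy_careful_word(3, LETTERS)
print(w, run_word(3, LETTERS, w))  # cb frozenset({0})
```

The word produced this way is carefully synchronizing but need not be the shortest one in general; the proof bounds its length rather than claiming optimality.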

  1. . For these values of we uniformly put . The proof of this case is presented in [11, Proposition 7], which we repeat here for convenience. We are going to show that by induction. Note that . Let . The word gives rise to a partition of into parts as follows: a pair of states belongs to the same part of if . Observe that if , where are letters, then the set is a transversal with respect to for every . Indeed, if it is not the case for some , then the word satisfies the conditions of our procedure and it is shorter than , which is impossible. Therefore, the length of is bounded by . By lemma 2.1 we conclude . Therefore, , which completes the induction. Observe that for we have .

  2. . For each we will choose the value of at the end of the proof, independent of and . As before, the word gives rise to a partition of into parts as follows: a pair of states belongs to the same part of if . Let us fix and let , where are letters. By construction, for every the cardinality of the set is equal to . Furthermore, is a partial transversal with respect to . Indeed, if it is not the case, then and the length of is not minimal. Therefore, the length of is bounded by the number of partial transversals . Using this bound for all and part 4 of lemma 2.1 we obtain

    for some polynomial of degree . Note, that part 3 of lemma 2.1 can be rewritten as for . Applying this inequality and the inequality for times we derive:

    By choosing large enough, such that it satisfies , we ensure that every term, starting from the second one, is dominated by the last term. Thus,

    for another polynomial .

  3. . Since , for every we can choose such that . In the same way as before, we derive

    where is a polynomial. The last inequality holds for large enough .

3 Sets with no zero rows nor zero columns

3.1 Bounds on the exponent

A quadratic lower bound on the exponents of sets of matrices belonging to $\mathrm{NZ}$ was obtained in [4, Corollary 20]. A first cubic upper bound was given in [29, Theorem 1] (in the discussion of the connections with the Černý conjecture the author refers to a wrong bound, see [13] for a discussion). The proof relies on standard linear algebraic techniques. This bound was improved in [4, Corollary 18] to . The proof is based on the following fact: a bound for the reset thresholds of synchronizing automata implies a bound for the exponents of $\mathrm{NZ}$ matrix sets [4, Theorem 17]. In this subsection we will extend this result and present a quadratic bound for a special class of matrix sets.

We denote by the maximal exponent among all primitive matrix sets belonging to . In order to state a theorem for , analogous to theorem 2, we will introduce a new class of automata defined as follows. An automaton with the set of states over an alphabet belongs to if there exists a partition of into such that for every we have:

  1. for each state there exists a state and a letter such that ;

  2. for every choice of states such that for some , there exists a letter with the property for all .

In other words, for each , every state is reachable from somewhere by a letter in , and given a list of transformations of states by letters in , we can find a letter in that performs all the transformations at once.

Example. The Černý automaton equipped with an identity letter belongs to . The required partition is and . Clearly, the reset threshold is not changed with the addition of the letter .

Theorem. Let be the maximum value of the exponent among all sets of matrices of size . Let be the largest reset threshold among -state automata in , then .

Proof.

In order to show that is we will reuse the reduction from primitive sets of matrices to partial automata presented in the first part of theorem 2. Let be a primitive set of matrices. It can be reduced to partial automata and such that . Since is an matrix set, we conclude that the automata and are complete. Thus, their carefully synchronizing words are ordinary synchronizing words. Furthermore, the automata and belong to . Indeed, every letter of these automata was obtained from a matrix in or . In order to obtain the desired partition, we group a pair of letters together if and only if they were derived from the same matrix. It is a straightforward check that both conditions on the partition are satisfied.

For the other direction, given an automaton we will construct a set of matrices such that . It will imply that . Let be the partition of the letters of . A set of matrices consists of matrices , where is the adjacency matrix of the letter . Clearly, each is an matrix due to the first property of the partition and the fact that is complete. The second property ensures that , since every primitive word of can be transformed into a synchronizing word of as in the proof of theorem 2. ∎
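The two conditions on the partition of the alphabet can be tested mechanically for a given automaton and a candidate partition. The sketch below follows the informal description of the conditions: within each block, every state must be the image of some state, and every combination of per-state choices must be realised by a single letter of the block. The function name, the encoding of letters as lists of images, and the three-state Černý automaton with an identity letter `e` are assumptions made here for illustration.

```python
from math import prod


def satisfies_partition_conditions(n, letters, partition):
    """Check both conditions for every block of the alphabet partition."""
    for block in partition:
        functions = {tuple(letters[name]) for name in block}
        # targets[q] = states reachable from q by one letter of the block
        targets = [{f[q] for f in functions} for q in range(n)]
        # Condition 1: every state is reached from somewhere by the block.
        if set().union(*targets) != set(range(n)):
            return False
        # Condition 2: the block contains every combination of the choices,
        # i.e. it equals the full product of the per-state target sets.
        if len(functions) != prod(len(t) for t in targets):
            return False
    return True


# Cerny automaton on 3 states with an added identity letter 'e'.
LETTERS = {"a": [1, 2, 0], "b": [0, 1, 0], "e": [0, 1, 2]}

print(satisfies_partition_conditions(3, LETTERS, [["a"], ["b", "e"]]))    # True
print(satisfies_partition_conditions(3, LETTERS, [["a", "b"], ["e"]]))    # False
print(satisfies_partition_conditions(3, LETTERS, [["a"], ["b"], ["e"]]))  # False
```

Grouping `b` with the identity is what makes the first condition hold for that block: `b` alone never reaches state 2.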

Problem.

Improve the bounds and on the growth rate of and, equivalently, . In particular, is there a constant such that ?

This problem can be settled in different ways. On one hand, one can show that problem 3.1 is as hard as the Černý conjecture. There are not many natural problems equivalent to it and the problem of bounding is a good candidate for this purpose. On the other hand, a quadratic bound for is clearly of interest by itself.

In the remainder of this subsection we will present an upper bound on the exponent of a set of matrices from a special class. A matrix $M$ has total support if every non-zero element of $M$ lies on a positive diagonal, i.e. for every $i,j$ such that $M_{ij}>0$ there exists a permutation $\sigma$ with the following properties: $\sigma(i)=j$ and for every $k$ we have $M_{k\sigma(k)}>0$. The class of matrices with total support has received a lot of attention in the past. For example, it appears in the necessary and sufficient condition for the convergence of the classical Sinkhorn-Knopp method for matrix scaling, see [27]. Another characterization is related to the class of doubly stochastic matrices. A square matrix is called doubly stochastic if its entries are nonnegative and the sum of elements in each row and column is equal to 1. A matrix $M$ is said to have a doubly stochastic pattern if there exists a doubly stochastic matrix $D$ such that for all $i,j$ it holds that $M_{ij}>0$ if and only if $D_{ij}>0$. A famous result of Perfect and Mirsky [24] states that a matrix $M$ has total support if and only if $M$ has a doubly stochastic pattern, see [5, Theorem 9.2.1] for a modern and much more general treatment of the problem. Now we are ready to state our result.

Theorem.

If each matrix of a primitive set $\mathcal{M}$ has total support, then the exponent of $\mathcal{M}$ is at most $2n^2-5n+5$, where $n$ is the size of matrices in $\mathcal{M}$.

Proof.

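We will modify the reduction presented in the first part of theorem 2 from a primitive set of matrices to partial automata in order to prove the statement. By the aforementioned result of Perfect and Mirsky we conclude that for every matrix $M\in\mathcal{M}$ there exists a doubly stochastic matrix $D$ such that for every $i,j$ we have $M_{ij}>0$ if and only if $D_{ij}>0$. It is not hard to see that we can further assume that $D$ has only rational entries. Therefore, there exist a positive integer $s_M$ and a matrix $D_M$ with nonnegative integer elements such that the sum of entries in each row and column is equal to $s_M$, and $M_{ij}>0$ if and only if $(D_M)_{ij}>0$ for all $i,j$. We will define the automaton $\mathcal{A}$ as follows. For each matrix $M\in\mathcal{M}$ we add every function $f:\{1,\ldots,n\}\to\{1,\ldots,n\}$ with $M_{if(i)}>0$ for all $i$ as a letter of $\mathcal{A}$ multiple times. Namely, we treat $\prod_{i=1}^{n}(D_M)_{if(i)}$ copies of $f$ as different letters of the automaton $\mathcal{A}$. Let $\mathcal{A}'$ be the automaton obtained from the matrix set $\mathcal{M}^T=\{M^T : M\in\mathcal{M}\}$ in the same manner. Similarly to the proofs of theorems 2 and 3.1 we can conclude that the automata $\mathcal{A}$ and $\mathcal{A}'$ are complete and synchronizing, and moreover, the exponent of $\mathcal{M}$ is at most $\mathrm{rt}(\mathcal{A})+\mathrm{rt}(\mathcal{A}')+n-1$, where $\mathrm{rt}$ denotes the reset threshold.

Now we are going to show that the automaton $\mathcal{A}$ is Eulerian, i.e. the in-degree of each state is equal to its out-degree. Let $\Sigma_M$ be the set of letters generated from a matrix $M$ and $s_M$ be the row (and column) sum of $D_M$. It is not hard to see that by the definition of $\Sigma_M$ the size of $\Sigma_M$ is $\prod_{i=1}^{n}\sum_{j=1}^{n}(D_M)_{ij}=s_M^{n}$. Furthermore, the number of letters from $\Sigma_M$ that move a state $i$ to a state $j$ is equal to $(D_M)_{ij}\,s_M^{n-1}$. Thus, the number of incoming edges to $j$ labelled by a letter from $\Sigma_M$ is equal to $\sum_{i=1}^{n}(D_M)_{ij}\,s_M^{n-1}=s_M^{n}$, which is equal to the size of $\Sigma_M$. Since the alphabet of $\mathcal{A}$ is $\bigcup_{M\in\mathcal{M}}\Sigma_M$ and for every $M$ the numbers of incoming and outgoing edges labelled by letters of $\Sigma_M$ coincide at every state, we conclude that the automaton $\mathcal{A}$ is Eulerian. The same reasoning allows us to conclude that the automaton $\mathcal{A}'$ is also Eulerian. By the famous result of Kari about the reset thresholds of Eulerian automata [17] we have $\mathrm{rt}(\mathcal{A})\le n^2-3n+3$ and $\mathrm{rt}(\mathcal{A}')\le n^2-3n+3$. Thus, the exponent of the matrix set $\mathcal{M}$ is at most $2(n^2-3n+3)+n-1=2n^2-5n+5$. ∎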
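Whether a small matrix has total support can be checked directly from the definition by enumerating permutations. This is a brute-force sketch with factorial running time, intended only for illustration; the function name and the test matrices are chosen here and do not come from the paper.

```python
from itertools import permutations


def has_total_support(m):
    """True if every positive entry of the square matrix m lies on a positive diagonal."""
    n = len(m)
    # All permutations s whose diagonal m[k][s[k]] is positive everywhere.
    diagonals = [
        s for s in permutations(range(n)) if all(m[k][s[k]] > 0 for k in range(n))
    ]
    covered = {(k, s[k]) for s in diagonals for k in range(n)}
    return all(
        (i, j) in covered for i in range(n) for j in range(n) if m[i][j] > 0
    )


print(has_total_support([[1, 1], [1, 1]]))  # True
print(has_total_support([[0, 1], [1, 0]]))  # True
print(has_total_support([[1, 1], [0, 1]]))  # False
```

In the last matrix the entry at position (0, 1) would need the permutation that swaps the two indices, but the entry at position (1, 0) is zero, so no positive diagonal passes through it.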

3.2 Computation and approximation of the exponent

In this subsection we will focus on the problem of computing the exponent of a set of matrices. Our results rely on the following lemma:

Lemma. For every synchronizing automaton with a sink state there exists a set of matrices constructible in polynomial time such that .

Proof.

If the number of states of is equal to 1, then can be an arbitrary set of matrices with the exponent equal to 2. For example, . In all the other cases we construct the matrix set as follows. Since is synchronizing, it has a unique sink state, which we denote by . Let be the set of adjacency matrices of letters of . The matrix set consists of matrices in modified in such a way that the th row of every matrix is positive, i.e. , where stands for a row vector of 1’s, and stands for a row vector with the only non-zero entry equal to 1 at position . Since the automaton is complete, we conclude that every matrix has no zero rows. Furthermore, each column of has a positive element at position . Thus, the set of matrices belongs to . Now we will demonstrate that the set is primitive. Since the automaton is synchronizing there exists a product of matrices from with a positive column. Due to the fact that for every letter of one has , we conclude that the only positive entry in the th row of is located at position . Thus, the th column of is positive. Altering each matrix of the product to a matrix with the property we obtain a product of matrices from with the positive th column. Now multiply by any matrix to get a positive product . Thus, the set is primitive and .

It remains to show that