Weighted Automata and Recurrence Equations for Regular Languages

E. Carta-Gerardino
ecarta-gerardino@york.cuny.edu

P. Babaali
pbabaali@york.cuny.edu

Department of Mathematics and Computer Science
York College, City University of New York
94-20 Guy R. Brewer Boulevard, Jamaica, New York 11451
United States

Abstract.

Let $\mathcal{P}(\Sigma^*)$ be the semiring of languages, and consider its subset $\mathcal{P}(\Sigma)$. In this paper we define the language recognized by a weighted automaton over $\mathcal{P}(\Sigma)$ and a one-letter alphabet. Similarly, we introduce the notion of language recognition by linear recurrence equations with coefficients in $\mathcal{P}(\Sigma)$. As we will see, these two definitions coincide. We prove that the languages recognized by linear recurrence equations with coefficients in $\mathcal{P}(\Sigma)$ are precisely the regular languages, thus providing an alternative way to present these languages. A remarkable consequence of this kind of recognition is that it induces a partition of the language into its cross-sections, where the $n$th cross-section contains all the words of length $n$ in the language. Finally, we show how to use linear recurrence equations to calculate the density function of a regular language, which assigns to every $n$ the number of words of length $n$ in the language. We also show how to count the number of successful paths of a weighted automaton.

Keywords: cross-section of a language, density of a language, language recognition, recurrence equations, semirings, weighted automata

1. Introduction

Weighted automata are powerful finite-state machines in which every transition carries a weight from a semiring. These automata have been studied recently in a wide range of settings, from very applied fields, like natural language and speech-processing (see [11, 10, 9]), to more theoretical ones, like logic (see [5]). In our current research, we are particularly interested in the applications of weighted automata to formal language theory.

A finite automaton ([6, 7]) can be regarded as a particular type of weighted automaton, by letting the weights come from the Boolean semiring (i.e., the weights are either 0 or 1). Thus, the class of weighted automata contains the class of finite automata. Kleene’s Theorem states that finite automata recognize the regular languages. Hence, it is no surprise that weighted automata can be used to recognize a class of languages that contains the class of the regular languages. In particular, it can be shown that weighted automata can be used to recognize context-free languages (see [4]).

In our work we are interested in weighted automata over a one-letter alphabet. We refer to these automata as counting automata, since they can be used as counting devices, with applications in combinatorics and enumeration (see [12, 13, 14]), among others. We start by recalling the definitions of a semiring and a formal power series (Section 2). These notions provide the setting we need to associate a linear recurrence equation to each state of a counting automaton. In fact, we will see that a counting automaton over a semiring $K$ generates a system of linear recurrence equations with coefficients in $K$ (Section 3).

Given our interest in the applications of weighted automata to formal language theory, we explore counting automata, and recurrence equations, over the semiring of languages, $\mathcal{P}(\Sigma^*)$ (Section 4). Specifically, we consider its subset $\mathcal{P}(\Sigma)$. We define the language recognized by a counting automaton over $\mathcal{P}(\Sigma)$, and introduce the idea of language recognition by linear recurrence equations with coefficients in $\mathcal{P}(\Sigma)$. We will see that these two types of language recognition are equivalent. A consequence of recognizing a language this way is that we obtain a partition of the language into its cross-sections, where the $n$th cross-section contains all the words of length $n$ in the language (see [2, 1]). It is important to notice that this is the case because the weights of the automata and the coefficients of the recurrence equations come from $\mathcal{P}(\Sigma)$. We will show that the languages recognized by counting automata over $\mathcal{P}(\Sigma)$, and by linear recurrence equations with coefficients in $\mathcal{P}(\Sigma)$, are closed under certain operations. We then prove that a language recognized by a system of linear recurrence equations with coefficients in $\mathcal{P}(\Sigma)$ is regular, and that every regular language is recognized by a system of linear recurrence equations with coefficients in $\mathcal{P}(\Sigma)$. This result provides a novel way to present this important class of languages.

We conclude this paper by showing how to use linear recurrence equations to count, for every $n$, the number of words of length $n$ in a regular language (Section 5). That is, we show how to calculate the density function of the language (see [15]). We will see that the number of words of length $n$ in a language is closely related to the number of successful paths of length $n$ in the counting automaton recognizing the language. Thus, we start by counting the number of successful paths of any given length in an automaton. We do this by constructing an automaton that counts the number of successful paths of another automaton. We refer to this machine as a path-counting automaton, and we use it to construct the self-counting automaton, which we will define as a machine with the ability to count its own successful paths. Therefore, for every $n$, we have a way to (i) generate all the words of length $n$ and to (ii) count the number of words of length $n$ in a regular language.

2. Preliminaries: Semirings and Formal Power Series

A monoid is a nonempty set on which we define an associative operation, and in which there is an identity element. Using this, we can define a semiring.

Definition 2.1.

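A semiring is a set $K$, equipped with two binary operations $+$ and $\cdot$ and two distinguished elements $0$ and $1$, satisfying

  1. $(K, +)$ is a commutative monoid with identity element 0,

  2. $(K, \cdot)$ is a monoid with identity element 1,

  3. $a \cdot (b + c) = a \cdot b + a \cdot c$ and $(a + b) \cdot c = a \cdot c + b \cdot c$, for all $a, b, c \in K$,

  4. $0 \cdot a = a \cdot 0 = 0$, for all $a \in K$.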

From the definition we can see that every ring with unity is a semiring. (For example, the ring of the real numbers is a semiring.) Some nontrivial examples of semirings are the semiring of natural numbers $(\mathbb{N}, +, \cdot, 0, 1)$, the Boolean semiring $\mathbb{B} = (\{0, 1\}, \vee, \wedge, 0, 1)$, and the semiring of languages $(\mathcal{P}(\Sigma^*), \cup, \cdot, \emptyset, \{\varepsilon\})$, where $\Sigma$ is a finite alphabet, $\Sigma^*$ is the set of all words of finite length over $\Sigma$ ($\varepsilon$ denotes the empty word), and $\mathcal{P}(\Sigma^*)$ is the power set of $\Sigma^*$, known as the set of languages over $\Sigma$. It is not difficult to see that if $K_1$ and $K_2$ are two semirings, then their direct product $K_1 \times K_2$ is also a semiring.
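To make the semiring of languages concrete, the following Python sketch checks the semiring axioms on a few small finite languages. The encoding is an assumption made only for illustration: a language is a `frozenset` of strings, the empty word is the empty string, and the names `ZERO`, `ONE`, `add`, and `mul` are hypothetical.

```python
from itertools import product

# Illustrative encoding (an assumption, not taken from the paper):
# a language is a frozenset of strings and the empty word is "".
ZERO = frozenset()        # additive identity: the empty language
ONE = frozenset({""})     # multiplicative identity: the language {empty word}

def add(x, y):
    """Semiring addition: union of two languages."""
    return x | y

def mul(x, y):
    """Semiring multiplication: concatenation of two languages."""
    return frozenset(u + v for u, v in product(x, y))

a = frozenset({"a"})
b = frozenset({"b", "ab"})
c = frozenset({"", "c"})

assert add(a, ZERO) == a                                # 0 is neutral for +
assert mul(a, ONE) == a == mul(ONE, a)                  # 1 is neutral for the product
assert mul(a, ZERO) == ZERO == mul(ZERO, a)             # 0 annihilates
assert mul(a, add(b, c)) == add(mul(a, b), mul(a, c))   # left distributivity
assert mul(add(a, b), c) == add(mul(a, c), mul(b, c))   # right distributivity
```

The checks only cover the three sample languages above; they illustrate the axioms and do not prove them.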

Definition 2.2.

Let $\Sigma$ be a finite alphabet and $K$ a semiring. We can define a map $S : \Sigma^* \to K$, assigning to every word $w \in \Sigma^*$ an element $S(w) \in K$. Such a map is known as a formal power series.

We call $S(w)$ the coefficient of $w$, or the weight of $w$. Of course, these coefficients or weights have different interpretations, depending on the particular semiring $K$. The set of all formal power series is usually denoted by $K\langle\langle \Sigma^* \rangle\rangle$.

For example, if $\Sigma$ is a finite alphabet and $K = \mathbb{B}$, then notice that for $w \in \Sigma^*$, $S(w)$ is either 0 or 1 (false or true, respectively). Hence, a formal power series $S \in \mathbb{B}\langle\langle \Sigma^* \rangle\rangle$ rejects or accepts a word $w$.

We have seen that, given a semiring $K$ (and a finite alphabet $\Sigma$), we can define the set of formal power series $K\langle\langle \Sigma^* \rangle\rangle$. In turn, the set of formal power series can be made into a semiring in the following way ([8]). Addition of two series $S$ and $T$ is defined by $(S + T)(w) = S(w) + T(w)$, for all $w \in \Sigma^*$. The series $0$ defined by $0(w) = 0$ is the identity for the addition. Multiplication of two series is defined by $(S \cdot T)(w) = \sum_{uv = w} S(u) \cdot T(v)$, for all $w \in \Sigma^*$. (This operation is known as the Cauchy product of two formal power series.) The identity for the product is the series $1$ defined by $1(\varepsilon) = 1$, while $1(w) = 0$ for any other word $w$. Hence, $K\langle\langle \Sigma^* \rangle\rangle$ is a semiring, the semiring of formal power series.
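The Cauchy product can be illustrated with a small Python sketch. It assumes, purely for illustration, that the semiring is the natural numbers and that a series has finite support, so it can be stored as a dictionary from words to nonzero coefficients; the function name `cauchy_product` is hypothetical.

```python
from collections import defaultdict

def cauchy_product(s, t):
    """Cauchy product of two finitely supported series over the naturals.

    A series is a dict from words (str) to coefficients; a word that is
    missing from the dict has coefficient 0.
    """
    result = defaultdict(int)
    for u, cu in s.items():
        for v, cv in t.items():
            # each factorization w = uv contributes s(u) * t(v) to w
            result[u + v] += cu * cv
    return dict(result)

s = {"": 1, "a": 2}
t = {"a": 3, "aa": 1}
p = cauchy_product(s, t)

unit = {"": 1}   # the series that is 1 on the empty word and 0 elsewhere
```

Here the word `aa` receives contributions from two factorizations (the empty word followed by `aa`, and `a` followed by `a`), which is exactly the sum in the definition of the product.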

3. Weighted Automata and Recurrence Equations

A convenient way to represent some formal power series is by means of weighted automata ([5]).

Definition 3.1.

Let $K$ be a semiring and $\Sigma$ a finite alphabet. A weighted automaton over $K$ and $\Sigma$ is a quadruple $\mathcal{A} = (Q, \lambda, \mu, \gamma)$, where $Q$ is a finite set of states, $\lambda, \gamma : Q \to K$ are functions defining the initial weight and the final weight of a state, respectively, and if $m$ is the number of states, $\mu : \Sigma \to K^{m \times m}$ is the transition weight function. We let $\mu(a)$ be an $(m \times m)$-matrix whose $(i, j)$-entry $\mu(a)_{ij} \in K$ gives the weight of the transition $q_i \xrightarrow{a} q_j$. If $\mu(a)_{ij} = k$, we denote this by $q_i \xrightarrow{a \mid k} q_j$.

Notice that the definition of a weighted automaton does not include the notions of initial or final states. However, by appropriately defining $\lambda$ and $\gamma$, it is possible to equip a weighted automaton with initial and final states, as we will see later on.

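Consider now the path $P : q_0 \xrightarrow{a_1} q_1 \xrightarrow{a_2} \cdots \xrightarrow{a_n} q_n$ in $\mathcal{A}$. Denote the length of the path by $|P| = n$. Now define the weight of the path by

$$\mathrm{weight}(P) = \lambda(q_0) \cdot \mu(a_1)_{q_0 q_1} \cdots \mu(a_n)_{q_{n-1} q_n} \cdot \gamma(q_n).$$

Notice that this path has as label the word $w = a_1 a_2 \cdots a_n$. There might be, of course, other paths with label $w$. We will define the weight of the word $w$ in $\mathcal{A}$ to be the sum of the weights over all paths with label $w$. Denote this by $\|\mathcal{A}\|(w)$, and notice that the weight of a word in $\mathcal{A}$ is a function from $\Sigma^*$ to $K$. That is, $\|\mathcal{A}\|$ is a formal power series, so $\|\mathcal{A}\| \in K\langle\langle \Sigma^* \rangle\rangle$.

A formal power series $S$ is said to be automata recognizable if there is a weighted automaton $\mathcal{A}$ such that $S = \|\mathcal{A}\|$. In this case we say that $\mathcal{A}$ is an automata representation for $S$.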

In our research we are interested in automata over a one-letter alphabet . Hence, a typical path in such an automaton is . Given that every transition reads the letter , we eliminate it from the diagram for simplicity, thus making a typical path look like . Since , an arbitrary word has the form for some . By definition, equals the sum of over all paths with label . But since every transition reads the letter , equals the sum of over all paths of length . We call the behavior of the automaton , and define it as

Note that in this kind of automaton we are not directly accepting/rejecting words over some alphabet, but rather counting paths of length , and keeping track of the weights of such paths. The idea of using automata as counting devices has been used recently with applications in combinatorics and difference equations [12, 13, 14]. Thus, we refer to weighted automata over a one-letter alphabet as counting automata. In our work, we further explore some of the properties of counting automata.

Suppose we are interested in computing the weight of all paths of length (equivalently, all paths with label ) in an automaton , starting at a specific state . Then we would look at all paths of length starting at , compute the weight of each of these paths, and add up these weights. We call this the behavior of the state and denote it by . Then

Notice that, given any state , . Since assigns to every word an element , we can identify with a function . Hence, in a counting automaton, every state generates a function , and thus we can identify each state with the function it generates. The idea of associating a function to each state of an automaton goes back to classical automata theory (see [3, 6]).

In what follows, we will assume that the initial weights of the states of an automaton are either 0 or 1. Those with a weight of 1 will be the initial states, and those with a weight of 0 will be non-initial. Denote the set of initial states by $I$. For the moment, suppose that $I = Q$ (the set of all states), so that every state is allowed to be an initial state. In the next section we will assume that $|I| = 1$. We will allow more freedom in the way we define the final weights of the states of an automaton. Those states with a non-zero weight will be the final states, and those with a weight of 0 will be non-final. We will denote the set of final states by $F$.

Now consider an arbitrary state $q$. Let $\{q_1, \ldots, q_r\}$ be the set of states that can be reached from $q$ through paths of length 1, with transition weights $a_1, \ldots, a_r$, respectively. Suppose that the final weight of state $q$ is $k$, and that the final weights of states $q_1, \ldots, q_r$ are $k_1, \ldots, k_r$, respectively. A graphical representation of this is

Figure 1. Paths of length 1 starting at state $q$

Let $f$ be the function generated by state $q$, and let $f_1, \ldots, f_r$ be the functions generated by states $q_1, \ldots, q_r$, respectively. It can be shown (see [12]) that

$$f(0) = k, \qquad f(n+1) = a_1 f_1(n) + \cdots + a_r f_r(n). \tag{3.1}$$

Notice that Eqs. 3.1 provide a recursive definition of the function generated by each state of a counting automaton. Using this, it can be shown that a counting automaton $\mathcal{A}$ over $K$ generates a system of linear recurrence equations. And by definition, these are the only equations recognized by $\mathcal{A}$.

Theorem 3.1.

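([12]) Suppose that $\mathcal{A}$ is a counting automaton over a semiring $K$, with states $q_1, \ldots, q_m$, where $k_i$ denotes the final weight of $q_i$ and $a_{ij}$ denotes the weight of the transition from $q_i$ to $q_j$.

The functions $f_1, \ldots, f_m$ generated by $q_1, \ldots, q_m$ satisfy the following system of linear recurrence equations.

$$f_i(0) = k_i, \qquad f_i(n+1) = \sum_{j=1}^{m} a_{ij} f_j(n), \qquad 1 \le i \le m. \tag{3.2}$$

Conversely, given this system of linear recurrence equations, the counting automaton recognizing it is precisely $\mathcal{A}$.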
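As an illustration of how such a system can be evaluated, here is a minimal Python sketch that iterates the recurrence "value at $n+1$ of a state equals the sum, over its successors, of transition weight times the successor's value at $n$", starting from the final weights. The function name `behaviors`, the matrix encoding of the transition weights, and the two-state example over the natural numbers are all assumptions made for illustration.

```python
import operator

def behaviors(weights, finals, steps, add, mul, zero):
    """Return [f(0), ..., f(steps)], where f(n)[i] is the value at n of the
    function generated by state i.

    weights[i][j] is the weight of the transition i -> j (zero if absent);
    finals[i] is the final weight of state i.
    """
    m = len(finals)
    current = list(finals)              # value at 0 is the final weight
    history = [current]
    for _ in range(steps):
        nxt = []
        for i in range(m):
            total = zero
            for j in range(m):
                # accumulate weight(i -> j) * (value of state j at n)
                total = add(total, mul(weights[i][j], current[j]))
            nxt.append(total)
        current = nxt
        history.append(current)
    return history

# Hypothetical two-state counting automaton over the natural numbers.
weights = [[1, 1], [1, 0]]
finals = [1, 1]
table = behaviors(weights, finals, 4, operator.add, operator.mul, 0)
first_state = [row[0] for row in table]
```

Passing the semiring operations as parameters keeps the same loop usable for other semirings, for instance union and concatenation of languages.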

Example 3.1.

Higher-Degree Systems

Consider the following system of linear recurrence equations over an arbitrary semiring .

Note that the degree of is 4. Theorem 3.1 guarantees that we can build an automaton recognizing equations of degree 1. In order to use this result, we need to introduce additional functions that act as intermediate states. The functions we need can be defined as follows.

Using these auxiliary functions, we can rewrite the original system of equations as a system of equations of degree 1.

Now we can use Theorem 3.1 to build the automaton that recognizes the given system of linear recurrence equations.

Figure 2. Automaton recognizing the system of recurrence equations in Example 3.1

Theorem 3.1 above shows that a counting automaton over a semiring $K$ generates a system of linear recurrence equations with coefficients in $K$. In the next section we will restrict our attention to the case where $K = \mathcal{P}(\Sigma^*)$. Specifically, we will consider its subset $\mathcal{P}(\Sigma)$. Our goal is to define the language recognized by a counting automaton over $\mathcal{P}(\Sigma)$, and to define what it means for a language to be recognized by a system of linear recurrence equations with coefficients in $\mathcal{P}(\Sigma)$. One of the implications of defining languages this way is that we obtain an immediate partition of the language into its cross-sections. We will see that it is also possible to define these languages through formal grammars, and we will show that these languages are closed under union, concatenation, and the Kleene star. Using this, we will prove that the languages recognized by linear recurrence equations with coefficients in $\mathcal{P}(\Sigma)$ are precisely the regular languages.

4. Language Recognition, Language Partition, and the Cross-Sections of a Regular Language

In this section, the semiring we use for the weights of the automata and for the coefficients of the recurrence equations is $\mathcal{P}(\Sigma^*)$. In particular, we will only consider weights and coefficients in $\mathcal{P}(\Sigma)$.

Suppose that $\mathcal{A}$ is a counting automaton with weights in $\mathcal{P}(\Sigma)$, and assume that the set of states of $\mathcal{A}$ is $Q = \{q_1, \ldots, q_m\}$. As is customary when using automata for language recognition, we will assume that there is only one initial state. Without loss of generality, we will let the first state be the initial state. Therefore, the set of initial states is $I = \{q_1\}$. We now specify the final weights of the states in $Q$. Non-final states were defined as states that have a final weight of 0; in the semiring of languages, $0 = \emptyset$. Final states were defined as states with a non-zero final weight. Specifically, we will assume that the final states have a final weight of 1; in the semiring of languages, $1 = \{\varepsilon\}$. Therefore, the final weight of $q_i$ is $\{\varepsilon\}$ if $q_i$ is final, and $\emptyset$ if $q_i$ is non-final.

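Let $\mathcal{A}$ be a counting automaton with states $q_1, \ldots, q_m$

Figure 3. Counting automaton over $\mathcal{P}(\Sigma)$

where $q_1$ is the only initial state, the final weight of $q_i$ is $\{\varepsilon\}$ if $q_i$ is final and $\emptyset$ if $q_i$ is non-final, and for every $i$ and every $j$, the weight $A_{ij}$ of the transition from $q_i$ to $q_j$ belongs to $\mathcal{P}(\Sigma)$. (If $A_{ij} = \emptyset$, we can eliminate this transition from the diagram.) We know that $q_1$, the initial state, generates a function $f_1$ defined by the system

$$f_i(0) = \begin{cases} \{\varepsilon\} & \text{if } q_i \text{ is final}, \\ \emptyset & \text{otherwise}, \end{cases} \qquad f_i(n+1) = \bigcup_{j=1}^{m} A_{ij} \cdot f_j(n), \qquad 1 \le i \le m. \tag{4.1}$$

Since $A_{ij} \in \mathcal{P}(\Sigma)$, we have that for every $n$, $f_1(n)$ is a language containing words of length $n$. Denote $f_1(n)$ by $L_n$.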

Definition 4.1.

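The language recognized by a counting automaton $\mathcal{A}$ over $\mathcal{P}(\Sigma)$ is denoted by $L(\mathcal{A})$ and is defined by $L(\mathcal{A}) = \bigcup_{n \ge 0} L_n$, where $L_n$ is the language generated at $n$ by the initial state.

Thus, a word $w$ of length $n$ belongs to $L(\mathcal{A})$ if $w$ belongs to $L_n$. We will say that a word $w$ of length $n$ is recognized by the counting automaton $\mathcal{A}$ if there is a path of length $n$ starting at $q_1$ and ending at a final state whose weight contains $w$.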

Given that the languages are defined recursively, we can also define the language in the following way.

Definition 4.2.

is known as the language recognized by linear recurrence equations, since the languages are defined via linear recurrence equations.

Remark.

A consequence of recognizing a language $L$ via linear recurrence equations with coefficients in $\mathcal{P}(\Sigma)$ is that the language is automatically partitioned into sets $L_n$, where $L_n$ contains all the words of length $n$ in $L$, i.e., $L_n$ is the $n$th cross-section of the language.
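The partition into cross-sections can be observed directly by iterating the recurrence over sets of letters. The sketch below is only an illustration under stated assumptions: states are numbered from 0 with state 0 as the single initial state, transition weights are Python sets of one-letter strings, and the two-state automaton is a hypothetical example, not one taken from the paper.

```python
def cross_sections(weights, final_states, steps):
    """Yield the cross-sections of lengths 0, ..., steps of the language
    recognized from state 0 (the single initial state).

    weights[i][j] is the set of letters labelling the transition i -> j.
    """
    m = len(weights)
    # at length 0 a final state contributes the empty word, others nothing
    current = [{""} if i in final_states else set() for i in range(m)]
    yield current[0]
    for _ in range(steps):
        # union over successors j of (letters on i -> j) concatenated with
        # the words of the previous length generated by j
        current = [
            {a + w for j in range(m) for a in weights[i][j] for w in current[j]}
            for i in range(m)
        ]
        yield current[0]

# Hypothetical automaton: state 0 is initial and final,
# 0 --{a, b}--> 1 and 1 --{a}--> 0.
weights = [[set(), {"a", "b"}], [{"a"}, set()]]
sections = list(cross_sections(weights, {0}, 4))
```

For this example the odd cross-sections are empty, and the cross-section of length 4 is `{"aaaa", "aaba", "baaa", "baba"}`.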

Since the operations of the semiring of languages are union and concatenation, it is not difficult to see that we can also define $L(\mathcal{A})$ by using a grammar. We associate a nonterminal symbol $X_i$ to every state $q_i$, except that we denote $X_1$ by $S$, the start symbol, since $q_1$ is the initial state. The set of terminal symbols is $\Sigma$. Consider an arbitrary transition from $q_i$ to $q_j$ with weight $A_{ij}$, and suppose that $A_{ij}$ is non-empty. Then to this transition we associate the productions $X_i \to a X_j$, one for every $a \in A_{ij}$. Finally, for every final state $q_i$ (so its final weight is $\{\varepsilon\}$) we include a production $X_i \to \varepsilon$. Denote this grammar by $G$. Then we will also define $L(\mathcal{A})$ by $L(\mathcal{A}) = L(G)$.
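The grammar construction described above can be sketched in a few lines of Python. The encoding is an assumption for illustration only: states are numbered from 0 with state 0 initial, the nonterminal names `S` and `X<i>` are hypothetical, a production is a (head, body) pair of strings, and an empty body stands for the empty word.

```python
def grammar_productions(weights, final_states):
    """Return right-linear productions as (head, body) pairs.

    weights[i][j] is the set of letters on the transition i -> j.
    """
    def name(i):
        # state 0 is the initial state, so it gets the start symbol
        return "S" if i == 0 else f"X{i}"

    productions = []
    for i, row in enumerate(weights):
        for j, letters in enumerate(row):
            for a in sorted(letters):
                # one production per letter on a non-empty transition
                productions.append((name(i), a + name(j)))
    for i in sorted(final_states):
        # a final state may stop and produce the empty word
        productions.append((name(i), ""))
    return productions

# Hypothetical automaton: 0 --{a, b}--> 1, 1 --{a}--> 0, state 0 final.
weights = [[set(), {"a", "b"}], [{"a"}, set()]]
rules = grammar_productions(weights, {0})
```

Every body consists of at most one terminal followed by at most one nonterminal, which is why a grammar of this shape is right-linear.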

We now present the closure properties of these languages.

Theorem 4.1.

The languages recognized by counting automata over $\mathcal{P}(\Sigma)$ are closed under union, concatenation, and the Kleene star.

Proof.

Assume that and are the languages recognized by and , respectively, where and are counting automata over . We will show that we can construct counting automata over that recognize , , and .

Let be the set of states of . Assume that and that is the (nonempty) set of final states of . Then we know that is recognized by an automaton like the one in Figure 3, with weights , for . Similarly, let be the set of states of . Suppose that and that is the (nonempty) set of final states of . Then is also recognized by an automaton like the one in Figure 3, but with weights , for .

We first construct an automaton recognizing . Let and . We let if or . Otherwise, . Now we just need to specify the transitions. The new automaton will contain all the transitions in and in , plus some new ones. For every transition , , we add a new transition . Similarly, for every transition , , we add a new transition . Then recognizes . That is, .

We now construct an automaton that recognizes . Let and . We let if and . Otherwise, . As for the transitions in , we will include all the transitions in and in , as well as some other ones. Assume that . Then, for every state , , given the transitions , we add a new transition , where . Finally, if , then for every transition , , we add a transition . By construction, we conclude that .

We conclude the proof by constructing an automaton that recognizes . Let , , and . The automaton will contain all the transitions in , plus some new ones. For each state , , given the transition , we add a transition , and for every , we also add transitions (to the state ). Then . ∎

By combining Theorems 3.1 and 4.1, we obtain the following.

Corollary 4.2.

The languages recognized by systems of linear recurrence equations with coefficients in $\mathcal{P}(\Sigma)$ are closed under union, concatenation, and the Kleene star.

We are now ready to prove one of our main results.

Theorem 4.3.

A language recognized by a system of linear recurrence equations with coefficients in $\mathcal{P}(\Sigma)$ is regular. Conversely, every regular language is recognized by a system of linear recurrence equations with coefficients in $\mathcal{P}(\Sigma)$.

Proof.

Suppose that is a language recognized by a system of linear recurrence equations with coefficients in . Then we know that there is a counting automaton over such that . By definition, is generated by the grammar provided before Theorem 4.1. Note that this grammar is regular. Therefore, is a regular language.

Now we need to show that if a language is regular, then it is recognized by a system of linear recurrence equations with coefficients in . Recall that the set of regular languages over an alphabet is defined by (i) are regular, and (ii) if are regular, then are regular. It is not difficult to find systems of linear recurrence equations with coefficients in that recognize the languages in (i). Note that

(4.2)

recognizes ,

(4.3)

recognizes , and

(4.4)

recognizes , for each . Finally, suppose that and are two languages recognized by systems of linear recurrence equations. By Corollary 4.2, there are systems of linear recurrence equations recognizing . ∎

Theorem 4.3 shows that linear recurrence equations with coefficients in $\mathcal{P}(\Sigma)$ recognize, precisely, the regular languages. Hence, counting automata over $\mathcal{P}(\Sigma)$ recognize the regular languages as well. By Kleene’s Theorem, regular languages are recognized by finite automata. Thus, it is natural to translate concepts from finite automata theory to counting automata over $\mathcal{P}(\Sigma)$. For example, we can define what it means for a counting automaton over $\mathcal{P}(\Sigma)$ to be deterministic.

Definition 4.3.

Let $\mathcal{A}$ be a counting automaton over $\mathcal{P}(\Sigma)$ and let $Q = \{q_1, \ldots, q_m\}$ be its set of states. We say that $\mathcal{A}$ is deterministic if for every state $q_i$, the transition weight languages $A_{i1}, \ldots, A_{im}$ are pairwise disjoint.

Note that this is equivalent to saying that given $q_i$ and $a \in \Sigma$, $a$ belongs to at most one of the transition weight languages $A_{i1}, \ldots, A_{im}$. Hence, our definition coincides with the classical definition of a deterministic automaton (see [6]). Notice we have not discussed how to turn a (nondeterministic) counting automaton into a deterministic one. It should be clear, however, that the techniques to accomplish this from finite automata theory can be applied to counting automata over $\mathcal{P}(\Sigma)$.
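The determinism condition is easy to test mechanically. The following Python sketch assumes, for illustration, that the transition weights are stored as a matrix of sets of letters (row $i$ holding the weights of the transitions leaving state $i$); the function name `is_deterministic` and both example matrices are hypothetical.

```python
from itertools import combinations

def is_deterministic(weights):
    """True if, for every state, the letter sets on its outgoing
    transitions are pairwise disjoint."""
    return all(
        row[j].isdisjoint(row[k])
        for row in weights
        for j, k in combinations(range(len(row)), 2)
    )

# Each letter leaves each state along at most one transition.
deterministic_example = [[set(), {"a", "b"}], [{"a"}, set()]]

# The letter "a" leaves state 0 along two different transitions.
nondeterministic_example = [[{"a"}, {"a", "b"}], [set(), set()]]
```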

Example 4.1.

Recurrence Equations and Regular Languages

Let and notice that is a regular language. Assume that the alphabet is . Then is recognized by the automaton below.

Figure 4. Counting automaton recognizing

Equivalently, is recognized by the system below.

We can write as , where is the th cross-section of . If we write the system above in matrix form, as , then . Notice that , which agrees with the fact that the language has no words of length 1. Similarly, , and note that these are precisely the words of length 4 in .

5. Path-Counting and Self-Counting Automata: Calculating the Density Function of a Regular Language

In the previous section we saw how we can use weighted automata and linear recurrence equations to recognize a regular language. In particular, we saw how this type of language recognition induces a partition of the language into its cross-sections. Thus, for each $n$, we have a way of generating all the words of length $n$ in the language. In this section we will see that it is possible to output not only the words, but also the number of words, of length $n$ in the language. That is, we show how to calculate the density function of the language. Recall that a word of length $n$ is recognized by a successful path with the same length. With this connection in mind, we start by constructing an automaton that can count, for any given $n$, the number of successful paths of length $n$. In order to do this, we introduce the notion of a path-counting automaton.

Definition 5.1.

Given a counting automaton , its path-counting automaton is a counting automaton over that is able to count the number of successful paths of any given length in .

We now show how to construct a path-counting automaton . Assume that the set of states of is . Then we denote the set of states of by . We let be an initial state in if is an initial state in . Suppose that . We let if is final. Otherwise, if is non-final, we let . Finally, consider a transition in , corresponding to a transition in . We let if . Otherwise, if , we let . By Theorem 3.1, the automaton generates a system

(5.1)

From the way we defined and , a simple proof by induction shows that, if is an initial state, equals the number of successful paths of length that start at .
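A minimal Python sketch of path counting follows. It rests on one reading of the construction above, which is an assumption: a transition of the path-counting automaton has weight 1 exactly when the corresponding original transition weight is non-zero (here, a non-empty set of letters), a final state has final weight 1, and everything else is 0. The function name and the example automaton are hypothetical.

```python
def count_successful_paths(weights, final_states, n, start=0):
    """Number of paths of length n from `start` to a final state.

    weights[i][j] is the original transition weight (a set of letters);
    only whether it is empty or not matters for counting paths.
    """
    m = len(weights)
    # 0/1 matrix: 1 exactly where the original transition weight is non-empty
    adjacency = [[1 if weights[i][j] else 0 for j in range(m)] for i in range(m)]
    # paths of length 0: one for each final state, none otherwise
    counts = [1 if i in final_states else 0 for i in range(m)]
    for _ in range(n):
        counts = [
            sum(adjacency[i][j] * counts[j] for j in range(m))
            for i in range(m)
        ]
    return counts[start]

# Hypothetical automaton: 0 --{a, b}--> 1, 1 --{a}--> 0, state 0 final.
weights = [[set(), {"a", "b"}], [{"a"}, set()]]
path_counts = [count_successful_paths(weights, {0}, n) for n in range(5)]
```

In this example there is exactly one successful path of each even length and none of odd length, regardless of how many letters label each transition.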

We now define the self-counting automaton.

Definition 5.2.

Given a counting automaton and its corresponding path-counting automaton , the self-counting automaton is a counting automaton over capable of counting its own successful paths of any given length.

Essentially, is an extension of the automaton . If the set of states of is , then the set of states of is . If is an initial state, we let be an initial state, and for every final state of , we let be a final state of . Finally, notice that the weights are also ordered pairs. It is clear that if the weight of is , then the weight of is . We can think of as an extension of , where the first coordinate keeps track of the weights of the paths traversed (thus mimicking ), while the second coordinate keeps track of the number of paths traversed.
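The idea of pairing each weight with a path count can be sketched as follows. This is an illustrative Python reading of the construction, under the assumptions that the original weights are sets of letters, that pairs are combined coordinate by coordinate (as in a direct product of semirings), and that state 0 is the initial state; the function name and the example are hypothetical.

```python
def self_counting(weights, final_states, steps):
    """For lengths 0..steps, return pairs (words, paths) for state 0:
    the words carried by successful paths of that length and the number
    of such paths."""
    m = len(weights)
    current = [({""}, 1) if i in final_states else (set(), 0) for i in range(m)]
    out = [current[0]]
    for _ in range(steps):
        nxt = []
        for i in range(m):
            words, paths = set(), 0
            for j in range(m):
                letters = weights[i][j]
                # first coordinate: mimic the original automaton
                words |= {a + w for a in letters for w in current[j][0]}
                # second coordinate: count paths (weight 1 per real transition)
                paths += (1 if letters else 0) * current[j][1]
            nxt.append((words, paths))
        current = nxt
        out.append(current[0])
    return out

# Hypothetical automaton: 0 --{a, b}--> 1, 1 --{a}--> 0, state 0 final.
weights = [[set(), {"a", "b"}], [{"a"}, set()]]
pairs = self_counting(weights, {0}, 2)
```

At length 2 the sketch reports two words but a single successful path, which shows that the number of paths and the number of words need not agree.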

Remark.

It is not difficult to see that path-counting and self-counting automata can be used to count the number of successful paths of a weighted automaton over any alphabet, not just a one-letter alphabet. Since a successful path does not depend on the alphabet used, simply identify all the letters in the alphabet, say , with a letter , and use counting automata.

We now return to our discussion of formal languages. Recall that a word $w$ of length $n$ is recognized by a counting automaton $\mathcal{A}$ if there is a successful path of length $n$ in $\mathcal{A}$ whose weight contains $w$. Hence, given a counting automaton $\mathcal{A}$, we would expect the number of words of length $n$ in $L(\mathcal{A})$ to be equal to the number of successful paths of length $n$ in $\mathcal{A}$. That is, we would expect the density function to coincide with the function that counts successful paths. However, these two quantities could fail to be equal. Notice that (i) a path could recognize more than one word, and (ii) a word could be recognized by more than one path. The next theorem shows how to correctly define the function that counts the number of words of length $n$ (the density of the language) and the conditions needed on the automaton.

Theorem 5.1.

Let $\mathcal{A}$ be a deterministic counting automaton over $\mathcal{P}(\Sigma)$. Then the density function of $L(\mathcal{A})$ (the language recognized by $\mathcal{A}$) can be defined via linear recurrence equations with coefficients in $\mathbb{N}$.

Proof.

Recall that the function in Eq. 5.1 counts the number of successful paths of length in . As we pointed out, this quantity need not be equal to the number of words of length in . First, we need to account for case (i) above, since a path could recognize more than one word. This is because a transition weight may contain more than one letter from . The recurrence equations that define the density function will be just like the ones in Eq. 5.1, except that each coefficient needs to count the number of letters in the corresponding transition weight (instead of just being 0s or 1s). Given a regular language recognized by a system

(5.2)

we define the following system of linear recurrence equations

(5.3)

where denotes the cardinality of a set .

It is clear that is greater than or equal to the number of words of length in . Note that is strictly greater if there is a word recognized by more than one successful path. We claim that, since is deterministic, no word of length is recognized by more than one successful path with the same length, and thus gives precisely the number of words of length in . (Hence, determinism will take care of case (ii) above, where a word could be recognized by more than one path.)

Suppose, on the contrary, that there is a word recognized by more than one successful path. Since has only one initial state, then there is at least one state that both paths share, with the property that the transition weights leaving are not pairwise disjoint. This contradicts the fact that is deterministic. Thus, we conclude that the density function of is . ∎
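The recurrence used in the proof, where each coefficient is the number of letters in the corresponding transition weight, can be sketched in Python as follows. The matrix-of-sets encoding, the choice of state 0 as initial state, the function name `density`, and the example automaton (which is deterministic) are assumptions made for illustration.

```python
def density(weights, final_states, n):
    """Number of words of length n recognized from state 0, assuming the
    automaton is deterministic.

    weights[i][j] is the set of letters on the transition i -> j.
    """
    m = len(weights)
    # length 0: a final state recognizes exactly the empty word
    d = [1 if i in final_states else 0 for i in range(m)]
    for _ in range(n):
        # each coefficient is the cardinality of the transition weight
        d = [sum(len(weights[i][j]) * d[j] for j in range(m)) for i in range(m)]
    return d[0]

# Hypothetical deterministic automaton: 0 --{a, b}--> 1, 1 --{a}--> 0,
# with state 0 final.
weights = [[set(), {"a", "b"}], [{"a"}, set()]]
densities = [density(weights, {0}, n) for n in range(5)]
```

For a nondeterministic automaton the same loop would only give an upper bound, since a word recognized by several successful paths would be counted once per path.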

Remark.

The density function of a language $L$ is usually denoted in the literature by $\rho_L(n)$ (see [15]). Formal languages can be classified according to their density. For example, we say that a language has a constant, polynomial, or exponential density if $\rho_L(n)$ has constant, polynomial, or exponential order, respectively.

Example 5.1.

Path-Counting Automata, and a Language of Polynomial Density

Consider the regular language . It is easy to see that is recognized by the counting automaton shown below. (Note that is deterministic.)

Figure 5. Counting automaton recognizing

It is not difficult to see that the system that defines the density function is the one below.

We could write this system in matrix form, as . Then . Alternatively, we could obtain an explicit formula for in the following way. Notice that for , and hence for . Using this, we obtain that and, for , . Thus, if , . We conclude that the language contains exactly words of length , for .

Example 5.2.

Path-Counting Automata, and a Language of Exponential Density

Consider again the language from Example 4.1, recognized by the counting automaton in Figure 4. This automaton is, clearly, deterministic. It is not difficult to see that the density function is defined by the following system.

Notice that , with and . Hence, if denotes the th Fibonacci number, we have that , for . We conclude that if , the number of words of length in is , where is the golden ratio.

Acknowledgements

This work was made possible, in part, by the PSC-CUNY Grant 60116-39 40.

References

  • [1] M. Ackerman and E. Mäkinen. Three new algorithms for regular language enumeration. In COCOON ’09: Proceedings of the 15th Annual International Conference on Computing and Combinatorics. Springer-Verlag, 2009.
  • [2] M. Ackerman and J. Shallit. Efficient enumeration of regular languages. In CIAA’07: Proceedings of the 12th International Conference on Implementation and Application of Automata. Springer-Verlag, 2007.
  • [3] J. Brzozowski. Derivatives of regular expressions. Journal of the Association for Computing Machinery, 1964.
  • [4] C. Cortes and M. Mohri. Context-free recognition with weighted automata. In Proceedings of the Sixth Meeting on Mathematics of Language, MOL6, 1999.
  • [5] M. Droste and P. Gastin. Weighted automata and weighted logics. Technical report, Laboratoire de Spécification et Vérification, ENS de Cachan and CNRS, 2005.
  • [6] S. Eilenberg. Automata, Languages, and Machines. Academic Press, 1974.
  • [7] B. Khoussainov and A. Nerode. Automata Theory and its Applications. Birkhäuser, 2001.
  • [8] W. Kuich and A. Salomaa. Semirings, Automata, Languages. Springer-Verlag, 1986.
  • [9] M. Mohri. Weighted finite-state transducer algorithms an overview. Technical report, AT&T Labs—Research, 2004.
  • [10] M. Mohri, F. Pereira, and M. Riley. Weighted finite-state transducers in speech recognition. In ISCA ITRW Automatic Speech Recognition: Challenges for the Millennium, 2000.
  • [11] F. Pereira and M. Riley. Speech recognition by composition of weighted finite automata. Technical report, AT&T Labs—Research, 1996.
  • [12] J. J. M. M. Rutten. Elements of stream calculus (an extensive exercise in coinduction). Technical report, Centrum voor Wiskunde en Informatica, 2001.
  • [13] J. J. M. M. Rutten. Behavioral differential equations: a coinductive calculus of streams, automata, and power series. Technical report, Centrum voor Wiskunde en Informatica, 2002.
  • [14] J. J. M. M. Rutten. Coinductive counting with weighted automata. Technical report, Centrum voor Wiskunde en Informatica, 2002.
  • [15] S. Yu. Regular Languages. Handbook of Formal Languages, Vol. 1: Word, Language, Grammar. Springer-Verlag, 1997.