# Complexity of Left-Ideal, Suffix-Closed and Suffix-Free Regular Languages

This work was supported by the Natural Sciences and Engineering Research Council of Canada grant No. OGP0000871.

Janusz Brzozowski and Corwin Sinnamon

David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, ON, Canada N2L 3G1

{brzozo@uwaterloo.ca, sinncore@gmail.com}

###### Abstract

A language L over an alphabet \Sigma is suffix-convex if, for any words x,y,z\in\Sigma^{*}, whenever z and xyz are in L, then so is yz. Suffix-convex languages include three special cases: left-ideal, suffix-closed, and suffix-free languages. We examine complexity properties of these three special classes of suffix-convex regular languages. In particular, we study the quotient/state complexity of boolean operations, product (concatenation), star, and reversal on these languages, as well as the size of their syntactic semigroups, and the quotient complexity of their atoms.

Keywords: different alphabets, left ideal, most complex, quotient/state complexity, regular language, suffix-closed, suffix-convex, suffix-free, syntactic semigroup, transition semigroup, unrestricted complexity

## 1 Introduction

Suffix-Convex Languages Convex languages were introduced in 1973 [30], and revisited in 2009 [1]. For w,x,y\in\Sigma^{*}, if w=xy, then y is a suffix of w. A language L is suffix-convex if, whenever z and xyz are in L, then yz is also in L, for all x,y,z\in\Sigma^{*}. Suffix-convex languages include three well-known subclasses: left-ideal, suffix-closed, and suffix-free languages. A language L is a left ideal if it is non-empty and satisfies the equation L=\Sigma^{*}L. Left ideals play a role in pattern matching: If one is searching for all words ending with words in some language L in a given text (a word over \Sigma^{*}), then one is looking for words in \Sigma^{*}L. Left ideals also constitute a basic concept in semigroup theory. A language L is suffix-closed if, whenever w is in L and x is a suffix of w, then x is also in L, for all w,x\in\Sigma^{*}. The complement of every suffix-closed language not equal to \Sigma^{*} is a left ideal. A language is suffix-free if no word in the language is a suffix of another word in the language. Suffix-free languages (with the exception of \{\varepsilon\}, where \varepsilon is the empty word) are suffix codes. They have many applications, and have been studied extensively; see [3] for example.
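The three properties defined above can be tested by brute force on finite languages. The following is a minimal Python sketch under that assumption; the function names are ours, not the paper's, and a left ideal cannot be represented this way because it is always infinite.

```python
def suffixes(w):
    """All suffixes of w, including w itself and the empty word."""
    return {w[i:] for i in range(len(w) + 1)}

def is_suffix_closed(lang):
    # every suffix of every word of lang is again in lang
    return all(s in lang for w in lang for s in suffixes(w))

def is_suffix_free(lang):
    # no word of lang is a proper suffix of another word of lang
    return all(s not in lang for w in lang for s in suffixes(w) - {w})

def is_suffix_convex(lang):
    # whenever z and xyz are in lang, so is yz:
    # yz ranges over the suffixes of xyz that end with z
    for w in lang:
        for z in suffixes(w):
            if z in lang:
                for u in suffixes(w):
                    if u.endswith(z) and u not in lang:
                        return False
    return True

closed = {"", "b", "ab"}       # suffix-closed, hence suffix-convex
free = {"ab", "bb"}            # suffix-free, hence suffix-convex
neither = {"b", "aab"}         # "ab" lies between "b" and "aab" but is missing
print(is_suffix_closed(closed), is_suffix_free(free), is_suffix_convex(neither))
```

In this sketch every suffix-closed or suffix-free finite language passes the suffix-convexity test, in line with the inclusion stated in the text.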

Quotient/State Complexity If \Sigma is an alphabet and L\subseteq\Sigma^{*} is a language such that every letter of \Sigma appears in some word of L, then the (left) quotient of L by a word w\in\Sigma^{*} is w^{-1}L=\{x\mid wx\in L\}. A language is regular if and only if it has a finite number of distinct quotients. So the number of quotients of L, the quotient complexity \kappa(L) [4] of L, is a natural measure of complexity for L. A concept equivalent to quotient complexity is the state complexity [31] of L, which is the number of states in a complete minimal deterministic finite automaton (DFA) with alphabet \Sigma recognizing L. We refer to quotient/state complexity simply as complexity.
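For a finite language the quotients can be enumerated directly from the definition. The sketch below assumes a language given as a Python set of strings over single-character letters; `quotient` and `quotient_complexity` are illustrative names of ours.

```python
def quotient(lang, letter):
    """Left quotient of a finite language by one letter."""
    return frozenset(w[1:] for w in lang if w.startswith(letter))

def quotient_complexity(lang, alphabet):
    """Number of distinct quotients w^{-1}L, found by applying letters repeatedly."""
    start = frozenset(lang)          # the quotient by the empty word is L itself
    seen, todo = {start}, [start]
    while todo:
        q = todo.pop()
        for a in alphabet:
            r = quotient(q, a)
            if r not in seen:
                seen.add(r)
                todo.append(r)
    return len(seen)

# Quotients of {a, ab, ac}: the language itself, {"", b, c}, {""}, and the empty set.
print(quotient_complexity({"a", "ab", "ac"}, "abc"))
```

The empty quotient is counted here, which matches the convention that the minimal DFA is complete.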

If L_{n} is a regular language of complexity n, and \circ is a unary operation, then the complexity of \circ is the maximal value of \kappa(L_{n}^{\circ}), expressed as a function of n, as L_{n} ranges over all regular languages of complexity n. Similarly, if L^{\prime}_{m} and L_{n} are regular languages of complexities m and n respectively, and \circ is a binary operation, then the complexity of \circ is the maximal value of \kappa(L^{\prime}_{m}\circ L_{n}), expressed as a function of m and n, as L^{\prime}_{m} and L_{n} range over all regular languages of complexities m and n, respectively. The complexity of an operation is a lower bound on its time and space complexities, and has been studied extensively; see [4, 5, 23, 31].

In the past the complexity of a binary operation was studied under the assumption that the arguments of the operation are restricted to be over the same alphabet, but this restriction was removed in [6]. We study both the restricted and unrestricted cases.

Witnesses To find the complexity of a unary operation we find an upper bound on this complexity, and languages that meet this bound. We require a language L_{n} for each n\geqslant k, that is, a sequence (L_{k},L_{k+1},\dots), where k is a small integer, because the bound may not hold for small values of n. Such a sequence is a stream of languages. For a binary operation we require two streams. Sometimes the same stream can be used for both operands; in general, however, this is not the case. For example, the bound for union is mn, and it cannot be met by languages from one stream if m=n because L_{n}\cup L_{n}=L_{n} and the complexity is n instead of n^{2}.

Dialects For all common binary operations on regular languages the second stream can be a “dialect” of the first, that is it can “differ only slightly” from the first, and all the bounds can still be met [5]. Let \Sigma=\{a_{1},\dots,a_{k}\} be an alphabet ordered as shown; if L\subseteq\Sigma^{*}, we denote it by L(a_{1},\dots,a_{k}) to stress its dependence on \Sigma. A dialect of L is obtained by deleting letters of \Sigma in the words of L, or replacing them by letters of another alphabet \Sigma^{\prime}. More precisely, for a partial injective map \pi\colon\Sigma\mapsto\Sigma^{\prime}, we obtain a dialect of L by replacing each letter a\in\Sigma by \pi(a) in every word of L, or deleting the word entirely if \pi(a) is undefined. We write L(\pi(a_{1}),\dots,\pi(a_{k})) to denote the dialect of L(a_{1},\dots,a_{k}) given by \pi, and we denote undefined values of \pi by “-”. For example, if L(a,b,c)=\{a,ab,ac\}, then L(b,-,d) is the language \{b,bd\}. Undefined values at the end of the alphabet are omitted. A similar definition applies to DFAs. Our definition of dialect is more general than that of [8, 13], where only the case \Sigma^{\prime}=\Sigma was allowed.
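The dialect construction can be illustrated on the finite example from the text. This is a minimal sketch in which the partial injective map \pi is a Python dict and undefined letters are simply absent from it; the function name `dialect` is ours.

```python
def dialect(lang, pi):
    """Apply a partial injective letter map to a finite language.

    Each letter c is replaced by pi[c]; a word containing a letter on
    which pi is undefined is dropped from the language.
    """
    assert len(set(pi.values())) == len(pi), "pi must be injective"
    return {"".join(pi[c] for c in w) for w in lang if all(c in pi for c in w)}

L = {"a", "ab", "ac"}
# L(b, -, d): a is renamed to b, b is undefined, c is renamed to d.
print(sorted(dialect(L, {"a": "b", "c": "d"})))
```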

Most Complex Streams It was proved that there exists a stream (L_{3},L_{4},\dots) of regular languages which together with some dialects meets all the complexity bounds for reversal, (Kleene) star, product (concatenation), and all binary boolean operations [5, 6]. Moreover, this stream meets two additional complexity bounds: the size of the syntactic semigroup, and the complexities of atoms (discussed later). A stream of deterministic finite automata (DFAs) corresponding to a most complex language stream is a most complex DFA stream. In defining a most complex stream we try to minimize the size of the union of the alphabets of the dialects required to meet all the bounds.

Most complex streams are useful in the designs of systems dealing with regular languages and finite automata. To know the maximal sizes of automata that can be handled by the system it suffices to use the most complex stream to test all the operations.

It is known that there is a most complex stream of left ideals that meets all the bounds in both the restricted [8, 13] and unrestricted [13] cases, but a most complex suffix-free stream does not exist [15].

Our Contributions

1. We derive a new left-ideal stream from the most complex left-ideal stream and show that it meets all the complexity bounds except that for product.

2. We prove that the complement of the new left-ideal stream is a most complex suffix-closed stream.

3. We find a new suffix-free stream that meets the bounds for star, product and boolean operations; it has simpler transformations than the known stream.

4. Our witnesses for left-ideal, suffix-closed, and suffix-free streams are all derived from one most complex regular stream.

## 2 Background

Finite Automata A deterministic finite automaton (DFA) is a quintuple {\mathcal{D}}=(Q,\Sigma,\delta,q_{0},F), where Q is a finite non-empty set of states, \Sigma is a finite non-empty alphabet, \delta\colon Q\times\Sigma\to Q is the transition function, q_{0}\in Q is the initial state, and F\subseteq Q is the set of final states. We extend \delta to a function \delta\colon Q\times\Sigma^{*}\to Q as usual. A DFA {\mathcal{D}} accepts a word w\in\Sigma^{*} if {\delta}(q_{0},w)\in F. The language accepted by {\mathcal{D}} is denoted by L({\mathcal{D}}). If q is a state of {\mathcal{D}}, then the language L^{q} of q is the language accepted by the DFA (Q,\Sigma,\delta,q,F). A state is empty if its language is empty. Two states p and q of {\mathcal{D}} are equivalent if L^{p}=L^{q}. A state q is reachable if there exists w\in\Sigma^{*} such that \delta(q_{0},w)=q. A DFA is minimal if all of its states are reachable and no two states are equivalent. Usually DFAs are used to establish upper bounds on the complexity of operations and also as witnesses that meet these bounds.
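The definition of minimality (all states reachable, no two states equivalent) translates directly into a check. The sketch below assumes a DFA stored as a nested dict `delta[state][letter]`; the representation and the names `reachable`, `num_classes` and `is_minimal` are our own choices, and the equivalence classes are computed by a simple Moore-style partition refinement.

```python
def reachable(delta, start, alphabet):
    seen, todo = {start}, [start]
    while todo:
        q = todo.pop()
        for a in alphabet:
            r = delta[q][a]
            if r not in seen:
                seen.add(r)
                todo.append(r)
    return seen

def num_classes(delta, states, finals, alphabet):
    """Number of classes of equivalent states among `states`."""
    block = {q: int(q in finals) for q in states}
    while True:
        ids = {}
        # a state's signature: its own block and the blocks of its successors
        new = {q: ids.setdefault(
                   (block[q],) + tuple(block[delta[q][a]] for a in alphabet),
                   len(ids))
               for q in states}
        if len(ids) == len(set(block.values())):   # no block was split
            return len(ids)
        block = new

def is_minimal(delta, start, finals, alphabet):
    states = reachable(delta, start, alphabet)
    return (len(states) == len(delta)
            and num_classes(delta, states, finals, alphabet) == len(states))

# States 1 and 2 both accept every word, so they are equivalent.
redundant = {0: {"a": 1}, 1: {"a": 2}, 2: {"a": 1}}
merged = {0: {"a": 1}, 1: {"a": 1}}
print(is_minimal(redundant, 0, {1, 2}, "a"), is_minimal(merged, 0, {1}, "a"))
```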

A nondeterministic finite automaton (NFA) is a quintuple {\mathcal{D}}=(Q,\Sigma,\delta,I,F), where Q, \Sigma and F are defined as in a DFA, \delta\colon Q\times\Sigma\to 2^{Q} is the transition function, and I\subseteq Q is the set of initial states. An \varepsilon-NFA is an NFA in which transitions under the empty word \varepsilon are also permitted.

Transformations We use Q_{n}=\{0,\dots,n-1\} as our basic set with n elements. A transformation of Q_{n} is a mapping t\colon Q_{n}\to Q_{n}. The image of q\in Q_{n} under t is denoted by qt. If s and t are transformations of Q_{n}, their composition is denoted (qs)t when applied to q\in Q_{n}. Let {\mathcal{T}}_{Q_{n}} be the set of all n^{n} transformations of Q_{n}; then {\mathcal{T}}_{Q_{n}} is a monoid under composition.

For k\geqslant 2, a transformation (permutation) t of a set P=\{q_{0},q_{1},\ldots,q_{k-1}\}\subseteq Q is a k-cycle if q_{0}t=q_{1},q_{1}t=q_{2},\ldots,q_{k-2}t=q_{k-1},q_{k-1}t=q_{0}. This k-cycle is denoted by (q_{0},q_{1},\ldots,q_{k-1}). A 2-cycle (q_{0},q_{1}) is called a transposition. A transformation that sends all the states of P to q and acts as the identity on the remaining states is denoted by (P\to q); the transformation (Q_{n}\to p) is called constant. If P=\{p\} we write (p\to q) for (\{p\}\to q). The identity transformation is denoted by \mathbbm{1}. The notation (_{i}^{j}\;q\to q+1) denotes a transformation that sends q to q+1 for i\leqslant q\leqslant j and is the identity for the remaining states. The notation (_{i}^{j}\;q\to q-1) is defined similarly.
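The notation for transformations can be mirrored with plain tuples. In this sketch a transformation t of Q_n is a tuple with `t[q]` equal to qt; the helper names `cycle`, `collapse`, `shift_up` and `compose` are assumptions of ours, chosen only for illustration.

```python
def cycle(n, states):
    """The cycle (q0, q1, ..., q_{k-1}) on Q_n; all other states are fixed."""
    t = list(range(n))
    for i, q in enumerate(states):
        t[q] = states[(i + 1) % len(states)]
    return tuple(t)

def collapse(n, P, q):
    """The transformation (P -> q): send every state of P to q, fix the rest."""
    return tuple(q if p in P else p for p in range(n))

def shift_up(n, i, j):
    """The transformation that sends q to q+1 for i <= q <= j (requires j < n-1)."""
    return tuple(q + 1 if i <= q <= j else q for q in range(n))

def compose(s, t):
    """Apply s first and then t, matching the right-action convention (qs)t."""
    return tuple(t[s[q]] for q in range(len(s)))

a = cycle(4, [1, 2, 3])        # (1, 2, 3) on Q_4
d = collapse(4, {3}, 0)        # (3 -> 0)
print(a, d, compose(a, d))
```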

Semigroups The Myhill congruence {\mathbin{\approx_{L}}} [28] (also known as the syntactic congruence) of a language L\subseteq\Sigma^{*} is defined on \Sigma^{+} as follows: For x,y\in\Sigma^{+},x{\mathbin{\approx_{L}}}y if and only if wxz\in L\Leftrightarrow wyz\in L for all w,z\in\Sigma^{*}. The quotient set \Sigma^{+}/{\mathbin{\approx_{L}}} of equivalence classes of {\mathbin{\approx_{L}}} is a semigroup, the syntactic semigroup T_{L} of L.

Let {\mathcal{D}}=(Q_{n},\Sigma,\delta,0,F) be a DFA. For each word w\in\Sigma^{*}, the transition function induces a transformation \delta_{w} of Q_{n} by w: for all q\in Q_{n}, q\delta_{w}=\delta(q,w). The set T_{{\mathcal{D}}} of all such transformations by non-empty words is the transition semigroup of {\mathcal{D}} under composition [29]. Sometimes we use the word w to denote the transformation it induces; thus we write qw instead of q\delta_{w}. We extend the notation to sets: if P\subseteq Q_{n}, then Pw=\{pw\mid p\in P\}. We also write P\stackrel{w}{\longrightarrow}Pw to indicate that the image of P under w is Pw.

If {\mathcal{D}} is a minimal DFA of L, then T_{{\mathcal{D}}} is isomorphic to the syntactic semigroup T_{L} of L [29], and we represent elements of T_{L} by transformations in T_{{\mathcal{D}}}. The size of this semigroup has been used as a measure of complexity [5, 20, 24, 27].

Atoms Atoms are defined by a left congruence, where two words x and y are equivalent if ux\in L if and only if uy\in L for all u\in\Sigma^{*}. Thus x and y are equivalent if x\in u^{-1}L if and only if y\in u^{-1}L. An equivalence class of this relation is an atom of L [19]. Thus an atom is a non-empty intersection of complemented and uncomplemented quotients of L. The number of atoms and their complexities were suggested as possible measures of complexity of regular languages [5], because all the quotients of a language, and also the quotients of atoms, are always unions of atoms  [18, 19, 25].

Our Key Witness The stream ({\mathcal{D}}_{n}(a,b,c)\mid n\geqslant 3) of Definition 1 and Figure 1 was introduced in [5] and studied further in [12]. It will be used as a component in all the classes of languages examined in this paper. It was shown in [5, 12] that this stream together with some dialects is most complex.

###### Definition 1

For n\geqslant 3, let {\mathcal{D}}_{n}={\mathcal{D}}_{n}(a,b,c)=(Q_{n},\Sigma,\delta_{n},0,\{n-1\}), where \Sigma=\{a,b,c\}, and \delta_{n} is defined by a\colon(0,\dots,n-1), b\colon(0,1), c\colon(n-1\rightarrow 0).
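For small n the transition semigroup of {\mathcal{D}}_{n}(a,b,c) can be enumerated by closing the three generators under composition. The sketch below is a brute-force check of ours, not a construction from the paper; it assumes the tuple representation `t[q] = qt` and should find all n^{n} transformations of Q_{n}, since a cycle and a transposition generate all permutations and one additional transformation of rank n-1 then generates every transformation.

```python
def compose(s, t):
    """Apply s first, then t."""
    return tuple(t[s[q]] for q in range(len(s)))

def semigroup(generators):
    """All transformations induced by non-empty words over the generators."""
    seen = set(generators)
    todo = list(seen)
    while todo:
        s = todo.pop()
        for g in generators:
            t = compose(s, g)
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return seen

def witness_generators(n):
    a = tuple((q + 1) % n for q in range(n))                # a: (0, ..., n-1)
    b = (1, 0) + tuple(range(2, n))                         # b: (0, 1)
    c = tuple(0 if q == n - 1 else q for q in range(n))     # c: (n-1 -> 0)
    return [a, b, c]

for n in (3, 4):
    print(n, len(semigroup(witness_generators(n))), n ** n)
```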

## 3 Left Ideals

The following stream was studied in [20] and also in [7, 8, 14]. This stream is most complex when the two alphabets are the same in binary operations [8]. It is also most complex for unrestricted operations [13].

###### Definition 2

For n\geqslant 4, let {\mathcal{D}}_{n}={\mathcal{D}}_{n}(a,b,c,d,e)=(Q_{n},\Sigma,\delta_{n},0,\{n-1\}), where \Sigma=\{a,b,c,d,e\}, and \delta_{n} is defined by transformations a\colon(1,\dots,n-1), b\colon(1,2), {c\colon(n-1\to 1)}, {d\colon(n-1\to 0)}, and e\colon(Q_{n}\to 1). See Figure 2. Let L_{n}=L_{n}(a,b,c,d,e) be the language accepted by {\mathcal{D}}_{n}.

###### Theorem 3.1 (Most Complex Left Ideals [8, 13])

For each n\geqslant 4, the DFA of Definition 2 is minimal. The stream (L_{n}(a,b,c,d,e)\mid n\geqslant 4) with some dialect streams is most complex in the class of regular left ideals.

1. The syntactic semigroup of L_{n}(a,b,c,d,e) has cardinality n^{n-1}+n-1.

2. Each quotient of L_{n}(a,-,-,d,e) has complexity n.

3. The reverse of L_{n}(a,-,c,d,e) has complexity 2^{n-1}+1, and L_{n}(a,-,c,d,e) has 2^{n-1}+1 atoms.

4. For each atom A_{S} of L_{n}(a,b,c,d,e), the complexity \kappa(A_{S}) satisfies:

   \kappa(A_{S})=\begin{cases}n,&\text{if $S=Q_{n}$;}\\ 2^{n-1},&\text{if $S=\emptyset$;}\\ 1+\sum_{x=1}^{|S|}\sum_{y=1}^{n-|S|}\binom{n-1}{x}\binom{n-x-1}{y-1},&\text{otherwise.}\end{cases}

5. The star of L_{n}(a,-,-,-,e) has complexity n+1.

6. (a) Restricted product: \kappa(L^{\prime}_{m}(a,-,-,-,e)L_{n}(a,-,-,-,e))=m+n-1.

   (b) Unrestricted product: \kappa(L^{\prime}_{m}(a,b,-,d,e)L_{n}(a,d,-,c,e))=mn+m+n.

7. (a) Restricted complexity: \kappa(L^{\prime}_{m}(a,-,c,-,e)\circ L_{n}(a,-,e,-,c))=mn.

   (b) Unrestricted complexity: \kappa(L^{\prime}_{m}(a,b,-,d,e)\circ L_{n}(a,d,-,c,e))=(m+1)(n+1) if \circ\in\{\cup,\oplus\}, mn+m if \circ=\setminus, and mn if \circ=\cap.

In both cases these bounds are the same as those for regular languages.
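Item 1 of the theorem can be checked by brute force for small n. The following sketch enumerates the transition semigroup of the DFA of Definition 2 from its five generators and compares its size with n^{n-1}+n-1; the tuple encoding `t[q] = qt` and the function names are our own assumptions. Since the DFA is minimal, its transition semigroup has the same size as the syntactic semigroup.

```python
def compose(s, t):
    """Apply s first, then t."""
    return tuple(t[s[q]] for q in range(len(s)))

def semigroup(generators):
    """All transformations induced by non-empty words over the generators."""
    seen = set(generators)
    todo = list(seen)
    while todo:
        s = todo.pop()
        for g in generators:
            t = compose(s, g)
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return seen

def left_ideal_generators(n):
    a = tuple([0] + list(range(2, n)) + [1])                # a: (1, ..., n-1)
    b = (0, 2, 1) + tuple(range(3, n))                      # b: (1, 2)
    c = tuple(1 if q == n - 1 else q for q in range(n))     # c: (n-1 -> 1)
    d = tuple(0 if q == n - 1 else q for q in range(n))     # d: (n-1 -> 0)
    e = (1,) * n                                            # e: (Q_n -> 1)
    return [a, b, c, d, e]

for n in (4, 5):
    print(n, len(semigroup(left_ideal_generators(n))), n ** (n - 1) + n - 1)
```

The count splits as expected: words without e fix state 0, which accounts for at most n^{n-1} transformations, and words containing e induce constant transformations, of which n-1 do not fix 0.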

We now define a new left-ideal witness similar to the witness in Definition 2.

###### Definition 3

For n\geqslant 4, let {\mathcal{E}}_{n}={\mathcal{E}}_{n}(a,b,c,d,e)=(Q_{n},\Sigma,\delta_{n},0,\{1,\dots,n-1\}), where \Sigma and the transformations induced by its letters are as in {\mathcal{D}}_{n} of Definition 2. Let M_{n}=M_{n}(a,b,c,d,e) be the language accepted by {\mathcal{E}}_{n}.

###### Theorem 3.2 (Nearly Most Complex Left Ideals)

For each n\geqslant 4, the DFA of Definition 3 is minimal and its language M_{n}(a,b,c,d,e) is a left ideal of complexity n. The stream (M_{n}(a,b,c,d,e)\mid n\geqslant 4) with some dialect streams meets all the complexity bounds for left ideals, except those for product.
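One of the bounds covered by this theorem, the reversal bound 2^{n-1}+1, can be checked numerically for small n. The sketch below determinizes the reversal of {\mathcal{E}}_{n}(a,-,-,d,e) and counts the reachable subsets. It relies on the known fact that determinizing the reversal of a DFA in which every state is reachable yields a minimal DFA, so no separate minimization step is performed; the encoding and the function names are ours.

```python
def left_ideal_letters(n):
    """Transformations of a, d, e in E_n, stored as tuples with t[q] = qt."""
    a = tuple([0] + list(range(2, n)) + [1])                # a: (1, ..., n-1)
    d = tuple(0 if q == n - 1 else q for q in range(n))     # d: (n-1 -> 0)
    e = (1,) * n                                            # e: (Q_n -> 1)
    return {"a": a, "d": d, "e": e}

def reverse_complexity(letters, n, finals):
    """Number of subsets reachable in the subset construction of the reversed DFA."""
    start = frozenset(finals)          # final states become the initial set
    seen, todo = {start}, [start]
    while todo:
        S = todo.pop()
        for t in letters.values():
            # following reversed transitions = taking the preimage of S
            R = frozenset(q for q in range(n) if t[q] in S)
            if R not in seen:
                seen.add(R)
                todo.append(R)
    return len(seen)

for n in (4, 5):
    finals = range(1, n)               # final states of E_n are 1, ..., n-1
    print(n, reverse_complexity(left_ideal_letters(n), n, finals), 2 ** (n - 1) + 1)
```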

###### Proof

It is easily verified that {\mathcal{E}}_{n}(a,-,-,d,e) is minimal; hence M_{n}(a,b,c,d,e) has complexity n. M_{n} is a left ideal because, for each letter \ell of \Sigma, and each word w\in\Sigma^{*}, w\in M_{n} implies \ell w\in M_{n}. We prove all the claims of Theorem 3.1 except the claims in Item 6.

1. Semigroup. The transition semigroup is independent of the set of final states; hence it has the size of the DFA of the most complex left ideal.

2. Quotients. Obvious.

3. Reversal. The upper bound of 2^{n-1}+1 was proved in [9], and it was shown in [19] that the number of atoms is the same as the complexity of the reverse. Applying the standard NFA construction for reversal, we reverse every transition in DFA {\mathcal{E}}_{n} and interchange the final and initial states, yielding the NFA in Figure 3, where the initial states (unmarked) are Q_{n}\setminus\{0\}.

We perform the subset construction. Set Q_{n}\setminus\{0\} is initial. From \{q_{1},\dots,q_{k}\}, 1\leqslant q_{1}\leqslant q_{k}, we delete q_{i}, q_{1}\leqslant q_{i}\leqslant q_{k}\leqslant n-1, by applying a^{q_{i}}da^{n-1-q_{i}}. Thus all 2^{n-1} subsets of Q_{n}\setminus\{0\} can be reached, and Q_{n} is reached from the state \{1\} by e. For any distinct S,T\subseteq Q_{n} with q\in S\setminus T, either q=0, in which case S is final and T is non-final, or Sa^{q-1}e=Q_{n} and Ta^{q-1}e=\emptyset. Hence all 2^{n-1}+1 states are pairwise distinguishable.

4. Atoms. The upper bounds in Theorem 3.1 for left ideals were derived in [7]. The proof of [7] that these bounds are met applies also to our witness M_{n}.

5. Star. The upper bound n+1 was proved in [9]. To construct an NFA recognizing (M_{n}(a,-,-,d,e))^{*} we add a new initial state 0^{\prime} which is also final and has the same transitions as the former initial state 0. We then add an \varepsilon-transition from each final state of {\mathcal{E}}_{n}(a,-,-,d,e) to the initial state 0^{\prime}. The language recognized by the new NFA {\mathcal{N}} is (M_{n}(a,-,-,d,e))^{*}. The final state \{0^{\prime}\} in the subset construction for {\mathcal{N}} is distinguishable from every other final state, since it rejects a, whereas other final states accept it.

6. Product. Not applicable.

7. Boolean Operations.

   (a) Restricted complexity: The upper bound of mn is the same as for regular languages. We show that M^{\prime}_{m}(a,b,-,d,e)\circ M_{n}(a,e,-,d,b) has complexity mn. In the standard construction for boolean operations, we consider the direct product of \mathcal{E}^{\prime}_{m}(a,b,-,d,e) and \mathcal{E}_{n}(a,e,-,d,b). The set of final states of the direct product varies depending on the operation \circ\in\{\cup,\oplus,\setminus,\cap\}.

We first check that all mn states are reachable in the direct product. State (0^{\prime},0) is initial and (p^{\prime},0), 1\leqslant p\leqslant m-1, is reached by ea^{p-1}. If 1\leqslant q\leqslant p then (p^{\prime},q) is reached from ((p-q+1)^{\prime},0) by b^{2}a^{q-1}. Similarly (0^{\prime},q) is reached by ba^{q-1}, and if 1\leqslant p\leqslant q then (p^{\prime},q) is reached from (0^{\prime},q-p+1) by e^{2}a^{p-1}. Hence all mn states are reachable.

Let R=\{(0^{\prime},q)\mid q\in Q_{n}\setminus\{0\}\}, C=\{(p^{\prime},0)\mid p^{\prime}\in Q^{\prime}_{m}\setminus\{0^{\prime}\}\}, and S=\{(p^{\prime},q)\mid p^{\prime}\in Q^{\prime}_{m}\setminus\{0^{\prime}\},q\in Q_{n}\setminus\{0\}\}. States of S are pairwise distinguished with respect to the set R\cup C\cup\{(0^{\prime},0)\} by words in a^{*}bd if they differ in the first coordinate, or by words in a^{*}ed if they differ in the second coordinate. We consider each operation separately to show that the states of R\cup C\cup\{(0^{\prime},0)\} are pairwise distinguishable and distinguishable from the states of S, with respect to the final states.

Union All states are final except (0^{\prime},0). States of R are distinguished by words in a^{*}d, as are states of C. States of R are distinguishable from those of C\cup S because every state of C\cup S accepts either ba^{n-2}d or aba^{n-2}d, while states of R are sent to (0^{\prime},0) by any word in a^{*}ba^{n-2}d. States of C are similarly distinguishable from those of R\cup S.

Symmetric Difference The final states are those in R\cup C. The argument is the same as union, except (0^{\prime},0) is distinguished from states of S by e.

Difference The final states are those in C. States of C are distinguished by words in a^{*}d, and states of R are distinguished by words in a^{*}de. State (0^{\prime},0) is distinguished from states of R by e. States of R\cup\{(0^{\prime},0)\} are distinguishable from states of S because every state of S accepts a word in \{a,b,d\}^{*}, while those of R\cup\{(0^{\prime},0)\} accept only words with e.

Intersection The final states are those in S. States of R are distinguished by words in a^{*}d, as are states of C. States of R are distinguished from states of C by e. State (0^{\prime},0) is distinguished from states of R by e and from states of C by b.

Hence M^{\prime}_{m}(a,b,-,d,e)\circ M_{n}(a,e,-,d,b) has complexity mn for each \circ\in\{\cup,\oplus,\setminus,\cap\}.

   (b) Unrestricted complexity: To produce a DFA recognizing M^{\prime}_{m}(a,b,c,d,e)\circ M_{n}(a,e,f,d,b), where \circ is a boolean operation, we first add an empty state \emptyset^{\prime} to \mathcal{E}^{\prime}_{m}(a,b,c,d,e), and send all the transitions from any state of Q^{\prime}_{m} under f to \emptyset^{\prime}. Similarly add an empty state \emptyset to \mathcal{E}_{n}(a,e,f,d,b) and send all the transitions from any state of Q_{n} under c to \emptyset. Now the DFAs are over the combined alphabet \{a,b,c,d,e,f\} and we take the direct product as before; the direct product for union is illustrated in Figure 5.

By the restricted case all the states of Q^{\prime}_{m}\times Q_{n} are reachable and distinguishable using words in \{a,b,d,e\}^{*}. Let R_{\emptyset^{\prime}}=\{(\emptyset^{\prime},q)\mid q\in Q_{n}\} and C_{\emptyset}=\{(p^{\prime},\emptyset)\mid p^{\prime}\in Q^{\prime}_{m}\}. States of R_{\emptyset^{\prime}}\cup C_{\emptyset}\cup\{(\emptyset^{\prime},\emptyset)\} are easily seen to be reachable using c and f in addition to a, b, d, and e. We check that the states of R_{\emptyset^{\prime}}\cup C_{\emptyset}\cup\{(\emptyset^{\prime},\emptyset)\} are pairwise distinguishable and distinguishable from the states of Q^{\prime}_{m}\times Q_{n}.

Union The final states of R_{\emptyset^{\prime}} are distinguished by words in a^{*}d, and those of C_{\emptyset} are similarly distinguishable. All states except \{(\emptyset^{\prime},\emptyset)\} are non-empty since each accepts a word in \{a,b,d,e\}^{*}. States of R_{\emptyset^{\prime}}\cup\{(\emptyset^{\prime},\emptyset)\} are distinguishable from all other states since every other state accepts ce. Similarly, states of C_{\emptyset}\cup\{(\emptyset^{\prime},\emptyset)\} are distinguishable from all other states since every other state accepts fb. Hence all (m+1)(n+1) states are pairwise distinguishable.

Symmetric Difference Same as union.

Difference States of R_{\emptyset^{\prime}}\cup\{(\emptyset^{\prime},\emptyset)\} are empty and therefore equivalent. However, since the alphabet of M^{\prime}_{m}(a,b,c,d,e)\setminus M_{n}(a,e,f,d,b) is \{a,b,c,d,e\} we can omit f and delete the states of R_{\emptyset^{\prime}}\cup\{(\emptyset^{\prime},\emptyset)\}, and be left with a DFA over \{a,b,c,d,e\} that recognizes M^{\prime}_{m}(a,b,c,d,e)\setminus M_{n}(a,e,f,d,b). States of C_{\emptyset} are distinguished by words in a^{*}d, and states of Q^{\prime}_{m}\times Q_{n} are distinguished from states of C_{\emptyset} by be. Hence the mn+m remaining states are pairwise distinguishable.

Intersection States of R_{\emptyset^{\prime}}\cup C_{\emptyset}\cup\{(\emptyset^{\prime},\emptyset)\} are empty and therefore equivalent. However, since the alphabet of M^{\prime}_{m}(a,b,c,d,e)\cap M_{n}(a,e,f,d,b) is \{a,b,d,e\}, we can omit c and f and delete the states of R_{\emptyset^{\prime}}\cup C_{\emptyset}\cup\{(\emptyset^{\prime},\emptyset)\}, and be left with a DFA over \{a,b,d,e\} that recognizes M^{\prime}_{m}(a,b,c,d,e)\cap M_{n}(a,e,f,d,b). By the restricted case, all mn states are pairwise distinguishable. ∎
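The restricted bound mn of Item 7(a) can be verified mechanically for small m and n. The sketch below builds the direct product of \mathcal{E}^{\prime}_{m}(a,b,-,d,e) and \mathcal{E}_{n}(a,e,-,d,b), collects the reachable states, and counts the classes of equivalent states by partition refinement. The helper names, the role-string encoding of dialects, and the tuple representation are assumptions made for this illustration.

```python
import operator
from itertools import product

def ideal_dfa(n, roles):
    """Letter -> transformation map for E_n; the transformations of Definition 3
    are assigned, in order, to the letters in `roles` ('-' means unused)."""
    cyc = tuple([0] + list(range(2, n)) + [1])              # (1, ..., n-1)
    swap = (0, 2, 1) + tuple(range(3, n))                   # (1, 2)
    to1 = tuple(1 if q == n - 1 else q for q in range(n))   # (n-1 -> 1)
    to0 = tuple(0 if q == n - 1 else q for q in range(n))   # (n-1 -> 0)
    const = (1,) * n                                        # (Q_n -> 1)
    return {x: t for x, t in zip(roles, (cyc, swap, to1, to0, const)) if x != "-"}

def boolean_complexity(m, n, op):
    A, B = ideal_dfa(m, "ab-de"), ideal_dfa(n, "ae-db")
    letters = sorted(A)                                     # common alphabet a, b, d, e
    step = {(p, q): [(A[x][p], B[x][q]) for x in letters]
            for p, q in product(range(m), range(n))}
    seen, todo = {(0, 0)}, [(0, 0)]
    while todo:                                             # reachable product states
        for r in step[todo.pop()]:
            if r not in seen:
                seen.add(r)
                todo.append(r)
    # a state of E_n is final iff it is non-zero
    block = {s: int(op(s[0] != 0, s[1] != 0)) for s in seen}
    while True:                                             # Moore-style refinement
        ids = {}
        new = {s: ids.setdefault(
                   (block[s],) + tuple(block[r] for r in step[s]), len(ids))
               for s in seen}
        if len(ids) == len(set(block.values())):
            return len(ids)
        block = new

OPS = {
    "union": operator.or_,
    "symmetric difference": operator.xor,
    "difference": lambda x, y: x and not y,
    "intersection": operator.and_,
}
for name, op in OPS.items():
    print(name, boolean_complexity(4, 4, op))
```

Swapping the roles of b and e in the second operand is what makes the two coordinates of the product move independently; with identical dialects only the diagonal states would be reachable.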

## 4 Suffix-Closed Languages

The complexity of suffix-closed languages was studied in [10] in the restricted case, and the syntactic semigroup of these languages, in [14, 17, 20]; however, most complex suffix-closed languages have not been examined.

###### Definition 4

For n\geqslant 4, let {\mathcal{D}}_{n}={\mathcal{D}}_{n}(a,b,c,d,e)=(Q_{n},\Sigma,\delta_{n},0,\{0\}), where \Sigma=\{a,b,c,d,e\}, and \delta_{n} is defined by transformations a\colon(1,\dots,n-1), b\colon(1,2), {c\colon(n-1\to 1)}, {d\colon(n-1\to 0)}, e\colon(Q_{n}\to 1). Let L_{n}=L_{n}(a,b,c,d,e) be the language accepted by {\mathcal{D}}_{n}; this language is the complement of the left ideal of Definition 3. The structure of {\mathcal{D}}_{n}(a,b,c,d,e) is shown in Figure 6.

###### Theorem 4.1 (Most Complex Suffix-Closed Languages)

For each n\geqslant 4, the DFA of Definition 4 is minimal and its language L_{n}(a,b,c,d,e) is suffix-closed and has complexity n. The stream (L_{m}(a,b,c,d,e)\mid m\geqslant 4) with some dialect streams is most complex in the class of suffix-closed languages.

1. The syntactic semigroup of L_{n}(a,b,c,d,e) has cardinality n^{n-1}+n-1. Moreover, fewer than five inputs do not suffice to meet this bound.

2. All quotients of L_{n}(a,-,-,d,e) have complexity n.

3. The reverse of L_{n}(a,-,-,d,e) has complexity 2^{n-1}+1, and L_{n}(a,-,-,d,e) has 2^{n-1}+1 atoms.

4. For each atom A_{S} of L_{n}(a,b,c,d,e), the complexity \kappa(A_{S}) satisfies:

   \kappa(A_{S})=\begin{cases}n,&\text{if $S=\emptyset$;}\\ 2^{n-1},&\text{if $S=Q_{n}$;}\\ 1+\sum_{x=1}^{|S|}\sum_{y=1}^{n-|S|}\binom{n-1}{y}\binom{n-y-1}{x-1},&\text{if $\{0\}\subseteq S\subsetneq Q_{n}$.}\end{cases}

5. The star of L_{n}(a,-,-,d,e) has complexity n.

6. (a) Restricted complexity: \kappa(L^{\prime}_{m}(a,b,-,d,e)L_{n}(a,e,-,d,b))=mn-n+1.

   (b) Unrestricted complexity: \kappa(L^{\prime}_{m}(a,b,c,d,e)L_{n}(a,e,f,d,b))=mn+m+1.

7. (a) Restricted complexity: \kappa(L^{\prime}_{m}(a,b,-,d,e)\circ L_{n}(a,e,-,d,b))=mn for \circ\in\{\cup,\oplus,\cap,\setminus\}.

   (b) Unrestricted complexity: \kappa(L^{\prime}_{m}(a,b,c,d,e)\circ L_{n}(a,e,f,d,b))=(m+1)(n+1) if \circ\in\{\cup,\oplus\}, it is mn+m if \circ=\setminus, and mn if \circ=\cap.

###### Proof

DFA {\mathcal{D}}_{n}(a,-,-,d,e) is minimal and L_{n}(a,b,c,d,e) is suffix-closed since its complement is a left ideal.

1. Semigroup. The transition semigroup is independent of the set of final states; hence its size is the same as that of the transition semigroup of the DFA {\mathcal{E}}_{n} of the left ideal M_{n}.

2. Quotients. Obvious.

3. Reversal. This follows from the results for M_{n}, since complementation commutes with reversal.

4. Atoms. We first establish an upper bound on the complexity of the atoms, using the corresponding bounds for left ideals. Let L be a suffix-closed language with quotients K_{0},\dots,K_{n-1}; then \overline{L} is a left ideal with quotients \overline{K_{0}},\dots,\overline{K_{n-1}}. For S\subseteq Q_{n}, the atom of L corresponding to S is A_{S}=\bigcap_{i\in S}K_{i}\cap\bigcap_{i\in\overline{S}}\overline{K_{i}}. This can be rewritten as \bigcap_{i\in\overline{S}}\overline{K_{i}}\cap\bigcap_{i\in\overline{\overline{S}}}\overline{\overline{K_{i}}}, which is the atom of \overline{L} corresponding to \overline{S}; hence the sets of atoms of L and \overline{L} are the same. The upper bounds now follow from those for left ideals as given in Theorem 3.1, which were derived in [7].

5. Star. The upper bound n was proved in [10]. To construct an NFA recognizing (L_{n}(a,-,-,d,e))^{*} we add an \varepsilon-transition from the final state of {\mathcal{D}}_{n}(a,-,-,d,e) to the initial state 0; however in this case the \varepsilon-transition is a loop at 0, which does not affect the language recognized by the automaton. Thus (L_{n}(a,-,-,d,e))^{*}=L_{n}(a,-,-,d,e) and its complexity is n.

6. Product.

   (a) Restricted complexity: The upper bound mn-n+1 was derived in [10]. The NFA for the product L^{\prime}_{m}(a,b,-,d,e)L_{n}(a,e,-,d,b) is shown in Figure 7 for m=n=4.

As b\colon(1^{\prime},2^{\prime})(Q_{n}\to 1) is the only letter which does not fix 0 and since b maps Q_{n} to 1, the reachable sets in the subset construction are of the form \{p^{\prime},q\} or \{p^{\prime},0,q\} for p^{\prime}\in Q^{\prime}_{m} and q\in Q_{n}. However we cannot reach sets \{0^{\prime},q\} where q\not=0, due to the \varepsilon-transition from 0^{\prime} to 0. Furthermore, the states \{\{p^{\prime},0,q\}\mid q\in Q_{n}\} are equivalent as any word that maps q to 0 also fixes 0. Hence we consider only sets \{p^{\prime},q\} for p^{\prime}\in Q^{\prime}_{m}\setminus\{0^{\prime}\} and q\in Q_{n}, and the initial state \{0^{\prime},0\}; note that there are mn-n+1 such sets.

For p^{\prime}\in Q^{\prime}_{m}\setminus\{0^{\prime}\} set \{p^{\prime},0\} is reached by ea^{p-1}, and \{p^{\prime},q\} is reached from \{r^{\prime},0\} by ba^{q-1} for some r^{\prime}\in Q^{\prime}_{m}\setminus\{0^{\prime}\}, since ba^{q-1} induces a permutation on Q^{\prime}_{m}. Thus all mn-n+1 states are reachable.

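Among these states, the final ones are those containing 0, namely \{0^{\prime},0\} and \{p^{\prime},0\} for p^{\prime}\in Q^{\prime}_{m}\setminus\{0^{\prime}\}. State \{0^{\prime},0\} is distinguished from each \{p^{\prime},0\} by b, because \{0^{\prime},0\}b still contains 0^{\prime} and hence 0, whereas \{p^{\prime},0\}b=\{p^{\prime}b,1\} is non-final; the same letter sends distinct final states \{p^{\prime}_{1},0\} and \{p^{\prime}_{2},0\} to non-final states that differ in the first coordinate. Non-final states \{p^{\prime}_{1},q_{1}\} and \{p^{\prime}_{2},q_{2}\} are distinguished by a word in ea^{*}d if q_{1}\not=q_{2}, or by a word in ba^{*}d if p^{\prime}_{1}\not=p^{\prime}_{2}. Thus all the states are pairwise distinguishable and the product has complexity mn-n+1.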

   (b) Unrestricted complexity: The NFA for L^{\prime}_{m}(a,b,c,d,e)L_{n}(a,e,f,d,b) is the same as Figure 7 except for the additional transformations c\colon((m-1)^{\prime}\to 1^{\prime})(Q_{n}\to\emptyset) and f\colon(n-1\to 1)(Q^{\prime}_{m}\to\emptyset). In addition to the mn-n+1 reachable and distinguishable states of the restricted case, c and f allow us to reach \{p^{\prime}\} for p^{\prime}\in Q^{\prime}_{m}\setminus\{0^{\prime}\}, \{q\} for q\in Q_{n}, and \emptyset. State \{p^{\prime}\} is reached from the initial state by eca^{p-1}, \{0\} is reached by f, and \{q\} is reached by fba^{q-1}. The empty set is reached by fc.

The original mn-n+1 states are pairwise distinguishable as before. States \{p^{\prime}\} for p^{\prime}\in Q^{\prime}_{m}\setminus\{0^{\prime}\} are pairwise distinguishable by words in a^{*}d, as are states \{q\} for q\in Q_{n}. All states other than \emptyset are non-empty, since they all accept ea^{m-2}d or ba^{n-2}d. All states \{p^{\prime}\} are distinguishable from states containing elements of Q_{n}, since \{p^{\prime}\}f=\emptyset while \{q\}f\not=\emptyset for all q\in Q_{n}. Similarly, all states \{q\} are distinguishable from states containing elements of Q^{\prime}_{m}. Thus, all mn+m+1 states are pairwise distinguishable.

7. Boolean Operations.

   (a) Restricted complexity: Since L^{\prime}_{m}(a,b,-,d,e) and L_{n}(a,e,-,d,b) are the complements of the left ideals M^{\prime}_{m}(a,b,-,d,e) and M_{n}(a,e,-,d,b) of Definition 3, respectively, and they share a common alphabet \{a,b,d,e\}, by De Morgan’s laws we have \kappa(L^{\prime}_{m}\cup L_{n})=\kappa(M^{\prime}_{m}\cap M_{n}), \kappa(L^{\prime}_{m}\oplus L_{n})=\kappa(M^{\prime}_{m}\oplus M_{n}), \kappa(L^{\prime}_{m}\setminus L_{n})=\kappa(M_{n}\setminus M^{\prime}_{m}), and \kappa(L^{\prime}_{m}\cap L_{n})=\kappa(M^{\prime}_{m}\cup M_{n}). Thus, by Theorem 3.2, all boolean operations have complexity mn.

   (b) Unrestricted complexity: Following the example set in Theorem 3.2, we take the direct product for L^{\prime}_{m}(a,b,c,d,e)\circ L_{n}(a,e,f,d,b), as illustrated in Figure 8 for \circ=\cup.

By the restricted case, the states of Q^{\prime}_{m}\times Q_{n} are reachable and distinguishable using words in \{a,b,d,e\}^{*}. Let R_{\emptyset^{\prime}}=\{(\emptyset^{\prime},q)\mid q\in Q_{n}\} and C_{\emptyset}=\{(p^{\prime},\emptyset)\mid p^{\prime}\in Q^{\prime}_{m}\}. States of R_{\emptyset^{\prime}}\cup C_{\emptyset}\cup\{(\emptyset^{\prime},\emptyset)\} are easily seen to be reachable using c and f in addition to a, b, d, and e. We check that the states of R_{\emptyset^{\prime}}\cup C_{\emptyset}\cup\{(\emptyset^{\prime},\emptyset)\} are pairwise distinguishable and distinguishable from the states of Q^{\prime}_{m}\times Q_{n}.

Union Non-final states of R_{\emptyset^{\prime}} are distinguished by words in a^{*}d, and those of C_{\emptyset} are similarly distinguishable. All states besides \{(\emptyset^{\prime},\emptyset)\} are non-empty since each accepts a word in \{a,b,d,e\}^{*}. States of R_{\emptyset^{\prime}}\cup\{(\emptyset^{\prime},\emptyset)\} are distinguishable from all other states since every other state accepts a word in ca^{*}d; states of C_{\emptyset}\cup\{(\emptyset^{\prime},\emptyset)\} are similarly distinguishable from all other states by words in fa^{*}d. Hence all (m+1)(n+1) states are pairwise distinguishable.

Symmetric Difference Same as union.

Difference States of R_{\emptyset^{\prime}}\cup\{(\emptyset^{\prime},\emptyset)\} are empty and therefore equivalent. However, since the alphabet of L^{\prime}_{m}(a,b,c,d,e)\setminus L_{n}(a,e,f,d,b) is \{a,b,c,d,e\} we can omit f and delete the states of R_{\emptyset^{\prime}}\cup\{(\emptyset^{\prime},\emptyset)\}, and be left with a DFA over \{a,b,c,d,e\} that recognizes L^{\prime}_{m}(a,b,c,d,e)\setminus L_{n}(a,e,f,d,b). States of C_{\emptyset} are distinguished by words in a^{*}d, and states of Q^{\prime}_{m}\times Q_{n} are distinguished from states of C_{\emptyset} by words in a^{*}da^{*}d. Hence the mn+m remaining states are pairwise distinguishable.

Intersection States of R_{\emptyset^{\prime}}\cup C_{\emptyset}\cup\{(\emptyset^{\prime},\emptyset)\} are empty and therefore equivalent. However, since the alphabet of L^{\prime}_{m}(a,b,c,d,e)\cap L_{n}(a,e,f,d,b) is \{a,b,d,e\}, we can omit c and f and delete the states of R_{\emptyset^{\prime}}\cup C_{\emptyset}\cup\{(\emptyset^{\prime},\emptyset)\}, and be left with a DFA over \{a,b,d,e\} that recognizes L^{\prime}_{m}(a,b,c,d,e)\cap L_{n}(a,e,f,d,b). By the restricted case, all mn states are pairwise distinguishable. ∎
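The unrestricted construction used above (add an empty state to each DFA for the letters it lacks, then take the direct product) can be sketched in Python. This is only an illustration under stated assumptions: the two toy languages a^{*} and b^{*} are not the witnesses of this paper, the helper names `complete`, `complexity` and `unrestricted` are hypothetical, and the sketch always works over the union of the two alphabets rather than pruning letters as is done above for difference and intersection.

```python
def complete(delta, states, alphabet, sink):
    """Totalize a partial transition map over a larger alphabet via a sink."""
    return {(q, x): delta.get((q, x), sink)
            for q in list(states) + [sink] for x in alphabet}


def complexity(start, step, is_final, alphabet):
    """Size of the minimal DFA: reachable states, then Moore refinement."""
    seen = [start]
    for q in seen:                      # the list grows while it is scanned
        for x in alphabet:
            r = step(q, x)
            if r not in seen:
                seen.append(r)
    cls = {q: int(is_final(q)) for q in seen}
    while True:
        ids = {}
        new = {}
        for q in seen:
            sig = (cls[q],) + tuple(cls[step(q, x)] for x in alphabet)
            new[q] = ids.setdefault(sig, len(ids))
        if len(ids) == len(set(cls.values())):
            return len(ids)             # no class was split: partition is stable
        cls = new


def unrestricted(d1, d2, op):
    """Complexity of op(L1, L2) over the union of the two alphabets."""
    (q1, s1, t1, f1), (q2, s2, t2, f2) = d1, d2
    alphabet = sorted({x for _, x in t1} | {x for _, x in t2})
    c1 = complete(t1, q1, alphabet, "sink1")
    c2 = complete(t2, q2, alphabet, "sink2")
    step = lambda pq, x: (c1[(pq[0], x)], c2[(pq[1], x)])
    final = lambda pq: op(pq[0] in f1, pq[1] in f2)
    return complexity((s1, s2), step, final, alphabet)


# Toy DFAs (states, initial state, transitions, final states); assumptions only.
A_STAR = ([0], 0, {(0, "a"): 0}, {0})   # a* over {a}, complexity m = 1
B_STAR = ([0], 0, {(0, "b"): 0}, {0})   # b* over {b}, complexity n = 1

if __name__ == "__main__":
    print(unrestricted(A_STAR, B_STAR, lambda x, y: x or y))   # union
    print(unrestricted(A_STAR, B_STAR, lambda x, y: x != y))   # symmetric difference
```

For these toy inputs with m=n=1, union and symmetric difference both need (m+1)(n+1)=4 states, which matches the shape of the unrestricted bound even in this degenerate case.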

## 5 Suffix-Free Languages

The complexity of suffix-free languages was studied in detail in [11, 15, 16, 21, 22, 26]. For completeness we present a short summary of some of those results. The main result of [15, 16] is a proof that a most complex suffix-free language does not exist. Since every suffix-free language has an empty quotient, the restricted and unrestricted cases for binary operations coincide.

For n\geqslant 6, the transition semigroup of the DFA defined below is the largest transition semigroup of a minimal DFA accepting a suffix-free language.

###### Definition 5

For n\geqslant 4, we define the DFA {\mathcal{D}}_{n}(a,b,c,d,e)=(Q_{n},\Sigma,\delta,0,F), where Q_{n}=\{0,\ldots,n-1\}, \Sigma=\{a,b,c,d,e\}, \delta is defined by the transformations a\colon(0\to n-1)(1,\ldots,n-2), b\colon(0\to n-1)(1,2), c\colon(0\to n-1)(n-2\to 1), d\colon(\{0,1\}\to n-1), e\colon(Q_{n}\setminus\{0\}\to n-1)(0\to 1), and F=\{q\in Q_{n}\setminus\{0,n-1\}\mid q\text{ is odd}\}. For n=4, a and b coincide, and we can use \Sigma=\{b,c,d,e\}. Let the transition semigroup of {\mathcal{D}}_{n} be \mathbf{T}^{\geqslant 6}(n).

The main result for this witness is the following theorem:

###### Theorem 5.1 (Semigroup, Quotients, Reversal, Atoms, Boolean Ops.)

Consider DFA {\mathcal{D}}_{n}(a,b,c,d,e) of Definition 5; its language L_{n}(a,b,c,d,e) is a suffix-free language of complexity n. Moreover, it meets the following bounds:

1.

For n\geqslant 6, L_{n}(a,b,c,d,e) meets the bound (n-1)^{n-2}+n-2 for syntactic complexity, and at least five letters are required to reach this bound.

2.

The quotients of L_{n}(a,-,-,-,e) have complexity n-1, except for the language L_{n}(a,-,-,-,e) itself, which has complexity n, and the empty quotient, which has complexity 1.

3.

For n\geqslant 4, the reverse of L_{n}(a,-,c,-,e) has complexity 2^{n-2}+1, and L_{n}(a,-,c,-,e) has 2^{n-2}+1 atoms.

4.

Each atom A_{S} of L_{n}(a,b,c,d,e) has maximal complexity:

 \kappa(A_{S})=\begin{cases}2^{n-2}+1,&\text{if $S=\emptyset$;}\\ n,&\text{if $S=\{0\}$;}\\ 1+\sum_{x=1}^{|S|}\sum_{y=0}^{n-2-|S|}\binom{n-2}{x}\binom{n-2-x}{y},&\text{if $\emptyset\neq S\subseteq\{1,\ldots,n-2\}$.}\end{cases}
5.

For n,m\geqslant 4, the complexity of L_{m}(a,b,-,d,e)\circ L_{n}(b,a,-,d,e) is mn-(m+n-2) if \circ\in\{\cup,\oplus\}, mn-(m+2n-4) if \circ=\setminus, and mn-2(m+n-3) if \circ=\cap.

6.

A language which has a subsemigroup of \mathbf{T}^{\geqslant 6}(n) as its syntactic semigroup cannot meet the bounds for star and product.
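Some claims of Theorem 5.1 can be checked mechanically for a small n. The following Python sketch builds the transformations of Definition 5 as tuples, generates the transition semigroup by closure under right multiplication, and tests minimality and suffix-freeness. The function names are hypothetical, and the check of suffix-freeness (no state reached from 0 by a non-empty word may share an accepted word with state 0) is our reading of the definition rather than a procedure taken from the cited papers.

```python
def transformation(n, cycles=(), maps=()):
    """Transformation of {0,...,n-1} as a tuple t, meaning q -> t[q]."""
    t = list(range(n))
    for cyc in cycles:
        for i, q in enumerate(cyc):
            t[q] = cyc[(i + 1) % len(cyc)]
    for sources, target in maps:
        for q in sources:
            t[q] = target
    return tuple(t)


def definition5(n):
    """Letters a..e and final states of D_n(a,b,c,d,e), read off Definition 5."""
    sink = n - 1
    letters = {
        "a": transformation(n, [list(range(1, n - 1))], [([0], sink)]),
        "b": transformation(n, [[1, 2]], [([0], sink)]),
        "c": transformation(n, maps=[([0], sink), ([n - 2], 1)]),
        "d": transformation(n, maps=[([0, 1], sink)]),
        "e": transformation(n, maps=[(range(1, n), sink), ([0], 1)]),
    }
    finals = {q for q in range(1, n - 1) if q % 2 == 1}
    return letters, finals


def semigroup(gens):
    """All transformations induced by non-empty words over the generators."""
    gens = list(gens)
    seen = set(gens)
    queue = list(seen)
    for s in queue:
        for g in gens:
            sg = tuple(g[q] for q in s)     # apply s first, then g
            if sg not in seen:
                seen.add(sg)
                queue.append(sg)
    return seen


def quotient_complexity(letters, finals):
    """Number of states of the minimal DFA (reachability + Moore refinement)."""
    ts = list(letters.values())
    seen = [0]
    for q in seen:
        for t in ts:
            if t[q] not in seen:
                seen.append(t[q])
    cls = {q: int(q in finals) for q in seen}
    while True:
        ids, new = {}, {}
        for q in seen:
            sig = (cls[q],) + tuple(cls[t[q]] for t in ts)
            new[q] = ids.setdefault(sig, len(ids))
        if len(ids) == len(set(cls.values())):
            return len(ids)
        cls = new


def is_suffix_free(letters, finals):
    """True if no word w and non-empty x satisfy both w in L and xw in L."""
    ts = list(letters.values())
    after = list(dict.fromkeys(t[0] for t in ts))   # states 0x with |x| = 1
    for q in after:                                  # close under more letters
        for t in ts:
            if t[q] not in after:
                after.append(t[q])
    pairs = [(q, 0) for q in after]                  # read w from 0x and from 0
    for p, r in pairs:
        if p in finals and r in finals:
            return False
        for t in ts:
            if (t[p], t[r]) not in pairs:
                pairs.append((t[p], t[r]))
    return True


if __name__ == "__main__":
    n = 6
    letters, finals = definition5(n)
    print(len(semigroup(letters.values())), (n - 1) ** (n - 2) + n - 2)
    print(quotient_complexity(letters, finals), is_suffix_free(letters, finals))
```

For n=6 the first line compares the computed semigroup size with the bound (n-1)^{n-2}+n-2=629 of item 1.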

The DFA defined below has the largest transition semigroup when n\in\{4,5\}. The transition semigroup of this DFA is \mathbf{T}^{\leqslant 5}(n), and at least n letters are required to generate it.

###### Definition 6

For n\geqslant 4, {\mathcal{D}}_{n}(a,b,c_{1},\dots,c_{n-2})=(Q_{n},\Sigma_{n},\delta,0,\{n-2\}), where Q_{n}=\{0,\ldots,n-1\}, \Sigma_{n}=\{a,b,c_{1},\dots,c_{n-2}\}, \delta is given by a\colon(0\to n-1)(1,\ldots,n-2), b\colon(0\to n-1)(1,2), and c_{p}\colon(p\to n-1)(0\to p) for 1\leqslant p\leqslant n-2.

We now define a DFA based on Definition 6, but with only three inputs.

###### Definition 7

For n\geqslant 4, define the DFA {\mathcal{D}}_{n}=(Q_{n},\Sigma,\delta,0,\{n-2\}), where Q_{n}=\{0,\ldots,n-1\}, \Sigma=\{a,b,c\}, and \delta is defined by a\colon(0\to n-1)(1,\dots,n-2), b\colon(0\to n-1)(1,2), c\colon(1\to n-1)(0\to 1). See Figure 9.

###### Theorem 5.2 (Star, Product, Boolean Operations)

Let {\mathcal{D}}_{n}(a,b,c) be the DFA of Definition 7, and let the language it accepts be L_{n}(a,b,c). Then L_{n} and its permutational dialects meet the bounds for star, product, and boolean operations as follows:

1.

For n\geqslant 4, (L_{n}(a,b,c))^{*} meets the bound 2^{n-2}+1.

2.

For m,n\geqslant 4, L^{\prime}_{m}(a,b,c)L_{n}(c,a,b) meets the bound (m-1)2^{n-2}+1.

3.

For m,n\geqslant 4, but (m,n)\not=(4,4), the complexity of L^{\prime}_{m}(a,b,c)\circ L_{n}(b,a,c) is mn-(m+n-2) if \circ\in\{\cup,\oplus\}, mn-(m+2n-4) if \circ=\setminus, and mn-2(m+n-3) if \circ=\cap.
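The star bound of item 1 can be checked for small n with a short Python sketch. It assumes that c sends state 1 to the empty state n-1 (as c_{1} of Definition 6 does) and sends 0 to 1, and it uses the fact that state 0 has no incoming transitions, so a subset construction that adds 0 whenever the final state n-2 is entered recognizes the star. The names `definition7` and `star_complexity` are hypothetical.

```python
def definition7(n):
    """Transformations a, b, c of D_n(a,b,c) as lists; n-1 is the empty state."""
    sink = n - 1
    a = [sink] + [q % (n - 2) + 1 for q in range(1, n - 1)] + [sink]  # cycle (1,...,n-2)
    b = list(range(n))
    b[0], b[1], b[2] = sink, 2, 1                                     # transposition (1,2)
    c = list(range(n))
    c[0], c[1] = 1, sink                                              # 0 -> 1, 1 -> n-1
    return [a, b, c]


def star_complexity(n):
    """Complexity of (L_n(a,b,c))^* via subset construction and minimization."""
    letters = definition7(n)
    sink, final = n - 1, n - 2

    def step(subset, t):
        image = {t[q] for q in subset} - {sink}   # the empty state is dropped
        if final in image:
            image.add(0)                          # epsilon-move from n-2 back to 0
        return frozenset(image)

    seen = [frozenset({0})]
    for s in seen:                                # list grows while it is scanned
        for t in letters:
            r = step(s, t)
            if r not in seen:
                seen.append(r)

    cls = {s: int(0 in s) for s in seen}          # a set is final iff it contains 0
    while True:                                   # Moore partition refinement
        ids, new = {}, {}
        for s in seen:
            sig = (cls[s],) + tuple(cls[step(s, t)] for t in letters)
            new[s] = ids.setdefault(sig, len(ids))
        if len(ids) == len(set(cls.values())):
            return len(ids)
        cls = new


if __name__ == "__main__":
    for n in (4, 5):
        print(n, star_complexity(n), 2 ** (n - 2) + 1)
```

Each printed line shows n, the computed complexity of the star, and the bound 2^{n-2}+1.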

###### Proof

The upper bounds for these operations were established in [10].

1.

Star We will prove that the DFA of Figure 9 meets the bound 2^{n-2}+1. Since there are no incoming transitions to the initial state 0, to obtain an NFA accepting L_{n}^{*} it is sufficient to make state 0 final, add an \varepsilon-transition from state n-2 to state 0, and delete state n-1. We will show that in the subset construction the following states are reachable and pairwise distinguishable: \{0\}, any one of the 2^{n-3} subsets S of P=\{1,\dots,n-3\}, and 2^{n-3} subsets of the form \{0,n-2\}\cup S, where S\subseteq P.

The initial state is \{0\}, the empty set is reached by a, set \{q\}, q\in P by ca^{q-1}, and \{0,n-2\} by ca^{n-3}. Notice that the set \{1,2,\dots,k\}, where 2\leqslant k\leqslant n-3, is reached from \{1,\dots,k-1\} by a^{n-1-k}ca^{k-1}. It is well known that a and b generate all permutations of \{1,\dots,n-2\}; hence, for any S\subseteq P there is a word w\in\{a,b\}^{*} such that \{1,2,\dots,|S|\}w=S. Similarly a set \{0,q_{1},q_{2},\dots,q_{k},n-2\} with k\leqslant n-4 is reached by a permutation from \{1,2,\dots,k+1\}. Finally, \{0,1,\dots,n-2\} is reached from \{1,\dots,n-3\} by ac; hence every subset of P is reachable, as are the sets S\cup\{0,n-2\} for S\subseteq P.

Any pair of sets which differ by q\in\{1,\dots,n-2\} is distinguished by a^{n-2-q}, and the empty set is distinguished from \{0\} by ca^{n-3}. Thus, all 2^{n-2}+1 sets are reachable and pairwise distinguishable.

2.

Product We construct an NFA for the product by deleting state (m-1)^{\prime}, adding an \varepsilon-transition from state (m-2)^{\prime} to state 0, deleting state n-1, and making state n-2 the only final state. We will show that the following sets are all reachable and pairwise distinguishable: \{0^{\prime}\}, \{p^{\prime}\}\cup S, 1\leqslant p<m-2, \{(m-2)^{\prime},0\}\cup S, and S, where S\subseteq Q_{n-1}\setminus\{0\}.

State \{0^{\prime}\} is initial, \{p^{\prime}\} is reached by ca^{p-1} for 1\leqslant p<m-2, and \{(m-2)^{\prime},0\} is reached by ca^{m-3}. From \{(m-2)^{\prime},0,q_{2}-q_{1},\dots,q_{k}-q_{1}\} we reach \{(m-2)^{\prime},0,q_{1},q_{2},\dots,q_{k}\} by cbc^{q_{1}-1}; hence \{(m-2)^{\prime},0\}\cup S is reachable for any S\subseteq Q_{n-1}\setminus\{0\}. Now for any p^{\prime}\in\{1,\dots,(m-3)^{\prime}\}, if p is even, then \{p^{\prime}\}\cup S is reached from \{(m-2)^{\prime},0\}\cup S by a^{p}, and if p is odd, then \{p^{\prime}\}\cup S is reached from \{(m-2)^{\prime},0\}\cup(Sa) by a^{p}. Finally, S is reached from \{1^{\prime}\}\cup S by c^{n-2}.

Any two states which differ on some q\in Q_{n-1}\setminus\{0\} are distinguished by c^{n-2-q}. Two states that differ on p^{\prime}\in Q^{\prime}_{m}\setminus\{0^{\prime}\} are distinguished by a^{m-2-p}bc^{n-3}. Finally, \{0^{\prime}\} is distinguishable from all other states because it is the only state that accepts ca^{m-3}bc^{n-3}.

3.

Boolean Operations We consider the direct product of {\mathcal{D}}^{\prime}_{m}(a,b,c) and {\mathcal{D}}_{n}(b,a,c), where m,n\geqslant 4 and (m,n)\not=(4,4), illustrated in Figure 11. Let S=\{(p^{\prime},q)\mid p^{\prime}\in Q^{\prime}_{m-1}\setminus\{0^{\prime}\},q\in Q_{n-1}\setminus\{0\}\}, R=\{((m-1)^{\prime},q)\mid q\in Q_{n-1}\setminus\{0\}\}, and C=\{(p^{\prime},n-1)\mid p^{\prime}\in Q^{\prime}_{m-1}\setminus\{0^{\prime}\}\}.

We first determine which states are reachable in the direct product. State (0^{\prime},0) is initial and (1^{\prime},1) is reachable by c. By [2, Theorem 1] and computation for the cases (m,n)\in\{(5,6),(6,5),(6,6)\}, all (m-2)(n-2) states of S are reachable from (1^{\prime},1) for all m,n\geqslant 4, (m,n)\not=(4,4). State ((m-1)^{\prime},n-2) is reached from (1^{\prime},n-2) by c, and ((m-1)^{\prime},q), q\in Q_{n-1}\setminus\{0\}, is reached from ((m-1)^{\prime},n-2) by b^{q}; thus states of R are reachable. Similarly, state ((m-2)^{\prime},n-1) is reached from ((m-2)^{\prime},1) by c, and (p^{\prime},n-1), p^{\prime}\in Q^{\prime}_{m-1}\setminus\{0^{\prime}\}, is reached from ((m-2)^{\prime},n-1) by a^{p}; thus states of C are reachable. State ((m-1)^{\prime},n-1) is reachable from ((m-1)^{\prime},1) by c. States (p^{\prime},0) for p^{\prime}\not=0^{\prime} and (0^{\prime},q) for q\not=0 are not reachable in the direct product, leaving mn-(m+n-2) reachable states.

We check distinguishability for each operation. In all cases, (0^{\prime},0) is distinguishable from every other state because it is non-empty, and it goes to the empty state by a. Again using [2, Theorem 1] and computation for the cases (m,n)\in\{(5,6),(6,5),(6,6)\}, the states of S are pairwise distinguishable for all four operations.

Union States of R are distinguished from each other by words in b^{*}, and from states of S by words in \{a,b\}^{*}. Similarly, states of C are distinguished from each other by words in a^{*}, and from states of S by words in \{a,b\}^{*}. States of R are distinguished from those of C by words in a^{*}. Hence the mn-(m+n-2) reachable states are pairwise distinguishable.

Symmetric Difference Same as union.

Difference The states of R are all empty, and hence equivalent to ((m-1)^{\prime},n-1). States of C are pairwise distinguishable by words in a^{*}, and they are distinguished from states of S by words in \{a,b\}^{*}. Hence there are mn-(m+2n-4) distinguishable states.

Intersection The states of R\cup C are all empty, and hence there are only mn-2(m+n-3) distinguishable states. ∎
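The counts in the boolean-operations part of the proof can be reproduced for a small pair (m,n) by building the direct product explicitly. The sketch below is a hedged illustration: the helper `dialect` (a hypothetical name) assigns the cycle, the transposition and the map (1\to n-1)(0\to 1) of Definition 7 to the letters in the given order, so that "abc" gives {\mathcal{D}}^{\prime}_{m}(a,b,c) and "bac" gives {\mathcal{D}}_{n}(b,a,c); the reading c\colon(1\to n-1)(0\to 1) is assumed.

```python
def dialect(n, order):
    """Letter -> transformation for the dialect of D_n given by `order`."""
    sink = n - 1
    cycle = [sink] + [q % (n - 2) + 1 for q in range(1, n - 1)] + [sink]
    swap = list(range(n))
    swap[0], swap[1], swap[2] = sink, 2, 1
    shift = list(range(n))
    shift[0], shift[1] = 1, sink
    return dict(zip(order, (cycle, swap, shift)))


OPS = {
    "union": lambda x, y: x or y,
    "symmetric difference": lambda x, y: x != y,
    "difference": lambda x, y: x and not y,
    "intersection": lambda x, y: x and y,
}


def boolean_complexity(m, n, op):
    """Complexity of L'_m(a,b,c) op L_n(b,a,c) via the direct product."""
    d1, d2 = dialect(m, "abc"), dialect(n, "bac")
    step = lambda pq, x: (d1[x][pq[0]], d2[x][pq[1]])
    final = lambda pq: op(pq[0] == m - 2, pq[1] == n - 2)

    seen = [(0, 0)]
    for pq in seen:                               # reachable part of the product
        for x in "abc":
            r = step(pq, x)
            if r not in seen:
                seen.append(r)

    cls = {pq: int(final(pq)) for pq in seen}
    while True:                                   # Moore partition refinement
        ids, new = {}, {}
        for pq in seen:
            sig = (cls[pq],) + tuple(cls[step(pq, x)] for x in "abc")
            new[pq] = ids.setdefault(sig, len(ids))
        if len(ids) == len(set(cls.values())):
            return len(ids)
        cls = new


if __name__ == "__main__":
    m, n = 4, 5
    bounds = {
        "union": m * n - (m + n - 2),
        "symmetric difference": m * n - (m + n - 2),
        "difference": m * n - (m + 2 * n - 4),
        "intersection": m * n - 2 * (m + n - 3),
    }
    for name, op in OPS.items():
        print(name, boolean_complexity(m, n, op), bounds[name])
```

For (m,n)=(4,5) each line prints the computed complexity next to the corresponding bound from Theorem 5.2.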

The transition semigroup of the DFA of Definition 8 is also a subsemigroup of \mathbf{T}^{\leqslant 5}(n), and its language also meets the bounds for product, star and boolean operations. The advantage of this DFA is that its witnesses use only two letters for star and only two letters (but three transformations) for boolean operations. Its disadvantage is that its transformations are rather complex. For more details see [16]. The DFA of Definition 7 seems to us more natural.

###### Definition 8

For n\geqslant 6, we define the DFA {\mathcal{D}}_{n}=(Q_{n},\Sigma,\delta,0,\{1\}), where Q_{n}=\{0,\ldots,n-1\}, \Sigma=\{a,b,c\}, and \delta is defined by the transformations a\colon(0\to n-1)(1,2,3)(4,\dots,n-2), b\colon(2\to n-1)(1\to 2)(0\to 1)(3,4), c\colon(0\to n-1)(1,\dots,n-2).

## 6 Conclusions

We have examined the complexity properties of left-ideal, suffix-closed, and suffix-free languages together because they are all special cases of suffix-convex languages. We have used the same most complex regular language as a basic component in all three cases.

Our results are summarized in Table 1. The largest bounds are shown in boldface type. Recall that for regular languages we have the following results: semigroup: n^{n}; reverse: 2^{n}; star: 2^{n-1}+2^{n-2}; restricted product: (m-1)2^{n}+2^{n-1}; unrestricted product: m2^{n}+2^{n-1}; restricted \cup and \oplus: mn; unrestricted \cup and \oplus: (m+1)(n+1); restricted \setminus: mn; unrestricted \setminus: mn+m; restricted \cap: mn; unrestricted \cap: mn.

## References

• [1] Ang, T., Brzozowski, J.: Languages convex with respect to binary relations, and their closure properties. Acta Cybernet. 19(2), 445–464 (2009)
• [2] Bell, J., Brzozowski, J., Moreira, N., Reis, R.: Symmetric groups and quotient complexity of boolean operations. In: Esparza, J., et al. (eds.) ICALP 2014. LNCS, vol. 8573, pp. 1–12. Springer (2014)
• [3] Berstel, J., Perrin, D., Reutenauer, C.: Codes and Automata (Encyclopedia of Mathematics and its Applications). Cambridge University Press (2010)
• [4] Brzozowski, J.: Quotient complexity of regular languages. J. Autom. Lang. Comb. 15(1/2), 71–89 (2010)
• [5] Brzozowski, J.: In search of the most complex regular languages. Int. J. Found. Comput. Sci. 24(6), 691–708 (2013)
• [6] Brzozowski, J.: Unrestricted state complexity of binary operations on regular languages. In: Câmpeanu, C., et al. (eds.) DCFS 2016. LNCS, vol. 9777, pp. 60–72. Springer (2016)
• [7] Brzozowski, J., Davies, S.: Quotient complexities of atoms in regular ideal languages. Acta Cybernet. 22(2), 293–311 (2015)
• [8] Brzozowski, J., Davies, S., Liu, B.Y.V.: Most complex regular ideal languages (October 2015), http://arxiv.org/abs/1511.00157
• [9] Brzozowski, J., Jirásková, G., Li, B.: Quotient complexity of ideal languages. Theoret. Comput. Sci. 470, 36–52 (2013)
• [10] Brzozowski, J., Jirásková, G., Zou, C.: Quotient complexity of closed languages. Theory Comput. Syst. 54, 277–292 (2014)
• [11] Brzozowski, J., Li, B., Ye, Y.: Syntactic complexity of prefix-, suffix-, bifix-, and factor-free regular languages. Theoret. Comput. Sci. 449, 37–53 (2012)
• [12] Brzozowski, J., Sinnamon, C.: Complexity of prefix-convex regular languages (2016), http://arxiv.org/abs/1605.06697
• [13] Brzozowski, J., Sinnamon, C.: Unrestricted state complexity of binary operations on regular and ideal languages (2016), http://arxiv.org/abs/1609.04439
• [14] Brzozowski, J., Szykuła, M.: Upper bounds on syntactic complexity of left and two-sided ideals. In: Shur, A.M., Volkov, M.V. (eds.) DLT 2014. LNCS, vol. 8633, pp. 13–24. Springer (2014)
• [15] Brzozowski, J., Szykuła, M.: Complexity of suffix-free regular languages. In: Kosowski, A., Walukiewicz, I. (eds.) FCT 2015. LNCS, vol. 9210, pp. 146–159. Springer (2015)
• [16] Brzozowski, J., Szykuła, M.: Complexity of suffix-free regular languages (2015), http://arxiv.org/abs/1504.05159
• [17] Brzozowski, J., Szykuła, M., Ye, Y.: Syntactic complexity of regular ideals (September 2015), http://arxiv.org/abs/1509.06032
• [18] Brzozowski, J., Tamm, H.: Quotient complexities of atoms of regular languages. Int. J. Found. Comput. Sci. 24(7), 1009–1027 (2013)
• [19] Brzozowski, J., Tamm, H.: Theory of átomata. Theoret. Comput. Sci. 539, 13–27 (2014)
• [20] Brzozowski, J., Ye, Y.: Syntactic complexity of ideal and closed languages. In: Mauri, G., Leporati, A. (eds.) DLT 2011. LNCS, vol. 6795, pp. 117–128. Springer Berlin / Heidelberg (2011)
• [21] Cmorik, R., Jirásková, G.: Basic operations on binary suffix-free languages. In: Kotásek, Z., et al. (eds.) MEMILCS. pp. 94–102 (2012)
• [22] Han, Y.S., Salomaa, K.: State complexity of basic operations on suffix-free regular languages. Theoret. Comput. Sci. 410(27-29), 2537–2548 (2009)
• [23] Holzer, M., Kutrib, M.: Descriptional and computational complexity of finite automata—a survey. Information and Computation 209(3), 456–470 (2011)
• [24] Holzer, M., König, B.: On deterministic finite automata and syntactic monoid size. Theoret. Comput. Sci. 327(3), 319–347 (2004)
• [25] Iván, S.: Complexity of atoms, combinatorially. Inform. Process. Lett. 116(5), 356–360 (2016)
• [26] Jirásková, G., Olejár, P.: State complexity of union and intersection of binary suffix-free languages. In: Bordihn, H., et al. (eds.) NMCA. pp. 151–166. Austrian Computer Society (2009)
• [27] Krawetz, B., Lawrence, J., Shallit, J.: State complexity and the monoid of transformations of a finite set. In: Domaratzki, M., Okhotin, A., Salomaa, K., Yu, S. (eds.) CIAA 2005. LNCS, vol. 3317, pp. 213–224. Springer Berlin / Heidelberg (2005)
• [28] Myhill, J.: Finite automata and representation of events. Wright Air Development Center Technical Report 57–624 (1957)
• [29] Pin, J.E.: Syntactic semigroups. In: Handbook of Formal Languages, vol. 1: Word, Language, Grammar, pp. 679–746. Springer, New York, NY, USA (1997)
• [30] Thierrin, G.: Convex languages. In: Nivat, M. (ed.) Automata, Languages and Programming, pp. 481–492. North-Holland (1973)
• [31] Yu, S.: State complexity of regular languages. J. Autom. Lang. Comb. 6, 221–234 (2001)