Palindromic length of words and morphisms in class P
We study the palindromic length of factors of infinite words fixed by morphisms of the so-called class P introduced by Hof, Knill and Simon. We show that it grows at most logarithmically with the length of the factor. For the Fibonacci word and the Thue-Morse word we provide estimates on the constants of the growth. We also construct an infinite word rich in palindromes for which the palindromic length grows as .
keywords: palindromic length; class P; palindromic richness.
In this paper we show a connection between two conjectures concerning palindromes in languages of infinite words. A palindrome is a word which reads the same forward as backward, such as eye or kayak. The impulse to formulate the first conjecture comes from the paper by Hof, Knill and Simon Hof et al. (1995), where they studied infinite words generated by primitive morphisms. They defined a certain class of morphisms, called class P, by requiring that the image of any letter $a$ in the alphabet is of the form $pq_a$, where $p$ and $q_a$ are palindromes, $p$ being common to every letter $a$. They showed that fixed points of morphisms in this class contain an infinite number of palindromes. They asked whether any palindromic fixed point of a primitive substitution arises using such a morphism. This question was eventually turned into a conjecture (later called the HKS conjecture); however, due to a certain vagueness of the original question, several versions of this conjecture have been considered (for details see the Introduction in Labbé and Pelantová (2016)). The validity of the HKS conjecture was proved for binary words Tan (2007) and for words fixed by morphisms of certain types Labbé and Pelantová (2016); Masáková et al. (2017).
In 2013, Frid, Puzynina and Zamboni Frid et al. (2013) introduced the palindromic length of a finite word , denoted by , as the minimal number of palindromes whose concatenation is equal to . They conjectured that if there is a constant such that the palindromic length of every factor in an infinite word is bounded by , then is eventually periodic. Formally, defining for a given infinite word the function by
the conjecture is that if is not eventually periodic, then . The authors of Frid et al. (2013) proved the conjecture for infinite words which do not contain an -power for any positive integer . In particular, the conjecture is true for any aperiodic fixed point of a primitive morphism, as such fixed points have bounded powers Mossé (1992). Later, Frid Frid (2018b) showed that Sturmian words have unbounded palindromic length even if they contain unbounded powers. The palindromic length of Sturmian words is also studied in Ambrož and Pelantová (2018). It is shown that can grow arbitrarily slowly. For infinite words other than Sturmian words and bounded-repetition words, the conjecture remains open.
We study the palindromic length of factors of fixed points of primitive morphisms. Here, as we have stated above, the palindromic length is unbounded whenever the fixed point is not eventually periodic. The main results of this contribution are formulated as Proposition 9 and Theorem 10. We prove that if the HKS conjecture is valid, then for any primitive morphism there is a constant such that the palindromic length of every factor in the language of is less than or equal to . We also provide a method of estimating the constant .
For the case of the Fibonacci word our computations suggest that , where is the golden ratio. We give an upper bound on also for the Thue-Morse word. The estimates are given in Section 4. Let us mention that a lower bound on is not known even for the Fibonacci word. Frid Frid (2018a) conjectures that .
The Fibonacci word belongs among the so-called rich words (or full words) introduced in Droubay et al. (2001); Brlek et al. (2004). An infinite word is rich if each of its finite factors is rich, i.e., contains as many palindromes as possible. Intuitively, the palindromic length of such words should grow slowly. We demonstrate that this is not necessarily the case. In Section 5, we use finite rich words introduced by Guo, Shallit and Shur Guo et al. (2016) to construct an infinite word for which for some positive constant .
Section 6 is devoted to further computer experiments on the Thue-Morse word whose language contains besides palindromes also infinitely many antipalindromes. We introduce the combined pal-antipal length and compare its growth to the growth of .
In the final section we formulate several open questions concerning factorization to palindromes and/or antipalindromes.
Let be a finite set called alphabet, its elements are called letters. A word (over ) is a finite sequence of elements in , its length (the number of its elements) is denoted by . The notation is used for the number of occurrences of the letter in . The empty word – unique word of length zero – is denoted by . The concatenation of words and is . The set of all finite words over equipped with the operation concatenation of words is a free monoid, denoted by .
For a word we define its mirror image as . A word is called a palindrome if . The palindromic length of a word , denoted by , is the smallest number of palindromes such that , i.e., the minimal number of palindromes whose concatenation is equal to . For convenience, we define .
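The definition above can be turned into a short dynamic program. The following is a minimal sketch, not an optimized algorithm: the function name `palindromic_length` is ours, and the value 0 for the empty word is an assumed convention.

```python
def palindromic_length(w: str) -> int:
    """Minimal number of palindromes whose concatenation equals w.

    Assumed convention: the empty word has palindromic length 0.
    """
    n = len(w)
    # is_pal[i][j] is True when the factor w[i..j] (inclusive) is a palindrome.
    is_pal = [[False] * n for _ in range(n)]
    for j in range(n):
        for i in range(j, -1, -1):
            if w[i] == w[j] and (j - i < 2 or is_pal[i + 1][j - 1]):
                is_pal[i][j] = True
    # best[k] is the palindromic length of the prefix of w of length k.
    best = [0] + [n] * n
    for j in range(1, n + 1):
        # A single letter is a palindrome, so the generator is never empty.
        best[j] = min(best[i] + 1 for i in range(j) if is_pal[i][j - 1])
    return best[n]
```

For instance, `palindromic_length("kayak")` is 1, while `palindromic_length("011")` is 2, witnessed by the factorization 0 · 11. The table `is_pal` costs quadratic time and memory, which is adequate for experiments on short words.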
An infinite sequence of letters in is called an infinite word. The set of all infinite words over is denoted . The word is said to be eventually periodic if it is of the form , where , and .
A factor of a (finite or infinite) word is a finite word such that for some words . If then is called a prefix of , if then is called a suffix of . The set of all factors of an infinite word , called the language of , is denoted by . Let be a prefix of a word , that is, there is a word such that . Then we define . Similarly, let be a suffix of a word , that is, there is a word such that . Then we define .
Let be a finite word over the alphabet . Then the Parikh vector of is the vector, denoted , whose -th element is the number of occurrences of in , i.e.,
Obviously, , where .
A morphism of the free monoid is a map such that for all . A morphism of , where , is called primitive if there is a constant such that contains for every . The action of the morphism is naturally extended to infinite words by concatenation, in particular, we have
An infinite word is called a fixed point of the morphism if . Clearly, a morphism can have several different fixed points, however, if is primitive then all its fixed points have the same language, denoted .
Let and let be a morphism of . The incidence matrix of is the matrix given by . The incidence matrix of can be used to compute the Parikh vector of the image of a word under by
3 Morphisms in class P
A primitive morphism belongs to class P if there is a palindrome such that for each
The Fibonacci morphism belongs to class P; Equation (2) is fulfilled for .
The Thue-Morse morphism does not belong to class P; however, its square does ().
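The two examples can be checked mechanically. The sketch below assumes the standard definitions of the two morphisms (Fibonacci: 0 → 01, 1 → 0; Thue-Morse: 0 → 01, 1 → 10) and searches for a palindrome p such that every letter image is p followed by a palindrome. The helper names are ours, and primitivity is not tested.

```python
def is_palindrome(w):
    return w == w[::-1]

def class_p_witnesses(morphism):
    """All palindromes p such that each letter image equals p + (a palindrome).

    Only the shape of the images is checked; primitivity is assumed.
    """
    images = list(morphism.values())
    shortest = min(images, key=len)
    witnesses = []
    # p must be a common prefix of all images, hence a prefix of the shortest.
    for k in range(len(shortest) + 1):
        p = shortest[:k]
        if is_palindrome(p) and all(
            img.startswith(p) and is_palindrome(img[k:]) for img in images
        ):
            witnesses.append(p)
    return witnesses

def compose(f, g):
    """Morphism sending each letter a to f(g(a))."""
    return {a: "".join(f[c] for c in g[a]) for a in g}

# Assumed standard definitions of the two morphisms.
fibonacci = {"0": "01", "1": "0"}
thue_morse = {"0": "01", "1": "10"}
```

With these definitions, `class_p_witnesses(fibonacci)` returns `["0"]`, `class_p_witnesses(thue_morse)` returns `[]`, and the square `compose(thue_morse, thue_morse)` admits the empty word as the common palindromic prefix.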
The following simple observation is due to Hof, Knill and Simon Hof et al. (1995).
Let be a primitive morphism in the form (2) and let be a fixed point of . Then
if then ,
if is a palindrome then is a palindrome.
The language of a fixed point of a morphism in class P contains infinitely many palindromes.
In fact, as was noticed in Allouche et al. (2003), the same statement as Corollary 5 is valid for fixed points of morphisms that are not in class P themselves, but one of their conjugates is, see Definition 6 below. The reason is that the languages of infinite words fixed by conjugated primitive morphisms coincide.
Morphisms are said to be conjugated, denoted by , if there is a word such that either for every or for every .
The proof of the following fact can be found for example in Allouche et al. (2003).
Let be a primitive morphism and let be conjugated with . Then .
In Allouche et al. (2003) the authors also make the observation that any morphism of class P is conjugated to a morphism of the form (2) where the palindrome is either the empty word or a single letter. From now on, in view of Proposition 7, we will only consider morphisms in class P of this form. The following lemma shows how the palindromic length of a finite word changes under application of such a morphism.
Let be a morphism in class P in the form (2).
If , then for every .
Suppose . If is even then , otherwise .
If then .
i) Let , where are palindromes. Then by Observation 4, is a concatenation of into palindromes.
ii) Let , where are palindromes. Then by Observation 4,
is a concatenation of into palindromes . If , where are palindromes then can be factorized into palindromes similarly:
iii) By i) and ii), may happen only if is odd. Then is even and thus . ∎
The above lemma states that by applying a morphism of class P to a word, the palindromic length can increase by at most one, and this happens only at alternating iterations of the morphism . With this knowledge, we can find an estimate on the growth of the palindromic length .
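The bound can be observed numerically on short words. The sketch below is only an illustration, under the assumption that the Fibonacci morphism is 0 → 01, 1 → 0 (a morphism of the required form with the common palindromic prefix equal to the letter 0). It records the largest change of palindromic length caused by one application of the morphism, separately for words of even and of odd palindromic length. All function names are ours.

```python
from itertools import product

def palindromic_length(w):
    # Naive dynamic program over prefixes; fine for short words.
    n = len(w)
    best = [0] + [n] * n
    for j in range(1, n + 1):
        best[j] = min(best[i] + 1 for i in range(j) if w[i:j] == w[i:j][::-1])
    return best[n]

def apply_morphism(morphism, w):
    return "".join(morphism[c] for c in w)

def worst_increase_by_parity(morphism, max_len):
    """Largest value of PL(image of w) - PL(w) over all nonempty words w of
    length at most max_len, split by the parity of PL(w): (even, odd)."""
    worst = {0: None, 1: None}
    for n in range(1, max_len + 1):
        for letters in product(sorted(morphism), repeat=n):
            w = "".join(letters)
            before = palindromic_length(w)
            after = palindromic_length(apply_morphism(morphism, w))
            parity = before % 2
            gain = after - before
            if worst[parity] is None or gain > worst[parity]:
                worst[parity] = gain
    return worst[0], worst[1]

# Assumed standard definition of the Fibonacci morphism.
FIBONACCI = {"0": "01", "1": "0"}
```

Calling `worst_increase_by_parity(FIBONACCI, 8)` tests all binary words up to length 8, not only factors of the Fibonacci word. An increase by one already occurs for the word 0, whose image 01 has palindromic length 2.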
Let be a morphism in class such that for each it holds that , where and is a palindrome. Let us denote
Then for a fixed point of we have
where is the dominant eigenvalue of the incidence matrix of .
First note that under our assumptions, the set is non-empty. Otherwise, for every letter , and as , the morphism is not primitive (and thus not in class P). Therefore the constant is well defined. Moreover, note that it is enough to consider only prefixes (in the definition of ) since are palindromes and thus if is a suffix of then is its prefix and .
Consider a fixed point of . If it is eventually periodic, then by Frid et al. (2013), the palindromic length of its factors is bounded and the statement of the proposition is trivially valid. Assume that has an aperiodic fixed point. Since is a primitive morphism, every sufficiently long factor of has a uniquely determined preimage Mossé (1996). More precisely, there exists such that for each , , there are factors of such that , where , is a proper suffix of and is a proper prefix of for some letters , cf. Figure 1.
Obviously, . By definition of , we have , since we have . Using Lemma 8, , where if is odd and , and otherwise. Together, we obtain
If we apply the same procedure to . In this way, for a given we create a sequence such that for each we have
and . From (5) we get
where we used that , and iii) of Lemma 8.
On the other hand, (6) implies that
Since is a primitive morphism, by the Perron-Frobenius theorem, the dominant eigenvalue of the matrix is positive and strictly greater than the absolute values of all other eigenvalues of . Denote by the non-singular matrix such that is in the Jordan canonical form, i.e., it is block diagonal. Notably, the block corresponding to the eigenvalue is of dimension . Without loss of generality, let it be the first block on the diagonal of . Then necessarily,
Combining with (8), we have that
Putting the latter estimate together with (7),
If tends to infinity then tends to infinity as well. The validity of the proposition follows. ∎
Proposition 9 provides an upper estimate on the palindromic length for any fixed point of any morphism of class P. The result is valid independently of the size of the alphabet. Reducing our consideration to binary infinite words, we recall the result of Bo Tan Tan (2007). He shows that any binary morphism producing a fixed point with infinitely many palindromes is either itself conjugated to a morphism in class P, or this can be said about its second iterate . This allows us to formulate a summarizing corollary to our Proposition 9.
Let be a fixed point of a primitive morphism over a binary alphabet. Then there is a constant such that either
Let contain only finitely many palindromes. Then obviously for every factor of length , we have , where is the length of the longest palindrome in . Thus grows at least linearly.
4 Fibonacci and Thue-Morse words
Let us provide an upper bound on the constant of Theorem 10 for the Fibonacci word and for the Thue-Morse word .
The Fibonacci word
Let us apply Proposition 9 in this case. Obviously and and the dominant eigenvalue of the incidence matrix of is the golden mean . Therefore
The Fibonacci word is also the fixed point of . Consider morphism . Taking , we see that
which gives a better estimate than (9).
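To get a feeling for the quantity being bounded, one can tabulate, for small n, the largest palindromic length among the factors of length n. The sketch below does this on a finite prefix of the Fibonacci word, assuming the standard morphism 0 → 01, 1 → 0. It only sees factors that occur in the chosen prefix, so the prefix must be long compared to n. The names are ours.

```python
def palindromic_length(w):
    # Naive dynamic program over prefixes; fine for short words.
    n = len(w)
    best = [0] + [n] * n
    for j in range(1, n + 1):
        best[j] = min(best[i] + 1 for i in range(j) if w[i:j] == w[i:j][::-1])
    return best[n]

def fibonacci_prefix(iterations):
    """Image of the letter 0 under the given number of iterations
    of the (assumed) Fibonacci morphism 0 -> 01, 1 -> 0."""
    morphism = {"0": "01", "1": "0"}
    w = "0"
    for _ in range(iterations):
        w = "".join(morphism[c] for c in w)
    return w

def factors(word, n):
    """Distinct factors of length n occurring in the finite word."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}

def max_palindromic_length(word, n):
    return max(palindromic_length(f) for f in factors(word, n))

prefix = fibonacci_prefix(10)  # 144 letters
for n in range(1, 11):
    print(n, max_palindromic_length(prefix, n))
```

As a sanity check on the prefix length, the number of distinct factors of length n found in `prefix` equals n + 1 for every n up to 10, as expected for a Sturmian word.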
If we use , which also fixes the Fibonacci word, we have , , and . This improves the constant in estimate (10) to . Making similar considerations for , , we obtain that . This makes us conjecture that
Let us remark that Frid Frid (2018a) investigated the palindromic length only of prefixes of the Fibonacci word. She conjectures that the prefix (of the Fibonacci word) whose length written in the Zeckendorf numeration system is has . Should this conjecture be valid, it would imply that
The Thue-Morse word
Let us consider the Thue-Morse word , i.e., the fixed point of the morphism (cf. Example 3). Similarly to the case of the Fibonacci word we are interested in the constant where .
5 Words rich in palindromes
Intuitively, a word containing many palindromic factors should have small palindromic length. Recall that Droubay et al. Droubay et al. (2001) showed that a finite word $w$ contains at most $|w|+1$ different palindromic factors. If this bound is attained, the word is called rich (in palindromes). An infinite word is called rich if each of its factors is rich. The difference between the upper bound and the actual number of palindromes in $w$ is called the (palindromic) defect of $w$ and denoted by , see Brlek et al. (2004).
Word contains 9 palindromic factors: , , , , , , , , , and thus is rich. On the other hand, the word contains only 8 palindromic factors: , , , , , , , , and thus is not rich. And so the Thue-Morse word is not rich, since is one of its factors. The defect of is . It follows from the results of Blondin-Massé et al. (2008) that .
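Counts of this kind are easy to reproduce by brute force. In the sketch below, the two sample words are our own choice and are not necessarily those used in the example above: 01001010 is a prefix of the Fibonacci word, and 00101100 is a factor of the Thue-Morse word. The empty word is counted as a palindromic factor, so a word of length n is rich exactly when n + 1 distinct palindromic factors are found.

```python
def palindromic_factors(w):
    """Set of distinct palindromic factors of w, the empty word included."""
    found = {""}
    for i in range(len(w)):
        for j in range(i + 1, len(w) + 1):
            f = w[i:j]
            if f == f[::-1]:
                found.add(f)
    return found

def defect(w):
    """Difference between the upper bound len(w) + 1 and the actual count."""
    return len(w) + 1 - len(palindromic_factors(w))

def is_rich(w):
    return defect(w) == 0
```

Here `palindromic_factors("01001010")` has 9 elements, so this word is rich, whereas `palindromic_factors("00101100")` has only 8 elements, which gives defect 1.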
Some rich words have small palindromic length. For example, the palindromic length of morphic Sturmian words such as the Fibonacci word grows at most logarithmically (cf. Theorem 10). Nevertheless, quite surprisingly, richness of a word does not imply that its palindromic length is small. We will demonstrate this fact on finite words
In all these words the sequences of runs of and of runs of are monotone, and therefore and are rich for all by a result of Guo et al. Guo et al. (2016).
For every we have and .
A palindromic factor in has one of the following forms , , , for some . Obviously, a minimal factorization will contain as many blocks of the 3rd and 4th type as possible. One can use at most such palindromes (since there are runs of zeroes and ones in , excluding the outer ones) plus at least two palindromes of the type or to cover whole . Examples of such minimal palindromic factorizations follow.
The same consideration holds for . ∎
With the use of the above proposition and a result on rich words derived in Rukavička (2017), we can summarize the following information about palindromic length of finite rich words.
There are constants , such that
for infinitely many finite rich words ,
for every rich word .
The finite words ( resp.), , allow one to define infinite rich words.
There exists an infinite word rich in palindromes for which
It suffices to define the infinite word as the word having the prefix for every .
Let be a prefix of of length , then we show that
Let be such that
since has prefix , we have , where or for some . Thus ,
has prefix and thus , where (analogously to the previous case) .
Lemma 6 from Saarela (2017) states that . Using this lemma for and and then for and we get
Analogously to the proof of Proposition 12, the maximum of the set is reached on the prefixes of . Therefore
6 Combined pal-antipal length
Let us reconsider the Thue-Morse word. Besides infinitely many palindromes, it also contains infinitely many so-called antipalindromes Blondin-Massé et al. (2008). These words have been considered in a wider context under the name -pseudo-palindromes (or -palindromes) already in Anne et al. (2005); de Luca and Luca (2006); Halava et al. (2007). We use the name antipalindrome in accordance with Guo et al. (2015).
A finite word over a binary alphabet is an antipalindrome, if , where exchanges the letters, , . Obviously, an antipalindrome is always of even length. A finite word thus need not be factorizable into antipalindromes only.
An extension of the question on palindromic length could concern the factorization of a given finite word into the smallest possible number of factors which are either palindromes or antipalindromes. For that purpose, we have adapted the simple quadratic algorithm for minimal palindromic factorization given in Fici et al. (2014). We have computed the palindromic and combined pal-antipal length for the prefixes of the Thue-Morse word of length up to , see graph of in Figure 2.
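The computation described above can be imitated with a naive dynamic program. The sketch below is not the quadratic algorithm of Fici et al. (2014): it tests every candidate factor directly and is therefore cubic, which is enough for short prefixes. The function names and the generation of the Thue-Morse word by the parity of the binary digit sum are our own choices.

```python
def thue_morse_prefix(n):
    """First n letters of the Thue-Morse word (parity of the binary digit sum)."""
    return "".join(str(bin(i).count("1") % 2) for i in range(n))

def is_palindrome(w):
    return w == w[::-1]

def is_antipalindrome(w):
    # w equals its mirror image with the letters 0 and 1 exchanged.
    swap = {"0": "1", "1": "0"}
    return w == "".join(swap[c] for c in reversed(w))

def min_factors(w, allowed):
    """Smallest number of factors, each accepted by `allowed`, whose
    concatenation is w. `allowed` must accept every single letter."""
    n = len(w)
    best = [0] + [n] * n
    for j in range(1, n + 1):
        best[j] = min(best[i] + 1 for i in range(j) if allowed(w[i:j]))
    return best[n]

def palindromic_length(w):
    return min_factors(w, is_palindrome)

def pal_antipal_length(w):
    return min_factors(w, lambda f: is_palindrome(f) or is_antipalindrome(f))
```

For the prefix 01101001 of length 8, the palindromic length is 2 (0110 · 1001), while the combined pal-antipal length is 1, since the whole prefix is an antipalindrome.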
Our computations suggest that by allowing both palindromes and antipalindromes in the factorization of the Thue-Morse word, one can reduce the number of factors by half (and not more), see Table 1. On the other hand, we suppose that there exist prefixes of arbitrary palindromic length whose combined pal-antipal length is as big as the palindromic one, see Table 2. Further research could be done in this direction.
7 Open problems
The palindromic length of finite and infinite words was introduced in 2013 Frid et al. (2013). Since then, several groups of authors have focused on the design of fast algorithms for computing the minimal palindromic factorization, see e.g. Fici et al. (2014); Rubinchik and Shur (2018); Borozdin et al. (2017). On the other hand, the analytic study of the palindromic length is still in its beginnings. Let us formulate several open questions which we consider of interest.
When studying the palindromic length of the Fibonacci and Thue-Morse words, we have conveniently considered a power of the morphism that could be conjugated to a morphism in which the image of every letter is a palindrome, i.e., for every . To our knowledge, the question of determining for which morphisms such a power exists has not been considered yet.
In the study of the growth of we provide a method of finding an upper bound on the constant , in the estimate , which is valid for any fixed point of any morphism in class P. To our knowledge, no methods for giving a lower bound on have been mentioned in the literature. So far, only Frid Frid (2018a) has focused on finding a lower bound on the palindromic length. Her study is specific to the Fibonacci word . She states a conjecture describing the prefixes of having palindromic length strictly bigger than all the shorter prefixes of .
The validity of the conjectured lower bound of Frid Frid (2018a) would imply that . Our computations (cf. Section 4) suggest that should have a bigger value. This is probably caused by the fact that Frid only considers the palindromic length of prefixes of the Fibonacci word. It may be the case that a bigger palindromic length is achieved on factors that are not prefixes of . We do not have candidates for such factors. It should be mentioned that Saarela Saarela (2017) shows that the palindromic length is unbounded when taken over the factors if and only if it is unbounded when taken over the prefixes only. This, however, does not mean that the growth of the function, as it depends on , should be the same in both cases.
In Proposition 14, we give an infinite rich word whose palindromic length grows with at least as . The infinite word is however not uniformly recurrent. All the other considered classes of palindromic uniformly recurrent words have palindromic length bounded by . Does there exist a uniformly recurrent infinite word such that ?
The HKS conjecture was formulated in view of a characterization of morphisms providing fixed points with infinitely many palindromes. It is not obvious which morphisms generate fixed points that, besides infinitely many palindromes, also contain arbitrarily long antipalindromes, as is the case for the Thue-Morse morphism.
This work was supported by the project CZ.02.1.01/0.0/0.0/16_019/0000778 from European Regional Development Fund. We also acknowledge financial support of the Grant Agency of the Czech Technical University in Prague, grant No. SGS14/205/OHK4/3T/14.
- Allouche et al. (2003) J.-P. Allouche, M. Baake, J. Cassaigne, D. Damanik, Palindrome complexity, Theoret. Comput. Sci. 292 (1) (2003) 9–31, ISSN 0304-3975, doi:10.1016/S0304-3975(01)00212-2.
- Ambrož and Pelantová (2018) P. Ambrož, E. Pelantová, A note on palindromic length of Sturmian sequences, submitted to European J. Combin., 2018.
- Anne et al. (2005) V. Anne, L. Zamboni, I. Zorca, Palindromes and pseudo-palindromes in episturmian and pseudo-palindromic infinite words, in: Words 2005 – 5th International Conference on Words, vol. 36 of Publications du LaCIM, 91–100, 2005.
- Blondin-Massé et al. (2008) A. Blondin-Massé, S. Brlek, A. Garon, S. Labbé, Combinatorial properties of -palindromes in the Thue-Morse sequence, Pure Math. Appl. (PU.M.A.) 19 (2-3) (2008) 39–52, ISSN 1218-4586.
- Borozdin et al. (2017) K. Borozdin, D. Kosolobov, M. Rubinchik, A. M. Shur, Palindromic length in linear time, in: 28th Annual Symposium on Combinatorial Pattern Matching, vol. 78 of LIPIcs. Leibniz Int. Proc. Inform., Schloss Dagstuhl. Leibniz-Zent. Inform., Wadern, Art. No. 23, 2017.
- Brlek et al. (2004) S. Brlek, S. Hamel, M. Nivat, C. Reutenauer, On the palindromic complexity of infinite words, Internat. J. Found. Comput. Sci. 15 (2) (2004) 293–306, ISSN 0129-0541, doi:10.1142/S012905410400242X.
- de Luca and Luca (2006) A. de Luca, A. De Luca, Pseudopalindrome closure operators in free monoids, Theoret. Comput. Sci. 362 (1) (2006) 282–300, ISSN 0304-3975, doi:10.1016/j.tcs.2006.07.009.
- Droubay et al. (2001) X. Droubay, J. Justin, G. Pirillo, Episturmian words and some constructions of de Luca and Rauzy, Theoret. Comput. Sci. 255 (1-2) (2001) 539–553, ISSN 0304-3975, doi:10.1016/S0304-3975(99)00320-5.
- Fici et al. (2014) G. Fici, T. Gagie, J. Kärkkäinen, D. Kempa, A subquadratic algorithm for minimum palindromic factorization, J. Discrete Algorithms 28 (2014) 41–48, ISSN 1570-8667, doi:10.1016/j.jda.2014.08.001.
- Frid (2018a) A. Frid, Representations of palindromes in the Fibonacci word, in: Numeration 2018,
- Frid (2018b) A. E. Frid, Sturmian numeration systems and decompositions to palindromes, European J. Combin. 71 (2018b) 202–212, ISSN 0195-6698, doi:10.1016/j.ejc.2018.04.003.
- Frid et al. (2013) A. E. Frid, S. Puzynina, L. Q. Zamboni, On palindromic factorization of words, Adv. in Appl. Math. 50 (5) (2013) 737–748, ISSN 0196-8858, doi:10.1016/j.aam.2013.01.002.
- Guo et al. (2015) C. Guo, J. Shallit, A. M. Shur, On the Combinatorics of Palindromes and
- Guo et al. (2016) C. Guo, J. Shallit, A. M. Shur, Palindromic rich words and run-length encodings, Inform. Process. Lett. 116 (12) (2016) 735–738, ISSN 0020-0190, doi:10.1016/j.ipl.2016.07.001.
- Halava et al. (2007) V. Halava, T. Harju, T. Kärki, L. Zamboni, Relational Fine and Wilf words, Tech. Rep. 839, Turku Centre for Computer
- Hof et al. (1995) A. Hof, O. Knill, B. Simon, Singular continuous spectrum for palindromic Schrödinger operators, Comm. Math. Phys. 174 (1) (1995) 149–159, ISSN 0010-3616.
- Labbé (2014) S. Labbé, A counterexample to a question of Hof, Knill and Simon, Electron. J. Combin. 21 (3) (2014) Paper 3.11, ISSN 1077-8926.
- Labbé and Pelantová (2016) S. Labbé, E. Pelantová, Palindromic sequences generated from marked morphisms, European J. Combin. 51 (2016) 200–214, ISSN 0195-6698, doi:10.1016/j.ejc.2015.05.006.
- Masáková et al. (2017) Z. Masáková, E. Pelantová, Š. Starosta, Exchange of three intervals: substitutions and palindromicity, European J. Combin. 62 (2017) 217–231, ISSN 0195-6698, doi:10.1016/j.ejc.2017.01.003.
- Mossé (1992) B. Mossé, Puissances de mots et reconnaissabilité des points fixes d’une substitution, Theoret. Comput. Sci. 99 (2) (1992) 327–334, ISSN 0304-3975, doi:10.1016/0304-3975(92)90357-L.
- Mossé (1996) B. Mossé, Reconnaissabilité des substitutions et complexité des suites automatiques, Bull. Soc. Math. France 124 (2) (1996) 329–346, ISSN 0037-9484.
- Rubinchik and Shur (2018) M. Rubinchik, A. M. Shur, EERTREE: an efficient data structure for processing palindromes in strings, European J. Combin. 68 (2018) 249–265, ISSN 0195-6698, doi:10.1016/j.ejc.2017.07.021.
- Rukavička (2017) J. Rukavička, On the number of rich words, in: Developments in language theory, vol. 10396 of Lecture Notes in Comput. Sci., Springer, Cham, 345–352, 2017.
- Saarela (2017) A. Saarela, Palindromic length in free monoids and free groups, in: Combinatorics on words, vol. 10432 of Lecture Notes in Comput. Sci., Springer, Cham, 203–213, 2017.
- Tan (2007) B. Tan, Mirror substitutions and palindromic sequences, Theoret. Comput. Sci. 389 (1-2) (2007) 118–124, ISSN 0304-3975, doi:10.1016/j.tcs.2007.08.003.