Deciding Equivalence of Linear Tree-to-Word Transducers in Polynomial Time
Abstract
We show that the equivalence of deterministic linear top-down tree-to-word transducers is decidable in polynomial time. Linear tree-to-word transducers are non-copying but not necessarily order-preserving and can be used to express XML and other document transformations. The result is based on a partial normal form that provides a basic characterization of the languages produced by linear tree-to-word transducers.
1 Introduction
Tree transformations are widely used in functional programming and document processing. Tree transducers are a general model for transforming structured data like a database in a structured or even unstructured way. Consider the following internal representation of a client database that should be transformed to a table in HTML.
Top-down tree transducers can be seen as functional programs that transform trees from the root to the leaves with finite memory. Transformations where the output is not produced in a structured way or where, for example, the output is a string, can be modeled by tree-to-word transducers.
In this paper, we study deterministic linear tree-to-word transducers (ltws), a subset of deterministic tree-to-word transducers that are non-copying, but not necessarily order-preserving. Processing the subtrees in an arbitrary order is important to avoid reordering of the internal data for different use cases. In the example of the client database the names may be needed in different formats, e.g.
The equivalence of unrestricted tree-to-word transducers was a long-standing open problem that was recently shown to be decidable [14]. The algorithm of [14] yields a co-randomized polynomial algorithm for linear transducers. We show that the equivalence of ltws is decidable in polynomial time and provide a partial normal form.
To decide equivalence of ltws, we start in Section 3 by extending the methods used for sequential (linear and order-preserving) tree-to-word transducers (stws), discussed in [15]. The equivalence for these transducers is decidable in polynomial time [15]. Moreover, a normal form for sequential and linear tree-to-word transducers, computable in exponential time, is known [7, 1]. Two equivalent ltws do not necessarily transform their trees in the same order. However, the differences that can occur are quite specific and characterized in [1]. We show how they can be identified. We use the notion of earliest states, inspired by the existing notion of earliest sequential transducers [7]. In this earliest form, two equivalent ltws can transform subtrees in different orders only if they fulfill specific properties pertaining to the periodicity of the words they create. Computing this normal form takes exponential time, as the number of states may increase exponentially. To avoid this size increase we do not compute these earliest transducers fully, but rather locally. This means we transform two ltws with different orders to a partial normal form in polynomial time (see Section 4) in which the order of their transformation of the different subtrees is the same. Finally, ltws that transform the subtrees of the input in the same order can be reduced to sequential tree-to-word transducers, as the input trees can be reordered according to the order in the transformation.
A short version of this paper will be published in the proceedings of the 20th International Conference on Developments in Language Theory (DLT 2016).
Related Work. Other classes of transducers, such as tree-to-tree transducers [5], macro tree transducers [6] or nested-word-to-word transducers [15], have been studied. Many results for tree-to-tree transducers are known, e.g. deciding equivalence [12], minimization algorithms [12] and Gold-style learning algorithms [9]. In contrast, transformations where the output is not generated in a structured way like a tree are not that well understood. For macro tree transducers, the decidability of equivalence is a well-known and long-standing question [2]. However, the equivalence of linear size increase macro tree transducers, which are equivalent to MSO definable transducers, is decidable [3, 4].
2 Preliminaries
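Let $\Sigma$ be a ranked alphabet with $\Sigma^{(n)}$ the symbols of rank $n$. Trees on $\Sigma$ ($T_\Sigma$) are defined inductively: if $f \in \Sigma^{(n)}$, and $t_1, \dots, t_n \in T_\Sigma$, then $f(t_1, \dots, t_n) \in T_\Sigma$ is a tree. Let $\Delta$ be an alphabet. An element $w \in \Delta^*$ is a word. For two words $u, v$ we denote the concatenation of these two words by $uv$. The length of a word $w$ is denoted by $|w|$. We call $\varepsilon$ the empty word. We denote by $a^{-1}$ the inverse of a symbol $a$, where $a a^{-1} = a^{-1} a = \varepsilon$. The inverse of a word $w = u_1 \dots u_n$ is $w^{-1} = u_n^{-1} \dots u_1^{-1}$.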
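A context-free grammar (CFG) is defined as a tuple $G = (\Delta, N, S, P)$, where $\Delta$ is the alphabet of $G$, $N$ is a finite set of nonterminal symbols, $S \in N$ is the initial nonterminal of $G$, $P$ is a finite set of rules of form $A \to w$, where $A \in N$ and $w \in (\Delta \cup N)^*$. A CFG is deterministic if each nonterminal has at most one rule.
We define the language $L_G(A)$ of a nonterminal $A$ recursively: if $A \to w_0 A_1 w_1 \dots A_n w_n$ is a rule of $P$, with $w_i$ words of $\Delta^*$ and $A_i$ nonterminals of $N$, and $u_i$ a word of $L_G(A_i)$, then $w_0 u_1 w_1 \dots u_n w_n$ is a word of $L_G(A)$. We define the context-free language $L_G$ of a context-free grammar $G$ as $L_G(S)$.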
A straight-line program (SLP) is a deterministic CFG that produces exactly one word. The word produced by an SLP $G$ is called $w_G$.
We denote the longest common prefix of all words of a language $L$ by $\mathrm{lcp}(L)$. Its longest common suffix is $\mathrm{lcs}(L)$.
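As a small illustration of the two notions above, the following sketch computes the longest common prefix and suffix of a finite set of words. The function names `lcp` and `lcs` are chosen here for illustration; the languages in this paper are context-free and possibly infinite, so this only covers finite samples.

```python
from typing import Iterable


def lcp(words: Iterable[str]) -> str:
    """Longest common prefix of a finite, nonempty collection of words."""
    words = list(words)
    if not words:
        # Assumption of this sketch: the empty language is not handled.
        raise ValueError("lcp of the empty language is not handled here")
    shortest = min(words, key=len)
    for i, ch in enumerate(shortest):
        # Stop at the first position where some word disagrees.
        if any(w[i] != ch for w in words):
            return shortest[:i]
    return shortest


def lcs(words: Iterable[str]) -> str:
    """Longest common suffix, obtained by reversing every word."""
    return lcp(w[::-1] for w in words)[::-1]
```

For instance, `lcp(["abab", "abba", "ab"])` is `"ab"` and `lcs(["xab", "yab"])` is `"ab"`.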
A word $u$ is said to be periodic of period $w$ if $w$ is the smallest word such that $u \in w^*$. A language $L$ is said to be periodic of period $w$ if $w$ is the smallest word such that $L \subseteq w^*$.
A language $L$ is quasiperiodic on the left (resp. on the right) of handle $u$ and period $w$ if $w$ is the smallest word such that $L \subseteq u w^*$ (resp. if $L \subseteq w^* u$). A language is quasiperiodic if it is quasiperiodic on the right or left. If $L$ is a singleton or empty, it is quasiperiodic of period $\varepsilon$. Iff $L$ is periodic, it is quasiperiodic on the left and the right of handle $\varepsilon$. If $L$ is quasiperiodic on the left (resp. right) then $\mathrm{lcp}(L)$ (resp. $\mathrm{lcs}(L)$) is the shortest word of $L$.
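The definitions of periodicity and quasiperiodicity can be made concrete on finite languages. The sketch below is only an illustration under assumptions: it reads "periodic of period w" as "every word is a power of w, with w as short as possible", and for quasiperiodicity on the left it picks the shortest word of the language as handle. The names `primitive_root`, `period` and `left_quasi_period` are introduced here and do not come from the paper.

```python
from typing import Iterable, Optional, Tuple


def primitive_root(w: str) -> str:
    """Smallest word u such that w is a power of u (empty for the empty word)."""
    n = len(w)
    for d in range(1, n + 1):
        if n % d == 0 and w[:d] * (n // d) == w:
            return w[:d]
    return ""


def period(words: Iterable[str]) -> Optional[str]:
    """Period of a finite language, or None if it is not periodic."""
    # All nonempty words must share one primitive root.
    roots = {primitive_root(w) for w in words if w}
    if len(roots) > 1:
        return None
    return roots.pop() if roots else ""


def left_quasi_period(words: Iterable[str]) -> Optional[Tuple[str, str]]:
    """(handle, period) if the finite language is quasiperiodic on the left."""
    words = list(words)
    if not words:
        return ("", "")
    handle = min(words, key=len)  # assumption: shortest word as handle
    if not all(w.startswith(handle) for w in words):
        return None
    p = period(w[len(handle):] for w in words)
    return None if p is None else (handle, p)
```

For example, the language `{a, aba, ababa}` is contained in `a(ba)*`, so `left_quasi_period(["a", "aba", "ababa"])` returns `("a", "ba")`.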
3 Linear Tree-to-Word Transducers
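A linear tree-to-word transducer (ltw) is a tuple $M = (\Sigma, \Delta, Q, \mathrm{ax}, \delta)$ where

$\Sigma$ is a ranked alphabet,

$\Delta$ is an alphabet of output symbols,

$Q$ is a finite set of states,

the axiom $\mathrm{ax}$ is of the form $u_0 q(x) u_1$, where $q \in Q$ and $u_0, u_1 \in \Delta^*$,

$\delta$ is a set of rules of the form $q, f \to u_0 q_1(x_{\sigma(1)}) u_1 \dots q_n(x_{\sigma(n)}) u_n$ where $q, q_1, \dots, q_n \in Q$, $f \in \Sigma$ of rank $n$, and $\sigma$ is a permutation from $\{1, \dots, n\}$ to $\{1, \dots, n\}$. There is at most one rule per pair $q, f$.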
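The partial function $[\![M]\!]_q$ of a state $q$ on an input tree $f(t_1, \dots, t_n)$ is defined inductively as

$u_0 [\![M]\!]_{q_1}(t_{\sigma(1)}) u_1 \dots [\![M]\!]_{q_n}(t_{\sigma(n)}) u_n$, if $q, f \to u_0 q_1(x_{\sigma(1)}) u_1 \dots q_n(x_{\sigma(n)}) u_n \in \delta$,

undefined, if $q, f$ is not defined in $\delta$.
The partial function $[\![M]\!]$ of an ltw $M$ with axiom $u_0 q(x) u_1$ on an input tree $t$ is defined as $[\![M]\!](t) = u_0 [\![M]\!]_q(t) u_1$.
Two ltws $M$ and $M'$ are equivalent if $[\![M]\!] = [\![M']\!]$.
A sequential tree-to-word transducer (stw) is an ltw where for each rule of the form $q, f \to u_0 q_1(x_{\sigma(1)}) u_1 \dots q_n(x_{\sigma(n)}) u_n$, $\sigma$ is the identity on $\{1, \dots, n\}$.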
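The inductive semantics above can be sketched as a short recursive evaluator. This is a minimal illustration, not the paper's construction: the encoding of trees as `(symbol, children)` pairs, of rules as a dictionary keyed by `(state, symbol)`, and the names `run_state` and `run` are all assumptions of the sketch, and the permutation is stored as a 0-based child index per state call. The sketch does not check that each rule really uses a permutation.

```python
from typing import Dict, List, Optional, Tuple

Tree = Tuple[str, tuple]                      # (symbol, children)
Call = Tuple[str, int, str]                   # (state q_i, child index, word u_i)
Rules = Dict[Tuple[str, str], Tuple[str, List[Call]]]


def run_state(rules: Rules, q: str, tree: Tree) -> Optional[str]:
    """Output of state q on a tree, or None where the function is undefined."""
    symbol, children = tree
    rule = rules.get((q, symbol))
    if rule is None:
        return None
    u0, calls = rule
    out = [u0]
    for state, child_index, u in calls:
        sub = run_state(rules, state, children[child_index])
        if sub is None:
            return None
        out.append(sub)
        out.append(u)
    return "".join(out)


def run(axiom: Tuple[str, str, str], rules: Rules, tree: Tree) -> Optional[str]:
    """Output of the whole transducer with axiom (u0, q, u1)."""
    u0, q, u1 = axiom
    body = run_state(rules, q, tree)
    return None if body is None else u0 + body + u1


# Hypothetical transducer: the rule for f visits the second subtree first.
RULES: Rules = {
    ("q", "f"): ("<", [("p", 1, ","), ("p", 0, ">")]),
    ("p", "a"): ("a", []),
    ("p", "b"): ("b", []),
}
AXIOM = ("[", "q", "]")
```

With these hypothetical rules, `run(AXIOM, RULES, ("f", (("a", ()), ("b", ()))))` yields `"[<b,a>]"`: the transducer is linear (each subtree is used once) but not order-preserving.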
We define accessibility of states as the transitive and reflexive closure of appearance in a rule. This means state is accessible from itself, and if , and is accessible from , then all states , , are accessible from .
We denote by $\mathrm{dom}([\![M]\!])$ (resp. $\mathrm{dom}([\![M]\!]_q)$) the domain of an ltw $M$ (resp. a state $q$), i.e. all trees $t \in T_\Sigma$ such that $[\![M]\!](t)$ is defined (resp. $[\![M]\!]_q(t)$). We only consider ltws with nonempty domains and assume w.l.o.g. that no state in an ltw has an empty domain by eliminating transitions using states with empty domain.
We denote by $L_M$ (resp. $L_q$) the range of $[\![M]\!]$ (resp. $[\![M]\!]_q$), i.e. the set of all images $[\![M]\!](t)$ (resp. $[\![M]\!]_q(t)$). The languages $L_M$ and $L_q$ for each $q \in Q$ are all context-free languages. We call a state $q$ (quasi)periodic if $L_q$ is (quasi)periodic.
Note that a word in a rule of an ltw can be represented by an SLP without changing the semantics of the ltw. Therefore a set of SLPs can be added to the transducer and a word on the right-hand side of a rule can be represented by an SLP. The decidability of equivalence of stws in polynomial time still holds true with the use of SLPs. The advantage of SLPs is that they may compress the size of a word as the following example shows.
Example 1.
We define an SLP , where is a set , the initial nonterminal is , and is the set of rules , , , , and . This SLP produces the word . has nonterminals and rules. Thus, produces a word that is exponential in the size of .
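The compression effect can be illustrated with a small sketch. The doubling grammar built by `doubling_slp` below is a hypothetical family chosen for illustration (it is not claimed to be the SLP of Example 1): each nonterminal `A_i` derives two copies of `A_{i+1}`, and the last one derives the letter `a`. The helper names are assumptions of this sketch.

```python
from typing import Dict, List

SLP = Dict[str, List[str]]  # nonterminal -> its single right-hand side


def doubling_slp(n: int) -> SLP:
    """Hypothetical SLP with n + 1 rules: A0 -> A1 A1, ..., An -> a."""
    rules: SLP = {f"A{i}": [f"A{i + 1}", f"A{i + 1}"] for i in range(n)}
    rules[f"A{n}"] = ["a"]
    return rules


def slp_length(rules: SLP, start: str) -> int:
    """Length of the produced word, computed without expanding it."""
    memo: Dict[str, int] = {}

    def length(sym: str) -> int:
        if sym not in rules:          # terminal symbol
            return 1
        if sym not in memo:
            memo[sym] = sum(length(s) for s in rules[sym])
        return memo[sym]

    return length(start)


def slp_expand(rules: SLP, start: str) -> str:
    """Fully expand the word; only sensible for small SLPs."""
    if start not in rules:
        return start
    return "".join(slp_expand(rules, s) for s in rules[start])
```

Here `slp_length(doubling_slp(50), "A0")` equals `2**50` although the grammar has only 51 rules, which is why the length is computed on the compressed form rather than by expansion.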
The results of this paper require SLP compression to avoid exponential blowup. SLPs are used to prevent exponential blowup in [13], where morphism equivalence on contextfree languages is decided in polynomial time.
The equivalence problem for sequential tree-to-word transducers can be reduced to the morphism equivalence problem for context-free languages [15]. This reduction relies on the fact that stws transform their subtrees in the same order. As ltws do not necessarily transform their subtrees in the same order, the result cannot be applied to ltws in general. However, if two ltws transform their subtrees in the same order, then the same reduction can be applied. To formalize that two ltws transform their subtrees in the same order we introduce the notion of state coreachability. Two states $q_1$ and $q_2$ of ltws $M_1$, $M_2$, respectively, are coreachable if there is an input tree such that the two states are assigned to the same node of the input tree in the translations of $M_1$, $M_2$, respectively.
Two ltws are sameordered if for each pair of coreachable states and for each symbol , neither nor have a rule for , or if and are rules of and , then .
If two ltws are sameordered the input trees can be reordered according to the order in the transformations. Therefore for each ltw a treetotree transducer is constructed that transforms the input tree according to the transformation in the ltw. Then all permutations in the ltws are replaced by the identity. Thus the ltws can be handled as stws and therefore the equivalence is decidable in polynomial time.
Theorem 1.
The equivalence of sameordered ltws is decidable in polynomial time.
3.1 Linear Earliest Normal Form
In this section we introduce the two key properties that are used to build a normal form for linear tree-to-word transducers, namely the earliest and eraseordered properties. The earliest property means that the output is produced as early as possible, i.e. the longest common prefix (resp. suffix) of $L_q$ is produced in the rule in which $q$ occurs, and as far to the left as possible. The eraseordered property means that all states that produce no output are ordered according to the input tree and pushed to the right in the rules.
An ltw is in earliest form if

each state is earliest, i.e. ,

and for each rule , for each , .
In [1, Lemma 9] it is shown that for each ltw an equivalent earliest ltw can be constructed in exponential time. Intuitively, if (resp. ) then is constructed with (resp. ) and is replaced by (resp. ). If and is a prefix of then we push through by constructing with and replace by .
Note that the construction to build the earliest form of an ltw creates a sameordered . Furthermore, if a state of and a state of are coreachable, then is an “earliest” version of , where some word was pushed out of the production of to make it earliest, and some word was pushed through the production of to ensure that the rules have the right property: there exists such that for all , .
Theorem 2.
For each ltw an equivalent sameordered and earliest ltw can be constructed in exponential time.
The exponential time complexity is caused by a potential exponential size increase in the number of states as it is shown in the following example.
We call a state $q$ that produces only the empty word, i.e. $L_q = \{\varepsilon\}$, an erasing state. As erasing states do not change the transformation and can occur at any position in a rule we need to fix their position for a normal form.
An ltw is eraseordered if for each rule in , if is erasing then for all , is erasing, and .
We test whether in polynomial time and then reorder a rule according to the eraseordered property. If an ltw is earliest it is still earliest after the reordering.
Lemma 1 (extended from [1, Lemma 18]).
For each (earliest) ltw an equivalent (earliest) eraseordered ltw can be constructed in polynomial time.
Example 2.
Consider the rule where translates trees of the form to , translates trees of the form to , translates trees of the form to . Thus the rule is not eraseordered. We reorder the rule to the equivalent and eraseordered rule .
If two equivalent ltws are earliest and eraseordered, then they are not necessarily sameordered. For example, the rule is equivalent to the rule in the above example but the two rules are not sameordered. However, in earliest and eraseordered ltws, we can characterize the differences in the orders of equivalent rules: Just as two words , satisfy the equation if and only if there is a word such that and , the only way for equivalent earliest and eraseordered ltws to not be sameordered is to switch periodic states.
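The word-combinatorics fact used in this paragraph (two words commute exactly when both are powers of a common word) can be checked exhaustively on short words. The sketch below is only a sanity check over a two-letter alphabet; the helper names are assumptions introduced here.

```python
from itertools import product
from typing import Optional


def primitive_root(w: str) -> str:
    """Smallest word u such that w is a power of u (empty for the empty word)."""
    n = len(w)
    for d in range(1, n + 1):
        if n % d == 0 and w[:d] * (n // d) == w:
            return w[:d]
    return ""


def commute(u: str, v: str) -> bool:
    """True if uv = vu."""
    return u + v == v + u


def common_root(u: str, v: str) -> Optional[str]:
    """A word w with u and v both powers of w, or None if there is none."""
    roots = {primitive_root(x) for x in (u, v) if x}
    if len(roots) > 1:
        return None
    return roots.pop() if roots else ""


def check_up_to(max_len: int, alphabet: str = "ab") -> bool:
    """Compare both characterizations on all word pairs up to max_len."""
    words = ["".join(p) for n in range(max_len + 1)
             for p in product(alphabet, repeat=n)]
    return all(commute(u, v) == (common_root(u, v) is not None)
               for u in words for v in words)
```

For example, `commute("abab", "ab")` holds and `common_root("abab", "ab")` is `"ab"`, whereas `"ab"` and `"ba"` neither commute nor share a root. This is the reason why only states that are periodic of the same period can be switched without changing the output.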
Theorem 3 ([1]).
Let and be two equivalent eraseordered and earliest ltws and , be two coreachable states in , , respectively. Let
and
be two rules for , . Then

for such that , all , , are periodic of the same period and all , ,

for such that , .
As the subtrees that are not sameordered in two equivalent earliest and eraseordered states are periodic of the same period the order of these can be changed without changing the semantics. Therefore the order of these subtrees can be fixed such that equivalent earliest and eraseordered ltws are sameordered. Then the equivalence is decidable in polynomial time, see Theorem 1. However, building the earliest form of an ltw is in exponential time.
To circumvent this difficulty, we will show that the first part of Theorem 3 still holds even on a partial normal form, where only quasiperiodic states are earliest and the longest common prefix of parts of rules with being quasiperiodic is the empty word.
Theorem 4.
Let and be two equivalent eraseordered ltws such that

all quasiperiodic states are earliest, i.e.

for each part of a rule where is quasiperiodic,
Let , be two coreachable states in , , respectively and
and
be two rules for , . Then for such that , all , , are periodic of the same period and all , .
4 Partial Normal Form
In this section we introduce a partial normal form for ltws that does not suffer from the exponential blowup of the earliest form. Inspired by Theorem 4, we wish to solve order differences by switching adjacent periodic states of the same period. Remember that the earliest form of a state is constructed by removing the longest common prefix (suffix) of to produce this prefix (suffix) earlier. It follows that all nonearliest states from which can be constructed following the earliest form are quasiperiodic.
We show that building the earliest form of a quasiperiodic state or a part of a rule with being quasiperiodic is in polynomial time. Therefore building the following partial normal form is in polynomial time.
Definition 1.
A linear treetoword transducer is in partial normal form if

all quasiperiodic states are earliest,

it is eraseordered and

for each rule if is quasiperiodic then is earliest and .
4.1 Eliminating Non-Earliest Quasi-Periodic States
In this part, we show a polynomial time algorithm to build an earliest form of a quasiperiodic state. From this, an equivalent ltw can be constructed in polynomial time such that every quasiperiodic state $q$ is earliest, i.e. $\mathrm{lcp}(L_q) = \mathrm{lcs}(L_q) = \varepsilon$. Additionally, we show that the presented algorithm can be adjusted to test if a state is quasiperiodic in polynomial time.
As quasiperiodicity on the left and on the right are symmetric properties we only consider quasiperiodic states of the form $L_q \subseteq u w^*$ (quasiperiodic on the left). The proofs in the case $L_q \subseteq w^* u$ are symmetric and therefore omitted here. At the end of this section we briefly discuss the introduced algorithms for the symmetric case $L_q \subseteq w^* u$.
To build the earliest form of a quasiperiodic state we use the property that each state accessible from a quasiperiodic state is also quasiperiodic. However, the periods can be shifted as the following example shows.
Example 3.
Consider states , and with rules , , , . State accepts trees of the form , , and produces the language , i.e. is quasiperiodic of period . State accepts trees of the form , , and produces the language , i.e. is quasiperiodic of period . State accepts trees of the form , and produces the language , i.e. is (quasi)periodic of period .
We introduce two definitions to measure the shift of periods. We denote by the from righttoleft shifted word of of shift , , i.e. where is the prefix of of size . If then with .
For two quasiperiodic states of period and , respectively, we denote the shift in their period by .
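The shift of a period can be pictured as a rotation of the word. The sketch below is a hedged reading of the definition: it assumes that shifting a word by n moves its prefix of size n to the end, and that a shift larger than the word length is reduced modulo that length. The function name `shift` is introduced here for illustration.

```python
def shift(u: str, n: int) -> str:
    """Rotate u by moving its prefix of size n (mod |u|) to the end."""
    if not u:
        return u
    n %= len(u)          # assumption: larger shifts wrap around
    return u[n:] + u[:n]


def push_left(prefix_len: int, u: str, k: int) -> bool:
    """Check that prefix . shift(u)^k equals u^k . prefix for one k."""
    prefix = u[:prefix_len % len(u)] if u else ""
    return prefix + shift(u, prefix_len) * k == u * k + prefix
```

Under this reading, `shift("abc", 1)` is `"bca"`, and `"a" + "bca" * 3` equals `"abc" * 3 + "a"`: a prefix of the period can be moved across a periodic language if the period is shifted accordingly.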
The size of the periods of a quasiperiodic state and the states accessible from this state can be computed from the size of the shortest words of the languages produced by these states.
Lemma 2.
If is quasiperiodic on the left with period , and accessible from , then is quasiperiodic with period or a shift of . Moreover we can calculate the shift in polynomial time.
We now use these shifts to build, for a state in that is quasiperiodic on the left, a transducer equivalent to where each occurrence of is replaced by its equivalent earliest form, i.e. a periodic state and the corresponding prefix.
Algorithm 1.
Let be a state in that is quasiperiodic on the left. starts with the same states, axiom, and rules as .

For each state accessible from , we add a copy to .

For each rule in with accessible from , we add a rule with in .

We delete state in and replace any occurrence of in a rule or the axiom of by .
Note that is equivalent to deleting the prefix of size from the word .
Intuitively, to build the earliest form of a state that is quasiperiodic on the left we need to push all words and all longest common prefixes of states on the righthand side of a rule of to the left. Pushing a word to the left through a state needs to shift the language produced by this state. We explain the algorithm in detail on state from Example 3.
Example 4.
Remember that produces the language and , accessible from produce languages and , respectively. Therefore , and . We start with state . As there is only one rule for the longest common prefix of and the longest common prefix of this rule are the same and therefore eliminated.
As there is only one rule for the argumentation is the same and we get . For the rule we calculate the longest common prefix of the righthand side that is larger than the longest common prefix of . Therefore we need to calculate the shift as is accessible from in rule and is accessible from in rule . This leads to the following rule.
As the longest common prefix of is the same as the longest common prefix of the righthand side of rule we get . The axiom of is .
Lemma 3.
Let be an ltw and be a state in that is quasiperiodic on the left. Let be constructed by Algorithm 1 and be a state in accessible from . Then and are equivalent and is earliest.
To replace all quasiperiodic states by their equivalent earliest form we need to know which states are quasiperiodic. Algorithm 1 can be modified to test an arbitrary state for quasiperiodicity on the left in polynomial time. The only difference to Algorithm 1 is that we do not know how to compute in polynomial time and does not exist. We therefore substitute by some smallest word of and we define a mockshift as follows

for all ,

if , we say , where is a shortest word of ,

if and then .
If several definitions of exist, we use the smallest. If is accessible from a quasiperiodic , then .
Algorithm 2.
Let be an ltw and be a state in . We build an ltw as follows.

For each state accessible from , we add a copy to .

The axiom is where is a shortest word of .

For each rule in with accessible from , we add a rule
in , where is constructed as follows.

We define , where is a shortest word of .

Then we remove from its prefix of size , where is a shortest word of . We obtain a word .

Finally, we set .

As the construction of Algorithms 1 and 2 are the same if the state is quasiperiodic, and are equivalent if is quasiperiodic. Moreover, is quasiperiodic if and are equivalent.
Lemma 4.
Let be a state of an ltw and be constructed by Algorithm 2. Then and are sameordered and is quasiperiodic on the left if and only if and is periodic.
As and are sameordered we can test the equivalence in polynomial time, cf. Theorem 1. Moreover testing a CFG for periodicity is in polynomial time and therefore testing a state for quasiperiodicity is in polynomial time.
Algorithm 2 can be applied to a part of a rule to test for quasiperiodicity on the left. In this case for each rule a rule is added to and each occurrence of the part in a rule of is replaced by . We then apply the above algorithm to and test and for equivalence and for periodicity.
Example 5.
Let be a state with the rules , . Thus, transforms trees of the form , to . We use Algorithm 2 to test for quasiperiodicity on the left. As explained above we introduce a state with the rules , . We now apply Algorithm 2 on . We build as follows. The axiom is as the shortest word of is . For the rule we build as is the shortest word of . Then we obtain and . Thus we get . For the rule we build and obtain as the shortest word of is . Thus we get .
transforms trees of the form to and transforms trees of the form to . Thus, they are equivalent. Additionally is periodic with period . It follows that is quasiperiodic.
We introduced algorithms to test states for quasiperiodicity on the left and to build the earliest form for such states. These two algorithms can be adapted for states that are quasiperiodic on the right. There are two main differences. First, as the handle is on the right the shortest word of a language that is quasiperiodic on the right is . Second, instead of pushing words through a periodic language to the left we need to push words through a periodic language to the right.
Hence, we can test each state of an ltw for quasiperiodicity on the left and right. If the state is quasiperiodic we replace it by its earliest form. Algorithms 1 and 2 run in polynomial time if SLPs are used. This is crucial as the shortest word of a CFG can be of exponential size, cf. Example 1. However, the operations that are needed in the algorithms, namely constructing the shortest word of a CFG and removing the prefix or suffix of a word, can be performed in polynomial time using SLPs, cf. [11].
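To illustrate why the shortest word of a CFG can be handled without writing it out, the following sketch computes only the length of a shortest word for every nonterminal by a fixpoint iteration. It is an illustration under assumptions, not the procedure of [11]: grammars are encoded as a dictionary from nonterminals to lists of right-hand sides, every symbol that is not a key counts as a terminal of length 1, and the name `shortest_lengths` is introduced here.

```python
import math
from typing import Dict, List, Union

Grammar = Dict[str, List[List[str]]]   # nonterminal -> alternative right-hand sides


def shortest_lengths(rules: Grammar) -> Dict[str, Union[int, float]]:
    """Length of a shortest word per nonterminal (math.inf if none exists)."""
    best: Dict[str, Union[int, float]] = {a: math.inf for a in rules}
    changed = True
    while changed:
        changed = False
        for a, alternatives in rules.items():
            for rhs in alternatives:
                # Terminals contribute 1, nonterminals their current best value.
                total = sum(best.get(s, 1) for s in rhs)
                if total < best[a]:
                    best[a] = total
                    changed = True
    return best


# Hypothetical grammar whose shortest word has length 2**60.
CHAIN: Grammar = {f"A{i}": [[f"A{i + 1}", f"A{i + 1}"]] for i in range(60)}
CHAIN["A60"] = [["a"], ["A60", "a"]]
```

On the hypothetical grammar `CHAIN`, `shortest_lengths(CHAIN)["A0"]` is `2**60`. Keeping, for each nonterminal, one rule that attains its minimum turns the grammar into an SLP for a shortest word, so the word itself never has to be expanded.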
Theorem 5.
Let be an ltw. Then an equivalent ltw where all quasiperiodic states are earliest can be constructed in polynomial time.
4.2 Switching Periodic States
In this part we obtain the partial normal form by ordering periodic states of an eraseordered transducer where all quasiperiodic states are earliest. Ordering means that if the order of the subtrees in the translation can differ, we choose the one similar to the input, i.e. if and are equivalent, we choose the second order. We already showed how we can build a transducer where each quasiperiodic state is earliest and therefore periodic. However, we need to make parts of rules earliest such that periodic states can be switched as the following example shows.
Example 6.
Consider the rule where , have the rules , , , . States and are earliest and periodic but not of the same period as a subword is produced in between. We replace the nonearliest and quasiperiodic part by their earliest form. This leads to with , . Hence, and are earliest and periodic of the same period and can be switched in the rule.
To build the earliest form of a quasiperiodic part of a rule each occurrence of this part is replaced by a state and for each rule a rule is added. Then we apply Algorithm 1 on to replace and therefore by their earliest form. Iteratively this leads to the following theorem.
Theorem 6.
For each ltw where all quasiperiodic states are earliest we can build in polynomial time an equivalent ltw such that each part of a rule in where is quasiperiodic is earliest.
In Theorem 4 we showed that order differences in equivalent eraseordered ltws where all quasiperiodic states are earliest and all parts of rules are earliest are caused by adjacent periodic states. As these states are periodic of the same period and no words are produced in between these states can be reordered without changing the semantics of the ltws.
Lemma 5.
Let be an ltw such that

is eraseordered,

all quasiperiodic states in are earliest and

each in a rule of that is quasiperiodic is earliest.
Then we can reorder adjacent periodic states of the same period in the rules of such that in polynomial time. The reordering does not change the transformation of .
We showed before how to construct a transducer with the preconditions needed in Lemma 5 in polynomial time. Note that replacing a quasiperiodic state by its earliest form can break the eraseordered property. Thus we need to replace all quasiperiodic states by their earliest form before building the eraseordered form of a transducer. Then Lemma 5 is the last step to obtain the partial normal form for an ltw.
Theorem 7.
For each ltw we can construct an equivalent ltw that is in partial normal form in polynomial time.
4.3 Testing Equivalence in Polynomial Time
It remains to show that the equivalence problem of ltws in partial normal form is decidable in polynomial time. The key idea is that two equivalent ltws in partial normal form are sameordered.
Consider two equivalent ltws , where all quasiperiodic states and all parts of rules with is quasiperiodic are earliest. In Theorem 4 we showed if the orders , of two coreachable states , of , , respectively, for the same input differ then the states causing this order differences are periodic with the same period. The partial normal form solves this order differences such that the transducers are sameordered.
Lemma 6.
If and are equivalent and in partial normal form then they are sameordered.
As the equivalence of sameordered ltws is decidable in polynomial time (cf. Theorem 1) we conclude the following.
Corollary 1.
The equivalence problem for ltws in partial normal form is decidable in polynomial time.
To summarize, the following steps run in polynomial time and transform an ltw into its partial normal form.

Test each state for quasiperiodicity. If it is quasiperiodic replace the state by its earliest form.

Build the equivalent eraseordered transducer.

Test each part in each rule from right to left for quasiperiodicity on the left. If it is quasiperiodic on the left replace the part by its earliest form.

Order adjacent periodic states of the same period according to the input order.
This leads to our main theorem.
Theorem 8.
The equivalence of ltws is decidable in polynomial time.
5 Conclusion
The equivalence problem for linear tree-to-word transducers can be decided in polynomial time. To prove this we used a reduction to the equivalence problem between sequential transducers [7], or more exactly, to an extension of this result to sameordered transducers. This reduction hinges on two points. First, we showed that the only structural differences between two equivalent earliest linear transducers are caused by periodic languages which are interchangeable. The structural characteristic of periodic languages has been used in the normalization of stws [7]. Second, we showed that although building a fully earliest transducer is potentially exponential, our reduction only requires quasiperiodic states to be earliest, which can be done in polynomial time. The use of the equivalence problem for morphisms on a CFG [13] and of properties on straight-line programs [10] is essential here as it was in [7, 8]. This leads to further research questions, starting with the generalization of this result to all tree-to-word transducers. Furthermore, can these techniques be used to decrease the complexity of some problems in other classes of transducers, such as top-down tree-to-tree transducers, where the equivalence problem is known to be between EXPTIME-hard and NEXPTIME?
Appendix A Proof of Theorem 4
Theorem.
Let and be two equivalent eraseordered ltws such that

all quasiperiodic states are earliest, i.e.

for each part of a rule where is quasiperiodic,
Let , be two coreachable states in , , respectively and
and
be two rules for , . Then for such that , all , , are periodic of the same period and all , .
Proof.
Let and be the equivalent earliest transducer of and , respectively, such that and as well as and are sameordered (cf. Theorem 2).
Suppose there exists coreachable (and thus equivalent) states and in and , respectively, with rules
such that .
Let be the first index such that . Following Theorem 3, we have such that and and all , are periodic with the same period.
Let and be the states in and , respectively, from which the coreachable states and were constructed with the earliest construction proposed by [7]. From the earliest construction it follows that and are coreachable. Since the construction preserves the rule structure, we have:
The earliest construction gives us that for all , for some . This means that if is periodic, then is quasi periodic in its nonearliest form. The same is true for all .
However, the first property we supposed of and implies that all those and that are quasiperiodic are not only quasi periodic, but periodic. Consider a part of the rule that is periodic in the earliest form and therefore quasiperiodic in the nonearliest form. The first condition gives us that are periodic. However, then the words are not necessarily empty. As the part is quasiperiodic we know that each part , is quasiperiodic. Then the second condition of this theorem guarantees that the parts ,