# Deciding Equivalence of Linear Tree-to-Word Transducers in Polynomial Time

## Abstract

We show that the equivalence of deterministic linear top-down tree-to-word transducers is decidable in polynomial time. Linear tree-to-word transducers are non-copying but not necessarily order-preserving and can be used to express XML and other document transformations. The result is based on a partial normal form that provides a basic characterization of the languages produced by linear tree-to-word transducers.

## 1 Introduction

Tree transformations are widely used in functional programming and document processing. Tree transducers are a general model for transforming structured data like a database in a structured or even unstructured way. Consider the following internal representation of a client database that should be transformed to a table in HTML.

Top-down tree transducers can be seen as functional programs that transform trees from the root to the leaves with finite memory. Transformations where the output is not produced in a structured way or where, for example, the output is a string, can be modeled by tree-to-word transducers.

In this paper, we study deterministic linear tree-to-word transducers (ltws), a subset of deterministic tree-to-word transducers that are non-copying, but not necessarily order-preserving. Processing the subtrees in an arbitrary order is important to avoid reordering of the internal data for different use cases. In the example of the client database the names may be needed in different formats, e.g.

<salutation> <name> <surname>
<surname>, <name>
<title> <surname>
<title> <surname>, <name>

The equivalence of unrestricted tree-to-word transducers was a long-standing open problem that was recently shown to be decidable [14]. The algorithm of [14] yields a co-randomized polynomial algorithm for linear transducers. We show that the equivalence of ltws is decidable in polynomial time and provide a partial normal form.

To decide equivalence of ltws, we start in Section 3 by extending the methods used for sequential (linear and order-preserving) tree-to-word transducers (stws), discussed in [15]. The equivalence for these transducers is decidable in polynomial time [15]. Moreover, a normal form for sequential and linear tree-to-word transducers, computable in exponential time, is known [7, 1]. Two equivalent ltws do not necessarily transform their trees in the same order. However, the differences that can occur are quite specific and characterized in [1]. We show how they can be identified. We use the notion of earliest states, inspired by the existing notion of earliest sequential transducers [7]. In this earliest form, two equivalent ltws can transform subtrees in different orders only if they fulfill specific properties pertaining to the periodicity of the words they create. Computing this normal form takes exponential time, as the number of states may increase exponentially. To avoid this size increase we do not compute these earliest transducers fully, but rather locally. This means we transform two ltws with different orders to a partial normal form in polynomial time (see Section 4), in which the order of their transformation of the different subtrees is the same. Two ltws that transform the subtrees of the input in the same order can be reduced to sequential tree-to-word transducers, as the input trees can be reordered according to the order in the transformation.

A short version of this paper will be published in the proceedings of the 20th International Conference on Developments in Language Theory (DLT 2016).

Related Work. Other classes of transducers, such as tree-to-tree transducers [5], macro tree transducers [6] or nested-word-to-word transducers [15], have been studied. Many results for tree-to-tree transducers are known, e.g. deciding equivalence [12], minimization algorithms [12] and Gold-style learning algorithms [9]. In contrast, transformations where the output is not generated in a structured way like a tree are less well understood. In macro-tree transducers, the decidability of equivalence is a well-known and long-standing question [2]. However, the equivalence of linear size increase macro-tree transducers that are equivalent to MSO definable transducers is decidable [3, 4].

## 2 Preliminaries

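Let $\Sigma$ be a ranked alphabet with $\Sigma^{(n)}$ the symbols of rank $n$. Trees on $\Sigma$ ($\mathcal{T}_\Sigma$) are defined inductively: if $f \in \Sigma^{(n)}$, and $t_1, \dots, t_n \in \mathcal{T}_\Sigma$, then $f(t_1, \dots, t_n) \in \mathcal{T}_\Sigma$ is a tree. Let $\Delta$ be an alphabet. An element $w \in \Delta^*$ is a word. For two words $u, v$ we denote the concatenation of these two words by $uv$. The length of a word $w$ is denoted by $|w|$. We call $\varepsilon$ the empty word. We denote $a^{-1}$ the inverse of a symbol $a$ where $a a^{-1} = a^{-1} a = \varepsilon$. The inverse of a word $w = u_1 \dots u_n$ is $w^{-1} = u_n^{-1} \dots u_1^{-1}$.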

A context-free grammar (CFG) is defined as a tuple $G = (\Delta, N, S, P)$, where $\Delta$ is the alphabet of $G$, $N$ is a finite set of non-terminal symbols, $S \in N$ is the initial non-terminal of $G$, $P$ is a finite set of rules of form $A \to w$, where $A \in N$ and $w \in (\Delta \cup N)^*$. A CFG is deterministic if each non-terminal has at most one rule.

We define the language $L_G(A)$ of a non-terminal $A$ recursively: if $A \to w_0 A_1 w_1 \dots A_n w_n$ is a rule of $P$, with $w_i$ words of $\Delta^*$ and $A_i$ non-terminals of $N$, and $u_i$ a word of $L_G(A_i)$, then $w_0 u_1 w_1 \dots u_n w_n$ is a word of $L_G(A)$. We define the context-free language $L_G$ of a context-free grammar $G$ as $L_G = L_G(S)$.

A straight-line program (SLP) is a deterministic CFG that produces exactly one word. The word produced by an SLP $G$ is called $w_G$.

We denote the longest common prefix of all words of a language $L$ by $\mathrm{lcp}(L)$. Its longest common suffix is $\mathrm{lcs}(L)$.
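As a small illustration of the two operators just introduced, here is a minimal sketch for finite languages given as Python collections of strings. The function names `lcp` and `lcs` are chosen for this example only; the paper applies these notions to (possibly infinite) context-free languages, which this sketch does not handle.

```python
def lcp(words):
    """Longest common prefix of a finite collection of words."""
    words = list(words)
    if not words:
        return ""
    shortest = min(words, key=len)
    for i, ch in enumerate(shortest):
        # Stop at the first position where some word disagrees.
        if any(w[i] != ch for w in words):
            return shortest[:i]
    return shortest


def lcs(words):
    """Longest common suffix, computed as the reversed lcp of the reversed words."""
    return lcp([w[::-1] for w in words])[::-1]
```

For instance, `lcp(["abab", "abc"])` yields `"ab"`, and `lcs(["xab", "yab"])` yields `"ab"`.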

A word $u$ is said to be periodic of period $w$ if $w$ is the smallest word such that $u \in w^*$. A language $L$ is said to be periodic of period $w$ if $w$ is the smallest word such that $L \subseteq w^*$.

A language $L$ is quasi-periodic on the left (resp. on the right) of handle $u$ and period $w$ if $w$ is the smallest word such that $L \subseteq u w^*$ (resp. if $L \subseteq w^* u$). A language is quasi-periodic if it is quasi-periodic on the right or left. If $L$ is a singleton or empty, it is quasi-periodic of period $\varepsilon$. A language $L$ is periodic if and only if it is quasi-periodic on the left and on the right with handle $\varepsilon$. If $L$ is quasi-periodic on the left (resp. right) then $\mathrm{lcp}(L)$ (resp. $\mathrm{lcs}(L)$) is the shortest word of $L$.
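The definitions of periodicity and quasi-periodicity on the left can be made concrete for finite languages. The following is a sketch under that assumption; the helper names `primitive_root`, `period` and `quasi_period_left` are hypothetical. It takes the shortest word of the language as the handle, in line with the remark that the shortest word of a language that is quasi-periodic on the left is its longest common prefix.

```python
def primitive_root(word):
    """Smallest word p such that word is a power of p (empty for the empty word)."""
    n = len(word)
    for d in range(1, n + 1):
        if n % d == 0 and word[:d] * (n // d) == word:
            return word[:d]
    return ""


def period(words):
    """Period of a finite language, or None if the language is not periodic."""
    roots = {primitive_root(w) for w in words if w}
    if len(roots) > 1:
        return None
    return roots.pop() if roots else ""


def quasi_period_left(words):
    """Return (handle, period) if the finite language is quasi-periodic on the left, else None."""
    words = list(words)
    if not words:
        return ("", "")
    handle = min(words, key=len)
    if not all(w.startswith(handle) for w in words):
        return None
    # What remains after the handle must be a periodic language.
    p = period(w[len(handle):] for w in words)
    return None if p is None else (handle, p)
```

For example, `quasi_period_left({"cab", "cabab"})` returns `("cab", "ab")`, while `{"ca", "cb"}` is rejected because no single word is a prefix of all others.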

## 3 Linear Tree-to-Word Transducers

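A linear tree-to-word transducer (ltw) is a tuple $M = (\Sigma, \Delta, Q, \mathrm{ax}, \delta)$ where

• $\Sigma$ is a ranked alphabet,

• $\Delta$ is an alphabet of output symbols,

• $Q$ is a finite set of states,

• the axiom $\mathrm{ax}$ is of the form $u_0\, q(x)\, u_1$, where $q \in Q$ and $u_0, u_1 \in \Delta^*$,

• $\delta$ is a set of rules of the form $q, f \to u_0\, q_1(x_{\sigma(1)}) \dots q_n(x_{\sigma(n)})\, u_n$ where $q, q_1, \dots, q_n \in Q$, $f \in \Sigma$ of rank $n$, and $\sigma$ is a permutation from $\{1, \dots, n\}$ to $\{1, \dots, n\}$. There is at most one rule per pair $q, f$.

The partial function $[\![M]\!]_q$ of a state $q$ on an input tree $f(t_1, \dots, t_n)$ is defined inductively as

• $u_0\, [\![M]\!]_{q_1}(t_{\sigma(1)}) \dots [\![M]\!]_{q_n}(t_{\sigma(n)})\, u_n$, if $q, f \to u_0\, q_1(x_{\sigma(1)}) \dots q_n(x_{\sigma(n)})\, u_n \in \delta$,

• undefined, if $q, f$ is not defined in $\delta$.

The partial function $[\![M]\!]$ of an ltw $M$ with axiom $u_0\, q(x)\, u_1$ on an input tree $t$ is defined as $[\![M]\!](t) = u_0\, [\![M]\!]_q(t)\, u_1$.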
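The inductive semantics above can be sketched as a small interpreter. The encoding is an assumption made for this example: a tree is a tuple `(symbol, child_1, ..., child_n)`, a rule maps a pair `(state, symbol)` to `(u0, [(q_i, child_index, u_i), ...])` with 0-based child indices playing the role of the permutation, and `None` stands for "undefined". The state and symbol names are hypothetical, and linearity of the rules is assumed rather than checked.

```python
def run_state(rules, state, tree):
    """Output of `state` on `tree`, or None if no rule applies somewhere."""
    symbol, children = tree[0], tree[1:]
    rule = rules.get((state, symbol))
    if rule is None:
        return None
    u0, calls = rule
    out = [u0]
    for q, child, u in calls:
        sub = run_state(rules, q, children[child])
        if sub is None:
            return None
        out.append(sub)
        out.append(u)
    return "".join(out)


def run_ltw(axiom, rules, tree):
    """Apply the axiom u0 q(x) u1 to the whole input tree."""
    u0, q, u1 = axiom
    body = run_state(rules, q, tree)
    return None if body is None else u0 + body + u1


# Hypothetical transducer printing "<surname>, <name>" for a client node.
rules = {
    ("q", "client"): ("", [("qs", 1, ", "), ("qn", 0, "")]),
    ("qn", "Ada"): ("Ada", []),
    ("qs", "Lovelace"): ("Lovelace", []),
}
tree = ("client", ("Ada",), ("Lovelace",))
result = run_ltw(("<", "q", ">"), rules, tree)
```

The rule for `("q", "client")` visits the second child before the first, which is exactly the kind of reordering that distinguishes ltws from stws.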

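Two ltws $M$ and $M'$ are equivalent if $[\![M]\!] = [\![M']\!]$.

A sequential tree-to-word transducer (stw) is an ltw where for each rule of the form $q, f \to u_0\, q_1(x_{\sigma(1)}) \dots q_n(x_{\sigma(n)})\, u_n$, $\sigma$ is the identity on $\{1, \dots, n\}$.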

We define accessibility of states as the transitive and reflexive closure of appearance in a rule. This means state $q$ is accessible from itself, and if $q, f \to u_0\, q_1(x_{\sigma(1)}) \dots q_n(x_{\sigma(n)})\, u_n \in \delta$, and $q$ is accessible from $q'$, then all states $q_i$, $1 \le i \le n$, are accessible from $q'$.

We denote by $\mathrm{dom}(M)$ (resp. $\mathrm{dom}(q)$) the domain of an ltw $M$ (resp. a state $q$), i.e. all trees $t \in \mathcal{T}_\Sigma$ such that $[\![M]\!](t)$ is defined (resp. $[\![M]\!]_q(t)$). We only consider ltws with non-empty domains and assume w.l.o.g. that no state in an ltw has an empty domain by eliminating transitions using states with empty domain.

We denote by $L_M$ (resp. $L_q$) the range of $[\![M]\!]$ (resp. $[\![M]\!]_q$), i.e. the set of all images $[\![M]\!](t)$ (resp. $[\![M]\!]_q(t)$). The languages $L_M$ and $L_q$ for each $q \in Q$ are all context-free languages. We call a state $q$ (quasi-)periodic if $L_q$ is (quasi-)periodic.

Note that a word in a rule of an ltw can be represented by an SLP without changing the semantics of the ltw. Therefore a set of SLPs can be added to the transducer and a word on the right-hand side of a rule can be represented by an SLP. The decidability of equivalence of stws in polynomial time still holds true with the use of SLPs. The advantage of SLPs is that they may compress the size of a word as the following example shows.

###### Example 1.

We define an SLP , where is a set , the initial non-terminal is , and is the set of rules , , , , and . This SLP produces the word . has non-terminals and rules. Thus, produces a word that is exponential in the size of .
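The compression effect can be reproduced with a hypothetical doubling SLP, which is not necessarily the grammar of Example 1 since its concrete rules are not legible in this copy. In the sketch below an SLP is a dictionary mapping each non-terminal to its single right-hand side; any symbol that is not a key is treated as a terminal of length one. The length of the produced word is computed without expanding it, which is the kind of operation that keeps SLP-based algorithms polynomial.

```python
def doubling_slp(n):
    """Rules A0 -> a and Ai -> A(i-1) A(i-1) for 1 <= i <= n (illustrative only)."""
    rules = {"A0": ["a"]}
    for i in range(1, n + 1):
        rules[f"A{i}"] = [f"A{i - 1}", f"A{i - 1}"]
    return rules


def slp_word(rules, symbol):
    """Fully expand the word produced from `symbol` (exponential in general)."""
    if symbol not in rules:
        return symbol
    return "".join(slp_word(rules, s) for s in rules[symbol])


def slp_length(rules, start):
    """Length of the produced word, computed with one pass per non-terminal."""
    memo = {}

    def length(symbol):
        if symbol not in rules:
            return 1  # terminal symbol
        if symbol not in memo:
            memo[symbol] = sum(length(s) for s in rules[symbol])
        return memo[symbol]

    return length(start)
```

With `n + 1` non-terminals this grammar produces a word of length `2**n`, so `slp_length(doubling_slp(60), "A60")` is feasible while expanding the word is not.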

The results of this paper require SLP compression to avoid exponential blow-up. SLPs are used to prevent exponential blow-up in [13], where morphism equivalence on context-free languages is decided in polynomial time.

The equivalence problem for sequential tree-to-word transducers can be reduced to the morphism equivalence problem for context-free languages [15]. This reduction relies on the fact that stws transform their subtrees in the same order. As ltws do not necessarily transform their subtrees in the same order, the result cannot be applied to ltws in general. However, if two ltws transform their subtrees in the same order, then the same reduction can be applied. To formalize that two ltws transform their subtrees in the same order we introduce the notion of state co-reachability. Two states and of ltws , , respectively, are co-reachable if there is an input tree such that the two states are assigned to the same node of the input tree in the translations of , , respectively.

Two ltws are same-ordered if for each pair of co-reachable states and for each symbol , neither nor have a rule for , or if and are rules of and , then .

If two ltws are same-ordered the input trees can be reordered according to the order in the transformations. Therefore for each ltw a tree-to-tree transducer is constructed that transforms the input tree according to the transformation in the ltw. Then all permutations in the ltws are replaced by the identity. Thus the ltws can be handled as stws and therefore the equivalence is decidable in polynomial time.
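The reordering argument can be sketched as follows, reusing a hypothetical encoding in which a tree is a tuple `(symbol, child_1, ..., child_n)` and a rule maps `(state, symbol)` to `(u0, [(q_i, child_index, u_i), ...])`. The function `reorder` plays the role of the tree-to-tree transducer and `make_sequential` replaces every permutation by the identity. The sketch assumes the input tree is in the domain of the state. Because the reordering depends on the state, it serves two transducers at once only when they are same-ordered.

```python
def run_state(rules, state, tree):
    """Output of `state` on `tree` (the tree is assumed to be in the domain)."""
    symbol, children = tree[0], tree[1:]
    u0, calls = rules[(state, symbol)]
    return u0 + "".join(run_state(rules, q, children[c]) + u for q, c, u in calls)


def reorder(rules, state, tree):
    """Permute the children of each node into the order in which they are visited."""
    symbol, children = tree[0], tree[1:]
    _, calls = rules[(state, symbol)]
    return (symbol,) + tuple(reorder(rules, q, children[c]) for q, c, _ in calls)


def make_sequential(rules):
    """Same rules, with the i-th state call reading the i-th child."""
    return {
        key: (u0, [(q, i, u) for i, (q, _, u) in enumerate(calls)])
        for key, (u0, calls) in rules.items()
    }


# Hypothetical ltw that reads the second child first.
rules = {
    ("q", "client"): ("", [("qs", 1, ", "), ("qn", 0, "")]),
    ("qn", "Ada"): ("Ada", []),
    ("qs", "Lovelace"): ("Lovelace", []),
}
tree = ("client", ("Ada",), ("Lovelace",))
original = run_state(rules, "q", tree)
sequential = run_state(make_sequential(rules), "q", reorder(rules, "q", tree))
```

Both `original` and `sequential` denote the same output word, obtained once with the permutation and once on the reordered tree with identity permutations.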

###### Theorem 1.

The equivalence of same-ordered ltws is decidable in polynomial time.

### 3.1 Linear Earliest Normal Form

In this section we introduce the two key properties that are used to build a normal form for linear tree-to-word transducers, namely the earliest and erase-ordered properties. The earliest property means that the output is produced as early as possible, i.e. the longest common prefix (resp. suffix) of is produced in the rule in which occurs, and as left as possible. The erase-ordered property means that all states that produce no output are ordered according to the input tree and pushed to the right in the rules.

An ltw is in earliest form if

• each state is earliest, i.e. ,

• and for each rule , for each , .

In [1, Lemma 9] it is shown that for each ltw an equivalent earliest ltw can be constructed in exponential time. Intuitively, if (resp. ) then is constructed with (resp. ) and is replaced by (resp. ). If and is a prefix of then we push through by constructing with and replace by .

Note that the construction to build the earliest form of an ltw creates a same-ordered . Furthermore, if a state of and a state of are co-reachable, then is an “earliest” version of , where some word was pushed out of the production of to make it earliest, and some word was pushed through the production of to ensure that the rules have the right property: there exists such that for all , .

###### Theorem 2.

For each ltw an equivalent same-ordered and earliest ltw can be constructed in exponential time.

The exponential time complexity is caused by a potential exponential size increase in the number of states as it is shown in the following example.

We call a state that produces only the empty word, i.e. , an erasing state. As erasing states do not change the transformation and can occur at any position in a rule we need to fix their position for a normal form.

An ltw is erase-ordered if for each rule  in , if is erasing then for all , is erasing, and .

We test whether in polynomial time and then reorder a rule according to the erase-ordered property. If an ltw is earliest it is still earliest after the reordering.
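The reordering step can be sketched as follows, assuming the set of erasing states has already been computed (the polynomial test itself is not reproduced here). A rule right-hand side is encoded, hypothetically, as `(u0, [(q_i, child_index, u_i), ...])`. Erasing calls are moved to the right and sorted by input position; the word that followed an erasing call is attached to whatever precedes it, so the output is unchanged.

```python
def erase_order(rule, erasing):
    """Return an equivalent right-hand side with erasing states pushed to the right."""
    u0, calls = rule
    prefix = u0
    kept, erased = [], []
    for q, child, u in calls:
        if q in erasing:
            erased.append((q, child))
            # The erasing state outputs only the empty word, so its trailing
            # word u can be glued onto the preceding item.
            if kept:
                prev_q, prev_child, prev_u = kept[-1]
                kept[-1] = (prev_q, prev_child, prev_u + u)
            else:
                prefix += u
        else:
            kept.append((q, child, u))
    # Erasing states are ordered according to the input tree.
    erased.sort(key=lambda call: call[1])
    return prefix, kept + [(q, child, "") for q, child in erased]


rule = ("a", [("e2", 1, "b"), ("q", 0, "c"), ("e1", 2, "d")])
reordered = erase_order(rule, erasing={"e1", "e2"})
```

Here the two erasing states end up after `q`, in the order of the children they read, and the words `a`, `b`, `c`, `d` keep their relative order in the output.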

###### Lemma 1 (extended from [1, Lemma 18]).

For each (earliest) ltw an equivalent (earliest) erase-ordered ltw can be constructed in polynomial time.

###### Example 2.

Consider the rule where translates trees of the form to , translates trees of the form to , translates trees of the form to . Thus the rule is not erase-ordered. We reorder the rule to the equivalent and erase-ordered rule .

If two equivalent ltws are earliest and erase-ordered, then they are not necessarily same-ordered. For example, the rule is equivalent to the rule in the above example but the two rules are not same-ordered. However, in earliest and erase-ordered ltws, we can characterize the differences in the orders of equivalent rules: Just as two words , satisfy the equation if and only if there is a word such that and , the only way for equivalent earliest and erase-ordered ltws to not be same-ordered is to switch periodic states.
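The word-combinatorics fact invoked here (two words commute exactly when they are powers of a common word) can be checked exhaustively on short words. The sketch below is only a sanity check over the alphabet `{a, b}`, not a proof; the helper names are chosen for this example.

```python
from itertools import product


def primitive_root(word):
    """Smallest word p such that word is a power of p (empty for the empty word)."""
    n = len(word)
    for d in range(1, n + 1):
        if n % d == 0 and word[:d] * (n // d) == word:
            return word[:d]
    return ""


def powers_of_common_word(u, v):
    """True if some word w satisfies u in w* and v in w*."""
    return u == "" or v == "" or primitive_root(u) == primitive_root(v)


def check_up_to(max_len, alphabet="ab"):
    """Compare uv == vu with the common-root criterion on all short words."""
    words = [
        "".join(letters)
        for n in range(max_len + 1)
        for letters in product(alphabet, repeat=n)
    ]
    return all((u + v == v + u) == powers_of_common_word(u, v) for u in words for v in words)
```

Calling `check_up_to(4)` compares both sides of the equivalence on every pair of words of length at most four.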

###### Theorem 3 ([1]).

Let and be two equivalent erase-ordered and earliest ltws and , be two co-reachable states in , , respectively. Let

and

be two rules for , . Then

• for such that , all , , are periodic of the same period and all , ,

• for such that , .

As the subtrees that are not same-ordered in two equivalent earliest and erase-ordered states are periodic of the same period the order of these can be changed without changing the semantics. Therefore the order of these subtrees can be fixed such that equivalent earliest and erase-ordered ltws are same-ordered. Then the equivalence is decidable in polynomial time, see Theorem 1. However, building the earliest form of an ltw is in exponential time.

To circumvent this difficulty, we will show that the first part of Theorem 3 still holds even on a partial normal form, where only quasi-periodic states are earliest and the longest common prefix of parts of rules with being quasi-periodic is the empty word.

###### Theorem 4.

Let and be two equivalent erase-ordered ltws such that

• all quasi-periodic states are earliest, i.e.

• for each part of a rule where is quasi-periodic,

Let , be two co-reachable states in , , respectively and

and

be two rules for , . Then for such that , all , , are periodic of the same period and all , .

## 4 Partial Normal Form

In this section we introduce a partial normal form for ltws that does not suffer from the exponential blow-up of the earliest form. Inspired by Theorem 4, we wish to solve order differences by switching adjacent periodic states of the same period. Remember that the earliest form of a state is constructed by removing the longest common prefix (suffix) of to produce this prefix (suffix) earlier. It follows that all non-earliest states from which can be constructed following the earliest form are quasi-periodic.

We show that building the earliest form of a quasi-periodic state or a part of a rule with being quasi-periodic is in polynomial time. Therefore building the following partial normal form is in polynomial time.

###### Definition 1.

A linear tree-to-word transducer is in partial normal form if

1. all quasi-periodic states are earliest,

2. it is erase-ordered and

3. for each rule  if is quasi-periodic then is earliest and .

### 4.1 Eliminating Non-Earliest Quasi-Periodic States

In this part, we show a polynomial time algorithm to build an earliest form of a quasi-periodic state, from which an equivalent ltw can be constructed in polynomial time such that any quasi-periodic state is earliest, i.e. . Additionally, we show that the presented algorithm can be adjusted to test if a state is quasi-periodic in polynomial time.

As quasi-periodicity on the left and on the right are symmetric properties we only consider quasi-periodic states of the form (quasi-periodic on the left). The proofs in the case are symmetric and therefore omitted here. At the end of this section we briefly discuss the introduced algorithms for the symmetric case .

To build the earliest form of a quasi-periodic state we use the property that each state accessible from a quasi-periodic state is as well quasi-periodic. However, the periods can be shifted as the following example shows.

###### Example 3.

Consider states , and with rules , , , . State accepts trees of the form , , and produces the language , i.e.  is quasi-periodic of period . State accepts trees of the form , , and produces the language , i.e.  is quasi-periodic of period . State accepts trees of the form , and produces the language , i.e.  is (quasi-)periodic of period .

We introduce two definitions to measure the shift of periods. We denote by the from right-to-left shifted word of of shift , , i.e.  where is the prefix of of size . If then with .
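Since the formula for the shifted word is not legible in this copy, the following sketch rests on an assumption: the shift of size n moves the prefix of size n of the word to its end (a left rotation), with n taken modulo the length of the word. The function name `shift` is hypothetical. Under this reading, pushing a prefix of a power of the period to the left through a periodic word replaces the period by its shift.

```python
def shift(word, n):
    """Rotate `word` by moving its prefix of size n (mod len(word)) to the end."""
    if not word:
        return word
    n %= len(word)
    return word[n:] + word[:n]


# Pushing u = "ab" to the left through (abc)^2 turns the period into its shift.
period = "abc"
u = "ab"
left_side = period * 2 + u
right_side = u + shift(period, len(u)) * 2
```

In this example both `left_side` and `right_side` are the word `abcabcab`, which is the identity used when a word is moved across a periodic language.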

For two quasi-periodic states of period and , respectively, we denote the shift in their period by .

The size of the periods of a quasi-periodic state and the states accessible from this state can be computed from the size of the shortest words of the languages produced by these states.

###### Lemma 2.

If is quasi-periodic on the left with period , and accessible from , then is quasi-periodic with period or a shift of . Moreover we can calculate the shift in polynomial time.

We now use these shifts to build, for a state in that is quasi-periodic on the left, a transducer equivalent to where each occurrence of is replaced by its equivalent earliest form, i.e. a periodic state and the corresponding prefix.

###### Algorithm 1.

Let be a state in that is quasi-periodic on the left. starts with the same states, axiom, and rules as .

• For each state accessible from , we add a copy to .

• For each rule in with accessible from , we add a rule with in .

• We delete state in and replace any occurrence of in a rule or the axiom of by .

Note that is equivalent to deleting the prefix of size from the word .

Intuitively, to build the earliest form of a state that is quasi-periodic on the left we need to push all words and all longest common prefixes of states on the right-hand side of a rule of to the left. Pushing a word to the left through a state needs to shift the language produced by this state. We explain the algorithm in detail on state from Example 3.

###### Example 4.

Remember that produces the language and , accessible from produce languages and , respectively. Therefore , and . We start with state . As there is only one rule for the longest common prefix of and the longest common prefix of this rule are the same and therefore eliminated.

As there is only one rule for the argumentation is the same and we get . For the rule we calculate the longest common prefix of the right-hand side that is larger than the longest common prefix of . Therefore we need to calculate the shift as is accessible from in rule and is accessible from in rule . This leads to the following rule.

As the longest common prefix of is the same as the longest common prefix of the right-hand side of rule we get . The axiom of is .

###### Lemma 3.

Let be an ltw and be a state in that is quasi-periodic on the left. Let be constructed by Algorithm 1 and be a state in accessible from . Then and are equivalent and is earliest.

To replace all quasi-periodic states by their equivalent earliest form we need to know which states are quasi-periodic. Algorithm 1 can be modified to test an arbitrary state for quasi-periodicity on the left in polynomial time. The only difference to Algorithm 1 is that we do not know how to compute in polynomial time and does not exist. We therefore substitute by some smallest word of and we define a mock-shift as follows

• for all ,

• if , we say , where is a shortest word of ,

• if and then .

If several definitions of exist, we use the smallest. If is accessible from a quasi-periodic , then .

###### Algorithm 2.

Let be an ltw and be a state in . We build an ltw as follows.

• For each state accessible from , we add a copy to .

• The axiom is where is a shortest word of .

• For each rule in with accessible from , we add a rule

$$p^e, f \to u_p\, q^e_1(x_{\sigma(1)})\, q^e_2(x_{\sigma(2)}) \dots q^e_n(x_{\sigma(n)})$$

in , where is constructed as follows.

• We define , where is a shortest word of .

• Then we remove from its prefix of size , where is a shortest word of . We obtain a word .

• Finally, we set .

As the constructions of Algorithms 1 and 2 are the same if the state is quasi-periodic, and are equivalent if is quasi-periodic. Moreover, is quasi-periodic if and are equivalent.

###### Lemma 4.

Let be a state of an ltw and be constructed by Algorithm 2. Then and are same-ordered and is quasi-periodic on the left if and only if and is periodic.

As and are same-ordered we can test the equivalence in polynomial time, cf. Theorem 1. Moreover testing a CFG for periodicity is in polynomial time and therefore testing a state for quasi-periodicity is in polynomial time.

Algorithm 2 can be applied to a part of a rule to test for quasi-periodicity on the left. In this case for each rule  a rule is added to and each occurrence of the part in a rule of is replaced by . We then apply the above algorithm to and test and for equivalence and for periodicity.

###### Example 5.

Let be a state with the rules , . Thus, transforms trees of the form , to . We use Algorithm 2 to test for quasi-periodicity on the left. As explained above we introduce a state with the rules , . We now apply Algorithm 2 on . We build as follows. The axiom is as the shortest word of is . For the rule we build as is the shortest word of . Then we obtain and . Thus we get . For the rule we build and obtain as the shortest word of is . Thus we get .

transforms trees of the form to and transforms trees of the form to . Thus, they are equivalent. Additionally is periodic with period . It follows that is quasi-periodic.

We introduced algorithms to test states for quasi-periodicity on the left and to build the earliest form for such states. These two algorithms can be adapted for states that are quasi-periodic on the right. There are two main differences. First, as the handle is on the right the shortest word of a language that is quasi-periodic on the right is . Second, instead of pushing words through a periodic language to the left we need to push words through a periodic language to the right.

Hence, we can test each state of an ltw for quasi-periodicity on the left and right. If the state is quasi-periodic we replace by its earliest form. Algorithm 1 and 2 run in polynomial time if SLPs are used. This is crucial as the shortest word of a CFG can be of exponential size, cf. Example 1. However, the operations that are needed in the algorithms, namely constructing the shortest word of a CFG and removing the prefix or suffix of a word, are in polynomial time using SLPs, cf. [11].
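To illustrate why shortest words of a CFG are tractable, the sketch below computes only the length of a shortest word for every non-terminal by fixpoint iteration. It does not build the word itself, which in the paper is represented by an SLP following [11]. The grammar encoding (a dictionary from non-terminals to lists of right-hand sides, with every other symbol a terminal of length one) is an assumption made for this example.

```python
import math


def shortest_lengths(rules):
    """Length of a shortest word derivable from each non-terminal (inf if none)."""
    best = {nt: math.inf for nt in rules}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in rules.items():
            for rhs in alternatives:
                # Terminals count 1, non-terminals count their current best bound.
                total = sum(best[s] if s in rules else 1 for s in rhs)
                if total < best[nt]:
                    best[nt] = total
                    changed = True
    return best


grammar = {
    "S": [["A", "A"], ["a", "S"]],
    "A": [["a", "b"], ["A", "A"]],
}
lengths = shortest_lengths(grammar)
```

Each round fixes at least the non-terminals whose shortest derivation tree is one level higher than those already fixed, so the number of rounds is bounded by the number of non-terminals plus one.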

###### Theorem 5.

Let be an ltw. Then an equivalent ltw where all quasi-periodic states are earliest can be constructed in polynomial time.

### 4.2 Switching Periodic States

In this part we obtain the partial normal form by ordering periodic states of an erase-ordered transducer where all quasi-periodic states are earliest. Ordering means that if the order of the subtrees in the translation can differ, we choose the one similar to the input, i.e. if and are equivalent, we choose the second order. We already showed how we can build a transducer where each quasi-periodic state is earliest and therefore periodic. However, we need to make parts of rules earliest such that periodic states can be switched as the following example shows.

###### Example 6.

Consider the rule where , have the rules , , , . States and are earliest and periodic but not of the same period as a subword is produced in between. We replace the non-earliest and quasi-periodic part by their earliest form. This leads to with , . Hence, and are earliest and periodic of the same period and can be switched in the rule.

To build the earliest form of a quasi-periodic part of a rule each occurrence of this part is replaced by a state and for each rule a rule is added. Then we apply Algorithm 1 on to replace and therefore by their earliest form. Iteratively this leads to the following theorem.

###### Theorem 6.

For each ltw where all quasi-periodic states are earliest we can build in polynomial time an equivalent ltw such that each part of a rule in where is quasi-periodic is earliest.

In Theorem 4 we showed that order differences in equivalent erase-ordered ltws where all quasi-periodic states are earliest and all parts of rules are earliest are caused by adjacent periodic states. As these states are periodic of the same period and no words are produced in between these states can be reordered without changing the semantics of the ltws.

###### Lemma 5.

Let be an ltw such that

• is erase-ordered,

• all quasi-periodic states in are earliest and

• each in a rule of that is quasi-periodic is earliest.

Then we can reorder adjacent periodic states of the same period in the rules of such that in polynomial time. The reordering does not change the transformation of .

We showed before how to construct a transducer with the preconditions needed in Lemma 5 in polynomial time. Note that replacing a quasi-periodic state by its earliest form can break the erase-ordered property. Thus we need to replace all quasi-periodic states by their earliest forms before building the erase-ordered form of a transducer. Then Lemma 5 is the last step to obtain the partial normal form for an ltw.

###### Theorem 7.

For each ltw we can construct an equivalent ltw that is in partial normal form in polynomial time.

### 4.3 Testing Equivalence in Polynomial Time

It remains to show that the equivalence problem of ltws in partial normal form is decidable in polynomial time. The key idea is that two equivalent ltws in partial normal form are same-ordered.

Consider two equivalent ltws , where all quasi-periodic states and all parts of rules with being quasi-periodic are earliest. In Theorem 4 we showed that if the orders , of two co-reachable states , of , , respectively, for the same input differ, then the states causing these order differences are periodic with the same period. The partial normal form resolves these order differences such that the transducers are same-ordered.

###### Lemma 6.

If and are equivalent and in partial normal form then they are same-ordered.

As the equivalence of same-ordered ltws is decidable in polynomial time (cf. Theorem 1) we conclude the following.

###### Corollary 1.

The equivalence problem for ltws in partial normal form is decidable in polynomial time.

To summarize, the following steps run in polynomial time and transform an ltw into its partial normal form.

1. Test each state for quasi-periodicity. If it is quasi-periodic replace the state by its earliest form.

2. Build the equivalent erase-ordered transducer.

3. Test each part in each rule from right to left for quasi-periodicity on the left. If it is quasi-periodic on the left replace the part by its earliest form.

4. Order adjacent periodic states of the same period according to the input order.

This leads to our main theorem.

###### Theorem 8.

The equivalence of ltws is decidable in polynomial time.

## 5 Conclusion

The equivalence problem for linear tree-to-word transducers can be decided in polynomial time. To prove this we used a reduction to the equivalence problem between sequential transducers [7], or more exactly, to an extension of this result to same-ordered transducers. This reduction hinges on two points. First, we showed that the only structural differences between two equivalent earliest linear transducers are caused by periodic languages which are interchangeable. The structural characteristic of periodic languages has been used in the normalization of stws [7]. Second, we showed that although building a fully earliest transducer is potentially exponential, our reduction only requires quasi-periodic states to be earliest, which can be done in polynomial time. The use of the equivalence problem for morphisms on a CFG [13] and of properties of straight-line programs [10] is essential here, as it was in [7, 8]. This leads to further research questions, starting with the generalization of this result to all tree-to-word transducers. Furthermore, can these techniques be used to decrease the complexity of some problems in other classes of transducers, such as top-down tree-to-tree transducers, where the equivalence problem is known to be between Exptime-hard and NExptime?

## Appendix A Proof of Theorem 4

###### Theorem.

Let and be two equivalent erase-ordered ltws such that

• all quasi-periodic states are earliest, i.e.

• for each part of a rule where is quasi-periodic,

Let , be two co-reachable states in , , respectively and

and

be two rules for , . Then for such that , all , , are periodic of the same period and all , .

###### Proof.

Let and be the equivalent earliest transducer of and , respectively, such that and as well as and are same-ordered (cf. Theorem 2).

Suppose there exist co-reachable (and thus equivalent) states and in and , respectively, with rules

$$q^e, f \to v_0\, q^e_1(x_{\sigma(1)}) \dots q^e_n(x_{\sigma(n)})\, v_n,$$
$${q'}^e, f \to v'_0\, {q'}^e_1(x_{\sigma'(1)}) \dots {q'}^e_n(x_{\sigma'(n)})\, v'_n$$

such that .

Let be the first index such that . Following Theorem 3, we have such that and and all , are periodic with the same period.

Let and be the states in and , respectively, from which the co-reachable states and were constructed with the earliest construction proposed by [7]. From the earliest construction it follows that and are co-reachable. Since the construction preserves the rule structure, we have:

$$q, f \to u_0\, q_1(x_{\sigma(1)}) \dots q_n(x_{\sigma(n)})\, u_n$$
$$q', f \to u'_0\, q'_1(x_{\sigma'(1)}) \dots q'_n(x_{\sigma'(n)})\, u'_n$$

The earliest construction gives us that for all , for some . This means that if is periodic, then is quasi-periodic in its non-earliest form. The same is true for all .

However, the first property we supposed of and implies that all those and that are quasi-periodic are not only quasi-periodic, but periodic. Consider a part of the rule that is periodic in the earliest form and therefore quasi-periodic in the non-earliest form. The first condition gives us that are periodic. However, then the words are not necessarily empty. As the part is quasi-periodic we know that each part , is quasi-periodic. Then the second condition of this theorem guarantees that the parts ,