Hardness of Permutation Pattern Matching
Supported by project 16-01602Y of the Czech Science Foundation.
Permutation Pattern Matching (or PPM) is a decision problem whose input is a pair of permutations $\pi$ and $\tau$, represented as sequences of integers, and the task is to determine whether $\tau$ contains a subsequence order-isomorphic to $\pi$. Bose, Buss and Lubiw proved that PPM is NP-complete on general inputs.
We show that PPM is NP-complete even when $\pi$ has no decreasing subsequence of length 3 and $\tau$ has no decreasing subsequence of length 4. This provides the first known example of PPM being hard when one or both of $\pi$ and $\tau$ are restricted to a proper hereditary class of permutations.
This hardness result is tight in the sense that PPM is known to be polynomial when both $\pi$ and $\tau$ avoid a decreasing subsequence of length 3, as well as when $\pi$ avoids a decreasing subsequence of length 2. The result is also tight in another sense: we will show that for any hereditary proper subclass $\mathcal{C}$ of the class of permutations avoiding a decreasing sequence of length 3, there is a polynomial algorithm solving PPM instances where $\pi$ is from $\mathcal{C}$ and $\tau$ is arbitrary.
We also obtain analogous hardness and tractability results for the class of so-called skew-merged patterns.
From these results, we deduce a complexity dichotomy for the PPM problem restricted to $\pi$ belonging to $\mathrm{Av}(\alpha)$, where $\mathrm{Av}(\alpha)$ denotes the class of permutations avoiding a permutation $\alpha$. Specifically, we show that the problem is polynomial when $\alpha$ is in the set $\{1, 12, 21, 132, 213, 231, 312\}$, and it is NP-complete for any other $\alpha$.
Computer Science Institute
Charles University, Faculty of Mathematics and Physics,
Malostranské nám. 25, 118 00 Praha 1, Czech Republic;
Department of Applied Mathematics and Institute for Theoretical Computer Science,
Charles University, Faculty of Mathematics and Physics,
Malostranské nám. 25, 118 00 Praha 1, Czech Republic;
A permutation of size $n$ is a bijection $\pi$ from the set $\{1,\dots,n\}$ to itself. We represent a permutation $\pi$ of size $n$ by the sequence $\pi(1),\pi(2),\dots,\pi(n)$. We let $S_n$ denote the set of permutations of size $n$. When writing out short permutations explicitly, we usually omit the punctuation and write, e.g., $132$ instead of $1,3,2$. We let $[n]$ denote the set $\{1,\dots,n\}$.
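We say that a permutation $\tau\in S_n$ contains a permutation $\pi\in S_k$, which we denote by $\pi\le\tau$, if $\tau$ has a subsequence $\tau(i_1),\tau(i_2),\dots,\tau(i_k)$ with $i_1<i_2<\dots<i_k$ whose elements have the same relative order as the elements of $\pi$, that is, for any $a,b\in[k]$ we have $\pi(a)<\pi(b)$ if and only if $\tau(i_a)<\tau(i_b)$. We call such a subsequence an occurrence of $\pi$ in $\tau$. If $\tau$ does not contain $\pi$, we say that $\tau$ avoids $\pi$ or that $\tau$ is $\pi$-avoiding.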
In this paper, we study the computational complexity of determining for a given pair of permutations $\pi$ and $\tau$ whether $\tau$ contains $\pi$. In the literature, this problem is known as Permutation Pattern Matching, or PPM.
Permutation Pattern Matching (PPM)
Instance: Permutations $\pi\in S_k$ and $\tau\in S_n$.
Question: Does $\tau$ contain $\pi$?
In the context of PPM, the permutation $\pi$ is usually called the pattern, and $\tau$ is called the text. When dealing with instances of PPM, we always assume that the pattern is at most as long as the text, that is, $k\le n$.
Observe that PPM can be solved by a simple brute-force algorithm in time . Thus, if the pattern were fixed, rather than being part of the input, the problem would trivially be polynomial-time solvable.
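The brute-force approach can be sketched as follows. This is only an illustration, not code from the paper: the helper names `reduce_seq` and `contains` are hypothetical, and permutations are assumed to be given as Python lists of distinct integers. The function examines every subsequence of the text whose length equals that of the pattern and tests order-isomorphism directly.

```python
from itertools import combinations


def reduce_seq(seq):
    """Replace the i-th smallest entry of seq by i (1-based ranks)."""
    rank = {value: i + 1 for i, value in enumerate(sorted(seq))}
    return [rank[value] for value in seq]


def contains(text, pattern):
    """Return True if text has a subsequence order-isomorphic to pattern.

    Brute force: examine every set of len(pattern) positions of the text.
    """
    k = len(pattern)
    target = reduce_seq(pattern)
    for positions in combinations(range(len(text)), k):
        candidate = [text[i] for i in positions]
        if reduce_seq(candidate) == target:
            return True
    return False
```

For example, `contains([1, 3, 2, 4], [1, 2, 3])` evaluates to `True`, while `contains([2, 4, 1, 3], [1, 2, 3])` evaluates to `False`. The number of candidate position sets grows exponentially when the pattern is part of the input, which is why this sketch is useful only for small instances.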
Bose, Buss and Lubiw  have shown that PPM is NP-complete. This general hardness result has motivated the study of parameterized and restricted variants of PPM.
Guillemot and Marx have shown that PPM can be solved in time $2^{O(k^2\log k)}\cdot n$; in particular, PPM is fixed-parameter tractable with $k$ considered as parameter. The complexity of the Guillemot–Marx algorithm was later reduced to $2^{O(k^2)}\cdot n$ by Fox.
A different parameterization was considered by Bruner and Lackner , who proved that PPM can be solved in time , where is the number of increasing and decreasing runs in ; here an increasing run in is a maximal consecutive increasing subsequence of length at least , and decreasing runs are defined analogously. The Bruner–Lackner algorithm shows that PPM is fixed-parameter tractable with respect to the parameter . Moreover, since for any permutation , the Bruner–Lackner algorithm solves PPM in time . It is the first algorithm to improve upon the bound of achieved by the straightforward brute-force approach.
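To make the run parameter concrete, here is a small sketch that counts the maximal monotone runs of a sequence. It assumes that a run must have length at least 2 and that two consecutive runs share their common endpoint; the function name `count_runs` is hypothetical and the code is not taken from the Bruner–Lackner algorithm itself.

```python
def count_runs(perm):
    """Number of maximal increasing or decreasing runs of consecutive entries.

    Assumption: a run has length at least 2, and neighbouring runs share
    the entry at which the direction changes.
    """
    if len(perm) < 2:
        return 0
    runs = 1
    for a, b, c in zip(perm, perm[1:], perm[2:]):
        # The direction changes at b, so a new run starts there.
        if (a < b) != (b < c):
            runs += 1
    return runs
```

Under these assumptions, `count_runs([1, 3, 2, 4])` is 3 (the runs 1 3, then 3 2, then 2 4), and a monotone sequence such as `[1, 2, 3]` has a single run.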
Another algorithm for PPM was described by Albert, Aldred, Atkinson and Holton , and later a similar approach was analyzed by Ahal and Rabinovich . Ahal and Rabinovich have proved that PPM can be solved in time , where is a graph that can be associated to the pattern , and denotes the treewidth of . We shall give the precise definitions and the necessary details of this approach in Section 3.
Apart from the above-mentioned algorithms, which all solve general PPM instances, various researchers have obtained efficient solutions for instances of PPM where or satisfy some additional restrictions. The most natural way to formalize such restrictions is to use the concept of permutation class, which we now introduce.
A permutation class is a set $\mathcal{C}$ of permutations with the property that for every $\pi\in\mathcal{C}$, all the permutations contained in $\pi$ belong to $\mathcal{C}$ as well.
It is often convenient to describe a permutation class $\mathcal{C}$ by specifying the minimal permutations not belonging to $\mathcal{C}$. For a set of permutations $F$, we let $\mathrm{Av}(F)$ denote the class of permutations that avoid all the permutations in $F$. Note that for any permutation class $\mathcal{C}$ there is a unique (possibly infinite) antichain of permutations $F$ such that $\mathcal{C}=\mathrm{Av}(F)$. The set $F$ is the basis of $\mathcal{C}$.
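As an illustration of describing a class by forbidden patterns, the following sketch enumerates all permutations of a given small size that avoid every pattern in a finite set. The function names `contains` and `avoidance_class` are hypothetical, the containment test is a plain brute force, and the enumeration is only feasible for small sizes.

```python
from itertools import combinations, permutations


def contains(text, pattern):
    """Brute-force test whether text contains pattern as a pattern."""
    k = len(pattern)
    for positions in combinations(range(len(text)), k):
        sub = [text[i] for i in positions]
        # Order-isomorphic: every pair compares the same way in both sequences.
        if all((sub[a] < sub[b]) == (pattern[a] < pattern[b])
               for a in range(k) for b in range(a + 1, k)):
            return True
    return False


def avoidance_class(basis, n):
    """List all permutations of size n that avoid every pattern in basis."""
    return [p for p in permutations(range(1, n + 1))
            if not any(contains(p, b) for b in basis)]
```

For instance, `avoidance_class([(3, 2, 1)], 4)` returns the 14 permutations of size 4 with no decreasing subsequence of length 3, and `avoidance_class([(2, 1)], 3)` returns only the increasing permutation `(1, 2, 3)`.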
A principal permutation class is a permutation class whose basis has a single element. A proper permutation class is a permutation class whose basis is nonempty, or in other words, a permutation class that does not contain all permutations. For a recent overview of the structural theory of permutation classes, we refer the interested reader to the survey by Vatter .
When dealing with specific sets , we often omit nested braces and write, e.g., or , instead of or , respectively.
In this paper, we focus on the complexity of PPM when one or both of the inputs are restricted to a particular proper permutation class. Following the terminology of Albert et al., we consider, for a permutation class $\mathcal{C}$, these two restricted versions of PPM:
$\mathcal{C}$-Permutation Pattern Matching ($\mathcal{C}$-PPM)
Instance: A pattern $\pi\in\mathcal{C}$ of size $k$ and a text $\tau\in\mathcal{C}$ of size $n$.
Question: Does $\tau$ contain $\pi$?
$\mathcal{C}$-Pattern Permutation Pattern Matching ($\mathcal{C}$-Pattern PPM)
Instance: A pattern $\pi\in\mathcal{C}$ of size $k$ and a text $\tau\in S_n$.
Question: Does $\tau$ contain $\pi$?
Clearly, any instance of $\mathcal{C}$-PPM is also an instance of $\mathcal{C}$-Pattern PPM, and in particular, $\mathcal{C}$-PPM is at most as hard as $\mathcal{C}$-Pattern PPM.
Bose, Buss and Lubiw have shown that $\mathcal{C}$-Pattern PPM is polynomially tractable when $\mathcal{C}$ is the class $\mathrm{Av}(2413,3142)$ of the so-called separable permutations. Other algorithms for $\mathrm{Av}(2413,3142)$-Pattern PPM were given by Ibarra, by Albert et al., and by Yugandhar and Saxena.
An even more restricted case of $\mathcal{C}$-Pattern PPM deals with monotone increasing patterns, that is, $\mathcal{C}=\mathrm{Av}(21)$. In this case, $\mathcal{C}$-Pattern PPM reduces to finding the longest increasing subsequence in a given text. This is an old algorithmic problem, and can be solved in time $O(n\log\log n)$ [9, 18].
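The reduction to longest increasing subsequences can be sketched as follows. The code uses the textbook binary-search method, which runs in $O(n\log n)$ time; it is a simpler variant chosen for illustration and not necessarily the algorithm of the cited works. The function names are hypothetical.

```python
from bisect import bisect_left


def lis_length(seq):
    """Length of a longest strictly increasing subsequence of seq."""
    # tails[i] is the smallest possible last value of an increasing
    # subsequence of length i + 1 seen so far.
    tails = []
    for value in seq:
        pos = bisect_left(tails, value)
        if pos == len(tails):
            tails.append(value)
        else:
            tails[pos] = value
    return len(tails)


def contains_increasing(text, k):
    """True if text contains the increasing pattern 1 2 ... k."""
    return lis_length(text) >= k
```

A text contains the increasing pattern of size $k$ exactly when its longest increasing subsequence has length at least $k$, so a single call to `lis_length` answers every monotone increasing pattern query for that text.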
A natural generalization is to consider instances of PPM where patterns and texts can be partitioned into a bounded number of monotone sequences. For integers , we say that a permutation is an -permutation if can be partitioned into increasing and decreasing (possibly empty) subsequences. We let denote the class of all -permutations. In particular, , and more generally, . The -permutations are also known as skew-merged permutations, and it is not hard to see that ; see Atkinson . Kézdy, Snevily and Wang  have shown that for any , the basis of the class is finite; however, they also pointed out that the basis of has more than 100 permutations.
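The standard characterization of skew-merged permutations as exactly the permutations avoiding 2143 and 3412 can be spot-checked by brute force for small sizes. The sketch below is only a sanity check under that assumption, with hypothetical helper names; it tries every split of a permutation into one increasing and one decreasing subsequence, so it is exponential and meant for tiny inputs.

```python
from itertools import combinations, permutations


def is_monotone(seq, increasing):
    """True if seq is strictly increasing (or strictly decreasing)."""
    pairs = zip(seq, seq[1:])
    return all((a < b) if increasing else (a > b) for a, b in pairs)


def is_skew_merged(perm):
    """True if perm splits into one increasing and one decreasing subsequence."""
    n = len(perm)
    for mask in range(1 << n):
        inc = [perm[i] for i in range(n) if (mask >> i) & 1]
        dec = [perm[i] for i in range(n) if not (mask >> i) & 1]
        if is_monotone(inc, True) and is_monotone(dec, False):
            return True
    return False


def contains(text, pattern):
    """Brute-force pattern containment test."""
    k = len(pattern)
    for positions in combinations(range(len(text)), k):
        sub = [text[i] for i in positions]
        if all((sub[a] < sub[b]) == (pattern[a] < pattern[b])
               for a in range(k) for b in range(a + 1, k)):
            return True
    return False


def agrees_with_basis(n):
    """Check for all permutations of size n that being skew-merged is
    equivalent to avoiding both 2143 and 3412."""
    basis = [(2, 1, 4, 3), (3, 4, 1, 2)]
    return all(is_skew_merged(p) == (not any(contains(p, b) for b in basis))
               for p in permutations(range(1, n + 1)))
```

Calling `agrees_with_basis(n)` for small `n` confirms the equivalence on all permutations of that size; it is evidence for small cases, not a proof.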
Guillemot and Vialette have shown that $\mathrm{Av}(321)$-PPM is polynomial-time solvable. A different, faster algorithm for $\mathrm{Av}(321)$-PPM has been described by Albert et al. By a similar approach, Albert et al. have also obtained a polynomial algorithm for $\mathrm{Av}(2143,3412)$-PPM.
Theorem 1.1.
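It is NP-complete to decide, for a pattern $\pi\in\mathrm{Av}(321)$ and a text $\tau\in\mathrm{Av}(4321)$, whether $\pi$ is contained in $\tau$.
Consequently, $\mathrm{Av}(321)$-Pattern PPM as well as $\mathrm{Av}(4321)$-PPM are NP-complete. These results are tight in the sense that both $\mathrm{Av}(21)$-Pattern PPM and $\mathrm{Av}(321)$-PPM are polynomial-time solvable, as mentioned above.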
We obtain similar results when and are replaced with and , respectively. In fact, here we can be even more restrictive. Let be the class .
It is NP-complete to decide, for a pattern and a text , whether is contained in .
This result again implies that -Pattern PPM and -PPM are NP-complete, in contrast with the polynomiality of -Pattern PPM, -PPM and -PPM.
Let $\alpha$ be a permutation. The problem $\mathrm{Av}(\alpha)$-Pattern PPM is polynomial-time solvable for $\alpha\in\{1, 12, 21, 132, 213, 231, 312\}$ and NP-complete for any other $\alpha$.
We also obtain new tractability results, which show that the NP-hardness results for -Pattern PPM and -Pattern PPM are tight in an even stronger sense than suggested above.
Theorem 1.4.
If $\mathcal{C}$ is a proper subclass of $\mathrm{Av}(321)$ then $\mathcal{C}$-Pattern PPM can be solved in polynomial time.
If is a proper subclass of then -Pattern PPM can be solved in polynomial time.
2 Hardness of $\mathrm{Av}(321)$-Pattern PPM
Our goal is to show that for a pattern $\pi\in\mathrm{Av}(321)$ and a text $\tau\in\mathrm{Av}(4321)$, it is NP-complete to decide whether $\pi$ is contained in $\tau$. Since the problem is clearly in NP, we focus on proving its NP-hardness. We take inspiration from the NP-hardness proof given by Bose, Buss and Lubiw for general permutations and adapt it to the proper classes $\mathrm{Av}(321)$ and $\mathrm{Av}(4321)$. We proceed by reduction from the classical NP-complete problem 3-SAT, whose input is a 3-CNF formula $\varphi$, and the goal is to determine whether $\varphi$ is satisfiable.
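We first introduce several auxiliary notions. Let $\pi$ be a permutation of length $n$. For a pair of elements $\pi(i),\pi(j)$, we say that $\pi(i)$ is above $\pi(j)$ (and $\pi(j)$ is below $\pi(i)$), if $\pi(i)>\pi(j)$. Likewise, $\pi(i)$ is left of $\pi(j)$ (and $\pi(j)$ is right of $\pi(i)$) if $i<j$. For disjoint sets $X$ and $Y$ of elements of $\pi$, we say that $X$ is above $Y$ if every element of $X$ is above all the elements of $Y$, and similarly for the other directions.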
If is a pair of elements of and is another element, we say that is sandwiched by from below if and is above ; see Figure 1, left. Similarly, we say that is sandwiched by from the left if and is to the right of ; see Figure 1, right. Analogously, we also define sandwiching from the right or from above. More generally, a set of elements of is sandwiched from below by if each element of is sandwiched from below by , and similarly for the other directions.
A pair of elements $(\pi(i),\pi(j))$ is an increase in $\pi$ if $i<j$ and $\pi(i)<\pi(j)$; in other words, an increase is an occurrence of the permutation 12 in $\pi$.
A staircase of steps is a sequence of disjoint increases
with the following properties (see Figure 2):
For every , is sandwiched by from the left, and for , is sandwiched by from below.
For every , is to the right and above , and is to the right and above . In particular, the elements of , as well as the elements form an increasing sequence of length .
We call the th outer bend of , and the th inner bend. Additionally, we call the base of and the top of .
A double staircase of steps is a pair of disjoint staircases of steps, with and , satisfying the following properties (see Figure 3):
For each , the increase is above and to the right of , and also below (and necessarily to the right of) . For , is also below and to the left of .
Similarly, for , is above and to the right of , and also to the left (and necessarily above) . For , is below and to the left of .
An -fold staircase of steps is an -tuple of staircases such that for each with , the pair is a double staircase of steps. The th inner bend of is the union of the th inner bends of the staircases ; the outer bends, the base and the top of are defined analogously.
Notice that an -fold staircase avoids the pattern . Moreover, the union of its outer bends, as well as the union of its inner bends each form an increasing subsequence.
2.1 The Reduction
We now describe the reduction from 3-SAT. Let be a given 3-CNF formula. Suppose that has variables and clauses . We will assume, without loss of generality, that each clause contains exactly three literals, and no variable appears in a single clause more than once. We will construct two permutations and , such that is satisfiable if and only if contains .
The overall structure of and is depicted in Figure 4.
The pattern is the disjoint union of a -fold staircase of steps with an increasing sequence (called the anchor of ) of length , where is a sufficiently large value to be specified later. Moreover, the sequence is below , to the right of the base of , and to the left of the first inner bend of . This determines uniquely up to the value of , and we may observe that avoids . We will say that the staircase represents the variable of .
We now describe the text . As a starting point, we first build a permutation and then explain how to modify to obtain .
The permutation is the disjoint union of a -fold staircase of steps, and an increasing sequence (the anchor of ) of length . The sequence is below the -fold staircase, to the right of the base of , and to the left of the first outer bend of .
Each of the staircases in will represent one of the possible literals, with representing the literal and representing .
We now modify to obtain the actual text . The modification proceeds in two steps.
In the first step, we change the relative position of the bases of the individual staircases in . For every , let and be the respective bases of and . In , the two bases together form an occurrence of . We modify their relative horizontal position by moving to the right of , so that in the two bases will form an occurrence of . The relative position of and to the remaining elements remains unchanged; in other words, the four elements of occupy the same four rows and the same four columns as before.
Before we describe the second step of the construction of , we introduce the following notion: suppose is a staircase of steps, and let , and be three consecutive bends of , with . A bypass of in is a sequence of three increases disjoint from and satisfying these properties:
The sequence obtained from by replacing with , with and with is again a staircase with steps, and
is above and to the left of , is above and to the right of , and is sandwiched by from the right (see Figure 5).
Note that these conditions determine the relative positions of the elements of uniquely. We call the pair the fork of the bypass, and the pair the merge of the bypass.
Lemma 2.1.
Let be a staircase of steps, let be arbitrary, and let be obtained from by replacing the three bends with a bypass . Let be a staircase of steps such that , and moreover, each of the elements of also belongs to . Then the sequence of bends is equal either to or to .
Proof.
By assumption, . There are four elements of (namely ) sandwiched from below by , and two of them must form . However, since and are the only two increases in , we have either or . In either case, there is a unique increase sandwiched by from the left, and this increase must therefore be , and by the same argument, both and are determined uniquely. The lemma follows. ∎
We continue with the second step of the construction of the text . The general approach is to modify certain parts of the staircases by adding bypasses whose structure depends on the clauses of the formula . Recall that has clauses , and that the -fold staircase has steps.
Let and denote the th outer bend and the th inner bend of , respectively. For every , we will associate to the clause the sequence of three bends of .
Let be a clause of of the form , with , , and , for some .
Let denote the staircase representing the literal , and let be the staircase representing the other literal containing . We define , , and analogously.
We will now add bypasses to (some of) the staircases into the three bends associated to , by inserting new bends into , and , and sometimes changing the relative position of existing bends. The choice of the relative positions of these bends is the key aspect of our reduction.
We first describe how to modify . We will replace by a so-called fork gadget containing the th outer bends of all the staircases and and also the corresponding bends of their bypasses, if any. The gadget is a union of three disjoint increasing sequences, which we call the top level, the middle level and the bottom level; see Figure 6. As the names suggest, the top level is above the middle one, and the middle one is above the bottom one.
The top level contains, in left-to-right order, the bend of , the bends of , followed by the bend of , and finally the bends of .
The middle level contains, left to right, the bends of , the bypasses of , and finally the bend of .
The bottom level contains the bends of , the bypasses of , and the bypass of .
Note that does not contain the permutation : indeed, any occurrence of would have to intersect all three levels, since each level is an increasing sequence. However, the only elements of the top level that have any element of the bottom level to the right of them are the two elements of the bend of , and neither of these two elements belongs to an occurrence of .
We then replace the inner bend by a single increasing sequence containing the first inner bend of all the staircases and bypasses emerging from . Note that the vertical position of these staircases is already determined by .
Finally, we replace by a merge gadget , in which all the bypasses that forked in will be merged. Note that the horizontal relative position of the bends in is determined uniquely by , while their relative vertical position is fixed by . Essentially, is similar to a transpose of along the North-East diagonal, except that a fork of a bypass forms an occurrence of in , whereas the corresponding merge forms an occurrence of in . In particular, we may again see that is -free.
Performing the above-described modifications for each , we obtain the text . We let denote the part of that was obtained by replacing the bends of by the corresponding gadgets; in other words, contains all the elements of except the anchor.
We let (or ) denote the subpermutation obtained as the union of (or ) and all the bypasses added to it.
It remains to determine the value of , i.e., the length of the anchors of and . We choose to be equal to , that is, the elements of the anchor will outnumber the remainder of .
We now verify that the construction has the desired properties.
Proposition 2.2.
Let the formula , pattern and text be as described in the previous subsection. Then is satisfiable if and only if contains .
Proof.
Suppose that is satisfiable. Fix any satisfying assignment, and represent it by a function , where if and only if the variable is true in the chosen satisfying assignment. We obtain a copy of in as follows. The elements of the anchor of will be mapped to the anchor of .
It remains to map the -fold staircase to . We will show that it is possible to map each into . To obtain such a mapping, we need to decide, for every bypass appearing in , whether to map to the bends of the bypass or to the bends of . The decision can be made for each bypass independently, but we need to make sure that for each gadget in , the bends mapped into will form an increasing sequence.
It can be verified by a routine case analysis that such a choice is always possible. To see this, suppose that is a clause whose literals contain variables , and , respectively, with . Assume for instance, that the assignment satisfies but not and . Then the th outer bends of , and must be mapped to the bends of , and in . The bends of and are unique, and to preserve monotonicity, we need to choose the bend of in the bottom level of , i.e., the bypass of . Then all the staircases may be mapped to bends in the bottom level of , while may be mapped to bends in the middle level, preserving monotonicity.
Notice that the position of the bends in is the transpose along the North-East diagonal of their position in , and in particular, the bends will form an increasing sequence in as well.
We conclude that if is a satisfying assignment of , then occurs in . For the converse, suppose that has an occurrence in . Since the anchor of is longer than , at least one element of must be mapped to an element of . In particular, all the elements to the left and above are mapped to elements to the left and above . In other words, the base of gets mapped to a subset of the base of .
Recall that the base of is an increasing sequence of blocks of size , where the th block is the union of the base of with the base of . Recall also, that each of these blocks is order isomorphic to , and in particular, the longest increasing subsequence of each block has size , and there are exactly two such subsequences, namely and .
This implies that any increasing subsequence of length of the base of contains exactly one of and for each . In particular, in an occurrence of in , the base of is mapped either to or to . Fix an occurrence of in and use it to define a truth assignment , so that the base of is mapped to .
We claim that the assignment satisfies . To see this, we first argue that for every , the elements of are mapped to elements of , and more specifically, each (inner or outer) bend on is mapped either to the corresponding bend of or to the corresponding bend of a bypass of . We have already seen that this is the case for the base of . To show that it holds for the remaining bends also, we may proceed by induction and simply note that the only elements sandwiched from below by an inner bend in are the elements of the subsequent outer bend of , or perhaps a pair of outer bends forming the fork of a bypass. An analogous property holds for outer bends as well. Using Lemma 2.1, we may then conclude that the bends of map to corresponding bends in .
To see that is satisfying, assume for contradiction that there is a clause whose literals involve the variables and , and neither of these literals is satisfied by . It follows that inside the gadget , the bends of , and must map to the bends of , and . However, no three such bends form an increasing sequence in , whereas the corresponding bends form an increasing sequence in . This contradiction completes the proof of the proposition. ∎
Proof of Theorem 1.1.
Clearly, the problem is in the class NP. It is easy to observe that in our reduction, belongs to . To see that belongs to , it suffices to note that the gadgets used in the construction of all avoid , and that the base of avoids as well. Note also that the base of is to the left and below all the gadgets . Clearly, and can be constructed from in polynomial time, and the correctness of the reduction follows from Proposition 2.2. ∎
3 Patterns from a proper subclass of $\mathrm{Av}(321)$
In this section we prove Theorem 1.4. We rely on a result by Ahal and Rabinovich, who showed that for patterns with bounded “treewidth”, the PPM problem can be solved in polynomial time. Our main contribution is in showing that patterns in $\mathrm{Av}(321)$ of large “treewidth” contain a large universal pattern, containing all patterns in $\mathrm{Av}(321)$ of a given size. To show this, we use the grid minor theorem by Robertson and Seymour [11, 19].
3.1 Permutations and treewidth
The following definition was introduced by Ahal and Rabinovich [2, Definition 2.3]. For a -permutation , a graph is defined as follows; see Figure 7. The vertices of are the numbers , interpreted as the elements of . Two vertices are connected by an edge if or . We say that an edge between and is red if , and it is blue if . Note that an edge can be both red and blue. Clearly, the edges of each color form a Hamiltonian path in . We note that in our definition, is a graph, whereas Ahal and Rabinovich  defined as a multigraph. Also, we label the vertices of by their value rather than position in .
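The graph associated to a pattern can be built directly from the sequence representation. The sketch below is an illustration with a hypothetical function name; it assumes that two values are adjacent when they differ by one or when they occupy neighbouring positions in the permutation, and it labels vertices by value, as in the text. It returns the two edge sets separately without committing to which one is red and which one is blue.

```python
def pattern_graph(perm):
    """Edge sets of the graph on the values 1..k of the permutation perm.

    Returns a pair (value_edges, position_edges):
    value_edges    joins values that differ by exactly one,
    position_edges joins values at neighbouring positions of perm.
    Each set forms a Hamiltonian path on the vertices 1..k.
    """
    k = len(perm)
    value_edges = {frozenset((v, v + 1)) for v in range(1, k)}
    position_edges = {frozenset((perm[i], perm[i + 1])) for i in range(k - 1)}
    return value_edges, position_edges
```

The edge set of the graph is the union `value_edges | position_edges`; an edge lying in both sets carries both colors. For the permutation 2413 the two paths are edge-disjoint and together form the complete graph on four vertices, whereas for an increasing permutation they coincide and the graph is a path.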
Let $\mathrm{tw}(G)$ be the treewidth of a graph $G$. The main result of Ahal and Rabinovich can be stated in the following form.
Theorem 3.1 ([2, Theorem 2.10, Proposition 3.6]).
For and , the problem whether is contained in can be solved in time .
3.2 Treewidth, grids and walls
The $k\times\ell$ grid is the graph with vertex set $[k]\times[\ell]$ where vertices $(i,j)$ and $(i',j')$ are joined by an edge if and only if $|i-i'|+|j-j'|=1$. See Figure 8, left.
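A grid graph is easy to generate explicitly. The following sketch assumes the usual convention that two vertices are adjacent exactly when they differ by 1 in a single coordinate; the function name `grid_graph` and the parameter names `rows` and `cols` are hypothetical.

```python
def grid_graph(rows, cols):
    """Vertices and edges of the rows x cols grid on {1..rows} x {1..cols}."""
    vertices = [(i, j) for i in range(1, rows + 1) for j in range(1, cols + 1)]
    edges = set()
    for (i, j) in vertices:
        if i < rows:
            edges.add(((i, j), (i + 1, j)))  # neighbour in the next row
        if j < cols:
            edges.add(((i, j), (i, j + 1)))  # neighbour in the next column
    return vertices, edges
```

For example, `grid_graph(3, 3)` has 9 vertices and 12 edges, and its central vertex `(2, 2)` has degree 4.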
Robertson and Seymour proved that for every $r$, every graph of sufficiently large treewidth contains the $r\times r$ grid as a minor. Recently, Chekuri and Chuzhoy showed that a treewidth polynomial in $r$ is sufficient. The upper bound has been further improved by Chuzhoy.
Theorem 3.2 ().
There is a function satisfying such that every graph of treewidth at least contains the grid as a minor.
Since grids have vertices of degree , it is more convenient to consider their subgraphs of maximum degree , called walls. Let be even. An elementary wall of height  is obtained from the grid by removing two opposite corners and , edges for odd and , and edges for even and . That is, an elementary wall of height is a planar graph of maximum degree , which can be drawn as a “wall” with rows of “bricks”, where each “brick” is a face of size . See Figure 8, right. A subdivision of an elementary wall of height is called a wall of height or simply an -wall. It is well known that if is a graph of maximum degree , then a graph contains as a minor if and only if contains a subdivision of as a subgraph. Therefore, a graph containing the grid as a minor also contains an -wall as a subgraph.
3.3 Universal patterns in $\mathrm{Av}(321)$
Let $\sigma=(a_1,a_2,\dots,a_m)$ be a finite sequence of distinct positive integers. The reduction of $\sigma$ is an $m$-permutation obtained from $\sigma$ by replacing the $i$th smallest element by $i$, for every $i\in[m]$.
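The reduction of a sequence amounts to replacing each entry by its rank. A minimal sketch, with the hypothetical function name `reduction` and the input assumed to be a list of distinct integers:

```python
def reduction(seq):
    """Replace the i-th smallest element of seq by i, keeping positions."""
    rank = {value: i for i, value in enumerate(sorted(seq), start=1)}
    return [rank[value] for value in seq]
```

For example, `reduction([5, 2, 9])` is `[2, 1, 3]`, and a sequence that is already a permutation of `1..m` is returned unchanged.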
For every , we define the -track as the -permutation that is the reduction of the sequence if is even, and the reduction of if is odd. See Figure 9. The -track clearly avoids since it is a union of two increasing sequences. In Lemma 3.4 we will show that the -track is a universal pattern for all 321-avoiding -permutations.
We use the stair-decomposition of -avoiding permutations introduced by Guillemot and Vialette  and independently, in a slightly different way, by Albert et al. . Let be a -permutation. A stair-decomposition of is a partition of , regarded as the set of elements of , into possibly empty subsets , for some , such that
each forms an increasing subsequence in ,
is above for each ,
is to the right of for each , and
is above and to the right of for each .
The subsets are called the blocks of the stair-decomposition. Sometimes it will be convenient to refer to blocks or , which we define as empty sets.
The -track has a stair-decomposition into blocks , each containing exactly elements; see Figure 9. Moreover, for every , the subset forms a vertical alternation in the -track, that is, is above and the elements from alternate from left to right with the elements from in the -track. Similarly, for every , the subset forms a horizontal alternation in the -track, that is, is to the right of and the elements of alternate from bottom to top with the elements of in the -track.
Every -avoiding -permutation has a stair-decomposition with at most blocks.
Let be a -permutation. We define a stair-decomposition of by a greedy algorithm. Let be the longest interval whose elements form an increasing subsequence in . Let be the subsequence of formed by the elements in . Now let be the subset of whose elements form the maximal increasing prefix of . Let be the subsequence of obtained by removing the elements of . We continue analogously. Suppose that and have been defined. Then let be the maximal down-set of forming an increasing subsequence in , and let be the subsequence of obtained by removing the elements of . Finally, let be the subset of whose elements form the maximal increasing prefix of , and let be the subsequence of obtained by removing the elements of . We continue until or is empty and we denote by the largest index such that is nonempty. Clearly .
Now we verify that is indeed a stair-decomposition of . The facts that and are above , and that and are to the right of for every follow directly from the construction.
For every , the block is to the right of since the set of elements above and to the left of forms an increasing subsequence in ; a decreasing pair would form a forbidden pattern 321 with . Finally, for every , the block is above since the set of elements below and to the right of forms an increasing subsequence in ; a decreasing pair would induce a forbidden pattern 321 with . ∎
Albert et al. [4, Proposition 6] proved that each 321-avoiding permutation of size is contained in an -track for some . Using similar ideas, we observe the following stronger fact.
Lemma 3.4.
Let be a -permutation, and let be its stair-decomposition. Let . Let be the -track, and let be its stair-decomposition into blocks of size . Then has an occurrence in in which the elements of are mapped into for every .
For the claim is trivial, so we assume that .
For , let be the left-to-right linear order of the elements from in . Similarly, for