Pattern matching in
Given permutations and with , the pattern matching problem is to decide whether matches as an order-isomorphic subsequence. We give a linear-time algorithm in case both and avoid the two size- permutations and . For the special case where only avoids and , we present a time algorithm. We extend our research to bivincular patterns that avoid and and present a time algorithm. Finally we look at the related problem of the longest subsequence which avoids and .
A permutation is said to match another permutation , in symbols , if there exists a subsequence of elements of that has the same relative order as . Otherwise, is said to avoid the permutation . For example, a permutation matches the pattern (resp. ) if it has an increasing (resp. decreasing) subsequence of length . As another example, matches but not . During the last decade, the study of pattern matching on permutations has become a very active area of research, and a whole annual conference (Permutation Patterns) is now devoted to this topic.
We consider here the so-called pattern matching problem (also sometimes referred to as the pattern involvement problem): Given two permutations and , this problem is to decide whether (the problem is ascribed to Wilf in ). The permutation matching problem is known to be NP-hard . It is, however, polynomial-time solvable by brute-force enumeration if has bounded size. Improvements to this algorithm were presented in  and , the latter describing a time algorithm. Bruner and Lackner  gave a fixed-parameter algorithm solving the pattern matching problem with an exponential worst-case runtime of , where denotes the number of alternating runs of . (This is an improvement upon the runtime required by brute-force search without imposing restrictions on and .) A recent major step was taken by Marx and Guillemot . They showed that the permutation matching problem is fixed-parameter tractable (FPT) for parameter .
A few particular cases of the pattern matching problem have been attacked successfully. The case of increasing patterns is solvable in time in the RAM model , improving the previous 30-year bound of . Furthermore, the patterns , , , can all be handled in linear time by stack sorting algorithms. Any pattern of length can be detected in time . Algorithmic issues for -avoiding pattern matching have been investigated in . The pattern matching problem is also solvable in polynomial time for separable patterns [13, 5] (see also  for LCS-like issues of separable permutations). Separable permutations are those permutations that match neither nor , and they are enumerated by the Schröder numbers. (Notice that the separable permutations include as a special case the stack-sortable permutations, which avoid the pattern .)
There exist many generalisations of patterns that are worth considering in the context of algorithmic issues in pattern matching (see  for an up-to-date survey). Vincular patterns, also called generalized patterns, resemble (classical) patterns with the additional constraint that some of the elements in a matching must be consecutive in position. Of particular importance in our context, Bruner and Lackner  proved that deciding whether a vincular pattern of length can be matched in a longer permutation is -complete for parameter ; for an up-to-date survey of the class and related material, see . Bivincular patterns generalize classical patterns even further than vincular patterns by adding a constraint on values.
We focus in this paper on pattern matching issues for -avoiding permutations (i.e., those permutations that avoid both and ). The number of -permutations that avoid both and is for and for . On an individual basis, the permutations that do not match the permutation pattern are exactly the stack-sortable permutations and they are counted by the Catalan numbers . A stack-sortable permutation is a permutation whose elements may be sorted by an algorithm whose internal storage is limited to a single stack data structure. As for , it is well-known that if avoids , then its complement avoids , and the reverse of avoids .
This paper is organized as follows. In Section 2 the needed definitions are presented. Section 3 is devoted to presenting an online linear-time algorithm in case both permutations are -avoiding, whereas Section 4 focuses on the case where only the pattern is -avoiding. In Section 5 we give a polynomial-time algorithm for -avoiding bivincular patterns. In Section 6 we consider the problem of finding the longest -avoiding pattern in permutations.
A permutation of length is a one-to-one function from an -element set to itself. We write permutations as words , whose elements are distinct and usually consist of the integers , and we let stand for . For the sake of convenience, we let stand for , stand for and stand for . As usual, we let denote the set of all permutations of length . It is convenient to use a geometric representation of a permutation to ease the understanding of algorithms. The geometric representation corresponds to the set of points with coordinates (see figure 1).
A permutation is said to match the permutation , written , if there exists a subsequence of (not necessarily consecutive) elements of that has the same relative order as . Otherwise, is said to avoid the permutation . For example, the permutation matches the pattern , as can be seen in the highlighted subsequence of (or or or ). Each subsequence , , , , in is called a matching of . Since the permutation contains no increasing subsequence of length four, avoids . Geometrically, matches if there exists a set of points in that is isomorphic to the set of points of . In other words, if there exists a set of points in with the same disposition as the set of points of , without regard to distance (see figure 1).
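The definition of matching can be checked directly by enumerating subsequences. The following Python sketch does exactly that. It is a brute-force illustration whose running time is exponential in the pattern length, not one of the algorithms studied in this paper, and the function names are chosen here for illustration only.

```python
from itertools import combinations


def same_relative_order(a, b):
    """Return True if the sequences a and b (distinct values) are order-isomorphic."""
    if len(a) != len(b):
        return False
    # Two sequences are order-isomorphic exactly when sorting their
    # positions by value yields the same list of positions.
    positions = range(len(a))
    return sorted(positions, key=a.__getitem__) == sorted(positions, key=b.__getitem__)


def matches(pi, sigma):
    """Return True if some subsequence of pi has the same relative order as sigma."""
    # combinations() keeps the elements in their original left-to-right order,
    # so every tuple it yields is a subsequence of pi.
    return any(same_relative_order(sub, sigma)
               for sub in combinations(pi, len(sigma)))


def avoids(pi, sigma):
    """Return True if pi does not match sigma."""
    return not matches(pi, sigma)
```

For instance, `matches([3, 1, 4, 2], [2, 1, 3])` holds because the subsequence 3, 1, 4 has the same relative order as 2, 1, 3, whereas a decreasing permutation avoids the pattern 1, 2.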
Suppose is a set of permutations. We let denote the set of all -permutations avoiding each permutation in . For the sake of convenience (and as is customary ), we omit ’s braces, thus having e.g. instead of . If , we also say that is -avoiding.
An ascent of a permutation is any element where the following value is bigger than the current one. That is, if , then is an ascent if . For example, the permutation has ascents , , and . Similarly, a descent is any element with , so for every , is either an ascent or is a descent of .
A left-to-right maximum (abbreviated LRMax) of is an element that has no larger element to its right (see fig. 2). Formally, is a LRMax if and only if is the biggest element of . Similarly, is a left-to-right minimum (abbreviated LRMin) if and only if is the smallest element of .
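The notions of ascent, descent, LRMax and LRMin can be computed in a single pass each. The sketch below follows the definitions as worded in this section: an element is an ascent (resp. descent) if the next element is larger (resp. smaller), and it is a LRMax (resp. LRMin) if no element to its right is larger (resp. smaller). The function names are hypothetical and introduced only for this illustration.

```python
def ascents_and_descents(pi):
    """Return (ascent elements, descent elements) of pi.

    The last element has no successor, so it is neither an ascent nor a descent.
    """
    ascents = [pi[i] for i in range(len(pi) - 1) if pi[i] < pi[i + 1]]
    descents = [pi[i] for i in range(len(pi) - 1) if pi[i] > pi[i + 1]]
    return ascents, descents


def lrmax_and_lrmin(pi):
    """Return (LRMax elements, LRMin elements) of pi, in left-to-right order.

    Following the wording of the text, an element is a LRMax (resp. LRMin)
    when it is the largest (resp. smallest) element of the suffix starting at it.
    """
    lrmax, lrmin = [], []
    highest = lowest = None
    for x in reversed(pi):          # scan from the right, tracking suffix extrema
        if highest is None or x > highest:
            highest = x
            lrmax.append(x)
        if lowest is None or x < lowest:
            lowest = x
            lrmin.append(x)
    return lrmax[::-1], lrmin[::-1]
```

On the permutation 1 5 2 4 3, the ascent elements are 1 and 2, the descent elements are 5 and 4, and the last element 3 is both a LRMax and a LRMin.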
A bivincular pattern of length is a permutation in written in two-line notation (that is, the top row is and the bottom row is a permutation ). We have the following conditions on the top and bottom rows of , as given in Definition 1.4.1 of :
If the bottom line of contains then the elements corresponding to in a matching of in must be adjacent, whereas there is no adjacency condition for non-underlined consecutive elements. Moreover if the bottom row of begins with then any matching of in a permutation must begin with the leftmost element of , and if the bottom row of begins with then any matching of in a permutation must end with the rightmost element of .
If the top line of contains then the elements corresponding to in a matching of in must be adjacent in value, whereas there is no value adjacency restriction for non-overlined elements. Moreover, if the top row of begins with then any matching of in a permutation must contain the smallest element of , and if the top row of ends with then any matching of in a permutation must contain the largest element of .
For example, let . In , is a matching of but is not. The best general reference is .
Geometrically, we represent underlined and overlined elements by forbidden areas. A vertical area between two points indicates that the matches of those two points must be consecutive in position, whereas a horizontal area between two points indicates that the matches of those two points must be consecutive in value. The forbidden areas can be understood as follows: in a matching, the forbidden areas must be empty. Thus, matches a bivincular pattern if there exists a set of points in that is isomorphic to and if the forbidden areas are empty (see figure 3).
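The position and value constraints of a bivincular pattern can be made concrete with a brute-force checker. The sketch below is only an illustration of the definition, not an algorithm from this paper, and its encoding is an assumption made here: `adjacent_positions` is a set of integers i in 0..k meaning that the matches of the i-th and (i+1)-th pattern elements must be adjacent in position (i = 0 forces the match to start at the leftmost element of the text, i = k forces it to end at the rightmost one), and `adjacent_values` plays the same role for values (j = 0 forces the smallest text value to be used, j = k the largest). The text is assumed to be a permutation of 1..n.

```python
from itertools import combinations


def bivincular_matches(pi, sigma, adjacent_positions=frozenset(),
                       adjacent_values=frozenset()):
    """Brute-force test: does pi (a permutation of 1..n) match the bivincular pattern?"""
    n, k = len(pi), len(sigma)
    pattern_order = sorted(range(k), key=sigma.__getitem__)
    for idx in combinations(range(n), k):
        sub = [pi[i] for i in idx]
        # Classical condition: the chosen elements are order-isomorphic to sigma.
        if sorted(range(k), key=sub.__getitem__) != pattern_order:
            continue
        # Sentinels encode the borders of the text: a gap of 1 with a
        # sentinel means "first/last position" or "smallest/largest value".
        pos = [-1] + list(idx) + [n]
        val = [0] + sorted(sub) + [n + 1]
        if (all(pos[i + 1] - pos[i] == 1 for i in adjacent_positions)
                and all(val[j + 1] - val[j] == 1 for j in adjacent_values)):
            return True
    return False
```

With the text 2 4 1 3 and the pattern 1 2, requiring adjacent positions alone succeeds (2, 4), requiring adjacent values alone succeeds (2, 3), but requiring both at once fails, which shows that the two kinds of constraints interact.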
3 Both and are -avoiding
This section is devoted to presenting a fast algorithm for deciding if in case both and are -avoiding. We begin with an easy but crucial structure lemma.
Lemma 1 (Folklore)
The first element of any -avoiding permutation must be either the minimal or the maximal element.
Proof (of Lemma 1)
Any other initial element would serve as a ‘’ in either a or with and as the ‘’ and ‘’ respectively. ∎
if and only if for , is a LRMax or a LRMin.
Let and . Then, (1) is an ascent element if and only if is a LRMin and (2) is a descent element if and only if is a LRMax.
Lemma 2 gives a bijection between and the set of binary words of size . The word associated with is the word in which the letter at position records whether is an ascent or a descent element (equivalently, whether is a LRMin or a LRMax). We call this bijection .
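The encoding of such a permutation as a binary word can be sketched as follows. The sketch assumes that the class in question consists of the permutations avoiding 213 and 231, that is, those in which every element is the minimum or the maximum of the suffix starting at it, and that permutations are written over 1..n. The letters 'a' (ascent) and 'd' (descent), the word length n - 1 and the function names are conventions chosen here for illustration.

```python
def to_word(pi):
    """Encode a permutation of 1..n as a word over {'a', 'd'} of length n - 1.

    Raises ValueError if some element is neither the smallest nor the
    largest of the remaining values, i.e. pi is outside the assumed class.
    """
    lo, hi = 1, len(pi)
    letters = []
    for x in pi[:-1]:               # the last element carries no letter
        if x == lo:                 # suffix minimum, hence an ascent element
            letters.append('a')
            lo += 1
        elif x == hi:               # suffix maximum, hence a descent element
            letters.append('d')
            hi -= 1
        else:
            raise ValueError("element %r is neither a suffix minimum nor a suffix maximum" % (x,))
    return ''.join(letters)


def from_word(word):
    """Inverse of to_word: rebuild the permutation of 1..len(word)+1."""
    lo, hi = 1, len(word) + 1
    pi = []
    for letter in word:
        if letter == 'a':
            pi.append(lo)
            lo += 1
        elif letter == 'd':
            pi.append(hi)
            hi -= 1
        else:
            raise ValueError("unexpected letter %r" % (letter,))
    pi.append(lo)                   # lo == hi: the single remaining value
    return pi
```

For example, the permutation 1 5 2 4 3 is encoded as the word adad, and decoding adad gives back 1 5 2 4 3. Both directions run in linear time.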
A -avoiding permutation has a particular form. If we take only the descent elements, the points draw a north-east to south-west line, and if we take only the ascent elements, the points draw a south-east to north-west line. This shapes the permutation as a . For convenience, when drawing a random -avoiding permutation we will sometimes represent a sequence of ascent/descent elements by lines (see figures 4 and 5).
The following lemma is central to our algorithm.
Let and be two -avoiding permutations. Then, matches if and only if there exists a subsequence of such that .
Proof (of Lemma 2)
The forward direction is obvious. We prove the backward direction by induction on the size of the pattern . The base case is a pattern of size . Suppose that and thus . Let , , be a subsequence of such that , this reduces to saying that , and hence that is a matching of in . A similar argument shows that the lemma holds true for . Now, assume that the lemma is true for all patterns up to size . Let and let , be a subsequence of of length such that . As by the inductive hypothesis, it follows that is a matching of . Moreover thus and are both either the minimal or the maximal element of their respective subsequences. Therefore, is a matching of in . ∎
We are now ready to solve the pattern matching problem in case both and are -avoiding.
Let and be two -avoiding permutations. One can decide whether matches in linear time.
According to Lemma 1 the problem reduces to deciding whether occurs as a subsequence in . A straightforward greedy approach solves this issue in linear time. ∎
Thanks to Corollary 2, we can compute the associated words while running the greedy algorithm, which gives us an online algorithm.
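The greedy procedure can be sketched in a few lines. As an assumption, the sketch takes both inputs to be permutations of 1..n and 1..k avoiding 213 and 231 (every element is the smallest or the largest of the remaining values), and it uses the letters 'a' and 'd' for ascent and descent elements. The letters of the text are produced on the fly while scanning it from left to right, so the text is read once and never revisited. The function names are hypothetical.

```python
def letters(pi):
    """Yield 'a' or 'd' for every element of pi except the last one."""
    lo, hi = 1, len(pi)
    for x in pi[:-1]:
        if x == lo:
            lo += 1
            yield 'a'
        elif x == hi:
            hi -= 1
            yield 'd'
        else:
            raise ValueError("input is not in the assumed class")


def matches_in_class(pi, sigma):
    """Greedy linear-time test: does pi match sigma (both in the assumed class)?"""
    if len(sigma) > len(pi):
        return False
    target = list(letters(sigma))
    matched = 0                      # number of letters of sigma found so far
    for letter in letters(pi):       # letters of pi are computed online
        if matched == len(target):
            break
        if letter == target[matched]:
            matched += 1
    # Once every letter is matched, an element of pi always remains to the
    # right to play the role of the last element of sigma.
    return matched == len(target)
```

For example, with the text 1 5 2 4 3 (word adad), the pattern 3 2 1 (word dd) is found, whereas the pattern 1 2 3 4 (word aaa) is not, since the text contains only two ascent elements.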
4 only is -avoiding
This section focuses on the pattern matching problem in case only the pattern avoids and . We need to consider a specific decomposition of into factors: we split the permutation into maximal sequences of ascent elements and descent elements, respectively called ascent factors and descent factors. This corresponds to splitting the permutation between every pair of ascent-descent elements and descent-ascent elements (see figure 5). For the special case of -avoiding, this corresponds to splitting the permutation into maximal sequences of consecutive elements. We will label the factors from right to left.
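The decomposition into factors can be sketched as a single left-to-right pass. In this illustration the factors are returned from left to right (the text labels them from right to left), and the last element, which is neither an ascent nor a descent, is attached to the last factor. Both choices, as well as the function name, are assumptions made for this sketch.

```python
def factors(pi):
    """Split pi into maximal runs of ascent elements and of descent elements.

    Returns a list of (kind, elements) pairs, kind being 'ascent' or 'descent',
    listed from left to right. Requires at least two elements.
    """
    if len(pi) < 2:
        raise ValueError("at least two elements are needed to define a factor")
    kinds = ['ascent' if pi[i] < pi[i + 1] else 'descent'
             for i in range(len(pi) - 1)]
    result = []
    for x, kind in zip(pi, kinds):
        if result and result[-1][0] == kind:
            result[-1][1].append(x)      # extend the current run
        else:
            result.append((kind, [x]))   # the kind changes: start a new factor
    result[-1][1].append(pi[-1])         # assumption: last element joins the last factor
    return result
```

For instance, the permutation 1 2 9 8 3 4 7 5 6 is split into the factors 1 2, 9 8, 3 4, 7 and 5 6, which alternate between ascent and descent factors.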
We introduce the notation : Suppose that is a subsequence of , is the index of the leftmost element of in . Thus for every , stands for the index in of the leftmost element of . For example, is split as . Hence with , , and . Furthermore, , , and . We represent the elements matching an ascent (resp. descent) factor by a rectangle which has the leftmost matched point of the factor as its bottom left (resp. top left) corner and the rightmost matched point of the factor as its top right (resp. bottom right) corner. We can remove the two rightmost rectangles, replace them by the smallest rectangle that contains both of them, and repeat this operation to represent part of a matching (see the accompanying figure).
A factor is either an increasing or a decreasing sequence of elements. Thus, when matching a factor, it is enough to find an increasing or a decreasing subsequence of the same size as the factor or bigger.
Given a permutation and a suffix of its decomposition , if is an ascent (respectively descent) factor then the maximal (resp. minimal) element of is the leftmost element of .
This is a corollary of Lemma 1. It states that, given a permutation in , if the permutation starts with ascent (respectively descent) elements, then the maximal (resp. minimal) element of this permutation is the first descent (resp. ascent) element (see figure 6).
We now define the set as the set of every subsequence of that starts at and that is a matching of and .
Let be a permutation, be an ascent (respectively descent) factor, be a subsequence of such that and that minimizes (resp. maximizes) the match of the leftmost element of . For all subsequences and for all subsequences of , such that , if is a matching of such that the leftmost element of is matched to then the subsequence is a matching of such that the leftmost element of is matched to .
This lemma states that given any matching of , where is an ascent (resp. descent) factor, we can replace the part of the matching that matches by any matching that minimizes (resp. maximizes) the leftmost element of . Indeed, the leftmost element of is the maximal (resp. minimal) element of (see figure 7).
Proof (of Lemma 3)
By definition is a matching of . To prove that is a matching of we need to prove that every element of t is larger than every element of . If is a matching of then every element of is larger than every element of . Moreover, the maximal element of is smaller than the maximal element of , so every element of is smaller than every element of , thus every element of is smaller than every element of . We use a similar argument if is a descent factor. ∎
Let be a permutation, be an ascent (respectively descent) factor and be a subsequence of such that and that minimizes (resp. maximizes) the match of the leftmost element of . The following statements are equivalent:
There exists a matching of in with the leftmost element of matched to .
There exists a matching of in such that is a matching of in with the left most element of matched to .
This corollary goes a step further than the previous lemma: it states that if there is no matching using a match that minimizes (resp. maximizes) the leftmost element of , then there does not exist any matching at all. This is central to the algorithm because it allows us to test only the matchings that minimize (resp. maximize) the leftmost element of (see figure 7).
Let and .
One can decide in time and space if matches .
We first introduce a set of values needed for our proof. Let (resp. ) be the longest increasing (resp. decreasing) sequence in starting at , with every element of this sequence smaller (resp. bigger) than . and can be computed in time (see ). As stated before, those values allow us to find a matching of a factor.
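To make these values concrete, the sketch below tabulates, for every starting index and every upper bound on values, the length of the longest increasing subsequence that starts there and stays below the bound. It is a naive cubic-time dynamic program written for illustration only, not the faster method referred to above, and it makes two assumptions: the text is a permutation of 1..n, and the bound is given as a value b rather than in the indexing of the text. The decreasing case is symmetric.

```python
def bounded_lis_table(pi):
    """Return table with table[b][i] = length of the longest increasing
    subsequence of pi that starts at index i and uses only values < b.

    table[b][i] is 0 when pi[i] >= b. Bounds b range over 1..n+1, so
    b = n + 1 means "no bound".
    """
    n = len(pi)
    table = {}
    for b in range(1, n + 2):
        row = [0] * n
        for i in range(n - 1, -1, -1):      # fill from right to left
            if pi[i] < b:
                # Extend the best admissible sequence starting further right;
                # entries with pi[j] >= b are 0 and therefore never help.
                best = max((row[j] for j in range(i + 1, n) if pi[j] > pi[i]),
                           default=0)
                row[i] = 1 + best
        table[b] = row
    return table
```

On the text 1 5 2 4 3, the unbounded row is 3 1 2 1 1, and with the bound 4 it becomes 3 0 2 0 1, since the values 5 and 4 can no longer be used.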
Now consider the following set of values (see figure 8) :
Clearly there exists a matching of in if and only if there exists a such that and with the number of factors in . We now show how to compute those values recursively.
In the base case, one is looking for a matching of the first factor.
where and are the sets of elements matching the leftmost element of in a match of starting at . Suppose that is an ascent (resp. descent) factor; index and matching exists if and only if contains a matching of “compatible” with a matching in of . It is enough to ensure that every element of the matching of is smaller (resp. bigger) than the elements of the matching of .
Thus we can compute and as follows:
The number of factors is bounded by . Every instance of and can be computed in . There are base cases, each of which can be computed in time, thus computing all base cases takes time. There are different instances of and each one of them takes time to compute, thus computing every instance of takes time. There are different instances of and each one of them takes , because the size of an is bounded by , thus computing every takes time. Thus computing all the values takes . Every value takes space, thus the whole problem takes space. ∎
5 -avoiding bivincular patterns
This section is devoted to the pattern matching problem with -avoiding bivincular patterns. Recall that a bivincular pattern generalises a permutation pattern by being able to force elements to be consecutive in value or in position. Hence, a bivincular pattern imposes stronger constraints than a permutation pattern, and intuitively we cannot use the previous algorithm. As for a -avoiding permutation, we can describe a structural property of a -avoiding bivincular pattern.
Given a -avoiding bivincular pattern, if (this implies that ) and if is an ascent (resp. descent) element and is an ascent (resp. descent) element then:
For every , (resp. ), is a descent (resp. ascent) element.
This lemma states that if two ascent (resp. descent) elements need to be matched to consecutive elements in value then every element between those two elements (if any) is a descent (resp. ascent) element.
Proof (of Lemma 4)
Suppose that there exists , , such that is ascent. Ascent elements are increasing so , which is in contradiction with . We use a similar argument if is a descent element. ∎
Let be a -avoiding bivincular of length and . One can decide in time and space if matches .
We consider the following set of booleans: given a -avoiding bivincular pattern, and a text ,
The argument (respectively ) stands for the match of the last ascent (resp. descent) element matched plus (resp. minus) one. We now show how to compute those booleans recursively (see fig. 9).
The base case finds a matching for the rightmost element of the pattern. If the last element does not have any restriction on positions or on values, then is true if and only if is matched , which is true if . If then must be matched to the rightmost element of , thus must be the . If then must be matched to the largest element, which is . If then must be matched to the smallest element, which is . If then the matched elements of and must be consecutive in value; by recursion, the value of the matched element of is recorded in and by adding to it, thus must be matched to . If then the matched elements of and must be consecutive in value; by recursion, the value of the matched element of is recorded in and by subtracting from it, thus must be matched to .
We need to consider 3 cases for the problem :
If then :
which is immediate from the definition. If then it cannot be part of a matching of in with every matched element in .
If and is an ascent element then :
Remark that can be matched to because . Thus if matches with every element of the matching in then matches . The last condition corresponds to knowing whether there exists , such that is true. The first case corresponds to a matching without restriction on position and on value. The second case asks for the matches of and to be consecutive in value, but the match of is thus we want . The fourth case asks for the matches of and to be consecutive in index, thus the match of must be . The third case is a union of the second and fourth cases. Note that if is an ascent element we cannot have the condition that the matches of and have to be consecutive in value.
If and is a descent element then :
The same remark as the last case holds.
Clearly if is true then matches . We now discuss the position and value constraints.
There are 3 types of position constraints that can be added by underlined elements.
If then the leftmost element of must be matched to the leftmost element of ( is matched to on a matching of in ). This constraint is satisfied by requiring that the matching starts at the leftmost element of : if is true.
If then the rightmost element must be matched to the rightmost element of ( is matched to on a matching of in ). This constraint is checked in the base case.
If then the indices of the matched elements of and must be consecutive. In other words, if is matched to then must be matched to . We ensure this restriction by recursion, by requiring that the matching of starts at index (see figure 9).
There are 3 types of value constraints that can be added by overlined elements.
If (and thus ) then the minimal value of must be matched to the minimal value of .
If is an ascent element, then remark that every problem is true if is matched to the element with value (by recursion), thus it is enough to require that . Now remark that is the leftmost ascent element; indeed, if not, then there exists an ascent element , and by definition , which is not possible as must be the minimal element. As a consequence , , are descent elements. Moreover, the recursive call from a descent element does not modify the lower bound, thus for every , (see figure 9).
If is a descent element then ( is the right most element). Thus every is a base case and is true if is matched to .
If (and thus ) then the maximal value of must be matched to the maximal value of .
If is an descent element, then remark that every recursive call
is true if is matched to the element with value (by recursion), thus it is enough to require that . Now remark that is the leftmost descent element; indeed, if not, then there exists a descent element , and by definition , which is not possible as must be the maximal element. As a consequence , , are ascent elements. Moreover, the recursive call from an ascent element does not modify the upper bound, thus for every , (see figure 9).
If is an ascent element then ( is the rightmost element). Thus every is a base case and is true if is matched to .
If , (which implies that ) then if is matched to then must be matched to .
The case is a descent element, is an ascent element and (remark that this case is equivalent to the case is an ascent element,