Increasing subsequences of random walks
Abstract
Given a sequence of real numbers , we consider the longest weakly increasing subsequence, namely with and maximal. When the elements are i.i.d. uniform random variables, Vershik and Kerov, and Logan and Shepp proved that .
We consider the case when is a random walk on with increments of mean zero and finite (positive) variance. In this case, it is well known (e.g., using record times) that the length of the longest increasing subsequence satisfies . Our main result is an upper bound , establishing the leading asymptotic behavior. If is a simple random walk on , we improve the lower bound by showing that .
We also show that if is a simple random walk in , then there is a subsequence of of expected length at least that is increasing in each coordinate. The above one-dimensional result yields an upper bound of . The problem of determining the correct exponent remains open.
1 Introduction
For a function , its restriction to a subset of its domain is denoted . We say that is increasing if for all with . Define
The main goal of this paper is to investigate when is a random walk. The simple random walk is the most natural case, but our results apply to walks with steps of mean zero and finite (positive) variance, that is, such that is an i.i.d. sequence with and . By normalising we may clearly assume that . We say that is the simple random walk if .
The famous Erdős–Szekeres Theorem [6] implies that must contain either an increasing or a decreasing subsequence of length at least . This is sharp for general sequences, and it is easy to see that there are even step simple walks on for which the longest increasing subsequence has length of order . By symmetry, increasing and decreasing subsequences have the same length distribution, but this does not immediately imply that a similar bound holds with high probability.
In random settings, there have been extensive studies of the longest increasing subsequence in a uniformly random permutation, initiated by Ulam [14]. This is easily seen to be equivalent to the case of a sequence of i.i.d. (nonatomic) random variables. A rich theory arose from the study of this question, which is closely related to last passage percolation and other models. It was proved by Vershik and Kerov [15] and by Logan and Shepp [10] that and in probability as . In this case, much more is known. Baik, Deift and Johansson [2] proved that the fluctuations of scaled by converge to the Tracy–Widom distribution, which first arose in the study of the Gaussian Unitary Ensemble. We refer the reader to Romik’s book [12] for an excellent survey of this problem.
On the other hand, it appears that this problem has not been studied so far even for a simple random walk . The expected length of the longest strictly increasing subsequence of is at most the expected size of the range , hence is . Thus we consider (weakly) increasing subsequences. Both the set of record times and the zero set of yield increasing subsequences of expected length . It is not immediate how to do any better. The largest level set of still has size . See Figure 1 for the longest increasing subsequence in one random walk instance. Note that the set of record times yields a similar lower bound for a general random walk with mean zero and finite variance.
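As a quick illustration of the record-time lower bound (a sketch; the helper name is ours), the walk restricted to its weak record times is weakly increasing by construction:

```python
def weak_record_times(path):
    """Times k at which the walk attains a value >= all previous values.
    The walk restricted to these times is weakly increasing."""
    records, best = [], float('-inf')
    for k, x in enumerate(path):
        if x >= best:
            records.append(k)
            best = x
    return records
```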
Upon reflection, one finds a number of arguments that yield the weaker bound for a simple random walk . For example, first one can show that with high probability, in every interval , each value is visited at most times. Assume is such that is increasing. For each define the interval , where is the first (and is the last) time such that . By monotonicity the intervals are disjoint. The length of the subsequence is then bounded by , where the number of intervals is at most . As with high probability, the Cauchy–Schwarz inequality gives the upper bound . However, going beyond the exponent requires more delicate arguments even in the case of a simple random walk.
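The counting in the argument above rests on level-set sizes; a small helper (our own naming) measures the largest number of visits to any single value:

```python
from collections import Counter

def max_level_set(path):
    """Maximum number of visits to any single value, i.e. the size of
    the largest level set of the walk."""
    return max(Counter(path).values())
```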
Although Ulam’s problem and the problem of this paper are superficially similar, it turns out that their structure is different, and finding monotone subsequences in random walks is more closely related in nature to restriction theorems for continuous functions. In the continuous setting the upper Minkowski dimension plays the role of counting. Balka and Peres [3] showed that a Brownian motion is not monotone on any set of Hausdorff dimension greater than . However, working with the Minkowski dimension requires understanding the structure of a set at a specific scale, which is not needed for the Hausdorff dimension. The proof of [3] is based on Kaufman’s uniform dimension doubling theorem for two-dimensional Brownian motion. A key fact used there is that a closed set in of Hausdorff dimension strictly greater than 1/2 intersects the zero set of a Brownian motion with positive probability. Hausdorff dimension cannot be replaced here by upper Minkowski dimension. Therefore, the methods developed in [3] are not powerful enough to prove the leading term of the upper bound even in the case of a simple random walk. However, Balka, Máthé and Peres analyzed the behaviour of some self-affine functions motivated by [3] and questions of Kahane and Katznelson [7], which had a large impact on this paper. Later on, Máthé and the authors of this paper [1] proved restriction theorems for fractional Brownian motion using methods similar to those of this paper, and the case of self-affine functions was also handled in [1]. In particular, they proved that a Brownian motion is not monotone on any set of upper Minkowski dimension greater than . Hence the interaction between this paper and [1] played an important role in both directions. For other restriction theorems in the continuous, deterministic case, and for generic functions in the sense of Baire category, see Elekes [5], Kahane and Katznelson [7], and Máthé [11].
The main goal of this paper is to prove the following theorem.
Theorem 1.
Let be a random walk with i.i.d. steps satisfying and . For all large enough, for all we have
Moreover, if for some then we have
In the following corollaries let be a random walk as in Theorem 1.
Corollary 1.1.
For all with probability we have
Corollary 1.2.
For all and large enough
In the other direction, we show that in the case of a simple random walk, with high probability there are increasing subsequences somewhat longer than the trivially found ones.
Theorem 2.
Let be a simple random walk. For every for all large enough
Consequently, for all large enough we have
In Section 4 we consider higher-dimensional random walks. Let and let . We say that is increasing on a set if all the coordinate functions of are non-decreasing, i.e. is increasing with respect to the coordinate-wise partial order on . Generalizing , we define
Since the restriction of a random walk to a single coordinate is again a random walk, if is a dimensional random walk with mean and bounded second moment then Corollary 1.1 implies that with probability . For a large class of twodimensional random walks we are able to prove a lower bound as well. However, the problem of determining the correct exponent remains open.
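For concreteness, a direct O(n²) dynamic program (the function name is ours) computes the longest subsequence, in the given time order, that is weakly increasing in every coordinate:

```python
def longest_coordinatewise_increasing(points):
    """Longest subsequence (in time order) of a list of tuples that is
    weakly increasing in every coordinate; simple O(n^2) dynamic program.
    best[i] = length of the longest such subsequence ending at index i."""
    n = len(points)
    best = [1] * n
    for i in range(n):
        for j in range(i):
            if all(points[j][c] <= points[i][c] for c in range(len(points[i]))):
                best[i] = max(best[i], best[j] + 1)
    return max(best, default=0)
```

The quadratic runtime is only for illustration; it suffices for small simulations of the two-dimensional walks considered in this section.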
Theorem 3.
Let be a two-dimensional random walk with steps for which:
the mean is the zero vector,
the covariance matrix is the identity matrix,
the coordinates of have finite moments for some .
Then there is a constant such that for every and
Consequently, for all we have
Finally, in Section 5 we state some open questions.
Acknowledgments
OA thanks the organizers of the probability, combinatorics and geometry meeting at the Bellairs Institute, as well as several participants, in particular Simon Griffiths, who proposed this problem, and Louigi Addario-Berry, Guillaume Chapuy, Luc Devroye, Gábor Lugosi and Neil Olver for useful discussions. The present collaboration took place mainly during visits of OA and RB to Microsoft Research. We are indebted to András Máthé and Boris Solomyak for useful suggestions.
2 Upper bound
To simplify notation in the proof, it is convenient to assume that the length of the random walk is a power of . By monotonicity in of , we can interpolate for all other . Our main result, Theorem 1, follows from the following by this monotonicity and the substitution .
Theorem 2.1.
Let be a random walk with and . For all large enough for all we have
Moreover, if for some then we have
The main goal of this section is to prove Theorem 2.1. The key is a multiscale argument: the time up to is split into intervals. We consider the number of these intervals that intersect our set , as well as the sizes of the intersections. Repeating this allows us to obtain inductively better and better bounds. The dependence on the randomness of the walk enters through estimates on the local time, which we derive in the following subsection.
Throughout this section, fix a random walk with and . Various constants below depend only on the law of . We will use the following theorems in this section.
Theorem 2.2 (Petrov, [13]).
There is a constant such that for all and we have
For the following theorems see [8, Thm. A.2.5] and its corollaries.
Theorem 2.3.
For all and we have
Theorem 2.4.
Assume that for some . Then there is a constant such that for all and we have
2.1 Scaled local time estimates
Definition 2.5.
Let and . A time interval of order is of the form
A value interval of order is of the form
Note that a time interval is a subset of , while a value interval is a real interval. For all let be the set of time intervals of order contained in . Clearly .
Definition 2.6.
The scaled local time is the number of order intervals in in which takes at least one value in :
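The definition can be rendered literally in code (a simplified sketch: we assume here that time intervals of a given order are consecutive blocks of equal length, and take the block length and value-interval endpoints as plain parameters rather than the specific dyadic choices of the definition):

```python
def scaled_local_time(path, block_len, a, b):
    """Number of consecutive blocks of block_len time steps during which
    the walk takes at least one value in the interval [a, b]."""
    count = 0
    for start in range(0, len(path), block_len):
        block = path[start:start + block_len]
        if any(a <= x <= b for x in block):
            count += 1
    return count
```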
Our intermediate goal is to prove the following uniform estimate on scaled local times.
Proposition 2.7.
There is a such that for all large enough
We begin with an estimate on the expectation of a single scaled local time. Let and denote the expectation and probability for a random walk started at .
Lemma 2.8.
For some absolute constant and for all we have
Proof.
The proof uses the idea that, conditioned on the event that some time interval contributes to , the random walk is, with probability bounded away from zero, still nearby at the end of the time interval.
The strong Markov property of the walk and translation invariance imply that it is enough to consider the case . Since the central limit theorem yields , there exist and such that for all we have
For each let be the event that and let be the event
that counts towards . By the above inequality, for all we have
By Theorem 2.2 there exists a constant such that for all and we have
The above two inequalities imply that
where . The proof is complete. ∎
Next we estimate the tail of a single scaled local time.
Lemma 2.9.
There is an absolute constant such that for all and we have
Proof.
Let , where is the constant of Lemma 2.8 and denotes rounding up. By Markov’s inequality we have , establishing the claim for . We proceed inductively: assume that the claim holds for some . Observe the walk from time until either we reach time or subintervals of order contribute to . The latter happens with probability at most . By the strong Markov property, the conditional probability that there are additional subintervals contributing to is at most , proving the claim for . ∎
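The induction above follows a standard pattern for geometric tail bounds. In generic notation (our placeholder symbols $X$ and $m$, since the displayed quantities are abbreviated), the base case and inductive step combine as:

```latex
% Base case, by Markov's inequality with m = 2\lceil \mathbb{E}X \rceil:
%   \mathbb{P}(X \ge m) \le \mathbb{E}X / m \le 1/2.
% Inductive step, by the strong Markov property:
\mathbb{P}\bigl(X \ge (j+1)m\bigr)
  = \mathbb{P}(X \ge jm)\,
    \mathbb{P}\bigl(X \ge (j+1)m \,\big|\, X \ge jm\bigr)
  \le 2^{-j}\cdot\tfrac{1}{2} = 2^{-(j+1)}.
```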
Proof of Proposition 2.7.
Let , where is the constant of Lemma 2.9. We apply Lemma 2.9 with to each of the relevant . Since , there are choices for each of and . As , there are at most options for . If then every with has scaled local time . This is likely, as Theorem 2.3 yields that . These imply that for all large enough we have
Clearly we may also require . ∎
2.2 No long increasing subsequence
Next, we use Proposition 2.7 to rule out the existence of very long increasing subsequences in the random walk. We need the following definition.
Definition 2.10.
Let be a function and let be a finite set such that . The variation of restricted to is defined as
Note that if is increasing then equals the diameter of . The upper bound of Theorem 2.1 follows from the following proposition.
Proposition 2.11.
Proof.
Let be a set such that is increasing. For let
be the set of intervals of order that intersect , and its size with a convenient normalization. Clearly and . In order to prove the claim we inductively prove bounds on .
Let and index the elements of , and suppose that interval contains intervals in , so that . By assumption (1) for all we have that is visited in at most subintervals of . It follows that if then must visit at least value intervals of order . The diameter of the union of these visited intervals is at least . This leads to a variation bound
(2.1) 
Assumption (2) yields that . Thus
(2.2) 
Inequalities (2.1) and (2.2) and imply that
and therefore
(2.3) 
Using and dividing (2.3) by yields
where we have used that . As , the above inequality implies
for every . In particular we get for
Finally, we use Proposition 2.11 to derive an estimate on the likelihood of long increasing subsequences in a random walk.
Proof of Theorem 2.1.
First we prove the theorem for . Let
where denotes rounding up. Note that . We consider up to time . For large enough with probability the event of Proposition 2.7 occurs for . Moreover, Theorem 2.3 implies that
Thus with probability at least the conditions and conclusion of Proposition 2.11 hold for .
Suppose additionally that for some . Then Theorem 2.4 yields that for some constant and for all large enough
Thus with probability at least the conditions and conclusion of Proposition 2.11 hold for .
Let be such that Proposition 2.11 holds for . Since is increasing in , we obtain that for large enough
This proves Theorem 2.1 if . For the general case, fix ; it is enough to prove that for all we have
(2.4) 
then setting concludes the proof, where denotes rounding down. Let . If is already defined then let be the minimal integer such that . Since increases by at most when incrementing , we in fact have . By the strong Markov property at , we see that are i.i.d. copies of . However, the event requires for all , which happens with probability at most . This implies (2.4), and the proof is complete. ∎
3 Lower bound for a simple random walk
The goal of this section is to prove Theorem 2. For simplicity, we present our argument only for the simple random walk on . However, it seems that the argument should extend with minor changes to any random walk with bounded integer steps of mean zero and finite variance. The construction relies on values appearing multiple times in the walk, and fails more fundamentally if the walk is not supported on multiples of some .
Definition 3.1.
Let denote the hitting time of by the simple random walk. Let be the 2-order of , that is, the number of times it is divisible by 2.
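A minimal helper (our own naming) computing the 2-order used throughout this section:

```python
def two_order(n):
    """2-order of a positive integer n: the number of times n is
    divisible by 2, i.e. the exponent of 2 in its factorization."""
    k = 0
    while n % 2 == 0:
        n //= 2
        k += 1
    return k
```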
Lemma 3.2.
Consider a simple random walk from conditioned to hit before returning to , and stopped when it reaches . Let be the times of the first and last visits to . Then:

The number of visits to is geometric with mean .

The walk on is a walk conditioned to hit before returning to , and stopped when it reaches .

The walk on is a walk from conditioned to hit without returning to , and stopped when it reaches .

The two subwalks and the geometric variable are independent.
Proof.
In order to prove the first statement we first consider a simple random walk from up to the time when it reaches either or . This walk has probability of returning to without hitting , at which time another excursion from begins. Therefore the number of visits to on is geometric with mean . Moreover, the number of visits to is independent of whether the walk hits or , so when conditioning on hitting the distribution is still geometric with mean , which proves the first statement.
Now we return to our original walk from . Excursions from either return to , or hit , or hit . The partition into excursions around does not give any information on the trajectory within each excursion, except for its type, and the other claims follow. ∎
Lemma 3.3.
Let be a simple random walk. For all we have
and for any ,
Proof.
We construct an increasing subsequence of as follows. Informally, we take some times to be in our index set, greedily in decreasing order of the 2-order of .
For each integer we construct an interval . The intervals are such that if then . Given such intervals, we have that is increasing along , where
We start by setting and , where is the last visit to before . Let and be such that and assume by induction that are already defined for all for which . Now we define . Let and , then clearly . Thus and are already defined by the inductive hypothesis. Let , where is the first hitting time of after and is the time of the last visit to before . See Figure 2 for an example.
Assuming , we show that the law of restricted to is that of a simple random walk started at , conditioned to hit before returning to , and stopped when hitting . This follows inductively from Lemma 3.2, together with the fact that after the last visit to before , the walk cannot return to .
From the above, we deduce that for all with the number of visits to in is geometric with mean , and these geometric variables are all independent. Since for each there are integers with , we get that
As any geometric satisfies and our geometric random variables are all independent, we obtain that
The second claim now follows by Chebyshev’s inequality. ∎
Proof of Theorem 2.
Fix . For large enough let be an integer such that
(3.1) 
Then we have that
(3.2) 
Applying Lemma 3.3 for this we get
Moreover, [9, Thm. 2.17] and (3.1) imply that
Since and as , plugging the previous bounds in (3.2) gives for large enough
Finally, applying the above inequality for implies that
4 Random walks in higher dimensions
The main goal of this section is to prove Theorem 3. As noted, the upper bound in the one-dimensional case holds trivially in every dimension. For sequences we use the notation if as . The lower bound is based on the following estimate by Denisov and Wachtel; see [4, Example 2], and see there for the history of similar estimates for Brownian motion and random walks.
Theorem 4.1.
Let be a twodimensional random walk satisfying the conditions of Theorem 3. Let be the hitting time of the positive quadrant: . Then there is some so that
More generally, for a higher dimensional random walk , define the hitting time
Denisov and Wachtel [4, Theorem 1] proved that for some and , where is the exponent corresponding to Brownian motion staying outside a quadrant up to time (assuming again that the walk is normalized so that and , and that for some ). Consequently, the following lemma completes the proof of Theorem 3 (with ), and gives a similar lower bound for random walks in higher dimensions.
Lemma 4.2.
Let be a random walk in , and let be such that
Then there is a constant such that for all and
(4.1) 
Consequently, for all we have
Proof.
Fix . Define the greedy increasing subsequence with time indices given by the recursion
Setting , we see that if then . This gives a set with i.i.d. increments with the law of .
Choose such that for all
and define the truncated variables . Then
(4.2) 
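The greedy recursion in the proof above can be sketched as follows (a simplified rendering with our own names; the truncation step is omitted). After each selected time, we take the first later time at which every coordinate is at least its value at the previously selected time, so the walk restricted to the selected times is weakly increasing in every coordinate by construction.

```python
def greedy_increasing_times(path):
    """Greedy time indices for a walk given as a list of coordinate
    tuples: after each selected time, select the first later time at
    which every coordinate weakly dominates the last selected value."""
    times = [0]
    for t in range(1, len(path)):
        prev = path[times[-1]]
        if all(prev[c] <= path[t][c] for c in range(len(prev))):
            times.append(t)
    return times
```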
5 Open Questions
There are many potential extensions of our results. Two central open problems are to reduce the gap between the lower and upper bounds in dimension one, and to determine the right order of magnitude in higher dimensions. Moreover, our lower bound in Theorem 2 is specific to the simple random walk, and our proof does not work for general random walks.
Question 5.1.
Let be a random walk with zero mean and finite (positive) variance. Is there a constant such that, with probability ,
Does this upper bound hold at least when is a simple random walk?
Question 5.2.
Let and let be a dimensional simple random walk. What is the order of magnitude of