# Chernoff’s distribution and differential equations of parabolic and Airy type

Piet Groeneboom (P.Groeneboom@tudelft.nl), Delft University

Steven Lalley (lalley@galton.uchicago.edu), University of Chicago

Nico Temme (Nico.Temme@cwi.nl), CWI
###### Abstract

We give a direct derivation of the distribution of the maximum and the location of the maximum of one-sided and two-sided Brownian motion with a negative parabolic drift. The argument uses a relation between integrals of special functions, in particular involving integrals with respect to functions which can be called “incomplete Scorer functions”. The relation is proved by showing that both integrals, as a function of two parameters, satisfy the same extended heat equation, and the maximum principle is used to show that these solutions must therefore have the stated relation. Once this relation is established, a direct derivation of the distribution of the maximum and location of the maximum of Brownian motion minus a parabola is possible, leading to a considerable shortening of the original proofs.

AMS subject classification: Primary 60J65.

Keywords: parabolic partial differential equations, Airy functions, Scorer’s functions, Brownian motion, parabolic drift, Cameron-Martin-Girsanov, Feynman-Kac.

## 1 Introduction

Let $\{W(t): t\in\mathbb{R}\}$ be standard two-sided Brownian motion, originating from zero. The determination of the distribution of the (almost surely unique) location of the maximum of $\{W(t)-t^2: t\in\mathbb{R}\}$ has a long history, which probably started with Chernoff’s paper [1], a study of the limit distribution of an estimator of the mode of a distribution. In the latter paper the density of the location of the maximum of $\{W(t)-t^2: t\in\mathbb{R}\}$, which we will denote by

$$Z=\operatorname*{argmax}_{t}\{W(t)-t^2,\ t\in\mathbb{R}\}, \tag{1.1}$$

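is characterized in the following way. Let $u(t,x)$ be the solution of the heat equation

$$\frac{\partial}{\partial t}u(t,x)=-\frac12\frac{\partial^2}{\partial x^2}u(t,x),$$

for $x\le t^2$, under the boundary conditions

$$u(t,x)\ge0,\qquad u(t,t^2)\stackrel{\text{def}}{=}\lim_{x\uparrow t^2}u(t,x)=1,\quad(t,x)\in\mathbb{R}^2,\qquad\lim_{x\downarrow-\infty}u(t,x)=0,\quad t\in\mathbb{R}.$$

Furthermore, let the function $u_2$ be defined by

$$u_2(t)=\lim_{x\uparrow t^2}\frac{\partial}{\partial x}u(t,x).$$

Then the density of (1.1) is given by

$$f_Z(t)=\tfrac12\,u_2(t)\,u_2(-t),\qquad t\in\mathbb{R}. \tag{1.2}$$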

The original attempts to compute the density were based on numerically solving the heat equation above, but it soon became clear that this method did not produce a very accurate solution, mainly because of the rather awkward boundary conditions. However, around 1984 the connection with Airy functions was discovered and this connection was exploited to give analytic solutions in the papers [2], [13] and [4], which were all written in 1984, although the last paper appeared much later.

There seems to be a recent revival of interest in this area of research; see, e.g., [9], [5], [6], [7], [10] and [8]. Also, the main theorem (Theorem 2.3) in [3] uses Theorem 3.1 of [4] in an essential way. These recent papers (except [10]) rely heavily on the results in [2] and [4], but it seems fair to say that the derivation of these results in [2] and [4] is not a simple matter. The most natural approach still seems to be to use the Cameron-Martin-Girsanov formula for making the transition from Brownian motion with drift to Brownian motion without drift, and next to use the Feynman-Kac formula for determining the distribution of the Radon-Nikodym derivative of the Brownian motion with parabolic drift with respect to the Brownian motion without drift from the corresponding second order differential equation. This is the approach followed in [4]. However, the completion of these arguments used a lot of machinery which one would prefer to avoid. For this reason we give an alternative approach in the present paper.

The starting point of our approach is Theorem 2.1 in [4], which is given below for convenience. Theorem 2.1 in [4] in fact deals with the process $W(t)-ct^2$ for an arbitrary positive constant $c$, but since we can always deduce the results for general $c$ from the case $c=1$, using Brownian scaling, see, e.g., [8], we take $c=1$ for convenience in the theorem below. Another simplification is that we consider first hitting times of $0$ for processes starting at $x<0$ instead of first hitting times of $a$ of processes starting at $x<a$ for an arbitrary $a\in\mathbb{R}$, using space homogeneity. We made slight changes of notation, in particular the function $h_x$, $x>0$, of [4] is again denoted by $h_x$, but now with a negative argument, so $h_x$ in our paper corresponds to $h_{-x}$ in [4].

###### Theorem 1.1 (Theorem 2.1 in [4]).

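Let, for $s\in\mathbb{R}$ and $x<0$, $Q^{(s,x)}$ be the probability measure on the Borel $\sigma$-field of $C([s,\infty):\mathbb{R})$, corresponding to the process $\{X(t): t\ge s\}$, where $X(t)=W(t)-t^2$, starting at position $x$ at time $s$, and where $\{W(t): t\ge s\}$ is Brownian motion, starting at $x+s^2$ at time $s$. Let the first passage time $\tau_0$ of the process $X$ be defined by

$$\tau_0=\inf\{t\ge s: X(t)=0\},$$

where, as usual, we define $\tau_0=\infty$ if $\{t\ge s: X(t)=0\}=\emptyset$. Then

(i)

$$Q^{(s,x)}\{\tau_0\in dt\}=e^{-\frac23(t^3-s^3)+2sx}\,\psi_x(t-s)\,E^0\Bigl\{e^{-2\int_0^{t-s}B(u)\,du}\Bigm|B(t-s)=-x\Bigr\}\,dt,$$

where $\{B(u): u\ge0\}$ is a Bes(3) process, starting at zero at time $0$, with corresponding expectation $E^0$, and where $\psi_x(t-s)$ is the value at $t-s$ of the density of the first passage time through zero of Brownian motion, starting at $x$ at time $0$.

(ii)

$$Q^{(s,x)}\{\tau_0\in dt\}=e^{-\frac23(t^3-s^3)+2sx}\,h_x(t-s)\,dt,$$

where the function $h_x$ has Laplace transform

$$\hat h_x(\lambda)=\int_0^\infty e^{-\lambda u}h_x(u)\,du=\frac{\mathrm{Ai}(\xi-4^{1/3}x)}{\mathrm{Ai}(\xi)},\qquad\xi=2^{-1/3}\lambda>0,$$

and Ai denotes the Airy function Ai.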

###### Remark 1.1.

Note that the function $h_x$ in the definition of the density of the stopping time $\tau_0$ has, by part (ii) of Theorem 1.1, the representation

$$h_x(t)=\frac{1}{2\pi}\int_{v=-\infty}^{\infty}e^{itv}\,\frac{\mathrm{Ai}(i2^{-1/3}v-4^{1/3}x)}{\mathrm{Ai}(i2^{-1/3}v)}\,dv,\qquad t>0. \tag{1.3}$$

This representation is obtained by inverting the Laplace transform and will be used in Section 3 and the proof of Lemma 9.1.

###### Remark 1.2.

Theorem 1.1 occurs in different forms in the literature, see, e.g., Theorem 2.1 in [12]. For the convenience of the reader, we give a short self-contained proof of Theorem 1.1 in Appendix A. The interpretation in terms of the Bessel process is not strictly necessary, but it leads naturally to an interpretation in terms of Brownian excursions, further explored in Section 4 of [4].

Theorem 1.1 should in principle be sufficient to derive the density (1.2), since, defining

$$q(s)=\lim_{x\uparrow0}\frac{\partial}{\partial x}Q^{(s,x)}\{X_t<0,\ \forall t\ge s\}=\lim_{x\uparrow0}\frac{\partial}{\partial x}Q^{(s,x)}\{\tau_0=\infty\},$$

we find:

$$f_Z(s)=\tfrac12\,q(s)\,q(-s),$$

following a line of reasoning similar to the derivation of (1.2) in [1] (in Chernoff’s argument, which is based on a random local perturbation of the starting point and the ensuing convolution equation, the factor $\tfrac12$ can be interpreted as the expectation of the squared maximum of the standard Brownian bridge). Moreover, by (ii) of Theorem 1.1 we have, using inversion of the Laplace transform along the imaginary axis:

$$\begin{aligned}
Q^{(s,x)}\{\tau_0=\infty\}&=1-Q^{(s,x)}\{\tau_0<\infty\}=1-\int_{t=s}^{\infty}Q^{(s,x)}\{\tau_0\in dt\}\\
&=1-\int_{t=s}^{\infty}e^{-\frac23(t^3-s^3)+2sx}\,h_x(t-s)\,dt\\
&=1-\frac{e^{2sx+\frac23s^3}}{2\pi}\int_{v=-\infty}^{\infty}\frac{\mathrm{Ai}(2^{-1/3}iv-4^{1/3}x)}{\mathrm{Ai}(2^{-1/3}iv)}\int_{t=0}^{\infty}e^{itv-\frac23(s+t)^3}\,dt\,dv.
\end{aligned}\tag{1.4}$$

So we would be done if we can deal with the properties of the integral in the last line.

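However, the latter integral has some unpleasant properties. Taking the special case $s=0$, the integral reduces to:

$$\frac{1}{2\pi}\int_{v=-\infty}^{\infty}\frac{\mathrm{Ai}(2^{-1/3}iv-4^{1/3}x)}{\mathrm{Ai}(2^{-1/3}iv)}\int_{t=0}^{\infty}e^{itv-\frac23t^3}\,dt\,dv=\frac12\int_{v=-\infty}^{\infty}\frac{\mathrm{Ai}(iv-4^{1/3}x)}{\mathrm{Ai}(iv)}\,\mathrm{Hi}(iv)\,dv,$$

where Hi denotes Scorer’s function Hi (the transition of the coefficient $\tfrac23$ of $t^3$ in the left-hand side to the coefficient $\tfrac13$ of $t^3$ in the definition of Scorer’s function was made by changes of variables in $t$ and $v$). But to treat the behavior of this integral (and its derivative with respect to $x$) as $x\uparrow0$, we cannot take limits inside the integral sign, since we then end up with divergent integrals. For the function Hi has the asymptotic expansion:

$$\mathrm{Hi}(z)\sim-\frac{1}{\pi z}\sum_{k=0}^{\infty}\frac{(3k)!}{k!\,(3z^3)^k},\qquad|\mathrm{ph}(-z)|<\tfrac23\pi-\delta,$$

for $\delta>0$ arbitrarily small, where $\mathrm{ph}(z)$ denotes the phase of $z$, and if we put $x$ equal to zero inside the integral we are stuck with a non-integrable integrand, whereas in fact:

$$\lim_{x\uparrow0}\frac12\int_{v=-\infty}^{\infty}\frac{\mathrm{Ai}(iv-4^{1/3}x)}{\mathrm{Ai}(iv)}\,\mathrm{Hi}(iv)\,dv=\lim_{x\uparrow0}Q^{(0,x)}\{\tau_0<\infty\}=1.$$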

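For this reason part (ii) of Theorem 1.1 was not directly used in the derivation of the density in [4], but instead the limit

$$Q^{(s,x)}\{\tau_0=\infty\}=\lim_{t\to\infty}Q^{(s,x)}\{\tau_0>t\}$$

was computed by first determining the transition density

$$Q^{(s,x)}\{X^{\partial}_t\in dy\},\qquad t>s,\quad x,y<0,$$

of the process $X^{\partial}$, which is the process $X$, killed when reaching $0$. The details of this computation were given in the appendix of [4], giving the result:

$$Q^{(s,x)}\{\tau_0=\infty\}=\frac{e^{\frac23s^3+2sx}}{4^{1/3}}\int_{v=-\infty}^{\infty}e^{-isv}\,\frac{\mathrm{Ai}(i\xi)\,\mathrm{Bi}(i\xi-4^{1/3}x)-\mathrm{Ai}(i\xi-4^{1/3}x)\,\mathrm{Bi}(i\xi)}{\mathrm{Ai}(i\xi)}\,dv, \tag{1.5}$$

where $\xi=2^{-1/3}v$, see Theorem 3.1 of [4]. So by (1.4) we must have the analytic relation

$$\begin{aligned}
&\frac{e^{\frac23s^3+2sx}}{2\pi}\int_{v=-\infty}^{\infty}\frac{\mathrm{Ai}(i\xi-4^{1/3}x)}{\mathrm{Ai}(i\xi)}\int_{t=0}^{\infty}e^{itv-\frac23(s+t)^3}\,dt\,dv\\
&\qquad=1-\frac{e^{\frac23s^3+2sx}}{4^{1/3}}\int_{v=-\infty}^{\infty}e^{-isv}\,\frac{\mathrm{Ai}(i\xi)\,\mathrm{Bi}(i\xi-4^{1/3}x)-\mathrm{Ai}(i\xi-4^{1/3}x)\,\mathrm{Bi}(i\xi)}{\mathrm{Ai}(i\xi)}\,dv,\qquad\xi=2^{-1/3}v.
\end{aligned}\tag{1.6}$$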

Conversely, if we can prove the analytic relation (1.6), we have an easy road to Theorem 3.1 of [4] and the derivation of the density $f_Z$. We call the function

$$z\mapsto\frac1\pi\int_{t=s}^{\infty}e^{tz-\frac13t^3}\,dt$$

an incomplete Scorer function, corresponding to the (complete) Scorer function

$$z\mapsto\mathrm{Hi}(z)=\frac1\pi\int_{t=0}^{\infty}e^{tz-\frac13t^3}\,dt.$$
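The two definitions above, the asymptotic expansion of Hi, and the change of variables that turns the inner integral of (1.4) into an incomplete Scorer function can all be checked numerically. The following is a minimal sketch using only the Python standard library; the function names, the Simpson rule, and the truncation point `cutoff` are choices made here for illustration and do not come from the paper. It assumes that the real part of `z` is moderate, so that the factor $e^{-t^3/3}$ makes the tail beyond the cutoff negligible.

```python
import cmath
import math


def simpson(func, a, b, n=4000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * func(a + j * h)
    return total * h / 3


def incomplete_scorer(z, s=0.0, cutoff=8.0, n=4000):
    """(1/pi) * int_s^oo exp(t*z - t^3/3) dt, truncated at max(s, 0) + cutoff.

    Assumption: Re(z) is moderate, so the integrand is negligible at the cutoff.
    With s = 0 this is the (complete) Scorer function Hi(z).
    """
    upper = max(s, 0.0) + cutoff
    return simpson(lambda t: cmath.exp(t * z - t ** 3 / 3), s, upper, n) / math.pi


def hi_asymptotic(z, terms=4):
    """Partial sum of Hi(z) ~ -(1/(pi z)) * sum_k (3k)! / (k! (3 z^3)^k)."""
    series = sum(
        math.factorial(3 * k) / (math.factorial(k) * (3 * z ** 3) ** k)
        for k in range(terms)
    )
    return -series / (math.pi * z)


def inner_integral(s, v, cutoff=8.0, n=4000):
    """int_0^oo exp(i*t*v - (2/3)*(s+t)^3) dt, the inner integral in (1.4)."""
    upper = max(-s, 0.0) + cutoff
    return simpson(
        lambda t: cmath.exp(1j * t * v - 2 * (s + t) ** 3 / 3), 0.0, upper, n
    )


def inner_via_scorer(s, v):
    """Same integral after the substitution t = 2^(-1/3) * r - s."""
    xi = 2 ** (-1 / 3) * v
    return (
        cmath.exp(-1j * s * v)
        * 2 ** (-1 / 3)
        * math.pi
        * incomplete_scorer(1j * xi, 2 ** (1 / 3) * s)
    )


if __name__ == "__main__":
    print("Hi(0) by quadrature:", incomplete_scorer(0.0).real)
    print("Hi(-10): quadrature", incomplete_scorer(-10.0).real,
          "asymptotic", hi_asymptotic(-10.0))
    print("inner integral:", inner_integral(0.5, 1.3), inner_via_scorer(0.5, 1.3))
```

The substitution used in `inner_via_scorer` is the one mentioned after (1.4): writing $t+s=2^{-1/3}r$ turns the coefficient $\tfrac23$ of the cube into $\tfrac13$ and moves the lower limit to $2^{1/3}s$, which is why the incomplete rather than the complete Scorer function appears for $s\ne0$.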

In the present paper we first prove relation (1.6) in Section 2 by showing that both integrals, as a function of the parameters $s$ and $x$, satisfy the same extended heat equation. Section 3 discusses the derivation of the distribution of the maximum and location of maximum of one-sided or two-sided Brownian motion with a negative parabolic drift from these results. The appendices contain further details on the results.

## 2 A parabolic partial differential equation and the analytic relation (1.6)

###### Lemma 2.1.

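Let the function $f$ be defined by

$$f(s,x)=\frac{1}{2\pi}\int_{v=-\infty}^{\infty}\frac{\mathrm{Ai}(i\xi-4^{1/3}x)}{\mathrm{Ai}(i\xi)}\int_{t=0}^{\infty}e^{itv-\frac23(s+t)^3}\,dt\,dv,\qquad\xi=2^{-1/3}v. \tag{2.1}$$

Then $f$ satisfies the partial differential equation

$$\frac{\partial}{\partial s}f(s,x)=-\frac12\frac{\partial^2}{\partial x^2}f(s,x)-2xf(s,x). \tag{2.2}$$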

Moreover and

$$\lim_{x\uparrow0}f(s,x)=e^{-\frac23s^3},\qquad\lim_{s\to\infty}f(s,x)=0,\quad x<0,\qquad\lim_{x\to-\infty}e^{2sx}f(s,x)=0,\quad s\in\mathbb{R}. \tag{2.3}$$

###### Proof.

The proof follows from the following observations.

First Observation: Let $x<0$, and define $u(s,x)=Q^{(s,x)}\{\tau_0<\infty\}$. The process $X$ under $Q^{(s,x)}$ is a diffusion process with a time-dependent generator, obtained by subtracting $2s\,\partial/\partial x$ from the generator of standard Brownian motion. Consequently, standard arguments from Markov process theory yield:

$$\frac{\partial u}{\partial s}=-\frac12\frac{\partial^2u}{\partial x^2}+2s\frac{\partial u}{\partial x} \tag{2.4}$$

in the region $x<0$.

Second Observation: Let $u$ and $f$ be functions satisfying the relation

$$f(s,x)=e^{-2sx-\frac23s^3}u(s,x). \tag{2.5}$$

Then $u$ satisfies the PDE (2.4) if and only if $f$ satisfies the PDE (2.2), as can be seen by routine calculus.

Third Observation: Relation (1.4) (which follows from Theorem 1.1 by the Laplace inversion (1.3)) shows that the function $u$ is related to the function $f$ of the lemma by the transformation (2.5) above. Since $u$ satisfies (2.4), it now follows immediately that $f$ satisfies (2.2).

The boundary conditions follow immediately from the probabilistic interpretation of the function $u$. ∎
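The routine calculus behind the Second Observation can be written out as follows. This is a sketch that uses only (2.4), (2.5) and the product rule; the shorthand $E$ for the exponential factor is introduced here for convenience and is not notation from the paper.

```latex
% Shorthand (not in the paper): E = e^{2sx + (2/3)s^3}, so that (2.5) reads u = E f.
\begin{align*}
\partial_s u   &= E\,\bigl\{\partial_s f + (2x + 2s^2)\,f\bigr\},\\
\partial_x u   &= E\,\bigl\{\partial_x f + 2s\,f\bigr\},\\
\partial_x^2 u &= E\,\bigl\{\partial_x^2 f + 4s\,\partial_x f + 4s^2 f\bigr\}.
\end{align*}
% Substitute into (2.4), \partial_s u = -\tfrac12\partial_x^2 u + 2s\,\partial_x u,
% and divide by E > 0:
\begin{align*}
\partial_s f + (2x + 2s^2)\,f
  &= -\tfrac12\,\partial_x^2 f - 2s\,\partial_x f - 2s^2 f
     + 2s\,\partial_x f + 4s^2 f\\
  &= -\tfrac12\,\partial_x^2 f + 2s^2 f,
\end{align*}
% so that \partial_s f = -\tfrac12\,\partial_x^2 f - 2x f, which is (2.2).
% Every step is reversible, which gives the "if and only if".
```

The first-order term $2s\,\partial_x u$ in (2.4) is thus absorbed by the exponential factor and reappears as the potential term $-2xf$ in (2.2).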

It turns out that the right-hand side of (1.6) has a more convenient representation, which generalizes relation (2.3) of Lemma 2.2 in [7] (see also Remark 2.1 in [7] on the equivalent relation (5.10) in [9]).

###### Lemma 2.2.

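Let the function $g$ be defined by

$$g(s,x)=\frac{1}{4^{1/3}}\int_{v=-\infty}^{\infty}e^{-isv}\,\frac{\mathrm{Ai}(i\xi)\,\mathrm{Bi}(i\xi-4^{1/3}x)-\mathrm{Ai}(i\xi-4^{1/3}x)\,\mathrm{Bi}(i\xi)}{\mathrm{Ai}(i\xi)}\,dv,\qquad\xi=2^{-1/3}v. \tag{2.6}$$

Then $g$ has the alternative representation

$$g(s,x)=\frac{e^{-2sx}}{2\pi}\int_{u=-\infty}^{\infty}\frac{\int_{y=0}^{-4^{1/3}x}e^{-2^{1/3}s(iu+y)}\,\mathrm{Ai}(iu+y)\,dy}{\mathrm{Ai}(iu)^2}\,du. \tag{2.7}$$

###### Proof.

By the definition of the function $g$, we have:

$$g(s,x)=2^{-1/3}\int_{-\infty}^{\infty}e^{-2^{1/3}isu}\,\frac{\mathrm{Ai}(iu)\,\mathrm{Bi}(iu-4^{1/3}x)-\mathrm{Bi}(iu)\,\mathrm{Ai}(iu-4^{1/3}x)}{\mathrm{Ai}(iu)}\,du.$$

For simplicity of notation, we consider instead:

$$\tilde g(s,x)=\frac12\int_{-\infty}^{\infty}e^{-isu}\,\frac{\mathrm{Ai}(iu)\,\mathrm{Bi}(iu+x)-\mathrm{Bi}(iu)\,\mathrm{Ai}(iu+x)}{\mathrm{Ai}(iu)}\,du,\qquad s\in\mathbb{R},\ x>0.$$

It is shown in Section 6 that the function $\tilde g$ satisfies the first order differential equation:

$$\frac{\partial}{\partial x}\tilde g(s,x)=s\,\tilde g(s,x)+\frac{1}{2\pi}\int_{u=-\infty}^{\infty}e^{-isu}\,\frac{\mathrm{Ai}(iu+x)}{\mathrm{Ai}(iu)^2}\,du. \tag{2.8}$$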

So if , the solution is given by:

$$\tilde g(s,x)=\frac{e^{sx}}{2\pi}\int_{u=-\infty}^{\infty}\frac{\int_{y=0}^{x}e^{-s(iu+y)}\,\mathrm{Ai}(iu+y)\,dy}{\mathrm{Ai}(iu)^2}\,du.$$

Transferring this result to the function and using , we get that the corresponding linear differential equation for has the solution given by (2.7). ∎

###### Lemma 2.3.

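Let the function $g$ be defined as in Lemma 2.2. Then $g$ satisfies the partial differential equation

$$\frac{\partial}{\partial s}g(s,x)=-\frac12\frac{\partial^2}{\partial x^2}g(s,x)-2xg(s,x). \tag{2.9}$$

Moreover:

$$\lim_{x\uparrow0}g(s,x)=0,\qquad\lim_{x\to-\infty}e^{2sx}g(s,x)=e^{-\frac23s^3},\qquad s>0. \tag{2.10}$$

###### Proof.

We have, since Ai and Bi both solve the Airy equation $w''(z)=zw(z)$:

$$\begin{aligned}
&\frac{\partial^2}{\partial x^2}\,\frac{\mathrm{Ai}(i\xi)\,\mathrm{Bi}(i\xi-4^{1/3}x)-\mathrm{Ai}(i\xi-4^{1/3}x)\,\mathrm{Bi}(i\xi)}{\mathrm{Ai}(i\xi)}\\
&\qquad=2(iv-2x)\,\frac{\mathrm{Ai}(i\xi)\,\mathrm{Bi}(i\xi-4^{1/3}x)-\mathrm{Ai}(i\xi-4^{1/3}x)\,\mathrm{Bi}(i\xi)}{\mathrm{Ai}(i\xi)},\qquad\xi=2^{-1/3}v.
\end{aligned}$$

We also have:

$$\frac{\partial}{\partial s}g(s,x)=-\frac{1}{4^{1/3}}\int_{v=-\infty}^{\infty}iv\,e^{-isv}\,\frac{\mathrm{Ai}(i\xi)\,\mathrm{Bi}(i\xi-4^{1/3}x)-\mathrm{Ai}(i\xi-4^{1/3}x)\,\mathrm{Bi}(i\xi)}{\mathrm{Ai}(i\xi)}\,dv.$$

This yields (2.9). It is clear from the definition (2.6) that $g(s,0)=0$ for all $s\in\mathbb{R}$. A stronger version of the second part of (2.10) is proved in Section 7. ∎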

The preceding lemmas give the desired result (1.6).

###### Theorem 2.1.
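
(i) Let the functions $f$ and $g$ be defined as in Lemmas 2.1 to 2.3. Then we have:

$$f(s,x)=e^{-2sx-\frac23s^3}-g(s,x),\qquad s\in\mathbb{R},\ x\le0,$$

where $f(s,0)$ is defined by taking the limit of $f(s,x)$, as $x\uparrow0$.

(ii) Let, for $s\in\mathbb{R}$ and $x<0$, $\{X(t): t\ge s\}$ be Brownian motion with a negative parabolic drift, starting at $x$ at time $s$, with corresponding probability measure $Q^{(s,x)}$. Then

$$Q^{(s,x)}\{\tau_0<\infty\}=e^{2sx+\frac23s^3}f(s,x), \tag{2.11}$$

and

$$Q^{(s,x)}\{\tau_0=\infty\}=e^{2sx+\frac23s^3}g(s,x)=\frac{e^{\frac23s^3}}{2\pi}\int_{u=-\infty}^{\infty}\frac{\int_{y=0}^{-4^{1/3}x}e^{-2^{1/3}s(iu+y)}\,\mathrm{Ai}(iu+y)\,dy}{\mathrm{Ai}(iu)^2}\,du. \tag{2.12}$$
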
###### Proof.

(i) The function

$$(s,x)\mapsto e^{-2sx-\frac23s^3},\qquad(s,x)\in\mathbb{R}^2,$$

satisfies the same partial differential equation as the functions $f$ and $g$ of Lemmas 2.1 and 2.3. We have to show:

$$h(s,x)\stackrel{\text{def}}{=}f(s,x)+g(s,x)-e^{-2sx-\frac23s^3}=0,\qquad s\in\mathbb{R},\ x\le0, \tag{2.13}$$

defining $f(s,0)$ and $g(s,0)$ by the limits of $f(s,x)$ and $g(s,x)$ as $x\uparrow0$, respectively. To show that (2.13) holds, we use the maximum principle.

First of all, (2.13) holds for all $s\in\mathbb{R}$ if $x=0$, by Lemmas 2.1 and 2.3. It is shown in Section 7 that also

$$\lim_{x\to-\infty}h(s,x)=0,\qquad\forall s\in\mathbb{R}, \tag{2.14}$$

and

$$\lim_{s\to\infty}h(s,x)=0,\qquad\forall x<0. \tag{2.15}$$

We now consider an infinite rectangle $R_c=[c,\infty)\times(-\infty,0]$, for some $c\in\mathbb{R}$. Suppose that $h$ attains a strictly positive maximum over $R_c$ at an interior point $(s_0,x_0)$. Then $\partial_1h(s_0,x_0)=0$, denoting the derivative w.r.t. the $i$th argument by $\partial_i$. Hence, since $h$ satisfies the same partial differential equation as $f$ and $g$, we get:

$$0=\partial_1h(s_0,x_0)=-\tfrac12\,\partial_2^2h(s_0,x_0)-2x_0h(s_0,x_0),$$

implying

$$\partial_2^2h(s_0,x_0)=-4x_0h(s_0,x_0)>0,$$

since $x_0<0$. But this contradicts the assumption that $h$ attains its maximum at $(s_0,x_0)$. Similarly, if $h$ attains a strictly negative minimum at an interior point $(s_0,x_0)$, we would get $\partial_2^2h(s_0,x_0)<0$, again giving a contradiction. So a strictly positive maximum or strictly negative minimum over $R_c$ can only be attained on the line $s=c$. Suppose that a strictly positive maximum is attained at the point $(c,x_0)$, where $x_0<0$. Then we must have $\partial_1h(c,x_0)\le0$, implying by the partial differential equation for $h$:

$$\partial_2^2h(c,x_0)\ge-4x_0h(c,x_0)>0,$$

contradicting the assumption that $h$ attains its maximum on the line $s=c$ at the point $(c,x_0)$.

In a similar way we get a contradiction if we assume that $h$ attains a strictly negative minimum on the line $s=c$. So the conclusion is that $h$ is identically zero on $R_c$. Since the argument holds for all $c\in\mathbb{R}$, we get that the function $h$ is identically zero on $\mathbb{R}\times(-\infty,0]$.

(ii) This follows from (1.4), Lemmas 2.1 to 2.3, and (i). ∎
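The first step of the proof of (i), that $e^{-2sx-\frac23s^3}$ solves the same equation (2.2) as $f$ and $g$, is easy to confirm numerically. The following is a minimal sketch with central finite differences; the function names, the step size `h` and the sample points are arbitrary choices made here, not part of the paper.

```python
import math


def particular_solution(s, x):
    """The function (s, x) -> exp(-2*s*x - (2/3)*s^3) from the proof of (i)."""
    return math.exp(-2.0 * s * x - 2.0 * s ** 3 / 3.0)


def pde_residual(func, s, x, h=1e-3):
    """Central-difference estimate of f_s + (1/2) f_xx + 2 x f at (s, x).

    By (2.2) this residual vanishes for an exact solution, up to
    discretisation error of order h^2.
    """
    f_s = (func(s + h, x) - func(s - h, x)) / (2.0 * h)
    f_xx = (func(s, x + h) - 2.0 * func(s, x) + func(s, x - h)) / h ** 2
    return f_s + 0.5 * f_xx + 2.0 * x * func(s, x)


if __name__ == "__main__":
    for s, x in [(-1.0, -2.0), (0.0, -0.5), (0.5, -1.0), (1.5, -0.3)]:
        print(s, x, pde_residual(particular_solution, s, x))
```

Since (2.2) is linear, the difference $h=f+g-e^{-2sx-\frac23s^3}$ then satisfies the same equation, which is what the maximum principle argument uses.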

## 3 The distribution of the maximum and location of maximum of one-sided and two-sided Brownian motion with parabolic drift

Let $M$ denote the maximum of the process $\{X(t): t\ge s\}$, starting at $x$ at time $s$, with corresponding probability measure $Q^{(s,x)}$. Moreover, let, with a slight abuse of notation, $\tau_M$ denote the location of the maximum of this process. The following theorem gives the joint distribution of $\tau_M$ and $M$ under $Q^{(s,x)}$.

###### Theorem 3.1.

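Let the function $k$ be defined by

$$k(s,x)=\frac{\partial}{\partial x}Q^{(s,x)}\{\tau_0<\infty\}, \tag{3.1}$$

where $Q^{(s,x)}$ is the probability measure corresponding to the process $\{X(t): t\ge s\}$, starting at $x$ at time $s$. Moreover, let for all . Then

(i)

$$Q^{(s,x)}\{\tau_0<\infty\}=e^{\frac23s^3+2sx}f(s,x)=1-e^{\frac23s^3+2sx}g(s,x), \tag{3.2}$$

where the functions $f$ and $g$ are defined as in Lemma 2.1 and Lemma 2.3, respectively, and

$$k(s,0)=\lim_{x\uparrow0}\frac{\partial}{\partial x}Q^{(s,x)}\{\tau_0<\infty\}=\frac{e^{\frac23s^3}}{4^{1/3}\pi}\int_{v=-\infty}^{\infty}\frac{e^{-isv}}{\mathrm{Ai}(i2^{-1/3}v)}\,dv. \tag{3.3}$$

(ii) The function $k(s,x-a)$ is the density of the maximum at $a$ under the probability measure $Q^{(s,x)}$.

(iii) The joint density of $\tau_M$ and $M$ is given by:

$$\begin{aligned}
f_{(\tau_M,M)}(t,a)&=e^{-\frac23(t^3-s^3)+2s(x-a)}\,h_{x-a}(t-s)\,k(t,0)\\
&=\frac{e^{\frac23s^3+2s(x-a)}\,h_{x-a}(t-s)}{4^{1/3}\pi}\int_{v=-\infty}^{\infty}\frac{e^{-itv}}{\mathrm{Ai}(i\xi)}\,dv,\qquad a>x,\ t>s,\quad\xi=2^{-1/3}v,
\end{aligned}\tag{3.4}$$

where $h_{x-a}$ is defined as in part (ii) of Theorem 1.1, that is, by the representation (1.3) with $x$ replaced by $x-a$.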

###### Proof.

(i) (3.2) is relation (1.6), which follows from Theorem 2.1 in Section 2, and (3.3) follows from the representation on the right-hand side of (1.6) by taking the derivative w.r.t. $x$, letting $x\uparrow0$ and using that the Wronskian of the two solutions Ai and Bi of the Airy differential equation equals $1/\pi$.

(ii) By a space homogeneity argument, the density of the maximum under $Q^{(s,x)}$ is given by

$$\begin{aligned}
-\frac{\partial}{\partial a}Q^{(s,x)}\{M>a\}&=-\frac{\partial}{\partial a}Q^{(s,x)}\{\tau_a<\infty\}=-\frac{\partial}{\partial a}Q^{(s,x-a)}\{\tau_0<\infty\}\\
&=\frac{\partial}{\partial x}Q^{(s,x-a)}\{\tau_0<\infty\}=k(s,x-a).
\end{aligned}$$

(iii) Since, if the process starts at , with and , we only can have and if and , we have:

 Q(s,x){τM

where corresponds to the event that the maximum of the path, started at at time , stays below . Hence differentiation gives:

$$\begin{aligned}
f^{(s,x)}_{\tau_M,M}(t,a)\,dt&=Q^{(s,x)}\{\tau_a\in dt\}\,k(t,0)=Q^{(s,x-a)}\{\tau_0\in dt\}\,k(t,0)\\
&=e^{-\frac23(t^3-s^3)+2s(x-a)}\,h_{x-a}(t-s)\,k(t,0)\,dt,
\end{aligned}$$

where we use part (ii) of Theorem 1.1 in the last equality. ∎

As a corollary we get the corresponding result for two-sided Brownian motion.

###### Corollary 3.1.

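Let $X(t)=W(t)-t^2$, $t\in\mathbb{R}$, where $W$ is two-sided Brownian motion, originating from zero. Furthermore, let $M$ and $\tau_M$ be the maximum and the location of the maximum of the process $X$, respectively. Then the joint density of $(\tau_M,M)$ is given by

$$f_{(\tau_M,M)}(t,a)=h_{-a}(|t|)\,g(0,-a)\,\phi(|t|), \tag{3.5}$$

where $h_{-a}$ is defined as in part (ii) of Theorem 1.1, $g$ by (2.6) of Lemma 2.2, and $\phi$ by:

$$\phi(t)=\frac{1}{4^{1/3}\pi}\int_{v=-\infty}^{\infty}\frac{e^{-itv}}{\mathrm{Ai}(i2^{-1/3}v)}\,dv,\qquad t\in\mathbb{R}. \tag{3.6}$$
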
###### Proof.

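Let $t>0$ and let $M$ and $\tau_M$ be the maximum and the location of the maximum for the one-sided process to the right of zero. By part (iii) of Theorem 3.1, the density of $(\tau_M,M)$ is given by (3.4), which, since $s=x=0$, boils down to

$$\frac{h_{-a}(t)}{4^{1/3}\pi}\int_{v=-\infty}^{\infty}\frac{e^{-itv}}{\mathrm{Ai}(i\xi)}\,dv.$$

If we want to turn this into the density of the global maximum and location of maximum on $\mathbb{R}$, we have to multiply this density with the probability that the maximum left of zero is less than $a$, which means, using a symmetry argument, that the density becomes:

$$f_{(\tau_M,M)}(t,a)=\frac{h_{-a}(t)\,Q^{(0,0)}\{\tau_a=\infty\}}{4^{1/3}\pi}\int_{v=-\infty}^{\infty}\frac{e^{-itv}}{\mathrm{Ai}(i\xi)}\,dv=\frac{h_{-a}(t)\,Q^{(0,-a)}\{\tau_0=\infty\}}{4^{1/3}\pi}\int_{v=-\infty}^{\infty}\frac{e^{-itv}}{\mathrm{Ai}(i\xi)}\,dv,$$

where $\xi=2^{-1/3}v$. By (3.2) of Theorem 3.1 we now get:

$$f_{(\tau_M,M)}(t,a)=\frac{h_{-a}(t)\,g(0,-a)}{4^{1/3}\pi}\int_{v=-\infty}^{\infty}\frac{e^{-itv}}{\mathrm{Ai}(i\xi)}\,dv=h_{-a}(t)\,g(0,-a)\,\phi(t).$$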

The case where the maximum is reached to the left of zero is treated in a similar way. ∎

###### Remark 3.1.

Note that the function $\phi$, defined by (3.6), has the following probabilistic interpretation:

$$\phi(t)=-e^{-\frac23t^3}\,\frac{\partial}{\partial x}Q^{(t,x)}\{\tau_0=\infty\}\Big|_{x=0}. \tag{3.7}$$

This interpretation is perhaps most easily seen from the representation (2.12) in Theorem 2.1. The function $\phi$ defines the density of the location of the maximum, as is seen in the following Corollary 3.2.

###### Corollary 3.2.

Let $X(t)=W(t)-t^2$, $t\in\mathbb{R}$, where $W$ is two-sided Brownian motion, originating from zero. Then the density of the location of the maximum $\tau_M$ is given by:

$$f_{\tau_M}(t)=\tfrac12\,\phi(t)\,\phi(-t), \tag{3.8}$$

where $\phi$ is defined by (3.6).

###### Proof.

We have by Corollary 3.1 and Theorem 3.1 for $t>0$:

$$f_{\tau_M}(t)=\phi(t)\int_{x=0}^{\infty}h_{-x}(t)\,g(0,-x)\,dx.$$

Now let $\psi$ be defined by

$$\psi(t)=\int_{x=0}^{\infty}h_{-x}(t)\,g(0,-x)\,dx. \tag{3.9}$$

Then we have to show:

$$\psi(t)=\tfrac12\,\phi(-t). \tag{3.10}$$

Since, using a time reversal argument, the density obviously has to be symmetric, we only have to prove (3.10) for all $t>0$. The equality is derived in the proof of Lemma 9.1 in Section 8 by an (asymptotic) analytic argument. ∎
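Formulas (3.6) and (3.8) are directly computable. The following is a minimal sketch in pure Python, under several assumptions that are made here and are not part of the paper: Ai is evaluated from its Maclaurin series (adequate only for moderate arguments), the integral in (3.6) is rewritten in the variable $\xi=2^{-1/3}v$ and truncated at $|\xi|=12$, where $1/|\mathrm{Ai}(i\xi)|$ is of order $10^{-8}$, and the trapezoidal rule is used together with the symmetry $\mathrm{Ai}(\bar z)=\overline{\mathrm{Ai}(z)}$. All function and constant names are hypothetical.

```python
import cmath
import math

AI0 = 3 ** (-2 / 3) / math.gamma(2 / 3)       # Ai(0)
AIP0 = -(3 ** (-1 / 3)) / math.gamma(1 / 3)   # Ai'(0)


def airy_ai(z, terms=100):
    """Ai(z) from its Maclaurin series; used here only for |z| <= 12."""
    z3 = z ** 3
    f_term, g_term = 1.0, z
    f_sum, g_sum = f_term, g_term
    for k in range(1, terms):
        f_term *= z3 / ((3 * k - 1) * (3 * k))
        g_term *= z3 / ((3 * k) * (3 * k + 1))
        f_sum += f_term
        g_sum += g_term
    return AI0 * f_sum + AIP0 * g_sum


XI_MAX, N_XI = 12.0, 600          # truncation point and number of grid steps
H_XI = XI_MAX / N_XI
# Tabulate 1 / Ai(i * xi) once on a uniform grid over [0, XI_MAX].
INV_AI = [1.0 / airy_ai(1j * j * H_XI) for j in range(N_XI + 1)]


def phi(t):
    """phi(t) of (3.6), written as (2^(2/3)/pi) * int_0^oo Re(...) d(xi)."""
    c = 2 ** (1 / 3) * t
    total = 0.0
    for j, inv in enumerate(INV_AI):
        weight = 0.5 if j in (0, N_XI) else 1.0   # trapezoidal end weights
        total += weight * (cmath.exp(-1j * c * j * H_XI) * inv).real
    return 2 * 2 ** (-1 / 3) / math.pi * total * H_XI


def chernoff_density(t):
    """Density (3.8) of the location of the maximum of W(t) - t^2."""
    return 0.5 * phi(t) * phi(-t)


if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0, 1.5):
        print(t, chernoff_density(t))
```

A simple sanity check on such a computation is that the density integrates to one over a range such as $[-3,3]$, outside of which it is negligible.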

###### Remark 3.2.

Note that the integrand on the right-hand side of (3.9) is the product of the density of the first hitting time $\tau_0$ under $Q^{(0,-x)}$ and the probability that the drifting process stays below $0$ under $Q^{(0,-x)}$. The latter probability can also be interpreted as the probability that the process $W(t)-t^2$, starting at zero and running to the left, stays below $x$. So, intuitively, the product corresponds to paths of two-sided Brownian motion minus a parabola, having their first hitting time of $x$ at time $t$ (the factor $h_{-x}(t)$), and staying below $x$ on the interval $(-\infty,0]$ (the factor $g(0,-x)$). The factor $e^{-\frac23t^3}$ in (3.7) disappears, since by part (ii) of Theorem 1.1,

$$h_{-x}(t)\,dt=e^{\frac23t^3}\,Q^{(0,-x)}\{\tau_0\in dt\}.$$

The factor $\tfrac12$ in front of the product is introduced by going from a derivative in the space variable $x$ in (3.7) to a derivative in the time variable $t$. This seems a bit different from the way the factor $\tfrac12$ entered in Chernoff’s argument as the expectation of the squared maximum of the Brownian bridge.

## 4 Concluding remarks

We gave a direct approach to Chernoff’s theorem and other results of this type, using the Feynman-Kac formula with a stopping time and the analytic relation (1.6). Relation (1.6) is proved by showing that the integrals in this relation satisfy the parabolic partial differential equation (2.2) as a function of the parameters $s$ (time) and $x$ (space) and by an application of the maximum principle. As shown in [5] and [9], these results also give the distribution of the maximum of Brownian motion minus a parabola itself, both for the one-sided and two-sided case. An asymptotic development of the tail of the distribution of this maximum is given in [7]. We hope that the direct approach of the present paper will make these results more accessible. The original proofs in [4] were rather long and technical, and lacked this property.

It is proved in [6] and again in [8] that the maximum and the location of the maximum of two-sided Brownian motion minus the parabola satisfy the relation

$$E\,\tau_M^2=\tfrac13\,E\,M.$$

This result is generalized in [10], where the relation is proved not using Airy functions, and where a completely general result of this type is given for drifting Brownian motion. More results for moments and combinatorics for Airy integrals are given in [8]. The latter paper ends with a series of conjectures and open problems in this area.

For computational purposes, the representation (2.7) in Lemma 2.2 seems the best choice, since in this way we lose the inconvenient difference of products of the Airy functions Ai and Bi on the right side of (1.6), which are not integrable along the imaginary axis by themselves, and we also avoid the trouble near zero exhibited by the function on the left of relation (1.6), further analyzed in Lemma 2.1.

## 5 Appendix A

In this section we prove Theorem 1.1. We start with the Feynman-Kac part.

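Let, for $\lambda>0$, $u_\lambda$ be the unique non-negative solution of the boundary problem

$$\tfrac12\,u''(x)-(\lambda-2x)\,u(x)=0,\quad x<0,\qquad\lim_{x\uparrow0}u(x)=1,\qquad u(x)\le1,\quad x\le0. \tag{5.1}$$

The unique solution of (5.1) is given by

$$u_\lambda(x)=\frac{\mathrm{Ai}(2^{-1/3}\lambda-4^{1/3}x)}{\mathrm{Ai}(2^{-1/3}\lambda)},\qquad x\le0. \tag{5.2}$$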
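That (5.2) solves (5.1) follows from the Airy equation $\mathrm{Ai}''(z)=z\,\mathrm{Ai}(z)$ and $4^{2/3}\cdot2^{-1/3}=2$, $4^{2/3}\cdot4^{1/3}=4$. The following minimal sketch checks this numerically; it assumes a Maclaurin-series evaluation of Ai that is adequate only for moderate real arguments, and the names `airy_ai`, `u_lambda` and `ode_residual` are introduced here for illustration.

```python
import math

AI0 = 3 ** (-2 / 3) / math.gamma(2 / 3)       # Ai(0)
AIP0 = -(3 ** (-1 / 3)) / math.gamma(1 / 3)   # Ai'(0)


def airy_ai(z, terms=100):
    """Ai(z) from its Maclaurin series; adequate for moderate |z| only."""
    z3 = z ** 3
    f_term, g_term = 1.0, z
    f_sum, g_sum = f_term, g_term
    for k in range(1, terms):
        f_term *= z3 / ((3 * k - 1) * (3 * k))
        g_term *= z3 / ((3 * k) * (3 * k + 1))
        f_sum += f_term
        g_sum += g_term
    return AI0 * f_sum + AIP0 * g_sum


def u_lambda(lam, x):
    """The solution (5.2) of the boundary problem (5.1), for x <= 0."""
    a = 2 ** (-1 / 3) * lam
    return airy_ai(a - 4 ** (1 / 3) * x) / airy_ai(a)


def ode_residual(lam, x, h=1e-3):
    """Central-difference estimate of (1/2) u'' - (lam - 2 x) u at x."""
    second = (u_lambda(lam, x + h) - 2 * u_lambda(lam, x)
              + u_lambda(lam, x - h)) / h ** 2
    return 0.5 * second - (lam - 2 * x) * u_lambda(lam, x)


if __name__ == "__main__":
    for x in (-0.2, -0.7, -1.5):
        print(x, u_lambda(1.0, x), ode_residual(1.0, x))
```

The bound $u_\lambda\le1$ in (5.1) reflects that Ai is decreasing on the positive half-line, so the numerator in (5.2) never exceeds the denominator for $x\le0$.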

We now consider the process

$$Y_t=e^{-\int_{v=s}^{t}(\lambda-2X_v)\,dv}\,u_\lambda(X_t),$$

where is standard Brownian motion, starting at at time . By Itô’s formula and (5.1) we have:

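$$du_\lambda(X_t)=u_\lambda'(X_t)\,dX_t+\tfrac12\,u_\lambda''(X_t)\,dt=u_\lambda'(X_t)\,dX_t+(\lambda-2X_t)\,u_\lambda(X_t)\,dt.$$

So we get:

$$\begin{aligned}
dY_t&=-(\lambda-2X_t)\,e^{-\int_{v=0}^{t}(\lambda-2X_v)\,dv}\,u_\lambda(X_t)\,dt+e^{-\int_{v=0}^{t}(\lambda-2X_v)\,dv}\bigl\{u_\lambda'(X_t)\,dX_t+(\lambda-2X_t)\,u_\lambda(X_t)\,dt\bigr\}\\
&=e^{-\int_{v=0}^{t}(\lambda-2X_v)\,dv}\,u_\lambda'(X_t)\,dX_t,
\end{aligned}$$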

implying that is a local martingale and that

 Yτ0−x=∫τ0t=0e−∫tv=0