Parameter dependent optimal thresholds, indifference levels and inverse optimal stopping problems

Martin Klimmek
Mathematical Institute, University of Oxford, Oxford OX1 3LB
July 2, 2019

Consider the classic infinite-horizon problem of stopping a one-dimensional diffusion to optimise between running and terminal rewards and suppose we are given a parametrised family of such problems. We provide a general theory of parameter dependence in infinite-horizon stopping problems for which threshold strategies are optimal. The crux of the approach is a supermodularity condition which guarantees that the family of problems is indexable by a set-valued map which we call the indifference map. This map is a natural generalisation of the allocation (Gittins) index, a classical quantity in the theory of dynamic allocation. Importantly, the notion of indexability leads to a framework for inverse optimal stopping problems.

Keywords: Inverse problem; inverse optimal stopping; threshold strategy; parameter dependence; comparative statics; generalised diffusion; Gittins index

AMS MSC 2010: 60G40; 60J60

1 Introduction

Consider the following classical optimal stopping problem. Given a discount parameter and a time-homogeneous diffusion started at a fixed point, we are asked to maximise an expected payoff which is the sum of a discounted running reward up until the stopping time and a terminal reward depending on the state of the diffusion at the stopping time. We call this problem the forward optimal stopping problem and the expected payoff under the optimal stopping rule the (forward) problem value.

The problem can be generalised to a parametrised family of reward functions to give a parametrised family of forward problems. This generalisation is often natural. For instance, in economics we may be interested in the effect of changes in a dividend or a tax rate on the value of an investment and the optimal investment decision. In dynamic resource allocation problems, a parameter may act as an index for different projects. In this context, the decision of which project to engage requires an analysis of the parameter dependence of optimal stopping rules and problem values.

The approach to solving forward problems in this article is motivated by previous work for the case when there is no running reward. In the case of perpetual American puts, Ekström and Hobson [HobsonEkstroem] establish convex duality relations between value functions and the Laplace transform of first hitting times of the underlying diffusion. In related work by Lu [BingLu], the approach is developed to establish duality when the parameter space is a discrete set of strikes. More generally, Hobson and Klimmek [HobsonKlimmek:10] employ generalised convex analysis to establish duality between u-transformed value functions and u-transformed diffusion eigenfunctions for a general class of reward functions. The common strand in this previous work on inverse stopping problems is the conversion of a stochastic problem into a deterministic duality relation involving monotone optimisers.

This article provides a unifying view of the monotone comparative statics results for optimal stopping developed previously and an extension to non-zero running rewards. We show that a supermodularity condition on the reward functions guarantees monotonicity of optimal thresholds in the parameter value. This monotonicity of the thresholds imposes a useful and natural order on families of parametrised stopping problems through a generalisation of the so-called allocation (or Gittins) index, an important quantity in the theory of dynamic allocation problems (see for instance Whittle [Whittle] and Karatzas [Karatzas]). We utilise the notion of indexability to solve parametrised families of stopping problems.

As well as solving families of forward problems, we consider the problem of recovering diffusion processes consistent with given optimal stopping values. ‘Inverse optimal stopping problems’ find natural motivation in mathematical finance and economics. When there is no running reward, the problem has the interpretation of constructing models for an asset price process consistent with given perpetual American option prices. Now suppose instead that we are given an investor’s valuation for a dividend-bearing stock which may be liquidated for taxed capital gains. Given the valuation, we would like to recover the investor’s model. Similar situations may arise in a real-options setup. A bidder for a resource extraction project may submit a range of bids for a project depending on an economic parameter. In this case, a regulator might naturally be interested in recovering the investor’s model which underlies the bids. This article provides solutions to inverse problems in the presence of a non-zero running reward (or cost). We show that the value function does not contain enough information to recover a diffusion and that solutions to the inverse problem are parametrised by a choice of indifference (allocation) index. The indifference index can be interpreted as representing an investor’s preferences with respect to remaining invested or liquidating. Given consistent preferences and valuations, it is possible to recover a diffusion model.

This article provides a direct approach to forward and inverse problems based on principles from monotone comparative statics and dynamic allocation. In spirit, the direct approach is related to recent seminal work by Dayanik and Karatzas [DayanikKaratzas] and Bank and Baumgarten [Bank]. The direct solution method in [DayanikKaratzas], based on the calculation of concave envelopes, is employed by Bank and Baumgarten [Bank] to solve parameter-dependent forward problems. However, the method used in [Bank] is restricted to problems with linear parameter dependence and requires calculation of an auxiliary function which transforms general two-sided stopping problems to one-sided threshold problems. The approach taken in this article is to focus on optimal stopping problems for which one-sided threshold strategies are optimal. This restriction (which is usual in the setting of dynamic allocation problems) leads to a tractable characterization of parameter-dependence. As an analysis of allocation indices and stopping problems, this article can be seen to extend the work of Karatzas [Karatzas]. However, the aim here is not to prove the optimality of the ‘play-the-leader’ policy for multi-armed bandits, but to generalise the approach to inverse optimal stopping problems introduced in [HobsonKlimmek:10]. The fundamental aim is to establish qualitative principles that govern the relationship between data (e.g. prices), economic behaviour (e.g. investment indifference levels) and models (e.g. generalised diffusions).

2 Forward and inverse problems

Let be a diffusion process on an interval , let be a discount parameter. Let be a family of terminal reward functions and a family of running reward functions, both parametrised by a real parameter lying in an interval with end-points and . The classical approach in optimal stopping problems is to fix the parameter, i.e. , and calculate

for using variational techniques, see for instance Bensoussan and Lions [Lions].

In contrast, we are interested in the case when the starting value is fixed and the parameter varies. Then the forward problem is to calculate where


We will assume that the process underlying the stopping problem is a regular one-dimensional diffusion process characterised by a speed measure and a strictly increasing and continuous scale function. Such diffusions are ‘generalised’ because the speed measure need not have a density.

Let be a finite or infinite interval with a left endpoint and right endpoint . Let be a non-negative, non-zero Borel measure on with . Let be a strictly increasing and continuous function. Let and let be a Brownian motion started at supported on a filtration with local time process . Define to be the continuous, increasing, additive functional

and define its right-continuous inverse by

If then is a one-dimensional regular diffusion started at with speed measure and scale function . Moreover, almost surely for all .
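In generic notation (our symbols, not necessarily those of the original formulas): with speed measure $m$ on the state space $I$, scale function $s$, a Brownian motion $B$ with local times $L^y$, and $\bar m = m \circ s^{-1}$, the standard time-change construction can be sketched as:

```latex
\Gamma_t \;=\; \int_{s(I)} L^y_t \, \bar m(\mathrm{d}y),
\qquad
\Gamma^{-1}_t \;=\; \inf\{\, u > 0 : \Gamma_u > t \,\},
\qquad
X_t \;=\; s^{-1}\!\big( B_{\Gamma^{-1}_t} \big).
```

The additive functional $\Gamma$ grows quickly exactly where the speed measure puts mass, so its inverse grows slowly there and the time-changed process lingers; this is how atoms of $m$ (allowed since $m$ need not have a density) produce ‘sticky’ points of the generalised diffusion.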

Let . Then for a fixed (see e.g. [Salminen]),


where and are respectively a strictly increasing and a strictly decreasing solution to the differential equation


In the smooth case, when has a density so that and is continuous, (2.3) is equivalent to



We will call the solutions to (2.3) the -eigenfunctions of the diffusion. For a fixed diffusion with a fixed starting point we will scale and so that . The boundary conditions of the differential equation (2.3) depend on whether the end-points of are inaccessible, absorbing or reflecting, see Borodin and Salminen [borodin] for details. We will denote by the interior of and its accessible boundary points and we will make the following assumption about the boundary behaviour of .
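For a concrete instance (parameters and notation are ours, for illustration): for a geometric Brownian motion with drift mu and volatility sigma, and discount rate beta, the eigenfunctions are power functions, and the eigenfunction ODE can be verified numerically.

```python
import math

# Hypothetical parameters (not from the paper): GBM dX = mu*X dt + sigma*X dB,
# discount rate beta.  The eigenfunctions solve
#   (sigma^2 x^2 / 2) psi'' + mu x psi' = beta psi,
# with power solutions psi(x) = x**theta where theta solves
#   (sigma^2/2) theta (theta - 1) + mu theta - beta = 0.
mu, sigma, beta = 0.05, 0.3, 0.10

half_var = sigma**2 / 2
lin = mu - half_var
disc = math.sqrt(lin**2 + 4 * half_var * beta)
theta_plus = (-lin + disc) / (2 * half_var)   # > 0: increasing eigenfunction
theta_minus = (-lin - disc) / (2 * half_var)  # < 0: decreasing eigenfunction

def ode_residual(theta, x):
    """Residual of the eigenfunction ODE at x for psi(x) = x**theta."""
    psi = x**theta
    dpsi = theta * x**(theta - 1)
    d2psi = theta * (theta - 1) * x**(theta - 2)
    return half_var * x**2 * d2psi + mu * x * dpsi - beta * psi

for x in (0.5, 1.0, 2.0):
    assert abs(ode_residual(theta_plus, x)) < 1e-10
    assert abs(ode_residual(theta_minus, x)) < 1e-10
```

The two roots of the quadratic give exactly one strictly increasing and one strictly decreasing solution, matching the dichotomy stated above.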

Assumption 2.1.

Either the boundary of is non-reflecting (absorbing or killing) or is started at a reflecting end-point and the other end-point is non-reflecting.

Now, for , let


Define by and for all and let and .

Assumption 2.2.

for all and .

Under our assumptions it is well-known (see for instance Alvarez [Alvarez]) that solves the differential equation

Example 2.3.

In some cases can be calculated directly. Let and let and . Then .

Example 2.4.

Suppose and . Then is known as the three-dimensional Bessel process and solves the SDE . Let be defined and . Then solves with . The solution is .

In order to rule out the case of negative value functions we also make the following assumption.

Assumption 2.5.

For all , there exists such that .

2.1 Summary of the main results

Our main result for the forward problem can be summarised as follows.

Solution to the forward problem: Given a generalised diffusion , if is -supermodular then a threshold strategy is optimal on an interval and an optimal finite stopping rule does not exist for . Furthermore, if is sufficiently regular and is differentiable at then

where is a monotone increasing function such that is the optimal stopping rule.

Now suppose that we are given and , and . Then the inverse problem is to construct a diffusion such that is the value function corresponding to an optimal threshold strategy. (To keep the inverse problem tractable we focus on the case when the running cost is not parameter dependent.) Our analysis hinges on specifying the parameters for which it is optimal to stop immediately (i.e. ) for a given level of the underlying diffusion. If we consider to be the value of an investment as a function of a parameter (e.g. a level of capital gains tax), then the indifference map specifies the parameters for which an investor would be indifferent whether to invest or not as it would be optimal to sell immediately.

The indifference map is a natural extension of the allocation (Gittins) index which occurs naturally in the theory of multi-armed bandits. We provide a novel application of this classical quantity in the context of inverse investment problems and real option theory. The indifference map can be seen to represent investor preferences with respect to liquidating for capital gains or remaining invested for future returns. Depending on the valuation of an investment as a function of the parameter, we will show how to recover diffusion models for the underlying risky asset consistent with given preferences (indifference maps).

Solution to the inverse problem: Solutions to the inverse problem are parametrised by a choice of allocation index : The functions and defined

determine the speed measure and scale function of the solution through equations (2.3) and (2.6).

3 The forward problem: threshold strategies

Threshold strategies are a natural class of candidates for the optimal stopping time in the forward problem. Our first aim is to establish necessary and sufficient conditions for the optimality of a threshold strategy.

By the strong Markov property of one-dimensional diffusions the value function for the optimal stopping problem can be decomposed into the reward from running the diffusion forever and an early stopping reward.


We will let denote the optimal early stopping reward and let denote the early stopping reward function.
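In generic notation (our symbols: discount rate beta, running reward $h$, terminal reward $g$), the decomposition reads:

```latex
% G(x) = E^x\big[\int_0^\infty e^{-\beta s} h(X_s)\,\mathrm{d}s\big] is the
% reward from never stopping; the strong Markov property then gives
V(x) \;=\; \sup_\tau E^x\!\left[\int_0^\tau e^{-\beta s} h(X_s)\,\mathrm{d}s
        + e^{-\beta\tau} g(X_\tau)\right]
     \;=\; G(x) \;+\; \sup_\tau E^x\!\left[e^{-\beta\tau}\,(g - G)(X_\tau)\right].
```

The second term is the optimal early stopping reward: stopping exchanges the remaining running reward $G(X_\tau)$ for the terminal reward $g(X_\tau)$.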

Lemma 3.1.

Stopping at the first hitting time of , is optimal if and only if attains its global maximum on at .


Suppose that the global maximum is achieved at . Let

We will show that . On the one hand, since the supremum over all stopping times is larger than the value of stopping upon hitting a given threshold. Moreover is a non-negative local martingale, hence a supermartingale. We have that for all stopping times ,

and hence for all stopping times . Hence is optimal.
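The inequality chain in this direction of the proof can be written out in generic notation (ours): with $\psi$ the increasing eigenfunction, $W = \sup_{y \ge x} g(y)/\psi(y)$ attained at $b^*$, and $\tau$ any stopping time,

```latex
E^x\!\left[e^{-\beta\tau} g(X_\tau)\right]
  \;\le\; W \, E^x\!\left[e^{-\beta\tau} \psi(X_\tau)\right]
  \;\le\; W \psi(x)
  \;=\; \frac{g(b^*)}{\psi(b^*)}\,\psi(x)
  \;=\; E^x\!\left[e^{-\beta H_{b^*}}\, g(X_{H_{b^*}})\right],
```

using the supermartingale property of $e^{-\beta t}\psi(X_t)$ in the second inequality and $E^x[e^{-\beta H_b}] = \psi(x)/\psi(b)$ for $b \ge x$ in the final equality.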

For the converse, suppose that there exists an , such that . We will show that there exists a stopping time which is better than . First, if then stopping at is a better strategy than stopping at . Now suppose . Then

so stopping at is better than stopping at . ∎

Remark 3.2.

There is a parallel result for stopping at a threshold below . A threshold below is optimal if and only if attains a global maximum below .

Example 3.3.

Recall Example 2.3 and let be a Geometric Brownian Motion started at with volatility parameter and drift parameter . Suppose , and . Then . is decreasing so we look for a stopping threshold below . for , where . Let and . If then is the optimal stopping threshold. If then it is optimal to ‘wait forever’. If then it is optimal to stop immediately.
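A numerical companion to Example 3.3, with illustrative parameters of our own: for a perpetual-put-style reward on a GBM, maximising the early stopping reward over lower thresholds recovers the classical threshold formula.

```python
import math

# Hypothetical perpetual-put illustration in the spirit of Example 3.3
# (parameters are ours, not the paper's): GBM with drift mu, volatility sigma,
# discount rate beta, terminal reward g(y) = (K - y)^+, no running reward,
# diffusion started at x.  The decreasing eigenfunction is phi(y) = y**theta_m
# with theta_m < 0, and for a lower threshold b < x the early stopping reward
# is eta(b) = (K - b) * phi(x) / phi(b).
mu, sigma, beta, K, x = 0.02, 0.25, 0.08, 1.0, 1.2

half_var = sigma**2 / 2
lin = mu - half_var
theta_m = (-lin - math.sqrt(lin**2 + 4 * half_var * beta)) / (2 * half_var)

def eta(b):
    return (K - b) * (x / b) ** theta_m

# First-order condition for max_b (K - b) * b**gamma, gamma = -theta_m > 0:
gamma = -theta_m
b_star = gamma * K / (gamma + 1.0)

# A crude grid search over thresholds agrees with the closed form.
grid = [0.001 + i * (x - 0.002) / 20000 for i in range(20001)]
b_grid = max(grid, key=eta)
assert abs(b_grid - b_star) < 1e-3
assert b_star < x  # the optimal threshold lies below the starting point
```

The grid search is deliberately naive; it simply confirms that the global maximum of the early stopping reward, which Lemma 3.1 (via Remark 3.2) identifies with the optimal lower threshold, sits at the closed-form point.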

The following Lemma shows that if a threshold strategy is optimal then the optimal threshold is either above or below the starting point. This rules out the case that both an upper threshold and a lower threshold are optimal for a fixed parameter.

Lemma 3.4.

For a fixed parameter , let . Let and . If and then .


Suppose that . It follows that

contradicting the fact that is strictly increasing. ∎

Example 3.5.

Let be Brownian Motion on killed at and at . Let and and and suppose . Then and . Now fix and define and as in Lemma 3.4. We calculate and . If lies to the left (right) of an element in then an upper (lower) threshold is optimal. If lies between the largest element in and the smallest element in then a threshold strategy is not optimal.

Figure 1: Picture for . is represented by the dashed line and is a singleton. is represented by the solid line and consists of two points. There is no optimal threshold strategy if lies in the shaded region.

In general, given a family of forward problems over an interval , we may find that threshold stopping is optimal on the whole interval , on a subset of or nowhere on . We will temporarily assume that the forward problem (2.1) is such that a threshold strategy is optimal on the whole parameter space. Later, in Section 3.2 we will see how to relax the assumption.

Assumption 3.6.

For all it is optimal to stop at a threshold above .

There is, as will always be the case, a parallel theory when the optimal thresholds are below , compare Remark 3.2.

3.1 The envelope theorem

We will now derive our main result for the parameter dependence of the value function through an envelope theorem. The aim is to derive an expression for the derivative of .

For a fixed parameter let . Then is the set of possible threshold strategies for a fixed parameter . We will let denote the collection of all threshold strategies for the parameter space. Letting , we have that . Recall the definition of the early stopping reward. We abuse the notation slightly by setting , making the dependence on the starting value implicit. Let us also set . The following Proposition follows from an envelope theorem, see Corollary 4 in Milgrom and Segal [Milgrom].

Proposition 3.7.

If , is upper-semicontinuous in and is continuous on then is Lipschitz continuous on and the one-sided derivatives are given by

is differentiable at if and only if is a singleton. In particular we then have


for where .

Remark 3.8.

Equation (3.2) follows by combining the equations (a consequence of the envelope theorem in Milgrom and Segal [Milgrom]) and (Lemma 3.1).
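The envelope mechanism behind Proposition 3.7 can be sanity-checked numerically on a toy objective (our own choice, not the paper's early stopping reward): differentiating the value of a maximisation problem in the parameter picks up only the explicit parameter dependence, evaluated at the maximiser.

```python
# Finite-difference check of the envelope formula: for V(theta) = max_b f(b, theta),
# V'(theta) = f_theta(b*(theta), theta).  The objective f is a hypothetical
# stand-in with maximiser b*(theta) = theta / 2.
def f(b, theta):
    return theta * b - b**2

def V(theta):
    grid = [i / 10000 for i in range(20001)]  # b on [0, 2]
    return max(f(b, theta) for b in grid)

theta0, h = 1.0, 1e-4
numeric = (V(theta0 + h) - V(theta0 - h)) / (2 * h)  # central difference of V
b_star = theta0 / 2
envelope = b_star  # f_theta(b, theta) = b, evaluated at the maximiser
assert abs(numeric - envelope) < 1e-3
```

Note that no derivative of the maximiser itself enters: this is exactly why the one-sided derivatives in Proposition 3.7 involve only the partial derivative of the reward at optimal thresholds.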

Remark 3.9.

The condition is satisfied if the boundary points of are accessible.

Corollary 3.10.

If the conditions in Proposition 3.7 are satisfied then for any ,

where is a selection from .

Corollary 3.11.

Suppose is continuously differentiable and . If is differentiable at then

for .

Example 3.12.

In Example 3.3 if ,

Parameter dependence of stopping problems is a common theme in the literature on multi-armed bandits, in which a special case of the general forward problem, which we will call the standard problem, is studied.

Definition 3.13.

If and then the forward problem (2.1) is called the standard (forward) problem.

The preceding Corollary 3.11 is the analogue in a diffusion setting of Lemma 2 in Whittle [Whittle]. As in [Whittle], our setup allows for points of non-differentiability and for the possibility of multiple optimal thresholds above the starting point. In contrast, existing results in the diffusion setting, see for instance Karatzas [Karatzas] (Lemma 4.1), make strong assumptions on the diffusion and on which ensure that is single-valued and that the value function is differentiable in the parameter.

In general, the optimal stopping thresholds for a parameter are given by a set-valued map . We will now define the inverse map from the domain of the diffusion to the parameter space.

Definition 3.14.

, the indifference map at , is the set of parameters for which it is optimal to stop immediately when .

Remark 3.15.

Under additional assumptions the indifference map can be represented as a monotone function, see Corollary 4.9.

The indifference map is a natural generalisation of the allocation index common in the theory of multi-armed bandits: while we make few assumptions on the reward functions, the multi-armed bandit or dynamic allocation literature is restricted to the standard problem ( and ), see for instance Gittins and Glazebrook [GittinsGlazebrook], Whittle [Whittle] and for a diffusion setting closer to the setting of this article, Karatzas [Karatzas] and Alvarez [Alvarez].

The following example illustrates our approach to parameter dependent stopping problems and the idea of calculating critical parameter values. Although we focus on the case when the forward problem is indexed by a single parameter, the analysis of forward problems parametrised by several parameters is analogous.

Example 3.16.

A toy model for tax effects. Suppose is a model for the profits of a firm: , where is a standard Brownian Motion and . In a tax-free environment a model for the value of the firm is

where is the salvage value of the firm. is decreasing in and we look for an optimal stopping threshold below . We have and . Let be the optimal threshold (investment decision) in the tax-free environment above. We calculate .

Now consider what happens to the value of the firm if profits are taxed. Suppose that profits are taxed at a rate , and that the tax-base at time is , where represents a tax-deductible depreciation expense (or some other adjustment to the tax base). The post-tax profit of the firm is . The decreasing solution to (2.4) is while . The optimal threshold for the after-tax investment problem is

In taxation theory, a tax-rate is neutral if it does not change investment decisions. It is sometimes considered desirable for taxes to be neutral, see for instance Samuelson [SamuelsonTax]. Let denote the neutral tax rate in this problem. To compute we solve for , to find

Finally we check that if and only if . Similarly, given a tax rate we could calculate the depreciation adjustment so that the investment decision is unchanged, which is the idea in Samuelson [SamuelsonTax]. See also Klimmek [RK] for analysis of the relationship between tax levels, risk preferences and decisions under uncertainty.

In Example 3.16, the optimal thresholds are monotone in one or more of the parameters. In the next section we will derive natural conditions for the monotonicity of threshold strategies . We will see that if is monotone then we can relax Assumption 3.6.

3.2 Monotonicity of the optimal stopping threshold in the parameter value

We will say that is increasing (decreasing) if and with implies .

Definition 3.17.

  1. A function is supermodular in if for all , is increasing in and if for all , is increasing in . Equivalently, is supermodular if for all .

  2. If the inequalities in part 1 are strict then is called strictly supermodular.

  3. If is (strictly) supermodular, is called (strictly) submodular.

  4. is (strictly) -supermodular if is (strictly) supermodular.

Remark 3.18.

Note that if is twice differentiable then is supermodular in if and only if for all and .
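Remark 3.18 can be checked directly on a toy reward of our own choosing: for g(x, theta) = x * theta the cross partial is identically 1, so the discrete supermodularity inequality of Definition 3.17 holds at every comparable pair of points.

```python
# Hypothetical supermodular reward g(x, theta) = x * theta: cross partial
# d^2 g / dx dtheta = 1 >= 0, so for all x' >= x and theta' >= theta,
#   g(x', theta') + g(x, theta) >= g(x', theta) + g(x, theta').
def g(x, theta):
    return x * theta

pts = [0.0, 0.5, 1.0, 2.0]
for x in pts:
    for xp in pts:
        if xp < x:
            continue
        for t in pts:
            for tp in pts:
                if tp < t:
                    continue
                assert g(xp, tp) + g(x, t) >= g(xp, t) + g(x, tp) - 1e-12
```

Here the cross difference factors as (x' - x)(theta' - theta), which is non-negative precisely on comparable pairs; this is the discrete analogue of the non-negative cross partial in Remark 3.18.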

The next Lemma follows from a straightforward application of standard techniques in monotone comparative statics to the setting of optimal stopping, see for instance Athey [Athey].

Lemma 3.19.

Suppose that on . If is -supermodular then is increasing in .


Suppose that . and are non-empty by Assumption 3.6. Define a function via , where (recall the definition of , (2.2)). Then is also supermodular. Now for any and we have

The first inequality follows by definition, the second by supermodularity and the last by the definition of . Hence there is equality throughout, and and . It follows that is increasing in . ∎
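The Topkis-style argument in the proof can be illustrated numerically with a hypothetical supermodular objective (ours, not the paper's): the maximiser moves up with the parameter.

```python
# Monotone comparative statics in miniature: f(b, theta) = -(b - theta)^2 is
# supermodular (cross partial = 2 >= 0) with unique maximiser b*(theta) = theta,
# so the argmax is increasing in the parameter, as in Lemma 3.19.
def f(b, theta):
    return -(b - theta) ** 2

grid = [i / 100 for i in range(0, 301)]  # b on [0, 3]

def b_star(theta):
    return max(grid, key=lambda b: f(b, theta))

thetas = [0.0, 0.5, 1.0, 2.0]
stars = [b_star(t) for t in thetas]
assert stars == sorted(stars)  # thresholds increase with the parameter
```
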

Corollary 3.20.

If is -submodular then is decreasing in .

Remark 3.21.

It may be the case that takes both strictly positive and negative values on . In this case it is never optimal to stop at if and so we need only check supermodularity on the set .

Monotonicity of the optimal stopping threshold will play a crucial role in our analysis of inverse optimal stopping problems because it leads to the notion of indexability. We will say that a stopping problem is indexable if is monotone.

The notion of indexability is vital in dynamic allocation theory, leading to the natural heuristic of operating the project with the highest allocation index, i.e. ‘playing-the-leader’. Most recently, Glazebrook et al. [Glazebrook] have generalised the notion of indexability to a large class of resource allocation problems. We note, however, that the definition of indexability presented here does not arise out of a Lagrangian relaxation of an original problem involving only a running reward as is the case in [Glazebrook]. Instead, in the context of optimal stopping problems, indexability is a natural feature in the monotone comparative statics of parametrised families of forward problems.

The following assumption will ensure that is increasing. There will be a parallel set of results when is decreasing.

Assumption 3.22.

and is supermodular.

Example 3.23.

Recall Example 2.4. Let be a three-dimensional Bessel process started at , , and . We have . Note that is both -supermodular and -submodular. Suppose . attains its maximum at where is the smallest solution to the equation . For , the maximum is attained at the second smallest root of the same equation and .

Figure 2: for (solid line) and (dashed line).

Hence we find that is decreasing. This does not contradict Lemma 3.25 because Assumption 3.22 is violated: The set of points where is positive when and stopping is feasible coincides with the set of points where is negative (and stopping is therefore not feasible) when . Compare Remark 3.21.

In general, if we remove Assumption 3.6, a threshold strategy may never be optimal or it may only be optimal on some subset of parameters in . In the following we will show that if is -supermodular then a threshold strategy will be optimal for all parameters in a sub-interval of .

Let be the infimum of those values in for which . If for all then we set .

Lemma 3.24.

The set of where is non-empty (threshold stopping is optimal) forms an interval with end-points and .


Let denote the right end-point of . Suppose and . We claim that .

Fix . Then and


and for if . We write the remainder of the proof as if we are in the case ; the case when involves replacing with .

Fix . We want to show


for then

and since is continuous in the supremum is attained.

Since is supermodular by assumption we have for


Subtracting (3.5) from (3.3) gives (3.4). ∎

In the standard case, determining whether is supermodular is simplified by the following result.

Lemma 3.25.

Suppose that the boundary points of are inaccessible.

  1. If then is increasing in if and only if is -supermodular.

  2. In the standard case is -supermodular if and only if is -supermodular where , .


Athey [Athey] and Jewitt [Jewitt] prove that and are -supermodular if and only if is -supermodular. The first statement now follows from the fact (e.g. Alvarez [Alvarez] or Rogers [rogers], V.50) that , where is a product of two single-variate functions and hence -supermodular.

For the second statement note that . By the result of Athey and Jewitt, is -supermodular if and only if is -supermodular. ∎

4 Inverse optimal stopping problems

In this section our aim is to recover diffusions consistent with a given value function for a stopping problem. We recall that when and , the problem has the interpretation of recovering price-processes consistent with perpetual American put option prices. Consider instead a situation in which an investor is considering whether to invest in a dividend-bearing stock that can be liquidated at any time for capital gains. The capital gains depend on a parameter, e.g. a tax or subsidy rate. In this context, the indifference index has the natural interpretation of the parameter level(s) at which the investor is indifferent about the stock: at the critical level, the optimal policy would be to sell the stock immediately, and hence there is no expected gain from investment. The question that we ask in this section is whether we can recover an investor’s model for the asset price process given his valuation and investment indifference levels for the parameter.

The problem of recovering investor preferences from given information is a natural problem in economics and finance. In economics, the question of recovering information about an agent’s preferences given their behaviour dates back to Samuelson’s work [SamuelsonPref] on revealed preferences. Work on inverse investment problems includes Black [BlackInverse], Cox and Leland [CoxLeland], He and Huang [HeHuang] and most recently Cox et al. [CoxInverse]. Rather than calculating an optimal consumption/portfolio policy for a given agent (with a given utility function), the literature in this area aims at recovering utility functions consistent with given consumption and portfolio choices. These ‘inverse Merton problems’ have three fundamental aspects. The first is the specification of a model for an agent’s wealth process. The dynamics of the model are given by a dynamic budget constraint and are fully determined given the agent’s consumption and investment policy. The second fundamental aspect is an assumption on how the agent values the wealth developing out of his investment and consumption activity: it is assumed that he maximises utility. The third aspect, which is the crux of inverse Merton problems, is to determine the agent’s utility, or gain, function given the wealth dynamics and assuming the agent is utility maximising.

As in the inverse Merton problem, three fundamental quantities emerge in the study of the inverse perpetual optimal stopping problems considered here: 1) a model for the underlying random process, 2) a valuation of the investment and 3) investor preferences (indifference levels). As we will see below, if we are given only an agent’s valuation then the inverse problem is ill-posed: there will in general be infinitely many models solving the inverse problem, each corresponding to a choice of indifference index. Just as the inverse Merton problems take as given both a model for the wealth process and the assumption of valuation via utility maximisation, we require two of the three pieces of information inherent to the perpetual-horizon investment problem to construct a solution: given a value function and (admissible) investor indifference levels, a consistent model is uniquely specified on the domain of the indifference index.

4.1 Setup

As before, let be an interval with end-points and . Let us assume that we are given , , and .

Inverse Problem: Find a generalised diffusion such that is consistent with one-sided stopping above .

Remark 4.1.

In developing the theory we focus on threshold strategies above . There is, as ever, an analogous inverse problem for threshold strategies below . See for instance Example 4.11.

We will make the following regularity assumption.

Assumption 4.2.

is differentiable and is twice continuously differentiable.

As in Hobson and Klimmek [HobsonKlimmek:10], we will use generalised convex analysis of the u-transformed stopping problem to solve the inverse problem.

4.2 u-convex analysis

Definition 4.3.

Let be subsets of and let be a bivariate function. The u-convex dual of a function is denoted and defined to be .

Definition 4.4.

A function is u-convex if .

Definition 4.5.

The u-subdifferential of at is defined by

or equivalently

If is a subset of then we define to be the union of u-subdifferentials of over all points in .

Definition 4.6.

is u-subdifferentiable at if . is u-subdifferentiable on if it is u-subdifferentiable for all , and is u-subdifferentiable if it is u-subdifferentiable on .

The following envelope theorem from u-convex analysis will be fundamental in establishing duality between the value function and the u-transformed eigenfunctions of consistent diffusions. The idea, which goes back to Rüschendorf [ruschfrech] (Equation 73), is to match the gradients of and u-convex functions , whenever . The approach was also developed in Gangbo and McCann [mccann] and for applications in economics by Carlier [carlier]. We refer to [carlier] for a proof of the following result.

Proposition 4.7.

Suppose that is strictly supermodular and twice continuously differentiable.

If is a.e. differentiable and u-subdifferentiable, then there exists a map such that if is differentiable at then and


Moreover, is such that is non-decreasing.

Conversely, suppose that is a.e. differentiable and equal to the integral of its derivative. If (4.1) holds for a non-decreasing function , then is u-convex and u-subdifferentiable with .

Remark 4.8.

If is strictly -submodular the conclusion of Proposition 4.7 remains true, except that and are non-increasing.

The subdifferential may be an interval in which case may be taken to be any element in that interval. By Lemma 3.19, is non-decreasing. We observe that since we have and so that may be defined directly as an element of . If is strictly increasing then is just the inverse of .
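With the choice u(x, y) = x*y (ours; then u-convexity reduces to classical convexity and the dual of Definition 4.3 is the Legendre-Fenchel conjugate), a discrete computation illustrates Definitions 4.3-4.6 and the biduality underlying u-convexity.

```python
# Discrete sketch: for u(x, y) = x*y and V(x) = x^2 on [-1, 1], the u-convex
# dual is V*(y) = max_x (x*y - V(x)) = y^2/4, and V is u-convex: (V*)* = V.
xs = [i / 100 for i in range(-100, 101)]   # grid on [-1, 1]
ys = [i / 100 for i in range(-200, 201)]   # grid on [-2, 2]

def u(x, y):
    return x * y

V = {x: x * x for x in xs}
V_star = {y: max(u(x, y) - V[x] for x in xs) for y in ys}       # u-convex dual
V_bidual = {x: max(u(x, y) - V_star[y] for y in ys) for x in xs}  # dual of dual

assert abs(V_star[1.0] - 0.25) < 1e-3                 # y^2/4 at y = 1
assert all(abs(V_bidual[x] - V[x]) < 1e-2 for x in xs)  # V is u-convex
```

The grids are chosen so that every maximiser lies in the interior of the search range; with a general strictly supermodular u the same computation produces the monotone subdifferential map of Proposition 4.7.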

4.3 Application of u-convex analysis to the inverse problem

We introduce the notation that we will use for our inverse stopping problem framework. The main change over the previous section is that we will highlight dependence on the (unknown) speed measure and scale function .

We wish to recover a speed measure and scale function to construct a diffusion , supported on a domain such that . Our approach to solving this problem is to recover solutions and