Portfolio choice, portfolio liquidation, and portfolio transition under drift uncertainty

This research has been conducted with the support of the Research Initiative “Modélisation des marchés actions et dérivés” financed by HSBC France under the aegis of the Europlace Institute of Finance. The authors would like to thank Rama Cont (Imperial College), Nicolas Grandchamp des Raux (HSBC France), Charles-Albert Lehalle (CFM and Imperial College), Jean-Michel Lasry (Institut Louis Bachelier), and Christopher Ulph (HSBC London) for the conversations they had on the subject.

Olivier Guéant, Jiang Pu

Université Paris 1 Panthéon-Sorbonne. Centre d’Economie de la Sorbonne. 106, Bd de l’Hôpital, 75013 Paris.
Laboratoire de Probabilités et Modèles Aléatoires. CNRS, UMR 7599. Université Paris-Diderot.
Institut Europlace de Finance. 28 Place de la Bourse, 75002 Paris.

This paper presents several models addressing optimal portfolio choice and optimal portfolio transition issues, in which the expected returns of risky assets are unknown. Our approach is based on a coupling between Bayesian learning and dynamic programming techniques. It makes it possible to recover the well-known results of Karatzas and Zhao in the case of conjugate (Gaussian) priors for the drift distribution, but also to go beyond the no-friction case, when martingale methods are no longer available. In particular, we address optimal portfolio choice in a framework à la Almgren-Chriss, and we thereby build a model in which the agent’s allocation decision accounts for both the liquidity of assets and the uncertainty about their expected returns. We also address optimal portfolio liquidation and optimal portfolio transition problems.

Key words: Optimal portfolio choice, Bayesian learning, Stochastic optimal control, Hamilton-Jacobi-Bellman equations, Optimal portfolio transition.

1 Introduction

The theory of portfolio selection started in 1952 with the seminal paper [23] of Markowitz. (Markowitz was awarded the Nobel Prize in 1990 for his work. For a brief history of portfolio theory, see [24].) In that paper, Markowitz considered the problem of an agent who wishes to build a portfolio with the maximum possible level of expected return, given a limit level of variance. He then coined the concept of efficient portfolio and described how to compute such portfolios. Markowitz paved the way for the theoretical study of the optimal portfolio choice of risk-averse agents. A few years after Markowitz’s paper, Tobin published his famous research work on agents’ liquidity preferences and the separation theorem (see [33]), which is based on the ideas developed by Markowitz. A few years later, in the sixties, Treynor, Sharpe, Lintner, and Mossin independently introduced the Capital Asset Pricing Model (CAPM), which is also built on top of Markowitz’s ideas. The ubiquitous notions of $\alpha$ and $\beta$ therefore owe a lot to Markowitz’s modern portfolio theory.

Although initially written within a mean-variance optimization framework, the so-called Markowitz’s problem can also be written within the Von Neumann-Morgenstern expected utility framework. This was for instance done by Samuelson and Merton (see [25, 26, 31]), who, in addition, generalized Markowitz’s problem by extending the initial one-period approach to a multi-period one. Samuelson did it in discrete time, whereas Merton did it in continuous time. It is noteworthy that they both embedded the intertemporal portfolio choice problem into a more general optimal investment/consumption problem. (This problem in continuous time is now referred to as Merton’s problem.)

In [25], Merton used PDE techniques to characterize the optimal consumption process of an agent and the agent’s optimal portfolio choices. In particular, Merton managed to find closed-form solutions in the constant absolute risk aversion case (i.e., for exponential utility functions), and in the constant relative risk aversion case (i.e., for power and log utility functions). Merton’s problem has since been extended to incorporate several features such as transaction costs (proportional and fixed) or bankruptcy considerations. Major advances toward solving Merton’s problem in full generality were made in the eighties by Karatzas et al., who used (dual) martingale methods. In [18], Karatzas, Lehoczky, and Shreve used a martingale method to solve Merton’s problem for almost any smooth utility function (under the no-bankruptcy constraint) and showed how to partially disentangle the consumption maximization problem and the terminal wealth maximization problem. Constrained problems and extensions to incomplete markets were then considered – see for instance the paper [9] by Cvitanić and Karatzas.

In the literature on portfolio selection or in the slightly more general literature on Merton’s problem, input parameters (for instance the expected returns of risky assets) are often considered known constants, or stochastic processes with known initial values and dynamics. In practice, however, one cannot state for sure that price returns will follow a given distribution. Uncertainty on model parameters is the raison d’être of the celebrated Black-Litterman model [6], which is built on top of Markowitz’s model and the CAPM. However, like Markowitz’s model, the Black-Litterman model is a static one. Consequently, the agent of the Black-Litterman model does not use what he/she might learn about the distribution of asset returns from their realizations.

Generalizations of optimal allocation models (or models addressing Merton’s problem) involving filtering and learning techniques in a partial information framework have been proposed. In the optimal portfolio choice literature, the most important paper mixing optimization and learning techniques is certainly the paper of Karatzas and Zhao [20]. In a model where the asset returns are Gaussian with unknown mean, they used martingale methods under the filtration of observables to compute, for almost any utility function, the optimal portfolio allocation (there is no consumption in their model). They also showed that their martingale method could be used for solving a Monge-Ampère-like parabolic PDE which naturally arises in their model. The same martingale (or dual) method has then been used to solve similar optimization problems with partial information – see for instance [10, 21, 22, 30]. (A similar change-of-measure type of argument is also used in [5].)

More general models have also been proposed where the dynamics of the drift is related to a hidden Markov chain / regime-switching model. Rieder and Bäuerle proposed for instance in [28] a model with one risky asset where the drift is modeled by a hidden Markov chain (see [17] and [32] for other papers on a similar topic). An important point related to [28] is that the authors used HJB equations and not the martingale approach. By solving the PDEs, they obtained closed-form solutions in the case of power and log utility functions, and recovered the results of [20] in the pure Bayesian case.

Only a few models in the partial-information literature are actually solved by using PDE techniques. An important instance is Brendle [7]. He considered the optimal portfolio choice of an agent who does not know the drift of risky assets but knows that these drifts follow Ornstein-Uhlenbeck processes with known parameters. The Hamilton-Jacobi-Bellman (HJB) equation associated with the control problem is reduced to a set of nonlinear ODEs that are solved in closed form for CRRA and CARA utility functions, but only in the case of one risky asset – see also Rishel [29]. (While publishing this paper, we noticed another very recent paper, by Casgrain and Jaimungal, dealing with optimal execution in a partial-information framework and using PDEs to solve the optimization/learning problem; see [8].)

In this paper, we consider several problems of portfolio choice in continuous time in which the (constant) expected returns of the risky assets are unknown. We first consider a problem similar to the one tackled by Karatzas and Zhao in the specific case where the Bayesian prior for the expected returns is a conjugate prior (Gaussian in our case). Our approach is based on the fact that conjugate priors are associated with simple Markovian updates of the distribution parameters. By adding into the state space the parameter(s) of the prior distribution, we show that classical ideas from dynamic programming / stochastic optimal control can be used and that the HJB equation associated with the problem can be solved in closed form in the case of CARA (i.e., exponential) and CRRA (i.e., power and log) utility functions, in the general case of several risky assets (unlike [7], which only provides closed-form solutions in the case of one risky asset). In particular, solving the HJB equation boils down to solving simple linear ordinary differential equations in the CARA case, and a Riccati equation (for which we have solutions in closed form) in the CRRA case. Furthermore, unlike most of the papers in the literature on portfolio choice with partial information, we provide verification theorems. This is particularly important as uncertainty sometimes leads to explosion in the value function when the utility function is not concave enough.

The PDE approach makes it possible to avoid the tedious computations needed to simplify the general expressions of Karatzas and Zhao in the specific case of conjugate priors, but our message is, of course, not limited to that.

The PDE approach can indeed be used in situations where the (dual) martingale approach cannot be used. For instance, we use our approach to solve the optimal allocation problem in a trading framework à la Almgren-Chriss with quadratic execution costs. The Almgren-Chriss framework was initially built for solving optimal execution problems [1, 2], but it is also very useful outside of the cash equity world. For instance, Almgren and Li [3], and Guéant and Pu [14] used it for the pricing and hedging of vanilla options when liquidity matters. (Guéant et al. also used the Almgren-Chriss framework to tackle the pricing, hedging and execution issues related to Accelerated Share Repurchase contracts – see [12, 15].) The model we propose is one of the first models using the Almgren-Chriss framework to address an asset management problem, and definitely the first paper in this area in which the Almgren-Chriss framework is used in combination with Bayesian learning techniques. (Almgren and Lorenz used Bayesian techniques in optimal execution, see [4], but they considered myopic agents with respect to learning.) We also show how our framework can be slightly modified in order to address optimal portfolio transition issues.

Conjugate priors lead to very powerful Bayesian learning techniques, and this paper aims at showing that Bayesian ideas combined with stochastic optimal control can be very efficient for addressing many financial problems. It is important to understand that Bayesian learning is a forward process whereas dynamic programming classically relies on backward induction. By using these two classical tools at the same time, we not only benefit from the power of Bayesian techniques to learn continuously the value of unknown parameters, but also develop a framework in which agents learn and make decisions knowing that they will go on learning in the future in the same manner as they have learnt in the past. The same ideas are for instance at play in the case of Bayesian (Bernoulli) multi-armed bandits where the unknown parameters are the parameters of Bernoulli distributions (with Beta prior distributions), but the dimensionality of the problem often makes computations based on PDEs too computationally intensive. (Upper confidence bound methods or Thompson sampling are often preferred to dynamic programming for computing, in fact approximating, the optimal strategies in this area.) Another example is in the domain of media buying: in the paper by Fernandez-Tapia et al. [11] the unknown parameters of exponential and Bernoulli distributions have respectively Gamma and Beta prior distributions.

In Section 2, we consider the allocation problem of an agent in a context with only one risky asset – in addition to the risk-free asset – and we solve it in the case of a CARA utility function and in the case of a CRRA utility function. We analyze in particular the role of learning (and the influence of the knowledge that one will go on learning in the future) on the allocation process of the agent. In Section 3, we generalize the model to the case of several risky assets. In Section 4, we introduce liquidity costs through a modelling framework à la Almgren-Chriss and we use Bayesian learning for portfolio choice, optimal portfolio liquidation, and optimal portfolio transition problems.

2 Optimal portfolio choice with one risky asset

2.1 Introduction: price dynamics and Bayesian learning

2.1.1 Price dynamics

In this section, we consider an agent facing a portfolio allocation problem in a simplified financial context with one risk-free asset and one risky asset.

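Let $(\Omega, \mathcal{F}, \mathbb{F}^W = (\mathcal{F}^W_t)_{t \in \mathbb{R}_+}, \mathbb{P})$ be a filtered probability space, with $\mathbb{F}^W$ satisfying the usual conditions. Let $(W_t)_{t \in \mathbb{R}_+}$ be a Wiener process adapted to $\mathbb{F}^W$.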

The risk-free interest rate is denoted by $r$. The risky asset has the following classical log-normal dynamics

$$dS_t = \mu S_t\,dt + \sigma S_t\,dW_t, \tag{1}$$

but we assume that the drift $\mu$ is unknown.

Remark 1.

Both $\mu$ and $(W_t)_{t \in \mathbb{R}_+}$ are unobserved by the agent, but $\mu t + \sigma W_t = \log(S_t/S_0) + \frac{1}{2}\sigma^2 t$ is observed at time $t$.

2.1.2 Bayesian updates

At time $t = 0$, the agent’s belief about the value of $\mu$ is modeled by a Gaussian prior distribution $\mathcal{N}(\beta_0, \nu_0^2)$. (We assume that $\mu$ is independent of $W$.)

The evolution of the risky asset price reveals information to the agent about the true value of the drift $\mu$. In what follows we denote by $\mathbb{F}^S = (\mathcal{F}^S_t)_{t \in \mathbb{R}_+}$ the filtration generated by $(S_t)_{t \in \mathbb{R}_+}$.

Remark 2.

$(W_t)_{t \in \mathbb{R}_+}$ is not an $\mathbb{F}^S$-Brownian motion, because it is not $\mathbb{F}^S$-adapted.

A classical result of the literature on filtering methods states that the conditional distribution of $\mu$ given $\mathcal{F}^S_t$ (for any $t \in \mathbb{R}_+$) is Gaussian. More precisely, we have:

Proposition 1.

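Let $t \in \mathbb{R}_+$. Given $\mathcal{F}^S_t$, $\mu$ is conditionally normally distributed, with mean $\beta_t$ and variance $\nu_t^2$, where

$$\beta_t = \frac{\sigma^2}{\sigma^2 + \nu_0^2 t}\,\beta_0 + \frac{\nu_0^2}{\sigma^2 + \nu_0^2 t}\left(\log\frac{S_t}{S_0} + \frac{1}{2}\sigma^2 t\right), \qquad \nu_t^2 = \frac{\sigma^2 \nu_0^2}{\sigma^2 + \nu_0^2 t}.$$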



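For $0 < t_1 < \dots < t_n = t$, writing $Y_s = \mu s + \sigma W_s = \log(S_s/S_0) + \frac{1}{2}\sigma^2 s$, $(\mu, Y_{t_1}, \dots, Y_{t_n})$ is a Gaussian vector with variance matrix

$$\begin{pmatrix} \nu_0^2 & \left(\nu_0^2 t_j\right)_{1 \le j \le n} \\ \left(\nu_0^2 t_i\right)_{1 \le i \le n} & \left(\nu_0^2 t_i t_j + \sigma^2 \min(t_i, t_j)\right)_{1 \le i, j \le n} \end{pmatrix}.$$

The distribution of $\mu$ given $(Y_{t_1}, \dots, Y_{t_n})$ is the distribution of $\mu$ given $Y_t$. It is Gaussian with

$$\mathbb{E}[\mu \mid Y_t] = \beta_0 + \frac{\nu_0^2 t}{\nu_0^2 t^2 + \sigma^2 t}\left(Y_t - \beta_0 t\right) = \frac{\sigma^2 \beta_0 + \nu_0^2 Y_t}{\sigma^2 + \nu_0^2 t},$$

$$\mathbb{V}[\mu \mid Y_t] = \nu_0^2 - \frac{\nu_0^4 t^2}{\nu_0^2 t^2 + \sigma^2 t} = \frac{\sigma^2 \nu_0^2}{\sigma^2 + \nu_0^2 t}.$$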





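By a monotone class argument, we have therefore that, for $t \in \mathbb{R}_+$, the distribution of $\mu$ given $\mathcal{F}^S_t$ is Gaussian with mean

$$\beta_t = \frac{\sigma^2 \beta_0 + \nu_0^2 \left(\log\frac{S_t}{S_0} + \frac{1}{2}\sigma^2 t\right)}{\sigma^2 + \nu_0^2 t}$$

and variance

$$\nu_t^2 = \frac{\sigma^2 \nu_0^2}{\sigma^2 + \nu_0^2 t}. \qquad ∎$$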
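The Gaussian update of Proposition 1 can be illustrated numerically. The following is a minimal sketch, not the authors' code: it assumes the standard conjugate update for a constant drift observed through $\log(S_t/S_0) + \frac{1}{2}\sigma^2 t = \mu t + \sigma W_t$, with prior $\mathcal{N}(\beta_0, \nu_0^2)$. The function names `posterior` and `posterior_precision_form` are hypothetical, and the second function recomputes the same quantities in precision form as a cross-check.

```python
def posterior(beta0, nu0, sigma, t, log_return):
    """Posterior mean and variance of the drift given log(S_t / S_0).

    Assumed model (illustrative): dS = mu*S*dt + sigma*S*dW, mu ~ N(beta0, nu0^2).
    """
    # Sufficient statistic: y = mu * t + sigma * W_t
    y = log_return + 0.5 * sigma ** 2 * t
    h = sigma ** 2 + nu0 ** 2 * t
    mean = (sigma ** 2 * beta0 + nu0 ** 2 * y) / h
    var = sigma ** 2 * nu0 ** 2 / h
    return mean, var


def posterior_precision_form(beta0, nu0, sigma, t, log_return):
    """Same update written with precisions (inverse variances); needs nu0 > 0."""
    y = log_return + 0.5 * sigma ** 2 * t
    precision = 1.0 / nu0 ** 2 + t / sigma ** 2
    mean = (beta0 / nu0 ** 2 + y / sigma ** 2) / precision
    return mean, 1.0 / precision


if __name__ == "__main__":
    # Hypothetical inputs: prior N(5%, (10%)^2), volatility 20%, two years of data
    print(posterior(0.05, 0.10, 0.20, 2.0, 0.12))
    print(posterior_precision_form(0.05, 0.10, 0.20, 2.0, 0.12))
```

The precision form makes the learning mechanism explicit: each unit of observation time adds $1/\sigma^2$ to the precision of the belief, which is why the posterior variance is deterministic and decreasing in $t$.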

2.1.3 Introduction of a new Brownian motion

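Proposition 1 defines two processes $(\beta_t)_{t \in \mathbb{R}_+}$ and $(\nu_t^2)_{t \in \mathbb{R}_+}$. The latter is a deterministic process (it is decreasing because the longer we observe, the less uncertainty remains on the value of $\mu$), whereas the former is a stochastic process with the following dynamics:

$$d\beta_t = \sigma g(t)\,d\widehat{W}_t,$$

where the function $g$ is defined by

$$g(t) = \frac{\nu_0^2}{\sigma^2 + \nu_0^2 t} = \frac{\nu_t^2}{\sigma^2},$$

and the process $(\widehat{W}_t)_{t \in \mathbb{R}_+}$ is defined by

$$\widehat{W}_t = W_t + \int_0^t \frac{\mu - \beta_s}{\sigma}\,ds.$$

The following proposition states that $\widehat{W}$ is a Brownian motion adapted to the filtration associated with the price process.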

Proposition 2.

$(\widehat{W}_t)_{t \in \mathbb{R}_+}$ is a Wiener process adapted to $\mathbb{F}^S$.


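To prove this result, we use Lévy’s characterization of Brownian motion.

Let $t \in \mathbb{R}_+$. By definition, we have

$$\widehat{W}_t = \frac{1}{\sigma}\left(\log\frac{S_t}{S_0} + \frac{1}{2}\sigma^2 t - \int_0^t \beta_s\,ds\right),$$

hence the $\mathcal{F}^S_t$-measurability of $\widehat{W}_t$.

Let $s, t \in \mathbb{R}_+$, with $s < t$. We have

$$\mathbb{E}\left[\widehat{W}_t - \widehat{W}_s \,\middle|\, \mathcal{F}^S_s\right] = \mathbb{E}\left[W_t - W_s \,\middle|\, \mathcal{F}^S_s\right] + \mathbb{E}\left[\int_s^t \frac{\mu - \beta_\tau}{\sigma}\,d\tau \,\middle|\, \mathcal{F}^S_s\right].$$

For the first term, the increment $W_t - W_s$ is independent of $\mathcal{F}^W_s$ and independent of $\mu$. Therefore, it is independent of $\mathcal{F}^S_s$ and we have

$$\mathbb{E}\left[W_t - W_s \,\middle|\, \mathcal{F}^S_s\right] = 0.$$

Regarding the second term, we have

$$\mathbb{E}\left[\int_s^t \frac{\mu - \beta_\tau}{\sigma}\,d\tau \,\middle|\, \mathcal{F}^S_s\right] = \int_s^t \frac{1}{\sigma}\,\mathbb{E}\left[\mathbb{E}\left[\mu - \beta_\tau \,\middle|\, \mathcal{F}^S_\tau\right] \,\middle|\, \mathcal{F}^S_s\right] d\tau = 0$$

by definition of $\beta_\tau$.

We obtain that $\widehat{W}$ is an $\mathbb{F}^S$-martingale.

Since $\widehat{W}$ has continuous paths and $\langle \widehat{W} \rangle_t = t$, we conclude that $\widehat{W}$ is an $\mathbb{F}^S$-Brownian motion. ∎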
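The role of the new Brownian motion can be seen on a discretized path. The sketch below is illustrative only and rests on assumptions not taken from the paper: it simulates $\mu t + \sigma W_t$ on a regular grid for a drift value hidden from the learner, builds the innovation increments $\Delta \widehat{W} = (\Delta Y - \beta\,\Delta t)/\sigma$, and updates the belief through $\beta \leftarrow \beta + \sigma g\,\Delta \widehat{W}$ with $g(t) = \nu_0^2/(\sigma^2 + \nu_0^2 t)$. The names `simulate` and `g` and all parameter values are hypothetical.

```python
import math
import random


def g(t, nu0, sigma):
    """Learning gain g(t) = nu0^2 / (sigma^2 + nu0^2 * t)."""
    return nu0 ** 2 / (sigma ** 2 + nu0 ** 2 * t)


def simulate(mu, beta0, nu0, sigma, T, n, seed=0):
    """Return (beta from the innovation recursion, closed-form beta, W_hat at T)."""
    rng = random.Random(seed)
    dt = T / n
    y = 0.0          # y_t = mu * t + sigma * W_t, the observable
    beta = beta0     # current posterior mean
    w_hat = 0.0      # innovation process
    for k in range(n):
        dw = rng.gauss(0.0, math.sqrt(dt))
        dy = mu * dt + sigma * dw              # uses the hidden drift mu
        d_w_hat = (dy - beta * dt) / sigma     # uses observables and beta only
        # Evaluating g at the end of the step makes this discrete recursion
        # coincide with the closed-form posterior mean on the grid.
        beta += sigma * g((k + 1) * dt, nu0, sigma) * d_w_hat
        w_hat += d_w_hat
        y += dy
    closed_form = (sigma ** 2 * beta0 + nu0 ** 2 * y) / (sigma ** 2 + nu0 ** 2 * T)
    return beta, closed_form, w_hat


if __name__ == "__main__":
    print(simulate(mu=0.08, beta0=0.02, nu0=0.15, sigma=0.2, T=1.0, n=1000, seed=42))
```

The increment `d_w_hat` is computed from the observed path and the current belief only, which is the discrete counterpart of the adaptedness of $\widehat{W}$ to the price filtration.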

2.2 Optimal portfolio choice in the CARA case

2.2.1 Portfolio dynamics and HJB equation

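The strategy of the agent is described by a process $(M_t)_{t \in \mathbb{R}_+}$, modelling the amount invested in the risky asset. The resulting value of the agent’s portfolio is modeled by a process $(V_t)_{t \in \mathbb{R}_+}$ with initial value $V_0$ and the following dynamics:

$$dV_t = \left(r V_t + (\mu - r) M_t\right)dt + \sigma M_t\,dW_t = \left(r V_t + (\beta_t - r) M_t\right)dt + \sigma M_t\,d\widehat{W}_t.$$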

To define the set of admissible processes $(M_t)_t$, let us first set a time horizon $T > 0$. Then, let us introduce the notion of “linear growth” for a process in our context.

Definition 1.

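A measurable and $\mathbb{F}^S$-adapted process $(\zeta_t)_{t \in [0,T]}$ is said to satisfy the linear growth condition if, for all $t \in [0,T]$,

$$|\zeta_t| \le C_T\left(1 + \sup_{s \in [0,t]} |\widehat{W}_s|\right),$$

where $C_T$ is deterministic and depends only on $T$.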

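We define the set of admissible strategies as $\mathcal{A} = \mathcal{A}_0$, where

$$\mathcal{A}_t = \left\{ (M_s)_{s \in [t,T]} \;:\; M \text{ is measurable, } \mathbb{F}^S\text{-adapted, and satisfies the linear growth condition} \right\}.$$

For $t \in [0,T]$, given $(V, \beta) \in \mathbb{R}^2$ and $M \in \mathcal{A}_t$, we define, for $s \in [t,T]$:

$$\beta_s^{t,\beta} = \beta + \int_t^s \sigma g(\tau)\,d\widehat{W}_\tau,$$

$$V_s^{t,V,\beta,M} = V + \int_t^s \left(r V_\tau^{t,V,\beta,M} + (\beta_\tau^{t,\beta} - r) M_\tau\right)d\tau + \int_t^s \sigma M_\tau\,d\widehat{W}_\tau.$$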

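Let us now come to the optimization problem. We assume, in the CARA case considered here, that the agent maximizes

$$\mathbb{E}\left[-\exp\left(-\gamma V_T\right)\right],$$

where $\gamma > 0$ is the absolute risk aversion parameter of the agent.

The value function $v$ associated with this problem is then defined by

$$v(t, V, \beta) = \sup_{M \in \mathcal{A}_t} \mathbb{E}\left[-\exp\left(-\gamma V_T^{t,V,\beta,M}\right)\right].$$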

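The HJB equation associated with this problem reads

$$0 = \partial_t u(t,V,\beta) + \frac{1}{2}\sigma^2 g(t)^2\,\partial^2_{\beta\beta} u(t,V,\beta) + \sup_{M \in \mathbb{R}} \left\{ \left(rV + (\beta - r)M\right)\partial_V u(t,V,\beta) + \frac{1}{2}\sigma^2 M^2\,\partial^2_{VV} u(t,V,\beta) + \sigma^2 g(t) M\,\partial^2_{V\beta} u(t,V,\beta) \right\}, \tag{2}$$

with terminal condition

$$u(T, V, \beta) = -\exp(-\gamma V). \tag{3}$$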


2.2.2 Solving the HJB equation

In this paper, we do not use viscosity techniques; we rely instead on verification arguments. In particular, we solve in closed form the HJB equation (2) with terminal condition (3).

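To solve the HJB equation, we use the following ansatz:

$$u(t, V, \beta) = -\exp\left[-\gamma\left(e^{r(T-t)} V + \varphi(t, \beta)\right)\right]. \tag{4}$$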

Proposition 3.

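Suppose there exists $\varphi \in C^{1,2}([0,T] \times \mathbb{R})$ satisfying

$$\partial_t \varphi(t,\beta) + \frac{1}{2}\sigma^2 g(t)^2\,\partial^2_{\beta\beta}\varphi(t,\beta) - g(t)(\beta - r)\,\partial_\beta \varphi(t,\beta) + \frac{(\beta - r)^2}{2\gamma\sigma^2} = 0, \tag{5}$$

with terminal condition

$$\varphi(T, \beta) = 0. \tag{6}$$

Then $u$ defined by (4) is a solution of the HJB equation (2) with terminal condition (3).

Moreover, the supremum in (2) is achieved at:

$$M^*(t, \beta) = e^{-r(T-t)}\left(\frac{\beta - r}{\gamma\sigma^2} - g(t)\,\partial_\beta \varphi(t,\beta)\right). \tag{7}$$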


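Let us consider $\varphi$ a solution of (5) with terminal condition (6). For $u$ defined by (4), we have:

$$\partial_t u + \frac{1}{2}\sigma^2 g^2\,\partial^2_{\beta\beta} u + \sup_{M} \left\{ (rV + (\beta - r)M)\,\partial_V u + \frac{1}{2}\sigma^2 M^2\,\partial^2_{VV} u + \sigma^2 g M\,\partial^2_{V\beta} u \right\}$$

$$= -\gamma u \left[ \partial_t \varphi + \frac{1}{2}\sigma^2 g^2\,\partial^2_{\beta\beta}\varphi - \frac{1}{2}\gamma\sigma^2 g^2 (\partial_\beta \varphi)^2 + \sup_{M} \left\{ (\beta - r)e^{r(T-t)}M - \frac{1}{2}\gamma\sigma^2 e^{2r(T-t)}M^2 - \gamma\sigma^2 g\,e^{r(T-t)}M\,\partial_\beta\varphi \right\} \right].$$

The supremum is reached at

$$M^* = e^{-r(T-t)}\left(\frac{\beta - r}{\gamma\sigma^2} - g(t)\,\partial_\beta\varphi(t,\beta)\right).$$

By using this expression, we obtain:

$$\partial_t u + \frac{1}{2}\sigma^2 g^2\,\partial^2_{\beta\beta} u + \sup_{M}\{\dots\} = -\gamma u \left[ \partial_t \varphi + \frac{1}{2}\sigma^2 g^2\,\partial^2_{\beta\beta}\varphi - g(\beta - r)\,\partial_\beta\varphi + \frac{(\beta - r)^2}{2\gamma\sigma^2} \right] = 0$$

by definition of $\varphi$.

As far as the terminal condition is concerned, it is straightforward to verify that $u$ satisfies the terminal condition (3). ∎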

We have transformed the initial three-variable nonlinear PDE into a two-variable linear one. In the following proposition, we show that solving (5) with terminal condition (6) boils down to solving a triangular system of first order linear ODEs.

Proposition 4.

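Assume that $(a, b)$ satisfies the following system of ODEs:

$$\begin{cases} a'(t) + \frac{1}{2}\sigma^2 g(t)^2\,b(t) = 0, \\ b'(t) - 2 g(t)\,b(t) + \dfrac{1}{\gamma\sigma^2} = 0, \end{cases} \tag{8}$$

with terminal condition

$$a(T) = 0, \qquad b(T) = 0. \tag{9}$$

Then, the function $\varphi$ defined by

$$\varphi(t, \beta) = a(t) + \frac{1}{2} b(t) (\beta - r)^2 \tag{10}$$

satisfies (5) with terminal condition (6).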


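Let us consider a couple $(a, b)$ solution of (8) with terminal condition (9). If $\varphi$ is defined by (10), then

$$\partial_t \varphi + \frac{1}{2}\sigma^2 g^2\,\partial^2_{\beta\beta}\varphi - g(\beta - r)\,\partial_\beta\varphi + \frac{(\beta - r)^2}{2\gamma\sigma^2} = \left(a' + \frac{1}{2}\sigma^2 g^2 b\right) + \frac{1}{2}(\beta - r)^2\left(b' - 2 g b + \frac{1}{\gamma\sigma^2}\right) = 0$$

by definition of $a$ and $b$.

As far as the terminal condition is concerned, it is straightforward to verify that $\varphi$ satisfies the terminal condition (6). ∎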

The triangular system of ODEs (8) with terminal condition (9) can be solved very easily. The next proposition states the expression of the unique solution $(a, b)$.

Proposition 5.

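The couple of functions $(a, b)$ defined by

$$a(t) = \frac{1}{2\gamma}\left[\log\frac{g(t)}{g(T)} - g(T)(T - t)\right], \qquad b(t) = \frac{(T - t)\,g(T)}{\gamma\sigma^2\,g(t)},$$

satisfies the system (8) with terminal condition (9).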
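A closed-form solution of a terminal-value ODE system can be sanity-checked numerically. The sketch below is a hedged illustration rather than part of the paper: it assumes the triangular system $a' + \frac{1}{2}\sigma^2 g^2 b = 0$, $b' - 2gb + 1/(\gamma\sigma^2) = 0$ with $a(T) = b(T) = 0$ and $g(t) = \nu_0^2/(\sigma^2 + \nu_0^2 t)$, together with the candidate solution $a(t) = \frac{1}{2\gamma}[\log(g(t)/g(T)) - g(T)(T-t)]$, $b(t) = (T-t)\,g(T)/(\gamma\sigma^2 g(t))$, and evaluates the ODE residuals by central finite differences. The function names and parameter values are hypothetical.

```python
import math


def g(t, nu0, sigma):
    """Learning gain g(t) = nu0^2 / (sigma^2 + nu0^2 * t); requires nu0 > 0 here."""
    return nu0 ** 2 / (sigma ** 2 + nu0 ** 2 * t)


def b(t, T, gamma, nu0, sigma):
    """Candidate b(t) = (T - t) * g(T) / (gamma * sigma^2 * g(t))."""
    return (T - t) * g(T, nu0, sigma) / (gamma * sigma ** 2 * g(t, nu0, sigma))


def a(t, T, gamma, nu0, sigma):
    """Candidate a(t) = [log(g(t)/g(T)) - g(T) * (T - t)] / (2 * gamma)."""
    gT = g(T, nu0, sigma)
    return (math.log(g(t, nu0, sigma) / gT) - gT * (T - t)) / (2.0 * gamma)


def ode_residuals(t, T, gamma, nu0, sigma, eps=1e-5):
    """Residuals of the two ODEs at time t, derivatives by central differences."""
    da = (a(t + eps, T, gamma, nu0, sigma) - a(t - eps, T, gamma, nu0, sigma)) / (2 * eps)
    db = (b(t + eps, T, gamma, nu0, sigma) - b(t - eps, T, gamma, nu0, sigma)) / (2 * eps)
    gt = g(t, nu0, sigma)
    res_a = da + 0.5 * sigma ** 2 * gt ** 2 * b(t, T, gamma, nu0, sigma)
    res_b = db - 2.0 * gt * b(t, T, gamma, nu0, sigma) + 1.0 / (gamma * sigma ** 2)
    return res_a, res_b


if __name__ == "__main__":
    params = dict(T=1.0, gamma=2.0, nu0=0.1, sigma=0.2)
    for t in (0.1, 0.5, 0.9):
        print(t, ode_residuals(t, **params))
```

Residuals close to zero at several dates, together with $a(T) = b(T) = 0$, give a quick check that the candidate functions solve the assumed system.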

From these propositions, we deduce:

Corollary 1.

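By using the notations introduced in Propositions 3, 4 and 5, the function $u$ defined by

$$u(t, V, \beta) = -\exp\left[-\gamma\left(e^{r(T-t)} V + a(t) + \frac{1}{2} b(t)(\beta - r)^2\right)\right]$$

satisfies the HJB equation (2) with the terminal condition (3).

Moreover, the supremum in (2) is reached at

$$M^*(t, \beta) = e^{-r(T-t)}\,\frac{g(T)}{g(t)}\,\frac{\beta - r}{\gamma\sigma^2}.$$

This corollary is the consequence of the previous propositions. The argument of the maximum in (2) is obtained by plugging the expression of $\varphi$ in (7):

$$M^*(t,\beta) = e^{-r(T-t)}\left(\frac{\beta - r}{\gamma\sigma^2} - g(t)\,b(t)(\beta - r)\right) = e^{-r(T-t)}\,\frac{\beta - r}{\gamma\sigma^2}\left(1 - g(T)(T - t)\right) = e^{-r(T-t)}\,\frac{g(T)}{g(t)}\,\frac{\beta - r}{\gamma\sigma^2}.$$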
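The effect of learning on the CARA allocation can be made concrete with a few lines of code. This is a sketch under stated assumptions, not the authors' implementation: it takes the optimal amount to be $e^{-r(T-t)}\,\frac{g(T)}{g(t)}\,\frac{\beta - r}{\gamma\sigma^2}$ with $g(T)/g(t) = (\sigma^2 + \nu_0^2 t)/(\sigma^2 + \nu_0^2 T)$, and cross-checks it against the form $e^{-r(T-t)}\big(\frac{\beta - r}{\gamma\sigma^2} - g(t)\,b(t)(\beta - r)\big)$. The function names `optimal_amount` and `amount_from_phi` and the parameter values are hypothetical.

```python
import math


def optimal_amount(t, beta, T, r, gamma, nu0, sigma):
    """Amount in the risky asset at time t when the current belief mean is beta."""
    known_drift_amount = (beta - r) / (gamma * sigma ** 2)
    # g(T) / g(t), written without division by nu0 so that nu0 = 0 is allowed
    learning_factor = (sigma ** 2 + nu0 ** 2 * t) / (sigma ** 2 + nu0 ** 2 * T)
    return math.exp(-r * (T - t)) * learning_factor * known_drift_amount


def amount_from_phi(t, beta, T, r, gamma, nu0, sigma):
    """Same amount computed from d(phi)/d(beta) = b(t) * (beta - r)."""
    h_t = sigma ** 2 + nu0 ** 2 * t
    h_T = sigma ** 2 + nu0 ** 2 * T
    g_t = nu0 ** 2 / h_t
    b_t = (T - t) * h_t / (gamma * sigma ** 2 * h_T)
    d_phi = b_t * (beta - r)
    return math.exp(-r * (T - t)) * ((beta - r) / (gamma * sigma ** 2) - g_t * d_phi)


if __name__ == "__main__":
    common = dict(T=1.0, r=0.01, gamma=2.0, sigma=0.2)
    print(optimal_amount(0.0, 0.06, nu0=0.0, **common))   # drift treated as known
    print(optimal_amount(0.0, 0.06, nu0=0.2, **common))   # drift uncertain
```

With $\nu_0 = 0$ the factor equals 1 and the known-drift CARA allocation is recovered. With $\nu_0 > 0$ the factor is below 1 before $T$ and increases to 1 as $t \to T$, so, under these assumptions, the agent scales down the position early in the period.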

2.2.3 Verification theorem

We now need to prove that the function $u$ exhibited in Corollary 1 is indeed the value function $v$ associated with the stochastic optimal control problem of Section 2.2.1. To obtain a verification theorem, we first need a very simple lemma.

Lemma 1.

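The process $(\beta_t)_{t \in [0,T]}$ satisfies the linear growth condition.


We have, for all $t \in [0,T]$, by integration by parts,

$$\beta_t = \beta_0 + \int_0^t \sigma g(s)\,d\widehat{W}_s = \beta_0 + \sigma g(t)\,\widehat{W}_t - \sigma \int_0^t g'(s)\,\widehat{W}_s\,ds.$$

Since $g$ is positive and decreasing, we obtain

$$|\beta_t| \le |\beta_0| + \sigma g(0) \sup_{s \in [0,t]} |\widehat{W}_s|. \qquad ∎$$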

We will also use in what follows the classical result of Beneš (see [19]) that we recall now for the sake of completeness.

Theorem (Beneš’s theorem).

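If $(\zeta_t)_{t \in [0,T]}$ is an $\mathbb{F}^S$-adapted process that satisfies the linear growth condition on $[0,T]$, then the Doléans-Dade exponential $\left(\exp\left(\int_0^t \zeta_s\,d\widehat{W}_s - \frac{1}{2}\int_0^t \zeta_s^2\,ds\right)\right)_{t \in [0,T]}$ is an $\mathbb{F}^S$-martingale on $[0,T]$.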

We can now turn to the verification theorem stating that $u = v$ and giving the optimal investment strategy.

Theorem 1.
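  1. For all $(t, V, \beta) \in [0,T] \times \mathbb{R} \times \mathbb{R}$ and $M = (M_s)_{s \in [t,T]} \in \mathcal{A}_t$, we have:

$$\mathbb{E}\left[-\exp\left(-\gamma V_T^{t,V,\beta,M}\right)\right] \le u(t, V, \beta). \tag{12}$$

  2. Equality in (12) is obtained by taking the optimal control $(M^*_s)_{s \in [t,T]} \in \mathcal{A}_t$ given by:

$$M^*_s = e^{-r(T-s)}\,\frac{g(T)}{g(s)}\,\frac{\beta_s^{t,\beta} - r}{\gamma\sigma^2}.$$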

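  1. Let $M \in \mathcal{A}_t$. We apply Itō’s lemma to the process $(u(s, V_s, \beta_s))_{s \in [t,T]}$, where $V_s = V_s^{t,V,\beta,M}$ and $\beta_s = \beta_s^{t,\beta}$:

$$d\,u(s, V_s, \beta_s) = \mathcal{L}^M u(s, V_s, \beta_s)\,ds + \zeta^M_s\,u(s, V_s, \beta_s)\,d\widehat{W}_s,$$

    where the linear operator $\mathcal{L}^M$ and the function $\zeta^M$ are respectively given by:

$$\mathcal{L}^M u = \partial_t u + \left(rV + (\beta - r)M\right)\partial_V u + \frac{1}{2}\sigma^2 M^2\,\partial^2_{VV} u + \frac{1}{2}\sigma^2 g^2\,\partial^2_{\beta\beta} u + \sigma^2 g M\,\partial^2_{V\beta} u,$$

$$\zeta^M_s = -\gamma\sigma\left(e^{r(T-s)} M_s + g(s)\,b(s)(\beta_s - r)\right).$$

    We define

$$\xi^M_{t,s} = \exp\left(\int_t^s \zeta^M_\tau\,d\widehat{W}_\tau - \frac{1}{2}\int_t^s (\zeta^M_\tau)^2\,d\tau\right).$$

    We compute the increment of $(\xi^M_{t,s})^{-1}$ (for readability’s sake, we shorten the notations, remembering that all the functions are evaluated at $(s, V_s, \beta_s)$):

$$d\left(\xi^M_{t,s}\right)^{-1} = \left(\xi^M_{t,s}\right)^{-1}\left(-\zeta^M_s\,d\widehat{W}_s + (\zeta^M_s)^2\,ds\right).$$

    Then we compute the increment of $u\,(\xi^M_{t,s})^{-1}$:

$$d\left(u\,(\xi^M_{t,s})^{-1}\right) = \left(\xi^M_{t,s}\right)^{-1}\mathcal{L}^M u\,ds.$$

    By definition of $u$, we have $\mathcal{L}^M u \le 0$, and $\left(u(s, V_s, \beta_s)\,(\xi^M_{t,s})^{-1}\right)_{s \in [t,T]}$ is therefore nonincreasing. In particular, we have for $s \in [t,T]$,

$$u(s, V_s, \beta_s) \le u(t, V, \beta)\,\xi^M_{t,s}.$$

    Now, $M$ satisfies the linear growth condition by definition of the set of admissible strategies. Moreover, by Lemma 1, $(\beta_s)_s$ and $(\zeta^M_s)_s$ also satisfy the linear growth condition. Therefore, we can apply Beneš’s theorem and obtain that $(\xi^M_{t,s})_{s \in [t,T]}$ is a true martingale. In particular we have $\mathbb{E}\left[\xi^M_{t,s} \mid \mathcal{F}^S_t\right] = 1$ for all $s \in [t,T]$.

    By taking the conditional expectation given $\mathcal{F}^S_t$ and setting $s = T$, we obtain:

$$\mathbb{E}\left[-\exp\left(-\gamma V_T^{t,V,\beta,M}\right) \,\middle|\, \mathcal{F}^S_t\right] = \mathbb{E}\left[u(T, V_T, \beta_T) \,\middle|\, \mathcal{F}^S_t\right] \le u(t, V, \beta).$$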

  2. By taking , we have, by definition of and ,