On financial applications of the two-parameter Poisson-Dirichlet distribution
Research Note



The capital distribution curve is defined as the log-log plot of normalized stock capitalizations ranked in descending order. The curve displays remarkable stability over time.

The theory of exchangeable distributions on set partitions, developed for the purposes of mathematical genetics and recently applied in non-parametric Bayesian statistics, provides a probabilistic-combinatorial approach to the analysis and modeling of the capital distribution curve. The framework of the two-parameter Poisson-Dirichlet distribution contains a rich set of methods and tools, including an infinite-dimensional diffusion process.

The purpose of this note is to introduce the framework of exchangeable distributions on partitions in the financial context. In particular, it is shown that averaged samples from the Poisson-Dirichlet distribution approximate the capital distribution curves of equity markets. This suggests that the two-parameter model can be employed for modeling the evolution of market weights and prices fluctuating in stochastic equilibrium.



1 Introduction

The capital distribution curve is defined as the log-log plot of stock market weights ranked in descending order. Temporal stability of the shape of this curve is one of the cornerstones of Stochastic Portfolio Theory (SPT), developed by Fernholz, Karatzas et al. ([fernholz2002stochastic], [karatzas2009stochastic] and [fernholz2013second]). In contrast to Modern Portfolio Theory (MPT) and the Capital Asset Pricing Model (CAPM), which are based on normative assumptions, SPT is a descriptive theory: it studies the empirical dynamics and characteristics of equity markets. In particular, SPT captures the tendency of stocks to retain their ranks. The SPT model employs the machinery of rank-interacting Brownian particles and semimartingales.

The framework of partition structures, imported from mathematical genetics and comprising combinatorial and probabilistic methods, provides a complementary approach to the modeling and analysis of the capital distribution curve. It can be summarized as follows.

  • The market is considered as a large combinatorial structure: a partition of the set of invested units of money. Capitalizations of individual stocks correspond to the block (cluster) sizes of the partition, represented by integers, for instance measured in cents.

  • The number of set partitions with given block sizes is the number of ways each market configuration can be realized combinatorially. In other words, the market can be represented as a giant Young diagram, with the vector of capitalizations determining the (potentially very large) number of ways such a configuration can be realized.

Partition structures are important for several reasons.

  • First, partition structures provide a model of random transitions with a changing number of dimensions. In other words, at any time the number of diffusion components may change due to the appearance of a new stock or the bankruptcy of an existing firm.

  • Second, a partition structure with a non-trivial limiting distribution defines the asymptotic shape of the corresponding combinatorial structure. In particular, the mechanism of shape formation provides an explanation of the stability of the capital distribution curve.

The two-parameter Poisson-Dirichlet model is a remarkable and well-studied instance of partition structures. It possesses an analytically tractable limiting distribution defined on the simplex of ranked weights.

Poisson-Dirichlet distribution. The Dirichlet distribution with an $n$-dimensional vector of parameters $(a_1,\dots,a_n)$ defines a probability distribution for non-negative proportions $(w_1,\dots,w_n)$, $\sum_{i=1}^{n} w_i = 1$, in the standard simplex. Kingman [kingman1975random] considered the limiting behavior of this distribution with a symmetric vector of parameters $a_i = a$ such that $na \to \theta$ for $n \to \infty$, and called the distribution of the ranked components the Poisson-Dirichlet ($PD(\theta)$) distribution (with one parameter $\theta$). This distribution is defined on the infinite simplex of ranked weights, known as the Kingman simplex

$$\overline{\nabla}_\infty = \Big\{ (x_1, x_2, \dots) : x_1 \ge x_2 \ge \dots \ge 0,\ \sum_{i=1}^{\infty} x_i \le 1 \Big\}.$$

Size-biased permutation provides an efficient method of sampling from the Dirichlet and the Poisson-Dirichlet distributions. In the framework of population biology, Engen [engen1978] suggested a modification of the size-biased method, which produced another class of Poisson-Dirichlet distributions. It was called the two-parameter Poisson-Dirichlet distribution, $PD(\alpha,\theta)$, by Perman, Pitman and Yor, who rediscovered it while studying ranked jumps of gamma and stable subordinators (see [perman1992size], [PY]). The monograph by Pitman [pitman2002combinatorial] contains a wealth of information on the two-parameter Poisson-Dirichlet model. As shown by Chatterjee and Pal [chatterjee2010phase], the limiting behaviour of a rank-interacting system of Brownian particles is characterized by the Poisson-Dirichlet distribution.
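The size-biased (stick-breaking) construction mentioned above can be sketched in a few lines. This is a minimal illustration rather than the author's implementation: it assumes the standard representation with independent fractions $V_k \sim \mathrm{Beta}(1-\alpha, \theta + k\alpha)$, truncates the infinite sequence at a hypothetical `n_sticks`, and uses arbitrary parameter values ($\alpha = 0.5$, $\theta = 10$) that are not estimates from any market.

```python
import numpy as np

def sample_ranked_pd(alpha, theta, n_sticks=2000, rng=None):
    """One approximate draw of ranked PD(alpha, theta) weights.

    Stick-breaking: V_k ~ Beta(1 - alpha, theta + k * alpha),
    w_k = V_k * prod_{j<k} (1 - V_j). The truncation at n_sticks is an
    assumption of this sketch; the discarded mass is small but not zero.
    """
    rng = np.random.default_rng() if rng is None else rng
    k = np.arange(1, n_sticks + 1)
    v = rng.beta(1.0 - alpha, theta + k * alpha)
    # Stick length remaining before each break: 1, (1-V_1), (1-V_1)(1-V_2), ...
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v)[:-1]))
    weights = v * remaining
    return np.sort(weights)[::-1]  # ranked in descending order

rng = np.random.default_rng(0)
samples = np.array([sample_ranked_pd(0.5, 10.0, rng=rng) for _ in range(200)])
mean_curve = samples.mean(axis=0)  # averaged ranked weights
print(mean_curve[:5])
```

Plotting `np.log(mean_curve)` against the logarithm of the rank gives the kind of averaged curve that is compared with market data later in the note.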

Aoki pioneered applications of exchangeable distributions in economics ([aoki2001modeling], [Aoki228]), in particular using the finitary characterization by Garibaldi, Costantini, et al. ([garibaldi2004finitary]; see also the book [garibaldi2010finitary]). A Markov chain approach with transitions in the space of partitions was independently developed by Garibaldi, Costantini, et al. [garibaldi2004finitary], [garibaldi2007two]. Petrov [petrov2009two], inspired by the works of Kerov, Fulman [fulman2005stein], and Borodin and Olshanski [borodin2009infinite], constructed a diffusion process preserving the two-parameter Poisson-Dirichlet distribution on the infinite-dimensional ranked simplex.

This research note illustrates applications of partition structures and the two-parameter model to modeling the stochastic evolution of the capital distribution curve. In particular, it is shown in the examples section that the two-parameter model provides a reasonable approximation of capital distribution curves in equity markets. Moreover, the model also fits the distribution of relative total capitalizations of stock exchanges.

The main results of this paper were presented at the 8th World Congress of the Bachelier Finance Society, 2014. The author is very grateful to Prof. I. Karatzas for useful advice and suggestions.

1.1 Capital distribution curve

The log-log plot of ranked market weights displays

  • power law behavior,

  • concavity of the curve and

  • stability over periods of time

For example, the figure below shows capital distribution curves of the NASDAQ market on three dates in 2014 (see footnote 1 for the data source). As can be seen from the chart, most market weights had relatively small fluctuations, despite significant fluctuations of the NASDAQ market capitalization during that period. Stability of the capital distribution curve suggests a certain independence of market weights and overall market capitalization.

Figure 1: NASDAQ, capital distribution curves on May 27, Sep 24, Dec 9, 2014

A more detailed chart reveals the behavior of the weights of the top 100 stocks.

Figure 2: weights of top 100 stocks, NASDAQ

Capital distribution curves of the majority of equity markets, as well as the distribution of capitalizations of world stock exchanges, have shapes similar to the one shown in Figure 1. The examples section contains fits of these curves by the $PD(\alpha,\theta)$ model.

1.2 Poisson-Dirichlet distribution and market weights

The log-log plot of ranked samples from the Poisson-Dirichlet law is characterized by

  • power law behavior,

  • concavity of the curve and

  • stability around average shape

The infinite-dimensional Poisson-Dirichlet distribution generalizes the symmetric finite-dimensional Dirichlet distribution. Moreover, as shown in Section 1.4, both distributions can be represented by normalizing a sequence of random variables $Y_1, Y_2, \dots$ by their sum $S = \sum_j Y_j$,

$$w_i = \frac{Y_i}{S},$$

with the property that the weights $w_i$ are independent of the sum $S$.

The figure below illustrates the fit of NASDAQ market weights by averages of samples from the two-parameter distribution. The parameters are estimated by the least squares method.

Figure 3: NASDAQ fit by $PD(\alpha,\theta)$ (data as of Dec 9, 2014)

The next figure displays the typical behaviour of ranked random weights.

Figure 4: 20 sample paths of ranked random weights
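The least-squares estimation behind Figure 3 is not spelled out in the text, so the following is only one plausible sketch of it: a grid search that minimizes the squared distance between log-weights of the data and log-weights of an averaged model curve. The functions `mean_ranked_pd` and `fit_pd_least_squares`, the grids, and the synthetic "market" weights are all hypothetical. In particular, comparing the top-ranked weights directly (without renormalizing) is an assumption of this sketch.

```python
import numpy as np

def mean_ranked_pd(alpha, theta, n_top, n_sticks=1000, n_samples=50, seed=0):
    """Average of n_samples ranked PD(alpha, theta) draws, top n_top ranks.

    Uses truncated stick-breaking with V_k ~ Beta(1 - alpha, theta + k * alpha).
    """
    rng = np.random.default_rng(seed)
    k = np.arange(1, n_sticks + 1)
    v = rng.beta(1.0 - alpha, theta + k * alpha, size=(n_samples, n_sticks))
    ones = np.ones((n_samples, 1))
    remaining = np.hstack([ones, np.cumprod(1.0 - v, axis=1)[:, :-1]])
    ranked = np.sort(v * remaining, axis=1)[:, ::-1]
    return ranked[:, :n_top].mean(axis=0)

def fit_pd_least_squares(market_weights, alphas, thetas):
    """Grid search for (alpha, theta) minimizing squared log-weight error."""
    target = np.log(np.sort(np.asarray(market_weights, dtype=float))[::-1])
    best = None
    for a in alphas:
        for th in thetas:
            model = np.log(mean_ranked_pd(a, th, n_top=target.size))
            sse = float(np.sum((model - target) ** 2))
            if best is None or sse < best[0]:
                best = (sse, a, th)
    return best

# Synthetic stand-in for market data (NOT real NASDAQ weights).
synthetic = mean_ranked_pd(0.4, 20.0, n_top=50, seed=123)
sse, alpha_hat, theta_hat = fit_pd_least_squares(
    synthetic, alphas=[0.2, 0.4, 0.6], thetas=[5.0, 20.0, 80.0]
)
print(alpha_hat, theta_hat, round(sse, 3))
```

A finer grid or a numerical optimizer could replace the coarse search; the grid keeps the sketch dependency-free.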

1.3 Ranked capitalizations and market weights

The capitalization of stock $i$ at time $t$ is calculated as the product of the number of shares outstanding $N_i(t)$ and the stock price $P_i(t)$:

$$C_i(t) = N_i(t)\,P_i(t).$$

For capitalizations ordered as $C_{(1)}(t) \ge C_{(2)}(t) \ge \dots \ge C_{(n)}(t)$, the corresponding ranked market weights are determined by

$$w_{(k)}(t) = \frac{C_{(k)}(t)}{M(t)}, \qquad M(t) = \sum_{i=1}^{n} C_i(t),$$

where $M(t)$ is the total market capitalization at time $t$. Stability of the capital distribution curve means

$$w_{(k)}(t + \Delta t) \approx w_{(k)}(t).$$

In other words, ranked weights remain approximately the same despite changes in capitalizations. This implies that for relatively short periods of time, while stock $i$ retains its rank, the numéraire approach to pricing approximately holds:

$$C_i(t + \Delta t) \approx w_i(t)\,M(t + \Delta t).$$

However, the longer the time period $\Delta t$, the less likely it is that the stock retains its rank. A more advanced approach to modelling market weights and stock capitalizations is based on diffusion theory and on the representation of the $PD(\alpha,\theta)$ distribution in terms of jumps of subordinators. This representation is known as Proposition 21 in the celebrated paper of Pitman and Yor [PY].
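The definitions of this subsection translate directly into code. The sketch below computes ranked market weights and the points of the capital distribution curve from shares outstanding and prices; the function name and the four-stock example are hypothetical illustration data, not market figures.

```python
import numpy as np

def capital_distribution_curve(shares, prices):
    """Return (log rank, log weight, ranked weights) for one date."""
    caps = np.asarray(shares, dtype=float) * np.asarray(prices, dtype=float)
    ranked_caps = np.sort(caps)[::-1]          # descending capitalizations
    weights = ranked_caps / ranked_caps.sum()  # ranked market weights
    ranks = np.arange(1, weights.size + 1)
    return np.log(ranks), np.log(weights), weights

shares = [100, 50, 200, 10]          # hypothetical shares outstanding
prices = [10.0, 40.0, 2.5, 30.0]     # hypothetical prices
_, _, w_before = capital_distribution_curve(shares, prices)

# A uniform 20% rise in all prices changes total capitalization
# but leaves the ranked weights unchanged.
_, _, w_after = capital_distribution_curve(shares, [1.2 * p for p in prices])
print(w_before, np.allclose(w_before, w_after))
```

The second call illustrates the sense in which weights can stay stable while the overall market capitalization moves.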

1.4 Gamma-Dirichlet algebra

There is a close relationship between the gamma and Dirichlet distributions, characterized by a number of important properties, which in the symmetric case can be summarized as follows. Let us consider $n$ independent and identically distributed gamma variables $Y_1,\dots,Y_n$ with shape $a$ and scale $b$. The first, convolution property states that the sum $S = \sum_{i=1}^{n} Y_i$ of these variables also has a gamma distribution, $S \sim \mathrm{Gamma}(na, b)$. The second property states that the normalized components $w_i = Y_i / S$ are independent of the sum $S$; moreover, as shown by Lukacs [lukacs1955characterization], this characterizing property holds if and only if the $Y_i$ are gamma distributed with the same scale $b$. Finally, the normalized vector $(w_1,\dots,w_n)$ has the symmetric Dirichlet distribution with parameter $a$.

Conversely, given a Dirichlet distributed vector $(w_1,\dots,w_n)$ and an independent gamma distributed $S \sim \mathrm{Gamma}(na, b)$, the 'restored' variables $Y_i = w_i S$ have gamma distributions $\mathrm{Gamma}(a, b)$.

Obviously, these properties also hold for the ordered Dirichlet distribution. For instance, with ranked components $w_{(1)} \ge \dots \ge w_{(n)}$ obtained from the symmetric Dirichlet distribution and an independent $S \sim \mathrm{Gamma}(na, b)$, the restored gamma variables $Y_{(k)} = w_{(k)} S$ are also ranked in descending order.
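These properties are easy to check numerically. The sketch below is a Monte Carlo illustration with arbitrarily chosen values (five components, shape 2, scale 3); it uses the sample correlation only as a rough proxy for independence, which is of course weaker than independence itself.

```python
import numpy as np

rng = np.random.default_rng(1)
n, shape, scale, n_draws = 5, 2.0, 3.0, 100_000  # illustrative values

# Forward direction: normalize i.i.d. gamma variables by their sum.
y = rng.gamma(shape, scale, size=(n_draws, n))
s = y.sum(axis=1)
w = y / s[:, None]

print("mean of sum:", s.mean(), "expected", n * shape * scale)
print("corr(w_1, sum):", np.corrcoef(w[:, 0], s)[0, 1], "expected about 0")
print("mean weights:", w.mean(axis=0), "expected", 1.0 / n)

# Converse: Dirichlet weights times an independent gamma sum
# 'restore' gamma variables with the original shape and scale.
d = rng.dirichlet(np.full(n, shape), size=n_draws)
s_new = rng.gamma(n * shape, scale, size=n_draws)
restored = d * s_new[:, None]
print("restored mean, var:", restored[:, 0].mean(), restored[:, 0].var(),
      "expected", shape * scale, shape * scale ** 2)
```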

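A similar characterization of the $PD(\alpha,\theta)$ law is provided by Proposition 21 in Pitman and Yor [PY], which can be informally restated as follows. Let us consider a tempered stable subordinator $(\tau_s)_{s \ge 0}$ with Lévy density $\rho(x) = \alpha C x^{-\alpha-1} e^{-x}$, $x > 0$, on the random time interval $[0, T]$, with $T = \gamma_{\theta/\alpha} / (C\,\Gamma(1-\alpha))$, where $\gamma_{\theta/\alpha}$ is a $\mathrm{Gamma}(\theta/\alpha, 1)$ variable independent of the subordinator, and denote the ranked jumps of the subordinator in this interval by $J_1 \ge J_2 \ge \dots$. The sum of these jumps is equal to the value of the tempered subordinator stopped at the random time $T$:

$$\tau_T = \sum_{i=1}^{\infty} J_i.$$

As in the case of the Dirichlet distribution, Proposition 21 in [PY] states that the sum of the jumps has a gamma distribution, $\tau_T \sim \mathrm{Gamma}(\theta, 1)$. The second statement of the proposition is that the normalized jumps $J_i / \tau_T$ are independent of the sum $\tau_T$. Finally, the sequence of normalized jumps $(J_1/\tau_T, J_2/\tau_T, \dots)$ has the Poisson-Dirichlet distribution with parameters $(\alpha, \theta)$. In what follows, Proposition 21 provides a convenient way of modeling the stochastic evolution of stock prices 'restored' from the dynamics of market weights.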
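The gamma law of the stopped subordinator can be checked by a short Laplace-transform computation. The sketch below assumes the normalization of Pitman and Yor: a subordinator $\tau$ with Lévy density $\alpha C x^{-\alpha-1} e^{-x}$ for $0 < \alpha < 1$, $C > 0$, and a random time $T = \gamma_{\theta/\alpha} / (C\,\Gamma(1-\alpha))$, where $\gamma_{\theta/\alpha}$ is a $\mathrm{Gamma}(\theta/\alpha, 1)$ variable independent of $\tau$ and $\theta > 0$. The symbols are notation chosen for this illustration.

```latex
\begin{align*}
\mathbb{E}\, e^{-\lambda \tau_s}
  &= \exp\Big(-s \int_0^\infty \big(1 - e^{-\lambda x}\big)\,
       \alpha C\, x^{-\alpha-1} e^{-x}\, dx\Big)
   = \exp\Big(-s\, C\,\Gamma(1-\alpha)\,\big[(1+\lambda)^{\alpha} - 1\big]\Big), \\
\mathbb{E}\, e^{-\lambda \tau_T}
  &= \mathbb{E} \exp\Big(-\gamma_{\theta/\alpha}\,\big[(1+\lambda)^{\alpha} - 1\big]\Big)
   = \Big(1 + (1+\lambda)^{\alpha} - 1\Big)^{-\theta/\alpha}
   = (1+\lambda)^{-\theta}.
\end{align*}
```

The last expression is the Laplace transform of the $\mathrm{Gamma}(\theta, 1)$ distribution, in agreement with the first statement of the proposition.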

1.5 $PD(\alpha,\theta)$-market model

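It is natural to employ the stick-breaking and size-biased sampling methods, described in the sections on stick-breaking and on the Poisson-Dirichlet distribution, for modeling a diffusion with a stationary Poisson-Dirichlet distribution. This approach was first proposed by Feng and Wang [fengwang], who also proved reversibility of the corresponding infinite-dimensional process. Let us recall that the Wright-Fisher diffusion process driven by the SDE

$$dv(t) = \tfrac{1}{2}\big[a\,(1 - v(t)) - b\,v(t)\big]\,dt + \sqrt{v(t)\,(1 - v(t))}\;dB(t)$$

has a reversible stationary beta distribution $\mathrm{Beta}(a, b)$.

If $w_k(t)$ denotes the market weight of the $k$-th largest stock at time $t$, then the stochastic evolution of market weights can be determined from the stick-breaking process

$$w_1(t) = v_1(t), \qquad w_k(t) = v_k(t) \prod_{j=1}^{k-1} \big(1 - v_j(t)\big), \quad k \ge 2,$$

where the processes $v_k(t)$ are determined by independent SDEs

$$dv_k(t) = \tfrac{1}{2}\big[(1-\alpha)\,(1 - v_k(t)) - (\theta + k\alpha)\,v_k(t)\big]\,dt + \sqrt{v_k(t)\,(1 - v_k(t))}\;dB_k(t)$$

with stationary beta distributions corresponding to the size-biased sampling definition of the two-parameter model,

$$v_k \sim \mathrm{Beta}(1 - \alpha,\ \theta + k\alpha), \quad k = 1, 2, \dots$$

The initial values of the processes $v_k$ are determined by

$$v_1(0) = w_1(0), \qquad v_k(0) = \frac{w_k(0)}{1 - \sum_{j=1}^{k-1} w_j(0)}, \quad k \ge 2.$$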

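The local evolution of the overall market capitalization $M(t)$ can be modelled by the diffusion

$$dM(t) = \big(\theta b - M(t)\big)\,dt + \sqrt{2\,b\,M(t)}\;dZ(t)$$

with stationary gamma distribution $\mathrm{Gamma}(\theta, b)$, where the scale $b$ is defined by the condition $\theta b = M(0)$.
Correspondingly, the local behaviour of stock prices is defined by the product of independent processes

$$P_k(t) = \frac{w_k(t)\,M(t)}{N_k},$$

where $N_k$ denotes the number of shares outstanding.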
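A simulation of this kind of market model can be sketched with a simple Euler scheme. Everything specific here is an assumption of the sketch rather than the author's implementation: the Wright-Fisher drift $\tfrac{1}{2}[a(1-v) - bv]$ with $a = 1-\alpha$, $b_k = \theta + k\alpha$, a square-root diffusion for total capitalization whose stationary law is gamma with shape $\theta$ and scale `cap0 / theta`, and crude clipping to keep the discretized paths inside their domains. The function name and input values are hypothetical.

```python
import numpy as np

def simulate_market(alpha, theta, w0, cap0, shares,
                    n_steps=500, dt=1e-4, seed=0):
    """Euler sketch: stick-breaking weights, total capitalization, prices."""
    rng = np.random.default_rng(seed)
    w0 = np.asarray(w0, dtype=float)
    shares = np.asarray(shares, dtype=float)
    n = w0.size
    k = np.arange(1, n + 1)
    a, b = 1.0 - alpha, theta + k * alpha      # beta parameters of v_k
    # Initial stick-breaking fractions recovered from initial weights.
    v = w0 / (1.0 - np.concatenate(([0.0], np.cumsum(w0)[:-1])))
    scale = cap0 / theta   # assumption: stationary mean equals cap0
    m, eps = float(cap0), 1e-9
    weights = np.empty((n_steps + 1, n))
    total = np.empty(n_steps + 1)
    for step in range(n_steps + 1):
        remaining = np.concatenate(([1.0], np.cumprod(1.0 - v)[:-1]))
        weights[step] = v * remaining
        total[step] = m
        # Wright-Fisher step for each fraction, clipped to (0, 1).
        dB = rng.normal(0.0, np.sqrt(dt), size=n)
        v = v + 0.5 * (a * (1.0 - v) - b * v) * dt + np.sqrt(v * (1.0 - v)) * dB
        v = np.clip(v, eps, 1.0 - eps)
        # Square-root diffusion step for total capitalization, kept positive.
        dZ = rng.normal(0.0, np.sqrt(dt))
        m = max(m + (theta * scale - m) * dt + np.sqrt(2.0 * scale * m) * dZ, eps)
    prices = weights * total[:, None] / shares
    return weights, total, prices

w0 = [0.3, 0.2, 0.1, 0.05]           # hypothetical top-4 market weights
shares = [10.0, 20.0, 5.0, 50.0]     # hypothetical shares outstanding
weights, total, prices = simulate_market(0.5, 10.0, w0, 1000.0, shares)
print(weights[-1], total[-1], prices[-1])
```

Only the first few sticks are tracked, so the simulated weights sum to less than one; the remainder stands for the rest of the market.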

Figure 5: Simulation of weights, overall market value and stock capitalizations with stationary $PD(\alpha,\theta)$ distribution

1.6 The broken-stick model

The broken-stick model is a simple model illustrating how a uniform partition produces inequality patterns. MacArthur [macarthur1957relative] proposed this model to explain relative species abundances in a closed environment.

Let us assume that a stick of unit length represents some finite resource, such as territory, available food, a water reservoir, etc., which must be shared between species. The resource is broken at random by throwing $n-1$ cutting points uniformly on the stick and breaking it into $n$ pieces. The length of each piece represents the share taken by some class of species. While on average the length of each piece is $1/n$, the ranked lengths of the pieces display interesting behavior.

For instance, if the stick is broken into just two pieces, then the length of the smaller piece is never larger than 50%, and since the cut point is uniformly distributed it is easy to see that the smaller piece on average represents 25% of the length, while the larger one takes 75%. In general it can be shown that after breaking the stick into $n$ pieces the expected length of the $k$-th largest piece is given by

$$\mathbb{E}\big[L_{(k)}\big] = \frac{1}{n} \sum_{j=k}^{n} \frac{1}{j}, \quad k = 1, \dots, n.$$

In the case of 3 pieces the expected proportions ranked in descending order are 61.1%, 27.8% and 11.1%. It can be checked by straightforward simulation that dropping 4 points uniformly at random on the unit interval produces on average the following ranked lengths of the 5 subintervals:

45.7%, 25.7%, 15.7%, 9.0%, 4.0%.

Obviously, sampled proportions fluctuate around these expected lengths. For larger values of $n$ the ranked expected proportions start to decay rapidly, and it is more convenient to display them on a log-log plot.

Figure 6: Expected ranked proportions (log-log plot)

This example illustrates that asymmetry in ranked proportions appears even with a completely uniform distribution of the resource.
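The expected ranked lengths of the broken-stick model, $\frac{1}{n}\sum_{j=k}^{n} 1/j$ for the $k$-th largest of $n$ pieces, can be compared with a direct simulation. The sketch below is a minimal check; the function names, the number of trials and the seed are arbitrary choices.

```python
import numpy as np

def expected_ranked_lengths(n):
    """E[k-th largest piece] = (1/n) * sum_{j=k}^{n} 1/j for k = 1..n."""
    inv = 1.0 / np.arange(1, n + 1)
    # Reverse cumulative sum gives sum_{j=k}^{n} 1/j at position k-1.
    return np.cumsum(inv[::-1])[::-1] / n

def simulated_ranked_lengths(n, n_trials=100_000, seed=0):
    """Average ranked piece lengths after n-1 uniform cuts of a unit stick."""
    rng = np.random.default_rng(seed)
    cuts = np.sort(rng.random((n_trials, n - 1)), axis=1)
    edges = np.hstack([np.zeros((n_trials, 1)), cuts, np.ones((n_trials, 1))])
    pieces = np.diff(edges, axis=1)
    ranked = np.sort(pieces, axis=1)[:, ::-1]  # descending within each trial
    return ranked.mean(axis=0)

for n in (3, 5):
    print(n, np.round(expected_ranked_lengths(n), 4),
          np.round(simulated_ranked_lengths(n), 4))
```

For large $n$, plotting `np.log(expected_ranked_lengths(n))` against the logarithm of the rank gives the log-log display discussed above.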

1.7 Toy model

Let us imagine that there are only two stocks, with capitalizations 3 and 2, in a market with total capitalization 5. Tickers or names do not play an important role and are used only to distinguish the stocks. The ten ways in which 5 units of money can form a state with these capitalizations are represented by the ten Young tableaux shown below.


  123 / 45    124 / 35    134 / 25    15 / 234    125 / 34

  12 / 345    135 / 24    13 / 245    145 / 23    14 / 235

Here the rows of each tableau are separated by "/", and box 5 is the box added last.

Since these partitions have the same block sizes it is convenient to use the Young diagram of shape (3,2) to denote all partitions with the same shape. The 10 partitions above arise by adding a new box: in one way to each of the 4 partitions with shape (3,1) and in two ways to each of the 3 partitions with shape (2,2).
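The counts in this toy model can be verified by brute-force enumeration. The sketch below lists all set partitions of five labelled units of money, groups them by shape, and compares the result with the standard counting formula $n! / (\prod_i s_i! \prod_j m_j!)$, where $s_i$ are the block sizes and $m_j$ the multiplicities of equal sizes. The helper names are hypothetical.

```python
from collections import Counter
from math import factorial

def set_partitions(elements):
    """Yield every partition of the list `elements` into non-empty blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing block in turn ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + partition

def shape(partition):
    """Block sizes in descending order, e.g. (3, 2)."""
    return tuple(sorted((len(block) for block in partition), reverse=True))

def count_by_formula(sizes):
    """Number of set partitions with the given block sizes."""
    count = factorial(sum(sizes))
    for s in sizes:
        count //= factorial(s)
    for multiplicity in Counter(sizes).values():
        count //= factorial(multiplicity)
    return count

shapes5 = Counter(shape(p) for p in set_partitions([1, 2, 3, 4, 5]))
shapes4 = Counter(shape(p) for p in set_partitions([1, 2, 3, 4]))
print(shapes5[(3, 2)], shapes4[(3, 1)], shapes4[(2, 2)])
print(count_by_formula((3, 2)), count_by_formula((3, 1)), count_by_formula((2, 2)))
```

The recursion 4 x 1 + 3 x 2 = 10 from the text corresponds to adding the fifth unit to the singleton block of a (3,1) partition or to either block of a (2,2) partition.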


  1. Data source is http://www.google.com/finance#stockscreener