New Dirichlet Mean Identities

Lancelot F. James (lancelot@ust.hk)
The Hong Kong University of Science and Technology,
Department of Information and Systems Management,
Clear Water Bay, Kowloon, Hong Kong.

Abstract

An important line of research is the investigation of the laws of random variables known as Dirichlet means, as discussed in Cifarelli and Regazzini CifarelliRegazzini (). However, there is not much information on the inter-relationships between different Dirichlet means. Here we introduce two distributional operations: multiplying a mean functional by an independent beta random variable, and an exponential change of measure. These operations identify relationships between different means and their densities. As a result, the often considerable analytic work needed to obtain results for one Dirichlet mean can be reused to obtain results for an entire family of otherwise seemingly unrelated Dirichlet means. Additionally, one obtains explicit densities for the related class of random variables that have generalized gamma convolution distributions, and the finite-dimensional distributions of their associated Lévy processes. This has implications in, for instance, the explicit description of Bayesian nonparametric prior and posterior models, and more generally in a variety of applications in probability and statistics involving Lévy processes. We demonstrate how the technique applies to several interesting examples.

Supported in part by the grant HIA05/06.BM03 of the HKSAR.

AMS subject classifications: Primary 62G05; secondary 62F15.

Keywords: beta-gamma algebra, Dirichlet means and processes, exponential tilting, generalized gamma convolutions, Lévy processes.

1 Introduction

In this work we present two distributional operations which identify relationships between seemingly different classes of random variables that are representable as linear functionals of a Dirichlet process, otherwise known as Dirichlet means. Specifically, the first operation consists of multiplying a Dirichlet mean by an independent beta random variable, and the second involves an exponential change of measure applied to the density of a related infinitely divisible random variable having a generalized gamma convolution (GGC) distribution. This latter operation is often referred to in the statistical literature as exponential tilting, or in mathematical finance as an Esscher transform. We believe our results add a significant component to the foundational work of Cifarelli and Regazzini CifarelliRegazzini79 (); CifarelliRegazzini (). In particular, our results allow the often considerable analytic work needed to obtain results for one Dirichlet mean to be reused to obtain results for an entire family of otherwise seemingly unrelated mean functionals. They also allow one to obtain explicit densities for the related class of infinitely divisible random variables which are generalized gamma convolutions, and the finite-dimensional distributions of their associated Lévy processes (see Bertoin Bertoin () for the formalities of general Lévy processes). The importance of this latter statement is that Lévy processes now commonly appear in a variety of applications in probability and statistics. A detailed summary and outline of our results may be found in section 1.2. Some background information and notation on Dirichlet processes and Dirichlet means, their connection with GGC random variables, recent references and some motivation for our work are given in the next section.

1.1 Background and motivation

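Let $X$ be a non-negative random variable with cumulative distribution function $F_X$. Furthermore, for a measurable set $C$ we use the notation $F_X(C)$ to mean the probability that $X$ is in $C$. One may define a Dirichlet process random probability measure, see Freedman () and Ferguson73 (); Ferguson74 (), say $P_\theta$ on $[0,\infty)$ with total mass parameter $\theta>0$ and prior parameter $F_X$, via its finite-dimensional distributions as follows: for any disjoint partition of $[0,\infty)$, say $(C_1,\ldots,C_k)$, the distribution of the random vector $(P_\theta(C_1),\ldots,P_\theta(C_k))$ is a $k$-variate Dirichlet distribution with parameters $(\theta F_X(C_1),\ldots,\theta F_X(C_k))$. Hence for each $C$,

$$P_\theta(C)=\int_0^\infty I(x\in C)\,P_\theta(dx)$$

has a beta distribution with parameters $(\theta F_X(C),\theta(1-F_X(C)))$. Equivalently, setting $\theta_i=\theta F_X(C_i)$ for $i=1,\ldots,k$,

$$(P_\theta(C_1),\ldots,P_\theta(C_k))\overset{d}{=}\left(\frac{G_{\theta_1}}{G_\theta},\ldots,\frac{G_{\theta_k}}{G_\theta}\right),$$

where $(G_{\theta_i})$ are independent random variables with gamma$(\theta_i,1)$ distributions and $G_\theta=G_{\theta_1}+\cdots+G_{\theta_k}$ has a gamma$(\theta,1)$ distribution. This means that one can define the Dirichlet process via the normalization of an independent increment gamma process on $[0,\infty)$, say $\gamma_\theta(\cdot)$, as

$$P_\theta(\cdot)=\frac{\gamma_\theta(\cdot)}{\gamma_\theta([0,\infty))},$$

where $\gamma_\theta(C_i)\overset{d}{=}G_{\theta_i}$ and whose almost surely finite total random mass is $\gamma_\theta([0,\infty))\overset{d}{=}G_\theta$. A very important aspect of this construction is the fact that $G_\theta$ is independent of $P_\theta$ and hence of any functional of $P_\theta$. This is a natural generalization of Lukacs' Lukacs () characterization of beta and gamma random variables, whose work is fundamental to what is now referred to as the beta-gamma algebra (for more on this, see Chaumont and Yor (Chaumont (), section 4.2)). See also Emery and Yor EmeryYor () for some interesting relationships between gamma processes, Dirichlet processes and Brownian bridges.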
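The normalization construction can be illustrated numerically. The following is a minimal sketch, not part of the original development: it draws a finite-dimensional Dirichlet vector by normalizing independent gamma variables and reports the sample mean of the first weight together with its sample correlation with the gamma total. The function names and the parameter values in the usage note are hypothetical choices made for illustration.

```python
import math
import random


def dirichlet_via_gammas(alphas, rng):
    """Return (weights, total): Dirichlet weights obtained by normalizing
    independent gamma(alpha_i, 1) variables, plus their sum."""
    gammas = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(gammas)
    return [g / total for g in gammas], total


def first_weight_summary(alphas, n=20000, seed=0):
    """Sample mean of the first weight and its sample correlation
    with the gamma total, over n independent draws."""
    rng = random.Random(seed)
    ws, ts = [], []
    for _ in range(n):
        weights, total = dirichlet_via_gammas(alphas, rng)
        ws.append(weights[0])
        ts.append(total)
    mw, mt = sum(ws) / n, sum(ts) / n
    cov = sum((w - mw) * (t - mt) for w, t in zip(ws, ts)) / n
    sw = math.sqrt(sum((w - mw) ** 2 for w in ws) / n)
    st = math.sqrt(sum((t - mt) ** 2 for t in ts) / n)
    return mw, cov / (sw * st)
```

For example, with `alphas = [0.6, 0.9, 1.5]` the first weight is beta(0.6, 2.4) with mean 0.2, and the independence of the normalized weights from the total suggests a sample correlation near zero. A vanishing correlation is of course only a necessary symptom of independence, not a proof of it.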

These simple representations and other nice features of the Dirichlet process have, since the important work of Ferguson Ferguson73 (); Ferguson74 (), contributed greatly to the relevance and practical utility of the field of Bayesian non- and semi-parametric statistics. Naturally, owing to the ubiquity of the gamma and beta random variables, the Dirichlet process also arises in other areas. One of the more interesting, and we believe quite important, topics related to the Dirichlet process is the study of the laws of random variables called Dirichlet mean functionals, or simply Dirichlet means, which we denote as

$$M_\theta(F_X)=\int_0^\infty x\,P_\theta(dx),$$

for $P_\theta$ a Dirichlet process with total mass $\theta$ and prior parameter $F_X$. Their study was initiated in the works of Cifarelli and Regazzini CifarelliRegazzini79 (); CifarelliRegazzini (). In CifarelliRegazzini () the authors obtained an important identity for the Cauchy-Stieltjes transform of order $\theta$. This identity is often referred to as the Markov-Krein identity, as can be seen in, for example, Diaconis and Kemperman Diaconis (), Kerov Kerov () and Vershik, Yor and Tsilevich Vershik (), where these authors highlight its importance to, for instance, the study of the Markov moment problem, continued fraction theory and exponential representations of analytic functions. This identity is later called the Cifarelli-Regazzini identity in James2005 (). Cifarelli and Regazzini CifarelliRegazzini (), owing to their primary interest, used this identity to obtain explicit density and cdf formulae for $M_\theta(F_X)$. The density formulae may be seen as Abel-type transforms and hence do not always have simple forms, although we stress that they are still useful for some analytic calculations. The general exception is the case of $\theta=1$, which has a nice form. Some examples of works that have proceeded along these lines are Cifarelli and Melilli CifarelliMelilli (), Regazzini, Guglielmi and di Nunno Regazzini2002 (), Regazzini, Lijoi and Prünster Regazzini2003 (), Hjort and Ongaro Hjort (), Lijoi and Regazzini Lijoi (), and Epifani, Guglielmi and Melilli Epifani2004 (); Epifani2006 (). Moreover, the recent works of Bertoin, Fujita, Roynette and Yor BFRY () and James, Lijoi and Prünster JLP () (see also JamesGamma (), which is a preliminary version of this work) show that the study of mean functionals is relevant to the analysis of phenomena related to Bessel and Brownian processes. In fact, the work of James, Lijoi and Prünster JLP () identifies many new explicit examples of Dirichlet means which have interesting interpretations.

Related to these last points, Lijoi and Regazzini Lijoi () have highlighted a close connection to the theory of generalized gamma convolutions (see BondBook ()). Specifically, it is known that a rich sub-class of random variables having generalized gamma convolution (GGC) distributions may be represented as

$$T_\theta=G_\theta M_\theta(F_X)=\int_0^\infty x\,\gamma_\theta(dx), \tag{1.1}$$

where $G_\theta$ is a gamma$(\theta,1)$ random variable independent of $M_\theta(F_X)$. We call these random variables GGC$(\theta,F_X)$. In addition we see from (1.1) that $T_\theta$ is a random variable derived from a weighted gamma process, and hence the calculus discussed in Lo Lo82 () and Lo and Weng LW () applies. In general GGC random variables are an important class of infinitely divisible random variables whose properties have been extensively studied by BondBook () and others. We note further that although we have written a GGC$(\theta,F_X)$ random variable as $G_\theta M_\theta(F_X)$, this representation is not unique and in fact it is quite rare to see $T_\theta$ represented in this way. We will show that one can in fact exploit this non-uniqueness to obtain explicit densities for $T_\theta$ even when it is not so easy to do so for $M_\theta(F_X)$. While the representation is not unique, it helps one to understand the relationship between the Laplace transform of $T_\theta$ and the Cauchy-Stieltjes transform of order $\theta$ of $M_\theta(F_X)$, which indeed characterize respectively the laws of $T_\theta$ and $M_\theta(F_X)$. Specifically, using the independence of $G_\theta$ and $M_\theta(F_X)$ leads to, for $\lambda\ge 0$,

$$E\left[e^{-\lambda T_\theta}\right]=E\left[(1+\lambda M_\theta(F_X))^{-\theta}\right]=e^{-\theta\psi_{F_X}(\lambda)}, \tag{1.2}$$

where

$$\psi_{F_X}(\lambda)=\int_0^\infty \log(1+\lambda x)\,F_X(dx)=E[\log(1+\lambda X)] \tag{1.3}$$

is the Lévy exponent of a GGC$(1,F_X)$ random variable. We note that $T_\theta$ and $M_\theta(F_X)$ exist if and only if $\psi_{F_X}(\lambda)<\infty$ for $\lambda>0$ (see for instance DS () and BondBook ()). The expression in (1.2) coincides with the identity obtained by Cifarelli and Regazzini CifarelliRegazzini (), mentioned previously.
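The identity between the Cauchy-Stieltjes transform of a Dirichlet mean and the exponential of the Lévy exponent can be checked by simulation in the simplest setting. The sketch below assumes that $X$ takes finitely many values `xs` with probabilities `ps`, so that the Dirichlet mean is a finite Dirichlet-weighted sum; the function names and the Monte Carlo sample size are illustrative assumptions, not part of the original text.

```python
import math
import random


def levy_exponent(lam, xs, ps):
    """psi(lam) = E[log(1 + lam * X)] for a finitely supported X."""
    return sum(p * math.log1p(lam * x) for x, p in zip(xs, ps))


def cauchy_stieltjes_mc(lam, theta, xs, ps, n=50000, seed=1):
    """Monte Carlo estimate of E[(1 + lam * M)^(-theta)], where M is the
    Dirichlet mean with total mass theta and a prior supported on xs."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(n):
        # Dirichlet weights via normalized independent gamma variables
        g = [rng.gammavariate(theta * p, 1.0) for p in ps]
        mean = sum(x * gi for x, gi in zip(xs, g)) / sum(g)
        acc += (1.0 + lam * mean) ** (-theta)
    return acc / n
```

Comparing `cauchy_stieltjes_mc(lam, theta, xs, ps)` with `math.exp(-theta * levy_exponent(lam, xs, ps))` for a few values of `lam` gives agreement up to Monte Carlo error in this finite-support case.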

Despite these interesting results, there is very little work on the relationship between different mean functionals. Suppose, for instance, that for each fixed value of $\theta>0$, $M_\theta(F_X)$ denotes a Dirichlet mean and $(M_\theta(F_{Z_c});c>0)$ denotes a collection of Dirichlet mean random variables indexed by a family of distributions $(F_{Z_c};c>0)$. Then one can ask the question: for what choices of $X$ and $Z_c$ are these mean functionals related, and in what sense? In particular, one may wish to know how their densities are related. The rationale here is that if such a relationship is established, then the effort that one puts forth to obtain results such as the explicit density of $M_\theta(F_X)$ can be applied to an entire family of Dirichlet means $(M_\theta(F_{Z_c});c>0)$. Furthermore, since Dirichlet means are associated with GGC random variables, this would establish relationships between a GGC$(\theta,F_X)$ random variable and a family of GGC$(\theta,F_{Z_c})$ random variables. Simple examples are of course the choices $Z_c=X+c$ and $Z_c=cX$ which, due to the linearity properties of mean functionals, result easily in the identities in law

$$M_\theta(F_{X+c})\overset{d}{=}c+M_\theta(F_X)\quad\text{and}\quad M_\theta(F_{cX})\overset{d}{=}c\,M_\theta(F_X).$$

Naturally, we are going to discuss more complex relationships, but with the same goal. That is, we will identify non-trivial relationships so that the often considerable efforts that one makes in the study of one mean functional can be then used to obtain more easily results for other mean functionals, their corresponding GGC random variables and Lévy processes. In this paper we will describe two such operations which we elaborate on in the next subsection.

1.2 Outline and summary of results

Section 1.3 reviews some of the existing formulae for the density and cdf of Dirichlet means. In Section 2, we describe the operation of multiplying a mean functional $M_{\theta\sigma}(F_X)$ by an independent beta random variable with parameters $(\theta\sigma,\theta(1-\sigma))$, say $\beta_{\theta\sigma,\theta(1-\sigma)}$, where $0<\sigma<1$. We call this operation beta scaling. Theorem 2.1 shows that the resulting random variable $\beta_{\theta\sigma,\theta(1-\sigma)}M_{\theta\sigma}(F_X)$ is again a mean functional but now of order $\theta$. In addition, the GGC$(\theta\sigma,F_X)$ random variable $G_{\theta\sigma}M_{\theta\sigma}(F_X)$ is equivalently a GGC random variable of order $\theta$. Now keeping in mind that tractable densities of mean functionals of order $\theta=1$ are the easiest to obtain, Theorem 2.1 shows that by setting $\theta=1$, the random variables in the uncountable collection $(\beta_{\sigma,1-\sigma}M_\sigma(F_X);0<\sigma\le 1)$ are all mean functionals of order $\theta=1$. Theorem 2.2 then shows that efforts used to calculate the explicit density of any one of these random variables, via the formulae of CifarelliRegazzini (), lead to the explicit calculation of the densities of all of them. Additionally, Theorem 2.2 shows that the corresponding GGC$(\sigma,F_X)$ random variables may all be expressed as GGC random variables of order $\theta=1$, representable in distribution as $G_1\beta_{\sigma,1-\sigma}M_\sigma(F_X)$. A key point here is that Theorem 2.2 gives a tractable density for $\beta_{\sigma,1-\sigma}M_\sigma(F_X)$ without requiring knowledge of the density of $M_\sigma(F_X)$, which is usually expressed in a complicated manner. These results also yield some non-obvious integral identities. Furthermore, noting that a GGC$(\theta,F_X)$ random variable $T_\theta$ is infinitely divisible, we associate it with an independent increment process known as a subordinator (a non-decreasing non-negative Lévy process), say $(\zeta_\theta(t);t\ge 0)$, where for each fixed $t$,

$$E\left[e^{-\lambda\zeta_\theta(t)}\right]=e^{-t\theta\psi_{F_X}(\lambda)},$$

with $\psi_{F_X}$ the Lévy exponent in (1.3). That is, marginally $\zeta_\theta(1)\overset{d}{=}T_\theta$ and $\zeta_\theta(t)\overset{d}{=}T_{\theta t}$. In addition, for $s<t$, $\zeta_\theta(t)-\zeta_\theta(s)$ is independent of $\zeta_\theta(s)$. We say that the process $(\zeta_\theta(t);t\ge 0)$ is a GGC$(\theta,F_X)$ subordinator. Proposition 2.1 shows how Theorems 2.1 and 2.2 can be used to address the usually difficult problem of describing explicitly the densities of the finite-dimensional distributions of a subordinator (see Kingman75 ()). This has implications in, for instance, the explicit description of densities of Bayesian nonparametric prior and posterior models. But clearly this is of wider interest in terms of the distribution theory of infinitely divisible random variables and associated processes.

In Section 3, we describe how the operation of exponentially tilting the density of a GGC random variable leads to a relationship between the densities of the mean functional and yet another family of mean functionals. This is summarized in Theorem 3.1. Section 3.1 then discusses a combination of the two operations. Proposition 3.1 describes the density of beta scaled and tilted mean functionals of order 1. Using this, Proposition 3.2 describes a method to calculate a key quantity in the explicit description of the density and cdf of mean functionals. In section 4 we demonstrate how our results can be applied to extend and explain results related to two well known cases of Dirichlet mean functionals. However, we emphasize that Propositions 4.1, 4.2 and 4.3 are genuinely new results in the literature. More complex applications, which may be viewed as extensions of section 4.2, may be found in an unpublished preliminary version of this work in JamesGamma (). We discuss and develop these further in James JamesLinnik (). Section 5 presents a more involved result relative to those in section 4, but one which does not require a great deal of background material. Here we show how the results in section 2 are used to derive the finite-dimensional distributions and related quantities of a class of subordinators recently studied in BFRY ().

1.3 Preliminaries

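Suppose that $X$ is a positive random variable with distribution $F_X$, and define the function

$$\Phi_{F_X}(t)=\int_0^\infty \log(|t-x|)\,I(t\neq x)\,F_X(dx)=E[\log(|t-X|)\,I(t\neq X)].$$

Furthermore, define

$$\Delta_\theta(t|F_X)=\frac{1}{\pi}\sin(\pi\theta F_X(t))\,e^{-\theta\Phi_{F_X}(t)},$$

where, using a Lebesgue-Stieltjes integral, $F_X(t)=\int_0^t F_X(dx)$. Cifarelli and Regazzini CifarelliRegazzini () (see also CifarelliMelilli ()) apply an inversion formula to obtain the distributional formula for $M_\theta(F_X)$ as follows. For all $\theta>0$, the cdf can be expressed as

$$\int_0^x (x-t)^{\theta-1}\Delta_\theta(t|F_X)\,dt \tag{1.4}$$

provided that $\theta F_X$ possesses no jumps of size greater than or equal to one. If we let $\xi_{\theta F_X}$ denote the density of $M_\theta(F_X)$, it takes its simplest form for $\theta=1$, which is

$$\xi_{F_X}(x)=\Delta_1(x|F_X)=\frac{1}{\pi}\sin(\pi F_X(x))\,e^{-\Phi_{F_X}(x)}. \tag{1.5}$$

Density formulae for $\theta>1$ are described as

$$\xi_{\theta F_X}(x)=(\theta-1)\int_0^x (x-t)^{\theta-2}\Delta_\theta(t|F_X)\,dt. \tag{1.6}$$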

An expression for the density, which holds for all , was recently obtained by James, Lijoi and Prünster JLP () as follows,

(1.7)

where

For additional formulae, see CifarelliRegazzini (); Regazzini2002 (); Lijoi ().
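As a concrete instance of the order-one density, consider the illustrative assumption that $X$ is uniform on $(0,1)$, which is not an example taken from this section. Then $E[\log|x-X|]=x\log x+(1-x)\log(1-x)-1$ for $0<x<1$, and the order-one density becomes $\frac{e}{\pi}\sin(\pi x)\,x^{-x}(1-x)^{-(1-x)}$. The sketch below evaluates this density and checks numerically that it integrates to one; the function names and the midpoint rule are choices made here for illustration.

```python
import math


def phi_uniform(x):
    """E[log|x - U|] for U uniform on (0, 1), valid for 0 < x < 1."""
    return x * math.log(x) + (1.0 - x) * math.log(1.0 - x) - 1.0


def density_order_one_uniform(x):
    """Order-one Dirichlet mean density sin(pi F(x)) exp(-Phi(x)) / pi
    with F(x) = x, i.e. a uniform prior parameter on (0, 1)."""
    return math.sin(math.pi * x) * math.exp(-phi_uniform(x)) / math.pi


def midpoint_integral(f, a, b, n=20000):
    """Composite midpoint rule; never evaluates f at the endpoints."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))
```

Calling `midpoint_integral(density_order_one_uniform, 0.0, 1.0)` returns a value close to one, and by the symmetry of the density about 1/2 the corresponding mean is close to 0.5.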

Remark 1.1.

Throughout, for random variables $R$ and $X$, when we write the product $RX$ we will assume, unless otherwise mentioned, that $R$ and $X$ are independent. This convention also applies to the multiplication of the special random variables that are expressed as mean functionals. That is, the product $M_\theta(F_X)M_\theta(F_Z)$ is understood to be a product of independent Dirichlet means.

Remark 1.2.

Throughout we will use the fact that if $R$ is a gamma random variable, then for random variables $X$ and $Z$, each independent of $R$, $RX\overset{d}{=}RZ$ implies $X\overset{d}{=}Z$. This is true because gamma random variables are simplifiable. For the precise meaning of this term and the conditions, see Chaumont and Yor (Chaumont (), sec. 1.12 and 1.13). This fact also applies to the case where $R$ is a positive stable random variable.

2 Beta Scaling

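In this section we investigate the simple operation of multiplying a Dirichlet mean functional $M_\theta(F_X)$ by certain beta random variables. Note first that if $M$ denotes an arbitrary positive random variable with density $f_M$, then by elementary arguments it follows that the random variable $W=\beta_{a,b}M$, where $\beta_{a,b}$ is beta$(a,b)$ independent of $M$, has density expressible as

$$f_W(w)=\frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\int_0^1 f_M(w/u)\,u^{a-2}(1-u)^{b-1}\,du.$$

However, it is only in special cases that the density can be expressed in even simpler terms. That is to say, it is not obvious how to carry out the integration. In the next results we show how remarkable simplifications can be achieved when in particular $M=M_\sigma(F_X)$ for the range $0<\sigma<1$ and $\beta_{\sigma,1-\sigma}$ is a symmetric beta random variable. First we need to introduce some additional notation. Let $Y_\sigma$ denote a Bernoulli random variable with success probability $0<\sigma\le 1$. Then if $X$ is a random variable with distribution $F_X$, independent of $Y_\sigma$, it follows that the random variable $XY_\sigma$ has distribution denoted as

$$F_{XY_\sigma}(dx)=\sigma F_X(dx)+(1-\sigma)\delta_0(dx) \tag{2.1}$$

and cdf

$$F_{XY_\sigma}(x)=\sigma F_X(x)+(1-\sigma)I(x\ge 0). \tag{2.2}$$

Hence, there exists the mean functional

$$M_\theta(F_{XY_\sigma})=\int_0^\infty x\,\tilde P_\theta(dx),$$

where $\tilde P_\theta$ denotes a Dirichlet process with parameters $(\theta,F_{XY_\sigma})$. In addition we have, for $t>0$,

$$E[\log(|t-XY_\sigma|)\,I(t\neq XY_\sigma)]=\sigma E[\log(|t-X|)\,I(t\neq X)]+(1-\sigma)\log(t). \tag{2.3}$$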

When and hence Let denote a set such that Now notice that every beta random variable, where are arbitrary positive constants, can be represented as the simple mean functional,

by choosing

We note however that there are other choices of that will also yield beta random variables as mean functionals. Throughout we will use the convention that that is the case when We now present our first result.

Theorem 2.1.

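For $\theta>0$ and $0<\sigma<1$, let $\beta_{\theta\sigma,\theta(1-\sigma)}$ denote a beta random variable with parameters $(\theta\sigma,\theta(1-\sigma))$, independent of the mean functional $M_{\theta\sigma}(F_X)$. Then

  1. $\beta_{\theta\sigma,\theta(1-\sigma)}M_{\theta\sigma}(F_X)\overset{d}{=}M_\theta(F_{XY_\sigma})$.

  2. Equivalently, $M_\theta(F_{Y_\sigma})M_{\theta\sigma}(F_X)\overset{d}{=}M_\theta(F_{XY_\sigma})$.

  3. $G_\theta M_\theta(F_{XY_\sigma})\overset{d}{=}G_{\theta\sigma}M_{\theta\sigma}(F_X)$.

  4. That is, GGC$(\theta\sigma,F_X)=$ GGC$(\theta,F_{XY_\sigma})$.

Proof.

Since $\beta_{\theta\sigma,\theta(1-\sigma)}\overset{d}{=}M_\theta(F_{Y_\sigma})$, statements (i) and (ii) are equivalent. We proceed by first establishing (iii) and (iv). Note that using (1.3),

$$E[\log(1+\lambda XY_\sigma)]=\sigma E[\log(1+\lambda X)]=\sigma\psi_{F_X}(\lambda).$$

Hence

$$E\left[e^{-\lambda G_\theta M_\theta(F_{XY_\sigma})}\right]=e^{-\theta\sigma\psi_{F_X}(\lambda)}=E\left[e^{-\lambda G_{\theta\sigma}M_{\theta\sigma}(F_X)}\right],$$

which means that $G_\theta M_\theta(F_{XY_\sigma})\overset{d}{=}G_{\theta\sigma}M_{\theta\sigma}(F_X)$, establishing statements (iii) and (iv). Now writing $G_{\theta\sigma}\overset{d}{=}G_\theta\beta_{\theta\sigma,\theta(1-\sigma)}$, it follows that

$$G_\theta M_\theta(F_{XY_\sigma})\overset{d}{=}G_\theta\,\beta_{\theta\sigma,\theta(1-\sigma)}M_{\theta\sigma}(F_X).$$

Hence $M_\theta(F_{XY_\sigma})\overset{d}{=}\beta_{\theta\sigma,\theta(1-\sigma)}M_{\theta\sigma}(F_X)$ by the fact that gamma random variables are simplifiable. ∎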
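The proof leans on the standard beta-gamma fact that a gamma variable of shape $\theta$ multiplied by an independent beta variable with parameters $(\theta\sigma,\theta(1-\sigma))$ is again gamma, with shape $\theta\sigma$. A minimal simulation sketch of that single step follows; the function names, sample size and seed are illustrative assumptions, and matching two moments is only a sanity check, not a proof of equality in law.

```python
import random


def mean_and_variance(samples):
    """Sample mean and (population-style) sample variance."""
    n = len(samples)
    m = sum(samples) / n
    return m, sum((s - m) ** 2 for s in samples) / n


def beta_gamma_check(theta, sigma, n=40000, seed=2):
    """Return moments of G_theta * beta(theta*sigma, theta*(1-sigma))
    and of a directly sampled gamma(theta*sigma, 1)."""
    rng = random.Random(seed)
    product = [
        rng.gammavariate(theta, 1.0)
        * rng.betavariate(theta * sigma, theta * (1.0 - sigma))
        for _ in range(n)
    ]
    direct = [rng.gammavariate(theta * sigma, 1.0) for _ in range(n)]
    return mean_and_variance(product), mean_and_variance(direct)
```

With `theta = 2.5` and `sigma = 0.4` both samples should have mean and variance close to 1, the mean and variance of a gamma variable with unit shape and unit scale.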

When , we obtain results for random variables The symmetric beta random variables arise in a variety of important contexts, and are often referred to as generalized arcsine laws with density expressible as

Now using (2.1) and (2.2), let then for

(2.4)

Note also that The next result yields another surprising property of these random variables.

Theorem 2.2.

Consider the setting in the Theorem 2.1. Then when , it follows that for each fixed the random variable has density

(2.5)

specified by (2.4). Since GGCGGC this implies that the random variable has density

(2.6)
Proof.

Since the density is of the form (1.5), for each fixed Furthermore we use the identity in (2.3). ∎
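To see what such a density looks like in a concrete case, the sketch below assumes $X$ uniform on $(0,1)$, an illustrative choice not made in the text. Applying the order-one formula to the distribution of $X$ times an independent Bernoulli variable with success probability $\sigma$ then gives the candidate density $\frac{1}{\pi}x^{\sigma-1}\sin(\pi\sigma(1-x))\,e^{-\sigma\Phi(x)}$ on $(0,1)$, with $\Phi(x)=x\log x+(1-x)\log(1-x)-1$. This expression is a reconstruction under the stated assumptions, and the code only checks that it integrates to one.

```python
import math


def beta_scaled_density_uniform(x, sigma):
    """Candidate density of a beta(sigma, 1 - sigma) variable times an
    independent Dirichlet mean of order sigma with uniform(0, 1) prior."""
    phi = x * math.log(x) + (1.0 - x) * math.log(1.0 - x) - 1.0
    return (x ** (sigma - 1.0) * math.sin(math.pi * sigma * (1.0 - x))
            * math.exp(-sigma * phi) / math.pi)


def total_mass(sigma, n=20000):
    """Integrate the density over (0, 1) after substituting x = u**(1/sigma),
    which removes the integrable x**(sigma - 1) singularity at the origin."""
    h = 1.0 / n
    acc = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        x = u ** (1.0 / sigma)
        # dx = (1 / sigma) * x**(1 - sigma) du
        acc += beta_scaled_density_uniform(x, sigma) * x ** (1.0 - sigma) / sigma
    return acc * h
```

For `sigma = 1.0` the expression reduces to the order-one density for the uniform prior itself, and for `sigma = 0.5` the computed mass is again close to one, without ever using the density of the order-`sigma` mean.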

Remark 2.1.

It is worthwhile to mention that transforming to the random variable , (2.5) is equivalent to the otherwise not obvious integral identity,

This leads to interesting results when the density has a known form. On the other hand, we see that one does not need the explicit density of to obtain the density of In fact, owing to our goal of yielding simple densities for many Dirichlet means from one mean, we see that the effort to calculate the density of for each is no more than what is needed to calculate the density of

We now see how this translates into the usually difficult problem of describing explicitly the density of the finite-dimensional distribution of a subordinator. In the next result we use the notation to mean

Proposition 2.1.

Let denote a GGC subordinator on Furthermore let denote an arbitrary disjoint partition of the interval Then the finite-dimensional distribution has a joint density

(2.7)

where each and The density is given by (2.6). That is, and are independent for where has density

Proof.

First, since partitions the interval it follows that their sizes satisfy and Since is a subordinator the independence of the is a consequence of its independent increment property. In fact these are essentially equivalent statements. Hence, we can isolate each It follows that for each the Laplace transform is given by

which shows that each is GGC for Hence the result follows from Theorem 2.2. ∎

3 Exponential Tilting/Esscher Transform

In this section we describe how the operation of exponential tilting of the density of a GGC$(\theta,F_X)$ random variable leads to a non-trivial relationship between a mean functional determined by $F_X$ and $\theta$ and an entire family of mean functionals indexed by an arbitrary constant $c>0$. Additionally this will identify a non-obvious relationship between two classes of mean functionals. Exponential tilting is merely a catchy phrase for the operation of applying an exponential change of measure to a density or more general measure. In mathematical finance and other applications it is known as an Esscher transform, which is a key tool for option pricing. We mention that much is known about exponential tilting of infinitely divisible random variables, and in fact Bondesson (BondBook (), example 3.2.5) discusses explicitly the case of GGC random variables, albeit not in the way we shall describe it. In addition, examining the gamma representation in (1.1) one can see a relationship to Lo and Weng (LW (), Proposition 3.1) (see also Küchler and Sørensen Kuc () and James (James05 (), Proposition 2.1) for results on exponential tilting of Lévy processes). However, here our focus is on the properties of related mean functionals, which leads to genuinely new insights.

Before we elaborate on this, we describe generically what we mean by exponential tilting. Suppose that $T$ denotes an arbitrary positive random variable with density, say, $f_T$. It follows that for each positive $c$ the random variable $cT$ is well-defined and has density

$$\frac{1}{c}f_T(t/c).$$

Exponential tilting refers to the exponential change of measure resulting in a random variable, say $\tilde T_c$, defined by the density

$$f_{\tilde T_c}(t)=\frac{e^{-t}\,\frac{1}{c}f_T(t/c)}{E[e^{-cT}]}.$$

Thus from the random variable $T$ one gets a family of random variables $(\tilde T_c;c>0)$. Obviously the density for each $c$ does not differ much. However, something interesting happens when $T$ is a scale mixture of gamma random variables, i.e., $T=G_\theta M$ for some positive random variable $M$ independent of $G_\theta$. In that case one can show, see JamesGamma (), that $\tilde T_c\overset{d}{=}G_\theta\tilde M_c$, where $\tilde M_c$ is sufficiently distinct for each value of $c$. We demonstrate this for the case where $M=M_\theta(F_X)$.
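The mechanism behind the scale-mixture remark is already visible for a single gamma variable: tilting the density of $c$ times a gamma variable of shape $\theta$ gives again a gamma density of shape $\theta$, with scale $c/(1+c)$. The sketch below checks this pointwise; the function names are hypothetical, and the normalizing constant uses the standard gamma Laplace transform $E[e^{-cG_\theta}]=(1+c)^{-\theta}$.

```python
import math


def gamma_density(t, shape, scale):
    """Density at t > 0 of a gamma variable with the given shape and scale."""
    return (t ** (shape - 1.0) * math.exp(-t / scale)
            / (math.gamma(shape) * scale ** shape))


def tilted_density(t, shape, c):
    """Exponentially tilted density of c * G, with G ~ gamma(shape, 1):
    exp(-t) times the density of c * G, divided by E[exp(-c * G)]."""
    normaliser = (1.0 + c) ** (-shape)  # Laplace transform of G at c
    return math.exp(-t) * gamma_density(t, shape, c) / normaliser
```

For any `t > 0`, `tilted_density(t, shape, c)` agrees with `gamma_density(t, shape, c / (1 + c))`. Conditionally on a mixing variable equal to `m`, the same computation sends the scale `c * m` to `c * m / (1 + c * m)`, which is the map that appears when a gamma scale mixture is tilted.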

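First note that obviously $cM_\theta(F_X)=M_\theta(F_{cX})$ for each $c>0$, which in itself is not a very interesting transformation. Now setting $T_\theta=G_\theta M_\theta(F_X)$ with density denoted as $g_{\theta,F_X}$, the corresponding random variable $\tilde T_c$ resulting from exponential tilting has density

$$e^{-t}\,\frac{1}{c}\,g_{\theta,F_X}(t/c)\,e^{\theta\psi_{F_X}(c)} \tag{3.1}$$

and Laplace transform

$$E\left[e^{-\lambda\tilde T_c}\right]=\frac{E[e^{-c(1+\lambda)T_\theta}]}{E[e^{-cT_\theta}]}=e^{-\theta[\psi_{F_X}(c(1+\lambda))-\psi_{F_X}(c)]}. \tag{3.2}$$

Now for each $c>0$ define the random variable

$$A_c=\frac{cX}{cX+1}.$$

That is, the cdf of the random variable $A_c$ can be expressed as

$$F_{A_c}(y)=F_X\!\left(\frac{y}{c(1-y)}\right),\qquad 0<y<1.$$

In the next theorem we will show that $M_\theta(F_X)$ relates to the family of mean functionals $(M_\theta(F_{A_c});c>0)$ by the tilting operation described above. Moreover, we will describe the relationship between their densities.

Theorem 3.1.

Suppose that $X$ has distribution $F_X$ and for each $c>0$, $A_c=cX/(cX+1)$ is a random variable with distribution $F_{A_c}$. For each $\theta>0$, let $T_\theta=G_\theta M_\theta(F_X)$ denote a GGC$(\theta,F_X)$ random variable having density $g_{\theta,F_X}$. Let $\tilde T_c$ denote a random variable with density and Laplace transform described by (3.1) and (3.2) respectively. Then $\tilde T_c$ is a GGC$(\theta,F_{A_c})$ random variable and hence representable as $G_\theta M_\theta(F_{A_c})$. Furthermore, the following relationships exist between the densities of the mean functionals $M_\theta(F_X)$ and $M_\theta(F_{A_c})$.

  1. Suppose that the density of $M_\theta(F_X)$, say $\xi_{\theta F_X}$, is known. Then the density of $M_\theta(F_{A_c})$ is expressible as

    $$\xi_{\theta F_{A_c}}(y)=\frac{e^{\theta\psi_{F_X}(c)}}{c}\,(1-y)^{\theta-2}\,\xi_{\theta F_X}\!\left(\frac{y}{c(1-y)}\right)$$

    for $0<y<1$.

  2. Conversely, if the density of $M_\theta(F_{A_c})$, $\xi_{\theta F_{A_c}}$, is known then the density of $M_\theta(F_X)$ is given by

    $$\xi_{\theta F_X}(x)=c\,e^{-\theta\psi_{F_X}(c)}\,(1+cx)^{\theta-2}\,\xi_{\theta F_{A_c}}\!\left(\frac{cx}{1+cx}\right).$$

Proof.

We proceed by first examining the Lévy exponent associated with $\tilde T_c$ as described in (3.2). Notice that

$$\theta\psi_{F_X}(c(1+\lambda))=\theta E[\log(1+c(1+\lambda)X)]$$

and $\theta\psi_{F_X}(c)$ is of the same form with $\lambda=0$. Hence, isolating the logarithmic terms, we can focus on the difference

$$\log(1+c(1+\lambda)x)-\log(1+cx).$$

This is equivalent to

$$\log\!\left(1+\lambda\frac{cx}{1+cx}\right),$$

showing that $\tilde T_c$ is GGC$(\theta,F_{A_c})$. This fact can also be deduced from Proposition 3.1 in Lo and Weng LW (). The next step is to identify the density of $M_\theta(F_{A_c})$ in terms of the density of $M_\theta(F_X)$. Using the fact that $cT_\theta=G_\theta\,cM_\theta(F_X)$, one may write the density of $cT_\theta$ in terms of a gamma mixture as

$$\frac{1}{c}\,g_{\theta,F_X}(t/c)=\int_0^\infty \frac{t^{\theta-1}e^{-t/(cm)}}{\Gamma(\theta)(cm)^{\theta}}\,\xi_{\theta F_X}(m)\,dm.$$

Hence, rearranging terms in (3.1), it follows that the density of $\tilde T_c$ can be written as

$$e^{\theta\psi_{F_X}(c)}\int_0^\infty \frac{t^{\theta-1}e^{-t(1+cm)/(cm)}}{\Gamma(\theta)(cm)^{\theta}}\,\xi_{\theta F_X}(m)\,dm.$$

Now further algebraic manipulation makes this look like a mixture of a gamma random variable, as follows,

$$\int_0^\infty \frac{t^{\theta-1}e^{-t(1+cm)/(cm)}}{\Gamma(\theta)}\left(\frac{1+cm}{cm}\right)^{\theta}e^{\theta\psi_{F_X}(c)}(1+cm)^{-\theta}\,\xi_{\theta F_X}(m)\,dm.$$

Hence it is evident that $M_\theta(F_{A_c})$ has the same distribution as a random variable $c\tilde M/(1+c\tilde M)$, where $\tilde M$ has density

$$e^{\theta\psi_{F_X}(c)}(1+cm)^{-\theta}\,\xi_{\theta F_X}(m).$$

Thus statements (i) and (ii) follow. ∎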

3.1 Tilting and Beta Scaling

This section describes what happens when one applies the exponentially tilting operation relative to a mean functional resulting from beta scaling. Recall that the tilting operation applied to described in the previous section sets up a relationship between and Consider the random variable Then tilting as in the previous section leads to the random variable and hence relates

to the Dirichlet mean of order

Now letting denote the distribution of one has

and hence

(3.3)

In a way this shows that the order of beta scaling and tilting can be interchanged. We now derive a result for the cases of and related by the tilting operation described above. Combining Theorem 2.2 with Theorem 3.1 leads to the following result.

Proposition 3.1.

For each the random variables and satisfy the following;

  1. The density of is expressible as

    for

  2. Conversely, the density of is given by

Proof.

Statement (i) is obtained by first using Theorem 3.1, which gives

for It then remains to substitute the form of the density (2.5) given in Theorem 2.2. Statement [(ii)] proceeds in the same way using (2.6). ∎

Note that even if one can calculate for some fixed value of it may not be so obvious how to calculate it for another value. The previous results allow us to relate their calculation to that of as described next.

Proposition 3.2.

Set and define Then for

Proof.

The result can be deduced by using Proposition 3.1 in the case of First notice that Now equating the form of the density of given by (1.5) with the expression given in Proposition 3.1. It follows that
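An identity of this kind can also be checked by a direct computation that does not pass through the density formulae. The following is a sketch under assumed notation, reconstructed here rather than taken verbatim from the text: $A_c=cX/(cX+1)$, $\Phi_{F}(t)$ denotes the expectation of $\log|t-\cdot|$ under $F$, and $\psi_{F_X}(c)=E[\log(1+cX)]$.

```latex
\begin{align*}
\left|\,y-\frac{cX}{1+cX}\,\right|
  &= \frac{\bigl|\,y-c(1-y)X\,\bigr|}{1+cX}
   = \frac{c(1-y)}{1+cX}\,\left|\,\frac{y}{c(1-y)}-X\,\right|,
   \qquad 0<y<1,\\[4pt]
\Phi_{F_{A_c}}(y)
  &= E\!\left[\log\left|\,y-A_c\,\right|\right]
   = \Phi_{F_X}\!\left(\frac{y}{c(1-y)}\right)
     +\log\bigl(c(1-y)\bigr)-\psi_{F_X}(c).
\end{align*}
```

Under these assumptions, knowing $\Phi_{F_X}$ and $\psi_{F_X}$ is enough to evaluate $\Phi_{F_{A_c}}$ for every $c>0$, which is consistent with the stated aim of relating the calculation for each $c$ to a single baseline calculation.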