Modelling, simulation and inference for multivariate time series of counts

Modelling, simulation and inference for multivariate time series of counts

Almut E. D. Veraart
Department of Mathematics, Imperial College London
180 Queen’s Gate, London, SW7 2AZ, UK

This article presents a new continuous-time modelling framework for multivariate time series of counts which have an infinitely divisible marginal distribution. The model is based on a mixed moving average process driven by Lévy noise – called a trawl process – where the serial correlation and the cross-sectional dependence are modelled independently of each other. Such processes can exhibit short or long memory. We derive a stochastic simulation algorithm and a statistical inference method for such processes. The new methodology is then applied to high frequency financial data, where we investigate the relationship between the number of limit order submissions and deletions in a limit order book.

Keywords: Count data, continuous time modelling of multivariate time series, trawl processes, infinitely divisible, Poisson mixtures, multivariate negative binomial law, limit order book
Mathematics Subject Classification: 60G10, 60G55, 60E07, 62M10, 62P05

1 Introduction

Time series of counts can be viewed as realisations of non-negative integer-valued stochastic processes and arise in various applications in the natural, life and social sciences. As such there has been very active research in the various fields and recent textbooks treatments can be found in Cameron & Trivedi (1998); Kedem & Fokianos (2002); Winkelmann (2003); Davis et al. (2015) and we refer to Davis et al. (1999); McKenzie (2003); Ferland et al. (2006); Weiß (2008); Cui & Lund (2009); Davis & Wu (2009); Jung & Tremayne (2011) for recent surveys and some new developments of the literature.

However, most of these previous works focus on univariate time series of counts and the literature on multivariate extensions is rather sparse and almost exclusively deals with models formulated in discrete time and borrow ideas from traditional autoregressive time series models. E.g. Franke & Rao (1995) and Latour (1997) introduced the first-order integer-valued autoregression model, which is based on the generalised Steutel and van Harn (1979) thinning operator. Recently, Boudreault & Charpentier (2011) applied such models to earthquake counts. Also, the recent handbook on discrete-valued time series by Davis et al. (2015) contains the chapter by Karlis (2015) who surveys recent developments in multivariate count time series models.

One challenge in handling multivariate time series is the modelling of the cross-sectional dependence. While for continuous distributions the theory of copulas presents a powerful toolbox, it has been pointed out by Genest & Nešlehová (2007) that a problem arises in the discrete context due to the non-uniqueness of the associated copula. This can be addressed by using the continuous extension approach by Denuit & Lambert (2005). Indeed, for instance Heinen & Rengifo (2007) introduce a multivariate time series model for counts based on copulas applied to continuously extended discrete random variables and fit the model to the numbers of trades of various assets at the New York stock exchange. Also, Koopman et al. (2015) study discrete copula distributions with time-varying marginals and dependence structure in financial econometrics. Motivated by the reliability literature, Lindskog & McNeil (2003) introduced the so-called common Poisson shock model to describe the arrival of insurance claims in multiple locations or losses due to credit defaults of various types of counterparty.

While the models mentioned above are interesting in their own right, the goal of this article is more ambitious since it formulates a more general modelling framework which can handle a variety of marginal distributions as well as different types of serial dependence including, in particular, both short and long memory specifications. That said, motivated by an application in financial econometrics and recognising the success the class of Lévy processes has in such settings, we focus exclusively on models whose marginal distribution is infinitely divisible. This assumption puts a restriction on the cross-sectional dependence due to the well-known result by Feller (1968), which says that a random vector with infinitely divisible distribution on always has non-negatively correlated components. Moreover, any non-degenerate distribution on is infinitely divisible if and only if it can be expressed as a discrete compound Poisson distribution. We will see that this is nevertheless a very rich class of distributions and suitable for our application to high frequency financial data.

The new modelling framework is based on so-called multivariate integer-valued trawl processes, which are special cases of multivariate mixed moving average processes where the driving noise is given by an integer-valued Lévy basis.

In the univariate case, trawl processes – not necessarily restricted to the integer-valued case – have been introduced by Barndorff-Nielsen (2011). Also, Noven et al. (2015) used such processes in an hierarchical model in the context of extreme value theory. The univariate integer-valued case has been developed in detail in Barndorff-Nielsen et al. (2014). Shephard & Yang (2016b) studied likelihood inference for a particular subclass of an integer-valued trawl process and, more recently, Shephard & Yang (2016a) used such processes to build an econometric model for fleeting discrete price moves. While the multivariate extension was already briefly mentioned in Barndorff-Nielsen et al. (2014), this article develops the theory of multivariate integer-valued trawl (MVIT) processes in detail and presents new methodology for stochastic simulation and statistical inference for such processes and applies the new results to high frequency financial data from a limit order book. The key feature of MIVT processes, which makes them powerful for a wide range of applications is the fact that the serial dependence and the marginal distribution can be modelled independently of each other, which is for instance not the case in the famous DARMA models, see Jacobs & Lewis (1978a, b). As such we will present parsimonious ways of parameterising the serial correlation and will show that we can accommodate both short and long memory processes as well as seasonal fluctuations. Moreover, since MITV processes are formulated in continuous time, we can handle both asynchronous and not necessarily equally spaced observations, which is particularly important in a multivariate set-up.

The motivation for this study comes from high frequency financial econometrics where discrete data arise in a variety of scenarios, e.g. high frequent price moves for stocks with fixed tick size resemble step functions supported on a fixed grid. Also, the number of trades can give us an indication of market activity and is widely analysed in the industry. In this article, we will apply our new methodology to model the relationship between the number of submitted and deleted limit orders in a limit order book, which are key quantities in high frequency trading.

The outline of this article is as follows. Section 2 introduces the class of multivariate integer-valued trawl processes and presents its probabilistic properties. Section 3 gives a detailed overview of parametric model specifications focusing on a variety of different cases for modelling the serial correlation. Moreover, we present relevant examples of multivariate marginal distributions which fall into the infinitely divisible framework. In particular, as pointed out by Nikoloulopoulos & Karlis (2008), the negative binomial distribution often appears to be a suitable candidate for various applications. Hence we will derive several approaches to defining a multivariate infinitely divisible distribution which allows for univariate negative binomial marginal law. In Section 4 we will derive an algorithm to simulate from MIVT processes and develop a statistical inference methodology which we will also test in a simulation study. Section 5 applies the new methodology to limit order book data. Finally, Section 6 concludes. The proofs of the theoretical results are relegated to the Appendix, Section A, and Section B provides more details on the algorithms used in the simulation study.

2 Multivariate integer-valued trawl processes

2.1 Integer-valued Lévy bases as driving noise

Throughout the paper, we denote by the underlying filtered probability space satisfying the usual conditions. Also, we choose a set () and let the corresponding Borel -algebra be denoted by . Next we define a Radon measure on , which by definition satisfies for every compact measurable set .

In the following, we will always assume that the Assumption (A1) stated below holds.

Assumption (A1)

Let for and let be a homogeneous Poisson random measure on with intensity measure , where is a Lévy measure concentrated on and satisfying

Using the Poisson random measure, we can define an integer-valued Lévy basis as follows.

Definition 1.

Suppose that is a homogeneous Poisson random measure on satisfying Assumption (A1). An -valued, homogeneous Lévy basis on is defined as


From the definition, we can immediately see that is infinitely divisible with characteristic function given by

Here, C denotes the associated cumulant function, which is the (distinguished) logarithm of the characteristic function. It can we written as

where the random vector denotes the corresponding Lévy seed with cumulant function given by


where denotes the corresponding Lévy measure defined above.

Remark 1.

It is important to note that the Lévy seed specifies the homogeneous Lévy basis uniquely, and vice versa, with any homogeneous Lévy basis we can associate a unique Lévy seed. Hence, in modelling terms, it will later be sufficient to discuss various modelling choices for the corresponding Lévy seed, since this will fully characterise the associated Lévy basis.

Remark 2.

Based on the Lévy seed, we can define a Lévy process denoted by , when setting . Clearly, in this case, we get .

Following the construction in Sato (1999, Theorem 4.3), we model the Lévy seed by an -dimensional compound Poisson random variable given by

where is an homogeneous Poisson process of rate and the form a sequence of i.i.d. random variables independent of and which have no atom in , i.e. not all components are simultaneously equal to zero, more precisely, for all .

Remark 3.

Recall that by modelling the Lévy seed by a multivariate compound Poisson process we can only allow for positive correlations between the components.

2.2 The trawls

Following the approach presented in Barndorff-Nielsen (2011), see also Barndorff-Nielsen et al. (2014), we now define the so-called trawls.

Definition 2.

We call a Borel set such that a trawl. Further, we set


The above definition implies that the trawl at time is just the shifted trawl from time .

Remark 4.

Note that the size of the trawl does not change over time, i.e. we have for all .

Clearly, there is a wide class of sets which can be considered as trawls. Throughout the paper, we will hence narrow down our focus, and will concentrate on a particular subclass of trawls which can be written as


where is a continuous function such that . Typically we refer to as the trawl function. In such a semi-parametric setting, we can easily deduce that


Moreover, the corresponding trawl at time is given by

Definition 3.

Let denote a trawl given by (4). If and is monotonically non-decreasing, then we call a monotonic trawl.

Example 1.

Let for . Then the corresponding trawl is monotonic with .

In our multivariate framework, we will choose trawls denoted by . Then we set for . When we work with trawls of the type (4), we will denote by the corresponding trawl functions.

2.3 The multivariate integer-valued trawl process and its properties

Definition 4.

The stationary multivariate integer-valued trawl (MIVT) process is defined by

where each component is given by

where denotes the indicator function.

Since the trawls have finite Lebesgue measure, the integrals above are well-defined in the sense of Rajput & Rosinski (1989).

When we define , then we can represent the MIVT process as

which shows that we are dealing with a special case of a multivariate mixed moving average process.

The law of the MIVT process is fully characterised by its characteristic function, which we shall present next.

Proposition 1.

For any , the characteristic function of is given by , where the corresponding cumulant function is given by

Corollary 1.

In the special case when , the characteristic function simplifies to .

This is an important result, which implies that to any infinitely divisible integer-valued law , say, there exists a stationary integer-valued trawl process having as its marginal law.

2.3.1 Cross-sectional and serial dependence

Let us now focus on the cross-sectional and the serial dependence of multivariate integer-valued trawl processes.

First, the cross-sectional dependence is entirely characterised through the multivariate Lévy measure . For instance, when we focus on the pair of the th and the th component for , we define the corresponding joint Lévy measure by

Then the covariance between the th and the th Lévy seed is given by

Relevant specifications of will be discussed in Section 3.2.

Second, the serial dependence is determined through the trawls. More precisely, following Barndorff-Nielsen (2011), we introduce the so-called autocorrelator between the th and the th component, which is defined as

Let us now focus on the autocorrelators for trawls of type (4).

Proposition 2.

Suppose the trawls , are of type (4). Then for the intersection of two trawls is given by

I.e. the autocorrelator satisfies

The proof is straightforward and hence omitted.

Remark 5.

Note that the autocorrelators can be computed as soon as the corresponding trawl functions and their parameters are known. We will come back to this aspect when we discuss inference for trawl processes in Section 4.2.

Let us consider a canonical example when the trawl functions are given by exponential functions.

Example 2.

Let . For suppose that . Then for we have that and hence . Hence . Similarly, we get that , for .

For monotonic trawl functions we observe that there are two possible scenarios: Either, one trawl function is always ‘below’ the other one, which implies that

see e.g. Example 2, or the trawl functions intersect each other. In the latter case, suppose there is one intersection of and at time , say. Consider the scenario when for and for . Then

Extensions to a multi-root scenario are straightforward.

Clearly, the autocorrelators are closely related to the autocorrelation function. More precisely, we have the following result, which follows directly from the expression of the cumulant function of the multivariate trawl process.

Proposition 3.

The covariance between two (possibly shifted) components for is given by

Also, the corresponding auto- and cross-correlation function is given by

i.e. the autocorrelation function is proportional to the autocorrelators.

We will come back to the above result when we turn our attention to parametric inference for MIVT processes in Section 4.2.

3 Parametric specifications

In order to showcase the flexibility of the new modelling framework, we will discuss various parametric model specifications in this section, where we start off by considering specifications of the trawl, followed by models for the multivariate Lévy seed.

3.1 Specifying the trawl function

We have already covered the case of an exponential trawl function above and will now present alternative choices for the trawl functions and their corresponding autocorrelators, see also Barndorff-Nielsen et al. (2014) for other examples.

While an exponential trawl leads to an exponentially decaying autocorrelation function, we sometimes need model specifications which exhibit a more slowly decaying autocorrelation function. Such trawl functions can be constructed from the exponential trawl function by randomising the memory parameter as we will describe in the following example.

To simplify the notation we will in the following supress the indices for the corresponding component in the multivariate construction, i.e. we set and do not write the sub-/superscripts for the corresponding parameters.

Example 3.

Define the trawl function by

for a probability measure on . Suppose that is absolutely continuous with density , then the corresponding trawl function can be written as

which again leads to a monotonic trawl function. The corresponding autocorrelation function is given by

assuming that .

Barndorff-Nielsen et al. (2014) discuss various constructions of that type depending on different choices of the probability measure and we refer to that article for more details on the computations.

In applications, we often assume that is absolutely continuous with respect to the Lebesgue measure and we denote its density by . A very flexible parametric framework can be obtained by choosing to be a generalised inverse Gaussian (GIG) density as we shall discuss in the next example.

Example 4.

Suppose that is the density of the GIG distribution, i.e.


where and and are both nonnegative and not simultaneously equal to zero. Here we denote by the modified Bessel function of the third kind. Straightforward computation show that the corresponding trawl function is given by

and the corresponding size of the trawl set equals

Moreover, the autocorrelation function is given by

Some special cases of the GIG distribution include the inverse Gaussian and the gamma distribution, which lead to interesting parametric examples which we shall study next.

Example 5.

Suppose we choose an inverse Gaussian (IG) density function for . Then we obtain the so-called sup-IG trawl function, which can be written as

for nonnegative parameters which are assumed not to be simultaneously equal to zero. Then we have that and the corresponding autocorrelation function is given by

Next, we consider an example where the trawl function decays according to a power law.

Example 6.

A long memory specification can be obtained when the probability measure is chosen to have Gamma distribution. In that case, we obtain a trawl function given by

Then . Also,

I.e. when we have a stationary long memory model, and when when we obtain a stationary short memory model.

Finally, we consider the case of a seasonal trawl function.

Example 7.

A seasonally varying trawl function can be obtained by setting , where is a monotonic trawl function and is a periodic seasonal function. E.g. as discussed in (Barndorff-Nielsen et al., 2014, Example 9), we can consider the following functional form

Here determines how quickly the function decays, whereas denotes the period of the season. In this case, we obtain and

Note that this construction leads to a seasonal autocorrelation function, but not to seasonality in the levels of the trawl process.

3.2 Modelling the cross-sectional dependence

The trawl process is completely specified, as soon as both the trawls and the marginal distribution of the multivariate Lévy seed are specified. When it comes to infinitely divisible discrete distributions, the Poisson distribution is the natural starting point and we will review multivariate extensions in Section 3.2.1. However, since many count data exhibit overdispersion, it is crucial that we go beyond the Poisson framework. In the univariate context, there have been a variety of articles on suitable discrete distributions, see e.g. Puig & Valero (2006) and Nikoloulopoulos & Karlis (2008) amongst others. However, the literature on parametric classes of multivariate infinitely divisible discrete distributions with support on is rather sparse. We know that any such distribution necessarily is of discrete compound Poisson type, see Feller (1968); Valderrama Ospina & Gerber (1987); Sundt (2000), and always has non-negatively correlated components. In Section 3.2.2 we will discuss a possible parametrisation based on Poisson mixtures of random additive-effect-type models.

3.2.1 Multivariate Poisson marginal distribution

As before, we denote by the Lévy seed. To start off with we present a multivariate Poisson law for the Lévy seed. In order to introduce dependence between the Poisson random variables, one typically uses a so-called common factor approach, which we outline in the following, see e.g. Karlis (2002); Karlis & Meligkotsidou (2005).

Suppose that we have independent random variables for , and set .

Let denote a -matrix (for ) with 0-1 entries and having no duplicate columns. We then set , which clearly follows a multivariate Poisson distribution. The corresponding mean and variance can be easily computed and are given by and , respectively, where and . Since the components are independent, we have and . The above construction implies that , where . Also, for we have that

Let us study some relevant examples within this modelling framework.

Example 8.

An -dimensional model with one common factor between all components can be obtained by choosing , and

and independent Poisson random variables , for . Then we have

Here each component has marginal Poisson distribution, i.e.  and for we have that .

Beyond the bivariate case, the example above presents a rather restrictive model for applications since it only allows for one common factor. A less sparse choice of would allow for more flexible model specifications. Let us consider a more realistic example in the trivariate case next.

Example 9.

Consider a model of the type

for independent Poisson random variables with parameters , for
. Such a model specification corresponds to the choice of

Here we have that , and .

The above example treats a very general case which allows for all possible bivariate as well as a trivariate covariation effect. A slightly simpler specification is given in the next example, which only considers pairwise interaction terms.

Example 10.


results in a trivariate model of the form

for independent Poisson random variables with parameters , for
. Then we have that , and ; also,

3.2.2 Multivariate discrete compound Poisson marginal distribution obtained from Poisson mixtures

While the Poisson distribution is a good starting point in the context of modelling count data, for many applications it might be too restrictive. In particular, often one needs to work with distributions which allow for overdispersion, i.e. that the variance is bigger than the mean.

Since we are interested in staying within the class of discrete infinitely divisible stochastic processes, the most general class of distributions we can consider are the discrete compound Poisson distributions. To this end, we model the Lévy seed by an -dimensional compound Poisson random variable, see e.g. Sato (1999, Theorem 4.3), given by

where is an homogeneous Poisson process of rate and the form a sequence of i.i.d. random variables independent of and which have no atom in , i.e. not all components are simultaneously equal to zero, more precisely, for all .

General Poisson mixtures
Previous research has clearly documented that Poisson mixture distributions provide a flexible class of distributions which are suitable for various applications, see e.g. Karlis & Xekalaki (2005) for a review.

In this section, we are going to introduce a parsimonious parametric model class for the -dimensional Lévy seed , which uses Poisson mixtures and is based on the results in Section 5 of Barndorff-Nielsen et al. (1992). To this end, consider random variables and for and assume that conditionally on the are independent and Poisson distributed with means given by the .

We then model the joint distribution of the by a so-called additive effect model as follows:

where the random variables are independent and the are nonnegative parameters.

We can easily derive the probability generating function of the joint distribution of , cf. Barndorff-Nielsen et al. (1992, Section 5):

where we denote by the moment generating function of a random variable with parameter .

Also, we can compute the means and the covariance function of the s and find that


Next we derive the joint law of , see Barndorff-Nielsen et al. (1992) for the bivariate case.

Proposition 4.

In the additive random effect model the joint law of is given by

Next, we establish the key result of this section, which links the Poisson mixture distribution based on an additive effect model to a discrete compound Poisson distribution. Recall, see e.g. Sato (1999, p. 18), that an -dimensional compound Poisson random variable has Laplace transform given by


where is the intensity of the Poisson process and is the Laplace transform of the i.i.d. jump sizes.

Proposition 5.

The Poisson mixture model of random-additive-effect type can be represented as a discrete compound Poisson distribution with rate

where and denotes the kumulant function, i.e. the logarithm of the Laplace transform, and the jump size distribution has Laplace transform given by


where and denotes the Lévy measure of and , respectively.

The above result is very important since we need the compound Poisson representation to efficiently simulate the trawl process, as we shall discuss in Section 4.1.

Multivariate negative binomial distribution
In situations where the count data are overdispersed and call for distributions other than the Poisson one, we can in principle choose from a great variety of discrete compound Poisson distributions. Motivated by our empirical study, see Section 5, and also the results in Barndorff-Nielsen et al. (2014), we investigate the case of a negative binomial marginal law in more detail since this is one of the infinitely divisible distributions which can cope with overdispersion.

Recall that we say that a random variable has negative binomial law with parameters , i.e.  if its probability mass function is given by

Its probability generating function is given by . Also, recall that a random variable is said to be gamma distributed with parameters , i.e.  if its probability density is given by , for .

Now, we set and in the Poisson mixture model. Then the probability generating function of is given by

Next we are going to describe three examples, see Barndorff-Nielsen et al. (1992, Example 5.3), which lead to negative binomial marginals. The first example, Example 11, covers the case of independent components, in the second example, Example 12, the fully dependent case is achieved through the presence of a common factor, and the third example, Example 13, combines the previous two cases by allowing for both a common (dependent) factor and additional independent components.

Example 11 (Independence case).

We set , for and choose . Then , which implies that the are independent and satisfy .

Example 12 (Dependence through common factor).

Choose and , for . Note that such a construction extends the bivariate case considered in Arbous & Kerrich (1951). Then , which implies that and also .

Example 13 (Dependence through common factor and additional independent factors).

Suppose that and . Then one can write , for and . Then we can deduce that . Hence .

Remark 6.

The dependence concepts used here can be considered as Poisson mixtures of the first kind, see Karlis & Xekalaki (2005).

We conclude this section by deriving the compound Poisson representation of the multivariate negative binomial distribution.

Example 14.

As before, let