Modelling, simulation and inference for multivariate time series of counts
Abstract
This article presents a new continuous-time modelling framework for multivariate time series of counts which have an infinitely divisible marginal distribution. The model is based on a mixed moving average process driven by Lévy noise – called a trawl process – where the serial correlation and the cross-sectional dependence are modelled independently of each other. Such processes can exhibit short or long memory. We derive a stochastic simulation algorithm and a statistical inference method for such processes. The new methodology is then applied to high frequency financial data, where we investigate the relationship between the number of limit order submissions and deletions in a limit order book.
Keywords:
Count data, continuous time modelling of multivariate time series, trawl processes, infinitely divisible, Poisson mixtures, multivariate negative binomial law, limit order book
Mathematics Subject Classification: 60G10, 60G55, 60E07, 62M10, 62P05
1 Introduction
Time series of counts can be viewed as realisations of non-negative integer-valued stochastic processes and arise in various applications in the natural, life and social sciences. As such, there has been very active research in the various fields; recent textbook treatments can be found in Cameron & Trivedi (1998); Kedem & Fokianos (2002); Winkelmann (2003); Davis et al. (2015), and we refer to Davis et al. (1999); McKenzie (2003); Ferland et al. (2006); Weiß (2008); Cui & Lund (2009); Davis & Wu (2009); Jung & Tremayne (2011) for recent surveys and some new developments in the literature.
However, most of these previous works focus on univariate time series of counts. The literature on multivariate extensions is rather sparse and deals almost exclusively with models which are formulated in discrete time and borrow ideas from traditional autoregressive time series models. For example, Franke & Rao (1995) and Latour (1997) introduced the first-order integer-valued autoregression model, which is based on the generalised Steutel and van Harn (1979) thinning operator. Recently, Boudreault & Charpentier (2011) applied such models to earthquake counts. Also, the recent handbook on discrete-valued time series by Davis et al. (2015) contains the chapter by Karlis (2015), who surveys recent developments in multivariate count time series models.
One challenge in handling multivariate time series is the modelling of the cross-sectional dependence. While for continuous distributions the theory of copulas presents a powerful toolbox, it has been pointed out by Genest & Nešlehová (2007) that a problem arises in the discrete context due to the non-uniqueness of the associated copula. This can be addressed by using the continuous extension approach by Denuit & Lambert (2005). For instance, Heinen & Rengifo (2007) introduce a multivariate time series model for counts based on copulas applied to continuously extended discrete random variables and fit the model to the numbers of trades of various assets at the New York Stock Exchange. Also, Koopman et al. (2015) study discrete copula distributions with time-varying marginals and dependence structure in financial econometrics. Motivated by the reliability literature, Lindskog & McNeil (2003) introduced the so-called common Poisson shock model to describe the arrival of insurance claims in multiple locations or losses due to credit defaults of various types of counterparty.
While the models mentioned above are interesting in their own right, the goal of this article is more ambitious since it formulates a more general modelling framework which can handle a variety of marginal distributions as well as different types of serial dependence including, in particular, both short and long memory specifications. That said, motivated by an application in financial econometrics and recognising the success the class of Lévy processes has in such settings, we focus exclusively on models whose marginal distribution is infinitely divisible. This assumption puts a restriction on the cross-sectional dependence due to the well-known result by Feller (1968), which says that a random vector with infinitely divisible distribution on $\mathbb{N}^n$ always has non-negatively correlated components. Moreover, any non-degenerate distribution on $\mathbb{N}^n$ is infinitely divisible if and only if it can be expressed as a discrete compound Poisson distribution. We will see that this is nevertheless a very rich class of distributions and suitable for our application to high frequency financial data.
The new modelling framework is based on so-called multivariate integer-valued trawl processes, which are special cases of multivariate mixed moving average processes where the driving noise is given by an integer-valued Lévy basis.
In the univariate case, trawl processes – not necessarily restricted to the integer-valued case – have been introduced by Barndorff-Nielsen (2011). Also, Noven et al. (2015) used such processes in a hierarchical model in the context of extreme value theory. The univariate integer-valued case has been developed in detail in Barndorff-Nielsen et al. (2014). Shephard & Yang (2016b) studied likelihood inference for a particular subclass of integer-valued trawl processes and, more recently, Shephard & Yang (2016a) used such processes to build an econometric model for fleeting discrete price moves. While the multivariate extension was already briefly mentioned in Barndorff-Nielsen et al. (2014), this article develops the theory of multivariate integer-valued trawl (MIVT) processes in detail, presents new methodology for stochastic simulation and statistical inference for such processes, and applies the new results to high frequency financial data from a limit order book. The key feature of MIVT processes, which makes them powerful for a wide range of applications, is the fact that the serial dependence and the marginal distribution can be modelled independently of each other, which is for instance not the case in the famous DARMA models, see Jacobs & Lewis (1978a, b). As such, we will present parsimonious ways of parameterising the serial correlation and will show that we can accommodate both short and long memory processes as well as seasonal fluctuations. Moreover, since MIVT processes are formulated in continuous time, we can handle both asynchronous and not necessarily equally spaced observations, which is particularly important in a multivariate setup.
The motivation for this study comes from high frequency financial econometrics where discrete data arise in a variety of scenarios, e.g. high frequency price moves for stocks with fixed tick size resemble step functions supported on a fixed grid. Also, the number of trades can give us an indication of market activity and is widely analysed in the industry. In this article, we will apply our new methodology to model the relationship between the number of submitted and deleted limit orders in a limit order book, which are key quantities in high frequency trading.
The outline of this article is as follows. Section 2 introduces the class of multivariate integer-valued trawl processes and presents its probabilistic properties. Section 3 gives a detailed overview of parametric model specifications focusing on a variety of different cases for modelling the serial correlation. Moreover, we present relevant examples of multivariate marginal distributions which fall into the infinitely divisible framework. In particular, as pointed out by Nikoloulopoulos & Karlis (2008), the negative binomial distribution often appears to be a suitable candidate for various applications. Hence we will derive several approaches to defining a multivariate infinitely divisible distribution which allows for univariate negative binomial marginal laws. In Section 4 we will derive an algorithm to simulate from MIVT processes and develop a statistical inference methodology which we will also test in a simulation study. Section 5 applies the new methodology to limit order book data. Finally, Section 6 concludes. The proofs of the theoretical results are relegated to the Appendix, Section A, and Section B provides more details on the algorithms used in the simulation study.
2 Multivariate integer-valued trawl processes
2.1 Integer-valued Lévy bases as driving noise
Throughout the paper, we denote by the underlying filtered probability space satisfying the usual conditions. Also, we choose a set () and let the corresponding Borel $\sigma$-algebra be denoted by . Next we define a Radon measure on , which by definition satisfies for every compact measurable set .
In the following, we will always assume that the Assumption (A1) stated below holds.
 Assumption (A1)

Let for and let be a homogeneous Poisson random measure on with intensity measure , where is a Lévy measure concentrated on and satisfying
.
Using the Poisson random measure, we can define an integervalued Lévy basis as follows.
Definition 1.
Suppose that is a homogeneous Poisson random measure on satisfying Assumption (A1). An valued, homogeneous Lévy basis on is defined as
(1) 
From the definition, we can immediately see that is infinitely divisible with characteristic function given by
Here, C denotes the associated cumulant function, which is the (distinguished) logarithm of the characteristic function. It can be written as
where the random vector denotes the corresponding Lévy seed with cumulant function given by
(2) 
where denotes the corresponding Lévy measure defined above.
Remark 1.
It is important to note that the Lévy seed specifies the homogeneous Lévy basis uniquely, and vice versa, with any homogeneous Lévy basis we can associate a unique Lévy seed. Hence, in modelling terms, it will later be sufficient to discuss various modelling choices for the corresponding Lévy seed, since this will fully characterise the associated Lévy basis.
Remark 2.
Based on the Lévy seed, we can define a Lévy process denoted by , when setting . Clearly, in this case, we get .
Following the construction in Sato (1999, Theorem 4.3), we model the Lévy seed by an $n$-dimensional compound Poisson random variable given by
where is a homogeneous Poisson process of rate and the form a sequence of i.i.d. random variables independent of and which have no atom in , i.e. not all components are simultaneously equal to zero, more precisely, for all .
Remark 3.
Recall that by modelling the Lévy seed by a multivariate compound Poisson process we can only allow for positive correlations between the components.
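As a minimal illustration of the compound Poisson construction, the following Python sketch draws a bivariate Lévy seed as a Poisson number of i.i.d. integer-valued jumps. The jump law (`JUMPS`, `PROBS`), the rate 2.0 and the function names are hypothetical choices made purely for illustration; the only structural requirement taken from the text is that the jump law puts no mass on the zero vector.

```python
import math
import random

def sample_poisson(rate, rng):
    """Draw a Poisson variate by Knuth's method (adequate for moderate rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def sample_seed(rate, jumps, probs, rng):
    """Draw one compound Poisson vector: the sum of N i.i.d. jumps, N ~ Poisson(rate)."""
    total = [0] * len(jumps[0])
    for _ in range(sample_poisson(rate, rng)):
        jump = rng.choices(jumps, weights=probs, k=1)[0]
        total = [t + j for t, j in zip(total, jump)]
    return total

# Hypothetical bivariate jump law with no mass at (0, 0).
JUMPS = [(1, 0), (0, 1), (1, 1)]
PROBS = [0.4, 0.4, 0.2]

rng = random.Random(1)
draws = [sample_seed(2.0, JUMPS, PROBS, rng) for _ in range(20000)]
mean_1 = sum(d[0] for d in draws) / len(draws)
mean_2 = sum(d[1] for d in draws) / len(draws)
# Sample covariance; in theory it equals rate * E[C_1 * C_2] = 2.0 * 0.2 here.
cov_12 = sum(d[0] * d[1] for d in draws) / len(draws) - mean_1 * mean_2
```

Since both components only ever receive non-negative jumps, the sample covariance comes out positive, in line with Remark 3.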
2.2 The trawls
Following the approach presented in Barndorff-Nielsen (2011), see also Barndorff-Nielsen et al. (2014), we now define the so-called trawls.
Definition 2.
We call a Borel set such that a trawl. Further, we set
(3) 
The above definition implies that the trawl at time is just the shifted trawl from time .
Remark 4.
Note that the size of the trawl does not change over time, i.e. we have for all .
Clearly, there is a wide class of sets which can be considered as trawls. Throughout the paper, we will hence narrow down our focus, and will concentrate on a particular subclass of trawls which can be written as
(4) 
where is a continuous function such that . Typically we refer to as the trawl function. In such a semi-parametric setting, we can easily deduce that
(5) 
Moreover, the corresponding trawl at time is given by
Definition 3.
Let denote a trawl given by (4). If and is monotonically non-decreasing, then we call a monotonic trawl.
Example 1.
Let for . Then the corresponding trawl is monotonic with .
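The size of a trawl of type (4) can be checked numerically. The sketch below assumes that the trawl is the region between the time axis and the graph of the trawl function on the negative half-line, so that its Lebesgue measure is the integral of the trawl function over $(-\infty, 0]$, and it assumes an exponential trawl function $d(x) = \exp(\lambda x)$, $x \le 0$, for which that integral equals $1/\lambda$. The names `trawl_size` and `exp_trawl` and the truncation point are illustrative assumptions.

```python
import math

def trawl_size(d, lower=-50.0, steps=200_000):
    """Approximate the integral of d over (-inf, 0] by the trapezoidal rule,
    truncating the lower limit at `lower`."""
    width = -lower / steps
    total = 0.5 * (d(lower) + d(0.0))
    for k in range(1, steps):
        total += d(lower + k * width)
    return total * width

def exp_trawl(x, lam=2.0):
    """Assumed exponential trawl function d(x) = exp(lam * x) for x <= 0."""
    return math.exp(lam * x)

size = trawl_size(exp_trawl)  # should be close to 1 / lam = 0.5
```

The same routine applies to any integrable trawl function, provided the truncation point is chosen far enough in the past for the tail to be negligible.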
In our multivariate framework, we will choose trawls denoted by . Then we set for . When we work with trawls of the type (4), we will denote by the corresponding trawl functions.
2.3 The multivariate integer-valued trawl process and its properties
Definition 4.
The stationary multivariate integer-valued trawl (MIVT) process is defined by
where each component is given by
where denotes the indicator function.
Since the trawls have finite Lebesgue measure, the integrals above are well-defined in the sense of Rajput & Rosinski (1989).
When we define , then we can represent the MIVT process as
which shows that we are dealing with a special case of a multivariate mixed moving average process.
The law of the MIVT process is fully characterised by its characteristic function, which we shall present next.
Proposition 1.
For any , the characteristic function of is given by , where the corresponding cumulant function is given by
Corollary 1.
In the special case when , the characteristic function simplifies to .
This is an important result, which implies that to any infinitely divisible integer-valued law , say, there exists a stationary integer-valued trawl process having as its marginal law.
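To make the construction concrete, here is a minimal simulation sketch for a single component. It is not the simulation algorithm derived later in the article; it assumes a Poisson Lévy seed with intensity `nu`, an exponential trawl function $d(x) = \exp(\lambda x)$, and a Poisson random measure on $[0,1] \times \mathbb{R}$, so that the process at time $t$ counts the points $(x, s)$ with $s \le t$ and $x < d(s - t)$. The burn-in window, which truncates the infinite past, and all names are illustrative assumptions.

```python
import bisect
import math
import random

def simulate_poisson_trawl(times, nu, lam, burn_in, rng):
    """Count, for each t, the Poisson points (x, s) with s <= t and x < exp(lam * (s - t))."""
    start, end = min(times) - burn_in, max(times)
    s_vals, x_vals = [], []
    s = start + rng.expovariate(nu)      # arrival times form a rate-nu Poisson process
    while s <= end:
        s_vals.append(s)
        x_vals.append(rng.random())      # heights are uniform on [0, 1]
        s += rng.expovariate(nu)
    path = []
    for t in times:
        lo = bisect.bisect_left(s_vals, t - burn_in)   # ignore points older than burn_in
        hi = bisect.bisect_right(s_vals, t)
        path.append(sum(1 for k in range(lo, hi)
                        if x_vals[k] < math.exp(lam * (s_vals[k] - t))))
    return path

nu, lam = 5.0, 1.0
times = [float(k) for k in range(1000)]
path = simulate_poisson_trawl(times, nu, lam, burn_in=15.0, rng=random.Random(7))
```

Under these assumptions the marginal law is Poisson with mean `nu / lam`, so the time average of `path` should be close to 5.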
Cross-sectional and serial dependence
Let us now focus on the cross-sectional and the serial dependence of multivariate integer-valued trawl processes.
First, the cross-sectional dependence is entirely characterised through the multivariate Lévy measure . For instance, when we focus on the pair of the $i$th and the $j$th component for , we define the corresponding joint Lévy measure by
Then the covariance between the $i$th and the $j$th Lévy seed is given by
Relevant specifications of will be discussed in Section 3.2.
Second, the serial dependence is determined through the trawls. More precisely, following Barndorff-Nielsen (2011), we introduce the so-called autocorrelator between the $i$th and the $j$th component, which is defined as
Let us now focus on the autocorrelators for trawls of type (4).
Proposition 2.
Suppose the trawls , are of type (4). Then for the intersection of two trawls is given by
I.e. the autocorrelator satisfies
The proof is straightforward and hence omitted.
Remark 5.
Note that the autocorrelators can be computed as soon as the corresponding trawl functions and their parameters are known. We will come back to this aspect when we discuss inference for trawl processes in Section 4.2.
Let us consider a canonical example when the trawl functions are given by exponential functions.
Example 2.
Let . For suppose that . Then for we have that and hence . Hence . Similarly, we get that , for .
For monotonic trawl functions we observe that there are two possible scenarios: Either, one trawl function is always ‘below’ the other one, which implies that
see e.g. Example 2, or the trawl functions intersect each other. In the latter case, suppose there is one intersection of and at time , say. Consider the scenario when for and for . Then
Extensions to a multi-root scenario are straightforward.
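Both scenarios can be handled by one numerical routine. The sketch below assumes that, for trawls of type (4) and a lag $h \ge 0$, the autocorrelator equals the integral over $s \le 0$ of $\min\{d_i(s), d_j(s-h)\}$, where $d_i$ and $d_j$ denote the two trawl functions. Taking the pointwise minimum means that any number of intersections is covered automatically. The rates 0.5 and 2.0, the lag 0.7 and the function name are illustrative.

```python
import math

def autocorrelator(d_i, d_j, h, lower=-60.0, steps=120_000):
    """Trapezoidal approximation of the integral over s <= 0 of min(d_i(s), d_j(s - h))."""
    width = -lower / steps

    def overlap(s):
        return min(d_i(s), d_j(s - h))

    total = 0.5 * (overlap(lower) + overlap(0.0))
    for k in range(1, steps):
        total += overlap(lower + k * width)
    return total * width

lam_i, lam_j, h = 0.5, 2.0, 0.7

def d_i(x):
    return math.exp(lam_i * x)

def d_j(x):
    return math.exp(lam_j * x)

r_ij = autocorrelator(d_i, d_j, h)  # shifted d_j stays below d_i: exp(-lam_j * h) / lam_j
r_ji = autocorrelator(d_j, d_i, h)  # the two curves cross once at s = -lam_i * h / (lam_j - lam_i)
```

In the first case the result agrees with the closed form for exponential trawls; in the second, the integral splits at the crossing point into two exponential pieces.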
Clearly, the autocorrelators are closely related to the autocorrelation function. More precisely, we have the following result, which follows directly from the expression of the cumulant function of the multivariate trawl process.
Proposition 3.
The covariance between two (possibly shifted) components for is given by
Also, the corresponding auto- and cross-correlation function is given by
i.e. the autocorrelation function is proportional to the autocorrelators.
We will come back to the above result when we turn our attention to parametric inference for MIVT processes in Section 4.2.
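The proportionality between correlations and autocorrelators can be illustrated as follows. The sketch assumes that the covariance between component $i$ at time $t$ and component $j$ at time $t+h$ is the covariance of the corresponding Lévy seed components multiplied by the autocorrelator, and that each variance is the seed variance multiplied by the size of the respective trawl. It further assumes exponential trawls with rates ordered as $\lambda_i \le \lambda_j$, for which the autocorrelator is $\exp(-\lambda_j h)/\lambda_j$. The seed covariance matrix and all names are hypothetical.

```python
import math

def exp_autocorrelator(lam_i, lam_j, h):
    """Autocorrelator of two exponential trawls for h >= 0, valid when lam_i <= lam_j."""
    if lam_i > lam_j or h < 0:
        raise ValueError("closed form requires lam_i <= lam_j and h >= 0")
    return math.exp(-lam_j * h) / lam_j

def cross_correlation(seed_cov, lams, i, j, h):
    """Correlation between component i at time t and component j at time t + h."""
    covariance = seed_cov[i][j] * exp_autocorrelator(lams[i], lams[j], h)
    var_i = seed_cov[i][i] / lams[i]   # seed variance times trawl size 1 / lam_i
    var_j = seed_cov[j][j] / lams[j]
    return covariance / math.sqrt(var_i * var_j)

seed_cov = [[1.0, 0.5], [0.5, 1.0]]   # hypothetical covariance matrix of the Lévy seed
lams = [0.5, 2.0]
acf_first = cross_correlation(seed_cov, lams, 0, 0, 1.3)   # reduces to exp(-0.5 * 1.3)
ccf_zero = cross_correlation(seed_cov, lams, 0, 1, 0.0)
```

For $i = j$ the expression collapses to the familiar exponential autocorrelation function, while for $i \ne j$ the contemporaneous correlation is damped by the mismatch between the two trawl sizes.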
3 Parametric specifications
In order to showcase the flexibility of the new modelling framework, we will discuss various parametric model specifications in this section, where we start off by considering specifications of the trawl, followed by models for the multivariate Lévy seed.
3.1 Specifying the trawl function
We have already covered the case of an exponential trawl function above and will now present alternative choices for the trawl functions and their corresponding autocorrelators, see also Barndorff-Nielsen et al. (2014) for other examples.
While an exponential trawl leads to an exponentially decaying autocorrelation function, we sometimes need model specifications which exhibit a more slowly decaying autocorrelation function. Such trawl functions can be constructed from the exponential trawl function by randomising the memory parameter as we will describe in the following example.
To simplify the notation we will in the following suppress the indices for the corresponding component in the multivariate construction, i.e. we set and do not write the sub/superscripts for the corresponding parameters.
Example 3.
Define the trawl function by
for a probability measure on . Suppose that is absolutely continuous with density , then the corresponding trawl function can be written as
which again leads to a monotonic trawl function. The corresponding autocorrelation function is given by
assuming that .
Barndorff-Nielsen et al. (2014) discuss various constructions of that type depending on different choices of the probability measure and we refer to that article for more details on the computations.
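A simple instance of this randomisation uses a mixing measure with finitely many atoms. The sketch below assumes that the trawl function is the mixture $d(x) = \sum_k w_k \exp(\lambda_k x)$ for $x \le 0$, with weights $w_k$ summing to one, in which case the autocorrelation function is the ratio of $\sum_k (w_k/\lambda_k)\exp(-\lambda_k h)$ to $\sum_k w_k/\lambda_k$. The two-atom measure and the function names are illustrative.

```python
import math

def sup_trawl(x, rates, weights):
    """Trawl function obtained by mixing exponential trawls over a discrete measure."""
    return sum(w * math.exp(lam * x) for lam, w in zip(rates, weights))

def sup_acf(h, rates, weights):
    """Autocorrelation function implied by the mixed trawl function, for h >= 0."""
    num = sum(w / lam * math.exp(-lam * h) for lam, w in zip(rates, weights))
    den = sum(w / lam for lam, w in zip(rates, weights))
    return num / den

rates, weights = [0.1, 2.0], [0.5, 0.5]   # hypothetical two-atom mixing measure
acf_mixture = sup_acf(5.0, rates, weights)
acf_single = sup_acf(5.0, [2.0], [1.0])   # one atom recovers exp(-2 * 5)
```

The slowly decaying atom dominates at long lags, which is the mechanism by which such superpositions produce more persistent autocorrelation than a single exponential trawl.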
In applications, we often assume that is absolutely continuous with respect to the Lebesgue measure and we denote its density by . A very flexible parametric framework can be obtained by choosing to be a generalised inverse Gaussian (GIG) density as we shall discuss in the next example.
Example 4.
Suppose that is the density of the GIG distribution, i.e.
(6) 
where and and are both non-negative and not simultaneously equal to zero. Here we denote by the modified Bessel function of the third kind. A straightforward computation shows that the corresponding trawl function is given by
and the corresponding size of the trawl set equals
Moreover, the autocorrelation function is given by
Some special cases of the GIG distribution include the inverse Gaussian and the gamma distribution, which lead to interesting parametric examples which we shall study next.
Example 5.
Suppose we choose an inverse Gaussian (IG) density function for . Then we obtain the so-called sup-IG trawl function, which can be written as
for non-negative parameters which are assumed not to be simultaneously equal to zero. Then we have that and the corresponding autocorrelation function is given by
Next, we consider an example where the trawl function decays according to a power law.
Example 6.
A long memory specification can be obtained when the probability measure is chosen to be a gamma distribution. In that case, we obtain a trawl function given by
Then . Also,
I.e. when we have a stationary long memory model, and when we obtain a stationary short memory model.
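The distinction between the two regimes can be seen numerically. The sketch assumes the parametrisation $d(x) = (1 - x/\alpha)^{-H}$ for $x \le 0$ with $\alpha > 0$ and $H > 1$, which arises from mixing exponential trawls over a gamma law with shape $H$ and rate $\alpha$. Under this assumption the trawl has size $\alpha/(H-1)$ and the autocorrelation function is $(1 + h/\alpha)^{1-H}$, which is integrable only for $H > 2$. The function names are illustrative.

```python
import math

def lm_size(alpha, H):
    """Size of the trawl under the assumed parametrisation (requires H > 1)."""
    return alpha / (H - 1.0)

def lm_acf(h, alpha, H):
    """Autocorrelation function (1 + h / alpha) ** (1 - H) for h >= 0."""
    return (1.0 + h / alpha) ** (1.0 - H)

def acf_integral(alpha, H, upper):
    """Closed-form integral of lm_acf over [0, upper]."""
    if H == 2.0:
        return alpha * math.log(1.0 + upper / alpha)
    return alpha / (2.0 - H) * ((1.0 + upper / alpha) ** (2.0 - H) - 1.0)

long_memory = acf_integral(1.0, 1.5, 1e6)    # keeps growing with the upper limit
short_memory = acf_integral(1.0, 3.0, 1e6)   # bounded by alpha / (H - 2)
```

With $H = 1.5$ the integrated autocorrelation grows like the square root of the horizon, whereas with $H = 3$ it stays below $\alpha$, matching the long and short memory cases respectively.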
Finally, we consider the case of a seasonal trawl function.
Example 7.
A seasonally varying trawl function can be obtained by setting , where is a monotonic trawl function and is a periodic seasonal function. E.g. as discussed in Barndorff-Nielsen et al. (2014, Example 9), we can consider the following functional form
Here determines how quickly the function decays, whereas denotes the period of the season. In this case, we obtain and
Note that this construction leads to a seasonal autocorrelation function, but not to seasonality in the levels of the trawl process.
3.2 Modelling the cross-sectional dependence
The trawl process is completely specified as soon as both the trawls and the marginal distribution of the multivariate Lévy seed are specified. When it comes to infinitely divisible discrete distributions, the Poisson distribution is the natural starting point and we will review multivariate extensions in Section 3.2.1. However, since many count data exhibit overdispersion, it is crucial that we go beyond the Poisson framework. In the univariate context, there have been a variety of articles on suitable discrete distributions, see e.g. Puig & Valero (2006) and Nikoloulopoulos & Karlis (2008) amongst others. However, the literature on parametric classes of multivariate infinitely divisible discrete distributions with support on $\mathbb{N}^n$ is rather sparse. We know that any such distribution necessarily is of discrete compound Poisson type, see Feller (1968); Valderrama Ospina & Gerber (1987); Sundt (2000), and always has non-negatively correlated components. In Section 3.2.2 we will discuss a possible parametrisation based on Poisson mixtures of random-additive-effect-type models.
Multivariate Poisson marginal distribution
As before, we denote by the Lévy seed. To start off with we present a multivariate Poisson law for the Lévy seed. In order to introduce dependence between the Poisson random variables, one typically uses a so-called common factor approach, which we outline in the following, see e.g. Karlis (2002); Karlis & Meligkotsidou (2005).
Suppose that we have independent random variables for , and set .
Let denote a matrix (for ) with 0-1 entries and having no duplicate columns. We then set , which clearly follows a multivariate Poisson distribution. The corresponding mean and variance can be easily computed and are given by and , respectively, where and . Since the components are independent, we have and . The above construction implies that , where . Also, for we have that
Let us study some relevant examples within this modelling framework.
Example 8.
An $n$-dimensional model with one common factor between all components can be obtained by choosing , and
and independent Poisson random variables , for . Then we have
Here each component has marginal Poisson distribution, i.e. and for we have that .
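The common factor construction reduces to elementary matrix algebra. The sketch below assumes that the seed is obtained by multiplying a 0-1 matrix with a vector of independent Poisson variables, so that the mean vector is the matrix applied to the vector of rates and the covariance matrix is the matrix times the diagonal matrix of rates times its transpose. The column ordering, with the common factor placed last, and the rates are illustrative assumptions and need not match the ordering used in the example above.

```python
def one_common_factor_matrix(n):
    """0-1 matrix [I_n | 1]: n idiosyncratic columns followed by one common column."""
    return [[1 if col == row else 0 for col in range(n)] + [1] for row in range(n)]

def poisson_moments(matrix, rates):
    """Mean vector and covariance matrix of matrix @ Y for independent Y_k ~ Poisson(rates[k])."""
    mean = [sum(a * lam for a, lam in zip(row, rates)) for row in matrix]
    cov = [[sum(a * b * lam for a, b, lam in zip(row_i, row_j, rates))
            for row_j in matrix] for row_i in matrix]
    return mean, cov

A = one_common_factor_matrix(3)
rates = [1.0, 2.0, 3.0, 0.5]   # hypothetical rates; the last entry belongs to the common factor
mean, cov = poisson_moments(A, rates)
```

Every off-diagonal covariance equals the rate of the common factor, which makes explicit why a single common factor is restrictive beyond the bivariate case. Replacing `A` by a matrix with additional pairwise columns gives component-specific covariances.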
Beyond the bivariate case, the example above presents a rather restrictive model for applications since it only allows for one common factor. A less sparse choice of would allow for more flexible model specifications. Let us consider a more realistic example in the trivariate case next.
Example 9.
Consider a model of the type
for independent Poisson random variables with parameters , for
.
Such a model specification corresponds to the choice of
Here we have that , and .
The above example treats a very general case which allows for all possible bivariate as well as a trivariate covariation effect. A slightly simpler specification is given in the next example, which only considers pairwise interaction terms.
Example 10.
Choosing
results in a trivariate model of the form
for independent Poisson random variables with parameters , for
.
Then we have that ,
and
; also,
Multivariate discrete compound Poisson marginal distribution obtained from Poisson mixtures
While the Poisson distribution is a good starting point in the context of modelling count data, for many applications it might be too restrictive. In particular, often one needs to work with distributions which allow for overdispersion, i.e. that the variance is bigger than the mean.
Since we are interested in staying within the class of discrete infinitely divisible stochastic processes, the most general class of distributions we can consider are the discrete compound Poisson distributions. To this end, we model the Lévy seed by an $n$-dimensional compound Poisson random variable, see e.g. Sato (1999, Theorem 4.3), given by
where is a homogeneous Poisson process of rate and the form a sequence of i.i.d. random variables independent of and which have no atom in , i.e. not all components are simultaneously equal to zero, more precisely, for all .
General Poisson mixtures
Previous research has clearly documented that Poisson mixture distributions provide a flexible class of distributions which are suitable for various applications, see e.g. Karlis & Xekalaki (2005) for a review.
In this section, we are going to introduce a parsimonious parametric model class for the $n$-dimensional Lévy seed , which uses Poisson mixtures and is based on the results in Section 5 of Barndorff-Nielsen et al. (1992). To this end, consider random variables and for and assume that conditionally on the are independent and Poisson distributed with means given by the .
We then model the joint distribution of the by a so-called additive effect model as follows:
where the random variables are independent and the are non-negative parameters.
We can easily derive the probability generating function of the joint distribution of , cf. Barndorff-Nielsen et al. (1992, Section 5):
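The additive effect model can be simulated directly. The sketch below assumes that the random Poisson mean of component $i$ has the form $\alpha_i U + V_i$, with a common effect $U$ and idiosyncratic effects $V_i$ that are all independent, and it takes unit-scale gamma laws for these effects purely for illustration. All names and parameter values are hypothetical.

```python
import math
import random

def sample_poisson(rate, rng):
    """Draw a Poisson variate by Knuth's method (adequate for moderate rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def sample_counts(alphas, u_shape, v_shapes, rng):
    """Draw U and the V_i, form the random means, then draw conditionally Poisson counts."""
    u = rng.gammavariate(u_shape, 1.0)
    means = [a * u + rng.gammavariate(k, 1.0) for a, k in zip(alphas, v_shapes)]
    return [sample_poisson(m, rng) for m in means]

def mixture_moments(alphas, u_shape, v_shapes):
    """Theoretical means and covariances when all effects are unit-scale gamma."""
    n = len(alphas)
    means = [a * u_shape + k for a, k in zip(alphas, v_shapes)]
    cov = [[alphas[i] * alphas[j] * u_shape for j in range(n)] for i in range(n)]
    for i in range(n):
        cov[i][i] += means[i] + v_shapes[i]   # Poisson noise plus Var(V_i)
    return means, cov

alphas, u_shape, v_shapes = [1.0, 0.5], 2.0, [1.0, 1.0]
rng = random.Random(3)
draws = [sample_counts(alphas, u_shape, v_shapes, rng) for _ in range(20000)]
means, cov = mixture_moments(alphas, u_shape, v_shapes)
m0 = sum(d[0] for d in draws) / len(draws)
m1 = sum(d[1] for d in draws) / len(draws)
sample_cov = sum(d[0] * d[1] for d in draws) / len(draws) - m0 * m1
```

The cross-covariance is driven entirely by the common effect, while each variance exceeds the corresponding mean, so the components are both positively correlated and overdispersed.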
where we denote by the moment generating function of a random variable with parameter .
Also, we can compute the means and the covariance function of the s and find that
and
Next we derive the joint law of , see Barndorff-Nielsen et al. (1992) for the bivariate case.
Proposition 4.
In the additive random effect model the joint law of is given by
Next, we establish the key result of this section, which links the Poisson mixture distribution based on an additive effect model to a discrete compound Poisson distribution. Recall, see e.g. Sato (1999, p. 18), that an $n$-dimensional compound Poisson random variable has Laplace transform given by
(7) 
where is the intensity of the Poisson process and is the Laplace transform of the i.i.d. jump sizes.
Proposition 5.
The Poisson mixture model of random-additive-effect type can be represented as a discrete compound Poisson distribution with rate
where and denotes the cumulant function, i.e. the logarithm of the Laplace transform, and the jump size distribution has Laplace transform given by
where
where and denote the Lévy measures of and , respectively.
The above result is very important since we need the compound Poisson representation to efficiently simulate the trawl process, as we shall discuss in Section 4.1.
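A familiar univariate special case shows what such a representation looks like in practice. The sketch below is not the multivariate formula of Proposition 5; it uses the classical fact that a negative binomial law with probability mass function $\frac{\Gamma(\kappa + x)}{x!\,\Gamma(\kappa)} (1-p)^{\kappa} p^{x}$ is compound Poisson with rate $-\kappa \log(1-p)$ and logarithmic jump sizes. The parametrisation $(\kappa, p)$ is an assumption made for this illustration. The compound Poisson probabilities are computed by Panjer's recursion and compared with the negative binomial ones.

```python
import math

def nb_pmf(x, kappa, p):
    """Negative binomial pmf in the assumed (kappa, p) parametrisation."""
    log_pmf = (math.lgamma(kappa + x) - math.lgamma(x + 1) - math.lgamma(kappa)
               + kappa * math.log(1.0 - p) + x * math.log(p))
    return math.exp(log_pmf)

def compound_poisson_pmf(rate, jump_pmf, x_max):
    """Panjer recursion for a compound Poisson law with jumps on {1, 2, ...}."""
    g = [math.exp(-rate)]
    for x in range(1, x_max + 1):
        g.append(rate / x * sum(k * jump_pmf(k) * g[x - k] for k in range(1, x + 1)))
    return g

kappa, p = 1.7, 0.4
rate = -kappa * math.log(1.0 - p)

def logarithmic_pmf(k):
    """Logarithmic series distribution with parameter p."""
    return -p ** k / (k * math.log(1.0 - p))

cp = compound_poisson_pmf(rate, logarithmic_pmf, 30)
max_gap = max(abs(cp[x] - nb_pmf(x, kappa, p)) for x in range(31))
```

The two sets of probabilities agree up to floating-point error. For simulation purposes this means negative binomial seeds can be generated as a Poisson number of logarithmic jumps.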
Multivariate negative binomial distribution
In situations where the count data are overdispersed and call for distributions other than the Poisson one, we can in principle choose from a great variety of discrete compound Poisson distributions.
Motivated by our empirical study, see Section 5, and also the results in Barndorff-Nielsen et al. (2014), we investigate the case of a negative binomial marginal law in more detail since this is one of the infinitely divisible distributions which can cope with overdispersion.
Recall that we say that a random variable has negative binomial law with parameters , i.e. if its probability mass function is given by
Its probability generating function is given by . Also, recall that a random variable is said to be gamma distributed with parameters , i.e. if its probability density is given by , for .
Now, we set and in the Poisson mixture model. Then the probability generating function of is given by
Next we are going to describe three examples, see Barndorff-Nielsen et al. (1992, Example 5.3), which lead to negative binomial marginals. The first example, Example 11, covers the case of independent components, in the second example, Example 12, the fully dependent case is achieved through the presence of a common factor, and the third example, Example 13, combines the previous two cases by allowing for both a common (dependent) factor and additional independent components.
Example 11 (Independence case).
We set , for and choose . Then , which implies that the are independent and satisfy .
Example 12 (Dependence through common factor).
Choose and , for . Note that such a construction extends the bivariate case considered in Arbous & Kerrich (1951). Then , which implies that and also .
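The link between a gamma-distributed common factor and negative binomial marginals can be verified numerically. The sketch assumes that, conditionally on a common factor $U$ with a gamma law of shape $\kappa$ and rate $\beta$, a component is Poisson with mean $\alpha U$. Integrating out $U$ then gives a negative binomial law with parameters $\kappa$ and $p = \alpha/(\alpha + \beta)$ in the parametrisation with probability mass function $\frac{\Gamma(\kappa + x)}{x!\,\Gamma(\kappa)} (1-p)^{\kappa} p^{x}$. These parametrisations, the parameter values and the function names are assumptions made for the illustration.

```python
import math

def nb_pmf(x, kappa, p):
    """Negative binomial pmf in the assumed (kappa, p) parametrisation."""
    log_pmf = (math.lgamma(kappa + x) - math.lgamma(x + 1) - math.lgamma(kappa)
               + kappa * math.log(1.0 - p) + x * math.log(p))
    return math.exp(log_pmf)

def poisson_gamma_mixture_pmf(x, alpha, kappa, beta, upper=80.0, steps=80_000):
    """Integrate the Poisson(alpha * u) pmf at x against the Gamma(kappa, rate beta) density.
    The integrand is set to zero at u = 0, which is its limit when x + kappa > 1."""
    def integrand(u):
        if u == 0.0:
            return 0.0
        log_poisson = -alpha * u + x * math.log(alpha * u) - math.lgamma(x + 1)
        log_gamma = (kappa * math.log(beta) + (kappa - 1.0) * math.log(u)
                     - beta * u - math.lgamma(kappa))
        return math.exp(log_poisson + log_gamma)

    width = upper / steps
    total = 0.5 * (integrand(0.0) + integrand(upper))
    total += sum(integrand(k * width) for k in range(1, steps))
    return total * width

alpha, kappa, beta = 0.8, 2.5, 1.5
p = alpha / (alpha + beta)
gaps = [abs(poisson_gamma_mixture_pmf(x, alpha, kappa, beta) - nb_pmf(x, kappa, p))
        for x in range(5)]
```

Because every component shares the same $U$, two components with loadings $\alpha_i$ and $\alpha_j$ have covariance $\alpha_i \alpha_j \kappa / \beta^2$ under these assumptions, which is strictly positive.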
Example 13 (Dependence through common factor and additional independent factors).
Suppose that and . Then one can write