Asymptotic results for random coefficient bifurcating autoregressive processes

Vassili Blandin
Université Bordeaux 1, Institut de Mathématiques de Bordeaux, UMR CNRS 5251, and INRIA Bordeaux, team ALEA, 351 cours de la libération, 33405 Talence cedex, France.
vassili.blandin@math.u-bordeaux1.fr

Abstract.

The purpose of this paper is to study the asymptotic behavior of the weighted least squares estimators of the unknown parameters of random coefficient bifurcating autoregressive processes. Under suitable assumptions on the immigration and the inheritance, we establish the almost sure convergence of our estimators, as well as a quadratic strong law and central limit theorems. Our study mostly relies on limit theorems for vector-valued martingales.

Key words and phrases:
bifurcating autoregressive process; random coefficient; weighted least squares; martingale; almost sure convergence; central limit theorem
2010 Mathematics Subject Classification:
Primary 60F15; Secondary 60F05, 60G42

1. Introduction

In this paper, we study random coefficient bifurcating autoregressive processes (RCBAR). These processes are an adaptation of random coefficient autoregressive processes (RCAR) to binary tree structured data. They can also be seen as the combination of RCAR processes and bifurcating autoregressive processes (BAR). RCAR processes were first studied by Nicholls and Quinn [18, 19] while BAR processes were first investigated by Cowan and Staudte [5]. Both inherited and environmental effects are taken into consideration in RCBAR processes in order to explain the evolution of the characteristic under study. The binary tree structure naturally suggests cell division as an example.

More precisely, the first-order RCBAR process is defined as follows. The initial cell is labelled $1$ and the offspring of the cell labelled $n$ are labelled $2n$ and $2n+1$. Denote by $X_n$ the characteristic of individual $n$. Then, the first-order RCBAR process is given, for all $n \geq 1$, by

$$\begin{cases} X_{2n} = a_n X_n + \varepsilon_{2n}, \\ X_{2n+1} = b_n X_n + \varepsilon_{2n+1}. \end{cases}$$

The environmental effect is given by the driven noise sequence $(\varepsilon_{2n}, \varepsilon_{2n+1})$ while the inherited effect is given by the random coefficient sequence $(a_n, b_n)$. The cell division example leads us to consider that $\varepsilon_{2n}$ and $\varepsilon_{2n+1}$ are correlated since the environmental effect on two sister cells can reasonably be seen as correlated.

This study is inspired by experiments on the single-celled organism Escherichia coli, see Stewart et al. [21] or Guyon et al. [10], which reproduces by dividing itself into two poles, one being called the new pole, the other being called the old pole. Experimental data seem to show that some variables among cell lines, such as the life span of the cells, do not evolve in the same way for the new pole and for the old pole. This difference in evolution leads us to consider an asymmetric RCBAR. Considering an RCBAR process instead of a BAR process allows us to assume that the inherited effect is no longer deterministic, as randomness often appears in nature. Moreover, we can consider both deterministic and random inherited effects since we also allow the random variables modeling the inherited effect to be deterministic, making this study usable for RCBAR as well as BAR processes.

This paper, which is an adaptation of [4] to RCBAR processes, studies the asymptotic behavior of the weighted least squares (WLS) estimators of first-order RCBAR processes using a martingale approach. This martingale approach was first proposed by Bercu et al. [3] and de Saporta et al. [6] for BAR processes. The WLS estimation of the parameters of branching processes was previously investigated by Wei and Winnicki [24] and Winnicki [25]. We will repeatedly use the strong law of large numbers [8] as well as the central limit theorem [8, 11] for martingales in order to investigate the asymptotic behavior of the WLS estimators. Those theorems have been previously used by Basawa and Zhou [2, 26, 27].

Several approaches have appeared for BAR processes, and we tried not to set aside any of them. Thus, we took into account the classical BAR studies as seen in Huggins and Basawa [13, 14] and Huggins and Staudte [15], who studied the evolution of cell diameters and lifetimes, and also the bifurcating Markov chain model introduced by Guyon [9] and used in Delmas and Marsalle [7]. We also considered the analogy with the Galton-Watson processes as studied in Delmas and Marsalle [7] and Heyde and Seneta [12]. Several methods have also been used for parameter estimation in RCAR processes. Koul and Schick [17] used an M-estimator while Aue et al. [1] preferred a quasi-maximum likelihood approach. Schick [20] introduced a new class of estimators that Vanecek [22] used in his work. Hwang et al. [16] also tackled the critical case where the environmental effect follows a Rademacher distribution.

The paper is organized as follows. Section 2 describes the model more precisely, and Section 3 introduces the WLS estimators of the unknown parameters. Section 4 presents the martingale point of view adopted in this paper. The main results are collected in Section 5; they concern the asymptotic behavior of our WLS estimators, namely their almost sure convergence, the quadratic strong law and their asymptotic normality. Finally, the remaining sections gather the proofs of our main results, except the last section, which illustrates our results with a small simulation study.

2. Random coefficient bifurcating autoregressive processes

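Consider the first-order RCBAR process given, for all $n \geq 1$, by

$$\begin{cases} X_{2n} = a_n X_n + \varepsilon_{2n}, \\ X_{2n+1} = b_n X_n + \varepsilon_{2n+1}, \end{cases} \tag{2.1}$$

where the initial state $X_1$ is the ancestor of the process and $(\varepsilon_{2n}, \varepsilon_{2n+1})$ stands for the driven noise of the process. In all the sequel, we shall assume that . We also assume that both $(a_n, b_n)$ and $(\varepsilon_{2n}, \varepsilon_{2n+1})$ are i.i.d., and that those two sequences are independent. One can see the RCBAR process given by (2.1) as a first-order random coefficient autoregressive process on a binary tree, where each node represents an individual, node 1 being the original ancestor. For all $n \geq 0$, denote the $n$-th generation by $\mathbb{G}_n = \{2^n, 2^n + 1, \ldots, 2^{n+1} - 1\}$. In particular, $\mathbb{G}_0 = \{1\}$ is the initial generation and $\mathbb{G}_1 = \{2, 3\}$ is the first generation of offspring from the first ancestor. Recall that the two offspring of individual $n$ are labelled $2n$ and $2n+1$, or conversely, the mother of individual $n$ is $[n/2]$ where $[x]$ stands for the largest integer less than or equal to $x$. Finally denote by

$$\mathbb{T}_n = \bigcup_{k=0}^{n} \mathbb{G}_k$$

the sub-tree of all individuals from the original individual up to the $n$-th generation. One can observe that the cardinality $|\mathbb{G}_n|$ of $\mathbb{G}_n$ is $2^n$ while that of $\mathbb{T}_n$ is $|\mathbb{T}_n| = 2^{n+1} - 1$.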
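The labelling conventions above can be checked with a few lines of Python. This is only an illustrative sketch: the function names are ours and do not come from the paper. It assumes the root is labelled 1, the offspring of individual k are 2k and 2k+1, and the n-th generation collects the labels 2^n, ..., 2^(n+1) - 1.

```python
def generation(k: int) -> int:
    """Generation index of individual k >= 1 (the root, labelled 1, is generation 0)."""
    if k < 1:
        raise ValueError("individuals are labelled from 1")
    return k.bit_length() - 1


def mother(k: int) -> int:
    """Label of the mother of individual k >= 2, i.e. the integer part of k / 2."""
    if k < 2:
        raise ValueError("the root has no mother")
    return k // 2


def offspring(k: int) -> tuple:
    """Labels of the two offspring of individual k."""
    return 2 * k, 2 * k + 1


def generation_labels(n: int) -> range:
    """Labels of the n-th generation: 2^n, ..., 2^(n+1) - 1."""
    return range(2 ** n, 2 ** (n + 1))


def subtree_labels(n: int) -> range:
    """Labels of all individuals up to the n-th generation: 1, ..., 2^(n+1) - 1."""
    return range(1, 2 ** (n + 1))
```

For instance, `len(generation_labels(n))` equals 2^n and `len(subtree_labels(n))` equals 2^(n+1) - 1, in agreement with the cardinalities stated above.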

Figure 1. The tree associated with the RCBAR

3. Weighted least-squares estimation

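Denote by $\mathbb{F} = (\mathcal{F}_n)$ the natural filtration associated with the first-order RCBAR process, which means that $\mathcal{F}_n$ is the $\sigma$-algebra generated by all individuals up to the $n$-th generation, in other words $\mathcal{F}_n = \sigma\{X_k, k \in \mathbb{T}_n\}$. We will assume in all the sequel that, for all $n \geq 0$ and for all $k \in \mathbb{G}_n$,

$$\mathbb{E}[a_k \mid \mathcal{F}_n] = a, \quad \mathbb{E}[b_k \mid \mathcal{F}_n] = b, \quad \mathbb{E}[\varepsilon_{2k} \mid \mathcal{F}_n] = c, \quad \mathbb{E}[\varepsilon_{2k+1} \mid \mathcal{F}_n] = d \quad \text{a.s.} \tag{3.1}$$

Consequently, we deduce from (2.1) and (3.1) that, for all $n \geq 0$ and for all $k \in \mathbb{G}_n$,

$$\begin{cases} X_{2k} = a X_k + c + V_{2k}, \\ X_{2k+1} = b X_k + d + V_{2k+1}, \end{cases} \tag{3.2}$$

where $V_{2k} = X_{2k} - \mathbb{E}[X_{2k} \mid \mathcal{F}_n]$ and $V_{2k+1} = X_{2k+1} - \mathbb{E}[X_{2k+1} \mid \mathcal{F}_n]$. Therefore, the two relations given by (3.2) can be rewritten in a classic autoregressive form

$$\chi_n = \theta^t \Phi_n + W_n \tag{3.3}$$

where

$$\chi_n = \begin{pmatrix} X_{2n} \\ X_{2n+1} \end{pmatrix}, \qquad \Phi_n = \begin{pmatrix} X_n \\ 1 \end{pmatrix}, \qquad W_n = \begin{pmatrix} V_{2n} \\ V_{2n+1} \end{pmatrix},$$

and the matrix parameter

$$\theta = \begin{pmatrix} a & b \\ c & d \end{pmatrix}.$$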
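Our goal is to estimate $\theta$ from the observation of all individuals up to $\mathbb{T}_n$. We propose to make use of the WLS estimator $\widehat{\theta}_n$ of $\theta$ which minimizes

$$\Delta_n(\theta) = \frac{1}{2} \sum_{k \in \mathbb{T}_{n-1}} \frac{1}{c_k} \left\| \chi_k - \theta^t \Phi_k \right\|^2$$

where the choice of the weighting sequence $(c_n)_{n \geq 1}$ is crucial. We shall choose $c_n = 1 + X_n^2$ and we will go back to this suitable choice in Section 4. Consequently, we obviously have for all $n \geq 1$

$$\widehat{\theta}_n = S_{n-1}^{-1} \sum_{k \in \mathbb{T}_{n-1}} \frac{1}{c_k} \Phi_k \chi_k^t, \qquad \text{with} \quad S_n = \sum_{k \in \mathbb{T}_n} \frac{1}{c_k} \Phi_k \Phi_k^t. \tag{3.4}$$

In order to avoid a useless invertibility assumption, we shall assume, without loss of generality, that for all $n \geq 0$, $S_n$ is invertible. Otherwise, we only have to add the identity matrix of order 2, $I_2$, to $S_n$. In all that follows, we shall make a slight abuse of notation by identifying $\theta$ as well as $\widehat{\theta}_n$ with

$$\mathrm{vec}(\theta) = \begin{pmatrix} a \\ c \\ b \\ d \end{pmatrix} \qquad \text{and} \qquad \mathrm{vec}(\widehat{\theta}_n) = \begin{pmatrix} \widehat{a}_n \\ \widehat{c}_n \\ \widehat{b}_n \\ \widehat{d}_n \end{pmatrix}.$$

Therefore, we deduce from (3.4) that

$$\widehat{\theta}_n = \Sigma_{n-1}^{-1} \sum_{k \in \mathbb{T}_{n-1}} \frac{1}{c_k} \mathrm{vec}\left( \Phi_k \chi_k^t \right)$$

where $\Sigma_n = I_2 \otimes S_n$ and $\otimes$ stands for the standard Kronecker product. Consequently, (3.3) yields

$$\widehat{\theta}_n - \theta = \Sigma_{n-1}^{-1} \sum_{k \in \mathbb{T}_{n-1}} \frac{1}{c_k} \mathrm{vec}\left( \Phi_k W_k^t \right). \tag{3.5}$$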
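To make the estimation step concrete, here is a minimal Python sketch of a simulation followed by the weighted least squares computation. It rests on several assumptions that are ours: the model is taken to be X_{2k} = a_k X_k + eps_{2k}, X_{2k+1} = b_k X_k + eps_{2k+1}, the weights are taken to be c_k = 1 + X_k^2, the coefficients and noises are Gaussian, and the two sister noises are drawn independently (a simplification, since the paper allows them to be correlated). The function names `simulate_rcbar` and `wls_estimator` are hypothetical.

```python
import random


def simulate_rcbar(n_gen, a, b, c, d, coef_sd, noise_sd, x1=1.0, seed=0):
    """Simulate the tree up to generation n_gen; x[k] stores X_k (x[0] is unused)."""
    rng = random.Random(seed)
    size = 2 ** (n_gen + 1)
    x = [0.0] * size
    x[1] = x1
    for k in range(1, size // 2):            # mothers: generations 0 .. n_gen - 1
        a_k = rng.gauss(a, coef_sd)          # random inherited effect, mean a
        b_k = rng.gauss(b, coef_sd)          # random inherited effect, mean b
        x[2 * k] = a_k * x[k] + rng.gauss(c, noise_sd)
        x[2 * k + 1] = b_k * x[k] + rng.gauss(d, noise_sd)
    return x


def wls_estimator(x, n_gen):
    """WLS estimate of theta = [[a, b], [c, d]] with assumed weights c_k = 1 + X_k^2."""
    s11 = s12 = s22 = 0.0                    # entries of S = sum Phi_k Phi_k^t / c_k
    r = [[0.0, 0.0], [0.0, 0.0]]             # R = sum Phi_k chi_k^t / c_k
    for k in range(1, 2 ** n_gen):           # mothers up to generation n_gen - 1
        w = 1.0 / (1.0 + x[k] ** 2)
        s11 += w * x[k] ** 2
        s12 += w * x[k]
        s22 += w
        for j, child in enumerate((x[2 * k], x[2 * k + 1])):
            r[0][j] += w * x[k] * child
            r[1][j] += w * child
    det = s11 * s22 - s12 ** 2
    if det == 0.0:
        raise ValueError("S is singular; more varied observations are needed")
    # theta_hat = S^{-1} R, using the closed-form inverse of a 2x2 matrix
    return [
        [(s22 * r[0][j] - s12 * r[1][j]) / det for j in range(2)],
        [(-s12 * r[0][j] + s11 * r[1][j]) / det for j in range(2)],
    ]
```

With `coef_sd=0.0` and `noise_sd=0.0` the recursion is deterministic and the estimator recovers (a, b, c, d) up to rounding errors, which gives a quick sanity check; with random coefficients the estimate should only approach the true parameter as the number of generations grows.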

In all the sequel, we shall make use of the following moment hypotheses.

  (H.1) For all ,

  (H.2) For all and for all

  (H.3) For all and for all , if , and are conditionally independent given and for all , if , and are conditionally independent given . Otherwise, there exist and such that, for all

  (H.4) One can find , , and such that, for all and for all

    In addition, there exist and such that, for all

  (H.5) There exists such that

One can observe that these hypotheses allow us to consider the deterministic case where there exist some constants , with such that, for all , and a.s. Moreover, under assumption (H.2), we have for all and for all

(3.6) and

Consequently, if we choose for all , we clearly have for all

This is exactly the reason why we have chosen this weighting sequence in (3.4). A similar WLS estimation approach for branching processes with immigration may be found in [24] and [25]. For all and for all , denote . We deduce from (3.6) that for all , where is defined by

It leads us to estimate the vector of variances by the WLS estimator

(3.7)

and for all ,

Finally the weighting sequence is given, for all , by . This choice is due to the fact that for all and for all

Consequently, as , we clearly have for all and for all

We have a similar WLS estimator of the vector of variances

by replacing by into (3.7). Let us remark that, for all and for all ,

(3.8)

Then, for all and for all , denote . We deduce from (3.8) that for all , where is defined by

It leads us to estimate the vector of covariances by the WLS estimator

(3.9)

This choice is due to the fact that for all and for all

Consequently, as , we clearly have for all and for all

4. A martingale approach

In order to establish all the asymptotic properties of our estimators, we shall make use of a martingale approach. For all , denote

We can clearly rewrite (3.5) as

(4.1)

As in [3], we make use of the notation since it appears that is a martingale. This fact is a crucial point of our study and it justifies the vector notation since most asymptotic results for martingales were established for vector-valued martingales. Let us rewrite in order to emphasize its martingale structure. Let where is the matrix of dimension given by

It represents the individuals of the -th generation which is also the collection of all where belongs to . Let be the random vector of dimension

The vector gathers the noise variables of . The special ordering separating odd and even indices has been made in [3] so that can be written as

Under (3.1), we clearly have for all , a.s. and is -measurable. In addition it is not hard to see that under (H.1) and (H.2), is a locally square integrable vector martingale with increasing process given, for all , by

(4.2)

where

(4.3)

with

One can remark that we obviously have but it is necessary to establish the convergence of , properly normalized, in order to prove the asymptotic results for our RCBAR estimators , , and .

5. Main results

We have to introduce some more notations in order to state our main results. From the original process , we shall define a new process recursively defined by , and if with , then

where is a sequence of i.i.d. random variables with Bernoulli distribution. Such a construction may be found in [9] for the asymptotic analysis of BAR processes. The process gathers the values of the original process along the random branch of the binary tree given by . Denote by the unique such that . Then, for all , we have

(5.1)

where, with the unique number such that ,

(5.2)
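The construction of the process along a random branch can be sketched in Python as follows. This is a hedged illustration, not the paper's code: we assume that the branch starts at the root, that each step moves from individual k to 2k or 2k+1 according to a Bernoulli variable, and that the Bernoulli parameter is 1/2 (the parameter is not legible in the text above, so this value is an assumption). The names `random_branch` and `branch_values` are ours.

```python
import random


def random_branch(n, rng):
    """Labels k_0, ..., k_n of a random branch: k_0 = 1 and k_{i+1} = 2 * k_i + zeta_{i+1}."""
    labels = [1]
    for _ in range(n):
        zeta = rng.randint(0, 1)             # Bernoulli draw, parameter 1/2 (assumed)
        labels.append(2 * labels[-1] + zeta)
    return labels


def branch_values(x, labels):
    """Values Y_i = X_{k_i} read along the branch, where x[k] stores X_k."""
    return [x[k] for k in labels]
```

By construction, the i-th label belongs to the i-th generation and each label is the mother of the next one, so the values returned by `branch_values` follow a single line of descent in the tree.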
Lemma 5.1.

Assume that (H.1) and (H.2) are satisfied. Then, we have

where is a positive non degenerate random variable with .

Denote .

Lemma 5.2.

Assume that (H.1) and (H.2) are satisfied. Then, for all , we have

Proposition 5.3.

Assume that (H.1) to (H.3) are satisfied. Then, we have

(5.3)

where is the positive definite matrix given by

Our first result deals with the almost sure convergence of our WLS estimator .

Theorem 5.4.

Assume that (H.1) to (H.5) are satisfied. Then, converges almost surely to with the rate of convergence

In addition, we also have the quadratic strong law

(5.4)

where

(5.5)

Our second result concerns the almost sure asymptotic properties of our WLS variance and covariance estimators , and . Let

Theorem 5.5.

Assume that (H.1) to (H.5) are satisfied. Then, and converge almost surely to and respectively. More precisely,

(5.6)
(5.7)

In addition, converges almost surely to with

(5.8)
Remark 5.6.

We also have the almost sure rates of convergence

Our last result is devoted to the asymptotic normality of our WLS estimators , , and .

Theorem 5.7.

Assume that (H.1) to (H.5) are satisfied. Then, we have the asymptotic normality

(5.9)

In addition, we also have

(5.10)
(5.11)

where

Finally,

(5.12)

where

The rest of the paper is dedicated to the proof of our main results.

6. Proof of Lemma 5.1

We can reformulate (5.1) and (5.2) as

We already made the assumption that both and are i.i.d. and that those two sequences are independent. Consequently, the couples and share the same distribution. Hence, for all , has the same distribution as the random variable

For the sake of simplicity, we will denote

(6.1)

On the one hand, and since

this immediately leads to

On the other hand, let be defined as

and given by

We have

In addition, and which leads to and . Consequently,

This proves that which immediately implies that

Moreover, we can easily see that (H.1) allows us to say that thanks to the Cauchy-Schwarz inequality. It only remains to prove that is not degenerate. First, we easily have, since

Then, we can calculate