# Fisher information matrix of binary time series

###### Abstract

A common approach to analyzing categorical correlated time series data is to fit a generalized linear model (GLM) with past data as covariate inputs. Challenges remain in conducting inference for short time series. Because the historical data are treated as covariate inputs, standard errors of GLM parameter estimates computed using the empirical Fisher information do not fully account for the auto-correlation in the data. To overcome this serious limitation, we derive the exact conditional Fisher information matrix of a general logistic autoregressive model with endogenous covariates for any series length. We also develop an iterative computational formula that allows for relatively easy implementation of the proposed estimator. Our simulation studies show that confidence intervals derived using the exact Fisher information matrix tend to be narrower than those utilizing the empirical Fisher information matrix while maintaining type I error rates at or below nominal levels. Further, we establish that the exact Fisher information matrix approaches, as the series length tends to infinity, the asymptotic Fisher information matrix previously derived for binary time series data. The developed exact conditional Fisher information matrix is applied to time-series data on respiratory rate among a cohort of expectant mothers, where it is found to provide narrower confidence intervals for functionals of scientific interest and lead to greater statistical power when compared to the empirical Fisher information matrix.

Keywords: Binary time series; Correlated binary data; Empirical Fisher information; Exact Fisher information matrix; Logistic autoregressive model

## 1 Introduction

Time series data are widely collected in many fields such as genetics, medicine and transportation (see Gouveia et al. (2017), Chen et al. (2018A), Chen et al. (2018B), Guo et al. (2018)). Various models for categorical time series that take into account temporal correlation are discussed in Kedem (1994), Kedem (1980), Diggle et al. (1994), Fahrmeir & Tutz (1994) and Gao et al. (2017), among others. In this paper, we consider the logistic autoregressive model for binary time series data. Under this model, we derive the exact conditional Fisher information (Ex-FI) matrix for binary time series of arbitrary length and demonstrate that a correctly specified Ex-FI leads to more efficient inference for regression parameters. In particular, confidence intervals are narrower than those obtained using the empirical Fisher information (Em-FI) matrix (Dodge, 2003) while maintaining type I error rates at, or below, nominal levels.

We briefly describe some of the related approaches to modeling binary time series. Keenan (1982) developed a model with an underlying unobserved process that is Gaussian first-order autoregressive. For binary time series with a Markovian structure, Billingsley (1961), Meyn & Tweedie (2012), Bonney (1987), Fahrmeir & Kaufmann (1987), Kaufmann (1987), Keenan (1982) and Muenz & Rubinstein (1985) developed an inferential procedure based on the conditional likelihood. A comprehensive modeling framework based on partial likelihood inference and generalized linear models was developed in Fokianos & Kedem (2003) and Kedem & Fokianos (2002).

In practice, standard software for fitting generalized linear models (GLMs) to binary time series uses the past series values as “explanatory variables” in the conditional mean of the response (de Vries et al., 1994). One limitation of this approach is that it does not differentiate between explanatory variables that are exogenous to the time series and those that are endogenous (i.e., past values of the time series). Thus, it does not properly take into account the auto-correlation structure in the data, leading to potentially undesirable consequences. In particular, the standard errors of the regression parameter estimates derived using the Em-FI matrix also ignore the auto-correlation structure. We demonstrate that this may lead to incorrect inference because the asymptotic covariance matrix of the partial (conditional) maximum likelihood estimators of the logistic model parameters is also incorrect. As an example of the practical importance of this result, de Vries et al. (1994) utilized a logistic autoregressive model (LAR/LARX) to predict the outcome of supervised exercise for intermittent claudication. The inference did not distinguish between covariates that were exogenous versus endogenous to the time series, hence yielding potentially invalid and/or inefficient statistical inference. Our goal here is therefore to derive the Ex-FI matrix of a logistic autoregressive (LAR/LARX) model, for finite time series length, for use in statistical inference. This model takes into account the correlation in binary time series data.

Asymptotic inference for modeling independent and serially-correlated binary responses (or binary time series) has been well studied in the literature. In the case of independent binary responses, several papers have established and discussed the poor performance of Wald-based tests and confidence intervals for the probability of success when utilizing the empirical information matrix (see Hauck & Donner (1977); Newcombe (1998); Agresti & Min (2001)). As a result, the use of either likelihood ratio or score based confidence intervals is generally recommended for inference in this setting. For binary time series, Fahrmeir & Kaufmann (1987) and Kaufmann (1987) used the Markovian assumption to demonstrate the asymptotic normality and efficiency of the maximum likelihood estimator under standard regularity conditions. Fokianos & Kedem (1998) extended the idea by introducing time dependent covariates (including past series values). Under the framework of partial likelihood inference, Fokianos & Kedem (1998) proved the existence of an asymptotic conditional Fisher information (AFI) matrix and established asymptotic results for the maximum partial likelihood estimator. However, our simulation studies indicate that one needs to be cautious in applying asymptotic results when the length of the time series is small or moderate. Startz (2012) provided a statistical strategy for modeling the binary autoregressive moving average (BARMA) under mild assumptions. In that study, one of the main considerations was the lack of analytical forms of the autocorrelation and the unconditional mean because of the nonlinearity of the model. In this paper, we propose a rigorous approach to derive the Ex-FI matrix of a LAR/LARX model that provides more efficient inference, in the sense of narrower interval estimates, while maintaining the nominal type I error rate.

In Fokianos & Kedem (1998), the AFI matrix was derived for the general case where the conditional distribution of a time series depends on its own historical data as well as other covariates. While impressive in its generality, the primary limitation of this result is that it does not provide a closed form of the Fisher information matrix for specific models. As we will demonstrate, the form of the Fisher information matrix is non-trivial even for the logistic first order autoregressive LAR(1) model, perhaps the simplest model for binary time series. The difficulty in deriving the analytic form of the Fisher information matrix lies in the fact that the score function, or the Hessian matrix, contains cross-covariance terms related to time-varying covariates. Another limitation is that the Ex-FI matrix has not been derived for finite series length. Instead, only an asymptotic approximation based on the partial likelihood, which turns out to be equivalent to the Em-FI matrix for the LAR model, was provided. These limitations have major consequences. First, the result lacks the precise form of the Fisher information matrix needed to conduct inference on specific LAR coefficients and functionals of these coefficients (e.g., the conditional probability of the response given the past values of the binary series). Second, when the series length is not sufficiently large, the discrepancy between the Ex-FI and Em-FI matrices could lead to poor power, incorrect significance levels of tests, inefficient inference, and potentially misleading results from data analysis. Third, the large sample theory derived in Kedem & Fokianos (2002) is based on a crucial assumption involving a probability measure, a Borel set and an indicator function. Even when the series is long, such an assumption may not be easily met. In this case, using Em-FI rather than Ex-FI may be misleading since no large sample theory is guaranteed.

Motivated by these limitations, this paper provides a derivation of the Ex-FI matrix of a LAR/LARX model for arbitrary finite series length. While the derivation is non-trivial, we provide a computationally tractable expression that can be easily implemented in an iterative manner. We report findings from simulation studies suggesting that the derived Ex-FI matrix yields superior results relative to the Em-FI for small to moderate sample sizes. When compared to using the Em-FI, inference based on the Ex-FI matrix produces narrower confidence intervals for a fixed significance level, false positive rates close to their expected values, and higher power when conducting tests of hypotheses. The simulation studies also demonstrate that the Ex-FI matrix converges to the general AFI developed in Fokianos & Kedem (1998) in the sense that the norm of the difference between the entries of the two matrices converges to 0 as the length of the binary time series increases. Finally, we apply the developed Ex-FI matrix to time-series data on respiratory rate among a cohort of expectant mothers. The results show a pattern similar to that observed in the simulations. Namely, the Ex-FI matrix is found to provide narrower confidence intervals for functionals of scientific interest (such as the probability or log odds) and to produce more statistical power when compared to the Em-FI matrix.

The remainder of this paper is organized as follows. In Section 2, we first derive the Ex-FI matrix of the LAR/LARX model in general. We also propose a computational framework based on functional iteration to obtain the Ex-FI matrix explicitly. Finally, we consider the special case in which the order of the LAR model is 1 and calculate the analytic form of the Ex-FI matrix. In Section 3, we present simulation results comparing the Ex-FI with the Em-FI. The results show the benefit of using Ex-FI in terms of shorter confidence intervals and reasonable type I error rates. Asymptotic behavior is also studied. In Section 4, we apply the Ex-FI matrix to time-series data on respiratory rate among expectant mothers. Comparing with the Em-FI, we conclude that using Ex-FI can produce greater power and shorter confidence intervals when conducting statistical inference.

## 2 Derivation of the Exact Conditional Fisher Information Matrix

### 2.1 Logistic autoregressive model of order p (LAR(p))

###### Theorem 1.

Consider a binary-valued correlated time series data where the conditional distribution of depends on the previous values via the conditional probability

(1) |

where is endogenous to the series and The exact Fisher Information (Ex-FI) matrix takes the form

(2) |

where the conditional joint probability of is derived to be

(3) | |||||

Proof The proof directly follows by the fact that the conditional log-likelihood function of and the vector of conditional score functions are, respectively,

And

(4) |

∎
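For orientation, the quantities named in Theorem 1 and its proof can be read against the standard logistic autoregressive formulation sketched below. The notation ($\beta_0,\dots,\beta_p$, $\pi_t$, $\mathbf{x}_t$, and conditioning on the $p$ initial values) is assumed here for illustration and may differ from the paper's own symbols in Equations (1) to (4).

```latex
\begin{align*}
\operatorname{logit}\,\pi_t
  &= \beta_0 + \sum_{j=1}^{p} \beta_j Y_{t-j}
   = \mathbf{x}_t^{\top}\boldsymbol{\beta},
  \qquad \mathbf{x}_t = (1, Y_{t-1}, \dots, Y_{t-p})^{\top},
  \qquad \pi_t = \Pr(Y_t = 1 \mid Y_{t-1}, \dots, Y_{t-p}), \\
\ell(\boldsymbol{\beta})
  &= \sum_{t=1}^{T} \bigl\{ Y_t \log \pi_t + (1 - Y_t) \log (1 - \pi_t) \bigr\}, \\
\mathbf{S}(\boldsymbol{\beta})
  &= \sum_{t=1}^{T} (Y_t - \pi_t)\, \mathbf{x}_t,
  \qquad
  \mathbf{H}(\boldsymbol{\beta})
   = -\sum_{t=1}^{T} \pi_t (1 - \pi_t)\, \mathbf{x}_t \mathbf{x}_t^{\top}, \\
\mathcal{I}_T(\boldsymbol{\beta})
  &= -\mathrm{E}\bigl[ \mathbf{H}(\boldsymbol{\beta}) \mid Y_0, \dots, Y_{1-p} \bigr]
   = \sum_{t=1}^{T} \sum_{\mathbf{s} \in \{0,1\}^{p}}
     \Pr\bigl( (Y_{t-1}, \dots, Y_{t-p}) = \mathbf{s} \mid Y_0, \dots, Y_{1-p} \bigr)\,
     \pi(\mathbf{s}) \{ 1 - \pi(\mathbf{s}) \}\, \mathbf{x}(\mathbf{s}) \mathbf{x}(\mathbf{s})^{\top}.
\end{align*}
```

Under this reading, the only quantity that is not immediate from the logistic form is the conditional joint probability of the lagged values, which is the role played by Equation (3).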

###### Remark 1.

Note that the results given in (2) and (3) depend on the true values of the model parameters. In practice, one plugs the maximum likelihood estimates into those expressions to evaluate the Ex-FI.

### 2.2 Logistic autoregressive model of order p with exogenous covariates (LARX(p))

Here we consider the case of additional exogenous covariate adjustment in the LAR(p) time series model.

###### Corollary 1.

Consider a binary-valued correlated time series , where the conditional distribution of depends on its previous values and exogenous covariates that relates to current time through the conditional probability

where , and all the other parameters follow the notation of the previous section. The Ex-FI matrix takes the form

(5) |

where is defined in Equation (3).

Proof The results follow directly from Theorem 1 and the facts that the conditional log-likelihood function of and the vector of conditional score functions are respectively

(6) | |||||

And the Hessian matrix is

(7) |

∎
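As a minimal sketch of how the exogenous case can be evaluated, the function below computes the expected information for a LARX(1) model with one scalar exogenous covariate. It assumes the form logit P(Y_t = 1 | Y_{t-1}, x_t) = b0 + b1 Y_{t-1} + g x_t, treats the covariate values as fixed and known, and conditions on the initial value y0. The function name and parameterization are hypothetical and are not taken from the paper.

```python
import numpy as np

def exact_fisher_larx1(beta, y0, x):
    """Expected information for an assumed LARX(1) model, given y0 and fixed x.

    beta : (b0, b1, g) = intercept, lag-1 coefficient, covariate coefficient
    y0   : initial value of the series (0 or 1)
    x    : covariate values x_1, ..., x_T, treated as non-random
    """
    b0, b1, g = beta
    q = float(y0)                       # P(Y_{t-1} = 1 | y0), starts at y0
    info = np.zeros((3, 3))
    for xt in np.asarray(x, dtype=float):
        pi = {}
        for y_prev, pr_state in ((0, 1.0 - q), (1, q)):
            z = np.array([1.0, y_prev, xt])          # design vector
            pi[y_prev] = 1.0 / (1.0 + np.exp(-(b0 + b1 * y_prev + g * xt)))
            # weight each lag state by its probability given y0
            info += pr_state * pi[y_prev] * (1.0 - pi[y_prev]) * np.outer(z, z)
        # propagate the lag-state probability one step forward
        q = (1.0 - q) * pi[0] + q * pi[1]
    return info
```

Because the transition probabilities depend on the covariate at each time point, the recursion for the lag-state probability is time-varying, unlike the pure LAR case.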

### 2.3 Computation through functional iteration

Since the binary variables involved range over all possible 0/1 configurations, computation of the Ex-FI matrix through direct calculation can be expensive. In this section, we propose two alternative approaches to obtain the Ex-FI matrix.

###### Theorem 2.

where is the indicator function that takes value when the realization is and otherwise.

Alternatively, the Ex-FI matrix can be also obtained through where

and

Proof The derivation of directly follows from the definition. The results of derive from the fact that for any particular , by function iteration,

∎
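The iterative idea can be sketched as follows: rather than enumerating every possible series, propagate the probability distribution of the p most recent values forward one step at a time and accumulate the expected information along the way. This is a minimal sketch under an assumed parameterization (intercept plus p lag coefficients, conditioning on p initial values); the function name and argument layout are hypothetical and this is not claimed to be the exact recursion stated in Theorem 2.

```python
import itertools
import numpy as np

def exact_fisher_lar(beta, y_init, T):
    """Expected information of an assumed LAR(p) model given p initial values.

    beta   : (b0, b1, ..., bp), intercept followed by lag coefficients
    y_init : initial lags, most recent first: (y_0, y_{-1}, ..., y_{1-p})
    T      : number of modelled observations Y_1, ..., Y_T
    """
    beta = np.asarray(beta, dtype=float)
    p = beta.size - 1
    # each state is a tuple (y_{t-1}, ..., y_{t-p})
    states = list(itertools.product((0, 1), repeat=p))
    index = {s: i for i, s in enumerate(states)}
    X = np.array([(1,) + s for s in states], dtype=float)   # design rows
    prob1 = 1.0 / (1.0 + np.exp(-X @ beta))                 # P(Y_t = 1 | state)
    weight = prob1 * (1.0 - prob1)

    dist = np.zeros(len(states))
    dist[index[tuple(int(v) for v in y_init)]] = 1.0        # start at y_init

    info = np.zeros((p + 1, p + 1))
    for _ in range(T):
        # contribution of time t: sum over states of P(state) * w * x x^T
        info += X.T @ (X * (dist * weight)[:, None])
        # shift the window: new state is (y_t, y_{t-1}, ..., y_{t-p+1})
        new_dist = np.zeros_like(dist)
        for i, s in enumerate(states):
            new_dist[index[(1,) + s[:-1]]] += dist[i] * prob1[i]
            new_dist[index[(0,) + s[:-1]]] += dist[i] * (1.0 - prob1[i])
        dist = new_dist
    return info
```

The cost per time step grows with the number of lag states, which is 2 to the power p, rather than with the number of possible series, so the sketch remains cheap for the small orders considered in the simulations.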

###### Remark 2.

###### Remark 3.

The results from Theorem 2 can be extended to other link functions. For the complementary log-log link, the latent function, score function and Hessian matrix have analytic forms that can be easily obtained, so Equations (4) and (7) can be directly adapted and the Ex-FI then follows from Theorem 2. For the probit link, because the link function has no explicit form, a numerical approximation can be used to obtain the Ex-FI in practice.

### 2.4 Special case: logistic autoregressive model of order 1 (LAR(1))

In general, there is no explicit form of the Ex-FI. In this section, we consider the only special case that admits an explicit analytic form.

###### Theorem 3.

Consider a binary-valued time series data , where the conditional distribution of depends on its own immediate past value via the conditional probability

Then if we denote the Ex-FI matrix to be , its elements are derived, respectively, as

where and .

Proof It is straightforward that the corresponding Hessian matrix is derived to be

(8) |

Due to the Markovian assumption, the conditional expectation can be obtained through iterated expectations. For any particular , we have

(9) | |||||

(10) |

which completes the proof after some algebra. ∎
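The iterated-expectation argument can be illustrated with a short closed-form computation. The sketch below assumes the parameterization logit P(Y_t = 1 | Y_{t-1}) = b0 + b1 Y_{t-1} and conditions on the initial value y0; the names `pi0`, `pi1`, `rho` and `pi_star` are introduced here for illustration, and the arrangement of terms is not necessarily the one used in Theorem 3. It relies on the fact that P(Y_s = 1 | y0) satisfies a linear recursion and therefore has a geometric closed form.

```python
import math

def exact_fisher_lar1(b0, b1, y0, T):
    """Closed-form expected information for an assumed LAR(1) model given y0."""
    pi0 = 1.0 / (1.0 + math.exp(-b0))            # P(Y_t = 1 | Y_{t-1} = 0)
    pi1 = 1.0 / (1.0 + math.exp(-(b0 + b1)))     # P(Y_t = 1 | Y_{t-1} = 1)
    w0, w1 = pi0 * (1.0 - pi0), pi1 * (1.0 - pi1)
    rho = pi1 - pi0                              # lies strictly in (-1, 1)
    pi_star = pi0 / (1.0 - rho)                  # stationary P(Y = 1)
    # q_s = P(Y_s = 1 | y0) = pi_star + (y0 - pi_star) * rho**s, so the
    # expected number of times the lagged value equals 1 is a geometric sum.
    n1 = T * pi_star + (y0 - pi_star) * (1.0 - rho ** T) / (1.0 - rho)
    n0 = T - n1
    i11 = n1 * w1                                # slope-slope entry
    i00 = n0 * w0 + n1 * w1                      # intercept-intercept entry
    return [[i00, i11], [i11, i11]]              # off-diagonal equals i11
```

The off-diagonal and slope entries coincide because the lagged value is binary, so its square equals itself.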

## 3 Simulations

### 3.1 Evaluating small sample performance

In this section, we compare the behavior of the newly derived Ex-FI and Em-FI in the context of inference for regression parameters under various models. The Em-FI (AFI) is calculated from Equation (4) for LAR and Equation (7) for LARX respectively. Time series lengths are chosen as and respectively, and simulations were generated under different scenarios. In Scenario 1, the signals were generated by LAR(1) with parameters (“low ratio”), and (“high ratio”). In this case, denotes the log odds ratio and denotes the log odds when the previous realization is . is a monotonic function of the log odds ratio of . Particularly, large value (greater than 1) of implies the log odds of when is much higher compared to the log odds when . In Scenario 2, the time series were simulated through LAR(2) with parameters (“low ratio”), and (“high ratio”). In Scenario 3, we considered generating signals by LARX(1) with parameters (“low ratio”), and (“high ratio”). The exogenous covariate was obtained from standard normal distribution. For each scenario, we calculate the empirical type I error-rate for testing at level .05, the average standard error obtained from maximum likelihood estimates and true values as well as their corresponding Monte Carlo standard error, the observed error deviation of the estimates across simulations. Critical values are determined from normal distribution when calculating type I error-rates.
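A simulation of this kind can be sketched for the LAR(1) scenario as follows. The code assumes the parameterization logit P(Y_t = 1 | Y_{t-1}) = b0 + b1 Y_{t-1}, and it takes the Em-FI to be the negative Hessian evaluated at the fitted values on the observed series; both are assumptions made for illustration. For this particular model the maximum likelihood estimates have a closed form in terms of transition proportions, because an intercept plus one binary lag is a saturated model for the two lag states.

```python
import numpy as np

def simulate_lar1(b0, b1, T, y0=0, rng=None):
    """Draw y_0, y_1, ..., y_T from an assumed LAR(1) model."""
    rng = np.random.default_rng() if rng is None else rng
    y = np.empty(T + 1, dtype=int)
    y[0] = y0
    for t in range(1, T + 1):
        p = 1.0 / (1.0 + np.exp(-(b0 + b1 * y[t - 1])))
        y[t] = rng.random() < p
    return y

def fit_lar1(y):
    """Closed-form MLE and observed (empirical) information for LAR(1).

    If a lag state never occurs, or a transition proportion is 0 or 1,
    the estimates are undefined or infinite and the information matrix
    is singular; no safeguard is included in this sketch.
    """
    y = np.asarray(y)
    prev, cur = y[:-1], y[1:]
    n0, n1 = np.sum(prev == 0), np.sum(prev == 1)
    pi0 = cur[prev == 0].mean()          # estimate of P(Y_t = 1 | Y_{t-1} = 0)
    pi1 = cur[prev == 1].mean()          # estimate of P(Y_t = 1 | Y_{t-1} = 1)
    logit = lambda p: np.log(p / (1.0 - p))
    beta_hat = np.array([logit(pi0), logit(pi1) - logit(pi0)])
    em_fi = (n0 * pi0 * (1.0 - pi0) * np.array([[1.0, 0.0], [0.0, 0.0]])
             + n1 * pi1 * (1.0 - pi1) * np.ones((2, 2)))
    return beta_hat, em_fi
```

In short series a transition cell can easily be empty or nearly empty, which makes the observed information close to singular; this is one plausible mechanism for very large standard errors in individual simulated datasets.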

Table 1 provides a summary of the conducted simulation study for various time series lengths. With respect to type I error, it can be seen that use of Ex-FI and Em-FI both result in conservative inference (lower than nominal type I error) for smaller values of and for high ratios. For the low ratio scenario, nominal type I error rates are achieved as time series lengths of . For time series lengths of both variance estimators yield the desired type I error rates. The benefit of using Ex-FI over Em-FI is observed when comparing the average standard error to the observed standard deviation of estimates of across simulations. Specifically, Em-FI tends to behave erratically for small sample sizes, yielding extremely large estimated standard error for some simulated datasets. This can be seen most notably in the high ratio scenario by observing that the average standard error computed using Em-FI is 362.3 compared to the actual observed standard deviation of the estimator across 10,000 simulations being only 3.015. In contrast, the average standard error computed using Ex-FI is only 7.868.

Table 2 summarizes the simulation results of Scenario 2. Similar to the case of LAR(1), both Ex-FI and Em-FI result in lower type I error rates when the sample size is small and reach nominal values as the series length increases. We can still observe the large advantage of the average standard error obtained from Ex-FI over that obtained from Em-FI. Judging from the Monte Carlo errors, given as the second values in the parentheses, this advantage is statistically significant in most cases, especially when the series length is relatively small. Moreover, by comparing the first value in the parentheses to the observed standard error, one finds that the proposed Ex-FI (evaluated at the true parameter values) is close to the Monte Carlo standard error of the maximum likelihood estimates. Table 3 presents the results of Scenario 3, where similar findings hold.

As is shown in Table 1, in the cases of , although the average Ex-FI obtained from maximum likelihood estimates is not very close to the observed standard error of the estimates, the ones obtained from the true values of are much closer. In the worst case of (low ratio), the difference is for Ex-FI in comparison with for Em-FI. Such advantage is much more obvious in the scenario of high ratio. As increases, the Ex-FI obtained from true values is converging to the observed standard error. From Tables 2 and 3, we can clearly find the same pattern for all the parameters of endogenous and exogenous covariates. When the sample size is relatively small, the Ex-FI obtained from the true value of parameters is close to the observed standard error. The discrepancy is approaching to 0 as goes towards 200.

Length/Method | Low ratio: Type I error | Low ratio: Standard error* | Low ratio: Observed standard error** | High ratio: Type I error | High ratio: Standard error* | High ratio: Observed standard error**
---|---|---|---|---|---|---
Ex-FI | 0.031 | | 2.290 | 0.008 | | 3.015
Em-FI | 0.030 | | | 0.011 | |
Ex-FI | 0.048 | | 0.632 | 0.039 | | 1.074
Em-FI | 0.044 | | | 0.039 | |
Ex-FI | 0.052 | | 0.299 | 0.051 | | 0.325
Em-FI | 0.052 | | | 0.053 | |

*Standard error represents the average standard error of the point estimator and, in parentheses, the same value obtained from the true parameter values, followed by the Monte Carlo standard error. Note that since Ex-FI does not rely on the realized series, it remains the same within the same scenario, while for Em-FI the reported value is the average over repetitions.

**Observed standard error represents the Monte Carlo standard error of the maximum likelihood estimates across simulations.

Length/Method | Low ratio: Type I error | Low ratio: Standard error* | Low ratio: Observed standard error** | High ratio: Type I error | High ratio: Standard error* | High ratio: Observed standard error**
---|---|---|---|---|---|---
Ex-FI | 0.030 | | 4.312 | 0.027 | | 7.944
Em-FI | 0.031 | | | 0.028 | |
Ex-FI | 0.042 | | 3.901 | 0.028 | | 6.931
Em-FI | 0.041 | | | 0.031 | |
Ex-FI | 0.048 | | 1.247 | 0.030 | | 3.284
Em-FI | 0.047 | | | 0.031 | |
Ex-FI | 0.042 | | 0.949 | 0.043 | | 3.993
Em-FI | 0.041 | | | 0.042 | |
Ex-FI | 0.052 | | 0.612 | 0.047 | | 2.522
Em-FI | 0.051 | | | 0.045 | |
Ex-FI | 0.048 | | 0.701 | 0.048 | | 1.503
Em-FI | 0.050 | | | 0.050 | |

*Standard error represents the average standard error of the point estimates and, in parentheses, the same measure obtained from the true parameter values, followed by the Monte Carlo standard error. Note that since Ex-FI does not rely on the realized series, it remains the same within the same scenario, while for Em-FI the reported value is the average over repetitions.

**Observed standard error represents the Monte Carlo standard error of the maximum likelihood estimates across simulations.

Length/Method | Low ratio: Type I error | Low ratio: Standard error* | Low ratio: Observed standard error** | High ratio: Type I error | High ratio: Standard error* | High ratio: Observed standard error**
---|---|---|---|---|---|---
Ex-FI | 0.027 | | 6.363 | 0.031 | | 17.522
Em-FI | 0.029 | | | 0.032 | |
Ex-FI | 0.032 | | 6.018 | 0.038 | | 11.460
Em-FI | 0.035 | | | 0.036 | |
Ex-FI | 0.048 | | 0.332 | 0.033 | | 0.531
Em-FI | 0.047 | | | 0.033 | |
Ex-FI | 0.042 | | 0.691 | 0.038 | | 0.769
Em-FI | 0.041 | | | 0.039 | |
Ex-FI | 0.051 | | 0.185 | 0.048 | | 0.198
Em-FI | 0.050 | | | 0.046 | |
Ex-FI | 0.049 | | 0.314 | 0.050 | | 0.343
Em-FI | 0.051 | | | 0.051 | |

*Standard error represents the average standard error of the point estimates and, in parentheses, the same measure obtained from the true parameter values, followed by the Monte Carlo standard error. Note that since Ex-FI does not rely on the realized series, it remains the same within the same scenario, while for Em-FI the reported value is the average over repetitions.

**Observed standard error represents the Monte Carlo standard error of the maximum likelihood estimates across simulations.

### 3.2 Evaluation of confidence interval length

Here we consider the average length of derived 95% confidence intervals for . Following the result that asymptotically, for large values of (Fokianos & Kedem, 1998), an approximate confidence interval can be obtained using both Ex-FI and Em-FI. For each scenario of described above, 1000 binary time series of lengths were generated. For each time series data, an approximate confidence interval for was computed using both Ex-FI and Em-FI. We compared the two approaches by calculating the relative difference of the lengths of the two confidence intervals. As expected from the average standard error values in Table 1, Fig. 1 indicates that the confidence interval derived from Ex-FI behaves more efficiently on average than the confidence interval computed using Em-FI. It is noted that such substantial difference exists when and tends to be roughly the same as goes beyond 200. Once again, it implies that one should be careful with the Em-FI when .
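The interval comparison described above can be sketched as follows. The code assumes Wald-type intervals built from the inverse of whichever information matrix is supplied, and it defines the relative difference as the Em-FI length minus the Ex-FI length, divided by the Em-FI length. That definition and the function names are assumptions made for illustration, since the text does not spell out the formula.

```python
from statistics import NormalDist
import numpy as np

def wald_intervals(beta_hat, info, level=0.95):
    """Wald confidence intervals, one row (lower, upper) per parameter."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    se = np.sqrt(np.diag(np.linalg.inv(info)))    # asymptotic standard errors
    beta_hat = np.asarray(beta_hat, dtype=float)
    return np.column_stack([beta_hat - z * se, beta_hat + z * se])

def relative_length_difference(ci_em, ci_ex):
    """(Em-FI length - Ex-FI length) / Em-FI length, per parameter."""
    len_em = ci_em[:, 1] - ci_em[:, 0]
    len_ex = ci_ex[:, 1] - ci_ex[:, 0]
    return (len_em - len_ex) / len_em
```

A positive value under this definition means the Ex-FI interval is the shorter of the two.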

In Fig. 2, was fixed to and and was allowed to vary while keeping . Results clearly establish the advantage of Ex-FI over Em-FI especially as the true value of increases, i.e., the ratio increases.

### 3.3 Evaluating the discrepancy between the exact and empirical Fisher information

In this section, we discuss the results of simulations conducted to investigate the discrepancy between Ex-FI and Em-FI under the following scenarios: (i.) time series lengths ranging from ; (ii.) the ratio . Based on simulated time series under each scenario, the average Frobenius norm of the difference between the asymptotic covariance matrices (i.e. the inverse of Ex-FI and Em-FI), displayed in Fig. 3, shows that when any discrepancy between the two covariance matrices effectively vanishes. However, for , discrepancies do exist, primarily due to the instability of Em-FI for particular datasets. The result reiterates that caution needs to be taken when utilizing the Em-FI variance estimator for shorter time series, since this erratic behaviour could lead to significant errors in the estimated variances of regression parameters.
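The discrepancy measure used here, the Frobenius norm of the difference between two asymptotic covariance matrices, can be computed as in the short sketch below. The function name is hypothetical, and the inputs are assumed to be invertible information matrices.

```python
import numpy as np

def covariance_discrepancy(info_a, info_b):
    """Frobenius norm of the difference between the inverses of two
    information matrices, i.e. between the implied covariance matrices."""
    cov_a = np.linalg.inv(np.asarray(info_a, dtype=float))
    cov_b = np.linalg.inv(np.asarray(info_b, dtype=float))
    return np.linalg.norm(cov_a - cov_b, ord="fro")
```

Averaging this quantity over simulated series for each length gives a curve of the kind described for Fig. 3.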

### 3.4 Evaluating the convergence

We considered the asymptotic behavior of Ex-FI and compared it to the AFI proposed by Fokianos & Kedem (1998) by computing the average Frobenius norm between the two matrices over simulated time series data. In Fig. 4, it is clear that the discrepancy between these two matrices decays dramatically, which empirically indicates that the limiting behavior between the two estimators coincides. It should be emphasized that when , the difference is significant while as grows larger than 200, the discrepancy shrinks to small values around 0. Hence, utilizing the Em-FI when may be problematic.

## 4 Analysis of Binary Respiratory Time Series

### 4.1 Exploratory analysis

In this section we consider time-series data on respiratory rate among a cohort of 113 expectant mothers. Briefly, the participants consist of a sub-sample of women from a larger cohort of women attending prenatal care at a university-based clinic in Pittsburgh, PA and participating in a prospective, longitudinal study from early gestation through birth (Entringer et al., 2015). Participants were asked to wear a heart and respiratory rate monitor for up to four consecutive days. In addition, each night prior to sleeping the participants were asked to fill out an electronic diary recording how stressful their day was on a scale from 1 to 10 (), with 10 corresponding to the highest self-reported stress level. The study was approved by the local Institutional Review Board (IRB).

Of scientific interest is the potential association between self-reported stress and respiratory, or breath, rate measured as the number of breaths per 60 second period. For the purposes of illustration, we consider a participant's breath rate averaged over one-hour intervals running from midnight to midnight, for a maximum of 24 hours. Empirical data suggest that a respiratory rate of over 20 breaths per minute is considered high for a healthy adult (Barrett, 2012). As such, the time series in this study are discretized into a binary response using a threshold of greater than 20 breaths/min. Accordingly, for each subject and each hour, the binary response is set to 1 if the observed average respiratory rate is greater than 20 breaths/min, and 0 otherwise. To illustrate, Fig. 5 presents the observed time series for a randomly sampled participant. Table 4 depicts the empirical transition table of respiratory rate across all subjects. It illustrates a strong association between the current realization and its lagged values. In this study, one scientific question of interest is whether or not a potential interaction exists between the lagged realizations and a participant’s observed stress level. Specifically, it is hypothesized that the association between lagged responses and current breath rate is lower among individuals reporting high stress due to the erratic breathing patterns that high stress situations can evoke. As such, we consider a LARX model including the lagged realization, an indicator for high stress, and their interaction. In this study, similar to the discussion in Holmes & Rahe (1967), a subject is considered to be in high stress if the self-reported scale exceeds a fixed threshold.
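The discretization and the empirical transition table can be sketched as follows. The threshold of 20 breaths/min comes from the text; the function names, the pooling of transitions across subjects, and the normalization to joint proportions are assumptions made for illustration.

```python
import numpy as np

def binarize(rates, threshold=20.0):
    """1 if the hourly average breath rate exceeds the threshold, else 0."""
    return (np.asarray(rates, dtype=float) > threshold).astype(int)

def transition_proportions(series_list, lag=1):
    """Pooled 2x2 table of joint proportions: rows index the lagged value,
    columns index the current value. Transitions are counted within each
    subject's series only, never across subjects."""
    counts = np.zeros((2, 2))
    for y in series_list:
        y = np.asarray(y)
        for prev, cur in zip(y[:-lag], y[lag:]):
            counts[prev, cur] += 1
    return counts / counts.sum()
```

For example, `transition_proportions([binarize(r) for r in hourly_rates], lag=2)` would give the lag-2 table, where `hourly_rates` is a hypothetical list holding one array of hourly averages per subject.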

Lagged respiratory rate | Current respiratory rate | |
---|---|---|

0.865 | 0.046 | |

0.038 | 0.051 | |

0.855 | 0.047 | |

0.038 | 0.060 | |

### 4.2 Fitting the LARX model to the respiratory binary time series data

We consider LARX(1) and LARX(2) models fitted across the 113 subjects with common parameters. Stress level and the interactions between stress level and past values of the binarized respiratory rate were considered as covariates. Assuming independence across subjects, we maximize a log-likelihood function that is the sum of the log-likelihood function (6) over subjects. Table 5 provides 95% confidence intervals for the probability and odds functionals after fitting the LARX(1) model. It can be seen that the confidence intervals derived from Ex-FI are never wider than those derived from Em-FI. Specifically, there is a case in which the confidence interval for the probability derived from Ex-FI excludes 0.5 (the odds interval excludes 1), while the confidence interval resulting from the use of Em-FI includes 0.5 (the odds interval includes 1). Under the LARX(2) model, the pattern is more obvious. From Table 6, it can be seen that the Ex-FI confidence intervals for all the functionals are, on average, shorter than the Em-FI intervals. In the most extreme case, the Ex-FI derived confidence interval for the odds of high respiratory rate among high stress individuals is approximately 30% shorter (and excludes 1) when compared to the confidence interval derived using Em-FI. Using the Ex-FI approach, the lagged realizations are determined to be significantly associated with respiratory rate: expectant mothers with low stress level tend to have a low rate if their previous realizations are low. In contrast, the wider Em-FI intervals do not rule out an odds of 1 associated with high prior respiratory state among high stress mothers.
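The pooled log-likelihood can be sketched as a sum of per-subject contributions. The design below (intercept, lagged response, high-stress indicator, and their interaction, with one stress indicator per subject) is an assumed reading of the LARX(1) specification in the text, and the function names are hypothetical.

```python
import numpy as np

def larx1_design(y, high_stress):
    """Design matrix and responses for one subject under an assumed LARX(1)
    model with columns: intercept, lag-1 response, stress, stress x lag."""
    y = np.asarray(y, dtype=float)
    prev = y[:-1]
    s = np.full_like(prev, float(high_stress))
    X = np.column_stack([np.ones_like(prev), prev, s, s * prev])
    return X, y[1:]

def pooled_loglik(beta, subjects):
    """Sum of conditional logistic log-likelihoods over independent subjects.

    subjects : iterable of (binary series, high-stress indicator) pairs
    """
    beta = np.asarray(beta, dtype=float)
    total = 0.0
    for y, high_stress in subjects:
        X, resp = larx1_design(y, high_stress)
        eta = X @ beta
        # y * eta - log(1 + exp(eta)), computed stably
        total += np.sum(resp * eta - np.logaddexp(0.0, eta))
    return total
```

Maximizing this function with any general-purpose optimizer gives common parameter estimates across subjects, and the information matrices from the individual subjects add in the same way.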

Previous State/Method | Low stress: Prob | Low stress: Odds | High stress: Prob | High stress: Odds
---|---|---|---|---
Ex-FI | (0.042, 0.061) | (0.044, 0.065) | (0.027, 0.085) | (0.028, 0.093)
Em-FI | (0.042, 0.061) | (0.044, 0.065) | (0.027, 0.085) | (0.028, 0.093)
Ex-FI | (0.373, 0.731) | (0.594, 2.724) | |
Em-FI | (0.366, 0.737) | (0.577, 2.802) | |

Previous State/Method | Low stress: Prob | Low stress: Odds | High stress: Prob | High stress: Odds
---|---|---|---|---
Ex-FI | (0.044, 0.064) | (0.046, 0.068) | (0.023, 0.081) | (0.023, 0.088)
Em-FI | (0.044, 0.064) | (0.046, 0.068) | (0.023, 0.080) | (0.023, 0.087)
Ex-FI | (0.394, 0.553) | (0.651, 1.241) | (0.349, 0.851) | (0.537, 5.701)
Em-FI | (0.385, 0.563) | (0.626, 1.290) | (0.349, 0.851) | (0.537, 5.707)
Ex-FI | (0.100, 0.201) | (0.117, 0.251) | (0.017, 0.230) | (0.017, 0.299)
Em-FI | (0.100, 0.210) | (0.111, 0.265) | (0.015, 0.250) | (0.016, 0.332)
Ex-FI | (0.670, 0.787) | (2.033, 3.696) | |
Em-FI | (0.653, 0.800) | (1.878, 4.001) | |

## 5 Conclusion

We have demonstrated that applying the Em-FI matrix to serially-correlated data may lead to undesirable consequences in inference. Such consequences include wider confidence intervals (on average) and thus potentially misleading inferential results. To overcome these limitations, we derived the exact form and an iterative computation formula of the conditional Fisher information matrix for the general logistic autoregressive model without (with) exogenous covariates (LAR(p)/LARX(p)). Although a normality assumption is still needed when the sample size is not large, simulation studies based on the LAR(p)/LARX(p) model demonstrate the advantages of Ex-FI over Em-FI in terms of small sample stability, leading to narrower confidence intervals, on average, while maintaining false positive rates at or below nominal levels. Numerically, we established the convergence of the exact conditional Fisher information and studied its asymptotic behavior as the series length grows large. In addition, analysis of the respiratory binary time series data suggests that using Ex-FI may result in greater statistical power when making inference. In summary, the Ex-FI matrix is recommended over the Em-FI as it provides greater stability for short time series and equivalent large sample inference. While the derivation of the Ex-FI is non-trivial, it is computationally tractable because it can be obtained iteratively. The result is an easily implementable estimator that is more stable than Em-FI, particularly for sample sizes less than 200.

While the proposed approach is promising, several directions remain to be pursued. For instance, the current framework relies on the normality assumption even when the sample size is not large. As future work, theoretical results on the finite-sample distribution of the maximum likelihood estimates could be established. Moreover, selection of the order needs serious consideration. Motivated by the works of Kedem & Fokianos (2002) and Katz (1981), we may select the optimal lag order using either the Akaike Information Criterion (AIC) or the Bayesian Information Criterion (BIC), both evaluated at the maximum likelihood estimator of the model parameters.
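For reference, the two criteria in their standard textbook form are given below. Here $\ell(\hat{\boldsymbol{\beta}})$ is the maximized (conditional) log-likelihood, $k$ is the number of estimated parameters (for example $k = p + 1$ for a LAR(p) model with an intercept), and $N$ is the number of observations contributing to the likelihood. These symbols are assumed for illustration, and the paper's own definitions may be stated differently.

```latex
\begin{align*}
\mathrm{AIC}(p) &= -2\,\ell(\hat{\boldsymbol{\beta}}) + 2k, \\
\mathrm{BIC}(p) &= -2\,\ell(\hat{\boldsymbol{\beta}}) + k \log N.
\end{align*}
```

Under either criterion, the order with the smallest value among the candidates would be selected.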

## References

- Gouveia et al. (2017) Gouveia, S., Scotto, M.G., Weiß, C.H., Ferreira, P.J.S.: Binary auto-regressive geometric modelling in a DNA context. Journal of the Royal Statistical Society: Series C (Applied Statistics) 66(2), 253–271 (2017)
- Chen et al. (2018A) Chen, P., Jiao, J., Xu, M., Gao, X., Bischak, C.: Promoting active student travel: A longitudinal study. Journal of Transport Geography 70, 265–274 (2018)
- Chen et al. (2018B) Chen, P., Sun, F., Wang, Z., Gao, X., Jiao, J., Tao, Z.: Built environment effects on bike crash frequency and risk in Beijing. Journal of Safety Research 64, 135–143 (2018)
- Guo et al. (2018) Guo, Y., Wang, Y., Marin, T., Kirk, E., Patel, R., Josephson, C.: Statistical methods for characterizing transfusion-related changes in regional oxygenation using Near-infrared spectroscopy (NIRS) in preterm infants. arXiv preprint arXiv:1801.08153 (2018)
- Kedem (1994) Kedem, B.: Time Series Analysis by Higher Order Crossings. IEEE Press (1994)
- Kedem (1980) Kedem, B.: Binary Time Series. Marcel Dekker (1980)
- Diggle et al. (1994) Diggle, P.J., Liang, K.-Y., Zeger, S.L.: Analysis of Longitudinal Data. Oxford Univ. Press (1994)
- Fahrmeir & Tutz (1994) Fahrmeir, L., Tutz, G.: Multivariate Statistical Modelling Based on Generalized Linear Models. Springer-Verlag (1994)
- Gao et al. (2017) Gao, X., Shahbaba, B., Ombao, H.: Modeling Binary Time Series Using Gaussian Processes with Application to Predicting Sleep States. arXiv preprint arXiv:1711.05466 (2017)
- Dodge (2003) Dodge, Y.: The Oxford Dictionary of Statistical Terms. OUP (2003)
- Keenan (1982) Keenan, D.: A time series analysis of binary data. J. Am. Statist. Assoc. 77(380), 816–821 (1982)
- Billingsley (1961) Billingsley, P.: Statistical Inference for Markov Processes. University of Chicago Press (1961)
- Meyn & Tweedie (2012) Meyn, S., Tweedie, R.: Markov Chains and Stochastic Stability. Springer London (2012)
- Bonney (1987) Bonney, E.: Logistic regression for dependent binary observations. Biometrics 43, 951–973 (1987)
- Fahrmeir & Kaufmann (1987) Fahrmeir, L., Kaufmann, H.: Regression models for nonstationary categorical time series. J. Time Series Anal. 8, 147–160 (1987)
- Kaufmann (1987) Kaufmann, H.: Regression models for nonstationary time series: asymptotic estimation theory. Ann. Statist. 15, 79–98 (1987)
- Muenz & Rubinstein (1985) Muenz, L., Rubinstein, L.: Markov models for covariate dependence of binary sequences. Biometrics 41, 91–101 (1985)
- Fokianos & Kedem (2003) Fokianos, K., Kedem, B.: Regression theory for categorical time series. Statistical Science 18(3), 357–376 (2003)
- Kedem & Fokianos (2002) Kedem, B., Fokianos, K.: Regression Models for Time Series Analysis. John Wiley & Sons, Inc. (2002)
- de Vries et al. (1994) de Vries, S.O., Fidler, V., Kuipers, W., Hunink, M.: Fitting multistate transition models with autoregressive logistic regression: Supervised exercise in intermittent claudication. Medical Decision Making 18(1), 52–60 (1998)
- Hauck & Donner (1977) Hauck, W.W., Jr., Donner, A.: Wald's test as applied to hypotheses in logit analysis. Journal of the American Statistical Association 72, 851–853 (1977)
- Newcombe (1998) Newcombe, R.G.: Interval estimation for the difference between independent proportions: comparison of eleven methods. Statistics in Medicine 17, 873–890 (1998)
- Agresti & Min (2001) Agresti, A., Min, Y.: On Small-Sample Confidence Intervals for Parameters in Discrete Distributions. Biometrics 57(3), 963–971 (2001)
- Fokianos & Kedem (1998) Fokianos, K., Kedem, B.: Prediction and classification of non-stationary categorical time series. J. Multivariate Anal. 67, 277–296 (1998)
- Startz (2012) Startz, R.: Binomial Autoregressive Moving Average Models With an Application to U.S. Recessions. Journal of Business & Economic Statistics 26(1), 1–8 (2012)
- Davis et al. (2000) Davis, R., Dunsmuir, W., Wang, Y.: On autocorrelation in a Poisson regression model. Biometrika 87(3), 491–505 (2000)
- Entringer et al. (2015) Entringer, S., Epel, E., Lin, J., Blackburn, E., Bussa, C., Shahbaba, S., Gillen, D., Venkataramanan, R., Simhan, H., Wadhwa, P.: Maternal Folate Concentration in Early Pregnancy and Newborn Telomere Length. Annals of Nutrition and Metabolism 66, 202–208 (2015)
- Barrett (2012) Barrett, K.: Ganong’s Review of Medical Physiology. LANGE Basic Science (2012)
- Holmes & Rahe (1967) Holmes, T., Rahe, R.: The social readjustment rating scale. Journal of Psychosomatic Research 11(2), 213–218 (1967)
- Katz (1981) Katz, R.: On some criteria for estimating the order of a Markov chain. Technometrics 23(3), 243–249 (1981)