Modeling Daily Seasonality of Mexico City Ozone using Nonseparable Covariance Models on Circles Cross Time

Modeling Daily Seasonality of Mexico City Ozone using Nonseparable Covariance Models on Circles Cross Time

Philip A. White Corresponding Author: Department of Statistical Science, Duke University, Durham, NC, USA Emilio Porcu School of Mathematics, Statistics and Physics; Chair of Spatial Analytics Methods (SAM), Newcastle University, UK

Mexico City tracks ground-level ozone levels to assess compliance with national ambient air quality standards and to prevent environmental health emergencies. Ozone levels show distinct daily patterns, within the city, and over the course of the year. To model these data, we use covariance models over space, circular time, and linear time. We review existing models and develop new classes of nonseparable covariance models of this type, models appropriate for quasi-periodic data collected at many locations. With these covariance models, we use nearest-neighbor Gaussian processes to predict hourly ozone levels at unobserved locations in April and May, the peak ozone season, to infer compliance to Mexican air quality standards and to estimate respiratory health risk associated with ozone. Predicted compliance with air quality standards and estimated respiratory health risk vary greatly over space and time. In some regions, we predict exceedance of national standards for more than a third of the hours in April and May. On many days, we predict that nearly all of Mexico City exceeds nationally legislated ozone thresholds at least once. In peak regions, we estimate respiratory risk for ozone to be 55% higher on average than the annual average risk and as much at 170% higher on some days.

Keywords: Bayesian inference, circle, environmental health, nonseparable covariance function, Mexico City, pollution monitoring


1.7 \newfloatcommandcapbtabboxtable[][\FBwidth]

1 Introduction

Ground-level ozone is linked with short- and long-term health risks in general (see, e.g., Lippmann, 1989; Salam et al., 2005; Bell et al., 2006; Weschler, 2006) and in Mexico City specifically (see Mage et al., 1996; Romieu et al., 1996; Hernández-Garduño et al., 1997; Loomis et al., 1999; Bravo-Alvarez and Torres-Jardón, 2002; Barraza-Villarreal et al., 2008; Riojas-Rodríguez et al., 2014). In statistics, many have studied daily ozone levels (see, e.g., Sahu et al., 2007; Berrocal et al., 2010; Huang et al., 2018). However, because short-term spikes in ozone levels are associated with increased respiratory health risk, some argue that using finer time scales, normally hourly, and treating time as continuous, is essential to quantify short-term health risks (Chiogna and Pauli, 2011; Arisido, 2016).

The Mexico City ozone monitoring data presented here consist of hourly measurements from 24 stations from April and May, Mexico City’s peak ozone season. Of primary interest in this analysis are predicted ozone levels at unmonitored locations and associated compliance to national ambient air quality standards and respiratory health risk. For compliance, we discuss the probability of exceeding nationally legislated limits, 95 parts per billion (ppb) for hourly ozone and 70 ppb for eight-hour average ozone (Diario Oficial de la Federación, 2014). For health outcomes, we use methods in Chiogna and Pauli (2011) and compare respiratory health risks during the peak ozone season to the average ozone risk over 2017. Our model must enable these goals. First, our model must allow space-time predictions, accounting for daily seasonality or periodicity. With these predictions, we make probabilistic assessments of compliance and respiratory health risk.

Here, we provide three primary contributions. To account for the daily pattern of ozone in the Mexico City data, we envision time as a quantity that lies on a circle and the real line (i.e. a 24-hour clock and linear time) (see Figure 1 for illustration). To develop appropriate models for this, we extend the current literature on covariance functions for circles cross time and demonstrate that examples from these new classes have better predictive performance for the Mexico City data than current alternatives. Then, we discuss modeling details for Vecchia models that rely on neighbor selections for model fitting and prediction (Vecchia, 1988). This problem has not been addressed in the literature for periodic data and is not as simple as covariance models that decay monotonically over space and time. Lastly, we use this modeling framework to assess compliance with air quality standards and respiratory health risks.

Figure 1: This represents our two ideas of temporal differences, using both raw (linear) differences and circular differences (angular difference) in time.

Periodic or seasonal data, as we define it, rise and fall with some fixed period or alternatively has covariance that follows a periodic pattern. With this loose definition of seasonality, we include covariance functions that dampen as a function of time while exhibiting seasonality, for which we borrow the term quasi-periodic from Solin and Särkkä (2014). The Mexico City ozone data exhibits this type of pattern, as we discuss in Section 2.

Treating seasonal data in discrete time is common (for reference, see West and Harrison, 1997; Prado and West, 2010; Shumway and Stoffer, 2017). These types of models have even been applied to Mexico City ozone levels (Huerta et al., 2004; White et al., 2018) but were used with different goals in mind. Because prediction at unobserved locations and potentially unobserved times is essential for our modeling goals, we argue for continuous space and continuous time models.

Although the discussion on positive-definite functions on spheres dates to Schoenberg (1942), we take the work of Gneiting (2013) as our starting point. Extensions of this work to spheres cross time include Porcu et al. (2016) and White and Porcu (2018), while Shirota et al. (2017) developed similar covariance models for circles cross time to account for time-of-day effects for modeling crime event data. We further this discussion by proposing new classes of nonseparable covariance functions specific for circles cross time and by extending current covariance classes to this setting. For discussion on tests for correlation functions on circles, see Gneiting (1998).

For scalable Gaussian models, we use methods that induce sparsity in the inverse covariance matrix. First proposed by Vecchia (1988) and then extended by Stein et al. (2004), Stein (2005), Bevilacqua et al. (2012), Gramacy and Apley (2015), Datta et al. (2016a), and Katzfuss and Guinness (2017), these methods reduce computational expense by assuming conditional independence given a neighborhood (conditioning) set. We refer to these approaches as Vecchia models and focus specifically on the nearest-neighbor Gaussian process (Datta et al., 2016a).

Discussion about periodic data for neighbor-based approaches is lacking in the literature. Stein (2005) considers daily data where periodicity is not present and conditions on most recent times. Datta et al. (2016b) work on the same time scale and argue using dynamic neighborhood sets that select the most correlated neighbors; however, their approaches do not easily generalize to our setting. In this article, we open this discussion for quasi-periodic covariance models.

We continue by discussing the Mexico City pollution monitoring data in Section 2. Then, we discuss appropriate covariance classes for these data and present new covariance classes for circles cross time in Section 3. Using these and other covariance classes, we discuss modeling details using Bayesian nearest-neighbor Gaussian processes models in Section 4. We motivate this discussion in the Online Supplement using a single time-series. We present the results of our analysis in Section 5, addressing compliance to Mexican ambient air quality standards and respiratory health risks associated with ground-level ozone. Lastly, we give concluding remarks and discuss future extensions in Section 6.

2 Mexico City Ozone Monitoring Data

In this dataset, we have hourly ozone measurements for April and May of 2017 at monitoring stations across Mexico City, Mexico. At each station, we have measurements at times, giving observations in total111Although ozone levels are given hourly, these hourly quantities are derived is an average of the 60 minute-by-minute measurements. Missing measurements were imputed prior to receiving the data using the measurements at the nearest station within the same region. If no simultaneous measurements in the same region were available, then the missing measurement was filled in using the nearest station in a different region.. Relative humidity (RH) and temperature (TMP) are also measured hourly at the same 24 stations, and these are used as explanatory variables.

Ozone is formed by volatile nitrous oxides exposed to heat and sunlight (Sillman, 1999); thus, ozone levels peak during the warmest hours of the day. Ozone levels are held in check by rain that clears out pollutants. To use heat and rain as explanatory variables, we include hourly temperature and relative humidity as covariates for models in Sections 5 and the Online Supplement.

First, we look at variability in ozone levels across the city. To do this, we plot station locations with their mean ozone levels in Figure 2 using the ggmap package in R (Kahle and Wickham, 2013). In general, ozone decreases as we move northward; however, this trend is not uniform, as central Mexico City has the lowest ozone values. Peak ozone levels in the south are largely explained by a wind corridor that flows northeast to southwest, moving ozone and ozone precursors produced along this wind path to southern parts of Mexico City. Additionally, ozone is often trapped in the south by mountains on Mexico City’s southwest boundary.

Figure 2: Station locations with mean ozone coded by the color of the point with low to high ozone indicated by cyan to red.

We explore how the data vary over time, Figure 3 displays the mean and maximum ozone concentration for each day, showing that ozone concentrations peak in May. Figure 3 plots ozone averages as a function of hour of the day and demonstrates a clear peak in ozone levels around 2:00 or 3:00 p.m.

To examine whether the daily pattern in ozone can be explained by temperature and humidity, we fit a linear model to ozone using relative humidity and temperature as covariates. Then, we examine the autocorrelation of the model residuals for each of the 24 locations (See Figure 3). For each site, the temporal autocorrelation pattern peaks every 24 hours but decays overall as a function of time. Thus, a purely periodic function or purely decaying covariance model would be insufficient for these data (we demonstrate this empirically in both our temporal analysis in the Online Supplement and Section 5). These plots motivate our theoretical discussion in Section 3.

\thesubsubfigure Mean and max over days
\thesubsubfigure Mean ozone over hours
\thesubsubfigure ACF for ozone
Figure 3: (Left) Mean and maximum ozone levels for all stations each day (Center) Mean ozone over the hour of the day, averaging over days and stations (Right) Autocorrelation function (ACF) of the model residuals for each station using relative humidity and temperature as explanatory variables. Each color represents the ACF of each station.

3 Covariance Models

3.1 Covariance Modeling Approach

We start with the notation and background needed for a neat exposition of our results. We denote as the unit circle and use the mapping as the angle (great circle distance) between any two points on . We use the space to capture the circular nature of time exhibited in daily patterns in ozone levels. We use as the spatial domain, a projection of a small portion of the sphere onto a plane. Throughout, we denote as a weakly stationary space-time Gaussian random field.

The assumption of Gaussianity implies that finite dimensional subsets of are uniquely determined by its mean and the covariance function , defined as where and are the spatial and the temporal lag, respectively. Our novel contribution is to use the temporal lag to account for differences in raw time ( itself) and time-of-day, which we represent as the angle on a circle. Thus, we assume


for some mapping . Constructing covariance functions of the type (1) is challenging, and we propose covariance functions that are partially nonseparable. As an abuse of notation, we write for and for and consider the following nonseparabilities:

(A) nonseparability.

Here, ;

(B) nonseparability,

obtained when ;

(C) nonseparability,

where .

Apparently, , are covariance functions on their respective subspaces. For the Mexico City data, we also consider complete separability, , to motivate the need of nonseparable models. We detail some approaches for constructions (A), (B) and (C).

For ease of illustration, we first define Gneiting functions, , according to Gneiting (2002), as follows:


Following Porcu et al. (2016), we define the modified Gneiting class, , as


For both classes, the function is completely monotonic; that is, is infinitely differentiable on , satisfying , . The function is strictly positive and has a completely monotonic derivative. Here, denotes the restriction of to the interval . We provide selections for and in Tables 4 and 5 of the Online Supplement. The class cannot not capture seasonality or periodicity but is appropriate for strategy (B), i.e. (see Section B.2 for examples). We limit our selections of this type to , although many other examples exist in the literature. For strategy (A), arguments in Theorem 2 in White and Porcu (2018) ensure to provide a valid covariance function, and Theorem 1 in Porcu et al. (2016) shows that is a valid covariance function. These classes can also be used for in approach . Examples for and are given in Appendix B.1.

The marginal covariance is taken from the Matérn class , defined as


where is the MacDonald function (Gradshteyn and Ryzhik, 2007). Well known arguments show that is a valid choice (Stein, 1999) for any . The exponential model on , , induces conditional independence between random effects given the time immediately previous, regardless of spacing (see the Online Supplement for some discussion). A severe restriction on the parameter is required for to be a valid covariance (see Gneiting, 2013): . The Matérn function can also be used to generate valid covariance on any -dimensional Euclidean space (Stein, 1999). Choices for the functions for construction (C) are the core of our theoretical contributions.

3.2 Covariance Functions for Circles Cross Time

Literature on periodic covariance functions is sparse in general, with some exceptions. For fixed period , MacKay (1998) embeds the temporal mapping, , in the squared exponential correlation function to obtain a periodic covariance model. Rasmussen and Williams (2006) propose quasi-periodic covariance functions by taking the product of periodic and decaying covariance functions. Solin and Särkkä (2014) show an explicit link between some periodic and separable quasi-periodic Gaussian process specifications and state-space models. Most discussion about periodic covariance models is, however, limited to separable models like those discussed. To provide more general nonseparable covariance models for periodic data, we turn to models on circles cross time.

For hourly data, we map hour-of-day onto a 24-hour clock or circle. As discussed, there are rich covariance classes on spheres and spheres cross time (e.g., Gneiting, 2013; Porcu et al., 2016; White and Porcu, 2018). Shirota et al. (2017) discuss nonseparable covariance classes on (circles cross space) for crime event data using log-Gaussian Cox processes. However, for our data, purely circular classes from Gneiting (2013) and Shirota et al. (2017) are too limited because they do not dampen over time.

The result we present is motivated by autocorrelation patterns in Figure 3 in Section 2, exhibiting both periodicity and decay. Specifically, we propose two classes on nonseparable covariance functions on . For these classes, we define the variogram of an intrinsically stationary process, , as , . We also recall that, for any covariance function on , the ratio is a correlation function. Lastly, we define as the hyperbolic sine function.

Theorem 1.

Let be a variogram and be a correlation function.


Let be defined as


Then, is a correlation function.


Let be defined as


Then, is a correlation function.

Proof of Theorem 1 is deferred to Appendix A. Examples of variograms are reported in Tables 3 and 5 in the Online Supplement. We give Equations (21), (22), and (23) in Appendix B.1 as examples from Theorem 1. These models do not decay to zero at gets large. If eventual decay to zero is preferred, then we could construct a model , where decays to zero as a function of time, but this construction did not improve predictive performance for these data.

4 Methods and Models

4.1 Nearest-Neighbor Gaussian Process

We envision a model for the Mexico City data as


where is square-root ozone measured at the location-time pair , are relative humidity and temperature measurements, are regression coefficients, are space-time random effects, and is Gaussian noise. We found that using square-root ozone reduced correlation between the variance of model residuals and predicted mean. Additionally, modeling on the square-root scale led to better predictive performance than modeling on the original scale (see also Sahu et al., 2007; Berrocal et al., 2010).

Space-time random effects for point-referenced data are often specified through a functional prior using a Gaussian process (GP) (see, e.g., Rasmussen and Williams, 2006; Banerjee et al., 2014). Likelihood computations for hierarchical Gaussian process models require inverting a square matrix that is the same size as the data, making Gaussian process models intractable with even moderate amounts of data (for example, ).

Many have addressed this computational bottleneck using either low-rank or sparse matrix methods (see Heaton et al., 2018, for a review and comparison of these methods). Low-rank methods project the original process onto representative points or knots (see, e.g., Higdon, 2002; Banerjee et al., 2008; Cressie and Johannesson, 2008; Stein, 2008); however, these approaches often perform poorly for prediction as they often over-smooth (see Stein, 2014; Heaton et al., 2018).

Alternatively, sparse methods either induce zeros in the covariance matrix using compactly supported covariance functions (see, e.g., Furrer et al., 2006; Kaufman et al., 2008) or in the precision matrix by assuming conditional independence (Vecchia, 1988; Stein et al., 2004). We ultimately favor conditional independence approaches because predictive performance is generally better, and the class of valid covariance models is more expansive.

Sparse precision methods date to Vecchia (1988) and were furthered by Stein et al. (2004) and Bevilacqua et al. (2012) to approximate the likelihood of the full GP model using conditional likelihoods. Gramacy and Apley (2015) and Datta et al. (2016a) extend this work to process modeling. The nearest-neighbor Gaussian process is itself a Gaussian process (Datta et al., 2016a) and has good predictive performance relative to other fast FP methods (See Heaton et al., 2018).

If we take a parent Gaussian process over , the nearest-neighbor Gaussian process induces sparsity in the precision matrix of the parent process by assuming conditional independence given neighborhood sets (Datta et al., 2016a). Let of distinct location-time pairs denote the reference set, where time gives natural ordering and spatial ordering induced for observations at identical times. In our case, we order space by latitude, south to north, and take observed data as the reference set. For model validation, we include held-out location-time pairs in the reference set, making prediction part of model fitting.

We define neighborhood sets on the reference set, where consists of nearest-neighbors of chosen from . Evidently, if , . Together, and define a Gaussian directed acyclic graph (DAG) with joint distribution


where , , and is the subset of corresponding to neighbors (Datta et al., 2016a). Like other GP models, the NNGP can be used for hierarchically spatial random effects.

4.2 Neighbor Selection

Optimal neighbor selection is not simple, and “best” subsets vary depending on covariance function and associated parameters (see, e.g., Vecchia, 1988; Datta et al., 2016b). Defining nearest-neighbors only in terms of distance can lead to models that are uniformly sub-optimal compared to other neighbor selections (Stein et al., 2004), but nearest-neighbor approaches are often used for simplicity. Gramacy and Apley (2015) provide a thorough discussion on how to improve upon nearest-neighbor selections.

For daily monitoring data, Stein (2005) argues using most-recent neighbors. In the same setting, Datta et al. (2016b) discuss two approaches for selecting neighbors: simple neighbor selection and dynamic neighbor selection. For neighbors, simple selection involves choosing nearest spatial neighbors over the most recent times, including current times. For dynamic neighbor selection, neighbors change within model fitting to select the most correlated neighbors. In some nonseparable space-time settings, possible neighbors can be enumerated simply, limiting the computation burden of dynamic neighbor selection (Datta et al., 2016b). Enumeration of possible neighbors for periodic time and quasi-periodic time covariance models is intractable because the covariance function has many local maxima over time.

Even for fixed periods, the pattern of covariance seasonality varies greatly depending on the model and its parameters. Thus, neighborhood selection may be specific to the model and data. Our argument is simple. Including periodic peaks, as well as locations near periodic peaks improves predictive performance. We provide an empirical discussion to this point in the Online Supplement.

4.3 Priors, Prediction, and Model Comparison

We begin here by discussing the prior distributions for our model parameters. We use inverse-gamma prior distributions for and with 2.1 and 10 for the shape and rate parameters. For square-root ozone concentrations, these selections are not overly informative. For compactly supported covariance parameters, we use uniform prior distributions. For parameters that are strictly positive, we use gamma prior distributions with shape and rate of 0.01. We assume that regression coefficients a priori follow independent normal distributions with mean zero and variance . For this model formulation, model fitting details can follow either Datta et al. (2016a) or the collapsed sampler in Finley et al. (2018).

Predictions at unobserved times and locations can depend on any subset of the reference set . Because this is a retrospective analysis, we use past and future lags for prediction, where we again consider neighbor selection carefully to account for seasonality. In our case, we use the five nearest spatial neighbors with 1, 2, 23, 24, 25, and 168-hour lags back and forward, as well as simultaneous spatial neighbors. Random effects at unobserved location-time pairs have a conditional normal distribution given by


Then, the posterior prediction done using posterior samples of , , and via composition sample (see, e.g., Gelman et al., 2014). To obtain estimates of at unmonitored locations, we weight the 24 simultaneously observed covariates proportional to inverse squared distance. Predictions at held-out location-time pairs are used to compare competing models.

We fit models on the square-root scale but compare models using predictive criteria on the original scale. To validate our space-time model, we hold out all observations (i.e. all stations) at 20% of the observed hours and compare models using root mean squared prediction error (RMSPE), mean absolute prediction error (MAPE), % prediction interval coverage (CVG), and continuous ranked probability scores (CRPS) (Gneiting and Raftery, 2007), where


We use posterior predictive samples for a Monte Carlo approximation (see, e.g., Krüger et al., 2016),

We then average over held-out data. Because CRPS considers how well the entire predictive distribution matches the observed data rather than only the predictive mean (MAPE and RMSPE) or quantiles (CVG). Furthermore, CRPS is a proper scoring rule (Gneiting and Raftery, 2007); thus, we prefer it as a selection criterion.

To assess predictions for all observations at a held-out hour jointly, we consider the energy score (ES), a multivariate generalization of CRPS. For a set of multivariate predictions Y, ES is


where y is an observation, , and is a probability measure (Gneiting and Raftery, 2007). It is common to fix (see, e.g., Gneiting et al., 2008; Jordan et al., 2017). For a set of MCMC predictions for a held-out observation y, the empirical ES reduces to

as was done in Gneiting et al. (2008). Like CRPS, ES is a proper scoring rule (Gneiting and Raftery, 2007). To obtain a single value, average ES over held-out data.

4.4 Ozone Exposure Respiratory Risk Assessment

We use the daily respiratory risk model presented in Chiogna and Pauli (2011) to assess risk attributable to ground-level ozone. In their paper, Chiogna and Pauli (2011) model the short-term effects of summer ozone levels on all hospital admissions and respiratory hospital admissions. For this goal, they compare 115 models using cross-validation. Although the risk model of Chiogna and Pauli (2011) does not include space, it can be applied location-wise.

Chiogna and Pauli (2011) found that the best model included three measures associated with an ozone threshold of ppb: number of the hours exceeding on day (we call this ), the difference between the maximum daily ozone level and (max - threshold) , and the lagged average nighttime ozone from the last three days (nighttime is defined to be 9 PM and 8 AM). We used fewer days to calculate if data from the last three nights were not available (e.g. April 2 only has one night to estimate risk). Then, the relative risk of respiratory hospital admissions on day at location s is


where and are regression coefficients for and , respectively. The scaling factor makes the average risk at the 24 ozone monitoring stations over the year equal to one. The scale of the coefficients differs from those presented in Chiogna and Pauli (2011) because we use ozone levels in ppb instead of .

By adopting this model, we make many assumptions that we outline here. First, we assume that the effects of ozone in Milano, Italy are like those in Mexico City, Mexico. While the marginal effect may be the same, it is likely that the effects of ozone are influenced by other factors like other pollutants and weather. Second, we adopt their choice of threshold where ozone begins to be harmful (they consider three thresholds), although several question using thresholds altogether (See, e.g., Kim et al., 2004). Additionally, we consider the difference between the maximum daily level compared to a threshold instead of integrated ozone as was used in Arisido (2016). Noting these limitations, this model enables us to estimate respiratory risk on a daily level using predicted hourly ozone levels.

5 Results and Discussions

To select a covariance model, we hold out all observations at 20% of hours in April and May, as discussed in Section 4. This hold-out approach allows us to score models using the energy score, as well as univariate criteria like CRPS, RMSPE, MAPE, and 90% prediction interval coverage. While RMSPE and prediction interval coverage are the most common criteria, we rely upon CRPS and ES most because they are proper scoring rules (Gneiting and Raftery, 2007) and compare the whole predictive distribution to held-out values. Out-of-sample predictions are made using 25,000 posterior samples after a burn-in of 5,000 iterations.

We present results for a variety of covariance models on or a subset of this space. These models are given in Table 1. We first consider a completely separable covariance model on , Model 1, which is used to motivate the use of nonseparable models. We then consider two examples from Gneiting (2002). The first, given as Model 2, ignores periodicity. We also consider this same covariance model multiplied by an exponential covariance function that takes as an argument (Model 3), an example of construction (B) in Section 3. Then, we consider two examples based on the class of nonseparable covariance models presented by Shirota et al. (2017) (Models 4 and 5), where Model 4 ignores temporal decay and Model 5 is an example of construction (A). Then, we give two examples of nonseparable circular cross linear time covariance models, one from White and Porcu (2018) and the other from Theorem 1, where both models are multiplied by an exponential spatial covariance function (Models 6 and 7). Both Models 6 and 7 are examples of construction (C). Other models from these classes were considered; however, these represent the models with the best predictive performance.

All models except Model 4 use the five nearest spatial neighbors (including the location itself) with 1, 2, 23, 24, 25, and 168-hour lags. We use neighbors at the same time as the observation but exclude the observation itself. Thus, we condition on at most 34 neighbors. For Model 4, we exclude lags 24 and 168 at the location itself to maintain positive-definite covariance matrices; instead, we include lags 3, 167, and 169 to compensate. The results of the model comparison are given in Table 1.

The results highlight the importance of modeling the entire space () and utilizing nonseparable models. Models that excluded one of these subspaces (Models 2 and 4) had the worst predictive performance. The completely separable model (Model 1) was next worse in prediction, motivating the need for nonseparable models. Each of the constructive approaches, denoted (A), (B), and (C), proposed in Section 3 performed well in prediction. Ultimately, Model 7, derived from Theorem 1 and using construction (C), performed best in terms of ES, CRPS, MAPE, and RMSPE for these data. Thus, we use this model for prediction at unmonitored locations.

Explicitly, the final covariance model is


where , and . The parameter governs the temporal range, is the spatial range parameter, and is a smoothness parameter. The parameter is most accurately understood as the spatial range for simultaneous observations, and is the temporal range at a fixed location. One benefit of this model is that it has few parameters compared with many of its competitors.

Posterior summaries for regression coefficients and covariance parameters are given in Table 2. The regression parameters agree with our hypotheses that relative humidity is negatively related to ozone levels and the temperature is positively associated with ozone. The mean of the spatial range parameter is 22.32 km. The posterior mean of the temporal range parameter is 86.90 hours, meaning that correlation for this model persists over several days. However, these range parameters have limited interpretability, as discussed above.

We limit our predictions to the convex hull of the monitoring network to not spatially extrapolate. Using these posterior predictions in April and May, we can assess probabilities of exceeding Mexican ambient air quality standards and examine respiratory risk rates during Mexico City’s peak ground-level ozone season.

To summarize exceedance results temporally, we plot posterior means and 95% credible intervals for the estimated proportion of the city that exceeds national ozone standards at least once on that day in Figure 4. In April, there are several peaks in exceedance probability; however, from May 6 to May 28, the proportion of the city with at least one daily ozone exceedance is near one.

For each predictive location, we plot the estimated proportion of hours when Mexican ground-level ozone standards (either one-hour or eight-hour ozone) are exceeded in Figure 5. This follows the pattern in Figure 2, but the degree of variation between locations is striking. We estimate that locations in southern Mexico City exceed national ozone regulations nearly three times as often as north, west, and central Mexico City.

Figure 4: Estimated proportion of the city that exceeds either one-hour and eight-hour average ozone limits at least once that day.

In Figures 5, we give the mean relative risk averaged over all days in April and May, and we plot the estimated maximum relative risk over the same time period in Figure 5. Although the spatial patterns of the means and maxima are similar, the scale of these plots differs significantly. In terms of mean risk, the most extreme regions have respiratory risks 50% higher than the least extreme regions. In contrast, regions of the highest maximum risk in the peak season are nearly twice that of regions with the lowest maximum risk. These plots demonstrate the degree of increased health risk determined solely by where one lives or works.

\thesubsubfigure Exceedance probability
\thesubsubfigure Mean risk
\thesubsubfigure Maximum risk
Figure 5: Spatial summaries of (Left) exceedance probability and (Center and Right) respiratory risk for ozone. The exceedance probability considers both one-hour and eight-hour average ozone limits.

The estimated mean and maximum respiratory health risk over Mexico City (with 95% credible intervals) are given as a function of the day in Figure 6. Ozone risk peaks May 17 to May 23 in terms of both the mean and maximum, but the changes in the maximum risk are much more drastic than those in mean risk. These maxima correspond to extreme ozone levels in south Mexico City where the estimated risk is 2.7 times the annual average and nearly two times the mean risk on the same day.

These results highlight how extreme the space-time variability in both exceedance probability and respiratory health risk is within Mexico City during its peak ozone season. According to the risk model from Chiogna and Pauli (2011), those living in regions with extreme ozone levels are about 50% more likely to be admitted to the hospital due to ozone exposure compared to those in less polluted areas. Although this is not a component of the risk model, increased respiratory risk impacts at-risk populations (e.g. the elderly) at a much higher rate than healthier sub-populations (see, e.g., Bell et al., 2004).

Figure 6: Estimated mean and maximum respiratory risk over the city.

6 Discussion and Conclusions

We have discussed the monitoring network for ozone and analyzed these data in April and May of 2017. For these data, we develop new classes of covariance models that account for daily seasonality. We apply these covariance models to the Mexico City ozone data using nearest-neighbor Gaussian processes and select a model using predictive criteria on a hold-out dataset. This model is then used for prediction to assess compliance with Mexican ambient air quality standards and respiratory risk due to ozone exposure. In this analysis, we identify regions and times where exceedance probabilities and ozone risk factors are particularly high.

An interesting follow-up study would account for people’s movement over the day, giving individualized risk assessments. In addition to this, we could combine our modeling approaches with hospital admission data in Mexico City, allowing us to establish more accurate respiratory risk measures for Mexico City, rather of relying on other methods.

Further theoretical work could address how the constraints on the functions and in Shirota et al. (2017) can be relaxed while preserving positive-definiteness. Here, we have considered models that are partially nonseparable over but not over all three spaces. Thus, further theoretical work to establish such classes is justified.

Appendix A Proof of Theorem 1

We start by noting that arguments in Berg and Porcu (2017) show that the functions defined at points (I) and (II) are positive definite if and only if they can both be written as


where uniquely determined sequence of positive definite functions has . To show (I), we resort to in Gradshteyn and Ryzhik (2007):

The function in Equation (5) admits an expansion of the type (14) with coefficients for . It can be namely verified that is finite. Thus, the proof is completed. As for Assertion (II), Equation (6) comes straight by considering the expansion in [1.449.2] in Gradshteyn and Ryzhik (2007):

which is absolutely convergent provided . We now replace with the correlation function to obtain (6). The proof is completed.

Appendix B Covariance Examples

b.1 Temporal Modeling

We begin with two purely circular covariance examples from Gneiting (2013) in (15) and (16). For and , we define


In addition to the Matérn model presented in Section 3, we consider a pair of covariance models on from what we call the inverted Gneiting class in (Porcu et al., 2016). Let , and . Also, let and be positive parameters. Then


where , and where and belong to the interval . The second we consider uses the generalized Cauchy covariance function (Gneiting and Schlather, 2004, see Tables 4 and 5 in the Online Supplement) for the temporal margin,


Examples from White and Porcu (2018) include


with , and . For the second, we propose again a generalized Cauchy covariance function for the spatial margin, obtaining


where , , and .

From Theorems 1, we give three examples. Let and . Then, we propose


b.2 Space-Time Models

To create effective space-time models, we suggest combining models from Appendix B.1 with valid spatial models. Additionally, we provide examples of covariance models on and which can be combined with valid models on and given in Appendix B.1.

First, we look at nonseparable covariance models on from the so-called Gneiting class from Gneiting (2002) given in (2). From this class, we give two examples. For the first, we propose again a generalized Cauchy covariance function for the spatial margin, obtaining


for , and . To incorporate periodicity, giving a model on , one take the product of one of these models with, e.g., (4) or (15).

Shirota et al. (2017) extends these models to , replacing the temporal lag with the angular difference on a circle. We give one example in Equation (25). For the second, we propose again a generalized Cauchy covariance function for the spatial margin, obtaining


for , and . To incorporate a decaying covariance over linear time, one simply takes the product of one of these models with a covariance model on .


  • Arisido (2016) Arisido, M. W. (2016). Functional Measure of Ozone Exposure to Model Short-term Health Effects. Environmetrics, 27(5):306–317.
  • Banerjee et al. (2014) Banerjee, S., Carlin, B. P., and Gelfand, A. E. (2014). Hierarchical Modeling and Analysis for Spatial Data. CRC Press.
  • Banerjee et al. (2008) Banerjee, S., Gelfand, A. E., Finley, A. O., and Sang, H. (2008). Gaussian Predictive Process Models for Large Spatial Data Sets. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 70(4):825–848.
  • Barraza-Villarreal et al. (2008) Barraza-Villarreal, A., Sunyer, J., Hernandez-Cadena, L., Escamilla-Nuñez, M. C., Sienra-Monge, J. J., Ramírez-Aguilar, M., Cortez-Lugo, M., Holguin, F., Diaz-Sánchez, D., and Olin, A. C. (2008). Air Pollution, Airway Inflammation, and Lung Function in a Cohort Study of Mexico City Schoolchildren. Environmental Health Perspectives, 116(6):832.
  • Bell et al. (2004) Bell, M. L., McDermott, A., Zeger, S. L., Samet, J. M., and Dominici, F. (2004). Ozone and Short-term Mortality in 95 US Urban Communities, 1987-2000. Journal of the American Medical Association, 292(19):2372–2378.
  • Bell et al. (2006) Bell, M. L., Peng, R. D., and Dominici, F. (2006). The Exposure–Response Curve for Ozone and Risk of Mortality and the Adequacy of Current Ozone Regulations. Environmental Health Perspectives, 114(4):532.
  • Berg and Porcu (2017) Berg, C. and Porcu, E. (2017). From Schoenberg Coefficients to Schoenberg Functions. Constructive Approximation, 45(2):217–241.
  • Berrocal et al. (2010) Berrocal, V. J., Gelfand, A. E., and Holland, D. M. (2010). A Spatio-Temporal Downscaler for Output from Numerical Models. Journal of Agricultural, Biological, and Environmental Statistics, 15(2):176–197.
  • Bevilacqua et al. (2012) Bevilacqua, M., Gaetan, C., Mateu, J., and Porcu, E. (2012). Estimating Space and Space-Time Covariance Functions for Large Data Sets: A Weighted Composite Likelihood Approach. Journal of the American Statistical Association, 107(497):268–280.
  • Bravo-Alvarez and Torres-Jardón (2002) Bravo-Alvarez, H. and Torres-Jardón, R. (2002). Air Pollution Levels and Trends in the Mexico City Metropolitan Area. In Urban Air Pollution and Forests, pages 121–159. Springer.
  • Chiogna and Pauli (2011) Chiogna, M. and Pauli, F. (2011). Modelling Short-term Effects of Ozone on Morbidity: An Application to the Cty of Milano, Italy, 1995–2003. Environmental and Ecological Statistics, 18(1):169–184.
  • Cressie and Johannesson (2008) Cressie, N. and Johannesson, G. (2008). Fixed Rank Kriging for Very Large Spatial Data Sets. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 70(1):209–226.
  • Datta et al. (2016a) Datta, A., Banerjee, S., Finley, A. O., and Gelfand, A. E. (2016a). Hierarchical Nearest-Neighbor Gaussian Process Models for Large Geostatistical Datasets. Journal of the American Statistical Association, 111(514):800–812.
  • Datta et al. (2016b) Datta, A., Banerjee, S., Finley, A. O., Hamm, N. A., and Schaap, M. (2016b). Nonseparable Dynamic Nearest Neighbor Gaussian Process Models for Large Spatio-Temporal Data with an Application to Particulate Matter Analysis. The Annals of Applied Statistics, 10(3):1286–1316.
  • Diario Oficial de la Federación (2014) Diario Oficial de la Federación (2014). Norma Oficial Mexicana NOM-025-SSA1-2014.
  • Finley et al. (2018) Finley, A. O., Datta, A., Cook, B. C., Morton, D. C., Andersen, H. E., and Banerjee, S. (2018). Efficient Algorithms for Bayesian Nearest Neighbor Gaussian Processes. arXiv preprint arXiv:1702.00434.
  • Furrer et al. (2006) Furrer, R., Genton, M. G., and Nychka, D. (2006). Covariance Tapering for Interpolation of Large Spatial Datasets. Journal of Computational and Graphical Statistics, 15(3):502–523.
  • Gelman et al. (2014) Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., and Rubin, D. B. (2014). Bayesian Data Analysis, volume 2. CRC press.
  • Gneiting (1998) Gneiting, T. (1998). Simple Tests for the Validity of Correlation Function Models on the Circle. Statistics & Probability Letters, 39(2):119–122.
  • Gneiting (2002) Gneiting, T. (2002). Nonseparable, Stationary Covariance Functions for Space–Time Data. Journal of the American Statistical Association, 97(458):590–600.
  • Gneiting (2013) Gneiting, T. (2013). Strictly and Non-Strictly Positive Definite Functions on Spheres. Bernoulli, 19(4):1327–1349.
  • Gneiting and Raftery (2007) Gneiting, T. and Raftery, A. E. (2007). Strictly Proper Scoring Rules, Prediction, and Estimation. Journal of the American Statistical Association, 102(477):359–378.
  • Gneiting and Schlather (2004) Gneiting, T. and Schlather, M. (2004). Stochastic Models That Separate Fractal Dimension and the Hurst Effect. SIAM Rev., 46:269–282.
  • Gneiting et al. (2008) Gneiting, T., Stanberry, L. I., Grimit, E. P., Held, L., and Johnson, N. A. (2008). Assessing Probabilistic Forecasts of Multivariate Quantities, with an Application to Ensemble Predictions of Surface Winds. Test, 17(2):211.
  • Gradshteyn and Ryzhik (2007) Gradshteyn, I. S. and Ryzhik, I. M. (2007). Tables of Integrals, Series, and Products. Academic Press, Amsterdam, seventh edition.
  • Gramacy and Apley (2015) Gramacy, R. B. and Apley, D. W. (2015). Local Gaussian Process Approximation for Large Computer Experiments. Journal of Computational and Graphical Statistics, 24(2):561–578.
  • Heaton et al. (2018) Heaton, M. J., Datta, A., Finley, A., Furrer, R., Guhaniyogi, R., Gerber, F., Gramacy, R. B., Hammerling, D., Katzfuss, M., and Lindgren, F. (2018). Methods for Analyzing Large Spatial Data: A Review and Comparison. arXiv preprint arXiv:1710.05013.
  • Hernández-Garduño et al. (1997) Hernández-Garduño, E., Pérez-Neria, J., Paccagnella, A. M., Piña-García, M. A., Munguía-Castro, M., Catalán-Vázquez, M., and Rojas-Ramos, M. (1997). Air Pollution and Respiratory Health in Mexico City. Journal of Occupational and Environmental Medicine, 39(4):299–307.
  • Higdon (2002) Higdon, D. (2002). Space and Space-Time Modeling Using Process Convolutions. In Quantitative Methods for Current Environmental Issues, pages 37–56. Springer.
  • Huang et al. (2018) Huang, G., Lee, D., and Scott, E. M. (2018). Multivariate Space-Time Modelling of Multiple Air Pollutants and Their Health Effects Accounting for Exposure Uncertainty. Statistics in Medicine, 37(7):1134–1148.
  • Huerta et al. (2004) Huerta, G., Sansó, B., and Stroud, J. R. (2004). A spatiotemporal model for mexico city ozone levels. Journal of the Royal Statistical Society: Series C (Applied Statistics), 53(2):231–248.
  • Jordan et al. (2017) Jordan, A., Krüger, F., and Lerch, S. (2017). Evaluating Probabilistic Forecasts with the R Package scoringRules. arXiv preprint arXiv:1709.04743.
  • Kahle and Wickham (2013) Kahle, D. and Wickham, H. (2013). ggmap: Spatial Viualization with ggplot2. The R Journal, 5(1):144–161.
  • Katzfuss and Guinness (2017) Katzfuss, M. and Guinness, J. (2017). A General Framework for Vecchia Approximations of Gaussian Processes. arXiv preprint arXiv:1708.06302.
  • Kaufman et al. (2008) Kaufman, C. G., Schervish, M. J., and Nychka, D. W. (2008). Covariance Tapering for Likelihood-Based Estimation in Large Spatial Data Sets. Journal of the American Statistical Association, 103(484):1545–1555.
  • Kim et al. (2004) Kim, S.-Y., Lee, J.-T., Hong, Y.-C., Ahn, K.-J., and Kim, H. (2004). Determining the Threshold Effect of Ozone on Daily Mortality: An Analysis of Ozone and Mortality in Seoul, Korea, 1995–1999. Environmental Research, 94(2):113–119.
  • Krüger et al. (2016) Krüger, F., Lerch, S., Thorarinsdottir, T. L., and Gneiting, T. (2016). Probabilistic Forecasting and Comparative Model Assessment Based on Markov Chain Monte Carlo Output. arXiv preprint arXiv:1608.06802.
  • Lippmann (1989) Lippmann, M. (1989). Health Effects of Ozone a Critical Review. Japca, 39(5):672–695.
  • Loomis et al. (1999) Loomis, D., Castillejos, M., Gold, D. R., McDonnell, W., and Borja-Aburto, V. H. (1999). Air Pollution and Infant Mortality in Mexico City. Epidemiology, pages 118–123.
  • MacKay (1998) MacKay, D. J. (1998). Introduction to Gaussian Processes. NATO ASI Series F Computer and Systems Sciences, 168:133–166.
  • Mage et al. (1996) Mage, D., Ozolins, G., Peterson, P., Webster, A., Orthofer, R., Vandeweerd, V., and Gwynne, M. (1996). Urban Air Pollution in Megacities of the World. Atmospheric Environment, 30(5):681–686.
  • Porcu et al. (2016) Porcu, E., Bevilacqua, M., and Genton, M. G. (2016). Spatio-Temporal Covariance and Cross-Covariance Functions of the Great Circle Distance on a Sphere. Journal of the American Statistical Association, 111(514):888–898.
  • Prado and West (2010) Prado, R. and West, M. (2010). Time Series: Modeling, Computation, and Inference. CRC Press.
  • Rasmussen and Williams (2006) Rasmussen, C. E. and Williams, C. K. I. (2006). Gaussian Processes for Machine Learning. The MIT Press, Cambridge, MA.
  • Riojas-Rodríguez et al. (2014) Riojas-Rodríguez, H., Álamo-Hernández, U., Texcalac-Sangrador, J. L., and Romieu, I. (2014). Health Impact Assessment of Decreases in PM10 and Ozone Concentrations in the Mexico City Metropolitan Area: A Basis for a New Air Quality Management Program. Salud Pública de México, 56:579–591.
  • Romieu et al. (1996) Romieu, I., Meneses, F., Ruiz, S., Sienra, J. J., Huerta, J., White, M. C., and Etzel, R. A. (1996). Effects of Air Pollution on the Respiratory Health of Asthmatic Children lLving in Mexico City. American Journal of Respiratory and Critical Care Medicine, 154(2):300–307.
  • Sahu et al. (2007) Sahu, S. K., Gelfand, A. E., and Holland, D. M. (2007). High-Resolution Space–Time Ozone Modeling for Assessing Trends. Journal of the American Statistical Association, 102(480):1221–1234.
  • Salam et al. (2005) Salam, M. T., Millstein, J., Li, Y.-F., Lurmann, F. W., Margolis, H. G., and Gilliland, F. D. (2005). Birth Outcomes and Prenatal Exposure to Ozone, Carbon Monoxide, and Particulate Matter: Results from the Children’s Health Study. Environmental Health Perspectives, 113(11):1638.
  • Schoenberg (1942) Schoenberg, I. J. (1942). Positive Definite Functions on Spheres. Duke Math. Journal, 9:96–108.
  • Shirota et al. (2017) Shirota, S., Gelfand, A. E., et al. (2017). Space and Circular Time Log Gaussian Cox Processes with Application to Crime Event Data. The Annals of Applied Statistics, 11(2):481–503.
  • Shumway and Stoffer (2017) Shumway, R. H. and Stoffer, D. S. (2017). Time Series Analysis and Its Applications. Springer.
  • Sillman (1999) Sillman, S. (1999). The Relation Between Ozone, NOx and Hydrocarbons in Urban and Polluted Rural Environments. Atmospheric Environment, 33(12):1821–1845.
  • Solin and Särkkä (2014) Solin, A. and Särkkä, S. (2014). Explicit Link between Periodic Covariance Functions and State Space Models. In Artificial Intelligence and Statistics, pages 904–912.
  • Stein (1999) Stein, M. L. (1999). Interpolation of Spatial Data: Some Theory for Kriging. Springer Science & Business Media.
  • Stein (2005) Stein, M. L. (2005). Statistical Methods for Regular Monitoring Data. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(5):667–687.
  • Stein (2008) Stein, M. L. (2008). A Modeling Approach for Large Spatial Datasets. Journal of the Korean Statistical Society, 37(1):3–10.
  • Stein (2014) Stein, M. L. (2014). Limitations on Low Rank Approximations for Covariance Matrices of Spatial Data. Spatial Statistics, 8:1–19.
  • Stein et al. (2004) Stein, M. L., Chi, Z., and Welty, L. J. (2004). Approximating Likelihoods for Large Spatial Data Sets. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 66(2):275–296.
  • Vecchia (1988) Vecchia, A. V. (1988). Estimation and Model Identification for Continuous Spatial Processes. Journal of the Royal Statistical Society. Series B (Methodological), pages 297–312.
  • Weschler (2006) Weschler, C. J. (2006). Ozone’s Impact on Public Health: Contributions from Indoor Exposures to Ozone and Products of Ozone-Initiated Chemistry. Environmental Health Perspectives, 114(10):1489.
  • West and Harrison (1997) West, M. and Harrison, J. (1997). Bayesian Forecasting and Dynamic Models. Springer-Verlag New York, Inc., New York, NY, USA.
  • White et al. (2018) White, P. A., Gelfand, A. E., Rodrigues, E. R., and Tzintzun, G. (2018). Pollution State Modeling for Mexico City. arXiv preprint arXiv:1807.03935.
  • White and Porcu (2018) White, P. A. and Porcu, E. (2018). Towards a Complete Picture of Covariance Functions on Spheres Cross Time. arXiv preprint arXiv:1807.04272.
Model Equation Space ES CRPS MAPE RMSPE 90% CVG
1 21.518 3.530 4.752 6.611 0.914
2 (24) 24.889 4.139 5.645 7.554 0.932
3 (24) 19.941 3.238 4.531 6.085 0.903
4 (25) 24.726 4.295 5.851 7.890 0.914
5 (25) 19.773 3.249 4.309 6.050 0.908
6 (20) 21.045 3.452 4.635 6.514 0.911
7 (23) 19.652 3.214 4.273 6.025 0.919
Table 1: Model Comparison results for the Mexico City ozone data. Parentheses in the “Space” column indicate that those quantities are nonseparable in the covariance model. The lowest ES and CRPS are highlighted.
Mean Standard Deviation 2.5% 97.5%
7.3943 0.5356 6.0396 8.1924
-0.0207 0.0012 -0.0230 -0.0180
0.1129 0.0068 0.0994 0.1263
86.9006 4.9505 77.6969 97.4642
22.3190 0.7020 20.8030 23.6254
0.6740 0.0095 0.6558 0.6940
0.0947 0.0019 0.0909 0.0985
2.0981 0.0502 2.0015 2.1986
0.9568 0.0013 0.9542 0.9592
Table 2: Posterior summaries for model parameters.
Comments 0
Request Comment
You are adding the first comment!
How to quickly get a good reply:
  • Give credit where it’s due by listing out the positive aspects of a paper before getting into which changes should be made.
  • Be specific in your critique, and provide supporting evidence with appropriate references to substantiate general statements.
  • Your comment should inspire ideas to flow and help the author improves the paper.

The better we are at sharing our knowledge with each other, the faster we move forward.
The feedback must be of minimum 40 characters and the title a minimum of 5 characters
Add comment
Loading ...
This is a comment super asjknd jkasnjk adsnkj
The feedback must be of minumum 40 characters
The feedback must be of minumum 40 characters

You are asking your first question!
How to quickly get a good answer:
  • Keep your question short and to the point
  • Check for grammar or spelling errors.
  • Phrase it like a question
Test description