Distributionsoriented wind forecast verification by a hidden Markov model for multivariate circularlinear data
Abstract
Winds from the NorthWest quadrant and lack of precipitation are known to lead to an increase of PM10 concentrations over a residential neighborhood in the city of Taranto (Italy). In 2012 the local government prescribed a reduction of industrial emissions by 10% every time such meteorological conditions are forecasted 72 hours in advance. Wind forecasting is addressed using the Weather Research and Forecasting (WRF) atmospheric simulation system by the Regional Environmental Protection Agency. In the context of distributionsoriented forecast verification, we propose a comprehensive modelbased inferential approach to investigate the ability of the WRF system to forecast the local wind speed and direction allowing different performances for unknown weather regimes. Groundobserved and WRFforecasted wind speed and direction at a relevant location are jointly modeled as a 4dimensional time series with an unknown finite number of states characterized by homogeneous distributional behavior. The proposed model relies on a mixture of joint projected and skew normal distributions with timedependent states, where the temporal evolution of the state membership follows a first order Markov process. Parameter estimates, including the number of states, are obtained by a Bayesian MCMCbased method. Results provide useful insights on the performance of WRF forecasts in relation to different combinations of wind speed and direction.
1 Introduction
This work is concerned with a heavy industrial district located very close to a residential area in the city of Taranto (Puglia region, Italy) including the largest steel factory in Europe, an oil refinery and a kiln cement factory. Emissions are mainly composed by suspended particle matter, polycyclic aromatic hydrocarbon compounds and benzene (Fisher, 2003) and are associated to known adverse health effects (Brunekreef and Holgate, 2012). In the last years, several PM10 limit value exceedances (according to the 2008 European Air Quality Directive 2008/50/EC) were recorded and these pollution events showed a close connection with intense winds from North and NorthWest and lack of precipitation, encouraging transportation from the industrial site to the adjacent urban area (Amodio et al., 2013; Fedele et al., 2014). In 2012, the Puglia Local Government adopted a Regional Air Quality Plan prescribing a reduction of industrial emissions by 10% (with respect to the daily mean values) every time such meteorological conditions are forecasted 72 hours in advance. Here we focus on wind forecasting, that the Puglia Environmental Protection Agency (ARPA Puglia) addresses by the Weather Research and Forecasting (WRF) mesoscale numerical weather prediction system (Skamarock et al., 2008; De Tomasi et al., 2011; Fedele et al., 2015). We are mainly concerned with the investigation of the ability of the WRF system to properly forecast winds blowing over the Gulf of Taranto. For this purpose we consider hourly WRFforecasted wind speed and direction data for year 2014 and the corresponding ground data collected at a specific relevant point within the Tamburi neighborhood.
Verification is one aspect of measuring the quality of weather forecasts by comparison to relevant observations. The traditional measuresoriented approach to forecast verification involves obtaining a (generally small) number of performance measures based on the posterior evaluation of a sample of past forecasts and observations. The alternative distributionsoriented approach acknowledges the intrinsic inferential nature of forecast verification and addresses the measure of the quality of weather forecasts as an estimation problem, allowing for the explicit consideration of the main sources of uncertainty involved in the process (Jolliffe and Stephenson, 2012). The development of verification schemes based on the joint probability distribution of forecasts and observations has been urged by many authors in the past (Murphy and Winkler, 1987; Brooks and Doswell, 1996), as they allow proper investigation of the stochastic nature of the relationship between forecasts and observations providing insights into strengths and weaknesses of the forecasting systems and showing areas where improvements can be obtained. Parametric probability distributions are only occasionally assumed for this joint distribution in continuous settings (see Wilks, 2011, and references therein). This is even more true in the investigation of the performance of wind field forecasts, where observed and forecasted wind speed and direction form a continuous 4dimensional mixed circularlinear variable. Atmospheric simulation systems such as WRF show different performances for different weather conditions (Lefèvre et al., 2010; RostkierEdelstein et al., 2014; Raktham et al., 2015). Then, for the purpose of investigating WRF forecast performance, we consider the 4dimensional time series of observed and forecasted wind speed and direction as characterized by an unknown number of homogeneous states. Such states reproduce associated observed and forecasted wind conditions, accounting for the relation between wind speed and direction.
There is a growing interest in circular data analysis, with examples arising in areas such as Oceanography (Mastrantonio et al., 2016, 2015b; Lagona et al., 2015), Biology (Maruotti et al., 2015; Hokimoto and Kiyofuji, 2014; Langrock et al., 2012) and Social Sciences. (Gill and Hangartner, 2010). To our knowledge, the first joint circularlinear probability distribution model for more than two random variables was recently introduced by Mastrantonio (2015), namely the joint projected and skew normal (JPSN). Among the features that make the JPSN attractive is the great flexibility and the possibility to introduce dependence between and within circular and linear variables. In order to properly represent homogeneous combinations of observed and forecasted wind conditions, we jointly model the time evolution of the 4dimensional JPSN mixed variable by a hidden Markov model (HMM), i.e. a mixture model with timedependent states, where the state membership evolves according to a first order Markov process. We adopt a nonparametric Bayesian estimation framework, allowing to estimate the unknown number of latent states considering Dirichlet process priors for transition probabilities. HMMs have already been used to analyze groundobserved wind speed and direction without forecast verification purposes. Most of the time independence between speed and direction is assumed, as in Holzmann et al. (2006), Zucchini and MacDonald (2009), Bulla et al. (2012) and Bulla et al. (2015), but exceptions exist. For example Lagona et al. (2015) use the recently introduced AbeLey cylindrical density (Abe and Ley, 2015) and Mastrantonio et al. (2015a) adopt the general circularlinear projected normal. In all the previous proposals the unknown number of latent states is not estimated, but it is rather assumed fixed apriori.
In this proposal we define a comprehensive modelbased approach that allows addressing distributionsoriented wind forecast verification properly accounting for the circular nature of wind direction data. While the joint behavior of observed and forecasted wind speed and direction is described by the JPSN, the HMM provides the representation of the time evolution with changing homogeneous states. Coupling the JPSN with the HMM we obtain a flexible model and a rich parametrization that reproduce the relevant observed and forecasted processes. Overall, computational efficiency characterizes the Bayesian MCMC estimation of the proposed model: HMM and JPSN parameters are both obtained by Gibbs sampling steps.
The remaining part of the paper is structured as follows. The environmental problem is stated in Section 2 where we report a brief description of the relevant meteorology of the Taranto area and provide information on several data features. In Section 3 the JPSN probability distribution for multivariate mixed circularlinear random vectors is introduced; some distributional features and implementation issues are also outlined. Section 4 is dedicated to the definition of the HMM for circularlinear multivariate time series and to a brief discussion on the implementation of the relative Bayesian inferences. Finally, in Section 5 we report the results for the Taranto casestudy, highlighting the major advantages of the proposed methodology. Additional information and supporting material is available online at the journal’s website.
2 Observed and simulated wind data
As already remarked in the introduction, we are concerned with the Tamburi neighborhood within the city of Taranto, here represented by the position of the San Vito air quality monitoring station. The San Vito ground station belongs to the ARPA Puglia air quality monitoring network, it is provided with six meteorological sensors conform to the World Meteorological Organization standards and collects hourly wind speed and direction data since February 2002. The location of the San Vito monitoring station is characterized by an extreme proximity to the Ionian Sea. This makes wind measurements strongly affected by landsea breezes (Stull, 1988) due to differences in the heat capacity and molecular conductivity between land surface and sea. Though a direction is still associated to winds observed with low speed ( m/s), it should be noted that these measures are affected by high variability. However, due to the proximity to the sea, the landsea breeze effect causes the observed series to have no null wind speed recordings and a very low percentage of small ones (%). Missing values (2.1% for wind direction and 2.0% for wind speed) are generally due to baffling winds or data transmission errors. A preliminary characterization of Taranto winds based on San Vito data is obtained by the annual wind roses for the period 20112014 and by the seasonal wind roses for the year 2014 reported in Figure 1.
In the annual roses (1, top) winds blowing from NorthWest are generally associated to high speed ( 6m/s), while in the other quadrants weaker winds are found. The average annual wind speed values range from 2.45 m/s in 2014 to 2.92 m/s in 2012. The seasonal wind roses (1, bottom) show that NorthWest winds are prevailing in autumn and spring while in winter and summer winds from the SouthEast quadrant are more frequent. Winds blowing from the NorthWest quadrant are stronger in autumn and spring, while the speed of winds from SouthEast is higher in the colder than in the warmer seasons. As a matter of fact, the intensity and evolution of winds in the Mediterranean area mainly depend on the position and extension of the Azores and Siberian Anticyclones which in turn have more or less fixed seasonal configurations (Fantauzzo, 1987). Therefore a highly differential behavior of wind regimes is expected between the four seasons. Given the purpose of the present study, observed and forecasted annual time series were partitioned into their seasonal components.
Hourly wind speed and direction forecasts are obtained 72 hours in advance for the whole year 2014 by the WRF atmospheric simulation system. WRF is developed by a collaboration of research centers, universities and government agencies coordinated by the US National Center for Atmospheric Research (NCAR). Among the advantages of WRF is the high flexibility that allows tuning physical parameterizations according to specific interests. Predictive simulations are obtained by the ARW (Advanced Research WRF) core of the WRF system, solving the fully compressible nonhydrostatic Euler equations using terrainfollowing hydrostaticpressure vertical coordinates (Skamarock et al., 2005) and the RungeKutta integration method (Butcher, 1987). Specific WRF settings used in the implementation of this work include the following:

Among the different parametrization schemes offered by WRF to simulate physical processes, those used for radiation processes, surface processes, planetary boundary layer physics, cumulus processes and microphysics processes are listed in Table 1.

A oneway nesting configuration was chosen including two domains with different spatial resolution: a parent domain with 16 km grid spacing covering the central Mediterranean and a nested domain with 4 km grid spacing covering the Puglia region.

The WRF software architecture allowed the use of parallel computing and part of the simulation process was run as a distributed memory job, with big advantages in reducing computation times. WRF was run on a highperformance computational infrastructure, made available within the ReCaS project, funded by the Italian Ministry of Education, University and Research.
Physics process  WRF scheme name 

Radiation Processes  rrtm scheme (Mlawer and Clough, 1997) and 
Dudhia scheme (Dudhia, 1989)  
Surface Processes  Noah Land Surface Model (Chen and Dudhia, 2001) 
Planetary Boundary Layer Physics  Yonsei University scheme (Hong et al., 2006) 
Cumulus Processes  KainFritsch scheme (Kain, 2004) 
Microphysics Processes  Thompson scheme (Thompson et al., 2004, 2008) 
Simulations cover the period 3 January  20 December 2014, with output temporal resolution fixed to one hour and 72 hours forecast time. For comparability reasons, WRFforecasted wind fields were obtained at the point location of the San Vito ground monitoring station. Missing values of the WRFsimulated series (4.5%) are related to entire days for which the initial and boundary conditions are not available.
3 The joint projected and skew normal distribution
As multivariate circularlinear distribution model for groundobserved and WRFsimulated wind speed and direction we consider the recently introduced JPSN (Mastrantonio, 2015). In this Section we define the JPSN, discuss some identifiability issues, show how to perform Bayesian inference in the i.i.d. case and provide statistics describing the main distributional features. The extension of the MCMC algorithm from the i.i.d. to the time series framework is given in Section 4 together with the definition of the HMM used to represent the time evolution characterized by homogeneous latent states.
Let and be two realvalued random vectors of length and , respectively. As a first step in the constructive definition of the JPSN we assume that the dimensional random vector is distributed as a multivariate skew normal distribution (hereafter SN, Sahu et al., 2003) with parameters , and : , where , is a nonnegative definite matrix, is a vector of zeros of length , and indicates a diagonal matrix with nonnull elements given by its argument. The SN distribution introduces skewness in the multivariate normal through the so called skew matrix, in our case given by . Since we assume a diagonal skew matrix, the SN distribution is closed under marginalization (Sahu et al., 2003), then and . The SN distribution has the following alternative stochastic representation (Li, 2005), that will prove to be useful later in this section. In our notation, implies:
(1)  
(2) 
where and , where is the dimensional identity matrix and indicates the half normal distribution, i.e. a truncated normal defined over .
As a second step in the definition of the JPSN distribution, we build a vector of circular variables partitioning into couples and transforming each using the following relation:
(3) 
where and is a modified arctangent function (Jammalamadaka and SenGupta, 2001). Equation (3) transforms the normally distributed 2dimensional random variable into a circular variable with projected normal distribution (Wang and Gelfand, 2013). Notice that also the following relations hold:
(4)  
(5) 
where .
Following Mastrantonio (2015), the vector of circular and linear variables , obtained transforming by (3) is said to have variate joint projected and skew normal distribution with parameters , and : . The JPSN distribution does not have a closed form expression, but the joint density of the “augmented” vector , with , is easily obtained using (1), (2), (4) and (5):
(6)  
(7) 
where is the multivariate normal probability density function. Vector is said to be distributed as an augmented joint projected and skew normal: . As marginalization of (7) over and gives the JPSN density, the JPSN parameter estimation algorithm is based on the AugJPSN, treating the elements of as latent variables.
3.1 Identifiability of the JPSN
The projected normal, in its general form, is known to have an identifiable issue (Wang and Gelfand, 2013). Let , with , notice that the two random vectors and produce the same since ’s cancel out in equation (3). Then and are the same distribution and the model is not identifiable. The JPSN suffers from the same problem since the marginal distribution of its circular component is the projected normal. Let , then and are the same distribution. As a consequence, the model is not identifiable unless some constraints are adopted.
Setting the variance of to 1 () addresses the identification problem (Wang and Gelfand, 2013), but the constraints hamper the estimation of , due to the unavailability of algorithms for constrained covariance matrix estimation. Alternatively, let be the variance of and and notice that and produce the same JPSN density but, by construction, complies with the identifiability constraints. The algorithm proposed by Mastrantonio (2015) obtains posterior samples from the nonidentifiable model and rescales each sample of to the identifiable version .
3.2 JPSN estimation: MCMC implementation details
Posterior samples of JPSN parameters are easily obtained by the augmented representation of the JPSN in (7), under suitable prior choices. More precisely, for a set of i.i.d. observations , the joint full conditional of JPSN parameters , and is proportional to
(8) 
This latter expression is also obtained as the joint full conditional for a multivariate normal likelihood, where the mean depends on , and , with a given prior over , and . Notice that , and respectively play the roles of intercept, design matrix and regression coefficients in a Bayesian regression framework. Then standard priors used in this context can be used to implement Gibbsbased MCMC steps. As suggested by Mastrantonio (2015), assuming a normal inverseWishart (NIW) prior for and an independent normal prior for , the MCMC algorithm can conveniently be based on Gibbs steps. In fact in this case the full conditionals of and are still respectively NIW and normal, the one of is truncated normal and can be simulated using the slicesampling strategy proposed by HernandezStumpfhauser et al. (2016).
3.3 Statistics for JPSN distributional features
JPSN parameters , and have a straightforward interpretation, since (1) and (2) imply that
(9) 
and controls the skewness of the distribution of the linear component . Matrixvalued parameters and control the circularlinear and circularcircular dependence since implies and implies , where indicates independence. It is not very clear how all parameters jointly influence the density of , however Monte Carlo (MC) approximations of the main features of the JPSN distribution are obtained in the Bayesian estimation framework, bypassing the afore mentioned difficulties. MC approximations of the circular mean and concentration of are respectively obtained sampling the following functions of :
(10) 
and
(11) 
A measure of the correlation between circular variables (Fisher, 1996) taking values in is given by
(12) 
where the bivariate random variable is distributed as . A circularlinear dependence measure taking values in (Mardia, 1976) is given by:
(13)  
(14) 
4 A hidden Markov model for observed and forecasted wind fields
In this section we introduce the joint model for observed and forecasted wind speed and direction that was estimated for each of the four seasons of year 2014. We indicate the groundobserved and WRFsimulated wind direction and log speed at time with and , respectively, assuming . Notice that in this case .
To catch the most relevant interactions between observed and forecasted wind speed and direction, the 4dimensional circularlinear time series is assumed to be characterized by homogeneous states and is modeled by a mixture of JPSN distributions with timedependent states, where the temporal evolution of the state membership follows a first order Markov process, namely a HMM. The HMM is estimated within a nonparametric Bayesian framework, relying on Dirichlet process priors for transition probabilities, thus leading to a multivariate circularlinear version of the sticky hierarchical Dirichlet process HMM (sHDPHMM) proposed by Fox et al. (2011). This specification allows us to estimate the unknown number of latent states, along with all other model parameters.
As was mentioned in Section 3, the latent variables need to be introduced to estimate JPSN parameters, leading to the augmented JPSN. Overall, manifest and latent observables are: , , and . At times let be a discrete random variable that represents the state of the HMM. Let be the set of JPSN parameters for state and be the th row of the transition matrix, i.e. a probability vector. Then, letting , , and , the sHDPHMM is given by:
(15)  
(16)  
(17)  
(18)  
(19)  
(20) 
where is the Dirichlet process, is the GEM distribution (Pitman, 2006), , and are hyperparameters, is a valid prior distribution over the domain of , is the indicator function and is assumed (Cappé et al., 2005). Clearly, marginalization over gives the sHDPHMM for circularlinear data with for the th mixture component.
Equations (15), (16) and (17) define the standard HMM (Zucchini and MacDonald, 2009), where is the latent discrete Markov process with the AugJPSN as emission distribution. Notice that although the number of components is potentially infinite in the specification of the sHDPHMM, it is actually bounded by the length of the the observational period. To give an intuitive interpretation of equations (18), let the latent state be nonempty if at least one is equal to . Without loss of generality, let , with , where the first states are nonempty. Then, following the standard definition of the DP (see Sethuraman, 1994), equations (18) can equivalently be written as
(21)  
(22) 
where is the Dirichlet distribution. Then, as and , it follows that and rule the mean and variance of , with being an additional weight added to the selftransition probability to avoid the tendency to create redundant mixture components (see Teh and Jordan, 2010; Fox et al., 2011). The distribution of the s is ruled by and concentrates its probability mass on fewer s as decreases. Variable implicitly depends on parameters , and that have the following standard weak informative priors: , and , where indicates the gamma distribution in terms of shape and scale and is the beta distribution. These priors allow to update the latent discrete time series and all parameters of the sHDPHMM using only Gibbs steps (see Fox et al., 2011; Beal et al., 2002).
To appreciate how the estimation algorithm proposed in Section 3 for the i.i.d. case can be used in this context, notice that observations in state are i.i.d. conditionally on the process . Then priors suggested for the i.i.d. case in Section 3 allow to update parameters and latent variables using only Gibbs steps. We use the following standard weak informative prior settings: , .
Notice that, despite the high complexity of the model, the MCMC algorithm needed to estimate the model unknowns is only based on Gibbs steps.
5 Results
The estimation algorithms specified in Sections 3 and 4 were applied to the four seasonal time series of hourly observed and forecasted wind speed and direction covering 77, 92, 92, 91 days, respectively. Prior to modeling, in order to comply with the domain of the JPSN, circular variables were transformed from degrees to radians and the log of speed was taken for both groundobserved and WRFsimulated wind data. Missing values were predicted along with all unknown parameters during model fitting. To improve the interpretation of graphical and tabular displays the results were backtransformed to their original units. All models were estimated considering 400,000 iterations, with 300,000 for the burnin phase and thinning by 20, i.e. taking 5,000 samples for inferential purposes. For each season, our R/C++ implementation of the estimation procedure took about 3 hours with with a 2.5 GHz Intel Core i5 processor. The number of states of the sHDPHMM was estimated as 5 for all of the four seasons, with . The five components were ordered, based on increasing ground speed sHDPHMM posterior means. We want to stress that the latent sHDPHMM states cannot be interpreted as wind regimes, since they jointly represent correlated groundobserved and WRFsimulated winds that do not necessarily agree in terms of speed and direction. An example and some details are given in the supplementary material.
The overall performance of the sHDPHMM is described in Table 2 and Figure 2. As measures of model fit we report the average prediction error (APE) and the mean squared error (MSE) between observed and sHDPHMM posterior predicted values, respectively for circular and linear variables (Table 2). The APE (Jona Lasinio et al., 2012) is defined as the average circular distance between observed and predicted values, where the circular distance between angles and is given by . While APE takes values in [0,2), it is well known that MSE does not have a finite range. Then, for comparison, the same indices were calculated with posterior predicted values from a minimal JPSN model for timeindependent data, i.e. an i.i.d. case (in brackets in Table 2). Inspection of Table 2 shows that the sHDPHMM time structure clearly improves the fit, measured in terms of APE and MSE, for all variables in the four seasons. It also shows that the sHDPHMM fits systematically better to groundobserved rather than to WRFsimulated data. This is probably due to the known tendency of WRF to overemphasize wind peaks that are smoothed by the sHDPHMM. As concerns seasonal differences, it is quite clear that smoother regimes corresponding to calmer summer wind conditions favor a better fit of the sHDPHMM. In Figure 2 empirical and sHDPHMMestimated marginal distributions (solid and dashed lines) substantially agree, thus the sHDPHMM provides an overall reliable representation of both groundobserved and WRFsimulated wind data. Concerning the quality of WRF forecasts, Figure 2 shows that the bimodality of wind direction with peaks around the NW and SE quadrants is well reproduced for the four seasons, together with the strong asimmetry of wind speed. WRFsimulated wind speed clearly overestimates ground recordings on average and shows higher variability.
APE  APE  MSE  MSE  

WINTER  0.515  0.572  3.835  13.934 
(0.969)  (0.977)  (9.109)  (30.592)  
SPRING  0.509  0.554  2.475  9.057 
(0.981)  (0.959)  (7.776)  (15.195)  
SUMMER  0.458  0.585  1.941  6.507 
(0.991)  (0.960)  (5.003)  (13.007)  
AUTUMN  0.330  0.667  3.650  14.323 
(0.983)  (0.995)  (11.571)  (23.998) 
In Table 3 we report the Bayesian MC estimates of groundobserved () and WRFsimulated () wind direction circular means (10) and concentrations (11) and wind speed means and variances (9) for the five sHDPHMM states and the four seasons (Bayesian credible intervals are available in the supplementary material). Table 3 allows us to investigate the main features of the detected latent homogeneous states, corresponding to winds blowing with increasing groundobserved mean wind speed. On average, winds with higher speed show smaller variability for circular variables, i.e. stronger winds have more focused directions. The comparison between estimated concentrations for ground and simulated data shows a generally higher variability of WRF wind directions with respect to observed ones. Concerning forecast verification, a conservative tendency of WRF to overestimate wind speed for all states and seasons is seen, with some difficulty in reproducing the ordering of groundobserved mean wind speed due to the substantially higher variability. Some concern due to higher wind speed forecasts is addressed to winds blowing from SWW (winter state 3, spring and summer state 2). This strong positive bias may plausibly be attributed to the extreme proximity of the San Vito ground monitoring station to the Ionian sea coast line, in the absence of the drifting effects of any obstacle to winds blowing from SWW and coming straight from the sea. In general, Table 3 shows a good agreement of groundobserved and WRFsimulated circular means, though WRF seems to have some troubles in forecasting wind directions with low to intermediate speed (winter, spring and autumn state 1, summer state 3). In other words, WRF always succeeds in detecting winds from the NW and SE quadrants, unless they blow with weak intensity, as in summer.
WINTER  1  2  3  4  5 

89.648  156.339  207.042  144.058  330.557  
319.242  167.336  218.275  153.48  336.098  
0.548  0.358  0.395  0.054  0.198  
0.671  0.362  0.309  0.069  0.303  
0.697  1.988  2.122  4.028  4.049  
3.113  3.976  9.681  9.922  6.281  
0.147  1.552  1.735  1.868  4.348  
3.031  3.724  8.451  14.866  8.444  
SPRING  1  2  3  4  5 
107.067  279.180  180.703  335.475  313.651  
348.755  291.028  169.685  347.246  319.578  
0.433  0.184  0.355  0.373  0.065  
0.831  0.08  0.235  0.626  0.109  
0.722  1.811  2.439  3.277  5.644  
2.709  5.366  4.779  3.816  8.247  
0.153  0.843  0.842  1.817  4.706  
2.568  4.897  6.743  2.857  6.189  
SUMMER  1  2  3  4  5 
84.871  296.234  307.542  173.961  312.629  
82.766  261.101  176.688  168.684  322.396  
0.445  0.885  0.305  0.072  0.086  
0.836  0.308  0.695  0.126  0.173  
0.700  2.033  2.337  2.573  3.795  
2.493  7.368  3.065  4.562  5.927  
0.136  1.605  1.221  0.628  1.981  
1.524  5.134  2.677  4.129  4.098  
AUTUMN  1  2  3  4  5 
93.647  223.014  28.571  145.887  325.320  
300.908  207.719  55.458  156.076  339.381  
0.120  0.513  0.129  0.044  0.061  
0.771  0.678  0.589  0.082  0.226  
0.627  1.253  1.597  3.274  4.813  
3.671  3.653  4.525  8.737  7.672  
0.073  0.592  2.638  1.846  4.273  
3.191  4.789  9.762  11.976  9.376  
In Figure 3 dependences between circular and linear variables are displayed with square size proportional to the relative association measure. Circularcircular correlations and circularlinear dependences (Fisher, 1996; Mardia, 1976) are respectively computed by (12) and (14), while linear associations are measured by the Pearson’s correlation coefficient. Fisher’s and Pearson’s coefficients are plotted only if associated 95% credible intervals are strictly positive or negative. Since in (14) is null with probability 0, it was plotted according to a different criterion: notice that so that if the relative elements of are different from zero, then some circularlinear correlation is implied. Accordingly, posterior mean values of are plotted in Figure 3 only if at least one of the 95% credible intervals of the associated components of does not contain zero. Notice the high number of boxes with negative, null or weak correlation in states 1, 2 and 3, while states 4 and 5, corresponding to higher observed wind speed, are more likely to show positive correlations between groundobserved and WRFsimulated wind components. In particular, no meaningful correlations between forecasts and observations are estimated for state 1. Circularlinear associations are overall weaker, but still show the same tendency to increase with groundobserved wind speed.
6 Concluding remarks
In the previous Section, the sHDPHMM was used to reproduce ground data, WRF simulations and the relationship between the two. Here we do not focus on the calibration of the WRF system: rather, the description of the features that characterize clusters of observed and forecasted winds is the fundamental output of the proposed method in terms of forecast verification. Among these features are not only the distribution summaries for both observed and simulated data, but also correlations between variables and transition probabilities of moving from one sHDPHMM component to another, informing on the time evolution of homogeneous states (Table 2 in the supplementary material). The distributionsoriented approach allows us to deal with forecast verification in a fully comprehensive model estimation setting. In addition, many uncertainty features of the process and its components are quantified in the Bayesian estimation framework (as can be appreciated by the posterior credibility intervals of model parameters, fully reported in the supplementary material). In this respect, notice that the remarkably different variability of WRF with respect to groundobserved wind speed records, reproduced by the sHDPHMM as reported in Table 3, may be due the peculiar position of the validation point, located in a complex residential/industrial area characterized by the extreme proximity to the coast line of the Ionian Sea. At point locations considerably affected by local features, as the San Vito station, the spatial discretization required for the numerical solution of the atmospheric equations affects the variability of WRFsimulated wind records given by averages over volumes with homogeneous properties (Jiménez et al., 2010). Within these simulation volumes, the smoothing of surface physical properties (e.g. orography) implied by the spatial discretization has also a strong influence on wind simulations, producing higher smoothness and variability. Besides showing the ability of the sHDPHMM in reproducing both groundobserved and WRFsimulated wind data, our proposal allows us to investigate some peculiar characteristics of the WRF system performance at a specific validation point. Within an estimation framework that allows full control of process and model uncertainties, it highlights features not as precisely derived within the traditional measurementsbased forecast verification framework. This distributionsoriented proposal can then be regarded as a way to check the reliability of the WRF simulations for the specific purposes addressed in Section 1. In fact the ability to simulate focused directions for strong winds makes the application of WRF (with the specific settings reported in Table 1) especially suitable for forecasting strong wind events as defined by the Regional Air Quality Plan with the aim to reduce industrial emission during such events.
Among the possible developments of the present proposal is the investigation of the possibility to model the seasonality in the observed process. For this purpose, the regime switching mechanism might be driven by a nonhomogeneous Markov chain whose transition probabilities are periodic functions. In all the work, the assumption that the amount of null wind speed recordings and their dependence on the data generation process are negligible is justified from a physical point of view. In a more realistic situation, zeros could be considered as missing values within a latent variable approach and predicted along with unknown parameters during model fitting.
Acknowledgement
This work is partially developed under the PRIN2015 supportedproject “Environmental processes and human activities: capturing their interactions via statistical methods (EPHASTAT)” funded by MIUR (Italian Ministry of Education, University and Scientific Research).
References
 Abe and Ley (2015) Abe, T. and Ley, C. (2015). A tractable, parsimonious and highly flexible model for cylindrical data, with applications. ArXiv eprints.
 Amodio et al. (2013) Amodio, M., Andriani, E., De Gennaro, G., Di Gilio, A., Ielpo, P., Placentino, C., and Tutino, M. (2013). How a steel plant affects air quality of a nearby urban area: A study on metals and pah concentrations. Aerosol Air Quality Research, 13(2), 497–508.
 Beal et al. (2002) Beal, M., Ghahramani, Z., and Rasmussen, C. (2002). The infinite hidden Markov model. In T. Dietterich, S. Becker, and Z. Ghahramani, editors, Advances in Neural Information Processing Systems 14, Cambridge, MA. MIT Press.
 Bossard et al. (2000) Bossard, M., Feranec, J., and Otahel, J. (2000). Corine land cover technical guide. Technical report, Addendum 2000, European Environment Agency.
 Brooks and Doswell (1996) Brooks, H. E. and Doswell, C. A. (1996). A comparison of measuresoriented and distributionsoriented approaches to forecast verification. Weather Forecasting, 11, 288–303.
 Brunekreef and Holgate (2012) Brunekreef, B. and Holgate, S. T. (2012). Air pollution and health. The Lancet, 360(9341), 1233–1242.
 Bulla et al. (2012) Bulla, J., Lagona, F., Maruotti, A., and Picone, M. (2012). A multivariate hidden Markov model for the identification of sea regimes from incomplete skewed and circular time series. Journal of Agricultural, Biological, and Environmental Statistics, 17(4), 544–567.
 Bulla et al. (2015) Bulla, J., Lagona, F., Maruotti, A., and Picone, M. (2015). Environmental conditions in semienclosed basins: A dynamic latent class approach for mixedtype multivariate variables. Journal de la Société Française de Statistique, 156(1), 114–137.
 Butcher (1987) Butcher, J. C. (1987). The Numerical Analysis of Ordinary Differential Equations: RungeKutta and General Linear Methods. WileyInterscience, New York, NY, USA.
 Buttner et al. (2004) Buttner, G., Feranec, J., Jaffrain, G., Mari, L., Maucha, G., and Soukup, T. (2004). The corine land cover 2000 project. EARSeL eProceedings, 3(3), 331–346.
 Cappé et al. (2005) Cappé, O., Moulines, E., and Ryden, T. (2005). Inference in Hidden Markov Models. Springer Series in Statistics. Springer, New York.
 Chen and Dudhia (2001) Chen, F. and Dudhia, J. (2001). Coupling an advanced land surface–hydrology model with the penn state–ncar mm5 modeling system. part i: Model implementation and sensitivity. Monthly Weather Review, 129(4), 569–585.
 De Tomasi et al. (2011) De Tomasi, F., Miglietta, M., and Perrone, M. (2011). The growth of the planetary boundary layer at a coastal site: a case study. BoundaryLayer Meteorology, 139(3), 521–541.
 Dudhia (1989) Dudhia, J. (1989). Numerical Study of Convection Observed during the Winter Monsoon Experiment Using a Mesoscale TwoDimensional Model. J. Atmos. Sci., 46(20), 3077–3107.
 Fantauzzo (1987) Fantauzzo, F. (1987). Dalla brezza all’uragano. Meteorologia Moderna. ETS.
 Fedele et al. (2014) Fedele, F., Menegotto, M., Trizio, L., Angiuli, L., Guarnieri, A., Carducci, C., Bellotti, R., Giua, R., and G., A. (2014). Meteorological effects on pm10 concentrations in an urban industrial site: a statistical analysis. In Conference Proceedings  1st International Conference on Atmospheric DUST, pages 162–167.
 Fedele et al. (2015) Fedele, F., Miglietta, M., Perrone, M., Burlizzi, P., Bellotti, R., Conte, D., and Carducci, A. G. C. (2015). Numerical simulations with the {WRF} model of water vapour vertical profiles: A comparison with {LIDAR} and radiosounding measurements. Atmospheric Research, 166, 110 – 119.
 Fisher (1996) Fisher, N. I. (1996). Statistical Analysis of Circular Data. Cambridge University Press, Cambridge.
 Fisher (2003) Fisher, R. (2003). Sources, measurement and control of fugitive emissions in the cokemaking process. The YearBook of the Coke Oven Managers’ Association, page 87–1005.
 Fox et al. (2011) Fox, E. B., Sudderth, E. B., Jordan, M. I., and Willsky, A. S. (2011). A sticky hdphmm with application to speaker diarization. The Annals of Applied Statistics, 5(2A), 1020–1056.
 Gill and Hangartner (2010) Gill, J. and Hangartner, D. (2010). Circular data in political science and how to handle it. Political Analysis, 18(3), 316–336.
 HernandezStumpfhauser et al. (2016) HernandezStumpfhauser, D., Breidt, F. J., and van der Woerd, M. J. (2016). The general projected normal distribution of arbitrary dimension: Modeling and bayesian inference. Bayesian Analysis, doi: 10.1214/15BA989.
 Hokimoto and Kiyofuji (2014) Hokimoto, T. and Kiyofuji, H. (2014). Effect of regime switching on behavior of albacore under the influence of phytoplankton concentration. Stochastic Environmental Research and Risk Assessment, 28(5), 1099–1124.
 Holzmann et al. (2006) Holzmann, H., Munk, A., Suster, M., and Zucchini, W. (2006). Hidden Markov models for circular and linearcircular time series. Environmental and Ecological Statistics, 13(3), 325–347.
 Hong et al. (2006) Hong, S.Y., Noh, Y., and Dudhia, J. (2006). A new vertical diffusion package with an explicit treatment of entrainment processes. Monthly Weather Review, 134(9), 2318–2341.
 Jammalamadaka and SenGupta (2001) Jammalamadaka, S. R. and SenGupta, A. (2001). Topics in Circular Statistics. World Scientific, Singapore.
 Jiménez et al. (2010) Jiménez, P. A., GonzálezRouco, J. F., GarcíaBustamante, E., Navarro, J., Montávez, J. P., de Arellano, J. V.G., Dudhia, J., and MuñozRoldan, A. (2010). Surface wind regionalization over complex terrain: Evaluation and analysis of a highresolution wrf simulation. Journal of Applied Meteorology and Climatology, 49(2), 268–287.
 Jolliffe and Stephenson (2012) Jolliffe, I. and Stephenson, D. (2012). Forecast Verification: A Practitioner’s Guide in Atmospheric Science. Wiley.
 Jona Lasinio et al. (2012) Jona Lasinio, G., Gelfand, A., and Jona Lasinio, M. (2012). Spatial analysis of wave direction data using wrapped Gaussian processes. Annals of Applied Statistics, 6(4), 1478–1498.
 Kain (2004) Kain, J. S. (2004). The kain–fritsch convective parameterization: An update. Journal of Applied Meteorology, 43(1), 170–181.
 Lagona et al. (2015) Lagona, F., Picone, M., and Maruotti, A. (2015). A hidden markov model for the analysis of cylindrical time series. Environmetrics.
 Langrock et al. (2012) Langrock, R., King, R., Matthiopoulos, J., Thomas, L., Fortin, D., and Morales, J. M. (2012). Flexible and practical modeling of animal telemetry data: hidden Markov models and extensions. Ecology, 93(11), 2336–2342.
 Lefèvre et al. (2010) Lefèvre, J., Marchesiello, P., Jourdain, N. C., Menkes, C., and Leroy, A. (2010). Weather regimes and orographic circulation around new caledonia. Marine Pollution Bulletin, 61(7), 413–431.
 Li (2005) Li, J. (2005). Clustering based on a multilayer mixture model. Journal of Computational and Graphical Statistics, 14(3), 547–568.
 Mardia (1976) Mardia, K. V. (1976). LinearCircular Correlation Coefficients and Rhythmometry. Biometrika, 63(2).
 Maruotti et al. (2015) Maruotti, A., Punzo, A., Mastrantonio, G., and Lagona, F. (2015). A timedependent extension of the projected normal regression model for longitudinal circular data based on a hidden markov heterogeneity structure. Stochastic Environmental Research and Risk Assessment, pages 1–16.
 Mastrantonio (2015) Mastrantonio, G. (2015). The Joint Projected and Skew Normal. ArXiv eprints.
 Mastrantonio et al. (2015a) Mastrantonio, G., Maruotti, A., and Jona Lasinio, G. (2015a). Bayesian hidden Markov modelling using circularlinear general projected normal distribution. Environmetrics, 26, 145–158.
 Mastrantonio et al. (2015b) Mastrantonio, G., Gelfand, A. E., and Jona Lasinio, G. (2015b). The wrapped skew Gaussian process for analyzing spatiotemporal data. Stochastic Environmental Research and Risk Assessment, To appear.
 Mastrantonio et al. (2016) Mastrantonio, G., Jona Lasinio, G., and Gelfand, A. E. (2016). Spatiotemporal circular models with nonseparable covariance structure. TEST, 25(2), 331–350.
 Mlawer and Clough (1997) Mlawer, E. and Clough, S. (1997). On the extension of rapid radiative transfer model to the shortwave region. In In Proceedings of the Sixth Atmospheric Radiation (ARM) Science Team Meeting, pages 223–226.
 Murphy and Winkler (1987) Murphy, A. and Winkler, R. (1987). A general framework for forecast verification. Monthly Weather Review, 115, 1330–1338.
 Pitman (2006) Pitman, J. (2006). Combinatorial stochastic processes, volume 1875. SpringerVerlag, Berlin. Lectures from the 32nd Summer School on Probability Theory held in SaintFlour, July 7–24, 2002, With a foreword by Jean Picard.
 Raktham et al. (2015) Raktham, C., Bruyère, C., Kreasuwun, J., Done, J., Thongbai, C., and Promnopas, W. (2015). Simulation sensitivities of the major weather regimes of the southeast asia region. Climate Dynamics, 44(56), 1403–1417.
 RostkierEdelstein et al. (2014) RostkierEdelstein, D., Liu, Y., Pan, L., and Sheu, R.S. (2014). An objective weatherregimebased verification of wrfrtfdda forecasts over the eastern mediterranean. In EGU General Assembly Conference Abstracts, volume 16, page 2635.
 Sahu et al. (2003) Sahu, S. K., Dey, D. K., and Branco, M. D. (2003). A new class of multivariate skew distributions with applications to Bayesian regression models. Canadian Journal of Statistics, 31(2), 129–150.
 Sethuraman (1994) Sethuraman, J. (1994). A constructive definition of Dirichlet priors. Statistica Sinica, 4, 639–650.
 Skamarock et al. (2005) Skamarock, W. C., Klemp, J. B., Dudhia, J., Gill, D. O., Barker, D. M., Wang, W., and Powers, J. G. (2005). A description of the advanced research WRF Version 2. Technical report, NCAR Technical Note NCAR/TN–468+STR.
 Skamarock et al. (2008) Skamarock, W. C., Klemp, J. B., Dudhia, J., Gill, D. O., Barker, M., Duda, K. G., Huang, X. Y., Wang, W., and Powers, J. G. (2008). A description of the Advanced Research WRF Version 3. Technical report, NCAR Technical Note NCAR/TN–475+STR.
 Stull (1988) Stull, R. B. (1988). An Introduction to Boundary Layer Meteorology. Springer Netherlands.
 Teh and Jordan (2010) Teh, Y. W. and Jordan, M. I. (2010). Hierarchical bayesian nonparametric models with applications. In N. Hjort, C. Holmes, P. Müller, and S. Walker, editors, Bayesian Nonparametrics: Principles and Practice. Cambridge University Press.
 Thompson et al. (2004) Thompson, G., Rasmussen, R. M., and Manning, K. (2004). Explicit forecasts of winter precipitation using an improved bulk microphysics scheme. part i: Description and sensitivity analysis. Monthly Weather Review, 132(2), 519–542.
 Thompson et al. (2008) Thompson, G., Field, P. R., Rasmussen, R. M., and Hall, W. D. (2008). Explicit forecasts of winter precipitation using an improved bulk microphysics scheme. part ii: Implementation of a new snow parameterization. Monthly Weather Review, 136(12), 5095–5115.
 Wang and Gelfand (2013) Wang, F. and Gelfand, A. E. (2013). Directional data analysis under the general projected normal distribution. Statistical Methodology, 10(1), 113–127.
 Wilks (2011) Wilks, D. (2011). Statistical Methods in the Atmospheric Sciences, 3rd Edition. Elsevier.
 Zucchini and MacDonald (2009) Zucchini, W. and MacDonald, I. (2009). Hidden Markov Models for Time Series: An Introduction Using R. Chapman & Hall/CRC Monographs on Statistics & Applied Probability. Taylor & Francis.