A similarity-based implementation of the Schaake shuffle


Abstract

Contemporary weather forecasts are typically based on ensemble prediction systems, which consist of multiple runs of numerical weather prediction models that vary with respect to the initial conditions and/or the parameterization of the atmosphere. Ensemble forecasts are frequently biased and show dispersion errors and thus need to be statistically postprocessed. However, current postprocessing approaches are often univariate and apply to a single weather quantity at a single location and for a single prediction horizon only, thereby failing to account for potentially crucial dependence structures. Non-parametric multivariate postprocessing methods based on empirical copulas, such as ensemble copula coupling or the Schaake shuffle, can address this shortcoming. A specific implementation of the Schaake shuffle, called the SimSchaake approach, is introduced. The SimSchaake method aggregates univariately postprocessed ensemble forecasts using dependence patterns from past observations. Specifically, the observations are taken from historical dates at which the ensemble forecasts resembled the current ensemble prediction with respect to a specific similarity criterion. The SimSchaake ensemble outperforms all reference ensembles in an application to ensemble forecasts for surface temperature from the European Centre for Medium-Range Weather Forecasts.

Keywords: empirical copula, ensemble copula coupling, probabilistic weather forecasting, Schaake shuffle, similarity criterion, statistical ensemble postprocessing

1 Introduction

Contemporary weather forecasts are typically constructed from ensemble prediction systems, which have run operationally since the early 1990s. An ensemble consists of multiple runs of numerical weather prediction models which vary with respect to the initial conditions and/or the parameterization of the atmosphere. Consequently, ensembles take account of the two major sources of forecast uncertainty [33]. Ensemble forecasts are frequently biased and show dispersion errors [23]. Thus, they require statistical postprocessing to realize their full capability. During the last decade, several ensemble postprocessing methods have been developed. Examples are (variants of) the ensemble model output statistics (EMOS) approach [18], which is also known as non-homogeneous regression, and Bayesian model averaging (BMA) [39].

However, EMOS and BMA, as well as other postprocessing methods, only apply to a single weather variable at a single location and for a single prediction horizon. Thus, they fail to account for spatial, temporal or inter-variable dependence structures, which are crucial in many applications such as flood warning [44], winter road maintenance [7] or the handling of renewable energy sources [35]. In recent years, there has been keen interest in the development of multivariate postprocessing methods able to address this shortcoming. For instance, variants and modifications of EMOS and BMA that can handle spatial [5] or inter-variable [51] dependencies are available. Furthermore, there are methods to capture temporal dependencies of consecutive lead times in postprocessed predictive distributions [37]. All these multivariate postprocessing methods are parametric and work well in rather low dimensions and in settings in which the involved correlation matrix can be taken to be highly structured. However, they model only one type of dependence (spatial, inter-variable or temporal) and appear to be inadequate in high-dimensional situations in which no particular structure can be exploited.

These issues can be addressed using non-parametric techniques based on empirical copulas [46]. Following [63], an empirical copula [9] can be interpreted as a dependence template induced by a specific discrete multivariate data set. It can be employed to transfer a particular dependence pattern to samples which are drawn independently from a collection of univariate marginal distributions [63]. For example, in the ensemble copula coupling (ECC) approach of [47], such an empirical copula is derived from the unprocessed ensemble forecast and then applied to samples from univariate postprocessed predictive distributions, which can be obtained via the standard EMOS or BMA approaches. This is equivalent to ordering these samples according to the rank dependence structure of the raw ensemble, thereby capturing the spatial, temporal and inter-variable flow dependence [47]. Proceeding in a similar manner, the Schaake shuffle [8] employs an ordering based on past observations from a historical data archive. Consequently, the corresponding empirical copula in the Schaake shuffle is induced by an observational database rather than by an ensemble forecast. However, the standard Schaake shuffle fails to condition the multivariate dependence pattern on current or predicted atmospheric states. To address this shortcoming, [8] proposed to develop an extension thereof, driven by the idea

“to preferentially select dates from the historical record that resemble forecasted atmospheric conditions and use the spatial correlation structure from this subset of dates to reconstruct the spatial variability for a specific forecast.”

Inspired by this suggestion, a specific implementation of the Schaake shuffle, referred to as the SimSchaake approach, is introduced in this paper. Essentially, the SimSchaake method proceeds like the Schaake shuffle, but the observations determining the dependence structure are taken from historical dates at which the ensemble forecasts resembled the current ensemble prediction with respect to a specific similarity criterion.
The remainder of the paper is organized as follows. In Section 2, we first discuss the general setting of empirical copula-based ensemble postprocessing and review ECC and the Schaake shuffle as reference examples. Then, we develop the SimSchaake approach. In Section 3, this new method is evaluated and compared to the reference methods in a case study. The paper closes with a discussion in Section 4.

2 Empirical copula-based ensemble postprocessing methods

Copulas are valuable and established tools for the modeling of stochastic dependence [32]. They have been successfully employed in numerous application areas. A copula is a $d$-variate cumulative distribution function (CDF) with standard uniform univariate marginal CDFs, where $d \geq 2$. As is manifested in the famous Sklar's theorem [52], a copula $C$ links a multivariate CDF $F$ to its univariate marginal CDFs $F_1, \ldots, F_d$ via the decomposition

$$F(y_1, \ldots, y_d) = C\big(F_1(y_1), \ldots, F_d(y_d)\big)$$

for $y_1, \ldots, y_d \in \mathbb{R}$. In a multivariate postprocessing setting, the sought multivariate CDF $F$ can thus be constructed by specifying both the univariate marginal CDFs $F_1, \ldots, F_d$ and the copula $C$ modeling the dependence. The CDFs $F_1, \ldots, F_d$ can be obtained by common univariate postprocessing for each location, weather variable and look-ahead time individually, for instance performed via EMOS or BMA. For the choice of $C$, a prominent example is the Gaussian copula, which has been applied to a wide range of problems in climatology, meteorology and hydrology [51].

In this paper, we focus on the case in which $F_1, \ldots, F_d$ are the empirical CDFs given by samples from univariately postprocessed CDFs and $C$ is taken to be an empirical copula [9]. According to [63], an empirical copula can be considered a multivariate dependence template derived from a specific discrete data set. To describe this formally, let $\mathcal{D} := \{(x_1^1, \ldots, x_1^d), \ldots, (x_N^1, \ldots, x_N^d)\}$ be a data set consisting of $N$ tuples of size $d$ with entries in $\mathbb{R}$. Moreover, let $\mathrm{rk}(x_n^j)$ denote the rank of $x_n^j$ in $\{x_1^j, \ldots, x_N^j\}$ for $j \in \{1, \ldots, d\}$ and $n \in \{1, \ldots, N\}$, assuming for simplicity that there are no ties. Then, the empirical copula $E_N$ induced by the data set $\mathcal{D}$ is given by

$$E_N\!\left(\frac{i_1}{N}, \ldots, \frac{i_d}{N}\right) = \frac{1}{N} \sum_{n=1}^{N} \prod_{j=1}^{d} \mathbb{1}\{\mathrm{rk}(x_n^j) \leq i_j\}$$

for integers $0 \leq i_1, \ldots, i_d \leq N$, with $\mathbb{1}\{A\}$ denoting the indicator function whose value is 1 if the event $A$ occurs, and zero otherwise.
In this section, we review ensemble copula coupling (ECC) [47] and the Schaake shuffle [8] as reference methods within the general frame of empirical copula-based multivariate ensemble postprocessing [46]. In addition, we develop the SimSchaake method as a specific implementation of the Schaake shuffle. While ECC and the Schaake shuffle have been used as a benchmark in several papers [43], the SimSchaake method is new.
To allow for a formal description of the approaches, let us first set some notation which will be used throughout the whole section. Let $i$ denote a weather variable, $j$ a location and $k$ a look-ahead time. For simplicity, let $\ell := (i, j, k)$ denote the corresponding multi-index, and let $\ell \in \{1, \ldots, L\}$, where $L$ is the total number of margins, that is, combinations of weather variable, location and look-ahead time, such that the dimension above becomes $d = L$. Moreover, let $M$ denote the number of raw ensemble members, and $m$ the desired number of members the postprocessed ensemble shall consist of.
In general, multivariate empirical copula-based ensemble methods to postprocess a raw ensemble forecast initialized at a specific date proceed according to the following general scheme.

  1. Specify a data set $\mathcal{D}$ consisting of $m$ tuples over the $L$ margins, which serves as the dependence template and induces the empirical copula $E_m$.

  2. For each margin $\ell$, postprocess the raw ensemble univariately, for instance via EMOS or BMA, to obtain a calibrated predictive CDF.

  3. For each margin $\ell$, draw a sample of size $m$ from the corresponding postprocessed predictive CDF, for example by taking equidistant quantiles.

  4. For each margin $\ell$, rearrange the sample according to the rank order structure of $\mathcal{D}$, that is, apply the empirical copula $E_m$, to obtain the final postprocessed $m$-member ensemble.
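To make steps 3 and 4 concrete, the following minimal R sketch (not code from the paper; the function name and the members-by-margins matrix layout are assumptions for illustration) rearranges the univariately postprocessed samples according to the rank order of a given dependence template.

```r
# Empirical copula-based reordering (sketch, not code from the paper).
# 'template': m x L matrix whose rows are the m tuples of the dependence
#             template (raw ensemble members for ECC, historical observations
#             for the Schaake shuffle-based methods).
# 'samples':  m x L matrix whose columns hold m samples drawn from the
#             univariately postprocessed predictive CDFs of the L margins.
reorder_by_template <- function(template, samples) {
  stopifnot(nrow(template) == nrow(samples), ncol(template) == ncol(samples))
  out <- samples
  for (l in seq_len(ncol(template))) {
    rk <- rank(template[, l], ties.method = "random")   # rank order of margin l
    # The member whose template value has rank r receives the r-th smallest
    # postprocessed sample, so the output inherits the template's rank
    # dependence structure, that is, its empirical copula.
    out[, l] <- sort(samples[, l])[rk]
  }
  out
}
```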

2.1 Reference methods

Now we review ECC [47] and the Schaake shuffle [8] as empirical copula-based postprocessing techniques within the scheme described above.

Ensemble copula coupling (ECC)

In the ECC approach [47], the data set specifying the dependence structure is given by the raw ensemble forecast itself, that is, the template $\mathcal{D}$ in step 1 of the general scheme consists of the $M$ raw ensemble member tuples. Consequently, the sample size in step 3, and hence the size of the final postprocessed ensemble in step 4, is restricted to equal that of the unprocessed ensemble, that is, $m = M$.
The ECC procedure operates under a perfect model assumption, implicitly assuming that the ensemble is capable of representing the actual spatial, inter-variable and temporal dependence structures adequately. This may be a reasonable assumption on some days, but it certainly does not hold each and every day. Moreover, ECC only applies to ensembles whose members can be considered exchangeable, that is, statistically indistinguishable.
[47] and [45] show that ECC provides an overarching frame for seemingly unrelated approaches scattered in the literature such as those of [34], [41], [14] or [56], to name just a few. It has been used as a reference technique in the recent papers by [63], [13] and [48].
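In terms of the hypothetical sketch above, ECC amounts to using the raw ensemble itself as the dependence template; `raw_ens` and `pp_samples` denote assumed M x L matrices of members by margins.

```r
# ECC (sketch): the dependence template is the raw ensemble itself, so the
# final ensemble size m must equal the raw ensemble size M; 'raw_ens' and
# 'pp_samples' are assumed M x L matrices (members x margins).
ecc_ens <- reorder_by_template(template = raw_ens, samples = pp_samples)
```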

The Schaake shuffle

In contrast to ECC, the data set determining the dependence pattern in the Schaake shuffle [8] is not given by the raw ensemble forecast, but by past observations taken from $m$ different dates of a historical archive. That is, observations from the same dates are employed for all locations and weather variables throughout the procedure of the Schaake shuffle for a fixed forecast instance. Hence, the template $\mathcal{D}$ in step 1 of the general scheme consists of the observation tuples from these $m$ dates. In particular, $m$ does not need to equal the raw ensemble size $M$.
In the original implementation of the Schaake shuffle by [8], the corresponding dates, from which the observations are taken, are chosen from all years in the historical record, except for the year of the forecast of interest, and lie within seven days before and after the verification date, regardless of the year. A more general implementation of the Schaake shuffle may use observations from arbitrary (or randomly selected) past dates in the whole historical record. We will refer to this procedure as the Random Schaake method in the case study in Section 3.
The Schaake shuffle has been employed successfully in numerous applications [43].
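In the same hypothetical notation, the Random Schaake method described above simply uses observations from m randomly chosen past dates as the template; `hist_obs` is an assumed D x L matrix of historical observations (dates by margins) and `m` the desired ensemble size.

```r
# Random Schaake (sketch): the template consists of observations from m
# randomly selected past dates; 'hist_obs' is an assumed D x L matrix of
# historical observations (dates x margins) with D >= m.
sel <- sample(seq_len(nrow(hist_obs)), size = m)
random_schaake_ens <- reorder_by_template(template = hist_obs[sel, , drop = FALSE],
                                          samples  = pp_samples)
```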

2.2 The SimSchaake approach as a similarity-based implementation of the Schaake shuffle

As we have seen, the Schaake shuffle generates a postprocessed ensemble inheriting the rank dependence structure from historical observations. However, its standard implementations fail to condition the multivariate dependence pattern on current or predicted atmospheric states. Inspired by the quote of [8] mentioned in the introductory Section 1, we address this challenge by linking the Schaake shuffle to similarity- or analog-based ensemble methods. In such approaches, one seeks ensemble forecasts in an archive of past data that are similar to the current one. The basic idea is that the realized states of the atmosphere corresponding to such an analog ensemble can be assumed to be similar to the state to be predicted [24]. These techniques have become popular and important, as for instance witnessed by the papers of [3], [27], [21], [10], [30] and [11], and further by recent research [1]. In this context, the question of the choice of an appropriate similarity criterion in a nearest-neighbor sense arises, with the papers above providing some proposals.
The following approach, which will be referred to as the SimSchaake method in what follows, combines the idea of searching for similar ensembles with the Schaake shuffle. Like the Schaake shuffle and in contrast to standard ECC, it can be applied to any ensemble, regardless of whether it consists of exchangeable or non-exchangeable members, and the size of the final postprocessed ensemble is not restricted to equal the raw ensemble size $M$. To describe the new SimSchaake approach formally in detail, let $n$ be the length of the training period required for the univariate ensemble postprocessing, $\tau_0$ the initialization date of the ensemble forecast, and $\tau^*$ the corresponding verification date. Let further $D$ be the number of dates in the past of $\tau_0$ for which ensemble forecast and observation data is available. For the feasibility of the SimSchaake approach, it is required that ensemble forecast and observation data be available for at least $m$ dates in the past of $\tau_0$, that is, $D \geq m$. Considering the prediction horizon to be fixed, the SimSchaake approach then proceeds according to the empirical copula-based postprocessing as in the general scheme of Section 2, where the data set $\mathcal{D}$ in step 1 is derived as follows.

  1. For a fixed margin $\ell$, let $x_\tau^\ell := (x_{\tau,1}^\ell, \ldots, x_{\tau,M}^\ell)$ denote the (possibly standardized) $M$-member raw ensemble forecast valid on date $\tau$. Let further $x_\tau := (x_\tau^1, \ldots, x_\tau^L)$ denote the corresponding $L$-tuple consisting of the $M$-member ensemble forecasts of all margins, that is, combinations of weather variable and location.
    If weather variables with distinct units or magnitudes are involved, the components of $x_\tau$ should be standardized for each multi-index $\ell$.

  2. For each date $\tau_\delta$ in the past of the initialization date $\tau_0$, where $\delta \in \{1, \ldots, D\}$, compute a suitable fixed similarity criterion $\Delta(x_{\tau^*}, x_{\tau_\delta})$ between the current forecast $x_{\tau^*}$ valid on the verification date $\tau^*$ and the forecast $x_{\tau_\delta}$ valid on the date $\tau_\delta$. The similarity criterion is taken to be negatively oriented, that is, the lower the value of $\Delta$, the more similar the ensemble forecasts. A similarity criterion value of exactly zero indicates that the ensemble forecasts are identical.

  3. Choose the $m$ dates $\tilde\tau_1, \ldots, \tilde\tau_m$ for which the forecast data is most similar to that for the verification date $\tau^*$, in the sense that the corresponding values of $\Delta(x_{\tau^*}, x_{\tilde\tau_r})$ for $r \in \{1, \ldots, m\}$ are the smallest among the values of $\Delta(x_{\tau^*}, x_{\tau_\delta})$ for $\delta \in \{1, \ldots, D\}$.
    Note that the information of all multi-indices $\ell$ simultaneously is employed to determine the dates $\tilde\tau_1, \ldots, \tilde\tau_m$.

  4. For each margin $\ell$, let $y_{\tilde\tau_1}^\ell, \ldots, y_{\tilde\tau_m}^\ell$ denote the corresponding historical verifying observations valid on the dates determined in step 3. For simplicity, write $\psi_r^\ell := y_{\tilde\tau_r}^\ell$ for $r \in \{1, \ldots, m\}$ and build the data vector $\psi^\ell := (\psi_1^\ell, \ldots, \psi_m^\ell)$ for the verification date $\tau^*$. The data set $\mathcal{D}$ in step 1 of the general scheme is then obtained by aggregating the historical observation vectors of all margins $\ell$, that is, $\mathcal{D} := \{\psi^1, \ldots, \psi^L\}$. From this template $\mathcal{D}$, the empirical copula $E_m$ related to the SimSchaake approach is derived as described above.

Figure 1: Scheme of the SimSchaake approach

Having obtained the data set $\mathcal{D}$ and the respective empirical copula $E_m$ according to the above procedure, steps 2 to 4 of the general scheme are performed in order to generate the final postprocessed SimSchaake ensemble. A scheme of the SimSchaake approach is given in Figure 1.
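A minimal R sketch of this date selection and template construction is given below; it is an illustration rather than the author's implementation, and `current_ens`, `hist_ens`, `hist_obs`, `pp_samples`, `m` and the similarity function `delta` (a mean/standard deviation-based version of which is sketched after the next paragraph) are hypothetical placeholders.

```r
# SimSchaake template construction (sketch, not the author's code):
# 'current_ens' is the current M x L ensemble forecast, 'hist_ens' a list of
# D historical M x L ensemble forecasts, 'hist_obs' a D x L matrix of the
# corresponding verifying observations, 'm' the desired ensemble size and
# 'delta' a negatively oriented similarity criterion.
simschaake_template <- function(current_ens, hist_ens, hist_obs, m, delta) {
  d_vals <- vapply(hist_ens, function(e) delta(current_ens, e), numeric(1))
  sel <- order(d_vals)[seq_len(m)]          # the m most similar past dates
  hist_obs[sel, , drop = FALSE]             # their observations form the template
}

# Reordering the postprocessed samples with this template yields the
# SimSchaake ensemble (reusing the earlier hypothetical sketch).
simschaake_ens <- reorder_by_template(
  template = simschaake_template(current_ens, hist_ens, hist_obs, m, delta),
  samples  = pp_samples)
```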
An appropriate choice of the similarity criterion $\Delta$ in step 2, which is then used consistently throughout the whole SimSchaake approach, is crucial. We here consider a criterion that aggregates, over all margins $\ell$, the discrepancies between the empirical means and between the empirical standard deviations of the two ensemble forecasts being compared, where

$$\bar{x}_\tau^\ell := \frac{1}{M} \sum_{i=1}^{M} x_{\tau,i}^\ell \qquad \text{and} \qquad s_\tau^\ell := \sqrt{\frac{1}{M-1} \sum_{i=1}^{M} \left(x_{\tau,i}^\ell - \bar{x}_\tau^\ell\right)^2}$$

denote the empirical mean and standard deviation, respectively, of the ensemble forecast for the fixed multi-index $\ell$ at date $\tau$. As it does not depend on how the ensemble members are labeled, this similarity criterion can be applied to ensembles consisting of exchangeable members. In particular, it is suitable for the temperature predictions from the European Centre for Medium-Range Weather Forecasts (ECMWF) used in the case study in the next section. While the use of the empirical mean to some degree accounts for seasonal aspects when considering temperature forecasts only, the empirical standard deviations reflect the uncertainties within the ensemble forecasts. Thus, these issues are addressed when comparing two ensemble forecasts. Alternative proposals for similarity criteria, also for the case of ensembles with non-exchangeable members, can be found in the references mentioned before.
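A minimal R sketch of such a criterion is shown below. Note that only its ingredients, the marginal empirical means and standard deviations, are specified above; the particular aggregation used here, a sum of absolute differences over all margins, is an assumption made for illustration.

```r
# Mean/standard deviation-based similarity criterion (sketch): 'ens_a' and
# 'ens_b' are M x L matrices of (possibly standardized) ensemble forecasts
# valid on the two dates being compared. The sum of absolute differences used
# here is an assumed aggregation; it is zero for identical ensembles and does
# not depend on how the members are labeled.
delta <- function(ens_a, ens_b) {
  sum(abs(colMeans(ens_a) - colMeans(ens_b)) +
      abs(apply(ens_a, 2, sd) - apply(ens_b, 2, sd)))
}
```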
It is possible to transform the values of a similarity criterion from $[0, \infty)$ to $(0, 1]$ by employing a suitable monotone standardization. With respect to such a standardized criterion, similarity values near 1 indicate a very high similarity between the two ensemble forecasts being compared, while similarity values near 0 point at essentially no similarity. Accordingly, if such a standardized criterion is used, we have to choose the dates corresponding to the highest, and not to the lowest, values in step 3.
As mentioned, the SimSchaake approach addresses two shortcomings of the standard ECC method. First, it can also be applied to ensembles consisting of non-exchangeable members, as the reordering is not based on the ensemble forecasts, but on observations. Second, with the SimSchaake technique we can in principle create ensembles of arbitrary size, as long as there are sufficiently many historical observations in the past. In contrast to ECC, the postprocessed ensemble is thus not restricted to have the same number of members as the raw ensemble.

Figure 2: Illustration of ECC (first row), the Random Schaake method (second row) and the SimSchaake method (third row) using 24 hour ahead temperature forecasts (in °C) at Vienna and Bratislava valid on 9 July 2011, 1200 UTC. Ensemble forecasts are indicated by the dots, and the verifying observation by the cross. The tetragons symbolize past observations from a historical database. At the top and to the right of each subfigure, the corresponding marginal histograms are shown.

In Figure 2, an illustration of the three empirical copula-based postprocessing methods presented in this section – ECC in the first row, the Random Schaake method in the second row and the SimSchaake method in the third row – is given. We consider 24 hour ahead surface temperature forecasts (in degrees Celsius, °C) at Vienna (Austria) and Bratislava (Slovakia) valid on the hot summer day of 9 July 2011 at 1200 UTC. In the subfigures, the ensemble forecast is indicated by the dots, the verifying observation by the cross, and past observations from a historical database by the tetragons. As the ECMWF raw ensemble used here is rather large, the illustrations are for convenience based on ensembles with a reduced number of members for both ECC and the Schaake shuffle-based methods. The first column of Figure 2 shows the corresponding database that is used to determine the dependence structure and the empirical copula, respectively, of the postprocessing approach: the ECMWF raw ensemble in the case of ECC, past observations on random dates for the Random Schaake method, and past observations on specific dates selected according to the similarity criterion in the case of the SimSchaake approach. The second column shows the same individually postprocessed ensemble forecast three times. This forecast is generated by randomly pairing the equidistant quantiles from the predictive CDFs obtained by univariate EMOS postprocessing [18] at Vienna and Bratislava separately. Such an ensemble forecast does not take account of the pronounced positive spatial correlation structure. In the third column, the final postprocessed ensemble forecasts are shown, obtained by applying the corresponding empirical copula to the samples derived from the individual EMOS postprocessing. These empirical copula-based postprocessed ensembles have the same marginal distributions as the individually postprocessed EMOS ensemble, as witnessed by the respective histograms at the top and to the right of each subfigure. Additionally, they exhibit the same spatial correlation pattern as the underlying database specified in the first column, thus respecting dependencies.

3 Case study

3.1 Setting

In our case study, we employ predictions of the European Centre for Medium-Range Weather Forecasts (ECMWF) core ensemble [12], whose members can be considered exchangeable. We focus on 24 hour ahead temperature forecasts jointly at Vienna (Austria), Bratislava (Slovakia) and Budapest (Hungary), and consequently on spatial dependencies only. For each location, the ground truth is given by the corresponding surface synoptic observations (SYNOP). The approximate distance from Vienna to Bratislava is 50 kilometers, from Bratislava to Budapest 170 kilometers, and from Vienna to Budapest 210 kilometers. There are pronounced positive pairwise correlations between the observations at the three different locations, which are stronger the closer the respective stations are. These correlation patterns are largely well reflected in the ensemble forecasts.
We consider those 3985 test days during the period from 1 January 2003 to 31 December 2013 for which all required forecast and observation data is available at all three stations. In our case study, all data is valid at 1200 UTC. Univariate postprocessing is performed via EMOS [18] using the R package ensembleMOS [38], employing a rolling window consisting of the last $n$ days before the verification day as training period. For each marginal EMOS postprocessed predictive CDF, we take $m$ equidistant quantiles as samples, focusing here on the case in which $m$ equals the raw ensemble size $M$ and on a case with a larger final ensemble size, respectively.
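To illustrate the quantization step, the sketch below draws equidistant quantiles from a Gaussian EMOS predictive distribution, the standard EMOS model for temperature [18]; the particular quantile levels (r - 1/2)/m and the numbers in the usage line are assumptions for illustration and not necessarily those of the case study.

```r
# Quantization sketch: m equidistant quantiles from a Gaussian EMOS predictive
# distribution with (hypothetical) postprocessed mean 'mu' and standard
# deviation 'sigma'; the levels (r - 0.5)/m are one common, assumed choice.
equidistant_emos_sample <- function(mu, sigma, m) {
  qnorm((seq_len(m) - 0.5) / m, mean = mu, sd = sigma)
}

# e.g., a 50-member sample from an illustrative predictive N(21.3, 1.7^2)
equidistant_emos_sample(mu = 21.3, sigma = 1.7, m = 50)
```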
For these desired ensemble sizes, we assess and compare the predictive performances of

  • the ECMWF raw ensemble,

  • the Individual EMOS ensemble, which assumes independence,

  • the EMOS-ECC ensemble, which assumes a dependence structure according to that in the raw ensemble,

  • the EMOS-Random Schaake ensemble, which assumes a dependence structure according to randomly selected historical observation data, and

  • the EMOS-SimSchaake ensemble, which assumes a dependence structure according to specific historical observation data valid on dates for which the ensemble forecast resembled the current one with respect to similarity criterion .

Obviously, results for the raw and the EMOS-ECC ensemble can only be reported for the case in which the final ensemble size equals the raw ensemble size, whereas the other ensembles are additionally evaluated for the larger final ensemble size. For the two approaches employing the Schaake shuffle notion, the past dates from which the corresponding verifying observations are taken are searched for among all available historical data, where ensemble forecast and observation data is available from 1 January 2002 to 31 December 2013. Hence, the database used for the Schaake shuffle-based methods grows over time. Recall that the EMOS-Random Schaake method just randomly selects those past dates, whereas the EMOS-SimSchaake approach chooses them based on the ensemble similarity criterion described in Section 2.2.

3.2 Evaluation tools

A probabilistic forecast distribution or an ensemble forecast, respectively, should be as sharp as possible, subject to calibration, which refers to statistical consistency between the forecasts and the observation [19]. To assess the predictive performances of our different ensembles, several verification tools are available [62].
In univariate settings, calibration can be checked via the verification rank histogram [2]. As we focus on the evaluation of multivariate quantities in this paper, ensemble calibration in our case study is checked via the multivariate [20], band depth and average rank histograms [55]. When an ensemble forecast is calibrated, the multivariate, band depth or average rank, respectively, is uniformly distributed. Calibration can thus be diagnosed by compositing over forecast cases, plotting the corresponding multivariate, band depth or average rank histogram, respectively, and checking for deviations from uniformity, that is, flatness of the histogram. For an interpretation of the different shapes a rank histogram for multivariate quantities can exhibit, see [20] and [55].
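For concreteness, the following R sketch implements an average-rank pre-rank function along the lines of [55], as I read it; it is not the code used for the figures below, and the band depth and multivariate ranks can be computed in a similar spirit.

```r
# Average-rank pre-rank (sketch): 'obs' is a length-L observation vector and
# 'ens' an m x L matrix of ensemble members (members x margins).
average_rank <- function(obs, ens) {
  all_vals <- rbind(obs, ens)                 # row 1: observation, rows 2..(m+1): members
  uni_ranks <- apply(all_vals, 2, rank, ties.method = "random")
  pre_rank <- rowMeans(uni_ranks)             # average univariate rank per row
  rank(pre_rank, ties.method = "random")[1]   # rank of the observation's average
}
```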
The overall forecast skill can be assessed via proper scoring rules [17], which are able to assess calibration and sharpness simultaneously and are taken to be negatively oriented here, that is, the lower the score the better the predictive performance. A widely used proper scoring rule for univariate quantities is the continuous ranked probability score (CRPS) [17].
In this paper, we employ the energy score (ES) [20], which is the analog of the CRPS for multivariate quantities. For an $m$-member ensemble forecast $x_1, \ldots, x_m \in \mathbb{R}^L$ and an observation $y \in \mathbb{R}^L$, the ES is computed as

$$\mathrm{ES} := \frac{1}{m} \sum_{i=1}^{m} \lVert x_i - y \rVert - \frac{1}{2m^2} \sum_{i=1}^{m} \sum_{j=1}^{m} \lVert x_i - x_j \rVert,$$

with $\lVert \cdot \rVert$ denoting the Euclidean norm on $\mathbb{R}^L$. As the ES reveals weaknesses in detecting misspecifications in the correlation structure [36], we additionally consider the variogram score (VS) [48] to address this, which for an order $p > 0$ is given by

$$\mathrm{VS} := \sum_{k=1}^{L} \sum_{l=1}^{L} w_{kl} \left( \lvert y_k - y_l \rvert^{p} - \frac{1}{m} \sum_{i=1}^{m} \lvert x_{i,k} - x_{i,l} \rvert^{p} \right)^{2},$$

where the $w_{kl}$'s are (optional) non-negative weights and $x_{i,k}$ denotes the $k$-th component of the $i$-th ensemble member. For the case study in this paper, in which we focus on spatial dependencies, we follow the suggestion of [48] and let the weights be proportional to the inverse spatial distances between the corresponding locations. That is, we choose

$$w_{kl} := \frac{1}{\mathrm{dist}(k, l)} \quad \text{for } k \neq l \qquad \text{and} \qquad w_{kl} := 0 \quad \text{for } k = l,$$

with $\mathrm{dist}(k, l)$ denoting the spatial distance between location $k$ and location $l$, where all distances have to be measured in the same unit. For the specific implementation in our three-dimensional setting here, we employ the distances between Vienna, Bratislava and Budapest as mentioned in the preceding subsection.
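The two scores can be sketched in R as follows for a single forecast case; the helper names, the members-by-margins matrix layout and the default order p = 0.5 (the value suggested in [48], not necessarily the one used here) are assumptions for illustration, while the station distances are those quoted in Section 3.1.

```r
# Hypothetical helpers: 'ens' is an m x L matrix (members x margins/locations),
# 'obs' a length-L observation vector, 'w' an L x L weight matrix.

# Energy score (ES), ensemble version [20].
energy_score <- function(ens, obs) {
  dev <- sweep(ens, 2, obs)                      # x_i - y for each member i
  mean(sqrt(rowSums(dev^2))) -                   # (1/m) sum_i ||x_i - y||
    0.5 * mean(as.matrix(dist(ens)))             # (1/(2 m^2)) sum_{i,j} ||x_i - x_j||
}

# Variogram score (VS) of order p with weights w [48].
variogram_score <- function(ens, obs, w, p = 0.5) {
  obs_vg <- abs(outer(obs, obs, "-"))^p          # |y_k - y_l|^p
  ens_vg <- Reduce(`+`, lapply(seq_len(nrow(ens)), function(i)
    abs(outer(ens[i, ], ens[i, ], "-"))^p)) / nrow(ens)
  sum(w * (obs_vg - ens_vg)^2)
}

# Inverse-distance weights from the approximate station distances (in km)
# given in Section 3.1; diagonal entries are set to zero.
stations <- c("Vienna", "Bratislava", "Budapest")
d_km <- matrix(c(  0,  50, 210,
                  50,   0, 170,
                 210, 170,   0), nrow = 3, byrow = TRUE,
               dimnames = list(stations, stations))
w <- ifelse(d_km > 0, 1 / d_km, 0)
```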
In our case study, average scores over all forecast cases within our specific test period are reported.

3.3 Results

Table 1: Average energy scores (ES) and variogram scores (VS) for 24 hour ahead temperature forecasts at Vienna, Bratislava and Budapest jointly over 3985 test days during the period from 1 January 2003 to 31 December 2013. The scores for the Individual EMOS and the EMOS-Random Schaake ensembles are averaged over 100 runs.
                                                     ES      VS
Final ensemble size equal to the raw ensemble size
  ECMWF Raw Ensemble                               2.241   0.333
  Individual EMOS Ensemble                         1.976   0.323
  EMOS-ECC Ensemble                                1.957   0.270
  EMOS-Random Schaake Ensemble                     1.998   0.300
  EMOS-SimSchaake Ensemble                         1.952   0.265
Larger final ensemble size
  Individual EMOS Ensemble                         1.971   0.327
  EMOS-Random Schaake Ensemble                     1.996   0.300
  EMOS-SimSchaake Ensemble                         1.947   0.266

As we focus on the multivariate setting in this paper, we do not explicitly show the results for the univariate EMOS postprocessing at the three stations individually here. In a nutshell, the EMOS postprocessed ensemble forecasts exhibit a better predictive performance than the unprocessed raw ensemble predictions, in that they are better calibrated and have smaller CRPS values.
In Table 1, the average energy scores (ES) and variogram scores (VS) as overall performance measures are shown. The results for the Individual EMOS and the EMOS-Random Schaake ensembles are averaged over 100 runs for each forecast instance, in order to account for randomizations. More precisely, for the Individual EMOS ensemble, the results for 100 different aggregations (that is, assignments of the member indices) of the equidistant quantiles obtained by the univariate EMOS postprocessing are averaged. In the case of the EMOS-Random Schaake ensemble, the average is taken over 100 different selections of the random historical dates that are used to define the observation-based dependence model. Calibration is assessed via the multivariate, band depth and average rank histograms, respectively, in Figures 3, 4 and 5.

Figure 3: Multivariate rank histograms for 24 hour ahead temperature forecasts at Vienna, Bratislava and Budapest jointly over 3985 test days during the period from 1 January 2003 to 31 December 2013.
Figure 4: Band depth rank histograms for 24 hour ahead temperature forecasts at Vienna, Bratislava and Budapest jointly over 3985 test days during the period from 1 January 2003 to 31 December 2013.
Figure 5: Average rank histograms for 24 hour ahead temperature forecasts at Vienna, Bratislava and Budapest jointly over 3985 test days during the period from 1 January 2003 to 31 December 2013.

For both final ensemble sizes, all postprocessing methods outperform the raw ensemble in terms of the scores. The raw ensemble is clearly uncalibrated, more precisely underdispersed, as witnessed by the U-shaped multivariate and average rank histograms and the skewed band depth rank histogram with an overpopulation of the lowest ranks [20].
Considering first the results for the postprocessed ensembles whose size equals that of the raw ensemble, the Individual EMOS ensemble is uncalibrated, yielding U-shaped rank histograms. In the case of the band depth and average rank histograms, respectively, this points at an underestimation of the correlation structure [55], which is plausible, as this approach does not account for dependencies. With respect to the VS, the Individual EMOS ensemble performs worse than the other postprocessed ensembles, which assume specific correlation structures. In terms of the ES, the EMOS-ECC and the EMOS-SimSchaake ensembles outperform the Individual EMOS ensemble, while the distinctions are less pronounced than for the VS. This may be due to the limited ability of the ES to discriminate misspecifications of the correlations between different locations [36], as hinted at in the preceding subsection.
Comparing the three empirical copula-based postprocessing methods taking account of dependence patterns, the EMOS-ECC ensemble is well calibrated, apart from a slight overpopulation of the lowest ranks in all rank histograms. In contrast, the calibration of the EMOS-Random Schaake ensemble is not that good, as witnessed by the inverse U-shaped band depth and average rank histograms, indicating an overestimation of the correlation structures [55]. Moreover, the multivariate rank histogram of the EMOS-Random Schaake ensemble is skewed with too many low ranks. The EMOS-SimSchaake ensemble is calibrated best, with band depth and average rank histograms being close to uniform and an essentially flat multivariate rank histogram with a slight overpopulation of the lowest ranks only. The ranking of the three ensembles allowing for dependencies in terms of calibration is also reflected in the scores. Both for the ES and the VS, the EMOS-SimSchaake ensemble performs best, followed by the EMOS-ECC ensemble and finally the EMOS-Random Schaake ensemble.
The results and conclusions on calibration described above continue to hold analogously for the extended Individual EMOS, EMOS-Random Schaake and EMOS-SimSchaake ensembles comprising the larger number of members. Similarly, the ranking of the predictive skill of these three extended postprocessed ensembles with respect to the ES and VS, respectively, remains unchanged compared to the smaller ensemble size, with the EMOS-SimSchaake ensemble still performing by far best. The ES and VS values of the larger ensembles are very similar to those of their smaller counterparts. Although an extension of the ensemble size is generally useful, there is no pronounced need in our case to increase the ensemble size beyond that of the raw ensemble, which appears to be already reasonably large.
In a nutshell, the EMOS-SimSchaake ensemble based on the new method introduced in this paper outperforms all reference ensembles, both with respect to calibration and in terms of scores. In particular, the EMOS-SimSchaake ensemble outperforms the EMOS-Random Schaake ensemble. Hence, there appears to be a clear benefit in using the specific past dates on which the ensemble forecasts resembled the current one to create the historical observation database modeling the dependencies, rather than picking these dates randomly. The EMOS-SimSchaake ensemble also outperforms the EMOS-ECC ensemble.

4 Discussion

We have discussed and compared empirical copula-based ensemble postprocessing methods that are able to account for dependencies. While ECC and the Schaake shuffle have been reviewed in a general frame, the SimSchaake scheme has been newly developed in this paper as a multivariate postprocessing tool. Essentially, the SimSchaake procedure aggregates samples from univariate postprocessed distributions, where the underlying dependence structure and the involved empirical copula, respectively, are derived from historical observations at dates in the past which showed a similar ensemble forecast to the current one. In our case study, the SimSchaake ensemble has performed best overall and better than the ECC ensemble, while having the benefit of a broader applicability, in that it can also be applied to ensembles comprising non-exchangeable members and is not restricted to have the same size as the raw ensemble.
The SimSchaake method depends on the design of a suitable similarity criterion, where the mean- and standard deviation-based choice employed here has proven to be useful, yielding good results. However, the predictive performance of the SimSchaake ensemble might be improved by using a more sophisticatedly designed similarity criterion, perhaps including a suitable weighting function (or monotone transformations thereof) that accounts for seasonal aspects. Moreover, the similarity criterion employed here is tailored to ensembles consisting of exchangeable members such as the ECMWF ensemble in our case study. For ensembles comprising non-exchangeable members, other criteria might provide more reasonable and better choices.
As mentioned, a drawback of the standard ECC postprocessed ensemble employed here is that it is constrained to have the same size as the unprocessed ensemble, while the Schaake shuffle and the SimSchaake ensembles are not. However, there have been initial attempts to design ECC-like ensembles having more members than the raw ensemble [63]. Similar to the work in [63], an issue for future work may be to design and conduct a further case study including these approaches to allow for a broader comparison of ECC- and Schaake shuffle-based concepts.

Acknowledgments

I gratefully acknowledge support by VolkswagenStiftung under the project “Mesoscale Weather Extremes: Theory, Spatial Modeling and Prediction”. Initial work on this paper was done during my time as Ph.D. student at Heidelberg University, funded by Deutsche Forschungsgemeinschaft through Research Training Group (RTG) 1953. I thank Tilmann Gneiting, Stephan Hemri, Sebastian Lerch and Martin Leutbecher for valuable comments, suggestions and discussions. The forecast data used in the case study have been made available by the European Centre for Medium-Range Weather Forecasts. I thank Stephan Hemri for help with the data and Michael Scheuerer for R code for the rank histograms for multivariate quantities.

References

  1. A novel application of an analog ensemble for short-term wind power forecasting.
    S. Alessandrini, L. Delle Monache, S. Sperati, and J. N. Nissen. Renew. Energ.
  2. A method for producing and evaluating probabilistic forecasts from ensemble model integrations.
    J. L. Anderson. J. Climate
  3. Predicting realizations of daily weather data for climate forecasts using the non-parametric nearest-neighbour re-sampling technique.
    M. Bannayan and G. Hoogenboom. Int. J. Climatol.
  4. Joint probabilistic forecasting of wind speed and temperature using Bayesian model averaging.
    S. Baran and A. Möller. Environmetrics
  5. Combining spatial statistical and ensemble information in probabilistic weather forecasts.
    V. J. Berrocal, A. E. Raftery, and T. Gneiting. Mon. Wea. Rev.
  6. Probabilistic quantitative precipitation field forecasting using a two-stage spatial model.
    V. J. Berrocal, A. E. Raftery, and T. Gneiting. Ann. Appl. Stat.
  7. Probabilistic weather forecasting for winter road maintenance.
    V. J. Berrocal, A. E. Raftery, T. Gneiting, and R. C. Steed. J. Amer. Stat. Assoc.
  8. The Schaake shuffle: A method for reconstructing space-time variability in forecasted precipitation and temperature fields.
    M. P. Clark, S. Gangopadhyay, L. E. Hay, B. Rajagopalan, and R. L. Wilby. J. Hydrometeor.
  9. La fonction de dépendance empirique et ses propriétés. Un test non paramétrique d’indépendance.
    P. Deheuvels. Acad. Roy. Belg. Bull. Cl. Sci. (5)
  10. Kalman filter and analog schemes to postprocess numerical weather predictions.
    L. Delle Monache, T. Nipen, Y. Liu, G. Roux, and R. Stull. Mon. Wea. Rev.
  11. Probabilistic weather prediction with an analog ensemble.
    L. Delle Monache, F. A. Eckel, D. L. Rife, B Nagarajan, and K. Searight. Mon. Wea. Rev.
  12. Describing ECMWF’s forecasts and forecasting system.
    ECMWF Directorate. ECMWF Newsletter
  13. Spatial postprocessing of ensemble forecasts for temperature using nonhomogeneous Gaussian regression.
    K. Feldmann, M. Scheuerer, and T. L. Thorarinsdottir. Mon. Wea. Rev.
  14. Calibrating ensemble reliability whilst preserving spatial structure.
    J. Flowerdew. Tellus A
  15. Everything you always wanted to know about copula modeling but were afraid to ask.
    C. Genest and A.-C. Favre. J. Hydrol. Eng.
  16. Weather forecasting with ensemble methods.
    T. Gneiting and A. E. Raftery. Science
  17. Strictly proper scoring rules, prediction, and estimation.
    T. Gneiting and A. E. Raftery. J. Amer. Stat. Assoc.
  18. Calibrated probabilistic forecasting using ensemble model output statistics and minimum CRPS estimation.
    T. Gneiting, A. E. Raftery, A. H. Westveld, and T. Goldman. Mon. Wea. Rev.
  19. Probabilistic forecasts, calibration and sharpness.
    T. Gneiting, F. Balabdaoui, and A. E. Raftery. J. Roy. Stat. Soc. Ser. B
  20. Assessing probabilistic forecasts of multivariate quantities, with applications to ensemble predictions of surface winds (with discussion and rejoinder).
    T. Gneiting, L. I. Stanberry, E. P. Grimit, L. Held, and N. A. Johnson. Test
  21. Analog sky condition forecasting based on a -nn algorithm.
    T. J. Hall, R. N. Thessin, G. J. Bloy, and C. N. Mutchler. Wea. Forecasting
  22. Interpretation of rank histograms for verifying ensemble forecasts.
    T. M. Hamill. Mon. Wea. Rev.
  23. Verification of Eta-RSM short-range ensemble forecasts.
    T. M. Hamill and S. J. Colucci. Mon. Wea. Rev.
  24. Probabilistic quantitative precipitation forecasts based on reforecast analogs: Theory and application.
    T. M. Hamill and J. S. Whitaker. Mon. Wea. Rev.
  25. Dependence Modeling with Copulas.
    H. Joe.
  26. Analog-based Ensemble Model Output Statistics.
    C. Junk, L. Delle Monache, and S. Alessandrini. Mon. Wea. Rev.
  27. The similar days method for predicting near surface wind vectors.
    Z. Klausner, H. Kaplan, and E. Fattal. Meteor. Appl.
  28. Ensemble forecasting.
    M. Leutbecher and T. N. Palmer. J. Comput. Phys.
  29. Scoring rules for continuous probability distributions.
    J. E. Matheson and R. L. Winkler. Manag. Sci.
  30. Probabilistic forecasts using analogs in the idealized Lorenz96 setting.
    J. W. Messner and G. J. Mayr. Mon. Wea. Rev.
  31. Multivariate probabilistic forecasting using ensemble Bayesian model averaging and copulas.
    A. Möller, A. Lenkoski, and T. L. Thorarinsdottir. Quart. J. Roy. Meteor. Soc.
  32. An Introduction to Copulas.
    R. B. Nelsen.
  33. The economic value of ensemble forecasts as a tool for risk assessment: From days to decades.
    T. N. Palmer. Quart. J. Roy. Meteor. Soc.
  34. Adaptive calibration of -wind ensemble forecasts.
    P. Pinson. Quart. J. Roy. Meteor. Soc.
  35. Wind energy: Forecasting challenges for its operational management.
    P. Pinson. Stat. Sci.
  36. Discrimination ability of the Energy score.
    P. Pinson and J. Tastu. Technical report, Technical University of Denmark, 2013.
  37. From probabilistic forecasts to statistical scenarios of short-term wind power production.
    P. Pinson, H. Madsen, H. A. Nielsen, G. Papaefthymiou, and B. Klöckl. Wind Energy
  38. R: A Language and Environment for Statistical Computing.
    R Core Team.
  39. Using Bayesian model averaging to calibrate forecast ensembles.
    A. E. Raftery, T. Gneiting, F. Balabdaoui, and M. Polakowski. Mon. Wea. Rev.
  40. Post-processing rainfall forecasts from numerical weather prediction models for short-term streamflow forecasting.
    D. E. Robertson, D. L. Shrestha, and Q. J. Wang. Hydrol. Earth Syst. Sci.
  41. Postprocessing of ensemble precipitation predictions with extended logistic regression based on hindcasts.
    E. Roulin and S. Vannitsem. Mon. Wea. Rev.
  42. On the distributional transform, Sklar’s theorem, and the empirical copula process.
    L. Rüschendorf. J. Statist. Plann. Inference
  43. Precipitation and temperature ensemble forecasts from single-valued forecasts.
    J. C. Schaake, J. Demargne, R. Hartman, M. Mullusky, E. Welles, L. Wu, H. Herr, X. Fan, and D. J. Seo. Hydrol. Earth Syst. Sci. Discuss.
  44. Summary of recommendations of the first Workshop on Postprocessing and Downscaling Atmospheric Forecasts for Hydrologic Applications held at Météo-France, Toulouse, France, 15-18 June 2009.
    J. C. Schaake, J. Pailleux, J. Thielen, R. Arritt, T. M. Hamill, L. Luo, E. Martin, D. McCollor, and F. Pappenberger. Atmos. Sci. Lett.
  45. Physically Coherent Probabilistic Weather Forecasts Using Multivariate Discrete Copula-Based Ensemble Postprocessing Methods.
    R. Schefzik.
  46. Multivariate discrete copulas, with applications in probabilistic weather forecasting.
    R. Schefzik. Ann. I.S.U.P.
  47. Uncertainty quantification in complex simulation models using ensemble copula coupling.
    R. Schefzik, T. L. Thorarinsdottir, and T. Gneiting. Stat. Sci.
  48. Variogram-based proper scoring rules for probabilistic forecasts of multivariate quantities.
    M. Scheuerer and T. M. Hamill. Mon. Wea. Rev.
  49. Multivariate non-normally distributed random variables in climate research — Introduction to the copula approach.
    C. Schoelzel and P. Friederichs. Nonlinear Proc. Geoph.
  50. Probabilistic assessment of regional climate change in Southwest Germany by ensemble dressing.
    C. Schoelzel and A. Hense. Climate Dyn.
  51. Ensemble model output statistics for wind vectors.
    N. Schuhen, T. L. Thorarinsdottir, and T. Gneiting. Mon. Wea. Rev.
  52. Fonctions de répartition à n dimensions et leurs marges.
    A. Sklar. Publ. Inst. Stat. Univ. Paris
  53. Probabilistic wind vector forecasting using ensembles and Bayesian model averaging.
    J. M. Sloughter, T. Gneiting, and A. E. Raftery. Mon. Wea. Rev.
  54. Evaluation of probabilistic prediction systems.
    O. Talagrand, R. Vautard, and B. Strauss. In Proceedings of the ECMWF Workshop on Predictability, pages 1–25. Reading, UK, European Centre for Medium-Range Weather Forecasts, 1997.
  55. Assessing the calibration of high-dimensional ensemble forecasts using rank histograms.
    T. L. Thorarinsdottir, M. Scheuerer, and C. Heinz. J. Comput. Graph. Statist.
  56. Ensemble post-processing using member-by-member approaches: theoretical aspects.
    B. Van Schaeybroeck and S. Vannitsem. Quart. J. Roy. Meteor. Soc.
  57. Wind resource estimates with an analog ensemble approach.
    E. Vanvyve, L. Delle Monache, A. J. Monaghan, and J. O. Pinto. Renew. Energ.
  58. Post-processing ECMWF precipitation and temperature ensemble reforecasts for operational hydrologic forecasting at various spatial scales.
    J. S. Verkade, J. D. Brown, P. Reggiani, and A. H. Weerts. J. Hydrol.
  59. Calibration and downscaling methods for quantitative ensemble precipitation forecasts.
    N. Voisin, J. C. Schaake, and D. P. Lettenmeier. Wea. Forecasting
  60. Application of a medium-range global hydrologic probabilistic forecast scheme to the Ohio River basin.
    N. Voisin, F. Pappenberger, D. P. Lettenmeier, R. Buizza, and J. C. Schaake. Wea. Forecasting
  61. Multivariate – inter-variable, spatial and temporal – bias correction.
    M. Vrac and P. Friederichs. J. Climate
  62. Statistical Methods in the Atmospheric Sciences.
    D. S. Wilks.
  63. Multivariate ensemble Model Output Statistics using empirical copulas.
    D. S. Wilks. Quart. J. Roy. Meteor. Soc.
  64. ensembleMOS: Ensemble Model Output Statistics.
    R. A. Yuen, T. Gneiting, T. L. Thorarinsdottir, and C. Fraley. R package, 2013.