tvfEMD based time series analysis of Be of the CTBTOIMS network
Abstract
A methodology of adaptive time series analysis based on Empirical Mode Decomposition (EMD) has been employed to investigate Be activity concentration variability, along with temperature. Analysed data were sampled at ground level by 28 different stations of the CTBTOIMS network. The adaptive nature of the EMD algorithm allows it to deal with data that are both nonlinear and nonstationary, making no a priori assumptions on the expansion basis. Main purpose of the adopted methodology is to characterise the possible presence of a trend, occurrence of AMFM modulation of relevant oscillatory modes, residuals distributions and outlier occurrence. Trend component is first estimated via simple EMD and removed. The recent time varying filter EMD (tvfEMD) technique is then employed to extract local narrow band oscillatory modes from the data. To establish their relevance, a denoising step is then carried out, employing both the Hurst exponent as a thresholding parameter and further testing their statistical significance against white noise. The ones that pass the denoising step are considered to be meaningful oscillatory modes of the data, and their AMFM modulation is investigated. Possible applications of the adopted methodology regarding site characterisation and suggestions for further research are given in the conclusions.
Keywords: Empirical Mode Decomposition, Cosmogenic beryllium7, CTBTO, Site characterisation.
Introduction
Time series analysis has relevant applications in many fields of science, particularly geophysics, e.g. in seismology, radionuclide concentration monitoring and tracing, climate and atmospheric physics, oceanography, geomagnetism [1, 2, 3, 4, 5, 6, 7, 8]. It is also applied in site characterisation studies in order to investigate possible coupling of environmental noises to detector instrumentation. Relevant examples in this regard are seismic and Newtonian Noise (NN) [9, 10]. Beside applications in Physics, time series analysis studies also has relevant impact, among others, in wind speed forecasting [11], physiology [12], speech pattern recognition [13], power quality events classification [14]. In this paper a methodology of adaptive time series analysis based on Empirical Mode Decomposition (EMD) [15, 16, 17] is applied to Be activity concentration data, sampled at ground level, to investigate their variability. Aim of the analysis, is to characterise time series of data, , in terms of three main components, namely the trend, harmonics and residuals
(1) 
where , , , and stands respectively for time series, trend, extracted signal (harmonics), and noise (residuals), and , with the length of the time series. EMD allows to characterise both nonlinear and nonstationary data [15]. The paper is organised as follows. In Section 1 the analysed dataset is briefly described and the adopted methodology is introduced in details. In Section 2, data analysis results are presented.
1 Dataset and Methodology
All the data analysed in this paper were collected on a daily basis by 28 stations of the International Monitoring System (IMS), a worldwide distributed network set up and maintained by the Comprehensive Nuclear Test Ban Treaty Organisation (CTBTO), which goal is to monitor over CTBT compliance. Time series of Be and surface temperature have been selected based on data availability. The length of a given time series depends on when the corresponding station became operative, and the starting year of the time series varies from 2003 to 2009, while all series end on March 2016. A more detailed description of the analysed Be dataset can be found in [18]. In Table 1 the names and the exact location of the analysed stations are listed.
ID  Location  Latitude []  Longitude [] 

RN46  Chatham Island, New Zealand  43.82  176.48 
RN04  Melbourne, VIC, Australia  37.73  145.10 
RN68  Tristan da Cunha, United Kingdom  37.07  12.31 
RN47  Kaitaia, New Zealand  35.07  173.29 
RN01  Buenos Aires, Argentina  34.54  58.47 
RN10  Perth, WA, Australia  31.93  115.98 
RN23  Rarotonga, Cook Islands  21.20  159.81 
RN06  Townsville, QLD, Australia  19.25  146.77 
RN26  Nadi, Fiji  17.76  177.45 
RN09  Darwin, NT, Australia  12.43  130.89 
RN08  Cocos Islands, Australia  12.19  96.83 
RN64  Dar Es Salaam, Tanzania  6.78  39.20 
RN50  Panama City, Panama  8.98  79.53 
RN43  Nouakchott, Mauritania  18.14  15.92 
RN79  Oahu, Hawaii, USA  21.52  157.99 
RN37  Okinawa, Japan  26.50  127.90 
RN72  Melbourne, FL, USA  28.10  80.65 
RN74  Ashland, KS, USA  37.17  99.77 
RN75  Charlottesville, VA, USA  38.00  78.40 
RN17  St. John’s, N.L., Canada  47.59  52.74 
RN45  Ulaanbaatar, Mongolia  47.89  106.33 
RN33  Schauinsland/Freiburg, Germany  47.92  7.91 
RN60  Petropavlovsk, Russian Federation  53.05  158.78 
RN71  Sand Point, Alaska, USA  55.34  160.49 
RN61  Dubna, Russian Federation  56.74  37.25 
RN63  Stockholm, Sweden  59.41  17.95 
RN16  Yellowknife, N.W.T., Canada  62.48  114.47 
RN76  Salchaket, Alaska, USA  64.67  147.10 
The methodology adopted in this paper is intended to be complementary to the one described in [18] (hereafter referred to as MTsA), and aims at extending the range of analysis to strongly nonlinear and nonstationary time series. Detailed description of MTsA software and its applications can be found in [18, 19, 20, 21].
Cosmogenic beryllium7 time series measured at 28 different sites at ground level are analysed making use of the Empirical Mode Decomposition (EMD). EMD is an adaptive algorithm first introduced by Huang [15, 16, 17]. It allows to deal with nonlinear and nonstationary time series and to extract oscillatory modes, referred to as Intrinsic Mode Functions (IMFs). To be an IMF, such oscillatory functions must respect the the following two conditions:

The number of extrema and zero crossings must be equal or differ at most by one.

The mean of the upper and lower envelope must be zero.
IMFs are obtained by the EMD algorithm fitting upper and lower extrema contained in the data with cubic splines, and then subtracting the mean of the upper and lower envelopes. This procedure is iterated until the two aforementioned conditions are met, and this iterative process is referred to as sifting. Having obtained the first IMF, it is subtracted from the data and the process is repeated on the remainder of the time series until a last term is obtained, , which represents the adaptive trend or baseline wandering of the data, if present. This way, a given time series , can be represented by the following expansion [15]
(2) 
where is the jth IMF, and , with the number of data samples, and is the number of IMFs that have been extracted by EMD algorithm. Since IMFs are monocomponent or narrow band oscillatory modes, Hilbert analysis provides then meaningful Instantaneous Amplitude (IA) and Frequency (IF) of such modes. Combining the two approaches, the analysis as a whole is referred as HilbertHuang spectral analysis (HHSA). EMD is a fully datadriven technique, making no a priori assumptions on the basis functions for the expansion. It is complete, i.e. the sum of the IMFs and the trend term equals the input data and the IMFs are almost orthogonal. A significance test to distinguish IMFs possibly due to noise from significant ones has been introduced in [17]. Due to the fact that IMFs of white noise obtained by EMD decomposition are normally distributed, spread lines giving IMF significance against white noise can be obtained.
A known drawback of EMD application to noisy dataset is mode mixing, namely an IMF containing oscillations of widely different scales or different IMFs having very similar ones [15]. To deal with mode mixing and intermittency related problems, and to improve its frequency resolution, a recent modification of the EMD algorithm was introduced in [22], the time varying filter EMD (tvfEMD). The concept of IMF is replaced by that of local narrow band signal, based on instantaneous bandwidth. To extract local narrow band signals, Bsplines are employed as a filter with time varying cut off frequency. The main steps of the sifting procedure, based on the tvfEMD algorithm, can be found in [22], algorithm 3. To simplify the notation, in the remaining of this paper the term IMFs refers to the narrow band signals extracted by the tvfEMD algorithm.
The methodology adopted in this paper is based on tvfEMD and is hereafter described. The flowchart of the methodology of analysis can be found in Figure 1.
The steps of the adopted methodology are here listed:

Preprocessing: If missing data are present in the data , interpolation is carried out.

Detrend: Classical EMD is firstly performed on the data. The last obtained component, , is considered to represent the adaptive trend or baseline wandering of the data, and is removed before further analysis.

Oscillatory modes extraction: To extract local narrow band signals from the data, tvfEMD algorithm is applied and a set of IMFs, , is obtained.

IMFs that are significant against white noise, and that furthermore have , are considered as relevant oscillatory modes of the data.
While relevant oscillatory modes constituting the signal can be characterised in terms of AMFM modulations, obtained residuals are further characterised in terms of outlier occurrence following the approach of [18]. Residuals are normalised at zero mean and unit variance, and outliers are defined as those values greater than or less than . Residuals correlations are instead evaluated in terms of their Hurst exponent, as described in [15, 25]. In order to do so, Hurst exponent of the residuals is estimated following the approach described in [25] (Equation 5). At the end of the analysis, performance of the denoising step is evaluated by means of the following parameters, as defined in [23]:

Mean squared error:

Mean absolute error:

Signal to noise ratio:

Peak signal to noise ratio:

Crosscorrelation between and :
where is the input data, is the extracted signal, RMSE is the square root of MSE and , with the number of data samples.
2 Results and discussion
Results of the analysis are summarised in this section for the 28 stations of the CTBTO IMS network.
In Figure 2 the Be extracted adaptive trend are shown. Trends are normalised to zero mean and unit variance for a better comparison. A unique behaviour in terms of latitude cannot be discerned, possibly due to the widely different locations and altitudes of the different stations of the network. Monotonic trends are prevalent in both Northern and Southern hemispheres (divided by the horizontal black line), while there are also nonmonotonic trends that express a nonconstant growth of beryllium7 concentrations. It is important to notice that trend behaviour is not completely determined for some stations, since they start years after 2003. Be trends have been also crosscorrelated with the temperature ones. Even though the majority of correlations are high, a clear pattern cannot be evinced.
In Figure 3, the annual IMF of beryllium7 activity concentration is shown for all the analysed stations. To meaningfully compare the different yearly oscillations, they have been normalised to zero mean and unit variance. Maxima and minima alternate regularly and appear to be shifted in time both in Northern and Southern hemispheres (divided by the horizontal black line). Peaks of the annual oscillation are almost regularly delayed going from the equator to high latitudes, and the same occurs going from the equator to low latitudes. Such behaviour is possibly related to shifting of the Hadley cell and of the inter tropical convergenze zone (ITCZ), as also noted in [26]. It should be noted that the considered stations are not uniformly distributed around the globe, and latitudes are not continuous from the top to the bottom of Figure 3. Furthermore, stations RN79, RN26, RN06, RN23 and RN04 exhibit the highest value of the yearly peak in 2010. Due to their locations, this is possibly related to the El Niño event occurred in 20092010.
In Figure 4 the occurrence of outliers higher than (red) and lower than (blue) in Be residuals is shown. The two hemispheres are divided by the horizontal black line. A higher number of outliers is observed in the central part of the dataset between 2007 and mid 2012, and in general low values are not frequent, meaning a very rare occurrence of high drops in beryllium7 concentrations. Furthermore the number of outliers is higher in the Northern hemisphere compared to the Southern one, and represents the 61.4% of the total outliers. Almost 20% of the Northern hemisphere’s outliers are above , and 4% is above . Percentages are lower for the Southern hemisphere, and the number of outliers above and is half the number of the corresponding Northern outliers. Residuals correlations have been estimated via the Hurst exponent . All values oh ranges between and , indicating strong longrange autocorrelations of residuals time series.
Finally, figure 5 shows the yearly IMF extracted from temperature data, normalised to zero mean and unit variance for a better comparison. The number of stations is less than 28 since some stations had a poor quality temperature time series. A simple seasonal pattern is observed in this case with a six month shift between Northern and Southern hemisphere maximum temperature, as expected. Crosscorrelations between temperature and Be IMFs have also been evaluated. High correlations have been found in all but two stations, namely RN06 in the Southern hemisphere and RN17 in the Northern hemisphere. These two stations are characterised by a non perfect annual oscillation, as can be seen in Figure 3.
Conclusions
In this paper, data of Be, sampled daily at 28 different stations of the CTBTO IMS and worldwide distributed, have been analysed and characterised using a recent technique based on EMD. Their adaptive trend, yearly variability and outlier occurrence have been investigated. The annual cycle shows a different pattern at different latitudes, and is delayed at highest and lowest latitudes. Also outliers in the residuals time series occur more frequently in the Northern hemisphere, while autocorrelation properties seem not to depend on site location.
EMD has been proven to be suitable for this analysis, since it is a tool for timefrequency analysis of nonlinear and nonstationary data. Being fully adaptive, it doesn’t make any a priori assumption about the basis of expansion, which is carried out in term of amplitude and frequency modulated IMFs. The recently developed tvfEMD is an extension of such method, improving its frequency resolution while mitigating end effects, mode mixing and intermittency.
Denoising performance has been evaluated selecting a threshold of 0.5 for the Hurst exponent. However, further testing is needed in order to possibly select an optimal Hurst exponent as a thresholding parameter. This will be the object of future research.
References
 [1] Bizzarri, A., Dunham, E. M., & Spudich, P. (2010). Coherence of Mach fronts during heterogeneous supershear earthquake rupture propagation: Simulations and comparison with observations. Journal of Geophysical Research: Solid Earth, 115(B8).
 [2] Plastino, W., Plenteda, R., Azzari, G., Becker, A., Saey, P. R., & Wotawa, G. (2010). Radioxenon time series and meteorological pattern analysis for CTBT event categorisation. Pure and applied geophysics, 167(45), 559573.
 [3] Lee, H. I., Huh, C. A., Lee, T., & Huang, N. E. (2015). Time series study of a 17year record of Be and Pb fluxes in northern Taiwan using ensemble empirical mode decomposition. Journal of environmental radioactivity, 147, 1421.
 [4] Blender, R., Zhu, X., & Fraedrich, K. (2011). Observations and modelling of 1/fnoise in weather and climate. Advances in Science and Research, 6(1), 137140.
 [5] Alford, M. H., MacKinnon, J. A., Simmons, H. L., & Nash, J. D. (2016). Nearinertial internal gravity waves in the ocean. Annual review of marine science, 8, 95123.
 [6] Wunsch, C. (2007). The past and future ocean circulation from a contemporary perspective. GEOPHYSICAL MONOGRAPHAMERICAN GEOPHYSICAL UNION, 173, 53.
 [7] Balasis, G., Daglis, I. A., Zesta, E., Papadimitriou, C., Georgiou, M., Haagmans, R., & Tsinganos, K. (2012, December). ULF wave activity during the 2003 Halloween superstorm: multipoint observations from CHAMP, Cluster and Geotail missions. In Annales Geophysicae (Vol. 30, No. 12, pp. 17511768). Copernicus GmbH.
 [8] Occhipinti, G., Rolland, L., Lognonné, P., & Watada, S. (2013). From Sumatra 2004 to TohokuOki 2011: The systematic GPS detection of the ionospheric signature induced by tsunamigenic earthquakes. Journal of Geophysical Research: Space Physics, 118(6), 36263636.
 [9] Driggers, J. C., Harms, J., & Adhikari, R. X. (2012). Subtraction of Newtonian noise using optimized sensor arrays. Physical Review D, 86(10), 102001.
 [10] Fiorucci, D., Harms, J., Barsuglia, M., Fiori, I., & Paoletti, F. (2018). Impact of infrasound atmospheric noise on gravity detectors used for astrophysical and geophysical applications. Physical Review D, 97(6), 062003.
 [11] Niu, D., Liang, Y., & Hong, W. C. (2017). Wind speed forecasting based on emd and grnn optimized by foa. Energies, 10(12), 2001.
 [12] Soehle, M., Czosnyka, M., Chatfield, D. A., Hoeft, A., & Peña, A. (2008). Variability and fractal analysis of middle cerebral artery blood flow velocity and arterial blood pressure in subarachnoid hemorrhage. Journal of Cerebral Blood Flow & Metabolism, 28(1), 6473.
 [13] Manjula, M., & Sarma, A. V. R. S. (2012). Comparison of empirical mode decomposition and wavelet based classification of power quality events. Energy Procedia, 14, 11561162.
 [14] Koh, M. S., & RodriguezMarek, E. (2012, November). Speech enhancement of color noise using empirical mode decomposition. In Signals, Systems and Computers (ASILOMAR), 2012 Conference Record of the Forty Sixth Asilomar Conference on (pp. 16881692). IEEE.
 [15] Huang, N. E., Shen, Z., Long, S. R., Wu, M. C., Shih, H. H., Zheng, Q., … & Liu, H. H. (1998, March). The empirical mode decomposition and the Hilbert spectrum for nonlinear and nonstationary time series analysis. In Proceedings of the Royal Society of London A: mathematical, physical and engineering sciences (Vol. 454, No. 1971, pp. 903995). The Royal Society.
 [16] Huang, N. E. (2014). HilbertHuang transform and its applications (Vol. 16). World Scientific.
 [17] Wu, Z., & Huang, N. E. (2004, June). A study of the characteristics of white noise using the empirical mode decomposition method. In Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering Sciences (Vol. 460, No. 2046, pp. 15971611). The Royal Society.
 [18] Bianchi, S., Longo, A., & Plastino, W. (2018). A new methodological approach for worldwide beryllium7 time series analysis. Physica A: Statistical Mechanics and its Applications, 501, 377387.
 [19] Bianchi, S., Longo, A., Plastino, W., & Povinec, P. P. (2018). Evaluation of Be and Xe atmospheric radioactivity time series measured at four CTBTO radionuclide stations. Applied Radiation and Isotopes, 132, 2428.
 [20] Bianchi, S., & Plastino, W. (2018). Uranium time series analysis: A new methodological approach for event screening categorisation. Journal of environmental radioactivity, 183, 3740.
 [21] Longo, A., Bianchi, S., & Plastino, W. (2018). Xenon and radon time series analysis: A new methodological approach for characterising the local scale effects at CTBT radionuclide network. Applied Radiation and Isotopes, 139, 209216.
 [22] Li, H., Li, Z., & Mo, W. (2017). A time varying filter approach for empirical mode decomposition. Signal Processing, 138, 146158.
 [23] Sundar, A., Pahwa, V., Das, C., Deshmukh, M., & Robinson, N. (2016). A comprehensive assessment of the performance of modern algorithms for enhancement of digital volume pulse signals. International Journal of Pharma Medicine and Biological Sciences, 5(1), 91.
 [24] Mert, A., & Akan, A. (2014). Detrended fluctuation thresholding for empirical mode decomposition based denoising. Digital Signal Processing, 32, 4856.
 [25] Rilling, G., Flandrin, P., & Gon alves, P. (2005, March). Empirical mode decomposition, fractional Gaussian noise and Hurst exponent estimation. In Acoustics, Speech, and Signal Processing, 2005. Proceedings.(ICASSP’05). IEEE International Conference on (Vol. 4, pp. iv489). IEEE.
 [26] Doering, C., & Saey, P. (2014). Hadley cell influence on Be activity concentrations at Australian mainland IMS radionuclide particulate stations. Journal of environmental radioactivity, 127, 8894.