A Analytic deduction for the coarse-grained D estimator

Automatic estimation of attractor invariants

Abstract

The invariants of an attractor have been the most used resource to characterize a nonlinear dynamics. Their estimation is a challenging endeavor in short-time series and/or in presence of noise. In this article, we present two new coarse-grained estimators for the correlation dimension and for the correlation entropy. They can be easily estimated from the calculation of two U-correlation integrals. We have also developed an algorithm that is able to automatically obtain these invariants and the noise level in order to process large data sets. This method has been statistically tested through simulations in low-dimensional systems. The results show that it is robust in presence of noise and short data lengths. In comparison with similar approaches, our algorithm outperforms the estimation of the correlation entropy.

keywords:
Correlation dimension, Correlation entropy, U-correlation integral
\glsdisablehyper\newglossary

[algh]hiddenacrhacnhHidden Acronyms \newglossary[slg]symbolssotstnLista de Símbolos \newacronymgciGCIGaussian kernel correlation integral \newacronymnciiNCInoise assisted correlation integral \newacronymuciUCIU-correlation integral \newacronymuci2U correlation integral with \newacronymucimU correlation integral with \newacronymcdcorrelation dimension \newacronymk2correlation entropy \newacronymnlnoise level \newacronym[longplural=coarse-grained estimators, shortplural=CgEs, firstplural=Coarse-grained estimators (CgEs) ]cgeCgECoarse-grained estimator \newacronymsnrSNRsignal to noise ratio \newacronymetalet al.and others \newacronympdfpdfprobability density function \newacronymeegEEGelectroencefalograma \newacronymhrvVFCvariabilidad de frecuencia cardíaca \newglossaryentrysot:genci type=symbols, name=, description=Integral de correlación generalizada, sort=integraldecorrelacióngeneralizada \newglossaryentrysot:sci type=symbols, name=, description=Integral de correlación estándar, sort=integraldecorrelaciónestandar \newglossaryentrysot:gci type=symbols, name=, description=Integral de correlación de núcleo Gaussiano, sort=integraldecorrelacióndenúcleog \newglossaryentrysot:nci type=symbols, name=, description=Integral de correlación asistida por ruido, sort=Integraldecorrelacionasistida \newglossaryentrysot:sumnsci type=symbols, name=, description=Suma de correlación estándar, sort=sumadecorrelacionestandar \newglossaryentrysot:uci type=symbols, name=, description=Integral de correlación U, sort=Integraldecorrelacionu \newglossaryentrysot:uci2 type=symbols, name=, description=Integral de correlación U con , sort=Integraldecorrelacionubdos \newglossaryentrysot:ucim type=symbols, name=, description=Integral de correlación U con , sort=Integraldecorrelacionubm \newglossaryentrysot:cdu type=symbols, name=, description=Estimador dependiente de la escala para la dimensión de correlación basado en la , sort=estimadordeescalaparadu \newglossaryentrysot:cku type=symbols, name=, description=Estimador dependiente de la escala para la dimensión de correlación basado en la , sort=estimadordeescalaparaku \newglossaryentrysot:csu type=symbols, name=, description=Estimador dependiente de la escala para la dimensión de correlación basado en la , sort=estimadordeescalaparasu \newglossaryentrysot:cdn type=symbols, name=, description=Estimador dependiente de la escala para la dimensión de correlación basado en la , sort=estimadordeescalaparadn \newglossaryentrysot:ckn type=symbols, name=, description=Estimador dependiente de la escala para la entropía de correlación basado en la , sort=estimadordeescalaparakn \newglossaryentrysot:csn type=symbols, name=, description=Estimador dependiente de la escala para el nivel de ruido basado en la , sort=estimadordeescalaparasn \newglossaryentrysot:Delu type=symbols, name=, description=Funcional para el nivel de ruido basado en , sort=funcionalnivelderuido \newglossaryentrysot:heavi type=symbols, name=, text=función escalón, description=Función escalón, sort=funcionescalon \newglossaryentrysot:igamma type=symbols, name=, description=Función Gamma incompleta superior, sort=funcióngammain \newglossaryentrysot:gamma type=symbols, name=, description=Función Gamma, sort=funcióngamma \newglossaryentrysot:1f1 type=symbols, name=, description=Función Hipergeométrica Confluente, sort=funcionhipergeometricaconfluente \newglossaryentrysot:2f1 type=symbols, name=, description=Función Hipergeométrica de Gauss, symbol=, sort=funcionhipergeometricadegaus \newglossaryentrysot:pdf type=symbols, name=, description=Función de densidad de probabilidad de la variable aleatoria , sort=funciondedensidad \newglossaryentrysot:cdf type=symbols, name=, description=Función de distribución acumulada de la variable aleatoria , sort=funciondedistribucionacumulada \newglossaryentrysot:ccdf type=symbols, name=, description=Función de distribución acumulada complementaria de la variable aleatoria , sort=funciondedistribucionacumulada \newglossaryentrysot:deltaD type=symbols, name=, description=Función delta de Dirac , sort=funciondeltad \newglossaryentrysot:delta type=symbols, name=, description=Distribución delta de Kroenecker, sort=distribuciondelta \newglossaryentrysot:mnorm type=symbols, name=, description=Distribución normal multivariada de media cero y matriz de covarianza , sort=distribucionnormalmulti \newglossaryentrysot:norm type=symbols, name=, description=Distribución normal de media cero y varianza , sort=distribucionnormal \newglossaryentrysot:unif type=symbols, name=, description=Distribución uniforme con soporte en , sort=distribucionuniforme \newglossaryentrysot:chi type=symbols, name=, description=Distribución Chi, symbol=, sort=distribucionchi \newglossaryentrysot:chis type=symbols, name=, description=Distribución Chi cuadrada, symbol=, sort=distribucionchicuadrada \newglossaryentrysot:l type=symbols, name=, description=Número de vectores de estado, sort=numerodevectoresdeestado \newglossaryentrysot:d0 type=symbols, name=, description=Dimensión de conteo por cajas, sort=dimensiondeconteoporcajas \newglossaryentrysot:m0 type=symbols, name=, description=Mínima dimensión de inmersión, sort=minimadimensiondeinmersion \newglossaryentrysot:m type=symbols, name=, description=Dimensión de inmersión, sort=dimensiondeinmersion \newglossaryentrysot:beta type=symbols, name=, description=Dimensión del ruido utilizado en el algoritmo de correlación asistido por ruido - Grados de libertad, sort=dimensiondelruido \newglossaryentrysot:tau type=symbols, name=, description=Retardo de inmersión, sort=retardodeinmesion \newglossaryentrysot:h type=symbols, name=, description=Escala o umbral, sort=escala \newglossaryentrysot:n type=symbols, name=, description=Longitud de la serie temporal, sort=longituddelaserietemp \newglossaryentrysot:Q type=symbols, name=, description=Número de comparadores que componen el algoritmo de correlación - Número de distancias entre pares de vectores de estado, sort=Numerodecomparadores \newglossaryentrysot:mu type=symbols, name=, description=Realización de ruido utilizada en el algoritmo de correlación asistido por ruido, sort=realizaciónderuido \newglossaryentrysot:distinf type=symbols, name=, description=Distancia Infinito, sort=distanciainfinito \newglossaryentrysot:dist type=symbols, name=, description=Distancia Euclídea, sort=distanciaeuclidea \newglossaryentrysot:sdist type=symbols, name=, description=Cuadrado de la distancia Euclídea, sort=cuadradodeladistanciaeuclidea \newglossaryentrysot:Z2 type=symbols, name=, description=Cuadrado de la distancia Euclídea entre dos vectores de estado, sort=cuadradodeladistanciaeuclideaentrevectores \newglossaryentrysot:Z type=symbols, name=, description=Distancia Euclídea entre dos vectores de estado, sort=distanciaeuclideaentrevectores \newglossaryentrysot:varn type=symbols, name=, description=Varianza del ruido presente en la serie temporal, sort=varianzadelruido \newglossaryentrysot:vars type=symbols, name=, description=Varianza de la serie temporal s, sort=varianzadelaserietemporal \newglossaryentrysot:apen type=symbols, name=, description=Entropía aproximada, sort=entropiaaproximada \newglossaryentrysot:apenmax type=symbols, name=, description=Entropía aproximada máxima, sort=entropiaaproximadamax \newglossaryentrysot:hmax type=symbols, name=, description=Escala máxima, sort=escalaoptima \setglossarystylelong \counterwithoutfiguresection \cortext[c1]Corresponding author. Tel: +54 0343 4975 100 (122) \cortext[c2]This work was supported by the National Scientific and Technical Research Council (CONICET) of Argentina, the National University of Entre Ríos (UNER), CITER and the Grants: PID- (UNER), PICT-- (ANPCyT-UNER), and PIO-CO (CITER-CONICET). \cortext[c3]This is a post-peer-review, pre-copyedit version of an article published in Nonlinear Dynamics. The final authenticated version is available online at: http://dx.doi.org/10.1007/s11071-017-3974-3

1 Introduction

\glsunset

nl \Glscd and \glsk2 are quantities that characterize the natural measure of an attractor. The first one can be interpreted as the dimension of the attractor. It is a rough measure of the effective number of degrees of freedom (number of variables) involved in the dynamical process. The correlation entropy is a measure of the rate at which pairs of nearby orbits diverge. In other words, it gives us an estimate of the unpredictability of the system [6].

The correlation dimension and entropy allow us to measure the complexity of a dynamical system. Typically, more complex systems have higher dimensions and larger values of entropy. More important, these quantities help us to identify changes in the system’s complexity. For this reason, they have become very popular and widely used in the biomedical field [27], not only to characterize the dynamics of physiological systems, but also to detect different types of pathologies. For example, Choi \glsnameetal used the correlation dimension calculated from voice signal to characterize the dynamics of the phonatory system in normal and pathological conditions. They reported a mean value of for normal voices and an increased dimension value for pathological ones [3]. In Hosseinifard \glsnameetal [15] used \glscd, among other nonlinear measures, extracted from electroencephalographic signals to classify between depressed and normal patients. Combining these features with machine learning techniques, they achieved a classification accuracy. On the other hand, in [9]; [8] the authors illustrated how \glsk2 and other related entropy measures have been used to characterize different biological phenomena.

Since the natural measure of an attractor is invariant under the dynamical evolution, both \glscd and \glsk2 can be easily estimated from indirect time measurements of one of the system variables [19]. For an observed stationary time series of length , , the reconstructed -dimensional delay vectors , , must be formed. Here is the embedding dimension and is the embedding lag. As a collection, these delay vectors constitute the reconstructed attractor of the system in . From another point of view, this collection can be thought as samples drawn from the natural measure of the system. These samples can be used to estimate the invariants \glscd and \glsk2 using the correlation integral . This last quantity is defined as the probability that the distance between two randomly selected -dimensional delay vectors and is smaller than a value  [6]:

(1)

where is a kernel function and is the \glspdf of the distance between pairs of delay vectors. In this article, the Euclidean distance is used, but others measures of distance can be employed. The Grassberger and Procaccia correlation integral \glssot:sci uses as kernel function , where is the Heaviside step function [11]. On the other hand, the \glsgci \glssot:gci proposed by Diks \glsnameetal [5] adopts . For deterministic time series and in absence of noise, both correlation integrals scale as:

where , being the sampling frequency of the time series. This behavior is due to that, in the attractor of a chaotic dynamical system, the expected density of points within a ball of small radius , scales as . Thus, the greater the correlation dimension, the more complex the attractor, and the longer it takes to revisit the same region in it [2]. On the other hand, the factor is related to the exponential divergence of nearby trajectories. The higher the entropy values, the faster those trajectories diverge [7]; [12].

The main advantage of the \glsgci over the Grassberger and Procaccia correlation integral is that the former allows us to model the influence of additive noise on the scaling law. For a time series with added white Gaussian noise of variance , the scaling rule is [5]; [28]; [22]:

(2)

In presence of noise, it is very important to be able to quantify its level \glsnl in order to correct the estimation of \glscd and \glsk2. Moreover, an imprecise estimate of these invariants can lead to mistaken conclusions about the studied phenomena. In this sense, an estimate of the noise level must be reported allowing other researchers to be aware of the limitations of the data.

The main problem of the \glsgci is that it requires high values of to obtain a convergent estimate of \glsk2 [22]; [25]. This is a major drawback, since high values of imply much longer time series in order to achieve good estimations.

The cause of this problem is related with the kernel function of the \glsgci. It can be seen in Eq. (1) that the \glspdf of the interpoint distance depends on the embedding dimension. However, the Gaussian kernel function does not change with , and, for this reason, the convergence of the \glsnamegci to the \glsk2 is slowed down [24].

As a solution we have recently proposed the \glsuci [24]:

(3)

where is the upper incomplete Gamma function, is the Gamma function and is the \glspdf of the squared interpoint distance. There are two important issues about this correlation integral that deserve further analysis. First, note that the \glsuci’s kernel function, given by:

has a parameter that is used to incorporate information about the embedding dimension. In other words, this kernel function is able to change according to . Second, we are now working with the square of the interpoint distance, i.e., , to reduce the computational cost.

In order to find the scaling behavior of the \glsuci, we need an expression for the \glspdf of the squared distances between pairs of noisy delay vectors  [[23], see Eq. (18)]:

(4)

where is the Kummer’s confluent hypergeometric function. This distribution arises under the assumption that, in absence of noise, the \glspdf of the pairwise distances between -dimensional delay vectors (points in the reconstructed attractor) behaves as for a certain range of values [23]. If each of the coordinates of these vectors is perturbed with white Gaussian noise of variance , then the \glspdf of the perturbed distances will follow Eq. (4).

The scaling law for the \glsuci can be obtained by substituting Eq. (4) in Eq. (3) and integrating over  [24]:

(5)

where is a normalization constant and is the Gauss hypergeometric function.

In practical applications, once the scaling law has been selected and the correlation integral has been calculated, there are two main options to estimate \glscd, \glsk2 and \glsnl. The first one is to estimate these invariants through a nonlinear fitting of the scaling rule [5]; [28]. This method is highly dependent on the range of scale selected to fit the model, and there is not a consensus about the correct way to choose it [19]. The second option is to use coarse-grained estimators. These are explicit expressions for , , and as functions of and  [6]. For example, in [22] Nolte \glsnameetal have proposed a set of coarse-grained functions based on the \glsgci. The most important aspect of their approach is that the calculation of these functions only depends on the estimation of \glssot:gci for two consecutive values of the embedding dimension . Nevertheless, given that these estimators are based on the \glsgci, the convergence to \glsk2 needs high values of and, therefore, large amounts of data.

In [24] we have proposed a set of coarse-grained estimators for \glscd, \glsk2 and \glsnl based on the \glsuci. However, these estimators are highly dependent on a precise estimation of \glsnl.

In this article we present a new set of coarse-grained estimators for \glscd and \glsk2, based on the \glsuci, that only needs the calculation of two U-correlation integrals i. e. they do not require the a priori estimation of any quantity. Moreover, we propose a methodology to automatically estimate these invariants and the noise level of the time series.

In Section 2 we derive the here proposed coarse-grained estimators. Section 3 is devoted to describe the algorithm to automatically estimate the invariants. In Section 4 we discuss these results and give some suggestions for the practical of the algorithm. Finally, the conclusions are presented in Section 5.

2 Theory

Following a similar approach to one given by Nolte \glsnameetal [22], the here proposed coarse-grained estimators are based on a noise level functional. This quantity, closely related to the noise level of the time series, can be estimated using two U-correlation integrals. In this work, the noise level functional will be employed to correct the deviation of the coarse-grained estimators from the scaling law in presence of noise.

2.1 Noise level functional

The noise level functional based on the \glsuci can be defined as [24]:

(6)

where

As it can be observed from Eq. (6), is a function that decreases monotonically from to on a scale that is proportional to the value of \glsnl. This functional depends on the logarithmic derivative of two U-correlation integrals: and . Note that the parameter is set equal to in both correlation integrals.

In order illustrate the behavior of this noise level functional, we will use the Henon map:

where and . This map has been widely used in the literature to numerically characterize different methods devoted to estimate attractor’s invariants [26]; [23]; [20]; [6]; [28]; [22]; [19]; [25]; [16]; [4]. For this map, and the given parameters values, the correlation dimension and correlation entropy are \glscd and \glsk2, respectively [28].

(a)
(b)
Figure 1: Henon map: Noise level functional for , the curves for different values are color-coded in grayscale where the lightest gray corresponds to . (a) Noiseless. (b) . The theoretical curve (right-hand side of Eq. (6)) is shown in dashed black line.

The noise level functionals for both a clean and a noisy () realizations of the Henon map of length are shown in Fig. b. Note that in absence of noise (Fig. a), this functional is close to zero for a wide range of values regardless the value of . On the other hand, in Fig. b it can be observed that falls of from to and the curves for different values are very close to the theoretical curve for . We must point out that the closest curve of this functional to the theoretical value is the one corresponding to . For larger values of , slightly deviates from this behavior.

2.2 Noise level coarse-grained estimator

Starting from Eq. (6), we have derived the coarse-grained noise level estimator [24]:

(7)
(a)
(b)
Figure 2: Henon map: Noise level coarse-grained estimator for , the curves for different values are color-coded in grayscale where the lightest gray corresponds to . (a) Noiseless. (b) . The true value of the noise level is shown in dashed black line.

We must point out that all time series used in this work were rescaled to have unitary standard deviation. In consequence, is the noise level after rescaling, i.e.,

where is the noise variance and is the variance of the clean time series. In this sense corresponds to clean time series and implies only noise. The \glssnr can be calculated as:

In Fig. b, it is shown the coarse-grained estimator for noise level as a function of for for both a clean and a noisy () realization of the Henon map with length . It can be observed in Fig. a that, for the noiseless case, approaches to zero as . On the other hand, Fig. b shows that, when the noise level is increased, underestimates the value of \glsnl, but it is not too far from its real value. Note that the curve of closest to the real \glsnl is the one calculated for (top curve). Moreover, the range of values where \glsnl can be estimated is larger for than for the other values. This suggests that a coarse-grained estimator based on the \glsgci could be better than one based on the \glsuci for noise level estimation.

(a)
(b)
Figure 3: Henon map: Correlation dimension coarse-grained estimator for , the curves for different values are color-coded in grayscale where the lightest gray corresponds to . (a) Noiseless. (b) . The reported value of the correlation dimension is shown in dashed black line.

2.3 Correlation dimension coarse-grained estimator

Here we present a new coarse-grained estimator for \glscd (see Appendix A for deduction) which is given by:

(8)

The calculation of requires the estimation of and two correlation integrals: and . In [24] we have shown that if , then . Moreover, under the same condition . Replacing in Eq. (8) we have that:

This means that, in absence of noise, this coarse-grained estimator is the logarithmic derivative of the \glsuci, which tends to \glscd. On the other hand, if noise is present, the second term in the right-hand side of Eq. (8) helps to correct the estimation.

Since the \glsgci is a special case of the \glsuci for . The coarse-grained estimator of \glscd proposed by Nolte \glsnameetal in [22] can be easily derived from Eq. (16) (see Appendix A.1).

In Fig. b, the estimations of for both a clean and a noisy () realizations of the Henon map are shown for . In absence of noise, it can be seen in Fig. a that the estimator oscillates near the reported value of correlation dimension () for a wide range of scales, regardless the value of . In contrast, Fig. b shows that overestimates \glscd. Furthermore, the best estimation is reached with the lowest value of the embedding dimension, . This is related with the previously discussed scaling behavior of . However, it can be observed that the values of are close to the reported value of \glscd.

(a)
(b)
Figure 4: Correlation entropy coarse-grained estimator for , the curves for different values are color-coded in grayscale where the lightest gray corresponds to . (a) Noiseless. (b) . The reported value of the correlation entropy is shown in dashed black line.
(a)
(b)
(c)
Figure 5: Henon map with noise (). Estimation of for through different coarse-grained estimators, the curves for different values are color-coded in grayscale where the lightest gray corresponds to . (a) Jayawardena \glsnameetal (b) Nolte \glsnameetal (c) . The reported value is shown in dashed black line.
(a)
(b)
(c)
Figure 6: Rössler system. Estimation of for through different coarse-grained estimators, the curves for different values are color-coded in grayscale where the lightest gray corresponds to . (a) Jayawardena \glsnameetal (b) Nolte \glsnameetal (c) . The reported value is shown in dashed black line.

2.4 Correlation entropy coarse-grained estimator

Here proposed coarse-grained entropy estimator is defined as (see Appendix B):

(9)

This coarse-grained estimator depends on the correlation integrals and , , the logarithmic derivative , and the coarse-grained estimator . When the quantities , and taking into account that for  [24]:

then it can be easily shown that:

The estimations of from both a clean and a noisy realization () of the Henon map for are presented in Fig. b. It can be observed that in both cases (Fig. a, Fig. b) is near to the reported value of for all values. This is the main advantage of the \glsuci in comparison with other correlation integrals [24].

The last statement can be confirmed by comparing with two similar coarse-grained estimators based on different correlation integrals. The first one was proposed by Jayawardena \glsnameetal [17], and it is based on the Grassberger and Procaccia correlation integral (). The second one was proposed by Nolte \glsnameetal, and it is calculated from the \glsgci (). These approaches, as well as the here proposed, only requires the estimation of two correlation integrals of the same kind.

In Fig. c, the estimations of \glsk2 for the Henon map through the estimators proposed by Jayawardena \glsnameetal, Nolte \glsnameetal and can be compared. They were calculated from a time series of length with added white Gaussian noise at a level . The correlation sums , , and were calculated for with . Figure a shows the results for the estimator proposed by Jayawardena \glsnameetal It can be observed that, as is increased, this estimator slowly approaches to the reported value of \glsk2. However, it must be pointed out that, regardless that the dimension of this map is low (), high values of are needed in order to converge. On the other hand, it can be seen in Fig. b that the results of the coarse-grained estimator proposed by Nolte \glsnameetal also converge slowly to the reported value for \glsk2. In contrast, in Fig. c it can be observed that the estimator approaches faster to the value of \glsk2 than the other two studied estimators. Note that the curves for different values of are very close to each other, meaning that this estimator is more robust to changes in the parameter . Please note that the curves corresponding to of Fig. c are also shown in Fig. b.

The same methodology was applied to a clean time series obtained from the Rössler system:

where , and . The system was solved using the Runge-Kutta method with time step and initial condition , , . The first data points were discarded, and the next data points were taken to estimate the attractor’s invariants. The correlation sums , , and were calculated for with .

The results for the estimators proposed by Jayawardena, Nolte, and are shown in Figs. ab and c, respectively. In all three cases, a typical scaling behavior can be observed. Once again, the estimators proposed by Jayawardena and Nolte need high values of , in contrast with . Interestingly, all estimators converge to values of correlation entropy that differs from the one reported ([6]. Both the Jayawardena’s and Nolte’s estimators converge to , while converges to . From these results, we can conclude that the estimator converges by using low values of , in comparison with the previously proposed estimators [22]; [17].

3 Automatic invariants estimation

In practical applications, it is important to verify the existence of a scaling behavior for a range of . The determination of a stable scaling region, which is required to estimate the invariants, is usually done by visual inspection. However, that is not feasible when large sets of data are studied Additionally, the subjective judgment should be avoided. The necessity of an algorithm for the automatic estimation of invariants arises, once the scaling regime has been confirmed. In this section, we propose an algorithm (see algorithm 2) based on the coarse-grained estimators , and , that is able to automatically estimate these invariants. Before describing this algorithm, we need to detail some aspects about the calculation of the U-correlation integral.

3.1 Noise-assisted correlation algorithm

Figure 7: Noise-assisted correlation algorithm. The -th comparator () with threshold receives as input the squared distance between two -dimensional delay vectors minus an i.i.d. realization . All binary outputs are averaged to calculate the U-correlation sum .

The \glsuci is approximated with U-correlation sum . The calculation of requires a computationally expensive numerical integration due to the kernel of the \glsuci. However, we can use the noise-assisted correlation algorithm instead [24] (see Fig. 7). This algorithm begins by calculating the squared distances between each pair of delay vectors , where and . Then, for each squared distance an i.i.d. noise realization is subtracted. Here, and are the shape and scale parameters, respectively. The result is then compared with a threshold to produce a binary output . Then is the average of all binary outputs. These steps must be repeated for each value of and . Algorithm 1 describes all the steps that must be followed in order to estimate the noise-assisted U-correlation sum. The noise realizations were generated using realizations given that, if , then .

As an example, the correlation integral can be approximated by which in turn is calculated from the squared distances of -dimensional delay vectors, and noise realizations that follow a Gamma distribution with shape parameter equal to . Similarly, the correlation sum can be obtained from the squared distances of -dimensional delay vectors and noise realizations following a Gamma distribution with shape parameter equal to .

1: Form the -dimensional delay vectors from the temporal series and calculate all pairwise squared distances with .
2: Fix the value of the parameter and obtain realization of noise from the distribution .
3: Calculate the binary variable:
4: Calculate the U-correlation sum as:
5: Repeat steps - for all values of and .
Algorithm 1 Noise-assisted U-correlation algorithm.

3.2 Algorithm for automatic estimation of invariants

(a)
(b)
(c)
(d)
(e)
(f)
(g)
(h)
(i)
Figure 8: Henon map. Box plot of the estimation of , , and for different numbers of available delay vectors . First row: estimation of . (a) noiseless, (b) , and (c) . Second row: estimation of . (d) noiseless, (e) , and (f) . Third row: estimation of . (g) noiseless, (h) , and (i) . The reported values are shown in dashed black line.

The automatic algorithm to estimate invariants begins by approximating the U-correlation integrals and , for different values () using the noise-assisted correlation algorithm (Alg. 1). This step does not requires a narrow range of as initial guess. For example, for a normalized time series the range is a good choice. On the other hand, note that the expression requires values of since Gamma distribution’s shape parameter must be greater than zero.

Next, the logarithmic derivatives , are computed and is calculated using Eq. (6). To calculate the logarithmic derivatives, we make use of a wavelet transform approach to approximate the derivative [21]. This allows us to achieve better estimations of the invariants because it implements a low-pass filter reducing the high-frequency oscillations that are naturally present in estimators derived from the noise-assisted algorithm [24]. Please observe that correlation integral is equal to correlation integral evaluated at . In order to obtain \glsnl, the coarse-grained estimator must be calculated [Eq. (7)].

To obtain a suitable range of scale values where the invariants can be estimated, one must look for a range of where the coarse-grained estimator is nearly constant and its variation across the different values is the smallest. In this sense, for the noise level we define the functions:

and

where , is the average of across . is the average over of the derivative of with respect to , gives the variation of across and is the product of the aforementioned functions. We propose to estimate \glsnl within a range of centered at the value at which is minimum. In this way, \glsnl is estimated in a range of centered in a plateaued region (scaling region) of , and its value is consistent through the parameter .

The correlation dimension is determined using the coarse-grained estimator [Eq. (8)]. To find a range of values to estimate \glscd, we define the functions , and similarly to the ones above but using the estimator instead of . The estimation of \glscd must be done within a range of centered at the value at which is minimum.

Finally, the correlation entropy can be estimated using the coarse-grained estimator . As it can be seen in Eq. (9), this estimator depends on , , the coarse-grained estimator and the ratio between and . \glsk2 must be determined within a range of centered at the value at which is minimum.

1: Calculate and using Alg. 1 for .
2: Calculate using Eq. (6) and the \glsuci obtained in step .
3: Compute with Eq. (7) and obtain the function . Estimate within a range of centered at the value where is minimum.
4: Use , and to calculate (Eq. (8)) and obtain the function . Estimate within a range of centered at the value where 1 is minimum.
5: Obtain (Eq. (9)) using , and . Calculate the function . Estimate within a range of centered at the value where is minimum.
Algorithm 2 Automatic estimation of attractors’ invariants.

3.3 Simulations

The aforementioned methodology was applied to a set of realization of the Henon map with different noise levels and data lengths which are proportional to the number of available delay vectors to estimate the invariants \glsnl, \glscd and \glsk2. To calculate the \glsucis for all realizations were normalized (unitary standard deviation), and the nearest temporal neighbors of each delay vector were discarded [19].

(a)
(b)
(c)
(d)
(e)
(f)
(g)
(h)
(i)
Figure 9: Mackey-Glass box plot of the estimation of , , and for different numbers of available delay vectors, . First row: estimation of . (a) noiseless, (b) , and (c) . Second row: estimation of . (d) noiseless, (e) , and (f) . Third row: estimation of . (g) noiseless, (h) , and (i) . The reported values are shown in dashed black line.

Figure i shows the boxplots of the invariants \glsnl, \glscd and \glsk2 for . Unfortunately, there is no other automatic algorithm for the estimation of invariants. Due to this reason, we only can compare the results of our estimations with previously reported values calculated by selecting the scaling range by visual inspection. For the noise level estimation (first row of Fig. i), it can be observed that, in absence of noise, our algorithm overestimates the value of \glsnl (dashed black line). On the contrary, for higher noise levels () the proposed method slightly underestimates \glsnl. It can be noticed that, regardless the noise level, the median of each box is very close to the true value of \glsnl and the variance of the estimation decreases as the number of data points increases. The estimation of the correlation dimension is shown in the second row of Fig. i. It can be seen in Fig. d that for noiseless time series the estimation of \glscd is very close to the reported value () and its variance decreases as the length of the time series is increased. For higher noise levels, the correlation dimension is slightly biased, but the estimations are still close to the previously reported value.

The results for the correlation entropy are shown in the third row of Fig. i. In general, the here proposed method yields correlation entropy values converging to values lesser than the reported for this map ().

The proposed approach was also tested on the Mackey-Glass time series, which is generated by the following nonlinear time-delay differential equation:

where , , and . We estimate the invariants , , and from realizations with different initial conditions and normalized to have unitary standard deviation. The embedding dimensions were and the embedding lag was set to the first local minimum of the mutual information function ([18]. Finally, the nearest temporal neighbors of each delay vector were discarded.

The results of these simulations are presented in Fig. i and the estimations of the noise level are presented in Figs. a, b and c. In absence of noise the algorithm estimates small positive values of \glsnl. This behavior is explained because the function tends to zero as the length of the time series increases, producing more accurate estimations with less variance. In presence of noise ( and ) the here proposed algorithm underestimates the noise level, but in Fig.i it can be seen how the median values tend to the real value of \glsnl as increases.

The estimations of the correlation dimension are shown in the second row of Fig. i. For clean time series, our algorithm provides estimations of \glscd very close to the reported value  [13]. As the data length is increased, the estimations approach the reported value. In case of noisy time series, the estimations of the correlation dimension are greater than those obtained in the noiseless case. Nevertheless, the estimation of \glscd is still very good.

In Figs. g, h and i the estimations for the correlation entropy are presented. Our algorithm achieves satisfactory estimations of \glsk2 (reported value [14], even in presence of noise. It can be observed that regardless the value of the estimation of is consistent, but its variance decreases as the data length increases.

4 Discussion

In this section, we review the main contributions of this article and give some practical recommendations for its implementation. In this work, we introduce a new methodology to estimate the correlation dimension, the correlation entropy and the noise level of time series. This approach is based on the recent proposed U-correlation integral (Eq. (3)), introduced in [24], which uses a kernel function that depends on the embedding dimension. From the scaling function of the \glsuci (Eq. (5)), we have developed a set of coarse-grained estimators for the correlation dimension (Eq. (8)), the correlation entropy (Eq. (9)) and the noise level (Eq. (7)). The advantage of these coarse-grained estimators is that they only depend on the estimation of two U-correlation integrals. In other words, they do not need of the tuning of any external parameter and/or the estimation of the noise level, as it does other approaches [6]; [16]; [24].

From the results presented in the previous section, we can conclude that the here proposed algorithm (Alg. 2) is able to automatically select the scaling region and to estimate the invariants. As far as we know, there is no other algorithms with these features reported in the literature. For this reason, our results were compared against reported estimations that were obtained selecting the scaling region by visual inspection.

Regarding the coarse-grained estimators, the most outstanding result was obtained with . It is more accurate than previously proposed estimators and needs lesser values of to converge.

Finally, it is important to recall that the implementation of our method only requires the calculation of two correlation integrals: and , that can be obtained through the noise-assisted correlation algorithm (Alg. 1). It is also important to say that the time series must be normalized to have unitary standard deviation. As it was reported in [24], the correlation integrals computed using the noise-assisted correlation algorithm present oscillations for small values of . This phenomenon makes difficult to look for scaling regions over small values. In order to avoid this difficulty three strategies can be followed: The first one is to increase the number of comparators in the noise-assisted correlation algorithm by increasing the data length. This will make coarse-grained curves smoother and also will improve the convergence of these estimators. The second one is to increase by making copies of the squared distances between delay vectors and adding different noise realizations to them. This will not improve the convergence of any coarse-grained estimators to the invariant since no new information is added, but the ripple in the coarse-grained curves will be reduced. The third one, and the one we have found as the more adequate, is to use the algorithm proposed in [21] to calculate the logarithmic derivatives of the \glsuci. This algorithm calculates a smoothed version of the derivative of a function using the wavelet transform.

5 Conclusions

The original contributions of this document are twofold: both a new method for the estimation of invariants based on the recently proposed U-correlation integral, and an algorithm for the automatic selection of the scaling region. The combined use of these two ideas allows to automatically estimate the noise level, correlation dimension and correlation entropy of a dynamical system. This approach have been statistically tested using synthetic data coming from low-dimensional systems with different data lengths and noise levels. The results suggest that our algorithm provides robust estimations of the correlation dimension, the correlation entropy and the noise level.

Appendix A Analytic deduction for the coarse-grained estimator

Let:

(10)

and

then Eq. (5) can be rewritten as:

(11)

We need to find the logarithmic derivative of Eq. (11). For this end we can find its first derivative as:

using Eq. 15.2.1 from Ref. [1]:

Then the logarithmic derivative can be found as:

(12)

and for simplicity in the notation:

Using the identity [[10], Eq. 9.137.9]:

(13)

we can rewrite Eq. (12) as:

(14)

From Eq. (5) and using the definitions in (10) it is possible to demonstrate that:

(15)

Substituting Eq. (15) into (14):

Restoring the values of , and and clearing for

notice that and from Eq. (6)

then

(16)

Making , the coarse-grained estimator for can be defined as:

(17)

a.1 Relation with Nolte’s \glsnameetal coarse-grained estimator:

It was proved in [24] that the Gaussian correlation integral is a particular case of the U-correlation integral that arises when is constant and equal to . Let , then from Eq. (15) it can be proved that:

for all values. Thus, Eq. (16) can be rewritten as:

(18)

Finally, given that the Gaussian correlation integral is a particular case of the U-correlation integral for , i.e., , we can write: