Estimating the inertial manifold dimension for a chaotic attractor of the complex Ginzburg-Landau equation using a neural network
The dimension of an inertial manifold for a chaotic attractor of a spatially distributed system is estimated using an autoencoder neural network. The inertial manifold is a low dimensional manifold where the chaotic attractor is embedded. The autoencoder maps system state vectors onto themselves, letting them pass through an inner state with a reduced dimension. The training process of the autoencoder is shown to depend dramatically on the reduced dimension: the learning curve saturates when the dimension is too small and keeps decaying when the dimension is sufficient for lossless information transfer. The smallest sufficient value is taken as the dimension of the inertial manifold, and the corresponding autoencoder implements the mapping onto the inertial manifold and back. The correctness of the computed dimension is confirmed by its remarkable coincidence with the number of covariant Lyapunov vectors with vanishing pairwise angles. These vectors are called physical modes. Unlike the residual vectors, whose pairwise angles never vanish, the physical modes are known to span a subspace tangent to the inertial manifold.
Keywords: chaotic attractor of a high dimensional system, inertial manifold, neural network, autoencoder, machine learning, covariant Lyapunov vectors, physical modes
1 Introduction
Modeling and analysis of high dimensional nonlinear systems encounter an obvious obstacle related to the large amount of data that has to be processed. However, there are reasons to expect that much of these data are actually spurious and can be eliminated. The simplest way to become convinced of that is to compute the Kaplan-Yorke dimension , which typically is much lower than the full phase space dimension. The Kaplan-Yorke dimension is related to the information dimension and is an upper estimate for the Hausdorff dimension of an attractor . Hence it is natural to expect that the actual topological dimension of a manifold where the attractor is embedded is at least not much higher. Rigorous arguments in favor of this can be found in Ref. , where the concept of inertial manifolds is introduced. Such a manifold contains the attractor, attracts all solutions exponentially, and is stable with respect to perturbations. Even for infinite dimensional systems, like spatially distributed ones, these manifolds are finite dimensional and their dimension is not very high.
An obvious benefit of having the mappings onto the inertial manifold and back is that high or infinite dimensional models can be reduced to low dimensional sets of equations without loss of information. But even if the low dimensional model is not constructed explicitly, the lossless dimension reduction of state vectors eliminates spurious information from the data flow and can dramatically facilitate its analysis.
General methods of finding the inertial manifold mappings are not known yet, but there are approaches for building their approximations . Moreover, there exist many methods for building approximate reduced models without explicitly appealing to the concept of inertial manifolds [5, 6, 7, 8, 9].
Powerful tools and methods for the analysis of dynamical systems can be borrowed and adapted from the rapidly growing area of machine learning. For example, the machine learning apparatus, particularly neural networks, is employed for building computer models of spatially distributed experimental systems [10, 11, 12, 13]. An example of such modeling includes dimension reduction of experimental data via a nonlinear principal component analysis with subsequent creation of a neural network model and spatiotemporal reconstruction of the solution . Papers [15, 16] develop an approach for building a low dimensional model of high dimensional experimental data using an autoencoder neural network . First, the dimension of the input data is reduced using the encoding part of the autoencoder. Then, based on the reduced vectors, a mathematical model is constructed. Finally, the output of the functioning model is processed by the decoding part of the autoencoder to predict the original high dimensional data.
A series of papers [18, 19, 20, 21] investigates inertial manifolds of spatially distributed systems using covariant Lyapunov vectors (CLVs). These vectors are in one-to-one correspondence with Lyapunov exponents and point along the tangent directions of expanding, neutral and contracting manifolds of attractor trajectories [22, 23, 24]. The average exponential growth of perturbations along each CLV is given by the corresponding Lyapunov exponent. As reported in Ref. , for spatially distributed chaotic systems there exists a relatively small subset of CLVs, related to all positive and several closest-to-zero negative Lyapunov exponents, that are highly entangled in the sense that the angles between them often vanish along trajectories. This subset of CLVs spans a subspace tangent to the inertial manifold, and their number is equal to the dimension of this manifold. These vectors carry all essential information about the attractor and are called physical modes. In contrast, the large batch of residual CLVs are orthogonal to each other. These vectors are tangent to trajectories approaching the attractor and are called spurious modes.
In this paper we determine the dimension of the inertial manifold for a high dimensional chaotic system using an autoencoder. Then we compute the dimension via CLVs as the number of physical modes. These two values, obtained using independent methods, are found to be remarkably identical. This confirms the correctness of these values and opens perspectives for using the developed machine learning approach for further analysis of high dimensional chaotic dynamics.
The paper is organized as follows. In Sec. 2 we introduce the complex Ginzburg-Landau equation whose attractor will be considered. Section 3 presents the autoencoder. The dimension of the inertial manifold is determined there via analysis of the learning curves of this network. Section 4 investigates the properties of the built autoencoder. It is shown that the autoencoder preserves the distributions and Fourier spectra of the input signal and is also robust with respect to small perturbations. In Sec. 5 we determine the dimension of the inertial manifold using CLVs; it coincides with the dimension obtained using the autoencoder. Finally, in the concluding Sec. 6 we summarize the results of the paper.
2 Considered system
We are going to consider the complex Ginzburg-Landau equation
with the parameter values , and , and boundary conditions . For the numerical solution a mesh is introduced with points and the space step . It corresponds to a spatial size . The total dimension of the discretized phase space is . For these parameter settings the system (1) demonstrates amplitude chaos, as Fig. 1 illustrates.
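A finite-difference integration of this kind can be sketched as follows. The sketch assumes the standard form of the complex Ginzburg-Landau equation; the paper's actual parameter values, mesh and boundary conditions are not reproduced above, so the coefficients `b`, `c`, the grid and the periodic boundaries below are placeholder assumptions, and the time stepper is a simple explicit Euler scheme rather than the authors' solver.

```python
import numpy as np

# Illustrative integrator for the complex Ginzburg-Landau equation in its
# standard form  da/dt = a + (1 + i*b) * a_xx - (1 + i*c) * |a|^2 * a.
# All parameters and the periodic boundary condition are assumptions,
# not the values used in the paper.

def cgle_step(a, dt, dx, b, c):
    """One explicit Euler step with a second-order finite-difference Laplacian."""
    lap = (np.roll(a, 1) - 2.0 * a + np.roll(a, -1)) / dx**2  # periodic BC
    return a + dt * (a + (1.0 + 1j * b) * lap - (1.0 + 1j * c) * np.abs(a)**2 * a)

def integrate(a0, n_steps, dt=1e-3, dx=0.5, b=0.0, c=3.0):
    a = a0.copy()
    for _ in range(n_steps):
        a = cgle_step(a, dt, dx, b, c)
    return a

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a0 = 0.1 * (rng.standard_normal(128) + 1j * rng.standard_normal(128))
    a = integrate(a0, 1000)
    print(a.shape, np.all(np.isfinite(a)))
```

The explicit scheme is used only for transparency; with `dt = 1e-3` and `dx = 0.5` the diffusion number `dt/dx**2` is small enough for stability.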
The first six Lyapunov exponents of the system are , , , , , and . The corresponding Kaplan-Yorke dimension  is:
This dimension is known to be an upper estimate of the Hausdorff (fractal) dimension of an attractor , and we observe that it is much smaller than the phase space dimension . Hence it is natural to expect that the attractor of the system (1) lies on a manifold whose dimension is much less than .
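The Kaplan-Yorke dimension is computed from the ordered Lyapunov spectrum as D = j + (λ₁ + … + λⱼ)/|λⱼ₊₁|, where j is the largest index for which the partial sum of exponents is still nonnegative. A minimal sketch (the spectrum in the example is purely illustrative, not the values of the system (1)):

```python
import numpy as np

def kaplan_yorke_dimension(lyap):
    """Kaplan-Yorke dimension D = j + (sum of first j exponents) / |lambda_{j+1}|,
    where j is the largest index with a nonnegative partial sum of the
    exponents sorted in decreasing order."""
    lyap = np.asarray(sorted(lyap, reverse=True), dtype=float)
    partial = np.cumsum(lyap)
    nonneg = np.nonzero(partial >= 0)[0]
    if len(nonneg) == 0:
        return 0.0                 # all partial sums negative: a stable fixed point
    j = nonneg[-1] + 1             # number of exponents entering the sum
    if j == len(lyap):
        return float(j)            # spectrum too short to interpolate further
    return j + partial[j - 1] / abs(lyap[j])

# Illustrative spectrum (not the paper's values):
print(kaplan_yorke_dimension([0.5, 0.0, -0.6, -1.2]))  # -> 2.8333...
```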
3 Autoencoder
To find a manifold where the attractor of the system (1) is embedded, we reduce the dimension of its state vectors using an autoencoder . In brief, this is a neural network that implements two successive transformations: encoding and decoding. The encoder maps high dimensional data vectors to low dimensional ones, and the decoder maps these reduced vectors back to the high dimensional space. The network is trained in such a way that the initial and the reconstructed vectors coincide. As a result, the reduced vectors carry all information about the input high dimensional data.
The encoding and decoding parts of an autoencoder can formally be represented as vector functions and , respectively:
The encoding function maps vectors from the discretized state space of the system (1) onto the reduced space of vectors , where is the reduced dimension, . The decoder maps the vectors back to the initial space of vectors . Training of the autoencoder (3), i.e., tuning of its neuron weights, is performed in such a way as to minimize the loss function
When the loss function (4) vanishes, the decoder part approximates the inverse of the function , i.e., , so that the autoencoder as a whole performs the identity mapping of the state vectors onto themselves. The important point is that there is an intermediate bottleneck where all the information is carried by the reduced vectors . Obviously, such an autoencoder can be trained for , and when is too small the training will not be successful. Thus we expect that the minimal autoencoder (3), (4) with the smallest sufficient gives the sought mapping onto the inertial manifold and back, and the dimension of this manifold is .
The particular architecture of the autoencoder can vary and must be selected to achieve the best performance. In this paper we have constructed the one represented in Fig. 2. The encoder includes convolutional and dense (fully connected) layers (see Refs. [17, 26, 27] for a detailed explanation of these types of neural network architectures). Using a convolutional layer is motivated by the presence of the second spatial derivative in the model equation (1), whose discretized version operates similarly to a convolution. The convolutional layer accepts the initial state vector and contains inputs and outputs. The reduction occurs in the next dense layer with inputs and outputs. The outputs are the elements of the reduced vector .
The first layer of the decoder is dense. It accepts the elements of a vector and expands them to outputs that are further processed by the transposed convolutional layer (paper  clearly explains the details of the transposed convolution). The outputs of this layer are obtained as a result of the action of the hyperbolic tangent and hence always belong to the interval . To recover the original scale of the data, one more dense layer is added that performs only linear scaling without a subsequent nonlinear transformation. Its outputs form the vector .
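The architecture described above can be sketched in Keras as follows. This is only a structural sketch: the grid size `N`, the bottleneck dimension `m`, the filter counts, the kernel width, and the representation of the complex field as two real channels are all placeholder assumptions, since the paper's layer sizes are not reproduced in the text.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# Structural sketch of the autoencoder: conv + dense encoder, then a
# dense + transposed-conv (tanh) + linear dense decoder. All sizes are
# placeholder assumptions, not the paper's actual configuration.
N, m = 128, 24  # grid points and reduced dimension (assumed)

encoder = tf.keras.Sequential([
    layers.Input(shape=(N, 2)),            # complex field as two real channels
    layers.Conv1D(8, kernel_size=5, padding="same", activation="tanh"),
    layers.Flatten(),
    layers.Dense(m),                       # bottleneck: the reduced vector
], name="encoder")

decoder = tf.keras.Sequential([
    layers.Input(shape=(m,)),
    layers.Dense(N * 8, activation="tanh"),
    layers.Reshape((N, 8)),
    layers.Conv1DTranspose(8, kernel_size=5, padding="same", activation="tanh"),
    layers.Dense(2),                       # linear rescaling to the data range
], name="decoder")

autoencoder = tf.keras.Sequential([encoder, decoder])
autoencoder.compile(optimizer="adam", loss="mse")  # mean-square loss as in (4)

x = np.random.randn(4, N, 2).astype("float32")
print(autoencoder(x).shape)  # batch passes through the bottleneck and back
```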
The autoencoder is created using the TensorFlow framework , and the built-in optimization algorithm Adam  is used for training. The training set includes 30 trajectory cuts of the system (1), each of 10000 vectors . On each epoch of training the whole training set is shuffled and split into 30 minibatches, each of length 10000. These minibatches are shown to the network one by one, and an optimization step is performed after each of them. The quality of the training is estimated on the validation set, which includes 3 trajectories, each of length 10000. For this purpose a learning curve is plotted, showing how the loss function (4) computed for the validation set behaves against the epoch number.
Figure 3 demonstrates learning curves for different . One can see two types of them. The curves with reach a saturation level and do not decay further, while those with continue decaying. On the basis of these observations one can conclude that is the sought dimension of the inertial manifold. This autoencoder is trained further, up to 12000 epochs and the loss function level . The properties of the mappings (3) implemented by the resulting network are analyzed in the next section.
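The saturation-versus-decay criterion has a simple linear analogue that can be checked directly. For data confined to a d-dimensional linear subspace, the optimal linear autoencoder with an m-dimensional bottleneck is the rank-m PCA projection, so the reconstruction error stays bounded away from zero for m < d and collapses to zero for m ≥ d. The following toy sketch (a linear stand-in, not the paper's network) illustrates this:

```python
import numpy as np

# Toy illustration of the bottleneck criterion: samples confined to a
# d-dimensional linear subspace of R^n. Rank-m PCA truncation is the optimal
# linear autoencoder, so the error vanishes only once m >= d.
rng = np.random.default_rng(1)
n, d, samples = 10, 3, 2000
basis = np.linalg.qr(rng.standard_normal((n, d)))[0]   # orthonormal subspace basis
data = rng.standard_normal((samples, d)) @ basis.T     # points in the subspace

centered = data - data.mean(axis=0)
vt = np.linalg.svd(centered, full_matrices=False)[2]   # principal directions

for m in range(1, 6):
    proj = vt[:m].T @ vt[:m]                           # rank-m projector
    mse = np.mean((centered - centered @ proj) ** 2)
    print(m, mse)  # finite for m < 3, numerically zero for m >= 3
```

For the nonlinear inertial manifold the projector is replaced by the trained encoder-decoder pair, but the logic of reading off the smallest sufficient bottleneck is the same.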
4 Properties of the autoencoder
The inertial manifold in the original -dimensional state space of the system (1) is a -dimensional smooth surface, probably strongly curved. Due to the smoothness, points located sufficiently close to the attractor must remain close to it under the mappings (3). In other words, the autoencoder has to be robust with respect to perturbations of input data.
To verify the robustness, we collect in total attractor state vectors from trajectories. Notice that these are new vectors never seen by the network during the training. These vectors are perturbed by adding to each site , , a random variable with uniform distribution and amplitude . This perturbed batch is passed through the autoencoder, and for each vector site the absolute error of reconstruction is computed: . Distributions of this error are shown in Fig. 4.
Figure 4(a) shows the distribution of the reconstruction error without adding a perturbation to the input vectors, i.e., , and in Figs. 4(b and c) the perturbation amplitudes are and , respectively. One can see that in these three cases the error distributions are almost identical and located below . But a further increase of the perturbation up to , see Fig. 4(d), results in a dramatic change of the distribution, since the reconstruction error becomes much higher. These observations demonstrate that the mapping onto the inertial manifold and back remains valid as long as the perturbations remain small. It means that the autoencoder is robust at least with respect to small perturbations and recovers a smooth and strongly curved surface where the attractor points are located.
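The perturbation test itself is straightforward to script for any encoder-decoder pair. In the sketch below `autoencoder` is a stand-in callable (the trained network is not reproduced here); the helper name and the identity model in the usage example are purely illustrative:

```python
import numpy as np

def reconstruction_errors(autoencoder, states, eps, rng=None):
    """Perturb each site by uniform noise of amplitude eps, pass the batch
    through the autoencoder, and return the per-site absolute errors
    |x - x_reconstructed|. `autoencoder` is any callable mapping a batch of
    states to a batch of reconstructions (stand-in for the trained network)."""
    rng = rng or np.random.default_rng()
    perturbed = states + eps * rng.uniform(-1.0, 1.0, size=states.shape)
    return np.abs(states - autoencoder(perturbed))

# Usage with a trivial identity "autoencoder": errors are bounded by eps.
x = np.random.default_rng(2).standard_normal((5, 64))
errs = reconstruction_errors(lambda batch: batch, x, eps=0.01)
print(errs.max() <= 0.01)  # True
```

For a robust autoencoder the error histogram should stay essentially unchanged as `eps` grows from zero up to some small threshold, which is exactly the behavior reported in Fig. 4.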
A visual image of the operation of the mappings (3) can be obtained by considering distributions on the attractor. First we again collect a batch of attractor state vectors and let it pass through the autoencoder. Then the values at the central site are gathered and their distribution is plotted, see Fig. 5(a). The distribution of the reconstructed values is plotted for comparison, see Fig. 5(b). One can see that these two distributions look identical.
It should be noted that each new run of the software computing the data for Fig. 5 results in slightly different distributions, since the sample length is insufficient to correctly represent a high dimensional chaotic attractor of the system (1). However, the important point is that the initial and reconstructed distributions are always identical.
One more test of the autoencoder is presented in Figs. 6(a, b, and c), which show Fourier spectra for the initial data , the reduced data , and the reconstructed data , respectively. Observe the very high similarity of the spectra for the initial and reconstructed data: in both panels there are two main harmonics surrounded by a chaotic base. Also notice that the reduced vector produces a similar spectrum. The highest main harmonic survives, and the second one still exists, though it is much lower. The chaotic bases are also similar in all three cases. It is important to note that the site number where the data from the reduced vector are taken is chosen arbitrarily. The encoding transformation is performed in such a way that no clear correspondence between sites of the initial vector and the reduced one can be established. In fact, all sites of the reduced vector demonstrate similar Fourier spectra.
5 Dimension of the inertial manifold estimated via covariant Lyapunov vectors
Now we will compare the estimated dimension of the inertial manifold with the one obtained via the analysis of angles between covariant Lyapunov vectors (CLVs). First of all, we briefly recall what CLVs are [22, 23, 24]. Figure 7 shows a trajectory of a dynamical system that belongs to a chaotic attractor. The vicinity of such a trajectory can be split into a direct sum of expanding, neutral and contracting manifolds. All trajectories from the contracting manifold approach the attractor; trajectories from the expanding manifold approach it in inverse time; and the neutral manifold is marginally stable. The average exponential expansion or contraction rates of these manifolds are given by the Lyapunov exponents: positive, zero and negative exponents correspond to the expanding, neutral and contracting manifolds, respectively. The numbers of these exponents are equal to the dimensions of the corresponding manifolds.
CLVs are related one-to-one to Lyapunov exponents and show the tangent directions of the manifolds, see Fig. 8. One can see here two trajectory points, marked as and , and three CLVs, , , and , which are tangent directions for the expanding, neutral and contracting manifolds. On average, the norms of CLVs grow or decay exponentially as follows:
where  is the th Lyapunov exponent.
Important information about a dynamical system can be obtained from the analysis of the angles between its CLVs. The angles can be computed in a straightforward way, one pair at a time. An efficient algorithm for computing the angles between tangent subspaces spanned by CLVs is also suggested in Ref. .
Papers [18, 19, 20, 21] investigate the possibility of recovering the inertial manifold by analyzing pairwise angles between CLVs. As reported in Ref. , for high dimensional chaotic systems one can observe the splitting of the phase space into two subspaces. The first one is spanned by a finite number of CLVs, related to all positive, zero and several negative Lyapunov exponents. These vectors, called physical modes, are highly entangled in the sense that their angles often vanish. The second tangent subspace is spanned by the CLVs, called spurious modes, related to the residual negative Lyapunov exponents. Unlike the first ones, these CLVs never have zero angles with each other. As argued in Ref. , the physical modes are responsible for the essential properties of the attractor. In particular, their number equals the dimension of the inertial manifold. The spurious ones are only responsible for the directions along which trajectories approach the attractor. As noted in Ref. , at the very end of the CLV spectrum there is additionally one more batch of entangled vectors, related to the most negative Lyapunov exponents. These vectors appear due to time reversibility and become physical modes when the system moves backward in time.
The results of the search for the physical modes and the estimation of the inertial manifold dimension for the system (1) via CLVs are shown in Fig. 9. One can see here the minimal pairwise angles between the first CLVs. First of all, notice that the first vectors are strongly entangled, i.e., their angles can vanish. It is this collection of vectors that forms the set of physical modes. Their number agrees with our previous estimate of the dimension of the inertial manifold obtained via training of the autoencoder, see Fig. 3. The residual CLVs form the spurious modes. Dark squares on the main diagonal indicate that they are entangled in pairs. This is related to the fact that the system (1) is complex valued, so that neighboring CLVs are related to the real and imaginary parts of the dynamical variable.
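The minimal-pairwise-angle diagnostic behind Fig. 9 amounts to tracking, along a trajectory, the smallest angle observed between each pair of CLVs. A minimal sketch with random stand-in vectors (actual CLVs would come from the algorithms of Refs. [22, 23, 24]; angles are measured in [0, π/2] using the absolute cosine, since a CLV is defined up to sign):

```python
import numpy as np

def min_pairwise_angles(clv_series):
    """clv_series: array of shape (T, n, k) holding k CLVs (as columns)
    at T trajectory points. Returns a (k, k) matrix with the minimal angle
    (radians) observed between each pair of vectors along the trajectory."""
    T, n, k = clv_series.shape
    min_ang = np.full((k, k), np.pi / 2)
    for t in range(T):
        v = clv_series[t]
        v = v / np.linalg.norm(v, axis=0)           # normalize columns
        cosines = np.clip(np.abs(v.T @ v), 0.0, 1.0)
        ang = np.arccos(cosines)
        np.fill_diagonal(ang, np.pi / 2)            # ignore self-angles
        min_ang = np.minimum(min_ang, ang)
    return min_ang

# Stand-in data: random vectors instead of actual CLVs.
rng = np.random.default_rng(3)
series = rng.standard_normal((50, 32, 6))
print(min_pairwise_angles(series).shape)  # one entry per CLV pair
```

Physical modes reveal themselves as a leading block of pairs whose minimal angles approach zero, while spurious-mode pairs keep their minimal angles bounded away from zero.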
6 Conclusion
In this paper we have built a neural network to find the smallest dimension sufficient for a lossless representation of the attractor of a high dimensional chaotic system. The manifold carrying this representation is known as the inertial manifold. The network we build is an autoencoder whose encoding part maps attractor vectors from the original phase space onto the inertial manifold with a reduced dimension, and whose decoding part performs the inverse mapping back to the original space. As an example, the complex Ginzburg-Landau equation is considered in a chaotic regime. Its phase space after numerical discretization has dimension , while the reduced dimension is found to be . The reduced dimension is found by probing its different values and analyzing the behavior of the corresponding learning curves. If the dimension is too small to provide the lossless transformation, the learning curve tends to saturation. Thus the appropriate reduced dimension is chosen as the smallest one for which the learning curve still decays. The built autoencoder is robust with respect to small perturbations: small perturbations of the attractor vectors remain small after passing through it. Moreover, the autoencoder preserves the Fourier spectra of the attractor trajectories and the distributions of vector elements.
The computed dimension of the inertial manifold is compared with the one obtained as the number of covariant Lyapunov vectors with vanishing pairwise angles. These vectors are called physical modes and span a subspace tangent to the inertial manifold. The dimensions obtained in the two ways coincide perfectly. This confirms the correctness of the performed analysis and opens perspectives for further development of machine learning methods for studying high dimensional chaotic systems.
-  J. L. Kaplan and J. A. Yorke, “Chaotic behavior of multidimensional difference equations,” in Functional differential equations and approximations of fixed points, H.-O. Peitgen and H.-O. Walter, eds., Lecture Notes in Mathematics 730, p. 204, Springer, Berlin, 1979.
-  P. Grassberger and I. Procaccia, “Measuring the strangeness of strange attractors,” Physica D: Nonlinear Phenomena 9(1), pp. 189–208, 1983.
-  C. Foias, G. R. Sell, and R. Temam, “Inertial manifolds for nonlinear evolutionary equations,” Journal of Differential Equations 73(2), pp. 309–353, 1988.
-  H.-X. Li and C. Qi, “Modeling of distributed parameter systems for applications - A synthesized review from time-space separation,” Journal of Process Control 20(8), pp. 891–901, 2010.
-  M. Belkin and P. Niyogi, “Laplacian eigenmaps for dimensionality reduction and data representation,” Neural Computation 15(6), pp. 1373–1396, 2003.
-  P. Benner, V. Mehrmann, and D. C. Sorensen, eds., Dimension reduction of large-scale systems, vol. 35 of Lecture Notes in Computational Science and Engineering, Springer-Verlag Berlin Heidelberg, 2005.
-  G. Kerschen, J.-C. Golinval, A. F. Vakakis, and L. A. Bergman, “The method of proper orthogonal decomposition for dynamical characterization and order reduction of mechanical systems: An overview,” Nonlinear Dynamics 41(1), pp. 147–169, 2005.
-  M. G. B. Blum, M. A. Nunes, D. Prangle, and S. A. Sisson, “A comparative review of dimension reduction methods in approximate Bayesian computation,” Statist. Sci. 28, pp. 189–208, 2013.
-  A. J. Chorin and F. Lu, “Discrete approach to stochastic parametrization and dimension reduction in nonlinear dynamics,” Proceedings of the National Academy of Sciences 112(32), pp. 9804–9809, 2015.
-  X.-G. Zhou, L.-H. Liu, Y.-C. Dai, W.-K. Yuan, and J. Hudson, “Modeling of a fixed-bed reactor using the K-L expansion and neural networks,” Chemical Engineering Science 51(10), pp. 2179–2188, 1996.
-  R. González-García, R. Rico-Martínez, and I. Kevrekidis, “Identification of distributed parameter systems: A neural net based approach,” Computers & Chemical Engineering 22, pp. S965–S968, 1998.
-  H.-X. Li, H. Deng, and J. Zhong, “Model-based integration of control and supervision for one kind of curing process,” IEEE Transactions on Electronics Packaging Manufacturing 27(3), pp. 177–186, 2004.
-  H. Deng, H.-X. Li, and G. Chen, “Spectral-approximation-based intelligent modeling for distributed thermal processes,” IEEE Transactions on Control Systems Technology 13(5), pp. 686–700, 2005.
-  C. Qi and H.-X. Li, “Nonlinear dimension reduction based neural modeling for distributed parameter processes,” Chemical Engineering Science 64(19), pp. 4164–4170, 2009.
-  M. Wang, H. Li, X. Chen, and Y. Chen, “Deep learning-based model reduction for distributed parameter systems,” IEEE Transactions on Systems, Man, and Cybernetics: Systems 46(12), pp. 1664–1674, 2016.
-  M. Wang, H. Li, and W. Shen, “Deep auto-encoder in model reduction of large-scale spatiotemporal dynamics,” in 2016 International Joint Conference on Neural Networks (IJCNN), pp. 3180–3186, 2016.
-  S. S. Haykin, Neural networks and learning machines, Pearson, Upper Saddle River, 3rd ed., 2009.
-  H.-l. Yang, K. A. Takeuchi, F. Ginelli, H. Chaté, and G. Radons, “Hyperbolicity and the effective dimension of spatially extended dissipative systems,” Phys. Rev. Lett. 102, p. 074102, 2009.
-  P. V. Kuptsov and U. Parlitz, “Strict and fussy mode splitting in the tangent space of the Ginzburg-Landau equation,” Phys. Rev. E 81, p. 036214, 2010.
-  K. A. Takeuchi, H.-l. Yang, F. Ginelli, G. Radons, and H. Chaté, “Hyperbolic decoupling of tangent space and effective dimension of dissipative systems,” Phys. Rev. E 84, p. 046214, 2011.
-  H.-l. Yang and G. Radons, “Geometry of inertial manifolds probed via a Lyapunov projection method,” Phys. Rev. Lett. 108, p. 154101, 2012.
-  F. Ginelli, P. Poggi, A. Turchi, H. Chaté, R. Livi, and A. Politi, “Characterizing dynamics with covariant Lyapunov vectors,” Phys. Rev. Lett. 99, p. 130601, 2007.
-  C. L. Wolfe and R. M. Samelson, “An efficient method for recovering Lyapunov vectors from singular vectors,” Tellus A 59, pp. 355–366, 2007.
-  P. Kuptsov and U. Parlitz, “Theory and computation of covariant Lyapunov vectors,” Journal of Nonlinear Science 22(5), pp. 727–762, 2012.
-  I. S. Aranson and L. Kramer, “The world of the complex Ginzburg-Landau equation,” Rev. Mod. Phys. 74, pp. 99–143, 2002.
-  I. Goodfellow, Y. Bengio, and A. Courville, Deep learning, MIT Press, Cambridge, 2016.
-  V. Dumoulin and F. Visin, “A guide to convolution arithmetic for deep learning,” ArXiv e-prints 1603.07285, 2016.
-  M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viégas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng, “TensorFlow: Large-scale machine learning on heterogeneous systems,” 2015. Software available from tensorflow.org.
-  D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” ArXiv e-prints 1412.6980, 2014.
-  P. V. Kuptsov, “Fast numerical test of hyperbolic chaos,” Phys. Rev. E 85, p. 015203, 2012.