Data-driven prediction of unsteady flow fields over a circular cylinder using deep learning
Abstract
Unsteady flow fields over a circular cylinder are trained and predicted using four deep learning networks: networks associated with supervised feature extraction with and without prior knowledge of physical conservation laws, and networks associated with unsupervised feature extraction with and without prior knowledge of conservation laws. The networks are trained on flow fields at Reynolds numbers of 100, 200, 300, and 400, and are used to predict flow fields at Reynolds numbers of 500 and 3000. Physical loss functions are proposed to explicitly impose prior knowledge of physical conservation laws on the deep learning networks, while adversarial training is applied to extract features of physical conservation laws in an unsupervised manner, without any prior knowledge of the conservation laws. The effects of the proposed loss functions and adversarial training are analyzed. Flow fields predicted by the deep learning networks show good agreement with flow fields calculated by numerical simulations. In particular, unsupervised feature extraction can be applied to extract undiscovered prior knowledge from data, since many practical datasets lack information on the full underlying physics. The present study suggests that deep learning techniques can be utilized for predicting wake flow.
Keywords:
Deep learning; supervised feature extraction; unsupervised feature extraction; unsteady flow fields; prior knowledge.
1 Introduction
Observations of fluid flow in nature, laboratory experiments, and numerical simulations have provided evidence of the existence of flow features and of a certain, though often complex, order. For example, in nature, Kelvin-Helmholtz waves in clouds (Dalin et al., 2010), von Karman vortices in ocean flow around an island (Berger and Wille, 1972), and the swirling Great Red Spot on Jupiter (Marcus, 1988) are flow structures that can be classified as certain types of vortical motion. Similar observations have also been reported in laboratory experiments and numerical simulations (Freymuth, 1966; Ruderich and Fernholz, 1986; Wu and Moin, 2009; Babucke et al., 2008). The existence of distinct and dominant flow features that are linearly independent has also been widely investigated using mathematical decomposition techniques such as proper orthogonal decomposition (POD) (Sirovich, 1987), dynamic mode decomposition (DMD) (Schmid, 2010), and the Koopman operator method (Mezić, 2013).
Such distinct or dominant flow features are utilized by some species, such as insects, birds, and fish, to control their bodies so as to better adapt to the fluid dynamic environment and to improve aero- or hydrodynamic performance and efficiency (Wu, 2011; Yonehara et al., 2016). This suggests that species in nature estimate future flow based on flow experienced in the past, by empirically learning the dominant flow features of their living environments. However, the learning procedures of species in nature are not based on numerical simulations or mathematical decomposition techniques. This observation of unsteady flow estimation in nature motivates us to investigate the feasibility of predicting unsteady fluid motion from past flow field information by extracting flow features using deep learning techniques. Prediction of unsteady flow fields using deep learning could offer new opportunities for real-time control and guidance of aero- or hydro-vehicles, weather forecasting, etc.
The aim of this study is to propose and compare deep learning techniques for predicting unsteady flow fields. Unsteady flow fields over a circular cylinder are trained and predicted using four deep learning networks: networks associated with supervised feature extraction with and without prior knowledge of physical conservation laws, and networks associated with unsupervised feature extraction with and without prior knowledge of conservation laws. Physical loss functions are proposed to impose prior knowledge of physical conservation laws on the deep learning networks. Adversarial training of two deep learning models, a generator model and a discriminator model, is applied to extract features of physical conservation laws in an unsupervised manner. The effects of the physical loss functions and adversarial training on flow field prediction are analyzed qualitatively and quantitatively. The paper is organized as follows. Deep learning backgrounds are reviewed in section 2. Numerical methods for Navier-Stokes simulations are explained in section 3. The method for constructing training datasets is explained in section 4. The present deep learning method and its results are presented in sections 5 and 6, respectively, followed by concluding remarks in section 7.
2 Deep learning backgrounds
In 1986, Rumelhart et al. (1986) developed a machine learning technique that is widely known as the multilayer perceptron (MLP). The MLP is an early neural network that first applied error backpropagation for training neural networks with sigmoid activation functions. The error of a neural network is evaluated by a loss function that compares values predicted by the network with true values from ground truth data. Weights inside the network are updated in the direction opposite to the gradient of the loss function with respect to each weight, so that the loss is reduced.
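As a minimal illustration of the weight update described above, the following sketch fits a single weight of a linear model by gradient descent on a mean squared error loss. The data, the learning rate, and all function names are hypothetical choices made for illustration; they are not taken from Rumelhart et al. (1986) or from the present study.

```python
def mse_loss(w, xs, ys):
    """Mean squared error between predictions w * x and ground truth y."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def mse_grad(w, xs, ys):
    """Analytical derivative of the MSE loss with respect to the weight w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def train(xs, ys, w=0.0, learning_rate=0.05, steps=200):
    """Repeatedly move w against the gradient of the loss."""
    for _ in range(steps):
        w -= learning_rate * mse_grad(w, xs, ys)
    return w


# Hypothetical ground truth data generated by y = 3 x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 3.0, 6.0, 9.0]
w_fit = train(xs, ys)
```

In a multilayer network the same update is applied to every weight, with the gradients obtained by propagating the error backward through the layers.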
The development of backpropagation for training neural networks has led to a huge growth in the application of neural networks across a variety of scientific disciplines. Initial studies applying neural networks with sigmoid activation functions to fluid mechanics were conducted from the late 1990s to the early 2000s (Lee et al., 1997; Pruvost et al., 2001; Chen et al., 2002; Milano and Koumoutsakos, 2002; Sarghini et al., 2003). However, neural networks with sigmoid activation functions are known to suffer from the pitfall that gradients of a loss function tend to vanish after error backpropagation through many layers. As a consequence, the vanishing gradient problem limited the application of neural networks to complex problems that require networks with many layers for extracting complex features of data.
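The vanishing gradient effect can be made concrete with a short calculation. The derivative of the sigmoid never exceeds 0.25, so the factor accumulated by backpropagation through many sigmoid layers shrinks geometrically. The sketch below assumes unit weights and pre-activations of zero (the most favorable case for the sigmoid); both are simplifying assumptions for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_derivative(z):
    s = sigmoid(z)
    return s * (1.0 - s)


def backpropagated_factor(pre_activations):
    """Product of sigmoid derivatives accumulated across layers (unit weights assumed)."""
    factor = 1.0
    for z in pre_activations:
        factor *= sigmoid_derivative(z)
    return factor


# The sigmoid derivative peaks at z = 0 with a value of 0.25.
shallow = backpropagated_factor([0.0] * 2)   # two layers
deep = backpropagated_factor([0.0] * 20)     # twenty layers
```

Even in this best case, the factor for twenty layers is 0.25 raised to the twentieth power, which leaves almost no gradient signal for the earliest layers.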
In 2012, the vanishing gradient problem was largely resolved by the use of the rectified linear unit (ReLU) activation function (Krizhevsky et al., 2012). The ReLU activation function was a breakthrough that provided efficient training and enabled deep learning of neural networks. Consequently, the last five years have seen a huge growth in neural networks, and there has been a rapid rise in the use of neural networks in fluid mechanics. Tracey et al. (2015) and Zhang and Duraisamy (2015) utilized shallow neural networks for turbulence modeling in Reynolds-averaged Navier-Stokes (RANS) simulations. Ling et al. (2016) employed a deep neural network (DNN) with ten hidden layers to model the Reynolds stress anisotropy tensor. These approaches were found to notably improve the accuracy of simulation results.
The principal characteristic of deep learning is the extraction of features in a low-dimensional feature space. The high-dimensional information of input data is encoded as low-dimensional features in the weights of neural networks. In particular, convolutional neural networks (CNNs) are widely considered to be the best architecture for extracting features from spatial data such as images. There have been some studies on utilizing CNNs for directly predicting flow characteristics: Guo et al. (2016) employed a CNN to predict steady flow fields around bluff objects and reported reasonable predictions at a significantly lower computational cost than that required for numerical simulations; Miyanawala and Jaiman (2017) employed a CNN to predict aerodynamic force coefficients of bluff bodies, also with notably reduced computational cost.
Prediction of unsteady flow fields using deep learning involves extracting both spatial and temporal features of input flow fields, which can be regarded as learning videos. Therefore, deep learning based video modeling architectures could be a practical means of predicting unsteady flow fields. Video modeling enables prediction of a future frame of a video from previous frames by learning the spatial and temporal features of videos. However, even though deep learning has been reported to generate high-quality, realistic images in the image modeling area (Radford et al., 2015; Denton et al., 2015; Oord et al., 2016; van den Oord et al., 2016), deep learning architectures for video modeling have had difficulty generating high-quality predictions; they tend to produce blurry predictions (Srivastava et al., 2015; Ranzato et al., 2014; Mathieu et al., 2015) because of the complex temporal features of the data.
A long short-term memory (LSTM) recurrent neural network based video modeling architecture was proposed by Srivastava et al. (2015). However, the method produced very blurry reconstructions and predictions of future video frames. Similarly, Ranzato et al. (2014) proposed an architecture that leverages both a recurrent neural network and a CNN to learn the temporal and spatial features of a video. However, they reported that predictions of future frames tend to slow down the motion and converge to a still, blurry image. Mathieu et al. (2015) proposed a video modeling architecture that utilizes a generative adversarial network (GAN) with multi-scale CNNs to generate predictions of future frames. A GAN, proposed by Goodfellow et al. (2014), is a network that contains a generator model and a discriminator model. The generator network generates images, and the discriminator network is employed to discriminate the generated images from real (ground truth) images. A GAN is trained adversarially: the generator network is trained to delude the discriminator network, and the discriminator network is trained not to be deluded by the generator network. A Nash equilibrium is inherent in this two-pronged adversarial training, and it leads the network to extract underlying low-dimensional features in an unsupervised manner; as a consequence, better-quality images can be generated. Mathieu et al. (2015) reported that the performance of predicting future video frames improved compared to Ranzato et al. (2014) by utilizing GANs with multi-scale CNNs and a gradient difference loss (GDL) function. The GDL function partially alleviates the blurriness that arises when a mean squared error (MSE) loss function is used alone on training video data, since an MSE loss can often be reduced simply by blurring the image.
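The tendency of an MSE loss to favor blurred predictions can be seen in a toy setting. The sketch below assumes that the future frame is one of two equally likely sharp one-dimensional "frames" (a step edge at two neighboring positions) and compares the expected MSE of a sharp guess with that of the pixel-wise average. The frames and function names are hypothetical and serve only as an illustration.

```python
def mse(pred, target):
    """Mean squared error between two equally sized 1D frames."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def expected_mse(pred, outcomes):
    """Average MSE of one prediction over equally likely outcomes."""
    return sum(mse(pred, outcome) for outcome in outcomes) / len(outcomes)


# Two equally likely sharp frames: a step edge at two neighboring positions.
outcome_a = [0.0, 0.0, 1.0, 1.0, 1.0]
outcome_b = [0.0, 0.0, 0.0, 1.0, 1.0]
outcomes = [outcome_a, outcome_b]

# The pixel-wise mean smears the edge (value 0.5 at the uncertain pixel).
blurred = [sum(values) / len(values) for values in zip(*outcomes)]

loss_sharp = expected_mse(outcome_a, outcomes)
loss_blurred = expected_mse(blurred, outcomes)
```

The blurred frame attains a lower expected MSE than either sharp frame, which is why a network trained with MSE alone is rewarded for smearing uncertain structures.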
However, video modeling networks have tended to focus on predicting videos of human motions rather than physical phenomena that follow conservation laws.
3 Numerical simulations
Numerical simulations of flow over a circular cylinder at $Re_D = U_\infty D/\nu = 100$, 200, 300, 400, 500, and 3000, where $U_\infty$, $D$, and $\nu$ are the freestream velocity, cylinder diameter, and kinematic viscosity, respectively, have been conducted by solving the incompressible Navier-Stokes equations as follows:
(1) $\frac{\partial u_i}{\partial t} + \frac{\partial u_i u_j}{\partial x_j} = -\frac{1}{\rho}\frac{\partial p}{\partial x_i} + \nu\frac{\partial^2 u_i}{\partial x_j \partial x_j}$
and
(2) $\frac{\partial u_i}{\partial x_i} = 0,$
where $u_i$, $p$, and $\rho$ are the velocity, pressure, and density, respectively. A fully implicit fractional-step method is employed for time integration, in which all terms in the Navier-Stokes equations are integrated using the Crank-Nicolson method. Second-order central-difference schemes are employed for spatial discretization (You et al., 2008). The computational domain consists of a block-structured H-grid with an O-grid around the cylinder (figure 1). The computational domain sizes are , , and in the streamwise, cross-flow, and spanwise directions, respectively. The computational timestep size () of is used for simulations. The domain size, number of grid points, and timestep size are determined from an extensive sensitivity study.
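To illustrate the character of the Crank-Nicolson time integration named above, the following sketch applies the scheme to a scalar model problem, du/dt = -λu, and checks its second-order accuracy in time. This is only a model problem under assumed parameters; it is not the fractional-step Navier-Stokes solver of You et al. (2008), and the function names are hypothetical.

```python
import math


def crank_nicolson_decay(u0, decay_rate, dt, n_steps):
    """Integrate du/dt = -decay_rate * u with the Crank-Nicolson scheme.

    The scheme averages the right-hand side at the old and new time levels:
    (u_new - u_old) / dt = -decay_rate * (u_new + u_old) / 2.
    """
    amplification = (1.0 - 0.5 * decay_rate * dt) / (1.0 + 0.5 * decay_rate * dt)
    u = u0
    for _ in range(n_steps):
        u *= amplification
    return u


def error_at_unit_time(n_steps):
    """Absolute error at t = 1 for decay_rate = 1 and u(0) = 1."""
    exact = math.exp(-1.0)
    return abs(crank_nicolson_decay(1.0, 1.0, 1.0 / n_steps, n_steps) - exact)


err_coarse = error_at_unit_time(10)
err_fine = error_at_unit_time(20)

# Halving the timestep should reduce the error by about a factor of four.
observed_order = math.log(err_coarse / err_fine, 2)
```

Because the amplification factor has a magnitude below one for any positive timestep, the scheme is unconditionally stable for this model problem, which is one reason implicit schemes of this type are attractive for viscous flow simulations.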
4 Datasets
Simulation results of flow over a cylinder at each Reynolds number are collected during 101 timesteps, with timestep intervals of . Flow variables (, , and ) at each timestep and Reynolds number in a square domain of , , ( sized domain) are extracted onto a uniform grid with a size of grid cells. Thus, the dataset for each Reynolds number consists of 101 flow fields with a size of (= height × width × flow variables).
Datasets of flow fields at Reynolds numbers of 100, 200, 300, and 400 compose the training dataset. Flow fields in the training dataset are randomly subsampled in time and space to five consecutive flow fields on a sized grid and sized domain (see figure 2). The subsampled flow fields contain diverse types of flow, such as free-stream flow, wake flow, boundary layer flow, and separating flow. Therefore, the deep learning networks utilized in the present study can efficiently learn physics from diverse types of flow. The first four consecutive sets of flow fields are the input () containing the information of flow fields in both space and time, and the last set of flow fields is the ground truth flow field (). A pair of input and ground truth flow fields forms a training sample. In the present study, a total of 500,000 training samples are employed for training various deep learning networks. The performance of the networks is evaluated on a test dataset, which is composed of flow fields at Reynolds numbers of 500 and 3000, without any subsampling in space ( sized grid and sized domain).
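The subsampling procedure described above can be sketched as a random crop in time and space that returns four input fields and one ground truth field. The grid size, crop size, number of variables, and function names below are hypothetical placeholders chosen to keep the example small; they do not reproduce the sizes used in the present study.

```python
import random


def make_training_sample(fields, crop_size, rng, n_input=4):
    """Randomly crop n_input + 1 consecutive flow fields in time and space.

    fields[t][j][i] is a list of flow variables at time index t and grid point (j, i).
    Returns (inputs, ground_truth): n_input consecutive crops and the crop that follows.
    """
    n_time = len(fields)
    height = len(fields[0])
    width = len(fields[0][0])
    t0 = rng.randrange(n_time - n_input)        # leaves room for the ground truth field
    j0 = rng.randrange(height - crop_size + 1)
    i0 = rng.randrange(width - crop_size + 1)
    window = [
        [row[i0:i0 + crop_size] for row in fields[t][j0:j0 + crop_size]]
        for t in range(t0, t0 + n_input + 1)
    ]
    return window[:n_input], window[n_input]


# Synthetic dataset: 10 timesteps on an 8 x 8 grid with 3 variables per grid point.
# Each entry stores its own (t, j, i) indices so the crop can be inspected.
fields = [
    [[[float(t), float(j), float(i)] for i in range(8)] for j in range(8)]
    for t in range(10)
]
rng = random.Random(0)
inputs, target = make_training_sample(fields, crop_size=4, rng=rng)
```

Calling `make_training_sample` repeatedly with the same `rng` yields different windows, which is how a large number of training samples can be drawn from a modest number of simulated flow fields.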
5 Deep learning methodology
5.1 Deep learning networks for flow prediction
The deep learning networks employed for flow prediction are variations of the architecture for video modeling proposed by Mathieu et al. (2015). The network structures in the present study consist of a generator model that accepts four consecutive sets of flow fields as input. Each input set of flow fields is composed of flow variables of , to take advantage of learning correlated physical phenomena between the flow variables. The generator model utilized in this study is composed of a set of multi-scale generative CNNs to learn multi-range spatial dependencies of flow structures. During training, a generative CNN, , generates a prediction of flow fields () on a sized grid. is fed with four consecutive sets of flow fields on a sized grid (), which are bilinearly interpolated from the original sized input sets of flow fields (), and a set of upscaled flow fields, (see figure 3). Note that the domain sizes of the and grids are identical and the corresponding convolution kernel sizes range from 3 to 7 (table 2). Consequently, can learn larger spatial dependencies of flow fields than by sacrificing resolution. As a result, the multi-scale CNN based generator models learn and predict flow fields with multi-range spatial dependencies. Various configurations of generator models with different sets of numbers of feature map sizes (, , and , see table 1) and numbers of weight layers ( ,, and , see table 2) have been trained in the current study.
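The bilinear interpolation used to move flow fields between the grids of different scales can be sketched as follows. The example resamples a two-dimensional field onto a coarser grid covering the same domain and then upscales it again. Treating grid points as corner-aligned samples, the 5 x 5 and 3 x 3 grid sizes, and the function name are assumptions made for illustration.

```python
def bilinear_resize(field, new_height, new_width):
    """Resample a 2D field (list of rows) onto a new grid covering the same domain.

    Grid points are treated as corner-aligned samples, and all sizes must be at least 2.
    """
    old_height, old_width = len(field), len(field[0])
    resized = []
    for j in range(new_height):
        # Fractional position of the new grid point on the old grid.
        y = j * (old_height - 1) / (new_height - 1)
        j0 = min(int(y), old_height - 2)
        wy = y - j0
        row = []
        for i in range(new_width):
            x = i * (old_width - 1) / (new_width - 1)
            i0 = min(int(x), old_width - 2)
            wx = x - i0
            top = (1.0 - wx) * field[j0][i0] + wx * field[j0][i0 + 1]
            bottom = (1.0 - wx) * field[j0 + 1][i0] + wx * field[j0 + 1][i0 + 1]
            row.append((1.0 - wy) * top + wy * bottom)
        resized.append(row)
    return resized


# A linear field is reproduced exactly by bilinear interpolation.
fine = [[2.0 * i + 3.0 * j for i in range(5)] for j in range(5)]
coarse = bilinear_resize(fine, 3, 3)        # same domain, lower resolution
upscaled = bilinear_resize(coarse, 5, 5)    # upscaled back to the fine grid
```

For a general flow field the coarse grid loses small-scale detail, but a convolution kernel of fixed size then spans a larger physical distance, which is the trade-off between spatial range and resolution described above.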
Let be the ground truth flow fields resized on a grid size of . The discriminator model consists of a set of discriminative networks with CNNs and fully connected (FC) layers. A discriminative network, , is fed with inputs of the predicted flow fields from the generative CNN () and the ground truth flow fields (). The CNNs of a discriminative network extract low-dimensional features of the predicted flow fields and ground truth flow fields. Then, the connected FC layers compare the extracted low-dimensional features to classify the ground truth flow fields into class 1 and the predicted flow fields into class 0. The output of each discriminative network is a single continuous scalar between 0 and 1, where an output value larger than a threshold (0.5) is classified into class 1 and an output value smaller than the threshold is classified into class 0. The procedures used by the discriminator model to classify flow field predictions are summarized in figure 4. Various configurations of the discriminator model with different sets of numbers of feature maps and layer sizes have been trained (table 3). The generator models are trained with an Adam optimizer (Kingma and Ba, 2014) with the learning rate, which is a hyperparameter that determines the network update speed, of and the discriminator model is trained with the gradient descent method with the learning rate of .
5.2 Loss functions
For a given set of input and ground truth flow fields, the generator model generates predictions of flow fields that minimize a loss function, which is a combination of six different loss functions, as follows:
(3) 
where is the number of scales and
(4) 
Various loss functions are tested by tuning parameters of , , , , , and . In particular, indicates no adversarial training.
minimizes the difference between predicted flow fields and ground truth flow fields as
(5) 
is the GDL function (Mathieu et al., 2015). The loss function is applied to sharpen flow fields by directly penalizing the gradient differences between predicted flow fields and ground truth flow fields as
(6) 
where the subscript indicates the grid index of a flow field, and and indicate the number of grid points in and directions. Loss functions and provide prior knowledge to networks that predicted flow fields should resemble ground truth flow fields. These loss functions help the networks learn the physics that corresponds to this prior knowledge by extracting features in a supervised manner.
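A sketch of the two supervised loss terms is given below for a single two-dimensional field. The gradient difference loss follows the form given by Mathieu et al. (2015); the exponent alpha = 1, the use of an unnormalized sum of squared differences, and the function names are assumptions, so the expressions may differ in detail from equations (5) and (6).

```python
def l2_loss(pred, truth):
    """Sum of squared differences over all grid points."""
    return sum(
        (p - t) ** 2
        for pred_row, truth_row in zip(pred, truth)
        for p, t in zip(pred_row, truth_row)
    )


def gdl_loss(pred, truth, alpha=1.0):
    """Gradient difference loss: penalizes mismatch of neighboring-point differences."""
    n_rows, n_cols = len(truth), len(truth[0])
    loss = 0.0
    for j in range(n_rows):
        for i in range(n_cols):
            if j > 0:  # difference along the first grid direction
                d_truth = abs(truth[j][i] - truth[j - 1][i])
                d_pred = abs(pred[j][i] - pred[j - 1][i])
                loss += abs(d_truth - d_pred) ** alpha
            if i > 0:  # difference along the second grid direction
                d_truth = abs(truth[j][i] - truth[j][i - 1])
                d_pred = abs(pred[j][i] - pred[j][i - 1])
                loss += abs(d_truth - d_pred) ** alpha
    return loss


# A sharp step and a smeared version of the same step (hypothetical data).
truth = [[0.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 1.0]]
blurred = [[0.0, 0.25, 0.75, 1.0], [0.0, 0.25, 0.75, 1.0]]

l2_value = l2_loss(blurred, truth)
gdl_value = gdl_loss(blurred, truth)
```

For this smeared step, the gradient difference term is considerably larger than the squared-difference term, which indicates how it discourages predictions that lose sharp gradients.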
Let be nondimensionalized flow variables of ground truth flow fields () and be nondimensionalized flow variables of predicted flow fields (). Velocities and pressure are nondimensionalized by and , respectively. enables networks to learn mass conservation by minimizing the sum of the differences between the incoming and outgoing mass flux distributions in the x, y, and z directions as
(7) 
enables networks to learn momentum conservation by minimizing the sum of the differences between the incoming and outgoing momentum flux distributions with additional consideration of pressure at boundaries, as
(8) 
enables networks to learn kinetic energy conservation by comparing the total kinetic energy in a two-dimensional grid as
(9) 
Loss functions , , and provide explicit prior knowledge of physical conservation laws to networks. These loss functions help the networks extract features that include physical conservation laws in a supervised manner.
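The idea behind a physical loss function can be sketched for mass conservation. The example below computes the net mass flux through the boundaries of a two-dimensional rectangular grid and penalizes the mismatch between the predicted and ground truth fields. This is only one possible reading of the description above: the two-dimensional setting, constant density, unit grid spacing, the use of a single net flux rather than flux distributions, and all function names are assumptions, and the expression is not claimed to reproduce equation (7).

```python
def net_mass_flux(u, v, dx=1.0, dy=1.0):
    """Outgoing minus incoming mass flux through the boundaries of a rectangular grid.

    u[j][i] and v[j][i] are velocity components in the first (i) and second (j)
    grid directions; constant density is assumed.
    """
    flux_x = sum(row[-1] - row[0] for row in u) * dy                     # right minus left
    flux_y = sum(top - bottom for top, bottom in zip(v[-1], v[0])) * dx  # top minus bottom
    return flux_x + flux_y


def mass_loss(u_pred, v_pred, u_true, v_true):
    """Mismatch in net boundary mass flux between prediction and ground truth."""
    return abs(net_mass_flux(u_pred, v_pred) - net_mass_flux(u_true, v_true))


# Ground truth: uniform flow, which has zero net mass flux.
u_true = [[1.0] * 4 for _ in range(3)]
v_true = [[0.0] * 4 for _ in range(3)]

# Hypothetical prediction that creates spurious mass at the outlet.
u_pred = [[1.0, 1.0, 1.0, 1.1] for _ in range(3)]
v_pred = [[0.0] * 4 for _ in range(3)]

loss_value = mass_loss(u_pred, v_pred, u_true, v_true)
```

A prediction can match the ground truth closely at every grid point and still violate such an integral constraint, which is the motivation for adding conservation-based terms to the loss.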
is a loss function whose purpose is to delude the discriminator model into classifying the generated predictions of flow fields as class 1, as
(10) 
where is the binary cross entropy loss function defined as
(11) 
for scalar values a and b between 0 and 1. The loss function does not provide any explicit prior knowledge to the network, but provides knowledge in a concealed manner: the features of predicted and ground truth flow fields should be indistinguishable. This is the key loss function that enables the network to extract features corresponding to physical conservation laws in an unsupervised manner, even without prior knowledge of the physics.
The loss function of the discriminator model,
(12) 
is minimized so that the discriminator model appropriately classifies the ground truth flow fields into class 1 and predicted flow fields into class 0. The discriminator model learns concepts in feature space.
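The two adversarial objectives can be sketched with the standard binary cross entropy. In the sketch below, the first argument of `bce` is taken to be a discriminator output and the second the target class label; this argument order, the clamping constant, the summation over scales, and the function names are assumptions, so the code is an illustration rather than a transcription of equations (10) to (12).

```python
import math


def bce(a, b, eps=1e-12):
    """Binary cross entropy for an output a and a target label b, both in [0, 1]."""
    a = min(max(a, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(b * math.log(a) + (1.0 - b) * math.log(1.0 - a))


def generator_adversarial_loss(d_on_predictions):
    """The generator is rewarded when the discriminator outputs class 1 on predictions."""
    return sum(bce(d, 1.0) for d in d_on_predictions)


def discriminator_loss(d_on_truth, d_on_predictions):
    """The discriminator should output 1 on ground truth and 0 on predictions."""
    return (sum(bce(d, 1.0) for d in d_on_truth)
            + sum(bce(d, 0.0) for d in d_on_predictions))


# Hypothetical discriminator outputs at two scales.
d_on_truth = [0.9, 0.8]
d_on_predictions = [0.1, 0.2]

loss_d = discriminator_loss(d_on_truth, d_on_predictions)
loss_g = generator_adversarial_loss(d_on_predictions)
```

With these outputs the discriminator separates the two classes well, so its loss is small while the adversarial loss of the generator is large; training the two models alternately pushes these quantities in opposite directions.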
6 Flow field predictions and evaluations
6.1 Error functions for evaluations
Let be nondimensionalized flow variables of the ground truth flow fields () and be nondimensionalized flow variables of the generated flow field predictions (). Velocities and pressure are nondimensionalized by and , respectively. error of the prediction is evaluated based on the average difference between the predicted flow variables as
(13) 
error of the prediction is evaluated based on the maximum difference between the predicted flow variables as
(14) 
Error of mass conservation is evaluated based on the difference between the incoming and outgoing mass flux on sized domain boundaries as
(15) 
Error of momentum conservation is evaluated based on the difference between the incoming and outgoing momentum flux and applying pressure on sized domain boundaries as
(16) 
Error of kinetic energy conservation is evaluated based on the difference between the kinetic energy inside a sized domain as
(17) 
6.2 Flow field predictions without prior knowledge of conservation laws
In this section, flow fields at Reynolds numbers of 100, 200, 300, and 400 are trained with a generator model (configuration of and number set ) without prior knowledge of conservation laws. Therefore, parameters of loss functions of and are applied. Qualitative flow field predictions at Reynolds numbers of 500 and 3000 from the network are shown in figures 5 and 6. The qualitative flow field predictions from the network show good agreement with the ground truth flow fields even though the domain and grid sizes for predictions ( and ) are larger than those used in training ( and ). This is because the fully convolutional architecture of the generator model, with its multi-scale CNNs, learns to extract low-dimensional representations of local flow features. However, dissipation of small-scale flow structures is observed in the flow field prediction at a Reynolds number of 3000, as the generator has not been trained with flow fields containing small-scale flow structures.
Flow fields after a timestep interval larger than are predicted recursively by utilizing flow field predictions from the previous timestep as a part of the input. The recursively predicted flow fields at are shown in figure 7. The difference between the predicted and ground truth flow fields grows with the number of recursive prediction steps, as errors from the previous prediction are propagated to the next timestep prediction. Dissipation of small-scale flow structures and unphysical large-scale flow structures in the region between the front stagnation point and the separation point and in the wake are observed as the timestep increases; however, large-scale flow structures from vortex shedding are reasonably predicted. Quantitative errors of the flow field predictions are compared with those of other deep learning networks in the following sections.
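The recursive prediction procedure amounts to a sliding window over the most recent four flow fields, with each new prediction appended to the window. The sketch below uses a hypothetical stand-in for a trained generator (a linear extrapolation with a small constant bias) to show how a small one-step error accumulates; the toy model, the flattened field representation, and the function names are assumptions for illustration.

```python
def predict_recursively(model, initial_fields, n_steps, n_input=4):
    """Roll a one-step predictor forward by feeding its predictions back as input."""
    window = list(initial_fields[-n_input:])
    predictions = []
    for _ in range(n_steps):
        next_field = model(window)
        predictions.append(next_field)
        window = window[1:] + [next_field]  # drop the oldest field, append the newest
    return predictions


def toy_model(window, bias=0.01):
    """Hypothetical stand-in for a trained generator: linear extrapolation plus a bias."""
    return [2.0 * b - a + bias for a, b in zip(window[-2], window[-1])]


def truth(n):
    """Ground truth 'flow field' (flattened to two values) at timestep n."""
    return [float(n), 2.0 * n]


initial_fields = [truth(n) for n in range(4)]
predictions = predict_recursively(toy_model, initial_fields, n_steps=4)

# Error of the first variable at each recursive step.
errors = [abs(p[0] - truth(4 + k)[0]) for k, p in enumerate(predictions)]
```

Although the one-step bias is constant, the error grows at every recursive step because each prediction inherits the errors of the predictions in its input window.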
6.3 Flow field predictions with prior knowledge of physical conservation laws
The effects of providing prior knowledge of physical conservation laws are evaluated by training flow fields using a generator model with physical loss functions. The parameters of the loss functions are and ; the physical loss functions are applied and adversarial training is not applied. Errors of flow field predictions from generators (configuration of and number set ) with and without physical loss functions are quantitatively compared in figure 8. Flow field predictions with and without physical loss functions show similar levels of and errors, as both utilize loss functions of and , which minimize the difference between predicted and ground truth flow fields. However, drops in the errors of , , and are observed when physical loss functions are applied. This observation implies that the applied physical loss functions help the supervised feature extraction based network learn physical conservation laws by providing prior knowledge of the conservation laws. The prior knowledge of the conservation laws trains the deep learning network to extract low-dimensional flow features that not only minimize the difference between the predicted and ground truth flow fields, but also minimize the discrepancy in mass, momentum, and kinetic energy between the predicted and ground truth flow fields.
Qualitative predictions of flow fields at Reynolds numbers of 500 and 3000 are shown in figures 9 and 10, respectively, while recursively predicted flow fields at are compared with the ground truth flow fields in figure 11. The qualitative comparison is conducted with a generator model with configuration of and number set . Large-scale flow structures of predicted flow fields after a single timestep () at Reynolds numbers of 500 and 3000 agree well with ground truth data, while dissipation of small-scale flow structures is observed in the flow field prediction at . These qualitative predictions of flow fields do not differ greatly from flow field predictions without prior knowledge of conservation laws. In the case of recursive prediction, small-scale flow structures are dissipated and large-scale flow structures become unphysical as the timestep increases; also, pressure with a high gradient in the cross-flow direction is observed in the wake, which is not observed in results from prediction using the generator model without prior knowledge of physical conservation laws. However, large-scale flow structures from vortex shedding are reasonably predicted. The errors from recursive prediction increase at a first-order (linear) rate (see figure 12), but shows a drop at recursive flow field prediction at . This is because is an error criterion which is dominated by macro flow fields in a domain; therefore, is not crucially influenced by the dissipation of small-scale flow motions.
The effects of the number of weight layers and the number sets of generator models on flow prediction at are shown in figures 13 and 14, respectively. Flow prediction at with number set shows minimum error of , and , while shows minimum error of and . The generator model shows minimum error of , while shows minimum error of , . The largest generator model, , shows minimum error of . No clear trend in errors with respect to network size is observed.
6.4 Flow field predictions with unsupervised feature extraction
The effects of unsupervised feature extraction are analyzed by utilizing a generator model with configuration of and discriminator model with number set . The effect of adversarial training is analyzed by tuning the weight parameters of the loss functions as listed in table 4, while the corresponding errors are compared in figure 15. A high percentage of adversarial training drives the networks to extract low-dimensional features in an unsupervised manner. This unsupervised extraction of low-dimensional features helps the networks discover prior knowledge in data, but it may also cause unstable or inefficient training if the percentage is too high. In our case, the best overall performance in conservation errors (, , and ) is achieved with 7.5% adversarial training, while the best performance in and errors is obtained with no adversarial training; the errors are evaluated on flow field predictions at .
Errors of flow field predictions at Reynolds numbers of 500 and 3000 with and without adversarial training are quantitatively compared in figure 16. Adversarial training of 7.5% is applied. Similar to flow field predictions with prior knowledge of physical conservation laws, comparable levels of and errors and decreases in the errors of , , and are observed. This observation implies that the discriminator model extracts features including physical conservation laws and learns the conservation laws by minimizing the discrepancy between the predicted and ground truth flow fields in feature space. Therefore, unsupervised feature extraction shows potential to discover and learn prior knowledge that is hidden in data. Qualitative predictions of flow fields at Reynolds numbers of 500 and 3000 from the network are shown in figures 17 and 18, respectively. Flow field predictions after a single timestep () at Reynolds numbers of 500 and 3000 show good agreement of large-scale flow fields with ground truth data, while dissipation of small-scale flow structures is observed in the flow field prediction at . These qualitative predictions do not differ greatly from flow field predictions with and without prior knowledge of conservation laws.
Recursively predicted flow fields at are compared with the ground truth flow fields in figure 19; the corresponding errors from recursive prediction increase at a first-order (linear) rate (see figure 20). Dissipation of small-scale flow structures and unphysical large-scale flow structures are observed as the timestep increases. Pressure with a high gradient in the cross-flow direction is observed in the wake, which is consistent with results from prediction using the generator model with prior knowledge of physical conservation laws. Also, large-scale flow structures from vortex shedding are reasonably predicted. The qualitative results are consistent with results from the generator model with prior knowledge of physical conservation laws, which also implies that adversarial training extracts features including physical conservation laws.
The trends of the errors of , , , and are similar to those from flow field predictions with prior knowledge of conservation laws; however, in unsupervised feature extraction shows an additional drop at prediction at .
The effects of the number of weight layers and the number sets of generator models on flow field prediction are shown in figures 21 and 22, respectively. Flow prediction at with number set shows minimum error of and shows minimum error of , , and . The generator model shows minimum error of , , and , while the largest generator model, , shows minimum error of , and . The largest network is observed to learn mass and kinetic energy conservation the best, while difference is learned better in smaller networks.
6.5 Flow field predictions with physical loss functions and adversarial training
Errors from a network with both physical loss functions and adversarial training are compared with errors from networks with physical loss functions or adversarial training (see figure 23). Parameters of loss functions for the network with physical loss functions and adversarial training are and . A generator model with configuration of and number set and discriminator model with number set are employed. Utilizing both physical loss functions and adversarial training does not enhance the performance of flow field predictions, since physical loss functions and adversarial training both extract low-dimensional features including physical conservation laws.
7 Conclusions
Flow fields at future occasions around a circular cylinder at Reynolds numbers of 500 and 3000, which are not in the training dataset, were successfully predicted using deep learning networks that had been trained on flow field datasets produced by numerical simulations at Reynolds numbers of 100, 200, 300, and 400. The utilized deep learning networks provide good predictions of flow fields after a single timestep, even on a larger domain than that used in training, because the networks learn to extract features based on the underlying physics through multi-scale CNNs. However, in the case of recursive prediction, dissipation of small-scale flow structures and unphysical large-scale flow structures appear as the prediction timestep increases, even though large-scale flow structures from vortex shedding are reasonably predicted. The proposed physical loss functions and adversarial training are observed to help the network learn physical conservation laws. In particular, observations from adversarial training imply that a discriminator model is able to learn physical conservation laws even without prior knowledge of the conservation laws. This is because adversarial training extracts features containing physical conservation laws in an unsupervised manner. Thus, unsupervised feature extraction can be applied to many practical datasets, which lack information on the full underlying physics, to extract undiscovered prior knowledge in data.
8 Acknowledgements
This work was supported by Samsung Research Funding Center of Samsung Electronics under Project Number SRFCTB170301.
References
 Dalin et al. (2010) P. Dalin, N. Pertsev, S. Frandsen, O. Hansen, H. Andersen, A. Dubietis, R. Balciunas, A case study of the evolution of a Kelvin–Helmholtz wave and turbulence in noctilucent clouds, Journal of Atmospheric and Solar-Terrestrial Physics 72 (2010) 1129–1138.
 Berger and Wille (1972) E. Berger, R. Wille, Periodic flow phenomena, Annual Review of Fluid Mechanics 4 (1972) 313–340.
 Marcus (1988) P. S. Marcus, Numerical simulation of Jupiter’s great red spot, Nature 331 (1988) 693–696.
 Freymuth (1966) P. Freymuth, On transition in a separated laminar boundary layer, Journal of Fluid Mechanics 25 (1966) 683–704.
 Ruderich and Fernholz (1986) R. Ruderich, H. Fernholz, An experimental investigation of a turbulent shear flow with separation, reverse flow, and reattachment, Journal of Fluid Mechanics 163 (1986) 283–322.
 Wu and Moin (2009) X. Wu, P. Moin, Direct numerical simulation of turbulence in a nominally zero-pressure-gradient flat-plate boundary layer, Journal of Fluid Mechanics 630 (2009) 5–41.
 Babucke et al. (2008) A. Babucke, M. Kloker, U. Rist, DNS of a plane mixing layer for the investigation of sound generation mechanisms, Computers & Fluids 37 (2008) 360–368.
 Sirovich (1987) L. Sirovich, Turbulence and the dynamics of coherent structures part I: coherent structures, Quarterly of Applied Mathematics 45 (1987) 561–571.
 Schmid (2010) P. J. Schmid, Dynamic mode decomposition of numerical and experimental data, Journal of Fluid Mechanics 656 (2010) 5–28.
 Mezić (2013) I. Mezić, Analysis of fluid flows via spectral properties of the Koopman operator, Annual Review of Fluid Mechanics 45 (2013) 357–378.
 Wu (2011) T. Y. Wu, Fish swimming and bird/insect flight, Annual Review of Fluid Mechanics 43 (2011) 25–58.
 Yonehara et al. (2016) Y. Yonehara, Y. Goto, K. Yoda, Y. Watanuki, L. C. Young, H. Weimerskirch, C.-A. Bost, K. Sato, Flight paths of seabirds soaring over the ocean surface enable measurement of fine-scale wind speed and direction, Proceedings of the National Academy of Sciences 113 (2016) 9039–9044.
 Rumelhart et al. (1986) D. E. Rumelhart, G. E. Hinton, R. J. Williams, Learning representations by back-propagating errors, Nature 323 (1986) 533.
 Lee et al. (1997) C. Lee, J. Kim, D. Babcock, R. Goodman, Application of neural networks to turbulence control for drag reduction, Physics of Fluids 9 (1997) 1740–1747.
 Pruvost et al. (2001) J. Pruvost, J. Legrand, P. Legentilhomme, Three-dimensional swirl flow velocity-field reconstruction using a neural network with radial basis functions, Journal of Fluids Engineering 123 (2001) 920–927.
 Chen et al. (2002) Y. Chen, G. Kopp, D. Surry, Interpolation of wind-induced pressure time series with an artificial neural network, Journal of Wind Engineering and Industrial Aerodynamics 90 (2002) 589–615.
 Milano and Koumoutsakos (2002) M. Milano, P. Koumoutsakos, Neural network modeling for near wall turbulent flow, Journal of Computational Physics 182 (2002) 1–26.
 Sarghini et al. (2003) F. Sarghini, G. De Felice, S. Santini, Neural networks based subgrid scale modeling in large eddy simulations, Computers & Fluids 32 (2003) 97–108.
 Krizhevsky et al. (2012) A. Krizhevsky, I. Sutskever, G. E. Hinton, Imagenet classification with deep convolutional neural networks, in: Advances in Neural Information Processing Systems, 2012, pp. 1097–1105.
 Tracey et al. (2015) B. Tracey, K. Duraisamy, J. Alonso, A machine learning strategy to assist turbulence model development, AIAA Paper 2015-1287 (2015).
 Zhang and Duraisamy (2015) Z. J. Zhang, K. Duraisamy, Machine learning methods for data-driven turbulence modeling, AIAA Paper 2015-2460 (2015).
 Ling et al. (2016) J. Ling, A. Kurzawski, J. Templeton, Reynolds averaged turbulence modelling using deep neural networks with embedded invariance, Journal of Fluid Mechanics 807 (2016) 155–166.
 Guo et al. (2016) X. Guo, W. Li, F. Iorio, Convolutional neural networks for steady flow approximation, in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, 2016, pp. 481–490.
 Miyanawala and Jaiman (2017) T. P. Miyanawala, R. K. Jaiman, An efficient deep learning technique for the Navier-Stokes equations: Application to unsteady wake flow dynamics, arXiv preprint arXiv:1710.09099 (2017).
 Radford et al. (2015) A. Radford, L. Metz, S. Chintala, Unsupervised representation learning with deep convolutional generative adversarial networks, arXiv preprint arXiv:1511.06434 (2015).
 Denton et al. (2015) E. L. Denton, S. Chintala, R. Fergus, et al., Deep generative image models using a Laplacian pyramid of adversarial networks, in: Advances in Neural Information Processing Systems, 2015, pp. 1486–1494.
 Oord et al. (2016) A. v. d. Oord, N. Kalchbrenner, K. Kavukcuoglu, Pixel recurrent neural networks, arXiv preprint arXiv:1601.06759 (2016).
 van den Oord et al. (2016) A. van den Oord, N. Kalchbrenner, L. Espeholt, O. Vinyals, A. Graves, et al., Conditional image generation with PixelCNN decoders, in: Advances in Neural Information Processing Systems, 2016, pp. 4790–4798.
 Srivastava et al. (2015) N. Srivastava, E. Mansimov, R. Salakhudinov, Unsupervised learning of video representations using LSTMs, in: International Conference on Machine Learning, 2015, pp. 843–852.
 Ranzato et al. (2014) M. Ranzato, A. Szlam, J. Bruna, M. Mathieu, R. Collobert, S. Chopra, Video (language) modeling: a baseline for generative models of natural videos, arXiv preprint arXiv:1412.6604 (2014).
 Mathieu et al. (2015) M. Mathieu, C. Couprie, Y. LeCun, Deep multiscale video prediction beyond mean square error, arXiv preprint arXiv:1511.05440 (2015).
 Goodfellow et al. (2014) I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio, Generative adversarial nets, in: Advances in Neural Information Processing Systems, 2014, pp. 2672–2680.
 You et al. (2008) D. You, F. Ham, P. Moin, Discrete conservation principles in large-eddy simulation with application to separation control over an airfoil, Physics of Fluids 20 (2008) 101515.
 Kingma and Ba (2014) D. P. Kingma, J. Ba, Adam: A method for stochastic optimization, arXiv preprint arXiv:1412.6980 (2014).
List of Figures
 1 The computational domain for numerical simulations. denotes the number of mesh points, where , , , , , , and . The domain size and number of mesh points in the spanwise direction are and .
 2 Instantaneous flow variables of ( sized grid and sized domain). The procedure of subsampling five consecutive flow fields to an input () and ground truth () of a training sample on a sized grid and sized domain.
 3 Schematic diagram of generator models. is the set of input flow fields and denotes interpolated input flow fields on a smaller scale grid, while domain sizes are identical. indicates a generative CNN which is fed with input , while indicates the set of generated predictions of flow fields from generative CNN . indicates the rescale operator, which upscales the grid size by a factor of two in both height and width.
 4 Schematic diagram of the discriminator model. indicates the discriminative network which is fed with and . indicates the set of generated predictions of flow fields from generative CNN , while indicates the set of ground truth flow fields.
 5 Comparison between the predicted and ground truth flow fields at test data () using a generator model ( and ) based on supervised feature extraction without prior knowledge of physical conservation laws: (a–d) are the input sets of flow fields, while (e) and (f) are the predicted and ground truth flow fields. Each group of two rows from top to bottom indicates , , , and . 15 contours for , , and range from -0.5 to 1.0, -0.7 to 0.7, and -0.5 to 0.5. 20 contours for range from -1.0 to 0.4. Solid and dotted lines indicate positive and negative levels, respectively. The timestep interval between flow fields is .
 6 Comparison between the predicted and ground truth flow fields at test data () using a generator model ( and ) based on supervised feature extraction without prior knowledge of physical conservation laws: (a–d) are the input sets of flow fields, while (e) and (f) are the predicted and ground truth flow fields. Each group of two rows from top to bottom indicates , , , and . 15 contours for , , and range from -0.5 to 1.0, -0.7 to 0.7, and -0.5 to 0.5. 20 contours for range from -1.0 to 0.4. Solid and dotted lines indicate positive and negative levels, respectively. The timestep interval between flow fields is .
 7 Comparison between the recursively predicted flow fields and ground truth flow fields of flow variables at using a generator model ( and ) based on supervised feature extraction without prior knowledge of physical conservation laws: (a) , (b) , (c) , (d) . The first and second rows show the recursively predicted and ground truth flow fields after , , (from left to right). 15 contours for , , and range from -0.5 to 1.0, -0.7 to 0.7, and -0.5 to 0.5. 20 contours for range from -1.0 to 0.4. Solid and dotted lines indicate positive and negative levels, respectively. The timestep interval between input flow fields is .
 8 Comparison between errors from generator models with and without prior knowledge of physical conservation laws: (a) and (b) are the and errors; (c), (d), and (e) are the , , and errors. denotes errors from a generator model based on supervised feature extraction without prior knowledge of physical conservation laws (configuration of , number set , and ); denotes errors from a generator model based on supervised feature extraction with prior knowledge of physical conservation laws (configuration of , number set , and ). The error bars represent the standard deviation.
 9 Comparison between the predicted and ground truth flow fields at test data () using a generator model ( and ) with prior knowledge of physical conservation laws: (a–d) are the input sets of flow fields, while (e) and (f) are the predicted and ground truth flow fields. 15 contours for , , and range from -0.5 to 1.0, -0.7 to 0.7, and -0.5 to 0.5. 20 contours for range from -1.0 to 0.4. Solid and dotted lines indicate positive and negative levels, respectively. The timestep interval between flow fields is .
 10 Comparison between the predicted and ground truth flow fields at test data () using a generator model ( and ) with prior knowledge of physical conservation laws: (a–d) are the input sets of flow fields, while (e) and (f) are the predicted and ground truth flow fields. 15 contours for , , and range from -0.5 to 1.0, -0.7 to 0.7, and -0.5 to 0.5. 20 contours for range from -1.0 to 0.4. Solid and dotted lines indicate positive and negative levels, respectively. The timestep interval between flow fields is .
 11 Comparison between the recursively predicted flow fields and ground truth flow fields of flow variables at using a generator model ( and ) with prior knowledge of physical conservation laws: (a) , (b) , (c) , (d) . The first and second rows show the recursively predicted and ground truth flow fields after , , (from left to right). 15 contours for , , and range from -0.5 to 1.0, -0.7 to 0.7, and -0.5 to 0.5. 20 contours for range from -1.0 to 0.4. Solid and dotted lines indicate positive and negative levels, respectively. The timestep interval between input flow fields is .
 12 Flow prediction errors at respect to recursive prediction step using a generator model ( and ) with prior knowledge of physical conservation laws: (a) and (b) are the and errors; (c), (d), and (e) are the , , and errors.
 13 Flow prediction errors at respect to three cases of number set using generator models with prior knowledge of physical conservation laws: (a) and (b) are the and errors; (c), (d), and (e) are the , , and errors.
 14 Flow prediction errors at respect to three cases of generator configuration with prior knowledge of physical conservation laws: (a) and (b) are the and errors; (c), (d), and (e) are the , , and errors.
 15 Flow prediction errors at respect to adversarial training percentage using a generator model ( and ): (a) and (b) are the and errors; (c), (d), and (e) are the , , and errors.
 16 Comparison between errors from generator models with and without unsupervised feature extraction: (a) and (b) are the and errors; (c), (d), and (e) are the , , and errors. denotes errors from a generator model based on supervised feature extraction without prior knowledge of physical conservation laws (configuration of , number set , and ); denotes errors from a generator model associated with unsupervised feature extraction (configuration of , number set , , , and ). The error bars represent the standard deviation.
 17 Comparison between the predicted and ground truth flow fields at test data () using a generator model ( and ) associated with unsupervised feature extraction: (a–d) are the input sets of flow fields, while (e) and (f) are the predicted and ground truth flow fields. 15 contours for , , and range from -0.5 to 1.0, -0.7 to 0.7, and -0.5 to 0.5. 20 contours for range from -1.0 to 0.4. Solid and dotted lines indicate positive and negative levels, respectively. The timestep interval between flow fields is .
 18 Comparison between the predicted and ground truth flow fields at test data () using a generator model ( and ) associated with unsupervised feature extraction: (a–d) are the input sets of flow fields, while (e) and (f) are the predicted and ground truth flow fields. 15 contours for , , and range from -0.5 to 1.0, -0.7 to 0.7, and -0.5 to 0.5. 20 contours for range from -1.0 to 0.4. Solid and dotted lines indicate positive and negative levels, respectively. The timestep interval between flow fields is .
 19 Comparison between the recursively predicted flow fields and ground truth flow fields of flow variables at using a generator model ( and ) associated with unsupervised feature extraction: (a) , (b) , (c) , (d) . The first and second rows show the recursively predicted and ground truth flow fields after , , (from left to right). 15 contours for , , and range from -0.5 to 1.0, -0.7 to 0.7, and -0.5 to 0.5. 20 contours for range from -1.0 to 0.4. Solid and dotted lines indicate positive and negative levels, respectively. The timestep interval between input flow fields is .
 20 Flow prediction errors at respect to recursive prediction step using a generator model ( and ) associated with unsupervised feature extraction: (a) and (b) are the and errors; (c), (d), and (e) are the , , and errors.
 21 Flow prediction errors at respect to three cases of number set using generator models associated with unsupervised feature extraction: (a) and (b) are the and errors; (c), (d), and (e) are the , , and errors.
 22 Flow prediction errors at respect to three cases of generator configuration associated with unsupervised feature extraction: (a) and (b) are the and errors; (c), (d), and (e) are the , , and errors.
 23 Comparison between errors from networks associated with unsupervised feature extraction with and without prior knowledge of physical conservation laws: (a) and (b) are the and errors; (c), (d), and (e) are the , , and errors. denotes errors from a generator model with prior knowledge of physical conservation laws (configuration of , number set , and ); denotes errors from a generator model associated with unsupervised feature extraction (configuration of , number set , , , and ); denotes errors from a generator model with both unsupervised feature extraction and prior knowledge of physical conservation laws (configuration of , number set , and ). The error bars represent the standard deviation.
Number set  

16  32  64  128  256  
32  64  128  256  512  
64  128  256  512  1024 
Configuration  Generator  Feature map sizes  Kernel sizes 

16 4  3 3 3  
20 4  5 3 5  
20 4  5 3 3 3 5  
20 4  7 5 5 5 7  
16 4  3 3 3 3  
20 4  5 3 3 5  
20 4  5 3 3 3 5  
20 4  7 5 5 5 7  

16 4  3 3 3 3  
20 4  5 3 3 5  
20 4  5 3 3 3 3 5  
20 4  7 5 5 5 5 7 
Layer type  Discriminator  Feature map/layer sizes  Kernel sizes 

CNN  4  3  
4  3 3 3  
4  5 5 5  
4  7 7 5 5  
FC  1  
1  
1  
1 
Calculations  

0% adv  0.5  0.5  0.0 
2.5% adv  0.4875  0.4875  0.025 
5% adv  0.475  0.475  0.05 
7.5% adv  0.4625  0.4625  0.075 
10% adv  0.45  0.45  0.1 
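
The tabulated values above are consistent with a simple rule: the last column equals the stated adversarial percentage expressed as a fraction, and the remaining weight is split equally between the first two columns, so that each row sums to one. The column headers were lost in this copy, so the following Python sketch uses the hypothetical names `w_first`, `w_second`, and `w_adv`; it only reproduces the numbers in the table and makes no claim about which loss term each column multiplies.

```python
def loss_weights(adv_fraction):
    """Return (w_first, w_second, w_adv) for a given adversarial fraction.

    The adversarial weight equals adv_fraction; the remainder is split
    equally between the two other weights (names are hypothetical).
    """
    if not 0.0 <= adv_fraction <= 1.0:
        raise ValueError("adv_fraction must lie in [0, 1]")
    shared = 0.5 * (1.0 - adv_fraction)
    return shared, shared, adv_fraction


# Reproduce the rows of the table for the listed adversarial percentages.
for percent in (0.0, 2.5, 5.0, 7.5, 10.0):
    w_first, w_second, w_adv = loss_weights(percent / 100.0)
    print(f"{percent:g}% adv  {w_first:.4f}  {w_second:.4f}  {w_adv:.3f}")
```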