Driving Digital Rock towards Machine Learning: predicting permeability with Gradient Boosting and Deep Neural Networks
Abstract
We present a research study aimed at testing of applicability of machine learning techniques for prediction of permeability of digitized rock samples. We prepare a training set containing 3D images of sandstone samples imaged with Xray microtomography and corresponding permeability values simulated with Pore Network approach. We also use Minkowski functionals and Deep Learningbased descriptors of 3D images and 2D slices as input features for predictive model training and prediction. We compare predictive power of various feature sets and methods. The later include Gradient Boosting and various architectures of Deep Neural Networks (DNN). The results demonstrate applicability of machine learning for imagebased permeability prediction and open a new area of Digital Rock research.
keywords:
Digital Rock, Machine Learning, Artificial Neural Networks, permeability prediction, gradient boosting1 Introduction and Motivation
Digital Rock is an approach for computing properties of a rock. The paradigm of Digital Rock Physics is âimageandcomputeâ. First, the rock sample is imaged to receive a 3D representation of mineral phase and pore space chauhan2016processing (); koroteev2017method (); andra2013digital_1 (); blunt2013pore (). After that, this 3D representation is used to simulate physical processes in the sample koroteev2014direct ().
Recent methods of 3D imaging of pore topology include microscale xray computed tomography (µxCT). This is imaging of rock samples with resolution down to tens of nanometers (voxel size) or hundreds of nanometers (physical resolution). It allows imaging internal structure of finestructured samples precisely and nondestructively. The resulting representation, after removing µxCT scanning artifacts and segmentation chauhan2016processing (); koroteev2017method (), is processed to retain sampleâs petrophysical properties.
Applications of the digital rock technologies cover the items below:

calculation of transport properties like absolute permeability and relative permeability andra2013digital_2 (); koroteev2014direct (); berg2017industrial (); blunt2013pore (),

calculation of electric, elastic, geomechanic properties and NMR response andra2013digital_2 (); blunt2013pore (); evseev2015coupling (),

screening the methods for enhanced oil recovery koroteev2013application (),

assessing the potential of chemical treatment at well stimulation klemin2015digital ().
The recent advances in highresolution imaging, high performance computing and machine learning are to lead to new and more effective computation.
This work focuses on applying advances in Deep Learning image processing to Digital Rock Physics. Our goal is to build fast approximation models, or socalled surrogate models, to predict permeability based on the results of physical modeling (example of such modeling could be found in belyaev2016gtapprox ()). Such models are an acknowledged method of solving various industrial engineering problems grihon2013surrogate ().
Using VGG16 DNN simonyan2014very () network, we recover a set of descriptors for 2D layers, which compose 3D image, and utilize their lowdimensional representation to compute sample permeability. The results outperform frequently used technique of using Minkowski functionals as input features for Machine Learning algorithm when used to predict logarithmic permeability.
We also evaluate application of conventional Deep Learning models, Convolutional Neural Networks (CNN), which are frequently used in analysis of multidimensional data, to µxCT voxel rock scans. We apply CNN in an endtoend fashion to simultaneously extract features and to execute regression for permeability prediction. The advantage of such approach lies in the fact that it does not require manual feature engineering, yet provides same results in terms of accuracy.
2 Data Acquisition
For model evaluation, we used a sample from the Berea Sandstone Petroleum Cores (Ohio, USA). The 3D image already has its artifacts removed and its segmentation computed by Imperial College London. The segmented sample makes no distinction between different rock phases, denoting every rock voxel as 0, and pore voxel as 1. Initial sample consisted of elements with a voxel size of 5.345 µm.
To generate a dataset for machine learning algorithms, the sample was cut into intersecting voxel cubes with the shift of 15 voxels and the same step size of 5.345 µm, leading to a dataset of 9261 samples total. Each of this smaller samples can be examined as an independent rock image, retaining some geometrical properties of parental Berea sample.
To compute initial permeability values for voxel cubes in the dataset, we used porescale network modelling code (courtesy of Taha Sochi) sochi2010pore (), paired with OpenPNM framework putz2013introducing (). Network model is a simplified representation of rock geometry, consisting of spherical pores connected by cylindrical throats, usually stored in Statoil format. The network representation was then used to compute rock permeability of each cut rock sample, making use of Darcyâs law and considering the flow type to be Stokes flow. As a result, we had a dataset of 9261 100x100x100 voxel cubes and corresponding permeability values, measured in millidarcies. The byproduct of calculations is a set of 9261 network models.
3 Regression on generated features
3.1 Feature generation
Three different approaches for feature generation for regression were examined. First, we tried to explicitly use network models’ characteristics. Second, we computed a wellknown set of geometrical descriptors, Minkowski Functionals, for use as an input for predictor. Finally, we used a set of 2D image descriptors with reduced dimensionality, acquired with VGG16 DNN and Principal Component Analysis (PCA), as a feature set.
Network Features
We considered a number of network modelâs characteristics, which could influence the permeability value for a given sample. These features are: median pore radius, mean pore radius, median throat radius, mean throat radius, median throat length, mean throat length, median pore connectivity number, mean pore connectivity number, total pore count.
As in Stokes flow, we considered advective inertial forces to be small, compared to viscous forces. The permeability of a sample is then proportional to the area of phase transition surface, which is, in turn, proportional to some of the included features. However, this approach had proven to be inferior to others.
Minkowski Functionals
Minkowski functionals (also known as intrinsic volumes or quermass integrals) are additive morphological measures, initially defined for convex objects in the field of integral geometry. It has been shown that every measure on the finite union of compact convex sets in can be expressed as a linear combination of the four Minkowski functionals. For these Minkowski functionals are the volume, area, mean breadth and the EulerPoincare characteristic. In the recent years, these functionals have found their use in astronomy, material science, medicine and biology blasquez2003efficient (); guderlei2007algorithms (); berchtold2007modelling (); vogel2010quantification (), as well as voxelbased surface recognition yarotsky2017geometric ().
The additive property of Minkowski functional for two convex sets A and B can be expressed as .This property, as well as discrete structure of 3D voxel image, considerably simplifies their computation for rock samples dataset, as for such objects the procedure is reduced to enumeration of open voxels, faces, edges and vertices blasquez2003efficient ().
For efficient computation of the functionals, we utilized method, described in blasquez2003efficient (), which is based on using binary decision diagrams. This method takes advantage of the local configuration around each added voxel.
Minkowski Functionals for Rescaled Samples
Another approach to evaluating rock permeability using its voxel image is to include not only Minkowski functionals for the sample itself, but also functionals for rescaled sample as an input to machine learning algorithm srisutthiyakorn2016deep (). The intuition behind this technique is that, while Minkowski functionals retain some fine information about geometrical structure of the sample, calculating them for rescaled sample to some extent could provide insights on the geometrical structure on a larger scale, leading to better mathematical description of the sample and its viscous flow properties.
A rescaled sample of magnitude is a voxel cube, which dimensions are times smaller. A given voxel is set to 1 (pore phase), if the average of voxels in corresponding range in initial sample is no less than a specified threshold. In this work, a threshold of 0.5, and magnitudes of 2, 5, 10 and 25 were used.
VGGPCA Descriptors
VGG is a name for family of Deep Convolutional Neural Networks (DCNN) for 2D image recognition. They were introduced on ImageNet Challenge 2014 simonyan2014very (), and their development marked the advent of Deep Learning era. Not only these networks could provide accurate image classification and generalize well, but also pretrained network could be finetuned to a given problem to quickly receive a reasonable performance in terms of some metric without requiring to additionally increase dataset size and training time.
A common approach to finetuning is to remove several bottom layers from pretrained DNN or CNN, exposing one of dense layers, which are typically denoted as FC1000 or FC4096 (see the VGG architecture in simonyan2014very ()). A number of layers is then added to the network with regards to the problem specifics and they are then trained on the new data. In our approach, we simply extract features from FC4096 layer for each 2D slice of the scan, represented as an image. Then, they are used as input features for regression after additional processing with PCA to reduce their dimensionality.
One important property of VGG network dense layers is that they retain a significant proportion of image structure, as their output alone for a given image is frequently enough to correctly classify it or process it in some other way. Although they are not interpretable, this set of values can provide a great insight on image composition, pattern distribution, and its specifics in general.
Considering rock permeability prediction, we recovered descriptors of 2D layers, or essentially voxel rectangles, which compose a given sample in the dataset. To process binary image, we converted all rock voxels to vectors in RGB code, and pore voxels to . Accordingly, voxel layers had to be resized to size . Output of second fully connected layer of size 4096 was used to recover the descriptors.
The feature vector of length 409600, obtained as a result of concatenation of 100 layer descriptors, can then be interpreted as a descriptor of sample as a whole, retaining information about individual layersâ structures and their position in the voxel cube through featuresâ placement.
The enormous vector size makes it irrational to use it directly as input for regression model. Instead, we used PCA to reduce the dimensionality of the input vectors, making it possible to use conventional models without significant modifications. Principal componentsâ size of 1350 was used.
3.2 Regression methods
We used two regression methods to evaluate predictive power of generated features: gradient regression trees (XgBoost) and Deep Neural Networks (DNN)
XgBoost
XgBoost is a gradient boosting library xgboost (), which provides powerful prediction model, consisting of numerous weak prediction models chen2016xgboost (). It is commonly used in Machine Learning due to its computation speed and interpretability of results. Initially we found model hyperparameters, which yielded better results in terms of metric (this metric is described in detail in section 5), by grid search. It is a technique which involves training the model with different hyperparameters and examining its performance on holdout validation subset.
Used parameters for all feature groups are described in Table 1.
Artificial Neural Networks
learning_rate=0.05 
n_estimators=400 
max_depth=5 

min_child_weight=6 
gamma=0.1 
subsample=1 
colsample_bytree=1 
reg_alpha=0.5 
reg_lambda=1 
4 Endtoend regression
We evaluated the usage of endtoend Convolutional Neural Networks (CNN) modeling technique which is very common in image processing. We used 2D CNN to execute regression on individual 2D slices of a sample and 3D CNN to process the samples as a whole.
2D Convolutional Neural Networks
CNNs represent a class of deep feedforward artificial neural networks, which are most commonly applied to analyzing visual imagery.
As for usual multilayer perceptrons, convolutional neural networks were inspired by biological structures, specifically, by the organization of the animal visual cortex. Different cortical neurons respond to different kinds of stimuli only in a restricted region of the visual field known as the receptive field matsugu2003subject ().
In artificial neural networks, this approach is realized by convolutional and pooling layers. Convolutional layer consists of a number of filters, each of which is trained to respond to a specific pattern. Filters are iterated over the input tensor, and for each visited position Hadamard product of filter weights and corresponding input values is calculated. The sum of the elements in resulting matrix is then passed further. The subset of input values, analyzed by filter on a given step, is called receptive field of the filter.
This generalization of biological approach leads to a number of advantages, such as shiftinvariance and parameter sharing lecun2015deep (). But the most important distinguishing feature in the context of the task at hand is emphasizing spatial structure of the data. As porosity greatly depends on the number and shapes of pores, treating input voxels which are close to each other differently than voxels which are far away is invaluable property in estimating both local and global spatial structure of the rock sample.
The conventional use case of CNNs includes analyzing images, most commonly represented as 3D tensors, where first two dimensions correspond to directions in the image, and the final corresponds to color channels (red, green, blue) for a given pixel. In this work, we used 2D convolutions with 3D rock samples the same way as for images. Instead of vectors of color channels, we used vectors of voxel values on other layers: three color channels are replaced for 100, each corresponding to a layer of the sample. After series of other convolutions, maxpooling operations and fully connected layers, the predicted value for sample permeability is calculated.
However, this approach has significant disadvantage due to the fact that each filter works with tensors, disregarding information from neighboring layers. This nuance is mitigated for 3D CNNs, where each filter examines local region across all three dimensions of a 3D rock sample.
3D Convolutional Neural Networks
3D CNN are based on the same idea as 2D CNN, but 3D filters are used for convolutional layers. In context of permeability prediction, this allows a given filter to receive the local information for a given voxel not just from the same layer, but also from neighbouring layers. As rock pores are threedimensional, such approach provides more practical information to each network unit.
5 Model Evaluation and Results
We separately compared predictive power of feature sets, and different prediction methods. Gradient Boosting regression algorithm (XgBoost) chen2016xgboost () was used with different feature sets to compare their predictive power. They were chosen mostly for their interpretability and relatively straightforward modus operandi.
All prediction methods were compared with each other. For better interpretability of results, we used a special metric, denoted as
Here, denotes true permeability value for sample , is a predicted permeability value for sample , and is the th percentile of true permeability histogram for a given cube. This metric is more informative than mean squared error. Conventional error does allow to compare algorithmic approaches to each other, but provides zero information about how large is the error compared to variability of the data. Using such denominator allows to account for that.
XgBoost
Below are results for selected feature types and feature types combinations for XgBoost approach. The last row of each table corresponds to feature group combination, yielding the best result for a given method. To produce the charts, only error values below 90th percentile were used, as for each method there exist strong outliers.
Here and below VGGPCA corresponds to introduced VGGPCA descriptors, NET corresponds to rock sample network’s features, and MX corresponds to Minkowski Functionals for Xrescaled cube.
Used Feature Groups  Validation 

VGGPCA  0.0451 
NET  0.0417 
M1  0.0421 
M1 + M2 + M5 + M10 + M25  0.0406 
M2 + M5 + M10 + NET  0.0368 
To further improve the results, training and prediction was done with logarithms of permeability, and was computed for exponent of prediction. This empirical technique has proven to lead to better results in some cases.
Used Feature Groups  Validation 

VGGPCA  0.0367 
NET  0.0372 
M1  0.0391 
M1 + M2 + M5 + M10 + M25  0.0370 
VGGPCA + M1 + M5 + M25 + NET  0.0338 
Introduced VGGPCA features perform much better when used with logarithmic permeability. Despite not being able to provide the best results individually, we found that they are included into the 25 top scoring feature types combinations. Interestingly, the result of VGGPCA features is much worse for usual permeability, but examining the cause for that would require delving into VGG16 architecture specifics, which would go outside of the scope of this article.
These results should be taken with a grain of salt, as, strictly speaking, they only correspond to the predictive power of these feature groups for a given sandstone sample, and only for XgBoost model.
Deep Neural Networks
Dense (units=2048, activation=”relu”) 

Dense (units=2048, activation=”relu”) 
Dense (units=1024, activation=”relu”) 
Dense (units=512, activation=”relu”) 
Dense (units=256, activation=”relu”) 
Dense (units=1, activation=None) 
To evaluate predictive power of feature groups paired with neural networks, we used three feature group combinations: VGGPCA, M1 + M2 + M5 + M10 + M25 and VGGPCA + M1 + M2 + M5 + M10 + M25. The number of considered feature group combinations was reduced to decrease the time spent on training the models, as neural networks are significantly more computationally demanding.
We evaluated several straightforward multilayer perceptron (MLP) architectures for each feature group combination, and the best one was selected for comparison. It is presented in Table 4.
Used Feature Groups  Validation 

VGGPCA  0.0287 
M1 + M2 + M5 + M10 + M25  0.0441 
VGGPCA + M1 + M2 + M5 + M10 + M25  0.0384 
The only difference between best architectures is that Minkowski functionals seem to provide better results when batch normalization layer is added before the output unit. In the following table we present value for all considered feature group combinations.
Addition of VGGPCA descriptors to Minkowski functionals allows to reduce prediction error. However, the best result is achieved when they are used separately from the other features. This happens due to the fact that the network was not given enough training time in order to null excessive information coming from Minkowski functionals, which introduced additional error.
The number of training epochs was limited to 50 for all tested architectures, batch size was set to 8, and Adam optimizer with a learning rate of 0.001 was used. All remaining hyperparameters were set to default. Early stopping was used in order to determine best validation score.
Convolutional Neural Networks
Below we present best performing 2D and 3D CNN architectures. As for other approaches, logarithmic permeability was used as an output value.
Used 2D CNN is inspired by VGG16 network lecun2015deep (), which was used to compute VGGPCA descriptors. We consider each sample to have 100 channels, each corresponding to an individual layer. Model was trained with Adam optimizer with default parameters, for 20 epochs and a batch size of 32.
2D Convolutional (filters=64, kernel_size=3, activation=”relu”, padding=”same”) 

2D Convolutional (filters=64, kernel_size=3, activation=âreluâ, padding=âsameâ) 
2D Convolutional (filters=64, kernel_size=3, activation=âreluâ, padding=âsameâ) 
Max Pooling (pool_size=(2, 2), strides=(2, 2)) 
2D Convolutional (filters=128, kernel_size=3, activation=âreluâ, padding=âsameâ) 
2D Convolutional (filters=128, kernel_size=3, activation=âreluâ, padding=âsameâ) 
2D Convolutional (filters=128, kernel_size=3, activation=âreluâ, padding=âsameâ) 
Max Pooling (pool_size=(2, 2), strides=(2, 2)) 
2D Convolutional (filters=256, kernel_size=3, activation=âreluâ, padding=âsameâ) 
2D Convolutional (filters=256, kernel_size=3, activation=âreluâ, padding=âsameâ) 
2D Convolutional (filters=256, kernel_size=3, activation=âreluâ, padding=âsameâ) 
Max Pooling (pool_size=(2, 2), strides=(2, 2)) 
2D Convolutional (filters=512, kernel_size=3, activation=âreluâ, padding=âsameâ) 
2D Convolutional (filters=512, kernel_size=3, activation=âreluâ, padding=âsameâ) 
2D Convolutional (filters=512, kernel_size=3, activation=âreluâ, padding=âsameâ) 
Max Pooling (pool_size=(2, 2), strides=(2, 2)) 
Dense (1024, activation=âreluâ) 
Dropout(0.5) 
Dense (512, activation=âreluâ) 
Dropout(0.5) 
Dense (1, activation=None) 
Used 3D CNN architecture was inspired by VoxNet maturana2015voxnet (), which was initially used for object recognition. Compared with 2D convolutions, the application of 3D CNN is straightforward. Model was also trained with Adam optimizer with default parameters, for 20 epochs and a batch size of 32. Valid padding was used for all convolutional layers.
3D CNN of similar structure have proven to be an efficient way of solving 3D shapes retrieval problem notchenko2017large (). It means that they are able to learn efficient descriptors for 3D objects which are bound to be effective for regression.
3D Convolutional (filters=32, kernel_size=5, strides=2, activation=”relu”) 

3D Convolutional (filters=32, kernel_size=5, strides=2, activation=”relu”) 
Max Pooling 3D (pool_size=(2, 2), strides=(1, 1)) 
3D Convolutional (filters=32, kernel_size=3, strides=1, activation=”relu”) 
3D Convolutional (filters=32, kernel_size=3, strides=1, activation=”relu”) 
Max Pooling 3D (pool_size=(2, 2), strides=(1, 1)) 
Dense (128, activation=âreluâ) 
Dense (64, activation=âreluâ) 
Dense (1, activation=None) 
Below we present the boxplot of errors for neural network approaches for permeability prediction. Once, again, strong outliers lying above 90th percentile of errors were not used.
Overall method evaluation
Below we present comparison of best results for each considered method. Overall, 3D convolutional neural networks have proven to be superior both in terms of metric and of error distribution. Therefore, we would advice to consider them as a primary direction for development of datadriven permeability prediction models.
Model and Approach  

M2+M5+M10+NET XGB  0.0368 
VGGPCA+M1+M5+M25+NET LOG XGB  0.0338 
VGGPCA MLP  0.0287 
3D CNN  0.0284 
6 Conclusions and Discussion
Results of this pilot study clearly demonstrate a significant potential of machine learning for imagebased permeability prediction. The datadriven approach is foreseen as a true game changer for digital rock technology because it is extremely fast and scalable. Moreover, same approach seem to be applicable not only to single phase permeability prediction, but to prediction of more complex properties relevant to petrophysics, structural geology and field development. The properties may include relative phase permeabilities, formation factor and resistivity, dielectric permittivity, elastic and geomechanical properties, NMR response and others. There are options to enrich input data with an information on mineral distribution, wettability and intergrain contacts to ensure the highest possible predictive power.
So, there are many things to study in future and, one should also note that the recent developments in deep learning are likely to afford prediction of not only the static characteristics of the digitized rock samples of a single type, but dynamics of fluid displacement as well. Being assisted with a feature set containing information about pore fluids, this will be a promising direction for enhanced oil recovery applications of the imagebased digital rock. Authors should mention, that this direction is likely to be developed together with physicsdriven pore scale modeling like koroteev2014direct () only, as without the physicsbased models there will be no actual data to train on.
Additionally, we plan to evaluate usage of more efficient DNN models and methods. These include modern approaches to constructing ensembles of regression models burnaev2013method () and special methods of DNN parameter initialization burnaev2016influence ().
We also consider to adapt implemented models to other core types using multifidelity regression modeling methods. These are examined in works zaytsev2017minimax (); zaytsev2017large (); burnaev2015surrogate (). Successful applications of such technique include one application in aerodynamics, examined in belyaev2014building ()
Another approach to be considered is to apply adaptive design of experiments, devised for industrial engineering problems, to both increase efficiency of sensitivity analysis and to improve utilization of computational budget when generating a training sample burnaev2017efficient (); burnaev2015adaptive ().
References
References
 (1) S. Chauhan, W. Rühaak, F. Khan, F. Enzmann, P. Mielke, M. Kersten, I. Sass, Processing of rock core microtomography images: Using seven different machine learning algorithms, Computers & Geosciences 86 (2016) 120–128.
 (2) D. A. Koroteev, A. N. Nadeev, I. V. Yakimchuk, I. A. Varfolomeev, Method for building a 3d model of a rock sample, uS Patent 9,558,588 (Jan. 31 2017).
 (3) H. Andrä, N. Combaret, J. Dvorkin, E. Glatt, J. Han, M. Kabel, Y. Keehm, F. Krzikalla, M. Lee, C. Madonna, et al., Digital rock physics benchmarksâpart i: Imaging and segmentation, Computers & Geosciences 50 (2013) 25–32.
 (4) M. J. Blunt, B. Bijeljic, H. Dong, O. Gharbi, S. Iglauer, P. Mostaghimi, A. Paluszny, C. Pentland, Porescale imaging and modelling, Advances in Water Resources 51 (2013) 197–216.
 (5) D. Koroteev, O. Dinariev, N. Evseev, D. Klemin, A. Nadeev, S. Safonov, O. Gurpinar, S. Berg, C. van Kruijsdijk, R. Armstrong, et al., Direct hydrodynamic simulation of multiphase flow in porous rock, Petrophysics 55 (04) (2014) 294–303.
 (6) H. Andrä, N. Combaret, J. Dvorkin, E. Glatt, J. Han, M. Kabel, Y. Keehm, F. Krzikalla, M. Lee, C. Madonna, et al., Digital rock physics benchmarksâpart ii: Computing effective properties, Computers & Geosciences 50 (2013) 33–43.
 (7) C. F. Berg, O. Lopez, H. Berland, Industrial applications of digital rock technology, Journal of Petroleum Science and Engineering 157 (2017) 131–147.
 (8) N. Evseev, O. Dinariev, M. Hurlimann, S. Safonov, et al., Coupling multiphase hydrodynamic and nmr porescale modeling for advanced characterization of saturated rocks, Petrophysics 56 (01) (2015) 32–44.
 (9) D. A. Koroteev, O. Dinariev, N. Evseev, D. V. Klemin, S. Safonov, O. M. Gurpinar, S. Berg, C. vanKruijsdijk, M. Myers, L. A. Hathon, et al., Application of digital rock technology for chemical eor screening, in: SPE Enhanced Oil Recovery Conference, Society of Petroleum Engineers, 2013.
 (10) D. Klemin, A. Nadeev, M. Ziauddin, et al., Digital rock technology for quantitative prediction of acid stimulation efficiency in carbonates, in: SPE Annual Technical Conference and Exhibition, Society of Petroleum Engineers, 2015.
 (11) M. Belyaev, E. Burnaev, E. Kapushev, M. Panov, P. Prikhodko, D. Vetrov, D. Yarotsky, Gtapprox: Surrogate modeling for industrial design, Advances in Engineering Software 102 (2016) 29–39.
 (12) S. Grihon, E. Burnaev, M. Belyaev, P. Prikhodko, Surrogate modeling of stability constraints for optimization of composite structures, in: SurrogateBased Modeling and Optimization, Springer, 2013, pp. 359–391.
 (13) K. Simonyan, A. Zisserman, Very deep convolutional networks for largescale image recognition, arXiv preprint arXiv:1409.1556.
 (14) T. Sochi, Porescale modeling of nonnewtonian flow in porous media, arXiv preprint arXiv:1011.0760.
 (15) A. Putz, J. Hinebaugh, M. Aghighi, H. Day, A. Bazylak, J. T. Gostick, Introducing openpnm: an open source pore network modeling software package, ECS Transactions 58 (1) (2013) 79–86.
 (16) I. Blasquez, J.F. Poiraudeau, Efficient processing of minkowski functionals on a 3d binary image using binary decision diagrams.
 (17) R. Guderlei, S. Klenk, J. Mayer, V. Schmidt, E. Spodarev, Algorithms for the computation of the minkowski functionals of deterministic and random polyconvex sets, Image and Vision Computing 25 (4) (2007) 464–474.
 (18) M. Berchtold, Modelling of random porous media using minkowskifunctionals, Ph.D. thesis (2007).
 (19) H.J. Vogel, U. Weller, S. Schlüter, Quantification of soil structure based on minkowski functions, Computers & Geosciences 36 (10) (2010) 1236–1245.
 (20) D. Yarotsky, Geometric features for voxelbased surface recognition, arXiv preprint arXiv:1701.04249.
 (21) N. Srisutthiyakorn*, Deeplearning methods for predicting permeability from 2d/3d binarysegmented images, in: SEG Technical Program Expanded Abstracts 2016, Society of Exploration Geophysicists, 2016, pp. 3042–3046.
 (22) dmlc/xgboost: Scalable, portable and distributed gradient boosting (gbdt, gbrt or gbm) library, https://github.com/dmlc/xgboost.
 (23) T. Chen, C. Guestrin, Xgboost: A scalable tree boosting system, in: Proceedings of the 22nd acm sigkdd international conference on knowledge discovery and data mining, ACM, 2016, pp. 785–794.
 (24) M. Matsugu, K. Mori, Y. Mitari, Y. Kaneda, Subject independent facial expression recognition with robust face detection using a convolutional neural network, Neural Networks 16 (56) (2003) 555–559.
 (25) Y. LeCun, Y. Bengio, G. Hinton, Deep learning, Nature 521 (7553) (2015) 436–444.
 (26) D. Maturana, S. Scherer, Voxnet: A 3d convolutional neural network for realtime object recognition, in: Intelligent Robots and Systems (IROS), 2015 IEEE/RSJ International Conference on, IEEE, 2015, pp. 922–928.
 (27) A. Notchenko, Y. Kapushev, E. Burnaev, Largescale shape retrieval with sparse 3d convolutional neural networks, in: International Conference on Analysis of Images, Social Networks and Texts, Springer, 2017, pp. 245–254.
 (28) E. Burnaev, P. Prikhodko, On a method for constructing ensembles of regression models, Automation and Remote Control 74 (10) (2013) 1630–1644.
 (29) E. Burnaev, P. Erofeev, The influence of parameter initialization on the training time and accuracy of a nonlinear regression model, Journal of Communications Technology and Electronics 61 (6) (2016) 646–660.
 (30) A. Zaytsev, E. Burnaev, Minimax approach to variable fidelity data interpolation, in: Artificial Intelligence and Statistics, 2017, pp. 652–661.
 (31) A. Zaytsev, E. Burnaev, Large scale variable fidelity surrogate modeling, Annals of Mathematics and Artificial Intelligence 81 (12) (2017) 167–186.
 (32) E. Burnaev, A. Zaytsev, Surrogate modeling of multifidelity data for large samples, Journal of Communications Technology and Electronics 60 (12) (2015) 1348–1355.
 (33) M. Belyaev, E. Burnaev, E. Kapushev, S. Alestra, M. Dormieux, A. Cavailles, D. Chaillot, E. Ferreira, Building data fusion surrogate models for spacecraft aerodynamic problems with incomplete factorial design of experiments, in: Advanced Materials Research, Vol. 1016, Trans Tech Publ, 2014, pp. 405–412.
 (34) E. Burnaev, I. Panin, B. Sudret, Efficient design of experiments for sensitivity analysis based on polynomial chaos expansions, Annals of Mathematics and Artificial Intelligence 81 (12) (2017) 187–207.
 (35) E. Burnaev, M. Panov, Adaptive design of experiments based on gaussian processes, in: International Symposium on Statistical Learning and Data Sciences, Springer, 2015, pp. 116–125.