Prediction of Facebook Post Metrics using Machine Learning
This is a draft version of a manuscript accepted at the XXI International Conference on Soft Computing and Measurement (SCM'2018), Saint Petersburg, Russia, May 23-25, 2018 (http://scm.eltech.ru/).
Abstract
In this short paper, we evaluate the performance of three well-known Machine Learning techniques for
predicting the impact of a post on Facebook.
Social media have a huge influence on social behaviour.
Therefore, an automatic model for predicting the impact of posts in social media can be useful to society.
In this article, we analyze the efficiency of three popular techniques for predicting the post impact: Support Vector Regression (SVR), Echo State Network (ESN), and Adaptive Neuro-Fuzzy Inference System (ANFIS).
The evaluation was done on a public and well-known benchmark dataset.
Keywords: Social Networks, Neuro-Fuzzy Inference System, Machine Learning, Echo State Networks, Neural Networks
I Introduction
Nowadays, social media influence collective behaviour and play a very important role in the diffusion of information. For this reason, an automatic method for predicting the impact of posts in social media can be useful in several areas, such as marketing, psychology, education, and security.
In this study, we analyze further Machine Learning tools for predicting a group of Facebook metrics. The goal is to predict automatically the impact of a post on Facebook. A previous study [1] explored the possibility of predicting some Facebook metrics with Support Vector Regression (SVR). Only a few of the metrics obtained good results when the evaluation was made using the mean absolute percentage error. In this study we predict three metrics: Comments, Shares, and Likes. These metrics are an important reference for the impact of a post. In [1], a measure named Total interactions of a post was defined as the sum of Comments, Shares, and Likes. In this short paper, we present results obtained by three popular learning methods: Support Vector Regression (SVR), which is based on kernels; Echo State Network (ESN), a technique based on the power of recurrent Neural Networks and linear regression; and the Adaptive Neuro-Fuzzy Inference System (ANFIS), a distributed parallel system with fuzzy rules.
The rest of this paper is organized as follows. Section II formalizes the problem, describes the techniques explored in this study and their properties, and reviews how they have been applied in other studies related to social media metrics. The data description, the results of the analysis, and the related discussion are presented in Section III. Conclusions and recommendations for future work are given in Section IV.
II Background on Machine Learning Methods
II-A Problem formalization
Let $\mathbf{x} \in \mathbb{R}^{n}$ be an $n$-dimensional input vector, and let $y \in \mathbb{R}$ be a unidimensional output variable. We are given a learning dataset composed of $K$ real input-output pairs $\{(\mathbf{x}^{(k)}, y^{(k)})\}_{k=1}^{K}$. The goal is to define a model for predicting the outcome variable based on the set of input features. Note that we have several output variables and we model each of them independently. The model is evaluated using a quantitative measure, called the cost function, that measures the quality of the learning model. In this article, we use the most popular metric when the output variable is real-valued, the Mean Squared Error (MSE):
$$\mathrm{MSE} = \frac{1}{K} \sum_{k=1}^{K} \left(\hat{y}^{(k)} - y^{(k)}\right)^{2}, \tag{1}$$
where $\hat{y}^{(k)}$ denotes the prediction for the input $\mathbf{x}^{(k)}$.
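As a concrete illustration of Eq. (1), the following minimal Python sketch computes the MSE for a handful of targets and predictions. The function name `mean_squared_error` and the numbers are illustrative assumptions, not values taken from the experiments.

```python
def mean_squared_error(targets, predictions):
    """Average squared difference between targets and predictions."""
    if len(targets) != len(predictions) or not targets:
        raise ValueError("need two non-empty sequences of equal length")
    return sum((p - t) ** 2 for t, p in zip(targets, predictions)) / len(targets)


# Hypothetical targets and model outputs (not from the dataset).
y_true = [0.10, 0.40, 0.35, 0.80]
y_pred = [0.12, 0.38, 0.30, 0.85]
mse = mean_squared_error(y_true, y_pred)
```

Since each output variable is modeled independently, such a function would be applied once per output (comments, shares, likes).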
II-B Adaptive Neuro-Fuzzy Inference System
ANFIS stands for Adaptive Neuro-Fuzzy Inference System, that is, an adaptive network that implements fuzzy inference. Proposed in the early nineties, ANFIS is one of the first variants of hybrid neuro-fuzzy networks: a feedforward neural network of a special type. The architecture of the neuro-fuzzy network is isomorphic to the fuzzy knowledge base. Neuro-fuzzy networks use differentiable implementations of triangular norms (product and probabilistic OR), as well as smooth membership functions. This makes it possible to tune neuro-fuzzy networks with fast neural network learning algorithms based on error backpropagation. The architecture and the role of each layer of the ANFIS network are described below. ANFIS implements the Sugeno fuzzy inference system in the form of a five-layer feedforward neural network. The system works as follows:

the first layer computes the terms (membership degrees) of the input variables;

the second layer computes the antecedents (premises) of the fuzzy rules;

the third layer normalizes the firing strengths of the rules;

the fourth layer computes the conclusions of the rules;

the fifth layer aggregates the results obtained from the different rules.
The network inputs are not counted as a separate layer. Figure 1 shows an example of an ANFIS network with two input variables and four fuzzy rules. In the example, three terms are used for the linguistic evaluation of the first input variable, and two terms are used for the second.
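The five layers listed above can be sketched as a forward pass in a few lines of Python. This is only an illustrative sketch under stated assumptions: Gaussian membership functions, the product as the triangular norm, and constant rule conclusions (a zero-order Sugeno system). All parameter values, the rule base, and the function names are hypothetical; in ANFIS these parameters would be tuned by training, which the sketch does not cover.

```python
import math


def gaussian_mf(x, center, sigma):
    """Layer 1: Gaussian membership degree of x in one linguistic term."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def anfis_forward(x1, x2, terms1, terms2, rules):
    """Forward pass of a zero-order Sugeno ANFIS with two inputs.

    terms1, terms2: lists of (center, sigma) pairs, one per linguistic term.
    rules: list of (term index for x1, term index for x2, constant conclusion).
    """
    # Layer 1: membership degrees of each input in each of its terms.
    mu1 = [gaussian_mf(x1, c, s) for c, s in terms1]
    mu2 = [gaussian_mf(x2, c, s) for c, s in terms2]
    # Layer 2: firing strength of each rule (product of its antecedents).
    strengths = [mu1[i] * mu2[j] for i, j, _ in rules]
    # Layer 3: normalized firing strengths.
    total = sum(strengths)
    normalized = [w / total for w in strengths]
    # Layer 4: each rule's conclusion weighted by its normalized strength.
    weighted = [w * c for w, (_, _, c) in zip(normalized, rules)]
    # Layer 5: aggregation of the rule outputs.
    return sum(weighted), normalized


# Hypothetical parameters: three terms for x1, two terms for x2, four rules.
terms_x1 = [(0.0, 0.3), (0.5, 0.3), (1.0, 0.3)]
terms_x2 = [(0.0, 0.5), (1.0, 0.5)]
rules = [(0, 0, 0.1), (1, 0, 0.4), (1, 1, 0.6), (2, 1, 0.9)]

output, weights = anfis_forward(0.5, 1.0, terms_x1, terms_x2, rules)
```

Because the normalized strengths sum to one, the output is a weighted average of the rule conclusions, dominated here by the rule whose terms are centred on the given inputs.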
II-C Support Vector Regression
Support Vector Regression (SVR) is a version of the well-known Support Vector Machine (SVM) [2]. The SVR model, proposed in [3], adapts the technique to the regression case. Similar to SVM, the SVR algorithm uses nonlinear mappings, termed kernels, to transform an input space into a high-dimensional feature space. It constructs a regression model using a subset of the training instances, termed support vectors [4]. The technique uses a global parameter $\varepsilon$: the goal is to learn a function $f(\mathbf{x})$ that deviates from the target by at most $\varepsilon$, which defines a band (tube) around the regression function. Another global parameter, denoted by $C$, controls the tradeoff between the prediction error and the flatness of the band around $f$. Finally, a test instance can be predicted using the following equation:
$$f(\mathbf{x}) = \sum_{i=1}^{m} \alpha_i K(\mathbf{x}_i, \mathbf{x}), \tag{2}$$
where $\mathbf{x}_i$, $i = 1, \ldots, m$, are the support vectors (points that fall outside or on the border of the tube), $\mathbf{x}$ is a test instance to be predicted, $\boldsymbol{\alpha} = (\alpha_1, \ldots, \alpha_m)$ is the vector of parameters determined by the SVR learning algorithm, and $K$ is a kernel function used to transform the input data points into a higher-dimensional feature space [2].
SVM has been applied in social network analysis for the classification of Chinese Facebook users into introverts and extroverts based on their Facebook wall posts [5]. SVM has also outperformed other classifiers in several comparative analyses in the context of social media [6]. SVR has been applied in [1] to the prediction of social media performance metrics.
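To make the kernel expansion of Eq. (2) concrete, the sketch below evaluates it in plain Python with a Gaussian RBF kernel. The support vectors and coefficients are hypothetical values; a trained SVR would supply them. The parameterization `exp(-gamma * ||u - v||^2)` is an assumption about how the kernel width enters, and the optional `bias` argument reflects that many implementations add a constant intercept to the sum.

```python
import math


def rbf_kernel(u, v, gamma):
    """Gaussian RBF kernel K(u, v) = exp(-gamma * ||u - v||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def svr_predict(x, support_vectors, alphas, gamma, bias=0.0):
    """Weighted sum of kernel evaluations between x and the support vectors."""
    return bias + sum(a * rbf_kernel(sv, x, gamma)
                      for a, sv in zip(alphas, support_vectors))


# Hypothetical support vectors and coefficients (not learned from data).
support_vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
alphas = [0.5, -0.2, 0.3]

prediction = svr_predict([0.0, 0.0], support_vectors, alphas, gamma=0.1)
```

Only the support vectors enter the sum, so the cost of a prediction grows with their number rather than with the size of the training set.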
II-D Echo State Network
Since the early 2000s, Reservoir Computing (RC) has gained prominence in the Neural Network community [7]. An RC model has a dynamical system, called the reservoir, which expands the input data into a high-dimensional space in a way similar to kernel methods. Next, the model uses a supervised learning tool to predict the model outputs. Most often, the RC model uses a simple linear regression from the feature space to the output space. RC models have been widely used in fields such as pattern classification [8], speech recognition [9, 10], speech quality estimation [11], time-series prediction [7, 8, 12, 13, 14], and Internet traffic prediction [15]. More formally, a reservoir is a temporal expansion function from an input space $\mathbb{R}^{n}$ into a larger space $\mathbb{R}^{N}$, with $n \ll N$. We denote by $\mathbf{x}(t)$ the input pattern of the model at any time $t$ and by $y(t)$ the target of the model at time $t$. The recurrences are modelled with a state vector $\mathbf{s}(t) \in \mathbb{R}^{N}$:
$$\mathbf{s}(t) = \phi\big(\mathbf{x}(t), \mathbf{s}(t-1)\big), \tag{3}$$
where $\phi$ is an expansion function.
We denote the model connections by $\mathbf{W}^{\mathrm{in}}$ and $\mathbf{W}^{\mathrm{r}}$, which are matrices of dimensions $N \times n$ (for the input weights) and $N \times N$ (for the reservoir weights). A characteristic of the model is that these weight matrices are fixed during the training algorithm [8]: they are randomly initialized and then kept fixed. To compute the ESN output corresponding to a new input (a column vector) $\mathbf{x}(t)$, the model first computes a new reservoir state given by
$$\mathbf{s}(t) = \tanh\big(\mathbf{W}^{\mathrm{in}} \mathbf{x}(t) + \mathbf{W}^{\mathrm{r}} \mathbf{s}(t-1)\big). \tag{4}$$
The readout weight vector $\mathbf{w}^{\mathrm{out}}$ is the only adjustable parameter in the ESN model; it is usually estimated using ridge regression between the state vector $\mathbf{s}(t)$ and the target.
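The two stages described above, a fixed random reservoir followed by a ridge-regression readout, can be sketched in plain Python as follows. This is a toy illustration under stated assumptions, not the experimental setup: the task (predicting the next value of a sine wave), the tanh activation, the weight range, the washout length, and the regularization value are all hypothetical choices. The reservoir weights are simply drawn small instead of being rescaled to a chosen spectral radius.

```python
import math
import random


def make_reservoir(n_inputs, n_units, scale, rng):
    """Random input and reservoir weights; both stay fixed after creation."""
    w_in = [[rng.uniform(-scale, scale) for _ in range(n_inputs)]
            for _ in range(n_units)]
    w_res = [[rng.uniform(-scale, scale) for _ in range(n_units)]
             for _ in range(n_units)]
    return w_in, w_res


def update_state(w_in, w_res, x, state):
    """One reservoir update: tanh(W_in x + W_res state)."""
    return [math.tanh(sum(wi * xi for wi, xi in zip(row_in, x)) +
                      sum(wr * sj for wr, sj in zip(row_res, state)))
            for row_in, row_res in zip(w_in, w_res)]


def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (m[r][n] - sum(m[r][c] * z[c] for c in range(r + 1, n))) / m[r][r]
    return z


def ridge_readout(states, targets, lam):
    """Ridge regression: solve (S^T S + lam I) w = S^T y for the readout w."""
    n = len(states[0])
    gram = [[sum(s[i] * s[j] for s in states) + (lam if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    rhs = [sum(s[i] * y for s, y in zip(states, targets)) for i in range(n)]
    return solve(gram, rhs)


rng = random.Random(0)
n_units = 25
w_in, w_res = make_reservoir(n_inputs=1, n_units=n_units, scale=0.2, rng=rng)

# Toy task: predict the next value of a sine wave from the reservoir state.
series = [math.sin(0.2 * t) for t in range(301)]
state = [0.0] * n_units
states, targets = [], []
for t in range(300):
    state = update_state(w_in, w_res, [series[t]], state)
    if t >= 50:  # discard the initial transient
        states.append(state)
        targets.append(series[t + 1])

# Only the readout is trained; w_in and w_res are never modified.
w_out = ridge_readout(states, targets, lam=1e-6)
predictions = [sum(w * s for w, s in zip(w_out, st)) for st in states]
train_mse = sum((p - y) ** 2 for p, y in zip(predictions, targets)) / len(targets)
```

Training reduces to one linear solve because the recurrent weights are never updated, which is the main practical appeal of the approach.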
III Results of Experiments
III-A Description of Data
The dataset contains 7 features known prior to post publication and 3 output variables that are used to measure the post impact. The output variables are comments, shares, and likes. The variable comments counts the number of comments that a specific post received. The variable shares refers to the number of times that the post has been shared with other users. The variable likes is also a counter: it counts the number of likes received by a post.
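Table 1: Descriptive statistics of the output variables.

| Variable | Mean | Median | Mode | St. deviation | Maximum | Minimum |
|---|---|---|---|---|---|---|
| Number of comments | 7 | 3 | 0 | 21 | 372 | 0 |
| Number of likes | 178 | 101 | 98 | 323 | 5172 | 0 |
| Number of shares | 27 | 19 | 13 | 43 | 790 | 0 |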
III-B Discussion of results
The ESN was trained with a reservoir size of 25 neurons and a fixed spectral radius (both parameters are important in the design of the model [7]). The ELM model has a similar number of hidden neurons. The SVR algorithm applied in this study uses a Gaussian radial basis function (RBF) kernel. The width of the Gaussian kernel was set to 0.1. The band defined around the regression function, $\varepsilon$, was set to 0.1, and the parameter $C$, which controls the tradeoff between the prediction error and the flatness of the band around the regression function, was set to 1000. The selection of these parameter values was based on [16]. ANFIS was trained with 400 data pairs. The ANFIS network has 7 inputs and 1 output. Gaussian membership functions were selected for the inputs, with 3 membership functions per input. A constant membership function type was selected for the output. The network was trained with a hybrid optimization method. Training the model parameters required 2 epochs over the training dataset.
Table 2 presents the MSE obtained by the three learning techniques applied in this study. The techniques newly applied to this task in this article (ESN and ANFIS) obtain better results than SVR. Although ANFIS seems to perform better for predicting the number of likes, the ESN model has better accuracy in the other two cases. Figure 2 illustrates our results, making it easy to compare the accuracy of the models.
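Table 2: MSE obtained by the three learning techniques.

| Method | Comments | Likes | Shares |
|---|---|---|---|
| SVR | 0.002998 | 0.003917 | 0.003496 |
| ESN | 0.002068 | 0.003163 | 0.001740 |
| ANFIS | 0.0022092 | 0.002121 | 0.001957 |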
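The comparison can be checked directly from the values in Table 2. The short Python sketch below, in which the variable names are illustrative, picks the method with the lowest MSE for each metric and computes the relative MSE reduction of ESN and ANFIS with respect to SVR.

```python
# MSE values copied from Table 2.
mse = {
    "SVR":   {"Comments": 0.002998,  "Likes": 0.003917, "Shares": 0.003496},
    "ESN":   {"Comments": 0.002068,  "Likes": 0.003163, "Shares": 0.001740},
    "ANFIS": {"Comments": 0.0022092, "Likes": 0.002121, "Shares": 0.001957},
}
metrics = ["Comments", "Likes", "Shares"]

# Method with the lowest MSE for each metric.
best = {m: min(mse, key=lambda method: mse[method][m]) for m in metrics}

# Relative MSE reduction with respect to SVR, in percent.
reduction_vs_svr = {
    method: {m: 100.0 * (mse["SVR"][m] - mse[method][m]) / mse["SVR"][m]
             for m in metrics}
    for method in ("ESN", "ANFIS")
}
```

ESN has the lowest MSE for comments and shares (roughly 31% and 50% below SVR), and ANFIS has the lowest MSE for likes (roughly 46% below SVR).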
IV Conclusions and Future Work
In this work, we have shown the prediction accuracy of three models: Support Vector Regression, Echo State Network, and Adaptive Neuro-Fuzzy Inference System. As a case study, we predicted the impact of a post in the social network Facebook. The dataset contains 7 features known prior to post publication and 3 output variables that are used to measure the post impact. The output variables are comments, shares, and likes. The techniques newly applied to this task in this article (ESN and ANFIS) obtain better results than SVR. Although ANFIS seems to perform better for predicting the number of likes, the ESN model has better accuracy in the other two cases. In future work, we will continue the experiments with the ANFIS model on other datasets in order to compare it with other techniques. We also plan to test deep neural networks on this task.
References
 [1] S. Moro, P. Rita, and B. Vala, “Predicting social media performance metrics and evaluation of the impact on brand building: A data mining approach,” Journal of Business Research, vol. 69, no. 9, pp. 3341–3351, 2016. [Online]. Available: http://dx.doi.org/10.1016/j.jbusres.2016.02.010
 [2] C. Cortes and V. Vapnik, “Support-Vector Networks,” Mach. Learn., vol. 20, no. 3, pp. 273–297, Sep. 1995. [Online]. Available: http://dx.doi.org/10.1023/A:1022627411411
 [3] H. Drucker, C. Burges, L. Kaufman, A. Smola, and V. Vapnik, “Support Vector Regression Machines,” Neural Information Processing Systems, vol. 1, pp. 155–161, 1996.
 [4] T. Hastie, R. Tibshirani, and J. Friedman, “The Elements of Data Mining,” The Mathematical Intelligencer, vol. 27, no. 2, pp. 83–85, 2009. [Online]. Available: http://www.springerlink.com/index/D7X7KX6772HQ2135.pdf
 [5] K. H. Peng, L. H. Liou, C. S. Chang, and D. S. Lee, “Predicting personality traits of Chinese users based on Facebook wall posts,” in 2015 24th Wireless and Optical Communication Conference, WOCC 2015, 2015, pp. 9–14.
 [6] R. Joshi and R. Tekchandani, “Comparative analysis of twitter data using supervised classifiers,” in 2016 International Conference on Inventive Computation Technologies (ICICT), vol. 3, Aug 2016, pp. 1–6.
 [7] M. Lukoševičius and H. Jaeger, “Reservoir Computing Approaches to Recurrent Neural Network Training,” Computer Science Review, vol. 3, pp. 127–149, 2009.
 [8] H. Jaeger, “The “echo state” approach to analysing and training recurrent neural networks,” German National Research Center for Information Technology, Tech. Rep. 148, 2001.
 [9] D. Verstraeten, B. Schrauwen, M. D’Haene, and D. Stroobandt, “An experimental unification of reservoir computing methods,” Neural Networks, vol. 20, no. 3, pp. 287–289, 2007.
 [10] W. Maass, T. Natschläger, and H. Markram, “Computational models for generic cortical microcircuits,” in Neuroscience Databases. A Practical Guide. Boston, USA: Kluwer Academic Publishers, June 2003, pp. 121–136.
 [11] S. Basterrech and G. Rubino, “Real-time Estimation of Speech Quality Through the Internet using Echo State Networks,” Journal of Advances in Computer Networks (JACN), vol. 1, no. 3, September 2013.
 [12] J. J. Steil, “Backpropagation-Decorrelation: online recurrent learning with O(N) complexity,” in Proceedings of IJCNN’04, vol. 1, 2004.
 [13] B. Schrauwen, M. Wardermann, D. Verstraeten, J. J. Steil, and D. Stroobandt, “Improving Reservoirs using Intrinsic Plasticity,” Neurocomputing, vol. 71, pp. 1159–1171, March 2007.
 [14] S. Basterrech, C. Fyfe, and G. Rubino, “Self-organizing Maps and Scale-invariant Maps in Echo State Networks,” in 11th International Conference on Intelligent Systems Design and Applications, ISDA 2011, Córdoba, Spain, November 22-24, 2011, pp. 94–99. [Online]. Available: http://dx.doi.org/10.1109/ISDA.2011.6121637
 [15] S. Basterrech and G. Rubino, “Echo State Queueing Network: A new Reservoir Computing Learning Tool,” in 10th IEEE Consumer Communications and Networking Conference, CCNC 2013, Las Vegas, NV, USA, January 11-14, 2013, pp. 118–123. [Online]. Available: http://dx.doi.org/10.1109/CCNC.2013.6488435
 [16] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay, “Scikit-learn: Machine learning in Python,” Journal of Machine Learning Research, vol. 12, pp. 2825–2830, 2011.