Importance of initial conditions in the polarization of complex networks
Abstract
Currently used models of opinion formation use random initial conditions. In reality, most people in a social network, except for a small fraction of the population, are initially either unaware of, or indifferent to, the disputed issue. To explore the consequences of such specific initial conditions, we study the polarization of social networks when conflicting ideas arise on two different seed nodes and then spread according to a majority rule. Using the configuration model and the stochastic block model as examples, we show that this framework leads to substantially different outcomes than those which employ random initial conditions. Moreover, the empirically observed splits in the karate and the dolphins’ networks naturally come out of this paradigm. Our work thus suggests that the existing opinion dynamics models should be reevaluated to incorporate the initial condition dependence.
PACS 89.75.Hc: Networks and genealogical trees
PACS 87.23.Ge: Dynamics of social systems
PACS 89.75.Fb: Structures and organization in complex systems
1 Introduction
When faced with a question with two conflicting answers, such as which candidate to vote for, or whether real-world networks are scale-free [1, 2], social networks often polarize by forming two opinion groups. Binary opinion models and their generalizations explain this emergence as a result of the ‘majority rule’, whereby the choice made by the majority of a node’s social acquaintances dictates its future choice [3, 4, 5, 6, 7, 8, 9, 10, 11].
Despite providing many fascinating insights into the dynamics of social systems, these models assume that initially every node is in one of the two opposite states. This assumption, however, is rather unrealistic since in most cases a sizable fraction of the population is initially either unaware of or indifferent to the disputed issue, and only a small number of people have a definite stand on it. Hence, it is essential to study how the ‘network polarizability’ is affected by such non-random initial conditions.
In this paper, we study a particular variation of this idea in which two chosen nodes, called seed nodes, are initially infected with opposite opinions while the remaining nodes are kept in a neutral state. The seed nodes then spread their opinions in the network such that every node changes its choice at each time step following a majority rule until a steady state is reached. Depending upon the selection of seed nodes, we observe either a complete consensus state, a highly polarized state, or a state with intermediate polarization. A similar model has been studied in [12] to explore the competition between two fixed nodes which never change their opinions. In contrast, in our model the seed nodes are not forced to retain their original opinions and are subjected to the same majority rule. Furthermore, since in real social networks one seldom knows the seed nodes on which the conflicting ideas are formed, we must be agnostic about the choice of the seed pair while talking about the polarizability of a given network. We thus propose to average the polarization over a large number of seed pairs to estimate the average polarizability of the network. The initial condition dependence in opinion dynamics has also been investigated for a few particular situations, such as studying the effectiveness of interventions to change adolescent smoking behavior [13], and in the case of ‘bounded-confidence’ opinion models in agent-based settings [14]. Interestingly, the effect of initial conditions is well studied in evolutionary game theory, in which a “prepared” initial spatial distribution of strategies (or the distribution in the underlying metric space for networks) has been shown to lead to different results than those obtained using a random initial distribution [15, 16, 17].
However, to the best of our knowledge, the ‘seed initial conditions’ (SICs) that we introduce here have not been considered in the literature and, as we demonstrate in the ensuing sections, are central to the understanding of observed polarizations in social systems.
It is important to note that although three-state opinion models similar to ours are well studied in the literature in the context of polarization, they still use random initial conditions (RICs), and hence our work differs fundamentally from them [18, 19, 20].
Along with the initial condition dependence, we also want to investigate how various structural features of a network contribute to the overall polarizability. Here, we focus on the two most commonly found structural traits in social networks: a fat-tailed degree distribution and community structure [21]. Since the asymptotic steady state in our model crucially depends on the choice of the seed nodes, any answer to such a question must be given in terms of an average taken over a large number of seed pairs. As we show in the following, communities tend to make the network more polarized, as expected, while the existence of high-degree nodes directs it towards consensus states. All the numerical simulations are carried out using graph-tool [22].
2 A simple model of opinion formation
Consider a very general model of opinion formation as given below.
(1) $\mathbf{x}(t+1) = \mathbf{F}\big(\mathbf{x}(t), A, \{\alpha\}\big)$
Here $\mathbf{x}(t)$ is the vector representing the opinion values on the nodes at time $t$, and $\mathbf{F}$ is a function that connects the states at times $t$ and $t+1$ and can have an arbitrary form. Also, $A$ is the adjacency matrix of the network, which in general is asymmetric, with any real numbers (including negative ones) as its entries to represent the strengths of the connections between the nodes. Moreover, these entries can, in general, be functions of time. The set $\{\alpha\}$ collectively represents the remaining parameters of the model.
However, our aim in this paper is not to model the opinion formation process with as much realism as possible. Rather, we want to see how the polarization dynamics is affected when the initial conditions are in the form of two seed nodes with opposite opinions instead of being completely random. To achieve this, we consider a simple version of the model given by eq.(1). The two opposite opinions can be conveniently modeled as $+1$ and $-1$, while the neutral view can be represented by $0$. Also, in this paper we focus on undirected networks and update the states according to the “majority rule” as follows:
(2) $x_i(t+1) = \mathrm{sgn}\Big(x_i(t) + \sum_j A_{ij}\, x_j(t)\Big)$
where $x_i(t)$ is the state of node $i$ at time $t$, $A_{ij}$ is an element of the adjacency matrix, and sgn is the sign function that takes the value $+1$ when its argument is positive, $-1$ when its argument is negative and $0$ otherwise. The states of all the nodes are updated simultaneously. We mention that a similar model has been given in [4], but in that case the initial state of the network was taken to be a random state. This model is also reminiscent of the label propagation method for the detection of communities in networks [23], except that apart from the states of the neighboring nodes, the present state of the node is also taken into account.
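The update rule described above can be sketched in a few lines of Python. This is a minimal illustration, assuming that each node weighs its own current state and each neighbour's state equally; the adjacency-list representation, the function names and the `max_steps` cap are illustrative choices and are not taken from the paper, whose simulations use graph-tool.

```python
def majority_step(adj, state):
    """One synchronous update: every node takes the sign of
    (its own state + the sum of its neighbours' states)."""
    new_state = []
    for i, neighbours in enumerate(adj):
        total = state[i] + sum(state[j] for j in neighbours)
        new_state.append((total > 0) - (total < 0))  # sign: +1, -1 or 0
    return new_state


def run_from_seeds(adj, seed_plus, seed_minus, max_steps=1000):
    """Seed initial conditions: two opposite seeds, all other nodes neutral.
    Iterates until the state stops changing (or max_steps is hit, an
    assumed safeguard against non-converging runs)."""
    state = [0] * len(adj)
    state[seed_plus] = 1
    state[seed_minus] = -1
    for _ in range(max_steps):
        new_state = majority_step(adj, state)
        if new_state == state:
            break
        state = new_state
    return state


# Path graph 0-1-2-3-4 with the two seeds at opposite ends.
path = [[1], [0, 2], [1, 3], [2, 4], [3]]
final = run_from_seeds(path, seed_plus=0, seed_minus=4)
print(final)  # [1, 1, 0, -1, -1]
```

On this small path the two opinions meet in the middle, and the central node stays neutral because its two neighbours cancel each other.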
2.1 Polarization index
The dynamics of eq.(2), after a few iterations, results in states which do not change further. Hence we restrict ourselves to only such steady states. The network is highly polarized if the numbers of nodes in at least two of the three states are roughly equal. Nonetheless, for all practical purposes, we can talk in terms of the $+1$ and $-1$ states since steady states with a large number of nodes with value $0$ are rare. Thus, to quantify the polarization, we define the following index:
(3) 
Here, $f_-$ is the fraction of nodes with negative states. It can be easily verified that for the consensus or unpolarized states, for which $f_- = 0$ or $f_- = 1$, the polarization index $P = 0$, whereas for highly polarized network states, $P \approx 1$.
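As an illustration of an index with the properties just described, the sketch below uses the form $1 - |1 - 2f|$, where $f$ is the fraction of nodes with negative states. This particular expression is an assumption chosen because it vanishes at consensus and reaches its maximum for an even split; it should not be read as the definition in eq.(3). The function name is likewise hypothetical.

```python
def polarization_index(state):
    """Assumed stand-in for the polarization index: 1 - |1 - 2 f|,
    with f the fraction of nodes in a negative state.
    Returns 0 at consensus and 1 for an even split."""
    f_minus = sum(1 for x in state if x < 0) / len(state)
    return 1.0 - abs(1.0 - 2.0 * f_minus)


print(polarization_index([1, 1, 1, 1]))    # consensus
print(polarization_index([1, 1, -1, -1]))  # even split
```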
3 Effect of initial conditions
3.1 Power-law configuration model
We start by studying the effect of a fat-tailed degree distribution on the network polarization when there are two seed nodes. As a representative model of such networks, we consider the configuration model [24] with a degree sequence drawn from a power-law distribution and explore the results of running the dynamics of eq.(2) on it. In the configuration model, the degree of each node is specified by assigning a certain number of half-edges to it, and these half-edges are then randomly connected to each other. Figure 1 shows two different states obtained for the power-law configuration model. As we can see, different choices of the seed nodes can lead to drastically different steady states.
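The half-edge construction described above can be sketched as follows. This is a simplified illustration and not the graph-tool routine used for the simulations: the degree cutoffs `k_min` and `k_max`, the parity fix, and the decision to discard self-loops and repeated edges (so that realized degrees can fall slightly below the specified ones) are all assumptions made here for brevity.

```python
import random


def power_law_degrees(n, gamma, k_min=1, k_max=None, rng=random):
    """Draw n degrees with probability proportional to k**(-gamma)
    on the range k_min..k_max (assumed cutoffs)."""
    k_max = k_max or n - 1
    ks = list(range(k_min, k_max + 1))
    weights = [k ** (-gamma) for k in ks]
    degrees = rng.choices(ks, weights=weights, k=n)
    if sum(degrees) % 2 == 1:  # half-edges must pair up, so the total must be even
        degrees[0] += 1
    return degrees


def configuration_model(degrees, rng=random):
    """Give node i degrees[i] half-edges, shuffle them, and join them in pairs.
    Self-loops and repeated edges are dropped (a simplifying assumption)."""
    stubs = [node for node, k in enumerate(degrees) for _ in range(k)]
    rng.shuffle(stubs)
    adj = [set() for _ in degrees]
    for u, v in zip(stubs[0::2], stubs[1::2]):
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return [sorted(neighbours) for neighbours in adj]


rng = random.Random(0)
degrees = power_law_degrees(200, gamma=2.5, rng=rng)
adj = configuration_model(degrees, rng=rng)
```

A smaller exponent `gamma` puts more weight on large degrees and therefore produces more hubs.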
How does the abundance of hubs affect the emergence of high-polarization states? In the power-law configuration model, this abundance can be tuned by varying the scaling index of the degree distribution. In fig. 2, we show histograms of polarization values for different power-law scaling indices with seed initial conditions (SICs henceforth). In the same figure, we show the histograms obtained starting with random initial conditions (RICs henceforth), in which each node is in one of the two states $+1$ or $-1$ with equal probability. In both cases, the number of high-degree nodes decreases as the scaling index is increased, and the emergence of high-polarization states becomes more probable. This is understandable: if the network contains several high-degree nodes, they dominate the network with their opinion, reducing the chances of observing polarized states. However, for a given value of the scaling index, the corresponding histograms in the two cases are quite different from each other. In particular, the average value of the polarization is consistently higher with RICs.
3.2 Stochastic block model (SBM)
Apart from heavy-tailed degree distributions, another common structural feature seen in almost all social networks is the existence of community structure [26, 27, 28, 29]. A community in a complex network is defined as a group of nodes that have the same connection probabilities to the other nodes in the network. In particular, social networks exhibit assortative communities, in which the connection probabilities inside the groups are higher than the probabilities between the groups. We want to see how the existence of assortative communities affects the spread of conflicting opinions that originate on two different nodes.
A straightforward way to study this question is to construct graphs with planted or “handmade” communities and then simulate the dynamics on them. There exist several different random graph models which contain the idea of communities or blocks in their description [30, 27, 31, 32]. Arguably, the simplest of these is the famous ‘stochastic block model’, or SBM for brevity [31]. In the traditional SBM, we start with $N$ nodes and group them into a number of blocks or modules. Then every pair of nodes $(i, j)$ is connected with a probability $p_{rs}$, where $r$ and $s$ denote the blocks to which the nodes $i$ and $j$ belong, respectively. To generate the highly assortative community structure that we are interested in, we make the probabilities inside the groups significantly higher than the probabilities between the groups.
The planted partition model is a special case of the stochastic block model described above. To construct it, we set all the inter-block probabilities to the same value $p_{\mathrm{out}}$, and all the probabilities inside the groups to the same value $p_{\mathrm{in}}$. For a strongly assortative community structure, we set $p_{\mathrm{in}} \gg p_{\mathrm{out}}$.
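The planted partition construction can be sketched directly from this description. The function name, the argument names `p_in` and `p_out`, and the adjacency-list output are illustrative choices; the loop over all node pairs is quadratic in the number of nodes and is meant only to make the definition concrete.

```python
import random


def planted_partition(block_sizes, p_in, p_out, rng=random):
    """Connect each pair of nodes independently: with probability p_in if both
    nodes are in the same block, and with probability p_out otherwise."""
    block_of = [b for b, size in enumerate(block_sizes) for _ in range(size)]
    n = len(block_of)
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            p = p_in if block_of[i] == block_of[j] else p_out
            if rng.random() < p:
                adj[i].append(j)
                adj[j].append(i)
    return adj, block_of


# Extreme assortativity: every within-block edge, no between-block edge.
adj, block_of = planted_partition([3, 2], p_in=1.0, p_out=0.0)
print(adj)  # [[1, 2], [0, 2], [0, 1], [4], [3]]
```

With `p_in = 1` and `p_out = 0` the graph splits into disconnected cliques, which is the limiting case of the strongly assortative regime.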
Consider the SBM with a Poisson degree distribution. In such a network, the degrees of all the nodes are concentrated around the average value. Thus, unlike the power-law configuration model, there are no ‘hubs’ which can dominate the network. The model is therefore ideal if we want to study the effects of community structure alone.
In fig. 3 (left) we show some of the steady states for the Poisson SBM with one large and two small blocks. Interestingly, the nodes in the same community, despite being densely connected to each other, can have opposite states asymptotically as the picture shows. In the same figure, we also show the polarization histograms with and without random initial conditions obtained using random realizations for a much larger network () with communities. The relative community sizes are and distributions for and are shown with fixed. The average degree in all the cases. Similar to the case of configuration model, the random initial conditions give different results.
We now wish to see how the polarization is affected by the size of the network. We start with the power-law configuration model. In fig. 4, we show variations of the average polarization with the size of the network. As we can see, the variations are starkly different for RICs and SICs, especially for larger values of the scaling index, for which random initial conditions predict a substantial increase in the average polarization whereas the seed initial conditions predict the exact opposite. For the Poisson SBM, we see qualitatively similar results, as shown in fig. 5.
4 Degree-corrected SBM
The configuration model and the stochastic block model encode two different topological features of complex networks. The former controls only the degree distribution while the latter allows tuning the nature of the community structure. Thus, to be more realistic, we need a model in which both aspects can be accommodated. In particular, we want to capture a crucial aspect of many real-world networks, that of degree heterogeneity, along with the presence of communities. In several real networks, the degrees of the nodes take values in a vast range. In other words, the degree distributions of these networks are not peaked around their average values. In contrast, the simple SBM produces graphs with a Poisson degree distribution, so that most of the nodes have degrees around the average of the distribution. Karrer and Newman have extended the traditional stochastic block model to incorporate arbitrary degree distributions [27]; the extension is known as the ‘degree-corrected SBM’. In their model, along with the connection probabilities between the groups, each node $i$ is endowed with a parameter $\theta_i$ that is proportional to its specified degree. The degree values can be drawn from any distribution, and the pair $(i, j)$ is then connected with a probability proportional to $\theta_i \theta_j p_{rs}$, where $p_{rs}$ is the connection probability between the blocks $r$ and $s$ to which $i$ and $j$ belong.
In the most general setting, one can vary all the elements of the probability matrix and see how the resultant network states change. However, since our focus here is only on understanding the effect of communities on the asymptotic states, we simplify the situation by setting $p_{rs} = p_{\mathrm{in}}$ when $r = s$ and $p_{rs} = p_{\mathrm{out}}$ when $r \neq s$. This is known as the degree-corrected planted partition model [33]. As mentioned earlier, many real networks possess fat-tailed degree distributions. As a consequence, in these networks there exist a few nodes with exceedingly large degrees compared to the average degree of the network. One such distribution is the power-law distribution $P(k) \propto k^{-\gamma}$, where $\gamma$ is the scaling index, and we will use it as a “proxy” for representing fat-tailed distributions.
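A rough sketch of the degree-corrected planted partition idea is given below, assuming that a pair of nodes is joined with probability equal to the product of their degree parameters and the relevant block probability, clipped at 1. Using a single Bernoulli draw per pair and clipping the probability are simplifications made here; the names `theta`, `p_in` and `p_out` are illustrative, and the normalization of the degree parameters is left to the caller.

```python
import random


def degree_corrected_planted_partition(theta, block_of, p_in, p_out, rng=random):
    """Join nodes i and j with probability min(1, theta[i] * theta[j] * p),
    where p is p_in for nodes in the same block and p_out otherwise.
    Larger theta values give a node a proportionally larger expected degree."""
    n = len(theta)
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            base = p_in if block_of[i] == block_of[j] else p_out
            p = min(1.0, theta[i] * theta[j] * base)
            if rng.random() < p:
                adj[i].append(j)
                adj[j].append(i)
    return adj


# A node with theta = 0 stays isolated even inside a fully connected block.
adj = degree_corrected_planted_partition(
    theta=[1.0, 1.0, 0.0, 1.0], block_of=[0, 0, 0, 1], p_in=1.0, p_out=0.0
)
print(adj)  # [[1], [0], [], []]
```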
The results of running the dynamics on networks generated using the degree-corrected SBM with a power-law degree distribution and three blocks, with a fraction of nodes in each block equal to $1/3$, are summarized in the form of a heatmap of polarization in fig. 6. As the figure shows, increasing the scaling index (which amounts to decreasing the abundance of hubs) increases the polarization, while weakening the community structure decreases the polarization. In other words, the degree distribution and the community structure act as controlling features for polarization. The upper left corner is a region of close-to-zero polarization and corresponds to a high abundance of hubs and very weak community structure. On the other hand, high polarization exists in the lower right corner because of a lower abundance of hubs and strong community structure. All the results in this plot correspond to SICs.
5 Analysis of the stability of the final state
So far we have not considered the question of the stability of the asymptotic opinion communities. We want to investigate the effect of a small perturbation to the asymptotic state of the network. This can be done using linear stability analysis. The function sgn in eq.(2) is discontinuous at $0$ and hence is difficult to handle for this purpose. Thus, we replace it with a smooth function which essentially captures the idea of a jump at $0$. With this modification, our model is given in eq.(4).
(4) 
As can be easily seen, this model is equivalent to the original model in eq.(2) in the limit . We have also numerically verified that the two models agree almost perfectly with .
(5)  
This can be written as a matrix equation:
(6) 
where is diagonal matrix with:
(7) 
A given state will be stable if all the eigenvalues of the Jacobian matrix have absolute values less than $1$.
Using eq.(4), we write the fixed point equation as follows:
(8) 
Consider the asymptotic states for which or only. For these states, L.H.S. is finite and nonzero and hence, when , . Therefore, using eq.(5) and eq.(6), one can see that the operator norm as . The operator norm of a square matrix is the maximum among the absolute values of its eigenvalues, and so directly relates to the stability of a given state. Noting that the eigenvalues of do not change with , we see that implies that as . Thus, the maximum value of the modulus of eigenvalues of could be made smaller than by choosing a large enough . This shows that all the asymptotic states for which or are stable.
On the other hand, if we consider the states having one or more zero values on the nodes, the above argument would not hold. In that case, we consider a general element of the Jacobian matrix eq.(5). Note that, for all . Selecting to be a node with a zero value, we obtain as . Thus, if we examine, the Jacobian matrix under the supremum norm, then we find that the norm of the Jacobian diverges as . For finite dimensional matrices, the supremum norm is equivalent to the operator norm (in fact to any other norm). So, the operator norm, and consequently, the eigenvalue with largest absolute value diverges with . Thus, such states would be unstable. This is reasonable since a small perturbation would push the nodes with to either or .
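As a concrete instance of the argument above, suppose that the smooth replacement for the sign function is $\tanh$ with a steepness parameter $\beta$. Both this choice of function and the symbols $\beta$, $h_i$ and $J$ are assumptions made here for illustration; any smooth sigmoid that sharpens into the sign function would lead to the same two limits.

```latex
\[
\begin{aligned}
x_i(t+1) &= \tanh\big(\beta\, h_i(t)\big), \qquad h_i(t) = x_i(t) + \sum_j A_{ij}\, x_j(t), \\
J_{ij} &= \frac{\partial x_i(t+1)}{\partial x_j(t)}
        = \beta\, \mathrm{sech}^2\big(\beta h_i\big)\,\big(\delta_{ij} + A_{ij}\big), \\
h_i \neq 0 &: \quad \beta\, \mathrm{sech}^2(\beta h_i) \le 4\beta\, e^{-2\beta |h_i|} \to 0
        \quad (\beta \to \infty), \\
h_i = 0 &: \quad J_{ii} = \beta\,(1 + A_{ii}) = \beta \to \infty
        \quad (\beta \to \infty,\ A_{ii} = 0).
\end{aligned}
\]
```

In a state where every node is $+1$ or $-1$ and is unchanged by the sign rule, the sign of $h_i$ equals the node state, so $h_i \neq 0$ for every $i$ and all entries of $J$ vanish in the limit. A node that stays at $0$ has $h_i = 0$, which gives the diverging diagonal entry in the last line.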
6 Empirical networks
We now apply the framework discussed so far to two well-known empirical networks. The first one is the Zachary karate club network, which is known to have become polarized and split into two parts. The underlying community structure is believed to be the major reason behind the split [34, 35]. The ideas presented here suggest an alternative possibility: the observed polarization could be a repercussion of SICs, since the point of the dispute was the raising of the club’s fees, a problem with a binary choice. Similar splitting has also happened in a social network of bottlenose dolphins from New Zealand [28, 36]. Though it is somewhat speculative to think about a dispute in the case of dolphins, we argue that since dolphins can form friendships, the possibility of a dispute arising among them should not be ruled out. We apply our model with seed-type initial conditions to these networks and obtain the average polarization by averaging over all possible seed pairs in these networks.
Figure 7 shows the polarization distributions for these two networks with SICs and RICs. In this case, the RICs give rise to lower average polarization than the corresponding SICs. Paradoxically, this is precisely opposite to the results of the previous sections, in which RICs always produced higher polarization. We argue that this is a small-size effect. In fact, we observe a similar result for the Poisson SBM when the size of the network is small ().
The present formalism, in fact, provides a much deeper insight into social networks, as we discuss now. The idea is to average over a large number of asymptotic states to find out whether the absolute difference between the values on the nodes directly connected by an edge tends to be larger or smaller on average. For an edge that connects nodes $i$ and $j$, this average difference is:
(9) $\Delta_{ij} = \big\langle\, |x_i - x_j| \,\big\rangle$
Here $x_i$ is the asymptotic state of node $i$, and the angled brackets indicate an average over a large number of asymptotic states. We then ask whether the difference for a given edge obtained using SICs ($\Delta_{ij}^{\mathrm{SIC}}$) is approximately equal to the difference obtained using RICs ($\Delta_{ij}^{\mathrm{RIC}}$). At first thought, one may conclude that the two differences would be equal since the same network structure decides them. To check this, we make a scatter plot of the SICs difference and the RICs difference, as shown in fig. 8 for the karate network and the dolphin network. Indeed, the two are highly correlated, as expected. Nonetheless, it is clear from the plots that the points do not tend to fall on the line $\Delta_{ij}^{\mathrm{SIC}} = \Delta_{ij}^{\mathrm{RIC}}$. In other words, an edge for which the RICs difference is small need not have a small SICs difference, and the edges with a significantly high difference would, in general, differ depending upon whether SICs or RICs are used. This result has significant implications for the polarization of networks, as explained below.
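The edge-wise average described above can be sketched for seed initial conditions as follows. The sketch assumes the synchronous sign update on each node's own state plus its neighbours' states, and averages over every unordered seed pair with the first node seeded at $+1$; swapping the two seeds only flips the sign of every state and leaves the absolute differences unchanged. The function names and the `max_steps` cap are illustrative.

```python
from itertools import combinations


def steady_state(adj, state, max_steps=1000):
    """Iterate the synchronous sign update until the state stops changing."""
    for _ in range(max_steps):
        new = []
        for i, neighbours in enumerate(adj):
            total = state[i] + sum(state[j] for j in neighbours)
            new.append((total > 0) - (total < 0))
        if new == state:
            break
        state = new
    return state


def average_edge_difference(adj):
    """Average |x_i - x_j| over the steady states reached from all seed pairs."""
    n = len(adj)
    edges = [(i, j) for i in range(n) for j in adj[i] if i < j]
    totals = dict.fromkeys(edges, 0.0)
    runs = 0
    for plus, minus in combinations(range(n), 2):
        state = [0] * n
        state[plus], state[minus] = 1, -1
        final = steady_state(adj, state)
        for i, j in edges:
            totals[(i, j)] += abs(final[i] - final[j])
        runs += 1
    return {edge: total / runs for edge, total in totals.items()}


# Three-node path 0-1-2: only the seed pair (0, 2) leaves a non-zero difference.
diff = average_edge_difference([[1], [0, 2], [1]])
```

For the three-node path, adjacent seeds end in consensus and the end-to-end seed pair freezes with the middle node neutral, so each edge averages to 1/3.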
In fig. 9, we show the karate network and the dolphins’ network with the edges color-graded according to the average difference across them; darker edges have higher differences. On the left side of the figure, the differences are obtained using SICs whereas, on the right, they are obtained using RICs. Immediately, we see that SICs result in darker edges in the “middle” of these networks, whereas the darker edges are scattered when RICs are used. Most of these ‘high SICs difference edges’ are the edges which broke during the observed splitting of these networks. The effect is particularly pronounced in the case of the dolphins, where removal of these edges in fact breaks the network completely into two parts, and the predicted split is very similar to the observed one. This indicates that in both networks, the splitting could be a direct effect of SICs and that the initial conditions are as important as the network structure for predicting the polarization in social systems.
7 Conclusions
In this paper, we showed that the results of opinion dynamics models on networks could be unusually sensitive to the initial conditions. A possible reason behind this difference could be that the set of allowed initial conditions is now severely restricted and does not represent a truly random sample of the whole phase space. Moreover, we argued that for the polarization of complex networks, random initial conditions or RICs, in which each node is initially in one of the two opposite states with equal probability, are not realistic, and the seed initial conditions or SICs should be preferred. We also showed that SICs allow us to predict empirically observed polarization of networks like the karate club and the social network of dolphins with high accuracy.
Some of the obvious generalizations of the work presented here include using more sophisticated models that incorporate stochasticity and using weighted and directed graphs. Also, it would be interesting to see how the results are affected by varying the community structure in several ways, such as the number of blocks, their sizes, and their overlaps. We anticipate that this initial condition dependence will be explored in more depth in future opinion dynamics studies to produce more realistic predictions. In particular, we expect that the SICs formalism will be applied to more realistic models of opinion dynamics to see whether they produce results that have not yet been explained in empirical social networks.
Acknowledgements.
S.M.S. thanks Dr. Mihir Arjunwadkar for many helpful discussions. S.M.S. acknowledges the funding from the National Post Doctoral Fellowship (NPDF) of DST-SERB, India, File No: PDF/2016/002672.
References
[1] Broido A. D. and Clauset A., arXiv:1801.03400 (2018).
[2] Barabási A.-L., Love is all you need (2018).
[3] Castellano C., Fortunato S. and Loreto V., Rev. Modern Phys., 81 (2009) 591.
[4] Shao J., Havlin S. and Stanley H. E., Phys. Rev. Lett., 103 (2009) 018701.
[5] Biswas S. and Sen P., Phys. Rev. E, 80 (2009) 027101.
[6] Gleeson J. P., Phys. Rev. X, 3 (2013) 021004.
[7] Dandekar P., Goel A. and Lee D. T., PNAS, 110 (2013) 5791.
[8] Hindes J. and Schwartz I. B., Sci. Reps., 7 (2017).
[9] Medina-Guevara M. G., Macías-Díaz J. E., Gallegos A. and Vargas-Rodríguez H., International Journal of Modern Physics C, 28 (2017) 1750058.
[10] Galam S. and Martins A. C., Phys. Rev. E, 91 (2015) 012108.
[11] Amato R., Kouvaris N. E., San Miguel M. and Díaz-Guilera A., New Journal of Physics, 19 (2017) 123019.
[12] Zhao J., Liu Q. and Wang X., Scientific Reports, 4 (2014) 5858.
[13] Adams J. and Schaefer D. R., Journal of Health and Social Behavior, 57 (2016) 22.
[14] Carro A., Toral R. and San Miguel M., Journal of Statistical Physics, 151 (2013) 131.
[15] Perc M., Jordan J. J., Rand D. G., Wang Z., Boccaletti S. and Szolnoki A., Physics Reports, 687 (2017) 1.
[16] Kleineberg K.-K., Nature Communications, 8 (2017) 1888.
[17] Amato R., Díaz-Guilera A. and Kleineberg K.-K., Scientific Reports, 7 (2017) 7087.
[18] Mobilia M., EPL (Europhysics Letters), 95 (2011) 50002.
[19] Crokidakis N., Journal of Statistical Mechanics: Theory and Experiment, 2013 (2013) P07008.
[20] Balenzuela P., Pinasco J. P. and Semeshenko V., PLoS ONE, 10 (2015) e0139572.
[21] Fortunato S. and Hric D., Physics Reports, 659 (2016) 1.
[22] Peixoto T. P., figshare (2014). http://figshare.com/articles/graph_tool/1164194
[23] Raghavan U. N., Albert R. and Kumara S., Phys. Rev. E, 76 (2007) 036106.
[24] Newman M. E. J., Networks: An Introduction (Oxford University Press) 2010.
[25] Freedman D. and Diaconis P., Probability Theory and Related Fields, 57 (1981) 453.
[26] Girvan M. and Newman M. E. J., PNAS, 99 (2002) 7821.
[27] Karrer B. and Newman M. E., Phys. Rev. E, 83 (2011) 016107.
[28] Lusseau D., Schneider K., Boisseau O. J., Haase P., Slooten E. and Dawson S. M., Behavioral Ecology and Sociobiology, 54 (2003) 396.
[29] Peixoto T. P., Phys. Rev. X, 4 (2014) 011047.
[30] Rosvall M. and Bergstrom C. T., PNAS, 105 (2008) 1118.
[31] Peixoto T. P., Phys. Rev. E, 85 (2012) 056122.
[32] Peixoto T. P., Phys. Rev. Lett., 110 (2013) 148701.
[33] Newman M. E. J., Phys. Rev. E, 94 (2016) 052315.
[34] Newman M. E. J., PNAS, 103 (2006) 8577.
[35] Riolo M. A., Cantwell G. T., Reinert G. and Newman M., Phys. Rev. E, 96 (2017) 032310.
[36] Newman M. E. J., Phys. Rev. E, 88 (2013) 042822.