Information-thermodynamic characterization of stochastic Boolean networks

Shun Otsubo and Takahiro Sagawa
Department of Applied Physics, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
July 31, 2019
Abstract

Recent progress in experimental techniques has enabled us to quantitatively study the stochastic and flexible behavior of biological systems. For example, gene regulatory networks perform stochastic information processing, and their functionalities have been extensively studied. In gene regulatory networks, there are specific subgraphs called network motifs that occur at frequencies much higher than those found in randomized networks. Further understanding of the design principles of such networks is highly desirable. In a different context, information thermodynamics has been developed as a theoretical framework that generalizes non-equilibrium thermodynamics to stochastically fluctuating systems with information. Here we systematically characterize gene regulatory networks on the basis of information thermodynamics. We model three-node gene regulatory patterns, which receive one or two input signals that carry external information, by a stochastic Boolean model. For the case of a single input, we find that all the three-node patterns are classified into four types by using information-thermodynamic quantities such as dissipation and mutual information, and we reveal to which type each network motif belongs. Next, we consider the case of two inputs, evaluate the capacity of logical operation of the three-node patterns by using tripartite mutual information, and discuss why patterns with fewer edges are preferred in natural selection. This result may also explain the difference in occurrence frequencies among different types of feedforward-loop network motifs, and therefore suggests a guiding principle of network formation.

I Introduction

With the recent development of single-cell technologies, quantitative biology has attracted much attention Taniguchi2010 (), and one of its active topics is the study of network motifs in gene regulatory networks Shen-Orr2002 (); Milo2002 (); Lee2002 (); Alon2007 (). Complex gene regulatory networks contain specific subgraphs called network motifs that occur much more frequently than in random networks. Many studies have investigated the function of network motifs in order to reveal why such specific patterns are preferred over the others in natural selection Shen-Orr2002 (); Mangan2003 (); Alon2006 (); Alon2007 (); Mangan2006 (); Kittisopikul2010 (); Macneil2011 (); DeRonde2012 (); Albergante2014 (). We list three-node network motifs in Fig. 1. Interestingly, they are commonly found in gene regulatory networks across species, including E. coli, yeast, mouse, and human Lee2002 (); Alon2006 (); Alon2007 (); Swiers2006 (); Gerstein2012 (). This suggests that there is a guiding principle for network formation, although it is not yet clearly understood which properties distinguish network motifs from the other patterns.

Figure 1: Gene regulatory network motifs: We list three-node network motifs Alon2006 (); Alon2007 (). A node represents a gene, and an arrow represents regulation between genes. Black arrows represent activation, and white ones represent inhibition. There are eight types of feedforward loops, which are categorized into two groups according to whether the signs of the two regulation paths from the top node to the bottom node are the same or not. It is known that coherent feedforward loops can act as low-pass filters and incoherent ones show adaptive behavior. In the setup of Fig. 2, the patterns from C1 to C4 become equivalent to one another, and so do those from I1 to I4. On the other hand, in the setup of Fig. 5, C1 and C2, C3 and C4, I1 and I2, and I3 and I4 become equivalent, respectively. Therefore, in this study, we mainly focus on the patterns from M5-1-1 to M5-1-4. Patterns with positive feedback loops are known to keep their states due to their bistability. They often appear in the form of M6-1-1 and M11-1-1, which are called toggle switches. In this study, however, we mention toggle switches only briefly, because little is known about edge variations of toggle switches and uncertainty remains about their behavior in the stochastic Boolean model (as explained in the main text).

In statistical physics, information thermodynamics has been developed in the last decade on the basis of stochastic thermodynamics Jarzynski2000 (); Sekimoto2010 (); Seifert2012 (), which clarifies the relation between thermodynamic quantities and information Sagawa2010 (); Toyabe2010 (); Sagawa2012 (); Ito2013 (); Horowitz2014 (); Horowitz2014a (); Parrondo2015 (); Shiraishi2015 (). Since biological systems, especially cells, can be regarded as information processing systems that operate by consuming energy sources such as ATP molecules, information thermodynamics can be applied to biological systems. In fact, several studies have revealed information-thermodynamic structure in, for example, sensory adaptation of E. coli chemotaxis Sartori2014 (); Barato2014 (); Ito2015 (); Hartich2016 (); Ouldridge2017a (); Matsumoto2017 (). However, biochemical reaction networks including gene regulatory networks are yet to be further investigated.
Here we systematically characterize gene regulatory network motifs on the basis of information thermodynamics. We calculate information-thermodynamic quantities such as the efficiency of information propagation and energetic dissipation to characterize all the possible three-node patterns (704 patterns), by using a stochastic Boolean model. First, we consider the case where there is a single input signal to a three-node pattern, and reveal that all the patterns can be classified into four types. These four types are characterized as dissipative, static, informative, and adaptive. Coherent feedforward-loop network motifs belong to the informative type, and incoherent ones to the adaptive type. The static-type patterns have positive feedback loops. No network motifs are categorized into the dissipative type.
We next consider the case where there are two input signals to a three-node pattern. We evaluate the capacity of logical operation on these two inputs by using tripartite mutual information, and reveal that feedforward loops outperform the others. This result clearly accounts for network patterns that occur frequently in gene regulatory networks. It could also be a reasonable explanation for the fundamental difference of the occurrence frequencies among different types of feedforward loops Mangan2006 (); Kittisopikul2010 ().
These results indicate that gene regulatory networks are efficient in terms of information propagation and thermodynamics. The key features of our work are the following. First, information thermodynamics enables us to quantify the dissipation of a single three-node pattern embedded in a large-scale network, by taking information flow into account. Second, we found that tripartite mutual information is useful to characterize the capacity of logical operation performed by three-node patterns. Our approach with these features can be applied to a wide range of networks, including both artificial and biochemical ones.
This paper is organized as follows. In Sec. II, we introduce the stochastic Boolean model and describe the setup of this study. We also review the fundamental concepts in information thermodynamics. In Sec. III, we show our main results. In the first part, we consider the case of a single input and classify all the three-node patterns on the basis of information-thermodynamic quantities. In the second part, we consider the case of two inputs and calculate tripartite mutual information to evaluate the capacity of logical operation. In Sec. IV, we make concluding remarks. Appendix A shows an example of the master equation of the stochastic Boolean model. In Appendix B, we discuss the equivalence of network patterns and illustrate the relation between OR-logic patterns and AND-logic ones. Appendix C describes the three-node patterns that we exclude from the calculation. Appendix D shows dynamics of some typical patterns. In Appendix E, we remark on the details of the fitting in Fig. 7. We close this paper by discussing the parameter dependence of the main results in Appendix F.

II Setup

In this section, we formulate the setup of this study with a stochastic Boolean model. We also briefly review information thermodynamics.

II.1 Stochastic Boolean model

Figure 2: Stochastic Boolean model: A three-node pattern has a single input from the node. Typical dynamics of these nodes are schematically shown. Since we can observe the on-off behavior of genes and the fluctuation around it, we model the dynamics by the stochastic Boolean model.

We first discuss the general properties of gene regulation. In this study, we restrict ourselves to transcriptional regulation, while post-transcriptional regulation is also important Filipowicz2008 (); Herranz2010 (); Liu2015 (). A gene regulatory network is represented by a graph, whose nodes and directed edges represent genes and their regulation, respectively. In this study, we focus on three-node patterns which are elementary components of gene regulatory networks (see Fig. 1).
A three-node pattern receives one or two input signals as shown in Fig. 2 and Fig. 5, and passes information to the next nodes by processing the signals. In the single-input case a pattern just propagates information, while in the two-input case it performs logical operation.
There are two types of regulation: activation and inhibition. Therefore, there are 704 patterns in total if we exclude auto-regulation.
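The count of 704 three-node patterns quoted in the Introduction can be reproduced by a short enumeration. The sketch below rests on an assumption that the text does not state explicitly: each of the six directed edges between distinct nodes is absent, activating, or inhibiting, and only patterns in which all three nodes are connected (ignoring edge direction) are counted. The function names are illustrative.

```python
from itertools import product

NODES = (0, 1, 2)
# The six possible directed edges between distinct nodes (no auto-regulation).
EDGES = [(i, j) for i in NODES for j in NODES if i != j]

def is_weakly_connected(signs):
    """True if all nodes are reachable from node 0 when edge direction is ignored."""
    neighbours = {n: set() for n in NODES}
    for (i, j), sign in zip(EDGES, signs):
        if sign != 0:
            neighbours[i].add(j)
            neighbours[j].add(i)
    seen, stack = {0}, [0]
    while stack:
        for m in neighbours[stack.pop()]:
            if m not in seen:
                seen.add(m)
                stack.append(m)
    return len(seen) == len(NODES)

def count_patterns():
    """Return (number of sign assignments, number of connected patterns)."""
    # Each edge is absent (0), activating (+1), or inhibiting (-1).
    assignments = list(product((0, 1, -1), repeat=len(EDGES)))
    connected = [s for s in assignments if is_weakly_connected(s)]
    return len(assignments), len(connected)

total, connected = count_patterns()
print(total, connected)
```

Under this assumed counting rule, the 25 assignments that leave at least one node isolated are discarded from the 3^6 = 729 possibilities.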
We next formulate the stochastic Boolean model DeJong2002 (); Alvarez-Buylla2008 (); So2011 (). Suppose in general that there are nodes, and each node takes 0 or 1 at continuous time . The value of each node represents whether the number of proteins produced by the gene exceeds a threshold value as shown in Fig. 2. Inhibitory regulation is described by the NOT gate. Regulation by multiple nodes is expressed by a logic function such as AND or OR, and the state of a node is determined by the value of a regulatory function with nodes regulating the node . In this study, we assume each regulatory function as the AND gate, which is considered as one of the major regulatory functions in gene regulatory networks Alon2006 (). (The relation between our results and the assumption of regulatory functions is discussed in Appendix B.) For example, in the case of Fig. 2, regulatory functions take the following expressions: and .
In real gene regulatory networks, the state of a gene (i.e., ) changes stochastically. Therefore, the time evolution of a gene expression is described by the master equation with the following transition rates:

(1)
(2)

Here, is a transition rate and is the reverse transition ratio. The transition matrix of the master equation is constructed by the above transition rates (see Appendix A for an example).
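To make the stochastic dynamics concrete, the following is a minimal Gillespie-type simulation sketch. It is not the model of Fig. 2: the pattern is a hypothetical activating chain s -> x -> y -> z, and the rates are an assumption made for illustration (a node flips toward the value of its regulatory function at rate `w`, away from it at rate `w * r`, and the signal flips at rate `w_s`). All parameter values are placeholders.

```python
import random

def simulate_chain(t_end, w=1.0, r=0.1, w_s=0.1, seed=0):
    """Simulate a hypothetical chain s -> x -> y -> z up to time t_end.

    Assumed rates (not taken from the source): a node flips toward its
    regulator's value at rate w and away from it at rate w * r; the signal s
    flips at rate w_s. Returns the fraction of time during which z equals s.
    """
    rng = random.Random(seed)
    state = {"s": 0, "x": 0, "y": 0, "z": 0}
    regulator = {"x": "s", "y": "x", "z": "y"}  # one activating input per node
    t, time_agree = 0.0, 0.0
    while True:
        rates = {"s": w_s}
        for node, parent in regulator.items():
            target = state[parent]  # regulatory function of an activating edge
            rates[node] = w if state[node] != target else w * r
        waiting = rng.expovariate(sum(rates.values()))
        remaining = t_end - t
        if state["z"] == state["s"]:
            time_agree += min(waiting, remaining)
        if waiting >= remaining:
            return time_agree / t_end
        t += waiting
        # Pick the node that flips with probability proportional to its rate.
        node = rng.choices(list(rates), weights=list(rates.values()))[0]
        state[node] ^= 1

print(simulate_chain(5000.0))
```

With a small `r` and a slow signal, the output follows the signal most of the time, which is the qualitative behavior expected of a pattern that propagates information.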
Such Boolean models have been used to describe the behavior of gene regulatory networks composed of more than two nodes Shmulevich2002 (); Alvarez-Buylla2008 (); Garg2009 (); Liang2012 (); Murrugarra2012 (). Strictly speaking, a Boolean model might be somewhat inaccurate for describing oscillations induced by negative feedback loops and stationary states of positive feedback loops Gardner2000 (); Zhu2007 (); Elowitz2000 (); Hori2011 (); Hori2013 (). In spite of these flaws, however, it has been shown that Boolean models describe gene regulatory networks reasonably well Alvarez-Buylla2008 (); Garg2009 (); Liang2012 (); Murrugarra2012 ().
We now discuss the detailed setup of this study. As shown in Fig. 2, we consider the case where the signal source activates the input node . The signal is then propagated to the output node through the middle node . Here, is regarded as either another gene or a signal molecule of .
With this setup, we focus on how information flows from to . We assume that randomly flips between 0 and 1 with equal probabilities (i.e., ). We calculate information-thermodynamic quantities for the stationary state after equilibration. We set the parameters with the following conditions:

(3)
(4)

Here, the condition comes from the assumption that the time scale of an external signal is slower than that of each node. In addition, for the sake of simplicity, we assumed that the parameters of the three nodes , and are the same.
We next consider the reduction of 704 patterns to 283 patterns by excluding irrelevant and equivalent patterns. In this study, we define the irrelevant patterns by the following criteria: (i) patterns without causal relationship from to or to , or (ii) patterns that have unregulated nodes. There also exist patterns that have different edge signs but equal with respect to all information quantities, which we regard as equivalent patterns (see Appendix B for details). We pick up only a single pattern from equivalent ones for calculation. As a result, we actually perform calculation for 283 patterns (see Appendix C for details).

II.2 Information thermodynamics

In this section, we briefly review information thermodynamics. The entropy production of a small system can be reduced through measurement and feedback control by “Maxwell’s demon”. Information thermodynamics enables us to take the demon’s feedback into account by incorporating information quantities such as mutual information, which quantifies the correlation between the system and the demon.
To connect information thermodynamics to our main setup smoothly, we consider two stochastic variables and that represent the states of node and node at time . Since they are stochastic variables, we can define the probability distribution . Here, capital letters and describe stochastic variables and the small letters and describe their particular realizations. If and form correlation, we can estimate the value of from the value of . Such correlation between and is quantified by the mutual information:

(5)
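As a small illustration, the mutual information of two binary variables can be computed directly from their joint distribution using the standard definition, the sum of p(a, b) ln[p(a, b) / (p(a) p(b))]. The joint distribution below is a hypothetical example (a fair bit copied with a 10% error), not data from this study, and the function name is illustrative.

```python
from math import log

def mutual_information(joint):
    """Mutual information (in nats) of a joint distribution {(a, b): prob}."""
    pa, pb = {}, {}
    for (a, b), p in joint.items():
        pa[a] = pa.get(a, 0.0) + p
        pb[b] = pb.get(b, 0.0) + p
    return sum(p * log(p / (pa[a] * pb[b]))
               for (a, b), p in joint.items() if p > 0)

# Hypothetical example: b copies a fair bit a but is flipped with probability eps.
eps = 0.1
joint = {(a, b): 0.5 * (1 - eps if a == b else eps)
         for a in (0, 1) for b in (0, 1)}
# Exchanging the roles of the two variables leaves the value unchanged.
swapped = {(b, a): p for (a, b), p in joint.items()}

print(mutual_information(joint), mutual_information(swapped))
```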

Mutual information is symmetric in terms of and , and therefore mutual information cannot capture the directional information flow in stochastic dynamics. To characterize such information flow, we consider the learning rate Horowitz2014 () and the transfer entropy Schreiber2000 (), which are respectively defined as

(6)
(7)

These quantities are defined with the time series of the stochastic variables, and both of them characterize the increment of the mutual information during evolves to . Specifically, the learning rate quantifies the amount of information that the instantaneous value of obtains, while the transfer entropy quantifies that the history of obtains as a whole. This slight difference leads to an inequality Hartich2016 (); Matsumoto2017 ()

(8)

We note that the learning rate can take both positive and negative values, while the transfer entropy is always nonnegative. If the learning rate becomes positive, indeed obtains information from . If it becomes negative, consumes the correlation as a consequence of feedback control or just dissipation.
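The relation between the two directional quantities can be checked numerically in a simplified setting. The sketch below uses a single discrete time step and conditions only on the current value of the receiving variable, which is an assumption: it is a one-step analogue, not the history-dependent transfer entropy used in the text, and the quantities are not divided by a time increment. In that setting the chain rule for mutual information gives I(X:Y') - I(X:Y) <= I(X:Y'|Y) for any joint distribution of (X, Y, Y'), where Y' denotes the updated value.

```python
import random
from itertools import product
from math import log

def marginal(joint, keep):
    """Marginalise {(x, y, y_next): prob} onto the index positions in `keep`."""
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

def mutual_info(joint, a, b):
    """I(A:B) in nats, where a and b are tuples of index positions."""
    pab, pa, pb = marginal(joint, a + b), marginal(joint, a), marginal(joint, b)
    return sum(p * log(p / (pa[k[:len(a)]] * pb[k[len(a):]]))
               for k, p in pab.items() if p > 0)

def conditional_mutual_info(joint, a, b, c):
    """I(A:B|C) = I(A:BC) - I(A:C), the chain rule for mutual information."""
    return mutual_info(joint, a, b + c) - mutual_info(joint, a, c)

def random_joint(seed):
    """A random joint distribution over three bits (illustrative input only)."""
    rng = random.Random(seed)
    weights = {o: rng.random() for o in product((0, 1), repeat=3)}
    norm = sum(weights.values())
    return {o: w / norm for o, w in weights.items()}

X, Y, Y_NEXT = (0,), (1,), (2,)
joint = random_joint(seed=1)
# One-step analogue of the learning rate: change of correlation with X.
info_change = mutual_info(joint, X, Y_NEXT) - mutual_info(joint, X, Y)
# One-step analogue of the transfer entropy from X to Y.
transfer = conditional_mutual_info(joint, X, Y_NEXT, Y)
print(info_change, transfer)
```

The first quantity can be negative, while the second is a conditional mutual information and therefore nonnegative, in line with the remark above.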
On the basis of inequality (8), it is reasonable to define the following quantity as a measure of the effectiveness of information gain by Hartich2016 (); Matsumoto2017 ():

(9)

The maximum is achieved if , which means that is a sufficient statistic of Matsumoto2017 (). This means that the latest value of is enough for the estimation of . For example, the estimator of the Kalman filter is known to be a sufficient statistic.
So far we have discussed purely information quantities, and we next consider a quantity that is more relevant to thermodynamics: information-thermodynamic dissipation Sagawa2010 (); Toyabe2010 (); Sagawa2012 (); Ito2013 (); Horowitz2014 (); Horowitz2014a (); Parrondo2015 (); Horowitz2015 (). Unlike conventional thermodynamic dissipation, information-thermodynamic dissipation explicitly includes the learning rate, which is defined for as Horowitz2015 ()

(10)

where the learning rate from to is defined as

(11)

The first term on the right-hand side of (10) is the derivative of the Shannon entropy , which represents the entropy change in . is the heat absorbed by from the bath, and therefore the second term is regarded as the entropy change in the bath. This term can be expressed by the transition probability of , , and that for the backward process through the detailed fluctuation theorem Jarzynski2000 (); Sekimoto2010 (); Seifert2012 ():

(12)

The generalized second law of thermodynamics states that the information-thermodynamic dissipation is always nonnegative: . This is a tighter inequality than the conventional second law for the entire system, and characterizes dissipation only in by incorporating the learning rate, which is a characteristic of information thermodynamics that enables us to quantify dissipation of a subsystem in a large network.
We define and in the same manner, and the total dissipation of the three nodes is given by

(13)

It immediately follows that . We note that is rewritten in a similar form to (10):

(14)

Here, is the joint Shannon entropy, and is the learning rate from to . As is the case for , if a three-node pattern learns from a signal node (i.e., ), becomes positive. A detailed derivation of these relations can be found in Horowitz2015 ().

Figure 3: Information-thermodynamic characterization of three-node patterns: The parameters are set to . Each point represents a pattern, and triangles with black edges represent feedforward loops. The difference in colors and shapes corresponds to the difference in the types (see Table 1). (a) Information-thermodynamic dissipation and . (b) Enlarged view of (a). (c) Information flow from S to Z, i.e., and . (d) and . The parameter dependence of these plots is discussed in Appendix F.
dissipative static
informative adaptive
Table 1: Classification of three-node patterns: On the basis of Fig. 3, all patterns are divided into five groups. In terms of their characteristics, they are classified into four types: dissipative, static, informative, and adaptive.
Figure 4: Patterns in the green and blue types. In particular, M5-1-1 and M5-1-2 are network motifs.

III Main results

We now show our main results on the characterization of three-node networks: the single-input case in Sec. III.1 and the two-input case in Sec. III.2.

III.1 Single input: Information propagation

In the setup of Fig. 2, we classify all the three-node patterns on the basis of information-thermodynamic quantities. We show that the patterns fall into four types and identify the type to which each network motif belongs.
We show scatter plots with respect to several information-thermodynamic quantities in Fig. 3. Figures 3(a) and 3(b) show the information-thermodynamic dissipation quantities. Figures 3(c) and 3(d) show four information quantities. All the points split into four groups in Figs. 3(a) and 3(b). We also see that the group with the second smallest dissipation splits into two groups in Figs. 3(c) and 3(d). From these observations, we assign different colors and shapes to the patterns in Fig. 3. These five groups can be rearranged into four types on the basis of their characteristics, as shown in Table 1. We discuss the characteristics of each type below. The concrete behavior of the patterns in each type is also shown by simulation in Appendix D.
First, the patterns of the dissipative type include negative feedback loops. They are dissipative because they show oscillatory behavior due to these negative feedback loops.
On the other hand, the patterns of the static type dissipate much less. These patterns also take small values of the information quantities. We confirmed that all of them include positive feedback loops. Because of the positive feedback loops, the dynamics of these patterns are static, and they do not respond well to the signals from the signal source.
The remaining 17 patterns, which belong to the informative and adaptive types, are shown in Fig. 4. The adaptive-type patterns take small values of the information quantities, whereas the informative-type patterns take large values. We find that the adaptive-type patterns include incoherent feedforward loops. According to the simulation results shown in Appendix D, they show adaptive behavior: they pass information from the signal source to the output node only for a short time after the state of the input changes. For example, in the case of M5-1-2, the output node switches from 0 to 1 temporarily when its input switches from 0 to 1, but it then returns to 0 due to inhibition by the middle node. This is why the adaptive-type patterns take small values of the information quantities. We finally confirm that the informative-type patterns propagate signals from the signal source to the output node quite well. Thus, these patterns take large values of the information quantities.
The network motifs M5-1-1 and M5-1-2 are classified into the informative and adaptive types, respectively, and they are among the simplest patterns of these types. No network motifs belong to the dissipative type. This result is reasonable in terms of information thermodynamics. Further studies using more detailed models or experiments are necessary, however, to check whether negative feedback loops indeed lead to oscillatory behavior Elowitz2000 (); Atkinson2003 (); Zhu2007 (); Tayar2017 () and positive feedback loops lead to static behavior Atkinson2003 (); Zhu2007 (); Kobayashi2003 (). In the next subsection, we discuss why feedforward loops are distinguished from the other patterns of the informative and adaptive types.

III.2 Two inputs: Logical operation

We now show our second result. Besides the input to , in real gene regulatory networks, a three-node pattern often takes another input to . Therefore, patterns that can nontrivially operate on both the signals would be preferable in natural selection. Here we evaluate the capacity of logical operation performed by three-node patterns by using an information quantity called tripartite mutual information, and discuss the reason why patterns with fewer edges such as feedforward loops are preferred.
We consider the situation that two signal sources and activate and independently (Fig. 5), and that and flip between 0 and 1 randomly (i.e., ). The parameters are set with the following conditions:

(15)
(16)
(17)
(18)

We exclude irrelevant patterns from calculation with the same rule as in Sec. III.1, and additionally exclude patterns that become equivalent by swapping and . As a result, we perform calculation for 204 patterns (see Appendix C for details).
We discuss the definition and the basic properties of tripartite mutual information Cerf1996 (), which quantifies how nontrivially the output depends on the inputs and . The tripartite mutual information between these three nodes is defined as

The first line above schematically corresponds to the central region of the Venn diagram in Fig. 6. In general, can take either positive or negative value. However, in the present setup, it does not take a positive value, as shown from simple calculation with the assumption . In fact, we have

(20)

The smaller the tripartite mutual information is, the more non-trivial the logical operation is. In fact, if depends on both and , becomes smaller than , which implies a small negative value of (or equivalently, the absolute value becomes large). The logical operation is non-trivial in this case. On the other hand, if depends only on and is independent of , and hold, resulting in , where the logical operation is trivial.
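The sign argument above can be illustrated on idealized, noiseless gates. The sketch below uses the standard symmetric form of the tripartite mutual information, I(S1:Z) + I(S2:Z) - I(S1,S2:Z), and assumes independent, uniformly distributed input bits. The labels `S1`, `S2`, `Z` and the choice of gates are illustrative and are not taken from the stochastic Boolean model itself.

```python
from itertools import product
from math import log

def entropy(dist):
    """Shannon entropy (in nats) of a distribution {outcome: prob}."""
    return -sum(p * log(p) for p in dist.values() if p > 0)

def marginal(joint, keep):
    """Marginalise a joint distribution onto the index positions in `keep`."""
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

def mutual_info(joint, a, b):
    """I(A:B) = H(A) + H(B) - H(A,B), with a and b tuples of index positions."""
    return (entropy(marginal(joint, a)) + entropy(marginal(joint, b))
            - entropy(marginal(joint, a + b)))

def tripartite_info(gate):
    """I(S1:S2:Z) for Z = gate(S1, S2) with independent, uniform input bits."""
    joint = {(s1, s2, gate(s1, s2)): 0.25
             for s1, s2 in product((0, 1), repeat=2)}
    S1, S2, Z = (0,), (1,), (2,)
    return (mutual_info(joint, S1, Z) + mutual_info(joint, S2, Z)
            - mutual_info(joint, S1 + S2, Z))

gates = {
    "AND": lambda a, b: a & b,
    "XOR": lambda a, b: a ^ b,
    "pass-through of the first input": lambda a, b: a,
}
for name, gate in gates.items():
    print(name, tripartite_info(gate))
```

A gate whose output ignores one input gives zero, whereas gates that genuinely combine both inputs give negative values, consistent with reading a more negative value as a more non-trivial operation.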
We show the result of our calculation in Fig. 7. The horizontal axis is the mutual information , and the vertical axis is the tripartite mutual information . Network motifs M5-1-1 and M5-1-2 take small values of the tripartite mutual information. There are only six patterns that take small values of the tripartite mutual information, which are shown with a brace in Fig. 7. These six patterns are listed in Fig. 8.

Figure 5: Our setup with the two input signals, which focuses on the logical operation performed by a three-node pattern. We investigate how nontrivially the two input signals are processed to an output .
Figure 6: Venn diagram of tripartite mutual information: Each circle represents the Shannon entropy, and the tripartite mutual information corresponds to the central gray region.
Figure 7: Tripartite mutual information of three-node patterns and the fitting models: (a) The scatter plot of versus . The parameters are set to . As shown in the legend, the four feedforward loops are highlighted with black edges, and the other patterns are represented without edges. The commonly used names of the feedforward loops are shown with red letters. Also, the patterns in Fig. 4 are highlighted with the corresponding colors (green and blue). Blue patterns are, however, hidden behind M5-1-3 except for M5-1-2. The black curves are obtained by the fitting with the following models. (b) Fitting model for the left curve. (c) Fitting model for the right curve. There is a single fitting parameter, which is given by in our fitting.
Figure 8: Patterns that take small values of the tripartite mutual information, which are shown in Fig. 7 with a brace.

There seems to be a bifurcation structure in Fig. 7. We find that this structure can be understood with simple models described in Fig. 7 and Fig. 7. In the model of Fig. 7, the NOT gates are inserted into the two inputs with probability independently. In this model, the NOT gates represent the errors that independently occur on the input signals during the logical operation. In the model of Fig. 7, the probability of inserting the NOT gates is fixed with , and one of the AND gate or a straight pass from to is chosen with probabilities and , respectively. In this model, the non-triviality of the logical operation is represented by the probability . The left curve in Fig. 7 is drawn with model \subreffig: model1 by varying in the range of , and the right curve is drawn with model \subreffig: model2 by varying in . Here, there is only a single fitting parameter . The good agreement with the data means that these models capture the essence of the behavior of the three-node patterns.
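The two fitting models can be sketched as simple stochastic channels. Several details below are assumptions, because the text fixes the models only qualitatively: inputs are independent uniform bits, a NOT gate flips each input independently with probability `eps`, the output of the first model is the AND of the possibly flipped inputs, and in the second model the output is that AND with probability `q` and the possibly flipped first input otherwise. The names `model_errors`, `model_mixture`, `eps`, and `q` are illustrative.

```python
from itertools import product
from math import log

PAIRS = list(product((0, 1), repeat=2))  # all input pairs (s1, s2)

def entropy(probs):
    return -sum(p * log(p) for p in probs if p > 0)

def infos_from_channel(p_z1):
    """Given p_z1[(s1, s2)] = P(Z=1 | s1, s2) for uniform independent inputs,
    return (I(S1,S2:Z), I(S1:S2:Z)) in nats."""
    def h_z_given(groups):
        # H(Z | G), where each group is a list of equally likely input pairs.
        total = 0.0
        for g in groups:
            q1 = sum(p_z1[s] for s in g) / len(g)
            total += (len(g) / len(PAIRS)) * entropy((q1, 1 - q1))
        return total
    h_z = h_z_given([PAIRS])
    i_joint = h_z - h_z_given([[s] for s in PAIRS])
    i_s1 = h_z - h_z_given([[s for s in PAIRS if s[0] == v] for v in (0, 1)])
    i_s2 = h_z - h_z_given([[s for s in PAIRS if s[1] == v] for v in (0, 1)])
    return i_joint, i_s1 + i_s2 - i_joint

def seen_as_one(bit, eps):
    """P(input is read as 1) when a NOT gate is inserted with probability eps."""
    return 1 - eps if bit else eps

def model_errors(eps):
    """Assumed first model: AND gate acting on independently flipped inputs."""
    return infos_from_channel(
        {(a, b): seen_as_one(a, eps) * seen_as_one(b, eps) for a, b in PAIRS})

def model_mixture(q, eps):
    """Assumed second model: AND with probability q, else pass the first input."""
    return infos_from_channel(
        {(a, b): q * seen_as_one(a, eps) * seen_as_one(b, eps)
                 + (1 - q) * seen_as_one(a, eps) for a, b in PAIRS})

for eps in (0.0, 0.1, 0.3, 0.5):
    print("errors", eps, model_errors(eps))
for q in (0.0, 0.5, 1.0):
    print("mixture", q, model_mixture(q, eps=0.05))
```

Sweeping `eps` or `q` traces out one curve each in the plane of mutual information versus tripartite mutual information; the value `eps=0.05` in the second loop is a placeholder, not the fitted parameter.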
In particular, network motifs M5-1-1 and M5-1-2 perform non-trivial logical operation with relatively small errors. On the other hand, most of the informative-type patterns perform trivial logical operation, and the adaptive-type patterns other than M5-1-2 do not propagate information at all. We can thus consider that these patterns are unfavorable in terms of information processing.
We note that M4-1-1 is not a network motif. It should be remarked, however, that M4-1-1 occurs most frequently in many gene regulatory networks Gerstein2012 (), and it also occurs frequently in random networks (and thus is not regarded as a network motif). In this way, tripartite mutual information gives us an intuitive account for not only network motifs but also frequent patterns in gene regulatory networks.
The above result may give a reasonable explanation for the difference in the occurrence frequencies among different types of feedforward loops. As shown in Fig. 1, there are eight types of feedforward loops, but only C1 and I1 occur most frequently with around 40 percent each, and the other six types occur with around five percent each. This has been confirmed in the gene regulatory networks of E. coli and S. cerevisiae Mangan2006 (). The tripartite mutual information of M5-1-3 and M5-1-4 is around zero in Fig. 7, which corresponds to the low occurrence frequencies of C3, C4, I3 and I4. This suggests that the value of tripartite mutual information is important for the network formation.
It should be noted that we assumed each regulatory function as the AND gate in the foregoing argument. If the regulatory function of a three-node pattern is given by a combination of AND and OR, the conclusion becomes opposite to the above. It is necessary, therefore, to check the consistency between our result and real experimental data (see Appendix B for details). Qualitatively the same argument has also been mentioned in Refs. Mangan2003 (); Alon2006 ().

IV Concluding remarks

In a system like a gene regulatory network with information propagation, it is crucial to take into account information flow to analyze dissipation. In this study, we have adopted information thermodynamics to quantify such dissipation in a three-node pattern which is only a part of a large-scale network.
We first considered the case where there is a single input, and characterized all the possible three-node patterns with information-thermodynamic quantities. We found that these patterns are classified into four types. We discussed the characteristics of network motifs, by considering to which types they are categorized.
We next considered the case where there are two inputs. By quantifying how the two inputs affect the output by tripartite mutual information, we argued the reason why feedforward-loop network motifs occur frequently in an intuitive manner. This result is consistent with a previous study which claims that information propagates more widely in gene regulatory networks than in random networks Kim2015 (). In addition, we found that the different occurrence frequencies among the eight types of feedforward loops may correspond to the difference in the amounts of tripartite mutual information.
However, these results might involve some uncertainty because the stochastic Boolean model might be oversimplified. Although this model is suitable for understanding the basic behavior of gene regulatory patterns, its oscillatory or static solutions often deviate from the actual behavior. In addition, since the stochastic Boolean model is a coarse-grained model, some dissipative processes such as protein production are not included. Therefore, it is a future issue to investigate dissipation by using more detailed models. For example, the difference in the occurrence frequencies between C1 (I1) and C2 (I2) might be accounted for by such a detailed analysis of dissipation.
Meanwhile, our study suggests that tripartite mutual information is important in terms of the network formation of gene regulatory networks. The symmetry between the occurrence frequencies of C1 to C4 and I1 to I4 Mangan2006 () may arise from the symmetry in the tripartite mutual information between them. It is also worth investigating whether the same tendencies can be found in gene regulatory networks other than those of E. coli and S. cerevisiae.
Our results suggest that information thermodynamics gives us a useful methodology for a systematic analysis of biochemical reaction networks. We note that tripartite mutual information has not received much attention in this context so far, while information quantities such as mutual information and transfer entropy have often been used in a wide range of contexts Honey2007 (); Tostevin2009 (). Further application of our approach to a broader class of networks, including both biological and artificial ones, is a future issue.

Acknowledgements.
We thank Tetsuya J. Kobayashi for fruitful discussion. T. S. is supported by JSPS KAKENHI Grant No. JP16H02211 and No. 25103003.

Appendix A Master equation

In this appendix, we present an example of the master equation for the stochastic Boolean model. We consider the pattern described in Fig. 2 as an example. The joint probability distribution is represented by the probability vector defined by

(21)

where T means the transpose of a matrix. Then the transition matrix is given by

(22)

where

and with being the identity matrix. Here, an element represented by above is determined so that the sum of each column becomes zero. For example, the element of is given by . The time evolution of the total system is described by the master equation

(25)

For example, the first row of the above equation is given by

(26)
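The construction of a transition matrix whose columns sum to zero can be sketched for a small example. The pattern and rates below are assumptions for illustration and do not reproduce the matrix of this appendix: a hypothetical activating chain s -> x -> y -> z, in which a node flips toward its regulator's value at rate `w`, away from it at rate `w * r`, and the signal flips at rate `w_s`. The stationary state is obtained by plain Euler integration of the master equation.

```python
from itertools import product

STATES = list(product((0, 1), repeat=4))  # joint states (s, x, y, z)
INDEX = {state: k for k, state in enumerate(STATES)}

def flip_rate(state, node, w=1.0, r=0.1, w_s=0.1):
    """Assumed rates: the signal (node 0) flips at w_s; any other node flips
    toward its regulator's value at rate w and away from it at rate w * r."""
    if node == 0:
        return w_s
    target = state[node - 1]  # chain: each node is activated by the previous one
    return w if state[node] != target else w * r

def transition_matrix():
    """W[new][old] is the rate old -> new; diagonals make columns sum to zero."""
    n = len(STATES)
    W = [[0.0] * n for _ in range(n)]
    for old in STATES:
        for node in range(4):
            new = list(old)
            new[node] ^= 1
            rate = flip_rate(old, node)
            W[INDEX[tuple(new)]][INDEX[old]] += rate
            W[INDEX[old]][INDEX[old]] -= rate
    return W

def stationary_distribution(W, dt=0.05, steps=4000):
    """Integrate dp/dt = W p with explicit Euler steps from a uniform start."""
    n = len(W)
    p = [1.0 / n] * n
    for _ in range(steps):
        p = [p[i] + dt * sum(W[i][j] * p[j] for j in range(n)) for i in range(n)]
    return p

W = transition_matrix()
p_stat = stationary_distribution(W)
# Stationary probability that the output z agrees with the signal s.
p_agree = sum(p for state, p in zip(STATES, p_stat) if state[0] == state[3])
print(p_agree)
```

Information quantities such as the mutual information between the signal and the output can then be evaluated from marginals of `p_stat`.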

Appendix B Equivalence of network patterns

In this appendix, we first formulate the equivalence of network patterns and then discuss the relation between the OR-logic patterns and the AND-logic ones.

b.1 Definition of the equivalence

We consider the stationary state of the setup in Fig. 2. We define the equivalence of two patterns by an example. If the joint probability distributions for a pattern 1 and for another pattern 2 satisfy the following relation for all , then patterns 1 and 2 are said to be equivalent:

(27)

where the bar on a letter represents the inversion and . In general, the inversion is allowed on multiple nodes, but it should be performed on both variables at time and at time simultaneously. If patterns 1 and 2 are equivalent, their information quantities take the same values. For example, when Eq. (27) holds, the mutual information for pattern 1 and for pattern 2 become the same:

(28)
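
The invariance of mutual information under the inversion of a variable can be checked numerically. The following sketch computes the mutual information of a hypothetical joint distribution of two binary variables before and after relabeling one of them; the probabilities are arbitrary illustrative values, not results of the model.

```python
from math import log2

def mutual_information(joint):
    """I(A;B) in bits for a joint distribution given as {(a, b): probability}."""
    pa, pb = {}, {}
    for (a, b), p in joint.items():
        pa[a] = pa.get(a, 0.0) + p
        pb[b] = pb.get(b, 0.0) + p
    return sum(p * log2(p / (pa[a] * pb[b]))
               for (a, b), p in joint.items() if p > 0)

def invert_second(joint):
    """Relabel b -> 1 - b, i.e. apply the inversion to the second variable."""
    return {(a, 1 - b): p for (a, b), p in joint.items()}

# Hypothetical joint distribution (assumption, for illustration only).
joint_1 = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}
joint_2 = invert_second(joint_1)

mi_1 = mutual_information(joint_1)
mi_2 = mutual_information(joint_2)  # equal to mi_1
```

The inversion only relabels the states, so every term of the sum is unchanged, which is why equivalent patterns share all such information quantities.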

This equivalence of patterns can be shown graphically. For example, Fig. 9 shows a transformation from one pattern to another equivalent pattern.

Figure 9: Equivalent transformation of patterns: The NOT transformations in all the input and output edges give an equivalent pattern.

This transformation is based on the following equations: (if for example), and .
We discuss the equivalence among feedforward loops. In the setup of Fig. 2, the equivalent transformation on and is possible. Thus, the patterns from C1 to C4 and those from I1 to I4 become equivalent, respectively. On the other hand, in the setup of Fig. 5, the transformation is possible only on . In this case, C1 and C2, C3 and C4, I1 and I2, I3 and I4 become equivalent, respectively.

b.2 Equivalence between the OR-logic patterns and the AND-logic ones

In this study, we have assumed that the regulatory functions are given by the AND gates, but there are other types of genes whose regulatory functions can be described by the OR gates. However, even if we assume that some of the regulatory functions are given by the OR gates, the result of Fig. 3 does not change at all. In fact, any OR-logic pattern can be transformed into the same pattern with the AND gate, which can be understood by the following equation: if . This transformation is illustrated in Fig. 10 and Fig. 11. Also, the scatter plot of Fig. 7 itself does not change, while a pattern that corresponds to each point becomes different from that of Fig. 7.

Figure 10: Equivalent transformation from the OR-logic pattern to the AND-logic one.
Figure 11: Equivalent transformation with the setup of Fig. 2: An OR-logic pattern is equivalent to the same pattern with the AND gate.
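
The logical identity behind this type of transformation is De Morgan's law: an OR gate acting on two inputs equals the inversion of an AND gate acting on the inverted inputs. The following sketch verifies it over all Boolean inputs; the helper names are illustrative, and we assume this is the identity underlying the transformation of Fig. 10.

```python
from itertools import product

def or_gate(a, b):
    return a | b

def and_gate(a, b):
    return a & b

def invert(a):
    return 1 - a

# OR on the inputs equals the inverted AND of the inverted inputs.
de_morgan_holds = all(
    or_gate(a, b) == invert(and_gate(invert(a), invert(b)))
    for a, b in product((0, 1), repeat=2)
)
```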

Next, we discuss the result of the tripartite mutual information for different types of feedforward loops under the above transformation. In the setup of Fig. 5, for example, we consider a situation that the regulatory functions of C1 for and are given by AND and OR, respectively. If we express this as C1(AND, OR), the relation C1(AND, OR) = C3(AND, AND) holds. Therefore, if the regulatory functions of a three-node pattern are given by a combination of AND and OR, the value of tripartite mutual information is swapped between M5-1-1 and M5-1-4, M5-1-2 and M5-1-3.
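
To illustrate how tripartite mutual information distinguishes logical operations, the following sketch evaluates it for noise-free gates with two independent, uniformly distributed binary inputs. The sign convention used here, I(s1;x) + I(s2;x) - I(s1,s2;x), is one common choice and is an assumption; the definition in the main text takes precedence. The deterministic gates are also a simplification of the stochastic model.

```python
from math import log2
from itertools import product

def entropy(dist):
    return -sum(p * log2(p) for p in dist.values() if p > 0)

def marginal(joint, keep):
    """Marginal distribution over the variables whose indices are in `keep`."""
    out = {}
    for state, p in joint.items():
        key = tuple(state[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

def mi(joint, left, right):
    """Mutual information in bits between two groups of variables."""
    return (entropy(marginal(joint, left)) + entropy(marginal(joint, right))
            - entropy(marginal(joint, left + right)))

def tripartite_mi(joint):
    # state = (s1, s2, x); assumed convention: I(s1;x) + I(s2;x) - I(s1,s2;x)
    return mi(joint, (0,), (2,)) + mi(joint, (1,), (2,)) - mi(joint, (0, 1), (2,))

def gate_distribution(gate):
    """Joint distribution of (s1, s2, x) for uniform inputs and x = gate(s1, s2)."""
    return {(a, b, gate(a, b)): 0.25 for a, b in product((0, 1), repeat=2)}

i3_xor = tripartite_mi(gate_distribution(lambda a, b: a ^ b))   # -1 bit
i3_and = tripartite_mi(gate_distribution(lambda a, b: a & b))   # about -0.19 bit
i3_copy = tripartite_mi(gate_distribution(lambda a, b: a))      # 0 bit
```

In this convention, an output that merely copies one input gives zero, while an output that depends on the combination of both inputs gives a negative value.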
We explain the reason why the result in Sec. III.2 is plausible from the biological point of view. First, if we regard as the signal molecule of , then the regulatory function of becomes the AND gate between and , because the signal molecule activates the transcription factor . Therefore, it is sufficient to consider the regulatory function of . Intuitively, if either or is the inhibitory factor of , the regulatory function is likely to be the AND gate. A similar idea can be found, for example, in Bulashevska2005 (). In that sense, C2, C3, C4, I2, I3 and I1 are likely to be the patterns with the AND gates. Thus the result that C3, C4, and I3 perform trivial logical operations is plausible to some extent. The low occurrence frequency of I4 is a remaining issue to be further examined.

Appendix C Irrelevant patterns

There are totally three-node patterns, but some of them are irrelevant. In this study we exclude patterns from calculation with the following rules: (i) patterns without causal relationship from to or to , (ii) patterns that have unregulated nodes.
Two examples of the case (i) are shown in Fig. 12, where does not regulate or in these patterns. Since such patterns are equivalent to some two-node patterns, we exclude them from our calculation.

Figure 12: Examples of patterns without causal relationship from to .

Two examples of the case (ii) are shown in Fig. 13. Patterns without regulation of are equivalent to some two-node patterns in the setup of Fig. 2, where patterns without regulation of are meaningless.

Figure 13: Examples of patterns that have unregulated nodes.

Moreover, in the setup of Fig. 5, the roles of and become equivalent. We calculated only one of the patterns that become equivalent by swapping nodes and (Fig. 14).

Figure 14: Examples of patterns that are symmetric in terms of and .

In summary, we include only a single pattern out of equivalent ones for our calculation on the basis of the above rules. As a result, we performed our calculation for 283 patterns for the setup of Fig. 2 and 204 patterns for the setup of Fig. 5 (see Supplemental Material for the list of these patterns).

Appendix D Dynamics of patterns

In this Appendix, we show the characteristics of the four types shown in Table 1 with examples of numerical simulation based on the Gillespie method Gillespie1977 (). In Figs. 15 to 17, we show the time evolution of , and when changes with a constant period. For simplicity, we set , while as in Fig. 3.
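
A simulation of this kind can be sketched as follows for a single Boolean node driven by a signal that switches with a constant period. The rates `w_on` and `w_off`, the period, and the single-node setting are illustrative assumptions, not the parameters or patterns used for Figs. 15 to 17. Since the rates are constant between signal switches and waiting times are exponential, redrawing the waiting time after each switch keeps the sampling exact.

```python
import random

def gillespie_node(period=50.0, w_on=1.0, w_off=0.05, t_end=200.0, seed=0):
    """Gillespie-type simulation of one Boolean node x following a signal s(t).

    Assumed rates (illustrative): x flips toward s at rate w_on and away
    from s at rate w_off. The signal s switches every `period` time units.
    Returns a list of (time, s, x) tuples recorded after each event.
    """
    rng = random.Random(seed)
    t, x, k = 0.0, 0, 0  # k counts the signal switches so far
    trajectory = [(t, 0, x)]
    while t < t_end:
        s = k % 2
        next_switch = (k + 1) * period
        rate = w_on if x != s else w_off
        dt = rng.expovariate(rate)  # exponential waiting time
        if t + dt >= next_switch:
            # The signal switches before the node flips.
            t, k = next_switch, k + 1
        else:
            # The node flips.
            t, x = t + dt, 1 - x
        trajectory.append((t, k % 2, x))
    return trajectory

trajectory = gillespie_node()
```

Plotting `x` against time from `trajectory` gives a step-like trace of the same kind as the time series in the figures of this Appendix.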

(a) M9-1-2
(b) M10-3-4
Figure 15: Examples of dynamics of patterns in the pink type (dissipative). Both of them show oscillatory behavior.
(a) M8-2-4 (gray type)
Figure 16: An example of dynamics of a pattern in the gray type (static).
(a) M10-2-4
(b) M10-5-6
Figure 17: Examples of dynamics of patterns in the green type (informative) and the blue type (adaptive). M10-2-4 in the green type propagates information efficiently, while M10-5-6 in the blue type shows an adaptive behavior.

d.1 Pink type: Dissipative

M9-1-2 belongs to , which are respectively shown in Fig. 15(a) and Fig. 15(b). Both of them include negative feedback loops, and are dissipative types. We can see oscillatory behavior. In M10-3-4, shows oscillations independently of , while in M9-1-2, shows oscillations only when is 1. Thus we argue that this difference leads to the difference in the dissipation of .

d.2 Gray type: Static

M8-2-4 belongs to , which includes a positive feedback loop. We can see that converges to a stationary value.

d.3 Green and blue types: Informative and adaptive

M10-2-4 belongs to the informative type and M10-5-6 belongs to the adaptive type , which are shown in Fig. 17(a) and Fig. 17(b), respectively. In M10-2-4, the variation in propagates to as it is, while in M10-5-6, reacts only when changes from 0 to 1.

Appendix E Details of the fitting in Fig. 7

We show the details of the fitting curves in Fig. 7. For example, the probability distribution of the model of Fig. 7 is given by

(29)
(30)
(31)
(32)
(33)
(34)
(35)
(36)

Then, and are determined as functions of and by putting the above expressions into their definitions.

Appendix F Parameter dependence of the main results

We discuss the parameter dependence of Fig. 3 and Fig. 7. For example, Fig. 18(a) represents the case where is set to 0.01, while the other parameters are kept in the same values as those of Fig. 3. We note that the information-thermodynamic dissipation diverges and thus is meaningless for .
We show the cases of various parameters in Fig. 18 to Fig. 20. There are some cases where the type classification becomes ambiguous, but it can be reasonably understood by considering the characteristics of individual types, as discussed below. The classification becomes the most ambiguous in the case of . This is the case where the signal changes before the system relaxes, which is unrealistic in real biological systems. The case where the classification becomes the second most ambiguous is . This is the case where stochasticity of the system is too large, and the static nature of positive feedback loops disappears, leading to the small difference between and .
In Fig. 21, we show scatter plots of the tripartite mutual information. The models of Fig. 7 do not fit well with the data in Fig. 21(b), (c), (d), and (f). However, the conclusion does not change from that of Fig. 7. For example, M4-1-1, M5-1-1 and M5-1-2 still take smaller values in terms of the tripartite mutual information in the plots of Fig. 21. The reason why the models in Fig. 7 do not fit well with the data in Fig. 21(c) and (d) is that these models suppose the symmetric properties between and . On the other hand, in Fig. 21(a) and (e), the models fit well with the data, and the fitting parameter is determined as (a) , (e) .

Figure 18: Parameter dependence of the information-thermodynamic dissipation: (a) The case where the signal changes quite slowly compared to the characteristic times of the three nodes. (b) Enlarged view of (a). (c) The case where the transition rates are different in the three nodes. (d) The case where the reverse transition ratios are different in each node. (e) The case where the signal changes as fast as the characteristic times of the three nodes. (f) The case where the reverse transition rate is large. In all cases, the other parameters are set to the same values as those of Fig. 3.
Figure 19: Parameter dependence of the informational quantities: (a) The case where the signal changes quite slowly compared to the characteristic times of the three nodes. (b) The case where the signal changes as fast as the characteristic times of the three nodes. (c) The case where the transition rates are different in the three nodes. (d) The case where the reverse transition ratios are different in the three nodes. (e) The case where the reverse transition rates are zero. (f) The case where the reverse transition rates are large. In all cases, the other parameters are set to the same values as those of Fig. 3.
Figure 20: Parameter dependence of the mutual information and the sensory capacity: (a) The case where the signal changes quite slowly compared to the characteristic times of the three nodes. (b) The case where the signal changes as fast as the characteristic times of the three nodes. (c) The case where the transition rates are different in the three nodes. (d) The case where the reverse transition ratios are different in the three nodes. (e) The case where the reverse transition rates are zero. (f) The case where the reverse transition rates are large. In all cases, the other parameters are set to the same values as those of Fig. 3.
Figure 21: Parameter dependence of tripartite mutual information: (a) The case where the signal changes quite slowly compared to the characteristic times of the three nodes. (b) The case where the signal changes as fast as the characteristic times of the three nodes. (c) The case where the transition rates are different in the three nodes. (d) The case where the reverse transition ratios are different in the three nodes. (e) The case where the reverse transition rates are zero. (f) The case where the reverse transition rates are large. In all cases, the other parameters are set to the same values as those of Fig. 7. The fitting parameters are given by (a) , (e) .

References

  • (1) Y. Taniguchi, P. J. Choi, G. W. Li, H. Chen, M. Babu, J. Hearn, A. Emili and X. S. Xie, Quantifying E. coli Proteome and Transcriptome with Single-Molecule Sensitivity in Single Cells, Science 329, 533 (2010).
  • (2) S. S. Shen-Orr, R. Milo, S. Mangan, and U. Alon, Network motifs in the transcriptional regulation network of Escherichia coli, Nat. Genet. 31, 64 (2002).
  • (3) R. Milo, S. Shen-Orr, S. Itzkovitz, N. Kashtan, D. Chklovskii and U. Alon, Network Motifs: Simple Building Blocks of Complex Networks, Science 298, 824 (2002).
  • (4) T. I. Lee, N. J. Rinaldi, F. Robert, D. T. Odom, Z. Bar-Joseph, G. K. Gerber, N. M. Thompson, I. Simon, J. Zeitlinger, E. G. Jennings, H. L. Murray, D. B. Gordon, B. Ren, J. J. Wyrick, J.-B. Tagne, T. L. Volkert, E. Fraenkel, D. K. Gifford and R. A. Young, Transcriptional Regulatory Networks in Saccharomyces cerevisiae, Science 298, 799 (2002).
  • (5) U. Alon, Network motifs: theory and experimental approaches, Nat. Rev. Genet. 8, 450 (2007).
  • (6) S. Mangan and U. Alon, Structure and function of the feed-forward loop network motif, P. Natl. Acad. Sci. USA 100, 11980 (2003).
  • (7) U. Alon, An Introduction to Systems Biology (CRC Press, 2006).
  • (8) S. Mangan, S. Itzkovitz, A. Zaslaver, and U. Alon, The incoherent feed-forward loop accelerates the response-time of the gal system of Escherichia coli, J. Mol. Biol. 356, 1073 (2006).
  • (9) M. Kittisopikul and G. M. Suel, Biological role of noise encoded in a genetic network motif, P. Natl. Acad. Sci. USA 107, 13300 (2010).
  • (10) L. T. MacNeil and A. J. M. Walhout, Gene regulatory networks and the role of robustness and stochasticity in the control of gene expression, Genome Res. 21, 645 (2011).
  • (11) W. H. De Ronde, F. Tostevin, and P. R. Ten Wolde, Feed-forward loops and diamond motifs lead to tunable transmission of information in the frequency domain, Phys. Rev. E 86 (2012).
  • (12) L. Albergante, J. J. Blow, and T. J. Newman, Buffered Qualitative Stability explains the robustness and evolvability of transcriptional networks, eLife 3, e02863 (2014).
  • (13) G. Swiers, R. Patient, and M. Loose, Genetic regulatory networks programming hematopoietic stem cells and erythroid lineage specification, Dev. Biol. 294, 525 (2006).
  • (14) M. B. Gerstein, A. Kundaje, M. Hariharan, S. G. Landt, K. K. Yan, C. Cheng, X. J. Mu, E. Khurana, J. Rozowsky, R. Alexander, R. Min, P. Alves, A. Abyzov, N. Addleman, N. Bhardwaj, A. P. Boyle, P. Cayting, A. Charos, D. Z. Chen, Y. Cheng, D. Clarke, C. Eastman, G. Euskirchen, S. Frietze, Y. Fu, J. Gertz, F. Grubert, A. Harmanci, P. Jain, M. Kasowski, P. Lacroute, J. Leng, J. Lian, H. Monahan, H. Ogeen, Z. Ouyang, E. C. Partridge, D. Patacsil, F. Pauli, D. Raha, L. Ramirez, T. E. Reddy, B. Reed, M. Shi, T. Slifer, J. Wang, L. Wu, X. Yang, K. Y. Yip, G. Z.-Schapira, S. Batzoglou, A. Sidow, P. J. Farnham, R. M. Myers, S. M. Weissman and M. Snyder, Architecture of the human regulatory network derived from ENCODE data, Nature 489, 91 (2012).
  • (15) C. Jarzynski, Hamiltonian Derivation of a Detailed Fluctuation Theorem, J. Stat. Phys. 98, 77 (2000).
  • (16) K. Sekimoto, Stochastic Energetics (Springer, 2010).
  • (17) U. Seifert, Stochastic thermodynamics, fluctuation theorems and molecular machines, Rep. Prog. Phys. 75, 126001 (2012).
  • (18) T. Sagawa and M. Ueda, Generalized Jarzynski equality under nonequilibrium feedback control, Phys. Rev. Lett. 104, 1 (2010).
  • (19) S. Toyabe, T. Sagawa, M. Ueda, E. Muneyuki, and M. Sano, Experimental demonstration of information-to-energy conversion and validation of the generalized Jarzynski equality, Nat. Phys. 6, 988 (2010).
  • (20) T. Sagawa and M. Ueda, Fluctuation Theorem with Information Exchange: Role of Correlations in Stochastic Thermodynamics, Phys. Rev. Lett. 109, 180602 (2012).
  • (21) S. Ito and T. Sagawa, Information thermodynamics on causal networks, Phys. Rev. Lett. 111, 1 (2013).
  • (22) J. M. Horowitz and M. Esposito, Thermodynamics with Continuous Information Flow, Phys. Rev. X 4, 031015 (2014).
  • (23) J. M. Horowitz and H. Sandberg, Second-law-like inequalities with information and their interpretations, New J. Phys. 16, 125007 (2014).
  • (24) J. M. R. Parrondo, J. M. Horowitz, and T. Sagawa, Thermodynamics of information, Nat. Phys. 11, 131 (2015).
  • (25) N. Shiraishi and T. Sagawa, Fluctuation theorem for partially masked nonequilibrium dynamics, Phys. Rev. E 91, 3 (2015).
  • (26) P. Sartori, L. Granger, C. F. Lee, and J. M. Horowitz, Thermodynamic Costs of Information Processing in Sensory Adaptation, PLoS Comput. Biol. 10, e1003974 (2014).
  • (27) A. C. Barato, D. Hartich, and U. Seifert, Efficiency of cellular information processing, New J. Phys. 16, 103024 (2014).
  • (28) S. Ito and T. Sagawa, Maxwell’s demon in biochemical signal transduction with feedback loop, Nat. Commun. 6, 7498 (2015).
  • (29) D. Hartich, A. C. Barato, and U. Seifert, Sensory capacity: An information theoretical measure of the performance of a sensor, Phys. Rev. E 93, 1 (2016).
  • (30) T. E. Ouldridge, C. C. Govern, and P. R. ten Wolde, Thermodynamics of Computational Copying in Biochemical Systems, Phys. Rev. X 7, 021004 (2017).
  • (31) T. Matsumoto and T. Sagawa, Role of sufficient statistics in stochastic thermodynamics and its implication to sensory adaptation, arXiv:1711.00264 (2017).
  • (32) W. Filipowicz, S. N. Bhattacharyya, and N. Sonenberg, Mechanisms of post-transcriptional regulation by microRNAs: Are the answers in sight? Nat. Rev. Genet. 9, 102 (2008).
  • (33) H. Herranz and S. M. Cohen, MicroRNAs and gene regulatory networks: managing the impact of noise in biological systems, Gene. Dev. 24, 1339 (2010).
  • (34) Z.-P. Liu, C. Wu, H. Miao, and H. Wu, RegNetwork: an integrated database of transcriptional and post-transcriptional regulatory networks in human and mouse, Database 2015, bav095 (2015).
  • (35) H. de Jong, Modeling and Simulation of Genetic Regulatory Systems: A Literature Review, J. Comput. Biol. 9, 67 (2002).
  • (36) E. R. Álvarez-Buylla, Á. Chaos, M. Aldana, M. Benítez, Y. C.-Poza, C. E.-Soto, D. A. Hartasánchez, R. B. Lotto, D. Malkin, G. J. E. Santos and P. P.-Longoria, Floral Morphogenesis: Stochastic Explorations of a Gene Network Epigenetic Landscape, PLoS ONE 3, e3626 (2008).
  • (37) L.-h. So, A. Ghosh, C. Zong, L. A Sepúlveda, R. Segev and I. Golding, General properties of transcriptional time series in Escherichia coli, Nat. Genet. 43, 554 (2011).
  • (38) I. Shmulevich, E. R. Dougherty, S. Kim, and W. Zhang, Probabilistic Boolean Networks: a rule-based uncertainty model for gene regulatory networks, Bioinformatics 18, 261 (2002).
  • (39) A. Garg, K. Mohanram, A. Di Cara, G. De Micheli, and I. Xenarios, Modeling stochasticity and robustness in gene regulatory networks, Bioinformatics 25, i101 (2009).
  • (40) J. Liang and J. Han, Stochastic Boolean networks: An efficient approach to modeling gene regulatory networks, BMC Syst. Biol. 6, 113 (2012).
  • (41) D. Murrugarra, A. Veliz-Cuba, B. Aguilar, S. Arat, and R. Laubenbacher, Modeling stochasticity and variability in gene regulatory networks, Eurasip J. Bioinform. Syst. Biol. 2012, 1 (2012).
  • (42) T. S. Gardner, C. R. Cantor, and J. J. Collins, Construction of a genetic toggle switch in Escherichia coli, Nature 403, 339 (2000).
  • (43) R. Zhu, A. S. Ribeiro, D. Salahub, and S. A. Kauffman, Studying genetic regulatory networks at the molecular level: Delayed reaction stochastic models, J. Theor. Biol. 246, 725 (2007).
  • (44) M. B. Elowitz and S. Leibler, A synthetic oscillatory network of transcriptional regulators, Nature 403, 335 (2000).
  • (45) Y. Hori, T. H. Kim, and S. Hara, Existence criteria of periodic oscillations in cyclic gene regulatory networks, Automatica 47, 1203 (2011).
  • (46) Y. Hori, M. Takada, and S. Hara, Biochemical oscillations in delayed negative cyclic feedback: Existence and profiles, Automatica 49, 2581 (2013).
  • (47) T. Schreiber, Measuring Information Transfer, Phys. Rev. Lett. 85, 461 (2000).
  • (48) J. M. Horowitz, Multipartite information flow for multiple Maxwell demons, J. Stat. Mech. Theory E 2015, P03006 (2015).
  • (49) M. R. Atkinson, M. A. Savageau, J. T. Myers, and A. J. Ninfa, Development of Genetic Circuitry Exhibiting Toggle Switch or Oscillatory Behavior in Escherichia coli, Cell 113, 597 (2003).
  • (50) A. M. Tayar, E. Karzbrun, V. Noireaux, and R. H. Bar-Ziv, Synchrony and pattern formation of coupled genetic oscillators on a chip of artificial cells, P. Natl. Acad. Sci. USA 114, 11609 (2017).
  • (51) T. Kobayashi, L. Chen, and K. Aihara, Modeling genetic switches with positive feedback loops, J. Theor. Biol. 221, 379 (2003).
  • (52) N. J. Cerf and C. Adami, Information theory of quantum entanglement and measurement, Physica D 120, 62 (1998).
  • (53) H. Kim, P. Davies, and S. I. Walker, New scaling relation for information transfer in biological networks, J. Roy. Soc. Interface 12, 20150944 (2015).
  • (54) C. J. Honey, R. Kötter, M. Breakspear, and O. Sporns, Network structure of cerebral cortex shapes functional connectivity on multiple time scales, P. Natl. Acad. Sci. USA 104, 10240 (2007).
  • (55) F. Tostevin and P. R. Ten Wolde, Mutual information between input and output trajectories of biochemical networks, Phys. Rev. Lett. 102 (2009).
  • (56) S. Bulashevska and R. Eils, Inferring genetic regulatory logic from expression data, Bioinformatics 21, 2706 (2005).
  • (57) D. T. Gillespie, Exact stochastic simulation of coupled chemical reactions, J. Phys. Chem.-US 81, 2340 (1977).

Supplemental Material

In this Supplemental Material, we list all the three-node patterns that are included for our calculation. Patterns in Fig. S1 to Fig. S4 represent those included for calculation in the setup of Fig. 2 in the main text. We exclude the patterns which have ★ marks at lower right from our calculation in the setup of Fig. 5. On the other hand, patterns in Fig. S5 represent those included for calculation only in the setup of Fig. 5. An upper left node represents , a middle node , and a lower left node in each three-node pattern. The first number of the name represents the difference of the shape (which is determined to be consistent with that in Refs. Milo2002 (); Alon2006 ()), and the second number represents the pattern’s direction, and the third number represents the difference due to the different signs of edges.

Figure S1: List 1: These patterns are included for calculation in the setup of Fig. 2, while those with ★ marks at lower right are excluded from calculation in the setup of Fig. 5.
Figure S2: List 2: These patterns are included for calculation in the setup of Fig. 2, while those with ★ marks at lower right are excluded from calculation in the setup of Fig. 5.
Figure S3: List 3: These patterns are included for calculation in the setup of Fig. 2, while those with ★ marks at lower right are excluded from calculation in the setup of Fig. 5.
Figure S4: List 4: These patterns are included for calculation in the setup of Fig. 2, while those with ★ marks at lower right are excluded from calculation in the setup of Fig. 5.
Figure S5: List 5: These patterns are included for calculation only in the setup of Fig. 5.