ADAM: Analysis of Discrete Models of Biological Systems Using Computer Algebra

ADAM: Analysis of Discrete Models of Biological Systems Using Computer Algebra

Franziska Hinkelmann    Madison Brandon    Bonny Guang    Rustin McNeill    Grigoriy Blekherman    Alan Veliz-Cuba and Reinhard Laubenbacher
Abstract
Abstract

ADAM: Analysis of Discrete Models of Biological Systems Using Computer Algebra Franziska Hinkelmann, Madison Brandon , Bonny Guang , Rustin McNeill , Grigoriy Blekherman , Alan Veliz-Cuba and Reinhard Laubenbacher
Virginia Bioinformatics Institute, Blacksburg, VA 24061-0123, USA Virginia Polytechnic Institute and State University, Blacksburg, VA 24061-0123, USA University of Tennessee - Knoxville, Knoxville, TN 37996-2513, USA Harvey Mudd College, Claremont, CA 91711-5901, USA University of North Carolina - Greensboro, Greensboro, NC 27402-6170, USA These authors contributed equally Email: Franziska Hinkelmann - fhinkel@vt.edu; Madison Brandon - mbrando1@utk.edu; Bonny Guang - bonny.guang@gmail.com; Rustin McNeill - rcmcneil@uncg.edu; Grigoriy Blekherman - grrigg@vbi.vt.edu; Alan Veliz-Cuba - alanavc@vt.edu; Reinhard Laubenbacher- reinhard@vbi.vt.edu; Franziska Hinkelmann - fhinkel@vt.edu; Madison Brandon - mbrando1@utk.edu; Bonny Guang - bonny.guang@gmail.com; Rustin McNeill - rcmcneil@uncg.edu; Grigoriy Blekherman - grrigg@vbi.vt.edu; Alan Veliz-Cuba - alanavc@vt.edu; Reinhard Laubenbacher- reinhard@vbi.vt.edu; Corresponding author

Background:

Many biological systems are modeled qualitatively with discrete models, such as probabilistic Boolean networks, logical models, Petri nets, and agent-based models, with the goal to gain a better understanding of the system. The computational complexity to analyze the complete dynamics of these models grows exponentially in the number of variables, which impedes working with complex models. Although there exist sophisticated algorithms to determine the dynamics of discrete models, their implementations usually require labor-intensive formatting of the model formulation, and they are oftentimes not accessible to users without programming skills. Efficient analysis methods are needed that are accessible to modelers and easy to use.

Method:

By converting discrete models into algebraic models, tools from computational algebra can be used to analyze their dynamics. Specifically, we propose a method to identify attractors of a discrete model that is equivalent to solving a system of polynomial equations, a long-studied problem in computer algebra.

Results:

A method for efficiently identifying attractors, and the web-based tool Analysis of Dynamic Algebraic Models (ADAM), which provides this and other analysis methods for discrete models. ADAM converts several discrete model types automatically into polynomial dynamical systems and analyzes their dynamics using tools from computer algebra. Based on extensive experimentation with both discrete models arising in systems biology and randomly generated networks, we found that the algebraic algorithms presented in this manuscript are fast for systems with the structure maintained by most biological systems, namely sparseness, i.e., while the number of nodes in a biological network may be quite large, each node is affected only by a small number of other nodes, and robustness, i.e., small number of attractors. For a large set of published complex discrete models, ADAM identified the attractors in less than one second.

Conclusion:

Discrete modeling techniques are a useful tool for analyzing complex biological systems and there is a need in the biological community for accessible efficient analysis tools. ADAM provides analysis methods based on mathematical algorithms as a web-based tool for several different input formats, and it makes analysis of complex models accessible to a larger community, as it is platform independent as a web-service and does not require understanding of the underlying mathematics.

ADAM: Analysis of Discrete Models of Biological Systems Using Computer Algebra Franziska Hinkelmann, Madison Brandon , Bonny Guang , Rustin McNeill , Grigoriy Blekherman , Alan Veliz-Cuba and Reinhard Laubenbacher
Virginia Bioinformatics Institute, Blacksburg, VA 24061-0123, USA Virginia Polytechnic Institute and State University, Blacksburg, VA 24061-0123, USA University of Tennessee - Knoxville, Knoxville, TN 37996-2513, USA Harvey Mudd College, Claremont, CA 91711-5901, USA University of North Carolina - Greensboro, Greensboro, NC 27402-6170, USA These authors contributed equally Email: Franziska Hinkelmann - fhinkel@vt.edu; Madison Brandon - mbrando1@utk.edu; Bonny Guang - bonny.guang@gmail.com; Rustin McNeill - rcmcneil@uncg.edu; Grigoriy Blekherman - grrigg@vbi.vt.edu; Alan Veliz-Cuba - alanavc@vt.edu; Reinhard Laubenbacher- reinhard@vbi.vt.edu; Franziska Hinkelmann - fhinkel@vt.edu; Madison Brandon - mbrando1@utk.edu; Bonny Guang - bonny.guang@gmail.com; Rustin McNeill - rcmcneil@uncg.edu; Grigoriy Blekherman - grrigg@vbi.vt.edu; Alan Veliz-Cuba - alanavc@vt.edu; Reinhard Laubenbacher- reinhard@vbi.vt.edu; Corresponding author

Background:

Many biological systems are modeled qualitatively with discrete models, such as probabilistic Boolean networks, logical models, Petri nets, and agent-based models, with the goal to gain a better understanding of the system. The computational complexity to analyze the complete dynamics of these models grows exponentially in the number of variables, which impedes working with complex models. Although there exist sophisticated algorithms to determine the dynamics of discrete models, their implementations usually require labor-intensive formatting of the model formulation, and they are oftentimes not accessible to users without programming skills. Efficient analysis methods are needed that are accessible to modelers and easy to use.

Method:

By converting discrete models into algebraic models, tools from computational algebra can be used to analyze their dynamics. Specifically, we propose a method to identify attractors of a discrete model that is equivalent to solving a system of polynomial equations, a long-studied problem in computer algebra.

Results:

A method for efficiently identifying attractors, and the web-based tool Analysis of Dynamic Algebraic Models (ADAM), which provides this and other analysis methods for discrete models. ADAM converts several discrete model types automatically into polynomial dynamical systems and analyzes their dynamics using tools from computer algebra. Based on extensive experimentation with both discrete models arising in systems biology and randomly generated networks, we found that the algebraic algorithms presented in this manuscript are fast for systems with the structure maintained by most biological systems, namely sparseness, i.e., while the number of nodes in a biological network may be quite large, each node is affected only by a small number of other nodes, and robustness, i.e., small number of attractors. For a large set of published complex discrete models, ADAM identified the attractors in less than one second.

Conclusion:

Discrete modeling techniques are a useful tool for analyzing complex biological systems and there is a need in the biological community for accessible efficient analysis tools. ADAM provides analysis methods based on mathematical algorithms as a web-based tool for several different input formats, and it makes analysis of complex models accessible to a larger community, as it is platform independent as a web-service and does not require understanding of the underlying mathematics.

Background

Mathematical modeling is a crucial tool in understanding the dynamic behavior of complex biological systems. Discrete models are now widely used for this purpose. Model types include (probabilistic) Boolean networks, logical networks, Petri nets, cellular automata, and agent-based (individual-based) models, to name the most commonly found ones [1, 2, 3, 4, 5, 6].

There are several existing software packages for simulation and analysis of discrete models. These include GINsim, BoolNet R, Snoopy, signaling Petri net-based simulator in the PathwayOracle toolkit, DDLab, GenYsis-P Toolbox , and BN/PBN Matlab Toolbox [7, 8, 9, 10, 11, 12, 13, 14]. Each package has been developed to suit the needs of a particular community, and each package is designed to analyze a different model type. They are discussed in detail in Results and Discussion.

Simulation or exhaustive enumeration of the state space is common practice to analyze discrete models, but it is limited by computational complexity, as the state spaces grows exponentially in the number of variables. When relying on simulation on a standard desktop computer, most software tools are limited to 40 variables or less, i.e., states for Boolean systems. Complex models can be analyzed with SAT or BDD-based (Binary Decision Diagram) algorithms, but these algorithms are usually not available as software tools for a broad range of users. Most implementations are platform specific and require a particular input format that is not used by other software tools. Changing a model to the required format is often a labor-intensive process. GINsim, a package designed for the analysis of gene regulatory networks, uses a method based on binary decision diagrams to efficiently analyze asynchronous logical models, and provides this method without requiring the user to understand the underlying algorithm, but GINsim is specific to logical models and cannot import other discrete modeling types or formats [7].

Here, we present the web-based tool ADAM, Analysis of Dynamic Algebraic Models [15], a tool to study the dynamics of a wide range of discrete models. ADAM provides efficient analysis methods based on mathematical algorithms as a web-based tool for several different input formats, and it makes analysis of complex models accessible to a larger community, as it is platform independent as a web-service and does not require understanding of the underlying mathematics. ADAM is the successor to DVD, Discrete Visualizer of Dynamics [16], a tool to visualize the temporal evolution of small polynomial dynamical systems.

Different types of discrete models, including (probabilistic) Boolean networks, logical networks, Petri nets, cellular automata, and agent-based models, can be converted into the unifying framework of algebraic models, namely polynomial dynamical systems [17, 18]. This allows to apply tools from computational commutative algebra to analyze the models more efficiently than by simulation and without using heuristic methods. We used ADAM on several logical models and on published Boolean models with up to 60 variables [19, 20, 21]. These models are too large for a straight-forward analysis by exhaustive enumeration of the state space, and the corresponding publications contain lengthy explanations and supplementary material that outline the calculations and algorithms used to identify the attractors. ADAM greatly simplifies the analysis; it identified the steady states of these models in less than one second. We believe that complex discrete models will gain more popularity, if sophisticated analysis methods are easily accessible to modelers.

In addition to giving access to mathematical theory for efficient analysis, algebraic models provide a unifying framework and systematic approach for several model types, which allows for an effective comparison of heterogeneous models, such as a Boolean network model and an agent-based model. For community integration to the biological sciences, ADAM contains a model repository of previously published models available in ADAM specific format [22]. This allows new users to familiarize themselves quickly with ADAM and to validate and experiment with existing models. In the following section, we discuss general features of ADAM briefly and explain new features in more detail.

Results and Discussion

A wide range of software and algorithms exist to analyze discrete models. These tools are either limited by complexity as they rely on simulation as analysis method, or they are inaccessible to biologists not familiar with programming, as they are often times only available as platform dependent command-line tools. Furthermore, implementations require different input formats, which hampers the use of different software tools on the same model.

ADAM is an analysis tool for discrete models, available as a web-based tool, which hides and encapsulates the mathematical algorithms from the user. Therefore, users who lack understanding of the underlying mathematics or programming expertise can use efficient algorithms to analyze complex models.

We propose a novel method to identify attractors of a discrete model. This method relies on the fact that many discrete models can be translated into the algebraic framework of polynomial dynamical systems. Using these polynomials, one can construct a system of polynomial equations, such that its solutions correspond to fixed points or limit cycles. Thus, the problem of identifying attractors becomes equivalent to solving a system of polynomial equations over a finite field. This is a long-studied problem in computer algebra, and can usually be solved efficiently by computing a Gröbner basis.

This method is not a new mathematical algorithm to solve polynomial dynamical systems, but a novel approach that uses the fact that attractors are the solutions of polynomial systems derived from the model when expressed in the algebraic framework. ADAM allows users unfamiliar with polynomial dynamical systems or Gröbner basis to benefit from this efficient algorithm.

General Features

ADAM automatically converts discrete models into polynomial dynamical systems, that is time and state discrete dynamical systems described by polynomials over a finite field (see Appendix A.1 for definition and example). The dynamics of the models are then analyzed by using various computational algebra techniques. Even for large systems, ADAM computes key dynamic features, such as steady states, in a matter of seconds. ADAM is available online and free of charge. It is platform independent and does not require the installation of software or a computer algebra system.

ADAM can analyze discrete models. It translates the following inputs into (probabilistic) polynomial dynamical systems and can then analyze all of them except models originating from Petri nets.

  • Logical models generated with GINsim [7]

  • Petri nets generated with Snoopy [9]

  • polynomial dynamical systems

  • Boolean networks

  • probabilistic polynomial dynamical systems, probabilistic Boolean networks (PBN) [3].

We plan to implement analysis methods for Petri nets in future versions.

ADAM’s main application is the analysis of the dynamic features of a model, which includes the identification of stable attractors. These are either steady states, i.e., time-invariant states, or limit cycles, i.e., time-invariant sets of states. ADAM is capable of identifying all steady states and limit cycles of length up to a user-specified length . The process of finding long limit cycles is quite slow for large models, however, in biological models limit cycles are likely to be short, so that can be chosen to be small in general, i.e., less than ten.

The temporal evolution of the model can be visualized by the phase space, a graph of all possible states and their transitions, also called state space or state transition graph. For small enough models, i.e., less than eleven variables, ADAM generates a graph of the complete phase space; for larger models, ADAM uses algebraic algorithms to determine dynamic properties. Independent of network size, ADAM generates a wiring diagram. The wiring diagram, also known as dependency graph, shows the static relationship between the variables. All edges in ADAM’s wiring diagrams are functional edges, that is there exists at least one state such that a change in the input variable causes a change in the output variable (see Appendix A.2 for more details). This means that ADAM determines all non-functional edges, which is oftentimes of interest.

With ADAM, one can also study the temporal evolution of user-specified initial states. The trajectory of a state describes the state’s evolution, and it can be computed by repeatedly applying the transition function until an attractor is reached.

All of these features can be computed assuming synchronous updates or sequential updates according to an update-schedule specified by the user. Note that the steady states are the same independent of the update schedule. This is due to the fact that updating any variable at a steady state does not change its value. It is irrelevant for a steady state analysis whether updates are considered to happen sequentially or simultaneously.

For probabilistic networks, i.e., models in which each variable has several choices of local update rules, ADAM can generate a graph of all possible updates. This means that states in the phase space can have out-degree greater than one, since different transitions are possible. ADAM can find all true steady states, in the context of probabilistic networks, meaning all states that are time-invariant independent of the choice of update function. For further information of probabilistic networks, see [3].

For Boolean networks, ADAM calculates all functional circuits (see Appendix A.2). Positive functional circuits are a necessary condition for multi-stationarity. For a certain class of Boolean networks, namely conjunctive/disjunctive networks, ADAM computes a complete description of the phase space as described in [23]. For further details on conjunctive networks, see Appendix B.2.

In summary, ADAM can generate the following outputs.

  • wiring diagram

  • phase space for small models

  • steady states (for deterministic and probabilistic systems)

  • limit cycles of specified length

  • trajectories originating from a given initial state until a stable attractor is found

  • dynamics for synchronous or sequential updates

  • functional circuits for Boolean networks

  • a complete description of the phase space for conjunctive/disjunctive networks.

Comparison to Other Systems

In this section, we compare ADAM to other state-of-the-art systems. Most software tools discussed here provide different functionality than ADAM does, thus a run-time comparison is not feasible. An overview of the tools and their functionality is given in table 1, and they are discussed in detail below.

Steady States

Limit Cycles

Input

Requirements

ADAM Yes Yes Boolean (or polynomial) functions None, web based
Logical Models (GINsim)
Petri Net (Snoopy)
GINsim Yes for asynchronous networks parameters (non-zero truth tables) Java virtual machine
Logical Model
BoolNet R package Boolean functions R statistics software
Snoopy reachable from a given initial state SBML
P/T graph
DDLab truth tables
GenYsis-P Yes GenYsis-P specific text file Linux that supports the
Toolbox distributed binaries
BN/PBN Matlab truth tables Matlab
Toolbox
Table 1: Comparison of different software tools regarding attractor analysis: less than 1 second on published gene regulatory networks with up to 72 variables; only for short limit cycles; heuristic methods are available for larger networks; asynchronous updates in the sense of logical models; we had no platform available that supported the distributed binaries and could not conduct benchmark calculations; installation necessary, available for common operating systems.

GINsim (Gene Interaction Network simulation) is a package designed for the analysis of gene regulatory networks [7]. As input, it accepts logical models. Logical models are an extension to Boolean models; they consist of similar switch-like rules, but allow for a finer discretization with more than two states per variable, e.g., low, medium, and high. Logical models can be updated synchronously or asynchronously. For the latter, the temporal evolution of a logical model is non-deterministic because the variables are updated randomly in an asynchronous fashion. In either case, updates of every variable are continuous, meaning that no variable changes its value by more than one unit in one time-step, see section Remarks about Logical Models for a detailed discussion.

GINsim provides algorithms that use binary decision diagrams (BDD) for the determination of steady states and oscillatory behavior [4]. For synchronous updates, analysis of limit cycles is only possible by simulating every trajectory, i.e., generating the complete state space, called state transition graph in GINsim, and therefore limited by network size. We tested GINsim on logical models with up to variables; determining the steady states took less than one second. More complex logical networks were not available to us.

Networks are entered manually into GINsim, they cannot be imported from any other format. Furthermore, models are specified by entering their parameters, i.e., entering all values that result in a non-zero target value. Especially for large models, this can be a time consuming process.

BoolNet R package provides methods for inference and analysis of synchronous, asynchronous, and probabilistic Boolean networks [8]. It is a package for the free statistics software R, and it is run via the R command-line. It is helpful, if the user is already familiar with R. Steady state analysis is implemented as exhaustive search of the state space, heuristic search, random walk, or Markov Chain analysis [3]. Non-heuristic analysis is limited to networks with 29 variables. For larger networks, steady states can be inferred heuristically, which does not guarantee that all steady states are identified.

Snoopy is a unifying Petri net framework, containing a family of Petri net modeling tools and algorithms [9]. Snoopy provides built-in simulation and animation. Analysis of Petri nets can be performed, e.g., with the tool Charlie [24]. Charlie identifies structural properties and has algorithms for invariant based or reachability graph based analysis. Reachability graph-based analysis for Petri nets usually depends on a given initial state and does not provide a complete picture of possible dynamics for other initial markings. In addition to marking-dependent analysis, Charlie uses algorithms based on linear algebra to predict the dynamic properties independent of markings, such as T and P-invariants. ADAM converts a Petri net to a collection of polynomial dynamical systems, one system for each transition. Analysis of such a non-deterministic system is currently not implemented in ADAM, and we therefore do not list any run-time results for Petri nets.

DDLab is an interactive graphics software for discrete models, including cellular automata, Boolean and multi-valued networks [10]. As it is mainly a visualization tool, analysis is based on exhaustive enumeration of the state space, and model size is limited to 31 variables.

GenYsis-P Toolbox is a command-line tool currently available only for Linux to analyze (probabilistic) gene regulatory networks. Algorithms use (reduced order) binary decision diagrams. As analysis methods are not based on exhaustive enumeration, GenYsis-P can analyze large networks. Unfortunately, we did not have access to a platform that supports the distributed binaries, and source code was not available.

BN/PBN Toolbox is a toolbox written in Matlab [11]. It uses the state transition matrix to compute attractors. Statistics for networks with more than 27 variables cannot be computed (“Maximum variable size allowed by the program is exceeded”). In addition to analyzing deterministic Boolean networks, the toolbox can analyze probabilistic Boolean networks and calculate statistics such as numbers and sizes of attractors, basins, transient lengths, Derrida curves, percolation on 2-D lattices, and influence matrices.

ADAM calculates all steady states of networks non-heuristically, by applying algebraic algorithms, see section Methods. All of the above software tools provide other functionality that ADAM currently lacks, but for the analysis of synchronous networks, they all are either restricted to less than 32 variables or require familiarity with programming. GINsim calculates steady states for large networks, but models must be specified by their parameters and cannot be entered in a compact form as such (Boolean) functions.

Several Boolean models too large for analysis by exhaustive enumeration have been published, for example on the expression of the segment polarity genes in Drosophila melanogaster or T-cell regulation [20, 21]. ADAM identified all steady states and small limit cycles for these systems in less than one second, whereas the publications that comprise the Boolean models, contain lengthy explanations outlining the analysis. The model files in ADAM format can be accessed at [22].

Remarks about Logical Models

ADAM allows for synchronous or sequential updates according to a given update schedule. In models with synchronous updates, all variables are updated simultaneously at every time step. In models with sequential updates, all variables are updated at every time-step, but in the order of the given update schedule. Models with sequential updates can be converted into synchronous models with identical state space. In models with asynchronous updates, as it is common for logical models, one variable is updated at random at every time step, which results in a non-deterministic model. Sequential and asynchronous updates of the same system result in different dynamics.

In GINsim, all models are “continuous” in the sense that at each time-step, each variable increases or decreases by at most one unit. Though logical models are discrete, there are no jumps skipping intermediate states. For example, in a model with three states, low, medium, and high, no variable can drop from high to low in a single update step. This interpretation is different from the common meaning of continuous, which usually refers to models of ordinary or partial differential equations. The parameters entered in GINsim specify the target value towards which the variable changes, i.e., the value increases by one, decreases by one, or remains constant if the target value is larger, smaller, or equal than the initial value, respectively. The phase space generated with ADAM might differ from the state transition graph generated in GINsim. To obtain the exact same phase space, every variable in the logical model must contain an explicit self-loop, and all parameters must be entered such that the target value differs by at most one from the value of the variable to be updated. Any logical model can be specified in this way without changing its state transition graph. Boolean models are always continuous.

In multi-valued logical models, variables can have different maximum values. In an algebraic model, all variables are defined over the same algebraic field, i.e., have the same maximum value. When a multi-valued logical model is translated into an algebraic model, extraneous states might be introduced such that all variables are defined over the same field. An example of such an extension is given in table 2, the extra states are the states in the last row, which are given the same values as the states above to extend the model in a meaningful way. The extra states should be ignored when analyzing the dynamics. For more details, see [17].

next state of low medium high
absent low medium high
present medium high high
extension present medium high high
Table 2: Updates for variable in a logical model, where depends on and itself. The states and represent absent and present for the Boolean variable ; , , and represent low, medium, and high for the multi-valued variable . The last row is introduced in the polynomial dynamical system such that all variables are defined over . The extra states in the state space should be ignored when interpreting the dynamics.

Remarks about Petri Nets

In the Petri net community, state space usually refers to all states (markings) reachable for a given initial state. In this manuscript, state space refers to the set of all possible states, independent of an initial state.

Translating a bounded Petri net to an algebraic model results in a set of polynomial dynamical systems, where every transition corresponds to one system. In a Petri net, different firing sequences can lead to different markings; a firing sequence relates to the order in which the different systems are iterated. The update schedule is not related to the firing sequence. As firing does not consume any time, polynomial systems describing a Petri net always use synchronous updates [25].

The term functional edge is not related to the concept of liveness of a transition. The liveness of a transition depends on an initial marking. An edge, connecting two places (source and target), is functional, if there exists a marking, such that changing only the marking of the source place, changes the marking of the target place (see Appendix A.2 for more details on functional edges).

Application

We show how to use ADAM on a well-understood model of the expression pattern of the segment polarity genes in Drosophila melanogaster. Albert and Othmer developed a model for embryonic pattern formation in the fruit fly Drosophila melanogaster [20]. Their Boolean model consists of 60 variables, resulting in a state space with more than states. They analyze the model for steady states by manually solving a system of Boolean equations. They also analyze the temporal evolution of a specific initial state corresponding to the wild type expression pattern by repeatedly applying the Boolean update rules until a steady state is found. The update schedule of the model is synchronous with the exception of activation of SMO and the binding of PTC to HH (activation of PH), which are assumed to happen instantaneously. This can be accounted for by substituting the equations for SMO and PH into the update rules for other genes and proteins, rather than using SMO and PH themselves.

To analyze the model, we first rename the variables in the Boolean rules given in [20] such as or to , to standardize their format. Then we use ADAM: the model type is Polynomial Dynamical Systems, the number of states in a Boolean model is , representing present or absent. One can choose Boolean, and enter the Boolean rules in the text-area or upload a text file with the Boolean rules. Alternatively, one can first convert the Boolean rules to polynomials over , and enter the polynomials with the choice Polynomial. The file with the polynomial equations for the model can be accessed at [22].

The rules in the model file are specified in Polynomial form. Once the polynomials are uploaded, we need to set the Analysis type. The model with variables is too complex for exhaustive enumeration, and we choose Algorithm. This means that instead of exhaustive enumeration of the state space, analysis of the dynamics is done via computer algebra by solving systems of equations. In Options, we set Limit cycle length to one because we are interested in the steady states, i.e., time-invariant states. We chose synchronous as updating scheme. Once these choices have been made, we obtain the steady states by clicking Analyze. ADAM returns a link to the wiring diagram or dependency graph, which captures the static relations between the different variables. In addition, ADAM returns the number of steady states and the steady states themselves, see figure 1. These steady states are identical to those found in [20], half of which have been observed experimentally.

Figure 1: ADAM: Analysis of steady states of Drosophila model. Each row in the table corresponds to a stable attractor. Attractors are written as binary strings, where represents non-expression of a gene (or low concentration of a protein), and expression (or high concentration)

Each row in the table corresponds to a stable attractor. Attractors are written as binary strings, where represents non-expression of a gene (or low concentration of a protein), and expression (or high concentration), e.g.,

(1)

corresponds to the genes (and proteins) being expressed ( or present in high concentration) in four cells form anterior to posterior compartments (compartment to ) as shown in table 3.

compartment 1 en, EN, hh, HH, SMO
compartment 3 ptc, PTC, PH, SMO, ci, CI, CIA
compartment 3 SLP, PTC, ci, CI, CIR
compartment 4 SLP, wg, WG, ptc, PTC, PH, SMO, ci, CI, CIA
Table 3: Genes and proteins present in steady state 1

This is the steady state obtained in [20] when starting the system with an initial state representing the experimental observations of stage 8 embryos.

ADAM can also generate trajectories for a given initial state. For example, we can choose the initial state that was used in [20] representing stage 8 embryos. Again, we enter Polynomial Dynamical Systems with as the number of states and upload the polynomials describing the model. Instead of Algorithms, we now choose Simulation. Since we are not interested in the number of steady states or the complete phase space, but in a single trajectory originating from a specific initial state, we choose One trajectory starting at an initial state as the simulation option. As initial state we enter

By clicking Analyze, we obtain the temporal evolution of this particular state until it reaches a steady state. As predicted in [20], the steady state is the state described in table 3, see Fig. 2.

Figure 2: ADAM: Trajectory of Drosophila model

To summarize, ADAM correctly identified the steady states in less than one second. All steady states have been determined previously in [20] by labor-intensive manual investigation of the system.

Furthermore, we used ADAM to verify that there are no limit cycles of length two or three. The model has not been analyzed previously for limit cycles. The absence of two- and three cycles strengthens confidence in the model, since oscillatory behavior has not been observed experimentally. Computations for limit cycles of length greater than three have not been conducted, as composing the system several times with itself is computationally complex. The model file in ADAM format can be accessed at [22].

Benchmark Calculations

We analyzed logical models available in the GINsim model repository [19] as of August 2010. The repository consists of models in GINsim XML format previously published in peer-reviewed journals. We converted all but two models into polynomial dynamics systems. For these 26 models we computed the steady states. All calculations finished in less than 1,5 seconds, see Figure 3.

In addition to the published models in [19], we analyzed randomly generated networks that have the same structure that we expect from biological systems, namely sparse, i.e., while the number of nodes in a biological network may be quite large, each node is affected only by a small number of other nodes, and robust, i.e., small number of attractors.. We tested a total of 50 networks with 50-150 nodes ( states) and an average of average in-degrees of . The steady state calculations took less than half a second for each network on a 2.7 GHz computer.

Figure 3: Runtime of steady state calculations of several logical models from [19]. Executed on a 2.7 GHz computer.

Architecture

ADAM is available as an web-based tool, that does not require any software installation. ADAM’s user interface is implemented in HTML. We use JavaScript to generate a dynamic website that adapts as the user makes various choices. This simplifies the process of entering a model. For example, after defining the model type, i.e., Polynomial Dynamical System, Probabilistic Network, Petri net, and Logical Model the next line changes to the number of states, -bound, or nothing, appropriately. Input can be entered directly into the text-area, or uploaded as a text document.

All mathematical algorithms are programmed in Macaulay2 [26]. Macaulay2 is a powerful computer algebra system. The routines for which fast execution is crucial are implemented in C/C++ as part of the Macaulay2 core. Logical Models and Petri nets in XML format are parsed using Ruby’s XmlSimple library. The interplay between HTML and Macaulay2 is also programmed in Ruby.

Output graphs are generated with Graphviz’s dot command. When Simulation is chosen as analysis method, Graphviz’s ccomps - connected components filter for graphs is used to count the connected components. A Perl script directs the execution of the Graphviz commands.

Model Repository

A model repository is part of the ADAM website [22]. The repository consists of a collection of several previously published models in ADAM format. The models are extracted from publications, and rewritten in ADAM specific format, i.e., all variables are renamed to and the update rules from the original publication are reformulated as Boolean rules or polynomials. The central repository with models in a unified framework allows for quick verification and experimentation with published models. By changing parameters or initial states, users can gain a better understanding of the models.

New users can also use the repository to quickly familiarize themselves with the main functionalities of ADAM. In addition to the model itself, the database entries contain a short summary of the biological system and relevant graphs, together with an analysis of dynamic features determined by ADAM and their biological explanation. The repository is work in progress by researchers from several institutions generating more entries for the repository. We invite all interested researchers to submit their models.

Because of their intuitive nature, discrete models are an excellent introduction to mathematical modeling for students of the life sciences. ADAM’s model repository is a great starting point to familiarize students with the abstraction of discrete models such as Boolean networks.

Conclusions

Discrete modeling techniques are a useful tool for analyzing complex biological systems and there is a need in the biological community for easily accessible analysis software. ADAM provides efficient methods as a web-based tool and will allow a larger community to use complex modeling techniques, as it is platform independent and does not require the user to understand the underlying mathematics. Upon translating discrete models, such as logical networks, Petri nets, or agent-based models into algebraic models, rich mathematical theory becomes available for model analysis.

After extensive experimentation with both discrete models arising in systems biology and randomly generated networks, we found that the algebraic algorithms presented in this manuscript are fast for sparse systems with few attractors, a structure maintained by most biological systems. All algorithms have been included in the software package ADAM[15], which is user-friendly and available as a free web-based tool. ADAM is highly suitable to be used in a classroom as a first introduction to discrete models because students can use it without going through an installation process.

ADAM provides methods to analyze the key dynamic features, such as steady states and limit cycles, for large-scale (probabilistic) Boolean networks and logical models. ADAM unifies different modeling types by providing analysis methods for all of them and thus can be used by a larger community.

We hope to expand ADAM to a more comprehensive Discrete Toolkit which incorporates new analysis methods, better visualization, and automatic conversion for more model types. We also hope to analyze controlled algebraic models and expand theory to stochastic systems.

Methods

Logical models, Petri nets, and Boolean networks are converted automatically into the corresponding polynomial dynamical system as described in [17], so that algorithms from computational algebra can be used to analyze the dynamics. In polynomial dynamical systems over a finite field, states of a variable are assigned to values in the field, and the update (or transition) rule for each variable is given as a polynomial rather than a Boolean or logical expression. For more details, see appendix A.1. Using these polynomials, one can construct systems of polynomial equations, such that their solutions correspond to fixed points or limit cycles. Thus, the problem of identifying attractors becomes equivalent to solving a system of polynomial equations over a finite field. This is a long-studied problem in computer algebra, and can usually be solved efficiently by computing a Gröbner basis.

Gröbner basis calculation is for polynomial systems what Gauss-Jordan elimination is for linear systems: a structured way to transform the original system to triangular shape without changing its solution space. The triangular shape of the resulting systems allows for stepwise retrieval of the solutions of the system. For a more in depth discussion of Gröbner bases, see for example [27].

In the worst case, computing Gröbner bases for a set of polynomials has complexity doubly exponential in the number of solutions to the system. However, in practice, Gröbner bases are computable in a reasonable time. It has been suggested, that in robust gene regulatory networks genes are regulated by only a handful of regulators [28]. Thus, the polynomial dynamical systems representing such biological networks are sparse, i.e., each function depends only on a small subset of the model variables. From our experience, a Gröbner basis calculation for sparse systems with few attractors, a structure common for biological systems, is actually quite fast.

Appendix A Mathematical Background

a.1 Polynomial Dynamical Systems

To be self-contained, we briefly explain polynomial dynamical systems and their key features.

A polynomial dynamical system (PDS) [29] over a finite field is a function

with coordinate functions . Iteration of results in a time-discrete dynamical system. A PDS can be used to describe the dynamic behavior of a biological system: every variable corresponds to a biological substrate, for example a protein or gene, and the polynomials describe the evolution of depending on the previous state of the variables .

PDS have several dynamic features of biological relevance. These include the number of components, component sizes, steady states, limit cycles, and limit cycle lengths.

  • Let and with

    The wiring diagram of , which shows the static interaction of the three variables, is depicted in Figure 4 (left) along with its phase space in Figure  missing4 (right). The phase space shows the temporal evolution of the system. Each state is represented as a vector of the values of the three variables . The PDS described by has two stable attractors: a steady state, , and a limit cycle of length three, consisting of the states , , and .

Figure 4: (left) Wiring diagram: static relationship between variables (right) Phase space: temporal evolution of the system

A probabilistic PDS over a finite field is a collection of functions

together with a probability distribution for every coordinate that assigns the probability that a specific function is chosen to update that coordinate. The coordinate functions are elements in . Probabilistic PDS, specifically Boolean probabilistic networks (PBN), have been studied extensively in [3]. ADAM analyzes probabilistic PDS. It can simulate the complete phase space for small enough models, by generating every possible transition and labeling the edge with its probability according to the distribution. If no distribution is given, ADAM assumes a uniform distribution on all functions. For large networks, ADAM’s Algorithm choice computes steady states of probabilistic networks.

a.2 Functional Edges

An edge in the wiring diagram from to is considered functional, if there exists a state such that , where and are values for , in other words, if there is at least one state, such that changing only but keeping all other values fixed, changes the next state of . In ADAM, all edges in the wiring diagram are functional. For Boolean networks, ADAM identifies all functional elementary circuits. An elementary circuit is a finite closed paths in the wiring diagram where all the nodes are distinct. Functional Circuits are a necessary condition for multi-stationarity and limit cycles. For a further discussion of functional circuits, see [4]. For multivalued networks, circuit analysis has not yet been implemented.

Appendix B Algorithms

b.1 Analysis of stable attractors

Every attractor in a PDS is either a steady state or a limit cycle. For small models, ADAM determines the complete phase space by enumeration, for large models, ADAM computes steady states and limit cycles of a given length. A state is a steady state, if it transitions to itself after one update of the system. A state is part of a limit cycle of length , if, after updates, it results in itself. Any steady state of a PDS satisfies the equation , as no coordinate of is changing as it is updated. Similarly, states of a limit cycle of length satisfy the equation . ADAM computes all steady states by solving the system for simultaneously. To efficiently solve the resulting systems of polynomial equations, we first compute the Gröbner basis in lexicographic order for the ideal generated by the equations. By the elimination and extension theorem, choosing a lexicographic order allows to easily obtain the solutions [27]. We use the Gröbner basis algorithms distributed with Macaulay2, a computer algebra system, and found that for quotient rings over a finite field the implementation ‘Sugarless’ is more efficient than the default algorithm with ‘Sugar’ [26, 30]. For limit cycles of length , the solutions of are found and then grouped into cycles, by applying to each of the solutions.

  • missingFixed points of the system shown in the example in A.1 are solutions in of the system :

    The only solution to this systems is the point . This is in accordance with the state space depicted in figure 4: is the only fixed point. To investigate limit cycles of length two, one has to look at the system ,

    Again, is the only solution, which means that there are no limit cycles of length two.

    Investigating ,

    results in the solutions . is a fixed point, and are elements of a limit cycle of length three. For all , has no solutions, that means the system has exactly two attractors, a fixed point a a limit cycle of length three.

b.2 Conjunctive/Disjunctive Networks

Some classes of networks have a certain structure that can be exploited to achieve faster calculations. Jarrah et al. show that for conjunctive (disjunctive) networks key dynamic features can be found with almost no computational effort [23]. Conjunctive (respectively disjunctive) networks consist of functions using only the AND (respectively OR) operator. ADAM comes with an implementation of this algorithm to analyze dynamics in the case of conjunctive (disjunctive) networks. Currently, this option is only implemented for networks with strongly connected dependency graphs.

Authors’ contributions

FH led the algorithm and software development group. BG, MB, and RM implemented the user interface and attractor analysis, executed benchmarking calculations, and drafted initial manuscript under FH’s leadership. AV implemented the translation for logical algorithms to PDS used by ADAM. GB participated in the software design effort and algorithm development. RL conceived of the study, provided overall leadership of the project, and secured funding for it. He also contributed to the writing and editing of the manuscript. All authors read and approved the final manuscript.

Acknowledgements

Dimitrova, Clemson University; J. Adeyeye, Winston-Salem State University; B. Stigler, Southern Methodist University; R. Isokpehi, Jackson State University are currently expanding ADAM’s Model Repository. Funding for this work was provided through U.S. Army Research Office Grant Nr. W911NF-09-1-0538, National Science Foundation Grant Nr. CMMI-0908201, and National Science Foundation Grant Nr. 0755322.

References

  • 1. Steggles LJ, Banks R, Shaw O, Wipat A: Qualitatively modelling and analysing genetic regulatory networks: a Petri net approach. Bioinformatics 2007, 23:336–343.
  • 2. Sackmann A, Heiner M, Koch I: Application of Petri net based analysis techniques to signal transduction pathways. BMC Bioinformatics 2006, 7:482, [http://www.biomedcentral.com/1471-2105/7/482].
  • 3. Shmulevich I, Dougherty ER, Kim S, Zhang W: Probabilistic Boolean networks: a rule-based uncertainty model for gene regulatory networks. Bioinformatics 2002, 18(2):261–274, [http://dx.doi.org/10.1093/bioinformatics/18.2.261].
  • 4. Naldi A, Thieffry D, Chaouiya C: Decision diagrams for the representation and analysis of logical models of genetic networks. In CMSB’07: Proceedings of the 2007 international conference on Computational methods in systems biology, Berlin, Heidelberg: Springer-Verlag 2007:233–247.
  • 5. Bonchev D, Thomas S, Apte A, Kier LB: Cellular automata modelling of biomolecular networks dynamics. SAR and QSAR in Environmental Research 2010, 21:77–102, [http://www.informaworld.com/10.1080/10629360903568580].
  • 6. Laubenbacher R, Jarrah AS, Mortveit H, Ravi SS: Encyclopedia of Complexity and System Science, New York: Springer Verlag 2009 chap. A mathematical foundation for agent-based computer simulation.
  • 7. Naldi A, Berenguier D, Fauré A, Lopez F, Thieffry D, Chaouiya C: Logical modelling of regulatory networks with GINsim 2.3. Biosystems 2009, 97(2):134–139.
  • 8. Müssel C, Hopfensitz M, Kestler HA: BoolNet—an R package for generation, reconstruction and analysis of Boolean networks. Bioinformatics 2010, 26(10):1378–1380.
  • 9. Rohr C, Marwan W, Heiner M: Snoopy—a unifying Petri net framework to investigate biomolecular networks. Bioinformatics 2010, 26(7):974–975.
  • 10. Wuensche A: DDLab - Discrete Dynamics Lab. Available at http://www.ddlab.com/ 2010.
  • 11. Shmulevich I, Lähdesmäki H: BN/PBN MATLAB Toolbox. Available at http://personal.systemsbiology.net/ilya/PBN/PBN.htm 2005.
  • 12. Garg A, Banerjee D, De Micheli G: Implicit methods for probabilistic modeling of Gene Regulatory Networks. In Engineering in Medicine and Biology Society, 2008. EMBS 2008. 30th Annual International Conference of the IEEE 2008:4621 –4627.
  • 13. Dubrova E, Teslenko M: A SAT-Based Algorithm for Finding Attractors in Synchronous Boolean Networks. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2010, 99(PrePrints).
  • 14. Ruths D, Muller M, Tseng JTT, Nakhleh L, Ram PT: The signaling petri net-based simulator: a non-parametric strategy for characterizing the dynamics of cell-specific signaling networks. PLoS computational biology 2008, 4(2):e1000005+, [http://dx.doi.org/10.1371/journal.pcbi.1000005].
  • 15. Hinkelmann F, Brandon M, Guang B, McNeill R, Veliz-Cuba A, Laubenbacher R: ADAM, Analysis of Dynamic Algebraic Models. Available at http://adam.vbi.vt.edu.
  • 16. Laubenbacher R: DVD - Discrete Visualizer of Dynamics. Available at http://dvd.vbi.vt.edu/.
  • 17. Veliz-Cuba A, Jarrah AS, Laubenbacher R: Polynomial algebra of discrete models in systems biology. Bioinformatics 2010, 26(13):1637–1643, [http://dx.doi.org/10.1093/bioinformatics/btq240].
  • 18. Hinkelmann F, Murrugarra D, Jarrah A, Laubenbacher R: A Mathematical Framework for Agent Based Models of Complex Biological Networks. Bulletin of Mathematical Biology 2010, :1–20, [http://dx.doi.org/10.1007/s11538-010-9582-8]. [10.1007/s11538-010-9582-8].
  • 19. Naldi A, Thieffry D, Chaouiya C: GINsim - model repository. Available at http://gin.univ-mrs.fr/GINsim/model_repository.html.
  • 20. Albert R, Othmer HG: The topology of the regulatory interactions predicts the expression pattern of the segment polarity genes in Drosophila melanogaster. Journal of Theoretical Biology 2003, 223:1–18.
  • 21. Pedicini M, Barrenäs F, Clancy T, Castiglione F, Hovig E, Kanduri K, Santoni D, Benson M: Combining network modeling and gene expression microarray analysis to explore the dynamics of Th1 and th2 cell regulation. PLoS computational biology 2010, 6(12), [http://dx.doi.org/10.1371/journal.pcbi.1001032].
  • 22. Hinkelmann F: Model Repository of Discrete Biological Systems. Available at http://dvd.vbi.vt.edu/repository.html.
  • 23. Jarrah A, Laubenbacher R, Veliz-Cuba A: The Dynamics of Conjunctive and Disjunctive Boolean Network Models. Bulletin of Mathematical Biology 2010, 72:1425–1447, [http://dx.doi.org/10.1007/s11538-010-9501-z]. [10.1007/s11538-010-9501-z].
  • 24. Franzke A: Charlie 2.0 - a multi-threaded Petri net analyzer; Diploma Thesis,. Available at http://www-dssz.informatik.tu-cottbus.de/index.html?/software/charlie.html 2009.
  • 25. Heiner M, Gilbert D, Donaldson R: Petri nets for systems and synthetic biology. In Proceedings of the Formal methods for the design of computer, communication, and software systems 8th international conference on Formal methods for computational systems biology, SFM’08, Berlin, Heidelberg: Springer-Verlag 2008:215–264, [http://portal.acm.org/citation.cfm?id=1786698.1786706].
  • 26. Grayson DR, Stillman ME: Macaulay2, a software system for research in algebraic geometry. Available at http://www.math.uiuc.edu/Macaulay2/ 1992.
  • 27. Cox DA, Little J, O’Shea D: Ideals, Varieties, and Algorithms: An Introduction to Computational Algebraic Geometry and Commutative Algebra, 3/e (Undergraduate Texts in Mathematics). Secaucus, NJ, USA: Springer-Verlag New York, Inc. 2007.
  • 28. Robert D Leclerc: Survival of the sparsest: robust gene networks are parsimonious. Mol Syst Biol 2008, 4, [http://dx.doi.org/10.1038/msb.2008.52].
  • 29. Jarrah AS, Laubenbacher R, Stigler B, Stillman ME: Reverse-engineering of polynomial dynamical systems. Adv Appl Math 2007, 39:477–489.
  • 30. Giovini A, Mora T, Niesi G, Robbiano L, Traverso C: “One sugar cube, please” or selection strategies in the Buchberger algorithm. In ISSAC ’91: Proceedings of the 1991 international symposium on Symbolic and algebraic computation, New York, NY, USA: ACM 1991:49–54.

Figures

Figure 1 - ADAM: Analysis of steady states of Drosophila model. Each row in the table corresponds to a stable attractor. Attractors are written as binary strings, where 0 represents non-expression of a gene (or low concentration of a protein), and 1 expression (or high concentration)

Steady states of Drosophila Melanogaster as found with ADAM.

Figure 2 - ADAM: Trajectory of Drosophila model

Temporal evolution of given initial state until steady state is reached.

Figure 3 - Runtime of steady state calculations of several logical models from [17]. Executed on a 2.7 GHz computer.

Runtime of steady state calculations of several logical models from [17]. Executed on a 2.7 GHz computer.

Figure 4 - (left) Wiring diagram: static relationship between variables (right) Phase space: temporal evolution of the system.

(left) Wiring diagram: static relationship between variables (right) Phase space: temporal evolution of the system.

Table 1 - Comparison of different software tools regarding attractor analysis: less than 1 second on published gene regulatory networks with up to 72 variables; only for short limit cycles; heuristic methods are available for larger networks; asynchronous updates in the sense of logical models; we had no platform available that supported the distributed binaries and could not conduct benchmark calculations; installation necessary, available for common operating systems.

Comparison of different softwares regarding attractor analysis.

Table 2 - Updates for variable x2 in a logical model, where x2 depends on x1 and itself. The states 0 and 1 represent absent and present for the Boolean variable x1; 0, 1, and 2 represent low, medium, and high for the multi-valued variable x2. The last row is introduced in the polynomial dynamical system such that all variables are defined over F3. The extra states (2, 0), (2, 1), (2, 2) in the state space should be ignored when interpreting the dynamics.

Extending state space when converting logical model to polynomial dynamical system.

Table 3 - Genes and proteins present in steady state

Steady state

Comments 0
Request Comment
You are adding the first comment!
How to quickly get a good reply:
  • Give credit where it’s due by listing out the positive aspects of a paper before getting into which changes should be made.
  • Be specific in your critique, and provide supporting evidence with appropriate references to substantiate general statements.
  • Your comment should inspire ideas to flow and help the author improves the paper.

The better we are at sharing our knowledge with each other, the faster we move forward.
""
The feedback must be of minimum 40 characters and the title a minimum of 5 characters
   
Add comment
Cancel
Loading ...
11740
This is a comment super asjknd jkasnjk adsnkj
Upvote
Downvote
""
The feedback must be of minumum 40 characters
The feedback must be of minumum 40 characters
Submit
Cancel

You are asking your first question!
How to quickly get a good answer:
  • Keep your question short and to the point
  • Check for grammar or spelling errors.
  • Phrase it like a question
Test
Test description