Using Firing-Rate Dynamics to Train
Recurrent Networks of Spiking Model Neurons
Brian DePasquale, Mark M. Churchland, L.F. Abbott
Department of Neuroscience
Grossman Center for the Statistics of Mind
Department of Physiology and Cellular Biophysics
Columbia University College of Physicians and Surgeons
New York NY 10032-2695 USA
Recurrent neural networks are powerful tools for understanding and modeling computation and representation by populations of neurons. Continuous-variable or “rate” model networks have been analyzed and applied extensively for these purposes. However, neurons fire action potentials, and the discrete nature of spiking is an important feature of neural circuit dynamics. Despite significant advances, training recurrently connected spiking neural networks remains a challenge. We present a procedure for training recurrently connected spiking networks to generate dynamical patterns autonomously, to produce complex temporal outputs based on integrating network input, and to model physiological data. Our procedure makes use of a continuous-variable network to identify targets for training the inputs to the spiking model neurons. Surprisingly, we are able to construct spiking networks that duplicate tasks performed by continuous-variable networks with only a relatively minor expansion in the number of neurons. Our approach provides a novel view of the significance and appropriate use of “firing rate” models, and it yields a useful method for building model spiking networks that can be used to address important questions about representation and computation in neural systems.
A fundamental riddle of nervous system function is the disparity between our continuous and comparatively slow sensory percepts and motor actions and the neural representation of those percepts and actions by brief, discrete and spatially distributed action potentials. A related puzzle is the reliability with which these signals are represented despite the variability of neural spiking across nominally identical performances of a behavior. A useful approach to addressing these issues is to build spiking model networks that perform relevant tasks, but this has proven difficult to do. Here we develop a method for constructing functioning networks of spiking model neurons that perform a variety of tasks while embodying the variable character of neuronal activity. In this context, “task” refers to a computation performed by a biological neural circuit.
There have been previous successes constructing spiking networks that perform specific tasks (see for example Seung et al. (2000); Wang (2002); Machens, Romo & Brody (2005); Hennequin et al. (2014)). In addition, more general procedures have been developed (reviewed in Abbott, DePasquale & Memmesheimer (2016)) that construct spiking networks that duplicate systems of linear (Eliasmith, 2005; Boerlin & Denève, 2011; Boerlin, Machens & Denève, 2013) and nonlinear (Eliasmith, 2005; Thalmeier et al., 2015) equations. However, most tasks of interest to neuroscientists, such as action choices based on presented stimuli, are not expressed in terms of systems of differential equations.
Our work uses continuous-variable network models (Sompolinsky, Crisanti & Sommers, 1988), typically called “rate” networks, as an intermediary between conventional task descriptions in terms of stimuli and responses, and spiking network construction. This results in a general procedure for constructing spiking networks that perform a wide variety of tasks of interest to neuroscience (see also Thalmeier et al. (2015); Abbott, DePasquale & Memmesheimer (2016)). We apply this procedure to example tasks and show how constraints on the sparseness and sign (Dale’s law) of network connectivity can be imposed. We also build a spiking network model that matches multiple features of data recorded from neurons in motor cortex and from arm muscles during a reaching task.
The focus of our work is the development of a procedure for constructing recurrently connected networks of spiking model neurons. We begin by describing the model-building procedure and then present examples of its use.
Network architecture and network training
The general architecture we consider is a recurrently connected network of $N$ leaky integrate-and-fire (LIF) model neurons that receives task-specific input $f_{\rm in}(t)$ and, following training, produces an approximation of a specified “target” output signal $f_{\rm out}(t)$ (Figure 1a). $f_{\rm in}(t)$ can be thought of as external sensory input or as input from another neural network, and $f_{\rm out}(t)$ as the input current into a downstream neuron or as a more abstractly defined network output (for example, a motor output signal or a decision variable). The neurons in the network are connected to each other by synapses with strengths denoted by the $N \times N$ matrix $J$. Connections from the network to the output have strengths given by an $N_{\rm out} \times N$ matrix $W$, where $N_{\rm out}$ is the number of outputs (either 1 or 2 in the examples we provide). During network training, both $J$ and $W$ are modified. In addition to the trained connections described by $J$, we also include random connections defined by another matrix, $J^f$. The elements of this matrix are chosen randomly from a Gaussian distribution with zero mean and variance $g^2/N$ and are not modified during training. The values of $g$ and $N$ are given below for the different examples we present. This random connectivity produces chaotic spiking in the network (van Vreeswijk & Sompolinsky, 1998; Brunel, 2000), which we use as a source of spiking irregularity and trial-to-trial variability. We use the parameter $g$ to control the level of this variability.
When a neuron in the network fires an action potential, it contributes both fast and slow synaptic currents to other network neurons. These currents are described by the two $N$-dimensional vectors $s$ and $f$. When neuron $j$ in the network fires an action potential, component $j$ of both $s$ and $f$ is incremented by 1; otherwise

$$\tau_s \frac{ds}{dt} = -s \,, \qquad \tau_f \frac{df}{dt} = -f \,. \qquad (1)$$
The two time constants $\tau_s$ and $\tau_f$ determine the decay times of these slow and fast synaptic currents, with $\tau_s$ set considerably longer than $\tau_f$. The synapses that are modified by training are all of the slow type, while the random synapses are fast. In particular, the output of the network is the product of $s$ and the output weight matrix $W$, that is, $W s(t)$ (Figure 1a).
The membrane potentials of the model neurons, collected together into an $N$-component vector $V$, obey the equation

$$\tau_m \frac{dV}{dt} = -V + J s + J^f f + u_0\, u\, f_{\rm in}(t) + I_{\rm bias} \,, \qquad (2)$$
where $\tau_m$ is the membrane time constant. For a case with $N_{\rm in}$ inputs, $u$ is an $N \times N_{\rm in}$ matrix (we consider $N_{\rm in} = 1$ and 2) with elements drawn independently from a uniform distribution between -1 and 1. $I_{\rm bias}$ is a bias current set equal to 10 mV. It is increased between trials in the examples of Figures 3 and 4, representing a “holding” input. Each neuron fires an action potential when its membrane potential reaches a threshold $V_{\rm th}$ and is then reset to $V_{\rm reset}$ (both specified in mV). Following an action potential, the membrane potential is held at the reset potential for a refractory period of 2 ms unless stated otherwise. The parameter $u_0$ controls the strength of the inputs to each neuron, and we provide its value for the different examples below.
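To make these dynamics concrete, here is a minimal simulation sketch of a network of this form. All specific values (network size, time constants, threshold, reset, bias, and the connectivity scale g) are illustrative assumptions, not the parameters used for the results below; the trained matrix J starts at zero, the external input term is omitted, and the refractory period is left out for brevity.

```python
import numpy as np

# Sketch of the untrained spiking network: LIF neurons with slow (s) and
# fast (f) exponentially decaying synaptic currents. All parameter values
# here are assumptions for illustration only.
rng = np.random.default_rng(0)
N, dt = 200, 1e-4                        # neurons; integration step (s)
tau_s, tau_f, tau_m = 0.1, 0.005, 0.01   # slow, fast, membrane (s)
g = 1.5                                  # scale of the random fast weights
J = np.zeros((N, N))                     # trained slow weights, initially 0
Jf = rng.normal(0.0, g / np.sqrt(N), (N, N))  # random fast weights
V_th, V_reset, I_bias = 1.0, 0.0, 1.2    # assumed threshold/reset/bias
V = rng.uniform(V_reset, V_th, N)
s, f = np.zeros(N), np.zeros(N)
spike_count, n_steps = 0, 5000           # simulate 500 ms
for _ in range(n_steps):
    s += dt * (-s / tau_s)               # slow currents decay
    f += dt * (-f / tau_f)               # fast currents decay
    V += dt * (-V + J @ s + Jf @ f + I_bias) / tau_m
    spiked = V >= V_th
    V[spiked] = V_reset                  # reset (refractory period omitted)
    s[spiked] += 1.0                     # each spike increments both
    f[spiked] += 1.0                     # synaptic current vectors by 1
    spike_count += int(spiked.sum())
rate = spike_count / (N * n_steps * dt)  # mean firing rate (Hz)
```

Because the bias exceeds threshold in this sketch, neurons fire tonically and the random fast connectivity supplies irregular fluctuations around that firing.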
We can now specify the goal and associated challenges of network training. The goal is to modify the entries of $J$ and $W$ so that the network performs the task specified by $f_{\rm in}(t)$ and $f_{\rm out}(t)$, meaning that

$$W s(t) \approx f_{\rm out}(t) \qquad (3)$$
when the network responds to $f_{\rm in}(t)$ (with the approximation being as accurate as possible). Equation 3 stipulates that $s(t)$ must provide a basis for the function $f_{\rm out}(t)$. If it does, it is straightforward to compute the optimal $W$ by minimizing the squared difference between the two sides of equation 3, averaged over time. This can be done either recursively (Haykin, 2002) or using a standard batch least-squares approach.
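The batch least-squares fit of the readout can be sketched in a few lines; the matrix of sampled synaptic currents below is synthetic, standing in for s(t) recorded at many times while the network runs.

```python
import numpy as np

# Batch least-squares readout: find W minimizing the time-averaged squared
# difference between W s(t) and f_out(t), as in equation 3. Synthetic
# samples stand in for the recorded slow synaptic currents.
rng = np.random.default_rng(1)
T, N, N_out = 1000, 50, 1
S = rng.normal(size=(T, N))          # row t is a sample of s(t)
W_true = rng.normal(size=(N, N_out))
F_out = S @ W_true                   # a target that s(t) can represent
W, *_ = np.linalg.lstsq(S, F_out, rcond=None)
err = np.linalg.norm(S @ W - F_out) / np.linalg.norm(F_out)
```

When the target lies in the span of the sampled currents, as constructed here, the fit is essentially exact; in practice the residual measures how well s(t) provides a basis for f_out(t).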
Determining the optimal $J$ is significantly more challenging because of the recurrent nature of the network. $J$ must be chosen so that the input it provides to the network neurons, $J s(t)$, generates a pattern of spiking that, in turn, produces $s(t)$. The circularity of this constraint is what makes recurrent network learning difficult. The difference between the easy problem of computing $W$ and the difficult problem of computing $J$ is that, in the case of $W$, we have the target $f_{\rm out}(t)$ in equation 3 that specifies what signal $W s(t)$ should produce. For $J$, it is not obvious what the input $J s(t)$ that it generates should be.
Suppose that we did have targets analogous to $f_{\rm out}(t)$ but for computing $J$ (we call them auxiliary target functions and denote them by the $N$-component vector $F(t)$). Then $J$, like $W$, could be determined by a least-squares procedure, that is, by minimizing the time-averaged squared differences between the two sides of

$$J s(t) \approx F(t) \,. \qquad (4)$$
There are stability issues associated with this procedure, which we discuss below, but the main challenge in this approach is determining the appropriate auxiliary target functions. Our solution to this problem is to obtain them from a continuous-variable model. More generally, if we can train or otherwise identify another model that implements a solution to a task, we can use signals generated from that model to train our spiking network.
Using continuous-variable models to determine auxiliary target functions
Equations 4 and 3, respectively, summarize two key features of the vector of functions $F(t)$: 1) they should correspond to the inputs of a recurrently connected dynamical system, and 2) they should provide a basis for the network output $f_{\rm out}(t)$. To satisfy the first of these requirements, we identify $F(t)$ with the inputs of a recurrently connected continuous-variable “rate” network. These networks have been studied intensely (Sompolinsky, Crisanti & Sommers, 1988; Rajan, Abbott & Sompolinsky, 2010) and have been trained to perform a variety of tasks (Jaeger & Haas, 2004; Sussillo & Abbott, 2009; Laje & Buonomano, 2013; Sussillo, 2014). To satisfy the second condition, we use the desired spiking network output, $f_{\rm out}(t)$, as an input to the rate network. This allows us to obtain the auxiliary target functions without having to train the continuous-variable network.
The continuous-variable model we use is a randomly connected network of $\tilde{N}$ firing-rate units (throughout we use tildes to denote quantities associated with the continuous-variable network). Like the spiking network, these units receive the input $f_{\rm in}(t)$ and, as mentioned above, they also receive $f_{\rm out}(t)$ as an input. The continuous-variable model is described by an $\tilde{N}$-component vector $\tilde{x}$ that satisfies the equation

$$\tilde{\tau} \frac{d\tilde{x}}{dt} = -\tilde{x} + \tilde{J} H(\tilde{x}) + \tilde{u}\, f_{\rm in}(t) + \tilde{u}_{\rm out}\, f_{\rm out}(t) \,, \qquad (5)$$
where $\tilde{\tau}$ is the time constant of the units, $H$ is a nonlinear function (we use $H(\tilde{x}) = \tanh(\tilde{x})$), and $\tilde{J}$, $\tilde{u}$ and $\tilde{u}_{\rm out}$ are matrices of dimension $\tilde{N} \times \tilde{N}$, $\tilde{N} \times N_{\rm in}$ and $\tilde{N} \times N_{\rm out}$, respectively. The elements of these matrices are chosen independently from a Gaussian distribution of zero mean and variance $\tilde{g}^2/\tilde{N}$ for $\tilde{J}$, and a uniform distribution between -1 and 1 for $\tilde{u}$ and $\tilde{u}_{\rm out}$, unless stated otherwise. We set $\tilde{g}$ to a fixed value except where identified otherwise.
To be sure that signals from this driven network are appropriate for training the spiking model, the continuous-variable network, driven by the target output, should be capable of producing a good approximation of $f_{\rm out}(t)$. To check this, we can test whether an $N_{\rm out} \times \tilde{N}$ matrix $\tilde{W}$ can be found (by least squares) that satisfies $\tilde{W} H(\tilde{x}(t)) \approx f_{\rm out}(t)$ to a sufficient degree of accuracy. Provided that $\tilde{g}$ and $\tilde{u}_{\rm out}$ are appropriately scaled, this can be done for a wide range of tasks (Sussillo, 2014).
The auxiliary target functions that we seek are generated from the inputs to the neurons in the continuous-variable network. There is often, however, a mismatch between the dimension of $F(t)$, which is $N$, and that of the inputs to the continuous-variable model, which is $\tilde{N}$. To deal with this, we introduce an $N \times \tilde{N}$ matrix $Q$, with elements drawn independently from a uniform distribution, and write

$$F(t) = Q \left( \tilde{J} H(\tilde{x}(t)) + \tilde{u}_{\rm out}\, f_{\rm out}(t) \right) . \qquad (6)$$
We leave out the input term proportional to $\tilde{u}\, f_{\rm in}(t)$ in this expression because the spiking network receives the input $f_{\rm in}(t)$ directly. This set of target functions satisfies both of the requirements listed at the beginning of this section and, as we show in the following examples, it allows functional spiking networks to be constructed by finding connections $J$ that satisfy equation 4. We do this initially by a recursive least-squares algorithm (Haykin, 2002), but later we discuss solving this problem by batch least squares instead.
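The driven-rate-network construction described above can be sketched as follows, with illustrative sizes, a 1 Hz sine standing in for the desired output, and a random matrix Q mixing the recurrent-plus-output drive into the spiking network's dimension; every parameter value here is an assumption.

```python
import numpy as np

# Sketch of auxiliary-target generation: drive a random rate network with
# the desired output, and record the (mixed) inputs to its units as
# targets for the spiking network's recurrent currents. Sizes and scales
# are illustrative stand-ins.
rng = np.random.default_rng(2)
N, Nt, N_out = 300, 100, 1            # spiking and rate network sizes
dt, tau, g_tilde = 1e-3, 0.01, 1.5
Jt = rng.normal(0.0, g_tilde / np.sqrt(Nt), (Nt, Nt))  # rate recurrence
u_out = rng.uniform(-1.0, 1.0, (Nt, N_out))            # output-drive weights
Q = rng.uniform(-1.0, 1.0, (N, Nt))                    # dimension-mixing matrix
steps = 2000
targets = np.zeros((steps, N))
x = rng.normal(size=Nt)
for t in range(steps):
    f_out = np.array([np.sin(2 * np.pi * 1.0 * t * dt)])  # stand-in output
    drive = Jt @ np.tanh(x) + u_out @ f_out   # input to each rate unit
    x += dt * (-x + drive) / tau              # rate-network dynamics
    targets[t] = Q @ drive                    # auxiliary targets F(t)
```

Each row of `targets` is then what the corresponding moment of recurrent spiking input is trained to reproduce.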
Examples of trained networks
The procedure described above can be used to construct networks that perform a variety of tasks. We present three examples that range from tasks inspired by problems of relevance to neuroscience to modeling experimental data.
Our first example is an autonomous oscillation task that requires the network to generate a self-sustained, temporally complex output (Figure 2).
$f_{\rm out}(t)$ for this task is a periodic function created by summing sine functions with frequencies of 1, 2, 3, and 5 Hz. We require that the network generate this output autonomously, so $f_{\rm in}(t) = 0$ for this example. Complex, autonomous oscillatory dynamics are a feature of neural circuits involved in repetitive motor acts such as locomotion (Marder, 2000).
Initially $J = 0$, so the activity of the network is determined by the random synaptic input provided by $J^f$, and the neurons exhibit irregular spiking (Figure 2a-c). In this initial configuration, the average Fano factor, computed using 100 ms bins, is 0.5, and the average firing rate across the network is 5 Hz. Following the training procedure, the learned postsynaptic currents $J s(t)$ closely match their respective auxiliary target functions (Figure 2d), and the network output similarly matches the target (Figure 2e). Residual chaotic spiking due to $J^f$ (Figure 2c) and the fact that we are approximating a continuous function by a sum of discontinuous functions cause unavoidable deviations. Nevertheless, a network of 3,000 LIF neurons firing at an average rate of 6.5 Hz with an average Fano factor of 0.25 performs this task with a normalized post-training error of 5% (this error is the variance of the difference between $W s(t)$ and $f_{\rm out}(t)$ divided by the variance of $f_{\rm out}(t)$).
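The target for this task and the normalized-error measure used in the text can be constructed as follows; the unit amplitudes of the four sines and the mock noisy output are assumptions for illustration.

```python
import numpy as np

# Oscillation-task target: a sum of sines at 1, 2, 3 and 5 Hz (unit
# amplitudes assumed), plus the normalized error used in the text:
# var(output - target) / var(target). A noisy copy stands in for the
# network output.
t = np.arange(2000) * 1e-3                    # 2 s at 1 ms resolution
f_out = sum(np.sin(2 * np.pi * freq * t) for freq in (1, 2, 3, 5))
z = f_out + 0.1 * np.random.default_rng(3).normal(size=t.size)  # mock output
norm_err = np.var(z - f_out) / np.var(f_out)
```

Over an integer number of cycles the four sines are orthogonal, so the target variance is simply the sum of the individual variances (2.0 here).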
Because the output for this first task can be produced by a linear dynamical system, previous methods could also have been used to construct a functioning spiking network (Eliasmith, 2005; Boerlin, Machens & Denève, 2013). However, this is no longer true for the following examples. In addition, it is worth noting that the network we have constructed generates its output as an isolated periodic attractor of a nonlinear dynamical system. The other procedures, in particular that of Boerlin, Machens & Denève (2013), create networks that reproduce the linear dynamics that generate $f_{\rm out}(t)$. This results in a system that can produce not only $f_{\rm out}(t)$, but also scaled versions of it over a continuous range of amplitudes. This often results in a slow drift in the amplitude of the output over time. The point here is that our procedure solves a different problem than previous procedures, despite the fact that it generates the same output. The previous procedures were designed to duplicate the linear dynamics that produce $f_{\rm out}(t)$, whereas our procedure duplicates $f_{\rm out}(t)$ uniquely.
The second task we present is a temporal XOR task that requires the network to categorize the input it receives on a given trial and report this decision through its output. Each trial for this task consists of a sequence of two pulses appearing as the network input (Figure 3).
Each pulse has an amplitude of 0.3, and its duration can be either short (100 ms) or long (300 ms). The two pulses are separated by 300 ms, and after an additional 300 ms delay the network must respond with either a positive or a negative pulse (with a shape given by 1/2 cycle of a 1 Hz sine wave). The rule for choosing a positive or negative output is an exclusive OR function of the input sequence: the output takes one sign when the two pulse durations differ (short-long and long-short) and the opposite sign when they are the same (short-short and long-long). The time between trials and the input sequence on each trial are chosen randomly.
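Single-trial input and target construction for this task can be sketched as follows; the timing and amplitude follow the text, while the overall sign convention (differing durations mapped to a positive response) is an assumption.

```python
import numpy as np

# Temporal XOR trial: two pulses of amplitude 0.3 (short = 100 ms,
# long = 300 ms) separated by 300 ms, a 300 ms delay, then a half cycle
# of a 1 Hz sine whose sign implements the XOR rule. The choice of which
# XOR outcome is "positive" is an assumption here.
dt = 1e-3

def xor_trial(first_long, second_long):
    n = lambda sec: int(round(sec / dt))
    d1 = 0.3 if first_long else 0.1
    d2 = 0.3 if second_long else 0.1
    sign = 1.0 if first_long != second_long else -1.0  # assumed convention
    f_in = np.concatenate([
        0.3 * np.ones(n(d1)),   # first pulse
        np.zeros(n(0.3)),       # 300 ms gap
        0.3 * np.ones(n(d2)),   # second pulse
        np.zeros(n(0.3)),       # 300 ms delay
        np.zeros(n(0.5)),       # response window
    ])
    f_out = np.zeros_like(f_in)
    t_resp = dt * np.arange(n(0.5))
    f_out[-n(0.5):] = sign * np.sin(2 * np.pi * 1.0 * t_resp)  # half cycle
    return f_in, f_out

f_in, f_out = xor_trial(True, False)
```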
A network of 3,000 LIF neurons with an average firing rate of 7 Hz can perform this task correctly on 95% of trials. As in the autonomous oscillation task, individual neuron spiking activity varies from trial to trial due to the effect of $J^f$. The Fano factor computed across all neurons, all analysis times, and all task conditions is 0.26. This task requires integration of each input pulse, storage of a memory of the first pulse at least until the time of the second pulse, memory of the decision during the delay period before the output is produced, and classification according to the XOR rule.
Generating EMG activity during reaching
We now turn to an example based on data from an experimental study, with the spiking network generating outputs that match electromyograms (EMGs) recorded in two arm muscles of a non-human primate performing a reaching task (Churchland, Lara & Cunningham, 2016). In this task, a trial begins with the appearance of a target cue at one of eight possible reach directions (task conditions). After a short delay, during which the arm position must be held fixed, a “go” cue appears, instructing a reach to the target. The time between trials and the sequence of reach directions are varied randomly.
To convey information about target location to the model, we use two network inputs, denoted by the two-component vector $f_{\rm in}(t)$, with amplitudes determined by the angle $\theta$ that specifies the reach direction (Figure 4a left). The input is applied for 500 ms and, when it terminates, the network is required to generate two outputs (thus $f_{\rm out}(t)$ is also two-dimensional) that match trial-averaged and smoothed EMG recordings from the anterior and posterior deltoid muscles during reaches to the specified target (Figure 4a right & e).
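The direction-cue input can be sketched as follows; the specific choice of cosine and sine amplitudes for the two components is an assumption consistent with the text, which specifies only that the amplitudes depend on the reach angle.

```python
import numpy as np

# Two-component direction cue for the reach task, held constant for
# 500 ms. The (cos, sin) encoding of the reach angle is an assumed,
# natural choice for illustration.
dt = 1e-3
angles = np.arange(8) * (2 * np.pi / 8)   # eight reach directions

def reach_input(theta, dur=0.5):
    steps = int(round(dur / dt))
    return np.column_stack([np.full(steps, np.cos(theta)),
                            np.full(steps, np.sin(theta))])

f_in = reach_input(angles[1])
```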
A network of 5,000 neurons with an average firing rate of 6 Hz performs this task with a normalized post-training error of 7% (Figure 4e), consistent with another modeling study that used a firing-rate network (Sussillo et al., 2015). The activity of the trained network exhibits several features consistent with recordings from neurons in motor cortex. Individual neurons show a large amount of spiking irregularity that is variable across trials and conditions (Figure 4b). The Fano factor computed across all neurons and all task conditions drops during task execution (Figure 4d), consistent with observations across a wide range of cortical areas (Churchland et al., 2010). This network shows that EMGs can be generated by a network with a realistic degree of spiking variability.
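The across-trial Fano factor used throughout these examples can be computed as follows; Poisson spike trains are used here as a stand-in, so the expected value is near 1 rather than the sub-Poisson values quoted in the text.

```python
import numpy as np

# Fano factor: spike counts in 100 ms bins, variance over trials divided
# by mean over trials for each bin, then averaged. Poisson stand-in
# trains, so the result should be close to 1.
rng = np.random.default_rng(4)
trials, n_steps, n_bins = 200, 1000, 10   # 1 s of 1 ms steps; 100 ms bins
rate, dt = 10.0, 1e-3                     # 10 Hz stand-in rate
spikes = rng.random((trials, n_steps)) < rate * dt
bins = spikes.reshape(trials, n_bins, -1).sum(axis=2)   # counts per bin
fano = (bins.var(axis=0, ddof=1) / bins.mean(axis=0)).mean()
```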
Another interesting feature of the network activity is the variety of different neural responses. Individual neurons display tuning during the input period, the output period, or across multiple epochs with different tunings (Figure 4f). As in recordings from motor cortex, consistently tuned neurons represent a minority of the network; most neurons change their tuning during the task (Figure 4g).
To examine the population dynamics of this model, we performed PCA on filtered (4 ms Gaussian filter) single-trial spike trains arranged into a $T \times NC$ data matrix, where $T$ is the number of times sampled during a trial (2900), $N$ is the number of neurons (5000), and $C$ is the number of reach conditions (8). We computed the eigenvectors of the $T \times T$ covariance matrix obtained from these “data”, generating temporal PCs (Figure 5a & b). Each temporal PC represents a temporal function that is strongly represented across the population and across all reach conditions, albeit in different ways across the population on a single trial for each condition. Two important features emerge from this analysis. First, more prominent single-trial PCs (Figure 5a) are relatively consistent from trial to trial, while less prominent PCs (Figure 5b) vary more. Second, the prominent PCs fluctuate on a slower time scale than the less prominent PCs. These two features indicate that the more prominent PCs form the basis of the network output on a single trial, while the less prominent PCs reflect network noise. This can be verified by reconstructing the network output using increasing numbers of temporal PCs and calculating the fraction of the output variance captured (Figure 5c red). A small number of the leading PCs account for most of the network output variance despite the fact that they account for only a fraction of the full activity variance on a single trial (Figure 5c blue).
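The PCA step can be sketched on a small synthetic stand-in (the text uses a 4 ms Gaussian filter and a T x NC matrix with T = 2900, N = 5000 and C = 8; the sizes here are deliberately tiny, and a single condition is used).

```python
import numpy as np

# Temporal PCA of filtered single-trial "spike trains": smooth each
# column with a Gaussian kernel, center, then take the SVD; left singular
# vectors are the temporal PCs and squared singular values give the
# variance captured per PC. Sizes are illustrative.
rng = np.random.default_rng(5)
T, N = 500, 40
spikes = (rng.random((T, N)) < 0.02).astype(float)   # stand-in spike trains
t = np.arange(-12, 13)
kern = np.exp(-t ** 2 / (2 * 4.0 ** 2))              # Gaussian, sigma = 4 bins
kern /= kern.sum()
filt = np.apply_along_axis(lambda c: np.convolve(c, kern, mode="same"),
                           0, spikes)                # filter along time
X = filt - filt.mean(axis=0)                         # center each column
U, svals, _ = np.linalg.svd(X, full_matrices=False)  # U: temporal PCs
var_frac = svals ** 2 / np.sum(svals ** 2)           # variance per PC
```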
Learning constrained connectivities
The examples we have presented up to now involve a fully connected matrix $J$ with no sign constraints. In other words, no elements were constrained to be zero, and the training procedure could make the sign of any element either positive or negative. Biological networks tend to be sparse (many elements of $J$ are fixed at zero) and obey Dale’s law, corresponding to excitatory and inhibitory cell types. This implies that the columns of $J$ should be labelled either excitatory or inhibitory and constrained to have the appropriate sign (non-negative for excitatory and non-positive for inhibitory). Here we outline a procedure for training a network to solve a task while abiding by these constraints.
In the previous examples, we used a recursive least-squares (RLS) procedure to compute $J$ because of stability issues that we now explain. Satisfying equation 4 accurately assures that the auxiliary targets can be generated self-consistently, but it does not guarantee that the resulting network state will be stable, and if it is unstable the least-squares solution is useless. We find that the use of RLS resolves this problem. As explained previously (Sussillo & Abbott, 2009), RLS provides stability because the fluctuations produced by the network when it is doing the task are sampled during the recursive procedure, allowing adjustments to be made that quench instabilities. However, when sparseness and especially sign constraints are imposed, use of RLS becomes impractical. Instead we must resort to a batch least-squares (BLS) algorithm.
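The RLS update (Haykin, 2002) in its standard form is sketched below on a static regression problem; in the actual procedure the samples come from the running network, which is what provides the stabilizing feedback discussed above.

```python
import numpy as np

# Recursive least squares: maintain a running estimate P of the inverse
# input correlation matrix and update weights w from the a priori error.
# A static linear regression stands in for the network-in-the-loop case.
rng = np.random.default_rng(6)
N, steps = 30, 2000
w = np.zeros(N)
P = np.eye(N)                    # inverse correlation estimate
w_true = rng.normal(size=N)
for _ in range(steps):
    s = rng.normal(size=N)       # stand-in for sampled synaptic currents
    target = w_true @ s
    Ps = P @ s
    k = Ps / (1.0 + s @ Ps)      # gain vector
    P -= np.outer(k, Ps)         # rank-1 update of the inverse correlation
    w += k * (target - w @ s)    # error-driven weight update
err = np.linalg.norm(w - w_true) / np.linalg.norm(w_true)
```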
The BLS algorithm computes $J$ on the basis of samples of network activity that must be provided. To assure a stable solution, these samples should not only characterize the activity of a network doing the task, they should also include the typical fluctuations that arise during network operation. To obtain such samples, we perform the network training in two steps. First, we train a fully connected and sign-unconstrained network using the RLS procedure, just as in the previous examples. We sample from this network during the training process, and we use these samples to solve the BLS problem while enforcing the desired constraints.
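A minimal sketch of the constrained batch step: solve a least-squares problem while forcing chosen columns to be non-negative (excitatory) or non-positive (inhibitory). Projected gradient descent is used here purely for illustration; it is not necessarily the solver used by the authors.

```python
import numpy as np

# Sign-constrained batch least squares (Dale's law illustration):
# minimize ||A x - b||^2 subject to x[i] >= 0 for "excitatory" entries
# and x[i] <= 0 for "inhibitory" ones, via projected gradient descent.
rng = np.random.default_rng(7)
T, N = 400, 20
A = rng.normal(size=(T, N))                       # sampled activity
sign = np.where(np.arange(N) < N // 2, 1.0, -1.0) # E then I labels
x_true = sign * np.abs(rng.normal(size=N))        # a feasible solution
b = A @ x_true
x = np.zeros(N)
lr = 1.0 / np.linalg.norm(A, 2) ** 2              # stable step size
for _ in range(5000):
    x -= lr * (A.T @ (A @ x - b))                 # gradient step
    x = sign * np.maximum(sign * x, 0.0)          # project onto sign cone
err = np.linalg.norm(A @ x - b) / np.linalg.norm(b)
```

Sparseness constraints can be imposed the same way, by simply zeroing the entries that are fixed at zero after each projection.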
Applying this two-step procedure, we can construct networks that perform the autonomous oscillation task of Figure 2 with 50% sparseness, either without (Figure 6a-c) or with (Figure 6g-i) a Dale’s law constraint. The normalized post-training error for both networks on this task is 5%, although to obtain this level we had to allow an average firing rate of 22 Hz. In addition, the random connectivity represented by $J^f$ was not included in this case because, for the number of neurons we used (1,000), these networks are somewhat fragile to chaotic fluctuations. This fragility could be reduced by using more neurons, but even with BLS the computations, especially for the sign-constrained case, are quite lengthy.
We argued that the two-step training procedure was needed to sample network fluctuations properly. Could we have simply added white noise to activity samples that did not contain the actual fluctuations produced by an operating network (Jaeger & Haas, 2004; Eliasmith, 2005)? PCA on the covariance matrix of the network fluctuations obtained during RLS training shows that most of their variance is captured by a small number of PCs (Figure 6f magenta), indicating significant spatial correlations. The temporal autocorrelation function of the residual errors also shows significant correlation (not shown). To understand the role of these correlations, we created a data set in which the actual network fluctuations were replaced by shuffled fluctuations. Although the shuffled synaptic input (Figure 6e) is very similar to the unshuffled input (Figure 6b), use of the shuffled data set resulted in poor performance of the trained network (Figure 6d). This is because the shuffled data fail to capture the correlations present in the actual network fluctuations (Figure 6f cyan). These results reaffirm the conclusion that the RLS algorithm is effective for sampling network instabilities, and that the fluctuations obtained in this way can be used effectively to obtain constrained BLS values for $J$.
We have developed a framework for constructing recurrent spiking neural networks that perform the types of tasks solved by biological neural circuits and that can be made compatible with biophysical constraints on connectivity. In this approach, the first step in producing a spiking system that implements a task is to identify a continuous-variable dynamical system that can, at least in principle, perform the task. Previous approaches to building models that perform tasks also resorted to identifying continuous analog systems. A key distinction, however, is that by exploiting the rich dynamics of externally driven, randomly connected, continuous-variable models, we can apply our approach to cases where a dynamic description of the task is not readily apparent. In general, any continuous-variable network that can implement a task should generate useful auxiliary targets for training a spiking model. An intriguing future direction would be to use continuous-variable networks trained with back-propagation (Martens & Sutskever, 2011; Sussillo et al., 2015; Sussillo, 2014) for this purpose. Another recent proposal for training spiking networks also makes use of continuous-variable network dynamics, but in this interesting approach, a spiking network is constructed to duplicate the dynamics of a continuous-variable model, and then it is trained essentially as if it were the continuous model (Thalmeier et al., 2015).
Our work involves a more complex map from the continuous variables of a “rate” model to the action potentials of a spiking model. The simplest map of this type assigns a population of spiking neurons to each unit of the continuous-variable model such that their collective activity represents the continuous “firing rate”. If we had followed this path, our spiking networks would likely have involved hundreds of thousands to millions of model neurons. In our approach, only a few times as many spiking neurons as continuous-variable units are needed. Performing PCA on the activity of a continuous-variable network reveals that a relatively small number of PCs capture a large fraction of the variance (Rajan, Abbott & Sompolinsky, 2010; Sussillo & Abbott, 2009). Thus, the unit activity in these networks is redundant, and matching every unit with a different population of spiking neurons is wasteful because this amounts to representing the same PCs over and over again. Our approach avoids this problem by distributing continuous signals from the entire continuous-variable network to overlapping pools of spiking neurons.
Our work also involves a novel interpretation of continuous-variable models and the outputs of their units. These models are typically considered to be approximations of spiking models (Ermentrout, 1994; Gerstner, 1995; Shriki, Hansel & Sompolinsky, 2003; Ostojic & Brunel, 2011). We do not interpret the “firing rates” of units in continuous-variable networks as measures of the spiking rates of any neuron or collection of neurons in our spiking networks (in fact, these “firing rates” can be negative, which is why we avoid the term firing-rate network). Instead, the continuous-variable networks are generators of the principal components upon which the dynamics of both networks are based. PCA applied to the slow synaptic currents $s(t)$ of the spiking network and to the unit activities of the continuous-variable network yields very similar results, at least for the predominant PCs. For example, the leading 10 temporal PCs account for more than 90% of the total variance for both networks, and the median of the principal angles between the subspaces defined by these two sets of 10 vectors is 5 degrees. This indicates that the temporal signals that dominate the dynamics and output of these two different types of networks are extremely similar.
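The principal-angle comparison mentioned here can be computed from the SVD of the product of two orthonormal bases; the bases below are synthetic, with the second a small perturbation of the first so that the angles come out small.

```python
import numpy as np

# Principal angles between two 10-dimensional subspaces of R^200: for
# orthonormal bases Q1, Q2, the singular values of Q1^T Q2 are the
# cosines of the principal angles. The bases here are synthetic.
rng = np.random.default_rng(8)
T, k = 200, 10
Q1, _ = np.linalg.qr(rng.normal(size=(T, k)))          # orthonormal basis 1
Q2, _ = np.linalg.qr(Q1 + 0.01 * rng.normal(size=(T, k)))  # slight rotation
cosines = np.clip(np.linalg.svd(Q1.T @ Q2, compute_uv=False), -1.0, 1.0)
angles_deg = np.degrees(np.arccos(cosines))
median_angle = float(np.median(angles_deg))
```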
We extracted a new set of auxiliary targets by projecting the original set onto a relatively small number of PCs of the continuous-variable network, and this did not have a detrimental impact on network training. We did this for the autonomous oscillator task using just 12 PCs, and the normalized error for the resulting network was similar to the error for the network shown in Figure 2. This confirms that the key information being passed from the continuous-variable network to the spiking network is carried by the leading PCs. The continuous-variable network is used in our procedure as a way of computing the PCs relevant to a task. If these can be obtained in another way, the spiking network could be trained directly from the PCs. One way of doing this is to extract the PCs directly from data, as has been done in other studies (Fisher et. al., 2013; Rajan, Harvey & Tank, 2016).
Finally, our approach strongly supports the use of continuous-variable models to analyze and understand neural circuits. However, it is important to appreciate that the connection between spiking and continuous-variable networks is subtle. In our procedure, the connectivity and nonlinearity in the continuous network bear no relation to the corresponding features of the spiking model, and the continuous network is not unique. Furthermore, the signals that allow a task to be performed are only apparent at the population level. In addition, since our spiking networks are not constructed by a rational design process, it may not be immediately apparent how they work. Nevertheless, the underlying continuous-variable model, and especially its leading PCs, capture the essence of how the spiking network operates, and tools exist for understanding this operation in detail (Sussillo & Barak, 2013). These models and methods should do the same for experimental data.
We are grateful to Antonio Lara for providing the EMG data. Research supported by NIH grant MH093338 and by the Simons Collaboration for the Global Brain, the Gatsby Charitable Foundation, the Swartz Foundation, the Mathers Foundation and the Kavli Institute for Brain Science at Columbia University. B.D. was supported by a National Science Foundation Graduate Research Fellowship.
- Abbott, DePasquale & Memmesheimer (2016) Abbott, L.F., DePasquale, B., Memmesheimer, R.-M. (2016) Building functional networks of spiking model neurons. Nat. Neurosci., (to be published).
- Boerlin & Denève (2011) Boerlin, M., Denève, S. (2011) Spike-based population coding and working memory. PLoS Comput. Biol. 7:e1001080.
- Boerlin, Machens & Denève (2013) Boerlin, M., Machens, C.K., Denève S. (2013). Predictive coding of dynamical variables in balanced spiking networks. PLoS Comp Biol, 9(11): e1003258.
- Brunel (2000) Brunel, N. (2000). Dynamics of networks of randomly connected excitatory and inhibitory spiking neurons. J. Physiol. (Paris) 94, 445-463.
- Churchland, Lara & Cunningham (2016) Churchland, M.M., Lara, A.H., Cunningham, J.P. (2016). The motor cortex and supplementary motor area exhibit different population-level dynamics. COSYNE annual meeting abstract.
- Churchland et al. (2010) Churchland, M.M., Yu, B.M., et al. (2010). Stimulus onset quenches neural variability: a widespread cortical phenomenon. Nat Neurosci, 13(3):369-378.
- Eliasmith (2005) Eliasmith, C. (2005). A unified approach to building and controlling spiking attractor networks. Neural Computation, 17:1276-1314.
- Ermentrout (1994) Ermentrout, B. (1994) Reduction of conductance-based models with slow synapses to neural nets. Neural Comput., 6: 679-695.
- Fisher et al. (2013) Fisher, D., Olasagasti, I., Tank, D. W., Aksay, E. R. F., Goldman, M. S. (2013) A modeling framework for deriving the structural and functional architecture of a short-term memory microcircuit. Neuron, 79:987-1000.
- Gerstner (1995) Gerstner, W. (1995) Time structure of the activity in neural network models. Phys. Rev. E, 51: 738-758.
- Haykin (2002) Haykin, S. (2002) Adaptive Filter Theory (Upper Saddle River, NJ: Prentice Hall).
- Hennequin et al. (2014) Hennequin, G., Vogels, T.P., Gerstner, W. (2014). Optimal control of transient dynamics in balanced networks supports generation of complex movements. Neuron, 82:1394-1406.
- Jaeger & Haas (2004) Jaeger, H. & Haas, H. (2004). Harnessing nonlinearity: predicting chaotic systems and saving energy in wireless communication. Science, 304:78-80.
- Laje & Buonomano (2013) Laje, R., Buonomano, D.V. (2013) Robust timing and motor patterns by taming chaos in recurrent neural networks. Nat. Neurosci., 16:925-933.
- Machens, Romo & Brody (2005) Machens, C.K., Romo, R., Brody, C.D. (2005). Flexible control of mutual inhibition: a neural model of two-interval discrimination. Science, 307:1121-1124.
- Marder (2000) Marder, E. (2000) Motor pattern generation. Curr. Opin. Neurobiol., 10: 691-698.
- Martens & Sutskever (2011) Martens, J. and Sutskever, I. (2011). Learning recurrent neural networks with Hessian-free optimization. Proc. 28 Int. Conf. on Machine Learning, (Bellevue, WA).
- Ostojic & Brunel (2011) Ostojic, S., Brunel, N. (2011) From spiking neuron models to linear-nonlinear models. PLoS Comput. Biol., 7: e1001056.
- Rajan, Abbott & Sompolinsky (2010) Rajan, K, Abbott, L.F., Sompolinsky, H. (2010). Stimulus-dependent suppression of chaos in recurrent neural networks. Physical Review E, 82:011903.
- Rajan, Harvey & Tank (2016) Rajan, K., Harvey, C., Tank, D. (2016). Recurrent network models of sequence generation and memory. Neuron (to be published).
- Seung et. al. (2000) Seung, H.S., Lee, D.D., Reis, B.Y., Tank, D.W. (2000). Stability of the memory of eye position in a recurrent network of conductance-based model neurons. Neuron, 26:259-271.
- Shriki, Hansel & Sompolinsky (2003) Shriki, O., Hansel, D., Sompolinsky, H. (2003) Rate models for conductance-based cortical neuronal networks. Neural Comput., 15: 1809-1841.
- Sompolinsky, Crisanti & Sommers (1988) Sompolinsky, H., Crisanti, A., and Sommers, H.J. (1988). Chaos in random neural networks. Phys. Rev. Lett., 61:259-262.
- Sussillo (2014) Sussillo, D. (2014) Neural circuits as computational dynamical systems. Curr. Opin. Neurobiol., 25:156-163.
- Sussillo & Abbott (2009) Sussillo, D. and Abbott, L.F. (2009). Generating coherent patterns of activity from chaotic neural networks. Neuron, 63(4):544-557.
- Sussillo & Barak (2013) Sussillo, D., Barak, O. (2013) Opening the black box: low-dimensional dynamics in high-dimensional recurrent neural networks. Neural Comput., 25:626-649.
- Sussillo et al. (2015) Sussillo, D., Churchland, M.M., Kaufman, M.T. & Shenoy, K.V. (2015). A neural network that finds a naturalistic solution for the production of muscle activity. Nat Neurosci, 18:1025-1033.
- Thalmeier et al. (2015) Thalmeier, D., Uhlmann, M., Kappen, H. J., Memmesheimer, R.-M. (2015) Learning universal computations with spikes. arXiv:1505.07866.
- van Vreeswijk & Sompolinsky (1998) van Vreeswijk, C. & Sompolinsky, H. (1998). Chaotic balanced state in a model of cortical circuits. Neural Computation, 10:1321-1371.
- Wang (2002) Wang, X.J. (2002). Probabilistic decision making by slow reverberation in cortical circuits. Neuron, 36:955-968.