Efficient low-dimensional approximation of continuous attractor networks

# Efficient low-dimensional approximation of continuous attractor networks

## Abstract

Continuous “bump” attractors are an established model of cortical working memory for continuous variables and can be implemented using various neuron and network models. Here, we develop a generalizable approach for the approximation of bump states of continuous attractor networks implemented in networks of both rate-based and spiking neurons. The method relies on a low-dimensional parametrization of the spatial shape of firing rates, allowing to apply efficient numerical optimization methods. Using our theory, we can establish a mapping between network structure and attractor properties that allows the prediction of the effects of network parameters on the steady state firing rate profile and the existence of bumps, and vice-versa, to fine-tune a network to produce bumps of a given shape.

\RS@ifundefined

subsecref \newrefsubsecname = \RSsectxt \RS@ifundefinedthmref \newrefthmname = theorem  \RS@ifundefinedlemref \newreflemname = lemma

1: School of Computer and Communication Sciences and School of Life Sciences, Brain Mind Institute,
École Polytechnique Fédérale de Lausanne, 1015 Lausanne EPFL, Switzerland
2: Institute for Zoology, Faculty of Mathematics and Natural Sciences,
University of Cologne, 50674 Cologne, Germany

A previous version of this article was published as Chapter 3 of the first author’s Ph.D. thesis [1]

### 1 Introduction

Behaving animals commonly need to transiently memorize information about the environment. For example, as an animal overlooks the visual scenery, locations of certain salient stimuli need to be recorded and stored. Such information does not need to be stored in long-term memories. Rather, working memory must provide a quickly accessible computational substrate for storing information over short durations. While long-term memory is thought to be stored in the efficacy of synaptic connections in the brain [2, 3, 4], a possible substrate for working memory may be transiently stable states of neuronal activity across cortical networks [5, 6, 7].

As model implementations of this concept, localized spatial profiles of neural activity have been proposed for the internal representation of sensory stimuli [8, 9, 10]. First, neurons are associated to the presence of physical quantities through elevated responses during and after the presentation of stimuli, akin to receptive fields. For example, the presentation of stimuli at varying angular positions in the visual field evokes persistent and elevated firing rates in selective groups of neurons of the prefrontal cortex during delay periods [11]. Choosing recurrent connection weights (or connection probabilities) which are stronger between neurons that are responsive to similar stimuli, together with feedback inhibition limiting the total firing rates in a network, allows this class of models to display bumps of self-sustained activity: neuronal activity that is localized in the space of possible stimuli. Since these states are stable attractive states, and all possible such states form a continuum, these models are often referred to as continuous attractors. The elevated firing of neurons responsive to similar stimuli is then seen as the working memory representation of physical quanta stored in the network, e.g. spatial orientations [12], or angular positions in the visual field [13]. Similar computational circuits might also serve as the basis of persistent internal representations in hippocampal areas [14, 15].

Continuous attractor models with simplified shapes of connectivity or neuronal input-output relations can be analyzed and often exactly solved [8, 9, 10, 16, 17, 18, 19], or may generally be approximated in the linear input-output regime of balanced networks [20]. However, the inclusion of biologically plausible nonlinearities, like nonlinear neuronal input-output relations [13, 21], neuronal adaptation [22, 23], or synaptic nonlinearities like short-term plasticity [18, 24] and saturating NMDA kinetics [13, 25], complicate the mathematical solution of these systems considerably and make a derivation of the stable firing rate profile unfeasible. Therefore, such systems are usually studied by explicit simulations of the underlying dynamics or by numerical optimization of approximated equations for all neurons [20, 26]. While these procedures in principle allow the prediction of the network activity as a function of the parameters, they involve computationally demanding numerical optimization of high-dimensional systems of equations, possibly as costly as simulating the full neuronal dynamics. Thus, currently, relating the microscopic network parameters to the resulting emergent bump states involves repeated and possibly time consuming simulations. For example, this makes the matching of the network steady states to physiologically constrained features tedious.

Here, we present a generalizable approach for the approximation of the network-wide steady states of continuous attractors. Our approach allows the prediction of the shape of steady-state firing rate profiles, under nonlinear neuronal dynamics and varying configurations of the underlying microscopic system, without having to solve the dynamics of the full, high-dimensional system. Our novel method relies on a low-dimensional parametrization of the network’s firing rate profile, which allows us to derive computationally tractable systems of self-consistent equations describing the attractor steady-state, akin to mean-field approaches for networks with discrete attractors [27]. These equations can be used to efficiently predict the dependence of the firing rate profile on microscopic network parameters. Importantly, because the dimension of the parameterization of the spatial activity profile is low, our approach makes optimization of the microscopic network parameters for the appearance of desired bump profiles feasible. We apply our method to both networks of simplified rate neurons, and networks of complex, conductance-based integrate-and-fire neurons with saturating and voltage-dependent nonlinear NMDA transmission.

### 2 Results

Mean-field approaches (see e.g. [27, 28]) that predict the steady states of recurrently connected neuronal networks usually rely on dimensionality reduction. The number of equations describing the dynamics is reduced by partitioning neurons into groups of “similar” neurons, and deriving expressions which describe the average statistics for these coupled groups in the steady states. For example, the simplest such partition consists in considering the mean firing rates of excitatory and inhibitory neurons separately, e.g. all excitatory neurons fire with similar mean rates given the same input:

 νE=FE(inputE).

If the groups of neurons are now homogeneously coupled, i.e. the connections between neurons depend only on the groups of the neurons involved, one can derive the input to neurons of each group in dependence of the firing rates of the groups only. This leads to a closed system of self-consistency equations describing the coupled steady-state firing rates:

 νE =FE(inputE[νE,νI]), νI =FI(inputI[νE,νI]).

In the steady states of continuous attractor models (see 1), neurons fire at different rates, making a clear partition into discrete groups of similarly firing neurons difficult. Therefore, the solution of such systems usually relies on the explicit simulation of the neuronal dynamics of all neurons along the spatial dimension, or a numerical solution of the coupled self-consistency equations for all neurons.

Here, by using the continuity of the shape of the attractor states, we demonstrate that continuous attractors are amenable to dimensional reduction, by parametrizing the attractor state by a low-dimensional family of functions. In 2.2, we check our method on networks of simple rate neurons, for which the method might not yield much improvement over simulations or numerical solutions of the steady states. For the spiking networks considered in 2.3, we show that our approach speeds up predictions of the steady states considerably and further makes these networks amenable to the optimization of network parameters.

#### 2.1 General equations for the approximation of stable states in ring-attractors

As a concrete class of continuous attractors, we consider the ring-attractor model, in which stable bumps of neuronal activity are freely translatable along all positions on a circle. Ring-attractor models can be constructed by placing neurons (rate-based or spiking) at equally spaced angular positions along the ring (1A) [8, 9, 10]. We choose the angular space to consist of positions , where we identify the ends of the interval: a neuron at position is the neighbor of a neuron at position At short angular distances, recurrent connections are chosen to be strong and excitatory, while neurons further apart in angular space effectively inhibit each other’s firing (1A). Due to the symmetry in connectivity with respect to distance, these networks can form a continuous manifold of stable states for sufficiently strong connections: the network activity in response to external inputs converges to firing rate profiles centered around some angular position. The position can be, for example, controlled by providing an external input to the network centered around any desired position (1B). The stereotypical shape of the resulting firing rate profiles (1C) is invariant with respect to translations in angular space.

The continuity of the firing rate profile allows us to parametrize the firing rates in the population by a small number of parameters, and to derive equations from the underlying model that constrain these parameters. Here, inspired by shapes observed in simulations, we choose to parameterize the firing rate profile by a generalized Gaussian function, where we assume without loss of generality that the distribution is centered at (cf. 2A):

 g(θ)=g0+g1exp(−[|θ|gσ]gr). (1)

Here, controls the baseline firing rate and will be the maximum firing rate of the profile. The parameters and control the width and steepness of the profile, respectively. If we know the distance-dependent connectivity between neurons and their input-output relation , we can predict the expected neuronal firing at any position in the population (cf. 2B). Crucially, we want the firing rate profile to be generated by the neuronal dynamics – we thus identify at the point with the firing rate of a neuron at position . Finally, we replace the synaptic input to the neuron at position with the contributions from all neurons firing at rates along the ring. For any given position along the ring, this yields a self-consistent equation in the function :

 g(θi) =F(inputθi[g]) =F(∫π−πdφw(φ−θi)g(φ)), (2)

with the corresponding self-consistency error

 Erri ≡g(θi)−F(inputθi[g]). (3)

In principle, this procedure can yield up to coupled error functions, one for each of the neurons. One could then minimize the quadratic error with respect to the parameters to find an approximate solution of the system. However, since the evaluation of each error function can be costly (e.g. in spiking networks, see 2.3.2), we propose a low-dimensional approximation to constrain the set of free parameters: we pick only points , at which we evaluate the errors. This assumes that the shapes of firing rate profiles maintained by the network are well approximated by the function , which we found to be the case for all networks considered here. This leaves the choice of points to evaluate. To ensure that errors are evaluated across different firing rates, we set the position of these points to cover a range of function values : we choose the top of the distribution with , as well as the lowest point with (circles in 2C). The remaining intermediate points (triangles in 2C) are dynamically positioned (see 4.3.1 in Methods): their position depends on the function such that they always sample intermediate function values .

#### 2.2 Approximation of ring-attractor profiles in rate models

The proposed method is, in principle, applicable to any neuron model with a defined input-output relation . The shape of the stable attractor profiles will, however, depend on the concrete choice of neuron model and the microscopic parameters, in particular the parameters governing the connectivity between neurons. To test the ability of the low-dimensional approximation proposed in the last section to correctly predict the shapes of firing rate profiles, we implemented the ring-attractor model introduced above (see 1) in a network of rate-based neurons with input-output function and a generalized Gaussian recurrent connectivity (see 4.1 in Methods). For the rate neuron models chosen here, the self-consistency errors (LABEL:error-func) are given by (see 4.3.2 in Methods):

 Erri =g(θi)−νmax2[1+tanh(τss0N2πinputθi[g])], =g(θi)−νmax2[1+tanh(τss0N2π∫π−πdφw(φ−θi)g(φ))], (4)

where is the time-constant of synaptic inputs, is the maximal firing rate and is an input scale.

Prediction of stable firing rate profiles To approximate the firing rate profiles that these networks admit as self-consistent solutions, we minimize the error functions (LABEL:rate-error) with respect to the parameters as free variables. We find that, for a range of connectivity parameters (see 2), the predicted shapes converge to unique solutions. This solution matches the steady state of the microscopic network simulations accurately (2D). This is the case both for attractor states that lie in the linear regime of the neuronal input-output relations (2D, red line) as well as for highly nonlinear attractor dynamics in which neuronal firing reaches saturation values, leading to plateau-shaped firing rate profiles (2D, blue and green lines).

As discussed above, the placement of intermediate sampling points (2C, triangles) is not constrained by theory, but remains a free parameter of our approach. We chose these points at positions such that they sample given function values . To investigate the dependence of the prediction on the placement of intermediate sampling points, we calculated several predictions while randomly varying the choices of (2E, triangles). We find that this hardly affects the converged solutions.

##### Optimization of network parameters

While our low-dimensional system of self-consistent equations can be used for the prediction of the steady-state firing rates, they can also be used in the inverse way, to optimize any of the network parameters. We demonstrate this here, using the shape of the recurrent connectivity as an example. However, such optimizations can include further parameters of the network model (see 2.3.3).

To optimize the network connectivity parameters, we keep the parametrization parameters fixed to the desired values of the shape of the firing rate profile. We then optimize the self-consistent equations (LABEL:mfeq-selfconsist) for values of the recurrent connectivity parameters which lead to solutions of the equations and produce the desired bump profile. In 3 we show the results of this procedure for the three systems also investigated in 2. The procedure yields network connectivities that fulfill the desired properties (3A), largely independently of the points chosen for the evaluation of the errors. Importantly, for some shapes the solutions show a degeneracy (3B, gray lines), in the sense that several connectivity parameter sets are found that produce the same stable firing rate profile.

#### 2.3 Approximation of ring-attractor profiles in spiking networks

In complex spiking neuron models, the steady-state input-output relations often involve integral functions [27, 29] that are not amenable to further theoretical analysis. Further, the introduction of nonlinear NMDA transmission at excitatory synapses [13, 30] complicates the analysis of such models considerably: voltage-dependent gating of the maximal NMDA conductance (by the voltage-dependence of the block) and saturation of the NMDA at high conductances necessitates the numerical solution of a -dimensional system of self-consistent equations for a relatively simple 2 population mean-field model [27] (see below).

Here, we demonstrate that firing rate profiles in a continuous attractor network of spiking neurons with such nonlinear NMDA transmission are amenable to the same approach as described above, which still involves evaluating comparatively few equations. The spiking network we implement (similar to [13] and used with variations in e.g. [21, 31, 32, 33, 34], see 4.2 in Methods) consists of two fully connected populations of conductance-based integrate-and-fire neurons: a population of inhibitory neurons with unstructured all-to-all connectivity, and a population of excitatory neurons, with distance-dependent recurrent excitatory connections (4A). In addition, all neurons receive excitatory background input mediated by spikes generated by Poisson processes. These network can be tuned such that they possess a bi-stability (4B): a uniform state with spontaneous spiking activity in the excitatory population (the inhibitory population is always uniformly spiking) coexists with an “evoked” spatially inhomogeneous bump-state that appears after an external cue input is given to a subgroup of excitatory neurons (stimulus is present at in 4B).

Self-consistent equations for networks of spiking neurons For the excitatory population, we again parametrize the spatial profile of firing rates by (LABEL:g-gen-gaussian), which allows us to derive self-consistent equations for any neuron in the excitatory population. We construct self-consistent equations for the excitatory firing rates at positions as in (LABEL:mfeq-selfconsist). However, these now will depend additionally on the inhibitory firing rate . Also, the voltage-dependence of the differential equations leads to an additional self-consistent equation for the mean-voltage . For any position , the excitatory self-consistent equations are of the form (see 4.3.3 in Methods for detailed expressions):

 g(θ) =F(inputθ[g],νI,¯V(θ)) ≡F(∫π−πdφw(φ−θ)ψ(g(φ)),νI,¯V(θ)), (5) ¯V(θ) =G(inputθ[g],νI,¯V(θ)). (6)

The function expresses the mean synaptic activation under presynaptic Poisson spiking at rate . For accuracy, we chose to measure numerically for the model of nonlinear NMDA conductance of the recurrent excitatory synapses given in the network (see 4.3.3 in Methods).

To constrain the free parameters of , we again pick points , each now yielding 2-dimensional error functions

 Erri =⎛⎜⎝g(θi)−F(inputθi[g],νI,¯V(θi))¯V(θi)−G(inputθi[g],νI,¯V(θi))⎞⎟⎠. (7)

The resulting 8 equations are optimized for the parameters of the parametrization , as well as the additional variables . The inhibitory population, on the other hand, is assumed to be homogeneous. Its activity can be described by a single mean firing rate and the average voltage in the inhibitory population , resulting in a pair of additional self-consistency errors that constrain these two variables:

 ErrI= (νI−F(inputI,νI,¯VI)¯VI−G(inputI,νI,¯VI)), (8)

where is the mean recurrent excitatory input to inhibitory neurons.

As mentioned above, in a certain range of parameters the spiking system possesses two dynamically stable states (4B): the uniform state and the “evoked” bump state. In this bistable regime, the associated self-consistency Equations (5)-(6) must have an an additional unstable solution [35]. Even for parameters in which the bump-state is the only stable state of the system, the uniform state will still be a (unstable) solution of the self-consistent equations. Accordingly, numerical solutions of the errors Eqs. (7)-(8) sometimes converge to the uniform state or an unstable intermediate solution, even if a stable bump state at higher firing rates exists. In the following we consider only the solutions with the highest spatial modulation found under repeated solutions (see Discussion).

##### Prediction of firing rate profiles from network properties

Above, we derived error functions constraining the parametrization of firing rate profiles for spiking networks (LABEL:spike-selfcons-err)-(8). Here, we use these to predict the dependence of the spatial shape of the firing rate profile on of a bifurcation parameter , which is the maximal strength of recurrent excitatory connections. At the connection profile is homogeneous, while at larger values local connections are stronger (4C). The strength of long range connections is calculated by a normalization condition (see 4.2.3 in Methods).

As is increased (4C) above a critical value, a spatially inhomogeneous bump state appears in simulations of the spiking network (4E). Our theory predicts this dependence of the network state on the bifurcation parameter, while approximating to a large degree the changing shape of the rate profile as the parameter is increased (4D,E). The firing rates of the inhibitory population and their increase with the parameter are also well described (4D, black dots and lines).

As mentioned above, the error functions (LABEL:spike-selfcons-err) could, in principle, be evaluated at an arbitrary number of points. To constrain the parameters of the parametrization , we chose only the minimal number of points. This reduces the necessary number of evaluations of the errors (5A). Further, since in this case the dimensions of the optimization variables and the error vector coincide, application of a more efficient numerical optimization method (Root, see 4.3.4) allows for faster optimization (5A, green bar), which reduces the needed time from to close to (5A, right hand axis). We observe, however, that adding additional points does slightly influence the resulting prediction (5B), where more sampling points placed in the flanks of the bump tend to reduce slightly the predicted maximal firing rates (5C, orange points).

In a second experiment, we show that the theory can be used to efficiently predict the effect that changing network parameters have on the shape of the resulting firing rate profile. Similar to the simulated experiment in [31, Fig. 3], we systematically reduced the strength of recurrent excitatory-to-excitatory () and inhibitory-to-excitatory () conductances of the network from the baseline of the network presented in 4. Such changes of the ratio of excitation to inhibition have been hypothesized to occur under cortical disinhibition observed in schizophrenia [31, 36]. Recovering the result presented in the study, we see that the width of the bump profile1 depends mostly on the ratio of recurrent conductances, and thus undergoes significant widening under disinhibition. As we have shown in 5A, the optimization procedure for each datapoint is comparatively fast and thus enables these type of parameter scans for wide ranges of values under many parameters.

These results show that our approach can be used to accurately describe the firing rate profiles of bump-attractor networks of recurrently connected spiking excitatory and inhibitory neurons, across a range of parameters. While evaluating the error function at more points can lead to slightly increased accuracy of the prediction, the impact on optimization performance is significant, increasing the needed time by an order of magnitude.

##### Optimization of network parameters

As demonstrated above for a rate-based network, our low-dimensional approximation of continuous attractors allows the optimization of network parameters. Here, we demonstrate that this approach extends well to continuous attractors implemented in recurrently connected spiking neural networks with nonlinear synaptic transmission. As in the case of the rate network, this is achieved by fixing some desired properties of the firing rate profile, while minimizing the error functions Eqs. (7)-(8) with respect to several network parameters. Here, we included the shape parameters and of the distance-dependent connectivity, as well as the strengths of all recurrent synaptic connections: (cf. 3 for details).

To find networks that admit a given shape of the firing rate profile, we first fixed the firing rate modulation and width , while optimizing the connectivity parameters as well as the strength of all recurrent excitatory and inhibitory transmission. In total, there were variables (see 3 for listings and optimization results) which were optimized (see 4.3.4 for details). Varying these free parameters allows us to optimize the remaining parameters of the spiking networks through a range of shapes of the stable firing rate profile, from rather thin bumps of high activity (6, red) to wide bumps of low activity (6, blue). To check whether we could optimize the spiking network to show saturated flat-top shapes at low firing rates, we fixed and the sharpness parameter to a high value , while optimizing the parameter (6, green).

Similar optimization results for all three bump shapes were achieved by imposing additionally a low basal excitatory firing rate , which constrains firing rates in the uniform (spatially homogeneous) state for . For the optimizations above, the basal rates were unconstrained and varied between and . Thereby, the number of optimization variables is increased to , since additionally a basal inhibitory firing rate and basal inhibitory and excitatory mean voltages need to be introduced to calculate the self-consistency error of the basal firing rate . It should be noted, that the value of this constraint affects the possible bump shapes: for example, setting did not yield a converging optimization for the blue and green curves of 6 – this could be alleviated by relaxing to the higher value .

### 3 Discussion

We have presented a framework for the approximation of self-sustained firing-rate profiles in continuous attractor neural networks. Analytical computation of the steady states of continuous attractors is often not possible, since it involves the solution of high-dimensional systems of nonlinear equations, making numerical solutions necessary. Moreover, the spatially inhomogeneous firing rate profiles of these networks prohibit dimensional reduction of these equations by separation of neurons into homogeneous populations, as is usually done in mean-field approaches [27, 28]. Here, we propose a simple approach, consisting in approximating the continuous firing rate profiles by a family of functions with only parameters. These parameters are constrained by equations expressing the microscopic dynamics of the neurons and synapses involved in the model, and can be optimized to find admissible solutions. As we have shown, this can be used for the efficient mapping of the effects that different network parameters have on the bump shape. Next to predicting the emergent steady states of attractors, the utility of the low-dimensional approximation is that the derived self-consistent equations are efficiently optimizable: we were able to use standard numerical methods to constrain the parameters of spiking networks to show desired firing-rate profiles.

In the main text we have formulated our approach as generally as possible, to emphasize that the method does not rely on a specific neuronal (rate or spiking) model, as long as a prediction of firing rates given the synaptic input can be derived. Therefore, the theory could be extended easily to other neuron models, or connectivities where connection probabilities are distance-dependent (e.g. [37]). The approach presented here could also be used to predict the activity of two-dimensional attractor models implemented in “sheets” of neurons, that are often used in the context of hippocampal networks [14, 38, 39]. Assuming isotropy of the connectivity (if connection strengths depend only on the Euclidean distance between neurons), a two-dimensional generalized Gaussian function with and a single parameter as before could be used to approximate activity states. For non-isotropic networks, further width parameters could be introduced and constrained by sampling at additional points.

We have shown that our approach is amenable to the inclusion of synaptic nonlinearities like saturating NMDA transmission (which is captured by the synaptic activation function , cf. (LABEL:spike-selfcons-rate)). Other sources of nonlinear synaptic transmission, for example the activity-dependent short-term plasticity of synapses [24] that is often investigated in the context of working memory models [18, 37, 40], can be similarly incorporated: calculation (or numerical estimation) of a compound function that describes the steady-state values of synaptic activation under all nonlinear processes affecting synaptic transmission would suffice to adapt the theory2. Similar to the estimation of the mean-voltage in spiking networks demonstrated here, the theory could also be extended to incorporate adaptation effects on the steady-state firing rates of neurons [22].

For the prediction of firing rates in spiking networks we have adapted a theory for the description of mean-field firing rates of conductance based integrate-and-fire neurons [27] to predict spatial firing rate profiles in recurrently connected networks [13]. Mean firing rates and mean voltages in this theory are generally expectation values over ensembles of neurons that can be assumed to have homogeneous activity. Strictly speaking, here we violate this assumption by taking the rate prediction of theoretical ensembles as the prediction of the firing rates (and mean voltages) of single neurons at given positions along the ring attractor. However, since we are investigating the stationary state of networks, the mean firing rates calculated from this theory can also be interpreted as the time averaged firing rate of single neurons [41]. Further, the approximation of the recurrent synaptic inputs in the steady state usually relies on the averaging over presynaptic ensembles of homogeneously firing neurons. Nonetheless, the theory still works quite well, which might be due to the fact that we are calculating the synaptic drive as an integral over a continuum of presynaptic neurons, thereby effectively averaging out deviations of the synaptic drive that are to be expected in single neuron samples from such ensembles.

Although bump attractor states are interesting from a functional point of view, they are not the only solutions to ring-like networks. As we have seen, spiking networks can show bi-stability, in which both a stable uniform state and a stable evoked bump-state co-exist. Here, we have neglected solutions of our theory that converged to the uniform state (or intermediate unstable solutions) if a bump-state at larger firing rates was also found as a solution. While it goes beyond the scope of the current work, our theory could be extended to also predict the dynamical stability of the states that are found: similar to approaches in networks that admit a splitting into discrete populations [28] one could measure the magnitude of perturbations to the firing rates at the points resulting from perturbations to the parametrization: the general perturbation would translate into both perturbations of the rates and the prediction . Stability can then be determined by comparing the scale of the input perturbations to the predicted output perturbations [28]. It is worth noting, that calculating such linear perturbations will involve the derivative of the synaptic activations (cf. (LABEL:spike-selfcons-rate)), which will also have to be estimated numerically in cases where no analytical formula for can be found.

Our choice of parametrization of the firing rate profile was heuristic, guided by the shapes observed in numerical simulations. The framework presented here, though, could be used with any other family of functions parametrized by a small number of parameters, which should be adapted to the shapes to be approximated. For example, multi-modal ring-attractor profiles resulting from narrower connectivities (see e.g. [42], or [32] for a spiking network similar to the one investigated here) can not be approximated by the unimodal family chosen here. Since the topology of ring-attractors is periodic, a natural candidate for such a generalization would be the family of finite Fourier series. However, the nonzero frequency components necessary to faithfully approximate shapes that deviate far from simple (e.g. cosine-shaped) unimodal distributions might require a large number of Fourier coefficients for parametrization.

In this report we mostly chose as many positions along the attractor manifold as there are free parameters of the profile to be constrained. In principle, the number of free parameters can be chosen independently of the number of positions, by performing numerical optimization on a dimension-agnostic sum of squared errors. We have shown that matching the error dimension to the number of parameters permits using efficient optimization methods that significantly speed up the optimization. However, we have also investigated under-determined (see 6B) and over-determined (see 5A-C) systems, which also converged to similar solutions. Finally, when optimizing the network parameters for desired spatial profiles (3 and 6), choosing optimization goals outside the space of possible solutions of the network dynamics did not allow the procedure to converge. Thus, our approach could be used to estimate the boundaries of the solution space for a given neural network, by starting the optimization at a known solution and varying the shape parameters until convergence fails.

### 4 Methods

#### 4.1 Rate network model

We study a network of recurrently connected rate neurons indexed by , where each neuron is described by a single variable which denotes the firing rate of the neuron [43].

The neuronal firing rate is given by a nonlinear input-output function of a synaptic variable (assuming that the membrane time constant is considerably faster than the synaptic variable):

 νi(t) =F(si(t))=νmax2[1+tanh(si(t)s0)]. (9)

Here, sets the maximal firing frequency, and is the (dimensionless) scale of the synaptic input.

The input to neuron is mediated through the synaptic variable :

 ˙si(t) =−si(t)τs+N−1∑j=0wijνj(t), (10)

where denotes the temporal derivative, is the synaptic time constant, and are the recurrent connection weights (see below), and .

Neurons are organized at circular positions with identified boundaries, such that neuron is the direct neighbor of neuron . The recurrent connections depend only on the distance between neurons in the resulting angular space: the connection from neuron to neuron is given by a generalized Gaussian function, with 4 free parameters controlling its shape:

 wij =w(θi−θj)=1N(w0+w1exp[−(∣∣min(∣∣θi−θj∣∣,2π−∣∣θi−θj∣∣)∣∣wσ)wr]). (11)

The parameters used for the networks of 2 and 3 are given in 2.

#### 4.2 Spiking network model

Spiking simulations are based on a reimplementation of a popular ring-attractor model of visuospatial working memory [13] in the NEST simulator [44]. Parameters were modified from the original publication to produce the results shown in 4, 5, and 6 (see 1 for parameter values). For completeness we restate the definition of the model here.

Neuron model Neurons are modeled by leaky integrate-and-fire dynamics with conductance based synaptic transmission [13, 27]. The network consists of recurrently connected populations of excitatory and inhibitory neurons, both additionally receiving external spiking input with spike times generated by independent, homogeneous Poisson processes, with mean rates . Following [13], we assume that external excitatory inputs are mediated by fast AMPA receptors, while recurrent excitatory currents are mediated only by slower NMDA channels.

The neuronal dynamics for neurons in both excitatory and inhibitory populations are governed by the following system of differential equations indexed by (with different sets of parameters for each population):

 Cm˙Vi(t) = −ILi(t)−IExti(t)−IIi(t)−IEi(t), (12) IPi = gPsPi(Vi(t),t)(Vi(t)−VP),

where . Here, is the membrane capacitance and are the reversal potentials for leak, excitatory currents, and inhibitory currents, respectively. The parameters for are fixed scales for leak (L), external input (Ext) and recurrent excitatory (E) and inhibitory (I) synaptic conductances, which are dynamically gated by the (possibly voltage dependent) gating variables . In the main text we refer to the conductance scales of excitatory neurons by the “strength of synaptic connections” and . Similarly, for inhibitory neurons we refer to the conductance scales by the “strengths” and . The gating variables are described in detail below, however we set the leak conductance gating variable to .

The model neuron dynamics ((LABEL:base_deqs)) are integrated until their voltage reaches a threshold . At any such time, the respective neuron emits a spike and its membrane potential is reset to the value . After each spike, voltages are clamped to for a refractory period of (see 1 for parameter values).

Synaptic gating variables The synaptic gating variables for for external and inhibitory currents are exponential traces of the firing times of all presynaptic neurons :

 ˙sPi(t)=−sPi(t)τP+∑j∈pre(P)∑tjδ(t−tj), (13)

where the sum runs over all neurons presynaptic to the neuron regarding the connection .

For the recurrent excitatory gating variables a nonlinear NMDA model is used [45]. This model has second order kinetics for NMDA channel activation [25], which result in a saturation of channels. Together with a voltage dependence of the conductance (due to the release of the block, see [46]) this yields the following dynamics:

 sEi(V,t) = Mg(Vi)NE∑j=1wEijyj(t), (14) ˙yj = −yjτE+αxj(t)(1−yj), (15) ˙xj = −xjτE,rise+∑tjδ(t−tj), (16) Mg(V) = 11+γexp(−βV). (17)

See 1 for parameter values used in simulations.

Network connectivity All connections except for the recurrent excitatory connections are all-to-all and uniform. The recurrent excitatory connections are chosen to be distance-dependent. As in the rate model, each neuron of the excitatory population with index is assigned an angular position . Recurrent excitatory NMDA connections from neuron to neuron are then given by the Gaussian function :

 wEij =wE(θi−θj)=w0+(w+−w0)exp(−[min(∣∣θi−θj∣∣,2π−∣∣θi−θj∣∣)]212σ2w).

Additionally, for each neuron we keep the integral over all recurrent connection weights normalized, resulting in the normalization condition This normalization ensures that varying the maximum weight will not change the total recurrent excitatory input if all excitatory neurons fire at the same rate. Here, we choose as a free parameter and constrain the baseline connection weight to

 w0 =w+σwerf(π√2σw)−√2πσwerf(π√2σw)−√2π.

#### 4.3 Self-consistent equations

Placement of sampling points Self consistent equations ((LABEL:mfeq-selfconsist)) are constructed for both rate-based and spiking neurons by using the low-dimensional parametrization (LABEL:g-gen-gaussian) described in the main text. As mentioned there, we choose the top of the firing rate profile , as well as the lowest point . For the intermediate points for , we sample the firing rate profile by inverting the function to give a sample at a desired height with . This yields a relation for the position which depends on the shape parameters and :

 θi=−gσlog(ai)1gr.

For all figures except for 5 and 6, the intermediate points were chosen by setting and , although we show (2 and 3) that the exact choice of points affects the solutions only slightly.

In 5 we iterate through even numbers of sampling points. As before, we choose , as well as the lowest point . Generalizing the placement of points described above, the remaining points were chosen as for , and for . For the optimization of spiking network parameters shown in 6, we chose sampling points: we first chose points by the scheme just described, then added the point .

Derivation of input-output functions for the rate network For the rate network, we set (LABEL:rate-input) to zero and solve for the steady-state input , which yields

 si =τsN−1∑j=0wijνj=τsN−1∑j=0wijg(θj) ≈τsN2π∫π−πdφw(θi−φ)g(φ).

Here, we have replaced the activity of neurons in the network by our parametrization . In the second line we approximated the summation by the integral and exploited that the connectivity is only dependent on the angular distance between the neuron (at position ) and neurons (at varying positions ), to replace . We use this steady state input in (LABEL:rate-neurons) to arrive at (LABEL:rate-error).

Derivation of input-output functions for the spiking network In the rate model presented in 4.1 the firing rates are given by (LABEL:rate-neurons). In the spiking network, we have to approximate the expected firing rates of neurons. To this end, we first replace the synaptic activation variables for by their expectation values under Poisson input. For the linear synapses this yields

 ⟨sext(t)⟩Poisson= τextνpre, ⟨sI(t)⟩Poisson= τIνI.

The nonlinear synaptic activation of NMDA synapses under stimulation with Poisson processes at rates was estimated by simulating Eqs. (15)-(16) under varying presynaptic firing rates and fitting an interpolating function to the temporal means of the synaptic activation . An analytical approximation of the function was stated in [27, p. 80]. We instead chose to numerically fit this function to simulated data, since for higher firing rates the analytical approximation tended to over-estimate the synaptic activations.

We then define the expected recurrent excitatory input, assuming presynaptic Poisson firing, by

 Ji ≡1NENE−1∑i=0wEijψ(νj). (18)

Following [27], we linearize the voltage dependence (LABEL:Mg-NMDA) at the mean voltage and reduce the differential equations of (LABEL:base_deqs) to dimensionless form. The resulting expressions depend only on the mean firing rates and mean voltages of excitatory and inhibitory neurons (see A of the Appendix for the detailed expressions and derivations):

 τi˙Vi = −(Vi−VL)+μi+σi√τiηi(t) (19) μi = μi(Ji,νI,νext,⟨Vi⟩) σi = gextCm(⟨V⟩−VE)τext√τiNextνext. τi = τi(Ji,νI,νext,⟨Vi⟩) ⟨ηi(t)⟩ = 0 ⟨ηi(t)ηi(t′)⟩ = 1τextexp(−|t−t′|τext)

Here, is the bias of the membrane potential due to synaptic inputs, and measures the scale of fluctuations in the membrane potential due to random spike arrival approximated by the Gaussian process . Due to active synaptic conductances, the effective membrane time constant is decreased from the intrinsic membrane time-constant – its value thus depends on all presynaptic firing rates (and the mean voltage, see A of the Appendix).

The prediction of the mean firing rates and of mean voltages of populations of neurons governed by this type of differential equation can be well approximated by [27] (see also the published corrections in [47]):

 ϕ[μi,σi,τi] = (τref+√πτi∫α(μi,σi)β(μi,σi)duexp(u2)[1+erf(u)])−1, (20) α(μi,σi) = Vreset−VL−μiσi(1+τext2τi)+1.03√τextτi−τextτi, (21) β(μi,σi) = Vreset−VL−μiσi, (22) ⟨Vi⟩ = μi+VL−(Vthr−Vreset)ϕ[μi,σi,τi]τi. (23)

As in the rate model, we first replace the network activity on the right hand side of (LABEL:total-input) by our parametrization . We then approximate the summation with an integral , and replace the connectivity by its continuous equivalent to arrive at:

 Ji ≈12π∫π−πdφwE(θi−φ)ψ(g(φ)) ≡12πinputθi[g].

We then substitute this relation in Eqs. (20) and (23) to arrive at

 μi =μi(inputθi[g],νI,νext,⟨Vi⟩), τi =τi(inputθi[g],νI,νext,⟨Vi⟩), F(inputθi[g],νI,⟨Vi⟩) ≡ϕ[μi,σi,τi], G(inputθi[g],νI,⟨Vi⟩) ≡μi+VL−(Vthr−Vreset)gi(θi)τi, (24)

which defines Eqs. (5) and (6) of the main text.

Optimization of self-consistent equations For each point that we choose to sample from the excitatory population, the theory of 4.3.3 yields constraining Equations (20) and (23). The inhibitory population, being homogeneous and unstructured, yields equations, for the 2 free variables and . Since we choose a low-dimensional parametrization for the excitatory population, the number of free variables increases only by (the mean voltage ) for each point that we choose to evaluate, while yielding the same constraining equations. This allows us to choose at minimum evaluation points to constrain the free parameters of the parametrization (see 3 for a listing).

The errors (and , for spiking networks) between firing rate predictions and the firing rate parametrization are numerically minimized using methods provided in the Scipy package [48]. In particular, if the dimension of the error function matches the number of parameters, we are able to use the efficient optimize.root solver (Root in the main text), which applies a modified version of the Powell hybrid method [49], but does not provide constraints on valid parameter regions. Here, we implemented artificial constraints by returning a high error for dimensions that leave the bounded region. The same optimization results were achieved by using the slower optimize.minimize method, which allows optimization (of the sum of squared errors , or for spiking networks) in constrained parameter regions via the L-BFGS-B [50] and SLSQP [51]. For spiking networks, we normalized firing rate errors by the firing rate and voltage differences by the voltage range , to ensure comparable contributions to the SSE for variables with different dimensions.

For the optimization results of 6 we chose sampling points (see 4.3.1 for details), which yielded errors, including those of the inhibitory population. These were used to optimize free parameters using the SLSQP algorithm (see 3 for a listing). We also tried using sampling points, which brings both the number of equations and free parameters up to – this yielded similar results at increased processing time (the possibly faster Root solver failed to converge most of the time).

Wall clock times for error functions in 5A were measured on a single core of a MacBook Pro with 2,6 GHz Intel Core i5 processor, using the Python benchmark timeit.timeit (minimum wall clock time of repetitions). We first measured average time for evaluation of a single error of (LABEL:spike-selfcons-err), which evaluated to (100 repetitions of 10 executions). For a single evaluation of the inhibitory error of (LABEL:spike-selfcons-err-I) we found (the numerical integration performed in the calculation of is faster). The wall clock time (see 5A, right axis) for a given number of error vector evaluations on points was then calculated by .

#### 4.4 Spiking simulations

All network simulations where performed in the NEST simulator [44] using fourth-order Runge-Kutta integration as implemented in the GSL package [52]. For the simulation results shown in 4 and 6B, networks underwent a transient initial period of . Neurons centered at a position of then received a short and strong excitatory input mediated by additional Poisson firing onto AMPA receptors () with connections scaled down by a factor of . The external input ceased at .

Simulations were run until and spikes were recorded and converted to firing rates by spike counts in a window shifted at a time resolution of . For every time step, the firing rates across the whole population were then rectified (by measuring the phase of the first spatial Fourier coefficient and setting it to by rotation of the angular space) to center the bump of activity around the position . The resulting centered firing rates were then sampled at an interval of in the interval for 5 repetitions of the network simulation with the same microscopic parameters. In 4B times were: . In 4D,E we chose: . For 6B: .

## Appendix

### Appendix A Detailed derivation of dimensionless voltage equations

In this section we give details on the derivation of (LABEL:reduced-voltage-deq) as well as the resulting full expressions. This closely follows [27, pp. 79–81], while keeping a slightly simplified notation.

We first replace all synaptic activations in (LABEL:base_deqs) by their expected values under Poisson input, which also introduces the expected recurrent excitatory input (cf. (LABEL:total-input)):

 sexti(t) →Nextτextνpre+ΔS,ext, sIi(t) →NIτIνI, sEi(t) →Mg(Vi)NE−1∑i=0wEijψ(νj)=Mg(Vi)NEJi.

Here, represents fluctuations of around the mean of due to random spike arrival at fast AMPA synapses. Since the synaptic timescales (GABA, NMDA) of the other synaptic activations are much longer, these fluctuations can be neglected. We then rearrange (LABEL:base_deqs) to dimensionless form, which yields:

 CmgL˙Vi = −(Vi−VL)[1+TIνI+Textνext+gEgLMg(Vi)NEJi] +(VI−VL)TIνI+(VE−VL)[Textνext+gEgLMg(Vi)NEJi] +gextgL(Vi−VE)ΔS,ext,

where are effective timescales of external and inhibitory input.

To get rid of the nonlinear voltage dependence of the right hand side through , we linearize this function (cf. (LABEL:Mg-NMDA)) around the mean voltage :

 Vi−VE1+γexp(−βVi) =

where .

After replacing the voltage dependence in the fluctuation term by the mean voltage, we arrive at

 CmgL˙Vi = −(Vi−VL)[1+TIνI+Textνext+(ρ1+ρ2)Ji] +(VI−VL)TIνI+(VE−VL)[Textνext+ρ1Ji] +ρ2(⟨Vi⟩−VL)Ji+gextgL(⟨Vi⟩−VE)ΔS,ext. ρ1 = gENEgLρ ρ2 = βgENE(⟨Vi⟩−VE)(ρ−1)gLρ2 ρ = 1+γexp(−β⟨Vi⟩). (25)

Finally, we replace the fluctuations by independent Gaussian noise processes with zero mean and simpler autocorrelation , to arrive at the full form of (LABEL:reduced-voltage-deq) in the main text:

 τi˙Vi = −(Vi−VL)+μi+σi√τiηi(t) (26) Si = 1+TIνI+Textνext+(ρ1+ρ2)Ji μiSi = (VI−VL)TIνI+(VE−VL)Textνext+ [ρ1(VE−VL)+ρ2(⟨V⟩−VL)]Ji σi = gextCm(⟨V⟩−VE)τext√τiNextνext. τi = Cm