A Biologically Plausible Supervised Learning Method for Spiking Neural Networks Using the Symmetric STDP Rule
Spiking neural networks (SNNs) possess energy-efficient potential due to event-based computation. However, supervised training of SNNs remains a challenge as spike activities are non-differentiable. Previous SNN training methods can be generally categorized into two basic classes, i.e., backpropagation-like training methods and plasticity-based learning methods. The former depend on energy-inefficient real-valued computation and non-local transmission, as also required in artificial neural networks (ANNs), whereas the latter are either considered biologically implausible or exhibit poor performance. Hence, biologically plausible (bio-plausible), high-performance supervised learning (SL) methods for SNNs are still lacking. In this paper, we propose a novel bio-plausible SNN model for SL based on the symmetric spike-timing dependent plasticity (sym-STDP) rule observed in neuroscience. By combining the sym-STDP rule with bio-plausible synaptic scaling and the intrinsic plasticity of a dynamic threshold, our SNN model implemented SL well and achieved good performance on the benchmark recognition task (MNIST dataset). To reveal the underlying mechanism of our SL model, we visualized both layer-based activities and synaptic weights using the t-distributed stochastic neighbor embedding (t-SNE) method after training and found that they were well clustered, demonstrating excellent classification ability. Furthermore, to verify the robustness of our model, we trained it on another, more realistic dataset (Fashion-MNIST), on which it also showed good performance. As the learning rules are bio-plausible and based purely on local spike events, our model could be easily applied to neuromorphic hardware for online training and may help in understanding SL information processing at the synaptic level in biological neural systems.
keywords: spiking neural networks, dopamine-modulated spike-timing dependent plasticity, pattern recognition, supervised learning, biological plausibility
Due to the emergence of deep learning technology and the rapid growth of high-performance computing, artificial neural networks (ANNs) have achieved various breakthroughs in machine learning tasks (lecun2015deep; schmidhuber2015deep). However, ANN training can be energy inefficient, as communication among neurons is generally based on real-valued activities, and many real-valued weights and error signals need to be transmitted in the basic error backpropagation (BP) training algorithm (rumelhart1988learning). This can consume considerable energy for computations on central processing units (CPUs) and information transport between CPUs and random-access memory (RAM) in traditional high-performance computers. As such, ANNs are potentially highly energy consumptive. Information processing in the human brain is very different from that in ANNs. Neurons in biological systems communicate with each other via spikes or pulses (i.e., event-based communication), which allows for the asynchronous update of states only when a spike arrives. This has a potential energy-efficiency advantage over the synchronous update of real-valued states at each step in ANNs. In addition, the learning rules that modify the synaptic weights of realistic neurons can also be based on local spike events, e.g., spike-timing dependent plasticity (STDP) (bi1998synaptic; Caporale2008b; song2000competitive), where the weight of a synapse is changed based on the spike activities of its pre- and post-synaptic neurons. This local event-based learning requires no additional energy for the non-local transmission that the ANN training process demands. Therefore, spiking neural networks (SNNs), which can closely mimic the local event-based computations and learning of biological neurons, may be more energy efficient than ANNs, especially if SNNs are implemented on neuromorphic platforms.
Moreover, SNNs exhibit a natural ability for spatiotemporal coding of inputs, and thus hold the potential advantage of efficient coding through sparse activities, particularly for continuous spatiotemporal inputs (e.g., optic flow), whereas ANNs require specially designed architectures to deal with temporal input (e.g., long short-term memory, LSTM) (hochreiter1997long; greff2016lstm), as their individual neurons generally have no intrinsic temporal characteristics and only generate output responses based on the current inputs at each step. For these reasons, SNNs are considered the third generation of neural network models and have attracted growing interest in exploring their functions in real-world tasks (maass2004computational; tavanaei2018deep).
Unlike ANN training, SNN training is highly challenging due to the non-differentiable properties of the spike-type activity. Hence, the development of an efficient training algorithm for SNNs is of considerable importance. Much effort has been expended on this issue over the past two decades (tavanaei2018deep), with the resulting approaches generally characterized as indirect supervised learning (SL), direct SL, or plasticity-based training (tavanaei2018deep; wu2017spatio). In the indirect SL method, ANNs are first trained and then mapped to equivalent SNNs by different conversion algorithms that transform real-valued computing into spike-based computing (cao2015spiking; hunsberger2015spiking; diehl2015fast; esser2015backpropagation; diehl2016conversion; neil2016effective; esser2016convolutional; hu2018spiking); however, this method does not involve SNN learning itself and therefore provides no heuristic information on how to train an SNN. The direct SL method is based on the BP algorithm (wu2017spatio; lee2016training; Samadi2017Deep; tavanaei2017bp; zhangaaai2018), e.g., using membrane potentials as continuous variables for calculating errors in BP (lee2016training; zhangaaai2018), or using a continuous activity function to approximate neuronal spike activity and obtain differentiable activity for the BP algorithm (wu2017spatio; tavanaei2017bp). However, such approaches must still perform numerous real-valued computations and non-local communications during training; thus, BP-based methods are as potentially energy inefficient as ANNs and also lack bio-plausibility. In plasticity-based training, synaptic plasticity rules (e.g., STDP) are used to extract features for pattern recognition in an unsupervised learning (USL) manner (diehl2015unsupervised). Because synaptic plasticity on its own performs only spontaneous unsupervised clustering, this method requires an additional supervised module for recognition tasks.
Three supervised modules have been used in previous studies: (1) a classifier (e.g., support vector machine, SVM) (kheradpisheh2017stdp); (2) a label statistical method outside the network (diehl2015unsupervised); and (3) an additional supervised layer (Beyeler2013Categorization; hu2017stdp; shrestha2017stable; mozafari2018combining). In our opinion, a neural network model with biological plausibility must meet the following basic characteristics. First, the single-neuron model must integrate temporal inputs and generate pulses or spikes as outputs. Second, the computational processes of training and inference must be completely spike-based. Finally, all learning rules must be grounded in experiments; they should be neither obviously contradicted by experimental findings nor artificially designed. The first two supervised modules are bio-implausible due to their need for computation outside the SNNs (kheradpisheh2017stdp; diehl2015unsupervised). The third supervised module has the potential for bio-plausibility, but existing supervised SNN models have either adopted artificially modified STDP rules (Beyeler2013Categorization; hu2017stdp; shrestha2017stable; mozafari2018combining) or exhibited poor performance (Beyeler2013Categorization). Currently, therefore, truly bio-plausible SNN models that can accomplish SL and achieve high-performance pattern recognition are lacking. Moreover, although great progress has been made in understanding the physiological mechanisms of synapse modification at the microscopic level (bi1998synaptic; Caporale2008b), how teacher learning at the macroscopic behavioral level is realized by changes in synapses at the microscopic level, i.e., the mechanism of SL processing in the brain, remains far from clear.
In this study, we proposed a novel bio-plausible SL method for training SNNs based on biological plasticity rules. We introduced the dopamine-modulated STDP (DA-STDP) rule, a type of symmetric STDP (sym-STDP), for pattern recognition. The DA-STDP rule has been observed in several experiments in the hippocampus and prefrontal cortex (Zhang2009Gain; Ruan2014; brzosko2015retroactive), where the modification of synaptic weight is always incremental if the interval between the pre- and post-synaptic spikes falls within a narrow time window when dopamine (DA) is present. The differences between the sym-STDP (DA-STDP) and classic STDP rules are shown in Figure 1 (bi1998synaptic; Zhang2009Gain; brzosko2015retroactive). While the sym-STDP rule has been used previously (masuda2007formation; tanaka2009cmos; serrano2013stdp; mishra2016symmetric), this is the first time it has been applied to SL. In our proposed model, a three-layer feedforward SNN was trained by DA-STDP combined with synaptic scaling (turrigiano2008the; turrigiano1998activity; effenberger2015self-organization) and a dynamic threshold, two homeostatic plasticity mechanisms that stabilize and specialize the network response under supervised signals. Two different training methods were used in our SNN model, i.e., training the two layers of input synaptic weights simultaneously and training the SNN layer by layer. Our model was tested on the benchmark handwritten digit recognition task (MNIST dataset) and achieved high performance under both training methods. We also evaluated the model on the MNIST-like fashion product dataset (Fashion-MNIST), on which it likewise showed good classification performance. These results highlight the robustness of our model.
2 Network architecture and neuronal dynamics
We constructed a three-layer feedforward spiking neural network for SL, which included an input layer, hidden layer, and SL layer (Figure 2). The structure of the first two layers was inspired by the USL model of Diehl and Cook diehl2015unsupervised . Input patterns were coded as Poisson spike processes with firing rates proportional to the intensities of the corresponding pixels. The Poisson spike trains were then fed to the excitatory neurons in the hidden layer with all-to-all connections. The dark blue shaded area in Figure 2 shows the input connection to a specific neuron. The connection from the excitatory to inhibitory neurons was one-to-one. An inhibitory neuron only received input from the corresponding excitatory neuron at the same position in the map and inhibited the remaining excitatory neurons. All excitatory neurons were fully connected to the SL layer. In the SL layer, neurons fired with two different modes during the training and testing processes. During the SL training period, the label information of the current input pattern was converted to a teacher signal in a one-hot coding scheme by the 10 SL neurons. Only one SL neuron was pushed to fire as a Poisson spike process, with the remaining SL neurons maintained in the resting state. In the testing mode, all SL neurons fired according to inputs from the hidden layer.
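As a concrete illustration of the input coding and the one-hot teacher signal described above, the following minimal NumPy sketch generates Poisson spike trains for an input image and for the SL layer. The function names, the time step, and the default rates here are our own illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def poisson_spikes(image, dt=0.001, steps=350, max_rate=63.75, rng=None):
    """Encode pixel intensities (0-255) as Poisson spike trains.
    Rates are proportional to intensity; max_rate (Hz) corresponds to
    the maximum intensity of 255."""
    rng = rng if rng is not None else np.random.default_rng(0)
    img = np.asarray(image, dtype=float)
    rates = img / 255.0 * max_rate            # Hz per input neuron
    p = rates * dt                            # spike probability per step
    return rng.random((steps, img.size)) < p.ravel()  # bool (steps, n_inputs)

def teacher_spikes(label, n_classes=10, dt=0.001, steps=350, rate=200.0, rng=None):
    """One-hot teacher signal: only the labeled SL neuron fires as a
    Poisson process; the rest stay silent (resting state)."""
    rng = rng if rng is not None else np.random.default_rng(1)
    spikes = np.zeros((steps, n_classes), dtype=bool)
    spikes[:, label] = rng.random(steps) < rate * dt
    return spikes
```

During testing, `teacher_spikes` would not be used; the SL neurons instead fire according to their input from the hidden layer.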
We used the leaky integrate-and-fire (LIF) model and set the parameters within bio-plausible ranges: the resting membrane potential E_rest, the equilibrium potentials E_exc and E_inh of the excitatory and inhibitory synapses, respectively, and the membrane time constant τ_v. The membrane potential dynamics of a neuron can be described by (diehl2015unsupervised; vogels2011inhibitory):
τ_v dV/dt = (E_rest − V) + g_e (E_exc − V) + g_i (E_inh − V),   (1)
where g_e and g_i are the total excitatory and total inhibitory conductances, respectively.
Both g_e and g_i follow similar dynamics. They depend on the numbers of excitatory and inhibitory in-synapses (N_E and N_I) and on the corresponding in-synapse weights (w^E_j and w^I_j). The time constants of conductance decay, τ_ge and τ_gi, were equal. Thus, g_e and g_i in Eq. 1 can be described by the following equations (diehl2015unsupervised; vogels2011inhibitory):
τ_ge dg_e/dt = −g_e + Σ_{j=1}^{N_E} w^E_j Σ_k δ(t − t^k_j),   (2)
τ_gi dg_i/dt = −g_i + Σ_{j=1}^{N_I} w^I_j Σ_k δ(t − t^k_j),   (3)
where t^k_j is the k-th spike time of the j-th presynaptic neuron.
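The LIF and conductance dynamics above can be sketched with a simple forward-Euler update. All constants below are illustrative assumptions rather than the paper's exact parameter values:

```python
import numpy as np

# Illustrative constants (assumptions, not the paper's exact values)
E_REST, E_EXC, E_INH = -65.0, 0.0, -100.0  # mV
TAU_V, TAU_G = 100.0, 1.0                  # ms
DT = 0.5                                   # ms, integration step

def lif_step(v, g_e, g_i, exc_in, inh_in):
    """One Euler step of the conductance-based LIF dynamics (Eqs. 1-3).
    exc_in / inh_in are the summed weights of presynaptic spikes arriving
    this step (the delta-function terms in the conductance equations)."""
    g_e = g_e + exc_in - DT * g_e / TAU_G  # conductances decay, jump on spikes
    g_i = g_i + inh_in - DT * g_i / TAU_G
    dv = (E_REST - v) + g_e * (E_EXC - v) + g_i * (E_INH - v)
    v = v + DT * dv / TAU_V
    return v, g_e, g_i
```

A neuron at rest with no input stays at E_REST; an excitatory spike transiently raises g_e and depolarizes the membrane toward E_EXC.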
In our neuron model, once the membrane potential exceeds the threshold voltage, it is reset to V_reset, and the neuron cannot fire again during the refractory period. The firing threshold is not static; instead, a dynamic threshold (i.e., homeostatic intrinsic plasticity) mechanism is adopted. The dynamic threshold is an intrinsic plasticity of the neuron and is found in different neural systems (yeung2004synaptic; sun2009experience; zhang2003other; pozo2010unraveling; cooper2012bcm). Here, it was introduced so that each excitatory neuron in the hidden layer generates a specific response to one class of input patterns (zhang2003other; pozo2010unraveling; diehl2015unsupervised); otherwise, single neurons can dominate the response pattern owing to their enlarged input synaptic (in-synaptic) weights and lateral inhibition. The membrane potential threshold was composed of two parts (Eq. 4), a constant V_th0 and a dynamic variable θ:
V_th = V_th0 + θ.   (4)
θ increases slightly when the neuron fires; otherwise it decays exponentially. Because a neuron will not (or will barely) fire when θ is too large, which can degrade model performance, we adopted a dynamical increment that gradually slows the growth of θ. Therefore, θ can be described as:
dθ/dt = −θ/τ_θ + Δθ_max e^{−θ/θ_0} Σ_k δ(t − t^k),   (5)
where τ_θ is the time constant of θ, θ(0) is its initial value, Δθ_max is the maximum value of the increment, θ_0 is its dynamical scaling factor, and t^k is the k-th firing time.
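One plausible way to realize the decaying threshold increment described above is the following sketch, where the per-spike increment shrinks exponentially as θ grows. The specific functional form and all constants are assumptions for illustration:

```python
import numpy as np

# Illustrative constants (assumptions)
TAU_THETA = 1e7     # ms, very slow decay of theta
DTHETA_MAX = 0.05   # mV, maximum per-spike increment
THETA_SCALE = 20.0  # mV, dynamical scaling factor of the increment

def update_theta(theta, fired, dt=0.5):
    """Homeostatic threshold variable: decays exponentially, and on each
    spike grows by an increment that shrinks as theta gets large."""
    theta = theta * np.exp(-dt / TAU_THETA)        # exponential decay
    if fired:
        theta += DTHETA_MAX * np.exp(-theta / THETA_SCALE)  # damped growth
    return theta
```

The damped growth prevents θ from running away, so a frequently firing neuron raises its threshold quickly at first and then more and more slowly.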
Synaptic weights were modified according to two biological plasticity rules: DA-STDP and synaptic scaling. Dopamine is an important neuromodulator and plays a critical role in learning and memory processes (tritsch2012dopaminergic). Here, inspired by the DA-STDP found in different brain areas, such as the hippocampus and prefrontal cortex (Zhang2009Gain; Ruan2014; brzosko2015retroactive) (Figure 1), we hypothesized that DA can modulate changes in synaptic weights according to the DA-STDP rule during the SL process. The weight increment under the phenomenological model of DA-STDP can be expressed as (Zhang2009Gain; brzosko2015retroactive):
Δw = A_+ e^{−Δt/τ_+} if Δt ≥ 0;  Δw = A_− e^{Δt/τ_−} if Δt < 0,   (6)
where Δt = t_post − t_pre is the time difference between the pre- and post-synaptic spikes, τ_+ and τ_− are the time constants of the positive and negative phases of Δt, respectively, and A_+ > 0 and A_− > 0 are the learning rates. Note that the weight change is potentiating for both signs of Δt, which is what makes the rule symmetric.
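A minimal implementation of the symmetric weight change described above: the update is positive for both signs of Δt and decays with |Δt|. Parameter values are illustrative assumptions:

```python
import numpy as np

# Illustrative parameters (assumptions)
A_PLUS, A_MINUS = 0.01, 0.01     # learning rates
TAU_PLUS, TAU_MINUS = 20.0, 20.0 # ms, time constants of the two phases

def sym_stdp(delta_t):
    """Symmetric (DA-modulated) STDP weight increment as a function of
    delta_t = t_post - t_pre. Both branches are potentiating, so the
    weight grows whenever pre- and post-spikes fall in the window."""
    dt = np.asarray(delta_t, dtype=float)
    return np.where(dt >= 0,
                    A_PLUS * np.exp(-dt / TAU_PLUS),
                    A_MINUS * np.exp(dt / TAU_MINUS))
```

With equal learning rates and time constants, the window is exactly symmetric around Δt = 0; in contrast, classic STDP would depress the weight for Δt < 0.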
As DA-STDP can only increase synapse strength, the synaptic scaling plasticity rule was introduced to generate competition among all in-synapses of a neuron in the hidden layer (only for excitatory neurons) and the SL layer. Synaptic scaling is a homeostatic plasticity mechanism observed in many experiments (pozo2010unraveling; turrigiano2008the; davis2006homeostatic), especially in visual systems (maffei2008multiple; keck2013synaptic; hengen2013firing) and the neocortex (turrigiano1998activity). Here, synaptic scaling was applied after each pattern was trained, and synapse strengths were normalized according to the following equation (diehl2015unsupervised):
w_i ← β N w_i / Σ_{j=1}^{N} w_j,   (7)
where N is the number of in-synapses of a single neuron and β (0 < β < 1) is the scaling factor.
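Under one plausible reading of the normalization described above, scaling renormalizes the mean in-synaptic weight of each neuron to β while preserving the weight ratios, which is what creates competition: a synapse can only grow at the expense of its neighbors. A sketch, not the paper's exact implementation:

```python
import numpy as np

def synaptic_scaling(w, beta=0.1):
    """Multiplicatively rescale all in-synapse weights of one neuron so
    that their mean equals beta (one plausible reading of the scaling
    equation; this is an assumption). Relative weight ratios are kept."""
    w = np.asarray(w, dtype=float)
    return w * beta * w.size / w.sum()
```

Because the total in-synaptic weight is fixed after scaling, repeated sym-STDP potentiation of the "matched" synapses implicitly weakens the unmatched ones.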
3 Recognition performance for the MNIST task
Our SNN model was trained on the MNIST dataset (training set: 60 000 samples; test set: 10 000 samples) (http://yann.lecun.com/exdb/mnist/) using two training methods, i.e., simultaneous and layer-by-layer training. For simultaneous training, the in-synapses of the hidden and SL layers were updated simultaneously during the training process, whereas for layer-by-layer training, the hidden layer was trained first, and the SL layer was then trained with all in-synaptic weights of the hidden layer fixed. In the inference stage, the most active SL layer neuron determined the inferred label of the input sample. The error rate on the test dataset was used to evaluate performance. No data preprocessing was conducted, and we used the GeNN SNN simulator (yavuz2016genn) for all experiments.
For the model parameters, we set the simulation time step to and the presentation time of a single input sample to , followed by a resting period of . Both and were tuned according to network size. The in-synaptic and output synaptic (out-synaptic) weights of the excitatory neurons in the hidden layer were in the ranges of and , respectively. All initial weights were set to the corresponding maximum weights multiplied by uniformly distributed values in the range . The synaptic scaling factor was set to 0.1. The firing rates of the input neurons were proportional to the pixel intensities of the MNIST images (diehl2015unsupervised). We set the maximum rate to 63.75 Hz by dividing the maximum pixel intensity of 255 by 4. When fewer than five spikes were emitted by the excitatory neurons of the hidden layer during the 350 ms presentation, the maximum input firing rate was increased by 32 Hz and the sample was presented again. The firing rates of the SL layer neurons were 200 Hz or 0 Hz according to the one-hot code during the SL training period.
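The retry rule for weakly driven samples can be sketched as follows. The function name is ours; the 5-spike threshold and the 32 Hz bump are taken from the text:

```python
def adapt_input_rate(max_rate, hidden_spike_count, min_spikes=5, bump=32.0):
    """If the hidden layer emitted fewer than `min_spikes` spikes during
    the 350 ms presentation, raise the maximum input rate by `bump` Hz
    and signal that the sample should be presented again."""
    if hidden_spike_count < min_spikes:
        return max_rate + bump, True   # retry with stronger input drive
    return max_rate, False             # enough activity; move on
```

Starting from 63.75 Hz, a silent sample would thus be re-presented at 95.75 Hz, 127.75 Hz, and so on, until the hidden layer responds.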
To demonstrate the power of the proposed SL method for different network sizes, we compared the results of our SL algorithm to those of the 'Label Statistics' algorithm used in a previous study (diehl2015unsupervised). In that algorithm, an additional module first determines the class of input patterns that each neuron most likely represents, labels the neuron with that class during training, and then, at test time, infers the class whose labeled neurons have the maximum mean firing rate.
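A compact sketch of this 'Label Statistics' baseline: each hidden neuron is labeled with the class that drives it most strongly during training, and inference picks the class whose labeled neurons fire fastest on average. Function names and the rate-matrix layout are our own illustrative choices:

```python
import numpy as np

def label_statistics(rates, labels, n_classes=10):
    """Assign each hidden neuron the class for which its mean training
    firing rate is highest. `rates` has shape (n_samples, n_neurons)."""
    rates = np.asarray(rates, dtype=float)
    labels = np.asarray(labels)
    class_means = np.stack([rates[labels == c].mean(axis=0)
                            for c in range(n_classes)])  # (n_classes, n_neurons)
    return class_means.argmax(axis=0)                    # one label per neuron

def infer(rates_sample, neuron_labels, n_classes=10):
    """Predict the class whose labeled neurons have the highest mean rate."""
    sums = np.zeros(n_classes)
    counts = np.zeros(n_classes)
    for r, lab in zip(rates_sample, neuron_labels):
        sums[lab] += r
        counts[lab] += 1
    means = np.where(counts > 0, sums / np.maximum(counts, 1), -np.inf)
    return int(means.argmax())
```

Note that both steps happen outside the SNN, which is exactly the bio-implausibility that the SL layer in our model is meant to remove.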
We trained the model for different numbers of epochs over the training dataset depending on network size (3, 5, 7, 10, and 20 epochs for hidden-layer excitatory neuron counts of 100, 400, 1600, 6400, and 10000, respectively). During training, network performance was estimated on the test dataset after every 10 000 training samples. Taking network sizes 400 and 6400 as examples, classification accuracy converged quickly under both training methods (Figure 3). Figure 4 shows very high consistency between the desired (label) and inferred (real) outputs in the SL layer.
The results for the different learning methods are summarized in Table 1. Our two SL methods outperformed the 'Label Statistics' method for all small-scale networks (). In addition, the 'Layer-by-Layer' training method outperformed 'Label Statistics' at all network scales. The best performance of our SL model (96.73%) was achieved in the largest network under the 'Layer-by-Layer' training method. These results indicate that an SNN equipped with biologically realistic plasticity rules can achieve good SL by purely spike-based computation.
Table 1. Test accuracy (%) of the different learning methods for network sizes of 100, 400, 1600, 6400, and 10000.
4 Visualization of model clustering ability
To demonstrate the underlying mechanisms of our SL model in pattern recognition tasks, we adopted the t-distributed stochastic neighbor embedding (t-SNE) method (maaten2008visualizing) to reveal the model's clustering ability. t-SNE is a popular nonlinear dimensionality-reduction method widely used for visualizing high-dimensional data in a low-dimensional space (e.g., two or three dimensions). We visualized the original digit patterns (Figure 5A), the spike activities of the hidden layer (Figure 5B), and the spike activities of the SL layer (Figure 5C) for all samples in the test dataset. The separability of the outputs of the three layers increased from the input layer to the SL layer, indicating that the SL layer served as a good classifier after training.
To understand why our SL method achieved effective clustering of the hidden layer outputs, we also applied t-SNE to reduce the dimensions of the out-synaptic weights of the excitatory neurons. As shown in Figure 6, the clustering of these out-synaptic weights was highly consistent with the clustering of the neurons' label information obtained by the 'Label Statistics' method. This explains why our SL method achieved performance comparable to the 'Label Statistics' method on the classification task, even though our model did not require any 'Label Statistics' computation outside the network to determine the most likely representation of a hidden neuron (diehl2015unsupervised), instead realizing SL based solely on computation within the network.
5 Comparison with other SNN models
Current SNN models for pattern recognition can be generally categorized into three classes: indirect training (cao2015spiking; hunsberger2015spiking; diehl2015fast; esser2015backpropagation; diehl2016conversion; neil2016effective; esser2016convolutional; hu2018spiking), direct SL training with BP (wu2017spatio; Beyeler2013Categorization; lee2016training; Samadi2017Deep; tavanaei2017bp; zhangaaai2018; Lee2018), and plasticity-based unsupervised training with supervised modules (querlioz2013immunity; diehl2015unsupervised; kheradpisheh2017stdp). Table 2 summarizes several previous SNN models trained and tested using the full training and test sets of the MNIST dataset.
Comparison with SNN models trained using BP. In previous studies with indirect training, ANNs were trained using the BP algorithm based on activity rates and then transformed into equivalent SNNs based on firing rates. Although their performance was very good, they ignored the temporal evolution of SNNs and spike-based learning processes. Thus, indirect training offers little insight into how SNNs learn and encode different features of inputs. Among studies using direct SL training, most adopted the BP algorithm and calculated errors based on continuous variables, e.g., membrane potentials (voltages), currents, or activity rates, to approximate spike activities and achieve SL (wu2017spatio; lee2016training; Samadi2017Deep; tavanaei2017bp; zhangaaai2018; Lee2018). For example, Zhang et al. proposed a voltage-driven plasticity-centric SNN for SL (zhangaaai2018), which required four learning stages for training, i.e., equilibrium learning and a voltage-based STDP-like learning rule for USL, as well as voltage-based BP for SL; this, however, makes the model highly dissimilar to biological neuronal systems. Lee et al. pre-trained a multi-layer SNN with STDP in a USL manner to obtain good initial weights and then used current-based BP to re-train all layer weights in a supervised way (Lee2018); this, too, renders the model bio-implausible due to the use of the BP algorithm.
Comparison with STDP-based models without BP. Several studies have also proposed STDP-based training methods without BP. These models adopted STDP-like plasticity rules for USL and required a special supervised module for SL, e.g., a classifier (SVM) (kheradpisheh2017stdp), artificial label statistics outside the network (querlioz2013immunity; diehl2015unsupervised), or an additional supervised layer (Beyeler2013Categorization; hu2017stdp; shrestha2017stable; mozafari2018combining). However, the first two supervised modules are bio-implausible because their computing modules lie outside the SNNs, so the SNNs themselves have no direct relationship with SL (querlioz2013immunity; diehl2015unsupervised; kheradpisheh2017stdp). For example, Beyeler et al. adopted a calcium concentration-based STDP for SL (Beyeler2013Categorization), which showed considerably poorer performance than our model. Hu et al. used an artificially modified STDP with a special temporal learning phase for SL (hu2017stdp); however, their STDP rule was artificially designed and its internal mechanism was not well explained. Shrestha et al. also adopted a specially modified STDP rule with exponential weight change and an extended depression window for SL in an SNN, with a supervised module similar to ours, but the performance was relatively poor (below 90%) (shrestha2017stable). Mozafari et al. combined classic STDP and anti-STDP rules to obtain reward-modulated STDP with a remote supervised spike for SL, but they were unable to provide biological evidence explaining this type of learning (mozafari2018combining).
Detailed comparison with the STDP-based SNN model of Diehl and Cook. Our model was inspired by the STDP-based SNN method proposed by Diehl and Cook (diehl2015unsupervised). In their two-layer SNN, STDP extracts features by USL, and an additional computing module outside the SNN performs label statistics during training and classification during testing. In our model, we achieved the same algebraic computation and reasoning with an additional layer of spiking neurons instead of outside-network computations, thus making considerable progress in STDP-based SNN models for SL through completely spike-based computation. Moreover, our model contains two other improvements to the USL process compared with that of Diehl and Cook (diehl2015unsupervised). The first is the novel sym-STDP rule rooted in DA-STDP, which offers a potential explanation for the SL processes occurring in the brain. That is, we speculate that DA may be involved in the SL process and that local synaptic plasticity could change to sym-STDP during the whole training process. With the aid of the forced firing of a supervised neuron by the incoming teacher signal, sym-STDP can establish the relationship between the input and its teacher information after sufficient training. The second improvement is the new dynamic threshold rule in Eq. 5, in which a decay factor for the increment was introduced; this significantly improved performance.
It should be noted that several other SNN training models have performed classification tasks in other ways, but their recognition performance was relatively poor (lin2018relative; sporea2013supervised). Recently, Xu et al. constructed a novel convolutional spiking neural network (CSNN) model using the tempotron as the classifier, attempting to exploit both the convolutional structure and the temporal coding ability of SNNs (xu2018csnn). However, their model only achieved a maximum accuracy of 88% on a subset of the MNIST dataset (training samples: 500; test samples: 100) when the network size equaled ; in contrast, our model achieved an accuracy of 91.41% with equal to on the full MNIST dataset. This indicates that our model can also work very well under small network size constraints.
Thus, for the above reasons, our proposed sym-STDP-based SNN model helps fill the gap in bio-plausible, high-performance spike-based SL methods for SNNs.
Table 3. Test accuracy (%) for network sizes of 400 and 6400.
6 Robustness of our SL model
To demonstrate the robustness of our SL model, we also tested its performance on Fashion-MNIST (xiao2017fashion), an MNIST-like fashion product dataset with 10 classes. Fashion-MNIST shares the same image size and training/testing split structure as MNIST but is considered more realistic, as its images are generated from front-look thumbnail images of fashion products on Zalando's website via a series of conversions. Fashion-MNIST therefore poses a more challenging classification task than MNIST. We preprocessed the data by normalizing the summed gray value of each sample because of the high variance among examples. We then made the necessary parameter adjustments to the model described earlier. Specifically, some class examples in Fashion-MNIST, such as T-shirt, pullover, shirt, and coat, have more non-zero pixels than MNIST digits; we therefore decreased the scaling factor in Eq. 7 to reduce the weights and offset the impact of the excessive spike counts. We trained our model under different network sizes ( = 400 and 6400). The same evaluation criteria were applied. As shown in Table 3, our model also performed well on the Fashion-MNIST task under both SL training methods. For example, the Layer-by-Layer training method achieved accuracies of 78.68% and 85.31% for network sizes 400 and 6400, respectively. The best performance of our model is comparable with traditional machine learning baselines, such as an SVM with linear kernel (83.9%) and a multilayer perceptron (highest reported accuracy: 87.1%) (xiao2017fashion). These results further confirm the robustness of our SL model.
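The preprocessing step, normalizing the summed gray value of each sample, might look like the sketch below. The target value is an illustrative assumption; the text does not state one:

```python
import numpy as np

def normalize_sample(img, target_sum=10000.0):
    """Rescale one sample so that its summed gray value equals a common
    target (target_sum is an assumed value), removing the high variance
    in total intensity across Fashion-MNIST examples."""
    img = np.asarray(img, dtype=float)
    s = img.sum()
    return img if s == 0 else img * (target_sum / s)
```

After this normalization, classes with many non-zero pixels (e.g., coats) no longer drive proportionally more input spikes than sparse classes (e.g., sandals).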
A neural network model with biological plausibility must meet three basic characteristics, i.e., the ability to integrate temporal input and generate spike output, spike-based computation for training and inference, and learning rules all rooted in biological experiments. Here, we used the LIF neuron model, with all learning rules (sym-STDP, synaptic scaling, and the dynamic threshold) rooted in experiments and all computation based on spikes. The proposed SNN model thus meets all of the above requirements and is a truly biologically plausible neural network model.
How, then, did our model obtain good pattern recognition performance? Mainly because the three learning rules worked synergistically to achieve good feature extraction and generate the appropriate mapping from input to output. The sym-STDP rule demonstrated a considerable advantage by extracting the relationship between the spike events of two connected neurons regardless of their temporal order, while synaptic scaling stabilized the total in-synaptic weight and created weight competition among the in-synapses of a neuron, ensuring that the suitable group of synapses became strong. Furthermore, the dynamic threshold mechanism compelled a neuron to fire for matched patterns but rarely for unmatched ones, generating neuron selectivity for a special class of patterns. By combining the three bio-plausible plasticity rules, our SNN model established a strong relationship between the input signal and the supervised signal after sufficient training, ensuring effective SL implementation and good performance on the benchmark pattern recognition task (MNIST). The proposed model also obtained good performance when training the two layers synchronously, whereas many previous SNN models require layer-by-layer or multi-phase/multi-step training (kheradpisheh2017stdp; lee2016training; Samadi2017Deep; zhangaaai2018; hu2017stdp).
To explore the effect of different deep network structures, we also tested the performance of our SL algorithm after adding or removing a hidden layer in our network model. Taking as an example, the network model with no hidden layer achieved a performance of only 57.74%, whereas the network model with two hidden layers reached 81.65%, slightly lower than that of the network with one hidden layer (83.57%). In deep ANNs, classification performance usually improves as the number of hidden layers increases; this phenomenon was not found in our model. In our model, each hidden-layer excitatory neuron has a global receptive field, which differs from deep convolutional neural networks (CNNs), where each convolutional-layer neuron has a shared local receptive field. A CNN needs more convolutional layers to build up larger receptive fields and thereby achieve a global receptive field in the last layer; this is not required in our model. After training, the global receptive field of a hidden-layer neuron produces a maximal response of that neuron to a special category of input patterns, allowing hidden-layer neurons to be easily divided into different classes. That is, unsupervised clustering can be realized with just one hidden layer, and a subsequent supervised layer can then directly implement classification after training on the supervised label information. Therefore, within the framework of our network model, increasing the number of hidden layers does not promote performance, and removing the hidden layer harms performance because the model can then generate only 10 special global receptive fields (via the 10 supervised neurons) for all 60000 training patterns. Accordingly, in our model we increased the number of hidden-layer neurons, rather than the number of hidden layers, to improve performance.
As no more than 10 neurons usually responded to an input pattern in the hidden layer, activity was sparse enough that a simple linear mapping from the high-dimensional activities of the hidden layer to the low-dimensional activities of the classification layer was possible, again indicating no need for further hidden layers. Nevertheless, additional hidden layers may become necessary to improve performance if a convolutional structure (local receptive fields) is adopted for the network. We will explore convolutional SNNs in future work to further verify the universality of our SL algorithm.
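This sparse-code argument can be illustrated numerically. The sizes, sparsity level, and readout weights below are toy assumptions, not values taken from our trained network:

```python
import numpy as np

rng = np.random.default_rng(0)
n_hidden, n_classes = 1000, 10

# Sparse hidden-layer rate vector: only ~8 of 1000 neurons are active,
# mimicking the <=10 responsive neurons per pattern described above.
h = np.zeros(n_hidden)
active = rng.choice(n_hidden, size=8, replace=False)
h[active] = rng.uniform(0.5, 1.0, size=8)

# A single linear readout suffices to map this sparse high-dimensional
# code to 10 class scores; no further hidden layers are required.
W = rng.normal(0.0, 0.1, size=(n_classes, n_hidden))
scores = W @ h
predicted = int(np.argmax(scores))
```

Because only a handful of hidden neurons are active per pattern, each class score is effectively a small weighted sum, which is why a linear map is enough.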
In our SNN model, DA was found to be a key factor in achieving SL. Dopamine plays a critical role in various learning processes and can serve as a reward signal for reinforcement learning [59, 60, 61, 62]. A special form of DA-modulated STDP, different from the symmetric one used here, has previously been applied to reinforcement learning in SNNs [63]; however, no direct experimental evidence for that STDP rule has yet been reported. Here, we assumed that DA may also be involved in the SL process, with the experimentally observed symmetric DA-STDP rule modifying synaptic weights during SL. Our work thus points to the potentially diverse functions of DA in regulating neural networks for information processing. Moreover, even if this DA-related assumption is shown to be biologically unsound in the future, direct experimental evidence, i.e., identification of sym-STDP without the need for DA at hippocampal CA3-CA3 synapses under low-frequency stimulation [36], supports the bio-plausibility of the sym-STDP rule used in our model. It is worth noting that the sym-STDP rule is a spiking version of the original Hebbian rule, that is, 'cells that fire together wire together', and a rate-based neural network with the Hebbian rule, synaptic scaling, and a dynamic bias could be expected to have classification abilities similar to those of our model. However, the performance of such a rate model may not reach that reported here; further exploration with a simplified rate model using the original Hebbian rule is needed.
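The contrast with classic asymmetric STDP can be seen by comparing the two windows directly. This is a hedged sketch: the exponential forms and constants are common illustrative choices, not the specific parameters of our model or of the cited experiments:

```python
import math

def asym_stdp(dt_ms, a_plus=0.010, a_minus=0.012, tau=20.0):
    """Classic asymmetric STDP: potentiation if pre precedes post
    (dt > 0), depression otherwise, so the sign depends on spike order."""
    if dt_ms > 0:
        return a_plus * math.exp(-dt_ms / tau)
    return -a_minus * math.exp(dt_ms / tau)

def sym_stdp(dt_ms, a=0.010, tau=20.0):
    """Symmetric STDP: co-active pairs are potentiated in either order,
    a spiking analogue of 'cells that fire together wire together'."""
    return a * math.exp(-abs(dt_ms) / tau)

# For a post-before-pre pair (dt = -5 ms), the asymmetric rule
# depresses the synapse, while the symmetric rule still potentiates it.
```

This order-independence is what lets the sym-STDP rule bind input and supervisory spikes without requiring precise relative timing between the two signals.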
Several SNN models have been developed to provide a general framework for SNN learning, e.g., the liquid state machine (LSM) [64] and NeuCube [65, 66]. The LSM is a classic model of reservoir computing (RC) [64, 67, 68, 69, 70], comprising an input module, a reservoir module consisting of a relatively large number of spiking neurons with a randomly or fully connected structure, and a readout module. In a typical LSM, low-dimensional input, usually carrying spatiotemporal information, is fed into the high-dimensional space of the reservoir, which optimally maintains a high-dimensional dynamic representation of the information as a short-term memory when its state is at the edge of chaos. The reservoir is then projected to the readout module, where synaptic weights are modified by learning methods. NeuCube is also a general SNN model similar to an RC model, with a network structure and capacity for encoding spatiotemporal input similar to those of the LSM [65, 66]. In NeuCube, the reservoir has a three-dimensional structure and can dynamically evolve its weights using unsupervised learning rules, which allows it to encode an input as a specific activity trajectory in the high-dimensional activity space of the reservoir. In the readout module of NeuCube, an evolving SNN (eSNN) is usually adopted for classification [66]; it can dynamically create a new output neuron for clustering by evaluating the similarity between a new input and previously encoded inputs in Euclidean space. Our model is also a general SNN model for SL, but there are several significant differences between it and these two models.
First, our learning rule (sym-STDP) is the same across all layers of the model, whereas NeuCube generally uses different learning rules for synapses within the reservoir and for synapses projecting out of the reservoir, and the classic LSM applies a learning rule only to the readout synapses from the reservoir, with the inner connections fixed. Second, our model contains biologically plausible receptive fields, which can be learned during training, whereas the concept of a receptive field is ambiguous in NeuCube and the LSM. Compared with these two models, however, our model lacks the ability to process complex temporal information. Therefore, in future work, it would be worthwhile to construct a model combining the advantages of all of these models to improve the processing of complex spatiotemporal information.
As the plasticity rules used here were based purely on local spike events, in contrast with the BP method, our model not only has the potential to be applied to other machine learning tasks under the SL framework but may also be suitable for online learning on programmable neuromorphic chips. Moreover, our hypothesis regarding the function of DA in SL processing may serve as a potential mechanism for synaptic information processing of SL in the brain, which will need to be verified in future experiments.
The code is available at https://github.com/haoyz/sym-STDP-SNN.
This work was supported by the National Natural Science Foundation of China (Grant No. 11505283), Beijing Brain Science Project (Grant No. Z181100001518006), and Strategic Priority Research Program of the Chinese Academy of Sciences (Grant No. XDB32070000). We also gratefully acknowledge the support of the NVIDIA Corporation for the donation of the Titan X Pascal GPU used in this research.
- (1) Y. LeCun, Y. Bengio, G. Hinton, Deep learning, Nature 521 (7553) (2015) 436–444.
- (2) J. Schmidhuber, Deep learning in neural networks: An overview, Neural Networks 61 (2015) 85–117.
- (3) D. E. Rumelhart, G. E. Hinton, R. J. Williams, et al., Learning representations by back-propagating errors, Cognitive modeling 5 (3) (1988) 1.
- (4) G.-q. Bi, M.-m. Poo, Synaptic modifications in cultured hippocampal neurons: dependence on spike timing, synaptic strength, and postsynaptic cell type, J. Neurosci. 18 (24) (1998) 10464–10472.
- (5) N. Caporale, Y. Dan, Spike timing-dependent plasticity: a Hebbian learning rule, Annu. Rev. Neurosci. 31 (2008) 25–46.
- (6) S. Song, K. D. Miller, L. F. Abbott, Competitive hebbian learning through spike-timing-dependent synaptic plasticity, Nature neuroscience 3 (9) (2000) 919.
- (7) S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural computation 9 (8) (1997) 1735–1780.
- (8) K. Greff, R. K. Srivastava, J. Koutník, B. R. Steunebrink, J. Schmidhuber, Lstm: A search space odyssey, IEEE transactions on neural networks and learning systems 28 (10) (2016) 2222–2232.
- (9) W. Maass, H. Markram, On the computational power of circuits of spiking neurons, J. Comput. Syst. Sci. 69 (4) (2004) 593–616.
- (10) A. Tavanaei, M. Ghodrati, S. R. Kheradpisheh, T. Masquelier, A. S. Maida, Deep learning in spiking neural networks, arXiv preprint arXiv:1804.08150.
- (11) Y. Wu, L. Deng, G. Li, J. Zhu, L. Shi, Spatio-temporal backpropagation for training high-performance spiking neural networks, arXiv preprint arXiv:1706.02609.
- (12) Y. Cao, Y. Chen, D. Khosla, Spiking deep convolutional neural networks for energy-efficient object recognition, Int. J. Comput. Vision 113 (1) (2015) 54–66.
- (13) E. Hunsberger, C. Eliasmith, Spiking deep networks with lif neurons, arXiv preprint arXiv:1510.08829.
- (14) P. U. Diehl, D. Neil, J. Binas, M. Cook, S.-C. Liu, M. Pfeiffer, Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing, in: IJCNN, IEEE, 2015, pp. 1–8.
- (15) S. K. Esser, R. Appuswamy, P. A. Merolla, J. V. Arthur, D. S. Modha, Backpropagation for energy-efficient neuromorphic computing, NIPS (2015) 1117–1125.
- (16) P. U. Diehl, G. Zarrella, A. Cassidy, B. U. Pedroni, E. Neftci, Conversion of artificial recurrent neural networks to spiking neural networks for low-power neuromorphic hardware, in: ICRC, IEEE, 2016, pp. 1–8.
- (17) D. Neil, S.-C. Liu, Effective sensor fusion with event-based sensors and deep network architectures, in: ISCAS, IEEE, 2016, pp. 2282–2285.
- (18) S. K. Esser, P. A. Merolla, J. V. Arthur, A. S. Cassidy, R. Appuswamy, A. Andreopoulos, D. J. Berg, J. L. Mckinstry, T. Melano, D. R. Barch, et al., Convolutional networks for fast, energy-efficient neuromorphic computing, Proc. Natl. Acad. Sci. U. S. A. 113 (41) (2016) 11441–11446.
- (19) Y. Hu, H. Tang, Y. Wang, G. Pan, Spiking deep residual network, arXiv preprint arXiv:1805.01352.
- (20) J. H. Lee, T. Delbruck, M. Pfeiffer, Training deep spiking neural networks using backpropagation, Front. Neurosci. 10.
- (21) A. Samadi, T. P. Lillicrap, D. B. Tweed, Deep learning with dynamic spiking neurons and fixed feedback weights, Neural Comput. 29 (3) (2017) 578–602.
- (22) A. Tavanaei, A. S. Maida, Bp-stdp: Approximating backpropagation using spike timing dependent plasticity, arXiv preprint arXiv:1711.04214.
- (23) T. Zhang, Y. Zeng, D. Zhao, M. Shi, A plasticity-centric approach to train the non-differential spiking neural networks, in: AAAI, 2018.
- (24) P. U. Diehl, M. Cook, Unsupervised learning of digit recognition using spike-timing-dependent plasticity, Front. Comput. Neurosci. 9.
- (25) S. R. Kheradpisheh, M. Ganjtabesh, S. J. Thorpe, T. Masquelier, Stdp-based spiking deep convolutional neural networks for object recognition, Neural Networks 99 (2018) 56–67.
- (26) M. Beyeler, N. D. Dutt, J. L. Krichmar, Categorization and decision-making in a neurobiologically plausible spiking network using a stdp-like learning rule, Neural Networks 48 (10) (2013) 109–124.
- (27) Z. Hu, T. Wang, X. Hu, An stdp-based supervised learning algorithm for spiking neural networks, in: ICONIP, Springer, 2017, pp. 92–100.
- (28) A. Shrestha, K. Ahmed, Y. Wang, Q. Qiu, Stable spike-timing dependent plasticity rule for multilayer unsupervised and supervised learning, in: Neural Networks (IJCNN), 2017 International Joint Conference on, IEEE, 2017, pp. 1999–2006.
- (29) M. Mozafari, M. Ganjtabesh, A. Nowzari-Dalini, S. J. Thorpe, T. Masquelier, Combining stdp and reward-modulated stdp in deep convolutional spiking neural networks for digit recognition, arXiv preprint arXiv:1804.00227.
- (30) J. C. Zhang, P. M. Lau, G. Q. Bi, C. F. Stevens, Gain in sensitivity and loss in temporal contrast of stdp by dopaminergic modulation at hippocampal synapses, Proc. Natl. Acad. Sci. U. S. A. 106 (31) (2009) 13028–13033.
- (31) H. Ruan, T. Saur, W.-D. Yao, Dopamine-enabled anti-Hebbian timing-dependent plasticity in prefrontal circuitry, Front. Neural Circuit. 8 (April) (2014) 1–12.
- (32) Z. Brzosko, W. Schultz, O. Paulsen, Retroactive modulation of spike timing-dependent plasticity by dopamine, Elife 4.
- (33) N. Masuda, H. Kori, Formation of feedforward networks and frequency synchrony by spike-timing-dependent plasticity, Journal of computational neuroscience 22 (3) (2007) 327–345.
- (34) H. Tanaka, T. Morie, K. Aihara, A cmos spiking neural network circuit with symmetric/asymmetric stdp function, IEICE transactions on fundamentals of electronics, communications and computer sciences 92 (7) (2009) 1690–1698.
- (35) T. Serrano-Gotarredona, T. Masquelier, T. Prodromakis, G. Indiveri, B. Linares-Barranco, Stdp and stdp variations with memristors for spiking neuromorphic learning systems, Front. Neurosci. 7 (2013) 2.
- (36) R. K. Mishra, S. Kim, S. J. Guzman, P. Jonas, Symmetric spike timing-dependent plasticity at ca3–ca3 synapses optimizes storage and recall in autoassociative networks, Nature communications 7 (2016) 11552.
- (37) G. G. Turrigiano, The self-tuning neuron: Synaptic scaling of excitatory synapses, Cell 135 (3) (2008) 422–435.
- (38) G. G. Turrigiano, K. R. Leslie, N. S. Desai, L. C. Rutherford, S. B. Nelson, Activity-dependent scaling of quantal amplitude in neocortical neurons, Nature 391 (6670) (1998) 892.
- (39) F. Effenberger, J. Jost, A. Levina, Self-organization in balanced state networks by stdp and homeostatic plasticity, PLoS Comput. Biol. 11 (9).
- (40) T. P. Vogels, H. Sprekeler, F. Zenke, C. Clopath, W. Gerstner, Inhibitory plasticity balances excitation and inhibition in sensory pathways and memory networks, Science 334 (6062) (2011) 1569–1573.
- (41) L. C. Yeung, H. Z. Shouval, B. S. Blais, L. N. Cooper, Synaptic homeostasis and input selectivity follow from a calcium-dependent plasticity model, Proceedings of the National Academy of Sciences 101 (41) (2004) 14943–14948.
- (42) Q.-Q. Sun, Experience-dependent intrinsic plasticity in interneurons of barrel cortex layer iv, Journal of Neurophysiology 102 (5) (2009) 2955–2973.
- (43) W. Zhang, D. J. Linden, The other side of the engram: experience-driven changes in neuronal intrinsic excitability, Nature Reviews Neuroscience 4 (11) (2003) 885.
- (44) K. Pozo, Y. Goda, Unraveling mechanisms of homeostatic synaptic plasticity, Neuron 66 (3) (2010) 337–351.
- (45) L. N. Cooper, M. F. Bear, The bcm theory of synapse modification at 30: interaction of theory with experiment, Nature Reviews Neuroscience 13 (11) (2012) 798.
- (46) N. X. Tritsch, B. L. Sabatini, Dopaminergic modulation of synaptic transmission in cortex and striatum, Neuron 76 (1) (2012) 33–50.
- (47) G. W. Davis, Homeostatic control of neural activity: from phenomenology to molecular design, Annu. Rev. Neurosci. 29 (2006) 307–323.
- (48) A. Maffei, G. G. Turrigiano, Multiple modes of network homeostasis in visual cortical layer 2/3, Journal of Neuroscience 28 (17) (2008) 4377–4384.
- (49) T. Keck, G. B. Keller, R. I. Jacobsen, U. T. Eysel, T. Bonhoeffer, M. Hübener, Synaptic scaling and homeostatic plasticity in the mouse visual cortex in vivo, Neuron 80 (2) (2013) 327–334.
- (50) K. B. Hengen, M. E. Lambo, S. D. Van Hooser, D. B. Katz, G. G. Turrigiano, Firing rate homeostasis in visual cortex of freely behaving rodents, Neuron 80 (2) (2013) 335–342.
- (51) E. Yavuz, J. Turner, T. Nowotny, Genn: a code generation framework for accelerated brain simulations, Scientific reports 6 (2016) 18854.
- (52) L. v. d. Maaten, G. Hinton, Visualizing data using t-sne, J. Mach. Learn. Res. 9 (Nov) (2008) 2579–2605.
- (53) C. Lee, P. Panda, G. Srinivasan, K. Roy, Training deep spiking convolutional neural networks with stdp-based unsupervised pre-training followed by supervised fine-tuning, Frontiers in Neuroscience 12 (2018) 435.
- (54) D. Querlioz, O. Bichler, P. Dollfus, C. Gamrat, Immunity to device variations in a spiking neural network with memristive nanodevices, IEEE Trans. Nanotechnol. 12 (3) (2013) 288–295.
- (55) Z. Lin, D. Ma, J. Meng, L. Chen, Relative ordering learning in spiking neural network for pattern recognition, Neurocomputing 275 (2018) 94–106.
- (56) I. Sporea, A. Grüning, Supervised learning in multilayer spiking neural networks, Neural computation 25 (2) (2013) 473–509.
- (57) Q. Xu, Y. Qi, H. Yu, J. Shen, H. Tang, G. Pan, Csnn: An augmented spiking based framework with perceptron-inception., in: IJCAI, 2018, pp. 1646–1652.
- (58) H. Xiao, K. Rasul, R. Vollgraf, Fashion-mnist: a novel image dataset for benchmarking machine learning algorithms, arXiv preprint arXiv:1708.07747.
- (59) P. W. Glimcher, Understanding dopamine and reinforcement learning: the dopamine reward prediction error hypothesis, Proceedings of the National Academy of Sciences 108 (Supplement 3) (2011) 15647–15654.
- (60) C. B. Holroyd, M. G. Coles, The neural basis of human error processing: reinforcement learning, dopamine, and the error-related negativity., Psychological review 109 (4) (2002) 679.
- (61) R. A. Wise, Dopamine, learning and motivation, Nature reviews neuroscience 5 (6) (2004) 483.
- (62) P. Dayan, B. W. Balleine, Reward, motivation, and reinforcement learning, Neuron 36 (2) (2002) 285–298.
- (63) E. M. Izhikevich, Solving the distal reward problem through linkage of stdp and dopamine signaling, Cereb. Cortex 17 (10) (2007) 2443–2452.
- (64) W. Maass, T. Natschläger, H. Markram, Real-time computing without stable states: A new framework for neural computation based on perturbations, Neural computation 14 (11) (2002) 2531–2560.
- (65) N. K. Kasabov, Neucube: A spiking neural network architecture for mapping, learning and understanding of spatio-temporal brain data, Neural Networks 52 (2014) 62–76.
- (66) N. K. Kasabov, Time-Space, Spiking Neural Networks and Brain-Inspired Artificial Intelligence, Vol. 7, Springer, 2018.
- (67) H. Jaeger, The “echo state” approach to analysing and training recurrent neural networks-with an erratum note, Bonn, Germany: German National Research Center for Information Technology GMD Technical Report 148 (34) (2001) 13.
- (68) H. Jaeger, H. Haas, Harnessing nonlinearity: Predicting chaotic systems and saving energy in wireless communication, science 304 (5667) (2004) 78–80.
- (69) M. Lukoševičius, H. Jaeger, Reservoir computing approaches to recurrent neural network training, Comput. Sci. Rev. 3 (3) (2009) 127–149.
- (70) M. Lukoševičius, H. Jaeger, B. Schrauwen, Reservoir computing trends, KI-Künstliche Intelligenz 26 (4) (2012) 365–371.