Reservoir Computing Using Non-Uniform Binary Cellular Automata

Stefano Nichele
Department of Computer Science
Oslo and Akershus University College of Applied Sciences
Oslo, Norway
stefano.nichele@hioa.no

Magnus S. Gundersen
Department of Computer and Information Science
Norwegian University of Science and Technology
Trondheim, Norway
magnugun@stud.ntnu.no
Abstract

The Reservoir Computing (RC) paradigm utilizes a dynamical system, i.e., a reservoir, and a linear classifier, i.e., a read-out layer, to process data from sequential classification tasks. In this paper the usage of Cellular Automata (CA) as a reservoir is investigated. The use of CA in RC has shown promising results. In this paper, selected state-of-the-art experiments are reproduced. It is shown that some CA-rules perform better than others, and that the reservoir performance improves as the size of the CA reservoir itself is increased. In addition, the usage of parallel loosely coupled CA-reservoirs, where each reservoir has a different CA-rule, is investigated. The experiments performed on quasi-uniform CA reservoirs provide valuable insights into CA-reservoir design. The results herein show that some rules do not work well together, while other combinations work remarkably well. This suggests that non-uniform CA could represent a powerful tool for novel CA reservoir implementations.

Keywords: Reservoir Computing, Cellular Automata, Parallel Reservoir, Recurrent Neural Networks, Non-Uniform Cellular Automata.

I Introduction

Real-life problems often require processing of time-series data. Systems that process such data must remember inputs from previous time-steps in order to make correct predictions in future time-steps, i.e., they must have some sort of memory. Recurrent Neural Networks (RNN) have been shown to possess such memory [11].

Unfortunately, training RNNs using traditional methods, i.e., gradient descent, is difficult [2]. A fairly novel approach called Reservoir Computing (RC) has been proposed [13, 20] to mitigate this problem. RC splits the RNN into two parts: the non-trained recurrent part, i.e., the reservoir, and the trainable feed-forward part, i.e., the read-out layer.

In this paper, an RC-system is investigated, and a computational model called Cellular Automata (CA) [26] is used as the reservoir. This approach to RC was proposed in [30], and further studied in [31], [5], and [19]. The term ReCA is used as an abbreviation for "Reservoir Computing using Cellular Automata", and is adopted from the latter paper.

In this paper a fully functional ReCA system is implemented and extended into a parallel CA reservoir system (loosely coupled). Various configurations of parallel reservoirs are tested and compared to the results of a single-reservoir system. This approach is discussed and insights into different configurations of CA-reservoirs are provided.

II Background

II-A Reservoir Computing

Feed-forward Neural Networks (NNs) are neural network models without feedback-connections, i.e., they are not aware of their own outputs [11]. They have gained popularity because of their ability to be trained to solve classification tasks. Examples include image classification [25] or playing the board game Go [22]. However, when trying to solve problems that include sequential data, such as sentence-analysis, they often fall short [11]. For example, sentences may have different lengths, and the important parts may be spatially separated even for sentences with equal semantics. Recurrent Neural Networks (RNNs) can overcome this problem [11], being able to process sequential data through an internal memory of previous inputs. This is achieved by relieving the neural network of the constraint of not having feedback-connections. However, networks with recurrent connections are notoriously difficult to train with traditional methods [2].

Fig. 1: General RC framework. Input is connected to some or all of the reservoir nodes. Output is usually fully connected to the reservoir nodes. Only the output-weights are trained.

Reservoir Computing (RC) is a paradigm in machine learning that combines the powerful dynamics of an RNN with the trainability of a feed-forward neural network. The first part of an RC-system consists of an untrained RNN, called the reservoir. This reservoir is connected to a trained feed-forward neural network, called the readout-layer. This setup can be seen in fig. 1.

The field of RC has been proposed independently by two approaches, namely Echo State Networks (ESN) [13] and Liquid State Machines (LSM) [20]. By examining these approaches, important properties of reservoirs are outlined.

Perhaps the most important feature is the echo state property [13]. Previous inputs "echo" through the reservoir for a given number of time steps after the input has occurred, and thereby slowly disappear without being amplified. This property is achieved in traditional RC-approaches by clever reservoir design. In the case of ESN, this is achieved by scaling of the connection weights of the recurrent nodes in the reservoir [18].

As discussed in [3], the reservoir should preferably exhibit edge of chaos behaviors [16], in order to allow for high computational power [10].

II-B Various RC-approaches

Different RC-approaches use reservoir substrates that exhibit the desired properties. In [8] an actual bucket of water is implemented as a reservoir for speech-recognition, and in [15] E. coli bacteria are used as a reservoir. In [24] and more recently in [4], the usage of Random Boolean Network (RBN) reservoirs is explored. RBNs can be considered an abstraction of CA [9], and are thereby a related approach to the one presented in this paper.

II-C Cellular Automata

A Cellular Automaton (CA) is a computational model, first proposed by Ulam and von Neumann in the 1940s [26]. It is a complex, decentralized and highly parallel system, in which computations may emerge [23] through local interactions and without any form of centralized control. Some CA have been proved to be Turing complete [7], i.e., having all properties required for computation: transmission, storage and modification of information [16].

A CA usually consists of a grid of cells, each cell with a current state. The state of a cell is determined by the update-function f, which is a function of the states of the neighboring cells. This update-function is applied to the CA for a given number of iterations. The neighbors are defined as a number of cells in the immediate vicinity of the cell itself.

In this paper, only one-dimensional elementary CA are used. This means that the CA consists of a one-dimensional vector of cells, named A, where each cell has a state s in {0, 1}. In all figures in this paper, state 0 is shown as white, while state 1 is shown as black. The cells have three neighbors: the cell to the left, the cell itself, and the cell to the right. A cell is a neighbor of itself by convention. The boundary conditions at each end of the 1D-vector are usually solved by wrap-around, where the leftmost cell becomes a neighbor of the rightmost, and vice versa.

The update-function f, hereafter denoted rule Z, works accordingly by taking three binary inputs and outputting one binary value. This results in 2^(2^3) = 256 different rules. An example of such a rule is shown in fig. 2, where rule 110 is depicted. The numbering of the rules follows the naming convention described by Wolfram [29], where the resulting binary string is converted to a base-10 number. The CA is usually updated in synchronous steps, where all the cells in the 1D-vector are updated at the same time. One update is called an iteration, and the total number of iterations is denoted by I.
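To make the rule-numbering and update conventions concrete, the following minimal Python sketch (illustrative only, and not taken from the code-base of this paper) derives the lookup table of a rule from its Wolfram number and applies one synchronous iteration with wrap-around boundaries:

```python
# Illustrative sketch: Wolfram rule number -> lookup table -> one CA iteration.
import numpy as np

def rule_table(rule_number):
    """Output bit for each 3-cell neighborhood, indexed by left*4 + centre*2 + right."""
    return np.array([(rule_number >> i) & 1 for i in range(8)], dtype=np.uint8)

def ca_step(state, table):
    """One synchronous update of a 1D binary CA with wrap-around boundaries."""
    left = np.roll(state, 1)     # left neighbor of each cell
    right = np.roll(state, -1)   # right neighbor of each cell
    return table[left * 4 + state * 2 + right]

table_110 = rule_table(110)               # rule 110, as depicted in fig. 2
state = np.zeros(16, dtype=np.uint8)
state[8] = 1                              # a single seed cell
state = ca_step(state, table_110)         # one iteration
```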

Fig. 2: Elementary CA rule 110. The figure depicts all the possible combinations that the neighbors of a cell can have. A cell is its own neighbor by convention.

The rules may be divided into four qualitative classes [29] that exhibit different properties when evolved: class I evolves to a static state, class II evolves to a periodic structure, class III evolves to chaotic patterns and class IV evolves to complex patterns. Class I and II rules will fall into an attractor after a short while [16] and behave orderly. Class III rules are chaotic, which means that the organization quickly descends into randomness. Class IV rules are the most interesting ones, as they reside at a phase transition between the chaotic and ordered phase, i.e., at the edge of chaos [16]. In uniform CA, all cells share the same rule, whereas in non-uniform CA cells may be governed by different rules. Quasi-uniform CA are non-uniform CA with a small number of diverse rules.

II-D Cellular automata in reservoir computing

As proposed in [30], CA may be used as the reservoir in an RC-system. The conceptual overview is shown in fig. 3. Such a system is referred to as ReCA in [19], and the same name is therefore adopted in this paper. The projection of the input to the CA-reservoir can be done in two different ways [30]. If the input is binary, the projection is straightforward: each feature dimension of the input is mapped to a cell. If the input is non-binary, the projection can be done by a weighted summation from the input to each cell. See [31] for more details.

The time-evolution of the reservoir can be represented as follows:

A_m = Z(A_{m-1})

where A_m is the state of the 1D CA at iteration m and Z is the CA-rule that is applied. A_0 is the initial state of the CA, often an external input, as discussed later.

Fig. 3: General ReCA framework. Input is projected onto the cells of a one-dimensional (1D) cellular automaton, and the CA-rule is applied for a number of iterations. In the figure, each iteration is stored and denoted by A_m. The readout-layer weights are trained according to the target-function. Figure adapted from [31].

As discussed in section II-A, a reservoir often operates at the edge of chaos [10]. Selecting CA-based reservoirs that exhibit this property is straightforward, as rules that lie inside Wolfram class IV can provide it. Additionally, to fully exploit such a property, all iterations of the CA evolution are used for classification, which can be stated as follows:

A = [A_1; A_2; ... ; A_I]

where the concatenated vector A is used for classification.
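As an illustration, a hedged sketch of how all I iterations could be collected into one classification vector (reusing rule_table and ca_step from the previous sketch; names are illustrative) may look as follows:

```python
# Illustrative sketch: evolve the CA reservoir for I iterations and concatenate
# every intermediate state into the feature vector A = [A_1; A_2; ...; A_I].
import numpy as np

def evolve_reservoir(initial_state, rule_number, iterations):
    table = rule_table(rule_number)       # from the previous sketch
    states = []
    state = initial_state
    for _ in range(iterations):
        state = ca_step(state, table)     # from the previous sketch
        states.append(state)
    return np.concatenate(states)         # handed to the read-out layer

A0 = np.random.randint(0, 2, size=40, dtype=np.uint8)   # projected binary input
features = evolve_reservoir(A0, rule_number=90, iterations=4)
```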

The ReCA system must also exhibit the echo state property, as described in section II-A. This is done by allowing the CA to take external input while still remembering the current state. As described in more detail later, ReCA-systems address this issue by using a time-transition function, named F, which allows some previous inputs to echo through the CA.

CA also provide additional advantages to RC. In [31] a speedup of 1.5-3X in the number of operations compared to the ESN [14] approach is reported. This is mainly due to a CA relying on bit-wise operations, while ESN uses floating point operations. This can be additionally exploited by utilizing custom-made hardware such as FPGAs. In addition, if edge-of-chaos rules are selected, Turing complete computational power is present in the reservoir. Theoretical analysis of CA is easier than that of RNNs, and CA allow Boolean logic and Galois field algebra.

II-E ReCA system implementations

ReCA systems are a very novel concept and therefore there are only a few implemented examples at the current stage of research. Yilmaz [30, 31] has implemented a ReCA system with elementary CA and Game of Life [6]. Bye [5] also demonstrated a functioning ReCA-system in his master's thesis (supervised by Nichele). The approaches used are similar; however, there are some key differences:

II-E1 Encoding and random mappings

In the encoding stage, [31] used random permutations over the same input-vector. This encoding scheme can be seen in fig. 4. The permutation procedure is repeated R times, because it was experimentally observed that multiple random mappings improve performance.

Fig. 4: The encoding used in [31]. For a total of R permutations, the input X is randomly mapped to vectors of the same size as the input-vector itself.

In [5] a similar approach was used. The main difference is that the input is mapped to a vector that is larger than the input-vector itself. The size of this mapping-vector is given by a parameter called "automaton size". This approach can be seen in fig. 5. The input-bits are randomly mapped to one of the bits in the mapping-vector. The bits that do not have any input mapped to them are set to zero.

Fig. 5: The encoding used in [5]. The input X is randomly mapped to a vector with size larger than the input vector itself. This mapping is done R times. The size of the vector that the input is mapped to can be determined in two ways: either by the "automaton size", which explicitly gives the size of the vector (in this case 8), or by the C-parameter, where the size is given by C times the length of the input vector (in this case C=2).

In the work herein, the approach described in [5] is used, but with a modification. Instead of using the automaton-size parameter, the C-parameter is introduced. The total length of the permutation is given by the number C multiplied by the length of the input-vector. In the case of fig. 5, the automaton size would be 8, and C would be 2.
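A minimal sketch of this encoding stage, under the assumption that each of the R random mappings scatters the input bits into a zero vector of length C times the input length (function and variable names are illustrative, not the authors' exact implementation):

```python
# Illustrative sketch: R random mappings, each placing the input bits into a
# zero vector of length C * L_in; unmapped cells stay 0, as in fig. 5.
import numpy as np

def make_mappings(input_length, C, R, rng):
    width = C * input_length
    # for every mapping, choose which cell each input bit lands in (no collisions)
    return [rng.choice(width, size=input_length, replace=False) for _ in range(R)]

def encode(x, mappings, C):
    width = C * len(x)
    encoded = []
    for mapping in mappings:
        vec = np.zeros(width, dtype=np.uint8)
        vec[mapping] = x
        encoded.append(vec)
    return encoded                        # list of R vectors of length C * L_in

rng = np.random.default_rng(0)
mappings = make_mappings(input_length=4, C=2, R=2, rng=rng)
encoded = encode(np.array([1, 0, 1, 1], dtype=np.uint8), mappings, C=2)
```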

II-E2 Feed-forward or recurrent

[31] proposed both a feed-forward and a recurrent design. The difference is whether the whole input-sequence is presented to the system in one chunk or step-by-step. [5] only described a recurrent design. Only the recurrent architecture will be investigated in this paper, because it is more in line with traditional RNNs and RC-systems, and is conceptually more biologically plausible.

II-E3 Concatenation of the encoded inputs before propagating into the reservoir

After the random mappings have been created, there is another difference in the proposed approaches. In the recurrent architecture, [31] concatenates the R permutations into one large vector of length R * L_in (where L_in is the length of the input-vector) before propagating it in a reservoir of the same width as this vector. The 1D input-vector at time-step t can be expressed as follows:

X_t^P = [X_t^P1; X_t^P2; ... ; X_t^PR]

X_t^P is inserted into the reservoir as described in section II-D, and then iterated I times. The iterations are then concatenated into the vector A_t, which is used for classification at time-step t.

[5] adopted a different approach, the same one that was also used by the feed-forward architecture in [31], where the R different permutations are iterated in separate reservoirs, and the states of the different reservoirs are then concatenated before they are used by the classifier. The vector which is used for classification at time-step t is as follows:

A_t = [A_t^(1); A_t^(2); ... ; A_t^(R)]

where A_t^(r) is the vector of concatenated iterations from reservoir r. In this paper, the recurrent architecture approach is used.
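For the recurrent architecture used here, a hedged sketch of how the feature vector for one time-step could be assembled (building on encode and evolve_reservoir from the earlier sketches; the time-transition from the previous time-step, described in section II-E4, is omitted) is:

```python
# Illustrative sketch: concatenate the R encoded vectors into one wide CA state,
# apply the rule for I iterations, and use all iterations as the feature vector.
import numpy as np

def time_step_features(x_t, mappings, C, rule_number, iterations):
    X_P = np.concatenate(encode(x_t, mappings, C))        # width R * C * L_in
    A_t = evolve_reservoir(X_P, rule_number, iterations)  # length I * R * C * L_in
    return A_t
```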

II-E4 Time-transition

In order to allow the system to remember previous inputs, a time-transition function is needed to translate between the current time-step and the next. One possibility is to use normalized addition as the time-transition function, as shown in fig. 6, with F as normalized addition. This function works as follows: the cell values are added, and if the sum is 2 (1+1) the output-value becomes 1, if the sum is 0 the output-value becomes 0, and if the sum is 1 the cell-value is decided randomly (0 or 1). The initial 1D-CA-vector of the reservoir at time-step t is then expressed as:

A_0^t = F(X_t, A_I^{t-1})

where F may be any bit-wise operation, X_t is the input from the sequential task at time-step t, and A_I^{t-1} is the last iteration of the previous time-step. At the first time-step (t=0), the transition-function is bypassed, and the input is used directly in the reservoir.
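A small sketch of normalized addition as the bit-wise transition function F (illustrative only; it assumes the encoded input and the previous reservoir state are binary vectors of equal length):

```python
# Illustrative sketch of "normalized addition": 1+1 -> 1, 0+0 -> 0,
# and a random bit whenever exactly one of the two operands is 1.
import numpy as np

def normalized_addition(x_t, a_prev, rng):
    s = x_t.astype(int) + a_prev.astype(int)
    out = (s == 2).astype(np.uint8)                   # both 1 -> 1, both 0 -> 0
    ties = (s == 1)
    out[ties] = rng.integers(0, 2, size=int(ties.sum()), dtype=np.uint8)
    return out
```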

Fig. 6: Time transition used in [31]. The sequence input X_t is combined with the state of the reservoir at the last iteration of the previous time-step, A_I^{t-1}. The function F may be any bit-wise function. Only one permutation is shown in the figure to increase readability.

Another possibility is to use "permutation transition" as the time-transition function, as seen in fig. 7. Here, all cells that have a mapping to them (from the encoder) are bit-wise filled with the values of the input-vector X_t. The cells that do not have any mapping to them keep the values from A_I^{t-1}. This allows the CA to have memory across time-steps in sequential tasks. By adjusting the automaton-size, or C-parameter, the interaction between each time-step can be regulated.

Fig. 7: Time-transition by permutation. The input is directly copied from X_t, according to the mapping from the encoder, as shown in fig. 5. The other cells have their values copied from the last iteration of the previous time-step, A_I^{t-1}. Only one permutation is shown to increase readability.

The described approaches have different effects on the parameters R and I, and also the resulting size of the reservoir. This is relevant when discussing the computational complexity of ReCA systems.

In this paper, the "permutation transition" is used.
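A hedged sketch of the permutation transition, assuming the per-mapping layout produced by the encoding sketch above (the previous state a_prev is the concatenated last iteration, of width R * C * L_in):

```python
# Illustrative sketch: cells the encoder maps input bits to are overwritten
# with the new input X_t; all other cells keep their value from the last
# iteration of the previous time-step.
import numpy as np

def permutation_transition(x_t, a_prev, mappings, C):
    width = C * len(x_t)
    segments = []
    for r, mapping in enumerate(mappings):
        segment = a_prev[r * width:(r + 1) * width].copy()
        segment[mapping] = x_t                # encoder-mapped cells receive X_t
        segments.append(segment)
    return np.concatenate(segments)           # initial CA state A_0^t
```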

III Experimental Setup

The basic architecture implemented in this paper is shown in fig. 9. The encoder is based on the architecture described in [5]. In this paper, the parameter C is introduced as a measure of how large the resulting mapping-vector should be. The concatenation procedure is adapted from [31]. The vectors, after the encoding (random mappings), are concatenated into one large vector. This vector is then propagated into the reservoir, as described in section II-E3. The time-transition function is adapted from [5]. The mappings from the encoder are saved, and used as the basis to which new inputs are mapped, as described in section II-E4. The values from the last iteration of the previous time-step are directly copied. The classifier used in this paper is a Support Vector Machine, as implemented in the Python machine learning framework scikit-learn [21]. The code-base that was used in this paper is available for download [1].
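As an illustration of the read-out stage, a hedged sketch using a linear SVM from scikit-learn (the exact classifier configuration in the released code-base may differ) could be:

```python
# Illustrative sketch: one linear SVM per output signal, trained on the
# collected reservoir feature vectors (one vector per time-step).
import numpy as np
from sklearn.svm import SVC

def train_readout(feature_vectors, targets):
    """feature_vectors: list of reservoir states; targets: array of shape (samples, outputs)."""
    X = np.vstack(feature_vectors)
    T = np.asarray(targets)
    return [SVC(kernel="linear").fit(X, T[:, j]) for j in range(T.shape[1])]

def predict_readout(classifiers, feature_vector):
    x = feature_vector.reshape(1, -1)
    return np.array([clf.predict(x)[0] for clf in classifiers])
```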

Fig. 8: Example run of the ReCA system with rule 90. The run is done with the parameters R=8, I=4 and C=5. The horizontal gray lines represent a time-step, in which the time-transition function is applied to every bit. Time flows downwards. The visualization is produced with the ReCA system described in this paper.
Fig. 9: Architecture of the implemented system. The encoding is done according to the encoding-scheme shown in fig. 5, but with the slight modification of the C-parameter. The encoding is exemplified with R=2 and C=2, which yields a size of eight for each permutation. The two permutations are then concatenated. At time-step 1, there are no previous inputs, and the concatenated vector is simply used as the first iteration of the CA-reservoir. The rule Z is then applied for I iterations. At time-step 2, the encoding and concatenation are repeated. The time-transition scheme is then applied, as described in fig. 7. The procedure described for time-step 2 is repeated until the end of the sequence.

An example run with rule 90 is shown in fig. 8. This visualisation gives valuable insights into how the reservoir behaves when parameters are changed, and makes it easier to understand the reservoir dynamics. Most natural systems come in the form of a temporal (sequential) system, i.e., the current input to the system depends on previous inputs. Classical feed-forward architectures are known to have issues with temporal tasks [11]. In order to test the ReCA-system on a temporal task, the 5-bit memory task [12] is chosen in this paper. This task has become a popular and widely used benchmark for reservoir computing, in particular because it tests the long short-term memory of the system. An example data set from this task is presented in fig. 10. The length of the sequence is given by T. a_1 and a_2 are the pattern input-signals, and y_1 and y_2 are the corresponding output-signals. At each time-step only one input-signal, and one output-signal, can have the value 1. The values of a_1 and a_2 at the first five time-steps give the pattern that the system shall learn. The next T_d time-steps represent the distractor-period, where the system is distracted from the previous inputs. This is done by setting the value of a_3 to 1. After the distractor period, the a_4 signal is fired, which marks the cue-signal. The system is then asked to repeat the input-pattern on the outputs y_1 and y_2. The output y_3 is a waiting signal, which is supposed to be 1 right until the input-pattern is repeated. More details on the 5-bit memory task can be found in [14].

Fig. 10: Example data from the 5-bit task. The length of the sequence is T. The signals a_1, a_2, a_3 and a_4 are input-signals, while y_1, y_2 and y_3 are output-signals. In the first five time-steps the system learns the pattern. The system is then distracted for T_d time-steps. After the cue-signal is set, the system is expected to reproduce the pattern that was learned.
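For concreteness, one possible way to generate a single 5-bit task sequence consistent with the description above (the exact placement of the distractor and cue signals varies between formulations; this is an illustrative sketch, not the generator used in the experiments):

```python
# Illustrative sketch of one 5-bit memory task sequence (after [14]):
# inputs a1, a2 (pattern), a3 (distractor), a4 (cue); outputs y1, y2 (recall), y3 (wait).
import numpy as np

def five_bit_sequence(pattern, distractor_period):
    assert len(pattern) == 5
    T = 5 + distractor_period + 1 + 5            # present + distract + cue + recall
    a = np.zeros((T, 4), dtype=np.uint8)
    y = np.zeros((T, 3), dtype=np.uint8)
    for t, bit in enumerate(pattern):            # pattern presentation on a1/a2
        a[t, 0], a[t, 1] = bit, 1 - bit
    a[5:5 + distractor_period, 2] = 1            # distractor signal a3
    a[5 + distractor_period, 3] = 1              # cue signal a4
    a[5 + distractor_period + 1:, 2] = 1         # keep exactly one input active
    y[:5 + distractor_period + 1, 2] = 1         # waiting signal y3 until recall
    for t, bit in enumerate(pattern):            # expected recall on y1/y2
        y[5 + distractor_period + 1 + t, 0] = bit
        y[5 + distractor_period + 1 + t, 1] = 1 - bit
    return a, y

inputs, outputs = five_bit_sequence([1, 0, 1, 1, 0], distractor_period=200)
```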

III-A Use of parallel CA-reservoirs in RC

Fig. 11: Concept behind parallel CA reservoirs. Iterations flow downward. The rules are interacting at the middle boundaries and at the side boundaries, where the CA wraps around.

In this paper the use of parallel reservoirs is proposed. The concept is shown in fig. 11. At the boundary conditions, i.e., the cells at the very ends of each reservoir, the rule will treat the cell that lies within the other reservoir as a cell in its own reservoir. This causes information/computation to flow between the reservoirs (loosely coupled).
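A hedged sketch of one synchronous update of such a loosely coupled two-rule CA (reusing rule_table and the neighborhood indexing from the first sketch; the half-and-half split of cells between the two rules is an assumption of this illustration):

```python
# Illustrative sketch: the left half of the cells is updated with one rule and
# the right half with another; the shared neighborhood lookup lets information
# cross both the middle boundary and the wrap-around boundary at the ends.
import numpy as np

def ca_step_two_rules(state, table_left, table_right):
    left = np.roll(state, 1)
    right = np.roll(state, -1)
    idx = left * 4 + state * 2 + right
    half = len(state) // 2
    new_state = np.empty_like(state)
    new_state[:half] = table_left[idx[:half]]
    new_state[half:] = table_right[idx[half:]]
    return new_state

state = np.random.randint(0, 2, size=40, dtype=np.uint8)
state = ca_step_two_rules(state, rule_table(90), rule_table(182))   # as in fig. 12
```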

By having different rules in the reservoirs, one might be able to solve different aspects of the same problem, or even two problems at the same time. In [5], both the temporal parity and the temporal density task [14] are investigated.

Which rule is most suited for a task is still an open research question. The characteristics and classes described in section II-C are useful knowledge; however, they do not precisely explain why some rules perform better than others on different tasks. In fig. 12 an example run of the parallel system is shown, with rule 90 on the left, and 182 on the right. This visualization gives useful insights into how the rules interact.

Fig. 12: Example run of the ReCA system with rule 90 on the left and 182 on the right. Information is allowed to flow between the reservoirs. The run is done with the parameters R=8, I=4 and C=5. The horizontal gray lines represent a time-step, in which the time-transition function is applied to every bit. Time flows downwards. The visualization is produced with the implemented system.

III-B Measuring computational complexity of a CA-reservoir

The size of the reservoir is crucial for the success of the system. In this paper, the reservoir size is measured as the total number of cell states used for classification at each time-step, i.e., the product R * I * C * L_in, where L_in is the length of the input-vector. As seen in section III-A, the size of the reservoirs will remain the same both for the one-rule reservoirs and the two-rule reservoirs. This is crucial in order to be able to directly compare their performances.
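Under this measure, the reservoir size for the parameter combinations used in the experiments can be computed as follows (a trivial illustrative sketch):

```python
# Illustrative sketch: reservoir size as R * I * C * L_in
# (number of cell states collected per time-step).
def reservoir_size(R, I, C, input_length):
    return R * I * C * input_length

print(reservoir_size(R=8, I=4, C=10, input_length=4))   # parameters from table II -> 1280
```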

IV Results

Training set size 32
Testing set size 32
Distractor period 200
No. runs 120
TABLE I: 5-bit task parameters
CA rules 60, 90, 102, 105, 150, 153, 165, 180, 195
I (iterations) 2, 4
R (random mappings) 4, 8
C (size multiple) 10
TABLE II: CA reservoir parameter combinations

The parameters used for the 5-bit memory task can be seen in table I, and the tested CA rules and reservoir parameters in table II. The same parameters as in the single-reservoir system are used in the quasi-uniform CA reservoir system with a combination of two rules; the tested rule combinations are listed in table IV.

IV-A Results from the single ReCA-system

The results from the single-reservoir ReCA-system can be seen in table III. The results in this paper are significantly better than what was reported in [5]. We can however see a similar trend: rules 102 and 105 give promising results, while rule 180 is not very well suited for this task. An exception is rules 90 and 165, for which the results in table III show very high accuracy. In [31] very promising results with rule 90 are also reported.

Rule I=2, R=4 I=2, R=8 I=4, R=4 I=4, R=8
60 25.8% 53.3% 76.7% 95.0%
90 100.0% 100.0% 97.5% 100.0%
102 30.8% 63.3% 71.7% 96.7%
105 95.8% 99.2% 99.2% 100.0%
150 96.7% 100.0% 100.0% 100.0%
153 26.7% 55.0% 80.0% 95.0%
165 100.0% 100.0% 100.0% 100.0%
180 9.2% 38.3% 0.8% 1.7%
195 39.2% 61.7% 79.2% 95.8%
TABLE III: Single reservoir CA on the 5-bit task. Percentage of successful runs with a distractor period of 200.

IV-B Results from the parallel (non-uniform) ReCA-system

Results can be seen in table IV. It can be observed that rules which performed well individually in table III also tend to give good results when combined. However, some combinations of rules, e.g., 60 and 102, or 153 and 195, gave worse results than the rules by themselves. We can observe the same tendencies as in the single-rule runs: higher R and I generally yield better results.

Rule I=2, R=4 I=2, R=8 I=4, R=4 I=4, R=8
60 and 90 87.5% 100.0% 96.9% 100.0%
60 and 102 0.0% 0.0% 0.0% 0.0%
60 and 105 81.2% 100.0% 96.9% 100.0%
60 and 150 71.9% 100.0% 96.9% 100.0%
60 and 153 0.0% 0.0% 0.0% 0.0%
60 and 165 87.5% 93.8% 96.9% 96.9%
60 and 180 43.8% 53.1% 90.6% 84.4%
60 and 195 0.0% 0.0% 0.0% 0.0%
90 and 102 90.6% 100.0% 100.0% 96.9%
90 and 105 100.0% 100.0% 100.0% 100.0%
90 and 150 100.0% 100.0% 100.0% 100.0%
90 and 153 93.8% 96.9% 96.9% 100.0%
90 and 165 90.6% 100.0% 100.0% 100.0%
90 and 180 90.6% 100.0% 100.0% 100.0%
90 and 195 87.5% 96.9% 100.0% 100.0%
102 and 105 78.1% 100.0% 96.9% 100.0%
102 and 150 81.2% 100.0% 96.9% 100.0%
102 and 153 0.0% 0.0% 0.0% 3.1%
102 and 165 93.8% 100.0% 100.0% 100.0%
102 and 180 0.0% 40.6% 3.1% 6.2%
102 and 195 0.0% 0.0% 0.0% 3.1%
105 and 150 93.8% 100.0% 100.0% 100.0%
105 and 153 75.0% 93.8% 93.8% 100.0%
105 and 165 96.9% 100.0% 100.0% 100.0%
105 and 180 93.8% 100.0% 100.0% 100.0%
105 and 195 65.6% 93.8% 96.9% 100.0%
150 and 153 87.5% 100.0% 96.9% 100.0%
150 and 165 100.0% 100.0% 100.0% 100.0%
150 and 180 81.2% 100.0% 100.0% 100.0%
150 and 195 78.1% 96.9% 100.0% 100.0%
153 and 165 81.2% 100.0% 100.0% 100.0%
153 and 180 3.1% 46.9% 0.0% 0.0%
153 and 195 0.0% 0.0% 0.0% 0.0%
165 and 180 96.9% 96.9% 100.0% 100.0%
165 and 195 87.5% 100.0% 100.0% 100.0%
180 and 195 40.6% 87.5% 93.8% 96.9%
TABLE IV: Parallel CA on the 5-bit task. Percentage of successful runs with a distractor period of 200.

V Analysis

V-A Single reservoir ReCA-system

The complexity of the reservoir is a useful metric when comparing different approaches. If we examine rule 90 in table III, we can observe that it achieves a 100% success rate already at the smallest tested configuration, I=2 and R=4 (with C=10), which corresponds to a reservoir size of 4 * 2 * 10 * 4 = 320. Note that the performance is not strictly monotone in I and R: at I=4 and R=4 the success rate is 97.5%, while it is again 100% at I=4 and R=8. [31] reported a 100% success-rate on the same task using the feed-forward architecture, with the C-parameter set to 1 and correspondingly larger values of R and I (the exact values and resulting reservoir size are given in [31]).

[31] also presented results on the 5-bit task using the recurrent architecture, where a 100% success-rate was likewise achieved (see [31] for the values of R and I and the resulting reservoir size). Those results were intended to study the relationship between the distractor period of the 5-bit task and the number of random mappings. The training and testing set size was kept fixed at 32 during this experiment. Even if the motivation for the experiments was different, the comparison of results gives the insight that the reservoir size itself may not be the only factor that determines the performance of the ReCA system.

V-B Parallel reservoir (non-uniform) ReCA-system

Why are some combinations better than others? As observed in section IV-B, rules that perform well on their own also tend to perform well when paired with other well-performing rules. The combination of rule 90 and rule 165 is observed to be very successful. As described in [28], rule 165 is the complement of rule 90. If we observe the single-CA results in table III, we can see that rules 90 and 165 perform very similarly.

Examining one of the worst-performing rule-combinations of the experiments, i.e., rule 153 and rule 195, we get some useful insight, as seen in fig. 13. Here it is possible to notice that the interaction of the rules creates a "black" region in the middle (between the rules), thereby effectively reducing the size of the reservoir. As described in [27], rule 153 is the mirrored complement of rule 60 and rule 195 is its complement; rule 195 is thus the mirror image of rule 153.

Rule 105 is an interesting rule to be combined with others. As described in [29], the rule does not have any distinct complement or mirrored complement. Nevertheless, as seen in table IV, it performs well in combination with most other rules.

Fig. 13: Example run of the ReCA system with rule 153 and 195. The run is done with the parameters R=8, I=4 and C=5. The horizontal gray lines represent a time-step, in which the time-transition function is applied to every bit. Time flows downwards. The visualization is produced with the implemented system.

VI Conclusion

A framework for using cellular automata in reservoir computing has been implemented, which makes use of uniform CA and quasi-uniform CA. The relationship between reservoir size and performance of the system is presented. The implemented configuration using parallel CA reservoirs is tested in this paper for the first time (to the best of the authors' knowledge). Results have shown that some CA rules work better in combination than others. Good combinations tend to have some relation, e.g., being complementary. Rules that are mirrored complements do not work well together, because they effectively reduce the size of the reservoir. The concept is still very novel, and a lot of research is left to be done, both regarding the use of non-uniform CA reservoirs and ReCA-systems in general.

As previously discussed, finding the best combination of rules is not trivial. If we only consider the usage of two distinct rules, the rule space grows from 256 single-reservoir options to 32,640 different (unordered) rule combinations. Matching two rules that perform well together can be quite a challenge. By investigating the characteristics of the rules, e.g., with the lambda-parameter [16], the Lyapunov exponent [17] or other metrics, it may be possible to pinpoint promising rules. Ideally, the usage of more than two different rules could prove a powerful tool. The rule space would then grow even larger, and an exhaustive search would be infeasible. However, one possibility would be to use evolutionary algorithms to search for suitable rules. Adding more and more rules would bring the reservoir closer to a true non-uniform CA.

In [14] a wide range of different tasks is presented. In this paper only one (5-bit task) is used as a benchmark. By combining different rules’ computational power, one could design a reservoir that performs well on a variety of tasks.

References

  • [1] Repository of the implemented ReCA system used in this paper. https://github.com/magnusgundersen/spec. Accessed: 2016-12-14.
  • [2] Yoshua Bengio, Patrice Simard, and Paolo Frasconi. Learning long-term dependencies with gradient descent is difficult. IEEE transactions on neural networks, 5(2):157–166, 1994.
  • [3] Nils Bertschinger and Thomas Natschläger. Real-time computation at the edge of chaos in recurrent neural networks. Neural computation, 16(7):1413–1436, 2004.
  • [4] Aleksander Vognild Burkow. Exploring physical reservoir computing using random boolean networks. Master’s thesis, NTNU, 2016.
  • [5] Emil Taylor Bye. Investigation of elementary cellular automata for reservoir computing. Master’s thesis, NTNU, 2016.
  • [6] John Conway. The game of life. Scientific American, 223(4):4, 1970.
  • [7] Matthew Cook. Universality in elementary cellular automata. Complex systems, 15(1):1–40, 2004.
  • [8] Chrisantha Fernando and Sampsa Sojakka. Pattern recognition in a bucket. In European Conference on Artificial Life, pages 588–597. Springer, 2003.
  • [9] Carlos Gershenson. Introduction to random boolean networks. arXiv preprint nlin/0408006, 2004.
  • [10] Thomas E Gibbons. Unifying quality metrics for reservoir networks. In Neural Networks (IJCNN), The 2010 International Joint Conference on, pages 1–7. IEEE, 2010.
  • [11] Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep learning. Book in preparation for MIT Press, 2016.
  • [12] Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural computation, 9(8):1735–1780, 1997.
  • [13] Herbert Jaeger. The “echo state” approach to analysing and training recurrent neural networks-with an erratum note. Bonn, Germany: German National Research Center for Information Technology GMD Technical Report, 148:34, 2001.
  • [14] Herbert Jaeger. Long short-term memory in echo state networks: Details of a simulation study. Technical report, Jacobs University Bremen, 2012.
  • [15] Ben Jones, Dov Stekel, Jon Rowe, and Chrisantha Fernando. Is there a liquid state machine in the bacterium escherichia coli? In 2007 IEEE Symposium on Artificial Life, pages 187–191. IEEE, 2007.
  • [16] Chris G Langton. Computation at the edge of chaos: phase transitions and emergent computation. Physica D: Nonlinear Phenomena, 42(1):12–37, 1990.
  • [17] Robert Legenstein and Wolfgang Maass. Edge of chaos and prediction of computational performance for neural circuit models. Neural Networks, 20(3):323–334, 2007.
  • [18] Mantas Lukoševičius, Herbert Jaeger, and Benjamin Schrauwen. Reservoir computing trends. KI-Künstliche Intelligenz, 26(4):365–371, 2012.
  • [19] Mrwan Margem and Ozgür Yilmaz. An experimental study on cellular automata reservoir in pathological sequence learning tasks. 2016.
  • [20] Thomas Natschläger, Wolfgang Maass, and Henry Markram. The "liquid computer": A novel strategy for real-time computing on time series. Special issue on Foundations of Information Processing of Telematik, 8:39–43, 2002.
  • [21] Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, et al. Scikit-learn: Machine learning in python. Journal of Machine Learning Research, 12(Oct):2825–2830, 2011.
  • [22] David Silver, Aja Huang, Chris J Maddison, Arthur Guez, Laurent Sifre, George Van Den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, et al. Mastering the game of go with deep neural networks and tree search. Nature, 529(7587):484–489, 2016.
  • [23] Moshe Sipper. The emergence of cellular computing. Computer, 32(7):18–26, 1999.
  • [24] David Snyder, Alireza Goudarzi, and Christof Teuscher. Computational capabilities of random automata networks for reservoir computing. Physical Review E, 87(4):042808, 2013.
  • [25] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1–9, 2015.
  • [26] John Von Neumann, Arthur W Burks, et al. Theory of self-reproducing automata. IEEE Transactions on Neural Networks, 5(1):3–14, 1966.
  • [27] Eric W Weisstein. "Rule 60." From MathWorld – A Wolfram Web Resource. http://mathworld.wolfram.com/Rule60.html. Accessed: 2016-12-10.
  • [28] Eric W Weisstein. "Rule 90." From MathWorld – A Wolfram Web Resource. http://mathworld.wolfram.com/Rule90.html. Accessed: 2016-12-10.
  • [29] Stephen Wolfram. A new kind of science, volume 5. Wolfram media Champaign, 2002.
  • [30] Ozgur Yilmaz. Reservoir computing using cellular automata. arXiv preprint arXiv:1410.0162, 2014.
  • [31] Ozgur Yilmaz. Connectionist-symbolic machine intelligence using cellular automata based reservoir-hyperdimensional computing. arXiv preprint arXiv:1503.00851, 2015.