Spiking Networks for Improved Cognitive Abilities of Edge Computing Devices

Abstract

This concept paper highlights a recently opened opportunity to train large-scale analytical algorithms directly on edge devices. This approach responds to the growing need to process data generated by a natural person (a human being), also known as personal data. Spiking neural networks are the core method behind it: they suit low-latency, energy-constrained hardware and enable local training or re-training, albeit without the scalability available in the Cloud.

1 Introduction

A sudden realization came to us while preparing this white paper: mobile phones are the first type of device to receive dedicated math accelerators at a pervasive scale. Such accelerators never achieved wide adoption before: the Intel 8087 co-processor[11], Intel Xeon Phi[2, 5], and Google TPU (Tensor Processing Unit)[6] remained niche devices that few people use and even fewer develop for. In the last two years, however, major mobile phone companies have included dedicated co-processors[4] for computational photography enhancement and facial recognition that are also suitable for general machine learning.

Currently the dominant analytical approach stores data and runs computations in the Cloud[12]. However, Cloud-based methods fit poorly to a range of important practical applications, including augmented reality, real-time data analysis, real-time user interaction, and the processing of sensitive data that incurs high risks for a company if leaked, stolen, or intercepted in transit. Deployed analytical methods also carry an extra price: users need a permanently working internet connection, and service providers must rent cloud hardware.

1.1 Mobile-first Machine Learning

The difference between the operating systems running on mobile and desktop devices is smaller than many people think. It is largely limited to the interface, which interacts with window-based applications using mouse and keyboard on desktops, and with full-screen applications using touch gestures on mobile devices. The operating system kernel, storage, and graphics are almost identical to their desktop analogues.

However, there is a factor that amplifies the small differences of mobile devices to the extent of making them useless for machine learning: the software. Researchers use programming languages (Python, R) created for systems with global folder access and command-line installers. These are not readily available on mobile devices, which sandbox all applications for security reasons[7]. Even if they manage to run on mobile devices, their speed and efficiency are abysmal because they can only access a low-power CPU that must emulate advanced x86 instructions[3].

The missing-software problem is recognized by mobile device vendors, who address it by providing optimized libraries or even whole new programming languages1 that enable writing powerful software for the mobile platform. The authors worked with a GPU-accelerated mobile BLAS/LAPACK library[1] and found it as convenient as high-level CUDA primitives while performing on par with a 45W Intel i7 laptop processor.

1.2 Spiking Neural Networks as the "mobile brains"

A good supervised learning method is needed for mobile-first machine learning, and it must be able to train directly on the device. The Apple ecosystem provides accelerated libraries for all kinds of linear systems (dense and sparse), and for inference in arbitrary neural networks.

Deep Learning and traditional neural networks can be rejected for their slow and energy-inefficient training. Support Vector Machines and the Nearest Neighbour method are also slow at large scale. Decision Trees and Random Forests are missing the corresponding libraries, and linear models are limited in their learning ability. Native support for the concept of time is another requirement for a human-friendly method operated on mobile devices by actual people, and it is not available in any of the mentioned methods. A recurrent deep neural network with the training speed of a linear model would fit mobile devices perfectly.

Such a method actually exists, in the form of a spiking neural network[10] implemented in the reservoir computing framework[13], also known as a Liquid State Machine[9, 8]. Spiking neurons have an internal state value that produces a binary output ("spike") and resets to zero upon reaching a threshold. That internal state provides networks of spiking neurons with the concept of time, and binary output spikes are propagated energy-efficiently at the lowest possible precision on edge devices. Due to their internal states, spiking networks can be run only one step at a time, limiting the speedup from batch processing on large parallel devices (server-grade GPU accelerators) but fitting well the low latency (due to local processing) and shared-memory GPUs of edge devices.
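The spiking neuron dynamics described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation; the leak factor, threshold value, and function names are our own assumptions:

```python
import numpy as np

def spiking_step(state, inputs, threshold=1.0, leak=0.9):
    """One update of a layer of simple leaky spiking neurons.

    state:  internal state value of each neuron (1-D array)
    inputs: summed input to each neuron at this time step
    Returns the new internal states and the binary output spikes.
    """
    state = leak * state + inputs           # integrate input into internal state
    spikes = state >= threshold             # neurons reaching threshold emit a spike
    state = np.where(spikes, 0.0, state)    # spiking neurons reset to zero
    return state, spikes.astype(np.int8)    # binary, lowest-precision output

# Example: only the neuron whose input reaches the threshold spikes and resets.
state = np.zeros(4)
state, spikes = spiking_step(state, np.array([0.5, 1.2, 0.0, 0.9]))
```

Because each step depends on the previous internal state, time steps cannot be batched, which matches the paper's point about one-step-at-a-time execution.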

Figure 1: Liquid State Machine: a spiking neural network running in reservoir computing framework. Reservoir part (gray) requires only inference; the training occurs in the linear readout layer (blue).

Spiking neurons have no effective training methods as they are not differentiable; but gradient-based training is a poor choice for edge devices anyway, due to its high computational power and energy demands. Liquid State Machines (LSM) offer a framework for spiking networks that avoids training the spiking neuron reservoir (a sparsely connected pool of neurons), see Figure 1. A network is initiated by randomly generating and fixing sparse input weights W_in, sparse binary connection weights W, and the spiking neuron parameters. An inference step consists of propagating spikes inside the reservoir by multiplying neuron outputs with W, computing spiking neuron inputs by adding the input vector multiplied by W_in, then updating and recording the states of the spiking neurons. The most computationally demanding part, the sparse matrix-vector multiplication with W, is done at the lowest precision as its values are binary. The network is trained by learning the weights of a linear output layer between the recorded spiking neuron states X and the true outputs Y. The network can be re-trained at any time with new true outputs and the recorded states, without the need to re-run inference, as all inference parameters are fixed.
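The LSM procedure above can be sketched as follows. This is a hedged toy example, not the authors' code: the sparsity levels, leak factor, threshold, and random targets are illustrative assumptions, and dense NumPy arrays stand in for the sparse, low-precision kernels an edge device would use:

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_res, T = 3, 100, 200  # inputs, reservoir neurons, time steps (assumed sizes)

# Fixed random parameters, generated once and never trained:
# sparse input weights and sparse binary reservoir connections.
W_in = rng.normal(size=(n_res, n_in)) * (rng.random((n_res, n_in)) < 0.2)
W = (rng.random((n_res, n_res)) < 0.05).astype(np.float32)

def run_reservoir(U, threshold=1.0, leak=0.9):
    """Inference: propagate spikes through the fixed reservoir, recording states."""
    state = np.zeros(n_res)
    spikes = np.zeros(n_res)
    X = np.empty((len(U), n_res))
    for t, u in enumerate(U):
        # Reservoir spike propagation (binary matvec) plus weighted input.
        state = leak * state + W @ spikes + W_in @ u
        fired = state >= threshold
        state[fired] = 0.0                 # spiking neurons reset to zero
        spikes = fired.astype(np.float64)
        X[t] = state                       # record spiking neuron states
    return X

U = rng.normal(size=(T, n_in))   # input sequence
Y = rng.normal(size=(T, 1))      # stand-in true outputs

X = run_reservoir(U)
# Training touches only the linear readout (least squares); re-training with
# new targets reuses the recorded X, with no need to re-run inference.
beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
Y_hat = X @ beta
```

The design point to note is that all the expensive, sequential work sits in `run_reservoir`, whose parameters are frozen, while training reduces to one linear solve.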

2 Conclusions: Native machine learning at the edge

The development tools and hardware accelerators for mobile machine learning are already available, but researchers are generally unaware of the computational power of these devices, or are skeptical about the feasibility of typical machine learning methods running on edge devices.

This paper explores the motivation behind training on edge devices and highlights a suitable method based on spiking neural networks. The method fits the hardware specifics of such devices, is less affected by their drawbacks, and makes large-scale machine learning with model training or re-training feasible directly on the edge.

Footnotes

  1. https://swift.org/

References

  1. A. Akusok, L. E. Leal, A. Lendasse and K. Björk (2018) High-performance ELM for memory-constrained edge computing devices with Metal Performance Shaders. Extreme Learning Machines (ELM) 2018. Cited by: §1.1.
  2. G. Chrysos (2014) Intel Xeon Phi coprocessor – the architecture. Intel Whitepaper 176. Cited by: §1.
  3. G. Hillar (2011) Intel AVX2 will bring integer instructions with 256-bit SIMD numeric processing capabilities. Dr. Dobb's Bloggers, Jun 24. Cited by: §1.1.
  4. MIT Technology Review Insights (2018) Website. Cited by: §1.
  5. J. Jeffers (2013) Intel Xeon Phi coprocessors. In Modern Accelerator Technologies for Geographic Information Science, pp. 25–39. Cited by: §1.
  6. N. P. Jouppi, C. Young, N. Patil, D. Patterson, G. Agrawal, R. Bajwa, S. Bates, S. Bhatia, N. Boden and A. Borchers (2017) In-datacenter performance analysis of a tensor processing unit. In 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA), pp. 1–12. Cited by: §1.
  7. Q. Li and G. Clark (2013) Mobile security: a look ahead. IEEE Security & Privacy 11 (1), pp. 78–81. Cited by: §1.1.
  8. W. Maass and H. Markram (2004) On the computational power of circuits of spiking neurons. Journal of Computer and System Sciences 69 (4), pp. 593–616. ISSN 0022-0000. Cited by: §1.2.
  9. W. Maass, T. Natschläger and H. Markram (2002) Real-time computing without stable states: a new framework for neural computation based on perturbations. Neural Computation 14 (11), pp. 2531–2560. Cited by: §1.2.
  10. W. Maass (1997) Networks of spiking neurons: the third generation of neural network models. Neural Networks 10 (9), pp. 1659–1671. Cited by: §1.2.
  11. J. Palmer (1980) The Intel 8087 numeric data processor. In Proceedings of the 7th Annual Symposium on Computer Architecture, pp. 174–181. Cited by: §1.
  12. C. Qiu, H. Shen and L. Chen (2018) Towards green cloud computing: demand allocation and pricing policies for cloud service brokerage. IEEE Transactions on Big Data. Cited by: §1.
  13. B. Schrauwen, D. Verstraeten and J. Van Campenhout (2007) An overview of reservoir computing: theory, applications and implementations. In Proceedings of the 15th European Symposium on Artificial Neural Networks, pp. 471–482. Cited by: §1.2.