Spiking Networks for Improved Cognitive Abilities of Edge Computing Devices
This concept paper highlights a recently opened opportunity to train large-scale analytical algorithms directly on edge devices. Such an approach responds to the growing need to process data generated by a natural person (a human being), also known as personal data. Spiking neural networks are the core method behind it: they suit low-latency, energy-constrained hardware and enable local training or re-training, without relying on the scalability available in the Cloud.
A sudden realization came to us while preparing this white paper: mobile phones are the first type of device to receive dedicated math accelerators at a pervasive scale. Such accelerators never achieved wide adoption before: the Intel 8087 co-processor, Intel Xeon Phi [2, 5] and Google TPU (Tensor Processing Unit) remained niche devices that few people use and even fewer develop for. But in the last two years, major mobile phone companies have included dedicated co-processors for computational photography enhancement or facial recognition that are also suitable for general machine learning.
Currently, the dominant analytical approach stores data and runs computations in the Cloud. However, Cloud-based methods fit poorly to a range of important practical applications, including augmented reality, real-time data analysis, real-time user interaction, and the processing of sensitive data that incurs high risks for a company if leaked, stolen or intercepted in transit. The price of deployed analytical methods is increased by the need for a permanently working internet connection on the user side, and by cloud hardware rent on the service provider side.
1.1 Mobile-first Machine Learning
The difference between the operating systems running on mobile and desktop devices is smaller than many people think. It is limited to the interface, which interacts with window-based applications using mouse and keyboard on desktops, and with full-screen applications using touch gestures on mobile devices. The operating system kernel, storage and graphics are almost identical to their desktop analogues.
However, there is a factor that amplifies the small differences of mobile devices to the extent of making them useless for machine learning: the software. Researchers use programming languages (Python, R) created for systems with global folder access and command-line installers. These are not readily available on mobile devices, which sandbox all applications for security reasons. Even if they manage to run on mobile devices, their speed and efficiency are abysmal because they can only access a low-power CPU that needs to emulate advanced x86 architecture instructions.
The missing software problem is recognized by the mobile device vendors, who have addressed it by providing optimized libraries or even whole new programming languages.
1.2 Spiking Neural Networks as the "mobile brains"
A good supervised learning method is needed for mobile-first machine learning, and it must be able to train directly on the device. The Apple ecosystem provides accelerated libraries for all kinds of linear systems (dense and sparse), and for inference in arbitrary neural networks.
Deep Learning and traditional neural networks can be rejected for their slow and energy-inefficient training. Support Vector Machines and the Nearest Neighbour method are also slow at large scale. Decision Trees and Random Forests are missing the corresponding libraries, and linear models are limited in their learning ability. Native support for the concept of time is another requirement for a human-friendly method operated on mobile devices by actual people, and it is not available in any of the mentioned methods. A recurrent deep neural network with the training speed of a linear model would fit mobile devices perfectly.
Such a method actually exists, in the form of a spiking neural network implemented in the reservoir computing framework, also known as a Liquid State Machine [9, 8]. A spiking neuron has an internal state value that produces a binary output ("spike") and resets to zero upon reaching a threshold. That internal state provides networks of spiking neurons with the concept of time, and the binary output spikes are propagated energy-efficiently at the lowest possible precision on edge devices. Due to the internal states, spiking networks can be run only one step at a time, which limits the speedup from batch processing on large parallel devices (server-grade GPU accelerators) but fits well the low latency (due to local processing) and shared-memory GPUs of edge devices.
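The state, threshold and reset behaviour described above can be sketched as a leaky integrate-and-fire update; the leak factor, threshold value and constant input below are illustrative assumptions, not a prescribed parameterization:

```python
def lif_step(state, input_current, leak=0.9, threshold=1.0):
    """One update of a leaky integrate-and-fire neuron.

    The internal state decays by `leak` and accumulates the input;
    the neuron emits a binary spike when the state crosses `threshold`
    and resets the state to zero after spiking.
    """
    state = leak * state + input_current
    spike = 1.0 if state >= threshold else 0.0
    if spike:
        state = 0.0
    return state, spike

# Feed a constant input: the neuron integrates over several steps,
# then spikes and resets -- the spike timing encodes the input history,
# giving the network its native concept of time.
state, spikes = 0.0, []
for _ in range(10):
    state, s = lif_step(state, 0.4)
    spikes.append(s)
# spikes -> [0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0]
```

A stronger input would shorten the inter-spike interval, which is how a spiking neuron turns input magnitude into binary output timing.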
Spiking neurons have no effective training methods as they are not differentiable; but gradient-based training is a poor choice for edge devices anyway, due to its high computational power and energy demands. Liquid State Machines (LSM) offer a framework for spiking networks that avoids training the spiking neuron reservoir (a sparsely connected pool of neurons), see Figure 1. A network is initiated by randomly generating and fixing the sparse input weights, the sparse binary connection weights, and the spiking neuron parameters. An inference step consists of propagating spikes inside the reservoir by multiplying neuron outputs with the connection weights, computing the spiking neuron inputs by adding the input vector multiplied by the input weights, then updating and recording the states of the spiking neurons. The most computationally demanding part, the sparse matrix-vector multiplication with the connection weights, is done at the lowest precision as their values are binary. The network is trained by learning the weights of a linear output layer between the recorded spiking neuron states and the true outputs. The network can be re-trained at any time with new true outputs and the recorded states, without the need to re-run inference, as all inference parameters are fixed.
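As a concrete illustration of this pipeline, the following NumPy sketch fixes random sparse input weights and binary reservoir connections, records reservoir states during inference, and trains only a linear readout by least squares. All sizes, sparsity levels, and the leaky integrate-and-fire update rule are assumptions for illustration, not the exact formulation behind Figure 1:

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed, randomly generated parameters -- never trained.
n_in, n_res, n_steps = 3, 50, 100
W_in = rng.normal(size=(n_res, n_in)) * (rng.random((n_res, n_in)) < 0.2)  # sparse input weights
W_res = (rng.random((n_res, n_res)) < 0.05).astype(float)                  # sparse binary connection weights
leak, threshold = 0.9, 1.0

def run_reservoir(inputs):
    """Inference: propagate spikes one step at a time and record states."""
    state = np.zeros(n_res)
    spikes = np.zeros(n_res)
    states = []
    for x in inputs:
        # Spikes are binary, so W_res @ spikes is the cheapest possible
        # (lowest-precision) sparse matrix-vector product.
        state = leak * state + W_res @ spikes + W_in @ x
        spikes = (state >= threshold).astype(float)
        state = np.where(spikes > 0, 0.0, state)  # reset fired neurons
        states.append(state.copy())
    return np.array(states)

inputs = rng.random((n_steps, n_in))
targets = rng.random((n_steps, 1))

# Training: a single linear least-squares solve for the output layer.
H = run_reservoir(inputs)
W_out = np.linalg.lstsq(H, targets, rcond=None)[0]

# Re-training with new true outputs reuses the recorded states H:
# no inference pass is repeated, since all inference parameters are fixed.
new_targets = rng.random((n_steps, 1))
W_out2 = np.linalg.lstsq(H, new_targets, rcond=None)[0]
```

The design point this sketch makes is that on-device (re-)training reduces to one linear solve over the recorded state matrix, which is exactly the workload that mobile accelerated linear-algebra libraries handle well.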
2 Conclusions: Native machine learning at the edge
The development tools and hardware accelerators for mobile machine learning are already available, but researchers are generally unaware of the computational power of these devices, or are skeptical about the feasibility of typical machine learning methods running on edge devices.
This paper explores the motivation behind training on edge devices, and highlights a suitable method based on spiking neural networks. It fits the hardware specifics of such devices, is less affected by their drawbacks, and makes large-scale machine learning with model training or re-training feasible directly at the edge.
- (2018) High-performance ELM for memory constrained edge computing devices with Metal Performance Shaders. Extreme Learning Machines (ELM) 2018.
- (2014) Intel® Xeon Phi™ coprocessor: the architecture. Intel Whitepaper 176.
- (2011) Intel AVX2 will bring integer instructions with 256-bit SIMD numeric processing capabilities. Dr. Dobb's Bloggers, Jun 24.
- (2013) Intel® Xeon Phi™ coprocessors. In Modern Accelerator Technologies for Geographic Information Science, pp. 25–39.
- (2017) In-datacenter performance analysis of a tensor processing unit. In Proceedings of the 44th Annual International Symposium on Computer Architecture (ISCA), pp. 1–12.
- (2013) Mobile security: a look ahead. IEEE Security & Privacy 11(1), pp. 78–81.
- (2004) On the computational power of circuits of spiking neurons. Journal of Computer and System Sciences 69(4), pp. 593–616.
- (2002) Real-time computing without stable states: a new framework for neural computation based on perturbations. Neural Computation 14(11), pp. 2531–2560.
- (1997) Networks of spiking neurons: the third generation of neural network models. Neural Networks 10(9), pp. 1659–1671.
- (1980) The Intel® 8087 numeric data processor. In Proceedings of the 7th Annual Symposium on Computer Architecture, pp. 174–181.
- (2018) Towards green cloud computing: demand allocation and pricing policies for cloud service brokerage. IEEE Transactions on Big Data.
- (2007) An overview of reservoir computing: theory, applications and implementations. In Proceedings of the 15th European Symposium on Artificial Neural Networks, pp. 471–482.