Cloud-based Deep Learning of Big EEG Data for Epileptic Seizure Prediction
Developing a Brain-Computer Interface (BCI) for seizure prediction can help epileptic patients have a better quality of life. However, there are many difficulties and challenges in developing such a system as a real-life support for patients. Because of the nonstationary nature of EEG signals, normal and seizure patterns vary across different patients. Thus, finding a group of manually extracted features for the prediction task is not practical. Moreover, when implanted electrodes are used for brain recording, massive amounts of data are produced. This big data requires safe storage and high computational resources for real-time processing. To address these challenges, a cloud-based BCI system for the analysis of this big EEG data is presented. First, a dimensionality-reduction technique is developed to increase classification accuracy as well as to decrease the communication bandwidth and computation time. Second, following a deep-learning approach, a stacked autoencoder is trained in two steps for unsupervised feature extraction and classification. Third, a cloud-computing solution is proposed for real-time analysis of big EEG data. The results on a benchmark clinical dataset illustrate the superiority of the proposed patient-specific BCI as an alternative method and its expected usefulness in real-life support of epilepsy patients.
Motivation: Almost one percent of the world’s population suffers from epilepsy, a chronic disorder characterized by the occurrence of spontaneous seizures. Although the symptoms of a seizure may affect any part of the body, the electrical events that produce the symptoms occur in the brain. For about 30 percent of the patients, medications are not curative and, even after surgery, many patients may have spontaneous seizures. Anxiety due to the possibility of a seizure occurring may affect the quality of life of the patients as well as their safety, relationships, work conditions, driving, and more.
Vision: The use of computers to help physicians in the acquisition, management, storage, and reporting of biomedical signals is well established [3, 4, 5, 6]. To this end, Brain-Computer Interfaces (BCIs) use the electroencephalogram (EEG), a measure of brain waves. For example, a BCI system for seizure forecasting can help epileptic patients have a better quality of life. In order for such a BCI system to work effectively, computational algorithms must reliably identify periods of increased probability of seizure occurrence. If the occurrence of seizures could be identified in advance, designing devices to warn patients would be possible and patients could avoid dangerous activities like driving or swimming. Also, medications could be used only when needed to prevent impending seizures.
Challenges: There are many difficulties and challenges in developing a seizure-prediction system as a real-life support for epileptic patients. The first challenge is due to the fact that EEG is not a stationary signal. Therefore, normal and seizure patterns may vary across different patients. As a result, a fixed group of manually extracted features might not scale well to new patterns of seizure activity, and supervised feature extraction may not be sufficient for learning algorithms. The second challenge arises when electrodes are implanted within the head for intracranial electroencephalography (iEEG). This method of brain-signal recording has potential advantages like high spatio-temporal resolution and electro-optic mapping of the dynamic neuronal activity. However, implanted electrodes generate massive amounts of real-time data, leading to the big data problem. This situation calls for safe storage for the large volume of data and for high computational resources to process the data in real time.
Our Approach: Signal processing, machine learning, and brain-state prediction need to be carried out on big data in order to develop a practical BCI. Next-generation BCI systems may be connected to high-performance computing servers to process medical big data in real time. Cloud computing is a new Information and Communications Technology (ICT) that enables ubiquitous and on-demand access to computational resources through the global Internet. Our approach is to develop new processing and classification methods to be implemented as a cloud-based BCI.
Contributions: To address existing challenges, we introduce a cloud-based BCI system for the big data problem in epilepsy. Moreover, we have developed a deep-learning unsupervised feature-extraction technique for seizure prediction. Specifically, our contributions include the development of the following novel methods:
A dimensionality-reduction method using Principal and Independent Component Analysis to increase the classification accuracy as well as to reduce the computation time and the communication bandwidth.
A stacked autoencoder as a deep-learning structure to analyze EEG signals for epileptic seizure prediction.
A BCI system implemented in the cloud as safe storage with high computational resources for the big data problem generated by implanted electrodes.
The proposed system supports pervasive data collection and analysis, which is useful in real-life support for epileptic patients. To study its accuracy and performance, the system is evaluated and compared to other methods on a benchmark epilepsy dataset.
Paper Outline: The remainder of this paper is organized as follows. In Sect. II, we provide a literature review. In Sect. III, we present our solution, including dimensionality reduction, a novel stacked autoencoder as a deep-learning structure to analyze EEG signals, and a cloud-computing framework. Then, in Sect. IV, we discuss the proof-of-concept prototype of the proposed BCI seizure predictor and show preliminary results. Finally, in Sect. VI, the paper is concluded.
II Literature Review
In this section, we provide an overview of previous studies on seizure prediction systems and big data management in epilepsy. In , extraction of EEG features for epileptic seizure prediction is followed by an elimination-based feature selection method to improve efficacy and remove redundant points. In , a support vector machine (SVM) algorithm was developed to identify preictal states in continuous iEEG recordings of dogs with naturally occurring epilepsy. In , spectral power and ratios of spectral power were extracted from iEEG, processed by a second-order Kalman filter, and then input to a linear SVM classifier for epileptic seizure prediction. In , for classification of preictal and interictal stages, artifact-free preictal and interictal EEG epochs were acquired and characterized with global feature descriptors. In general, existing works have focused on local processing and storage without considering multiple channels and big patient data. In , our group developed a multi-tier distributed computing structure based on Mobile Device Cloud (MDC) and cloud computing for real-time epileptic seizure detection. In this work, we have developed a deep learning structure in the cloud to address the big data analysis problem in epilepsy. In contrast to the existing methods, the proposed method extracts unsupervised features from iEEG patterns to predict seizures.
III Proposed Work
We propose a seizure prediction system for real-time big data analysis of EEG that can be implemented as a cloud-based service. In Sect. III-A, the data dimension is reduced by principal and independent component analysis. Decreasing the data dimension decreases the telecommunication bandwidth needed for sending the data to the cloud, increases the classification accuracy by eliminating noise, and reduces the computation time and energy. In Sect. III-B, a deep learning technique is developed using a stacked autoencoder for unsupervised feature extraction from big unlabeled data. At the end of the network, a softmax layer distinguishes interictal (baseline) patterns from preictal (prior to seizure) signals. In Sect. III-C, a cloud-based architecture is described for the BCI system.
A. Dimensionality Reduction: For an efficient analysis of a complex data set, dimensionality reduction is critical. Given a high-dimensional data space, dimension reduction methods [12, 13] find a mapping to a lower-dimensional space such that the transformed data vector preserves most of the information of the original data. In , we proposed a method based on Infinite Independent Component Analysis (I-ICA) for the EEG feature selection task. In this paper, to enhance the dimensionality reduction process, a Principal Component Analysis (PCA) method is applied before I-ICA. PCA rotates the input data so that its covariance matrix becomes diagonal [15, 16, 17]. Then, each dimension is rescaled so that the covariance matrix equals the identity matrix. Components with small trailing eigenvalues are discarded, and computational complexity is decreased by removing pairwise dependencies. In this combination, PCA decorrelates the raw input EEG data and the remaining higher-order dependencies are separated by I-ICA. The proposed method for dimensionality reduction is described in Algorithm 1.
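The PCA stage described above (decorrelate, discard trailing eigenvalues, rescale to identity covariance) can be illustrated with a minimal sketch. It assumes NumPy is available; the function name `pca_whiten`, the synthetic 15-channel data, and the choice of three retained components are all hypothetical and are not taken from the paper. The I-ICA stage is not implemented here; any ICA routine would be applied to the whitened output `Z`.

```python
import numpy as np

def pca_whiten(X, n_components):
    """Project rows of X onto the top principal components and whiten them.

    X has shape (n_samples, n_channels). Returns the whitened data Z with
    shape (n_samples, n_components) and the projection matrix W.
    """
    Xc = X - X.mean(axis=0)                      # center each channel
    cov = np.cov(Xc, rowvar=False)               # channel covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]   # keep the largest ones
    W = eigvecs[:, order] / np.sqrt(eigvals[order])    # rescale to unit variance
    return Xc @ W, W

rng = np.random.default_rng(0)
# Hypothetical stand-in for one multichannel segment:
# 3 latent sources mixed into 15 channels plus a little noise.
sources = rng.laplace(size=(2000, 3))
mixing = rng.normal(size=(3, 15))
X = sources @ mixing + 0.01 * rng.normal(size=(2000, 15))

Z, W = pca_whiten(X, n_components=3)
cov_Z = np.cov(Z, rowvar=False)                  # close to the 3x3 identity
```

Dropping the trailing components before ICA is what reduces both the payload sent to the cloud and the cost of the later separation step; how many components to keep is a tuning choice that this sketch does not address.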
B. Deep Network: A stacked autoencoder, which is a class of deep neural networks, is developed with two sparse encoders as hidden layers. The stacked autoencoder captures the hierarchical grouping of the EEG input for the seizure prediction task. Each encoder maps its input to a hidden representation. The second hidden layer is smaller than the first, so the second encoder learns an even more compact representation of the input data. The deep network structure is shown in Fig. 1. The hidden layers are trained individually in an unsupervised manner: training data without labels are used, and each autoencoder learns to reconstruct its input at its output. A sparsity regularizer constrains the sparsity of the hidden-layer output, and its impact on training is controlled by a weighting term. The first autoencoder tends to learn first-order features in the raw EEG input. Using these primary features as the input to the second hidden layer, second-order features are extracted. Then, a softmax layer is trained and the layers are joined to form a deep network. Finally, the deep network is trained one final time in a supervised manner. The pseudocode of the proposed classification method is shown in Algorithm 2.
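The greedy layer-wise procedure described above can be sketched in a few lines of NumPy. This is an illustrative approximation under stated assumptions, not the paper's Algorithm 2: the layer sizes, learning rates, sparsity target `rho`, penalty weight `beta`, and the toy data are all hypothetical, plain gradient descent is used for simplicity, and the final supervised fine-tuning of the joined network is omitted.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_sparse_autoencoder(X, n_hidden, rho=0.05, beta=1.0, lr=0.5,
                             epochs=200, seed=0):
    """Train one sigmoid autoencoder with a KL sparsity penalty.

    Returns only the encoder weights; the decoder is discarded after training.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(scale=0.1, size=(d, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(scale=0.1, size=(n_hidden, d)); b2 = np.zeros(d)
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)                 # hidden code
        R = sigmoid(H @ W2 + b2)                 # reconstruction of the input
        rho_hat = np.clip(H.mean(axis=0), 1e-6, 1 - 1e-6)  # mean activation
        dR = (R - X) * R * (1 - R) / n           # grad of 0.5 * mean squared error
        # Backpropagate and add the gradient of beta * KL(rho || rho_hat).
        dH = dR @ W2.T + beta * (-rho / rho_hat + (1 - rho) / (1 - rho_hat)) / n
        dZ1 = dH * H * (1 - H)
        W2 -= lr * H.T @ dR;  b2 -= lr * dR.sum(axis=0)
        W1 -= lr * X.T @ dZ1; b1 -= lr * dZ1.sum(axis=0)
    return W1, b1

def train_softmax(F, y, n_classes, lr=0.5, epochs=300):
    """Fit a softmax layer on features F with integer labels y."""
    n, d = F.shape
    W = np.zeros((d, n_classes)); b = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]                     # one-hot targets
    for _ in range(epochs):
        logits = F @ W + b
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        P = np.exp(logits); P /= P.sum(axis=1, keepdims=True)
        G = (P - Y) / n                          # grad of mean cross-entropy
        W -= lr * F.T @ G; b -= lr * G.sum(axis=0)
    return W, b

rng = np.random.default_rng(1)
# Hypothetical toy data: two classes of 20-dimensional vectors in [0, 1].
X = np.clip(np.vstack([rng.normal(0.3, 0.1, size=(100, 20)),
                       rng.normal(0.7, 0.1, size=(100, 20))]), 0.0, 1.0)
y = np.array([0] * 100 + [1] * 100)              # 0 = interictal, 1 = preictal

W1, b1 = train_sparse_autoencoder(X, n_hidden=10)           # first-order features
H1 = sigmoid(X @ W1 + b1)
W2, b2 = train_sparse_autoencoder(H1, n_hidden=5, seed=1)   # second-order features
H2 = sigmoid(H1 @ W2 + b2)
Ws, bs = train_softmax(H2, y, n_classes=2)                  # supervised softmax layer

pred = np.argmax(H2 @ Ws + bs, axis=1)
train_accuracy = float((pred == y).mean())
```

Only the softmax step sees labels; both encoders are fitted on unlabeled inputs, which is the property that lets the approach use large unlabeled recordings.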
The main property of the stacked autoencoder is its ability to extract features from a large amount of unlabeled data, which makes it a suitable solution for the big data problem. A nonlinear transformation is applied to each layer’s input, and a representation is provided at the output. Thus, there is no need to extract EEG features by hand-engineering techniques for each patient. In a deep architecture, multiple nonlinear transformation layers are stacked together to represent a nonlinear function of the EEG data. In the last layer, a softmax layer, the gradient-log-normalizer of the categorical probability distribution, is used to classify this nonlinear function of the EEG as an interictal or preictal signal.
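The softmax output for class $k$ can be written as

$$P(c_k \mid x) = \frac{P(x \mid c_k)\,P(c_k)}{\sum_{j} P(x \mid c_j)\,P(c_j)} = \frac{\exp(a_k)}{\sum_{j} \exp(a_j)},$$

where $a_k = \ln\big(P(x \mid c_k)\,P(c_k)\big)$, $P(c_k)$ is the class prior probability, and $P(x \mid c_k)$ is the conditional probability of the sample given class $k$.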
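As a small worked example of the normalized-exponential form of a softmax layer, the sketch below turns log joint scores (log of likelihood times prior) into class posteriors. The priors and likelihoods are hypothetical numbers chosen for easy arithmetic, and the function name is illustrative.

```python
import math

def softmax_from_log_joint(log_joint):
    """Normalize scores a_k = log(P(x|c_k) * P(c_k)) into posteriors."""
    m = max(log_joint)                           # subtract the max for stability
    exps = [math.exp(a - m) for a in log_joint]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical values for one sample: class 0 = interictal, class 1 = preictal.
priors = [0.5, 0.5]
likelihoods = [0.02, 0.06]
a = [math.log(l * p) for l, p in zip(likelihoods, priors)]
posterior = softmax_from_log_joint(a)            # approximately [0.25, 0.75]
```

With equal priors, the posterior simply follows the likelihood ratio of 1 to 3, which is why the preictal class receives 0.75.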
C. Cloud Computing: Cloud computing provides a “limitless” scale of computing power that can be made available on demand and, by way of the Internet, is ubiquitously available with an extensive global reach. There are many cloud platforms, including those of Microsoft, Google, and Amazon. For the purposes of our study, and based on proven use cases for large-scale processing, we base our reference cloud usage on the Amazon cloud, Amazon Web Services (AWS). The cloud is generally divided into three layers based on the service provided: (1) Infrastructure as a Service (IaaS); (2) Platform as a Service (PaaS); and (3) Software as a Service (SaaS). These three layers each contribute to the infrastructural setup of the BCI as follows:
IaaS provides computing power, networking, storage, virtual orchestrators, and operating systems. It is available at large scale and on demand, with the ability to deliver High Performance Computing (HPC), which lends itself well to the processing required for rapid real-time epilepsy monitoring. An applicable BCI system dealing with large amounts of data from distributed electrodes requires storage capability and both rapid and timely event-related mining to produce intelligence in the form of trends, predictions, and recommendations. With a low cost of entry and ease of setup, the core engine of the BCI can be effectively deployed using AWS HPC. HPC processors allow the BCI system to function above a teraflop, or 10^12 floating-point operations per second, allowing for real-time results in spite of the large data input. The Health Insurance Portability and Accountability Act (HIPAA) and its Protected Health Information (PHI) provision also require service providers to adhere to strict assurances regarding the protection of personal data. Encryption and the use of AWS HIPAA-eligible services are therefore required to host the BCI system.
PaaS exposes an open-source platform, allowing developers from different constituencies to leverage the BCI and continue developing modules and customized features that adapt the application to their local practices and needs. SaaS delivers the BCI as a cloud-based application, making substantial processing power available and globally distributed, with decreased reliance on extensive local computer infrastructure to complete predictions. Aside from the standard electroencephalographic recording units and other specialized detection tools, analyses, simulations, and other high-end processes can be initiated from relatively light client applications, including smartphone apps.
A proof-of-concept prototype of the proposed BCI seizure predictor was developed in the Cloud and Autonomic Computing Center (CAC), Rutgers University. In this testbed, we chose to use a benchmark dataset of epilepsy, an HP laptop with an Intel i5 processor, 8 GB of RAM, and a battery capacity of 4400 mAh, and a supercluster of computers hosted by Amazon Elastic Compute Cloud (EC2). Message Queuing Telemetry Transport (MQTT) and RESTful Web Service protocols are used for sending data to the cloud. The clinical iEEG dataset of two epileptic patients (60 interictal and 60 preictal segments) with temporal and extratemporal lobe epilepsy has been used, which was jointly developed by the University of Pennsylvania and the Mayo Clinic, and sponsored by the American Epilepsy Society. The dataset was recorded by 15 electrodes. Preictal and interictal data are segmented into 10-minute-long clips. The sampling rate is 5000 Hz, and the recorded voltages are referenced to an electrode outside the brain. Preictal data segments cover the hour prior to a seizure, with a seizure horizon of five minutes. This pre-seizure horizon ensures that seizures can be forecast with enough warning to allow the use of medications to prevent the seizure from occurring. Fig. 2 compares the patterns of the interictal and preictal segments.
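To make the scale of the recording concrete, the following back-of-the-envelope sketch computes the size of one clip and the raw streaming rate from the figures quoted above (15 electrodes, 5000 Hz, 10-minute clips). The sample width of 2 bytes is an assumption; the source does not state the sample format.

```python
CHANNELS = 15            # electrodes, from the dataset description
SAMPLE_RATE_HZ = 5000    # sampling rate, from the dataset description
CLIP_SECONDS = 10 * 60   # 10-minute clips
BYTES_PER_SAMPLE = 2     # assumption: 16-bit samples (not stated in the source)

samples_per_clip = CHANNELS * SAMPLE_RATE_HZ * CLIP_SECONDS
bytes_per_clip = samples_per_clip * BYTES_PER_SAMPLE
stream_bits_per_second = CHANNELS * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * 8

print(f"{samples_per_clip:,} samples per clip")               # 45,000,000
print(f"{bytes_per_clip / 1e6:.0f} MB per clip")               # 90 MB
print(f"{stream_bits_per_second / 1e6:.1f} Mbit/s raw stream")  # 1.2 Mbit/s
```

Under that assumption, a single patient streams roughly 1.2 Mbit/s continuously before any dimensionality reduction, which is the bandwidth the preprocessing step is meant to cut down.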
The database consists of a few independent cases with a big data problem. Therefore, algorithms should be regularized against overfitting, and some techniques such as KNN or tree-based algorithms did not work well. However, since the proposed solution extracts the features in an unsupervised manner, the risk of overfitting is decreased. Moreover, to evaluate the generality of the results, we used leave-one-out as an exhaustive cross-validation technique [24, 25, 26]. Using this technique, the model is fitted to all but one sample of the EEG data and its accuracy is measured on the held-out sample; the procedure is repeated for every sample.
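Leave-one-out cross validation can be sketched generically as follows. The nearest-centroid classifier is only a placeholder so that the example runs on its own; it is not the classifier used in the paper, and the two-feature samples are hypothetical.

```python
def leave_one_out_accuracy(samples, labels, fit, predict):
    """Hold out each sample once, fit on the rest, and score the held-out one."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit(train_x, train_y)
        correct += int(predict(model, samples[i]) == labels[i])
    return correct / len(samples)

def fit_centroids(xs, ys):
    """Placeholder classifier: per-class mean of each feature."""
    centroids = {}
    for label in set(ys):
        rows = [x for x, y in zip(xs, ys) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict_centroid(centroids, x):
    """Return the label whose centroid is nearest in squared Euclidean distance."""
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))

# Hypothetical two-feature samples; 0 = interictal, 1 = preictal.
samples = [[0.1, 0.2], [0.2, 0.1], [0.0, 0.3], [0.9, 0.8], [0.8, 1.0], [1.0, 0.9]]
labels = [0, 0, 0, 1, 1, 1]
accuracy = leave_one_out_accuracy(samples, labels, fit_centroids, predict_centroid)
```

Because every sample is held out exactly once, the cost grows with the number of samples times the cost of one fit, which is acceptable for a dataset of this size but expensive for large ones.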
|Output interictal||Output preictal||Total|
The confusion matrix of the proposed method is shown in Table I. To evaluate the classification ability of the proposed unsupervised feature extraction, hand-crafted EEG feature sets are used for classification by the other methods listed in Table II. These extracted features are based on the fast Fourier transform, general energy average and energy standard deviation over time for each channel, power spectral density correlation coefficients, partial directed coherence of the coefficients, power in band, low-gamma phase synchrony, and the log of energy in different frequency bands for each channel. Experimental results in Table II show that the proposed deep learning method outperforms previous methods for the EEG seizure prediction task. The feasibility of using cloud computing is analyzed through the network latency offered by Amazon EC2 cloud servers. The Round Trip Time (RTT) to servers located at different geographical locations (Virginia, Oregon, Singapore, and Ireland) is measured for 64-byte EEG segments over 10 days using the “ping” command. The shortest RTT is 15 ms for the Virginia server and the longest RTT is 97 ms for the Oregon server.
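For readers who want to relate a two-class confusion matrix to summary rates, a minimal sketch follows. The counts are placeholders for illustration only and are not the paper's results; which metrics the columns of Table II report is not recoverable from this text, so the metric names below are assumptions.

```python
def binary_metrics(tp, fn, fp, tn):
    """Summary rates from a 2x2 confusion matrix (preictal = positive class)."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),           # preictal segments flagged
        "specificity": tn / (tn + fp),           # interictal segments passed
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
    }

# Placeholder counts for illustration only; they are NOT the paper's results.
metrics = binary_metrics(tp=45, fn=15, fp=12, tn=48)
```

For a seizure-warning device, the false negative rate (missed preictal segments) and the false positive rate (false alarms) are usually weighed separately rather than folded into a single accuracy figure.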
|MLP Neural Network||0.68||0.70||0.67||0.31||0.32|
Efficient handling and processing of medical big data can provide useful information about a patient and about diseases. This is now a high-focus area in data science. Intracranially implanted electrodes can be used for seizure prediction preparatory to stimulus delivery for aborting the event. Such electrodes generate considerable amounts of data, calling for safe storage and high computational resources to process big data. On the other hand, iEEG records a larger variety of patterns with fluctuations in amplitude and frequency, making feature extraction a challenging problem. In order to address these two broad issues, we introduced a novel cloud-based BCI to provide real-time seizure prediction from iEEG data. The proposed dimensionality-reduction preprocessing step provides more accurate classification and decreases energy, computation time, and communication bandwidth. The developed deep-learning methods are capable of unsupervised feature extraction and, therefore, represent a suitable substitute for manual feature-extraction techniques for classification purposes. These methods extract high-level, complex abstractions as data representations through a hierarchical learning process. The key benefit of the proposed method is that it can analyze and learn from massive amounts of unlabeled data, making it a practical method for developing a patient-specific seizure prediction system. A cloud-based deep-learning method that is able to perform seizure prediction under such circumstances has immediate applicability in the present day.
VI How to Cite Item
M.P. Hosseini, H. Soltanian-Zadeh, K. Elisevich, D. Pompili, “Cloud-based Deep Learning of Big EEG Data for Epileptic Seizure Prediction,” IEEE Global Conference on Signal and Information Processing (GlobalSIP), IEEE, 2016.
-  “World Health Organization.” [Online]. Available: http://www.who.int/mediacentre/factsheets/fs999/en/, Retrieved on Sept 14, 2016.
-  M.-P. Hosseini, M. R. Nazem-Zadeh, D. Pompili, and H. Soltanian-Zadeh, “Statistical validation of automatic methods for hippocampus segmentation in mr images of epileptic patients,” in Proc. of IEEE International Conference of Engineering in Medicine and Biology Society (EMBC), 2014, pp. 4707–4710.
-  M. P. Hosseini, H. Soltanian-Zadeh, and S. Akhlaghpoor, “Detection and severity scoring of chronic obstructive pulmonary disease using volumetric analysis of lung ct images,” Iranian Journal of Radiology, vol. 9, no. 1, pp. 22–27, 2012.
-  M.-P. Hosseini, H. Soltanian-Zadeh, and S. Akhlaghpoor, “Three cuts method for identification of copd.” Acta Medica Iranica Journal, vol. 51, no. 11, pp. 771–778, 2013.
-  M.-P. Hosseini, H. Soltanian-Zadeh, S. Akhlaghpoor, A. Jalali, and M. Bakhshayesh Karam, “Designing a new cad system for pulmonary nodule detection in high resolution computed tomography (hrct) images,” Tehran University Medical Journal (TUMJ), vol. 70, no. 4, pp. 250–256, 2012.
-  M. P. Hosseini, H. Soltanian-Zadeh, and S. Akhlaghpoor, “Computer-aided diagnosis system for the evaluation of chronic obstructive pulmonary disease on ct images,” Tehran University Medical Journal TUMS Publications, vol. 68, no. 12, pp. 718–725, 2011.
-  N. Wang and M. R. Lyu, “Extracting and selecting distinctive eeg features for efficient epileptic seizure prediction,” IEEE Journal of Biomedical and Health Informatics, vol. 19, no. 5, pp. 1648–1659, 2015.
-  B. H. Brinkmann, E. E. Patterson, C. Vite, V. M. Vasoli, D. Crepeau, M. Stead, J. J. Howbert, V. Cherkassky, J. B. Wagenaar, B. Litt et al., “Forecasting seizures using intracranial eeg measures and svm in naturally occurring canine epilepsy,” PloS one, vol. 10, no. 8, 2015.
-  Z. Zhang and K. K. Parhi, “Low-complexity seizure prediction from ieeg/seeg using spectral power and ratios of spectral power,” IEEE transactions on biomedical circuits and systems, vol. 10, no. 3, pp. 693–706, 2016.
-  L.-C. Lin, S. C.-J. Chen, C.-T. Chiang, H.-C. Wu, R.-C. Yang, and C.-S. Ouyang, “Classification preictal and interictal stages via integrating interchannel and time-domain analysis of eeg features,” Clinical EEG and neuroscience, 2016.
-  M.-P. Hosseini, A. Hajisami, and D. Pompili, “Real-time epileptic seizure detection from eeg signals via random subspace ensemble learning,” IEEE International Conference on Autonomic Computing (ICAC), pp. 209–218, 2016.
-  B. Babagholami-Mohamadabadi, A. Zarghami, M. Zolfaghari, and M. S. Baghshah, “Pssdl: Probabilistic semi-supervised dictionary learning,” in Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 2013, pp. 192–207.
-  M. Rahmani and G. Atia, “A subspace learning approach for high dimensional matrix decomposition with efficient column/row sampling,” in Proceedings of The 33rd International Conference on Machine Learning, 2016, pp. 1206–1214.
-  D. Knowles and Z. Ghahramani, “Infinite sparse factor analysis and infinite independent components analysis,” in Independent Component Analysis and Signal Separation. Springer Berlin Heidelberg, 2007, pp. 381–388.
-  S. Minaee and Y. Wang, “Screen content image segmentation using least absolute deviation fitting,” in IEEE International Conference on Image Processing 2015, 2015.
-  M. Rahmani and G. Atia, “Randomized robust subspace recovery for high dimensional data matrices,” arXiv preprint arXiv:1505.05901, 2015.
-  M. Joneidi, P. Ahmadi, M. Sadeghi, and N. Rahnavard, “Union of low-rank subspaces detector,” IET Signal Processing, vol. 10, no. 1, pp. 55–62, 2016.
-  S. Minaee and Y. Wang, “Fingerprint recognition using translation invariant scattering network,” in IEEE Signal Processing in Medicine and Biology Symposium 2015, 2015.
-  Y. Chen, Z. Lin, X. Zhao, G. Wang, and Y. Gu, “Deep learning-based classification of hyperspectral data,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 7, no. 6, pp. 2094–2107, 2014.
-  R. Guo, L. Liu, W. Wang, A. Taalimi, C. Zhang, and H. Qi, “Deep tree-structured face: A unified representation for multi-task facial biometrics,” in 2016 IEEE Winter Conference on Applications of Computer Vision (WACV). IEEE, 2016, pp. 1–8.
-  “Amazon web services.” [Online]. Available: https://aws.amazon.com/hpc/
-  J. K. Zao, T. Gan, C. You, C. Chung, Y. Wang, S. J. Mndez, T. Mullen, C. Yu, C. Kothe, C. Hsiao, et al., “Pervasive brain monitoring and data sharing based on multi-tier distributed computing and linked data technology,” Frontiers in human neuroscience, vol. 8, 2014.
-  B. H. Brinkmann, M. R. Bower, K. A. Stengel, G. A. Worrell, and M. Stead, “Large-scale electrophysiology: acquisition, compression, encryption, and storage of big data,” Journal of neuroscience methods, vol. 180, no. 1, pp. 185–192, 2009.
-  M.-P. Hosseini, M. R. Nazem-Zadeh, F. Mahmoudi, H. Ying, and H. Soltanian-Zadeh, “Support vector machine with nonlinear-kernel optimization for lateralization of epileptogenic hippocampus in mr images,” in Proc. of IEEE International Conference of Engineering in Medicine and Biology Society (EMBC), 2014, pp. 1047–1050.
-  M.-R. Nazem-Zadeh, J. M. Schwalb, H. Bagher-Ebadian, F. Mahmoudi, M.-P. Hosseini, K. Jafari-Khouzani, K. V. Elisevich, and H. Soltanian-Zadeh, “Lateralization of temporal lobe epilepsy by imaging-based response-driven multinomial multivariate models,” in Proc. of IEEE International Conference of Engineering in Medicine and Biology Society (EMBC). IEEE, 2014, pp. 5595–5598.
-  M.-P. Hosseini, M. R. Nazem-Zadeh, D. Pompili, K. Jafari-Khouzani, K. Elisevich, and H. Soltanian-Zadeh, “Automatic and manual segmentation of hippocampus in epileptic patients mri,” in 6th annual New York Medical Imaging Informatics Symposium (NYMIIS). Staten Island University Hospital, NY, USA, 2015.
-  M.-P. Hosseini, M.-R. Nazem-Zadeh, D. Pompili, K. Jafari-Khouzani, K. Elisevich, and H. Soltanian-Zadeh, “Comparative performance evaluation of automated segmentation methods of hippocampus from magnetic resonance images of temporal lobe epilepsy patients,” Medical Physics, vol. 43, no. 1, pp. 538–553, 2016.
-  J. Ge and G. Zhang, “Novel images extraction model using improved delay vector variance feature extraction and multi-kernel neural network for eeg detection and prediction,” Technology and Health Care, vol. 23, no. s1, pp. S151–S155, 2015.