HHHFL: Hierarchical Heterogeneous Horizontal Federated Learning for Electroencephalography
Electroencephalography (EEG) classification techniques have been widely studied for human behavior and emotion recognition tasks. However, the task remains challenging because the data may vary from subject to subject, may change over time for the same subject, and may be heterogeneous. In recent years, increasing privacy-preserving demands have posed new challenges to this task. Previous studies have addressed neither the data heterogeneity nor the privacy constraints of EEG data. To fill this gap, we propose a heterogeneous federated learning approach that trains machine learning models over heterogeneous EEG data while preserving the data privacy of each party. To verify the effectiveness of our approach, we conduct experiments on a real-world EEG dataset consisting of heterogeneous data collected from diverse devices. Our approach achieves consistent performance improvement on every task.
Electroencephalography (EEG) is an electrophysiological monitoring method that records the electrical activity of the brain [chong2013parameter]. It is typically noninvasive, with the electrodes placed along the scalp. Studies have shown that EEG signals can effectively reflect a person’s fatigue, panic, alertness, behavioral intentions, epilepsy, and other information [subha2010eeg]. EEG signals therefore have a wide range of applications. In education, for example, EEG can be used to inspect and improve the concentration of students [angelakis2007eeg], and its application in neurofeedback has been investigated to help students learn to control and change their brain activity [Li:2009:TAL:1631111.1631118].
Many EEG studies use model-based approaches, which require a large amount of data to build an accurate and robust model. In these approaches, the EEG data of different people are collected on a central server for further processing. However, because EEG signals reflect brain activity in numerous aspects, the potential abuse of EEG data may lead to severe privacy violations. Such ethical and privacy concerns have attracted public attention [yu2018building]. In the European Union, the General Data Protection Regulation (GDPR) [regulation2016regulation] specifies many terms for protecting user privacy and prohibits organizations from exchanging data without explicit user approval. Under increasingly stringent data security and privacy protection legislation, it is hard to collect abundant data for any single device, and it is important to conduct joint EEG signal analysis while protecting user privacy.
Federated learning (FL) [DBLP:journals/corr/McMahanMRA16] is an emerging but powerful technique for this kind of problem: it jointly trains a machine learning model using data on different clients while preserving the privacy of their data. Existing federated learning work mainly focuses on homogeneous datasets, where the different parties share the same feature space.
In EEG signal collection, even a single equipment manufacturer may develop acquisition equipment with varying electrode numbers, positions and sampling rates, let alone different equipment vendors. Such device diversity further exacerbates the scarcity of training data and leads to numerous distributed heterogeneous datasets in EEG classification. Heterogeneous domain adaptation, which links different feature spaces based on labels, has been studied [wang2011heterogeneous11]. However, existing solutions assume multiple source domains with abundant labeled instances and one target domain with limited labeled instances as well as unlabeled instances. When applied to our problem setting, existing heterogeneous domain adaptation approaches suffer a significant drop in accuracy. The reason is that in our setting there are numerous clients, each with limited labeled data, so hardly any of them can act as a source domain. Moreover, heterogeneous domain adaptation would have to be run multiple times, taking each party in turn as the target domain.
2 Related Work
EEG data classification has been extensively studied using various machine learning approaches [hwang2013eeg; cnnSchirr17; bird2019deep; 6335751]. Among these, Schirrmeister et al. [cnnSchirr17] used convolutional neural networks (CNN) to distinguish pathological from normal EEG recordings and reached state-of-the-art performance. Bird et al. [bird2019deep] explored Multi-layer Perceptron (MLP) and Long Short-Term Memory (LSTM) models augmented with adaptive boosting. Shenkai and Yaochu noticed the heterogeneity problem and proposed a heterogeneous classifier that consists of different base classifiers as well as different feature extraction approaches for accurate EEG signal classification. Although data collected from different devices for the same task have been studied for EEG signal classification [roesler2014comparison], as far as we know, there has been no attempt to build an EEG classifier over heterogeneous EEG data collected from different devices, let alone one that also takes the privacy issue into consideration.
FL is a potential technique for solving this problem. It was first proposed by Google in 2016 [DBLP:journals/corr/McMahanMRA16; DBLP:journals/corr/KonecnyMRR16] and can train machine learning models under privacy constraints using decentralized data residing on end devices. A possible approach to bridging heterogeneous feature spaces in FL is discussed in [liu2018secure], which proposes a federated transfer learning (FTL) approach that leverages instance co-occurrence across parties and builds multiple correlated models instead of a single model. Federated multi-task learning is studied in [NIPS2017_7029], which deals with each task separately. Symmetric transformation approaches [wang2011heterogeneous11; wang2018heterogeneous18; duan2012learning] project the source and target feature spaces into a domain-invariant feature subspace to associate cross-domain data, but most of them aim to boost performance in a single target domain.
3 Problem Statement
Each EEG device captures signals from multiple sensors located in different areas of the brain. The sensor readings are high-dimensional data distributed on an embedding submanifold whose ambient dimension is large.
Assume we have several clusters of sensor readings, each lying on its own embedding submanifold. The sensor readings are private and can only be accessed by their users. We aim to find a privacy-preserving classifier that detects, from electroencephalogram signals, whether a subject is recognizing the digit in an image.
In this section, we present our approach to finding the privacy-preserving classifier. We adopt two strategies to address the high dimensionality and the data privacy issues of sensor readings collected by EEG devices from multiple data sources, described in Section 4.1 and Section 4.3 respectively. The hierarchical architecture of our approach is shown in Figure 1 and in Figure 6 in the Appendix.
4.1 Manifold Projection
For each cluster of sensor readings, we build its own projection that maps the cluster's embedding submanifold to the common embedding space. Once the manifold projections are established, we require the projected raw EEG data to fall close to the common embedding space. We approximate these projections with neural networks. The manifold projection is illustrated in Figure 2.
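The idea of one projection per device can be sketched as follows. This is a minimal illustration, not the architecture used in the experiments: a random linear map stands in for each neural network, the helper names `make_projection` and `project` are hypothetical, and the input sizes and the embedding size of 10 are the ones reported in Section 5.3.

```python
import random


def make_projection(in_dim, out_dim, rng):
    """Create a random linear map (weights, bias) from in_dim to out_dim.

    Assumption: a linear map stands in for the per-device neural network.
    """
    scale = 1.0 / in_dim ** 0.5
    weights = [[rng.gauss(0.0, scale) for _ in range(in_dim)]
               for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias


def project(sample, projection):
    """Map one flattened EEG sample into the common embedding space."""
    weights, bias = projection
    return [sum(w * v for w, v in zip(row, sample)) + b
            for row, b in zip(weights, bias)]


rng = random.Random(0)
COMMON_DIM = 10
INPUT_DIMS = {"MU": 512, "EP": 440, "MW": 1024}

# One private projection per device type, all sharing the same output space.
projections = {name: make_projection(dim, COMMON_DIM, rng)
               for name, dim in INPUT_DIMS.items()}

# Synthetic stand-ins for flattened EEG samples of each device.
samples = {name: [rng.gauss(0.0, 1.0) for _ in range(dim)]
           for name, dim in INPUT_DIMS.items()}

embedded = {name: project(samples[name], projections[name])
            for name in INPUT_DIMS}
```

Whatever the input size, every device ends up with a vector of the same length, which is what allows a single classifier to be shared across devices.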
The loss in our architecture is designed to minimize both a classification loss and a domain loss, as follows.
Classification Loss: the typical classification loss (i.e. cross entropy), computed over the whole dataset and its ground-truth labels.
Domain Loss: We apply maximum mean discrepancy (MMD) [gretton2007kernel] to measure the distances between the probability distributions of the projected EEG data. Each cluster of projected data is treated as a sample from a probability distribution over the embedding space, and a feature map sends points of the embedding space into a reproducing kernel Hilbert space. The MMD in our case is computed between the local clusters of projected sensor readings.
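As a rough illustration of such a domain loss, the sketch below computes the standard biased empirical estimate of the squared MMD between two samples, using a Gaussian kernel in place of an explicit feature map. The kernel choice, the bandwidth, and the pairwise sum over clusters in `domain_loss` are assumptions made for this example and are not taken from the paper.

```python
import math
from itertools import combinations


def rbf_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel between two points given as lists of floats."""
    squared_distance = sum((p - q) ** 2 for p, q in zip(a, b))
    return math.exp(-squared_distance / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth=1.0):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(rbf_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased empirical estimate of the squared MMD between two samples."""
    return (mean_kernel(xs, xs, bandwidth)
            + mean_kernel(ys, ys, bandwidth)
            - 2.0 * mean_kernel(xs, ys, bandwidth))


def domain_loss(clusters, bandwidth=1.0):
    """Assumed aggregation: sum of squared MMD over all pairs of clusters."""
    return sum(mmd_squared(a, b, bandwidth)
               for a, b in combinations(clusters, 2))


# Toy projected clusters in a 2-dimensional embedding space.
cluster_a = [[0.0, 0.0], [1.0, 0.0]]
cluster_near = [[0.1, 0.0], [0.9, 0.1]]
cluster_far = [[5.0, 5.0], [6.0, 5.0]]

near_value = mmd_squared(cluster_a, cluster_near)
far_value = mmd_squared(cluster_a, cluster_far)
```

Clusters whose projected points overlap give a value close to zero, while well-separated clusters give a larger one, so minimizing this quantity pulls the projected distributions of different devices together.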
The overall loss is the sum of the two losses above, the classification loss and the domain loss.
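With symbols chosen here purely for illustration rather than taken from the original notation, let $\mathcal{L}_{c}$ denote the cross-entropy classification loss over $n$ labeled samples with one-hot labels $y_{i,k}$ and predicted class probabilities $\hat{p}_{i,k}$, and let $\mathcal{L}_{d}$ denote the MMD-based domain loss. Under the assumption that the two terms are added without weighting, the overall objective can be written as:

```latex
\mathcal{L}_{c} = -\frac{1}{n} \sum_{i=1}^{n} \sum_{k} y_{i,k} \log \hat{p}_{i,k},
\qquad
\mathcal{L} = \mathcal{L}_{c} + \mathcal{L}_{d}
```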
4.3 Federated Learning
Because of the privacy issues of EEG data, we leverage FL to train a model for accurate brain activity inference. With FL, we can train a global model without direct access to the raw training EEG data. Specifically, FL follows a server-client setting in which a server acts as the model aggregator. In each round, the server collects the updated feature mapping models and EEG classifiers from each client for model aggregation. Federated averaging [DBLP:journals/corr/McMahanMRA16] is conducted over the clients of each device for feature mapping aggregation, and over all the clients for EEG classifier aggregation. After model aggregation, the server sends the updated global model to each client. When a client receives the model sent by the server, it updates the model with its local data distributed on the projection manifold. The training process continues until the model converges.
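The two levels of aggregation in one server round can be sketched as follows. This is a simplified illustration under stated assumptions: model parameters are flat lists of floats, each client is a plain dictionary with hypothetical keys `device`, `n`, `mapper` and `classifier`, and the averages are weighted by local sample counts as in standard federated averaging. It does not use the PySyft API.

```python
from collections import defaultdict


def weighted_average(param_lists, weights):
    """Coordinate-wise average of parameter vectors, weighted by sample count."""
    total = float(sum(weights))
    length = len(param_lists[0])
    return [sum(w * params[i] for params, w in zip(param_lists, weights)) / total
            for i in range(length)]


def aggregate(clients):
    """Run one aggregation round on the server.

    Feature mappers are averaged only among clients of the same device,
    while the classifier is averaged over all clients.
    """
    by_device = defaultdict(list)
    for client in clients:
        by_device[client["device"]].append(client)

    mappers = {
        device: weighted_average([c["mapper"] for c in group],
                                 [c["n"] for c in group])
        for device, group in by_device.items()
    }
    classifier = weighted_average([c["classifier"] for c in clients],
                                  [c["n"] for c in clients])
    return mappers, classifier


# Toy clients: mapper sizes differ across devices, the classifier size does not.
clients = [
    {"device": "MU", "n": 10, "mapper": [1.0, 2.0], "classifier": [0.0, 0.0]},
    {"device": "MU", "n": 30, "mapper": [3.0, 4.0], "classifier": [4.0, 4.0]},
    {"device": "EP", "n": 60, "mapper": [5.0, 6.0, 7.0], "classifier": [1.0, 2.0]},
]

mappers, classifier = aggregate(clients)
```

Keeping a separate mapper per device is what lets feature spaces of different sizes coexist, while the shared classifier benefits from the data of every client.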
To verify the effectiveness of our approach, we conduct experiments on real-world EEG datasets.
The MindBigData dataset (http://www.mindbigdata.com/opendb/index.html) is a publicly available dataset containing millions of EEG brain signals of two seconds each, captured either with the stimulus of seeing a digit from 0 to 9 and thinking about it, or without that stimulus for contrast. Our task is to infer from an EEG signal whether or not the subject received the stimulus of seeing and thinking about a digit. Three types of devices with varying channels and sampling frequencies were used for the same task: 1) MindWave (MW) with one channel at 512 Hz, 2) EPOC (EP) with 14 channels at 128 Hz, and 3) Muse (MU) with 4 channels at 220 Hz. Federated learning over such a heterogeneous dataset requires heterogeneous transfer learning for subspace alignment.
5.2 Evaluation Metric and Methods
We use accuracy as the metric to measure the effectiveness of our approach, and compare the following methods.
Baseline: A neural network classifier is trained on each device dataset separately. The three device datasets yield three baselines in total.
MU + MW, MW + EP, or MU + EP: Our approach, HHHFL, is trained on each pair of the three device datasets. We refer to each setting by the abbreviations of its two datasets.
MU + MW + EP: Our approach, HHHFL, is trained on all three device datasets.
5.3 Implementation Details and Result Analysis
The experiments are conducted on a machine with a 1.3 GHz Intel Core i5 processor and 4 GB of 1600 MHz DDR3 memory. We implement each private network with a convolutional neural network (CNN) layer and a fully-connected (FC) layer, with input dimensions of 512, 440 and 1024 for the three devices MU, EP and MW, respectively. The reduced dimension of the common embedding space is 10. We adopt the PySyft framework for Horizontal Federated Learning (HFL).
6 Future Work
We plan to further strengthen data privacy preservation by applying secure multi-party computation and differential privacy. Secure multi-party computation can be applied to the proposed heterogeneous federated learning approach to prevent information leakage during model training. Differential privacy will also be leveraged to prevent membership-inference attacks by adversaries who query the model.
We would like to thank the FedAI community for their helpful suggestions and contributions.