A Multimodal Approach to Estimating Vigilance Using EEG and Forehead EOG
Objective. Covert aspects of ongoing user mental states provide key context information for user-aware human computer interactions. In this paper, we focus on the problem of estimating the vigilance of users using EEG and EOG signals. Approach. To improve the feasibility and wearability of vigilance estimation devices for real-world applications, we adopt a novel electrode placement for forehead EOG and extract various eye movement features, which contain the principal information of traditional EOG. We explore the effects of EEG from different brain areas and combine EEG and forehead EOG to leverage their complementary characteristics for vigilance estimation. Because the intrinsic mental states of users evolve over time, vigilance is a dynamically changing process, and we introduce continuous conditional neural field and continuous conditional random field models to capture this temporal dependency. Main results. We propose a multimodal approach to estimating vigilance by combining EEG and forehead EOG and incorporating the temporal dependency of vigilance into model training. The experimental results demonstrate that modality fusion improves performance compared with a single modality, that EOG and EEG contain complementary information for vigilance estimation, and that the temporal dependency-based models enhance the performance of vigilance estimation. From the experimental results, we observe that theta and alpha frequency activities are increased, while gamma frequency activities are decreased, in drowsy states in contrast to awake states. Significance. The forehead setup allows for the simultaneous collection of EEG and EOG and achieves performance comparable to that of the temporal and posterior sites using only four shared electrodes.
Humans interact with their surrounding complex environments based on their current states, and context awareness plays an important role during such interactions. However, the majority of the existing systems lack this ability and generally interact with users in a rule-based fashion. Covert aspects of ongoing user mental states provide key context information in user-aware human computer interactions (?), which can help systems react adaptively in a proper manner. Various studies have introduced the assessment of the mental states of users, such as intention, emotion, and workload, to promote active interactions between users and machines (?, ?, ?, ?). Zander and Kothe proposed the concept of a passive brain-computer interface (BCI) to fuse conventional BCI systems with cognitive monitoring (?). It is attractive to implement these novel BCI systems with increasing information flow of human states without simultaneously increasing the cost significantly. Among these cognitive states, vigilance is a vital component, which refers to the ability to endogenously maintain focus.
Various working environments require sustained high vigilance, particularly for some dangerous occupations such as driving trucks and high-speed trains. In these cases, a decrease in vigilance (?) or a momentary lapse of attention (?, ?) might severely endanger public transportation safety. Driving fatigue is reported to be a major factor in fatal road accidents.
Various approaches for estimating vigilance levels have been proposed in the literature (?, ?, ?). However, several research challenges still exist. Vigilance decrement is a dynamic changing process because the intrinsic mental states of users involve temporal evolution rather than a time point. This process cannot simply be treated as a function of the duration of time while engaged in tasks. The ability to predict vigilance levels with high temporal resolution is more feasible in real-world applications (?). Moreover, drivers’ vigilance levels cannot be simply classified into several discrete categories but should be quantified in the same way as the blood alcohol level (?, ?). We still lack a standardized method for measuring the overall vigilance levels of humans.
Among various modalities, EEG is reported to be a promising neurophysiological indicator of the transition between wakefulness and sleep in various studies because EEG signals directly reflect human brain activity (?, ?, ?, ?, ?, ?). Rosenberg and colleagues recently presented a neuromarker for sustained attention from whole-brain functional connectivity (?). They developed a network model called the sustained attention network for predicting attentional performance. Moreover, EEG has intrinsic potential to allow fatigue detection at onset or even before onset (?). O’Connell and colleagues examined the temporal dynamics of EEG signals preceding a lapse of sustained attention (?). Their results demonstrated that the specific neural signatures of attentional lapses are registered in the EEG up to 20 s prior to an error. Lin et al. presented a wireless and wearable EEG system for evaluating drivers’ vigilance levels, and they tested their system in a virtual driving environment (?). They also combined lapse detection and feedback efficacy assessment for implementing a closed-loop system. By monitoring the changes of EEG patterns, they were able to detect driving performance and estimate the efficacy of arousing warning feedback delivered to drowsy subjects (?).
In addition to EEG, EOG signals contain characteristic information on various eye movements and are often utilized to estimate vigilance because of their easy setup and high signal-to-noise ratio (?, ?, ?, ?). Researchers have developed various multimodal approaches for constructing hybrid BCIs (?) and combining brain signals and eye movements for robotic control and cognitive monitoring (?, ?, ?, ?, ?). Simola et al. studied the valence and arousal interactions under free viewing of emotional scenes by analysing eye movement behaviours and eye-fixation-related potentials (?). Their findings support the multi-dimensional, interactive model of emotional processing. Moreover, Bulling and colleagues found that eye movements from EOG signals are good indicators for activity recognition (?). However, the electrodes in the traditional EOG are placed around the eyes, which may distract users and cause discomfort. In our previous study, we proposed a new electrode placement on the forehead and extracted various eye movement features from the forehead EOG (?, ?). Various studies have indicated that signals from different modalities represent different aspects of covert mental states (?, ?, ?). EEG and EOG represent internal cognitive states and external subconscious behaviours, respectively. These two modalities contain complementary information and can be integrated to construct a more robust vigilance estimation model.
In this paper, we present a multimodal approach for vigilance estimation by combining EEG and forehead EOG. The main contributions of this paper are as follows: 1) we explore the effect of EEG for vigilance estimation in different brain areas: frontal, temporal, and posterior; 2) we propose a multimodal vigilance estimation framework with EEG and forehead EOG in terms of feasibility and accuracy; 3) we acquire both EEG and EOG signals simultaneously with four shared electrodes on the forehead and combine them for vigilance estimation; 4) we reveal the complementary characteristics of EEG and forehead EOG modalities for vigilance estimation; 5) we apply continuous conditional neural field (CCNF) and continuous conditional random field (CCRF) models to enhance the performance of the vigilance estimation model to capture dynamic temporal dependency; and 6) we investigate neural patterns regarding critical frequency activities under awake and drowsy states.
2.1 Experiment Setup
To collect EEG and EOG data, we developed a virtual-reality-based simulated driving system. A four-lane highway scene is shown on a large LCD screen in front of a real vehicle without the unnecessary engine and other components. The vehicle movements in the software are controlled by the steering wheel and gas pedal, and the scenes are simultaneously updated according to the participants’ operations. The road is primarily straight and monotonous to induce fatigue in the subjects more easily. The simulated driving system and the experimental scene are shown in Figure 1.
A total of 23 subjects (mean age: 23.3, STD: 1.4, 12 females) participated in the experiments. All participants possessed normal or corrected-to-normal vision. Caffeine, tobacco, and alcohol were prohibited prior to participating in the experiments. At the beginning of the experiments, a short pre-test was performed to ensure that every participant understood the instructions. Most experiments were performed in the early afternoon (approximately 13:30) after lunch to induce fatigue easily when the circadian rhythm of sleepiness reached its peak (?). The duration of the entire experiment was approximately 2 hours. The participants were asked to drive the car in the simulated environments without any alertness.
Both EEG and forehead EOG signals were recorded simultaneously using the Neuroscan system with a 1000 Hz sampling rate. The electrode placement of the forehead EOG (?) is shown in Figure 2. For the EEG setup, we recorded 12-channel EEG signals from the posterior site and 6-channel EEG signals from the temporal site according to the international 10-20 electrode system shown in Figure 3. Eye movements were simultaneously recorded using SMI ETG eye tracking glasses (http://eyetracking-glasses.com/), and the facial video was recorded from a video camera mounted in front of the participants.
To facilitate reproduction of the results in this paper and to enhance cooperation in related research fields, the dataset used in this study will be made freely available to the academic community as a subset of SEED (http://bcmi.sjtu.edu.cn/~seed/).
2.2 Vigilance Annotations
The primary challenge of vigilance estimation using a supervised machine learning paradigm is how to quantitatively label the sensor data because the ground truth of covert mental states cannot be accurately obtained in theory. To date, researchers have proposed various vigilance annotation methods in the literature, such as lane departure and local error rates (?, ?). Lin et al. designed an event-related lane-departure driving task in which the subjects were asked to respond to the random drifts as soon as possible and the response time reflected the vigilance states of the subjects (?, ?). Shi and Lu (?) conducted a study in which the local error rate of the subjects’ performance was used as the vigilance measurement. The subjects were asked to press correct buttons according to the colours of traffic signs. These two annotation methods are based on subjects’ behaviours and can reflect their actual vigilance levels to some extent. However, they are not feasible for dual tasks, particularly in real-world driving environments.
There is another annotation method called PERCLOS (?), which refers to the percentage of eye closure. It is one of the most widely accepted vigilance indices in the literature (?, ?, ?). Conventional driving fatigue detection methods utilize facial videos to calculate the PERCLOS index. However, the performance of facial videos can be influenced by environmental changes, especially for various illuminations and heavy occlusion. In this study, we adopt an automatic continuous vigilance annotation method using eye tracking glasses, which was proposed in our previous work (?). This approach allows vigilance to be measured in both laboratory and real-world environments.
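Compared with facial videos, eye tracking glasses can more precisely capture different eye movements, such as blink, fixation, and saccade, as shown in Figure 4. The eye tracking-based PERCLOS index can be calculated from the percentage of the durations of blinks and ‘CLOS’ over a specified time interval as follows:
$$\mathrm{PERCLOS} = \frac{\mathrm{blink} + \mathrm{CLOS}}{\mathrm{interval}},$$
where ‘blink’ denotes the total duration of the blinks, ‘CLOS’ denotes the duration of the eye closures, and ‘interval’ denotes the length of the specified time interval.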
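As a minimal sketch of the PERCLOS calculation described above, the following function computes the index from per-interval durations. The function name, the argument names, and the use of seconds as the unit are assumptions made for illustration; they do not come from the eye tracking software.

```python
def perclos(blink_s: float, clos_s: float, interval_s: float) -> float:
    """Fraction of a time interval spent in blinks or eye closures.

    All durations are in seconds (hypothetical unit choice).
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    value = (blink_s + clos_s) / interval_s
    if not 0.0 <= value <= 1.0:
        raise ValueError("blink and closure durations must fit inside the interval")
    return value


# Example: 1 s of blinks and 1 s of closure in an 8 s window
index = perclos(1.0, 1.0, 8.0)
```

A value near 0 corresponds to eyes that are almost always open, and a value near 1 to eyes that are closed for most of the interval.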
We evaluated the efficiency of the eye tracking-based method for vigilance annotations with the facial videos recorded simultaneously and found a high correlation between the PERCLOS index and the subject’s current cognitive states. Compared with other approaches (?, ?, ?), this method is more feasible for real-world driving environments, where performing dual tasks can distract attention and cause safety issues (?). This new vigilance annotation method can be performed automatically without too much interference to the drivers.
Note that although the eye tracking-based approach can estimate the vigilance level more precisely, it is not currently feasible for real-world applications because of the high cost of the device. Here, we utilize eye tracking glasses as a vigilance annotation device to obtain more accurately labelled EEG and EOG data for training vigilance estimation models.
2.3 Feature Extraction
2.3.1 Preprocessing for Forehead EOG
For traditional EOG recordings, the electrodes are mounted around the eyes using the electrodes numbered one to four in Figure 2 (a). However, in real-world applications, such electrode placement is not easily mounted and may distract users with discomfort. To implement wearable devices for real-world vigilance estimation, we propose placing all the electrodes on the forehead, as shown in Figure 2 (b), and separating vertical EOG (VEO) and horizontal EOG (HEO) using the electrodes numbered four to seven shown in Figure 2 (b). For the traditional EOG setup shown in Figure 2 (a), the VEO and HEO signals are obtained by subtracting electrodes four and three and electrodes one and two, respectively. VEO and HEO signals contain details of eye movements, such as blink, saccade, and fixation.
How to extract VEO and HEO signals from the forehead EOG setup is one of the key problems in this study. We extracted VEO signals from electrodes numbered four and seven and extracted HEO signals from electrodes five and six using two separation strategies: the minus rule and independent component analysis (ICA). For the minus rule, the subtraction of channels four and seven is an approximation of VEO, named VEO$_f$, and the subtraction of channels five and six is an approximation of HEO, named HEO$_f$. Here, the subscript ‘$f$’ indicates ‘forehead’.
ICA is a blind source separation method proposed to decompose a multivariate signal into independent non-Gaussian signals (?). We extracted the VEO and HEO components using FASTICA (?) from channels four and seven and channels five and six, respectively. The comparison of the traditional EOG and forehead EOG using the minus operation and ICA separation strategies is depicted in Figure 5. As shown, the extracted VEO and HEO from the forehead electrodes have similar waves to the traditional ones, and the forehead VEO and HEO can capture critical eye movements, such as blinks and saccades.
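The minus rule can be sketched in a few lines. The sketch below assumes that the four forehead channels are available as NumPy arrays keyed by their electrode numbers, and that VEO is taken from channels four and seven and HEO from channels five and six, the pairs named for VEO and HEO extraction in the text. The sign convention (which channel is subtracted from which) is an assumption.

```python
import numpy as np


def minus_rule(forehead: dict) -> tuple:
    """Approximate forehead VEO and HEO by channel subtraction.

    `forehead` maps electrode numbers (4, 5, 6, 7) to 1-D sample arrays.
    """
    veo_f = np.asarray(forehead[4], dtype=float) - np.asarray(forehead[7], dtype=float)
    heo_f = np.asarray(forehead[5], dtype=float) - np.asarray(forehead[6], dtype=float)
    return veo_f, heo_f


# Toy example with three samples per channel (values are made up)
channels = {
    4: [1.0, 2.0, 3.0],
    5: [0.5, 0.5, 0.5],
    6: [0.1, 0.2, 0.3],
    7: [0.0, 1.0, 1.0],
}
veo_f, heo_f = minus_rule(channels)
```

The ICA strategy would replace the subtraction with a two-channel decomposition per electrode pair; it is not shown here because it depends on an external ICA implementation.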
2.3.2 Feature Extraction from Forehead EOG
After preprocessing forehead EOG signals and extracting VEO and HEO, we detected eye movements such as blinks and saccades using the wavelet transform method (?). We computed the continuous wavelet coefficients at a scale of 8 with a Mexican hat wavelet defined by
$$\psi(t) = \frac{2}{\sqrt{3\sigma}\,\pi^{1/4}}\left(1 - \frac{t^{2}}{\sigma^{2}}\right)e^{-\frac{t^{2}}{2\sigma^{2}}},$$
where $\sigma$ is the standard deviation. Because the wavelet transform is sensitive to singularities, we used the peak detection algorithm on the wavelet coefficients to detect blinks and saccades from the forehead VEO and HEO, respectively. The detected blinks and saccades are shown in Figures 6 and 7, respectively.
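The wavelet step can be illustrated with a minimal sketch that evaluates the Mexican hat wavelet and convolves it with a signal at a single scale. Treating the scale of 8 as the wavelet width in samples, and truncating the wavelet at five widths on each side, are assumptions of this sketch; the function names are hypothetical.

```python
import numpy as np


def mexican_hat(t, sigma):
    """Mexican hat wavelet with standard deviation sigma."""
    norm = 2.0 / (np.sqrt(3.0 * sigma) * np.pi ** 0.25)
    return norm * (1.0 - (t / sigma) ** 2) * np.exp(-(t ** 2) / (2.0 * sigma ** 2))


def cwt_at_scale(signal, scale=8.0, half_width=5):
    """Wavelet coefficients of `signal` at one scale (scale given in samples)."""
    n = int(half_width * scale)
    t = np.arange(-n, n + 1, dtype=float)
    kernel = mexican_hat(t, scale)
    # The wavelet is symmetric, so convolution and correlation coincide.
    return np.convolve(np.asarray(signal, dtype=float), kernel, mode="same")


# A smooth bump centred at sample 100 gives its largest coefficient there
t = np.arange(201, dtype=float)
bump = np.exp(-((t - 100.0) ** 2) / (2.0 * 8.0 ** 2))
coeffs = cwt_at_scale(bump)
```

Peaks in `coeffs` that exceed a threshold would then be passed to the blink and saccade detection stage.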
By applying thresholds on the continuous wavelet coefficients, we encoded the positive and negative peaks in forehead VEO and HEO into sequences, where the positive peak was encoded as ‘1’ and the negative one as ‘0’. A saccade is characterized by a sequence of two successive positive and negative peaks in the coefficients. A blink contains three successive large peaks, namely, negative, positive, and negative, and the time between two positive peaks should be smaller than the minimum time. Therefore, for the encoding, segments with ‘01’ or ‘10’ are recognized as saccade candidates, and segments with ‘010’ are recognized as blink candidates. Moreover, there are some other constraints, such as slope, correlation, and maximal segment length, for guaranteeing a precise detection of blinks and saccades. Following the detection of blinks and saccades, we extracted the statistical parameters, such as the mean, maximum, variance, and derivative, of different eye movements with an 8 s non-overlapping window as the EOG features. We extracted a total of 36 EOG features from the detected blinks, saccades, and fixations. Table 1 presents the details of the extracted 36 eye movement features.
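The peak encoding and candidate search described above can be sketched as follows. The sketch assumes that thresholding has already produced a time-ordered list of `(sample_index, coefficient)` pairs; that input format and the function names are hypothetical, and the additional slope, correlation, and segment-length constraints are not implemented.

```python
def encode_peaks(peaks):
    """Encode thresholded peaks as a string: '1' for positive, '0' for negative."""
    return "".join("1" if value > 0 else "0" for _, value in peaks)


def find_candidates(code):
    """Return start positions of blink ('010') and saccade ('01' or '10') candidates."""
    blinks = [i for i in range(len(code) - 2) if code[i:i + 3] == "010"]
    saccades = [i for i in range(len(code) - 1) if code[i:i + 2] in ("01", "10")]
    return blinks, saccades


# Made-up peak list: negative, positive, negative, positive, negative
peaks = [(10, -3.0), (14, 5.0), (18, -2.5), (60, 4.0), (70, -4.0)]
code = encode_peaks(peaks)
blinks, saccades = find_candidates(code)
```

Because the patterns overlap, a single stretch of peaks can yield both blink and saccade candidates; the extra constraints mentioned in the text are what resolve these into final detections.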
2.3.3 Forehead EEG Signal Extraction
For conventional EEG-based approaches, the EOG signals are always considered to be severe contamination, particularly for frontal sites. Many methods have been proposed for removing eye movement and blink artifacts from EEG recordings (?, ?, ?). However, in this study, we consider that both EEG and EOG contain discriminative information for vigilance estimation. Our intuitive concept is that it is possible to separate EEG and EOG signals from the shared forehead electrodes. The main advantage of this concept is that we can leverage the favourable properties of both EEG and EOG modalities while simultaneously not increasing the setup cost.
We utilize the FASTICA algorithm to extract EEG and EOG components from the four forehead channels (Nos. 4-7) shown in Figure 2 (b). The ICA algorithm decomposes the multi-channel data into a sum of independent components (?). Similar to artifact removal using blind signal separation in conventional approaches, the forehead EEG signals are reconstructed with a weight matrix by discarding the EOG components. The raw data recorded at the four forehead channels (Nos. 4-7) are concatenated as the input matrix for ICA as follows:
$$X = \begin{bmatrix} x_4 \\ x_5 \\ x_6 \\ x_7 \end{bmatrix},$$
where the rows of the input matrix $X$ are signals $x_4$, $x_5$, $x_6$, and $x_7$ from channels Nos. 4-7. After ICA decomposition, the un-mixing matrix $W$ can be obtained, which decomposes the multi-channel data into a sum of independent components as follows:
$$U = WX,$$
where the rows of $U$ are time courses of activations of the ICA components. The columns of the inverse matrix $W^{-1}$ indicate the projection strengths of the corresponding components. Therefore, the clean forehead EEG signals can be derived as
$$X' = W^{-1}U',$$
where $U'$ is the matrix of activation waveforms $U$ with rows representing EOG components set to zero.
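The reconstruction step can be sketched as below, assuming that an un-mixing matrix has already been estimated by some ICA implementation and that the indices of the EOG components are known. The function name and the `eog_rows` argument are hypothetical; how EOG components are identified is not covered by this sketch.

```python
import numpy as np


def reconstruct_forehead_eeg(X, W, eog_rows):
    """Project the data back to channel space with EOG components zeroed.

    X: (channels, samples) raw forehead data
    W: (components, channels) un-mixing matrix from ICA
    eog_rows: indices of the components treated as EOG
    """
    U = W @ X                       # component activations
    U_clean = U.copy()
    U_clean[list(eog_rows), :] = 0  # discard the EOG components
    return np.linalg.inv(W) @ U_clean


# Synthetic check with a known mixing matrix (stand-in for a fitted ICA)
rng = np.random.default_rng(0)
S = rng.standard_normal((4, 50))                   # made-up sources
A = rng.standard_normal((4, 4)) + 4.0 * np.eye(4)  # made-up mixing matrix
X = A @ S
W = np.linalg.inv(A)
clean = reconstruct_forehead_eeg(X, W, eog_rows=[0, 1])
```

In this synthetic case the result equals the contribution of the two remaining sources alone, which is what discarding the EOG components is meant to achieve.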
The decomposed independent components and reconstructed forehead EEG of one segment under eye closure conditions are shown in Figure 8. Under eye closure conditions, the alpha rhythm appears more dominant in EEG signals in previous studies (?). From Figure 8 (a), we can observe that the first two rows are the corresponding eye movement components, and the last two rows contain EEG components with high alpha power values. The reconstructed signals contain characteristics of EEG waves, which are accompanied by high alpha bursts. The results presented in Figure 8 demonstrate the efficiency of our approach in extracting EEG signals from forehead electrodes.
2.3.4 Feature Extraction from EEG
In addition to forehead EOG, we recorded EEG data from temporal and posterior sites, which showed high relevance along with vigilance in the literature and our previous work (?, ?). For preprocessing, the raw EEG data were processed with a band-pass filter between 1 and 75 Hz to reduce artifacts and noise and downsampled to 200 Hz to reduce the computational complexity. For feature extraction, an efficient EEG feature called differential entropy (DE) was proposed for vigilance estimation and emotion recognition (?, ?), which showed superior performance compared to the conventional power spectral density features.
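The original formula for calculating differential entropy is defined as
$$h(X) = -\int_{X} f(x)\log f(x)\,dx.$$
If a random variable $X$ obeys the Gaussian distribution $N(\mu, \sigma^{2})$, the differential entropy can simply be calculated by the following formulation,
$$h(X) = \frac{1}{2}\log\left(2\pi e\sigma^{2}\right).$$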
According to the DE definition mentioned above, for each EEG segment, we extracted the DE features from five frequency bands: delta (1-4 Hz), theta (4-8 Hz), alpha (8-14 Hz), beta (14-31 Hz), and gamma (31-50 Hz). We also extracted the DE features from the total frequency band (1-50 Hz) with a 2 Hz frequency resolution. All the DE features were calculated using short-time Fourier transforms with an 8 s non-overlapping window.
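A minimal sketch of band-wise DE extraction for one 8 s segment is given below. It assumes that the variance of the band-limited signal can be approximated by the mean spectral power within the band after a Hanning window; this scaling, the window choice, and the function names are assumptions of the sketch rather than the exact procedure used in the experiments.

```python
import numpy as np

BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 14),
         "beta": (14, 31), "gamma": (31, 50)}


def gaussian_de(variance):
    """Differential entropy of a Gaussian variable with the given variance."""
    return 0.5 * np.log(2.0 * np.pi * np.e * variance)


def band_de(segment, fs, bands=BANDS):
    """DE per frequency band, using mean band power as a variance estimate."""
    segment = np.asarray(segment, dtype=float)
    n = segment.size
    spectrum = np.fft.rfft(segment * np.hanning(n))
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    power = np.abs(spectrum) ** 2 / n
    return {name: float(gaussian_de(power[(freqs >= lo) & (freqs < hi)].mean()))
            for name, (lo, hi) in bands.items()}


# Synthetic 8 s segment at 200 Hz: a 10 Hz rhythm plus weak noise
fs = 200
t = np.arange(8 * fs) / fs
rng = np.random.default_rng(0)
segment = np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal(t.size)
features = band_de(segment, fs)
```

For the 2 Hz resolution variant, the same function could be called with a `bands` dictionary of consecutive 2 Hz intervals covering 1-50 Hz.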
2.4 Vigilance Estimation
After obtaining vigilance labels and EOG/EEG features, we used support vector regression (SVR) with radial basis function (RBF) kernels as a basic regression model. The optimal values of the SVR hyper-parameters were tuned with a grid search. As the modality fusion strategy, we used feature-level fusion, in which the feature vectors of EEG and EOG are directly concatenated into a larger feature vector as inputs. For evaluation, we separated the entire data from one experiment into five sessions and evaluated the performance with 5-fold cross validation. There are a total of 885 samples for each experiment.
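Feature-level fusion and the session-based split can be sketched as follows. The sketch assumes that the five sessions are contiguous blocks of equal length and uses a hypothetical EEG feature dimension of 100; the regression model itself is omitted.

```python
import numpy as np


def fuse_features(eeg, eog):
    """Concatenate EEG and EOG feature vectors sample by sample."""
    if eeg.shape[0] != eog.shape[0]:
        raise ValueError("EEG and EOG must have the same number of samples")
    return np.concatenate([eeg, eog], axis=1)


def session_folds(n_samples, n_sessions=5):
    """Yield (train_idx, test_idx), holding out one contiguous session per fold."""
    sessions = np.array_split(np.arange(n_samples), n_sessions)
    for k, test_idx in enumerate(sessions):
        train_idx = np.concatenate([s for j, s in enumerate(sessions) if j != k])
        yield train_idx, test_idx


# 885 samples per experiment; 36 EOG features; 100 EEG features is a placeholder
eeg = np.zeros((885, 100))
eog = np.ones((885, 36))
fused = fuse_features(eeg, eog)
folds = list(session_folds(fused.shape[0]))
```

Holding out whole sessions rather than random samples keeps temporally adjacent windows from appearing in both the training and the test set.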
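The root mean square error (RMSE) and correlation coefficient (COR) are the most commonly used evaluation metrics for continuous regression models (?). RMSE is the square root of the mean squared error between the prediction and the ground truth, and it is defined as follows:
$$\mathrm{RMSE}(Y, \hat{Y}) = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^{2}},$$
where $Y = (y_1, y_2, \ldots, y_N)^{T}$ is the ground truth and $\hat{Y} = (\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_N)^{T}$ is the prediction.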
Since RMSE-based evaluation cannot provide structural information, we used COR to overcome the shortcomings of RMSE. COR provides an evaluation of the linear relationship between the prediction and the ground truth, which reflects the consistency of their trends. Pearson’s correlation coefficient is defined as follows:
$$\mathrm{COR}(Y, \hat{Y}) = \frac{\sum_{i=1}^{N}\left(y_i - \bar{y}\right)\left(\hat{y}_i - \bar{\hat{y}}\right)}{\sqrt{\sum_{i=1}^{N}\left(y_i - \bar{y}\right)^{2}\sum_{i=1}^{N}\left(\hat{y}_i - \bar{\hat{y}}\right)^{2}}},$$
where $\bar{y}$ and $\bar{\hat{y}}$ are the means of the ground truth $Y$ and the prediction $\hat{Y}$, respectively. However, COR is sensitive to short segments and is more appropriate for long sequences. Therefore, we concatenated the predictions and ground truth of five sessions and calculated COR as the final evaluation. In general, the more accurate the model is, the higher the COR is and the lower the RMSE is.
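The two metrics can be computed directly from their definitions, as in the following sketch. The function names are chosen for illustration, and the example values are made up.

```python
import numpy as np


def rmse(y, y_hat):
    """Root mean square error between ground truth and prediction."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))


def cor(y, y_hat):
    """Pearson correlation coefficient between ground truth and prediction."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    yc, pc = y - y.mean(), y_hat - y_hat.mean()
    return float(np.sum(yc * pc) / np.sqrt(np.sum(yc ** 2) * np.sum(pc ** 2)))


# A prediction with the right trend but a constant offset of 0.1
truth = [0.1, 0.4, 0.7]
pred = [0.2, 0.5, 0.8]
error, trend = rmse(truth, pred), cor(truth, pred)
```

The example shows why both metrics are reported: a constant offset leaves COR at its maximum while RMSE still registers the error.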
2.5 Incorporating Temporal Dependency into Vigilance Estimation
Vigilance is a dynamic changing process because the intrinsic mental states of users involve temporal evolution. To incorporate the temporal dependency into vigilance estimation, we introduced continuous conditional neural field (CCNF) and continuous conditional random field (CCRF) when constructing vigilance estimation models. CCNF and CCRF are extensions of conditional random field (CRF) (?) for continuous variable modelling that incorporates temporal or spatial information and have shown promising performance in various applications (?, ?, ?). CCNF combines the nonlinearity of conditional neural fields (?) and the continuous output of CCRF.
The probability distribution of CCNF for a particular sequence is defined as follows:
$$P(\mathbf{y}\mid\mathbf{x}) = \frac{\exp(\Psi)}{\int_{-\infty}^{\infty}\exp(\Psi)\,d\mathbf{y}},$$
where $\int_{-\infty}^{\infty}\exp(\Psi)\,d\mathbf{y}$ is the normalization function, $\mathbf{x} = \{x_1, x_2, \ldots, x_n\}$ is a set of input observations, $\mathbf{y} = \{y_1, y_2, \ldots, y_n\}$ is a set of output variables, and $n$ is the length of the sequence.
There are two types of features defined in these models: vertex features $f_k$ and edge features $g_k$. The potential function is defined as follows:
$$\Psi = \sum_{i}\sum_{k=1}^{K_1}\alpha_k f_k(y_i, x_i, \theta_k) + \sum_{i,j}\sum_{k=1}^{K_2}\beta_k g_k(y_i, y_j),$$
where $\alpha_k > 0$, $\beta_k > 0$, the vertex features $f_k$ denote the mapping from $x_i$ to $y_i$ with a one-layer neural network, and $\theta_k$ is the weight vector for the neuron $k$.
The vertex features of CCNF are defined as
$$f_k(y_i, x_i, \theta_k) = -\left(y_i - h(\theta_k, x_i)\right)^{2}, \qquad h(\theta, x) = \frac{1}{1 + e^{-\theta^{T}x}},$$
where the optimal number of vertex features $K_1$ is tuned with cross-validation. Several values of $K_1$ were evaluated in our experiments.
The edge features $g_k$ denote the similarities between observations $y_i$ and $y_j$, which are defined as
$$g_k(y_i, y_j) = -\frac{1}{2}S^{(k)}_{i,j}\left(y_i - y_j\right)^{2},$$
where the similarity measure $S^{(k)}$ controls the existence of the connections between two vertices.
In the experiments, $K_2$ is set to 1 and $S^{(k)}_{i,j}$ is set to 1 when two nodes $i$ and $j$ are neighbours; otherwise, it is 0. The sequence length $n$ is set to seven. The formulas for CCRF are the same as those for CCNF, except for the definition of vertex features. The vertex features of CCRF are defined as
$$f_k(y_i, x_{i,k}) = -\left(y_i - x_{i,k}\right)^{2}.$$
The training of parameters in CCRF and CCNF is based on the conditional log-likelihood as a multivariate Gaussian. For more details regarding the learning and inference of CCRF and CCNF, please refer to (?, ?). The outputs of support vector regression are used to train CCRF, and the original multi-dimensional features are used to train CCNF. The CCRF and CCNF regularization hyper-parameters for $\alpha_k$ and $\beta_k$ are chosen based on a grid search using the training set.
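To make the role of the temporal dependency concrete, the following sketch performs CCRF inference for one sequence under simplifying assumptions: a single vertex feature whose input is the SVR output, one edge feature, and a neighbour matrix that connects consecutive time steps. With squared-difference vertex and edge features, the conditional distribution is Gaussian, and its mean $\mu$ solves $(\alpha I + \beta L)\mu = \alpha x$, where $L$ is the graph Laplacian of the neighbour matrix. The weights `alpha` and `beta` are placeholders here; in practice they would be learned from the training data.

```python
import numpy as np


def ccrf_infer(x, alpha=1.0, beta=1.0):
    """Mean of the CCRF conditional Gaussian for a chain of SVR outputs `x`."""
    x = np.asarray(x, dtype=float)
    n = x.size
    S = np.zeros((n, n))
    idx = np.arange(n - 1)
    S[idx, idx + 1] = 1.0  # consecutive time steps are neighbours
    S[idx + 1, idx] = 1.0
    laplacian = np.diag(S.sum(axis=1)) - S
    return np.linalg.solve(alpha * np.eye(n) + beta * laplacian, alpha * x)


# Sequence of length seven with one outlying SVR output (values are made up)
svr_out = np.array([0.2, 0.2, 0.2, 0.9, 0.2, 0.2, 0.2])
smoothed = ccrf_infer(svr_out, alpha=1.0, beta=1.0)
```

Larger `beta` relative to `alpha` pulls neighbouring predictions closer together, which is how the model suppresses isolated jumps in the estimated vigilance level while leaving the sequence mean unchanged.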
3 Experimental Results
3.1 Forehead EOG-Based Vigilance Estimation
First, we evaluated the similarity between forehead EOG and traditional EOG and the performance of forehead EOG-based vigilance estimation for different separation strategies. We extracted forehead VEO and HEO using the minus and ICA separation approaches and computed the correlation with traditional VEO and HEO. The mean correlation coefficients of VEO-MINUS, VEO-ICA, HEO-MINUS, and HEO-ICA are 0.63, 0.80, 0.81, and 0.75, respectively. These comparative results demonstrate that the extracted forehead VEO and HEO contain most of the principal information of traditional EOG.
The mean RMSE, the mean COR and their standard deviations for different separation methods are presented in Table 2. ‘ICA-MINUS’ denotes ICA-based VEO and minus-based HEO separations, and it has the highest correlation coefficient with traditional VEO and HEO. As shown in Table 2, ICA-MINUS achieves the best performance for vigilance estimation in terms of both COR and RMSE. This is consistent with the above results that VEO-ICA and HEO-MINUS are more similar to the original VEO and HEO. VEO contains many blink components, which resemble impulses and are more likely to be detected by ICA. In contrast, the minus method reduces the amplitude of VEO signals since the polarity of the pair electrodes is the same. For HEO, saccade components are more difficult to detect with ICA, and the polarity of the pair electrodes is different.
3.2 EEG-Based Vigilance Estimation
We reconstructed the frontal 4-channel EEG from the forehead signals based on the ICA algorithm. In the experiments, we also recorded 12-channel and 6-channel EEG signals from posterior and temporal sites. We extracted the DE features in two ways: one is from the five frequency bands, and the other is to use a 2 Hz frequency resolution in the entire frequency band. The mean COR, mean RMSE and their standard deviations of different EEG features from different brain areas are shown in Table 3. The ranking of the performance for EEG-based vigilance estimation from different brain areas is as follows: posterior, temporal, and forehead sites. For the single EEG modality, the posterior EEG contains the most critical information for vigilance estimation, which is consistent with previous findings (?, ?). The EEG features with a 2 Hz frequency resolution achieve better performance than those with five frequency bands. In the later experimental evaluation in this paper, we employ the EEG features with a 2 Hz frequency resolution of the entire frequency band.
In addition to the accuracy that we discussed above for decoding brain states, another important concern is to examine whether patterns of brain activity under different cognitive states exist and whether these patterns are to some extent common across individuals. Identifying the specific relationship between brain activities and cognitive states provides evidence and support for understanding the information processing mechanism of the brain and brain state decoding (?). Huang et al. demonstrated the specific links between changes in EEG spectral power and reaction time during sustained-attention tasks (?). They found that significant tonic power increases occurred in the alpha band in the occipital and parietal areas as reaction time increased. Ray and colleagues proposed that alpha activities of EEG reflect attentional demands and that beta activities reflect emotional and cognitive processes (?). They found increasing parietal alpha activities for tasks that do not require attention.
In this work, to investigate the changes in neural patterns associated with vigilance, we split the EEG data into three categories (awake, tired, and drowsy) with two thresholds (0.35 and 0.7) according to the PERCLOS index. We averaged the DE features over different experiments. Figure 9 presents the mean neural patterns of awake and drowsy states as well as the difference between them. As shown in Figure 9, increasing theta and alpha frequency activities exist in parietal areas and decreasing gamma frequency activities exist in temporal areas in drowsy states in contrast to awake states. These results are consistent with previous findings in the literature (?, ?, ?, ?, ?, ?, ?) and support the previous evidence that the increasing trend for the ratio of slow and fast waves of EEG activities reflects decreasing attentional demands (?).
3.3 Modality Fusion with Temporal Dependency
In this section, we introduced a multimodal vigilance estimation approach with the fusion of EEG and forehead EOG. We combined the EEG signals from different sites (forehead, temporal, and posterior) and forehead EOG signals to utilize their complementary characteristics for vigilance estimation. The performance of each single modality and different modality fusion strategies are shown in Figure 10. For a single modality, forehead EOG achieves better performance than posterior EEG. The reason for this result is that forehead EOG has more information in common with the annotations of eye tracking data. Modality fusion can significantly enhance the regression performance in comparison with a single modality with a higher COR and lower RMSE. We evaluated the statistical significance using one-way analysis of variance (ANOVA), and the $p$ values of COR for forehead EOG and posterior EEG are 0.2978 and 0.0264, respectively. The $p$ values of RMSE for forehead EOG and posterior EEG are 0.0654 and 0.0002, respectively.
For different brain areas, an interesting observation is that the fusion of forehead EOG and forehead EEG achieves better performance than that of forehead EOG and posterior EEG, whereas for single EEG, the posterior site achieves the best performance. These results indicate that forehead EEG and forehead EOG have more coherent information. The temporal EEG performs slightly better than the forehead EEG. However, the former requires six extra electrodes for the setup. The forehead setup only uses four shared electrodes and both EOG and EEG features can be extracted. Therefore, the information flow can be increased without any additional setup cost. From the above discussion, we see that the forehead approach is preferred for real-world applications.
To incorporate temporal dependency information into vigilance estimation, we adopted CCRF and CCNF in this study. As shown in Figures 10 (a) and (b), the temporal dependency models can enhance the performance. For the forehead setup, the mean COR/RMSE of SVR, CCRF, and CCNF are 0.83/0.10, 0.84/0.10, and 0.85/0.09, respectively. The CCNF achieves the best performance with higher accuracies and lower standard deviations.
To verify whether the predictions from our proposed approaches are consistent with the subjects’ true behaviours and cognitive states, the continuous vigilance estimation of one experiment is shown in Figure 11. The snapshots in Figure 11 show the frames corresponding to different vigilance levels. We can observe that our proposed multimodal approach with temporal dependency can moderately predict the continuous vigilance levels and their trends.
To further investigate the complementary characteristics of EEG and EOG, we analysed the confusion matrices of each modality, which reveal the strengths and weaknesses of each modality. We split the EEG data into three categories, namely, awake, tired and drowsy states, with thresholds according to the corresponding PERCLOS index as described above. Figure 12 presents the mean confusion graph of forehead EOG and posterior EEG of all experiments. These results demonstrate that posterior EEG and forehead EOG have important complementary characteristics. Forehead EOG has the advantage of classifying awake and drowsy states (77%/76%) compared to the posterior EEG (65%/72%), whereas posterior EEG outperforms forehead EOG in recognizing tired states (88% vs. 84%). The forehead EOG modality achieves better performance than the posterior EEG overall. This result may be because our ground truth labels are obtained with eye movement parameters from eye tracking glasses, so the forehead EOG contains information more similar to the experimental observations. Moreover, awake states and tired states are often misclassified with each other, and similar results are observed for drowsy and tired states. In contrast, awake states are seldom misclassified as drowsy states and vice versa for both modalities. These observations are consistent with our intuitive knowledge that EEG and EOG features of awake and drowsy states should have larger differences. These results indicate that EEG and EOG have different discriminative powers for vigilance estimation. By combining the complementary information of these two modalities, modality fusion can improve the prediction performance.
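The confusion analysis can be sketched as follows, assuming that both the PERCLOS labels and the continuous predictions are discretized with the same two thresholds (0.35 and 0.7) and that each row of the matrix is normalized by the number of samples in the true class. The boundary handling (values equal to a threshold fall into the higher class) and the function names are assumptions of this sketch.

```python
import numpy as np

STATES = ("awake", "tired", "drowsy")


def to_state_index(values, thresholds=(0.35, 0.7)):
    """Map continuous vigilance values to 0 (awake), 1 (tired), or 2 (drowsy)."""
    return np.digitize(np.asarray(values, dtype=float), thresholds)


def confusion_matrix(true_values, predicted_values):
    """Row-normalized 3x3 confusion matrix (rows: true state, columns: predicted)."""
    t = to_state_index(true_values)
    p = to_state_index(predicted_values)
    counts = np.zeros((3, 3))
    np.add.at(counts, (t, p), 1)
    row_sums = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)


# Made-up labels and predictions for four windows
labels = [0.1, 0.2, 0.5, 0.8]
preds = [0.1, 0.5, 0.5, 0.9]
cm = confusion_matrix(labels, preds)
```

Computing one such matrix per modality and comparing the diagonals is one way to see which states each modality separates best.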
In this study, we have developed a multimodal approach to vigilance estimation that accounts for temporal dependency and combines EEG and forehead EOG in a simulated driving environment. Several researchers have performed pilot studies with on-road driving tests. Papadelis et al. designed an on-board system to assess a driver’s alertness level in real driving conditions (?). They found that EEG and EOG are promising neurophysiological indicators for monitoring sleepiness. Haufe et al. performed a study to assess the real-world feasibility of EEG-based detection of emergency braking intention (?). Beyond driving applications, many other scenarios require vigilance estimation, such as monitoring students’ performance in classes. Sievertsen and colleagues examined how cognitive fatigue influences students’ performance on standardized tests (?). To evaluate the feasibility of our approach, we will apply it to real scenarios in future work.
Considering the wearability and feasibility of a vigilance estimation device for real-world applications, we have designed a four-electrode placement on the forehead, which is suitable for attachment in a wearable headset or headband. With the shared forehead electrodes, we can collect EEG and EOG simultaneously and combine their advantages. The experimental results demonstrate that our proposed approach achieves performance comparable to that of conventional methods based on critical brain areas, such as parietal and occipital sites. This approach increases the information flow with an easy setup while not considerably increasing the cost.
In recent years, substantial progress has been made in dry electrodes and high-performance amplifiers. Several commercial EEG systems have emerged that increase usability in real-world applications (?, ?, ?). It is feasible to integrate these techniques with our proposed approach to design a new wearable hybrid EEG and forehead EOG system for vigilance estimation in the future.
In this study, we focus only on vigilance estimation without considering any neurofeedback. For example, if the vigilance detection system indicates that a driver is in an extremely tired state, feedback could be provided in a timely manner to enhance driving safety. An adaptive closed-loop BCI system that consists of vigilance detection and feedback is very useful in changing environments (?, ?). How to efficiently provide and assess feedback in high vigilance tasks should be further investigated.
Due to individual differences in neurophysiological signals across subjects and sessions, the performance of vigilance estimation models may degrade dramatically on new subjects or sessions. The generalization performance of vigilance estimation models should therefore be considered with respect to individual differences and adaptability. However, training subject-specific models requires time-consuming calibration. To address these problems, one efficient approach is to train models on existing labelled data from a group of subjects and to generalize the models to new subjects with transfer learning techniques (?, ?, ?, ?, ?).
In this paper, we have proposed a multimodal vigilance estimation approach using EEG and forehead EOG. We have applied different separation strategies to extract VEO, HEO and EEG signals from four shared forehead electrodes. The COR and RMSE of the single-modality forehead EOG-based and EEG-based methods are 0.78/0.12 and 0.70/0.13, respectively, whereas modality fusion with temporal dependency significantly enhances the performance, with values of 0.85/0.09. The experimental results have demonstrated the feasibility and efficiency of our proposed approach based on the forehead setup. Our vigilance estimation method has three main advantages: it acquires EEG and EOG signals simultaneously with four shared electrodes on the forehead; it models both internal cognitive states and external subconscious behaviours by fusing forehead EEG and EOG; and it introduces temporal dependency to capture the dynamic patterns of the vigilance of users. From the experimental results, we have observed that theta and alpha frequency activities increase and gamma frequency activities decrease in drowsy states in contrast to awake states. We have also investigated the complementary characteristics of forehead EOG and EEG for vigilance estimation. Our experimental results indicate that the proposed approach can be used to implement a wearable passive brain-computer interface for tasks that require sustained attention.
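For readers who wish to reproduce the two evaluation metrics reported above, the following is a minimal sketch. It assumes that COR denotes the Pearson correlation coefficient and RMSE the root mean square error between the continuous PERCLOS-based labels and the model predictions; the function name `cor_rmse` and the toy values are illustrative only.

```python
import numpy as np

def cor_rmse(y_true, y_pred):
    """Return (Pearson correlation, root mean square error) of two sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Off-diagonal entry of the 2x2 correlation matrix is the Pearson coefficient
    cor = np.corrcoef(y_true, y_pred)[0, 1]
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return cor, rmse

if __name__ == "__main__":
    labels = [0.1, 0.2, 0.3, 0.4]       # toy continuous vigilance labels
    predictions = [0.2, 0.3, 0.4, 0.5]  # toy predictions with a constant offset
    cor, rmse = cor_rmse(labels, predictions)
    print(f"COR = {cor:.2f}, RMSE = {rmse:.2f}")
```

The toy example shows why the two metrics are reported together: a constant offset leaves the correlation at its maximum while still producing a non-zero RMSE, so COR reflects how well the trend is tracked and RMSE reflects the absolute error.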
This work was supported in part by grants from the National Natural Science Foundation of China (Grant No. 61272248), the National Basic Research Program of China (Grant No. 2013CB329401), and the Major Basic Research Program of Shanghai Science and Technology Committee (15JC1400103).
-   Baltrusaitis, T., Banda, N. & Robinson, P. (2013). Dimensional affect recognition using continuous conditional random fields, 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition, IEEE, pp. 1–8.
-   Baltrušaitis, T., Robinson, P. & Morency, L.-P. (2014). Continuous conditional neural fields for structured regression, Computer Vision–ECCV, Springer, pp. 593–608.
-   Bergasa, L. M., Nuevo, J., Sotelo, M. A., Barea, R. & Lopez, M. E. (2006). Real-time system for monitoring driver vigilance, IEEE Transactions on Intelligent Transportation Systems 7(1): 63–77.
-   Berka, C., Levendowski, D. J., Lumicao, M. N., Yau, A., Davis, G., Zivkovic, V. T., Olmstead, R. E., Tremoulet, P. D. & Craven, P. L. (2007). EEG correlates of task engagement and mental workload in vigilance, learning, and memory tasks, Aviation, Space, and Environmental Medicine 78(Supplement 1): B231–B244.
-   Bulling, A., Ward, J., Gellersen, H., Tröster, G. et al. (2011). Eye movement analysis for activity recognition using electrooculography, IEEE Transactions on Pattern Analysis and Machine Intelligence 33(4): 741–753.
-   Calvo, R. A. & D’Mello, S. (2010). Affect detection: An interdisciplinary review of models, methods, and their applications, IEEE Transactions on Affective Computing 1(1): 18–37.
-   Daly, I., Scherer, R., Billinger, M. & Muller-Putz, G. (2015). Force: Fully online and automated artifact removal for brain-computer interfacing, IEEE Transactions on Neural Systems and Rehabilitation Engineering 23(5): 725–736.
-   Damousis, I. G. & Tzovaras, D. (2008). Fuzzy fusion of eyelid activity indicators for hypovigilance-related accident prediction, IEEE Transactions on Intelligent Transportation Systems 9(3): 491–500.
-   Davidson, P. R., Jones, R. D. & Peiris, M. T. (2007). EEG-based lapse detection with high temporal resolution, IEEE Transactions on Biomedical Engineering 54(5): 832–839.
-   Delorme, A. & Makeig, S. (2004). EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis, Journal of Neuroscience Methods 134(1): 9–21.
-   Dinges, D. F. & Grace, R. (1998). PERCLOS: A valid psychophysiological measure of alertness as assessed by psychomotor vigilance, US Department of Transportation, Federal Highway Administration, Publication Number FHWA-MCRT-98-006 .
-   D’mello, S. K. & Kory, J. (2015). A review and meta-analysis of multimodal affect detection systems, ACM Computing Surveys 47(3): 43.
-   Dong, Y., Hu, Z., Uchimura, K. & Murayama, N. (2011). Driver inattention monitoring system for intelligent vehicles: A review, IEEE Transactions on Intelligent Transportation Systems 12(2): 596–614.
-   Duan, R.-N., Zhu, J.-Y. & Lu, B.-L. (2013). Differential entropy feature for EEG-based emotion classification, 6th International IEEE/EMBS Conference on Neural Engineering, IEEE, pp. 81–84.
-   Ferrara, M. & De Gennaro, L. (2001). How much sleep do we need?, Sleep Medicine Reviews 5(2): 155–179.
-   Gao, X.-Y., Zhang, Y.-F., Zheng, W.-L. & Lu, B.-L. (2015). Evaluating driving fatigue detection algorithms using eye tracking glasses, 7th International IEEE/EMBS Conference on Neural Engineering, IEEE, pp. 767–770.
-   Grier, R. A., Warm, J. S., Dember, W. N., Matthews, G., Galinsky, T. L., Szalma, J. L. & Parasuraman, R. (2003). The vigilance decrement reflects limitations in effortful attention, not mindlessness, Human Factors: The Journal of the Human Factors and Ergonomics Society 45(3): 349–359.
-   Grozea, C., Voinescu, C. D. & Fazli, S. (2011). Bristle-sensors – low-cost flexible passive dry EEG electrodes for neurofeedback and BCI applications, Journal of Neural Engineering 8(2): 025008.
-   Hairston, W. D., Whitaker, K. W., Ries, A. J., Vettel, J. M., Bradford, J. C., Kerick, S. E. & McDowell, K. (2014). Usability of four commercially-oriented EEG systems, Journal of Neural Engineering 11(4): 046018.
-   Haufe, S., Kim, J.-W., Kim, I.-H., Sonnleitner, A., Schrauf, M., Curio, G. & Blankertz, B. (2014). Electrophysiology-based detection of emergency braking intention in real-world driving, Journal of Neural Engineering 11(5): 056011.
-   Haynes, J.-D. & Rees, G. (2006). Decoding mental states from brain activity in humans, Nature Reviews Neuroscience 7(7): 523–534.
-   Huang, R.-S., Jung, T.-P. & Makeig, S. (2009). Tonic changes in EEG power spectra during simulated driving, Foundations of augmented cognition, neuroergonomics and operational neuroscience, Springer, pp. 394–403.
-   Huo, X.-Q., Zheng, W.-L. & Lu, B.-L. (2016). Driving fatigue detection with fusion of EEG and forehead EOG, International Joint Conference on Neural Networks.
-   Imbrasaite, V., Baltrusaitis, T. & Robinson, P. (2014). CCNF for continuous emotion tracking in music: Comparison with CCRF and relative feature representation, IEEE International Conference on Multimedia and Expo Workshops, IEEE, pp. 1–6.
-   Jap, B. T., Lal, S., Fischer, P. & Bekiaris, E. (2009). Using EEG spectral components to assess algorithms for detecting fatigue, Expert Systems with Applications 36(2): 2352–2359.
-   Ji, Q., Zhu, Z. & Lan, P. (2004). Real-time nonintrusive monitoring and prediction of driver fatigue, IEEE Transactions on Vehicular Technology 53(4): 1052–1068.
-   Jung, T.-P., Makeig, S., Humphries, C., Lee, T.-W., Mckeown, M. J., Iragui, V. & Sejnowski, T. J. (2000). Removing electroencephalographic artifacts by blind source separation, Psychophysiology 37(2): 163–178.
-   Kang, J.-S., Park, U., Gonuguntla, V., Veluvolu, K. & Lee, M. (2015). Human implicit intent recognition based on the phase synchrony of EEG signals, Pattern Recognition Letters 66: 144–152.
-   Khushaba, R. N., Kodagoda, S., Lal, S. & Dissanayake, G. (2011). Driver drowsiness classification using fuzzy wavelet-packet-based feature-extraction algorithm, IEEE Transactions on Biomedical Engineering 58(1): 121–131.
-   Kim, I.-H., Kim, J.-W., Haufe, S. & Lee, S.-W. (2014). Detection of braking intention in diverse situations during simulated driving based on EEG feature combination, Journal of Neural Engineering 12(1): 016001.
-   Lafferty, J., McCallum, A. & Pereira, F. C. (2001). Conditional random fields: Probabilistic models for segmenting and labeling sequence data, 18th International Conference on Machine Learning, Morgan Kaufmann.
-   Lee, E. C., Woo, J. C., Kim, J. H., Whang, M. & Park, K. R. (2010). A brain–computer interface method combined with eye tracking for 3D interaction, Journal of Neuroscience Methods 190(2): 289–298.
-   Lin, C.-T., Chuang, C.-H., Huang, C.-S., Tsai, S.-F., Lu, S.-W., Chen, Y.-H. & Ko, L.-W. (2014). Wireless and wearable EEG system for evaluating driver vigilance, IEEE Transactions on Biomedical Circuits and Systems 8(2): 165–176.
-   Lin, C.-T., Huang, K.-C., Chao, C.-F., Chen, J.-A., Chiu, T.-W., Ko, L.-W. & Jung, T.-P. (2010). Tonic and phasic EEG and behavioral changes induced by arousing feedback, NeuroImage 52(2): 633–642.
-   Lin, C.-T., Huang, K.-C., Chuang, C.-H., Ko, L.-W. & Jung, T.-P. (2013). Can arousing feedback rectify lapses in driving? Prediction from EEG power spectra, Journal of Neural Engineering 10(5): 056024.
-   Lu, Y., Zheng, W.-L., Li, B. & Lu, B.-L. (2015). Combining eye movements and EEG to enhance emotion recognition, International Joint Conference on Artificial Intelligence, pp. 1170–1176.
-   Ma, J.-X., Shi, L.-C. & Lu, B.-L. (2010). Vigilance estimation by using electrooculographic features, Annual International Conference of the IEEE Engineering in Medicine and Biology Society, IEEE, pp. 6591–6594.
-   Ma, J.-X., Shi, L.-C. & Lu, B.-L. (2014). An EOG-based vigilance estimation method applied for driver fatigue detection, Neuroscience and Biomedical Engineering 2(1): 41–51.
-   Makeig, S. & Inlow, M. (1993). Lapse in alertness: coherence of fluctuations in performance and EEG spectrum, Electroencephalography and Clinical Neurophysiology 86(1): 23–35.
-   Martel, A., Dähne, S. & Blankertz, B. (2014). EEG predictors of covert vigilant attention, Journal of Neural Engineering 11(3): 035009.
-   McMullen, D. P., Hotson, G., Katyal, K. D., Wester, B. A., Fifer, M. S., McGee, T. G., Harris, A., Johannes, M. S., Vogelstein, R. J., Ravitz, A. D. et al. (2014). Demonstration of a semi-autonomous hybrid brain–machine interface using human intracranial EEG, eye tracking, and computer vision to control a robotic upper limb prosthetic, IEEE Transactions on Neural Systems and Rehabilitation Engineering 22(4): 784–796.
-   Morioka, H., Kanemura, A., Hirayama, J.-i., Shikauchi, M., Ogawa, T., Ikeda, S., Kawanabe, M. & Ishii, S. (2015). Learning a common dictionary for subject-transfer decoding with resting calibration, NeuroImage 111: 167–178.
-   Mühl, C., Jeunet, C. & Lotte, F. (2014). EEG-based workload estimation across affective contexts, Frontiers in Neuroscience 8.
-   Mullen, T. R., Kothe, C. A., Chi, Y. M., Ojeda, A., Kerth, T., Makeig, S., Jung, T.-P. & Cauwenberghs, G. (2015). Real-time neuroimaging and cognitive monitoring using wearable dry EEG, IEEE Transactions on Biomedical Engineering 62(11): 2553–2567.
-   Nicolaou, M., Gunes, H., Pantic, M. et al. (2011). Continuous prediction of spontaneous affect from multiple cues and modalities in valence-arousal space, IEEE Transactions on Affective Computing 2(2): 92–105.
-   O’Connell, R. G., Dockree, P. M., Robertson, I. H., Bellgrove, M. A., Foxe, J. J. & Kelly, S. P. (2009). Uncovering the neural signature of lapsing attention: electrophysiological signals predict errors up to 20 s before they occur, The Journal of Neuroscience 29(26): 8604–8611.
-   Oken, B. S. & Salinsky, M. C. (2007). Sleeping and driving: Not a safe dual-task, Clinical Neurophysiology 118(9): 1899.
-   Pan, S. J. & Yang, Q. (2010). A survey on transfer learning, IEEE Transactions on Knowledge and Data Engineering 22(10): 1345–1359.
-   Papadelis, C., Chen, Z., Kourtidou-Papadeli, C., Bamidis, P. D., Chouvarda, I., Bekiaris, E. & Maglaveras, N. (2007). Monitoring sleepiness with on-board electrophysiological recordings for preventing sleep-deprived traffic accidents, Clinical Neurophysiology 118(9): 1906–1922.
-   Peiris, M. T., Davidson, P. R., Bones, P. J. & Jones, R. D. (2011). Detection of lapses in responsiveness from the EEG, Journal of Neural Engineering 8(1): 016003.
-   Peng, J., Bo, L. & Xu, J. (2009). Conditional neural fields, Advances in Neural Information Processing Systems, pp. 1419–1427.
-   Pfurtscheller, G., Allison, B. Z., Bauernfeind, G., Brunner, C., Escalante, T. S., Scherer, R., Zander, T. O., Mueller-Putz, G., Neuper, C. & Birbaumer, N. (2010). The hybrid BCI, Frontiers in Neuroscience 4: 3.
-   Ranney, T. A. (2008). Driver distraction: A review of the current state-of-knowledge, Technical report, National Highway Traffic Safety Administration, Washington, DC, USA.
-   Ray, W. J. & Cole, H. W. (1985). EEG alpha activity reflects attentional demands, and beta activity reflects emotional and cognitive processes, Science 228(4700): 750–752.
-   Rosenberg, M. D., Finn, E. S., Scheinost, D., Papademetris, X., Shen, X., Constable, R. T. & Chun, M. M. (2016). A neuromarker of sustained attention from whole-brain functional connectivity, Nature Neuroscience 19: 165–171.
-   Sahayadhas, A., Sundaraj, K. & Murugappan, M. (2012). Detecting driver drowsiness based on sensors: a review, Sensors 12(12): 16937–16953.
-   Shi, L.-C., Jiao, Y.-Y. & Lu, B.-L. (2013). Differential entropy feature for EEG-based vigilance estimation, 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, IEEE, pp. 6627–6630.
-   Shi, L.-C. & Lu, B.-L. (2010). Off-line and on-line vigilance estimation based on linear dynamical system and manifold learning, Annual International Conference of the IEEE Engineering in Medicine and Biology Society, IEEE, pp. 6587–6590.
-   Shi, L.-C. & Lu, B.-L. (2013). EEG-based vigilance estimation using extreme learning machines, Neurocomputing 102: 135–143.
-   Sievertsen, H. H., Gino, F. & Piovesan, M. (2016). Cognitive fatigue influences students’ performance on standardized tests, Proceedings of the National Academy of Sciences 113(10): 2621–2624.
-   Simola, J., Le Fevre, K., Torniainen, J. & Baccino, T. (2015). Affective processing in natural scene viewing: Valence and arousal interactions in eye-fixation-related potentials, NeuroImage 106: 21–33.
-   Trutschel, U., Sirois, B., Sommer, D., Golz, M. & Edwards, D. (2011). PERCLOS: An alertness measure of the past, Driving Assessment 2011: 6th.
-   Urigüen, J. A. & Garcia-Zapirain, B. (2015). EEG artifact removal – state-of-the-art and guidelines, Journal of Neural Engineering 12(3): 031001.
-   Wang, Y.-K., Jung, T.-P. & Lin, C.-T. (2015). EEG-based attention tracking during distracted driving, IEEE Transactions on Neural Systems and Rehabilitation Engineering 23(6): 1085–1094.
-   Wronkiewicz, M., Larson, E. & Lee, A. K. (2015). Leveraging anatomical information to improve transfer learning in brain–computer interfaces, Journal of Neural Engineering 12(4): 046027.
-   Wu, D., Courtney, C. G., Lance, B. J., Narayanan, S. S., Dawson, M. E., Oie, K. S. & Parsons, T. D. (2010). Optimal arousal identification and classification for affective computing using physiological signals: virtual reality stroop task, IEEE Transactions on Affective Computing 1(2): 109–118.
-   Zander, T. O. & Jatzev, S. (2012). Context-aware brain–computer interfaces: exploring the information space of user, technical system and environment, Journal of Neural Engineering 9(1): 016003.
-   Zander, T. O. & Kothe, C. (2011). Towards passive brain–computer interfaces: applying brain–computer interface technology to human–machine systems in general, Journal of Neural Engineering 8(2): 025005.
-   Zhang, Y.-F., Gao, X.-Y., Zhu, J.-Y., Zheng, W.-L. & Lu, B.-L. (2015). A novel approach to driving fatigue detection using forehead EOG, 7th International IEEE/EMBS Conference on Neural Engineering, IEEE, pp. 707–710.
-   Zhang, Y.-Q., Zheng, W.-L. & Lu, B.-L. (2015). Transfer components between subjects for EEG-based driving fatigue detection, International Conference on Neural Information Processing, Springer, pp. 61–68.
-   Zheng, W.-L., Dong, B.-N. & Lu, B.-L. (2014). Multimodal emotion recognition using EEG and eye tracking data, 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, IEEE, pp. 5040–5043.
-   Zheng, W.-L. & Lu, B.-L. (2015). Investigating critical frequency bands and channels for EEG-based emotion recognition with deep neural networks, IEEE Transactions on Autonomous Mental Development 7(3): 162–175.
-   Zheng, W.-L. & Lu, B.-L. (2016). Personalizing EEG-based affective models with transfer learning, International Joint Conference on Artificial Intelligence, pp. 2732–2738.