Surrogate-based artifact removal from single-channel EEG

Surrogate-based artifact removal from single-channel EEG


Objective: The recent emergence and success of electroencephalography (EEG) in low-cost portable devices, has opened the door to a new generation of applications processing a small number of EEG channels for health monitoring and brain-computer interfacing. These recordings are, however, contaminated by many sources of noise degrading the signals of interest, thus compromising the interpretation of the underlying brain state. In this work, we propose a new data-driven algorithm to effectively remove ocular and muscular artifacts from single-channel EEG: the surrogate-based artifact removal (SuBAR). Methods: By means of the time-frequency analysis of surrogate data, our approach is able to identify and filter automatically ocular and muscular artifacts embedded in single-channel EEG. Results: In a comparative study using artificially contaminated EEG signals, the efficacy of the algorithm in terms of noise removal and signal distortion was superior to other traditionally-employed single-channel EEG denoising techniques: wavelet thresholding and the canonical correlation analysis combined with an advanced version of the empirical mode decomposition. Even in the presence of mild and severe artifacts, our artifact removal method provides a relative error 4 to 5 times lower than traditional techniques. Significance: In view of these results, the SuBAR method is a promising solution for mobile environments, such as ambulatory healthcare systems, sleep stage scoring or anesthesia monitoring, where very few EEG channels or even a single channel is available.

Keywords: Artifact removal, electroencephalography (EEG), single-channel EEG, surrogate data, wavelet decomposition

©2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. DOI: 10.1109/TNSRE.2018.2794184

1 Introduction

Electroencephalogram (EEG) s the standard recording of electrophysiological activity of the brain. Due to its temporal resolution (ms), technical simplicity (portable and non-invasive) and low cost, EEG is nowadays extensively used for studying different cognitive and pathological brain states. EEG recordings are, however, often contaminated by non-neural physiological activities, as well as other external or environmental noises, that seriously degrade the signals of interest.

Eye movement-related artifacts have a strong detrimental effect on the quality of scalp EEG. During eye movements, abrupt changes on the retina’s resting potential are primarily observed in the frontal EEG electrodes before their widespread propagation over the scalp. The strength and spatial distribution of the artifact strongly depends on the position of EEG electrodes and on the direction of the eye movement [2, 3, 1]. Eye blinks also contaminate EEG signals but with an artifact whose amplitude is generally larger than that produced by eye movements. Muscular artifacts originate from the electrical activity elicited by contracting muscles. Although electromyographic (EMG) activity can be observed over the entire scalp, the amplitude and distribution of EMG artifacts depend on the type of muscle contracted (e.g. jaw, neck or face) and on the degree of tension [2, 3, 1]. Other possible perturbations include breathing artifacts, electrodermal interferences produced by sweating, motion artifacts (e.g. head or chest movements), as well as shifts of the electrical properties of electrodes.

In clinical settings, visual inspection and manual removal of contaminated EEG segments is a common practice prior to any off-line signal analysis. Obviously, such manual methods are not suitable for on-line applications. There is a number of general techniques used for artifact removal from EEG recordings. When the frequency bands of the signal and interferences do not overlap, simple low pass, band pass or high pass filtering are effective techniques for removing artifacts. Nevertheless, some interferences (e.g. muscular activities) have a wide spectral distribution that overlaps with that of EEG, making difficult to remove them. In case of spectral overlap, more refined techniques such as adaptive filtering, Wiener filtering, as well as blind source separation (BSS) methods have been effectively used to cancel EEG interferences. Other methods like wavelet decompositions (WT) and empirical mode decomposition (EMD) have also been successfully applied to remove EEG artifacts [10, 12]. For a review and discussion of different approaches see [2, 3, 1] and references therein.

Linear regressions (in time or frequency domain) and adaptive filtering have been successfully applied in EOG and ECG correction procedures [1]. Their main disadvantage is, however, that they assume that one or more reference channels with the artifacts waveforms are available. Independent component analysis (ICA) is a blind signal separation (BSS) technique that has been largely used for EEG artifact removal [5, 4]. Briefly, ICA separates multi-channel EEG signals into statistically independent components which are assumed to represent the underlying sources of the observed EEG signals. Clean EEG signals can be reconstructed by removing artifact-related components from the original ICA decomposition. ICA-based methods can effectively remove interferences from a wide variety of artifactual sources in EEG recordings only when the number of channels and the amount of data are large enough [1]. Canonical correlation analysis (CCA) is a more efficient BSS method for muscular artifact removal that exploits the relative low autocorrelation of EMG artifacts in comparison with EEG activity [7, 11].

Artifact removal by wavelet-based methods and other data-driven decompositions (e.g. the EMD and its variants [31, 34, 1]) relies on the assumption that EEG artifacts can be represented by one or more levels or modes, which are thresholded before reconstructing the signal from the filtered representation. The success of these techniques depends on the threshold selecting criteria. In recent studies, different combinations of algorithms to remove artifacts (e.g. EMD-ICA, EMD-CCA) have provided significantly improved results [11, 13, 6].

Figure 1: Block diagram of the proposed artifact removal method.

In last years, simplified EEG systems with few channels have been developed with the aim to increase usability of ambulatory neuroimaging technologies in clinical environments (e.g. in epilepsy and sleep diagnosis) and in custom-designed settings for routine monitoring [14, 15]. For some applications like neurofeedback, mental state classification, emotion sensing, etc., artifact removal algorithms should perform reasonably well with short epochs of streaming EEG data. Further, conventional multichannel techniques can not be applied directly to isolate artifact sources in reduced channels configurations. Hence, there is a growing need to develop effective artifact removal techniques that can operate in short segments of single channel EEG, especially in single-channel settings [2, 13, 6, 12].

In this work, we propose a data-driven approach for artifact removal from single-channel EEG: the Surrogates-Based Artifact Removal (SuBAR) method. Our approach combines wavelet decomposition and a resampling method called surrogate data. Under the hypothesis of stationary EEG segments, muscular and ocular artifacts are considered as nonstationary events that can be identified across different scales. The obtained scales from the original recorded EEG are then compared with those obtained from resampled data, which are stationary by construction. Hence, instead of estimating the filtering threshold from uncontaminated segments or theoretical functions, we obtain it directly from the WT of surrogate data.

The proposed framework is validated on artifact-free EEG data contaminated with simulated artifacts of several types (EMG or EOG). The method is also illustrated on a real EEG contaminated data collected from a healthy subject. The reliability and performances of our method are also compared with those obtained by an unsupervised wavelet-based artifact removal [11, 6, 1, 12] and by those resulting from the CCA in combination with an advanced version of the EMD [11, 34, 6]. The remainder of the paper is organized as follows: Section 2 describes the proposed framework, as well as the WT and CCA-EMD approaches. Section 3 presents the database and Section 4 describes the procedure to simulate EMG and EOG artifacts [4]. Section 5 provides the experimental results and evaluation of the method. Finally, we conclude the paper with a discussion in Section 6.

2 Methods

In this section, we describe the various techniques employed in this work. Firstly, we describe our surrogates-based EEG technique, and then we outline two alternative methods for artifact removal from EEG single-channels (used here for comparison): a wavelet thresholding method [11, 6, 1, 12], and the artifact removal algorithm based on CCA combined with the EMD [11].

2.1 Resampling-based artifact removal

In this work, we assume that a recorded single-channel EEG signal1 is a linear combination of the original desired stationary EEG signal contaminated with an artifact plus instrumental Gaussian white noise . Although some nonlinear filters have been proposed to remove multiplicative EMG interferences [21], here we only consider additive artifacts of the form for the sake of simplicity.

The aim of our work is to filter from the vector of observations , with minimal a priori knowledge on the artifact signal, the EEG signal and the observational noise. To this end, we have used wavelet transform as a tool to detect and remove artifacts from single EEG channels. Wavelet-based artifact removal aims at separating the artifact components from the clean EEG components in the wavelet domain. Once the artifact components have been identified and the corresponding wavelet coefficients removed, the remaining components are kept to reconstruct the cleaned EEG signal. The thresholding criterion is traditionally based on statistical properties of the wavelet spectrum obtained from uncontaminated baselines or from theoretical thresholding functions. The originality of our approach is that we propose to estimate this threshold directly from the stationarized spectrum of the observed data.

Figure 1 shows the general pipeline of the artifact removal method proposed in this study. The key point of our method is that the hypothesis of EEG stationarity (which corresponds to time-invariance in the time-frequency spectrum) is statistically characterized on the basis of a set of surrogates which all share the same average stationary spectrum as the desired EEG signal. In our approach, the recorded EEG signal is of the form , where the artifact is assumed to be a non-stationary event restricted to a finite time interval shorter than the observed signal . Since surrogates can be viewed as distinct, independent stationary realizations of the observed signal, the wavelet spectra obtained from surrogates can define the learning set for stationarity. Wavelet thresholding is therefore based on the statistical distribution of the spectrum of surrogates (for each time-frequency bin). All the components of the original wavelet decomposition higher than a given threshold are set to zero, and the desired EEG signal is reconstructed utilizing the remaining components only.

The main blocks of the proposed method are the following:


In most EEG systems, some procedures are currently applied to prepare the EEG data for analysis. These procedures include re-referencing of recorded signals, down-sampling of original data for saving transmission power and computational cost, or filtering for baseline and power-line (50/60 Hz) interference removal. In our pre-processing step, the sampled raw EEG signal is divided into epochs of size . Very short segments ( s) may not represent some slow artifacts properly (e.g. ocular or movement artifacts) whereas in very large windows (e.g.  s), the stationary assumption of EEG may be no longer valid and there is a high chance that clean EEG segments will be distorted by the artifact removal procedure [30]. Here, we set  s.


Bootstrap and other resampling methods have been used extensively in the past to appropriately determine the properties of a time series before applying different analysis and modeling techniques [22]. The aim of these techniques is to capture a given structure of the original signal, and construct additional realizations that can then be used to test the original data for additional, unexplained structure. Surrogate data techniques were proposed as non-parametric methods for testing general hypotheses on data without making assumptions on the underlying generating process. Surrogates are time series created directly from the original dataset through replication of the linear autocorrelation (or equivalently, the power spectrum density) and amplitude distribution, with all other higher-order quantities randomized [24].

In this work, surrogate time series were obtained by destroying the organized phase structure that controls the nonstationarity of the original signal  [23]. In the classical Fourier-transform surrogate algorithm (FT), the signal is first Fourier transformed, and the magnitude of the spectrum is then kept unchanged while its phases are replaced by a random sequence, uniformly distributed over  [23, 24]. This modified spectrum is then inverse Fourier transformed, leading to a Gaussian stationary surrogate time series with the same spectrum as the original signal [23, 24]. To deal with non-Gaussian data, the algorithm of amplitude adjusted Fourier transform (AAFT) first orders a Gaussian white noise time series to match the rank order of the original data and derives the FT surrogate of this time series [23]. The final surrogate is scaled to the distribution of the original data by sorting the original data according to the ranking of the FT surrogate. [23]. The AAFT algorithm guarantees that surrogate data possesses the original distribution exactly and the original power spectrum approximately [24]. In this work, we have used the so-called Iterative Amplitude Adjusted Fourier Transform (IAAFT), which is an iterative version of AAFT [26, 24]. The steps are repeated until the autocorrelation function is sufficiently similar to the original, or until there is no change in the amplitudes [26, 24].

In the context of nonlinear time series analysis, surrogate data have been widely used for testing linearity and, more recently, for testing stationarity [24, 25]. To this end, time-frequency decompositions are used for determining whether the spectral characteristics of a signal change significantly over time, which is an indication of non-stationarity [25]. The time-frequency localization properties of the wavelet transform allow a comparison of the original signal with the stationarized surrogates in the wavelet domain. We can therefore detect the position, on the time-frequency plane, where the spectrum of the observed EEG differs from a stationary process. In this study, values derived from surrogate realizations were used to represent the spectrum of stationary signals.

Wavelet decomposition

Wavelet transform is a method commonly used to remove artifacts from EEG signals [11, 10, 6, 1, 12]. The wavelet transform is an useful analysis tool for time-frequency representation of non-stationary signals, obtained by convolving the signal with a scaled and translated wavelet function,


with and , positive real numbers and . The WT provides thus a decomposition of the signal in different scales, where the obtained wavelet coefficients represent a measure of similarity between the signal and the corresponding wavelet function .

In the discrete wavelet transform (DWT) both factors and are integers and are chosen in a dyadic grid ( and with integers and playing roles of the decomposition level and temporal localization at this level, respectively). At the th level of decomposition, a matrix of size can be appropriately built with the corresponding orthonormal wavelet basis. The wavelet coefficients can be thus obtained by . The DWT can be seen as a band-pass filter bank: the decomposition starts by passing the signal through a low-pass filter giving the approximation coefficients, and a high-pass filter producing the detail coefficients. The resulting two orthogonal sub-bands are afterwards down-sampled by two. Then, the low-pass result can be recursively filtered by the same pair of filters until the desired frequency range is obtained.

In this paper, we used the Maximal Overlap Wavelet Transform (MODWT) [28], instead of the orthogonal DWT, to estimate wavelet decomposition. As the DWT, the MODWT can also be utilized for multi-resolution analysis, with the advantage that it can be applied to signals of any size, while the DWT requires the sample size to be an integer power of two. In contrast with the usual DWT, the MODWT is translation invariant, i.e. shifting circularly the signal by any amount results into shifting the outputs of the low-pass and high-pass filters by the same amount. This property does not hold for the DWT because of the subsampling involved in the filtering process [28].

Let and a th level wavelet and scaling filter, respectively. The corresponding th level MODWT wavelet and scaling filters are defined, respectively, by and with the same common length  [28]. The th level MODWT wavelet and scaling coefficients are dimensional vectors defined by




for , respectively. The “mod” operator denotes the modular arithmetic between two integers.

Inverse transforming the MODWT coefficients creates the so-called details and smooths that form a multi-resolution analysis of signal  [28]:


with denoting the number of decomposition levels. An interesting property of MODWT is that and are associated with zero phase filters [28]. The frequency band of detail coefficients associated to coefficients is given by with a width  [28]. In contrast with DWT, there are always coefficients at each level. The MODWT is an energy preserving transform and the total energy of can be partitioned by the MODWT scaling and wavelet coefficients:  [28].

For all the wavelet-based methods studied in this paper, we used the symlets wavelet and 5 levels of decomposition. The symlets are orthogonal functions, nearly symmetrical wavelets with an oscillatory waveform and good time-frequency localization properties [16]. This makes it suitable wavelet choice for filtering and reconstructing EEG signals [18, 17].

Signal reconstruction

Comparison of the original signal with the stationary surrogates in the wavelet domain can identify non-stationary events on the time-frequency plane. The wavelet coefficients corresponding to artifacts are expected to be of high amplitude and well localized in time and scales, while the clean EEG coefficients are expected to be small and homogeneously spread over the whole segment.

To detect artifact components, the wavelet transform of surrogates are firstly averaged to produce a mean reference spectrum of stationarity:


where is the number of surrogates and denotes the wavelet coefficients for surrogate at the th level. At each time-point of the th level of decomposition, the standard deviation for the ensemble of surrogates can also be determined as


At each point of the wavelet domain, the significance of wavelet coefficients at a given level was assessed by quantifying its statistical deviation from values obtained for the ensemble of surrogates. Thus, non-stationary components in observed signals can be detected by comparing values to a given threshold.

Distribution of wavelet coefficients from surrogates can be fitted with an appropriate distribution, such as Gaussian or Gamma distribution, and then setting a one-sided confidence interval, for rejection of non-stationary components. Here, the significance was obtained by the ratio whose p-value is given by the Chebyshev’s inequality: for any statistical distribution of : where is the chosen statistical threshold [27].

The threshold values are compared with the wavelet coefficients of the original signal in the following manner:


If the original wavelet coefficients are greater than the threshold, they are set to the average value of the reference spectrum (obtained from the surrogates). In this manner, only the stationary components of the original spectrum are retained. Here, the threshold was set as the values greater than % of the values in the surrogate distribution. After non-stationary components have been filtered, the cleaned signal can be recomposed from all levels using the inverse MODWT [28] of the cleaned coefficients .

The main steps of proposed SuBAR method for automatic artifact removal are summarized in Algorithm 1.

Input: Signal , number of surrogates (e.g. ), threshold (e.g. of significance level obtained from the surrogate distribution), number of decomposition levels
Output: Filtered signal  

1:Estimate the wavelet coefficients (MODWT) from input signal
2:for each  do
3:      Create a surrogate time series from signal
4:      Estimate the wavelet coefficients (MOWDT) from surrogate
5:end for
6:Compare the wavelet coefficients of the original signal with those obtained from surrogates
7:for each level of decomposition  do
8:      Estimate the threshold from the surrogate distribution and the significance level
9:      if  then
11:      else
13:      end if
14:end for
15:Use the cleaned coefficients to reconstruct the filtered signal with the inverse MODWT
Algorithm 1 Surrogate-based artifact removal (SuBAR) algorithm for single-channel EEG

2.2 Methods for Comparison

The proposed technique is compared with two other commonly used methods for artifact removal from single-channel EEG signals: i) a wavelet-based artifact removal [11, 6, 1, 12] which is based on the classical wavelet-thresholding and ii) a single-channel method based on the combination of CCA with the EMD [11, 34, 6].

Wavelet thresholding

The key point of the wavelet thresholding is to separate, in the wavelet domain, the artifact components from the uncontaminated EEG components. For thresholding the wavelet coefficients we have used a level-dependent threshold [19, 29, 30]: , where is the length of signal and is the estimated noise variance for the wavelet coefficients, , at the th level of decomposition, which is usually calculated by [29]: . Such level-dependent thresholding is more appropriate than a single universal threshold in case of correlated and non-Gaussian data, as such that it characterizes the underlying EEG activities [29]. Artifact removal is finally obtained by removing wavelet coefficients whose absolute values exceed the threshold [20, 1, 12] as follows:


Canonical Correlation Analysis (CCA) combined with Empirical Mode Decomposition for single channels

CCA is a BSS technique currently used for separating a number of mixed or contaminated signals [7]. The recorded multichannel EEG signals are considered as a mixture given by ,where is the unknown mixing matrix and the components in are the statistically independent and unknown source signals, which include the artifacts. As other BSS techniques, CCA estimates the mixing matrix and recovers the original sources as , where is the unmixing matrix, i.e. the estimate of the inverse of . Artifacts are removed by computing , where is the mixing matrix with the columns corresponding to artifact related sources, set to zero.

Due to their relatively low autocorrelation, muscle artifacts are generally well identified in the last components obtained by the CCA algorithm [7]. Previous studies have shown that CCA outperforms different ICA algorithms for artifact removal from multichannel EEG and fMRI signals [9, 7, 8]. Other advantages of the CCA include that, i) as CCA uses second order statistics, it is a more computationally efficient algorithm than ICA and, ii) contrary to ICA algorithms, the CCA method always provides the same result for a given input.

As ICA, the original CCA is a multi-variate technique that requires multi-channel recordings to perform the decomposition. In some recent works, multi-dimensional time series have been generated from a single-channel recording, using popular data-driven methods such as the wavelet transform and the empirical mode decomposition [2, 3, 11, 1, 6]. Artifact removal is then performed by applying BSS techniques (e.g. ICA or CCA) to the generated multi-channel signals. For our single-channel artifact removal technique we have here combined the CCA, as the best choice form multi-channel artifact removal, with an improved version of the EMD as the best choice for time series decomposition [11]. The combination of these methods has been shown to be more reliable and computationally more efficient than the EMD combined with ICA [11].

Figure 2: Examples of original clean EEG signals (blue curves), with superimposed artifacts (red curves) and removal results after the proposed algorithm (black curves). The EEG signals are contaminated by: simulated large (A) and small (B) muscular activities, superimposed large (C) and small (D) ocular artifact. Gray boxes indicate the artifactual regions.

The orignal EMD is an adaptive data-driven method, proposed by [31], to decompose non-linear and non-stationary signals into a number of sub-components called intrinsic modal functions (IMFs), with well defined instantaneous frequencies. The original signal is thus decomposed as , where stands for a residual trend, and the intrinsic modes ’s are nearly orthogonal to each other [31]. By construction, the spectral supports are decreased when going from one residual to the next. Nevertheless their frequency discrimination applies only locally (in time) and they cannot correspond to a sub-band filtering [32]. Despite its several advantages to decompose mixed signals, the EMD of noisy data may result in a corruption of modes, i.e. very similar oscillations appear in different IMFs [31]. Recently, the so-called complete Ensemble EMD with adaptive noise (CEEMDAN) was proposed to ameliorate the spectral separation of modes and to reduce computational time [33, 34]. The key idea on this algorithm relies on averaging the modes obtained by EMD applied to several realizations of Gaussian white noise added to the original signal.

Here, we use this decomposition technique (CEEMDAN), to convert the single-channel signal into a multi-channel signal . By means of the CCA, the source signals associated to artifacts can be then removed as described before. The cleaned single-channel signal without the artifacts can be finally reconstructed by adding the new IMFs components in  [11]. Hereafter, for the sake of simplicity, we denote this technique CCA-EMD.

Criteria for artifact removal with the CCA-EMD method

Eye blinks artifacts display large slow waves and have large autocorrelation compared to EEG sources. Here, EOG artifacts were thus identified from the first canonical variates due to their large autocorrelations (larger than [11]. In contrast, due to the frequency spectrum of the EMG artifacts, they resemble high frequency activity. In this work, the CCA components with spectral bandwidth larger to  Hz were associated to muscle artifacts and removed from the reconstruction [7, 4, 36]. Other filtering criteria can also be applied, possibly providing better tuning of the algorithm to the particularities of other EEG artifacts.

3 Database

To assess the performance of the proposed artifact removal technique, we employed two datasets recorded via surface electrodes (Acticap, BrainProducts GmbH, Germany) using 64 scalp positions according to the standard 10-10 montage. The first dataset consisted on a collection of clean EEG signals from two subjects who were instructed to remain quietly, but alert, with their eyes closed during two minutes. The second dataset was composed by EEG signals (two minutes) from one subject instructed to deliberately produced artifacts by eye blinking and jaw clenching at short intervals. To verify the correct realization of artifacts and to detect their instances, we used four external EOG and EMG channels (right and left frontalis and anterior temporalis muscles). In all recordings, impedance between electrodes and skin were set below  k. According to the declaration of Helsinki, written informed consent was obtained from subjects after explanation of the study, which was approved by the ethical committee CPP-IDF-VI of Paris (no 2016-A00626-45).

The EEG signals were amplified, digitized at a sampling frequency of 1024 Hz, then down-sampled to 256 Hz and segmented in sec non-overlapped windows to reduce computational cost in successive blocks. Clean signals from the first dataset were artificially contaminated with muscle artifacts and eye movements as in [4] and were used to compare quantitatively the removal of artifacts by the different methods. Trained experts at the EEG platform of the ICM visually inspected all trials and selected different artifact-free EEG segments from the first database. Finally, contaminated signals (with vertical ocular blinks and other pronounced muscular artifacts) were selected from the second dataset to qualitatively illustrate the efficacy of the denoising process.

Figure 3: General comparison of artifact removal as a function of SNR. The bars represent average of all signals and all EEG channels. Error bars represent standard error. Good SNRs range from 0 to 5 dB, mild SNRs range from -10 to 0 dB, and poor SNRs are those values less than 0 dB. The algorithms are: Wavelet Thresholding(WT), the Canonical Correlation Analysis combined with Empirical Mode Decomposition (CCA-EMD), and the proposed SuBAR method.

4 Evaluation

Artificially contaminated EEG signals were simulated using clean segments from the first dataset (absence of ocular and muscular activity and other artifacts due to body movements or technical interferences). Artifacts, superimposed to clean data, were generated in three steps:

Figure 4: Examples of RRMSE as a function of SNR on the EEG epochs containing simulated muscular artifacts. Red asterisks indicate the scalp position of electrodes (from left to right: FC1, Cz and CP2). Solid curves indicate mean values and shadowed areas display the th and th percentiles.

4.1 Characterization of spatial distribution

Since artifacts of different origins have a specific distribution in the scalp, we first computed a weight vector of dimensions to scale the artifact patterns according to the topographical information from the scalp electrodes. To obtain the topographical information for each type of artifact, we applied ICA decomposition to a selection of real EEG segments containing the artifacts. Vector was the rescaled column in the estimated mixing matrix associated to the artifactual component, found by inspecting some features such as the autocorrelation and spatial position in the scalp [4]. The components associated to eye movements and blinks () were detected as those yielding high amplitudes on the most frontal electrodes, and muscular components () as those having transient and fast activities localized most importantly on temporal electrodes.

4.2 Generation of artifact patterns

The two different artifact patterns consisted on vector signals of length obtained as follows:

  • Ocular artifacts, , were recorded from the original electrooculogram signal (EOG).

  • Muscular artifacts, , were generated using random noise band-pass filtered between and  Hz with a random length between - s (equivalent to those observed in real EEG data) [4].

From the above patterns, the matrix of simulated artifacts, of dimension , was computed by performing the product for each pattern. Clean EEG signals (selected by visual inspection) and simulated artifacts came from different subjects to ensure that all segments (trials) of simulated and real EEG data were independent of each other.

4.3 Setting of signal to noise ratios

Synthetic artifacts were superimposed on the clean EEG segments as follows: , where represents the contribution of the artifact. For each trial and artifact type, the signal to noise ratio (SNR) was adjusted by changing the parameter as follows:


where corresponds to the root mean squared value averaged over all channels, and denotes the root mean squared value of the artifact. Following [4], prior to computing SNR for the topographic artifacts, we scaled their amplitudes by the highest channel gain in the applied scalp map. Here, the artifact contribution was gradually decreased in a dB scale () from dB to dB.

The performances of the considered algorithms for artifact removal were evaluated both in terms of the amount of artifact reduction and the amount of distortion they bring into clean EEG signals. Performances were expressed in terms of the relative root mean square error (RRMSE) [37]:


where is the signal after artifact removal. To assess whether our technique preserves the frequency spectrum of clean EEG, or it introduces any phase shift in denoised signals, we have also measured the spectral coherence and phase delay between the corrected data and the clean EEG segments.

5 Results and Discussion

The proposed algorithm is applied to EEG data contaminated by simulated muscular activities and ocular artifacts. Figure 2 shows some examples of added artifacts and their removal by the SuBAR algorithm. Results suggest a good removal of both types of artifacts without distorting the background EEG signals outside the artifactual regions. Low- or band-pass filters were not capable of removing muscular artifacts without altering the underlying brain activity because the overlap of the frequency spectrum of artifacts and that of clean EEG signals.

We examined the reliability of the SuBAR method at different SNR values in terms of the RRMSE as mentioned in Section 4. We also applied two alternative methods – wavelet thresholding and CCA combined with the advanced version of the EMD – to the contaminated EEG channels, and performed a comparison through 20 independent simulations. In our simulations, the amplitude of artifacts were scaled by their spatial distribution and a prescribed SNR. Here, we present the performances of different methods for a reduced number of EEG channels which spatial positions are relevant for different practical applications of ambulatory clinical neuroimaging, in reduced settings for routine monitoring [14, 15] or for practical BCI systems based on motor imagery. EEG electrodes included are: FC1, FC2, FCz, CP1, CP2, CPz, C1,C2 and Cz.

Figure 5: The RRMSE as a function of SNR on the EEG epochs contaminated with ocular artifacts. Red asterisks indicate the scalp position of electrodes (from left to right: FC1, Cz and CP2). Same stipulations as in the caption of Fig. 4

Figure 3 shows a systematic comparison of different algorithms, expressed in terms of the RRMSE for different SNR and both muscular and ocular artifacts. Bar plots display averaged values estimated over all channels and segments. Results clearly indicate that, even with severe artifacts, our SuBAR method yields better performances than traditional artifact removal techniques.

Muscular artifacts

The performances of different methods to remove simulated muscular artifacts are shown in Figure 4. It is clearly seen that the SuBAR method outperformed its competitors for all SNRs. Although wavelet thresholding performances are stable for all SNRs, this technique is not able to recover the original EEG signals. On the other hand, as the contamination level increases in frontal and posterior regions, the performances of the data-driven CCA-EMD method are considerably degraded. Filtering modes from the CCA-EMD method is insufficient to remove large muscular artifacts without altering the underlying brain activity since the frequency spectrum of the muscle artifacts overlaps with that of the brain signals.

Results support the hypothesis that, thanks to the time-frequency localization properties of the wavelet transform, a comparison of the contaminated EEG signal with the surrogates in the wavelet domain can identify artifacts as non-stationary events embedded on a stationary signal. Long and persistant muscular artifacts could not be detected as the spectrum of the contaminated EEG will not differ from a stationary process.

Ocular artifacts

Figure 5 shows the RRMSE for the different ocular artifact removal algorithms as a function of SNR. It can be observed that both the surrogate-based algorithm and the CCA-EMD methods are not able to remove large artifacts without distorting the true signals, whereas the wavelet thresholding is an algorithm that provides better performances. This indicates that surrogates of EEG signals with large ocular artifacts cannot be distinguished in the wavelet domain from the decomposition of contaminated signals. Nevertheless, for mild and weak artifacts, our surrogate-based technique outperforms other methods and can better remove the artifacts and recover the underlying EEG signals.

Distortion of clean EEG segments

Let us now quantify the distortions –in terms of RRMSE– produced by each method when applied to artifact free EEG epochs. Figure 6 shows that, although wavelet thresholding is able to remove large ocular artifacts, it also produces an important amount of distortion on clean EEG segments. Similarly, the automatic correction of artifacts with CCA-EMD method altered clean EEG signals substantially, although the algorithm to detect ocular artifacts resulted in a larger distortion than the algorithm for muscular artifact removal. These results indicate that for single-EEG channels, the SuBAR method preserves better the EEG signals than the other considered artifact removal algorithms, which may remove true neural components from clean brain signals.

Spectral distortion produced by the SuBAR method

To identify if the proposed algorithm preserves the natural frequency spectrum in denoised signals, we computed the spectral coherence and phase delays between the original clean EEG signals and the corrected data. Figures 7-8 show that, even in presence of severe artifacts (SNR dB), the SuBAR method preserves the spectrum of EEG signals. A small amount of distortion on EEG spectrum is observed for frequencies larger than Hz. In this band of frequencies, the correction of large artifacts introduced small absolute values of phase delay of about radians. Low frequencies ( Hz) were practically undistorted by the filtering method. Results clearly indicate that, in general, the proposed method preserves well the spectral components from clean brain signals and no significant phase delays are introduced by the filtering.

Figure 6: The RRMSE of different methods when applied to artifact free EEG epochs (averaged over all segments for each channel). SuBAR is the proposed method, WT stands for wavelet thresholding, CCA-EMD muscular and ocular denote the criteria for the corresponding artifacts removal.

Correction of real contaminated data

For illustration purposes, the SuBAR algorithm was applied on the EEG epochs from the second dataset, i. e. contaminated with real artifacts. Figure 9-(A) shows the performance of the SuBAR method on an EEG epoch that contains both eye blinks and muscular artifacts. Notice that, although different artifacts were present in the same segment, they were relatively well removed. In Figures 9-(B-C), we observe that, although small amounts of muscular activities remain in the denoised signals, our non-parametric algorithm corrects well the eye blinks from the contaminated data. Error measures could not be used here as we did not have a-priori information available of the clean brain signals.

Figure 7: Distortion produced by the SuBAR method on EEG epochs contaminated with muscular artifacts: The coherence (top plots) and phase delay (bottom plots) between original clean and denoised data as a function of SNR. Red asterisks indicate the scalp position of electrodes.

Computational complexity

For a given level of decomposition, the Maximal Overlap Wavelet Transform is computationally equivalent to other shift-invariant discrete wavelet transforms, and may be calculated with computational complexity [38]. Although the wavelet thresholding and the surrogate-based algorithm have similar algorithmic complexity, the former remains highly advantageous from the computational point of view as surrogates are not generated. For the EEG segments analyzed here2, CPU times required by the WT method were, on average, hundred times faster than those required by the SuBAR filtering ( s vs  s, respectively). As expected, the CCA-EMD method requires larger CPU times to provide poorer performances ( s).

Figure 8: Distortion produced by the SuBAR method on EEG epochs contaminated with ocular artifacts: Same stipulations as in the caption of Figure 7.

6 Conclusion

In this work, a new data-driven method for automatic artifact removal in single-channel EEG was presented. The novelty of our proposal relies on the time-frequency analysis of surrogate data to identify and filter ocular and muscular artifacts embedded in single EEG channels. The efficacy of the algorithm was compared to wavelet thresholding, and the CCA combined with the EMD. Through artificially contaminated EEG signals, we demonstrate that the surrogate-based removal (SuBAR) algorithm outperforms the other techniques considered here for removing muscle and ocular artifacts from single EEG signals. Although large ocular artifacts are better removed by wavelet thresholding, the SuBAR method yields, in general, a relative error 4 to 5 times smaller than the other considered artifact removal methods. Results show that the proposed algorithm preserves well the frequency spectrum of EEG in denoised signal (without phase delay). Furthermore, our method yields the smallest distortion of signals when applied to artifact-free EEG segments. Though it is not the aim of this study, we envisage that possible further optimizations can be obtained with other families of wavelets.

Most artifact reduction techniques require multivariate EEG data, or auxiliary referential signals (e.g. EOG or EMG) [1]. Common artifact removal algorithms generally requires appropriate spectral and topographical parameters for the detection of EEG artifacts (eye blinks, saccades, muscle activity) [1, 6]. In contrast, results presented in this work suggest that our single-channel technique can be a good non-parametric filter for artifact removal in off-line environments with a reduced number of sensors. Computational profiling shows that the proposed method could be suitable for pseudo real-time environments, such as the mobile monitoring of cognitive or emotional states. The use of recent distributed signal processing algorithms might speed-up the algorithm for real-time implementations, or for the analysis of multi-channel EEG datasets [39]. The proposed data-driven SuBAR method could be used in combination with other temporal EEG features (as those used in artifact detection [40]) to facilitate further signal processing of single-channel EEGs.

Although the generation of surrogate time series and the corresponding wavelet decomposition of large EEG segments might be a bottleneck for the processing speed in on-line applications, its technical simplicity (it is fully data-driven) makes the SuBAR method highly operable in mobile environments, such as ambulatory healthcare systems [41], where there are only a few EEG channels, or even a single channel is available (e.g. sleep stage scoring or anesthesia monitoring). The practicality, accuracy and reliability of an adaptive filter based on ou method remain, however, to be explored.

Figure 9: Original EEG epochs contaminated with eye blinks and jaw clenching and the corresponding denoised signals: (A) Epoch with mixed artifacts, (B) EEG segment containing muscular artifacts and (C) EEG epoch contaminated with eye movements. For illustration purposes only channels FC1, Cz and CP2 are shown as in Figs 4-5. Boxes in the top of the plots indicate the instances of jaw clenching and eye blinks.


F. De Vico Fallani is supported by the French Agence Nationale de la Recherche (programme ANR-15-NEUC-0006-02), and X. Navarro-Sune is financially supported by Air Liquide Medical Systems S.A.


  1. In this paper, scalars are denoted by italic letters, vectors by lowercase bold-face letters and matrices by uppercase, bold-face leterrs. Superscript refers to the transpose operation.
  2. Using Matlab on a 2.8GHz dual-core Intel Core i7 processor, 16GB of memory


  1. J. A. Urigüen and B. Garcia-Zapirain, “EEG artifact removal–State-of-the-art and guidelines”, Journal of Neural Engineering, vol. 12, no. 3 p. 031001, 2015.
  2. K. T. Sweeney, T. E. Ward and S. F. McLoone, “Artifact removal in physiological signals–practices and possibilities”, IEEE Transactions on Information Technology in Biomedicine, vol. 16, no. (3), pp. 488–500, 2012.
  3. K. T. Sweeney, et al. “A methodology for validating artifact removal techniques for physiological signals”, IEEE Transactions on Information Technology in Biomedicine, vol. 16, no. 5, pp. 918–926, 2012.
  4. A. Delorme, T. Sejnowski and S. Makeig, “Enhanced detection of artifacts in EEG data using higher-order statistics and independent component analysis”, Neuroimage, vol. 34, no. 4, pp. 1443–1449, 2007.
  5. J. Iriarte, et al., “Independent component analysis as a tool to eliminate artifacts in EEG: a quantitative study”, Journal of Clinical Neurophysiology, vol. 20, no. 4, pp. 249–257, 2003.
  6. X. Chen, et al. (2016). “Removing Muscle Artifacts From EEG Data: Multichannel or Single-Channel Techniques?”, IEEE Sensors Journal, vol. 16, no. 7, pp. 1986–1997, 2016.
  7. W. De Clercq, et al., “Canonical correlation analysis applied to remove muscle artifacts from the electroencephalogram”, IEEE Transactions on Biomedical Engineering, vol. 53, no. 12, pp. 2583–2587, 2006.
  8. J. Gao, C. Zheng, and P. Wang, “Online removal of muscle artifact from electroencephalogram signals based on canonical correlation analysis”, Clinical EEG and Neuroscience, vol. 41, no. 1, pp. 53–59, 2010.
  9. M. Borga, et al., “A canonical correlation approach to exploratory data analysis in fMRI,” in Proceedings of the 10th Scientific Meeting of the International Society of Magnetic Resonance in Medicine (ISMRM’02), Honolulu, Hawaii, 2002.
  10. A. X. Patel, et al. “A wavelet method for modeling and despiking motion artifacts from resting-state fMRI time series”, Neuroimage, vol. 95, pp. 287–304, 2014.
  11. K. T. Sweeney, S. F. McLoone and T. E. Ward, “The use of ensemble empirical mode decomposition with canonical correlation analysis as a novel artifact removal technique”, IEEE Transactions on Biomedical Engineering, vol. 60, no. 1, pp. 97–105, 2013.
  12. S. Khatun, R. Mahajan and B. I. Morshed, “Comparative Study of Wavelet-Based Unsupervised Ocular Artifact Removal Techniques for Single-Channel EEG Data”, IEEE Journal of Translational Engineering in Health and Medicine, vol. 4, pp. 1–8, 2016.
  13. X. Chen, et al., “A preliminary study of muscular artifact cancellation in single-channel EEG”, Sensors, vol. 14, no. 10, pp. 18370–18389, 2014.
  14. A. Casson, et al., “Wearable electroencephalography”, IEEE Engineering in Medicine and Biology Magazine, vol. 29, no. 3, pp. 44–56, 2010.
  15. V. Mihajlović, et al., “Wearable, wireless EEG solutions in daily life applications: what are we missing?”, IEEE Journal of Biomedical and Health Informatics, vol. 19, no. 1, pp. 6–21, 2015.
  16. I. Daubechies, Ten lectures on wavelets of CBMS-NSF Regional Conference Series in Applied Mathematics. Society for Industrial and Applied Mathematics (SIAM), vol. 61, Philadelphia, PA, 1992.
  17. M. I. Al-Kadi, et al., “Compatibility of mother wavelet functions with the electroencephalographic signal,” in 2012 IEEE EMBS Conference on Biomedical Engineering and Sciences (IECBES), pp. 113–117, 2012.
  18. A. K. Aniyan, et al., “A wavelet based algorithm for the identification of oscillatory event-related potential components”, Journal of Neuroscience Methods, vol. 233, pp. 63–72, 2014.
  19. D. Donoho and I. Johnstone, “Adapting to unknown smoothness via wavelet shrinkage”, Journal of the American Statistical Association, vol. 90, no. 432, pp. 1200–1224, 1995.
  20. D. Safieddine, et al., “Removal of muscle artifact from EEG data: comparison between stochastic (ICA and CCA) and deterministic (EMD and wavelet-based) approaches”, EURASIP Journal on Advances in Signal Processing, vol. 1, pp. 1–15, 2012.
  21. Y. Chen, et al., “The removal of EMG in EEG by neural networks”, Physiological Measurement, vol. 31, no. 12, p. 1567, 2010.
  22. B. Efron, and R. J. Tibshirani, An Introduction to the Bootstrap. Chapman and Hall/CRC, New York, 1993.
  23. J. Theiler, and D. Prichard, “Constrained-realization Monte-Carlo method for hypothesis testing”, Physica D, vol. 94, no. 4, pp. 221–235, 1996.
  24. T. Schreiber, and A. Schmitz, “Surrogate time series”, Physica D, vol. 142, no. 3, pp. 346–382, 2000.
  25. P. Borgnat, et al., “Testing stationarity with surrogates: A time-frequency approach”, IEEE Transactions on Signal Processing, vol. 58, no. 7, pp. 3459–3470, 2010.
  26. T. Schreiber, and A. Schmitz, “Improved surrogate data for nonlinearity tests”, Physical Review Letters, vol. 77, no. 4, p. 635, 1996.
  27. A. Papoulis, and S. U. Pillai, Probability, random variables, and stochastic processes. Tata McGraw-Hill Education, 2002.
  28. D. B. Percival, and A. T. Walden, Wavelet methods for time series analysis. Cambridge University Press, 2006.
  29. I. M. Johnstone, and B. W. Silverman, “Wavelet threshold estimators for data with correlated noise”, Journal of the Royal Statistical Society, Series B, vol. 59, no. 2, pp. 319–351, 1997.
  30. M. K. Islam, A. Rastegarnia, and Z. Yang, “A Wavelet-Based Artifact Reduction from Scalp EEG for Epileptic Seizure Detection”, IEEE Journal of Biomedical and Health Informatics , vol. 20, no. 5, pp. 1321–1332, 2016.
  31. N. E. Huang, et al., “The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis”, Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering Sciences, vol. 454, no. 1971, pp. 903-995, 1998.
  32. P. Flandrin, G. Rilling and P. Goncalves, “Empirical mode decomposition as a filter bank”, IEEE Signal Processing Letters, vol. 11, no. 2, pp. 112-114, 2004.
  33. M. E. Torres et al., “A Complete Ensemble Empirical Mode Decomposition with adaptive noise,” in 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 4144–4147, 2011.
  34. M. A. Colominas, G. Schlotthauer and M. E. Torres, “Improved complete ensemble EMD: A suitable tool for biomedical signal processing”, Biomedical Signal Processing and Control, vol. 14, pp: 19–29, 2014.
  35. T.-W. Lee, et al., “Independent component analysis using an extended infomax algorithm for mixed subgaussian and supergaussian sources”, Neural computation, vol. 11, no. 2, pp: 417–441,1999.
  36. S. D. Muthukumaraswamy, “High-frequency brain activity and muscle artifacts in MEG/EEG: a review and recommendations”, Front. Hum. Neurosci., vol. 7, pp: 138, 2013.
  37. Z. Wang, and A. Bovik, “Mean squared error: love it or leave it”, IEEE Signal Processing Magazine, vol. 26, no. 1, pp: 98–117, 2009.
  38. J. Enders, et al., “The shift-invariant discrete wavelet transform and application to speech waveform analysis” The Journal of the Acoustical Society of America, vol. 117, no. 4, pp: 2122–2133, 2005.
  39. A. Bertrand, “Distributed signal processing for wireless EEG sensor networks” IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 23, no. 6, pp: 923–935, 2015.
  40. A. Mognon, et al., “ADJUST: An automatic EEG artifact detector based on the joint use of spatial and temporal features”, Psychophysiology, vol. 48, no. 2, pp: 229–240, 2011.
  41. J.M. Ehrenfeld, and M. Cannesson Eds, Monitoring technologies in acute care environments. New York, NY, USA: Springer, 2014.
This is a comment super asjknd jkasnjk adsnkj
The feedback must be of minumum 40 characters
The feedback must be of minumum 40 characters
Comments 0
Request comment
The feedback must be of minumum 40 characters
Add comment
Loading ...

You are asking your first question!
How to quickly get a good answer:
  • Keep your question short and to the point
  • Check for grammar or spelling errors.
  • Phrase it like a question
Test description