# Assess Sleep Stage by Modern Signal Processing Techniques

###### Abstract

In this paper, two modern adaptive signal processing techniques, Empirical Intrinsic Geometry and the Synchrosqueezing transform, are applied to quantify different dynamical features of the respiratory and electroencephalographic signals. We show that the proposed features are not only theoretically rigorously supported but also capture the sleep information hidden inside the signals. The features are used as input to multiclass support vector machines with the radial basis function kernel to automatically classify sleep stages. The effectiveness of the classification based on the proposed features is shown to be comparable to human expert classification – the proposed classification of the awake, REM, N1, N2 and N3 sleep stages based on the respiratory signal (resp. respiratory and EEG signals) has the overall accuracy (resp. ) in the relatively normal subject group. In addition, by examining the combination of the respiratory signal with the electroencephalographic signal, we conclude that the respiratory signal contains ample sleep information, which supplements the information stored in the electroencephalographic signal.

## I Introduction

In human beings, sleep is a universal, recurring dynamical and physiological activity, and the quality of sleep influences our daily lives in diverse ways. However, it was not until recently that sleep became a branch of medicine and found its role in several seemingly unrelated clinical problems. Physiologically, sleep is divided into two broad stages: rapid eye movement (REM) and non-rapid eye movement (NREM) [1]. Normally, sleep proceeds in cycles alternating between REM and NREM. The NREM stage is further divided into shallow sleep (stages N1 and N2) and deep sleep (stage N3). Identifying sleep stages requires a sleep scoring process based on polysomnography (PSG), which includes electroencephalography (EEG), electromyography (EMG), electrooculography (EOG), etc.

Among these physiological signals, the EEG signals receive the most attention, since the clinically accepted sleep stage is mainly determined by reading the recorded EEG based on the R&K criteria, which were standardized in 1968 by Allan Rechtschaffen and Anthony Kales [2] and further developed by the American Academy of Sleep Medicine in 2007 (AASM 2007) [3]. However, due to subjective judgement and differing training backgrounds, the agreement of manual sleep scoring among trained clinicians and professionals is known to be limited [4], thereby motivating the development of an objective and automatic scoring system.

Based on these clinical findings, various features of the EEG signals have been proposed to study the sleep dynamics, for example, time domain summary statistics, spectral analysis, coherence, time-frequency analysis, and entropy, to name but a few [5, 6, 7, 8]. Recently, a theoretically solid approach suitable for modeling the underlying dynamics of the brain activity and estimating the evolutionary dynamics from the recorded EEG signal was proposed in [9, 10], and it has been successfully applied to predict the pre-seizure state from intra-cranial EEG signals [11, 12].

However, it is well known that sleep is a global and systematic behavior not localized solely in the brain. For example, muscular atonia and low amplitude EMG are related to significant changes in the breathing pattern during normal sleep: during NREM sleep, especially stage N3, breathing is remarkably regular, while during REM sleep, breathing is irregular with sudden changes. These physiological facts hint that the respiratory pattern of the recorded breathing signal during sleep might well reflect the sleep stage. Several studies have estimated the sleep stage by analyzing the respiratory signal [13, 14, 15, 16, 17]. In [13] (resp. [14]), an averaged respiratory rate over a fixed window is used to estimate REM versus NREM (resp. awake versus sleep). In [15], a notch filter based instantaneous frequency estimator is applied to extract features that differentiate awake, REM and NREM. In [16, 17], the adaptive harmonic model and a modern time-frequency analysis technique have been applied to further quantify the respiratory dynamics. In particular, the instantaneous respiratory rate has been related to the awake, REM, shallow and deep sleep stages, with a rigorous mathematical foundation.

The above-mentioned physiological patterns inside the EEG and the respiratory signals are actually outcomes of the intricate deformation of the underlying sleep dynamics; we call them the intrinsic dynamical features of sleep, and they are not directly accessible to us. Although it is not an easy task to fully model or estimate the dynamical system underlying sleep, we might expect to benefit if we are able to quantify and integrate these hidden intrinsic dynamical features. In this paper, we propose to combine two modern adaptive signal processing techniques, Empirical Intrinsic Geometry (EIG) and the Synchrosqueezing transform (SST), to estimate the intrinsic dynamical features of sleep underlying the observed EEG and respiratory signals – we define an index, referred to as the Sleep Index, to quantify these features. Then, by applying a suitable classification algorithm, we show that the extracted features are highly correlated with the sleep stage determined by reading the EEG according to the AASM 2007 criteria. Indeed, the proposed classification based on the respiratory signal (resp. respiratory and EEG signals) has the overall accuracy (resp. ) in the relatively normal subject group, which is comparable to human expert classification.

The article is organized in the following way. In Section II, we summarize the theoretical background of EIG and SST and the associated models. Then the Sleep Index is introduced in Section III. In Section IV, the proposed Sleep Index is applied to study the whole night sleep signals. We conclude with discussion in Section V.

## II Two Algorithms – Synchrosqueezing Transform and Empirical Intrinsic Geometry

The work presented in this paper is an application of modern signal processing techniques to the study of sleep dynamics. In particular, we will extract different features from the respiratory and EEG signals by the well studied EIG [9, 18] and SST [19, 17]. As such, the theoretical material will be presented in a compact manner emphasizing the intuition. We provide a rigorous summary of the details without proofs; those interested in the proofs are encouraged to read the associated references.

### II-A Adaptive Harmonic Model and Synchrosqueezing Transform

The major characteristic pattern of the respiratory signal is that it is almost periodic. We call the movement of air from the environment into the lungs inspiration and the movement of air in the opposite direction expiration. An inspiration and an expiration constitute a respiratory cycle, and the breathing process is a physiological process consisting of sequential respiratory cycles. In this paper, we focus on the breathing process and call the time-varying volume occupying the lung space the physiological respiratory signal. This general observation leads us to the following phenomenological model for the respiratory signal (without noise):

$$R(t) = A(t)\, s(\phi(t)), \qquad (1)$$

where we shall call $s$ the wave shape function; it is a $1$-periodic real function that satisfies some mild technical conditions. See [16] for the details. The respiratory signals recorded from different devices, like the airflow measuring device or the chest wall movement sensor, shall be understood as observations of the respiratory system. Different observations lead to different wave shape functions. We call the derivative $\phi'$ of the phase function $\phi$ the instantaneous frequency (IF) of the respiratory signal $R$. We require the IF to be positive, but it is not required to be constant as long as its variation is slight from one period to the next, i.e. $|\phi''(t)| \le \epsilon\, \phi'(t)$ for all time $t$, where $\epsilon$ is some small, pre-assigned positive number. Likewise, we call $A$ the amplitude modulation (AM) of $R$, which should be positive but is allowed to vary slightly as well, i.e. $|A'(t)| \le \epsilon\, \phi'(t)$ for all time $t$. We refer the interested reader to [17] for the technical details and a further discussion of the well-definedness of the AM and the IF. Note that our treatment of the respiratory signal is purely phenomenological; that is, the parameters and indices we derive will be based solely on the signals themselves, and not on explicit, quantitative models of the underlying mechanisms.
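As a concrete illustration, the following sketch synthesizes a toy signal obeying the model (1): a non-sinusoidal $1$-periodic wave shape driven by slowly varying AM and IF. All numerical values (sampling rate, a breathing rate near 0.25 Hz, modulation depths) are illustrative choices, not parameters from this paper.

```python
import numpy as np

# Toy instance of the model (1): R(t) = A(t) * s(phi(t)), with a 1-periodic,
# non-sinusoidal wave shape s, a slowly varying AM A and a slowly varying IF phi'.
fs = 100.0                                  # sampling rate in Hz (illustrative)
t = np.arange(0.0, 60.0, 1.0 / fs)          # 60 s of signal

def shape(u):
    """A 1-periodic wave shape: a sine plus a weak second harmonic."""
    return np.sin(2 * np.pi * u) + 0.2 * np.sin(4 * np.pi * u)

# IF near 0.25 Hz (about 15 breaths per minute), varying slowly over time.
inst_freq = 0.25 + 0.02 * np.sin(2 * np.pi * t / 60.0)
phi = np.cumsum(inst_freq) / fs             # phase: numerical integral of the IF
amp = 1.0 + 0.1 * np.cos(2 * np.pi * t / 30.0)   # slowly varying, positive AM
resp = amp * shape(phi)                     # the noiseless respiratory signal
```

Both the IF and the AM stay positive and vary slowly relative to the oscillation itself, as the model conditions require.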

Physiologically, the quantities $\phi'$, $A$ and $s$ in the model (1) quantify the dynamics of the breathing process, which we refer to as phenomenological dynamical features. For example, one way to quantify the widely used notion of breathing rate variability (BRV) [20, 21, 22] is to consider the IF and the AM [23]. Indeed, if $\phi'(t_1) > \phi'(t_2)$, we know that the subject breathes faster at time $t_1$ than at time $t_2$. We mention that this “fast-slow” momentary behavior in the respiratory signal has been shown to be clinically informative and can be helpful in ventilator weaning prediction [22, 23] and sleep stage estimation [16, 17].

Due to the inevitable measurement error and other outliers appearing inside the system, we model the recorded respiratory signal as

$$Y(t) = R(t) + \sigma(t)\,\Phi(t), \qquad (2)$$

where $\Phi$ is a stationary generalized random process and $\sigma$ is a smooth function which varies slowly. Here $\Phi$ models the noise and other outliers, and $\sigma$ models the possibly non-stationary nature of the noise. A particular example of $\Phi$ frequently encountered in practice is Gaussian white noise. We refer the interested reader to [17] for further information about this noise model and its mathematical details.

To estimate the phenomenological dynamical features $\phi'$ and $A$ of $R$ from the recorded signal $Y$, we apply the Synchrosqueezing transform (SST), which is a special reassignment technique [19, 17]. In a nutshell, we evaluate any linear time-frequency analysis of the observation $Y$, for example the short time Fourier transform or the continuous wavelet transform, and we take the phase information hidden inside the chosen linear time-frequency analysis into account to obtain a sharpened time-frequency representation. In addition to capturing the phenomenological dynamical features of $Y$, the SST provides a sharper time-frequency representation compared with the traditional time-frequency analyses. See Figure 1 for an example of the respiratory signal and its instantaneous frequency. We refer the reader to [19, 17] for more details.
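To make the reassignment idea concrete, here is a minimal STFT-based synchrosqueezing sketch: the IF at each time-frequency bin is estimated from the phase of the STFT (via the window-derivative trick), and the energy of each bin is reassigned to the bin of the estimated IF. This is a simplified illustration and not the implementation used in the paper; the Gaussian window, its length and the thresholds are arbitrary choices.

```python
import numpy as np

def sst_stft(x, fs, nwin=256, hop=8):
    """Minimal STFT-based synchrosqueezing sketch (illustrative only)."""
    u = np.arange(nwin) - nwin // 2
    sigma = nwin / 8.0
    g = np.exp(-0.5 * (u / sigma) ** 2)        # Gaussian analysis window
    dg = (-u / sigma ** 2) * g                 # per-sample derivative of g
    freqs = np.fft.rfftfreq(nwin, d=1.0 / fs)
    centers = np.arange(nwin // 2, len(x) - nwin // 2, hop)
    S = np.zeros((len(centers), len(freqs)))
    for i, c in enumerate(centers):
        frame = x[c - nwin // 2: c + nwin // 2]
        Vg = np.fft.rfft(frame * g)            # STFT with window g
        Vdg = np.fft.rfft(frame * dg)          # STFT with window g'
        keep = np.abs(Vg) > 1e-8 * np.abs(Vg).max()
        # IF estimate per bin: omega = eta - Im(V^{g'} / V^{g}) / (2*pi), in Hz.
        omega = freqs[keep] - fs * np.imag(Vdg[keep] / Vg[keep]) / (2 * np.pi)
        bins = np.clip(np.round(omega / (fs / nwin)).astype(int), 0, len(freqs) - 1)
        np.add.at(S[i], bins, np.abs(Vg[keep]))  # squeeze energy onto the IF bins
    return centers / fs, freqs, S
```

On a pure tone the reassigned representation concentrates the energy around the true frequency, which is the sharpening effect described above.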

### II-B Empirical Intrinsic Geometry and its underlying Mathematical Model

In many real-world applications, the seemingly complicated time series collected from a system is controlled by a relatively simple underlying process. In some situations, when the underlying evolutionary process lies on a low-dimensional Riemannian manifold, it can be parameterized through a manifold learning framework, which was first introduced and studied in [24].
The main idea in [24] and its extensions to time series [9, 10] is to build a bridge between data mining, in particular manifold learning, and dynamical systems. (In the original paper [24], the method was referred to as Nonlinear Independent Component Analysis; to avoid possible confusion with Independent Component Analysis, the name Empirical Intrinsic Geometry (EIG) was adopted in [9, 10].) The authors’ observation that the accessible measurements at hand do not necessarily convey the true essence of the system led to the development of a more general model, which separates the measurements from the intrinsic hidden variables.

One particular example for such a dynamical system is the respiratory signal recorded during sleep – we consider the model that the evolutionary process governing the respiratory signals is restricted to a low-dimensional Riemannian manifold, which is fundamentally different from the phenomenological model (2). This dependency is encoded using the state-space formalism and the model of the recorded respiratory signal (2) is extended as follows:

$$\mathrm{d}\theta_j(t) = a_j(\theta(t))\,\mathrm{d}t + \mathrm{d}w_j(t), \quad j = 1, \ldots, d; \qquad Y(t) = R_{\theta(t)}(t) + \sigma(t)\,\Phi(t), \qquad (3)$$

where $\theta(t) = (\theta_1(t), \ldots, \theta_d(t))$ forms the inaccessible intrinsic state at time $t$ that governs the respiratory signal and evolves in time with unknown drifts $a_j$ and independent standard Brownian motions $w_j$, $j = 1, \ldots, d$.

The idea that lies behind the model (3) is twofold. First, the measured signal has typical (unknown) dynamics, modeled here by the state equation, which need to be taken into account and encoded in the desired features. Second, the accessible signal is viewed as a measurement of the neural system controlling the sleep cycle. While it can be affected by numerous factors relating to the measurement modality (e.g., measurements of airflow or chest movements), the equipment used (e.g., the type of sensors and their exact positions), and noise, the true quantity of interest is the intrinsic state controlling the respiratory signal. Indeed, the model implies that the respiratory signal depends upon the “real” physiological state in an unknown way, possibly through its amplitude, wave shape, or instantaneous frequency.

In the above model (3), however, due to noise and other nuisance factors, the measurement might be too redundant to faithfully describe the dependency of the respiratory signal on the underlying state and its temporal evolution. Thus, to improve the observability of the underlying state, we introduce a high dimensional (possibly nonlinear) observer of the measured signal [12], i.e.,

$$Z(t) = \mathcal{O}(Y)(t), \qquad (4)$$

where $\mathcal{O}$ is a map from a suitable space of scalar-valued functions to a space of $\mathbb{R}^m$-valued functions and $m$ is an integer specified by the observer.

With the sampled observations at hand, a natural question is how to estimate the intrinsic state and dynamics of interest. Such an analysis may complement the phenomenological dynamical features provided by the SST. While the SST mainly carries instantaneous information, recovering the intrinsic state of the dynamical system provides a characterization of coarser, slower dynamical changes of the shape and structure of the signal, especially when the observer is implemented as a transform that relies on short-time frame analysis.

It was shown in [24] that if the observations can be written as a regular deterministic function of the samples of the underlying state, i.e., $Z(t) = g(\theta(t))$ for some regular deterministic map $g$, then, by Itô’s formula, we have

$$\mathrm{d}Z(t) = \Big(J(t)\,a(\theta(t)) + \tfrac{1}{2}\,H(t)\Big)\,\mathrm{d}t + J(t)\,\mathrm{d}w(t), \qquad (5)$$

where $J(t)$ is the Jacobian of $g$ at $\theta(t)$, $a = (a_1, \ldots, a_d)$ collects the drift coefficients, and $H(t)$ collects the second-order Itô correction terms. By a direct calculation, the covariance matrix of the observation at time $t$, defined by

$$C(t) = \mathbb{E}\Big[\big(Z(t) - \mathbb{E}Z(t)\big)\big(Z(t) - \mathbb{E}Z(t)\big)^{\top}\Big], \qquad (6)$$

satisfies $C(t) \approx J(t)\,J(t)^{\top}$, where $J(t)$ is the Jacobian of the function mapping the underlying state to the observation. This key result, along with the assumption that the underlying process is locally stationary and evolves much more slowly than the observation scale, so that it stays close to a low-dimensional manifold, referred to as the intrinsic state manifold, as well as the assumption that the observation function is stably invertible on its range, allows the authors in [24] to estimate the inaccessible state through the solution of an eigenvector problem, which will be described later in this section. The main step leading to this solution is the following estimate. Suppose $Z(t_1)$ and $Z(t_2)$ are observations at two nearby time points $t_1$ and $t_2$. By a Taylor expansion of the inverse map, up to a higher order error term [24], we have:

$$\|\theta(t_1) - \theta(t_2)\|^2 = \tfrac{1}{2}\,\big(Z(t_1) - Z(t_2)\big)^{\top}\big(C(t_1)^{-1} + C(t_2)^{-1}\big)\big(Z(t_1) - Z(t_2)\big) + O\big(\|Z(t_1) - Z(t_2)\|^4\big). \qquad (7)$$

Note that in our example, the function leading to the observation depends on the observer. As a result, with the covariance matrices estimated from the accessible collected data, we can build a graph Laplacian associated with the intrinsic state manifold from the finite observations, using the estimated distances (7) between the corresponding underlying samples. This graph Laplacian gives rise to a re-parametrization of the intrinsic manifold through diffusion maps (DM) [25]. This re-parametrization procedure, which aims to extract the intrinsic dynamics of the observations, is referred to as Empirical Intrinsic Geometry (EIG).
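The distance (7) can be sketched numerically as follows: local covariances of the observations are estimated over short temporal neighborhoods, and the Mahalanobis-type distance between two samples is computed from the inverse covariances. The window length and the small regularization term are illustrative assumptions.

```python
import numpy as np

def local_covariances(Z, halfwin=10):
    """Estimate the time-varying covariance C(t) of the observations from a
    short temporal neighborhood around each sample (illustrative sketch)."""
    n, d = Z.shape
    C = np.empty((n, d, d))
    for i in range(n):
        lo, hi = max(0, i - halfwin), min(n, i + halfwin + 1)
        W = Z[lo:hi]
        C[i] = np.cov(W.T) + 1e-8 * np.eye(d)   # regularize for invertibility
    return C

def mahalanobis_sq(Z, C, i, j):
    """Approximate squared intrinsic distance between samples i and j:
    0.5 * (z_i - z_j)^T (C_i^{-1} + C_j^{-1}) (z_i - z_j), as in (7)."""
    dz = Z[i] - Z[j]
    return 0.5 * (dz @ np.linalg.solve(C[i], dz) + dz @ np.linalg.solve(C[j], dz))
```

The resulting distance is symmetric in its two arguments and nonnegative, as a metric approximation should be.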

The remaining key question is the choice or design of a “proper” observer in (4). In particular, in order to accommodate the inevitable noise in real-world signals, estimates of the conditional probability density (e.g., histograms) were proposed as observers in [9]. The analysis relies on the following facts: (a) any measurement noise is translated to a linear transformation in the domain of the conditional densities, and (b) the distance (7), which is a Mahalanobis distance, is invariant under linear transformations. Indeed, these two facts allow for the estimation of the distance between two nearby samples on the intrinsic manifold under adverse noisy conditions. However, estimating the conditional probability densities requires a large amount of data and often is not feasible. Unfortunately, standard representations based on the Fourier transform are also inadequate for respiratory signals. By linear approximation of the phase function around a sample at time $t_0$, the respiratory signal in (1) can be approximated by

$$R(t) \approx A(t_0)\, s\big(\phi(t_0) + \phi'(t_0)\,(t - t_0)\big). \qquad (8)$$

As a result, the modulus of the Fourier transform of $R$ around $t_0$ is approximated by

$$\big|\widehat{R}(\xi)\big| \approx \frac{|A(t_0)|}{\phi'(t_0)}\,\bigg|\widehat{s}\Big(\frac{\xi}{\phi'(t_0)}\Big)\bigg|, \qquad (9)$$

where $\xi$ is the frequency and $\widehat{R}$ and $\widehat{s}$ are the Fourier transforms of $R$ around $t_0$ and of $s$, respectively, assuming $A$ changes slowly with time. The approximation in (9) implies that even an almost linear phase function (i.e., a nearly constant IF) is translated into large deformations in the Fourier domain at high frequencies [26]. Consequently, if we take the short time Fourier transform as the observer, the observations exhibit instabilities, thereby leading to an irregular representation and a poor estimation of the distance (7) on the underlying intrinsic state manifold.

To overcome the instability of the Fourier representation, following [12], we use the scattering transform as an observer. The scattering transform has a low variance because it is based on first order moments of contractive operators, it linearizes deformations, and it can effectively represent intermittent behavior [26, 27]. The scattering transform is computed by a cascade of wavelet transforms and nonlinear modulus operators [26]. Here, we briefly review the construction of its first and second order levels, since they were empirically shown to provide a sufficient representation of the signals considered in this paper.

Let $\psi$ be a complex wavelet whose real and imaginary parts are orthogonal and have the same norm. Let $\psi_j$ denote the dilated wavelet, defined as $\psi_j(t) = 2^{-j}\,\psi(2^{-j} t)$. The observations are computed by applying the first and second level scattering transforms to the signal samples, which are given by

$$S_1 Y(t, j) = \big(|Y \ast \psi_j| \ast \varphi\big)(t), \qquad S_2 Y(t, j, k) = \Big(\big||Y \ast \psi_j| \ast \psi_k\big| \ast \varphi\Big)(t),$$

where $\varphi$ is a smoothing window, i.e., a scaling function associated with the mother wavelet. The scattering transform has been shown to be an observer that is especially suitable for deformations and intermittencies [12]. In particular, it was shown to be regular with respect to time deformations. Therefore, the application of the scattering transform to the respiratory signal is particularly suitable.
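A minimal numerical sketch of the first- and second-order scattering coefficients follows, with a Morlet-like complex wavelet and a global time average standing in for the scaling-function smoothing; the wavelet, the scales and the averaging are illustrative simplifications of the construction in [26].

```python
import numpy as np

def morlet(scale, nsupport=8):
    """A complex Morlet-like wavelet at a given dilation scale (illustrative)."""
    t = np.arange(-nsupport * scale, nsupport * scale + 1)
    return (np.exp(2j * np.pi * t / scale)
            * np.exp(-0.5 * (t / scale) ** 2) / scale)

def scattering(x, scales=(2, 4, 8, 16, 32)):
    """First- and second-order scattering coefficients: cascades of wavelet
    convolution and modulus, followed by time averaging (a sketch of the
    construction reviewed in the text)."""
    s1, s2 = [], []
    for j, sj in enumerate(scales):
        u1 = np.abs(np.convolve(x, morlet(sj), mode="same"))
        s1.append(u1.mean())                  # first-order coefficient
        for sk in scales[j + 1:]:             # second wavelet at coarser scales
            u2 = np.abs(np.convolve(u1, morlet(sk), mode="same"))
            s2.append(u2.mean())              # second-order coefficient
    return np.array(s1), np.array(s2)
```

The coefficients are nonnegative by construction, and the second-order layer captures amplitude modulations of the first-order envelopes.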

Building on the generality of the described analysis, in this study we use it to represent the EEG signals as well. Like the respiratory signal, the EEG signal measures a physiological phenomenon (“the brain activity”), but it is subject to noise, interferences, and nuisance factors. Likewise, it can be represented using a state-space model similar to (3), in which the clean EEG signal is corrupted by measurement noise to give the recorded EEG signal, and the inaccessible intrinsic state representing the brain activity governs the EEG signal and evolves in time with unknown drifts and independent standard Brownian motions. By applying EIG to the recorded EEG signals, we may reconstruct these intrinsic states. We remark that this approach was applied to identify the pre-seizure state from intracranial EEG data [11, 12]. We refer the interested reader to [9, 10, 12] for more technical details and references.

Before closing this section, we summarize the construction of the graph Laplacian parametrization. In a nutshell, the main ingredient is integrating local similarities at different scales, which leads to a global description of the data set. Unlike linear methods such as principal component analysis (PCA), a graph Laplacian parametrization embodies nonlinear relationships among the variables. In addition to the mathematical analysis results [25, 28, 29], it has been shown to be robust to noise perturbation [30, 31] and it is computationally efficient. We outline the algorithm here and refer the reader to the literature above for further theoretical details.

Take multivariate measurement samples $z_1, \ldots, z_n$ and build a complete graph with these samples as vertices. We first build an affinity matrix (or adjacency matrix) $W$ of size $n \times n$. The affinity between a pair of samples is defined through a metric $d$ and a kernel bandwidth $\epsilon > 0$ in the following way:

$$W_{ij} = \exp\!\Big(\!-\frac{d^2(z_i, z_j)}{\epsilon}\Big). \qquad (10)$$

Note that according to the noise analysis in [31], when the signal to noise ratio is small, it is beneficial to set the diagonal terms of the affinity matrix to zero. In the present work, following the analysis in [24], the metric we choose is the Mahalanobis distance (7). The matrix $W$ is clearly symmetric. Note that theoretically (and practically) we can choose a more general kernel function, but we focus on the Gaussian kernel to simplify the exposition. Then we define the diagonal degree (or density) matrix $D$ of size $n \times n$, whose diagonal entries are the row sums of $W$:

$$D_{ii} = \sum_{j=1}^{n} W_{ij}.$$

Based on $W$ and $D$, the graph Laplacian $L$ is defined by

$$L = D^{-1} W.$$

Note that under the manifold assumption, $D^{-1}$ exists. Also note that the graph Laplacian $L$ can be viewed as the transition matrix of a Markov chain on the samples. Since $L$ is similar to the symmetric matrix $D^{-1/2} W D^{-1/2}$, it has a complete set of right eigenvectors $v_1, v_2, \ldots, v_n$ with corresponding eigenvalues $1 = \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$ [25]. By the above construction, the eigenvectors are vectors in $\mathbb{R}^n$. Through the eigenvectors, the measurement samples are mapped into $\mathbb{R}^{\hat{\ell}}$ via

$$z_i \mapsto \big(v_2(i),\, v_3(i),\, \ldots,\, v_{\hat{\ell}+1}(i)\big), \qquad (11)$$

where $\hat{\ell}$ is an estimate of the dimension of the intrinsic state of the system and is usually much smaller than $n$. Estimating the intrinsic dimension of the system is beyond the scope of this paper; it is empirically set according to the spectral gap in the decay of the eigenvalues, as will be described in Section III. In (11), we obtain an $\hat{\ell}$-dimensional parameterization of the measurements. In particular, we view the coordinates of the parameterization of $z_i$ as the coordinates of the recovered hidden intrinsic state, which we take as the features associated with the sleep stage in this analysis. An illustration of the DM reparametrization process with the first non-trivial eigenvectors is shown in Figure 2.
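The whole construction, from the affinity matrix to the embedding, can be sketched as follows. To keep the example short, the plain Euclidean metric stands in for the Mahalanobis distance (7), and the median bandwidth is a common heuristic rather than the paper's choice.

```python
import numpy as np

def diffusion_maps(X, eps=None, dim=2):
    """Graph-Laplacian (diffusion maps) embedding with a Gaussian kernel,
    following the construction outlined above; Euclidean distances are used
    here for simplicity instead of the Mahalanobis distance."""
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise sq. distances
    if eps is None:
        eps = np.median(D2)                              # bandwidth heuristic
    W = np.exp(-D2 / eps)                                # affinity matrix (10)
    d = W.sum(axis=1)                                    # degrees
    # The transition matrix D^{-1} W is similar to the symmetric matrix
    # D^{-1/2} W D^{-1/2}; diagonalize the symmetric version for stability.
    Ms = W / np.sqrt(d[:, None] * d[None, :])
    lam, U = np.linalg.eigh(Ms)
    order = np.argsort(-lam)                             # descending eigenvalues
    lam, U = lam[order], U[:, order]
    V = U / np.sqrt(d)[:, None]                          # right eigenvectors of D^{-1} W
    return lam, V[:, 1:dim + 1]                          # drop the trivial eigenvector
```

The leading eigenvalue equals one (the trivial, constant direction), and the next few eigenvectors supply the nonlinear coordinates used in (11).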

## III Material and Method

### III-A Data Collection

Standard polysomnography was performed, with at least 6 hours of sleep, to confirm the presence or absence of OSA in clinical subjects suspected of sleep apnea at the sleep center of Chang Gung Memorial Hospital (CGMH), Linkou, Taoyuan, Taiwan. The institutional review board of the CGMH approved the study protocol (No. 101-4968A3) and the enrolled subjects provided written informed consent. Four-channel EEG signals (C3A2, C4A1, O1A2 and O2A1), two-channel EOG signals and chin EMG were recorded at the sampling rate Hz for sleep staging. Chest and abdominal motions were recorded by piezo-electric bands and airflow was measured using thermistors and nasal pressure, both at the sampling rate Hz. All signals were acquired on the Alice 5 data acquisition system (Philips Respironics, Murrysville, PA). Apneas and hypopneas were defined using the AASM 2007 guidelines [3], and the apnea-hypopnea index (AHI) provided is the value determined during sleep.

Take the recorded EEG signals and the recorded respiratory signal. We divide the recording time period into contiguous subintervals of equal length in seconds; we call the $k$-th subinterval the $k$-th epoch. We will extract features from the recorded respiratory and EEG signals for each epoch.
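The epoch segmentation can be sketched as below; the 30 s epoch length is the conventional sleep-scoring choice and is used here only as an illustrative default, since the exact epoch length is not specified in this excerpt.

```python
import numpy as np

def split_into_epochs(x, fs, epoch_sec=30):
    """Divide a recording into contiguous, non-overlapping epochs
    (30 s is the conventional scoring epoch; an illustrative default)."""
    n_per = int(epoch_sec * fs)            # samples per epoch
    n_epochs = len(x) // n_per             # drop an incomplete trailing epoch
    return x[:n_epochs * n_per].reshape(n_epochs, n_per)
```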

### III-B Features from the respiratory signal

Given a recorded respiratory signal, we extract its phenomenological dynamical features, namely the instantaneous frequency and the amplitude modulation, by applying the SST. The mean of the estimated IF restricted to the $k$-th epoch and the mean of the estimated AM restricted to the $k$-th epoch form the first two features of the respiratory signal for the $k$-th epoch. The third feature is obtained by evaluating the standard deviation of the estimated IF on an interval of fixed length centered at the middle of the $k$-th epoch.
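Given estimated IF and AM curves sampled on the same grid, the three per-epoch features described above can be computed as in this sketch; the epoch and window lengths are illustrative, since the exact values are not specified here.

```python
import numpy as np

def respiratory_features(inst_freq, amp, fs, epoch_sec=30, std_win_sec=60):
    """Per-epoch features from the estimated IF and AM curves: mean IF and
    mean AM over each epoch, plus the standard deviation of the IF over a
    longer window centered on the epoch (window lengths are illustrative)."""
    n_per = int(epoch_sec * fs)
    n_epochs = len(inst_freq) // n_per
    half = int(std_win_sec * fs) // 2
    feats = np.empty((n_epochs, 3))
    for k in range(n_epochs):
        lo, hi = k * n_per, (k + 1) * n_per
        mid = (lo + hi) // 2
        feats[k, 0] = inst_freq[lo:hi].mean()                       # mean IF
        feats[k, 1] = amp[lo:hi].mean()                             # mean AM
        feats[k, 2] = inst_freq[max(0, mid - half): mid + half].std()  # IF variability
    return feats
```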

We apply the analysis described in Section II-B to the respiratory signal in order to complement the phenomenological dynamical features and to obtain a characterization of the structural, slower underlying variables of the data. Here as well we obtain a graph Laplacian, together with its eigenvectors and eigenvalues. The first nontrivial eigenvectors are chosen based on the following “spectral gap” thresholding criterion

$$\hat{\ell} = \max\Big\{\ell \ge 1 : \frac{\lambda_{\ell+1}}{\lambda_2} > \delta\Big\}, \qquad (12)$$

where $\delta$ is the threshold chosen by the user. Thus, using (11), we obtain the intrinsic dynamical features of the respiratory system.
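One plausible implementation of the spectral-gap rule is sketched below: normalize the nontrivial eigenvalues by the first nontrivial one and count how many stay above the user threshold. The exact normalization is an assumption, since the criterion is only specified up to the user-chosen threshold.

```python
import numpy as np

def choose_dim_by_spectral_gap(lam, delta=0.1):
    """Pick the number of nontrivial eigenvectors by thresholding the decay
    of the eigenvalues (lam is assumed sorted in decreasing order, with the
    trivial eigenvalue 1 in the first position)."""
    lam = np.asarray(lam, dtype=float)
    ratios = lam[1:] / lam[1]        # normalize by the first nontrivial eigenvalue
    return int(np.sum(ratios > delta))
```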

### III-C Features from the EEG signal

Given the EEG signal recorded from each channel, we run the analysis described in Section II-B and obtain the corresponding graph Laplacian, together with its eigenvectors and eigenvalues. The first nontrivial eigenvectors are chosen based on the thresholding criterion (12) with the same threshold $\delta$. Using the eigenvectors, each subinterval of the EEG signal is mapped into a low-dimensional sub-vector according to (11). By collecting the low-dimensional vectors of all the channels, we obtain a vector consisting of the intrinsic dynamical features of the cortical activity for each subinterval.

### III-D Sleep Index

We consider the following two feature vectors. The first one is extracted only from the respiratory signal and is referred to as the Respiratory Index:

The second one is extracted only from the EEG signals and is referred to as the EEG Index:

An analysis result of the O1A2 EEG signal is shown in Figure 2. Clearly, different sleep stages, represented in different colors, have different features and are well clustered. In addition, these different sleep stages are organized in a continuous but nonlinear way – from the right-hand side of the figure to the left-hand side we have the awake, REM, N1, N2 and deep sleep stages.

Next, the phenomenological respiratory features, the intrinsic respiratory features and the intrinsic dynamical features of the cortical activity at the $k$-th epoch are combined to form the Sleep Index:

### III-E Sleep Stage Classifier

Support vector machine (SVM) is a commonly used technique for the purpose of classification in statistical learning theory [32]. In a nutshell, SVM determines a hyperplane separating the data set into two disjoint subsets, such that each subset lies on a different side of the hyperplane. With the help of the reproducing kernel Hilbert space theory, SVM is generalized to the kernel SVM, which allows for classification with nonlinear relationships; that is, a nonlinear surface separating the data set into two disjoint subsets may be used. We refer the interested reader to [32] for technical details. For the sake of identifying the (possibly) nonlinear relationship between different sleep stages, in this work we choose the radial basis function (RBF) kernel, $K(x, y) = \exp(-\gamma \|x - y\|^2)$, where $\gamma > 0$, as the kernel function. Note that our dataset is multi-class – the response has more than two categories – therefore, we need to further generalize the kernel SVM to the multi-class SVM. To this end, we apply the one versus all (OVA) classification scheme [33]. Despite its simplicity, the OVA classification scheme is highly effective and useful, as was extensively shown and discussed in [33]. Group data will be reported as mean ± standard deviation unless otherwise specified.
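The OVA scheme with an RBF kernel can be sketched as follows. To keep the example dependency-free, a regularized kernel least-squares machine stands in for the SVM solver; the paper itself uses kernel SVM, and the kernel parameter and regularization below are illustrative.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """K(x, y) = exp(-gamma * ||x - y||^2), the RBF kernel named in the text."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def ova_fit_predict(Xtr, ytr, Xte, gamma=1.0, reg=1e-3):
    """One-versus-all multiclass classification with an RBF kernel.
    A regularized kernel least-squares machine stands in for the SVM
    solver to keep the sketch self-contained."""
    classes = np.unique(ytr)
    Ktr = rbf_kernel(Xtr, Xtr, gamma) + reg * np.eye(len(Xtr))
    Kte = rbf_kernel(Xte, Xtr, gamma)
    scores = np.empty((len(Xte), len(classes)))
    for c, cls in enumerate(classes):
        t = np.where(ytr == cls, 1.0, -1.0)    # one class versus all the others
        alpha = np.linalg.solve(Ktr, t)        # fit one binary machine
        scores[:, c] = Kte @ alpha
    return classes[scores.argmax(axis=1)]      # most confident machine wins
```

Each binary machine scores "this stage versus the rest," and the stage with the largest score is predicted, which is exactly the OVA decision rule.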

## IV Result

Ten subjects without sleep apnea (AHI less than ) were chosen for this study. The demographic characteristics of the individuals whose data were used are as follows: males and females, age: years, range years; BMI: , range ; AHI: , range . The total recorded time is minutes with range minutes, and the sleep period time used for the sleep stage estimation is minutes with range minutes.

We divide the whole night sleep into contiguous epochs of length seconds. We take and the dimension of the Sleep Index is . We consider five sleep stages in this study: awake, REM, N1, N2 and N3.

Here, to simplify the notation, we reindex the set of sleep stages and use the teletype font to avoid confusion; that is, 1 is the awake stage, etc. Then we generate the different indices – the Respiratory Index, the EEG Index and the Sleep Index – from the recorded EEG and respiratory signals. The sleep stages determined by the sleep expert serve as the ground truth.

The OVA kernel SVM with the RBF kernel is applied to classify the different sleep stages. Denote by $M_{ij}$ the number of subintervals in the validation dataset with the sleep stage i as the gold standard but classified as the sleep stage j. We call the matrix $M$ the confusion matrix. We also define the confusion percentage matrix as the matrix whose $(i,j)$-th entry is $M_{ij}$ divided by the total number of subintervals with gold standard stage i. We call the $i$-th diagonal entry of the confusion percentage matrix the sensitivity (SE) of the sleep stage i prediction, denoted SE(i). We also report the overall accuracy (AC), i.e., the fraction of all subintervals that are correctly classified, and the specificity (SP) of the sleep stage i, denoted SP(i), i.e., the fraction of subintervals whose gold standard is not stage i that are not classified as stage i. Note that these definitions are direct generalizations of the AC, SE and SP for binary categorical response data.
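The confusion matrix and the generalized AC, SE and SP can be computed as in this sketch; the stage labels are assumed to be coded as integers 0, ..., n_classes-1 for convenience.

```python
import numpy as np

def confusion_and_rates(y_true, y_pred, n_classes):
    """Confusion matrix M (M[i, j] = epochs with true stage i classified as j),
    per-stage sensitivity and specificity, and overall accuracy, generalizing
    the binary definitions as described in the text."""
    M = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        M[t, p] += 1
    total = M.sum()
    ac = np.trace(M) / total                       # overall accuracy
    se = np.diag(M) / M.sum(axis=1)                # SE(i): recall of stage i
    tn = total - M.sum(axis=1) - M.sum(axis=0) + np.diag(M)
    sp = tn / (total - M.sum(axis=1))              # SP(i): true negative rate
    return M, ac, se, sp
```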

To prevent over-fitting and to confirm the classification result, we run the repeated random sub-sampling validation several times and evaluate the average. To be more precise, we randomly partition the data into a training dataset and a validation dataset – the training dataset comprises a randomly chosen portion of the features and the rest form the validation dataset. The classifier trained on the training dataset is applied to predict the sleep stages of the validation dataset.
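The validation loop can be sketched as follows; the split fraction, the number of repeats, and the toy classifier plugged in for testing are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def repeated_subsampling(X, y, fit_predict, n_repeats=25, train_frac=0.75, seed=0):
    """Repeated random sub-sampling validation: randomly split the data into
    training and validation sets, train, score, and average over repeats."""
    rng = np.random.default_rng(seed)
    n = len(y)
    accs = []
    for _ in range(n_repeats):
        perm = rng.permutation(n)
        ntr = int(train_frac * n)
        tr, te = perm[:ntr], perm[ntr:]
        yhat = fit_predict(X[tr], y[tr], X[te])    # any classifier with this signature
        accs.append(np.mean(yhat == y[te]))
    return float(np.mean(accs)), float(np.std(accs))
```

Any classifier exposing a `fit_predict(Xtr, ytr, Xte)` interface can be plugged in, so the same loop serves the Respiratory, EEG and Sleep Indices alike.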

With the above preparation, first, we show that the proposed features capturing the sleep information hidden inside the respiratory signal are not only theoretically rigorously supported, but also useful in practice. The overall AC is . The error bar of SE and SP of correlating the Respiratory Index and the sleep stages over repeated random sub-sampling validation for the subjects is shown in the light gray curve in Figure 3. The average SE’s (resp. SP’s) over subjects for the awake, REM, N1, N2 and N3 stages are , , , and (resp. , , , and ).

Second, we show that the EEG Index also correlates with the sleep stages. The overall AC is . The error bar of the SE and SP over repeated random sub-sampling validation for the subjects is shown in the dark gray curve in Figure 3. The average SE’s (resp. SP’s) over subjects for the awake, REM, N1, N2 and N3 stages are , , , and (resp. , , , and ).

Next, we combine all the features extracted from the respiratory signal and the EEG signals and show the result is better than simply using the Respiratory Indices or EEG Indices. The overall AC is . The error bar of the SE and SP over repeated random sub-sampling validation for the subjects is shown in the black curve in Figure 3. The average SE’s (resp. SP’s) over subjects for the awake, REM, N1, N2 and N3 stages are , , , and (resp. , , and ).

We then apply the Mann-Whitney U test to test whether the Sleep Index better predicts the sleep stage than the Respiratory Index under our setup. A p-value less than is considered significant. For the realizations of the sub-sampling validation from the subjects, we obtain SE’s and SP’s for the different indices. The Mann-Whitney U test is applied to see if the SE’s and SP’s are significantly different. The performance of the Sleep Index compared with the Respiratory Index on the awake, REM, N1, N2 and N3 stages in the sense of SE (resp. SP) is significant in all cases, with p-values ().

Lastly, to better present the classification result, the averaged confusion percentage matrices over all subjects and sub-sampling realizations based on the Respiratory Index, EEG Index and the Sleep Index are shown in Figure 4. Note that the diagonal entries are the SE’s of sleep stage prediction.

## V Discussion

The results in Section IV show that an accurate estimation of all sleep stages is possible by solely analyzing the respiratory signal with a combination of EIG and SST. Indeed, in addition to the overall AC of , the average SE is greater than except for N3, and the average SP is greater than . On the other hand, we note that while the features of the respiratory signal extracted by EIG and SST are complementary, only EIG can be applied to the EEG signal, since the EEG signal cannot be modeled by the adaptive harmonic model. The overall AC based on EIG applied to the EEG signals is , the SE is greater than except for N1 and N3, and the average SP is greater than . Namely, the performance of the EEG Index is not better than that of the Respiratory Index. Nevertheless, we see an improvement in the classification performance based on the Sleep Index, which contains information from both the respiratory and EEG signals. The overall AC increases to , the average SE is now greater than except for N3, and the average SP is greater than . In addition, it has been shown that the SE and SP of the Sleep Index are significantly better than those of the Respiratory Index. Moreover, the confusion percentage matrices indicate that, except for N3, the mis-classification does not land in any specific sleep stage. These findings lead to two tentative conclusions: 1. in addition to the EEG signals, the respiratory signal contains ample information about the sleep stage; 2. combining the relevant but different information hidden inside the respiratory and EEG signals leads to a better result.

The main innovation in our sleep depth analysis is the combination of clinical observation and modern adaptive signal processing techniques. From the clinical standpoint, we exploit the well-known physiological fact that, in addition to brain activity, sleep is a global dynamical process involving the dynamics of different sub-systems, in particular the significant changes in the respiratory pattern among different sleep stages. From the signal processing standpoint, we emphasize the importance of the nonlinearity controlling the sleep cycle, and focus on finding mathematical tools that are not only adaptive to the signal but also sufficiently rigorous to quantify the clinical observation. Indeed, since the inaccessible intrinsic sleep dynamics is reflected in the nonlinear behavior of respiration, and the two modern signal processing techniques, EIG and SST, have been theoretically shown to quantify this nonlinearity well, we obtain effective features by analyzing the recorded respiratory signal, which serve as surrogates for the intrinsic sleep dynamics.

The meaning of accuracy deserves some discussion. It is well known that the sleep stage determination agreement between different sleep experts is limited to even when the subjects under examination are normal, and it is even worse for abnormal subjects [4]; indeed, it is reported in [4] that the mean agreement in the normal subset is higher (mean 76%, range 65-85%) than in the subset with sleep-disordered breathing (mean 71%, range 65-78%). Although our cases are not diagnosed as sleep apnea, they cannot be considered part of the normal population either, so attaining accuracy rates higher than in our subjects may not be meaningful.
On the other hand, we found that the classification of the N3 stage is consistently worse and its mis-classification tends to land in N2, as shown in Figure 4. Notice that the subjects in our study are on average years old, and the proportion of N3 sleep in the normal population of this age is . However, the proportion of N3 sleep in our study cases is , with the and quantiles being and respectively, which is much lower than in the normal population. Since the number of N3 epochs in the training set is relatively small, even with the weighted SVM applied to handle the unbalanced data, we do not expect to attain a comparable classification rate for N3. This unbalanced training set issue, combined with the stable breathing pattern during N2 and N3, might explain the inclination to mis-classify N3 as N2.
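The weighted-SVM idea can be sketched as follows. We use scikit-learn's `SVC` as one possible implementation (an assumption; the paper does not name its software), where `class_weight='balanced'` scales each class's penalty inversely to its frequency, partially compensating for the scarcity of N3 epochs; the data are synthetic two-dimensional clouds standing in for the extracted features.

```python
# Weighted RBF-kernel SVM on unbalanced synthetic data: a majority class
# (200 points) and a minority class (20 points).  Placeholder features,
# not the study's EIG/SST features.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_major = rng.normal(loc=0.0, scale=1.0, size=(200, 2))   # e.g. "N2"
X_minor = rng.normal(loc=2.5, scale=1.0, size=(20, 2))    # e.g. "N3"
X = np.vstack([X_major, X_minor])
y = np.array([0] * 200 + [1] * 20)

# class_weight='balanced' reweights C by n_samples / (n_classes * n_c),
# so minority errors are penalized more heavily during training.
clf = SVC(kernel="rbf", gamma="scale", class_weight="balanced")
clf.fit(X, y)
```

Even with the reweighting, a class as scarce as N3 in our data remains harder to recover than the majority stages.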
Furthermore, while the accuracy of our classification is comparable with, or better than, the state-of-the-art reported results, we are able to better discriminate between different sleep stages. Indeed, in [14], the overall accuracy of classifying awake versus sleep based on the respiratory signal is ; in [13], an averaged respiratory rate classifies REM and NREM with accuracy over ; in [15], a notch-filter-based IF estimator is applied to extract respiratory features, which classifies awake, REM and NREM with mean accuracy of approximately ; in [17], the IF estimated by SST is shown to be able to distinguish awake, REM, shallow sleep and N3 with statistical significance.
Based on the above discussion, we conclude that the proposed features and the selected classifier lead to an accurate classification.

The sleep depth estimation by the EEG Index is inferior to the traditional EEG analysis. To understand this result, we briefly revisit how a sleep expert determines the sleep stage. Based on the protocol criteria, in addition to the 30-second epoch under scoring, the expert also takes past and future EEG signals into account, so the EEG duration effectively used exceeds 30 seconds. However, in our study, the EEG Index is based on the signal in epochs of length seconds. The choice of the -second interval balances the dimension against the number of data points in EIG. Although the local covariance structure of the EEG signal is taken into account in the EIG analysis, this relationship is different from the protocol criteria. As a result, we do not expect to obtain a comparable stratification power. Nevertheless, even when we focus only on these short-term EEG signals, we can still predict the sleep stage up to some accuracy, and doing so helps to attain a better classification rate when combined with the Respiratory Index. This hints at the possibility that useful information is hidden inside the EEG signal at a finer scale. This interesting potential will be explored in a future study.

The discussion would not be complete without mentioning the shortcomings of our study. First, in this study we focus on a small database containing only relatively normal subjects. To confirm the usefulness of the proposed features, we need to study a larger database with different types of subjects. Second, the chosen features, in particular those selected by EIG, are subject-dependent. Indeed, different subjects might have different dynamical systems, and the number of dominant components determined by EIG might vary. A theoretical and practical study of integrating the proposed features among different subjects is underway.

In conclusion, by applying modern signal processing techniques to the EEG and respiratory signals, we find a set of suitable features that allow us to predict the sleep stages accurately. In addition to gaining insight into the dynamics controlling sleep, the automatic annotation system based on this analysis might lead to an objective classification as well as reduce the human expert effort involved in sleep evaluation.

## Acknowledgements

Hau-tieng Wu and Ronen Talmon thank Professor Ronald Coifman for helpful discussions. Ronen Talmon acknowledges the support of the European Union's Seventh Framework Programme (FP7) under Marie Curie Grant No. 630657. Yu-Lun Lo acknowledges the support of Taiwan National Science Council grants 101-2220-E-182A-001 and 102-2220-E-182A-001.

## References

- [1] T. Lee-Chiong, Sleep Medicine: Essentials and Review. Oxford, 2008.
- [2] A. Rechtschaffen and A. Kales, A Manual of Standardized Terminology, Techniques and Scoring System for Sleep Stages of Human Subjects. Washington: Public Health Service, US Government Printing Office, 1968.
- [3] C. Iber, S. Ancoli-Isreal, A. Chesson Jr., and S. Quan, The AASM Manual for Scoring of Sleep and Associated Events-Rules: Terminology and Technical Specification. American Academy of Sleep Medicine, 2007.
- [4] R. Norman, I. Pal, C. Stewart, J. Walsleben, and D. Rapoport, “Interobserver agreement among sleep scorers from different centers in a large dataset,” Sleep, vol. 23, no. 7, pp. 901–908, 2000.
- [5] V. Bajaj and R. B. Pachori, “Automatic classification of sleep stages based on the time-frequency image of EEG signals,” Computer Methods and Programs in Biomedicine, vol. 112, no. 3, pp. 320–328, 2013.
- [6] N. Kannathal, M. Choo, U. Acharya, and P. Sadasivan, “Entropies for detection of epilepsy in EEG,” Computer Methods and Programs in Biomedicine, vol. 80, pp. 187–194, 2005.
- [7] S. Blanco, R. Quiroga, O. Rosso, and S. Kochen, “Time-frequency analysis of electroencephalogram series,” Physical Review E, vol. 51, no. 3, pp. 2624–2631, 1995.
- [8] S. Geng, W. Zhou, Q. Yuan, D. Cai, and Y. Zeng, “EEG non-linear feature extraction using correlation dimension and Hurst exponent,” Neurological Research, vol. 33, no. 9, pp. 908–912, 2011.
- [9] R. Talmon and R. Coifman, “Empirical intrinsic geometry for nonlinear modeling and time series filtering,” Proc. Nat. Acad. Sci., vol. 110, no. 31, pp. 12 535–12 540, 2013.
- [10] R. Talmon and R. R. Coifman, “Intrinsic modeling of stochastic dynamical systems using empirical geometry,” Appl. Comput. Harmon. Anal., 2014, Tech. Report YALEU/DCS/TR1467.
- [11] D. Duncan, R. Talmon, H. P. Zaveri, and R. R. Coifman, “Identifying preseizure state in intracranial EEG data using diffusion kernels,” Mathematical Biosciences and Engineering, vol. 10, pp. 579–590, 2013.
- [12] R. Talmon, S. Mallat, H. Zaveri, and R. R. Coifman, “Manifold learning for latent variable inference in dynamical systems,” submitted, 2014, Tech. Report YALEU/DCS/TR1491.
- [13] G. S. Chung, B. H. Choi, K. K. Kim, Y. G. Lim, J. W. Choi, D.-U. Jeong, and K.-S. Park, “REM sleep classification with respiration rates,” in Proc. 6th International Special Topic Conference on Information Technology Applications in Biomedicine (ITAB 2007), 2007, pp. 194–197.
- [14] G. Guerrero-Mora, P. Elvia, A. Bianchi, J. Kortelainen, M. Tenhunen, S. Himanen, M. Mendez, E. Arce-Santana, and O. Gutierrez-Navarro, “Sleep-wake detection based on respiratory signal acquired through a pressure bed sensor,” in Engineering in Medicine and Biology Society (EMBC), 2012 Annual International Conference of the IEEE, 2012, pp. 3452–3455.
- [15] J. Sloboda and M. Das, “A simple sleep stage identification technique for incorporation in inexpensive electronic sleep screening devices,” in Aerospace and Electronics Conference (NAECON), Proceedings of the 2011 IEEE National, 2011, pp. 21–24.
- [16] H.-T. Wu, “Instantaneous frequency and wave shape functions (I),” Appl. Comput. Harmon. Anal., vol. 35, pp. 181–199, 2013.
- [17] Y.-C. Chen, M.-Y. Cheng, and H.-T. Wu, “Nonparametric and adaptive modeling of dynamic seasonality and trend with heteroscedastic and dependent errors,” J. Roy. Stat. Soc. B, vol. 76, pp. 651–682, 2014.
- [18] R. Talmon, I. Cohen, S. Gannot, and R. R. Coifman, “Diffusion maps for signal processing: A deeper look at manifold-learning techniques based on kernels and graphs,” IEEE Signal Process. Mag., vol. 30, no. 4, pp. 75–86, 2013.
- [19] I. Daubechies, J. Lu, and H.-T. Wu, “Synchrosqueezed wavelet transforms: An empirical mode decomposition-like tool,” Appl. Comput. Harmon. Anal., vol. 30, pp. 243–261, 2011.
- [20] M. Engoren, “Approximate entropy of respiratory rate and tidal volume during weaning from mechanical ventilation,” Critical Care Medicine, vol. 26, pp. 1817–1823, 1998.
- [21] G. Benchetrit, “Breathing pattern in humans: diversity and individuality,” Respir. Physiol., vol. 122, no. 2-3, pp. 123 – 129, 2000.
- [22] M. Wysocki, C. Cracco, A. Teixeira, A. Mercat, J. Diehl, Y. Lefort, J. Derenne, and T. Similowski, “Reduced breathing variability as a predictor of unsuccessful patient separation from mechanical ventilation,” Critical Care Medicine, vol. 34, pp. 2076–2083, 2006.
- [23] H.-T. Wu, S.-S. Hseu, M.-Y. Bien, Y. R. Kou, and I. Daubechies, “Evaluating physiological dynamics via synchrosqueezing: Prediction of ventilator weaning,” IEEE Transactions on Biomedical Engineering, vol. 61, pp. 736–744, 2013.
- [24] A. Singer and R. R. Coifman, “Non-linear independent component analysis with diffusion maps,” Appl. Comput. Harmon. Anal., vol. 25, no. 2, pp. 226–239, 2008.
- [25] R. R. Coifman and S. Lafon, “Diffusion maps,” Appl. Comput. Harmon. Anal., vol. 21, no. 1, pp. 5–30, 2006.
- [26] S. Mallat, “Group invariant scattering,” Comm. Pure Appl. Math., vol. 65, no. 10, pp. 1331–1398, 2012.
- [27] J. Bruna, S. Mallat, E. Bacry, and J.-F. Muzy, “Multiscale intermittent process analysis by scattering,” submitted, arXiv:1311.4104, 2013.
- [28] M. Belkin and P. Niyogi, “Convergence of Laplacian eigenmaps,” in Advances in Neural Information Processing Systems 19: Proceedings of the 2006 Conference, vol. 19. The MIT Press, 2007, p. 129.
- [29] A. Singer and H.-T. Wu, “Spectral convergence of the connection Laplacian from random samples,” submitted, 2013.
- [30] N. El Karoui, “On information plus noise kernel random matrices,” The Annals of Statistics, vol. 38, no. 5, pp. 3191–3216, 2010.
- [31] N. El Karoui and H.-T. Wu, “Connection graph Laplacian methods can be made robust to noise,” submitted, 2014.
- [32] B. Scholkopf and A. Smola, Learning with Kernels. MIT Press, 2002.
- [33] R. Rifkin and A. Klautau, “In Defense of One-Vs-All Classification,” Journal of Machine Learning Research, vol. 5, pp. 101–141, 2004.