Feature engineering workflow for activity recognition from synchronized inertial measurement units



The ubiquitous availability of wearable sensors is responsible for driving the Internet-of-Things but is also making an impact on sport sciences and precision medicine. While human activity recognition from smartphone data or other types of inertial measurement units (IMU) has evolved into one of the most prominent daily life examples of machine learning, the underlying process of time-series feature engineering is still time-consuming. This lengthy process inhibits the development of IMU-based machine learning applications in sport science and precision medicine. This contribution discusses a feature engineering workflow, which automates the extraction of time-series features based on the FRESH algorithm (FeatuRe Extraction based on Scalable Hypothesis tests) to identify statistically significant features from synchronized IMU sensors (IMeasureU Ltd, NZ). The feature engineering workflow has five main steps: time-series engineering, automated time-series feature extraction, optimized feature extraction, fitting of a specialized classifier, and deployment of the optimized machine learning pipeline. The workflow is discussed for the case of a user-specific running-walking classification, and the generalization to a multi-user multi-activity classification is demonstrated.

1 Introduction

Human Activity Recognition (HAR) is an active research area within the field of ubiquitous sensing, which has applications in medicine (monitoring exercise routines) and sport (monitoring the potential for injuries and enhancing athletes' performance). For a comprehensive overview of this topic, refer to [6]. Typically, the design of HAR applications has to overcome the following challenges [1]:

  1. Selection of the attributes to be measured.

  2. Construction of a portable and unobtrusive data acquisition system.

  3. Design of feature extraction and inference methods.

  4. Automated adjustment to new users without the need for re-training the system.

  5. Implementation in mobile devices meeting energy and processing requirements.

  6. Collection of data under realistic conditions.

In this contribution, we discuss the automated engineering of time-series features (challenge 3) from two synchronized inertial measurement units as provided by IMeasureU’s BlueThunder sensor [8]. Each sensor records acceleration, angular velocity, and magnetic field in three spatial dimensions. Due to the availability of machine learning libraries like tsfresh [2] or hctsa [5], which automate the extraction of time-series features for time-series classification tasks [4], we are shifting our focus from the engineering of time-series features to the engineering of time-series. For this purpose, we consider not only the 18 sensor time-series from the two synchronized sensors but also 6 paired time-series, which measure the differences between the axes of different sensors. A further focus of this contribution is the optimization of the feature extraction process for the deployment of the machine learning pipeline (Sec. 2). The workflow is discussed for the case of a user-specific running-walking classification (Sec. 3.1), and the generalization to a multi-user multi-activity classification (Sec. 3.2) is demonstrated. The paper closes with a short discussion (Sec. 4).

2 Automated feature engineering workflow

The automated feature engineering workflow presented in this paper has two foundations: The BlueThunder sensor from IMeasureU Ltd. [8] and the time-series feature extraction library tsfresh [2, 3].

2.1 Synchronized inertial measurement units

The BlueThunder sensor is a wireless inertial measurement unit (IMU), which combines a 3-axis accelerometer, a 3-axis gyroscope, and a 3-axis compass. Its specification is listed in Tab. 1 and its dimensions are shown in Fig. 1a. One of the key features of this sensor is that several units can be synchronized. Therefore, not only the measured sensor signals themselves but also paired signals, e.g. the difference between the acceleration in the x-direction of two different sensors, can be used as additional signals. One might interpret these computed signals as being recorded by virtual sensors, which are essentially signal processing algorithms.
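Such a virtual sensor can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the case studies: the helper name `paired_difference` and the sample values are hypothetical, and both signals are assumed to be synchronized and sampled at the same rate.

```python
def paired_difference(signal_a, signal_b, magnitude=False):
    """Combine two synchronized signals into one virtual-sensor signal.

    Hypothetical helper for illustration: both inputs are assumed to be
    equally long sequences of samples taken at identical time stamps.
    """
    if len(signal_a) != len(signal_b):
        raise ValueError("synchronized signals must have the same length")
    diff = [a - b for a, b in zip(signal_a, signal_b)]
    # Optionally drop the sign and keep only the magnitude of the difference.
    return [abs(d) for d in diff] if magnitude else diff


# Made-up x-axis accelerations from two sensors.
accel_x_left = [0.5, 1.5, -2.0]
accel_x_right = [0.25, 2.0, 1.0]
accel_x_diff = paired_difference(accel_x_left, accel_x_right)
accel_x_absdiff = paired_difference(accel_x_left, accel_x_right, magnitude=True)
```

The resulting list can be treated like any other sensor time-series in the subsequent feature extraction.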

In order to demonstrate the applicability of the presented feature engineering workflow, we discuss two different activity recognition experiments. The first experiment is concerned with the discrimination of running vs walking for a specific person (Sec. 3.1), the second with generalizing the classification of 10 different activities over different persons (Sec. 3.2). The running vs walking classification experiment was designed with a basic setup of two IMUs mounted at the left and right ankle. The multi-activity classification task considered nine different mounting points: the front of the head, the left and right upper arm, the left and right wrist, the left and right ankle, and the top of the left and right foot (Fig. 1b).

accelerometer range
accelerometer resolution   16 bit
gyroscope range
gyroscope resolution       16 bit
compass range
compass resolution         13 bit
data logging               500 Hz
Table 1: Specification of the IMeasureU BlueThunder sensor [8].
Figure 1: IMeasureU’s BlueThunder sensor. Panel a: dimensions of the sensor [8, p. 2]. Panel b: mounting points of the sensors at the front of the head (1), left and right upper arm (2, 9), left and right wrist (3, 8), left and right ankle (4, 7), and top of left and right foot (5, 6). For the running-walking classification, sensors were mounted at the left and right ankle (4, 7). For the multi-activity classification, the optimal sensor combination was tip of right foot (5) and right upper arm (2).

2.2 Feature extraction on the basis of scalable hypothesis testing

At the core of the Python-based machine learning library tsfresh [2] is the FRESH algorithm. FRESH is the abbreviation for FeatuRe Extraction on the basis of Scalable Hypothesis testing [3]. The general idea of this algorithm is to characterise each time-series by applying a library of curated algorithms, which quantify each time-series with respect to its distribution of values, correlation properties, stationarity, entropy, and nonlinear time-series analysis. This brute-force feature extraction is computationally expensive and has to be followed by a feature selection algorithm in order to prevent overfitting. The feature selection is done by testing the statistical significance of each time-series feature for predicting the target and controlling the false discovery rate [3]. Depending on the particular feature-target combination, the algorithm chooses the type of hypothesis test to be performed and selects the set of statistically significant time-series features while controlling the false discovery rate. The pseudocode of the FRESH algorithm is given in Alg. 1.

Data: Labelled samples comprising different time-series
Result: Relevant time-series features
for all predefined feature extraction algorithms do
    for all time-series do
        for all samples do
            Apply feature extraction algorithm to time-series sample and compute time-series feature;
        end for
        Test statistical significance of feature for predicting the label;
    end for
end for
Select significant features while preserving false discovery rate;
Algorithm 1: Pseudocode of FeatuRe Extraction on the basis of Scalable Hypothesis testing (FRESH).
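The final selection step of Alg. 1 can be illustrated with a short sketch. The text only states that the false discovery rate is controlled, so the plain Benjamini-Hochberg procedure is used here as a stand-in; the procedure used by FRESH may differ. The function name, the feature names, and the p-values are hypothetical, and the p-values are assumed to come from the per-feature hypothesis tests of the inner loop.

```python
def select_significant(p_values, fdr=0.05):
    """Select features with the Benjamini-Hochberg procedure.

    p_values maps a feature name to the p-value of its hypothesis test.
    Assumption: plain Benjamini-Hochberg is a stand-in for the selection
    step of Alg. 1, not necessarily the procedure used by FRESH.
    """
    ranked = sorted(p_values.items(), key=lambda item: item[1])
    m = len(ranked)
    cutoff = 0
    for rank, (_, p) in enumerate(ranked, start=1):
        # Find the largest rank whose p-value lies below its threshold.
        if p <= rank / m * fdr:
            cutoff = rank
    return [name for name, _ in ranked[:cutoff]]


# Made-up p-values for four hypothetical features.
p_values = {"feature_a": 0.001, "feature_b": 0.02,
            "feature_c": 0.03, "feature_d": 0.5}
selected = select_significant(p_values, fdr=0.05)
```

With these made-up values, the first three features pass their rank-dependent thresholds (0.0125, 0.025, 0.0375), while the fourth is rejected.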

2.3 Feature engineering workflow for activity recognition

The general approach of the feature engineering workflow for activity recognition has five major steps:

Time-series engineering

Increase the number of time-series by designing virtual sensors, which combine the signals from different sensors, compute attributes like derivatives, or do both.

Automated time-series feature extraction

Extract a huge variety of different time-series features, which are relevant for predicting the target.

Optimized feature extraction

Identify a subset of features, which optimizes the performance of a cross-validated classifier.

Fitting of specialized classifier

Refit the classifier by using only the subset of features from the previous step.

Deployment of optimized algorithm

Extract only those time-series features that are needed for the specialized classifier.

Note that the deployment step uses the fact that every feature can be mapped to a combination of a specific time-series and a well-defined algorithm. Most likely, not all time-series are relevant and depending on the classifier, only a small set of time-series features is needed. An example of this workflow is documented in the following case-study for classifying running vs walking.

3 Activity recognition case studies

3.1 Running vs walking

The following case study trains an individualized activity recognition algorithm for discriminating running vs walking on the basis of a 560-second activity sequence, for which the corresponding activities were logged manually:

  • 2 synchronized IMUs mounted at left and right ankle (cf. Fig. 1b),

  • 560 seconds of mixed running and walking,

  • 280000 measurements for each of the 18 sensors (plus 6 paired measurements),

  • 140 sections of 4 s length (82 walking sections, 58 running sections),

  • 15605 features in total,

  • 4850 statistically significant features (false discovery rate 5%).
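The sample and section counts above follow from the logging rate of 500 Hz (Tab. 1). The sketch below reproduces this arithmetic under the assumption that the sections are consecutive and non-overlapping, which is consistent with 140 sections of 4 s covering 560 s; the function name and the placeholder signal are hypothetical.

```python
SAMPLING_RATE_HZ = 500  # data logging rate from Tab. 1
RECORDING_S = 560       # length of the activity sequence
SECTION_S = 4           # length of one labelled section


def split_into_sections(signal, rate_hz, section_s):
    """Cut a signal into consecutive, non-overlapping sections.

    Assumption: sections do not overlap; trailing samples that do not
    fill a whole section are dropped.
    """
    samples_per_section = rate_hz * section_s
    n_sections = len(signal) // samples_per_section
    return [signal[i * samples_per_section:(i + 1) * samples_per_section]
            for i in range(n_sections)]


# Placeholder signal with one value per logged sample.
signal = [0.0] * (SAMPLING_RATE_HZ * RECORDING_S)
sections = split_into_sections(signal, SAMPLING_RATE_HZ, SECTION_S)
```

Each section holds 2000 samples per time-series and forms one labelled sample for the feature extraction.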

The virtual sensor was configured to compute the magnitude of the difference between corresponding directions of the acceleration and gyroscope sensors. The time-series features were extracted with tsfresh [2] (see footnote 2), which was available in version 0.10.1 at the time of this case study. A random forest classifier as implemented in scikit-learn [7] (version 0.19.0) was used for discriminating running vs walking. The default configuration of the classifier already achieved 100% accuracy under 10-fold cross-validation, such that no hyperparameter tuning was performed. The 20 time-series features with the highest feature importances across 100,000 fitted random forests were selected as the optimized time-series feature subset.


These 20 time-series features are computed from 10 different time-series: four from the right ankle (accel_y_r, accel_z_r, gyro_x_r, gyro_z_r), three from the left ankle (accel_z_l, gyro_y_l, gyro_z_l), and three magnitudes of differences (accel_y_diff, accel_z_diff, gyro_y_diff). Each feature references the generating algorithm using the following scheme [2]: (1) the time-series kind the feature is based on, (2) the name of the feature calculator that has been used to extract the feature, and (3) key-value pairs of parameters configuring the respective feature calculator.


The features are dominated by two different methods, which quantify the linear trend (agg_linear_trend) and the expected change of the signal (change_quantiles). A detailed description of the underlying algorithms can be found in the tsfresh documentation (see footnote 3). The list of features can be converted into a dictionary using a function provided by tsfresh, and this dictionary can be used for restricting the time-series feature extractor of tsfresh to extract just this specific set of time-series features (see footnote 4).
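The idea behind this conversion can be illustrated with a hand-written parser for the naming scheme described above. This is a sketch in plain Python, not the tsfresh function itself. It assumes that the three parts of a feature name are separated by double underscores and that each parameter is written as `key_value`. The feature names are made up for illustration, and the nested dictionary layout (kind, then calculator, then list of parameter sets) is likewise an assumption.

```python
import ast


def parse_feature_name(name):
    """Split 'kind__calculator__key_value__...' into its three parts."""
    kind, calculator, *param_parts = name.split("__")
    params = {}
    for part in param_parts:
        # The value follows the last single underscore of each part.
        key, raw_value = part.rsplit("_", 1)
        try:
            params[key] = ast.literal_eval(raw_value)
        except (ValueError, SyntaxError):
            params[key] = raw_value  # keep unparsable values as strings
    return kind, calculator, params


def features_to_settings(names):
    """Group feature names into {kind: {calculator: [params, ...]}}."""
    settings = {}
    for name in names:
        kind, calculator, params = parse_feature_name(name)
        calculators = settings.setdefault(kind, {})
        if params:
            calculators.setdefault(calculator, []).append(params)
        else:
            calculators[calculator] = None  # calculator without parameters
    return settings


# Made-up feature names that follow the naming scheme.
example_names = [
    'accel_z_l__change_quantiles__f_agg_"var"__isabs_False__qh_1.0__ql_0.0',
    'accel_z_l__mean',
    'gyro_y_diff__agg_linear_trend__f_agg_"max"__chunk_len_5__attr_"stderr"',
]
settings = features_to_settings(example_names)
```

Because every feature name encodes its time-series, calculator, and parameters, the dictionary is sufficient to recompute exactly the selected features and nothing else.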

Fig. 2a summarizes the feature engineering workflow for the running vs walking case study. The inlay at the bottom right of this figure is also depicted in Fig. 2b. It shows the estimated activity sequence as time-series of probabilities on a hold-out data set, which was recorded by the same person as the training data set but on a different date. For this activity classification, only the 20 time-series features listed above were used. The algorithm’s accuracy on the hold-out dataset was 92%.



Figure 2: Panel a: feature engineering workflow for activity recognition tasks with details for the running vs walking case study. Panel b: classification of running vs walking for the hold-out data set, operating on the 20 time-series features identified during the feature engineering phase of the case study. Red dots indicate misclassifications. The algorithm has an accuracy of 92%.

3.2 Multi-activity classification case study

The following case study involves a more complex feature engineering setup because all nine sensor mounting points, as depicted in Fig. 1, were considered for the feature engineering. The task of this case study was to find a combination of sensors for recognizing the activities

  • lying face down,

  • push-ups,

  • running,

  • sit-ups,

  • standing,

  • star jumps, and

  • walking,

while allowing for optimal generalization to other individuals. Therefore, the feature engineering was optimized on the basis of a group 5-fold cross-validation of activities from five different persons (four men, one woman), in which the folds are split by person. The mean accuracy for this person-specific cross-validation was 92.6%.
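With five persons and five folds, a group cross-validation amounts to holding out all sections of one person per fold. The sketch below shows this splitting logic in plain Python. It is an assumption about how the folds were formed, and the function name and person labels are hypothetical; scikit-learn offers `GroupKFold` for the same purpose.

```python
def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices) per group.

    Assumption: one fold per person, so that no person contributes
    samples to both the training and the test part of a fold.
    """
    for held_out in sorted(set(groups)):
        train = [i for i, g in enumerate(groups) if g != held_out]
        test = [i for i, g in enumerate(groups) if g == held_out]
        yield held_out, train, test


# Hypothetical person labels for ten activity sections from five persons.
persons = ["p1", "p1", "p2", "p2", "p3", "p3", "p4", "p4", "p5", "p5"]
folds = list(leave_one_group_out(persons))
```

Splitting by person rather than by section prevents the classifier from being evaluated on individuals it has already seen during training.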

The optimal sensor mounting points for this task have been identified as the tip of the right foot and the upper right arm (Fig. 1). The evaluation of the resulting activity recognition algorithm on a sixth subject, who had recorded a 45 min long evaluation data set, yielded a similar performance (Fig. 3). This evaluation was computed in less than 20 seconds.

Figure 3: Evaluation of multi-activity recognition pipeline on the hold-out data set.

4 Discussion

The presented workflow for feature engineering of activity recognition tasks demonstrates a flexible and robust methodology, which is based on the combination of signals from synchronized IMUs and automated time-series feature extraction. Due to the availability of machine learning libraries for automated time-series feature extraction, it can be expected that there will be a general shift of focus in research from the engineering of time-series features to the engineering of time-series. In this work, the engineering of time-series has been modelled as virtual sensors, but in many cases, this process will be similar to the design of signal operators.


The authors would like to thank Julie Férard and the team at IMeasureU for their support.


  1. a.kempa-liehr@auckland.ac.nz
  2. https://github.com/blue-yonder/tsfresh/tree/v0.10.1
  3. https://tsfresh.readthedocs.io/en/v0.10.1/text/list_of_features.html
  4. https://github.com/blue-yonder/tsfresh/blob/master/notebooks/the-fc_parameters-extraction-dictionary.ipynb


  1. Ahmadi, A., Mitchell, E., Richter, C., Destelle, F., Gowing, M., O’Connor, N.E., Moran, K.: Toward automatic activity classification and movement assessment during a sports training session. IEEE Internet of Things Journal 2(1), 23–32 (2015)
  2. Christ, M., Braun, N., Neuffer, J., Kempa-Liehr, A.W.: Time series FeatuRe extraction on basis of scalable hypothesis tests (tsfresh – a Python package). Neurocomputing 307, 72–77 (2018). https://doi.org/10.1016/j.neucom.2018.03.067
  3. Christ, M., Kempa-Liehr, A.W., Feindt, M.: Distributed and parallel time series feature extraction for industrial big data applications. arXiv:1610.07717v1 (2016), https://arxiv.org/abs/1610.07717v1, Asian Conference on Machine Learning (ACML), Workshop on Learning on Big Data (WLBD)
  4. Fulcher, B.D.: Feature-based time-series analysis, pp. 87–116. Taylor & Francis, Boca Raton, FL (2018)
  5. Fulcher, B.D., Jones, N.S.: hctsa: A computational framework for automated time-series phenotyping using massive feature extraction. Cell Systems 5(5), 527–531.e3 (2017). https://doi.org/10.1016/j.cels.2017.10.001
  6. Lara, O.D., Labrador, M.A.: A survey on human activity recognition using wearable sensors. IEEE Communications Surveys & Tutorials 15(3), 1192–1209 (2013)
  7. Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E.: Scikit-learn: Machine learning in Python. Journal of Machine Learning Research 12, 2825–2830 (2011)
  8. Wong, A., Vallabh, R.: IMeasureU BlueThunder sensor. Sensor Specification 1.5, Vicon IMeasureU Limited, Auckland (2018), https://imeasureu.com/wp-content/uploads/2018/05/Sensor_Specification_v1.5.pdf