Discovering Hidden Structure in High Dimensional Human
Behavioral Data via Tensor Factorization
In recent years, rapid advances in technology have increased the opportunity for longitudinal human behavioral studies. Rich multimodal data from wearables such as Fitbit, online social networks, mobile phones, etc. can be collected in natural environments. Uncovering the underlying low-dimensional structure of noisy multi-way data in an unsupervised setting is a challenging problem. Tensor factorization has been successful in extracting interconnected low-dimensional descriptions of multi-way data. In this paper, we apply non-negative tensor factorization to a real-world wearable sensor dataset, StudentLife, to find latent temporal factors and groups of similar individuals. Meta data is available for the semester schedule, as well as the individuals’ performance and personality. We demonstrate that non-negative tensor factorization can successfully discover clusters of individuals who exhibit higher academic performance, as well as those who frequently engage in leisure activities. The recovered latent temporal patterns associated with these groups are validated against ground truth data to demonstrate the accuracy of our framework.
Behavioral data, collected from a variety of sources, has been used to understand human affect, wellbeing, social relationships, performance and decision-making processes. Existing techniques such as interviews and questionnaires are inaccurate, expensive and laborious to administer. Fortunately, today’s densely instrumented world offers tremendous opportunities for continuous acquisition and analysis of multivariate time series data that provides a multimodal, spatiotemporal characterization of an individual’s actions.
Efficiently coupling such rich sensor data with fusion and predictive modeling techniques can provide continuous, personalized, contextual, and insightful assessments of individual performance. StudentLife is a 10-week study of 48 Dartmouth undergraduate and graduate students using passive and mobile sensor data to infer wellbeing, academic performance and behavioral trends (Wang et al., 2014). The Dartmouth College research team was able to predict GPA using activity, conversational interaction, mobility, and self-reported emotion and stress data over the semester. SNAPSHOT is a 30-day study of MIT undergraduates using mobile sensors and surveys to understand sleep, social interactions, affect, performance, stress and health (Sano et al., 2015). RealityMining is a 9-month study of 75 MIT Media Laboratory students, using mobile sensor data to track social interactions and networking (Eagle and Pentland, 2006). The friends-and-families study collects data from 130 adult members of a young family community to study fitness intervention and social incentives (Aharony et al., 2011).
These data sources are often collected from a small set of participants in natural conditions, continuously and over long periods of time. Therefore, they are potentially heterogeneous, sparse and high-dimensional, and have systematically missing values. Tensor factorization can tolerate missing values (Acar et al., 2011; Dauwels et al., 2012; Cichocki, 2014) and has been widely used for multi-way relational data, for example for clustering and temporal structure discovery in dynamic networks (Sapienza et al., 2017). In this paper we demonstrate that non-negative tensor factorization (NTF) can reveal the low-dimensional patterns of noisy heterogeneous wearable sensor data while preserving interpretability (Cichocki, 2009). We are mainly interested in the StudentLife dataset, as it has a very rich set of time series collected from a cohort of students via their smartphones, plus a wide range of available meta data, such as workloads and mental health state. Wang et al. (2015) build a matrix representation of this longitudinal data by extracting different statistics from variables across the time dimension to infer students’ wellbeing and performance. Patterns of activity and sociability behavior have been explored by Harari et al. (2017), by aggregating variables over the individuals dimension. We would like to extract the interconnected low-dimensional latent factors without aggregating the data over any dimension. Here we demonstrate the strength of tensor factorization for in situ human behavior research by finding the hidden structures and validating them against ground truth. First we evaluate the variables and individuals associated with each temporal latent factor. Then we compare the distribution of students who are strongly associated with each component across different meta data, such as GPA, personality traits and affect.
In this article we propose the following ideas:
- Unsupervised extraction of low-dimensional structure of wearable sensor data
- Discovering clusters of individuals who exhibit higher academic performance, as well as those who frequently engage in leisure activities
- Validation of recovered latent temporal patterns associated with these groups against ground truth data
A wide range of real-world datasets, such as recommendation systems, multivariate time series and video streams, are multi-way. A mathematical representation of such data is a tensor, $\mathcal{X}$. One approach to working with multi-way data is matricizing (flattening) it and applying conventional supervised or unsupervised techniques, but we may lose structural information. Tensors are powerful tools for extracting complex structures from high-dimensional multi-way data in an unsupervised manner. In this section we discuss the model that we will use to analyze 3-way longitudinal human behavioral data. For $I$ individuals, with $J$ variables for a duration of $K$ time frames, a tensor $\mathcal{X} \in \mathbb{R}^{I \times J \times K}$ will be created. Entry $x_{ijk}$ of this tensor corresponds to the $i$-th person in the $j$-th variable at the $k$-th time unit. Table 1 summarizes our notation throughout this paper.
|$\mathcal{X}$, $\mathbf{X}$, $\mathbf{x}$, $x$|Tensor, matrix, column vector, scalar|
|$\mathbf{x} \in \mathbb{R}^{I}$|Definition of an I-dimensional vector|
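To make the difference between the 3-way representation and its flattened form concrete, the following is a minimal NumPy sketch. The sizes, the random data and the names `X`, `X_flat`, `I`, `J`, `K` are illustrative assumptions and are not taken from the dataset used in this paper.

```python
import numpy as np

# Hypothetical sizes: I individuals, J variables, K time frames.
I, J, K = 4, 3, 5
rng = np.random.default_rng(0)
X = rng.random((I, J, K))   # X[i, j, k]: person i, variable j, time unit k

# Mode-1 matricization: one row per individual, every (variable, time)
# pair becomes a separate column.
X_flat = X.reshape(I, J * K)

# The values are preserved, but a single column index now mixes two modes:
# column j * K + k holds variable j at time unit k.
i, j, k = 2, 1, 3
same_value = X_flat[i, j * K + k] == X[i, j, k]
```

A method applied to `X_flat` treats its 15 columns as unrelated features, so it has no way to know which columns share a variable or a day. This is the structural information that the tensor form keeps.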
2.2. Non-negative tensor decomposition
The extraction of meaningful patterns of behavior can be carried out by taking full advantage of tensor decomposition techniques. Let us consider a dataset composed of individuals whose daily behavior (walking, running, talking, etc. during the day) has been recorded over time. To uncover groups of individuals with similar correlated trajectories, identification of lower-dimensional factors is required. Once the dataset is represented in tensor form, we can perform tensor decomposition to discover the hidden lower-dimensional structure of the data.
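Here, we use the CANDECOMP/PARAFAC decomposition (Carroll and Chang, 1970; Harshman, 1970), which will decompose the tensor into a sum of rank-one tensors, called components, Figure 1. Also we add the non-negativity constraints to each factor matrix in order to find interpretable components.
$$\mathcal{X} \approx \sum_{r=1}^{R} \lambda_r \, \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r,$$
where $\lambda_r$ are the values of the tensor core $\mathcal{L} = \mathrm{diag}(\boldsymbol{\lambda})$, and the outer product $\mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r$ corresponds to the $r$-th component of rank-$R$ estimation. This decomposition can be written as the following optimization problem:
$$\min_{\mathbf{A}, \mathbf{B}, \mathbf{C}} \left\| \mathcal{X} - \sum_{r=1}^{R} \lambda_r \, \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r \right\|_F^2 \quad \text{subject to} \quad \mathbf{A} \geq 0, \; \mathbf{B} \geq 0, \; \mathbf{C} \geq 0,$$
where $\|\cdot\|_F$ is the Frobenius norm, and $\mathbf{A} \in \mathbb{R}^{I \times R}$ (individuals), $\mathbf{B} \in \mathbb{R}^{J \times R}$ (variables) and $\mathbf{C} \in \mathbb{R}^{K \times R}$ (time) are the factor matrices with their columns containing rank-1 factors, $\mathbf{a}_r$, $\mathbf{b}_r$ and $\mathbf{c}_r$, respectively. Imposing the non-negativity constraints makes the factorization results interpretable.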
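The paper does not state which solver is used for this optimization problem. As an illustration only, the sketch below fits a non-negative CP model with multiplicative updates, one common choice for this kind of constraint. The function name `ntf_cp`, its parameters and the synthetic test tensor are assumptions made for this example.

```python
import numpy as np

def ntf_cp(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Non-negative CP decomposition of a 3-way array (multiplicative updates)."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A = rng.random((I, rank))   # individuals x components
    B = rng.random((J, rank))   # variables x components
    C = rng.random((K, rank))   # time x components
    errors = []
    for _ in range(n_iter):
        # Each factor is multiplied by a ratio of non-negative terms,
        # so it stays non-negative throughout.
        A *= np.einsum('ijk,jr,kr->ir', X, B, C) / (A @ ((B.T @ B) * (C.T @ C)) + eps)
        B *= np.einsum('ijk,ir,kr->jr', X, A, C) / (B @ ((A.T @ A) * (C.T @ C)) + eps)
        C *= np.einsum('ijk,ir,jr->kr', X, A, B) / (C @ ((A.T @ A) * (B.T @ B)) + eps)
        X_hat = np.einsum('ir,jr,kr->ijk', A, B, C)
        errors.append(np.linalg.norm(X - X_hat) / np.linalg.norm(X))
    # Move the column scales into the weights so that each column has unit norm.
    norms = [np.maximum(np.linalg.norm(F, axis=0), eps) for F in (A, B, C)]
    lam = norms[0] * norms[1] * norms[2]
    return lam, A / norms[0], B / norms[1], C / norms[2], errors

# Synthetic non-negative rank-2 tensor (illustrative data only).
rng = np.random.default_rng(1)
A0, B0, C0 = rng.random((6, 2)), rng.random((5, 2)), rng.random((7, 2))
X = np.einsum('ir,jr,kr->ijk', A0, B0, C0)
lam, A, B, C, errors = ntf_cp(X, rank=2)
```

The returned `errors` list holds the relative reconstruction error after each sweep, which is a simple way to check that the fit has stopped improving. Since the objective is non-convex, different seeds can give different solutions, so in practice several random initializations are compared.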
3. Case Study
In this paper we use the StudentLife dataset, a large publicly available dataset tracking student performance, wellbeing and physiological state (Rui et al., 2014). StudentLife is a 10-week study conducted during the 2013 spring semester on 48 Dartmouth students (30 undergraduate and 18 graduate students). The dataset can be divided into four sections: smartphone sensors, ecological momentary assessments (EMAs), psychometrics and academic performance. From the raw sensor data, activity (stationary, walk, run and unknown), audio (silence, voice, noise and unknown), and conversation have been inferred. The psychometrics comprise pre and post measurements of the Big Five personality traits (John and Srivastava, 1999), flourishing scale (Diener et al., 2009), UCLA loneliness scale (Russell et al., 1978), positive and negative affect schedule (PANAS) (Watson et al., 1988), perceived stress scale (PSS) (Cohen et al., 1983), PHQ-9 depression scale (Kroenke et al., 2001), Pittsburgh sleep quality index (PSQI) (Buysse et al., 1989) and VR-12 health scale (Kazis et al., 2006). The academic performance data include class schedule, number of deadlines, overall GPA, participation in the online class forum Piazza, and more. We do not use EMAs in our modeling, as we are interested in learning from passively recorded data; we keep the surveys and EMAs only as ground truth for the validation task. For the psychometric surveys, a large number of pre and post survey scores are missing. We use the post survey score if it is available; otherwise the pre survey score is used as a replacement. If both scores are missing for a survey, we drop that user when using that survey.
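The rule for combining pre and post survey scores can be sketched in a few lines of NumPy. The arrays below are made-up values for five hypothetical students, with `NaN` marking a missing survey; they are not StudentLife data.

```python
import numpy as np

# Hypothetical pre/post scores for five students; NaN marks a missing survey.
pre = np.array([3.0, np.nan, 2.5, np.nan, 4.0])
post = np.array([3.5, 2.0, np.nan, np.nan, np.nan])

# Prefer the post score; fall back to the pre score when post is missing.
score = np.where(np.isnan(post), pre, post)

# Students with both scores missing are dropped for this survey only.
keep = ~np.isnan(score)
score_used = score[keep]
```

Because `keep` is computed per survey, a student excluded from one survey can still contribute to the analysis of another.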
3.1. Feature Extraction
Our goal in this paper is to understand whether we can find the low-dimensional structure of daily life from wearable devices, without using any meta data such as self-reported EMAs or day of the week. Therefore, we only use the smartphone sensor variables in our feature set. Each time unit comprises one day's worth of data and is divided into four time bins: bedtime (midnight-6 am), morning (6 am-12 pm), afternoon (12 pm-6 pm), and evening (6 pm-midnight). We extract the duration (in minutes) of running, walking, stationary, silence, voice, noise, and dark per time bin in each day. The frequency of each behavior and the number of changes in each behavior (e.g. from walking to running) in each time bin are also captured. From GPS and WiFi, the number of unique locations visited, and from Bluetooth, the number of unique nearby devices per time bin are added to the variable set. We normalize all the variables to have the same range, to prevent variables with large values (e.g. duration in minutes) from dominating the analysis. Finally, we organize our data as a three-way tensor of individuals, variables and days. Only 5% of the tensor entries are missing, which we impute with the mean value.
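The normalization and imputation steps can be sketched as follows. The text does not specify the exact scaling or which mean is used, so this example assumes min-max scaling of each variable to [0, 1] and imputation with the mean of the observed values of the same variable; the function name `normalize_and_impute` and the random input are hypothetical.

```python
import numpy as np

def normalize_and_impute(X):
    """X: (individuals, variables, days) array with NaN for missing entries."""
    X = X.astype(float)
    # Min-max scale each variable (second mode) to [0, 1], ignoring missing entries.
    lo = np.nanmin(X, axis=(0, 2), keepdims=True)
    hi = np.nanmax(X, axis=(0, 2), keepdims=True)
    span = np.where(hi > lo, hi - lo, 1.0)   # guard against constant variables
    X = (X - lo) / span
    # Fill missing entries with the mean of the observed values of that variable.
    var_mean = np.nanmean(X, axis=(0, 2), keepdims=True)
    return np.where(np.isnan(X), var_mean, X)

# Illustrative data: the first variable is on a much larger scale than the others.
rng = np.random.default_rng(0)
raw = rng.random((5, 3, 6))
raw[:, 0, :] *= 1000.0
raw[0, 0, 0] = np.nan
raw[1, 2, 3] = np.nan
tensor = normalize_and_impute(raw)
```

Scaling before imputing means the filled-in values are already on the common [0, 1] range, so they cannot reintroduce the scale differences that normalization removes.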
4. Discovering pattern of temporal behavior
In this section we employ non-negative tensor decomposition to find the hidden structure in the data. However, we first need to find and fix the correct number of components $R$. To select the best approximation, we vary the number of components and report the mean and standard deviation of the core consistency scores for each $R$, Figure 2. Negative values are set to zero and no standard deviation is reported for them. We choose rank $R = 3$, the number of components that yields the largest change in the slope of the core consistency curve. After fixing the number of components, we select the best approximation from a set of random initializations by choosing the one with the maximum core consistency for the selected rank $R$.
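For readers unfamiliar with the core consistency diagnostic, the sketch below shows one standard way to compute it for a fitted three-way CP model: the least-squares Tucker core for the fixed factor matrices is compared with an ideal superdiagonal core. The function name `core_consistency` is hypothetical, and the sketch assumes the component weights have been absorbed into the factor columns; it is not necessarily the exact implementation used for Figure 2.

```python
import numpy as np

def core_consistency(X, A, B, C):
    """Core consistency (percent) of a CP model X ~ sum_r a_r o b_r o c_r.

    Assumes component weights are absorbed into the columns of A, B and C.
    """
    R = A.shape[1]
    # Least-squares Tucker core for fixed factors: G = X x1 A^+ x2 B^+ x3 C^+.
    G = np.einsum('ijk,pi,qj,rk->pqr', X,
                  np.linalg.pinv(A), np.linalg.pinv(B), np.linalg.pinv(C))
    # An ideal CP model has a superdiagonal core of ones.
    T = np.zeros((R, R, R))
    idx = np.arange(R)
    T[idx, idx, idx] = 1.0
    return 100.0 * (1.0 - np.sum((G - T) ** 2) / R)

# Sanity check on an exact rank-2 tensor (illustrative data only).
rng = np.random.default_rng(0)
A = rng.random((6, 2))
B = rng.random((5, 2))
C = rng.random((7, 2))
X = np.einsum('ir,jr,kr->ijk', A, B, C)
cc = core_consistency(X, A, B, C)
```

Values close to 100 indicate that a trilinear model with that many components is appropriate, while low or negative values suggest too many components. Clipping with `max(cc, 0.0)` corresponds to the convention of setting negative scores to zero.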
Figure 4 presents the extracted latent temporal components. We use the meta data available in the StudentLife dataset to find the topic of each latent component discovered, Figure 3 (Dror et al., 2014). These curves are generated from values self-reported by students and averaged across all students each day. The first component follows the same pattern as studying and increases over the semester, Figure 3(a). The second component appears to be related to partying, as it decreases after the first week of the semester and there is a jump around the Green Key weekend (Figure 3(b), light green box). The third component follows the pattern of deadlines. The light green box in Figure 3(c) shows the period with a higher number of deadlines according to the meta data, Figure 3.
4.1. Structural Validation
A useful aspect of the dataset we use is the ground truth available from known semester schedules, students’ course load and self-reported values. Here, we examine our hypothesis that the extracted components represent studying, partying, and deadline topics. Figure 5 shows a stronger correlation of component 3 with stationary, silence and number of locations visited, which follows the pattern of students working on homework deadlines. More silence and darkness in the morning, and stronger membership of conversation during the afternoon and evening, support the hypothesis of assigning the topic "Party" to component 2. The frequency of capturing voice in the evening and the number of changes of audio status in the afternoon have higher weights in component 2 compared to the other two components. We also looked at the distribution of the top 25% of individuals who have the highest association with each component across different existing meta data, such as personality and performance. Note that all students have some degree of association with all components, and one student can fall among the top 25% of individuals in more than one component. The kernel density estimate (KDE) of the distribution of extraversion scores for students associated with each component is shown in Figure 5(a). Top members of component 2 have a significantly larger mean value than those of component 1 (p-value = 0.01 based on a t-test). People who score high on the extraversion personality trait enjoy being with others, participating in social gatherings and partying. This observation is consistent with our hypothesis that component 2 reflects partying, Figure 3(b).
Running the ANOVA test for the other four personality traits (Openness, Conscientiousness, Agreeableness, Neuroticism), the null hypothesis of equal mean values for the three sample sets is not rejected. Component 3 mostly matches the deadline pattern in Figure 3. We looked at the average number of deadlines over 10 weeks for the top members of each component and observed that students with higher weight in component 3 have more deadlines than the top members of component 1 (p-value = 0.06).
Looking at the students’ performance, it is also notable that the individuals involved in component 2 have a lower GPA than those in the other two components (p-value = 0.008), Figure 5(b). Based on the average self-reported relaxing by students over the semester, the top members of component 1 have significantly less relaxing time and, at the same time, the highest participation in the online class forum Piazza. These students have a significantly higher positive affect score compared to the groups of students associated with the other two components (p-value = 0.03), Figure 5(c).
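The group comparisons in this section follow a common pattern: select the top 25% of individuals by their loading on a component, then compare a meta data variable between groups. The sketch below illustrates that pattern. The helper names, the placeholder loadings and the placeholder scores are all hypothetical, and the use of Welch's unequal-variance form of the t statistic is an assumption, since the variant of the t-test is not stated.

```python
import numpy as np

def top_members(A, component, frac=0.25):
    """Indices of the individuals with the largest loadings on one component."""
    n_top = max(1, int(np.ceil(frac * A.shape[0])))
    return np.argsort(A[:, component])[::-1][:n_top]

def welch_t(x, y):
    """Welch's t statistic for the difference between two sample means."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    se = np.sqrt(x.var(ddof=1) / x.size + y.var(ddof=1) / y.size)
    return (x.mean() - y.mean()) / se

# Placeholder inputs: random loadings and scores, not results from the paper.
rng = np.random.default_rng(0)
loadings = rng.random((48, 3))                   # individuals x components
extraversion = rng.normal(3.0, 0.5, size=48)     # one score per individual

group1 = top_members(loadings, component=0)
group2 = top_members(loadings, component=1)
t_stat = welch_t(extraversion[group2], extraversion[group1])
```

Turning `t_stat` into a p-value requires the t distribution; a statistics library routine such as `scipy.stats.ttest_ind(x, y, equal_var=False)` returns both in one call.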
5. Conclusions and Future work
Rich multimodal data collected from wearable devices (e.g. Fitbit), mobile phones, online social networks, etc. has become increasingly available for reconstructing digital trails and studying human behavior. In this paper, we adopt the StudentLife dataset, collected over the course of 10 weeks from 48 Dartmouth undergraduate and graduate students using passive and mobile sensors, with the goal of inferring wellbeing, academic performance, and behavioral trends. We employ an unsupervised learning framework based on non-negative tensor decomposition to find groups of individuals with similar behavior. This type of decomposition can uncover latent temporal structures such as studying and partying over the semester. By applying this framework we found that the group of students associated with Component 2 (leisure activities) has the highest average self-reported extroversion score, while students associated with Component 1 (studying factor) have the highest average GPA, less relaxing time and higher positive affect. Next, we plan to implement supervised prediction of individuals’ performance and personality directly from tensors, instead of mapping the tensors to matrices and using conventional supervised methods designed for two-dimensional data.
The research is based upon work supported by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), via IARPA Contract No 2017-17042800005, and by DARPA (grant no. D16AP00115). The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the ODNI, IARPA, or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon.
- Acar et al. (2011) Evrim Acar, Daniel M. Dunlavy, Tamara G. Kolda, and Morten Morup. 2011. Scalable Tensor Factorizations for Incomplete Data. Chemometrics and Intelligent Laboratory Systems 106, 1 (March 2011), 41–56. https://doi.org/10.1016/j.chemolab.2010.08.004
- Aharony et al. (2011) Nadav Aharony, Wei Pan, Cory Ip, Inas Khayal, and Alex Pentland. 2011. Social fMRI: Investigating and shaping social mechanisms in the real world. Pervasive and Mobile Computing 7, 6 (2011), 643–659.
- Buysse et al. (1989) Daniel J. Buysse, C.F. Reynolds, T.H. Monk, S.R. Berman, and D.J. Kupfer. 1989. The Pittsburgh Sleep Quality Index (PSQI): A New Instrument for Psychiatric Research and Practice. Psychiatry Research 28, 2 (1989), 193–213. http://uacc.arizona.edu/sites/default/files/psqi_sleep_questionnaire_1_pg.pdf
- Carroll and Chang (1970) J Douglas Carroll and Jih-Jie Chang. 1970. Analysis of individual differences in multidimensional scaling via an N-way generalization of “Eckart-Young” decomposition. Psychometrika 35, 3 (1970), 283–319.
- Cichocki (2009) Andrzej Cichocki. 2009. Nonnegative matrix and tensor factorizations: applications to exploratory multi way data analysis and blind source separation. John Wiley and Sons.
- Cichocki (2014) Andrzej Cichocki. 2014. Era of Big Data Processing: A New Approach via Tensor Networks and Tensor Decompositions. Computing Research Repository (August 2014), 1–30. https://arxiv.org/abs/1403.2048
- Cohen et al. (1983) S. Cohen, T. Kamarck, and R. Mermelstein. 1983. A Global Measure of Perceived Stress. Journal of Health and Social Behaviour 24, 4 (1983), 385–396. https://das.nh.gov/wellness/Docs/Percieved%20Stress%20Scale.pdf
- Dauwels et al. (2012) Justin Dauwels, Lalit Garg, Arul Earnest, and Leong K. Pang. 2012. Tensor Factorization for Missing Data Imputation in Medical Questionnaires. 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (March 2012), 2109–2112. https://doi.org/10.1109/ICASSP.2012.6288327
- Diener et al. (2009) Ed Diener, Derrick Wirtz, William Tov, Chu Kim-Prieto, Dong-won Choi, Shigehiro Oishi, and Robert Biswas-Diener. 2009. New Well-being Measures: Short Scales to Assess Flourishing and Positive and Negative Feelings. Social Indicators Research 39 (2009), 247–266. https://internal.psychology.illinois.edu/~ediener/Documents/FS.pdf
- Dror et al. (2014) Ben-Zeev Dror, Campbell Andrew, Chen Fanglin, Chen Zhenyu, Li Tianxing, Wang Rui, Zhou Xia, Harari Gabriella, and Tignor Stefanie. 2014. StudentLife Dataset - Dartmouth College. http://studentlife.cs.dartmouth.edu. (2014).
- Eagle and Pentland (2006) Nathan Eagle and Alex Pentland. 2006. Reality Mining: Sensing Complex Social Systems. Personal and Ubiquitous Computing 10, 4 (May 2006), 255–268. https://doi.org/10.1007/s00779-005-0046-3
- Harari et al. (2017) Gabriella M Harari, Samuel D Gosling, Rui Wang, Fanglin Chen, Zhenyu Chen, and Andrew T Campbell. 2017. Patterns of behavior change in students over an academic term: A preliminary study of activity and sociability behaviors using smartphone sensing methods. Computers in Human Behavior 67 (2017), 129–138.
- Harshman (1970) Richard A Harshman. 1970. Foundations of the PARAFAC procedure: models and conditions for an “explanatory” multimodal factor analysis. (1970).
- John and Srivastava (1999) Oliver P. John and Sanjay Srivastava. 1999. The Big-Five trait taxonomy: History, measurement, and theoretical perspectives. Handbook of Personality 2 (1999), 102–138. http://fetzer.org/sites/default/files/images/stories/pdf/selfmeasures/Personality-BigFiveInventory.pdf
- Kazis et al. (2006) L.E. Kazis, D.R. Miller, K.M. Skinner, A. Lee, X.S. Ren, J.A. Clark, W.H. Rogers, A. Spiro III, A. Selim, M. Linzer, S.M. Payne, D. Mansell, and B.G. Fincke. 2006. Applications of Methodologies of the Veterans Health Study in the VA Health Care System: Conclusions and Summary. The Journal of Ambulatory Care Management 29, 2 (2006), 182–188. https://www.aaos.org/uploadedFiles/PreProduction/Quality/Measures/Veterans%20RAND%2012%20(VR-12).pdf
- Kroenke et al. (2001) Kurt Kroenke, Robert L. Spitzer, and Janet B.W. Williams. 2001. The PHQ-9: Validity of a Brief Depression Severity Measure. Journal of General Internal Medicine 16, 9 (2001), 606–613.
- Rui et al. (2014) Wang Rui, Fanglin Chen, Zhenyu Chen, Tianxing Li, Gabriella Harari, Stefanie Tignor, Xia Zhou, Dror Ben-Zeev, and Andrew T. Campbell. 2014. StudentLife: Assessing Mental Health, Academic Performance and Behavioral Trends of College Students using Smartphones. In Proceedings of the ACM Conference on Ubiquitous Computing (2014). http://studentlife.cs.dartmouth.edu/
- Russell et al. (1978) Dan Russell, Letitia Anne Peplau, and Mary Lund Ferguson. 1978. Developing A Measure of Loneliness. Journal of Personality Assessment 42 (1978), 290–294. http://fetzer.org/sites/default/files/images/stories/pdf/selfmeasures/Self_Measures_for_Loneliness_and_Interpersonal_Problems_UCLA_LONELINESS.pdf
- Sano et al. (2015) Akane Sano, Andrew J. Phillips, Amy Z. Yu, Andrew W. McHill, Sara Taylor, Natasha Jaques, Charles A. Czeisler, Elizabeth B. Klerman, and Rosalind W. Picard. 2015. Recognizing Academic Performance, Sleep Quality, Stress Level, and Mental Health using Personality Traits, Wearable Sensors and Mobile Phones. 2015 International Conference on Wearable Implantable Body Sensor Networks (June 2015), 1–13. https://doi.org/10.1109/BSN.2015.7299420
- Sapienza et al. (2017) Anna Sapienza, Alessandro Bessi, and Emilio Ferrara. 2017. Non-negative Tensor Factorization for Human Behavioral Pattern Mining in Online Games. arXiv preprint arXiv:1702.05695 (2017).
- Wang et al. (2014) Rui Wang, Fanglin Chen, Zhenyu Chen, Tianxing Li, Gabriella Harari, Stefanie Tignor, Xia Zhou, Dror Ben-Zeev, and Andrew T. Campbell. 2014. StudentLife: Assessing Mental Health, Academic Performance and Behavioral Trends of College Students using Smartphones. UbiComp ’14 Proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing (September 2014), 3–14. 10.1145/2632048.2632054
- Wang et al. (2015) Rui Wang, Gabriella Harari, Peilin Hao, Xia Zhou, and Andrew T Campbell. 2015. SmartGPA: how smartphones can assess and predict academic performance of college students. In Proceedings of the 2015 ACM international joint conference on pervasive and ubiquitous computing. ACM, 295–306.
- Watson et al. (1988) D. Watson, L.A. Clark, and A. Tellegen. 1988. Development and Validation of Brief Measures of Positive and Negative Affect: The PANAS Scales. Journal of Personality and Social Psychology 54, 6 (1988), 1063. https://booksite.elsevier.com/9780123745170/Chapter%203/Chapter_3_Worksheet_3.1.pdf