Quantified Self Meets Social Media: Sharing of Weight Updates on Twitter
An increasing number of people use wearables and other smart devices to quantify various health conditions, ranging from sleep patterns, to body weight, to heart rates. Of these “Quantified Selfs” many choose to openly share their data via online social networks such as Twitter and Facebook. In this study, we use data for users who have chosen to connect their smart scales to Twitter, providing both a reliable time series of their body weight, as well as insights into their social surroundings and general online behavior. Concretely, we look at which social media features are predictive of physical status, such as body weight at the individual level, and activity patterns at the population level. We show that it is possible to predict an individual’s weight using their online social behaviors, such as their self-description and tweets. Weekly and monthly patterns of quantified-self behaviors are also discovered. These findings could contribute to building models to monitor public health and to have more customized personal training interventions.
While there are many studies using either quantified self or social media data in isolation, this is one of the few that combines the two data sources and, to the best of our knowledge, the only one that uses public data.
1 Introduction
During the last couple of years, the number of users of “Quantified Self” (QS) health tracking devices has continuously increased (http://nuviun.com/digital-health/quantified-self). A survey of 1,262 U.S. adult consumers conducted in December 2014 found that 31% use a QS tool to track their health and fitness (http://quantifiedself.com/docs/RocketFuel_Quantified_Self_Research.pdf). In lockstep with this proliferation of QS tools, research output on all related aspects has also seen a dramatic increase. Google Scholar lists 74 publications with “Quantified Self” in the title for all years up to and including 2013 (https://scholar.google.com/scholar?q=intitle%3A"quantified+self"&as_yhi=2013, last accessed on Jan 8, 2016). Since 2014 alone, however, 123 publications matching this criterion have already been indexed (https://scholar.google.com/scholar?q=intitle%3A"quantified+self"&as_ylo=2014, last accessed on Jan 8, 2016). Most of this research, though, looks at QS data in isolation, separately from other data one might obtain for a user.
In this paper we present results from an attempt to combine QS data with social media data. This link between two data sources is made possible as more and more users publicly share the QS data they generate. For our study, we use data from users who have chosen to connect their smart scale to their public Twitter stream.
Concretely, we analyze data for users who have opted in to connect their internet-enabled Withings Smart Body Analyzer (http://www2.withings.com/us/en/products/smart-body-analyzer) to their Twitter account. Figure 1 shows an anonymised example tweet. We analyze data for 897 Twitter users who (i) have auto-generated fitness tweets from Withings, and possibly from apps such as Fitbit or Nike, and (ii) also have “normal” tweets not generated by fitness apps.
We want to know whether there is a link between these two data sources and, for instance, whether a user’s weight can be inferred from their general social media behavior. Do users with a larger body weight somehow tweet differently? If a link can be found, then this opens up new opportunities as it hints at the “clinical relevance” of social media data. In particular, we envision that social media could be used as one building block, together with QS data and electronic health records, to devise more personalized, holistic interventions that take a user’s life style into account. For example, a doctor could be provided with information about the personality of the individual from their tweets, or about the influences on the individual from their social circles that need to be taken into account (and, in some cases, overcome) when understanding and prescribing diet and exercise interventions.
Combining QS data and social media data also helps to overcome one frequent shortcoming of health-related studies: a lack of individual-level ground truth. While county-level health statistics, such as obesity rates, are readily available, it is much harder to obtain a set of Twitter users with a known weight, ideally also traceable over time. Despite the obvious limitations due to selection bias, users who link their QS data to their public social media account still provide a valuable data set.
At the same time, there are technical challenges that need to be addressed for a successful data fusion. Social media data is famously noisy due to internet lingo, spam and bots, and data incompleteness resulting from API limitations. The QS data also has its share of issues: users share their scale with friends (we even observed an instance where, apparently, the weight of a cat was recorded), or they might weigh themselves both before and after eating a large meal or going to the restroom. The combined data creates additional challenges due to its heterogeneous nature: a textual stream (and more) from normal tweets, and a time series of weight data from QS tweets. We help address some of these issues by describing a method to remove implausible weigh-in data points.
Apart from technical challenges, the ethical challenges are at least as daunting. Though legally “public” data, tweets still often contain information that many would consider “private”. This is arguably due to a misconception of the perceived audience (i.e., a user’s Twitter followers) vs. the actual audience (i.e., data scientists around the world). Such concerns are amplified for the domain of medical data.
We believe that tackling these challenges is well worth the effort, especially as our initial results are promising. Concretely, we find that social media data can predict a user’s average weight with a correlation coefficient of about 0.55 (Table 3). We also find that the QS data collected through the Twitter stream is valuable by itself for population-level health analysis. For example, it lets us paint a picture of weight transition across the year (yes, it goes up over Christmas and New Year) and of “dieting morale” across the week: users weigh themselves most often on weekends which, ironically, is when they are least likely to do something about their weight. Given that we would not have been able to collect this data otherwise, this is a further advantage of looking at the intersection between QS and social media.
2 Related Work
To the best of our knowledge, there is very little prior work that combines social media and QS data at the level of the individual. The vision described by Estrin explicitly includes a combination of data sources, but no study on such data seems to have been performed to date. Vickey and Breslin report a system-level study of how fitness app data is shared on Twitter, but they do not include a user-level study that links a user’s normal Twitter data with their fitness data. The StudentLife Project (http://studentlife.cs.dartmouth.edu/) uses a mobile phone app to collect detailed activity data, which is linked to academic performance. This data also includes Facebook profiles, though these are not part of the publicly shared data set.
Concerning work more closely related to obesity and weight loss, studies using social media typically take a population-level, public health approach. Culotta used geo-tagged tweets and Abbar et al. used food-related tweets to predict geographical differences in obesity and diabetes. Though including “normal life” in their analysis, they use county-level data as “ground truth” for obesity. By using quantified self data, we can obtain weight-related information at the individual level.
There is also a body of work that studies specialized social media, such as online weight loss forums. Particular attention has been given to predicting weight loss from interactions in the social network. Chomutare et al. showed that high levels of activity in online obesity communities and being connected to several disparate sub-communities were both predictive of weight loss. They observed that the network structure properties were more useful in predicting weight loss than the biographical information associated with the users. Li et al. take this a step further by studying the problem of recommending a “good” friend within the context of a weight loss app. Brindal et al. showed that the inclusion of a social networking platform did not have additive effects with respect to weight loss or retention. However, these inclusions resulted in patients using their weight loss system for a longer duration. Moreover, in their experiments, greater use of a weight tracker tool resulted in greater weight loss. Though our work only looks at the combination of social media and QS for data collection, their work provides evidence for benefits for health interventions as well.
3 Data Collection
To collect our data, we used the Twitter Streaming API (https://dev.twitter.com/streaming/overview) for three weeks in October 2015, collecting tweets containing the keywords “lb” or “kg”. Note that this broad pattern captures both weight-related QS tweets and other tweets in several languages. We also used the Topsy API (https://otter.topsy.com/search.json?q=kg+lb, no longer supported) to gather all obtainable historical tweets containing these keywords. These tweets were then post-filtered such that only tweets generated by “WiTwit” were kept (“WiTwit” is the “source” field used in tweets generated by Withings’ smart scale). For each unique user who generated these tweets, we obtained (i) (up to) 3,200 of their most recent tweets, (ii) their self-generated profile known as “bio”, (iii) the lists of their friends and followers, and (iv) the bios of their friends and followers.
Inspecting the data, we observed that a large fraction of users only tweeted their weight or other automatic fitness tweets. These specially created accounts, potentially serving as a sort of personal fitness log, were not of interest to us as, apart from a time series of weigh-ins, there was no other social media data to better understand the user. We therefore imposed an additional filter by requiring each user to have (i) at least ten “normal” tweets not automatically generated by a fitness app such as WiTwit or Fitbit (the full list of apps considered for this is MyFitnessPal, Fitbit, Withings, Lose It! and Nike), and (ii) at least ten weigh-in tweets automatically generated by WiTwit.
As we additionally wanted to make sure that users, at least potentially, have social interactions on Twitter, we further required all users in the individual-level analysis to have at least 50 friends and at least 50 followers. The cutoff of 50 was chosen based on manual inspection. For example, a specific user with 60 friends and 41 followers, just below the cutoff, only published 269 tweets, including only 30 normal ones. Another user with 657 friends and 55 followers, just above the cutoff, published 837 tweets, including 746 normal ones. In total, we excluded 467 users, leaving 430 users after filtering for social interactions.
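The inclusion criteria above can be sketched as two simple predicates. This is a minimal illustration, not the authors' code: the `UserStats` record and its field names are hypothetical, and reading the cutoff as "at least 50 friends and at least 50 followers" is an interpretation based on the two example users in the text.

```python
from dataclasses import dataclass

# Thresholds taken from the text; everything else here is illustrative.
MIN_NORMAL_TWEETS = 10
MIN_WEIGH_IN_TWEETS = 10
MIN_FRIENDS = 50
MIN_FOLLOWERS = 50


@dataclass
class UserStats:
    """Hypothetical per-user summary; field names are assumptions."""
    normal_tweets: int    # tweets not generated by a fitness app
    weigh_in_tweets: int  # tweets generated by WiTwit
    friends: int
    followers: int


def has_enough_tweets(u: UserStats) -> bool:
    """Filter on tweet counts: enough normal and enough weigh-in tweets."""
    return (u.normal_tweets >= MIN_NORMAL_TWEETS
            and u.weigh_in_tweets >= MIN_WEIGH_IN_TWEETS)


def has_social_context(u: UserStats) -> bool:
    """Filter on social ties: enough friends and enough followers."""
    return u.friends >= MIN_FRIENDS and u.followers >= MIN_FOLLOWERS


# The two borderline users from the text (weigh-in counts are made up).
below = UserStats(normal_tweets=30, weigh_in_tweets=20, friends=60, followers=41)
above = UserStats(normal_tweets=746, weigh_in_tweets=20, friends=657, followers=55)
```

With these definitions, `has_social_context(below)` is false because of the 41 followers, while `above` passes both filters.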
3.1 Identification of Fitness Tweets Generated By Fitness Apps
As mentioned earlier, a large fraction of users in our data set had additional automatically-generated fitness tweets, apart from the ones from Withings. We collected these tweets separately as they hold additional, valuable QS information. In order to identify these tweets, we check the source field of each individual tweet. The source field indicates the tool used to post the Tweet. For auto-generated tweets, the source field provides the name and the URL of the corresponding app, such as WiTwit, Runkeeper, or Fitbit. Table 1 shows the patterns of automatic fitness tweets we have used in this paper. 300 out of 897 users in our dataset have at least one automatic fitness tweet.
|Category|Source app|
|Original Weight Loss|WiTwit|
|Other Weight Loss|Lose It!, SimpleWeight|
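A minimal sketch of source-based tweet classification follows. In the Twitter REST API the `source` field is typically an HTML anchor such as `<a href="..." rel="nofollow">App Name</a>`, so the sketch strips the markup first. The app names come from the text, but the exact source strings each app emits are not given there; the lowercase substring matching and the three label names are assumptions.

```python
import re

_TAG = re.compile(r"<[^>]+>")

# App names mentioned in the text; matching by lowercase substring is an
# assumption, since the exact source strings are not listed.
WEIGH_IN_APP = "witwit"
OTHER_FITNESS_APPS = ("fitbit", "runkeeper", "lose it!", "simpleweight",
                      "myfitnesspal", "nike")


def app_name(source: str) -> str:
    """Strip the HTML anchor markup from a tweet's source field."""
    return _TAG.sub("", source).strip()


def classify_tweet(source: str) -> str:
    """Return 'weigh_in', 'fitness', or 'normal' (hypothetical labels)."""
    name = app_name(source).lower()
    if WEIGH_IN_APP in name:
        return "weigh_in"
    if any(app in name for app in OTHER_FITNESS_APPS):
        return "fitness"
    return "normal"
```

For example, `classify_tweet('<a href="http://www.withings.com" rel="nofollow">WiTwit</a>')` yields `"weigh_in"`, while a tweet sent from a regular client is labeled `"normal"`.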
4 Linking Social Media Behavior and Quantified-Self Data at the Individual Level
In this section, we use users’ online social activities to predict their body weight. Weight is measured by Withings scales, and we assign each person the average of all of their recorded weigh-ins as their reference weight. Two types of Twitter data are used as features to predict this weight: (i) their self-description (also known as bio) and (ii) their tweets. All non-English content was translated to English using Google’s machine translation (https://translate.google.com).
Upon inspection of the data, we found that some users share their Withings scales with family members. To detect and clean such weigh-in series generated by multiple people, we apply a formula for “plausible weight transitions”: for a given user, a weight transition from weight w(i) to w(i+1) [in pounds] between days d(i) and d(i+1) is “plausible” if $|w(i+1) - w(i)| \le 4 + (d(i+1) - d(i))$. In words, we allow for up to 4lb of weight fluctuation within one day and 1lb for each day passed. Note that 1lb of body fat is roughly equivalent to 3,500kcal. Though larger fluctuations are possible, especially due to excessively storing or losing liquid, we decided to err on the side of caution, rather than including too much erroneous data. Users with more than three plausibility violations were excluded. In addition, we observed that some users reported suspiciously low or high weights, such as 12lb or 400lb. Therefore, users whose average weight is either smaller than 100lb or larger than 300lb are treated as outliers and excluded from our analysis. As a point of reference, the average of the individual average weights was 178.4lb. Note that this is very close to the 2012 average weight of adults in North America of 177.9lb (https://en.wikipedia.org/wiki/Human_body_weight#By_region), indicating that our data might not be as odd and biased as one might imagine. After applying the data filtering explained above, we use the remaining 391 users to build a prediction model.
In order to capture and summarize a user’s social media content, we use two existing dictionaries that have undergone psychometric validation. The first is the Linguistic Inquiry and Word Count (LIWC) dictionary with 64 categories (we use the LIWC2007 dictionary in this paper), and the second is the PERMA dictionary with ten categories (PERMA is a mnemonic for Positive emotion, Engagement, Relationships, Meaning, and Achievement, the five elements of well-being). Both dictionaries map terms to a set of categories such as “social”, “health” and “body” in the case of LIWC, or “positive emotion”, “engagement” and “meaning” in the case of PERMA. For example, PERMA maps the term “distract” to “negative emotion”, and LIWC maps the term “brother” to “social”. We applied this mapping both to a user’s normal tweets (automatically generated fitness tweets and weigh-in tweets are excluded) and to their bio. To boost model performance, we also add bag-of-words features.
In order to quantify and interpret the effects of different indicators, we fit a support vector machine with a linear kernel to predict each user’s weight. All social activity features (but not the weight itself) were linearly min-max scaled to [0,1]. The top 15 indicators of this model for each direction are shown in Table 2.
Given the model in Table 2, it is worth looking at which Twitter features are most predictive of a person’s weight. People with a higher actual weight mentioned more ingest words (Tweet_LIWC_ingest), such as food, dish, and eat, in their self description. This might suggest that people who publicly express their love for food have a higher probability of being overweight. In addition, users with a lower weight use more words regarding biological processes (Tweet_LIWC_bio), such as “eat” or “body”, than their heavier counterparts. Previous research shows that successful weight management is linked to health awareness (http://www.health.harvard.edu/exercise-and-fitness/lose-weight-and-keep-it-off), which matches our findings. We observed a number of other top indicators (such as for the categories ppron, social or affect), but these are admittedly hard to interpret. We hope that our observations help other researchers to form hypotheses around these to test in more depth.
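The cleaning rule described above (up to 4lb of fluctuation within a day plus 1lb per day elapsed, at most three violations, and an average weight between 100lb and 300lb) can be sketched as follows. Comparing consecutive weigh-ins and the function names are assumptions; the text does not say how violating points are handled for users who are kept.

```python
from datetime import date


def is_plausible(w_prev, w_next, d_prev, d_next,
                 same_day_lb=4.0, per_day_lb=1.0):
    """True if the weight change (lb) fits 4lb plus 1lb per elapsed day."""
    days = abs((d_next - d_prev).days)
    return abs(w_next - w_prev) <= same_day_lb + per_day_lb * days


def count_violations(series):
    """series: list of (date, weight_lb) tuples sorted by date."""
    return sum(
        not is_plausible(w0, w1, d0, d1)
        for (d0, w0), (d1, w1) in zip(series, series[1:])
    )


def keep_user(series, max_violations=3, min_avg=100.0, max_avg=300.0):
    """Apply both exclusion rules from the text to one user's weigh-ins."""
    if not series:
        return False
    avg = sum(w for _, w in series) / len(series)
    return (count_violations(series) <= max_violations
            and min_avg <= avg <= max_avg)


# A scale apparently shared by two people alternates between two weights.
shared = [(date(2015, 10, d), 180.0 if d % 2 else 130.0) for d in range(1, 7)]
steady = [(date(2015, 10, d), 180.0 + 0.5 * d) for d in range(1, 7)]
```

Here `shared` produces five violations and is dropped, while `steady` has none and is kept.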
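To illustrate the dictionary mapping, the sketch below turns a piece of text into per-category relative frequencies. The three-entry lexicon is a toy stand-in built from the examples in the text; the real LIWC and PERMA dictionaries are far larger and are not reproduced here. Normalizing by token count and the simple regex tokenizer are assumptions.

```python
import re
from collections import Counter

# Toy stand-in for LIWC/PERMA, using only terms mentioned in the text.
TOY_LEXICON = {
    "brother": ["social"],
    "distract": ["negative emotion"],
    "eat": ["ingest", "bio"],
}


def category_features(text, lexicon=TOY_LEXICON):
    """Return {category: share of tokens that fall into that category}."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return {}
    counts = Counter()
    for tok in tokens:
        for cat in lexicon.get(tok, ()):
            counts[cat] += 1
    return {cat: n / len(tokens) for cat, n in counts.items()}
```

Applied once to a user's concatenated normal tweets and once to their bio, this would yield the two feature groups (tweet-based and bio-based) used by the model.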
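Min-max scaling of a single feature column can be sketched as below. Mapping a constant column to zeros is a choice made here to avoid division by zero; the text does not say how such columns were treated.

```python
def min_max_scale(column):
    """Linearly rescale a list of numbers to the interval [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Constant feature: no spread to rescale (assumed handling).
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]
```

Because every feature then lies in the same range, the coefficients of a linear model become roughly comparable across features, which is what makes a ranking of top indicators meaningful.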
|Model|R|MAE (lb)|RMSE (lb)|
|Gaussian Process + BoW|0.55|23.00|29.04|
|SVM (Linear Kernel)|0.34|26.65|33.31|
|Gaussian Process + BoW|0.55|22.96|28.99|
|SVM (Linear Kernel)|0.34|26.81|33.47|
Table 3 shows the weight prediction performance. We evaluate model performance using three measures: the correlation coefficient (R), the Mean Absolute Error (MAE) and the Root Mean Squared Error (RMSE). The MAE is the average of the absolute errors, and the RMSE is the square root of the mean squared error, which weights large errors more heavily. All measures are evaluated for 391 users using 10-fold cross validation. Specifically, the baseline shows model performance without Tweet and Bio features, and the language split model is built by splitting the data by language (English vs. Japanese).
Initially, using only information from normal tweets, the Gaussian Process model explains around 25% of the variance in average weight. This indicates that there is a link between QS data and social media data that is worth exploring. When the self-description data is added, the performance (Table 3) drops slightly.
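The three measures and a 10-fold split can be written out in a few lines of plain Python. This is a sketch for reference only: the fold assignment below is a simple unshuffled round-robin, which is an assumption, since the text does not describe how folds were drawn.

```python
import math


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between true and predicted values."""
    n = len(y_true)
    mt, mp = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mt) ** 2 for t in y_true)
    var_p = sum((p - mp) ** 2 for p in y_pred)
    if var_t == 0 or var_p == 0:
        return 0.0  # correlation is undefined for constant input
    return cov / math.sqrt(var_t * var_p)


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def k_fold_indices(n, k=10):
    """Yield (train, test) index lists; fold sizes differ by at most one."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]
```

With 391 users and `k=10`, one fold holds 40 users and the other nine hold 39 each.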
5 Using Quantified Self Data at the Population Level
So far, all of our analysis has linked QS and social media data at the individual level. Here, we look at population-level patterns that can be obtained by using the QS weight information obtained through Twitter. Specifically, we explore the patterns of quantified-self data across days of the week and months of the year on the larger dataset of 897 users.
5.1 Trends in QS Behavior Across Days-of-Week
Table 4 reports the frequency of automatically generated weigh-in and fitness tweets in our dataset. For comparison, we also show Google search volumes (https://www.google.com/trends/explore) for the three queries “BMI”, “weight loss” and “diet”, summarized by day of week from September 2015 to November 2015.
There are clear weekly patterns detectable for both the QS and the Google Trends data. Put simply, users are most aware of their weight on Saturdays with, by far, the largest number of weigh-ins. However, this is also the day when they are least likely to “take action” as defined by (i) generating fitness tweets or (ii) searching for diet-related information. By contrast, “corrective action” seems to be most likely on Mondays. Consistent observations were made by Weber and Achananuparp, who observed that the number of users logging their meals with MyFitnessPal is highest (lowest) on Mondays (Saturdays). Of those users that log their meals, the fraction consuming more than their self-set calorie goals is also lowest (highest) on Mondays (Saturdays). Based on this one could say that Saturday is everyone’s “cheat day”.
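Counting weigh-ins per day of the week reduces to a simple aggregation over tweet timestamps, sketched below. The function name is hypothetical. One caveat that the text does not address: tweet timestamps are usually in UTC, so assigning a tweet to a local weekday would additionally require the user's time zone.

```python
from collections import Counter
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def weekday_counts(timestamps):
    """Count timestamps per weekday; days without data get a zero."""
    counts = Counter(ts.weekday() for ts in timestamps)  # Monday == 0
    return {DAY_NAMES[i]: counts.get(i, 0) for i in range(7)}


# Three made-up weigh-in times in October 2015.
example = [datetime(2015, 10, 3, 8), datetime(2015, 10, 10, 9),
           datetime(2015, 10, 5, 7)]
```

Running the same aggregation separately over weigh-in tweets and fitness tweets gives the two series that are contrasted in the weekly analysis.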
5.2 Trends in QS Behavior Across Months-of-Year
Whereas the previous section looked at weigh-in trends across a week, here we look at actual weight changes across a year. For each person, we compare their average weight within a given month to their global average weight. For each month, these weight changes are then averaged across all users. The mean and error bars of the weight changes for different months are shown in Figure 2. The monthly patterns show that users gain weight during winter, from a low in October to a high in January, before starting to lose weight again. Though the observed jump of just under 1lb during the holiday season might appear lower than intuition would suggest, this value is perfectly in line with the results of a meta-analysis of weight gain over Christmas (also see http://tinyurl.com/z7r4c5s for more information on this topic).
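The monthly aggregation described above can be sketched as follows. Two details are assumptions: the "global average" is taken as the mean of all of a user's weigh-ins, and the same calendar month is pooled across different years.

```python
from collections import defaultdict
from datetime import date


def monthly_deviation(user_series):
    """user_series: {user_id: [(date, weight_lb), ...]}.

    Returns {month: mean over users of (monthly avg - user's global avg)}.
    """
    per_month = defaultdict(list)
    for series in user_series.values():
        if not series:
            continue
        global_avg = sum(w for _, w in series) / len(series)
        by_month = defaultdict(list)
        for d, w in series:
            by_month[d.month].append(w)
        for month, weights in by_month.items():
            per_month[month].append(sum(weights) / len(weights) - global_avg)
    return {m: sum(v) / len(v) for m, v in sorted(per_month.items())}


# Two made-up users: one gains 2lb from October to January, one is stable.
users = {
    "a": [(date(2015, 10, 1), 180.0), (date(2016, 1, 1), 182.0)],
    "b": [(date(2015, 10, 5), 150.0), (date(2016, 1, 5), 150.0)],
}
```

For the two example users, the result is a deviation of -0.5lb in October and +0.5lb in January.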
Similar to our week-based analysis, we wanted to see if there is a link between QS data and topical interest as observed through Google Trends. Figure 3 presents the mean and the error bar of Google search scores aggregated by month from 2005 to 2015. For all three of our search terms, search activity is highest in January, possibly due to New Year’s resolutions. Overall, the search volume changes more abruptly from December to January than the actual weight (see Figure 2). So whereas users slowly put on pounds from October to January, there appears to be a sudden change in weight loss intent from December to January – assuming that the selected Google search terms do indeed measure “weight loss intent”.
6 Discussion and Limitations
The data analyzed for this study does not come from a randomized trial or from a representative sample of the population. Users who choose to publicly tweet their weight are likely to differ from a “normal” user trying to lose weight, though (i) our population’s average weight and (ii) the weight gain over Christmas were surprisingly close to known values. Weber and Mejova show that, with a certain amount of noise, a user’s body weight or at least classes such as “overweight or not” can be inferred from their Twitter profile pictures. We are not relying on such noisy labels; in effect, we are trading a loss in recall for an increase in precision.
A considerable fraction of users, 198 out of 391, had chosen Japanese as their interface language and, correspondingly, many of their tweets were not in English. Google’s automatic translation might introduce errors, though for tasks such as sentiment analysis machine translation typically performs sufficiently well.
7 Conclusion
In this paper, we presented a study that combines quantified self data from internet-enabled smart scales with general social media data on Twitter. We used this combination of data sources to predict a user’s weight using only their social media activity. Our data also capture weekly patterns, such as a peak of weigh-in activity on Saturday, and monthly patterns, such as a weight increase over Christmas. We believe that such a data fusion between messy, general life style social media data and very accurate, longitudinal quantified self data has great potential to improve personalized health care.
This is a preprint of an article appearing at ACM DigitalHealth 2016.
References
- S. Abbar, Y. Mejova, and I. Weber. You tweet what you eat: Studying food consumption through twitter. In CHI, pages 3197–3206, 2015.
- C. Banea, R. Mihalcea, J. Wiebe, and S. Hassan. Multilingual subjectivity analysis using machine translation. In EMNLP, pages 127–135, 2008.
- E. Brindal, J. Freyne, I. Saunders, S. Berkovsky, G. Smith, and M. Noakes. Features predicting weight loss in overweight or obese participants in a web-based intervention: randomized trial. J. Med. Internet Res., 14(6):e173, 2012.
- T. Chomutare, A. Xu, and M. Iyengar. Social network analysis to delineate interaction patterns that predict weight loss performance. In CBMS, pages 271–276, 2014.
- A. Culotta. Estimating county health statistics with twitter. In CHI, pages 1335–1344, 2014.
- D. Estrin. Small data, where n = me. CACM, 57(4):32–34, 2014.
- H. Haddadi, F. Ofli, Y. Mejova, I. Weber, and J. Srivastava. 360-degree quantified self. In ICHI, pages 587–592, 2015.
- A. Li, E. W. Ngai, and J. Chai. Friend recommendation for healthy weight in social networks: A novel approach to weight loss. Industrial Management & Data Systems, 115(7):1251–1268, 2015.
- V. Li, D. W. McDonald, E. V. Eikey, J. Sweeney, J. Escajeda, G. Dubey, K. Riley, E. S. Poole, and E. B. Hekler. Losing it online: Characterizing participation in an online weight loss community. In GROUP, pages 35–45, 2014.
- J. Pennebaker, J. Francis, and R. Booth. Linguistic inquiry and word count: Liwc 2001, 2007. http://homepage.psy.utexas.edu/homepage/faculty/pennebaker/reprints/liwc2007_operatormanual.pdf.
- M. Seligman. Flourish: A Visionary New Understanding of Happiness and Well-being. Atria Books, 2012.
- T. A. Vickey and J. G. Breslin. A study on twitter usage for fitness self-reporting via mobile apps. In AAAI-SS, 2012.
- C. Vorland. Holidays & weight gain: what the science suggests, 2011. http://nutsci.org/2011/11/22/holidays-weight-gain-what-the-science-suggests/.
- R. Wang, F. Chen, Z. Chen, T. Li, G. Harari, S. Tignor, X. Zhou, D. Ben-Zeev, and A. T. Campbell. Studentlife: Assessing mental health, academic performance and behavioral trends of college students using smartphones. In UbiComp, pages 3–14, 2014.
- I. Weber and A. P. Achananuparp. Insights from machine-learned diet success prediction. In PSB, pages 540–551, 2016.
- I. Weber and Y. Mejova. Crowdsourcing health labels: Inferring body weight from profile pictures. 2016.