Assessing public health interventions using Web content
Public health interventions are a fundamental tool for mitigating the spread of an infectious disease. However, it is not always possible to obtain a conclusive estimate for the impact of an intervention, especially in situations where the effects are fragmented in population parts that are under-represented within traditional public health surveillance schemes. To this end, online user activity can be used as a complementary sensor to establish alternative measures. Here, we provide a summary of our research on formulating statistical frameworks for assessing public health interventions based on data from social media and search engines (Lampos et al., 2015 (21); Wagner et al., 2017 (38)). Our methodology has been applied in two real-world case studies: the 2013/14 and 2014/15 flu vaccination campaigns in England, where school-age children were vaccinated in a number of locations aiming to reduce the overall transmission of the virus. Disease models from online data combined with historical patterns of disease prevalence across different areas allowed us to quantify the impact of the intervention. In addition, a qualitative evaluation of our impact estimates demonstrated that they were in line with independent assessments from public health authorities.
Data generated directly or indirectly by online users, also simply referred to as user-generated content (UGC), can reveal a significant amount of information about their offline behaviour and status. In fact, many recent research efforts have leveraged social media content or search engine usage to address interesting questions in a number of domains, ranging from the Social Sciences (2; 9; 13) to Psychology (14; 24; 36) and Health (5; 10; 19).
Focusing on health-oriented applications, one of the most prominent research tasks has been the derivation of Web-based syndromic surveillance models for infectious diseases. Modelling influenza-like illness (ILI) rates was the first successful example (7; 10; 18; 32), followed by other conditions (4; 11; 35), including mental health disorders (3; 5). Criticisms regarding the accuracy of the original disease models (23; 28) have been addressed in follow-up studies by deploying more elaborate approaches (15; 20; 22). One of the key motivations behind all the aforementioned works has been the potential of adopting UGC as a complementary sensor to doctor visits or hospitalisations, which are the main sources of information in traditional public health surveillance networks. Another important factor is that online data could provide access to the bottom of a disease pyramid, i.e. cases of infection present within specific demographic groups that are not well represented otherwise.
In this work, we go beyond disease modelling by proposing a statistical framework for assessing the impact of a health intervention (against an infectious disease) based on online information. Public health interventions, such as improved sanitation, immunisation programmes or, simply, the promotion of health literacy, assist in reducing the risk of various infections (6; 27). However, the absence of routine evaluation systems for such interventions together with the general deficiencies of the existing disease surveillance schemes (e.g. under-represented parts of the populations), enables only partial assessments, especially in situations where interventions are targeting a seasonal disease that is not characterised by the magnitude of a pandemic.
We evaluate our algorithm against two real-world public health interventions. These are two vaccination campaigns against flu launched in England during 2013/14 (Phase A) and 2014/15 (Phase B). Live attenuated influenza vaccines (LAIV) were administered to school age children in various pilot locations, recognising that children are key factors in the transmission of the influenza virus in the general population (31). In Phase A, the vaccine was offered to primary school children (4-11 years) only (29), whereas in Phase B it was also offered to children from secondary schools (11-13 years) as well as in an expanded set of locations (30).
Data from Microsoft’s search engine, Bing, and the microblogging service of Twitter are used as the main observations for the proposed impact assessment framework. We deploy nonlinear supervised learning techniques using composite Gaussian Process kernels to model the time series of text frequencies in relation to disease rates in the population. We then utilise this disease model to uncover linear relationships between the disease rates in areas of interest during a time period prior to the intervention. Finally, we exploit this relationship to project the disease rates that would have been observed in the affected areas had the intervention not taken place. Our analysis indicates that the intervention reduced ILI rates in Phase A locations and in primary school areas in Phase B (see Table 1). Both estimates are in agreement with independent assessments by Public Health England (PHE) (29; 30), although only in principle, as direct comparisons are not valid.
We briefly describe our approach for modelling disease rates from user-generated text and provide an overview of our statistical framework for assessing the impact of a public health intervention.
The estimation of disease rates from online textual information is formulated as a supervised learning task, $f: \mathbf{X} \rightarrow \mathbf{y}$, where $\mathbf{X} \in \mathbb{R}^{n \times m}$ represents the frequency of $m$ textual terms over $n$ time intervals, and $\mathbf{y} \in \mathbb{R}^{n}$ is the disease rate at the same time intervals (as obtained by a public health authority). Given that nonlinear models tend to outperform linear ones in text regression tasks (20; 17; 33), we composed and applied a Gaussian Process (GP) kernel for capturing the structure of our observations. A GP is defined as a collection of random variables, any finite number of which have a multivariate Gaussian distribution. GP methods aim to learn a function $f: \mathbb{R}^{m} \rightarrow \mathbb{R}$ that is specified through a mean and a covariance (or kernel) function, i.e. $f(\mathbf{x}) \sim \mathcal{GP}(\mu(\mathbf{x}), k(\mathbf{x}, \mathbf{x}'))$, where $\mathbf{x}$ and $\mathbf{x}'$ (both $\in \mathbb{R}^{m}$) denote rows of the input matrix $\mathbf{X}$; for a detailed description of GPs, we refer the reader to (34). By setting $\mu(\mathbf{x}) = 0$, a common practice in GP modelling, we just learn the hyper-parameters of the kernel. We define the following abstract kernel formulation:
$$k(\mathbf{x}, \mathbf{x}') = \left( \sum_{i=1}^{C} k_{v}(\mathbf{c}_{i}, \mathbf{c}_{i}') \right) + k_{N}(\mathbf{x}, \mathbf{x}'),$$
where $k_{v}(\cdot, \cdot)$ can be any compatible GP kernel in the literature (we use the Rational Quadratic and the Matérn covariance functions in (21) and (38) respectively) that is applied on $C$ categories (or clusters) of textual features, $\mathbf{c}_{i}$ and $\mathbf{c}_{i}'$ denote the elements of $\mathbf{x}$ and $\mathbf{x}'$ that belong to category $i$, and $k_{N}(\cdot, \cdot)$ captures noise. The categories of textual features are based on the number of tokens (1 to 4).
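As a rough illustration of the composite kernel described above, the following Python sketch sums one Rational Quadratic kernel per category of textual features and adds a noise term on the diagonal of the covariance matrix. The function names, the grouping of feature indices in `categories`, the toy input and the fixed hyper-parameter values are all assumptions made for this example; in practice the hyper-parameters would be learned from the data, and the sketch should not be read as the implementation used in (21) or (38).

```python
def rational_quadratic(x, y, sigma=1.0, length_scale=1.0, alpha=1.0):
    """Rational Quadratic covariance between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return sigma ** 2 * (1.0 + sq_dist / (2.0 * alpha * length_scale ** 2)) ** (-alpha)


def composite_kernel(x, y, categories, noise_var=0.1, same_point=False):
    """Sum one RQ kernel per feature category, plus noise for identical inputs.

    `categories` is a list of index lists, one per feature category
    (hypothetical layout; hyper-parameters are fixed here, not learned).
    """
    total = 0.0
    for idx in categories:
        total += rational_quadratic([x[i] for i in idx], [y[i] for i in idx])
    if same_point:
        # White-noise term: only contributes on the diagonal.
        total += noise_var
    return total


def gram_matrix(X, categories, noise_var=0.1):
    """Covariance matrix over all rows (time intervals) of X."""
    n = len(X)
    return [
        [composite_kernel(X[i], X[j], categories, noise_var, same_point=(i == j))
         for j in range(n)]
        for i in range(n)
    ]


# Toy input: 3 time intervals, 4 term frequencies grouped into 3 categories.
X = [[0.1, 0.0, 0.3, 0.2],
     [0.2, 0.1, 0.3, 0.2],
     [0.0, 0.4, 0.1, 0.5]]
categories = [[0, 1], [2], [3]]
K = gram_matrix(X, categories)
```

Keeping one kernel per category lets each group of features have its own length scale once the hyper-parameters are learned, which is the main reason for summing kernels rather than applying a single kernel to all features.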
| Phase | Data Source | Target Locations | Num. of Control Locations | Disease Reduction Rate (%) |
|---|---|---|---|---|
| A (2013/14) | | All locations | | -32.72 |
| B (2014/15) | | All locations | | |
| | | Primary school cohort | | -16.97 |
| | | Secondary school cohort | | |
| | | Primary & secondary | | |
Our methodology for assessing the intervention’s impact, influenced by the work presented in (16), utilises the above disease rate model. It is presented in detail in Alg. 1. Assume that there is a set of target areas, where the intervention is applied, and a set of control areas, where the intervention has no effect. We first compute disease rate estimates from UGC for all areas as well as for all possible subsets of them. Ideally, for a target area we wish to compare the disease rates during (and slightly after) the intervention with the disease rates that would have occurred had the intervention not taken place. Of course, the latter can only be estimated. Focusing on target-control area pairs with strong linear correlations in historical disease rates prior to the intervention, we hypothesise that this relationship would have been maintained in the absence of an intervention. Therefore, we can learn a linear model that estimates the disease rates in a target area from the disease rates of a control area, using data prior to the intervention. Then, we can use this model to project the disease rates that would have been observed in a target area during the intervention period had the intervention not taken place. Finally, we quantify the impact of the intervention by computing the relative difference (as a percentage) between the actual estimated disease rates (from UGC) and the projected ones. Confidence intervals for this impact estimate can be derived via bootstrap sampling (8), in particular by sampling (with replacement) both the linear regression’s residuals and the input data. Provided that the distribution of the bootstrap estimates is unimodal and symmetric, we assess an outcome as statistically significant if its absolute value is higher than two standard deviations of the bootstrap estimates.
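The projection and significance steps described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a reproduction of Alg. 1: the function names and toy rates are hypothetical, the impact is computed on mean rates over the intervention period, and the bootstrap only resamples pre-intervention target-control pairs, whereas the text also resamples the regression residuals.

```python
import random
import statistics


def fit_line(x, y):
    """Ordinary least squares for y = a * x + b."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, my - a * mx


def impact(control_pre, target_pre, control_post, target_post):
    """Relative difference (%) between observed and projected mean target rates."""
    a, b = fit_line(control_pre, target_pre)
    # Rates the target area would be expected to show without the intervention.
    projected = [a * c + b for c in control_post]
    mean_projected = statistics.fmean(projected)
    return 100.0 * (statistics.fmean(target_post) - mean_projected) / mean_projected


def bootstrap_impact(control_pre, target_pre, control_post, target_post,
                     n_boot=1000, seed=0):
    """Impact estimates from resampling pre-intervention pairs with replacement."""
    rng = random.Random(seed)
    n = len(control_pre)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        c = [control_pre[i] for i in idx]
        t = [target_pre[i] for i in idx]
        if len(set(c)) < 2:
            continue  # a line cannot be fitted to a single distinct x value
        estimates.append(impact(c, t, control_post, target_post))
    return estimates


def is_significant(theta, estimates):
    """Two-standard-deviation rule applied to the bootstrap estimates."""
    return abs(theta) > 2.0 * statistics.stdev(estimates)


# Toy disease rates (hypothetical values, one entry per time interval).
control_pre = [12.0, 18.5, 30.2, 41.0, 25.3, 15.1]
target_pre = [14.1, 20.9, 33.0, 45.2, 27.8, 17.0]
control_post = [20.0, 35.0, 28.0]
target_post = [17.5, 30.1, 24.0]

theta = impact(control_pre, target_pre, control_post, target_post)
estimates = bootstrap_impact(control_pre, target_pre, control_post, target_post)
print(round(theta, 2), is_significant(theta, estimates))
```

A negative value of `theta` corresponds to observed rates that are lower than the projected ones, i.e. a reduction attributed to the intervention under the assumption that the pre-intervention linear relationship would otherwise have held.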
3. Results and Discussion
We first provide a brief overview of the data sets used in our analysis. We then summarise the outcomes of the intervention’s impact assessment in both vaccination campaigns (Phase A and B). Finally, we propose potential directions for future research.
3.1. Data Sets
For the 2013/14 vaccination campaign (Phase A), we considered a set of target and control areas (see Table 1 in (21)). We extracted tweets on the order of millions (May 2011 to April 2014), a subset of which contained flu-related n-grams (the n-grams used are listed in the supplementary material of (21)). We additionally obtained search query data containing flu-related queries; due to user privacy regulations, these cover a shorter time period (December 2012 to April 2014). As the campaign expanded in 2014/15 (Phase B) to include more locations and different groups of school-age children, the number of target locations increased (covering primary, secondary, and primary and secondary school cohorts), with control areas deployed alongside them (see Table 1 in (38)). For this period, we extracted tweets geolocated in England (August 2011 to August 2015), again on the order of millions. This analysis did not use any search engine data. Historical ILI rates at a national level for England were obtained from the Royal College of General Practitioners, expressed as the number of ILI cases relative to population size, from 2011 to 2015.
3.2. Intervention Impact Assessment
A GP, as described in Section 2, was used for modelling ILI rates from UGC since it outperformed linear alternatives, namely ridge regression (12) and elastic net (40). Performance was assessed using 10-fold cross validation, in terms of the mean absolute error (MAE) and the average Pearson correlation between estimated and reported ILI rates, for the Twitter-based model of Phase A and for the model used in Phase B (trained and tested on more data). The model trained on Bing data (Phase A) outperformed the other models on average, but at the same time was tested on a significantly shorter time span. A more detailed performance evaluation is provided in Section 4.1 of (21).
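The evaluation protocol mentioned above (10-fold cross validation scored by MAE and Pearson correlation) can be illustrated as follows. The sketch uses a one-feature least-squares line as a stand-in model purely to keep the example short; it is not the GP model of Section 2, and the contiguous fold layout, function names and synthetic data are assumptions.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    bounds = [round(i * n / k) for i in range(k + 1)]
    return [list(range(bounds[i], bounds[i + 1])) for i in range(k)]


def fit_line(x, y):
    """Stand-in model: least squares for y = a * x + b."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    a = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    return a, my - a * mx


def cross_validate(x, y, k=10):
    """Average MAE and Pearson r over k held-out folds."""
    maes, rs = [], []
    for test_idx in k_fold_indices(len(x), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(x)) if i not in held_out]
        a, b = fit_line([x[i] for i in train_idx], [y[i] for i in train_idx])
        y_hat = [a * x[i] + b for i in test_idx]
        y_true = [y[i] for i in test_idx]
        maes.append(mae(y_true, y_hat))
        rs.append(pearson(y_true, y_hat))
    return sum(maes) / k, sum(rs) / k


# Synthetic example: 40 time intervals with an exactly linear relationship.
x = [float(i) for i in range(40)]
y = [3.0 * xi + 2.0 for xi in x]
avg_mae, avg_r = cross_validate(x, y, k=10)
```

Reporting both metrics is useful because MAE measures the size of the errors in the units of the disease rate, while the correlation indicates how well the estimated time series follows the shape of the reported one.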
To assess the impact of the LAIV campaign, we first needed to identify control areas with estimated ILI rates that were strongly correlated to rates in the target vaccinated locations before the start of the intervention. In doing so for Phase A (2013/14), we looked for correlated areas in a pre-vaccination period that included the previous flu season only (2012/13). The reason for this was that the strains of influenza virus may vary between distant time periods (37) and thus, disease rates may be non-homogeneous. For Phase B (2014/15), however, we could no longer use the previous flu season to establish relationships, given that the Phase A campaign had already violated the assumed geographical homogeneity for 2013/14. Thus, we resorted to using the period 2011/13 (two flu seasons, from August 2011 to August 2013), based on the fact that the circulated flu strains were not characterised by any significant anomalies. Nevertheless, that resulted in less robust estimates, as indicated by our bootstrap sampling analysis (which yielded many of them as not statistically significant), and, taking into account the one-year gap between training and applying the model, perhaps less accurate projections as well.
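The selection of strongly correlated target-control pairs in the pre-intervention period might look like the following sketch. The area names, the toy rates and the correlation threshold of 0.8 are hypothetical choices for illustration; the text does not state the threshold that was applied.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def select_pairs(target_rates, control_rates, min_r=0.8):
    """Return (target, control, r) tuples with r >= min_r, strongest first.

    Both arguments map an area name to its pre-intervention rate series.
    """
    pairs = []
    for t_name, t_series in target_rates.items():
        for c_name, c_series in control_rates.items():
            r = pearson(t_series, c_series)
            if r >= min_r:
                pairs.append((t_name, c_name, r))
    return sorted(pairs, key=lambda p: p[2], reverse=True)


# Hypothetical pre-intervention ILI rate estimates per area.
targets = {"target_area": [1.0, 2.0, 3.0, 4.0]}
controls = {
    "control_a": [2.0, 4.0, 6.0, 8.5],  # tracks the target closely
    "control_b": [4.0, 1.0, 3.0, 2.0],  # does not track the target
}
selected = select_pairs(targets, controls)
```

Only pairs that pass the threshold would then be used to fit the linear model and project disease rates into the intervention period.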
A summary of the overall impact assessments is provided in Table 1, where outcomes in bold are statistically significant. During Phase A, both data sets (Twitter and Bing) point to significant reductions of disease rates on average. A subsequent sensitivity analysis (see Table 4 in (21)), where more than one control area was used to project disease rates, indicated that results from Twitter were generally more robust, with the overall impact estimate being the most consistent one. PHE’s own impact estimates compared vaccinated to all non-vaccinated areas, and differed depending on whether they were based on sentinel surveillance ILI data or on laboratory-confirmed influenza hospitalisations. Note though that these numbers represent different levels of severity or sensitivity, and notably none of these computations was statistically significant (29). As a further evaluation point, we observed a correspondence between the actual level of vaccine uptake and our estimated impact for a number of areas.
In Phase B, our analysis indicated that areas where primary school children were vaccinated benefited the most (see the primary school cohort in Table 1). However, for the current implementation of the secondary-school-only vaccination programme, there was no clear evidence of any population-wide effect. Both these conclusions are in line with findings of previous studies and complement traditional surveillance sources in exhibiting community-wide effects of the LAIV pilot campaign (29; 30).
3.3. Future Work
Our approach faces common limitations of research efforts based on unstructured user-generated text. Better methods that automate the semantic interpretation of language can be deployed to derive more accurate results. In fact, in follow-up works, we have proposed techniques that are capable of combining the text statistics (e.g. frequency time series) with a word embedding representation (39; 22; 26; 25). A further, perhaps more significant, limitation is that the entirety of this work relies on the existence of ground truth. Knowing historical disease rates is essential in order to train a disease model from UGC. However, this may not be possible for places with less established healthcare systems or for new infectious diseases. In addition, even when syndromic surveillance can provide estimates for the prevalence of a disease, it is very likely that these will incorporate demographic biases, carrying them over to any supervised model. Thus, there is a need to establish unsupervised disease indicators from UGC. This is a harder problem, as it will be difficult to evaluate solutions and one will need to account for the specific demographic biases of the online users in order to produce any viable conclusion. Nevertheless, ongoing work will focus on resolving these issues as well as investigating the framework’s applicability in assessing different types of public health interventions.
Acknowledgements. The work presented in this extended abstract has been supported by the grant EP/K031953/1 (EPSRC, “i-sense”).
References
- Bakshy et al. (2015) E. Bakshy, S. Messing, and L. A. Adamic. 2015. Exposure to ideologically diverse news and opinion on Facebook. Science 348, 6239 (2015), 1130–1132.
- Benton et al. (2017) A. Benton, M. Mitchell, and D. Hovy. 2017. Multitask Learning for Mental Health Conditions with Limited Social Media Data. In Proc. of EACL ’17. 152–162.
- Chan et al. (2011) E. H. Chan, V. Sahai, C. Conrad, and J. S. Brownstein. 2011. Using Web Search Query Data to Monitor Dengue Epidemics: A New Model for Neglected Tropical Disease Surveillance. PLOS Negl. Trop. Dis. 5, 5 (2011), e1206.
- Choudhury et al. (2013) M. De Choudhury, M. Gamon, S. Counts, and E. Horvitz. 2013. Predicting Depression via Social Media. In Proc. of ICWSM ’13. 128–137.
- Cohen (2000) M. L. Cohen. 2000. Changing patterns of infectious disease. Nature 406, 6797 (2000), 762–767.
- Culotta (2010) A. Culotta. 2010. Towards Detecting Influenza Epidemics by Analyzing Twitter Messages. In Proc. of the Workshop on Social Media Analytics. 115–122.
- Efron and Tibshirani (1994) B. Efron and R. J. Tibshirani. 1994. An Introduction to the Bootstrap. CRC press.
- Gil de Zúñiga et al. (2012) H. Gil de Zúñiga, N. Jung, and S. Valenzuela. 2012. Social Media Use for News and Individuals’ Social Capital, Civic Engagement and Political Participation. J. Comput. Mediat. Commun. 17, 3 (2012), 319–336.
- Ginsberg et al. (2009) J. Ginsberg, M. H. Mohebbi, R. S. Patel, et al. 2009. Detecting influenza epidemics using search engine query data. Nature 457, 7232 (2009), 1012–1014.
- Gomide et al. (2011) J. Gomide, A. Veloso, W. Meira, Jr., V. Almeida, F. Benevenuto, F. Ferraz, and M. Teixeira. 2011. Dengue Surveillance Based on a Computational Model of Spatio-temporal Locality of Twitter. In Proc. of WebSci ’11. 1–8.
- Hoerl and Kennard (1970) A. E. Hoerl and R. W. Kennard. 1970. Ridge regression: biased estimation for nonorthogonal problems. Technometrics 12 (1970), 55–67.
- Kosinski et al. (2013) M. Kosinski, D. Stillwell, and T. Graepel. 2013. Private traits and attributes are predictable from digital records of human behavior. Proc. Natl. Acad. Sci. 110, 15 (2013), 5802–5805.
- Kramer et al. (2014) A. D. I. Kramer, J. E. Guillory, and J. T. Hancock. 2014. Experimental evidence of massive-scale emotional contagion through social networks. Proc. Natl. Acad. Sci. 111, 24 (2014), 8788–8790.
- Lamb et al. (2013) A. Lamb, M. J. Paul, and M. Dredze. 2013. Separating Fact from Fear: Tracking Flu Infections on Twitter. In Proc. of NAACL ’13. 789–795.
- Lambert and Pregibon (2008) D. Lambert and D. Pregibon. 2008. Online Effects of Offline Ads. In Proc. of the 2nd International Workshop on Data Mining and Audience Intelligence for Advertising. 10–17.
- Lampos et al. (2014) V. Lampos, N. Aletras, D. Preoţiuc-Pietro, and T. Cohn. 2014. Predicting and Characterising User Impact on Twitter. In Proc. of EACL ’14. 405–413.
- Lampos and Cristianini (2010) V. Lampos and N. Cristianini. 2010. Tracking the flu pandemic by monitoring the Social Web. In Proc. of CIP ’10. 411–416.
- Lampos and Cristianini (2012) V. Lampos and N. Cristianini. 2012. Nowcasting Events from the Social Web with Statistical Learning. ACM Trans. Intell. Syst. Technol. 3, 4 (2012), 1–22.
- Lampos et al. (2015a) V. Lampos, A. C. Miller, S. Crossan, and C. Stefansen. 2015a. Advances in nowcasting influenza-like illness rates using search query logs. Sci. Rep. 5, 12760 (2015).
- Lampos et al. (2015b) V. Lampos, E. Yom-Tov, R. Pebody, and I. J. Cox. 2015b. Assessing the impact of a health intervention via user-generated Internet content. Data Min. Knowl. Discov. 29, 5 (2015), 1434–1457.
- Lampos et al. (2017) V. Lampos, B. Zou, and I. J. Cox. 2017. Enhancing Feature Selection Using Word Embeddings: The Case of Flu Surveillance. In Proc. of WWW ’17. 695–704.
- Lazer et al. (2014) D. Lazer, R. Kennedy, G. King, and A. Vespignani. 2014. The Parable of Google Flu: Traps in Big Data Analysis. Science 343, 6176 (2014), 1203–1205.
- Manago et al. (2012) A. M. Manago, T. Taylor, and P. M. Greenfield. 2012. Me and my 400 friends: The anatomy of college students’ Facebook networks, their communication patterns, and well-being. Dev. Psychol. 48, 2 (2012), 369–380.
- Mikolov et al. (2013a) T. Mikolov, K. Chen, G. S. Corrado, and J. Dean. 2013a. Efficient Estimation of Word Representations in Vector Space. In Proc. of the ICLR, Workshop Track. 1–12.
- Mikolov et al. (2013b) T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean. 2013b. Distributed Representations of Words and Phrases and their Compositionality. In Advances in NIPS 26. 3111–3119.
- Nutbeam (2000) D. Nutbeam. 2000. Health literacy as a public health goal: a challenge for contemporary health education and communication strategies into the 21st century. Health Promot. Int. 15, 3 (2000), 259–267.
- Olson et al. (2013) D. R. Olson, K. J. Konty, M. Paladini, et al. 2013. Reassessing Google Flu Trends Data for Detection of Seasonal and Pandemic Influenza: A Comparative Epidemiological Study at Three Geographic Scales. PLoS Comput. Biol. 9, 10 (2013), e1003256.
- Pebody et al. (2014) R. Pebody et al. 2014. Uptake and impact of a new live attenuated influenza vaccine programme in England: early results of a pilot in primary school-age children, 2013/14 influenza season. Eurosurveillance 19, 22 (2014), 20823.
- Pebody et al. (2015) R. Pebody et al. 2015. Uptake and impact of vaccinating school age children against influenza during a season with circulation of drifted influenza A and B strains, England, 2014/15. Eurosurveillance 20, 39 (2015), 30029.
- Petrie et al. (2013) J. G. Petrie et al. 2013. Influenza Transmission in a Cohort of Households with Children: 2010-2011. PLoS ONE 8, 9 (2013), e75339.
- Polgreen et al. (2008) P. M. Polgreen, Y. Chen, D. M. Pennock, F. D. Nelson, and R. A. Weinstein. 2008. Using Internet Searches for Influenza Surveillance. Clin. Infect. Dis. 47, 11 (2008), 1443–1448.
- Preoţiuc-Pietro et al. (2015) D. Preoţiuc-Pietro, S. Volkova, V. Lampos, Y. Bachrach, and N. Aletras. 2015. Studying User Income through Language, Behaviour and Affect in Social Media. PLoS ONE 10, 9 (2015), e0138717.
- Rasmussen and Williams (2006) C. E. Rasmussen and C. K. I. Williams. 2006. Gaussian Processes for Machine Learning. MIT Press.
- Rohart et al. (2016) F. Rohart, G. J. Milinovich, S. M. R. Avril, K.-A. Lê Cao, S. Tong, and W. Hu. 2016. Disease surveillance based on Internet-based linear models: an Australian case study of previously unmodeled infection diseases. Sci. Rep. 6, 38522 (2016).
- Schwartz et al. (2013) H. A. Schwartz et al. 2013. Personality, Gender, and Age in the Language of Social Media: The Open-Vocabulary Approach. PLoS ONE 8, 9 (2013), e73791.
- Smith et al. (2004) D. J. Smith et al. 2004. Mapping the antigenic and genetic evolution of influenza virus. Science 305, 5682 (2004), 371–376.
- Wagner et al. (2017) M. Wagner, V. Lampos, E. Yom-Tov, R. Pebody, and I. J. Cox. 2017. Estimating the Population Impact of a New Pediatric Influenza Vaccination Program in England Using Social Media Content. J. Med. Internet Res. 19, 12 (2017), e416.
- Zou et al. (2016) B. Zou, V. Lampos, R. Gorton, and I. J. Cox. 2016. On Infectious Intestinal Disease Surveillance Using Social Media Content. In Proc. of the 6th International Conference on Digital Health. 157–161.
- Zou and Hastie (2005) H. Zou and T. Hastie. 2005. Regularization and variable selection via the elastic net. J. R. Stat. Soc. Series B Stat. Methodol. 67, 2 (2005), 301–320.