Learning to Address Health Inequality in the United States with a Bayesian Decision Network
Life-expectancy is a complex outcome driven by genetic, socio-demographic, environmental and geographic factors. Increasing socio-economic and health disparities in the United States are widening the longevity-gap, making it a cause for concern. Earlier studies have probed individual factors, but an integrated picture that reveals quantifiable actions has been missing. Amidst growing concerns that Artificial Intelligence may further widen healthcare inequality and differential access, it is imperative to explore its potential for illuminating biases and enabling transparent policy decisions. In this work, we reveal actionable interventions for decreasing the longevity-gap in the United States by analyzing a County-level data resource with healthcare, socio-economic, behavioral, education and demographic features. We learn an ensemble-averaged structure, draw inferences using the joint probability distribution and extend it to a Bayesian Decision Network for identifying policy actions. We draw quantitative estimates for the positive roles of diversity, preventive-care quality and stable families within the unified framework of our decision network. Finally, we make this analysis and dashboard available as an interactive web-application so that users and policy-makers can validate our insights on bridging the longevity-gap and explore insights beyond those reported in this work.
Inequities in healthcare impose an estimated burden of $300 billion per year in the United States [LaVeist, Gaskin, and Richard2011]. The longevity-gap is a collective effect of inequities, such as economic, racial, ethnic, gender and environmental inequities, that lead to a decrease in life-expectancy. While the effects of inequities on longevity have been studied by social scientists for decades, society is now presented with an opportunity (or a threat) in the use of Artificial Intelligence (AI) for addressing (or propagating) healthcare inequities. Most AI research in healthcare has focused on developing discriminative models for automated diagnosis and prediction. Despite many promising applications, and the raised bar for demonstrating value in clinical trials [Abràmoff et al.2018], the opaque nature of many of these models has raised concerns about the propagation of blind-spots resulting from human biases [Cabitza, Rasoini, and Gensini2017, AS and Smith2018, Keane and Topol2018]. In this study we argue for the use of expressive, interpretable and explainable AI for understanding healthcare inequities themselves and for learning actions that may help mitigate them. We learn the complex interplay of factors that influence life-expectancy through a data-driven Bayesian Decision Network, i.e. a joint probabilistic graphical model (PGM) learned from data and extended to a decision framework for revealing policy actions. The key motivation for this study was to create a unified model that integrates geographical, socio-economic, behavioral, demographic and healthcare indices at county-level resolution in order to discover the skeleton and key drivers of the longevity-gap. Since graphical models are intuitive and interpretable, we enhance the utility of our model with an interactive web-application through which users can explore the model and discover further insights.
Determinants of the Longevity-gap
While access to medical care is the most tangible factor for improving longevity, the latter is determined by a more nuanced interplay of social, behavioral, and economic factors. This interplay has been difficult to de-convolve so far because of the non-availability of integrated datasets gathered over a long time-span and at a high spatial resolution. Hence, various theories and models have assessed these factors in isolation. The Health Inequality Project [Chetty et al.2016] compiled and released granular data at the County, Commute Zone, Core Based Statistical Area and State levels, and demonstrated the longevity-gap attributable to income disparity in the United States. Income data from 1.4 billion Tax Records were combined with County-level Mortality (Census), Healthcare (Dartmouth Atlas, Small Area Insurance Estimates), Health-behaviors (CDC Behavioral Risk Factor Surveillance System, BRFSS), and Education (National Center for Education Statistics Common Core of Data & Integrated Postsecondary Education Data System) data for the period covering 1999-2014, thus generating the most fine-grained data resource on socio-demographic determinants of longevity so far. Standard statistical analysis revealed significant correlations with longevity; however, a bigger picture integrating the heterogeneous data into a single model has been lacking. Next, we review the formulation of the problem as Bayesian structure-learning, inference, and decision networks, which we used in this study as a unified model to draw estimates and policy for addressing the longevity-gap. The problem was set up as data-driven learning of a Bayesian decision network for discrete policy decisions and inferences.
Bayesian Decision Networks
We first establish the notational conventions before giving the definitions of Bayesian Networks (BNs) and Bayesian Decision Networks (BDNs). We will use $X$ to represent a node and $\mathbf{X}$ to represent a set of nodes. The parent nodes of $X$ are represented as $pa(X)$.
A Bayesian Network [Pearl1985] is a triple $N = (\mathbf{X}, G, P)$, defined over a set of random variables $\mathbf{X}$, and consists of a directed acyclic graph (DAG) $G$ together with the conditional probability distributions $P$. The DAG $G = (V, E)$ is made up of vertices $V$ and directed edges $E$ specifying the assumptions of conditional independence between random variables according to the d-separation criterion [Geiger, Verma, and Pearl1990]. The set of conditional probability distributions $P$ contains $P(X \mid pa(X))$ for each random variable $X \in \mathbf{X}$. Since a Bayesian Network encodes the conditional independencies based on the d-separation criterion, it provides a compact representation for the data. The joint probability distribution over a set of variables $\mathbf{X}$ can be factored as the product of the probabilities of each node conditioned upon its parents:
$$P(\mathbf{X}) = \prod_{X \in \mathbf{X}} P(X \mid pa(X)) \qquad (1)$$
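The factorization of the joint distribution into per-node conditionals can be illustrated with a minimal sketch. The three-node chain A -> B -> C, its binary states and all probability values below are hypothetical and chosen only for illustration; they are not taken from the learned network.

```python
from itertools import product

# Hypothetical toy network A -> B -> C with binary states (illustrative numbers only).
parents = {"A": (), "B": ("A",), "C": ("B",)}
cpt = {
    "A": {(): {0: 0.6, 1: 0.4}},
    "B": {(0,): {0: 0.7, 1: 0.3}, (1,): {0: 0.2, 1: 0.8}},
    "C": {(0,): {0: 0.9, 1: 0.1}, (1,): {0: 0.5, 1: 0.5}},
}


def joint_probability(assignment):
    """Product over nodes of P(node | parents) for one full assignment."""
    p = 1.0
    for node, pa in parents.items():
        pa_states = tuple(assignment[q] for q in pa)
        p *= cpt[node][pa_states][assignment[node]]
    return p


# Summing the factored joint over all 2^3 assignments should give 1.
total = sum(
    joint_probability(dict(zip("ABC", states)))
    for states in product((0, 1), repeat=3)
)
```

Only 1 + 2 + 2 = 5 free parameters are stored here instead of the 7 needed for an unconstrained joint table over three binary variables, which is the compactness referred to above.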
The structure of a Bayesian network represents conditional independencies and can either be specified by an expert or learned from data. In data-driven structure-learning, the goal is to identify a model representing the underlying joint distribution of the data $D$. For simplification, we assume the faithfulness criterion, that is, the underlying joint probability distribution can be represented as a DAG. Structure-learning of the DAG can then be carried out with one of three classes of algorithms, i.e. constraint-based, score-based or hybrid algorithms. Since constraint-based methods rely on individual tests of independence, they are known to be sensitive to failures of individual tests [Koller and Friedman2009]. Score-based methods view structure learning as a model-selection problem and are implemented as a search-and-score strategy. These are computationally more expensive because the space of models is super-exponential; however, recent theoretical developments have made them tractable [Koller and Friedman2009]. In this study, hill-climbing search was used along with the Bayesian Information Criterion (BIC) scoring function. The scoring function is a measure of the goodness of each candidate structure $G$ for representing the data $D$. The BIC scoring function takes the form
$$BIC(G, D) = \log L(G \mid D) - \frac{d}{2} \log N$$
where $N$ is the sample size of the data $D$, $L(G \mid D)$ is the likelihood, i.e. the fit to the data, and $d$ is the number of free parameters of $G$. We chose the BIC scoring function as it penalizes complex models relative to sparse ones and offers a good trade-off between likelihood and model complexity, thus reducing over-fitting.
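The trade-off between fit and complexity can be made concrete with a small sketch. It assumes the common form of BIC for discrete networks, the maximized log-likelihood minus half the number of free parameters times the log of the sample size; the function name `bic_score` and the toy data are hypothetical, and this is plain Python rather than the scoring code of the R packages used in this study.

```python
import math
from collections import Counter


def bic_score(data, parents, n_states):
    """Log-likelihood minus (d / 2) * log(N) for a discrete network.

    data: list of dicts (one per sample); parents: node -> tuple of parent names;
    n_states: node -> number of discrete states.
    """
    n = len(data)
    loglik, n_params = 0.0, 0
    for node, pa in parents.items():
        # Counts of (parent configuration, node state) and of parent configurations.
        joint = Counter((tuple(row[p] for p in pa), row[node]) for row in data)
        pa_counts = Counter(tuple(row[p] for p in pa) for row in data)
        for (pa_state, _), count in joint.items():
            loglik += count * math.log(count / pa_counts[pa_state])
        # Free parameters: (states - 1) per parent configuration.
        n_params += (n_states[node] - 1) * math.prod(n_states[p] for p in pa)
    return loglik - 0.5 * n_params * math.log(n)


states = {"X": 2, "Y": 2}
empty_graph = {"X": (), "Y": ()}
with_arc = {"X": (), "Y": ("X",)}

# Toy data: Y copies X in the first set, Y is independent of X in the second.
dependent = [{"X": 0, "Y": 0}] * 50 + [{"X": 1, "Y": 1}] * 50
independent = [{"X": x, "Y": y} for x in (0, 1) for y in (0, 1)] * 25
```

On the dependent data the arc X -> Y improves the likelihood by more than the penalty costs, whereas on the independent data the extra parameter is penalized and the empty graph scores higher.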
The hill-climbing search algorithm is a local strategy wherein, at each step, the next candidate is chosen from the neighboring structures of the current candidate on the basis of the highest score. It is described as follows.
Initialize the structure $G$.
Repeat:
Add, remove or reverse one arc at a time in $G$ to generate a set of candidate structures.
Compute the score of each candidate structure in the set.
Pick the change with the highest score and set it as the new $G$.
Until the score cannot be improved.
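The steps above can be sketched as a greedy search over arc sets. This is a minimal illustration, not the implementation used in this study: the helper names are hypothetical, and `toy_score` is a stand-in that simply rewards two chosen arcs, whereas the study scored candidates with BIC.

```python
from itertools import permutations


def is_acyclic(nodes, edges):
    """Repeatedly strip nodes with no incoming arcs; a cycle leaves nodes behind."""
    remaining, edges = set(nodes), set(edges)
    while remaining:
        roots = {n for n in remaining if not any(v == n for (_, v) in edges)}
        if not roots:
            return False
        remaining -= roots
        edges = {(u, v) for (u, v) in edges if u not in roots}
    return True


def neighbors(nodes, edges):
    """Structures reachable by adding, removing or reversing a single arc."""
    for u, v in permutations(nodes, 2):
        if (u, v) in edges:
            yield edges - {(u, v)}                 # remove
            yield (edges - {(u, v)}) | {(v, u)}    # reverse
        elif (v, u) not in edges:
            yield edges | {(u, v)}                 # add


def hill_climb(nodes, score, start=frozenset()):
    current = frozenset(start)
    best = score(current)
    while True:
        candidates = [g for g in neighbors(nodes, current) if is_acyclic(nodes, g)]
        if not candidates:
            return current, best
        top = max(candidates, key=score)
        if score(top) <= best:          # no neighbor improves the score: stop
            return current, best
        current, best = top, score(top)


# Hypothetical stand-in score: +1 for each "true" arc, -1 for any other arc.
TARGET = {("A", "B"), ("B", "C")}


def toy_score(edges):
    return sum(1 if e in TARGET else -1 for e in edges)


learned, learned_score = hill_climb(["A", "B", "C"], toy_score)
```

Because the search only ever moves to a better neighbor, it can stop at a local optimum; repeating the search on resampled data and averaging the resulting structures is one common way to reduce this sensitivity.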
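The next task is to parametrize the learned structure by computing the posterior marginal distribution $P(X \mid \varepsilon)$ of a variable $X$ given evidence $\varepsilon$ on a set of variables $\mathbf{X}(\varepsilon)$. The prior marginal distribution $P(X)$ can be computed as
$$P(X) = \sum_{\mathbf{X} \setminus \{X\}} P(\mathbf{X}) \qquad (2)$$
Using (1) in (2), we get
$$P(X) = \sum_{\mathbf{X} \setminus \{X\}} \prod_{X_v \in \mathbf{X}} P(X_v \mid pa(X_v)) \qquad (3)$$
Further evidence can be incorporated as
$$P(X, \varepsilon) = \sum_{\mathbf{X} \setminus \{X\}} P(\mathbf{X}) \prod_{X_e \in \mathbf{X}(\varepsilon)} E_{X_e} \qquad (4)$$
$$\phantom{P(X, \varepsilon)} = \sum_{\mathbf{X} \setminus \{X\}} \prod_{X_v \in \mathbf{X}} P(X_v \mid pa(X_v)) \prod_{X_e \in \mathbf{X}(\varepsilon)} E_{X_e} \qquad (5)$$
for each $X \notin \mathbf{X}(\varepsilon)$, where $E_{X_e}$ is the evidence function for $X_e \in \mathbf{X}(\varepsilon)$ and $v$ is the node representing $X_v$. The likelihood is then defined as
$$L(X \mid \varepsilon) = P(\varepsilon \mid X) \qquad (6)$$
that is, the likelihood function of $X$ given $\varepsilon$. Application of Bayes rule, using (4) and (6), then yields
$$P(X \mid \varepsilon) \propto L(X \mid \varepsilon)\, P(X)$$
where the proportionality constant can be computed from $P(X, \varepsilon)$ by summation over $X$.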
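The final step, the posterior being proportional to likelihood times prior with the constant obtained by summing over the states of the variable, can be sketched as follows. The state names and numbers are hypothetical and serve only to show the normalization.

```python
def posterior(prior, likelihood):
    """Multiply likelihood by prior per state, then normalise over the states."""
    unnormalised = {x: likelihood[x] * prior[x] for x in prior}
    evidence_prob = sum(unnormalised.values())  # the proportionality constant is 1 / this sum
    return {x: value / evidence_prob for x, value in unnormalised.items()}


# Illustrative numbers only: a three-state variable and the likelihood of some evidence.
prior = {"low": 0.5, "mid": 0.3, "high": 0.2}
likelihood = {"low": 0.1, "mid": 0.4, "high": 0.8}
post = posterior(prior, likelihood)
```

Here the evidence shifts belief toward the "high" state even though it had the smallest prior, because its likelihood is much larger.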
This parametrization of the learned structure with posterior marginal conditional probabilities allows inference; the task is therefore called inference-learning. Inference-learning is an extremely useful step that allows us to estimate marginal probabilities after setting evidence in the network. Depending upon the size of the network, inference-learning can be carried out with exact (closed-form) methods or with approximate methods based upon Monte Carlo simulation. The details of these can be found in standard texts [Koller and Friedman2009]. In this work, we derived both exact estimates using the clique-tree algorithm and approximate estimates using rejection sampling to estimate conditional probabilities.
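The difference between exact and approximate inference can be illustrated on a two-node network small enough to solve by hand. This is a sketch under stated assumptions: the network A -> B and its probabilities are hypothetical, the exact answer is computed by direct enumeration rather than with a clique tree, and the sampler is a bare-bones rejection sampler rather than the routine of any particular package.

```python
import random

# Hypothetical network A -> B with binary states (illustrative numbers only).
P_A1 = 0.4
P_B1_GIVEN_A = {0: 0.3, 1: 0.8}


def exact_posterior():
    """P(A=1 | B=1) by enumeration and Bayes rule."""
    numerator = P_A1 * P_B1_GIVEN_A[1]
    denominator = numerator + (1 - P_A1) * P_B1_GIVEN_A[0]
    return numerator / denominator


def rejection_sample(n_samples, seed):
    """Estimate P(A=1 | B=1): sample (A, B) forward, keep only samples with B=1."""
    rng = random.Random(seed)
    kept = hits = 0
    for _ in range(n_samples):
        a = 1 if rng.random() < P_A1 else 0
        b = 1 if rng.random() < P_B1_GIVEN_A[a] else 0
        if b == 1:          # samples that contradict the evidence are rejected
            kept += 1
            hits += a
    return hits / kept


exact = exact_posterior()
approx = rejection_sample(20000, seed=0)
```

The sampled estimate varies from run to run, and samples inconsistent with the evidence are discarded, so rare evidence makes rejection sampling inefficient; the exact computation has no sampling error but becomes costly as the network grows.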
Bayesian Decision Network
A Bayesian Network can be extended to a Bayesian Decision Network (BDN) with the addition of utility nodes $U$. A BDN extends a BN in the sense that a BN is a probabilistic network for belief update, whereas a BDN is a probabilistic network for decision making under uncertainty. Formally, a BDN is a quadruple $BDN = (\mathbf{X}, G, P, U)$ and has three types of nodes:
Chance nodes ($X_C$): Nodes which represent events not controlled by the decision maker.
Decision nodes ($X_D$): Nodes which represent actions under the direct control of the decision maker.
Utility nodes ($X_U$): Nodes which represent the decision maker's preferences. These nodes cannot be parents of chance or decision nodes.
A decision maker interested in choosing the best possible actions can either specify a structure and marginal probability distributions or use the model learned with structure and inference learning. The decision maker then ascribes utility functions to particular states of a node. The objective of decision analysis is then to identify the decision options that maximize the expected utility.
The expected utility for each decision option is computed and the option with the maximum expected utility is returned. The mathematical formulation is as follows. Let $A$ be the decision variable with options $a_1, \dots, a_m$, let $H$ be the hypothesis with states $h_1, \dots, h_n$, and let $\varepsilon$ be a set of observations in the form of evidence. The utility of an outcome $(a_i, h_j)$ is given by $U(a_i, h_j)$, where $U(\cdot)$ is the utility function, and the expected utility is given as:
$$EU(a_i) = \sum_{j=1}^{n} U(a_i, h_j)\, P(h_j \mid \varepsilon)$$
where $P(\cdot)$ is our belief in $H$ given $\varepsilon$. We select the decision option $a^*$ which maximizes the expected utility, such that:
$$a^* = \arg\max_{a_i \in A} EU(a_i)$$
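Expected-utility maximization over a single decision variable can be sketched in a few lines. The option names, hypothesis states, utilities and beliefs below are hypothetical placeholders, and the belief is taken as already conditioned on the evidence.

```python
def expected_utilities(utility, belief):
    """For each option, sum utility(option, state) weighted by the belief in each state."""
    return {
        option: sum(u * belief[state] for state, u in row.items())
        for option, row in utility.items()
    }


def best_option(utility, belief):
    eu = expected_utilities(utility, belief)
    return max(eu, key=eu.get), eu


# Illustrative numbers only: belief over hypothesis states given the evidence,
# and a utility for every (option, state) pair.
belief = {"h1": 0.3, "h2": 0.7}
utility = {
    "a1": {"h1": 1.0, "h2": -1.0},
    "a2": {"h1": 0.2, "h2": 0.6},
}
choice, eu = best_option(utility, belief)
```

Option a1 has the single highest utility (under h1) but is not selected, because h1 is believed to be unlikely; the decision follows the belief-weighted average rather than the best case.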
County-level data from The Health Inequality Project [Chetty et al.2016], available from https://healthinequality.org/data/ were used in this study. County-level characteristics (Online Data Table 12) were merged with County-level life expectancy estimates for men and women by income quartiles (Online Data Table 11). The merged table had data on 1559 Counties and in addition to Life-expectancy estimates, it included County-level features representing (1) Healthcare, such as quality of preventive care, acute care, percentage of population insured, Medicare reimbursements (2) Health behaviors, such as prevalence of smoking and exercise by income quartiles (3) Income and affluence of the area, such as median house Value and mean household income, (4) Socioeconomic features such as absolute upward mobility, percentage of children born to single-mothers, crime rate, (5) Education at the K-12 and post-secondary level, school expenditure per student, pupil-teacher ratios, test scores and income-adjusted dropout rates (6) Demographic factors, such as population diversity, density, absolute counts, race, ethnicity, migration, urbanization (7) Inequality indices, such as Gini Index, Poverty rate, Income segregation, (8) Social cohesion indices, such as social capital index, fraction of religious adherents in the county, (9) Labor market conditions, such as unemployment rate, percentage change in population since 1980, percentage change in labor force since 1980 and fraction of employed persons involved in manufacturing and (10) Local Taxation. 
Since the motivation of our work was to address the factors associated with income disparity, we derived additional variables: (1) the Q4 - Q1 longevity-gap, i.e., the difference in life-expectancy between income quartiles Q4 and Q1 in both males and females, (2) mean pooled life-expectancy estimates across the income quartiles Q1 through Q4 in both males and females, (3) the pooled standard deviation of life-expectancy estimates across the Q1 through Q4 income quartiles and (4) the proportion of income quartiles Q1-Q4 in both males and females relative to the total population of the County. Data were non-missing for most of the variables; missing values were imputed with a non-parametric method for imputation in mixed-type data [Stekhoven and Bühlmann2012] implemented in the R language for statistical computing [R Development Core Team2011]. Continuous variables were discretized using in-house code written in R, which tried k-means, frequency-based, quantile and uniform-interval methods, in that order of preference, for each variable. In each case the number of discrete classes was fixed at three for ease of interpretation of discrete policy decisions.
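Two of the preprocessing steps described above, deriving the longevity-gap and discretizing a continuous variable into three classes, can be sketched as follows. Only the quantile-based method is shown, out of the four the in-house code could choose from; the function names, the class labels and the numbers are hypothetical, and the sketch is in Python rather than the R code used in this study.

```python
def longevity_gap(le_q4, le_q1):
    """Difference in life-expectancy between the highest and lowest income quartiles."""
    return le_q4 - le_q1


def tercile_cuts(values):
    """Cut points that split the sorted values into three roughly equal groups."""
    ordered = sorted(values)
    n = len(ordered)
    return ordered[n // 3], ordered[(2 * n) // 3]


def discretize(values):
    """Map each value to one of three ordered classes using the tercile cut points."""
    low_cut, high_cut = tercile_cuts(values)
    return [
        "low" if v < low_cut else "mid" if v < high_cut else "high"
        for v in values
    ]


# Illustrative values only.
classes = discretize([1, 2, 3, 4, 5, 6, 7, 8, 9])
gap = longevity_gap(le_q4=86.0, le_q1=78.5)
```

Fixing three classes keeps every conditional probability table small and gives each state a direct reading (low, middle, high), at the cost of discarding variation within a class.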
Structure-learning and Inference
Although the data were observational, our aim was to reason causally about some of the variables, such as State, which are known to affect healthcare inequities through differing State policies [Fisher et al.2003a, Fisher et al.2003b, Gottlieb et al.2010, Murray et al.2006, Braveman et al.2010]. In order to encode this effect in the structure of our model, we black-listed all incoming edges to State, County and Core Based Statistical Area (CBSA) prior to the start of structure-learning. The outgoing edges from these nodes were left to be learned from the data by the structure-learning algorithm. Structure learning using the hill-climbing search algorithm was repeated 1001 times on bootstrapped datasets using the R package bnlearn [Scutari2010], with the Bayesian Information Criterion (BIC) as the scoring function. The 1001 learned structures were ensemble-averaged using a majority-voting criterion to arrive at the ensemble network (Figure 1). Exact and approximate methods were used to estimate conditional probabilities and draw inferences using the packages gRain [Jsgaard2012] and bnlearn [Scutari2010] respectively. Since approximate methods rely upon Monte Carlo sampling, we repeated each approximate inference 25 times and used the standard deviation of the estimated posterior probabilities as a confidence estimate.
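The black-listing and majority-voting steps can be sketched in plain Python. This is an illustration under stated assumptions and not the bnlearn interface: the function names and the five toy structures are hypothetical, and each structure is represented simply as a set of directed arcs as it might come out of one bootstrap run.

```python
from collections import Counter


def incoming_blacklist(nodes, protected):
    """Arcs that may never be learned: every arc pointing into a protected node."""
    return {(u, v) for u in nodes for v in protected if u != v}


def majority_vote(structures, threshold=0.5):
    """Keep an arc if it appears in more than `threshold` of the structures."""
    counts = Counter(arc for g in structures for arc in set(g))
    n = len(structures)
    return {arc for arc, c in counts.items() if c / n > threshold}


nodes = ["State", "A", "B"]
blacklist = incoming_blacklist(nodes, protected=["State"])

# Five hypothetical bootstrap structures (an odd count avoids exact ties).
bootstrap_structures = [
    {("State", "A"), ("A", "B")},
    {("State", "A")},
    {("State", "A"), ("B", "A")},
    {("A", "B"), ("State", "A")},
    {("A", "B")},
]
ensemble = majority_vote(bootstrap_structures)
```

Voting arc by arc does not by itself guarantee that the retained arcs form a DAG, so an acyclicity check on the averaged network is a sensible final step.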
Bayesian Decision Network
We did not find any open source implementations that allowed a combination of data-driven structure learning with bootstraps, ensemble averaging and the extension of the learned structure to a decision network. Therefore, we wrote custom code that interfaces the learned structure with available implementations that allow manual specification of a decision network [Dalton and Nutter2018]. We verified the consistency of network specifications and automated creation of decision networks along with their probability distributions. In order to learn the optimal policy, preferences (between -1 and +1) were defined for the states of the Utility node. For example, for assessing the factors that could minimize the longevity-gap between the Q4 and Q1 income quartiles, the maximum preference (+1) was assigned to the minimum longevity-gap level (see Figures 2 and 3). Decision nodes were specified on the basis of actionable interventions (e.g. quality of preventive care in the County). Gibbs sampling [Dalton and Nutter2018] was used to estimate the best combination of actions (policy) that maximized the expected utility. Figure 2 shows a part of the Decision network with the setting of LE_Q_Disp_M (difference in longevity between the highest and lowest income quartiles in males), a derived variable defined by us.
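How preferences on the states of a utility node translate into a ranked policy table can be sketched as follows. Everything here is a hypothetical stand-in: the two decision names, the conditional distribution in `outcome_dist` and its numbers are invented for illustration, and the policies are enumerated exhaustively instead of being estimated by Gibbs sampling as in this study.

```python
from itertools import product

# Preferences in [-1, +1] on the states of the utility node's parent:
# the smallest longevity-gap class gets the maximum preference.
preference = {"small": 1.0, "medium": 0.0, "large": -1.0}


def outcome_dist(policy):
    """Hypothetical P(gap class | policy); illustrative numbers only."""
    shift = 0.2 * sum(policy.values())  # each active intervention moves mass to 'small'
    return {"small": 0.2 + shift, "medium": 0.3, "large": 0.5 - shift}


def policy_table(decisions):
    """Expected utility of every on/off combination of the decision nodes, best first."""
    rows = []
    for states in product((0, 1), repeat=len(decisions)):
        policy = dict(zip(decisions, states))
        dist = outcome_dist(policy)
        expected = sum(preference[s] * p for s, p in dist.items())
        rows.append((policy, expected))
    return sorted(rows, key=lambda row: row[1], reverse=True)


table = policy_table(["preventive_care", "lipid_testing"])
best_policy, best_value = table[0]
```

Exhaustive enumeration is only feasible for a handful of decision nodes, since the number of policies grows exponentially with the number of decisions; sampling-based estimation avoids evaluating every combination explicitly.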
We deployed the learned model as an interactive web-application developed with the shinydashboard package in R [Chang and Borges Ribeiro2018]. We mapped the inferences to States using data from Global Administrative Areas (GADM) through the use of the leaflet package [Cheng, Karambelkar, and Xie2018]. (Author’s Note: Screenshots of the web-application are provided in the appendix. The application will be made available on the github page linked to this study at the time of peer-reviewed publication.)
Results and Discussion
With the data-driven Bayesian Network, a generative model encoding the joint probability distribution over the factors influencing longevity, we queried the network with the following questions:
(Q 1): What minimizes the longevity-gap between the lowest and the highest income quartiles in the Health Inequality Data?
(A 1): Population diversity of the County. We created the variables LE_Q_Disp_M and LE_Q_Disp_F as the difference in life-expectancy between income quartiles Q4 and Q1 in males and females respectively. These represent the longevity-gap attributable to income disparity in males and females. We observed that among all factors in the network, only cs_born_foreign, the proportion of foreign-born residents in the County, was a parent-node of the longevity-gap. It is important to note that more associations might appear when explored through pair-wise correlations; however, BNs, being joint probability models, adjust for confounder, mediator and collider biases [Pearl2011]. Thus the BN structure reveals that population diversity encapsulates all the factors that may be indirect influences upon the longevity-gap. Exact inference with evidence set on different levels of diversity revealed that males in Counties with the lowest diversity had a 41% probability of a Q4 - Q1 longevity-gap of 9 years or more, versus only a 3% probability for males living in Counties with high diversity. The model structure also revealed that high diversity was a first-degree neighbor of higher median house value and a second-degree neighbor of the proportion of high-earning females in the County. Interestingly, Chetty et al. observed that beyond a certain threshold, increasing income does not yield proportionate gains in longevity, and our inference may be indicative of the same effect. Thus our model captures this observation, along with further nuances in the network, and is able to provide quantitative estimates for these effects as inferred from the joint probability model.
(Q 2): What maximizes the mean life expectancy in males and females?
(A 2): Preventive Care Quality. We observed that med_prev_qual_z, the Index of Preventive Care, was the only parent-node of le_mean_pool_F and a grand-parent of le_mean_pool_M (variables derived by us from the Health Inequality Data). Estimates drawn through exact inference reveal that high-quality preventive care (med_prev_qual_z) improves the probability of living beyond 85 years of age by a staggering 43% in females and 30% in males. Preventive Quality Indices (PQIs) provide a proxy for the healthcare quality of the system outside the hospital setting and were compiled from the Dartmouth Atlas as a part of The Health Inequality Project. PQIs are based upon “ambulatory care sensitive conditions” (ACSCs) such as diabetes, i.e., conditions in which high-quality outpatient care or early interventions can prevent hospitalizations and complications. PQIs are used along with discharges for ACSCs per thousand, and the association between these was captured as a first-order relation between these variables in our graphical model. Therefore, as indicated by our model, improving PQIs is the most actionable step for increasing mean life-expectancy and for reducing the economic burden due to hospitalizations.
(Q 3): Which Preventive Care measure maximizes the probability of life-expectancy beyond 85 years?
(A 3): Annual Lipid Testing in the diabetic population. We posed this as a policy-learning question from the perspective of maximizing the availability of these tests during preventive care visits. This was pertinent as Medicare reimbursements were found to be drastically different across the States (visualized as a heatmap on the web-application), and these in turn were linked to the quality of preventive care. The data included PQI indicators only for diabetes and mammography. We set a high preference on high longevity as the utility node and PQIs as decision nodes. The policy table learned from simulations (Table 1) indicates that the payoff was maximized by focusing on Annual Lipid Testing in the proportion of the population that was diabetic.
In addition to the directed questions, the graphical decision model allowed us to make the following observations:
(O 1): Acute mortality and mean household-income. We observed that mean household income is a direct (and the only) parent of the 30-day hospital mortality index in our model. Areas with high mean household-income (greater than $45,000 p.a.) have a 30% lower probability of having a high 30-day Hospital Mortality Rate Index (greater than 0.92) as compared to areas with low mean household-income (less than $30,000 p.a.). Tracing the grand-children nodes of the Hospital Mortality Rate Index, Pneumonia had the highest contribution to this effect among the available diseases, which also included congestive heart failure and acute myocardial infarction.
(O 2): Smoking and mean life-expectancy. We observed that smoking was a child-node of mean life expectancy in males in the network, i.e. Counties with lower proportions of currently smoking males in Q4 income quartile showed a 30% increase in probability of living beyond 82 years.
(O 3): Education, Exercise, Obesity and Longevity. We observed that graduate level education cs_educ_ba was a major distributor of probabilistic influence in the network and was linked to obesity, exercise, income and unemployment rates. This indicates that a significant part of the effect of exercise and obesity can be apportioned to education as the driver of healthy behaviors and higher income. Although these findings are not surprising, our model reveals them in a transparent, unified manner and allows inference queries to estimate the quantitative effects of these factors on health outcomes. For example, we estimated that areas with exercising populations, especially in the Q1 income quartile, have a 19% lower probability of hospitalization rates in the highest band after correcting for influences from other variables present in the data.
(O 4): Poverty breeds poverty, links with racial factors and stability of families. In addition to health inequities, our model illuminated social disparities and their indirect role in propagating health inequity through income disparities. We observed that high income-segregation (Gini index) was a parent of, and negatively associated with, absolute upward mobility, i.e. the upward mobility in percentage of children born to lower quartile income parents. This indicates that poverty was associated with lower inter-generational mobility, thus perpetuating the socio-economic disparities in society which are well studied in the United States [Levy and Wilson1989]. Our model also confirmed the univariate correlations linking low social mobility with lower family stability (40% lowered mobility) and higher Gini disparity (37% lowered mobility) as indicated by [Chetty et al.2016]. The latter phenomenon, referred to as assortative mating or the “marriage-gap”, has been noticed to increase consistently in recent years in the United States and is an under-appreciated factor in widening income and health disparities.
This study presents the overarching potential for using data-driven graphical models extended to decision networks as a unified framework for enabling healthcare policy. We demonstrate this potential through the use-case of developing a coherent understanding of health-inequality in the United States from a heterogeneous dataset. Our proposed framework for understanding the longevity-gap uses a generative modeling approach to gain a system-level understanding of healthcare inequities. It also emphasizes transparency and explainability and extends this motivation through the creation of a web-application that encapsulates our model inferences and visualizations for general users and policymakers. We expect that users will not only be able to validate our findings but also explore further insights into healthcare and social inequity that we may have missed. In conclusion, with this study we reason that Artificial Intelligence research has the potential to reduce disparities and recommend actionable solutions for promoting individual and social health.
We acknowledge the inputs and support provided by Dr. Nigam Shah, Biomedical Informatics Research, Stanford University, USA and Dr. Rakesh Lodha, Department of Pediatrics, All India Institute of Medical Sciences, New Delhi, India. This work was supported in part by the Wellcome Trust/DBT India Alliance Early Career Award IA/CPHE/14/1/501504 to Tavpritesh Sethi.
- [Abràmoff et al.2018] Abràmoff, M. D.; Lavin, P. T.; Birch, M.; Shah, N.; and Folk, J. C. 2018. Pivotal trial of an autonomous AI-based diagnostic system for detection of diabetic retinopathy in primary care offices. npj Digital Medicine 1(1):39.
- [AS and Smith2018] Adamson, A. S., and Smith, A. 2018. Machine learning and health care disparities in dermatology. JAMA Dermatology.
- [Braveman et al.2010] Braveman, P. A.; Cubbin, C.; Egerter, S.; Williams, D. R.; and Pamuk, E. 2010. Socioeconomic disparities in health in the United States: What the patterns tell us. American Journal of Public Health.
- [Cabitza, Rasoini, and Gensini2017] Cabitza, F.; Rasoini, R.; and Gensini, G. F. 2017. Unintended Consequences of Machine Learning in Medicine. JAMA.
- [Chang and Borges Ribeiro2018] Chang, W., and Borges Ribeiro, B. 2018. shinydashboard: Create Dashboards with ’Shiny’.
- [Chetty et al.2016] Chetty, R.; Stepner, M.; Abraham, S.; Lin, S.; Scuderi, B.; Turner, N.; Bergeron, A.; and Cutler, D. 2016. The association between income and life expectancy in the United States, 2001-2014. JAMA - Journal of the American Medical Association.
- [Dalton and Nutter2018] Dalton, J. E., and Nutter, B. 2018. HydeNet: Hybrid Bayesian Networks Using R and JAGS.
- [Fisher et al.2003a] Fisher, E. S.; Wennberg, D. E.; Stukel, T. A.; Gottlieb, D. J.; Lucas, F. L.; and Pinder, É. L. 2003a. The implications of regional variations in Medicare spending. Part 1: The content, quality, and accessibility of care. Annals of Internal Medicine.
- [Fisher et al.2003b] Fisher, E. S.; Wennberg, D. E.; Stukel, T. A.; Gottlieb, D. J.; Lucas, F. L.; and Pinder, É. L. 2003b. The implications of regional variations in Medicare spending. Part 2: Health outcomes and satisfaction with care. Annals of Internal Medicine.
- [Geiger, Verma, and Pearl1990] Geiger, D.; Verma, T.; and Pearl, J. 1990. Identifying independence in bayesian networks. Networks.
- [Gottlieb et al.2010] Gottlieb, D. J.; Zhou, W.; Song, Y.; Andrews, K. G.; Skinner, J. S.; and Sutherland, J. M. 2010. Prices don’t drive regional Medicare spending variations. Health Affairs.
- [Jsgaard2012] Højsgaard, S. 2012. Graphical Independence Networks with the gRain Package for R. Journal of Statistical Software.
- [Keane and Topol2018] Keane, P. A., and Topol, E. J. 2018. With an eye to AI and autonomous diagnosis. npj Digital Medicine 1(1):40.
- [Koller and Friedman2009] Koller, D., and Friedman, N. 2009. Probabilistic Graphical Models: Principles and Techniques, volume 2009.
- [LaVeist, Gaskin, and Richard2011] LaVeist, T. A.; Gaskin, D.; and Richard, P. 2011. Estimating the Economic Burden of Racial Health Inequalities in the United States. International Journal of Health Services.
- [Levy and Wilson1989] Levy, F., and Wilson, W. J. 1989. The Truly Disadvantaged. Journal of Policy Analysis and Management.
- [Murray et al.2006] Murray, C. J.; Kulkarni, S. C.; Michaud, C.; Tomijima, N.; Bulzacchelli, M. T.; Iandiorio, T. J.; and Ezzati, M. 2006. Eight Americas: Investigating mortality disparities across races, counties, and race-counties in the United States. PLoS Medicine.
- [Pearl1985] Pearl, J. 1985. Bayesian Networks: A Model of Self-Activated Memory for Evidential Reasoning.
- [Pearl2011] Pearl, J. 2011. Causality: Models, reasoning, and inference, second edition.
- [R Development Core Team2011] R Development Core Team. 2011. R: A Language and Environment for Statistical Computing, volume 1.
- [Scutari2010] Scutari, M. 2010. Learning Bayesian Networks with the bnlearn R Package. Journal of Statistical Software 35(3):1–22.
- [Stekhoven and Bühlmann2012] Stekhoven, D. J., and Bühlmann, P. 2012. MissForest: Non-parametric missing value imputation for mixed-type data. Bioinformatics.
Supplementary File (submitted separately)
Screenshots of the web-application
The web-application provides interactive visualizations including leaflet maps for geographic regions of the United States that can be explored for interactive inference. The dashboard will be made available as a web-application along with the paper. The Graphical User Interface of the dashboard is shown in Figure 5.
The inference learning menu, shown in Figure 6, enables the user to learn and explore conditional probability plots on the learned structure. The user can set an event, and multiple pieces of evidence can be inserted and removed. This crucial feature enables the user to explore probability plots on event nodes in the network conditioned on any number of pieces of evidence. The web-application allows two ways to perform inference learning:
Approximate Inference. Faster to learn on large networks, but time adds up for each inference as it relies upon Monte Carlo simulations for each. We add the option of repeating the sampling 25 times to get an estimate of error bars for each inference.
Exact inference for smaller networks. One-time learning, may be intractable for large networks but is faster when inferences need to be repeatedly assessed as it avoids Monte Carlo sampling every time.
While the web-application uses approximate inference by default, the user must explicitly run exact inference learning whenever the structure is learned or updated. We report exact inferences in this paper.
Bayesian Decision Network.
Figure 7 shows the Decision Network constructed by setting the decision nodes for optimal policy learning.
Inferring Policy Actions.
Figure 8 shows an example of the learned policy on the web-application.