The Frontiers of Fairness in Machine Learning
The last few years have seen an explosion of academic and popular interest in algorithmic fairness. Despite this interest and the volume and velocity of work that has been produced recently, the fundamental science of fairness in machine learning is still in a nascent state. In March 2018, we convened a group of experts as part of a CCC visioning workshop to assess the state of the field, and distill the most promising research directions going forward. This report summarizes the findings of that workshop. Along the way, it surveys recent theoretical work in the field and points towards promising directions for research.
The last decade has seen a vast increase both in the diversity of applications to which machine learning is applied, and in the import of those applications. Machine learning is no longer just the engine behind ad placements and spam filters: it is now used to filter loan applicants, deploy police officers, and inform bail and parole decisions, amongst other things. The result has been major concern about the potential for data driven methods to introduce and perpetuate discriminatory practices, and to otherwise be unfair. And this concern has not been without reason: a steady stream of empirical findings has shown that data driven methods can unintentionally both encode existing human biases and introduce new ones (see e.g. [Swe13, BCZ16, CBN17, BG18] for notable examples).
At the same time, the last two years have seen an unprecedented explosion in interest from the academic community in studying fairness and machine learning. “Fairness and transparency” transformed from a niche topic with a trickle of papers produced every year (at least since the work of [PRT08]) to a major subfield of machine learning, complete with a dedicated archival conference (ACM FAT*). But despite the volume and velocity of published work, our understanding of the fundamental questions related to fairness and machine learning remains in its infancy. What should fairness mean? What are the causes that introduce unfairness in machine learning? How best should we modify our algorithms to avoid unfairness? And what are the corresponding tradeoffs with which we must grapple?
In March 2018, we convened a group of about fifty experts in Philadelphia, drawn from academia, industry, and government, to assess the state of our understanding of the fundamentals of the nascent science of fairness in machine learning, and to identify the unanswered questions that seem the most pressing. By necessity, the aim of the workshop was not to comprehensively cover the vast and growing field, much of which is empirical. Instead, the focus was on theoretical work aimed at providing a scientific foundation for understanding algorithmic bias. This document captures several of the key ideas and directions discussed.
2 What We Know
2.1 Causes of Unfairness
Even before we precisely specify what we mean by “fairness”, we can identify common distortions that can lead off-the-shelf machine learning techniques to produce behavior that is intuitively unfair. These include:
Bias Encoded in Data: Often, the training data that we have on hand already includes human biases. For example, in the problem of recidivism prediction used to inform bail and parole decisions, the goal is to predict whether an inmate, if released, will go on to commit another crime within a fixed period of time. But we do not have data on who commits crimes — we have data on who is arrested. There is reason to believe that arrest data — especially for drug crimes — is skewed towards minority populations that are policed at a higher rate [Rot14]. Of course, machine learning techniques are designed to fit the data, and so will naturally replicate any bias already present in the data. There is no reason to expect them to remove existing bias.
Minimizing Average Error Fits Majority Populations: Different populations of people have different distributions over features, and those features have different relationships to the label that we are trying to predict. As an example, consider the task of predicting college performance based on high school data. Suppose there is a majority population and a minority population. The majority population employs SAT tutors and takes the exam multiple times, reporting only the highest score. The minority population does not. We should naturally expect both that SAT scores are higher amongst the majority population, and that their relationship to college performance is differently calibrated compared to the minority population. But if we train a group-blind classifier to minimize overall error, if it cannot simultaneously fit both populations optimally, it will fit the majority population. This is because — simply by virtue of their numbers — the fit to the majority population is more important to overall error than the fit to the minority population. This leads to a different (and higher) distribution of errors in the minority population. This effect can be quantified, and can be partially alleviated via concerted data gathering efforts [CJS18].
The Need to Explore: In many important problems, including recidivism prediction and drug trials, the data fed into the prediction algorithm depends on the actions that algorithm has taken in the past. We only observe whether an inmate will recidivate if we release him. We only observe the efficacy of a drug on patients to whom it is assigned. Learning theory tells us that in order to effectively learn in such scenarios, we need to explore — i.e. sometimes take actions we believe to be sub-optimal in order to gather more data. This leads to at least two distinct ethical questions. First, when are the individual costs of exploration borne disproportionately by a certain sub-population? Second, if in certain (e.g. medical) scenarios, we view it as immoral to take actions we believe to be sub-optimal for any particular patient, how much does this slow learning, and does this lead to other sorts of unfairness?
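The second distortion above, that minimizing average error fits the majority population, can be illustrated with a toy calculation. The sketch below is not taken from [CJS18]; the population sizes, the two slopes, and the helper names fit_slope and mean_sq_error are illustrative assumptions. A single group-blind model is fit to pooled data in which the feature predicts the label differently in each group.

```python
def fit_slope(xs, ys):
    """Least-squares slope w for the group-blind model y ~ w * x (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def mean_sq_error(w, xs, ys):
    """Average squared prediction error of the model y ~ w * x on one population."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical populations: identical feature values, different feature-label
# relationships (slope 1.0 for the majority, slope 2.0 for the minority).
majority_x = [1.0 + (i % 10) / 10 for i in range(900)]
minority_x = [1.0 + (i % 10) / 10 for i in range(100)]
majority_y = [1.0 * x for x in majority_x]
minority_y = [2.0 * x for x in minority_x]

# One model trained on the pooled data to minimize overall squared error.
w = fit_slope(majority_x + minority_x, majority_y + minority_y)

majority_mse = mean_sq_error(w, majority_x, majority_y)
minority_mse = mean_sq_error(w, minority_x, minority_y)
print("pooled slope:", round(w, 3))
print("majority MSE:", round(majority_mse, 4))
print("minority MSE:", round(minority_mse, 4))
```

Because the majority contributes nine times as many points, the pooled slope lands close to the majority's own slope, and the minority's error is correspondingly much larger.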
2.2 Definitions of Fairness
With a few exceptions, the vast majority of work to date on fairness in machine learning has focused on the task of batch classification. At a high level, this literature has focused on two main families of definitions:
Statistical Definitions of Fairness
Most of the literature on fair classification focuses on statistical definitions of fairness. This family of definitions fixes a small number of protected demographic groups (such as racial groups), and then asks for (approximate) parity of some statistical measure across all of these groups. Popular measures include raw positive classification rate, considered in work such as [CV10, KAS11, DHP12, FFM15] (also sometimes known as statistical parity [DHP12]), false positive and false negative rates [Cho17, KMR17, HPS16, ZVGG17] (also sometimes known as equalized odds [HPS16]), and positive predictive value [Cho17, KMR17] (closely related to equalized calibration when working with real valued risk scores). There are others; see e.g. [BHJ18] for a more exhaustive enumeration. This family of fairness definitions is attractive because it is simple, and definitions from this family can be achieved without making any assumptions on the data and can be easily verified. However, statistical definitions of fairness do not on their own give meaningful guarantees to individuals or structured subgroups of the protected demographic groups. Instead they give guarantees to “average” members of the protected groups. (See [DHP12] for a litany of ways in which statistical parity and similar notions can fail to provide meaningful guarantees, and [KNRW18b] for examples of how some of these weaknesses carry over to definitions which equalize false positive and negative rates.) Different statistical measures of fairness can be at odds with one another. For example, [Cho17] and [KMR17] prove a fundamental impossibility result: except in trivial settings, it is impossible to simultaneously equalize false positive rates, false negative rates, and positive predictive value across protected groups. Learning subject to statistical fairness constraints can also be computationally hard [WGOS17], although practical algorithms of various sorts are known [HPS16, ZVGG17, ABD18].
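The statistical measures named above are simple to compute from a confusion matrix per group. The following is a minimal sketch under stated assumptions: the function names group_rates and make_group and the two hypothetical groups "A" and "B" are ours, and every group is assumed to contain both labels and at least one positive prediction. The example data equalize false positive and false negative rates across groups with different base rates, and the positive predictive values then differ, in line with the impossibility result. The last loop checks the identity FPR = p/(1-p) * (1-PPV)/PPV * (1-FNR), where p is the group's base rate, which follows directly from the definitions of these quantities.

```python
from collections import defaultdict


def group_rates(groups, y_true, y_pred):
    """Per-group positive rate, FPR, FNR, PPV and base rate for binary labels."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for g, y, p in zip(groups, y_true, y_pred):
        # "t"/"f": prediction correct or not; "p"/"n": predicted positive or negative.
        outcome = ("t" if y == p else "f") + ("p" if p == 1 else "n")
        counts[g][outcome] += 1
    rates = {}
    for g, c in counts.items():
        n = sum(c.values())
        rates[g] = {
            "positive_rate": (c["tp"] + c["fp"]) / n,  # statistical parity compares this
            "fpr": c["fp"] / (c["fp"] + c["tn"]),      # equalized odds compares fpr and fnr
            "fnr": c["fn"] / (c["fn"] + c["tp"]),
            "ppv": c["tp"] / (c["tp"] + c["fp"]),      # positive predictive value
            "base_rate": (c["tp"] + c["fn"]) / n,
        }
    return rates


def make_group(name, tp, fn, fp, tn):
    """Build label and prediction lists with the given confusion-matrix counts."""
    y_true = [1] * (tp + fn) + [0] * (fp + tn)
    y_pred = [1] * tp + [0] * fn + [1] * fp + [0] * tn
    return [name] * len(y_true), y_true, y_pred


# Hypothetical data: equal FPR and FNR in both groups, but different base rates.
ga, ya, pa = make_group("A", tp=4, fn=1, fp=1, tn=4)
gb, yb, pb = make_group("B", tp=4, fn=1, fp=3, tn=12)
rates = group_rates(ga + gb, ya + yb, pa + pb)

for name, r in sorted(rates.items()):
    p = r["base_rate"]
    implied_fpr = p / (1 - p) * (1 - r["ppv"]) / r["ppv"] * (1 - r["fnr"])
    print(name, {k: round(v, 3) for k, v in r.items()}, "implied FPR:", round(implied_fpr, 3))
```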
Individual Definitions of Fairness
Individual notions of fairness, on the other hand, ask for constraints that bind on specific pairs of individuals, rather than on a quantity that is averaged over groups. For example, [DHP12] give a definition which roughly corresponds to the constraint that “similar individuals should be treated similarly”, where similarity is defined with respect to a task-specific metric that must be determined on a case by case basis. [JKMR16] suggest a definition which roughly corresponds to “less qualified individuals should not be favored over more qualified individuals”, where quality is defined with respect to the true underlying label (unknown to the algorithm). However, although the semantics of these kinds of definitions can be more meaningful than statistical approaches to fairness, the major stumbling block is that they seem to require making significant assumptions. For example, the approach of [DHP12] pre-supposes the existence of an agreed upon similarity metric, whose definition would itself seemingly require solving a non-trivial problem in fairness, and the approach of [JKMR16] seems to require strong assumptions on the functional form of the relationship between features and labels in order to be usefully put into practice. These obstacles are serious enough that it remains unclear whether individual notions of fairness can be made practical, although attempting to bridge this gap is an important and ongoing research agenda.
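As a rough illustration of “similar individuals should be treated similarly”, the sketch below checks a Lipschitz-style condition on every pair: the difference in the probability of a positive outcome should not exceed the distance between the two individuals. The metric here (absolute difference of a single normalized score) is purely hypothetical, which is exactly the difficulty noted above: in practice the metric must come from somewhere. The names lipschitz_violations, threshold_rule and smooth_rule are ours.

```python
from itertools import combinations


def lipschitz_violations(individuals, score, metric):
    """Pairs whose outcome probabilities differ by more than their distance."""
    return [
        (a, b)
        for a, b in combinations(individuals, 2)
        if abs(score(a) - score(b)) > metric(a, b)
    ]


def metric(a, b):
    """Hypothetical task-specific similarity metric on a single normalized feature."""
    return abs(a - b)


def threshold_rule(x):
    """Deterministic classifier: accept with probability 1 at or above 0.5, else 0."""
    return 1.0 if x >= 0.5 else 0.0


def smooth_rule(x):
    """Randomized classifier: accept with probability equal to the feature value."""
    return x


applicants = [0.48, 0.52, 0.90]
hard = lipschitz_violations(applicants, threshold_rule, metric)
soft = lipschitz_violations(applicants, smooth_rule, metric)
print("threshold rule violations:", hard)
print("smooth rule violations:", soft)
```

The hard threshold treats the nearly identical applicants 0.48 and 0.52 in opposite ways and so violates the condition, whereas the randomized rule does not.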
3 Questions at the Research Frontier
3.1 Between Statistical and Individual Fairness
Given the limitations of extant notions of fairness, is there a way to get some of the “best of both worlds”? In other words, constraints that are practically implementable without the need for making strong assumptions on the data or the knowledge of the algorithm designer, but which nevertheless provide more meaningful guarantees to individuals? Two recent papers, [KNRW18b] and [HJKRR18] (see also [KNRW18a, KGZ18] for empirical evaluations of the algorithms proposed in these papers), attempt to do this by asking for statistical fairness definitions to hold not just on a small number of protected groups, but on an exponential or infinite class of groups defined by some class of functions of bounded complexity. This approach seems promising: because ultimately they are asking for statistical notions of fairness, the approaches proposed by these papers enjoy the benefits of statistical fairness: that no assumptions need be made about the data, nor is any external knowledge (like a fairness metric) needed. It also better addresses concerns about “intersectionality”, a term used to describe how different kinds of discrimination can compound and interact for individuals who fall at the intersection of several protected classes.
At the same time, the approach raises a number of additional questions: what function classes are reasonable, and once one is decided upon (e.g. conjunctions of protected attributes), what features should be “protected”? Should these only be attributes that are sensitive on their own, like race and gender, or might attributes that are innocuous on their own correspond to groups we wish to protect once we consider their intersection with protected attributes (for example clothing styles intersected with race or gender)? Finally, this family of approaches significantly mitigates some of the weaknesses of statistical notions of fairness by asking for the constraints to hold on average not just over a small number of coarsely defined groups, but over very finely defined groups as well. Ultimately, however, it inherits the weaknesses of statistical fairness as well, just on a more limited scale.
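To see why asking for parity over conjunctions of protected attributes is strictly more demanding than asking for it over each attribute separately, consider the following brute-force audit. It is only a sketch: the attribute names, the synthetic records, and the function subgroup_positive_rates are illustrative assumptions, and exhaustive enumeration is feasible only for a handful of attributes (the cited papers are concerned with exponentially large or infinite classes of groups, which this does not attempt).

```python
from itertools import combinations, product


def subgroup_positive_rates(records, protected, max_order=2):
    """Positive rate for every conjunction of up to max_order protected attribute values."""
    rates = {}
    for k in range(1, max_order + 1):
        for attrs in combinations(protected, k):
            values = [sorted({r[a] for r in records}) for a in attrs]
            for combo in product(*values):
                members = [
                    r for r in records
                    if all(r[a] == v for a, v in zip(attrs, combo))
                ]
                if members:
                    key = tuple(zip(attrs, combo))
                    rates[key] = sum(r["pred"] for r in members) / len(members)
    return rates


# Hypothetical predictions: positive exactly on two of the four intersections.
records = []
for race, gender, pred in [("a", "m", 1), ("a", "f", 0), ("b", "m", 0), ("b", "f", 1)]:
    records += [{"race": race, "gender": gender, "pred": pred}] * 25

rates = subgroup_positive_rates(records, ["race", "gender"])
for key, rate in rates.items():
    print(key, rate)
```

In this example every group defined by race alone or gender alone has the same positive rate, so statistical parity holds on the coarse groups, yet each intersectional subgroup is either always or never classified positively.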
Another recent line of work aims to weaken the strongest assumption needed for the notion of individual fairness from [DHP12]: namely that the algorithm designer has perfect knowledge of a “fairness metric”. [KRR18] assume that the algorithm has access to an oracle which can return an unbiased estimator for the distance between two randomly drawn individuals according to an unknown fairness metric, and show how to use this to ensure a statistical notion of fairness related to [KNRW18b, HJKRR18] which informally states that “on average, individuals in two groups should be treated similarly if on average the individuals in the two groups are similar” — and this can be achieved with respect to an exponentially or infinitely large set of groups. Similarly, [GJKR18] assumes the existence of an oracle which can identify fairness violations when they are made in an online setting, but cannot quantify the extent of the violation (with respect to the unknown metric). It is shown that when the metric is from a specific learnable family, this kind of feedback is sufficient to obtain an optimal regret bound to the best fair classifier while having only a bounded number of violations of the fairness metric. [RY18] consider the case in which the metric is known, and show that a PAC-inspired approximate variant of metric fairness generalizes to new data drawn from the same underlying distribution. Ultimately, however, these approaches all assume that fairness is perfectly defined with respect to some metric, and that there is some sort of direct access to it. Can these approaches be generalized to a more “agnostic” setting, in which fairness feedback is given by human beings who may not be responding in a way that is consistent with any metric?
3.2 Data Evolution and Dynamics of Fairness
The vast majority of work in computer science on algorithmic fairness has focused on one-shot classification tasks. But real algorithmic systems consist of many different components that are combined together, and operate in complex environments that are dynamically changing, sometimes because of the actions of the learning algorithm itself. For the field to progress, we need to understand the dynamics of fairness in more complex systems.
Perhaps the simplest aspect of dynamics that remains poorly understood is how and when components that may individually satisfy notions of fairness compose into larger constructs that still satisfy fairness guarantees. For example, if the bidders in an advertising auction individually are fair with respect to their bidding decisions, when will the allocation of advertisements be “fair”, and when will it not? [BKN17] and [DI18] have made a preliminary foray in this direction. These papers embark on a systematic study of fairness under composition, and find that often the composition of multiple fair components will not satisfy any fairness constraint at all. Similarly, the individual components of a “fair” system may appear to be unfair in isolation. There are certain special settings, e.g. the “filtering pipeline” scenario of [BKN17] — modeling a scenario in which a job applicant is selected only if she is selected at every stage of the pipeline — in which (multiplicative approximations of) statistical fairness notions compose in a well behaved way. But the high level message from these works is that our current notions of fairness compose poorly. Experience from differential privacy [DMNS06, DR14] suggests that graceful degradation under composition is key to designing complicated algorithms satisfying desirable statistical properties, because it allows algorithm design and analysis to be modular. Thus, it seems important to find satisfying fairness definitions and richer frameworks that behave well under composition.
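The multiplicative behavior in a filtering pipeline can be seen in a toy calculation. This is not the formal result of [BKN17], only an arithmetic sketch: stage_rates[i][g] is assumed to be the probability that a member of group g passes stage i given that they passed all earlier stages, and the groups, rates, and function names are hypothetical.

```python
from math import prod


def pipeline_rates(stage_rates):
    """Overall selection rate per group when a candidate must pass every stage."""
    groups = stage_rates[0].keys()
    return {g: prod(stage[g] for stage in stage_rates) for g in groups}


def rate_ratio(rates, disadvantaged, advantaged):
    """Ratio of selection rates between two groups (1.0 means exact parity)."""
    return rates[disadvantaged] / rates[advantaged]


# Hypothetical three-stage hiring pipeline; each stage alone has ratio 0.9.
stages = [
    {"A": 0.50, "B": 0.45},
    {"A": 0.40, "B": 0.36},
    {"A": 0.80, "B": 0.72},
]

per_stage = [rate_ratio(stage, "B", "A") for stage in stages]
overall = rate_ratio(pipeline_rates(stages), "B", "A")
print("per-stage ratios:", [round(r, 3) for r in per_stage])
print("end-to-end ratio:", round(overall, 3))
```

The end-to-end ratio is the product of the per-stage ratios, so the guarantee degrades predictably with the number of stages rather than collapsing; this is the kind of graceful degradation that makes modular analysis possible.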
In dealing with socio-technical systems, it is also important to understand how algorithms dynamically affect their environment, and the incentives of human actors. For example, if the bar (for e.g. college admission) is lowered for a group of individuals, this might increase the average qualifications for this group over time because of at least two effects: a larger proportion of children in the next generation grow up in households with college educated parents (and the opportunities this provides), and the fact that a college education is achievable can incentivize effort to prepare academically. These kinds of effects are not considered when considering either statistical or individual notions of fairness in one-shot learning settings. The economics literature on affirmative action has long considered such effects, although not with the specifics of machine learning in mind; see e.g. [FV92, CL93, Bec10]. More recently, there have been some preliminary attempts to model these kinds of effects in machine learning settings, e.g. by modeling the environment as a Markov decision process [JJK17], considering the equilibrium effects of imposing statistical definitions of fairness in a model of a labor market [HC18], specifying the functional relationship between classification outcomes and quality [LDR18], or by considering the effect of a classifier on a downstream Bayesian decision maker [KRZ18]. However, the specific predictions of most of the models of this sort are brittle to the specific modeling assumptions made: they point to the need to consider long term dynamics, but do not provide robust guidance for how to navigate them. More work is needed here.
Finally, decision making is often distributed between a large number of actors who share different goals and do not necessarily coordinate. In settings like this, in which we do not have direct control over the decision making process, it is important to think about how to incentivize rational agents to behave in a way that we view as fair. [KKM17] takes a preliminary stab at this task, showing how to incentivize a particular notion of individual fairness in a simple, stylized setting, using small monetary payments. But how should this work for other notions of fairness, and in more complex settings? Can this be done by controlling the flow of information, rather than by making monetary payments (monetary payments might be distasteful in various fairness-relevant settings)? More work is needed here as well. Finally, [CDPF17] take a welfare maximization view of fairness in classification, and characterize the cost of imposing additional statistical fairness constraints as well. But this is done in a static environment. How would the conclusions change under a dynamic model?
3.3 Modeling and Correcting Bias in the Data
Fairness concerns typically surface precisely in settings where the available training data is already contaminated by bias. The data itself is often a product of social and historical processes that operated to the disadvantage of certain groups. When trained on such data, off-the-shelf machine learning techniques may reproduce, reinforce, and potentially exacerbate existing biases. Understanding how bias arises in the data, and how to correct for it, are fundamental challenges in the study of fairness in machine learning.
[BCZ16] demonstrate how machine learning can reproduce biases in their analysis of the popular word2vec embedding trained on a corpus of Google News texts (parallel effects were independently discovered by [CBN17]). The authors show that the trained embedding exhibits female/male gender stereotypes, learning that “doctor” is more similar to man than to woman, along with analogies such as “man is to computer programmer as woman is to homemaker”. Even if such learned associations accurately reflect patterns in the source text corpus, their use in automated systems may exacerbate existing biases. For instance, it might result in male applicants being ranked more highly than equally qualified female applicants in queries related to jobs that the embedding identifies as male-associated.
Similar risks arise whenever there is potential for feedback loops. These are situations where the trained machine learning model informs decisions that then affect the data collected for future iterations of the training process. [LI16] demonstrate how feedback loops might arise in predictive policing if arrest data were used to train the model.
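A deliberately stylized simulation shows how such a loop can lock in an initial disparity. It is not the model studied in [LI16]: the winner-take-all patrol rule, the district names, and the numbers are all assumptions made for illustration. Both districts have the same true crime rate, but crime is recorded only where patrols are sent, and patrols are sent wherever the most arrests have been recorded.

```python
def simulate_patrols(initial_arrests, true_crime_rate, days):
    """Patrol the district with the most recorded arrests; record crime only there."""
    arrests = dict(initial_arrests)
    for _ in range(days):
        # The "model" is retrained on recorded arrests and picks the current maximum.
        target = max(arrests, key=arrests.get)
        # Only the patrolled district generates new arrest records.
        arrests[target] += true_crime_rate[target]
    return arrests


# Hypothetical starting point: nearly equal records, identical true crime rates.
initial = {"north": 11, "south": 10}
true_rate = {"north": 1.0, "south": 1.0}

final = simulate_patrols(initial, true_rate, days=100)
print(final)
```

Although the districts are identical in truth, the small initial difference in recorded arrests determines where all future data are collected, and the recorded data never get the chance to correct it.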
Correcting for data bias generally seems to require knowledge of how the measurement process is biased, or judgments about properties the data would satisfy in an “unbiased” world. [FSV16] formalize this as a disconnect between the observed space—features that are observed in the data, such as SAT scores—and the unobservable construct space—features that form the desired basis for decision making, such as intelligence. Within this framework, data correction efforts attempt to undo the effects of biasing mechanisms that drive discrepancies between these spaces. To the extent that the biasing mechanism cannot be inferred empirically, any correction effort must make explicit its underlying assumptions about this mechanism. What precisely is being assumed about the construct space? When can the mapping between the construct space and the observed space be learned and inverted? What form of fairness does the correction promote, and at what cost? The costs are often immediately realized, whereas the benefits are less tangible. We will directly observe reductions in prediction accuracy, but any gains hinge on a belief that the observed world is not one we should seek to replicate accurately in the first place. This is an area where tools from causality may offer a principled approach for drawing valid inference with respect to unobserved counterfactually ‘fair’ worlds.
3.4 Fair Representations
Fair representation learning is a data de-biasing process that produces transformations (intermediate representations) of the original data that retain as much of the task-relevant information as possible while removing information about sensitive or protected attributes. This is one approach to transforming biased observational data in which group membership may be inferred from other features, to a construct space where protected attributes are statistically independent of other features. First introduced in the work of [ZWS13], fair representation learning produces a de-biased data set that may in principle be used by other parties without any risk of disparate outcomes. [FFM15] and [MOW17] formalize this idea by showing how the disparate impact of a decision rule is bounded in terms of its balanced error rate as a predictor of the sensitive attribute.
Several recent papers have introduced new approaches for constructing fair representations. [FFM15] propose rank-preserving procedures for repairing features to reduce or remove pairwise dependence with the protected attribute. [JL17] build upon this work, introducing a likelihood-based approach that can additionally handle continuous protected attributes, discrete features, and which promotes joint independence between the transformed features and the protected attributes. There is also a growing literature on using adversarial learning to achieve group fairness in the form of statistical parity or false positive/false negative rate balance [ES15, BCZC17, ZLM18, MCPZ18].
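The idea of a rank-preserving repair can be sketched in a few lines. The version below is a simplification and not the procedure of [FFM15]: it replaces each feature value by its quantile within its own group, so that the repaired feature has the same distribution in every group while the ordering of individuals within a group is unchanged. The function name rank_repair, the group labels, and the scores are hypothetical, and ties are broken by input order.

```python
def rank_repair(values, groups):
    """Replace each value by its within-group quantile in (0, 1), preserving within-group order."""
    repaired = [0.0] * len(values)
    for g in set(groups):
        # Indices of this group's members, sorted by their original feature value.
        idx = [i for i, gi in enumerate(groups) if gi == g]
        idx.sort(key=lambda i: values[i])
        for rank, i in enumerate(idx):
            repaired[i] = (rank + 0.5) / len(idx)
    return repaired


# Hypothetical test scores whose raw distributions differ by group.
scores = [600, 700, 650, 500, 550, 450]
groups = ["majority"] * 3 + ["minority"] * 3

repaired = rank_repair(scores, groups)
for s, g, r in zip(scores, groups, repaired):
    print(g, s, round(r, 3))
```

After the repair, the feature on its own carries no information about group membership in this toy example; as the discussion of adversarial downstream users below makes clear, removing dependence feature by feature does not rule out recovery of the protected attribute from combinations of features.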
Existing theory shows that the fairness promoting benefits of fair representation learning rely critically on the extent to which existing associations between the transformed features and the protected characteristics are removed. Adversarial downstream users may be able to recover protected attribute information if their models are more powerful than those used initially to obfuscate the data. This presents a challenge both to the generators of fair representations as well as to auditors and regulators tasked with certifying that the resulting data is fair for use. More work is needed to understand the implications of fair representation learning for promoting fairness in the real world.
3.5 Beyond Classification
Although the majority of the work on fairness in machine learning focuses on batch classification, batch classification is only one aspect of how machine learning is used. Much of machine learning — e.g. online learning, bandit learning, and reinforcement learning — focuses on dynamic settings in which the actions of the algorithm feed back into the data it observes. These dynamic settings capture many problems for which fairness is a concern. For example, lending, criminal recidivism prediction, and sequential drug trials all are so-called bandit learning problems, in which the algorithm cannot observe data corresponding to counterfactuals. We cannot see whether someone not granted a loan would have paid it back. We cannot see whether an inmate not released on parole would have gone on to commit another crime. We cannot see how a patient would have responded to a different drug.
The theory of learning in bandit settings is well understood, and it is characterized by a need to trade off exploration with exploitation. Rather than always making a myopically optimal decision, when counter-factuals cannot be observed, it is necessary for algorithms to sometimes take actions that appear to be sub-optimal so as to gather more data. But in settings in which decisions correspond to individuals, this means sacrificing the well-being of a particular person for the potential benefit of future individuals. This can sometimes be unethical, and a source of unfairness [BBC16]. Several recent papers explore this issue. For example, [BBK17] and [KMR18] give conditions under which linear learners need not explore at all in bandit settings, thereby allowing for best-effort service to each arriving individual, obviating the tension between ethical treatment of individuals and learning. [RSVW18] show that the costs associated with exploration can be unfairly borne by a structured sub-population, and that counter-intuitively, those costs can actually increase when they are included with a majority population, even though more data increases the rate of learning overall. However, these results are all preliminary: they are restricted to settings in which the learner is learning a linear policy, and the data really is governed by a linear model. While illustrative, more work is needed to understand real-world learning in online settings, and the ethics of exploration.
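For readers less familiar with the exploration/exploitation tradeoff, the following is a minimal epsilon-greedy sketch on Bernoulli arms. It is a textbook heuristic chosen for brevity, not an algorithm from any of the papers cited above; the arm means, the value of epsilon, and the function name epsilon_greedy are assumptions. Each round can be read as one arriving individual, and the counter explored records how many individuals received a randomly chosen action rather than the one currently believed best.

```python
import random


def epsilon_greedy(true_means, epsilon, horizon, seed=0):
    """Run epsilon-greedy on Bernoulli arms; return pull counts and number of exploration rounds."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    pulls = [0] * n_arms
    totals = [0.0] * n_arms
    explored = 0
    for t in range(horizon):
        if t < n_arms:
            arm = t  # pull each arm once so every empirical mean is defined
        elif rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: uniformly random arm
            explored += 1
        else:
            # exploit: arm with the highest empirical mean reward so far
            arm = max(range(n_arms), key=lambda a: totals[a] / pulls[a])
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        pulls[arm] += 1
        totals[arm] += reward
    return pulls, explored


# Hypothetical two-arm problem, e.g. two treatments with unknown success rates.
pulls, explored = epsilon_greedy([0.3, 0.6], epsilon=0.1, horizon=1000)
greedy_pulls, greedy_explored = epsilon_greedy([0.3, 0.6], epsilon=0.0, horizon=1000)
print("epsilon-greedy pulls:", pulls, "exploration rounds:", explored)
print("greedy pulls:", greedy_pulls, "exploration rounds:", greedy_explored)
```

Setting epsilon to zero gives each individual the action currently believed best, at the risk of never correcting an unlucky early estimate; a positive epsilon guards against that, but only by assigning some individuals an action chosen at random.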
There is also some work on fairness in machine learning in other settings — for example, ranking [YS17, CSV17], selection [KRW17, KR18], personalization [CV17], bandit learning [JKM18, LRD17], human-classifier hybrid decision systems [MPZ17], and reinforcement learning [JJK17, DTB17]. But outside of classification, the literature is relatively sparse. This should be rectified, because there are interesting and important fairness issues that arise in other settings — especially when there are combinatorial constraints on the set of individuals that can be selected for a task, or when there is a temporal aspect to learning.
This material is based upon work supported by the National Science Foundation under Grant No. 1136993. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
We are indebted to all of the participants of the CCC visioning workshop, held March 18-19, 2018 in Philadelphia. The workshop discussion shaped every aspect of this document. We are grateful to Helen Wright and Ann Drobnis, who were instrumental in making the workshop happen. Finally, we are thankful to Cynthia Dwork, Sampath Kannan, Michael Kearns, Toni Pitassi, and Suresh Venkatasubramanian, who provided valuable feedback on this report.
- There is also an emerging line of work that considers causal notions of fairness (see e.g., [KCP17, KLRS17, NS18]). We intentionally avoided discussions of this potentially important direction because it will be the subject of its own CCC visioning workshop.
- Predictive policing models are generally proprietary, and so it is not clear whether arrest data is used to train the model in any deployed system.
- Alekh Agarwal, Alina Beygelzimer, Miroslav Dudík, John Langford, and Hanna Wallach. A reductions approach to fair classification. In Proceedings of the 35th International Conference on Machine Learning, ICML, volume 80 of JMLR Workshop and Conference Proceedings, pages 2569–2577. JMLR.org, 2018.
- Sarah Bird, Solon Barocas, Kate Crawford, Fernando Diaz, and Hanna Wallach. Exploring or exploiting? Social and ethical implications of autonomous experimentation in AI. 2016.
- Hamsa Bastani, Mohsen Bayati, and Khashayar Khosravi. Exploiting the natural exploration in contextual bandits. arXiv preprint arXiv:1704.09011, 2017.
- Tolga Bolukbasi, Kai-Wei Chang, James Y Zou, Venkatesh Saligrama, and Adam T Kalai. Man is to computer programmer as woman is to homemaker? debiasing word embeddings. In Advances in Neural Information Processing Systems, pages 4349–4357, 2016.
- Alex Beutel, Jilin Chen, Zhe Zhao, and Ed H Chi. Data decisions and theoretical implications when adversarially learning fair representations. arXiv preprint arXiv:1707.00075, 2017.
- Gary S Becker. The economics of discrimination. University of Chicago press, 2010.
- Joy Buolamwini and Timnit Gebru. Gender shades: Intersectional accuracy disparities in commercial gender classification. In Conference on Fairness, Accountability and Transparency, pages 77–91, 2018.
- Richard Berk, Hoda Heidari, Shahin Jabbari, Michael Kearns, and Aaron Roth. Fairness in criminal justice risk assessments: The state of the art. Sociological Methods & Research, 0(0):0049124118782533, 2018.
- Amanda Bower, Sarah N Kitchen, Laura Niss, Martin J Strauss, Alexander Vargas, and Suresh Venkatasubramanian. Fair pipelines. arXiv preprint arXiv:1707.00391, 2017.
- Aylin Caliskan, Joanna J Bryson, and Arvind Narayanan. Semantics derived automatically from language corpora contain human-like biases. Science, 356(6334):183–186, 2017.
- Sam Corbett-Davies, Emma Pierson, Avi Feller, Sharad Goel, and Aziz Huq. Algorithmic decision making and the cost of fairness. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 797–806. ACM, 2017.
- Alexandra Chouldechova. Fair prediction with disparate impact: A study of bias in recidivism prediction instruments. Big data, 5(2):153–163, 2017.
- Irene Chen, Fredrik D Johansson, and David Sontag. Why is my classifier discriminatory? 2018.
- Stephen Coate and Glenn C Loury. Will affirmative-action policies eliminate negative stereotypes? The American Economic Review, pages 1220–1240, 1993.
- L Elisa Celis, Damian Straszak, and Nisheeth K Vishnoi. Ranking with fairness constraints. arXiv preprint arXiv:1704.06840, 2017.
- Toon Calders and Sicco Verwer. Three naive bayes approaches for discrimination-free classification. Data Mining and Knowledge Discovery, 21(2):277–292, 2010.
- L Elisa Celis and Nisheeth K Vishnoi. Fair personalization. arXiv preprint arXiv:1707.02260, 2017.
- Cynthia Dwork, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Richard Zemel. Fairness through awareness. In Proceedings of the 3rd innovations in theoretical computer science conference, pages 214–226. ACM, 2012.
- Cynthia Dwork and Christina Ilvento. Fairness under composition. Manuscript, 2018.
- Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. Calibrating noise to sensitivity in private data analysis. In Theory of Cryptography Conference, pages 265–284. Springer, 2006.
- Cynthia Dwork and Aaron Roth. The algorithmic foundations of differential privacy. Foundations and Trends® in Theoretical Computer Science, 9(3–4):211–407, 2014.
- Shayan Doroudi, Philip S. Thomas, and Emma Brunskill. Importance sampling for fair policy selection. In Proceedings of the Thirty-Third Conference on Uncertainty in Artificial Intelligence, UAI. AUAI Press, 2017.
- Danielle Ensign, Sorelle A. Friedler, Scott Neville, Carlos Scheidegger, and Suresh Venkatasubramanian. Runaway feedback loops in predictive policing. In 1st Conference on Fairness, Accountability and Transparency in Computer Science (FAT*), 2018.
- Harrison Edwards and Amos Storkey. Censoring representations with an adversary. arXiv preprint arXiv:1511.05897, 2015.
- Michael Feldman, Sorelle A Friedler, John Moeller, Carlos Scheidegger, and Suresh Venkatasubramanian. Certifying and removing disparate impact. In KDD, 2015.
- Sorelle A Friedler, Carlos Scheidegger, and Suresh Venkatasubramanian. On the (im)possibility of fairness. arXiv preprint arXiv:1609.07236, 2016.
- Dean P Foster and Rakesh V Vohra. An economic argument for affirmative action. Rationality and Society, 4(2):176–188, 1992.
- Stephen Gillen, Christopher Jung, Michael Kearns, and Aaron Roth. Online learning with an unknown fairness metric. In Advances in Neural Information Processing Systems, 2018.
- Lily Hu and Yiling Chen. A short-term intervention for long-term fairness in the labor market. In Pierre-Antoine Champin, Fabien L. Gandon, Mounia Lalmas, and Panagiotis G. Ipeirotis, editors, Proceedings of the 2018 World Wide Web Conference on World Wide Web, WWW, pages 1389–1398. ACM, 2018.
- Ursula Hébert-Johnson, Michael P Kim, Omer Reingold, and Guy N Rothblum. Calibration for the (computationally-identifiable) masses. In Proceedings of the 35th International Conference on Machine Learning, ICML, volume 80 of JMLR Workshop and Conference Proceedings, pages 2569–2577. JMLR.org, 2018.
- Moritz Hardt, Eric Price, and Nati Srebro. Equality of opportunity in supervised learning. In Advances in Neural Information Processing Systems, pages 3315–3323, 2016.
- Shahin Jabbari, Matthew Joseph, Michael Kearns, Jamie Morgenstern, and Aaron Roth. Fairness in reinforcement learning. In International Conference on Machine Learning, pages 1617–1626, 2017.
- Matthew Joseph, Michael Kearns, Jamie Morgenstern, Seth Neel, and Aaron Roth. Fair algorithms for infinite and contextual bandits. In AAAI/ACM Conference on AI, Ethics, and Society, 2018.
- Matthew Joseph, Michael Kearns, Jamie H Morgenstern, and Aaron Roth. Fairness in learning: Classic and contextual bandits. In Advances in Neural Information Processing Systems, pages 325–333, 2016.
- James E Johndrow and Kristian Lum. An algorithm for removing sensitive information: application to race-independent recidivism prediction. arXiv preprint arXiv:1703.04957, 2017.
- Toshihiro Kamishima, Shotaro Akaho, and Jun Sakuma. Fairness-aware learning through regularization approach. In Data Mining Workshops (ICDMW), 2011 IEEE 11th International Conference on, pages 643–650. IEEE, 2011.
- Niki Kilbertus, Mateo Rojas Carulla, Giambattista Parascandolo, Moritz Hardt, Dominik Janzing, and Bernhard Schölkopf. Avoiding discrimination through causal reasoning. In Advances in Neural Information Processing Systems, pages 656–666, 2017.
- Michael P Kim, Amirata Ghorbani, and James Zou. Multiaccuracy: Black-box post-processing for fairness in classification. arXiv preprint arXiv:1805.12317, 2018.
- Sampath Kannan, Michael Kearns, Jamie Morgenstern, Mallesh Pai, Aaron Roth, Rakesh Vohra, and Zhiwei Steven Wu. Fairness incentives for myopic agents. In Proceedings of the 2017 ACM Conference on Economics and Computation, pages 369–386. ACM, 2017.
- Matt J Kusner, Joshua Loftus, Chris Russell, and Ricardo Silva. Counterfactual fairness. In Advances in Neural Information Processing Systems, pages 4069–4079, 2017.
- Jon M. Kleinberg, Sendhil Mullainathan, and Manish Raghavan. Inherent trade-offs in the fair determination of risk scores. In 8th Innovations in Theoretical Computer Science Conference, ITCS, 2017.
- Sampath Kannan, Jamie Morgenstern, Aaron Roth, Bo Waggoner, and Zhiwei Steven Wu. A smoothed analysis of the greedy algorithm for the linear contextual bandit problem. In Advances in Neural Information Processing Systems, 2018.
- Michael Kearns, Seth Neel, Aaron Roth, and Zhiwei Steven Wu. An empirical study of rich subgroup fairness for machine learning. arXiv preprint arXiv:1808.08166, 2018.
- Michael J. Kearns, Seth Neel, Aaron Roth, and Zhiwei Steven Wu. Preventing fairness gerrymandering: Auditing and learning for subgroup fairness. In Jennifer G. Dy and Andreas Krause, editors, Proceedings of the 35th International Conference on Machine Learning, ICML, volume 80 of JMLR Workshop and Conference Proceedings, pages 2569–2577. JMLR.org, 2018.
- Jon Kleinberg and Manish Raghavan. Selection problems in the presence of implicit bias. arXiv preprint arXiv:1801.03533, 2018.
- Michael P Kim, Omer Reingold, and Guy N Rothblum. Fairness through computationally-bounded awareness. In Advances in Neural Information Processing Systems, 2018.
- Michael Kearns, Aaron Roth, and Zhiwei Steven Wu. Meritocratic fairness for cross-population selection. In International Conference on Machine Learning, pages 1828–1836, 2017.
- Sampath Kannan, Aaron Roth, and Juba Ziani. Downstream effects of affirmative action. arXiv preprint arXiv:1808.09004, 2018.
- Lydia T Liu, Sarah Dean, Esther Rolf, Max Simchowitz, and Moritz Hardt. Delayed impact of fair machine learning. In Proceedings of the 35th International Conference on Machine Learning, ICML, 2018.
- Kristian Lum and William Isaac. To predict and serve? Significance, 13(5):14–19, 2016.
- Yang Liu, Goran Radanovic, Christos Dimitrakakis, Debmalya Mandal, and David C Parkes. Calibrated fairness in bandits. arXiv preprint arXiv:1707.01875, 2017.
- David Madras, Elliot Creager, Toniann Pitassi, and Richard Zemel. Learning adversarially fair and transferable representations. arXiv preprint arXiv:1802.06309, 2018.
- Daniel McNamara, Cheng Soon Ong, and Robert C Williamson. Provably fair representations. arXiv preprint arXiv:1710.04394, 2017.
- David Madras, Toniann Pitassi, and Richard S. Zemel. Predict responsibly: Increasing fairness by learning to defer. arXiv preprint arXiv:1711.06664, 2017.
- Razieh Nabi and Ilya Shpitser. Fair inference on outcomes. In Proceedings of the… AAAI Conference on Artificial Intelligence, volume 2018, page 1931. NIH Public Access, 2018.
- Dino Pedreschi, Salvatore Ruggieri, and Franco Turini. Discrimination-aware data mining. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 560–568. ACM, 2008.
- Jonathan Rothwell. How the war on drugs damages black social mobility. The Brookings Institution, published Sept. 30, 2014.
- Manish Raghavan, Aleksandrs Slivkins, Jennifer Wortman Vaughan, and Zhiwei Steven Wu. The unfair externalities of exploration. In Conference on Learning Theory, 2018.
- Guy N Rothblum and Gal Yona. Probably approximately metric-fair learning. In Proceedings of the 35th International Conference on Machine Learning, ICML, volume 80 of JMLR Workshop and Conference Proceedings, pages 2569–2577. JMLR.org, 2018.
- Latanya Sweeney. Discrimination in online ad delivery. Queue, 11(3):10, 2013.
- Blake Woodworth, Suriya Gunasekar, Mesrob I Ohannessian, and Nathan Srebro. Learning non-discriminatory predictors. In Conference on Learning Theory, pages 1920–1953, 2017.
- Ke Yang and Julia Stoyanovich. Measuring fairness in ranked outputs. In Proceedings of the 29th International Conference on Scientific and Statistical Database Management, page 22. ACM, 2017.
- Brian Hu Zhang, Blake Lemoine, and Margaret Mitchell. Mitigating unwanted biases with adversarial learning. 2018.
- Muhammad Bilal Zafar, Isabel Valera, Manuel Gomez-Rodriguez, and Krishna P. Gummadi. Fairness beyond disparate treatment & disparate impact: Learning classification without disparate mistreatment. In Proceedings of the 26th International Conference on World Wide Web, WWW, pages 1171–1180. ACM, 2017.
- Rich Zemel, Yu Wu, Kevin Swersky, Toni Pitassi, and Cynthia Dwork. Learning fair representations. In ICML, 2013.