We study fairness in machine learning. A learning algorithm, given a training set drawn from an underlying population, learns a classifier that will be used to make decisions about individuals. The concern is that this classifier’s decisions might be discriminatory, favoring certain subpopulations over others. The seminal work of Dwork et al. [ITCS 2012] introduced fairness through awareness, positing that a fair classifier should treat similar individuals similarly. Similarity between individuals is measured by a task-specific similarity metric. In the context of machine learning, however, this fairness notion faces serious difficulties, as it does not generalize and can be computationally intractable.
We introduce a relaxed notion of approximate metric-fairness, which allows a small fairness error: for a random pair of individuals sampled from the population, with all but a small probability of error, if they are similar then they are treated similarly. In particular, this provides discrimination protections to every subpopulation that is not too small. We show that approximate metric-fairness does generalize from a training set to the underlying population, and we leverage these generalization guarantees to construct polynomial-time learning algorithms that achieve competitive accuracy subject to fairness constraints.
1 Introduction
Machine learning is increasingly used to make consequential classification decisions about individuals. Examples range from predicting whether a user will enjoy a particular article, to estimating a felon’s recidivism risk, to determining whether a patient is a good candidate for a medical treatment. Automated classification comes with great benefits, but it also raises substantial societal concerns (cf. [O’N16] for a recent perspective). One prominent concern is that these algorithms might discriminate against individuals or groups in a way that violates laws or social and ethical norms. This might happen due to biases in the training data or due to biases introduced by the algorithm. To address these concerns, and to truly unleash the full potential of automated classification, there is a growing need for frameworks and tools to mitigate the risks of algorithmic discrimination. A growing literature attempts to tackle these challenges by exploring different fairness criteria.
Discrimination can take many guises. It can be difficult to spot and difficult to define. Imagine a protected minority population $P$ (defined by race, gender identity, political affiliation, etc.). A natural approach for protecting the members of $P$ from discrimination is to make sure that they are not mistreated on average: for example, by requiring that, on average, members of $P$ and individuals outside of $P$ are classified in any particular way with roughly the same probability. This is a “group-level” notion of fairness, sometimes referred to as statistical parity.
Pointing out several weaknesses of group-level notions of fairness, the seminal work of [DHP12] introduced a notion of individual fairness. Their notion relies on a task-specific similarity metric that specifies, for every two individuals, how similar they are with respect to the specific classification task at hand. Given such a metric, similar individuals should be treated similarly, i.e., assigned similar classification distributions (their focus was on probabilistic classifiers, as will be ours). In this work, we refer to their fairness notion as perfect metric-fairness.
Given a good metric, perfect metricfairness provides powerful protections from discrimination. Furthermore, the metric provides a vehicle for specifying social norms, cultural awareness, and taskspecific knowledge. While coming up with a good metric can be challenging, metrics arise naturally in prominent existing examples (such as credit scores and insurance risk scores), and in natural scenarios (a metric specified by an external regulator). Dwork et al. studied the goal of finding a (probabilistic) classifier that minimizes utility loss (or maximizes accuracy), subject to satisfying the perfect metricfairness constraint. They showed how to phrase and solve this optimization problem for a given collection of individuals.
1.1 This Work: Approximately Metric-Fair Machine Learning
Building on these foundations, we study metric-fair machine learning. Consider a learner that is given a similarity metric and a training set of labeled examples, drawn from an underlying population distribution. The learner should output a fair classifier that (to the extent possible) accurately classifies the underlying population.
This goal departs from the scenario studied in [DHP12], where the focus was on guaranteeing metric-fairness and utility for the dataset at hand. Generalization of the fairness guarantee is a key difference: we focus on guaranteeing fairness not just for the (training) data set at hand, but also for the underlying population from which it was drawn. We note that perfect metric-fairness does not, as a rule, generalize from a training set to the underlying population. This presents computational difficulties for constructing learning algorithms that are perfectly metric-fair for the underlying population. Indeed, we exhibit a simple learning task that, while easy to learn without fairness constraints, becomes computationally infeasible under the perfect metric-fairness constraint (given a particular metric). See below and in Section 1.6 for further details.
Footnote 1. We remark that perfect metric-fairness can always be obtained trivially by outputting a constant classifier that treats all individuals identically; the challenge is achieving metric-fairness together with nontrivial accuracy.
We develop a relaxed approximate metric-fairness framework for machine learning, where fairness does generalize from the training set to the underlying population, and present polynomial-time fair learning algorithms in this framework. We proceed to describe our setting and contributions.
Problem setting.
A metric-fair learning problem is defined by an instance domain $\mathcal{X}$, a similarity metric $d$ on $\mathcal{X}$, and a distribution $\mathcal{D}$ over labeled examples $(x, y)$ with $x \in \mathcal{X}$. A metric-fair learning algorithm gets as input the metric $d$ and a sample $S$ of labeled examples, drawn i.i.d. from $\mathcal{D}$, and outputs a classifier $h$. To accommodate fairness, we focus on probabilistic classifiers $h : \mathcal{X} \to [0,1]$, where we interpret $h(x)$ as the probability of label 1 (the probability of the other label is thus $1 - h(x)$).
Approximate Metric-Fairness.
Taking inspiration from Valiant’s celebrated PAC learning model [Val84], we allow a small fairness error, which opens the door to generalization. We require that for two individuals sampled from the underlying population, with all but a small probability, if they are similar then they should be treated similarly. Similarity is measured by the statistical distance between the classification distributions given to the two individuals (we also allow a small additive slack in the similarity measure). We refer to this condition as approximate metricfairness (MF). Similarly to PAC learning, we also allow a small probability of a complete fairness failure.
Given a welldesigned metric, approximate metricfairness guarantees that almost every individual gets fair treatment compared to almost every other individual. In particular, it provides discriminationprotections to every group that is not too small. However, this guarantee also has limitations: particular individuals and even small groups might encounter bias and discrimination. There are certainly settings in which this is problematic, but in other settings protecting all groups that are not too small is an appealing guarantee. Moreover, approximate fairness opens the door to fairnessgeneralization bounds, as well as efficient learning algorithms for a rich collection of problems (see below). We elaborate on these choices and their consequences in Section 1.2.
Competitive accuracy.
Turning our attention to the accuracy objective, we follow [DHP12] in considering fairness to be a hard constraint (e.g. imposed by a regulator). Given the fairness constraint, what is a reasonable accuracy objective? Ideally, we would like the predictor’s accuracy to approach (as the sample size grows) that of the most accurate approximately MF predictor. This is analogous to the accuracy guarantee pioneered in [DHP12]. A probably approximately correct and fair (PACF) learning algorithm guarantees both approximate MF and “best-possible” accuracy. A more relaxed accuracy benchmark is approaching the accuracy of the best classifier that is approximately MF for a tighter (more restrictive) fairness error. We refer to this as a relaxed PACF learning algorithm (looking ahead, our efficient algorithms achieve this relaxed accuracy guarantee). We note that even relaxed PACF guarantees that the classifier is (at the very least) competitive with the best perfectly metric-fair classifier. We elaborate in Section 1.3.
Generalization bounds.
A key issue in learning theory is that of generalization: to what extent is a classifier that is accurate on a finite sample also guaranteed to be accurate w.r.t the underlying distribution? We develop strong generalization bounds for approximate metricfairness, showing that for any class of predictors with bounded Rademacher complexity, approximate MF on the sample implies approximate MF on the underlying distribution (w.h.p. over the choice of sample ). The use of Rademacher complexity guarantees fairnessgeneralization for finite classes and also for many infinite classes. Proving that approximate metricfairness generalizes well is a crucial component in our analysis: it opens the door to polynomialtime algorithms that can focus on guaranteeing fairness (and accuracy) on the sample. Generalization also implies informationtheoretic samplecomplexity bounds for PACF learning that are similar to those known for PAC learning (without any fairness constraints). We elaborate in Section 1.4.
Efficient algorithms.
We construct polynomial-time (relaxed) PACF algorithms for linear and logistic regression. Recall that (for fairness) we focus on regression problems: learning predictors that assign a probability in $[0,1]$ to each example. For linear predictors, the probability is a linear function of an example’s distance from a hyperplane. Logistic predictors compose a linear function with a sigmoidal transfer function. This allows logistic predictors to exhibit sharper transitions from low predictions to high predictions. In particular, a logistic predictor can better approximate a classifier that labels examples that are below a hyperplane by $-1$, and examples that are above the hyperplane by 1. Linear and logistic predictors can be more powerful than they first seem: by embedding a learning problem into a higher-dimensional space, linear functions (over the expanded space) can capture the power of many of the function classes that are known to be PAC learnable [HS07]. We overview these results in Section 1.5. We note that a key challenge in efficient metric-fair learning is that the fairness constraints are neither Lipschitz nor convex (even when the predictor is linear). This is also a challenge for proving generalization and sample complexity bounds. Berk et al. [BHJ17] also study fair regression and formulate a measure of individual fairness loss, albeit in a different setting without a metric (see Section 1.7).
Perfect metric-fairness is hard.
Under mild cryptographic assumptions, we exhibit a learning problem and a similarity metric where: there exists a perfectly fair and perfectly accurate simple (linear) predictor, but any polynomialtime perfectly metricfair learner can only find a trivial predictor, whose error approaches 1/2. In contrast, there does exist a polynomialtime (relaxed) PACF learning algorithm for this task. This is an important motivation for our study of approximate metricfairness. We elaborate in Section 1.6.
Organization.
In the remainder of this section we provide an overview of our contributions. Section 1.2 details and discusses the definition of approximate metricfairness and its relationship to related works. Accurate and fair (PACF) learning is discussed in Section 1.3. We state and prove fairnessgeneralization bounds in Section 1.4. Our polynomialtime PACF learning algorithms for linear and logistic regression are in Section 1.5. Section 1.6 elaborates on the hardness of perfectly metricfair learning. Further related work is discussed in Section 1.7.
1.2 Approximate Metric-Fairness
We require that metric-fairness holds for all but a small $\alpha$ fraction of pairs of individuals. I.e., with all but $\alpha$ probability over a choice of two individuals from the underlying distribution, if the two individuals are similar then they get similar classification distributions. We think of $\alpha$ as a small constant, and note that setting $\alpha = 0$ recovers the definition of perfect metric-fairness (thus, setting $\alpha$ to be a small constant larger than 0 is indeed a relaxation). Similarity is measured by the statistical distance between the classification distributions given to the two individuals, where we also allow a small additive slack $\gamma$ in the similarity measure. The larger $\gamma$ is, the more “differently” similar individuals might be treated. We think of $\gamma$ as a small constant, close to 0.
Definition 1.1.
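A predictor $h$ is $(\alpha, \gamma)$-approximately metric-fair (MF) with respect to a similarity metric $d$ and a data distribution $\mathcal{D}$ if:
$$\Pr_{x, x' \sim \mathcal{D}} \Big[ \, |h(x) - h(x')| > d(x, x') + \gamma \, \Big] \le \alpha \qquad (1)$$
The MF loss of the classifier $h$ on the pair $(x, x')$ is 1 if the (internal) inequality in Equation (1) holds, and 0 otherwise (hence, we refer to this as a 0-1 fairness loss). A classifier is $(\alpha, \gamma)$-approximately MF if with all but $\alpha$ probability over two individuals sampled from $\mathcal{D}$, the MF loss is 0.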
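To make the definition concrete, the following is a minimal sketch that estimates the probability in Equation (1) by sampling random pairs. It assumes the pairwise condition is violated when the predictions differ by more than the metric distance plus an additive slack (called `gamma` here), and all names (`mf_loss`, `estimate_violation_rate`, the toy metric and predictors) are hypothetical illustrations, not part of the paper.

```python
import random

def mf_loss(h, d, x, x_prime, gamma):
    """0-1 MF loss on one pair: 1 if the predictions differ by more than d + gamma."""
    return 1 if abs(h(x) - h(x_prime)) > d(x, x_prime) + gamma else 0

def estimate_violation_rate(h, d, draw, gamma, n_pairs=20000, seed=0):
    """Monte Carlo estimate of the probability that a random pair has MF loss 1."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_pairs):
        x, x_prime = draw(rng), draw(rng)
        total += mf_loss(h, d, x, x_prime, gamma)
    return total / n_pairs

# Toy setup (assumed for illustration): individuals are points in [0, 1]
# and the metric is the absolute difference.
d = lambda a, b: abs(a - b)
draw = lambda rng: rng.random()
identity = lambda x: x                          # 1-Lipschitz w.r.t. d, never violates
threshold = lambda x: 1.0 if x >= 0.5 else 0.0  # sharp jump at 0.5

rate_identity = estimate_violation_rate(identity, d, draw, gamma=0.05)
rate_threshold = estimate_violation_rate(threshold, d, draw, gamma=0.05)
```

In this toy setting the identity predictor has violation rate 0, so it is approximately MF for any fairness error, while the threshold predictor violates the condition on roughly half of all pairs and is only approximately MF if the allowed error is about one half.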
Similarly to the PAC learning model, we also allow a small probability $\delta$ of failure. This probability is taken over the choice of the training set and over the learner’s coins. For example, $\delta$ bounds the probability that the randomly sampled training set is not representative of the underlying population. We think of $\delta$ as very small or even negligible. A learning algorithm is probably approximately metric-fair if with all but $\delta$ probability over the sample (and the learner’s coins), it outputs a classifier that is approximately MF. Further details are in Section 2.
Given a well-designed metric, approximate metric-fairness (for a sufficiently small fairness error $\alpha$) guarantees that almost every individual gets fair treatment compared to almost every other individual (see Section 2.3 for a quantitative discussion). Every protected group $P$ of fractional size significantly larger than $\alpha$ is protected in the sense that, on average, members of $P$ are treated similarly to similar individuals outside of $P$. We note, however, that this guarantee does not protect single individuals or small groups (see the discussion in Section 1.1).
Between group and individual fairness: related works.
Recent works [HJKRR17, KNRW17] study fairness notions that aim to protect large collections of sufficiently large groups. Similarly to our work, these can be viewed as falling between individual and group notions of fairness. A distinction from these works is that approximate metric-fairness protects every sufficiently large group, rather than a large collection of groups that is fixed a priori. Recent works [GJKR18, KRR18] extend the study of metric fairness to settings where the metric is not known (whereas we focus on a setting where the metric is fixed and known in its entirety), and consider relaxed fairness notions that allow individual fairness to be violated.
1.3 Accurate and Fair Learning
Our goal is obtaining learning algorithms that are probably approximately metricfair, and that simultaneously guarantee nontrivial accuracy. Recall that fairness, on its own, can always be obtained by outputting a constant classifier that ignores its input and treats all individuals identically (indeed, such a classifier is perfectly metricfair). It is the combination of the fairness and the accuracy objectives that makes for an interesting task. As discussed above, we follow [DHP12] in focusing on finding a classifier that maximizes accuracy, subject to the approximate metricfairness constraint. This is a natural formulation, as we think of fairness as a hard requirement (imposed, for example, by a regulator), and thus fairness cannot be traded off for better accuracy.
Problem Statement.
A learning problem is defined by an instance domain $\mathcal{X}$, a class of classifiers $\mathcal{H}$, and a distribution $\mathcal{D}$ over labeled examples $(x, y)$ with $x \in \mathcal{X}$. A fair learning problem also includes a similarity metric $d$. The learning algorithm gets as input the metric $d$ and a sample $S$ of labeled examples, drawn i.i.d. from $\mathcal{D}$, and its goal is to output a (probabilistic) classifier $h$ that is both fair and as accurate as possible. To accommodate the fairness constraints, we allow the learned classifier to return real values in $[0,1]$, where we interpret $h(x)$ as the probability that the label is 1. We refer to such probabilistic classifiers as predictors. A proper learner outputs a predictor in the class $\mathcal{H}$, whereas an improper learner’s output is unconstrained (but $\mathcal{H}$ is used as a benchmark for accuracy). For a learned (real-valued) predictor $h$, we use $\mathrm{err}_{\mathcal{D}}(h)$ to denote the expected $\ell_1$ error of $h$ (the absolute loss) on a random sample from $\mathcal{D}$.
Footnote 2. All results also translate to $\ell_2$ error (the squared loss).
Accuracy guarantee: PACF learning.
As discussed above, the goal in metricfair and accurate learning is optimizing the predictor’s accuracy subject to the fairness constraint. Ideally, we aim to approach (as the sample size grows) the error rate of the most accurate classifier that satisfies the fairness constraints. A more relaxed benchmark is guaranteeing approximate metricfairness, while approaching the accuracy of the best classifier that is approximately metricfair, for and . Our efficient learning algorithms will achieve this more relaxed accuracy goal (see below). We note that even relaxed competitiveness means that the classifier is (at the very least) competitive with the best perfectly metricfair classifier.
These goals are captured in the following definition of probably approximately correct and fair (PACF) learning. Crucially, both fairness and accuracy goals are stated with respect to the (unknown) underlying distribution.
Definition 1.2 (PACF Learning).
A learning algorithm PACFlearns a hypothesis class if for every metric and population distribution , every required fairness parameters , every failure probability , and every error parameters the following holds:
There exists a sample complexity and constants (specified below), such that with all but probability over an i.i.d. sample of size and ’s coin tosses, the output predictor satisfies the following two conditions:

Fairness: is approximately metricfair w.r.t. the metric and the distribution .

Accuracy: Let denote the subclass of hypotheses in that are approximately metricfair, then:
We say that is efficient if it runs in time . If accuracy holds for and , then we say that is a strong PACF learning algorithm. Otherwise, we say that is a relaxed PACF learning algorithm.
See Section 3 and Definitions 3.2 and 3.3 for a full treatment. Note that the accuracy guarantee is agnostic: we make no assumptions about the way the training labels are generated. Agnostic learning is particularly well suited to our metricfairness setting: since we make no assumptions about the metric , even if the labels are generated by , it might be the case that does not allow for accurate predictions, in which case a fair learner cannot compete with ’s accuracy.
1.4 Generalization
Generalization is a key issue in learning theory. We develop strong generalization bounds for approximate metricfairness, showing that with high probability, guaranteeing empirical approximate MF on a training set also guarantees approximate MF on the underlying distribution (w.h.p. over the choice of sample ). This generalization bound opens the door to polynomialtime algorithms that can focus on guaranteeing fairness (and accuracy) on the sample and effectively rules out the possibility of creating a “false facade” of fairness (i.e, a classifier that appears fair on a random sample, but is not fair w.r.t new individuals).
Towards proving generalization, we define the empirical fairness loss on a sample $S$ (a training set). Fixing a fairness parameter $\gamma$, a predictor $h$ and a pair of individuals $x, x'$ in the training set, consider the MF loss on the “edge” between $x$ and $x'$ (recall that the MF loss is 1 if the “internal” inequality of Equation (1) holds, and 0 otherwise). Observe that the losses on the edges are not independent random variables (over the choice of $S$), because each individual affects many edges. Thus, rather than count the empirical MF loss over all edges, we restrict ourselves to a “matching” in the complete graph whose vertices are the individuals in $S$: a collection of $|S|/2$ edges, where each individual is involved in exactly one edge. The empirical MF loss of $h$ on $S$ is defined as the average MF loss over edges in the matching. Note that, since we restricted our attention to a matching, the MF losses on these edges are now independent random variables (over the choice of $S$). A classifier is empirically $(\alpha, \gamma)$-approximately MF if its empirical MF loss is at most $\alpha$. We are now ready to state our generalization bound:
Footnote 3. The choice of which matching is used does not affect any of the results. Note that we could also choose to average over all the edges in the graph induced by $S$. Generalization bounds still follow, but the rate of convergence is not faster than restricting our attention to a matching.
Theorem 1.3.
Let be a hypothesis class with Rademacher complexity . For every and every , there exists a sample complexity , such that the following holds:
With probability at least over an i.i.d sample , simultaneously for every : if is approximately metricfair on the sample , then is also approximately metricfair on the underlying distribution .
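The empirical quantity that this theorem refers to can be sketched as follows. This is an illustrative sketch only: the matching simply pairs consecutive sample points (any matching fixed independently of the data values would do, per the footnote above), the names `matching`, `empirical_mf_loss`, and `is_empirically_mf` are hypothetical, and the pairwise condition is assumed to be "prediction gap exceeds metric distance plus slack `gamma`".

```python
def matching(sample):
    """Pair consecutive individuals: (s0, s1), (s2, s3), ...; a leftover point is dropped."""
    return [(sample[i], sample[i + 1]) for i in range(0, len(sample) - 1, 2)]

def empirical_mf_loss(h, d, sample, gamma):
    """Average 0-1 MF loss over the edges of the matching."""
    pairs = matching(sample)
    violations = sum(1 for a, b in pairs if abs(h(a) - h(b)) > d(a, b) + gamma)
    return violations / len(pairs)

def is_empirically_mf(h, d, sample, alpha, gamma):
    """Empirical approximate MF: the matched-pair loss is at most alpha."""
    return empirical_mf_loss(h, d, sample, gamma) <= alpha

# Toy data (assumed): points in [0, 1] with the absolute-difference metric.
d = lambda a, b: abs(a - b)
threshold = lambda x: 1.0 if x >= 0.5 else 0.0
sample = [0.1, 0.9, 0.2, 0.3, 0.6, 0.7, 0.4, 0.8]
loss = empirical_mf_loss(threshold, d, sample, gamma=0.1)
```

Here two of the four matched pairs straddle the threshold and are treated too differently, so the empirical MF loss of the threshold predictor is 0.5.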
See Section 2.5 and Theorem 2.12 for a full statement and discussion (and see Definition 2.11 for a definition of Rademacher complexity). Rademacher complexity differs from the celebrated VC-dimension in several respects: first, it is defined for any class of real-valued functions (making it suitable for our setting of learning probabilistic classifiers); second, it is data-dependent and can be measured from finite samples (indeed, Theorem 1.3 can be stated w.r.t. the empirical Rademacher complexity on a given sample); third, it often results in tighter uniform convergence bounds (see, e.g., [KP02]). We note that for every finite hypothesis class $\mathcal{H}$ whose range is $[0,1]$, the Rademacher complexity on samples of size $m$ is bounded by $O(\sqrt{\log |\mathcal{H}| / m})$.
Technical Overview of Theorem 1.3.
For any class of (bounded) real-valued functions $\mathcal{F}$, the maximal difference (over all functions $f \in \mathcal{F}$) between the function’s empirical average on a randomly drawn sample, and the function’s true expectation over the underlying distribution, can be bounded in terms of the Rademacher complexity of the class (as well as the sample size and desired confidence). For a hypothesis class $\mathcal{H}$ and a loss function $\ell$, applying this result for the class $\ell \circ \mathcal{H}$ yields a bound on the maximal difference (over all hypotheses $h \in \mathcal{H}$) between the true loss and the empirical loss, in terms of the Rademacher complexity of the composed class $\ell \circ \mathcal{H}$. If the loss function is $G$-Lipschitz, this can be converted to a bound in terms of the Rademacher complexity of $\mathcal{H}$ using the fact that $R(\ell \circ \mathcal{H}) \le G \cdot R(\mathcal{H})$.
Turning our attention to generalization of the fairness guarantee, we are faced with the problem that our “0-1” MF loss function is not Lipschitz. We resolve this by defining an approximation to the MF loss that is a piecewise linear and Lipschitz function. The approximation does generalize, and so we conclude that the empirical MF loss is close to the empirical value of the approximation, which is close to the true value of the approximation, which in turn is close to the true MF loss. The approximation incurs an additive slack in the fairness guarantee. The approximation is controlled by a parameter: the larger it is, the more accurately the approximation tracks the MF loss, but this comes at the price of increasing the Lipschitz constant (which hurts generalization). The generalization theorem statement above reflects a choice of this parameter that trades off these conflicting concerns.
1.4.1 Information-Theoretic Sample Complexity
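One common way to build such a piecewise linear surrogate is a ramp function; the sketch below uses it purely as an illustration, and the paper's exact construction may differ. The steepness parameter `tau`, the slack `gamma`, and both function names are assumptions introduced here. The ramp is `tau`-Lipschitz and is sandwiched between the 0-1 loss with slack `gamma + 1/tau` and the 0-1 loss with slack `gamma`, which is where an additive slack of `1/tau` in the fairness guarantee would come from.

```python
def zero_one_mf_loss(gap, gamma):
    """gap = |h(x) - h(x')| - d(x, x'); the 0-1 loss fires once gap exceeds gamma."""
    return 1.0 if gap > gamma else 0.0

def ramp_mf_loss(gap, gamma, tau):
    """Piecewise-linear surrogate: 0 up to gamma, then slope tau, capped at 1."""
    return min(1.0, max(0.0, tau * (gap - gamma)))

gamma, tau = 0.1, 10.0
grid = [i / 100.0 for i in range(-100, 101)]

# Sandwich: 0-1 loss at slack gamma + 1/tau <= ramp <= 0-1 loss at slack gamma.
sandwiched = all(
    zero_one_mf_loss(g, gamma + 1.0 / tau)
    <= ramp_mf_loss(g, gamma, tau)
    <= zero_one_mf_loss(g, gamma)
    for g in grid
)

# Lipschitz check on neighbouring grid points (small tolerance for rounding).
lipschitz = all(
    abs(ramp_mf_loss(a, gamma, tau) - ramp_mf_loss(b, gamma, tau))
    <= tau * abs(a - b) + 1e-12
    for a, b in zip(grid, grid[1:])
)
```

Increasing `tau` narrows the gap between the two 0-1 losses in the sandwich but raises the Lipschitz constant, which mirrors the trade-off described above.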
The fairnessgeneralization result of Theorem 1.3 implies that, from a samplecomplexity perspective, any hypothesis class is strongly PACF learnable, with sample complexity comparable to that of standard PAC learning. An exponentialtime PACF learning algorithm simply finds the predictor in that minimizes the empirical error, while also satisfying empirical approximate metricfairness.
Theorem 1.4.
Let be a hypothesis class with Rademacher complexity . Then is informationtheoretically strongly PACF learnable with sample complexity , for .
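For a finite hypothesis class, the brute-force learner described above (minimize empirical error among predictors that are empirically approximately MF) can be sketched as follows. This is a toy illustration under stated assumptions: labels are taken in {0, 1} so that the absolute loss `|h(x) - y|` is meaningful for predictions in [0, 1], the matching pairs consecutive examples, and `fair_erm` and the three candidate predictors are hypothetical names.

```python
def fair_erm(hypotheses, d, labeled_sample, alpha, gamma):
    """Return the empirically most accurate hypothesis whose matched-pair
    MF loss is at most alpha, together with its empirical absolute error."""
    xs = [x for x, _ in labeled_sample]
    pairs = [(xs[i], xs[i + 1]) for i in range(0, len(xs) - 1, 2)]
    best, best_err = None, float("inf")
    for h in hypotheses:
        mf = sum(1 for a, b in pairs if abs(h(a) - h(b)) > d(a, b) + gamma) / len(pairs)
        if mf > alpha:
            continue  # fairness is a hard constraint: skip unfair candidates
        err = sum(abs(h(x) - y) for x, y in labeled_sample) / len(labeled_sample)
        if err < best_err:
            best, best_err = h, err
    return best, best_err

# Toy problem (assumed): points in [0, 1], absolute-difference metric.
d = lambda a, b: abs(a - b)
threshold = lambda x: 1.0 if x >= 0.5 else 0.0  # accurate but unfair
identity = lambda x: x                          # fair and fairly accurate
constant = lambda x: 0.5                        # trivially fair, inaccurate
data = [(0.1, 0), (0.9, 1), (0.2, 0), (0.8, 1)]

best, best_err = fair_erm([threshold, identity, constant], d, data, alpha=0.1, gamma=0.05)
```

The threshold predictor has zero empirical error but is rejected by the fairness constraint, so the search returns the identity predictor. The loop visits every hypothesis, which is why this approach is not efficient for large or infinite classes.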
1.5 Efficient Fair Learning
One of our primary contributions is the construction of polynomialtime relaxedPACF learning algorithms for expressive hypothesis classes. We focus on linear classification tasks, where the labels are determined by a separating hyperplane. Learning linear classifiers, also referred to as halfspaces or linear threshold functions, is a central tool in machine learning. By embedding a learning problem into a higherdimensional space, linear classifiers (over the expanded space) can capture surprisingly strong classes, such as polynomial threshold functions (see, for example, the discussion in [HS07]). The “kernel trick” (see, e.g, [SSBD14]) can allow for efficient solutions even over very high (or infinite) dimensional embeddings. Many of the known (distributionfree) PAC learning algorithms can be derived by learning linear threshold functions [HS07].
Recall that in metric-fair learning, we aim to learn a probabilistic classifier, or a predictor, that outputs a real value in $[0,1]$. We interpret the output as the probability of assigning the label 1. We are thus in the setting of regression. We show polynomial-time relaxed-PACF learning algorithms for linear regression and for logistic regression. See Section 5 for full and formal details.
1.5.1 Linear Regression
Linear regression, the task of learning linear predictors, is an important and well-studied problem in the machine learning literature. In terms of accuracy, this is an appealing class when we expect a linear relationship between the probability of the label being 1 and the distance from a hyperplane. Taking the domain to be the unit ball, we define the class of linear predictors as:
We restrict the weight vector $w$ to the unit ball to guarantee that $\langle w, x \rangle \in [-1, 1]$. We then invoke a linear transformation so that the final prediction is in $[0,1]$, as required. Restricting the predictor’s output to the range $[0,1]$ is important. In particular, it means that a linear predictor must be Lipschitz, which might not be appropriate for certain classification tasks (see the discussion of logistic regression below).
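A small sketch of such a predictor is given below. The affine map `(1 + <w, x>) / 2` is an assumption chosen here as the simplest linear transformation taking [-1, 1] to [0, 1]; the paper's exact definition may differ, and `linear_predictor` is a hypothetical helper.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def linear_predictor(w, tol=1e-9):
    """Build h_w(x) = (1 + <w, x>) / 2 for a weight vector w in the unit ball."""
    if norm(w) > 1.0 + tol:
        raise ValueError("w must lie in the unit ball")

    def h(x):
        if norm(x) > 1.0 + tol:
            raise ValueError("x must lie in the unit ball")
        # By Cauchy-Schwarz, <w, x> is in [-1, 1], so the output is in [0, 1].
        return (1.0 + dot(w, x)) / 2.0

    return h

h = linear_predictor([0.6, 0.8])
p_far_above = h([0.6, 0.8])    # on the positive side, at distance 1
p_on_plane = h([0.0, 0.0])     # on the hyperplane
p_far_below = h([-0.6, -0.8])  # on the negative side, at distance 1
```

Under this assumed map the prediction changes by at most half the Euclidean distance between two inputs, which illustrates why a bounded linear predictor cannot exhibit sharp transitions.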
We show a relaxed PACF learning algorithm for :
Theorem 1.5.
is relaxed PACF learnable with sample and time complexities of . For every and , the accuracy of the learned predictor approaches (or beats) the most accurate approximately MF predictor.
Algorithm overview.
Since the Rademacher complexity of (bounded) linear functions is small [KST09], Theorem 1.3 implies that empirical approximate metricfairness on the training set generalizes to the underlying population. Thus, given the metric and a training set, our task is to find a linear predictor that is as accurate as possible, conditioned on the empirical fairness constraint. We use to denote the class of linear predictors defined above. Fixing desired fairness parameters , let be the subset of linear functions that are also approximately MF on the training set. Given a training set of labeled examples, we would like to solve the following optimization problem:
Observe, however, that this feasible set is not a convex set. This is a consequence of the “0-1” metric-fairness loss. Thus, we do not know how to solve the above optimization problem efficiently. Instead, we will further constrain the predictor by bounding its $\ell_1$ MF loss. For a predictor $h$, let its (empirical) $\ell_1$ MF violation be given by:
For , we take to be the set of linear predictors s.t. . For any fixed , this is a convex set, and we can find the most (empirically) accurate predictor in in polynomial time. For fairness, we show that small fairness loss also implies the standard notion of approximate metricfairness (with related parameters ). For accuracy, we also show that approximate metricfairness (with smaller fairness parameters) implies small loss. Thus, optimizing over predictors whose loss is bounded gives a predictor that is competitive with (a certain class of) approximately MF predictors. In particular for we have:
Thus, by picking we guarantee (empirical) approximate metricfairness. Moreover, for any choice of , the set over which we optimize contains all of the predictors that are approximately MF. Thus, our (empirical) accuracy is competitive with all such predictors, and we obtain a relaxed PACF algorithm. The empirical fairness and accuracy guarantees generalize beyond the training set by Theorem 1.3 (fairnessgeneralization) and a standard uniform convergence argument for accuracy.
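The convexity that this argument relies on can be checked numerically. The sketch below assumes the violation is the average, over matched pairs, of `max(0, |h(x) - h(x')| - d(x, x'))`, and that linear predictors take the form `(1 + <w, x>) / 2`; both the normalization and the predictor form are assumptions made here for illustration, and `l1_mf_violation` is a hypothetical name. Under these assumptions the violation is a maximum of functions that are affine in `w`, averaged over pairs, hence convex in `w`.

```python
import math

def l1_mf_violation(w, pairs, d):
    """Average over matched pairs of max(0, |h_w(x) - h_w(x')| - d(x, x')),
    for the assumed linear predictor h_w(x) = (1 + <w, x>) / 2."""
    total = 0.0
    for x, x_prime in pairs:
        diff = abs(sum(wi * (a - b) for wi, a, b in zip(w, x, x_prime))) / 2.0
        total += max(0.0, diff - d(x, x_prime))
    return total / len(pairs)

# Toy metric (assumed): a scaled Euclidean distance.
d = lambda x, y: 0.1 * math.dist(x, y)
pairs = [((1.0, 0.0), (-1.0, 0.0)), ((0.0, 0.5), (0.0, -0.5))]

w1, w2 = (1.0, 0.0), (0.0, 1.0)
mid = tuple((a + b) / 2.0 for a, b in zip(w1, w2))
v1, v2, v_mid = (l1_mf_violation(w, pairs, d) for w in (w1, w2, mid))
```

On this toy data the violation at the midpoint of `w1` and `w2` does not exceed the average of the violations at the endpoints, as convexity requires. A full learner would hand the constraint "violation at most some bound" together with the empirical error objective to a convex solver, which is not shown here.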
1.5.2 Logistic Regression
Logistic regression is another appealing class. Here, the prediction need not be a linear function of the distance from a hyperplane. Rather, we allow the use of a sigmoid function $\phi_\ell$ defined as $\phi_\ell(z) = \frac{1}{1 + \exp(-4 \ell z)}$ (which is continuous and $\ell$-Lipschitz). The class of logistic predictors is formed by composing a linear function with a sigmoidal transfer function:
(2) 
The sigmoidal transfer function gives the predictor the power to exhibit sharper transitions from low predictions to high predictions around a certain distance (or decision) threshold. For example, suppose a distance from the hyperplane provides a quality score for candidates with respect to a certain task. Suppose also that an employer wants to hire candidates whose quality scores are above some threshold $t$. The class of logistic predictors can give probabilities close to 0 to candidates whose quality scores are under $t$, and probabilities close to 1 to candidates whose quality scores are over $t$. Linear predictors, on the other hand, need to be Lipschitz (since we restrict their output to be in $[0,1]$, see Section 1.5.1). Logistic predictors seem considerably better suited to this type of scenario. Indeed, the class can achieve good accuracy on linearly separable data whose margin (i.e. the expected distance from the hyperplane) is larger than the reciprocal of the sigmoid’s Lipschitz constant. Moreover, similarly to linear threshold functions, logistic regression can be applied after embedding the learning problem into a higher-dimensional space. For example, in the “quality score” example above, the score could be computed by a low-degree polynomial.
Footnote 4. This might, at first glance, seem like a technicality. After all, why not simply consider linear predictors whose output can be in a larger range? The problem is that it isn’t clear how to plug these larger values into the fairness constraints in a way that keeps the optimization problem convex and also has competitive accuracy.
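The contrast between the two classes can be illustrated with a short sketch. The sigmoid form `1 / (1 + exp(-4 * ell * z))` is an assumption (a common normalization whose slope at 0 equals `ell`), as is the linear map `(1 + z) / 2`; the helper names and the sample points are hypothetical.

```python
import math

def sigmoid(z, ell):
    """Assumed form 1 / (1 + exp(-4 * ell * z)); its slope at z = 0 is ell."""
    return 1.0 / (1.0 + math.exp(-4.0 * ell * z))

def logistic_predictor(w, ell):
    def h(x):
        return sigmoid(sum(wi * xi for wi, xi in zip(w, x)), ell)
    return h

def linear_predictor(w):
    def h(x):
        return (1.0 + sum(wi * xi for wi, xi in zip(w, x))) / 2.0
    return h

w = [1.0, 0.0]
just_below, just_above = [-0.2, 0.0], [0.2, 0.0]  # at distance 0.2 from the hyperplane

sharp = logistic_predictor(w, ell=8.0)
flat = linear_predictor(w)

sharp_gap = sharp(just_above) - sharp(just_below)  # nearly a full 0-to-1 jump
flat_gap = flat(just_above) - flat(just_below)     # limited by the slope of 1/2
```

With `ell = 8`, two points at distance 0.2 on either side of the hyperplane receive predictions near 1 and near 0, whereas the bounded linear predictor separates them by only 0.2.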
Our primary technical contribution is a polynomialtime relaxed PACF learner for where is constant.
Theorem 1.6.
For every constant , is relaxed PACF learnable with sample and time complexities of . For every and , the accuracy of the learned predictor approaches (or beats) the most accurate approximately MF predictor.
More generally, our algorithm is exponential in the sigmoid’s Lipschitz parameter. Recall that we expect logistic predictors to have good accuracy on linearly separable data whose margins are larger than the reciprocal of this parameter. Thus, one can interpret the algorithm as having runtime that is exponential in the reciprocal of the (expected) margin.
Algorithm overview.
We note that fair learning of logistic predictors is considerably more challenging than the linear case, because the sigmoidal transfer function specifies nonconvex fairness constraints. In standard logistic regression, where fairness is not a concern, polynomialtime learning is achieved by replacing the standard loss with a convex logistic loss. In metricfair learning, however, it not clear how to replace the sigmoidal transfer function by a convex surrogate.
To overcome these barriers, we use improper learning. We embed the linear problem at hand into a higher-dimensional space, where logistic predictors and their fairness constraints can be approximated by convex expressions. To do so, we use a beautiful result of Shalev-Shwartz et al. [SSSS11] that presents a particular infinite-dimensional kernel space where our fairness constraints can be made convex.
In particular, we replace the problem of PACF learning with the problem of PACF learning , a class of linear predictors with norm bounded by B in an RKHS (reproducing kernel Hilbert space) defined by Vovk’s infinite-dimensional polynomial kernel, . We learn the linear predictor in this RKHS using the result of Theorem 1.5 to obtain a relaxed PACF algorithm for . We use the kernel trick to argue that the sample complexity is , where , and the time complexity is .
For every , we can thus learn a linear predictor (in the above RKHS) that is (empirically) sufficiently fair, and whose (empirical) accuracy is competitive with all the linear predictors with norm bounded by that are approximately MF, for any choice of . To prove PACF learnability of , we build on the polynomial approximation result of Shalev-Shwartz et al. [SSSS11] to show that taking to be sufficiently large ensures that the accuracy of the set of AMF predictors in is comparable to the accuracy of the set of AMF predictors in . This requires a choice of that is , which is where the exponential dependence on comes in.
1.6 Hardness of Perfect Metric-Fairness
As discussed above, perfect metric-fairness does not generalize from a training set to the underlying population. For example, consider a very small subset of the population that isn’t represented in the training set. A classifier that discriminates against this small subset might be perfectly metric-fair on the training set. The failure of generalization poses serious challenges to constructing learning algorithms. Indeed, we show that perfect metric-fairness can make simple learning tasks computationally intractable (with respect to a particular metric).
We present a natural learning problem and a metric where, even though a perfectly fair and perfectly accurate simple (linear) classifier exists, it cannot be found by any polynomial-time learning algorithm that is perfectly metric-fair. Indeed, any such algorithm can only find trivial classifiers with error rate approaching 1/2 (not much better than random guessing). The learner can tell that a particular (linear) classifier is empirically perfectly fair (and perfectly accurate). However, even though the classifier is perfectly fair on the underlying distribution, the (polynomial-time) learner cannot certify that this is the case, and thus it has to settle for outputting a trivial classifier. We note that there does exist an exponential-time perfectly metric-fair learning algorithm with a competitive accuracy guarantee; the issue is the computational complexity of this task. [Footnote 5: For example, an exponential-time algorithm could learn by enumerating all possible classifiers, discarding all the ones that are not perfectly metric-fair (using a brute-force search over all pairs of individuals for each candidate classifier), and then outputting the most accurate classifier among the perfectly metric-fair ones. It is important to note that this algorithm doesn’t try to guarantee empirical perfect metric-fairness, which we know does not generalize. Rather, the learner has to consider the fairness behavior over all pairs of individuals.] In contrast, the relaxed notion of approximate metric-fairness does allow for polynomial-time relaxed PACF learning algorithms that obtain competitive accuracy for this task (as it does for a rich class of learning problems, see Section 1.5).
We present an overview of the hard learning task and discuss its consequences below. See Section 6 and Theorem 6.1 for a more formal description. Since we want to argue about computational intractability, we need to make computational assumptions (in particular, if P = NP, then perfectly metric-fair learning would be tractable). We will make the minimal cryptographic hardness assumption that one-way functions exist; see [Gol01] for further background.
Simplified construction.
For this sketch, we take a uniform distribution over a domain . For an item (or individual) , its label will be given by the linear classifier . Note that the linear classifier indeed is perfectly accurate. [Footnote 6: Note that the expected margin in this distribution is small compared to the norms of the examples. This is for simplicity and readability. The full hardness result is shown (in a very similar manner) for data where the margins are large. In particular, this means that the class of predictors can achieve good accuracy with constant . See Section 6.]
To argue that fair learning is intractable, we construct two metrics and that are computationally indistinguishable: no polynomial-time algorithm can tell them apart (even given the explicit description of the metric). [Footnote 7: More formally, we construct two distributions over metrics, such that no polynomial-time algorithm can tell whether a given metric was sampled from the first distribution or from the second. For readability, we mostly ignore this distinction in this sketch.] We construct these metrics so that does not allow any non-trivial accuracy, whereas essentially imposes no fairness constraints. Thus, is a perfectly fair and perfectly accurate classifier w.r.t. . Now, since a polynomial-time learning algorithm cannot tell and apart, it has to output the same (distribution on) classifiers given either of these two metrics. If , given , outputs a classifier with non-trivial accuracy, then it violates perfect metric-fairness. Thus, when given , must (with high probability) output a classifier with error close to . This remains the case even when is given the metric (by indistinguishability), despite the fact that perfect metric-fairness under allows for perfect accuracy.
We construct the metrics as follows. The metric gives every pair of individuals distance 1. The metric , on the other hand, partitions the items in into disjoint pairs where the label of is , the label of is , but the distance between and is 0. [Footnote 8: Formally, is a pseudometric, since it has distinct items at distance 0. We can make be a true metric by replacing the distance 0 with an arbitrarily small positive quantity. The hardness result is essentially unchanged.] Thus, the metric assigns to each item a “hidden counterpart” that is identical to , but has the opposite label. The distance between any two distinct elements that are not “hidden counterparts” is 1 (as in ). The metric specifies that hidden counterparts are identical, and thus any perfectly metric-fair classifier must treat them identically. Since and have opposing labels, ’s average error on the pair must be . The support of is partitioned into disjoint hidden counterparts, and thus we conclude that . Note that this is true regardless of ’s complexity (in particular, it also rules out improper learning). We construct the metrics using a cryptographic pseudorandom generator (PRG), which specifies the hidden counterparts (in ) or their absence (in ). See the full version for details.
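The averaging step in this argument can be checked numerically. The sketch below is a toy illustration only, not the cryptographic construction: it assumes a hypothetical encoding in which each hidden-counterpart pair holds one item with true label 1 and one with true label 0, and a classifier that assigns both items the same probability of label 1 (as perfect metric-fairness at distance 0 forces). The function names are illustrative.

```python
import random

def pair_error(p):
    """Expected error on one hidden-counterpart pair.

    Both items receive the same probability p of being labeled 1,
    because they are at distance 0. One item has true label 1 and
    the other has true label 0.
    """
    error_on_positive = 1.0 - p   # probability of predicting 0 on the label-1 item
    error_on_negative = p         # probability of predicting 1 on the label-0 item
    return (error_on_positive + error_on_negative) / 2.0

def average_error(probabilities):
    """Average error over a domain partitioned into counterpart pairs."""
    return sum(pair_error(p) for p in probabilities) / len(probabilities)

rng = random.Random(0)
# Arbitrary per-pair predictions; the classifier is otherwise unconstrained.
probs = [rng.random() for _ in range(1000)]
err = average_error(probs)
```

Whatever probabilities are chosen, the error on each pair is one half, which is why the bound does not depend on the complexity of the classifier.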
Discussion.
We make several remarks about the above hardness result. First, note that the data distribution is fixed, and the optimal classifier is linear and very simple: it only considers a single coordinate. This makes the hardness result sharper: without fairness, the learning task is trivial (indeed, since the classifier is fixed there is nothing to learn). It is the fairness constraint (and only the fairness constraint) that leads to intractability. The computational hardness of perfectly fair learning applies also to improper learning. Finally, the metrics for which we show hardness are arguably contrived (though we note they do obey the triangle inequality). Thus, the result rules out only perfectly metric-fair learners that are required to work for any given metric. A natural direction for future work is restricting the choice of metric, which may make perfectly metric-fair learning feasible.
1.7 Further Related Work
There is a growing body of work attempting to study the question of algorithmic discrimination, particularly through the lens of machine learning. This literature is characterized by an abundance of definitions, each capturing different discrimination concerns and notions of fairness. This literature is vast and growing, and so we restrict our attention to the works most relevant to ours.
One high-level distinction can be drawn between group and individual notions of fairness. Group-fairness notions assume the existence of a protected attribute (e.g., gender, race), which induces a partition of the instance space into some small number of groups. A fair classifier is one that achieves parity of some statistical measure across these groups. Some prominent measures include classification rates (statistical parity, see e.g. [FFM15]), calibration, and false positive or negative rates [KMR16, Cho17, HPS16]. It has been established that some of these notions are inherently incompatible with each other, in all but trivial cases [KMR16, Cho17]. The work of [WGOS17] takes a step towards incorporating the fairness notion of [HPS16] into a statistical and computational theory of learning, and considers a relaxation of the fairness definition to overcome the computational intractability of the learning objective. The work of [DIKL17] proposes an efficient framework for learning different classifiers for different groups in a fair manner.
Individual fairness [DHP12] posits that “similar individuals should be treated similarly”. This powerful guarantee is formalized via a Lipschitz condition (with respect to an existing task-specific similarity metric) on the classifier mapping individuals to distributions over outcomes. Recent works [JKMR16, JKM] study different individual-level fairness guarantees in the contexts of reinforcement and online learning. The work of [ZWS13] aims to learn an intermediate “fair” representation that best encodes the data while successfully obfuscating membership in a protected group. See also the more recent work [BCZC17].
Several works have studied fair regression [KAAS12, CKK13, ZVGRG17, BHJ17]. The main differences in our work are a focus on metric-based individual fairness, a strong rigorous fairness guarantee, and proofs of competitive accuracy (both stated with respect to the underlying population distribution).
2 Metric Fairness Definitions
2.1 (Perfect) Metric-Fairness
Dwork et al. [DHP12] introduced individual fairness, a similarity-based fairness notion in which a probabilistic classifier is said to be fair if it assigns similar distributions to similar individuals.
Definition 2.1 (Perfect Metric-Fairness).
A probabilistic classifier is said to be perfectly metric-fair w.r.t. a distance metric , if for every ,
(3) 
where is interpreted as the probability will assign the label to , is a distance measure between distributions and is a task-specific distance metric that is assumed to be known in advance. Throughout this work we take to be the statistical distance, which for two binary outcome distributions is the absolute difference between the two predicted probabilities.
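To make the definition concrete, here is a minimal sketch of a brute-force check over a finite set of individuals. It assumes the constraint reads "the absolute difference between the two predicted probabilities is at most the metric distance" (the statistical-distance instantiation for binary outcomes). The individuals, the metric, and both classifiers are hypothetical toy choices, not taken from the paper.

```python
from itertools import combinations

def is_perfectly_metric_fair(h, d, individuals, tol=1e-12):
    """Check |h(x) - h(y)| <= d(x, y) for every pair in a finite set."""
    return all(
        abs(h(x) - h(y)) <= d(x, y) + tol
        for x, y in combinations(individuals, 2)
    )

# Hypothetical toy setup: individuals are scores in [0, 1],
# and the metric is the absolute difference of scores.
individuals = [0.0, 0.25, 0.5, 1.0]
metric = lambda x, y: abs(x - y)

smooth = lambda x: 0.5 * x                       # 1/2-Lipschitz predictor
threshold = lambda x: 1.0 if x >= 0.5 else 0.0   # jumps from 0 to 1 at 0.5

smooth_is_fair = is_perfectly_metric_fair(smooth, metric, individuals)
threshold_is_fair = is_perfectly_metric_fair(threshold, metric, individuals)
```

The threshold classifier fails because the individuals 0.25 and 0.5 are at distance 0.25 but receive predictions that differ by 1.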
In the setting considered by [DHP12], a finite set of individuals should be assigned outcomes from a set . Under the assumption that is known, they demonstrated that the problem of minimizing an arbitrary loss function , subject to the individual fairness constraint can be formulated as an LP and thus can be solved in time poly.
2.2 Approximate Metric-Fairness
We consider a learning setting in which the goal is learning a classifier that satisfies the fairness constraint in Equation (3) with respect to some unknown distribution over , after observing a finite sample . To this end, we introduce a metric-fairness loss function that, for a given classifier and a pair of individuals in , assigns a penalty of 1 if the fairness constraint is violated by more than a additive term.
Definition 2.2.
For a metric and , the metric-fairness loss on a pair is
(4) 
The overall metric-fairness loss for a hypothesis is the expected violation for a random pair according to .
Definition 2.3 (Metric-Fairness Loss).
For a metric and ,
(5) 
We go on to define the empirical fairness loss, a data-dependent quantity designed to estimate the unknown . To this end, we think of a sample as defining a complete weighted graph, denoted , whose vertices are and whose edges are weighted by . Now, observe that when is sampled i.i.d. from , any matching is an i.i.d. sample from . [Footnote 9: Note that from the structure of , it has exactly matchings, each of size (w.l.o.g., we assume is odd).] We now define the empirical loss by replacing the expectation over in Equation (5) with the expectation over some matching .
Definition 2.4 (Empirical Metric-Fairness Loss).
(6) 
Finally, we will say that a classifier is fair w.r.t (respectively, ) and if its respective metricfairness loss is at most .
Definition 2.5 (Metric-Fairness).
A probabilistic classifier is said to be fair w.r.t a metric and (respectively, ) if (respectively, ).
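The empirical quantities defined above can be sketched in a few lines. This is an illustration under stated assumptions: the parameter names `alpha` (allowed fraction of violating pairs) and `gamma` (additive slack) are chosen here for readability, the matching simply pairs consecutive sample points and drops a leftover point if there is one, and the sample, metric, and classifier are hypothetical.

```python
def consecutive_matching(sample):
    """Pair up sample points (x1, x2), (x3, x4), ...; drop a leftover point."""
    return [(sample[i], sample[i + 1]) for i in range(0, len(sample) - 1, 2)]

def empirical_fairness_loss(h, d, matching, gamma):
    """Fraction of matched pairs whose constraint is violated by more than gamma."""
    violations = sum(
        1 for x, y in matching if abs(h(x) - h(y)) > d(x, y) + gamma
    )
    return violations / len(matching)

def is_empirically_fair(h, d, matching, alpha, gamma):
    """True if the empirical fairness loss is at most alpha."""
    return empirical_fairness_loss(h, d, matching, gamma) <= alpha

# Hypothetical toy data: scores in [0, 1] with the absolute-difference metric.
sample = [0.0, 0.1, 0.2, 0.9, 0.5, 0.55]
metric = lambda x, y: abs(x - y)
threshold = lambda x: 1.0 if x >= 0.5 else 0.0

matching = consecutive_matching(sample)
loss = empirical_fairness_loss(threshold, metric, matching, gamma=0.1)
fair_loose = is_empirically_fair(threshold, metric, matching, alpha=0.4, gamma=0.1)
fair_tight = is_empirically_fair(threshold, metric, matching, alpha=0.2, gamma=0.1)
```

Only the pair (0.2, 0.9) is violated here, so one of the three matched pairs contributes to the loss.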
Notation
When and are clear from context, we will use the more succinct notation for the true fairness loss and for the empirical fairness loss. When dealing with a hypothesis class , we use to denote all the fair hypotheses in (w.r.t ), and for those which are fair w.r.t .
2.3 Approximate Metric Fairness: Interpretation
An fair classifier (for ) no longer holds any guarantee for any single individual. To interpret the guarantee it does give, we consider the following definition.
Definition 2.6.
A probabilistic classifier is said to be metricfair w.r.t if
(7) 
Definition 2.6 is very similar to Definition 2.5, but it lends itself to a more intuitive interpretation of fairness for groups. Informally, we will say that an individual feels discriminated against by if the proportion of individuals with whom his constraint is violated (think: individuals who are equally qualified but receive different treatment) exceeds ; now, fairness ensures that the proportion of individuals who find to be discriminatory does not exceed . Hence, this is a guarantee for groups: an fair classifier cannot cause an entire group of fractional mass to be discriminated against. The strength of this guarantee is that it holds for any such group (even for those formed “ex-ante”). In this sense, fairness represents a middle ground between the strict notion of individual fairness and the loose notions of group-fairness.
Finally, we show that the two definitions are related: any fair classifier is also fair, for every for which . This demonstrates that optimizing for accuracy under an fairness constraint is a flexible way of achieving interpretable fairness guarantees for a range of desired values.
Claim 2.7.
For every , and for which , if is fair then it is also fair.
Proof of Claim 2.7. For simplicity, we define the following indicator function,
Let such that , and assume that is fair w.r.t . Assume for contradiction that is not fair w.r.t . That means that
If we denote the subset of “discriminated” individuals as ,
then the assumption is equivalent to . We now obtain:
where the first transition is from the assumption that is fair, and the final transition is from the assumption that is not fair. We therefore have that , which contradicts our assumption.
2.4 Metric-Fairness
In this work we focus on approximate metric-fairness and the metric-fairness loss (Definition 2.2). We find that this notion provides appealing and interpretable protections from discrimination: as discussed above, for small enough , every sufficiently large group is protected from blatant discrimination (see Section 2.3). However, in turning to design efficient metric-fair learning algorithms, working directly with this definition presents difficulties (see Section 5). In particular, the “0/1” nature of the metric-fairness loss means that the set is not a convex set. Trying to learn an empirically metric-fair that optimizes accuracy is a non-convex optimization problem, and it isn’t clear how to optimize using convex-optimization tools.
In light of this difficulty, we introduce a different metric-fairness loss definition. It overcomes the non-convexity by replacing the bound on the expected number of fairness violations with a bound on the expected sum of the fairness violations.
Definition 2.8 ( MF loss).
For a metric , the metric-fairness loss on a pair is
Similarly to the regular metric-fairness loss, the loss for a hypothesis is the expected violation for a random pair according to , and the empirical loss replaces the expectation over with the expectation over some matching .
Definition 2.9 ( Metric-Fairness).
A probabilistic classifier is said to be metric-fair w.r.t. a metric and w.r.t. (respectively, ) if its respective MF loss is bounded by .
When are clear from context we use the notation (respectively, ) for the subset of hypotheses from which are MF w.r.t (respectively, ). We use (previously ) to emphasize that fairness is calculated w.r.t the standard MF loss.
The main advantage of metric-fairness is that it induces convex constraints and tractable optimization problems. We note that metric-fairness also generalizes from a sample (indeed, in this case the fairness loss is Lipschitz and thus it’s easier to prove generalization bounds). The main disadvantage of metric-fairness is in interpreting the guarantee: it is less immediately obvious how bounding the sum of “fairness deviations” translates into protections for groups or for individuals. Nonetheless, we show that metric-fairness implies approximate metric-fairness for every . Thus, when we optimize over the set of MF predictors, we are guaranteed to output an approximately metric-fair predictor. Moreover, since approximate MF also implies MF, any solution to the () optimization problem will also be competitive with a certain class of approximately metric-fair classifiers.
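The contrast between the two losses can be seen on a two-pair toy example. The sketch below rests on several assumptions that are ours, not the paper's: the predictor is an unclipped linear function, so the prediction gap on a pair is the inner product of the weight vector with the difference of the two points; the relaxed loss is taken to be the average excess violation, that is, the average of max(0, gap minus distance); and the specific numbers are hypothetical.

```python
def dot(w, u):
    return sum(wi * ui for wi, ui in zip(w, u))

def zero_one_loss(w, diffs, dists, gamma=0.0):
    """Fraction of pairs whose prediction gap exceeds the distance by more than gamma."""
    bad = sum(1 for u, d in zip(diffs, dists) if abs(dot(w, u)) > d + gamma)
    return bad / len(diffs)

def sum_loss(w, diffs, dists):
    """Average excess violation max(0, |<w, x - x'>| - d(x, x')) over the pairs."""
    total = sum(max(0.0, abs(dot(w, u)) - d) for u, d in zip(diffs, dists))
    return total / len(diffs)

# Two pairs, described by the difference vectors x - x' and their distances.
diffs = [(1.0, 0.0), (0.0, 1.0)]
dists = [0.4, 0.4]

w1 = (2.0, 0.0)          # violates only the first pair
w2 = (0.0, 2.0)          # violates only the second pair
mid = (1.0, 1.0)         # midpoint of w1 and w2

zero_one = [zero_one_loss(w, diffs, dists) for w in (w1, w2, mid)]
relaxed = [sum_loss(w, diffs, dists) for w in (w1, w2, mid)]
```

Both `w1` and `w2` have a 0/1 loss of one half, yet their midpoint violates both pairs, so the set of predictors with 0/1 loss at most one half is not convex. The averaged excess violation at the midpoint, by contrast, stays below the average of the two endpoint values, as convexity requires.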
The connection between approximate and metric-fairness is quantified in Lemma 2.10 below. We note that there is a gap between these upper and lower bounds. This is the “price” we pay for relaxing from approximate to metric-fairness. We also note that in proving that MF implies approximate metric-fairness, it is essential to use a non-zero additive violation term.
Lemma 2.10.
For every sample , matching and every , .
Proof of Lemma 2.10. We begin by defining the induced violation vector of a classifier . For a sample , a matching and a value , the induced violation vector is defined as:
where is the th edge in the matching .
We first prove that . If , then this means that . Assume for contradiction that for some we have . But this implies that , which is a contradiction. We therefore have that from which we conclude that .
Next, we prove that . To do so, we’ll prove that . Observe that for every we have that because . To see that , recall that implies that . Thus:
which implies , as required.
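The two directions of this argument can be sanity-checked numerically. The sketch below is not a restatement of the lemma; it only checks two elementary inequalities that hold for any violation vector with entries in [0, 1], under the assumption that the relaxed loss is the average excess violation and the 0/1 loss counts entries exceeding the slack `gamma`. The data are random and hypothetical.

```python
import random

def excess_violations(preds_a, preds_b, dists):
    """Per-pair excess violation max(0, |h(x) - h(x')| - d(x, x'))."""
    return [max(0.0, abs(a - b) - d) for a, b, d in zip(preds_a, preds_b, dists)]

def zero_one_loss(violations, gamma):
    """Fraction of pairs whose excess violation is larger than gamma."""
    return sum(1 for v in violations if v > gamma) / len(violations)

def sum_loss(violations):
    """Average excess violation."""
    return sum(violations) / len(violations)

rng = random.Random(1)
n = 500
preds_a = [rng.random() for _ in range(n)]   # predictions h(x) in [0, 1]
preds_b = [rng.random() for _ in range(n)]   # predictions h(x') in [0, 1]
dists = [rng.random() for _ in range(n)]     # distances d(x, x') in [0, 1]

gamma = 0.1
xi = excess_violations(preds_a, preds_b, dists)
alpha = zero_one_loss(xi, gamma)
total = sum_loss(xi)

# Markov-style direction: many large violations force a large average.
lower_ok = alpha * gamma <= total + 1e-12
# Converse direction: each entry is at most gamma, or at most 1 on a violating pair.
upper_ok = total <= alpha + gamma + 1e-12
```

The gap between the two bounds (a product of the parameters on one side, a sum on the other) is the price of the relaxation mentioned before the lemma.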
2.5 Generalization
A key issue in learning theory is that of generalization: to what extent is a classifier that is accurate on a finite sample also guaranteed to be accurate w.r.t. the underlying distribution? In this section, we work to develop similar generalization bounds for our metric-fairness loss function. Proving that fairness can generalize well is a crucial component in our analysis: it effectively rules out the possibility of creating a “false facade” of fairness (i.e., a classifier that only appears fair on a sample, but is not fair w.r.t. new individuals).
The generalization bounds will be based on proving uniform convergence of the empirical estimates (in our case, ) to the fairness loss, simultaneously for every , in terms of the Rademacher complexity of the hypothesis class . Rademacher complexity differs from the celebrated VC-dimension complexity measure in three aspects: first, it is defined for any class of real-valued functions (making it suitable for our setting of learning probabilistic classifiers); second, it is data-dependent and can be measured from finite samples; third, it often results in tighter uniform convergence bounds (see, e.g., [KP02]).
Definition 2.11 (Rademacher complexity).
Let be an input space, a distribution on , and a real-valued function class defined on . The empirical Rademacher complexity of with respect to a sample is the following random variable:
(8) 
The expectation is taken over where the ’s are independent, uniformly random variables taking values in . The Rademacher complexity of is defined as the expectation of over all samples of size :
(9) 
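For a small finite function class, the empirical Rademacher complexity can be computed exactly by enumerating all sign vectors. The sketch below assumes one common normalization: the expectation over the signs of the supremum of the sign-weighted average, with a factor of one over the sample size and no absolute value. Other texts include a factor of 2 or an absolute value, so the constants may differ from Equation (8). The function classes and sample are hypothetical.

```python
from itertools import product

def empirical_rademacher(functions, sample):
    """Exact empirical Rademacher complexity of a finite class on a small sample.

    Averages, over all 2^m sign vectors sigma, the value
    max_f (1/m) * sum_i sigma_i * f(z_i).
    """
    m = len(sample)
    values = [[f(z) for z in sample] for f in functions]
    total = 0.0
    for sigma in product((-1, 1), repeat=m):
        total += max(
            sum(s * v for s, v in zip(sigma, vals)) / m for vals in values
        )
    return total / (2 ** m)

sample = [0, 1]
single = [lambda z: 1.0]                       # one function: cannot fit the signs
symmetric = [lambda z: 1.0, lambda z: -1.0]    # can match the majority sign

r_single = empirical_rademacher(single, sample)
r_symmetric = empirical_rademacher(symmetric, sample)
```

A single fixed function has complexity zero because the signs average out, while the richer two-function class can correlate with the random signs and has strictly positive complexity. Exact enumeration is only feasible for very small samples; larger ones would call for Monte Carlo sampling of the sign vectors.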
In our case, the fact that our fairness loss is a 0-1 style loss means that for infinite hypothesis classes, the generalization argument is involved and we incur an extra approximation term in .
Theorem 2.12 (Rademacher-Based Uniform Convergence of the Metric-Fairness Loss).
Let be a hypothesis class with Rademacher complexity . For every , every and every (w.l.o.g. assume is odd), with probability at least over an i.i.d. sample , simultaneously for every :
(10) 
where .
Before proving the theorem, we note that an immediate corollary is that our metric-fairness loss is capable of generalizing well for any hypothesis class that has a small Rademacher complexity. In particular, we will be using the following result for the class of linear classifiers with norm bounded by in an RKHS whose inner products are implemented by a kernel (i.e., there exists a mapping such that ):
Corollary 2.13.
Let as above, for any and kernel . For every , every and every , w.p at least , simultaneously for every
(11) 
where and .
Proof of Corollary 2.13. The proof follows from Theorem 2.12 and the fact that the Rademacher complexity is bounded by (see [KST09]).
To set the stage for proving Theorem 2.12, we state some useful properties of the Rademacher complexity notion, see [BM02].
Theorem 2.14 (Two-sided Rademacher-based uniform convergence).
Consider a set of functions mapping to . For every , with probability at least over a random draw of a sample of size , every satisfies
(12)  
(13) 
Lemma 2.15.
Let be a class of real functions. Then, for any sample of size :

For any function , .

If is Lipschitz and satisfies , then .

For every , w.p at least over the choice of ,
Proof of Theorem 2.12.
Consider the input space and consider the functions , where is the fairness loss defined as
For a given sample (w.l.o.g, assume is odd), let be any matching in the graph induced by . Observe that is indeed an i.i.d sample from and recall that . For the sake of simplicity, we hereby denote as the empirical Rademacher complexity with respect to induced by a random sample .
Denote the threshold function at :
Hence,
Observe that can be further decomposed as
where , abs is the absolute value function (which is 1-Lipschitz), and .
Claim 2.16.
.
Proof of Claim 2.16. Denote where . By definition,