Probably Approximately Metric-Fair Learning

We study fairness in machine learning. A learning algorithm, given a training set drawn from an underlying population, learns a classifier that will be used to make decisions about individuals. The concern is that this classifier’s decisions might be discriminatory, favoring certain subpopulations over others. The seminal work of Dwork et al. [ITCS 2012] introduced fairness through awareness, positing that a fair classifier should treat similar individuals similarly. Similarity between individuals is measured by a task-specific similarity metric. In the context of machine learning, however, this fairness notion faces serious difficulties, as it does not generalize and can be computationally intractable.

We introduce a relaxed notion of approximate metric-fairness, which allows a small fairness error: for a random pair of individuals sampled from the population, with all but a small probability of error, if they are similar then they are treated similarly. In particular, this provides discrimination-protections to every subpopulation that is not too small. We show that approximate metric-fairness does generalize from a training set to the underlying population, and we leverage these generalization guarantees to construct polynomial-time learning algorithms that achieve competitive accuracy subject to fairness constraints.

## 1 Introduction

Machine learning is increasingly used to make consequential classification decisions about individuals. Examples range from predicting whether a user will enjoy a particular article, to estimating a felon’s recidivism risk, to determining whether a patient is a good candidate for a medical treatment. Automated classification comes with great benefits, but it also raises substantial societal concerns (cf. [O’N16] for a recent perspective). One prominent concern is that these algorithms might discriminate against individuals or groups in a way that violates laws or social and ethical norms. This might happen due to biases in the training data or due to biases introduced by the algorithm. To address these concerns, and to truly unleash the full potential of automated classification, there is a growing need for frameworks and tools to mitigate the risks of algorithmic discrimination. A growing literature attempts to tackle these challenges by exploring different fairness criteria.

Discrimination can take many guises. It can be difficult to spot and difficult to define. Imagine a protected minority population $P$ (defined by race, gender identity, political affiliation, etc.). A natural approach for protecting the members of $P$ from discrimination is to make sure that they are not mistreated on average: for example, by requiring that, on average, members of $P$ and individuals outside of $P$ are classified in any particular way with roughly the same probability. This is a “group-level” notion of fairness, sometimes referred to as statistical parity.

Pointing out several weaknesses of group-level notions of fairness, the seminal work of [DHP12] introduced a notion of individual fairness. Their notion relies on a task-specific similarity metric that specifies, for every two individuals, how similar they are with respect to the specific classification task at hand. Given such a metric, similar individuals should be treated similarly, i.e., assigned similar classification distributions (their focus was on probabilistic classifiers, as will be ours). In this work, we refer to their fairness notion as perfect metric-fairness.

Given a good metric, perfect metric-fairness provides powerful protections from discrimination. Furthermore, the metric provides a vehicle for specifying social norms, cultural awareness, and task-specific knowledge. While coming up with a good metric can be challenging, metrics arise naturally in prominent existing examples (such as credit scores and insurance risk scores), and in natural scenarios (a metric specified by an external regulator). Dwork et al. studied the goal of finding a (probabilistic) classifier that minimizes utility loss (or maximizes accuracy), subject to satisfying the perfect metric-fairness constraint. They showed how to phrase and solve this optimization problem for a given collection of individuals.

### 1.1 This Work: Approximately Metric-Fair Machine Learning

Building on these foundations, we study metric-fair machine learning. Consider a learner that is given a similarity metric and a training set of labeled examples, drawn from an underlying population distribution. The learner should output a fair classifier that (to the extent possible) accurately classifies the underlying population.

This goal departs from the scenario studied in [DHP12], where the focus was on guaranteeing metric-fairness and utility for the dataset at hand. Generalization of the fairness guarantee is a key difference: we focus on guaranteeing fairness not just for the (training) data set at hand, but also for the underlying population from which it was drawn. We note that perfect metric-fairness does not, as a rule, generalize from a training set to the underlying population. This presents computational difficulties for constructing learning algorithms that are perfectly metric-fair for the underlying population. Indeed, we exhibit a simple learning task that, while easy to learn without fairness constraints, becomes computationally infeasible under the perfect metric-fairness constraint (given a particular metric). [Footnote 1: We remark that perfect metric-fairness can always be obtained trivially by outputting a constant classifier that treats all individuals identically; the challenge is achieving metric-fairness together with non-trivial accuracy.] See below and in Section 1.6 for further details.

We develop a relaxed approximate metric-fairness framework for machine learning, where fairness does generalize from the training set to the underlying population, and present polynomial-time fair learning algorithms in this framework. We proceed to describe our setting and contributions.

##### Problem setting.

A metric-fair learning problem is defined by a domain $\mathcal{X}$, a similarity metric $d$ over pairs of individuals in $\mathcal{X}$, and a distribution $\mathcal{D}$ over labeled examples from $\mathcal{X}$. A metric-fair learning algorithm gets as input the metric $d$ and a sample of labeled examples, drawn i.i.d. from $\mathcal{D}$, and outputs a classifier $h$. To accommodate fairness, we focus on probabilistic classifiers $h: \mathcal{X} \rightarrow [0,1]$, where we interpret $h(x)$ as the probability of label 1 (the probability of the other label is thus $1-h(x)$).

##### Approximate Metric-Fairness.

Taking inspiration from Valiant’s celebrated PAC learning model [Val84], we allow a small fairness error, which opens the door to generalization. We require that for two individuals sampled from the underlying population, with all but a small probability, if they are similar then they should be treated similarly. Similarity is measured by the statistical distance between the classification distributions given to the two individuals (we also allow a small additive slack in the similarity measure). We refer to this condition as approximate metric-fairness (MF). Similarly to PAC learning, we also allow a small probability of a complete fairness failure.

Given a well-designed metric, approximate metric-fairness guarantees that almost every individual gets fair treatment compared to almost every other individual. In particular, it provides discrimination-protections to every group that is not too small. However, this guarantee also has limitations: particular individuals and even small groups might encounter bias and discrimination. There are certainly settings in which this is problematic, but in other settings protecting all groups that are not too small is an appealing guarantee. Moreover, approximate fairness opens the door to fairness-generalization bounds, as well as efficient learning algorithms for a rich collection of problems (see below). We elaborate on these choices and their consequences in Section 1.2.

##### Competitive accuracy.

Turning our attention to the accuracy objective, we follow [DHP12] in considering fairness to be a hard constraint (e.g. imposed by a regulator). Given the fairness constraint, what is a reasonable accuracy objective? Ideally, we would like the predictor’s accuracy to approach (as the sample size grows) that of the most accurate approximately MF predictor. This is analogous to the accuracy guarantee pioneered in [DHP12]. A probably approximately correct and fair (PACF) learning algorithm guarantees both approximate MF and “best-possible” accuracy. A more relaxed accuracy benchmark is approaching the accuracy of the best classifier that is approximately MF for a tighter (more restrictive) fairness-error. We refer to this as a relaxed PACF learning algorithm (looking ahead, our efficient algorithms achieve this relaxed accuracy guarantee). We note that even relaxed PACF guarantees that the classifier is (at the very least) competitive with the best perfectly metric-fair classifier. We elaborate in Section 1.3.

##### Generalization bounds.

A key issue in learning theory is that of generalization: to what extent is a classifier that is accurate on a finite sample also guaranteed to be accurate w.r.t. the underlying distribution? We develop strong generalization bounds for approximate metric-fairness, showing that for any class of predictors with bounded Rademacher complexity, approximate MF on the sample implies approximate MF on the underlying distribution (w.h.p. over the choice of the sample). The use of Rademacher complexity guarantees fairness-generalization for finite classes and also for many infinite classes. Proving that approximate metric-fairness generalizes well is a crucial component in our analysis: it opens the door to polynomial-time algorithms that can focus on guaranteeing fairness (and accuracy) on the sample. Generalization also implies information-theoretic sample-complexity bounds for PACF learning that are similar to those known for PAC learning (without any fairness constraints). We elaborate in Section 1.4.

##### Efficient algorithms.

We construct polynomial-time (relaxed) PACF algorithms for linear and logistic regression. Recall that (for fairness) we focus on regression problems: learning predictors that assign a probability in $[0,1]$ to each example. For linear predictors, the probability is a linear function of an example’s distance from a hyperplane. Logistic predictors compose a linear function with a sigmoidal transfer function. This allows logistic predictors to exhibit sharper transitions from low predictions to high predictions. In particular, a logistic predictor can better approximate a classifier that assigns one label to examples that are below a hyperplane, and the other label (1) to examples that are above it. Linear and logistic predictors can be more powerful than they first seem: by embedding a learning problem into a higher-dimensional space, linear functions (over the expanded space) can capture the power of many of the function classes that are known to be PAC learnable [HS07]. We overview these results in Section 1.5. We note that a key challenge in efficient metric-fair learning is that the fairness constraints are neither Lipschitz nor convex (even when the predictor is linear). This is also a challenge for proving generalization and sample complexity bounds. Berk et al. [BHJ17] also study fair regression and formulate a measure of individual fairness loss, albeit in a different setting without a metric (see Section 1.7).

##### Perfect metric-fairness is hard.

Under mild cryptographic assumptions, we exhibit a learning problem and a similarity metric where: there exists a perfectly fair and perfectly accurate simple (linear) predictor, but any polynomial-time perfectly metric-fair learner can only find a trivial predictor, whose error approaches 1/2. In contrast, there does exist a polynomial-time (relaxed) PACF learning algorithm for this task. This is an important motivation for our study of approximate metric-fairness. We elaborate in Section 1.6.

##### Organization.

In the remainder of this section we provide an overview of our contributions. Section 1.2 details and discusses the definition of approximate metric-fairness and its relationship to related works. Accurate and fair (PACF) learning is discussed in Section 1.3. We state and prove fairness-generalization bounds in Section 1.4. Our polynomial-time PACF learning algorithms for linear and logistic regression are in Section 1.5. Section 1.6 elaborates on the hardness of perfectly metric-fair learning. Further related work is discussed in Section 1.7.

Full and formal details are in Sections 2 through 6. Conclusions and a discussion of future directions are in Section 7.

### 1.2 Approximate Metric-Fairness

We require that metric-fairness holds for all but a small fraction of pairs of individuals. That is, with all but $\alpha$ probability over a choice of two individuals from the underlying distribution, if the two individuals are similar then they get similar classification distributions. We think of $\alpha$ as a small constant, and note that setting $\alpha = 0$ recovers the definition of perfect metric-fairness (thus, setting $\alpha$ to be a small constant larger than 0 is indeed a relaxation). Similarity is measured by the statistical distance between the classification distributions given to the two individuals, where we also allow a small additive slack $\gamma$ in the similarity measure. The larger $\gamma$ is, the more “differently” similar individuals might be treated. We think of $\gamma$ as a small constant, close to 0.

###### Definition 1.1.

A predictor $h: \mathcal{X} \rightarrow [0,1]$ is $(\alpha,\gamma)$-approximately metric-fair (MF) with respect to a similarity metric $d$ and a data distribution $\mathcal{D}$ if:

$$\Pr_{x,x' \sim \mathcal{D}}\left[\,|h(x)-h(x')| > d(x,x') + \gamma\,\right] \leq \alpha \qquad (1)$$

The MF loss of the classifier $h$ on the pair $(x,x')$ is 1 if the (internal) inequality in Equation (1) holds, and 0 otherwise (hence, we refer to this as a 0-1 fairness loss). A classifier is $(\alpha,\gamma)$-approximately MF if with all but $\alpha$ probability over two individuals sampled from $\mathcal{D}$, the MF loss is 0.
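Definition 1.1 can be made concrete on a small finite population. The following is a minimal sketch, assuming the data distribution is uniform over a finite list of individuals; the function names and the toy data are hypothetical and are not taken from the paper.

```python
from itertools import product


def mf_loss(h, d, x, x2, gamma):
    """0-1 MF loss: 1 if |h(x) - h(x')| > d(x, x') + gamma, else 0."""
    return 1 if abs(h(x) - h(x2)) > d(x, x2) + gamma else 0


def mf_violation_rate(h, d, population, gamma):
    """Probability that two independent uniform draws violate the
    inner inequality of Equation (1). Assumes a uniform distribution."""
    pairs = list(product(population, repeat=2))
    return sum(mf_loss(h, d, x, x2, gamma) for x, x2 in pairs) / len(pairs)


def is_approx_mf(h, d, population, alpha, gamma):
    """True if h is (alpha, gamma)-approximately MF on the population."""
    return mf_violation_rate(h, d, population, gamma) <= alpha


# Hypothetical toy data: individuals are points on a line and the
# metric is their distance.
population = [0.0, 0.1, 0.2, 0.7]
d = lambda x, x2: abs(x - x2)
constant = lambda x: 0.5                         # treats everyone identically
threshold = lambda x: 1.0 if x >= 0.15 else 0.0  # sharp cutoff

print(mf_violation_rate(constant, d, population, gamma=0.1))
print(mf_violation_rate(threshold, d, population, gamma=0.1))
```

In this toy example the constant predictor never violates the inequality, while the sharp cutoff violates it on every pair that straddles the cutoff, so it is approximately MF only for a fairly large $\alpha$.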

Similarly to the PAC learning model, we also allow a small probability of failure. This probability is taken over the choice of the training set and over the learner’s coins. For example, the failure probability bounds the probability that the randomly sampled training set is not representative of the underlying population. We think of the failure probability as very small or even negligible. A learning algorithm is probably approximately metric-fair if, with all but this failure probability over the sample (and the learner’s coins), it outputs a classifier that is $(\alpha,\gamma)$-approximately MF. Further details are in Section 2.

Given a well-designed metric, approximate metric-fairness (for sufficiently small fairness parameters) guarantees that almost every individual gets fair treatment compared to almost every other individual (see Section 2.3 for a quantitative discussion). Every protected group $P$ of fractional size significantly larger than $\alpha$ is protected in the sense that, on average, members of $P$ are treated similarly to similar individuals outside of $P$. We note, however, that this guarantee does not protect single individuals or small groups (see the discussion in Section 1.1).

##### Between group and individual fairness: related works.

Recent works [HJKRR17, KNRW17] study fairness notions that aim to protect large collections of sufficiently-large groups. Similarly to our work, these can be viewed as falling between individual and group notions of fairness. A distinction from these works is that approximate metric-fairness protects every sufficiently-large group, rather than a large collection of groups that is fixed a priori. Recent works [GJKR18, KRR18] extend the study of metric fairness to settings where the metric is not known (whereas we focus on a setting where the metric is fixed and known in its entirety), and consider relaxed fairness notions that allow individual fairness to be violated.

### 1.3 Accurate and Fair Learning

Our goal is obtaining learning algorithms that are probably approximately metric-fair, and that simultaneously guarantee non-trivial accuracy. Recall that fairness, on its own, can always be obtained by outputting a constant classifier that ignores its input and treats all individuals identically (indeed, such a classifier is perfectly metric-fair). It is the combination of the fairness and the accuracy objectives that makes for an interesting task. As discussed above, we follow [DHP12] in focusing on finding a classifier that maximizes accuracy, subject to the approximate metric-fairness constraint. This is a natural formulation, as we think of fairness as a hard requirement (imposed, for example, by a regulator), and thus fairness cannot be traded off for better accuracy.

##### Problem Statement.

A learning problem is defined by an instance domain $\mathcal{X}$, a class of classifiers $\mathcal{H}$, and a distribution $\mathcal{D}$ over labeled examples from $\mathcal{X}$. A fair learning problem also includes a similarity metric $d$. The learning algorithm gets as input the metric $d$ and a sample of labeled examples, drawn i.i.d. from $\mathcal{D}$, and its goal is to output a (probabilistic) classifier $h$ that is both fair and as accurate as possible. To accommodate the fairness constraints, we allow the learned classifier to return real values in $[0,1]$, where we interpret $h(x)$ as the probability that the label is 1. We refer to such probabilistic classifiers as predictors. A proper learner outputs a predictor in the class $\mathcal{H}$, whereas an improper learner’s output is unconstrained (but $\mathcal{H}$ is used as a benchmark for accuracy). For a learned (real-valued) predictor $h$, we use $err_{\mathcal{D}}(h)$ to denote the expected $\ell_1$ error of $h$ (the absolute loss) on a random sample from $\mathcal{D}$. [Footnote 2: All results also translate to $\ell_2$ error (the squared loss).]

##### Accuracy guarantee: PACF learning.

As discussed above, the goal in metric-fair and accurate learning is optimizing the predictor’s accuracy subject to the fairness constraint. Ideally, we aim to approach (as the sample size grows) the error rate of the most accurate classifier that satisfies the fairness constraints. A more relaxed benchmark is guaranteeing $(\alpha,\gamma)$-approximate metric-fairness, while approaching the accuracy of the best classifier that is $(\alpha',\gamma')$-approximately metric-fair, for tighter parameters $\alpha' \leq \alpha$ and $\gamma' \leq \gamma$. Our efficient learning algorithms will achieve this more relaxed accuracy goal (see below). We note that even relaxed competitiveness means that the classifier is (at the very least) competitive with the best perfectly metric-fair classifier.

These goals are captured in the following definition of probably approximately correct and fair (PACF) learning. Crucially, both fairness and accuracy goals are stated with respect to the (unknown) underlying distribution.

###### Definition 1.2 (PACF Learning).

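A learning algorithm PACF-learns a hypothesis class $\mathcal{H}$ if for every metric $d$ and population distribution $\mathcal{D}$, every choice of required fairness parameters $\alpha, \gamma$, every failure probability, and every choice of error parameters (including the accuracy error $\epsilon$), the following holds:

There exist a sample complexity and constants $\alpha', \gamma'$ (specified below), such that with all but the failure probability over an i.i.d. sample of that size and the algorithm’s coin tosses, the output predictor $h$ satisfies the following two conditions:

1. Fairness: $h$ is $(\alpha,\gamma)$-approximately metric-fair w.r.t. the metric $d$ and the distribution $\mathcal{D}$.

2. Accuracy: Let $\mathcal{H}'_F$ denote the subclass of hypotheses in $\mathcal{H}$ that satisfy the benchmark approximate metric-fairness requirement determined by $\alpha'$ and $\gamma'$, then:

$$err_{\mathcal{D}}(h) \leq \min_{h' \in \mathcal{H}'_F} err_{\mathcal{D}}(h') + \epsilon$$

We say that the algorithm is efficient if its running time is polynomial in the sample size. If accuracy holds for $\alpha' = \alpha$ and $\gamma' = \gamma$, then we say that it is a strong PACF learning algorithm. Otherwise, we say that it is a relaxed PACF learning algorithm.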

See Section 3 and Definitions 3.2 and 3.3 for a full treatment. Note that the accuracy guarantee is agnostic: we make no assumptions about the way the training labels are generated. Agnostic learning is particularly well suited to our metric-fairness setting: since we make no assumptions about the metric $d$, even if the labels are generated by some hypothesis $h \in \mathcal{H}$, it might be the case that $d$ does not allow for accurate predictions, in which case a fair learner cannot compete with $h$’s accuracy.

### 1.4 Generalization

Generalization is a key issue in learning theory. We develop strong generalization bounds for approximate metric-fairness, showing that guaranteeing empirical approximate MF on a training set also guarantees approximate MF on the underlying distribution (w.h.p. over the choice of the sample). This generalization bound opens the door to polynomial-time algorithms that can focus on guaranteeing fairness (and accuracy) on the sample, and effectively rules out the possibility of creating a “false facade” of fairness (i.e., a classifier that appears fair on a random sample, but is not fair w.r.t. new individuals).

Towards proving generalization, we define the empirical fairness loss on a sample $S$ (a training set). Fixing a fairness parameter $\gamma$, a predictor $h$ and a pair of individuals $x, x'$ in the training set, consider the MF loss on the “edge” between $x$ and $x'$ (recall that the MF loss is 1 if the “internal” inequality of Equation (1) holds, and 0 otherwise). Observe that the losses on the edges are not independent random variables (over the choice of $S$), because each individual affects many edges. Thus, rather than count the empirical MF loss over all edges, we restrict ourselves to a “matching” $M(S)$ in the complete graph whose vertices are the individuals in $S$: a collection of edges, where each individual is involved in exactly one edge. The empirical MF loss of $h$ on $S$ is defined as the average MF loss over edges in $M(S)$. [Footnote 3: The choice of which matching is used does not affect any of the results. Note that we could also choose to average over all the edges in the graph induced by $S$. Generalization bounds still follow, but the rate of convergence is not faster than restricting our attention to a matching.] Note that, since we restricted our attention to a matching, the MF losses on these edges are now independent random variables (over the choice of $S$). A classifier is empirically $(\alpha,\gamma)$-approximately MF if its empirical MF loss is at most $\alpha$. We are now ready to state our generalization bound:
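The empirical MF loss over a matching can be sketched in a few lines. This is an illustrative sketch only: it pairs consecutive sample points, which is one valid matching for an i.i.d. sample (the text notes that the choice of matching does not matter), and the function name is hypothetical.

```python
def empirical_mf_loss(h, d, sample, gamma):
    """Average 0-1 MF loss over a matching of the sample.

    The matching pairs (sample[0], sample[1]), (sample[2], sample[3]), ...
    so every individual appears in at most one edge. A trailing unpaired
    point is dropped.
    """
    matching = [(sample[i], sample[i + 1]) for i in range(0, len(sample) - 1, 2)]
    if not matching:
        raise ValueError("need at least two sample points")
    violations = sum(
        1 for x, x2 in matching if abs(h(x) - h(x2)) > d(x, x2) + gamma
    )
    return violations / len(matching)


# Hypothetical toy sample of individuals on a line.
sample = [0.0, 0.7, 0.1, 0.12, 0.3, 0.3]
d = lambda x, x2: abs(x - x2)
threshold = lambda x: 1.0 if x >= 0.15 else 0.0

# Only the edge (0.0, 0.7) is treated more differently than d + gamma allows.
print(empirical_mf_loss(threshold, d, sample, gamma=0.1))
```

A predictor is then empirically $(\alpha,\gamma)$-approximately MF exactly when this value is at most $\alpha$.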

###### Theorem 1.3.

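Let $\mathcal{H}$ be a hypothesis class with bounded Rademacher complexity. For every failure probability and every choice of (additive) fairness error parameters, there exists a sample complexity such that the following holds:

With all but the failure probability over an i.i.d. sample $S$ of that size, simultaneously for every $h \in \mathcal{H}$: if $h$ is $(\alpha,\gamma)$-approximately metric-fair on the sample $S$, then $h$ is also approximately metric-fair on the underlying distribution $\mathcal{D}$, with fairness parameters that exceed $\alpha$ and $\gamma$ only by the respective error parameters.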

See Section 2.5 and Theorem 2.12 for a full statement and discussion (and see Definition 2.11 for a definition of Rademacher complexity). Rademacher complexity differs from the celebrated VC-dimension in several respects: first, it is defined for any class of real-valued functions (making it suitable for our setting of learning probabilistic classifiers); second, it is data-dependent and can be measured from finite samples (indeed, Theorem 1.3 can be stated w.r.t. the empirical Rademacher complexity on a given sample); third, it often results in tighter uniform convergence bounds (see, e.g., [KP02]). We note that for every finite hypothesis class $\mathcal{H}$ whose range is $[0,1]$, the Rademacher complexity on samples of size $m$ is bounded by $O(\sqrt{\log|\mathcal{H}|/m})$.
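The data-dependent nature of Rademacher complexity can be illustrated numerically. The sketch below is an assumption-laden illustration, not the paper's procedure: it estimates the empirical Rademacher complexity of a small finite class by Monte Carlo over random signs, and compares it with the standard bound from Massart's lemma for classes with range $[0,1]$, namely $\sqrt{2\ln|\mathcal{H}|/m}$. The hypothesis class and sample are hypothetical.

```python
import math
import random


def empirical_rademacher(hypotheses, points, trials=2000, seed=0):
    """Monte Carlo estimate of E_sigma[ max_h (1/m) * sum_i sigma_i * h(x_i) ]."""
    rng = random.Random(seed)
    m = len(points)
    values = [[h(x) for x in points] for h in hypotheses]
    total = 0.0
    for _ in range(trials):
        sigma = [rng.choice((-1, 1)) for _ in range(m)]
        total += max(sum(s * v for s, v in zip(sigma, row)) for row in values) / m
    return total / trials


def massart_bound(num_hypotheses, m):
    """Massart's finite-class bound for predictors with range [0, 1]."""
    return math.sqrt(2.0 * math.log(num_hypotheses) / m)


# Hypothetical finite class: four threshold predictors plus the all-zero predictor.
hypotheses = [lambda x: 0.0] + [
    (lambda x, t=t: 1.0 if x >= t else 0.0) for t in (0.2, 0.4, 0.6, 0.8)
]
points = [random.Random(1).random() for _ in range(50)]

estimate = empirical_rademacher(hypotheses, points)
print(estimate, massart_bound(len(hypotheses), len(points)))
```

Because the estimate depends only on the sample at hand, it can be computed without any knowledge of the underlying distribution.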

##### Technical Overview of Theorem 1.3.

For any class $\mathcal{F}$ of (bounded) real-valued functions, the maximal difference (over all functions $f \in \mathcal{F}$) between the function’s empirical average on a randomly drawn sample, and the function’s true expectation over the underlying distribution, can be bounded in terms of the Rademacher complexity of the class (as well as the sample size and desired confidence). For a hypothesis class $\mathcal{H}$ and a loss function $\ell$, applying this result for the composed class $\ell \circ \mathcal{H}$ yields a bound on the maximal difference (over all hypotheses $h \in \mathcal{H}$) between the true loss and the empirical loss, in terms of the Rademacher complexity of $\ell \circ \mathcal{H}$. If the loss function is $\rho$-Lipschitz, this can be converted to a bound in terms of the Rademacher complexity of $\mathcal{H}$, using the fact that the Rademacher complexity of $\ell \circ \mathcal{H}$ is at most $\rho$ times that of $\mathcal{H}$.

Turning our attention to generalization of the fairness guarantee, we are faced with the problem that our “0-1” MF loss function is not Lipschitz. We resolve this by defining an approximation to the MF loss that is a piecewise-linear and Lipschitz function. The approximation does generalize, and so we conclude that the empirical MF loss is close to the empirical value of the approximation, which is close to the true value of the approximation, which in turn is close to the true MF loss. The approximation incurs an additive slack in the fairness guarantee, which shrinks as its Lipschitz constant grows. The larger the Lipschitz constant is, the more accurately the approximation tracks the MF loss, but this comes at the price of hurting generalization. The generalization theorem statement above reflects a choice of Lipschitz constant that trades off these conflicting concerns.
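One natural piecewise-linear surrogate is a clipped ramp. The sketch below is an assumption for illustration (the paper's exact surrogate is defined in its later sections and may differ): with slope `lipschitz`, the ramp is sandwiched between the 0-1 MF loss at slack `gamma` and the 0-1 MF loss at the larger slack `gamma + 1/lipschitz`, which is where an additive slack in the fairness parameter comes from.

```python
def zero_one_mf(violation, gamma):
    """0-1 MF loss as a function of violation = |h(x) - h(x')| - d(x, x')."""
    return 1.0 if violation > gamma else 0.0


def ramp_mf(violation, gamma, lipschitz):
    """Clipped ramp: 0 up to gamma, then rises with slope `lipschitz` until it hits 1.

    Hypothetical surrogate; it is `lipschitz`-Lipschitz in `violation`.
    """
    return min(1.0, max(0.0, lipschitz * (violation - gamma)))


gamma, lipschitz = 0.1, 10.0
for v in (0.05, 0.15, 0.25):
    # lower bound <= ramp <= upper bound at each point
    print(v, zero_one_mf(v, gamma + 1.0 / lipschitz), ramp_mf(v, gamma, lipschitz),
          zero_one_mf(v, gamma))
```

Increasing `lipschitz` narrows the gap between the two 0-1 losses, at the cost of a steeper (less well-behaved) surrogate, which mirrors the trade-off described above.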

#### 1.4.1 Information-Theoretic Sample Complexity

The fairness-generalization result of Theorem 1.3 implies that, from a sample-complexity perspective, any hypothesis class is strongly PACF learnable, with sample complexity comparable to that of standard PAC learning. An exponential-time PACF learning algorithm simply finds the predictor in $\mathcal{H}$ that minimizes the empirical error, while also satisfying empirical approximate metric-fairness.

###### Theorem 1.4.

Let $\mathcal{H}$ be a hypothesis class with bounded Rademacher complexity (as in Theorem 1.3). Then $\mathcal{H}$ is information-theoretically strongly PACF learnable, with sample complexity comparable to that of standard PAC learning.
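For a small finite class, the information-theoretic learner described above amounts to brute-force search. The following is a minimal sketch under stated assumptions: labels are taken to be in $\{0,1\}$, accuracy is the empirical absolute loss, the matching pairs consecutive sample points, and all names and data are hypothetical.

```python
def empirical_error(h, sample):
    """Average absolute loss |h(x) - y| over labeled examples (x, y)."""
    return sum(abs(h(x) - y) for x, y in sample) / len(sample)


def empirical_mf_loss(h, d, sample, gamma):
    """Average 0-1 MF loss over the matching of consecutive sample points."""
    xs = [x for x, _ in sample]
    matching = [(xs[i], xs[i + 1]) for i in range(0, len(xs) - 1, 2)]
    return sum(
        1 for x, x2 in matching if abs(h(x) - h(x2)) > d(x, x2) + gamma
    ) / len(matching)


def fair_erm(hypotheses, d, sample, alpha, gamma):
    """Name of the most accurate hypothesis that is empirically
    (alpha, gamma)-approximately MF, or None if no hypothesis qualifies."""
    fair = {
        name: h for name, h in hypotheses.items()
        if empirical_mf_loss(h, d, sample, gamma) <= alpha
    }
    if not fair:
        return None
    return min(fair, key=lambda name: empirical_error(fair[name], sample))


# Hypothetical toy problem on the line.
sample = [(0.0, 0), (0.7, 1), (0.1, 0), (0.12, 0), (0.8, 1), (0.9, 1)]
d = lambda x, x2: abs(x - x2)
hypotheses = {
    "constant": lambda x: 0.5,
    "identity": lambda x: x,
    "threshold": lambda x: 1.0 if x >= 0.5 else 0.0,
}
print(fair_erm(hypotheses, d, sample, alpha=0.0, gamma=0.1))
print(fair_erm(hypotheses, d, sample, alpha=0.5, gamma=0.1))
```

Enumerating the class is only feasible when it is small; for rich classes this search takes exponential time, which motivates the efficient algorithms discussed next.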

### 1.5 Efficient Fair Learning

One of our primary contributions is the construction of polynomial-time relaxed-PACF learning algorithms for expressive hypothesis classes. We focus on linear classification tasks, where the labels are determined by a separating hyperplane. Learning linear classifiers, also referred to as halfspaces or linear threshold functions, is a central tool in machine learning. By embedding a learning problem into a higher-dimensional space, linear classifiers (over the expanded space) can capture surprisingly strong classes, such as polynomial threshold functions (see, for example, the discussion in [HS07]). The “kernel trick” (see, e.g., [SSBD14]) can allow for efficient solutions even over very high (or infinite) dimensional embeddings. Many of the known (distribution-free) PAC learning algorithms can be derived by learning linear threshold functions [HS07].

Recall that in metric-fair learning, we aim to learn a probabilistic classifier, or a predictor, that outputs a real value in $[0,1]$. We interpret the output as the probability of assigning the label 1. We are thus in the setting of regression. We show polynomial-time relaxed-PACF learning algorithms for linear regression and for logistic regression. See Section 5 for full and formal details.

#### 1.5.1 Linear Regression

Linear regression, the task of learning linear predictors, is an important and well-studied problem in the machine learning literature. In terms of accuracy, this is an appealing class when we expect a linear relationship between the probability of the label being 1 and the distance from a hyperplane. Taking the domain to be the unit ball, we define the class of linear predictors as:

$$\mathcal{H}_{lin} \stackrel{\text{def}}{=} \left\{ x \mapsto \frac{1+\langle w, x \rangle}{2} \;:\; \|w\| \leq 1 \right\}.$$

We restrict $w$ to the unit ball to guarantee that $\langle w, x \rangle \in [-1,1]$. We then invoke a linear transformation so that the final prediction is in $[0,1]$, as required. Restricting the predictor’s output to the range $[0,1]$ is important. In particular, it means that a linear predictor must be $\frac{1}{2}$-Lipschitz, which might not be appropriate for certain classification tasks (see the discussion of logistic regression below).
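The class of linear predictors is simple to instantiate. This sketch assumes Euclidean norms and represents points as tuples of floats; the helper name `linear_predictor` is hypothetical.

```python
import math


def linear_predictor(w):
    """Return h(x) = (1 + <w, x>) / 2 for a weight vector with ||w|| <= 1.

    For x in the unit ball, <w, x> lies in [-1, 1] by Cauchy-Schwarz, so
    h(x) lies in [0, 1] and h is (1/2)-Lipschitz in x.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    if norm > 1.0 + 1e-12:
        raise ValueError("weight vector must lie in the unit ball")

    def h(x):
        return (1.0 + sum(wi * xi for wi, xi in zip(w, x))) / 2.0

    return h


h = linear_predictor((0.6, 0.8))
print(h((1.0, 0.0)))    # point on the positive side of the hyperplane
print(h((0.0, 0.0)))    # point on the hyperplane itself
print(h((-0.6, -0.8)))  # farthest point on the negative side
```

Because the slope is capped at $\frac{1}{2}$, a linear predictor cannot move from a prediction near 0 to a prediction near 1 over a short distance.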

We show a relaxed PACF learning algorithm for $\mathcal{H}_{lin}$:

###### Theorem 1.5.

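$\mathcal{H}_{lin}$ is relaxed PACF learnable with polynomial sample and time complexities. The accuracy of the learned predictor approaches (or beats) that of the most accurate approximately MF predictor, for a suitable range of tighter fairness parameters (see the algorithm overview below).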

##### Algorithm overview.

Since the Rademacher complexity of (bounded) linear functions is small [KST09], Theorem 1.3 implies that empirical approximate metric-fairness on the training set generalizes to the underlying population. Thus, given the metric and a training set, our task is to find a linear predictor that is as accurate as possible, conditioned on the empirical fairness constraint. We use $\mathcal{H}$ to denote the class of linear predictors defined above. Fixing desired fairness parameters $\alpha, \gamma$, let $\hat{\mathcal{H}}^{\alpha,\gamma}$ be the subset of linear functions that are also $(\alpha,\gamma)$-approximately MF on the training set. Given a training set $S$ of labeled examples, we would like to solve the following optimization problem:

$$\arg\min_{h \in \mathcal{H}} err_S(h) \quad \text{subject to} \quad h \in \hat{\mathcal{H}}^{\alpha,\gamma}$$

Observe, however, that $\hat{\mathcal{H}}^{\alpha,\gamma}$ is not a convex set. This is a consequence of the “0-1” metric-fairness loss. Thus, we do not know how to solve the above optimization problem efficiently. Instead, we will further constrain the predictor by bounding its $\ell_1$ MF loss. For a predictor $h$, let its (empirical) $\ell_1$ MF violation $\xi_S(h)$ be given by:

$$\xi_S(h) = \sum_{(x,x') \in M(S)} \max\left(0,\; |h(x)-h(x')| - d(x,x')\right).$$

For a threshold $\tau$, we take $\hat{\mathcal{H}}^{\tau}_{\ell_1}$ to be the set of linear predictors s.t. $\xi_S(h) \leq \tau$. For any fixed $\tau$, this is a convex set, and we can find the most (empirically) accurate predictor in $\hat{\mathcal{H}}^{\tau}_{\ell_1}$ in polynomial time. For fairness, we show that small $\ell_1$ fairness loss also implies the standard notion of approximate metric-fairness (with related parameters $\alpha, \gamma$). For accuracy, we also show that approximate metric-fairness (with smaller fairness parameters) implies small $\ell_1$ loss. Thus, optimizing over predictors whose $\ell_1$ loss is bounded gives a predictor that is competitive with (a certain class of) approximately MF predictors. In particular, for suitable $\tau$, $\sigma$ and $\gamma$ we have:

$$\hat{\mathcal{H}}^{\tau-\sigma,\,\sigma} \subseteq \hat{\mathcal{H}}^{\tau}_{\ell_1} \subseteq \hat{\mathcal{H}}^{\frac{\tau}{\gamma},\,\gamma}$$

Thus, by picking $\tau = \alpha \cdot \gamma$ we guarantee (empirical) $(\alpha,\gamma)$-approximate metric-fairness. Moreover, for any choice of $\sigma$, the set over which we optimize contains all of the predictors that are $(\alpha\gamma - \sigma, \sigma)$-approximately MF. Thus, our (empirical) accuracy is competitive with all such predictors, and we obtain a relaxed PACF algorithm. The empirical fairness and accuracy guarantees generalize beyond the training set by Theorem 1.3 (fairness-generalization) and a standard uniform convergence argument for accuracy.
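The two inclusions above can be checked numerically on a toy matching. The sketch below makes one assumption beyond the text: it normalizes the $\ell_1$ violation by the number of matched pairs (an average rather than a raw sum), so that the threshold is directly comparable with the fraction $\alpha$. Under that normalization, Markov's inequality gives the right-hand inclusion, and the fact that each violation is at most 1 gives the left-hand one. For a linear predictor, each term is convex in the weight vector, since $|h(x)-h(x')|$ is the absolute value of a linear function of $w$. All names and data here are hypothetical.

```python
def violations(h, d, matching):
    """Per-edge hinge violations max(0, |h(x) - h(x')| - d(x, x'))."""
    return [max(0.0, abs(h(x) - h(x2)) - d(x, x2)) for x, x2 in matching]


def mean_l1_violation(h, d, matching):
    """l1 MF violation, normalized by the number of edges (an assumption)."""
    v = violations(h, d, matching)
    return sum(v) / len(v)


def violation_rate(h, d, matching, gamma):
    """Empirical 0-1 MF loss: fraction of edges whose violation exceeds gamma."""
    v = violations(h, d, matching)
    return sum(1 for vi in v if vi > gamma) / len(v)


# Hypothetical toy matching on the interval [-1, 1].
matching = [(-1.0, 1.0), (0.0, 0.1), (0.5, -0.5), (0.2, 0.2)]
h = lambda x: (1.0 + x) / 2.0        # a linear predictor in one dimension
d = lambda x, x2: 0.25 * abs(x - x2)  # a metric stricter than h's slope

tau = mean_l1_violation(h, d, matching)
gamma, sigma = 0.2, 0.1
# Markov: the fraction of edges violating by more than gamma is at most tau / gamma.
print(violation_rate(h, d, matching, gamma), "<=", tau / gamma)
# Converse: the mean violation is at most rate(sigma) * 1 + sigma.
print(tau, "<=", violation_rate(h, d, matching, sigma) + sigma)
```

In practice one would minimize the empirical error subject to the convex constraint on the $\ell_1$ violation using a convex solver; the sketch only verifies the relationship between the two fairness notions.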

#### 1.5.2 Logistic Regression

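Logistic regression is another appealing class. Here, the prediction need not be a linear function of the distance from a hyperplane. Rather, we allow the use of a sigmoid function $\phi_L$ defined as $\phi_L(z) = \frac{1}{1+\exp(-4L \cdot z)}$ (which is continuous and $L$-Lipschitz). The class of logistic predictors is formed by composing a linear function with a sigmoidal transfer function:

$$\mathcal{H}_{\phi_L} \stackrel{\text{def}}{=} \left\{ x \mapsto \phi_L(\langle w, x \rangle) \;:\; \|w\| \leq 1 \right\} \qquad (2)$$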

The sigmoidal transfer function gives the predictor the power to exhibit sharper transitions from low predictions to high predictions around a certain distance (or decision) threshold. For example, suppose a distance from the hyperplane provides a quality score for candidates with respect to a certain task. Suppose also that an employer wants to hire candidates whose quality scores are above some threshold. The class of logistic predictors can give probabilities close to 0 to candidates whose quality scores are under the threshold, and probabilities close to 1 to candidates whose quality scores are over it. Linear predictors, on the other hand, need to be $\frac{1}{2}$-Lipschitz (since we restrict their output to be in $[0,1]$, see Section 1.5.1). [Footnote 4: This might, at first glance, seem like a technicality. After all, why not simply consider linear predictors whose output can be in a larger range? The problem is that it isn’t clear how to plug these larger values into the fairness constraints in a way that keeps the optimization problem convex and also has competitive accuracy.] Logistic predictors seem considerably better-suited to this type of scenario. Indeed, the class of logistic predictors can achieve good accuracy on linearly separable data whose margin (i.e. the expected distance from the hyperplane) is larger than roughly the reciprocal of the sigmoid’s Lipschitz constant. Moreover, similarly to linear threshold functions, logistic regression can be applied after embedding the learning problem into a higher-dimensional space. For example, in the “quality score” example above, the score could be computed by a low-degree polynomial.
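The difference in sharpness can be seen on a single signed distance from the hyperplane. This sketch assumes the sigmoid parameterization $z \mapsto 1/(1+\exp(-4Lz))$, which has slope exactly $L$ at the origin and is $L$-Lipschitz; the parameter name `lipschitz` and the example values are hypothetical.

```python
import math


def linear_score(z):
    """Linear predictor as a function of the signed distance z in [-1, 1]."""
    return (1.0 + z) / 2.0


def logistic_score(z, lipschitz):
    """Sigmoid transfer function 1 / (1 + exp(-4 * lipschitz * z)) (assumed form)."""
    return 1.0 / (1.0 + math.exp(-4.0 * lipschitz * z))


# A candidate whose quality score is 0.2 above the hiring threshold.
z = 0.2
print(linear_score(z))          # a mild preference
print(logistic_score(z, 10.0))  # a near-certain positive decision
```

With a margin of 0.2, the linear predictor can only express a modest preference, whereas a logistic predictor with Lipschitz constant 10 is already almost certain, at the price of weaker smoothness.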

Our primary technical contribution is a polynomial-time relaxed PACF learner for the class of logistic predictors, where the Lipschitz constant of the sigmoid is constant.

###### Theorem 1.6.

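For every constant Lipschitz parameter of the sigmoid, the class of logistic predictors is relaxed PACF learnable with polynomial sample and time complexities. The accuracy of the learned predictor approaches (or beats) that of the most accurate approximately MF predictor, for a suitable range of tighter fairness parameters.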

More generally, our algorithm is exponential in the parameter . Recall that we expect to have good accuracy on linearly separable data whose margins are larger than . Thus, one can interpret the algorithm as having runtime that is exponential in the reciprocal of the (expected) margin.

##### Algorithm overview.

We note that fair learning of logistic predictors is considerably more challenging than the linear case, because the sigmoidal transfer function specifies non-convex fairness constraints. In standard logistic regression, where fairness is not a concern, polynomial-time learning is achieved by replacing the standard loss with a convex logistic loss. In metric-fair learning, however, it is not clear how to replace the sigmoidal transfer function by a convex surrogate.

To overcome these barriers, we use improper learning. We embed the linear problem at hand into a higher-dimensional space, where logistic predictors and their fairness constraints can be approximated by convex expressions. To do so, we use a beautiful result of Shalev-Shwartz et al. [SSSS11] that presents a particular infinite-dimensional kernel space where our fairness constraints can be made convex.

In particular, we replace the problem of PACF learning with the problem of PACF learning , a class of linear predictors with norm bounded by B in an RKHS defined by Vovk’s infinite-dimensional polynomial kernel, . We learn the linear predictor in this RKHS using the result of Theorem 1.5 to obtain a relaxed PACF algorithm for . We use the kernel trick to argue that the sample complexity is , where , and the time complexity is .

For every , we can thus learn a linear predictor (in the above RKHS) that is (empirically) sufficiently fair, and whose (empirical) accuracy is competitive with all the linear predictors with norm bounded by that are -approximately MF, for any choice of . To prove PACF learnability of , we build on the polynomial approximation result of Shalev-Shwartz et al. [SSSS11] to show that taking to be sufficiently large ensures that the accuracy of the set of -AMF predictors in is comparable to the accuracy of the set of -AMF predictors in . This requires a choice of that is , which is where the exponential dependence on comes in.

### 1.6 Hardness of Perfect Metric-Fairness

As discussed above, perfect metric-fairness does not generalize from a training set to the underlying population. For example, consider a very small subset of the population that isn’t represented in the training set. A classifier that discriminates against this small subset might be perfectly metric-fair on the training set. The failure of generalization poses serious challenges to constructing learning algorithms. Indeed, we show that perfect metric-fairness can make simple learning tasks computationally intractable (with respect to a particular metric).

We present a natural learning problem and a metric where, even though a perfectly fair and perfectly accurate simple (linear) classifier exists, it cannot be found by any polynomial-time learning algorithm that is perfectly metric-fair. Indeed, any such algorithm can only find trivial classifiers with error rate approaching 1/2 (not much better than random guessing). The learner can tell that a particular (linear) classifier is empirically perfectly fair (and perfectly accurate). However, even though the classifier is perfectly fair on the underlying distribution, the (polynomial-time) learner cannot certify that this is the case, and thus it has to settle for outputting a trivial classifier. We note that there does exist an exponential-time perfectly metric-fair learning algorithm with a competitive accuracy guarantee; the issue is the computational complexity of this task. [Footnote 5: For example, an exponential-time algorithm could learn by enumerating all possible classifiers, discarding all the ones that are not perfectly metric-fair (using a brute-force search over all pairs of individuals for each candidate classifier), and then output the most-accurate classifier among the perfectly metric-fair ones. It is important to note that this algorithm doesn’t try to guarantee empirical perfect metric-fairness, which we know does not generalize. Rather, the learner has to consider the fairness behavior over all pairs of individuals.] In contrast, the relaxed notion of approximate metric-fairness does allow for polynomial-time relaxed-PACF learning algorithms that obtain competitive accuracy for this task (as it does for a rich class of learning problems, see Section 1.5).

We present an overview of the hard learning task and discuss its consequences below. See Section 6 and Theorem 6.1 for a more formal description. Since we want to argue about computational intractability, we need to make computational assumptions (in particular, if P = NP, then perfectly metric-fair learning would be tractable). We will make the minimal cryptographic hardness assumption that one-way functions exist, see [Gol01] for further background.

##### Simplified construction.

For this sketch, we take a uniform distribution over a domain . For an item (or individual) , its label will be given by the linear classifier . Note that the linear classifier indeed is perfectly accurate. [Footnote 6: Note that the expected margin in this distribution is small compared to the norms of the examples. This is for simplicity and readability. The full hardness result is shown (in a very similar manner) for data where the margins are large. In particular, this means that the class of predictors can achieve good accuracy with constant . See Section 6.]

To argue that fair learning is intractable, we construct two metrics and that are computationally indistinguishable: no polynomial-time algorithm can tell them apart (even given the explicit description of the metric). [Footnote 7: More formally, we construct two distributions on metrics, such that no polynomial-time algorithm can tell whether a given metric was sampled from the first distribution or from the second. For readability, we mostly ignore this distinction in this sketch.] We construct these metrics so that does not allow any non-trivial accuracy, whereas essentially imposes no fairness constraints. Thus, is a perfectly fair and perfectly accurate classifier w.r.t. . Now, since a polynomial-time learning algorithm cannot tell and apart, it has to output the same (distribution on) classifiers given either of these two metrics. If , given , outputs a classifier with non-trivial accuracy, then it violates perfect metric-fairness. Thus, when given , must (with high probability) output a classifier with error close to . This remains the case even when is given the metric (by indistinguishability), despite the fact that perfect metric-fairness under allows for perfect accuracy.

We construct the metrics as follows. The metric gives every pair of individuals distance 1. The metric , on the other hand, partitions the items in into disjoint pairs where the label of is , the label of is , but the distance between and is 0. [Footnote 8: Formally, is a pseudometric, since it has distinct items at distance 0. We can make be a true metric by replacing the distance 0 with an arbitrarily small positive quantity. The hardness result is essentially unchanged.] Thus, the metric assigns to each item a “hidden counterpart” that is identical to , but has the opposite label. The distance between any two distinct elements that are not “hidden counterparts” is 1 (as in ). The metric specifies that hidden counterparts are identical, and thus any perfectly metric-fair classifier must treat them identically. Since and have opposing labels, ’s average error on the pair must be . The support of is partitioned into disjoint hidden counterparts, and thus we conclude that . Note that this is true regardless of ’s complexity (in particular, it also rules out improper learning). We construct the metrics using a cryptographic pseudorandom generator (PRG), which specifies the hidden counterparts (in ) or their absence (in ). See the full version for details.
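The "hidden counterpart" argument can be checked with a few lines of arithmetic. The sketch below is only an illustration under stated assumptions: labels are encoded as 0 and 1, a probabilistic classifier is summarized by the probability `p` with which it predicts label 1, and the helper names are hypothetical. Because the two counterparts are at distance 0, a perfectly metric-fair classifier must give both the same `p`.

```python
from fractions import Fraction

def expected_error(p, label):
    """Expected 0/1 error when predicting label 1 with probability p (labels in {0, 1})."""
    return 1 - p if label == 1 else p

def pair_error(p):
    """Average error on two hidden counterparts with opposite labels and a shared prediction p."""
    return (expected_error(p, 1) + expected_error(p, 0)) / 2

# Whatever common prediction p is chosen, the average error on the pair is the same.
errors = [pair_error(Fraction(k, 10)) for k in range(11)]
print(errors)
```

Since the population is partitioned into such pairs, the same value carries over to the overall error, independently of how complex the classifier is.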

##### Discussion.

We make several remarks about the above hardness result. First, note that the data distribution is fixed, and the optimal classifier is linear and very simple: it only considers a single coordinate. This makes the hardness result sharper: without fairness, the learning task is trivial (indeed, since the classifier is fixed there is nothing to learn). It is the fairness constraint (and only the fairness constraint) that leads to intractability. The computational hardness of perfectly fair learning applies also to improper learning. Finally, the metrics for which we show hardness are arguably contrived (though we note they do obey the triangle inequality). This rules out perfectly metric-fair learners that work for any given metric. A natural direction for future work is restricting the choice of metric, which may make perfectly metric-fair learning feasible.

### 1.7 Further Related Work

There is a growing body of work attempting to study the question of algorithmic discrimination, particularly through the lens of machine learning. This literature is characterized by an abundance of definitions, each capturing different discrimination concerns and notions of fairness. This literature is vast and growing, and so we restrict our attention to the works most relevant to ours.

One high-level distinction can be drawn between group and individual notions of fairness. Group-fairness notions assume the existence of a protected attribute (e.g., gender, race), which induces a partition of the instance space into some small number of groups. A fair classifier is one that achieves parity of some statistical measure across these groups. Some prominent measures include classification rates (statistical parity, see, e.g., [FFM15]), calibration, and false positive or negative rates [KMR16, Cho17, HPS16]. It has been established that some of these notions are inherently incompatible with each other, in all but trivial cases [KMR16, Cho17]. The work of [WGOS17] takes a step towards incorporating the fairness notion of [HPS16] into a statistical and computational theory of learning, and considers a relaxation of the fairness definition to overcome the computational intractability of the learning objective. The work of [DIKL17] proposes an efficient framework for learning different classifiers for different groups in a fair manner.

Individual fairness [DHP12] posits that “similar individuals should be treated similarly”. This powerful guarantee is formalized via a Lipschitz condition (with respect to an existing task-specific similarity metric) on the classifier mapping individuals to distributions over outcomes. Recent works [JKMR16, JKM] study different individual-level fairness guarantees in the contexts of reinforcement and online learning. The work of  [ZWS13] aims to learn an intermediate “fair” representation that best encodes the data while successfully obfuscating membership in a protected group. See also the more recent work [BCZC17].

Several works have studied fair regression [KAAS12, CKK13, ZVGRG17, BHJ17]. The main differences in our work are a focus on metric-based individual fairness, a strong rigorous fairness guarantee, and proofs of competitive accuracy (both stated with respect to the underlying population distribution).

## 2 Metric Fairness Definitions

### 2.1 (Perfect) Metric-Fairness

Dwork et al. [DHP12] introduced individual fairness, a similarity-based fairness notion in which a probabilistic classifier is said to be fair if it assigns similar distributions to similar individuals.

###### Definition 2.1 (Perfect Metric-Fairness).

A probabilistic classifier is said to be perfectly metric-fair w.r.t a distance metric , if for every ,

 Λ(h(x),h(x′))≤d(x,x′) (3)

where is interpreted as the probability will assign the label to , is a distance measure between distributions and is a task-specific distance metric that is assumed to be known in advance. Throughout this work we take to be the statistical distance, yielding Λ(h(x),h(x′)) = |h(x)−h(x′)|.

In the setting considered by [DHP12], a finite set of individuals should be assigned outcomes from a set . Under the assumption that is known, they demonstrated that the problem of minimizing an arbitrary loss function , subject to the individual fairness constraint can be formulated as an LP and thus can be solved in time poly.

### 2.2 Approximate Metric-Fairness

We consider a learning setting in which the goal is learning a classifier that satisfies the fairness constraint in Equation (3) with respect to some unknown distribution over , after observing a finite sample . To this end, we introduce a metric-fairness loss function that, for a given classifier and a pair of individuals in , assigns a penalty of 1 if the fairness constraint is violated by more than a γ additive term.

###### Definition 2.2.

For a metric and , the metric-fairness loss on a pair is

$$\ell_{\gamma,d}(h,(x,x')) = \begin{cases} 1 & \Lambda(h(x),h(x')) > d(x,x')+\gamma \\ 0 & \Lambda(h(x),h(x')) \le d(x,x')+\gamma \end{cases} \tag{4}$$

The overall metric-fairness loss for a hypothesis is the expected violation for a random pair according to .

###### Definition 2.3 (Metric-Fairness Loss).

For a metric and ,

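$$L^{F}_{\mathcal{D},d,\gamma}(h) = \mathbb{E}_{x,x'\sim\mathcal{D}}\left[\ell_{\gamma,d}(h,(x,x'))\right] \tag{5}$$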

We go on to define the empirical fairness loss, a data-dependent quantity designed to estimate the unknown . To this end, we think of a sample as defining a complete weighted graph, denoted , whose vertices are and whose edges are weighted by . Now, observe that when is sampled i.i.d. from , any matching is an i.i.d. sample from . [Footnote 9: Note that from the structure of , it has exactly matchings, each of size (w.l.o.g., we assume is odd).] We now define the empirical loss by replacing the expectation over in Equation (5) with the expectation over some matching .

###### Definition 2.4 (Empirical Metric-Fairness loss).

$$L^{F}_{S,d,\gamma}(h) = \frac{2}{m-1}\cdot\sum_{(x,x')\in M(S)}\left[\ell_{\gamma,d}(h,(x,x'))\right] \tag{6}$$
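Definitions 2.2 and 2.4 translate directly into code. The following is a minimal sketch, not the paper's implementation: individuals are assumed to be real numbers, the metric `d` is assumed to be the absolute difference, the matching simply pairs consecutive sample points, and the function names are hypothetical.

```python
def pair_loss(h, d, x, x_prime, gamma):
    """0/1 metric-fairness loss on one pair: 1 iff |h(x) - h(x')| > d(x, x') + gamma."""
    return 1 if abs(h(x) - h(x_prime)) > d(x, x_prime) + gamma else 0

def empirical_mf_loss(h, d, sample, gamma):
    """Average 0/1 loss over the matching (s0, s1), (s2, s3), ...

    For an odd sample size m the last point stays unmatched, so the matching
    has (m - 1) / 2 edges and the average carries the factor 2 / (m - 1).
    """
    matching = [(sample[i], sample[i + 1]) for i in range(0, len(sample) - 1, 2)]
    return sum(pair_loss(h, d, x, xp, gamma) for x, xp in matching) / len(matching)

# Toy data: individuals are points on the line, distance is |x - x'| (assumptions).
d = lambda x, xp: abs(x - xp)
h_step = lambda x: 1.0 if x >= 0.5 else 0.0   # sharp threshold classifier
h_id = lambda x: x                            # 1-Lipschitz classifier
sample = [0.1, 0.2, 0.4, 0.9, 0.45, 0.55, 0.3]

loss_step = empirical_mf_loss(h_step, d, sample, gamma=0.05)
loss_id = empirical_mf_loss(h_id, d, sample, gamma=0.05)
print(loss_step, loss_id)
```

The threshold classifier violates the constraint on the two matched pairs that straddle 0.5, while the 1-Lipschitz classifier never does.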

Finally, we will say that a classifier is -fair w.r.t (respectively, ) and if its respective metric-fairness loss is at most .

###### Definition 2.5 ((α,γ)-Metric-Fairness).

A probabilistic classifier is said to be -fair w.r.t a metric and (respectively, ) if (respectively, ).

##### Notation

When and are clear from context, we will use the more succinct notation for the true fairness loss and for the empirical fairness loss. When dealing with a hypothesis class , we use to denote all the -fair hypotheses in (w.r.t ), and for those which are -fair w.r.t .

### 2.3 Approximate Metric Fairness: Interpretation

An -fair classifier (for ) no longer holds any guarantee for any single individual. To interpret the guarantee it does give, we consider the following definition.

###### Definition 2.6.

A probabilistic classifier is said to be metric-fair w.r.t if

$$\Pr_{x\sim\mathcal{D}}\left[\Pr_{x'\sim\mathcal{D}}\left[\Lambda(h(x),h(x')) > d(x,x')+\gamma\right] > \alpha_2\right] \le \alpha_1 \tag{7}$$

Definition 2.6 is very similar to 2.5 but it lends itself to a more intuitive interpretation of fairness for groups. Informally, we will say that an individual feels -discriminated against by if the proportion of individuals with whom his constraint is violated (think: individuals who are equally qualified to him but receive different treatment) exceeds ; now, -fairness ensures that the proportion of individuals who find to be -discriminatory does not exceed . Hence, this is a guarantee for groups: an -fair classifier cannot cause an entire group of fractional mass to be discriminated against. The strength of this guarantee is that it holds for any such group (even for those formed “ex-ante”). In this sense, -fairness represents a middle-ground between the strict notion of individual fairness and the loose notions of group-fairness.

Finally, we show that the two definitions are related: any -fair classifier is also -fair, for every for which . This demonstrates that optimizing for accuracy under an -fairness constraint is a flexible way of achieving interpretable fairness guarantees for a range of desired values.

###### Claim 2.7.

For every , and for which , if is -fair then it is also -fair.

Proof of Claim 2.7. For simplicity, we define the following indicator function,

$$\mathbb{1}_{\gamma,h}(x,x') = \begin{cases} 1 & \Lambda(h(x),h(x')) > d(x,x')+\gamma \\ 0 & \text{o.w.} \end{cases}$$

Let such that , and assume that is -fair w.r.t . Assume for contradiction that is not -fair w.r.t . That means that

$$\Pr_{x\sim\mathcal{D}}\left[\Pr_{x'\sim\mathcal{D}}\left[\mathbb{1}_{\gamma,h}(x,x')=1\right] > \alpha_2\right] > \alpha_1$$

If we denote the subset of “-discriminated” individuals as ,

$$B \triangleq \left\{x\in\mathcal{X} : \Pr_{x'\sim\mathcal{D}}\left[\mathbb{1}_{\gamma,h}(x,x')=1\right] > \alpha_2\right\}$$

then the assumption is equivalent to . We now obtain:

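$$\begin{aligned} \alpha &\ge \Pr_{x,x'\sim\mathcal{D}}\left[\mathbb{1}_{\gamma,h}(x,x')=1\right] \\ &= \Pr_{x\sim\mathcal{D}}[x\in B]\cdot\Pr_{x'\sim\mathcal{D}}\left[\mathbb{1}_{\gamma,h}(x,x')=1 \mid x\in B\right] + \underbrace{\Pr_{x\sim\mathcal{D}}[x\notin B]\cdot\Pr_{x'\sim\mathcal{D}}\left[\mathbb{1}_{\gamma,h}(x,x')=1 \mid x\notin B\right]}_{\ge 0} \\ &\ge \Pr_{x\sim\mathcal{D}}[x\in B]\cdot\Pr_{x'\sim\mathcal{D}}\left[\mathbb{1}_{\gamma,h}(x,x')=1 \mid x\in B\right] \\ &> \alpha_1\cdot\alpha_2 \end{aligned}$$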

where the first transition is from the assumption that is -fair, and the final transition is from the assumption that is not -fair. We therefore have that , which contradicts our assumption.
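Claim 2.7 can be sanity-checked numerically on a small finite population. The sketch below is an illustration under assumptions that are not part of the paper: the distribution is uniform over 21 points on the line, the metric is the absolute difference, the classifier is a sharp threshold, and all function names are hypothetical. For each tested value of `alpha2` it sets `alpha1 = alpha / alpha2`, so that the product equals the pairwise violation rate `alpha`, and checks the bound of Definition 2.6.

```python
def violates(h, d, x, xp, gamma):
    """True iff the pair (x, x') breaks the fairness constraint by more than gamma."""
    return abs(h(x) - h(xp)) > d(x, xp) + gamma

def pair_violation_rate(population, h, d, gamma):
    """Probability that two independent uniform draws form a violating pair."""
    n = len(population)
    return sum(violates(h, d, x, xp, gamma) for x in population for xp in population) / (n * n)

def discriminated_mass(population, h, d, gamma, alpha2):
    """Fraction of individuals whose own violation rate exceeds alpha2."""
    n = len(population)
    count = 0
    for x in population:
        rate = sum(violates(h, d, x, xp, gamma) for xp in population) / n
        if rate > alpha2:
            count += 1
    return count / n

population = [i / 20 for i in range(21)]
h = lambda x: 1.0 if x >= 0.5 else 0.0
d = lambda x, xp: abs(x - xp)
gamma = 0.1

alpha = pair_violation_rate(population, h, d, gamma)
checks = []
for alpha2 in (0.05, 0.1, 0.2, 0.4):
    alpha1 = alpha / alpha2  # chosen so that alpha1 * alpha2 equals alpha
    checks.append(discriminated_mass(population, h, d, gamma, alpha2) <= alpha1 + 1e-12)
print(alpha, checks)
```

The check is an instance of Markov's inequality applied to each individual's violation rate, which is the same counting argument used in the proof.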

### 2.4 ℓ1-Metric-Fairness

In this work we focus on approximate metric-fairness and the metric fairness loss (Definition 2.2). We find that this notion provides appealing and interpretable protections from discrimination: as discussed above, for small enough , every sufficiently large group is protected from blatant discrimination (see Section 2.3). However, in turning to design efficient metric-fair learning algorithms, working directly with this definition presents difficulties (see Section 5). In particular, the “0/1” nature of the metric fairness loss means that the set is not a convex set. Trying to learn an empirically metric-fair that optimizes accuracy is a non-convex optimization problem, and it isn’t clear how to optimize using convex-optimization tools.

In light of this difficulty, we introduce a different metric-fairness loss definition. It overcomes the non-convexity by replacing the bound on the expected number of fairness violations with a bound on the expected sum of the fairness violations.

###### Definition 2.8 (ℓ1 MF loss).

For a metric , the metric-fairness loss on a pair is

$$\ell^{1}_{d}(h,(x,x')) = \max\left(0,\ |h(x)-h(x')| - d(x,x')\right)$$

Similarly to the regular metric-fairness loss, the loss for a hypothesis is the expected violation for a random pair according to , and the empirical loss replaces the expectation over with the expectation over some matching .
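The loss of Definition 2.8 and its empirical version can be sketched in the same way as the 0/1 loss. The code below is illustrative only: individuals are assumed to be real numbers, `d` is assumed to be the absolute difference, the matching pairs consecutive sample points, and the names are hypothetical. It also checks, on one example, that the loss of an average of two predictors is at most the average of their losses, which is the kind of convexity that the 0/1 loss lacks.

```python
def l1_pair_loss(h, d, x, x_prime):
    """Amount by which |h(x) - h(x')| exceeds d(x, x'), or 0 if it does not."""
    return max(0.0, abs(h(x) - h(x_prime)) - d(x, x_prime))

def empirical_l1_loss(h, d, sample):
    """Average pair loss over the matching (s0, s1), (s2, s3), ..."""
    matching = [(sample[i], sample[i + 1]) for i in range(0, len(sample) - 1, 2)]
    return sum(l1_pair_loss(h, d, x, xp) for x, xp in matching) / len(matching)

d = lambda x, xp: abs(x - xp)
h_step = lambda x: 1.0 if x >= 0.5 else 0.0
h_id = lambda x: x
h_mix = lambda x: 0.5 * h_step(x) + 0.5 * h_id(x)  # average of the two predictors
sample = [0.1, 0.2, 0.4, 0.9, 0.45, 0.55, 0.3]

loss_step = empirical_l1_loss(h_step, d, sample)
loss_id = empirical_l1_loss(h_id, d, sample)
loss_mix = empirical_l1_loss(h_mix, d, sample)
print(loss_step, loss_id, loss_mix)
```

Unlike the 0/1 loss, this quantity records how large each violation is, so a few large violations and many small ones can yield the same value.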

###### Definition 2.9 (τℓ1-Metric-Fairness).

A probabilistic classifier is said to be -metric-fair w.r.t a metric and w.r.t (respectively, ) if its respective MF loss is bounded by .

When are clear from context we use the notation (respectively, ) for the subset of hypotheses from which are -MF w.r.t (respectively, ). We use (previously ) to emphasize that fairness is calculated w.r.t the standard MF loss.

The main advantage of ℓ1 metric-fairness is that it induces convex constraints and tractable optimization problems. We note that ℓ1 metric-fairness also generalizes from a sample (indeed, in this case the fairness loss is Lipschitz and thus it’s easier to prove generalization bounds). The main disadvantage of ℓ1 metric-fairness is in interpreting the guarantee: it is less immediately obvious how bounding the sum of “fairness deviations” translates into protections for groups or for individuals. Nonetheless, we show that -metric-fairness implies approximate metric fairness for every . Thus, when we optimize over the set of -MF predictors, we are guaranteed to output an approximately metric-fair predictor. Moreover, since approximate MF also implies -MF, any solution to the () optimization problem will also be competitive with a certain class of approximately metric-fair classifiers.

The connection between approximate and ℓ1 metric fairness is quantified in Lemma 2.10 below. We note that there is a gap between these upper and lower bounds. This is the “price” we pay for relaxing from approximate to ℓ1 metric fairness. We also note that in proving that -MF implies approximate metric-fairness, it is essential to use a non-zero additive violation term.

###### Lemma 2.10.

For every sample , matching and every , .

Proof of Lemma 2.10. We begin by defining the induced violation vector of a classifier . For a sample , a matching and a value , the induced violation vector is defined as:

$$[\xi_\gamma(h)]_i = \max\left\{0,\ |h(x)-h(x')| - d(x,x') - \gamma\right\}$$

where is the -th edge in the matching .

We first prove that . If , then this means that . Assume for contradiction that for some we have . But this implies that , which is a contradiction. We therefore have that from which we conclude that .

Next, we prove that . To do so, we’ll prove that . Observe that for every we have that because . To see that , recall that implies that . Thus:

$$\|\xi_0(h)\|_1 \le \frac{\tau-\gamma}{1-\gamma}\cdot 1 + \left(1-\frac{\tau-\gamma}{1-\gamma}\right)\cdot\gamma = \tau$$

which implies , as required.
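The final equality in the last display of the proof is a short piece of algebra, spelled out here for convenience (it only assumes that γ < 1, so that the fraction is well defined):

```latex
\frac{\tau-\gamma}{1-\gamma}\cdot 1 + \Bigl(1-\frac{\tau-\gamma}{1-\gamma}\Bigr)\cdot\gamma
  \;=\; \gamma + \frac{\tau-\gamma}{1-\gamma}\,(1-\gamma)
  \;=\; \gamma + (\tau-\gamma)
  \;=\; \tau .
```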

### 2.5 Generalization

A key issue in learning theory is that of generalization: to what extent is a classifier that is accurate on a finite sample also guaranteed to be accurate w.r.t. the underlying distribution? In this section, we work to develop similar generalization bounds for our metric-fairness loss function. Proving that fairness can generalize well is a crucial component in our analysis: it effectively rules out the possibility of creating a “false facade” of fairness (i.e., a classifier that only appears fair on a sample, but is not fair w.r.t. new individuals).

The generalization bounds will be based on proving uniform convergence of the empirical estimates (in our case, ) to the fairness loss, simultaneously for every , in terms of the Rademacher complexity of the hypothesis class . Rademacher complexity differs from the celebrated VC-dimension complexity measure in three aspects: first, it is defined for any class of real-valued functions (making it suitable for our setting of learning probabilistic classifiers); second, it is data-dependent and can be measured from finite samples; third, it often results in tighter uniform convergence bounds (see, e.g., [KP02]).

Let be an input space, a distribution on , and a real-valued function class defined on . The empirical Rademacher complexity of with respect to a sample is the following random variable:

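$$\hat{R}_m(\mathcal{F}) = \mathbb{E}_{\sigma}\left[\sup_{f\in\mathcal{F}}\left(\frac{1}{m}\sum_{i=1}^{m}\sigma_i f(z_i)\right)\right] \tag{8}$$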

The expectation is taken over where the ’s are independent uniform random variables taking values in {−1, +1}. The Rademacher complexity of is defined as the expectation of over all samples of size :

$$R_m(\mathcal{F}) = \mathbb{E}_{S\sim\mathcal{D}^m}\left[\hat{R}_m(\mathcal{F})\right] \tag{9}$$
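For intuition, the empirical Rademacher complexity in Equation (8) can be estimated by Monte Carlo for a simple class. The sketch below assumes a class that is convenient for illustration rather than one taken from this section: linear functions x ↦ ⟨w, x⟩ with Euclidean norm of w at most `C`, for which the supremum has the closed form (C/m)·‖Σ σ_i x_i‖ by the Cauchy-Schwarz inequality. The function names, the sample, and the number of trials are all hypothetical choices.

```python
import math
import random

def empirical_rademacher_linear(points, C, trials=2000, seed=0):
    """Monte Carlo estimate of Equation (8) for {x -> <w, x> : ||w||_2 <= C}."""
    rng = random.Random(seed)
    m, dim = len(points), len(points[0])
    total = 0.0
    for _ in range(trials):
        sigma = [rng.choice((-1, 1)) for _ in range(m)]  # Rademacher signs
        signed_sum = [sum(sigma[i] * points[i][j] for i in range(m)) for j in range(dim)]
        # sup over ||w|| <= C of (1/m) <w, signed_sum> equals (C/m) * ||signed_sum||
        total += C * math.sqrt(sum(v * v for v in signed_sum)) / m
    return total / trials

def jensen_upper_bound(points, C):
    """Upper bound (C/m) * sqrt(sum_i ||x_i||^2), obtained via Jensen's inequality."""
    m = len(points)
    return C * math.sqrt(sum(sum(v * v for v in p) for p in points)) / m

data_rng = random.Random(1)
points = [(data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)) for _ in range(50)]
estimate = empirical_rademacher_linear(points, C=1.0)
bound = jensen_upper_bound(points, C=1.0)
print(estimate, bound)
```

The estimate shrinks roughly like 1/√m as the sample grows, which is the behavior that makes Rademacher-based uniform convergence bounds useful for norm-bounded linear classes.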

In our case, the fact that our fairness loss is a 0-1 style-loss means that for infinite hypothesis classes, the generalization argument is involved and we incur an extra approximation term in .

###### Theorem 2.12 (Rademacher-Based Uniform Convergence of the Metric-Fairness Loss).

Let be a hypothesis class with Rademacher complexity . For every , every and every (w.l.o.g. assume is odd), with probability at least over an i.i.d. sample , simultaneously for every :

$$L^{F}_{\gamma+\frac{1}{G}}(h) - \Delta_m \le \hat{L}^{F}_{\gamma}(h) \le L^{F}_{\gamma-\frac{1}{G}}(h) + \Delta_m \tag{10}$$

where .

Before proving the theorem, we note that an immediate corollary is that our metric-fairness loss is capable of generalizing well for any hypothesis class that has a small Rademacher complexity. In particular, we will be using the following result for the class of linear classifiers with norm bounded by in an RKHS whose inner products are implemented by a kernel (i.e., there exists a mapping such that ):

$$\mathcal{H}_{\psi,C} \overset{\text{def}}{=} \left\{x\mapsto\langle v,\psi(x)\rangle : v\in V,\ \|v\|\le C\right\}$$

###### Corollary 2.13.

Let as above, for any and kernel . For every , every and every , w.p at least , simultaneously for every

$$L^{F}_{\gamma+\frac{1}{G}}(h) - \Delta_m \le \hat{L}^{F}_{\gamma}(h) \le L^{F}_{\gamma-\frac{1}{G}}(h) + \Delta_m \tag{11}$$

where and .

Proof of Corollary 2.13. The proof follows from Theorem 2.12 and the fact that the Rademacher complexity is bounded by (see [KST09]).

To set the stage for proving Theorem 2.12, we state some useful properties of the Rademacher complexity notion, see [BM02].

###### Theorem 2.14 (Two-sided Rademacher-based uniform convergence).

Consider a set of functions mapping to . For every , with probability at least over a random draw of a sample of size , every satisfies

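$$\left|\mathbb{E}_{\mathcal{D}}[f(z)] - \hat{\mathbb{E}}[f(z)]\right| \le 2R_m(\mathcal{F}) + \sqrt{\frac{\log\frac{2}{\delta}}{2m}} \tag{12}$$

$$\left|\mathbb{E}_{\mathcal{D}}[f(z)] - \hat{\mathbb{E}}[f(z)]\right| \le 2\hat{R}_m(\mathcal{F}) + 3\sqrt{\frac{\log\frac{4}{\delta}}{2m}} \tag{13}$$

###### Lemma 2.15.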

Let be a class of real functions. Then, for any sample of size :

1. For any function , .

2. If is -Lipschitz and satisfies , then .

3. For every , w.p at least over the choice of ,

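$$-2\sqrt{\frac{\log\frac{2}{\delta}}{2m}} \le R_m(\mathcal{F}) - \hat{R}_m(\mathcal{F}) \le 2\sqrt{\frac{\log\frac{2}{\delta}}{2m}}$$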

Proof of Theorem 2.12.

Consider the input space and consider the functions , where is the fairness loss defined as

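$$\ell^{h}_{\gamma}(z) = \ell^{h}(x,x') = \mathbb{1}\left[|h(x)-h(x')| > d(x,x')+\gamma\right] \triangleq \begin{cases} 1 & |h(x)-h(x')| > d(x,x')+\gamma \\ 0 & \text{o.w.} \end{cases}$$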

For a given sample (w.l.o.g, assume is odd), let be any matching in the graph induced by . Observe that is indeed an i.i.d sample from and recall that . For the sake of simplicity, we hereby denote as the empirical Rademacher complexity with respect to induced by a random sample .

Denote the threshold function at :

$$\sigma_\gamma(x) = \begin{cases} 1 & x > \gamma \\ 0 & x \le \gamma \end{cases}$$

Hence,

$$\mathcal{F} = \sigma_\gamma\circ\mathcal{G}, \qquad \mathcal{G} = \left\{(x,x')\mapsto |h(x)-h(x')| - d(x,x')\right\}_{h\in\mathcal{H}}$$

Observe that can be further decomposed as

$$\mathcal{G} = \mathrm{abs}\circ\mathcal{H}' + f$$

where , abs is the absolute value function (which is 1-Lipschitz), and .

###### Claim 2.16.

.

Proof of Claim 2.16. Denote where . By definition,

 R˜m(H′) =ES∼Dm[ˆR˜m(H′)] =ES,σ[supgh∈H′(1~mm∑i=1σigh(zi))] =ES,σ[supgh∈H′(1~mm∑i=1σi(h(x1i)−h(x2i)))] ≤ES,σ[supgh∈H′(1~mm∑i=1σih(x1i))+supgh∈H′(1~mm∑i=1−σih(x2i))] =E