# Ranking with Fairness Constraints

L. Elisa Celis, Damian Straszak, Nisheeth K. Vishnoi

École Polytechnique Fédérale de Lausanne (EPFL), Switzerland

###### Abstract

The problem of ranking a set of items is a fundamental algorithmic task in today’s data-driven world. Ranking algorithms lie at the core of applications such as search engines, news feeds, and recommendation systems. However, recent events and studies show that bias exists in the output of such applications. This results in unfairness or decreased diversity in the presentation of the content and can exacerbate stereotypes and manipulate perceptions. Motivated by these concerns, in this paper we introduce a framework for incorporating fairness in ranking problems. In our model, we are given a collection of items along with 1) the value of placing an item at a particular position, 2) the collection of possibly non-disjoint attributes (e.g., gender and ethnicity or genre and price point depending on the context) of each item and 3) a collection of fairness constraints that bound the number of items with each attribute that are allowed to appear in the top positions of the ranking. The goal is to output a ranking that maximizes value while respecting the fairness constraints. We present algorithms along with complementary hardness results which, together, come close to settling the complexity of this constrained ranking maximization problem.

## 1 Introduction

Selecting and ranking a subset of data is a fundamental problem in information retrieval and is at the core of ubiquitous applications including Google search, Facebook feeds, Amazon products and Netflix recommendations; it also increasingly appears in social settings such as bank loan applications [ZVGRG15] and recidivism risk scores [ALMK16]. The basic algorithmic problem that arises is as follows: There are $m$ items (e.g., products, images or people), and the goal is to output a list of $n$ items in the order that is most valuable to a given user or company. For each item $i \in [m]$ and a position $j \in [n]$ one is given a number $w_{ij}$ that captures the value that item $i$ contributes to the ranking if placed at position $j$. These values can be tailored to a particular query or user and a significant effort has gone into developing models and mechanisms to learn these parameters [MRS08]. In practice there are many ways one could arrive at $w_{ij}$, each of which results in a slightly different metric for the value of a ranking – prevalent examples include versions of discounted cumulative gain (DCG), Bradley-Terry and Spearman’s rho (see Appendix A for a discussion). Generally, in such metrics, $w_{ij}$ is non-increasing in both $i$ and $j$, and if we interpret $i_1 < i_2$ to mean that $i_1$ has better quality than $i_2$, then the value of the ranking can only increase by placing $i_1$ above $i_2$ in the ranking. Formally, the $w_{ij}$s satisfy $w_{i_1 j} \geq w_{i_2 j}$ and $w_{i j_1} \geq w_{i j_2}$, and the Monge property

$$w_{i_1 j_1} + w_{i_2 j_2} \geq w_{i_1 j_2} + w_{i_2 j_1} \tag{1}$$

for all $i_1 < i_2$ and $j_1 < j_2$. Then, the ranking maximization problem is to find an assignment of the items to each of the positions that maximizes the total value obtained. This problem is equivalent to finding the maximum weight matching in a complete bipartite graph and has a simple and widely deployed solution.
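The assumptions on the values can be checked mechanically. The following is a minimal sketch, not part of the paper: the function name `satisfies_value_assumptions` and the 0-indexed list-of-lists layout for the value matrix are assumptions made here for illustration. It tests monotonicity in both indices and the Monge property (1) by brute force.

```python
def satisfies_value_assumptions(w):
    """Check the assumptions on a value matrix w (hypothetical helper).

    w[i][j] is the value of item i at position j, both 0-indexed.
    Returns True when w is non-increasing in i and in j and
    satisfies the Monge property (1) for all i1 < i2 and j1 < j2.
    """
    m, n = len(w), len(w[0])
    # Non-increasing in the item index: lower index means better item.
    rows_ok = all(w[i][j] >= w[i + 1][j]
                  for i in range(m - 1) for j in range(n))
    # Non-increasing in the position index: earlier positions are worth more.
    cols_ok = all(w[i][j] >= w[i][j + 1]
                  for i in range(m) for j in range(n - 1))
    # Monge property (1), checked over all pairs of rows and columns.
    monge_ok = all(
        w[i1][j1] + w[i2][j2] >= w[i1][j2] + w[i2][j1]
        for i1 in range(m) for i2 in range(i1 + 1, m)
        for j1 in range(n) for j2 in range(j1 + 1, n)
    )
    return rows_ok and cols_ok and monge_ok
```

For instance, product-form values $w_{ij} = a_i b_j$ with non-negative, non-increasing $a$ and $b$ pass the check, since $a_{i_1}b_{j_1} + a_{i_2}b_{j_2} - a_{i_1}b_{j_2} - a_{i_2}b_{j_1} = (a_{i_1} - a_{i_2})(b_{j_1} - b_{j_2}) \geq 0$.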

However, recent studies have shown that optimizing rankings in this manner can result in one type of content being overrepresented at the expense of another and can lead to grave societal consequences: from image search results that inadvertently promote stereotypes in various professions by over/under-representing people with certain racial or gender attributes [KMM15], to recommendation systems that inherently hold gender and other human biases [BCZ16, CBN17], to news feeds that promote extremist ideology [BMA15, CHRG16] and can even manipulate the results of elections [ER15], to grave discriminatory actions such as disproportionately rating members of minority populations as being at higher risk of recidivism [ALMK16].

Prior approaches redefine the objective function in the maximization problem to incorporate a notion of fairness or diversity. E.g., a common approach is to re-weight the $w_{ij}$s to attempt to capture the amount of diversity item $i$ would introduce at position $j$ conditioned on the items placed above it [CG98, ZMKL05], or cast it directly as an (unconstrained) multi-objective optimization problem [YS17]. Alternate approaches aggregate different rankings, e.g., as generated by different interpretations of a query [DKNS01]. Despite these efforts, the above mentioned biased outcomes still occur. In essence, there is a tension between utility and fairness – if the $w_{ij}$s for items that have a given property are sufficiently higher than the rest, the above approaches do not correct for overrepresentation.

To address this, in this paper, we cast the problem as a constrained optimization problem and introduce the constrained ranking maximization problem, to guarantee that the ranking that is output has no type of content that dominates, i.e., to ensure the rankings are fair. (Footnote 1: Note that, beyond fairness, traditional diversification concerns in information retrieval such as query ambiguity (does “jaguar” refer to the car or the animal?) or user context (does the user want to see webpages, news articles, academic papers or images?) can also be cast in our framework.) As fairness (or bias) could mean different things in different contexts, rather than fixing a specific notion of fairness, we allow the user to specify a set of fairness constraints. Theoretically, the optimization problem that arises in our model is a type of a constrained matching problem and our work significantly extends past work on this problem – both in terms of the quality of the solution and the running times. In particular we leverage the fact that the set of constraints, while quite general, have a nested structure, and the objective function satisfies the Monge property stated in (1). We present our model and the corresponding theoretical results in Section 1.1. An overview of our proofs is given in Section 1.4. Overall, our model allows for the specification of general and flexible fairness constraints and, together with the accompanying algorithms, presents a way forward towards incorporating fairness in ranking algorithms and alleviating corresponding biases in society.

### 1.1 Our contributions

#### Our model.

As a motivating example, consider the setting in which the set of items consists of images of scientists, each image is associated with several (possibly non-disjoint) sensitive attributes or properties such as gender, ethnicity and age, and a subset of size $n$ needs to be selected and ranked. In our model, the user can specify an upper-bound $u_{k\ell}$ on the number of items with property $\ell$ that are allowed to appear in the top $k$ positions of the ranking. Formally, let $[p] = \{1, \ldots, p\}$ be a set of properties and let $P_\ell \subseteq [m]$ be the set of items that have the property $\ell \in [p]$. Let $x$ be an $m \times n$ binary assignment matrix whose $j$-th column contains a one in the $i$-th position if item $i$ is assigned to position $j$ (each position must be assigned to exactly one item and each item can be assigned to at most one position). We say that $x$ satisfies the fairness constraints if for all $\ell \in [p]$ and $k \in [n]$, we have

$$\sum_{1 \leq j \leq k} \sum_{i \in P_\ell} x_{ij} \leq u_{k\ell}. \tag{2}$$

If we let $\mathcal{B}$ be the family of all assignment matrices $x$ that satisfy the fairness constraints, the constrained ranking maximization problem is: Given the sets of items with each property $\{P_1, \ldots, P_p\}$, the fairness constraints $\{u_{k\ell}\}$, and the values $\{w_{ij}\}$, find

$$\arg\max_{x \in \mathcal{B}} \sum_{i \in [m],\, j \in [n]} w_{ij} x_{ij}. \tag{3}$$

This problem is equivalent to finding a maximum weight matching of size $n$ that satisfies the given fairness constraints in a weighted complete bipartite graph, and now becomes non-trivial – its complexity is the central object of study in this paper. Our model allows for $P_\ell$s to be non-disjoint. Thus, by controlling $u_{k\ell}$s, one can check the bias with respect to a wide range of discrimination metrics for arbitrary group structures in rankings.
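To make the model concrete, here is a small sketch of how the objective (3) and the fairness constraints (2) can be evaluated for a given ranking. The function names, the 0-indexed positions, and the representation of the bounds as a list of dictionaries (`u[k][prop]` bounds the top `k + 1` positions) are assumptions made for this illustration, not notation from the paper.

```python
def ranking_value(ranking, w):
    """Total value of a ranking, where ranking[j] is the item at position j."""
    return sum(w[item][pos] for pos, item in enumerate(ranking))


def is_fair(ranking, properties, u):
    """Check the fairness constraints (2) for every prefix of the ranking.

    properties: dict mapping a property name to the set of items that have it
                (the sets may overlap).
    u:          list of dicts; u[k][prop] is the maximum number of items with
                property prop allowed in the top k + 1 positions.
    """
    if len(set(ranking)) != len(ranking):
        return False  # an item may occupy at most one position
    counts = {prop: 0 for prop in properties}
    for k, item in enumerate(ranking):
        for prop, members in properties.items():
            if item in members:
                counts[prop] += 1
            if counts[prop] > u[k][prop]:
                return False
    return True
```

With two disjoint groups and a bound of one item per group in the top two positions, a ranking that places two items of the same group first is rejected, while a mixed ranking is accepted.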

#### Our results.

Let the type $T_i := \{\ell \in [p] : i \in P_\ell\}$ of item $i$ be the set of properties that the item has. Our first result is an exact algorithm for solving the constrained ranking maximization problem whose running time is polynomial if the number of distinct $T_i$s, denoted by $q$, is constant.

###### Theorem 1.1 (Exact dynamic programming-based algorithm; see Theorem 2.1)

There is an algorithm that solves the constrained ranking maximization problem in time when the values satisfy property (1).

This algorithm combines a geometric interpretation of our problem along with dynamic programming and proceeds by solving a sequence of $q$-dimensional sub-problems, where $q$ is the number of distinct types. The proof of Theorem 1.1 is provided in Section 2. When $q$ is allowed to be large, the problem is NP-hard; see Theorem 5.1.

Generally, we may not be able to assume that is a constant and, even then, it would be desirable to have algorithms whose running time is close to , the size of the input. Towards this we consider a natural parameter of the set of properties: The size of the largest , namely . The complexity of the constrained ranking maximization problem seems to show interesting behavior with respect to (note that and typically ). The case when corresponds to the simplest practical setting where there are disjoint properties, i.e., the properties partition the set of items. For instance, a set of images of humans could be partitioned based on the ethnicity or age of the individual. Note that even though for , this could still be large and the previous theorem may have a prohibitively large running time.

When $\Delta = 1$ we give two different exact algorithms for the constrained ranking maximization problem. The first is a fast greedy algorithm and the second relies on a natural linear programming (LP) relaxation for the constrained ranking maximization problem and reveals interesting structure of the problem. Formally, the relaxation considers the set $\Omega_{m,n}$ defined as

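$$\Omega_{m,n} := \left\{ x \in [0,1]^{m \times n} : \sum_{j=1}^{n} x_{ij} \leq 1 \text{ for all } i \in [m], \quad \sum_{i=1}^{m} x_{ij} = 1 \text{ for all } j \in [n] \right\}$$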

and the following linear program

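$$\max_{x \in \Omega_{m,n}} \sum_{i=1}^{m} \sum_{j=1}^{n} w_{ij} x_{ij} \quad \text{s.t.} \quad \sum_{i \in P_\ell} \sum_{j=1}^{k} x_{ij} \leq u_{k\ell}, \quad \forall\, \ell \in [p],\ k \in [n]. \tag{4}$$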

Observe that in the absence of fairness constraints, (4) represents the maximum weight bipartite matching problem – it is well known that the feasible region of its fractional relaxation has integral vertices and hence the optimal values of these two coincide. However, in the constrained setting, even for $\Delta = 1$, the feasible region is no longer integral – it can have fractional vertices (see Fact 3.2). For this reason, it is not true that maximizing any linear objective results in an integral or even optimal solution. Surprisingly, we prove that for $\Delta = 1$ the cost functions we consider are special and never yield optimal fractional (vertex) solutions.

###### Theorem 1.2 (Exact LP-based algorithm for Δ=1; see Theorems 3.1 and 3.3)

Consider the above linear programming relaxation for the constrained ranking maximization problem when and the objective function satisfies (1). Then there exists an optimal solution with integral entries and hence the relaxation is exact. Further, there exists a greedy algorithm to find an optimal integral solution in time.

The proof uses a combinatorial argument on the structure of tight constraints that crucially uses the assumption that $\Delta = 1$ and the property (1) of the objective function, and the argument cannot be extended to larger $\Delta$.

When trying to design algorithms for larger , the difficulty is that the constrained ranking feasibility problem remains -hard (in fact, hard to approximate) for ; see Theorems 5.1 and 5.2. Together, these latter results imply that unless we restrict to feasible instances of the constrained ranking problem, it is impossible to obtain any reasonable approximation algorithm for this problem. In order to bypass this barrier, we present an algorithmically verifiable condition for feasibility and argue that it is quite natural in the context of information retrieval. For each we consider the set

$$S_k := \{\ell \in [p] : u_{(k-1)\ell} + 1 \leq u_{k\ell}\}$$

of all properties whose constraints increase by at least $1$ when going from the $(k-1)$st to the $k$th position. We observe that the following abundance of items condition is sufficient for feasibility:

$$\forall k \ \text{ there are } n \text{ items } i \text{ s.t. } T_i \subseteq S_k. \tag{5}$$

Simple examples show that this condition can be necessary for certain constraints $\{u_{k\ell}\}$. In practice, the abundance of items assumption is almost never a problem – the available items (e.g., webpages) far outnumber the size of the ranking (e.g., number of positions on the first search result page) and the number of properties (i.e., there are only so many “types” of webpages).
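The abundance of items condition (5) is easy to verify directly from the input. The sketch below is illustrative only: the function name, the 0-indexed positions, and the convention that the bounds before the first position are all zero (i.e., $u_{0\ell} = 0$) are assumptions made here, and "there are $n$ items" is read as "at least $n$ items".

```python
def abundance_condition_holds(types, u, n):
    """Check condition (5) for every position k.

    types: list where types[i] is the set T_i of properties of item i.
    u:     list of dicts; u[k][prop] bounds the top k + 1 positions.
    n:     number of ranking positions.
    Assumes the bounds before the first position are zero.
    """
    props = list(u[0])
    for k in range(n):
        prev = u[k - 1] if k > 0 else {prop: 0 for prop in props}
        # S_k: properties whose bound grows by at least 1 at position k.
        s_k = {prop for prop in props if prev[prop] + 1 <= u[k][prop]}
        # Count items whose whole type is contained in S_k.
        available = sum(1 for t in types if set(t) <= s_k)
        if available < n:
            return False
    return True
```

Note that items with an empty type always count towards the condition, since the empty set is contained in every $S_k$.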

We show that assuming this condition, there is an algorithm that achieves a $(\Delta + 2)$-approximation, while only slightly violating the constraints. This result does not need assumption (1) on the objective function, rather only that the $w_{ij}$s are non-negative. This result is near-optimal; we provide a hardness of approximation result (see Section 5).

###### Theorem 1.3 ((Δ+2)-approximation algorithm; see Theorem 4.1)

For the constrained ranking maximization problem, under the assumption (5), there is an algorithm that in polynomial time outputs a ranking $x$ with value at least $\frac{1}{\Delta + 2}$ times the optimal one, such that $x$ satisfies the fairness constraints with at most a twice multiplicative violation, i.e.,

$$\sum_{i \in P_\ell} \sum_{j=1}^{k} x_{ij} \leq 2 u_{k\ell}, \quad \text{for all } \ell \in [p] \text{ and } k \in [n].$$

Lastly we summarize our hardness results for the constrained ranking problem.

###### Theorem 1.4 (Hardness Results – Informal)

The following variants of the constrained ranking feasibility and constrained ranking maximization problem are NP-hard.

1. Deciding feasibility for the case of (Theorem 5.1).

2. Under the feasibility condition (5), approximating the optimal value of a ranking within a factor , for any (Theorem 5.2).

3. Deciding feasibility when only the number of items , number of positions , and upper-bounds are given as input; the properties are fixed for every (Theorem 5.3).

4. For every constant , deciding between whether there exists a feasible solution or every solution violates some constraint by a factor of (Theorem 5.4).

### 1.2 Discussion

In this paper, motivated by controlling and alleviating algorithmic bias in information retrieval, we propose the versatile constrained ranking maximization problem and study its complexity. Our results indicate that the constrained ranking maximization problem, which is a generalization of the classic bipartite matching problem, shows nuanced complexity and, in the most practically relevant cases ($\Delta$ being a small constant), has fast and good algorithmic solutions. For instance, data that is partitioned into groups such as gender or genres (when ranking movies) corresponds to $\Delta = 1$, and having two demographic partitions such as ethnicity and age brackets would correspond to $\Delta = 2$. In these instances, one can easily see how adding fairness constraints can allow one to satisfy detailed proportionality constraints; e.g., for gender, by setting the $u_{k\ell}$ to be roughly $k/2$, our model ensures that at any stage in the ranking, we have listed approximately half men and half women. Theorem 1.2 shows how to find an optimal ranking that satisfies such fairness constraints very quickly.

The constrained ranking maximization problem also generalizes several hypergraph matching/packing problems, here we mention the most relevant. [AFK96] considered the bipartite perfect matching problem with constraints. They present a polynomial time randomized algorithm that finds a near-perfect matching which violates each constraint additively by at most . [GRSZ14] improved the above result to a approximation algorithm; however, the running time of their algorithm is roughly where is the number of hard constraints and the output is a matching. [Sri95] studied the approximability of the packing integer program problem which, when applied to our setting, gives an approximation algorithm. For our constrained ranking maximization problem all of these results seem inadequate as the number of fairness constraints is which would make the running time of [GRSZ14] too large and an additive violation of would render the upper bound constraints impotent. Our algorithmic results bypass the obstacles implicit in the past theory work by leveraging on the structural properties of the constraints and common objective functions from information retrieval.

It would be interesting to test our algorithms in the real world to see how much utility is lost by the addition of fairness constraints. Finally, it would be interesting to conduct experiments with human subjects to test if deploying our framework can actually undo people’s perceptions or opinions.

### 1.3 Other related work

Information retrieval, which focuses on selecting and ranking subsets of data, has a rich history in computer science, and is a well-established subfield in and of itself; see, e.g., the foundational work by [SB88]. The probability ranking principle (PRP) forms the foundation of information retrieval research [MK60, Rob77]; in our context it states that a system’s ranking should order items by decreasing value. Our problem formulation and solutions are in line with this – subject to satisfying the diversity constraints.

A related problem is diverse data summarization in which a subset of items with varied properties must be selected from a large set [PDSAT12, CDKV16]. However, the formulation of the problem is considerably different as there is no need to produce a ranking of the selected items, and hence also no ranking constraints. Extending work on fairness in classification problems [ZWS13], the fair ranking problem has also been studied as an (unconstrained) multi-objective optimization problem, and various metrics for measuring the fairness of a ranking have been proposed [YS17].

Combining the learning of values along with the ranking of items has also been studied [RKJ08, SRG13]; in each round an algorithm chooses an ordered list of documents as a function of the estimated values and can receive a click on one of them. These clicks are used to update the estimate of the $w_{ij}$s, and bounds on the regret (i.e., learning rate) can be given using a bandit framework. In this problem, while there are different types of items that can affect the click probabilities, there are no constraints on how they should be displayed.

Recent work has shown that, in many settings, there are impossibility results that prevent us from attaining both property and item fairness [KMR17]. Indeed, our work focuses on ensuring property fairness (i.e., no property is overrepresented), however this comes at an expense of item fairness (i.e., depending on which properties an item has, it may have much higher / lower probability of being displayed than another item with the same value). In our motivating application we deal with the ranking of documents or webpages, and hence are satisfied with this trade-off. However, further consideration may be required if, e.g., we wish to rank people as this would give individuals different likelihoods of being near the top of the list based on their properties rather than solely on their value.

### 1.4 Proof overviews

#### Overview of the proof of Theorem 1.1.

We first observe that the constrained ranking maximization problem has a simple geometric interpretation. Every item $i$ can be assigned a property vector $t_i \in \{0,1\}^p$ whose $\ell$-th entry is $1$ if item $i$ has property $\ell$ and $0$ otherwise. We can then think of the constrained ranking maximization problem as finding a sequence of $n$ distinct items $i_1, \ldots, i_n$ such that

$$\sum_{j=1}^{k} t_{i_j} \leq u_k, \quad \text{for all } k \in [n],$$

where $u_k$ is the vector whose $\ell$-th entry is $u_{k\ell}$. In other words, we require that the partial sums of the vectors corresponding to the top $k$ items in the ranking stay within the region defined by the fairness constraints.

Let $v_1, \ldots, v_q$ denote the different property vectors that appear among the items $i \in [m]$. A simple but important observation is that whenever two items $i_1, i_2$ (with say $i_1 < i_2$) have the same property vector, $t_{i_1} = t_{i_2}$, then in every optimal solution either $i_1$ will be ranked above $i_2$, only $i_1$ is ranked, or neither is used. This follows from the assumption that the weight matrix $w$ is monotone in $i$ and $j$ and satisfies the property as stated in (1).

Let us now define the following sub-problem that asks for the property vectors of a feasible solution, where $v_1, \ldots, v_q$ are the distinct property vectors: Given a tuple $(s_1, \ldots, s_q)$ of non-negative integers such that $k = s_1 + \cdots + s_q \leq n$, what is the optimal way to obtain a feasible ranking on $k$ items such that $s_d$ of them have property vector equal to $v_d$ for all $d \in [q]$? Given a solution to this sub-problem, using the observation above, it is easy to determine which items should be used for a given property vector, and in what order. Further, one can easily solve such a sub-problem given the solutions to smaller sub-problems (with a smaller sum of $s_d$s), resulting in a dynamic programming algorithm with $O(n^q)$ states and, hence, roughly the same running time.
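The dynamic program over type-count tuples can be sketched as follows. This is a simplified illustration rather than the algorithm of Theorem 2.1 as stated in the paper: the function name, the 0-indexed data layout, and the use of dictionaries for the bounds are assumptions made here, and no attempt is made to match the paper's running time. It relies on the observation above that, within one type, items are used in index order, which requires the values to be monotone and to satisfy (1).

```python
def constrained_ranking_dp(w, types, u, n):
    """Exact DP over tuples (s_1, ..., s_q) counting used items per type.

    w[i][j]:  value of item i at position j (lower item index = better item).
    types[i]: set of properties of item i.
    u[k]:     dict mapping each property to its bound for the top k + 1 positions.
    Returns (best_value, ranking), or None if no feasible ranking exists.
    """
    # Group items by type; within a group, items stay in index order.
    groups = {}
    for i, t in enumerate(types):
        groups.setdefault(frozenset(t), []).append(i)
    type_list = list(groups)
    q = len(type_list)

    def feasible(s, k):
        # Prefix of length k with s[d] items of type d must respect u[k - 1].
        return all(
            sum(s[d] for d in range(q) if prop in type_list[d]) <= bound
            for prop, bound in u[k - 1].items()
        )

    layer = {(0,) * q: (0, [])}  # states whose counts sum to k - 1
    for k in range(1, n + 1):
        nxt = {}
        for s, (val, rank) in layer.items():
            for d in range(q):
                members = groups[type_list[d]]
                if s[d] >= len(members):
                    continue  # no unused item of this type remains
                t = s[:d] + (s[d] + 1,) + s[d + 1:]
                if not feasible(t, k):
                    continue
                item = members[s[d]]  # best unused item of type d
                cand = val + w[item][k - 1]
                if t not in nxt or cand > nxt[t][0]:
                    nxt[t] = (cand, rank + [item])
        layer = nxt
        if not layer:
            return None
    return max(layer.values(), key=lambda vr: vr[0])
```

Each layer holds only states whose every prefix was feasible, so the final layer contains exactly the type-count tuples reachable by feasible rankings of length $n$.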

#### Overview of the proof of Theorem 1.2.

Unlike the unconstrained case, where the LP-relaxation (4) has no non-integral vertex (it is the assignment polytope), as shown in Fact 3.2, even when $\Delta = 1$, fractional vertices can arise. Theorem 1.2 implies that for $\Delta = 1$, although the feasible region of (4) is not integral in all directions, it is along the directions of interest. In the proof we first reduce the problem to the case when $m = n$ (i.e., when one has to rank all of the items) and $w$ has the strict form of property (1) (i.e., when the inequalities in assumption (1) are strict). Our strategy then is to prove that for every fractional feasible solution $x$ there is a direction $y$ such that the solution $x + \varepsilon y$ is still feasible (for some $\varepsilon > 0$) and its weight is larger than the weight of $x$. This implies that every optimal solution is necessarily integral.

Combinatorially, the directions we consider correspond to $4$-cycles in the underlying complete bipartite graph, such that the weight of the matching can be improved by swapping edges along the cycle. The argument that shows the existence of such a cycle makes use of the special structure of the constraints in this family of instances.

To illustrate the approach, suppose that there exist two items $i_1 < i_2$ that have the same property $\ell$, and for some ranking positions $j_1 < j_2$ we have

$$x_{i_1 j_2} > 0 \quad \text{and} \quad x_{i_2 j_1} > 0. \tag{6}$$

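Following the strategy outlined above, consider $x' = x + \varepsilon y$ with $y \in \mathbb{R}^{m \times n}$ to be zero everywhere except $y_{i_1 j_1} = y_{i_2 j_2} = 1$ and $y_{i_1 j_2} = y_{i_2 j_1} = -1$. We would like to prove that the weight of $x'$ is larger than the weight of $x$ and that $x'$ is feasible for some (possibly small) $\varepsilon > 0$. The reason why we gain by moving in the direction of $y$ follows from property (1). Feasibility in turn follows because $y$ is orthogonal to every constraint defining the feasible region. Indeed, the only constraints involving items $i_1, i_2$ are those corresponding to the property $\ell$. Further, every such constraint is of the form $\langle 1_R, x \rangle \leq u_{k\ell}$, where $1_R$ is the indicator vector of a rectangle $R = P_\ell \times [k]$. (Footnote 2: By $\langle \cdot, \cdot \rangle$ we denote the inner product between two matrices, i.e., if $x, y \in \mathbb{R}^{m \times n}$ then $\langle x, y \rangle := \sum_{i,j} x_{ij} y_{ij}$.) Such a rectangle contains either all non-zero entries of $y$, two non-zero entries (with opposite signs), or none. In any of these cases, $\langle 1_R, y \rangle = 0$.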
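The gain from such a swap can be written out explicitly. In the short derivation below, the name $y$ for the direction, the step size $\varepsilon > 0$, and the ordering $i_1 < i_2$, $j_1 < j_2$ are assumptions made here for illustration: $y$ has entries $+1$ at $(i_1, j_1)$ and $(i_2, j_2)$, entries $-1$ at $(i_1, j_2)$ and $(i_2, j_1)$, and is zero elsewhere.

$$\sum_{i,j} w_{ij} (x + \varepsilon y)_{ij} - \sum_{i,j} w_{ij} x_{ij} = \varepsilon \left( w_{i_1 j_1} + w_{i_2 j_2} - w_{i_1 j_2} - w_{i_2 j_1} \right) \geq 0.$$

The inequality is exactly property (1), and it is strict under the strict form of (1). Every row sum and column sum of $y$ is zero, so the matching constraints are preserved as long as $\varepsilon \leq \min(x_{i_1 j_2}, x_{i_2 j_1})$ and no entry exceeds $1$.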

Using reasoning as above, one can show that no configuration of the form (6) can appear in any optimal solution for items $i_1, i_2$ that share a property $\ell$. This implies that the support of every optimal solution has a certain structure when restricted to items that have any given property $\ell$; this structure allows us to find an improvement direction in case the solution is not integral. To prove integrality we show that for every fractional solution $x$ there exists a fractional entry $x_{ij}$ that can be slightly increased without violating the fairness constraints. Moreover since the $i$-th row and the $j$-th column must contain at least one more fractional entry each (since the row- and column-sums are $1$), we can construct (as above) a direction $y$, along which the weight can be increased. The choice of the corresponding entries that should be altered requires some care, as otherwise we might end up violating fairness constraints.

The second result in Theorem 1.2 is an algorithm for solving the constrained ranking maximization problem for $\Delta = 1$ in running time that is optimal in the input size. We show that a natural greedy algorithm can be used. More precisely, one iteratively fills in ranking positions by always selecting the highest value item that is still available and does not lead to a constraint violation. An inductive argument that relies on property (1) and the assumption $\Delta = 1$ gives the correctness of such a procedure. Proofs of both parts of Theorem 1.2 appear in Sections 3.1 and 3.2.
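The greedy procedure for disjoint properties can be sketched as below. This is a straightforward illustration and not the paper's optimized implementation: it runs in time proportional to $nm$ rather than in the optimal running time, and the function name, the tie-breaking rule (prefer the lower item index), and the assumption that the bounds `u[k][prop]` are non-decreasing in `k` are choices made here.

```python
def greedy_disjoint_ranking(w, prop_of, u, n):
    """Greedy ranking when every item has exactly one property.

    w[i][j]:    value of item i at position j (0-indexed).
    prop_of[i]: the single property of item i.
    u[k]:       dict mapping each property to its bound for the top k + 1
                positions; assumed non-decreasing in k.
    Returns the ranking as a list of items, or None if the greedy gets stuck.
    """
    used = set()
    counts = {prop: 0 for prop in u[0]}
    ranking = []
    for k in range(n):
        # Items that are still available and keep their property within bounds.
        candidates = [
            i for i in range(len(w))
            if i not in used and counts[prop_of[i]] + 1 <= u[k][prop_of[i]]
        ]
        if not candidates:
            return None
        # Highest value at this position; ties go to the lower item index.
        best = max(candidates, key=lambda i: (w[i][k], -i))
        used.add(best)
        counts[prop_of[best]] += 1
        ranking.append(best)
    return ranking
```

On a small instance with two groups and a bound of one item from the first group in the top two positions, the greedy places the best item first, skips the second-best item of the same group at position two, and picks it up again once the bound allows.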

#### Overview of the proof of Theorem 1.3.

Let be arbitrary. It is relevant to note that when the constraints are restricted to a single position in the ranking, the problem becomes a variant of the weighted -hypergraph -matching problem. In this problem, one is given a -hypergraph (i.e, is a collection of subsets of , each of which has cardinality at most ), hyperedge weights and a vector of bounds . The goal is to find a set of hyperedges of maximum total weight, such that every vertex belongs to at most hyperedges in .

Our problem can be seen as a sequence of nested instances of $\Delta$-hypergraph $b$-matching: The properties $[p]$ are the set of vertices of a hypergraph and each item $i$ introduces a hyperedge $T_i$. Then, the constrained ranking maximization problem can be reformulated as follows: Find a sequence of hyperedges $T_{i_1}, \ldots, T_{i_n}$ (with distinct $i_1, \ldots, i_n$) such that

$$\left| \{ j \in [k] : \ell \in T_{i_j} \} \right| \leq u_{k\ell}.$$

The objective is to maximize $\sum_{j=1}^{n} w_{i_j j}$. In other words one is required to solve an incremental hypergraph matching problem: for every $k \in [n]$, add exactly one hyperedge to the solution so that the degree constraints are satisfied.

There are polynomial time -approximation algorithms known for the -hypergraph matching problem [KY09], hence a tempting approach could be to solve each instance of hypergraph -matching separately and then try merge them into one incremental solution. However, it is not clear how to combine two or more solutions when inconsistencies arise between them. Furthermore, such a merging procedure would likely incur a loss in the quality of the solution, hence after merges one would recover a bound incurring loss in the value.

Below we explain our approach on how to get around these obstacles and obtain an algorithm whose approximation ratio is independent of $n$. The most important part of our algorithm is a greedy procedure that finds a large weight solution to a slightly relaxed problem in which not all positions in the ranking have to be occupied. It processes pairs $(i, j)$ in non-increasing order of weights $w_{ij}$ and puts item $i$ in position $j$ whenever this does not lead to constraint violation.
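The first phase of this approach, the greedy over item-position pairs, can be sketched as follows. The function name and data layout are assumptions made for illustration, the constraint check is deliberately naive, and the later step that fills any remaining empty positions is not shown.

```python
def greedy_partial_ranking(w, types, u, n):
    """Greedy for the relaxed problem where positions may stay empty.

    w[i][j]:  non-negative value of item i at position j (0-indexed).
    types[i]: set of properties of item i (sets may overlap).
    u[k]:     dict mapping each property to its bound for the top k + 1 positions.
    Returns a dict mapping each filled position to its item.
    """
    m = len(w)
    # Pairs in non-increasing order of weight; ties keep (i, j) order.
    pairs = sorted(
        ((w[i][j], i, j) for i in range(m) for j in range(n)),
        key=lambda t: -t[0],
    )
    placed = {}        # position -> item
    used_items = set()

    def violates(i, j):
        # Placing item i at position j affects every prefix of length > j.
        for prop in types[i]:
            for k in range(j, n):
                count = sum(
                    1 for pos, it in placed.items()
                    if pos <= k and prop in types[it]
                )
                if count + 1 > u[k][prop]:
                    return True
        return False

    for _, i, j in pairs:
        if i in used_items or j in placed:
            continue  # matching constraints: one item per position and vice versa
        if violates(i, j):
            continue
        placed[j] = i
        used_items.add(i)
    return placed
```

Because a pair is only accepted when every affected prefix constraint still holds, the partial ranking returned is feasible for the relaxed problem by construction.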

To analyze the approximation guarantee of this algorithm let us first inspect the combinatorial structure of the feasible set. In total there are fairness constraints in the problem and additionally “matching” constraints, saying that no “column” or “row” can have more than a single one in the solution matrix . However, after relaxing the problem to the one where not all ranking positions have to be filled, one can observe that the feasible set is just an intersection of matroids on the common ground set . Indeed, two of them correspond to the matching constraints, and are partition matroids. The remaining matroids correspond to properties: for every property there is a chain of subsets of such that is the set of independent sets in this (laminar) matroid. In the work [Jen76] it is shown that the greedy algorithm run on an intersection of matroids yields -approximation, hence -approximation of our algorithm follows.

To obtain a better – -approximation bound, a more careful analysis is required. The proof is based on the fact that, roughly, if a new element is added to a feasible solution , then at most elements need to be removed from to make it again feasible. Thus adding greedily one element can cost us absence of other elements of weight at most the one we have added. This idea can be formalized and used to prove the -approximation of the greedy algorithm; see Section 4. This is akin to the framework of -extendible systems by [Mes06] in which this greedy procedure can be alternatively analyzed. Finally, we observe that since the problem solved was a relaxation of the original ranking maximization problem, the approximation ratio we obtain with respect to the original problem is still .

It remains to complete the ranking by filling in any gaps that may have been left by the above procedure. This can be achieved in a greedy manner that only increases the value of the solution, and violates the constraints by at most a multiplicative factor of $2$. A detailed proof of the theorem appears in Section 4.

#### Overview of the proof of Theorem 1.4.

Our hardness results are based on a general observation that one can encode various types of packing constraints using instances of the constrained ranking maximization and feasibility problem. The first result (Theorem 5.1) – -hardness of the feasibility problem (for ) is established by a reduction from the hypergraph matching problem. Given an instance of the hypergraph matching problem one can think of its hyperedges as items and its vertices as properties. Degree constraints on vertices can then be encoded by upper bound constraints on the number of items that have a certain property in the ranking. The inapproximability result (Theorem 5.2) is also established by a reduction from the hypergraph matching problem, however in this case one needs to be more careful as the reduction is required to output instances that are feasible.

Our next hardness result (Theorem 5.3) illustrates that the difficulty of the constrained ranking optimization problem could be entirely due to the upper bound numbers s. In particular, even when the part of the input corresponding to which item has which property is fixed, and only depends on (and, hence, can be pre-processed as in [FJ12]), the problem remains hard. This is proven via a reduction from the independent set problem. The properties consists of all pairs of items for . Given any graph on vertices, we can set up a constrained ranking problem whose solutions are independent sets in of a certain size. Since every edge is a property, we can set a constraint that allows at most one item (vertex) from this property (edge) in the ranking.

Finally, Theorem 5.4 states that it is not only hard to decide feasibility but even to find a solution that does not violate any constraint by more than a constant multiplicative factor. The obstacle in proving such a hardness result is that, typically, even if a given instance is infeasible, it is easy to find a solution that violates many constraints by a small amount. To overcome this problem we employ an inapproximability result for the maximum independent set problem by [Has96] and an idea by [CK05]. Our reduction (roughly) puts a constraint on every clique in the input graph , so that at most one vertex (item) is picked from it. Then a solution that does not violate any constraint by a multiplicative factor more than corresponds to a set of vertices such that the induced subgraph has no -clique. Such a property allows us to prove (using elementary bounds on Ramsey numbers) that has a large independent set. Hence, given an algorithm that is able to find a feasible ranking with no more than a -factor violation of the constraints, we can approximate the maximum size of an independent set in a graph up to a factor of roughly ; which is hard by [Has96].

### 1.5 Organization of the rest of the paper.

The proof of Theorem 1.1 on the exact (but potentially slow) algorithm for general $\Delta$ is presented in Section 2. Section 3 contains the proof of Theorem 1.2 which shows that there exists an integral solution and gives an exact algorithm for $\Delta = 1$; it also provides a simple and fast greedy algorithm for the ranking maximization problem for $\Delta = 1$. Section 4 contains the proof of Theorem 1.3, which gives our approximation result for general $\Delta$. Our hardness results are presented in Section 5. Finally in Appendix A we give a brief overview of some common ranking metrics and explain how they can be captured by values $w_{ij}$ that satisfy (1).

## 2 Dynamic Programming-based Exact Algorithm

Recall that for an instance of constrained ranking maximization, every item has a type assigned to it, which is the set of all properties item has. In this section, we present an exact dynamic programming algorithm for solving the constrained ranking maximization problem which is efficient when the number of distinct types in the instance is small. We start by providing a geometric viewpoint of the problem, which (arguably) makes it easier to visualize and provides us with convenient notation under which the dynamic programming algorithm is simpler to state and understand.

### 2.1 Geometric interpretation of fairness constraints

Recall that in an instance of constrained ranking maximization we are given items, ranking positions and properties, together with fairness constraints on them. Let be the vector indicating which sets item belongs to (we call this the type of ).

Note that every ranking can be described either by a binary matrix such that if and only if item is ranked at position , or alternatively by a one-to-one function such that is the item ranked at position , for every . Using the latter convention we can encode the fairness condition as

$$\forall\, k\in[n] \qquad \sum_{j=1}^{k} t_{\pi(j)} \le u_k,$$

where is the vector of upper bounds for fairness constraints at position . In other words, a ranking is feasible if and only if the th partial sum of all vectors of items at top- positions belongs to the rectangle , for every .
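The prefix-sum condition above can be checked directly. The following is a minimal sketch, not the authors' code: the function name `is_feasible`, the 0-based indexing, and the small instance are illustrative assumptions.

```python
from typing import Sequence


def is_feasible(ranking: Sequence[int],
                types: Sequence[Sequence[int]],
                upper: Sequence[Sequence[int]]) -> bool:
    """Return True iff every prefix sum of type vectors stays inside its box.

    ranking[k] is the item placed at position k (0-based),
    types[i] is the 0/1 type vector of item i,
    upper[k] is the vector of upper bounds for the top-(k+1) positions.
    """
    p = len(upper[0]) if upper else 0
    prefix = [0] * p                      # running sum of type vectors
    for k, item in enumerate(ranking):
        for l in range(p):
            prefix[l] += types[item][l]
        if any(prefix[l] > upper[k][l] for l in range(p)):
            return False                  # this partial sum left the rectangle
    return True


# Hypothetical instance: items 0, 1 have property 0; items 2, 3 have property 1.
types = [[1, 0], [1, 0], [0, 1], [0, 1]]
upper = [[1, 1], [1, 2], [2, 2]]
print(is_feasible([0, 2, 1], types, upper))
print(is_feasible([0, 1, 2], types, upper))
```

In this instance the second ranking fails because its top-2 prefix contains two items with property 0 while the bound is 1.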

### 2.2 The dynamic programming algorithm

###### Theorem 2.1

There is an algorithm that solves the constrained ranking maximization problem when the objective function satisfies property (1) in time (where is the number of different types of items).

• Proof:   It is convenient to assume that the matrix satisfies a strict variant of property (1) in which all the inequalities are strict. The general case follows by an analogous argument. For an item , recall that is the vector indicating which sets item belongs to. Further, let be the set of all realized types. Denote the elements of by . For every define and let .

For every we denote by the list of items in in increasing order. Note that if in an optimal solution to the ranking maximization problem, exactly items come from , then these items are exactly and they appear in increasing order in the solution. This follows from property (1) of as follows: Suppose that an item is placed at position and an item is placed at position , with and . Swapping these two items in the ranking does not affect feasibility of the solution and the difference in value is

$$w_{i_1 j_1} + w_{i_2 j_2} - w_{i_1 j_2} - w_{i_2 j_1}.$$

This is positive due to the (strict) property (1). Hence the swap can only increase the weight of the solution. A similar reasoning shows that it is beneficial to swap a ranked item with an unranked item , whenever .
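The exchange argument can be illustrated numerically. The sketch below assumes product-form weights (a decreasing quality score times a decreasing position discount), which is one simple family satisfying the strict version of property (1); the specific numbers and the helper name `swap_gain` are hypothetical.

```python
def swap_gain(w, i1, i2, j1, j2):
    """Change in value when item i1 moves up to position j1 and i2 moves down to j2."""
    return w[i1][j1] + w[i2][j2] - w[i1][j2] - w[i2][j1]


# Assumed illustrative weights: w[i][j] = quality[i] * discount[j].
quality = [5.0, 3.0, 2.0]      # strictly decreasing in the item index
discount = [1.0, 0.5, 0.25]    # strictly decreasing in the position
w = [[a * b for b in discount] for a in quality]

# Evaluate the gain for every pair of items i1 < i2 and positions j1 < j2.
gains = [swap_gain(w, i1, i2, j1, j2)
         for i1 in range(3) for i2 in range(i1 + 1, 3)
         for j1 in range(3) for j2 in range(j1 + 1, 3)]
print(all(g > 0 for g in gains))
```

For product-form weights the gain factors as the quality difference times the discount difference, which is why every entry is positive here.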

One of the consequences of the above observations is that we can assume that for all and hence . This is because we can keep at most best items from every set and discard the remaining ones as they will not be part of any optimal solution. Such discarding can be done in time roughly if an instance with is given.

The above allows us to reduce the number of candidate rankings which one has to check to roughly . However, this number is still prohibitively large. As in many scenarios of interest, we construct a dynamic programming algorithm with far fewer states.

Now, consider the following sub-problem: For any tuple with let be the largest weight of a feasible ranking with top- positions occupied, such that exactly items are picked from for every . Let us now describe an algorithm for computing . First, initialize all entries to and set . Next, consider all valid tuples in order of increasing values of , i.e., . Suppose that we would like to compute . First one must check whether the fairness constraint at position is satisfied; for this we calculate

$$v = \sum_{\ell=1}^{q} s_\ell v_\ell.$$

Note that the th coordinate of represents the number of items having property . Hence, a necessary condition for the tuple to represent a feasible ranking is that . If that is not satisfied we just set . Otherwise, consider all possibilities for the type of item that is placed at the last position: . Suppose it is of type (i.e., it belongs to ). Then we have

$$D[s_1, s_2, \ldots, s_q] = D[s_1, \ldots, s_{\ell-1}, s_\ell - 1, s_{\ell+1}, \ldots, s_q] + w_{ik}$$

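where $i$ is the $s_\ell$-th item of that type in increasing order and $k = s_1 + \cdots + s_q$ is the position being filled. Hence, in order to compute we simply iterate over all possible types and find the maximum value we can get from the above. Correctness follows from the fact that the th item of type in every optimal ranking is always .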

The total number of sub-problems is at most . Hence, the above algorithm can be implemented in time , where time is required to read the input and construct the list of elements of every given type. The second term appears because there are subproblems, each such sub-problem considers cases, and every case has a feasibility check that takes time.
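The recurrence can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the authors' code: it processes states layer by layer in a forward ("push") fashion rather than the "pull" form written above, returns only the optimal value, and assumes items are indexed so that property (1) holds. The function name and the example instance are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def constrained_ranking_dp(w: Sequence[Sequence[float]],
                           types: Sequence[Sequence[int]],
                           upper: Sequence[Sequence[int]]) -> Optional[float]:
    """Optimal value of a feasible ranking, or None if no feasible ranking exists."""
    n = len(upper)
    groups: Dict[Tuple[int, ...], List[int]] = {}
    for i, t in enumerate(types):              # items are visited in increasing order
        groups.setdefault(tuple(t), []).append(i)
    type_vecs = list(groups)                   # the distinct realized types
    lists = [groups[t] for t in type_vecs]     # ordered item list for each type
    q = len(type_vecs)
    p = len(type_vecs[0]) if q else 0

    layer: Dict[Tuple[int, ...], float] = {(0,) * q: 0.0}
    for k in range(1, n + 1):                  # fill position k
        nxt: Dict[Tuple[int, ...], float] = {}
        for state, value in layer.items():
            for l in range(q):                 # type of the item placed at position k
                if state[l] == len(lists[l]):
                    continue                   # no items of this type left
                new = state[:l] + (state[l] + 1,) + state[l + 1:]
                # Feasibility check: property counts for the top-k prefix.
                counts = [sum(new[a] * type_vecs[a][c] for a in range(q))
                          for c in range(p)]
                if any(counts[c] > upper[k - 1][c] for c in range(p)):
                    continue
                item = lists[l][state[l]]      # next unused item of this type
                cand = value + w[item][k - 1]
                if cand > nxt.get(new, float("-inf")):
                    nxt[new] = cand
        layer = nxt
    return max(layer.values()) if layer else None


# Hypothetical instance with product-form weights (an assumption).
quality, discount = [4.0, 3.0, 2.0, 1.0], [1.0, 0.5, 0.25]
w = [[a * b for b in discount] for a in quality]
types = [(1, 0), (1, 0), (0, 1), (0, 1)]
upper = [[1, 1], [1, 2], [2, 2]]
print(constrained_ranking_dp(w, types, upper))
```

In this instance the constraint on the top-2 positions forces an item of the second type into position 2, so the best unconstrained ranking is not feasible.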

## 3 Algorithms for Δ=1

### 3.1 Integrality of LP solutions

###### Theorem 3.1

Consider the linear programming relaxation (4) for the constrained ranking maximization problem when the properties are pairwise disjoint (i.e., ). If has property (1), then there exists an optimal integral solution to (4).

• Proof:   Without loss of generality, we can assume that via a simple extension of the problem as follows: Extend the matrix to a square matrix by setting

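$$\tilde{w}_{ij} = \begin{cases} w_{ij} & \text{for } i\in[m] \text{ and } j\in[n], \\ 0 & \text{for } i\in[m] \text{ and } j\in\{n+1,\ldots,m\}. \end{cases}$$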

Further, for every and we set ; i.e., no constraint is imposed on these positions. Note that still satisfies property (1). Moreover, every solution to the original problem (4) can be extended to a solution while preserving the weight; i.e., (where denotes the inner product between two matrices, i.e., ). Similarly, every solution to the extended problem, when restricted to first columns, yields a solution to the original problem with the same weight. Thus, it suffices to prove that the extended problem has an optimal integral solution, and for the remainder of this section we assume that .
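The padding step can be sketched as follows. The bound used for the added positions is an assumption of this sketch: it uses the number of items, which can never bind. The function name `extend_to_square` and the sample data are hypothetical, and the sketch assumes there are at least as many items as positions and at least one position.

```python
from typing import List, Tuple


def extend_to_square(w: List[List[float]],
                     upper: List[List[int]]) -> Tuple[List[List[float]], List[List[int]]]:
    """Pad the weight matrix with zero columns and add non-binding bounds."""
    m, n = len(w), len(upper)
    p = len(upper[0])                 # number of properties
    # The new positions n+1..m carry zero weight for every item.
    w_ext = [list(row) + [0.0] * (m - n) for row in w]
    # Assumption: a bound of m on the new positions imposes no constraint,
    # since at most m items exist in total.
    upper_ext = [list(u) for u in upper] + [[m] * p for _ in range(m - n)]
    return w_ext, upper_ext


w = [[4.0, 2.0], [3.0, 1.5], [2.0, 1.0]]
upper = [[1], [1]]
w_ext, upper_ext = extend_to_square(w, upper)
print(w_ext)
print(upper_ext)
```

Restricting any solution of the padded instance to its first columns leaves the objective unchanged, because the added columns contribute zero weight.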

For simplicity, assume that the matrix satisfies the strict variant of property (1); i.e., when the inequalities in (1) are strict. This can be achieved by a small perturbation of the weights without changing the optimal ranking.

Our proof consists of two phases. In the first phase, we show that every optimal solution satisfies a certain property on its support. In the second phase we show that no optimal solution that has this property can have fractional entries. Let us state the property of a feasible solution that we would like to establish:

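$$\forall \ell\in[p]\ \ \forall i_1,i_2\in P_\ell\ \ \forall j_1,j_2\in[m] \quad (i_1<i_2 \,\wedge\, j_1<j_2)\ \Rightarrow\ x_{i_1 j_2}\cdot x_{i_2 j_1}=0. \tag{7}$$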

In other words, whenever we have two items that have the same property , if is before (i.e., is better than ) then for any position such that is above , then and cannot both be positive. We show that if does not satisfy condition (7) then it is not optimal.

To this end, take a fractional solution for which there is some and for which the condition does not hold. Now, consider a solution of the form where for some and denotes the matrix with a single non-zero entry at of value 1. Since and , we can find some such that . Furthermore, we claim that such a solution is still feasible for (4). Indeed, for every item we have we can conclude that

$$\sum_{j=1}^{m} x'_{ij} = \sum_{j=1}^{m} x_{ij} \le 1.$$

Similarly, for every rank position , we have . Hence, .

It remains to show that satisfies all of the fairness constraints. Note that it is enough to consider fairness constraints coming from the property , as , for any , and variables do not appear in other constraints. Every such constraint is of the form where is the indicator vector (matrix) of the rectangle (i.e., submatrix) . Since for every such rectangle, we have

$$\langle 1_{R_k}, x'\rangle = \langle 1_{R_k}, x+\varepsilon y\rangle = \langle 1_{R_k}, x\rangle \le u_{k\ell}.$$

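Therefore is feasible for (4). Furthermore, because of the (strict) property (1), we have:

$$\langle w, x'\rangle = \langle w, x\rangle + \varepsilon\,\langle w, y\rangle > \langle w, x\rangle.$$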

As this is a feasible solution with a strictly better objective value, we conclude that was not optimal. Hence, every optimal solution necessarily satisfies (7).

Suppose now, for the sake of contradiction, that satisfies (7) and is not integral. Consider a fractional entry of with as small as possible, and (in case of a tie) as small as possible. Suppose that the item belongs to for some . Note that there exists an entry with such that . This is due to the fact that

$$\sum_{j=1}^{m} x_{i_0 j} = 1 \qquad\text{and}\qquad \sum_{j=1}^{j_0} x_{i_0 j} = x_{i_0 j_0} < 1.$$

Fix the smallest possible with this property. Because of the constraint , there exists at least one more fractional entry in the th column, let us call it . It follows that . Note also that , as if then condition (7) would be violated.

Let us again consider a new candidate solution using the indices defined above:

$$x' := x + \varepsilon\left(e_{(i_1,j_1)} + e_{(i_2,j_2)} - e_{(i_1,j_2)} - e_{(i_2,j_1)}\right).$$

We show that is feasible for some , which then contradicts the fact that is optimal because of the strict version of property (1). To do this, it suffices to ensure that does not violate any fairness constraints imposed by the property . Note that for the constraints

$$\sum_{i\in P_\ell}\sum_{j=1}^{k} x'_{ij} = \sum_{i\in P_\ell}\sum_{j=1}^{k} x_{ij} \le u_{k\ell}$$

remain satisfied. Hence, it only remains to show that no constraint is tight at for .

Observe that because of our choice of , all entries in the rectangle are integral. Furthermore, in the rectangle , the only non-zero entry is due to the fact that and condition (7) is satisfied. Now, because but for as above , the constraint cannot be tight. Thus, is feasible for some and hence is not an optimal solution to (4). Hence, no optimal solution has fractional entries.

In contrast to the above theorem, some vertices of the feasible region might be non-integral.

###### Fact 3.2

There exists an instance of the ranking maximization problem for , such that the feasible region of (4) has fractional vertices.

• Proof:  Let and suppose there is only one property and the constraints are and for . In other words, we only constrain the ranking to have at most one element of property in the top-2 entries.

Consider the following point.

$$x = \begin{pmatrix} \tfrac12 & 0 & 0 & \tfrac12 \\ 0 & \tfrac12 & \tfrac12 & 0 \\ \tfrac12 & 0 & \tfrac12 & 0 \\ 0 & \tfrac12 & 0 & \tfrac12 \end{pmatrix}$$

Clearly is feasible. Observe that the support of has elements and there are exactly that many linearly independent tight constraints at that point. Indeed the doubly-stochastic constraints give us constraints out of which are linearly independent, and the remaining one is

$$x_{1,1} + x_{1,2} + x_{2,1} + x_{2,2} = 1.$$

Therefore is a (non-integral) vertex of the feasible region of (4).
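The counting argument in this proof can be checked mechanically. The sketch below restricts the tight constraints to the support of the point and computes their rank with exact rational arithmetic. It assumes, as the tight constraint displayed above suggests, that the property consists of the first two items; the helper `rank` is an illustrative Gaussian elimination and not part of the paper.

```python
from fractions import Fraction

half = Fraction(1, 2)
x = [[half, 0, 0, half],
     [0, half, half, 0],
     [half, 0, half, 0],
     [0, half, 0, half]]
m = 4
# Positions of the non-zero entries; the remaining variables are pinned to 0.
support = [(i, j) for i in range(m) for j in range(m) if x[i][j] != 0]


def rank(rows):
    """Rank of a matrix over the rationals via Gaussian elimination."""
    rows = [[Fraction(v) for v in r] for r in rows]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((k for k in range(r, len(rows)) if rows[k][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for k in range(r + 1, len(rows)):      # eliminate entries below the pivot
            f = rows[k][c] / rows[r][c]
            rows[k] = [a - f * b for a, b in zip(rows[k], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


# Row-sum and column-sum constraints, written over the support variables only.
row_cons = [[1 if i == a else 0 for (i, j) in support] for a in range(m)]
col_cons = [[1 if j == b else 0 for (i, j) in support] for b in range(m)]
# Fairness constraint on the top-left 2x2 block (assumed property = first two items).
fair_con = [[1 if i < 2 and j < 2 else 0 for (i, j) in support]]

print(len(support))
print(rank(row_cons + col_cons))
print(rank(row_cons + col_cons + fair_con))
```

When the last rank equals the size of the support, the tight constraints determine the point uniquely, which is the vertex condition used in the proof.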

### 3.2 Fast greedy algorithm

Due to the special structure for , we are able to find a fast simple algorithm for the ranking maximization problem in this case.

###### Theorem 3.3

There exists an algorithm which, given an instance of the constrained ranking maximization problem with and objective function that satisfies property (1), outputs an optimal ranking in time.

• Proof:   For simplicity, assume that satisfies the strict variant of property (1) (with strict inequalities in the definition). This can be assumed without loss of generality by slightly perturbing . Consider the following greedy algorithm that iteratively constructs a ranking (i.e., is the item ranked at position , for all ). (This alternate notation makes the exposition in this section cleaner; see also the notation and problem formulation in Section 2.1.)

• For to

• Let be the smallest index of an item which was not yet picked and can be added at position without violating any constraint. If there is no such , output INFEASIBLE.

• Set .

• Output .

It is clear that if the above algorithm outputs a ranking then is feasible. Assume now that it indeed outputs a ranking. We will show that it is optimal.

Take any optimal ranking . Let be any property (for ) and let be the list of items in in increasing order. We claim that if ranks exactly items from then these have to be , in that order. For this, note that when swapping two elements, say , at positions in the ranking (with say ) the change in weight is equal to

$$w_{i_1 j_1} + w_{i_2 j_2} - w_{i_1 j_2} - w_{i_2 j_1} > 0$$

because of the (strict) property (1). Hence it is always beneficial to rank the items in in increasing order. Furthermore, it can be argued using monotonicity that it is always optimal to select the items with smallest indices for the ranking.

One of the consequences of the above observations is that we can assume that for all and hence . This is because we can keep at most best items with property , and discard the remaining ones as they will not be part of any optimal solution. Such discarding can be done in time if an instance with is given.

Further, the above observation allows us to now prove optimality of the greedy strategy. Take the largest number such that and agree on , i.e., for . If then there is nothing to prove. Let us then assume that and . There are two cases: either is ranked in or it is not.

In the first case, let be the position in such that , clearly . Let be a ranking identical to but with positions and swapped. We claim that is still feasible and has larger weight than . The claim about weights follows easily from the strict property (1). Let us now reason about feasibility of . Let be the only property of (i.e. ) and let be the total number of elements of property in top- positions of (or equivalently of ). Note that by doing the swap we could have only violated some constraint corresponding to . Since we know that (and similarly ). Further, because of our previous observation, no item is ranked at position for in . For this reason, the fairness constraints corresponding to at are satisfied for (there are elements of property in top- items in ). The second case is very similar. One can reason that if is not included in the ranking then by changing its th position to we obtain a ranking which is still feasible but has larger value.

Hence, if , this contradicts the optimality of . Thus, and is the optimal ranking. By the same argument, one can show that if the instance is feasible, the greedy algorithm will never fail to output a solution (i.e., report infeasibility).

Let us now discuss briefly the running time of such a greedy algorithm. For every property we can maintain an ordered list of elements of which were not yet picked for the solution and a count of items of property which are already part of the solution. Then for every ranking position we just need to look at the first element of every list for and one of them will be “the best feasible item”. Having the counters we can check feasibility in time and we can also update our lists and counters in per rank position. For this reason, every rank position is handled in time. Note also that at the beginning all the lists can be constructed in total time, since we can go over the items in the reverse order and place every item at the beginning of a suitable list in time. Hence, the total running time is .
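The bookkeeping described above (one ordered list and one counter per property) can be sketched as follows. This is an illustrative sketch under stated assumptions: every item has exactly one property, given by a hypothetical array `prop`; items are indexed from best to worst; and the lists are built by appending in forward order to a deque, which yields the same lists as the reverse-order construction mentioned in the text.

```python
from collections import deque
from typing import List, Optional, Sequence


def greedy_ranking(prop: Sequence[int],
                   upper: Sequence[Sequence[int]]) -> Optional[List[int]]:
    """Greedy ranking for pairwise disjoint properties; None means INFEASIBLE."""
    n = len(upper)
    p = len(upper[0]) if upper else 0
    queues = [deque() for _ in range(p)]
    for i, l in enumerate(prop):
        queues[l].append(i)              # each list is in increasing item order
    counts = [0] * p                     # items of each property ranked so far
    ranking: List[int] = []
    for k in range(n):
        # Items ranked earlier must still respect the bounds at this position.
        if any(counts[l] > upper[k][l] for l in range(p)):
            return None
        best = None
        for l in range(p):               # only the head of each list can be chosen
            if queues[l] and counts[l] + 1 <= upper[k][l]:
                if best is None or queues[l][0] < queues[best][0]:
                    best = l
        if best is None:
            return None                  # no item can be added at this position
        ranking.append(queues[best].popleft())
        counts[best] += 1
    return ranking


# Hypothetical instance: items 0, 1 have property 0; items 2, 3 have property 1.
prop = [0, 0, 1, 1]
upper = [[1, 1], [1, 2], [2, 2]]
print(greedy_ranking(prop, upper))
```

Note that the sketch never reads the weights: under property (1) the item order alone determines the greedy choice, and each position costs work proportional to the number of properties.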

## 4 A (Δ+2)-Approximation Algorithm

###### Theorem 4.1

There exists a polynomial time algorithm which given an instance of the constrained ranking maximization problem satisfying condition (5), outputs a ranking whose weight is at least times the optimal one and satisfies all fairness constraints up to a factor of , i.e.,

$$\sum_{i\in P_\ell}\sum_{j=1}^{k} x_{i,j} \le 2u_{k\ell}, \qquad \text{for all } \ell\in[p] \text{ and } k\in[n].$$
• Proof:   The algorithm can be divided into two phases: First we construct a partial ranking that may leave some positions empty, and then we refine it to yield a complete ranking.

The first phase is finding a (close to optimal) solution to the following integer program:

 (8) maxx∈˜Ωm,