# A Stable Marriage Requires Communication

## Abstract

The Gale-Shapley algorithm for the Stable Marriage Problem is known to take $\Theta(n^2)$ steps to find a stable marriage in the worst case, but only $\Theta(n \log n)$ steps in the average case (with $n$ women and $n$ men). In 1976, Knuth asked whether the worst-case running time can be improved in a model of computation that does not require sequential access to the whole input. A partial negative answer was given by Ng and Hirschberg, who showed that $\Theta(n^2)$ queries are required in a model that allows certain natural random-access queries to the participants’ preferences. A significantly more general (albeit slightly weaker) lower bound follows from Segal’s general analysis of communication complexity, namely that $\Omega(n^2)$ Boolean queries are required in order to find a stable marriage, regardless of the set of allowed Boolean queries.

Using a reduction to the communication complexity of the disjointness problem, we give a far simpler, yet significantly more powerful argument showing that $\Omega(n^2)$ Boolean queries of any type are indeed required for finding a stable (or even an approximately stable) marriage. Notably, unlike Segal’s lower bound, our lower bound generalizes also to (A) randomized algorithms, (B) allowing arbitrary separate preprocessing of the women’s preferences profile and of the men’s preferences profile, (C) several variants of the basic problem, such as whether a given pair is married in every/some stable marriage, and (D) determining whether a proposed marriage is stable or far from stable. In order to analyze “approximately stable” marriages, we introduce the notion of “distance to stability” and provide an efficient algorithm for its computation.

Keywords: stable marriage; stable matching; approximately stable; communication complexity; distance to stability.

## 1 Introduction

In the classic Stable Marriage Problem [11], there are $n$ women and $n$ men; each woman has a full preference order over the men and each man has a full preference order over the women. The challenge is to find a stable marriage: a one-to-one mapping between women and men that is stable in the sense that it contains no blocking pair, that is, no woman and man who mutually prefer each other over their current spouses in the marriage. Gale and Shapley [11] proved that such a stable marriage exists by providing an algorithm for finding one. Their algorithm takes $\Theta(n^2)$ steps in the worst case [11], but only $\Theta(n \log n)$ steps in the average case, over independently and uniformly chosen preferences [35].
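As an illustration of the problem statement above, the following is a minimal Python sketch of a deferred-acceptance procedure together with a blocking-pair check. The representation (participants numbered `0..n-1`, each preference list ordered from best to worst), the function names, and the choice of women as the proposing side are assumptions made here for illustration; they are not taken from [11].

```python
from collections import deque


def gale_shapley(women_prefs, men_prefs):
    """Deferred acceptance with women proposing (an illustrative choice).

    women_prefs[w] lists men from best to worst; men_prefs[m] lists women
    from best to worst. Returns a dict mapping each woman to her husband.
    """
    n = len(women_prefs)
    # rank[m][w] = position of woman w on man m's list (lower is better).
    rank = [{w: r for r, w in enumerate(prefs)} for prefs in men_prefs]
    next_choice = [0] * n   # next position on each woman's list to try
    fiancee = [None] * n    # fiancee[m] = woman whose proposal m currently holds
    free = deque(range(n))
    while free:
        w = free.popleft()
        m = women_prefs[w][next_choice[w]]
        next_choice[w] += 1
        current = fiancee[m]
        if current is None:
            fiancee[m] = w
        elif rank[m][w] < rank[m][current]:
            fiancee[m] = w
            free.append(current)   # the displaced woman proposes again later
        else:
            free.append(w)         # rejected; w moves down her list
    return {fiancee[m]: m for m in range(n)}


def blocking_pairs(husband, women_prefs, men_prefs):
    """All pairs (w, m) who mutually prefer each other over their spouses."""
    n = len(women_prefs)
    w_rank = [{m: r for r, m in enumerate(p)} for p in women_prefs]
    m_rank = [{w: r for r, w in enumerate(p)} for p in men_prefs]
    wife = {m: w for w, m in husband.items()}
    return [(w, m) for w in range(n) for m in range(n)
            if w_rank[w][m] < w_rank[w][husband[w]]
            and m_rank[m][w] < m_rank[m][wife[m]]]
```

Each woman proposes to each man at most once, so the loop performs at most $n^2$ proposals, in line with the quadratic worst-case bound quoted above.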

In 1976, Knuth [19] asked whether this quadratic worst-case running time can be improved upon. A related question was put forward in 1987 by Gusfield [14], who asked whether even verifying the stability of a proposed marriage can be done any faster. As the input size here is quadratic in $n$, these questions only make sense in models that do not require sequentially reading the whole input, but rather provide some kind of random access to the preferences of the participants.

While Knuth’s and Gusfield’s questions arose from computational concerns, they also have tangible economic significance. In many real-world matching mechanisms, it is unreasonable to expect participants to provide their full preference list over all alternatives. This is not merely due to the effort of writing down an immense ordered list of alternatives, but also due to the sheer cognitive or physical effort of forming these preferences (for example, by conducting interviews). Formally, the process of forming or revealing one’s preferences can be modeled as “querying” the individual’s (perhaps implicitly defined) preferences. Each query consists of answering a single question about the individual’s preferences that requires only a short response. Thus, the inherent complexity of a marriage mechanism can be measured by the number of queries necessary to participate in the mechanism.

A partial answer to both Knuth’s and Gusfield’s questions was given by Ng and Hirschberg [24], who considered a model that allows two types of unit-cost queries to the preferences of the participants: “what is woman $w$’s ranking of man $m$?” (and, dually, “what is man $m$’s ranking of woman $w$?”) and “which man does woman $w$ rank at place $k$?” (and, dually, “which woman does man $m$ rank at place $k$?”). In this model, they prove a tight $\Theta(n^2)$ lower bound on the number of queries that any deterministic algorithm that solves the stable marriage problem, or even verifies whether a given marriage is stable, must make in the worst case. Chou and Lu [6] later showed that even if one is allowed to separately query each of the $\log n$ bits of the answer to queries such as “which man does woman $w$ rank at place $k$?” (and its dual query), $\Theta(n^2)$ such Boolean queries are still required in order to deterministically find a stable marriage.

These results still leave two questions open. The first is whether some more powerful model may allow for faster algorithms. While many “natural” algorithms for stable marriage do fit into these models, there may be others that do not. Indeed, there exist problems for which “computationally unnatural” operations, such as various types of hashing, arithmetic operations, or even “cognitively natural” operations such as processing through a neural network, do give algorithmic speedups. Further, it may be the case that the participants’ preferences are only defined implicitly by their actual input. For example, a participant’s type could be a point in some (possibly high dimensional) geometric space, such that they prefer to be married to partners whose types are geometrically close to their own (see, for example, [5]). In this case, a natural query may be of the form “what is your type’s th coordinate?” Thus, it is of interest to ask whether an algorithm that queries the actual (geometric) input can be significantly more efficient than one that only queries the implicitly defined preferences.

The second question concerns randomized algorithms: can they do better than deterministic ones? This question is especially fitting for the stable marriage problem as the expected running time is known to be small when the preferences are chosen uniformly at random. We give a negative answer to both hopes, as well as several other related problems, thereby showing that answering a wide variety of basic questions related to stable marriages requires a quadratic number of queries (that is, requires querying nearly the entire preference structure):

Our proof of Theorem ? comes from a reduction to the well-known lower bounds for the disjointness problem [18] in Yao’s [36] model of two-party communication complexity (see [20] for a survey). We consider a scenario in which Alice holds the preferences of the women and Bob holds the preferences of the men, and show that each of the problems from Theorem ? requires the exchange of bits of communication between Alice and Bob.

We note that Segal [33] shows by a general argument that any deterministic or nondeterministic communication protocol among all participants for finding a stable marriage requires $\Omega(n^2)$ bits of communication. Our argument for Theorem ?(a), in addition to being significantly simpler, generalizes Segal’s result to account for randomized algorithms, and even when considering only two-party communication between Alice and Bob (essentially allowing arbitrary communication within the set of women and within the set of men without cost). Furthermore, our lower bound holds even for merely determining whether a given marriage is stable or far from stable (Theorem ?(b)), as well as for the additional related problems described in Theorem ?(c,d). These results immediately imply the same lower bounds for any type of Boolean queries in the original computation model, as Boolean queries can be simulated by a communication protocol.

As indicated above, Theorem ?(a), as well as the corresponding lower bound on the two-party communication complexity, holds not only for stable marriages but also for approximately stable marriages. In the context of communication complexity, Chou and Lu [6] also study such a relaxation of the stable marriage problem in a restricted computational model in which communication is non-interactive (a sketching model). Chou and Lu show that any (deterministic, non-interactive, -party) protocol that finds a marriage where only a constant fraction of participants are involved in blocking pairs requires bits of communication. Our results are not directly comparable to these, as the two notions of approximate stability are not comparable. Furthermore, we use a significantly more general computation model (randomized, interactive, two-party), but give a slightly weaker lower bound.

Our lower bound for verification complexity (given in Theorem ?(b)) is tight. Indeed, there exists a simple deterministic algorithm for verifying the stability of a proposed marriage, which requires $O(n^2)$ queries even in the weak comparison model that allows only for queries of the form “does woman $w$ prefer man $m$ over man $m'$?” and, dually, “does man $m$ prefer woman $w$ over woman $w'$?” We do not know whether the lower bound is tight also for finding a stable marriage (Theorem ?(a)). Gale and Shapley’s algorithm uses $O(n^2)$ queries in the worst case, but $O(n^2)$ of these queries require answers of $\log n$ bits each. Thus, the algorithm requires a total of $O(n^2 \log n)$ Boolean queries, or bits of communication. We do not know whether $O(n^2)$ Boolean queries suffice for any algorithm. While the gap between Gale and Shapley’s algorithm and our lower bound is small, we believe that it is interesting, as the number of queries performed by the algorithm is exactly linear in the input encoding length. An even slightly sublinear algorithm would therefore be interesting. We indeed do not have any $o(n^2 \log n)$ algorithm, even randomized and even in the strong two-party communication model, nor do we have any improved lower bound, even for deterministic algorithms and even in the simple comparison model.
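The simple verification algorithm mentioned above can be sketched as follows, under the assumption that preferences are exposed only through comparison queries. The oracle wrapper, its query counter, and all names are hypothetical and serve only to make the query count visible.

```python
def comparison_oracle(prefs):
    """Wrap preference lists so they are accessible only via comparisons.

    Returns (prefers, asked): prefers(agent, a, b) answers "does agent
    prefer a over b?", and asked[0] counts the queries made so far.
    """
    rank = [{p: r for r, p in enumerate(lst)} for lst in prefs]
    asked = [0]

    def prefers(agent, a, b):
        asked[0] += 1
        return rank[agent][a] < rank[agent][b]

    return prefers, asked


def verify_stability(husband, woman_prefers, man_prefers):
    """Check a perfect marriage (husband[w] = spouse of woman w) for
    blocking pairs, using at most 2 * n * (n - 1) comparison queries."""
    n = len(husband)
    wife = [0] * n
    for w, m in enumerate(husband):
        wife[m] = w
    for w in range(n):
        for m in range(n):
            if m == husband[w]:
                continue
            # (w, m) blocks iff both prefer each other over their spouses.
            if woman_prefers(w, m, husband[w]) and man_prefers(m, w, wife[m]):
                return False
    return True
```

Every non-married pair is examined once and costs at most two comparison queries, which gives the quadratic upper bound on verification.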

## 2 Model and Preliminaries

### 2.1 The Stable Marriage Problem

#### Full Preference Lists

For ease of presentation, we consider a simplified version of the model of Gale and Shapley [11]. Let $W$ and $M$ be disjoint finite sets, of women and men, respectively, such that $|W| = |M|$.

#### Arbitrary Preference Lists

While our main results are phrased in terms of full preference lists and perfect marriages, some additional and intermediate results in Section 4 and in the Appendix deal with an extended model, which allows for preferences to specify “blacklists” (i.e. declare some potential spouses as unacceptable) and for marriages to specify that some participants remain single. (This model is nonetheless also a simplified version of that of [11].) A (not necessarily full) preference list over is a totally-ordered subset of . We once again interpret a preference list as a ranking, from best to worst, of acceptable spouses. We interpret participants absent from a preference list as declared unacceptable, even at the cost of remaining single. Analogously, a profile of preference lists for over is a specification of a preference list over for each woman ; we denote the set of all profiles of preference lists for over by . In this extended model, a woman is said to prefer a man over a man not only when precedes on the preference list of , but also when is on the preference list of while is not. Again, if we say that weakly prefers over if either prefers over or . (We once again define preference lists and profiles of preference list for over analogously.)

A (not necessarily perfect) marriage between and is a one-to-one mapping between a subset of and a subset of . Given a marriage , we denote the set of married women (i.e. the subset of over which is defined) by ; we analogously denote the set of married men by . For a marriage to be stable (with respect to and ), we require not only that no blocking pair exist with respect to it, but also that no participant be married to someone not on the preference list of .

We note that this model of arbitrary (not necessarily full) preference lists generalizes the model of full preference lists described in Section ?. Indeed, if the preference list of each participant happens to contain all participants of the opposite gender, then the two notions of stability agree. In particular, any marriage that is stable with respect to such preference lists prescribes that no participant remains single.

#### Known Results

We now survey a few known results regarding the stable marriage problem, which we utilize throughout this paper. For the duration of this section, let be a marriage market, defined either according to the definitions of Section ? or according to those of Section ?.

Gale and Shapley [11] provide an efficient algorithm for finding the -optimal stable marriage. Their algorithm runs in steps in the worst case, performing a query of bits in each step. Hence, the Gale-Shapley algorithm queries bits in the worst case.

#### Approximately-Stable Marriages

In this section, we describe a notion of an “approximately stable marriage.” For ease of presentation, we restrict ourselves to marriage markets with full preference lists (i.e. the model described in Section ?). We define an approximately stable marriage as a perfect marriage that shares many married pairs with some (exactly) stable (perfect) marriage. Our definition is a natural generalization of that of Ünver [34] (who considers only marriage markets with unique stable marriages), but it appears to be novel in its exact formulation. Our notion of approximate stability has the theoretical advantage of being derived from a metric on the set of all perfect marriages between and .

When only a single stable marriage exists (in this case, as noted above, our definition of approximate stability coincides with that of Ünver [34]), efficiently computing the divorce distance to stability of a given perfect marriage is straightforward: first use the Gale-Shapley algorithm to compute the -optimal stable marriage (which in this case is the unique stable marriage), and then calculate the divorce distance between this (unique) stable marriage and . Unfortunately, this computation fails to generalize as brute-force computation of by iterating over all stable marriages is infeasible for general preferences, since the set of all stable marriages can be exponentially large [19]. Fortunately, by exploiting the combinatorial structure of , we show that can still be efficiently computed (albeit in slightly slower time ). We describe an algorithm to this effect in Section 6. We believe this algorithm to be of independent interest.
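To make the quantity discussed above concrete, here is a brute-force Python sketch that is usable only on tiny instances. It assumes that the divorce distance between two perfect marriages counts the married pairs of one that do not appear in the other, and that the distance to stability is the minimum of this count over all stable marriages; both the formalization and the function names are assumptions of this sketch. It enumerates all $n!$ perfect marriages, which is exactly the infeasible approach that the efficient algorithm of Section 6 avoids.

```python
from itertools import permutations


def is_stable(husband, women_prefs, men_prefs):
    """husband[w] is the spouse of woman w; all lists are full."""
    n = len(husband)
    wife = [0] * n
    for w, m in enumerate(husband):
        wife[m] = w
    for w in range(n):
        for m in range(n):
            w_prefers = women_prefs[w].index(m) < women_prefs[w].index(husband[w])
            m_prefers = men_prefs[m].index(w) < men_prefs[m].index(wife[m])
            if w_prefers and m_prefers:
                return False
    return True


def distance_to_stability(husband, women_prefs, men_prefs):
    """Minimum number of differing pairs between `husband` and any
    stable marriage, found by exhaustive search over all permutations."""
    n = len(husband)
    best = None
    for candidate in permutations(range(n)):
        if is_stable(candidate, women_prefs, men_prefs):
            d = sum(1 for w in range(n) if candidate[w] != husband[w])
            if best is None or d < best:
                best = d
    return best
```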

The concept of divorce distance is perhaps most valuable in developing our understanding of the qualitative behavior of exactly stable marriage mechanisms. Given that (as our results on exact stability show) finding an exactly stable marriage requires very high communication (or alternatively, a very large number of queries), one may consider dynamic mechanisms that refine their output over time until reaching a stable marriage. Such mechanisms would produce some initial (not necessarily stable) marriage after an initial stage of communication/queries, and then after additional communication/queries, adjust the marriage to form a stable marriage. If the social cost of each divorce due to this adjustment is high, then we would want to minimize the number of divorces in the second stage of the mechanism. That is, we would seek a marriage with small divorce distance to stability in the first stage, and would seek to replace it with the closest stable marriage in the second stage. The analysis of Section 6 implies that if such a first stage can be constructed, then a corresponding second stage can be implemented in a computationally efficient manner (using quadratically many queries, of course). Our main result regarding approximately stable marriages gives a negative answer to the question of whether such a first stage can be implemented using significantly less communication or fewer queries than finding an exactly stable marriage.

### 2.2 Communication Complexity

We work in Yao’s [36] model of two-party communication complexity (see [20] for a survey). Consider a scenario where two agents, Alice and Bob, hold values $x$ and $y$, respectively, and wish to collaborate in performing some computation that depends on both $x$ and $y$. Such a computation typically requires the exchange of some data between Alice and Bob. The communication cost of a given protocol (i.e. distributed algorithm) for such a computation is the number of bits that Alice and Bob exchange under this protocol in the worst case (i.e. for the worst $(x, y)$); the communication complexity of the computation that Alice and Bob wish to perform is the lowest communication cost of any protocol for this computation. Generalizing, we also consider randomized communication complexity, defined analogously using randomized protocols that, for every given fixed input, produce a correct output with probability at least $2/3$.

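Of particular interest to us is the disjointness function, $\mathrm{DISJ}$. Let $n \in \mathbb{N}$ and let Alice and Bob hold subsets $A, B \subseteq \{1, \ldots, n\}$, respectively. The value of the disjointness function is $1$ if $A \cap B = \emptyset$, and $0$ otherwise. We can also consider $\mathrm{DISJ}$ as a Boolean function by identifying $A$ and $B$ with their respective characteristic vectors $\bar{x}$ and $\bar{y}$, defined by $x_i = 1 \iff i \in A$ and $y_i = 1 \iff i \in B$. Thus, we can express $\mathrm{DISJ}$ using the Boolean formula $\mathrm{DISJ}(\bar{x}, \bar{y}) = \neg \bigvee_{i=1}^{n} (x_i \wedge y_i)$. All of our results heavily rely on the following result of Kalyanasundaram and Schnitger [18] (see also Razborov [28]):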

Our results regarding lower bounds on communication complexities all follow from defining suitable embeddings of into various problems regarding stable marriages, i.e. mapping and into suitable marriage markets (more specifically, mapping into and into ), such that finding a stable marriage (or solving any of the other problems from Theorem ?) reveals the value of . Some of our proofs (namely those presented in Section 5) indeed assume that the input to satisfies .
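For concreteness, the disjointness function on characteristic vectors can be sketched in a few lines of Python. The convention that the value is 1 exactly when the two sets are disjoint, and the helper that checks the restricted inputs used in Section 5 (at most one common element), are assumptions of this sketch.

```python
def disj(x, y):
    """Disjointness on 0/1 characteristic vectors: 1 iff no index i has
    x[i] = y[i] = 1, i.e. the negation of OR_i (x[i] AND y[i])."""
    if len(x) != len(y):
        raise ValueError("characteristic vectors must have equal length")
    return int(not any(xi and yi for xi, yi in zip(x, y)))


def at_most_one_intersection(x, y):
    """True iff the two sets share at most one element (the restricted
    input family assumed in some of the proofs)."""
    return sum(1 for xi, yi in zip(x, y) if xi and yi) <= 1
```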

## 3 Summary of Results

All of our results provide lower bounds for various computations regarding the stable marriage problem. The variety of these results conveys our main message: that, roughly speaking, answering practically any meaningful basic question regarding a stable marriage in a marriage market, requires a quadratic number of queries, i.e. nearly amounts to querying the entire preference structure.

For the duration of this section, let be a marriage market with full preference lists, where is held by Alice and is held by Bob.

Although Corollaries ? and ? are immediate consequences of Theorems ? and ?, respectively, we give direct proofs (of somewhat distinct flavors than those of Theorems ? and ?) of these important special cases in Section 4. We believe these proofs (and the construction that they share) to be insightful in their own right; furthermore, the proof of Corollary ? includes a novel application of the Rural Hospitals Theorem (Theorem ?), which we believe may be of independent interest.

The proofs of Theorems ? through ? are given in Section 5.2. The proofs all follow from the embedding of disjointness into a marriage market that is described in Section 5.1.

## 4 Lower Bounds for Exact Stability

In this section, we give direct proofs of Corollaries ? and ?, of a somewhat different flavor than the proofs given in Section 5. We prove these corollaries by embedding suitably large instances of into the problems of finding a stable marriage or verifying the stability of some marriage. Thus, by Theorem ? we obtain the desired lower bounds on communication complexities. We note that the construction given in this section does not assume the input to to be uniquely intersecting.

For the duration of this section, let $n \in \mathbb{N}$, and let $W = \{w_1, \ldots, w_n\}$ and $M = \{m_1, \ldots, m_n\}$ be disjoint sets such that $|W| = |M| = n$. Let $\mu$ be the perfect marriage in which $w_i$ is married to $m_i$ for every $i$. To prove Corollary ?, we embed disjointness into verification of stability.

To define , for every we define the preference list of to consist of all such that , in arbitrary order (say, sorted by ), followed by , followed by all other men in arbitrary order. Similarly, to define , for every we define the preference list of to consist of all such that , in arbitrary order (say, sorted by ), followed by , followed by all other women in arbitrary order.

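$\mu$ is unstable with respect to these preferences $\iff$ there exist $i \neq j$ such that $w_i$ prefers $m_j$ over $m_i$ and $m_j$ prefers $w_i$ over $w_j$ $\iff$ there exist $i \neq j$ such that $x_{ij} = 1$ and $y_{ij} = 1$, where $x_{ij}$ and $y_{ij}$ denote Alice’s and Bob’s input bits for the pair $(w_i, m_j)$.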
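The embedding into stability verification described above can be sketched in Python as follows. The indexing is an assumption of this sketch: `x[i][j]` and `y[i][j]` stand for Alice's and Bob's bits associated with the pair (woman `i`, man `j`) for `i != j`, diagonal entries are ignored, and participants are numbered from 0. The marriage under test is the one that marries woman `i` to man `i`.

```python
def embed_for_verification(x, y):
    """Build full preference lists from two n-by-n 0/1 matrices.

    Woman i ranks first every man j != i with x[i][j] = 1, then man i,
    then all remaining men. Man j ranks first every woman i != j with
    y[i][j] = 1, then woman j, then all remaining women.
    """
    n = len(x)
    women_prefs = []
    for i in range(n):
        liked = [j for j in range(n) if j != i and x[i][j]]
        rest = [j for j in range(n) if j != i and not x[i][j]]
        women_prefs.append(liked + [i] + rest)
    men_prefs = []
    for j in range(n):
        liked = [i for i in range(n) if i != j and y[i][j]]
        rest = [i for i in range(n) if i != j and not y[i][j]]
        men_prefs.append(liked + [j] + rest)
    return women_prefs, men_prefs


def identity_marriage_is_stable(women_prefs, men_prefs):
    """True iff marrying woman i to man i admits no blocking pair."""
    n = len(women_prefs)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            w_likes = women_prefs[i].index(j) < women_prefs[i].index(i)
            m_likes = men_prefs[j].index(i) < men_prefs[j].index(j)
            if w_likes and m_likes:
                return False
    return True
```

Under these assumptions, the identity marriage is stable exactly when no off-diagonal position carries a 1 in both matrices, so a stability verifier decides disjointness of the two off-diagonal bit sets.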

To prove Corollary ?, we embed disjointness into finding a stable marriage through the intermediate problem of finding a stable marriage with respect to arbitrary (i.e. not necessarily full) preference lists.

To define , for every we define the preference list of to consist of all such that , in arbitrary order (say, sorted by ), followed by (with all other men absent). Similarly, to define , for every we define the preference list of to consist of all such that , in arbitrary order (say, sorted by ), followed by (with all other women absent).

We first show that is stable with respect to and iff . Indeed, since every participant is married by to someone on their preference list, we have:

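$\mu$ is unstable with respect to these preferences $\iff$ there exist $i \neq j$ such that $w_i$ prefers $m_j$ over $m_i$ and $m_j$ prefers $w_i$ over $w_j$ $\iff$ there exist $i \neq j$ such that $x_{ij} = 1$ and $y_{ij} = 1$, where $x_{ij}$ and $y_{ij}$ denote Alice’s and Bob’s input bits for the pair $(w_i, m_j)$.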

It remains to show that if is stable with respect to and , then it is the unique stable marriage with respect to these profiles of preference lists. For the remainder of the proof assume, therefore, that is stable (with respect to and ). Let be a stable marriage (with respect to these profiles of preference lists). As is stable and perfect, by Theorem ?, since is stable, it is perfect as well. Therefore, each is married by to someone on the preference list of , and so weakly prefers over , as in the latter is married to the last person on the preference list of . Thus, is both the -pessimal stable marriage and the -pessimal stable one, and so, by Corollary ?, is the unique stable marriage.

Corollary ? follows from Lemma ? by showing that we can embed the problem of finding a stable marriage with respect to possibly-partial preference lists into finding a stable marriage with respect to full preference lists. See Appendix B for details.

The techniques used to prove Lemmas ? and ? can also be used to prove Theorem ? — see Appendix C. Although Theorem ? shows that determining the marital status of a fixed pair requires communication, we do not know how to prove a similar lower bound for finding some married couple (see Open Problem ? in Section 7). In the next section, we however show a weaker related result, namely that finding any constant fraction of the couples married in a stable marriage requires communication. This result stems from a different construction than that underlying the results of the current section. The construction that follows will also serve as the basis for our results regarding approximate stability.

## 5 General Proof of Main Results

### 5.1 Embedding Disjointness into Preferences

Similarly to the proofs given in Section 4, the proofs of the remaining results from Section 3 follow from embedding suitably large instances of into various problems regarding (approximately) stable marriages. In order to prove these remaining results, we reconstruct the embeddings to have the property that small changes in the participants’ preferences yield very large changes in the global structure of the stable marriages for these preferences. Informally, we construct the preferences so that resolving blocking pairs resulting from such small changes in participants’ preferences creates large rejection chains that ultimately affect most married couples.

#### Preference Description

Let and let and be disjoint such that . We divide the participants into three sets: high, mid and low, which we denote , and respectively for the women and , and respectively for the men. These sets have sizes

where is a parameter with , to be chosen later. The low and mid participants’ preferences will be fixed, while we will use the preferences of the high participants to embed an instance of disjointness of size . We assume that the participants are

where in both cases the first participants are high, the next participants are mid and the remaining participants are low. Since the low and mid participants’ preferences are the same for all instances, we describe those first. As before, the participants’ preferences are symmetric in the sense that the men’s and women’s preferences are constructed analogously.

low participants

The low women’s preferences over men are “in order”: (and symmetrically for low men, whose preferences over women are “in order”). In particular, each low participant prefers all high participants over all mid participants over all low participants.

mid participants

The mid participants prefer low participants over high participants over mid participants. Within each group, the preferences are “in order.” Specifically, the mid women have preferences and symmetrically for the men.

high participants

We use the preferences of each of the high participants to encode a bit vector of length . Together, the men and women’s preferences thus encode an instance of of size . For each , we denote her bit vector ; the preference list of , from most-preferred to least-preferred, is:

1. men such that ;

2. men ;

3. men ;

4. men such that .

Within each group, the preferences are once again “in order”, i.e. sorted by numeric index. The men’s preferences are constructed analogously, with each man encoding the bit vector and preferring first and foremost women such that .

#### Stable Marriage Description

Let be a stable marriage; we will show that . We first argue that every high and mid participant is married to a low participant in . Suppose to the contrary that some for is married to some with in . By the definition of the preferences and the assumption that , at least one of and prefers every low participant over their spouse. Assume without loss of generality that prefers all with over . That is, prefers all low men over her spouse . Since is married to a medium or high man, there must be some low man that is married to a low woman . But prefers all high and medium women over . In particular, he prefers over . Therefore, is a blocking pair, so is not stable. Thus any stable marriage must marry low participants to mid or high participants and vice versa.

Now we argue that if , then we must have . The argument for pairs is identical. Suppose that with . Then there is some such that is married to with . But then mutually prefer each other, contradicting the stability of . We arrive at a similar contradiction if , hence we must have , as desired.

We first argue that for any stable marriage for the preferences described above. Since is stable, if , then at least one of and , say , must be married to someone she prefers over . From ’s preferences, this implies that for some with for which . Since the instance of is uniquely intersecting, we must have . Thus prefers all low women over . Since at most medium and high men are married to low women (indeed is a high man married to a high woman) and there are low women, some low woman is married to a low man. But then and mutually prefer each other, hence forming a blocking pair. Thus, we must have .

The remainder of the proof of the lemma is analogous to the proof of Lemma ? if we remove and from all the participants’ preferences.

This follows from the following two observations:

1. All mid women and men have different spouses in and .

2. No mid women are married to mid men in either or .

From these facts, we can conclude that

### 5.2 Derivation of Main Results

In this section we use the construction of Section 5.1 to prove all the results formulated in Section 3.

Suppose that is a randomized communication protocol (between Alice and Bob) that outputs a -stable marriage using bits of communication. As , there exists sufficiently small such that . Suppose outputs a -stable marriage for the preferences described in Section ?. If , then by Lemma ?, is the unique stable marriage, so .

Suppose . By Lemma ?, is the unique stable marriage, so . Applying Lemma ? and the triangle inequality, we obtain . Thus, if , then and if , then . Given , Alice and Bob can compute without communication, so they can use to determine the value of using bits of communication. Thus, by Theorem ?, as desired.

Suppose that is a randomized communication protocol that determines whether a given marriage is stable or -unstable with respect to given preferences using bits of communication. As , there exists sufficiently small such that . Let be the marriage defined in Lemma ?; by that lemma, if , then is stable (with respect to the preferences described in Section ?). By Lemmas ? and ?, if , then is -unstable. Thus, if determines whether is stable or -unstable, then also determines the value of , hence by Theorem ?.

Suppose that is a randomized communication protocol that for a given pair determines whether for some (every) stable marriage using bits of communication. Set . By choosing preferences as in Section ? and taking , by Lemmas ? and ?, is in some (equivalently every) stable marriage for the given preferences if and only if . Thus, once again by Theorem ?, .

Suppose that is a randomized communication protocol that outputs pairs contained in some (every) stable marriage using bits of communication. Choose preferences as described in Section ? with some , say . Recall from the proof of Lemma ? that no participants in and are ever married to one another in a stable marriage. Therefore, since and since outputs pairs, we have that must output some pair with or . Recall from the proof of Lemma ? that knowing the stable spouse of any participant in or reveals the value of . Thus, by Theorem ?, .

We prove Part (a) of the theorem. Suppose there is a randomized algorithm that computes a -stable marriage using Boolean queries to the women and men. We will use to construct a -bit communication protocol for the approximate stable marriage problem. The protocol works as follows. Alice and Bob both simulate . Whenever queries the women’s preferences, Alice sends the result of the query to Bob (since Alice knows the women’s preferences). Symmetrically, when queries the men’s preferences, Bob sends Alice the result of the query. This protocol uses bits of communication. Thus, by Theorem ?, we must have , as desired.
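The simulation argument above can be sketched in Python: a query algorithm is run once, and every Boolean query is answered by whichever party holds the relevant preferences, at a cost of one transmitted bit. The callback interface (`query(side, predicate)`), the function names, and the toy algorithm are hypothetical and serve only to illustrate the bit accounting.

```python
def run_as_protocol(algorithm, women_prefs, men_prefs):
    """Run a Boolean-query algorithm as a two-party protocol.

    Alice holds women_prefs and Bob holds men_prefs. Each query is an
    arbitrary Boolean predicate over one side's preferences; its owner
    evaluates it locally and sends the one-bit answer to the other party.
    Returns (output of the algorithm, total bits communicated).
    """
    bits_sent = {"alice": 0, "bob": 0}

    def query(side, predicate):
        if side == "women":
            answer = bool(predicate(women_prefs))
            bits_sent["alice"] += 1
        elif side == "men":
            answer = bool(predicate(men_prefs))
            bits_sent["bob"] += 1
        else:
            raise ValueError("side must be 'women' or 'men'")
        return answer

    output = algorithm(query)
    return output, bits_sent["alice"] + bits_sent["bob"]


def toy_algorithm(query):
    """Hypothetical algorithm making two Boolean queries."""
    a = query("women", lambda prefs: prefs[0].index(0) < prefs[0].index(1))
    b = query("men", lambda prefs: prefs[0].index(0) < prefs[0].index(1))
    return a and b
```

The number of bits exchanged equals the number of queries made, so a lower bound on communication carries over to the number of Boolean queries.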

Parts (b)–(d) follow similarly from Theorems ?, ? and ?, respectively.

## 6 Computing Distance to Stability

In this section, we describe an efficient method for computing the divorce distance to stability of a given marriage, . Recall that the divorce distance to stability is given by

In fact, our algorithm solves the following general problem, which we believe may be of independent interest: given a marriage market and an arbitrary marriage , find a stable marriage that shares the greatest number of pairs with . Since the set of all stable marriages can be exponentially large [19], brute-force computation of is infeasible. Fortunately, by exploiting the structure of , we are able to efficiently reduce the computation of to a max-flow/min-cut problem of size quadratic in . Thus, any number of efficient algorithms may be applied to compute .

### 6.1 The Rotation Poset and Digraph

Our exposition follows the work of Gusfield [14] and of Irving and Leather [17] (see also [15]). Let be a stable marriage. A rotation exposed by is a sequence of pairs such that for each , is the first woman on ’s preference list that prefers to her partner in (where addition is conducted modulo ). Given and , we form a marriage called the elimination of from , denoted , which contains the pairs , in addition to all pairs from that are not part of the rotation . It is straightforward to verify that is a stable marriage.

Let denote the -optimal stable marriage (the marriage found by the Gale-Shapley algorithm). Irving and Leather [17] prove that every stable marriage can be obtained from by successively eliminating a unique set of rotations that appear in and subsequent stable marriages. Given a stable marriage , let denote this unique set of rotations, which can be eliminated (starting at ) to obtain . ( may contain some rotations that are not exposed in , but only in subsequent marriages.)

We denote the set of all rotations exposed in one or more stable marriages in by . We endow with a partial order where if for every in which is exposed. In other words, if whenever is eliminated during the construction of a stable marriage by elimination from , it is the case that has been eliminated before . A subset is (downward) closed if for all and , we have that . Irving and Leather prove the following remarkable correspondence between closed subsets of and stable marriages.

For algorithmic purposes, it is advantageous to have a sparse representation of the partial order on , which preserves its closed subsets. To this end, Gusfield [14] proved the following theorem. For the remainder of this section, we use the standard notation to suppress factors, which we find less interesting in the context of our discussion of time complexity in this section (in contrast to the discussion of communication and query complexity in the rest of this paper).

### 6.2 Rotation Weights

In this section, we show how to assign weights to rotations in such a way that can be computed directly from and . Let be any rotation and a stable marriage in which is exposed. For any marriage , we define the weight of relative to by

(The absence of from the notation becomes clear in Equation 1 below.) That is, is the net change, following the elimination of from , in the number of pairs contained in the intersection of and . We note that can be computed directly from (without being given an explicit stable marriage which exposes ). Specifically, letting be the set of pairs replacing when eliminating from any stable marriage,11 we have

We argue by induction on . If , then , so the result is immediate. Suppose the claim is true for and is a rotation exposed in , then by the induction hypothesis and by definition of ,

which gives the desired result.
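The weight described in words above (the net change in the number of pairs shared with the given marriage when a rotation is eliminated) can be computed from the rotation alone. The following sketch assumes marriages are represented as sets of (man, woman) pairs; the name `rotation_weight` and the three-couple example are hypothetical.

```python
def rotation_weight(rotation, reference):
    """Net change in the number of pairs shared with `reference`.

    rotation:  list of (man, woman) pairs (m_0, w_0), ..., (m_{k-1}, w_{k-1}).
    reference: set of (man, woman) pairs of the arbitrary marriage being compared.
    """
    k = len(rotation)
    removed = set(rotation)  # pairs dissolved by the elimination
    added = {(rotation[i][0], rotation[(i + 1) % k][1]) for i in range(k)}
    return len(added & reference) - len(removed & reference)


# Hypothetical example: three couples, one rotation involving m0 and m1.
before = {("m0", "w0"), ("m1", "w1"), ("m2", "w2")}
rotation = [("m0", "w0"), ("m1", "w1")]
after = {("m0", "w1"), ("m1", "w0"), ("m2", "w2")}
reference = {("m0", "w1"), ("m1", "w2"), ("m2", "w0")}
weight = rotation_weight(rotation, reference)
```

Here the elimination creates one pair that appears in `reference` and destroys none, so the weight is 1, matching the change in the size of the intersection.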

Applying Lemma ? and Theorem ?, we obtain the following result.

By Lemma ?, for any stable marriage ,

By Theorem ?, is a bijection onto the set of closed subsets of . Thus,

as desired.

### 6.3 Reduction to Max-Flow/Min-Cut

We now wish to apply Theorems ? and ? to efficiently calculate . The divorce distance can easily be computed in time by using the Gale-Shapley algorithm to compute . Since the partial order on is the transitive closure of (where is the rotation digraph described in Theorem ?), the closed subsets of are precisely the (downward) closed subsets (vertex sets with no incoming edges) of , and so to compute it suffices to maximize over closed subsets of . Thus, we have reduced the problem of computing to finding a maximum closed subset (i.e. maximum-weight vertex set with no incoming edges) in a directed acyclic graph. This problem is well studied, in particular for its applications to open-pit mining (see, e.g. [27]). For completeness, and because our terminology differs somewhat from that of Picard [27], we briefly describe how Picard’s efficient reduction of maximum closed subset to max-flow/min-cut applies to our specific problem.12

Denote the vertex and edge sets of by and respectively. Let

We add a source vertex and a sink vertex to to form a new -graph where and

We assign nonnegative capacity to each edge by

In light of Theorem ?, we can reduce the computation of to known efficient algorithms for max-flow/min-cut. We summarize the procedure as follows.

We remark that since has size , the runtime of is nearly quadratic in the input size.

The correctness of follows immediately from Theorems ? and ?. We analyze the runtime as follows. Step 1 can be computed in time using the Gale-Shapley algorithm and brute force computation of . For Step 2, by Theorem ?, (and hence ) can also be computed in time . The weights in Step 3 can be computed in linear time for each rotation, so computing all the weights can be accomplished in time . For Step 4, the min-cut can be computed in time using, for example, Hochbaum’s algorithm [16].
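The reduction from maximum closed subset to min-cut can be sketched as follows. This is an illustrative implementation under stated assumptions, not the algorithm analyzed above: it uses a plain Edmonds-Karp max-flow rather than Hochbaum's algorithm, assumes integer weights, and the function name and input encoding are hypothetical. Positive-weight vertices are attached to the source, negative-weight vertices to the sink, and each precedence constraint becomes an effectively infinite-capacity edge; the source side of a minimum cut is then a maximum-weight closed subset.

```python
from collections import deque


def max_weight_closed_subset(weights, edges):
    """Return (total_weight, subset) for a maximum-weight closed subset.

    weights: dict mapping each vertex to an integer weight.
    edges:   iterable of (u, v) pairs meaning u precedes v, so a closed
             subset containing v must also contain u.
    """
    source, sink = object(), object()
    # Large enough to act as infinite capacity: exceeds any finite cut.
    infinite = sum(abs(w) for w in weights.values()) + 1
    cap = {source: {}, sink: {}}
    for v in weights:
        cap[v] = {}

    def add_edge(a, b, c):
        cap[a][b] = cap[a].get(b, 0) + c
        cap[b].setdefault(a, 0)  # residual (reverse) edge

    for v, w in weights.items():
        if w > 0:
            add_edge(source, v, w)
        elif w < 0:
            add_edge(v, sink, -w)
    for u, v in edges:
        add_edge(v, u, infinite)  # taking v forces taking its predecessor u

    def reachable_from_source():
        parent = {source: None}
        queue = deque([source])
        while queue:
            a = queue.popleft()
            for b, c in cap[a].items():
                if c > 0 and b not in parent:
                    parent[b] = a
                    queue.append(b)
        return parent

    # Edmonds-Karp: augment along shortest residual paths until none remain.
    while True:
        parent = reachable_from_source()
        if sink not in parent:
            break
        path = []
        b = sink
        while parent[b] is not None:
            path.append((parent[b], b))
            b = parent[b]
        bottleneck = min(cap[a][b] for a, b in path)
        for a, b in path:
            cap[a][b] -= bottleneck
            cap[b][a] += bottleneck

    # Vertices still reachable from the source form the source side of a min cut.
    subset = {v for v in parent if v is not source}
    return sum(weights[v] for v in subset), subset
```

With rotation weights as vertex weights and the rotation digraph's edges as precedence constraints, the returned total is the quantity maximized over closed subsets in the discussion above.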

We remark that since Algorithm ? finds both and for a stable marriage closest to , it is a trivial task, which does not increase the asymptotic runtime complexity of Algorithm ?, to also compute in addition to computing .

## 7 Commentary and Open Problems

A number of recent papers [2] have touched on various aspects of the amorphous question of “how much do the preferences of the women in the Gale-Shapley algorithm affect the produced (-optimal) stable marriage.” The fact that we prove our lower-bound result in a strong two-sided communication model (and not a weaker -sided communication model or an even-weaker query model) allows our results to also be viewed in the context of this line of research. Our communication lower bounds show that a significant amount of information about the preferences of the women is indeed needed in order to deduce the -optimal stable marriage, as well as for solving any of the other problems described in Theorems ? and ?.

One qualitative feature of the Gale-Shapley algorithm is that a single proposal at any point can precipitate a cascade of rejections that affects a large portion of the population. Thus, it is impossible for participants to know if their current partner is their final partner until the algorithm has terminated. Our results imply that this feature is common to all stable marriage mechanisms that dynamically refine a marriage and converge to a stable marriage. Indeed, consider any stable marriage algorithm and arbitrarily divide it into a “first” stage and a “second” stage. A consequence of Theorems ? and ? along with our novel definition of divorce distance is that if the query complexity of the first stage is significantly lower than that of querying the entire input, then after the first stage a large fraction of the participants may not yet be married to their final spouses.

In many real-world marriage markets, centralized clearinghouses are employed to prevent undesirable outcomes [29]. Specifically, these clearinghouses were implemented to avoid “unraveling” — wherein participants are incentivized to match extremely early — as well as instability. While unraveling is undesirable in its own right, the early binding commitments made in an unraveling market have been shown to have adverse effects [32], presumably because the early matches are necessarily made with incomplete knowledge about the market. A consequence of Theorems ? and ? (specifically, part (d)) is that any marriage mechanism that allows even a small fraction of participants to match in early binding commitments (i.e. before essentially all of the preferences are queried) cannot generally produce a stable marriage. Thus the empirical phenomenon of instability in decentralized markets where early-accepted proposals are binding, which was observed in [29], is not merely a feature of the particular marriage mechanisms that arose in practice, but is a general theoretical feature inherent in the stable marriage problem.

It is interesting to compare the lower bound that is proved in this paper for the communication complexity of finding an approximately stable marriage to known complexity bounds for the problem of finding an approximately maximum-weight matching in a bipartite graph. Even though these two problems seem similar, the latter can be solved with communication [7], i.e. with significantly less communication than many variants of the former. It is worthwhile to compare this surprising dissimilarity between these problems with a qualitatively similar message that emerges from a significantly different, recent line of work [22], which shows that finding a Pareto-efficient perfect matching requires considerably less strategic reasoning than finding a stable marriage.

The classic Gale-Shapley algorithm [11] terminates after steps, and each step consists of a message of bits. Thus, the Gale-Shapley algorithm provides a communication upper bound of for the problem of finding a stable marriage. As mentioned in the introduction, our Corollary ? matches this up to a logarithmic factor, but it is not immediately clear how to close this gap.
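The upper bound above comes from counting proposals. As a rough illustration, the following sketch of men-proposing deferred acceptance counts the proposals made; under the assumption that each proposal is sent as the index of one woman (about log2 n bits), the total communication is the proposal count times that message length. The function name and the list-of-lists encoding of preferences are choices made here for illustration.

```python
def gale_shapley_with_count(men_prefs, women_prefs):
    """Men-proposing deferred acceptance.

    men_prefs[m] and women_prefs[w] are preference lists, most preferred first.
    Returns (wife, proposals), where wife[m] is the woman matched to man m and
    proposals is the total number of proposals made.
    """
    n = len(men_prefs)
    rank = [{m: r for r, m in enumerate(prefs)} for prefs in women_prefs]
    next_choice = [0] * n      # index of the next woman each man will propose to
    husband = [None] * n       # husband[w] = man currently held by woman w
    free = list(range(n))
    proposals = 0
    while free:
        m = free.pop()
        w = men_prefs[m][next_choice[m]]
        next_choice[m] += 1
        proposals += 1
        current = husband[w]
        if current is None or rank[w][m] < rank[w][current]:
            husband[w] = m
            if current is not None:
                free.append(current)  # the displaced man becomes free again
        else:
            free.append(m)            # w rejects m
    wife = [None] * n
    for w, m in enumerate(husband):
        wife[m] = w
    return wife, proposals
```

For example, with `men_prefs = [[0, 1], [0, 1]]` and `women_prefs = [[1, 0], [0, 1]]`, both men first propose to woman 0, one is rejected and proposes again, giving three proposals in total.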

Our definition of -stability is nonstandard. A more common notion of approximate stability is that a marriage induce few (say, at most ) blocking pairs (see [9]). As noted in Remark ?, the blocking-pairs notion of approximate stability is strictly coarser than ours. It is therefore natural to ask if the communication lower bound of Theorem ? holds for blocking-pairs approximate stability as well.

Recently, Ostrovsky and Rosenbaum [26] showed that it is possible to find a marriage with blocking pairs for arbitrary using communication rounds for a distributed model of computation. While their result does not imply anything nontrivial about the total communication, we believe their techniques may be relevant for finding communication protocols for blocking-pairs approximate stability (if such protocols exist). Interestingly, an analogue of Theorem ? does not hold for blocking-pairs approximate stability.

Choose a pair uniformly at random from . If prefers over his spouse in , the men query the women to see if also prefers over her spouse in using communication. The probability that is a blocking pair is precisely , where is the fraction of blocking pairs in . Repeat this procedure to estimate to any desired accuracy in a bounded number of steps depending only on the desired accuracy.
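The sampling procedure just described can be sketched as follows. This is a simulation in a single process rather than an actual two-party protocol, and the function name, the list-based encoding of the marriage, and the explicit `rng` parameter are assumptions made for illustration. The fraction estimated is taken over all woman-man pairs, matching the uniform choice of a pair.

```python
import random


def estimate_blocking_fraction(wife, men_prefs, women_prefs, samples, rng):
    """Estimate the fraction of (man, woman) pairs that block the marriage.

    wife[m] is the woman married to man m; preference lists are most
    preferred first. `rng` is a random.Random instance.
    """
    n = len(wife)
    husband = [None] * n
    for m, w in enumerate(wife):
        husband[w] = m
    men_rank = [{w: r for r, w in enumerate(p)} for p in men_prefs]
    women_rank = [{m: r for r, m in enumerate(p)} for p in women_prefs]
    hits = 0
    for _ in range(samples):
        m, w = rng.randrange(n), rng.randrange(n)
        # The men check locally whether m prefers w to his current spouse.
        if men_rank[m][w] < men_rank[m][wife[m]]:
            # Only then are the women asked whether w also prefers m.
            if women_rank[w][m] < women_rank[w][husband[w]]:
                hits += 1
    return hits / samples
```

In a two-couple market where each participant's spouse is their second choice and the other couple's members are their first, two of the four pairs are blocking, so the estimate concentrates around 0.5 as the number of samples grows.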

Theorem ? shows that any protocol that produces a constant fraction of pairs in a stable marriage (regardless of which pairs are found) requires communication. It would be interesting to improve this result (or find an efficient protocol) for finding even a single pair that appears in a stable marriage.

Finally, we notice that in contrast to e.g. Theorems ? and ?, our statement of Theorem ? requires that . It is natural to ask what can be obtained regarding other values of .

## Acknowledgements

Yannai Gonczarowski is supported by the Adams Fellowship Program of the Israel Academy of Sciences and Humanities. The work of Yannai Gonczarowski was supported in part by the European Research Council under the European Community’s Seventh Framework Programme (FP7/2007-2013) / ERC grant agreement no. [249159].

The work of Noam Nisan was supported in part by ISF grants 230/10 and 1435/14 administered by the Israeli Academy of Sciences, and by Israel-USA Bi-national Science Foundation (BSF) grant number 2014389.

The work of Rafail Ostrovsky was supported in part by NSF grants 09165174, 1065276, 1118126, 1136174 and 1619348; US-Israel BSF grant 2012366, OKAWA Foundation Research Award, IBM Faculty Research Award, Xerox Faculty Research Award, B. John Garrick Foundation Award, Teradata Research Award, and Lockheed-Martin Corporation Research Award. This material is based upon work supported in part by DARPA SafeWare program. The views expressed are those of the author and do not reflect the official policy or position of the Department of Defense or the U.S. Government.

We would like to thank two anonymous referees for many helpful comments.

## A Asymptotic Notation

Throughout this paper, we use standard computer-science asymptotic notation to describe the order of growth of single-dimensional functions of natural numbers. For example, for a positive function , we write where is also a positive function (usually, one simple to write down), if there exist positive numbers and such that for all . (Equivalently and succinctly, we write if .) Intuitively, one may find it helpful to read the notation as , where is the class of functions whose order of growth (as grows large) is at most that of . This notation allows for the simplification of the exposition of many results, where the order of magnitude of the result serves as the main message. For example, for a highly complex function that satisfies (that is, does not grow any faster than , where for all ), instead of specifying and saying that some algorithm takes at most steps, one could concisely say that this algorithm takes at most steps, without the need to explicitly write down the complex function . The following table summarizes this and other similar standard “order of growth” notation used throughout this paper.

## B Embedding Arbitrary Preferences into Complete Preferences

This section contains the remaining technical details needed to complete the direct proof of Corollary ? given in Section 4.

Denote , , , and .

To define , for every we define the preference list of to consist of her preference list in (in the same order), followed by , followed by all other men in arbitrary order; we define the preference list of to consist of , followed by all other men in arbitrary order. Similarly, to define , for every we define the preference list of to consist of his preference list in (in the same order), followed by , followed by all other women in arbitrary order; we define the preference list of to consist of , followed by all other women in arbitrary order.

It is straightforward to verify that the lemma holds with respect to these definitions of and ; the details are left to the reader.

## C Determining the Marital Status of a Given Couple or Participant

In this appendix, we give an alternate proof of Theorem ?, which uses the construction of Section 4. We prove Theorem ? once again using Theorem ?, by embedding disjointness in both problems. We embed disjointness via an intermediate problem of determining whether a given participant is single (i.e. not married to anyone) in some stable marriage, given profiles of arbitrary (i.e. not necessarily full) preference lists.13 We therefore obtain the same lower bounds for this problem as well.

Assume without loss of generality that and denote . Denote and .

To define , for every we define the preference list of to consist of all such that , in arbitrary order (say, sorted by ), followed by (with all other men absent). We define the preference list of to consist of all , in arbitrary order (say, sorted by ), with all other men absent. We define the preference list of every to be empty (these women can be ignored, and are defined purely for aesthetic reasons — so that and be of equal cardinality). To define , for every we define the preference list of to consist of all such that , in arbitrary order (say, sorted by ), with all other women absent. For every we define the preference list of to consist of , followed by (with all other women absent).

Let be the marriage in which is married to for every , and in which all other participants are single. We first show that iff is stable, and then show that is stable iff is single in some stable marriage; we commence with the former.

We begin by noting that every participant that is married in is married to someone on their preference list; therefore, is stable iff no pair would rather deviate. Obviously, no would rather deviate with anyone. Furthermore, while would rather deviate with any , these are all married to their top choices, and so none of them would deviate with . Since for every , the preference list of consists of and of a subset of , we therefore have that is unstable iff there exists such that both and is on the preference list of . Similarly to the proof of Lemma ?, this holds precisely if there exists such that and , which holds iff .

We complete the proof by showing that is stable iff is single in some stable marriage. The first direction follows immediately from the fact that is single in . For the second direction, assume that there exists a stable marriage in which is single. By stability of and since all men on the preference list of have on their preference list, all such men are married in and prefer their spouses over . Therefore, for every , we have that is married to in . By stability of , every is single in . As and coincide on all women, we have that . Therefore, is stable and the proof is complete.

The proof is similar to that of Lemma ?. Denote , , , and .

To define , for every we define the preference list of to consist of her preference list in (in the same order), followed by , followed by all other men in arbitrary order; we define the preference list of to consist of , followed by all other men in arbitrary order. Similarly, to define , for every we define the preference list of to consist of his preference list in (in the same order), followed by , followed by all other women in arbitrary order; we define the preference list of to consist of , followed by all other women in arbitrary order.

Similarly to the proof of Lemma ?, we have that is single in some marriage between and that is stable with respect to and iff and are married in some marriage (a corresponding “supermarriage” of ) between and that is stable with respect to and . Additionally, by Theorem ? (in conjunction with Theorem ?), we have: is single in some marriage between and that is stable with respect to and is single in every marriage between and that is stable with respect to and and are married in every marriage between and that is stable with respect to and .

## D Verifying the Output of a Given Stable Marriage Mechanism

As noted in Section 3, while the lower bound of Corollary ? is tight, we do not know whether that of Corollary ? is tight as well. We note that we do not even know a tight lower bound for verifying whether a given marriage is the -optimal stable marriage.

As in the case of Open Problem ?, we do not have any algorithm for verification of the -optimal stable marriage, even randomized and even in the strong two-party communication model, nor do we have any lower bound, even for deterministic algorithms and even in the simple comparison model.

In this section, we derive a lower bound for verification of the -optimal stable marriage. In fact, we show this lower bound not only for verifying the -optimal stable marriage, but also for verifying the output of any other stable marriage mechanism.

Theorem ? may be proven either via a direct application of the machinery of Section 5, or using the machinery of Section 4, with Lemma ? replaced by the following lemma.

We define and as in Lemma ?, only with appearing sorted by (as opposed to in arbitrary order) on the preference lists of , and with appearing sorted by (as opposed to in arbitrary order) on the preference lists of . By Lemma ?, we have both that ? holds, and that if is the unique stable marriage with respect to and , then it is a submarriage of every marriage that is stable with respect to and ; it is straightforward to show that every “supermarriage” of , apart from , is unstable, thus proving ? as well.

## E Nondeterminism

All the lower bounds in this paper are based upon reductions to the well-studied communication complexity of the disjointness function. Since the disjointness function also has nondeterministic communication complexity [20], it follows that all our lower bounds apply not only to randomized communication complexity, but also to nondeterministic communication complexity. For nondeterministic communication complexity, the lower bound for finding a stable marriage is in fact tight (and the bound for verification of stability remains tight as well).

For the decision problem of verifying the stability of a given marriage, the co-nondeterministic communication complexity may be easily seen to be . In contrast, we note that the proof of Theorem ? may be easily adapted to show a lower bound also for the co-nondeterministic communication complexities of determining the marital status of a given couple.

For completeness, we show this lower bound also for the nondeterministic and co-nondeterministic communication complexities of the intermediate problem of determining whether a given participant is single, which we presented in Appendix C. (This proof also yields Theorem ? using the tools of that appendix and of Section 4.) These lower bounds follow from the results of Appendix C in conjunction with the following lemma.

To define , we define the preference list of as her preference list in (in the same order), followed by ; we define the preference list of every other woman in as her preference list in (in the same order and with absent), and define the preference list of to be empty (once again, can be ignored, and is defined purely for aesthetic reasons — so that and be of equal cardinality). To define , we define the preference list of every man in as his preference list in (in the same order and with absent); we define the preference list of to consist solely of .

Directly from the definition of and , we have that a natural bijection exists between stable marriages with respect to and and stable marriages with respect to and ; this bijection is given by:

• If is married in , then (with and single in ).

• If is single in , then is the marriage obtained from by marrying to (with once again single in ).

Once again by Theorem ? (in conjunction with Theorem ?), and by the existence of this bijection, we have: is single in some marriage between and that is stable with respect to and is single in every marriage between and that is stable with respect to and is married in every marriage between and that is stable with respect to and .

We note that the nondeterministic lower bound of for determining whether a given couple is married in some stable marriage, as well as the co-nondeterministic lower bound of for determining whether a given couple is married in every stable marriage (and both the nondeterministic and co-nondeterministic lower bounds of for determining whether a given participant is single in some/every stable marriage), is in fact tight. (Recall that we do not know whether any of these problems can be deterministically or even probabilistically solved using communication.) The questions of a tight co-nondeterministic lower bound for the former problem and a tight nondeterministic lower bound for the latter remain open in all query models. We note that the latter problem may be solved by checking whether the pair in question is married in both the -optimal stable marriage and the -optimal stable marriage; a -Boolean-queries algorithm (even a nondeterministic one) for verification of the -optimal stable marriage (see Open Problem ? in Appendix D) would therefore also settle the question of the nondeterministic communication complexity of this problem.
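The check mentioned above, namely testing whether a pair is married in both extreme stable marriages, can be sketched by running deferred acceptance once with each side proposing. The function names and the list-based preference encoding are assumptions made for illustration, and the sketch makes no attempt at query efficiency.

```python
def deferred_acceptance(proposer_prefs, receiver_prefs):
    """Return match, where match[p] is the receiver matched to proposer p
    in the proposer-optimal stable marriage."""
    n = len(proposer_prefs)
    rank = [{p: r for r, p in enumerate(prefs)} for prefs in receiver_prefs]
    next_choice = [0] * n
    holder = [None] * n  # holder[r] = proposer currently held by receiver r
    free = list(range(n))
    while free:
        p = free.pop()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        q = holder[r]
        if q is None or rank[r][p] < rank[r][q]:
            holder[r] = p
            if q is not None:
                free.append(q)
        else:
            free.append(p)
    match = [None] * n
    for r, p in enumerate(holder):
        match[p] = r
    return match


def married_in_every_stable_marriage(man, woman, men_prefs, women_prefs):
    """True iff the pair is married in both extreme stable marriages."""
    men_optimal = deferred_acceptance(men_prefs, women_prefs)    # wife of each man
    women_optimal = deferred_acceptance(women_prefs, men_prefs)  # husband of each woman
    return men_optimal[man] == woman and women_optimal[woman] == man
```

In a market with two distinct stable marriages, a pair married only in the men-optimal one is correctly rejected, while in a market with a unique stable marriage every married pair passes the check.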

## F Optimality of Deferred Acceptance with respect to Queries onto Women

Gale and Shapley’s (1962) proof of Theorem ? is constructive, providing an efficient algorithm for finding the -optimal stable marriage. In this algorithm, men are asked queries of the form “which woman is next on the preference list of man after woman ?” (or alternatively, “which woman does man rank at place ?”), while women are asked queries of the form “whom does woman prefer most out of the set of men ?”; all of these queries require an answer of length bits.

Dubins and Freedman [8] presented a variant of Gale and Shapley’s algorithm, which runs in the same worst-case time complexity, but performs a significantly more limited class of queries, namely only pairwise-comparison queries, onto women. In Open Problem ? in the Introduction, we raise the question of a tight lower bound for the complexity of finding a stable marriage using only such queries for both women and men. In this section, we show that regardless of how complex the queries onto the men may be, no algorithm for finding any stable marriage (and even no algorithm for verifying the stability of a given marriage, when given a stable marriage as input) that performs only pairwise-comparison queries onto women may perform any fewer such queries onto them than Dubins and Freedman’s variant of Gale and Shapley’s algorithm (given the same preference lists). For the duration of this section, let , let and be disjoint sets such that .

Let be a run of the men-proposing deferred-acceptance algorithm with respect to and , and let be a given run of an algorithm for finding/verifying a stable marriage with respect to and