An n-ary Constraint for the Stable Marriage Problem††thanks: The first author is supported by EPSRC. Software support was given by an ILOG SA’s academic grant.
We present an n-ary constraint for the stable marriage problem. This constraint acts between two sets of integer variables where the domains of those variables represent preferences. Our constraint enforces stability and disallows bigamy. For a stable marriage instance with men and women we require only one of these constraints, and the complexity of enforcing arc-consistency is which is optimal in the size of input. Our computational studies show that our n-ary constraint is significantly faster and more space efficient than the encodings presented in . We also introduce a new problem to the constraint community, the sex-equal stable marriage problem.
In the Stable Marriage problem (SM) [2, 5] we have men and women. Each man ranks the women into a preference list, as do the women. The problem is then to produce a matching of men to women such that it is stable. By a matching we mean that there is a bijection from men to women, and by stable we mean that there is no incentive for partners to divorce and elope. A matching is unstable if there are two couples and such that prefers to his current partner , and prefers to her current partner .
Figure 1 is an instance of the stable marriage problem, and has 6 men and 6 women. Figure 1 shows the problem initially, with each man and woman’s preference list. Figure 2 shows the intersection of the male and female-oriented Gale-Shapley lists (GS-lists) , where the GS-lists are reduced preference lists. A man-optimal (woman-pessimal) stable matching can now be found by marrying men (women) to their most (least) preferred choices in there GS-lists. Conversely, we can produce a woman-optimal (man-pessimal) matching by marrying women (men) to their most (least) preferred choice in their GS-lists. An instance of SM admits at least one stable matching and this can be found via the Extended Gale-Shapley algorithm in time , where there are men and women.
|Men’s lists||Women’s lists|
|1: 1 3 6 2 4 5||1: 1 5 6 3 2 4|
|2: 4 6 1 2 5 3||2: 2 4 6 1 3 5|
|3: 1 4 5 3 6 2||3: 4 3 6 2 5 1|
|4: 6 5 3 4 2 1||4: 1 3 5 4 2 6|
|5: 2 3 1 4 5 6||5: 3 2 6 1 4 5|
|6: 3 1 2 6 5 4||6: 5 1 3 6 4 2|
|Men’s lists||Women’s lists|
|1: 1||1: 1|
|2: 2||2: 2|
|3: 4||3: 4 6|
|4: 6 5 3||4: 3|
|5: 5 6||5: 6 4 5|
|6: 3 6 5||6: 5 6 4|
We present a simple constraint encoding for the stable marriage problem. We introduce a specialised n-ary constraint with only three methods, where each method is no more than six lines of code. We show how enforcing arc-consistency in this encoding results in the male-oriented Gale-Shapley lists. This minimal encoding cannot be used in search and only achieves directed arc-consistency, from men to women. We then go on to show how we can extend this encoding by introducing a modest amount of additional code, such that the encoding can be used in search, can be embedded in richer impure problems where the stability of marriages is only part of a larger problem, and the male and female oriented GS-lists are produced. Our empirical results suggest, that although our encodings has time complexity, the same as the optimal encoding proposed in , our constraint significantly outperforms this encoding in both space and time.
2 The Extended Gale-Shapley Algorithm (EGS)
We now describe the male-oriented Extended Gale-Shapley (EGS) algorithm (shown in Figure 3). In particular, we explain what is meant by a proposal, an engagement, and for a man to become free. We will use this later to show that this algorithm and our constraint encoding are equivalent.
The EGS algorithm  produces a stable matching between men to and women to , where each man (woman) ranks each of the women (men) into preference order. Via a process of proposals from men to women the algorithm delivers reduced preference lists, called GS-lists (Gale-Shapley lists), such that if each man (woman) is paired with his (her) best (worst) partner in their GS-list the marriages will be stable.111Strictly speaking, the given algorithm produces MGS-lists, the male GS-lists. But for the sake of brevity we will refer to them as GS-lists.
We will assume that we have an instance of the stable marriage problem, and that for any person in , is the ordered list of persons in the original preference list of and is the ordered list of people in the GS-list for , and initially equals . In a proposal from man to woman , will be at the head of the man’s GS-list . This leads to an engagement where is no longer free and all men that prefers less than are removed from her GS-list, i.e. the last entry in becomes . Further, when a man is removed from that woman is also removed from his GS-list, i.e. is removed from , consequently bigamy is disallowed. Therefore and are engaged when is no longer free, is head of , and is at the tail of . A man becomes free when was engaged to (i.e. the head of is ) and receives a proposal from man that she prefers to . On becoming free, is added to the list of free men and is removed from .
The algorithm starts with all men free and placed on a list (line 1). The algorithm then performs a sequence of proposals (lines 2 to 13). A man is selected from the free list (line 2), and his most preferred woman is selected (line 4). If is engaged, then her partner becomes free. The pair and then become engaged (lines 7 to 12).
We assume that the men and women’s preference lists have been read into two 2-dimensional integer arrays and respectively. is the preference list for the man where is the i man’s j preference, and similarly is the preference list for the woman. Using our problem in Figure 1, if we consider our man he will have a preference list .
We also assume we have the inverse of the preference lists, i.e. and , where is the man’s preference for the woman and is the k woman’s preference for the l man. Again, considering the 3 man in Figure 1, his inverse preference list will be , is his preference for the woman, and that is 6, i.e. woman 2 is in the position of man 3’s preference list.222The inverse of the preference lists can be created when reading in the preference lists such that , and this does not affect the overall complexity of constructing our model.
We associate a constrained integer variable with each man and each woman, such that is a constrained integer variable representing the man in stable marriage instance and has a domain initially of 1 to . Similarly, we have an array of constrained integer variables for women, such that represents the woman in . The values in the domain of a variable correspond to preferences, such that if variable is assigned the value this corresponds to being married to his choice of woman, and this will be woman . For example, if (in Figure 1) is set to 3 then this corresponds to marrying his choice, (and conversely would then have to be assigned the value 5). Again referring to Figure 1 our man’s domain is , as is everyone else’s, and in Figure 2 . We also assume that we have the following functions, each being of complexity, that operate over constrained integer variables:
delivers the smallest value in .
delivers the largest value in .
delivers the instantiated value of .
sets the maximum value in to be min.
instantiates the variable to the value .
removes the value from .
We assume that constraints are processed by an arc-consistency algorithm such as AC5  or AC3 . That is, the algorithm has a stack of constraints that are awaiting revision and if a variable loses values then all the constraints that the variable is involved in are added to the stack along with the method that must be applied to those constraints, i.e. the stack contains methods and their arguments. Furthermore, we also assume that a call to a method, with its arguments, is only added to the stack if it is not already on the stack. We’ll refer to this stack as the call stack.
4 An n-ary Stable Marriage Constraint (SM2N)
We now give a description of our n-ary stable marriage constraint, where arc-consistency on such an encoding is equivalent to an application of the male-oriented EGS algorithm. Note that the constraint as described minimally cannot be used within a search process, however we will later show how this can be done. Our constraint is n-ary in that it constrains men and women such that stability is maintained and bigamy is disallowed, although it achieves only 2-consistency.333A detailed explanation of just what we mean by 2-consistency in this model is given in section 6. In a stable marriage problem with men and women we will then require only one of these constraints. We now start by describing the attributes of the constraint and the three methods that act upon it. We will use a java-like pseudo-code such that the (dot) operator is an attribute selector, such that delivers the attribute of .
4.1 The attributes
A n-ary stable marriage constraint (SM2N) is an object that acts between men and women, and has the following attributes:
and are constrained integer variable arrays representing the men and women that are constrained, such that is the constrained integer variable corresponding to and corresponds to .
and are 2-dimensional integer arrays which contain the male and female preference lists respectively, such that equals and contains ’s choice woman.
and are 2-dimensional integer arrays which contain the male and female inverse preference lists respectively, such that contains man ’s preference for .
is an array of integer variables which contain the previous upper bounds of all variables. All are set to at the start of search and are updated by the deltaMax(i) method detailed below.
4.2 The propagation methods
We now describe three methods that achieve male-oriented arc-consistency.
This method is called when the lower bound of increases. The lower bound of increasing signifies that has been rejected by his favourite choice of partner and thus must propose to his new favourite available partner. To do this we first find ’s favourite available partner (line 2), then remove all men from the list of she likes less than (line 3).
1. deltaMin(i) 2. j = xPy[i][getMin(x[i])] 3. setMax(y[j],yPx[j][i])
This method is called when the upper bound of is reduced. To maintain consistency needs to be removed from the domains of all men that have been removed from her domain. This is done by looping once for each value that has been removed from the tail of since the last call to deltaMax(j) (line 2). Within the loop a that has been removed from is selected (line 3) and then is removed from . When all relevant men have had their domains’ altered (line 5) is updated (line 6).
1. deltaMax(j) 2. FOR (k = getMax(y[j])+1 to yub[j]) 3. i = yPx[j][k] 4. remVal(x[i],xPy[i][j]) 5. END FOR LOOP 6. yub[j] = getMax(y[j])
The method is called when the constraint is created, and is simply a call to for each of the men variables.
1. init() 2. FOR (i = 1 to n) 3. deltaMin(i) 4. END FOR LOOP
5 Comparison to EGS
We now compare the behaviour of our n-ary constraint model (SM2N) to the male-oriented EGS algorithm. In our comparison we will describe steps in the EGS algorithm in italics and the SM2N constraint encoding in normal font. Sometimes we will use and as a particular person (rather than and ), and and as particular variables (rather than and ) for sake of brevity. Additionally, we assume we have the function and that it delivers the integer where , i.e. is the least preferred partner of .
Initially the EGS algorithm sets all men to be free by adding them to the free list (line 1). Equivalently, when propagation starts the call to will cause the set of calls to be added to the empty call stack.
EGS picks a man from the free list and he then proposes to his first choice woman (lines 4 to 7). Initially the call stack will contain calls to the method, called directly via . When executing the call , man will make the equivalent of a proposal to his first choice woman (as described next).
When makes a proposal to all values that appear in after the proposing man are removed (lines 8 to 10), i.e. they become engaged. When the call is made, where is favourite, the maximum of is set to preference for , therefore removing all less preferred men. Effectively, and become engaged.
To maintain monogamy EGS removes the newly engaged woman from the GS-lists of all men that have just been removed from her preference list (line 11). From the action above, the maximum of has been lowered, consequently a call to will be added to the call stack. In that call to , is removed from for all where has been removed from the tail of . Therefore, and can never be married.
In EGS, if makes a proposal to , who is already engaged to , then previous fiance is assigned to be free and added to the free list (lines 5 and 6.) On initiating the call where is favourite available woman, fiance corresponds to the maximum value in , because all less preferred men will have been removed (as above). Therefore if receives a proposal from via the call , and prefers to her current fiance (where ) the maximum of will be set lower than her preference for and therefore her preference for will be removed from . Consequently, the call will then be put on the call stack, which will remove preference for from . Because was previous favourite, preference for would have been . Therefore removing that value will increase domain minimum, and the call will then be added to the stack. And this effectively assigns man to be free.
6 Arc-consistency in the Model
On the completion of arc-consistency processing, the variable domains can be considered as . That is, . Furthermore, .
The GS-domains are 2-consistent such that if man is married to a woman (i.e. ) then any woman can then marry some man without forming a blocking pair or a bigamous relationship. That is, for an arbitrary woman there exists a value such that . Furthermore if a man is married to a woman then any other man can then marry some woman , where .
It is important to note, that although our constraint is n-ary it only achieves 2-consistency. It is our opinion that the cost of achieving a higher level of consistency would be of little advantage. This is so because by maintaining 2-consistency, and using a suitable value ordering heuristic in the model during search we are guaranteed failure-free enumeration of all solutions .
In  Theorem 1.2.2 it is proved that all possible executions of the Gale-Shapley algorithm (with men as proposers) yield the same stable matchings. Our encoding mimics the EGS algorithm (as shown in section 5) and we claim (without proof) that the encoding reaches the same fixed point for all ordering of the revision methods on the call stack.
7 Complexity of the model
In  section 1.2.3 it is shown in the worst case there is at most proposals that can be made by the EGS algorithm, and that the complexity is then . We argue that the complexity of our SM2N encoding is also . First we claim that the call to our method is of complexity . The method is of complexity , where is the number of values removed from the tail of variable since the last call to for this variable.
Because there are values in the domain of variable the worse case complexity for all possible calls to is . Equally there are values in the domain of variable and thus the worse case complexity for all possible calls to is . Therefore because there are variables and variables, the total worst case complexity for all possible calls to and is .
8 Enhancing the model
The full GS-Lists are the union of the male and female Gale-Shapley lists remaining after executing male and female oriented versions of EGS. It has been proven that the same lists can be produced by running the female orientated version of EGS on the male-oriented GS-lists . Because SM2N produces the same results as EGS the full GS-Lists can be produced in the same way. But because of the structure of this specialised constraint it is also possible to combine the male and female orientated versions of SM2N into one constraint. This combined gender free version of SM2N will then produce the full GS-List with only one run of the arc-consistency algorithm. To create the gender free version all of the methods presented in this paper must then be symmetrically implemented from the male and female orientations.
The SM2N constraint as presented so far has only considered domain values being removed by the constraint’s own methods. If we were to use the constraint to find all possible stable matchings, unless arc consistency reduces all variable domains to a singleton, it will be necessary to assign and remove values from variable domains as part of a search process. Therefore, we need to add code to SM2N to maintain consistency and stability in the event that domain values are removed by methods other than those within SM2N. It is important to note that these external domain reductions could also be caused by side constraints as well as a search process.
There are four types of domain reduction that external events could cause: a variable is instantiated; a variable’s minimum domain value is increased; a variable’s maximum domain value is reduced; one or more values are removed from the interior of a variable’s domain. We now describe two additional methods, and , and the enhancements required for . We note that does not need to change, and describe the required enhancements for incomplete preference lists.
The method is called when a variable is instantiated.
1. inst(i) 2. For (k = 0 to getVal(x[i])-1) 3. j = xPy[i][k] 4. setMax(y[j],yPx[j][i]-1) 5. END FOR LOOP 6. j = xPy[i][getVal(x[i])] 7. setVal(y[j],yPx[j][i]) 8. For (k = getVal(x[i])+1 to n) 9. j = xPy[i][k] 10. remVal(y[j],yPx[j][i]) 11. END FOR LOOP
This method removes all values from the set of variables to prevent variable being involved in a blocking pair or inconsistency. To prevent from creating a blocking pair, all the values that corresponds to men less preferred than , are removed from the domains of all women that prefers to his assigned partner (lines 2-5). Since is matched to , must now be matched to (lines 6,7). To maintain consistency is removed from the domains of all other women (lines 8-11)). The complexity of this method is and because there are variables and each can only be instantiated once during propagation, the total time complexity of all possible calls to is .
This method is called when the integer value is removed from , and this value is neither the largest nor smallest in .
1. removeValue(i,a) 2. j = xPy[i][a] 3. remVal(y[j],yPx[j][i])
The woman the value corresponds to is found (line 2) then is removed from her domain (line 3), and this must be done to prevent bigamy.
8.0.3 Enhancements to deltaMin(i)
Up till now we have assumed that all values removed from the head of are as a result of being rejected by some . We now drop this assumption in the following enhanced version. In this method we add a new variable array named , and this is similar to the array except it holds the previous lower bound of . All elements in are initialised to and are updated and used only by the method.
1. deltaMin(i) 2. j = xPy[i][getMin(x[i])] 3. setMax(y[j],yPx[j][i]) 4. FOR (k = xlb[i] to getMin(x[i])-1) 5. j = xPy[i][k] 6. setMax(y[j],yPx[j][i]-1) 7. END FOR LOOP 8. xlb[i] = getMin(x[i])
Lines 1 to 3 are as the original. The next four lines (lines 4-7) cycle through each of the values that have been removed from the head of since the last call to (line 4). , which the removed value corresponds to, is then found (line 5), and then all values that are not strictly greater than her preference for are removed from (line 6). The lower bound of the man variable is then updated (line 8).
8.0.4 No enhancements to deltaMax(j)
We now consider the situation where some process, other than a proposal, removes values from the tail of , i.e. when the maximum value of changes. The method will be called, and the instance continues to be stable as all values remaining in corresponding to men prefers to the removed values. However, we need to prevent bigamy, by removing from the corresponding variables removed from the tail of , and this is just what does. Therefore, no enhancement is required.
8.0.5 Incomplete Lists (SMI)
The encoding can also deal with incomplete preference lists, i.e. instances of the stable marriage problems with incomplete lists (SMI). For a SM instance of size we introduce the value . The value must appear in the preference lists and as a punctuation mark, such that any people after are considered unacceptable. For example, if we had an instance of size 3 and a preference list = (3,2) we would construct and this would result in the inverse . Consequently would always prefer to be unmatched (assigned the value 4) than to be married to . We now need to modify the method such that it sets the maximum value in to be . These modifications will only work in the full implementation (i.e. it requires the above enhancements).
8.0.6 Reversible integers
In this encoding we have used two variable arrays which contain dynamic data. and are initialised to and 1 respectively, but these values will be updated as the problem is being made arc-consistent. If we are only looking for the first solution then we need only use normal integers to hold these values. However, when the constraint solver backtracks and values that had been removed from the domain of a variable are reintroduced then the values held in and will no longer be correct. To fix this problem we have to tell the solver that when it backtracks it needs to reverse the changes to and as well as the variables domains. This is done by using a reversible integer variable. This class should be supplied in the constraint solver toolkit. The solver will then store the values of each of the reversible variables at each choice point and restore them on backtracking.
9 Computational Experience
We implemented our encodings using the JSolver toolkit , i.e. the Java version of ILOG Solver. In a previous paper  we presented a specialised binary constraint (SM2) for the stable marriage problem, and presented some results comparing the SM2 constraint with the two constraint encoding in . Here we show a chopped down version of those results, with the results obtained by running SM2N on the same set of test data included. The other model shown in the results table is the optimal boolean encoding (Bool) as presented in . Our experiments were run on a Pentium 4 2.8Ghz processor with 512 Mbytes of random access memory, running Microsoft Windows XP Professional and Java2 SDK 18.104.22.168 with an increased heap size of 512 Mbytes.
Our first experiment measures the time taken to generate a model of a given SM instance and make that model arc-consistent, i.e. to produce the GS-lists. Table 1 shows the average time taken to produce the GS-lists for ten randomly generated instances of size 100 up to 1000. Time is measured in seconds, and an entry means that an out of memory error occurred. We can see that the SM2N constraint dominates the other models.
This second experiment measures the time taken to generate a model and find all possible stable matchings. Table 2 shows the average time taken to find all solutions on the same randomly generated instances used in the first experiment. Again it can be seen that the SM2N model dominates the other models. In summary, when the boolean encoding solves a problem the n-ary constraint does so nearly 100 times faster, and the n-ary constraint can model significantly larger problems than the boolean encoding.
Tables 1 and 2 raise the following question, if the Bool encoding is optimal then why is it dominated by the SM2 encoding, when SM2 is time and the Bool encoding is time? The main reason for this is that there is no significant difference in the space required to represent variables with significant differences in domain size, because domains are represented as intervals when values are consecutive. Considering only the variables, the Bool encoding uses space whereas the SM2 model uses space. For example, with the Bool encoding runs out of memory just by creating the variables whereas the SM2 model takes less than 0.25 seconds to generate the required 2600 variables each with a domain of 1 to 1300. Theoretically the space complexity of the constraints used by SM2 and Bool are the same. In practise this is not the case as SM2 requires exactly constraints to solve a problem of size whereas Bool requires constraints. Therefore the Bool encoding requires more variables and more constraints, resulting in a prohibitively large model. The same argument also applies to the performance of the SM2N constraint, i.e. the n-ary constraint is more space efficient that the Bool encoding, is of the same time complexity, and this results in superior performance. The space and time complexities of these models are tabulated below. Note that the constraint-space for SM2N is a consequence of the storage of the preference lists and their inverses.
This Third experiment shows how SM2N can handle larger problems. Table 4 shows the average time taken to both produce the GS-Lists and find all solutions for one hundred randomly generated instances of size 1000 up to 2000, again the times are in seconds.
10 Sex equal optimisation
The sex equal stable marriage problem (SESMP) as posed in  as an open problem, is essentially an optimisation problem. A male optimal solution to an SMP is where all men get there best possible choices from all possible stable matchings (and all women get there worst), and in a woman optimal solution all women are matched to there best possible choices (and all men to there worst). A sex equal matching is where both the men and the women are equally well matched. This problem has been proven to be NP-Hard .
In a all men will have a score for each woman and all women will have a score for each man, man ’s score for woman is and woman ’s score for man is . In an unweighted all scores will be the same as the preferences, so would equal and would equal . In a weighted this is not so, but the same ordering must be maintained meaning iff . For any matching all men and women will score the matching determined by which partner they are match to in . If man is matched to woman in matching then will give that matching a score of and woman will give it a score of . The sum of all scores given by men for a matching equals and the sum of the women’s scores is . A matching for an instance of the stable marriage problem is sex equal iff there exists no matching such that the absolute difference between the and is less than the absolute difference between and .
Because the values in the domains of the and variables are preferences, it makes finding an unweighted sex equal matching with simple. All that is required is to add a search goal to minimise the absolute difference between the sum of all variables and the sum of all variables. We tested this using the same test data as in Table 4 and the results are tabulated below. These results can be compared to those in Figure 6 of , where the Bool encoding failed to model problems with 300 or more men and women, and at the SM2 model was more than 15 times slower than the SM2N model. We believe that this demonstrates the versatility of our constraint, in that we can easily use the constraint as part of a richer problem.
The SM2N constraint was originally developed using the choco constraints tool kit, and the way the constraint has been introduced reflects that. In choco to implement a user defined constraint, the class is extended. This class contains the methods , , , and . These methods are the equivalent of the ones used to introduce the constraint. is the same as , and are the same as and and is the same as . To implement a constraint in Ilog JSolver we first state when the constraint needs to be propagated, i.e. when a domain value is removed, when the range changes (meaning the upper or lower bound changes) or just when a variable is instantiated. We then need to define a method that will handle propagation when such an event occurs. For the SM2N constraint we stated it was to be propagated every time the range of a variable changed. We then used conditional statements to ascertain which bound had changed, and used the methods as presented above to handle the propagation.
We have presented a specialised n-ary constraint for the stable marriage problem, possibly with incomplete lists. The constraint can be used when stable marriage is just a part of a larger, richer problem. Our experience has shown that this constraint can be implemented in a variety of constraint programming toolkits, such as JSolver, JChoco, and Koalog. The complexity of the constraint is . Although this is theoretically equal to the optimal complexity of the Boolean encoding in , our constraint is more practical, typically being able to solve larger problems faster. For example, we have been able to enumerate all solutions to instances of size 2000 in seconds, whereas in  the largest problems investigated were of size 60. We have also presented the first study of SESMP using a constraint solution, i.e. where the stable matching constraints are part of a richer problem.
We are grateful to ILOG SA for providing us with the JSolver toolkit via an Academic Grant licence. We would also like to thank our four reviewers.
-  ILOG JSolver. http://www.ilog.com/products/jsolver/.
-  D. Gale and L.S. Shapley. College admissions and the stability of marriage. American Mathematical Monthly, 69:9–15, 1962.
-  I.P. Gent, R.W. Irving, D.F. Manlove, P. Prosser, and B.M. Smith. A constraint programming approach to the stable marriage problem. In CP’01, pages 225–239, 2001.
-  I.P. Gent and P. Prosser. An empirical study of the stable marriage problem with ties and incomplete lists. In ECAI’02, 2002.
-  D. Gusfield and R. W. Irving. The Stable Marriage Problem: Structure and Algorithms. The MIT Press, 1989.
-  Akiko Kato. Complexity of the sex-equal stable marriage problem. Japan Journal of Industrial and Applied Mathematics (JJIAM), 10:1–19, 1993.
-  A. K. Mackworth. Consistency in networks of relations. Artificial Intelligence, 8:99–118, 1977.
-  C. Unsworth and P. Prosser. A specialised binary constraint for the stable marriage problem. In SARA05, 2005.
-  Pascal van Hentenryck, Yves Deville, and Choh-Man Teng. A generic arc-consistency algorithm and its specializations. Artificial Intelligence, 57:291–321, 1992.