Mathematical models for stable matching problems with ties and incomplete lists

Maxence Delorme, Sergio García, Jacek Gondzio, Joerg Kalcsics, David Manlove, William Pettersson

School of Mathematics, University of Edinburgh, United Kingdom
School of Computing Science, University of Glasgow, United Kingdom

Corresponding author: maxence.delorme@ed.ac.uk, phone +44 0131 650 5870

Abstract

We present new integer linear programming (ILP) models for NP-hard optimisation problems in instances of the Stable Marriage problem with Ties and Incomplete lists (SMTI) and its many-to-one generalisation, the Hospitals / Residents problem with Ties (HRT). These models can be used to efficiently solve these optimisation problems when applied to (i) instances derived from real-world applications, and (ii) larger instances that are randomly-generated. In the case of SMTI, we consider instances arising from the pairing of children with adoptive families, where preferences are obtained from a quality measure of each possible pairing of child to family. In this case we seek a maximum weight stable matching. We present new algorithms for preprocessing instances of SMTI with ties on both sides, as well as new ILP models. Algorithms based on existing state-of-the-art models solve only 6 of our 22 real-world instances within an hour per instance, whereas our new models incorporating dummy variables and constraint merging, together with preprocessing and a warm start, solve all 22 instances within a mean runtime of a minute. For HRT, we consider instances derived from the problem of assigning junior doctors to foundation posts in Scottish hospitals. Here we seek a maximum size stable matching. We show how to extend our models for SMTI to HRT and reduce the average running time for real-world HRT instances by two orders of magnitude. We also show that our models outperform by a wide margin all known state-of-the-art models on larger randomly-generated instances of SMTI and HRT.

Keywords: Assignment, Stable Marriage problem, Hospitals / Residents problem, Ties and Incomplete lists, Exact algorithms.

1 Introduction

1.1 Background

In a stable matching problem, we are given a set of agents, each of whom ranks all the others in strict order of preference, indicating their level of desire to be matched to each other. A solution of the problem is a pairing of all agents such that no two agents form a blocking pair, i.e., a pair that are not currently matched together, but would prefer to be matched to each other rather than to their currently assigned partners.

Without any other constraints, this problem is known as the Stable Roommates (SR) problem [11, 16], and the objective is to partition the agents into pairs (e.g., doubles in a tennis tournament) such that no blocking pair exists.

The Stable Marriage problem (SM) is a bipartite restriction of SR, where the agents are split into equal-sized sets of men and women, and it is assumed that men only find women acceptable and vice versa. This problem was first introduced by Gale and Shapley [11], who also gave a linear-time algorithm for finding a stable matching.

It is not always desirable, or even possible, to have every agent express a preference over all other agents. In the Stable Marriage problem with Incomplete lists (SMI), agents can identify potential partners as being unacceptable, meaning that they would rather be unmatched than matched to such agents, and a slight modification of the Gale-Shapley algorithm will find a stable matching in linear time [16, Section 1.4.2]. It turns out that all stable matchings in a given instance of SMI have the same size [12].

In many applications it is not realistic to expect that agents have sufficient information to enable them to rank their acceptable potential partners in strict order of preference. In reality, preference lists may include ties, where a tie indicates a set of agents that are equally desirable. This gives rise to another variant of SM known as the Stable Marriage problem with Ties (SMT) [20]. It is known that resolving indifference by employing tie-breaking is not a good strategy, since it over-constrains the problem [8]. Instead, three levels of stability [20] have been defined in the SMT case, where ties are retained, that vary according to whether agents will agree to swap between choices they find equally acceptable. Under the weakest of these three definitions, which we assume in this paper, a stable matching can always be found by arbitrarily breaking the ties, resulting in an instance of SM.

If both ties and incomplete lists are introduced we obtain the Stable Marriage problem with Ties and Incomplete lists, or SMTI [31]. In an instance of SMTI, stable matchings do not necessarily have the same size, and MAX-SMTI, the problem of finding a stable matching of maximum size, is NP-hard [31].

The Stable Roommates problem with Globally Ranked Pairs (SR-GRP) [1, 3] is a variant of the Stable Roommates problem involving ties and incomplete lists in which each pair of compatible agents has a weight assigned to their potential pairing, and the preference lists of each agent can be derived from these weights in the obvious manner: given two compatible pairs $\{a_i, a_j\}$ and $\{a_i, a_k\}$, $a_i$ prefers $a_j$ to $a_k$ if and only if $w(\{a_i, a_j\}) > w(\{a_i, a_k\})$. This problem can be restricted to give the Stable Marriage problem with Ties, Incomplete lists, and Globally Ranked Pairs (SMTI-GRP) by splitting the agents into two sets as per the Stable Marriage problem.

In this work, we study one application of SMTI-GRP involving the pairing of children with adoptive families as coordinated by the British charity Coram (Coram | Better chances for children since 1739, https://www.coram.org.uk). Social workers determine a weight to be assigned to each child–family pair $(c, f)$, as a predicted measure of the suitability of placing $c$ with $f$, giving an instance of SMTI-GRP. Currently Coram is using a clearing house system which pairs children and families at suitable specified intervals. Similar to the case for kidney exchange programmes [39], this allows for a more efficient pairing of children and families, at the cost of a slightly increased delay between entering the system and being paired. In such a system Coram has decided that the goal should be to find a stable matching that pairs as many children as possible and/or has maximum overall weight. (The child–family pairings in a computed stable matching are treated merely as suggestions that will be investigated further by social workers for suitability before any actual assignments are made.) Moreover, Coram would like to ensure that the computed matching is viable in the long term. To this end, a lower bound, or threshold, on suitable weights is used to create refined instances of SMTI-GRP where child–family pairs with weights below the threshold are not allowed to be matched together. However, attempts to determine appropriate threshold values, as well as good weighting functions and suitable intervals between matching runs, have been hampered by the lack of tractable algorithms for finding maximum weight stable matchings for such instances. Indeed, in the SMTI-GRP setting, NP-hardness holds for each of the problems of finding a maximum size stable matching [1] and a maximum weight stable matching [4].

Whilst SMTI is a one-to-one matching problem, in some applications one set of agents can be matched with more than one partner. The Hospitals / Residents (HR) problem [11, 30] is a many-to-one extension of SMI that models the assignment of intending junior doctors (residents) to hospitals. Each doctor is to be assigned to at most one hospital, whilst each hospital may be assigned multiple doctors up to some given capacity. HR can be generalised to include ties in the preference lists, leading to the Hospitals / Residents problem with Ties (HRT), the many-to-one generalisation of SMTI. HRT has many applications: it models, for example, the assignment of medical graduates to Scottish hospitals as part of the Scottish Foundation Allocation Scheme (SFAS), which ran between 1999 and 2012. Since then, the UK has amalgamated all such schemes into the UK Foundation Programme, which handles the assignment of almost 8000 doctors to approximately 7000 positions across 20 Foundation Schools, each of which consists of multiple hospitals [43]. In this setting a key aim is to find a stable matching of maximum size, which is an NP-hard problem in view of the NP-hardness of MAX-SMTI.

An overview of the differences between problems discussed in the paper is given in Table 1. The relationships between these problems are demonstrated in Figure 1. In the diagram, an arrow from problem A to problem B indicates that problem B is a special case of problem A. For example, SMTI-SYM is the special case of SMTI-GRP in which preferences are symmetric.

Variant    Bipartite  Incompatible pairs  Ties  Weights  Capacity
SR         No         No                  No    No       No
SR-GRP     No         Yes                 Yes   Yes      No
SM         Yes        No                  No    No       No
SMI        Yes        Yes                 No    No       No
SMT        Yes        No                  Yes   No       No
SMTI       Yes        Yes                 Yes   No       No
SMTI-GRP   Yes        Yes                 Yes   Yes      No
SMTI-SYM   Yes        Yes                 Yes   Yes      No
HRT        Yes        Yes                 Yes   No       Yes
Table 1: Summary of matching problems

[Diagram with nodes SR, SR-GRP, SM, SMI, SMT, SMTI, SMTI-GRP, SMTI-SYM and HRT, connected by arrows from each problem to its special cases.]

Figure 1: Relationships between matching problems

1.2 Our contribution

In this paper we have developed several new techniques that improve the performance of ILP models for instances of both SMTI and HRT. Our first contribution is to present two algorithms for preprocessing instances of SMTI with ties on both sides. Without such preprocessing, only 6 of 22 real-world instances from Coram could be solved within an hour per instance using state-of-the-art models from the literature. Our new preprocessing significantly improves this, finding solutions to 21 of the 22 instances in an average of 434 seconds. We also present new ILP models for SMTI and HRT. These use dummy variables to reduce the number of non-zero entries in their corresponding constraint matrices, which vastly increases the sparsity of the constraint matrix at the cost of additional variables. Further, we formulate different sets of constraints to model stability, including the use of redundant constraints to improve the continuous relaxations of our models. We test each of these individually, and these improvements together allow us to find solutions to all real-world instances in a mean runtime of less than 60 seconds. Turning to randomly-generated instances, the new models also solve all 30 random instances of SMTI that we generated with agents on either side and preference lists of length 5 on one side, while existing state-of-the-art models could only solve 20. We extend our new ILP models to HRT, where we show a reduction in the mean runtime on existing real-world instances of HRT from SFAS, decreasing the average runtime from 144 seconds to only 3 seconds. We also generate 90 random instances that mimic the UK Foundation Programme (with about 7500 doctors and positions). Existing models solve 66 of these, while our new models solve 81.

1.3 Related work

MAX-SMTI is known to be NP-hard even if each tie occurs at the end of some agent’s preference list, ties occur on one side only and each tie is of length two [31]. The special case of MAX-SMTI that asks if an instance of SMTI has a stable matching that matches every man and woman is also NP-complete [31], and this result holds even when preference lists have lengths of at most 3 and ties occur on one side only [22].

MAX-SMTI is also not approximable within a factor of 21/19 [17] unless P = NP, even if preferences on one side are strictly ordered, and on the other side are either strictly ordered or a tie of length two. The best currently-known performance guarantee is 3/2, achieved first in non-linear running time [34] and later improved to linear time [24, 36], although better guarantees can be achieved in certain restrictions [23]. Király [24] shows how to extend his 3/2-approximation algorithm for MAX-SMTI to HRT.

The Stable Marriage problem with Ties, Incomplete lists and Symmetric preferences (SMTI-SYM) is a restriction of SMTI-GRP such that (i) for each man–woman pair $(m, w)$, the rank of $w$ in $m$’s list, i.e., the integer $k$ such that $w$ belongs to the $k$th tie in $m$’s list, is equal to the rank of $m$ in $w$’s list, and (ii) the weight of $(m, w)$ is precisely this integer $k$. Finding a maximum size stable matching in an instance of SMTI-SYM is NP-hard, and therefore the same result holds for SMTI-GRP [1]. Given an instance of SMTI-GRP, if the goal is to find a matching that maximises the total weight rather than the total size, this problem is also NP-hard [4].

Linear programming models for SM and SMI have been long studied, and stable matchings correspond exactly to extreme points of the solution polytopes of such models [16, 44]. These formulations have been extended to give ILP models for finding maximum size stable matchings in instances of SMTI and HRT [26, 27]. ILP models have also been given for a common extension of HR that allows doctors to apply as couples, typically so that both members can be matched to hospitals that are geographically close [2, 7, 18, 28, 33]. Other techniques in the field include constraint programming, which has been applied to SM and its variants [14, 15, 32, 35], and the use of SAT models and SAT solvers [7, 15].

Diebold and Bichler [6] performed a thorough experimental study of eight algorithms for HRT, giving a comparison of these algorithms when applied to real-world HRT instances derived from a course allocation system at the Technical University of Munich. These datasets ranged in size from 18-733 students (the “doctors”) and 3-43 courses (the “hospitals”). The authors measured a number of attributes of the algorithms, including the sizes of the computed stable matchings. The methods that they considered included three exact algorithms for MAX-HRT based on the ILP model presented in [27].

Slaugh et al. [42] described improvements they had made to the mechanism for matching children to adoptive families as utilized by the Pennsylvania Adoption Exchange. The process is semi-decentralized in that up to ten match attempts are made against families when each child arrives. By contrast, the more centralized process adopted by Coram involves a pool of children and families building up over time, leading to the use of a matching algorithm for the resulting two-sided matching problem.

For more details on the diverse variants of stable matching problems, we direct the reader to [29] and for an economic overview of these problems we recommend [40].

1.4 Layout of the paper

The rest of the paper is organised as follows. Section 2 defines the problems that are studied in this paper, and we introduce and discuss existing models for these in Section 3. This is followed by a theorem and two algorithms for preprocessing instances of SMTI in order to reduce instance sizes, in Section 4. Section 5 introduces our first new model, which reduces the number of non-zero elements in the constraint matrix through dummy variables. Further models are presented in Section 6 with new stability constraints. We demonstrate our new models and improvements experimentally in Section 7 and we provide some conclusions in Section 8.

2 Problem definitions

In this section we give formal definitions of the three key problems that we consider in this paper.

2.1 Stable Marriage with Ties and Incomplete Lists

An instance $I$ of the Stable Marriage problem with Ties and Incomplete lists (SMTI) comprises a set $C$ of children and a set $F$ of families, where each child (respectively family) ranks a subset of the families (respectively children) in order of preference, possibly with ties. We say that a child $c \in C$ finds a family $f \in F$ acceptable if $f$ belongs to $c$’s preference list, and we define acceptability for a family in a similar way. We assume that preference lists are consistent, that is, given a child–family pair $(c, f)$, $c$ finds $f$ acceptable if and only if $f$ finds $c$ acceptable. If $c$ does find $f$ acceptable then we call $(c, f)$ an acceptable pair.

A matching $M$ in $I$ is a subset of acceptable pairs such that, for each agent $a \in C \cup F$, $a$ appears in at most one pair in $M$. If $a$ appears in a pair of $M$, we say that $a$ is matched, otherwise $a$ is unmatched. In the former case, $M(a)$ denotes $a$’s partner in $M$, that is, if $(c, f) \in M$, then $M(c) = f$ and $M(f) = c$. We now define stability, which is the key condition that must be satisfied by a matching in $I$.

Definition 1.

Let $I$ be an instance of SMTI and let $M$ be a matching in $I$. A child–family pair $(c, f)$ is a blocking pair of $M$, or blocks $M$, if

  1. $(c, f)$ is an acceptable pair,

  2. either $c$ is unmatched in $M$ or $c$ prefers $f$ to $M(c)$, and

  3. either $f$ is unmatched in $M$ or $f$ prefers $c$ to $M(f)$.

$M$ is said to be stable if it admits no blocking pair.
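
Definition 1 translates directly into a check over all acceptable pairs. The following is a minimal sketch, not taken from the paper: the function name `blocking_pairs`, the rank dictionaries and the toy instance are all hypothetical, and preferences are assumed to be stored as ranks (1 for the most-preferred tie), with unacceptable agents simply absent.

```python
from typing import Dict, Iterable, List, Tuple

# ranks[a][b] = rank of b in a's list (1 = most-preferred tie); b absent means unacceptable.
Ranks = Dict[str, Dict[str, int]]

def blocking_pairs(child_rank: Ranks, family_rank: Ranks,
                   matching: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return every child-family pair that blocks `matching` in the sense of Definition 1."""
    matching = list(matching)
    family_of = {c: f for c, f in matching}
    child_of = {f: c for c, f in matching}
    blocks = []
    for c, prefs in child_rank.items():
        for f, rank_f in prefs.items():                 # condition 1: (c, f) is acceptable
            cur_f = family_of.get(c)
            # condition 2: c is unmatched or strictly prefers f to its partner
            c_gains = cur_f is None or rank_f < prefs[cur_f]
            cur_c = child_of.get(f)
            # condition 3: f is unmatched or strictly prefers c to its partner
            f_gains = cur_c is None or family_rank[f][c] < family_rank[f][cur_c]
            if c_gains and f_gains:
                blocks.append((c, f))
    return blocks

# Hypothetical instance: c1 is indifferent between f1 and f2, f1 between c1 and c2.
child_rank = {"c1": {"f1": 1, "f2": 1}, "c2": {"f1": 1}}
family_rank = {"f1": {"c1": 1, "c2": 1}, "f2": {"c1": 1}}

small = [("c1", "f1")]                  # stable, size 1
large = [("c1", "f2"), ("c2", "f1")]    # stable, size 2
print(blocking_pairs(child_rank, family_rank, small))
print(blocking_pairs(child_rank, family_rank, large))
print(blocking_pairs(child_rank, family_rank, []))
```

In this toy instance both `small` and `large` admit no blocking pair, so stable matchings of different sizes coexist, which is what makes the maximum size variant an optimisation problem.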

In SMTI, the goal is to find an arbitrary stable matching. We denote the problem of finding a maximum size stable matching, given an instance of SMTI, by MAX-SMTI.

2.2 Globally Ranked Pairs

An instance $I$ of the Stable Marriage problem with Ties, Incomplete lists and Globally-Ranked Pairs (SMTI-GRP) comprises a set $C$ of children, a set $F$ of families, a subset $X \subseteq C \times F$ of acceptable child–family pairs, and a weight function $w$ that assigns a weight $w(c, f)$ to each pair $(c, f) \in X$.

The set of acceptable pairs and the weight function are used to define the SMTI instance $I'$ corresponding to $I$ as follows: for any two acceptable pairs $(c, f)$ and $(c, f')$, $c$ prefers $f$ to $f'$ if $w(c, f) > w(c, f')$, and $c$ is indifferent between $f$ and $f'$ if $w(c, f) = w(c, f')$. Preference lists of families are constructed in a similar fashion. A stable matching in $I$ can then be defined by applying Definition 1 to $I'$.

Given a matching $M$ in $I$, the weight of $M$, denoted by $w(M)$, is defined to be $\sum_{(c, f) \in M} w(c, f)$. The problem of finding a stable matching of maximum size is called MAX-SMTI-GRP, and the problem of finding a stable matching of maximum weight is called MAX-WT-SMTI-GRP.

Given an instance $I$ of MAX-WT-SMTI-GRP, we can construct a refined instance $I_t$ of MAX-WT-SMTI-GRP from $I$ by setting a threshold value $t$ with the effect that the acceptable pairs in $I_t$ are precisely the acceptable pairs in $I$ which have weight at least $t$. The effect of imposing different threshold values on $I$ is of interest to Coram.

Example 1.

Our first example demonstrates how different threshold values create instances of SMTI-GRP with differently sized maximum size stable matchings. Let be a set of children, be a set of families, and let the weight function be defined by the following table:

95 85 80
95 80 80
80 45 75

By taking we obtain an instance of SMTI-GRP in which all pairs are acceptable. In this instance, is the unique maximum weight stable matching and its weight is 255. However, if we take and construct an instance of SMTI-GRP, then the only acceptable pair that involves is and no stable matching can involve . The unique maximum weight stable matching is then , which has a weight of 180.
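
The two weights quoted in this example can be reproduced by brute force. The sketch below is illustrative only and rests on assumptions that are not stated in the text: row k of the table is taken to hold the weights of child k and column k those of family k (0-based indices in the code), and the thresholds 0 and 80 are values chosen here so that, respectively, all pairs are acceptable and the third child keeps a single acceptable pair. The function names are hypothetical.

```python
from itertools import product

# Weight table of Example 1. ASSUMPTION: rows are children, columns are families.
W = [[95, 85, 80],
     [95, 80, 80],
     [80, 45, 75]]

def stable_matchings(weights, t):
    """Enumerate all stable matchings when only pairs of weight >= t are acceptable."""
    n_c, n_f = len(weights), len(weights[0])
    acceptable = [(c, f) for c in range(n_c) for f in range(n_f) if weights[c][f] >= t]
    found = []
    for assign in product([None, *range(n_f)], repeat=n_c):    # assign[c] = family or None
        used = [f for f in assign if f is not None]
        if len(used) != len(set(used)):
            continue                                           # some family used twice
        if any(f is not None and weights[c][f] < t for c, f in enumerate(assign)):
            continue                                           # contains an unacceptable pair
        child_of = {f: c for c, f in enumerate(assign) if f is not None}

        def blocks(c, f):
            # Higher weight means strictly preferred; equal weight means indifference.
            c_gains = assign[c] is None or weights[c][f] > weights[c][assign[c]]
            f_gains = f not in child_of or weights[c][f] > weights[child_of[f]][f]
            return c_gains and f_gains

        if not any(blocks(c, f) for c, f in acceptable):
            found.append({(c, f) for c, f in enumerate(assign) if f is not None})
    return found

def weight(weights, m):
    return sum(weights[c][f] for c, f in m)

def max_weight_stable(weights, t):
    return max(stable_matchings(weights, t), key=lambda m: weight(weights, m))

for t in (0, 80):
    m = max_weight_stable(W, t)
    print(t, sorted(m), weight(W, m))
```

Under these assumptions the search returns a matching of weight 255 with all three children placed for threshold 0, and a matching of weight 180 that leaves the third child unmatched for threshold 80, in agreement with the figures quoted in the example.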

Example 2.

Our second instance of SMTI-GRP is intended to show that a maximum weight stable matching may be smaller in size than a maximum size stable matching. Let , ,

and the weight function be given by the following table:

1
4 4
3 4
4 1

Let and . It is easy to verify that and are both stable matchings. However and , whereas and .

2.3 Hospitals / Residents with Ties

An instance $I$ of the Hospitals / Residents problem with Ties (HRT) comprises a set $D$ of resident doctors and a set $H$ of hospitals. Each doctor (respectively hospital) ranks a subset of the hospitals (respectively doctors) in order of preference, possibly with ties. Additionally, each hospital $h \in H$ has a capacity $c_h$, meaning that $h$ can be assigned at most $c_h$ doctors, while each doctor is assigned to at most one hospital. The definitions of the terms consistent and acceptable are analogous to the SMTI case.

A matching $M$ in $I$ is a subset of acceptable pairs such that each doctor appears in at most one pair, and each hospital $h$ appears in at most $c_h$ pairs. Given a doctor $d$, the terms matched and unmatched, and the notation $M(d)$, are defined as in the SMTI case. Given a hospital $h$, we let $M(h) = \{d \in D : (d, h) \in M\}$. We say that $h$ is full or undersubscribed in $M$ if $|M(h)| = c_h$ or $|M(h)| < c_h$, respectively. We next define stability by extending Definition 1 to the HRT case.

Definition 2.

Let $I$ be an instance of HRT and let $M$ be a matching in $I$. A doctor–hospital pair $(d, h)$ is a blocking pair of $M$, or blocks $M$, if

  1. $(d, h)$ is an acceptable pair,

  2. either $d$ is unmatched in $M$ or $d$ prefers $h$ to $M(d)$, and

  3. either $h$ is undersubscribed in $M$ or $h$ prefers $d$ to some member of $M(h)$.

$M$ is said to be stable if it admits no blocking pair.
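
The only change relative to the one-to-one case is the third condition, which now looks at capacity and at the worst doctor currently assigned. A minimal sketch follows; the function name `hrt_blocking_pairs`, the data layout and the toy instance are hypothetical and serve only to illustrate Definition 2.

```python
def hrt_blocking_pairs(doctor_rank, hospital_rank, capacity, matching):
    """Return every doctor-hospital pair that blocks `matching` in the sense of Definition 2.

    doctor_rank[d][h] and hospital_rank[h][d] are ranks (1 = most-preferred tie);
    capacity[h] is the number of places at hospital h.
    """
    assigned = {h: [] for h in hospital_rank}
    hospital_of = {}
    for d, h in matching:
        assigned[h].append(d)
        hospital_of[d] = h
    blocks = []
    for d, prefs in doctor_rank.items():
        for h, rank_h in prefs.items():                    # condition 1: acceptable pair
            cur = hospital_of.get(d)
            d_gains = cur is None or rank_h < prefs[cur]   # condition 2
            undersubscribed = len(assigned[h]) < capacity[h]
            # condition 3: free place, or d strictly better than some assigned doctor
            h_gains = undersubscribed or any(
                hospital_rank[h][d] < hospital_rank[h][e] for e in assigned[h])
            if d_gains and h_gains:
                blocks.append((d, h))
    return blocks

# Hypothetical instance: h1 has two places and is indifferent between d2 and d3.
doctor_rank = {"d1": {"h1": 1}, "d2": {"h1": 1, "h2": 2}, "d3": {"h1": 1, "h2": 2}}
hospital_rank = {"h1": {"d1": 1, "d2": 2, "d3": 2}, "h2": {"d2": 1, "d3": 1}}
capacity = {"h1": 2, "h2": 1}

full = [("d1", "h1"), ("d2", "h1"), ("d3", "h2")]
partial = [("d1", "h1"), ("d2", "h2")]
print(hrt_blocking_pairs(doctor_rank, hospital_rank, capacity, full))
print(hrt_blocking_pairs(doctor_rank, hospital_rank, capacity, partial))
```

In the second matching h1 is undersubscribed, so both d2 and d3 block with h1; in the first, h1 is full and d3 is only tied with d2, so no pair blocks.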

As in the SMTI case, the problem of finding a maximum size stable matching, given an instance of HRT, is denoted MAX-HRT.

While the definition for HRT does allow an arbitrary number of preferences to be expressed by any doctor, in reality doctors’ preference lists are often short: for example in the SFAS application until 2009, every doctor’s list was of length 6.

3 Existing formulations

The first mathematical models for SM were proposed in the late 1980s independently by Gusfield and Irving [16] and by Vande Vate [44]. Rothblum [41] extended Vande Vate’s model to SMI. In the following, we show how to extend Rothblum’s model to formulate both MAX-SMTI and MAX-HRT, as was done previously by Kwanashie and Manlove [27]. These existing models for MAX-SMTI and MAX-HRT are described here as they will be extended in later sections.

3.1 Mathematical model for MAX-SMTI

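Based on our Coram application, we will adopt the terminology from that context when presenting models for MAX-SMTI. When reasoning about models, we will use $i$ and $j$ to represent a child and family, rather than $c$ and $f$, respectively, as $i$ and $j$ are by convention more typically used as subscript variables. Let us consider the following notation, where $C$ and $F$ denote the sets of children and families:

  • $F(i)$ is the set of families acceptable for child $i$ ($i \in C$).

  • $C(j)$ is the set of children acceptable for family $j$ ($j \in F$).

  • $r^c_i(j)$ is the rank of family $j$ for child $i$, defined as the integer $k$ such that $j$ belongs to the $k$th most-preferred tie in $i$’s list ($i \in C$, $j \in F(i)$). The smaller the value of $r^c_i(j)$, the better family $j$ is ranked for child $i$.

  • $r^f_j(i)$ is the rank of child $i$ for family $j$, defined as the integer $k$ such that $i$ belongs to the $k$th most-preferred tie in $j$’s list ($j \in F$, $i \in C(j)$). The smaller the value of $r^f_j(i)$, the better child $i$ is ranked for family $j$.

  • $F^{\le}_j(i)$ is the set of families that child $i$ ranks at the same level or better than family $j$, that is, $F^{\le}_j(i) = \{j' \in F(i) : r^c_i(j') \le r^c_i(j)\}$.

  • $C^{\le}_i(j)$ is the set of children that family $j$ ranks at the same level or better than child $i$, that is, $C^{\le}_i(j) = \{i' \in C(j) : r^f_j(i') \le r^f_j(i)\}$.

By introducing binary decision variables $x_{ij}$ that take value $1$ if child $i$ is matched with family $j$, and $0$ otherwise, MAX-SMTI can be modelled as follows:

\begin{align}
\max \quad & \sum_{i \in C} \sum_{j \in F(i)} x_{ij} \tag{1} \\
\text{s.t.} \quad & \sum_{j \in F(i)} x_{ij} \le 1, && i \in C, \tag{2} \\
& \sum_{i \in C(j)} x_{ij} \le 1, && j \in F, \tag{3} \\
& 1 - \sum_{q \in F^{\le}_j(i)} x_{iq} \le \sum_{p \in C^{\le}_i(j)} x_{pj}, && i \in C,\ j \in F(i), \tag{4} \\
& x_{ij} \in \{0, 1\}, && i \in C,\ j \in F(i). \tag{5}
\end{align}

The objective function (1) maximises the number of children assigned. If instead one wants to maximise the score of the children assigned (as in MAX-WT-SMTI-GRP), it is enough to use $w_{ij} x_{ij}$ in place of $x_{ij}$ in the objective function, where $w_{ij}$ denotes the weight of pair $(i, j)$. Constraints (2) ensure that each child is matched with at most one family and constraints (3) impose that each family is matched with at most one child. Finally, constraints (4) enforce stability by ruling out the existence of any blocking pair. More specifically, they ensure that if child $i$ is not matched with family $j$ or any other family they rank at the same level or better than $j$ (i.e., $\sum_{q \in F^{\le}_j(i)} x_{iq} = 0$), then family $j$ is matched with a child it ranks at the same level or better than $i$ (i.e., $\sum_{p \in C^{\le}_i(j)} x_{pj} = 1$).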
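
The role of the stability constraints can be checked by brute force on a toy instance. The sketch below is not the authors' implementation: it assumes that constraint (4) reads as described in the text (one minus the sum of the variables for families that child i ranks at least as highly as j is at most the sum of the variables for children that family j ranks at least as highly as i), and the instance and all names (`child_rank`, `family_rank`, `feasible`) are hypothetical.

```python
from itertools import product

# Hypothetical instance: rank 1 = most-preferred tie; missing entries are unacceptable.
child_rank = {"c1": {"f1": 1, "f2": 1}, "c2": {"f1": 1}}
family_rank = {"f1": {"c1": 1, "c2": 1}, "f2": {"c1": 1}}
pairs = [(i, j) for i in child_rank for j in child_rank[i]]   # one variable per acceptable pair

def feasible(x):
    """Check the matching constraints (2)-(3) and the stability constraints (4)."""
    for i in child_rank:                                      # (2): at most one family per child
        if sum(x[i, j] for j in child_rank[i]) > 1:
            return False
    for j in family_rank:                                     # (3): at most one child per family
        if sum(x[i, j] for i in family_rank[j]) > 1:
            return False
    for i, j in pairs:                                        # (4): (i, j) must not block
        lhs = 1 - sum(x[i, q] for q in child_rank[i]
                      if child_rank[i][q] <= child_rank[i][j])
        rhs = sum(x[p, j] for p in family_rank[j]
                  if family_rank[j][p] <= family_rank[j][i])
        if lhs > rhs:
            return False
    return True

solutions = []
for bits in product((0, 1), repeat=len(pairs)):               # all 0/1 assignments
    x = dict(zip(pairs, bits))
    if feasible(x):
        solutions.append({p for p in pairs if x[p]})

best = max(solutions, key=len)                                # objective (1): maximum size
print(solutions)
print(best)
```

On this instance exactly two assignments are feasible, a stable matching of size one and a stable matching of size two, and objective (1) selects the latter. Enumeration is only viable for tiny instances; in practice the model would be passed to an ILP solver.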

3.2 Mathematical model for MAX-HRT

An adaptation of model (1)-(5) for MAX-HRT was proposed in [27]. It uses the same notation that was used for MAX-SMTI except that:

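  • The term “family” is replaced by “hospital” and $F$, $F(i)$, and $F^{\le}_j(i)$ are changed into $H$, $H(i)$, and $H^{\le}_j(i)$, respectively.

  • The term “child” is replaced by “doctor” and $C$, $C(j)$, and $C^{\le}_i(j)$ are changed to $D$, $D(j)$, and $D^{\le}_i(j)$, respectively.

  • The capacity of hospital $j$ is denoted by $c_j$.

By introducing binary decision variables $x_{ij}$ that take value $1$ if doctor $i$ is assigned to hospital $j$, and $0$ otherwise, MAX-HRT can be modelled as follows:

\begin{align}
\max \quad & \sum_{i \in D} \sum_{j \in H(i)} x_{ij} \tag{6} \\
\text{s.t.} \quad & \sum_{j \in H(i)} x_{ij} \le 1, && i \in D, \tag{7} \\
& \sum_{i \in D(j)} x_{ij} \le c_j, && j \in H, \tag{8} \\
& c_j \Big( 1 - \sum_{q \in H^{\le}_j(i)} x_{iq} \Big) \le \sum_{p \in D^{\le}_i(j)} x_{pj}, && i \in D,\ j \in H(i), \tag{9} \\
& x_{ij} \in \{0, 1\}, && i \in D,\ j \in H(i). \tag{10}
\end{align}

While the meaning of the objective function and constraints (7) remains the same, constraints (8) now ensure that each hospital does not exceed its capacity. Constraints (9) are the adaptation of the stability constraints (4) when capacity is considered. More specifically, they ensure that if doctor $i$ is not assigned to hospital $j$ or any other hospital they rank at the same level or higher than $j$ (i.e., $\sum_{q \in H^{\le}_j(i)} x_{iq} = 0$), then hospital $j$ has filled its capacity with doctors it ranks at the same level or higher than $i$ (i.e., $\sum_{p \in D^{\le}_i(j)} x_{pj} = c_j$).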
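
As a small worked instance of the capacitated stability constraint, assume (purely for illustration) a hospital $j$ with capacity $c_j = 2$, a doctor $i$ who ranks $j$ alone in first place, so that $j$ is the only hospital that $i$ ranks at the same level or higher than $j$, and a single other doctor $p$ whom $j$ ranks at the same level or higher than $i$. Taking constraint (9) in the form described above, with the capacity multiplying the left-hand side, it reads:

```latex
\begin{align*}
2\,(1 - x_{ij}) &\le x_{ij} + x_{pj}, \\
x_{ij} = 0 \;&\Rightarrow\; 2 \le x_{pj} \le 1 \quad \text{(impossible)}, \\
x_{ij} = 1 \;&\Rightarrow\; 0 \le 1 + x_{pj} \quad \text{(always satisfied)}.
\end{align*}
```

Every feasible solution therefore has $x_{ij} = 1$. This matches the definition of stability: hospital $j$ cannot fill both of its places with doctors it ranks at least as highly as $i$ unless $i$ is one of them, so any matching that places $i$ elsewhere would be blocked by the pair $(i, j)$.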

3.3 Discussion on the models

Although the model for SM was proposed almost thirty years ago, the computational behaviour of its extension to MAX-SMTI and MAX-WT-SMTI-GRP (i.e., in one-to-one instances specifically) has never been studied, to the best of our knowledge. However, we mention that our direct implementation of (1)-(5) on real-world MAX-WT-SMTI-GRP instances involving children, families, and a large list of preferences cannot be solved by state-of-the-art solvers within hours. Indeed, the model becomes too difficult as it requires up to stability constraints, each of them including nonzero elements (i.e., up to ).

Regarding MAX-HRT, computational experiments with (6)-(10) applied to real-world and randomly generated instances have been carried out previously [6, 26, 27]. Kwanashie [26] observed a significant increase in terms of average running time when the number of doctors goes above 400. As our objective is to solve instances of the magnitude of the UK Foundation Programme application (involving almost doctors and hospitals), the model in its current form is not suitable.

In the next sections, we introduce various techniques aimed at reducing the size of the two models and strengthening their continuous relaxation.

4 Preprocessing SMTI with ties on both sides

It is quite common in combinatorial optimisation to use some simple analysis to fix the optimal value of a subset of variables and, thus, reduce the problem size. This is particularly useful for stable matching problems as one variable, one stability constraint, and up to non-zero elements are associated with each acceptable pair. Two procedures, “Hospitals-offer” and “Residents-apply”, have been proposed for removing acceptable pairs that cannot be part of any stable matching for HRT when ties only occur in hospitals’ preference lists [21].

When ties can belong to the preference lists of both sets of agents, a reduction technique is known for the special case of SMTI in which preference lists on one side are of length at most two [22]. However the aforementioned preprocessing algorithms are not applicable to arbitrary instances of SMTI. In this section we introduce a new sufficient condition to find a set of acceptable pairs that cannot be part of any stable matching for SMTI. We then propose two greedy algorithms to detect such pairs which can then be removed from the instance without affecting any stable matching. Our technique is based on the following result.

Theorem 1.

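Let $I$ be an instance of SMTI. Given a child $i$ and a set of families $F'$ such that for every family $j \in F'$, $(i, j)$ is an acceptable pair, let $C'$ be the set of children that at least one family in $F'$ ranks at the same level or better than $i$, i.e., $C' = \bigcup_{j \in F'} C^{\le}_i(j)$. If $|F'| \ge |C'|$, then in any stable matching $M$, child $i$ will be matched with a family $j$ such that $r^c_i(j) \le \max_{j' \in F'} r^c_i(j')$.

Proof.

Assume for a contradiction that $M$ is a stable matching in $I$ in which child $i$ is matched with a family $j$ with $r^c_i(j) > \max_{j' \in F'} r^c_i(j')$ or is unmatched. Note that $i \in C'$ and that $i$ is not matched with any family in $F'$, so at most $|C'| - 1$ families in $F'$ can be matched with children in $C'$. Since $|F'| \ge |C'|$, at least one family $j' \in F'$ must be matched with some child $i' \notin C'$ or be unmatched. Then either $i$ is unmatched or prefers $j'$ to $M(i)$, and either $j'$ is unmatched or prefers $i$ to $M(j')$. In all cases $(i, j')$ blocks $M$, which is a contradiction. ∎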

There is no obvious efficient way to find, for each child, the set $F'$ that removes the largest number of acceptable pairs from an instance of SMTI. Instead we present two polynomial-time algorithms to find sets $F'$ that allow a significant number of acceptable pairs to be removed. Algorithm 1, “first-rank-family”, considers the first rank of children for each family $j$, i.e., the children that $j$ thinks are the most desirable. Algorithm 2, “full-child-preferences”, completely analyses the preference lists of the children to find reductions. Note that each of these algorithms can also be applied to the preferences of the other set of agents by symmetry to obtain “first-rank-child” and “full-family-preferences”, and that they may each be applied iteratively until no more reductions are possible.

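1: Input: An instance $I$ of SMTI with children $C$ and families $F$
2: Output: A set $R$ containing pairs that cannot be part of any stable matching
3: for each $C' \in \mathcal{P}(C)$ do    ▷ for each subset of children in the powerset
4:      $F_{C'} \leftarrow \emptyset$
5: end for
6: $R \leftarrow \emptyset$
7: for each family $j \in F$ do
8:      $C' \leftarrow$ the set of children family $j$ considers equally best
9:      $F_{C'} \leftarrow F_{C'} \cup \{j\}$
10: end for
11: for each $C' \in \mathcal{P}(C)$ do
12:      $F' \leftarrow F_{C'}$
13:      if $|F'| \ge |C'|$ then
14:           for each $i \in C'$ do
15:                for each $j \in F(i)$ with $r^c_i(j) > \max_{j' \in F'} r^c_i(j')$ do
16:                     $R \leftarrow R \cup \{(i, j)\}$
17:                end for
18:           end for
19:      end if
20: end for
21: return $R$
Algorithm 1 first-rank-family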

After initialisation (lines 3–6), Algorithm 1 considers each family $j$ in turn, determining the set $C'$ of children that family $j$ ranks as (equally) most desirable (line 8) and storing this fact (line 9). Once this has been recorded, the algorithm searches through all these stored sets (line 11) to find sets of children $C'$ and the set of families $F'$ which all consider the set $C'$ as their (equally) best choice. If the set of families is at least as big as the set of children (line 13) then for each child $i \in C'$ and each family $j$ ranked worse than the worst family in $F'$, we add the pair $(i, j)$ to our reduction set (lines 14–16).

As written, Algorithm 1 requires operations, as we must iterate over each possible subset of children (in both lines 3 and 11). However, if we only explicitly store the subsets  and generated by lines 7-10, we will obtain at most subsets and at most  subsets . To only store these specific subsets, we need to quickly look up whether such a set exists, and create it if it does not, before adding a family  to . A hashmap is a suitable data structure for carrying out these operations, and will reduce the overall complexity to .
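
The hashmap idea can be sketched as follows. This is an illustrative reading of Algorithm 1, not the authors' code: a dictionary keyed by the (frozen) set of equally-best children replaces the loop over the powerset, so only subsets that actually occur are stored. The function name, the rank-dictionary layout and the toy instance are hypothetical.

```python
from collections import defaultdict

def first_rank_family(child_rank, family_rank):
    """Return acceptable pairs (child, family) that cannot belong to any stable matching.

    child_rank[i][j] is the rank of family j for child i (1 = most-preferred tie);
    family_rank[j][i] is the rank of child i for family j.
    """
    # Lines 7-10: group families by their set of equally-best children.
    families_by_top = defaultdict(set)
    for j, ranks in family_rank.items():
        if not ranks:
            continue                                   # family with an empty list
        best = min(ranks.values())
        top = frozenset(i for i, r in ranks.items() if r == best)
        families_by_top[top].add(j)

    removed = set()
    # Lines 11-20: only the subsets that were actually stored are visited.
    for top, fams in families_by_top.items():
        if len(fams) >= len(top):
            for i in top:
                worst = max(child_rank[i][j] for j in fams)
                removed |= {(i, j) for j, r in child_rank[i].items() if r > worst}
    return removed

# Hypothetical instance: f1 and f2 both rank {a, b} as equally best; f3 ranks c first.
child_rank = {"a": {"f1": 1, "f2": 2, "f3": 3},
              "b": {"f1": 1, "f2": 1},
              "c": {"f3": 1, "f1": 2}}
family_rank = {"f1": {"a": 1, "b": 1, "c": 2},
               "f2": {"a": 1, "b": 1},
               "f3": {"c": 1, "a": 2}}

print(sorted(first_rank_family(child_rank, family_rank)))
```

Here two families share the two children a and b as their first rank, so a never needs f3, and c is the sole first choice of f3, so c never needs f1.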

Algorithm 2 incrementally builds up the sets $F'$ and $C'$ for each child $i$. To build $F'$, we simply add each family from the preference list of $i$ in order from most preferable to least (lines 6–7), considering agents within ties in increasing indicial order. At each step, when we have added $j$, we then add to $C'$ all children that $j$ finds at least as preferable as $i$ (line 8). By construction these satisfy Theorem 1. Thus, if $F'$ is large enough compared to $C'$, we add to our reduction all the pairs $(i, j')$ where $j'$ are the families ranked worse than the worst family in $F'$ (lines 9–11). Algorithm 2 requires steps as the outer (respectively middle and inner) for each loop is executed (respectively and ) times, and line 8 requires time.

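1: Input: An instance $I$ of SMTI with children $C$ and families $F$
2: Output: A set $R$ containing pairs that cannot be part of any stable matching
3: for each child $i \in C$ do
4:      $F' \leftarrow \emptyset$
5:      $C' \leftarrow \emptyset$
6:      for each $j \in F(i)$ do    ▷ for each family in descending order of preference
7:           $F' \leftarrow F' \cup \{j\}$
8:           $C' \leftarrow C' \cup C^{\le}_i(j)$
9:           if $|F'| \ge |C'|$ then
10:               for each $j' \in F(i)$ with $r^c_i(j') > \max_{j'' \in F'} r^c_i(j'')$ do
11:                    $R \leftarrow R \cup \{(i, j')\}$
12:               end for
13:               break
14:          end if
15:     end for
16: end for
17: return $R$
Algorithm 2 full-child-preferences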
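
A compact Python rendering of this greedy procedure is given below as a sketch under stated assumptions: preferences are stored as rank dictionaries, ties are traversed in label order (standing in for "increasing indicial order"), and the function name and toy instance are hypothetical rather than taken from the paper.

```python
def full_child_preferences(child_rank, family_rank):
    """Return acceptable pairs (child, family) that cannot belong to any stable matching.

    child_rank[i][j] is the rank of family j for child i (1 = most-preferred tie);
    family_rank[j][i] is the rank of child i for family j.
    """
    removed = set()
    for i, prefs in child_rank.items():
        fams, kids = set(), set()
        # Families from most to least preferred; ties broken by label.
        for j in sorted(prefs, key=lambda fam: (prefs[fam], str(fam))):
            fams.add(j)
            # Children that family j ranks at the same level or better than i.
            kids |= {p for p, r in family_rank[j].items() if r <= family_rank[j][i]}
            if len(fams) >= len(kids):
                # j is the worst family added so far; drop everything ranked below it.
                removed |= {(i, q) for q, r in prefs.items() if r > prefs[j]}
                break
    return removed

# Hypothetical instance: f1 and f2 both rank {a, b} as equally best; f3 ranks c first.
child_rank = {"a": {"f1": 1, "f2": 2, "f3": 3},
              "b": {"f1": 1, "f2": 1},
              "c": {"f3": 1, "f1": 2}}
family_rank = {"f1": {"a": 1, "b": 1, "c": 2},
               "f2": {"a": 1, "b": 1},
               "f3": {"c": 1, "a": 2}}

print(sorted(full_child_preferences(child_rank, family_rank)))
```

For child a the sets grow to two families and two children after the second family is added, at which point the condition of Theorem 1 holds and the pair with f3 is discarded; for child c the condition already holds after the first family.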

We note that: (i) this preprocessing is more powerful when the number of ranks (i.e., groups of tied elements) is high and when there are only a few agents in each rank, and (ii) rather than adding families in descending order of preference, more sophisticated heuristics could find a larger number of reductions at the cost of a higher time complexity. However, it is worth mentioning that our greedy approach works particularly well when there is a strong correlation between the scores obtained by a given agent among the other agents, e.g., if a family is ranked first for a given child, it also tends to be ranked highly by other children, which is the case in our application. We show in Section 7 that the greedy approaches given by Algorithms 1 and 2 can significantly reduce running times for our SMTI-GRP instances. We also remark that we did not try to extend Algorithms 1 and 2 to HRT instances with ties on both sides, as our practical application involving SFAS instances allows ties on one side only, and in such a setting we may apply Algorithms “Hospitals-offer” and “Residents-apply” from [21].

We conclude this section with an example of the application of Algorithms 1 and 2.

Example 3.

Let us consider an SMTI instance with 5 families and 4 children with the following preference lists:

In this example, child 1 is indifferent between families 1, 2, and 3, which are tied as its first choice. If none of these can be granted, child 1 next prefers to be matched with family 4.

We start by running “first-rank-child”, but we see that no two children share the same set of families as their first preference, so no acceptable pair is removed. We then run “first-rank-family”, which highlights that both and have the same pair of children as their equally-first choice ( and ). This tells us that children and will never be matched with a family that they prefer less than both and . Therefore, there is no need for to ever consider . This leaves the following preferences.

As the instance was reduced, we could now re-run “first-rank-child” to see if any further reductions are to be found. However, no more reductions will be found, and so we move on to “full-child-preferences” and “full-family-preferences”. We demonstrate the former on child to obtain the following sequence of sets and :

As , we know that cannot be matched with a family that would rank as worse than the worst family in . This means that will never consider , so the acceptable pair  can be removed, leaving the following reduced instance.

Since we did reduce the instance, it is possible that re-running one of the other algorithms might reduce the instance even further, but in this particular instance no more reductions can be found.

5 Reducing the number of non-zero elements

Even if the reduction procedures previously described remove a significant number of acceptable pairs, the models involved in real-world instances remain too large to be solved by state-of-the-art ILP solvers. There are constraints and variables and up to non-zero elements, depending on the length of the agents’ preference lists. In this section, we propose an alternative formulation for MAX-SMTI that uses dummy variables to keep track of the children’s and families’ assignments at each rank so that the overall number of non-zero elements is reduced. Let us consider the following additional notation:

  • is the number of distinct ranks (or ties) for child .

  • is the number of distinct ranks for family .

  • is the set of families acceptable for child with rank .

  • is the set of children acceptable for family with rank .

In addition, we introduce the dummy binary decision variables (respectively, ) that take value 1 if child (respectively, family ) is matched with a family (respectively, a child) of rank at most , and 0 otherwise (respectively, . Variables and can be seen as replacing the summations of and over the sets and . These variables have certain similarities with the cut-off scores for the college admission problem [2] and the radius formulation for the p-median problem [13].
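To make the meaning of the dummy variables concrete, the following minimal sketch computes, for a single agent, the cumulative indicators "matched at rank at most k" from a tiered preference list and a given partner. The function name `rank_indicators` and the list-of-tie-groups layout are assumptions made for illustration only.

```python
from typing import Hashable, List, Optional


def rank_indicators(
    tiers: List[List[Hashable]], partner: Optional[Hashable]
) -> List[int]:
    """Return [y_1, ..., y_g] where y_k = 1 iff `partner` belongs to one of
    the first k tie groups of `tiers` (best group first).

    An unmatched agent is represented by partner=None and gets all zeros.
    """
    indicators: List[int] = []
    matched = 0
    for group in tiers:
        if partner in group:
            matched = 1          # once reached, the indicator stays at 1
        indicators.append(matched)
    return indicators


# Hypothetical child with three ranks: {1, 2, 3} tied first, then 4, then 5.
tiers = [[1, 2, 3], [4], [5]]
y_rank2 = rank_indicators(tiers, 4)         # matched at the second rank
y_unmatched = rank_indicators(tiers, None)  # not matched at all
```

The indicators are non-decreasing in the rank, and the last entry equals 1 exactly when the agent is matched, which is why the last variable of each child can serve as a "matched" indicator.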

The new formulation for MAX-SMTI is:

(11)
 s.t. (12)
(13)
(14)
(15)
(16)
(17)
(18)
(19)

The objective function (11) now uses the last variable for each child (i.e., the one associated with its last rank) as an indicator of whether the child is assigned to a family. First, we note that even if (11) uses fewer non-zero elements than (1), both objective functions are equivalent. Second, the version of (1) that considers the weight of each pair should be used to solve MAX-WT-SMTI-GRP as (11) cannot be adapted for the problem. Constraints (12)-(15) maintain the coherence of variables and . Constraints (16) ensure the stability of the matching by using the new variables: if child was not matched with a family of rank or better (i.e., , that means that all families that child ranks at level were already matched with a child of better or equal rank (i.e., ). Finally, by imposing binary values, constraints (18)-(19) prevent any child or family from being matched more than once. Note that the model would also be valid if variables and were defined as continuous. However, preliminary experiments showed that it was not beneficial to do so.
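The stability condition described above can be checked on a given matching with a short sketch: for every acceptable pair, either the child is matched at the rank of that family or better, or the family is matched at the rank of that child or better. The helper names, the data layout, and the 0-based rank indices below are assumptions for illustration; the code checks a matching and does not build the ILP itself.

```python
from typing import Dict, Hashable, List, Optional

Tiers = List[List[Hashable]]  # tie groups, best group first


def rank_indicators(tiers: Tiers, partner: Optional[Hashable]) -> List[int]:
    """y_k = 1 iff `partner` is in one of the first k+1 tie groups (0-based)."""
    out, matched = [], 0
    for group in tiers:
        if partner in group:
            matched = 1
        out.append(matched)
    return out


def rank_of(tiers: Tiers, agent: Hashable) -> int:
    """0-based index of the tie group containing `agent`."""
    for k, group in enumerate(tiers):
        if agent in group:
            return k
    raise KeyError(agent)


def is_stable(
    child_tiers: Dict[Hashable, Tiers],
    family_tiers: Dict[Hashable, Tiers],
    matching: Dict[Hashable, Hashable],  # child -> family
) -> bool:
    """Check y_child[k] + y_family[rank of child] >= 1 for every acceptable pair."""
    partner_of_family = {fam: child for child, fam in matching.items()}
    for child, tiers in child_tiers.items():
        y_child = rank_indicators(tiers, matching.get(child))
        for k, group in enumerate(tiers):
            for fam in group:
                y_family = rank_indicators(
                    family_tiers[fam], partner_of_family.get(fam)
                )
                if y_child[k] + y_family[rank_of(family_tiers[fam], child)] < 1:
                    return False  # (child, fam) would be a blocking pair
    return True


# Hypothetical instance: family 1 strictly prefers "a", family 2 is indifferent.
children = {"a": [[1], [2]], "b": [[1], [2]]}
families = {1: [["a"], ["b"]], 2: [["a", "b"]]}
stable = is_stable(children, families, {"a": 1, "b": 2})
unstable = is_stable(children, families, {"a": 2, "b": 1})
```

Each test involves only two indicator values, which mirrors the remark that the stability constraints of the new formulation involve only two variables.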

Model (11)-(19) requires additional variables. It still uses stability constraints, but they now involve only two variables, which reduces the overall size of the model.

By adopting similar notation for MAX-HRT, where is the number of ranks (or ties) for doctor , is the number of ranks (ties) for hospital , is a binary decision variable that takes the value 1 if and only if doctor is assigned to a hospital of rank at most , and is an integer decision variable indicating how many doctors of rank at most are assigned to hospital , MAX-HRT becomes:

(20)
 s.t. (21)
(22)
(23)

where (12)-(15) and (17)-(18) are appropriately modified to follow HRT notation.

6 Alternative stability constraints

While dummy variables reduce the number of non-zero elements involved in the stability constraints, we introduce in this section some additional techniques that influence the number of stability constraints and the quality of the continuous relaxations of the models. It is well-known that the performance of an integer model depends not only on its size, but also on its linear relaxation. It has been shown in the literature that for several problems (see, e.g., the Bin Packing Problem [5] or the Resource-Constrained Project Scheduling Problem [25]), it may be beneficial to use larger models if they have a better continuous relaxation (i.e., one whose bound is closer to the optimal integer value).

6.1 Reduced stability constraints for MAX-SMTI

6.1.1 Constraint merging

Model (11)-(19) can be further reduced by merging, for a given child, all stability constraints with the same rank. Constraints (16) now become

(24)

This transformation reduces the size of the model, as it uses only stability constraints. However, as will be shown in the computational experiments section, it also leads to a deterioration of the continuous relaxation bound. We note that the reduction in terms of size with respect to model (11)-(19) is more significant when the number of ranks (i.e., tie groups) is low.
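The loss of relaxation strength can be illustrated numerically. The sketch below assumes that the merged constraint is obtained by summing the pairwise stability constraints over the families of one tie group, which is one natural reading of "merging" and not necessarily the exact form of (24). Under that assumption, both versions accept the same binary points, but the merged version also accepts fractional points that the pairwise version cuts off.

```python
from itertools import product
from typing import Sequence


def pairwise_ok(y_child: float, y_families: Sequence[float], tol: float = 1e-9) -> bool:
    """One constraint per family j in the tie group: y_child + y_family_j >= 1."""
    return all(y_child + y_f >= 1 - tol for y_f in y_families)


def merged_ok(y_child: float, y_families: Sequence[float], tol: float = 1e-9) -> bool:
    """ASSUMED merged form: the sum of the pairwise constraints,
    n * y_child + sum_j y_family_j >= n, with n the size of the tie group."""
    n = len(y_families)
    return n * y_child + sum(y_families) >= n - tol


# On binary points the two versions agree (checked here for a group of size 3).
same_on_binaries = all(
    pairwise_ok(yc, yf) == merged_ok(yc, yf)
    for yc in (0, 1)
    for yf in product((0, 1), repeat=3)
)

# A fractional point accepted by the merged constraint only.
fractional_child, fractional_families = 0.5, [1.0, 0.0]
merged_accepts = merged_ok(fractional_child, fractional_families)
pairwise_accepts = pairwise_ok(fractional_child, fractional_families)
```

In this example the point with child indicator 0.5 and family indicators (1, 0) satisfies the merged inequality with equality but violates the pairwise constraint for the second family, which is the kind of fractional solution that weakens the continuous relaxation bound.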

6.1.2 Double stability constraints

To compensate for the degradation of the continuous relaxation caused by the constraint merging, it is possible to use the additional stability constraints

(25)

These constraints can be seen as the counterparts of (24) when the merging is performed on the families instead of the children. These additional constraints improve the quality of the continuous relaxation with respect to the model that uses only (24). Overall, we observe a tradeoff between the number of stability constraints used in the model and the quality of the bound obtained by the continuous relaxation.

6.2 New stability constraints for MAX-HRT

For MAX-HRT, merging constraint (22) is not useful if there are no ties on the doctors’ side (i.e., if , . As our practical case allows ties on the hospitals’ side only, it is not an improvement we explored. In this section, we propose instead an enriched formulation for MAX-HRT that allows us to define a second set of stability constraints. We introduce new binary decision variables that take value 1 if hospital has entirely filled its capacity with doctors of rank at most , and 0 otherwise. An additional set of stability constraints for MAX-HRT is:

(26)
(27)