Dynamic Type Matching
Ming Hu
Rotman School of Management, University of Toronto, Toronto, Ontario, Canada M5S 3E6
Yun Zhou
DeGroote School of Business, McMaster University, Hamilton, Ontario, Canada L8S 4L8
Oct 21, 2018
We consider an intermediary’s problem of dynamically matching demand and supply of heterogeneous types in a periodic-review fashion. More specifically, there are two disjoint sets of demand and supply types, and a reward associated with each possible matching of a demand type and a supply type. In each period, demand and supply of various types arrive in random quantities. The platform’s problem is to decide on the optimal matching policy to maximize the total discounted rewards minus costs, given that unmatched demand and supply will incur waiting or holding costs, and will be carried over to the next period (with abandonment). For this dynamic matching problem, we provide sufficient conditions on matching rewards such that the optimal matching policy follows a priority hierarchy among possible matching pairs. We show those conditions are satisfied by vertically and unidirectionally horizontally differentiated types, for which quality and distance determine priority, respectively. As a result of the priority property, the optimal matching policy boils down to a match-down-to threshold structure when considering a specific pair of demand and supply types in the priority hierarchy.
Operations management is about managing the process of matching supply with demand. We consider a firm that periodically manages the matching between demand and supply. In each period, demand and supply of various types arrive in random quantities. Each “type” represents a distinct set of characteristics of demand or supply. Matching demand with supply generates a type- and time-dependent reward. With unmatched demand and supply fully or partially rolled over to the next period, the firm aims to maximize the total expected rewards (minus costs of waiting compensation for demand and inventory holding for supply).
The problem we describe above is crucial to many intermediaries who centrally manage matchings in a sharing economy. Sharing economy platforms often use crowdsourced supply and match it dynamically with customer demand. For example, commuter carpooling platforms such as UberCommute match a driver heading to a destination with a rider going to the same destination (or in the same direction). Amazon crowdsources inventories of an identical item from third-party merchants to its warehouses, to fulfill online orders. The nonprofit organization, United Network for Organ Sharing (UNOS), allocates donated organs to patients in need of transplantation. At the center of those business and nonprofit sharing-economy models, a platform is developed and maintained by an intermediary to enable sharing-economy activities. Those models have the following features.
Heterogeneous demand and supply types. From the intermediary firm’s perspective, matching between demand and supply of different characteristics often generates distinct rewards (or equivalently, mismatch costs). We refer to demand/supply of different characteristics as different types of demand/supply, and consider two possible ways in which demand/supply types differ from each other. In particular, types can be horizontally or vertically differentiated. Horizontal differentiation means that the characteristics of a type are not always superior or inferior to those of another type (regarding generating matching rewards). Instead, the matching reward between a demand type and a supply type is determined by the idiosyncratic fit between the two. For example, for a ride-hailing platform, riders and drivers are characterized by their locations, with the matching between a pair closer to each other generating a higher reward (i.e., a shorter waiting time for the rider and a shorter idle time for the driver). Vertical differentiation means quality differences in the demand/supply types. Under vertical differentiation, a particular type is always superior or inferior to a different type regarding generating matching rewards. For example, from the perspective of UNOS, patients and organs may differ in their health condition. A patient/donor in a better health condition, in general, leads to a better transplant outcome.
Time-variant uncertainty on both sides of the market. In contrast to conventional business models where supply is often treated as a decision (e.g., inventory replenishment decision) or a fixed capacity (e.g., in revenue management problems), crowdsourced supply in sharing economy activities may arrive at the system randomly and dynamically. For example, in ride-hailing activities, drivers decide, on their own, when and for how long they make themselves available to provide service. In Amazon’s inventory commingling program, third-party merchants use their own inventory-regulating policies and may be subject to various time-varying supply shocks.
We use a finite-horizon stochastic dynamic program to formulate the problem with the features mentioned above. Next, we present an overview of the main results of the paper, as well as the applications and implications of the model and the results.
A key result of the paper is the establishment of the modified Monge conditions. Under those conditions, a particular pair of demand and supply types should have “priority” over a neighboring pair (i.e., a pair sharing the same demand or supply type) in the optimal matching policy. This allows us to simplify the matching decision within a period, and focus on the tradeoff between matching in the current period and that in the future.
The optimal matching policy is complicated even for the static problem. For example, consider a specific period without accounting for future arrivals of demand and supply. On the one hand, one may want to prioritize the matching between type $i$ demand and type $j$ supply if the unit matching reward $r_{ij}$ is high. On the other hand, matching $i$ with $j$ may prevent both matching $i$ with another supply type $j'$ and matching another demand type $i'$ with $j$. If $r_{ij} < r_{ij'} + r_{i'j}$, it may be undesirable to prioritize matching $i$ with $j$. Moreover, the optimal matching policy is further complicated by the option of holding back a demand or supply type in the current period and matching it with future supply or demand. In other words, there are tradeoffs within a period, as well as across the current period and future periods.
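The within-period tradeoff can be illustrated with a minimal sketch. The reward matrix and unit quantities below are hypothetical numbers chosen for illustration, not taken from the paper: the highest single reward (3) is smaller than the sum of the two neighboring rewards (2 + 2), so prioritizing the highest-reward pair is not optimal even in a static, one-period problem.

```python
from itertools import product

# Hypothetical unit rewards r[i][j] for demand type i and supply type j.
r = [[3.0, 2.0],
     [2.0, 0.0]]
demand = [1, 1]  # one unit of each demand type (assumed)
supply = [1, 1]  # one unit of each supply type (assumed)

def total_reward(q):
    """Total reward of a matching q, where q[i][j] is the matched quantity."""
    return sum(r[i][j] * q[i][j] for i in range(2) for j in range(2))

def feasible(q):
    """A matching may not use more of any type than is available."""
    rows_ok = all(sum(q[i]) <= demand[i] for i in range(2))
    cols_ok = all(q[0][j] + q[1][j] <= supply[j] for j in range(2))
    return rows_ok and cols_ok

def greedy_by_reward():
    """Match arcs in descending order of unit reward, as much as possible."""
    d, s = demand[:], supply[:]
    q = [[0, 0], [0, 0]]
    arcs = sorted(((i, j) for i in range(2) for j in range(2)),
                  key=lambda a: -r[a[0]][a[1]])
    for i, j in arcs:
        m = min(d[i], s[j])
        q[i][j] = m
        d[i] -= m
        s[j] -= m
    return q

def best_integer_matching():
    """Brute-force search over all 0/1 matchings of this tiny instance."""
    best = None
    for vals in product(range(2), repeat=4):
        q = [list(vals[:2]), list(vals[2:])]
        if feasible(q) and (best is None or total_reward(q) > total_reward(best)):
            best = q
    return best

greedy_value = total_reward(greedy_by_reward())
optimal_value = total_reward(best_integer_matching())
```

In this instance the greedy rule matches the two "diagonal" pairs, while the brute-force search prefers the two "off-diagonal" pairs.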
Under the conditions we establish in this paper, we are able to prioritize the matching within a period. This allows us to focus on the tradeoff between the current period and future periods.
Then we study two special versions of the model, namely, the model with horizontally differentiated types (in short, the horizontal model) and the model with vertically differentiated types (in short, the vertical model). Both satisfy the established modified Monge condition.
The horizontal model. We consider demand and supply types distributed in a metric space (which can be considered as the space of characteristics of demand/supply). The matching reward between a demand type and a supply type depends on the “distance” between the two. The shorter the distance, the higher the reward. We start by studying the case with two demand types and two supply types. In that case, a perfect pair (i.e., type 1 demand with type 1 supply, or type 2 demand with type 2 supply, both associated with the highest unit matching rewards) should be prioritized and matched greedily, whereas an imperfect pair (i.e., type 1 demand with type 2 supply, or type 2 demand with type 1 supply, which has a lower unit matching reward compared with a perfect pair) should be considered only when the corresponding demand and supply types have sufficiently high levels (after the greedy matching of the perfect pairs), and should then be matched down to some threshold level. (We define perfect and imperfect pairs formally after Proposition 2.) Therefore, the main tradeoff is between a lower reward from matching an imperfect pair in the current period and a possibly higher reward from reserving demand/supply to form perfect pairs in a future period.
When there are multiple demand and supply types, we focus on the unidirectional case in which the type space is a directed line segment, and the supply travels along a given direction to reach the demand for the matching. We show that a shorter distance implies a higher priority, i.e., the optimal policy would assign a demand type to the closest available supply type.
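The two-step structure described above for the case with two demand and two supply types can be sketched as follows. This is only an illustration under stated assumptions: the function name, the list-based state, and in particular the single scalar `threshold` (applied to the smaller of the two leftover levels) are hypothetical placeholders. In the model the match-down-to level may depend on the period and on the state.

```python
def match_two_by_two(x, y, threshold):
    """Sketch of the policy structure for two demand and two supply types.

    x, y: levels [type 1, type 2] of demand and supply before matching.
    threshold: hypothetical match-down-to level for the imperfect pair.
    Returns (q, x_left, y_left), with q[i][j] the matched quantity.
    """
    x, y = list(x), list(y)
    q = [[0, 0], [0, 0]]

    # Step 1: match the perfect pairs (same index) greedily.
    for k in range(2):
        m = min(x[k], y[k])
        q[k][k] = m
        x[k] -= m
        y[k] -= m

    # Step 2: after step 1, at most one imperfect pair has leftovers on
    # both sides. Match it only down to the (assumed) threshold level.
    for i in range(2):
        j = 1 - i
        m = max(0, min(x[i], y[j]) - threshold)
        q[i][j] = m
        x[i] -= m
        y[j] -= m

    return q, x, y

q, x_left, y_left = match_two_by_two([5, 1], [2, 6], threshold=1)
```

With these illustrative inputs, both perfect pairs are exhausted first, and the imperfect pair (type 1 demand, type 2 supply) is matched until one unit of type 1 demand is held back for future periods.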
The horizontal model has the following applications.
Capacity management with upgrading. Upgrading uses a high-class supply to fulfill a low-class demand, which is widely adopted in travel industries (see, e.g., Yu et al. 2015) and in production/inventory settings (see, e.g., Bassok et al. 1999). Shumsky and Zhang (2009) study a revenue management problem with fixed initial capacities of various supply types, in which demand types can only be upgraded one level up. Yu et al. (2015) study the general upgrading problem, allowing a demand type to be matched with any higher-quality supply type. The upgrading reward structure in Yu et al. (2015) is a special case of unidirectionally horizontal types located along a line. Thus our results apply to a generalized capacity management problem with general upgrading and random replenishment. Allowing random supply is desirable when modeling upgrading, even in revenue management settings, let alone production/inventory settings. For example, in car rental, car returns can be random, and in airline ticket selling, early cancellations or airplane swaps can result in random capacity changes.
Commuter carpooling along a fixed route. Carpooling platforms specifically designed for commuters, such as UberCommute and GrabHitch, match riders and drivers heading to the same destination (or in the same direction). In those cases, the matching reward has two additive components. The first is a disutility associated with the distance traveled along the fixed route from the driver’s current location to pick up the demand. The second is a utility associated with traveling along the route from the demand’s pickup location to its drop-off location. The former is the unidirectionally horizontal case, whereas the latter is a vertically differentiated attribute because, given the same pickup location, it is more desirable if the demand’s travel distance is longer. We show that if riders and drivers head to the same destination at the end of the route, a shorter distance to pick up a rider on the way has a higher priority in matching.
The vertical model. Each demand and supply type is associated with a quality level, with higher-quality types leading to higher matching rewards. In particular, we focus on the case where the reward of matching a pair is the sum of the contributions brought in by its components, which are increasing in quality. Then the optimal matching policy follows a simple structure, which we call top-down matching (in economic terms, assortative mating): line up demand and supply types in descending order of their “quality” levels; match them from the top, down to some level. Thus, the optimal matching policy in any period can be entirely determined by a total matching quantity. This result is generalizable to the case where the matching reward is nonlinear but not far from being additive.
In the vertical model, the main tradeoff is again between the current period and future periods. The optimal policy will reserve some (lower-quality) demand or supply type(s), to reduce the chance of losing or delaying the matching of potential high-quality types arriving in the future.
For two special cases, namely, the case with patient demand and supply and the case with impatient demand and patient supply, we further derive monotonicity properties of the optimal total matching quantity with respect to the state of demand and supply. We also propose a one-step-ahead (OSA) heuristic policy, which is guaranteed to perform better than greedy matching and significantly reduces the degree of state dependency.
The vertical model may shed light on the following applications.
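The top-down structure can be sketched in a few lines. The sketch below assumes the total matching quantity is already given (in the model it is the outcome of the dynamic optimization), and the function name and the (quality, quantity) representation are illustrative choices rather than the paper's notation.

```python
def top_down_match(demand, supply, total_quantity):
    """Match from the highest qualities downward until total_quantity is used.

    demand, supply: lists of (quality, quantity) tuples.
    total_quantity: assumed to be given; not computed here.
    Returns a list of (demand_quality, supply_quality, matched_quantity).
    """
    # Line up both sides in descending order of quality.
    d = [[qual, qty] for qual, qty in sorted(demand, key=lambda t: -t[0])]
    s = [[qual, qty] for qual, qty in sorted(supply, key=lambda t: -t[0])]

    matches = []
    i = j = 0
    remaining = total_quantity
    while remaining > 0 and i < len(d) and j < len(s):
        m = min(d[i][1], s[j][1], remaining)
        if m > 0:
            matches.append((d[i][0], s[j][0], m))
        d[i][1] -= m
        s[j][1] -= m
        remaining -= m
        # Move down the list once a type is exhausted.
        if d[i][1] == 0:
            i += 1
        if s[j][1] == 0:
            j += 1
    return matches

plan = top_down_match(demand=[(3, 2), (1, 4)],
                      supply=[(5, 1), (2, 3)],
                      total_quantity=3)
```

Here the single input `total_quantity` pins down the whole within-period matching, which mirrors the observation that the policy in any period is determined by a total matching quantity.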
Online dating. In the settings of assortative mating such as online dating platforms, the participants of matching have vertically distributed attributes such as wealth and education. Becker and Murphy (2003, Chapter 4, p. 31, Eq. (4.2)) assume that in a decentralized marriage market the output of a marriage is the sum of the marital incomes of male and female. We consider the same reward structure, but with dynamic and random arrivals of males and females and from a centralized perspective. The top-down matching structure in our vertical model implies that a centralized dating agency (or even a decentralized dating platform) may want to limit the number of matching pairs at any time, in anticipation of future arrivals of higher-quality participants.
Organ allocation. Organ allocation decisions involve many factors, such as the efficient use of organs and health conditions of the patients. On the one hand, organs differ in their quality (which can be determined by risk factors such as age and cause of the donor’s death). Higher quality of the organ in general leads to better post-transplantation health outcomes. On the other hand, patients differ in their health condition. Those who are sicker suffer lower quality of life and greater risk of death, and thus receive a higher benefit from transplantation. The top-down matching procedure in our vertical model suggests that organs of higher quality levels and patients in worse health conditions should receive higher matching priority, and it can be optimal to reject some low-quality organs for patients in anticipation of high-quality organs arriving in the near future. (Footnote 1: In addition to quality differences, the matching between a patient and an organ is subject to compatibility constraints. The top-down structure sheds light on the matching among patients and organs that are mutually compatible.)
We illustrate the high-level positioning of our framework in Figure 1. The proposed dynamic-matching framework can be viewed as a generalization of two foundations of operations management, i.e., inventory management where the firm orders the supply centrally (Zipkin 2000), and revenue management where the firm regulates the demand side with a fixed supply side (Talluri and van Ryzin 2006), and of a combination of the two, i.e., joint pricing and inventory control (Chen and Simchi-Levi 2012). Compared with existing work in inventory and revenue management, the supply in the sharing economy is crowdsourced and hence has uncertainty.
Driven by real-life applications, economists, computer scientists, and operations researchers have studied a variety of two-sided matching problems (see, e.g., Roth and Sotomayor 1990, Abdulkadiroğlu and Sönmez 2013 for a survey), which include the college admissions problem (with the marriage problem as a special case), kidney exchange and the online bipartite matching problem. We compare our framework with those problems as follows.
The college admissions problem and the marriage problem are preference-based, and focus on finding stable matchings in a static and deterministic setting. In those problems, parties on both the demand and supply sides submit preferences over options (see, e.g., Ashlagi and Shi 2016) to the matching agency. As the matching outcomes (i.e., college admissions and marriages) can be life-changing events for the participants, serious efforts in soliciting preferences are necessary. In contrast, soliciting preferences may not be practical for day-to-day, or even real-time, operations in sharing economy activities. For instance, when riders hail a car on Uber, they do not have the option to choose, and may not even care, which driver serves them. To handle such situations, we assign a “monetary” contribution to the matching between a pair of demand and supply types, instead of adopting preferences by demand and supply. For example, a lower reward will be generated if a farther-away car is dispatched.
In a typical kidney exchange setting, patients and donors arrive in pairs, with an incompatible patient and donor in each pair. Subject to compatibility constraints, researchers have designed efficient matching mechanisms based on cycles (e.g., two-way exchanges) or chains of patient-donor pairs (see, e.g., Roth et al. 2004, 2007) to maximize the number of matchings. Ünver (2010) studies dynamic kidney exchange with intertemporal random arrivals of patient-donor pairs and attempts to maximize the number of matched compatible pairs. Our model differs from his by allowing arbitrary unbalanced arrivals of demand and supply, and by taking as the objective the maximization of total matching reward minus cost (i.e., social welfare or profit).
Online bipartite matching problems have many applications such as the allocation of display advertisements. Initiated by Karp et al. (1990), the classic version considers a bipartite graph and assumes that the vertices on one side arrive in an “online” fashion. That is, only when such a vertex (e.g., a web viewer) arrives are its incident edges (e.g., the viewer’s interests) revealed. The arriving vertex can then be matched to a previously unmatched adjacent vertex on the other side (e.g., an advertiser). The objective is to maximize the number of matchings. The problem has many variants, all with a focus on algorithms’ competitive ratios (see Manshadi et al. 2012 for a more recent literature review). The main difference from our model is the “online” feature; in addition, there is no explicit notion of inventory, with one side (e.g., advertisers) always present and the other (e.g., impressions) lost if not matched. Instead of worst-case analysis, we focus on expected-value optimization.
Operations researchers have studied two-sided matching by the queueing approach or its fluid counterpart. Arnosti et al. (2014) study a decentralized two-sided matching market and show that limiting the visibility of applicants can significantly improve the social welfare. With a fluid approach of modeling stochastic systems, Zenios et al. (2000) and Su and Zenios (2006) study kidney allocation by exploring the efficiency-equity tradeoff, and Akan et al. (2012) study liver allocation by exploring the efficiency-urgency tradeoff. Using double-sided queues, Zenios (1999) studies the transplant waiting list and Afèche et al. (2014) study trading systems of crossing networks. Su and Zenios (2004) analyze a queueing model with service discipline FCFS or LCFS to examine the role of patient choices in the kidney transplant waiting system. Adan and Weiss (2012) show that the stationary distribution of FCFS matching rates for two infinite multi-type sequences is of product form. Gurvich and Ward (2014) study the dynamic control of matching queues with the objective of minimizing holding costs. Focusing on the fluid approximation and its asymptotic optimality, the authors observe that in principle, the controller may choose to wait until some “inventory” of items builds up to facilitate more rewarding matches in the future. We make a similar observation. Kanoria and Saban (2018) study a dynamic fluid matching model in which agents on one side receive proposals from those on the other side and determine whether to pay a screening cost to discover the value of the proposing agent. They show that suitable restrictions imposed by the matching platform on agents’ search can reduce wasted search effort. In contrast to the above papers, we focus on the stochastic model (vs. the fluid counterpart) and optimal decision making (vs. performance evaluation).
Consider a finite horizon with a total of $T$ periods. At the beginning of each period, $n$ types of demand and $m$ types of supply arrive in random quantities. Let $\mathcal{D}$ be the set of demand types and $\mathcal{S}$ be the set of supply types. With a slight abuse of notation, we write $\mathcal{D}=\{1,2,\ldots,n\}$ and $\mathcal{S}=\{1,2,\ldots,m\}$, noting that $\mathcal{D}$ and $\mathcal{S}$ are disjoint sets. We use $i$ to index a demand type and $j$ to index a supply type. The pairs of demand and supply are shown in Figure 2 as a bipartite graph. An arc $(i,j)$ represents the matching of type $i$ demand and type $j$ supply. Without loss of generality, we consider a complete bipartite graph in the base model. In other words, any demand type can potentially be matched with any supply type, though with different rewards (or equivalently, mismatch costs). If demand type $i$ is not allowed to pair with supply type $j$, we can just set the matching reward between the two to zero. We denote the complete set of arcs by $\mathcal{A}=\{(i,j)\mid i\in\mathcal{D},\ j\in\mathcal{S}\}$.
The state for a given period comprises the demand and supply levels of various types before matching but after the arrival of random demand and supply for that period. The distributions of supply and demand in one period can be exogenously correlated with those in another period. But our model does not account for endogenized correlations among distributions of demand and supply, e.g., a driver’s current pickup of a customer may affect future supply at the place where the driver drops off the customer. In other words, we assume away the possible dependence of future distributions of demand and supply on the current matching decisions.
We denote, as the system state, the demand vector by $\mathbf{x}=(x_1,\ldots,x_n)\in\mathbb{R}^n_+$ and the supply vector by $\mathbf{y}=(y_1,\ldots,y_m)\in\mathbb{R}^m_+$, where $x_i$ and $y_j$ are the quantities of type $i$ demand and type $j$ supply available to be matched. Although we assume that the states and the demand and supply arrivals are continuous quantities (and therefore so are the matching decisions), our results can be readily replicated if those quantities are discrete. On observing the state $(\mathbf{x},\mathbf{y})$, the firm decides on the quantity $q_{ij}$ of type $i$ demand to be matched with type $j$ supply, for any demand type $i$ and supply type $j$. For conciseness, we write the decision variables of matching quantities in matrix form as $\mathbf{Q}=(q_{ij})$, with $\mathbf{Q}_i$ its $i$th row (as a row vector) and $\mathbf{Q}^j$ its $j$th column (as a column vector). There is a reward $r^t_{ij}$ for matching one unit of type $i$ demand and one unit of type $j$ supply in period $t$, for all arcs $(i,j)$. (Footnote 2: We can account for the case with forbidden arcs. If arc $(i,j)$ is forbidden, we can let $r^t_{ij}$ be zero or a negative number.) We can write the rewards in matrix form as $\mathbf{R}^t=(r^t_{ij})$. Thus the total matching reward is linear in the matching quantities. That is, $\mathbf{R}^t\circ\mathbf{Q}=\sum_{i}\sum_{j} r^t_{ij}\,q_{ij}$, where “$\circ$” gives the sum of the elements of the Hadamard product of two matrices. The post-matching levels of type $i$ demand and type $j$ supply are given by $u_i$ and $v_j$, respectively. That is, $u_i=x_i-\sum_{j} q_{ij}$ and $v_j=y_j-\sum_{i} q_{ij}$. The post-matching levels cannot be negative; i.e., $u_i\ge 0$ for all $i$ and $v_j\ge 0$ for all $j$.
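The one-period bookkeeping described above (a linear total reward, post-matching levels, and nonnegativity) can be made concrete with a small sketch. The names `R`, `Q`, `x`, `y`, `u`, and `v` are illustrative labels for the reward matrix, the matching matrix, the pre-matching demand and supply levels, and the post-matching levels. The numbers are made up.

```python
def evaluate_matching(R, Q, x, y):
    """Return (total reward, post-matching demand, post-matching supply, feasible).

    R[i][j]: unit reward for matching demand type i with supply type j.
    Q[i][j]: quantity of demand type i matched with supply type j.
    x[i], y[j]: demand and supply levels before matching.
    """
    n, m = len(x), len(y)
    # Total reward: sum of the elementwise (Hadamard) product of R and Q.
    reward = sum(R[i][j] * Q[i][j] for i in range(n) for j in range(m))
    # Post-matching levels: subtract row sums from demand, column sums from supply.
    u = [x[i] - sum(Q[i]) for i in range(n)]
    v = [y[j] - sum(Q[i][j] for i in range(n)) for j in range(m)]
    # Feasible if no quantity and no post-matching level is negative.
    feasible = (all(q >= 0 for row in Q for q in row)
                and min(u) >= 0 and min(v) >= 0)
    return reward, u, v, feasible

reward, u, v, ok = evaluate_matching(
    R=[[3, 2], [2, 1]],
    Q=[[1, 1], [0, 2]],
    x=[2, 3],
    y=[1, 4],
)
```

A matching that over-commits any type shows up as a negative post-matching level and is flagged as infeasible.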
The unmatched demand and supply at the end of a period carry over to the next period with fractions $\alpha$ and $\beta$, respectively. In other words, a fraction $1-\alpha$ of demand and a fraction $1-\beta$ of supply leave the system. Without loss of generality, we assume they leave the system with zero surplus. The carry-over fractions $\alpha$ and $\beta$ can be time-dependent (in which case they should be written as $\alpha_t$ and $\beta_t$). But because such time dependency would not affect our results, for ease of notation, we suppress the subscript $t$.
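The firm’s goal is to determine a matching policy that maximizes the expected total discounted surplus (i.e., reward minus cost). (Our perspective is social-welfare maximization. Alternatively, the formulation can account for profit maximization if $r^t_{ij}$ is interpreted as the revenue collected from a matching.) Let $V_t(\mathbf{x},\mathbf{y})$ be the optimal expected total discounted surplus given that it is in period $t$ and the current state is $(\mathbf{x},\mathbf{y})$. We formulate the finite-horizon problem by using the following stochastic dynamic program:
$$V_t(\mathbf{x},\mathbf{y})=\max_{\mathbf{Q}\ge 0,\ \mathbf{u}\ge 0,\ \mathbf{v}\ge 0} H_t(\mathbf{Q},\mathbf{x},\mathbf{y}),\qquad H_t(\mathbf{Q},\mathbf{x},\mathbf{y})=\mathbf{R}^t\circ\mathbf{Q}+\mathbb{E}\,V_{t+1}\big(\alpha\mathbf{u}+\mathbf{D}_{t+1},\ \beta\mathbf{v}+\mathbf{S}_{t+1}\big),\tag{1}$$
where $\mathbf{u}$ and $\mathbf{v}$ are the post-matching demand and supply levels, $\alpha$ and $\beta$ are the carry-over fractions, and $\mathbf{D}_{t+1}$ and $\mathbf{S}_{t+1}$ denote the random demand and supply arrivals in period $t+1$.
The boundary conditions are $V_{T+1}(\mathbf{x},\mathbf{y})=0$ for all $(\mathbf{x},\mathbf{y})$, without loss of generality. In other words, at the end of the horizon, all unmatched demand and supply leave the system with zero surplus. Note that we do not explicitly discount future rewards in (1) because discounting is implicitly accounted for by using time-dependent rewards.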
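To make the recursion concrete, the following is a minimal backward-induction sketch for a deliberately tiny special case: one demand type, one supply type, integer states, carry-over fractions equal to 0 or 1, and independent Bernoulli arrivals of at most one unit per side per period. All of these are simplifying assumptions made for illustration, and the function and parameter names are hypothetical. It is not the general multi-type continuous-state model.

```python
from functools import lru_cache

def solve(T, reward, alpha, beta, p_d, p_s):
    """Build the value function for a toy single-pair instance.

    T: number of periods; reward[t-1]: unit matching reward in period t.
    alpha, beta: carry-over fractions for demand and supply, each 0 or 1.
    p_d, p_s: probabilities that one unit of demand/supply arrives (assumed).
    Returns V, where V(t, x, y) -> (optimal value, optimal matching quantity).
    """
    @lru_cache(maxsize=None)
    def V(t, x, y):
        if t > T:
            # Boundary condition: leftovers are worth zero after the horizon.
            return 0.0, 0
        best_val, best_q = float("-inf"), 0
        for q in range(min(x, y) + 1):
            u, v = x - q, y - q  # post-matching levels
            cont = 0.0
            # Expectation over next period's arrivals.
            for d, pd in ((0, 1 - p_d), (1, p_d)):
                for s, ps in ((0, 1 - p_s), (1, p_s)):
                    cont += pd * ps * V(t + 1, alpha * u + d, beta * v + s)[0]
            val = reward[t - 1] * q + cont
            if val > best_val:
                best_val, best_q = val, q
        return best_val, best_q
    return V

# With a single period, matching as much as possible is optimal.
V1 = solve(T=1, reward=[2.0], alpha=1, beta=1, p_d=0.5, p_s=0.5)
# With a higher reward next period and patient types, waiting is optimal.
V2 = solve(T=2, reward=[1.0, 3.0], alpha=1, beta=1, p_d=0.0, p_s=0.0)
```

The second instance shows the intertemporal tradeoff in its simplest form: because rewards are time-dependent and both sides are fully patient, the sketch holds the pair back in period 1 and matches it in period 2.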
A matching policy consists of mappings, , where is a feasible matching decision in period for state .
As mentioned, the state of the system is assumed to be real-valued without loss of generality. Nevertheless, our formulation (1) applies to integer-valued states (with the demand and supply carry-over fractions each equal to either 0 or 1; i.e., each demand/supply type is either completely patient or completely impatient).
We can account for waiting costs of those demand and supply types that are not immediately matched by incorporating those costs into the matching rewards. Suppose that demand type $i$ (resp., supply type $j$) incurs a per-unit waiting cost $c^t_i$ (resp., $h^t_j$) in period $t$ if unmatched. In the Online Appendix, we prove that the problem with costs shares the same optimal matching policy as that without costs but with an updated per-unit benefit of matching type $i$ demand with type $j$ supply in period $t$, namely $r^t_{ij}+\sum_{s=t}^{T}\alpha^{s-t}c^s_i+\sum_{s=t}^{T}\beta^{s-t}h^s_j$. To see this intuitively, if a unit of type $i$ demand (resp., type $j$ supply) is never matched, its total waiting cost from period $t$ to period $T$ is $\sum_{s=t}^{T}\alpha^{s-t}c^s_i$ (resp., $\sum_{s=t}^{T}\beta^{s-t}h^s_j$), which could be saved if this unit is matched in period $t$. Then, the per-unit benefit of matching type $i$ demand with type $j$ supply in period $t$ becomes $r^t_{ij}+\sum_{s=t}^{T}\alpha^{s-t}c^s_i+\sum_{s=t}^{T}\beta^{s-t}h^s_j$, which can be used as the unit matching reward in place of $r^t_{ij}$.
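The cost adjustment can be sketched numerically. The sketch assumes that the saved waiting cost of a unit is the carry-over-weighted sum of its per-period waiting costs from the current period to the end of the horizon. The function name, the argument names, and the numbers are hypothetical.

```python
def adjusted_reward(r, c, h, alpha, beta, t):
    """Unit reward in period t plus the waiting costs saved by matching now.

    r: unit matching reward in period t.
    c[s-1], h[s-1]: per-unit waiting costs of the demand and supply type
        in period s = 1, ..., T (assumed inputs).
    alpha, beta: carry-over fractions of demand and supply.
    """
    T = len(c)
    # A unit left unmatched from period t on survives to period s with
    # weight alpha**(s - t) (demand) or beta**(s - t) (supply).
    saved_demand = sum(alpha ** (s - t) * c[s - 1] for s in range(t, T + 1))
    saved_supply = sum(beta ** (s - t) * h[s - 1] for s in range(t, T + 1))
    return r + saved_demand + saved_supply

value = adjusted_reward(r=5.0, c=[1, 1, 1], h=[2, 2, 2],
                        alpha=0.5, beta=1, t=2)
```

In this example the demand side is impatient (half of it abandons each period), so less of its future waiting cost is saved by matching now than on the fully patient supply side.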
The existence of an optimal matching policy is resolved by the following proposition.
Proposition 1
The functions and are continuous and concave. There exists an optimal matching policy .
Note that the continuity and concavity in Proposition 1 hold only when the states and decisions take continuous values. For the problem with integer-valued states and decisions, concavity is undefined in high-dimensional spaces. Nevertheless, an optimal decision still exists, and all subsequent results still hold. In general, we expect the optimal policy to be state-dependent and extremely complex. In the next section, we characterize some of its structural properties.
We are interested in matching policies with natural properties, e.g., matching an “essential” pair of a demand type and a supply type before matching any less essential pairs.
In particular, we compare two neighboring pairs of demand and supply types (i.e., two arcs in the bipartite network that are incident to a common vertex) and provide sufficient conditions for one pair to be more “essential” than another.
We first define a partial relation to compare two pairs of demand and supply.
Definition 1
(Weak Modified Monge Condition) We say that if (i) ; and (ii) for all and all . Similarly, we say that if (i) ; and (ii) for all and all .
Let us examine the condition in Definition 1 (to which, the other condition, , is symmetric). The condition is easy to satisfy if suppliers are impatient (i.e., is small). If suppliers are relatively patient (i.e., is close to 1), for the inequality to hold, the differences in contributions to the matching brought by and should not increase notably over time.
We further define a stronger partial relation for comparing two pairs of demand and supply.
Definition 2
(Modified Monge Condition) We denote the partial relation by if for any and , it holds for all that
(2) 
We will show that there exists an optimal matching policy that is consistent with the partial relation . For a pair , we consider the two sets of pairs and , which contain neighboring pairs dominated by under . Note that any (or ) is not dominated by , i.e., either dominates, or is incomparable with, under . We also note that does not belong to or by definition.
Given the state and feasible matching decision in period , the quantity represents the remaining quantity of type demand, after matching through pairs nondominated by or incomparable with . Likewise, is the remaining quantity of type supply after matching it through pairs nondominated by or incomparable with . We then define a class of policies that respect a partial relation .
Definition 3
(Compatibility) We say that an optimal matching policy respects the partial relation if (i) for all and all , either or ; (ii) for all and all , either or .
Part (i) of Definition 3 says, unless (i.e., is completely consumed by and those nondominated by or incomparable with so that the further matching between and is impossible), any such that (i.e., any pair dominated by ) will not be matched with . In other words, matching of is prioritized over all such that . Similarly, part (ii) implies that matching of is prioritized over if .
The following theorem demonstrates the existence of an optimal policy that respects . Then, in that optimal policy, matching of is prioritized over in any period if .
Theorem 1
There exists an optimal matching policy that respects .
If only the weak modified Monge condition is satisfied, we show in the appendix that there exists an optimal policy compatible with the partial relation in a weaker sense than Definition 3.
We further provide sufficient conditions for greedy matching between a pair to be optimal.
Proposition 2
Suppose that the pair dominates all its neighboring pair by (i.e., and for all and all ). Also suppose . Then, greedy matching between and is optimal in all periods. In other words, in any period with any state , the optimal matching quantity between and is .
In the rest of the paper, we refer to the pair as a perfect pair if it dominates all its neighboring pair by . Any other pair is referred to as an imperfect pair.
As an immediate application of Proposition 2, consider demand and supply types that are specified by their locations in an Euclidean space (e.g., Uber drivers and riders in different locations; products and customers in different locations for Amazon’s inventory commingling program). In each period, the reward of matching supply with demand is a fixed prize minus the disutility proportional to the Euclidean distance between the demand location and the supply location (i.e., , where represents the Euclidean distance between and ). If the parameter is decreasing in time, we can verify that a demand type and a supply type from the same location forms a perfect pair, and by Proposition 2, they should be matched as much as possible.
Corollary 1
Suppose that the demand and supply types are uniquely characterized by their spatial locations. The perunit matching reward in period between and is , where is the Euclidean distance between and ’s locations. If both and are decreasing in , and should be matched greedily in any period.
The partial relations defined in Definitions 1 and 2 are reminiscent of the classic Monge sequence discovered by Gaspard Monge, a French mathematician, in 1781. Hoffman (1963) provides a necessary and sufficient condition for a static transportation problem to be solvable by a greedy algorithm, in which a permutation (referred to as the Monge sequence) is followed. The Monge condition provides a priority sequence for all the arcs (i.e., demandsupply pairs) in the bipartite network and requires only condition (2) of Definition 2. Our Definitions 1 and 2 compare two neighboring arcs to determine their priorities in the setting with the dynamic and stochastic arrival of demand and supply types over time. Naturally, our conditions may appear more restrictive than the requirements of the Monge sequence because our problem is more complex. In particular, to compare and we require the inequality to hold for all . (Similarly, to compare and we require to hold for all .) Nevertheless, in subsequent sections, we show that those conditions are satisfied by two classes of problems, namely the horizontal model and the vertical model.
Remark 1
Our model and results can be generalized to the case with timedependent carryover rates. Suppose that in period , a fraction of the unmatched demand and a fraction of the unmatched supply will carry over to the next period , for any type of demand and supply. Then, in Definition 1, the conditions and should be replaced with and , respectively. All subsequent results remain true. For example, Proposition 2 still holds if we replace the condition with .
Consider demand and supply types located in a space . Each point in represents the characteristics of the corresponding (demand/supply) type. A shorter distance between and implies a higher unit matching reward in each period. Thus, the types are “horizontally” distributed.
We begin with the space consisting of two distinct locations, namely, locations 1 and 2. There are two demand types and two supply types, and . Type 1 demand and type 1 supply share location 1, while type 2 demand and type 2 supply colocate at location 2. For , we denote the other index in the set by , i.e., . Since a shorter distance implies a higher reward, we make the following two assumptions for the rest of this subsection.
Assumption 1
, for .
The next assumption further compares the unit matching rewards across different periods.
Assumption 2
For any , all , and , and .
From Assumptions 1 and 2 it is straightforward to verify that and for . In other words, demand type 1 and supply type 1 form a perfect pair, and so do demand type 2 and supply type 2, while is an imperfect pair, for . As an application, consider a premier service and a regular service (e.g., luxury vs. economy car services) provided by crowdsourced suppliers. The fares for the two services are and , respectively. The intermediary firm pays the two types of suppliers and , respectively. If the firm offers the premier service to a customer requesting the regular service, the customer will only pay the regular fare (i.e., a free upgrade). However, the intermediary firm still needs to pay the premier wage to the premier service provider. If a customer originally requesting the premier service is offered the regular service, s/he also pays the regular fare, with a possible penalty cost incurred to the firm (monetary compensation, loss of goodwill, etc.). It is natural to assume that , and that the margin of the premier service is higher than that of the regular service . Then, the reward for matching a premier customer with a premier supplier (i.e., ) is higher than that for matching a premier customer with a regular supplier (i.e., ), and also higher than that for matching a regular customer with a premier supplier (i.e., ). Likewise, matching a regular customer with a regular supplier (i.e., ) generates more reward than matching a regular customer with a premier supplier (i.e., ), and more than matching a premier customer with a regular supplier (i.e., ). This verifies Assumption 1, and as a result, Assumption 2 trivially holds when the parameters are assumed to be time-independent.
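As a numerical sanity check of the premier/regular example, the following minimal sketch computes the four unit rewards from the payments described above and tests the four inequalities stated in the text. The function names, argument names, and all numbers are illustrative assumptions, not part of the model.

```python
def unit_rewards(fare_p, fare_r, wage_p, wage_r, penalty):
    """Unit rewards keyed by (customer class, supplier class).

    'p' = premier, 'r' = regular. All argument names are hypothetical.
    """
    return {
        ("p", "p"): fare_p - wage_p,
        # Downgraded premier customer pays the regular fare; firm bears a penalty.
        ("p", "r"): fare_r - wage_r - penalty,
        # Free upgrade: regular fare collected, premier wage paid.
        ("r", "p"): fare_r - wage_p,
        ("r", "r"): fare_r - wage_r,
    }


def perfect_pairs_dominate(r):
    """True if each same-class pair earns more than both cross-class pairs."""
    best_cross = max(r[("p", "r")], r[("r", "p")])
    return r[("p", "p")] > best_cross and r[("r", "r")] > best_cross


# Illustrative numbers only: premier fare and wage exceed the regular ones,
# and the premier margin (12) exceeds the regular margin (8).
rewards = unit_rewards(fare_p=30, fare_r=20, wage_p=18, wage_r=12, penalty=3)
print(rewards, perfect_pairs_dominate(rewards))
```

With these numbers the same-class rewards are 12 and 8, while the cross-class rewards are 5 and 2, so both perfect pairs dominate both imperfect pairs.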
It follows directly from Proposition 2 that type 1 demand should be matched with type 1 supply as much as possible, before we match type 1 demand with type 2 supply, or type 2 demand with type 1 supply. Likewise, type 2 demand should be matched with type 2 supply greedily. Clearly, after greedy matching between the pair , there cannot be any positive remaining quantity for both demand type and supply type (). This observation allows us to collapse the state space: In period with the (original) state , we define the new state as , where The quantity describes the imbalance between type 1 demand and type 1 supply. A nonnegative represents the remaining quantity of type 1 demand after greedy matching with type 1 supply in period (the remaining quantity of type 1 supply will be zero). For a negative value of , is the remaining quantity of type 1 supply after greedy matching with type 1 demand. Similarly, is the remaining quantity of type 2 supply after greedy matching with type 2 demand, whereas is the remaining quantity of type 2 demand after greedy matching with type 2 supply.
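The state collapse described above can be sketched as follows. This is a minimal illustration: the names `d1`, `s1`, `d2`, `s2`, `z1`, and `z2` are assumed here, and the sign conventions follow the verbal description (a positive first component is leftover type 1 demand, a positive second component is leftover type 2 supply).

```python
def collapse_state(d1, s1, d2, s2):
    """Greedily match the two perfect pairs and return the collapsed state.

    d1, d2: available type 1 and type 2 demand (hypothetical names).
    s1, s2: available type 1 and type 2 supply (hypothetical names).
    Returns (q11, q22, z1, z2), where
      q11, q22 are the greedy matching quantities of the perfect pairs,
      z1 = d1 - s1: positive part is leftover type 1 demand,
                    negative part is leftover type 1 supply,
      z2 = s2 - d2: positive part is leftover type 2 supply,
                    negative part is leftover type 2 demand.
    """
    q11 = min(d1, s1)
    q22 = min(d2, s2)
    return q11, q22, d1 - s1, s2 - d2


# 5 units of type 1 demand meet 3 units of type 1 supply (2 demand left);
# 2 units of type 2 demand meet 6 units of type 2 supply (4 supply left).
print(collapse_state(5, 3, 2, 6))
```

Because at most one side of each perfect pair survives the greedy round, two signed numbers suffice to carry the four original quantities.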
After the first round of greedy matching in period , if there are remaining type demand and type supply () simultaneously (i.e., either and , or and ), we will match the two with each other, but not necessarily in a greedy way. The intermediary may withhold some type demand in order to match it with type supply in a future period (or withhold type supply to match with type demand in the future). The amount of type demand to withhold generally depends on the available amount of type supply. For example, if there is a high level of type supply in the current period, it is unlikely that all of that supply will meet type demand (i.e., its best match) in the future, and we may, therefore, use more type demand to match with type supply. Symmetrically, the amount of type supply to withhold depends on the available type demand. Thus, the matching between an imperfect pair is governed by state-dependent match-down-to target levels, where the state dependency is one-dimensional (e.g., the target level for type demand depends only on the available type supply).
To formalize the above discussion, we define as the aggregate imbalance between demand and supply. We describe the structure of the optimal policy as follows.
Proposition 3
The optimal policy performs two rounds of matching in each period .

Round 1: Matching of perfect pairs.
For , match type demand with type supply greedily.

Round 2: Matching of an imperfect pair.

No matching in round 2 if .

If and , match type 1 demand and type 2 supply. There exist protection levels and dependent on the imbalance , such that , and the matching between the pair reduces type 1 demand to and type 2 supply to .

If and , match type 2 demand and type 1 supply. There exist protection levels and dependent on , such that , and that the matching between the pair reduces type 2 demand to and type 1 supply to .

According to Proposition 3, the matching of round 2 is dependent on the state . When , after round 1, either both type 1 and type 2 supply are depleted, or both type 1 and type 2 demand are depleted. Since either no supply or no demand remains, there is no matching in round 2.
When and , we have remaining quantities of type 1 demand and type 2 supply. Part (ii) of Proposition 3 shows that the matching between the pair is characterized by the protection levels and , which are the target levels to reduce type 1 demand and type 2 supply to, respectively. At the beginning of round 2, if the quantity of available type 1 demand is above , the optimal policy will reduce it to (by the quantity ) by matching it with type 2 supply. Meanwhile, the relation guarantees that type 2 supply will be reduced to . If is below , there is no matching and type 1 demand remains at the level of . (Note that implies , hence type 2 supply remains at the level of .)
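The match-down-to logic of round 2 can be sketched as below for the case with leftover type 1 demand and leftover type 2 supply. This is an illustration under stated assumptions: the aggregate imbalance is taken to be leftover demand minus leftover supply, the two protection levels are assumed to differ by exactly that imbalance (which is what makes the two targets reachable simultaneously), and `protect_demand` is a hypothetical callable standing in for the optimal protection level, which the sketch does not compute.

```python
def round2_match(z1, z2, protect_demand):
    """Round 2 matching of type 1 demand with type 2 supply.

    z1: leftover type 1 demand after round 1 (must be positive).
    z2: leftover type 2 supply after round 1 (must be positive).
    protect_demand: hypothetical function mapping the aggregate imbalance
        to the protection level for type 1 demand.
    Returns (matched quantity, post-matching demand, post-matching supply).
    """
    if z1 <= 0 or z2 <= 0:
        raise ValueError("this case requires leftover demand and supply")
    imbalance = z1 - z2                # assumed: demand minus supply
    p_demand = protect_demand(imbalance)
    p_supply = p_demand - imbalance    # assumed relation between the levels
    if p_demand < 0 or p_supply < 0:
        raise ValueError("protection levels must be nonnegative")
    # Match only the demand in excess of its protection level.
    q = max(z1 - p_demand, 0)
    return q, z1 - q, z2 - q


# Toy protection rule, purely for illustration.
toy_rule = lambda imbalance: max(imbalance, 0) + 1
print(round2_match(5, 3, toy_rule))  # demand above its protection level
print(round2_match(1, 1, toy_rule))  # demand at its protection level
```

When demand exceeds its protection level, both sides land exactly on their targets; otherwise nothing is matched and both quantities stay where they are.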
The state-dependent protection levels and only depend on the one-dimensional quantity rather than on the full, two-dimensional state . The case of and is symmetric to the case of and . We further consider two special cases, for which we will characterize the properties of the protection levels with respect to the state.
Consider , i.e., demand and supply are infinitely patient and stay until they are matched.
Proposition 4
The protection levels and for round 2 matching are increasing in the aggregate imbalance . The protection levels and are decreasing in . Moreover, the decreasing and increasing rates are no higher than 1.
Proposition 4 examines the monotonicity of the protection levels with respect to the aggregate imbalance. We interpret the proposition as follows.
When , demand is in excess. A higher value of suggests more demand over supply. The chance of a demand type meeting a better match in a future period becomes smaller. Therefore it becomes more imperative to consume more demand by lowering the protection level for supply. As a result, the protection levels and decrease as increases. The rate of decrease, however, is no higher than 1, which implies that the increment in (i.e., extra demand more than supply) will not be entirely matched in the current period, through reducing the protection level for supply. The relations and then immediately imply that and are increasing in with the increasing rates capped by 1.
When , supply is in excess. A larger suggests less supply in excess of demand. Thus, it is less imperative to consume the excess in supply, implying a higher protection level for demand.
Proposition 4 is particularly helpful when demand and supply quantities take integer values. In that case, once we have obtained the value of the protection level , the protection level is either or , whichever yields higher matching rewards.
Although we have assumed , Proposition 4 is generalizable to the case with arbitrary values of and as long as the two carryover rates are equal to each other (i.e., ).
Consider and . In this case, demand is impatient and is lost if not matched in the current period. Thus we only need to record supply levels as the system state. Then, round 2 matching is fully characterized by protection levels on the supply side only, as shown in the following proposition. For ease of notation, let be the smaller of two numbers and .
Proposition 5
There exist state-independent protection levels and such that in round 2 matching of period ,
(i) if and , the optimal matching policy reduces type 2 supply as close to the protection level as possible; the post-matching level of type 2 supply is ;
(ii) if and , the optimal matching policy reduces type 1 supply as close to the protection level as possible; the post-matching level of type 1 supply is .
Proposition 5 shows that the optimal policy always aims to reduce type 1 supply to the protection level , and type 2 supply to . This result is generalizable to the case with .
More specifically, consider the case with and . In this case we match type 1 demand with type 2 supply in round 2. According to the proposition, if type 1 demand is ample, the optimal policy will reduce type 2 supply to (i.e., to if the quantity of available type 2 supply is above , or there is no matching if is already no more than ). If there is a low level of type 1 demand, however, type 2 supply can be reduced at most by (when all available type 1 demand is matched with type 2 supply) to the level . The case of and is symmetric to the case of and .
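The supply-side rule described above amounts to a single min/max expression. The sketch below is illustrative: the argument names are assumed, and the protection level is treated as given rather than computed.

```python
def post_matching_supply(supply, demand, protection):
    """Post-matching supply level when demand is lost if unmatched.

    supply: available supply of the type being consumed in round 2.
    demand: available demand of the other type in round 2.
    protection: state-independent protection level for this supply type
        (taken as given; names here are hypothetical).
    """
    # With ample demand, supply is brought down to the protection level,
    # or left untouched if it is already at or below that level.
    target = min(protection, supply)
    # With scarce demand, supply can drop by at most the demand available.
    return max(target, supply - demand)


print(post_matching_supply(supply=10, demand=20, protection=4))  # ample demand
print(post_matching_supply(supply=10, demand=3, protection=4))   # scarce demand
print(post_matching_supply(supply=3, demand=5, protection=4))    # below protection
```

The matched quantity is the difference between the available supply and the returned level, which is zero whenever supply starts at or below its protection level.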
We now study the more general case with demand types and supply types, all located in the space . Here we consider the case where is a line segment, with its two endpoints denoted by and , respectively. The fitness of matching a demand type and a supply type is determined by the distance between and on . We consider two distance metrics, but focus on the directed distance in this subsection.
Undirected distance. This is the shortest distance between the location of and on .
Directed distance. Suppose that is endowed with a direction, say, from endpoint to endpoint (in short, ). If the location of can be reached from the location of by traveling along the given direction (i.e., is located between and endpoint ), the distance between and , denoted by , is defined as the distance to be travelled by along the given direction to reach the location of .
We now focus on the directed distance, and assume that the unit matching reward between and is a linearly decreasing function of the distance if can reach by traveling along the direction , i.e., . If cannot reach by traveling along the direction , the unit reward is .
It is clear that the optimal matching quantity if cannot reach along the direction . Next, we compare two pairs of demand and supply, for both of which the supply type can reach the demand type along the direction .
Lemma 1
(i) Suppose that supply type can reach both and along the direction . Then, if and only if along the direction , the distance from to is shorter than the distance from to .
(ii) Suppose that both supply types and can reach type demand along the direction . Then, if and only if along the direction , is closer to than .
Lemma 1 suggests that for two neighboring pairs of demand and supply, the pair with a shorter, unidirectional distance should have a higher priority. It follows from this lemma that any two neighboring pairs are comparable by .
Proposition 6
(i) If , the optimal policy matches before . If , the optimal policy matches before .
(ii) Suppose that decreases in . If there are no other demand or supply types located between and on , and should be matched with each other greedily, i.e., .
Proposition 6 prescribes a priority hierarchy for the optimal matching policy, by classifying the pairs of demand and supply into priority tiers. Let the set of tier 0 pairs, denoted by , be those not dominated by any neighboring pair under . Recursively, we can define tier pairs, denoted by , as those pairs that belong to and are not dominated by any other neighboring pairs in . Suppose that there are a total number of tiers. The optimal policy always matches the pairs in before it moves on to match the pairs in , for . Moreover, if a pair is not matched to the full extent (i.e., there are remaining quantities of both type demand and type supply), any pair of the form or in will not be matched (i.e., with a zero matching quantity) in the optimal policy.
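The recursive tier construction can be sketched as follows for the directed-distance model. The sketch rests on several assumptions: locations are coordinates measured along the given direction, a supply type reaches a demand type when its coordinate is no larger, two pairs are neighbors when they share a demand type or a supply type, and a pair is dominated by a neighbor with a strictly shorter directed distance. The function and variable names are hypothetical.

```python
def priority_tiers(demand_loc, supply_loc):
    """Group feasible (demand, supply) pairs into priority tiers.

    demand_loc, supply_loc: dicts mapping type labels to coordinates
    along the directed line segment (assumed representation).
    Returns a list of tiers; tier 0 comes first.
    """
    # Feasible pairs and their directed distances.
    remaining = {
        (i, j): demand_loc[i] - supply_loc[j]
        for i in demand_loc
        for j in supply_loc
        if supply_loc[j] <= demand_loc[i]
    }
    tiers = []
    while remaining:
        tier = []
        for (i, j), dist in remaining.items():
            dominated = any(
                (k, l) != (i, j)
                and (k == i or l == j)      # neighboring pair
                and other < dist            # strictly shorter distance
                for (k, l), other in remaining.items()
            )
            if not dominated:
                tier.append((i, j))
        tiers.append(sorted(tier))
        # Remove this tier and repeat on the pairs that are left.
        for pair in tier:
            del remaining[pair]
    return tiers


demand_loc = {"d1": 1.0, "d2": 3.0}
supply_loc = {"s1": 0.0, "s2": 2.0}
print(priority_tiers(demand_loc, supply_loc))
```

In this toy instance the pair of `d1` with `s2` is infeasible, the two short pairs form tier 0, and the long pair of `d2` with `s1` is deferred to tier 1. The loop always terminates because the shortest remaining pair is never dominated.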
Proposition 6 provides a partial characterization of the optimal policy with respect to the priority structure, but does not prescribe how much to match for each pair of demand and supply types. Motivated by this proposition, we briefly describe a heuristic idea to compute the optimal matching decisions. For a given period , we consider and both located on the line segment such that is accessible from along the given direction. When matching with , we may want to reserve some type demand (resp., type supply) for future supply types (resp., demand types) located between and on the line segment . But we may not want to reserve type demand (resp., type supply) for any supply type (resp., demand type ) located outside the segment between and , due to the lower priority of the pair (resp., ) than the pair (see Proposition 6). As a heuristic, we determine the matching between and by considering a subproblem P that comprises only demand type , supply type and the types located between and on . According to Proposition 6, we should not match with until there is no remaining quantity for any demand type or supply type located between and . Thus, we assume that in the subproblem P, all types except demand type and supply type have zero remaining quantity. Analogous to the two-location model analyzed above, we can show that the optimal matching between and is characterized by a protection level on type demand and on type supply, with both protection levels dependent on (which is the imbalance between type demand and type supply) and . More specifically, we will match with until type demand is reduced to and type supply is reduced to , or as close as possible.
Next, we outline the heuristic matching procedure for a period , assuming that the protection levels and are already obtained for all and .
Heuristic 1
(Prioritized matching for the horizontal model)
Within each priority tier, it does not matter which pair we match first, because the matching of one pair does not affect the subproblem for another pair within the same tier. The computation of the protection levels for each subproblem P, however, remains challenging. In the appendix, we discuss a heuristic method that converts the subproblem P to a model by consolidating demand and supply types. Next, we discuss a couple of applications of the horizontal model.
Carpooling platforms such as iCarpool and UberPool match a driver heading to a destination with several riders to the same destination (or in the same direction). Commuting patterns of many cities indicate that drivers and riders often share the same destination. For example, Figure 3 displays the New York City commuting pattern in the mornings of weekdays, from which we see that commuters travel from different suburban areas in the same direction to the city. In this case, the directed line segment corresponds to the route that starts from a suburban area (i.e., endpoint ) and ends in the city (i.e., endpoint ). Drivers, who may be commuters themselves, pick up riders along the route.
Moreover, if all riders share the same destination (e.g., the city) and a driver picks up riders along the way to the destination, the closer a rider is to the driver, the shorter the waiting time for the rider and the higher the payment for the ride (due to the longer distance travelled by the rider). To formalize this intuition, we generalize the reward function mentioned above as follows. If can reach along the direction , the unit reward of matching with is . Otherwise, . Here, represents the reward resulting from the match and is dependent on the attribute of type demand, e.g., the travel distance of the rider (from rider ’s initial location to the destination). The second term represents the disutility proportional to the distance traveled by the driver for the pickup, e.g., a longer distance implies a longer roaming time for the driver and a longer waiting time for the rider. Following a similar analysis, we can show that if and only if is closer to than along the direction of the route, and that if and only if is closer to than along the direction of the route.
Upgrading uses a high-class supply to fulfill a low-class demand, a practice widely adopted in business, e.g., in travel industries (see, e.g., Yu et al. 2015) and in production/inventory settings (see, e.g., Bassok et al. 1999). Figure 5 illustrates such a model that allows general upgrading (see Yu et al. 2015). In this model, product classes are indexed according to the descending order of quality. Class products are intended for the customer segment . Thus it is most desirable to satisfy a class customer demand using a class product, more desirable to satisfy a class demand using a class product than using a class product (), and infeasible to satisfy a class demand using a class product ().
In contrast to the existing works in the literature where the supply side is either fixed or controlled through replenishing decisions, there are many settings in which new supply arrives randomly. For example, ride-hailing platforms such as Uber randomly have new drivers coming online or existing drivers completing a service and becoming available, who provide differentiated types of service (UberX, UberSELECT, UberBLACK, etc.; a more premium vehicle can be used to serve a less premium customer class through upgrading). Car rental companies may have random supply levels due to early/late return of cars by customers. Airlines and hotels can also have random “arrival” of supply due to customer cancellations.
The problem of general upgrading has the structure of a directed line segment in the product line space. Class demand and class product share the same location on the line segment, and the lower the class index, the closer the class is located to the endpoint . For , let be the unit purchase cost for class product and be the fare paid by class customers in period . The unit profit for assigning to is then . If we define as the distance between and and as the unit profit from a class customer being satisfied by a class product, then the reward structure reduces to the one we have already considered in the previous application, i.e., . Then the optimal policy will satisfy a customer with a product class that is the same as or closer to the originally requested product, and assign a product to a customer the same as or closer to the customer class that the product is intended for.
Shumsky and Zhang (2009) study a capacity management problem in which each customer class can only be upgraded one level higher. Figure 5 demonstrates the structure of such a problem. Again, we can think of the customer classes and product classes located on a line segment , where the class product and its intended customer class share the same location.
The infeasibility of upgrading by more than one level makes the problem structurally different from the general upgrading problem. In the one-level-up upgrading problem, the reward structure is the same as before for any , i.e., , with decreasing in time. But for any , , different from the general upgrading problem. As a result, in the one-level-up upgrading problem, two neighboring pairs of demand and supply, and , are not necessarily comparable under . Specifically, , and . Therefore, a necessary condition for is that , which may not be guaranteed in general.
The above argument implies that it may not be optimal to prioritize the matching between a pair of demand and supply intended for each other (i.e., demand type with supply type ), over upgrading (i.e., demand type with supply type , or demand type with supply type ), when the supply is random. In what follows, we investigate the loss of optimality caused by enforcing the aforementioned priority structure.
Remark 2
Our priority structure for the general upgrading problem is consistent with Bassok et al. (1999), who consider a single-period version of the problem with general upgrading. They prove that greedy matching along the specified priority structure (i.e., a product-customer pair has a higher priority if they are closer to each other) is optimal by showing that such a priority structure leads to a classical Monge sequence. However, similar to our arguments above, there no longer exists a Monge sequence when only one-level upgrading is allowed, even in the single-period problem considered by Bassok et al. (1999).
Let be the set of matching policies that prioritize intended pairs over upgrading. More specifically, a policy belongs to if and only if it matches greedily before