Minimizing Multimodular Functions and Allocating Capacity in Bike-Sharing Systems
Work supported in part under NSF grants CCF-1526067, CMMI-1537394, CCF-1522054, and CMMI-1200315.
Abstract
The growing popularity of bike-sharing systems around the world has motivated recent attention to models and algorithms for the effective operation of these systems. Most of this literature focuses on their daily operation for managing asymmetric demand. In this work, we consider the more strategic question of how to (re)allocate dock capacity in such systems. We develop mathematical formulations for variations of this problem (service performance over the course of one day, long-run-average performance) and exhibit discrete convex properties in the associated optimization problems. This allows us to design a polynomial-time allocation algorithm that is fast in practice and computes optimal solutions for this problem; it can also handle practically motivated constraints, such as a limit on the number of docks moved in the system.
We apply our algorithm to data sets from Boston, New York City, and Chicago to investigate how different dock allocations can yield better service in these systems. Recommendations based on our analysis have been adopted by system operators in Boston and New York City. Beyond optimizing for improved quality of service through better allocations, our results also quantify the reduction in rebalancing achievable through strategically reallocating docks.
1 Introduction
As shared vehicle systems, such as bike-sharing and car-sharing, become an integral part of the urban landscape, novel lines of research seek to model and optimize the operations of these systems. In many systems, such as New York City’s Citi Bike, users can rent and return bikes at any location throughout the city. This flexibility makes the system attractive for commuters and tourists alike. From an operational point of view, however, this flexibility leads to imbalances when demand is asymmetric, as is commonly the case. The main contribution of this paper is to identify key questions in the design of operationally efficient bike-sharing systems, to develop a polynomial-time algorithm for the associated discrete optimization problems, and to apply this algorithm to real usage data to investigate the effect this optimization can have in practice.
Most bike-sharing systems are dock-based, meaning that they consist of stations, spread across the city, each of which has a number of docks in which bikes can be locked. If a bike is present in a dock, users can rent it and return it at any other station with an open dock. However, system imbalance often causes some stations to have only empty docks and others to have only full docks. In the former case, users need to find alternate modes of transportation, whereas in the latter they might not be able to end their trip at the intended destination. In many bike-sharing systems, this has been found to be a leading cause of customer dissatisfaction (e.g., Capital Bikeshare (2014)).
In order to meet demand in the face of asymmetric traffic, bike-sharing system operators seek to rebalance the system by moving bikes from locations with too few open docks to locations with too few bikes. To facilitate these operations, a burst of recent research has investigated models and algorithms to make them more efficient and to increase customer satisfaction. While similar in spirit to some of the literature on rebalancing, in this work we use a different control to increase customer satisfaction. Specifically, we answer the question: how should bike-sharing systems allocate dock capacity to stations within the system so as to minimize the number of dissatisfied customers? To answer this question, we consider two optimization models, both based on the underlying metric that system performance is captured by the expected number of customers that do not receive service. In the first model, we focus on planning one day, say from 6am to midnight, where for each station we determine its allocation of bikes and docks; this framework assumes that there is sufficient rebalancing capacity to restore the desired bike allocation by 6am the next morning. Since in practice this turns out to be quite difficult, the second model considers a setup induced by a long-run average, which assumes that no rebalancing is done overnight. The theory developed in this paper enabled extensive computational experiments on real datasets; through these we found that there are dock allocations that simultaneously perform well with respect to both models, yielding improvements to both (in comparison to the current allocation) of up to 25%. In Boston and New York City, system operators are adopting improvements based on our models.
Our Contribution
Raviv and Kolka (2013) defined a user dissatisfaction function that measures the expected number of out-of-stock events at an individual bike-share station. To do so, they define a continuous-time Markov chain on the possible number of bikes (between 0 and the capacity of the station), in which bikes are rented and returned at given rates. Each arrival triggers a change in the state, either decreasing (rental) or increasing (return) the number of available bikes by one. When the number of bikes is 0 and a rental occurs, or equals the station capacity and a return occurs, the customer experiences an out-of-stock event. Using a discrete Markov chain, they approximate the expected number of out-of-stock events over a finite time horizon. For fixed rates, the works of Schuijbroek et al. (2017) and O’Mahony (2015) give different techniques to compute the expected number of out-of-stock events exactly. A recursion suggested by Parikh and Ukkusuri (2014) shows that these methods extend to settings with piecewise-constant rates. We use these techniques to compute the expected number of out-of-stock events that occur over the course of one day at each station, given the numbers of bikes and empty docks (which together make up the station's total number of docks) allocated to that station at the start of the day.
Given these cost functions, one per station, our goal is to find an allocation of bikes and docks in the system that minimizes the total expected number of out-of-stock events across all stations, i.e., the sum of the station cost functions. Since the number of bikes and docks is limited, we need to accommodate a budget constraint on the number of bikes in the system and another on the number of docks in the system. Other constraints are often important, such as lower and upper bounds on the allocation for a particular station; furthermore, through our collaboration with Citi Bike in NYC it also became apparent that operational constraints limit the number of docks moved from the current system configuration. Thus, we aim to minimize the objective among solutions that require at most a given number of docks to be moved. Via standard dynamic programming approaches, our methods also generalize to other practically motivated constraints, such as lower bounds on the allocation within particular neighborhoods (e.g., in Brooklyn).
We first design a discrete gradient-descent algorithm that provably solves the minimization problem with oracle calls to evaluate cost functions and a (in practice, vastly dominated) overhead of elementary list operations. Using scaling techniques and a subtle extension of our gradient-descent analysis, we improve the bound on oracle calls to , which still dominate an term for elementary list operations.
The primary motivation of this analysis is to investigate whether the number of out-of-stock events in bike-share systems can be significantly reduced by a data-driven approach. Through our ongoing collaboration with the operators of New York’s Citi Bike system, it has become evident that current rebalancing efforts overnight are vastly insufficient to realize an optimal (or near-optimal) allocation of bikes for the current allocation of docks, which was designed without any prior knowledge of user demand. In computing an optimal design of the system to address this, the model discussed above still assumes that we can perfectly restore the system to the desired initial bike allocation overnight. Instead, one might consider the opposite regime and focus on optimizing the allocation of docks assuming that no rebalancing occurs at all. To model this, we define an extension of the cost function under a long-run average regime. In this regime, the assumed allocation of bikes at each station is a function of only the number of docks and the estimated demand at that station. Interestingly, our empirical results reveal that bike-share operators can have their cake and eat it too: optimizing dock allocations for one of the objectives (optimally rebalanced or long-run average) yields most of the obtainable improvement for the other. We present the results of these analyses on datasets from Boston, NYC, and Chicago in Section 7. In the same section, we also provide comparisons of the running times of the scaling and the naive gradient-descent algorithm (as well as a hybrid of the two) in different regimes.
Related Work
A recent line of work, including variations by Raviv et al. (2013), Forma et al. (2015), Kaspi et al. (2017), Ho and Szeto (2014), and Freund et al. (2016), considered static rebalancing problems, in which a capacitated truck (or a fleet of trucks) is routed over a limited time horizon. The truck may pick up and drop off bikes at each station, so as to minimize the expected number of out-of-stock events that occur after the completion of the route. These events are evaluated by the objective function of Raviv and Kolka (2013), which we consider as well.
In contrast to this line of work, O’Mahony (2015) addressed the question of allocating both docks and bikes; he uses the user dissatisfaction function (defined over a single interval with constant rates) to design a mixed integer program over the possible allocations of bikes and docks. Our work extends this by providing a fast polynomial-time algorithm for that same problem and extensions thereof. The optimal allocation of bikes has also been studied by Jian and Henderson (2015), Datner et al. (2015), and Jian et al. (2016), with the latter also considering the allocation of docks. (Footnote 1: In fact, the idea behind the algorithm considered by Jian et al. (2016) is based on an early draft of this paper.) They each develop frameworks based on ideas from simulation optimization; while they also treat demand for bikes as being exogenous, their framework captures the downstream effects of changes in supply upstream. Interestingly, the work of Jian et al. (2016) found that these effects are mostly captured by decensoring piecewise-constant demand estimates.
Orthogonal approaches to the question of where to allocate docks have been taken by Kabra et al. (2015) and Wang et al. (2016). The former considers demand as endogenous and aims to identify the station density that maximizes sales, whereas we consider demand and station locations as exogenously given and aim to allocate docks and bikes to maximize the amount of demand that is being met. The latter aims to use techniques from retail location theory to find locations for stations to be added to an existing system.
Further related work includes a line of work on rebalancing triggered by Chemla et al. (2013). Subsequent papers, e.g., by Nair et al. (2013), Dell’Amico et al. (2014), Erdoğan et al. (2014), and Erdoğan et al. (2015), solve the routing problem with fixed numbers of bikes that need to be picked up or dropped off at each station; this work is surveyed by de Chardon et al. (2016). Other approaches to rebalancing include, for example, the papers of Liu et al. (2016), Ghosh et al. (2016), Rainer-Harbach et al. (2013), and Shu et al. (2013). While all of these fall into the wide range of recent work on the operation of bike-sharing systems, they differ from our work in the controls and methodologies they employ.
Finally, a great deal of work has been conducted in the context of predicting demand. In this work, we assume that the predicted demand is given, e.g., using the methods of O’Mahony and Shmoys (2015) or Singhvi et al. (2015). Further methods to predict demand have been suggested by Li et al. (2015), Chen et al. (2016), and Zhang et al. (2016), among others. Our results can be combined with any approach that predicts demand at each station independently of all others.
Relation to Multimodularity
Our algorithms and analyses exploit the fact that the cost function is multimodular (cf. Definition 1) at each station. This provides an interesting connection to the literature on discrete convex analysis. In recent work by Kaspi et al. (2017) it was independently shown that the number of out-of-stock events at a bike-share station with fixed capacity , bikes, and unusable bikes is natural convex in and (see the book by Murota (2003) and the references therein). Unusable bikes effectively reduce the capacity at the station, since they are assumed to remain in the station over the entire time horizon. A station with capacity , bikes, and unusable bikes, must then have empty docks; hence, for , which parallels our result that is multimodular. Though this would suggest that algorithms to minimize convex functions could solve our problem optimally, one can show that convexity is not preserved, even in the version with only budget constraints. (Footnote 2: In Appendix A we provide an example in which a convex function restricted to a convex set is not convex; the example also shows that Murota’s algorithm for convex function minimization can be suboptimal in our setting.) However, since multimodularity is preserved, the techniques of Murota (2004), combined with the submodular function minimization algorithms of Lee et al. (2015), yield an algorithm with running-time guarantee to solve the version with only budget constraints.
By exploiting the separability of our objective function (w.r.t. stations) and the associated multimodularity of each station’s cost function, we obtain algorithms with significantly better running times and quickly find solutions for instances at the scale that typically arises in practice.
2 Model
We denote by a sequence of customers at a bike-share station. The sign of identifies whether customer arrives to rent or to return a bike, i.e., if customer wants to return a bike and if customer wants to rent a bike. The truncated sequence is written as . We denote throughout by and the number of open docks and available bikes at a station before any customer has arrived. Notice that a station with open docks and available bikes has docks in total. Whenever a customer arrives to return a bike at a station and there is an open dock, the customer returns the bike, the number of available bikes increases by 1, and the number of open docks decreases by 1. Similarly, a customer arriving to rent a bike when one is available decreases the number of available bikes by 1 and increases the number of open docks by 1. If, however, a customer arrives to rent (return) a bike when no bike (open dock) is available, then she disappears with an out-of-stock event. We assume that only customers affect the inventory level at a station, i.e., no rebalancing occurs. It is useful then to write
as a shorthand for the number of open docks and available bikes after the first customers.
Our cost function is based on the number of out-of-stock events. In accordance with the above-described model, customer experiences an out-of-stock event if and only if . Since for every , this happens if and only if . Since we are interested in the number of out-of-stock events as a function of the initial number of open docks and available bikes, we can write our cost function as
It is then easy to see that with , fulfills the recursion
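The customer model described above can be simulated directly. The following is a minimal sketch, not the authors' implementation: the function name `out_of_stock_events` and the encoding of arrivals as `+1` (return) and `-1` (rental) are illustrative assumptions that follow the sign convention stated in the text.

```python
from typing import Sequence


def out_of_stock_events(arrivals: Sequence[int], open_docks: int, bikes: int) -> int:
    """Count out-of-stock events at one station (hypothetical helper).

    arrivals: +1 for a customer returning a bike, -1 for a customer renting one.
    open_docks, bikes: the station's state before any customer has arrived.
    """
    events = 0
    for x in arrivals:
        if x == 1:                      # return: needs an open dock
            if open_docks > 0:
                open_docks -= 1
                bikes += 1
            else:
                events += 1             # station full, customer is lost
        elif x == -1:                   # rental: needs an available bike
            if bikes > 0:
                bikes -= 1
                open_docks += 1
            else:
                events += 1             # station empty, customer is lost
        else:
            raise ValueError("arrivals must contain only +1 and -1")
    return events
```

Processing the customers one at a time mirrors the recursive view of the cost: the count after each customer equals the count before her arrival, plus one if she finds no bike (or no open dock).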
Given for each station a distribution, which we call demand profile, over , we can write for the expected number of out-of-stock events at station and . We then want to solve, given budgets on the number of bikes and on the number of docks, a current allocation , a constraint on the number of docks moved, and lower/upper bounds for each station , the following minimization problem
Here, the first constraint corresponds to a budget on the number of docks, the second to a budget on the number of bikes, the third to the operational constraints, and the fourth to the lower and upper bound on the number of docks at each station. We assume without loss of generality that there exists an optimal solution in which the second constraint holds with equality; to ensure that, we may add a dummy ("depot") station that has , , and run the algorithm with a dock budget of .
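For orientation, one way to write a program with the four constraints just described is sketched below. All symbols are assumptions introduced for illustration: $c_i$ for the expected cost at station $i$, $d_i$ and $b_i$ for its open docks and bikes, $B$ for the bike budget, $D+B$ for the dock budget, $(\bar d, \bar b)$ for the current allocation, $z$ for the limit on docks moved, and $l_i, u_i$ for the station bounds. The factor $2$ in the third constraint assumes that each moved dock changes the totals of two stations by one each.

```latex
\begin{align*}
\min_{(d,b)\in\mathbb{N}_0^{n}\times\mathbb{N}_0^{n}} \quad & \sum_{i=1}^{n} c_i(d_i, b_i) \\
\text{s.t.} \quad & \sum_{i=1}^{n} (d_i + b_i) \le D + B, \\
& \sum_{i=1}^{n} b_i \le B, \\
& \sum_{i=1}^{n} \bigl| (d_i + b_i) - (\bar d_i + \bar b_i) \bigr| \le 2z, \\
& l_i \le d_i + b_i \le u_i \qquad \text{for all } i .
\end{align*}
```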
In Section 3 we prove that fulfills the following inequalities and is thus multimodular.
Definition 1 (Hajek (1985), Altman et al. (2000))
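A function $f:\mathbb{N}_0^2\to\mathbb{R}$ with
(1) $f(d+1,b+1)-f(d+1,b)\ \ge\ f(d,b+1)-f(d,b)$
(2) $f(d-1,b+1)-f(d-1,b)\ \ge\ f(d,b)-f(d,b-1)$
(3) $f(d+1,b-1)-f(d,b-1)\ \ge\ f(d,b)-f(d-1,b)$
for all $d, b$ such that all terms are well-defined, is called multimodular. For future reference, we also define the following implied additional inequalities (footnote 3: (6) and (1) are equivalent, (1) and (2) imply (5), and (3) and (6) imply (4)):
(4) $f(d+2,b)-f(d+1,b)\ \ge\ f(d+1,b)-f(d,b)$
(5) $f(d,b+2)-f(d,b+1)\ \ge\ f(d,b+1)-f(d,b)$
(6) $f(d+1,b+1)-f(d,b+1)\ \ge\ f(d+1,b)-f(d,b)$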
Even though we are motivated by the cost functions defined in this section, our main results hold for arbitrary sums of two-dimensional multimodular functions.
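Multimodularity can be checked numerically on a finite grid. The sketch below assumes that inequalities (1) to (3) read $c(d+1,b+1)-c(d+1,b) \ge c(d,b+1)-c(d,b)$, $c(d-1,b+1)-c(d-1,b) \ge c(d,b)-c(d,b-1)$, and $c(d+1,b-1)-c(d,b-1) \ge c(d,b)-c(d-1,b)$; this form is a reading consistent with the implications listed in footnote 3, not a quotation. The helper names `out_of_stock` and `is_multimodular` are hypothetical.

```python
from itertools import product


def out_of_stock(arrivals, d, b):
    """Out-of-stock events for arrivals (+1 return, -1 rental), d open docks, b bikes."""
    events = 0
    for x in arrivals:
        if x == 1:
            if d > 0:
                d, b = d - 1, b + 1
            else:
                events += 1
        else:
            if b > 0:
                d, b = d + 1, b - 1
            else:
                events += 1
    return events


def is_multimodular(c, max_d, max_b):
    """Check the assumed inequalities (1)-(3) for all 0 <= d <= max_d, 0 <= b <= max_b."""
    for d, b in product(range(max_d + 1), range(max_b + 1)):
        # (1): defined for all d, b >= 0
        if c(d + 1, b + 1) - c(d + 1, b) < c(d, b + 1) - c(d, b):
            return False
        if d >= 1 and b >= 1:
            # (2): needs d - 1 >= 0 and b - 1 >= 0
            if c(d - 1, b + 1) - c(d - 1, b) < c(d, b) - c(d, b - 1):
                return False
            # (3): the same inequality with the two coordinates exchanged
            if c(d + 1, b - 1) - c(d, b - 1) < c(d, b) - c(d - 1, b):
                return False
    return True
```

For example, `is_multimodular(lambda d, b: out_of_stock([1, -1, -1], d, b), 3, 3)` checks the cost induced by one return followed by two rentals. A finite check like this illustrates the definition but does not replace a proof.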
3 Multimodularity & An Allocation Algorithm
This section is structured as follows: we first prove that the cost functions defined in Section 2 are multimodular. Next, we define a natural neighborhood structure on the set of feasible allocations and define a discrete gradient-descent algorithm on this neighborhood structure. We end the section with a proof that solutions that are locally optimal with respect to the neighborhood structure are also globally optimal; this proves that the algorithm finds optimal solutions.
Lemma 1
is multimodular for all .
Proof. We prove the lemma by induction, showing that is multimodular for all . With , by definition, and thus there is nothing to show. Suppose that through are all multimodular. We prove that is then multimodular as well.
We begin by proving inequality (1). Notice first that if
we can use that inequality (1), by inductive assumption, holds after customers. Else, we use the inductive assumption on inequality (4) and (5) to prove inequality (1). If (and ), then both sides of the inequality are 0 and , , , and . In that case, we may use the inductive assumption on inequality (5) applied to the remaining customers. If instead (and ), then both sides of the inequality are and we have , , , and , so we may apply inequality (4) inductively to the remaining .
It remains to prove inequalities (2) and (3). We restrict ourselves to inequality (2) as the proof for inequality (3) is symmetric with each replaced by and the coordinates of each term exchanged. As before, if
the inductive assumption applies. If instead and the maximum is positive, then the LHS and the RHS are both 0 and we have , , , . In that case, both sides of the inequality are subsequently coupled and the inequality holds with equality.
In contrast, if and the maximum is positive, then , the RHS is 1, and the LHS is 0. In this case we have , , , . Let denote the next customer such that one of the four terms changes.
If , then both terms on the LHS increase by 1, so it remains 0, whereas only the negative term on the RHS increases, so the inequality holds with . Moreover, since , and ; subsequently both sides of the inequality are again coupled.
Finally, if , then both terms on the RHS increase by 1 with customer , but only the negative term on the LHS. Thus, thereafter both sides are again equal. In this case as well, both sides remain coupled thereafter since we have , and . ∎
Corollary 1
is multimodular for any demandprofile .
Proof. The proof is immediate from Lemma 1 and linearity of expectation.∎
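To make the step from a single customer sequence to a demand profile concrete, the sketch below computes the expected cost for a profile with finite support, given as hypothetical `(probability, sequence)` pairs. Because the expected cost is a nonnegative weighted sum of per-sequence costs, every inequality that holds for each sequence also holds for the weighted sum. The function names are illustrative assumptions.

```python
def out_of_stock(arrivals, d, b):
    """Out-of-stock events for arrivals (+1 return, -1 rental), d open docks, b bikes."""
    events = 0
    for x in arrivals:
        if x == 1:
            if d > 0:
                d, b = d - 1, b + 1
            else:
                events += 1
        else:
            if b > 0:
                d, b = d + 1, b - 1
            else:
                events += 1
    return events


def expected_cost(profile, d, b):
    """Expected out-of-stock events for a finite demand profile.

    profile: iterable of (probability, arrival_sequence) pairs whose
             probabilities are assumed to sum to 1.
    """
    return sum(p * out_of_stock(seq, d, b) for p, seq in profile)
```

With `profile = [(0.5, [1]), (0.5, [-1])]`, a station with no open dock and one bike loses the returning customer but serves the renting one, so `expected_cost(profile, 0, 1)` evaluates to `0.5`.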
3.1 An Allocation Algorithm
We next present our algorithm for settings without the operational constraints. Intuitively, in each iteration the algorithm picks one dock and at most one bike within the system and moves them from one station to another. It chooses the dock, and the bike, so as to maximize the reduction in objective value. To formalize this notion, we define the movement of a dock via the following transformations.
Definition 2
We shall use the notation . Similarly, . Then a dockmove from to corresponds to one of the following transformations of feasible solutions:

– Moving one open dock from to ;

– Moving a dock & a bike from to ;

– Moving a dock from to and one bike from to ;

– Moving one bike from to and one open dock from to (equivalently, one full dock from to ).
Further, we define the neighborhood of as the set of allocations that are one dockmove away from . Formally,
Finally, define the dockmove distance between and as .
This gives rise to a very simple algorithm: we first find the optimal allocation of bikes for the current allocation of docks; the convexity of each in the number of bikes, with fixed number of docks, implies that this can be done greedily by taking out all the bikes and then adding them one by one. Then, while there exists a dock-move that improves the objective, we find the best possible such dock-move and update the allocation accordingly. Once no improving move exists, we return the current solution.
Remark. A fast implementation of the above algorithm involves six binary heaps, one for each of the six possible ways in which the objective at a station can be affected by a dock-move: an added bike, a removed bike, an added empty dock, a removed empty dock, an added full dock, or a removed full dock. In each iteration, we use the heaps to find the best possible move (in time) and update only the values in the heaps that correspond to the stations involved. The latter requires a constant number of oracle calls to evaluate the cost functions locally, as well as heap operations that can be implemented in amortized time.
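The procedure can be sketched as follows. This is a naive illustration under stated assumptions, not the heap-based implementation from the remark: it rescans the whole neighborhood in every iteration, ignores station bounds and the limit on docks moved, and places exactly as many bikes as the starting allocation contains. The encoding of the four dock-moves and all function names are assumptions made for this example.

```python
from itertools import permutations


def out_of_stock(arrivals, d, b):
    """Out-of-stock events for arrivals (+1 return, -1 rental), d open docks, b bikes."""
    events = 0
    for x in arrivals:
        if x == 1 and d > 0:
            d, b = d - 1, b + 1
        elif x == -1 and b > 0:
            d, b = d + 1, b - 1
        else:
            events += 1
    return events


def make_cost(sequences):
    """Station cost: total out-of-stock events over a list of sample sequences."""
    return lambda d, b: sum(out_of_stock(s, d, b) for s in sequences)


def total_cost(costs, d, b):
    return sum(c(di, bi) for c, di, bi in zip(costs, d, b))


def greedy_bikes(costs, capacity, n_bikes):
    """Empty every station, then add bikes one by one where the marginal cost is lowest."""
    d, b = list(capacity), [0] * len(capacity)
    for _ in range(n_bikes):  # assumes n_bikes <= sum(capacity)
        i = min((s for s in range(len(d)) if d[s] > 0),
                key=lambda s: costs[s](d[s] - 1, b[s] + 1) - costs[s](d[s], b[s]))
        d[i], b[i] = d[i] - 1, b[i] + 1
    return d, b


def neighbors(d, b):
    """Yield allocations one dock-move away; changes are (station, delta_d, delta_b)."""
    n = len(d)
    for i, j in permutations(range(n), 2):
        moves = [[(i, -1, 0), (j, 1, 0)],   # open dock from i to j
                 [(i, 0, -1), (j, 0, 1)]]   # dock together with a bike from i to j
        for h in range(n):
            if h not in (i, j):
                # full dock leaves i, arrives open at j, its bike goes to h
                moves.append([(i, 0, -1), (j, 1, 0), (h, -1, 1)])
                # open dock leaves i, arrives full at j, the bike comes from h
                moves.append([(i, -1, 0), (j, 0, 1), (h, 1, -1)])
        for move in moves:
            nd, nb = list(d), list(b)
            for s, dd, db in move:
                nd[s] += dd
                nb[s] += db
            if min(nd) >= 0 and min(nb) >= 0:
                yield nd, nb


def allocate(costs, d0, b0):
    """Greedy bike placement followed by best-improvement dock-moves."""
    capacity = [di + bi for di, bi in zip(d0, b0)]
    d, b = greedy_bikes(costs, capacity, sum(b0))
    while True:
        best = min(neighbors(d, b), key=lambda a: total_cost(costs, *a), default=None)
        if best is None or total_cost(costs, *best) >= total_cost(costs, d, b):
            return d, b
        d, b = best
```

As a usage example, take one station that only sees three returns and one that only sees three rentals: `allocate([make_cost([[1, 1, 1]]), make_cost([[-1, -1, -1]])], [0, 3], [1, 2])` returns `([3, 0], [0, 3])`, i.e., all open docks at the first station and all bikes at the second. Scanning the full neighborhood costs on the order of $n^3$ cost evaluations per iteration, which is why the heap-based bookkeeping in the remark matters at realistic scale.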
3.2 Proof of Optimality
We prove that the algorithm returns an optimal solution by showing that the condition in the while-loop is false only if the current allocation minimizes the objective; in other words, if an allocation is locally optimal with respect to the neighborhood structure, then it is globally optimal. Thus, if the algorithm terminates, the solution is optimal. Before we prove Lemma 3 to establish this, we first define an allocation of bikes and docks as bike-optimal if it minimizes the objective among allocations with the same number of docks at each station, and prove that bike-optimality is an invariant of the while-loop.
Definition 3
Define an allocation as bikeoptimal if
Lemma 2
Suppose is bikeoptimal. Given and , one of the possible dockmoves from to , i.e., or , is bikeoptimal. Equivalently, when moving a dock from to , one has to move at most one bike within the system to maintain bikeoptimality.
Proof. It is known that multimodular functions fulfill certain convexity properties (see e.g., Murota (2003), Raviv and Kolka (2013)); in particular, for fixed and it is known that is a convex function of . Thus, if the best allocation out of and , was not bikeoptimal, there would have to be two stations such that moving a bike from one to the other improves the objective. By the bikeoptimality of , at least one of these two stations must have been involved in the move. We prove that the result holds if was the best of the set of possible moves – the other three cases are almost symmetric. Let denote a generic third station. Then a bike improving the objective could correspond to one being moved from to , from to , from to , from to , from to or from to . In this case, a move from to , to and to yield the allocations , and , respectively. Since is assumed to be the minimizer among the possible dockmoves, none of these have objective smaller than that of . It remains to show that moving a bike from to , to or to yields no improvement. These all follow from bikeoptimality of and the multimodular inequalities. Specifically, an additional bike at yields less improvement and a bike fewer at has greater cost in than in , since
Both of the above inequalities follow from inequality (3).∎
By Lemma 2, to prove optimality of the algorithm, it now suffices to prove that bike-optimal solutions that are locally optimal w.r.t. our neighborhood structure are also globally optimal.
Lemma 3
Suppose is bikeoptimal, but does not minimize subject to budget constraints. Let denote a better (feasible) solution at minimum dockdistance from . As is bikeoptimal, there exist and such that and . Pick any such and ; then either there exists a dockmove to or one from that improves the objective.
Proof. The proof of the lemma follows a case-by-case analysis, in which each case relies on the same idea: minimizes the dock-move distance to among solutions with lower function value than , i.e., among all such that , , and , has minimum dock-move distance to . We show that with and as in the statement of the lemma, either there exists a dock-move to /from that improves the objective or there exists a solution with objective value lower than , , and , such that has smaller dock-move distance to . Since the latter contradicts our choice of , this proves that in there must be a dock-move to /from that yields a lower objective. We distinguish among the following cases:

and ;

and ;

, , and

and there exists with , ;

and there exists with , ;

for all , we have , so ;


, , and ,

and there exists with and ;

and there exists with and ;

for all , we have , so .

We show that in case (1) a move from to yields improvement. The proof for case (2) is symmetric. Thus, in cases (3a) and (4a) there exists a move from to , respectively from to , that yields improvement. Since the proofs for cases (3b) and (4b) are also symmetric, we only present the proofs for (3b). Cases (3c) and (4c) contradict our assumption that and can thus be excluded. For case (1), we define , so
Given that , the definition of implies that this difference must be positive. Setting , we bound
We prove the inequality between the second and third expression by first showing that
Applying inequality (3) given in the definition of multimodularity, times bounds the righthand side (RHS) by . Setting , we then find that the RHS is bounded above by
On the other hand, applying inequality (6) repeatedly to the lefthand side (LHS) shows that , the LHS is at least . Hence, by setting , which is nonnegative since , we bound the LHS from below by
This equals the upper bound on the RHS and thus proves the desired inequality. Similarly, to show
(7) 
we apply inequality (3) times to bound the LHS in (7) by . Thereafter, we apply inequality (5) times to obtain the desired bound.
In case (3b), we define and . Similarly to the first case, we need to show that . Since all terms not involving and cancel out and the terms involving and can be bounded the same way as before, deriving
suffices. We obtain this by repeatedly applying inequalities (3) and (4) to the LHS. ∎
4 Operational Constraints & Running Time
In this section, we show that the allocation algorithm is optimal for the operational constraints introduced in Section 2 by proving that in iterations it finds the best allocation obtainable by moving at most docks. We thereby also provide an upper bound on the running time of the algorithm, since any two feasible dock allocations can be at most dock-moves apart. We begin by formally defining the set of feasible solutions with respect to the operational constraints.
Definition 4
Define the dock ball around as the set of allocations with dockmove distance at most , i.e., and
We now want to prove that Lemma 3 continues to hold in the constrained setting; in particular, we show that with the operational constraints as well, local optima are global optima.
Lemma 4
With bikeoptimal and for some , there exists such that .
Proof. Notice that this lemma closely resembles Lemma 3: the sole difference lies in Lemma 3 not enforcing the dock-move to maintain a bound on the distance to some allocation .
Define as in Lemma 3 with the additional restriction that be in , i.e., pick a solution in that minimizes the dockmove distance to among solutions with strictly smaller objective value. We argue again that bikeoptimality of implies that there exist and , such that , and . Further, for any such and , we can apply the proof of Lemma 3 to find a move involving at least one of the two that decreases both the objective value and the dockmove distance to .
We aim to find and such that the move identified, say from to , is guaranteed to remain within . Notice that . We know that and . Suppose the move from to yields a solution outside of . It follows that and , so in particular either or . Thus, if we can identify and such that those two inequalities do not hold, we are guaranteed that the identified move remains within . Define We can then write
The latter is at least 1 unless it is the case for all that if then . Thus, unless the above condition fails, we have identified a with the required properties. Suppose the condition does fail. Then