An optimization model for line planning and timetabling in automated urban metro subway networks
Abstract.
In this paper we present a Mixed Integer Nonlinear Programming model that we developed as part of a pilot study requested by the R&D company Metrolab®^{1} in order to design tools for finding solutions to line planning and timetabling problems in automated urban metro subway networks. Our model incorporates important factors in public transportation systems from both a cost-oriented and a passenger-oriented perspective, such as time-dependent demands, interchange stations, short-turns and technical features of the trains in use. The incoming flows of passengers are modeled by means of piecewise linear demand functions which are parameterized in terms of arrival rates and bulk arrivals. Decisions about frequencies, train capacities, short-turning and timetables for a given planning horizon are jointly optimized in our model. Finally, a novel Math-Heuristic approach is proposed to solve the problem. The results of extensive computational experiments are reported to show its applicability and effectiveness in handling real-world subway networks.
^{1} Société Metrolab®, Service Contrôle de Gestion, registered on the Paris Trade and Companies Register under the number 532 684 685 RCS Paris, with its registered office at 117/119 Quai de Valmy, 75010 PARIS, FRANCE.
Key words and phrases:
Line planning, short-turns, timetabling, Mixed Integer Nonlinear Programming, Math-heuristic.
1. Introduction
In this paper, we propose a model for line planning and timetabling on general urban subway transportation systems. This study originated from a real-world problem proposed by Metrolab®, a French R&D company, dealing with the line planning and timetabling of trains on existing subway networks. It was a pilot experience to automate the decision-making process, at the tactical and operational levels, of a small section of the Paris subway network.
The development of flexible tools to control, automatically, a transportation system according to a set of indicators of its service quality may have a considerable impact on its efficiency and usefulness. The quality perception of a public transportation system from a customer’s point of view is highly dependent on its reliability, comfort and effectiveness when compared with alternative means of transportation. If only poor-quality connections are offered or the quality-price relationship does not fulfil the passengers’ expectations, they may decide to use alternative means of transportation. Therefore, the quality of a public transportation system, from the passengers’ point of view, is a key objective in its design and management, besides infrastructure constraints, operational limitations or budget considerations [12].
The existing literature on line planning is very extensive (see e.g., [11, 12, 25] and the references therein), including different models which can be classified according to the decisions covered (determination of train routes, frequency setting, or both), infrastructure and operational aspects, objective functions and the way in which passengers’ decisions are taken into account in the decision-making process. For instance, in [12], line planning models are classified with respect to their objective functions into models with cost-oriented or with passenger-oriented objective functions. Our model will consider both points of view in an attempt to find an equilibrium between these conflicting objectives which, as mentioned in [14], is an important challenge in a public transportation system.
However, building an effective model is much more complex than selecting the appropriate nature of the objective function. Correctly delimiting the features considered, both in the context of the transportation system and in the set of customers served by this system, is a very important phase of the actual modeling. Often, an initial line plan must be re-engineered because of changes in customers’ flows induced by changes in passengers’ route choices [12]. This gives rise to a bilevel optimization problem with a line planning problem on the upper level and a passenger’s route choice problem on the lower level. Moreover, the existence of several decision-makers is not the only difficulty of this problem. There exist uncertainties in the number of passengers that must be served (demands), in the origin-destination pairs that customers want to travel between or in the times needed to cover the network links due, for instance, to machine failures or other incidents. In this situation it seems hard to integrate all these elements in a suitable optimization model.
Usually, some simplifications must be assumed in order to obtain operational solutions for realistic situations. The usefulness of the model will be strongly conditioned by these assumptions. Having a valid model for a wide range of scenarios might be a lofty target but one worth aiming for. The resulting approximation can be seen not only as a model to optimize the existing resources in a given transportation system but also as a what-if tool to make rational decisions. For instance, it can be used to check the implementation of possible modifications in the structure of the existing network (adding stations, new connections, …) or in the conditions (demand variation, service disruptions, …) under which the transportation system is working at present.
Line planning is only one of the planning process’s stages. Indeed, following [11, 21, 26], the planning process in public transportation includes several phases that usually are sequentially executed in the following order:

network design, where the stations, links and routes of the lines are established,

line planning, specifying the frequency and the capacity of the vehicles used in each line (line concept, [12]),

timetabling, defining the arrival/departure times and

scheduling, in which vehicles and/or crews are planned.
The first phase, namely network design, is done at the strategic level and implies a high cost (see e.g., [3]). Moreover, the remaining steps involve decisions at the tactical and operational levels which are conditioned by that design. Thus, it seems appropriate to assess different designs by means of a what-if analysis based on the efficiency of the system under a given scenario. In that case, after initializing with a reasonable design, the procedure could optimize the efficiency of the transportation system using tactical/operational decisions. This may reveal possible weaknesses of the current design and, after fixing some of them, will give rise to a new design. The process could be repeated until a compromise transportation design is found.
In addition to the literature on line planning discussed above, one may also find a rich literature on timetabling and scheduling (see e.g. [1, 7, 8, 19, 27] for timetabling models and [6, 10, 28] for scheduling models). Timetabling models are usually classified according to the capacity of the transportation system or the requirements of passengers. Regarding the scheduling literature, models can be classified into single and multiple depot frameworks [4]. Besides, we can find periodic and time-dependent timetabling and scheduling models and many other variants depending on the considered constraints. Considerable effort has also been made to analyze the practical effects of the different elements or features covered by these models. For instance, as mentioned in [22], periodic timetables make it easy for passengers to remember the exact departure times at stations but, in general, they are not fully sensitive to time-varying passenger demands, which could result in long waiting times and reduced service reliability, particularly under irregular oversaturated conditions.
Once the framework of the transportation system has been delimited, the resulting model should be optimized. However, as pointed out in [26], going through the above-mentioned stages of the planning process sequentially often leads only to suboptimal solutions. Recent research (see [13, 14, 18, 24, 26]) is increasingly oriented towards integrated planning in which two or even more of these planning stages are simultaneously addressed. These integrated optimization models are frequently superior to those that optimize the considered stages sequentially, as has been recognized in the literature (see e.g., [26] and the references therein).
In this paper we propose a new integrated model in which line planning and timetabling are simultaneously optimized using a combination of cost-oriented and passenger-oriented objective functions. In our model, time-dependency of demands is considered in addition to two other elements that, as far as we know, have only been addressed separately in the literature: short-turns and interchange stations.
Short-turning is a tactical decision by which some trains can perform short cycles in order to increase frequency in specific sections of a line. In general, due to their analytical complexity, the approaches to manage short-turning in the context of railway planning are based on particular cases and there are no general models that can be applied without modifications to every situation [5]. Besides, most of the literature analyzing short-turning in a railway context is limited to a single two-way transit line [5, 9, 20, 29]. Our model incorporates decisions concerning the activation of short-turns simultaneously in several lines of the network.
On the other hand, interchange or transfer stations are shared by several lines in the system, allowing passengers to change from one line to another. Papers dealing with interchange stations (see e.g. [15, 16, 30]) usually aim to minimize the total transfer waiting time of passengers by synchronizing train arrival times at transfer stations. Here we manage the effective flows of passengers at the interchange stations and compute the corresponding effects both on the quality of the service and on technical constraints, such as those related to the capacities of the trains.
The final aim is to model a real system, inspired by one initially proposed by Metrolab®, using its most relevant features while preserving computational tractability. This is an important challenge taking into account the combinatorial, stochastic, multilevel and multiobjective nature of the problem. The resulting outcome is a Mixed Integer Second Order Cone Programming model, which can be solved using off-the-shelf optimization solvers, but only for limited sizes. For larger sizes we propose a Math-Heuristic approach in which the system is decoupled into different lines. Afterward, each subsystem is optimized individually but including in the input data the flows of passengers generated after optimizing other lines. The process is repeated as in a block coordinate descent procedure (see e.g., [2]) until a given stopping rule is satisfied.
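As an illustration only, the decoupling loop just described might be organized as in the following Python sketch. The function names, the representation of transfer flows as one number per line, the tolerance-based stopping rule and the toy per-line solver are all assumptions made for this example; in the actual approach each subproblem is a mixed integer model for one line.

```python
from typing import Callable, Dict, Sequence, Tuple


def decoupled_line_heuristic(
    lines: Sequence[str],
    solve_line: Callable[[str, Dict[str, float]], float],
    tol: float = 1e-6,
    max_rounds: int = 50,
) -> Tuple[Dict[str, float], int]:
    """Re-optimize one line at a time, block-coordinate style.

    solve_line(line, incoming) is a hypothetical per-line optimizer: it takes
    the transfer flows produced by the other lines and returns the transfer
    flow that `line` sends to the rest of the network.
    """
    transfer = {line: 0.0 for line in lines}
    for round_idx in range(1, max_rounds + 1):
        largest_change = 0.0
        for line in lines:
            # Flows generated by the other lines are input data for this line.
            incoming = {k: v for k, v in transfer.items() if k != line}
            new_flow = solve_line(line, incoming)
            largest_change = max(largest_change, abs(new_flow - transfer[line]))
            transfer[line] = new_flow
        # Assumed stopping rule: no flow changed by more than tol in this round.
        if largest_change <= tol:
            return transfer, round_idx
    return transfer, max_rounds


# Toy stand-in for the per-line optimization (not the model of this paper).
BASE = {"A": 10.0, "B": 20.0}


def toy_solver(line: str, incoming: Dict[str, float]) -> float:
    return BASE[line] + 0.5 * sum(incoming.values())


flows, rounds = decoupled_line_heuristic(["A", "B"], toy_solver)
```

In this toy setting the loop settles on flows that are mutually consistent across the two lines; with real subproblems no such convergence is guaranteed, which is why a cap on the number of rounds is included.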
The remainder of this paper is organized as follows. Section 2 deals with a detailed description of the optimization problem and its main elements. In Section 3 we present the Mathematical Programming formulation of the problem. The demand function modelling how the flow of passengers entering into certain stations of a line changes according to the effects of the other lines or due to external factors is detailed in Section 4. The usefulness of the proposed model is illustrated in Section 5 with a case study using real data on a section of the Paris subway provided by Metrolab®. Our Math-Heuristic approach is proposed in Section 6 and the corresponding computational results, including its comparison with the exact method, are reported in Section 7 using several network topologies adapted from the literature. The paper ends with a section where some conclusions and future extensions of the model are outlined.
2. Problem description
Let us assume that the technical features of a public transportation network, which is a part of a complex underground train system, are known. Our goal is to model the problem of how to operate different metro lines on this network according to a set of technical requirements and a given structure of the demand for this service. Suppose that a set of routes for the potential lines (lines pool) and a set of interchange stations are specified. Furthermore, some other factors involved in the system performance such as passenger flows, set of possible train capacities, maximum number of allowed trips in a given line during the planning horizon, demand fluctuation (e.g., rush/off-peak hours) or stopping time windows, amongst others, are assumed to be also available. The goal is to model such a system using its most relevant features while preserving computational tractability. In the description of our approach we will consider three main blocks: input data, feasible actions and assessment of a particular solution, together with some additional specifications of our model.
2.1. Input data
In addition to the input network, including the topological route map and the interchange stations, the main block of input data corresponds to the passenger flow amongst the considered set of stations. Obviously, the number of passengers awaiting in a specific station for the next train which connects to a given destination is a stochastic process. Furthermore, the stochastic processes corresponding to the set of considered stations are interdependent due to the flow relationships amongst the stations which, in addition, may change over time. In our model, given that we deal with a heavily congested subway line, these stochastic processes will be replaced by average rates of the number of passengers. The dynamic dependence of these processes on time should be preserved in some way in the optimization model since it is one of the relevant elements in order to obtain realistic operating solutions. We will do it through a function measuring the intensity of the demand.
In our framework, we assume that two main factors affect the passenger flows: transitions between stations or lines and the arrival of external demand functions. The first one refers to the behaviour of the passengers with respect to the mobility pattern. These data fix an assignment for the destination of the passengers catching a train in a given station, which is known in the literature as line planning with route assignment [12]. The alternative approach of line planning with route choice seems to be more appropriate only in those transportation systems with a high density of connections (with alternative paths between two given locations) and low trip frequency. However, this is not the case in our model since these two features rarely appear in underground train systems.
We will use an Origin-Destination (OD) matrix per line having as entries the proportions of passengers moving between pairs of stations of the line and also a set of values quantifying the proportion of passengers who want to change from one metro line to another at each of the transfer stations. These proportions may be considered as estimates of the probabilities with which a passenger moves through the network and could depend on the dynamic nature of the transportation system. Following [26], in order to model the passengers’ flows, OD-data gives rise to more realistic applications than those based on traffic loads since the paths followed by passengers depend strongly on the line concept. As observed by several authors ([12, 26]), the optimization models derived from the management of OD-data are often harder to solve numerically, and thus approximate ad hoc algorithms need to be used to deal with problems of realistic sizes.
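To make the OD proportions concrete, the following Python sketch turns hypothetical passenger counts for a three-station one-directional line into per-origin proportions. The function name, the count data and the normalization by origin (each row summing to one over downstream stations) are assumptions for illustration; transfer proportions at interchange stations are not represented here.

```python
from typing import List


def od_proportions(counts: List[List[float]]) -> List[List[float]]:
    """Convert OD counts into proportions per origin station.

    counts[i][j] is the (hypothetical) number of passengers boarding at
    station i and alighting at station j. Only j > i is meaningful because
    each line is traversed in a single direction.
    """
    n = len(counts)
    proportions = [[0.0] * n for _ in range(n)]
    for i in range(n):
        downstream_total = sum(counts[i][j] for j in range(i + 1, n))
        if downstream_total > 0:
            for j in range(i + 1, n):
                proportions[i][j] = counts[i][j] / downstream_total
    return proportions


# Hypothetical counts for stations 0 -> 1 -> 2.
counts = [
    [0, 30, 70],
    [0, 0, 50],
    [0, 0, 0],
]
props = od_proportions(counts)
```

Entries on or below the diagonal stay at zero, which mirrors the fact that passengers on a one-directional line can only travel to later stations.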
The second concept, the arrival of external demand functions, refers to the intensity of use of the transportation network. The external demand models the incoming flow of passengers entering the system from outside during the planning horizon. These functions determine the relative importance, in the overall planning cost, of the stations used to access the system, and this importance may change over time. Also, rush hours at given time periods, irregular weather conditions, or events held close to stations may increase or decrease the incoming flow of passengers taken into account in our model.
These external demand functions, together with the OD-data corresponding to movements between pairs of stations and the proportions of transfer passengers, give rise to a model having time-dependent passenger flows. Time-dependent flows are an appropriate feature of a realistic approach for traffic planning and, as pointed out in [26], at present, there is not much research literature covering integrated optimization planning models under these conditions.
2.2. Feasible actions
In our model, the line concept design is specified by choosing the operating frequencies and the train capacities for each line considered in the lines pool. Furthermore, our line concept design allows short-turning in some lines, i.e., the possibility of activating, for some or all the trains, and at certain time periods, short cycles, in order to increase the frequency in specific (consecutive) stations suffering from intensive demand. This situation is typical in lines which connect distant residential areas with the city center or economic centers.
In the integrated planning model, the line concept design is optimized together with their corresponding timetable. Both elements define the feasible solutions of the problem once a set of technical constraints, involving passenger flows and leaving/arrival times, is specified.
The line concept design will be the main source of discrete decision variables for the formulation of our problem, for instance, the selection among a finite set of capacities for the trains. In contrast, the actual number of trips of a given line will be modeled using a finite set of replicas of continuous variables corresponding to potential timetables. On the basis of these decision variables a number of additional auxiliary variables will be considered in the optimization model in order to control the times at which different events happen at each station during the planning horizon. In our model, unlike most of the approaches deriving optimal timetables in transportation systems, the departure times are not discretized and the period of time elapsing between consecutive train departures is not constrained to be constant. Thus, it provides more flexible decisions as well as timetables sensitive to the changeable conditions in the passengers’ flow during the planning horizon, resulting, in general, in an aperiodic timetable.
Different train speeds are not taken into account in our line concept design because the pilot proposal by Metrolab® only considered constant and fixed speeds between each pair of consecutive stations. Nevertheless, continuous variables modeling the speed of a train between two consecutive stations could be easily added to our model as explained in Remark 1.
2.3. Assessing a planning solution
As noted above, due to the multiobjective nature of the problem, one of the most difficult modeling issues is that of assessing a feasible solution (line concept + timetable). Maintenance/operational planning costs are usually easy to handle as a part of the objective function. However, in order to consider also the passenger-oriented nature of the objective, the cost induced by the quality of the service provided by the system should be included in the objective function. This will be incorporated into the model by quantifying the cost of unmet (non-served) demand, that is, passengers who cannot take a train due to lack of capacity. Non-served passengers contribute a given amount, in terms of costs, due to their loss of confidence in the transportation system, their balking rate or a combination of these two and some other factors. Obviously, calibrating these costs is not an easy task, but it may be partially handled by managing a finite set of cost estimates and solving the problem for each one of them in order to evaluate the influence of these hard-to-calibrate parameters on the proposed solution.
2.4. Specifications of our model
In the following we list the assumptions that are imposed to derive a suitable Mathematical Programming formulation for our model.
Planning horizon:

Our model considers a continuous time interval in which all the events must start and decisions occur. The length of this interval depends on the data collection accuracy and the within-day variability of the demand. This planning horizon is fixed a priori but, as mentioned in Remark 2, our model allows two consecutive planning horizons to be joined by passing data about numbers of passengers and arrival/departure times obtained from an optimal solution on a given planning period as input data for the next one.
One direction trips:

We assume that each line in the pool operates only in one direction, from a given head of line station to the final one. Usual round-trip lines are modeled as two symmetric lines by reversing the order of the stations. A trip consists of traversing all the ordered stations of a given line, from the head to the final station. Thus, a physical round-trip starting and ending at the same station will be given by two lines sharing the stations but in the opposite order.
Short-turns:

We consider that specific sections of consecutive stations in certain lines are allowed to be activated in some of the trips.
Interchange stations:

We consider that the lines in the lines pool may share common interchange stations where some of the passengers change lines to reach their final destination.
Train capacities:

The model assumes that a finite set of admissible capacities for the trains operating a line is given.
Safety interval:

A minimum safety time window between consecutive trips in any line is established.
Maximum number of trips:

We assume, w.l.o.g., that the maximum number of possible trips in each line is given beforehand. Note that an upper bound on this maximum number can be obtained taking into account the length of the planning horizon and the safety interval. In our formulation, we resort to a set of decision variables controlling the times at which different events happen. These variables must be replicated as many times as the maximum number of trips, although only some of those trip-variables are activated and then represent actual trips. Hence, the size of the formulation strongly depends on this maximum number. The idea is to consider that the potentially variable number of trips of the line concept is fixed to the maximum number although, in fact, some of them are fake trips. This trick will ease the task of building constraints to ensure non-overlapping events and the safety interval between consecutive trips in the stations of a given line.
Piecewise linear cumulative incoming demand:

We model the accumulated number of passengers arriving at each station up to a given instant during the planning horizon by the so-called demand function. With this function we manage variable arrival rates during the planning horizon and bulk arrivals due to special events, for instance the end of a football match at a nearby location, the arrival of passengers coming from another means of transportation or, in the interchange stations, the arrival of passengers coming from another line of the subway network. As proposed by Metrolab® for its pilot experience, we consider that this function is a piecewise linear function of time (further details are given in Section 4).
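The demand function can be pictured with a small Python sketch. It is only one possible reading of the description above: arrival rates are taken as constant on each interval between breakpoints and bulk arrivals are treated as jumps at given instants. The function name, argument layout and numbers are hypothetical, and the precise definition used in the model is the one given in Section 4.

```python
from typing import Iterable, Sequence, Tuple


def cumulative_demand(
    t: float,
    breakpoints: Sequence[float],
    rates: Sequence[float],
    bulk_arrivals: Iterable[Tuple[float, float]] = (),
) -> float:
    """Accumulated number of passengers that arrived at a station up to time t.

    breakpoints: increasing instants b_0 < b_1 < ... < b_K of the horizon.
    rates[k]: arrival rate (passengers per time unit) on [b_k, b_{k+1}].
    bulk_arrivals: (instant, size) pairs, e.g. a crowd leaving a stadium.
    """
    if len(rates) != len(breakpoints) - 1:
        raise ValueError("need exactly one arrival rate per interval")
    t = min(t, breakpoints[-1])  # nothing is counted after the last breakpoint
    total = 0.0
    for k, rate in enumerate(rates):
        start, end = breakpoints[k], breakpoints[k + 1]
        if t <= start:
            break
        total += rate * (min(t, end) - start)
    # Bulk arrivals add their whole size from their instant onwards.
    total += sum(size for when, size in bulk_arrivals if when <= t)
    return total


# Hypothetical data: 2 pax/min for 30 min, then 5 pax/min, plus 100 pax at t=45.
bp, rt, bulk = [0.0, 30.0, 60.0], [2.0, 5.0], [(45.0, 100.0)]
arrived_between_trains = cumulative_demand(50.0, bp, rt, bulk) - cumulative_demand(40.0, bp, rt, bulk)
```

The difference of two evaluations gives the passengers who arrived between two consecutive departures, which is the quantity a timetable with non-constant headways needs.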
Figure 1 illustrates some of the considered features of the networks under study. There, we represent by circles or squares the nodes corresponding to the stations of two subway lines. An interchange station common to two lines is marked with a black square. We have also included a possible short-turn in the horizontal line covering a set of four consecutive squared stations (drawn as a dashed line in the picture).
3. MINLP Formulation
In this section we provide a Mathematical Programming formulation for the problem described in Section 2. First of all, we are given an input network, like the one depicted in Figure 1, including the topological route map and the stations, some of them being interchange stations, and the lines pool defined over this network. In addition, short-turning decisions are allowed in some lines of the lines pool. These decisions always concern a set of given consecutive stations of those lines. In what follows and when no confusion is possible, we will refer to both the set of consecutive stations and the trip concerned by a short-turning decision as a short-turn. On the other hand, trips through the full-length lines will be referred to as whole trips. In order to introduce the model we start by defining the set of parameters describing the remainder of the input data as well as the decision variables used to identify a feasible solution.
3.1. Parameters
The input parameters for our model are the following:

: Planning horizon in which trains start their journeys at a given head of line station. Note that a train may start its last journey on while its last stop may occur in a time instant .

: Set of lines in the network (lines pool). As mentioned above, round-trip lines are considered as two different lines sharing the stations but traversed in opposite directions (rigorously speaking, the stations represent platforms of the corresponding line). Each one of the lines is assumed to be described by its node stations and its directed connections between consecutive stations. Additionally we will denote by the set of lines containing a short-turn and by the remainder (those in which none of its proper subsets of stations can be activated as short-turns). Clearly, .

: Set of stations of line . Stations are assumed to be ordered in its travelling direction. Observe that if the lines and correspond to the same round-trip line but traversed in opposite directions they have the same number of stations () and station represents the opposite platform to station .
Additionally, for each line we will denote by the set of stations in the (unique) short-turn, being the first station of the short-turn and the last one.
In order to present a clearer Mathematical Programming model, we will also denote from now on by , i.e., those stations which are not part of the available short-turns.

: Distance (measured as travel time) between the stations and of line . Recall, as mentioned in Section 2, that we assume that the speed of the trains operating between consecutive stations and is fixed, and thus, this distance is trip independent.

: Stopping time for any train at station of line before leaving to station . This value represents a time window to perform different operations such as the unloading/loading of passengers from/to the train or the fine-tuning of the train. Without loss of generality, we also consider that the stopping times are trip independent.

: Distance (measured as travel time plus stopping time at the last and intermediate stations) between the head of line station and the first station of the short-turn for line . Note that this parameter is given by the following expression:
(1) 
: Minimum safety time interval between consecutive trips in a given line .

: Set of trips in line . It is worth noting that the exact number of trips in a line is a decision of the line planning process which is not known beforehand. In fact, some trips will not actually be used in the planning. We will refer to them as fake trips. Note also that the maximum number of possible trips in the line can always be upper bounded by . Typically the value will be smaller than this bound and should be estimated on the basis of the technical specifications of the network.

: Admissible capacities for trains operating in all the lines. In some cases, a single capacity profile is allowed while in some others a base capacity is available and, by adding extra wagons, it can be doubled, tripled, etc.
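Two of the parameters above are simple derived quantities, and the following Python sketch shows how they might be computed from the raw data. Both formulas are assumptions made for illustration: the offset is read as the time elapsed between departing the head of line station and departing the first short-turn station, and the trip bound is taken as the largest number of departures that fit in the horizon when consecutive ones are separated by at least the safety interval. Function names and numbers are hypothetical.

```python
import math
from typing import Sequence


def head_to_short_turn_offset(
    travel: Sequence[float], stop: Sequence[float], first_short_turn: int
) -> float:
    """Departure-to-departure time from station 0 to the first short-turn station.

    travel[i]: travel time from station i to station i + 1.
    stop[i]: stopping time at station i (stations are numbered from 0).
    """
    running = sum(travel[:first_short_turn])
    # Stops at the intermediate stations and at the first short-turn station.
    stopping = sum(stop[1:first_short_turn + 1])
    return running + stopping


def max_trips_upper_bound(horizon: float, safety_interval: float) -> int:
    """Assumed bound: departures at 0, delta, 2*delta, ... within [0, horizon]."""
    return math.floor(horizon / safety_interval) + 1


offset = head_to_short_turn_offset(
    travel=[3.0, 4.0, 2.0, 5.0], stop=[1.0, 1.0, 2.0, 1.0, 1.0], first_short_turn=2
)
bound = max_trips_upper_bound(horizon=120.0, safety_interval=3.0)
```

Because the formulation replicates its time variables once per potential trip, a tighter estimate than this bound keeps the model smaller.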
We also need the proportion of passengers who, not being able to catch the train in a given attempt, still insist on using the system and wait for the next train. In addition, our model corresponds to a line planning with route assignment in which the OD-matrix, whose entries describe the proportions of passengers moving among stations of each line, is given.

: Proportion of passengers that cannot get on a train because of lack of capacity and wait for the next one (proportion of persisting passengers). This parameter is assumed to be independent of stations and lines. It can also model the probability that a passenger gives up on boarding a crowded train and waits for the next one.

: Proportion of passengers awaiting at station to go to the station using the line . Note that this parameter can assume positive values only if , i.e., if station is previous to station .
The following parameters will be used to assess a given planning solution. As explained in Section 2, we propose the aggregation of the maintenance/operational planning costs together with a quantification of the social cost whose purpose is to measure the quality of the service provided by the system. Obviously, the correct estimation of these costs is fundamental for a meaningful model.

: Fixed cost for starting a whole trip at line with capacity . Usually, the larger the capacities, the higher the fixed costs on the trips. These costs model the consumption of resources to prepare a train for a new trip, the energy spent in the trip, the fixed costs of the staff needed to control the train, etc.

: Fixed cost for starting a short-turn at line with capacity . As in the previous cost, larger capacities and lengths of the short-turns usually involve higher costs on the trips. These fixed costs are usually smaller than the corresponding ones for whole trips.

: Unitary profit for transporting a passenger from the station to the station of the line . It represents the price of the ticket paid by a single passenger to use the service for a given origin-destination trip.

: Unitary penalty for passengers who cannot get on the first arriving train due to its limited capacity and still insist on using the system. This parameter is an estimation of the loss supported by the service for unsatisfied passengers that still keep waiting to use the subway service.

: Unitary penalty for passengers who give up using the network and leave the system after they cannot get on the first arriving train due to its limited capacity. This parameter is an estimation of the loss supported by the service for unsatisfied passengers that are lost by the system.
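The interplay between the proportion of persisting passengers and the two penalties can be sketched in Python for a single train at a single station. This is a plausible reading of the parameter descriptions, not the objective function of the model, which is not stated in this section; the function name, the dictionary keys and all numbers are hypothetical.

```python
from typing import Dict


def serve_station(
    waiting: float,
    free_capacity: float,
    persist_share: float,
    persist_penalty: float,
    lost_penalty: float,
) -> Dict[str, float]:
    """Split waiting passengers into boarded, persisting and lost.

    persist_share: proportion of left-behind passengers who wait for the
    next train; the remaining ones are assumed to leave the system.
    """
    boarded = min(waiting, free_capacity)
    left_behind = waiting - boarded
    persisting = persist_share * left_behind
    lost = left_behind - persisting
    penalty = persist_penalty * persisting + lost_penalty * lost
    return {
        "boarded": boarded,
        "persisting": persisting,
        "lost": lost,
        "penalty": penalty,
    }


# Hypothetical case: 500 passengers waiting, 420 free places, 75% persist.
outcome = serve_station(500.0, 420.0, 0.75, persist_penalty=1.0, lost_penalty=4.0)
```

Solving the problem for several penalty values, as suggested in Section 2.3, amounts to re-evaluating this kind of term with different cost estimates.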
The above parameters are listed in a more compact form in Table 1.
Parameter  Description

Planning horizon.
Lines of the network (Lines pool).
Lines enabling short-turns.
Lines without short-turns.
Stations of line .
Stations in the short-turn of line .
Stations not affected by short-turns.
Distance (measured as time) between stations and of line .
Stopping time that a train spends in the station of the line .
Distance (measured as time) between the head of line station and the first station of the short-turn for line .
Minimum safety time interval between consecutive trips, for line .
Set of trips in line , some of them being fake trips.
Admissible capacities for trains operating in all the lines.
Proportion of passengers moving between stations and of line .
Proportion of persisting passengers.
Fixed cost for starting a whole trip at line with capacity .
Fixed cost for starting a short-turn at line with capacity .
Unitary profit for transporting a passenger from the station to the station of line .
Unitary penalty for persisting passengers.
Unitary penalty for passengers who give up using the system.
3.2. Decision variables
The following decision variables are used in our Mathematical Programming model:

Continuous Variables:

: Departure time from the initial station of line at its th trip, .
Since the travel time between consecutive stations and the stopping time at each station are fixed, the departure time from the initial station of a line will be the reference time to calculate the departure time from the rest of stations of the line.
It is worth mentioning that we force any fake trip to operate on the line at the same time as the previous true trip. Thus, the departure time of a fake trip from the initial station of the line must coincide with the departure time of the previous true trip from that station and, therefore, the departure times from the remaining stations must also coincide. Note also that if a line enables shortturns, some of its trips may be shortturns. When a shortturn is activated, the stations not affected by the shortturn are considered as nodes of a fake trip, and the departure times from these stations are fixed to those of the previous true whole trip. In order to avoid infeasible solutions due to inconsistent departure times at consecutive stations (inside and outside the shortturn), the following continuous variables are needed.

: Difference between the actual departure time from the first station of the shortturn of the th trip, , of line and the time when it should depart from this station taking into account its departure time from the initial station of the line. Note that if the th trip is a whole trip then, .
Note also that in the definition of these variables we are implicitly assuming that the initial station of the line is not part of the shortturn. If it is, we need to take as reference time the departure time from any other station outside the shortturn.
For modeling purposes we need to distinguish between the flow of passengers that get on a train that performs a whole trip and the flow of passengers that get on a train that only performs a shortturn. Note that the first one is defined for any line of the lines pool whilst the second is only defined for those lines containing a shortturn.

: Flow of passengers captured in the station by the train that covers the th trip of the line , being a whole trip.

: Flow of passengers captured in the station by the train that covers the th trip of the line , being a shortturn.
Note that the th trip of a line is either a whole trip or a shortturn. Thus, if then, and vice versa. Furthermore, if the th trip is a fake trip for line (resp. ), then , for all and for all (resp. for all ).


Binary Variables:

The decisions about the capacities of the trains at each trip are modeled using the following variables:
Note that if the th trip of line is a shortturn with capacity then, and . On the other hand, if the th trip of line is a whole trip with capacity (), the train traverses all the stations of the line and, in particular, the stations of the shortturn, and then, . So, we have the following relationship between the two sets of variables:
(2) Observe that we can detect whether the th trip is a fake trip for line (resp. ) by checking if (resp. ).


Auxiliary Variables: A set of auxiliary variables, computed from the previous ones, is considered in order to ease the reading of our Mathematical Programming formulation.

: Time instant at which a train departs from station at the th trip of line . This value can be obtained by adding the travel times between consecutive stations plus the stopping times at the traversed stations. Thus, this time is given by the following expressions:
(3) (4) Expressions (3) allow us to appropriately define our auxiliary variables by adding, to the time at which a train starts its journey at the first station of the line, the time spent until it leaves station for stations outside a shortturn. Equations (4) allow us to compute the arrival time of a train at any station of the shortturn. As mentioned before, trips that only cover stations in the shortturn are considered as fake trips for the stations which are not part of the shortturn and their corresponding departure times are supposed to be the same as the times of the previous trip. Variables allow us to adjust the departure times from the stations in each shortturn trip. These variables take value if the train covers the complete line.
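As an illustration, the accumulation of travel and stopping times described above can be sketched in Python. This is a minimal sketch under our own assumptions, not the exact expressions (3) and (4): the function name `departure_times`, its arguments and the convention that the stopping time of a station is added before the train leaves it are all hypothetical, and stations outside the shortturn simply keep the reference times.

```python
def departure_times(start, travel, stop, short_turn=frozenset(), offset=0.0):
    """Departure time from every station of one trip (illustrative only).

    start      -- departure time from the initial station of the line
    travel     -- travel[k] is the running time from station k to station k+1
    stop       -- stop[k] is the stopping time at station k (stop[0] is unused)
    short_turn -- indices of the stations inside the short-turn (may be empty)
    offset     -- shift applied at short-turn stations; 0 for a whole trip
    """
    times = [start]
    for k, run in enumerate(travel, start=1):
        # Reach station k, stop there, then depart.
        times.append(times[-1] + run + stop[k])
    # Only the stations of the short-turn are shifted by the offset.
    return [t + offset if k in short_turn else t for k, t in enumerate(times)]


travel = [2.0, 3.0, 2.0]
stop = [0.0, 1.0, 1.0, 1.0]
whole = departure_times(0.0, travel, stop)
short = departure_times(0.0, travel, stop, short_turn={1, 2}, offset=1.5)
print(whole)
print(short)
```

With a zero offset the function reproduces the times of a whole trip, which mirrors the remark that the adjustment variables vanish when the train covers the complete line.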

: Number of passengers accumulated at station of line from the origin of the planning horizon up to time . This number is given by a function of the time and takes into account the external arrivals and also the passengers arriving at the station to change to another line. As mentioned above, a complete description of this demand function is given in Section 4.

: Excess of passengers that were not able to get on the train at station of the th trip of line because of a lack of capacity of the train. To compute this variable we distinguish between whole trips in any of the pool of lines and shortturn trips in those lines containing a shortturn. In the latter case we assume that the passengers catching a shortturn train with destination to a station outside the shortturn get off the train at the last station of the shortturn to catch a whole trip train of the same line. Taking into account these assumptions, the excess of passengers can be computed as follows:
(5) (6) (7) (8) (9) (10) In the first trip, the excess of passengers is computed by subtracting from the demand the flow of passengers already caught by the train (equations (5)–(7)). For modeling the flow of caught passengers we take into account in (6) that the trip could be either a whole trip () or a shortturn (). In the special case of the last station of the shortturn, one needs to add to the excess of passengers those that get off a train of a shortturn trip to catch a train of a whole trip of the same line (equations (7)). For the rest of the trips (equations (8)–(10)), we take into account the demand accumulated after the previous trip plus the excess of persisting passengers of the previous trip.
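The recursion just described can be illustrated with a short Python sketch for a single station. It is only a simplified reading of the text, under assumptions of our own: the names `passenger_excess`, `demand`, `boarded` and `persist` are hypothetical, the boarded flows are taken as given data rather than decision variables, and the extra term for passengers leaving a shortturn train at the last station of the shortturn is ignored.

```python
def passenger_excess(demand, departures, boarded, persist):
    """Excess of passengers left at one station after each trip (sketch).

    demand     -- function giving the accumulated number of passengers at time t
    departures -- departure times of the successive trips from the station
    boarded    -- flow of passengers captured by each trip at the station
    persist    -- proportion of left-behind passengers who keep waiting
    """
    excess = []
    for i, (t, flow) in enumerate(zip(departures, boarded)):
        if i == 0:
            # First trip: everything accumulated since the start of the horizon.
            waiting = demand(t)
        else:
            # Later trips: new arrivals since the previous departure plus the
            # persisting share of the previous excess.
            waiting = demand(t) - demand(departures[i - 1]) + persist * excess[-1]
        excess.append(waiting - flow)
    return excess


excess = passenger_excess(lambda t: 10 + 5 * t, [0, 4, 8], [10, 18, 21], 0.5)
print(excess)
```

In this toy instance two passengers are left behind by the second trip, half of them persist, and the third trip clears the platform.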


Semicontinuous Variables: We also consider a set of semicontinuous variables collecting the excess of passengers only for true trips:
This set of variables will allow us to account in the objective function for the actual excess of passengers, i.e., the ones associated to true trips.
Table 2 summarizes the above-mentioned variables.
Variable  Description 

Departure time from the initial station of line at its th trip.  
Flow of passengers captured in the station by the train that covers the th trip of the line , when is a whole trip.  
Flow of passengers captured in the station by the train that covers the th trip of the line , when only covers the shortturn stations.  
Difference between the actual departure time from the first station of the shortturn of the th trip of line and the time when it should depart from this station taking into account its departure time from the initial station of the line.  
Departure time from the station of line in its th trip.  
Number of passengers accumulated from instant up to instant in the station of line .  
Excess of passengers that were not able to get on the train at station at the th trip of line because of a lack of capacity.  
Excess of passengers only if is a true trip for station of line . 
Using the variables described above, we now give the formulation of our problem, distinguishing its two main elements: (1) an objective function aggregating the economic and social costs of any feasible solution; and (2) a set of technical constraints where the line concept and the timetable are specified in order to serve the estimated flow of passengers over the planning horizon.
3.3. Objective Function
In what follows we describe the different costs and rewards that are considered in the objective function of our model, for a given line , in terms of the variables and parameters described above.

Capacity cost:
It accounts for the costs and corresponding to the train capacity for every trip of line . This capacity is controlled by means of the variables or depending on whether the line contains shortturns or not.() Observe that in the second expression, if proper shortturns with capacity are activated, one has and , and thus, the amount is accounted for. In case of whole trips with capacity , by (2), one has that and . Thus, the second addend is zero, and only is accounted for, modeling adequately the capacity costs.

Reward per served passengers:
The unitary reward per served passenger represents the estimated revenue received when passengers use a line. It is obtained as the average revenue given by the different transport tickets used by the passengers. In order to compute the overall reward per served passengers, observe that if the th trip is a whole trip, the expression returns the estimated number of passengers getting off at the station (coming from any other previous station). Note that, in case a shortturn is activated on any line , one needs to add the rewards of passengers being routed within the shortturn () together with the reward from those who use the shortturn to get off at the last station and continue with the next whole trip. Then, the overall reward per served passengers is:() 
Costs of non-served passengers:
It accounts for the social cost incurred when a passenger cannot get on the train arriving at the station due to its lack of capacity. The unitary cost should aggregate some indicators of the service quality and some subjective measures of the satisfaction degree perceived by the passengers. The overall cost is computed by using the total number of passengers exceeding the capacity of the system at some instant of the planning horizon, scaled by the average penalties of persisting/giving up passengers, as follows:() Observe that a high excess of passengers at the end of the planning horizon can be easily avoided by increasing the parameters for the last trip or by adding appropriate constraints involving the variables.
The overall cost of using line during the planning horizon can be expressed as:
(COST()) 
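To make the aggregation of the three terms concrete, the following Python sketch combines them for one line. It is only an illustration under assumptions of our own: the function `line_cost` and its arguments are hypothetical, the reward is assumed to enter with a negative sign, and the unitary penalty is assumed to be the average of the two penalties weighted by the proportion of persisting passengers.

```python
def line_cost(trip_costs, rewards, excess, persist, pen_wait, pen_lost):
    """Overall cost of one line over the planning horizon (sketch).

    trip_costs -- fixed cost of every true trip (whole trips and short-turns)
    rewards    -- revenue collected from the served passengers of each trip
    excess     -- passengers left behind by each true trip
    persist    -- proportion of left-behind passengers who keep waiting
    pen_wait   -- unitary penalty for a passenger who keeps waiting
    pen_lost   -- unitary penalty for a passenger who leaves the system
    """
    capacity = sum(trip_costs)
    revenue = sum(rewards)
    # Average penalty per left-behind passenger (assumed weighting).
    unit_penalty = persist * pen_wait + (1 - persist) * pen_lost
    return capacity - revenue + unit_penalty * sum(excess)


cost = line_cost([100, 60, 100], [150, 90], [0, 4, 0], 0.5, 2, 6)
print(cost)
```

Here the capacity costs (260) exceed the revenue (240) by 20, and the four passengers left behind add 16 units of penalty.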
3.4. Modelling Constraints
In what follows we describe the constraints linking the variables and parameters in our model. They have been classified into four main blocks: capacity constraints, time control constraints, flow control constraints and passenger surplus constraints.

Capacities and true/fake trips:
() () () () () () () where , i.e. the number of shortturn trips of line which fit within the period of time taken by a train to go from the head of line to the first station of the shortturn.
When shortturns are not allowed for a line, the appropriate definition of the capacity variables is ensured by constraints ()–(). They enforce that exactly one of the allowed capacities is chosen for the first and the last trip and at most one for the rest of them. Fake (resp. true) trips are identified by capacities equal to (resp. greater than) zero. Thus, constraints () and () determine that the first and the last trip of each line are true trips. We will see later that this permits the actual trains to be scheduled from the beginning to the end of the planning horizon, providing the users with a complete service during that time interval. Note that constraints () and () are also valid for lines allowing shortturns and thus, when shortturns are allowed for a line, the appropriate definition of the capacity variables is guaranteed by constraints ()–(). Constraints () indicate that when a whole trip is a true trip, it is also a true trip for the shortturn stations. Constraints () ensure that the first trip (being either a whole trip or a shortturn) is a true trip. Constraints () force trip to be a true whole trip (resp. a fake trip) if the first trip is a shortturn (resp. if the first trip is a whole trip). These constraints, together with (), are the equivalent of () for lines with shortturns, and they ensure that a real train is scheduled from the beginning of the planning horizon. Finally, constraints () are the equivalent of () for shortturn trips.

Time control:
() () () () () () () () () Constraints () and () state that the first and the last trip of each line should exactly start at the first station of the complete line at instant time and , respectively. If the line does not contain a shortturn, these constraints together with () and () ensure that there are trains traveling the line during the whole planning horizon. If the line contains shortturns, recall that constraints () permit the first trip to be a shortturn trip. In this case, constraints () together with constraints () force trip to be a true whole trip starting at time at the head of the line.
Constraints (12) ensure that the arrival times of consecutive trains respect the minimum safety time interval. Constraints () force a fake trip to operate on the line at the same time as the previous true trip. For lines allowing shortturns, constraints (12)–() play the same role as the above, taking into account that in this case the trip can be either a whole trip or a shortturn.
Observe that in (12) if the case is excluded. The reason is that constraints () force trip to be a true whole trip if and only if the first trip () is a shortturn and, in this case, constraints () ensure that trip starts at the head of the line station at time 0, and then, it is the first whole trip. Finally, constraints () and () fix the upper and lower bounds on the values of variables when the th trip is a shortturn, enforcing that this variable is if the trip is a whole trip.

Flow control:
() () () () () () The flow of passengers catching a given train is determined by the capacity of the train and the mobility pattern of people. Hence, the effective capacity of the trains arriving at a given station depends on the passengers that caught the train in previous stations of this same trip and whose destination is a subsequent station of the line. Such an effective capacity is guaranteed, depending on the case (shortturn allowed or not), by constraints () and (). With constraints () (resp. ()) we ensure that the flow of passengers captured at a given station by a train that covers the first trip (resp. the th trip for ) is at most the demand of passengers accumulated at station since the beginning of the planning horizon (resp. since the instant in which the previous train departed from that station plus the passengers that were not able to get on the previous train because of lack of capacity and wait for the next one). Constraints () and () are the analogous ones for shortturn trips.

Passenger surplus:
In order to compute only the surplus of passengers of a true trip, we use the set of semicontinuous variables :
which can be linearized as follows:
() () being a large enough constant bounding the surplus of passengers at any station of line .
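For readers unfamiliar with this kind of linearization, the following Python sketch checks one standard big-M scheme on a few values. It is not necessarily the pair of constraints used in the model: the names `e` (surplus), `y` (binary indicator of a true trip) and `M` (the large constant) are hypothetical, and the four inequalities `z >= e - M*(1 - y)`, `z >= 0`, `z <= e`, `z <= M*y` are our own assumption about how the product `e*y` may be enforced.

```python
def linearized_bounds(e, y, M):
    """Interval allowed for z by a standard big-M linearization of z = e * y.

    The inequalities considered are
        z >= e - M * (1 - y),  z >= 0,  z <= e,  z <= M * y,
    where 0 <= e <= M and y is binary.
    """
    lower = max(0.0, e - M * (1 - y))
    upper = min(e, M * y)
    return lower, upper


M = 10.0
for e in (0.0, 3.5, 10.0):
    for y in (0, 1):
        lower, upper = linearized_bounds(e, y, M)
        # The interval collapses to the single value e * y.
        print(e, y, lower, upper)
```

When the penalty on the surplus is minimized, the two lower bounds alone are enough to obtain the correct value, which is consistent with a two-constraint linearization.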
3.5. A compact MINLP formulation
According to the decision variables, the objective function and the constraints described above, the following Mathematical Programming formulation is valid for our line planning and timetabling model:
s.t.  
(P)  
Observe that although the above formulation seems to be separable by lines in , the lines are linked through the demand functions (constraints ()()) which represent the accumulated flow of passengers waiting for a train at a given station of line at time instant . As we will describe in Section 4, such a flow is affected not only by the line but also by other lines through passengers changing lines at transfer stations. This function introduces new variables and nonlinear constraints into the above formulation.
Several extensions may be easily accommodated within the above model as highlighted in the following remarks:
Remark 1.
In our model the speed of trains is considered to be constant during the whole journey. However, one can easily modify expressions (1), (3) and (4) using variables to decide the speed of the train during a trip of line between stations and . For instance, let be the physical distance between stations and and let represent the inverse of the speed, i.e., , then one could replace the travel times by in expressions (1), (3) and (4). By adding to the objective function a cost assessing the resource consumption due to speed changes one can have a more general model preserving the structure of the one stated above. Similar modifications can be also considered by enabling the model to decide about variable stopping times at any station.
Remark 2.
The model allows us to join two consecutive planning horizons by passing data about numbers of passengers and arrival/departure times obtained from an optimal solution on the first planning period as input data for the second one, and so on. In particular, the passengers that may remain at station of line at the end of the first planning horizon can be considered as passengers at station to use line at the beginning of the second planning horizon, and so on. This information has to be incorporated in order to compute the demand function, as we will see in the next section.
4. The Demand function
One of the main goals of our model is to incorporate, in the design of the line planning and timetabling of an existing network, information about the flow of passengers moving through the network during the planning horizon. Clearly, the flow of passengers arriving at a given station is a random variable. Thus, we will incorporate into the model an estimation of its average value.
In order to model the number of passengers entering the transportation system through a given station of a fixed line, we use the so-called demand function, which maps a given instant to the accumulated number of passengers wanting to catch a train at this station (from the beginning of the planning horizon). Here, the estimation process should be carefully done in order to capture the essential behaviour of the demands served by the system.
Different shapes for the function are possible within this framework to approximate the demand. The choice of such a shape is a crucial step in the modeling process since one has to find an equilibrium between obtaining accurate estimations and providing manageable mathematical programming formulations. Once again, motivated by our pilot experience with Metrolab®, we use a piecewise linear approximation whose slope is fixed for any given station , but whose breakpoints and discontinuities may change according to the flow induced by external block arrivals and by the rest of the lines. For a given line and a station , we estimate the demand function as follows:
() 
for , with , i.e., the maximum time in which the train can reach the last station of the line, and where:

is the number of passengers awaiting a train of line in the station at the beginning of the planning horizon.

is the average rate of passengers arriving at the station of line per unit of time.

is the sum of the external block arrivals of passengers up to the instant at the station of line .

is the sum of the block arrivals of passengers up to the instant at the station of line from line .
In what follows we will refer to as the extended planning horizon for line . Thus, the demand function at a given time instant in station of line , consists of three parts. The first one is a linear part, in which, from an initial number of passengers, , the number increases by a rate . However, such a base estimation may be modified either by external block arrivals (second part), , or by passengers coming from other interacting lines at interchange stations (third part), .
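The three parts of the demand function can be illustrated with the following Python sketch. It is a simplification under our own assumptions: the names `make_demand`, `ext_times`, `ext_sizes`, `transfer_times` and `transfer_sizes` are hypothetical, and the block arrivals coming from other lines are treated as fixed data, whereas in the model they depend on the timetables of the interacting lines.

```python
from bisect import bisect_right
from itertools import accumulate


def make_demand(initial, rate, ext_times, ext_sizes,
                transfer_times=(), transfer_sizes=()):
    """Build a piecewise linear demand function with jumps (sketch).

    initial        -- passengers waiting at the start of the planning horizon
    rate           -- average arrival rate per unit of time
    ext_times      -- sorted instants of the external block arrivals
    ext_sizes      -- number of passengers of each external block arrival
    transfer_times -- sorted instants of block arrivals from other lines
    transfer_sizes -- number of passengers of each of those block arrivals
    """
    # Cumulative sizes, with a leading 0 for "no block has arrived yet".
    ext_cum = [0.0, *accumulate(ext_sizes)]
    tr_cum = [0.0, *accumulate(transfer_sizes)]

    def demand(t):
        # bisect_right counts the blocks whose instant is <= t.
        ext = ext_cum[bisect_right(ext_times, t)]
        transfers = tr_cum[bisect_right(transfer_times, t)]
        return initial + rate * t + ext + transfers

    return demand


demand = make_demand(20, 3, [10, 30], [50, 40], [15], [12])
print([demand(t) for t in (0, 10, 20, 40)])
```

The function grows linearly with slope `rate` and jumps at each block arrival, which gives the discontinuous piecewise linear shape described above.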
As can be seen, in the formulation (P), the demand function is used exclusively to evaluate flows in the set of instants for and . Each of the time instants in which the demand function needs to be evaluated induces some sets of inequalities and variables as those described in subsections 4.1 and 4.2 (for external and internal arrivals).
4.1. External Arrivals
We assume that we are given both a set of breakpoints representing the time instants when the block arrivals occur and the numbers of passengers entering the system at these instants for each station . That is, we assume that a set of sorted instants as well as the discontinuity flow jumps associated with each of those instants are known, i.e., a block arrival of is assumed at time instant , for . The external arrivals represent block arrivals of passengers due, for instance, to the end of a football match at a venue close to one of our stations. For the sake of readability and without loss of generality, we will assume that